Inline.
On Mon, Aug 20, 2018 at 9:20 AM Paul Rogers <[email protected]> wrote:

> ...
> By contrast, migrating Drill internals to Arrow has always been seen as
> the bulk of the cost; costs which the "crude-but-effective" suggestion
> seeks to avoid. Some of the full-integration costs include:
>
> * Reworking Drill's direct memory model to work with Arrow's.
>

This should be relatively isolated to the allocation/deallocation code. The
deallocation should become a no-op. The allocation becomes simpler and
safer.

> * Changing all low-level runtime code that works with vectors to instead
> work with Arrow vectors.
>

Why? You already said that most code doesn't have to change since the
format is the same.

> * Change all Drill's vector metadata, and code that uses that metadata, to
> use Arrow's metadata instead.
>

Why? You said that converting Arrow metadata to Drill's metadata would be
simple. Why not just continue with that?

> * Since generated code works directly with vectors, change all the code
> generation.
>

Why? You said the UDFs would just work.

> * Since Drill vectors and metadata are exposed via the Drill client to
> JDBC and ODBC, those must be revised as well.
>

How much given the high level of compatibility?

> * Since the wire format will change, clients of Drill must upgrade their
> JDBC/ODBC drivers when migrating to an Arrow-based Drill.
>

Doesn't this have to happen fairly often anyway? Perhaps this would be a
good excuse for a 2.0 step.
