Hi Weijie,

It is possible when maxOuputRecordCount (the value returned by memoryManager.getOutputRowCount()) is less than incomingRecordCount. In that case the projector outputs fewer rows than the incoming batch contains. For more details, please see DRILL-6340 <https://issues.apache.org/jira/browse/DRILL-6340> and the design document <https://docs.google.com/document/d/1h0WsQsen6xqqAyyYSrtiAniQpVZGmQNQqC1I2DJaxAA/edit?usp=sharing> attached to that Jira.
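To make the condition concrete, here is a minimal sketch. It is not Drill's actual code: it assumes that each call projects at most the row limit and that the leftover rows of the incoming batch are carried over to later output batches. The class ProjectSplitSketch, the method outputBatchSizes, and the parameter maxOutputRowCount (standing in for the value returned by memoryManager.getOutputRowCount()) are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not taken from ProjectRecordBatch.
public class ProjectSplitSketch {

  // Hypothetical helper: returns the row count of each output batch produced
  // from one incoming batch when output is capped at maxOutputRowCount rows.
  public static List<Integer> outputBatchSizes(int incomingRecordCount, int maxOutputRowCount) {
    if (maxOutputRowCount <= 0) {
      throw new IllegalArgumentException("maxOutputRowCount must be positive");
    }
    List<Integer> sizes = new ArrayList<>();
    int remainderIndex = 0; // first incoming row that has not been projected yet
    while (remainderIndex < incomingRecordCount) {
      int remaining = incomingRecordCount - remainderIndex;
      // The output row count is smaller than the remaining input
      // whenever the limit is smaller than what is left to project.
      int outputRecords = Math.min(remaining, maxOutputRowCount);
      sizes.add(outputRecords);
      remainderIndex += outputRecords;
    }
    return sizes;
  }

  public static void main(String[] args) {
    // Limit below the incoming count: the batch is split.
    System.out.println(outputBatchSizes(4096, 1024)); // [1024, 1024, 1024, 1024]
    // Limit above the incoming count: one output batch, same size as the input.
    System.out.println(outputBatchSizes(1000, 4096)); // [1000]
  }
}
```

With a limit of 1024 and 4096 incoming rows, the first projection outputs 1024 rows, which is the "output less than input" case asked about.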
Kind regards,
Volodymyr Vysotskyi

On Thu, Apr 4, 2019 at 5:17 PM weijie tong <[email protected]> wrote:

> I have a question about the ProjectRecordBatch implementation. I hope someone
> can explain it. At line 234 of ProjectRecordBatch, in what case is the
> projector's output row count less than the input row count?
>
> On Thu, Apr 4, 2019 at 5:11 PM weijie tong <[email protected]> wrote:
>
> > Hi Igor,
> >
> > That's a good idea! It could resolve that issue. The basic question is
> > solved. To use the official Arrow, there are still two issues that need to
> > be contributed to Arrow, which I will do:
> > 1. Make the gcc lib statically linked into the JNI dynamic lib.
> >    Without this, the platform must have the right version of gcc installed.
> > 2. Add the convertToNull function to Gandiva.
> >    This would allow project expressions that use convertToNull to be
> >    executed by Gandiva.
> >
> > Of course, even without these two issues solved, I could still provide an
> > integration implementation.
> >
> > BTW, once the integration is done, how do we supply the Gandiva JNI lib?
> > Leave it to the user to build it, or supply distributions for different
> > platforms?
> >
> > On Thu, Apr 4, 2019 at 3:53 PM Igor Guzenko <[email protected]> wrote:
> >
> >> Hello Weijie,
> >>
> >> Did you try to create the same package as in Arrow, but in Drill, and use
> >> a wrapper class around the target to publish the desired methods that have
> >> package access?
> >>
> >> Thanks, Igor
> >>
> >> On Thu, Apr 4, 2019 at 9:51 AM weijie tong <[email protected]> wrote:
> >> >
> >> > Hi,
> >> >
> >> > Gandiva is a subproject of Arrow. Using LLVM codegen and SIMD, Arrow
> >> > Gandiva can achieve better query performance. Arrow and Drill have
> >> > similar columnar memory formats. The main difference now is the null
> >> > representation. Arrow has also made great changes to the ValueVector.
> >> > Adopting Arrow to replace Drill's VV has been discussed before. That
> >> > would be a big job. But leveraging Gandiva by working at the physical
> >> > memory address level would take relatively little work.
> >> >
> >> > I have now done the integration work on our own branch by making some
> >> > changes to the Arrow branch, and filed DRILL-7087 and ARROW-4819. The
> >> > main change in ARROW-4819 is to make some package-level methods public.
> >> > But the Arrow community does not seem to plan to accept this change.
> >> > Their advice is to maintain an Arrow branch.
> >> >
> >> > So what do you think?
> >> >
> >> > 1. Maintain our own branch of Arrow.
> >> > 2. Wait for the Arrow integration to be completed.
> >> > Or some other ideas?
