paul-rogers commented on issue #1367: DRILL-6585: PartitionSender clones vectors, but shares field metdata URL: https://github.com/apache/drill/pull/1367#issuecomment-404987283 @sohami, thanks for your comments and questions. Unfortunately, I cannot debug the use case and so you may have a deeper understanding than I do. I'm working from experience gained some six months ago when working with the result set loader, and that knowledge is getting rusty. > In original PR there is a change for NullableValueVectors to add the values and bits vector materialized field as child field of parent vector field. ... From your comment it looks like because the internal values ValueVector mode needs to be required so you are creating another Materialized Field with that mode for internal values vector and adding it as child of parent vector field. The reason for that change is that the result set loader code that clones a vector needs to know the actual type. That code walks the vector tree, using the `MaterializedField` to get the type. If a `values` vector (which has no `bits` vector) reports its type as `Nullable`, then the clone will create a `bits` vector, which causes havoc. I'm thinking that I should change the cloning code. Rather than believing the `MaterializedField`, I can use the vector class type itself. That will be more clunky and slow, but it will eliminate the need to change the existing vector code. Given how long this discussion has gone on, that I can't do the required tests, and that we can't we discuss this in person, I'm thinking that the alternative approach may be more expedient. I suppose a larger question is whether the final bits of the result set loader are even still useful. Much work has been done on batch sizing since this work started. Is it still worth while finishing up this code so we can control the batch size for readers? Parquet has its own solution. Is it worth worrying about the others?
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services
