ilooner edited a comment on issue #1445: DRILL-6706: fixed null pointer 
exception in HashJoin
URL: https://github.com/apache/drill/pull/1445#issuecomment-416320678
 
 
   @sachouche @vvysotskyi I don't agree this should be handled by the column 
sizes map. The issue is that operators are expecting a column with the name of 
MYCOLUMN (because that is the name provided by the planner), but instead the 
input column has a name of `` `MYCOLUMN` ``. This can cause errors at many 
points in an operator's execution, not just within the RecordBatchSizer's 
columnSizes map. For example, in HashJoin the HashTable uses the unquoted 
column names provided by the planner to retrieve the key column from the 
incoming record batch (see ChainedHashTable.createAndSetupHashTable). So while 
this fix resolves a fatal exception in the batch sizer, it does not address 
functional correctness in other parts of the code, such as the HashTable, 
which may be silently generating incorrect results.
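
   To make the mismatch concrete, here is a minimal sketch in plain Java. It 
does not use any Drill classes; `ColumnNameMismatch` and `buildColumnIndex` are 
hypothetical names invented for this illustration, and the map only stands in 
for any structure keyed by the incoming column names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not Drill code.
class ColumnNameMismatch {

    // Builds a name -> position index from the column names as they
    // arrive in the incoming batch (assumed stand-in for any name-keyed map).
    static Map<String, Integer> buildColumnIndex(String... incomingNames) {
        Map<String, Integer> index = new HashMap<>();
        for (int i = 0; i < incomingNames.length; i++) {
            index.put(incomingNames[i], i);
        }
        return index;
    }

    public static void main(String[] args) {
        // The reader hands over the column with the backticks still attached.
        Map<String, Integer> index = buildColumnIndex("`MYCOLUMN`");

        // The operator looks the column up by the planner's unquoted name.
        Integer byPlannerName = index.get("MYCOLUMN");
        Integer byQuotedName = index.get("`MYCOLUMN`");

        // The unquoted lookup misses; unboxing or dereferencing the result
        // without a null check is where a NullPointerException would appear.
        System.out.println("lookup by planner name: " + byPlannerName);
        System.out.println("lookup by quoted name:  " + byQuotedName);
    }
}
```

   A null check at one lookup site only hides the miss at that site; every 
other lookup by the planner's name still fails to find the column.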
   
   If we close this issue now with a temporary fix, some poor soul may spend 
weeks debugging strange and unexpected data correctness issues down the line. 
To avoid that scenario and to increase the urgency of fixing the root cause, 
I am thinking that we should leave the bug unfixed until we have a permanent 
fix for the Parquet reader. What are your thoughts?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
