ilooner commented on issue #1445: DRILL-6706: fixed null pointer exception in 
HashJoin
URL: https://github.com/apache/drill/pull/1445#issuecomment-416320678
 
 
   @sachouche @vvysotskyi I don't agree that this should be handled by the 
column sizes map. The issue is that operators expect a column named MYCOLUMN 
(because that is the name provided by the planner), but the input column is 
instead named `MYCOLUMN`, with the backticks as part of the name. This can 
cause errors at many points in an operator's execution, not just within the 
RecordBatchSizer's columnSizes map. For example, in HashJoin the HashTable 
uses the unquoted column names provided by the planner to retrieve the key 
column from the incoming record batch (see 
ChainedHashTable.createAndSetupHashTable). So while this fix resolves a fatal 
exception in the batch sizer, it does not address functional correctness in 
other parts of the code, such as the HashTable, which may be silently 
generating incorrect results.
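
To illustrate the mismatch, here is a minimal sketch in plain Java. It is not Drill code: the class `ColumnNameMismatch`, the helper `buildColumnSizes`, and the size value 8 are hypothetical stand-ins for any per-column lookup keyed by the name the reader produced. It only shows that a lookup by the planner's unquoted name misses when the stored key includes the backticks.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names come from Drill.
class ColumnNameMismatch {

    // Stand-in for a per-column lookup keyed by the name the reader produced.
    static Map<String, Integer> buildColumnSizes(String readerColumnName) {
        Map<String, Integer> sizes = new HashMap<>();
        sizes.put(readerColumnName, 8); // arbitrary size, for illustration
        return sizes;
    }

    public static void main(String[] args) {
        // The reader hands over a name that includes the backticks.
        Map<String, Integer> sizes = buildColumnSizes("`MYCOLUMN`");

        // The operator looks the column up by the planner's unquoted name.
        Integer size = sizes.get("MYCOLUMN");

        // The key does not match, so the lookup returns null.
        // Unboxing it (e.g. `int s = size;`) would throw a NullPointerException.
        System.out.println(size); // prints "null"
    }
}
```

Guarding against the null at this one lookup would stop the exception, but any other code path that matches columns by the unquoted name would still miss.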
   
   If we close this issue now with a temporary fix, some poor soul may spend 
weeks debugging strange and unexpected data correctness issues down the line. 
To avoid that scenario and to increase the urgency of fixing the root cause, 
I think we should leave the bug unfixed until we have a permanent fix for the 
parquet reader. What are your thoughts?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
