[ 
https://issues.apache.org/jira/browse/DRILL-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14101657#comment-14101657
 ] 

Jason Altekruse edited comment on DRILL-1311 at 8/19/14 10:53 PM:
------------------------------------------------------------------

The JSON project pushdown that should be merged soon will at least be a partial 
fix for this, but there is a larger issue to solve as well. If a and b are the 
only columns referenced in the query, then they will be pushed down into the 
read and the schema change will not occur. If another column like c is given 
explicitly in the query, then the reader will null-fill this column for any 
records or files that do not have it. The issue comes when * is given and the 
file with just a and b is read first. We have no way to know what will come in 
the future, which means we will have to send batches with the known columns and 
upstream operators like join will have to know what constitutes a meaningful 
schema change.
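
To make the three cases concrete, here is a rough sketch in Python. It is only a 
toy model of a reader, not Drill's actual code: read_batches, schema_changes and 
the sample records are all made up for illustration, and each file is assumed to 
produce exactly one batch.

```python
# Toy model of a JSON reader; hypothetical names, not Drill's implementation.
def read_batches(files, projected=None):
    """Yield one (schema, rows) batch per file.

    files: list of files, each a list of dict records.
    projected: column names pushed down into the reader, or None for '*'.
    """
    for records in files:
        if projected is not None:
            # Project pushdown: the query fixes the schema up front, and
            # columns missing from a record are null-filled.
            schema = tuple(projected)
        else:
            # '*': the schema can only be discovered from the data in hand.
            schema = tuple(sorted({key for rec in records for key in rec}))
        rows = [tuple(rec.get(col) for col in schema) for rec in records]
        yield schema, rows


def schema_changes(batches):
    """Count how often a batch's schema differs from the previous batch's."""
    changes, previous = 0, None
    for schema, _rows in batches:
        if previous is not None and schema != previous:
            changes += 1
        previous = schema
    return changes


file_ab = [{"a": 1, "b": 2}]
file_abc = [{"a": 3, "b": 4, "c": 5}]
files = [file_ab, file_abc]  # the file with just a and b is read first

print(schema_changes(read_batches(files, ["a", "b"])))       # 0
print(schema_changes(read_batches(files, ["a", "b", "c"])))  # 0, c is null-filled
print(schema_changes(read_batches(files, None)))             # 1
```

Only the last call, the * case, sees a schema change, because the first batch 
is built before column c is known to exist.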


was (Author: jaltekruse):
The JSON project pushdown that should be merged soon will at least be a partial 
fix for this, but there is a large issue to solve as well. If a and b are the 
only columns referenced in the query, then they will be pushed down into the 
read and the schema change will not occur. If another column like c is given 
explicitly in the query, then the reader will null fill this column for any 
records/files that do not have it. The issue comes when * is given, and the 
file with just a and b is read first. We have no way to know what will come in 
the future, which means we will have to send batches with the known columns and 
upstream operators like join will have to know what constitutes a meaningful 
schema change.

> Hash join does not support schema changes error
> -----------------------------------------------
>
>                 Key: DRILL-1311
>                 URL: https://issues.apache.org/jira/browse/DRILL-1311
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Neeraja
>
> - Create a directory with a couple of JSON files: one with columns a, b and 
> a second with columns a, b, c.
> - The a and b attributes have the same data types across both files.
> - Create a view by selecting columns a, b from the directory.
> - Join the view with any other table.
> An error shows up indicating that 'Hash join does not support schema changes'.
> There is a schema change across the files, with a new element being added; 
> however, given that only the specific columns a, b are selected in the view, 
> the query is expected to work.



