paul-rogers commented on issue #1675: DRILL-7055: Revise SELECT * to exclude 
partitions
URL: https://github.com/apache/drill/pull/1675#issuecomment-469357589
 
 
   @arina-ielchiieva, if we want to do this for 2.0, then let's keep the current 
behavior. I'll need to do more fixes to the scan framework, but I'll proceed 
with that work. It would be far too difficult to add another option and have to 
test both paths; doing so would double the effort.
   
   My guess is that the original idea was that the partition directories carry 
data and should be treated as columns. For example, if data is partitioned by 
year and month, then the user may want those to be columns. This is 
particularly important if the partition columns do not appear within the data 
itself.
   
   Using "dir0" and "dir1" has always been a hack: they are meaningless names 
and require that the user map from "year" to the partitioning structure and 
know that, in one table "dir0" means "year", while in another it might mean 
"store". Moving forward, it would be better to use a Hive-like solution: a 
mapping from partition directories to columns so that the user sees "year" and 
"month" as column names, not "dir0" and "dir1". Sounds like a job for the new 
metadata system.
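
   To make the Hive-like idea concrete, here is a minimal sketch of such a mapping. It is not Drill code: the function `partition_columns`, its parameters, and the sample paths are all hypothetical, and the fallback to "dirN" names for unmapped levels is an assumption made for illustration.

```python
from pathlib import PurePosixPath


def partition_columns(table_root, file_path, names):
    """Map directory levels under table_root to column names (hypothetical helper)."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(table_root))
    dirs = relative.parts[:-1]  # drop the file name, keep only directory levels
    columns = {}
    for i, value in enumerate(dirs):
        # Use the declared name if one exists; otherwise fall back to the
        # positional "dirN" name (an assumption for this sketch).
        key = names[i] if i < len(names) else f"dir{i}"
        columns[key] = value
    return columns


# Hypothetical table partitioned by year, then month.
path = "/data/sales/2019/03/part-0.parquet"
named = partition_columns("/data/sales", path, ["year", "month"])
positional = partition_columns("/data/sales", path, [])
print(named)
print(positional)
```

   With a declared mapping the user sees "year" and "month"; without one, the same values surface under the positional names and the user has to remember what each level means for that particular table.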
   
   If we want a generic directory solution, then allowing a "dir" array would 
be handy: the array can allow any number of entries, avoiding the schema change 
issues inherent with the current design.
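
   A rough sketch of why a single "dir" array sidesteps the schema change problem is below. Again, this is an illustration under stated assumptions, not how Drill's scan framework works: the helper names and the sample paths are hypothetical, and a "schema" here is simply the set of column names produced for one file.

```python
from pathlib import PurePosixPath


def partition_dirs(table_root, file_path):
    """Return the directory levels between table_root and the file (hypothetical helper)."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(table_root))
    return list(relative.parts[:-1])


def dirn_columns(table_root, file_path):
    # One column per directory level: the column set depends on file depth.
    return {f"dir{i}": v for i, v in enumerate(partition_dirs(table_root, file_path))}


def dir_array_column(table_root, file_path):
    # A single array-valued column: the column set is the same at any depth.
    return {"dir": partition_dirs(table_root, file_path)}


# Two hypothetical files at different depths under the same table root.
shallow = "/data/logs/2019/a.json"
deep = "/data/logs/2019/03/b.json"

dirn_schemas = {tuple(dirn_columns("/data/logs", p)) for p in (shallow, deep)}
array_schemas = {tuple(dir_array_column("/data/logs", p)) for p in (shallow, deep)}
print(dirn_schemas)
print(array_schemas)
```

   In this sketch the per-level columns yield two distinct column sets for the two files, while the array form yields one; only the array length varies from file to file.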
   
   Agreed: all of this is more than we can do in the short term, so let's 
mothball this change for the Drill 1.x series.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
