Aman Sinha created DRILL-3948:
---------------------------------

             Summary: Partitioning columns of a Parquet table should be made 
visible to end user
                 Key: DRILL-3948
                 URL: https://issues.apache.org/jira/browse/DRILL-3948
             Project: Apache Drill
          Issue Type: Improvement
          Components: Metadata, Query Planning & Optimization
    Affects Versions: 1.2.0
            Reporter: Aman Sinha


For Parquet files, Drill can do partition pruning for filter conditions on a 
column that satisfies the following criterion: 
  Each Parquet file has a single value of that column. The Parquet metadata is 
examined for the min and max values of that column, and if they are the same, 
the column is considered a partitioning column. 

  When CTAS auto-partition is used, this criterion is enforced, but files 
created through external methods could also satisfy it.  
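
To make the criterion concrete, here is a minimal sketch in Python. It is 
not Drill's implementation: the function name, the shape of the statistics 
dictionary, and the file and column names are all hypothetical, and it 
assumes min/max statistics have already been read from each file's Parquet 
metadata. Null handling is ignored.

```python
from typing import Dict, List, Tuple

# Hypothetical per-file column statistics: {file: {column: (min, max)}}.
FileStats = Dict[str, Dict[str, Tuple[object, object]]]


def candidate_partition_columns(stats: FileStats) -> List[str]:
    """Return the columns whose min equals max in every file."""
    if not stats:
        return []
    per_file = list(stats.values())
    # Only columns that have statistics in every file can qualify.
    common = set(per_file[0])
    for cols in per_file[1:]:
        common &= set(cols)
    return sorted(
        col for col in common
        if all(cols[col][0] == cols[col][1] for cols in per_file)
    )


# Illustrative data only: 'year' is constant within each file, 'amount' is not.
example = {
    "part-0.parquet": {"year": (2014, 2014), "amount": (5, 90)},
    "part-1.parquet": {"year": (2015, 2015), "amount": (1, 70)},
}
print(candidate_partition_columns(example))  # ['year']
```

A column qualifies even when its value differs from file to file, as long 
as each individual file holds only one value.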

It is difficult for users to know exactly which columns are candidate 
partitioning columns in the table.  We should provide this information in a 
user-friendly way, for instance: 
  - a special 'show partition columns for <table>' command
  - in the Explain plan, show the partition columns for the table in the Scan 
    node
More options should be discussed. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
