[ 
https://issues.apache.org/jira/browse/DRILL-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153435#comment-15153435
 ] 

Jinfeng Ni commented on DRILL-4392:
-----------------------------------

I submitted a patch for this issue.

The issue appears to be caused by the change that made 
MaterializedField.getPath() return a String instead of a SchemaPath. That makes 
the check for the internal partition-related field fail, since one side of the 
comparison is a String while the other is a SchemaPath. The fix is simply to 
compare two Strings. 
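
To illustrate the failure mode, here is a minimal sketch. FieldPath and 
PartitionFieldCheck are hypothetical stand-ins written for this example, not 
Drill's actual classes; only the field name is taken from the issue. It shows 
why an equals() call between a String and a path object is always false, and 
what a String-to-String comparison looks like.

```java
import java.util.Objects;

// Hypothetical stand-in for a structured path type; NOT Drill's SchemaPath.
final class FieldPath {
    private final String name;

    FieldPath(String name) { this.name = name; }

    String asString() { return name; }

    @Override
    public boolean equals(Object o) {
        return o instanceof FieldPath && ((FieldPath) o).name.equals(name);
    }

    @Override
    public int hashCode() { return Objects.hash(name); }
}

// Hypothetical check for the internal partition field named in the issue.
final class PartitionFieldCheck {
    static final String INTERNAL_NAME = "P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R";
    static final FieldPath INTERNAL_FIELD = new FieldPath(INTERNAL_NAME);

    // Mixed-type comparison: String.equals() returns false for any
    // argument that is not a String, so the internal field is never detected.
    static boolean isInternalMixed(String fieldPath) {
        Object other = INTERNAL_FIELD;
        return fieldPath.equals(other);
    }

    // Same-type comparison: String against String.
    static boolean isInternalFixed(String fieldPath) {
        return fieldPath.equals(INTERNAL_FIELD.asString());
    }
}
```

With the mixed-type check the internal field is never filtered out, which 
would be consistent with it ending up in the written files.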

On a side note, I'm not sure that changing MaterializedField.getPath() to 
return a String instead of a SchemaPath is the right approach. Returning a 
String means we have to ensure that case sensitivity in comparisons is handled 
consistently across the code base, which seems harder to enforce.  
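
A minimal sketch of the concern, assuming a path type that treats names 
case-insensitively. CaseInsensitiveName and CaseSensitivityDemo are 
hypothetical names for this example and make no claim about how SchemaPath 
actually compares. With raw Strings every call site has to pick the comparison 
rule itself; a wrapper type decides it once.

```java
import java.util.Locale;

// Hypothetical wrapper that fixes the comparison rule in one place.
final class CaseInsensitiveName {
    private final String key;

    CaseInsensitiveName(String name) {
        // Normalize once so equals() and hashCode() stay consistent.
        this.key = name.toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CaseInsensitiveName
            && ((CaseInsensitiveName) o).key.equals(key);
    }

    @Override
    public int hashCode() { return key.hashCode(); }
}

final class CaseSensitivityDemo {
    // Raw String comparison is case-sensitive unless the caller remembers
    // to use equalsIgnoreCase() at every call site.
    static boolean rawEquals(String a, String b) {
        return a.equals(b);
    }

    // Wrapped comparison applies the same rule everywhere.
    static boolean wrappedEquals(String a, String b) {
        return new CaseInsensitiveName(a).equals(new CaseInsensitiveName(b));
    }
}
```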



> CTAS with partition writes an internal field into generated parquet files
> -------------------------------------------------------------------------
>
>                 Key: DRILL-4392
>                 URL: https://issues.apache.org/jira/browse/DRILL-4392
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jinfeng Ni
>            Assignee: Steven Phillips
>            Priority: Blocker
>
> On today's master branch:
> {code}
> select * from sys.version;
> +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
> |     version     |                 commit_id                 |                           commit_message                            |        commit_time         |   build_email   |         build_time         |
> +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
> | 1.5.0-SNAPSHOT  | 9a3a5c4ff670a50a49f61f97dd838da59a12f976  | DRILL-4382: Remove dependency on drill-logical from vector package  | 16.02.2016 @ 11:58:48 PST  | [email protected]  | 16.02.2016 @ 17:40:44 PST  |
> +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
> {code}
> A Parquet table created by Drill's CTAS statement has one internal field, 
> "P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R". This additional field does not 
> affect non-star queries, but causes incorrect results for star queries.
> {code}
> use dfs.tmp;
> create table nation_ctas partition by (n_regionkey) as select * from 
> cp.`tpch/nation.parquet`;
> select * from dfs.tmp.nation_ctas limit 6;
> +--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------+----------------------------------------+
> | n_nationkey  |     n_name     | n_regionkey  |                                                    n_comment                                                    | P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R  |
> +--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------+----------------------------------------+
> | 5            | ETHIOPIA       | 0            | ven packages wake quickly. regu                                                                                 | true                                   |
> | 15           | MOROCCO        | 0            | rns. blithely bold courts among the closely regular packages use furiously bold platelets?                      | false                                  |
> | 14           | KENYA          | 0            |  pending excuses haggle furiously deposits. pending, express pinto beans wake fluffily past t                   | false                                  |
> | 0            | ALGERIA        | 0            |  haggle. carefully final deposits detect slyly agai                                                             | false                                  |
> | 16           | MOZAMBIQUE     | 0            | s. ironic, unusual asymptotes wake blithely r                                                                   | false                                  |
> | 24           | UNITED STATES  | 1            | y final packages. slow foxes cajole quickly. quickly silent platelets breach ironic accounts. unusual pinto be  | true
> {code}
> This basically breaks all the Parquet files created by Drill's CTAS with 
> partition support. 
> It will also cause one of the pre-commit functional tests [1] to fail.
> [1] 
> https://github.com/mapr/drill-test-framework/blob/master/framework/resources/Functional/ctas/ctas_auto_partition/general/data/drill3361.q



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
