[ 
https://issues.apache.org/jira/browse/DRILL-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154553#comment-15154553
 ] 

ASF GitHub Bot commented on DRILL-4392:
---------------------------------------

Github user jinfengni commented on the pull request:

    https://github.com/apache/drill/pull/383#issuecomment-186332099
  
    Right. The planner cannot remove that internal field through projection
    removal, because the Writer operator needs that field. It is the writer's
    job to exclude that field from the generated files.
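
To illustrate the point above, here is a minimal sketch of the idea, not Drill's actual writer code. The class `InternalFieldFilter` and the helper `columnsToWrite` are hypothetical names made up for this example. Only the field name is taken from the query output quoted below. The writer still receives the internal field in its input, but skips it when deciding which columns go into the generated files.

```java
import java.util.ArrayList;
import java.util.List;

public class InternalFieldFilter {
    // Internal field name, as it appears in the query output quoted in this issue.
    static final String INTERNAL_FIELD = "P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R";

    // Hypothetical helper (an assumption, not a Drill API): given the names of
    // the incoming columns, return the ones that should be written to the file.
    static List<String> columnsToWrite(List<String> incoming) {
        List<String> kept = new ArrayList<>();
        for (String name : incoming) {
            // The writer may still read the internal field to detect partition
            // boundaries, but it must not emit it as a column.
            if (!INTERNAL_FIELD.equals(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> incoming = List.of(
            "n_nationkey", "n_name", "n_regionkey", "n_comment", INTERNAL_FIELD);
        // Prints the incoming column names without the internal field.
        System.out.println(columnsToWrite(incoming));
    }
}
```

Filtering at the writer rather than in the planner keeps the field available to the operator that needs it while keeping it out of the output schema.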


> CTAS with partition writes an internal field into generated parquet files
> -------------------------------------------------------------------------
>
>                 Key: DRILL-4392
>                 URL: https://issues.apache.org/jira/browse/DRILL-4392
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jinfeng Ni
>            Assignee: Steven Phillips
>            Priority: Blocker
>
> On today's master branch:
> {code}
> select * from sys.version;
> +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
> |     version     |                 commit_id                 |                           commit_message                            |        commit_time         |   build_email   |         build_time         |
> +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
> | 1.5.0-SNAPSHOT  | 9a3a5c4ff670a50a49f61f97dd838da59a12f976  | DRILL-4382: Remove dependency on drill-logical from vector package  | 16.02.2016 @ 11:58:48 PST  | j...@apache.org  | 16.02.2016 @ 17:40:44 PST  |
> +-----------------+-------------------------------------------+---------------------------------------------------------------------+----------------------------+-----------------+----------------------------+
> {code}
> A Parquet table created by Drill's CTAS statement with a partition clause 
> contains one internal field, "P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R". This 
> additional field does not affect non-star queries, but it causes star 
> queries to return incorrect results.
> {code}
> use dfs.tmp;
> create table nation_ctas partition by (n_regionkey) as select * from cp.`tpch/nation.parquet`;
> select * from dfs.tmp.nation_ctas limit 6;
> +--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------+----------------------------------------+
> | n_nationkey  |     n_name     | n_regionkey  |                                                    n_comment                                                    | P_A_R_T_I_T_I_O_N_C_O_M_P_A_R_A_T_O_R  |
> +--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------+----------------------------------------+
> | 5            | ETHIOPIA       | 0            | ven packages wake quickly. regu                                                                                 | true                                   |
> | 15           | MOROCCO        | 0            | rns. blithely bold courts among the closely regular packages use furiously bold platelets?                      | false                                  |
> | 14           | KENYA          | 0            |  pending excuses haggle furiously deposits. pending, express pinto beans wake fluffily past t                   | false                                  |
> | 0            | ALGERIA        | 0            |  haggle. carefully final deposits detect slyly agai                                                             | false                                  |
> | 16           | MOZAMBIQUE     | 0            | s. ironic, unusual asymptotes wake blithely r                                                                   | false                                  |
> | 24           | UNITED STATES  | 1            | y final packages. slow foxes cajole quickly. quickly silent platelets breach ironic accounts. unusual pinto be  | true                                   |
> +--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------+----------------------------------------+
> {code}
> This effectively breaks all the Parquet files created by Drill's CTAS with 
> partition support. 
> It also causes one of the pre-commit functional tests [1] to fail.
> [1] https://github.com/mapr/drill-test-framework/blob/master/framework/resources/Functional/ctas/ctas_auto_partition/general/data/drill3361.q



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
