[jira] [Comment Edited] (TAJO-459) Target tuples shouldn't include the partition keys

Min Zhou (JIRA) Sat, 28 Dec 2013 15:42:54 -0800

    [ 
https://issues.apache.org/jira/browse/TAJO-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13858116#comment-13858116
 ]


Min Zhou edited comment on TAJO-459 at 12/28/13 11:40 PM:
----------------------------------------------------------

I'd like to discuss this, since the other partition types (hash/range/list) 
can't guarantee that the values of the partition columns within one partition 
are all the same. At least I think we should distinguish regular result 
columns from partition columns. I have no idea what will happen if we insert 
into a column partitioned table without selecting the partition key, as in 
the statement below:

{noformat}
insert overwrite into tbl(col1, col2) select l_orderkey, l_quantity from 
lineitem
{noformat}

This would be a bug, I will create another issue for it.
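
To illustrate the first point, here is a minimal sketch in plain Java. It is 
not Tajo code: the class, the method names and the modulo-based hash 
partitioner are assumptions made up for this example. It only shows that a 
hash partition can hold several different key values, while a column 
partition holds exactly one.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class PartitionValueSketch {
    // Hypothetical hash partitioner: many key values can share one bucket.
    static String hashPartition(double key, int numPartitions) {
        return "part-" + Math.floorMod(Double.hashCode(key), numPartitions);
    }

    // Column partitioning as described in the issue: one directory per key value.
    static String columnPartition(double key) {
        return "key=" + key;
    }

    // Collects the distinct key values that end up in each partition.
    static Map<String, Set<Double>> valuesPerPartition(
            double[] keys, boolean byColumn, int numPartitions) {
        Map<String, Set<Double>> result = new TreeMap<>();
        for (double key : keys) {
            String partition = byColumn
                    ? columnPartition(key)
                    : hashPartition(key, numPartitions);
            result.computeIfAbsent(partition, p -> new TreeSet<>()).add(key);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] keys = {1.0, 2.0, 3.0, 4.0, 5.0};
        System.out.println("hash:   " + valuesPerPartition(keys, false, 2));
        System.out.println("column: " + valuesPerPartition(keys, true, 2));
    }
}
```

With more distinct keys than hash buckets, at least one bucket must mix 
values, so the key cannot be dropped from the stored rows there without 
losing information.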


was (Author: coderplay):
I'd like to discuss this, since the other partition types (hash/range/list) 
can't guarantee that the values of the partition columns within one partition 
are all the same. At least I think we should distinguish regular result 
columns from partition columns. I have no idea what will happen if we insert 
into a column partitioned table without selecting the partition key, as in 
the statement below:

{noformat}
insert overwrite into tbl(col1, col2) select l_orderkey, l_quantity from 
lineitem
{noformat}

> Target tuples shouldn't include the partition keys
> --------------------------------------------------
>
>                 Key: TAJO-459
>                 URL: https://issues.apache.org/jira/browse/TAJO-459
>             Project: Tajo
>          Issue Type: Sub-task
>          Components: planner/optimizer
>    Affects Versions: 0.8-incubating
>            Reporter: Min Zhou
>             Fix For: 0.8-incubating
>
>
> Currently, if you create a column partitioned table 
> {noformat}
>  create table tbl (col1 int4, col2 int4, null_col int4) partition by 
> column(key float8)
> {noformat}
> and later insert into it 
> {noformat}
> insert overwrite into tbl(col1, col2, key) select l_orderkey, l_partkey, 
> l_quantity from lineitem
> {noformat}
> From the code of ColumnPartitionedTableStoreExec.java, it seems that we 
> don't distinguish between the real result columns (col1 and col2 here) and 
> the column partition keys; we just store the whole result, including the 
> partition keys. Hive doesn't store them this way, because the values of the 
> partition columns within one partition are all the same. 
> {noformat}
>   public Tuple next() throws IOException {
>     StringBuilder sb = new StringBuilder();
>     while((tuple = child.next()) != null) {
>       // set subpartition directory name
>       sb.delete(0, sb.length());
>       if (partitionColumnIndices != null) {
>         for(int i = 0; i < partitionColumnIndices.length; i++) {
>           Datum datum = tuple.get(partitionColumnIndices[i]);
>           if(i > 0)
>             sb.append("/");
>           sb.append(partitionColumnNames[i]).append("=");
>           sb.append(datum.asChars());
>         }
>       }
>       // add tuple
>       Appender appender = getAppender(sb.toString());
>       // this line adds the whole tuple, including the partition keys,
>       // to the result record
>       appender.addTuple(tuple);
>     }
>   ...
>   }
> {noformat}
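
One possible direction, sketched in plain Java under stated assumptions: 
build the directory name from the partition columns as the loop above does, 
then store only the remaining columns. This is not the actual fix for 
TAJO-459. Plain Object[] rows stand in for Tajo's Tuple, the class and helper 
names are hypothetical, and it assumes the reader can recover the partition 
values from the directory name.

```java
import java.util.Arrays;

class PartitionProjectionSketch {
    // Builds the sub-directory name (column=value, joined by "/"),
    // mirroring the loop in the quoted next() method.
    static String partitionPath(Object[] row, int[] partitionColumnIndices,
                                String[] partitionColumnNames) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < partitionColumnIndices.length; i++) {
            if (i > 0) {
                sb.append("/");
            }
            sb.append(partitionColumnNames[i]).append("=");
            sb.append(row[partitionColumnIndices[i]]);
        }
        return sb.toString();
    }

    // Returns a copy of the row without the partition columns.
    // Assumes the indices are distinct and within the row bounds.
    static Object[] dropPartitionColumns(Object[] row, int[] partitionColumnIndices) {
        boolean[] isPartitionColumn = new boolean[row.length];
        for (int index : partitionColumnIndices) {
            isPartitionColumn[index] = true;
        }
        Object[] stored = new Object[row.length - partitionColumnIndices.length];
        int out = 0;
        for (int i = 0; i < row.length; i++) {
            if (!isPartitionColumn[i]) {
                stored[out++] = row[i];
            }
        }
        return stored;
    }

    public static void main(String[] args) {
        // Made-up row in the order (col1, col2, key) from the insert statement.
        Object[] row = {1, 155190, 17.0};
        int[] indices = {2};
        String[] names = {"key"};
        System.out.println(partitionPath(row, indices, names));
        System.out.println(Arrays.toString(dropPartitionColumns(row, indices)));
    }
}
```

In the quoted method, the equivalent change would be to pass the projected 
row rather than the whole tuple to the appender.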



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
