[
https://issues.apache.org/jira/browse/HIVE-6083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yin Huai updated HIVE-6083:
---------------------------
Description:
I used a CTAS query to create a table stored as ORC with orc.compress set to
SNAPPY. However, the table data was still compressed with ZLIB, even though the
output of DESCRIBE reports orc.compress as SNAPPY. For a CTAS query,
SemanticAnalyzer.genFileSinkPlan uses the CreateTableDesc to generate the
TableDesc for the FileSinkDesc by calling PlanUtils.getTableDesc. However,
PlanUtils.getTableDesc does not appear to assign the user-provided table
properties to the returned TableDesc (CreateTableDesc.getTblProps is not
called in this method).
Note: I only checked the code of 0.12 and trunk.
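To make the missing step concrete, here is a minimal, self-contained Java sketch of the kind of merge that seems to be absent: layering user-supplied TBLPROPERTIES on top of the properties carried by the sink's table descriptor. The class TblPropsSketch and the method withTableProperties are hypothetical stand-ins for illustration only; this is not the code of PlanUtils.getTableDesc and not necessarily how the attached patch addresses the issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration only; these names do not exist in Hive.
class TblPropsSketch {

    /**
     * Returns a copy of the descriptor's base properties with the
     * user-supplied table properties layered on top, so that keys such as
     * "orc.compress" would be visible to whatever writes the files.
     */
    static Properties withTableProperties(Properties base,
                                          Map<String, String> tblProps) {
        Properties merged = new Properties();
        merged.putAll(base);                       // keep serde/format settings
        if (tblProps != null) {
            for (Map.Entry<String, String> e : tblProps.entrySet()) {
                merged.setProperty(e.getKey(), e.getValue()); // user value wins
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Assumed example values; "serialization.format" is only a placeholder.
        Properties base = new Properties();
        base.setProperty("serialization.format", "1");

        Map<String, String> tblProps = new LinkedHashMap<>();
        tblProps.put("orc.compress", "SNAPPY");

        Properties merged = withTableProperties(base, tblProps);
        System.out.println("orc.compress = " + merged.getProperty("orc.compress"));
    }
}
```

In this sketch the user-provided value overrides any default already present, which matches the expectation that the files of a table created with "orc.compress"="SNAPPY" are written with SNAPPY.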
Two examples:
* Snappy compression
{code}
create table web_sales_wrong_orc_snappy
stored as orc tblproperties ("orc.compress"="SNAPPY")
as select * from web_sales;
{code}
{code}
describe formatted web_sales_wrong_orc_snappy;
....
Location:
hdfs://localhost:54310/user/hive/warehouse/web_sales_wrong_orc_snappy
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles 1
numRows 719384
orc.compress SNAPPY
rawDataSize 97815412
totalSize 40625243
transient_lastDdlTime 1387566015
....
{code}
{code}
bin/hive --orcfiledump /user/hive/warehouse/web_sales_wrong_orc_snappy/000000_0
Rows: 719384
Compression: ZLIB
Compression size: 262144
...
{code}
* No compression
{code}
create table web_sales_wrong_orc_none
stored as orc tblproperties ("orc.compress"="NONE")
as select * from web_sales;
{code}
{code}
describe formatted web_sales_wrong_orc_none;
....
Location:
hdfs://localhost:54310/user/hive/warehouse/web_sales_wrong_orc_none
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE true
numFiles 1
numRows 719384
orc.compress NONE
rawDataSize 97815412
totalSize 40625243
transient_lastDdlTime 1387566064
....
{code}
{code}
bin/hive --orcfiledump /user/hive/warehouse/web_sales_wrong_orc_none/000000_0
Rows: 719384
Compression: ZLIB
Compression size: 262144
...
{code}
was:
I used a CTAS query to create a table stored as ORC with orc.compress set to
SNAPPY. However, the table data was still compressed with ZLIB, even though the
output of DESCRIBE reports orc.compress as SNAPPY. For a CTAS query,
SemanticAnalyzer.genFileSinkPlan uses the CreateTableDesc to generate the
TableDesc for the FileSinkDesc by calling PlanUtils.getTableDesc. However,
PlanUtils.getTableDesc does not appear to assign the user-provided table
properties to the returned TableDesc (CreateTableDesc.getTblProps is not
called in this method).
Note: I only checked the code of 0.12 and trunk.
> User provided table properties are not assigned to the TableDesc of the
> FileSinkDesc in a CTAS query
> ----------------------------------------------------------------------------------------------------
>
> Key: HIVE-6083
> URL: https://issues.apache.org/jira/browse/HIVE-6083
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.12.0, 0.13.0
> Reporter: Yin Huai
> Assignee: Yin Huai
> Attachments: HIVE-6083.1.patch.txt