[ https://issues.apache.org/jira/browse/HIVE-17923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman updated HIVE-17923:
----------------------------------
Description:
given
{noformat}
CREATE TABLE over10k_orc_bucketed(t tinyint,
si smallint,
i int,
b bigint,
f float,
d double,
bo boolean,
s string,
ts timestamp,
`dec` decimal(4,2),
bin binary) CLUSTERED BY(si) INTO 4 BUCKETS STORED AS ORC;
{noformat}
{noformat}
insert into over10k_orc_bucketed select * from over10k
{noformat}
produces 1 data file (bucket 0). It should produce 4, based on the input data.
{noformat}
insert into over10k_orc_bucketed select * from over10k cluster by si
{noformat}
does the right thing.
acid_vectorization_original.q has the full script (HIVE-17458)
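One way to check the file counts described above is sketched below. It assumes that Hive's built-in hash() and pmod() functions approximate the bucketing function for the si column, which may not hold exactly in every Hive version, so the first query is only an estimate of the expected spread. The second query uses the INPUT__FILE__NAME virtual column to list the data files actually written.
{noformat}
-- Estimated spread of the input rows over the 4 buckets
-- (assumption: hash()/pmod() approximate the bucketing function).
SELECT pmod(hash(si), 4) AS expected_bucket, count(*) AS row_count
FROM over10k
GROUP BY pmod(hash(si), 4);

-- Data files actually present in the bucketed table after the insert.
SELECT INPUT__FILE__NAME, count(*) AS row_count
FROM over10k_orc_bucketed
GROUP BY INPUT__FILE__NAME;
{noformat}
If the first query returns rows for more than one bucket while the second returns a single file, that matches the behavior reported here.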
was:
given
{noformat}
CREATE TABLE over10k_orc_bucketed(t tinyint,
si smallint,
i int,
b bigint,
f float,
d double,
bo boolean,
s string,
ts timestamp,
`dec` decimal(4,2),
bin binary) CLUSTERED BY(si) INTO 4 BUCKETS STORED AS ORC;
{noformat}
{noformat}
insert into over10k_orc_bucketed select * from over10k
{noformat}
produces 1 data file (bucket 0). It should produce 4 based on input data.
{noformat}
insert into over10k_orc_bucketed select * from over10k cluster by si
{noformat}
does the right thing.
acid_vectorization_original.q has the full script
> 'cluster by' should not be needed for a bucketed table
> ------------------------------------------------------
>
> Key: HIVE-17923
> URL: https://issues.apache.org/jira/browse/HIVE-17923
> Project: Hive
> Issue Type: Bug
> Affects Versions: 3.0.0
> Reporter: Eugene Koifman
> Priority: Blocker