[ https://issues.apache.org/jira/browse/IMPALA-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16675907#comment-16675907 ]
ASF subversion and git services commented on IMPALA-402:
--------------------------------------------------------
Commit 58cd69ac48d4014ef956a7df9dce63c0b8f122c4 in impala's branch
refs/heads/master from [[email protected]]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=58cd69a ]
IMPALA-402: test for random partitioning in insert

This adds a basic regression test for the bug reported in IMPALA-402.

Testing:
Exhaustive build.
Looped the modified test overnight.

Change-Id: I4bbca5c64977cadf79dabd72f0c8876a40fdf410
Reviewed-on: http://gerrit.cloudera.org:8080/11799
Reviewed-by: Impala Public Jenkins <[email protected]>
Tested-by: Impala Public Jenkins <[email protected]>
> Add test for dynamic partition expr involving rand()
> ----------------------------------------------------
>
> Key: IMPALA-402
> URL: https://issues.apache.org/jira/browse/IMPALA-402
> Project: IMPALA
> Issue Type: Improvement
> Components: Infrastructure
> Affects Versions: Impala 1.0, Impala 2.5.0, Impala 2.6.0, Impala 2.7.0,
> Impala 2.8.0, Impala 2.9.0
> Environment: CentOS 6.3
> Reporter: Benyi Wang
> Assignee: Tim Armstrong
> Priority: Major
> Fix For: Impala 3.1.0
>
>
> I found two problems:
> * "INSERT OVERWRITE TABLE" doesn't clean up the directory of an external table:
> {code}
> $ hadoop fs -ls -R /user/benyiw/tmp_abc;
> drwxr-xr-x - impala supergroup 0 2013-06-06 12:46 /user/benyiw/tmp_abc/slot=1
> -rw-r--r-- 2 impala supergroup 16088 2013-06-06 12:46 /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119435_641430213_data.0
> -rw-r--r-- 2 impala supergroup 100691 2013-06-06 12:46 /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119436_1260163059_data.0
> -rw-r--r-- 2 impala supergroup 43875 2013-06-06 12:46 /user/benyiw/tmp_abc/slot=1/3456606565886086588--5331466032849119437_929705780_data.0
> drwxr-xr-x - impala supergroup 0 2013-06-06 12:40 /user/benyiw/tmp_abc/slot=2
> -rw-r--r-- 2 impala supergroup 8 2013-06-06 12:40 /user/benyiw/tmp_abc/slot=2/-8660787917599456385--5527614477985301990_1328141055_data.0
> drwxr-xr-x - impala supergroup 0 2013-06-06 12:40 /user/benyiw/tmp_abc/slot=3
> -rw-r--r-- 2 impala supergroup 8 2013-06-06 12:40 /user/benyiw/tmp_abc/slot=3/-8660787917599456385--5527614477985301990_501684742_data.0
> drwxr-xr-x - impala supergroup 0 2013-06-06 12:47 /user/benyiw/tmp_abc/slot=b
> -rw-r--r-- 2 impala supergroup 16130 2013-06-06 12:47 /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146409_792816606_data.0
> -rw-r--r-- 2 impala supergroup 100728 2013-06-06 12:47 /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146410_157404218_data.0
> -rw-r--r-- 2 impala supergroup 43796 2013-06-06 12:47 /user/benyiw/tmp_abc/slot=b/705210285518833776--6520969021873146411_157404218_data.0
> {code}
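> * When I ran the following statements, all output files were put into the same
> partition (slot=a), even though the CASE expression over rand() should spread
> rows across slots a, b, and c.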
> {code}
> create table tmp_abc (
> customer_id string,
> email string
> ) partitioned by (slot string)
> row format delimited fields terminated by '\t' lines terminated by '\n'
> stored as TextFile
> location '/user/benyiw/tmp_abc';
> insert overwrite table tmp_abc partition (slot)
> select customer_id, email,
>   case when slot1 < 0.10 then "a" when slot1 < 0.70 then "b" else "c" end as slot
> from (
>   select customer_id, email, rand() as slot1
>   from (
>     select customer_id, max(email) as email,
>       sum(case when seg_num >= 0 then 1 else 0 end) as included
>     from customers
>     where ( (seg_num in (1) and member = 'Y') or (seg_num = -1) )
>       and site_key = 'a_site' and coll_def_id = 'everything'
>     group by customer_id
>     having included > 0
>   ) a
> ) b
> {code}
> {code}
> $ hadoop fs -ls -R /user/benyiw/tmp_abc;
> drwxr-xr-x - impala supergroup 0 2013-06-06 13:01 /user/benyiw/tmp_abc/slot=a
> -rw-r--r-- 2 impala supergroup 16021 2013-06-06 13:01 /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985492_909811936_data.0
> -rw-r--r-- 2 impala supergroup 100713 2013-06-06 13:01 /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985493_272258764_data.0
> -rw-r--r-- 2 impala supergroup 43920 2013-06-06 13:01 /user/benyiw/tmp_abc/slot=a/-7883266034308591983--7883771993317985494_272258764_data.0
> {code}
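
The CASE expression in the INSERT above maps a random value onto three slots, so a correct dynamic-partition insert over many rows would be expected to produce slot=a, slot=b, and slot=c directories rather than a single one. The following is a minimal Python sketch of that expected split, assuming rand() is uniform on [0, 1) and is evaluated once per row. The names slot_for and simulate are hypothetical helpers for illustration only; they are not part of Impala or of the regression test added by the commit.

```python
import random
from collections import Counter


def slot_for(r: float) -> str:
    """Mirror the CASE expression from the report:
    r < 0.10 -> "a", r < 0.70 -> "b", otherwise "c"."""
    if r < 0.10:
        return "a"
    if r < 0.70:
        return "b"
    return "c"


def simulate(num_rows: int, seed: int = 0) -> Counter:
    """Draw one uniform value per row (assumed rand() behavior)
    and count how many rows fall into each slot."""
    rng = random.Random(seed)
    return Counter(slot_for(rng.random()) for _ in range(num_rows))


counts = simulate(10_000)
# With per-row random values, all three slots should be populated.
print(sorted(counts.items()))
```

Under these assumptions roughly 10% of rows land in a, 60% in b, and 30% in c. A listing that contains only one slot directory, like the one above, is inconsistent with a per-row random value.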