> On May 24, 2015, 2:03 a.m., Xuefu Zhang wrote:
> > Have you thought about what happens if the client is not interactive, such 
> > as JDBC or Thrift?
> 
> pengcheng xiong wrote:
>     I am sorry that we have not thought about it yet. We admit that the 
> patch will not cover the case when the client is not interactive. Do you have 
> any good ideas that you can share with us? Do you think logging this, in 
> addition to printing a warning message, is good enough? Thanks.
> 
> Xuefu Zhang wrote:
>     There are all kinds of issues with data loading into bucketed tables. 
> While advanced users might be able to load data correctly, I think that's 
> really rare. The data in a bucketed table needs to be generated by Hive. 
> Therefore, I think we should disable "insert into" and "load data 
> into|overwrite" for a bucketed table. We should also disallow external tables 
> for the same reason.
>     
>     To allow the advanced user to achieve what they used to do, we can have a 
> flag, such as "hive.enforce.strict.bucketing", which defaults to true. Those 
> users can proceed by turning this off.
>     
>     Another option for "insert into" would be to support appending new data, 
> as proposed in HIVE-3244.
> 
> Gopal V wrote:
>     Why would you disable "insert into" bucketed tables? How else would ACID 
> work?

Yeah, but I guess we were talking about things outside the context of ACID. Even 
before ACID, a user could do "insert into" a bucketed table, which can be very 
harmful.
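
To make the risk concrete, here is a minimal Python sketch (not Hive code) of 
what goes wrong when a file is placed into a bucketed table as-is. It assumes 
rows are assigned to buckets by hashing the key modulo the bucket count; the 
Java-String-style hash below is only a stand-in, and all function names are 
hypothetical.

```python
def java_style_hash(s):
    """Stand-in 32-bit hash (assumption, not Hive's actual implementation)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def bucket_for(key, num_buckets):
    """Bucket index for a key: non-negative hash modulo the bucket count."""
    return (java_style_hash(key) & 0x7FFFFFFF) % num_buckets


def write_bucketed(rows, num_buckets):
    """Mimic an engine-generated write: one file per bucket, rows routed by hash."""
    files = [[] for _ in range(num_buckets)]
    for key, val in rows:
        files[bucket_for(key, num_buckets)].append((key, val))
    return files


def load_as_is(rows):
    """Mimic a plain file copy: everything lands in a single file, unhashed."""
    return [list(rows)]


def is_correctly_bucketed(files, num_buckets):
    """True only if there is one file per bucket and each row sits in its bucket."""
    if len(files) != num_buckets:
        return False
    return all(
        bucket_for(key, num_buckets) == i
        for i, f in enumerate(files)
        for key, _ in f
    )


rows = [("a", "1"), ("b", "2"), ("c", "3"), ("d", "4")]
loaded = load_as_is(rows)            # what a raw file copy leaves behind
rewritten = write_bucketed(rows, 2)  # what a hash-routed rewrite produces

print(is_correctly_bucketed(loaded, 2))     # False
print(is_correctly_bucketed(rewritten, 2))  # True
```

Anything downstream that trusts the table metadata (2 buckets on key) would 
read the copied file as if it were the hash-routed layout, which is where 
wrong results come from.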


- Xuefu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/34576/#review85082
-----------------------------------------------------------


On May 23, 2015, 5:47 p.m., pengcheng xiong wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/34576/
> -----------------------------------------------------------
> 
> (Updated May 23, 2015, 5:47 p.m.)
> 
> 
> Review request for hive and John Pullokkaran.
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> The bucketed table feature fails in some cases. If the source and 
> destination are bucketed on the same key, but the actual data in the source 
> is not bucketed (because it was loaded using LOAD DATA LOCAL INPATH), then 
> the data won't be bucketed when written to the destination.
> Example
> ----------------------------------------------------------------------
> CREATE TABLE P1(key STRING, val STRING)
> CLUSTERED BY (key) SORTED BY (key) INTO 2 BUCKETS STORED AS TEXTFILE;
> LOAD DATA LOCAL INPATH '/Users/jp/apache-hive1/data/files/P1.txt' INTO TABLE 
> P1;
> -- perform an insert to make sure there are 2 files
> INSERT OVERWRITE TABLE P1 select key, val from P1;
> --------------------------------------------------
> This is not a regression. This has never worked.
> It was only discovered because of Hadoop2 changes.
> In Hadoop1, in local mode, the number of reducers is always 1, regardless of 
> what the app requests. Hadoop2 now honors the number-of-reducers setting in 
> local mode (by spawning threads).
> The long-term solution seems to be to prevent LOAD DATA for bucketed tables.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java e53933e 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/LoadSemanticAnalyzer.java 1a9b42b 
>   ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 623c2e8 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_1.q.out f4522d2 
>   ql/src/test/results/clientnegative/bucket_mapjoin_wrong_table_metadata_2.q.out 9aa9b5d 
>   ql/src/test/results/clientnegative/exim_11_nonpart_noncompat_sorting.q.out 9220c8e 
>   ql/src/test/results/clientpositive/auto_join32.q.out bfc8be8 
>   ql/src/test/results/clientpositive/auto_join_filters.q.out a6720d9 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_1.q.out 383defd 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_11.q.out e6e7ef3 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_12.q.out e9fb705 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_2.q.out c089419 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_3.q.out 6e443fa 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_4.q.out feaea04 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_5.q.out f64ecf0 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_7.q.out e89f548 
>   ql/src/test/results/clientpositive/auto_sortmerge_join_8.q.out 44c037f 
>   ql/src/test/results/clientpositive/bucket_map_join_1.q.out d778203 
>   ql/src/test/results/clientpositive/bucket_map_join_2.q.out aef77aa 
>   ql/src/test/results/clientpositive/bucket_map_join_spark1.q.out 870ecdd 
>   ql/src/test/results/clientpositive/bucket_map_join_spark2.q.out 33f5c46 
>   ql/src/test/results/clientpositive/bucket_map_join_spark3.q.out 067d1ff 
>   ql/src/test/results/clientpositive/bucketcontext_1.q.out 77bfcf9 
>   ql/src/test/results/clientpositive/bucketcontext_2.q.out a9db13d 
>   ql/src/test/results/clientpositive/bucketcontext_3.q.out 9ba3e0c 
>   ql/src/test/results/clientpositive/bucketcontext_4.q.out a2b37a8 
>   ql/src/test/results/clientpositive/bucketcontext_5.q.out 3ee1f0e 
>   ql/src/test/results/clientpositive/bucketcontext_6.q.out d2304fa 
>   ql/src/test/results/clientpositive/bucketcontext_7.q.out 1a105ed 
>   ql/src/test/results/clientpositive/bucketcontext_8.q.out 138e415 
>   ql/src/test/results/clientpositive/bucketizedhiveinputformat_auto.q.out 215efdd 
>   ql/src/test/results/clientpositive/bucketmapjoin1.q.out 72f2a07 
>   ql/src/test/results/clientpositive/bucketmapjoin10.q.out b0e849d 
>   ql/src/test/results/clientpositive/bucketmapjoin11.q.out 4263cab 
>   ql/src/test/results/clientpositive/bucketmapjoin12.q.out bcd7394 
>   ql/src/test/results/clientpositive/bucketmapjoin2.q.out a8d9e9d 
>   ql/src/test/results/clientpositive/bucketmapjoin3.q.out c759f05 
>   ql/src/test/results/clientpositive/bucketmapjoin4.q.out f61500c 
>   ql/src/test/results/clientpositive/bucketmapjoin5.q.out 0cb2825 
>   ql/src/test/results/clientpositive/bucketmapjoin7.q.out 667a9db 
>   ql/src/test/results/clientpositive/bucketmapjoin8.q.out 252b377 
>   ql/src/test/results/clientpositive/bucketmapjoin9.q.out 5e28dc3 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative.q.out 6ae127d 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative2.q.out 4c9f54a 
>   ql/src/test/results/clientpositive/bucketmapjoin_negative3.q.out 9a0bfc4 
>   ql/src/test/results/clientpositive/groupby_sort_1_23.q.out 34cd1ff 
>   ql/src/test/results/clientpositive/groupby_sort_2.q.out b5e52f1 
>   ql/src/test/results/clientpositive/groupby_sort_3.q.out c16911a 
>   ql/src/test/results/clientpositive/groupby_sort_4.q.out a6b1c3d 
>   ql/src/test/results/clientpositive/groupby_sort_5.q.out 369e2b5 
>   ql/src/test/results/clientpositive/groupby_sort_7.q.out 7264695 
>   ql/src/test/results/clientpositive/groupby_sort_8.q.out ec16eb0 
>   ql/src/test/results/clientpositive/groupby_sort_9.q.out e49781a 
>   ql/src/test/results/clientpositive/groupby_sort_skew_1_23.q.out 0d631ce 
>   ql/src/test/results/clientpositive/groupby_sort_test_1.q.out 8c1765d 
>   ql/src/test/results/clientpositive/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/insert_values_orig_table.q.out 684cd1b 
>   ql/src/test/results/clientpositive/join_filters.q.out 4f112bd 
>   ql/src/test/results/clientpositive/join_nulls.q.out 46e0170 
>   ql/src/test/results/clientpositive/mergejoin.q.out cb96ab3 
>   ql/src/test/results/clientpositive/skewjoin_mapjoin11.q.out dd084e8 
>   ql/src/test/results/clientpositive/skewjoinopt19.q.out fd43409 
>   ql/src/test/results/clientpositive/skewjoinopt20.q.out a28e433 
>   ql/src/test/results/clientpositive/smb_mapjoin_1.q.out 9ab334b 
>   ql/src/test/results/clientpositive/smb_mapjoin_10.q.out ea2fa51 
>   ql/src/test/results/clientpositive/smb_mapjoin_2.q.out 379dc0d 
>   ql/src/test/results/clientpositive/smb_mapjoin_25.q.out c0a8959 
>   ql/src/test/results/clientpositive/smb_mapjoin_3.q.out 26fa5d4 
>   ql/src/test/results/clientpositive/smb_mapjoin_4.q.out 9fc7f93 
>   ql/src/test/results/clientpositive/smb_mapjoin_5.q.out 6e6882a 
>   ql/src/test/results/clientpositive/smb_mapjoin_7.q.out 82f5804 
>   ql/src/test/results/clientpositive/spark/auto_join32.q.out 361a968 
>   ql/src/test/results/clientpositive/spark/auto_join_filters.q.out 8934433 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_1.q.out 09d2692 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_12.q.out 8102ec1 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_2.q.out 2ea0a65 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_3.q.out 6281929 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_4.q.out 31e9d86 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_5.q.out 3eceb0b 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_7.q.out ddbca05 
>   ql/src/test/results/clientpositive/spark/auto_sortmerge_join_8.q.out 88d4dcb 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_1.q.out 4e8ce0d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_2.q.out c0a3c3d 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark1.q.out 6230bef 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark2.q.out 1a33625 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_spark3.q.out fed923c 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez1.q.out 65bded2 
>   ql/src/test/results/clientpositive/spark/bucket_map_join_tez2.q.out 33e6d63 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin1.q.out 44f4d0c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin10.q.out 678ad54 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin11.q.out 95606f0 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin12.q.out d6c25e4 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin2.q.out d82480e 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin3.q.out 39552c1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin4.q.out ad2762d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin5.q.out f7c3d4d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin7.q.out 7bfe440 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin8.q.out 4601eb1 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin9.q.out 60bd103 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative.q.out 031c46c 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative2.q.out 4a8f46d 
>   ql/src/test/results/clientpositive/spark/bucketmapjoin_negative3.q.out a09904e 
>   ql/src/test/results/clientpositive/spark/groupby_sort_1_23.q.out cfbce61 
>   ql/src/test/results/clientpositive/spark/groupby_sort_skew_1_23.q.out 9343805 
>   ql/src/test/results/clientpositive/spark/skewjoinopt19.q.out eb9bb84 
>   ql/src/test/results/clientpositive/spark/skewjoinopt20.q.out 22de156 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_1.q.out 1ff1262 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_10.q.out cadf08e 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_2.q.out a0d51f3 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_25.q.out cb811ed 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_3.q.out f46b833 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_4.q.out a421a42 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_5.q.out af65010 
>   ql/src/test/results/clientpositive/spark/smb_mapjoin_7.q.out 622b950 
>   ql/src/test/results/clientpositive/stats11.q.out e51f049 
>   ql/src/test/results/clientpositive/tez/auto_join_filters.q.out 8fde41d 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_1.q.out a275d27 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_11.q.out 6ac74ca 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_12.q.out 8c8a3bf 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_2.q.out 2cb8416 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_3.q.out abeceb8 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_4.q.out 8eb9ce5 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_5.q.out adcc1fa 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_7.q.out 2562cb0 
>   ql/src/test/results/clientpositive/tez/auto_sortmerge_join_8.q.out 31b0a97 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez1.q.out 61c197f 
>   ql/src/test/results/clientpositive/tez/bucket_map_join_tez2.q.out 3f980b6 
>   ql/src/test/results/clientpositive/tez/explainuser_2.q.out f84524b 
>   ql/src/test/results/clientpositive/tez/insert_orig_table.q.out 5eea74d 
>   ql/src/test/results/clientpositive/tez/mergejoin.q.out 97df12a 
>   ql/src/test/results/clientpositive/tez/tez_fsstat.q.out 3fcf68c 
>   ql/src/test/results/clientpositive/tez/tez_smb_1.q.out d970bd9 
>   ql/src/test/results/clientpositive/tez/tez_smb_main.q.out 6183390 
>   ql/src/test/results/clientpositive/udaf_percentile_approx_23.q.out 14a6874 
> 
> Diff: https://reviews.apache.org/r/34576/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> pengcheng xiong
> 
>
