Re: Welcome Rui Li to Hive PMC

2017-05-26 Thread goun na
Congrats! Thanks in advance! 2017-05-26 3:54 GMT+09:00 Yongzhi Chen : > Congrats Rui! > > On Thu, May 25, 2017 at 1:48 PM, Vineet Garg > wrote: > >> Congrats Rui! >> >> > On May 24, 2017, at 9:19 PM, Xuefu Zhang wrote: >> > >> > Hi

Re: How can i merge multiple rows to one row in sparksql or hivesql?

2017-05-15 Thread goun na
Correction: I had it backwards. collect_list keeps duplicate values; collect_set is the one that eliminates them. 2017-05-16 0:50 GMT+09:00 goun na <gou...@gmail.com>: > Hi, Jone Zhang > > 1. Hive UDF > You might need collect_set or collect_list (to eliminate duplication), but > make sure to reduce its cardinality

Re: How can i merge multiple rows to one row in sparksql or hivesql?

2017-05-15 Thread goun na
Hi, Jone Zhang 1. Hive UDF You might need collect_set or collect_list (to eliminate duplication), but make sure to reduce its cardinality before applying UDFs, as it can cause problems when handling 1 billion records. Union dataset 1,2,3 -> group by user_id1 -> collect_set (feature column) would
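The pipeline outlined in this message (union the datasets, group by user, collect the feature values) could be sketched in HiveQL roughly as follows. The table names dataset_1 through dataset_3 and the columns user_id and feature are hypothetical, chosen only for illustration; this is not the query from the thread.

```sql
-- Hypothetical tables dataset_1..dataset_3, each with columns (user_id, feature).
SELECT
  user_id,
  collect_set(feature)  AS distinct_features,  -- duplicates removed
  collect_list(feature) AS all_features        -- duplicates kept
FROM (
  SELECT user_id, feature FROM dataset_1
  UNION ALL
  SELECT user_id, feature FROM dataset_2
  UNION ALL
  SELECT user_id, feature FROM dataset_3
) merged
GROUP BY user_id;
```

If feature is a string column and a single delimited string per user is wanted instead of an array, wrapping the aggregate as concat_ws(',', collect_set(feature)) is one option.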

Re: Welcome new Hive committer, Zhihai Xu

2017-05-06 Thread goun na
Congratulations! 2017-05-06 14:28 GMT+09:00 Peter Vary : > Congratulations Zhihai! > > On May 5, 2017, at 18:52, "Xuefu Zhang" wrote: > >> Hi all, >> >> I'm very pleased to announce that Hive PMC has recently voted to offer >> Zhihai a committership which he

Re: Experimental results using TPC-DS (versus Spark and Presto)

2017-01-30 Thread goun na
Thanks for sharing benchmark results. May I ask why you chose ORC? 2017-01-30 19:57 GMT+09:00 김동원 : > Hi, > > Recently I did some experiments using Hive, Spark, and Presto using TPC-DS > benchmark > and I'd like to share the result with the community: http://www. >

Re: Hive ORC Table

2017-01-22 Thread goun na
Mahender Sarangam <mahender.bigd...@outlook.com>: > Yes, I tried the option below, but I'm not sure about the workload (data > ingestion). I can't go with a fixed, hard-coded value; I would like to know > the reason for getting 1009 reducer tasks. > > On 1/20/2017 7:45 PM, goun na w

Re: Hive ORC Table

2017-01-20 Thread goun na
Hi Mahender Sarangam, 1st: Did the following option not work in Tez? set mapreduce.job.reduces=100 or set mapred.reduce.tasks=100 (deprecated) 2nd: Possible data skew, which sometimes happens when handling nulls. Goun 2017-01-21 9:58 GMT+09:00 Mahender Sarangam
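As a rough sketch of the session settings involved: besides pinning the reducer count as suggested above, Hive can estimate it from the input size, bounded by hive.exec.reducers.max. That property defaults to 1009 in Hive 0.14 and later, so a job that lands on exactly 1009 reducers may simply be hitting the cap; whether that applies to the job in this thread is an assumption. The numeric values below are illustrative only, not recommendations.

```sql
-- Option A: force a fixed number of reducers for the session.
SET mapred.reduce.tasks=100;

-- Option B: let Hive estimate the count and tune the inputs to that estimate.
SET mapred.reduce.tasks=-1;                           -- -1 lets Hive decide
SET hive.exec.reducers.bytes.per.reducer=268435456;   -- target input bytes per reducer
SET hive.exec.reducers.max=500;                       -- upper bound on the estimate
```

Option B avoids a hard-coded reducer count: the estimate grows with the input size until it reaches the configured maximum.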

Re: how to load ORC file into hive orc table

2016-12-17 Thread goun na
One thing: a timestamp column usually has high cardinality. It is not a good choice of partition key because it creates too many partitions. 2016-12-17 23:34 GMT+09:00 Elliot West : > It looks as though your table is partitioned yet perhaps you haven't > accounted for this when adding the data?
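One common way to act on this advice is to partition by a coarser value derived from the timestamp, such as the calendar date. The sketch below assumes a hypothetical source table events_raw(event_ts TIMESTAMP, payload STRING); none of these names come from the thread.

```sql
-- Hypothetical target table: one partition per day instead of one per timestamp.
CREATE TABLE events_orc (
  event_ts TIMESTAMP,
  payload  STRING
)
PARTITIONED BY (event_date STRING)
STORED AS ORC;

-- Allow partitions to be created from the data being inserted.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- The dynamic partition column must come last in the SELECT list.
INSERT INTO TABLE events_orc PARTITION (event_date)
SELECT event_ts, payload, to_date(event_ts) AS event_date
FROM events_raw;
```

The full timestamp stays available as a regular column, while the partition count grows by one per day of data.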

Re: [ANNOUNCE] New Hive Committer - Rajesh Balamohan

2016-12-14 Thread goun na
Congrats!! 2016-12-15 7:50 GMT+09:00 Gunther Hagleitner : > Congrats Rajesh! > > From: Jimmy Xiang > Sent: Wednesday, December 14, 2016 11:38 AM > To: user@hive.apache.org > Cc: d...@hive.apache.org;

Drop columns in ORC managed table

2016-01-24 Thread goun na
Hi users, The following drop column syntax does not work. > alter table test_db.test_table drop column col_1; FAILED: ParseException line 1:41 mismatched input 'column' expecting PARTITION near 'drop' in drop partition statement According to the Hive manual, REPLACE COLUMNS can be used to drop
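For reference, the REPLACE COLUMNS form mentioned in the manual looks roughly like the sketch below. The remaining columns col_2 and col_3 and their types are hypothetical, since the table's actual schema is not given in the message.

```sql
-- Hypothetical: test_table currently has (col_1 INT, col_2 STRING, col_3 DOUBLE).
-- Listing only the columns to keep removes col_1 from the table definition.
ALTER TABLE test_db.test_table REPLACE COLUMNS (
  col_2 STRING,
  col_3 DOUBLE
);
```

Two caveats: the statement changes table metadata only and does not rewrite the data files, and the Hive manual restricts REPLACE COLUMNS to tables with a native SerDe, so it may be rejected or behave differently on an ORC table depending on the Hive version.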

Re: Grouping sets with table alias causes parse exception

2016-01-21 Thread goun na
I found that it is already well described and fixed in Hive 1.2. Thanks! Parsing Error in GROUPING SETS https://issues.apache.org/jira/browse/HIVE-6950 2016-01-18 19:28 GMT+09:00 goun na <gou...@gmail.com>: > Hi, Users > > While converting legacy Oracle SQL to HiveQL using

Grouping sets with table alias causes parse exception

2016-01-18 Thread goun na
erver2 *org.apache.hive.service.cli.HiveSQLException:Error while compiling statement: FAILED: ParseException line 1:162 missing ) at ',' near ')' line 1:172 missing EOF at ',' near ')':28:27 Thanks, Goun Na
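The query from this post is not preserved in the archive snippet, so the following is only an illustrative sketch of the pattern named in the subject line (alias-qualified columns inside GROUPING SETS) and one possible workaround for Hive versions that still have the parse error. The table sales(dept, region, amount) is hypothetical, and the workaround is an assumption rather than something confirmed in the thread.

```sql
-- Hypothetical table: sales(dept STRING, region STRING, amount DOUBLE).

-- Pattern reported to fail: table-alias-qualified columns in GROUPING SETS.
-- SELECT s.dept, s.region, SUM(s.amount)
-- FROM sales s
-- GROUP BY s.dept, s.region
-- GROUPING SETS ((s.dept, s.region), (s.dept));

-- Possible workaround (assumption): refer to the columns without the alias.
SELECT dept, region, SUM(amount) AS total_amount
FROM sales
GROUP BY dept, region
GROUPING SETS ((dept, region), (dept));
```

When the query joins several tables and the column names collide, projecting them under unique names in a subquery first would let the outer GROUP BY use unqualified names.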