Hi,
Check this out:
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-GROUPingandSORTingonf(column)
Several of the functions listed there behave like MySQL's GROUP_CONCAT.


2016-06-16 7:49 GMT+08:00 Markovitz, Dudu <dmarkov...@paypal.com>:

> Have you tried increasing the heap size? That worked for me.
>
>
>
> E.g. -
>
>
>
> *bash*
>
> mkdir t
>
> awk 'BEGIN{OFS=",";for(i=0;i<10000000;++i){print i,i}}' > t/t.csv
>
> hdfs dfs -put t /tmp
>
> export HADOOP_OPTS="$HADOOP_OPTS -Xmx1024m"
>
>
>
> *hive*
>
> create external table t (i int,s string) row format delimited fields
> terminated by ',' location '/tmp/t';
>
> select i%10,collect_list(s) from t group by i%10;
>
>
>
>
>
> Dudu
>
>
>
> *From:* Mahender Sarangam [mailto:mahender.bigd...@outlook.com]
> *Sent:* Thursday, June 16, 2016 1:47 AM
> *To:* user@hive.apache.org
> *Subject:* Is there any GROUP_CONCAT Function in Hive
>
>
>
> Hi,
>
> We have a Hive table with 3 GB of data, about 1000000 rows. We are looking
> for any functionality in Hive that can perform a GROUP_CONCAT.
>
> We tried to implement GROUP_CONCAT using *Collect_List* and
> *Collect_Set*, but we are getting a heap space error, because each group
> key has around 100000 rows that need to be concatenated.
>
> Is there any direct way to concatenate row data into a single string
> column with GROUP BY?
>
