[
https://issues.apache.org/jira/browse/HIVE-707?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12736695#action_12736695
]
Namit Jain commented on HIVE-707:
---------------------------------
I agree that 1. is a problem: we don't have any good way to handle the ORDER BY.
A workaround would be to sort the results in a sub-query first and then apply
group_concat outside the sub-query. Syntactically, though, that is more painful.
Is there a requirement for the ordering?
The separator can be handled as a configuration parameter, like the maximum length.
The other option is to treat group_concat() specially, all the way from the parser.
I don't like that because it is kind of hacky.
I think we should first find out whether anyone needs ORDER BY in
group_concat and then think about it.
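To make the sub-query workaround and the configurable separator concrete, here is a rough sketch outside of Hive, in plain Python. It is only an illustration of the idea, not Hive code: the function name, the `separator` and `max_len` parameters, and their default values are all assumptions made up for this example. Note that truncating after the join does nothing for the memory problem with skewed groups; it only shows where a length limit would apply.

```python
from itertools import groupby
from operator import itemgetter


def group_concat(rows, separator=",", max_len=1024):
    """Concatenate string values per key, in sorted order.

    rows: iterable of (key, value) pairs, where value is a string.
    separator, max_len: hypothetical stand-ins for configuration parameters.
    """
    # Step 1 (the "sub-query"): sort by key, then by value within each key.
    ordered = sorted(rows, key=itemgetter(0, 1))

    result = {}
    # Step 2 (the outer aggregation): concatenate the already-sorted values.
    for key, group in groupby(ordered, key=itemgetter(0)):
        joined = separator.join(value for _, value in group)
        # Truncate to the configured maximum length.
        result[key] = joined[:max_len]
    return result


rows = [("a", "z"), ("b", "y"), ("a", "x"), ("a", "y")]
print(group_concat(rows))
```

Because the ordering is done before the aggregation, the aggregate itself needs no ORDER BY support; it just appends values in the order they arrive.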
> add group_concat
> ----------------
>
> Key: HIVE-707
> URL: https://issues.apache.org/jira/browse/HIVE-707
> Project: Hadoop Hive
> Issue Type: New Feature
> Components: Query Processor
> Reporter: Namit Jain
> Assignee: Min Zhou
>
> Moving the discussion to a new jira:
> I've implemented group_cat() in a rush, and found some things difficult to
> solve:
> 1. group_cat() has an internal ORDER BY clause; currently, we can't
> implement such an aggregation in Hive.
> 2. When the strings to be concatenated in a group are too large, in other words,
> if data skew appears, there is often not enough memory to store such a big
> result.