You could do something like use regexp_extract to extract just commas, and then get the length of that. But that would be a little ugly.

But there is also a built in Array type, and there is a "size" udf that operates on it...so maybe that could work for you.

http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Collection_Functions

On Oct 19, 2009, at 1:22 PM, Ryan LeCompte wrote:

Hello all,

Another Hive query question... :)

If I have a column in a table of type STRING, and it can take on a comma-delimited set of values (arbitrary, and unknown at query time)... For example: col1 = 'a','b','c' in one row and another row has col1='j','k','l'... Is it possible to write a query to get the counts of each comma-delimited value of col1? For example, something like this:


a  b   c    j   k   l
5 100 34  1  22 54

I guess I would have to know the full set of possible values of col1, no? Or maybe a user-defined function?

Thanks,
Ryan


Reply via email to