You could do something like use regexp_extract to extract just commas,
and then get the length of that. But that would be a little ugly.
But there is also a built in Array type, and there is a "size" udf
that operates on it...so maybe that could work for you.
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Collection_Functions
On Oct 19, 2009, at 1:22 PM, Ryan LeCompte wrote:
Hello all,
Another Hive query question... :)
If I have a column in a table of type STRING, and it can take on a
comma-delimited set of values (arbitrary, and unknown at query
time)... For example: col1 = 'a','b','c' in one row and another row
has col1='j','k','l'... Is it possible to write a query to get the
counts of each comma-delimited value of col1? For example, something
like this:
a b c j k l
5 100 34 1 22 54
I guess I would have to know the full set of possible values of
col1, no? Or maybe a user-defined function?
Thanks,
Ryan