Looking for an elegant way to do this:

Suppose there is a bag of names { James, John, Lisa, Larry, Amanda,
Amanda, John, James, Lisa, John }.
I'd like to get back something along the lines of a tuple (2, 2, 3, 1,
2), where those are the counts for Amanda, James, John, Larry, and Lisa
respectively.

Obviously I could write a UDF to do this, but I want to ensure that
every row has the same columns, i.e. Bag { Amanda } gives me
(1, 0, 0, 0.. ). I could precompute the possible bag entries and pass
them along to the UDF, but is this the only possibility? Anything
better?
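
To make the precompute idea concrete, here is a rough sketch of the logic
in plain Python rather than as an actual Pig UDF. The function name
count_vector and the choice to sort the precomputed entries alphabetically
are just assumptions for illustration:

```python
from collections import Counter

def count_vector(bag, vocabulary):
    """Count each vocabulary entry in bag, in vocabulary order.

    Entries missing from the bag get 0, so every result has the
    same number of columns (hypothetical helper, not a Pig API).
    """
    counts = Counter(bag)
    return tuple(counts[name] for name in vocabulary)

names = ["James", "John", "Lisa", "Larry", "Amanda",
         "Amanda", "John", "James", "Lisa", "John"]

# Precompute the possible entries once; sorting fixes the column order.
vocabulary = sorted(set(names))

print(count_vector(names, vocabulary))       # (2, 2, 3, 1, 2)
print(count_vector(["Amanda"], vocabulary))  # (1, 0, 0, 0, 0)
```

The fixed, sorted list of entries is what keeps the columns aligned from
row to row, which is the part I would like to avoid computing separately.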

Thanks,

Arun
