Looking for an elegant way to do this:
Suppose there is a bag with names { James, John, Lisa, Larry, Amanda,
Amanda, John, James, Lisa, John}
I'd like to get something back along the lines of a tuple (2, 2, 3, 1,
2), where those are the counts for Amanda, James, John, Larry, and Lisa
respectively.
Obviously I could write a UDF to do this, but I want to ensure that
every row has the same columns, i.e. the bag { Amanda } should give me
(1, 0, 0, 0, 0). I could precompute the possible bag entries and pass
them along to the UDF, but is this the only possibility? Is there
anything better?
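
To make the precomputed-entries idea concrete, here is a rough sketch of
the logic in plain Python. It is not an actual Pig UDF; the function name
count_vector and the vocabulary list are made up for illustration, and I
am assuming the full set of possible names is known up front.

```python
from collections import Counter

def count_vector(bag, vocabulary):
    """Return one count per vocabulary entry, in sorted vocabulary order.

    Names that do not occur in the bag get 0, so every row has the same
    number of columns. Names outside the vocabulary are ignored.
    """
    counts = Counter(bag)
    return tuple(counts[name] for name in sorted(vocabulary))

# Hypothetical precomputed list of every name that can occur.
vocabulary = ["Amanda", "James", "John", "Larry", "Lisa"]

bag = ["James", "John", "Lisa", "Larry", "Amanda",
       "Amanda", "John", "James", "Lisa", "John"]

print(count_vector(bag, vocabulary))         # (2, 2, 3, 1, 2)
print(count_vector(["Amanda"], vocabulary))  # (1, 0, 0, 0, 0)
```

The column order comes from sorting the vocabulary, so it stays the same
for every row regardless of what each bag contains.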
Thanks,
Arun