You may try a custom partitioner.

http://pig.apache.org/docs/r0.8.0/piglatin_ref2.html#partitionby
https://issues.apache.org/jira/browse/PIG-282
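
As a rough sketch of what the PARTITION BY clause looks like: the jar
path, the input path, the class name com.example.F1Partitioner, and the
PARALLEL value below are all hypothetical. The class would have to be
your own Hadoop Partitioner implementation. Note that a partitioner only
decides which reducer (and so which part file) each key goes to; it does
not by itself name output directories.

```pig
-- Sketch only: jar path, input path, and partitioner class are hypothetical.
REGISTER /path/to/my-partitioners.jar;

A = LOAD '/input/data' AS (f1:chararray, f2:chararray, f3:chararray);

-- Route each f1 key to a reducer chosen by the custom partitioner.
-- PARALLEL sets the number of reducers; 100 is only an illustrative value.
B = GROUP A BY f1 PARTITION BY com.example.F1Partitioner PARALLEL 100;

STORE B INTO '/result';
```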

Daniel

On 03/08/2011 02:04 PM, Dexin Wang wrote:
Unfortunately, it doesn't work.

It seems to be the same problem as in https://issues.apache.org/jira/browse/PIG-1547

On Tue, Mar 8, 2011 at 1:22 PM, Dexin Wang<[email protected]>  wrote:

awesome. Thanks Shawn.


On Tue, Mar 8, 2011 at 12:34 PM, Xiaomeng Wan<[email protected]>  wrote:

You can use the MultiStorage UDF in piggybank.
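
A minimal sketch, assuming piggybank.jar is available at the
(hypothetical) path below and that f1 is the first field of the
relation. The input path and the schema types are assumptions for
illustration; the two MultiStorage arguments are the parent output path
and the index of the field to split on.

```pig
-- Sketch only: the jar path and the input path are hypothetical.
REGISTER /path/to/piggybank.jar;

A = LOAD '/input/data' AS (f1:chararray, f2:chararray, f3:chararray);

-- MultiStorage(parentPath, splitFieldIndex): '0' selects the first field (f1),
-- so records are written to a separate subdirectory of /result per f1 value.
STORE A INTO '/result'
    USING org.apache.pig.piggybank.storage.MultiStorage('/result', '0');
```

This avoids listing the f1 values in the script, since the output
directories are derived from the data at run time.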

Shawn

On Tue, Mar 8, 2011 at 1:29 PM, Dexin Wang<[email protected]>  wrote:
Is there a way to use STORE with a variable, or some other way to
achieve what I need?

I have something like this:

grunt>  DESCRIBE A;
A: {f1, f2, f3, ...}

grunt>  DUMP A;
(v1, x2, x3, ...)
(v2, x4, x5, ...)
(v1, x6, x6, ...)
...

I do some processing and then group by f1, and would like to save the
result in different directories for different values of f1, like this:

     /result/f1/result_for_v1
     /result/f2/result_for_v2
     /result/f2/result_for_v2
     ...

I know I could use SPLIT, but I have 100+ unique values for f1, and the
number of unique values varies each time I process. It would be nice if
I didn't have to list 100 BY lines with SPLIT, and I certainly do not
want to maintain the list of possible values for f1 in my Pig script.

Thanks!
Dexin


