Alan means return a tuple that contains a single bag of many tuples.
Don't try to make Pig work with a loader that returns a bag instead of
a tuple; you'll be up to your neck in the visitor pattern in no time if
you start heading in that direction.
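
To make the shape concrete, here is a rough sketch in plain Java. It deliberately does not use Pig's classes: a "tuple" is modelled as a List<Object> and a "bag" as a list of tuples, and the class name, method names, and the ';' and ',' separators are all made up for illustration. In a real loader you would build the same structure with Pig's TupleFactory and BagFactory, and the script would then typically un-nest the bag with FLATTEN.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-ins (assumption, not Pig's API):
// a "tuple" is a List<Object>, a "bag" is a List of tuples.
class BagInTupleSketch {

    // Turn one input line into ONE outer tuple whose only field is a bag
    // holding every inner tuple parsed from that line.
    static List<Object> toRecord(String line) {
        List<List<Object>> bag = new ArrayList<>();
        for (String part : line.split(";")) {        // hypothetical record separator
            List<Object> inner = new ArrayList<>();
            for (String field : part.split(",")) {   // hypothetical field separator
                inner.add(field.trim());
            }
            bag.add(inner);
        }
        List<Object> outer = new ArrayList<>();
        outer.add(bag);                              // the single field is the bag
        return outer;
    }

    // Models what un-nesting field 0 gives you downstream: one row per inner tuple.
    @SuppressWarnings("unchecked")
    static List<List<Object>> flatten(List<Object> outer) {
        return new ArrayList<>((List<List<Object>>) outer.get(0));
    }

    public static void main(String[] args) {
        List<Object> record = toRecord("a,1;b,2;c,3");
        System.out.println(record.size());    // the outer tuple has exactly one field
        System.out.println(flatten(record));  // three inner tuples after flattening
    }
}
```

The point is that getNext() still returns exactly one Tuple per call, so nothing inside Pig has to change.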

The alternative is to change what constitutes a record for your loader:
use a different InputFormat/RecordReader to produce the records as
needed, instead of feeding you lines.
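
A rough sketch of that second idea, again in plain Java rather than against the Hadoop API: a reader that consumes lines but hands out one logical record per call, buffering the extras between calls. The class name, the next() method, and the ';' separator are assumptions for illustration only; a real RecordReader would do the equivalent buffering behind its own interface.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Stand-in for a custom record reader (assumption, not the Hadoop interface).
class MultiRecordReaderSketch {
    private final Iterator<String> lines;
    private final Deque<String> pending = new ArrayDeque<>();

    MultiRecordReaderSketch(Iterator<String> lines) {
        this.lines = lines;
    }

    // Returns the next record, or null once the input is exhausted.
    String next() {
        // Refill the buffer from the next line(s) only when it has run dry.
        while (pending.isEmpty() && lines.hasNext()) {
            for (String rec : lines.next().split(";")) {  // hypothetical separator
                if (!rec.isEmpty()) {
                    pending.add(rec);
                }
            }
        }
        return pending.poll();
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a,1;b,2", "c,3");
        MultiRecordReaderSketch reader = new MultiRecordReaderSketch(input.iterator());
        for (String rec = reader.next(); rec != null; rec = reader.next()) {
            System.out.println(rec);  // one record per call, regardless of line boundaries
        }
    }
}
```

With this arrangement the loader sees one record per call and can keep returning a single flat tuple each time.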

-D

On Thu, Oct 28, 2010 at 8:36 AM, John Hui <[email protected]> wrote:
> I looked into returning a data bag as an option.  The problem is that the
> Loader interface requires me to return a Tuple object.
>
>   public Tuple getNext() throws IOException {
>
> but the DataBag interface is not derived from Tuple, so this means I
> would need to change Pig's internal code for my loader to return a bag
> of tuples.  Right?
>
> John
>
> On Wed, Oct 27, 2010 at 6:00 PM, John Hui <[email protected]> wrote:
>
>> Hi Pig Users,
>>
>> I am currently writing a UDF loader.  In one of my use cases, one line in
>> the input stream results in multiple tuples.  Has anyone encountered or
>> solved this issue on their end?
>>
>> With the current structure of the code, the getNext method only returns a
>> single tuple, but I want it to return a List<tuple>.  Let me know if there
>> is a use case out there like mine; I am coding it up to return List<tuple>,
>> which is more flexible than returning only one tuple.
>>
>> Thanks,
>>
>> John
>>
>
