Olga Natkovich resolved PIG-844.

accumulate interface took care of this.

> PERFORMANCE: streaming data to the UDFs in foreach
> --------------------------------------------------
>                 Key: PIG-844
>                 URL: https://issues.apache.org/jira/browse/PIG-844
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
> Currently, Pig places the data passed to UDFs into a bag. This can cause the 
> process to use more memory than actually needed as in many cases it would be 
> better to push the data one tuple at a time to the UDFs.
> For the case where combiner is invoked, this might not be that important; 
> however, for non-algebraic UDFs as well as other cases where combiner can't 
> be used, this can provide significant memory improvement.
> Another possible use case is where the data is already grouped going into pig 
> and we don't need to group it again.
> How this will effect UDF interface needs to be further discussed.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.

Reply via email to