[ https://issues.apache.org/jira/browse/PIG-1544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12899221#action_12899221 ]
Thejas M Nair commented on PIG-1544:
------------------------------------
Note that it will not always be possible to determine, at query plan generation
time, how many bags will be present at once during query execution. For example,
a UDF could collect several bags. That use case is likely to be rare, so I don't
think it needs to be considered when estimating the memory size limit. It should
be sufficient to count the places in the query plan where bags are created.
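
As a rough illustration of that estimate, here is a minimal sketch that divides
the shared budget by the number of bag-creating places counted in the plan. The
class name, method name, and parameters are hypothetical and are not part of
Pig; memUsage only stands in for the role of pig.cachedbag.memusage.

```java
// Hypothetical sketch; BagMemoryEstimate and perBagLimit are illustrative
// names, not Pig APIs.
final class BagMemoryEstimate {
    private BagMemoryEstimate() {}

    /**
     * @param maxHeapBytes   heap available to the task, in bytes
     * @param memUsage       fraction of the heap reserved for proactive-spill
     *                       bags (assumed to play the role of
     *                       pig.cachedbag.memusage)
     * @param bagSitesInPlan number of places in the query plan that create bags
     * @return bytes each bag may keep in memory before it spills
     */
    static long perBagLimit(long maxHeapBytes, double memUsage, int bagSitesInPlan) {
        if (maxHeapBytes < 0 || memUsage < 0.0 || memUsage > 1.0) {
            throw new IllegalArgumentException("invalid heap size or memory fraction");
        }
        // Treat a plan with no counted bag sites as having one, to avoid
        // dividing by zero.
        int sites = Math.max(1, bagSitesInPlan);
        long sharedBytes = (long) (maxHeapBytes * memUsage);
        return sharedBytes / sites;
    }
}
```

A caller could pass Runtime.getRuntime().maxMemory() as maxHeapBytes. A UDF that
collects several bags would exceed this estimate, which is the rare case the
estimate deliberately ignores.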
> proactive-spill bags should share the memory alloted for it
> -----------------------------------------------------------
>
> Key: PIG-1544
> URL: https://issues.apache.org/jira/browse/PIG-1544
> Project: Pig
> Issue Type: Bug
> Reporter: Thejas M Nair
>
> Initially, proactive-spill bags were designed for use in (co)group
> (InternalCacheBag). They knew the total number of proactive bags present
> and shared the memory limit specified by the property
> pig.cachedbag.memusage.
> But the two proactive bag implementations that were added later,
> InternalDistinctBag and InternalSortedBag, are not aware of the actual number
> of bags in use; their users always assume total-numbags = 3.
> This needs to be fixed, and all proactive-spill bags should share the
> memory limit.
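
One way to picture the sharing described in the issue is a single budget object
that every proactive-spill bag registers with, so each bag's share follows the
actual number of live bags instead of a fixed count of 3. This is only a sketch
under that assumption; SharedBagBudget and its methods are hypothetical names
and do not describe how Pig implements it.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; not a Pig class.
final class SharedBagBudget {
    private final long totalBytes;
    private final AtomicInteger activeBags = new AtomicInteger();

    SharedBagBudget(long totalBytes) {
        if (totalBytes < 0) {
            throw new IllegalArgumentException("totalBytes must be non-negative");
        }
        this.totalBytes = totalBytes;
    }

    /** Called when a bag of any proactive-spill type is created. */
    void register() {
        activeBags.incrementAndGet();
    }

    /** Called when a bag is no longer in use; the count never drops below zero. */
    void release() {
        activeBags.updateAndGet(n -> Math.max(0, n - 1));
    }

    /** Bytes one bag may currently hold in memory before spilling. */
    long currentShare() {
        int n = Math.max(1, activeBags.get());
        return totalBytes / n;
    }
}
```

With a 900-byte budget, three registered bags would each get 300 bytes, and a
fourth registration would lower every share to 225 bytes.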