[ https://issues.apache.org/jira/browse/PIG-3047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13496744#comment-13496744 ]
Prashant Kommireddi commented on PIG-3047:
------------------------------------------
We could possibly do something similar to Hadoop's ShuffleRamManager, minus the
wait()/notify() on memory freeing up:
1. Limit the replicated relation to a % of the max heap/memory available to the JVM
2. The % could be specified by the user
3. The user could also choose to specify a fixed size in bytes instead
4. Ship Pig with a conservative default % to start with
Thoughts?
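
A minimal sketch of the limit described above, in Java. Everything here is
hypothetical and not existing Pig code: the class name ReplicatedSizeLimit,
both method names, and the 10% default are assumptions made for illustration.
Only Runtime.getRuntime().maxMemory() is a real JDK call.

```java
/** Hypothetical helper; not part of Pig. */
final class ReplicatedSizeLimit {
    // Assumed conservative default: 10% of the JVM's max heap.
    static final double DEFAULT_HEAP_FRACTION = 0.10;

    private ReplicatedSizeLimit() {}

    /**
     * Returns the size limit in bytes for a replicated relation.
     * An explicit byte limit (greater than 0) takes precedence;
     * otherwise the limit is heapFraction of maxHeapBytes.
     */
    static long limitBytes(long maxHeapBytes, double heapFraction, long explicitBytes) {
        if (explicitBytes > 0) {
            return explicitBytes;
        }
        if (heapFraction <= 0.0 || heapFraction > 1.0) {
            throw new IllegalArgumentException("heapFraction must be in (0, 1]");
        }
        return (long) (maxHeapBytes * heapFraction);
    }

    /** Limit derived from the running JVM's max heap and the assumed default %. */
    static long defaultLimitBytes() {
        return limitBytes(Runtime.getRuntime().maxMemory(), DEFAULT_HEAP_FRACTION, -1L);
    }

    /** True if the relation is small enough to copy to the distributed cache. */
    static boolean fitsInCache(long relationBytes, long limitBytes) {
        return relationBytes <= limitBytes;
    }
}
```

The check would run before the copy, so a relation that exceeds the limit
could fail fast with a clear error. Note that the heap queried here belongs
to whichever JVM runs the check; whether that is a good proxy for the task
heap is an open question this sketch does not settle.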
> Check the size of a relation before adding it to distributed cache in
> Replicated join
> -------------------------------------------------------------------------------------
>
> Key: PIG-3047
> URL: https://issues.apache.org/jira/browse/PIG-3047
> Project: Pig
> Issue Type: Improvement
> Reporter: Julien Le Dem
>
> Right now, if someone makes a mistake and puts the large relation last, Pig
> will copy a huge file into the distributed cache, and it will take a long time
> before the job eventually fails. It would be better to check, before copying
> the relation, that it is of reasonable size.
> Perhaps < 1 GB?