[
https://issues.apache.org/jira/browse/PIG-40?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12547906
]
Benjamin Reed commented on PIG-40:
----------------------------------
I like the MemoryPoolMXBean approach. The main advantage is doing a simple
boolean check rather than a native method call on every add. It's not clear
that it will be more accurate than getFreeMemory, though. (Do you have reason to
believe otherwise?) There are two things that bother me:
1) It is an optional mechanism. There seem to be quite a few caveats to it
being there and working, so we probably need a fallback mechanism. (Always
spilling seems harsh.)
2) The threshold event only triggers one way. (Seems odd, doesn't it?) The low
memory situation will usually be a transient condition. We should probably poll
periodically to reset the condition.
Since the low-memory state may be used in other places, perhaps we should pull
the memory tracker out into a separate class.
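To make the two concerns concrete, here is a rough sketch of what such a tracker class might look like. It is only an illustration under stated assumptions: the class name MemoryTracker, the poll() method, and the fraction parameter are hypothetical and do not come from the attached files. It uses the MemoryPoolMXBean usage threshold when a heap pool supports it, falls back to Runtime figures otherwise, and relies on the caller to invoke poll() periodically so the flag can be cleared again.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryNotificationInfo;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import javax.management.NotificationEmitter;
import javax.management.NotificationListener;

// Hypothetical standalone tracker; names are illustrative, not from the Pig source.
class MemoryTracker {
    // Cheap flag that callers can read on every add.
    private volatile boolean lowMemory = false;
    // Largest heap pool that supports usage thresholds, or null if none does.
    private MemoryPoolMXBean pool;
    private final double fraction;

    MemoryTracker(double fraction) {
        this.fraction = fraction;
        long bestMax = 0;
        for (MemoryPoolMXBean p : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage u = p.getUsage();   // null if the pool is no longer valid
            if (u == null || p.getType() != MemoryType.HEAP
                    || !p.isUsageThresholdSupported()) {
                continue;
            }
            if (u.getMax() > bestMax) {     // getMax() is -1 when undefined
                bestMax = u.getMax();
                pool = p;
            }
        }
        if (pool != null) {
            pool.setUsageThreshold((long) (bestMax * fraction));
            // The notification only fires when the threshold is crossed upward.
            NotificationListener listener = (n, handback) -> {
                if (MemoryNotificationInfo.MEMORY_THRESHOLD_EXCEEDED.equals(n.getType())) {
                    lowMemory = true;
                }
            };
            ((NotificationEmitter) ManagementFactory.getMemoryMXBean())
                    .addNotificationListener(listener, null, null);
        }
    }

    // Simple boolean check, no native call.
    boolean isLowMemory() {
        return lowMemory;
    }

    // Call periodically: refreshes the flag so a transient condition can reset.
    void poll() {
        if (pool != null) {
            lowMemory = pool.isUsageThresholdExceeded();
        } else {
            // Fallback when no pool supports thresholds: estimate from Runtime.
            Runtime rt = Runtime.getRuntime();
            long used = rt.totalMemory() - rt.freeMemory();
            lowMemory = used > (long) (rt.maxMemory() * fraction);
        }
    }
}
```

A bag would then check tracker.isLowMemory() on each add, and some timer or background thread would call tracker.poll() every so often. Whether the fallback estimate is good enough is exactly the open accuracy question above.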
> Memory management in BigDataBag is probably wrong
> -------------------------------------------------
>
> Key: PIG-40
> URL: https://issues.apache.org/jira/browse/PIG-40
> Project: Pig
> Issue Type: Bug
> Components: impl
> Reporter: Sam Pullara
> Attachments: BigDataBag.java, MemoryUsage.java
>
>
> src/org/apache/pig/data/BigDataBag.java
> 1) You should not use finalizers for things other than external resources --
> using them here is very dangerous and could inadvertently lead to deadlocks
> and object resurrection, and it just decreases performance without any advantage.
> 2) Using .freeMemory() the way it is used in this class is broken.
> freeMemory() is going to return a mostly random number between 0 and the real
> amount. Adding gc() in here is a terrible performance burden. If you really
> want to do something like this you should be using soft references and
> finalization queues.