Unfortunately that's not fixed; it depends on the computations and data
structures you have. In my case, for example, I use more than 2GB because I
need to keep a large matrix in memory. Having said that, in most cases it
should be relatively easy to estimate how much memory you are going to need
and configure that; if that's not possible, you can simply increase the heap
and take the "set and see" approach. Check for memory leaks as well
(unclosed resources and so on!).
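For example, if you estimate your workers need around 2GB, you could bump the
Xmx in worker.childopts in storm.yaml (the 2048m below is just an
illustration; size it to your own estimate):

```yaml
# storm.yaml - example only; pick a value based on your own workload
worker.childopts: "-Xmx2048m -Djava.net.preferIPv4Stack=true"
```

If only one topology needs the bigger heap, I believe you can also override
it per topology via Config.TOPOLOGY_WORKER_CHILDOPTS instead of changing the
cluster-wide default.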
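On the unclosed-resources point: in Java 7+, try-with-resources closes a
resource even when an exception is thrown, which is the easiest way to rule
out that whole class of leak. A minimal sketch (the file and helper name are
just for illustration, not from your topology):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class LeakFreeRead {
    // Reads the first line of a file; the reader is closed automatically,
    // even if readLine() throws, because it is declared in try-with-resources.
    static String firstLine(Path p) throws IOException {
        try (BufferedReader r = Files.newBufferedReader(p)) {
            return r.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("demo", ".txt");
        Files.write(tmp, "hello".getBytes());
        System.out.println(firstLine(tmp)); // prints "hello"
        Files.delete(tmp);
    }
}
```

The same pattern applies to JDBC Connections/Statements in your Postgres
state code, which are a common source of slow heap growth.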

Regards.

A.

On Thu, Mar 5, 2015 at 8:21 PM, Sa Li <[email protected]> wrote:

> Thanks, Nathan. How much should it be in general?
>
> On Thu, Mar 5, 2015 at 10:15 AM, Nathan Leung <[email protected]> wrote:
>
>> Your worker is allocated a maximum of 768mb of heap. It's quite possible
>> that this is not enough. Try increasing Xmx in worker.childopts.
>> On Mar 5, 2015 1:10 PM, "Sa Li" <[email protected]> wrote:
>>
>>> Hi, All
>>>
>>> I have been running a trident topology on production server, code is
>>> like this:
>>>
>>> topology.newStream("spoutInit", kafkaSpout)
>>>         .each(new Fields("str"),
>>>               new JsonObjectParse(),
>>>               new Fields("eventType", "event"))
>>>         .parallelismHint(pHint)
>>>         .groupBy(new Fields("event"))
>>>         .persistentAggregate(PostgresqlState.newFactory(config),
>>>                 new Fields("eventType"),
>>>                 new EventUpdater(),
>>>                 new Fields("eventWord"));
>>>
>>>         Config conf = new Config();
>>>         conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);
>>>
>>>         Config conf = new Config();
>>>         conf.registerMetricsConsumer(LoggingMetricsConsumer.class, 1);
>>>
>>> Basically, it does simple things: get data from Kafka, parse it into
>>> different fields, and write it into Postgres. But in the Storm UI I see
>>> this error: "java.lang.OutOfMemoryError: GC overhead limit exceeded". It
>>> always happens in the same worker of each node - 6703. I understand this
>>> occurs because by default the JVM is configured to throw this error when
>>> *more than 98% of the total time is spent in GC and less than 2% of the
>>> heap is recovered by the GC*.
>>>
>>> I am not sure what the exact cause of the memory leak is; is it OK to
>>> simply increase the heap? Here is my storm.yaml:
>>>
>>> supervisor.slots.ports:
>>>      - 6700
>>>      - 6701
>>>      - 6702
>>>      - 6703
>>>
>>> nimbus.childopts: "-Xmx1024m -Djava.net.preferIPv4Stack=true"
>>>
>>> ui.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
>>>
>>> supervisor.childopts: "-Djava.net.preferIPv4Stack=true"
>>>
>>> worker.childopts: "-Xmx768m -Djava.net.preferIPv4Stack=true"
>>>
>>>
>>> Has anyone had similar issues, and what would be the best way to overcome them?
>>>
>>>
>>> thanks in advance
>>>
>>> AL
>>>
>>>
>>>
>>>
>
