To recap, there seem to be three distinct problems:

1) Within a running cluster, adding new nodes causes a lot of object
faults.
2) Upon restart of the whole cluster, the active master does a lot of
faulting, eventually going OOM.
3) Upon restart, the passive also goes OOM while trying to sync up.

I tried setting the master eviction percentage to 25, and this seems to
help, but there is still a lot of faulting activity ...

Both masters have 1GB of heap, which I think is a reasonable amount of
memory for holding just a few GB of data.
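
To put rough numbers on that, here is a back-of-the-envelope sketch in
Java. It assumes the on-disk objectdb size (1.9GB, from my setup below)
is a fair stand-in for the live data size, which may well be wrong, and
the class and method names are made up for illustration only; nothing
here is a Terracotta API.

```java
// Rough estimate only; all sizes are in GB.
// Assumption: the on-disk objectdb size approximates the live data size.
final class HeapEstimate {

    // Upper bound on the fraction of the data set that can be resident in heap.
    static double residentFraction(double heapGb, double dataGb) {
        if (heapGb <= 0 || dataGb <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        return Math.min(1.0, heapGb / dataGb);
    }

    // Lower bound on the data (GB) that cannot fit in heap at the same time
    // and therefore has to be faulted in from disk when it is accessed.
    static double mustFaultGb(double heapGb, double dataGb) {
        return Math.max(0.0, dataGb - heapGb);
    }

    public static void main(String[] args) {
        double heapGb = 1.0; // heap per master
        double dataGb = 1.9; // objectdb size on disk
        System.out.printf("resident fraction <= %.2f%n", residentFraction(heapGb, dataGb));
        System.out.printf("not resident      >= %.1f GB%n", mustFaultGb(heapGb, dataGb));
    }
}
```

So even in the best case only about half of the data can sit in the
master heap at once, and the real figure is lower because the server
needs heap for its own work too. Some faulting is therefore expected;
what I don't understand is why it ends in OOM.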

Any help, maybe some tuning tips?

Sergio Bossa

On 17 Nov 2010, at 18:36, Sergio Bossa <sergio.bo...@gmail.com> wrote:

> I tried waiting several minutes for the masters to start up and stop
> logging faults, then started the clients and tried working with them.
> But after a few minutes of usage, the masters crash with an OOME ...
>
> On Wed, Nov 17, 2010 at 6:01 PM, Sergio Bossa <sergio.bo...@gmail.com> wrote:
>> After restarting all masters and clients, the cluster is completely
>> unusable: masters are very busy and clients cannot connect at all.
>> I'm attaching the active master logs, where I see lots of object
>> faulting and strange messages from the ObjectManager.
>>
>> Here is some info about my setup:
>> - Terracotta 3.4.0.
>> - 1.9GB of objectdb (on disk).
>> - 1 active and 1 passive master, each one with 1GB of heap.
>> - 2 clients, each one with 1.5GB.
>>
>> On Wed, Nov 17, 2010 at 4:36 PM, Sergio Bossa <sergio.bo...@gmail.com> wrote:
>>> Hi guys,
>>>
>>> when I try to connect a new client to a Terracotta master holding a
>>> few million objects (I can give you an approximate size if
>>> needed), the client fails to connect with a timeout and the master
>>> gets very busy faulting in objects to send to the new client.
>>> I'm attaching a thread dump of the master: you'll notice several
>>> threads busy faulting in objects (see
>>> managed_object_fault_stage).
>>> Two questions:
>>> 1) Why is the master trying to fault so many objects at client
>>> connection?
>>> 2) Why is the master still busy in the fault stage even *after* the
>>> client is completely disconnected?
>>>
>>> --
>>> Sergio Bossa
>>> http://www.linkedin.com/in/sergiob
_______________________________________________
tc-dev mailing list
tc-dev@lists.terracotta.org
http://lists.terracotta.org/mailman/listinfo/tc-dev
