D'oh! My bad. My brain was thinking malloc and lock contention,
I saw the network calls on the stack, connected the dots, and started
typing. I should have looked at the top of the stack. Sorry.

Let's back up a minute. You have a Java application and the JVM has a
large number of threads in LCK time (user lock), right?

plockstat showed a lot of lock activity in libc malloc, at which point
you tried mtmalloc and libumem, neither of which made much of
a difference, right?
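
One note on plockstat: the hold-time output is mostly noise here; the
contention events (spins and blocks) are what matter, which is Jonathan's
point below. Something like:

plockstat -C -e 10 -p [PID_OF_JVM]

(-C reports contention events only; -e 10 runs for 10 seconds.)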

We know empirically (and heard from Jarod) that LCK can be deceiving,
because threads that call cond_wait(3C) to wait on a condition variable
are charged with LCK time, but most of the time it's really sleep time.
So it is sometimes an artifact of the application design that threads show
up at 100% LCK time: threads are waiting for a cond_signal() (or
cond_broadcast()) to be put to work, but there's no work to do.
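
If we want to confirm that, a rough sketch (assuming we attach to the
JVM's PID) is to sum the time threads spend inside cond_wait(3C), per stack:

dtrace -n 'pid$target:libc:cond_wait:entry { self->ts = timestamp; } pid$target:libc:cond_wait:return /self->ts/ { @[jstack(10)] = sum(timestamp - self->ts); self->ts = 0; }' -p [PID_OF_JVM]

Large sums against stacks that are just parked waiting for work would point
at sleep rather than contention. Anyway...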

The problem I thought we were chasing was lock activity in malloc.

If you have more recent data (plockstat, prstat) please pass it to me (sorry
I missed it), and I'll get back in sync on the current problem.

I would add that there's nothing here that I think belongs on
dtrace-discuss. I would start posting to the performance group and a Java
performance alias. Also, and I'm sorry if I missed this, how well (or
poorly) is the Java application performing, and what metric do we have to
determine application performance?

Ricky's script grabs a user stack when a thread goes off CPU to sleep, and
tallies what the threads are sleeping on. It's mostly condition variables,
which doesn't really help us much (everything sleeps on CVs... mostly).
There are a lot of user locks in there as well.
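
For anyone following along, here's a rough sketch of that kind of off-CPU
tally (in the spirit of the DTraceToolkit's whatfor.d; not necessarily
Rickey's exact script):

#!/usr/sbin/dtrace -s

/*
 * Sketch: count what sleeping threads block on, per process.
 * The pr_state/pr_stype values come from <sys/proc.h> and
 * <sys/sobject.h>: 1 == SSLEEP; stype 1/3/5/7 == kernel mutex /
 * condition variable / user-level lock / shuttle.
 */
sched:::off-cpu
/pid == $target && curlwpsinfo->pr_state == 1/
{
        @[execname, pid,
            curlwpsinfo->pr_stype == 1 ? "kernel-level lock" :
            curlwpsinfo->pr_stype == 3 ? "condition variable" :
            curlwpsinfo->pr_stype == 5 ? "user-level lock" :
            curlwpsinfo->pr_stype == 7 ? "shuttle" : "other"] = count();
}

Run it with -p [PID_OF_JVM]. So...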

The CVs look like threads blocking on writes, including writes to a network
socket. You need to figure out the write target, and go from there.

Get the PID of the target JVM -

dtrace -n 'syscall::write:entry /pid == $target/ { @[fds[arg0].fi_pathname] = count(); }' -p [PID_OF_JVM]

The above will list the files being written to. The next step depends on 
what we see.
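
If those writes turn out to be socket writes, one quick sanity check on the
2k theory from my earlier mail is to quantize the write sizes (arg2 is the
byte count passed to write(2)):

dtrace -n 'syscall::write:entry /pid == $target/ { @["write size (bytes)"] = quantize(arg2); }' -p [PID_OF_JVM]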

The same goes for the user locks. If they are indeed malloc locks (which I
think they are), I suspect that will come back around to network IOs larger
than 2k. Try this:

dtrace -n 'pid$target:libc:malloc:entry { @j[jstack()] = count(); }' -p [PID_OF_JVM]
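
That shows which Java stacks are driving the malloc calls. To check whether
the requests really are larger than 2k, quantizing the size argument (arg0
at malloc entry) works too:

dtrace -n 'pid$target:libc:malloc:entry { @["malloc size (bytes)"] = quantize(arg0); }' -p [PID_OF_JVM]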

Thanks,
/jim



Kleyson Rios wrote:
> Hi Jim,
>
> But if there are problems in malloc with buffers for the network, shouldn't
> I see the locks when running plockstat?
>
>  
> Regards,
>  
> ------------------------------------------------------------------
>  
> Kleyson Rios.
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] 
> Sent: Wednesday, April 23, 2008 01:11
> To: Kleyson Rios
> Cc: [email protected]
> Subject: Re: [dtrace-discuss] RES: RES: Process in LCK / SLP (Please)
>
> You may want to cross-post to a Java alias, but I've been down this
> road before.
>
> Java will call into malloc() for buffers for network reads and writes
> that are larger than 2k bytes (the 2k is from memory, and I think it
> was a 1.5 JVM). A large number of malloc calls,
> and the resulting contention on locks in the library, are due to the
> application doing network writes larger than 2k.
>
> Newer JVMs (1.6) may improve this, but I'm not sure. There's also
> an alternative set of classes and methods, NIO, which also can
> help (although I've heard tell that NIO brings other problems along
> with it, but I cannot speak from personal experience).
>
> At this point, I think you need to consult with Java experts to determine
> what options you have for buffer allocation for network IO from the
> Java heap, versus the current behavior of the JVM dropping back
> to malloc for allocating buffers for network IO.
>
> The other option of course is determining if the code can be changed to
> use buffers smaller than 2k.
>
> Thanks,
> /jim
>
>
>
>
> Kleyson Rios wrote:
>   
>> OK Jonathan,
>>
>> I understand.
>>
>> So, looking in the right place now, I can see few locks and sometimes no
>> locks (just mutex holds). But I still have many threads at 100% LCK.
>>
>> If I don't have a lot of locks, where is my problem?
>>
>> Running Rickey C. Weisner's script I get:
>>
>> (...)
>>     25736
>>               libc.so.1`_so_send+0x15
>>               libjvm.so`__1cDhpiEsend6Fipcii_i_+0x67
>>               libjvm.so`JVM_Send+0x32
>>               libnet.so`Java_java_net_SocketOutputStream_socketWrite0+0x131
>>               0xc3c098d3
>>                10
>>     25736
>>               0xc3d2a33a
>>                14
>>     25736
>>               libc.so.1`_write+0x15
>>               libjvm.so`__1cDhpiFwrite6FipkvI_I_+0x5d
>>               libjvm.so`JVM_Write+0x30
>>               libjava.so`0xc8f7c04b
>>                16
>>     25736
>>               libc.so.1`stat64+0x15
>>                21
>>     25736
>>               libc.so.1`_write+0x15
>>               libjvm.so`__1cDhpiFwrite6FipkvI_I_+0x5d
>>               libjvm.so`JVM_Write+0x30
>>               libjava.so`0xc8f80ce9
>>                76
>>   java                       25736  kernel-level lock              1
>>   java                       25736  shuttle                        6
>>   java                       25736  preempted                      7
>>   java                       25736  user-level lock                511
>>   java                       25736  condition variable             748
>>
>>  
>> Regards,
>>  
>> ------------------------------------------------------------------
>>  
>> Kleyson Rios.
>> Technical Support Management
>> Support Analyst / Team Leader
>>  
>>
>> -----Original Message-----
>> From: Jonathan Adams [mailto:[EMAIL PROTECTED] 
>> Sent: Friday, April 18, 2008 15:40
>> To: Kleyson Rios
>> Cc: [email protected]
>> Subject: Re: [dtrace-discuss] RES: Process in LCK / SLP (Please)
>>
>>
>> On Apr 18, 2008, at 1:03 PM, Kleyson Rios wrote:
>>   
>>     
>>> Hi przemol,
>>>
>>> Below is the plockstat output for malloc and libumem. Both show many
>>> locks. Why didn't I get fewer locks after changing to libumem?
>>>
>>>     
>>>       
>> You're looking at mutex hold statistics, which don't mean a lot (unless
>> contention is caused by long hold times).
>>
>> The important thing for multi-threaded performance is *contention*
>> (spinning and blocking). Those are the statistics you should be
>> looking at.
>>
>> Both malloc and libumem use locks to protect their state;  libumem  
>> just uses many locks, in order to reduce contention.
>>
>> Cheers,
>> - jonathan
_______________________________________________
dtrace-discuss mailing list
[email protected]
