Re: [dtrace-discuss] RES: RES: Process in LCK / SLP (Please)

2008-04-22 Thread Jim Mauro
You may want to cross-post to a Java alias, but I've been down this
road before.

Java calls into malloc() to get buffers for network reads and writes
larger than 2 KB (the 2 KB figure is from memory, and I think it was
a 1.5 JVM). The large number of malloc() calls, and the resulting
contention on locks in the library, come from the application doing
network writes larger than 2 KB.
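
As a rough illustration of that pattern (this is not the JVM's actual source),
the C sketch below uses a fixed on-stack buffer for small transfers and falls
back to malloc() for larger ones. The 2048-byte threshold echoes the "2k"
recollection above; it and the names copy_via_buffer and heap_allocs are
assumptions made up for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold, taken from the "2k" recollection. */
#define STACK_BUF_LEN 2048

/* Counts how often the sketch had to fall back to malloc(). */
static unsigned long heap_allocs;

/*
 * Copy len bytes from src to dst through a temporary buffer, the way a
 * native write path might stage data before handing it to send()/write().
 * Small transfers use the stack; larger ones go to the heap allocator.
 */
static int
copy_via_buffer(const char *src, size_t len, char *dst)
{
	char stackbuf[STACK_BUF_LEN];
	char *buf = stackbuf;

	if (len > sizeof (stackbuf)) {
		buf = malloc(len);	/* allocator locks are taken here */
		if (buf == NULL)
			return (-1);
		heap_allocs++;
	}

	memcpy(buf, src, len);
	memcpy(dst, buf, len);		/* stand-in for the real send()/write() */

	if (buf != stackbuf)
		free(buf);		/* and again here */
	return (0);
}
```

In this sketch every transfer above the threshold pays for a malloc()/free()
pair, which is where many threads doing large writes would meet in the
allocator's locks.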

Newer JVMs (1.6) may improve this, but I'm not sure. There's also
an alternative set of classes and methods, NIO, which can help
(although I've heard that NIO brings other problems along with it,
I cannot speak from personal experience).

At this point, I think you need to consult with Java experts to determine
what options you have for buffer allocation for network IO from the
Java heap, versus the current behavior of the JVM dropping back
to malloc for allocating buffers for network IO.

The other option of course is determining if the code can be changed to
use buffers smaller than 2k.

Thanks,
/jim




Kleyson Rios wrote:
> OK Jonathan,
>
> I understand.
>
> So, looking in the right place now, I can see few locks and sometimes no locks
> (just Mutex Hold). But I still have many threads at 100% LCK.
>
> If I don't have a lot of locks, where is my problem?
>
> Running Rickey C. Weisner's script I get:
>
> (...)
> 25736
>   libc.so.1`_so_send+0x15
>   libjvm.so`__1cDhpiEsend6Fipcii_i_+0x67
>   libjvm.so`JVM_Send+0x32
>   libnet.so`Java_java_net_SocketOutputStream_socketWrite0+0x131
>   0xc3c098d3
>    10
> 25736
>   0xc3d2a33a
>    14
> 25736
>   libc.so.1`_write+0x15
>   libjvm.so`__1cDhpiFwrite6FipkvI_I_+0x5d
>   libjvm.so`JVM_Write+0x30
>   libjava.so`0xc8f7c04b
>    16
> 25736
>   libc.so.1`stat64+0x15
>    21
> 25736
>   libc.so.1`_write+0x15
>   libjvm.so`__1cDhpiFwrite6FipkvI_I_+0x5d
>   libjvm.so`JVM_Write+0x30
>   libjava.so`0xc8f80ce9
>    76
>   java   25736  kernel-level lock      1
>   java   25736  shuttle                6
>   java   25736  preempted              7
>   java   25736  user-level lock      511
>   java   25736  condition variable   748
>
>  
> Regards,
>
> --
>
> Kleyson Rios.
> Technical Support Management
> Support Analyst / Team Lead
>
>
> -----Original Message-----
> From: Jonathan Adams [mailto:[EMAIL PROTECTED]
> Sent: Friday, April 18, 2008 15:40
> To: Kleyson Rios
> Cc: dtrace-discuss@opensolaris.org
> Subject: Re: [dtrace-discuss] RES: Process in LCK / SLP (Please)
>
>
> On Apr 18, 2008, at 1:03 PM, Kleyson Rios wrote:
>   
>> Hi przemol,
>>
>> Below is the plockstat output for malloc and libumem. Both show many locks.
>> Why didn't I get fewer locks after changing to libumem?
>>
>> 
>
> You're looking at Mutex hold statistics, which don't mean a lot  
> (unless contention is caused by long hold times)
>
> The important thing for multi-threaded performance is *contention*.   
> (Spinning and blocking)  Those are the statistics you should be  
> looking at.
>
> Both malloc and libumem use locks to protect their state;  libumem  
> just uses many locks, in order to reduce contention.
>
> Cheers,
> - jonathan
___
dtrace-discuss mailing list
dtrace-discuss@opensolaris.org


[dtrace-discuss] Help with some custom kernel sdt probes

2008-04-22 Thread Fernando Gleiser
I'm playing with FreeBSD's port of DTrace.

We're adding some SDT probes to some kernel modules so we can time some events.

The probes load fine and they show up in dtrace -l, but I can't manage to
access their arguments.

Here's a snippet of the code:

static void
em_intr(void *arg)
{
        struct adapter  *adapter = arg;
        struct ifnet    *ifp = adapter->ifp;
        uint32_t        reg_icr;

        SDT_INTR_PROBE(em, interrupt_start, adapter->dev, &em_intr, ifp,
            adapter, 0);


If I want to access some of the struct ifnet's fields, like this:

sdt:::interrupt_end
{
        this->ifnet = (struct ifnet *)arg2;
        @[this->ifnet->if_xname] = quantize(timestamp - self->ts);
}



I get the following error:

dtrace: failed to compile script sdt-test2.d: line 13: operator -> cannot be
applied to a forward declaration: no struct ifnet definition is available

Is there anything I'm missing to make the struct definition visible to a
D script?

Any help or suggestions would be greatly appreciated.
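
For readers unfamiliar with the error, the message reads like the D analogue
of C's "incomplete type" rule: a pointer to a struct that has only been
forward-declared can be passed around and cast, but its members cannot be
reached until the full definition is visible. The C sketch below only
illustrates that rule; struct ifnet_like and its single field are
hypothetical stand-ins, not FreeBSD's real struct ifnet, and it does not show
how to make the definition visible to DTrace.

```c
#include <string.h>

/* Forward declaration only: the type exists, its layout is unknown. */
struct ifnet_like;

/* Casting to a pointer to an incomplete type is allowed. */
static struct ifnet_like *
as_ifnet(void *arg)
{
	return ((struct ifnet_like *)arg);
}

/*
 * At this point in the file, writing as_ifnet(arg)->if_xname would not
 * compile, because no definition of struct ifnet_like is available yet.
 */

/* Hypothetical definition; once it is visible, -> works. */
struct ifnet_like {
	char if_xname[16];
};

static const char *
ifnet_name(void *arg)
{
	return (as_ifnet(arg)->if_xname);
}
```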


Fer


  



[dtrace-discuss] SUSTAINABILITY OF OPEN SOURCE COMMUNITIES

2008-04-22 Thread Giuseppe Vaccaro
Hey all,

I'm a student of international business at the University of Amsterdam (UvA).
For my Master's thesis I'm investigating the main factors that influence the
sustainability of open source communities. To obtain empirical confirmation
of my research I need to conduct a survey and collect information from
members of open source communities.
Could you please devote 5 minutes of your time to fill in my questionnaire?
You can do so by following the link below:

http://www.thesistools.com/?qid=50932&ln=eng

Thanks a lot. I'll let you know the results of my research.

Kind Regards

Giuseppe Vaccaro


--
This message posted from opensolaris.org


Re: [dtrace-discuss] whatfor.d -- where's null pointer?

2008-04-22 Thread Adam Leventhal
On Tue, Apr 22, 2008 at 11:34:32AM -0700, Roman Shaposhnik wrote:
> Without knowing the details of how the structure to which t_sobj_ops is
> pointing gets managed it seems to me that there's a tiny window of
> opportunity between recording the address of the structure into
> this->tmp and the structure itself being deallocated/reused for something
> else. Of course, the recorded address will still be valid, but once
> you do this->tmp->sobj_type you might get garbage. 
> 
> Can this happen?

No. t_sobj_ops is always assigned the address of a static structure that is
never deallocated. But it's a fair point for other cases, and I don't think
there's a generic solution.

Adam

-- 
Adam Leventhal, Fishworks                        http://blogs.sun.com/ahl


Re: [dtrace-discuss] whatfor.d -- where's null pointer?

2008-04-22 Thread Roman Shaposhnik
On Tue, 2008-04-22 at 11:21 -0700, Adam Leventhal wrote:
> On Tue, Apr 22, 2008 at 10:57:26AM -0700, Roman Shaposhnik wrote:
> > On Tue, 2008-04-22 at 10:21 -0700, Adam Leventhal wrote:
> > > On Tue, Apr 22, 2008 at 09:37:57AM -0400, Lytvyn, Oleksandr (IT) wrote:
> > > > That's an interesting analysis. Hope we can have a fix soon.
> > > 
> > > I forgot to mention that you can work around the issue by replacing
> > > curlwpsinfo->pr_stype with:
> > > 
> > > (((this->tmp = curthread->t_sobj_ops) != NULL) ? this->tmp->sobj_type : 0)
> > 
> > Isn't there a race-condition still (although a much less probable one)?
> 
> I don't think so; can you describe the race condition you see?

Without knowing the details of how the structure to which t_sobj_ops is
pointing gets managed it seems to me that there's a tiny window of
opportunity between recording the address of the structure into
this->tmp and the structure itself being deallocated/reused for something
else. Of course, the recorded address will still be valid, but once
you do this->tmp->sobj_type you might get garbage. 

Can this happen?

Thanks,
Roman.



Re: [dtrace-discuss] whatfor.d -- where's null pointer?

2008-04-22 Thread Adam Leventhal
On Tue, Apr 22, 2008 at 10:57:26AM -0700, Roman Shaposhnik wrote:
> On Tue, 2008-04-22 at 10:21 -0700, Adam Leventhal wrote:
> > On Tue, Apr 22, 2008 at 09:37:57AM -0400, Lytvyn, Oleksandr (IT) wrote:
> > > That's an interesting analysis. Hope we can have a fix soon.
> > 
> > I forgot to mention that you can work around the issue by replacing
> > curlwpsinfo->pr_stype with:
> > 
> > (((this->tmp = curthread->t_sobj_ops) != NULL) ? this->tmp->sobj_type : 0)
> 
> Isn't there a race-condition still (although a much less probable one)?

I don't think so; can you describe the race condition you see?

Adam

-- 
Adam Leventhal, Fishworks                        http://blogs.sun.com/ahl


Re: [dtrace-discuss] whatfor.d -- where's null pointer?

2008-04-22 Thread Roman Shaposhnik
On Tue, 2008-04-22 at 10:21 -0700, Adam Leventhal wrote:
> On Tue, Apr 22, 2008 at 09:37:57AM -0400, Lytvyn, Oleksandr (IT) wrote:
> > That's an interesting analysis. Hope we can have a fix soon.
> 
> I forgot to mention that you can work around the issue by replacing
> curlwpsinfo->pr_stype with:
> 
> (((this->tmp = curthread->t_sobj_ops) != NULL) ? this->tmp->sobj_type : 0)

Isn't there a race-condition still (although a much less probable one)?

Thanks,
Roman.



[dtrace-discuss] RES: RES: Process in LCK / SLP (Please)

2008-04-22 Thread Kleyson Rios
OK Jonathan,

I understand.

So, looking in the right place now, I can see few locks and sometimes no locks
(just Mutex Hold). But I still have many threads at 100% LCK.

If I don't have a lot of locks, where is my problem?

Running Rickey C. Weisner's script I get:

(...)
25736
  libc.so.1`_so_send+0x15
  libjvm.so`__1cDhpiEsend6Fipcii_i_+0x67
  libjvm.so`JVM_Send+0x32
  libnet.so`Java_java_net_SocketOutputStream_socketWrite0+0x131
  0xc3c098d3
   10
25736
  0xc3d2a33a
   14
25736
  libc.so.1`_write+0x15
  libjvm.so`__1cDhpiFwrite6FipkvI_I_+0x5d
  libjvm.so`JVM_Write+0x30
  libjava.so`0xc8f7c04b
   16
25736
  libc.so.1`stat64+0x15
   21
25736
  libc.so.1`_write+0x15
  libjvm.so`__1cDhpiFwrite6FipkvI_I_+0x5d
  libjvm.so`JVM_Write+0x30
  libjava.so`0xc8f80ce9
   76
  java   25736  kernel-level lock      1
  java   25736  shuttle                6
  java   25736  preempted              7
  java   25736  user-level lock      511
  java   25736  condition variable   748

 
Regards,

--

Kleyson Rios.
Technical Support Management
Support Analyst / Team Lead


-----Original Message-----
From: Jonathan Adams [mailto:[EMAIL PROTECTED]
Sent: Friday, April 18, 2008 15:40
To: Kleyson Rios
Cc: dtrace-discuss@opensolaris.org
Subject: Re: [dtrace-discuss] RES: Process in LCK / SLP (Please)


On Apr 18, 2008, at 1:03 PM, Kleyson Rios wrote:
> Hi przemol,
>
> Below is the plockstat output for malloc and libumem. Both show many locks.
> Why didn't I get fewer locks after changing to libumem?
>

You're looking at Mutex hold statistics, which don't mean a lot  
(unless contention is caused by long hold times)

The important thing for multi-threaded performance is *contention*.   
(Spinning and blocking)  Those are the statistics you should be  
looking at.

Both malloc and libumem use locks to protect their state;  libumem  
just uses many locks, in order to reduce contention.
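
To make the hold-versus-contention distinction concrete, here is a minimal C
sketch of a mutex wrapper that counts both. It is only an illustration and
not how plockstat measures anything; the struct counted_lock type and its
function names are hypothetical. An acquisition counts as contended only when
pthread_mutex_trylock() fails and the caller has to block.

```c
#include <pthread.h>

/* Hypothetical wrapper that separates "how often held" from "how often waited". */
struct counted_lock {
	pthread_mutex_t mtx;
	unsigned long acquisitions;	/* every successful lock (hold events) */
	unsigned long contended;	/* acquisitions that had to wait */
};

static void
counted_lock_init(struct counted_lock *cl)
{
	pthread_mutex_init(&cl->mtx, NULL);
	cl->acquisitions = 0;
	cl->contended = 0;
}

static void
counted_lock_enter(struct counted_lock *cl)
{
	int waited = 0;

	if (pthread_mutex_trylock(&cl->mtx) != 0) {
		/* Another thread holds the lock: this is contention. */
		pthread_mutex_lock(&cl->mtx);
		waited = 1;
	}
	/* Counters are only touched while the lock is held. */
	cl->acquisitions++;
	cl->contended += waited;
}

static void
counted_lock_exit(struct counted_lock *cl)
{
	pthread_mutex_unlock(&cl->mtx);
}
```

A single thread that takes and drops this lock a thousand times ends up with
acquisitions at 1000 and contended at 0: plenty of hold events and nothing
to worry about.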

Cheers,
- jonathan






Re: [dtrace-discuss] whatfor.d -- where's null pointer?

2008-04-22 Thread Adam Leventhal
On Tue, Apr 22, 2008 at 09:37:57AM -0400, Lytvyn, Oleksandr (IT) wrote:
> That's an interesting analysis. Hope we can have a fix soon.

I forgot to mention that you can work around the issue by replacing
curlwpsinfo->pr_stype with:

(((this->tmp = curthread->t_sobj_ops) != NULL) ? this->tmp->sobj_type : 0)

> It also makes me wonder: how many races of that kind are there? And is
> it possible to have a comprehensive fix or approach to handle stuff like
> this?

There may be other problems like this, but the same fix can apply to all of
them. If we go with a compiler change that will cause the DIF to only load
the member once in a given probe firing, then that will clean up similarly
affected situations. If we create a language construct to make a new local
scope, then we'll need to vet our translators and fix them individually.
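
The difference between the racy form and the load-once form can be sketched
in plain C. The structures below are simplified, hypothetical stand-ins that
only borrow the field names from this thread; they are not the kernel's
definitions, and the sketch is not the DIF the compiler emits.

```c
#include <stddef.h>

/* Simplified stand-ins; only the field names come from the discussion. */
struct fake_sobj_ops {
	int sobj_type;
};

struct fake_thread {
	/* volatile: another thread may set this to NULL at any time */
	struct fake_sobj_ops *volatile t_sobj_ops;
};

/*
 * Racy: t_sobj_ops is read twice. If it becomes NULL between the test
 * and the dereference, the second read yields NULL and the access faults.
 */
static int
stype_double_load(struct fake_thread *t)
{
	return (t->t_sobj_ops ? t->t_sobj_ops->sobj_type : 0);
}

/*
 * Load once: the pointer is read a single time into a local, so the
 * value that was tested is the value that is dereferenced.
 */
static int
stype_single_load(struct fake_thread *t)
{
	struct fake_sobj_ops *ops = t->t_sobj_ops;

	return (ops != NULL ? ops->sobj_type : 0);
}
```

The second function has the same shape as the this->tmp workaround: one load,
one test, one dereference of the value that was tested.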

Adam

-- 
Adam Leventhal, Fishworks                        http://blogs.sun.com/ahl


Re: [dtrace-discuss] whatfor.d -- where's null pointer?

2008-04-22 Thread Lytvyn, Oleksandr (IT)
Adam,

That's an interesting analysis. Hope we can have a fix soon.

It also makes me wonder: how many races of that kind are there? And is
it possible to have a comprehensive fix or approach to handle stuff like
this?

Thanks,

Oleksandr Lytvyn
Morgan Stanley | Technology
210 Carnegie Center, 4th Floor | Princeton, NJ  08540
Phone: +1 609 936-4026
Mobile: +1 732 773-4145
[EMAIL PROTECTED]
 

> -----Original Message-----
> From: Adam Leventhal [mailto:[EMAIL PROTECTED] 
> Sent: Monday, April 21, 2008 2:00 PM
> To: Lytvyn, Oleksandr (IT)
> Cc: dtrace-discuss@opensolaris.org
> Subject: Re: [dtrace-discuss] whatfor.d -- where's null pointer?
> 
> Hi Oleksandr,
> 
> This turned out to be a rather interesting problem. To investigate, I used
> this script:
> 
> ---8<---
> off-cpu
> {
>   this->tmp = curlwpsinfo->pr_stype;
>   this->tmp = curlwpsinfo->pr_stype;
>   this->tmp = curlwpsinfo->pr_stype;
>   this->tmp = curlwpsinfo->pr_stype;
>   this->tmp = curlwpsinfo->pr_stype;
>   this->tmp = curlwpsinfo->pr_stype;
>   this->tmp = curlwpsinfo->pr_stype;
>   this->tmp = curlwpsinfo->pr_stype;
>   this->tmp = curlwpsinfo->pr_stype;
>   this->tmp = curlwpsinfo->pr_stype;
> }
> 
> ERROR
> {
>   @[arg2] = count();
> }
> ---8<---
> 
> Which resulted in a table like this:
> 
>         9                1
>        10                1
>         5                2
>         7                2
>         1                3
>         3                3
>         2                5
>         6                5
>         4               10
> 
> So curlwpsinfo->pr_stype can work and later fail. Looking at the translator
> for that field we see that it looks like this:
> 
> pr_stype = T->t_sobj_ops ? T->t_sobj_ops->sobj_type : 0;
> 
> This compiles to this DIF code:
> 
> OFF OPCODE      INSTRUCTION
> 00: 29010001    ldgs DT_VAR(256), %r1           ! DT_VAR(256) = "curthread"
> 01: 25000002    setx DT_INTEGER[0], %r2         ! 0x0
> 02: 04010201    sll  %r1, %r2, %r1
> 03: 05010201    srl  %r1, %r2, %r1
> 04: 0e010002    mov  %r1, %r2
> 05: 25000103    setx DT_INTEGER[1], %r3         ! 0x88
> 06: 07020302    add  %r2, %r3, %r2
> 07: 22020002    ldx  [%r2], %r2
> 08: 10020000    tst  %r2
> 09: 12000011    be   17
> 10: 0e010002    mov  %r1, %r2
> 11: 25000103    setx DT_INTEGER[1], %r3         ! 0x88
> 12: 07020302    add  %r2, %r3, %r2
> 13: 22020002    ldx  [%r2], %r2
> 14: 1e020002    ldsw [%r2], %r2
> 15: 0e020002    mov  %r2, %r2
> 16: 11000012    ba   18
> 17: 25000002    setx DT_INTEGER[0], %r2         ! 0x0
> 18: 25000203    setx DT_INTEGER[2], %r3         ! 0x38
> 19: 04020302    sll  %r2, %r3, %r2
> 20: 2e020302    sra  %r2, %r3, %r2
> 21: 23000002    ret  %r2
> 
> We can see that we load the t_sobj_ops member once at offset 07 and then
> again at offset 13 (right before we load sobj_type at offset 14). The
> t_sobj_ops member can be set to NULL asynchronously from other threads, so
> this double load introduces a window for the failure that you're seeing.
> 
> Either we need to use some temporary, probe-local variable (one that can't
> conflict with a user-defined variable), or we need to perform some element
> of optimization on the generated DIF.
> 
> I've filed this bug:
> 
>   6691541 curlwpsinfo->pr_stype races
> 
> Adam
> 
> On Fri, Apr 18, 2008 at 04:20:31PM -0400, Lytvyn, Oleksandr (IT) wrote:
> > Hi!
> >  
> > Has anyone seen this? This one baffles me: I run whatfor.d from
> > /usr/demo/dtrace, and here's what I get:
> >  
> > dtrace: script '/usr/demo/dtrace/whatfor.d' matched 12 probes
> > dtrace: error on enabled probe ID 1 (ID 681: sched:unix:resume:off-cpu):
> > invalid address (0x0) in action #1 at DIF offset 56
> > dtrace: error on enabled probe ID 1 (ID 681: sched:unix:resume:off-cpu):
> > invalid address (0x0) in action #1 at DIF offset 56
> > dtrace: error on enabled probe ID 1 (ID 681: sched:unix:resume:off-cpu):
> > invalid address (0x0) in action #1 at DIF offset 56
> > dtrace: error on enabled probe ID 1 (ID 681: sched:unix:resume:off-cpu):
> > invalid address (0x0) in action #1 at DIF offset 56
> > ...
> >  
> > Multitudes of those. Apparently, action #1 of probe ID 1 is:
> >  
> > self->sobj = curlwpsinfo->pr_stype;
> >  
> > So, which address is invalid here? The curlwpsinfo is used in the
> > predicate, so it cannot be 0x0, because it'd complain about the predicate
> > too. And pr_stype is supposed to be char.
> >  
> > What's wrong here?
> >  
> > Found another error report like that on the web, BTW (in German,
> > accidentally). But no responses there, unfortunately.
> >  
> > Thanks,
> >  
> > Oleksandr Lytvyn
> > Morgan Stanley | Technology
> > 210 Carnegie Center, 4th Floor | Princeton, NJ  08540
> > Phone: +1 609 936-4026
> > Mobile: +1 732 773-4145
> > [EMAIL PROTECTED]
> >