Adam, That's an interesting analysis. Hope we can have a fix soon.
It also makes me wonder: how many races of that kind are there? And is it possible to have a comprehensive fix or approach to handle stuff like this? Thanks, Oleksandr Lytvyn Morgan Stanley | Technology 210 Carnegie Center, 4th Floor | Princeton, NJ 08540 Phone: +1 609 936-4026 Mobile: +1 732 773-4145 [EMAIL PROTECTED] > -----Original Message----- > From: Adam Leventhal [mailto:[EMAIL PROTECTED] > Sent: Monday, April 21, 2008 2:00 PM > To: Lytvyn, Oleksandr (IT) > Cc: dtrace-discuss@opensolaris.org > Subject: Re: [dtrace-discuss] whatfor.d -- where's null pointer? > > Hi Oleksandr, > > This turned out to be a rather interesting problem. To > investigate, I used > this script: > > ---8<--- > off-cpu > { > this->tmp = curlwpsinfo->pr_stype; > this->tmp = curlwpsinfo->pr_stype; > this->tmp = curlwpsinfo->pr_stype; > this->tmp = curlwpsinfo->pr_stype; > this->tmp = curlwpsinfo->pr_stype; > this->tmp = curlwpsinfo->pr_stype; > this->tmp = curlwpsinfo->pr_stype; > this->tmp = curlwpsinfo->pr_stype; > this->tmp = curlwpsinfo->pr_stype; > this->tmp = curlwpsinfo->pr_stype; > } > > ERROR > { > @[arg2] = count(); > } > ---8<--- > > Which resulted in a table like this: > > 9 1 > 10 1 > 5 2 > 7 2 > 1 3 > 3 3 > 2 5 > 6 5 > 4 10 > > So curlwpsinfo->pr_stype can work and later fail. Looking at > the translator > for that field we see that it looks like this: > > pr_stype = T->t_sobj_ops ? T->t_sobj_ops->sobj_type : 0; > > This compiles to this DIF code: > > OFF OPCODE INSTRUCTION > 00: 29010001 ldgs DT_VAR(256), %r1 ! DT_VAR(256) > = "curthread" > 01: 25000002 setx DT_INTEGER[0], %r2 ! 0x0 > 02: 04010201 sll %r1, %r2, %r1 > 03: 05010201 srl %r1, %r2, %r1 > 04: 0e010002 mov %r1, %r2 > 05: 25000103 setx DT_INTEGER[1], %r3 ! 0x88 > 06: 07020302 add %r2, %r3, %r2 > 07: 22020002 ldx [%r2], %r2 > 08: 10020000 tst %r2 > 09: 12000011 be 17 > 10: 0e010002 mov %r1, %r2 > 11: 25000103 setx DT_INTEGER[1], %r3 ! 0x88 > 12: 07020302 add %r2, %r3, %r2 > 13: 22020002 ldx [%r2], %r2 > 14: 1e020002 ldsw [%r2], %r2 > 15: 0e020002 mov %r2, %r2 > 16: 11000012 ba 18 > 17: 25000002 setx DT_INTEGER[0], %r2 ! 0x0 > 18: 25000203 setx DT_INTEGER[2], %r3 ! 0x38 > 19: 04020302 sll %r2, %r3, %r2 > 20: 2e020302 sra %r2, %r3, %r2 > 21: 23000002 ret %r2 > > We can see that we load the t_sobj_ops member once at offset > 07 and then again > at offset 17 (right before we load sobj_type at offset 18). > The t_sobj_ops > member can be set to NULL asynchronously from other threads > so this double > load introduces a window for the failure that you're seeing. > > Either we need to use some temporary, probe-local variable > (one that can't > conflict with a user-defined variable), or we need to perform > some element of > optimization to the generated DIF. > > I've filed this bug: > > 6691541 curlwpsinfo->pr_stype races > > Adam > > On Fri, Apr 18, 2008 at 04:20:31PM -0400, Lytvyn, Oleksandr > (IT) wrote: > > Hi! > > > > Anyone seen this? This one buffles me: I run whatfor.d from > > /usr/demo/dtrace, and here's what I get: > > > > dtrace: script '/usr/demo/dtrace/whatfor.d' matched 12 probes > > dtrace: error on enabled probe ID 1 (ID 681: > sched:unix:resume:off-cpu): > > invalid address (0x0) in action #1 at DIF offset 56 > > dtrace: error on enabled probe ID 1 (ID 681: > sched:unix:resume:off-cpu): > > invalid address (0x0) in action #1 at DIF offset 56 > > dtrace: error on enabled probe ID 1 (ID 681: > sched:unix:resume:off-cpu): > > invalid address (0x0) in action #1 at DIF offset 56 > > dtrace: error on enabled probe ID 1 (ID 681: > sched:unix:resume:off-cpu): > > invalid address (0x0) in action #1 at DIF offset 56 > > ... > > > > Multitudes of those. Apparently, action #1 of probe ID 1 is: > > > > self->sobj = curlwpsinfo->pr_stype; > > > > So, which address is invalid here? The curlwpsinfo is used > in predicate, > > so it cannot be 0x0, because it'd complain about the > predicate too. And > > pr_stype is supposed to be char. > > > > What's wrong here? > > > > Found another error report like that on the web, BTW (in German, > > accidentially). But no responses there, unfortunately. > > > > Thanks, > > > > Oleksandr Lytvyn > > Morgan Stanley | Technology > > 210 Carnegie Center, 4th Floor | Princeton, NJ 08540 > > Phone: +1 609 936-4026 > > Mobile: +1 732 773-4145 > > [EMAIL PROTECTED] > > -------------------------------------------------------- > > > > NOTICE: If received in error, please destroy and notify > sender. Sender does not intend to waive confidentiality or > privilege. Use of this email is prohibited when received in error. > > > _______________________________________________ > > dtrace-discuss mailing list > > dtrace-discuss@opensolaris.org > > > -- > Adam Leventhal, Fishworks > http://blogs.sun.com/ahl > -------------------------------------------------------- NOTICE: If received in error, please destroy and notify sender. Sender does not intend to waive confidentiality or privilege. Use of this email is prohibited when received in error. _______________________________________________ dtrace-discuss mailing list dtrace-discuss@opensolaris.org