Re: [9fans] 9p vs http

2010-11-16 Thread roger peppe
On 16 November 2010 01:18, erik quanstrom <quans...@quanstro.net> wrote:
> > i claim that a fs with this behavior would be broken.  intro(5)
> > seems to agree with this claim, unless i'm misreading.
>
> you're right - fossil is broken in this respect, as is exportfs
> {cd /mnt/term/dev; ls -lq | sort} for a quick demo.
>
> so what's fossil's excuse?

i'd say it's a bug. fossil could easily reserve some number of bits
of the qid (say 20 bits) to make the files in the dump unique
while still allowing space for a sufficient number of live files.

that doesn't invalidate my original point though.
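As a hypothetical sketch of that reservation scheme (the 20-bit split and the
names Dumpbits/dumpqidpath are illustrative, not fossil's actual layout):

	#include <stdint.h>

	enum { Dumpbits = 20, Pathbits = 64 - Dumpbits };

	/*
	 * Reserve the high bits of the 64-bit qid path for a dump tag,
	 * so a file in the dump never shares a qid path with its live
	 * version.  dumptag 0 = live; each dump epoch gets a nonzero tag.
	 */
	static uint64_t
	dumpqidpath(uint64_t livepath, uint32_t dumptag)
	{
		return ((uint64_t)dumptag << Pathbits) |
			(livepath & ((UINT64_C(1) << Pathbits) - 1));
	}

even with 20 bits spent on the tag, that leaves 2^44 distinct live paths.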



Re: [9fans] 9p vs http

2010-11-16 Thread Russ Cox
On Mon, Nov 15, 2010 at 2:00 PM, Dan Adkins <dadk...@gmail.com> wrote:
> That brings up a question of interest to me.  How do you effectively
> read ahead with the 9p protocol?  Even if you issued many read
> requests in parallel, the server is allowed to return less data than
> was asked for.  You'll end up with holes in your buffer that require
> at least another roundtrip to fill.

The traditional "store data" file servers that Unix users
would recognize tend to follow the Unix semantics that
a read will return as much data as it can.  If you're talking
to one of them you issue reads for 8k each or whatever
and then do the roundtrip if necessary, but it rarely is.
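In code, that strategy is roughly the following (a minimal userspace sketch
over POSIX pread, not taken from any of the servers mentioned; preadn is a
made-up helper name):

	#include <unistd.h>

	/* Fill buf[0..n) from offset off; a short read costs one more round trip. */
	static ssize_t
	preadn(int fd, void *buf, size_t n, off_t off)
	{
		size_t tot;
		ssize_t m;

		for(tot = 0; tot < n; tot += m){
			m = pread(fd, (char*)buf + tot, n - tot, off + tot);
			if(m < 0)
				return -1;
			if(m == 0)	/* EOF: return what we have */
				break;
		}
		return tot;
	}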

Russ



Re: [9fans] That deadlock, again

2010-11-16 Thread erik quanstrom
> I tried acid, but I'm just not familiar enough with it to make it
> work.  I tried
>
> 	rumble% acid 2052 /bin/exportfs
> 	/bin/exportfs:386 plan 9 executable
> 	/sys/lib/acid/port
> 	/sys/lib/acid/386
> 	acid: src(0xf01e7377)
> 	no source for ?file?

cinap is right, the bug is in the kernel.  we know
that because it's a lock loop.  that can only happen
if the kernel screws up.  also, the address is a kernel
address (starts with 0xf).

if you didn't know the lock loop message was from
the kernel, you could grep for it

; cd /sys/src
; g 'lock (%#?[luxp]+ )loop' .
./9/port/taslock.c:61: 	print("lock %#p loop key %#lux pc %#lux held by pc %#lux proc %lud\n",

- erik



Re: [9fans] 9p vs http

2010-11-16 Thread Charles Forsyth
> i'd say it's a bug. fossil could easily reserve some number of bits
> of the qid (say 20 bits) to make the files in the dump unique
> while still allowing space for a sufficient number of live files.

that's possibly closest to the intent of the qid discussion in intro(5),
although it's not clear that it considers the possibility of aliases
within a hierarchy, where it might make more sense to have the Qid.path
associated with the file, regardless of how you got to it. even so, for
dump it's not been a big problem in practice, only because one usually
binds from the dump, not into it.

i'm sure that somewhere it was suggested that the high-order bits of Qid.path
should be avoided by file servers, to allow their use in making qids unique,
but i haven't been able to find that.



Re: [9fans] 9p vs http

2010-11-16 Thread erik quanstrom
> i'm sure that somewhere it was suggested that the high-order bits of Qid.path
> should be avoided by file servers, to allow their use in making qids unique,
> but i haven't been able to find that.

unfortunately, there's just not enough bits to easily export
(an export)+.

i wonder if there's some way to expose the structure of the
export to the client so it will notice the devices are different.

- erik



Re: [9fans] 9p vs http

2010-11-16 Thread Charles Forsyth
> unfortunately, there's just not enough bits to easily export
> (an export)+.

i think that works: it checks for clashes.



Re: [9fans] 9p vs http

2010-11-16 Thread roger peppe
On 16 November 2010 16:32, Charles Forsyth <fors...@terzarima.net> wrote:
> > unfortunately, there's just not enough bits to easily export
> > (an export)+.
>
> i think that works: it checks for clashes.

only when a file is actually walked to.

of course, that's fine in practice - the only thing
that actually cares about qids is mount.
but it doesn't fit the spec.



Re: [9fans] That deadlock, again

2010-11-16 Thread lucio
> cinap is right, the bug is in the kernel.  we know
> that because it's a lock loop.  that can only happen
> if the kernel screws up.  also, the address is a kernel
> address (starts with 0xf).

Well, here is an acid dump, I'll inspect it in detail, but I'm hoping
someone will beat me to it (not hard at all, I have to confess):

rumble# acid /sys/src/9/pc/9pccpuf
/sys/src/9/pc/9pccpuf:386 plan 9 boot image
/sys/lib/acid/port
/sys/lib/acid/386

# lock 0xf0057d74 loop key 0xdeaddead pc 0xf01e736a held by pc 0xf01e736a proc 2052

acid: src(0xf01e736a)
/sys/src/9/port/qlock.c:29
 24 		print("qlock: %#p: nlocks %lud\n", getcallerpc(&q), up->nlocks.ref);
 25 
 26 	if(q->use.key == 0x55555555)
 27 		panic("qlock: q %#p, key 5*\n", q);
 28 	lock(&q->use);
>29 	rwstats.qlock++;
 30 	if(!q->locked) {
 31 		q->locked = 1;
 32 		unlock(&q->use);
 33 		return;
 34 	}

# 61:etherread4 pc f01ef8a0 dbgpc 0 Running (Running) ut 2923 st 0 bss 0 qpc f0148c8a nl 0 nd 0 lpc f0100f6e pri 13

acid: src(0xf01ef8a0)
/sys/src/9/port/taslock.c:96
 91 	lockstats.glare++;
 92 	for(;;){
 93 		lockstats.inglare++;
 94 		i = 0;
 95 		while(l->key){
>96 			if(conf.nmach < 2 && up && up->edf && (up->edf->flags & Admitted)){
 97 				/*
 98 				 * Priority inversion, yield on a uniprocessor; on a
 99 				 * multiprocessor, the other processor will unlock
 100 				 */
 101 				print("inversion %#p pc %#lux proc %lud held by pc %#lux proc %lud\n",
acid: src(0xf0148c8a)
/sys/src/9/ip/tcp.c:2096
 2091 	if(waserror()){
 2092 		qunlock(s);
 2093 		nexterror();
 2094 	}
 2095 	qlock(s);
>2096 	qunlock(tcp);
 2097 
 2098 	/* fix up window */
 2099 	seg.wnd <<= tcb->rcv.scale;
 2100 
 2101 	/* every input packet in puts off the keep alive time out */
acid: src(0xf0100f6e)
/sys/src/9/pc/cga.c:112
 107 			return;
 108 	}
 109 	else
 110 		lock(&cgascreenlock);
 111 
>112 	while(n-- > 0)
 113 		cgascreenputc(*s++);
 114 
 115 	unlock(&cgascreenlock);
 116 }
 117 

# 2052:  exportfs pc f01e7377 dbgpc 94ad Pwrite (Ready) ut 55 st 270 bss 4 qpc f0145b62 nl 1 nd 0 lpc f01e2c60 pri 10

acid: src(0xf01e7377)
/sys/src/9/port/qlock.c:30
 25 
 26 	if(q->use.key == 0x55555555)
 27 		panic("qlock: q %#p, key 5*\n", q);
 28 	lock(&q->use);
 29 	rwstats.qlock++;
>30 	if(!q->locked) {
 31 		q->locked = 1;
 32 		unlock(&q->use);
 33 		return;
 34 	}
 35 	if(up == 0)
acid: src(0xf0145b62)
/sys/src/9/ip/tcp.c:704
 699 tcpgo(Tcppriv *priv, Tcptimer *t)
 700 {
 701 	if(t == nil || t->start == 0)
 702 		return;
 703 
>704 	qlock(&priv->tl);
 705 	t->count = t->start;
 706 	timerstate(priv, t, TcptimerON);
 707 	qunlock(&priv->tl);
 708 }
 709 
acid: src(0xf01e2c60)
/sys/src/9/port/proc.c:345
 340 queueproc(Schedq *rq, Proc *p)
 341 {
 342 	int pri;
 343 
 344 	pri = rq - runq;
>345 	lock(runq);
 346 	p->priority = pri;
 347 	p->rnext = 0;
 348 	if(rq->tail)
 349 		rq->tail->rnext = p;
 350 	else
acid: 




Re: [9fans] Plan9 development

2010-11-16 Thread Christopher Nielsen
On Mon, Nov 15, 2010 at 19:32, <lu...@proxima.alt.za> wrote:
> > I always had the impression that the object formats
> > used by the various ?l are more for kernels and the
> > various formats expected by loaders than for userland
> > apps.  For userland, I would think the intent is for
> > there to be a single consistent object format (at least
> > for a given architecture).
>
> Well, we had alef for Irix and other similar user level/application
> level tricks that no longer seem important today, but without the
> option trickery Go would have had to wait for Ian Lance Taylor to
> produce a GCC version :-(
>
> Myself, I'm still trying to combine the Go toolchain with the Plan 9
> toolchain so that we can have a consistent framework for real
> cross-platform development, but the task doesn't quite fit within my
> resources and skills.  I don't have a problem with the trickery, it's
> just a shame (IMO) that it wasn't designed the same way as the target
> architecture stuff.  I understand the complexity involved and I'm still
> looking for ideas on reducing that complexity.
>
> Typically, the Go toolchain still has (had?) code in it to produce
> Plan 9 object code, but one could easily imagine that stuff
> bit-rotting.  If it hasn't been removed yet, it sure runs the risk of
> being removed before long.

FWIW, someone is working on a Plan 9 port of Go.

-- 
Christopher Nielsen
"They who can give up essential liberty for temporary safety, deserve
neither liberty nor safety." --Benjamin Franklin
"The tree of liberty must be refreshed from time to time with the
blood of patriots & tyrants." --Thomas Jefferson



[9fans] p9p venti sync?

2010-11-16 Thread David Leimbach
I'm trying to figure out how to correctly sync a plan9port venti instance so
I can start it back up again and have it actually function :-).

using venti/sync doesn't appear to get the job done...

Dave


Re: [9fans] p9p venti sync?

2010-11-16 Thread Bakul Shah
On Tue, 16 Nov 2010 14:43:20 PST David Leimbach <leim...@gmail.com> wrote:
 
> I'm trying to figure out how to correctly sync a plan9port venti instance so
> I can start it back up again and have it actually function :-).
>
> using venti/sync doesn't appear to get the job done...

hget http://localhost:9091/flushicache
hget http://localhost:9091/flushdcache

or wherever your venti has its http port



Re: [9fans] p9p venti sync?

2010-11-16 Thread Russ Cox
On Tue, Nov 16, 2010 at 5:43 PM, David Leimbach <leim...@gmail.com> wrote:
> I'm trying to figure out how to correctly sync a plan9port venti instance so
> I can start it back up again and have it actually function :-).
> using venti/sync doesn't appear to get the job done...

It should.  Not using venti/sync should work too since
vac etc all sync before hanging up.  The flushicache/flushdcache
trick will make restarting a little faster, but it should not
be necessary for correctness and shouldn't even be
that much faster.

Russ



Re: [9fans] p9p venti sync?

2010-11-16 Thread David Leimbach
On Tuesday, November 16, 2010, Russ Cox <r...@swtch.com> wrote:
> On Tue, Nov 16, 2010 at 5:43 PM, David Leimbach <leim...@gmail.com> wrote:
> > I'm trying to figure out how to correctly sync a plan9port venti instance so
> > I can start it back up again and have it actually function :-).
> > using venti/sync doesn't appear to get the job done...
>
> It should.  Not using venti/sync should work too since
> vac etc all sync before hanging up.  The flushicache/flushdcache
> trick will make restarting a little faster, but it should not
> be necessary for correctness and shouldn't even be
> that much faster.

I did a kill TERM... should the signal handler have cleaned up or was
I supposed to send HUP?



> Russ





Re: [9fans] p9p venti sync?

2010-11-16 Thread Russ Cox
On Tue, Nov 16, 2010 at 10:19 PM, David Leimbach <leim...@gmail.com> wrote:
> On Tuesday, November 16, 2010, Russ Cox <r...@swtch.com> wrote:
> > On Tue, Nov 16, 2010 at 5:43 PM, David Leimbach <leim...@gmail.com> wrote:
> > > I'm trying to figure out how to correctly sync a plan9port venti instance so
> > > I can start it back up again and have it actually function :-).
> > > using venti/sync doesn't appear to get the job done...
> >
> > It should.  Not using venti/sync should work too since
> > vac etc all sync before hanging up.  The flushicache/flushdcache
> > trick will make restarting a little faster, but it should not
> > be necessary for correctness and shouldn't even be
> > that much faster.
>
> I did a kill TERM... should the signal handler have cleaned up or was
> I supposed to send HUP?

There's no cleanup.  The data's on disk after a sync RPC,
which is what venti/sync does and also what vac does when
it is done.  Venti's writes are ordered so that the on-disk state
is always recoverable.  Because of that, there's no explicit
shutdown: you just kill the server however you like.
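The sync RPC itself is tiny; roughly what venti/sync amounts to, sketched
against plan9port's libventi (assuming vtdial/vtconnect/vtsync/vthangup
behave as documented; error strings are illustrative):

	#include <u.h>
	#include <libc.h>
	#include <thread.h>
	#include <venti.h>

	void
	threadmain(int argc, char **argv)
	{
		VtConn *z;

		z = vtdial(argc > 1 ? argv[1] : nil);	/* nil means $venti */
		if(z == nil)
			sysfatal("vtdial: %r");
		if(vtconnect(z) < 0)
			sysfatal("vtconnect: %r");
		if(vtsync(z) < 0)	/* data is on disk once this returns */
			sysfatal("vtsync: %r");
		vthangup(z);
		threadexitsall(nil);
	}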

Russ



Re: [9fans] That deadlock, again

2010-11-16 Thread Lucio De Re
> Well, here is an acid dump, I'll inspect it in detail, but I'm hoping
> someone will beat me to it (not hard at all, I have to confess):
>
> rumble# acid /sys/src/9/pc/9pccpuf
> /sys/src/9/pc/9pccpuf:386 plan 9 boot image
> /sys/lib/acid/port
> /sys/lib/acid/386
 
[ ... ]

This bit looks suspicious to me, but I'm really not an authority and I
may easily be missing something:

> acid: src(0xf0148c8a)
> /sys/src/9/ip/tcp.c:2096
>  2091 	if(waserror()){
>  2092 		qunlock(s);
>  2093 		nexterror();
>  2094 	}
>  2095 	qlock(s);
> >2096 	qunlock(tcp);
>  2097 
>  2098 	/* fix up window */
>  2099 	seg.wnd <<= tcb->rcv.scale;
>  2100 
>  2101 	/* every input packet in puts off the keep alive time out */

The source actually says (to be pedantic):

	/* The rest of the input state machine is run with the control block
	 * locked and implements the state machine directly out of the RFC.
	 * Out-of-band data is ignored - it was always a bad idea.
	 */
	tcb = (Tcpctl*)s->ptcl;
	if(waserror()){
		qunlock(s);
		nexterror();
	}
	qlock(s);
	qunlock(tcp);

Now, the qunlock(s) should not precede the qlock(s), this is the first
case in this procedure:

/sys/src/9/ip/tcp.c:1941,2424
void
tcpiput(Proto *tcp, Ipifc*, Block *bp)
{
...
}

and unlocking an unlocked item should not be permitted, right?  If s
(the conversation) is returned locked by either of

/sys/src/9/ip/tcp.c:2048
s = iphtlook(&tpriv->ht, source, seg.source, dest, seg.dest);

or

/sys/src/9/ip/tcp.c:2081
s = tcpincoming(s, &seg, source, dest, version);

then the qlock(s) is an issue, but I'm sure that is not the case here.

Finally, locking an item ahead of unlocking another is often cause for
a deadlock, although I understand the possible necessity.

qlock(s);
qunlock(tcp);
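To see why that ordering can bite, here is a self-contained toy (not the
kernel's code; POSIX mutexes stand in for qlocks) in which two threads take
the same pair of locks in opposite orders and can each end up holding the
lock the other wants:

	#include <pthread.h>
	#include <stdio.h>

	static pthread_mutex_t tcp = PTHREAD_MUTEX_INITIALIZER;	/* protocol lock */
	static pthread_mutex_t s = PTHREAD_MUTEX_INITIALIZER;	/* conversation lock */

	static void*
	input(void *v)	/* like tcpiput: holds tcp, then takes s */
	{
		pthread_mutex_lock(&tcp);
		pthread_mutex_lock(&s);	/* blocks if timer() holds s and wants tcp */
		pthread_mutex_unlock(&tcp);
		pthread_mutex_unlock(&s);
		return v;
	}

	static void*
	timer(void *v)	/* opposite order: holds s, then wants tcp */
	{
		pthread_mutex_lock(&s);
		pthread_mutex_lock(&tcp);	/* classic AB/BA deadlock with input() */
		pthread_mutex_unlock(&tcp);
		pthread_mutex_unlock(&s);
		return v;
	}

	int
	main(void)
	{
		pthread_t a, b;

		pthread_create(&a, 0, input, 0);
		pthread_create(&b, 0, timer, 0);
		pthread_join(a, 0);	/* may hang: that is the point */
		pthread_join(b, 0);
		printf("no deadlock this run\n");
		return 0;
	}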

Even though acid points to a different location, I'd be curious to
have the details above expanded on by somebody familiar with the code.

++L




Re: [9fans] p9p venti sync?

2010-11-16 Thread David Leimbach
Could sparse files be an issue?  Bloom always shows up wrong when I restart.

On Tuesday, November 16, 2010, David Leimbach leim...@gmail.com wrote:
 On Tuesday, November 16, 2010, Russ Cox r...@swtch.com wrote:
 On Tue, Nov 16, 2010 at 5:43 PM, David Leimbach leim...@gmail.com wrote:
 I'm trying to figure out how to correctly sync a plan9port venti instance so
 I can start it back up again and have it actually function :-).
 using venti/sync doesn't appear to get the job done...

 It should.  Not using venti/sync should work too since
 vac etc all sync before hanging up.  The flushicache/flushdcache
 trick will make restarting a little faster, but it should not
 be necessary for correctness and shouldn't even be
 that much faster.

 I did a kill TERM... should the signal handler have cleaned up or was
 I supposed to send HUP?



 Russ






Re: [9fans] That deadlock, again

2010-11-16 Thread erik quanstrom
> > acid: src(0xf0148c8a)
> > /sys/src/9/ip/tcp.c:2096
> >  2091 	if(waserror()){
> >  2092 		qunlock(s);
> >  2093 		nexterror();
> >  2094 	}
> >  2095 	qlock(s);
> > >2096 	qunlock(tcp);
> >  2097 
> >  2098 	/* fix up window */
> >  2099 	seg.wnd <<= tcb->rcv.scale;
> >  2100 
> >  2101 	/* every input packet in puts off the keep alive time out */
>
> The source actually says (to be pedantic):
>
> 	/* The rest of the input state machine is run with the control block
> 	 * locked and implements the state machine directly out of the RFC.
> 	 * Out-of-band data is ignored - it was always a bad idea.
> 	 */
> 	tcb = (Tcpctl*)s->ptcl;
> 	if(waserror()){
> 		qunlock(s);
> 		nexterror();
> 	}
> 	qlock(s);
> 	qunlock(tcp);
>
> Now, the qunlock(s) should not precede the qlock(s), this is the first
> case in this procedure:

it doesn't.  waserror() can't be executed before the code
following it.  perhaps it could be more carefully written
as

	2095	qlock(s);
	2091	if(waserror()){
	2092		qunlock(s);
	2093		nexterror();
	2094	}
	2096	qunlock(tcp);

i'm not completely convinced that tcp's to blame.
and if it is, i think the problem is probably tcp
timers.

- erik



Re: [9fans] That deadlock, again

2010-11-16 Thread Lucio De Re
> > Now, the qunlock(s) should not precede the qlock(s), this is the first
> > case in this procedure:
>
> it doesn't.  waserror() can't be executed before the code
> following it.  perhaps it could be more carefully written
> as
>
> 	2095	qlock(s);
> 	2091	if(waserror()){
> 	2092		qunlock(s);
> 	2093		nexterror();
> 	2094	}
> 	2096	qunlock(tcp);

Hm, I thought I understood waserror(), but now I'm sure I don't.  What
condition is waserror() attempting to handle here?

++L




Re: [9fans] That deadlock, again

2010-11-16 Thread erik quanstrom
> Hm, I thought I understood waserror(), but now I'm sure I don't.  What
> condition is waserror() attempting to handle here?

waserror() sets up an entry on the error stack.
if there is a call to error() before poperror(),
then that entry is popped and waserror() returns
1.  it's just like setjmp or sched().
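a user-space toy showing the shape of the mechanism, with setjmp standing in
for the kernel's per-process error stack (all names here are stand-ins; the
real waserror/nexterror live in the Proc structure):

	#include <setjmp.h>
	#include <stdio.h>

	/* toy stand-ins: the kernel keeps this stack per-process in Proc */
	enum { NERR = 16 };
	static jmp_buf errstack[NERR];
	static int nerr;

	#define waserror()	setjmp(errstack[nerr++])	/* push handler; 0 now, 1 if error() fires */
	#define poperror()	(nerr--)			/* success path: pop the handler */
	#define nexterror()	longjmp(errstack[--nerr], 1)	/* pass the error up the stack */

	static void
	error(char *msg)
	{
		fprintf(stderr, "error: %s\n", msg);
		nexterror();
	}

	int
	main(void)
	{
		if(waserror()){
			printf("cleanup (e.g. qunlock) runs here\n");
			return 1;
		}
		error("something failed");	/* jumps back into the waserror() above */
		poperror();			/* not reached in this demo */
		return 0;
	}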

- erik



Re: [9fans] p9p venti sync?

2010-11-16 Thread David Leimbach
On Tue, Nov 16, 2010 at 8:09 PM, David Leimbach <leim...@gmail.com> wrote:

> Could sparse files be an issue?  Bloom always shows up wrong when I
> restart.


Nope... Didn't make a difference it seems.

I recreated my venti setup, and it starts ok.  I do a vac and an unvac, then
kill it and restart and get the following:

% venti/venti
2010/1116 20:44:14 venti: conf...2010/1116 20:44:14 err 4: read /Users/dave/venti/disks/bloom offset 0x0 count 65536 buf 380 returned 65536: No such file or directory
venti/venti: can't load bloom filter: read /Users/dave/venti/disks/bloom offset 0x0 count 65536 buf 380 returned 65536: No such file or directory

bloom is definitely there though... Not sure what the "No such file or directory" is referring to just yet.
-rw-r--r--  1 dave  wheel   33554432 Nov 16 20:42 bloom

I'll try without the bloom filter.


> On Tuesday, November 16, 2010, David Leimbach <leim...@gmail.com> wrote:
> > On Tuesday, November 16, 2010, Russ Cox <r...@swtch.com> wrote:
> > > On Tue, Nov 16, 2010 at 5:43 PM, David Leimbach <leim...@gmail.com> wrote:
> > > > I'm trying to figure out how to correctly sync a plan9port venti instance so
> > > > I can start it back up again and have it actually function :-).
> > > > using venti/sync doesn't appear to get the job done...
> > >
> > > It should.  Not using venti/sync should work too since
> > > vac etc all sync before hanging up.  The flushicache/flushdcache
> > > trick will make restarting a little faster, but it should not
> > > be necessary for correctness and shouldn't even be
> > > that much faster.
> >
> > I did a kill TERM... should the signal handler have cleaned up or was
> > I supposed to send HUP?
> >
> > > Russ
 
 
 



Re: [9fans] p9p venti sync?

2010-11-16 Thread David Leimbach


> I'll try without the bloom filter.

Now it's working... I probably don't need this enhancement anyway, but at
least it appears to be working now.  Unvac of a previously generated score
is working fine.

Dave



> On Tuesday, November 16, 2010, David Leimbach <leim...@gmail.com> wrote:
> > On Tuesday, November 16, 2010, Russ Cox <r...@swtch.com> wrote:
> > > On Tue, Nov 16, 2010 at 5:43 PM, David Leimbach <leim...@gmail.com> wrote:
> > > > I'm trying to figure out how to correctly sync a plan9port venti instance so
> > > > I can start it back up again and have it actually function :-).
> > > > using venti/sync doesn't appear to get the job done...
> > >
> > > It should.  Not using venti/sync should work too since
> > > vac etc all sync before hanging up.  The flushicache/flushdcache
> > > trick will make restarting a little faster, but it should not
> > > be necessary for correctness and shouldn't even be
> > > that much faster.
> >
> > I did a kill TERM... should the signal handler have cleaned up or was
> > I supposed to send HUP?
> >
> > > Russ
 
 
 





Re: [9fans] That deadlock, again

2010-11-16 Thread cinap_lenrek
qpc is just the caller of the last successfully *acquired* qlock.
what we know is that the exportfs proc spins in the q->use taslock
called by qlock(), right?  this already seems weird...  q->use is held
just long enough to test q->locked and manipulate the queue.  also,
sched() will avoid switching to another proc while we are holding tas
locks.

i would like to know which qlock the kernel is trying to acquire
on behalf of exportfs that is also reachable from the etherread4
code.

one could move:

	up->qpc = getcallerpc(&q);

from qlock() to before the lock(&q->use); so we can see from where the
qlock that hangs the exportfs call gets called, or add another magic
debug pointer (qpctry) to the proc structure and print it in dumpaproc().
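sketched against the qlock() listed earlier in the thread (kernel context
assumed, so this is not compilable on its own), the first option would look
roughly like:

	void
	qlock(QLock *q)
	{
		Proc *p;

		if(up != nil)
			up->qpc = getcallerpc(&q);	/* moved: record the caller before we can spin */
		if(q->use.key == 0x55555555)
			panic("qlock: q %#p, key 5*\n", q);
		lock(&q->use);			/* the taslock exportfs is spinning in */
		rwstats.qlock++;
		if(!q->locked){
			q->locked = 1;
			unlock(&q->use);
			return;
		}
		/* ... queue behind q->tail and sched(), as before ... */
	}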

--
cinap


Re: [9fans] That deadlock, again

2010-11-16 Thread cinap_lenrek
sorry for not being clear.  what i meant was that qpc is for the last
qlock we succeeded to acquire.  it's *not* the one we are spinning on.
also, qpc is not set to nil on unlock.

--
cinap


Re: [9fans] That deadlock, again

2010-11-16 Thread Lucio De Re
On Wed, Nov 17, 2010 at 06:22:33AM +0100, cinap_len...@gmx.de wrote:
 
> qpc is just the caller of the last successfully *acquired* qlock.
> what we know is that the exportfs proc spins in the q->use taslock
> called by qlock(), right?  this already seems weird...  q->use is held
> just long enough to test q->locked and manipulate the queue.  also,
> sched() will avoid switching to another proc while we are holding tas
> locks.
>
I think I'm with you, probably not quite to the same depth of understanding.

> i would like to know which qlock the kernel is trying to acquire
> on behalf of exportfs that is also reachable from the etherread4
> code.
>
... and from whatever the other proc is that also contributes to this
jam. I don't have the name right in front of me, but I will post it
separately.  As far as I know it's always those two that interfere with
exportfs and usually together, only a short time apart.

> one could move:
>
> 	up->qpc = getcallerpc(&q);
>
> from qlock() to before the lock(&q->use); so we can see from where the
> qlock that hangs the exportfs call gets called, or add another magic
> debug pointer (qpctry) to the proc structure and print it in dumpaproc().
>
I think I'll do the latter; even though it's more complex, it can be a
useful debugging tool in future.  I wouldn't leave it in the kernel code,
but it would be worth being able to refer to it when the occasion arises.

How do you expect this qpctry to be initialised/set?

++L



Re: [9fans] That deadlock, again

2010-11-16 Thread Lucio De Re
On Wed, Nov 17, 2010 at 06:33:13AM +0100, cinap_len...@gmx.de wrote:
> sorry for not being clear.  what i meant was that qpc is for the last
> qlock we succeeded to acquire.  it's *not* the one we are spinning on.
> also, qpc is not set to nil on unlock.
>
Ok, so we set qpctry (qpcdbg?) to qpc before changing qpc?  Irrespective
of whether qpc is set or nil?  And should qunlock() clear qpc for safety,
or would this just make debugging more difficult?

++L



Re: [9fans] That deadlock, again

2010-11-16 Thread Lucio De Re
On Wed, Nov 17, 2010 at 08:45:00AM +0200, Lucio De Re wrote:
> ... and from whatever the other proc is that also contributes to this
> jam. I don't have the name right in front of me, but I will post it
> separately.  As far as I know it's always those two that interfere with
> exportfs and usually together, only a short time apart.
>
#I0tcpack pc f01ff12a dbgpc ...

That's the other common factor.

++L



Re: [9fans] That deadlock, again

2010-11-16 Thread erik quanstrom
> On Wed, Nov 17, 2010 at 06:33:13AM +0100, cinap_len...@gmx.de wrote:
> > sorry for not being clear.  what i meant was that qpc is for the last
> > qlock we succeeded to acquire.  it's *not* the one we are spinning on.
> > also, qpc is not set to nil on unlock.
> >
> Ok, so we set qpctry (qpcdbg?) to qpc before changing qpc?  Irrespective
> of whether qpc is set or nil?  And should qunlock() clear qpc for safety,
> or would this just make debugging more difficult?

no.  you're very likely to nil out the new qpc if you do that.

- erik



[9fans] Plan9 development

2010-11-16 Thread Pavel Zholkover
Hi,
I did a Go runtime port for x86; it is already in the main hg repository.
Right now it cross-compiles from Linux, for example (GOOS=plan9 8l -s
when linking; notice the -s, it is required).

There were a few changes made to the upstream so the following patch
is needed until the fix is committed:
http://codereview.appspot.com/2674041/

Right now I'm working on syscall package.

Pavel



Re: [9fans] Plan9 development

2010-11-16 Thread Lucio De Re
On Wed, Nov 17, 2010 at 09:38:46AM +0200, Pavel Zholkover wrote:
> I did a Go runtime port for x86; it is already in the main hg repository.
> Right now it cross-compiles from Linux, for example (GOOS=plan9 8l -s
> when linking; notice the -s, it is required).
>
I have Plan 9 versions of the toolchain that ought to make it possible
to do the same under Plan 9.  I'll have a look around the repository,
see if I can add any value.

> There were a few changes made to the upstream so the following patch
> is needed until the fix is committed:
> http://codereview.appspot.com/2674041/
>
> Right now I'm working on syscall package.
>
Thanks for letting us know.

++L