Re: [9fans] 9p fids and references
OK, thanks.

2009/7/30, roger peppe <rogpe...@gmail.com>:
> 2009/7/30 hugo rivera <uai...@gmail.com>:
>> [...] there's no way two different files point to the same data structure (but maybe two different fids do?) so reference counting is unnecessary, am I right?
>
> no, because a file can be opened several times. when you open a file you get a new fid. so if you've got resources associated with the file, as opposed to resources private to the fid, you have to reference count them (or poison any fids that point to the file, if you *really* want the resource to go away)

--
Hugo
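The distinction above between per-file and per-fid resources can be sketched in portable C. This is only an illustration under stated assumptions: `File`, `Fid`, `newfile`, `attachfid` and `clunkfid` are hypothetical names invented for this sketch, not the lib9p API, and a real server would also need locking.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types: one File may be shared by many Fids. */
typedef struct File File;
struct File {
	int ref;	/* number of fids currently pointing at this file */
	void *aux;	/* resource shared by every fid on the file */
};

typedef struct Fid Fid;
struct Fid {
	File *file;
	int omode;	/* state private to this fid; needs no counting */
};

/* Create a file with no fids attached yet. */
File*
newfile(void *aux)
{
	File *f;

	f = malloc(sizeof *f);
	if(f == NULL)
		return NULL;
	f->ref = 0;
	f->aux = aux;
	return f;
}

/* A new fid on f (for example after an open): take a reference. */
Fid*
attachfid(File *f, int omode)
{
	Fid *fid;

	fid = malloc(sizeof *fid);
	if(fid == NULL)
		return NULL;
	fid->file = f;
	fid->omode = omode;
	f->ref++;
	return fid;
}

/*
 * Drop a fid. The shared resource is released only when the last
 * fid goes away. Returns 1 if the file itself was freed, else 0.
 */
int
clunkfid(Fid *fid, void (*release)(void*))
{
	File *f;
	int last;

	f = fid->file;
	assert(f->ref > 0);
	free(fid);
	last = (--f->ref == 0);
	if(last){
		if(release != NULL)
			release(f->aux);
		free(f);
	}
	return last;
}
```

Opening the same file twice yields two `Fid`s and `ref == 2`; the first clunk leaves the shared `aux` alone and only the second one releases it.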
[9fans] detecting drawterm
i'm probably being stupid here, but what's a good robust way of detecting in $home/lib/profile that the remote connection is from drawterm, so that i can start rio etc? currently the best i've got is to check /mnt/term/sysname, but that falls down the moment i connect from a different host...
Re: [9fans] detecting drawterm
Drawterm connects with service=cpu

In the cpu clause I do this:

	if (! test -e /mnt/term/mnt/wsys) {	# dt2k
		# cpu call from drawterm
		if (test -e /mnt/term/dev/secstore){
			auth/factotum -n
			cat /mnt/term/dev/secstore | read -m > /mnt/factotum/ctl
			echo > /mnt/term/dev/secstore
		}
		if not {	# old drawterm
			auth/factotum
		}
		webfs
		plumber
		webcookies
		upas/fs
		exec rio -s -i startup
	}

note the secstore device created by drawterm which I push into my new factotum and then clean out (just in case).

-Steve
Re: [9fans] detecting drawterm
/mnt/term/dev/hostdomain? AFAIK it's always drawterm.net.

On Fri, Jul 31, 2009 at 12:31 PM, roger peppe <rogpe...@gmail.com> wrote:
> i'm probably being stupid here, but what's a good robust way of detecting in $home/lib/profile that the remote connection is from drawterm, so that i can start rio etc?
>
> currently the best i've got is to check /mnt/term/sysname, but that falls down the moment i connect from a different host...
Re: [9fans] detecting drawterm
2009/7/31 Steve Simon <st...@quintile.net>:
> Drawterm connects with service=cpu
>
> In the cpu clause I do this:
>
>	if (! test -e /mnt/term/mnt/wsys) {	# dt2k
>		# cpu call from drawterm
>		if (test -e /mnt/term/dev/secstore){
>			auth/factotum -n
>			cat /mnt/term/dev/secstore | read -m > /mnt/factotum/ctl
>			echo > /mnt/term/dev/secstore
>		}
>		if not {	# old drawterm
>			auth/factotum
>		}
>		webfs
>		plumber
>		webcookies
>		upas/fs
>		exec rio -s -i startup
>	}

that's useful, thanks. i hadn't noticed the (apparently undocumented) secstore device.
Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
this resulted in a little side discussion. to save someone else from having to break a strong oath about 9fans, i'll sum it up.

the existing code handles this situation.

when several processes share a segment, and any one of them decides to shrink the segment, all the processes must see the change before the pages are made available for re-use. any one of them could have any of those pages currently in their mmu state. and any must be removed from the mmu state local to the process, to ensure that no further access to the pages is possible before they are freed. the critical point is that it's irrelevant whether traps or syscalls are involved: ordinary store instructions are clearly bad enough.

thus /sys/src/9/port/segment.c:/^mfreeseg contains (with s locked)

	/* flush this seg in all other processes */
	if(s->ref > 1)
		procflushseg(s);

and procflushseg finds all processes that share s, sets them all up to flush their mmu states, and also sets any processor running such a process to flush its state (that's picked up by a clock interrupt). procflushseg will not proceed until all processes and processors that might need to flush state have done so. (s remains locked throughout.)

after the flush, no process or processor can have a reference to any of the pages in its mmu state. it is safe to free them, which mfreeseg does.

now, a process might still be executing a system call or some other trap that might refer to that segment, to an address that's now been removed by another process. to access the memory, it must ultimately issue a load or a store instruction (even for syscalls, such as read or write). that instruction will trap, because as described above there is no longer a valid mmu mapping for that address within the process. normally, the trap will find the right page in the software mmu structures and install the map. in this case, however, it can't find the address, so it will raise an exception in the process (ie, send a note). (all such searches are done with the segment properly locked.)

---BeginMessage---
just stop processes with s->ref > 1 from freeing parts of s with ibrk. it's not as if anything ever does in practice.
---End Message---
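The flush-before-free ordering summarised above can be illustrated with a toy model in portable C. This is a sketch under stated assumptions, not kernel code: `Seg`, `flushothers` and `shrink` are hypothetical stand-ins for the segment structure, procflushseg and mfreeseg, and the model clears individual page mappings whereas the real procflushseg makes each sharing process flush its whole mmu state.

```c
#include <assert.h>

enum { NPROC = 4, NPAGE = 8 };

/* Toy segment: mapped[p][i] != 0 means process p has page i in its mmu state. */
typedef struct Seg Seg;
struct Seg {
	int ref;			/* number of processes sharing the segment */
	int npage;			/* current size in pages */
	int mapped[NPROC][NPAGE];
	int freed[NPAGE];		/* pages handed back for re-use */
};

/* Stand-in for procflushseg: drop the other processes' mappings from page `from` up. */
static void
flushothers(Seg *s, int caller, int from)
{
	int p, i;

	for(p = 0; p < NPROC; p++){
		if(p == caller)
			continue;
		for(i = from; i < NPAGE; i++)
			s->mapped[p][i] = 0;
	}
}

/*
 * Stand-in for mfreeseg: shrink s to newsize pages on behalf of `caller`.
 * Every mapping is removed before any page is freed.
 * Returns the number of pages freed.
 */
int
shrink(Seg *s, int caller, int newsize)
{
	int p, i, n;

	assert(caller >= 0 && caller < NPROC);
	assert(s->npage <= NPAGE);
	if(newsize < 0 || newsize >= s->npage)
		return 0;

	/* the caller drops its own mappings first */
	for(i = newsize; i < s->npage; i++)
		s->mapped[caller][i] = 0;

	/* flush this seg in all other processes */
	if(s->ref > 1)
		flushothers(s, caller, newsize);

	n = 0;
	for(i = newsize; i < s->npage; i++){
		/* invariant: no stale mapping survives to the point of freeing */
		for(p = 0; p < NPROC; p++)
			assert(s->mapped[p][i] == 0);
		s->freed[i] = 1;
		n++;
	}
	s->npage = newsize;
	return n;
}
```

Any later access by another process to a freed page then finds no mapping and must fault, which is the behaviour the summary describes.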
Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
> ... procflushseg finds all processes that share s, sets them all up to flush their mmu states, and also sets any processor running such a process to flush its state (that's picked up by a clock interrupt). procflushseg will not proceed until all processes and processors that might need to flush state have done so. (s remains locked throughout.)

Coincidentally, I spent last Sunday debugging a deadlock in precisely this spot. I had absentmindedly tried to use VESA vga on a multiprocessor. The aux/vga -l apparently succeeded and the screen looked great, but as a stealthy side-effect the CPU which had done the VESA call stopped responding to interrupts -- including the local APIC clock interrupt required for the mmu flush as described above. So, some time later when another process (upas/fs as it happens) on the other CPU wanted to adjust a segment size, procflushseg was called and never returned.

Debugging can be challenging when cause and effect are minutes or hours apart ...
Re: [9fans] installing from sdC0 or sdC1 CD
I was able to install from sdC1 in the Bochs emulator.

If you install and your CD is not sdD0, you get this prompt:

	Unknown boot device: sdD0!cdboot!9pcflop.gz
	Boot device: fd0
	boot from:

If it's sdC1 (as in my case), the answer has to be:

	boot from: sdC1!cdboot!9pcflop.gz

More info about the install at http://plan9.bell-labs.com/wiki/plan9/installation_instructions/
Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
very interesting. thanks for the sum up. but my testcase crashes a uniprocessor system, so there is no waiting for mmuflushes on other processors going on. any other process that shares the segment and was suspended in the kernel may potentially hold a pointer to that freed memory area of the segment and may cause a fault in the kernel on resume.

	process1: in kernel:
		read(buf)
		validaddr(buf)
		...
		*context switch*

	process2: in kernel:
		...
		ibrk()
		mfreeseg()
		procflushseg()
		flushmmu()
		...
		*context switch*

	process1: still in kernel:
		memmove(buf, ...)
		*fault*
		trap()
		fault386()
		fault() = -1
		if(!user){
			panic()
			*panic*
		}
		...
		postnote()

we can't really wait for all processes sharing the segment to be in userspace as they may already be waiting in a long blocking syscall. how do we fix this?

--
cinap
[9fans] just an idea (Splashtop like)
A buddy of mine just got this: asus p5ql/epu motherboard.

It came with Splashtop: http://en.wikipedia.org/wiki/Splashtop

... which is a Linux distribution that boots in about 5 seconds. Complete with BlackBox for a window manager, Skype, an instant messenger client and Firefox.

I wonder if it could be changed to be a plan 9 terminal, or if one could at least get 9vx on it.

Dave
Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
> process1: still in kernel:
>	memmove(buf, ...)
>	*fault*
>	trap()
>	fault386()
>	fault() = -1
>	if(!user){
>		panic()
>		*panic*
>	}
>	...
>	postnote()

could you be more specific. what is your test program, where is it crashing (if you know), and what is the panic message, if any? i must be dense, but i'm confused by your process diagram.

- erik
Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
2009/7/31 erik quanstrom <quans...@coraid.com>:
>> process1: still in kernel:
>>	memmove(buf, ...)
>>	*fault*
>>	trap()
>>	fault386()
>>	fault() = -1
>>	if(!user){
>>		panic()
>>		*panic*
>>	}
>>	...
>>	postnote()
>
> could you be more specific. what is your test program, where is it crashing (if you know), and what is the panic message, if any? i must be dense, but i'm confused by your process diagram.

He posted it earlier in this thread

--dho

> - erik
Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
2009/7/31 erik quanstrom <quans...@coraid.com>:
>>> could you be more specific. what is your test program, where is it crashing (if you know), and what is the panic message, if any? i must be dense, but i'm confused by your process diagram.
>>
>> He posted it earlier in this thread
>>
>> --dho
>
> please post a reference. i do not see either the crash code or the panic message on 9fans.net/archive/2009/07. i don't recall it in the originals, either.
>
> - erik

Was received as an attachment. Inline:

	#include <u.h>
	#include <libc.h>

	void
	main(int argc, char **argv)
	{
		char *buf;
		int fd;

		if((fd = open("/dev/zero", OREAD)) < 0)
			sysfatal("open");

		/* attach a two-page shared segment at buf */
		buf = (char*)0x60;
		segattach(0, "memory", buf, 2*4096);

		switch(rfork(RFPROC|RFMEM)){
		case -1:
			sysfatal("fork");
			break;
		case 0:
			/* child: keep reading into the second page */
			for(;;){
				read(fd, buf+4096, 4096);
			}
		}

		/* parent: shrink the segment to one page while the child reads */
		sleep(1000);
		segbrk(buf, buf+4096);
	}

--dho
Re: [9fans] detecting drawterm
> i hadn't noticed the (apparently undocumented) secstore device.

ditto.
Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
> but my testcase crashes a uniprocessor system, so there is no waiting for mmuflushes on other processors going on.

it ensures mmuflushes in all other processes (sharing that segment) as well. in fact, the crash you describe just emphasises that point: the page reference no longer exists, hence the fault.

the problem (which frankly doesn't bother me) is that fault386 is being overly cautious in assuming that a page fault that occurs in system mode but can't map a page successfully is necessarily a kernel bug: that's not true. it could just note the process instead.

(it doesn't bother me because since unix days i've seen less than a handful of programs that SHRINK their existing data segments, and i think that's the only case that can cause the panic you're seeing.)
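The choice discussed above, panic versus noting the process when a system-mode fault cannot be mapped, can be written down as a small decision function. This is a sketch in portable C, not the code in /sys/src/9/pc/trap.c: `faultaction`, `Policy` and `Action` are hypothetical names, with `Strict` standing for the behaviour the thread attributes to fault386 today and `Lenient` for the suggested alternative.

```c
#include <assert.h>

enum Policy { Strict, Lenient };	/* Strict: an unmappable system-mode fault is treated as a kernel bug */
enum Action { Resume, Note, Panic };

/*
 * mapped: non-zero if the fault handler found the page and installed the map.
 * user:   non-zero if the fault happened in user mode.
 */
enum Action
faultaction(int mapped, int user, enum Policy policy)
{
	assert(policy == Strict || policy == Lenient);

	if(mapped)
		return Resume;		/* ordinary page fault: carry on */
	if(!user && policy == Strict)
		return Panic;		/* assumed to be a kernel bug */
	return Note;			/* raise an exception in the process */
}
```

Under `Strict`, the test program's read into a page that another process has just freed lands in the `Panic` branch; under `Lenient` the same fault would only note the offending process.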
Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
> it ensures mmuflushes in all other processes (sharing that segment) as well. in fact, the crash you describe just emphasises that point: the page reference no longer exists, hence the fault.
>
> the problem (which frankly doesn't bother me) is that fault386 is being overly cautious in assuming that a page fault that occurs in system mode but can't map a page successfully is necessarily a kernel bug: that's not true. it could just note the process instead.
>
> (it doesn't bother me because since unix days i've seen less than a handful of programs that SHRINK their existing data segments, and i think that's the only case that can cause the panic you're seeing.)

if this case is really not important, would it make sense to disallow shrinking segments? it might be worth it just to be able to define Eshrinkage.

- erik
Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
you can get similar effects by remapping things. i meant that it isn't likely to happen by accident, so am i bovvered?

fault386 needs to be fixed mainly by or for people running a shared cpu server with hostile users (ie, students). for the rest of us it might be more useful to have the panic to prevent real kernel bugs (ie, just bad pointers in device driver implementations) from postnoting a process instead of stopping the system.

having said that, it could be argued that even in that case a postnote to the invoking process would allow the rest of the system to run and `might not' mean that the broken driver has wrecked other data structures outside it in kernel memory.
Re: [9fans] off topic: manual sets
Slightly off topic (of this thread), but I'm looking for troff sources of 8th, 9th, 10th and Plan 9 1st Editions to complete the man page collections at http://man.cat-v.org

If anyone knows where I can find copies of any of those, I would be most grateful.

uriel

On Thu, Jul 30, 2009 at 6:13 PM, Benjamin Huntsman <bhunts...@mail2.cu-portland.edu> wrote:
> Sorry for the off-topic post, but I'm striking out on google, and I'm virtually certain that someone here will know...
>
> Does anyone happen to have the ISBN's for the 7th Edition manual set? Volume I is 0-03-061742-1, but I can't seem to find the others...
>
> Thanks in advance!
>
> -Ben
Re: [9fans] Race condition in /sys/src/9/pc/trap.c?
i attached it in a previous mail... try again...

--
cinap

	#include <u.h>
	#include <libc.h>

	void
	main(int argc, char **argv)
	{
		char *buf;
		int fd;

		if((fd = open("/dev/zero", OREAD)) < 0)
			sysfatal("open");

		buf = (char*)0x60;
		segattach(0, "memory", buf, 2*4096);

		switch(rfork(RFPROC|RFMEM)){
		case -1:
			sysfatal("fork");
			break;
		case 0:
			for(;;){
				read(fd, buf+4096, 4096);
			}
		}

		sleep(1000);
		segbrk(buf, buf+4096);
	}
Re: [9fans] just an idea (Splashtop like)
On Fri, Jul 31, 2009 at 1:50 PM, David Leimbach <leim...@gmail.com> wrote:
> A buddy of mine just got this: asus p5ql/epu motherboard.
>
> It came with Splashtop: http://en.wikipedia.org/wiki/Splashtop
>
> ... which is a linux distribution that boots in like 5 seconds or so. Complete with BlackBox for a window manager, Skype, an instant messager client and firefox.
>
> I wonder if it could be changed to be a plan 9 terminal, or if one could at least get 9vx on it.
>
> Dave

Doesn't ASUS burn the Linux distro into a chip, though? Maybe there are utilities to flash it with something else.
Re: [9fans] just an idea (Splashtop like)
On Fri, Jul 31, 2009 at 6:56 PM, J.R. Mauro <jrm8...@gmail.com> wrote:
> Doesn't ASUS burn the Linux distro into a chip, though? Maybe there are utilities to flash it with something else.

see flashrom at coreboot.org

This is a great idea assuming we can get a mobo that plan 9 can use.

ron
Re: [9fans] ceph
On Jul 30, 2009, at 9:31 AM, sqweek wrote:
> 2009/7/30 Roman V Shaposhnik <r...@sun.com>:
>> This is sort of off-topic, but does anybody have any experience with Ceph? http://ceph.newdream.net/
>>
>> Good or bad war stories (and general thoughts) would be quite welcome.
>
> Not with ceph itself, but the description and terminology they use remind me a lot of lustre (seems like it's a userspace version) which we use at work. Does a damn fine job - as long as you get a stable version. We have run into issues trying out new versions several times...

I guess that sums up my impression of ceph so far: I don't see where it would fit. I think that in HPC it is 99% Lustre, in enterprise it is either CIFS or NFS, etc.

There's some internal push for it around here so I was wondering whether I missed a memo once again...

Thanks,
Roman.
Re: [9fans] ceph
I'm not a big fan of lustre. In fact I'm talking to someone who really wants 9p working well so he can have lustre on all but a few nodes, and those lustre nodes export 9p. ron
Re: [9fans] ceph
On Jul 31, 2009, at 10:41 PM, ron minnich wrote:
> I'm not a big fan of lustre. In fact I'm talking to someone who really wants 9p working well so he can have lustre on all but a few nodes, and those lustre nodes export 9p.

What are your clients running? What are their requirements as far as POSIX is concerned? How much storage are we talking about?

I'd be interested in discussing some aspects of what you're trying to accomplish with 9P for the HPC guys.

Thanks,
Roman.

P.S. If it is ok with everybody else -- I'll keep the conversation on the list.