Re: [9fans] simple venti demo:
On Thursday 11 of August 2011 18:15:09 ron minnich wrote:
> Could we make little venti files and finally try to build an SCM using these files?

Venti would make a great backend for Git. I believe Git's commit and tree formats are simple enough to be re-implemented, if porting proves to be too bothersome. Either the whole .git/ directory would be held in venti, or just .git/objects/, the storage.

On the other hand, Git has that interesting feature that a bunch of recent commits is held in loose files, while older commits are re-packed into a space-efficient format using deltas. Was a similar strategy ever considered for Venti? As in, to keep fresh data in the present-day format, but migrate older data into a denser format?

--
dexen deVries

[[[↓][→]]]

For example, if the first thing in the file is:
   <?kzy irefvba="1.0" rapbqvat="ebg13"?>
an XML parser will recognize that the document is stored in the traditional ROT13 encoding.

(( Joe English, http://www.flightlab.com/~joe/sgml/faq-not.txt ))
Re: [9fans] simple venti demo:
On Fri, Aug 12, 2011 at 12:50 AM, dexen deVries dexen.devr...@gmail.com wrote:
> Venti would make a great backend for Git.

I guess you could do this; it'd be interesting to do something more in keeping with the Unix model than git is.

Create repo:
	dd -if etc. etc.

checkout
	unvac

commit
	vac

etc.

compare two trees:
	vacfs score1 /tree1
	vacfs score2 /tree2
	diff-somehow /tree1 /tree2

It seems to me that all the things that are done with git today, with all its special purpose commands, might be done with a Unix tool approach. Plus, some of the git commands are pretty interesting and might be useful if they could be applied to other contexts.

ron
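The "compare two trees by score" idea can be sketched in a few lines. This is a toy in Python, not vac: the `BlockStore` class, the `dir`/`file` entry format, and the helper names are all made up for illustration. The only thing borrowed from venti is that a block's address is the SHA-1 of its contents, so two trees with the same root score are identical and a diff only needs to descend where scores differ.

```python
import hashlib


class BlockStore:
    """Toy in-memory content-addressed store: a block's key is its SHA-1."""

    def __init__(self):
        self.blocks = {}

    def write(self, data):
        score = hashlib.sha1(data).hexdigest()
        self.blocks.setdefault(score, data)  # identical data is stored once
        return score

    def read(self, score):
        return self.blocks[score]


def store_tree(store, tree):
    """Store a {name: bytes-or-subtree} mapping and return its root score."""
    lines = []
    for name in sorted(tree):
        node = tree[name]
        if isinstance(node, dict):
            lines.append("dir %s %s" % (store_tree(store, node), name))
        else:
            lines.append("file %s %s" % (store.write(node), name))
    return store.write("\n".join(lines).encode())


def changed_names(store, score_a, score_b):
    """List top-level names whose entries differ between two stored trees."""
    if score_a == score_b:
        return []  # equal scores mean equal content; no walk needed

    def entries(score):
        out = {}
        for line in store.read(score).decode().splitlines():
            kind, child, name = line.split(" ", 2)
            out[name] = (kind, child)
        return out

    a, b = entries(score_a), entries(score_b)
    return sorted(n for n in a.keys() | b.keys() if a.get(n) != b.get(n))
```

A "commit" in this toy is just the root score returned by `store_tree`; unchanged files and directories cost nothing extra because their blocks are already in the store.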
Re: [9fans] simple venti demo:
i call this 'ventino'. it's a tiny venti that keeps the whole index in memory, backed by a text file. i have not used it in a while (the file is dated may 25, 2009) but hey, it's a working venti server in 329 lines of code.

#include <u.h>
#include <libc.h>
#include <bio.h>
#include <flate.h>
#include <thread.h>
#include <venti.h>
#include <libsec.h>

typedef uchar byte;
typedef u64int uint64;
typedef u32int uint32;

typedef struct IEntry IEntry;
struct IEntry
{
	IEntry *link;

	// disk data
	byte score[VtScoreSize];
	uint64 offset;
};

typedef struct Chunk Chunk;
struct Chunk
{
	byte score[VtScoreSize];
	uint32 size;
	byte *data;
};

IEntry **ihash;
uint nihash;
uint nientry;

// double the hash table and redistribute every entry
void
rehash(void)
{
	IEntry **new, *e, *next;
	uint i, n;
	uint32 h;

	n = nihash<<1;
	// the new table must hold n buckets, since h below ranges over 0..n-1
	new = vtmallocz(n*sizeof new[0]);
	for(i=0; i<nihash; i++) {
		for(e = ihash[i]; e; e = next) {
			next = e->link;
			h = *(uint32*)e->score & (n - 1);
			e->link = new[h];
			new[h] = e;
		}
	}
	free(ihash);
	ihash = new;
	nihash = n;
}

// find the index entry for a score, or nil
IEntry*
ilookup(byte *score)
{
	uint32 h;
	IEntry *e;

	// Not a great hash; assumes we are
	// seeing all blocks, not just some chosen subset.
	h = *(uint32*)score & (nihash - 1);
	for(e = ihash[h]; e; e = e->link)
		if(memcmp(e->score, score, VtScoreSize) == 0)
			return e;
	return nil;
}

// add an index entry for a score; the caller fills in the offset
IEntry*
iinsert(byte *score)
{
	uint32 h;
	IEntry *e;

	if(nihash < (1<<28) && nientry > 2*nihash)
		rehash();

	h = *(uint32*)score & (nihash - 1);
	e = vtmallocz(sizeof(IEntry));
	e->link = ihash[h];
	ihash[h] = e;
	memmove(e->score, score, VtScoreSize);
	return e;
}

// load the text index: one "score offset" pair per line
void
iload(Biobuf *b)
{
	char *p;
	char *f[10];
	int nf;
	byte score[VtScoreSize];
	uint64 offset;
	IEntry *e;

	while((p = Brdline(b, '\n')) != nil) {
		p[Blinelen(b)-1] = '\0';
		nf = tokenize(p, f, nelem(f));
		if(nf != 2 || vtparsescore(f[0], nil, score) < 0 || (offset = strtoull(f[1], 0, 0)) == 0) {
			sysfatal("malformed index");
			return;
		}
		e = iinsert(score);
		e->offset = offset;
	}
}

// append one entry to the text index
void
iwrite(int fd, IEntry *e)
{
	fprint(fd, "%V %-22llud\n", e->score, e->offset);
}

enum
{
	ArenaBlock = 1<<30
};

// append a chunk to the data file, compressed when that saves at least 512 bytes;
// returns the offset of the chunk's header line
uint64
dwrite(int fd, Chunk *c)
{
	byte *zdat, *w;
	int nzdat;
	uint nw;
	uint64 offset, eoffset;

	zdat = vtmallocz(c->size + 1024);
	nzdat = deflateblock(zdat, c->size + 1024, c->data, c->size, 6, 0);
	if(nzdat < 0 || nzdat > c->size - 512) {
		// don't bother with compression
		w = c->data;
		nw = c->size;
	} else {
		w = zdat;
		nw = nzdat;
	}

	// a chunk never straddles an ArenaBlock boundary:
	// skip ahead to the next block if it would not fit in this one
	offset = seek(fd, 0, 1);
	eoffset = offset + 2*VtScoreSize + 12 + 12 + 1 + nw;
	if(eoffset / ArenaBlock != offset / ArenaBlock) {
		offset /= ArenaBlock;
		offset++;
		offset *= ArenaBlock;
		seek(fd, offset, 0);
	}

	fprint(fd, "%V %-11ud %-11ud\n", c->score, c->size, nw);
	write(fd, w, nw);

	free(zdat);
	return offset;
}

// read the chunk at the current position of the data file
void
dread(Biobuf *b, Chunk *c)
{
	char *p, *f[10];
	int nf;
	uint zsize;
	byte *r;
	uint64 offset;
	char buf[100];

	offset = Boffset(b);
	p = Brdline(b, '\n');
	if(p == nil || Blinelen(b) >= sizeof buf)
		sysfatal("malformed data - EOF");
	memmove(buf, p, Blinelen(b));
	buf[Blinelen(b)-1] = '\0';
	nf = tokenize(buf, f, nelem(f));
	if(nf != 3 || vtparsescore(f[0], nil, c->score) < 0
	|| (c->size = strtoul(f[1], 0, 0)) == 0
	|| (zsize = strtoul(f[2], 0, 0)) == 0) {
		sysfatal("malformed data at %llud / %d", offset, nf);
		return;
	}

	c->data = vtmalloc(c->size);
	if(c->size == zsize)
		r = c->data;
	else
		r = vtmallocz(zsize);
	Bread(b, r, zsize);
	if(c->size != zsize) {
		if((nf = inflateblock(c->data, c->size, r, zsize)) < 0)
			sysfatal("inflateblock fail %d %d %d %.10H...", c->size, zsize, nf, r);
		free(r);
	}
}

Biobuf *bindexr;
Biobuf *bdatar;
int indexw, dataw;
Biobuf *bsha1;	// TODO

int doCreate;
int verbose;

void
doOpen(char *name, char *what, int *w, Biobuf **r)
{
	int fd;
	char buf[100];
	char *p;

	snprint(buf, sizeof buf, "# sventi %s\n", what);
	if((*w = open(name, OWRITE)) < 0)
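The data-file layout in the posted code is easy to check by hand. Below is a small Python sketch of just the arithmetic, assuming `ArenaBlock` is `1<<30` bytes and that the header is the text line `score size nw` padded as in the `fprint` call; the function names `header` and `placement` are illustrative and do not appear in ventino.

```python
VT_SCORE_SIZE = 20                 # SHA-1 score, 20 bytes (40 hex digits)
ARENA_BLOCK = 1 << 30              # assumed value of ArenaBlock
# 40 hex digits + space + 11-wide size + space + 11-wide stored size + newline
HEADER_SIZE = 2 * VT_SCORE_SIZE + 12 + 12 + 1


def header(score_hex, size, nw):
    """Build the text header that precedes each chunk's bytes."""
    return ("%s %-11d %-11d\n" % (score_hex, size, nw)).encode()


def placement(offset, nw, arena_block=ARENA_BLOCK):
    """Return where a chunk of nw stored bytes goes if the file is at offset.

    If header plus data would end in a later arena block than it starts in,
    the chunk is moved to the start of the next block.
    """
    end = offset + HEADER_SIZE + nw
    if end // arena_block != offset // arena_block:
        offset = (offset // arena_block + 1) * arena_block
    return offset
```

With this rule a chunk that would end exactly on a boundary is also moved, because the comparison uses the end offset rather than the offset of the last byte.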
Re: [9fans] simple venti demo:
On Fri, 2011-08-12 at 12:18 -0400, Russ Cox wrote:
> i call this 'ventino'.

Shouldn't it be 'ventina'? Venti seems feminine.
Re: [9fans] simple venti demo:
On Wed, Aug 10, 2011 at 8:05 PM, David Leimbach leim...@gmail.com wrote:
> Isn't p9p POSIX enough? Confused I am, but wasn't that the point of p9p?

p9p gives you a runtime environment just like Plan 9's. From the point of view of a programmer you can even pretend you're not in a POSIX world. It's wonderful, but there are times when people want the functionality (e.g. a venti server) built on POSIX libraries rather than p9p libraries.

We hit that issue a lot in the early days of xcpu. The first few versions were very much p9p code. Users complained about the need for the extra libraries and unfamiliar programming environment. Later versions of xcpu were all POSIX, no p9p at all.

Hope I said that right.

ron
Re: [9fans] simple venti demo:
On Thu, Aug 11, 2011 at 1:21 AM, ron minnich rminn...@gmail.com wrote:
> On Wed, Aug 10, 2011 at 8:05 PM, David Leimbach leim...@gmail.com wrote:
>> Isn't p9p POSIX enough? Confused I am, but wasn't that the point of p9p?
>
> p9p gives you a runtime environment just like Plan 9's. From the point of view of a programmer you can even pretend you're not in a POSIX world. It's wonderful, but there are times when people want the functionality (e.g. a venti server) built on POSIX libraries rather than p9p libraries.
>
> We hit that issue a lot in the early days of xcpu. The first few versions were very much p9p code. Users complained about the need for the extra libraries and unfamiliar programming environment. Later versions of xcpu were all POSIX, no p9p at all.
>
> Hope I said that right.

I don't know if it's still the case, but when I was playing with venti a few years ago it had problems with chunks of memory > 2G. I was trying to run p9p venti on a server with 64GB of RAM but could only use a fraction of that for the venti caches. Now that may have been more of a venti problem than a p9p problem; sadly I didn't have the time to track it down.

-eric
Re: [9fans] simple venti demo:
rather than continue to live for the next 20 years with (say) 20- to 30-year old include file structures and library implementations that became overly complicated (and badly implemented), a better approach might be to separate the libraries from the much larger distribution of plan 9-based commands etc, and make them available in the usual way as packages to import using the (many different) package managers on Unix-like systems.
Re: [9fans] simple venti demo:
On Thu, Aug 11, 2011 at 10:09 AM, Charles Forsyth fors...@terzarima.net wrote:
> rather than continue to live for the next 20 years with (say) 20- to 30-year old include file structures and library implementations that became overly complicated (and badly implemented), a better approach might be to separate the libraries from the much larger distribution of plan 9-based commands etc, and make them available in the usual way as packages to import using the (many different) package managers on Unix-like systems.

9base? (http://tools.suckless.org/9base) It doesn't stick to just libraries, but it already has packages on Ubuntu anyway. I think it at least cuts the GUI tools, which eliminates the xorg-dev requirement, which is probably the most onerous.

-eric
Re: [9fans] simple venti demo:
OK, there is a go version that lucho wrote: https://code.google.com/p/govt/

It's very nice code.

There will soon be a googlecode repo (lucho is setting it up now) with a non-plan9-ports version (vtmm). Find it in googlecode at libvt. It's also quite nice and much more capable than what I posted yesterday.

We now have 3 very simple implementations of venti. I hope people look at this stuff. I think it makes the concepts of venti much more accessible.

Note a difference between lucho and me: I ignore vtsync (I always sync on writes) and he properly pays attention to it. Question for the student: which one is better? Why?

Could we make little venti files and finally try to build an SCM using these files?

Have fun!

ron

p.s. while you can't run this on plan9 for anything big (you need to be able to have processes that can get bigger than 4G) you will be able to run it in small scale on Plan 9 and bigger on nix. You'll have to remove the use of mmap and replace the msync with writes to a file, but that's pretty trivial. A good project for someone.
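The mmap-to-write substitution mentioned in the p.s. can be illustrated with a short Python sketch. It is not taken from vtmm or govt; the helper names `make_arena`, `put_mmap`, and `put_write` are hypothetical, and the arena here is just a preallocated temporary file. On Unix, Python's `mmap.flush()` calls `msync`, and `os.fsync` is the closest stand-in on the plain-write path.

```python
import mmap
import os
import tempfile


def make_arena(size):
    """Create a zero-filled temporary file to stand in for an arena."""
    fd, path = tempfile.mkstemp()
    os.ftruncate(fd, size)
    os.close(fd)
    return path


def put_mmap(path, offset, data):
    """Store data through a shared mapping, then msync it."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset:offset + len(data)] = data
            m.flush()  # msync on Unix


def put_write(path, offset, data):
    """Store the same data with seek + write, then fsync the descriptor."""
    fd = os.open(path, os.O_WRONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        os.write(fd, data)
        os.fsync(fd)
    finally:
        os.close(fd)
```

Both helpers leave the same bytes in the file; the difference is that the write path never needs the whole arena mapped into the process, which is the constraint described for Plan 9.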
Re: [9fans] simple venti demo:
On Thu, Aug 11, 2011 at 9:15 AM, ron minnich rminn...@gmail.com wrote:
> OK, there is a go version that lucho wrote: https://code.google.com/p/govt/

Hooray for government! Oh, wait...
Re: [9fans] simple venti demo:
On Thu, Aug 11, 2011 at 9:49 AM, erik quanstrom quans...@labs.coraid.com wrote:
>> Note a difference between lucho and me: I ignore vtsync (I always sync on writes) and he properly pays attention to it. Question for the student: which one is better? Why?
>
> question cannot be answered due to insufficient information about what better means. are you after performance or reliability?
>
> - erik

You just answered the question :-)
Re: [9fans] simple venti demo:
On Thu, Aug 11, 2011 at 9:49 AM, erik quanstrom quans...@labs.coraid.com wrote:
>> Note a difference between lucho and me: I ignore vtsync (I always sync on writes) and he properly pays attention to it. Question for the student: which one is better? Why?
>
> question cannot be answered due to insufficient information about what better means. are you after performance or reliability?

That's part of the question: which is better? -Why?- Maybe I should say 'explain your answer' :-)

ron
Re: [9fans] simple venti demo:
> a better approach might be to separate the libraries from the much larger distribution of plan 9-based commands etc, and make them available in the usual way as packages to import using the (many different) package managers on Unix-like systems.

this is what i was looking into just this morning. i wanted to package factotum - and others - individually in the hope that more programs would use it. i spent about 3 hours faffing with Debian's packaging tools, then remembered that i have work to do :-(
Re: [9fans] simple venti demo:
My version of the govt actually works sometimes.

On Thu, Aug 11, 2011 at 10:49 AM, David Leimbach leim...@gmail.com wrote:
> On Thu, Aug 11, 2011 at 9:15 AM, ron minnich rminn...@gmail.com wrote:
>> OK, there is a go version that lucho wrote: https://code.google.com/p/govt/
>
> Hooray for government! Oh, wait...
Re: [9fans] simple venti demo:
>> question cannot be answered due to insufficient information about what better means. are you after performance or reliability?
>
> That's part of the question: which is better? -Why?- Maybe I should say 'explain your answer' :-)

you have to define better first, and you have to define what you mean by flushing immediately. i see three general approaches to this problem: flush eventually, flush immediately, and flush before ack.

this is the same dilemma any non content-addressed disk has: performance vs. safety. and of course one size doesn't fit all, so there are knobs in most disks to turn off write caching.

this is a cs101 prerequisite question, is it not?

- erik
Re: [9fans] simple venti demo:
On Thu, Aug 11, 2011 at 10:28 AM, erik quanstrom quans...@labs.coraid.com wrote:
> this is the same dilemma any non content-addressed disk has: performance vs. safety. and of course one size doesn't fit all, so there are knobs in most disks to turn off write caching.

it's not as obvious a tradeoff as it seems. Anyway, I'm more interested in hearing from people who do something with the code.

ron
Re: [9fans] simple venti demo:
On Thu Aug 11 13:38:25 EDT 2011, rminn...@gmail.com wrote:
> On Thu, Aug 11, 2011 at 10:28 AM, erik quanstrom quans...@labs.coraid.com wrote:
>> this is the same dilemma any non content-addressed disk has: performance vs. safety. and of course one size doesn't fit all, so there are knobs in most disks to turn off write caching.
>
> it's not as obvious a tradeoff as it seems.

are you alluding to the fact that the client has no way of doing a synchronize cache with the venti disk?

- erik
Re: [9fans] simple venti demo:
> Isn't p9p POSIX enough?

It's a matter of laziness; I'd rather port venti to POSIX once than port p9p to many things. There are just enough platform-dependent bits in p9p to make it enough of an annoyance for me to go the POSIX route.
Re: [9fans] simple venti demo:
On Thu, 11 Aug 2011 09:15:09 PDT ron minnich rminn...@gmail.com wrote:
> Note a difference between lucho and me: I ignore vtsync (I always sync on writes) and he properly pays attention to it. Question for the student: which one is better? Why?

Pay attention to vtsync? Maybe not for your mythical multiTB ramflash, but in real life syncing on every write is expensive.

[As I see it] in a sense venti has an atomic `changeset' concept (each changeset maps to a single fingerprint). A partial changeset is not of much use.
Re: [9fans] simple venti demo:
> Pay attention to vtsync? Maybe not for your mythical multiTB ramflash, but in real life syncing on every write is expensive.

flash has noticeable latency.

> [As I see it] in a sense venti has an atomic `changeset' concept (each changeset maps to a single fingerprint). A partial changeset is not of much use.

on the other hand, not every write is a meaningful state to the client.

- erik
Re: [9fans] simple venti demo:
On Thu, Aug 11, 2011 at 12:54 PM, Bakul Shah ba...@bitblocks.com wrote:
> Pay attention to vtsync? Maybe not for your mythical multiTB ramflash, but in real life syncing on every write is expensive.

are you sure? On a multicore server, why not have a syncing task and a serving task? Since all of the arena is in ram, the syncing task will not interfere with the serving task, esp. if the sata controller and network are on different PCI busses.

I don't think the tradeoffs are obvious at all.

ron
Re: [9fans] simple venti demo:
anyway, enough discussion. hack hack is better than talk talk at some point :-) I'm about to bench lucho's server on a 32GB arena (all of which will be mmap'ed of course). ron
Re: [9fans] simple venti demo:
>> Pay attention to vtsync? Maybe not for your mythical multiTB ramflash, but in real life syncing on every write is expensive.
>
> are you sure? On a multicore server, why not have a syncing task and a serving task? Since all of the arena is in ram, the syncing task will not interfere with the serving task, esp. if the sata controller and network are on different PCI busses.
>
> I don't think the tradeoffs are obvious at all.

that doesn't sound synchronous to me. what i think of when i think of flush on write is that the i/o is done before the reply to the write. this has two implications: there's no way to do any elevatoring, and you take a full round-trip to the disk delay for each write; no amortization is possible.

i would think that the client is in the best position to tell the storage when things must be flushed. it might be best to only write when told to flush and do so in such a way that it's clear if the transaction has finished. that way, if you're really careful and flush caches down to the storage media, you can recover if things go sideways.

- erik
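To put rough numbers on the two policies being argued about, here is a toy Python model. It does no real I/O and is not based on either ron's or lucho's server; `ToyServer` and `run` are invented names. It only counts how many times the server would have to force data to disk, and what survives a crash that happens before the client's sync.

```python
class ToyServer:
    """Counts how often a server would have to force data to disk."""

    def __init__(self, sync_on_write):
        self.sync_on_write = sync_on_write
        self.unsynced = []   # written, but not yet on disk
        self.on_disk = []
        self.syncs = 0

    def _flush(self):
        if self.unsynced:
            self.on_disk.extend(self.unsynced)
            self.unsynced = []
            self.syncs += 1

    def write(self, block):
        self.unsynced.append(block)
        if self.sync_on_write:
            self._flush()    # the i/o is done before the reply to the write

    def vtsync(self):
        self._flush()        # reply only after everything is on disk

    def crash(self):
        self.unsynced = []   # whatever was not flushed is gone


def run(sync_on_write, nblocks, crash_before_sync=False):
    """Write nblocks, then sync or crash; return (syncs, blocks on disk)."""
    s = ToyServer(sync_on_write)
    for i in range(nblocks):
        s.write(b"block %d" % i)
    if crash_before_sync:
        s.crash()
    else:
        s.vtsync()
    return s.syncs, len(s.on_disk)
```

For 100 blocks the sync-on-write server pays for 100 flushes and loses nothing in a crash, while the vtsync-only server pays for one flush and loses the whole unfinished update. Whether that loss matters depends on whether a partial update is useful to the client, which is the point made earlier in the thread.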
Re: [9fans] simple venti demo:
On Thu, 11 Aug 2011 13:04:05 PDT ron minnich rminn...@gmail.com wrote:
> On Thu, Aug 11, 2011 at 12:54 PM, Bakul Shah ba...@bitblocks.com wrote:
>> Pay attention to vtsync? Maybe not for your mythical multiTB ramflash, but in real life syncing on every write is expensive.
>
> are you sure? On a multicore server, why not have a syncing task and a serving task? Since all of the arena is in ram, the syncing task will not interfere with the serving task, esp. if the sata controller and network are on different PCI busses.

Not sure we are on the same page. Possibly I missed what you are really asking! I thought you were comparing your implementation with lucho's. From a quick scan of your mmap based code it seems you do an msync on every write, which I think is excessive. I don't know under what conditions vtsync is sent, but presumably the client sends it at least at the end of an update. But that doesn't stop the server from doing opportunistic syncs in a separate thread to reduce the amount of work that remains to be done when it receives an actual vtsync from the client. But when it does receive one, it has to ensure that all the data is synced before responding.

> I don't think the tradeoffs are obvious at all.

I thought that was obvious!
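The opportunistic-sync scheme described above can be sketched as follows. This is a deterministic Python model, not real server code: the background thread is replaced by explicit calls to a hypothetical `background_sync` method so the behaviour is easy to follow, and "stable" simply means moved to a second list.

```python
class OpportunisticServer:
    """Replies to writes at once, syncs in the background, and treats
    vtsync as a barrier that must drain whatever is left."""

    def __init__(self):
        self.pending = []    # accepted writes not yet on stable storage
        self.stable = []

    def write(self, block):
        self.pending.append(block)

    def background_sync(self, limit):
        """Flush up to `limit` pending blocks, oldest first; return how many."""
        batch = self.pending[:limit]
        self.pending = self.pending[limit:]
        self.stable.extend(batch)
        return len(batch)

    def vtsync(self):
        """Flush everything still pending; return how many blocks that took."""
        return self.background_sync(len(self.pending))
```

The value returned by `vtsync` is the work the client has to wait for. The more the background pass has already flushed, the smaller that number, but the reply is still only sent once `pending` is empty.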
Re: [9fans] simple venti demo:
On Thu, Aug 11, 2011 at 1:50 PM, David Leimbach leim...@gmail.com wrote:
> Is it goinstallable? If so, I'm not sure what I'm doing wrong. I very rarely use any 3rd party Go code but my own :-).

no idea. I just hg clone'd and did a make

ron
Re: [9fans] simple venti demo:
> Is it goinstallable? If so, I'm not sure what I'm doing wrong. I very rarely use any 3rd party Go code but my own :-).

goinstall govt.googlecode.com/hg/vt/vtclnt
goinstall govt.googlecode.com/hg/vt/vtsrv

Works for me.

fhs
Re: [9fans] simple venti demo:
On Thu, Aug 11, 2011 at 2:19 PM, Fazlul Shahriar fshahr...@gmail.com wrote:
>> Is it goinstallable? If so, I'm not sure what I'm doing wrong. I very rarely use any 3rd party Go code but my own :-).
>
> goinstall govt.googlecode.com/hg/vt/vtclnt
> goinstall govt.googlecode.com/hg/vt/vtsrv
>
> Works for me.
>
> fhs

strings.SplitN is not there... I must be a release or so behind for go?
Re: [9fans] simple venti demo:
Lucho is always up to date, better do a pull for go ron
Re: [9fans] simple venti demo:
On Thu, Aug 11, 2011 at 5:42 PM, David Leimbach leim...@gmail.com wrote:
> On Thu, Aug 11, 2011 at 2:19 PM, Fazlul Shahriar fshahr...@gmail.com wrote:
>>> Is it goinstallable? If so, I'm not sure what I'm doing wrong. I very rarely use any 3rd party Go code but my own :-).
>>
>> goinstall govt.googlecode.com/hg/vt/vtclnt
>> goinstall govt.googlecode.com/hg/vt/vtsrv
>>
>> Works for me.
>>
>> fhs
>
> strings.SplitN is not there... I must be a release or so behind for go?

yes, SplitN was introduced in release r59. Latest weekly will also work.

fhs
Re: [9fans] simple venti demo:
got it... Seems to build fine now.

On Thu, Aug 11, 2011 at 2:54 PM, ron minnich rminn...@gmail.com wrote:
> Lucho is always up to date, better do a pull for go
>
> ron
Re: [9fans] simple venti demo:
How about a more complex venti that runs on a strict POSIX host? I would really prefer to run my venti on my Solaris fileserver. ZFS for this application lets me sleep better at night. I'm half-way there, but the boat takes priority this month. --lyndon
Re: [9fans] simple venti demo:
On Wed, Aug 10, 2011 at 2:36 PM, Lyndon Nerenberg (VE6BBM/VE7TFX) lyn...@orthanc.ca wrote:
> How about a more complex venti that runs on a strict POSIX host? I would really prefer to run my venti on my Solaris fileserver. ZFS for this application lets me sleep better at night.

hang in there.

ron
Re: [9fans] simple venti demo:
> hang in there.

Ach! Ye of little Faithe!

1) Write drivers for obtuse RAID controllers.
or
2) Port venti to POSIX.

Hmm ... let me think about that for a minute ... Time's up! Back to dealing with POSIX :-)

And given enough tequila, it can revert to almost pure ANSI C.

--lyndon
Re: [9fans] simple venti demo:
On Wed Aug 10 17:57:03 EDT 2011, lyn...@orthanc.ca wrote:
>> hang in there.
>
> Ach! Ye of little Faithe!
>
> 1) Write drivers for obtuse RAID controllers.
> or
> 2) Port venti to POSIX.
>
> Hmm ... let me think about that for a minute ... Time's up! Back to dealing with POSIX :-)
>
> And given enough tequila, it can revert to almost pure ANSI C.

just use aoe.

- erik
Re: [9fans] simple venti demo:
On Wed, 10 Aug 2011 14:54:05 PDT Lyndon Nerenberg (VE6BBM/VE7TFX) lyn...@orthanc.ca wrote:
>> hang in there.
>
> Ach! Ye of little Faithe!
>
> 1) Write drivers for obtuse RAID controllers.
> or
> 2) Port venti to POSIX.

Isn't p9p venti good enough?
Re: [9fans] simple venti demo:
> Isn't p9p venti good enough?

Nope. It only works where p9p works. I want code that will compile on any POSIX-compliant host.
Re: [9fans] simple venti demo:
On Wed, Aug 10, 2011 at 3:07 PM, Lyndon Nerenberg (VE6BBM/VE7TFX) lyn...@orthanc.ca wrote:
>> Isn't p9p venti good enough?
>
> Nope. It only works where p9p works. I want code that will compile on any POSIX-compliant host.

hang in there for just a bit longer. I understand what you want.

ron
Re: [9fans] simple venti demo:
Sent from my iPhone

On Aug 10, 2011, at 3:07 PM, Lyndon Nerenberg (VE6BBM/VE7TFX) lyn...@orthanc.ca wrote:
>> Isn't p9p venti good enough?
>
> Nope. It only works where p9p works. I want code that will compile on any POSIX-compliant host.

Isn't p9p POSIX enough? Confused I am, but wasn't that the point of p9p?