Re: [9fans] Changelogs Patches?
>> it's important to keep in mind that fossil is just a write buffer. it is not intended for the permanent storage of data.
>
> Sure. But it must store the data *intact* long enough for me to be able to do a snap. It has to be able to at least warn me about data corruption.

do you have any references to spontaneous data corruption happening so soon, on media that can be written elsewhere without corruption? an ibm paper (argus, for raid[56] + chksum) claimed that p(lifetime) = 10^-13. http://domino.watson.ibm.com/library/cyberdig.nsf/80741a79b3d5f4d085256b3600733b05/ca7b221ad09be77885257149004f7c53?OpenDocumentHighlight=0,RZ3652 but i didn't see any reason that this would apply to short-term storage.

> That is my *entire* point. If fossil doesn't tell you that the data in its buffer was/is corrupted -- you have no reason to roll back.

if you're that worried, you do not need to modify fossil. why don't you write an sdecc driver that takes as configuration another sd device and a block size? then you can just add ecc on the way in and check it on the way out.

- erik
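The "sdecc" layer erik suggests is straightforward to sketch. The following toy (Python, not a real Plan 9 sd driver; all names are hypothetical) keeps a per-block checksum alongside the data, adds it on write, and verifies it on read, so silent corruption is reported rather than passed through:

```python
import hashlib

class ChecksummedDev:
    """Toy sketch of the suggested sdecc idea: wrap a block device,
    store a per-block checksum on the way in, verify on the way out.
    A real driver would keep the checksums on a second sd device, as
    erik suggests; dicts stand in for both devices here."""

    def __init__(self, blocksize=8192):
        self.blocksize = blocksize
        self.blocks = {}   # block number -> data (the "data device")
        self.sums = {}     # block number -> sha1 (the "checksum device")

    def write(self, bno, data):
        assert len(data) == self.blocksize
        self.blocks[bno] = data
        self.sums[bno] = hashlib.sha1(data).digest()   # add "ecc" on the way in

    def read(self, bno):
        data = self.blocks[bno]
        if hashlib.sha1(data).digest() != self.sums[bno]:   # check on the way out
            raise IOError("block %d: checksum mismatch" % bno)
        return data

dev = ChecksummedDev()
dev.write(0, b"x" * 8192)
dev.blocks[0] = b"y" + dev.blocks[0][1:]   # simulate silent on-disk corruption
try:
    dev.read(0)
except IOError as e:
    print(e)   # the corruption is detected instead of going unnoticed
```

This gives detection only, not correction; a real ECC scheme could also repair small errors, at the cost of more redundant bytes per block.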
Re: [9fans] Changelogs Patches?
> It depends on the vdev configuration. You can do simple mirroring or you can do RAID-Z (which is more or less RAID-5 done properly).

raid5 done properly? could you back up this claim? also, with services like ec2, it's no use doing raid, since all your data could be on the same drive regardless of what they tell you.

>> does this depend on the amount of i/o one does on the data, or does zfs scrub at a minimum rate anyway? if it does, that would be expensive.
>
> You can do resilvering (fixing the data that is known to be bad) or scrubbing (verifying and fixing *all* the data). You also can configure things so that bad blocks either do or don't trigger automatic resilvering. Does this answer your question?

no. not at all. if you're serious about using ec2, one of the costs you need to control is your b/w usage. you're going to notice overly-aggressive scrubbing in your monthly bill. maybe ec2 is heads amazon wins, tails you lose?

> The scariest takeaway from the conference was: with the economy the way it is, physical on-site datacenters are becoming a luxury for all but the most wealthy companies. Thus, whether we like it or not, virtual data centers are here to stay.

if the numbers i came up with for coraid are correct, it would cost coraid about 50x more to use ec2. that is, if we can run plan 9 at all.

- erik
Re: [9fans] Changelogs Patches?
On Mon, 2009-01-26 at 08:53 -0500, erik quanstrom wrote:
>> It depends on the vdev configuration. You can do simple mirroring or you can do RAID-Z (which is more or less RAID-5 done properly).
>
> raid5 done properly? could you back up this claim?

Yes. See here for details: http://blogs.sun.com/bonwick/entry/raid_z

>>> does this depend on the amount of i/o one does on the data, or does zfs scrub at a minimum rate anyway? if it does, that would be expensive.
>>
>> You can do resilvering (fixing the data that is known to be bad) or scrubbing (verifying and fixing *all* the data). You also can configure things so that bad blocks either do or don't trigger automatic resilvering. Does this answer your question?
>
> no. not at all.

Then, please, restate it.

> if you're serious about using ec2, one of the costs you need to control is your b/w usage. you're going to notice overly-aggressive scrubbing in your monthly bill.

Only if you asked for that to happen. It's all under your control. You may decide to never ever do scrubbing.

>> The scariest takeaway from the conference was: with the economy the way it is, physical on-site datacenters are becoming a luxury for all but the most wealthy companies. Thus, whether we like it or not, virtual data centers are here to stay.
>
> if the numbers i came up with for coraid are correct, it would cost coraid about 50x more to use ec2. that is, if we can run plan 9 at all.

You may think what you want, but obviously quite a few existing small to mid-size companies disagree. Including a couple of labs with MPI apps now running on EC2. Maybe your numbers are wrong, maybe your usage patterns are different. Who knows.

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
> As for me, here's my wish list so far. It is all about fossil, since it looks like venti is quite fine (at least for my purposes) the way it is:
>
> 1. Block consistency. Yes, I know the argument here is that you can always roll back to the last known archival snapshot on venti. But the point is to know *when* to roll back. And unless fossil warns you that a block has been corrupted, you wouldn't know.

I don't understand what you mean. Do you want fossil to tell you when your disk is silently corrupting data, or something else?

> 2. Live mounting of arbitrary scores corresponding to vac VtRoots to arbitrary sub-directories in my fossil tree. After all, if I can create regular files and sub-directories via fossil's console, why can't I create pointers to existing venti file hierarchies?

The only reason this is hard is the choice of qids. You need to decide whether to reuse the qids in the archive or renumber them to avoid conflicts with existing qids. The vac format already has a way to offset the qids of whole subtrees, but then if you make the tree editable and new files are created, it gets complicated.

> 3. Not sure whether this is a fossil requirement or not, but I feel uneasy that a root score is sort of unrecoverable from the pure venti archive. It's either that I know it or I don't.

I don't understand what you mean here either. From a venti archive, you do "cat file.vac" to find the actual score.

For what it's worth, I'll be the first to admit that fossil has a ton of rough edges and things that could be done better. There were early design decisions that we didn't know the implications of until relatively late in the implementation, and I would revisit many of those if I had the luxury of doing it over. It is very much version 0. The amazing thing to me about fossil is how indestructible it is when used with venti.

While I was finishing fossil, I ran it on my laptop as my day-to-day file system, and I never lost a byte of data despite numerous bugs, because venti itself was solid, and I always did an archive to venti before trying out new code. Once you see the data in the archive tree, you can be very sure it's not going away.

> It is actually quite remarkable how similar the models of fossil/venti and Git seem to be: both build on the notion of immutable history. Both address the history by hash index. Both have a mutable area whose only purpose is to stage data for the subsequent commit to the permanent history. Etc.

I don't think it's too remarkable. Content hash addressing was in the air for the last decade or so, and there were a lot of systems built using it. The one thing it does really well is eliminate any worry about cache coherency and versioning. That makes it very attractive for any system with large amounts of data or multiple machines. Once you've gone down that route, you have to come to grips with how to implement mutability in a fundamentally immutable system, and the obvious way is with a mutable write buffer staging writes out to the immutable storage.

> Hm. There doesn't seem to be much shutdown code for fossil/venti in there. Does it mean that sync'ing venti and then just slay(1)'ing it is ok?

Yes, it is. Venti is designed to be crash-proof, as is fossil. They get the write ordering right and pick up where they left off. They are not, however, disk-corruption-proof.

Russ
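The "immutable content-addressed store plus mutable write buffer" model Russ describes can be sketched in a few lines. This toy (Python; class and method names are invented, and the real systems are vastly more elaborate) shows both halves: blocks named by the hash of their contents, and a staging area that commits to the store as a snapshot named by a single root score:

```python
import hashlib

class Venti:
    """Toy content-addressed store: blocks are immutable and named by
    the sha1 of their contents (venti calls this the score)."""
    def __init__(self):
        self.store = {}

    def put(self, data):
        score = hashlib.sha1(data).hexdigest()
        self.store[score] = data            # writing identical data twice is a no-op
        return score

    def get(self, score):
        data = self.store[score]
        # self-verifying by construction: the address *is* the checksum
        assert hashlib.sha1(data).hexdigest() == score
        return data

class WriteBuffer:
    """Mutable staging area in front of the immutable store, in the
    spirit of fossil (grossly simplified to one flat namespace)."""
    def __init__(self, venti):
        self.venti = venti
        self.dirty = {}                     # name -> data, not yet archived

    def write(self, name, data):
        self.dirty[name] = data

    def snap(self):
        # commit the mutable state as an immutable, hash-addressed tree;
        # the returned root score names the whole snapshot
        root = sorted((name, self.venti.put(data))
                      for name, data in self.dirty.items())
        return self.venti.put(repr(root).encode())

v = Venti()
fs = WriteBuffer(v)
fs.write("hello.txt", b"hello world")
score = fs.snap()
fs.write("hello.txt", b"goodbye")   # later mutation cannot touch the archive
print(score)                        # this one score is all you need to recover
```

The same shape fits Git, as the thread notes: blobs and trees are the immutable store, the index/working tree is the write buffer, and a commit hash is the root score.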
Re: [9fans] Changelogs Patches?
> Yes. See here for details: http://blogs.sun.com/bonwick/entry/raid_z

since these arguments rely heavily on the meme that software raid == bad, i have a hard time signing on. i believe i'm repeating myself by saying that, afaik, there is no such thing as pure hardware raid; that is, there is no hardware that does all of what raid level n does in hardware. even if it's an embedded processor, it's all software raid. perhaps there's an xor engine to speed things along.

the other part of the argument (the write hole) depends on two things that i don't think are universal: a) zfs' demand for transactional storage, and b) a particular raid implementation. fancy raid cards often have battery-backed ram, and thus, from the pov of the host, writes are atomic. i don't have any ndas that let me see the firmware for a variety of raid devices, but i find it hard to believe that all raid vendors rewrite the entire stripe whenever the write is smaller than the stripe size, and that all could rewrite the data before the parity.

> You may think what you want, but obviously quite a few existing small to mid-size companies disagree. Including a couple of labs with MPI apps now running on EC2.

more people use windows than use plan 9. should i therefore conclude that my use of plan 9 is illogical? http://en.wikipedia.org/wiki/Appeal_to_the_majority why do you think that mpi has anything to do with a plan 9 infrastructure?

> Maybe your numbers are wrong, maybe your usage patterns are different. Who knows.

a single cpu on ec2 costs $150/month. my 6 personal machines don't suck down that much juice. the machines i have largely cost less than $500. so that's like $14/month. that doesn't change the equation much.

- erik
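For readers following the write-hole argument: the classic raid-5 failure mode is a partial-stripe update interrupted between the data write and the parity write. A toy sketch (two data blocks plus xor parity; this simplifies away caches, controllers, and everything else the thread disputes) shows why the stale parity silently corrupts a later reconstruction:

```python
# Toy raid-5 "write hole": a stripe of two data blocks plus xor parity.
# If power fails after a data write but before the matching parity write,
# a later reconstruction from parity returns garbage with no error.
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

d0, d1 = b"AAAA", b"BBBB"
parity = xor(d0, d1)          # parity consistent with the stripe

# partial-stripe update: d0 is rewritten...
d0 = b"CCCC"
# ...and we "crash" here, before parity is updated to xor(d0, d1)

# later, the disk holding d1 dies; reconstruct it from d0 and stale parity
reconstructed = xor(d0, parity)
print(reconstructed == b"BBBB")   # False: d1 comes back silently wrong
```

Battery-backed write caches (erik's point) close the hole by making the data+parity update effectively atomic; raid-z closes it by never doing partial-stripe writes; end-to-end checksums detect the damage after the fact rather than preventing it.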
Re: [9fans] Changelogs Patches?
On Mon, 2009-01-26 at 08:22 -0800, Russ Cox wrote:
>> 1. Block consistency. Yes, I know the argument here is that you can always roll back to the last known archival snapshot on venti. But the point is to know *when* to roll back. And unless fossil warns you that a block has been corrupted, you wouldn't know.
>
> I don't understand what you mean. Do you want fossil to tell you when your disk is silently corrupting data, or something else?

Implementation-wise, I would be happy to see the same score checks that venti does implemented in fossil. Complaining like this:

	seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);

would be good enough.

>> 2. Live mounting of arbitrary scores corresponding to vac VtRoots to arbitrary sub-directories in my fossil tree. After all, if I can create regular files and sub-directories via fossil's console, why can't I create pointers to existing venti file hierarchies?
>
> The only reason this is hard is the choice of qids. You need to decide whether to reuse the qids in the archive or renumber them to avoid conflicts with existing qids. The vac format already has a way to offset the qids of whole subtrees, but then if you make the tree editable and new files are created, it gets complicated.

I see. Thanks for the explanation.

>> 3. Not sure whether this is a fossil requirement or not, but I feel uneasy that a root score is sort of unrecoverable from the pure venti archive. It's either that I know it or I don't.
>
> I don't understand what you mean here either. From a venti archive, you do "cat file.vac" to find the actual score.

As I mentioned: this one is not really a hard requirement, but rather me thinking out loud. To me it feels that venti is opaque, in the sense that if I don't know the score to give to flfmt -v, then there's no way to browse through the venti to see what could be there (unless I get physical access to arenas, I guess). Now, suppose I have a fossil buffer that I constantly snap to venti. That will build quite a lengthy chain of VtRoots. Then my fossil buffer gets totally corrupted. I no longer know what the score of the most recent snapshot was. And I don't think I know of any way to find that out.

> The amazing thing to me about fossil is how indestructible it is when used with venti.

I agree. That has been very much the case during my short evaluation of the two.

>> It is actually quite remarkable how similar the models of fossil/venti and Git seem to be: both build on the notion of immutable history. Both address the history by hash index. Both have a mutable area whose only purpose is to stage data for the subsequent commit to the permanent history. Etc.
>
> I don't think it's too remarkable. Content hash addressing was in the air for the last decade or so, and there were a lot of systems built using it. The one thing it does really well is eliminate any worry about cache coherency and versioning. That makes it very attractive for any system with large amounts of data or multiple machines. Once you've gone down that route, you have to come to grips with how to implement mutability in a fundamentally immutable system, and the obvious way is with a mutable write buffer staging writes out to the immutable storage.

All true. Yet, it is surprising how many DSCMs that were built on the idea of hash-addressable history got the implementation-of-mutability part wrong. Git is the closest one to, what I now understand, is the fossil/venti approach.

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
> Now, suppose I have a fossil buffer that I constantly snap to venti. That will build quite a lengthy chain of VtRoots. Then my fossil buffer gets totally corrupted. I no longer know what the score of the most recent snapshot was. And I don't think I know of any way to find that out.

There is a command, fossil/last, which prints the last snapped root score. I run this from cron nightly and send the resulting score to a remote machine.

If all else fails, there is a script in /sys/src/cmd/venti/words/dumpvacroots which interrogates the http server built into venti and prints all the recent root scores. I have had to use this in the past when I had a dead disk and was less careful with my scores - all was fine, but I learnt my lesson.

-Steve
Re: [9fans] Changelogs Patches?
On Jan 26, 2009, at 8:39 AM, erik quanstrom wrote:
>> This approach will work too. But it seems that asking fossil to verify a checksum when the block is about to go to venti is not that much of an overhead.
>
> if checksumming is a good idea, shouldn't it be available outside fossil?

It is available -- in venti ;-)

> perhaps the argument is that it might be more efficient to implement this inside fossil.

The argument has nothing to do with efficiency. However, the way fossil is structured -- I think you're right that it won't be able to get additional benefits from its own checksumming.

> while this might be the case, i don't see how the small overhead of an sd layer would matter when you're assuming an ec2-style service, which will have a minimum latency in the 10s of milliseconds.

Somehow you've got this strange idea that I'm engineering something for ec2-style services. I am not. EC2 was a simple example I used once. If it agitates you too much, I promise not to use it in the future ;-)

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
On Jan 26, 2009, at 9:37 AM, erik quanstrom wrote:
> the other part of the argument (the write hole) depends on two things that i don't think are universal: a) zfs' demand for transactional storage

Huh?!?

> b) a particular raid implementation. fancy raid cards

I think you missed what the I in RAID is supposed to be expanding into ;-)

> i don't have any ndas that let me see the firmware for a variety of raid devices, but i find it hard to believe that all raid vendors rewrite the entire stripe whenever the write is smaller than the stripe size, and that all could rewrite the data before the parity.

Fancy ones might try to do fancy things, but see above.

> why do you think that mpi has anything to do with a plan 9 infrastructure?

It is the other way around: the fact that Plan 9 still doesn't have anything to do with MPI keeps it away from the kind of clusters Ron used to care about (although, in reality, it is all about gcc anyway, so MPI is a lesser argument here).

>> Maybe your numbers are wrong, maybe your usage patterns are different. Who knows.
>
> a single cpu on ec2 costs $150/month.

I don't know where you got that number, but my instance on EC2 costs me about $70/month. Oh, wait! I know! It is all because Solaris is so energy efficient ;-)

> the machines i have largely cost less than $500. so that's like $14/month. that doesn't change the equation much.

I believe you are distorting my argument on purpose. So let's just drop this conversation, ok?

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
>> the other part of the argument (the write hole) depends on two things that i don't think are universal: a) zfs' demand for transactional storage
>
> Huh?!?

why else would the zfs guys be worried about a write hole for zfs? what would happen to a raid-z if a write returned as successful but were really written to the disk's cache, and, before the whole write is completed, the disk or chassis loses power? isn't that also a write hole? i suppose the answer to this problem is the checksumming. but if that is the case, what is the point of raid-z?

- erik
Re: [9fans] Changelogs Patches?
On Tue, 2009-01-20 at 16:52 -0700, andrey mirtchovski wrote:
> for my personal $0.02 i will say that this argument seems to revolve around trying to bend fossil and venti to match the functionality of zfs and the design decisions of the team that wrote it.

That is NOT the conversation I'm interested in. My main objective is to evaluate the venti/fossil approach to storage and what kind of benefits it might provide. It is inevitable that I will contrast venti/fossil with ZFS, simply because it is the background I'm coming from.

> i, frankly, think that it should be the other way around; zfs should provide the equivalent of the fossil/venti snapshot/dump functionality to its users. that, to me, would be a benefit.

Ok. It is fair to turn the tables. So now, let me ask you: what are the benefits of fossil/venti that you want to see in ZFS? So far the only real issue that you've identified is this:

||| where the second choice becomes a nuisance for me is in the
||| case where one has thousands of clones and needs to keep track
||| of thousands of names in order to ensure that when the right one
||| has finished the right clone disappears.

And I think it is a valid one. But is there anything else (except the issues that have to do with the fact that ZFS lives in UNIX where fossil/venti live in Plan 9)?

As for me, here's my wish list so far. It is all about fossil, since it looks like venti is quite fine (at least for my purposes) the way it is:

1. Block consistency. Yes, I know the argument here is that you can always roll back to the last known archival snapshot on venti. But the point is to know *when* to roll back. And unless fossil warns you that a block has been corrupted, you wouldn't know.

2. Live mounting of arbitrary scores corresponding to vac VtRoots to arbitrary sub-directories in my fossil tree. After all, if I can create regular files and sub-directories via fossil's console, why can't I create pointers to existing venti file hierarchies?

3. Not sure whether this is a fossil requirement or not, but I feel uneasy that a root score is sort of unrecoverable from the pure venti archive. It's either that I know it or I don't.

> all these filesystem/snapshot/clone games are just a bunch of toys to make the admins happy and have little effective use for the end user.

I disagree. Remember that this whole conversation started from a simple premise that a good archival system could be an efficient replacement for the SCM. If your end users are software developers -- that IS very relevant to them.

It is actually quite remarkable how similar the models of fossil/venti and Git seem to be: both build on the notion of immutable history. Both address the history by hash index. Both have a mutable area whose only purpose is to stage data for the subsequent commit to the permanent history. Etc.

>> I see what you mean, but in case of venti -- nothing disappears, really. From that perspective you can sort of make those zfs clones linger. The storage consumption won't be any different, right?
>
> the storage consumption should be the same, i presume. my problem is that in the case of zfs, having several hundred snapshots significantly degrades the performance of the management tools, to the extent that zfs list takes 30 seconds with about a thousand entries.

Really?!?

> compared to fossil handling 5 years worth of daily dumps in less than a second. but that's not really a serious argument ;)

And what's the output of

	term% ls -d path-to-your-fossil/archive/*/*/* | wc -l

Great! I tried to do as much homework as possible (hence the delay), but I still have some questions left:

0. A dumb one: what's the proper way of cleanly shutting down fossil and venti?

> see fshalt.

Hm. There doesn't seem to be much shutdown code for fossil/venti in there. Does it mean that sync'ing venti and then just slay(1)'ing it is ok?

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
On Fri, 2009-01-23 at 22:36 -0500, erik quanstrom wrote:
>> You never know when end-to-end data consistency will start to really matter. Just the other day I attended the cloud conference where some Amazon EC2 customers were swapping stories of Amazon's networking stack malfunctioning and silently corrupting data that was written onto EBS. All of a sudden, something like ZFS started to sound like a really good idea to them.
>
> i know we need to bow down before zfs's greatness, but i still have some questions. ☺

Oh, come on! I said something *like* ZFS ;-) These guys are on Linux, for crying out loud! They need to be saved one way or the other (and Solaris at least has *some* AMIs available on EC2).

> does ec2 corrupt all one's data en masse?

From what I understood -- it was NOT en masse. But the scary thing is that they only noticed because of dumb luck (the app coredumped because the input it was getting was not properly formatted or something).

> how do you do meaningful redundancy in a cloud where one controls none of the failure-prone pieces?

Well, that's the very point I'm trying to make: you have to be at least notified that your data got corrupted. Once you do get notified -- you can recover in a variety of different ways: starting from simply re-uploading/re-generating your data, all the way to the RAID-like things.

> finally, if p is the probability of a lost block, when does p become too large for zfs' redundancy to overcome failures?

It depends on the vdev configuration. You can do simple mirroring or you can do RAID-Z (which is more or less RAID-5 done properly).

> does this depend on the amount of i/o one does on the data, or does zfs scrub at a minimum rate anyway? if it does, that would be expensive.

You can do resilvering (fixing the data that is known to be bad) or scrubbing (verifying and fixing *all* the data). You also can configure things so that bad blocks either do or don't trigger automatic resilvering. Does this answer your question?

> maybe ec2 is heads amazon wins, tails you lose?

The scariest takeaway from the conference was: with the economy the way it is, physical on-site datacenters are becoming a luxury for all but the most wealthy companies. Thus, whether we like it or not, virtual data centers are here to stay.

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
On Tue, 2009-01-20 at 21:02 -0500, erik quanstrom wrote:
>> In such a setup a corrupted block from a fossil partition will go undetected and could end up being stored in venti. At that point it will become venti's problem.
>
> it's important to keep in mind that fossil is just a write buffer. it is not intended for the permanent storage of data.

Sure. But it must store the data *intact* long enough for me to be able to do a snap. It has to be able to at least warn me about data corruption.

> while corrupt data could end up in venti, the exposure lies only between snapshots. you can roll back to the previous good score and continue.

That is my *entire* point. If fossil doesn't tell you that the data in its buffer was/is corrupted -- you have no reason to roll back.

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
On Wed, 2009-01-21 at 20:02 +0100, Uriel wrote:
> On Wed, Jan 21, 2009 at 2:43 AM, Roman V. Shaposhnik r...@sun.com wrote:
>> Sure, but I can't really use venti without using fossil (again: we are talking about a typical setup here, not something like vac/vacfs), can I? If I can NOT, then fossil becomes a weak link that can let corrupted data go undetected all the way to a venti store.
>
> Fossil has always been a weak link, and probably will always be until somebody replaces it. There was some idea of replacing it with a version of ken's fs that uses a venti backend... Venti's rock solid design is the only thing that makes fossil minimally tolerable despite its usual tendency of stepping on its hair and falling on its face.

After spending some time reading the sources and grokking fossil, I don't think it is a walking disaster. Far from it. There are a couple of places where things can be improved, to make *me* happier (YMMV), and I'll try to focus on these in replying to Andrei's email. Just to get some closure on this discussion.

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
> After spending some time reading the sources and grokking fossil, I don't think it is a walking disaster. Far from it. There are a couple of places where things can be improved, to make *me* happier (YMMV), and I'll try to focus on these in replying to Andrei's email. Just to get some closure on this discussion.

it's important to note, though, that fossil is a write buffer and not a proper cache. i believe this fact is the main source of legitimate gripes with fossil. the other source of trouble is that both fossil and venti have at times suffered from being quite unfriendly when shut down unexpectedly. since they run on cpu servers, and since there is a temptation to have an all-in-wonder cpu server, unexpected shutdowns can be more common than one would like.

- erik
Re: [9fans] Changelogs Patches?
> You never know when end-to-end data consistency will start to really matter. Just the other day I attended the cloud conference where some Amazon EC2 customers were swapping stories of Amazon's networking stack malfunctioning and silently corrupting data that was written onto EBS. All of a sudden, something like ZFS started to sound like a really good idea to them.

i know we need to bow down before zfs's greatness, but i still have some questions. ☺

does ec2 corrupt all one's data en masse?

how do you do meaningful redundancy in a cloud where one controls none of the failure-prone pieces?

finally, if p is the probability of a lost block, when does p become too large for zfs' redundancy to overcome failures? does this depend on the amount of i/o one does on the data, or does zfs scrub at a minimum rate anyway? if it does, that would be expensive.

maybe ec2 is heads amazon wins, tails you lose?

- erik
Re: [9fans] Changelogs Patches?
On Wed Jan 21 01:40:13 EST 2009, st...@quintile.net wrote:
>> ... fossil does have the functionality to serve two different file systems from two different disks, but i don't think anyone has used that ...
>
> I do this, 'main' backed up by venti and 'other' which holds useful stuff that needn't be backed up, e.g. RFCs, cdrom images, datasheets etc. This is accessed via 9fs juke as an homage to the CDROM jukebox that once provided a similar filesystem at the labs.

actually, it was an hp jukebox that had mo disks. alliance (née plasmon) makes 60gb udo2 drives http://www.plasmon.com/archive_solutions/udodrives.html and these libraries http://www.plasmon.com/archive_solutions/glibrary.html. the media are supposedly good for 50 years.

www.quanstro.net/plan9/disklessfs.pdf describes coraid's worm-replacement strategy. it is both better (offsite, very fast access) and not better (the media are less reliable and not write-once). it would be neat to have a filesystem built as

	filsys main cpe2.0kcache{e2.1jw0w1}

all the speed of disks and a permanent record, but clearly not very cost effective. and direct-attach storage doesn't seem like the right place for the worm. it should be offsite.

- erik
Re: [9fans] Changelogs Patches?
On Wed, Jan 21, 2009 at 2:43 AM, Roman V. Shaposhnik r...@sun.com wrote:
>> I was specifically referring to normal operations, to conjure an image of a typical setup of fossil+venti. In such a setup a corrupted block from a fossil partition will go undetected and could end up being stored in venti. At that point it will become venti's problem.

> i should have been more clear that venti does the checking. there are many things that fossil doesn't do that it should.

>> Sure, but I can't really use venti without using fossil (again: we are talking about a typical setup here, not something like vac/vacfs), can I? If I can NOT, then fossil becomes a weak link that can let corrupted data go undetected all the way to a venti store.

> Fossil has always been a weak link, and probably will always be until somebody replaces it. There was some idea of replacing it with a version of ken's fs that uses a venti backend... Venti's rock solid design is the only thing that makes fossil minimally tolerable despite its usual tendency of stepping on its hair and falling on its face.
>
> uriel

This is quite worrisome for me. At least compared to ZFS it is.

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
> Fossil has always been a weak link, and probably will always be until somebody replaces it. There was some idea of replacing it with a version of ken's fs that uses a venti backend...

i looked into how that would go, enough to see that venti would work at cross purposes to the fs. having a w address doesn't make much sense when you can address by content. in hindsight, that was likely obvious to everyone but me. i think ken's fs makes perfect sense without venti. it has reasonable device support these days (aoe, ata, ahci, marvell 88sx).

- erik
Re: [9fans] Changelogs Patches?
>> in the case of zfs, my claim is that since zfs can reuse blocks, two vdev backups, each with corruption or missing data in different places, are pretty well useless.
>
> Got it. However, I'm still not fully convinced there's a definite edge one way or the other. Don't get me wrong: I'm not trying to defend ZFS (I don't think it needs defending, anyway), but rather I'm trying to test my mental model of how both work.

if you end up rewriting a free block in zfs, there sure is. you can't decide which one is correct.

> P.S. Oh, and in case of ZFS a damaged vdev will be detected (and possibly resilvered) under normal working conditions, while fossil might not even notice a corruption.

not true. one of many score checks:

	srv/lump.c:103: seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);

- erik
Re: [9fans] Changelogs Patches?
> 1. What's the use of copying arenas to CD/DVD? Is it purely backup, since they have to stay on-line forever?

backup.

> 2. Would fossil/venti notice silent data corruptions in blocks?

venti would. the score wouldn't match the block.

> 3. Do you think it's a good idea to have volume management be part of the filesystem, since that way you can try to heal the data on-the-fly?

i think they are separate questions. i see a couple of strong disadvantages to combining volume management with the fs:
- it's hard to reason about; zfs redundancy strategies seem idiosyncratic.
- you need a different volume management solution for non-zfs needs.
- to manage the storage you need to be a zfs expert. conversely, to manage zfs you need to be a storage expert.
- raid5 is very slow if you move the raid computation away from the data, as you need to move the data to the computation.

> 4. If I have a venti server and a bunch of sha1 codes, can I somehow instantiate a single fossil serving all of them under /archive?

i don't understand the question.

- erik
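The answer to question 2 (venti would notice, because the score wouldn't match the block) rests on the fact that in a content-addressed store the address *is* the checksum. A minimal sketch (Python; function names are invented, and real venti does much more, e.g. compression and arena indexing):

```python
import hashlib

# Toy illustration of venti's read-side integrity check: a block's
# address (score) is the sha1 of its contents, so silent corruption
# is detectable by recomputing the hash on every read.
store = {}

def venti_write(block):
    score = hashlib.sha1(block).digest()
    store[score] = block
    return score

def venti_read(score):
    block = store[score]
    if hashlib.sha1(block).digest() != score:
        raise IOError("bad score: block corrupted")   # score doesn't match block
    return block

s = venti_write(b"precious data")
store[s] = b"precious d4ta"        # simulate silent media corruption
try:
    venti_read(s)
except IOError as e:
    print(e)
```

Note this gives detection, not repair: to actually heal the block you still need a good copy somewhere else, which is where the volume-management debate in question 3 comes back in.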
Re: [9fans] Changelogs Patches?
On Tue, 2009-01-20 at 09:19 -0500, erik quanstrom wrote: in the case of zfs, my claim is that since zfs can reuse blocks, two vdev backups, each with corruption or missing data in different places are pretty well useless. Got it. However, I'm still not fully convinced there's a definite edge one way or the other. Don't get me wrong: I'm not trying to defend ZFS (I don't think it needs defending, anyway) but rather I'm trying to test my mental model of how both work. if you end up rewriting a free block in zfs, there sure is. you can't decide which one is correct. You don't have to decide. You get use generation # for that. P.S. Oh, and in case of ZFS a damaged vdev will be detected (and possibly re-silvered) under normal working conditions, while fossil might not even notice a corruption. not true. one of many score checks: srv/lump.c:103: seterr(EStrange, lookuplump returned bad score %V not %V, u-score, score); I don't buy this argument for a simple reason: here's a very easy example that proves my point: term% fossil/fossil -f /tmp/fossil.bin fsys: dialing venti at net!$venti!venti warning: connecting to venti: Connection refused term% mount /srv/fossil /n/f term% cd /n/f/test term% echo 'this is innocent text' text.txt term% cat text.txt this is innocent text term% dd -if /dev/cons -of /tmp/fossil.bin -bs 1 -count 8 -oseek 278528 -trunc 0 this WAS 8+0 records in 8+0 records out term% rm /srv/fossil /srv/fscons term% fossil/fossil -f /tmp/fossil.bin fsys: dialing venti at net!$venti!venti warning: connecting to venti: Connection refused create /active/adm: file already exists create /active/adm adm sys d775: create /active/adm: file already exists create /active/adm/users: file already exists create /active/adm/users adm sys 664: create /active/adm/users: file already exists nuser 5 len 84 term% mount /srv/fossil /n/f2 term% cat /n/f2/test/text.txt this WAS innocent text term% Of course, with ZFS, the above corruption would be always noticed and sometimes 
(depending on your vdev setup) even silently fixed. Thanks, Roman.
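The session above can be boiled down to a toy model: fossil's buffer serves back whatever bytes are on disk, while a store that files each block under its checksum (as venti and ZFS do) can refuse to serve the corrupted read. This is a minimal sketch; the class names are invented for illustration and are not real fossil or venti interfaces.

```python
import hashlib

class PlainStore:
    """Serves whatever is on disk; no integrity check (fossil-buffer-like)."""
    def __init__(self):
        self.disk = {}
    def write(self, addr, block):
        self.disk[addr] = bytearray(block)
    def read(self, addr):
        return bytes(self.disk[addr])

class ChecksummedStore:
    """Files each block under its SHA-1 score and re-verifies on read."""
    def __init__(self):
        self.disk = {}
    def write(self, block):
        score = hashlib.sha1(block).digest()
        self.disk[score] = bytearray(block)
        return score
    def read(self, score):
        block = bytes(self.disk[score])
        if hashlib.sha1(block).digest() != score:
            raise IOError("corrupt block: score mismatch")
        return block

plain = PlainStore()
plain.write(0, b"this is innocent text")
plain.disk[0][0:8] = b"this WAS"          # the dd overwrite from the session
assert plain.read(0).startswith(b"this WAS")   # served without complaint

cs = ChecksummedStore()
score = cs.write(b"this is innocent text")
cs.disk[score][0:8] = b"this WAS"         # same corruption
try:
    cs.read(score)
    detected = False
except IOError:
    detected = True
assert detected                            # the flip cannot go unnoticed
```

The point of the sketch is only the read path: identical corruption, but only the checksummed store can tell the caller about it.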
Re: [9fans] Changelogs Patches?
Got it. However, I'm still not fully convinced there's a definite edge one way or the other. Don't get me wrong: I'm not trying to defend ZFS (I don't think it needs defending, anyway) but rather I'm trying to test my mental model of how both work. if you end up rewriting a free block in zfs, there sure is. you can't decide which one is correct. You don't have to decide. You can use the generation # for that. what generation number? are there other things that your argument depends on that you haven't mentioned yet? not true. one of many score checks: srv/lump.c:103: seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score); I don't buy this argument for a simple reason: here's a very easy example that proves my point: term% fossil/fossil -f /tmp/fossil.bin fsys: dialing venti at net!$venti!venti warning: connecting to venti: Connection refused well, there's your problem. you corrupted the cache, not the venti store. (you have no venti store in this example.) i should have been more clear that venti does the checking. there are many things that fossil doesn't do that it should. - erik
Re: [9fans] Changelogs Patches?
Is it how it was from the get go, or did you use venti-based solutions before? it's how i found it. i have two zfs servers and about 10 pools of different sizes with several hundred different zfs filesystems and volumes of raw disk exported via iscsi. What kind of clients are on the other side of iscsi? linux machines. You're using it on Linux? the zfs servers are OpenSolaris boxes. Aha! And here are my first questions: you say that I can run multiple fossils off of the same venti and thus have a setup that is very close to zfs clones: 1. how do you do that exactly? fossil -f doesn't work for me (nor should it according to the docs) i meant formatting the fossil disk with flfmt -v, sorry. it had been quite a while since i last had to restart from an old venti score :) 2. how do you work around the fact that each fossil needs its own partition (unlike ZFS where all the clones can share the same pool of blocks)? ultimately all blocks are shared on the same venti server unless you use separate ones. fossil does have the functionality to serve two different file systems from two different disks, but i don't think anyone has used that (but see example at the end). I think I understand it now (except for the fossil -f part), but how do you promote (zfs promote) such a clone? i'm unconvinced that 'promoting' is a genuine feature: it seems to me that the designers had to invent 'promoting' because they made the decision to make snapshots read-only in the first place. perhaps i'm wrong, but if the purpose of promoting something is to make it a true member of the filesystem community (with all capabilities that entails), then the corresponding feature in fossil would be to instantiate one from the particular venti score for the dump. i.e., flfmt -v. I see what you mean, but in case of venti -- nothing disappears, really. From that perspective you can sort of make those zfs clones linger. The storage consumption won't be any different, right? 
the storage consumption should be the same, i presume. my problem is that in the case of zfs having several hundred snapshots significantly degrades the performance of the management tools to the extent that zfs list takes 30 seconds with about a thousand entries, compared to fossil handling 5 years worth of daily dumps in less than a second. but that's not really a serious argument ;) Great! I tried to do as much homework as possible (hence the delay) but I still have some questions left: 0. A dumb one: what's the proper way of cleanly shutting down fossil and venti? see fshalt. it used to be, like most other things, that one could just turn the machine off without worry. then some bad things happened and fshalt was written. 1. What's the use of copying arenas to CD/DVD? Is it purely back up, since they have to stay on-line forever? people who back up to cd/dvd can answer that :) 3. Do you think it's a good idea to have volume management be part of filesystems, since that way you can try to heal the data on-the-fly? i don't know... 4. If I have a venti server and a bunch of sha1 codes, can I somehow instantiate a single fossil serving all of them under /archive? not sure if this will work. you'll need as many partitions as the sha1 scores you have. then for each do fossil/flfmt -v score partition. once you've started fossil, on the console type, for each partition/score:
fsys somename config partition
fsys somename venti ventiserver
fsys somename open
it's convoluted, yes. there may be an easier way. i know of people using vacfs and vac to backup their linux machines to venti. actions like the ones you're describing would be much easier there, although i am not sure vacfs has all the functionality to be a usable file system (for example, it's read-only). for my personal $0.02 i will say that this argument seems to revolve around trying to bend fossil and venti to match the functionality of zfs and the design decisions of the team that wrote it. 
i, frankly, think that it should be the other way around; zfs should provide the equivalent of the fossil/venti snapshot/dump functionality to its users. that, to me, would be a benefit (of course it gets you sued by netapp too, but that's beside the point). all these filesystem/snapshot/clone games are just a bunch of toys to make the admins happy and have little effective use for the end user.
Re: [9fans] Changelogs Patches?
On Tue, 2009-01-20 at 18:36 -0500, erik quanstrom wrote: Got it. However, I'm still not fully convinced there's a definite edge one way or the other. Don't get me wrong: I'm not trying to defend ZFS (I don't think it needs defending, anyway) but rather I'm trying to test my mental model of how both work. if you end up rewriting a free block in zfs, there sure is. you can't decide which one is correct. You don't have to decide. You can use the generation # for that. what generation number? I'm talking about a field in each ZFS block pointer. The field is actually called birth txg, but I thought alluding to VtEntry.gen would make it easier to understand what I had in mind. are there other things that your argument depends on that you haven't mentioned yet? Fair question. It depends on at least a cursory reading of the ZFS on-disk specification. I felt uneasy in this conversation precisely because I had a very vague recollection of the Venti/Fossil paper. I guess it cuts both ways: http://opensolaris.org/os/community/zfs/docs/ondiskformat0822.pdf term% fossil/fossil -f /tmp/fossil.bin fsys: dialing venti at net!$venti!venti warning: connecting to venti: Connection refused well, there's your problem. you corrupted the cache, not the venti store. (you have no venti store in this example.) I was specifically referring to normal operation to conjure an image of a typical setup of fossil+venti. In such a setup a corrupted block from a fossil partition will go undetected and could end up being stored in venti. At that point it will become a venti problem. i should have been more clear that venti does the checking. there are many things that fossil doesn't do that it should. Sure, but I can't really use venti without using fossil (again: we are talking about a typical setup here, not something like vac/vacfs), can I? If I cannot, then fossil becomes a weak link that can let corrupted data go undetected all the way to a venti store. This is quite worrisome for me. 
At least compared to ZFS it is. Thanks, Roman.
Re: [9fans] Changelogs Patches?
well, there's your problem. you corrupted the cache, not the venti store. (you have no venti store in this example.) I was specifically referring to normal operation to conjure an image of a typical setup of fossil+venti. In such a setup a corrupted block from a fossil partition will go undetected and could end up being stored in venti. At that point it will become a venti problem. it's important to keep in mind that fossil is just a write buffer. it is not intended for the permanent storage of data. while corrupt data could end up in venti, the exposure lies only between snapshots. you can rollback to the previous good score and continue. ken's fs has a proper cache. a corrupt cache can be recovered from by dumping the cache and restarting from the last good superblock. in the days when the fs was really a worm stored on mo disks, the worm was said to be very reliable storage. with raid+scrubbing we try to overcome the limitations of magnetic media. while there isn't any block checksum, there is a block tag. tag checking has spotted a few instances of corruption on my fs. fs-level checksumming and encryption is definitely something i've considered. actually, with tags and encryption, checksumming is not necessary for error detection. - erik
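The block tag mentioned above can be sketched as a small label (block type plus the owning file's path number) stored with each block and compared on every read. A mismatch catches misdirected or cross-linked blocks, though, unlike a checksum, it cannot catch bit flips inside the data itself. The names here (Tag, CacheStore) are invented for illustration and do not match the real fossil or ken's fs structures.

```python
from collections import namedtuple

Tag = namedtuple("Tag", "btype path")   # e.g. ("data", qid path of the file)

class CacheStore:
    def __init__(self):
        self.blocks = {}                # addr -> (tag, data)
    def write(self, addr, tag, data):
        self.blocks[addr] = (tag, data)
    def read(self, addr, expect):
        tag, data = self.blocks[addr]
        if tag != expect:               # the tag check
            raise IOError("bad tag at %d: %r, expected %r" % (addr, tag, expect))
        return data

c = CacheStore()
c.write(7, Tag("data", 0x1234), b"file contents")
assert c.read(7, Tag("data", 0x1234)) == b"file contents"

# a write meant for another file lands on block 7 by mistake:
c.write(7, Tag("data", 0x9999), b"someone else's contents")
try:
    c.read(7, Tag("data", 0x1234))
    caught = False
except IOError:
    caught = True
assert caught                           # the misdirected write is detected
```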
Re: [9fans] Changelogs Patches?
... fossil does have the functionality to serve two different file systems from two different disks, but i don't think anyone has used that ... I do this, 'main' backed up by venti and 'other' which holds useful stuff that needn't be backed up, e.g. RFCs, cdrom images, datasheets etc. This is accessed via 9fs juke as an homage to the CDROM jukebox that once provided a similar filesystem at the labs. -Steve
Re: [9fans] Changelogs Patches?
I think I'm now ready to pick up this old thread (if anybody's still interested...) On Jan 7, 2009, at 5:11 PM, erik quanstrom wrote: Let's see. Maybe it's my misinterpretation of what venti does. But so far I understand that it boils down to: I give venti a block of any length, it gives me a score back. Now internally, venti might decide just a clarification. this is done by the client. from venti(6): Files and Directories Venti accepts blocks up to 56 kilobytes in size. By convention, Venti clients use hash trees of blocks to represent arbitrary-size data files. [...] Right. This, by the way, suggests that the onus is on the clients to help venti reuse as many blocks as possible. Have there been any established practices of finding the best cut-here points? But even in the former case I don't see how the corruption could be possible. Please elaborate. i didn't say there would be corruption. i assumed corruption and outlined how one could recover the maximal set of data and have a consistent fs (assuming the damage doesn't cut a full strip across all backups) by simply picking a good block at each lba from the available damaged and/or incomplete backups, which may originate at different times. (russ was the first that i know of to put this into practice.) in the case of zfs, my claim is that since zfs can reuse blocks, two vdev backups, each with corruption or missing data in different places are pretty well useless. Got it. However, I'm still not fully convinced there's a definite edge one way or the other. Don't get me wrong: I'm not trying to defend ZFS (I don't think it needs defending, anyway) but rather I'm trying to test my mental model of how both work. We assume a damaged set of arenas for venti and a damaged set of vdevs for ZFS. Everything is off-line at that point and we are running strictly in forensics mode. The show, basically, consists of three acts: 1. salvaging as many good data blocks as possible 2. 
building higher-order structures out of primary data blocks 3. trying to rebuild as much of a consistent FS as possible using all the available blocks It seems to me that #1 and #2 are 100% the same in terms of the probability of success. In fact, one might claim that ZFS has a slight edge because of: a. volume management being part of the FS b. the ditto blocks IOW every block pointer having up to 3 alternative locations for the block it points to The net result is that you might end up with more good blocks to choose from in ZFS world, than in venti's case. Which brings us to #3. Once again, we might have more blocks to choose from than we want (including free blocks) but the generation number should be enough of a clue to filter unwanted things out. Thanks, Roman. P.S. Oh, and in case of ZFS a damaged vdev will be detected (and possibly re-silvered) under normal working conditions, while fossil might not even notice a corruption.
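Act #3 above can be sketched as a merge over several damaged backups of the same address space: at each address keep a block whose stored checksum still verifies, preferring the highest birth generation when more than one copy verifies. The (gen, checksum, data) record shape is an assumption for illustration, not the real ZFS or venti on-disk layout.

```python
import hashlib

def good(rec):
    """A record is usable if its data is present and its checksum verifies."""
    gen, cksum, data = rec
    return data is not None and hashlib.sha1(data).digest() == cksum

def salvage(backups):
    """Merge backups (each: addr -> (gen, cksum, data)) into one best copy."""
    merged = {}
    for backup in backups:
        for addr, rec in backup.items():
            if not good(rec):
                continue                    # skip corrupt or missing blocks
            if addr not in merged or rec[0] > merged[addr][0]:
                merged[addr] = rec          # newest verifying copy wins
    return merged

def rec(gen, data):
    return (gen, hashlib.sha1(data).digest(), data)

old = {0: rec(1, b"A0"), 1: rec(1, b"B0"), 2: rec(1, b"C0")}
new = {0: rec(2, b"A1"), 1: (2, b"not-a-real-checksum!", b"B1"), 2: rec(2, b"C1")}

m = salvage([old, new])
assert m[0][2] == b"A1"   # newer copy verifies: take it
assert m[1][2] == b"B0"   # newer copy corrupt: fall back to the old one
assert m[2][2] == b"C1"
```

This is exactly where the generation number earns its keep: without it, two verifying copies of the same address would be indistinguishable.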
Re: [9fans] Changelogs Patches?
On Tue, 2009-01-06 at 18:44 -0500, erik quanstrom wrote: a big difference between the decisions is in data integrity. it's much easier to break a fs that rewrites than it is a worm-based fs. True. But there's a grey area here: an FS that *never* rewrites live blocks, but can reclaim dead ones. That's essentially what ZFS does. unfortunately, i would think that can result in data loss since i can no longer take a set of copies of the fs {fs_0, ... fs_n} and create a new copy with all the data possibly recovered by picking a set of good blocks from the fs_i, since i can make a block dead by removing the file using it and i can make it live again by writing a new file. perhaps i've misinterpreted what you are saying? Let's see. Maybe it's my misinterpretation of what venti does. But so far I understand that it boils down to: I give venti a block of any length, it gives me a score back. Now internally, venti might decide to split that huge block into a series of smaller ones and store it as a tree. But still all I get back is a single score. I don't care whether that score really describes my raw data block, or a block full of scores that actually describe raw data. All I care is that when I give venti that score back -- it'll reconstruct the data. I also have a guarantee that the data will never ever be deleted. Now, because of that guarantee (blocks are never deleted) and since all blocks bigger than 56k get split, venti has a nice property of reusing blocks from existing trees. This happens as a by-product of the design: I ask venti to store a block and if that same block was already there -- there will be an extra arrow pointing at it. All in all -- a very compact way of representing a forest of trees. Each tree corresponds to a VtEntry data structure and blocks full of VtEntry structures are called VtEntryDir's. Finally a root VtEntryDir is pointed at by a VtRoot structure. 
Contrast this with ZFS, where blocks are *not* addressed via scores, but rather with vdev:offset pairs called DVAs. This, of course, means that there's no block coalescing going on. You ask ZFS to store a block, it gives you a DVA back. You ask it to store the same block again, you get a different DVA (well, actually it gives you a block pointer, which is a DVA augmented by extra stuff). That fundamental property of ZFS makes it impossible to have a single block implicitly referenced by multiple trees, unless the block happens to be part of an explicit snapshot of the same object at some later point in time. Thus, when there's a need to modify an existing object, ZFS never touches the old blocks. It builds a tree of blocks, *explicitly* reusing those blocks that haven't changed. When it is done building the new tree, the old one is still the active one. The last transaction that happens updates an uberblock (ZFS speak for VtRoot) in an atomic fashion, thus making the new tree the active one. The old tree is still around at that point: if it is not part of a snapshot it can be garbage collected and the blocks can be freed; if it is part of a snapshot -- it is preserved. In the latter case the behavior seems to be exactly what venti does. But even in the former case I don't see how the corruption could be possible. Please elaborate. Thanks, Roman.
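The addressing difference described above is easy to demonstrate in miniature: a venti-like store returns the same score for the same bytes, so duplicates coalesce for free, while a DVA-like store hands back a fresh location on every write. These are toy classes, not the real venti or ZFS interfaces.

```python
import hashlib

class ScoreStore:
    """Content-addressed, venti-style: the address *is* the data's hash."""
    def __init__(self):
        self.blocks = {}
    def put(self, data):
        score = hashlib.sha1(data).digest()
        self.blocks[score] = data        # same data -> same key, one copy
        return score

class DvaStore:
    """Location-addressed, ZFS-style: never rewrites, new offset each time."""
    def __init__(self):
        self.blocks = []
    def put(self, data):
        self.blocks.append(data)
        return len(self.blocks) - 1      # "DVA": an offset into the vdev

v = ScoreStore()
s1, s2 = v.put(b"same block"), v.put(b"same block")
assert s1 == s2 and len(v.blocks) == 1   # implicit sharing across trees

z = DvaStore()
d1, d2 = z.put(b"same block"), z.put(b"same block")
assert d1 != d2 and len(z.blocks) == 2   # two copies, two addresses
```

This is why, in ZFS, block reuse has to be *explicit* (snapshots, clones), while in venti it falls out of the addressing scheme.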
Re: [9fans] Changelogs Patches?
Let's see. Maybe it's my misinterpretation of what venti does. But so far I understand that it boils down to: I give venti a block of any length, it gives me a score back. Now internally, venti might decide just a clarification. this is done by the client. from venti(6): Files and Directories Venti accepts blocks up to 56 kilobytes in size. By convention, Venti clients use hash trees of blocks to represent arbitrary-size data files. [...] But even in the former case I don't see how the corruption could be possible. Please elaborate. i didn't say there would be corruption. i assumed corruption and outlined how one could recover the maximal set of data and have a consistent fs (assuming the damage doesn't cut a full strip across all backups) by simply picking a good block at each lba from the available damaged and/or incomplete backups, which may originate at different times. (russ was the first that i know of to put this into practice.) in the case of zfs, my claim is that since zfs can reuse blocks, two vdev backups, each with corruption or missing data in different places are pretty well useless. - erik
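The venti(6) convention quoted above -- the *client* splits a file into blocks of at most 56KB, stores each, then builds pointer blocks of scores until a single root score describes the whole file -- can be sketched like this. Block size and fanout are illustrative, and real clients record the tree depth in the VtEntry rather than passing it by hand.

```python
import hashlib

BLOCK = 56 * 1024
SCORE = 20                                   # bytes per sha1 score
FANOUT = BLOCK // SCORE                      # scores per pointer block

store = {}                                   # stands in for the venti server

def put(data):
    score = hashlib.sha1(data).digest()
    store[score] = data
    return score

def put_file(data):
    """Split into data blocks, then build pointer blocks up to one root."""
    scores = [put(data[i:i+BLOCK]) for i in range(0, len(data), BLOCK)]
    while len(scores) > 1:
        scores = [put(b"".join(scores[i:i+FANOUT]))
                  for i in range(0, len(scores), FANOUT)]
    return scores[0]                         # the single root score

def get_file(score, depth):
    """Walk the hash tree back down; depth 0 means a raw data block."""
    data = store[score]
    if depth == 0:
        return data
    return b"".join(get_file(data[i:i+SCORE], depth - 1)
                    for i in range(0, len(data), SCORE))

blob = b"x" * (3 * BLOCK + 123)              # spans four data blocks
root = put_file(blob)
assert get_file(root, 1) == blob
```

Because `put` is content-addressed, two files sharing a 56KB-aligned run of identical bytes automatically share those data blocks -- which is the "cut-here points" question raised above.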
Re: [9fans] Changelogs Patches?
I'm still trying to figure out what kind of approximation of the above would be possible with fossil/venti. how about making a copy? venti will coalesce duplicate blocks. But wouldn't you still need to send these blocks over the wire (thus consuming bandwidth and time)? key word approximation. ☺ assuming that not all of your tree is in cache, moving the blocks over the wire would be much faster than the disk access. assuming just gbe, you should be able to copy 50mb/s out of and back into the same venti server. how big are your snapshots that this would be a problem? i don't know enough about fossil's structure, but i think you could write a specialized. - erik
Re: [9fans] Changelogs Patches?
i'm using zfs right now for a project storing a few terabytes worth of data and vm images. i have two zfs servers and about 10 pools of different sizes with several hundred different zfs filesystems and volumes of raw disk exported via iscsi. clones play a vital part in the whole set up (they number in the thousands). for what it's worth, zfs is the best thing in linux-world (sorry, solaris and *bsd too) for that kind of task. my comment is that, coming from fossil/venti, zfs feels just a bit more convoluted and there are more special cases that seem like a design mishap at least when compared to what i'm used to in the world i'm coming from. i'll try to explain: Fair enough. But YourTextGoesHere then becomes a transient property of my namespace, where in case of ZFS it is truly a tag for a snapshot. all snapshots have tags: their top-level sha1 score. what i supplied was simply a way to translate that to any random text. you don't need to, nor do you have to do this (by the way, do you get the irony of forcing snapshots to contain the '@' character in their name? sounds a lot like '#' to me ;) snapshots are generally accessible via fossil as a directory with the date of the snapshot as its name. this starts making more sense when you take into consideration that snapshots are global per fossil, but then you can run several fossils without having them step on their toes when it comes to venti. at least until you get a collision in blocks' hashes. in fact, i'm so used to fossil's dated snapshots that in my setup i have restricted 'YourTextGoesHere' to actually be a date. that gives me so much more context in the case where something goes wrong and i have to go back through the snapshots for a filesystem or a volume to find the last known good one. Well, strictly speaking Solaris does have a reasonable approximation of bind in a form of lofs -- so remapping default ZFS mount point to something else is not a big deal. 
did not know that $ zfs clone pool/projects/f...@yourtextgoeshere pool/projects/branch that's as simple as starting a new fossil with -f 'somehex', where somehex is the score of the corresponding snap. this gives you both read-only snapshots, Meaning? venti is write-once. if you instantiate a fossil from a venti score it is, by definition, read-only, as all changes to the current fossil will not appear to another fossil instantiated from the same venti score. changes are committed to venti once you do a fossil snap, however that automatically generates a new snapshot score (not modifying the old one). it should be clear from the paper. - snapshots are read only and generally unmountable (unless you go through the effort of making them so by setting a special option, which i'm not sure is per-snapshot) Huh? That's weird -- I routinely access them via /pool/fs/.zfs/snapshot/snapshot name and I don't remember setting any kind of options. The visibility of .zfs can be tweaked, but all it really affects is Tab in bash ;-) - clones can only be created off of snapshots But that does sound reasonable. What else there is except snapshots and an active tree? Or are you objecting to the extra step that is needed where you really want to clone the active tree? i have .zfs exports turned off (it's off by default) because the read-only snapshots are useless in my environment. instead i must create clones off one or many snapshots and keep track and delete them when their tasks have been accomplished. this is an example of the design decision difference between fossil/venti and zfs: venti commits storage permanently and everything becomes a snapshot, while the designers of zfs decided to create a two-stage process introducing a read-only intermediary between the original data and a read-write access to it independent of other clients. 
where the second choice becomes a nuisance for me is in the case where one has thousands of clones and needs to keep track of thousands of names in order to ensure that when the right one has finished the right clone disappears. it's good that zfs can handle so many, otherwise it would've been useless. note that other systems take the plan9 approach to heart: qemu for example has the -snapshot argument which allows me to boot many VMs, fossil-style, off a single vm image without worrying whether they'll step on each other's toes. that way seems so much simpler and natural to me, but then i'm jaded by venti :) - clones are read-writable but they can only be mounted within the /pool/fs/branch hierarchy. if you want to share them you need to explicitly adjust a lot of zfs settings such as 'sharenfs' and so on; In general -- this is true :-( But I think there's a way now to do that. If you're really interested -- I can take a look and let you know. my problem is with the local/remote duality of exports: if i create a zfs cloned filesystem it's immediately locally available and perhaps (via 'sharenfs' inheritance from its parent)
Re: [9fans] Changelogs Patches?
very interesting post. this is an example of the design decision difference between fossil/venti and zfs: venti commits storage permanently and everything becomes a snapshot, while the designers of zfs decided to create a two-stage process introducing a read-only intermediary between the original data and a read-write access to it independent of other clients. a big difference between the decisions is in data integrity. it's much easier to break a fs that rewrites than it is a worm-based fs. even if the actual media are the same. and a broken rewriting fs is much harder to recover. russ wrote up a bit on recovering one good venti from an old copy and a damaged current venti. this same approach (basically fs | fs') works for any worm fs. from a remote node. if i create a zfs cloned volume i need to arrange an iscsi method of access from a remote node. both nfs and iscsi have a host of nasty settings that need to be correct on both ends in order for things to work right. i can never hope to export an nfs share outside my DMZ. i don't see a solution to this problem: the unix world is committed to nfs and a bit less so to iscsi. i'm more of a 9p guy myself though, so i listed it as a complaint. oh, my perfect chance to shill aoe! how to configure aoe on plan 9 echo bind /net/ether0 >/dev/aoe/ctl now for the hard part # (this space intentionally left blank.) - erik
Re: [9fans] Changelogs Patches?
On Tue, 2009-01-06 at 11:19 -0500, erik quanstrom wrote: very interesting post. indeed. I actually need some time to digest it ;-) this is an example of the design decision difference between fossil/venti and zfs: venti commits storage permanently and everything becomes a snapshot, while the designers of zfs decided to create a two-stage process introducing a read-only intermediary between the original data and a read-write access to it independent of other clients. a big difference between the decisions is in data integrity. it's much easier to break a fs that rewrites than it is a worm-based fs. True. But there's a grey area here: an FS that *never* rewrites live blocks, but can reclaim dead ones. That's essentially what ZFS does. i don't see a solution to this problem: the unix world is committed to nfs and a bit less so to iscsi. i'm more of a 9p guy myself though, so i listed it as a complaint. oh, my perfect chance to shill aoe! how to configure aoe on plan 9 echo bind /net/ether0 >/dev/aoe/ctl now for the hard part # (this space intentionally left blank.) ;-) What's your personal experience on aoe vs. iscsi? Thanks, Roman.
Re: [9fans] Changelogs Patches?
a big difference between the decisions is in data integrity. it's much easier to break a fs that rewrites than it is a worm-based fs. True. But there's a grey area here: an FS that *never* rewrites live blocks, but can reclaim dead ones. That's essentially what ZFS does. unfortunately, i would think that can result in data loss since i can no longer take a set of copies of the fs {fs_0, ... fs_n} and create a new copy with all the data possibly recovered by picking a set of good blocks from the fs_i, since i can make a block dead by removing the file using it and i can make it live again by writing a new file. perhaps i've misinterpreted what you are saying? What's your personal experience on aoe vs. iscsi? i have no iscsi experience. aoe has been pretty fun to work with. the spec can be read in half an hour. (it's maybe ten pages.) i implemented a virtual aoe target for plan 9, vblade, from scratch on a friday evening. - erik
Re: [9fans] Changelogs Patches?
On Jan 4, 2009, at 9:12 PM, erik quanstrom wrote: Well, I guess I really got spoiled by ZFS's ability to do things like $ zfs snapshot pool/projects/f...@yourtextgoeshere and especially: $ zfs clone pool/projects/f...@yourtextgoeshere pool/projects/ branch I'm still trying to figure out what kind of approximation of the above would be possible with fossil/venti. how about making a copy? venti will coalesce duplicate blocks. But wouldn't you still need to send these blocks over the wire (thus consuming bandwidth and time)? Thanks, Roman.
Re: [9fans] Changelogs Patches?
Cool! Looks like I found a bi-lingual person! ;-) Andrey, would you mind if I ask you to translate some other things between ZFS and venti/fossil for me? On Jan 4, 2009, at 9:24 PM, andrey mirtchovski wrote: Well, I guess I really got spoiled by ZFS's ability to do things like $ zfs snapshot pool/projects/f...@yourtextgoeshere at the console type snap. if you're allowing snaps to be mounted on the local fs then the equivalent would be mkdir /YourTextGoesHere; bind /n/dump/... / /YourTextGoesHere. Fair enough. But YourTextGoesHere then becomes a transient property of my namespace, where in case of ZFS it is truly a tag for a snapshot. note that zfs restricts where the snapshot can be mounted :p venti snapshots are, by default, read only. Well, strictly speaking Solaris does have a reasonable approximation of bind in a form of lofs -- so remapping default ZFS mount point to something else is not a big deal. $ zfs clone pool/projects/f...@yourtextgoeshere pool/projects/branch that's as simple as starting a new fossil with -f 'somehex', where somehex is the score of the corresponding snap. this gives you both read-only snapshots, Meaning? and as many clones as you wish. Cool! note that you're cheating here, and by quite a bit: Lets see about that ;-) - snapshots are read only and generally unmountable (unless you go through the effort of making them so by setting a special option, which i'm not sure is per-snapshot) Huh? That's weird -- I routinely access them via /pool/fs/.zfs/snapshot/snapshot name and I don't remember setting any kind of options. The visibility of .zfs can be tweaked, but all it really affects is Tab in bash ;-) - clones can only be created off of snapshots But that does sound reasonable. What else there is except snapshots and an active tree? Or are you objecting to the extra step that is needed where you really want to clone the active tree? - clones are read-writable but they can only be mounted within the /pool/fs/branch hierarchy. 
if you want to share them you need to explicitly adjust a lot of zfs settings such as 'sharenfs' and so on; In general -- this is true :-( But I think there's a way now to do that. If you're really interested -- I can take a look and let you know. - none of this can be done remotely Meaning? - libzfs has an unpublished interface, so if one wants to, say, write a 9p server to expose zfs functionality to remote hosts they must either reverse engineer libzfs or use other means. This one is a bit unfair. The interface is published alright. As much as anything in Open Source is. It is also documented at the level that would be considered reasonable for Linux. The fact that it is not *stable* makes the usual thorough Solaris documentation lacking. But all in all, following along doesn't require much more extra effort compared to following along any other evolving OS project. And yes, the situation has changed compared to what it used to be when Solaris 10 just came out. If you had a bad experience with libzfs some time ago -- I'm sorry, but if you try again you might find it more to your liking. Thanks, Roman.
Re: [9fans] Changelogs Patches?
On Sun, 2009-01-04 at 07:03 +0900, sqweek wrote: On Tue, Dec 30, 2008 at 8:54 AM, Roman Shaposhnik r...@sun.com wrote: Personally, though, I'd say that the usefulness of the dump would be greatly improved if one had an ability to do ad-hoc archival snapshots AND assigning tags, not only dates to them. Tags don't make that much sense in this context since the dump is for the whole filesystem, not a specific project. Well, as Charles pointed out -- in case of Plan9 development the whole system is the entire project. However, tagging a source tree can be done with a simple dircp. It's not as though the duplicate data costs you anything when you're backed by venti. Hm. Good point. Although timing wise, I'd expect dircp to be dreadfully slow. Well, I guess I really got spoiled by ZFS's ability to do things like $ zfs snapshot pool/projects/f...@yourtextgoeshere and especially: $ zfs clone pool/projects/f...@yourtextgoeshere pool/projects/branch I'm still trying to figure out what kind of approximation of the above would be possible with fossil/venti. Thanks, Roman.
Re: [9fans] Changelogs Patches?
Well, I guess I really got spoiled by ZFS's ability to do things like $ zfs snapshot pool/projects/f...@yourtextgoeshere at the console type snap. if you're allowing snaps to be mounted on the local fs then the equivalent would be mkdir /YourTextGoesHere; bind /n/dump/... / /YourTextGoesHere. note that zfs restricts where the snapshot can be mounted :p venti snapshots are, by default, read only. $ zfs clone pool/projects/f...@yourtextgoeshere pool/projects/branch that's as simple as starting a new fossil with -f 'somehex', where somehex is the score of the corresponding snap. this gives you both read-only snapshots, and as many clones as you wish. note that you're cheating here, and by quite a bit: - snapshots are read only and generally unmountable (unless you go through the effort of making them so by setting a special option, which i'm not sure is per-snapshot) - clones can only be created off of snapshots - clones are read-writable but they can only be mounted within the /pool/fs/branch hierarchy. if you want to share them you need to explicitly adjust a lot of zfs settings such as 'sharenfs' and so on; - none of this can be done remotely - libzfs has an unpublished interface, so if one wants to, say, write a 9p server to expose zfs functionality to remote hosts they must either reverse engineer libzfs or use other means. so, while i'm sure you enjoy zfs quite a bit, for others used to plan9's venti/fossil way of doing things zfs can be quite a pain.
Re: [9fans] Changelogs Patches?
fossil/venti through the lens of ZFS. I guess it's not a coincidence that ZFS actually has built-in support for the kind of history transfer you were implementing.

the transfer would have been trivial had the filesystems been compatible. what i did was reenact the actions that built the original fs on the new fs by manipulating the clock on the target. - erik
Re: [9fans] Changelogs Patches?
On Tue, Dec 30, 2008 at 8:54 AM, Roman Shaposhnik r...@sun.com wrote: Personally, though, I'd say that the usefulness of the dump would be greatly improved if one had an ability to do ad-hoc archival snapshots AND assigning tags, not only dates to them. Tags don't make that much sense in this context since the dump is for the whole filesystem, not a specific project. However, tagging a source tree can be done with a simple dircp. It's not as though the duplicate data costs you anything when you're backed by venti. -sqweek
Re: [9fans] Changelogs Patches?
Knowing *who* made the change is often even more useful than the change comment. uriel On Tue, Dec 30, 2008 at 2:48 AM, Charles Forsyth fors...@terzarima.net wrote: i've rarely found per-change histories to be any more useful than most other comments, i'm afraid. And that meant that math texts and math teaching was all about polished final results. ah. my statement was ambiguous. i meant per-change chatter in the history, not the changes in the history. it's fine to have the chatter, but it isn't essential, because nothing relies on it: the chatter doesn't cause the system to change its behaviour.
Re: [9fans] Changelogs Patches?
Knowing *who* made the change is often even more useful than the change comment. yes. i use ls -lm on our trees, but that might not work on less direct things like sources.
Re: [9fans] Changelogs Patches?
On Tue, Dec 30, 2008 at 4:06 PM, C H Forsyth fors...@vitanuova.com wrote: Knowing *who* made the change is often even more useful than the change comment. yes. i use ls -lm on our trees, but that might not work on less direct things like sources. It would work if the development trees were public... uriel
Re: [9fans] Changelogs Patches?
http://code.google.com/hosting/createProject On Tue, Dec 30, 2008 at 12:31 PM, Uriel urie...@gmail.com wrote: On Tue, Dec 30, 2008 at 4:06 PM, C H Forsyth fors...@vitanuova.com wrote: Knowing *who* made the change is often even more useful than the change comment. yes. i use ls -lm on our trees, but that might not work on less direct things like sources. It would work if the development trees were public... uriel
Re: [9fans] Changelogs Patches?
On Dec 26, 2008, at 5:27 AM, Charles Forsyth wrote: while a descriptive history is good, it takes a lot of extra work to generate. i've rarely found per-change histories to be any more useful than most other comments, i'm afraid.

I believe that it all depends on what it is that you look at source code for. A long time ago I used to study mathematics. Soviet mathematical schooling was really quite exceptional, but there was one thing that I now wish was different. You see, soviet math got a Bourbaki virus in its early childhood. And that meant that math texts and math teaching were all about polished final results. None of that messy and disgusting process of actually discovering those results. None. The process itself was considered too imprecise and muddy: "Rigor consisted in getting rid of an accretion of superfluous details. Conversely, lack of rigor gave my father an impression of a proof where one was walking in mud, where one had to pick up some sort of filth in order to get ahead. Once that filth was taken away, one could get at the mathematical object, a sort of crystallized body whose essence is its structure." From: http://ega-math.ru/Cartier.htm

And thus the circle of those who "just got it" was formed. Back when I was a student, I wanted to belong to that circle so badly that I missed a fundamental point: the very creation of the circle turned all of us from active participants in the process into art gallery goers. And that was a fine change for those who just wanted to appreciate fine math, but was a kiss of death for less gifted individuals who wanted to do math themselves (I won't touch the subject of whether less gifted individuals are supposed to do math in the first place, since it's too personal and painful). Ok, with math it is a bit difficult to have the records of the process AND the final object at the same time (well, good teachers understood that and their lectures were the ones worth attending). 
But in software engineering we DO have a chance to have our cake and eat it too. Albeit only if we put as much focus on maintaining history (our records of the process) as we put on maintaining the code itself (final results).

the advantage of dump and snap is that the scope is the whole system: including emails, discussion documents, the code, supporting tools -- everything in digital form. if software works differently today compared to yesterday, then in most cases, i'd expect 9fs dump to make it easy to track down the set of differences, and narrow the search to the culprit. it might not even be a source change, but a configuration file, or a file was moved or removed.

I don't deny that 9fs dump is quite useful and it seems to match the organization of the Plan 9 developer club pretty well. Personally, though, I'd say that the usefulness of the dump would be greatly improved if one had the ability to do ad-hoc archival snapshots AND assign tags, not only dates, to them. That would, in effect, bring the whole process quite close to what established SCMs do. With the only major feature (the ability to easily trade history between different hosts) still missing. Thanks, Roman.
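Charles's point about tracking down "the set of differences" between today and yesterday can be illustrated outside Plan 9 as a plain comparison of two snapshot trees. This is a hypothetical sketch, not yesterday(1) or replica; the function names are made up:

```python
import os

def snapshot_files(root):
    """Map relative path -> file bytes for every file under root."""
    files = {}
    for dirpath, _, names in os.walk(root):
        for name in names:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            with open(full, "rb") as f:
                files[rel] = f.read()
    return files

def dump_diff(old_root, new_root):
    """Return (changed, added, removed) paths between two snapshot trees."""
    old, new = snapshot_files(old_root), snapshot_files(new_root)
    changed = sorted(p for p in old if p in new and old[p] != new[p])
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    return changed, added, removed
```

Pointing `dump_diff` at two consecutive days of a dump-style tree (say, copies of /n/dump/2008/1225 and /n/dump/2008/1226) narrows the search to exactly the files that differ, whether source, configuration, or a file that was moved or removed.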
Re: [9fans] Changelogs Patches?
So is it time for a new file server then? :D
Re: [9fans] Changelogs Patches?
On Dec 27, 2008, at 3:56 AM, erik quanstrom wrote: I'm actually still trying to figure out how replica/* fits together with sources being a fossil server. These two, somehow, have to click, but I haven't figured out the connection just yet. Any pointers to the good docs? there's no connection. replica would work without a fossil server. for that matter, replica would work without a dump. all you need is an original and a changed version.

Got it. The bit that I didn't quite get initially was the fact that there's history accumulated in dumps and that history might need to be transferred *exactly* like it is to a different fileserver. And with replica only transferring the end result (the present moment, in history terms) there seemed to be a missing link...

i used replica (plus a few additional tools) to make a faithful copy of the coraid fileserver. http://www.quanstro.net/plan9/history.pdf

...but your article answered that last question completely. Although, I wonder whether direct transfer of history between two venti servers would be possible. Thanks, Roman.

P.S. I also didn't quite understand the business of synchronizing Qids. I have always thought that they are only meaningful for the duration of the server's lifetime and thus all applications are quite immune to potential Qid changes as long as the connection gets dropped and re-established. Or was it that your goal was to migrate so seamlessly that *running* applications wouldn't notice?
Re: [9fans] Changelogs Patches?
...but your article answered that last question completely. Although, I wonder whether direct transfer of history between two venti servers would be possible.

if one were to transfer history between two fs with the same on-disk format, a simple device copy would be sufficient. i was moving from a 32-bit 4k-block fs to geoff's 64-bit work with 8k blocks. history is not a property of venti. venti is a sparse virtual drive with ~2^80 bits of storage. blocks are addressed by the sha1 hash of their content. fossil is the fileserver. the analogy would be a change in fossil format. my technique would work for fossil, too.

P.S. I also didn't quite understand the business of synchronizing Qids. I have always thought that they are only meaningful for the duration of the server's lifetime and thus all applications are quite immune to potential Qid changes as long as the connection gets dropped and re-established. Or was it that your goal was to migrate so seamlessly that *running* applications wouldn't notice?

that's okay. russ thinks i'm nuts on this point, too. perhaps the paper wasn't fully clear. i wanted to make the assertion that if, on the original fs, qid(patha) == qid(pathb), then on the new fs, qid(patha') == qid(pathb'). the qids weren't the same. for various reasons (i.e. not every copy of every file makes it to a dump), they can't be. it's just a very complicated way of saying that i didn't want to recopy the same data needlessly and increase the size of the fs. i just couldn't think of an easy way of making the same assertion another way without reading every file for each day of the dump. remember, the original fs was a pentium ii with a 100mbit ethernet card. it still took 2 weeks to copy the data to the new fs. and russ is right in that it was overkill. but, hey, if it's worth doing, it's worth doing in grand excess.

oh, by the way, the replica db's are reusable. they could also, if one wished, be generated by the fs as part of the dump process. - erik
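erik's description of venti as a content-addressed store, where a block's address is the SHA-1 score of its contents, can be modeled in a few lines. This is a toy illustration of the addressing idea only, not venti's actual protocol, caching, or on-disk format; `TinyVenti` is a made-up name:

```python
import hashlib

class TinyVenti:
    """Toy content-addressed block store in the spirit of venti:
    write returns the SHA-1 score of the block, and identical
    blocks coalesce into a single stored copy."""
    def __init__(self):
        self.blocks = {}

    def write(self, data: bytes) -> str:
        score = hashlib.sha1(data).hexdigest()
        self.blocks[score] = data  # rewriting an existing block is a no-op
        return score

    def read(self, score: str) -> bytes:
        return self.blocks[score]
```

Because the score is derived from the content, writing the same block twice stores it once, which is why duplicate data (a dircp "tag", or another day's dump of an unchanged file) costs essentially nothing.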
Re: [9fans] Changelogs Patches?
I don't deny that 9fs dump is quite useful and it seems to match the organization of Plan9 developer club pretty well. Personally, though, I'd say that the usefulness of the dump would be greatly improved if one had an ability to do ad-hoc archival snapshots AND assigning tags, not only dates to them. i can't recommend reading ken's kernel (the fs) enough. it's recognizable as related to plan 9, but it is much simpler. it can afford to be static. it would be nifty if an early version of the fs (with typedef long Device) could be put up on sources for historical interest. - erik
Re: [9fans] Changelogs Patches?
i've rarely found per-change histories to be any more useful than most other comments, i'm afraid. And that meant that math texts and math teaching was all about polished final results. ah. my statement was ambiguous. i meant per-change chatter in the history, not the changes in the history. it's fine to have the chatter, but it isn't essential, because nothing relies on it: the chatter doesn't cause the system to change its behaviour.
Re: [9fans] Changelogs Patches?
On Dec 25, 2008, at 8:57 PM, Anthony Sorace wrote: erik offered some suggestions for hosting various bits of things outside 9vx and connecting to that in order to get the dumps. those options are valid, but you can just as well host the entire thing within 9vx. it's not the default configuration, but i believe instructions are out there (9fans or the wiki). using fossil for your root, instead of #Z, will obviously cost you the benefits of #Z - namely, the pass-through transparency.

That's good advice. Thanks. I wonder, however, if such transparency can be achieved the other way around -- serving my entire home directory via fossil from plan9port under UNIX and 9vx. Has anyone tried such a config?

if your primary interest is for replica/*,

I'm actually still trying to figure out how replica/* fits together with sources being a fossil server. These two, somehow, have to click, but I haven't figured out the connection just yet. Any pointers to the good docs? Thanks, Roman.
Re: [9fans] Changelogs Patches?
On Sat, Dec 27, 2008 at 06:04:42AM +, Eris Discordia wrote: it all begins with Adam and Steve, as Brian Stuart suggests, ways have been found of managing large teams of people with different specializations and those ways work. The Mgmt has a raison d'etre, despite what techno-people like to suggest.

Because when, say, Napoleon was commanding hundreds of thousands of soldiers, he was not commanding hundreds of thousands of soldiers individually. He gave orders to a handful, each giving orders to a handful, etc. But it was his idea that was going from top to bottom.

French: main tenir: holding (tenir) in one hand (main). You can maintenir a huge software if it is orthogonalized: when you take one piece, the whole plate of spaghetti does not come with it (you are just pulling on the articulation, the communication, the API with the rest). And for people, the military adds: hold in one hand, so the other is free to slap when needed (and a foot free to kick if the first lesson was not received strongly enough). -- Thierry Laronde (Alceste) tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [9fans] Changelogs Patches?
I'm actually still trying to figure out how replica/* fits together with sources being a fossil server. These two, somehow, have to click, but I haven't figured out the connection just yet. Any pointers to the good docs? there's no connection. replica would work without a fossil server. for that matter, replica would work without a dump. all you need is an original and a changed version. i used replica (plus a few additional tools) to make a faithful copy of the coraid fileserver. http://www.quanstro.net/plan9/history.pdf - erik
Re: [9fans] Changelogs Patches?
I'm baffled. Slap me, or kick me--your choice. --On Saturday, December 27, 2008 11:36 AM +0100 tlaro...@polynum.com wrote: On Sat, Dec 27, 2008 at 06:04:42AM +, Eris Discordia wrote: it all begins with Adam and Steve, as Brian Stuart suggests, ways have been found of managing large teams of people with different specializations and those ways work. The Mgmt has a raison d'etre, despite what techno-people like to suggest. Because when, say Napoleon was commanding hundreds of thousands of soldiers, he was not commanding individually hundreds of thousands of soldiers. But he gave order to a handful, giving orders each to a handful etc. But is was his idea that was going from to to bottom. French: main tenir: holding (tenir) in one hand (main). You can maintenir a huge software if it is orthogonalized: when you take one piece, not the whole plate of spaghetti comes (it just pulling on the articulation, the communication, the API with the rest). And for people, the military adds: hold in one hand, so the other is free to slap when needed (and a foot free to kick if first lesson was not received strong enough). -- Thierry Laronde (Alceste) tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [9fans] Changelogs Patches?
while a descriptive history is good, it takes a lot of extra work to generate. i've rarely found per-change histories to be any more useful than most other comments, i'm afraid. you'd hope it would answer "what was he thinking?" but i found either it was obvious or i still had to ask. still, perhaps it could be regarded as an aid to future computer archaeologists, after all the shared context has been lost. the intention of things like /CHANGES is mainly to point out moderate to large changes (eg, if you've been waiting for a bug fix or there's a significant change to usage or operation). it isn't intended to give details or rationale of the fix, any more than there is any of that for the original code, really. perhaps literate programming will fix that if it ever takes off. (the set of people that write good descriptions and the set of people that write good code don't necessarily have a big intersection.) for larger additions or changes i sometimes wrote short notes giving the background, the changes/additions and the rationale for them, ranging from the equivalent of a long e-mail to a several-page paper. that worked quite well, but was somewhat more work. also useful for compilers are links to bug demonstration programs and regression tests. the advantage of dump and snap is that the scope is the whole system: including emails, discussion documents, the code, supporting tools -- everything in digital form. if software works differently today compared to yesterday, then
Re: [9fans] Changelogs Patches?
the advantage of dump and snap is that the scope is the whole system: including emails, discussion documents, the code, supporting tools -- everything in digital form. if software works differently today compared to yesterday, then sorry, got cut off. then in most cases, i'd expect 9fs dump to make it easy to track down the set of differences, and narrow the search to the culprit. it might not even be a source change, but a configuration file, or a file was moved or removed.
Re: [9fans] Changelogs Patches?
I use CWEB (D. Knuth and Levy's) intensively and it is indeed invaluable. It doesn't magically improve code (my first attempts have just shown how poor my programming was: it's a magnifying glass, and one just saw with it bug's blinking eyes with bright smiles).

Back when I used CWEB on a regular basis (I don't find myself writing as much substantive code from scratch of late), I experienced an interesting phenomenon. I could write pretty good code, almost as a stream of consciousness. The tool made it natural to present the code in the order in which I could understand it, rather than the order the compiler wanted it. But it was the effect of this that was really interesting. I found that as I wrote I'd think in terms of several things I needed to do and I'd put placeholders in (chunk names) for all but the one I was writing just then. As I'd finish a chunk, I'd go back and find another one that I hadn't written yet, and I could pick them in whatever order I figured out how I wanted to handle them. At some point, I just ran out of chunks that needed to be written, and the code would be done. It was almost as if the completion of the code snuck up on me. At first, it was sort of a "maybe Knuth's on to something here," but it happened often enough that I now consider it a basic feature of the style.

Back to the topic in question though, I did find that writing and maintaining good descriptions took almost as much discipline as any other code documentation. I did have to resist the urge to leave the textual part of a chunk blank and just write the code. I also had to be diligent about updating the descriptions when the code changed. But for whatever reason (aesthetics, the tool, living up to Knuth's example...) it did seem a little easier in that context. However, in terms of changelogs and such, I'd say that's still an open question. 
It would seem that there should be some way to automate the creation of a changelog (at least in the form of a list of pointers) from the literate source. But the literate style itself doesn't really seem to create anything new in terms of the high level overview that you'd see in release notes or changelogs. BLS
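One hypothetical starting point for that automation: a literate source already names its parts, so scanning a CWEB file for chunk definitions yields the "list of pointers" a changelog skeleton could be built from. A rough sketch that recognizes only the plain `@<...@>=` definition form, not full CWEB syntax:

```python
import re

# Matches a CWEB chunk definition such as "@<Read the input@>=".
# Illustrative only: real CWEB also has @*, @(, +=, and other forms.
CHUNK_DEF = re.compile(r"@<(.+?)@>\s*=")

def chunk_index(cweb_source: str):
    """Return the named chunks defined in a CWEB file, in order."""
    return [m.group(1).strip() for m in CHUNK_DEF.finditer(cweb_source)]
```

Diffing the chunk index of two revisions would then point at which named parts of the program were touched, though, as noted, the high-level "why" still has to be written by hand.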
Re: [9fans] Changelogs Patches?
On Fri, Dec 26, 2008 at 11:25:33AM -0600, blstu...@bellsouth.net wrote: Back when I used CWEB on a regular basis (I don't find myself writing as much substantive code from scratch of late), I experienced an interesting phenomenon. I could write pretty good code, almost as a stream of consciousness. The tool made it natural to present the code in the order in which I could understand it, rather than the order the compiler wanted it.

Yes, but this means you have adapted the way you write the code to the logic behind literate programming. Starting with a structured programming approach (literate is indeed more) is probably the best. If, as I have done..., one looks at the finger instead of the moon, and takes it to be a way of formatting comments, with all the bells and whistles of TeX, one is definitely not on the right track---and that's why the packages that format C comments embedded in source are definitely not the same thing. Once you get at it, it really helps as you describe. (I have one library that I wrote almost in one go---the Esri SHAPE lib support for KerGIS---and it does the job; it was not the first, but it was the first I wrote with explanations in _French_, my native and thinking language; so now, since I think in French, I write in French---but code, including identifiers and one-line comments, is in \CEE. This was the second lesson I learned.)

However, in terms of changelogs and such, I'd say that's still an open question. It would seem that there should be some way to automate the creation of a changelog (at least in the form of a list of pointers) from the literate source. But the literate style itself doesn't really seem to create anything new in terms of the high-level overview that you'd see in release notes or changelogs. I like text, because of diffs. 
And CWEB has diffs ;) You can even compare this with Brooks' The Mythical Man-Month: slightly adapting CWEB's diff features will give the highlighting-of-changes doc Brooks wrote about. Even with data, to get to the point one needs only diffs (I use it with vectorial map stuff to highlight what changes have been made between different versions provided by surveyors. This, with the ability to show the state of the data at YYYY-MM-DD hh:mm:ss, is invaluable.) That is one of the many reasons I found plan9 so interesting: text oriented. -- Thierry Laronde (Alceste) tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [9fans] Changelogs Patches?
Back when I used CWEB on a regular basis (I don't find myself writing as much substantive code from scratch of late), I is it just me, or is it hard to read someone else's cweb code? if it's not just me... i wonder if the same thing that makes it easy to write from the top down doesn't make it hard to read. you have to be thinking the same way from the top, otherwise you're lost. appropriately, this being a plan 9 list and all, i find code written from the bottom up easier to read. - erik
Re: [9fans] Changelogs Patches?
On Fri, Dec 26, 2008 at 01:20:17PM -0500, erik quanstrom wrote: appropriately, this being a plan 9 list and all, i find code written from the bottom up easier to read. Depending on the task (on the aim of the software), one happens to split from top to bottom, and to review and amend from bottom to top. There is a navigation between the two. Bottom to top is easier because you are building more complicated stuff from basic stuff. But the definition of these elements (the software's orthonormal base), the justification of these elements, can be in part, has to be in part, a result of top-to-bottom thought. The general papers about Unix and Plan 9, the explanations of the logic of the whole, can not be, IMHO, tagged as bottom to top. They are simply to the point ;) -- Thierry Laronde (Alceste) tlaronde +AT+ polynum +dot+ com http://www.kergis.com/ Key fingerprint = 0FF7 E906 FBAF FE95 FD89 250D 52B1 AE95 6006 F40C
Re: [9fans] Changelogs Patches?
On Fri, Dec 26, 2008 at 01:20:17PM -0500, erik quanstrom wrote: appropriately, this being a plan 9 list and all, i find code written from the bottom up easier to read. Depending on the task (on the aim of the software), one happens to split from top to bottom, and to review and amend from bottom to top. There is a navigation between the two. Bottom to top is more easier because you are building more complicate stuff from basic stuff.

Some time back, I was trying to understand how to teach the reality of composing software. (Yes, I do think of it as a creative activity very similar to composing music.) The top-down and bottom-up ideas abound and make sense, but they never seemed to capture the reality. Then one day, after introspecting on the way I write code, I realized it's not one or the other; it's outside-in. I don't know what little tools I need to build until I have some sense of the big picture, but I can't really establish the exact boundaries between major elements until I've worked out the cleanest way to build the lower-level bits. So I iteratively work back and forth between big picture and building blocks until they meet in the middle.

As an aside, that's also when I realized what had always bugged me about the classic approach to team programming. The interfaces between major parts really come last, but in assigning work to team members, you have to force them to come first. And of course, from that perspective, it makes perfect sense why the best examples of programming are ones where the first versions are created by only 1 or 2 people and why the monstrosities created by large teams of professional software engineers are so often massive collections of mechanisms that don't work well together. BLS
Re: [9fans] Changelogs Patches?
The Story of Mel [...] I compared Mel's hand-optimized programs with the same code massaged by the optimizing assembler program, and Mel's always ran faster. That was because the top-down method of program design hadn't been invented yet, and Mel wouldn't have used it anyway. He wrote the innermost parts of his program loops first, so they would get first choice of the optimum address locations on the drum. The optimizing assembler wasn't smart enough to do it that way. [...] -- http://catb.org/jargon/html/story-of-mel.html Know why Mel is no more in business? 'Cause one man can only do so much work. The Empire State took many men to build, so did Khufu's pyramid, and there was no whining about many mechanisms that don't work well together. Now go call your managers PHBs. --On Friday, December 26, 2008 3:44 PM -0600 blstu...@bellsouth.net wrote: On Fri, Dec 26, 2008 at 01:20:17PM -0500, erik quanstrom wrote: appropriately, this being a plan 9 list and all, i find code written from the bottom up easier to read. Depending on the task (on the aim of the software), one happens to split from top to bottom, and to review and amend from bottom to top. There is a navigation between the two. Bottom to top is more easier because you are building more complicate stuff from basic stuff. Some time back, I was trying to understand how to teach the reality of composing software. (Yes, I do think of it as a creative activity very similar to composing music.) The top-down and bottom-up ideas abound and make sense, but they never seemed to capture the reality. Then one day, after introspecting on the way I write code, I realized it's not one or the other; it's outside-in. I don't know what little tools I need to build until I have some sense of the big picture, but I can't really establish the exact boundaries between major elements until I've worked out the cleanest way to build the lower-level bits. 
So I iterative work back and forth between big picture and building blocks until they meet in the middle. As an aside, that's also when I realized what had always bugged me about the classic approach to team programming. The interfaces between major parts really comes last, but in assigning work to team members, you have to force it to come first. And of course, from that perpsective, it makes perfect sense why the best examples of programming are ones where the first versions are created by only 1 or 2 people and why the monstrosities created by large teams of professional software engineers are so often massive collections of mechanisms that don't work well together. BLS
Re: [9fans] Changelogs Patches?
Know why Mel is no more in business? 'Cause one man can only do so much work. The Empire State took many men to build, so did Khufu's pyramid, and there was no whining about many mechanisms that don't work well together. Now go call your managers PHBs. building a pyramid, starting at the top is one of those things that just doesn't scale. - erik
Re: [9fans] Changelogs Patches?
building a pyramid, starting at the top is one of those things that just doesn't scale. But if you figure out how, it's probably worth a Nobel. BLS
Re: [9fans] Changelogs Patches?
building a pyramid, starting at the top is one of those things that just doesn't scale. For that, you have bottom-up, right? But there's no meet-in-the-middle for a pyramid, or for software. Unless the big picture is small enough to fit in one man's head and let him context-switch back and forth between general and particular, in which case you have to give up expanding software functionality at the one-man barrier.

All admirable architecture, and admirable software, is, in addition to being a manifestation of great technique, a manifestation of great management--even informal management is management in the end. Instead of "it all begins with Adam and Steve," as Brian Stuart suggests, ways have been found of managing large teams of people with different specializations and those ways work. The Mgmt has a raison d'etre, despite what techno-people like to suggest.

--On Friday, December 26, 2008 5:30 PM -0500 erik quanstrom quans...@quanstro.net wrote: Know why Mel is no more in business? 'Cause one man can only do so much work. The Empire State took many men to build, so did Khufu's pyramid, and there was no whining about many mechanisms that don't work well together. Now go call your managers PHBs. building a pyramid, starting at the top is one of those things that just doesn't scale. - erik
Re: [9fans] Changelogs Patches?
On Dec 25, 2008, at 6:37 AM, erik quanstrom wrote: despite the season, and typical attitudes, i don't think that development practices are a spiritual or moral decision. they are a practical one.

Absolutely! Agreed 100%. My original question was not at all aimed at saving Plan9 development practices from the fiery inferno. Far from it. I simply wanted to figure out whether the things that really help me follow the development of other open source projects are available under Plan9. It is ok for them to be different (e.g. not based on traditional SCMs) and it is even ok for them not to be available at all.

and what they have done at the labs appears to be working to me.

It surely does work in the sense that Plan9 is very much alive and kicking. But there are also some things that make following Plan9 development and doing software archeology more difficult than, let's say, plan9port. It very well may be just my own ignorance (in which case, please educate me on these subjects) but my current impression is that sources.cs.bell-labs.com is the de-facto SCM for Plan9. In fact, it is the only way to get new source into the official tree, yet still have some ability to track the old stuff via main/archive. This model, however well suited for the closely-knit inner circle of developers, makes it difficult for me to follow the project. Why? Well, here's my top reason: Plan9 development history is not quantized in atomic changesets, but rather in 24-hour periods. Even if a developer wanted to record the fact that a particular state of the tree corresponds to a bug fix or a feature implementation, the only way to do that would be not to allow any other changes in within the 24h window. This seems rather awkward. Two less severe problems are the lack of easy tracking of change ownership and of code migration through time and space. Both are quite important when one tries to figure out how (and why!) did we get from /n/sourcesdump/2002/* to /n/sourcesdump/2008/*

in my own experience, i've found scum always to cost time. but my big objection is the automatic merge. automatic merges make it way too easy to merge bad code without reviewing the diffs. while a descriptive history is good, it takes a lot of extra work to generate. just because it's part of the scum process doesn't make it free.

Agreed. As much as there's a price to pay when one tries to write clean code, there's a price to pay when one tries to maintain a clean history(*). In both cases, however, I, personally, would gladly pay that price. Otherwise I simply risk insanity if the project gets over a couple thousand lines of code or more than a year old. Thanks, Roman.

(*) My definition of a clean history is a set of smallest self-reliant changesets.
Re: [9fans] Changelogs Patches?
I surely hope the festive mood of the season will protect me from being ostracized for asking this, but is there any chance to map Plan9 development practices to some of the established ways of source code management? I mostly long for things like being able to browse Plan9 history with a clear understanding of who did what and for what reason.

in the holiday spirit ☺, isn't this similar logic:
1. scm packages are peace, hope and light, everybody knows that.
2. if you don't use a scum package you are in the darkness.
3. if you are in the darkness, you must be saved, or be cast into the pit.

despite the season, and typical attitudes, i don't think that development practices are a spiritual or moral decision. they are a practical one. and what they have done at the labs appears to be working to me. in my own experience, i've found scum always to cost time. but my big objection is the automatic merge. automatic merges make it way too easy to merge bad code without reviewing the diffs. while a descriptive history is good, it takes a lot of extra work to generate. just because it's part of the scum process doesn't make it free. - erik
Re: [9fans] Changelogs Patches?
On Dec 24, 2008, at 10:40 PM, erik quanstrom wrote:

>> Is there any preferred way to get changelogs / diffs these days?
>> yesterday -d ... when i'm especially curious or anxious.
>> But yesterday won't work in a more lightweight environment (such as 9vx) will it?
>
> exactly the same as plan 9 does. as long as the fs supports a dump fs, 9vx will support yesterday.

True. But not having an fs that supports dump is exactly what makes 9vx a lighter weight environment (unless I'm grossly mistaken and #Z in 9vx actually has a way of supporting dump).

> for example, i've been mounting my diskless fs with 9vx. yesterday works just fine. i'm sure you could use a linux-based venti with plan 9-based fossil as well.

True, but I'd really like to NOT have any extra software running and still have the ability to do replica/* and yesterday under 9vx. Can this be done?

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
> True, but I'd really like to NOT have any extra software running and still have the ability to do replica/* and yesterday under 9vx.

I'm only vaguely familiar with 9vx, so I can't speak to that, but you can certainly do replica/*, as it is a user-level tool; and as for yesterday, you can apply it to /n/sources, which is what you seem to imply is your requirement.

++L
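Concretely, both pieces work with user-level tools alone. The sketch below is illustrative, not prescriptive: it assumes the stock replica configuration shipped in /dist/replica, and the date path is just an example.

```rc
# update the local tree from sources with the user-level replica tools
replica/pull /dist/replica/network

# attach the dump file system that sources itself keeps,
# then browse an old snapshot of the tree (the date is an example)
9fs sourcesdump
ls /n/sourcesdump/2008/0101/plan9/sys/src
```

Because sourcesdump is served by the sources file server, browsing old snapshots this way needs no dump-capable file system on the 9vx side.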
Re: [9fans] Changelogs Patches?
depends what you mean by extra. if that means outside 9vx, then yes; if it means besides what 9vx uses by default, no.

yesterday(1) relies on having dump-style snapshots. 9vx, as shipped, gets its root file system from #Z, which doesn't have snapshots. erik offered some suggestions for hosting various bits of things outside 9vx and connecting to that in order to get the dumps. those options are valid, but you can just as well host the entire thing within 9vx. it's not the default configuration, but i believe instructions are out there (9fans or the wiki).

using fossil for your root, instead of #Z, will obviously cost you the benefits of #Z - namely, the pass-through transparency. if your primary interest is replica/*, though, you might consider the direction i've been headed: root from fossil, but import $home or /usr from #Z.
Re: [9fans] Changelogs Patches?
> using fossil for your root, instead of #Z, will obviously cost you the benefits of #Z - namely, the pass-through transparency. if your primary interest is for replica/*, though, you might consider the direction i've been headed: root from fossil, but import $home or /usr from #Z.

That's close to what I'm doing. When I'm running stand-alone, I boot from fossil, bind #Z to /n/unix and bind my UNIX home directory to a mount point in the fossil file system. When running as a terminal, I boot from my file server and still use pretty much the same binds.

BLS
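A minimal sketch of those binds in rc, assuming 9vx's #Z host-file-system device; the Unix home path and the mount point in the fossil tree are illustrative, not BLS's actual ones:

```rc
# make the host (Unix) file system visible inside the namespace
bind '#Z' /n/unix

# overlay a Unix home directory onto a mount point in the fossil tree
bind -c /n/unix/home/$user /usr/$user/unix
```

The same binds work whether the root came from a local fossil or a remote file server, which is why the stand-alone and terminal configurations can share them.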
Re: [9fans] Changelogs Patches?
On Dec 22, 2008, at 8:41 AM, Charles Forsyth wrote:

>> Is there any preferred way to get changelogs / diffs these days?
>> yesterday -d ... when i'm especially curious or anxious.
>> But yesterday won't work in a more lightweight environment (such as 9vx) will it?
>
> it probably wouldn't hurt to have a DMEXCL+DMAPPEND file (!) maintained by the command that applies patches, which appends the readme/notes file(s) for each patch as it is applied. not all changes are done through patches.

Speaking of which -- is there any FAQ on the current development practices of the Plan9 project? Things like patch lifecycle, etc.?

Thanks,
Roman.
Re: [9fans] Changelogs Patches?
>> Is there any preferred way to get changelogs / diffs these days?
>> yesterday -d ... when i'm especially curious or anxious.
>
> But yesterday won't work in a more lightweight environment (such as 9vx) will it?

exactly the same as plan 9 does. as long as the fs supports a dump fs, 9vx will support yesterday. for example, i've been mounting my diskless fs with 9vx. yesterday works just fine. i'm sure you could use a linux-based venti with plan 9-based fossil as well.

- erik
Re: [9fans] Changelogs Patches?
On Dec 22, 2008, at 8:46 PM, Nathaniel W Filardo wrote:

> Hi, The contrib index mentions that daily changelogs for Plan 9 are in sources/extra/changes, but those haven't been updated since early 2007. Is there any preferred way to get changelogs / diffs these days? Relatedly, is there a better way to mirror the development history of Plan 9 than running
>
>	@{9fs sourcesdump; cd /n/sourcesdump; tar -c} | @{tar -x}
>
> or similar?

I surely hope the festive mood of the season will protect me from being ostracized for asking this, but is there any chance to map Plan9 development practices to some of the established ways of source code management? I mostly long for things like being able to browse Plan9 history with a clear understanding of who did what and for what reason.

Say what you will about the Linux kernel, but things like these:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=summary

surely make it much more bearable to work with^H^H^H^H^H around.

Thanks,
Roman.

P.S. I see that Russ uses Mercurial SCM for some of his other projects, so maybe my question is not that weird, after all...
Re: [9fans] Changelogs Patches?
> Is there any preferred way to get changelogs / diffs these days?

i use

	9fs sources
	diff /whatever /n/sources/plan9/whatever

and, after a pull,

	yesterday -d ...

when i'm especially curious or anxious.

it probably wouldn't hurt to have a DMEXCL+DMAPPEND file (!) maintained by the command that applies patches, which appends the readme/notes file(s) for each patch as it is applied. not all changes are done through patches.
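Spelled out with a concrete file, the workflow erik describes might look like this in rc; the path to cat.c is just an example, not anything he named:

```rc
# attach the sources file server under /n/sources
9fs sources

# compare a local file against the distribution copy
diff /sys/src/cmd/cat.c /n/sources/plan9/sys/src/cmd/cat.c

# after a pull, compare today's file against yesterday's dump of it
yesterday -d /sys/src/cmd/cat.c
```

The diff against /n/sources shows what you differ from the distribution; yesterday -d shows what the pull itself changed, since it compares against the pre-pull snapshot in the dump.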
Re: [9fans] Changelogs Patches?
2008/12/22 Venkatesh Srinivas m...@acm.jhu.edu:

> Hi, The contrib index mentions that daily changelogs for Plan 9 are in sources/extra/changes, but those haven't been updated since early 2007. Is there any preferred way to get changelogs / diffs these days?

I used to maintain the changelogs, but ended up generating ENOTIME, pretty much just as everyone else who has worked on that. It's something I think I might pick up again; either Russ or Uriel emailed me a set of scripts to maintain it. Perhaps I'll start doing it again; it's mostly just a question of getting the scripts set up and doing it.

--dho

> Also, in sources/patch, there are patches neither in applied/ nor sorry/. Are these patches in queue? Applied? Not applied? Thanks, -- vs
Re: [9fans] Changelogs Patches?
It is pretty much a question of it being a totally backwards way of doing things, with one set of people doing the changes, and another set of people guessing the meaning of the changes writing the changelog. (This is claimed to be due to the first set of people not having the time to write down what changes they make. Of course those same people seem to think the time spent when the second group has to inquire as to the nature of the changes is not wasteful.)

But following more conventional practices and heeding the crazy advice of unqualified people like Brian when he writes:

> *Keep records*. I maintain a FIXES file that describes every change to the code since the Awk book was published in 1988 [1]

would be anathema to the Plan 9 way of doing things.

uriel

[1]: http://www.cs.princeton.edu/~bwk/testing.html

On Mon, Dec 22, 2008 at 6:03 PM, Devon H. O'Dell devon.od...@gmail.com wrote:

> 2008/12/22 Venkatesh Srinivas m...@acm.jhu.edu:
>> Hi, The contrib index mentions that daily changelogs for Plan 9 are in sources/extra/changes, but those haven't been updated since early 2007. Is there any preferred way to get changelogs / diffs these days?
>
> I used to maintain the changelogs, but ended up generating ENOTIME, pretty much just as everyone else who has worked on that. It's something I think I might pick up again; either Russ or Uriel emailed me a set of scripts to maintain it. Perhaps I'll start doing it again; it's mostly just a question of getting the scripts set up and doing it.
>
> --dho
>
>> Also, in sources/patch, there are patches neither in applied/ nor sorry/. Are these patches in queue? Applied? Not applied? Thanks, -- vs
Re: [9fans] Changelogs Patches?
Hi,

The contrib index mentions that daily changelogs for Plan 9 are in sources/extra/changes, but those haven't been updated since early 2007. Is there any preferred way to get changelogs / diffs these days?

Relatedly, is there a better way to mirror the development history of Plan 9 than running

	@{9fs sourcesdump; cd /n/sourcesdump; tar -c} | @{tar -x}

or similar?

Thanks.
--nwf