Re: [9fans] Changelogs Patches?

2009-01-26 Thread erik quanstrom
  it's important to keep in mind that fossil is just a write buffer.
  it is not intended for the permanent storage of data.
 
 Sure. But it must store the data *intact* long enough
 for me to be able to do a snap. It has to be able to
 at least warn me about data corruption.

do you have any references to spontaneous data corruption
happening so soon on media that can be written elsewhere
without corruption?  an ibm paper argues for raid[56] + chksum
and claims that p(lifetime) = 10^-13.

http://domino.watson.ibm.com/library/cyberdig.nsf/80741a79b3d5f4d085256b3600733b05/ca7b221ad09be77885257149004f7c53?OpenDocumentHighlight=0,RZ3652

but i didn't see any reason that this would apply to short-term
storage.

 That is my *entire* point. If fossil doesn't tell you that
 the data in its buffer was/is corrupted -- you have no
 reason to roll back.

if you're that worried, you do not need to modify fossil.
why not write an sdecc driver that takes as configuration
another sd device and a block size?  then you can just
add ecc on the way in and check it on the way out.

- erik



Re: [9fans] Changelogs Patches?

2009-01-26 Thread erik quanstrom
 It depends on the vdev configuration. You can do simple mirroring
 or you can do RAID-Z (which is more or less RAID-5 done properly).

raid5 done properly?  could you back up this claim?

also, with services like ec2, it's no use doing raid since all
your data could be on the same drive, regardless of what they tell
you.

  does this depend on the amount of i/o one does on the data or does 
  zfs scrub at a minimum rate anyway.  if it does, that would be expensive.  
 
 You can do resilvering (fixing the data that is known to be
 bad) or scrubbing (verifying and fixing *all* the data). You
 can also configure whether bad blocks trigger automatic
 resilvering. Does this answer your question?

no.  not at all.  if you're serious about using ec2, one of the
costs you need to control is your b/w usage.  you're going to
notice overly-aggressive scrubbing in your monthly bill.

  maybe ec2 is heads amazon wins, tails you lose?
 
 The scariest takeaway from the conference was: with the economy
 the way it is physical on-site datacenters are becoming a 
 luxury for all but the most wealthy companies. Thus whether
 we like it or not virtual data centers are here to stay.

if the numbers i came up with for coraid are correct, it would cost
coraid about 50x more to use ec2.  that is, if we can run plan 9
at all.

- erik



Re: [9fans] Changelogs Patches?

2009-01-26 Thread Roman V. Shaposhnik
On Mon, 2009-01-26 at 08:53 -0500, erik quanstrom wrote:
  It depends on the vdev configuration. You can do simple mirroring
  or you can do RAID-Z (which is more or less RAID-5 done properly).
 
 raid5 done properly?  could you back up this claim?

Yes. See here for details:
   http://blogs.sun.com/bonwick/entry/raid_z

   does this depend on the amount of i/o one does on the data or does 
   zfs scrub at a minimum rate anyway.  if it does, that would be expensive. 

  
  You can do resilvering (fixing the data that is known to be
  bad) or scrubbing (verifying and fixing *all* the data). You
  also can configure things so that bad blocks either trigger
  or don't automatic resilvering. Does this answer your question?
 
 no.  not at all. 

Then, please, restate it.

 if you're serious about using ec2, one of the
 costs you need to control is your b/w usage.  you're going to
 notice overly-aggressive scrubbing in your monthly bill.

Only if you asked for that to happen. It's all under your control.
You may decide to never ever do scrubbing.

  The scariest takeaway from the conference was: with the economy
  the way it is physical on-site datacenters are becoming a 
  luxury for all but the most wealthy companies. Thus whether
  we like it or not virtual data centers are here to stay.
 
 if the numbers i came up with for coraid are correct, it would cost
 coraid about 50x more to use ec2.  that is, if we can run plan 9
 at all.

You may think what you want, but obviously quite a few existing small to
mid-size companies disagree. Including a couple of labs with MPI apps
now running on EC2. Maybe your numbers are wrong, maybe your usage
patterns are different. Who knows.

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-26 Thread Russ Cox
 As for me, here's my wish list so far. It is all about fossil, since
 it looks like venti is quite fine (at least for my purposes) the
 way it is:
 1. Block consistency. Yes I know the argument here is that you
 can always roll-back to the last known archival snapshot on venti.
 But the point is to know *when* to roll back. And unless fossil
 warns you that a block has been corrupted you wouldn't know.

I don't understand what you mean.  Do you want fossil to tell
you when your disk is silently corrupting data, or something else?

 2. live mounting of arbitrary scores corresponding to vac
 VtRoot's to arbitrary sub-directories in my fossil tree. After
 all, if I can do create of regular files and sub-directories
 via fossil's console why can't I create pointers to the existing
 venti file-hierarchies?

The only reason this is hard is the choice of qids.
You need to decide whether to reuse the qids in the archive
or renumber them to avoid conflicts with existing qids.
The vac format already has a way to offset the qids of whole
subtrees, but then if you make the tree editable and new files are
created, it gets complicated.

 3. Not sure whether this is a fossil requirement or not, but I
 feel uneasy that a root score is sort of unrecoverable from the
 pure venti archive. It's either that I know it or I don't.

I don't understand what you mean here either.
From a venti archive, you do cat file.vac to find
the actual score.

For what it's worth, I'll be the first to admit that fossil has a
ton of rough edges and things that could be done better.
There were early design decisions that we didn't know the
implications of until relatively late in the implementation,
and I would revisit many of those if I had the luxury of
doing it over.  It is very much version 0.

The amazing thing to me about fossil is how indestructible
it is when used with venti.  While I was finishing fossil,
I ran it on my laptop as my day-to-day file system, and I never
lost a byte of data despite numerous bugs, because venti
itself was solid, and I always did an archive to venti before
trying out new code.  Once you see the data in the archive
tree, you can be very sure it's not going away.

 It is actually quite remarkable how similar the models of
 fossil/venti and Git seem to be: both build on the notion
 of the immutable history. Both address the history by the
 hash index. Both have a mutable area whose only purpose
 is to stage data for the subsequent commit to the permanent
 history. Etc.

I don't think it's too remarkable.  Content hash addressing was
in the air for the last decade or so and there were a lot of
systems built using it.  The one thing it does really well
is eliminate any worry about cache coherency and versioning.
That makes it very attractive for any system with large
amounts of data or multiple machines.  Once you've gone down
that route, you have to come to grips with how to implement
mutability in a fundamentally immutable system, and the
obvious way is with a mutable write buffer staging writes
out to the immutable storage.

 Hm. There doesn't seem to be much of shutdown code for fossil/venti
 in there. Does it mean that sync'ing venti and then just slay(1)'ing
 it is ok?

Yes, it is.  Venti is designed to be crash-proof, as is fossil.
They get the write ordering right and pick up where they left off.
They are not, however, disk corruption-proof.

Russ



Re: [9fans] Changelogs Patches?

2009-01-26 Thread erik quanstrom
 Yes. See here for details:
http://blogs.sun.com/bonwick/entry/raid_z

since these arguments rely heavily on the meme that
software raid == bad
i have a hard time signing on.  i believe i'm repeating
myself by saying that afaik, there is no such thing as pure
hardware raid; that is, there is no hardware that does
all of what raid level n does in hardware.  even if it's
an embedded processor, it's all software raid.  perhaps
there's an xor engine to speed things along.

the other part of the argument — the write hole —
depends on two things that i don't think are universal:
a) zfs' demand for transactional storage b) a particular
raid implementation.

fancy raid cards often have battery-backed ram and thus
from the pov of the host, writes are atomic.  i don't have
any nda that lets me see the firmware for a variety of raid
devices, but i find it hard to believe that all raid vendors
rewrite the entire stripe whenever the write is smaller than
the stripe size and all could rewrite the data before the
parity.

 You may think what you want, but obviously quite a few existing small to
 mid-size companies disagree. Including a couple of labs with MPI apps
 now running on EC2. 

more people use windows than use plan 9.  should
i therefore conclude that my use of plan 9 is illogical?
http://en.wikipedia.org/wiki/Appeal_to_the_majority

why do you think that mpi has anything to do with
a plan 9 infrastructure?

 Maybe your numbers are wrong, maybe your usage
 patterns are different. Who knows.

a single cpu on ec2 costs $150/month.  my 6 personal
machines don't suck down that much juice.

the machines i have largely cost less than $500.  so
that's like $14/month.  that doesn't change the equation
much.

- erik




Re: [9fans] Changelogs Patches?

2009-01-26 Thread Roman V. Shaposhnik
On Mon, 2009-01-26 at 08:22 -0800, Russ Cox wrote:
  As for me, here's my wish list so far. It is all about fossil, since
  it looks like venti is quite fine (at least for my purposes) the
  way it is:
  1. Block consistency. Yes I know the argument here is that you
  can always roll-back to the last known archival snapshot on venti.
  But the point is to know *when* to roll back. And unless fossil
  warns you that a block has been corrupted you wouldn't know.
 
 I don't understand what you mean.  Do you want fossil to tell
 you when your disk is silently corrupting data, or something else?

Implementation-wise I would be happy to see the same score checks that
venti does implemented in fossil. Complaining like this:
   seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);
would be good enough.
   
  2. live mounting of arbitrary scores corresponding to vac
  VtRoot's to arbitrary sub-directories in my fossil tree. After
  all, if I can do create of regular files and sub-directories
  via fossil's console why can't I create pointers to the existing
  venti file-hierarchies?
 
 The only reason this is hard is the choice of qids.
 You need to decide whether to reuse the qids in the archive
 or renumber them to avoid conflicts with existing qids.
 The vac format already has a way to offset the qids of whole
 subtrees, but then if you make the tree editable and new files are
 created, it gets complicated.

I see. Thanks for the explanation.

  3. Not sure whether this is a fossil requirement or not, but I
  feel uneasy that a root score is sort of unrecoverable from the
  pure venti archive. It's either that I know it or I don't.
 
 I don't understand what you mean here either.
 From a venti archive, you do cat file.vac to find
 the actual score.

As I mentioned: this one is not really a hard requirement, but
rather me thinking out loud. To me it feels that Venti is
opaque. In a sense that if I don't know the score to give to flfmt -v
then there's no way to browse through the venti to see what
could be there (unless I get physical access to arenas, I guess).

Now, suppose I have a fossil buffer that I constantly snap to venti.
That will build quite a lengthy chain of VtRoots. Then my fossil
buffer gets totally corrupted. I no longer know what was the 
score of the most recent snapshot. And I don't think I know of any
way to find that out.

 The amazing thing to me about fossil is how indestructible
 it is when used with venti.  

I agree. That has been very much the case during my short
evaluation of the two.

  It is actually quite remarkable how similar the models of
  fossil/venti and Git seem to be: both build on the notion
  of the immutable history. Both address the history by the
  hash index. Both have a mutable area whose only purpose
  is to stage data for the subsequent commit to the permanent
  history. Etc.
 
 I don't think it's too remarkable.  Content hash addressing was
 in the air for the last decade or so and there were a lot of
 systems built using it.  The one thing it does really well
 is eliminate any worry about cache coherency and versioning.
 That makes it very attractive for any system with large
 amounts of data or multiple machines.  Once you've gone down
 that route, you have to come to grips with how to implement
 mutability in a fundamentally immutable system, and the
 obvious way is with a mutable write buffer staging writes
 out to the immutable storage.

All true. Yet, it is surprising how many DSCMs built
on the idea of hash-addressable history got the mutability
part of the implementation wrong. Git is the closest to what
I now understand to be the fossil/venti approach.

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-26 Thread Steve Simon
 Now, suppose I have a fossil buffer that I constantly snap to venti.
 That will build quite a lengthy chain of VtRoots. Then my fossil
 buffer gets totally corrupted. I no longer know what was the 
 score of the most recent snapshot. And I don't think I know of any
 way to find that out.

there is a command, fossil/last, which prints the last snapped root score.
I run this from cron nightly and send the resulting score to a remote machine.

If all else fails there is a script in /sys/src/cmd/venti/words/dumpvacroots
which interrogates the http server built into venti and prints all the recent
root scores.

I have had to use this in the past when I had a dead disk and was less
careful with my scores - all was fine, but I learnt my lesson.

-Steve



Re: [9fans] Changelogs Patches?

2009-01-26 Thread Roman Shaposhnik

On Jan 26, 2009, at 8:39 AM, erik quanstrom wrote:

  This approach will work too. But it seems that asking fossil
  to verify a checksum when the block is about to go to venti
  is not that much of an overhead.

 if checksumming is a good idea, shouldn't it be available outside
 fossil?


It is available -- in venti ;-)


 perhaps the argument is that it might be more efficient
 to implement this inside fossil.


The argument has nothing to do with efficiency. However,
the way fossil is structured -- I think you're right that it won't be
able to get additional benefits from its own checksumming.


 while this might be the case, i
 don't see how the small overhead of an sd layer would matter
 when you're assuming an ec2-style service, which will have a
 minimum latency in the 10s of milliseconds.


Somehow you've got this strange idea that I'm engineering
something for ec2-style services. I am not. EC2 was a simple
example I used once. If it agitates you too much I promise
not to use it in the future ;-)

Thanks,
Roman.



Re: [9fans] Changelogs Patches?

2009-01-26 Thread Roman Shaposhnik

On Jan 26, 2009, at 9:37 AM, erik quanstrom wrote:

 the other part of the argument — the write hole —
 depends on two things that i don't think are universal:
 a) zfs' demand for transactional storage


Huh?!?


 b) a particular raid implementation.

 fancy raid cards


I think you missed what the I in RAID is supposed to
be expanding into ;-)

 i don't have any nda that lets me see the firmware for a variety of
 raid devices, but i find it hard to believe that all raid vendors
 rewrite the entire stripe whenever the write is smaller than
 the stripe size and all could rewrite the data before the
 parity.


Fancy ones might try to do fancy things, but see above.


 why do you think that mpi has anything to do with
 a plan 9 infrastructure?


It is the other way around: the fact that Plan9 still
doesn't have anything to do with MPI keeps it
away from the kind of clusters Ron used to care
about (although, in reality, it is all about gcc anyway,
so MPI is a lesser argument here).


  Maybe your numbers are wrong, maybe your usage
  patterns are different. Who knows.


 a single cpu on ec2 costs $150/month.


I don't know where you got that number, but my instance
on EC2 costs me about $70/m. Oh, wait! I know!
It is all because Solaris is so energy efficient ;-)


 the machines i have largely cost less than $500.  so
 that's like $14/month.  that doesn't change the equation
 much.


I believe you are distorting my argument on purpose.

So let's just drop this conversation, ok?

Thanks,
Roman.



Re: [9fans] Changelogs Patches?

2009-01-26 Thread erik quanstrom
 the other part of the argument — the write hole —
 depends on two things that i don't think are universal:
 a) zfs' demand for transactional storage
 
 Huh?!?

why else would the zfs guys be worried about a
write hole for zfs?

what would happen to a raid-z if a write returned
as successful but was really written only to the disk's cache,
and before the whole write is completed, the disk or
chassis loses power?  isn't that also a write hole?

i suppose the answer to this problem is the checksumming.
but if that is the case, what is the point of raid-z?

- erik




Re: [9fans] Changelogs Patches?

2009-01-25 Thread Roman V. Shaposhnik
On Tue, 2009-01-20 at 16:52 -0700, andrey mirtchovski wrote:
 for my personal $0.02 i will say that this argument seems to revolve
 around trying to bend fossil and venti to match the functionality of
 zfs and the design decisions of the team that wrote it. 

That is NOT the conversation I'm interested in. My main objective is 
to evaluate venti/fossil approach to storage and what kind of benefits
it might provide. It is inevitable that I will contrast venti/fossil
with ZFS, simply because it is the background I'm coming from.

 i, frankly, think that it should be the other way around; zfs should 
 provide the equivalent of the fossil/venti snapshot/dump functionality 
 to its users. that, to me would be a benefit.

Ok. It is fair to turn the tables. So now, let me ask you: what are
the benefits of fossil/venti that you want to see in ZFS? So far
the only real issue that you've identified is this:

  ||| where the second choice becomes a nuisance for me is in the 
  ||| case where one has thousands of clones and needs to keep track 
  ||| of thousands of names in order to ensure that when the right one
  ||| has finished the right clone disappears.

And I think it is a valid one. But is there anything else (except
the issues that stem from the fact that ZFS lives in UNIX
while fossil/venti live in Plan 9)?

As for me, here's my wish list so far. It is all about fossil, since
it looks like venti is quite fine (at least for my purposes) the
way it is:
 1. Block consistency. Yes I know the argument here is that you
 can always roll-back to the last known archival snapshot on venti.
 But the point is to know *when* to roll back. And unless fossil
 warns you that a block has been corrupted you wouldn't know.
  
 2. live mounting of arbitrary scores corresponding to vac
 VtRoot's to arbitrary sub-directories in my fossil tree. After 
 all, if I can do create of regular files and sub-directories 
 via fossil's console why can't I create pointers to the existing 
 venti file-hierarchies?
  
 3. Not sure whether this is a fossil requirement or not, but I
 feel uneasy that a root score is sort of unrecoverable from the
 pure venti archive. It's either that I know it or I don't.

 all this filesystem/snapshot/clone games are just a bunch of toys to 
 make the admins happy and have little effective use for the end user.

I disagree. Remember that this whole conversation started from
a simple premise that a good archival system could be an
efficient replacement for the SCM. If your end users are
software developers -- that IS very relevant to them.

It is actually quite remarkable how similar the models of
fossil/venti and Git seem to be: both build on the notion
of the immutable history. Both address the history by the
hash index. Both have a mutable area whose only purpose
is to stage data for the subsequent commit to the permanent
history. Etc.

  I see what you mean, but in case of venti -- nothing disappears, really.
  From that perspective you can sort of make those zfs clones linger.
  The storage consumption won't be any different, right?
 
 the storage consumption should be the same, i presume. my problem is
 that in the case of zfs having several hundred snapshots significantly
 degrades the performance of the management tools to the extent that
 zfs list takes 30 seconds with about a thousand entries.

Really?!?

 compared to
 fossil handling 5 years worth of daily dumps in less than a second.
 but that's not really a serious argument ;)

And what's the output of
   term% ls -d path-to-your-fossil/archive/*/*/* | wc -l

  Great! I tried to do as much homework as possible (hence the delay) but
  I still have some questions left:
 0. A dumb one: what's the proper way of cleanly shutting down fossil
 and venti?
 
 see fshalt.

Hm. There doesn't seem to be much shutdown code for fossil/venti
in there. Does it mean that sync'ing venti and then just slay(1)'ing
it is ok?

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-25 Thread Roman V. Shaposhnik
On Fri, 2009-01-23 at 22:36 -0500, erik quanstrom wrote:
  You never know when end-to-end data consistency will start to really
  matter. Just the other day I attended the cloud conference where 
  some Amazon EC2 customers were swapping stories of Amazon's networking
  stack malfunctioning and silently corrupting data that was written
  onto EBS. All of a sudden, something like ZFS started to sound like
  a really good idea to them.
 
 i know we need to bow down before zfs's greatness, but i still have
 some questions. ☺

Oh, come on! I said something like ZFS ;-) These guys are on
Linux, for crying out loud! They need to be saved one way
or the other (and Solaris at least has *some* AMIs available
on EC2).

 does ec2 corrupt all one's data en masse?

From what I understood -- it was NOT en masse. But the scary
thing is that they only noticed because of dumb luck
(the app coredumped because the input it was getting was not
properly formatted or something)

 how do you do meaningful redundancy in a cloud where one controls
 none of the failure-prone pieces.

Well, that's the very point I'm trying to make: you have
to be at least notified that your data got corrupted.

Once you do get notified -- you can recover in a variety
of ways: starting from simply re-uploading/re-generating
your data all the way to RAID-like things.

 finally, if p is the probability of a lost block, when does p become too
 large for zfs' redundancy to overcome failures?

It depends on the vdev configuration. You can do simple mirroring
or you can do RAID-Z (which is more or less RAID-5 done properly).

 does this depend on the amount of i/o one does on the data or does 
 zfs scrub at a minimum rate anyway.  if it does, that would be expensive.  

You can do resilvering (fixing the data that is known to be
bad) or scrubbing (verifying and fixing *all* the data). You
can also configure whether bad blocks trigger automatic
resilvering. Does this answer your question?

 maybe ec2 is heads amazon wins, tails you lose?

The scariest takeaway from the conference was: with the economy
the way it is physical on-site datacenters are becoming a 
luxury for all but the most wealthy companies. Thus whether
we like it or not virtual data centers are here to stay.

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-25 Thread Roman V. Shaposhnik
On Tue, 2009-01-20 at 21:02 -0500, erik quanstrom wrote:
  In such a setup a corrupted block from a fossil 
  partition will go undetected and could end up
  being stored in venti. At that point it will become
  venti problem.
 
 it's important to keep in mind that fossil is just a write buffer.
  it is not intended for the permanent storage of data.

Sure. But it must store the data *intact* long enough
for me to be able to do a snap. It has to be able to
at least warn me about data corruption.

 while corrupt data could end up in venti, the exposure lies only
 between snapshots.  you can roll back to the previous good
 score and continue.

That is my *entire* point. If fossil doesn't tell you that
the data in its buffer was/is corrupted -- you have no
reason to roll back.

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-23 Thread Roman V. Shaposhnik
On Wed, 2009-01-21 at 20:02 +0100, Uriel wrote:
 On Wed, Jan 21, 2009 at 2:43 AM, Roman V. Shaposhnik r...@sun.com wrote:
  Sure, but I can't really use venti  without using
  fossil (again: we are talking about a typical setup
  here not something like vac/vacfs), can I?
 
  If I can NOT then fossil becomes a weak link that
  can let corrupted data go undetected all the way
  to a venti store.
 
 Fossil has always been a weak link, and probably will always be until
 somebody replaces it. There was some idea of replacing it with a
 version of ken's fs that uses a venti backend...
 
 Venti's rock solid design is the only thing that makes fossil
 minimally tolerable despite its usual tendency of stepping on its hair
 and falling on its face.

After spending some time reading the sources and grokking fossil
I don't think it is a walking disaster. Far from it.

There are a couple of places where things can be improved, 
to make *me* happier (YMMV), and I'll try to focus on these 
in replying to Andrei's email. Just to get some closure on
this discussion.

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-23 Thread erik quanstrom
 After spending some time reading the sources and grokking fossil
 I don't think it is a walking disaster. Far from it. 
 
 There are a couple of places where things can be improved, 
 to make *me* happier (YMMV), and I'll try to focus on these 
 in replying to Andrei's email. Just to get some closure on
 this discussion.

it's important to note, though, that fossil is a write
buffer and not a proper cache.  i believe this fact
is the main source of legitimate gripes with fossil.

the other source of trouble is that both fossil and
venti have at times suffered from being quite unfriendly
when shut down unexpectedly.  since they run on
cpu servers, and since there is a temptation to have
an all-in-wonder cpu server, unexpected shutdowns
can be more common than one would like.

- erik



Re: [9fans] Changelogs Patches?

2009-01-23 Thread erik quanstrom
 You never know when end-to-end data consistency will start to really
 matter. Just the other day I attended the cloud conference where 
 some Amazon EC2 customers were swapping stories of Amazon's networking
 stack malfunctioning and silently corrupting data that was written
 onto EBS. All of a sudden, something like ZFS started to sound like
 a really good idea to them.

i know we need to bow down before zfs's greatness, but i still have
some questions. ☺

does ec2 corrupt all one's data en masse?  how do you do meaningful
redundancy in a cloud where one controls none of the failure-prone
pieces.

finally, if p is the probability of a lost block, when does p become too
large for zfs' redundancy to overcome failures?  does this depend on
the amount of i/o one does on the data or does zfs scrub at a minimum
rate anyway.  if it does, that would be expensive.  

maybe ec2 is heads amazon wins, tails you lose?

- erik



Re: [9fans] Changelogs Patches?

2009-01-21 Thread erik quanstrom
On Wed Jan 21 01:40:13 EST 2009, st...@quintile.net wrote:
  ... fossil does have the functionality to serve two
  different file systems from two different disks, but i don't  think
  anyone has used that ...
 
 I do this, 'main' backed up by venti and 'other' which holds useful stuff
 that needn't be backed up, e.g. RFCs, cdrom images, datasheets etc. This is
 accessed via 9fs juke as an homage to the CDROM jukebox that once provided
 a similar filesystem at the labs.

actually, it was an hp jukebox that had mo (magneto-optical) disks.
alliance (née plasmon) makes 60gb udo2 drives
  http://www.plasmon.com/archive_solutions/udodrives.html
and these libraries
  http://www.plasmon.com/archive_solutions/glibrary.html
the media are supposedly good for 50 years.

www.quanstro.net/plan9/disklessfs.pdf describes coraid's
worm-replacement strategy.  it is both better (offsite,
very fast access) and not better (the media are less reliable
and not write-once).

it would be neat to have a filesystem built as
filsys main cpe2.0kcache{e2.1jw0w1}
all the speed of disks and a permanent record,
but clearly not very cost effective.  and direct-
attach storage doesn't seem like the right place for
the worm.  it should be offsite.

- erik



Re: [9fans] Changelogs Patches?

2009-01-21 Thread Uriel
On Wed, Jan 21, 2009 at 2:43 AM, Roman V. Shaposhnik r...@sun.com wrote:
 I was specifically referring to a normal operations
 to conjure an image of a typical setup of fossil+venti.

 In such a setup a corrupted block from a fossil
 partition will go undetected and could end up
 being stored in venti. At that point it will become
 venti problem.

 i should have been more clear that venti does the
 checking.  there are many things that fossil doesn't
 do that it should.

 Sure, but I can't really use venti without using
 fossil (again: we are talking about a typical setup
 here not something like vac/vacfs), can I?

 If I can NOT then fossil becomes a weak link that
 can let corrupted data go undetected all the way
 to a venti store.

Fossil has always been a weak link, and probably will always be until
somebody replaces it. There was some idea of replacing it with a
version of ken's fs that uses a venti backend...

Venti's rock solid design is the only thing that makes fossil
minimally tolerable despite its usual tendency of stepping on its hair
and falling on its face.

uriel

 This is quite worrisome for me. At least compared to
 ZFS it is.

 Thanks,
 Roman.



Re: [9fans] Changelogs Patches?

2009-01-21 Thread erik quanstrom
 Fossil has always been a weak link, and probably will always be until
 somebody replaces it. There was some idea of replacing it with a
 version of ken's fs that uses a venti backend...

i looked into how that would go enough to see
that venti would work at cross purposes to the
fs.  having a worm address doesn't make much sense
when you can address by content.

in hindsight, that was likely obvious to everyone
but me.

i think ken's fs makes perfect sense without venti.
it has reasonable device support these days
(aoe, ata, ahci, marvell 88sx).

- erik



Re: [9fans] Changelogs Patches?

2009-01-20 Thread erik quanstrom
  in the case of zfs, my claim is that since zfs can reuse blocks, two
  vdev backups, each with corruption or missing data in different places
  are pretty well useless.
 
 
 Got it. However, I'm still not fully convinced there's a definite edge
 one way or the other. Don't get me wrong: I'm not trying to defend
 ZFS (I don't think it needs defending, anyway) but rather I'm trying
 to test my mental model of how both work.

if you end up rewriting a free block in zfs, there sure is.  you
can't decide which one is correct.

 P.S. Oh, and in case of ZFS a damaged vdev will be detected (and
 possibly re-silvered) under normal working conditions, while
 fossil might not even notice a corruption.

not true.  one of many score checks:

srv/lump.c:103: seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);

- erik



Re: [9fans] Changelogs Patches?

2009-01-20 Thread erik quanstrom
 1. What's the use of copying arenas to CD/DVD? Is it purely back up,
  since they have to stay on-line forever?

backup.

 2. Would fossil/venti notice silent data corruptions in blocks?

venti would.  the score wouldn't match the block.

 3. Do you think it's a good idea to have volume management be
 part of filesystems, since that way you can try to heal the data
 on-the-fly?

i think they are separate questions.  i see a couple of strong
disadvantages to combining volume management with the fs
- it's hard to reason about; zfs redundancy strategies
seem idiosyncratic.
- you need a different volume management solution for
non zfs needs.
- to manage the storage you need to be a zfs expert.
conversely to manage zfs you need to be a storage
expert.
- raid5 is very slow if you move the raid computation away
from the data as you need to move the data to the
computation.

 4. If I have a venti server and a bunch of sha1 codes, can I somehow
 instantiate a single fossil serving all of them under /archive?

i don't understand the question.

- erik



Re: [9fans] Changelogs Patches?

2009-01-20 Thread Roman V. Shaposhnik
On Tue, 2009-01-20 at 09:19 -0500, erik quanstrom wrote:
   in the case of zfs, my claim is that since zfs can reuse blocks, two
   vdev backups, each with corruption or missing data in different places
   are pretty well useless.
  
  
  Got it. However, I'm still not fully convinced there's a definite edge
  one way or the other. Don't get me wrong: I'm not trying to defend
  ZFS (I don't think it needs defending, anyway) but rather I'm trying
  to test my mental model of how both work.
 
 if you end up rewriting a free block in zfs, there sure is.  you
 can't decide which one is correct.

You don't have to decide. You get to use the generation # for that.

  P.S. Oh, and in case of ZFS a damaged vdev will be detected (and
  possibly re-silvered) under normal working conditions, while
  fossil might not even notice a corruption.
 
 not true.  one of many score checks:
 
 srv/lump.c:103:   seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);

I don't buy this argument for a simple reason: here's a very
easy example that proves my point:

term% fossil/fossil -f /tmp/fossil.bin
fsys: dialing venti at net!$venti!venti
warning: connecting to venti: Connection refused
term% mount /srv/fossil /n/f
term% cd /n/f/test
term% echo 'this  is innocent text' > text.txt
term% cat text.txt
this  is innocent text
term% dd -if /dev/cons -of /tmp/fossil.bin -bs 1 -count 8 -oseek 278528 -trunc 0
this WAS
8+0 records in
8+0 records out

term% rm /srv/fossil /srv/fscons
term% fossil/fossil -f /tmp/fossil.bin
fsys: dialing venti at net!$venti!venti
warning: connecting to venti: Connection refused
create /active/adm: file already exists
create /active/adm adm sys d775: create /active/adm: file already exists
create /active/adm/users: file already exists
create /active/adm/users adm sys 664: create /active/adm/users: file already 
exists
nuser 5 len 84
term% mount /srv/fossil /n/f2
term% cat /n/f2/test/text.txt
this WAS innocent text
term% 

Of course, with ZFS, the above corruption would always be
noticed and sometimes (depending on your vdev setup)
even silently fixed.

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-20 Thread erik quanstrom
   Got it. However, I'm still not fully convinced there's a definite edge
   one way or the other. Don't get me wrong: I'm not trying to defend
   ZFS (I don't think it needs defending, anyway) but rather I'm trying
   to test my mental model of how both work.
  
  if you end up rewriting a free block in zfs, there sure is.  you
  can't decide which one is correct.
 
 You don't have to decide. You get to use the generation # for that.
 

what generation number?  are there other things that your argument
depends on that you haven't mentioned yet?

  not true.  one of many score checks:
  
  srv/lump.c:103: seterr(EStrange, "lookuplump returned bad score %V not %V", u->score, score);
 
 I don't buy this argument for a simple reason: here's a very
 easy example that proves my point:
 
 term% fossil/fossil -f /tmp/fossil.bin
 fsys: dialing venti at net!$venti!venti
 warning: connecting to venti: Connection refused

well, there's your problem.  you corrupted
the cache, not the venti store.  (you have no
venti store in this example.)

i should have been more clear that venti does the
checking.  there are many things that fossil doesn't
do that it should.

- erik



Re: [9fans] Changelogs Patches?

2009-01-20 Thread andrey mirtchovski
 Is it how it was from the get go, or did you use venti-based solutions
 before?

it's how i found it.

 i have two zfs servers and about 10 pools of
 different sizes with several hundred different zfs filesystems and
 volumes of raw disk exported via iscsi.

 What kind of clients are on the other side of iscsi?

linux machines.

 You're using it on Linux?

the zfs servers are OpenSolaris boxes.

 Aha! And here are my first questions: you say that I can run multiple
 fossils
 off of the same venti and thus have a setup that is very close to zfs
 clones:
   1. how do you do that exactly? fossil -f doesn't work for me (nor should
 it according to the docs)

i meant formatting the fossil disk with flfmt -v, sorry. it had been
quite a while since i last had to restart from an old venti score :)

   2. how do you work around the fact that each fossil needs its own
partition (unlike ZFS where all the clones can share the same pool
of blocks)?

ultimately all blocks are shared on the same venti server unless you
use separate ones. fossil does have the functionality to serve two
different file systems from two different disks, but i don't  think
anyone has used that (but see example at the end).

 I think I understand it now (except for the fossil -f part), but how do
 you promote (zfs promote) such a clone?

i'm unconvinced that 'promoting' is a genuine feature: it seems to me
that the designers had to invent 'promoting' because they made the
decision to make snapshots read-only in the first place. perhaps i'm
wrong, but if the purpose of promoting something is to make it a true
member of the filesystem community (with all capabilities that
entails), then the corresponding feature in fossil would be to
instantiate one from the particular venti score for the dump. i.e.,
flfmt -v.

 I see what you mean, but in case of venti -- nothing disappears, really.
 From that perspective you can sort of make those zfs clones linger.
 The storage consumption won't be any different, right?

the storage consumption should be the same, i presume. my problem is
that in the case of zfs having several hundred snapshots significantly
degrades the performance of the management tools to the extent that
zfs list takes 30 seconds with about a thousand entries. compared to
fossil handling 5 years worth of daily dumps in less than a second.
but that's not really a serious argument ;)


 Great! I tried to do as much homework as possible (hence the delay) but
 I still have some questions left:
0. A dumb one: what's the proper way of cleanly shutting down fossil
and venti?

see fshalt. it used to be, like most other things, that one could just
turn the machine off without worry. then some bad things happened and
fshalt was written.

   1. What's the use of copying arenas to CD/DVD? Is it purely back up,
since they have to stay on-line forever?

people who back up to cd/dvd can answer that :)

   3. Do you think it's a good idea to have volume management be
   part of filesystems, since that way you can try to heal the data
   on-the-fly?

i don't know...

   4. If I have a venti server and a bunch of sha1 codes, can I somehow
   instantiate a single fossil serving all of them under /archive?

not sure if this will work. you'll need as many partitions as the sha1
scores you have. then for each do fossil/flfmt -v score partition.

once you've started fossil on the console type, for each partition/score:

fsys somename config partition
fsys somename venti ventiserver
fsys somename open

it's convoluted, yes. there may be an easier way. i know of people
using vacfs and vac to backup their linux machines to venti. actions
like the ones you're describing would be much easier there, although i
am not sure vacfs has all the functionality to be a usable file system
(for example, it's read-only).

for my personal $0.02 i will say that this argument seems to revolve
around trying to bend fossil and venti to match the functionality of
zfs and the design decisions of the team that wrote it. i, frankly,
think that it should be the other way around; zfs should provide the
equivalent of the fossil/venti snapshot/dump functionality to its
users. that, to me would be a benefit (of course it gets you sued by
netapp too, but that's beside the point). all these
filesystem/snapshot/clone games are just a bunch of toys to make the
admins happy and have little effective use for the end user.



Re: [9fans] Changelogs Patches?

2009-01-20 Thread Roman V. Shaposhnik
On Tue, 2009-01-20 at 18:36 -0500, erik quanstrom wrote:
Got it. However, I'm still not fully convinced there's a definite edge
one way or the other. Don't get me wrong: I'm not trying to defend
ZFS (I don't think it needs defending, anyway) but rather I'm trying
to test my mental model of how both work.
   
   if you end up rewriting a free block in zfs, there sure is.  you
   can't decide which one is correct.
  
  You don't have to decide. You get to use the generation # for that.
  
 
 what generation number? 

I'm talking about a field in each ZFS block pointer. The
field is actually called birth txg, but I thought alluding
to VtEntry.gen would make it easier to understand what I had
in mind.

 are there other things that your argument
 depends on that you haven't mentioned yet?

Fair question. It depends on at least a cursory reading
of the ZFS on-disk specification. I felt uneasy in this
conversation precisely because I had a very vague recollection
of Venti/Fossil paper. I guess it cuts both ways:
   http://opensolaris.org/os/community/zfs/docs/ondiskformat0822.pdf

  term% fossil/fossil -f /tmp/fossil.bin
  fsys: dialing venti at net!$venti!venti
  warning: connecting to venti: Connection refused
 
 well, there's your problem.  you corrupted
 the cache, not the venti store.  (you have no
 venti store in this example.)

I was specifically referring to normal operations
to conjure an image of a typical setup of fossil+venti.

In such a setup a corrupted block from a fossil 
partition will go undetected and could end up
being stored in venti. At that point it will become
venti problem.

 i should have been more clear that venti does the
 checking.  there are many things that fossil doesn't
 do that it should.

Sure, but I can't really use venti  without using 
fossil (again: we are talking about a typical setup
here not something like vac/vacfs), can I? 

If I can NOT, then fossil becomes a weak link that
can let corrupted data go undetected all the way
to a venti store.

This is quite worrisome for me. At least compared to
ZFS it is.

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-20 Thread erik quanstrom
  well, there's your problem.  you corrupted
  the cache, not the venti store.  (you have no
  venti store in this example.)
 
 I was specifically referring to normal operations
 to conjure an image of a typical setup of fossil+venti.
 
 In such a setup a corrupted block from a fossil 
 partition will go undetected and could end up
 being stored in venti. At that point it will become
 venti problem.

it's important to keep in mind that fossil is just a write buffer.
it is not intended for the permanent storage of data.  while
corrupt data could end up in venti, the exposure lies only
between snapshots.  you can rollback to the previous good
score and continue.

ken's fs has a proper cache.  a corrupt cache can be recovered
from by dumping the cache and restarting from the last good
superblock.  in the days when the fs was really a worm stored
on mo disks, the worm was said to be very reliable storage.
with raid+scrubbing we try to overcome the limitations of
magnetic media.  while there isn't any block checksum, there is a
block tag.  tag checking has spotted a few instances of
corruption on my fs.  fs-level checksumming and encryption
are definitely something i've considered.  actually, with tags
and encryption, checksumming is not necessary for error
detection.

- erik



Re: [9fans] Changelogs Patches?

2009-01-20 Thread Steve Simon
 ... fossil does have the functionality to serve two
 different file systems from two different disks, but i don't  think
 anyone has used that ...

I do this, 'main' backed up by venti and 'other' which holds useful stuff
that needn't be backed up, e.g. RFCs, cdrom images, datasheets etc. This is
accessed via 9fs juke as an homage to the CDROM jukebox that once provided
a similar filesystem at the labs.

-Steve



Re: [9fans] Changelogs Patches?

2009-01-19 Thread Roman Shaposhnik

I think I'm now ready to pick up this old thread (if anybody's still
interested...)

On Jan 7, 2009, at 5:11 PM, erik quanstrom wrote:

Let's see. Maybe it's my misinterpretation of what venti does. But so
far as I understand, it boils down to: I give venti a block of any
length, it gives me a score back. Now internally, venti might decide


just a clarification.  this is done by the client.  from venti(6):
  Files and Directories
 Venti accepts blocks up to 56 kilobytes in size. By conven-
 tion, Venti clients use hash trees of blocks to represent
 arbitrary-size data files. [...]


Right. This, by the way, suggests that the onus is on the clients
to help venti reuse as many blocks as possible. Have there been
any established practices of finding the best cut-here points?


But even in the former case I don't see how the corruption could be
possible. Please elaborate.


i didn't say there would be corruption.  i assumed corruption
and outlined how one could recover the maximal set of data
and have a consistent fs (assuming the damage doesn't cut a
full strip across all backups) by simply picking a good
block at each lba from the available damaged and/or incomplete
backups, which may originate at different times.  (russ was the
first that i know of to put this into practice.)

in the case of zfs, my claim is that since zfs can reuse blocks, two
vdev backups, each with corruption or missing data in different places
are pretty well useless.



Got it. However, I'm still not fully convinced there's a definite edge
one way or the other. Don't get me wrong: I'm not trying to defend
ZFS (I don't think it needs defending, anyway) but rather I'm trying
to test my mental model of how both work.

We assume a damaged set of arenas for venti and a damaged set
of vdevs for ZFS. Everything is off-line at that point and we are
running strictly in forensics mode. The show, basically, consists of three acts:
1. salvaging as many good data blocks as possible
2. building higher-order structures out of primary data blocks
3. trying to rebuild as much of a consistent FS as possible
 using all the available blocks

It seems to me that #1 and #2 are 100% the same in terms of
the probability of success. In fact, one might claim that ZFS has
a slight edge because of:
 a. volume management being part of the FS
 b. the ditto blocks IOW every block pointer having up to
 3 alternative locations for the block it points to
The net result is that you might end up with more good blocks
to choose from in ZFS world, than in venti's case. Which brings
us to #3.

Once again, we might have more blocks to choose from than
we want (including free blocks) but the generation number
should be enough of a clue to filter unwanted things out.

Thanks,
Roman.

P.S. Oh, and in case of ZFS a damaged vdev will be detected (and
possibly re-silvered) under normal working conditions, while
fossil might not even notice a corruption.




Re: [9fans] Changelogs Patches?

2009-01-07 Thread Roman V. Shaposhnik
On Tue, 2009-01-06 at 18:44 -0500, erik quanstrom wrote:
  a big difference between the decisions is in data integrity.
  it's much easier to break a fs that rewrites than it is a 
  worm-based fs.
  
  True. But there's a grey area here: an FS that *never* rewrites
  live blocks, but can reclaim dead ones. That's essentially
  what ZFS does.
 
 unfortunately, i would think that can result in data loss since
 i can no longer take a set of copies of the fs {fs_0, ... fs_n}
 and create a new copy with all the data possibly recovered
 by picking a set of good blocks from the fs_i, since i can make
 a block dead by removing the file using it and i can make it
 live again by writing a new file.
 
 perhaps i've misinterpreted what you are saying?

Let's see. Maybe it's my misinterpretation of what venti does. But so
far as I understand, it boils down to: I give venti a block of any
length, it gives me a score back. Now internally, venti might decide
to split that huge block into a series of smaller ones and store it
as a tree. But still all I get back is a single score. I don't care
whether that score really describes my raw data block, or a block full
of scores that actually describe raw data. All I care is that when
I give venti that score back -- it'll reconstruct the data. I also
have a guarantee that the data will never ever be deleted. 

Now, because of that guarantee (blocks are never deleted) and since
all blocks bigger than 56k get split venti has a nice property of
reusing blocks from existing trees. This happens as a by-product
of the design: I ask venti to store a block and if that same block
was already there -- there will be an extra arrow pointing at it.
All in all -- very compact way of representing a forest of trees.
Each tree corresponds to a VtEntry data structure and blocks full
of VtEntry structures are called VtEntryDir's. Finally a root 
VtEntryDir is pointed at by VtRoot structure.

Contrast this with ZFS, where blocks are *not* addressed via scores,
but rather with a vdev:offset pairs called DVAs. This, of course,
means that there's no block coalescing going on. You ask ZFS to store
a block it gives you a DVA back. You ask it to store the same block
again, you get a different DVA (well, actually it gives you a block
pointer which is DVA augmented by extra stuff).

That fundamental property of ZFS makes it impossible to have a
single block implicitly referenced by multiple trees, unless the
block happens to be part of an explicit snapshot of the same object
at some later point in time.

Thus, when there's a need to modify an existing object, ZFS never
touches the old blocks. It build a tree of blocks, *explicitly*
reusing those blocks that haven't changed. When it is done building
the new tree the old one is still the active one. The last transaction
that happens updates an uberblock (ZFS speak for VtRoot) in an atomic
fashion, thus making the new tree the active one. The old tree is still
around at that point; if it is not part of a snapshot, it can be
garbage collected and its blocks freed; if it is part of a snapshot,
it is preserved. In the latter case the behavior seems
to be exactly what venti does.

But even in the former case I don't see how the corruption could be
possible. Please elaborate.

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-07 Thread erik quanstrom
 Let's see. Maybe it's my misinterpretation of what venti does. But so
 far as I understand, it boils down to: I give venti a block of any
 length, it gives me a score back. Now internally, venti might decide

just a clarification.  this is done by the client.  from venti(6):
   Files and Directories
  Venti accepts blocks up to 56 kilobytes in size. By conven-
  tion, Venti clients use hash trees of blocks to represent
  arbitrary-size data files. [...]


 But even in the former case I don't see how the corruption could be
 possible. Please elaborate.

i didn't say there would be corruption.  i assumed corruption
and outlined how one could recover the maximal set of data
and have a consistent fs (assuming the damage doesn't cut a
full strip across all backups) by simply picking a good
block at each lba from the available damaged and/or incomplete
backups, which may originate at different times.  (russ was the
first that i know of to put this into practice.)

in the case of zfs, my claim is that since zfs can reuse blocks, two
vdev backups, each with corruption or missing data in different places
are pretty well useless.

- erik




Re: [9fans] Changelogs Patches?

2009-01-06 Thread erik quanstrom
  I'm still trying to figure out what kind of approximation of the
  above would be possible with fossil/venti.
 
  how about making a copy?  venti will coalesce duplicate blocks.
 
 But wouldn't you still need to send these blocks over the wire (thus
 consuming bandwidth and time)?

key word approximation.  ☺

assuming that not all of your tree is in cache,
moving the blocks over the wire would be much
faster than the disk access.  assuming just gbe,
you should be able to copy 50mb/s out of and
back into the same venti server.  how big are
your snapshots that this would be a problem?

i don't know enough about fossil's structure, but i think
you could write a specialized.

- erik



Re: [9fans] Changelogs Patches?

2009-01-06 Thread andrey mirtchovski
i'm using zfs right now for a project storing a few terabytes worth of
data and vm images. i have two zfs servers and about 10 pools of
different sizes with several hundred different zfs filesystems and
volumes of raw disk exported via iscsi. clones play a vital part in
the whole set up (they number in the thousands). for what it's worth,
zfs is the best thing in linux-world (sorry, solaris and *bsd too) for
that kind of task. my comment is that, coming from fossil/venti, zfs
feels just a bit more convoluted and there are more special cases that
seem like a design mishap at least when compared to what i'm used to
in the world i'm coming from.

i'll try to explain:

 Fair enough. But YourTextGoesHere then becomes a transient property
 of my namespace, where in case of ZFS it is truly a tag for a snapshot.

all snapshots have tags: their top-level sha1 score. what i supplied
was simply a way to translate that to any random text. you don't need
to, nor do you have to do this (by the way, do you get the irony of
forcing snapshots to contain the '@' character in their name? sounds a
lot like '#' to me ;)

snapshots are generally accessible via fossil as a directory with the
date of the snapshot as its name. this starts making more sense when
you take into consideration that snapshots are global per fossil, but
then you can run several fossils without having them step on their
toes when it comes to venti. at least until you get a collision in
blocks' hashes.

in fact, i'm so used to fossil's dated snapshots that in my setup i
have restricted 'YourTextGoesHere' to actually be a date. that gives
me so much more context in the case where something goes wrong and i
have to go back through the snapshots for a filesystem or a volume to
find the last known good one.

 Well, strictly speaking Solaris does have a reasonable approximation
 of bind in a form of lofs -- so remapping default ZFS mount point to
 something else is not a big deal.

did not know that


  $ zfs clone pool/projects/f...@yourtextgoeshere pool/projects/branch

 that's as simple as starting a new fossil with -f 'somehex', where
 somehex is the score of the corresponding snap.

 this gives you both read-only snapshots,

 Meaning?

venti is write-once. if you instantiate a fossil from a venti score it
is, by definition, read-only, as all changes to the current fossil
will not appear to another fossil instantiated from the same venti
score. changes are committed to venti once you do a fossil snap,
however that automatically generates a new snapshot score (not
modifying the old one). it should be clear from the paper.

 - snapshots are read only and generally unmountable (unless you go
 through the effort of making them so by setting a special option,
 which i'm not sure is per-snapshot)

 Huh? That's weird -- I routinely access them via
 /pool/fs/.zfs/snapshot/snapshot name
 and I don't remember setting any kind of options. The visibility
 of .zfs can be tweaked, but all it really affects is Tab in bash ;-)

 - clones can only be created off of snapshots

 But that does sound reasonable. What else is there except snapshots
 and an active tree? Or are you objecting to the extra step that is
 needed where you really want to clone the active tree?

i have .zfs exports turned off (it's off by default) because the
read-only snapshots are useless in my environment. instead i must
create clones off one or many snapshots and keep track and delete them
when their tasks have been accomplished.

this is an example of the design decision difference between
fossil/venti and zfs: venti commits storage permanently and everything
becomes a snapshot, while the designers of zfs decided to create a
two-stage process introducing a read-only intermediary between the
original data and a read-write access to it independent of other
clients.

where the second choice becomes a nuisance for me is in the case where
one has thousands of clones and needs to keep track of thousands of
names in order to ensure that when the right one has finished the
right clone disappears. it's good that zfs can handle so many,
otherwise it would've been useless.

note that other systems take the plan9 approach to heart: qemu for
example has the -snapshot argument which allows me to boot many VMs,
fossil-style, off a single vm image without worrying whether they'll
step on each other's toes. that way seems so much simpler and natural
to me, but then i'm jaded by venti :)

 - clones are read-writable but they can only be mounted within the
 /pool/fs/branch hierarchy. if you want to share them you need to
 explicitly adjust a lot of zfs settings such as 'sharenfs' and so on;

 In general -- this is true :-( But I think there's a way now to do that.
 If you're really interested -- I can take a look and let you know.

my problem is with the local/remote duality of exports: if i create a
zfs cloned filesystem it's immediately locally available and perhaps
(via 'sharenfs' inheritance from its parent) 

Re: [9fans] Changelogs Patches?

2009-01-06 Thread erik quanstrom
very interesting post.

 this is an example of the design decision difference between
 fossil/venti and zfs: venti commits storage permanently and everything
 becomes a snapshot, while the designers of zfs decided to create a
 two-stage process introducing a read-only intermediary between the
 original data and a read-write access to it independent of other
 clients.

a big difference between the decisions is in data integrity.
it's much easier to break a fs that rewrites than it is a worm-based
fs.  even if the actual media are the same.  and a broken rewriting
fs is much harder to recover.  russ wrote up a bit on recovering one
good venti from an old copy and a damaged current venti.  this
same approach, (basically fs | fs') works for any worm fs.

 from a remote node. if i create a zfs cloned volume i need to arrange
 an iscsi method of access from a remote node. both nfs and iscsi have
 a host of nasty settings that need to be correct on both ends in order
 for things to work right. i can never hope to export an nfs share
 outside my DMZ.
 
 i don't see a solution to this problem: the unix world is committed to
 nfs and a bit less so to iscsi. i'm more of a 9p guy myself though, so
 i listed it as a complaint.

oh, my perfect chance to shill aoe!  how to configure aoe on plan 9
echo bind /net/ether0 >/dev/aoe/ctl
now for the hard part
# (this space intentionally left blank.)

- erik




Re: [9fans] Changelogs Patches?

2009-01-06 Thread Roman V. Shaposhnik
On Tue, 2009-01-06 at 11:19 -0500, erik quanstrom wrote:
 very interesting post.

indeed. I actually need some time to digest it ;-)

  this is an example of the design decision difference between
  fossil/venti and zfs: venti commits storage permanently and everything
  becomes a snapshot, while the designers of zfs decided to create a
  two-stage process introducing a read-only intermediary between the
  original data and a read-write access to it independent of other
  clients.
 
 a big difference between the decisions is in data integrity.
 it's much easier to break a fs that rewrites than it is a 
 worm-based fs.

True. But there's a grey area here: an FS that *never* rewrites
live blocks, but can reclaim dead ones. That's essentially
what ZFS does.

  i don't see a solution to this problem: the unix world is committed to
  nfs and a bit less so to iscsi. i'm more of a 9p guy myself though, so
  i listed it as a complaint.
 
 oh, my perfect chance to shill aoe!  how to configure aoe on plan 9
  echo bind /net/ether0 >/dev/aoe/ctl
 now for the hard part
   # (this space intentionally left blank.)

;-)

What's your personal experience on aoe vs. iscsi?

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-06 Thread erik quanstrom
 a big difference between the decisions is in data integrity.
 it's much easier to break a fs that rewrites than it is a 
 worm-based fs.
 
 True. But there's a grey area here: an FS that *never* rewrites
 live blocks, but can reclaim dead ones. That's essentially
 what ZFS does.

unfortunately, i would think that can result in data loss since
i can no longer take a set of copies of the fs {fs_0, ... fs_n}
and create a new copy with all the data possibly recovered
by picking a set of good blocks from the fs_i, since i can make
a block dead by removing the file using it and i can make it
live again by writing a new file.

perhaps i've misinterpreted what you are saying?

 What's your personal experience on aoe vs. iscsi?

i have no iscsi experience.

aoe has been pretty fun to work with.  the spec can
be read in half an hour.  (it's maybe ten pages.)  i
implemented a virtual aoe target for plan 9, vblade,
from scratch on a friday evening.

- erik




Re: [9fans] Changelogs Patches?

2009-01-05 Thread Roman Shaposhnik

On Jan 4, 2009, at 9:12 PM, erik quanstrom wrote:

Well, I guess I really got spoiled by ZFS's ability to do things like
   $ zfs snapshot pool/projects/f...@yourtextgoeshere
and especially:
   $ zfs clone pool/projects/f...@yourtextgoeshere pool/projects/ 
branch


I'm still trying to figure out what kind of approximation of the
above would be possible with fossil/venti.


how about making a copy?  venti will coalesce duplicate blocks.


But wouldn't you still need to send these blocks over the wire (thus
consuming bandwidth and time)?

Thanks,
Roman.



Re: [9fans] Changelogs Patches?

2009-01-05 Thread Roman Shaposhnik

Cool! Looks like I found a bi-lingual person! ;-) Andrey,
would you mind if I ask you to translate some other things
between ZFS and venti/fossil for me?

On Jan 4, 2009, at 9:24 PM, andrey mirtchovski wrote:

Well, I guess I really got spoiled by ZFS's ability to do things like
  $ zfs snapshot pool/projects/f...@yourtextgoeshere


at the console type snap. if you're allowing snaps to be mounted on
the local fs then the equivalent would be mkdir /YourTextGoesHere;
bind /n/dump/... / /YourTextGoesHere.


Fair enough. But YourTextGoesHere then becomes a transient property
of my namespace, where in case of ZFS it is truly a tag for a snapshot.


note that zfs restricts where
the snapshot can be mounted :p venti snapshots are, by default, read
only.


Well, strictly speaking Solaris does have a reasonable approximation
of bind in a form of lofs -- so remapping default ZFS mount point to
something else is not a big deal.


  $ zfs clone pool/projects/f...@yourtextgoeshere pool/projects/branch


that's as simple as starting a new fossil with -f 'somehex', where
somehex is the score of the corresponding snap.

this gives you both read-only snapshots,


Meaning?


and as many clones as you wish.


Cool!


note that you're cheating here, and by quite a bit:


Lets see about that ;-)


- snapshots are read only and generally unmountable (unless you go
through the effort of making them so by setting a special option,
which i'm not sure is per-snapshot)


Huh? That's weird -- I routinely access them via
 /pool/fs/.zfs/snapshot/snapshot name
and I don't remember setting any kind of options. The visibility
of .zfs can be tweaked, but all it really affects is Tab in bash ;-)


- clones can only be created off of snapshots


But that does sound reasonable. What else there is except snapshots
and an active tree? Or are you objecting to the extra step that is
needed where you really want to clone the active tree?


- clones are read-writable but they can only be mounted within the
/pool/fs/branch hierarchy. if you want to share them you need to
explicitly adjust a lot of zfs settings such as 'sharenfs' and so on;


In general -- this is true :-( But I think there's a way now to do that.
If you're really interested -- I can take a look and let you know.


- none of this can be done remotely


Meaning?


- libzfs has an unpublished interface, so if one wants to, say, write
a 9p server to expose zfs functionality to remote hosts they must
either reverse engineer libzfs or use other means.



This one is a bit unfair. The interface is published alright. As much
as anything in Open Source is. It is also documented at the level
that would be considered reasonable for Linux. The fact that
it is not *stable* is what makes the usually thorough Solaris
documentation fall short here.

But all in all, following along doesn't require much more extra
effort compared to following along any other evolving OS
project.

And yes, the situation has changed compared to what it used to
be when Solaris 10 just came out. If you had bad experience
with libzfs some time ago -- I'm sorry, but if you try again you
might find it more to your liking.

Thanks,
Roman.



Re: [9fans] Changelogs Patches?

2009-01-04 Thread Roman V. Shaposhnik
On Sun, 2009-01-04 at 07:03 +0900, sqweek wrote:
 On Tue, Dec 30, 2008 at 8:54 AM, Roman Shaposhnik r...@sun.com wrote:
  Personally, though, I'd say that the usefulness of the
  dump would be greatly improved
  if one had an ability to do ad-hoc archival snapshots AND assigning tags,
  not only dates to them.
 
  Tags don't make that much sense in this context since the dump is for
 the whole filesystem, not a specific project.

Well, as Charles pointed out -- in case of Plan9 development the whole
system is the entire project.

 However, tagging a source tree can be done with a simple dircp. 
 It's not as though the duplicate data costs you anything when you're 
 backed by venti.

Hm. Good point. Although timing-wise, I'd expect dircp to be dreadfully
slow.

Well, I guess I really got spoiled by ZFS's ability to do things like
$ zfs snapshot pool/projects/f...@yourtextgoeshere
and especially:
$ zfs clone pool/projects/f...@yourtextgoeshere pool/projects/branch

I'm still trying to figure out what kind of approximation of the above
would be possible with fossil/venti.

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2009-01-04 Thread andrey mirtchovski
 Well, I guess I really got spoiled by ZFS's ability to do things like
$ zfs snapshot pool/projects/f...@yourtextgoeshere

at the console type snap. if you're allowing snaps to be mounted on
the local fs then the equivalent would be mkdir /YourTextGoesHere;
bind /n/dump/... / /YourTextGoesHere. note that zfs restricts where
the snapshot can be mounted :p venti snapshots are, by default, read
only.

$ zfs clone pool/projects/f...@yourtextgoeshere pool/projects/branch

that's as simple as starting a new fossil with -f 'somehex', where
somehex is the score of the corresponding snap.

this gives you both read-only snapshots, and as many clones as you wish.
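the parallel above can be condensed into a rough side-by-side transcript
(not runnable as-is; the score and paths are invented for illustration,
and the exact console syntax should be checked against fossil(4) and
fossilcons(8)):

```
# ZFS
zfs snapshot pool/projects/fs@tag
zfs clone pool/projects/fs@tag pool/projects/branch

# fossil/venti (hypothetical names and score)
con /srv/fscons          # attach to the fossil console
fsys main snap -a        # archival snap; fossil prints a venti score,
                         # say vac:0a1b2c...
fossil/fossil -f 0a1b2c...
                         # as described above: a new fossil rooted at
                         # that score, i.e. a writable clone of the snap
```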

note that you're cheating here, and by quite a bit:

- snapshots are read only and generally unmountable (unless you go
through the effort of making them so by setting a special option,
which i'm not sure is per-snapshot)

- clones can only be created off of snapshots

- clones are read-writable but they can only be mounted within the
/pool/fs/branch hierarchy. if you want to share them you need to
explicitly adjust a lot of zfs settings such as 'sharenfs' and so on;

- none of this can be done remotely

- libzfs has an unpublished interface, so if one wants to, say, write
a 9p server to expose zfs functionality to remote hosts they must
either reverse engineer libzfs or use other means.

so, while i'm sure you enjoy zfs quite a bit, for others used to
plan9's venti/fossil way of doing things zfs can be quite a pain.



Re: [9fans] Changelogs Patches?

2009-01-04 Thread erik quanstrom
 fossil/venti through the lens of ZFS. I guess it's not a coincidence
 that ZFS actually has a built-in support for the kind of history
 transfer you were implementing.

the transfer would have been trivial, had the filesystems been
compatable.  what i did was reenact the actions that built the
original fs on the new fs by manipulating the clock on the target.

- erik




Re: [9fans] Changelogs Patches?

2009-01-03 Thread sqweek
On Tue, Dec 30, 2008 at 8:54 AM, Roman Shaposhnik r...@sun.com wrote:
 Personally, though, I'd say that the usefulness of the
 dump would be greatly improved
 if one had an ability to do ad-hoc archival snapshots AND assigning tags,
 not only dates to them.

 Tags don't make that much sense in this context since the dump is for
the whole filesystem, not a specific project. However, tagging a
source tree can be done with a simple dircp. It's not as though the
duplicate data costs you anything when you're backed by venti.
-sqweek



Re: [9fans] Changelogs Patches?

2008-12-30 Thread Uriel
Knowing *who* made the change is often even more useful than the change comment.

uriel

On Tue, Dec 30, 2008 at 2:48 AM, Charles Forsyth fors...@terzarima.net wrote:
 i've rarely found per-change histories to be any more useful than
 most other comments, i'm afraid.

And that meant that math texts and math teaching was   all about polished
final results.

 ah. my statement was ambiguous.
 i meant per-change chatter in the history, not the changes in the history.
 it's fine to have the chatter, but it isn't essential, because nothing
 relies on it, in the sense that the chatter causes the system to change its
 behaviour.





Re: [9fans] Changelogs Patches?

2008-12-30 Thread C H Forsyth
Knowing *who* made the change is often even more useful than the change 
comment.

yes. i use ls -lm on our trees, but that might not work on less direct things 
like sources.



Re: [9fans] Changelogs Patches?

2008-12-30 Thread Uriel
On Tue, Dec 30, 2008 at 4:06 PM, C H Forsyth fors...@vitanuova.com wrote:
Knowing *who* made the change is often even more useful than the change 
comment.

 yes. i use ls -lm on our trees, but that might not work on less direct things 
 like sources.

It would work if the development trees were public...

uriel



Re: [9fans] Changelogs Patches?

2008-12-30 Thread Noah Evans
http://code.google.com/hosting/createProject


On Tue, Dec 30, 2008 at 12:31 PM, Uriel urie...@gmail.com wrote:
 On Tue, Dec 30, 2008 at 4:06 PM, C H Forsyth fors...@vitanuova.com wrote:
Knowing *who* made the change is often even more useful than the change 
comment.

 yes. i use ls -lm on our trees, but that might not work on less direct 
 things like sources.

 It would work if the development trees were public...

 uriel





Re: [9fans] Changelogs Patches?

2008-12-29 Thread Roman Shaposhnik

On Dec 26, 2008, at 5:27 AM, Charles Forsyth wrote:

while a descriptive history is good, it takes a lot of extra work
to generate.


i've rarely found per-change histories to be any more useful than  
most other comments, i'm afraid.


I believe that it all depends on what it is that you look at source
code for. A long time ago I used
to study mathematics. Soviet mathematical schooling was really quite
exceptional, but there
was one thing that I now wish was different. You see, soviet math got  
a Bourbaki virus in its
early childhood. And that meant that math texts and math teaching was  
all about polished
final results. None of that messy and disgusting process of actually  
discovering those results.

None. The process itself was considered too imprecise and muddy:
   "Rigor consisted in getting rid of an accretion of superfluous
details. Conversely, lack of rigor gave my father an impression of a
proof where one was walking in mud, where one had to pick up some sort
of filth in order to get ahead. Once that filth was taken away, one
could get at the mathematical object, a sort of crystallized body whose
essence is its structure."

  From: 
http://ega-math.ru/Cartier.htm
And thus the circle of those who "just got it" was formed.

Back when I was a student, I wanted to belong to that circle so badly,  
that I missed a fundamental
point: the very creation of the circle turned all of us from active  
participants in the process into
art gallery goers. And that was a fine change for those who just  
wanted to appreciate fine
math, but was a kiss of death for less gifted individuals who wanted  
to do math themselves (I won't
touch the subject of whether less gifted individuals are supposed to  
do math in the first place, since

its too personal and painful).

Ok, with math it is a bit difficult to have the records of the process  
AND the final object at the
same time (well, good teachers understood that and their lectures were  
the ones worth attending).
But in software engineering we DO have a chance to have our cake and  
eat it too. Albeit only
if we put as much focus on maintaining history (our records of the  
process) as we put on

maintaining the code itself (final results).

the advantage of dump and snap is that the scope is the whole  
system: including emails, discussion documents,
the code, supporting tools -- everything in digital form.  if  
software works differently today
compared to yesterday, then in most cases, i'd expect 9fs dump to  
make it easy to track down the
set of differences, and narrow the search to the culprit. it might  
not even be a source change,

but a configuration file, or a file was moved or removed.


I don't deny that 9fs dump is quite useful and it seems to match the  
organization of Plan9 developer
club pretty well. Personally, though, I'd say that the usefulness of  
the dump would be greatly improved
if one had an ability to do ad-hoc archival snapshots AND assigning  
tags, not only dates to them.


That would, in effect, bring the whole process quite close to what  
established SCMs do. With
the only major feature (the ability to easily trade history between  
different hosts) still missing.


Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2008-12-29 Thread hiro
So is it time for a new file server then? :D



Re: [9fans] Changelogs Patches?

2008-12-29 Thread Roman Shaposhnik

On Dec 27, 2008, at 3:56 AM, erik quanstrom wrote:
I'm actually still trying to figure out how replica/* fits together with
sources being a fossil server. These two, somehow, have to
click, but I haven't figured out the connection just yet. Any pointers
to the good docs?


there's no connection.  replica would work without a fossil server.
for that matter, replica would work without a dump.  all you need
is an original and a changed version.


Got it. The bit that I didn't quite get initially was the fact that there's
history accumulated in dumps and that history might need to be
transferred *exactly* like it is to a different fileserver. And with replica
only transferring the end result (the present moment, in history
terms) there seemed to be a missing link...


i used replica (plus a few additional tools) to make a faithful copy
of the coraid fileserver.  http://www.quanstro.net/plan9/history.pdf



...but your article answered that last question completely. Although,
I wonder whether direct transfer of history between two venti
servers would be possible.

Thanks,
Roman.

P.S. I also didn't quite understand the business of synchronizing Qids.
I have always thought that they are only meaningful for the duration
of the server's lifetime and thus all applications are quite immune to
potential Qid changes as long as the connection get dropped and
re-established. Or was it that your goal was to migrate so seamlessly
that *running* applications wouldn't notice? 



Re: [9fans] Changelogs Patches?

2008-12-29 Thread erik quanstrom
 ...but your article answered that last question completely. Although,
 I wonder whether direct transfer of history between two venti
 servers would be possible.

if one were to transfer history between two fs with the same on-disk
format, a simple device copy would be sufficient.  i was moving from
a 32-bit 4k block fs to geoff's 64 bit work with 8k blocks.

history is not a property of venti.  venti is a sparse virtual drive
with ~2^80 bits storage.  blocks are addressed by sha1 hash of
their content. fossil is the fileserver.  the analogy would be a change
in fossil format.  my technique would work for fossil, too.
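the addressing scheme described above (blocks named by the sha1 hash of
their content, so identical blocks are stored once) can be sketched as a
toy in a few lines; this is only the idea, not the real venti protocol:

```python
import hashlib

class BlockStore:
    """Toy content-addressed block store in the spirit of venti:
    a block's address (its "score") is the SHA-1 hash of its content,
    so writing the same content twice costs no extra storage."""

    def __init__(self):
        self.blocks = {}

    def write(self, data: bytes) -> str:
        score = hashlib.sha1(data).hexdigest()
        self.blocks[score] = data  # rewriting identical content is a no-op
        return score

    def read(self, score: str) -> bytes:
        return self.blocks[score]

store = BlockStore()
a = store.write(b"hello")
b = store.write(b"hello")      # duplicate block: same score comes back
assert a == b
assert len(store.blocks) == 1  # stored exactly once
assert store.read(a) == b"hello"
```

this is also why a plain dircp amounts to a cheap tag: the copy's blocks
hash to scores that are already present.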

 P.S. I also didn't quite understand the business of synchronizing Qids.
 I have always thought that they are only meaningful for the duration
 of the server's lifetime and thus all applications are quite immune to
 potential Qid changes as long as the connection get dropped and
 re-established. Or was it that your goal was to migrate so seamlessly
 that *running* applications wouldn't notice? 

that's okay.  russ thinks i'm nuts on this point, too.

perhaps the paper wasn't fully clear.  i wanted to make the assertion
that if on the original fs,qid(patha) == qid(pathb) then on the new
fs, qid(patha') == qid(pathb').  the qids weren't the same.  for
various reasons (i.e.  not every copy of every file makes it to a
dump), they can't be.  it's just a very complicated way of saying, i
didn't want to recopy the same data needlessly and increase the size
of the fs.  i just couldn't think of an easy way of making the same
assertion another way without reading every file for each day of
the dump.  remember, the original fs was a pentium ii with a
100mbit ethernet card.  it still took 2 weeks to copy the data
to the new fs.
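the assertion above, that qid(patha) == qid(pathb) on the old fs implies
qid(patha') == qid(pathb') on the new fs, amounts to checking that the
old-qid to new-qid correspondence never splits a shared qid in two.  a
toy check (the qid values are invented):

```python
def aliasing_preserved(pairs):
    """pairs is a list of (old_qid, new_qid) for corresponding paths on
    the original and copied filesystems.  Aliasing is preserved if every
    old qid maps to exactly one new qid: paths that shared a qid (i.e.
    shared storage) before still share one after, so no data is
    needlessly duplicated on the new fs."""
    mapping = {}
    for old, new in pairs:
        # setdefault records the first new qid seen for this old qid;
        # any later pair must agree with it
        if mapping.setdefault(old, new) != new:
            return False
    return True

# two paths shared qid 7 and both became qid 41: aliasing preserved
assert aliasing_preserved([(7, 41), (7, 41), (9, 55)])
# the shared file was copied twice, getting two new qids: duplication
assert not aliasing_preserved([(7, 41), (7, 42)])
```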

and russ is right in that it was overkill.  but, hey, if it's worth doing,
it's worth doing in grand excess.

oh, by the way, the replica db's are reusable.  they could also, if
one wished, be generated by the fs as part of the dump process.

- erik




Re: [9fans] Changelogs Patches?

2008-12-29 Thread erik quanstrom
 I don't deny that 9fs dump is quite useful and it seems to match the  
 organization of Plan9 developer
 club pretty well. Personally, though, I'd say that the usefulness of  
 the dump would be greatly improved
 if one had an ability to do ad-hoc archival snapshots AND assigning  
 tags, not only dates to them.

i can't recommend reading ken's kernel (the fs) enough.
it's recognizable as related to plan 9, but it is much simpler.
it can afford to be static.

it would be nifty, if an early version of the fs
(with typedef long Device) could be put up on sources
for historical interest.

- erik




Re: [9fans] Changelogs Patches?

2008-12-29 Thread Charles Forsyth
 i've rarely found per-change histories to be any more useful than  
 most other comments, i'm afraid.

And that meant that math texts and math teaching was   all about polished
final results.

ah. my statement was ambiguous.
i meant per-change chatter in the history, not the changes in the history.
it's fine to have the chatter, but it isn't essential, because nothing
relies on it, in the sense that the chatter causes the system to change its
behaviour.



Re: [9fans] Changelogs Patches?

2008-12-27 Thread Roman Shaposhnik

On Dec 25, 2008, at 8:57 PM, Anthony Sorace wrote:

erik offered some suggestions for hosting various bits of things
outside 9vx and connecting to that in order to get the dumps. those
options are valid, but you can just as well host the entire thing
within 9vx. it's not the default configuration, but i believe
instructions are out there (9fans or the wiki).

using fossil for your root, instead of #Z, will obviously cost you the
benefits of #Z - namely, the pass-through transparency.


That's good advice. Thanks. I wonder, however, if such a transparency
can be achieved the other way around -- serving my entire home
directory via fossil from plan9port under UNIX and 9vx. Has anyone
tried such a config?


if your primary interest is for replica/*,


I'm actually still trying to figure out how replica/* fits together with
sources being a fossil server. These two, somehow, have to
click, but I haven't figured out the connection just yet. Any pointers
to the good docs?

Thanks,
Roman.



Re: [9fans] Changelogs Patches?

2008-12-27 Thread tlaronde
On Sat, Dec 27, 2008 at 06:04:42AM +, Eris Discordia wrote:
 it all begins with Adam and Steve, as Brian Stuart suggests, ways have 
 been found of managing large teams of people with different specializations 
 and those ways work. The Mgmt has a raison d'etre, despite what 
 techno-people like to suggest.

Because when, say, Napoleon was commanding hundreds of thousands of
soldiers, he was not commanding them individually. He gave orders to a
handful, each of whom gave orders to a handful, etc. But it was his idea
that went from top to bottom.

French: main tenir: holding (tenir) in one hand (main).

You can maintenir a huge piece of software if it is orthogonalized: when
you take one piece, the whole plate of spaghetti does not come with it
(you are just pulling on the articulation, the communication, the API
with the rest).

And for people, the military adds: hold in one hand, so the other is
free to slap when needed (and a foot free to kick if first lesson was
not received strong enough).
-- 
Thierry Laronde (Alceste) tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] Changelogs Patches?

2008-12-27 Thread erik quanstrom
 I'm actually still trying to figure out how replica/* fits together with
 sources being a fossil server. These two, somehow, have to
 click, but I haven't figured out the connection just yet. Any pointers
 to the good docs?

there's no connection.  replica would work without a fossil server.
for that matter, replica would work without a dump.  all you need
is an original and a changed version.

i used replica (plus a few additional tools) to make a faithful copy
of the coraid fileserver.  http://www.quanstro.net/plan9/history.pdf

- erik



Re: [9fans] Changelogs Patches?

2008-12-27 Thread Eris Discordia

I'm baffled. Slap me, or kick me--your choice.

--On Saturday, December 27, 2008 11:36 AM +0100 tlaro...@polynum.com wrote:


On Sat, Dec 27, 2008 at 06:04:42AM +, Eris Discordia wrote:

it all begins with Adam and Steve, as Brian Stuart suggests, ways have
been found of managing large teams of people with different
specializations  and those ways work. The Mgmt has a raison d'etre,
despite what  techno-people like to suggest.


Because when, say, Napoleon was commanding hundreds of thousands of
soldiers, he was not commanding them individually. He gave orders to a
handful, each of whom gave orders to a handful, etc. But it was his idea
that went from top to bottom.

French: main tenir: holding (tenir) in one hand (main).

You can maintenir a huge piece of software if it is orthogonalized: when
you take one piece, the whole plate of spaghetti does not come with it
(you are just pulling on the articulation, the communication, the API
with the rest).

And for people, the military adds: hold in one hand, so the other is
free to slap when needed (and a foot free to kick if first lesson was
not received strong enough).
--
Thierry Laronde (Alceste) tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C









Re: [9fans] Changelogs Patches?

2008-12-26 Thread Charles Forsyth
while a descriptive history is good, it takes a lot of extra work
to generate.

i've rarely found per-change histories to be any more useful than most other 
comments, i'm afraid.
you'd hope it would answer "what was he thinking?" but i found either it was
obvious or i still had to ask.
still, perhaps it could be regarded as an aid to future computer 
archaeologists, after
all shared context has been lost.

the intention of things like /CHANGES is mainly to point out moderate to large 
changes (eg, if you've
been waiting for a bug fix or there's a significant change to usage or 
operation).
it isn't intended to give details or rationale of the fix, any more than there 
is any of that for the
original code, really.  perhaps literate programming will fix that if it ever 
takes off.
(the set of people that write good descriptions and the set of people that 
write good code
don't necessarily have a big intersection.)  for larger additions or changes i 
sometimes wrote
short notes giving the background, the changes/additions and the rationale for 
them,
ranging from the equivalent of a long e-mail to a several-page paper. that 
worked quite
well, but was somewhat more work.

also useful for compilers are links to bug demonstration programs and 
regression tests.

the advantage of dump and snap is that the scope is the whole system: including 
emails, discussion documents,
the code, supporting tools -- everything in digital form.  if software works 
differently today
compared to yesterday, then 



Re: [9fans] Changelogs Patches?

2008-12-26 Thread Charles Forsyth
the advantage of dump and snap is that the scope is the whole system: 
including emails, discussion documents,
the code, supporting tools -- everything in digital form.  if software works 
differently today
compared to yesterday, then 

sorry, got cut off.   then in most cases, i'd expect 9fs dump to make it easy 
to track down the
set of differences, and narrow the search to the culprit. it might not even be 
a source change,
but a configuration file, or a file was moved or removed.



Re: [9fans] Changelogs Patches?

2008-12-26 Thread blstuart
 I use CWEB (D. Knuth and Levy's) intensively and it is indeed
 invaluable.
 It doesn't magically improve code (my first attempts have just shown
 how poor my programming was: it's a magnifying glass, and through it
 one just saw bugs' blinking eyes and bright smiles). 

Back when I used CWEB on a regular basis (I don't find myself
writing as much substantive code from scratch of late), I
experienced an interesting phenomenon.  I could write
pretty good code, almost as a stream of consciousness.
The tool made it natural to present the code in the order
in which I could understand it, rather than the order the
compiler wanted it.  But it was the effect of this that was
really interesting.  I found that as I wrote I'd think in terms
of several things I needed to do and I'd put placeholders in
(chunk names) for all but the one I was writing just then.
As I'd finish a chunk, I'd go back an find another one 
that I hadn't written yet, and I could easily pick them in
the order I figured out the way I wanted to handle it.
At some point, I just ran out of chunks that needed to
be written, and the code would be done.  It was almost
as if the completion of the code snuck up on me.  At
first, it was sort of a "maybe Knuth's on to something
here" but it happened often enough that I now consider
it a basic feature of the style.

Back to the topic in question though, I did find that
writing and maintaining good descriptions took almost
as much discipline as any other code documentation.
I did have to resist the urge to leave the textual part
of a chunk blank and just write the code.  I also had
to be diligent about updating the descriptions when
the code changed.  But for whatever reason (aesthetics,
tool, living up to Knuth's example...) it did seem a
little easier in that context.

However, in terms of changelogs and such, I'd say
that's still an open question.  It would seem that there
should be some way to automate the creation of a
changelog (at least in the form of a list of pointers)
from the literate source.  But the literate style itself
doesn't really seem to create anything new in terms
of the high level overview that you'd see in release
notes or changelogs.

BLS




Re: [9fans] Changelogs Patches?

2008-12-26 Thread tlaronde
On Fri, Dec 26, 2008 at 11:25:33AM -0600, blstu...@bellsouth.net wrote:
 
 Back when I used CWEB on a regular basis (I don't find myself
 writing as much substantive code from scratch of late), I
 experienced an interesting phenomenon.  I could write
 pretty good code, almost as a stream of consciousness.
 The tool made it natural to present the code in the order
 in which I could understand it, rather than the order the
 compiler wanted it.  

Yes, but this means you have adapted the way you are writing the code
to the logic behind literate programming. Starting with a structured
programming approach (literate is indeed more) is probably the best.
If, as I have done..., one looks at the finger instead of the moon, and
takes it to be a way of formatting comments, with all the bells and
whistles of TeX, one is definitely not on the right track---and that's
why the packages that format C comments embedded in source are
definitely not the same thing.

Once you get at it, it really helps as you describe. (I have one library
that I wrote almost in one go---the Esri's SHAPE lib support for
KerGIS--- and that does the job; but it was not the first, but it was
the first I wrote with explanations in _french_, my native and thinking
language; so now, since I think in french, I write in french---but code,
including identifiers and one line comments are in \CEE. This is the
second lesson I learned).

 
 However, in terms of changelogs and such, I'd say
 that's still an open question.  It would seem that there
 should be some way to automate the creation of a
 changelog (at least in the form of a list of pointers)
 from the literate source.  But the literate style itself
 doesn't really seem to create anything new in terms
 of the high level overview that you'd see in release
 notes or changelogs.

I like text, because of diffs. And CWEB has diffs ;) You can even compare
this with Brooks' The Mythical Man-Month: adapting CWEB's diff
features slightly would give the change-highlighting documents Brooks
wrote about.

Even with data, to get to the point one needs only diffs (I use them with
vectorial map data to highlight what changes have been made between
different versions provided by surveyors. This, with the ability to show
the state of the data at YYYY-MM-DD hh:mm:ss, is invaluable.)

That is one of the many reasons I found plan9 so interesting: text
oriented.
-- 
Thierry Laronde (Alceste) tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] Changelogs Patches?

2008-12-26 Thread erik quanstrom
 Back when I used CWEB on a regular basis (I don't find myself
 writing as much substantive code from scratch of late), I

is it just me, or is it hard to read someone else's cweb code?
if it's not just me...

i wonder if the same reason it's easy to write from the top
down doesn't make it hard to read.  you have to be thinking
the same way from the top otherwise you're lost.

appropriately, this being a plan 9 list and all, i find code
written from the bottom up easier to read.

- erik




Re: [9fans] Changelogs Patches?

2008-12-26 Thread tlaronde
On Fri, Dec 26, 2008 at 01:20:17PM -0500, erik quanstrom wrote:
 appropriately, this being a plan 9 list and all, i find code
 written from the bottom up easier to read.
 

Depending on the task (on the aim of the software), one happens to split
from top to bottom, and to review and amend from bottom to top. 
There is a navigation between the two.

Bottom to top is easier because you are building more complicated
stuff from basic stuff.

But the definition of these elements (the software's orthonormal base),
the justification of these elements, can be in part, has to be in
part, a result of top to bottom thought.

The general papers about Unix and Plan 9, the explanations of the logics
of the whole can not be, IMHO, tagged as bottom to top. They are
simply to the point ;)
-- 
Thierry Laronde (Alceste) tlaronde +AT+ polynum +dot+ com
 http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] Changelogs Patches?

2008-12-26 Thread blstuart
 On Fri, Dec 26, 2008 at 01:20:17PM -0500, erik quanstrom wrote:
 appropriately, this being a plan 9 list and all, i find code
 written from the bottom up easier to read.
 
 Depending on the task (on the aim of the software), one happens to split
 from top to bottom, and to review and amend from bottom to top. 
 There is a navigation between the two.
 
 Bottom to top is easier because you are building more complicated
 stuff from basic stuff.

Some time back, I was trying to understand how to teach the
reality of composing software.  (Yes, I do think of it as a creative
activity very similar to composing music.)  The top-down and
bottom-up ideas abound and make sense, but they never seemed
to capture the reality.  Then one day, after introspecting on the
way I write code, I realized it's not one or the other; it's outside-in.
I don't know what little tools I need to build until I have some
sense of the big picture, but I can't really establish the exact
boundaries between major elements until I've worked out the
cleanest way to build the lower-level bits.  So I iteratively work
back and forth between big picture and building blocks until
they meet in the middle.

As an aside, that's also when I realized what had always bugged
me about the classic approach to team programming.  The
interfaces between major parts really comes last, but in assigning
work to team members, you have to force it to come first.
And of course, from that perpsective, it makes perfect sense
why the best examples of programming are ones where the
first versions are created by only 1 or 2 people and why the
monstrosities created by large teams of professional software
engineers are so often massive collections of mechanisms
that don't work well together.

BLS




Re: [9fans] Changelogs Patches?

2008-12-26 Thread Eris Discordia

The Story of Mel

[...]

I compared Mel's hand-optimized programs with the same code massaged by
the optimizing assembler program, and Mel's always ran faster. That was
because the top-down method of program design hadn't been invented
yet, and Mel wouldn't have used it anyway. He wrote the innermost parts
of his program loops first, so they would get first choice of the optimum
address locations on the drum. The optimizing assembler wasn't smart
enough to do it that way.

[...]


-- http://catb.org/jargon/html/story-of-mel.html

Know why Mel is no more in business? 'Cause one man can only do so much 
work. The Empire State took many men to build, so did Khufu's pyramid, and 
there was no whining about many mechanisms that don't work well together. 
Now go call your managers PHBs.


--On Friday, December 26, 2008 3:44 PM -0600 blstu...@bellsouth.net wrote:


On Fri, Dec 26, 2008 at 01:20:17PM -0500, erik quanstrom wrote:

appropriately, this being a plan 9 list and all, i find code
written from the bottom up easier to read.


Depending on the task (on the aim of the software), one happens to split
from top to bottom, and to review and amend from bottom to top.
There is a navigation between the two.

Bottom to top is easier because you are building more complicated
stuff from basic stuff.


Some time back, I was trying to understand how to teach the
reality of composing software.  (Yes, I do think of it as a creative
activity very similar to composing music.)  The top-down and
bottom-up ideas abound and make sense, but they never seemed
to capture the reality.  Then one day, after introspecting on the
way I write code, I realized it's not one or the other; it's outside-in.
I don't know what little tools I need to build until I have some
sense of the big picture, but I can't really establish the exact
boundaries between major elements until I've worked out the
cleanest way to build the lower-level bits.  So I iteratively work
back and forth between big picture and building blocks until
they meet in the middle.

As an aside, that's also when I realized what had always bugged
me about the classic approach to team programming.  The
interfaces between major parts really comes last, but in assigning
work to team members, you have to force it to come first.
And of course, from that perspective, it makes perfect sense
why the best examples of programming are ones where the
first versions are created by only 1 or 2 people and why the
monstrosities created by large teams of professional software
engineers are so often massive collections of mechanisms
that don't work well together.

BLS










Re: [9fans] Changelogs Patches?

2008-12-26 Thread erik quanstrom
 Know why Mel is no more in business? 'Cause one man can only do so much 
 work. The Empire State took many men to build, so did Khufu's pyramid, and 
 there was no whining about many mechanisms that don't work well together. 
 Now go call your managers PHBs.

building a pyramid, starting at the top is one of those things
that just doesn't scale.

- erik



Re: [9fans] Changelogs Patches?

2008-12-26 Thread blstuart
 building a pyramid, starting at the top is one of those things
 that just doesn't scale.

But if you figure out how, it's probably worth a Nobel.

BLS




Re: [9fans] Changelogs Patches?

2008-12-26 Thread Eris Discordia

building a pyramid, starting at the top is one of those things
that just doesn't scale.


For that, you have bottom-up, right? But there's no meet-in-the-middle 
for a pyramid, or for software. Unless the big picture is small enough to 
fit in one man's head and let him context-switch back and forth between 
general and particular, in which case you have to give up expanding 
software functionality at the one-man barrier.


All admirable architecture, and admirable software, is, in addition to 
being a manifestation of great technique, a manifestation of great 
management--even informal management is management in the end. Instead of 
"it all begins with Adam and Steve," as Brian Stuart suggests, ways have 
been found of managing large teams of people with different specializations, 
and those ways work. The Mgmt has a raison d'être, despite what 
techno-people like to suggest.


--On Friday, December 26, 2008 5:30 PM -0500 erik quanstrom 
quans...@quanstro.net wrote:



Know why Mel is no more in business? 'Cause one man can only do so much
work. The Empire State took many men to build, so did Khufu's pyramid,
and  there was no whining about many mechanisms that don't work well
together.  Now go call your managers PHBs.


building a pyramid, starting at the top is one of those things
that just doesn't scale.

- erik









Re: [9fans] Changelogs Patches?

2008-12-26 Thread Roman Shaposhnik

On Dec 25, 2008, at 6:37 AM, erik quanstrom wrote:



despite the season, and typical attitudes, i don't think that
development practices are a spiritual or moral decision.
they are a practical one.


Absolutely! Agreed 100%. My original question was not
at all aimed at saving Plan9 development practices from
the fiery inferno. Far from it. I simply wanted to figure out
whether the things that really help me follow the development
of other open source projects are available under Plan9. It is
ok for them to be different (e.g. not based on traditional SCMs)
and it is even ok for them not to be available at all.


and what they have done at the labs appears to be working to me.


It surely does work in the sense that Plan9 is very much alive and
kicking.

But there are also some things that make following Plan9 development
and doing software archeology more difficult than, let's say, plan9port.

It very well may be just my own ignorance (in which case, please
educate me on these subjects) but my current impression is that
sources.cs.bell-labs.com is the de-facto SCM for Plan9. In fact,
it is the only way to get new source into the official tree, yet still
have some ability to track the old stuff via main/archive. This model,
however well suited for the closely-knitted inner circle of developers,
makes it difficult for me to follow the project. Why? Well, here's my
top reason:
 Plan9 development history is not quantized in atomic changesets, but
rather in 24-hour periods. Even if a developer wanted to record the fact
that a particular state of the tree corresponds to a bug fix or a feature
implementation, the only way to do that would be not to allow any other
changes within the 24-hour window. This seems rather awkward.

Two less severe problems are the lack of easy tracking of change ownership
and of code migration through time and space. Both are quite important when
one tries to figure out how (and why!) we got from
   /n/sourcesdump/2002/*
to
   /n/sourcesdump/2008/*


in my own experience, i've found scum always to cost time.
but my big objection is the automatic merge.  automatic merges
make it way too easy to merge bad code without reviewing the diffs.

while a descriptive history is good, it takes a lot of extra work
to generate.  just because it's part of the scum process doesn't
make it free.



Agreed. As much as there's price to pay when one tries to
write clean code, there's a price to pay when one tries to
maintain a clean history(*). In both cases, however, I, personally,
would gladly pay that price. Otherwise I simply risk insanity
if the project gets over a couple thousand lines of code or
more than a year old.

Thanks,
Roman.

(*) My definition of a clean history is a set of smallest self-reliant
changesets.



Re: [9fans] Changelogs Patches?

2008-12-25 Thread erik quanstrom
 I surely hope the festive mood of the season will protect me from being
 ostracized for asking this, but is there any chance to map Plan9
 development practices to some of the established ways of source
 code management? I mostly long for things like being able to browse
 Plan9 history with a clear understanding of who did what and for
 what reason.

in the holiday spirit ☺, isn't this similar logic:
1. scm packages are peace, hope and light; everybody knows that.
2. if you don't use a scum package you are in the darkness.
3. if you are in the darkness, you must be saved, or be cast into
the pit.

despite the season, and typical attitudes, i don't think that
development practices are a spiritual or moral decision.
they are a practical one.  and what they have done at the
labs appears to be working to me.  in my own experience,
i've found scum always to cost time.  but my big objection
is the automatic merge.  automatic merges make it way too
easy to merge bad code without reviewing the diffs.

while a descriptive history is good, it takes a lot of extra work
to generate.  just because it's part of the scum process doesn't
make it free.

- erik




Re: [9fans] Changelogs Patches?

2008-12-25 Thread Roman Shaposhnik

On Dec 24, 2008, at 10:40 PM, erik quanstrom wrote:

Is there any preferred way to get changelogs / diffs these days?


yesterday -d ...
when i'm especially curious or anxious.


But yesterday won't work in a more lightweight environment (such as
9vx), will it?


exactly the same as plan 9 does.

as long as the fs supports a dump fs, 9vx will support yesterday.


True. But not having an fs that supports dump is exactly what makes
9vx a lighter weight environment (unless I'm grossly mistaken and
#Z in 9vx actually has a way of supporting dump).


for example, i've been mounting my diskless fs with 9vx.  yesterday
works just fine.  i'm sure you could use a linux-based venti with
plan 9-based fossil as well.



True, but I'd really like to NOT have any extra software running and
still have the ability to do replica/* and yesterday under 9vx.

Can this be done?

Thanks,
Roman.



Re: [9fans] Changelogs Patches?

2008-12-25 Thread lucio
 True, but I'd really like to NOT have any extra software running and
 still have the ability to do replica/* and yesterday under 9vx.

I'm only vaguely familiar with 9vx, so there I can't speak, but you
can certainly do replica/* as it is a user-level tool and as for
yesterday, you can apply it to /n/sources, which is what you seem to
imply is your requirement.

++L




Re: [9fans] Changelogs Patches?

2008-12-25 Thread Anthony Sorace
depends what you mean by extra. if that means outside 9vx, then
yes; if it means besides what 9vx uses by default, no.

yesterday(1) relies on having dump-style snapshots. 9vx, as shipped,
gets its root file system from #Z, which doesn't have snapshots.

erik offered some suggestions for hosting various bits of things
outside 9vx and connecting to that in order to get the dumps. those
options are valid, but you can just as well host the entire thing
within 9vx. it's not the default configuration, but i believe
instructions are out there (9fans or the wiki).

using fossil for your root, instead of #Z, will obviously cost you the
benefits of #Z - namely, the pass-through transparency. if your
primary interest is for replica/*, though, you might consider the
direction i've been headed: root from fossil, but import $home or /usr
from #Z.
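
A rough rc sketch of that hybrid setup, assuming a 9vx configuration where
fossil serves the root (so the dump and yesterday work) while the hosted
file system is still reachable through the #Z device. The mount points and
home-directory path here are illustrative, not the actual 9vx defaults --
check your own namespace before copying:

```rc
# root already comes from fossil in this configuration;
# expose the hosted unix file system via the #Z device
bind '#Z' /n/host

# use the host's home directory as $home inside 9vx
bind -c /n/host/home/$user /usr/$user
```

With binds like these, replica/* and yesterday see the fossil root, while
day-to-day files pass straight through to the host.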



Re: [9fans] Changelogs Patches?

2008-12-25 Thread blstuart
 using fossil for your root, instead of #Z, will obviously cost you the
 benefits of #Z - namely, the pass-through transparency. if your
 primary interest is for replica/*, though, you might consider the
 direction i've been headed: root from fossil, but import $home or /usr
 from #Z.

That's close to what I'm doing.  When I'm running stand-alone,
I boot from fossil, bind #Z to /n/unix and bind my UNIX
home directory to a mount point in the fossil file system.
When running as a terminal, I boot from my file server, and
still use pretty much the same binds.

BLS




Re: [9fans] Changelogs Patches?

2008-12-24 Thread Roman Shaposhnik

On Dec 22, 2008, at 8:41 AM, Charles Forsyth wrote:

Is there any preferred way to get changelogs / diffs these days?


yesterday -d ...
when i'm especially curious or anxious.


But yesterday won't work in a more lightweight environment (such as
9vx), will it?


it probably wouldn't hurt to have a DMEXCL+DMAPPEND file (!)
maintained by the command that applies patches, which appends the
readme/notes file(s) for each patch as it is applied.  not all changes
are done through patches.



Speaking of which -- is there any FAQ on the current development
practices of the Plan9 project? Things like patch lifecycle, etc.?

Thanks,
Roman.




Re: [9fans] Changelogs Patches?

2008-12-24 Thread erik quanstrom
 Is there any preferred way to get changelogs / diffs these days?

 yesterday -d ...
 when i'm especially curious or anxious.
 
 But yesterday won't work in a more lightweight environment (such as
 9vx) will it?

exactly the same as plan 9 does.

as long as the fs supports a dump fs, 9vx will support yesterday.

for example, i've been mounting my diskless fs with 9vx.  yesterday
works just fine.  i'm sure you could use a linux-based venti with
plan 9-based fossil as well.

- erik




Re: [9fans] Changelogs Patches?

2008-12-24 Thread Roman Shaposhnik

On Dec 22, 2008, at 8:46 PM, Nathaniel W Filardo wrote:

Hi,

The contrib index mentions that daily changelogs for Plan 9 are in
sources/extra/changes, but those haven't been updated since early  
2007.

Is there any preferred way to get changelogs / diffs these days?


Relatedly, is there a better way to mirror the development history of Plan 9
than running @{9fs sourcesdump; cd /n/sourcesdump; tar -c} | @{tar -x} or
similar?


I surely hope the festive mood of the season will protect me from being
ostracized for asking this, but is there any chance to map Plan9
development practices to some of the established ways of source
code management? I mostly long for things like being able to browse
Plan9 history with a clear understanding of who did what and for
what reason.

Say what you will about Linux kernel, but things like these:
   http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=summary
surely make it much more bearable to work with^H^H^H^H^H around.

Thanks,
Roman.

P.S. I see that Russ uses Mercurial SCM for some of his other
projects, so maybe my question is not that weird, after all...



Re: [9fans] Changelogs Patches?

2008-12-22 Thread Charles Forsyth
Is there any preferred way to get changelogs / diffs these days?

i use

9fs sources
diff /whatever /n/sources/plan9/whatever

and after a pull

yesterday -d ...
when i'm especially curious or anxious.

it probably wouldn't hurt to have a DMEXCL+DMAPPEND file (!)
maintained by the command that applies patches, which appends the readme/notes
file(s) for each patch as it is applied.  not all changes are done through 
patches.
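
For context, yesterday(1) with -d prints a diff between the current copy of
a file and its copy in the dump file system, so after a pull it shows
exactly what changed. A sketch of that usage -- the path is illustrative,
and the -n flag is as I recall it from yesterday(1), so check the manual:

```rc
# diff the current file against its copy in the most recent dump
yesterday -d /sys/src/cmd/ls.c

# or against the copy from a week ago (-n daysago, per yesterday(1))
yesterday -d -n 7 /sys/src/cmd/ls.c
```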



Re: [9fans] Changelogs Patches?

2008-12-22 Thread Devon H. O'Dell
2008/12/22 Venkatesh Srinivas m...@acm.jhu.edu:
 Hi,

 The contrib index mentions that daily changelogs for Plan 9 are in
 sources/extra/changes, but those haven't been updated since early 2007.
 Is there any preferred way to get changelogs / diffs these days?

I used to maintain the changelogs, but ended up generating ENOTIME,
pretty much just as everyone else who has worked on that. It's
something I think I might pick up again; either Russ or Uriel emailed
me a set of scripts to maintain it. Perhaps I'll start doing it again;
it's mostly just a question of getting the scripts set up and doing
it.

--dho

 Also, in sources/patch, there are patches neither in applied/ nor sorry/.
 Are these patches in queue? Applied? Not applied?

 Thanks,
 -- vs





Re: [9fans] Changelogs Patches?

2008-12-22 Thread Uriel
It is pretty much a question of it being a totally backwards way of
doing things, with one set of people making the changes, and another
set of people, guessing at the meaning of the changes, writing the
changelog.

(This is claimed to be due to the first set of people not having the
time to write down what changes they make. Of course those same
people seem to think the time spent when the second group has to
inquire as to the nature of the changes is not wasteful.)

But following more conventional practices and heeding the crazy advice
of unqualified people like Brian when he writes:

*Keep records*. I maintain a FIXES file that describes every change
to the code since the Awk book was published in 1988 [1]

would be anathema to the Plan 9 way of doing things.

uriel


[1]: http://www.cs.princeton.edu/~bwk/testing.html

On Mon, Dec 22, 2008 at 6:03 PM, Devon H. O'Dell devon.od...@gmail.com wrote:
 2008/12/22 Venkatesh Srinivas m...@acm.jhu.edu:
 Hi,

 The contrib index mentions that daily changelogs for Plan 9 are in
 sources/extra/changes, but those haven't been updated since early 2007.
 Is there any preferred way to get changelogs / diffs these days?

 I used to maintain the changelogs, but ended up generating ENOTIME,
 pretty much just as everyone else who has worked on that. It's
 something I think I might pick up again; either Russ or Uriel emailed
 me a set of scripts to maintain it. Perhaps I'll start doing it again;
 it's mostly just a question of getting the scripts set up and doing
 it.

 --dho

 Also, in sources/patch, there are patches neither in applied/ nor sorry/.
 Are these patches in queue? Applied? Not applied?

 Thanks,
 -- vs







Re: [9fans] Changelogs Patches?

2008-12-22 Thread Nathaniel W Filardo
 Hi,

 The contrib index mentions that daily changelogs for Plan 9 are in
 sources/extra/changes, but those haven't been updated since early 2007.
 Is there any preferred way to get changelogs / diffs these days?

Relatedly, is there a better way to mirror the development history of Plan 9
than running @{9fs sourcesdump; cd /n/sourcesdump; tar -c} | @{tar -x} or
similar?

Thanks.
--nwf;

