Re: [9fans] a 9P session between debian client and Plan 9 server side

2012-01-05 Thread Kin Loo
Now I start the server with the -D flag and try 'echo hello > foo' on
Linux. The server side on Plan 9 says:

-5- Twalk tag 1 fid 408 newfid 437 nwname 1 0:foo
-5- Rwalk tag 1 nwqid 1 0:( 0 )
-5- Tstat tag 1 fid 437
-5- Rstat tag 1  stat 'foo' 'bootes' 'bootes' 'unknown' q ( 0 ) m 0666 at 1325748319 mt 1325748319 l 0 t 0 d 0
-5- Tclunk tag 1 fid 437
-5- Rclunk tag 1
-5- Twalk tag 2 fid 408 newfid 437 nwname 1 0:foo
-5- Rwalk tag 2 nwqid 1 0:( 0 )
-5- Tstat tag 2 fid 437
-5- Rstat tag 2  stat 'foo' 'bootes' 'bootes' 'unknown' q ( 0 ) m 0666 at 1325748319 mt 1325748319 l 0 t 0 d 0
-5- Tclunk tag 2 fid 437
-5- Rclunk tag 2
-5- Twalk tag 2 fid 408 newfid 438 nwname 1 0:foo
-5- Rwalk tag 2 nwqid 1 0:( 0 )
-5- Topen tag 2 fid 438 mode 1
fid mode is 0x1
-5- Ropen tag 2 qid ( 0 ) iounit 0
-5- Tstat tag 2 fid 438
-5- Rstat tag 2  stat 'foo' 'bootes' 'bootes' 'unknown' q ( 0 ) m 0666 at 1325748319 mt 1325748319 l 0 t 0 d 0
-5- Twrite tag 6 fid 438 offset 0 count 6 'hello'
-5- Rwrite tag 6 count 6
-5- Tclunk tag 6 fid 438
-5- Rclunk tag 6

I will read more 9P examples in order to implement wstat. Thank you
Yaroslav and Andrey; it is worthwhile to learn from you.
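For reference, the Twstat message carries the fid plus a stat structure whose "don't touch" fields are set to all ones (and empty strings), following the wire format in the stat(5) and intro(5) manual pages. A rough, purely illustrative sketch of packing one in Python (the fid, tag, and file name here are just examples, and nothing below comes from the trace above):

```python
import struct

def pstr(s):
    # 9P string: 2-byte little-endian length followed by UTF-8 bytes
    b = s.encode('utf-8')
    return struct.pack('<H', len(b)) + b

def pack_stat(name, uid, gid, muid, mode, atime, mtime, length,
              qtype=0xff, qvers=0xffffffff, qpath=0xffffffffffffffff,
              dtype=0xffff, dev=0xffffffff):
    # machine-independent stat structure from stat(5); the all-ones
    # defaults are the "don't touch" values used in wstat requests
    qid = struct.pack('<BIQ', qtype, qvers, qpath)            # 13 bytes
    body = (struct.pack('<HI', dtype, dev) + qid +
            struct.pack('<IIIQ', mode, atime, mtime, length) +
            pstr(name) + pstr(uid) + pstr(gid) + pstr(muid))
    return struct.pack('<H', len(body)) + body                # size[2] prefix

def twstat(tag, fid, stat):
    # Twstat: size[4] type[1]=126 tag[2] fid[4] nstat[2] stat[nstat]
    body = (struct.pack('<BHI', 126, tag, fid) +
            struct.pack('<H', len(stat)) + stat)
    return struct.pack('<I', 4 + len(body)) + body            # size[4] counts itself

# a wstat that only renames foo to bar: every other field "don't touch"
msg = twstat(1, 437, pack_stat('bar', '', '', '', 0xffffffff,
                               0xffffffff, 0xffffffff, 0xffffffffffffffff))
```

On the wire this would follow the usual Twalk/Rwalk to obtain the fid; on success the server answers with Rwstat (type 127).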



[9fans] venti and contrib: RFC

2012-01-05 Thread tlaronde
Hello,

Summary of the previous episodes: my Plan9 installation still had the
initial partitioning. Since I had not grasped the purpose of venti,
the other partition was empty and everything went into the venti
archive. And since I was doing a number of installs and de-installs of
kerTeX for test purposes, boom!: disk full, and I needed to find a way
to load an alternate root to fix things---or reinstall.

But this leads to questions regarding the contrib stuff.

When one has the sources, archiving the sources with history makes
sense. To take the example of kerTeX, there is a map describing where
each file eventually goes, so the sources vary a little, but the
result may be arbitrary. Secondly, the binaries compiled from the
sources may vary even if the sources do not.

So the compiled result is not worth archiving. (The convenience of a
fallback snapshot, so as not to disrupt work, remains; in case of a
bigger disaster, the time needed to recompile everything is
acceptable---for kerTeX, even if the result is several tens of Mb,
it is a matter of minutes.) Furthermore, for experimental work,
archiving a transient state is not worth the disk space.

With the design of namespace manipulations, a Plan9 user can redirect
the writes where he wants them to happen---venti or not venti, that is
the question.

But the user has to know. Is there a policy described somewhere?

The problem, I think, is that on other systems one thinks about
backup and archiving _after_ the fact---and decides what goes into
backups. Here, powerful tools are in place by default, but the user
may be unaware of the consequences. Perhaps the default, for the
let's-see-what-Plan9-is case, should be fossil only, with a switch to
venti once things are clear?

Cheers,
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
 So the compiled result is not worth archiving.

it has been more than once that in tracking down a problem, i've
found that the known working executable worked but the source
from that point in history didn't.  and vice versa.  having the executables
and libraries archived was very valuable.

otoh, just to use round numbers, if your build creates 100mb of new data
and all of it hits venti before being replaced, then you've got 10,000
builds/TB.  in practice, i think most people push to venti only once
a day, so this is practically infinite.  or, put another way, that's
$100/10,000 = 1¢/build.  since the standard deprecated comment is
worth 2¢, it appears that these days 100mb is not worth commenting
on.  ☺.
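The round numbers work out as follows (a throwaway check; the 100 MB per build, 1 TB venti, and $100/TB figures are just the round numbers assumed in the message above):

```python
build_mb = 100                  # new data per build that lands in venti
venti_mb = 1_000_000            # a 1 TB venti, expressed in MB
disk_dollars_per_tb = 100       # rough disk price assumed above

builds_per_tb = venti_mb // build_mb            # how many builds fit in 1 TB
cost_per_build = disk_dollars_per_tb / builds_per_tb

assert builds_per_tb == 10_000                  # 10,000 builds per TB
assert abs(cost_per_build - 0.01) < 1e-12       # i.e. 1 cent per build
```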

- erik



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread tlaronde
On Thu, Jan 05, 2012 at 07:59:00AM -0500, erik quanstrom wrote:
  So the compiled result is not worth archiving.
 
 it has been more than once that in tracking down a problem, i've
 found that the known working executable worked but the source
 from that point in history didn't.  and vice versa.  having the executables
 and libraries archived was very valuable.
 
 otoh, just to use round numbers, if your build creates 100mb of new data
 and all of it hits venti before being replaced, then you've got 10,000
 builds/TB.  in practice, i think most people push to venti only once
 a day, so this is practically infinite.  or, put another way, that's
 $100/10,000 = 1¢/build.  since the standard deprecated comment is
 worth 2¢, it appears that these days 100mb is not worth commenting
 on.  ☺.

Perhaps, but it seems to me like digging ore; extracting the small
percentage of valuable metal; forging a ring; and then throwing it back
into the ore and storing the whole...

Secondly, I still use definitive optical storage from time to time
(the disks go in a vault)... With KerGIS and others, and kerTeX,
everything still fits three times over on a CDROM. So...

And finally, doesn't the increase in size of the disks, with no decrease
of the reliability, increase the probability of disk failure?
Unfortunately, one no longer finds small disks (that were huge some
years ago)...

PS: and a Plan9 tester will begin by devoting a partition on a disk, to
try it out. The iso is around 300Mb, so allocating 512 or 1024Mb will
seem enough. If he gets hooked on Plan9---that happened to me ;)---sooner
or later a problem will occur.
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread tlaronde
On Thu, Jan 05, 2012 at 02:20:34PM +0100, tlaro...@polynum.com wrote:
 And finally, doesn't the increase in size of the disks, with no decrease


no increase, of course. If the probability of failure for a single
sector is P, increasing the number of sectors increases the probability
of disk failure.
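The correction can be made concrete: if each sector fails independently with probability P, the disk survives only if every one of its n sectors does. A toy illustration (the probabilities and sector counts below are made up, purely to show the effect):

```python
def p_disk_failure(p_sector, n_sectors):
    # probability that at least one of n independent sectors fails,
    # i.e. 1 minus the probability that all n sectors survive
    return 1.0 - (1.0 - p_sector) ** n_sectors

# hypothetical per-sector failure probability; doubling the sector
# count pushes the disk-level failure probability up
small_disk = p_disk_failure(1e-9, 10**9)       # ~0.63
big_disk   = p_disk_failure(1e-9, 2 * 10**9)   # ~0.86
assert small_disk < big_disk
```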
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
 Perhaps, but it seems to me like digging ore; extracting the small
 percentage of valuable metal; forging a ring; and then throwing it back
 into the ore and storing the whole...

generally it's apparent which files are worth investigating, and between
history (list of changes by date) and a binary search, it shouldn't take
more than a handful of tries to narrow things down.

in practice, i've found the executables more helpful than the source.

 Secondly, I still use definitive optical storage from time to time
 (the disks go in a vault)... With KerGIS and others, and kerTeX,
 everything still fits three times over on a CDROM. So...

if you are using venti, there is no reason to re-archive closed arenas.
(and there's no a priori reason that your optical backup must include
history.)

 And finally, doesn't the increase in size of the disks, with no decrease
 of the reliability, increase the probability of disk failure?
 Unfortunately, one no longer finds small disks (that were huge some
 years ago)...

i think disk reliability is a term that gets canceled out.  if you have
n copies of an executable, whatever the reliability of the drive, each
copy is exactly as likely to be intact.

 PS: and a Plan9 tester will begin by devoting a partition on a disk, to
 try it out. The iso is around 300Mb, so allocating 512 or 1024Mb will
 seem enough. If he gets hooked on Plan9---that happened to me ;)---sooner
 or later a problem will occur.

that's not what i did.  i started with several 18gb scsi drives.

- erik



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread tlaronde
On Thu, Jan 05, 2012 at 08:27:50AM -0500, erik quanstrom wrote:
 
  Secondly, I still use definitive optical storage from time to time
  (the disks go in a vault)... With KerGIS and others, and kerTeX,
  everything still fits three times over on a CDROM. So...
 
 if you are using venti, there is no reason to re-archive closed arenas.
 (and there's no a priori reason that your optical backup must include
 history.)

Because I use CVS (not on Plan9), and I back up my CVS repository.
So: sources with history. I do not consider CDROMs to be eternal, so
a small number are kept, and the oldest is destroyed when the new one
is burnt.

-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread David du Colombier
I am not sure I understand your question.

Nothing forces you to dump the full Fossil tree to Venti
every night. You can run snap manually whenever you want,
or run it on only part of the tree.

You can also individually exclude some files from
the snapshots using the DMTMP bit.

If you really want to avoid archiving binaries, you
could simply add a cron job which automatically applies
the DMTMP bit to the binaries just before the
archival snapshot.

Fossil and Venti are very flexible; you can do almost
anything you want.

-- 
David du Colombier



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
 Because I use CVS (not on Plan9), and I backup my CVS. So, sources with
 history. I do not consider CDROM to be eternal. So there is a small
 number kept, and the older is destroyed when the new one is burnt.

sorry.  i thought we were talking about organizing plan 9
storage.  never mind 

- erik



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
On Thu Jan  5 08:28:57 EST 2012, tlaro...@polynum.com wrote:
 On Thu, Jan 05, 2012 at 02:20:34PM +0100, tlaro...@polynum.com wrote:
  And finally, didn't the increase in size of the disks, with 
 no decrease
 
 
 no increase, of course. If probability of a failure for a sector is P,
 increasing the number of sectors increases the probability of disk
 failure.

sector failure != disk failure.  disk failure is generally due to the
heads, etc., and is generally independent of the size of the device.

- erik



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread tlaronde
On Thu, Jan 05, 2012 at 09:14:28AM -0500, erik quanstrom wrote:
  Because I use CVS (not on Plan9), and I backup my CVS. So, sources with
  history. I do not consider CDROM to be eternal. So there is a small
  number kept, and the older is destroyed when the new one is burnt.
 
 sorry.  i thought we were talking about organizing plan 9
 storage.  never mind 

I use CVS on NetBSD now. But even on Plan9, I want my sources with
history. This means that on Plan9, I will make a separate partition for
my sources; add a log file to register comments about changes; and 
backup the whole arena.
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread tlaronde
On Thu, Jan 05, 2012 at 02:48:10PM +0100, David du Colombier wrote:
 
 Fossil and Venti are very flexible, you can do almost
 everything you want.

No doubt about that.

But perhaps the other users were smart enough to have understood all
this at installation time, but when I first installed Plan9, it was
not for the archival features. And I spent my time on Plan9 looking at
the distributed system, the namespaces and so on, not at venti.

The question is more about the defaults and/or the documentation.
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread Russ Cox
On Thu, Jan 5, 2012 at 10:15 AM,  tlaro...@polynum.com wrote:
 But perhaps the other users were smart enough to have understood all
 this at installation time, but when I first installed Plan9, it was
 not for the archival features. And I spent my time on Plan9 looking at
 the distributed system, the namespaces and so on, not at venti.

 The question is more about the defaults and/or the documentation.

The default is that you have so little data in comparison to a
modern disk that there is no good reason not to save full
snapshots.  As Erik and others have pointed out, if you do
find reason to exclude certain trees from the snapshots, you
can use chmod +t.  The system is working as intended.

Russ



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread tlaronde
On Thu, Jan 05, 2012 at 10:44:18AM -0500, Russ Cox wrote:
 
 The default is that you have so little data in comparison to a
 modern disk that there is no good reason not to save full
 snapshots.  As Erik and others have pointed out, if you do
 find reason to exclude certain trees from the snapshots, you
 can use chmod +t.  The system is working as intended.

Quoting ``Installing the Plan9 Distribution'':

You need an x86-based PC with 32MB of RAM, a supported video card, and a
hard disk with at least 300MB of unpartitioned space and a free primary
partition slot.

Yes, this is from the printed edition of Plan9 Programmer's Manual, 3rd
Edition.

But I don't see why caveats would hurt a newcomer, who is probably
not devoting an entire new disk to a system he doesn't know yet and
wants to try, but rather giving Plan9 some space on a disk populated
with other data.

And giving a hint about the archival features would not hurt either.
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread David du Colombier
The third edition was published in June 2000. It predates
both Venti (April 2002) and Fossil (January 2003).

This documentation was about installing Plan 9 on a
standalone terminal running kfs, not a file server.

-- 
David du Colombier



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread ron minnich
I doubt anyone would object if you want to change the text and submit
to the website owners.

ron



Re: [9fans] Killing venti

2012-01-05 Thread smiley
Russ Cox r...@swtch.com writes:

 run venti/sync.

Ah.  Cool.  Gotta love those undocumented commands.  :) While probing
the distal edges of Venti's documented functionality, I also came across
the following, which have similar (but not identical) effect:

hget http://$vthost:$vtwebport/flushicache
hget http://$vthost:$vtwebport/flushdcache

These HTTP requests initiate flushes of the index and arena block
caches, respectively, and don't return a response until the respective
flush is complete.

-- 
+---+
|Smiley   smi...@icebubble.orgPGP key ID:BC549F8B |
|Fingerprint: 9329 DB4A 30F5 6EDA D2BA  3489 DAB7 555A BC54 9F8B|
+---+



Re: [9fans] ramfs, fossil, venti etc.

2012-01-05 Thread smiley
Steve Simon st...@quintile.net writes:

 Even fossil can be grown though you will need a new bigger
 partition or grow the existing one using fs(3), this can then
 be refreshed from a venti snapshot.

Quid pro quo: IIRC, if you do this (using flfmt), you will retain the
venti archives of the fossil filesystem, but lose the ephemeral
snapshots, as well as any data marked +t.  There's currently no way to
resize a fossil file system in-place, is there?

-- 
+---+
|Smiley   smi...@icebubble.orgPGP key ID:BC549F8B |
|Fingerprint: 9329 DB4A 30F5 6EDA D2BA  3489 DAB7 555A BC54 9F8B|
+---+



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread Bakul Shah
On Thu, 05 Jan 2012 17:39:07 +0100 tlaro...@polynum.com  wrote:
 On Thu, Jan 05, 2012 at 10:44:18AM -0500, Russ Cox wrote:
  
  The default is that you have so little data in comparison to a
  modern disk that there is no good reason not to save full
  snapshots.  As Erik and others have pointed out, if you do
  find reason to exclude certain trees from the snapshots, you
  can use chmod +t.  The system is working as intended.
 
 Quoting ``Installing the Plan9 Distribution'':
 
 You need an x86-based PC with 32MB of RAM, a supported video card, and a
 hard disk with at least 300MB of unpartitioned space and a free primary
 partition slot.
 
 Yes, this is from the printed edition of Plan9 Programmer's Manual, 3rd
 Edition.
 
 But I don't see why caveats would hurt a newcomer, who is probably
 not devoting an entire new disk to a system he doesn't know yet and
 wants to try, but rather giving Plan9 some space on a disk populated
 with other data.

Are you going to update the wiki then? Newbies need all the
help they can get! If you do, make sure to mention that they
should use RAID1! Not when they are just playing with plan9,
but before they start storing precious data.  On a modern
consumer disk, if you read 1TB, you have an 8% chance of a
silent bad read sector.  It is more important to worry about
that in today's world than to optimize disk space use.
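For what it's worth, that 8% falls straight out of the unrecoverable-read-error rate commonly quoted on consumer drive spec sheets (one error per 10^14 bits); a back-of-the-envelope sketch, assuming the spec-sheet rate applies uniformly to every bit read:

```python
ure_per_bit = 1e-14        # spec-sheet unrecoverable read error rate
bits_in_1tb = 8 * 10**12   # reading 1 TB means reading 8e12 bits

# expected number of unrecoverable read errors per full 1 TB read,
# which for a rare event approximates the chance of seeing one
expected_bad_reads = ure_per_bit * bits_in_1tb   # ~0.08, i.e. ~8%
```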

If you partition disk space, chances are it will be used
suboptimally (that is, it will turn out you guessed the partition
sizes wrong for the actual use).  These days (for most people)
the *bulk* of their disks contains user data, so there is really
no point in partitioning. Just make sure your truly critical
data is backed up separately (repeat until you are satisfied).



Re: [9fans] (no subject)

2012-01-05 Thread Aram Hăvărneanu
erik quanstrom wrote:
 for me, the most important questions are
 - how do i set up a raid/hot spares, and
 - can i do this without rebooting.

Of course, and right now I'm doing exactly that using a different
operating system.  Can I do that on Plan 9? I don't know, I'm trying
to find out without much success.

 wikipedia.

I really don't understand why you are sending me to read wikipedia.
Generally, I think of myself as a decent speaker; I know how to make
myself clear.  It's obvious that in this case I failed.  In my
previous job I worked on a file system that, among other things,
also implements redundancy.  Having implemented these things, I
guess I know about them without having to read wikipedia.

The machine I use today for storage also runs bits of my own software.
It's very easy to administer, tells me when disks are broken, lets me
add disks for more storage without rebooting, and lets me hot swap
disks.  Can Plan 9 do this?  I don't know; I guess not?  That's fine by
me: I'm willing to sacrifice performance and ease of administration
for an operating system I like better.  I'm willing to implement
myself whatever I need that doesn't exist yet, though I have a very
hard time understanding what's missing from these very, very vague
discussions.

 there's nothing strange about a sata device or even a raid of
 various devices of any type being presented with an ide programming
 interface.  one could just as easily slap an ahci programming interface
 on, but either requires translation software/hardware.

I agree there's nothing inherently strange about it, but in practice
it's uncommon, at least in my experience.  But then again, I'm not a
hardware guy, so my experience means nothing.

-- 
Aram Hăvărneanu



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread John Floren
 On Thu, Jan 5, 2012 at 10:15 AM,  tlaro...@polynum.com wrote:
 But perhaps the other users are smart enough to have understood all this
 at installation time, but when I first installed Plan9, that was not for
 the archival features. And I spent my time on Plan9 looking for the
 distributed system, the namespace and so on, not on venti.

 The question is more about the defaults and/or the documentation.
 
 The default is that you have so little data in comparison to a
 modern disk that there is no good reason not to save full
 snapshots.  As Erik and others have pointed out, if you do
 find reason to exclude certain trees from the snapshots, you
 can use chmod +t.  The system is working as intended.
 
 Russ

For reference, I set up our current Plan 9 system about half a year
ago.  We have 3.8 TB of Venti storage total.  We have used 2.8 GB of
that, with basically no precautions taken to set anything +t; in
general, if it's around at 4 a.m., it's going into Venti.  I figure we
have roughly another 2,000 years of storage left at the current rate
:)



John




Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
 if you read 1TB, you have 8% chance of a silent bad read
 sector.  More important to worry about that in today's world
 than optimizing disk space use.

do you have a citation for this?  i know if you work out the
numbers from the BER, this is about what you get, but in
practice i do not see this 8%.  we do pattern writes all the
time, and i can't recall the last time i saw a silent read error.

- erik



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread tlaronde
On Thu, Jan 05, 2012 at 09:36:13AM -0800, ron minnich wrote:
 I doubt anyone would object if you want to change the text and submit
 to the website owners.

That was my intention, but first I wanted to submit some material to
the list, in order not to publish nonsense. [But probably some people
equate Laronde with nonsense...]

I do want to submit the material for review, since I have been playing
with the installation process to use it as a repair tool.

Mounting the CD image; extracting the El Torito 2.88Mb image;
mounting that file as a FAT; extracting 9load and repopulating my
plan9 9fat partition (9pcflop, having a kernel and a spartan root, is
then a repair tool). Jivaro'ing the CD image to put it in the 100Mb
9fat, in order to install from there, the Plan9 FAT being visible from
NetBSD by playing with a disklabel. Etc.

I don't care about a software incident if I have fun and use it to
learn...
-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread tlaronde
On Thu, Jan 05, 2012 at 10:07:08AM -0800, John Floren wrote:
 
 For reference, I set up our current Plan 9 system about half a year
 ago.  We have 3.8 TB of Venti storage total.  We have used 2.8 GB of
 that, with basically no precautions taken to set anything +t; in
 general, if it's around at 4 a.m., it's going into Venti.  I figure we
 have roughly another 2,000 years of storage left at the current rate
 :)

The TB were there because you planned to use TeXlive, that's all... ;)

-- 
Thierry Laronde tlaronde +AT+ polynum +dot+ com
  http://www.kergis.com/
Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C



Re: [9fans] Killing venti

2012-01-05 Thread David du Colombier
venti/sync calls vtsync, which is documented in venti-client(2).

Fortunately, you don't have to flush the dcache or icache before
shutting down Venti, especially since flushing the icache will
likely take a very long time.

-- 
David du Colombier



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread Bakul Shah
On Thu, 05 Jan 2012 13:01:52 EST erik quanstrom quans...@quanstro.net  wrote:
  if you read 1TB, you have 8% chance of a silent bad read
  sector.  More important to worry about that in today's world
  than optimizing disk space use.
 
 do you have a citation for this?  i know if you work out the
 numbers from the BER, this is about what you get, but in
 practice i do not see this 8%.  we do pattern writes all the
 time, and i can't recall the last time i saw a silent read error.

Silent == unseen! Do you log RAID errors? That's the only way to
catch them.

That number is derived purely from a bit error rate (I think
vendors base it on the Reed-Solomon code used). I have no idea how
uniformly random the data (or the medium) is in practice. I
thought practice was worse!



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread Bakul Shah
On Thu, 05 Jan 2012 10:07:08 PST John Floren j...@jfloren.net  wrote:
 
 For reference, I set up our current Plan 9 system about half a year
 ago.  We have 3.8 TB of Venti storage total.  We have used 2.8 GB of
 that, with basically no precautions taken to set anything +t; in
 general, if it's around at 4 a.m., it's going into Venti.  I figure we
 have roughly another 2,000 years of storage left at the current rate
 :)

I first read that 2.8 GB as 2.8 TB and was utterly confused!

You'd save a bunch of energy if you powered up the venti disks
only once a day @ 4AM, for a few minutes (and on demand when you
look at /n/dump).  Though venti might have fits! And the disks
might too! So maybe this calls for a two-level venti? First to
an SSD RAID, and a much less frequent venti/copy to hard disks.

venti doesn't have a scrub command, does it? zfs scrub was
instrumental in warning me that I needed new disks.

venti doesn't have a scrub command, does it? zfs scrub was
instrumental in warning me that I needed new disks.



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
On Thu Jan  5 13:26:16 EST 2012, ba...@bitblocks.com wrote:
 On Thu, 05 Jan 2012 13:01:52 EST erik quanstrom quans...@quanstro.net  
 wrote:
   if you read 1TB, you have 8% chance of a silent bad read
   sector.  More important to worry about that in today's world
   than optimizing disk space use.
  
  do you have a citation for this?  i know if you work out the
  numbers from the BER, this is about what you get, but in
  practice i do not see this 8%.  we do pattern writes all the
  time, and i can't recall the last time i saw a silent read error.
 
 Silent == unseen! Do you log RAID errors? Only way to catch them.
 
 That number is derived purely on an bit error rate (I think
 vendors base that on the Reed-Solomon code used). No idea how
 uniformly random the data (or medium) is in practice. I
 thought the practice was worse!

i thought your definition of silent was not caught by the on-drive
ecc.  i think this is not very likely, and we're explicitly checking
for this by running massive numbers of disks through pattern writes
with verification, and don't see it.

- erik



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread John Floren
 On Thu, 05 Jan 2012 10:07:08 PST John Floren j...@jfloren.net  wrote:
 
 For reference, I set up our current Plan 9 system about half a year
 ago.  We have 3.8 TB of Venti storage total.  We have used 2.8 GB of
 that, with basically no precautions taken to set anything +t; in
 general, if it's around at 4 a.m., it's going into Venti.  I figure we
 have roughly another 2,000 years of storage left at the current rate
 :)
 
 I first read that 2.8 GB as 2.8 TB and was utterly confused!
 
 You'd save a bunch of energy if you only powered up venti
 disks once @ 4AM for a few minutes (and on demand when you
 look at /n/dump).  Though venti might have fits! And the disks
 might too! So may be this calls for a two level venti? First
 to an SSD RAID and a much less frequent venti/copy to hard
 disks.
 
 venti doesn't have a scrub command, does it? zfs scrub was
 instrumental in warning me that I needed new disks.

Well, we need the venti disks powered on whenever we're using it,
right?  Since most of the filesystem is actually living on Venti and
fossil just has pointers to it?  Also, I think it's probably better
for disks to stay on all the time rather than go on-off-on-off.

And compared to the rest of the machine room, keeping a Coraid running
all the time isn't that big of a thing.



John




Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
 You'd save a bunch of energy if you only powered up venti
 disks once @ 4AM for a few minutes (and on demand when you
 look at /n/dump).  Though venti might have fits! And the disks
 might too! So may be this calls for a two level venti? First
 to an SSD RAID and a much less frequent venti/copy to hard
 disks.
 
 venti doesn't have a scrub command, does it? zfs scrub was
 instrumental in warning me that I needed new disks.

they're using coraid storage.  all this is taken care of for them
by the SR appliance.

- erik



Re: [9fans] Killing venti

2012-01-05 Thread Russ Cox
On Thu, Jan 5, 2012 at 12:35 PM,  smi...@icebubble.org wrote:
 run venti/sync.

 Ah.  Cool.  Gotta love those undocumented commands.  :) While probing
 the distal edges of Venti's documented functionality, I also came across
 the following, which have similar (but not identical) effect:

 hget http://$vthost:$vtwebport/flushicache
 hget http://$vthost:$vtwebport/flushdcache

 These HTTP requests initiate flushes of the index and arena block
 caches, respectively, and don't return a response until the respective
 flush is complete.

Honestly, you don't even have to run venti/sync.
Every command that writes to venti ends by doing
a sync.

You probably don't want to use those hget commands.
They are safe, of course, but it is equally safe not to
run them.  The icache in particular can take a long time to
flush, and venti will recover the entries (in less time than
the flush would have taken) the next time it starts.

Russ



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread Aram Hăvărneanu
 venti doesn't have a scrub command, does it? zfs scrub was
 instrumental in warning me that I needed new disks.

 they're using coraid storage.  all this is taken care of for them
 by the SR appliance.

Out of curiosity, how?  ZFS blocks are checksummed. ZFS scrub reads
not the physical blocks on the disks but logical ZFS blocks, and
validates their checksums.  How can the Coraid appliance determine
whether the data it reads is valid, since it works below the file
system layer and only understands physical blocks?

-- 
Aram Hăvărneanu



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
On Thu Jan  5 14:13:55 EST 2012, ara...@mgk.ro wrote:
  venti doesn't have a scrub command, does it? zfs scrub was
  instrumental in warning me that I needed new disks.
 
  they're using coraid storage.  all this is taken care of for them
  by the SR appliance.
 
 Out of curiosity, how?  ZFS blocks are checksummed. ZFS scrub reads
 not physical blocks on disks, but logical ZFS blocks and validates
 their checksum.  How can the Coraid appliance determine if the data it
 reads is valid or not since it works below the file system layer and
 only understands physical blocks?

all redundant raid types have parity (even raid 1; fun fact!).
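A sketch of how a scrub can work below the filesystem: with RAID-5-style parity, the controller XORs the data blocks of a stripe and compares the result against the stored parity block; a mismatch proves corruption, even though, unlike ZFS, it cannot say which block is wrong without more information (and with RAID 1 the "parity" check is simply comparing the mirrored copies). A toy model:

```python
from functools import reduce

def parity(blocks):
    # RAID-5-style parity: byte-wise XOR across the data blocks
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

def scrub(blocks, stored_parity):
    # a scrub pass: recompute the parity and compare with what is on disk
    return parity(blocks) == stored_parity

stripe = [b'hell', b'o wo', b'rld!']
p = parity(stripe)
assert scrub(stripe, p)                     # a clean stripe verifies

damaged = [b'hell', b'o wO', b'rld!']       # one silently flipped byte
assert not scrub(damaged, p)                # the mismatch is detected
```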

- erik



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread ron minnich
but john, the whole of your venti would easily fit in even a small
server's memory, now and forever ;)

ron



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
 For reference, I set up our current Plan 9 system about half a year
 ago.  We have 3.8 TB of Venti storage total.  We have used 2.8 GB of
 that, with basically no precautions taken to set anything +t; in
 general, if it's around at 4 a.m., it's going into Venti.  I figure we
 have roughly another 2,000 years of storage left at the current rate
 :)

in 10 years, we've managed to store only 645gb of stuff in ken fs.  so
duplicates are stored as dupes.  large chunks of it are email (from the
pre-nupas days) or imported from even older systems.  the worm is only 3tb.

the system was built last nov with the then-best-option 1tb drives.  today,
the whole worm could be put on 2 drives with raid 10.  in 2 years, when it's
time to replace the drives, the worm should be less than 2/3 full, assuming
~300% year/year acceleration in storage used.

my personal system, a mere 6 years old, has only about 12.5gb of junk
(in a 1500gb worm!).  and its drives are ripe for replacement.

by the way, thinking a bit more about the BER and scrubbing: my
3 raids have been scrubbing continuously for 3 years (so that's
hundreds of passes) and, except when i actually had a bad disk, i have
not seen a URE.

- erik



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread Bakul Shah
On Thu, 05 Jan 2012 13:43:49 EST erik quanstrom quans...@quanstro.net  wrote:
 On Thu Jan  5 13:26:16 EST 2012, ba...@bitblocks.com wrote:
  On Thu, 05 Jan 2012 13:01:52 EST erik quanstrom quans...@quanstro.net  wr
 ote:
if you read 1TB, you have 8% chance of a silent bad read
sector.  More important to worry about that in today's world
than optimizing disk space use.
   
   do you have a citation for this?  i know if you work out the
   numbers from the BER, this is about what you get, but in
   practice i do not see this 8%.  we do pattern writes all the
   time, and i can't recall the last time i saw a silent read error.
  
  Silent == unseen! Do you log RAID errors? Only way to catch them.
  
  That number is derived purely from the bit error rate (I think
  vendors base that on the Reed-Solomon code used). No idea how
  uniformly random the data (or medium) is in practice. I
  thought the practice was worse!
 
 i thought your definition of silent was not caught by the on-drive
 ecc.  i think this is not very likely, and we're explicitly checking for

Hmm. You are right!  I meant *uncorrectable* read errors
(URE), which are not necessarily *undetectable* errors (where
a data pattern switches to another pattern mapping to the same
syndrome bits).  Clearly my memory by now has had much more
massive bit-errors! Still, consumer disk URE rate of 10^-14
coupled with large disk sizes does mean RAID is essential. 

 this by running massive numbers of disks through pattern writes with
 verification, and don't see it.

Are these new disks?  The rate goes up with age.  Do SMART
stats show any new errors?  It is also possible vendors are
*conservatively* specifying 10^-14 (though I no longer know
how they arrive at the URE number!).  Can you share what you
did discover? [offline, if you don't want to broadcast]

You've probably read
http://research.cs.wisc.edu/adsl/Publications/latent-sigmetrics07.ps
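(not part of the original mail: the 8%-per-TB figure being debated falls straight out of the consumer-drive URE spec of 10^-14 per bit quoted above. a quick sanity check, assuming independent, uniformly distributed errors — itself a simplification of real drive behaviour:)

```python
import math

URE_PER_BIT = 1e-14                 # consumer-drive spec quoted in the thread
bits_per_tb = 8 * 10**12            # 1 TB read = 8e12 bits

# Expected number of unrecoverable read errors per TB read
expected_errors = bits_per_tb * URE_PER_BIT        # 0.08

# Probability of seeing at least one URE (Poisson approximation)
p_at_least_one = 1 - math.exp(-expected_errors)    # just under 0.08

print(f"expected UREs per TB read: {expected_errors:.2f}")
print(f"P(>=1 URE per TB read):    {p_at_least_one:.1%}")
```

(so "8%" is really the expected error count per TB; the chance of at least one URE is slightly under 8%, and of course only if drives actually hit the spec-sheet rate — which the reports in this thread suggest they often don't.)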



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread Bakul Shah
On Thu, 05 Jan 2012 13:50:48 EST erik quanstrom quans...@quanstro.net  wrote:
  You'd save a bunch of energy if you only powered up venti
  disks once @ 4AM for a few minutes (and on demand when you
  look at /n/dump).  Though venti might have fits! And the disks
  might too! So may be this calls for a two level venti? First
  to an SSD RAID and a much less frequent venti/copy to hard
  disks.
  
  venti doesn't have a scrub command, does it? zfs scrub was
  instrumental in warning me that I needed new disks.
 
 they're using coraid storage.  all this is taken care of for them
 by the SR appliance.

When are you going to sell these retail?!

The question was for venti though.



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
   venti doesn't have a scrub command, does it? zfs scrub was
   instrumental in warning me that I needed new disks.
  
  they're using coraid storage.  all this is taken care of for them
  by the SR appliance.
 
 When are you going to sell these retail?!

 The question was for venti though.

i'm not sure i follow.  why can't venti assume a perfect array-of-bytes
device and let the appliance take care of it?

if you care enough to get ecc memory, your data path should be
100% ecc protected.

- erik



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread Steve Simon
 You'd save a bunch of energy if you only powered up venti
 disks once @ 4AM for a few minutes (and on demand when you
 look at /n/dump).

If fossil is set up to dump to venti then it needs venti to
work at all.  Fossil is a write cache, so just after the dump
at 4am fossil is empty and consists only of a pointer to the
root of the dump in venti; all reads are then satisfied from
venti alone, until some data is written.

-Steve



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread Yaroslav
2012/1/5 Bakul Shah ba...@bitblocks.com:
 You'd save a bunch of energy if you only powered up venti
 disks once @ 4AM for a few minutes (and on demand when you
 look at /n/dump).  Though venti might have fits! And the disks
 might too! So may be this calls for a two level venti? First
 to an SSD RAID and a much less frequent venti/copy to hard
 disks.

I think you're confusing kenfs+worm with fossil+venti, in the sense
that ken fs is a complete cache for the worm while fossil is only a
write cache for venti. You need venti running all the time.

-- 
- Yaroslav



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread erik quanstrom
On Thu Jan  5 16:24:58 EST 2012, yari...@gmail.com wrote:
 2012/1/5 Bakul Shah ba...@bitblocks.com:
  You'd save a bunch of energy if you only powered up venti
  disks once @ 4AM for a few minutes (and on demand when you
  look at /n/dump).  Though venti might have fits! And the disks
  might too! So may be this calls for a two level venti? First
  to an SSD RAID and a much less frequent venti/copy to hard
  disks.
 
 I think you're confusing kenfs+worm with fossil+venti in the sense that
 ken fs is a complete cache for worm while fossil is a write cache for
 venti. You need venti running all the time.

this could only work if you assume that everything in the worm
is also in the cache, and you've configured your cache to cache the worm.
since these days the cache is often similar in performance to the worm, the
default we use is to not copy the worm into the cache.  this would just
result in more i/o.

so tl;dr: you need the worm available at all times, be it venti+fossil
or ken fs/cwfs.

- erik



Re: [9fans] venti and contrib: RFC

2012-01-05 Thread Aram Hăvărneanu
erik quanstrom wrote:
 do you have a citation for this?  i know if you work out the
 numbers from the BER, this is about what you get, but in
 practice i do not see this 8%.  we do pattern writes all the
 time, and i can't recall the last time i saw a silent read error.

Yes, the real numbers are much, much lower, but still significant
because they affect RAID reconstruction.  See this[1] paper.

An unrelated but interesting fact from that paper: nearline disks
(and their adapters) develop checksum mismatches an order of magnitude
more often than enterprise class disk drives.

[1] L. N. Bairavasundaram, G. R. Goodson, B. Schroeder, A. C.
Arpaci-Dusseau, and R. H. Arpaci-Dusseau. An Analysis of Data
Corruption in the Storage Stack. In FAST, 2008
http://www.cs.toronto.edu/~bianca/papers/fast08.pdf

-- 
Aram Hăvărneanu