Re: Howto help hans?

2006-12-04 Thread Valdis . Kletnieks
On Mon, 04 Dec 2006 13:02:15 GMT, Danny Milosavljevic said:
> What does an attorney cost for, say, 100 hours over there?

Guesstimating $200/hour (minimum), you're looking at $20K and up.  Taking
a case to trial is going to take a lot more than 100 billable hours.






Re: Which version will be merged into mainline kernel?

2006-11-11 Thread Valdis . Kletnieks
On Sat, 11 Nov 2006 15:14:18 GMT, Danny Milosavljevic said:
> I've never understood this kind of attitude some MTAs have. Usually the
> hardware would make sure that stuff doesn't disappear (UPS, powered RAM,
> harddisk condenser) and not some weird software workaround that complicates
> and slows down everything.

I've never understood this kind of attitude some filesystems have. Usually
the hardware would make sure that stuff doesn't disappear, and not some
weird software workarounds like journalling or write barriers that
complicate and slow down everything.

Now as you were saying?





Re: reiser4 experimental patch

2006-11-10 Thread Valdis . Kletnieks
On Fri, 10 Nov 2006 10:59:30 -0200, Guilherme Covolo said:
> the difference between my and Johannes Hirte's patch is:
> *
> 
> /fs/reiser4/super_ops.c
> 
> 290c290
> < static int reiser4_statfs(struct dentry *dentry, struct kstatfs *statfs)
> ---
> > static int reiser4_statfs(struct super_block *super, struct kstatfs *statfs

diff -c or diff -u please.  That way, if some unrelated thing moves the lines
up or down 1 or 2, it still applies.  Also, it's easier to look at a 'diff -u'
and understand what's going on, because you get to see 3-4 lines either side
of the changed lines.
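
To make the point about context concrete, here is a small sketch that produces
the same kind of output as 'diff -u' using Python's difflib.  The two
signatures and the path come from the quoted message; the comment line and
the function body around them are made up purely so there is some context to
show.

```python
import difflib

# Hypothetical file contents; only the two signatures are from the thread.
old = [
    "/* statfs implementation */\n",
    "static int reiser4_statfs(struct dentry *dentry, struct kstatfs *statfs)\n",
    "{\n",
    "\treturn 0;\n",
    "}\n",
]
new = list(old)
new[1] = "static int reiser4_statfs(struct super_block *super, struct kstatfs *statfs)\n"


def unified(a, b, context=3):
    """Return a unified diff (like 'diff -u') with `context` lines either side."""
    return list(difflib.unified_diff(
        a, b,
        fromfile="a/fs/reiser4/super_ops.c",
        tofile="b/fs/reiser4/super_ops.c",
        n=context,
    ))


if __name__ == "__main__":
    print("".join(unified(old, new)))
```

The lines that start with a single space are the unchanged context; they are
what lets patch(1) find the right spot even if the line numbers have drifted.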

> i change my super_ops.c but why you alter the int to ssize_t on item.h?

ssize_t isn't an int on some architectures, it's a 'long'.  As a result, if
you reference a 32-bit value where you should use a 64-bit one, you'll almost
certainly end up with something unexpected (probably an oops).
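
A quick way to see the size difference from userspace, sketched with Python's
ctypes (this only illustrates the width mismatch, it is not kernel code, and
the helper name truncate_to_32 is made up for the example):

```python
import ctypes


def truncate_to_32(value):
    """Store `value` in a signed 32-bit C integer.

    ctypes does no overflow checking, so the high bits are silently dropped,
    which is roughly what happens when an int is used where ssize_t is needed.
    """
    return ctypes.c_int32(value).value


if __name__ == "__main__":
    print("sizeof(int)     =", ctypes.sizeof(ctypes.c_int))
    print("sizeof(ssize_t) =", ctypes.sizeof(ctypes.c_ssize_t))
    big = (1 << 32) + 5          # needs more than 32 bits
    print(big, "->", truncate_to_32(big))
```

On a typical 64-bit Linux box the two sizes differ, and a value that does not
fit in 32 bits comes back as something else entirely.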




Re: reiser4 experimental patch

2006-11-09 Thread Valdis . Kletnieks
On Thu, 09 Nov 2006 17:23:20 -0200, Guilherme Covolo said:
> hello guys,
> 
> my experimental patch need modfications on fs/reiser4/context.c
> 
> i need help ;)

You'll have to give us more info than that.  What happened?

Patch reject? It didn't compile? It didn't modprobe? The resulting kernel
didn't boot? The resulting kernel oopsed? Other? 





Re: reiser4 cryptcompress test setup: bug

2006-11-06 Thread Valdis . Kletnieks
On Mon, 06 Nov 2006 19:27:57 GMT, Danny Milosavljevic said:
> Hi Edward,
> 
> I finally tried your cryptcompress setup (2.6.18-mm3) and just did the
> first evil thing I could think of:
> 
> the reiser4 partition with ccreg40 enabled is /dev/sda1 (/mnt/tmp/)
> (mkfs'ed, thus empty).

> [  372.564358]  [] ext3_dirty_inode+0x30/0x90
> [  372.564364]  [] cp_new_stat64+0xf9/0x110

You might want to check /home sometime - this looks like an ext3 botch
rather than a reiserfs botch, unless reiser4 is stomping on something
behind ext3's back.




Re: Hans Reiser arrested...

2006-10-12 Thread Valdis . Kletnieks
On Wed, 11 Oct 2006 13:32:22 EDT, Toby Thain said:

> He's in custody of the police; apparently even his lawyer can't see him.

"Even his lawyer can't see him" is the sort of thing that only happens
in 3rd world countries with a shaky grasp on human rights.

Of course, this *is* the US, so maybe in fact Hans isn't being allowed
to see his lawyer...




Re: Reiser FS will not boot after crash

2006-09-04 Thread Valdis . Kletnieks
On Mon, 04 Sep 2006 23:33:27 +0400, "Vladimir V. Saveliev" said:

> after unclean shutdown journal replay is necessary to return reiserfs to
> consistent state. Maybe GRUB did not do that?

A case can be made that GRUB should be keeping its grubby little paws off
the filesystem journal.  It's a *bootloader*.  Its only purpose in life is
to load other code that can make intelligent decisions about things like
how (or even whether) to replay a filesystem journal.




Re: data corruption with 2.4.25 and datalogging patches

2006-07-17 Thread Valdis . Kletnieks
On Mon, 17 Jul 2006 12:12:22 PDT, Hans Reiser said:
> It seems like bad memory is growing as a percentage of user filesystem
> problem sources.   Do others have that feeling also?

Assuming the chance of any given 16 megabit (or whatever size it is) RAM chip
having a flaky bit is identical, the chance of bad memory in any given
gigabyte of RAM is the same... and if you have 1/2G of memory, you have
roughly 1/8 the chance of a bad bit compared to having 4G installed.
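
A back-of-the-envelope sketch of that arithmetic, assuming chips fail
independently and picking an arbitrary per-chip probability (the 1e-6 figure
is invented for illustration, not a measured rate):

```python
def p_any_bad(p_chip, n_chips):
    """Probability that at least one of n independent chips has a flaky bit."""
    return 1.0 - (1.0 - p_chip) ** n_chips


CHIP_MBIT = 16                          # chip size mentioned in the post
CHIPS_PER_GB = 8 * 1024 // CHIP_MBIT    # 8192 Mbit per gigabyte -> 512 chips
P_CHIP = 1e-6                           # assumed, purely illustrative

half_gig = p_any_bad(P_CHIP, CHIPS_PER_GB // 2)
four_gig = p_any_bad(P_CHIP, CHIPS_PER_GB * 4)

if __name__ == "__main__":
    print("0.5G:", half_gig)
    print("4G:  ", four_gig)
    print("ratio:", half_gig / four_gig)
```

The ratio is only approximately 1/8; it is close to that as long as the
per-chip probability is small, and drifts away from it as the probability
grows.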

The bigger question is why ECC isn't catching this stuff (and yes,
I know some hardware doesn't do ECC on all data paths, which is the point :)




Re: [PATCH] reiserfs: fix handling of device names with /'s in them

2006-07-17 Thread Valdis . Kletnieks
On Mon, 17 Jul 2006 11:27:20 PDT, Hans Reiser said:
> [EMAIL PROTECTED] wrote:
> >On Sun, 16 Jul 2006 20:02:27 PDT, Hans Reiser said:

> >>Create a mountpoint which knows how to resolve a/b without using a
> >>"directory".

For wanting to resolve it *without* using a "directory"...

> >And said mountpoint gets past the '/' interpretation in the VFS, how, 
> >exactly?
> >
> >fs/namei.c, do_path_lookup() does magic on a '/' on about the 3rd line.
> >So you're going to get handed 'a'.

> It does not need to be so complex actually,  Just create a plain old
> parent directory just like every other parent directory in procfs.

This smells a lot like using a directory to resolve it...




Re: [PATCH] reiserfs: fix handling of device names with /'s in them

2006-07-17 Thread Valdis . Kletnieks
On Sun, 16 Jul 2006 20:02:27 PDT, Hans Reiser said:

> Create a mountpoint which knows how to resolve a/b without using a
> "directory".

And said mountpoint gets past the '/' interpretation in the VFS, how, exactly?

fs/namei.c, do_path_lookup() does magic on a '/' on about the 3rd line.
So you're going to get handed 'a'.
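
A toy model of why that happens.  This is emphatically not the kernel's code,
just a sketch of a lookup loop that splits on '/' before the filesystem gets
a say; walk() and record_lookup() are hypothetical names.

```python
def walk(path, lookup):
    """Split `path` on '/' and hand one component at a time to `lookup`.

    Returns the list of component names the filesystem callback was given.
    """
    seen = []
    node = "<root>"
    for component in path.split("/"):
        if not component:       # skip empties from leading or doubled '/'
            continue
        seen.append(component)
        node = lookup(node, component)
    return seen


def record_lookup(parent, name):
    """Hypothetical filesystem callback: just builds a label for the child."""
    return parent + "/" + name


if __name__ == "__main__":
    print(walk("a/b", record_lookup))
```

The callback is asked about 'a' and then about 'b'; it never gets to see the
string 'a/b' as a single name.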




Re: any way to disable fsync?

2006-07-11 Thread Valdis . Kletnieks
On Tue, 11 Jul 2006 17:04:56 PDT, Hans Reiser said:
> There are legitimate applications where the value of data is low enough
> and the load is high enough, that losing the database upon crash is ok.
>
> I have mixed feelings about making it a mount option for reiser4 because
> many users will not know what they do.  In the end though, I should just
> sell the rope and advise but not control what people do with it.   If
> someone writes it I will take a mount option patch to disable fsync iff
> it comes with documentation that has a lot of warnings.

Two things to consider before writing code:

1) Should it be done at the VFS level instead of in the filesystem?
Architecturally, it might be better there, so it applies to ext3 and jfs
and others too... 

2) Alternatively, should it be done on a per-file basis (possibly
flagged with a chattr or similar)?  It can't be done as an open()
flag or ioctl(), because you're trying to override what the code does...
That way, you can mitigate any fsync() load caused by one file, and
still not leave yourself open to being screwed by some other application
that tries to fsync() in other directories on that filesystem.  It
would be Really Bad if /home/fred/db23.sqlite gets corrupted because
the filesystem was mounted -nofsync because of /home/george/moby.sqlite
overhead.




Re: any way to disable fsync?

2006-07-11 Thread Valdis . Kletnieks
On Tue, 11 Jul 2006 23:03:12 +0200, Łukasz Mierzwa said:
> I got problem with apps that are calling fsync, it makes my hard drive  
> flush like mad and it slows down things quite a lot.

Several have posted how to bypass it.  I'll pose the opposite side:

Usually, applications call fsync() because they're pretty sure that if
the disk and in-memory copies aren't lined up, a crash at that point could
result in data loss and/or corruption.

So sqlite calls fsync() - probably because if it *doesn't*, and your
system crashes/reboots, you *will* lose that sqlite database.

Your data, your decision.
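
For reference, this is roughly the pattern an application follows when it
cares about the on-disk copy, sketched in Python (the helper name
durable_write is made up; sqlite's real code is C and considerably more
involved):

```python
import os
import tempfile


def durable_write(path, data):
    """Write `data` and ask the kernel to push it to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()               # drain the userspace buffer into the kernel
        os.fsync(f.fileno())    # the call that makes the disk busy


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "example.db")
    durable_write(target, b"committed record\n")
    print("wrote", target)
```

Dropping the os.fsync() line makes the function return sooner, and leaves a
window in which a crash loses data the caller believes is safe.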




Re: Re: alman birası oettinger türkiyede

2006-05-15 Thread Valdis . Kletnieks
On Mon, 15 May 2006 13:50:49 +0200, Lars Grobe said:
> Ok, this is the first time of my life that I was really pleased by what I
> read in a spam mail. As a German living in Istanbul, I will translate: The
> cheapest beer available in Germany will be sold in Turkey now, too :-)

If it weren't for the fact that even *cheap* German beer beats most US beer,
I'd say you got the wrong criterion for being overjoyed... ;)




Re: bad bread

2006-05-09 Thread Valdis . Kletnieks
On Tue, 09 May 2006 00:18:32 +0200, PFC said:

>   Linux RAID has a special option for that: you can trigger a check, which
> will re-read the entire disks and, if a read error occurs, re-write the
> failing sector with good data from the other drives in the RAID. The drive
> with the bad sector will then remap it to another sector.

If you have 2 mirrored disks, and are replacing one, you don't have a good
block to read it from.  The failure mode was a RAID controller that didn't
properly handle re-writing the bad block on the first disk, so when the
second disk got a bad block, you were screwed.





Re: bad bread

2006-05-08 Thread Valdis . Kletnieks
On Sun, 07 May 2006 10:35:44 +0200, PFC said:
> 
> > In the event of physical HD failure, the procedure goes like this:
> 
>   Get mail saying a HDD is dead. Replace harddisk, resynchronize RAID.
>   Use Linux software RAID. Harddrives are cheaper than the time you'll lose
> trying to recover your data.

Remember to take backups *anyhow*.  That way, if the RAID controller dumps
cow manure on all the sectors, you won't be saying "Oh, SH*T".

Also, note that there exist buggy RAID controllers, where if you are doing
mirroring to 2 disks, and they develop bad blocks at different locations,
you can trash the mirror by resynchronizing (basically, you swap out one of
the bad disks, re-sync, it progresses as far as the bad block on the source
for the mirror, and dies).
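
A small simulation of that failure mode, under the assumption that each disk
is just a list of blocks and a set of unreadable block numbers.  Both
functions are hypothetical models, not any real controller's logic.

```python
def naive_resync(source, bad_on_source):
    """Copy `source` to a fresh disk, giving up at the first unreadable block.

    Models the buggy behaviour: one bad block on the surviving disk kills
    the whole rebuild.
    """
    copied = []
    for i, block in enumerate(source):
        if i in bad_on_source:
            raise IOError("read error on source at block %d" % i)
        copied.append(block)
    return copied


def scrub(disk_a, bad_a, disk_b, bad_b):
    """Repair each disk's bad blocks from the other while both are present."""
    for i in range(len(disk_a)):
        if i in bad_a and i in bad_b:
            raise IOError("block %d lost on both disks" % i)
        if i in bad_a:
            disk_a[i] = disk_b[i]
        elif i in bad_b:
            disk_b[i] = disk_a[i]
    return disk_a, disk_b
```

With bad blocks at different locations, scrub() recovers everything because
each block is still readable somewhere.  Pull one disk first and
naive_resync() dies at the other disk's bad block, which is why the backups
matter.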





Re: Transparent Compression

2006-05-05 Thread Valdis . Kletnieks
On Fri, 05 May 2006 10:37:40 +0200, Jonathan Carter said:
> I've read about ReiserFS's built-in compression, and I've been excited
> about it for a long time, but haven't figured out how to activate it
> yet. I've googled and looked on the reiserfs website, but couldn't find
> any information on how to do it.
> 
> Can anyone please tell me how, or point me to the appropriate
> documentation?

Step 1: Wait till the code is actually released.

I don't think Reiser-with-compression is a configuration that is
buildable by mere mortals at the current time.





Re: -- warning("vs-44", "out of memory?"); --[ addendum ]--

2006-04-06 Thread Valdis . Kletnieks
On Thu, 06 Apr 2006 09:07:42 +0200, Roy Lanek said:

> the freezing, for once. The system is otherwise DECENTLY
> usable really, provided one does not start to use the
> GIMP, Open Office or similar. 

You have much lower standards of usability than many of us.

"Client 2: I don't know we need to worry too much about strengthening that.
After all, these are not meant to be luxury flats.

Client 1: Absolutely. If we make sure the tenants are of light build and
relatively sedentary and if the weather's on our side, I think we have a winner
here."
-- Monty Python, the Architect sketch.







Re: Reiser4 crash 2.6.16-mm1

2006-03-27 Thread Valdis . Kletnieks
On Mon, 27 Mar 2006 14:32:14 PST, Joe Feise said:

> Thanks for the suggestion. I haven't run a memtest, but I don't really think 
> that the memory is bad. The machine most likely would have had other issues 
> if that was the case.

You'd be *amazed*.  Intermittently weak memory (especially if it's just one bad
bit) can manifest in the most odd ways.

In fact, if you think about it, if it's bad memory, your trashed reiser4
partition could very well *be* that "would have had other issues" that you
said you'd see if it was bad memory. ;)




Re: Static overrun in reiser3

2006-03-15 Thread Valdis . Kletnieks
On Wed, 15 Mar 2006 14:01:22 PST, Hans Reiser said:
> Jeff Mahoney wrote:

> > Ah, sorry, all I can do is review their database. I can't actually run
> > the checker myself.
> 
> Ah, so there is a database somewhere that we can look at?

You may have missed out - I think the Coverity guys only did the in-Linus-tree
stuff, and I'm pretty sure they didn't cover the -mm branch.  But I could be
wrong...




iosched (was Re: Full of surprises - A reiser4 story from userland)

2005-09-28 Thread Valdis . Kletnieks
On Wed, 28 Sep 2005 22:13:52 +0300, Islam Amer said:

> BTW, Previously I had amazing performance with anticipatory
> IO-scheduler ( even more so with genetic anticipatory ) any comments
> on this io-scheduler business, as it stirred up some commotion before.
> Is the performance boost an illusion or is it not.

The performance boost for any of the provided iosched schemes can be
positive, negative, imaginary, or complex(*), depending on the actual workload
of the system, and what reference patterns it generates.

There are 4 in-tree schedulers precisely because each of them has a clear-cut
advantage for some statistic (be it throughput, or latency, or CPU overhead, or
whatever) for some identified workload type.

(*) I suspect that (benchmarks being benchmarks) the chance that the boost
is totally real, with no imaginary component, is very slim.  And everybody
knows that most benchmark results are complex to interpret.. :)




Re: Will I need to re-format my partition for using the compression plugin?

2005-09-22 Thread Valdis . Kletnieks
On Thu, 22 Sep 2005 18:13:23 EDT, Gregory Maxwell said:

> It would normally seem silly to use RSA for disk encryption... but
> there might be applications, although you'd still never use RSA
> directly on user controlled data.  For example, RSA could be used on a
> multi user server to append mail to a mail file so that once written
> the data is only accessible once the user logs on.  The reiser4 crypto
> system will use the kernel keyring api, so it would be quite
> reasonable to tie encryption to user accounts. 'write only' files and
> 'read only' files would be a simple logical extension, and would
> require asymetric cryptography.

In fact, RSA would *still* be a poor choice there - the CPU costs go up
exponentially with the size of the object encrypted.  And if you have a 64K
file, that means if you use RSA directly, you get to do mathematics with
524,288 bit numbers.  Yep, multiply a 524,288 bit number by a 1024 bit number
and then compute the remainder when divided by another 1024 bit number. Lather,
rinse, repeat. ;)

You know how sites that do a lot of SSL buy special hardware accelerators?
The only *real* benefit they give you is offloading the CPU cost of doing
RSA over a 128-bit or so session key.

OK. Got that?  Doing RSA over a 16 byte file "costs" as much as opening a
standard 128-bit encryption SSL connection (because it's basically the same
thing).  And a 17 byte file costs you a lot more than 8 times as much.  And a
32 byte file isn't 16 times worse, it's *hundreds* of times worse.

That's why *nobody* uses RSA for anything other than securing a good-sized
symmetric session key.  So for this use, you'd use RSA to secure the file's
actual symmetric key (and possibly things like the initialization vectors).
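
A toy sketch of that division of labour, assuming textbook RSA with no
padding and deliberately tiny parameters (two Mersenne primes chosen only so
the arithmetic is easy to check).  The names wrap_key and unwrap_key are made
up here, nothing below is secure, and none of it reflects what reiser4
actually does.  Only the 16-byte session key goes through RSA; the file
contents would go through a symmetric cipher, which is not shown.

```python
# Toy parameters, NOT secure: the modulus is about 196 bits, just big enough
# to hold a 128-bit session key.
P = 2**89 - 1
Q = 2**107 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def wrap_key(session_key):
    """Encrypt a 16-byte symmetric key with textbook RSA."""
    m = int.from_bytes(session_key, "big")
    if m >= N:
        raise ValueError("RSA input must be smaller than the modulus")
    return pow(m, E, N)


def unwrap_key(ciphertext):
    """Recover the 16-byte symmetric key."""
    return pow(ciphertext, D, N).to_bytes(16, "big")


if __name__ == "__main__":
    key = bytes(range(16))
    assert unwrap_key(wrap_key(key)) == key
    print("session key round-trips through RSA")
```

The RSA work is the same no matter how big the file is, because only the key
ever touches it.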

(Note to designers - those pesky IV's are a *lot* trickier to get right
than you might think.  For instance, there's a known watermarking attack
against the current cryptoloop implementation in the kernel that allows
an attacker to prove the existence of data on the disk even without the
key - so a DRM scheme could find watermarked data even *after* encryption).

> Although for most compression algorithms not all inputs are valid
> outputs, so this may not work for you... It would be ideal (for disk
> encryption) if it were not possible to tell if you have the right key
> without decrypting an entire sector. This requires careful selection
> in compression and chaining mode.

In fact, Hamming distance considerations imply that usually you don't
need to decrypt more than 1 or 2 (*maybe* 3) blocks the size of the
symmetric cipher's blocksize.  For something like AES-256 (16-byte blocks,
whatever the key size), you can probably be sure in 16 bytes (1 block), very
sure in 32 bytes, and totally sure in 48 bytes (unless the attacker has the
misfortune to be trying to decrypt a file that has actual structure on the
same order as /dev/random output).

>   Alternatively, it may be possible
> to develop a good large block cipher which while being much slower
> than a single block of a small-block cipher, is faster for a disk
> block.  For example, mercy is about 4x faster than AES on my system
> but is still 16x slower for the smallest unit of decryption than AES.
> Unfortunately mercy has security problems.

Tough design challenge there.

The problem is that if you have a cipher that can handle 512-*byte* input
blocks, it's going to probably stomp on a *lot* of L1 and L2 cache lines.
And you can't even rely on the usual pre-expansion tricks because that adds
even *more* to cache pressure.

Another desirable property of symmetric ciphers is that they tend to change
about half the output bits for a single-bit input change, and in an
unpredictable manner.  This ends up meaning that you'll probably need
O(log2 N) rounds, and more likely closer to O(N) rounds, to mix the pool.
Gonna be a *lot* of rounds for a 512-byte block. ;)
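
The "about half the output bits" property is easy to measure.  The sketch
below uses SHA-256 from Python's hashlib as a stand-in, since the standard
library has no block cipher; a hash is not a cipher, it just shows the same
kind of mixing over a 512-byte input.

```python
import hashlib


def bits_changed(data, bit_index):
    """Flip one input bit; count how many of the 256 output bits change."""
    flipped = bytearray(data)
    flipped[bit_index // 8] ^= 1 << (bit_index % 8)
    a = int.from_bytes(hashlib.sha256(data).digest(), "big")
    b = int.from_bytes(hashlib.sha256(bytes(flipped)).digest(), "big")
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    block = bytes(512)                  # one all-zero 512-byte "sector"
    for bit in (0, 1, 2047, 4095):
        print("flip bit", bit, "->", bits_changed(block, bit), "of 256 changed")
```

A well-mixed function should land somewhere near 128 for every input bit,
including the ones at the far end of the block.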

> > 2) Even though most modern block ciphers are designed to be fast, it's still
> > faster to apply a reasonably quick compression scheme to whomp 16K of data
> > down to 5-6K and encrypt/decrypt 5-6k than it is to encrypt/decrypt 16K.
> 
> Depends on the compression mode and the cipher. A good AES
> implementation is around the same speed as an aggressive gzip. In
> general this is correct.

That's why you don't use an *aggressive* gzip, but use 'gzip -3' instead. :)





Re: Will I need to re-format my partition for using the compression plugin?

2005-09-22 Thread Valdis . Kletnieks
On Thu, 22 Sep 2005 16:54:12 EDT, michael chang said:
> On 9/22/05, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> > 2) Even though most modern block ciphers are designed to be fast, it's still
> > faster to apply a reasonably quick compression scheme to whomp 16K of data
> > down to 5-6K and encrypt/decrypt 5-6k than it is to encrypt/decrypt 16K.
> 
> Two questions.  One, does this mean that compression will usually be
> performed before encryption (which to me, sounds like it appears to be
> what would be the best method here)?

Yes, the general rule is "compress, then encrypt" and "Decrypt then decompress".

The corner cases where it should be the other way around are so few and far
between that it's not worth worrying about - they basically center around those
few times when compression causes a *larger* payload, and can be dealt with
by a simple "Don't compress if output is bigger than input" rule.
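
That rule fits in a few lines.  A sketch using zlib, where the one-byte flag
format and the names pack/unpack are invented for the example, and the
encryption step (which would be applied to pack()'s output) is left out:

```python
import os
import zlib


def pack(payload, level=3):
    """Compress unless that would grow the payload.

    The first byte records the choice: 1 = compressed, 0 = stored as-is.
    """
    compressed = zlib.compress(payload, level)
    if len(compressed) < len(payload):
        return b"\x01" + compressed
    return b"\x00" + payload


def unpack(blob):
    """Inverse of pack()."""
    flag, body = blob[:1], blob[1:]
    return zlib.decompress(body) if flag == b"\x01" else body


if __name__ == "__main__":
    html = b"<tr><td>cell</td></tr>\n" * 500
    noise = os.urandom(4096)
    print("html :", len(html), "->", len(pack(html)))
    print("noise:", len(noise), "->", len(pack(noise)))
```

Repetitive input shrinks a lot; random-looking input gets stored untouched at
the cost of one flag byte.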




Re: Will I need to re-format my partition for using the compression plugin?

2005-09-22 Thread Valdis . Kletnieks
On Thu, 22 Sep 2005 15:11:59 CDT, David Masover said:

> > Because sometimes it is useful to compress data before encryption since
> > compression destroys vulnerable regular structure of some special files
> > (like *.html)
> 
> Although I'd imagine some algorithms are fairly resistant against that 
> (RSA, maybe?), the main reason is simple -- encryption tends to 
> introduce randomness.  If the crypto is any good at all, you won't be 
> able to compress very well after you've encrypted.

1) RSA is useless for this - you really need a symmetric block cipher of some
sort.  Almost all block ciphers are best used with maximum-entropy input - if
the attacker can lop out a large part of the keyspace, a brute force attack
becomes a lot easier.  This is somewhat related to the concept of "Hamming
Distance". If the attacker tries a brute force attack, and the first 8 bytes of
the output look like valid HTML, or English text, or anything else
recognizable, he's almost certainly found the correct key.  On the other
hand, well-compressed data has very high entropy - as a result, it becomes
harder to tell if a correct key has been found.  If it's English text, but
3 of the first 8 bytes have the high bit set, it's probably not a correct key.
If it's compressed, 3 flipped bits in the first 8 bytes will probably still
represent a valid compressed stream - just of something else wildly different.

2) Even though most modern block ciphers are designed to be fast, it's still
faster to apply a reasonably quick compression scheme to whomp 16K of data
down to 5-6K and encrypt/decrypt 5-6k than it is to encrypt/decrypt 16K.
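
Both halves of point 1 can be poked at directly.  The sketch below shows the
cheap "does this look like text?" test an attacker could run on the first 8
bytes of a trial decryption, plus a byte-entropy estimate for comparing plain
and compressed data.  The function names and the sample text are made up for
illustration.

```python
import math
import zlib
from collections import Counter


def looks_like_ascii_text(block):
    """True if none of the first 8 bytes has the high bit set."""
    return all(b < 0x80 for b in block[:8])


def entropy_bits_per_byte(data):
    """Shannon entropy of the byte distribution, in bits per byte (max 8)."""
    counts = Counter(data)
    total = len(data)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    text = b" ".join(str(i).encode() for i in range(5000))
    packed = zlib.compress(text)
    for name, blob in (("plain", text), ("compressed", packed)):
        print(name, looks_like_ascii_text(blob),
              round(entropy_bits_per_byte(blob), 2))
```

The plain text passes the high-bit test and has low entropy, so a wrong key
is easy to spot; the compressed stream gives the attacker far less to go on.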





Re: Will I need to re-format my partition for using the compression plugin?

2005-09-22 Thread Valdis . Kletnieks
On Fri, 23 Sep 2005 00:03:32 +0400, Edward Shishkin said:

> Checksumming means low performance: in order to read some bytes of such a
> file you will first need to read the whole file to check the checksum
> (isn't it?).

No.  Almost all modern networking gear is *perfectly* able to do incremental
updates of the checksum.  See this RFC:

1141 Incremental updating of the Internet checksum. T. Mallory, A.
 Kullberg. Jan-01-1990. (Format: TXT=3587 bytes) (Updates RFC1071)
 (Updated by RFC1624) (Status: INFORMATIONAL)
http://www.ietf.org/rfc/rfc1141.txt

The method is trivially extensible to other CRC schemes - and in fact, the
triviality is the entire reason why cryptographically strong hashes like MD5 or
the SHA family are interesting at all.  (I've seen more than one definition of
"cryptographically strong hash" as being basically a CRC function that does
*not* permit incremental updating)
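
For the Internet checksum itself, the incremental update is short enough to
show in full.  This sketch uses the formulation from RFC 1624 (the document
that updates RFC 1141), and the function names are made up for the example.

```python
def ones_sum(words):
    """One's-complement sum of 16-bit words, with end-around carry."""
    total = 0
    for w in words:
        total += w
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(words):
    """Internet checksum (RFC 1071): complement of the one's-complement sum."""
    return ~ones_sum(words) & 0xFFFF


def update(old_csum, old_word, new_word):
    """Incremental update per RFC 1624, eqn. 3: HC' = ~(~HC + ~m + m')."""
    return ~ones_sum([~old_csum & 0xFFFF,
                      ~old_word & 0xFFFF,
                      new_word]) & 0xFFFF


if __name__ == "__main__":
    header = [0x4500, 0x0073, 0x0000, 0x4000, 0x4011,
              0x0000, 0xC0A8, 0x0001, 0xC0A8, 0x00C7]
    before = checksum(header)
    changed = list(header)
    changed[4] = 0x3F11                 # e.g. a router decrementing the TTL
    assert update(before, header[4], changed[4]) == checksum(changed)
    print("incremental update matches full recomputation")
```

Only the old checksum, the old word and the new word are needed; the rest of
the data never has to be read again.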





Re: I request inclusion of reiser4 in the mainline kernel

2005-09-20 Thread Valdis . Kletnieks
On Tue, 20 Sep 2005 17:17:13 EDT, "Theodore Ts'o" said:
>
> An exit code of 1 means that filesystem errors were corrected
> (successfully).  

Right.  The problem is that this was a *second* check, after the first one
terminated with exit code 0, 1, or 2.  Thus, it *should* have exited with 0.

The *first* check lied - if there were unfixed errors, it should have exited
with exit 4.
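
For anybody scripting this kind of test, the exit status is a bitmask, so it
helps to decode it rather than compare against a single number.  A sketch,
with the bit meanings taken from the fsck(8) man page and the helper names
invented here:

```python
FSCK_BITS = {
    1: "errors corrected",
    2: "system should be rebooted",
    4: "errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "cancelled by user request",
    128: "shared library error",
}


def decode(status):
    """Return the list of conditions encoded in an fsck exit status."""
    found = [text for bit, text in sorted(FSCK_BITS.items()) if status & bit]
    return found or ["no errors"]


def second_pass_ok(first, second):
    """If the first pass claimed success, the second should find nothing."""
    return first < 4 and second == 0


if __name__ == "__main__":
    for status in (0, 1, 4, 5):
        print(status, decode(status))
```

By that reading, a first pass returning 1 followed by a second pass also
returning 1 fails second_pass_ok(), which is the complaint being made here.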




Re: I request inclusion of reiser4 in the mainline kernel

2005-09-20 Thread Valdis . Kletnieks
On Tue, 20 Sep 2005 23:28:12 +0400, Roman I Khimov said:

> Maybe I'm doing something wrong here, but ext2 have failed on second check
> of first pass with
> 
> Second check...
> e2fsck 1.34 (25-Jul-2003)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure

> fsck.damaged: * FILE SYSTEM WAS MODIFIED *
> fsck.damaged: 1345/25064 files (1.7% non-contiguous), 94063/10 blocks
> fsck lied about its success (result = 1)

What was the return value and output from the *first* fsck? 




Re: I request inclusion of reiser4 in the mainline kernel

2005-09-18 Thread Valdis . Kletnieks
On Sun, 18 Sep 2005 22:16:11 PDT, Hans Reiser said:

> Hellwig, people who write slow file systems should not lecture their
> measurably superiors on how to code.  Oh, and I should mention that
> other people besides me have measured reiser4, and concluded it is twice
> the speed of the other Linux filesystems, so don't go claiming it is
> just my benchmarks.   What you are doing is keeping me from doing a real
> code review myself by keeping my guys so busy that they don't have time
> to review the fixmes I inserted and would insert more of if I thought
> they had time for them.

Hans, unfortunately the most obvious reading of the above is "Reiser4 is so
damned fast because it doesn't bother doing sanity-checking".  If there's still
more "fixmes" to be inserted that *you* know of, and there are so many that
there's no time to fix them, why is this being submitted for inclusion?

On Sun, 18 Sep 2005 22:09:08 PDT, Hans Reiser said:
> Of course, the reiser4 code is not as stable as it was before the
> changes Christoph asked for.

This sort of claim requires proof - can you point at *specific* things that
were less stable after you fixed the code, including explaining why they're
less stable?




Re: I request inclusion of reiser4 in the mainline kernel

2005-09-18 Thread Valdis . Kletnieks
On Sun, 18 Sep 2005 13:22:27 EDT, michael chang said:

> Give Hans a chance; and please try to understand, even if he's hard to
> work with.  Discriminate him because he's not a developer you can talk
> with, and I believe that's like discriminating a guy in a wheelchair
> because he can't run with you when you jog in the morning.

There's nothing wrong with discriminating against the guy in the wheelchair
under some circumstances - for instance, when your track team needs a new
high jumper.

Similarly, when the goal is to build a set of developers that can actually
get work accomplished, poor interpersonal communication skills can be a
major problem.

If the problem is that Hans and the rest of the kernel developers don't get
along, perhaps the most expedient thing would be for Hans to step out of the
way and have somebody else from Namesys (or elsewhere even) act as the
interface.




Re: we have got hash function screwed up

2005-09-11 Thread Valdis . Kletnieks
On Mon, 12 Sep 2005 00:49:54 +0200, evilninja said:
> [EMAIL PROTECTED] schrieb:
> > Yes, I know there's a need to support borked legacy filesystems that were
> > mkfs'ed before the problem was recognized.  That means fsck.reiserfs needs
> > to know about it - but mkfs.reiserfs??  Seen in the Fedora Core devel tree
> > as of tonight:
> 
> yeah, this "feature" of mkfs.reiserfs could be removed, but since reiserfs
> is in bugfixes-only-mode i don't see it happen.

I'd consider a #if 0/#endif to remove known busticated code a bugfix. ;)




Re: we have got hash function screwed up

2005-09-10 Thread Valdis . Kletnieks
On Sat, 10 Sep 2005 17:36:49 +0200, evilninja said:
> Gabor HALASZ schrieb:
> > Sep  5 12:30:24 sk8n kernel: ReiserFS: dm-10: checking transaction log (dm-10)
> > Sep  5 12:30:24 sk8n kernel: ReiserFS: dm-10: Using rupasov hash to sort names
> 
> why did you choose the rupasov hash?
> http://www.namesys.com/mount-options.html knows:
> 
> rupasov: [...] Never use it, as it has high probability of hash
>  collisions.

Why is it selectable then?

Yes, I know there's a need to support borked legacy filesystems that were mkfs'ed
before the problem was recognized.  That means fsck.reiserfs needs to know about
it - but mkfs.reiserfs??  Seen in the Fedora Core devel tree as of tonight:

% rpm -q reiserfs-utils
reiserfs-utils-3.6.19-2
% strings /sbin/mkfs.reiserfs  | grep -i rupasov
rupasov
  -h | --hash rupasov|tea|r5   hash function to use by default





Re: Reiser4 and ACLs

2005-08-14 Thread Valdis . Kletnieks
On Sun, 14 Aug 2005 05:38:40 PDT, Marc Perkel said:
> btw - is Reiser4 still going to get merged into 2.6.13?

It's not in 2.6.13-rc6, and I doubt Linus is going to blop *that* big
a chunk of code in this late - it's already well into the "is this 3-liner
too drastic" phase.

What happens when the 2.6.14 tree opens is up to Linus and Andrew.





Re: reiser4 on 2.6.13-rc6-realtime-preempt

2005-08-12 Thread Valdis . Kletnieks
On Fri, 12 Aug 2005 12:09:03 +0200, gimpel said:

> reiser4 again. Maybe the is to wait for stable 2.6.13 before doing
> tests with realtime-preempt as it gets updated twice a day.
> And i so much hope the kernel guys decide to merge reiser4.

Well, reiser4 can't possibly make it into 2.6.13, as we're at -rc6 already
and Linus asked for a "quiet down" several -rc ago.  What happens when the
tree opens for 2.6.14 is a different question that I can't answer.





Re: reiser4 performance

2005-08-09 Thread Valdis . Kletnieks
On Tue, 09 Aug 2005 13:52:49 EDT, michael chang said:

> Striped RAID only works if you have multiple disks and a decent bus. 
> I'm stuck on the lowest-end Dell Dimension 3000, with one of the
> slowest hard drives in history.  And I haven't gotten around to
> opening the case... yet.

Newbie. ;)

IBM 2314 disk drive for the S/360, late 60s. 10 14" platters, 3600RPM, 29M of
storage capacity, 650Kbytes/second transfer rate.  And that was a fast
mainframe drive for its day.

Now what was this about slow tiny drives? ;)

And if you think a seek hurts latency on modern disk drives, you should have
seen what an end-to-end seek did on a filesystem on a DECTape (yes, the tape
had addressable blocks, you could (and many people did) put a filesystem on it).




Re: reiser4 plugins

2005-06-29 Thread Valdis . Kletnieks
On Wed, 29 Jun 2005 16:58:20 +0300, Markus Törnqvist said:
> What pisses me off is the fact that Gnome and friends implement
> their own incompatible-with-others VFS's and automounters and
> stuff.

The fact that things like Gnome, which are basically consumers of their own
dogfood, have incompatible versions says very loudly that there's no consensus
on the semantics.

> Surely supporting this in the kernel and extending the LSB
> to require this is the best step to take without infringing
> anyone's freedom as such.

First we need to decide *if* it's to be supported, then *what* to support.



Re: Reiser4 + seekdir()

2005-06-29 Thread Valdis . Kletnieks
On Wed, 29 Jun 2005 14:22:05 +0400, Vladimir Saveliev said:
 
> Existence of various plugins assumes that user is able to choose
> whatever is suitable for him. Or create his own plugin if none of
> existing ones satisfies him.
> If user cares a lot about using telldir/seekdir he is supposed to choose
> SEEKABLE_HASHED_DIR_PLUGIN_ID.

Is that "the user", or "the person building the kernel"?





Re: reiser4 plugins

2005-06-27 Thread Valdis . Kletnieks
On Mon, 27 Jun 2005 13:25:14 CDT, David Masover said:

> I was just trying to avoid the "people will never adopt a new archive
> format" argument by pointing out that a similar archive format was
> recently created and adopted.

Out of curiosity, adopted by popular acclaim, or because an 800 pound gorilla
said "This is the format we're shipping  in, learn to deal with it"?
(I've seen both happen multiple times in the last quarter century, on many
different operating systems)

(For that matter, all of my production boxes are backed up by either Tivoli or
Legato, and I haven't a *clue* what format those tapes are in.  As a practical
matter, it doesn't really matter - after the first quarter petabyte or so of
backed up data, you're not going to do a restore without the software's help
anyhow.. ;)





Re: reiser4 plugins

2005-06-27 Thread Valdis . Kletnieks
On Mon, 27 Jun 2005 02:07:46 CDT, David Masover said:
> > Exactly the same sort of thing - traditionally it's been more or less ignored
> > in the system accounting, because A would usually average out to causing as
> > many I/Os as B did, and they were roughly equal in cost so it was a wash.
> 
> Even if A is doing A/V work and B is programming?

I said "traditionally" - it's been a "oh well, we can't do much about it"
problem for a *long* time (for instance, time spent in an interrupt handler
has usually been charged off against whoever's timeslice the interrupt handler
took a chunk out of).  It's only been tolerated so far because (a) the costs
for both users are about equal and (b) you rarely have a heavy I/O DB and a
number cruncher on the same box, or a user doing A/V work and a user doing
programming - if it's not a single-use machine, there's *multiple* number
crunchers, DBs, or programmers, and they tend to balance out.

Said tendency can disappear quite easily here.

> How do we get over quota errors, btw?  Can we get them from write()
> calls?  If so, I don't see a Problem(TM), just an annoyance.

One gotcha here is that it means that you can't do delayed allocation on
writes - you *have* to allocate disk space at each write and then update
the quotas. (And yes, I know that 'man 2 close' says that bad stuff can
happen to your data even after your program exits - that doesn't mean we
should go out of our way to make things worse.. ;)
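
A minimal sketch of why it matters that the failure comes back from the
write itself. It assumes a Linux box where /dev/full exists (every write
to it fails with ENOSPC); the device is only a stand-in for an over-quota
filesystem, and write_block is a made-up helper name, not anything from
reiser4:

```shell
#!/bin/sh
# Sketch: a writer can only react to "out of space" if the write call
# itself reports it. /dev/full stands in for an over-quota filesystem
# (an assumption made for this example).

# Write one 4K block to $1; the exit status is the status of the write.
write_block() {
    dd if=/dev/zero of="$1" bs=4096 count=1 2>/dev/null
}

if write_block /dev/full; then
    echo "write accepted"
else
    echo "write refused: the caller finds out right away"
fi
```

If the space were only allocated later, the same failure would have to be
reported some time after the write had already returned success.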




Re: reiser4 plugins

2005-06-27 Thread Valdis . Kletnieks
On Mon, 27 Jun 2005 02:00:49 CDT, David Masover said:

> >>Speaking of backup, that's another nice place for a plugin.  Imagine a
> >>dump that didn't have to be of the entire FS, but rather an arbitrary
> >>tree...  That might be a nice new archive format.  I know Apple already
> >>uses something like this for their dmg packages.
> > Hmm.. you mean like 'tar' or 'cpio' or 'pax' or 'rsync'? :) 
> No, a dmg is an OS X program installer.  It appears to be a disk image
> of sorts.  So this is the backup idea in reverse.

I was addressing the ability to deal with an arbitrary tree.  By that definition,
a dmg, being a disk image and not a tree image, is *not* what you want.




Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Mon, 27 Jun 2005 01:27:25 CDT, David Masover said:

> I back up with rsync, actually.

Doesn't matter what it is.  You still need to define sane semantics for
it.. ;)

> Speaking of backup, that's another nice place for a plugin.  Imagine a
> dump that didn't have to be of the entire FS, but rather an arbitrary
> tree...  That might be a nice new archive format.  I know Apple already
> uses something like this for their dmg packages.

Hmm.. you mean like 'tar' or 'cpio' or 'pax' or 'rsync'? :) 




Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Mon, 27 Jun 2005 00:54:17 CDT, David Masover said:

> There has been some mention of inheritance, but I've forgotten how
> that's supposed to work.  If there's some sort of inheritance where
> children inherit properties of their parent directory, and also inherit
> changes to those properties, than Hans probably wants that to be the
> prefered way of doing things?

Well, the 'chmod g+s dirname/' example *is* just "children inherit the
group of the directory", and somebody didn't like that.. ;)

> > Now throw in multiple users and CPU limits.  User A enters that directory and
> > references everything, causing the buffer cache to get filled up.  While there,
> > A makes changes, so the pages are dirty - "for i in */*; do echo " " >> $i; done"
> > would do the job...  User B now does something that causes a writeback of one
> > of those buffer cache pages.
> > 
> > A) What process currently gets ticked for the CPU and I/O for the writeback?
> > 
> > B) In your model, who will get ticked for the resources?
> > 
> > C) Will the users riot? (Note that you can't win here - currently, the "price"
> > of writing back A's and B's pages are about equal.  However, if A gets dinked
> > for an expensive writeback due to B's process, A will get miffed.  If B gets
> > charge for an expensive writeback of A's, B will get miffed. If you say "screw it"
> > and bill it to a kernel thread, the bean counters will get miffed... ;)
> 
> If I understand this correctly, this is somewhat like if user A creates
> a 50 meg file on a system with 100 megs free RAM, and user B runs
> "sync".  Also similar to if B were to suddenly fill up 75 megs of RAM,
> forcing A's file to be flushed -- last I checked, in Reiser4, only a
> sync or memory pressure causes writes to flush.

Exactly the same sort of thing - traditionally it's been more or less ignored
in the system accounting, because A would usually average out to causing as
many I/Os as B did, and they were roughly equal in cost so it was a wash.
However, if one user has a much higher per-page cost than the other, the
imbalance can start to matter *very* quickly.

> Right?  This is tempting to comment on, but I want to make sure I grok
> it first...

For more fun, consider how you can write 1 megabyte of data to a file,
lseek to the beginning and start writing again - and you go over quota
on the *second* write even though you're over-writing already existing
data.  Can happen if you're compressing, and the second write doesn't
compress as well as the first. (To be fair, we already have similar
issues with sparse files - but at least 'tar --sparse' has an easy way
to deal with it compared to this. ;)
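
A rough illustration of the overwrite case, with gzip standing in for
whatever a compressing plugin would do (that substitution is an
assumption, and compressed_size is an invented helper). Both "writes" are
1 megabyte of logical data, but they need very different amounts of space:

```shell
#!/bin/sh
# Sketch: the same logical size can cost very different amounts of
# space once compression is involved. gzip is a stand-in for a
# compressing filesystem plugin (assumption).

# Print the gzip-compressed size, in bytes, of the first 1 MiB of $1.
compressed_size() {
    head -c 1048576 "$1" | gzip -c | wc -c | tr -d ' '
}

first=$(compressed_size /dev/zero)      # first write: compresses well
second=$(compressed_size /dev/urandom)  # overwrite: barely compresses
echo "1 MiB of zeros  -> $first bytes stored"
echo "1 MiB of random -> $second bytes stored"
```

A user sitting just under quota after the first write could be pushed over
it by the second, without the file growing by a single byte.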




Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Mon, 27 Jun 2005 00:57:54 CDT, David Masover said:

> In one of three possible settings for the imaginary zipfile plugin, yes.
>  But if we're talking about a kernel source tree, how many of us
> actually build zipfiles/tarballs of their kernel source trees, rather
> than unpack existing ones?

I dunno.  I'll often build a tarball of "-mm plus local patches" known to
be working at the moment, precisely so I can just untar that as a known good
base for the next kernel-hackfest, rather than untar Linus's tree, apply all
of the -mm patch, then all my local patches again...

And even if I'm not *that* ambitious, I'll at least tar up a clean -mm tree
to use as a base. :)

And even if I didn't do that, you *do* have to do something when the disk
gets backed up.  You *do* intend for sensible things to happen then, right? ;)





Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Mon, 27 Jun 2005 00:31:46 CDT, David Masover said:

> *If* we decide that this must go both ways, *then* we either turn off
> write support inside the zipfile

Oh, *that* will do wonders for command symmetry.  And you just shot down
the whole 'mv foo bar' being equivalent to 'zip bar foo' concept. ;)

>  and do "make" with a symlink farm (cp -as
> isn't hard), or (better) we can set things up so that only on access
> (most likely a read) of the original zipfile do we re-add all the changes.

Those chuckleheads who have filled up a disk by saying 'tar cvf foo.tar .' just
got a whole new way to fill the disk... ;)




Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Sun, 26 Jun 2005 23:10:43 EDT, Hubert Chan said:
> On Sun, 26 Jun 2005 20:40:29 -0400, [EMAIL PROTECTED] said:

> > Oh, I'm waiting for the fun the first time somebody deploys a plugin
> > that has similar semantics to 'chmod g+s dirname/' ;)

(You *did* notice it was set-GID of a *directory* not an executable file,
right?)

> Reiser4 plugins have to be compiled into the kernel.  (They're not
> plugins in the sense that most people use that word.)  And any admin who
> would compile that kind of plugin into the kernel needs to have his head

Oh?  You saying that it *won't* be permitted for a user to say:

mkdir $HOME/zipped
chattr "files under here are ZIP files" $HOME/zipped

and instead you have to do that chattr by hand for *every* *single* zip file?

Or "files on this filesystem are encrypted by default"?

I suspect that this sort of thing is going to be one of the *first* things
that will get created, and any admin who tries to sell this idea to the users
*without* that sort of functionality will be handed their head.

Or, if "that type of plugin.. needs to have their head examined", I suggest
that you go to your kernel source tree, find fs/ext3/ialloc.c, and this code
in ext3_new_inode():

    if (test_opt (sb, GRPID))
        inode->i_gid = dir->i_gid;
    else if (dir->i_mode & S_ISGID) {
        inode->i_gid = dir->i_gid;
        if (S_ISDIR(mode))
            mode |= S_ISGID;
    } else
        inode->i_gid = current->fsgid;

and #ifdef out all but the last line, and see if anything breaks. ;)
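
For anyone who hasn't run into the directory case, here is a small sketch
of what that code looks like from the shell. It assumes Linux semantics
without the grpid mount option; setgid_demo and the directory names are
invented for the example:

```shell
#!/bin/sh
# Sketch: a set-GID directory hands its group to whatever is created
# inside it, and subdirectories come out set-GID as well.
# Assumes Linux semantics; all names are made up.

# Print "inherited" if a subdirectory created under a set-GID directory
# is itself set-GID, "not inherited" otherwise.
setgid_demo() {
    top=$(mktemp -d) || return 1
    mkdir "$top/shared"
    chmod g+s "$top/shared"
    mkdir "$top/shared/sub"
    if [ -g "$top/shared/sub" ]; then
        echo inherited
    else
        echo "not inherited"
    fi
    rm -rf "$top"
}

setgid_demo
```

Nobody ran chmod on the subdirectory; the behaviour came from the parent,
which is the kind of inherited semantics being argued about here.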

> examined.  Not to mention that plugins must first go through Hans and/or
> Linus before they can get included into the kernel.
> 
> The kernel defines the set of plugins available to the user.  The user
> selects (to a certain degree) which plugins to use.

The point you missed was that plugins *will* have interactions, and as
the guys who are working on a stacker for LSM modules have found out the
hard way, trying to deal with the composition of functions is fiendishly
difficult.

And notice that it doesn't *have* to be quite so obvious - how about if a
user creates a directory $HOME/zipped/ and flags it as "anything under here
is a zipped file".

Now throw in multiple users and CPU limits.  User A enters that directory and
references everything, causing the buffer cache to get filled up.  While there,
A makes changes, so the pages are dirty - "for i in */*; do echo " " >> $i; done"
would do the job...  User B now does something that causes a writeback of one
of those buffer cache pages.

A) What process currently gets ticked for the CPU and I/O for the writeback?

B) In your model, who will get ticked for the resources?

C) Will the users riot? (Note that you can't win here - currently, the "price"
of writing back A's and B's pages is about equal.  However, if A gets dinked
for an expensive writeback due to B's process, A will get miffed.  If B gets
charged for an expensive writeback of A's, B will get miffed. If you say "screw it"
and bill it to a kernel thread, the bean counters will get miffed... ;)




Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Sun, 26 Jun 2005 21:37:48 CDT, David Masover said:

> > Go read http://www.tux.org/lkml/#s7-7 and ponder until enlightenment arrives.
> 
> So what?  I don't intend to convince anyone based on how much
> slower/faster their kernel compiles.  It's meant to illustrate the
> principle of the thing.

No, you seemed convinced that you'd have a big win based on the fact that
big chunks don't get unpacked - when in fact it's not as much of a win as
you might think.

And at least in the real world, performance *does* matter - if doing it the
traditional way is 3 times faster, nobody's going to be interested.

> Besides, your point was that you could not run make inside of a kernel
> tarball/zipfile.  Nobody ever suggested that you would actually want to.

"Here's a new facility.  Don't bother trying to actually use it".

Is that the message you're trying to send?




Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Sun, 26 Jun 2005 21:37:48 CDT, David Masover said:

> Assume we can do on-disk caching, similar to fscache/cachefs for nfs.
> Now, benchmark:
> 
> $ unzip linux-2.6.12.zip && make -C linux-2.6.12
> 
> versus the hypothetical
> 
> $ make -C linux-2.6.12.zip/.../contents
> 
> This is an automatic performance gain, in theory, because the second
> command is identical to unzipping just the parts you need into
> linux-2.6.12, then running "make".

Nope, they're not identical.  The first specifically unzips it into the file
system, leaving the zip file intact.  The second, you're having to take all
those .o files and other stuff that the 'make' generates and put them back
into the .zip file *on the fly* - when the 'make' is half done, the .zip should
reflect a directory tree that has had half the make execute.

(Think - after that hypothetical 'make' completes, where is 'vmlinux'? ;)
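
A small sketch of the first case, to make the difference concrete. tar
stands in for zip and "touch vmlinux" stands in for the build; both are
assumptions made so the example runs anywhere, and the function name is
invented:

```shell
#!/bin/sh
# Sketch: "unpack, then build" never touches the archive.
# tar stands in for zip, touch stands in for make (assumptions).

# Print "yes" if the archive lists vmlinux after a build in the
# unpacked tree, "no" otherwise.
archive_sees_build_output() {
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        mkdir src
        echo 'int main(void) { return 0; }' > src/main.c
        tar -cf src.tar src     # the pristine archive
        rm -rf src
        tar -xf src.tar         # the "unzip" step
        touch src/vmlinux       # the "make" step
        if tar -tf src.tar | grep -q vmlinux; then
            echo yes
        else
            echo no
        fi
    )
    rm -rf "$work"
}

archive_sees_build_output
```

In the second model the answer would have to be "yes", which means every
file the build produces is also a write into the archive.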




Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Sun, 26 Jun 2005 15:54:25 PDT, Hans Reiser said:
> [EMAIL PROTECTED] wrote:
> 
> > (Hint - work out how long a kernel 'make' would take
> >if you were doing it inside a .tar.bz2).
> >  
> >
> After the first time, not very long, if you had enough ram  the
> plugin would keep the data uncompressed until it flushed it to disk.

You're not allowed to use current existing stuff like the disk buffer cache
to weasel your way out on this one.  "if you had enough ram" has been true
for decades.  The trouble is that quite often you *don't* have enough ram.
 
> Performance might even improve since less would be written to disk.

I've worked with filesystems where performance improves due to compression
(AIX's JFS).  It's a lot harder to provide an improvement if you're writing
37 more bytes in between bytes 399457 and 399458 (I suppose by aligning
byte 399458 so it actually is on the start of a 4K block you can do that, but
then you're losing the advantages of the compression.. ;)
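
The arithmetic behind that example, as a sketch. The 4K block size is the
one mentioned above; locate_byte is just a name picked for the example:

```shell
#!/bin/sh
# Sketch: where a byte offset falls relative to 4K block boundaries.

block_size=4096

# Print "<block index> <offset within block>" for byte offset $1.
locate_byte() {
    echo "$(( $1 / block_size )) $(( $1 % block_size ))"
}

locate_byte 399458
```

This prints "97 2146": the insertion point is 2146 bytes into block 97, so
padding it out to the next boundary would cost 4096 - 2146 = 1950 bytes.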





Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Sun, 26 Jun 2005 17:35:48 CDT, David Masover said:

> > Right. So please explain what crypto/raw/foo and crypto/inflated/foo.gz give you.
> 
> In that example (shouldn't have used the name "crypto", but oh well), it
> should be crypto/raw/foo.gz and crypto/inflated/foo -- where foo.gz is
> the gzip'ed file and foo is the transparently compressed/decompressed
> file.  Basically, these are equivalent:
> 
> $ zcat crypto/raw/foo.gz
> $ cat crypto/inflated/foo

I'm *quite* aware of what your preconceived notions think it *should* be.

Maybe the two examples I asked for have *real-world* meanings that you should
be allowing for.  Like, for instance, on a mail server, where the A/V software
may need to unzip a file 5 or 6 times to find out if there's malicious content.

Or seeing if it's a ".zip bomb", where a small .zip will decompress to gigabytes.

Or I'm testing a new compression algorithm, to see if multiple compressions help
(yes, I know that it *shouldn't* help - but I've seen real-world cases where the
algorithm could only look at a 4K or 8K window at a time, and if you hit a *very*
long run of duplicate 4K segments, a second compression would compress all the
identical or near-identical "this is a 4K chunk" tokens...)
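
A sketch of that effect, using gzip and its 32K window as a stand-in for
the 4K/8K-window algorithm described above (an assumption; the helper name
is invented too). The input is 10 MiB of zeros, which is about as long a
run of duplicates as you can get:

```shell
#!/bin/sh
# Sketch: a second compression pass can still shrink the data when the
# first pass only sees a limited window. gzip is a stand-in (assumption).

# Print the size in bytes after $1 passes of gzip over 10 MiB of zeros.
size_after_passes() {
    passes=$1
    tmp=$(mktemp) || return 1
    head -c 10485760 /dev/zero > "$tmp"
    while [ "$passes" -gt 0 ]; do
        gzip -c "$tmp" > "$tmp.gz" && mv "$tmp.gz" "$tmp"
        passes=$((passes - 1))
    done
    wc -c < "$tmp" | tr -d ' '
    rm -f "$tmp"
}

echo "one pass:   $(size_after_passes 1) bytes"
echo "two passes: $(size_after_passes 2) bytes"
```

The same nesting is what makes a small archive able to unpack to an
enormous one, so any "transparently inflated" view has to decide how many
layers it is willing to peel.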


> > It's got a *LOT* to do with it if I created a *DIRECTORY*, to use *AS A DIRECTORY*,
> > the way Unix-style systems have done for 3 decades, and suddenly my system is
> > running like a pig because the kernel decided that it's a .zip file.
> 
> The kernel does not decide that.  You do.  And it doesn't automatically
> decide that every time you create a file.  You have to use some
> interface to trigger the plugins.

Oh, I'm waiting for the fun the first time somebody deploys a plugin that
has similar semantics to 'chmod g+s dirname/' ;)

> I guess I need a new name for this approach.  That's three possible ways
> of doing this?

I *said* you need to think this through in detail, didn't I? ;)
 
> I remember discussing that, actually.  It wouldn't automatically do this
> if you didn't want it to, but it would be nice if, say, it was something
> truly seekable like linux-2.6.12.zip, and linux-2.6.12 was a
> user-created symlink to linux-2.6.12.zip/.../contents, and we had a nice
> caching system...

I think you're highly deluded as to just how much or little performance gain
this will get you. Model what happens with a kernel 'make' on a 256M machine
with and without all that zipping and compressing, and assume that a constant
48M is available for caching of the linux-2.6.12/ tree.

> This is nice because then you get exactly the same performance during
> "make" as you would with "unzip && make", only better, because files you
> don't ever use (lots of arch, for instance) are not unpacked.

Go read http://www.tux.org/lkml/#s7-7 and ponder until enlightenment arrives.





Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Sun, 26 Jun 2005 19:16:48 CDT, David Masover said:

> But, to avoid confusion, the inclusion of a crytocompress plugin in a
> given kernel doesn't mean that all files accessed from that kernel are
> encrypted and compressed.  It just means that you can pick an individual
> file and set it to be transparently encrypted/compressed.
> 
> That is what I meant by "enabled".  Not per-user, but per-file.

Doing key management in a secure manner is going to be *fun*. :)




Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Sun, 26 Jun 2005 14:58:07 CDT, David Masover said:

> "Plugins" is a bad word.  This user's combination of plugins is most
> likely identical to other users', it's just which ones are enabled, and
> which aren't?  If they are all included, I assume they play nice.

Which ones are enabled. Exactly.

> And just because they are called "plugins" doesn't mean the EA looks
> different every week.

They do if the one enabled this week is "make EAs look like symlinks", and
last week's was "make EAs look like folders".

(Don't blame me, *you're* the one that said "EAs can look like any other object"..)


> > And 'cat crypto/raw/foo' or 'crypto/inflated/foo.gz' gets you what, exactly?
> > 
> > Now throw some .bz2 and .zip files into the mix... ;)
> 
> Interface is the same.  Only, zip files aren't just compression, so
> maybe the interface changes a little there.

Right. So please explain what crypto/raw/foo and crypto/inflated/foo.gz give you.

> Point is, now you have a standard interface for any program to access
> any simple lossless compression, transparently.
> 
> >>Another possibility, if you like file-as-a-directory:
> >>
> >>cat foo.gz  # raw
> >>cat foo.gz/inflated # decompressed
> >>
> >>One could easily imagine things like these two potentially equivalent
> >>commands:
> >>
> >>cp foo bar.zip/
> >>zip bar foo

> > Unless of course the user had done 'mkdir sorted.by.city.zip' to make
> > a directory of files containing data sorted by USPS Zip code.
> 
> What's this got to do with anything?

It's got a *LOT* to do with it if I created a *DIRECTORY*, to use *AS A DIRECTORY*,
the way Unix-style systems have done for 3 decades, and suddenly my system is
running like a pig because the kernel decided that it's a .zip file.

> > And what happens if the user has a file 'bar' that's not a ZIP file,
> > and a directory 'bar.zip' isn't a view into 'bar'?
> 
> In file-as-a-directory (which is probably NOT happening soon), bar.zip
> is both the actual zipfile and the view inside, depending on whether you
> try to open() it directly or peek inside it as a directory.

Ahem.  'bar.zip' is a *DIRECTORY*. I said 'mkdir bar.zip' - why is it not
acting like a directory?
 
> However, let's not discuss this now.  I do NOT want to start another
> "silent semantic changes with reiser4" thread.  File-as-directory is not
> happening this time, so don't worry about it -- this time.

Fish or cut bait.  You are the one who started handwaving the 'file-as-directory'.
If you don't want it discussed, don't mention it.

> > Most of the time, if I have a file 'linux-2.6.12.tar.bz2' and a
> > directory 'linux-2.6.12', what is under the directory is *NOT* the same
> > data as what's in the .bz2 - I've done 'make oldconfig' and a few builds
> > and some variable amount of patching, usually with rejects, and I *don't*
> > want that .bz2 being updated during all this (hint - what's my next command
> > after 'rm -rf linux-2.6.12' likely to be, and why, and  what expectations
> > do I have when I do it?)
> 
> You're misunderstanding.  man zip.
> $ zip bar foo
> creates/modifies a file named "bar.zip", not "bar", which contains the
> file "foo".

No. *YOU* are misunderstanding.  I have a directory 'linux-2.6.12', and
I have a file 'linux-2.6.12.tar.bz2', and I do *NOT* want directory operations
to be silently converted into "let's scribble into the middle of this tar file
and then compress it".  (Hint - work out how long a kernel 'make' would take
if you were doing it inside a .tar.bz2).
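
To put a shape on that hint, a sketch of what "write one file into a
compressed tarball" costs when done by hand. gzip stands in for bzip2, and
add_to_tgz, add_demo and the file names are all made up for the example:

```shell
#!/bin/sh
# Sketch: adding one file to a compressed tarball rewrites the whole
# archive. gzip stands in for bzip2; all names are made up.

# Add file $2 to gzip-compressed tarball $1 (both absolute paths).
add_to_tgz() {
    stage=$(mktemp -d) || return 1
    gzip -dc "$1" | (cd "$stage" && tar -xf -)   # unpack every member
    cp "$2" "$stage/"                            # the one real change
    (cd "$stage" && tar -cf - .) | gzip -c > "$1.new" &&
        mv "$1.new" "$1"                         # recompress everything
    rm -rf "$stage"
}

# Archive a two-file tree, add one object file, and print how many
# .c/.o members the archive holds afterwards.
add_demo() {
    work=$(mktemp -d) || return 1
    mkdir "$work/tree"
    echo 'int a;' > "$work/tree/a.c"
    echo 'int b;' > "$work/tree/b.c"
    (cd "$work/tree" && tar -cf - .) | gzip -c > "$work/tree.tar.gz"
    echo 'object code' > "$work/a.o"
    add_to_tgz "$work/tree.tar.gz" "$work/a.o"
    gzip -dc "$work/tree.tar.gz" | tar -tf - | grep -c '\.[co]$'
    rm -rf "$work"
}

add_demo
```

A kernel build does that "add one file" step once per object file, against
an archive a good deal bigger than this one.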

> > You want to think this sort of thing through *really* thoroughly, because
> > there's a *lot* of things, both users and programs, that have expectations
> > about The Way Things Work.
> 
> Or, I can avoid those issues altogether, and simply delegate this kind
> of stuff to user-created-but-magic directories.  For instance, I could
> have a directory called "/foo" which contains encrypted files, and
> "/foo/decrypted" which has transparently decrypted representations of them.

So rather than everything working in a funky manner, a program gets to guess
how funky, and in what direction, a given magical directory is.




Re: reiser4 plugins

2005-06-26 Thread Valdis . Kletnieks
On Sun, 26 Jun 2005 02:48:06 CDT, David Masover said:

> Lincoln Dale wrote:

> > this is the WHOLE point of standardization .. i don't think its that
> > Reiser4's EAs offer any more or less capabilities than standard EAs -
> 
> They do.  Reiser4's EAs can look like any other object -- files,
> folders, symlinks, whatever.  This is important, especially for
> transparency.

No, you want them to look like the same objects that {get|set}xattr() manage
currently.  You don't want programs to have to guess what an EA looks like
this week, with this user's combination of plugins that's different from
everybody else's.

> > lets take this a step further.  what about compression?  do we accept
> > that each filesystem can implement its own proprietary compression via
> > its own API - and now we need individual user-space tools to understand
> 
> No, that's the beauty of these "EAs" in Reiser4.  The API is standard
> write(2) commands.  sys_reiser4 supposedly implements an interface to
> make this scale better, but otherwise have the same semantics.  And who
> said anything about proprietary compression?  I think we were planning
> on the kernel's zlib, though we might have been planning to make it a
> bit more seekable...
> 
> > each of these APIs?
> 
> So, the API becomes something like:
> 
> cat crypto/inflated/foo   # transparently decompressed
> cat crypto/raw/foo.gz # raw, gzip-compressed

And 'cat crypto/raw/foo' or 'crypto/inflated/foo.gz' gets you what, exactly?

Now throw some .bz2 and .zip files into the mix... ;)

> Another possibility, if you like file-as-a-directory:
> 
> cat foo.gz            # raw
> cat foo.gz/inflated   # decompressed
> 
> One could easily imagine things like these two potentially equivalent
> commands:
> 
> cp foo bar.zip/
> zip bar foo

Unless of course the user had done 'mkdir sorted.by.city.zip' to make
a directory of files containing data sorted by USPS Zip code.
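
A trivial sketch of the point, with invented helper names: the only
reliable way to find out what something is, is to ask about its type, not
to read its name.

```shell
#!/bin/sh
# Sketch: a name ending in .zip says nothing about what the object is.

# Print what kind of object $1 really is, ignoring its name.
kind_of() {
    if [ -d "$1" ]; then
        echo directory
    elif [ -f "$1" ]; then
        echo "regular file"
    else
        echo other
    fi
}

# Create a directory whose name happens to end in .zip and classify it.
demo_zip_named_dir() {
    top=$(mktemp -d) || return 1
    mkdir "$top/sorted.by.city.zip"
    kind_of "$top/sorted.by.city.zip"
    rm -rf "$top"
}

demo_zip_named_dir
```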

And what happens if the user has a file 'bar' that's not a ZIP file,
and a directory 'bar.zip' isn't a view into 'bar'?

Most of the time, if I have a file 'linux-2.6.12.tar.bz2' and a
directory 'linux-2.6.12', what is under the directory is *NOT* the same
data as what's in the .bz2 - I've done 'make oldconfig' and a few builds
and some variable amount of patching, usually with rejects, and I *don't*
want that .bz2 being updated during all this (hint - what's my next command
after 'rm -rf linux-2.6.12' likely to be, and why, and  what expectations
do I have when I do it?)

You want to think this sort of thing through *really* thoroughly, because
there's a *lot* of things, both users and programs, that have expectations
about The Way Things Work.

> The whole point is to have less userland tools, not more.  I'm not
> saying we move zip into the kernel, just that the user now has one less
> command to remember.

But now instead of having to remember the one meme "I can manage any
compressed-archive format that's stored in a file, and put other files in it,
and all I need is the appropriate userspace tool", they have to remember "the
cp trick works for .zip and .tar, but I'll get a 'not a directory' error if I
try it with a .hqx file, and that other file format may or may not work,
because I can't remember if there's a working out-of-tree module for
this kernel".





Re: reiser4 plugins

2005-06-25 Thread Valdis . Kletnieks
On Sat, 25 Jun 2005 13:33:27 CDT, David Masover said:
> > Now *think* for a moment - how does a hypothetical Reiser4 using ext3 format
> > gain any speed advantage with small files, when the speed advantage is based
> > on using a format other than ext3?

> happen in RAM.  If you do a ton of work with a dataset that stays in
> RAM, Reiser probably performs as well or better than a ramdisk, because
> changes that get overwritten while still in RAM never actually touch the
> disk.

At that point, since the actual buffer management is being done at the VFS
level (see fs/buffer.c and friends) what you're really comparing is the speed
of journalling metadata - at which point you need to be *very* careful to
specify exactly what configuration you're talking about.  If you don't believe
me, investigate why mounting a filesystem with 'noatime,nodiratime' can make a
dramatic difference totally independent of the underlying filesystem, but the
actual amount of gain is dependent on format (hint - how far do the heads have
to move to record 3 atime updates against 3 random inodes on an ext2, an ext3,
and a VFAT filesystem, assuming no other disk activity), and why
ext3 has 3 different modes data=ordered/writeback/journal.

>Reiser also doesn't fragment as quickly as ext3, and I don't
> think that has anything to do with its format.

Care to explain why it's not format-dependent? 

> > b) Tell reiser4 to get its grubby little paws off the VFS if it ever intends
> > to have a chance of being merged in mainline.
> 
> You are saying Reiser isn't in because it shouldn't touch VFS.  Every
> single other person in this thread says Reiser isn't in because it
> *should* touch VFS.

Hmm.. let's see.. I said Reiser isn't in because it shouldn't be screwing with
the VFS, and said stuff should be done separate from the Reiser4 filesystem.

Everybody else said that to achieve all the goals that Hans wants will require
changes to the VFS, and the way reiser4 gets there isn't acceptable.

Seems like myself and everybody else are saying this needs to be factored into
2 pieces, a FS piece and a VFS piece, and moved forward separately.

> > c) Have a *separate* project to improve the speed/reliability/function of
> > the VFS layer, which is the only way that your vision of having the ext3 and
> > reiser developers cooperating will ever happen.
> 
> Why does it have to be a separate project if it is already *done* as
> part of Reiser4?  Or is the name Reiser just cursed that way?

Because it's done *as a part of reiser4*, and not as a separately reviewed
change to the VFS.

> The FS that gets merged ahead of time without plugins would no longer be
> Reiser4.  Go read the whitepaper, or tell me why I'm wrong, but even
> symlinks are implemented as plugins.

Which is another way of saying Reiser4 can't be merged in its present form.

> I'm not arguing that at all.  But if you've got an entirely new driver,
> why not do:
> 
> Patch 1/2:  Add white_whale driver, which also adds moby_foo_init to
> nautical core.

You don't do this because the rule is "one patch, one logical change", and
"which also" implies more than one change.

> Actually, plugins are just as easy or easier than crypto-loop or
> dm-crypt.  And why shouldn't my crypto be easy?  Most users are insecure
> in all kinds of ways because of that attitude -- security is HARD, so I

There's a vast distinction between "easy for implementors" and "easy for
users".  Jari Ruusu's loop-aes stuff does a wonderful job of being "easy for
users" - just say "mount", answer the passphrase, and you're good to go.  The
underlying arguing about the crypto involved is complicated enough to keep
professional crypto jocks busy for years (is the watermark attack Jari is
concerned about a real threat?  You tell me. ;)

Meanwhile, PGP was designed to be used in an environment where you could do
this:  "Today's secret plans are AES256 encrypted.  The key is the next key in
your one-time-pad book, XOR'ed with your 128-bit secret key - do it in your head".
(And yes, you can easily memorize a 32-digit hex number and learn to do an XOR
with another 32-digit number, if failing to do so means you could end up dead).

This is inconvenient for the user, but it makes it intractable for an attacker
to create a scenario where they can just 'vi /each/decrypted/file' ;)

> won't do it.  If security is transparent, just enter a password and go,
> then more people would do it.

"Just enter a password and go, then more people would do it".

Two words: "phishing e-mail".

Why does phishing work at all?  Because it's simple for the user, and the
user isn't aware of the totally busticated underlying security model of
SMTP (namely, there isn't one) and the mostly busticated security model of
most browsers (the misleading concept that if an SSL site is identified
by the SSL cert, that this implies the site can be trusted, and other similar
misfeatures).





Re: reiser4 plugins

2005-06-25 Thread Valdis . Kletnieks
On Fri, 24 Jun 2005 23:10:35 CDT, David Masover said:

> But Linux is better.  DOS ain't broke, but Linux is better.  So maybe
> VFS ain't broke, but plugins would be better.  I guess we'll only know
> if we let Reiser4 merge...

No, we'll only know if we merge something that does plugins at the VFS
level in a well-designed way.

> This was about a hypothetical ext3 format as a reiser4 storage plugin.
> I'm not sure how this ties into the VFS stuff.

Very poorly.  There's only two interpretations of "ext3 as a reiser4 plugin"
that make *any* sense.  The first is that reiser4 is totally violating the VFS
layer boundary, and the second is that reiser4 is trying to be an all-singing
all-dancing wankfest.  Later on, you say:

> A lot of what people like about ext3 is its stability and fairly
> universally accepted format.  A lot of what people like about XFS is its
> stability and speed, mainly with large files.  A lot of what people like
> about Reiser4 (as it is today) is its speed, with large and especially
> with small files.

Now *think* for a moment - how does a hypothetical Reiser4 using ext3 format
gain any speed advantage with small files, when the speed advantage is based
on using a format other than ext3?

As I said, either it's violating the VFS boundary, or it's busy wanking.

The Reiser4 proponents would be well served to disavow that particular
hypothetical example - I have yet to see *anything* that does more damage
to the Reiser4 cause.

> So, in this hypothetical situation where ext3 is a reiser4 plugin,
> suddenly all the ext3 developers are trying to improve the speed and
> reliability of reiser4, which benefits both ext3 and reiser4, instead of
> just ext3.

Or we can do what *should* be done, which is:

a) Put the crack pipe down.

b) Tell reiser4 to get its grubby little paws off the VFS if it ever intends
to have a chance of being merged in mainline.

c) Have a *separate* project to improve the speed/reliability/function of
the VFS layer, which is the only way that your vision of having the ext3 and
reiser developers cooperating will ever happen.

Yes, the VFS could probably use an overhaul.  But that *will* happen like this:

1) A patch is submitted and passes review to change the VFS.
2) If appropriate, a patch for reiser4 (if it gets merged) is also submitted
(possibly by the same people) to be the first user of the new API/functionality.

There's a *reason* why we see patch streams that look like:

Patch 1/3: Add moby_foo_init function to nautical core.
Patch 2/3: Modify white_whale driver to use moby_foo_init
Patch 3/3: Modify captain_ahab driver to use moby_foo_init

> Aside from what someone else already said about this, why not just have
> support for accessing, say, a .gpg file as transparently decrypted?  You
> don't even need file-as-directory, just create a file called foo which
> is really the decrypted version of foo.gpg.  No need to change the
> format, just the filesystem.

I don't think this is what they mean by "Linux gives you enough rope to
shoot yourself in the foot with"...

> Plus, as someone else said, it's much easier to do
> $ vim /some/encrypted/file
> than
> $ gpg --decrypt /some/encrypted/file > /some/decrypted/file
> $ vim /some/decrypted/file
> $ gpg --encrypt /some/decrypted/file > /some/encrypted/file
> $ shred /some/decrypted/file

You've totally failed to understand that the whole *point* of PGP is that 'vim
/some/encrypted/file' *isn't* easy to do.  A better example might be the various
crypto-loop-ish variants or Microsoft's EFS, where the key management model is
more tractable to this sort of automation.



pgpusAjyL6l40.pgp
Description: PGP signature


Re: reiser4 plugins

2005-06-24 Thread Valdis . Kletnieks
On Fri, 24 Jun 2005 17:53:15 PDT, Hans Reiser said:
> [EMAIL PROTECTED] wrote:
> 
> > Right - once the VFS hands the call off to reiser4, you're on your own
> > as far as I'm concerned..
> >
> Well, that is all I ask for, and Christophe and company disagree. 
> Happy to abstract it more into VFS after the merge if others want it
> though

Unfortunately, the only *realistic* path I can see at the moment is to strip
it down to those reiser4 features that don't require any VFS help/changes to
do, and pursue VFS changes as a separate issue entirely.

Also, just as a personal note - talking about a "reiser4 plugin that does ext3
backend storage" doesn't help the cause.  What you *should* be trying to sell:

1) a reiser4 storage backend with backend plugins that follows the VFS
   conventions.
2) a vastly enhanced VFS that has VFS plugins and calls reiser4 and ext3
   backends.

Approaching it that way will make people think that (a) you really *do*
understand the VFS/FS layering and (b) are willing to compromise to get
things done.. ;)



pgpLAuclh3B7C.pgp
Description: PGP signature


Re: reiser4 plugins

2005-06-24 Thread Valdis . Kletnieks
On Fri, 24 Jun 2005 16:20:45 MDT, Perry Kundert said:

> OK, fair enough.  The file-as-directory stuff, which introduces
> VFS-incompatible issues, was turned off.  It requires VFS changes.

Mind you, I still think that sounds *interesting*, but it *has* to happen
at the VFS level.  (And if *that* doesn't force a 2.7 fork to happen,
nothing will :)

> The remaining plugin architecture, as far as I understand, deals
> in the on-disk structure of the FS -- just like journals.  Encryption,
> Compression, and the like.
> 
> So, what you are saying, is -- so long as the "plugins" do stuff
> that deals in how reiser4 slings bits back and forth to the disk,
> you're OK, right?

Right - once the VFS hands the call off to reiser4, you're on your own
as far as I'm concerned..

> So, what you are saying is: if reiser4 wants to provide
> variability in on-disk format, so long as it implements it using
> *multiple different filesystems* (eg reiser4-cryptcompress,
> reiser4-whatever) -- just like ext2 vs. ext3 -- then you are OK with
> it?  But, if they are implemented as "plugins", so that the ONE
> reiser4 filesystem can modulate its behaviour based on what the
> on-disk format says, that you are NOT OK with it?
> 
> And this makes sense, why?

You misread that - my point was that ext2 and ext3 may look similar, but
they're sufficiently divergent that trying to create one driver that handles
both results in an ugly driver, thus the split...

> Don't get me wrong -- I'm not saying that ext2 and ext3 shouldn't
> be separate file systems.  However, if they were designed from the
> start so that the ONE (say) ext23 filesystem could look at its on-disk
> format, notice that the data specified the "journal" plugin, and
> implement the correct behaviour -- that this would be "bad"?

Well, if they *had* been, it would be a different story.  And there's stuff
in ext3 (see the "-O feature" section of 'man tune2fs') that *does* do the
sort of thing that you're proposing.  It's just that the ext2 codebase doesn't
fit in well for historical reasons.

> Because I can envision an ext23 filesystem that is just like
> reiser4, that does exactly that -- implements its variable behaviour
> via a "journal" plugin.

> So, if it did so, would you be OK with it?  As long as it wasn't
> called reiser4?

No, I'd be perfectly happy with a reiser4 that had a 'tunereiser
--enable-plugin=' that had the same sort of format-altering semantics that
'tune2fs -O' has.

For bonus points, design a system that stores the plugin *in the file
system* (probably need to have a bytecode interpreter for this).  Then
you eliminate the "can't mount if the kernel can't insmod the plugin" issue ;)

> I really don't mean to sound sarcastic -- it just sounds like
> there are "other" issues at work here -- like "Hans is a Butt-Head, so
> I want to reject reiser4's plugin design for modulating its behaviour,
> no matter what".

I've never actually met Hans, so I don't know if he really *is* a butt-head
or not.  And I usually at least *try* to phrase it more like "this proposal is
a non-starter that only a butt-head could continue to support, because.." :)

Of course, I myself have been called a butt-head on numerous occasions, because
I'm convinced there's a right and wrong way to design something. ;)

> OK, so far it seems like we are actually agreeing -- the stuff
> that gets done via reiser4 "plugins" actually doesn't have anything to
> do with the VFS, and it shouldn't be there.  So long as reiser4
> presents a VFS-sensible, VFS-consistent hierarchy of stuff that "looks
> like" files and directories to the VFS, then we're OK with it?
> 
> Whether reiser4 uses "plugins", or ESP, or whatever to decide what
> behaviour it implements in order to produce this VFS-consistent
> interface, then that's OK, right?

Right - not that *my* opinion counts for tons on LKML, and any *other*
stylistic/design faults are a separate issue. :)


pgpysKWXO5s3x.pgp
Description: PGP signature


Re: reiser4 plugins

2005-06-24 Thread Valdis . Kletnieks
On Fri, 24 Jun 2005 11:13:45 MDT, Perry Kundert said:

> In general, isn't it better to first include modules providing
> divergent but possibly interesting functionality (such as Reiser4) as
> an "optional" or "experimental" component, and then slowly re-factor
> desirable functionality into higher level facilities like the VFS?

The problem arises when the facility is something that is demonstrably
borked when done in an optional way in one filesystem, and really needs
to be done at the VFS level if it is to be done at all.

>I ask you -- if everyone in kernel-land is so convinced that you
> should always select varying on-disk formats via the VFS, then *why*
> hasn't ext2/ext3 been merged into a single filesystem?

Because the formats, although similar enough to be mostly compatible, are
still different enough that merging them is difficult.  There's some very
subtle second-order effects, where the ext3 driver can do things in different
orders or with different algorithms because it has a journal, when the ext2
code has to do things in a specific way because it has to *always* have things
in a consistent enough state that fsck.ext2 can clean things up.  So you end
up with code that looks like:

if (fs->journalled) {
    /* 500 lines of code for the ext3 case */
} else {
    /* 300 lines of different code for ext2 */
}

If you don't like that, then you can do this instead:

1) put ext2_do_whatever in ext2_whatever.c
2) put ext3_do_whatever in ext3_whatever.c

extern ext2_do_whatever();
extern ext3_do_whatever();

if (fs->journalled) {
    ext3_do_whatever();
} else {
    ext2_do_whatever();
}

In fact, I seem to remember Alan Cox answering this with "only about 10% of
the code *wouldn't* end up like this" or similar...

> Surely the
> "journalling" plugin of this filesystem is a prime candidate for
> selection via the VFS?

To be doing "journalling" at the VFS level implies that a journal is something
that makes sense at the VFS level - that it's basically filesystem independent,
which is most certainly *not* true - the notations an XFS journal needs to make
to indicate which blocks were just removed from the free-block structure are
quite different from what ext3 needs to record.

Note that journalling is neither an attribute of the actual data, nor of
the user-visible metadata (inode contents, etc).   The only things that
care about the journalling format/etc are the filesystem driver, the mount
command, and the mkfs/fsck commands.   As such, it's a file system issue,
not a VFS issue.

For a good example of why this is so, go back and read the recent discussion
of what happens to flash memory filesystems mounted with 'sync' - this was
a case of the VFS doing "journalling by flushing" without consulting the
low-level drivers.


pgpoai8EjBqXa.pgp
Description: PGP signature


Re: 13000Gig partition badblock check is the same -- do a reiserfsck again ?

2005-06-02 Thread Valdis . Kletnieks
On Thu, 02 Jun 2005 09:28:50 CDT, Dan Oglesby said:

> latest versions.  Took two days to run, but it completed, and I ended up 
> only losing 2 files out of over 1.1 million files on a 1TB RAID-5 
> array.  That's not too bad, considering how many times the machine went 
> up and down due to bad power in the building.

Buy a UPS. Now.  Even if it's just a big battery that will only keep you
running for 10 mins - at least that will give you enough time to do a clean
shutdown -h rather than get stuff trashed.

If you can't get money for it, just point at the lost-productivity costs
the *next* time the terabyte takes 2 days to recover.. and remind the boss that
you could be down for 2 days every time the lights flicker.. ;)


pgp0bq8czXfZ1.pgp
Description: PGP signature


Re: File as a directory - VFS Changes

2005-05-31 Thread Valdis . Kletnieks
On Tue, 31 May 2005 08:04:42 PDT, Hans Reiser said:

> >Cycle may consists of more graph nodes than fits into memory. 
> >
> There are pathname length restrictions already in the kernel that should
> prevent that, yes?

The problem is that although a *single* pathname can't be longer than some
length, you can still create a cycle.  Consider for instance a pathname
restriction of 1024 chars.  Filenames A, B, and C are all 400 characters
long.  A points at B, B points at C - and C points back to A.

Also, although the set of inodes *in the cycle* fits in memory, the set of
inodes *in the entire graph* that has to be searched to verify the presence of
a cycle may not (in general, you have to be ready to examine *all* the inodes
unless you can do some pruning (unallocated, provably un-cycleable, and so
on)).  This is the sort of thing that you can afford to do in userspace during
an fsck, but certainly can't do in the kernel on every syscall that might
create a cycle...
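
A minimal Python sketch of the A/B/C case, assuming a toy model where each
name simply points at one other name. The `links` table and `find_cycle` are
hypothetical and are not VFS or reiser4 code. Every individual name passes the
length check, yet the only way to notice the loop is to remember what has
already been visited:

```python
# Toy model: each key is a name, each value is the name it points at.
# The 1024-char limit and 400-char names mirror the example above.
PATH_MAX = 1024

A, B, C = "a" * 400, "b" * 400, "c" * 400
links = {A: B, B: C, C: A}

def find_cycle(start, links):
    """Follow links from start; return the cycle as a list, or None."""
    seen = {}            # name -> position in the walk so far
    walk = []
    node = start
    while node in links:
        if node in seen:
            return walk[seen[node]:]
        seen[node] = len(walk)
        walk.append(node)
        node = links[node]
    return None

# Every single name is within the limit...
assert all(len(name) <= PATH_MAX for name in links)

cycle = find_cycle(A, links)
print(len(cycle))                   # 3 names in the cycle
print(sum(len(n) for n in cycle))   # 1200 characters in total
```

The `seen` table is the expensive part: it grows with however much of the
graph has to be walked, which in the general case is not bounded by any
per-pathname limit.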



pgpdt2U5lIsqK.pgp
Description: PGP signature


Re: Reiserfs 1300G partition on lvm problem ...

2005-05-30 Thread Valdis . Kletnieks
On Mon, 30 May 2005 08:17:00 +0200, Matthias Barremaecker said:

> I did a bad block check and I have 10 bad blocks of 4096bytes on 1300Gig 
> and ... that is the reason reiserfs will not work anymore.

> I guess this sux. I rather have that the data on the bad blocks is just 
> corrupted but the rest is accessible.

It all depends on which 10 blocks go bad.  If it's a block that's allocated to
a file, you lose the 4K or whatever that's in that block.

If it's a block that an inode lives in, you're probably going to have the
entire file evaporate.

If it's a block that contains something even more important, you're going to
have large sections of the file system evaporate.

It's a tradeoff issue - how many times do you replicate metadata on the
filesystem, against how well the file system deals with errors.  The problem is
that if you just say "let's have 2 copies of everything, just in case", it
takes a lot more disk space to *store* 2 copies of the metadata.  Also, your
disk performance falls through the floor - most journalled filesystems have
enough trouble making sure that *one* copy of things like the free list is on
disk and consistent with the journal.  Making 2 copies is going to probably
triple your disk I/O and complicate matters a *lot* for fsck (if you crash and
the two copies aren't consistent, which one do you believe?)

That's why almost all filesystem designers just punt and assume that the media
actually works, and suggest if your media might not be 100% reliable, that you
use RAID or similar solutions.



pgpSnF34gSSOu.pgp
Description: PGP signature


Re: Reiserfs 1300G partition on lvm problem ...

2005-05-29 Thread Valdis . Kletnieks
On Sun, 29 May 2005 21:25:54 +0200, Matthias Barremaecker said:

> but that says it is a physical drive error

Physical drive errors.  Your hardware is broken.  Isn't much that Reiserfs
can do about it.

> What can I do.

1) Call whoever you get hardware support from.

2) Be ready to restore from backups.

3) If you didn't have RAID-5 (or similar) set up, or a good backup, consider
it a learning experience.

If your data is important enough that you'll care if you lose it, you should
take steps to make sure you won't lose it... It's that simple.

(Just for the record, if we have important info, it gets at least RAID5, a
backup to tape or other device, *and* a *second* backup off-site.  And my shop
is far from the most paranoid about such things.)



pgp2dVSpWxCvh.pgp
Description: PGP signature


Re: Problems with accessing directory

2005-05-29 Thread Valdis . Kletnieks
On Sun, 29 May 2005 18:44:02 +0200, Kurt Ghekiere said:

> May 29 17:28:51 mail3 kernel: Process hax0r (pid: 3738,
> stackpage=f121b000)
> May 29 17:28:51 mail3 kernel: Stack:  bfe5 1000 f63b6000
> bfffe7f0 f63b6000 b8e4 
> May 29 17:28:51 mail3 kernel:f78aaf22 0a3a 0020 f121a000
> c0108a93 000b bfffe7f0 f78a99a1
> May 29 17:28:51 mail3 kernel:   
>    
> May 29 17:28:51 mail3 kernel: Call Trace:[]
> May 29 17:28:51 mail3 kernel:
> May 29 17:28:51 mail3 kernel: Code: 8a 02 84 c0 75 ef e8 9c ec ff ff 89
> c2 80 3a 00 0f 84 bb 00

Interesting process name indeed. Hopefully you recognize it? ;)

I would suggest running the call trace through ksymoops, but it's so short that
we've quite obviously clobbered the stack to the point that ksymoops won't tell
us anything useful.

I'd investigate why you get all those insmod errors - why is the system trying
to load pciehp and hw_random if there's no device?  Alternatively, are other
modules getting loaded incorrectly and blocking those from starting? It's
possible that if your kernel and modules are out of sync, that Bad Things like
panics happen.

You probably should look at upgrading the userspace reiserfsck and MD/LVM tools
- your kernel seems unhappy with the old versions.

Other than that, I admit to not having any clear "AHA! THAT's their problem"
solution, sorry.


pgpTiOKMSdDkf.pgp
Description: PGP signature


Re: Problems with accessing directory

2005-05-29 Thread Valdis . Kletnieks
On Tue, 29 May 2001 18:14:28 +0200, Webservice said:

> When accessing a particular directory, the system hangs with a kernel
> panic (mapping memory).

This will be a lot easier to diagnose with:

The exact version of your kernel (uname -a), the version of reiserfsck,
and the actual panic traceback (set up a serial console to catch it, or
even take a picture with a digital camera if all else fails).

Have you run 'badblocks' on the 4 md devices, to rule out an actual bad
spot on the disk?


pgpFfAoRZDuOp.pgp
Description: PGP signature


Re: File as a directory - Ordered Relations

2005-05-28 Thread Valdis . Kletnieks
On Fri, 27 May 2005 23:56:35 CDT, David Masover said:

> Hans, comment please?  Is this approaching v5 / v6 / Future Vision?  It
> does seem more than a little "clunky" when applied to v4...

I'm not Hans, but I *will* ask "How much of this is *rationally* doable
without some help from the VFS?".  At the very least, some of this stuff
will require the FS to tell the VFS to suspend its disbelief (for starters,
doing this without confusing the VFS's concepts of dentries/inodes/reference
counts is going to be interesting... :)


pgppFAsfV0InP.pgp
Description: PGP signature


Re: peak performance

2005-05-28 Thread Valdis . Kletnieks
On Fri, 27 May 2005 19:26:26 +1000, robby cunningham said:
> I've been using your product for 4 months now. I've increased my length from
> 2 inches
> to nearly 6 inches. Your product has saved my sex life.-Matt, FL

I'm glad Reiserfs worked for him, but somehow I don't see Hans listing this
one on the "Reiserfs success stories" page. ;)


pgpajTaF2JBvl.pgp
Description: PGP signature


Re: Reiser4 O_DIRECT

2005-05-24 Thread Valdis . Kletnieks
On Tue, 24 May 2005 16:35:51 CDT, David Masover said:

> My feeling is that you create the standard as you create the test, not
> the other way around.  If the test works, then there are by definition
> few bugs if any in the system itself -- any other bugs are actually in
> the application, not the system.

That's even worse.  Then if somebody bodgers it all up with some corner case
that your test system didn't cover, you're by definition screwed, as the
standard won't say what it *SHOULD* do.

Consider the vast philosophical difference between "This is what the FS *should*
do" and "This is what we tested the FS for".  You want the standard to be the
first, not the second.


pgpv30U27h0em.pgp
Description: PGP signature


Re: Reiser4 O_DIRECT

2005-05-23 Thread Valdis . Kletnieks
On Mon, 23 May 2005 12:52:12 +0300, Markus Törnqvist said:
> On Sun, May 22, 2005 at 07:22:51PM -0500, David Masover wrote:
> >Of course, I've worked on sufficiently few big projects that I'm still
> >naive enough to believe that unit tests _can_ catch everything, if
> >they're done right.  I'm sure I'll eventually be proven wrong...
> I'm sure a testing professional will happily prove you wrong ;)

It's *never* the testing professional that disproves "unit tests can catch
everything".

It's the guy with the creeping-horror Cobol/Python database that finds the stuff
that unit tests can't catch.. ;)



pgpYs0VhzoXNU.pgp
Description: PGP signature


Re: Reiser4 O_DIRECT

2005-05-22 Thread Valdis . Kletnieks
On Sun, 22 May 2005 19:22:51 CDT, David Masover said:

> This is exactly why it should be in the kernel once the developers can't
> find any more bugs.  Marked as experimental, mainly, but in the kernel
> where real users can throw cobol/Java/sql bastardizations at it and
> break it.

Oh, I agree there - it's at a point where it *should* be in a -mm kernel,
or a -linus wrapped with a Kconfig 'depends on EXPERIMENTAL'. (I'll let
Andrew and Linus make *THAT* decision ;)

I'm just worried about PHB managers lurking on the list and reading "so stable
even the *developers* can't break it" as "it's *really* good and solid" rather
than "we need *other* people to break all the stuff we forgot to break" ;)



pgpIVeDrSjLfn.pgp
Description: PGP signature


Re: Re[2]: Reiser4 O_DIRECT

2005-05-22 Thread Valdis . Kletnieks
On Sat, 21 May 2005 23:49:00 +0200, Pysiak Satriani said:

> I remember Hans saying that r4 is so stable that the developers themselves
> can not find any more bugs.

Which in reality probably means "It *probably* won't eat your data".

Remember that the developers have a limited number of different hardware
configurations, and a limited number of test tools, and a limited number
of ways they use the file system.

So there's probably *plenty* of bugs still to be found - most of them the sort
that nobody will expect, and won't be found until some user's creeping-horror
database application that's written half in Cobol and half in Python does
something totally stupid but legal.

I've been on both ends - beta tester for software the developers weren't
finding any more bugs in (I filed over 300 bug reports against the product
anyhow), and had users find fatal bugs I couldn't find (my favorite had to be a
user who managed to crash software I wrote by entering a backspace character.
On an IBM 3270 terminal. Which doesn't *HAVE* a transmittable backspace
character - backspace is handled locally in the terminal)



pgppuKrhGbpvC.pgp
Description: PGP signature


Re: trusted processes

2005-05-12 Thread Valdis . Kletnieks
On Thu, 12 May 2005 21:13:46 CDT, David Masover said:

> I bet this is what Hans was thinking of with "views".  But views are
> much more global than "trusted processes".  Specifically, views allow
> different degrees of "normal" processes.
> 
> Other than that, I don't see how this is particularly helpful compared
> to UNIX security -- root is trusted, others aren't trusted, use ACLs if
> you need something complex.

That's good if you're working with discretionary access control (DAC) - both
ACLs and the rwx bits are examples of that.

If you're trying to implement mandatory access control (MAC), you need to take
a totally opposite approach - you need to design it such that even the file
*owner* is *not* able to grant access even if they want to - they can do 'chmod
777' and 'chacl -B' all they want, others can't get access if the site policy
doesn't permit it. (Dealing with all the odd corner cases is why things like
SELinux end up being complicated).
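
A toy sketch of the DAC/MAC difference, in Python. The labels and the `policy`
table are invented for illustration and have nothing to do with real SELinux
policy syntax; the only point is that the MAC check is ANDed in and the file
owner has no way to edit it:

```python
# Toy model of DAC vs. MAC. Labels and the policy table are hypothetical.

def dac_allows(mode, want):
    """DAC: the 'other' rwx bits, which the file owner can set freely."""
    bit = {"r": 0o4, "w": 0o2, "x": 0o1}[want]
    return bool(mode & bit)

# Site policy: (subject label, object label, permission) triples that are allowed.
policy = {("staff", "public_doc", "r")}

def mac_allows(subject, obj, want):
    """MAC: decided by site policy, not by the file owner."""
    return (subject, obj, want) in policy

def access(subject, obj, mode, want):
    # Both checks have to pass; chmod only influences the first one.
    return dac_allows(mode, want) and mac_allows(subject, obj, want)

print(access("staff", "public_doc", 0o777, "r"))   # True
print(access("staff", "secret_doc", 0o777, "r"))   # False, despite mode 777
```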

Those of you who are looking at "views" should probably take note of the
"Polyinstantiation" thread on the '[EMAIL PROTECTED]' list. I've attached
the overview of the patchkit - the other 5 parts of the patch kit really aren't
much use unless you have a crash&burn SELinux testbed handy.  There's also some
implicit assumptions about having an MLS compartmentalized policy in place.

Unfortunately, I didn't keep all the postings on the thread.

The SELinux list is archived at http://www.nsa.gov/selinux/list-archive/ but
unfortunately only refreshed at irregular intervals (last was March 9).

--- Begin Message ---
This patch is a userspace patch to provide polyinstantiation support in
SELinux.  I am including a patch to libselinux to provide this, as well
as patches to login, su, gdm, and policy to make this work.  These
patches will follow in separate emails.  Comments are appreciated
(usually, at least).

OVERVIEW
This code provides polyinstantiation support for directories in SELinux
systems.  It creates multiple instances of a directory as dictated by
policy.  These instances are actually subdirectories, named using a MD5
hash of the context of the member directory, that are used in place of
the parent directory.  To interface with policy, the code utilizes
the /selinux/member interface to read how directories should be
polyinstantiated according to policy.

In order to specify how directories should be instantiated in type
enforcement policy, a type_member rule is used.  MLS polyinstantiation
policy is implicit (i.e. the directory should be polyinstantiated to the
level of the user).  This code queries the /selinux/member interface to
see what member of a polyinstantiated directory should be used for a
given subject context.

To replace the original directory with the appropriate member directory,
per-process namespaces and bind mounts are used.  More specifically, an
entrypoint program (say login) calls this library to see if any
polyinstantiation is necessary.  If so, it calls clone() instead of fork
with the CLONE_NEWNS flag to get a new namespace.  Then, the library can
bind mount member directories over the originals.  Additionally, the
library remounts the original directory elsewhere (e.g. /tmp is
remounted to /.tmp-poly-orig) for security-aware (and allowed) programs
to utilize it.

USING THE LIBRARY
The library exports 2 functions, security_setupns(), the main function,
and security_set_setupns_printf(), a support function to change where
printf's go. security_setupns() sets up a namespace for the user being
processed.  It takes one argument - commit, which is an integer that can
be 0 or 1.  If commit=0, the function does not actually set up the
namespace, but just checks to see if any modifications to the namespace
are necessary.  If commit=1, those modifications are actually made.  The
function returns the number of changes (i.e. directories needing
polyinstantiation).  security_set_setupns_printf is used to replace the
printf function (which defaults to logging to stderr) the same way that
set_matchpathcon_printf() does.

CONFIG FILE
There is one config file, which is stored in /etc/selinux/polydirs.  The
first line contains the default context to use for directories that
originals are remounted to (e.g. /.tmp-poly-orig), which only matters
before the bind mount happens.  The rest of the file is a newline-
delimited list of candidate directories to be polyinstantiated.  Each of
these directories will be checked to see if polyinstantiation is
necessary according to the policy.  Additionally, the library supports
the special directory $HOME to indicate the home directory of the user
whose environment we're setting up.

POLICY
Policy is fairly straightforward.  Just write a type_member rule.  The
syntax is:
type_member <source_type> <target_type>:<class> <member_type>
<source_type> is the type of the user logging in, <target_type> is the
type of the directory being polyinstantiated, <class> is dir
(since this patch only works for directories), and <member_type> is what
you want the member directory context to be.  So, the rule
type_memb

Re: disk runs full

2005-05-12 Thread Valdis . Kletnieks
On Thu, 12 May 2005 13:50:31 +0200, Alexander Gruber said:
> I checked it with du -sh * on the root partition and the result was much 
> smaller than the used space reported by df.

Note that temporary files are often creat()ed and then unlink()ed, leaving
the open file descriptor as the last reference.  You should probably run
'lsof' or similar tool.  On my laptop at the moment:

lsof -n | grep dele
cardmgr   2207   root    3u   CHR  254,0          5556 /dev/cm-2123-2 (deleted)
cardmgr   2207   root    4u   CHR  254,1          5559 /dev/cm-2123-5 (deleted)
cardmgr   2207   root    5u   CHR  254,2          5562 /dev/cm-2123-8 (deleted)
exmh      7142 valdis   10u   REG   0,16      0  74035 /tmp/tclfG25oV (deleted)
gconfd-2  7805 valdis   13wW  REG   0,16    641  44735 /tmp/gconfd-valdis/lock/0t1115905590ut151063u967p7805r252866408k3219173544 (deleted)
aspell    9481 valdis    2u   REG   0,16      0  74035 /tmp/tclfG25oV (deleted)

So there's 2 open but unlinked files on /tmp, and du and df will show
different values. (Note that exmh did an open() of a file, unlinked it, and then passed
the open file descriptor to aspell as stdin - so that space will be reclaimed
once *both* of those processes have done a close() on the file descriptor).
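
The effect is easy to reproduce. A minimal Python sketch, assuming POSIX
unlink semantics (the temp file is created wherever `tempfile` decides, which
is an assumption about the local setup): the name disappears, so du can't find
it, but the blocks stay allocated until the descriptor is closed, so df still
counts them.

```python
import os
import tempfile

# Create a file, keep the descriptor, then remove its only name.
fd, path = tempfile.mkstemp()
os.unlink(path)

os.write(fd, b"x" * 4096)          # still writable through the descriptor
st = os.fstat(fd)

name_gone = not os.path.exists(path)
link_count = st.st_nlink           # no directory entry refers to the file
size = st.st_size                  # but the data is still there

os.close(fd)                       # last reference dropped; space is freed
print(name_gone, link_count, size)
```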


pgpPdTaifLWuY.pgp
Description: PGP signature


Re: file as a directory

2005-05-10 Thread Valdis . Kletnieks
On Tue, 10 May 2005 10:39:23 BST, Peter Foldiak said:
> Back in November 2004, I suggested on the linux-kernel and reiserfs
> lists that the Reiser4 architecture could allow us to abolish the
> unnatural naming distinction between directories/files/parts-of-file
> (i.e. to unify naming within-file-system and within-file naming) in an
> efficient way.
> I suggested that one way of doing that would be to extend XPath-like
> selection syntax above the (XML) file level.

I believe the consensus was that this needs to happen at the VFS layer, not
the FS level.  The next step would be designing an API for this - what would
the VFS present to userspace, and in what way, and how would backward
compatibility be maintained?


pgpKIGxXIFQdb.pgp
Description: PGP signature


Re: Re[2]: When Reiser4 will be officially included in the kernel? ...

2005-05-04 Thread Valdis . Kletnieks
On Thu, 05 May 2005 00:38:54 +0200, Pysiak Satriani said:
> > This is OK, however, what I am looking for is to download the Kernel
> > from kernel.org, and found Reiser4 code inside. This means officially
> > for me.
> FYI, kernel.org does have patches with r4, eg.
> http://www.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.12-rc3/2.6.12-rc3-mm2/2.6.12-rc3-mm2.bz2
> 
> When 2.6.12 comes up, you might want to take the next 2.6.12-mm1 or
> the next 2.6.12 reiser patch from namesys.com

Please do *NOT* run -mm kernels unless you are *sure* you know what you're
doing.  They are *literally* bleeding-edge test kernels, and *are* the most
likely things on the entire kernel.org server to hang, wedge, reformat your
disks and eat your data, and otherwise have a bad time with.

To repeat:  -mm are *TEST* kernels.  2.6.12-rc3-mm1 came out at 23:11 Friday.
2.6.12-rc3-mm2 came out at 16:43 the next day. Why? Umm.. let's just say
that it didn't even *compile* cleanly for uniprocessor X86... ;)  I've been
running -mm kernels on this laptop since 2.5.45-mm1 or so, and I'd estimate
that at *least* 1 in 4 hasn't even booted cleanly to multiuser without
additional patching and tweaking. In the last 31 -mms, I've needed 10
additional patches.
Think about that - since 2.6.9-rc2-mm1, there's been 31, and 10 needed patches.
Cool stuff if "Test Pilot" is part of your job description, but not what you
want to put on production boxes. ;)

If you're brave, you can pull the -broken-out variant and apply all the
reiser4-* patches in the order they're listed in the 'series' file - that
should work unless they depend on some other patch being applied first.

If you're willing to stress test new stuff, and are prepared to recover your
system from backups, go for it - Linus and Andrew want -rc and -mm kernels to
get more runtime.  But they're definitely *not* the "official" kernels that the
original poster wanted, which is usually called "mainline" or "Linus" kernels.

From where I'm sitting on the sidelines, Reiser4 *can't* make 2.6.12, is a long
shot for .13, and most likely will land somewhere around .14 to .16.  And I'm
going to predict there will be at least one more major bun fight on the lkml
list about pseudo-files before it gets in (sorry Hans, but I have to side with
the guys who said "Cool idea, but it really needs to be at the VFS level")...



pgp2pDwCcdvrX.pgp
Description: PGP signature


Re: Re[2]: reiser4.1

2005-03-03 Thread Valdis . Kletnieks
On Thu, 03 Mar 2005 09:55:20 +0100, Pysiak Satriani said:

> I remember Hans saying that nowadays CPUs are so fast, they compress
> faster than HDDs move the heads around and do the writes. So compression,
> if done properly, can be with no negative impact to speed.
> 
> Can you say what level of compression with which processors would handle
> it without speedloss?

IBM's AIX 4.3 and later support LZ compression on their JFS file system.
I was able to measure a 10-15% speed-up by converting /usr to compressed
even on a 133MHz Power604e chip because even back then, it was faster to
read half as many blocks off a SCSI disk and decompress.

So it's been at least a potential win for a decade or so, assuming your
filesystem is able to deal well with fragments.  You can't really win unless
you take a 4K or so logical block, compress it to some number of 512-byte
chunks, and then store the resulting chunks cheaply - JFS does the
blocks-and-frags efficiently, so it's easy to win.  On the other hand,
it would be dreadful for Reiser3 - imagine having to do tail packing
for *every block* (which is what you end up doing, sort of...)
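
A rough Python sketch of the block-to-fragment arithmetic. zlib stands in for
whatever LZ variant JFS actually uses (an assumption), and `frags_needed` is a
hypothetical helper, not a filesystem API. It compresses one 4K logical block
and rounds the result up to whole 512-byte fragments:

```python
import zlib

BLOCK = 4096      # logical block size from the discussion above
FRAG = 512        # fragment size the filesystem can allocate

def frags_needed(block: bytes) -> int:
    """512-byte fragments needed to store one compressed logical block."""
    compressed = zlib.compress(block)
    # Never store more than the uncompressed block would have taken.
    stored = min(len(compressed), len(block))
    return -(-stored // FRAG)          # ceiling division

text_like = (b"the quick brown fox jumps over the lazy dog\n" * 100)[:BLOCK]
print(frags_needed(text_like), "of", BLOCK // FRAG, "fragments")
```

The win only materializes if the filesystem can place those few fragments
without wasting the rest of the 4K block, which is the blocks-and-frags
bookkeeping mentioned above.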




Re: where are reiser4 sources

2005-02-28 Thread Valdis . Kletnieks
On Mon, 28 Feb 2005 15:41:52 PST, Hans Reiser said:

> I am frankly skeptical that one should attempt to clone windows.

That explains why WINE exists, I guess.. ;)




Re: where are reiser4 sources

2005-02-22 Thread Valdis . Kletnieks
On Tue, 22 Feb 2005 10:41:25 PST, Hans Reiser said:

> That violates the license.

Umm.. from fs/reiser4/README:

Reiser4 is hereby licensed under the GNU General
Public License version 2.

Where in the GPL does it say he can't port to another OS?




Re: Plugin for corruption resistance?

2005-02-18 Thread Valdis . Kletnieks
On Fri, 18 Feb 2005 08:36:51 EST, Gregory Maxwell said:

> Tree hashes.
> Divide the file into blocks of N bytes. Compute size/N hashes. 
> Group hashes into pairs. Compute N/2 N' hashes, this is fast because
> hashes are small. Group N' hashes into pairs compute N'/2 N'' hashes
> etc.. Reduce to a single hash.

You get massively I/O bound real fast this way.  You may want to re-evaluate
whether this *really* buys you anything, especially if you're not using some
sort of guarantee that you know what's actually b0rked...
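
For reference, the tree-hash scheme quoted above looks roughly like this.  It
is only a sketch: SHA-1, the 4K block size and the rule for an odd hash out are
my assumptions, not anything specified in the proposal:

```python
import hashlib


def tree_hash(data: bytes, block_size: int = 4096) -> bytes:
    """Hash each block, then hash pairs of hashes until one root hash is left."""
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    level = [hashlib.sha1(b).digest() for b in blocks]
    while len(level) > 1:
        # Combine neighbouring hashes pairwise.
        nxt = [hashlib.sha1(level[i] + level[i + 1]).digest()
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])   # odd one out is promoted unchanged
        level = nxt
    return level[0]
```

Note that building or fully re-verifying the tree still means reading every
data block once, which is where the I/O cost comes from.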

> In my initial suggestion I offered that hashes could be verified by a
> userspace daemon, or by fsck (since it's an expensive operation)...
> Such policy could be controlled in the daemon.
> In most cases I'd like it to make the file inaccessible until I go and
> fix it by hand.

You're still missing the point that in general, you don't have a way to tell
whether the block the file lived in went bad, or the block the hash lived in
went bad.

Sure, if the file *happens* to be ascii text, you can use Wetware 1.5 to scan
the file and tell which one went bad.  However, you'll need Wetware 2.0 to
do the same for your multi-gigabyte Oracle database... :)

(And yes, I *have* seen cases where Tripwire went completely and totally bananas
and claimed zillions of files were corrupted, when the *real* problem was that
the Tripwire database itself had gotten stomped on - so it's *not* a purely
theoretical issue.)




Re: Plugin for corruption resistance?

2005-02-17 Thread Valdis . Kletnieks
On Thu, 17 Feb 2005 21:43:08 CST, David Masover said:

> This way is easier, though.  But I was thinking about accessing the
> file.  I don't know of any hashes that can be easily updated from part
> of the file, unless you're hashing only pieces of the file in the first
> place, but it'd be nice to not bother hashing at all until the hash is
> needed, especially if we are hashing the whole file.

There's plenty of CRC functions that are quite easily set up for an
incremental update (see RFCs 1141 and 1624 on how to do it for the CRC function
used for Internet IP packets).  You'd of course not want to use that CRC-16,
but the same basic principle applies to other CRC functions.
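
A small sketch of the RFC 1624 update rule, HC' = ~(~HC + ~m + m').  Strictly
speaking the function used for IP headers is a 16-bit ones-complement checksum
rather than a true CRC, and the function names here are mine, but it shows what
"incremental update" means: only the old word, the new word and the old
checksum are needed.

```python
def ones_add(a: int, b: int) -> int:
    """16-bit ones-complement addition (end-around carry)."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def internet_checksum(words) -> int:
    """Full recomputation over a sequence of 16-bit words."""
    total = 0
    for w in words:
        total = ones_add(total, w)
    return ~total & 0xFFFF


def incremental_update(old_cksum: int, old_word: int, new_word: int) -> int:
    """RFC 1624 eqn. 3: update the checksum after one 16-bit word changes."""
    total = ones_add(~old_cksum & 0xFFFF, ~old_word & 0xFFFF)
    total = ones_add(total, new_word)
    return ~total & 0xFFFF


# Example header words (checksum field left out); word 4 holds TTL and protocol.
header = [0x4500, 0x0073, 0x0000, 0x4000, 0x4011, 0xC0A8, 0x0001, 0xC0A8, 0x00C7]
old = internet_checksum(header)
changed = list(header)
changed[4] = 0x3F11   # TTL decremented by one
print(hex(incremental_update(old, header[4], changed[4])),
      hex(internet_checksum(changed)))
```

The two printed values agree, and the update touched three words instead of
the whole header.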

The problem is that most CRC functions aren't very much good at detecting
multi-bit errors, and when you're talking about hundreds of gigabytes of
disk on a modern RAID, the CRC functions are hardly bulletproof.

On the flip side, hash functions like MD5 or the SHA family are fairly
bulletproof, but are essentially impossible to develop an incremental update
for (if there existed a fast incremental update for the hash function, that
would imply a very low preimage resistance, rendering it useless as a
cryptographic hash).

Also, there's another issue - unlike standard ECC codes that can actually *fix*
the problem (for at least a small number of bit errors), it's unclear what you
should do if you find a mismatch between the hash of a block and the block
contents, as you don't know whether it's the actual data or the hash that's
corrupted.
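
A tiny illustration of that ambiguity, assuming SHA-256 and helper names
(verify, flip_bit) invented for the example.  Flipping one bit in the data and
flipping one bit in the stored hash produce exactly the same symptom:

```python
import hashlib


def verify(block: bytes, stored_hash: bytes) -> bool:
    """True when the block still matches the hash stored for it."""
    return hashlib.sha256(block).digest() == stored_hash


def flip_bit(buf: bytes, bit: int = 0) -> bytes:
    """Return a copy of buf with a single bit inverted."""
    out = bytearray(buf)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)


block = b"payload" * 100
good_hash = hashlib.sha256(block).digest()

data_went_bad = verify(flip_bit(block), good_hash)   # data corrupted, hash intact
hash_went_bad = verify(block, flip_bit(good_hash))   # data intact, hash corrupted
print(data_went_bad, hash_went_bad)
```

Both checks simply fail; nothing in the result says which side to trust.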





Re: WELCOME to reiserfs-list@namesys.com

2005-02-07 Thread Valdis . Kletnieks
On Mon, 07 Feb 2005 13:28:26 EST, Rick Spillane said:
> Is reiserfs *completely* ACID compliant? Acid meaning Atomicity,
> Consistency, Isolation, and Durability? If not (which I would expect
> is true) then how far away are the offending parts from making
> reiserfs ACID compliant, and where are they in the source?

Do you mean this as "buzzword compliant", or do you have a pointer
to an actual specification or compliance suite?





Re: missing files?

2005-02-01 Thread Valdis . Kletnieks
On Tue, 01 Feb 2005 23:34:13 EST, [EMAIL PROTECTED] said:

> All seemed fine, until I noticed that my nightly cron job that runs the
> Gentoo emerge program to check for software updates was failing. After
> investigating I discovered that a number of directories were missing.
> That is, they appeared when using 'ls' with no options, but using 'ls
> -l' produced multiple 'No such file or directory' results. 
> 
> Now it appears I'm unable to remove those directories. 'rm -f
> ' doesn't work, and rmdir reports the directory isn't empty. I
> can rename the directory, but that doesn't fix the apparent corruption
> in the filesystem. Also, fsck.reiser4 doesn't report any problems.
> 
> Can anyone explain why this might have happened and what I might be able
> to do to fix it?

Sounds like wonky file permissions on the directory - lack of write permission
*on the directory* will cause 'rm' to fail.  Remember that renaming the
directory requires write permission *on its parent*, not on itself.

(Note the following is on an ext3 filesystem)

[~]2 mkdir /tmp/foo-bar
[~]2 touch /tmp/foo-bar/baz 
[~]2 chmod 555 /tmp/foo-bar/   
[~]2 rm /tmp/foo-bar/baz
rm: cannot remove `/tmp/foo-bar/baz': Permission denied
[~]2 mv /tmp/foo-bar /tmp/foo-bar-quux
[~]2 rm /tmp/foo-bar-quux/baz
rm: cannot remove `/tmp/foo-bar-quux/baz': Permission denied
[~]2 ls /tmp/foo-bar-quux/baz
/tmp/foo-bar-quux/baz
[~]2 ls -l /tmp/foo-bar-quux/baz
-rw-r--r--  1 valdis valdis 0 Feb  2 00:25 /tmp/foo-bar-quux/baz
[~]2 ls -l /tmp/foo-bar-quux
total 1
-rw-r--r--  1 valdis valdis 0 Feb  2 00:25 baz
[~]2 chmod 400 /tmp/foo-bar-quux 
[~]2 ls -l /tmp/foo-bar-quux
total 0
?---------  ? ? ? ?            ? baz
[~]2 chmod 100 /tmp/foo-bar-quux
[~]2 ls -l /tmp/foo-bar-quux
ls: /tmp/foo-bar-quux: Permission denied
[~]2 ls -l /tmp/foo-bar-quux/baz
-rw-r--r--  1 valdis valdis 0 Feb  2 00:25 /tmp/foo-bar-quux/baz
[~]2 ls -ld /tmp/foo-bar-quux/   
d--x------  2 valdis valdis 1024 Feb  2 00:25 /tmp/foo-bar-quux/

Bad umask/chmod? (Note that some shells and scripting languages do
very different things with 'umask 22' and 'umask 022', and 'chmod 600'
and 'chmod 0600')
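
To see why the leading zero matters, here is a sketch of what happens when a
tool reads "22" as a decimal number instead of octal.  Which tools do this is
not something the sketch claims; file_mode() is a made-up helper that applies
a mask to the usual 0666 default for new files:

```python
def file_mode(mask: int) -> int:
    """Permission bits a new file gets when created with mode 0666 under this umask."""
    return 0o666 & ~mask & 0o777


as_octal = int("022", 8)   # the mask most people mean: 0o22
as_decimal = int("22")     # "22" read as decimal is 0o26

print(oct(as_octal), oct(file_mode(as_octal)))
print(oct(as_decimal), oct(file_mode(as_decimal)))
```

The decimal reading silently removes group read/write bits that the octal one
leaves alone, so files and directories can end up with modes nobody intended.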




Re: Why is Reiser4 slower then ReiserFS v3

2004-12-27 Thread Valdis . Kletnieks
On Mon, 27 Dec 2004 13:38:12 MST, Dark Shadow said:

> I have three hard drives so I took a file from one and copied it to
> the others and timed it
> source drive /dev/hda Reiser3 Western Digital 40gb 7200rpm
> target1 drive /dev/hdb Reiser4 Western Digital 80gb 7200rpm
> target2 drive /dev/sda Reiser3 Seagate 160gb 7200rpm (SATA but still
> same rpm as rest so it should be the same)

You may wish to run 'hdparm -T -t' on each drive and see what the *raw* speed
is.  All drives are not created equal... ;)

> time cp ~/800mb.file /target1
> real    0m41.409s
> user    0m0.010s
> sys     0m4.364s

> time cp ~/800mb.file /target2
> real    0m38.318s
> user    0m0.017s
> sys     0m5.627s

Similarly, you should try each one 3-5 times and get an average (for
starters, if you have more than 800M of memory, the second time around it
may all still be in cache, so the second time gets a hot-cache boost). It
may be useful to run the command once and *ignore* its times, and then
re-run the command 3 times and average those results (so all 3 times you
actually *use* start from the same "previous command just finished" cache 
state).
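
The "run once and ignore it, then average three" recipe can be sketched as a
small harness.  timed_average() is a hypothetical helper, and fn would be
whatever performs the copy being measured:

```python
import time
from statistics import mean


def timed_average(fn, warmup=1, runs=3):
    """Run fn `warmup` times untimed, then return (mean, samples) over `runs` timed runs."""
    for _ in range(warmup):
        fn()   # result ignored: this run only puts the cache in a known state
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return mean(samples), samples
```

Something like timed_average(lambda: shutil.copy(src, dst)) per target drive
(with src and dst as placeholders) gives numbers that all start from the same
cache state, though it measures the warm-cache case rather than the cold one.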




Re: reiser4 and apache (was: Re: Reiser4 and ZFS)

2004-12-27 Thread Valdis . Kletnieks
On Mon, 27 Dec 2004 13:28:15 +0100, Sander said:
> [EMAIL PROTECTED] wrote (ao):
> > For many shops, it's quite likely that a ZFS with "more scalability
> > and administration" is The Right Choice, especially if it does *NOT*
> > include lots of odd new features and quirks that might break
> > production code (remember the joys in getting Apache running on
> > reiser4, until it was discovered that the 'file-as-directory' stuff
> > broke programs that weren't expecting it?).
> 
> I just got bitten by this. Is it possible to get apache to run on
> reiser4, and if so, how?

I think the archives have pointers to both a reiser4 patch that disables
the "file as directory", and a patch to apache to deal with the situation...




Re: Reiser4 and ZFS

2004-12-15 Thread Valdis . Kletnieks
On Tue, 14 Dec 2004 22:46:08 PST, Job Bob said:

>   So if ZFS is not the vaporware that WinFS is, what
> new features of ZFS are worth incorporating into
> Reiser fs? Will Reiser fs continue to stay ahead of
> ZFS?

This of course depends in a *large* part on how exactly you define "ahead".

For many shops, it's quite likely that a ZFS with "more scalability and
administration" is The Right Choice, especially if it does *NOT* include lots
of odd new features and quirks that might break production code (remember the
joys in getting Apache running on reiser4, until it was discovered that the
'file-as-directory' stuff broke programs that weren't expecting it?).

The criteria I use for rating a filesystem as being "ahead" for use on my
laptop (where new cool features often count for more than stability or
performance) are very different than what I want for a mail server.  And the
guy 3 cubicles over doesn't *care* about the details of filesystems - he just
worries about "will this make Oracle run faster or more robustly?".




Re: (reiser4)install on root

2004-11-26 Thread Valdis . Kletnieks
On Fri, 26 Nov 2004 00:58:28 PST, BLuEGoD said:

>  Hi, i want to know how to install reiser4 on root with only 1 HD.. I use
> mkfs.reiser4 from the reiser4 utils after compile and install kernel & patches
> on a debian woody 2.2.. with a scsi HD, but it crashes (errors found doing
> mkfs.reiser4 on root device and on next boot I saw a kernel panic).. Note: I
> did it with the root mounted.. because i need to boot with that HD..

As several have noted, you can use a spare partition to build a new root
filesystem.  Another option is to use a "rescue disk" or a Knoppix disk
or other CD-based toolset to boot from, and use that to do your mkfs.reiser4.

It's a good idea to *always* have a rescue disk handy, because they enable you
to recover from many problems that will prevent you from booting all the
way to single-user off your production disk (trashed boot block,
missing/misnamed files in /boot, a need to fsck the boot partition, or
whatever...)




Re: file as a directory

2004-11-22 Thread Valdis . Kletnieks
On Mon, 22 Nov 2004 19:24:36 +0530, Amit Gud said:

>  A straight forward question. Wouldn't adding a "file as a directory"
> mechanism more logical in VFS itself,

There was quite the flame-fest on the lkml a while back regarding
how the semantics of "file as a directory" should operate.  There's
a number of really nasty corner cases that you need to deal with.

Go back and re-read the whole flame-fest, understand all the points
raised, and let us know when you have a workable proposal.

(Hint - "file as directory" broke a number of programs that didn't
expect that a file *could* be a directory, when run on a reiser4
filesystem...)




Re: Reiser4 kernel modules

2004-11-01 Thread Valdis . Kletnieks
On Mon, 01 Nov 2004 14:07:45 +0100, Lars Tobias Børsting said:

> And why does reiser4 need changes in the kernel code? Is it really a
> smart approach to require kernel changes for reiser4 to work?
> 
> Why isn't it possible to build reiser4 as kernel modules?

It still requires the change to the kernel to add the proper code
in linux-2.6.9/fs/reiser4, add it to the proper Makefiles, the Kconfig
glue needed to build it, etc.  Adding in-tree code is a change to the
kernel, even if it ends up getting built as modules.

Or you *can* build it as an out-of-tree module - which still has some
rough edges in the 2.6 Kbuild infrastructure (most notably, if you
do a 'make modules_install', you have to remember to re-install all your
out-of-tree stuff as well...)




Re: Directory updates in filesystems

2004-10-29 Thread Valdis . Kletnieks
On Fri, 29 Oct 2004 00:26:18 CDT, David Masover said:

> If this is about locking not working well with NFS, why not ensure that
> the directory itself is owned by root and read-only before attempting?
> Wait -- don't answer that...

No, this is a different problem.

Imagine a directory with 10K files called 0001, 0002, 0003, and so on.
You start a 'readdir()' loop, and get to 5497 or so.  At this point,
another process removes 1260 through 1265, and then another process renames 8534 to
1263, putting it in the slot just vacated - and you reach the end of
the readdir() loop never seeing that file.
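
A toy model of that scenario, assuming the directory is a flat array of slots
walked by index (real filesystems are more complicated, and scan_with_race()
is just an illustration).  Entries carry an inode number so the renamed file
can be told apart from the deleted one that used to own its name:

```python
def scan_with_race(num_files=10000, pause_at=5497):
    """Return the inodes of files that exist at the end but were never seen by the scan."""
    # Slot k holds (name, inode) for file k+1; None marks a free slot.
    slots = [(f"{i:04d}", i) for i in range(1, num_files + 1)]
    seen_inodes = set()
    for pos in range(num_files):
        if pos == pause_at:
            for k in range(1259, 1265):    # another process unlinks 1260..1265
                slots[k] = None
            slots[8533] = None             # rename 8534 -> 1263 frees the old slot
            slots[1262] = ("1263", 8534)   # and reuses one just vacated
        entry = slots[pos]
        if entry is not None:
            seen_inodes.add(entry[1])
    live_inodes = {e[1] for e in slots if e is not None}
    return live_inodes - seen_inodes


print(scan_with_race())
```

The file with inode 8534 exists for the whole duration of the scan and is
still never returned, because it moved from ahead of the cursor to behind it.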

> | Are there any file systems that fully address this issue, or POSIX calls
> | that are guaranteed to make an atomic readdir, without specific locking,
> | or must a lock be obtained on the directory to ensure that the read is
> | consistent. I think that locking is needed in the application if complete
> | consistency is required because the underlying behaviour of the
> | OSes/filesystems is so variable in this regard, but I'd be interested in
> | understanding what characteristics a filesystem would have to have to
> | avoid this.
> 
> Maybe an atomic readdir operation?  Does reiser4 do atomic reads?

Do you *REALLY* want to lock the *entire* dir (probably in memory, which
can hurt for directories with 10Ks or 100Ks of entries, which is where the
problem is most evident)?  Even if it's not locked in memory, the mere
locking against updates can be *painful* performance-wise.

> I know reiser4 (or at least should by 4.1) have a sys_reiser4 api which
> does atomic write operations.  That is:  application starts the
> transaction, does a bunch of writes, ends the transaction.  If at any
> point there is a failure, filesystem tells application to roll back.

Atomic operations don't help you here, unless you're willing to take a
locking performance hit.  Remember that rename() is *already* atomic (at least
from other processes' viewpoint), and you have the "rename into a slot
you've passed" problem mentioned above...


> This allows read-only access, such as a web server, to operate on
> slightly stale "snapshots" as this would create.  When faced with a
> decision of:
> 
> - - serving a slightly stale page immediately
> - - making users wait for a write of a newer version to complete
> - - serving a half-written newer version
> 
> I am sure most web admins would choose the first option, which is what
> they would get if the pages were being updated with vim.  The difference
> is that the filesystem solution works on larger units than single files.

The problem is that if you're a mail server, you probably *don't* want to
be sending a slightly stale version of the mail that just got queued.  There,
the only realistic option is your "make users wait" - which may be intolerable
when you're trying to do millions of transactions an hour...




Re: Interesting deletion idea

2004-10-08 Thread Valdis . Kletnieks
On Fri, 08 Oct 2004 19:52:14 EDT, John Richard Moser said:

> I thought the DOD algorithm was 7 pass?

Citation please?  If you have a better reference than DoD 5220.22-M,
feel free to share it.

> If this is going on rapidly, there's no point in trying to completely
> destroy the disk for *every* logical operation; but buffering the
> operations and then only doing the most recent one, and destroying the
> area before that one exactly, would be OK.  The idea is that rapid
> overwrites from userspace get collapsed into a single overwrite; and
> then the kernel overwrites a bunch of times before flushing that data to
> disk to securely erase it.

The point is that you have no really good way to know beforehand that
the flurry of writes is over, and it's time to collapse the writes into
a single write.  

To demonstrate using your example:

FILE *a = fopen("/some/file.txt", "r+");
fseek(a, 0, SEEK_SET);
fputc('N', a);
fseek(a, 0, SEEK_SET);
fputc('D', a);
fseek(a, 0, SEEK_SET);
fputc('X', a);

At what point do you do the overwrite?  You place it just before the
fputc 'X' - but you can't really delay to that rather than at the
'N' or 'D' unless you *know* that the 'X' one will happen 'Soon Enough'.
There's also the point that fputc() is stdio and buffered by default,
unless you've called fflush() or setlinebuf() or similar.  Even if you
look at the read()/write() syscall level, the Linux kernel will almost
certainly automatically do most of the needed collapsing in the buffer
cache code (look at fs/buffer.c for the gory details) - in fact, most
of the time, you need to use fsync() or similar to *force* the data to
actually get to the disk (often, the data doesn't go out until long after
the process has actually exited - and then there's the different way
that the different I/O elevators schedule things, just to add another
layer of unpredictability into things).  The end result is that it's
a lot harder than it looks to get this right...
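
The layering is easier to see spelled out.  This sketch (Python rather than C
stdio, with a function name I made up) marks the point each layer is crossed;
even the last step only *asks* the kernel to push the data out, and says
nothing about what the drive's own cache does afterwards:

```python
import os


def durable_overwrite_first_byte(path: str, value: bytes) -> None:
    """Overwrite the start of an existing file and push it through each buffering layer."""
    with open(path, "r+b") as f:
        f.seek(0)
        f.write(value)         # lands in the userspace (library) buffer
        f.flush()              # userspace buffer -> kernel page cache
        os.fsync(f.fileno())   # ask the kernel to write the cached pages to the device
```

Without the flush() and fsync() calls, three quick overwrites of the same byte
would most likely reach the disk as a single write, if they reach it promptly
at all.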

In addition, doing the overwrite at *THAT* point is *the wrong point* - as
you're about to overwrite the block at least once *anyhow*.  You *really* need to
be doing erasing in the handling for the unlink() and (f)truncate() syscalls,
because *that* is the point you're freeing the disk blocks - and the point of
erasing is to prohibit scavenging of old data off the disk.  This has the added
benefit of being something you *can* do basically at the filesystem's leisure,
subject to a requirement that you return blocks to the free list fast enough
to prevent disk space exhaustion (which is trickier than it looks - under heavy
file create/write/read/unlink loads, you need to be doing it as fast as possible
at exactly the time you have the least idle bandwidth - at worst case, a 3-pass
erase of all blocks will limit you to 25% of the effective write bandwidth in a
steady-state high-load situation).
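
The 25% figure is just bookkeeping, under the assumption that in steady state
every block written is eventually freed and erased, so each useful write costs
one data write plus one write per erase pass:

```python
def usable_write_fraction(erase_passes: int) -> float:
    """Fraction of raw write bandwidth left for real data in steady state."""
    return 1 / (1 + erase_passes)


print(usable_write_fraction(3))   # 1 data write + 3 erase writes per block
```

With 3 passes that is 1/4 of the raw bandwidth; a 35-pass scheme would leave
1/36.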

Also, you *really* need to be *very* careful regarding write barriers and the
like - look at the linux-kernel archives for the last few months, where there
was a *long* series of threads about the problems on IDE.

Basically, if the drive has a write cache on it, you have to either disable
it or jump through some *real* hoops in order to get strictly correct write
barrier semantics (and on some drives, the situation is totally impossible).






Re: Interesting deletion idea

2004-10-08 Thread Valdis . Kletnieks
On Fri, 08 Oct 2004 01:55:19 EDT, John Richard Moser said:

> It'd be fun to be able to mount -o remount,erase=gutman / and have the
> gutman algorithm erase everything.  It may be interesting to get the
> journal to work around parts of the journal being erased, and to do
> other things in an attempt to allow heavy erasure algorithms (Gutman is
> a 34 pass alg IIRC) to function without slowing operations down visibly.

Anybody seriously proposing Gutmann's 35 passes needs to be taken out back
and shot - or at least worked over with a rubber hose.  The *only* reason
that there's 35 passes is so that at least 3 or 4 passes will tickle a
corner case of some on-media encoding scheme (for instance, if you don't
have any MFM drives left, you can toss like half the entries).

Current thinking from the spooks who should know:

Canadian RCMP TSSIT OPS-II says: "Must first be checked for correct functioning
and then have all storage areas overwritten once with the binary digit ONE,
once with the binary digit ZERO and once with a single numeric, alphabetic or
special character, " (http://jya.com/rcmp2.htm)

American DoD 5220.22-M says: Overwriting all addressable locations with a
character, its complement, then a random character and verify.

DoD 5220.22-M applies to civilian contractors, and is approved for material
rated up to SECRET.  TOP SECRET or higher still calls for physical destruction
of media or mass degaussing.

In other words, our spooks think that if 3 passes isn't enough, you need to
totally destroy it.

(Two notes - (1) that read-back verify *is* required to make sure you did it
right, and (2) neither one worries about the information leakage from bad blocks
that have been remapped by the drive.)

> The erasure should probably only apply to relevant parts of disk.  Inode
> information, for example, would be pointless; journal transactions, file
> data, and directory entries, on the other hand, are all possible
> sensitive information; the filename may be sensitive data (directory entry).

Careful analysis of the inodes themselves has a *lot* more information leakage
than you might expect - if the filesystem uses *ANY* sort of predictable order
for inode allocation, you can look at the free inodes and trace back what
order they were freed in (very easy if the filesystem has a free inode list,
a bit more of a challenge if it allocates on the fly like reiser3).  Once
you know that, you know what uid/gid the file belonged to, its size, and
the ctime/mtime/atime.

That's a *LOT* of info that can be used to reconstruct what was going on.

> Buffering multiple overwrites of the same area and applying them in a
> sane and orderly manner may allow you to catch rapid, repeated overwrites
> of disk areas and wait until several have gone by before actually
> applying them.  This would allow you to avoid some of the overhead of
> attempting to destroy overwritten data.

Actually, that's the *last* thing you want to do - you really need to send
3 overwrites down the pipe to the disk *and make sure you have a write barrier
between them*.  The *last* thing you want is to send 3 writes to the disk,
and have the disk's write cache bugger^Wbuffer "optimize" it so only the
last written block actually goes to disk.
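
A userspace sketch of "character, complement, random, then verify" with a
flush between passes.  Heavy caveats: os.fsync() is only a stand-in for a real
write barrier, the read-back may well be served from the page cache rather
than the platter, and overwrite_passes() is a made-up name, not a real
secure-erase tool:

```python
import os


def overwrite_passes(path: str, char: int = 0x55) -> bool:
    """Overwrite a file in place three times, syncing after each pass; verify the last one."""
    size = os.path.getsize(path)
    patterns = [bytes([char]) * size,          # a character
                bytes([char ^ 0xFF]) * size,   # its complement
                os.urandom(size)]              # random data
    fd = os.open(path, os.O_RDWR)
    try:
        for pattern in patterns:
            os.lseek(fd, 0, os.SEEK_SET)
            written = 0
            while written < size:
                written += os.write(fd, pattern[written:])
            os.fsync(fd)   # do not let the next pass merge with this one in the cache
        # Read-back verify of the final pass.
        os.lseek(fd, 0, os.SEEK_SET)
        data = b""
        while len(data) < size:
            chunk = os.read(fd, size - len(data))
            if not chunk:
                break
            data += chunk
        return data == patterns[-1]
    finally:
        os.close(fd)
```

If the drive's write cache reorders or merges the passes behind the kernel's
back, none of this helps, which is exactly the problem described above.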




Re: [PATCH] make fs/reiser4/search.c compile with gcc 4.0

2004-09-22 Thread Valdis . Kletnieks
On Wed, 22 Sep 2004 21:03:45 +0200, Grzegorz Jaśkiewicz said:

> This code uses stupid gcc extension, that is not present in gcc 4.0.

OK, if it's using a GCC extension that's already officially deprecated in 3.X
and will be removed in 4.0, then that *is* a good reason to fix the code.




Re: [PATCH] make fs/reiser4/search.c compile with gcc 4.0

2004-09-22 Thread Valdis . Kletnieks
On Wed, 22 Sep 2004 20:45:55 +0200, Grzegorz Jaśkiewicz said:

> I know gcc 4.0 is still in its alphas.
> Obvious solution is to move function declared in other function
> up-wards. Since it's static anyway, it won't make any difference.
> Please consider applying to repo.

I'm not sure it's a good idea to be trying to "fix" reiser4 code to compile
with an alpha compiler, at least without a *very* firm commitment from the gcc
crew that this *is* a real error in the reiser4 code and not a bug in gcc
causing a spurious message.

Why does this code compile cleanly with gcc 3.x and fail with a 4.0 alpha?
Without knowing that, it's *STUPID* to change the code to suit the alpha...




Re: The argument for fs assistance in handling archives

2004-09-02 Thread Valdis . Kletnieks
On Thu, 02 Sep 2004 20:11:13 CDT, David Masover said:

> It'd be like writing OpenGL entirely in software, before hardware
> accelerators work, and at the last minute have to change the library to
> use triangles instead of splines.

I expect that SGI did a software-only version of IrisGL first, so they could
figure out what the hardware accelerators needed to support.  And even then,
the API for IrisGL got modified when it became OpenGL.





Re: The argument for fs assistance in handling archives

2004-09-02 Thread Valdis . Kletnieks
On Thu, 02 Sep 2004 19:43:34 CDT, David Masover said:

> And on apps.  Should I teach OpenOffice.org to do version control?
> Seems a lot easier to just do it in the kernel, and teach everything to
> do version control in one fell swoop.

Including files you didn't really want to keep version control of?

How many temp files does gcc create and unlink in the course of a kernel build?
(And remember, you can't say "don't enable that on /tmp" - gcc respects the
setting of $TMPDIR - so an 'export TMPDIR=~/tmp' confuses things quite
nicely...)

And it's hard for the kernel to know that an unlink() done by gcc should be
treated differently than the "recover the last version" you *want* it to be able
to do after you work on a source file for a long while, save it, and then
fumble-finger a 'rm * .o' - you can't even use a heuristic like "don't version
control it unless it's N seconds or more old".

(Note that the "obvious" solution of creating a chattr flag has its own
complexity issues - should versioning be turned on by default for some types
and not others, etc...)

There be dragons here - it's not as simple as "drop in a plugin and be happy".





Re: Was able to reproduce "cp: cannot stat file.x: Input/output error"

2004-08-10 Thread Valdis . Kletnieks
On Tue, 10 Aug 2004 01:31:17 PDT, Hans Reiser said:

> Thanks for explaining
> 
> sync;sync;sync;halt
> 
> I always felt I was failing to grok something.

As was the author who recommended it.  It started out as:

# sync   (this one schedules the I/O)
# sync   (just a time waster typing)
# sync   (just a time waster typing)
# halt   (and we finally actually shut down)

The disks on the old PDP and Vax 750 boxes were actually sluggish enough that
if you had a whole 1M or 2M in the buffer cache to flush out, it was actually not
difficult to enter "sync", hit return, enter "halt", hit return, and have the
halt happen before sync finished, doing the predictable to the non-journaled file
systems.  Empirical studies showed that even on the biggest-memory boxes,
sync could almost always finish with 2 time-wasters before the halt.. ;)





Re: Was able to reproduce "cp: cannot stat file.x: Input/output error"

2004-08-09 Thread Valdis . Kletnieks
On Sat, 07 Aug 2004 00:49:43 PDT, Hans Reiser said:

> >I think I have discovered the problem - unless there was a reason mongo was
> >issuing mount/unmount commands at the start/end of a mongo 'run' as well as
> >before/after _each phase_.

> Probably someone wanted to separate the measurement of the phases.  It 
> has been a while since I read mongo.

Note that an unmount/mount pair will force a flush of all dirtied pages in the
in-memory file cache, and *really* not return until it's really done and really
out on disk.  In addition, sync() will force stuff to disk, but *not* invalidate
in-cache pages - more drastic measures are needed if you want to benchmark
with a cold cache (which is almost a must if you're doing actual filesystem
benchmarking, as otherwise you're benching the in-core cache instead).

As an aside, although the Linux fs/buffer:do_sync() won't return until it's
all really done, there is no mandate that the sync() syscall wait (and in fact,
that is the source of the old "type 'sync' three times, then 'halt'" advice -
the second and third times you typed sync and hit return hopefully gave the
I/O scheduled by the *first* sync time to complete).  At least one 'Unix for
Dummies' book proved their lack of depth of understanding when they recommended:

# sync;sync;sync;halt

;)





Re: Fibration questions

2004-07-23 Thread Valdis . Kletnieks
On Fri, 23 Jul 2004 12:28:49 +0300, Markus Törnqvist said:

> I think desktops for all the Joe Q. Averages are pretty much a different
> scene from servers..

It's not as different as you might think.  Remember in most corporations that use
Active Directory, all the infrastructure boxes (domain controllers, etc) are Windows
boxes too.  Quite recently, an amazing number of webservers got 0wned because
somebody browsed the net using IE while logged on at the server console.

> >And how many times has the global RAM market been put under severe strain
> >because the latest Windows upgrade needed more RAM, so everybody went out and
> >bought more RAM, and more RAM, and more RAM...
> 
> But Windows isn't the only thing that starts requiring more RAM, and
> if you can buy more for a lesser price, that's what you'll do, regardless.

No, the price of RAM went *UP*, dramatically, because demand was higher
than supply, so you were buying less for a higher price.

The point is that the manufacturers of RAM and systems had *no* incentive
to do anything to stop it.

"Microsoft is expected to recommend that the "average" Longhorn PC feature a
dual-core CPU running at 4 to 6GHz; a minimum of 2 gigs of RAM; up to a
terabyte of storage; a 1 Gbit, built-in, Ethernet-wired port and an 802.11g
wireless link; and a graphics processor that runs three times faster than those
on the market today."

http://www.microsoft-watch.com/article2/0,1995,1581842,00.asp

Now *try* to convince me that Dell and HP saw this, and their first thought
was "Let's see if we can get it to run well on a single-core 3GHz with 1G of RAM" ;)

If that was their first thought, the second was "OK, I'm done laughing, now I need
to pick myself up off the floor".

> Maybe I'm the unexperienced obnoxious adolescent again, as I'm in only
> my second job so far, but I've noticed that both employers have the
> principle that if you can get anything, even the slightest guarantee,
> that something is faster and more stable at a somewhat higher cost, it's 
> worth it. Even if you'd be paying for a scapegoat-factor warranty.

Right. Which is why you end up *buying* that faster server at higher cost than
you might really need.

Most managers have a *really* hard time dealing with the concept "If you use this
alternative, totally free, no-cost, software, it will run faster and save you money".

> Tune even faster solution and get even more power, it'll last us
> all weekend, before it goes obsolete...

You'd be *amazed* at how many sites *don't* have somebody on the payroll
who can do tuning well.  Usually, it's whatever they remember from the MCSE
exam.  Just because my shop has people experienced in tuning everything from
old ferrite-core systems to top-10 supercomputers doesn't mean every shop does. ;)

> So a Dell 2650 could have handled what the 6600 did?

No, the 2650 would certainly have gotten swamped, the two boxes are doing
different things.  The point is that *DELL* didn't have any incentive to get
me to buy a 2650 instead of another 6600.

And if I had little clue, and actually talked to a Dell sales rep, they probably could
have convinced me I needed a 6600.  

> And they're still selling 6600s, how big an impact would Reiser4's speed
> advantage have really on them? But it seems I'm in over my head now :)

Trust me, it wouldn't have helped enough to get the 6600's workload to fit
on a 2650.

> Should I have said safe instead of secure? Maybe that would be the better
> English word for it.
> Like being safe at power failures.

Is it *demonstrably* better than ext3 with 'data=journal'?

> Then there's view security, which should be implemented.

Ahh.. but view security doesn't do you as much good as it could, mostly
because of the "support at the VFS level" issues.

> I make my meager living as a small-time administrator and writer of
> web (and similar) magick in Python, so I don't know why the xattrs 
> couldn't be mapped to Reiser4 calls, but shouldn't it be technically possible?

I'll refrain from saying anything except "read the list archives".
 
> But these are just ideas, I have absolutely zero marketing experience
> so this should not be taken as a presumptuous manual on how to do things :)

You'd have more luck not talking to the people who sell hardware or systems, but
to the people who *use* hardware and systems, or who sell consulting/maintenance.

For instance, Google has multiple large server farms, each of which has 15K to 20K
systems in it.  They're a Linux shop, and would probably be willing to part with a
fairly large sum of cash if it meant their hardware upgrade costs went down even 5%.

There's lots of places making money doing custom one-off solutions based on
Linux - for instance, most of IBM's Linux revenue comes from consulting/
support. A shop that's doing systems integration might well be willing to pay
$100K for another thing in their bag of tricks that lets them land 20 contracts
that mak

Re: ext3 -> reiserfs conversion utility?

2004-06-18 Thread Valdis . Kletnieks
On Fri, 18 Jun 2004 23:00:35 CDT, David Masover said:

> Do backups.  Now.  You are an idiot and/or a cheapskate if you don't
> have backups, because one day something will happen -- probably
> something ridiculously stupid -- and you will need them.  I mean, go
> build a backup server and, if you can afford it, give it something like
> a terabyte raid5 hotplug array.  Do it now.

And if possible, don't rely on the fact that raid5 is redundant.  I know
somebody who worked at a dot-com, and a PHB bought a large 2-terabyte RAID5 in
the days when 2T was still "pretty big".  Said PHB refused to buy a separate
backup, since it was hot-swap RAID5.  Friend voiced objections, and PHB gloated
the first 2 times a single disk died and the system automagically rebuilt onto
a hot spare.

Then poetic justice arrived - a plumbing problem on a floor above caused
multiple thousands of gallons to decide the fastest way to ground level was
through the RAID5. Everybody immediately started updating their resumes,
because they *knew* the company was doomed when all their data went away.
Except for the PHB - his resume used to be on the RAID5...;)





Re: Processes dying?

2004-06-04 Thread Valdis . Kletnieks
On Fri, 04 Jun 2004 23:39:25 +0300, Markus Törnqvist said:
> Hello
> 
> I just started using the latest auto-snapshot.
> 
> I noticed weird behavior, that is, processes crash, so bad even C-c doesn't
> kill them. For some reason running strace behind them gives me C-c support.

My guess is that some *other* process got wedged in the kernel while holding
a kernel lock, causing other processes to block when they needed that lock.

Probably will need a SysRq-T output to figure out who's hung where...
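For anyone who can't reach the console keyboard, a minimal sketch of requesting the same task dump from a root shell. It assumes the kernel exposes /proc/sysrq-trigger and has SysRq enabled; the function name is just for illustration. The dump itself lands in the kernel log (dmesg), not on stdout.

```python
def request_task_dump(trigger_path="/proc/sysrq-trigger"):
    """Ask the kernel for a SysRq-T style dump of all tasks.

    Assumption: trigger_path is the procfs SysRq trigger file and the
    caller is root. Writing the single character 't' is equivalent to
    pressing SysRq-T on the console.
    """
    with open(trigger_path, "w") as trigger:
        trigger.write("t")
```

After calling request_task_dump(), look in the kernel log for the tasks sitting in uninterruptible sleep (state D) and what their stack traces are waiting on.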




Re: snapshot, checkpoints

2004-06-04 Thread Valdis . Kletnieks
On Fri, 04 Jun 2004 13:10:05 +0200, Paul Wagland said:

> Can't the same functionality be created with device mapper though? At least
> under linux anyway?

You'd need 2 things:

1) *very* recent patched device mapper (I think patches for snapshot support
went by on LKML just the day before yesterday or so).

2) You also need a suitable write-barrier interlock to the filesystem, to
basically force a flush-to-disk of all the incore data buffers, etc (basically,
you need to ensure that at the instant the snapshot is taken, the on-disk copy
is "clean" by fsck standards).

At that point, you can just have a utility that goes "flush; snapshot;" and go on
your way.
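A rough sketch of what such a "flush; snapshot;" utility could look like, assuming LVM2's lvcreate sits on top of device mapper. The volume and snapshot names are hypothetical, and os.sync() only stands in for the flush step: it asks the kernel to write out dirty buffers, but it is not by itself the write-barrier interlock described in (2).

```python
import os
import subprocess


def snapshot_command(origin, name, size="1G"):
    """Build the lvcreate call for a copy-on-write snapshot of `origin`."""
    return ["lvcreate", "--snapshot", "--size", size, "--name", name, origin]


def flush_and_snapshot(origin, name, size="1G",
                       run=subprocess.run, flush=os.sync):
    """Flush in-core buffers, then take the snapshot.

    `run` and `flush` are injectable so the ordering can be checked
    without root privileges or a real volume group.
    """
    flush()                                  # step 1: flush to disk
    cmd = snapshot_command(origin, name, size)
    run(cmd, check=True)                     # step 2: snapshot
    return cmd


# Hypothetical usage, as root:
#   flush_and_snapshot("/dev/vg0/home", "home_snap")
```

Anything written between the flush and the snapshot can still leave the snapshot dirty, which is exactly why the filesystem has to cooperate rather than just being told to sync.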





Re: The situation at hand and in the future

2004-05-28 Thread Valdis . Kletnieks
On Fri, 28 May 2004 09:33:24 +0300, Markus Törnqvist said:

> Persistent over boots? I'd like the passphrase and key to survive
> a boot...

No you don't.

If the passphrase and key are persistent, then an attacker can get your data.

Think about it - the only reason an attacker doesn't have access to your
data is because they don't have the passphrase/key.  If you leave them around,
you've given away the keys to the kingdom.
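To make that concrete, here is a minimal sketch of the usual alternative: keep only a random salt on disk and re-derive the key from the passphrase typed in at each boot. PBKDF2 is my choice for illustration, not anything specified in this thread, and the function names and iteration count are assumptions.

```python
import hashlib
import os


def new_salt():
    """Random salt; this is the only value that gets stored on disk."""
    return os.urandom(16)


def derive_key(passphrase, salt, iterations=200_000):
    """Re-derive the 32-byte key from the passphrase at unlock time.

    Neither the passphrase nor the returned key is ever written out,
    so a stolen disk yields the salt and nothing else useful.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", passphrase.encode("utf-8"), salt, iterations, dklen=32
    )
```

The salt is not a secret, so persisting it across boots gives an attacker nothing without the passphrase.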




Re: [PATCH] "metas" in reiserfs v4 snapshot 2004.03.26

2004-05-17 Thread Valdis . Kletnieks
On Sat, 15 May 2004 14:10:10 +0300, Markus Törnqvist said:

> This has been discussed. There is the mailer that uses an @-named
> symlink to the current message. Can't remember which one.

That would be MH, nmh, and exmh...




Re: reiser4 non-free?

2004-05-11 Thread Valdis . Kletnieks
On Tue, 11 May 2004 10:57:01 PDT, Hans Reiser said:

> Random credits are the elegant answer.  Displaying only the distro name 
> at boot time is morally wrong.

Would be nice - the RedHat/Fedora GUI installer already supports showing the
current install status in one pane, and scrolling through a bunch of blurbs
in another.  It might be possible to get (at least) the Fedora side of the fence
to include blurbs for the package contributors as well...

I'm uncomfortable with the very large leap between "a request for the distro
to do the morally right thing" and "required by license" however.  As the old
saying goes: "Don't let your mouth write no check your butt can't cash" - a
distro could very well be willing to accept something under a "good faith
best effort" basis, but be unwilling to commit to "required to under all
circumstances".



