Re: Finding hardlinks

2007-01-11 Thread Denis Vlasenko

On Wednesday 03 January 2007 21:26, Frank van Maarseveen wrote:
> On Wed, Jan 03, 2007 at 08:31:32PM +0100, Mikulas Patocka wrote:
> > 64-bit inode numbers space is not yet implemented on Linux --- the problem 
> > is that if you return ino >= 2^32, programs compiled without 
> > -D_FILE_OFFSET_BITS=64 will fail with stat() returning -EOVERFLOW --- this 
> > failure is specified in POSIX, but not very useful.
> 
> hmm, checking iunique(), ino_t, __kernel_ino_t... I see. Pity. So at
> some point in time we may need a sort of "ino64" mount option to be
> able to switch to a 64 bit number space on mount basis. Or (conversely)
> refuse to mount without that option if we know there are >32 bit st_ino
> out there. And invent iunique64() and use that when "ino64" specified
> for FAT/SMB/...  when those filesystems haven't been replaced by a
> successor by that time.
> 
> At that time probably all programs are either compiled with
> -D_FILE_OFFSET_BITS=64 (most already are because of files bigger than 2G)
> or completely 64 bit. 

Good plan. Be prepared to redo it when 64 bits feels "small" as well.
Then again when 128 bits is "small". Don't tell me this won't happen.
15 years ago people would have laughed at the idea of 32-bit inode numbers not being enough.
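
A minimal sketch of the overflow condition described in the quoted text above: a 32-bit ino_t cannot hold values of 2^32 or more, so a legacy stat() has to fail rather than truncate. The function name `legacy_stat_ino` is hypothetical and only models the check; it is not kernel code.

```python
import errno
import os

INO32_MAX = 2**32 - 1  # largest value a 32-bit ino_t can hold


def legacy_stat_ino(ino):
    """Model what a caller with a 32-bit ino_t sees for a given inode number."""
    if ino > INO32_MAX:
        # POSIX specifies EOVERFLOW when a value does not fit the result struct.
        raise OSError(errno.EOVERFLOW, os.strerror(errno.EOVERFLOW))
    return ino
```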
--
vda
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: Finding hardlinks

2007-01-11 Thread Denis Vlasenko

On Wednesday 03 January 2007 13:42, Pavel Machek wrote:
> I guess that is the way to go. samefile(path1, path2) is unfortunately
> inherently racy.

Not a problem in practice. You don't expect cp -a
to reliably copy a tree which something else is modifying
at the same time.

Thus we assume that the tree we operate on is not modified.
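
For illustration, a path-based check in the spirit of samefile(path1, path2) can be sketched with Python's `os.path.samefile`, which compares the device and inode numbers from two stat() calls. As discussed, the answer is only valid for the moment of the call. The temporary file names are made up for the demo.

```python
import os
import tempfile


def demo_samefile():
    """Create a file, a hard link to it, and an unrelated file; compare them."""
    with tempfile.TemporaryDirectory() as top:
        original = os.path.join(top, "original")
        link = os.path.join(top, "link")
        other = os.path.join(top, "other")
        for path in (original, other):
            with open(path, "w") as f:
                f.write("data")
        os.link(original, link)  # second name for the same inode
        return (os.path.samefile(original, link),
                os.path.samefile(original, other))


linked, unrelated = demo_samefile()
print(linked, unrelated)
```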
--
vda

Re: Finding hardlinks

2007-01-11 Thread Denis Vlasenko

On Thursday 28 December 2006 10:06, Benny Halevy wrote:
> Mikulas Patocka wrote:
> >>> If user (or script) doesn't specify that flag, it doesn't help. I think
> >>> the best solution for these filesystems would be either to add new syscall
> >>>   int is_hardlink(char *filename1, char *filename2)
> >>> (but I know adding syscall bloat may be objectionable)
> >> it's also the wrong api; the filenames may have been changed under you
> >> just as you return from this call, so it really is a
> >> "was_hardlink_at_some_point()" as you specify it.
> >> If you make it work on fd's.. it has a chance at least.
> > 
> > Yes, but it doesn't matter --- if the tree changes under "cp -a" command, 
> > no one guarantees you what you get.
> > int fis_hardlink(int handle1, int handle2);
> > Is another possibility but it can't detect hardlinked symlinks.

It also suffers from combinatorial explosion.
cp -a on 10^6 files will require ~0.5 * 10^12 compares...
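
The arithmetic behind that estimate, plus the usual alternative, as a rough sketch: comparing every pair of n files costs n(n-1)/2 calls, while keying a table on (st_dev, st_ino) needs one stat call per file. The helper names are hypothetical.

```python
import os
from collections import defaultdict


def pairwise_compares(n):
    """Number of pair comparisons needed to test n files against each other."""
    return n * (n - 1) // 2


def group_hardlinks(paths):
    """Group paths by (st_dev, st_ino) with one lstat() per path."""
    groups = defaultdict(list)
    for path in paths:
        st = os.lstat(path)
        groups[(st.st_dev, st.st_ino)].append(path)
    # Keep only identities that were reached through more than one name.
    return [names for names in groups.values() if len(names) > 1]


print(pairwise_compares(10**6))  # 499999500000, i.e. about 0.5 * 10^12
```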
 
> It seems like the posix idea of unique <st_dev, st_ino> doesn't
> hold water for modern file systems and that creates real problems for
> backup apps which rely on that to detect hard links.

Yes, and it should have been obvious at the 32->64bit inode# transition.
Unfortunately people tend to think "ok, NOW this new shiny BIGNUM-bit
field is big enough for everybody". Then the cycle repeats in five years...

I think the solution is that inode "numbers" should become
opaque _variable-length_ hashes. They are already just hash values,
this is nothing new. All the problems stem solely from the fixed width of inode#.

--
vda

Re: Finding hardlinks

2007-01-11 Thread Pádraig Brady

Frank van Maarseveen wrote:
> On Tue, Jan 09, 2007 at 11:26:25AM -0500, Steven Rostedt wrote:
>> On Mon, 2007-01-08 at 13:00 +0100, Miklos Szeredi wrote:
>>
>>>> 50% probability of false positive on 4G files seems like very ugly
>>>> design problem to me.
>>> 4 billion files, each with more than one link is pretty far fetched.
>>> And anyway, filesystems can take steps to prevent collisions, as they
>>> do currently for 32bit st_ino, without serious difficulties
>>> apparently.
>> Maybe not 4 billion files, but you can get a large number of >1 linked
>> files, when you copy full directories with "cp -rl".
> 
> Yes but "cp -rl" is typically done by _developers_ and they tend to
> have a better understanding of this (uh, at least within linux context
> I hope so).

I'm not really following this thread, but that's wrong.
A lot of people use hardlinks to provide snapshot functionality.
I.e. the following can be used to efficiently make snapshots:

rsync -a /src/ /backup/today
cp -al /backup/today /backup/$Date
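
A rough Python equivalent of the cp -al step, for readers who want to see what such a snapshot consists of: directories are recreated, and every file becomes a new hard link to the same inode, so no file data is copied. The function name `hardlink_snapshot` is hypothetical, and unlike cp -a this sketch does not preserve directory metadata or treat symlinks specially.

```python
import os


def hardlink_snapshot(src, dst):
    """Mirror the directory tree at src into dst, hard-linking the files."""
    for dirpath, _dirnames, filenames in os.walk(src):
        rel = os.path.relpath(dirpath, src)
        target_dir = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            # A new name for the existing inode; st_nlink goes up by one.
            os.link(os.path.join(dirpath, name), os.path.join(target_dir, name))
```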

See also:

http://www.dirvish.org/
http://www.rsnapshot.org/
http://igmus.org/code/

> Also, just adding hard-links doesn't increase the number of inodes.

I don't think that was the point.

Pádraig.

Re: [nfsv4] RE: Finding hardlinks

2007-01-10 Thread Benny Halevy

Nicolas Williams wrote:
> On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote:
>> I agree that the way the client implements its cache is out of the protocol
>> scope. But how do you interpret "correct behavior" in section 4.2.1?
>>  "Clients MUST use filehandle comparisons only to improve performance, not 
>> for correct behavior. All clients need to be prepared for situations in 
>> which it cannot be determined whether two filehandles denote the same object 
>> and in such cases, avoid making invalid assumptions which might cause 
>> incorrect behavior."
>> Don't you consider data corruption due to cache inconsistency an incorrect 
>> behavior?
> 
> If a file with multiple hardlinks appears to have multiple distinct
> filehandles then a client like Trond's will treat it as multiple
> distinct files (with the same hardlink count, and you won't be able to
> find the other links to them -- oh well).  Can this cause data
> corruption?  Yes, but only if there are applications that rely on the
> different file names referencing the same file, and backup apps on the
> client won't get the hardlinks right either.

The case I'm discussing is multiple filehandles for the same name,
not even for different hardlinks.  This causes spurious EIO errors
on the client when the filehandle changes and cache inconsistency
when opening the file multiple times in parallel.

> 
> What I don't understand is why getting the fileid is so hard -- always
> GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
> difficult as it is to maintain a hash table of fileids.

It's not difficult at all, just that the client can't rely on the fileids to be
unique in both space and time because of server non-compliance (e.g. netapp's
snapshots) and fileid reuse after delete.



Re: Finding hardlinks

2007-01-09 Thread Steven Rostedt


On Tue, 9 Jan 2007, Frank van Maarseveen wrote:

>
> Yes but "cp -rl" is typically done by _developers_ and they tend to
> have a better understanding of this (uh, at least within linux context
> I hope so).
>
> Also, just adding hard-links doesn't increase the number of inodes.

No, but it increases the number of inodes that have link >1. :)
-- Steve


Re: Finding hardlinks

2007-01-09 Thread Frank van Maarseveen

On Tue, Jan 09, 2007 at 11:26:25AM -0500, Steven Rostedt wrote:
> On Mon, 2007-01-08 at 13:00 +0100, Miklos Szeredi wrote:
> 
> > > 50% probability of false positive on 4G files seems like very ugly
> > > design problem to me.
> > 
> > 4 billion files, each with more than one link is pretty far fetched.
> > And anyway, filesystems can take steps to prevent collisions, as they
> > do currently for 32bit st_ino, without serious difficulties
> > apparently.
> 
> Maybe not 4 billion files, but you can get a large number of >1 linked
> files, when you copy full directories with "cp -rl".

Yes but "cp -rl" is typically done by _developers_ and they tend to
have a better understanding of this (uh, at least within linux context
I hope so).

Also, just adding hard-links doesn't increase the number of inodes.

-- 
Frank

Re: Finding hardlinks

2007-01-09 Thread Steven Rostedt

On Mon, 2007-01-08 at 13:00 +0100, Miklos Szeredi wrote:

> > 50% probability of false positive on 4G files seems like very ugly
> > design problem to me.
> 
> 4 billion files, each with more than one link is pretty far fetched.
> And anyway, filesystems can take steps to prevent collisions, as they
> do currently for 32bit st_ino, without serious difficulties
> apparently.

Maybe not 4 billion files, but you can get a large number of >1 linked
files, when you copy full directories with "cp -rl".  Which I do a lot
when developing. I've done that a few times with the Linux tree.  Given
other utils that copy as hard links, 4 billion files with >1 link is
perhaps possible, and perhaps even likely in the near future.
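
To put a number on this for a given tree, one could count the distinct inodes whose link count exceeds one. A sketch, with the hypothetical helper `count_multiply_linked`; note that it relies on (st_dev, st_ino) being unique, which is exactly the property under discussion.

```python
import os


def count_multiply_linked(root):
    """Count distinct non-directory inodes under root that have st_nlink > 1."""
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if st.st_nlink > 1:
                # Several names of one inode collapse into a single set entry.
                seen.add((st.st_dev, st.st_ino))
    return len(seen)
```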

-- Steve


Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi

> > You mean POSIX compliance is impossible?  So what?  It is possible to
> > implement an approximation that is _at least_ as good as samefile().
> > One really dumb way is to set st_ino to the 'struct inode' pointer for
> > example.  That will sure as hell fit into 64bits and will give a
> > unique (alas not stable) identifier for each file.  Opening two files,
> > doing fstat() on them and comparing st_ino will give exactly the same
> > guarantees as samefile().
> 
> Good, ... except that it doesn't work. AFAIK, POSIX mandates inodes
> to be unique until umount, not until inode cache expires :-)
> 
> IOW, if you have such implementation of st_ino, you can emulate samefile()
> with it, but you cannot have it without violating POSIX.

The whole discussion started out from the premise that some
filesystems can't support stable unique inode numbers, i.e. they don't
conform to POSIX.

Filesystems which do conform to POSIX have _no need_ for samefile().
Ones that don't conform can choose a scheme that is best suited to
applications' needs, balancing uniqueness and stability in various ways.

> > 4 billion files, each with more than one link is pretty far fetched.
> 
> Not on terabyte scale disk arrays, which are getting quite common these days.
> 
> > And anyway, filesystems can take steps to prevent collisions, as they
> > do currently for 32bit st_ino, without serious difficulties
> > apparently.
> 
> They currently do that usually by not supporting more than 4G files
> in a single FS.

And with 64bit st_ino, they'll have to live with the limitation of not
more than 2^64 files.  Tough luck ;)

Miklos

Re: Finding hardlinks

2007-01-08 Thread Martin Mares

Hello!

> You mean POSIX compliance is impossible?  So what?  It is possible to
> implement an approximation that is _at least_ as good as samefile().
> One really dumb way is to set st_ino to the 'struct inode' pointer for
> example.  That will sure as hell fit into 64bits and will give a
> unique (alas not stable) identifier for each file.  Opening two files,
> doing fstat() on them and comparing st_ino will give exactly the same
> guarantees as samefile().

Good, ... except that it doesn't work. AFAIK, POSIX mandates inodes
to be unique until umount, not until inode cache expires :-)

IOW, if you have such implementation of st_ino, you can emulate samefile()
with it, but you cannot have it without violating POSIX.

> 4 billion files, each with more than one link is pretty far fetched.

Not on terabyte scale disk arrays, which are getting quite common these days.

> And anyway, filesystems can take steps to prevent collisions, as they
> do currently for 32bit st_ino, without serious difficulties
> apparently.

They currently do that usually by not supporting more than 4G files
in a single FS.

Have a nice fortnight
-- 
Martin `MJ' Mares  <[EMAIL PROTECTED]>   
http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
"Oh no, not again!"  -- The bowl of petunias

Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi

> > There's really no point trying to push for such an inferior interface
> > when the problems which samefile is trying to address are purely
> > theoretical.
> 
> Oh yes, there is. st_ino is powerful, *but impossible to implement*
> on many filesystems.

You mean POSIX compliance is impossible?  So what?  It is possible to
implement an approximation that is _at least_ as good as samefile().
One really dumb way is to set st_ino to the 'struct inode' pointer for
example.  That will sure as hell fit into 64bits and will give a
unique (alas not stable) identifier for each file.  Opening two files,
doing fstat() on them and comparing st_ino will give exactly the same
guarantees as samefile().
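
The comparison described here, sketched in Python: open both files, fstat() the descriptors, and compare the identifiers while both stay open. `same_open_file` is a hypothetical name; comparing st_dev as well as st_ino is an addition in this sketch, since inode numbers are only meaningful within one filesystem.

```python
import os


def same_open_file(fd1, fd2):
    """True if two open descriptors refer to the same inode on the same device."""
    st1, st2 = os.fstat(fd1), os.fstat(fd2)
    return (st1.st_dev, st1.st_ino) == (st2.st_dev, st2.st_ino)
```

Because both files are held open during the comparison, neither inode can be freed and reused in between, which is why even an unstable identifier is enough for this check.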

> > Currently linux is living with 32bit st_ino because of legacy apps,
> > and people are not constantly agonizing about it.  Fixing the
> > EOVERFLOW problem will enable filesystems to slowly move towards 64bit
> > st_ino, which should be more than enough.
> 
> 50% probability of false positive on 4G files seems like very ugly
> design problem to me.

4 billion files, each with more than one link is pretty far fetched.
And anyway, filesystems can take steps to prevent collisions, as they
do currently for 32bit st_ino, without serious difficulties
apparently.

Miklos

Re: Finding hardlinks

2007-01-08 Thread Pavel Machek

Hi!

> > >> No one guarantees you sane result of tar or cp -a while changing the 
> > >> tree.
> > >> I don't see how is_samefile() could make it worse.
> > >
> > > There are several cases where changing the tree doesn't affect the
> > > correctness of the tar or cp -a result.  In some of these cases using
> > > samefile() instead of st_ino _will_ result in a corrupted result.
> > 
> > ... and those are what?
> 
>   - /a/p/x and /a/q/x are links to the same file
> 
>   - /b/y and /a/q/y are links to the same file
> 
>   - tar is running on /a
> 
>   - meanwhile the following commands are executed:
> 
>  mv /a/p/x /b/x
>  mv /b/y /a/p/x
> 
> With st_ino checking you'll get a perfectly consistent archive,
> regardless of the timing.  With samefile() you could get an archive
> where the data in /a/q/y is not stored, instead it will contain the
> data of /a/q/x.
> 
> Note, this is far nastier than the "normal" corruption you usually get
> with changing the tree under tar, the file is not just duplicated or
> missing, it becomes a completely different file, even though it hasn't
> been touched at all during the archiving.
> 
> The basic problem with samefile() is that it can only compare files at
> a single snapshot in time, and cannot take into account any changes in
> the tree (unless keeping files open, which is impractical).

> There's really no point trying to push for such an inferior interface
> when the problems which samefile is trying to address are purely
> theoretical.

Oh yes, there is. st_ino is powerful, *but impossible to implement*
on many filesystems. You are of course welcome to combine st_ino with
samefile.

> Currently linux is living with 32bit st_ino because of legacy apps,
> and people are not constantly agonizing about it.  Fixing the
> EOVERFLOW problem will enable filesystems to slowly move towards 64bit
> st_ino, which should be more than enough.

50% probability of false positive on 4G files seems like a very ugly
design problem to me.
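
One way to arrive at a figure of that order is the birthday approximation, assuming the identifiers behave like uniformly random 64-bit values (an assumption made for this sketch, not something stated above). With n = 2^32 files the estimate comes out near 39%, the same ballpark as the 50% quoted.

```python
import math


def collision_probability(n_files, id_bits):
    """Birthday approximation: chance that at least two of n random ids collide."""
    space = 2 ** id_bits
    return 1.0 - math.exp(-n_files * (n_files - 1) / (2 * space))


print(round(collision_probability(2**32, 64), 3))  # about 0.393
```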
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Finding hardlinks

2007-01-08 Thread Pavel Machek

On Fri 2007-01-05 16:15:41, Miklos Szeredi wrote:
> > > And does it matter? If you rename a file, tar might skip it no matter of 
> > > hardlink detection (if readdir races with rename, you can read none of 
> > > the 
> > > names of file, one or both --- all these are possible).
> > > 
> > > If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delete 
> > > both "a" and "b" and create totally new files "dir2/c" linked to 
> > > "dir2/d", 
> > > tar might hardlink both "c" and "d" to "a" and "b".
> > > 
> > > No one guarantees you sane result of tar or cp -a while changing the 
> > > tree. 
> > > I don't see how is_samefile() could make it worse.
> > 
> > There are several cases where changing the tree doesn't affect the
> > correctness of the tar or cp -a result.  In some of these cases using
> > samefile() instead of st_ino _will_ result in a corrupted result.
> 
> Also note, that using st_ino in combination with samefile() doesn't
> make the result much better, it eliminates false positives, but cannot
> fix false negatives.

I'd argue false negatives are not as severe.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi

> >> No one guarantees you sane result of tar or cp -a while changing the tree.
> >> I don't see how is_samefile() could make it worse.
> >
> > There are several cases where changing the tree doesn't affect the
> > correctness of the tar or cp -a result.  In some of these cases using
> > samefile() instead of st_ino _will_ result in a corrupted result.
> 
> ... and those are what?

  - /a/p/x and /a/q/x are links to the same file

  - /b/y and /a/q/y are links to the same file

  - tar is running on /a

  - meanwhile the following commands are executed:

 mv /a/p/x /b/x
 mv /b/y /a/p/x

With st_ino checking you'll get a perfectly consistent archive,
regardless of the timing.  With samefile() you could get an archive
where the data in /a/q/y is not stored, instead it will contain the
data of /a/q/x.

Note, this is far nastier than the "normal" corruption you usually get
with changing the tree under tar, the file is not just duplicated or
missing, it becomes a completely different file, even though it hasn't
been touched at all during the archiving.

The basic problem with samefile() is that it can only compare files at
a single snapshot in time, and cannot take into account any changes in
the tree (unless keeping files open, which is impractical).
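
The scenario above can be replayed on a real filesystem with a toy archiver. This is only a sketch: `toy_archive` is a made-up stand-in for tar, the renames are injected at a fixed point (right after a/p/x is archived) to make the race deterministic, and Python's `os.path.samefile` stands in for the proposed samefile() call in the sense that it compares two paths at the moment it is called.

```python
import os
import tempfile


def build_tree(top):
    """Create the layout: file X = {a/p/x, a/q/x}, file Y = {b/y, a/q/y}."""
    dir_a, dir_b = os.path.join(top, "a"), os.path.join(top, "b")
    for d in (os.path.join(dir_a, "p"), os.path.join(dir_a, "q"), dir_b):
        os.makedirs(d)
    for path, data in ((os.path.join(dir_a, "p", "x"), b"X"),
                       (os.path.join(dir_b, "y"), b"Y")):
        with open(path, "wb") as f:
            f.write(data)
    os.link(os.path.join(dir_a, "p", "x"), os.path.join(dir_a, "q", "x"))
    os.link(os.path.join(dir_b, "y"), os.path.join(dir_a, "q", "y"))
    return dir_a, dir_b


def concurrent_renames(dir_a, dir_b):
    """The two mv commands that run while the archiver is working."""
    os.rename(os.path.join(dir_a, "p", "x"), os.path.join(dir_b, "x"))
    os.rename(os.path.join(dir_b, "y"), os.path.join(dir_a, "p", "x"))


def toy_archive(root, names, use_samefile, after_first):
    """Return {name: ("data", bytes) or ("link", earlier name)} for names under root."""
    archive, seen_ids, seen_names = {}, {}, []
    for i, name in enumerate(names):
        path = os.path.join(root, name)
        st = os.stat(path)
        target = None
        if st.st_nlink > 1:
            if use_samefile:
                # Compares against what the earlier paths point to *now*.
                for earlier in seen_names:
                    if os.path.samefile(path, os.path.join(root, earlier)):
                        target = earlier
                        break
            else:
                # Looks up the identity recorded when the earlier file was read.
                target = seen_ids.get((st.st_dev, st.st_ino))
        if target is None:
            with open(path, "rb") as f:
                archive[name] = ("data", f.read())
            seen_ids[(st.st_dev, st.st_ino)] = name
            seen_names.append(name)
        else:
            archive[name] = ("link", target)
        if i == 0:
            after_first()  # the tree changes while the archiver is mid-run
    return archive


def stored_data(archive, name):
    """Follow link entries until the archived bytes are reached."""
    kind, value = archive[name]
    return value if kind == "data" else stored_data(archive, value)


def run(use_samefile):
    """Archive a fresh copy of the tree and return what ends up stored for a/q/y."""
    with tempfile.TemporaryDirectory() as top:
        dir_a, dir_b = build_tree(top)
        archive = toy_archive(dir_a, ["p/x", "q/x", "q/y"], use_samefile,
                              lambda: concurrent_renames(dir_a, dir_b))
        return stored_data(archive, "q/y")


print(run(use_samefile=False))  # st_ino table: a/q/y keeps its own data
print(run(use_samefile=True))   # path comparison: a/q/y ends up with X's data
```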

There's really no point trying to push for such an inferior interface
when the problems which samefile is trying to address are purely
theoretical.

Currently linux is living with 32bit st_ino because of legacy apps,
and people are not constantly agonizing about it.  Fixing the
EOVERFLOW problem will enable filesystems to slowly move towards 64bit
st_ino, which should be more than enough.

Miklos

Re: Finding hardlinks

2007-01-08 Thread Pavel Machek

On Fri 2007-01-05 16:15:41, Miklos Szeredi wrote:
> > > And does it matter? If you rename a file, tar might skip it no matter of
> > > hardlink detection (if readdir races with rename, you can read none of the
> > > names of file, one or both --- all these are possible).
> > >
> > > If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delete
> > > both "a" and "b" and create totally new files "dir2/c" linked to "dir2/d",
> > > tar might hardlink both "c" and "d" to "a" and "b".
> > >
> > > No one guarantees you sane result of tar or cp -a while changing the tree.
> > > I don't see how is_samefile() could make it worse.
> >
> > There are several cases where changing the tree doesn't affect the
> > correctness of the tar or cp -a result.  In some of these cases using
> > samefile() instead of st_ino _will_ result in a corrupted result.
>
> Also note, that using st_ino in combination with samefile() doesn't
> make the result much better, it eliminates false positives, but cannot
> fix false negatives.

I'd argue false negatives are not as severe.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Finding hardlinks

2007-01-08 Thread Pavel Machek

Hi!

> > > No one guarantees you sane result of tar or cp -a while changing the tree.
> > > I don't see how is_samefile() could make it worse.
> > >
> > > There are several cases where changing the tree doesn't affect the
> > > correctness of the tar or cp -a result.  In some of these cases using
> > > samefile() instead of st_ino _will_ result in a corrupted result.
> >
> > ... and those are what?
>
>   - /a/p/x and /a/q/x are links to the same file
>
>   - /b/y and /a/q/y are links to the same file
>
>   - tar is running on /a
>
>   - meanwhile the following commands are executed:
>
>  mv /a/p/x /b/x
>  mv /b/y /a/p/x
>
> With st_ino checking you'll get a perfectly consistent archive,
> regardless of the timing.  With samefile() you could get an archive
> where the data in /a/q/y is not stored, instead it will contain the
> data of /a/q/x.
>
> Note, this is far nastier than the "normal" corruption you usually get
> with changing the tree under tar, the file is not just duplicated or
> missing, it becomes a completely different file, even though it hasn't
> been touched at all during the archiving.
>
> The basic problem with samefile() is that it can only compare files at
> a single snapshot in time, and cannot take into account any changes in
> the tree (unless keeping files open, which is impractical).
>
> There's really no point trying to push for such an inferior interface
> when the problems which samefile is trying to address are purely
> theoretical.

Oh yes, there is. st_ino is powerful, *but impossible to implement*
on many filesystems. You are of course welcome to combine st_ino with
samefile.

> Currently linux is living with 32bit st_ino because of legacy apps,
> and people are not constantly agonizing about it.  Fixing the
> EOVERFLOW problem will enable filesystems to slowly move towards 64bit
> st_ino, which should be more than enough.

50% probability of false positive on 4G files seems like very ugly
design problem to me.
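
The figure above can be checked with the usual birthday estimate. This is only a rough sketch: it assumes the 64-bit identifiers are drawn uniformly at random (for example by hashing some larger file identity), which is an assumption and not something the thread specifies.

```latex
% Assumption: n identifiers drawn uniformly at random from N = 2^{64} values.
\[
  p(n) \;\approx\; 1 - \exp\!\left(-\frac{n(n-1)}{2N}\right),
  \qquad N = 2^{64}
\]
\[
  p\!\left(2^{32}\right) \;\approx\; 1 - e^{-1/2} \;\approx\; 0.39,
  \qquad
  p(n) = \tfrac{1}{2}
  \;\text{ at }\;
  n \approx \sqrt{2 \ln 2 \cdot 2^{64}} \approx 1.18 \cdot 2^{32}
\]
```

Under that assumption the chance of at least one colliding pair among 4G files is about 39%, and it reaches 50% at roughly 5 billion files. A filesystem that allocates inode numbers sequentially instead of randomly does not follow this estimate.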
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi

> > There's really no point trying to push for such an inferior interface
> > when the problems which samefile is trying to address are purely
> > theoretical.
>
> Oh yes, there is. st_ino is powerful, *but impossible to implement*
> on many filesystems.

You mean POSIX compliance is impossible?  So what?  It is possible to
implement an approximation that is _at least_ as good as samefile().
One really dumb way is to set st_ino to the 'struct inode' pointer for
example.  That will sure as hell fit into 64bits and will give a
unique (alas not stable) identifier for each file.  Opening two files,
doing fstat() on them and comparing st_ino will give exactly the same
guarantees as samefile().
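
A minimal sketch of the open-and-fstat comparison described above. The function name same_file_by_fstat is hypothetical, not an existing kernel or libc interface, and the sketch assumes both paths can be opened read-only (so it will not work for unreadable files, and would block on a FIFO).

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper, not an existing interface.
 * Returns 1 if both paths name the same file, 0 if they differ,
 * -1 on error.  Both files are held open while the identifiers are
 * compared, so neither inode number can be recycled in between.
 */
int same_file_by_fstat(const char *path1, const char *path2)
{
    struct stat st1, st2;
    int result = -1;
    int fd1 = open(path1, O_RDONLY);
    int fd2 = -1;

    if (fd1 < 0)
        return -1;
    fd2 = open(path2, O_RDONLY);
    if (fd2 < 0)
        goto out;
    if (fstat(fd1, &st1) == 0 && fstat(fd2, &st2) == 0)
        result = (st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
out:
    if (fd2 >= 0)
        close(fd2);
    close(fd1);
    return result;
}
```

Because the answer only holds while both descriptors are open, this has the same "single snapshot in time" limitation as the proposed samefile().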

> > Currently linux is living with 32bit st_ino because of legacy apps,
> > and people are not constantly agonizing about it.  Fixing the
> > EOVERFLOW problem will enable filesystems to slowly move towards 64bit
> > st_ino, which should be more than enough.
>
> 50% probability of false positive on 4G files seems like very ugly
> design problem to me.

4 billion files, each with more than one link is pretty far fetched.
And anyway, filesystems can take steps to prevent collisions, as they
do currently for 32bit st_ino, without serious difficulties
apparently.

Miklos

Re: Finding hardlinks

2007-01-08 Thread Martin Mares

Hello!

> You mean POSIX compliance is impossible?  So what?  It is possible to
> implement an approximation that is _at least_ as good as samefile().
> One really dumb way is to set st_ino to the 'struct inode' pointer for
> example.  That will sure as hell fit into 64bits and will give a
> unique (alas not stable) identifier for each file.  Opening two files,
> doing fstat() on them and comparing st_ino will give exactly the same
> guarantees as samefile().

Good, ... except that it doesn't work. AFAIK, POSIX mandates inodes
to be unique until umount, not until inode cache expires :-)

IOW, if you have such implementation of st_ino, you can emulate samefile()
with it, but you cannot have it without violating POSIX.

> 4 billion files, each with more than one link is pretty far fetched.

Not on terabyte scale disk arrays, which are getting quite common these days.

> And anyway, filesystems can take steps to prevent collisions, as they
> do currently for 32bit st_ino, without serious difficulties
> apparently.

They currently do that usually by not supporting more than 4G files
in a single FS.

Have a nice fortnight
-- 
Martin `MJ' Mares  [EMAIL PROTECTED]   
http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Oh no, not again!  -- The bowl of petunias

Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi

> > You mean POSIX compliance is impossible?  So what?  It is possible to
> > implement an approximation that is _at least_ as good as samefile().
> > One really dumb way is to set st_ino to the 'struct inode' pointer for
> > example.  That will sure as hell fit into 64bits and will give a
> > unique (alas not stable) identifier for each file.  Opening two files,
> > doing fstat() on them and comparing st_ino will give exactly the same
> > guarantees as samefile().
>
> Good, ... except that it doesn't work. AFAIK, POSIX mandates inodes
> to be unique until umount, not until inode cache expires :-)
>
> IOW, if you have such implementation of st_ino, you can emulate samefile()
> with it, but you cannot have it without violating POSIX.

The whole discussion started out from the premise, that some
filesystems can't support stable unique inode numbers, i.e. they don't
conform to POSIX.

Filesystems which do conform to POSIX have _no need_ for samefile().
Ones that don't conform can choose a scheme that is best suited to the
applications' needs, balancing uniqueness and stability in various ways.

> > 4 billion files, each with more than one link is pretty far fetched.
>
> Not on terabyte scale disk arrays, which are getting quite common these days.
>
> > And anyway, filesystems can take steps to prevent collisions, as they
> > do currently for 32bit st_ino, without serious difficulties
> > apparently.
>
> They currently do that usually by not supporting more than 4G files
> in a single FS.

And with 64bit st_ino, they'll have to live with the limitation of not
more than 2^64 files.  Tough luck ;)

Miklos

Re: Finding hardlinks

2007-01-07 Thread Mikulas Patocka


Currently, large file support is already necessary to handle dvd and
video. It's also useful for images for virtualization. So the failing
stat() calls should already be a thing of the past with modern
distributions.


As long as glibc compiles by default with 32-bit ino_t, the problem exists
and is severe --- programs handling large files, such as coreutils, tar,
mc, mplayer, already compile with 64-bit ino_t and off_t, but the user (or
script) may type something like:

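cat >file.c <<EOF
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void)
{
    int h;
    struct stat st;
    /* Create an empty file; this should only fail if e.g. the disk is full. */
    if ((h = creat("foo", 0600)) < 0) perror("creat"), exit(1);
    /* With a 32-bit ino_t this fails with EOVERFLOW if st_ino >= 2^32. */
    if (fstat(h, &st)) perror("stat"), exit(1);
    close(h);
    return 0;
}
EOF
gcc file.c; ./a.out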

--- and you certainly do not want this to fail (unless you are out of disk
space).

The difference is, that with 32-bit program and 64-bit off_t, you get
deterministic failure on large files, with 32-bit program and 64-bit
ino_t, you get random failures.
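
Why the failure looks random can be seen from a simplified model of the check. This is a sketch only, not actual kernel code; the function name store_ino32 is made up for illustration.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Simplified model (not kernel code) of what a stat() call with a
 * 32-bit st_ino field has to do with a 64-bit inode number: store it
 * unchanged if it fits, fail with EOVERFLOW otherwise.
 */
int store_ino32(uint64_t ino, uint32_t *st_ino)
{
    if (ino > UINT32_MAX)
        return -EOVERFLOW;
    *st_ino = (uint32_t)ino;
    return 0;
}
```

Whether a given file trips the check depends only on which inode number the filesystem happened to assign, which the user cannot see or control. A large file, by contrast, fails predictably because of its size.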


What's (technically) the problem with changing the gcc default?


Technically none (i.e. edit gcc specs or glibc includes). But persuading 
all distribution builders to use this version is impossible. Plus there 
are many binary programs that are unchangeable.



Alternatively we could make the error deterministic in various ways. Start
st_ino numbering from 4G (except for a few special ones maybe such
as root/mounts). Or make old and new programs look differently at the
ELF level or by sys_personality() and/or check against a "ino64" mount
flag/filesystem feature. Lots of possibilities.


I think the best solution would be to drop -EOVERFLOW on st_ino and let 
legacy 32-bit programs live with colliding inodes. They'll have them anyway.


Mikulas

Re: Finding hardlinks

2007-01-07 Thread Mikulas Patocka


And does it matter? If you rename a file, tar might skip it no matter of
hardlink detection (if readdir races with rename, you can read none of the
names of file, one or both --- all these are possible).

If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delete
both "a" and "b" and create totally new files "dir2/c" linked to "dir2/d",
tar might hardlink both "c" and "d" to "a" and "b".

No one guarantees you sane result of tar or cp -a while changing the tree.
I don't see how is_samefile() could make it worse.


There are several cases where changing the tree doesn't affect the
correctness of the tar or cp -a result.  In some of these cases using
samefile() instead of st_ino _will_ result in a corrupted result.


... and those are what? If you create hardlinks while copying, you may 
have files duplicated instead of hardlinked in the backup. If you unlink 
hardlinks, cp will miss hardlinks too and create two copies of the same 
file (it searches the hash only for files with i_nlink > 1). If you rename 
files, the archive will be completely fscked up (either missing or 
duplicate files).



Generally samefile() is _weaker_ than the st_ino interface in
comparing the identity of two files without using massive amounts of
memory.  You're searching for a better solution, not one that is
broken in a different way, aren't you?


What is the relevant case where st_ino/st_dev works and samefile(char 
*path1, char *path2) doesn't?



Miklos


Mikulas

RE: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Halevy, Benny

> From: [EMAIL PROTECTED] on behalf of Nicolas Williams
> Sent: Fri 1/5/2007 18:40
> To: Halevy, Benny
> Cc: Trond Myklebust; Jan Harkes; Miklos Szeredi; nfsv4@ietf.org; 
> linux-kernel@vger.kernel.org; Mikulas Patocka; linux-fsdevel@vger.kernel.org; 
> Jeff Layton; Arjan van de Ven
> Subject: Re: [nfsv4] RE: Finding hardlinks
> 
> On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote:
> > I agree that the way the client implements its cache is out of the protocol
> > scope. But how do you interpret "correct behavior" in section 4.2.1?
> >  "Clients MUST use filehandle comparisons only to improve performance, not
> > for correct behavior. All clients need to be prepared for situations in
> > which it cannot be determined whether two filehandles denote the same
> > object and in such cases, avoid making invalid assumptions which might
> > cause incorrect behavior."
> > Don't you consider data corruption due to cache inconsistency an incorrect 
> > behavior?
> 
> If a file with multiple hardlinks appears to have multiple distinct
> filehandles then a client like Trond's will treat it as multiple
> distinct files (with the same hardlink count, and you won't be able to
> find the other links to them -- oh well).  Can this cause data
> corruption?  Yes, but only if there are applications that rely on the
> different file names referencing the same file, and backup apps on the
> client won't get the hardlinks right either.

Well, this is why the hard links were made, no?
FWIW, I believe that rename of an open file might also produce this problem.


> 
> What I don't understand is why getting the fileid is so hard -- always
> GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
> difficult as it is to maintain a hash table of fileids.


The problem with NFS is that fileid isn't enough because the client doesn't
know about removes by other clients until it uses the stale filehandle.
Also, quite a few file systems are not keeping fileids unique (this triggered
this thread)
 
> 
> Nico
> --


Re: Finding hardlinks

2007-01-05 Thread Bodo Eggert

Miklos Szeredi <[EMAIL PROTECTED]> wrote:

>> > Well, sort of.  Samefile without keeping fds open doesn't have any
>> > protection against the tree changing underneath between first
>> > registering a file and later opening it.  The inode number is more
>> 
>> You only need to keep one-file-per-hardlink-group open during final
>> verification, checking that inode hashing produced reasonable results.
> 
> What final verification?  I wasn't just talking about 'tar' but all
> cases where st_ino might be used to check the identity of two files at
> possibly different points in time.
> 
> Time A: remember identity of file X
> Time B: check if identity of file Y matches that of file X
> 
> With samefile() if you open X at A, and keep it open till B, you can
> accumulate large numbers of open files and the application can fail.
> 
> If you don't keep an open file, just remember the path, then renaming
> X will foil the later identity check.  Changing the file at this path
> between A and B can even give you a false positive.  This applies to
> 'tar' as well as the other uses.

If you open Y, this open file descriptor will guarantee that no distinct
file will have the same inode number while all hardlinked files must have
the same inode number. (AFAIK)

Now you will check this against the list of hardlink candidates using the
stored inode number. If the inode number has changed, this will result in
a false negative. If you removed X, recreated it with the same inode number
and linked that to Y, you'll get a false positive (which could be identified
by the [mc]time changes).

Samefile without keeping the files open will result in the same false
positive as open+fstat+stat, while samefile with keeping the files open
will occasionally overflow the files table. Therefore I think it's not
worthwhile introducing samefile as long as the inode is unique for open
files. OTOH you'll want to keep the inode number as stable as possible,
since it's the only sane way to find sets of hardlinked files and some
important programs may depend on it.
-- 
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.

http://david.woodhou.se/why-not-spf.html

RE: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Noveck, Dave

For now, I'm not going to address the controversial issues here,
mainly because I haven't decided how I feel about them yet.

 Whether allowing multiple filehandles per object is a good
 or even reasonably acceptable idea.

 What the fact that RFC3530 talks about implies about what
 clients should do about the issue.

One thing that I hope is not controversial is that the v4.1 spec
should either get rid of this or make it clear and implementable.
I expect plenty of controversy about which of those to choose, but
hope that there isn't any about the proposition that we have to 
choose one of those two.

> SECINFO information is, for instance, given
> out on a per-filehandle basis, does that mean that the server will have
> different security policies?

Well yes, RFC3530 does say "The new SECINFO operation will allow the 
client to determine, on a per filehandle basis", but I think that
just has to be considered as an error rather than indicating that if
you have two different filehandles for the same object, they can have 
different security policies.  SECINFO in RFC3530 takes a directory fh
and a name, so if there are multiple filehandles for the object with
that name, there is no way for SECINFO to associate different policies
with different filehandles.  All it has is the name to go by.  I think
this should be corrected to "on a per-object basis" in the new spec no 
matter what we do on other issues.

I think the principle here has to be that if we do allow multiple 
fh's to map to the same object, we require that they designate the 
same object, and thus it is not allowed for the server to act as if 
you have multiple different objects with different characteristics.

Similarly as to:

> In some places, people haven't even started
> to think about the consequences: 
>
> If GETATTR directed to the two filehandles does not return the
> fileid attribute for both of the handles, then it cannot be
> determined whether the two objects are the same.  Therefore,
> operations which depend on that knowledge (e.g., client side data
> caching) cannot be done reliably.

I think they (and maybe "they" includes me, I haven't checked the history
here) started to think about them, but went in a bad direction.

The implication here that you can have a different set of attributes
supported for the same object based on which filehandle is used to 
access the attributes is totally bogus.

The definition of supp_attr says "The bit vector which would retrieve
all mandatory and recommended attributes that are supported for this 
object.  The scope of this attribute applies to all objects with a
matching fsid."  So having the same object have different attributes
supported based on the filehandle used or even two objects in the same
fs having different attributes supported, in particular having fileid
supported for one and not the other just isn't valid.

> The fact is that RFC3530 contains masses of rope with which
> to allow server and client vendors to hang themselves. 

If that means simply making poor choices, then OK.  But if there are
other cases where you feel that the specification of a feature is simply
incoherent and the consequences not really thought out, then I think
we need to discuss them and not propagate that state of affairs to v4.1.

-Original Message-
From: Trond Myklebust [mailto:[EMAIL PROTECTED] 
Sent: Friday, January 05, 2007 5:29 AM
To: Benny Halevy
Cc: Jan Harkes; Miklos Szeredi; nfsv4@ietf.org;
linux-kernel@vger.kernel.org; Mikulas Patocka;
linux-fsdevel@vger.kernel.org; Jeff Layton; Arjan van de Ven
Subject: Re: [nfsv4] RE: Finding hardlinks

On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote:
> Trond Myklebust wrote:
> > Exactly where do you see us violating the close-to-open cache
> > consistency guarantees?
> > 
> 
> I haven't seen that. What I did see is cache inconsistency when opening
> the same file with different file descriptors when the filehandle changes.
> My testing shows that at least fsync and close fail with EIO when the
> filehandle changed while there was dirty data in the cache and that's good.
> Still, not sharing the cache while the file is opened (even on a different
> file descriptors by the same process) seems impractical.

Tough. I'm not going to commit to adding support for multiple
filehandles. The fact is that RFC3530 contains masses of rope with which
to allow server and client vendors to hang themselves. The fact that the
protocol claims support for servers that use multiple filehandles per
inode does not mean it is necessarily a good idea. It adds unnecessary
code complexity, it screws with server scalability (extra GETATTR calls
just in order to probe existing filehandles), and it is insufficiently
well documented in the RFC: SECINFO information is, for instance, given
out on a per-filehandle

Re: Finding hardlinks

2007-01-05 Thread Frank van Maarseveen

On Fri, Jan 05, 2007 at 09:43:22AM +0100, Miklos Szeredi wrote:
> > > > > High probability is all you have.  Cosmic radiation hitting your
> > > > > computer will more likly cause problems, than colliding 64bit inode
> > > > > numbers ;)
> > > > 
> > > > Some of us have machines designed to cope with cosmic rays, and would be
> > > > unimpressed with a decrease in reliability.
> > > 
> > > With the suggested samefile() interface you'd get a failure with just
> > > about 100% reliability for any application which needs to compare a
> > > more than a few files.  The fact is open files are _very_ expensive,
> > > no wonder they are limited in various ways.
> > > 
> > > What should 'tar' do when it runs out of open files, while searching
> > > for hardlinks?  Should it just give up?  Then the samefile() interface
> > > would be _less_ reliable than the st_ino one by a significant margin.
> > 
> > You need at most two simultenaously open files for examining any
> > number of hardlinks. So yes, you can make it reliable.
> 
> Well, sort of.  Samefile without keeping fds open doesn't have any
> protection against the tree changing underneath between first
> registering a file and later opening it.  The inode number is more
> useful in this respect.  In fact inode number + generation number will
> give you a unique identifier in time as well, which is a _lot_ more
> useful to determine if the file you are checking is actually the same
> as one that you've come across previously.

Samefile with keeping fds open doesn't buy you much anyway. What exactly
would be the value of a directory tree seen by operating only on fds
(even for directories) when some rogue process is renaming, moving,
updating stuff underneath?  One ends up with a tree which misses a lot
of files and hardly bears any resemblance to the actual tree at any
point in time, and I'm not even talking about file data.

It is futile to try to get a consistent tree view on a live filesystem,
with- or without using fds. It just doesn't work without fundamental
support for some kind of "freezing" or time-travel inside the
kernel. Snapshots at the block device level are problematic too.

> 
> So instead of samefile() I'd still suggest an extended attribute
> interface which exports the file's unique (in space and time)
> identifier as an opaque cookie.

But then you're just _shifting_ the problem instead of fixing it:
st_ino/st_mtime (st_ctime?) are designed for this purpose. If the
filesystem doesn't support it properly: live with the consequences
which are mostly minor. Notable exceptions are of course backup tools
but backups _must_ be verified anyway so you'll discover soon.

(btw, that's what I noticed after restoring a system from a CD (iso9660
 with RR): all hardlinks were gone)

-- 
Frank

Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Nicolas Williams

On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote:
> I agree that the way the client implements its cache is out of the protocol
> scope. But how do you interpret "correct behavior" in section 4.2.1?
>  "Clients MUST use filehandle comparisons only to improve performance, not 
> for correct behavior. All clients need to be prepared for situations in which 
> it cannot be determined whether two filehandles denote the same object and in 
> such cases, avoid making invalid assumptions which might cause incorrect 
> behavior."
> Don't you consider data corruption due to cache inconsistency an incorrect 
> behavior?

If a file with multiple hardlinks appears to have multiple distinct
filehandles then a client like Trond's will treat it as multiple
distinct files (with the same hardlink count, and you won't be able to
find the other links to them -- oh well).  Can this cause data
corruption?  Yes, but only if there are applications that rely on the
different file names referencing the same file, and backup apps on the
client won't get the hardlinks right either.

What I don't understand is why getting the fileid is so hard -- always
GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
difficult as it is to maintain a hash table of fileids.

Nico
-- 

Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Trond Myklebust

On Fri, 2007-01-05 at 10:40 -0600, Nicolas Williams wrote:
> What I don't understand is why getting the fileid is so hard -- always
> GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
> difficult as it is to maintain a hash table of fileids.

You've been sleeping in class. We always try to get the fileid together
with the GETFH. The irritating bit is having to redo a GETATTR using the
old filehandle in order to figure out if the 2 filehandles refer to the
same file. Unlike filehandles, fileids can be reused.

Then there is the problem that servers can (and do!) actually lie to
you.

Trond


Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi

> > And does it matter? If you rename a file, tar might skip it regardless of 
> > hardlink detection (if readdir races with rename, you can read none of the 
> > names of the file, one or both --- all these are possible).
> > 
> > If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delete 
> > both "a" and "b" and create totally new files "dir2/c" linked to "dir2/d", 
> > tar might hardlink both "c" and "d" to "a" and "b".
> > 
> > No one guarantees you a sane result of tar or cp -a while changing the tree. 
> > I don't see how is_samefile() could make it worse.
> 
> There are several cases where changing the tree doesn't affect the
> correctness of the tar or cp -a result.  In some of these cases using
> samefile() instead of st_ino _will_ result in a corrupted result.

Also note that using st_ino in combination with samefile() doesn't
make the result much better: it eliminates false positives, but cannot
fix false negatives.

Miklos

Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi

> And does it matter? If you rename a file, tar might skip it regardless of 
> hardlink detection (if readdir races with rename, you can read none of the 
> names of the file, one or both --- all these are possible).
> 
> If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delete 
> both "a" and "b" and create totally new files "dir2/c" linked to "dir2/d", 
> tar might hardlink both "c" and "d" to "a" and "b".
> 
> No one guarantees you a sane result of tar or cp -a while changing the tree. 
> I don't see how is_samefile() could make it worse.

There are several cases where changing the tree doesn't affect the
correctness of the tar or cp -a result.  In some of these cases using
samefile() instead of st_ino _will_ result in a corrupted result.

Generally samefile() is _weaker_ than the st_ino interface in
comparing the identity of two files without using massive amounts of
memory.  You're searching for a better solution, not one that is
broken in a different way, aren't you?

Miklos

Re: Finding hardlinks

2007-01-05 Thread Mikulas Patocka


> > > Well, sort of.  Samefile without keeping fds open doesn't have any
> > > protection against the tree changing underneath between first
> > > registering a file and later opening it.  The inode number is more
> >
> > You only need to keep one-file-per-hardlink-group open during final
> > verification, checking that inode hashing produced reasonable results.
>
> What final verification?  I wasn't just talking about 'tar' but all
> cases where st_ino might be used to check the identity of two files at
> possibly different points in time.
>
> Time A: remember identity of file X
> Time B: check if identity of file Y matches that of file X
>
> With samefile() if you open X at A, and keep it open till B, you can
> accumulate large numbers of open files and the application can fail.
>
> If you don't keep an open file, just remember the path, then renaming
> X will foil the later identity check.  Changing the file at this path
> between A and B can even give you a false positive.  This applies to
> 'tar' as well as the other uses.

And does it matter? If you rename a file, tar might skip it regardless of 
hardlink detection (if readdir races with rename, you can read none of the 
names of the file, one or both --- all these are possible).

If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delete 
both "a" and "b" and create totally new files "dir2/c" linked to "dir2/d", 
tar might hardlink both "c" and "d" to "a" and "b".

No one guarantees you a sane result of tar or cp -a while changing the tree. 
I don't see how is_samefile() could make it worse.

Mikulas

> Miklos



Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi

> > Well, sort of.  Samefile without keeping fds open doesn't have any
> > protection against the tree changing underneath between first
> > registering a file and later opening it.  The inode number is more
> 
> You only need to keep one-file-per-hardlink-group open during final
> verification, checking that inode hashing produced reasonable results.

What final verification?  I wasn't just talking about 'tar' but all
cases where st_ino might be used to check the identity of two files at
possibly different points in time.

Time A: remember identity of file X
Time B: check if identity of file Y matches that of file X

With samefile() if you open X at A, and keep it open till B, you can
accumulate large numbers of open files and the application can fail.

If you don't keep an open file, just remember the path, then renaming
X will foil the later identity check.  Changing the file at this path
between A and B can even give you a false positive.  This applies to
'tar' as well as the other uses.
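
The Time A / Time B check described above can be sketched with the st_ino
interface as follows. This is only an illustration: the names struct file_id,
remember_identity() and matches_identity() are made up for this sketch, and
it assumes the filesystem reports stable st_dev/st_ino values. It does not
protect against an inode number being reused between A and B.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical record of a file's identity, taken at "time A". */
struct file_id {
    dev_t dev;
    ino_t ino;
};

/* Time A: remember the identity of the file at `path` without keeping
 * it open.  Returns 0 on success, -1 if lstat() fails (errno is set). */
int remember_identity(const char *path, struct file_id *id)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return -1;
    id->dev = st.st_dev;
    id->ino = st.st_ino;
    return 0;
}

/* Time B: does `path` currently name the file remembered at time A?
 * Only as trustworthy as the filesystem's st_ino; a reused inode
 * number would produce a false positive. */
bool matches_identity(const char *path, const struct file_id *id)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return false;
    return st.st_dev == id->dev && st.st_ino == id->ino;
}
```

No file descriptor stays open between the two calls, and renaming X does not
defeat the check, because the comparison is on the remembered numbers and not
on the path.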

Miklos

Re: Finding hardlinks

2007-01-05 Thread Pavel Machek

Hi!

> > > > Some of us have machines designed to cope with cosmic rays, and would be
> > > > unimpressed with a decrease in reliability.
> > > 
> > > With the suggested samefile() interface you'd get a failure with just
> > > about 100% reliability for any application which needs to compare
> > > more than a few files.  The fact is open files are _very_ expensive,
> > > no wonder they are limited in various ways.
> > > 
> > > What should 'tar' do when it runs out of open files, while searching
> > > for hardlinks?  Should it just give up?  Then the samefile() interface
> > > would be _less_ reliable than the st_ino one by a significant margin.
> > 
> > You need at most two simultaneously open files for examining any
> > number of hardlinks. So yes, you can make it reliable.
> 
> Well, sort of.  Samefile without keeping fds open doesn't have any
> protection against the tree changing underneath between first
> registering a file and later opening it.  The inode number is more

You only need to keep one-file-per-hardlink-group open during final
verification, checking that inode hashing produced reasonable results.
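
The inode hashing step can be sketched roughly like this. It is not taken
from tar or cp; TABLE_SIZE, struct link_table and note_file() are
hypothetical names, the table has a fixed capacity, and the final
verification pass is not shown.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

#define TABLE_SIZE 1024  /* hypothetical capacity; a real tool would grow */

struct link_entry {
    dev_t dev;
    ino_t ino;
    char *first_path;    /* NULL marks an empty slot */
};

struct link_table {      /* must start zero-initialized */
    struct link_entry slots[TABLE_SIZE];
    size_t used;
};

/* Record `path`.  Returns the first path seen with the same
 * (st_dev, st_ino), or NULL if this is the first sighting, the file has
 * only one link, it is a directory, or an error occurred. */
const char *note_file(struct link_table *t, const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0 || S_ISDIR(st.st_mode) || st.st_nlink < 2)
        return NULL;

    size_t i = ((size_t)st.st_ino ^ (size_t)st.st_dev) % TABLE_SIZE;
    for (size_t probes = 0; probes < TABLE_SIZE; probes++) {
        struct link_entry *e = &t->slots[i];

        if (e->first_path == NULL) {        /* empty slot: first sighting */
            size_t len = strlen(path) + 1;
            e->first_path = malloc(len);
            if (e->first_path == NULL)
                return NULL;
            memcpy(e->first_path, path, len);
            e->dev = st.st_dev;
            e->ino = st.st_ino;
            t->used++;
            return NULL;
        }
        if (e->dev == st.st_dev && e->ino == st.st_ino)
            return e->first_path;           /* candidate hardlink */
        i = (i + 1) % TABLE_SIZE;           /* linear probing */
    }
    return NULL;                            /* table full: treat as unseen */
}
```

Only files with st_nlink > 1 are stored, so memory use scales with the
number of hardlinked files and not with the size of the tree.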

Pavel
-- 
Thanks for all the (sleeping) penguins.

Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Trond Myklebust

On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote:
> Trond Myklebust wrote:
> > Exactly where do you see us violating the close-to-open cache
> > consistency guarantees?
> > 
> 
> I haven't seen that. What I did see is cache inconsistency when opening
> the same file with different file descriptors when the filehandle changes.
> My testing shows that at least fsync and close fail with EIO when the
> filehandle changed while there was dirty data in the cache and that's good.
> Still, not sharing the cache while the file is opened (even on different
> file descriptors by the same process) seems impractical.

Tough. I'm not going to commit to adding support for multiple
filehandles. The fact is that RFC3530 contains masses of rope with which
to allow server and client vendors to hang themselves. The fact that the
protocol claims support for servers that use multiple filehandles per
inode does not mean it is necessarily a good idea. It adds unnecessary
code complexity, it screws with server scalability (extra GETATTR calls
just in order to probe existing filehandles), and it is insufficiently
well documented in the RFC: SECINFO information is, for instance, given
out on a per-filehandle basis, does that mean that the server will have
different security policies? In some places, people haven't even started
to think about the consequences:

  If GETATTR directed to the two filehandles does not return the
  fileid attribute for both of the handles, then it cannot be
  determined whether the two objects are the same.  Therefore,
  operations which depend on that knowledge (e.g., client side data
  caching) cannot be done reliably.

This implies the combination is legal, but offers no indication as to
how you would match OPEN/CLOSE requests via different paths. AFAICS you
would have to do non-cached I/O with no share modes (i.e. NFSv3-style
"special" stateids). There is no way in hell we will ever support
non-cached I/O in NFS other than the special case of O_DIRECT.

...and no, I'm certainly not interested in "fixing" the RFC on this
point in any way other than getting this crap dropped from the spec. I
see no use for it at all.

Trond


Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi

> > > > High probability is all you have.  Cosmic radiation hitting your
> > > > computer will more likely cause problems than colliding 64bit inode
> > > > numbers ;)
> > > 
> > > Some of us have machines designed to cope with cosmic rays, and would be
> > > unimpressed with a decrease in reliability.
> > 
> > With the suggested samefile() interface you'd get a failure with just
> > about 100% reliability for any application which needs to compare
> > more than a few files.  The fact is open files are _very_ expensive,
> > no wonder they are limited in various ways.
> > 
> > What should 'tar' do when it runs out of open files, while searching
> > for hardlinks?  Should it just give up?  Then the samefile() interface
> > would be _less_ reliable than the st_ino one by a significant margin.
> 
> You need at most two simultaneously open files for examining any
> number of hardlinks. So yes, you can make it reliable.

Well, sort of.  Samefile without keeping fds open doesn't have any
protection against the tree changing underneath between first
registering a file and later opening it.  The inode number is more
useful in this respect.  In fact inode number + generation number will
give you a unique identifier in time as well, which is a _lot_ more
useful to determine if the file you are checking is actually the same
as one that you've come across previously.

So instead of samefile() I'd still suggest an extended attribute
interface which exports the file's unique (in space and time)
identifier as an opaque cookie.

For filesystems like FAT you can basically only guarantee that two
files are the same as long as those files are in the icache, no matter
if you use samefile() or inode numbers.  Userspace _can_ make the
inodes stay in the cache by keeping the files open, which works for
samefile as well as checking by inode number.
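
A minimal user-space sketch of that approach: both files are held open while
their inode numbers are compared, so the inodes stay in the cache for the
duration of the check. samefile_by_fd() is a hypothetical helper, not an
existing kernel or libc interface, and it is meant for regular files only.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 1 if both paths name the same file, 0 if they differ, and -1
 * on error.  Both descriptors stay open across the two fstat() calls,
 * which pins the inodes while the numbers are compared. */
int samefile_by_fd(const char *path1, const char *path2)
{
    struct stat s1, s2;
    int result = -1;

    int fd1 = open(path1, O_RDONLY);
    if (fd1 < 0)
        return -1;

    int fd2 = open(path2, O_RDONLY);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    if (fstat(fd1, &s1) == 0 && fstat(fd2, &s2) == 0)
        result = (s1.st_dev == s2.st_dev && s1.st_ino == s2.st_ino);

    close(fd2);
    close(fd1);
    return result;
}
```

This needs only two open descriptors per comparison, but as noted earlier in
the thread it gives no protection if the tree changes between first seeing a
path and opening it here.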

Miklos

Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Benny Halevy

Trond Myklebust wrote:
> On Thu, 2007-01-04 at 12:04 +0200, Benny Halevy wrote:
>> I agree that the way the client implements its cache is out of the protocol
>> scope. But how do you interpret "correct behavior" in section 4.2.1?
>>  "Clients MUST use filehandle comparisons only to improve performance, not 
>> for correct behavior. All clients need to be prepared for situations in 
>> which it cannot be determined whether two filehandles denote the same object 
>> and in such cases, avoid making invalid assumptions which might cause 
>> incorrect behavior."
>> Don't you consider data corruption due to cache inconsistency an incorrect 
>> behavior?
> 
> Exactly where do you see us violating the close-to-open cache
> consistency guarantees?
> 

I haven't seen that. What I did see is cache inconsistency when opening
the same file with different file descriptors when the filehandle changes.
My testing shows that at least fsync and close fail with EIO when the
filehandle changed while there was dirty data in the cache and that's good.
Still, not sharing the cache while the file is opened (even on different
file descriptors by the same process) seems impractical.

Benny

Re: Finding hardlinks

2007-01-05 Thread Frank van Maarseveen

On Fri, Jan 05, 2007 at 09:43:22AM +0100, Miklos Szeredi wrote:
 High probability is all you have.  Cosmic radiation hitting your
 computer will more likly cause problems, than colliding 64bit inode
 numbers ;)

Some of us have machines designed to cope with cosmic rays, and would be
unimpressed with a decrease in reliability.
   
   With the suggested samefile() interface you'd get a failure with just
   about 100% reliability for any application which needs to compare a
   more than a few files.  The fact is open files are _very_ expensive,
   no wonder they are limited in various ways.
   
   What should 'tar' do when it runs out of open files, while searching
   for hardlinks?  Should it just give up?  Then the samefile() interface
   would be _less_ reliable than the st_ino one by a significant margin.
  
  You need at most two simultenaously open files for examining any
  number of hardlinks. So yes, you can make it reliable.
 
 Well, sort of.  Samefile without keeping fds open doesn't have any
 protection against the tree changing underneath between first
 registering a file and later opening it.  The inode number is more
 useful in this respect.  In fact inode number + generation number will
 give you a unique identifier in time as well, which is a _lot_ more
 useful to determine if the file you are checking is actually the same
 as one that you've come across previously.

Samefile with keeping fds open doesn't buy you much anyway. What exactly
would be the value of a directory tree seen by operating only on fds
(even for directories) when some rogue process is renaming, moving,
updating stuff underneath?  One ends up with a tree which misses a lot
of files and hardly bears any resemblance to the actual tree at any
point in time and I'm not even talking about filedata.

It is futile to try to get a consistent tree view on a live filesystem,
with or without using fds. It just doesn't work without fundamental
support for some kind of freezing or time-travel inside the
kernel. Snapshots at the block device level are problematic too.

 
> So instead of samefile() I'd still suggest an extended attribute
> interface which exports the file's unique (in space and time)
> identifier as an opaque cookie.

But then you're just _shifting_ the problem instead of fixing it:
st_ino/st_mtime (st_ctime?) are designed for this purpose. If the
filesystem doesn't support it properly: live with the consequences
which are mostly minor. Notable exceptions are of course backup tools
but backups _must_ be verified anyway so you'll discover it soon enough.

(btw, that's what I noticed after restoring a system from a CD (iso9660
 with RR): all hardlinks were gone)
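
For readers following along, here is a minimal sketch of the st_ino-based
bookkeeping that the tools under discussion depend on: remember the
(st_dev, st_ino) pair of every file with more than one link and report a
hit when the pair shows up again. The names record_link(),
is_known_hardlink() and MAX_LINKS are made up for illustration, and the
fixed-size linear table is an assumption chosen for brevity, not how tar
or cp actually store this.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>

#define MAX_LINKS 1024          /* illustrative fixed capacity */

struct file_id {
    dev_t dev;
    ino_t ino;
};

static struct file_id seen[MAX_LINKS];
static size_t nseen;

/*
 * Returns 1 if (dev, ino) was recorded earlier, 0 if it has just been
 * recorded for the first time, and -1 if the table is full.
 */
int record_link(dev_t dev, ino_t ino)
{
    for (size_t i = 0; i < nseen; i++)
        if (seen[i].dev == dev && seen[i].ino == ino)
            return 1;
    if (nseen == MAX_LINKS)
        return -1;
    seen[nseen].dev = dev;
    seen[nseen].ino = ino;
    nseen++;
    return 0;
}

/*
 * Only files with a link count above one can have another name, so
 * everything else is skipped without touching the table.
 * May return -1 when the table is full.
 */
int is_known_hardlink(const struct stat *st)
{
    assert(st != NULL);
    if (st->st_nlink < 2)
        return 0;
    return record_link(st->st_dev, st->st_ino);
}
```

A filesystem that hands out colliding st_ino values makes record_link()
return 1 for unrelated files, and one that reports different numbers for
two names of the same file makes it return 0; those are the two failure
modes this thread keeps coming back to.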

-- 
Frank

RE: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Noveck, Dave

For now, I'm not going to address the controversial issues here,
mainly because I haven't decided how I feel about them yet.

 Whether allowing multiple filehandles per object is a good
 or even reasonably acceptable idea.

 What the fact that RFC3530 talks about implies about what
 clients should do about the issue.

One thing that I hope is not controversial is that the v4.1 spec
should either get rid of this or make it clear and implementable.
I expect plenty of controversy about which of those to choose, but
hope that there isn't any about the proposition that we have to 
choose one of those two.

> SECINFO information is, for instance, given
> out on a per-filehandle basis, does that mean that the server will have
> different security policies?

Well yes, RFC3530 does say "The new SECINFO operation will allow the
client to determine, on a per filehandle basis", but I think that
just has to be considered as an error rather than indicating that if
you have two different filehandles for the same object, they can have
different security policies.  SECINFO in RFC3530 takes a directory fh
and a name, so if there are multiple filehandles for the object with
that name, there is no way for SECINFO to associate different policies
with different filehandles.  All it has is the name to go by.  I think
this should be corrected to "on a per-object basis" in the new spec no
matter what we do on other issues.

I think the principle here has to be that if we do allow multiple
fh's to map to the same object, we require that they designate the
same object, and thus it is not allowed for the server to act as if
you have multiple different objects with different characteristics.

Similarly as to:

> In some places, people haven't even started
> to think about the consequences:
>
>     If GETATTR directed to the two filehandles does not return the
>     fileid attribute for both of the handles, then it cannot be
>     determined whether the two objects are the same.  Therefore,
>     operations which depend on that knowledge (e.g., client side data
>     caching) cannot be done reliably.

I think they (and maybe "they" includes me, I haven't checked the history
here) started to think about them, but went in a bad direction.

The implication here that you can have a different set of attributes
supported for the same object based on which filehandle is used to 
access the attributes is totally bogus.

The definition of supp_attr says "The bit vector which would retrieve
all mandatory and recommended attributes that are supported for this
object.  The scope of this attribute applies to all objects with a
matching fsid."  So having the same object have different attributes
supported based on the filehandle used or even two objects in the same
fs having different attributes supported, in particular having fileid
supported for one and not the other just isn't valid.

> The fact is that RFC3530 contains masses of rope with which
> to allow server and client vendors to hang themselves.

If that means simply making poor choices, then OK.  But if there are
other cases where you feel that the specification of a feature is simply
incoherent and the consequences not really thought out, then I think
we need to discuss them and not propagate that state of affairs to v4.1.

-----Original Message-----
From: Trond Myklebust [mailto:[EMAIL PROTECTED] 
Sent: Friday, January 05, 2007 5:29 AM
To: Benny Halevy
Cc: Jan Harkes; Miklos Szeredi; nfsv4@ietf.org;
linux-kernel@vger.kernel.org; Mikulas Patocka;
linux-fsdevel@vger.kernel.org; Jeff Layton; Arjan van de Ven
Subject: Re: [nfsv4] RE: Finding hardlinks


On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote:
> Trond Myklebust wrote:
> > Exactly where do you see us violating the close-to-open cache
> > consistency guarantees?
>
> I haven't seen that. What I did see is cache inconsistency when opening
> the same file with different file descriptors when the filehandle changes.
> My testing shows that at least fsync and close fail with EIO when the
> filehandle changed while there was dirty data in the cache and that's good.
> Still, not sharing the cache while the file is opened (even on a different
> file descriptors by the same process) seems impractical.

Tough. I'm not going to commit to adding support for multiple
filehandles. The fact is that RFC3530 contains masses of rope with which
to allow server and client vendors to hang themselves. The fact that the
protocol claims support for servers that use multiple filehandles per
inode does not mean it is necessarily a good idea. It adds unnecessary
code complexity, it screws with server scalability (extra GETATTR calls
just in order to probe existing filehandles), and it is insufficiently
well documented in the RFC: SECINFO information is, for instance, given
out on a per-filehandle basis, does that mean that the server will have
different security policies? In some places, people haven't even started
to think about the consequences

Re: Finding hardlinks

2007-01-05 Thread Bodo Eggert

Miklos Szeredi [EMAIL PROTECTED] wrote:

> > > Well, sort of.  Samefile without keeping fds open doesn't have any
> > > protection against the tree changing underneath between first
> > > registering a file and later opening it.  The inode number is more
> >
> > You only need to keep one-file-per-hardlink-group open during final
> > verification, checking that inode hashing produced reasonable results.
>
> What final verification?  I wasn't just talking about 'tar' but all
> cases where st_ino might be used to check the identity of two files at
> possibly different points in time.
>
> Time A: remember identity of file X
> Time B: check if identity of file Y matches that of file X
>
> With samefile() if you open X at A, and keep it open till B, you can
> accumulate large numbers of open files and the application can fail.
>
> If you don't keep an open file, just remember the path, then renaming
> X will foil the later identity check.  Changing the file at this path
> between A and B can even give you a false positive.  This applies to
> 'tar' as well as the other uses.

If you open Y, this open file descriptor will guarantee that no distinct
file will have the same inode number while all hardlinked files must have
the same inode number. (AFAIK)

Now you will check this against the list of hardlink candidates using the
stored inode number. If the inode number has changed, this will result in
a false negative. If you removed X, recreated it with the same inode number
and linked that to Y, you'll get a false positive (which could be identified
by the [mc]time changes).

Samefile without keeping the files open will result in the same false
positive as open+fstat+stat, while samefile with keeping the files open
will occasionally overflow the files table. Therefore I think it's not
worthwhile introducing samefile as long as the inode is unique for open
files. OTOH you'll want to keep the inode number as stable as possible,
since it's the only sane way to find sets of hardlinked files and some
important programs may depend on it.
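
The "remember at time A, check at time B" scheme, including the
[mc]time hint for catching a recycled inode number, might be sketched
roughly as below. The names remembered_id, remember_id() and check_id()
are hypothetical, and treating any st_ctime change as "suspect" is an
assumption made for illustration; a real tool would have to decide what
to do with that result.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* What gets stored at time A instead of an open file descriptor. */
struct remembered_id {
    dev_t  dev;
    ino_t  ino;
    time_t change_time;     /* st_ctime as seen at time A */
};

enum id_result {
    ID_DIFFERENT = 0,       /* device or inode number differs */
    ID_SAME      = 1,       /* same numbers, inode unchanged since A */
    ID_SUSPECT   = 2        /* same numbers, but the inode changed */
};

void remember_id(struct remembered_id *r, const struct stat *st)
{
    assert(r != NULL && st != NULL);
    r->dev = st->st_dev;
    r->ino = st->st_ino;
    r->change_time = st->st_ctime;
}

/* Time B: compare a fresh fstat() result against what was remembered. */
enum id_result check_id(const struct remembered_id *r,
                        const struct stat *now)
{
    assert(r != NULL && now != NULL);
    if (r->dev != now->st_dev || r->ino != now->st_ino)
        return ID_DIFFERENT;
    if (r->change_time != now->st_ctime)
        return ID_SUSPECT;  /* possibly a reused inode number */
    return ID_SAME;
}
```

Note that ID_SUSPECT also fires for harmless changes such as a chmod
between A and B, so it can only flag candidates for a closer look, not
prove that the inode number was reused.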
-- 
I thank GMX for sabotaging the use of my addresses by means of lies
spread via SPF.

http://david.woodhou.se/why-not-spf.html

RE: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Halevy, Benny

> From: [EMAIL PROTECTED] on behalf of Nicolas Williams
> Sent: Fri 1/5/2007 18:40
> To: Halevy, Benny
> Cc: Trond Myklebust; Jan Harkes; Miklos Szeredi; nfsv4@ietf.org;
> linux-kernel@vger.kernel.org; Mikulas Patocka; linux-fsdevel@vger.kernel.org;
> Jeff Layton; Arjan van de Ven
> Subject: Re: [nfsv4] RE: Finding hardlinks
>
> On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote:
> > I agree that the way the client implements its cache is out of the protocol
> > scope. But how do you interpret "correct behavior" in section 4.2.1?
> >  "Clients MUST use filehandle comparisons only to improve performance, not
> > for correct behavior. All clients need to be prepared for situations in
> > which it cannot be determined whether two filehandles denote the same
> > object and in such cases, avoid making invalid assumptions which might
> > cause incorrect behavior."
> > Don't you consider data corruption due to cache inconsistency an incorrect
> > behavior?
>
> If a file with multiple hardlinks appears to have multiple distinct
> filehandles then a client like Trond's will treat it as multiple
> distinct files (with the same hardlink count, and you won't be able to
> find the other links to them -- oh well).  Can this cause data
> corruption?  Yes, but only if there are applications that rely on the
> different file names referencing the same file, and backup apps on the
> client won't get the hardlinks right either.

Well, this is why the hard links were made, no?
FWIW, I believe that rename of an open file might also produce this problem.

> What I don't understand is why getting the fileid is so hard -- always
> GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
> difficult as it is to maintain a hash table of fileids.

The problem with NFS is that fileid isn't enough because the client doesn't
know about removes by other clients until it uses the stale filehandle.
Also, quite a few file systems are not keeping fileids unique (this triggered
this thread).

> Nico
> --


Re: Finding hardlinks

2007-01-04 Thread Pavel Machek

Hi!

> > > High probability is all you have.  Cosmic radiation hitting your
> > > computer will more likely cause problems, than colliding 64bit inode
> > > numbers ;)
> > 
> > Some of us have machines designed to cope with cosmic rays, and would be
> > unimpressed with a decrease in reliability.
> 
> With the suggested samefile() interface you'd get a failure with just
> about 100% reliability for any application which needs to compare
> more than a few files.  The fact is open files are _very_ expensive,
> no wonder they are limited in various ways.
> 
> What should 'tar' do when it runs out of open files, while searching
> for hardlinks?  Should it just give up?  Then the samefile() interface
> would be _less_ reliable than the st_ino one by a significant margin.

You need at most two simultaneously open files for examining any
number of hardlinks. So yes, you can make it reliable.
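
A rough sketch of the "two open files at a time" pattern follows. The
proposed samefile() does not exist as a system call, so the comparison
here falls back on fstat() of the two descriptors as a stand-in; the
names same_open_file() and same_path() are invented for this example.
The point is only that each comparison opens two files and closes them
again, so the descriptor count stays at two however many candidates are
examined.

```c
#include <assert.h>
#include <stddef.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if both descriptors refer to the same file, 0 if not, -1 on error. */
int same_open_file(int fd1, int fd2)
{
    struct stat a, b;

    if (fstat(fd1, &a) < 0 || fstat(fd2, &b) < 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}

/*
 * Compare two paths while holding at most two descriptors, both of
 * which are released before returning.
 */
int same_path(const char *p1, const char *p2)
{
    assert(p1 != NULL && p2 != NULL);

    int fd1 = open(p1, O_RDONLY);
    if (fd1 < 0)
        return -1;

    int fd2 = open(p2, O_RDONLY);
    if (fd2 < 0) {
        close(fd1);
        return -1;
    }

    int result = same_open_file(fd1, fd2);
    close(fd2);
    close(fd1);
    return result;
}
```

Since nothing stays open between comparisons, this has exactly the
window Miklos objects to: the tree can change between registering a
path and opening it again later.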
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Finding hardlinks

2007-01-04 Thread Nikita Danilov

Mikulas Patocka writes:
 > > > BTW. How does ReiserFS find that a given inode number (or object ID in
 > > > ReiserFS terminology) is free before assigning it to new file/directory?
 > >
 > > reiserfs v3 has an extent map of free object identifiers in
 > > super-block.
 > 
 > Inode free space can have at most 2^31 extents --- if inode numbers 
 > alternate between "allocated", "free". How do you pack it to superblock?

In the worst case, when free/used extents are small, some free oids are
"leaked", but this has never been a problem in practice. In fact, there
was a patch for reiserfs v3 to store this map in a special hidden file but
it wasn't included in mainline, as nobody ever complained about oid map
fragmentation.

 > 
 > > reiser4 used 64 bit object identifiers without reuse.
 > 
 > So you are going to hit the same problem as I did with SpadFS --- you 
 > can't export 64-bit inode number to userspace (programs without 
 > -D_FILE_OFFSET_BITS=64 will have stat() randomly failing with EOVERFLOW 
 > then) and if you export only 32-bit number, it will eventually wrap-around 
 > and colliding st_ino will cause data corruption with many userspace 
 > programs.

Indeed, this is a fundamental problem. Reiser4 tries to ameliorate it by
using a hash function that starts colliding only when there are billions
of files, in which case a 32bit inode number is screwed anyway.
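
As an illustration of how a 64-to-32-bit mapping can stay collision-free
until the identifier space itself passes 2^32, here is a simple XOR
fold. This is NOT claimed to be the function reiser4 uses; fold_oid() is
a hypothetical name and the fold is just one well-known way to get that
property.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Fold a 64-bit object id into 32 bits by XORing the high half into
 * the low half.  Ids below 2^32 map to themselves, so no two of them
 * can collide; collisions become possible only once ids above 2^32
 * are in use.
 */
uint32_t fold_oid(uint64_t oid)
{
    return (uint32_t)(oid ^ (oid >> 32));
}
```

For example, fold_oid(0x100000005) and fold_oid(4) both yield 4, which
is the kind of st_ino collision userspace would see once more than
2^32 ids have been handed out without reuse.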

Note that none of the above problems invalidates the reasons for having
long in-kernel inode identifiers that I outlined in another message.

 > 
 > Mikulas

Nikita.


Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Trond Myklebust

On Thu, 2007-01-04 at 12:04 +0200, Benny Halevy wrote:
> I agree that the way the client implements its cache is out of the protocol
> scope. But how do you interpret "correct behavior" in section 4.2.1?
>  "Clients MUST use filehandle comparisons only to improve performance, not 
> for correct behavior. All clients need to be prepared for situations in which 
> it cannot be determined whether two filehandles denote the same object and in 
> such cases, avoid making invalid assumptions which might cause incorrect 
> behavior."
> Don't you consider data corruption due to cache inconsistency an incorrect 
> behavior?

Exactly where do you see us violating the close-to-open cache
consistency guarantees?

Trond


Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Benny Halevy


Trond Myklebust wrote:
> On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
>> I sincerely expect you or anybody else for this matter to try to provide
>> feedback and object to the protocol specification in case they disagree
>> with it (or think it's ambiguous or self contradicting) rather than ignoring
>> it and implementing something else. I think we're shooting ourselves in the
>> foot when doing so and it is in our common interest to strive to reach a
>> realistic standard we can all comply with and interoperate with each other.
> 
> You are reading the protocol wrong in this case.

Obviously we interpret it differently and that by itself calls for considering
clarification of the text :)

> 
> While the protocol does allow the server to implement the behaviour that
> you've been advocating, it in no way mandates it. Nor does it mandate
> that the client should gather files with the same (fsid,fileid) and
> cache them together. Those are issues to do with _implementation_, and
> are thus beyond the scope of the IETF.
> 
> In our case, the client will ignore the unique_handles attribute. It
> will use filehandles as our inode cache identifier. It will not jump
> through hoops to provide caching semantics that go beyond close-to-open
> for servers that set unique_handles to "false".

I agree that the way the client implements its cache is out of the protocol
scope. But how do you interpret "correct behavior" in section 4.2.1?
 "Clients MUST use filehandle comparisons only to improve performance, not for 
correct behavior. All clients need to be prepared for situations in which it 
cannot be determined whether two filehandles denote the same object and in such 
cases, avoid making invalid assumptions which might cause incorrect behavior."
Don't you consider data corruption due to cache inconsistency an incorrect 
behavior?

Benny

Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Trond Myklebust

On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
> I sincerely expect you or anybody else for this matter to try to provide
> feedback and object to the protocol specification in case they disagree
> with it (or think it's ambiguous or self contradicting) rather than ignoring
> it and implementing something else. I think we're shooting ourselves in the
> foot when doing so and it is in our common interest to strive to reach a
> realistic standard we can all comply with and interoperate with each other.

You are reading the protocol wrong in this case.

While the protocol does allow the server to implement the behaviour that
you've been advocating, it in no way mandates it. Nor does it mandate
that the client should gather files with the same (fsid,fileid) and
cache them together. Those are issues to do with _implementation_, and
are thus beyond the scope of the IETF.

In our case, the client will ignore the unique_handles attribute. It
will use filehandles as our inode cache identifier. It will not jump
through hoops to provide caching semantics that go beyond close-to-open
for servers that set unique_handles to "false".
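
To make the two cache-keying choices concrete, here is a small sketch
contrasting identity by filehandle bytes with identity by
(fsid, fileid). The struct and function names are invented for this
example and are not taken from the Linux NFS client; the 128-byte limit
and the major/minor fsid layout follow the NFSv4 definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FH_MAX 128              /* NFSv4 filehandles are opaque, at most 128 bytes */

struct fh {
    uint32_t      len;
    unsigned char data[FH_MAX];
};

struct object_id {
    uint64_t fsid_major;
    uint64_t fsid_minor;
    uint64_t fileid;
};

/* Identity as a filehandle-keyed cache sees it: byte-for-byte equality. */
int fh_equal(const struct fh *a, const struct fh *b)
{
    assert(a->len <= FH_MAX && b->len <= FH_MAX);
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}

/* Identity as an (fsid, fileid)-keyed cache would see it. */
int object_id_equal(const struct object_id *a, const struct object_id *b)
{
    return a->fsid_major == b->fsid_major &&
           a->fsid_minor == b->fsid_minor &&
           a->fileid == b->fileid;
}
```

When a server hands out two different filehandles for one object,
fh_equal() says "different" while object_id_equal() says "same"; a
client keyed on filehandles then keeps two separate caches for that
object.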

Trond


Re: [nfsv4] RE: Finding hardlinks

2007-01-03 Thread Trond Myklebust

On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
> Believe it or not, but server companies like Panasas try to follow the
> standard when designing and implementing their products while relying on
> client vendors to do the same.

I personally have never given a rat's arse about "standards" if they make
no sense to me. If the server is capable of knowing about hard links,
then why does it need all this extra crap in the filehandle that just
obfuscates the hard link info?

The bottom line is that nothing in our implementation will result in
such a server performing sub-optimally w.r.t. the client. The only
result is that we will conform to close-to-open semantics instead of
strict POSIX caching semantics when two processes have opened the same
file via different hard links.

> I sincerely expect you or anybody else for this matter to try to provide
> feedback and object to the protocol specification in case they disagree
> with it (or think it's ambiguous or self contradicting) rather than ignoring
> it and implementing something else. I think we're shooting ourselves in the
> foot when doing so and it is in our common interest to strive to reach a
> realistic standard we can all comply with and interoperate with each other.

This has nothing to do with the protocol itself: it has only to do with
caching semantics. As far as caching goes, the only guarantees that NFS
clients give are the close-to-open semantics, and this should indeed be
respected by the implementation in question.

Trond


Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen

On Thu, Jan 04, 2007 at 12:43:20AM +0100, Mikulas Patocka wrote:
> On Wed, 3 Jan 2007, Frank van Maarseveen wrote:
> >Currently, large file support is already necessary to handle dvd and
> >video. It's also useful for images for virtualization. So the failing
> >stat() calls should already be a thing of the past with modern
> >distributions.
> 
> As long as glibc compiles by default with 32-bit ino_t, the problem exists 
> and is severe --- programs handling large files, such as coreutils, tar, 
> mc, mplayer, already compile with 64-bit ino_t and off_t, but the user (or 
> script) may type something like:
> 
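> cat >file.c <<EOF
> #include <stdio.h>
> #include <stdlib.h>
> #include <fcntl.h>
> #include <unistd.h>
> #include <sys/stat.h>
> int main(void)
> {
>   int h;
>   struct stat st;
>   /* create an empty file, then stat it through the open descriptor */
>   if ((h = creat("foo", 0600)) < 0) perror("creat"), exit(1);
>   if (fstat(h, &st)) perror("stat"), exit(1);
>   close(h);
>   return 0;
> }
> EOF
> gcc file.c; ./a.out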
> 
> --- and you certainly do not want this to fail (unless you are out of disk 
> space).
> 
> The difference is, that with 32-bit program and 64-bit off_t, you get 
> deterministic failure on large files, with 32-bit program and 64-bit 
> ino_t, you get random failures.

What's (technically) the problem with changing the gcc default?

Alternatively we could make the error deterministic in various ways. Start
st_ino numbering from 4G (except for a few special ones maybe such
as root/mounts). Or make old and new programs look differently at the
ELF level or by sys_personality() and/or check against a "ino64" mount
flag/filesystem feature. Lots of possibilities.

-- 
Frank

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka


On Wed, 3 Jan 2007, Frank van Maarseveen wrote:

> On Wed, Jan 03, 2007 at 01:09:41PM -0800, Bryan Henderson wrote:
> > > On any decent filesystem st_ino should uniquely identify an object and
> > > reliably provide hardlink information. The UNIX world has relied upon
> > > this for decades. A filesystem with st_ino collisions without being
> > > hardlinked (or the other way around) needs a fix.
> >
> > But for at least the last of those decades, filesystems that could not do
> > that were not uncommon.  They had to present 32 bit inode numbers and
> > either allowed more than 4G files or just didn't have the means of
> > assigning inode numbers with the proper uniqueness to files.  And the sky
> > did not fall.  I don't have an explanation why,
>
> I think it's mostly high end use and high end users tend to understand
> more. But we're going to see more really large filesystems in "normal"
> use so..
>
> Currently, large file support is already necessary to handle dvd and
> video. It's also useful for images for virtualization. So the failing stat()
> calls should already be a thing of the past with modern distributions.


As long as glibc compiles by default with 32-bit ino_t, the problem exists 
and is severe --- programs handling large files, such as coreutils, tar, 
mc, mplayer, already compile with 64-bit ino_t and off_t, but the user (or 
script) may type something like:


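cat >file.c <<EOF
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>
int main(void)
{
	int h;
	struct stat st;
	/* create an empty file, then stat it through the open descriptor */
	if ((h = creat("foo", 0600)) < 0) perror("creat"), exit(1);
	if (fstat(h, &st)) perror("stat"), exit(1);
	close(h);
	return 0;
}
EOF
gcc file.c; ./a.out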

--- and you certainly do not want this to fail (unless you are out of disk 
space).


The difference is that with a 32-bit program and 64-bit off_t you get a
deterministic failure on large files, whereas with a 32-bit program and
64-bit ino_t you get random failures.
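
A minimal sketch of how a caller could at least tell this failure apart
from other stat() errors, assuming a Linux/glibc-style system. The helper
name stat_or_report is hypothetical and not from this thread; it does not
make a 32-bit ino_t build succeed, it only shows where EOVERFLOW surfaces.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: returns 0 on success, 1 if the result did not fit
 * into this build's struct stat (EOVERFLOW), -1 on any other error. */
static int stat_or_report(const char *path, struct stat *st)
{
    if (stat(path, st) == 0)
        return 0;
    if (errno == EOVERFLOW) {
        /* Typical cause: built without -D_FILE_OFFSET_BITS=64. */
        fprintf(stderr, "%s: value too large for this build's struct stat\n",
                path);
        return 1;
    }
    fprintf(stderr, "%s: %s\n", path, strerror(errno));
    return -1;
}
```

A tool that gets 1 back can print a hint to rebuild with
-D_FILE_OFFSET_BITS=64 instead of reporting a generic I/O error.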


Mikulas

Re: Finding hardlinks

2007-01-03 Thread Pavel Machek

Hi!

> >Sure it is. Numerous popular POSIX filesystems do that. There is a lot of
> >inode number space in 64 bit (of course it is a matter of time for it to
> >jump to 128 bit and more)
> 
> If the filesystem was designed by someone not from Unix world (FAT, SMB, 
> ...), then not. And users still want to access these filesystems.
> 
> 64-bit inode numbers space is not yet implemented on Linux --- the problem 
> is that if you return ino >= 2^32, programs compiled without 
> -D_FILE_OFFSET_BITS=64 will fail with stat() returning -EOVERFLOW --- this 
> failure is specified in POSIX, but not very useful.

Hehe, can we simply -EOVERFLOW on VFAT all the time? ...probably not
useful :-(. But the ability to say "unknown" in the st_ino field would
help.

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen

On Wed, Jan 03, 2007 at 01:09:41PM -0800, Bryan Henderson wrote:
> >On any decent filesystem st_ino should uniquely identify an object and
> >reliably provide hardlink information. The UNIX world has relied upon
> >this for decades. A filesystem with st_ino collisions without being
> >hardlinked (or the other way around) needs a fix.
> 
> But for at least the last of those decades, filesystems that could not do 
> that were not uncommon.  They had to present 32 bit inode numbers and 
> either allowed more than 4G files or just didn't have the means of 
> assigning inode numbers with the proper uniqueness to files.  And the sky 
> did not fall.  I don't have an explanation why,

I think it's mostly high end use and high end users tend to understand
more. But we're going to see more really large filesystems in "normal"
use so..

Currently, large file support is already necessary to handle dvd and
video. It's also useful for images for virtualization. So the failing stat()
calls should already be a thing of the past with modern distributions.

-- 
Frank

Re: Finding hardlinks

2007-01-03 Thread Bryan Henderson

>On any decent filesystem st_ino should uniquely identify an object and
>reliably provide hardlink information. The UNIX world has relied upon
>this for decades. A filesystem with st_ino collisions without being
>hardlinked (or the other way around) needs a fix.

But for at least the last of those decades, filesystems that could not do 
that were not uncommon.  They had to present 32 bit inode numbers and 
either allowed more than 4G files or just didn't have the means of 
assigning inode numbers with the proper uniqueness to files.  And the sky 
did not fall.  I don't have an explanation why, but it makes it look to me 
like there are worse things than not having total one-one correspondence 
between inode numbers and files.  Having a stat or mount fail because 
inodes are too big, having fewer than 4G files, and waiting for the 
filesystem to generate a suitable inode number might fall in that 
category.

I fully agree that much effort should be put into making inode numbers 
work the way POSIX demands, but I also know that that sometimes requires 
more than just writing some code.

--
Bryan Henderson   San Jose California
IBM Almaden Research Center   Filesystems


Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen

On Wed, Jan 03, 2007 at 08:31:32PM +0100, Mikulas Patocka wrote:
> >>>>I didn't hardlink directories, I just patched stat, lstat and fstat to
> >>>>always return st_ino == 0 --- and I've seen those failures. These
> >>>>failures are going to happen on non-POSIX filesystems in real world
> >>>>too, very rarely.
> >>>
> >>>I don't want to spoil your day but testing with st_ino==0 is a bad choice
> >>>because it is a special number. Anyway, one can only find breakage,
> >>>not prove that all the other programs handle this correctly so this is
> >>>kind of pointless.
> >>>
> >>>On any decent filesystem st_ino should uniquely identify an object and
> >>>reliably provide hardlink information. The UNIX world has relied upon
> >>>this for decades. A filesystem with st_ino collisions without being
> >>>hardlinked (or the other way around) needs a fix.
> >>
> >>... and that's the problem --- the UNIX world specified something that
> >>isn't implementable in real world.
> >
> >Sure it is. Numerous popular POSIX filesystems do that. There is a lot of
> >inode number space in 64 bit (of course it is a matter of time for it to
> >jump to 128 bit and more)
> 
> If the filesystem was designed by someone not from Unix world (FAT, SMB, 
> ...), then not. And users still want to access these filesystems.

They can. Hey, it's not perfect but who expects FAT/SMB to be "perfect" anyway?

> 
> 64-bit inode numbers space is not yet implemented on Linux --- the problem 
> is that if you return ino >= 2^32, programs compiled without 
> -D_FILE_OFFSET_BITS=64 will fail with stat() returning -EOVERFLOW --- this 
> failure is specified in POSIX, but not very useful.

hmm, checking iunique(), ino_t, __kernel_ino_t... I see. Pity. So at
some point in time we may need a sort of "ino64" mount option to be
able to switch to a 64 bit number space on mount basis. Or (conversely)
refuse to mount without that option if we know there are >32 bit st_ino
out there. And invent iunique64() and use that when "ino64" specified
for FAT/SMB/...  when those filesystems haven't been replaced by a
successor by that time.

At that time probably all programs are either compiled with
-D_FILE_OFFSET_BITS=64 (most already are because of files bigger than 2G)
or completely 64 bit. 

-- 
Frank

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka


>>>>I didn't hardlink directories, I just patched stat, lstat and fstat to
>>>>always return st_ino == 0 --- and I've seen those failures. These failures
>>>>are going to happen on non-POSIX filesystems in real world too, very
>>>>rarely.
>>>
>>>I don't want to spoil your day but testing with st_ino==0 is a bad choice
>>>because it is a special number. Anyway, one can only find breakage,
>>>not prove that all the other programs handle this correctly so this is
>>>kind of pointless.
>>>
>>>On any decent filesystem st_ino should uniquely identify an object and
>>>reliably provide hardlink information. The UNIX world has relied upon this
>>>for decades. A filesystem with st_ino collisions without being hardlinked
>>>(or the other way around) needs a fix.
>>
>>... and that's the problem --- the UNIX world specified something that
>>isn't implementable in real world.
>
>Sure it is. Numerous popular POSIX filesystems do that. There is a lot of
>inode number space in 64 bit (of course it is a matter of time for it to
>jump to 128 bit and more)


If the filesystem was designed by someone not from Unix world (FAT, SMB, 
...), then not. And users still want to access these filesystems.


64-bit inode numbers space is not yet implemented on Linux --- the problem 
is that if you return ino >= 2^32, programs compiled without 
-D_FILE_OFFSET_BITS=64 will fail with stat() returning -EOVERFLOW --- this 
failure is specified in POSIX, but not very useful.


Mikulas

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen

On Wed, Jan 03, 2007 at 08:17:34PM +0100, Mikulas Patocka wrote:
> 
> On Wed, 3 Jan 2007, Frank van Maarseveen wrote:
> 
> >On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote:
> >>
> >>I didn't hardlink directories, I just patched stat, lstat and fstat to
> >>always return st_ino == 0 --- and I've seen those failures. These failures
> >>are going to happen on non-POSIX filesystems in real world too, very
> >>rarely.
> >
> >I don't want to spoil your day but testing with st_ino==0 is a bad choice
> >because it is a special number. Anyway, one can only find breakage,
> >not prove that all the other programs handle this correctly so this is
> >kind of pointless.
> >
> >On any decent filesystem st_ino should uniquely identify an object and
> >reliably provide hardlink information. The UNIX world has relied upon this
> >for decades. A filesystem with st_ino collisions without being hardlinked
> >(or the other way around) needs a fix.
> 
> ... and that's the problem --- the UNIX world specified something that 
> isn't implementable in real world.

Sure it is. Numerous popular POSIX filesystems do that. There is a lot of
inode number space in 64 bit (of course it is a matter of time for it to
jump to 128 bit and more)

-- 
Frank

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka




On Wed, 3 Jan 2007, Frank van Maarseveen wrote:

>On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote:
>>
>>I didn't hardlink directories, I just patched stat, lstat and fstat to
>>always return st_ino == 0 --- and I've seen those failures. These failures
>>are going to happen on non-POSIX filesystems in real world too, very
>>rarely.
>
>I don't want to spoil your day but testing with st_ino==0 is a bad choice
>because it is a special number. Anyway, one can only find breakage,
>not prove that all the other programs handle this correctly so this is
>kind of pointless.
>
>On any decent filesystem st_ino should uniquely identify an object and
>reliably provide hardlink information. The UNIX world has relied upon this
>for decades. A filesystem with st_ino collisions without being hardlinked
>(or the other way around) needs a fix.

... and that's the problem --- the UNIX world specified something that 
isn't implementable in real world.

You can take a closed box and say "this is POSIX certified" --- but how 
useful could such a box be if you can't access CDs, diskettes and USB 
sticks with it?


Mikulas

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen

On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote:
> 
> I didn't hardlink directories, I just patched stat, lstat and fstat to 
> always return st_ino == 0 --- and I've seen those failures. These failures 
> are going to happen on non-POSIX filesystems in real world too, very 
> rarely.

I don't want to spoil your day but testing with st_ino==0 is a bad choice
because it is a special number. Anyway, one can only find breakage,
not prove that all the other programs handle this correctly so this is
kind of pointless.

On any decent filesystem st_ino should uniquely identify an object and
reliably provide hardlink information. The UNIX world has relied upon this
for decades. A filesystem with st_ino collisions without being hardlinked
(or the other way around) needs a fix.
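
The check that tree-walking tools rely on here can be sketched as follows.
This is only an illustration of comparing (st_dev, st_ino); the function
name same_file_by_ino is made up for this sketch, and the result is exactly
as trustworthy as the filesystem's st_ino.

```c
#include <sys/stat.h>

/* Returns 1 if both paths name the same object according to
 * (st_dev, st_ino), 0 if they differ, -1 if either lstat() fails.
 * Racy if the tree changes between the two lstat() calls. */
static int same_file_by_ino(const char *a, const char *b)
{
    struct stat sa, sb;

    if (lstat(a, &sa) != 0 || lstat(b, &sb) != 0)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

On a filesystem that hands out colliding st_ino values this returns 1 for
unrelated files, which is the breakage being discussed.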

Synthetic filesystems such as /proc are special due to their dynamic
nature and I think st_ino uniqueness is far more important than being able
to provide hardlinks there. Most tree handling programs ("cp", "rm", ...)
break horribly when the tree underneath changes at the same time.

-- 
Frank

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka




On Wed, 3 Jan 2007, Miklos Szeredi wrote:

>>> High probability is all you have.  Cosmic radiation hitting your
>>> computer will more likely cause problems than colliding 64bit inode
>>> numbers ;)
>>
>> Some of us have machines designed to cope with cosmic rays, and would be
>> unimpressed with a decrease in reliability.
>
> With the suggested samefile() interface you'd get a failure with just
> about 100% reliability for any application which needs to compare
> more than a few files.  The fact is open files are _very_ expensive,
> no wonder they are limited in various ways.
>
> What should 'tar' do when it runs out of open files, while searching
> for hardlinks?  Should it just give up?  Then the samefile() interface
> would be _less_ reliable than the st_ino one by a significant margin.

You could do samefile() for paths --- as for races --- it doesn't matter 
in this scenario, it is no more racy than stat or lstat.

Mikulas

> Miklos



Re: Finding hardlinks

2007-01-03 Thread Miklos Szeredi

> > High probability is all you have.  Cosmic radiation hitting your
> > computer will more likely cause problems than colliding 64bit inode
> > numbers ;)
> 
> Some of us have machines designed to cope with cosmic rays, and would be
> unimpressed with a decrease in reliability.

With the suggested samefile() interface you'd get a failure with just
about 100% reliability for any application which needs to compare
more than a few files.  The fact is open files are _very_ expensive,
no wonder they are limited in various ways.

What should 'tar' do when it runs out of open files, while searching
for hardlinks?  Should it just give up?  Then the samefile() interface
would be _less_ reliable than the st_ino one by a significant margin.
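
To put a number on "limited", a small sketch that reads the per-process
descriptor limit with getrlimit(). The helper name open_file_limit is
invented for this example; an fd-based samefile() could hold at most about
this many candidate files open at once.

```c
#include <sys/resource.h>

/* Stores the soft limit on open file descriptors in *out.
 * Returns 0 on success, -1 if the limit cannot be read. */
static int open_file_limit(rlim_t *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return -1;
    *out = rl.rlim_cur;   /* may be RLIM_INFINITY */
    return 0;
}
```

The soft limit is usually far smaller than the number of multiply-linked
files a large backup has to track, which is the point being made about tar.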

Miklos

Re: Finding hardlinks

2007-01-03 Thread Matthew Wilcox

On Wed, Jan 03, 2007 at 01:33:31PM +0100, Miklos Szeredi wrote:
> High probability is all you have.  Cosmic radiation hitting your
> computer will more likely cause problems than colliding 64bit inode
> numbers ;)

Some of us have machines designed to cope with cosmic rays, and would be
unimpressed with a decrease in reliability.

Re: Finding hardlinks

2007-01-03 Thread Martin Mares

Hello!

> High probability is all you have.  Cosmic radiation hitting your
> computer will more likely cause problems than colliding 64bit inode
> numbers ;)

No.

If you assign 64-bit inode numbers randomly, 2^32 of them are sufficient
to generate a collision with probability around 50%.
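
That figure can be checked with the usual birthday approximation. A rough
worked estimate, assuming the numbers are drawn uniformly and independently
from the full 64-bit space (an idealization, not something the thread
specifies):

```latex
P(\text{collision}) \approx 1 - \exp\!\left(-\frac{n(n-1)}{2N}\right),
\qquad N = 2^{64},\quad n = 2^{32}
\;\Longrightarrow\;
P \approx 1 - e^{-1/2} \approx 0.39
```

Reaching exactly 50% takes about sqrt(2 ln 2 * N), roughly 1.18 * 2^32 or
5.1e9 numbers, so "around 50%" holds as an order-of-magnitude statement.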

Have a nice fortnight
-- 
Martin `MJ' Mares  <[EMAIL PROTECTED]>   
http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
A Bash poem: time for echo in canyon; do echo $echo $echo; done

Re: Finding hardlinks

2007-01-03 Thread Pavel Machek

Hi!

> > > > > the use of a good hash function.  The chance of an accidental
> > > > > collision is infinitesimally small.  For a set of 
> > > > > 
> > > > >  100 files: 0.03%
> > > > >1,000,000 files: 0.03%
> > > > 
> > > > I do not think we want to play with probability like this. I mean...
> > > > imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
> > > > unreasonable, and collision probability is going to be ~100% due to
> > > > birthday paradox.
> > > > 
> > > > You'll still want to back up your 4TB server...
> > > 
> > > Certainly, but tar isn't going to remember all the inode numbers.
> > > Even if you solve the storage requirements (not impossible) it would
> > > have to do (4e9^2)/2=8e18 comparisons, which computers don't have
> > > enough CPU power just yet.
> > 
> > Storage requirements would be 16GB of RAM... that's small enough. If
> > you sort, you'll only need 32*2^32 comparisons, and that's doable.
> > 
> > I do not claim it is _likely_. You'd need hardlinks, as you
> > noticed. But system should work, not "work with high probability", and
> > I believe we should solve this in long term.
> 
> High probability is all you have.  Cosmic radiation hitting your
> computer will more likely cause problems than colliding 64bit inode
> numbers ;)

As I have shown... no, that's not right. 32*2^32 operations is small
enough not to have problems with cosmic radiation.

> But you could add a new interface for the extra paranoid.  The
> proposed 'samefile(fd1, fd2)' syscall is severely limited by the heavy
> weight of file descriptors.

I guess that is the way to go. samefile(path1, path2) is unfortunately
inherently racy.

> Another idea is to export the filesystem internal ID as an arbitrary
> length cookie through the extended attribute interface.  That could be
> stored/compared by the filesystem quite efficiently.

How will that work for FAT?

Or maybe we can relax that "inode may not change over rename" and
"zero length files need unique inode numbers"...

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: [nfsv4] RE: Finding hardlinks

2007-01-03 Thread Benny Halevy

Trond Myklebust wrote:
> On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote:
>> Trond Myklebust wrote:
>>>  
>>> On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
>>>> Mikulas Patocka wrote:
>>>>> BTW. how does (or how should?) NFS client deal with cache coherency if 
>>>>> filehandles for the same file differ?
>>>>>
>>>> Trond can probably answer this better than me...
>>>> As I read it, currently the nfs client matches both the fileid and the
>>>> filehandle (in nfs_find_actor). This means that different filehandles
>>>> for the same file would result in different inodes :(.
>>>> Strictly following the nfs protocol, comparing only the fileid should
>>>> be enough IF fileids are indeed unique within the filesystem.
>>>> Comparing the filehandle works as a workaround when the exported filesystem
>>>> (or the nfs server) violates that.  From a user stand point I think that
>>>> this should be configurable, probably per mount point.
>>> Matching files by fileid instead of filehandle is a lot more trouble
>>> since fileids may be reused after a file has been deleted. Every time
>>> you look up a file, and get a new filehandle for the same fileid, you
>>> would at the very least have to do another GETATTR using one of the
>>> 'old' filehandles in order to ensure that the file is the same object as
>>> the one you have cached. Then there is the issue of what to do when you
>>> open(), read() or write() to the file: which filehandle do you use, are
>>> the access permissions the same for all filehandles, ...
>>>
>>> All in all, much pain for little or no gain.
>> See my answer to your previous reply.  It seems like the current
>> implementation is in violation of the nfs protocol and the extra pain
>> is required.
> 
> ...and we should care because...?
> 
> Trond
> 

Believe it or not, but server companies like Panasas try to follow the standard
when designing and implementing their products while relying on client vendors
to do the same.

I sincerely expect you or anybody else for that matter to try to provide
feedback and object to the protocol specification in case they disagree
with it (or think it's ambiguous or self contradicting) rather than ignoring
it and implementing something else. I think we're shooting ourselves in the
foot when doing so and it is in our common interest to strive to reach a
realistic standard we can all comply with and interoperate with each other.

Benny


Re: Finding hardlinks

2007-01-03 Thread Miklos Szeredi

> > > > the use of a good hash function.  The chance of an accidental
> > > > collision is infinitesimally small.  For a set of 
> > > > 
> > > >  100 files: 0.03%
> > > >1,000,000 files: 0.03%
> > > 
> > > I do not think we want to play with probability like this. I mean...
> > > imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
> > > unreasonable, and collision probability is going to be ~100% due to
> > > birthday paradox.
> > > 
> > > You'll still want to back up your 4TB server...
> > 
> > Certainly, but tar isn't going to remember all the inode numbers.
> > Even if you solve the storage requirements (not impossible) it would
> > have to do (4e9^2)/2=8e18 comparisons, which computers don't have
> > enough CPU power just yet.
> 
> Storage requirements would be 16GB of RAM... that's small enough. If
> you sort, you'll only need 32*2^32 comparisons, and that's doable.
> 
> I do not claim it is _likely_. You'd need hardlinks, as you
> noticed. But system should work, not "work with high probability", and
> I believe we should solve this in long term.

High probability is all you have.  Cosmic radiation hitting your
computer will more likely cause problems than colliding 64bit inode
numbers ;)

But you could add a new interface for the extra paranoid.  The
proposed 'samefile(fd1, fd2)' syscall is severely limited by the heavy
weight of file descriptors.

Another idea is to export the filesystem internal ID as an arbitrary
length cookie through the extended attribute interface.  That could be
stored/compared by the filesystem quite efficiently.

But I think most apps will still opt for the portable interfaces which,
while not perfect, are "good enough".

Miklos

Re: Finding hardlinks

2007-01-03 Thread Pavel Machek

Hi!

> > > the use of a good hash function.  The chance of an accidental
> > > collision is infinitesimally small.  For a set of 
> > > 
> > >  100 files: 0.03%
> > >1,000,000 files: 0.03%
> > 
> > I do not think we want to play with probability like this. I mean...
> > imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
> > unreasonable, and collision probability is going to be ~100% due to
> > birthday paradox.
> > 
> > You'll still want to back up your 4TB server...
> 
> Certainly, but tar isn't going to remember all the inode numbers.
> Even if you solve the storage requirements (not impossible) it would
> have to do (4e9^2)/2=8e18 comparisons, which computers don't have
> enough CPU power just yet.

Storage requirements would be 16GB of RAM... that's small enough. If
you sort, you'll only need 32*2^32 comparisons, and that's doable.
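
A sketch of the sort-then-scan approach, to show where the n*log2(n)
figure (32 * 2^32 for n = 2^32) comes from. The struct and function names
are illustrative only, and this says nothing about how tar actually tracks
hardlinks.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>

/* One record per file seen; (dev, ino) is the identity being compared. */
struct file_id {
    dev_t dev;
    ino_t ino;
};

/* Orders records by device first, then by inode number. */
static int cmp_file_id(const void *a, const void *b)
{
    const struct file_id *x = a, *y = b;

    if (x->dev != y->dev)
        return x->dev < y->dev ? -1 : 1;
    if (x->ino != y->ino)
        return x->ino < y->ino ? -1 : 1;
    return 0;
}

/* Sorts ids in place and returns how many entries repeat an earlier
 * (dev, ino) pair: O(n log n) comparisons instead of O(n^2). */
static size_t count_duplicate_ids(struct file_id *ids, size_t n)
{
    size_t dups = 0;

    qsort(ids, n, sizeof ids[0], cmp_file_id);
    for (size_t i = 1; i < n; i++)
        if (cmp_file_id(&ids[i - 1], &ids[i]) == 0)
            dups++;
    return dups;
}
```

After sorting, equal identities sit next to each other, so one linear pass
finds every candidate hardlink or collision.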

I do not claim it is _likely_. You'd need hardlinks, as you
noticed. But system should work, not "work with high probability", and
I believe we should solve this in long term.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Finding hardlinks

2007-01-03 Thread Pavel Machek

Hi!

 the use of a good hash function.  The chance of an accidental
 collision is infinitesimally small.  For a set of 
 
  100 files: 0.03%
1,000,000 files: 0.03%

I do not think we want to play with probability like this. I mean...
imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
unreasonable, and collision probability is going to be ~100% due to
birthday paradox.

You'll still want to back up your 4TB server...
   
   Certainly, but tar isn't going to remember all the inode numbers.
   Even if you solve the storage requirements (not impossible) it would
   have to do (4e9^2)/2=8e18 comparisons, for which computers don't have
   enough CPU power just yet.
  
  Storage requirements would be 16GB of RAM... that's small enough. If
  you sort, you'll only need 32*2^32 comparisons, and that's doable.
  
  I do not claim it is _likely_. You'd need hardlinks, as you
  noticed. But the system should work, not just work with high
  probability, and I believe we should solve this in the long term.
 
 High probability is all you have.  Cosmic radiation hitting your
 computer will more likely cause problems than colliding 64bit inode
 numbers ;)

As I have shown... no, that's not right. 32*2^32 operations is small
enough not to have problems with cosmic radiation.
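
The sort-based approach mentioned above can be sketched in Python. This is
only an illustration under stated assumptions, not code from tar or from
this thread: the helper names are hypothetical, and the input is assumed to
be a list of (path, st_dev, st_ino) tuples collected beforehand, for
example with os.lstat() while walking the tree.

```python
import math
from itertools import groupby


def hardlink_groups(entries):
    """Group paths that share a (st_dev, st_ino) pair.

    entries: iterable of (path, st_dev, st_ino) tuples (assumed input shape).
    Returns a list of path lists, one per identity seen more than once.
    """
    # Sorting needs O(n log n) comparisons instead of comparing every pair.
    ordered = sorted(entries, key=lambda e: (e[1], e[2]))
    groups = []
    for _, run in groupby(ordered, key=lambda e: (e[1], e[2])):
        paths = [path for path, _, _ in run]
        if len(paths) > 1:
            groups.append(paths)
    return groups


def pairwise_comparisons(n):
    """Number of comparisons when every pair of n entries is checked."""
    return n * (n - 1) // 2


def sort_comparisons(n):
    """Rough n * log2(n) estimate for a comparison sort of n entries."""
    return int(n * math.log2(n))
```

With n = 2^32 the sort estimate is exactly 32*2^32, the figure quoted
above, while the all-pairs count for 4e9 entries is close to 8e18.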

 But you could add a new interface for the extra paranoid.  The
 proposed 'samefile(fd1, fd2)' syscall is severely limited by the heavy
 weight of file descriptors.

I guess that is the way to go. samefile(path1, path2) is unfortunately
inherently racy.

 Another idea is to export the filesystem internal ID as an arbitrary
 length cookie through the extended attribute interface.  That could be
 stored/compared by the filesystem quite efficiently.

How will that work for FAT?

Or maybe we can relax the requirements that an inode number may not change
across rename and that zero-length files need unique inode numbers...

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

Re: Finding hardlinks

2007-01-03 Thread Martin Mares

Hello!

 High probability is all you have.  Cosmic radiation hitting your
 computer will more likely cause problems than colliding 64bit inode
 numbers ;)

No.

If you assign 64-bit inode numbers randomly, 2^32 of them are sufficient
to generate a collision with probability around 50%.
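
The birthday-bound estimate behind this claim can be checked numerically.
The sketch below assumes identifiers are drawn uniformly at random from the
whole number space, which is an assumption for the sake of the argument and
not how any particular filesystem allocates inode numbers. The helper name
collision_probability is hypothetical.

```python
import math


def collision_probability(n_items, id_bits):
    """Approximate chance that at least two of n_items random IDs collide.

    Uses the usual birthday approximation P ~= 1 - exp(-n(n-1) / (2N)),
    where N = 2**id_bits is the size of the ID space.
    """
    space = 2.0 ** id_bits
    exponent = -n_items * (n_items - 1) / (2.0 * space)
    # expm1 keeps precision when the resulting probability is tiny.
    return -math.expm1(exponent)


# 2**32 random 64-bit IDs: roughly 0.39 under this approximation.
p_64 = collision_probability(2**32, 64)
```

Under this approximation the result for 2^32 random 64-bit values is about
0.39, and the 50% point is reached at roughly 1.18 * 2^32 values, so the
"around 50%" figure is the right order of magnitude.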

Have a nice fortnight
-- 
Martin `MJ' Mares  [EMAIL PROTECTED]   
http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
A Bash poem: time for echo in canyon; do echo $echo $echo; done

Re: Finding hardlinks

2007-01-03 Thread Matthew Wilcox

On Wed, Jan 03, 2007 at 01:33:31PM +0100, Miklos Szeredi wrote:
 High probability is all you have.  Cosmic radiation hitting your
 computer will more likely cause problems than colliding 64bit inode
 numbers ;)

Some of us have machines designed to cope with cosmic rays, and would be
unimpressed with a decrease in reliability.

Re: Finding hardlinks

2007-01-03 Thread Miklos Szeredi

  High probability is all you have.  Cosmic radiation hitting your
  computer will more likely cause problems than colliding 64bit inode
  numbers ;)
 
 Some of us have machines designed to cope with cosmic rays, and would be
 unimpressed with a decrease in reliability.

With the suggested samefile() interface you'd get a failure with just
about 100% reliability for any application which needs to compare
more than a few files.  The fact is open files are _very_ expensive,
no wonder they are limited in various ways.

What should 'tar' do when it runs out of open files, while searching
for hardlinks?  Should it just give up?  Then the samefile() interface
would be _less_ reliable than the st_ino one by a significant margin.
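
To make the descriptor limit concrete, here is a small Python sketch that
reads the per-process limit and checks whether a given number of files
could be held open at once. It assumes a Unix-like system where the
standard resource module is available. The helper names and the reserved
count of three descriptors (stdin, stdout, stderr) are illustrative
assumptions, not something specified in this thread.

```python
import resource


def open_file_limit():
    """Return the (soft, hard) per-process limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fits_in_fd_limit(n_candidates, soft_limit, reserved=3):
    """Could n_candidates files be held open simultaneously?

    reserved is an assumed allowance for stdin, stdout and stderr.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return n_candidates <= soft_limit - reserved
```

For example, with a soft limit of 1024, fits_in_fd_limit(1_000_000, 1024)
is False, which is the situation described above for a tool that would
have to keep one descriptor per hardlink candidate.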

Miklos

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka

On Wed, 3 Jan 2007, Miklos Szeredi wrote:

   High probability is all you have.  Cosmic radiation hitting your
   computer will more likely cause problems than colliding 64bit inode
   numbers ;)
  
  Some of us have machines designed to cope with cosmic rays, and would be
  unimpressed with a decrease in reliability.
 
 With the suggested samefile() interface you'd get a failure with just
 about 100% reliability for any application which needs to compare
 more than a few files.  The fact is open files are _very_ expensive,
 no wonder they are limited in various ways.
 
 What should 'tar' do when it runs out of open files, while searching
 for hardlinks?  Should it just give up?  Then the samefile() interface
 would be _less_ reliable than the st_ino one by a significant margin.

You could do samefile() for paths --- as for races --- it doesn't matter 
in this scenario, it is no more racy than stat or lstat.
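
A path-based comparison can be approximated in user space today. The
Python sketch below is only that approximation: it compares the
(st_dev, st_ino) pairs returned by two lstat() calls, so it inherits
exactly the st_ino reliability questions debated in this thread and is not
the proposed samefile() syscall. The function name samefile_by_stat is
hypothetical.

```python
import os


def samefile_by_stat(path1, path2):
    """Return True if both paths report the same (st_dev, st_ino) pair."""
    # lstat does not follow symlinks, as a tree-copying tool would want.
    s1 = os.lstat(path1)
    # The file tree may change between the two calls; that window is the
    # same kind of race any stat-based check has.
    s2 = os.lstat(path2)
    return (s1.st_dev, s1.st_ino) == (s2.st_dev, s2.st_ino)
```

Two hard links to one file are expected to compare equal here on
filesystems that report stable, unique inode numbers.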


Mikulas

 Miklos

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen

On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote:
 
 I didn't hardlink directories, I just patched stat, lstat and fstat to 
 always return st_ino == 0 --- and I've seen those failures. These failures 
 are going to happen on non-POSIX filesystems in real world too, very 
 rarely.

I don't want to spoil your day but testing with st_ino==0 is a bad choice
because it is a special number. Anyway, one can only find breakage,
not prove that all the other programs handle this correctly so this is
kind of pointless.

On any decent filesystem st_ino should uniquely identify an object and
reliably provide hardlink information. The UNIX world has relied upon this
for decades. A filesystem with st_ino collisions without being hardlinked
(or the other way around) needs a fix.

Synthetic filesystems such as /proc are special due to their dynamic
nature and I think st_ino uniqueness is far more important than being able
to provide hardlinks there. Most tree handling programs (cp, rm, ...)
break horribly when the tree underneath changes at the same time.

-- 
Frank

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka

On Wed, 3 Jan 2007, Frank van Maarseveen wrote:

 On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote:
  
  I didn't hardlink directories, I just patched stat, lstat and fstat to
  always return st_ino == 0 --- and I've seen those failures. These failures
  are going to happen on non-POSIX filesystems in real world too, very
  rarely.
 
 I don't want to spoil your day but testing with st_ino==0 is a bad choice
 because it is a special number. Anyway, one can only find breakage,
 not prove that all the other programs handle this correctly so this is
 kind of pointless.
 
 On any decent filesystem st_ino should uniquely identify an object and
 reliably provide hardlink information. The UNIX world has relied upon this
 for decades. A filesystem with st_ino collisions without being hardlinked
 (or the other way around) needs a fix.

... and that's the problem --- the UNIX world specified something that
isn't implementable in real world.

You can take a closed box and say this is POSIX certified --- but how
useful could such a box be, if you can't access CDs, diskettes and USB
sticks with it?

Mikulas

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen

On Wed, Jan 03, 2007 at 08:17:34PM +0100, Mikulas Patocka wrote:
 
 On Wed, 3 Jan 2007, Frank van Maarseveen wrote:
 
 On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote:
 
 I didn't hardlink directories, I just patched stat, lstat and fstat to
 always return st_ino == 0 --- and I've seen those failures. These failures
 are going to happen on non-POSIX filesystems in real world too, very
 rarely.
 
 I don't want to spoil your day but testing with st_ino==0 is a bad choice
 because it is a special number. Anyway, one can only find breakage,
 not prove that all the other programs handle this correctly so this is
 kind of pointless.
 
 On any decent filesystem st_ino should uniquely identify an object and
 reliably provide hardlink information. The UNIX world has relied upon this
 for decades. A filesystem with st_ino collisions without being hardlinked
 (or the other way around) needs a fix.
 
 ... and that's the problem --- the UNIX world specified something that 
 isn't implementable in real world.

Sure it is. Numerous popular POSIX filesystems do that. There is a lot of
inode number space in 64 bit (of course it is a matter of time for it to
jump to 128 bit and more)

-- 
Frank
