Re: Finding hardlinks

2007-01-11 Thread Denis Vlasenko
On Wednesday 03 January 2007 13:42, Pavel Machek wrote:
 I guess that is the way to go. samefile(path1, path2) is unfortunately
 inherently racy.

Not a problem in practice. You don't expect cp -a
to reliably copy a tree which something else is modifying
at the same time.

Thus we assume that the tree we operate on is not modified.
--
vda


Re: Finding hardlinks

2007-01-11 Thread Denis Vlasenko
On Wednesday 03 January 2007 21:26, Frank van Maarseveen wrote:
 On Wed, Jan 03, 2007 at 08:31:32PM +0100, Mikulas Patocka wrote:
  64-bit inode number space is not yet implemented on Linux --- the problem
  is that if you return ino >= 2^32, programs compiled without
  -D_FILE_OFFSET_BITS=64 will fail with stat() returning -EOVERFLOW --- this 
  failure is specified in POSIX, but not very useful.
 
 hmm, checking iunique(), ino_t, __kernel_ino_t... I see. Pity. So at
 some point in time we may need a sort of ino64 mount option to be
 able to switch to a 64 bit number space on a per-mount basis. Or (conversely)
 refuse to mount without that option if we know there are 32 bit st_ino
 out there. And invent iunique64() and use that when ino64 is specified
 for FAT/SMB/...  when those filesystems haven't been replaced by a
 successor by that time.
 
 At that time probably all programs will either be compiled with
 -D_FILE_OFFSET_BITS=64 (most already are, because of files bigger than 2G)
 or be completely 64 bit.

Good plan. Be prepared to redo it again when 64 bits start to feel small too,
and then again when 128 bits do. Don't tell me this won't happen --- 15 years
ago people would have laughed at the idea of 32bit inode numbers not being enough.
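
For reference, the EOVERFLOW failure mentioned above is easy to see from
userspace; a rough sketch (not tied to any particular filesystem):

/* Build WITHOUT -D_FILE_OFFSET_BITS=64 to get the legacy 32-bit ino_t ABI. */
#include <sys/stat.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>

int main(int argc, char **argv)
{
    struct stat st;

    if (argc < 2)
        return 2;
    if (stat(argv[1], &st) == -1) {
        /* A filesystem handing out ino >= 2^32 shows up here as
         * EOVERFLOW when the old 32-bit struct can't hold it. */
        fprintf(stderr, "stat(%s): %s\n", argv[1], strerror(errno));
        return 1;
    }
    printf("ino=%llu\n", (unsigned long long)st.st_ino);
    return 0;
}
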
--
vda


Re: [nfsv4] RE: Finding hardlinks

2007-01-10 Thread Benny Halevy
Nicolas Williams wrote:
 On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote:
 I agree that the way the client implements its cache is out of the protocol
 scope. But how do you interpret "correct behavior" in section 4.2.1?
  Clients MUST use filehandle comparisons only to improve performance, not 
 for correct behavior. All clients need to be prepared for situations in 
 which it cannot be determined whether two filehandles denote the same object 
 and in such cases, avoid making invalid assumptions which might cause 
 incorrect behavior.
 Don't you consider data corruption due to cache inconsistency an incorrect 
 behavior?
 
 If a file with multiple hardlinks appears to have multiple distinct
 filehandles then a client like Trond's will treat it as multiple
 distinct files (with the same hardlink count, and you won't be able to
 find the other links to them -- oh well).  Can this cause data
 corruption?  Yes, but only if there are applications that rely on the
 different file names referencing the same file, and backup apps on the
 client won't get the hardlinks right either.

The case I'm discussing is multiple filehandles for the same name,
not even for different hardlinks.  This causes spurious EIO errors
on the client when the filehandle changes and cache inconsistency
when opening the file multiple times in parallel.

 
 What I don't understand is why getting the fileid is so hard -- always
 GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
 difficult as it is to maintain a hash table of fileids.

It's not difficult at all, just that the client can't rely on the fileids to be
unique in both space and time because of server non-compliance (e.g. netapp's
snapshots) and fileid reuse after delete.




Re: Finding hardlinks

2007-01-10 Thread Bryan Henderson
I did cp -rl his-tree my-tree (which completed
quickly), edited the two files that needed to be patched, then did
diff -urp his-tree my-tree, which also completed quickly, as diff knows
that if two files have the same inode, they don't need to be opened.

... download one tree from kernel.org, do a bunch of cp -lr for
each arch you plan to play with, and then go and work on each of the
trees separately.

Cool.  It's like a poor man's directory overlay (same basic concept as 
union mount, Make's VPATH, and Subversion branching).  And I guess this 
explains why the diff optimization is so important.
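
(For the curious: the optimization amounts to comparing (st_dev, st_ino)
before reading any data. A hedged sketch of that check, not GNU diff's
actual code:

#include <sys/stat.h>

/* Returns 1 if the two paths certainly name the same file on a
 * filesystem with POSIX-conforming inode numbers, 0 otherwise.
 * This is the shortcut that lets "diff -urp" skip hardlinked pairs. */
static int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) == -1 || stat(b, &sb) == -1)
        return 0;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
)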

--
Bryan Henderson   San Jose California
IBM Almaden Research Center   Filesystems



Re: Finding hardlinks

2007-01-09 Thread Steven Rostedt

On Tue, 9 Jan 2007, Frank van Maarseveen wrote:


 Yes but cp -rl is typically done by _developers_ and they tend to
 have a better understanding of this (uh, at least within linux context
 I hope so).

 Also, just adding hard-links doesn't increase the number of inodes.

No, but it increases the number of inodes that have a link count > 1. :)
-- Steve



Re: Finding hardlinks

2007-01-09 Thread Bryan Henderson
but you can get a large number of >1 linked
files, when you copy full directories with cp -rl.  Which I do a lot
when developing. I've done that a few times with the Linux tree.

Can you shed some light on how you use this technique?  (I.e. what does it 
do for you?)

Many people are of the opinion that since the invention of symbolic links, 
multiple hard links to files have been more trouble than they're worth.  I 
purged the last of them from my personal system years ago.  This thread 
has been a good overview of the negative side of hardlinking; it would be 
good to see what the positives are.

--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems



Re: Finding hardlinks

2007-01-09 Thread Pavel Machek
On Tue 2007-01-09 15:43:14, Bryan Henderson wrote:
 but you can get a large number of >1 linked
 files, when you copy full directories with cp -rl.  Which I do a lot
 when developing. I've done that a few times with the Linux tree.
 
 Can you shed some light on how you use this technique?  (I.e. what does it 
 do for you?)
 
 Many people are of the opinion that since the invention of symbolic links, 
 multiple hard links to files have been more trouble than they're worth.  I 
 purged the last of them from my personal system years ago.  This thread 
 has been a good overview of the negative side of hardlinking; it would be 
 good to see what the positives are.

git uses hardlinks heavily, AFAICT.

And no, you can't symlink two linux source trees against each other,
edit both randomly, and expect the result to work... can you? Whereas
hardlinks + common editors were designed to enable exactly that, IIRC.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi
  No one guarantees you sane result of tar or cp -a while changing the tree.
  I don't see how is_samefile() could make it worse.
 
  There are several cases where changing the tree doesn't affect the
  correctness of the tar or cp -a result.  In some of these cases using
  samefile() instead of st_ino _will_ result in a corrupted result.
 
 ... and those are what?

  - /a/p/x and /a/q/x are links to the same file

  - /b/y and /a/q/y are links to the same file

  - tar is running on /a

  - meanwhile the following commands are executed:

 mv /a/p/x /b/x
 mv /b/y /a/p/x

With st_ino checking you'll get a perfectly consistent archive,
regardless of the timing.  With samefile() you could get an archive
where the data in /a/q/y is not stored, instead it will contain the
data of /a/q/x.

Note, this is far nastier than the normal corruption you usually get
with changing the tree under tar, the file is not just duplicated or
missing, it becomes a completely different file, even though it hasn't
been touched at all during the archiving.

The basic problem with samefile() is that it can only compare files at
a single snapshot in time, and cannot take into account any changes in
the tree (unless keeping files open, which is impractical).
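
To be concrete about what "st_ino checking" means here: an archiver
remembers (st_dev, st_ino) for every file whose link count is greater
than one and emits a hardlink when the pair repeats. A simplified
sketch (linear table instead of a hash, invented helper name):

#include <sys/types.h>
#include <sys/stat.h>
#include <stdlib.h>
#include <string.h>

struct seen {
    dev_t dev;
    ino_t ino;
    char *first_path;           /* archive member to link against */
};

static struct seen *table;
static size_t nseen;

/* Return the previously archived path for this inode, or NULL if the
 * file hasn't been seen yet (in which case it is recorded now). */
static const char *hardlink_target(const char *path, const struct stat *st)
{
    size_t i;
    struct seen *t;

    if (st->st_nlink < 2)
        return NULL;            /* can't be part of a hardlink group */
    for (i = 0; i < nseen; i++)
        if (table[i].dev == st->st_dev && table[i].ino == st->st_ino)
            return table[i].first_path;

    t = realloc(table, (nseen + 1) * sizeof(*t));
    if (!t)
        return NULL;            /* out of memory: just record nothing */
    table = t;
    table[nseen].dev = st->st_dev;
    table[nseen].ino = st->st_ino;
    table[nseen].first_path = strdup(path);
    nseen++;
    return NULL;
}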

There's really no point trying to push for such an inferior interface
when the problems which samefile is trying to address are purely
theoretical.

Currently linux is living with 32bit st_ino because of legacy apps,
and people are not constantly agonizing about it.  Fixing the
EOVERFLOW problem will enable filesystems to slowly move towards 64bit
st_ino, which should be more than enough.

Miklos


Re: Finding hardlinks

2007-01-08 Thread Pavel Machek
On Fri 2007-01-05 16:15:41, Miklos Szeredi wrote:
   And does it matter? If you rename a file, tar might skip it no matter of 
   hardlink detection (if readdir races with rename, you can read none of 
   the 
   names of file, one or both --- all these are possible).
   
   If you have dir1/a hardlinked to dir1/b and while tar runs you delete 
   both a and b and create totally new files dir2/c linked to 
   dir2/d, 
   tar might hardlink both c and d to a and b.
   
   No one guarantees you sane result of tar or cp -a while changing the 
   tree. 
   I don't see how is_samefile() could make it worse.
  
  There are several cases where changing the tree doesn't affect the
  correctness of the tar or cp -a result.  In some of these cases using
  samefile() instead of st_ino _will_ result in a corrupted result.
 
 Also note, that using st_ino in combination with samefile() doesn't
 make the result much better, it eliminates false positives, but cannot
 fix false negatives.

I'd argue false negatives are not as severe.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Finding hardlinks

2007-01-08 Thread Pavel Machek
Hi!

   No one guarantees you sane result of tar or cp -a while changing the 
   tree.
   I don't see how is_samefile() could make it worse.
  
   There are several cases where changing the tree doesn't affect the
   correctness of the tar or cp -a result.  In some of these cases using
   samefile() instead of st_ino _will_ result in a corrupted result.
  
  ... and those are what?
 
   - /a/p/x and /a/q/x are links to the same file
 
   - /b/y and /a/q/y are links to the same file
 
   - tar is running on /a
 
   - meanwhile the following commands are executed:
 
  mv /a/p/x /b/x
  mv /b/y /a/p/x
 
 With st_ino checking you'll get a perfectly consistent archive,
 regardless of the timing.  With samefile() you could get an archive
 where the data in /a/q/y is not stored, instead it will contain the
 data of /a/q/x.
 
 Note, this is far nastier than the normal corruption you usually get
 with changing the tree under tar, the file is not just duplicated or
 missing, it becomes a completely different file, even though it hasn't
 been touched at all during the archiving.
 
 The basic problem with samefile() is that it can only compare files at
 a single snapshot in time, and cannot take into account any changes in
 the tree (unless keeping files open, which is impractical).

 There's really no point trying to push for such an inferior interface
 when the problems which samefile is trying to address are purely
 theoretical.

Oh yes, there is. st_ino is powerful, *but impossible to implement*
on many filesystems. You are of course welcome to combine st_ino with
samefile.

 Currently linux is living with 32bit st_ino because of legacy apps,
 and people are not constantly agonizing about it.  Fixing the
 EOVERFLOW problem will enable filesystems to slowly move towards 64bit
 st_ino, which should be more than enough.

50% probability of false positive on 4G files seems like very ugly
design problem to me.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi
  There's really no point trying to push for such an inferior interface
  when the problems which samefile is trying to address are purely
  theoretical.
 
 Oh yes, there is. st_ino is powerful, *but impossible to implement*
 on many filesystems.

You mean POSIX compliance is impossible?  So what?  It is possible to
implement an approximation that is _at least_ as good as samefile().
One really dumb way is to set st_ino to the 'struct inode' pointer for
example.  That will sure as hell fit into 64bits and will give a
unique (alas not stable) identifier for each file.  Opening two files,
doing fstat() on them and comparing st_ino will give exactly the same
guarantees as samefile().
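
Something like this minimal sketch would give the same guarantee from
userspace, assuming only that st_ino stays unique while the inodes are
pinned by the open descriptors:

#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>

/* Open both paths first, then compare identity.  Keeping the fds open
 * pins the inodes, which is exactly the guarantee samefile(fd1, fd2)
 * would give.  Returns 1 for same object, 0 for different, -1 on error. */
static int is_samefile(const char *p1, const char *p2)
{
    struct stat st1, st2;
    int fd1, fd2, ret = -1;

    fd1 = open(p1, O_RDONLY);
    fd2 = open(p2, O_RDONLY);
    if (fd1 >= 0 && fd2 >= 0 &&
        fstat(fd1, &st1) == 0 && fstat(fd2, &st2) == 0)
        ret = (st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return ret;
}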

  Currently linux is living with 32bit st_ino because of legacy apps,
  and people are not constantly agonizing about it.  Fixing the
  EOVERFLOW problem will enable filesystems to slowly move towards 64bit
  st_ino, which should be more than enough.
 
 50% probability of false positive on 4G files seems like very ugly
 design problem to me.

4 billion files, each with more than one link is pretty far fetched.
And anyway, filesystems can take steps to prevent collisions, as they
do currently for 32bit st_ino, without serious difficulties
apparently.

Miklos


Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi
  You mean POSIX compliance is impossible?  So what?  It is possible to
  implement an approximation that is _at least_ as good as samefile().
  One really dumb way is to set st_ino to the 'struct inode' pointer for
  example.  That will sure as hell fit into 64bits and will give a
  unique (alas not stable) identifier for each file.  Opening two files,
  doing fstat() on them and comparing st_ino will give exactly the same
  guarantees as samefile().
 
 Good, ... except that it doesn't work. AFAIK, POSIX mandates inodes
 to be unique until umount, not until inode cache expires :-)
 
 IOW, if you have such implementation of st_ino, you can emulate samefile()
 with it, but you cannot have it without violating POSIX.

The whole discussion started out from the premise, that some
filesystems can't support stable unique inode numbers, i.e. they don't
conform to POSIX.

Filesystems which do conform to POSIX have _no need_ for samefile().
Ones that don't conform, can chose a scheme that is best suited to
applications need, balancing uniqueness and stability in various ways.

  4 billion files, each with more than one link is pretty far fetched.
 
 Not on terabyte scale disk arrays, which are getting quite common these days.
 
  And anyway, filesystems can take steps to prevent collisions, as they
  do currently for 32bit st_ino, without serious difficulties
  apparently.
 
 They currently do that usually by not supporting more than 4G files
 in a single FS.

And with 64bit st_ino, they'll have to live with the limitation of not
more than 2^64 files.  Tough luck ;)

Miklos


Re: Finding hardlinks

2007-01-08 Thread Martin Mares
Hello!

 You mean POSIX compliance is impossible?  So what?  It is possible to
 implement an approximation that is _at least_ as good as samefile().
 One really dumb way is to set st_ino to the 'struct inode' pointer for
 example.  That will sure as hell fit into 64bits and will give a
 unique (alas not stable) identifier for each file.  Opening two files,
 doing fstat() on them and comparing st_ino will give exactly the same
 guarantees as samefile().

Good, ... except that it doesn't work. AFAIK, POSIX mandates inodes
to be unique until umount, not until inode cache expires :-)

IOW, if you have such implementation of st_ino, you can emulate samefile()
with it, but you cannot have it without violating POSIX.

 4 billion files, each with more than one link is pretty far fetched.

Not on terabyte scale disk arrays, which are getting quite common these days.

 And anyway, filesystems can take steps to prevent collisions, as they
 do currently for 32bit st_ino, without serious difficulties
 apparently.

They currently do that usually by not supporting more than 4G files
in a single FS.

Have a nice fortnight
-- 
Martin `MJ' Mares  [EMAIL PROTECTED]   
http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
Oh no, not again!  -- The bowl of petunias


Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi
High probability is all you have.  Cosmic radiation hitting your
computer will more likely cause problems than colliding 64bit inode
numbers ;)
   
   Some of us have machines designed to cope with cosmic rays, and would be
   unimpressed with a decrease in reliability.
  
  With the suggested samefile() interface you'd get a failure with just
  about 100% reliability for any application which needs to compare
  more than a few files.  The fact is open files are _very_ expensive,
  no wonder they are limited in various ways.
  
  What should 'tar' do when it runs out of open files, while searching
  for hardlinks?  Should it just give up?  Then the samefile() interface
  would be _less_ reliable than the st_ino one by a significant margin.
 
 You need at most two simultaneously open files for examining any
 number of hardlinks. So yes, you can make it reliable.

Well, sort of.  Samefile without keeping fds open doesn't have any
protection against the tree changing underneath between first
registering a file and later opening it.  The inode number is more
useful in this respect.  In fact inode number + generation number will
give you a unique identifier in time as well, which is a _lot_ more
useful to determine if the file you are checking is actually the same
as one that you've come across previously.

So instead of samefile() I'd still suggest an extended attribute
interface which exports the file's unique (in space and time)
identifier as an opaque cookie.
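
A consumer of such an interface might look like the sketch below; the
attribute name is purely hypothetical, no such attribute is defined
anywhere today:

#include <sys/types.h>
#include <sys/xattr.h>
#include <string.h>

/* Hypothetical: compare the opaque "unique in space and time" cookies
 * of two files.  The attribute name is invented for illustration. */
#define ID_XATTR "system.file_handle"   /* not a real attribute */

static int same_object(const char *p1, const char *p2)
{
    char c1[256], c2[256];
    ssize_t n1 = getxattr(p1, ID_XATTR, c1, sizeof(c1));
    ssize_t n2 = getxattr(p2, ID_XATTR, c2, sizeof(c2));

    if (n1 < 0 || n2 < 0)
        return -1;              /* interface not available */
    return n1 == n2 && memcmp(c1, c2, n1) == 0;
}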

For filesystems like FAT you can basically only guarantee that two
files are the same as long as those files are in the icache, no matter
if you use samefile() or inode numbers.  Userpace _can_ make the
inodes stay in the cache by keeping the files open, which works for
samefile as well as checking by inode number.

Miklos


Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Trond Myklebust
On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote:
 Trond Myklebust wrote:
  Exactly where do you see us violating the close-to-open cache
  consistency guarantees?
  
 
 I haven't seen that. What I did see is cache inconsistency when opening
 the same file with different file descriptors when the filehandle changes.
 My testing shows that at least fsync and close fail with EIO when the 
 filehandle
 changed while there was dirty data in the cache and that's good. Still,
 not sharing the cache while the file is opened (even on different file
 descriptors by the same process) seems impractical.

Tough. I'm not going to commit to adding support for multiple
filehandles. The fact is that RFC3530 contains masses of rope with which
to allow server and client vendors to hang themselves. The fact that the
protocol claims support for servers that use multiple filehandles per
inode does not mean it is necessarily a good idea. It adds unnecessary
code complexity, it screws with server scalability (extra GETATTR calls
just in order to probe existing filehandles), and it is insufficiently
well documented in the RFC: SECINFO information is, for instance, given
out on a per-filehandle basis, does that mean that the server will have
different security policies? In some places, people haven't even started
to think about the consequences:

  If GETATTR directed to the two filehandles does not return the
  fileid attribute for both of the handles, then it cannot be
  determined whether the two objects are the same.  Therefore,
  operations which depend on that knowledge (e.g., client side data
  caching) cannot be done reliably.

This implies the combination is legal, but offers no indication as to
how you would match OPEN/CLOSE requests via different paths. AFAICS you
would have to do non-cached I/O with no share modes (i.e. NFSv3-style
special stateids). There is no way in hell we will ever support
non-cached I/O in NFS other than the special case of O_DIRECT.


...and no, I'm certainly not interested in fixing the RFC on this
point in any way other than getting this crap dropped from the spec. I
see no use for it at all.

Trond



Re: Finding hardlinks

2007-01-05 Thread Pavel Machek
Hi!

Some of us have machines designed to cope with cosmic rays, and would be
unimpressed with a decrease in reliability.
   
   With the suggested samefile() interface you'd get a failure with just
   about 100% reliability for any application which needs to compare
   more than a few files.  The fact is open files are _very_ expensive,
   no wonder they are limited in various ways.
   
   What should 'tar' do when it runs out of open files, while searching
   for hardlinks?  Should it just give up?  Then the samefile() interface
   would be _less_ reliable than the st_ino one by a significant margin.
  
  You need at most two simultaneously open files for examining any
  number of hardlinks. So yes, you can make it reliable.
 
 Well, sort of.  Samefile without keeping fds open doesn't have any
 protection against the tree changing underneath between first
 registering a file and later opening it.  The inode number is more

You only need to keep one-file-per-hardlink-group open during final
verification, checking that inode hashing produced reasonable results.

Pavel
-- 
Thanks for all the (sleeping) penguins.


Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi
 And does it matter? If you rename a file, tar might skip it no matter of 
 hardlink detection (if readdir races with rename, you can read none of the 
 names of file, one or both --- all these are possible).
 
 If you have dir1/a hardlinked to dir1/b and while tar runs you delete 
 both a and b and create totally new files dir2/c linked to dir2/d, 
 tar might hardlink both c and d to a and b.
 
 No one guarantees you sane result of tar or cp -a while changing the tree. 
 I don't see how is_samefile() could make it worse.

There are several cases where changing the tree doesn't affect the
correctness of the tar or cp -a result.  In some of these cases using
samefile() instead of st_ino _will_ result in a corrupted result.

Generally samefile() is _weaker_ than the st_ino interface in
comparing the identity of two files without using massive amounts of
memory.  You're searching for a better solution, not one that is
broken in a different way, aren't you?

Miklos


Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi
  And does it matter? If you rename a file, tar might skip it no matter of 
  hardlink detection (if readdir races with rename, you can read none of the 
  names of file, one or both --- all these are possible).
  
  If you have dir1/a hardlinked to dir1/b and while tar runs you delete 
  both a and b and create totally new files dir2/c linked to dir2/d, 
  tar might hardlink both c and d to a and b.
  
  No one guarantees you sane result of tar or cp -a while changing the tree. 
  I don't see how is_samefile() could make it worse.
 
 There are several cases where changing the tree doesn't affect the
 correctness of the tar or cp -a result.  In some of these cases using
 samefile() instead of st_ino _will_ result in a corrupted result.

Also note, that using st_ino in combination with samefile() doesn't
make the result much better, it eliminates false positives, but cannot
fix false negatives.

Miklos


Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Trond Myklebust
On Fri, 2007-01-05 at 10:40 -0600, Nicolas Williams wrote:
 What I don't understand is why getting the fileid is so hard -- always
 GETATTR when you GETFH and you'll be fine.  I'm guessing that's not as
 difficult as it is to maintain a hash table of fileids.

You've been sleeping in class. We always try to get the fileid together
with the GETFH. The irritating bit is having to redo a GETATTR using the
old filehandle in order to figure out if the 2 filehandles refer to the
same file. Unlike filehandles, fileids can be reused.

Then there is the matter of dealing with the fact that servers can (and do!)
actually lie to you.

Trond



RE: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Noveck, Dave
For now, I'm not going to address the controversial issues here,
mainly because I haven't decided how I feel about them yet.

 Whether allowing multiple filehandles per object is a good
 or even reasonably acceptable idea.

 What the fact that RFC3530 talks about it implies about what
 clients should do about the issue.

One thing that I hope is not controversial is that the v4.1 spec
should either get rid of this or make it clear and implementable.
I expect plenty of controversy about which of those to choose, but
hope that there isn't any about the proposition that we have to 
choose one of those two.

 SECINFO information is, for instance, given
 out on a per-filehandle basis, does that mean that the server will
have
 different security policies? 

Well yes, RFC3530 does say "The new SECINFO operation will allow the
client to determine, on a per filehandle basis", but I think that
just has to be considered as an error rather than indicating that if
you have two different filehandles for the same object, they can have
different security policies.  SECINFO in RFC3530 takes a directory fh
and a name, so if there are multiple filehandles for the object with
that name, there is no way for SECINFO to associate different policies
with different filehandles.  All it has is the name to go by.  I think
this should be corrected to "on a per-object basis" in the new spec no
matter what we do on other issues.

I think the principle here has to be that if we do allow multiple 
fh's to map to the same object, we require that they designate the 
same object, and thus it is not allowed for the server to act as if 
you have multiple different objects with different characteristics.

Similarly as to:

 In some places, people haven't even started
 to think about the consequences: 

 If GETATTR directed to the two filehandles does not return the
 fileid attribute for both of the handles, then it cannot be
 determined whether the two objects are the same.  Therefore,
 operations which depend on that knowledge (e.g., client side data
 caching) cannot be done reliably.

I think "they" (and maybe "they" includes me; I haven't checked the
history here) started to think about them, but went in a bad direction.

The implication here that you can have a different set of attributes
supported for the same object based on which filehandle is used to 
access the attributes is totally bogus.

The definition of supp_attr says "The bit vector which would retrieve
all mandatory and recommended attributes that are supported for this
object.  The scope of this attribute applies to all objects with a
matching fsid."  So having the same object have different attributes
supported based on the filehandle used or even two objects in the same
fs having different attributes supported, in particular having fileid
supported for one and not the other just isn't valid.

 The fact is that RFC3530 contains masses of rope with which
 to allow server and client vendors to hang themselves. 

If that means simply making poor choices, then OK.  But if there are 
other cases where you feel that the specification of a feature is simply
incoherent and the consequences not really thought out, then I think
we need to discuss them and not propagate that state of affairs to v4.1.

-Original Message-
From: Trond Myklebust [mailto:[EMAIL PROTECTED] 
Sent: Friday, January 05, 2007 5:29 AM
To: Benny Halevy
Cc: Jan Harkes; Miklos Szeredi; nfsv4@ietf.org;
linux-kernel@vger.kernel.org; Mikulas Patocka;
linux-fsdevel@vger.kernel.org; Jeff Layton; Arjan van de Ven
Subject: Re: [nfsv4] RE: Finding hardlinks


On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote:
 Trond Myklebust wrote:
  Exactly where do you see us violating the close-to-open cache
  consistency guarantees?
  
 
 I haven't seen that. What I did see is cache inconsistency when opening
 the same file with different file descriptors when the filehandle changes.
 My testing shows that at least fsync and close fail with EIO when the filehandle
 changed while there was dirty data in the cache and that's good. Still,
 not sharing the cache while the file is opened (even on different file
 descriptors by the same process) seems impractical.

Tough. I'm not going to commit to adding support for multiple
filehandles. The fact is that RFC3530 contains masses of rope with which
to allow server and client vendors to hang themselves. The fact that the
protocol claims support for servers that use multiple filehandles per
inode does not mean it is necessarily a good idea. It adds unnecessary
code complexity, it screws with server scalability (extra GETATTR calls
just in order to probe existing filehandles), and it is insufficiently
well documented in the RFC: SECINFO information is, for instance, given
out on a per-filehandle basis, does that mean that the server will have
different security policies? In some places, people haven't even started
to think about the consequences

RFC: Stable inodes for inode-less filesystems (was: Finding hardlinks)

2007-01-05 Thread Bodo Eggert
Pavel Machek [EMAIL PROTECTED] wrote:

 Another idea is to export the filesystem internal ID as an arbitrary
 length cookie through the extended attribute interface.  That could be
 stored/compared by the filesystem quite efficiently.
 
 How will that work for FAT?

 Or maybe we can relax that inode may not change over rename and
 zero length files need unique inode numbers...

I didn't look into the code, and I'm not experienced in writing (linux)
fs, but I have an idea I'd like to share. Maybe it's not that bad ...

(I'm going to type about inode numbers, since having constant inodes
 is desired and the extended attribute would only be an aid if the
 inode is too small.)

IIRC, no cluster is reserved for empty files on FAT; if I'm wrong, it'll
be much easier, you would just use the cluster-number (less than 32 bit).

The basic idea is to use a different inode range for non-empty and empty
files. This will result in the inode possibly changing after close()* or
on rename(empty1, empty2). OTOH it will keep a stable inode for non-empty
files and for newly written files** if they aren't stat()ed before writing
the first byte. I'm not sure if it's better than changing inodes after
$randomtime, but I just made a quick strace on gtar, rsync and cp -a;
they don't look at the dest inode before it would change (or at all).

(If this idea is applied to iso9660, the hard problem will be finding the
 number of hardlinked files for one location)

Changing the inode# on the last close* can be done by purging the cache
if the file is empty XOR the file has an inode# from the empty-range.
(That should be the same operation as done by unlink()?)
A new open(), stat() or readdir should create the correct kind of inode#.

*) You can optionally just wait for the inode to expire, but you need to
   keep the associated reservation(s) until then. I don't expect any
   visible effect from doing this, but see ** from the next paragraph
   on how to minimize the effect. The reserved directory entry (see far
   below in this text) is 32 Bytes, but the fragmentation may be bad.
**) which are empty on open() and therefore don't yet have a stable inode#
   Those inode numbers will appear to be stable because nobody saw them
   change. It's safe to change the inode# because by reserving disk space,
   we got a unique inode#. I hope the kernel side allows this ...


For non-empty files, you can use the cluster-number (start address), it's
unique, and won't change on rename. It will, however, change on emptying
or writing to an empty file. If you write to an empty file, no special
handling is required, all reservations are implicit*. If you empty a file,
you'll have to keep the first cluster reserved** until it's closed,
otherwise you'd risk an inode collision.

*) since the inode# doesn't change yet, you'll still have to treat it like
   an empty file while unlinking or renaming.
**) It's OK to reuse it if it's in the middle of a file, so you may
optionally keep a list of these clusters and not start files there
instead of reserving the space. OTOH, it's more work.


Empty files will be a PITA with 32-bit-inodes, since a full-sized FAT32 can
have about 2^38 empty files*. (The extended attribute would work as described
below.) You can, however, generate inode numbers for empty files, risking
collisions. This requires all generated inode numbers to be above 0x40000000
(or above the number of clusters on disk).

*) 8 TB divided by 32 B / directory entry

With 64-bit values, you can generate a unique inode for empty files
using cluster#-of-dir | 0x80000000 | (index_in_dir << 32). The downside
is, it will change on cross-directory-renames and may change on in-
directory-renames. If this happens to an open file, you'll need to
make sure the old inode# is not reused by reserving that directory
entry, since the inode# can't change for open files.
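
Putting the above into C-like form (a sketch of the derivation described
here only; the constants and helper names are purely illustrative):

#include <stdint.h>

#define EMPTY_RANGE_FLAG 0x80000000ULL  /* marks the "empty file" inode range */

/* Non-empty file: the start cluster is unique and rename-stable. */
static uint64_t ino_for_nonempty(uint32_t start_cluster)
{
    return start_cluster;
}

/* Empty file: derive the inode from the directory's cluster and the
 * index of the directory entry.  This changes on cross-directory
 * rename, hence the reservation dance for open files. */
static uint64_t ino_for_empty(uint32_t dir_cluster, uint32_t index_in_dir)
{
    return (uint64_t)dir_cluster | EMPTY_RANGE_FLAG
           | ((uint64_t)index_in_dir << 32);
}

/* Recover the pieces, e.g. for [un]reserve_directory_entry(). */
static uint32_t empty_ino_dir_cluster(uint64_t ino) { return ino & 0x7fffffff; }
static uint32_t empty_ino_index(uint64_t ino)       { return ino >> 32; }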


extra operations on final close:
if empty inode:
 if !empty
  unreserve_directory_entry(inode & 0x7fffffff, inode >> 32)
  uncache inode (will change inode#)
  stop
  if unreserve_directory_entry(inode & 0x7fffffff, inode >> 32)
  uncache inode
if non-empty inode
 if empty
  free start cluster
  uncache inode

extra operations on unlink/rename:
if empty inode:
 if can_use_current_inode#_for_dest
  do it
  unreserve_directory_entry(inode & 0x7fffffff, inode >> 32)
  // because of mv a/empty b/empty; mv b/empty a/empty
 else if is_open_file
  // the current inode won't fit the new location:
  reserve_directory_entry(old_inode & 0x7fffffff, inode >> 32)

extra operations on truncate
if non-empty inode && empty_after_truncate
 exempt start cluster from being freed,
 or put it on a list of non-startclusters

extra operation on extend
if empty inode && nobody did e.g. stat() after opening this file
 silently change inode, nobody will notice. Racy? Possible?


Required data in filehandle:
 Location of directory entry (d.e. contains inode information)
  (this shouldn't be new data?)
 stat-flag (possibly one per 

Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Trond Myklebust
On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
 I sincerely expect you or anybody else for this matter to try to provide
 feedback and object to the protocol specification in case they disagree
 with it (or think it's ambiguous or self contradicting) rather than ignoring
 it and implementing something else. I think we're shooting ourselves in the
 foot when doing so and it is in our common interest to strive to reach a
 realistic standard we can all comply with and interoperate with each other.

You are reading the protocol wrong in this case.

While the protocol does allow the server to implement the behaviour that
you've been advocating, it in no way mandates it. Nor does it mandate
that the client should gather files with the same (fsid,fileid) and
cache them together. Those are issues to do with _implementation_, and
are thus beyond the scope of the IETF.

In our case, the client will ignore the unique_handles attribute. It
will use filehandles as our inode cache identifier. It will not jump
through hoops to provide caching semantics that go beyond close-to-open
for servers that set unique_handles to false.

Trond



Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Benny Halevy

Trond Myklebust wrote:
 On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
 I sincerely expect you or anybody else for this matter to try to provide
 feedback and object to the protocol specification in case they disagree
 with it (or think it's ambiguous or self contradicting) rather than ignoring
 it and implementing something else. I think we're shooting ourselves in the
 foot when doing so and it is in our common interest to strive to reach a
 realistic standard we can all comply with and interoperate with each other.
 
 You are reading the protocol wrong in this case.

Obviously we interpret it differently and that by itself calls for considering
clarification of the text :)

 
 While the protocol does allow the server to implement the behaviour that
 you've been advocating, it in no way mandates it. Nor does it mandate
 that the client should gather files with the same (fsid,fileid) and
 cache them together. Those are issues to do with _implementation_, and
 are thus beyond the scope of the IETF.
 
 In our case, the client will ignore the unique_handles attribute. It
 will use filehandles as our inode cache identifier. It will not jump
 through hoops to provide caching semantics that go beyond close-to-open
 for servers that set unique_handles to false.

I agree that the way the client implements its cache is out of the protocol
scope. But how do you interpret "correct behavior" in section 4.2.1?
 Clients MUST use filehandle comparisons only to improve performance, not for 
correct behavior. All clients need to be prepared for situations in which it 
cannot be determined whether two filehandles denote the same object and in such 
cases, avoid making invalid assumptions which might cause incorrect behavior.
Don't you consider data corruption due to cache inconsistency an incorrect 
behavior?

Benny


Re: Finding hardlinks

2007-01-04 Thread Nikita Danilov
Mikulas Patocka writes:
BTW. How does ReiserFS find that a given inode number (or object ID in
ReiserFS terminology) is free before assigning it to new file/directory?
  
   reiserfs v3 has an extent map of free object identifiers in
   super-block.
  
  Inode free space can have at most 2^31 extents --- if inode numbers 
  alternate between allocated, free. How do you pack it to superblock?

In the worst case, when free/used extents are small, some free oids are
leaked, but this has never been a problem in practice. In fact, there
was a patch for reiserfs v3 to store this map in a special hidden file, but
it wasn't included in mainline, as nobody ever complained about oid map
fragmentation.
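
(For illustration only: such a free-oid map is conceptually just a sorted
list of extents. A toy allocator over such a map, nothing reiserfs-specific:

#include <stdint.h>

struct oid_extent {
    uint64_t start;             /* first free oid in the extent */
    uint64_t len;               /* number of free oids */
};

/* Take one oid from the first non-empty extent.  When the extents no
 * longer fit the on-disk map, an allocator like this simply "leaks"
 * the smallest extents, which is the trade-off described above. */
static uint64_t oid_alloc(struct oid_extent *map, unsigned int n)
{
    unsigned int i;

    for (i = 0; i < n; i++) {
        if (map[i].len == 0)
            continue;
        map[i].len--;
        return map[i].start++;
    }
    return 0;                   /* 0 == map exhausted (illustrative) */
}
)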

  
   reiser4 used 64 bit object identifiers without reuse.
  
  So you are going to hit the same problem as I did with SpadFS --- you 
  can't export 64-bit inode number to userspace (programs without 
  -D_FILE_OFFSET_BITS=64 will have stat() randomly failing with EOVERFLOW 
  then) and if you export only 32-bit number, it will eventually wrap-around 
  and colliding st_ino will cause data corruption with many userspace 
  programs.

Indeed, this is a fundamental problem. Reiser4 tries to ameliorate it by
using a hash function that starts colliding only when there are billions
of files, in which case a 32bit inode number is screwed anyway.

Note, that none of the above problems invalidates reasons for having
long in-kernel inode identifiers that I outlined in other message.

  
  Mikulas

Nikita.



Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Peter Staubach

Bryan Henderson wrote:

Clients MUST use filehandle comparisons only to improve
performance, not for correct behavior. All clients need to
be prepared for situations in which it cannot be determined
whether two filehandles denote the same object and in such
cases, avoid making invalid assumptions which might cause
incorrect behavior.

Don't you consider data corruption due to cache inconsistency an
incorrect behavior?

Exactly where do you see us violating the close-to-open cache
consistency guarantees?



Let me add the information that Trond is implying:  His answer is yes, he 
doesn't consider data corruption due to cache inconsistency to be 
incorrect behavior.  And the reason is that, contrary to what one would 
expect, NFS allows that (for reasons of implementation practicality).  It 
says when you open a file via an NFS client and read it via that open 
instance, you can legally see data as old as the moment you opened it. 
Ergo, you can't use NFS in cases where that would cause unacceptable data 
corruption.


We normally think of this happening when a different client updates the 
file, in which case there's no practical way for the reading client to 
know his cache is stale.  When the updater and reader use the same client, 
we can do better, but if I'm not mistaken, the NFS protocol does not 
require us to do so.  And probably more relevant: the user wouldn't expect 
cache consistency.


This last is especially true; the expectations for use of NFS mounted
file systems are pretty well known and have been set by years of
experience.

A workaround is provided for cooperating processes which need stronger
consistency than the normal guarantees and that is file/record locking.
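
A minimal sketch of that workaround, using plain POSIX advisory record
locks (on NFS, taking and dropping such locks is also what typically
triggers the cache flush/revalidation that gives the stronger
consistency):

#include <fcntl.h>
#include <unistd.h>

/* Lock the whole file for writing, blocking until the other
 * cooperating process releases its lock (POSIX advisory locking). */
static int lock_whole_file(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;               /* 0 means "to end of file" */
    return fcntl(fd, F_SETLKW, &fl);
}

static int unlock_whole_file(int fd)
{
    struct flock fl = {0};

    fl.l_type = F_UNLCK;
    fl.l_whence = SEEK_SET;
    fl.l_start = 0;
    fl.l_len = 0;
    return fcntl(fd, F_SETLKW, &fl);
}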

   Thanx...

  ps


Re: Finding hardlinks

2007-01-04 Thread Pavel Machek
Hi!

   High probability is all you have.  Cosmic radiation hitting your
   computer will more likely cause problems than colliding 64bit inode
   numbers ;)
  
  Some of us have machines designed to cope with cosmic rays, and would be
  unimpressed with a decrease in reliability.
 
 With the suggested samefile() interface you'd get a failure with just
 about 100% reliability for any application which needs to compare
 more than a few files.  The fact is open files are _very_ expensive,
 no wonder they are limited in various ways.
 
 What should 'tar' do when it runs out of open files, while searching
 for hardlinks?  Should it just give up?  Then the samefile() interface
 would be _less_ reliable than the st_ino one by a significant margin.

You need at most two simultaneously open files for examining any
number of hardlinks. So yes, you can make it reliable.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Finding hardlinks

2007-01-03 Thread Pavel Machek
Hi!

   the use of a good hash function.  The chance of an accidental
   collision is infinitesimally small.  For a set of 
   
100 files: 0.03%
  1,000,000 files: 0.03%
  
  I do not think we want to play with probability like this. I mean...
  imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
  unreasonable, and collision probability is going to be ~100% due to
  birthday paradox.
  
  You'll still want to back up your 4TB server...
 
 Certainly, but tar isn't going to remember all the inode numbers.
 Even if you solve the storage requirements (not impossible) it would
 have to do (4e9^2)/2=8e18 comparisons, which computers don't have
 enough CPU power just yet.

Storage requirements would be 16GB of RAM... that's small enough. If
you sort, you'll only need 32*2^32 comparisons, and that's doable.

I do not claim it is _likely_. You'd need hardlinks, as you
noticed. But system should work, not work with high probability, and
I believe we should solve this in long term.
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Finding hardlinks

2007-01-03 Thread Miklos Szeredi
the use of a good hash function.  The chance of an accidental
collision is infinitesimally small.  For a set of 

 100 files: 0.03%
   1,000,000 files: 0.03%
   
   I do not think we want to play with probability like this. I mean...
   imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
   unreasonable, and collision probability is going to be ~100% due to
   birthday paradox.
   
   You'll still want to back up your 4TB server...
  
  Certainly, but tar isn't going to remember all the inode numbers.
  Even if you solve the storage requirements (not impossible) it would
  have to do (4e9^2)/2=8e18 comparisons, which computers don't have
  enough CPU power just yet.
 
 Storage requirements would be 16GB of RAM... that's small enough. If
 you sort, you'll only need 32*2^32 comparisons, and that's doable.
 
 I do not claim it is _likely_. You'd need hardlinks, as you
 noticed. But system should work, not work with high probability, and
 I believe we should solve this in long term.

High probability is all you have.  Cosmic radiation hitting your
computer will more likely cause problems than colliding 64bit inode
numbers ;)

But you could add a new interface for the extra paranoid.  The
proposed 'samefile(fd1, fd2)' syscall is severely limited by the heavy
weight of file descriptors.

Another idea is to export the filesystem internal ID as an arbitrary
length cookie through the extended attribute interface.  That could be
stored/compared by the filesystem quite efficiently.

But I think most apps will still opt for the portable interfaces which,
while not perfect, are good enough.

Miklos


Re: [nfsv4] RE: Finding hardlinks

2007-01-03 Thread Benny Halevy
Trond Myklebust wrote:
 On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote:
 Trond Myklebust wrote:
  
 On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
 Mikulas Patocka wrote:
 BTW. how does (or how should?) NFS client deal with cache coherency if 
 filehandles for the same file differ?

 Trond can probably answer this better than me...
 As I read it, currently the nfs client matches both the fileid and the
 filehandle (in nfs_find_actor). This means that different filehandles
 for the same file would result in different inodes :(.
 Strictly following the nfs protocol, comparing only the fileid should
 be enough IF fileids are indeed unique within the filesystem.
 Comparing the filehandle works as a workaround when the exported filesystem
 (or the nfs server) violates that.  From a user stand point I think that
 this should be configurable, probably per mount point.
 Matching files by fileid instead of filehandle is a lot more trouble
 since fileids may be reused after a file has been deleted. Every time
 you look up a file, and get a new filehandle for the same fileid, you
 would at the very least have to do another GETATTR using one of the
 'old' filehandles in order to ensure that the file is the same object as
 the one you have cached. Then there is the issue of what to do when you
 open(), read() or write() to the file: which filehandle do you use, are
 the access permissions the same for all filehandles, ...

 All in all, much pain for little or no gain.
 See my answer to your previous reply.  It seems like the current
 implementation is in violation of the nfs protocol and the extra pain
 is required.
 
 ...and we should care because...?
 
 Trond
 

Believe it or not, but server companies like Panasas try to follow the standard
when designing and implementing their products while relying on client vendors
to do the same.

I sincerely expect you or anybody else for this matter to try to provide
feedback and object to the protocol specification in case they disagree
with it (or think it's ambiguous or self contradicting) rather than ignoring
it and implementing something else. I think we're shooting ourselves in the
foot when doing so and it is in our common interest to strive to reach a
realistic standard we can all comply with and interoperate with each other.

Benny



Re: Finding hardlinks

2007-01-03 Thread Pavel Machek
Hi!

 the use of a good hash function.  The chance of an accidental
 collision is infinitesimally small.  For a set of 
 
  100 files: 0.03%
1,000,000 files: 0.03%

I do not think we want to play with probability like this. I mean...
imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
unreasonable, and collision probability is going to be ~100% due to
birthday paradox.

You'll still want to back up your 4TB server...
   
   Certainly, but tar isn't going to remember all the inode numbers.
   Even if you solve the storage requirements (not impossible) it would
   have to do (4e9^2)/2=8e18 comparisons, which computers don't have
   enough CPU power just yet.
  
  Storage requirements would be 16GB of RAM... that's small enough. If
  you sort, you'll only need 32*2^32 comparisons, and that's doable.
  
  I do not claim it is _likely_. You'd need hardlinks, as you
  noticed. But system should work, not work with high probability, and
  I believe we should solve this in long term.
 
 High probability is all you have.  Cosmic radiation hitting your
 computer will more likely cause problems than colliding 64bit inode
 numbers ;)

As I have shown... no, that's not right. 32*2^32 operations is small
enough not to have problems with cosmic radiation.

 But you could add a new interface for the extra paranoid.  The
 proposed 'samefile(fd1, fd2)' syscall is severely limited by the heavy
 weight of file descriptors.

I guess that is the way to go. samefile(path1, path2) is unfortunately
inherently racy.

 Another idea is to export the filesystem internal ID as an arbitrary
 length cookie through the extended attribute interface.  That could be
 stored/compared by the filesystem quite efficiently.

How will that work for FAT?

Or maybe we can relax that inode may not change over rename and
zero length files need unique inode numbers...

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: Finding hardlinks

2007-01-03 Thread Martin Mares
Hello!

 High probability is all you have.  Cosmic radiation hitting your
 computer will more likely cause problems than colliding 64bit inode
 numbers ;)

No.

If you assign 64-bit inode numbers randomly, 2^32 of them are sufficient
to generate a collision with probability around 50%.
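
For reference, the standard birthday bound behind that figure: with n
inode numbers drawn uniformly at random from N = 2^64 possible values,

    P(collision) ~= 1 - exp(-n*(n-1) / (2*N))

so for n = 2^32 the exponent is about -1/2 and P ~= 1 - e^(-1/2) ~= 0.39,
and only slightly more than 2^32 numbers pushes it past 50%.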

Have a nice fortnight
-- 
Martin `MJ' Mares  [EMAIL PROTECTED]   
http://mj.ucw.cz/
Faculty of Math and Physics, Charles University, Prague, Czech Rep., Earth
A Bash poem: time for echo in canyon; do echo $echo $echo; done


Re: Finding hardlinks

2007-01-03 Thread Matthew Wilcox
On Wed, Jan 03, 2007 at 01:33:31PM +0100, Miklos Szeredi wrote:
 High probability is all you have.  Cosmic radiation hitting your
 computer will more likely cause problems than colliding 64bit inode
 numbers ;)

Some of us have machines designed to cope with cosmic rays, and would be
unimpressed with a decrease in reliability.


Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote:
 
 I didn't hardlink directories, I just patched stat, lstat and fstat to 
 always return st_ino == 0 --- and I've seen those failures. These failures 
 are going to happen on non-POSIX filesystems in real world too, very 
 rarely.

I don't want to spoil your day but testing with st_ino==0 is a bad choice
because it is a special number. Anyway, one can only find breakage,
not prove that all the other programs handle this correctly so this is
kind of pointless.

On any decent filesystem st_ino should uniquely identify an object and
reliably provide hardlink information. The UNIX world has relied upon this
for decades. A filesystem with st_ino collisions without being hardlinked
(or the other way around) needs a fix.

Synthetic filesystems such as /proc are special due to their dynamic
nature and I think st_ino uniqueness is far more important than being able
to provide hardlinks there. Most tree handling programs (cp, rm, ...)
break horribly when the tree underneath changes at the same time.

-- 
Frank


Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka



On Wed, 3 Jan 2007, Frank van Maarseveen wrote:


On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote:


I didn't hardlink directories, I just patched stat, lstat and fstat to
always return st_ino == 0 --- and I've seen those failures. These failures
are going to happen on non-POSIX filesystems in real world too, very
rarely.


I don't want to spoil your day but testing with st_ino==0 is a bad choice
because it is a special number. Anyway, one can only find breakage,
not prove that all the other programs handle this correctly so this is
kind of pointless.

On any decent filesystem st_ino should uniquely identify an object and
reliably provide hardlink information. The UNIX world has relied upon this
for decades. A filesystem with st_ino collisions without being hardlinked
(or the other way around) needs a fix.


... and that's the problem --- the UNIX world specified something that
isn't implementable in the real world.


You can take a closed box and say "this is POSIX certified" --- but how
useful would such a box be if you can't access CDs, diskettes and USB
sticks with it?


Mikulas
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka

I didn't hardlink directories, I just patched stat, lstat and fstat to
always return st_ino == 0 --- and I've seen those failures. These failures
are going to happen on non-POSIX filesystems in real world too, very
rarely.


I don't want to spoil your day but testing with st_ino==0 is a bad choice
because it is a special number. Anyway, one can only find breakage,
not prove that all the other programs handle this correctly so this is
kind of pointless.

On any decent filesystem st_ino should uniquely identify an object and
reliably provide hardlink information. The UNIX world has relied upon this
for decades. A filesystem with st_ino collisions without being hardlinked
(or the other way around) needs a fix.


... and that's the problem --- the UNIX world specified something that
isn't implementable in the real world.


Sure it is. Numerous popular POSIX filesystems do that. There is a lot of
inode number space in 64 bit (of course it is a matter of time for it to
jump to 128 bit and more)


If the filesystem was designed by someone not from Unix world (FAT, SMB, 
...), then not. And users still want to access these filesystems.


64-bit inode number space is not yet implemented on Linux --- the problem
is that if you return ino >= 2^32, programs compiled without
-D_FILE_OFFSET_BITS=64 will fail with stat() returning -EOVERFLOW --- this
failure is specified in POSIX, but not very useful.


Mikulas
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-03 Thread Bryan Henderson
On any decent filesystem st_ino should uniquely identify an object and
reliably provide hardlink information. The UNIX world has relied upon 
this
for decades. A filesystem with st_ino collisions without being hardlinked
(or the other way around) needs a fix.

But for at least the last of those decades, filesystems that could not do 
that were not uncommon.  They had to present 32 bit inode numbers and 
either allowed more than 4G files or just didn't have the means of 
assigning inode numbers with the proper uniqueness to files.  And the sky 
did not fall.  I don't have an explanation why, but it makes it look to me 
like there are worse things than not having total one-one correspondence 
between inode numbers and files.  Having a stat or mount fail because 
inodes are too big, having fewer than 4G files, and waiting for the 
filesystem to generate a suitable inode number might fall in that 
category.

I fully agree that much effort should be put into making inode numbers 
work the way POSIX demands, but I also know that that sometimes requires 
more than just writing some code.

--
Bryan Henderson   San Jose California
IBM Almaden Research Center   Filesystems

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Wed, Jan 03, 2007 at 01:09:41PM -0800, Bryan Henderson wrote:
 On any decent filesystem st_ino should uniquely identify an object and
 reliably provide hardlink information. The UNIX world has relied upon 
 this
 for decades. A filesystem with st_ino collisions without being hardlinked
 (or the other way around) needs a fix.
 
 But for at least the last of those decades, filesystems that could not do 
 that were not uncommon.  They had to present 32 bit inode numbers and 
 either allowed more than 4G files or just didn't have the means of 
 assigning inode numbers with the proper uniqueness to files.  And the sky 
 did not fall.  I don't have an explanation why,

I think it's mostly high end use and high end users tend to understand
more. But we're going to see more really large filesystems in normal
use so..

Currently, large file support is already necessary to handle dvd and
video. It's also useful for images for virtualization. So the failing stat()
calls should already be a thing of the past with modern distributions.

-- 
Frank
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-03 Thread Pavel Machek
Hi!

 Sure it is. Numerous popular POSIX filesystems do that. There is a lot of
 inode number space in 64 bit (of course it is a matter of time for it to
 jump to 128 bit and more)
 
 If the filesystem was designed by someone not from Unix world (FAT, SMB, 
 ...), then not. And users still want to access these filesystems.
 
 64-bit inode number space is not yet implemented on Linux --- the problem
 is that if you return ino >= 2^32, programs compiled without
 -D_FILE_OFFSET_BITS=64 will fail with stat() returning -EOVERFLOW --- this
 failure is specified in POSIX, but not very useful.

Hehe, can we simply -EOVERFLOW on VFAT all the time? ...probably not
useful :-(. But the ability to say "unknown" in the st_ino field would
help...

Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Thu, Jan 04, 2007 at 12:43:20AM +0100, Mikulas Patocka wrote:
 On Wed, 3 Jan 2007, Frank van Maarseveen wrote:
 Currently, large file support is already necessary to handle dvd and
 video. It's also useful for images for virtualization. So the failing 
 stat()
 calls should already be a thing of the past with modern distributions.
 
 As long as glibc compiles by default with 32-bit ino_t, the problem exists 
 and is severe --- programs handling large files, such as coreutils, tar, 
 mc, mplayer, already compile with 64-bit ino_t and off_t, but the user (or 
 script) may type something like:
 
 cat >file.c <<EOF
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <fcntl.h>
 #include <stdlib.h>
 #include <unistd.h>
 int main(void)
 {
   int h;
   struct stat st;
   if ((h = creat("foo", 0600)) < 0) perror("creat"), exit(1);
   if (fstat(h, &st)) perror("stat"), exit(1);
   close(h);
   return 0;
 }
 EOF
 gcc file.c; ./a.out
 
 --- and you certainly do not want this to fail (unless you are out of disk 
 space).
 
 The difference is, that with 32-bit program and 64-bit off_t, you get 
 deterministic failure on large files, with 32-bit program and 64-bit 
 ino_t, you get random failures.
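
[A hedged aside on the build-flag difference being discussed: built as shown
with a plain "gcc file.c", the program above gets a 32-bit ino_t/off_t and
can hit the EOVERFLOW failure; building the same source with
"gcc -D_FILE_OFFSET_BITS=64 file.c" switches it to 64-bit ino_t/off_t, which
is how coreutils, tar and the other large-file-aware tools mentioned above
avoid it.]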

What's (technically) the problem with changing the gcc default?

Alternatively we could make the error deterministic in various ways. Start
st_ino numbering from 4G (except for a few special ones maybe such
as root/mounts). Or make old and new programs look differently at the
ELF level or by sys_personality() and/or check against a ino64 mount
flag/filesystem feature. Lots of possibilities.

-- 
Frank
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [nfsv4] RE: Finding hardlinks

2007-01-03 Thread Trond Myklebust
On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
 Believe it or not, but server companies like Panasas try to follow the 
 standard
 when designing and implementing their products while relying on client vendors
 to do the same.

I personally have never given a rats arse about standards if they make
no sense to me. If the server is capable of knowing about hard links,
then why does it need all this extra crap in the filehandle that just
obfuscates the hard link info?

The bottom line is that nothing in our implementation will result in
such a server performing sub-optimally w.r.t. the client. The only
result is that we will conform to close-to-open semantics instead of
strict POSIX caching semantics when two processes have opened the same
file via different hard links.

 I sincerely expect you or anybody else for this matter to try to provide
 feedback and object to the protocol specification in case they disagree
 with it (or think it's ambiguous or self contradicting) rather than ignoring
 it and implementing something else. I think we're shooting ourselves in the
 foot when doing so and it is in our common interest to strive to reach a
 realistic standard we can all comply with and interoperate with each other.

This has nothing to do with the protocol itself: it has only to do with
caching semantics. As far as caching goes, the only guarantees that NFS
clients give are the close-to-open semantics, and this should indeed be
respected by the implementation in question.

Trond

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-02 Thread Pavel Machek
Hi!

   It seems like the posix idea of unique st_dev, st_ino doesn't
   hold water for modern file systems 
   
   are you really sure?
  
  Well Jan's example was of Coda that uses 128-bit internal file ids.
  
   and if so, why don't we fix *THAT* instead
  
  Hmm, sometimes you can't fix the world, especially if the filesystem
  is exported over NFS and has a problem with fitting its file IDs uniquely
  into a 64-bit identifier.
 
 Note, it's pretty easy to fit _anything_ into a 64-bit identifier with
 the use of a good hash function.  The chance of an accidental
 collision is infinitesimally small.  For a set of 
 
  100 files: 0.03%
1,000,000 files: 0.03%

I do not think we want to play with probability like this. I mean...
imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
unreasonable, and collision probability is going to be ~100% due to
birthday paradox.

You'll still want to back up your 4TB server...

Pavel
-- 
Thanks for all the (sleeping) penguins.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-02 Thread Miklos Szeredi
It seems like the posix idea of unique st_dev, st_ino doesn't
hold water for modern file systems 

are you really sure?
   
   Well Jan's example was of Coda that uses 128-bit internal file ids.
   
and if so, why don't we fix *THAT* instead
   
   Hmm, sometimes you can't fix the world, especially if the filesystem
   is exported over NFS and has a problem with fitting its file IDs uniquely
   into a 64-bit identifier.
  
  Note, it's pretty easy to fit _anything_ into a 64-bit identifier with
  the use of a good hash function.  The chance of an accidental
  collision is infinitesimally small.  For a set of 
  
   100 files: 0.03%
 1,000,000 files: 0.03%
 
 I do not think we want to play with probability like this. I mean...
 imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
 unreasonable, and collision probability is going to be ~100% due to
 birthday paradox.
 
 You'll still want to back up your 4TB server...

Certainly, but tar isn't going to remember all the inode numbers.
Even if you solve the storage requirements (not impossible) it would
have to do (4e9^2)/2=8e18 comparisons, which computers don't have
enough CPU power just yet.

It doesn't matter if there are collisions within the filesystem, as
long as there are no collisions between the set of files an
application is working on at the same time.

Miklos
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-02 Thread Mikulas Patocka



On Tue, 2 Jan 2007, Miklos Szeredi wrote:


It seems like the posix idea of unique st_dev, st_ino doesn't
hold water for modern file systems


are you really sure?


Well Jan's example was of Coda that uses 128-bit internal file ids.


and if so, why don't we fix *THAT* instead


Hmm, sometimes you can't fix the world, especially if the filesystem
is exported over NFS and has a problem with fitting its file IDs uniquely
into a 64-bit identifier.


Note, it's pretty easy to fit _anything_ into a 64-bit identifier with
the use of a good hash function.  The chance of an accidental
collision is infinitesimally small.  For a set of

 100 files: 0.03%
   1,000,000 files: 0.03%


I do not think we want to play with probability like this. I mean...
imagine 4G files, 1KB each. That's 4TB disk space, not _completely_
unreasonable, and collision probability is going to be ~100% due to
birthday paradox.

You'll still want to back up your 4TB server...


Certainly, but tar isn't going to remember all the inode numbers.
Even if you solve the storage requirements (not impossible) it would
have to do (4e9^2)/2=8e18 comparisons, which computers don't have
enough CPU power just yet.


It is remembering all inode numbers with nlink > 1 and many other tools
are remembering all directory inode numbers (see my other post on this
topic). It of course doesn't compare each number with all others, it is
using hashing.
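
For concreteness, a minimal user-space sketch of that bookkeeping (purely
illustrative: the tiny fixed-size table and the names are made up, not taken
from tar or cp).  Record (st_dev, st_ino) only for entries with st_nlink > 1,
and treat a later hit on the same pair as a hard link.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

struct seen { dev_t dev; ino_t ino; char *path; struct seen *next; };
static struct seen *table[4096];        /* tiny hash table, no resizing */

static unsigned slot(dev_t d, ino_t i) { return (unsigned)(d ^ i) % 4096; }

/* Return the previously recorded path for this (dev, ino), or record it
 * and return NULL.  Files with a link count of 1 are never remembered. */
static const char *hardlink_of(const char *path, const struct stat *st)
{
    struct seen *s;
    unsigned h;

    if (st->st_nlink <= 1)
        return NULL;
    h = slot(st->st_dev, st->st_ino);
    for (s = table[h]; s; s = s->next)
        if (s->dev == st->st_dev && s->ino == st->st_ino)
            return s->path;             /* same (dev, ino): assume hard link */
    s = malloc(sizeof *s);              /* no error checking in this sketch */
    s->dev = st->st_dev;
    s->ino = st->st_ino;
    s->path = strdup(path);
    s->next = table[h];
    table[h] = s;
    return NULL;
}

int main(int argc, char **argv)
{
    struct stat st;
    int i;

    for (i = 1; i < argc; i++) {
        const char *first;
        if (lstat(argv[i], &st)) { perror(argv[i]); continue; }
        first = hardlink_of(argv[i], &st);
        if (first)
            printf("%s is a hardlink of %s\n", argv[i], first);
        else
            printf("%s\n", argv[i]);
    }
    return 0;
}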



It doesn't matter if there are collisions within the filesystem, as
long as there are no collisions between the set of files an
application is working on at the same time.


--- which, in the case of a backup, is all files.


Miklos


Mikulas
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [nfsv4] RE: Finding hardlinks

2007-01-02 Thread Trond Myklebust
On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote:
 Trond Myklebust wrote:
   
  On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
   Mikulas Patocka wrote:
  
   BTW. how does (or how should?) NFS client deal with cache coherency if 
   filehandles for the same file differ?
   
   
   Trond can probably answer this better than me...
   As I read it, currently the nfs client matches both the fileid and the
   filehandle (in nfs_find_actor). This means that different filehandles
   for the same file would result in different inodes :(.
   Strictly following the nfs protocol, comparing only the fileid should
   be enough IF fileids are indeed unique within the filesystem.
   Comparing the filehandle works as a workaround when the exported 
   filesystem
   (or the nfs server) violates that.  From a user stand point I think that
   this should be configurable, probably per mount point.
  
  Matching files by fileid instead of filehandle is a lot more trouble
  since fileids may be reused after a file has been deleted. Every time
  you look up a file, and get a new filehandle for the same fileid, you
  would at the very least have to do another GETATTR using one of the
  'old' filehandles in order to ensure that the file is the same object as
  the one you have cached. Then there is the issue of what to do when you
  open(), read() or write() to the file: which filehandle do you use, are
  the access permissions the same for all filehandles, ...
  
  All in all, much pain for little or no gain.
 
 See my answer to your previous reply.  It seems like the current
 implementation is in violation of the nfs protocol and the extra pain
 is required.

...and we should care because...?

Trond

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Finding hardlinks

2007-01-02 Thread Trond Myklebust
On Sun, 2006-12-31 at 16:19 -0500, Halevy, Benny wrote:

 Even for NFSv3 (that doesn't have the unique_handles attribute) I think
 that the linux nfs client can do a better job.  If you'd have a filehandle
 cache that points at inodes you could maintain a many to one relationship
 from multiple filehandles into one inode.  When you discover a new filehandle
 you can look up the inode cache for the same fileid and if one is found you
 can do a getattr on the old filehandle (without loss of generality you should 
 always use the latest filehandle that was returned for that filesystem object,
 although any filehandle that refers to it can be used).
 If the getattr succeeded then the filehandles refer to the same fs object and
 you can create a new entry in the filehandle cache pointing at that inode.
 Otherwise, if getattr says that the old filehandle is stale I think you should
 mark the inode as stale and keep it around so that applications can get an
 appropriate error until last close, before you clean up the fh cache from the
 stale filehandles. A new inode structure should be created for the new 
 filehandle.

There are, BTW, other reasons why the above is a bad idea: it breaks on
a bunch of well known servers. Look back at the 2.2.x kernels and the
kind of hacks they had in order to deal with crap like the Netapp
'.snapshot' directories which contain files with duplicate fileids that
do not represent hard links, but rather represent previous revisions of
the same file.

That kind of hackery was part of the reason why I ripped out that code.
The other reasons were
- that you end up playing unnecessary getattr games like the
above for little gain.
- the only servers that implemented the above were borken pieces
of crap that encoded parent directories in the filehandle, and
which end up breaking anyway under cross-directory renames.
- the world is filled with non-posix filesystems that frequently
don't have real fileids. They are often just generated on the
fly and can change at the drop of a hat.

Trond

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-02 Thread Mikulas Patocka

On Wed, 3 Jan 2007, Trond Myklebust wrote:


On Sat, 2006-12-30 at 02:04 +0100, Mikulas Patocka wrote:


On Fri, 29 Dec 2006, Trond Myklebust wrote:


On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote:

Why don't you rip off the support for colliding inode number from the
kernel at all (i.e. remove iget5_locked)?

It's reasonable to have either no support for colliding ino_t or full
support for that (including syscalls that userspace can use to work with
such filesystem) --- but I don't see any point in having half-way support
in kernel as is right now.


What would ino_t have to do with inode numbers? It is only used as a
hash table lookup. The inode number is set in the ->getattr() callback.


The question is: why does the kernel contain iget5 function that looks up
according to callback, if the filesystem cannot have more than 64-bit
inode identifier?


Huh? The filesystem can have as large a damned identifier as it likes.
NFSv4 uses 128-byte filehandles, for instance.


But then it needs some other syscall to let applications determine 
hardlinks --- which was the initial topic in this thread.



POSIX filesystems are another matter. They can only have 64-bit
identifiers thanks to the requirement that inode numbers be 64-bit
unique and permanently stored, however Linux caters for a whole
truckload of filesystems which will never fit that label: look at all
those users of iunique(), for one...


I see them. The bad thing is that many programmers read POSIX, write
programs as if the POSIX specification were true, and these programs break
randomly on non-POSIX filesystems. Each non-POSIX filesystem invents st_ino
on its own, trying to minimize hash collisions, which makes the failure even
less probable and harder to find.


The current situation is (for example) that cp does stat(), open(),
fstat() and compares st_ino/st_dev --- if they mismatch, it prints an error
and doesn't copy the file --- so if the kernel removes the inode from its
cache between stat() and open() and the filesystem uses iunique(), cp will
fail.
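
A minimal sketch of that check, modeled on the description above rather than
on the actual coreutils source (names are illustrative):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

/* Open path and verify that the object we opened is the one we stat()ed. */
static int open_checked(const char *path)
{
    struct stat before, after;
    int fd;

    if (stat(path, &before))
        return -1;
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &after)) {
        close(fd);
        return -1;
    }
    if (before.st_dev != after.st_dev || before.st_ino != after.st_ino) {
        /* On a filesystem whose st_ino is not stable (e.g. regenerated via
         * iunique() after the inode fell out of the cache), this check can
         * trip even though it is really the same file. */
        fprintf(stderr, "%s: changed between stat() and open()\n", path);
        close(fd);
        return -1;
    }
    return fd;
}

int main(int argc, char **argv)
{
    int i;

    for (i = 1; i < argc; i++) {
        int fd = open_checked(argv[i]);
        if (fd >= 0) {
            printf("%s: ok\n", argv[i]);
            close(fd);
        }
    }
    return 0;
}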


What utilities should the user use on those non-POSIX filesystems, if not 
cp?


Probably some file-handling guidelines should be specified and written to
Documentation/ as a form of standard that application programmers can use.


Mikulas


Trond


-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-01 Thread Pavel Machek
Hi!

 If user (or script) doesn't specify that flag, it 
 doesn't help. I think
 the best solution for these filesystems would be 
 either to add new syscall
 int is_hardlink(char *filename1, char *filename2)
 (but I know adding syscall bloat may be objectionable)
 
 it's also the wrong api; the filenames may have been 
 changed under you
 just as you return from this call, so it really is a
 was_hardlink_at_some_point() as you specify it.
 If you make it work on fd's.. it has a chance at least.
 
 Yes, but it doesn't matter --- if the tree changes under 
 cp -a command, no one guarantees you what you get.
   int fis_hardlink(int handle1, int handle 2);
 Is another possibility but it can't detect hardlinked 
 symlinks.

Ugh. Is it even legal to hardlink symlinks?

Anyway, cp -a is not the only application that wants to do hardlink
detection.
Pavel
-- 
Thanks for all the (sleeping) penguins.
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka

Hi!


If user (or script) doesn't specify that flag, it
doesn't help. I think
the best solution for these filesystems would be
either to add new syscall
int is_hardlink(char *filename1, char *filename2)
(but I know adding syscall bloat may be objectionable)


it's also the wrong api; the filenames may have been
changed under you
just as you return from this call, so it really is a
was_hardlink_at_some_point() as you specify it.
If you make it work on fd's.. it has a chance at least.


Yes, but it doesn't matter --- if the tree changes under
cp -a command, no one guarantees you what you get.
int fis_hardlink(int handle1, int handle 2);
Is another possibility but it can't detect hardlinked
symlinks.


Ugh. Is it even legal to hardlink symlinks?


Why shouldn't it be? It seems to work quite fine in Linux.


Anyway, cp -a is not the only application that wants to do hardlink
detection.


I tested programs for ino_t collision (I intentionally injected it) and
found that CP from coreutils 6.7 fails to copy directories but displays
error messages (coreutils 5 works fine). MC and ARJ skip directories with
colliding ino_t and pretend that the operation completed successfully. The
FTS library fails to walk directories, returning the FTS_DC error.
Diffutils, find, grep fail to search directories with colliding inode
numbers. Tar seems tolerant except incremental backup (which I didn't try).
All programs except diff were tolerant of colliding ino_t on files.


ino_t is no longer unique in many filesystems; it seems like quite a serious
data corruption possibility.


Mikulas


Pavel

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka

 The question is: why does the kernel contain iget5 function that looks up
 according to callback, if the filesystem cannot have more than 64-bit
 inode identifier?

Generally speaking, file system might have two different identifiers for
files:

- one that makes it easy to tell whether two files are the same one;

- one that makes it easy to locate file on the storage.

According to POSIX, inode number should always work as identifier of the
first class, but not necessary as one of the second. For example, in
reiserfs something called a key is used to locate on-disk inode, which
in turn, contains inode number. Identifiers of the second class tend to


BTW. How does ReiserFS find that a given inode number (or object ID in 
ReiserFS terminology) is free before assigning it to new file/directory?


Mikulas


live in directory entries, and during lookup we want to consult inode
cache _before_ reading inode from the disk (otherwise cache is mostly
useless), right? This means that some file systems want to index inodes
in a cache by something different than inode number.

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-01 Thread Nikita Danilov
Mikulas Patocka writes:

[...]

  
  BTW. How does ReiserFS find that a given inode number (or object ID in 
  ReiserFS terminology) is free before assigning it to new file/directory?

reiserfs v3 has an extent map of free object identifiers in
super-block. reiser4 used 64 bit object identifiers without reuse.

  
  Mikulas

Nikita.

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka

 BTW. How does ReiserFS find that a given inode number (or object ID in
 ReiserFS terminology) is free before assigning it to new file/directory?

reiserfs v3 has an extent map of free object identifiers in
super-block.


Inode free space can have at most 2^31 extents --- if inode numbers
alternate between allocated and free. How do you pack that into the
superblock?



reiser4 used 64 bit object identifiers without reuse.


So you are going to hit the same problem as I did with SpadFS --- you
can't export a 64-bit inode number to userspace (programs without
-D_FILE_OFFSET_BITS=64 will have stat() randomly failing with EOVERFLOW
then), and if you export only a 32-bit number, it will eventually wrap
around and colliding st_ino will cause data corruption with many userspace
programs.


Mikulas
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-01 Thread Jan Harkes
On Mon, Jan 01, 2007 at 11:47:06PM +0100, Mikulas Patocka wrote:
 Anyway, cp -a is not the only application that wants to do hardlink
 detection.
 
 I tested programs for ino_t collision (I intentionally injected it) and
 found that CP from coreutils 6.7 fails to copy directories but displays
 error messages (coreutils 5 works fine). MC and ARJ skip directories with
 colliding ino_t and pretend that the operation completed successfully. The
 FTS library fails to walk directories, returning the FTS_DC error.
 Diffutils, find, grep fail to search directories with colliding inode
 numbers. Tar seems tolerant except incremental backup (which I didn't try).
 All programs except diff were tolerant of colliding ino_t on files.

Thanks for testing so many programs, but... did the files/symlinks with
colliding inode numbers have i_nlink > 1? Or did you also have directories
with colliding inode numbers? It looks like you've introduced hardlinked
directories in your test, which are definitely not supported; in fact that
will probably cause not only issues for userspace programs, but also
locking and garbage collection issues in the kernel's dcache.

I'm surprised you're seeing so many problems. The only find problem that
I am aware of is the one where it assumes that there will be only
i_nlink-2 subdirectories in a given directory; this optimization can be
disabled with -noleaf. The only problems I've encountered with ino_t
collisions are archivers and other programs that recursively try to copy
a tree while preserving hardlinks. And in all cases these seem to have
no problem with such collisions as long as i_nlink == 1.
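
A tiny illustration of the assumption behind that optimization (the
directory names are made up): on classic POSIX filesystems a directory's
link count is 2 plus its number of subdirectories, one link for its own
name, one for its "." entry, and one for each child's ".." entry.

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

int main(void)
{
    struct stat st;

    mkdir("leafdemo", 0700);
    mkdir("leafdemo/a", 0700);
    mkdir("leafdemo/b", 0700);

    if (stat("leafdemo", &st) == 0)
        /* Expect 4 here (2 + 2 subdirectories).  find assumes it can stop
         * stat()ing remaining entries once it has seen st_nlink - 2
         * subdirectories, unless -noleaf is given; a filesystem that does
         * not keep this convention breaks the assumption. */
        printf("leafdemo: st_nlink = %ld\n", (long)st.st_nlink);
    return 0;
}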

Jan
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka

On Mon, 1 Jan 2007, Jan Harkes wrote:


On Mon, Jan 01, 2007 at 11:47:06PM +0100, Mikulas Patocka wrote:

Anyway, cp -a is not the only application that wants to do hardlink
detection.


I tested programs for ino_t collision (I intentionally injected it) and
found that CP from coreutils 6.7 fails to copy directories but displays
error messages (coreutils 5 works fine). MC and ARJ skip directories with
colliding ino_t and pretend that the operation completed successfully. The
FTS library fails to walk directories, returning the FTS_DC error.
Diffutils, find, grep fail to search directories with colliding inode
numbers. Tar seems tolerant except incremental backup (which I didn't try).
All programs except diff were tolerant of colliding ino_t on files.


Thanks for testing so many programs, but... did the files/symlinks with
colliding inode number have i_nlink > 1? Or did you also have directories
with colliding inode numbers. It looks like you've introduced hardlinked
directories in your test which are definitely not supported, in fact it
will probably cause not only issues for userspace programs, but also
locking and garbage collection issues in the kernel's dcache.


I tested it only on files without hardlinks (with i_nlink == 1) --- most
programs (except diff) are tolerant of collisions; they won't store st_ino
in memory unless i_nlink > 1.


I didn't hardlink directories, I just patched stat, lstat and fstat to 
always return st_ino == 0 --- and I've seen those failures. These failures 
are going to happen on non-POSIX filesystems in real world too, very 
rarely.


BTW. POSIX supports (optionally) hardlinked directories but doesn't
support colliding st_ino --- so programs act according to POSIX --- but
the problem is that this POSIX requirement no longer represents the
real-world situation.



I'm surprised you're seeing so many problems. The only find problem that
I am aware of is the one where it assumes that there will be only
i_nlink-2 subdirectories in a given directory, this optimization can be
disabled with -noleaf.


This is not a bug but a feature. If filesystem doesn't count 
subdirectories, it should set directory's n_link to 1 and find will be ok.


The only problems I've encountered with ino_t collisions are archivers 
and other programs that recursively try to copy a tree while preserving 
hardlinks. And in all cases these seem to have no problem with such 
collisions as long as i_nlink == 1.


Yes, but they have big problems with directory ino_t collisions. They 
think that directories are hardlinked and skip processing them.


Mikulas


Jan


-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-31 Thread Mikulas Patocka

On Wed, 20 Dec 2006, Al Viro wrote:


On Wed, Dec 20, 2006 at 05:50:11PM +0100, Miklos Szeredi wrote:

I don't see any problems with changing struct kstat.  There would be
reservations against changing inode.i_ino though.

So filesystems that have 64bit inodes will need a specialized
getattr() method instead of generic_fillattr().


And they are already free to do so.  And no, struct kstat doesn't need
to be changed - it has u64 ino already.


If I return 64-bit values as ino_t, 32-bit programs will get EOVERFLOW on
a stat attempt (even if they are not going to use st_ino in any way) --- I
know that POSIX specifies it, but the question is whether it is useful.


What is the correct solution? A mount option that can differentiate between
32-bit colliding inode numbers and 64-bit non-colliding inode numbers? Or
is there a better idea?


Given the fact that glibc compiles anything by default with 32-bit ino_t,
I wonder if returning a 64-bit inode number is possible at all.


Mikulas
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [nfsv4] RE: Finding hardlinks

2006-12-31 Thread Halevy, Benny
Trond Myklebust wrote:
  
 On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
  Mikulas Patocka wrote:
 
  BTW. how does (or how should?) NFS client deal with cache coherency if 
  filehandles for the same file differ?
  
  
  Trond can probably answer this better than me...
  As I read it, currently the nfs client matches both the fileid and the
  filehandle (in nfs_find_actor). This means that different filehandles
  for the same file would result in different inodes :(.
  Strictly following the nfs protocol, comparing only the fileid should
  be enough IF fileids are indeed unique within the filesystem.
  Comparing the filehandle works as a workaround when the exported filesystem
  (or the nfs server) violates that.  From a user stand point I think that
  this should be configurable, probably per mount point.
 
 Matching files by fileid instead of filehandle is a lot more trouble
 since fileids may be reused after a file has been deleted. Every time
 you look up a file, and get a new filehandle for the same fileid, you
 would at the very least have to do another GETATTR using one of the
 'old' filehandles in order to ensure that the file is the same object as
 the one you have cached. Then there is the issue of what to do when you
 open(), read() or write() to the file: which filehandle do you use, are
 the access permissions the same for all filehandles, ...
 
 All in all, much pain for little or no gain.

See my answer to your previous reply.  It seems like the current
implementation is in violation of the nfs protocol and the extra pain
is required.

 
 Most servers therefore take great pains to ensure that clients can use
 filehandles to identify inodes. The exceptions tend to be broken in
 other ways

This is true maybe in linux, but not necessarily in non-linux based nfs
servers.

 (Note: knfsd without the no_subtree_check option is one of
 these exceptions - it can break in the case of cross-directory renames).
 
 Cheers,
   Trond


-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-31 Thread Nikita Danilov
Mikulas Patocka writes:
  
  
  On Fri, 29 Dec 2006, Trond Myklebust wrote:
  
   On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote:
   Why don't you rip off the support for colliding inode number from the
   kernel at all (i.e. remove iget5_locked)?
  
   It's reasonable to have either no support for colliding ino_t or full
   support for that (including syscalls that userspace can use to work with
   such filesystem) --- but I don't see any point in having half-way support
   in kernel as is right now.
  
   What would ino_t have to do with inode numbers? It is only used as a
    hash table lookup. The inode number is set in the ->getattr() callback.
  
  The question is: why does the kernel contain iget5 function that looks up 
  according to callback, if the filesystem cannot have more than 64-bit 
  inode identifier?

Generally speaking, file system might have two different identifiers for
files:

 - one that makes it easy to tell whether two files are the same one;

 - one that makes it easy to locate file on the storage.

According to POSIX, inode number should always work as identifier of the
first class, but not necessary as one of the second. For example, in
reiserfs something called a key is used to locate on-disk inode, which
in turn, contains inode number. Identifiers of the second class tend to
live in directory entries, and during lookup we want to consult inode
cache _before_ reading inode from the disk (otherwise cache is mostly
useless), right? This means that some file systems want to index inodes
in a cache by something different than inode number.

There is another reason, why I, personally, would like to have an
ability to index inodes by things other than inode numbers: delayed
inode number allocation. Strictly speaking, file system has to assign
inode number to the file only when it is just about to report it to the
user space (either though stat, or, ugh... readdir). If location of
inode on disk depends on its inode number (like it is in inode-table
based file systems like ext[23]) then delayed inode number allocation
has to same advantages as delayed block allocation.

  
  This lookup callback just induces writing bad filesystems with colliding
  inode numbers. Either remove coda, smb (and possibly other) filesystems
  from the kernel or make proper support for userspace for them.
  
  The situation is that current coreutils 6.7 fails to recursively copy
  directories if some two directories in the tree have colliding inode
  numbers, so you get random data corruption with these filesystems.
  
  Mikulas

Nikita.

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-29 Thread Arjan van de Ven
On Thu, 2006-12-28 at 16:44 -0800, Bryan Henderson wrote:
 Statement 1:
 If two files have identical st_dev and st_ino, they MUST be hardlinks of
 each other/the same file.
 
 Statement 2:
 If two files are a hardlink of each other, they MUST be detectable
 (for example by having the same st_dev/st_ino)
 
 I personally consider statement 1 a mandatory requirement in terms of
 quality of implementation if not Posix compliance.
 
 Statement 2 for me is nice but optional
 
 Statement 1 without Statement 2 provides one of those facilities where the 
 computer tells you something is maybe or almost certainly true.

No it's not a "almost certainly". It's a "these ARE".
It's not a "these are NOT".

Statement 2 is the "these are NOT" statement basically.

they are entirely separate concepts... 
(but then again I'm not a CS guy so maybe I just look at it from a
different angle)

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-29 Thread Trond Myklebust
On Thu, 2006-12-28 at 17:12 +0200, Benny Halevy wrote:

 As an example, some file systems encode hint information into the filehandle
 and the hints may change over time, another example is encoding parent
 information into the filehandle and then handles representing hard links
 to the same file from different directories will differ.

Both these examples are bogus. Filehandle information should not change
over time (except in the special case of NFSv4 volatile filehandles), and
filehandles should definitely not encode parent directory information that
can change over time (think rename()!).

Cheers
  Trond

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [nfsv4] RE: Finding hardlinks

2006-12-29 Thread Trond Myklebust
On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
 Mikulas Patocka wrote:

 BTW. how does (or how should?) NFS client deal with cache coherency if 
 filehandles for the same file differ?
 
 
 Trond can probably answer this better than me...
 As I read it, currently the nfs client matches both the fileid and the
 filehandle (in nfs_find_actor). This means that different filehandles
 for the same file would result in different inodes :(.
 Strictly following the nfs protocol, comparing only the fileid should
 be enough IF fileids are indeed unique within the filesystem.
 Comparing the filehandle works as a workaround when the exported filesystem
 (or the nfs server) violates that.  From a user stand point I think that
 this should be configurable, probably per mount point.

Matching files by fileid instead of filehandle is a lot more trouble
since fileids may be reused after a file has been deleted. Every time
you look up a file, and get a new filehandle for the same fileid, you
would at the very least have to do another GETATTR using one of the
'old' filehandles in order to ensure that the file is the same object as
the one you have cached. Then there is the issue of what to do when you
open(), read() or write() to the file: which filehandle do you use, are
the access permissions the same for all filehandles, ...

All in all, much pain for little or no gain.

Most servers therefore take great pains to ensure that clients can use
filehandles to identify inodes. The exceptions tend to be broken in
other ways (Note: knfsd without the no_subtree_check option is one of
these exceptions - it can break in the case of cross-directory renames).

Cheers,
  Trond

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-29 Thread Phillip Lougher


On 29 Dec 2006, at 08:41, Arjan van de Ven wrote:




I think statement 2 is extremely important.  Without this guarantee
applications have to guess which files are hardlinks.  Any guessing
is going to be be got wrong sometimes with potentially disastrous
results.


actually no. Statement 1 will tell them when the kernel knows they are
hardlinks. It's the kernels job to make a reasonably quality of
implementation so that that works most of the time.

Statement 2 requires that all of the time which suddenly creates a lot
of evil corner cases (like what if I mount a network filesystem twice
and the server doesn't quite tell me enough to figure it out cases) to
make it impractical.



Actually no.  Statement  2 for me is important in terms of archive  
correctness.  With my archiver program Mksquashfs, if the two files  
are the same, and filesystem says they're hardlinks, I make them  
hardlinks in the Squashfs filesystem, otherwise they're stored as  
duplicates (same data, different inode).  Doesn't matter much in  
terms of storage overhead, but it does matter if two files become  
one, or vice versa.


If a filesystem cannot guarantee statement 2 in the normal case, I  
wouldn't use hardlinks in that filesystem, period.   Using evil  
corner cases and network filesystems as an objection is somewhat  
like saying because we can't do it in every case, we shouldn't bother  
doing it in the normal case too.  Disk based filesystems should be  
able to handle statements 1 and 2.  No-one expects things to always  
work correctly in evil corner cases or with network filesystems.


Phillip


Think of it as the difference between good and perfect.
(and perfect is the enemy of good :)

the kernel will tell you when it knows within reason, via statement 1
technology. It's not perfect, but reasonably will be enough for normal
userspace to depend on it. Your case is NOT a case of "I require 100%"..
it's a "we'd like to take hardlinks into account" case.


--
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via
http://www.linuxfirmwarekit.org




-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-29 Thread Arjan van de Ven

 
 Actually no.  Statement  2 for me is important in terms of archive  
 correctness.  With my archiver program Mksquashfs, if the two files  
 are the same, and filesystem says they're hardlinks, I make them  
 hardlinks in the Squashfs filesystem, otherwise they're stored as  
 duplicates (same data, different inode).  Doesn't matter much in  
 terms of storage overhead, but it does matter if two files become  
 one, or vice versa.

statement 2 was "all files that are hardlinks can be found with ino/dev
pairs". How would files become one if accidentally the kernel shows a
hardlinked file as 2 separate files in terms of inode nr or device?

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-29 Thread Bryan Henderson
On Thu, 2006-12-28 at 16:44 -0800, Bryan Henderson wrote:
 Statement 1:
 If two files have identical st_dev and st_ino, they MUST be hardlinks 
of
 each other/the same file.
 
 Statement 2:
 If two files are a hardlink of each other, they MUST be detectable
 (for example by having the same st_dev/st_ino)
 
 I personally consider statement 1 a mandatory requirement in terms of
 quality of implementation if not Posix compliance.
 
 Statement 2 for me is nice but optional
 
 Statement 1 without Statement 2 provides one of those facilities where 
the 
 computer tells you something is maybe or almost certainly true.

No it's not a almost certainly. It's a these ARE.

There are various "these ARE"s here, but the "almost certainly" I'm
talking about is where Statement 1 is true and Statement 2 is false and
the inode numbers you read through two links are different.  (For example,
consider a filesystem in which the reported inode number is the internal
inode number truncated to 32 bits).  The links are almost certainly to
different files.

--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-29 Thread Arjan van de Ven
On Fri, 2006-12-29 at 10:08 -0800, Bryan Henderson wrote:
 On Thu, 2006-12-28 at 16:44 -0800, Bryan Henderson wrote:
  Statement 1:
  If two files have identical st_dev and st_ino, they MUST be hardlinks 
 of
  each other/the same file.
  
  Statement 2:
  If two files are a hardlink of each other, they MUST be detectable
  (for example by having the same st_dev/st_ino)
  
  I personally consider statement 1 a mandatory requirement in terms of
  quality of implementation if not Posix compliance.
  
  Statement 2 for me is nice but optional
  
  Statement 1 without Statement 2 provides one of those facilities where 
 the 


 There are various these AREs here, but the almost certainly I'm 
 talking about is where Statement 1 is true and Statement 2 is false and 
 the inode numbers you read through two links are different.  (For example, 
 consider a filesystem in which the reported inode number is the internal 
 inode number truncated to 32 bits).  The links are almost certainly to 
 different files.
 

but then statement 1 is false and violated.


-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-29 Thread Bryan Henderson
On Fri, 2006-12-29 at 10:08 -0800, Bryan Henderson wrote:
 On Thu, 2006-12-28 at 16:44 -0800, Bryan Henderson wrote:
  Statement 1:
  If two files have identical st_dev and st_ino, they MUST be 
hardlinks 
 of
  each other/the same file.
  
  Statement 2:
  If two files are a hardlink of each other, they MUST be 
detectable
  (for example by having the same st_dev/st_ino)
  
  I personally consider statement 1 a mandatory requirement in terms 
of
  quality of implementation if not Posix compliance.
  
  Statement 2 for me is nice but optional
  
  Statement 1 without Statement 2 provides one of those facilities 
where 
 the 

 There are various these AREs here, but the almost certainly I'm 
 talking about is where Statement 1 is true and Statement 2 is false and 

 the inode numbers you read through two links are different.  (For 
example, 
 consider a filesystem in which the reported inode number is the 
internal 
 inode number truncated to 32 bits).  The links are almost certainly to 
 different files.
 

but then statement 1 is false and violated.

Whoops; wrong example.  It doesn't matter, though, since clearly there 
exist correct examples: where Statement 1 is true and Statement 2 is 
false, and in that case when the inode numbers are different, the links 
are almost certainly to different files.

--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-29 Thread Mikulas Patocka



On Fri, 29 Dec 2006, Trond Myklebust wrote:


On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote:

Why don't you rip off the support for colliding inode number from the
kernel at all (i.e. remove iget5_locked)?

It's reasonable to have either no support for colliding ino_t or full
support for that (including syscalls that userspace can use to work with
such filesystem) --- but I don't see any point in having half-way support
in kernel as is right now.


What would ino_t have to do with inode numbers? It is only used as a
hash table lookup. The inode number is set in the ->getattr() callback.


The question is: why does the kernel contain iget5 function that looks up 
according to callback, if the filesystem cannot have more than 64-bit 
inode identifier?


This lookup callback just induces writing bad filesystems with colliding
inode numbers. Either remove coda, smb (and possibly other) filesystems
from the kernel or make proper support for userspace for them.


The situation is that current coreutils 6.7 fails to recursively copy
directories if some two directories in the tree have colliding inode
numbers, so you get random data corruption with these filesystems.


Mikulas
-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-28 Thread Arjan van de Ven

 It seems like the posix idea of unique st_dev, st_ino doesn't
 hold water for modern file systems 

are you really sure?
and if so, why don't we fix *THAT* instead, rather than adding racy
syscalls and such that just can't really be used right...


-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-28 Thread Jeff Layton

Benny Halevy wrote:


It seems like the posix idea of unique st_dev, st_ino doesn't
hold water for modern file systems and that creates real problems for
backup apps which rely on that to detect hard links.



Why not? Granted, many of the filesystems in the Linux kernel don't enforce that 
they have unique st_ino values, but I'm working on a set of patches to try and 
fix that.



Adding a vfs call to check for file equivalence seems like a good idea to me.
A syscall exposing it to user mode apps can look like what you sketched above,
and another variant of it can maybe take two paths and possibly a flags field
(for e.g. don't follow symlinks).

I'm cross-posting this also to [EMAIL PROTECTED]. NFS has exactly the same
problem with fsid, fileid as fileid is 64 bit wide. Although the nfs client can
determine that two filesystem objects are hard linked if they have the same
filehandle but there are cases where two distinct filehandles can still refer to
the same filesystem object.  Letting the nfs client determine file equivalency
based on filehandles will probably satisfy most users but if the exported
fs supports the new call discussed above, exporting it over NFS makes a
lot of sense to me... What do you guys think about adding such an operation
to NFS?



This sounds like a bug to me. It seems like we should have a one to one 
correspondence of filehandle - inode. In what situations would this not be the 
case?


-- Jeff

-
To unsubscribe from this list: send the line unsubscribe linux-fsdevel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Finding hardlinks

2006-12-28 Thread Benny Halevy

Jeff Layton wrote:
 Benny Halevy wrote:
 It seems like the posix idea of unique st_dev, st_ino doesn't
 hold water for modern file systems and that creates real problems for
 backup apps which rely on that to detect hard links.

 
 Why not? Granted, many of the filesystems in the Linux kernel don't enforce 
 that 
 they have unique st_ino values, but I'm working on a set of patches to try 
 and 
 fix that.

That's great and will surely help most file systems (apparently not Coda as
Jan says they use 128 bit internal file identifiers).

What about 32 bit architectures? Is ino_t going to be 64 bit
there too?

 
 Adding a vfs call to check for file equivalence seems like a good idea to me.
 A syscall exposing it to user mode apps can look like what you sketched 
 above,
 and another variant of it can maybe take two paths and possibly a flags field
 (for e.g. don't follow symlinks).

 I'm cross-posting this also to [EMAIL PROTECTED] NFS has exactly the same 
 problem
 with fsid, fileid as fileid is 64 bit wide. Although the nfs client can
 determine that two filesystem objects are hard linked if they have the same
 filehandle but there are cases where two distinct filehandles can still 
 refer to
 the same filesystem object.  Letting the nfs client determine file 
 equivalency
 based on filehandles will probably satisfy most users but if the exported
 fs supports the new call discussed above, exporting it over NFS makes a
 lot of sense to me... What do you guys think about adding such an operation
 to NFS?

 
 This sounds like a bug to me. It seems like we should have a one to one 
 correspondence of filehandle - inode. In what situations would this not be 
 the 
 case?

Well, the NFS protocol allows that [see rfc1813, p. 21: "If two file handles
from the same server are equal, they must refer to the same file, but if they
are not equal, no conclusions can be drawn."]

As an example, some file systems encode hint information into the filehandle
and the hints may change over time, another example is encoding parent
information into the filehandle and then handles representing hard links
to the same file from different directories will differ.

 
 -- Jeff
 



Re: Finding hardlinks

2006-12-28 Thread Jeff Layton

Benny Halevy wrote:

Jeff Layton wrote:

Benny Halevy wrote:

It seems like the posix idea of unique st_dev, st_ino doesn't
hold water for modern file systems and that creates real problems for
backup apps which rely on that to detect hard links.

Why not? Granted, many of the filesystems in the Linux kernel don't enforce that 
they have unique st_ino values, but I'm working on a set of patches to try and 
fix that.


That's great and will surely help most file systems (apparently not Coda as
Jan says they use 128 bit internal file identifiers).

What about 32 bit architectures? Is ino_t going to be 64 bit
there too?



Sorry, I should qualify that statement. A lot of filesystems don't have 
permanent i_ino values (mostly pseudo filesystems -- pipefs, sockfs, /proc 
stuff, etc). For those, the idea is to try to make sure we use 32 bit values for 
them and to ensure that they are uniquely assigned. I unfortunately can't do 
much about filesystems that do have permanent inode numbers.



Adding a vfs call to check for file equivalence seems like a good idea to me.
A syscall exposing it to user mode apps can look like what you sketched above,
and another variant of it could take two paths and possibly a flags field
(e.g. to not follow symlinks).

I'm cross-posting this also to [EMAIL PROTECTED] NFS has exactly the same
problem with fsid, fileid, as fileid is 64 bits wide. The nfs client can
determine that two filesystem objects are hard linked if they have the same
filehandle, but there are cases where two distinct filehandles can still refer
to the same filesystem object.  Letting the nfs client determine file
equivalency based on filehandles will probably satisfy most users, but if the
exported fs supports the new call discussed above, exporting it over NFS makes
a lot of sense to me... What do you guys think about adding such an operation
to NFS?

This sounds like a bug to me. It seems like we should have a one-to-one
correspondence of filehandle to inode. In what situations would this not be
the case?


Well, the NFS protocol allows that [see rfc1813, p. 21: "If two file handles
from the same server are equal, they must refer to the same file, but if they
are not equal, no conclusions can be drawn."]

As an example, some file systems encode hint information into the filehandle
and the hints may change over time; another example is encoding parent
information into the filehandle and then handles representing hard links
to the same file from different directories will differ.



Interesting. That does seem to break the method of st_dev/st_ino for finding 
hardlinks. For Linux fileservers I think we generally do have 1:1 correspondence 
so that's not generally an issue.


If we're getting into changing specs, though, I think it would be better to 
change it to enforce a 1:1 filehandle to inode correspondence rather than making 
new NFS ops. That does mean you can't use the filehandle for carrying other 
info, but it seems like there ought to be better mechanisms for that.


-- Jeff


Re: Finding hardlinks

2006-12-28 Thread Jan Engelhardt

On Dec 28 2006 10:54, Jeff Layton wrote:

 Sorry, I should qualify that statement. A lot of filesystems don't have
 permanent i_ino values (mostly pseudo filesystems -- pipefs, sockfs, /proc
 stuff, etc). For those, the idea is to try to make sure we use 32 bit values
 for them and to ensure that they are uniquely assigned. I unfortunately can't
 do much about filesystems that do have permanent inode numbers.

Anyway, this could probably come in handy for unionfs too.


-`J'
-- 


Re: Finding hardlinks

2006-12-28 Thread Bryan Henderson
Adding a vfs call to check for file equivalence seems like a good idea to me.

That would be only barely useful.  It would let 'diff' say, those are 
both the same file, but wouldn't be useful for something trying to 
duplicate a filesystem (e.g. a backup program).  Such a program can't do 
the comparison between every possible pairing of file names.

I'd rather just see a unique file identifier that's as big as it needs to 
be.  And the more unique the better.  (There are lots of degrees of 
uniqueness; unique as long as the files exist; as long as the filesystems 
are mounted, etc.).

--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems



Re: Finding hardlinks

2006-12-28 Thread Arjan van de Ven

 If it's important to know that two names refer to the same file in a 
 remote filesystem, I don't see any way around adding a new concept of file 
 identifier to the protocol.

actually there are 2 separate issues at hand, and this thread sort of
confuses them into one:

Statement 1:
If two files have identical st_dev and st_ino, they MUST be hardlinks of
each other/the same file.

Statement 2:
If two files are a hardlink of each other, they MUST be detectable
(for example by having the same st_dev/st_ino)


I personally consider statement 1 a mandatory requirement in terms of
quality of implementation if not Posix compliance.

Statement 2 for me is nice but optional; the use case for it is VERY
different. It's an optimization for a program like tar, so it doesn't have to
back a file up twice, while statement 1 is there to ensure that
hardlinks CAN be backed up smartly.


Let's please treat these as 2 separate issues, I agree they're somewhat
related, but really they're a different  kind of guarantee and have
entirely different usecases as well.

(oh and I'm very open to hearing about cases where a violation of
statement 2 ends up being an actual problem)


Greetings,
   Arjan van de Ven

 
-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



Re: Finding hardlinks

2006-12-28 Thread Miklos Szeredi
  It seems like the posix idea of unique st_dev, st_ino doesn't
  hold water for modern file systems 
  
  are you really sure?
 
 Well Jan's example was of Coda that uses 128-bit internal file ids.
 
  and if so, why don't we fix *THAT* instead
 
 Hmm, sometimes you can't fix the world, especially if the filesystem
 is exported over NFS and has a problem with fitting its file IDs uniquely
 into a 64-bit identifier.

Note, it's pretty easy to fit _anything_ into a 64-bit identifier with
the use of a good hash function.  The chance of an accidental
collision is infinitesimally small.  For a set of 

 100 files: 0.03%
   1,000,000 files: 0.03%

And usually tools (tar, diff, cp -a, etc.) work with a very limited set of
st_ino's.  An app that would store a million st_ino values and compare
each new to all the existing ones would be having severe performance
problems and yet _almost never_ come across a false positive.
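
To put a number on "infinitesimally small", here is a back-of-the-envelope
birthday-bound estimate, assuming the hash output is uniform over the 64-bit
space (an illustration only, not part of any proposed interface):

/* P(collision) ~= 1 - exp(-n*(n-1) / (2 * 2^64)) for n hashed files */
#include <math.h>
#include <stdio.h>

static double collision_prob(double n)
{
    return -expm1(-n * (n - 1.0) / (2.0 * 18446744073709551616.0 /* 2^64 */));
}

int main(void)
{
    printf("        100 files: %.3g\n", collision_prob(100.0));
    printf("  1,000,000 files: %.3g\n", collision_prob(1e6));
    return 0;   /* prints roughly 2.7e-16 and 2.7e-08 (build with -lm) */
}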

Miklos


RE: Finding hardlinks

2006-12-28 Thread Halevy, Benny
Mikulas Patocka wrote:
 
 This sounds like a bug to me. It seems like we should have a one-to-one
 correspondence of filehandle to inode. In what situations would this not be
 the case?

 Well, the NFS protocol allows that [see rfc1813, p. 21: "If two file handles
 from the same server are equal, they must refer to the same file, but if they
 are not equal, no conclusions can be drawn."]

 As an example, some file systems encode hint information into the filehandle
 and the hints may change over time; another example is encoding parent
 information into the filehandle and then handles representing hard links
 to the same file from different directories will differ.

BTW. how does (or how should?) NFS client deal with cache coherency if 
filehandles for the same file differ?


Trond can probably answer this better than me...
As I read it, currently the nfs client matches both the fileid and the
filehandle (in nfs_find_actor). This means that different filehandles
for the same file would result in different inodes :(.
Strictly following the nfs protocol, comparing only the fileid should
be enough IF fileids are indeed unique within the filesystem.
Comparing the filehandle works as a workaround when the exported filesystem
(or the nfs server) violates that.  From a user standpoint I think that
this should be configurable, probably per mount point.
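
A simplified sketch of the two matching policies being contrasted here (this
is not the real nfs_find_actor(); the types are stand-ins, shown only to make
the trade-off concrete):

#include <stdbool.h>
#include <string.h>

struct fh   { unsigned int len; unsigned char data[128]; };
struct desc { unsigned long long fileid; struct fh fh; };

static bool fh_equal(const struct fh *a, const struct fh *b)
{
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}

/* current behaviour as described: fileid AND filehandle must match, so a
 * second filehandle for the same file ends up as a second inode */
static bool match_strict(const struct desc *cached, const struct desc *found)
{
    return cached->fileid == found->fileid && fh_equal(&cached->fh, &found->fh);
}

/* protocol-level view: fileid alone identifies the file, assuming the server
 * hands out unique fileids within the filesystem */
static bool match_fileid_only(const struct desc *cached, const struct desc *found)
{
    return cached->fileid == found->fileid;
}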

Mikulas



RE: Finding hardlinks

2006-12-28 Thread Halevy, Benny
Bryan Henderson wrote:
 
Adding a vfs call to check for file equivalence seems like a good idea to me.

That would be only barely useful.  It would let 'diff' say, those are 
both the same file, but wouldn't be useful for something trying to 
duplicate a filesystem (e.g. a backup program).  Such a program can't do 
the comparison between every possible pairing of file names.

Gnu tar, for example, remembers and matches st_dev, st_ino only for
potential hard links (st_nlink > 1).
My thinking was that the application will call the equivalence syscall
to verify a match on the inode number (and when nlink > 1), not for every
pair of files in the filesystem.
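
A rough sketch of that strategy, assuming (st_dev, st_ino) can be trusted,
which is exactly the assumption this thread is questioning (a tiny linear
table for clarity; a real tool would use a hash table):

#include <sys/stat.h>

struct seen { dev_t dev; ino_t ino; };
static struct seen table[1024];
static int nseen;

/* returns 1 if path is a hard link to a previously recorded file */
static int seen_before(const char *path)
{
    struct stat st;
    if (lstat(path, &st) != 0 || !S_ISREG(st.st_mode) || st.st_nlink <= 1)
        return 0;                     /* nothing to correlate */
    for (int i = 0; i < nseen; i++)
        if (table[i].dev == st.st_dev && table[i].ino == st.st_ino)
            return 1;                 /* same (dev, ino): assume hard link */
    if (nseen < 1024)
        table[nseen++] = (struct seen){ st.st_dev, st.st_ino };
    return 0;
}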


I'd rather just see a unique file identifier that's as big as it needs to 
be.  And the more unique the better.  (There are lots of degrees of 
uniqueness; unique as long as the files exist; as long as the filesystems 
are mounted, etc.).

--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems




Re: Finding hardlinks

2006-12-28 Thread Bryan Henderson
Statement 1:
If two files have identical st_dev and st_ino, they MUST be hardlinks of
each other/the same file.

Statement 2:
If two files are a hardlink of each other, they MUST be detectable
(for example by having the same st_dev/st_ino)

I personally consider statement 1 a mandatory requirement in terms of
quality of implementation if not Posix compliance.

Statement 2 for me is nice but optional

Statement 1 without Statement 2 provides one of those facilities where the 
computer tells you something is maybe or almost certainly true.  While 
it's useful in plenty of practical cases, in my experience, it leaves 
computer engineers uncomfortable.  Recently there was a discussion on
this list, covering some of that philosophy, about a proposed case in which
stat() results are maybe correct but maybe garbage.

it's an optimization for a program like tar to not have to
back a file up twice,

I think it's a stronger need than just to make a tarball smaller.  When 
you restore the tarball in which 'foo' and 'bar' are different files, you 
get a fundamentally different tree of files than the one you started with 
in which 'foo' and 'bar' were two different names for the same file.  If, 
in the restored tree, you write to 'foo', you won't see the result in 
'bar'.  If you remove read permission from 'foo', the world can still see 
the information in 'bar'.  Plus, in some cases optimization is a matter of 
life or death -- the extra resources (storage space, cache space, access 
time, etc) for the duplicated files might be enough to move you from 
practical to impractical.

People tend to demand that restore programs faithfully restore what was 
backed up.  (I've even seen requirements that the inode numbers upon 
restore be the same).  Given the difficulty of dealing with multi-linked 
files, not to mention various nonstandard file attributes fancy filesystem 
types have, I suppose they probably don't have really high expectations of 
that nowadays, but it's still a worthy goal not to turn one file into two.

--
Bryan Henderson IBM Almaden Research Center
San Jose CA Filesystems



Re: Finding hardlinks

2006-12-23 Thread Arjan van de Ven

 
 If user (or script) doesn't specify that flag, it doesn't help. I think 
 the best solution for these filesystems would be either to add new syscall
   int is_hardlink(char *filename1, char *filename2)
 (but I know adding syscall bloat may be objectionable)

it's also the wrong api; the filenames may have been changed under you
just as you return from this call, so it really is a
was_hardlink_at_some_point() as you specify it.
If you make it work on fd's.. it has a chance at least.

-- 
if you want to mail me at work (you don't), use arjan (at) linux.intel.com
Test the interaction between Linux and your BIOS via 
http://www.linuxfirmwarekit.org



Re: Finding hardlinks

2006-12-23 Thread Mikulas Patocka

If user (or script) doesn't specify that flag, it doesn't help. I think
the best solution for these filesystems would be either to add new syscall
int is_hardlink(char *filename1, char *filename2)
(but I know adding syscall bloat may be objectionable)


it's also the wrong api; the filenames may have been changed under you
just as you return from this call, so it really is a
was_hardlink_at_some_point() as you specify it.
If you make it work on fd's.. it has a chance at least.


Yes, but it doesn't matter --- if the tree changes under the cp -a command,
no one guarantees you what you get.

int fis_hardlink(int handle1, int handle2);
is another possibility, but it can't detect hardlinked symlinks.
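
For comparison, the closest userspace approximation of an fd-based check today
is simply to compare fstat() results; it inherits all of the st_ino
limitations discussed in this thread and, as noted, cannot be applied to the
symlinks themselves:

#include <sys/stat.h>

/* 1 if both descriptors appear to name the same object, 0 if not,
 * -1 on error */
static int fds_same_object(int fd1, int fd2)
{
    struct stat a, b;
    if (fstat(fd1, &a) != 0 || fstat(fd2, &b) != 0)
        return -1;
    return a.st_dev == b.st_dev && a.st_ino == b.st_ino;
}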

Mikulas


Re: Finding hardlinks

2006-12-21 Thread Jan Harkes
On Wed, Dec 20, 2006 at 12:44:42PM +0100, Miklos Szeredi wrote:
 The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend
 the kstat.ino field to 64bit and fix those filesystems to fill in
 kstat correctly.

Coda actually uses 128-bit file identifiers internally, so 64-bits
really doesn't cut it. Since the 128-bit space is used pretty sparsely
there is a hash which avoids most collisions in 32-bit i_ino space, but
not completely. I can also imagine that at some point someone wants to
implement a git-based filesystem where it would be more natural to use
160-bit SHA1 hashes as unique object identifiers.
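
A sketch of the kind of folding involved (not Coda's actual hash; it is only
meant to show why collisions in the smaller i_ino space become possible):

#include <stdint.h>

struct fid128 { uint64_t hi, lo; };

static uint32_t fold_to_ino32(const struct fid128 *fid)
{
    uint64_t x = fid->hi ^ fid->lo;        /* 128 bits -> 64 bits */
    x ^= x >> 32;                          /* mix the upper half down */
    return (uint32_t)(x * 2654435761u);    /* multiplicative scramble */
}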

But Coda only allows hardlinks within a single directory, and if someone
renames a hardlinked file and one of the names ends up in a different
directory we implicitly create a copy of the object. This actually
leverages off of the way we handle volume snapshots and the fact that we
use whole file caching and writes, so we only copy the metadata while
the data is 'copy-on-write'.

I'm considering changing the way we handle hardlinks by having link(2)
always create a new object with copy-on-write semantics (i.e. replacing
link with some sort of a copyfile operation). This way we can get rid of
several special cases like the cross-directory rename. It also avoids
problems when the various replicas of an object are found to be
inconsistent and we allow the user to expand the file. On expansion a
file becomes a directory that contains all the objects on individual
replicas. Handling the expansion in a dcache friendly way is nasty
enough as is and complicated by the fact that we really don't want such
an expansion to result in hard-linked directories, so we are forced to
inventing new unique object identifiers, etc. Again, not having
hardlinks would simplify things somewhat here.

Any application that tries to be smart enough to keep track of which
files are hardlinked should (in my opinion) also have a way to disable
this behaviour.

Jan



Re: Finding hardlinks

2006-12-21 Thread Mikulas Patocka

On Thu, 21 Dec 2006, Jan Harkes wrote:


On Wed, Dec 20, 2006 at 12:44:42PM +0100, Miklos Szeredi wrote:

The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend
the kstat.ino field to 64bit and fix those filesystems to fill in
kstat correctly.


Coda actually uses 128-bit file identifiers internally, so 64-bits
really doesn't cut it. Since the 128-bit space is used pretty sparsely
there is a hash which avoids most collisions in 32-bit i_ino space, but
not completely. I can also imagine that at some point someone wants to
implement a git-based filesystem where it would be more natural to use
160-bit SHA1 hashes as unique object identifiers.

But Coda only allows hardlinks within a single directory, and if someone
renames a hardlinked file and one of the names ends up in a different
directory we implicitly create a copy of the object. This actually
leverages off of the way we handle volume snapshots and the fact that we
use whole file caching and writes, so we only copy the metadata while
the data is 'copy-on-write'.


The problem is that if an inode number collision happens occasionally, you
get data corruption with the cp -a command --- it will just copy one file and
hardlink the other.



Any application that tries to be smart enough to keep track of which
files are hardlinked should (in my opinion) also have a way to disable
this behaviour.


If the user (or script) doesn't specify that flag, it doesn't help. I think
the best solution for these filesystems would be either to add a new syscall

int is_hardlink(char *filename1, char *filename2)
(but I know adding syscall bloat may be objectionable)
or to add a new statvfs field, ST_HAS_BROKEN_INO_T, that applications can
test to disable hardlink processing.


Mikulas


Jan




Re: Finding hardlinks

2006-12-20 Thread Miklos Szeredi
 I've come across this problem: how can a userspace program (such as for
 example cp -a) tell that two files form a hardlink? Comparing inode 
 number will break on filesystems that can have more than 2^32 files (NFS3, 
 OCFS, SpadFS; kernel developers already implemented iget5_locked for the 
 case of colliding inode numbers). Other possibilities:
 
 --- compare not only ino, but all stat entries and make sure that
   i_nlink > 1?
   --- is not 100% reliable either, only lowers failure probability
 --- create a hardlink and watch if i_nlink is increased on both files?
   --- doesn't work on read-only filesystems
 --- compare file content?
   --- cp -a won't then corrupt data at least, but will create
   hardlinks where they shouldn't be.
 
 Is there some reliable way for the cp -a command to determine that?
 Finding in kernel whether two dentries point to the same inode is trivial 
 but I am not sure how to let userspace know ... am I missing something?

The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend
the kstat.ino field to 64bit and fix those filesystems to fill in
kstat correctly.

SUSv3 requires st_ino/st_dev to be unique within a system so the
application shouldn't need to bend over backwards.

Miklos


Re: Finding hardlinks

2006-12-20 Thread Mikulas Patocka

I've come across this problem: how can a userspace program (such as for
example cp -a) tell that two files form a hardlink? Comparing inode
number will break on filesystems that can have more than 2^32 files (NFS3,
OCFS, SpadFS; kernel developers already implemented iget5_locked for the
case of colliding inode numbers). Other possibilities:

--- compare not only ino, but all stat entries and make sure that
i_nlink > 1?
--- is not 100% reliable either, only lowers failure probability
--- create a hardlink and watch if i_nlink is increased on both files?
--- doesn't work on read-only filesystems
--- compare file content?
--- cp -a won't then corrupt data at least, but will create
hardlinks where they shouldn't be.

Is there some reliable way for the cp -a command to determine that?
Finding in kernel whether two dentries point to the same inode is trivial
but I am not sure how to let userspace know ... am I missing something?


The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend
the kstat.ino field to 64bit and fix those filesystems to fill in
kstat correctly.


There is 32-bit __st_ino and 64-bit st_ino --- what is their purpose? Some 
old compatibility code?



SUSv3 requires st_ino/st_dev to be unique within a system so the
application shouldn't need to bend over backwards.


I see but kernel needs to be fixed for that. Would patches for changing 
kstat be accepted?


Mikulas


Miklos




Re: Finding hardlinks

2006-12-20 Thread Miklos Szeredi
  I've come across this problem: how can a userspace program (such as for
  example cp -a) tell that two files form a hardlink? Comparing inode
  number will break on filesystems that can have more than 2^32 files (NFS3,
  OCFS, SpadFS; kernel developers already implemented iget5_locked for the
  case of colliding inode numbers). Other possibilities:
 
  --- compare not only ino, but all stat entries and make sure that
 i_nlink > 1?
 --- is not 100% reliable either, only lowers failure probability
  --- create a hardlink and watch if i_nlink is increased on both files?
 --- doesn't work on read-only filesystems
  --- compare file content?
 --- cp -a won't then corrupt data at least, but will create
 hardlinks where they shouldn't be.
 
  Is there some reliable way for the cp -a command to determine that?
  Finding in kernel whether two dentries point to the same inode is trivial
  but I am not sure how to let userspace know ... am I missing something?
 
  The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend
  the kstat.ino field to 64bit and fix those filesystems to fill in
  kstat correctly.
 
 There is 32-bit __st_ino and 64-bit st_ino --- what is their purpose? Some 
 old compatibility code?

Yes.

  SUSv3 requires st_ino/st_dev to be unique within a system so the
  application shouldn't need to bend over backwards.
 
 I see but kernel needs to be fixed for that. Would patches for changing 
 kstat be accepted?

I don't see any problems with changing struct kstat.  There would be
reservations against changing inode.i_ino though.

So filesystems that have 64bit inodes will need a specialized
getattr() method instead of generic_fillattr().
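
A sketch of such a getattr() method, with a hypothetical "myfs" whose inodes
carry a full-width identifier alongside the 32-bit i_ino (the inode-info
layout and names are made up for illustration):

#include <linux/fs.h>
#include <linux/mount.h>
#include <linux/stat.h>

struct myfs_inode_info {
    u64 ino64;              /* hypothetical full-width file identifier */
    struct inode vfs_inode;
};

static inline struct myfs_inode_info *MYFS_I(struct inode *inode)
{
    return container_of(inode, struct myfs_inode_info, vfs_inode);
}

static int myfs_getattr(struct vfsmount *mnt, struct dentry *dentry,
                        struct kstat *stat)
{
    generic_fillattr(dentry->d_inode, stat);
    stat->ino = MYFS_I(dentry->d_inode)->ino64;  /* kstat.ino is a u64 */
    return 0;
}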

Miklos


Re: Finding hardlinks

2006-12-20 Thread Al Viro
On Wed, Dec 20, 2006 at 05:50:11PM +0100, Miklos Szeredi wrote:
 I don't see any problems with changing struct kstat.  There would be
 reservations against changing inode.i_ino though.
 
 So filesystems that have 64bit inodes will need a specialized
 getattr() method instead of generic_fillattr().

And they are already free to do so.  And no, struct kstat doesn't need
to be changed - it has u64 ino already.