Re: Finding hardlinks

2007-01-11 Thread Denis Vlasenko
On Wednesday 03 January 2007 13:42, Pavel Machek wrote: I guess that is the way to go. samefile(path1, path2) is unfortunately inherently racy. Not a problem in practice. You don't expect cp -a to reliably copy a tree which something else is modifying at the same time. Thus we assume that the

Re: Finding hardlinks

2007-01-11 Thread Denis Vlasenko
On Wednesday 03 January 2007 21:26, Frank van Maarseveen wrote: On Wed, Jan 03, 2007 at 08:31:32PM +0100, Mikulas Patocka wrote: 64-bit inode numbers space is not yet implemented on Linux --- the problem is that if you return ino = 2^32, programs compiled without -D_FILE_OFFSET_BITS=64

Re: [nfsv4] RE: Finding hardlinks

2007-01-10 Thread Benny Halevy
Nicolas Williams wrote: On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote: I agree that the way the client implements its cache is out of the protocol scope. But how do you interpret correct behavior in section 4.2.1? Clients MUST use filehandle comparisons only to improve

Re: Finding hardlinks

2007-01-10 Thread Bryan Henderson
I did cp -rl his-tree my-tree (which completed quickly), edited the two files that needed to be patched, then did diff -urp his-tree my-tree, which also completed quickly, as diff knows that if two files have the same inode, they don't need to be opened. ... download one tree from kernel.org, do
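
The workflow in this message (hardlink a tree with cp -rl, patch a few files, then diff the two trees) can be sketched in Python. This is only an illustration under stated assumptions, not how cp or diff are implemented: the helper names link_copy, replace_file, and changed_files are hypothetical, and the sketch assumes edits replace a file with a new inode, since writing in place would modify both trees at once.

```python
import filecmp
import os
import shutil


def link_copy(src, dst):
    """Recreate the directory tree at dst, hardlinking files instead of copying data."""
    shutil.copytree(src, dst, copy_function=os.link)


def replace_file(path, data):
    """Write new contents to a fresh inode so the other tree keeps the old file."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(data)
    os.replace(tmp, path)  # the new name now points at a different inode


def changed_files(old, new):
    """Return relative paths under new that differ from old, skipping shared inodes."""
    changed = []
    for root, _dirs, files in os.walk(new):
        for name in files:
            new_path = os.path.join(root, name)
            rel = os.path.relpath(new_path, new)
            old_path = os.path.join(old, rel)
            if not os.path.exists(old_path):
                changed.append(rel)
                continue
            if os.path.samefile(old_path, new_path):
                continue  # same st_dev and st_ino: no need to open either file
            if not filecmp.cmp(old_path, new_path, shallow=False):
                changed.append(rel)
    return sorted(changed)
```

The shortcut in changed_files is only as trustworthy as the filesystem's st_dev/st_ino values, which is the point under debate in this thread.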

Re: Finding hardlinks

2007-01-09 Thread Steven Rostedt
On Tue, 9 Jan 2007, Frank van Maarseveen wrote: Yes but cp -rl is typically done by _developers_ and they tend to have a better understanding of this (uh, at least within linux context I hope so). Also, just adding hard-links doesn't increase the number of inodes. No, but it increases the

Re: Finding hardlinks

2007-01-09 Thread Bryan Henderson
but you can get a large number of 1 linked files, when you copy full directories with cp -rl. Which I do a lot when developing. I've done that a few times with the Linux tree. Can you shed some light on how you use this technique? (I.e. what does it do for you?) Many people are of the opinion

Re: Finding hardlinks

2007-01-09 Thread Pavel Machek
On Tue 2007-01-09 15:43:14, Bryan Henderson wrote: but you can get a large number of 1 linked files, when you copy full directories with cp -rl. Which I do a lot when developing. I've done that a few times with the Linux tree. Can you shed some light on how you use this technique? (I.e.

Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi
No one guarantees you sane result of tar or cp -a while changing the tree. I don't see how is_samefile() could make it worse. There are several cases where changing the tree doesn't affect the correctness of the tar or cp -a result. In some of these cases using samefile() instead of

Re: Finding hardlinks

2007-01-08 Thread Pavel Machek
On Fri 2007-01-05 16:15:41, Miklos Szeredi wrote: And does it matter? If you rename a file, tar might skip it no matter of hardlink detection (if readdir races with rename, you can read none of the names of file, one or both --- all these are possible). If you have dir1/a

Re: Finding hardlinks

2007-01-08 Thread Pavel Machek
Hi! No one guarantees you sane result of tar or cp -a while changing the tree. I don't see how is_samefile() could make it worse. There are several cases where changing the tree doesn't affect the correctness of the tar or cp -a result. In some of these cases using

Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi
There's really no point trying to push for such an inferior interface when the problems which samefile is trying to address are purely theoretical. Oh yes, there is. st_ino is powerful, *but impossible to implement* on many filesystems. You mean POSIX compliance is impossible? So what?

Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi
You mean POSIX compliance is impossible? So what? It is possible to implement an approximation that is _at least_ as good as samefile(). One really dumb way is to set st_ino to the 'struct inode' pointer for example. That will sure as hell fit into 64bits and will give a unique (alas

Re: Finding hardlinks

2007-01-08 Thread Martin Mares
Hello! You mean POSIX compliance is impossible? So what? It is possible to implement an approximation that is _at least_ as good as samefile(). One really dumb way is to set st_ino to the 'struct inode' pointer for example. That will sure as hell fit into 64bits and will give a unique

Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi
High probability is all you have. Cosmic radiation hitting your computer will more likely cause problems than colliding 64bit inode numbers ;) Some of us have machines designed to cope with cosmic rays, and would be unimpressed with a decrease in reliability. With the

Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Trond Myklebust
On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote: Trond Myklebust wrote: Exactly where do you see us violating the close-to-open cache consistency guarantees? I haven't seen that. What I did see is cache inconsistency when opening the same file with different file descriptors when

Re: Finding hardlinks

2007-01-05 Thread Pavel Machek
Hi! Some of us have machines designed to cope with cosmic rays, and would be unimpressed with a decrease in reliability. With the suggested samefile() interface you'd get a failure with just about 100% reliability for any application which needs to compare more than a few

Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi
And does it matter? If you rename a file, tar might skip it no matter of hardlink detection (if readdir races with rename, you can read none of the names of file, one or both --- all these are possible). If you have dir1/a hardlinked to dir1/b and while tar runs you delete both a and b

Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi
And does it matter? If you rename a file, tar might skip it no matter of hardlink detection (if readdir races with rename, you can read none of the names of file, one or both --- all these are possible). If you have dir1/a hardlinked to dir1/b and while tar runs you delete both a

Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Trond Myklebust
On Fri, 2007-01-05 at 10:40 -0600, Nicolas Williams wrote: What I don't understand is why getting the fileid is so hard -- always GETATTR when you GETFH and you'll be fine. I'm guessing that's not as difficult as it is to maintain a hash table of fileids. You've been sleeping in class. We

RE: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Noveck, Dave
On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote: Trond Myklebust wrote: Exactly where do you see us violating the close-to-open cache consistency guarantees? I haven't seen that. What I did see is cache inconsistency when opening the same

RFC: Stable inodes for inode-less filesystems (was: Finding hardlinks)

2007-01-05 Thread Bodo Eggert
Pavel Machek wrote: Another idea is to export the filesystem internal ID as an arbitrary length cookie through the extended attribute interface. That could be stored/compared by the filesystem quite efficiently. How will that work for FAT? Or maybe we can relax that

Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Trond Myklebust
On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote: I sincerely expect you or anybody else for this matter to try to provide feedback and object to the protocol specification in case they disagree with it (or think it's ambiguous or self contradicting) rather than ignoring it and

Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Benny Halevy
Trond Myklebust wrote: On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote: I sincerely expect you or anybody else for this matter to try to provide feedback and object to the protocol specification in case they disagree with it (or think it's ambiguous or self contradicting) rather than

Re: Finding hardlinks

2007-01-04 Thread Nikita Danilov
Mikulas Patocka writes: BTW. How does ReiserFS find that a given inode number (or object ID in ReiserFS terminology) is free before assigning it to new file/directory? reiserfs v3 has an extent map of free object identifiers in super-block. Inode free space can have at most

Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Peter Staubach
Bryan Henderson wrote: Clients MUST use filehandle comparisons only to improve performance, not for correct behavior. All clients need to be prepared for situations in which it cannot be determined whether two filehandles denote the same object and in such cases, avoid making invalid assumptions

Re: Finding hardlinks

2007-01-04 Thread Pavel Machek
Hi! High probability is all you have. Cosmic radiation hitting your computer will more likely cause problems than colliding 64bit inode numbers ;) Some of us have machines designed to cope with cosmic rays, and would be unimpressed with a decrease in reliability. With the

Re: Finding hardlinks

2007-01-03 Thread Pavel Machek
Hi! the use of a good hash function. The chance of an accidental collision is infinitesimally small. For a set of 100 files: 0.03% 1,000,000 files: 0.03% I do not think we want to play with probability like this. I mean... imagine 4G files,

Re: Finding hardlinks

2007-01-03 Thread Miklos Szeredi
the use of a good hash function. The chance of an accidental collision is infinitesimally small. For a set of 100 files: 0.03% 1,000,000 files: 0.03% I do not think we want to play with probability like this. I mean... imagine 4G files,

Re: [nfsv4] RE: Finding hardlinks

2007-01-03 Thread Benny Halevy
Trond Myklebust wrote: On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote: Trond Myklebust wrote: On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote: Mikulas Patocka wrote: BTW. how does (or how should?) NFS client deal with cache coherency if filehandles for the same file

Re: Finding hardlinks

2007-01-03 Thread Pavel Machek
Hi! the use of a good hash function. The chance of an accidental collision is infinitesimally small. For a set of 100 files: 0.03% 1,000,000 files: 0.03% I do not think we want to play with probability like this. I mean...

Re: Finding hardlinks

2007-01-03 Thread Martin Mares
Hello! High probability is all you have. Cosmic radiation hitting your computer will more likely cause problems than colliding 64bit inode numbers ;) No. If you assign 64-bit inode numbers randomly, 2^32 of them are sufficient to generate a collision with probability around 50%.
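
The collision figure in this message follows from the birthday bound. As a rough check, assuming inode numbers are drawn uniformly at random, the standard approximation p = 1 - exp(-n(n-1)/(2N)) can be evaluated directly. The function name collision_probability is hypothetical, and the approximation is a textbook estimate rather than anything taken from the thread.

```python
import math


def collision_probability(n_files, id_bits=64):
    """Approximate chance that at least two of n_files random IDs collide.

    Uses the birthday-bound estimate 1 - exp(-n(n-1) / (2N)), N = 2**id_bits.
    """
    space = 2 ** id_bits
    exponent = n_files * (n_files - 1) / (2 * space)
    return -math.expm1(-exponent)  # expm1 stays accurate for tiny exponents


if __name__ == "__main__":
    # 2**32 files with 64-bit random IDs: roughly 0.39 under this estimate
    print(collision_probability(2 ** 32))
```

Under this estimate 2^32 random 64-bit numbers collide with probability of about 39%, the same order as the "around 50%" quoted above.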

Re: Finding hardlinks

2007-01-03 Thread Matthew Wilcox
On Wed, Jan 03, 2007 at 01:33:31PM +0100, Miklos Szeredi wrote: High probability is all you have. Cosmic radiation hitting your computer will more likely cause problems than colliding 64bit inode numbers ;) Some of us have machines designed to cope with cosmic rays, and would be unimpressed

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote: I didn't hardlink directories, I just patched stat, lstat and fstat to always return st_ino == 0 --- and I've seen those failures. These failures are going to happen on non-POSIX filesystems in real world too, very rarely.

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka
On Wed, 3 Jan 2007, Frank van Maarseveen wrote: On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote: I didn't hardlink directories, I just patched stat, lstat and fstat to always return st_ino == 0 --- and I've seen those failures. These failures are going to happen on non-POSIX

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka
I didn't hardlink directories, I just patched stat, lstat and fstat to always return st_ino == 0 --- and I've seen those failures. These failures are going to happen on non-POSIX filesystems in real world too, very rarely. I don't want to spoil your day but testing with st_ino==0 is a bad

Re: Finding hardlinks

2007-01-03 Thread Bryan Henderson
On any decent filesystem st_ino should uniquely identify an object and reliably provide hardlink information. The UNIX world has relied upon this for decades. A filesystem with st_ino collisions without being hardlinked (or the other way around) needs a fix. But for at least the last of those

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Wed, Jan 03, 2007 at 01:09:41PM -0800, Bryan Henderson wrote: On any decent filesystem st_ino should uniquely identify an object and reliably provide hardlink information. The UNIX world has relied upon this for decades. A filesystem with st_ino collisions without being hardlinked (or the

Re: Finding hardlinks

2007-01-03 Thread Pavel Machek
Hi! Sure it is. Numerous popular POSIX filesystems do that. There is a lot of inode number space in 64 bit (of course it is a matter of time for it to jump to 128 bit and more) If the filesystem was designed by someone not from Unix world (FAT, SMB, ...), then not. And users still want to

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Thu, Jan 04, 2007 at 12:43:20AM +0100, Mikulas Patocka wrote: On Wed, 3 Jan 2007, Frank van Maarseveen wrote: Currently, large file support is already necessary to handle dvd and video. It's also useful for images for virtualization. So the failing stat() calls should already be a thing

Re: [nfsv4] RE: Finding hardlinks

2007-01-03 Thread Trond Myklebust
On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote: Believe it or not, but server companies like Panasas try to follow the standard when designing and implementing their products while relying on client vendors to do the same. I personally have never given a rats arse about standards if

Re: Finding hardlinks

2007-01-02 Thread Pavel Machek
Hi! It seems like the posix idea of unique st_dev, st_ino doesn't hold water for modern file systems are you really sure? Well Jan's example was of Coda that uses 128-bit internal file ids. and if so, why don't we fix *THAT* instead Hmm, sometimes you can't fix the

Re: Finding hardlinks

2007-01-02 Thread Miklos Szeredi
It seems like the posix idea of unique st_dev, st_ino doesn't hold water for modern file systems are you really sure? Well Jan's example was of Coda that uses 128-bit internal file ids. and if so, why don't we fix *THAT* instead Hmm, sometimes you can't fix

Re: Finding hardlinks

2007-01-02 Thread Mikulas Patocka
On Tue, 2 Jan 2007, Miklos Szeredi wrote: It seems like the posix idea of unique st_dev, st_ino doesn't hold water for modern file systems are you really sure? Well Jan's example was of Coda that uses 128-bit internal file ids. and if so, why don't we fix *THAT* instead Hmm, sometimes

RE: [nfsv4] RE: Finding hardlinks

2007-01-02 Thread Trond Myklebust
On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote: Trond Myklebust wrote: On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote: Mikulas Patocka wrote: BTW. how does (or how should?) NFS client deal with cache coherency if filehandles for the same file differ?

RE: Finding hardlinks

2007-01-02 Thread Trond Myklebust
On Sun, 2006-12-31 at 16:19 -0500, Halevy, Benny wrote: Even for NFSv3 (that doesn't have the unique_handles attribute) I think that the linux nfs client can do a better job. If you'd have a filehandle cache that points at inodes you could maintain a many to one relationship from multiple

Re: Finding hardlinks

2007-01-02 Thread Mikulas Patocka
On Wed, 3 Jan 2007, Trond Myklebust wrote: On Sat, 2006-12-30 at 02:04 +0100, Mikulas Patocka wrote: On Fri, 29 Dec 2006, Trond Myklebust wrote: On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote: Why don't you rip off the support for colliding inode number from the kernel at all

Re: Finding hardlinks

2007-01-01 Thread Pavel Machek
Hi! If user (or script) doesn't specify that flag, it doesn't help. I think the best solution for these filesystems would be either to add new syscall int is_hardlink(char *filename1, char *filename2) (but I know adding syscall bloat may be objectionable) it's also the wrong api;

Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka
Hi! If user (or script) doesn't specify that flag, it doesn't help. I think the best solution for these filesystems would be either to add new syscall int is_hardlink(char *filename1, char *filename2) (but I know adding syscall bloat may be objectionable) it's also the wrong api; the

Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka
The question is: why does the kernel contain iget5 function that looks up according to callback, if the filesystem cannot have more than 64-bit inode identifier? Generally speaking, file system might have two different identifiers for files: - one that makes it easy to tell whether two files

Re: Finding hardlinks

2007-01-01 Thread Nikita Danilov
Mikulas Patocka writes: [...] BTW. How does ReiserFS find that a given inode number (or object ID in ReiserFS terminology) is free before assigning it to new file/directory? reiserfs v3 has an extent map of free object identifiers in super-block. reiser4 used 64 bit object identifiers

Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka
BTW. How does ReiserFS find that a given inode number (or object ID in ReiserFS terminology) is free before assigning it to new file/directory? reiserfs v3 has an extent map of free object identifiers in super-block. Inode free space can have at most 2^31 extents --- if inode numbers

Re: Finding hardlinks

2007-01-01 Thread Jan Harkes
On Mon, Jan 01, 2007 at 11:47:06PM +0100, Mikulas Patocka wrote: Anyway, cp -a is not the only application that wants to do hardlink detection. I tested programs for ino_t collision (I intentionally injected it) and found that CP from coreutils 6.7 fails to copy directories but displays

Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka
On Mon, 1 Jan 2007, Jan Harkes wrote: On Mon, Jan 01, 2007 at 11:47:06PM +0100, Mikulas Patocka wrote: Anyway, cp -a is not the only application that wants to do hardlink detection. I tested programs for ino_t collision (I intentionally injected it) and found that CP from coreutils 6.7 fails

Re: Finding hardlinks

2006-12-31 Thread Mikulas Patocka
On Wed, 20 Dec 2006, Al Viro wrote: On Wed, Dec 20, 2006 at 05:50:11PM +0100, Miklos Szeredi wrote: I don't see any problems with changing struct kstat. There would be reservations against changing inode.i_ino though. So filesystems that have 64bit inodes will need a specialized getattr()

RE: [nfsv4] RE: Finding hardlinks

2006-12-31 Thread Halevy, Benny
Trond Myklebust wrote: On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote: Mikulas Patocka wrote: BTW. how does (or how should?) NFS client deal with cache coherency if filehandles for the same file differ? Trond can probably answer this better than me... As I read it,

Re: Finding hardlinks

2006-12-31 Thread Nikita Danilov
Mikulas Patocka writes: On Fri, 29 Dec 2006, Trond Myklebust wrote: On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote: Why don't you rip off the support for colliding inode number from the kernel at all (i.e. remove iget5_locked)? It's reasonable to have either no

Re: Finding hardlinks

2006-12-29 Thread Arjan van de Ven
On Thu, 2006-12-28 at 16:44 -0800, Bryan Henderson wrote: Statement 1: If two files have identical st_dev and st_ino, they MUST be hardlinks of each other/the same file. Statement 2: If two files are a hardlink of each other, they MUST be detectable (for example by having the same

Re: Finding hardlinks

2006-12-29 Thread Trond Myklebust
On Thu, 2006-12-28 at 17:12 +0200, Benny Halevy wrote: As an example, some file systems encode hint information into the filehandle and the hints may change over time, another example is encoding parent information into the filehandle and then handles representing hard links to the same file

Re: [nfsv4] RE: Finding hardlinks

2006-12-29 Thread Trond Myklebust
On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote: Mikulas Patocka wrote: BTW. how does (or how should?) NFS client deal with cache coherency if filehandles for the same file differ? Trond can probably answer this better than me... As I read it, currently the nfs client matches

Re: Finding hardlinks

2006-12-29 Thread Phillip Lougher
On 29 Dec 2006, at 08:41, Arjan van de Ven wrote: I think statement 2 is extremely important. Without this guarantee applications have to guess which files are hardlinks. Any guessing is going to be got wrong sometimes with potentially disastrous results. actually no. Statement 1 will

Re: Finding hardlinks

2006-12-29 Thread Arjan van de Ven
Actually no. Statement 2 for me is important in terms of archive correctness. With my archiver program Mksquashfs, if the two files are the same, and filesystem says they're hardlinks, I make them hardlinks in the Squashfs filesystem, otherwise they're stored as duplicates (same

Re: Finding hardlinks

2006-12-29 Thread Bryan Henderson
On Thu, 2006-12-28 at 16:44 -0800, Bryan Henderson wrote: Statement 1: If two files have identical st_dev and st_ino, they MUST be hardlinks of each other/the same file. Statement 2: If two files are a hardlink of each other, they MUST be detectable (for example by having the same

Re: Finding hardlinks

2006-12-29 Thread Arjan van de Ven
On Fri, 2006-12-29 at 10:08 -0800, Bryan Henderson wrote: On Thu, 2006-12-28 at 16:44 -0800, Bryan Henderson wrote: Statement 1: If two files have identical st_dev and st_ino, they MUST be hardlinks of each other/the same file. Statement 2: If two files are a hardlink of each

Re: Finding hardlinks

2006-12-29 Thread Bryan Henderson
On Fri, 2006-12-29 at 10:08 -0800, Bryan Henderson wrote: On Thu, 2006-12-28 at 16:44 -0800, Bryan Henderson wrote: Statement 1: If two files have identical st_dev and st_ino, they MUST be hardlinks of each other/the same file. Statement 2: If two files are a hardlink of each

Re: Finding hardlinks

2006-12-29 Thread Mikulas Patocka
On Fri, 29 Dec 2006, Trond Myklebust wrote: On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote: Why don't you rip off the support for colliding inode number from the kernel at all (i.e. remove iget5_locked)? It's reasonable to have either no support for colliding ino_t or full support

Re: Finding hardlinks

2006-12-28 Thread Arjan van de Ven
It seems like the posix idea of unique st_dev, st_ino doesn't hold water for modern file systems are you really sure? and if so, why don't we fix *THAT* instead, rather than adding racy syscalls and such that just can't really be used right... -- if you want to mail me at work (you don't),

Re: Finding hardlinks

2006-12-28 Thread Jeff Layton
Benny Halevy wrote: It seems like the posix idea of unique st_dev, st_ino doesn't hold water for modern file systems and that creates real problems for backup apps which rely on that to detect hard links. Why not? Granted, many of the filesystems in the Linux kernel don't enforce that they

Re: Finding hardlinks

2006-12-28 Thread Benny Halevy
Jeff Layton wrote: Benny Halevy wrote: It seems like the posix idea of unique st_dev, st_ino doesn't hold water for modern file systems and that creates real problems for backup apps which rely on that to detect hard links. Why not? Granted, many of the filesystems in the Linux kernel

Re: Finding hardlinks

2006-12-28 Thread Jeff Layton
parent information into the filehandle and then handles representing hard links to the same file from different directories will differ. Interesting. That does seem to break the method of st_dev/st_ino for finding hardlinks. For Linux fileservers I think we generally do have 1:1 correspondence

Re: Finding hardlinks

2006-12-28 Thread Jan Engelhardt
On Dec 28 2006 10:54, Jeff Layton wrote: Sorry, I should qualify that statement. A lot of filesystems don't have permanent i_ino values (mostly pseudo filesystems -- pipefs, sockfs, /proc stuff, etc). For those, the idea is to try to make sure we use 32 bit values for them and to ensure that

Re: Finding hardlinks

2006-12-28 Thread Bryan Henderson
Adding a vfs call to check for file equivalence seems like a good idea to me. That would be only barely useful. It would let 'diff' say, those are both the same file, but wouldn't be useful for something trying to duplicate a filesystem (e.g. a backup program). Such a program can't do the

Re: Finding hardlinks

2006-12-28 Thread Arjan van de Ven
If it's important to know that two names refer to the same file in a remote filesystem, I don't see any way around adding a new concept of file identifier to the protocol. actually there are 2 separate issues at hand, and this thread sort of confuses them into one: Statement 1: If two

Re: Finding hardlinks

2006-12-28 Thread Miklos Szeredi
It seems like the posix idea of unique st_dev, st_ino doesn't hold water for modern file systems are you really sure? Well Jan's example was of Coda that uses 128-bit internal file ids. and if so, why don't we fix *THAT* instead Hmm, sometimes you can't fix the world,

RE: Finding hardlinks

2006-12-28 Thread Halevy, Benny
Mikulas Patocka wrote: This sounds like a bug to me. It seems like we should have a one to one correspondence of filehandle - inode. In what situations would this not be the case? Well, the NFS protocol allows that [see rfc1813, p. 21: If two file handles from the same server are

RE: Finding hardlinks

2006-12-28 Thread Halevy, Benny
Bryan Henderson wrote: Adding a vfs call to check for file equivalence seems like a good idea to me. That would be only barely useful. It would let 'diff' say, those are both the same file, but wouldn't be useful for something trying to duplicate a filesystem (e.g. a backup program). Such

Re: Finding hardlinks

2006-12-28 Thread Bryan Henderson
Statement 1: If two files have identical st_dev and st_ino, they MUST be hardlinks of each other/the same file. Statement 2: If two files are a hardlink of each other, they MUST be detectable (for example by having the same st_dev/st_ino) I personally consider statement 1 a mandatory requirement
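
How a backup or archiving program depends on both statements can be sketched in Python. This is a minimal illustration, not the algorithm of tar, cp, or any other tool named in the thread; the function name hardlink_groups is hypothetical. It treats equal (st_dev, st_ino) pairs as the same file, so it is correct only where Statement 1 holds, and it finds every link only where Statement 2 holds.

```python
import os
from collections import defaultdict


def hardlink_groups(top):
    """Group paths under top that share the same (st_dev, st_ino) pair."""
    seen = defaultdict(list)
    for root, _dirs, files in os.walk(top):
        for name in files:
            path = os.path.join(root, name)
            st = os.lstat(path)
            if st.st_nlink > 1:  # a file with one link cannot have another name
                seen[(st.st_dev, st.st_ino)].append(path)
    # Keep only identities reached through more than one name inside top.
    return [sorted(paths) for paths in seen.values() if len(paths) > 1]
```

If a filesystem reports the same pair for unrelated files, this sketch would merge them into one group, which is the data-loss case that makes Statement 1 the stricter requirement.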

Re: Finding hardlinks

2006-12-23 Thread Arjan van de Ven
If user (or script) doesn't specify that flag, it doesn't help. I think the best solution for these filesystems would be either to add new syscall int is_hardlink(char *filename1, char *filename2) (but I know adding syscall bloat may be objectionable) it's also the wrong api; the

Re: Finding hardlinks

2006-12-23 Thread Mikulas Patocka
If user (or script) doesn't specify that flag, it doesn't help. I think the best solution for these filesystems would be either to add new syscall int is_hardlink(char *filename1, char *filename2) (but I know adding syscall bloat may be objectionable) it's also the wrong api; the

Re: Finding hardlinks

2006-12-21 Thread Jan Harkes
On Wed, Dec 20, 2006 at 12:44:42PM +0100, Miklos Szeredi wrote: The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend the kstat.ino field to 64bit and fix those filesystems to fill in kstat correctly. Coda actually uses 128-bit file identifiers internally, so 64-bits really

Re: Finding hardlinks

2006-12-21 Thread Mikulas Patocka
On Thu, 21 Dec 2006, Jan Harkes wrote: On Wed, Dec 20, 2006 at 12:44:42PM +0100, Miklos Szeredi wrote: The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend the kstat.ino field to 64bit and fix those filesystems to fill in kstat correctly. Coda actually uses 128-bit file

Re: Finding hardlinks

2006-12-20 Thread Miklos Szeredi
I've come across this problem: how can a userspace program (such as for example cp -a) tell that two files form a hardlink? Comparing inode number will break on filesystems that can have more than 2^32 files (NFS3, OCFS, SpadFS; kernel developers already implemented iget5_locked for the

Re: Finding hardlinks

2006-12-20 Thread Mikulas Patocka
I've come across this problem: how can a userspace program (such as for example cp -a) tell that two files form a hardlink? Comparing inode number will break on filesystems that can have more than 2^32 files (NFS3, OCFS, SpadFS; kernel developers already implemented iget5_locked for the case of

Re: Finding hardlinks

2006-12-20 Thread Miklos Szeredi
I've come across this problem: how can a userspace program (such as for example cp -a) tell that two files form a hardlink? Comparing inode number will break on filesystems that can have more than 2^32 files (NFS3, OCFS, SpadFS; kernel developers already implemented iget5_locked for the

Re: Finding hardlinks

2006-12-20 Thread Al Viro
On Wed, Dec 20, 2006 at 05:50:11PM +0100, Miklos Szeredi wrote: I don't see any problems with changing struct kstat. There would be reservations against changing inode.i_ino though. So filesystems that have 64bit inodes will need a specialized getattr() method instead of generic_fillattr().