Re: Finding hardlinks

2007-01-11 Thread Denis Vlasenko
On Wednesday 03 January 2007 21:26, Frank van Maarseveen wrote: > On Wed, Jan 03, 2007 at 08:31:32PM +0100, Mikulas Patocka wrote: > > 64-bit inode numbers space is not yet implemented on Linux --- the problem > > is that if you return ino >= 2^32, programs compiled without > > -D_FILE_OFFSET_BIT

Re: Finding hardlinks

2007-01-11 Thread Denis Vlasenko
On Wednesday 03 January 2007 13:42, Pavel Machek wrote: > I guess that is the way to go. samefile(path1, path2) is unfortunately > inherently racy. Not a problem in practice. You don't expect cp -a to reliably copy a tree which something else is modifying at the same time. Thus we assume that the

Re: Finding hardlinks

2007-01-11 Thread Denis Vlasenko
On Thursday 28 December 2006 10:06, Benny Halevy wrote: > Mikulas Patocka wrote: > >>> If user (or script) doesn't specify that flag, it doesn't help. I think > >>> the best solution for these filesystems would be either to add new syscall > >>> int is_hardlink(char *filename1, char *filename2) >

Re: Finding hardlinks

2007-01-11 Thread Pádraig Brady
Frank van Maarseveen wrote: > On Tue, Jan 09, 2007 at 11:26:25AM -0500, Steven Rostedt wrote: >> On Mon, 2007-01-08 at 13:00 +0100, Miklos Szeredi wrote: >> 50% probability of false positive on 4G files seems like very ugly design problem to me. >>> 4 billion files, each with more than on

Re: [nfsv4] RE: Finding hardlinks

2007-01-10 Thread Benny Halevy
Nicolas Williams wrote: > On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote: >> I agree that the way the client implements its cache is out of the protocol >> scope. But how do you interpret "correct behavior" in section 4.2.1? >> "Clients MUST use filehandle comparisons only to improve

Re: Finding hardlinks

2007-01-09 Thread Steven Rostedt
On Tue, 9 Jan 2007, Frank van Maarseveen wrote: > > Yes but "cp -rl" is typically done by _developers_ and they tend to > have a better understanding of this (uh, at least within linux context > I hope so). > > Also, just adding hard-links doesn't increase the number of inodes. No, but it increa

Re: Finding hardlinks

2007-01-09 Thread Frank van Maarseveen
On Tue, Jan 09, 2007 at 11:26:25AM -0500, Steven Rostedt wrote: > On Mon, 2007-01-08 at 13:00 +0100, Miklos Szeredi wrote: > > > > 50% probability of false positive on 4G files seems like very ugly > > > design problem to me. > > > > 4 billion files, each with more than one link is pretty far fet

Re: Finding hardlinks

2007-01-09 Thread Steven Rostedt
On Mon, 2007-01-08 at 13:00 +0100, Miklos Szeredi wrote: > > 50% probability of false positive on 4G files seems like very ugly > > design problem to me. > > 4 billion files, each with more than one link is pretty far fetched. > And anyway, filesystems can take steps to prevent collisions, as the

Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi
> > You mean POSIX compliance is impossible? So what? It is possible to > > implement an approximation that is _at least_ as good as samefile(). > > One really dumb way is to set st_ino to the 'struct inode' pointer for > > example. That will sure as hell fit into 64bits and will give a > > uniq

Re: Finding hardlinks

2007-01-08 Thread Martin Mares
Hello! > You mean POSIX compliance is impossible? So what? It is possible to > implement an approximation that is _at least_ as good as samefile(). > One really dumb way is to set st_ino to the 'struct inode' pointer for > example. That will sure as hell fit into 64bits and will give a > unique

Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi
> > There's really no point trying to push for such an inferior interface > > when the problems which samefile is trying to address are purely > > theoretical. > > Oh yes, there is. st_ino is powerful, *but impossible to implement* > on many filesystems. You mean POSIX compliance is impossible?

Re: Finding hardlinks

2007-01-08 Thread Pavel Machek
Hi! > > >> No one guarantees you sane result of tar or cp -a while changing the > > >> tree. > > >> I don't see how is_samefile() could make it worse. > > > > > > There are several cases where changing the tree doesn't affect the > > > correctness of the tar or cp -a result. In some of these cas

Re: Finding hardlinks

2007-01-08 Thread Pavel Machek
On Fri 2007-01-05 16:15:41, Miklos Szeredi wrote: > > > And does it matter? If you rename a file, tar might skip it no matter of > > > hardlink detection (if readdir races with rename, you can read none of > > > the > > > names of file, one or both --- all these are possible). > > > > > > If yo

Re: Finding hardlinks

2007-01-08 Thread Miklos Szeredi
> >> No one guarantees you sane result of tar or cp -a while changing the tree. > >> I don't see how is_samefile() could make it worse. > > > > There are several cases where changing the tree doesn't affect the > > correctness of the tar or cp -a result. In some of these cases using > > samefile()

Re: Finding hardlinks

2007-01-07 Thread Mikulas Patocka
Currently, large file support is already necessary to handle dvd and video. It's also useful for images for virtualization. So the failing stat() calls should already be a thing of the past with modern distributions. As long as glibc compiles by default with 32-bit ino_t, the problem exists and

Re: Finding hardlinks

2007-01-07 Thread Mikulas Patocka
And does it matter? If you rename a file, tar might skip it no matter of hardlink detection (if readdir races with rename, you can read none of the names of file, one or both --- all these are possible). If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delete both "a" and "b" an

RE: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Halevy, Benny
> On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote: > > I agree that the way the client implements its cache is out of the protocol > > scope. But how do you interpret "correct behavior" in section 4.2

Re: Finding hardlinks

2007-01-05 Thread Bodo Eggert
Miklos Szeredi <[EMAIL PROTECTED]> wrote: >> > Well, sort of. Samefile without keeping fds open doesn't have any >> > protection against the tree changing underneath between first >> > registering a file and later opening it. The inode number is more >> >> You only need to keep one-file-per-har

RE: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Noveck, Dave
On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote: > Trond Myklebust wrote: > > Exactly where do you

Re: Finding hardlinks

2007-01-05 Thread Frank van Maarseveen
On Fri, Jan 05, 2007 at 09:43:22AM +0100, Miklos Szeredi wrote: > > > > > High probability is all you have. Cosmic radiation hitting your > > > > > computer will more likely cause problems, than colliding 64bit inode > > > > > numbers ;) > > > > > > > > Some of us have machines designed to cope wi

Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Nicolas Williams
On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote: > I agree that the way the client implements its cache is out of the protocol > scope. But how do you interpret "correct behavior" in section 4.2.1? > "Clients MUST use filehandle comparisons only to improve performance, not > for corr

Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Trond Myklebust
On Fri, 2007-01-05 at 10:40 -0600, Nicolas Williams wrote: > What I don't understand is why getting the fileid is so hard -- always > GETATTR when you GETFH and you'll be fine. I'm guessing that's not as > difficult as it is to maintain a hash table of fileids. You've been sleeping in class. We a

Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi
> > And does it matter? If you rename a file, tar might skip it no matter of > > hardlink detection (if readdir races with rename, you can read none of the > > names of file, one or both --- all these are possible). > > > > If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delet

Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi
> And does it matter? If you rename a file, tar might skip it no matter of > hardlink detection (if readdir races with rename, you can read none of the > names of file, one or both --- all these are possible). > > If you have "dir1/a" hardlinked to "dir1/b" and while tar runs you delete > both

Re: Finding hardlinks

2007-01-05 Thread Mikulas Patocka
Well, sort of. Samefile without keeping fds open doesn't have any protection against the tree changing underneath between first registering a file and later opening it. The inode number is more You only need to keep one-file-per-hardlink-group open during final verification, checking that inod

Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi
> > Well, sort of. Samefile without keeping fds open doesn't have any > > protection against the tree changing underneath between first > > registering a file and later opening it. The inode number is more > > You only need to keep one-file-per-hardlink-group open during final > verification, ch

Re: Finding hardlinks

2007-01-05 Thread Pavel Machek
Hi! > > > > Some of us have machines designed to cope with cosmic rays, and would be > > > > unimpressed with a decrease in reliability. > > > > > > With the suggested samefile() interface you'd get a failure with just > > > about 100% reliability for any application which needs to compare a > >

Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Trond Myklebust
On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote: > Trond Myklebust wrote: > > Exactly where do you see us violating the close-to-open cache > > consistency guarantees? > > > > I haven't seen that. What I did see is cache inconsistency when opening > the same file with different file descrip

Re: Finding hardlinks

2007-01-05 Thread Miklos Szeredi
> > > > High probability is all you have. Cosmic radiation hitting your > > > > computer will more likely cause problems, than colliding 64bit inode > > > > numbers ;) > > > > > > Some of us have machines designed to cope with cosmic rays, and would be > > > unimpressed with a decrease in reliabil

Re: [nfsv4] RE: Finding hardlinks

2007-01-05 Thread Benny Halevy
Trond Myklebust wrote: > On Thu, 2007-01-04 at 12:04 +0200, Benny Halevy wrote: >> I agree that the way the client implements its cache is out of the protocol >> scope. But how do you interpret "correct behavior" in section 4.2.1? >> "Clients MUST use filehandle comparisons only to improve perform

Re: Finding hardlinks

2007-01-04 Thread Pavel Machek
Hi! > > > High probability is all you have. Cosmic radiation hitting your > > > computer will more likely cause problems, than colliding 64bit inode > > > numbers ;) > > > > Some of us have machines designed to cope with cosmic rays, and would be > > unimpressed with a decrease in reliability. >

Re: Finding hardlinks

2007-01-04 Thread Nikita Danilov
Mikulas Patocka writes: > > > BTW. How does ReiserFS find that a given inode number (or object ID in > > > ReiserFS terminology) is free before assigning it to new file/directory? > > > > reiserfs v3 has an extent map of free object identifiers in > > super-block. > > Inode free space can h

Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Trond Myklebust
On Thu, 2007-01-04 at 12:04 +0200, Benny Halevy wrote: > I agree that the way the client implements its cache is out of the protocol > scope. But how do you interpret "correct behavior" in section 4.2.1? > "Clients MUST use filehandle comparisons only to improve performance, not > for correct beh

Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Benny Halevy
Trond Myklebust wrote: > On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote: >> I sincerely expect you or anybody else for this matter to try to provide >> feedback and object to the protocol specification in case they disagree >> with it (or think it's ambiguous or self contradicting) rather t

Re: [nfsv4] RE: Finding hardlinks

2007-01-04 Thread Trond Myklebust
On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote: > I sincerely expect you or anybody else for this matter to try to provide > feedback and object to the protocol specification in case they disagree > with it (or think it's ambiguous or self contradicting) rather than ignoring > it and impleme

Re: [nfsv4] RE: Finding hardlinks

2007-01-03 Thread Trond Myklebust
On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote: > Believe it or not, but server companies like Panasas try to follow the > standard > when designing and implementing their products while relying on client vendors > to do the same. I personally have never given a rats arse about "standards"

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Thu, Jan 04, 2007 at 12:43:20AM +0100, Mikulas Patocka wrote: > On Wed, 3 Jan 2007, Frank van Maarseveen wrote: > >Currently, large file support is already necessary to handle dvd and > >video. It's also useful for images for virtualization. So the failing > >stat() > >calls should already be a

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka
On Wed, 3 Jan 2007, Frank van Maarseveen wrote: On Wed, Jan 03, 2007 at 01:09:41PM -0800, Bryan Henderson wrote: On any decent filesystem st_ino should uniquely identify an object and reliably provide hardlink information. The UNIX world has relied upon this for decades. A filesystem with st_

Re: Finding hardlinks

2007-01-03 Thread Pavel Machek
Hi! > >Sure it is. Numerous popular POSIX filesystems do that. There is a lot of > >inode number space in 64 bit (of course it is a matter of time for it to > >jump to 128 bit and more) > > If the filesystem was designed by someone not from Unix world (FAT, SMB, > ...), then not. And users still

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Wed, Jan 03, 2007 at 01:09:41PM -0800, Bryan Henderson wrote: > >On any decent filesystem st_ino should uniquely identify an object and > >reliably provide hardlink information. The UNIX world has relied upon > this > >for decades. A filesystem with st_ino collisions without being hardlinked >

Re: Finding hardlinks

2007-01-03 Thread Bryan Henderson
>On any decent filesystem st_ino should uniquely identify an object and >reliably provide hardlink information. The UNIX world has relied upon this >for decades. A filesystem with st_ino collisions without being hardlinked >(or the other way around) needs a fix. But for at least the last of those

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Wed, Jan 03, 2007 at 08:31:32PM +0100, Mikulas Patocka wrote: > I didn't hardlink directories, I just patched stat, lstat and fstat to > always return st_ino == 0 --- and I've seen those failures. These > failures > are going to happen on non-POSIX filesystems in real world too,

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka
I didn't hardlink directories, I just patched stat, lstat and fstat to always return st_ino == 0 --- and I've seen those failures. These failures are going to happen on non-POSIX filesystems in real world too, very rarely. I don't want to spoil your day but testing with st_ino==0 is a bad choice

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Wed, Jan 03, 2007 at 08:17:34PM +0100, Mikulas Patocka wrote: > > On Wed, 3 Jan 2007, Frank van Maarseveen wrote: > > >On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote: > >> > >>I didn't hardlink directories, I just patched stat, lstat and fstat to > >>always return st_ino == 0

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka
On Wed, 3 Jan 2007, Frank van Maarseveen wrote: On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote: I didn't hardlink directories, I just patched stat, lstat and fstat to always return st_ino == 0 --- and I've seen those failures. These failures are going to happen on non-POSIX

Re: Finding hardlinks

2007-01-03 Thread Frank van Maarseveen
On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote: > > I didn't hardlink directories, I just patched stat, lstat and fstat to > always return st_ino == 0 --- and I've seen those failures. These failures > are going to happen on non-POSIX filesystems in real world too, very > rarel

Re: Finding hardlinks

2007-01-03 Thread Mikulas Patocka
On Wed, 3 Jan 2007, Miklos Szeredi wrote: High probability is all you have. Cosmic radiation hitting your computer will more likely cause problems, than colliding 64bit inode numbers ;) Some of us have machines designed to cope with cosmic rays, and would be unimpressed with a decrease in re

Re: Finding hardlinks

2007-01-03 Thread Miklos Szeredi
> > High probability is all you have. Cosmic radiation hitting your > > computer will more likely cause problems, than colliding 64bit inode > > numbers ;) > > Some of us have machines designed to cope with cosmic rays, and would be > unimpressed with a decrease in reliability. With the suggested

Re: Finding hardlinks

2007-01-03 Thread Matthew Wilcox
On Wed, Jan 03, 2007 at 01:33:31PM +0100, Miklos Szeredi wrote: > High probability is all you have. Cosmic radiation hitting your > computer will more likely cause problems, than colliding 64bit inode > numbers ;) Some of us have machines designed to cope with cosmic rays, and would be unimpressed

Re: Finding hardlinks

2007-01-03 Thread Martin Mares
Hello! > High probability is all you have. Cosmic radiation hitting your > computer will more likely cause problems, than colliding 64bit inode > numbers ;) No. If you assign 64-bit inode numbers randomly, 2^32 of them are sufficient to generate a collision with probability around 50%.
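
The birthday-bound estimate in this message can be checked numerically. The following is a minimal sketch, assuming inode numbers are drawn uniformly at random from a 64-bit space; the function name is illustrative and not from the thread. It uses the standard approximation p ≈ 1 − exp(−n(n−1)/(2N)) for n files and N possible identifiers.

```python
import math

def collision_probability(n_files, id_bits=64):
    """Approximate probability that at least two of n_files uniformly
    random id_bits-wide identifiers coincide (birthday bound)."""
    space = 2 ** id_bits
    # -expm1(-x) equals 1 - exp(-x) but stays accurate when x is tiny.
    return -math.expm1(-n_files * (n_files - 1) / (2 * space))

if __name__ == "__main__":
    for n in (10**6, 2**32, 2**33):
        print(f"{n:>13,} files: {collision_probability(n):.3g}")
```

Under this approximation, 2^32 randomly numbered files give a collision probability of roughly 0.39, and the 50% point falls near 1.18 × 2^32 files, so "around 50%" holds as an order-of-magnitude statement.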

Re: Finding hardlinks

2007-01-03 Thread Pavel Machek
Hi! > > > > > the use of a good hash function. The chance of an accidental > > > > > collision is infinitesimally small. For a set of > > > > > > > > > > 100 files: 0.03% > > > > >1,000,000 files: 0.03% > > > > > > > > I do not think we want to play with probabili

Re: [nfsv4] RE: Finding hardlinks

2007-01-03 Thread Benny Halevy
Trond Myklebust wrote: > On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote: >> Trond Myklebust wrote: >>> >>> On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote: Mikulas Patocka wrote: > BTW. how does (or how should?) NFS client deal with cache coherency if > filehandles fo

Re: Finding hardlinks

2007-01-03 Thread Miklos Szeredi
> > > > the use of a good hash function. The chance of an accidental > > > > collision is infinitesimally small. For a set of > > > > > > > > 100 files: 0.03% > > > >1,000,000 files: 0.03% > > > > > > I do not think we want to play with probability like this. I mea

Re: Finding hardlinks

2007-01-03 Thread Pavel Machek
Hi! > > > the use of a good hash function. The chance of an accidental > > > collision is infinitesimally small. For a set of > > > > > > 100 files: 0.03% > > >1,000,000 files: 0.03% > > > > I do not think we want to play with probability like this. I mean... > >

Re: Finding hardlinks

2007-01-02 Thread Mikulas Patocka
On Wed, 3 Jan 2007, Trond Myklebust wrote: On Sat, 2006-12-30 at 02:04 +0100, Mikulas Patocka wrote: On Fri, 29 Dec 2006, Trond Myklebust wrote: On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote: Why don't you rip off the support for colliding inode number from the kernel at all (i.e

RE: Finding hardlinks

2007-01-02 Thread Trond Myklebust
On Sun, 2006-12-31 at 16:19 -0500, Halevy, Benny wrote: > Even for NFSv3 (that doesn't have the unique_handles attribute I think > that the linux nfs client can do a better job. If you'd have a filehandle > cache that points at inodes you could maintain a many to one relationship > from multiple

RE: [nfsv4] RE: Finding hardlinks

2007-01-02 Thread Trond Myklebust
On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote: > Trond Myklebust wrote: > > > > On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote: > > > Mikulas Patocka wrote: > > > > > >BTW. how does (or how should?) NFS client deal with cache coherency if > > > >filehandles for the same file di

RE: Finding hardlinks

2007-01-02 Thread Trond Myklebust
On Sun, 2006-12-31 at 16:19 -0500, Halevy, Benny wrote: > Trond Myklebust wrote: > > > > On Thu, 2006-12-28 at 17:12 +0200, Benny Halevy wrote: > > > > > As an example, some file systems encode hint information into the > > > filehandle > > > and the hints may change over time, another example

Re: Finding hardlinks

2007-01-02 Thread Trond Myklebust
On Sat, 2006-12-30 at 02:04 +0100, Mikulas Patocka wrote: > > On Fri, 29 Dec 2006, Trond Myklebust wrote: > > > On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote: > >> Why don't you rip off the support for colliding inode number from the > >> kernel at all (i.e. remove iget5_locked)? > >>

Re: Finding hardlinks

2007-01-02 Thread Mikulas Patocka
Certainly, but tar isn't going to remember all the inode numbers. Even if you solve the storage requirements (not impossible) it would have to do (4e9^2)/2=8e18 comparisons, which computers don't have enough CPU power just yet. It is remembering all inode numbers with nlink > 1 and many other to
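
The point about remembering only inode numbers with nlink > 1 can be sketched as follows. This is an illustration of the general approach under stated assumptions, not the code of tar or any other real tool: the function names are hypothetical, and it trusts the filesystem to report unique (st_dev, st_ino) pairs, which is exactly what the thread is questioning.

```python
import os
import tempfile

def find_hardlink_groups(root):
    """Walk root and group path names that share (st_dev, st_ino).

    Only non-directory entries with st_nlink > 1 are remembered, so memory
    grows with the number of multiply-linked files, and each file costs one
    dictionary lookup rather than a comparison against every earlier file.
    """
    seen = {}  # (st_dev, st_ino) -> list of paths
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if st.st_nlink > 1:
                seen.setdefault((st.st_dev, st.st_ino), []).append(path)
    # Keep only groups where more than one name was found inside root.
    return [sorted(paths) for paths in seen.values() if len(paths) > 1]

def demo():
    """Build a small tree with one hardlinked pair and report the groups."""
    with tempfile.TemporaryDirectory() as root:
        os.mkdir(os.path.join(root, "sub"))
        a = os.path.join(root, "a")
        b = os.path.join(root, "sub", "b")
        c = os.path.join(root, "c")
        for p in (a, c):
            with open(p, "w") as f:
                f.write("x")
        os.link(a, b)  # second name for the same inode
        groups = find_hardlink_groups(root)
        return [[os.path.relpath(p, root) for p in g] for g in groups]
```

With a hash table keyed on the identifier, the work is linear in the number of files, so the quadratic comparison count quoted above does not arise; the remaining risk is two distinct files reporting the same key.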

Re: Finding hardlinks

2007-01-02 Thread Miklos Szeredi
> > Certainly, but tar isn't going to remember all the inode numbers. > > Even if you solve the storage requirements (not impossible) it would > > have to do (4e9^2)/2=8e18 comparisons, which computers don't have > > enough CPU power just yet. > > It is remembering all inode numbers with nlink > 1

Re: Finding hardlinks

2007-01-02 Thread Mikulas Patocka
On Tue, 2 Jan 2007, Miklos Szeredi wrote: It seems like the posix idea of unique doesn't hold water for modern file systems are you really sure? Well Jan's example was of Coda that uses 128-bit internal file ids. and if so, why don't we fix *THAT* instead Hmm, sometimes you can't fix

Re: Finding hardlinks

2007-01-02 Thread Miklos Szeredi
> > > >> It seems like the posix idea of unique doesn't > > > >> hold water for modern file systems > > > > > > > > are you really sure? > > > > > > Well Jan's example was of Coda that uses 128-bit internal file ids. > > > > > > > and if so, why don't we fix *THAT* instead > > > > > > Hmm, so

Re: Finding hardlinks

2007-01-02 Thread Pavel Machek
Hi! > > >> It seems like the posix idea of unique doesn't > > >> hold water for modern file systems > > > > > > are you really sure? > > > > Well Jan's example was of Coda that uses 128-bit internal file ids. > > > > > and if so, why don't we fix *THAT* instead > > > > Hmm, sometimes you can

Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka
On Mon, 1 Jan 2007, Jan Harkes wrote: On Mon, Jan 01, 2007 at 11:47:06PM +0100, Mikulas Patocka wrote: Anyway, cp -a is not the only application that wants to do hardlink detection. I tested programs for ino_t collision (I intentionally injected it) and found that CP from coreutils 6.7 fails

Re: Finding hardlinks

2007-01-01 Thread Jan Harkes
On Mon, Jan 01, 2007 at 11:47:06PM +0100, Mikulas Patocka wrote: > >Anyway, cp -a is not the only application that wants to do hardlink > >detection. > > I tested programs for ino_t collision (I intentionally injected it) and > found that CP from coreutils 6.7 fails to copy directories but displa

Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka
> BTW. How does ReiserFS find that a given inode number (or object ID in > ReiserFS terminology) is free before assigning it to new file/directory? reiserfs v3 has an extent map of free object identifiers in super-block. Inode free space can have at most 2^31 extents --- if inode numbers alter

Re: Finding hardlinks

2007-01-01 Thread Nikita Danilov
Mikulas Patocka writes: [...] > > BTW. How does ReiserFS find that a given inode number (or object ID in > ReiserFS terminology) is free before assigning it to new file/directory? reiserfs v3 has an extent map of free object identifiers in super-block. reiser4 used 64 bit object identifiers

Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka
> The question is: why does the kernel contain iget5 function that looks up > according to callback, if the filesystem cannot have more than 64-bit > inode identifier? Generally speaking, file system might have two different identifiers for files: - one that makes it easy to tell whether two fil

Re: Finding hardlinks

2007-01-01 Thread Mikulas Patocka
Hi! If user (or script) doesn't specify that flag, it doesn't help. I think the best solution for these filesystems would be either to add new syscall int is_hardlink(char *filename1, char *filename2) (but I know adding syscall bloat may be objectionable) it's also the wrong api; the f

Re: Finding hardlinks

2007-01-01 Thread Pavel Machek
Hi! > >>If user (or script) doesn't specify that flag, it > >>doesn't help. I think > >>the best solution for these filesystems would be > >>either to add new syscall > >>int is_hardlink(char *filename1, char *filename2) > >>(but I know adding syscall bloat may be objectionable) > > > >it's

Re: Finding hardlinks

2006-12-31 Thread Nikita Danilov
Mikulas Patocka writes: > > > On Fri, 29 Dec 2006, Trond Myklebust wrote: > > > On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote: > >> Why don't you rip off the support for colliding inode number from the > >> kernel at all (i.e. remove iget5_locked)? > >> > >> It's reasonable t

RE: [nfsv4] RE: Finding hardlinks

2006-12-31 Thread Halevy, Benny
Trond Myklebust wrote: > > On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote: > > Mikulas Patocka wrote: > > > >BTW. how does (or how should?) NFS client deal with cache coherency if > > >filehandles for the same file differ? > > > > > > > Trond can probably answer this better than me...

RE: Finding hardlinks

2006-12-31 Thread Halevy, Benny
Trond Myklebust wrote: > > On Thu, 2006-12-28 at 17:12 +0200, Benny Halevy wrote: > > > As an example, some file systems encode hint information into the filehandle > > and the hints may change over time, another example is encoding parent > > information into the filehandle and then handles rep

Re: Finding hardlinks

2006-12-31 Thread Mikulas Patocka
On Wed, 20 Dec 2006, Al Viro wrote: On Wed, Dec 20, 2006 at 05:50:11PM +0100, Miklos Szeredi wrote: I don't see any problems with changing struct kstat. There would be reservations against changing inode.i_ino though. So filesystems that have 64bit inodes will need a specialized getattr() met

Re: Finding hardlinks

2006-12-29 Thread Mikulas Patocka
On Fri, 29 Dec 2006, Trond Myklebust wrote: On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote: Why don't you rip off the support for colliding inode number from the kernel at all (i.e. remove iget5_locked)? It's reasonable to have either no support for colliding ino_t or full support

Re: Finding hardlinks

2006-12-29 Thread Trond Myklebust
On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote: > Why don't you rip off the support for colliding inode number from the > kernel at all (i.e. remove iget5_locked)? > > It's reasonable to have either no support for colliding ino_t or full > support for that (including syscalls that user

Re: [nfsv4] RE: Finding hardlinks

2006-12-29 Thread Trond Myklebust
On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote: > Mikulas Patocka wrote: > >BTW. how does (or how should?) NFS client deal with cache coherency if > >filehandles for the same file differ? > > > > Trond can probably answer this better than me... > As I read it, currently the nfs client ma

Re: Finding hardlinks

2006-12-29 Thread Trond Myklebust
On Thu, 2006-12-28 at 17:12 +0200, Benny Halevy wrote: > As an example, some file systems encode hint information into the filehandle > and the hints may change over time, another example is encoding parent > information into the filehandle and then handles representing hard links > to the same fi

RE: Finding hardlinks

2006-12-28 Thread Halevy, Benny
Mikulas Patocka wrote: > >>> This sounds like a bug to me. It seems like we should have a one to one >>> correspondence of filehandle -> inode. In what situations would this not be >>> the >>> case? >> >> Well, the NFS protocol allows that [see rfc1813, p. 21: "If two file handles >> from >> the

Re: Finding hardlinks

2006-12-28 Thread Miklos Szeredi
> >> It seems like the posix idea of unique doesn't > >> hold water for modern file systems > > > > are you really sure? > > Well Jan's example was of Coda that uses 128-bit internal file ids. > > > and if so, why don't we fix *THAT* instead > > Hmm, sometimes you can't fix the world, especia

Re: Finding hardlinks

2006-12-28 Thread Mikulas Patocka
This sounds like a bug to me. It seems like we should have a one to one correspondence of filehandle -> inode. In what situations would this not be the case? Well, the NFS protocol allows that [see rfc1813, p. 21: "If two file handles from the same server are equal, they must refer to the same

Re: Finding hardlinks

2006-12-28 Thread Mikulas Patocka
On Thu, 28 Dec 2006, Arjan van de Ven wrote: It seems like the posix idea of unique doesn't hold water for modern file systems are you really sure? and if so, why don't we fix *THAT* instead, rather than adding racy syscalls and such that just can't really be used right... Why don't you r

Re: Finding hardlinks

2006-12-28 Thread Jan Engelhardt
On Dec 28 2006 10:54, Jeff Layton wrote: > > Sorry, I should qualify that statement. A lot of filesystems don't have > permanent i_ino values (mostly pseudo filesystems -- pipefs, sockfs, /proc > stuff, etc). For those, the idea is to try to make sure we use 32 bit values > for them and to ensure

Re: Finding hardlinks

2006-12-28 Thread Jeff Layton
Benny Halevy wrote: Jeff Layton wrote: Benny Halevy wrote: It seems like the posix idea of unique doesn't hold water for modern file systems and that creates real problems for backup apps which rely on that to detect hard links. Why not? Granted, many of the filesystems in the Linux kernel do

Re: Finding hardlinks

2006-12-28 Thread Benny Halevy
Arjan van de Ven wrote: >> It seems like the posix idea of unique doesn't >> hold water for modern file systems > > are you really sure? Well Jan's example was of Coda that uses 128-bit internal file ids. > and if so, why don't we fix *THAT* instead Hmm, sometimes you can't fix the world, esp

Re: Finding hardlinks

2006-12-28 Thread Benny Halevy
Jeff Layton wrote: > Benny Halevy wrote: >> It seems like the posix idea of unique doesn't >> hold water for modern file systems and that creates real problems for >> backup apps which rely on that to detect hard links. >> > > Why not? Granted, many of the filesystems in the Linux kernel don't e

Re: Finding hardlinks

2006-12-28 Thread Jeff Layton
Benny Halevy wrote: It seems like the posix idea of unique doesn't hold water for modern file systems and that creates real problems for backup apps which rely on that to detect hard links. Why not? Granted, many of the filesystems in the Linux kernel don't enforce that they have unique st_

Re: Finding hardlinks

2006-12-28 Thread Arjan van de Ven
> It seems like the posix idea of unique doesn't > hold water for modern file systems are you really sure? and if so, why don't we fix *THAT* instead, rather than adding racy syscalls and such that just can't really be used right... -- if you want to mail me at work (you don't), use arjan (a

Re: Finding hardlinks

2006-12-28 Thread Benny Halevy
Mikulas Patocka wrote: >>> If user (or script) doesn't specify that flag, it doesn't help. I think >>> the best solution for these filesystems would be either to add new syscall >>> int is_hardlink(char *filename1, char *filename2) >>> (but I know adding syscall bloat may be objectionable) >> i

Re: Finding hardlinks

2006-12-23 Thread Mikulas Patocka
If user (or script) doesn't specify that flag, it doesn't help. I think the best solution for these filesystems would be either to add new syscall int is_hardlink(char *filename1, char *filename2) (but I know adding syscall bloat may be objectionable) it's also the wrong api; the filenam

Re: Finding hardlinks

2006-12-23 Thread Arjan van de Ven
> > If user (or script) doesn't specify that flag, it doesn't help. I think > the best solution for these filesystems would be either to add new syscall > int is_hardlink(char *filename1, char *filename2) > (but I know adding syscall bloat may be objectionable) it's also the wrong api; th

Re: Finding hardlinks

2006-12-21 Thread Jan Harkes
On Fri, Dec 22, 2006 at 12:49:42AM +0100, Mikulas Patocka wrote: > On Thu, 21 Dec 2006, Jan Harkes wrote: > >On Wed, Dec 20, 2006 at 12:44:42PM +0100, Miklos Szeredi wrote: > >>The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend > >>the kstat.ino field to 64bit and fix those files

Re: Finding hardlinks

2006-12-21 Thread Mikulas Patocka
On Thu, 21 Dec 2006, Jan Harkes wrote: On Wed, Dec 20, 2006 at 12:44:42PM +0100, Miklos Szeredi wrote: The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend the kstat.ino field to 64bit and fix those filesystems to fill in kstat correctly. Coda actually uses 128-bit file ident

Re: Finding hardlinks

2006-12-21 Thread Jan Harkes
On Wed, Dec 20, 2006 at 12:44:42PM +0100, Miklos Szeredi wrote: > The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend > the kstat.ino field to 64bit and fix those filesystems to fill in > kstat correctly. Coda actually uses 128-bit file identifiers internally, so 64-bits really d

Re: Finding hardlinks

2006-12-20 Thread Mikulas Patocka
On Wed, 20 Dec 2006, Al Viro wrote: On Wed, Dec 20, 2006 at 05:50:11PM +0100, Miklos Szeredi wrote: I don't see any problems with changing struct kstat. There would be reservations against changing inode.i_ino though. So filesystems that have 64bit inodes will need a specialized getattr() m

Re: Finding hardlinks

2006-12-20 Thread Al Viro
On Wed, Dec 20, 2006 at 05:50:11PM +0100, Miklos Szeredi wrote: > I don't see any problems with changing struct kstat. There would be > reservations against changing inode.i_ino though. > > So filesystems that have 64bit inodes will need a specialized > getattr() method instead of generic_fillatt

Re: Finding hardlinks

2006-12-20 Thread Miklos Szeredi
> >> I've come across this problem: how can a userspace program (such as for > >> example "cp -a") tell that two files form a hardlink? Comparing inode > >> number will break on filesystems that can have more than 2^32 files (NFS3, > >> OCFS, SpadFS; kernel developers already implemented iget5_lock

Re: Finding hardlinks

2006-12-20 Thread Mikulas Patocka
I've come across this problem: how can a userspace program (such as for example "cp -a") tell that two files form a hardlink? Comparing inode number will break on filesystems that can have more than 2^32 files (NFS3, OCFS, SpadFS; kernel developers already implemented iget5_locked for the case of

Re: Finding hardlinks

2006-12-20 Thread Miklos Szeredi
> I've come across this problem: how can a userspace program (such as for > example "cp -a") tell that two files form a hardlink? Comparing inode > number will break on filesystems that can have more than 2^32 files (NFS3, > OCFS, SpadFS; kernel developers already implemented iget5_locked for th
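
For reference, the conventional userspace check that this question starts from compares the device and inode numbers returned by stat. The following is a minimal sketch with hypothetical names; it is a userspace illustration, not the is_hardlink() syscall proposed elsewhere in the thread. It is only as reliable as the filesystem's st_ino values, and the answer can be stale as soon as it is returned.

```python
import os
import tempfile

def is_hardlink(path1, path2):
    """Return True if both names appear to refer to the same inode.

    Compares (st_dev, st_ino) from lstat(), so symlinks are not followed.
    A filesystem that reports colliding st_ino values for distinct files
    would make this return a false positive.
    """
    a = os.lstat(path1)
    b = os.lstat(path2)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)

def demo():
    """Create a hardlinked pair and an unrelated file, then compare them."""
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "a")
        linked = os.path.join(d, "b")
        other = os.path.join(d, "c")
        for p in (original, other):
            with open(p, "w") as f:
                f.write("data")
        os.link(original, linked)  # create a second name for the same inode
        return is_hardlink(original, linked), is_hardlink(original, other)
```

Python's os.path.samefile performs a similar comparison but follows symbolic links. Either way, the check happens at one instant and the tree may change afterwards, which is the race discussed in later replies.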
