On Wednesday 03 January 2007 13:42, Pavel Machek wrote:
I guess that is the way to go. samefile(path1, path2) is unfortunately
inherently racy.
Not a problem in practice. You don't expect cp -a
to reliably copy a tree which something else is modifying
at the same time.
Thus we assume that the
On Wednesday 03 January 2007 21:26, Frank van Maarseveen wrote:
On Wed, Jan 03, 2007 at 08:31:32PM +0100, Mikulas Patocka wrote:
64-bit inode number space is not yet implemented on Linux --- the problem
is that if you return ino = 2^32, programs compiled without
-D_FILE_OFFSET_BITS=64
Nicolas Williams wrote:
On Thu, Jan 04, 2007 at 12:04:14PM +0200, Benny Halevy wrote:
I agree that the way the client implements its cache is out of the protocol
scope. But how do you interpret correct behavior in section 4.2.1?
Clients MUST use filehandle comparisons only to improve
I did cp -rl his-tree my-tree (which completed
quickly), edited the two files that needed to be patched, then did
diff -urp his-tree my-tree, which also completed quickly, as diff knows
that if two files have the same inode, they don't need to be opened.
... download one tree from kernel.org, do
On Tue, 9 Jan 2007, Frank van Maarseveen wrote:
Yes but cp -rl is typically done by _developers_ and they tend to
have a better understanding of this (uh, at least within linux context
I hope so).
Also, just adding hard-links doesn't increase the number of inodes.
No, but it increases the
but you can get a large number of 1 linked
files, when you copy full directories with cp -rl. Which I do a lot
when developing. I've done that a few times with the Linux tree.
Can you shed some light on how you use this technique? (I.e. what does it
do for you?)
Many people are of the opinion
On Tue 2007-01-09 15:43:14, Bryan Henderson wrote:
No one guarantees you sane result of tar or cp -a while changing the tree.
I don't see how is_samefile() could make it worse.
There are several cases where changing the tree doesn't affect the
correctness of the tar or cp -a result. In some of these cases using
samefile() instead of
On Fri 2007-01-05 16:15:41, Miklos Szeredi wrote:
And does it matter? If you rename a file, tar might skip it regardless of
hardlink detection (if readdir races with rename, you can read none of
the
names of file, one or both --- all these are possible).
If you have dir1/a
Hi!
There's really no point trying to push for such an inferior interface
when the problems which samefile is trying to address are purely
theoretical.
Oh yes, there is. st_ino is powerful, *but impossible to implement*
on many filesystems.
You mean POSIX compliance is impossible? So what? It is possible to
implement an approximation that is _at least_ as good as samefile().
One really dumb way is to set st_ino to the 'struct inode' pointer for
example. That will sure as hell fit into 64bits and will give a
unique (alas
Hello!
High probability is all you have. Cosmic radiation hitting your
computer will more likely cause problems than colliding 64-bit inode
numbers ;)
Some of us have machines designed to cope with cosmic rays, and would be
unimpressed with a decrease in reliability.
With the
On Fri, 2007-01-05 at 10:28 +0200, Benny Halevy wrote:
Trond Myklebust wrote:
Exactly where do you see us violating the close-to-open cache
consistency guarantees?
I haven't seen that. What I did see is cache inconsistency when opening
the same file with different file descriptors when
Hi!
With the suggested samefile() interface you'd get a failure with just
about 100% reliability for any application which needs to compare a
more than a few
And does it matter? If you rename a file, tar might skip it regardless of
hardlink detection (if readdir races with rename, you can read none of the
names of file, one or both --- all these are possible).
If you have dir1/a hardlinked to dir1/b and while tar runs you delete
both a and b
On Fri, 2007-01-05 at 10:40 -0600, Nicolas Williams wrote:
What I don't understand is why getting the fileid is so hard -- always
GETATTR when you GETFH and you'll be fine. I'm guessing that's not as
difficult as it is to maintain a hash table of fileids.
You've been sleeping in class. We
Subject: Re: [nfsv4] RE: Finding hardlinks
Pavel Machek [EMAIL PROTECTED] wrote:
Another idea is to export the filesystem internal ID as an arbitrary
length cookie through the extended attribute interface. That could be
stored/compared by the filesystem quite efficiently.
How will that work for FAT?
Or maybe we can relax that
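The cookie idea maps naturally onto the existing xattr interface. A hypothetical sketch (Python for brevity, though the interfaces under discussion are C): the attribute name "system.fileid" is invented here, no filesystem is claimed to export it, and on filesystems like FAT that cannot provide a stable ID the answer is simply "unknown":

```python
import os

def samefile_cookie(path1, path2, attr="system.fileid"):
    """Compare two paths via a filesystem-internal ID exported as an
    arbitrary-length cookie through the xattr interface.  The attribute
    name is hypothetical.  Returns True/False when both cookies are
    available, or None (unknown) when the filesystem provides none."""
    try:
        c1 = os.getxattr(path1, attr)
        c2 = os.getxattr(path2, attr)
    except OSError:
        return None  # cookie not exported; caller must fall back to st_ino
    return c1 == c2
```

Because the result can be "unknown", a caller would still need an st_ino-based fallback, which is part of the objection raised in the thread.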
On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
I sincerely expect you or anybody else for this matter to try to provide
feedback and object to the protocol specification in case they disagree
with it (or think it's ambiguous or self contradicting) rather than ignoring
it and
Mikulas Patocka writes:
BTW. How does ReiserFS find that a given inode number (or object ID in
ReiserFS terminology) is free before assigning it to new file/directory?
reiserfs v3 has an extent map of free object identifiers in
super-block.
Inode free space can have at most
Bryan Henderson wrote:
Clients MUST use filehandle comparisons only to improve
performance, not for correct behavior. All clients need to
be prepared for situations in which it cannot be determined
whether two filehandles denote the same object and in such
cases, avoid making invalid assumptions
Hi!
the use of a good hash function. The chance of an accidental
collision is infinitesimally small. For a set of
100 files: 0.03%
1,000,000 files: 0.03%
I do not think we want to play with probability like this. I mean...
imagine 4G files,
Trond Myklebust wrote:
On Sun, 2006-12-31 at 16:25 -0500, Halevy, Benny wrote:
Trond Myklebust wrote:
On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
Mikulas Patocka wrote:
BTW. how does (or how should?) NFS client deal with cache coherency if
filehandles for the same file
Hello!
High probability is all you have. Cosmic radiation hitting your
computer will more likely cause problems than colliding 64-bit inode
numbers ;)
No.
If you assign 64-bit inode numbers randomly, 2^32 of them are sufficient
to generate a collision with probability around 50%.
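The figure follows from the standard birthday bound (a sketch, not part of the original mail): drawing n identifiers uniformly at random from a space of size N,

```latex
P(\text{collision}) \;\approx\; 1 - e^{-n(n-1)/(2N)},
\qquad N = 2^{64},\; n = 2^{32}
\;\Rightarrow\; P \approx 1 - e^{-1/2} \approx 0.39 .
```

The probability crosses 50% near n ≈ 1.18 · 2^32, so "around 50%" for 2^32 randomly assigned 64-bit inode numbers is the right order of magnitude.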
On Tue, Jan 02, 2007 at 01:04:06AM +0100, Mikulas Patocka wrote:
I didn't hardlink directories, I just patched stat, lstat and fstat to
always return st_ino == 0 --- and I've seen those failures. These failures
are going to happen on non-POSIX filesystems in real world too, very
rarely.
I don't want to spoil your day but testing with st_ino==0 is a bad
On any decent filesystem st_ino should uniquely identify an object and
reliably provide hardlink information. The UNIX world has relied upon
this
for decades. A filesystem with st_ino collisions without being hardlinked
(or the other way around) needs a fix.
But for at least the last of those
Hi!
Sure it is. Numerous popular POSIX filesystems do that. There is a lot of
inode number space in 64 bit (of course it is a matter of time for it to
jump to 128 bit and more)
If the filesystem was designed by someone not from the Unix world (FAT, SMB,
...), then not. And users still want to
On Thu, Jan 04, 2007 at 12:43:20AM +0100, Mikulas Patocka wrote:
On Wed, 3 Jan 2007, Frank van Maarseveen wrote:
Currently, large file support is already necessary to handle dvd and
video. It's also useful for images for virtualization. So the failing
stat()
calls should already be a thing
On Wed, 2007-01-03 at 14:35 +0200, Benny Halevy wrote:
Believe it or not, but server companies like Panasas try to follow the
standard
when designing and implementing their products while relying on client vendors
to do the same.
I personally have never given a rats arse about standards if
Hi!
It seems like the posix idea of unique st_dev, st_ino doesn't
hold water for modern file systems
are you really sure?
Well Jan's example was of Coda that uses 128-bit internal file ids.
and if so, why don't we fix *THAT* instead
Hmm, sometimes you can't fix the
On Sun, 2006-12-31 at 16:19 -0500, Halevy, Benny wrote:
Even for NFSv3 (which doesn't have the unique_handles attribute) I think
that the linux nfs client can do a better job. If you'd have a filehandle
cache that points at inodes you could maintain a many to one relationship
from multiple
On Wed, 3 Jan 2007, Trond Myklebust wrote:
On Sat, 2006-12-30 at 02:04 +0100, Mikulas Patocka wrote:
On Fri, 29 Dec 2006, Trond Myklebust wrote:
On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote:
Why don't you rip off the support for colliding inode number from the
kernel at all
Hi!
If user (or script) doesn't specify that flag, it
doesn't help. I think
the best solution for these filesystems would be
either to add new syscall
int is_hardlink(char *filename1, char *filename2)
(but I know adding syscall bloat may be objectionable)
it's also the wrong api;
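The proposed call can be approximated from userspace today with two stat() calls, which is exactly the racy construction criticized elsewhere in the thread: either path may be renamed or replaced between the calls. A minimal sketch (Python for brevity, though the proposal above is a C syscall):

```python
import os

def samefile(path1, path2):
    """Racy userspace approximation of the proposed samefile() /
    is_hardlink(filename1, filename2) call: two separate stat()
    calls, then compare the (st_dev, st_ino) identity pair."""
    st1 = os.stat(path1)
    st2 = os.stat(path2)
    return (st1.st_dev, st1.st_ino) == (st2.st_dev, st2.st_ino)
```

This is also what Python's own os.path.samefile() does, and it inherits the same limitation the thread is about: it is only as trustworthy as st_ino itself.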
The question is: why does the kernel contain iget5 function that looks up
according to callback, if the filesystem cannot have more than 64-bit
inode identifier?
Generally speaking, a file system might have two different identifiers for
files:
- one that makes it easy to tell whether two files
Mikulas Patocka writes:
[...]
BTW. How does ReiserFS find that a given inode number (or object ID in
ReiserFS terminology) is free before assigning it to new file/directory?
reiserfs v3 has an extent map of free object identifiers in
super-block. reiser4 used 64 bit object identifiers
BTW. How does ReiserFS find that a given inode number (or object ID in
ReiserFS terminology) is free before assigning it to new file/directory?
reiserfs v3 has an extent map of free object identifiers in
super-block.
Inode free space can have at most 2^31 extents --- if inode numbers
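The structure described above can be modeled as a sorted list of (start, length) runs of free identifiers, allocating the lowest free id from the first run. An illustrative sketch only, not reiserfs code:

```python
def alloc_objectid(extents):
    """Allocate the lowest free object id from an extent map of free
    ids, as described for reiserfs v3 (illustrative model only).
    `extents` is a sorted list of [start, length] runs of free ids,
    mutated in place."""
    if not extents:
        raise RuntimeError("object id space exhausted")
    start, length = extents[0]
    if length == 1:
        extents.pop(0)                    # run fully consumed
    else:
        extents[0] = [start + 1, length - 1]
    return start
```

Freeing an id would do the reverse: extend or merge adjacent runs, which is where the bound on the number of extents the map can hold comes in.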
On Mon, Jan 01, 2007 at 11:47:06PM +0100, Mikulas Patocka wrote:
Anyway, cp -a is not the only application that wants to do hardlink
detection.
I tested programs for ino_t collision (I intentionally injected it) and
found that CP from coreutils 6.7 fails to copy directories but displays
On Wed, 20 Dec 2006, Al Viro wrote:
On Wed, Dec 20, 2006 at 05:50:11PM +0100, Miklos Szeredi wrote:
I don't see any problems with changing struct kstat. There would be
reservations against changing inode.i_ino though.
So filesystems that have 64bit inodes will need a specialized
getattr()
On Thu, 2006-12-28 at 16:44 -0800, Bryan Henderson wrote:
Statement 1:
If two files have identical st_dev and st_ino, they MUST be hardlinks of
each other/the same file.
Statement 2:
If two files are a hardlink of each other, they MUST be detectable
(for example by having the same
On Thu, 2006-12-28 at 17:12 +0200, Benny Halevy wrote:
As an example, some file systems encode hint information into the filehandle
and the hints may change over time, another example is encoding parent
information into the filehandle and then handles representing hard links
to the same file
On Thu, 2006-12-28 at 15:07 -0500, Halevy, Benny wrote:
Mikulas Patocka wrote:
BTW. how does (or how should?) NFS client deal with cache coherency if
filehandles for the same file differ?
Trond can probably answer this better than me...
As I read it, currently the nfs client matches
On 29 Dec 2006, at 08:41, Arjan van de Ven wrote:
I think statement 2 is extremely important. Without this guarantee
applications have to guess which files are hardlinks. Any guess
is going to be wrong sometimes, with potentially disastrous
results.
actually no. Statement 1 will
Actually no. Statement 2 for me is important in terms of archive
correctness. With my archiver program Mksquashfs, if the two files
are the same, and filesystem says they're hardlinks, I make them
hardlinks in the Squashfs filesystem, otherwise they're stored as
duplicates (same
On Fri, 29 Dec 2006, Trond Myklebust wrote:
On Thu, 2006-12-28 at 19:14 +0100, Mikulas Patocka wrote:
Why don't you rip off the support for colliding inode number from the
kernel at all (i.e. remove iget5_locked)?
It's reasonable to have either no support for colliding ino_t or full
support
It seems like the posix idea of unique st_dev, st_ino doesn't
hold water for modern file systems
are you really sure?
and if so, why don't we fix *THAT* instead, rather than adding racy
syscalls and such that just can't really be used right...
Benny Halevy wrote:
It seems like the posix idea of unique st_dev, st_ino doesn't
hold water for modern file systems and that creates real problems for
backup apps which rely on that to detect hard links.
Why not? Granted, many of the filesystems in the Linux kernel don't enforce that
they
Jeff Layton wrote:
Benny Halevy wrote:
It seems like the posix idea of unique st_dev, st_ino doesn't
hold water for modern file systems and that creates real problems for
backup apps which rely on that to detect hard links.
Why not? Granted, many of the filesystems in the Linux kernel
parent
information into the filehandle and then handles representing hard links
to the same file from different directories will differ.
Interesting. That does seem to break the method of st_dev/st_ino for finding
hardlinks. For Linux fileservers I think we generally do have 1:1 correspondence
On Dec 28 2006 10:54, Jeff Layton wrote:
Sorry, I should qualify that statement. A lot of filesystems don't have
permanent i_ino values (mostly pseudo filesystems -- pipefs, sockfs, /proc
stuff, etc). For those, the idea is to try to make sure we use 32 bit values
for them and to ensure that
Adding a vfs call to check for file equivalence seems like a good idea to
me.
That would be only barely useful. It would let 'diff' say, those are
both the same file, but wouldn't be useful for something trying to
duplicate a filesystem (e.g. a backup program). Such a program can't do
the
If it's important to know that two names refer to the same file in a
remote filesystem, I don't see any way around adding a new concept of file
identifier to the protocol.
actually there are 2 separate issues at hand, and this thread sort of
confuses them into one:
Statement 1:
If two
Mikulas Patocka wrote:
This sounds like a bug to me. It seems like we should have a one to one
correspondence of filehandle - inode. In what situations would this not be
the
case?
Well, the NFS protocol allows that [see rfc1813, p. 21: If two file handles
from
the same server are
Statement 1:
If two files have identical st_dev and st_ino, they MUST be hardlinks of
each other/the same file.
Statement 2:
If two files are a hardlink of each other, they MUST be detectable
(for example by having the same st_dev/st_ino)
I personally consider statement 1 a mandatory requirement
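Statement 1 is what archivers actually rely on. A sketch of the usual detection loop (hypothetical helper, Python for brevity): remember the (st_dev, st_ino) pair of every file whose st_nlink exceeds 1, and emit a hardlink instead of a second copy when the pair repeats:

```python
import os

def plan_copy(paths):
    """Sketch of how a tool like cp -a, tar or Mksquashfs detects
    hardlinks under Statement 1.  Returns a list of planned actions:
    ("copy", path) or ("link", path, earlier_path)."""
    seen = {}      # (st_dev, st_ino) -> first path seen with that identity
    actions = []
    for p in paths:
        st = os.stat(p)
        key = (st.st_dev, st.st_ino)
        if st.st_nlink > 1 and key in seen:
            actions.append(("link", p, seen[key]))   # same file, link it
        else:
            seen[key] = p
            actions.append(("copy", p))
    return actions
```

If Statement 1 fails (colliding st_ino on distinct files), this loop silently links unrelated files together, which is the "disastrous results" case mentioned above; if Statement 2 fails, it merely stores duplicates.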
On Wed, Dec 20, 2006 at 12:44:42PM +0100, Miklos Szeredi wrote:
The stat64.st_ino field is 64bit, so AFAICS you'd only need to extend
the kstat.ino field to 64bit and fix those filesystems to fill in
kstat correctly.
Coda actually uses 128-bit file identifiers internally, so 64-bits
really
I've come across this problem: how can a userspace program (such as for
example cp -a) tell that two files form a hardlink? Comparing inode
number will break on filesystems that can have more than 2^32 files (NFS3,
OCFS, SpadFS; kernel developers already implemented iget5_locked for the