Re: [zfs-discuss] missing files on copy
On Jan 30, 2008 1:34 AM, Carson Gaspar [EMAIL PROTECTED] wrote:
> If this is Sun's cp, file a bug. It's failing to notice that it didn't provide a large enough buffer to getdents(), so it only got partial results.
>
> Of course, the getdents() API is rather unfortunate. It appears the only safe algorithm is:
>
>     while ((r = getdents(...)) > 0) {
>         /* process results */
>     }
>     if (r < 0) {
>         /* handle error */
>     }
>
> You _always_ have to call it at least twice to be sure you've gotten everything.

In OpenSolaris, cp uses (indirectly) readdir(), not raw getdents():
http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libcmd/common/cp.c#487
which uses the build-a-linked-list code here:
http://src.opensolaris.org/source/xref/sfw/usr/src/cmd/coreutils/coreutils-6.7/lib/fts.c#913

That code appears to error out and return incomplete results if a) the filename is too long or b) an integer overflows. Christopher's filenames are only 96 chars; could Unicode be involved somehow? b) seems unlikely in the extreme. It still seems like a bug, but I don't see where it is. I am only an egg ;-)

Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] missing files on copy
> Christopher Gorski wrote:
>> I noticed that the first calls in the cp and ls to getdents() return similar file lists, with the same values. However, in the ls, it makes a second call to getdents():
>
> If this is Sun's cp, file a bug. It's failing to notice that it didn't provide a large enough buffer to getdents(), so it only got partial results.

cp doesn't use getdents(); it uses readdir() instead, so the whole buffer is hidden from it.

> Of course, the getdents() API is rather unfortunate. It appears the only safe algorithm is:
>
>     while ((r = getdents(...)) > 0) {
>         /* process results */
>     }
>     if (r < 0) {
>         /* handle error */
>     }
>
> You _always_ have to call it at least twice to be sure you've gotten everything.

That's why you never use getdents() but rather readdir(), which hides this for you.

It appears that the off_t of the directory entries in the particular second read is 2^32; so perhaps a cp which hasn't been compiled to handle large files is being used?

Casper
Re: [zfs-discuss] missing files on copy
> That code appears to error out and return incomplete results if a) the filename is too long or b) an integer overflows. Christopher's filenames are only 96 chars; could Unicode be involved somehow? b) seems unlikely in the extreme. It still seems like a bug, but I don't see where it is. I am only an egg ;-)

And ls would fail in the same manner.

There's one piece of code in cp (see usr/src/cmd/mv/mv.c) which short-circuits a readdir loop:

    while ((dp = readdir(srcdirp)) != NULL) {
        int ret;

        if ((ret = traverse_attrfile(dp, source, target, 1)) == -1)
            continue;
        else if (ret > 0) {
            ++error;
            goto out;
        }

This is strange to me because all other failures result in cp going on to the next file.

Casper
Re: [zfs-discuss] missing files on copy
Will Murnane [EMAIL PROTECTED] wrote:
> On Jan 30, 2008 1:34 AM, Carson Gaspar [EMAIL PROTECTED] wrote:
>> If this is Sun's cp, file a bug. It's failing to notice that it didn't provide a large enough buffer to getdents(), so it only got partial results.
>>
>> Of course, the getdents() API is rather unfortunate. It appears the only safe algorithm is:
>>
>>     while ((r = getdents(...)) > 0) {
>>         /* process results */
>>     }
>>     if (r < 0) {
>>         /* handle error */
>>     }
>>
>> You _always_ have to call it at least twice to be sure you've gotten everything.
>
> In OpenSolaris, cp uses (indirectly) readdir(), not raw getdents():
> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libcmd/common/cp.c#487
> which uses the build-a-linked-list code here:
> http://src.opensolaris.org/source/xref/sfw/usr/src/cmd/coreutils/coreutils-6.7/lib/fts.c#913
>
> That code appears to error out and return incomplete results if a) the filename is too long or b) an integer overflows. Christopher's filenames are only 96 chars; could Unicode be involved somehow? b) seems unlikely in the extreme. It still seems like a bug, but I don't see where it is. I am only an egg ;-)

An interesting thought.

We of course need to know whether the user used /bin/cp or a shadow implementation from ksh93.

I have never seen any problems with star(1), and star(1)/libfind(3) are heavy readdir(3) users...

Jörg

--
EMail:[EMAIL PROTECTED] (home) Jörg Schilling D-13353 Berlin
      [EMAIL PROTECTED] (uni)
      [EMAIL PROTECTED] (work) Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] missing files on copy
Christopher Gorski [EMAIL PROTECTED] wrote:
>> Of course, the getdents() API is rather unfortunate. It appears the only safe algorithm is:
>>
>>     while ((r = getdents(...)) > 0) {
>>         /* process results */
>>     }
>>     if (r < 0) {
>>         /* handle error */
>>     }
>>
>> You _always_ have to call it at least twice to be sure you've gotten everything.
>
> Yes, it is Sun's cp. I'm trying, with some difficulty, to figure out exactly how to reproduce this error in a way not specific to my data. I copied a set of randomly generated files with a deep directory structure and cp seems to correctly call getdents() multiple times.

Note that cp (mv) does not call getdents() directly but readdir().

If there is a problem, it is most likely in readdir(), and it really looks strange that ls(1) works for you although it uses the same implementation.

Jörg
Re: [zfs-discuss] missing files on copy
[EMAIL PROTECTED] wrote:
> And ls would fail in the same manner.
>
> There's one piece of code in cp (see usr/src/cmd/mv/mv.c) which short-circuits a readdir loop:
>
>     while ((dp = readdir(srcdirp)) != NULL) {
>         int ret;
>
>         if ((ret = traverse_attrfile(dp, source, target, 1)) == -1)
>             continue;
>         else if (ret > 0) {
>             ++error;
>             goto out;
>         }
>
> This is strange to me because all other failures result in cp going over to the next file.

traverse_attrfile() returns -1 only for:

    if ((dp->d_name[0] == '.' && dp->d_name[1] == '\0') ||
        (dp->d_name[0] == '.' && dp->d_name[1] == '.' &&
        dp->d_name[2] == '\0') ||
        (sysattr_type(dp->d_name) == _RO_SATTR) ||
        (sysattr_type(dp->d_name) == _RW_SATTR))
            return (-1);

So this primarily skips '.' and '..'.

The rest seems to check for DOS extensions in extended attributes, but this is only done when copying attributes, not files.

Jörg
Re: [zfs-discuss] missing files on copy
Hello Christopher,

Wednesday, January 30, 2008, 7:27:01 AM, you wrote:

CG> Carson Gaspar wrote:
CG>> Christopher Gorski wrote:
CG>>> I noticed that the first calls in the cp and ls to getdents() return similar file lists, with the same values. However, in the ls, it makes a second call to getdents():
CG>>
CG>> If this is Sun's cp, file a bug. It's failing to notice that it didn't provide a large enough buffer to getdents(), so it only got partial results.
CG>>
CG>> Of course, the getdents() API is rather unfortunate. It appears the only safe algorithm is:
CG>>
CG>>     while ((r = getdents(...)) > 0) {
CG>>         /* process results */
CG>>     }
CG>>     if (r < 0) {
CG>>         /* handle error */
CG>>     }
CG>>
CG>> You _always_ have to call it at least twice to be sure you've gotten everything.

CG> Yes, it is Sun's cp. I'm trying, with some difficulty, to figure out
CG> exactly how to reproduce this error in a way not specific to my data. I
CG> copied a set of randomly generated files with a deep directory structure
CG> and cp seems to correctly call getdents() multiple times.

Could you re-create the tree with empty files, using exactly the same directory structure and file names, and check whether you still have the problem? If you do, please send a script here (mkdir -p and touch commands) so we can investigate, assuming your file names and directory structure can be made public.

--
Best regards,
Robert Milkowski mailto:[EMAIL PROTECTED]
http://milek.blogspot.com
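The mkdir -p and touch script Robert asks for could be generated rather than written by hand. What follows is only a sketch under stated assumptions: sq and emit_skeleton are hypothetical helper names (nothing from the thread), a POSIX sh with find, sort and sed is assumed, and file names containing newlines are not handled.

```shell
#!/bin/sh
# sq STRING: print STRING wrapped in single quotes so that it can be
# pasted into a sh script even if it contains [, ], spaces or quotes.
sq() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# emit_skeleton SRC: print a script of mkdir -p and touch commands that
# recreates the directory and file names under SRC as empty files.
# (Hypothetical helper; names with embedded newlines are not supported.)
emit_skeleton() {
    (
        cd "$1" || exit 1
        find . -type d -print | LC_ALL=C sort | while IFS= read -r d; do
            printf 'mkdir -p %s\n' "$(sq "$d")"
        done
        find . -type f -print | LC_ALL=C sort | while IFS= read -r f; do
            printf 'touch %s\n' "$(sq "$f")"
        done
    )
}
```

Running something like `emit_skeleton /pond/photos > skel.sh` and then `sh skel.sh` inside an empty directory would recreate the names only, so the result could be shared without the photo data.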
Re: [zfs-discuss] missing files on copy
Joerg Schilling wrote:
> Will Murnane [EMAIL PROTECTED] wrote:
>> On Jan 30, 2008 1:34 AM, Carson Gaspar [EMAIL PROTECTED] wrote:
>>> If this is Sun's cp, file a bug. It's failing to notice that it didn't provide a large enough buffer to getdents(), so it only got partial results.
>>>
>>> Of course, the getdents() API is rather unfortunate. It appears the only safe algorithm is:
>>>
>>>     while ((r = getdents(...)) > 0) {
>>>         /* process results */
>>>     }
>>>     if (r < 0) {
>>>         /* handle error */
>>>     }
>>>
>>> You _always_ have to call it at least twice to be sure you've gotten everything.
>>
>> In OpenSolaris, cp uses (indirectly) readdir(), not raw getdents():
>> http://src.opensolaris.org/source/xref/onnv/onnv-gate/usr/src/lib/libcmd/common/cp.c#487
>> which uses the build-a-linked-list code here:
>> http://src.opensolaris.org/source/xref/sfw/usr/src/cmd/coreutils/coreutils-6.7/lib/fts.c#913
>>
>> That code appears to error out and return incomplete results if a) the filename is too long or b) an integer overflows. Christopher's filenames are only 96 chars; could Unicode be involved somehow? b) seems unlikely in the extreme. It still seems like a bug, but I don't see where it is. I am only an egg ;-)
>
> An interesting thought.
>
> We of course need to know whether the user used /bin/cp or a shadow implementation from ksh93.
>
> I did never see any problems with star(1) and star(1)/libfind(3) are heavy readdir(3) users...
>
> Jörg

I am able to replicate the problem in bash using:

    # truss -tall -vall -o /tmp/getdents.bin.cp.truss /bin/cp -pr /pond/photos/* /pond/copytestsame/

So I'm assuming that's using /bin/cp.

Also, from my _very limited_ investigation this morning, it seems that

    # grep Err /tmp/getdents.bin.cp.truss | grep -v ENOENT | grep getdents

returns entries such as:

    getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
    getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
    getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
    getdents64(0, 0xFEE34000, 8192) Err#9 EBADF
    ...(truncated)

whereas with a copy where everything is transferred correctly, the same command returns no getdents64() calls with an EBADF error. That leads me to believe that somewhere along the line getdents64() is being called on a descriptor that has somehow been invalidated.

Again...I am only gleaning that from a very limited test.

-Chris
Re: [zfs-discuss] missing files on copy
Christopher Gorski [EMAIL PROTECTED] wrote:
> I am able to replicate the problem in bash using:
>
>     # truss -tall -vall -o /tmp/getdents.bin.cp.truss /bin/cp -pr /pond/photos/* /pond/copytestsame/
>
> So I'm assuming that's using /bin/cp
>
> Also, from my _very limited_ investigation this morning, it seems that
>
>     # grep Err /tmp/getdents.bin.cp.truss | grep -v ENOENT | grep getdents
>
> returns entries such as:
>
>     getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
>     getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
>     getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
>     getdents64(0, 0xFEE34000, 8192) Err#9 EBADF
>     ...(truncated)

If you get this, you may need to provide the full truss output so that we can understand what's happening.

Jörg
Re: [zfs-discuss] missing files on copy
Robert Milkowski [EMAIL PROTECTED] wrote:
> If you could re-create empty files - exactly the same directory structure and file names, check if you still got a problem. If you do, then if you could send a script here (mkdir's -p and touch) so we can investigate.

If you would like to replicate a long directory structure with empty files, you can use star:

    star -c -meta f=/tmp/x.tar -C dir .

and later:

    star -xp -xmeta f=/tmp/x.tar

Jörg
Re: [zfs-discuss] missing files on copy
> Also, from my _very limited_ investigation this morning, it seems that
>
>     # grep Err /tmp/getdents.bin.cp.truss | grep -v ENOENT | grep getdents
>
> returns entries such as:
>
>     getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
>     getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
>     getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
>     getdents64(0, 0xFEE34000, 8192) Err#9 EBADF
>     ...(truncated)

Ah, this looks like someone closed stdin and then something weird happened. Hm.

We need the full truss output, specifically all calls which return or release file descriptors.

The plot thickens.

Casper
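One way to narrow a full truss log down to the calls Casper asks about is to follow a single descriptor through it. This is a rough sketch, not a command from the thread: fd_history is a made-up name, the list of system calls is a guess at what is relevant rather than an exhaustive set, and it assumes one call per line starting at column one, as in the truss -o output quoted here (a log taken with -f carries a pid prefix and would need the pattern adjusted).

```shell
#!/bin/sh
# fd_history FILE FD: print, in log order, the calls that hand out
# descriptor FD (open variants returning FD) and the calls that take FD
# as their first argument (close, getdents, fcntl, fstat).
# Hypothetical helper; the syscall list is an assumption.
fd_history() {
    grep -E "^(open(at)?(64)?\(.*\)[[:space:]]*=[[:space:]]*$2\$|(close|getdents(64)?|fcntl|fstat(64)?)\($2[,)])" "$1"
}
```

For the log above, `fd_history /tmp/getdents.bin.cp.truss 0` would show each point where descriptor 0 is returned by an open, used, or closed, which is the sequence needed to see when it became invalid.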
Re: [zfs-discuss] missing files on copy
Hello Joerg,

Wednesday, January 30, 2008, 2:56:27 PM, you wrote:

JS> Robert Milkowski [EMAIL PROTECTED] wrote:
JS>> If you could re-create empty files - exactly the same directory structure and file names, check if you still got a problem. If you do, then if you could send a script here (mkdir's -p and touch) so we can investigate.

JS> If you like to replicate a long directory structure with empty files,
JS> you can use star:

JS> star -c -meta f=/tmp/x.tar -C dir .

JS> and later:

JS> star -xp -xmeta f=/tmp/x.tar

It really is a Swiss knife :)
That's a handy one (although it's the first time I have actually seen a need for such functionality).

--
Best regards,
Robert mailto:[EMAIL PROTECTED]
http://milek.blogspot.com
Re: [zfs-discuss] missing files on copy
[EMAIL PROTECTED] wrote:
>> Also, from my _very limited_ investigation this morning, it seems that
>>
>>     # grep Err /tmp/getdents.bin.cp.truss | grep -v ENOENT | grep getdents
>>
>> returns entries such as:
>>
>>     getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
>>     getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
>>     getdents64(0, 0xFEC92000, 8192) Err#9 EBADF
>>     getdents64(0, 0xFEE34000, 8192) Err#9 EBADF
>>     ...(truncated)
>
> Ah, this looks like someone closed stdin and then something weird happened. Hm.

stdin is usually not a directory ;-)

This looks much more weird.

Jörg
Re: [zfs-discuss] missing files on copy
Mark Ashley [EMAIL PROTECTED] wrote:
> It's simply a shell grokking issue, when you allow your (l)users to self name your files then you will have spaces etc in the filename (breaks shell arguments). In this case the '[E]' is breaking your command line argument grokking.

Can't be, because the '[E]' wasn't part of the command line arguments (it was in a subdirectory).

- Marcus
Re: [zfs-discuss] missing files on copy
On Tue, 29 Jan 2008 14:25:18 +0200 Marcus Sundman [EMAIL PROTECTED] wrote:
> Mark Ashley [EMAIL PROTECTED] wrote:
>> It's simply a shell grokking issue, when you allow your (l)users to self name your files then you will have spaces etc in the filename (breaks shell arguments). In this case the '[E]' is breaking your command line argument grokking.
>
> Can't be, because the '[E]' wasn't part of the command line arguments (it was in a subdirectory).

Also, if the '[E]' were causing a problem, I would think it would affect all files in the dir, not some seemingly random subset of similarly named files.

Hopefully I'll be able to put in some time investigating this again soon.

-Chris
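Whether '[E]' can interfere depends on whether the shell ever treats it as a pattern. The following is a small sketch of that behaviour, assuming a POSIX sh; glob_demo is a made-up name and the scratch directory exists only for the demonstration.

```shell
#!/bin/sh
# glob_demo: in a scratch directory holding both "[E]" and "E", print
# what an unquoted and a quoted [E] argument expand to. Unquoted, [E] is
# a bracket expression matching the one-character name E; quoted, it is
# the literal three-character name.
glob_demo() {
    d=$(mktemp -d) || return 1
    (
        cd "$d" || exit 1
        mkdir '[E]' E
        printf 'unquoted: %s\n' [E]
        printf 'quoted:   %s\n' '[E]'
    )
    rm -rf "$d"
}
```

An argument such as /pond/photos/* is expanded once by the shell; names produced by that expansion, or found later by cp while it descends into subdirectories, are not globbed again, which is consistent with Marcus's objection.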
Re: [zfs-discuss] missing files on copy
Robert Milkowski wrote:
> As Joerg suggested - please check getdents() - remember to use truss -v getdents so you should see all directory listings.
>
> I would check both getdents and open - so if it appears in getdents but is not opened later on...

I ran the copy procedure with this:

    # truss -t open,getdents -v open,getdents -o /tmp/getdents.truss cp -pr /pond/photos/* /pond/copytestsame/

It seems that the same files I am missing on copy are missing in the getdents return value, even though I can see and read these files via bash.

-Chris
Re: [zfs-discuss] missing files on copy
Christopher Gorski wrote:
> Robert Milkowski wrote:
>> As Joerg suggested - please check getdents() - remember to use truss -v getdents so you should see all directory listings.
>>
>> I would check both getdents and open - so if it appears in getdents but is not opened later on...
>
> I ran the copy procedure with this:
>
>     # truss -t open,getdents -v open,getdents -o /tmp/getdents.truss cp -pr /pond/photos/* /pond/copytestsame/
>
> It seems that the same files I am missing on copy are missing in the getdents return value, even though I can see and read these files via bash.
>
> -Chris

I've attached a copy of truss output from doing the copy, and from doing a simple ls on the original directory. The getdents output for the ls shows all the files, whereas the getdents output for the cp is missing about 20 files.

The cp:

stat64(/pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, 0x08068468) = 0
    d=0x02D90002 i=59135 m=0040755 l=2 u=0 g=0 sz=222
    at = Jan 30 00:05:23 EST 2008 [ 1201669523 ]
    mt = May 2 19:59:48 EDT 2006 [ 1146614388 ]
    ct = Jan 20 23:37:48 EST 2008 [ 1200890268 ]
    bsz=14336 blks=7 fs=zfs
pathconf(/pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, 20) = 2
acl(/pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, ACE_GETACLCNT, 0, 0x) = 6
stat64(/pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, 0x080419C0) = 0
    d=0x02D90002 i=59135 m=0040755 l=2 u=0 g=0 sz=222
    at = Jan 30 00:05:23 EST 2008 [ 1201669523 ]
    mt = May 2 19:59:48 EDT 2006 [ 1146614388 ]
    ct = Jan 20 23:37:48 EST 2008 [ 1200890268 ]
    bsz=14336 blks=7 fs=zfs
acl(/pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, ACE_GETACL, 6, 0x080ED910) = 6
stat64(/pond/copytestsame//unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002, 0x080683D8) = 0
    d=0x02D90002 i=13930 m=0040755 l=8 u=0 g=0 sz=8
    at = Jan 30 00:09:22 EST 2008 [ 1201669762 ]
    mt = Jan 30 00:15:26 EST 2008 [ 1201670126 ]
    ct = Jan 30 00:15:26 EST 2008 [ 1201670126 ]
    bsz=131072 blks=3 fs=zfs
stat64(/pond/copytestsame//unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, 0x080683D8) Err#2 ENOENT
stat64(/pond/copytestsame//unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, 0x080683D8) Err#2 ENOENT
mkdir(/pond/copytestsame//unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, 0755) = 0
stat64(/pond/copytestsame//unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, 0x080683D8) = 0
    d=0x02D90002 i=953 m=0040755 l=2 u=0 g=0 sz=2
    at = Jan 30 00:15:26 EST 2008 [ 1201670126 ]
    mt = Jan 30 00:15:26 EST 2008 [ 1201670126 ]
    ct = Jan 30 00:15:26 EST 2008 [ 1201670126 ]
    bsz=131072 blks=1 fs=zfs
chmod(/pond/copytestsame//unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, 0755) = 0
openat(AT_FDCWD, /pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, O_RDONLY|O_NDELAY|O_LARGEFILE) = 0
fcntl(0, F_SETFD, 0x0001) = 0
fstat64(0, 0x080413F0) = 0
    d=0x02D90002 i=59135 m=0040755 l=2 u=0 g=0 sz=222
    at = Jan 30 00:05:23 EST 2008 [ 1201669523 ]
    mt = May 2 19:59:48 EDT 2006 [ 1146614388 ]
    ct = Jan 20 23:37:48 EST 2008 [ 1200890268 ]
    bsz=14336 blks=7 fs=zfs
fstat64(0, 0x080414D0) = 0
    d=0x02D90002 i=59135 m=0040755 l=2 u=0 g=0 sz=222
    at = Jan 30 00:05:23 EST 2008 [ 1201669523 ]
    mt = May 2 19:59:48 EDT 2006 [ 1146614388 ]
    ct = Jan 20 23:37:48 EST 2008 [ 1200890268 ]
    bsz=14336 blks=7 fs=zfs
getdents64(0, 0xFEC9A000, 8192) = 8160
    ino=59135 off=1 rlen=24 .
    ino=21193 off=2 rlen=24 ..
    ino=59171 off=268814041 rlen=40 104-0432_CRW.JPG
    output truncated here...file list continues on...missing certain files...

The ls:

lstat64(/pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, 0x08046980) = 0
    d=0x02D90002 i=59135 m=0040755 l=2 u=0 g=0 sz=222
    at = Jan 30 00:02:47 EST 2008 [ 1201669367 ]
    mt = May 2 19:59:48 EDT 2006 [ 1146614388 ]
    ct = Jan 20 23:37:48 EST 2008 [ 1200890268 ]
    bsz=14336 blks=7 fs=zfs
openat(AT_FDCWD, /pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg, O_RDONLY|O_NDELAY|O_LARGEFILE) = 3
fcntl(3, F_SETFD, 0x0001) = 0
fstat64(3, 0x08047980) = 0
    d=0x02D90002 i=59135 m=0040755 l=2 u=0 g=0 sz=222
    at = Jan 30 00:02:47 EST 2008 [ 1201669367 ]
    mt = May 2 19:59:48 EDT 2006 [ 1146614388 ]
    ct = Jan 20 23:37:48 EST 2008 [ 1200890268 ]
    bsz=14336
Re: [zfs-discuss] missing files on copy
Christopher Gorski wrote:
> Christopher Gorski wrote:
>> Robert Milkowski wrote:
>>> As Joerg suggested - please check getdents() - remember to use truss -v getdents so you should see all directory listings.
>>>
>>> I would check both getdents and open - so if it appears in getdents but is not opened later on...
>>
>> I ran the copy procedure with this:
>>
>>     # truss -t open,getdents -v open,getdents -o /tmp/getdents.truss cp -pr /pond/photos/* /pond/copytestsame/
>>
>> It seems that the same files I am missing on copy are missing in the getdents return value, even though I can see and read these files via bash.
>>
>> -Chris
>
> I've attached a copy of truss output from doing the copy, and from doing a simple ls on the original directory. The getdents output for the ls shows all the files, whereas the getdents output for the cp is missing about 20 files.
> ...

I noticed that the first calls in the cp and ls to getdents() return similar file lists, with the same values.

However, in the ls, it makes a second call to getdents():

    ino=59236 off=518928750 rlen=40 105-0563_CRW.JPG
brk(0x080732F0) = 0
brk(0x0807D2F0) = 0
brk(0x0807D2F0) = 0
brk(0x080872F0) = 0
brk(0x080872F0) = 0
brk(0x080912F0) = 0
getdents64(3, 0xFECE4000, 8192) = 680
    ino=59207 off=519930959 rlen=40 105-0593_CRW.JPG
    ino=59297 off=520121012 rlen=40 105-0502_CRW.JPG
    ino=59161 off=523498482 rlen=40 104-0422_IMG.JPG
    ino=59293 off=523683089 rlen=40 105-0506_CRW.JPG
    ino=59178 off=524812848 rlen=40 106-0625_CRW.JPG
    ino=59182 off=528395669 rlen=40 106-0621_CRW.JPG
    ino=59274 off=528576196 rlen=40 105-0525_CRW.JPG
    ino=59140 off=528767015 rlen=40 104-0401_IMG.JPG
    ino=59168 off=530430584 rlen=40 104-0429_CRW.JPG
    ino=59144 off=531822466 rlen=40 104-0405_IMG.JPG
    ino=59278 off=532136289 rlen=40 105-0521_CRW.JPG
    ino=59137 off=532747309 rlen=40 103-0398_IMG.JPG
    ino=59164 off=533088919 rlen=40 104-0425_CRW.JPG
    ino=59198 off=533268032 rlen=40 106-0602_CRW.JPG
    ino=59270 off=534421035 rlen=40 105-0529_CRW.JPG
    ino=59148 off=535365485 rlen=40 104-0409_IMG.JPG
    ino=59194 off=536848869 rlen=40 106-0606_CRW.JPG
getdents64(3, 0xFECE4000, 8192) = 0
close(3) = 0

These are my missing files.

-Chris
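The comparison Chris did by eye, names returned to ls but not to cp, can be sketched as a script. This is an illustration only: dirent_names and names_missing are hypothetical helpers, and the line format (ino=N off=N rlen=N NAME, with the name possibly in double quotes) is assumed from the excerpts in this thread rather than from truss documentation.

```shell
#!/bin/sh
# dirent_names FILE: print the entry names found on the "ino=... off=...
# rlen=... NAME" detail lines of a truss -v getdents log.
# The expected line layout is an assumption based on this thread.
dirent_names() {
    awk '/^[ \t]*ino=[0-9]+[ \t]+off=[0-9]+[ \t]*rlen=[0-9]+/ {
        sub(/^[ \t]*ino=[0-9]+[ \t]+off=[0-9]+[ \t]*rlen=[0-9]+[ \t]+/, "")
        gsub(/^"|"$/, "")
        print
    }' "$1"
}

# names_missing LS_LOG CP_LOG: print names present in LS_LOG but absent
# from CP_LOG.
names_missing() {
    a=$(mktemp) && b=$(mktemp) || return 1
    dirent_names "$1" | LC_ALL=C sort -u > "$a"
    dirent_names "$2" | LC_ALL=C sort -u > "$b"
    LC_ALL=C comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Both logs would need to cover the same single directory for the result to mean anything, since the names are compared without their paths.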
Re: [zfs-discuss] missing files on copy
Christopher Gorski wrote:
> I noticed that the first calls in the cp and ls to getdents() return similar file lists, with the same values.
>
> However, in the ls, it makes a second call to getdents():

If this is Sun's cp, file a bug. It's failing to notice that it didn't provide a large enough buffer to getdents(), so it only got partial results.

Of course, the getdents() API is rather unfortunate. It appears the only safe algorithm is:

    while ((r = getdents(...)) > 0) {
        /* process results */
    }
    if (r < 0) {
        /* handle error */
    }

You _always_ have to call it at least twice to be sure you've gotten everything.

--
Carson
Re: [zfs-discuss] missing files on copy
Carson Gaspar wrote:
> Christopher Gorski wrote:
>> I noticed that the first calls in the cp and ls to getdents() return similar file lists, with the same values.
>>
>> However, in the ls, it makes a second call to getdents():
>
> If this is Sun's cp, file a bug. It's failing to notice that it didn't provide a large enough buffer to getdents(), so it only got partial results.
>
> Of course, the getdents() API is rather unfortunate. It appears the only safe algorithm is:
>
>     while ((r = getdents(...)) > 0) {
>         /* process results */
>     }
>     if (r < 0) {
>         /* handle error */
>     }
>
> You _always_ have to call it at least twice to be sure you've gotten everything.

Yes, it is Sun's cp. I'm trying, with some difficulty, to figure out exactly how to reproduce this error in a way not specific to my data. I copied a set of randomly generated files with a deep directory structure and cp seems to correctly call getdents() multiple times.

-Chris
Re: [zfs-discuss] missing files on copy
It's simply a shell grokking issue, when you allow your (l)users to self name your files then you will have spaces etc in the filename (breaks shell arguments). In this case the '[E]' is breaking your command line argument grokking.

We have the same issue in our photos tree. We have to use non shell tools to do the copying or attribute changing.

CG> # ls /pond/photos/unsorted/drive-452a/\[E\]/drive/archives/seconddisk_20nov2002/eujpg/103*
CG> /pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0398_IMG.JPG
CG> /pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG
CG> /pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG

CG> # ls /pond/copytestsame/unsorted/drive-452a/\[E\]/drive/archives/seconddisk_20nov2002/eujpg/103*
CG> /pond/copytestsame/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG
CG> /pond/copytestsame/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG

CG> # grep eujpg /tmp/cp.truss | grep 103 | grep seconddisk
CG> open64(unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG, O_RDONLY) = 0
CG> open64(unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG, O_RDONLY) = 0
CG> open64(/pond/copytestsame//unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG, O_RDONLY) = 6
CG> open64(unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG, O_RDONLY) = 0
CG> open64(unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG, O_RDONLY) = 0
CG> open64(/pond/copytestsame//unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG, O_RDONLY) = 6
Re: [zfs-discuss] missing files on copy
Christopher Gorski wrote:
> unsorted/photosbackup/laptopd600/[D]/cag2b/eujpg/103-0398_IMG.JPG is a file that is always missing in the new tree.

Oops, I meant:
unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0398_IMG.JPG
is always missing in the new tree.
Re: [zfs-discuss] missing files on copy
Robert Milkowski wrote:
> Hello Christopher,
>
> Friday, January 25, 2008, 5:37:58 AM, you wrote:
>
> CG> michael schuster wrote:
> CG>> I assume you've assured that there's enough space in /pond ...
> CG>> can you try
> CG>>
> CG>> $ (cd pond/photos; tar cf - *) | (cd /pond/copytestsame; tar xf -)
>
> CG> I tried it, and it worked. The new tree is an exact copy of the old one.
>
> could you run your cp as 'truss -t open -o /tmp/cp.truss cp * ' and then see if you can see all files being open for reads and check if they were successfully opened for writes?

I ran:

    # truss -t open -o /tmp/cp.truss cp -pr * /pond/copytestsame/

Same result as with cp. The same files are missing in the new tree.

unsorted/photosbackup/laptopd600/[D]/cag2b/eujpg/103-0398_IMG.JPG is a file that is always missing in the new tree.

# ls /pond/photos/unsorted/drive-452a/\[E\]/drive/archives/seconddisk_20nov2002/eujpg/103*
/pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0398_IMG.JPG
/pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG
/pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG

# ls /pond/copytestsame/unsorted/drive-452a/\[E\]/drive/archives/seconddisk_20nov2002/eujpg/103*
/pond/copytestsame/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG
/pond/copytestsame/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG

# grep eujpg /tmp/cp.truss | grep 103 | grep seconddisk
open64(unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG, O_RDONLY) = 0
open64(unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG, O_RDONLY) = 0
open64(/pond/copytestsame//unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG, O_RDONLY) = 6
open64(unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG, O_RDONLY) = 0
open64(unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG, O_RDONLY) = 0
open64(/pond/copytestsame//unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG, O_RDONLY) = 6

The missing file does not seem to be in the truss output.

-Chris
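The tar pipe that did produce a complete copy can be wrapped up with a basic check. A minimal sketch, assuming a POSIX sh with tar and find available; tar_copy and count_files are made-up names, and it archives "." rather than "*" as in Michael's command so that dot files are included and no shell globbing is involved.

```shell
#!/bin/sh
# tar_copy SRC DST: copy everything under SRC into the existing
# directory DST through a tar pipe. (Hypothetical helper.)
tar_copy() {
    (cd "$1" && tar cf - .) | (cd "$2" && tar xf -)
}

# count_files DIR: print the number of regular files under DIR.
count_files() {
    find "$1" -type f -print | wc -l | tr -d ' '
}
```

Comparing `count_files` for the source and the destination after `tar_copy` gives a quick first indication of whether anything was dropped; it says nothing about which files, or about file contents.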
Re: [zfs-discuss] missing files on copy
On Fri, 25 Jan 2008 15:18:36 -0500 Tiernan, Daniel [EMAIL PROTECTED] wrote:
> You may have hit a cp and/or shell bug due to the directory naming topology. Rather than depend on cp -r I prefer the cpio method:
>
>     find * -print | cpio -pdumv dest_path
>
> I'd try the find by itself to see if it yields the correct file list before piping into cpio...

I will look into this and Jörg's suggestion when I return to the machine on Monday.

-Chris
Re: [zfs-discuss] missing files on copy
You may have hit a cp and/or shell bug due to the directory naming topology. Rather than depend on cp -r I prefer the cpio method:

    find * -print | cpio -pdumv dest_path

I'd try the find by itself to see if it yields the correct file list before piping into cpio...

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Christopher Gorski
Sent: Friday, January 25, 2008 12:52 PM
To: Robert Milkowski
Cc: zfs-discuss@opensolaris.org; michael schuster
Subject: Re: [zfs-discuss] missing files on copy

Christopher Gorski wrote:
> unsorted/photosbackup/laptopd600/[D]/cag2b/eujpg/103-0398_IMG.JPG is a file that is always missing in the new tree.

Oops, I meant:
unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0398_IMG.JPG
is always missing in the new tree.
Re: [zfs-discuss] missing files on copy
Christopher Gorski wrote:

> > > can you try
> > >
> > >     (cd /pond/photos; tar cf - *) | (cd /pond/copytestsame; tar xf -)
> >
> > CG> I tried it, and it worked. The new tree is an exact copy of the old one.
> >
> > could you run your cp as 'truss -t open -o /tmp/cp.truss cp * ' and then see if you can see all files being open for reads and check if they were successfully opened for writes?
>
> I ran:
>
>     # truss -t open -o /tmp/cp.truss cp -pr * /pond/copytestsame/
>
> Same result as with cp. The same files are missing in the new tree.
>
> unsorted/photosbackup/laptopd600/[D]/cag2b/eujpg/103-0398_IMG.JPG is a file that is always missing in the new tree.
>
> ...
>
> The missing file does not seem to be in the truss output.

Do not expect to see anything useful when tracing open. Check getdents(2) instead, i.e. what gets called from readdir(3).

I recently got a star bug report from a FreeBSD user that turned out to result from a missing ".." entry in a ZFS snapshot root directory.

Check the source of the failing program as well. I recently spent a lot of time fixing nasty bugs in the SCCS source, and it turned out that there were places where the author believed that "." and ".." are always returned by readdir(3) and that they are always returned first.

Jörg

--
Jörg Schilling, D-13353 Berlin
Blog: http://schily.blogspot.com/
URL: http://cdrecord.berlios.de/old/private/ ftp://ftp.berlios.de/pub/schily
Re: [zfs-discuss] missing files on copy
Christopher Gorski wrote:

> Hi, I'm running snv_78 on a dual-core 64-bit x86 system with 2 500GB USB drives mirrored into one pool.
>
> I did this (intending to set the rdonly flag after I copy my data):
>
>     zfs create pond/read-only
>     mkdir /pond/read-only/copytest
>     cp -rp /pond/photos/* /pond/read-only/copytest/
>
> After the copy is complete, a comparison of the original and copied trees revealed that /pond/read-only/copytest/photos has missing files. I tried this twice, and the missing files are different every time. I'm copying 35GB, and about 1GB is missing.
>
> cp gives me no errors, and zpool status says everything is fine. A du -k of both trees shows the discrepancy.

are you missing disk space, or actual files?

Michael

--
Michael Schuster    Sun Microsystems, Inc.
recursion, n: see 'recursion'
Re: [zfs-discuss] missing files on copy
I'm missing actual files. I did this a second time, with the exact same result. It appears that the missing files in each copy are the same files.

I originally copied these files over via Samba before trying to copy them locally with cp to the other file system. I'll have 200 sequentially numbered photos in a directory about ten levels deep, and, say, 20 of these files won't exist in the copy. I've verified that I can read the original files. I also verified locally via bash that files in the copied trees are missing.

-Chris
Re: [zfs-discuss] missing files on copy
FWIW, I just finished performing a copy again, to the same filesystem:

    mkdir /pond/copytestsame
    cd /pond/photos
    cp -rp * /pond/copytestsame

The same files are missing throughout the new tree, on the order of a thousand files. There are about 27k files in /pond/photos and 25k files in /pond/copytestsame.

The original Samba copy from another PC to /pond/photos copied everything correctly.

-Chris
Re: [zfs-discuss] missing files on copy
Christopher Gorski wrote:

> FWIW, I just finished performing a copy again, to the same filesystem:
>
>     mkdir /pond/copytestsame
>     cd /pond/photos
>     cp -rp * /pond/copytestsame
>
> Same files are missing throughout the new tree...on the order of a thousand files. There are about 27k files in /pond/photos and 25k files in /pond/copytestsame
>
> The original samba copy from another PC to /pond/photos copied everything correctly.

I assume you've made sure that there's enough space in /pond ...

can you try

    (cd /pond/photos; tar cf - *) | (cd /pond/copytestsame; tar xf -)

Michael

--
Michael Schuster    Sun Microsystems, Inc.
recursion, n: see 'recursion'
Re: [zfs-discuss] missing files on copy
Nicolas Williams wrote:

> On Thu, Jan 24, 2008 at 11:06:13PM -0500, Christopher Gorski wrote:
> > I'm missing actual files.
>
> Christopher Gorski wrote:
> > zfs create pond/read-only
> > mkdir /pond/read-only/copytest
> > cp -rp /pond/photos/* /pond/read-only/copytest/
>
> Might the missing files' names start with '.' by any chance? If so, know that the glob pattern * does not match names that start with '.'.

Valid point, but I think more precisely you need to ask whether any files/directories in /pond/photos/ start with a '.'; beneath there, that should be irrelevant.

Michael

--
Michael Schuster    Sun Microsystems, Inc.
recursion, n: see 'recursion'
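The glob behaviour discussed here is easy to check on a scratch directory. The sketch below counts what a bare `*` expands to and compares it with the number of entries actually present. The function names are invented for this example, and it assumes a POSIX-style shell with default globbing options (for instance, bash without `dotglob`).

```shell
#!/bin/sh
# Hypothetical helpers, not from the thread.

# Count the names that a bare * expands to inside directory $1.
count_glob_matches() {
    (
        cd "$1" || exit 1
        set -- *
        # With no match, the pattern is left as a literal "*".
        if [ "$#" -eq 1 ] && [ ! -e "$1" ]; then
            echo 0
        else
            echo "$#"
        fi
    )
}

# Count every entry in directory $1 except "." and "..".
count_all_entries() {
    ls -A "$1" | wc -l | tr -d ' '
}
```

If the two counts differ for /pond/photos, then `cp -rp /pond/photos/* ...` skips top-level dot entries. As Michael notes, only that top level matters: the shell expands `*` once, and everything below is read by cp itself.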
Re: [zfs-discuss] missing files on copy
No. Here's a cut and paste of names of actual files missing:

(the original)

    # ls -al /pond/photos/unsorted/drive-452a/\[E\]/drive/archives/seconddisk_20nov2002/eujpg/103-0*
    -rwxr--r-- 1 root root 593558 Nov 20 2002 /pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0398_IMG.JPG
    -rwxr--r-- 1 root root 592655 Nov 20 2002 /pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG
    -rwxr--r-- 1 root root 545029 Nov 20 2002 /pond/photos/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG

(copied tree)

    # ls -al /pond/read-only/copytest/unsorted/drive-452a/\[E\]/drive/archives/seconddisk_20nov2002/eujpg/103-0*
    -rwxr--r-- 1 root root 592655 Nov 20 2002 /pond/read-only/copytest/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0399_IMG.JPG
    -rwxr--r-- 1 root root 545029 Nov 20 2002 /pond/read-only/copytest/unsorted/drive-452a/[E]/drive/archives/seconddisk_20nov2002/eujpg/103-0400_IMG.JPG

I have plenty of disk space (over 100GB free) and have done the copy three times, with all copies simultaneously existing on the drive, all with the same missing files.

I notice that if I do cp -rp /blah/blah/eujpg/* /pond/test, all the files do copy correctly.

I'm attempting a tar create/extract as suggested by Michael, but it will take a bit of time...

-Chris
Re: [zfs-discuss] missing files on copy
Are there so many files that the glob expansion results in too large an argument list for cp?
Re: [zfs-discuss] missing files on copy
Nicolas Williams wrote:

> Are there so many files that the glob expansion results in too large an argument list for cp?

There are only four subdirs in /pond/photos:

    # ls /pond/photos
    2006-02-15  2006-06-09  2007-12-20  unsorted
Re: [zfs-discuss] missing files on copy
michael schuster wrote:

> I assume you've made sure that there's enough space in /pond ...
>
> can you try
>
>     (cd /pond/photos; tar cf - *) | (cd /pond/copytestsame; tar xf -)

I tried it, and it worked. The new tree is an exact copy of the old one.

-Chris
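For reference, the tar pipeline that worked here can be wrapped in a small function. This is a sketch, not the exact command from the thread: the name `copy_tree_with_tar` is made up, it archives `.` instead of `*` so that top-level dot files are included, and it uses `&&` after each `cd` so that tar never runs in the wrong directory if a `cd` fails.

```shell
#!/bin/sh
# Hypothetical helper, not from the thread.

# Copy the tree under $1 into directory $2 through a tar pipe.
copy_tree_with_tar() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    # The first subshell writes the archive to stdout;
    # the second one unpacks it from stdin inside the destination.
    (cd "$src" && tar cf - .) | (cd "$dst" && tar xf -)
}
```

With the paths from this thread the call would be `copy_tree_with_tar /pond/photos /pond/copytestsame`. The return status is that of the extracting tar only, so comparing the two trees afterwards is still worthwhile.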