Re: [Ocfs2-users] ls taking ages on a directory containing 900000 files
Attach strace to the running ls and get some timings:

    strace -p PID -ttt -T

The simplest guess is that the system lacks the memory to cache all the inodes and thus has to hit disk (and, more importantly, take cluster locks) for the same inode repeatedly. The user guide has a section in NOTES explaining this.

On Tue, Dec 4, 2012 at 8:54 AM, Amaury Francois <amaury.franc...@digora.com> wrote:

> Hello,
>
> We are running OCFS2 1.8 on a UEK2 kernel. An ls on a directory containing approximately 1 million files takes a very long time (1 hour). The features we have activated on the filesystem are the following:
>
>     [root@pa-oca-app10 ~]# debugfs.ocfs2 -R stats /dev/sdb1
>         Revision: 0.90
>         Mount Count: 0   Max Mount Count: 20
>         State: 0   Errors: 0
>         Check Interval: 0   Last Check: Fri Nov 30 19:30:17 2012
>         Creator OS: 0
>         Feature Compat: 3 backup-super strict-journal-super
>         Feature Incompat: 32592 sparse extended-slotmap inline-data metaecc xattr indexed-dirs refcount discontig-bg clusterinfo
>         Tunefs Incomplete: 0
>         Feature RO compat: 1 unwritten
>         Root Blknum: 5   System Dir Blknum: 6
>         First Cluster Group Blknum: 3
>         Block Size Bits: 12   Cluster Size Bits: 12
>         Max Node Slots: 8
>         Extended Attributes Inline Size: 256
>         Label: exchange2
>         UUID: 2375EAF4E4954C4ABB984BDE27AC93D5
>         Hash: 2880301520 (0xabade9d0)
>         DX Seeds: 1678175851 1096448356 79406012 (0x6406ee6b 0x415a7964 0x04bba3bc)
>         Cluster stack: o2cb
>         Cluster name: appcluster
>         Cluster flags: 1 Globalheartbeat
>         Inode: 2   Mode: 00   Generation: 3567595533 (0xd4a5300d)
>         FS Generation: 3567595533 (0xd4a5300d)
>         CRC32: 0c996202   ECC: 0819
>         Type: Unknown   Attr: 0x0   Flags: Valid System Superblock
>         Dynamic Features: (0x0)
>         User: 0 (root)   Group: 0 (root)   Size: 0
>         Links: 0   Clusters: 5242635
>         ctime: 0x508eac6b 0x0 -- Mon Oct 29 17:18:51.0 2012
>         atime: 0x0 0x0 -- Thu Jan 1 01:00:00.0 1970
>         mtime: 0x508eac6b 0x0 -- Mon Oct 29 17:18:51.0 2012
>         dtime: 0x0 -- Thu Jan 1 01:00:00 1970
>         Refcount Block: 0
>         Last Extblk: 0   Orphan Slot: 0
>         Sub Alloc Slot: Global   Sub Alloc Bit: 65535
>
> Could inline-data or xattr be the source of the problem?
>
> Thank you.
>
> Amaury Francois

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-users
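Once `strace -ttt -T` output has been captured, the per-call timings can be averaged and extrapolated to the whole directory. The following is only a minimal sketch in Python: the function names are made up for illustration, the regular expression assumes the duration is the last field on each line (as `-T` prints it), and the three sample lines are copied from the trace posted in this thread.

```python
import re
from statistics import mean

# `strace -T` appends the time spent in each call, e.g. "... = 0 <0.001389>".
# The angle brackets are treated as optional because some mail archives strip them.
DURATION_RE = re.compile(r"<?(\d+\.\d+)>?\s*$")


def syscall_durations(lines):
    """Return the per-call durations (in seconds) found at the end of each line."""
    durations = []
    for line in lines:
        match = DURATION_RE.search(line)
        if match:
            durations.append(float(match.group(1)))
    return durations


def estimated_listing_seconds(lines, n_files):
    """Extrapolate: mean time per stat call multiplied by the number of files."""
    return mean(syscall_durations(lines)) * n_files


# Sample lines taken from the trace in this thread.
sample = [
    "1354662591.755319 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P069_F01589.txt, "
    "{st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001389>",
    "1354662591.756775 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P035_F01592.txt, "
    "{st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001532>",
    "1354662591.758376 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P085_F01559.txt, "
    "{st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001429>",
]

if __name__ == "__main__":
    seconds = estimated_listing_seconds(sample, 900_000)
    print(f"about {seconds / 60:.1f} minutes for 900000 files")
```

With these three samples the mean is close to 1.45 ms per call, which for 900000 files works out to a little over 20 minutes spent in stat calls alone.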
Re: [Ocfs2-users] ls taking ages on a directory containing 900000 files
1.5 ms per inode, times 900K files, equals about 22 minutes.

Large directories are a problem in all file systems; how much of a problem depends on the per-entry overhead. An easy workaround is to shard the files into multilevel directories: for example, a 2-level structure of 1000 files in each of 1000 directories, or a 3-level structure with even fewer files per directory.

Or you could use the other approach suggested: avoid the stat() calls by disabling color-ls. Or just use plain find.

On Tue, Dec 4, 2012 at 3:16 PM, Erik Schwartz <schwartz.eri...@gmail.com> wrote:

> Amaury,
>
> you can see in the strace output that it's performing a stat on every file. Try simply:
>
>     $ /bin/ls
>
> My guess is you're using a system where ls is aliased to use options that are more expensive.
>
> Best regards -
>
> Erik
>
> On 12/4/12 5:12 PM, Amaury Francois wrote:
>> The strace looks like this (on all files):
>>
>>     1354662591.755319 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P069_F01589.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001389>
>>     1354662591.756775 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P035_F01592.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001532>
>>     1354662591.758376 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P085_F01559.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001429>
>>     1354662591.759873 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P027_F01569.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001377>
>>     1354662591.761317 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P002_F01581.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001420>
>>     1354662591.762804 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P050_F01568.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001345>
>>     1354662591.764216 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P089_F01567.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001541>
>>     1354662591.765828 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P010_F01594.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001358>
>>     1354662591.767252 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P045_F01569.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001396>
>>     1354662591.768715 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P036_F01592.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.002072>
>>     1354662591.770854 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P089_F01568.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001722>
>>     1354662591.772643 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P009_F01600.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001281>
>>     1354662591.773992 lstat64(TEW_STRESS_TEST_VM.1K_100P_1F.P022_F01583.txt, {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001413>
>>
>> We are using a 32-bit architecture. Could that be why the kernel does not have enough memory? Is there any way to change this behavior?
>>
>> Amaury Francois
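The sharding idea above can be sketched as a function that maps each file name to a bucket directory. This is an illustrative sketch only, not something proposed in the thread: `shard_path`, its parameters, and the choice of CRC32 as the hash are all assumptions, and any stable hash of the name would do.

```python
import zlib
from pathlib import PurePosixPath


def shard_path(filename, levels=1, fanout=1000):
    """Map a file name to a relative path with `levels` directory levels above it.

    Each level has at most `fanout` subdirectories, chosen from a CRC32 of the
    name. CRC32 is only 32 bits wide, so this sketch is meant for one or two
    levels; deeper trees would need a wider hash.
    """
    value = zlib.crc32(filename.encode("utf-8"))
    width = len(str(fanout - 1))  # zero-pad so bucket names have equal length
    parts = []
    for _ in range(levels):
        value, bucket = divmod(value, fanout)
        parts.append(f"{bucket:0{width}d}")
    return PurePosixPath(*parts, filename)


if __name__ == "__main__":
    name = "TEW_STRESS_TEST_VM.1K_100P_1F.P069_F01589.txt"
    # One directory level of up to 1000 buckets: roughly 1000 files per
    # directory for a million files (the "2-level" layout).
    print(shard_path(name))
    # Two directory levels of up to 100 buckets each (the "3-level" layout).
    print(shard_path(name, levels=2, fanout=100))
```

Because the bucket is derived from the name alone, a reader can recompute the path of any file without listing a directory, so no lookup ever has to scan a million entries.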
Re: [Ocfs2-users] ls taking ages on a directory containing 900000 files
Thank you very much for your answers!

Amaury Francois
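For scripts that only need the file names, the same stat-avoidance suggested earlier in the thread applies. A minimal Python sketch, assuming Linux and using a hypothetical helper `iter_names`: `os.scandir` yields names straight from the directory entries and does not stat each file unless attributes are requested.

```python
import os
import tempfile


def iter_names(directory):
    """Yield entry names one at a time, without requesting per-file attributes."""
    with os.scandir(directory) as entries:
        for entry in entries:
            yield entry.name


if __name__ == "__main__":
    # Small self-contained demonstration on a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(3):
            open(os.path.join(tmp, f"file{i}.txt"), "w").close()
        print(sorted(iter_names(tmp)))
```

Yielding names lazily also avoids building a list of a million entries in memory before the first one can be processed.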