Re: [Ocfs2-users] ls taking ages on a directory containing 900000 files

2012-12-04 Thread Sunil Mushran
strace -p PID -ttt -T

Attach and get some timings. The simplest guess is that the system lacks
memory to cache all the inodes and thus has to hit disk (and, more
importantly, take cluster locks) for the same inode repeatedly. The user
guide has a section in NOTES explaining this.
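To turn the strace timings into totals per system call, a small parser can help. The following is a minimal sketch, assuming the strace output was saved to a file and has the usual `-ttt -T` layout, where each line starts with an epoch timestamp and ends with the call duration in angle brackets (the list archive may strip those brackets from excerpts pasted into mail). The function name `summarize` is hypothetical, not part of any tool.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   1354662591.755319 lstat64("a.txt", {st_mode=S_IFREG|0664, ...}) = 0 <0.001389>
LINE_RE = re.compile(
    r'^(?P<ts>\d+\.\d+)\s+(?P<call>\w+)\(.*\s<(?P<dur>\d+\.\d+)>\s*$'
)


def summarize(lines):
    """Return {syscall: (count, total_seconds)} for the given strace lines."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip signal notices, unfinished calls, blank lines
        entry = totals[match.group("call")]
        entry[0] += 1
        entry[1] += float(match.group("dur"))
    return {call: (count, total) for call, (count, total) in totals.items()}
```

Calling `summarize(open("ls.strace"))` (the file name is only an example) gives the number of calls and the cumulative time for each system call, which shows quickly whether per-file stat calls dominate the run.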



On Tue, Dec 4, 2012 at 8:54 AM, Amaury Francois
amaury.franc...@digora.com wrote:

 Hello,

 We are running OCFS2 1.8 on a UEK2 kernel. An ls on a directory
 containing approx. 1 million files takes a very long time (1 hour). The
 features we have activated on the filesystem are the following:

 [root@pa-oca-app10 ~]# debugfs.ocfs2 -R stats /dev/sdb1
 Revision: 0.90
 Mount Count: 0   Max Mount Count: 20
 State: 0   Errors: 0
 Check Interval: 0   Last Check: Fri Nov 30 19:30:17 2012
 Creator OS: 0
 Feature Compat: 3 backup-super strict-journal-super
 Feature Incompat: 32592 sparse extended-slotmap inline-data metaecc xattr indexed-dirs refcount discontig-bg clusterinfo
 Tunefs Incomplete: 0
 Feature RO compat: 1 unwritten
 Root Blknum: 5   System Dir Blknum: 6
 First Cluster Group Blknum: 3
 Block Size Bits: 12   Cluster Size Bits: 12
 Max Node Slots: 8
 Extended Attributes Inline Size: 256
 Label: exchange2
 UUID: 2375EAF4E4954C4ABB984BDE27AC93D5
 Hash: 2880301520 (0xabade9d0)
 DX Seeds: 1678175851 1096448356 79406012 (0x6406ee6b 0x415a7964 0x04bba3bc)
 Cluster stack: o2cb
 Cluster name: appcluster
 Cluster flags: 1 Globalheartbeat
 Inode: 2   Mode: 00   Generation: 3567595533 (0xd4a5300d)
 FS Generation: 3567595533 (0xd4a5300d)
 CRC32: 0c996202   ECC: 0819
 Type: Unknown   Attr: 0x0   Flags: Valid System Superblock
 Dynamic Features: (0x0)
 User: 0 (root)   Group: 0 (root)   Size: 0
 Links: 0   Clusters: 5242635
 ctime: 0x508eac6b 0x0 -- Mon Oct 29 17:18:51.0 2012
 atime: 0x0 0x0 -- Thu Jan  1 01:00:00.0 1970
 mtime: 0x508eac6b 0x0 -- Mon Oct 29 17:18:51.0 2012
 dtime: 0x0 -- Thu Jan  1 01:00:00 1970
 Refcount Block: 0
 Last Extblk: 0   Orphan Slot: 0
 Sub Alloc Slot: Global   Sub Alloc Bit: 65535

 Could inline-data or xattr be the source of the problem?

 Thank you.

 Amaury FRANCOIS  •  Engineer
 Mobile +33 (0)6 88 12 62 54
 amaury.franc...@digora.com

 Head office - 66 rue du Marché Gare - 67200 STRASBOURG
 Tel.: 0 820 200 217 - +33 (0)3 88 10 49 20

 ___
 Ocfs2-users mailing list
 Ocfs2-users@oss.oracle.com
 https://oss.oracle.com/mailman/listinfo/ocfs2-users


Re: [Ocfs2-users] ls taking ages on a directory containing 900000 files

2012-12-04 Thread Sunil Mushran
1.5 ms per inode. Times 900K files equals 22 mins.
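The estimate can be checked against the lstat64() durations in the strace excerpt quoted further down in this message. This is a rough sketch, assuming the 13 sampled calls are representative of the whole directory; the helper name `estimate_minutes` is made up for illustration.

```python
# Durations (seconds) of the 13 lstat64() calls in the quoted strace excerpt.
SAMPLE_DURATIONS = [
    0.001389, 0.001532, 0.001429, 0.001377, 0.001420, 0.001345, 0.001541,
    0.001358, 0.001396, 0.002072, 0.001722, 0.001281, 0.001413,
]


def estimate_minutes(seconds_per_file, n_files):
    """Estimated minutes to stat n_files at a fixed per-file cost."""
    return seconds_per_file * n_files / 60.0


mean_seconds = sum(SAMPLE_DURATIONS) / len(SAMPLE_DURATIONS)
print(f"mean per-file cost: {mean_seconds * 1000:.2f} ms")
print(f"900K files at 1.5 ms each: {estimate_minutes(0.0015, 900_000):.1f} min")
print(f"900K files at sample mean: {estimate_minutes(mean_seconds, 900_000):.1f} min")
```

Both figures land a little over 22 minutes, which matches the rounded number above. The time is spent purely in per-file stat calls, so it scales linearly with the number of files.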

Large dirs are a problem in all file systems. The degree of the problem
depends on the overhead. An easy workaround is to shard the files into
multilevel dirs, such as a 2-level structure of 1000 files in each of
1000 dirs, or a 3-level structure with even fewer files per dir.
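One way to shard is to derive the subdirectory from a hash of the file name, so that files spread evenly and the location can be recomputed from the name alone. The sketch below is only an illustration of that idea: the function name, the choice of SHA-1, and the zero-padded bucket names are all assumptions, not something OCFS2 requires.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root, filename, levels=1, fanout=1000):
    """Map filename to root/<bucket>/.../filename using a stable hash.

    levels=1, fanout=1000 gives the 2-level layout (1000 dirs);
    levels=2, fanout=100 gives a 3-level layout (100 x 100 dirs).
    """
    digest = int(hashlib.sha1(filename.encode("utf-8")).hexdigest(), 16)
    width = len(str(fanout - 1))  # zero-pad so bucket names sort nicely
    buckets = []
    for _ in range(levels):
        digest, bucket = divmod(digest, fanout)
        buckets.append(f"{bucket:0{width}d}")
    return PurePosixPath(root, *buckets, filename)
```

With 900K files, `levels=1, fanout=1000` leaves roughly 900 files per directory, and `levels=2, fanout=100` roughly 90. Because the bucket depends only on the name, any node can find a file without scanning the tree.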

Or you could use the other approach suggested: avoid the per-file stat()
by disabling color-ls. Or just use plain find.
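The same idea applies to scripts: if only the names are needed, read the directory entries and never ask for per-file metadata. A minimal sketch in Python, assuming Python 3.6 or later; `list_names` is a hypothetical helper name.

```python
import os


def list_names(directory):
    """Yield entry names by reading the directory only.

    os.scandir() returns names from the directory listing itself; no
    per-file stat() is issued unless entry.stat() (or a similar method)
    is called, which this function deliberately avoids.
    """
    with os.scandir(directory) as entries:
        for entry in entries:
            yield entry.name
```

Something like `sum(1 for _ in list_names("/path/to/bigdir"))` (the path is a placeholder) counts the entries without touching each inode, so it avoids the per-inode cost measured above.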


On Tue, Dec 4, 2012 at 3:16 PM, Erik Schwartz schwartz.eri...@gmail.com wrote:

 Amaury, you can see in strace output that it's performing a stat on
 every file.

 Try simply:

   $ /bin/ls

 My guess is you're using a system where ls is aliased to use options
 that are more expensive.

 Best regards -

 Erik


 On 12/4/12 5:12 PM, Amaury Francois wrote:
  The strace looks like this (on all files):

  1354662591.755319 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P069_F01589.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001389>
  1354662591.756775 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P035_F01592.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001532>
  1354662591.758376 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P085_F01559.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001429>
  1354662591.759873 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P027_F01569.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001377>
  1354662591.761317 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P002_F01581.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001420>
  1354662591.762804 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P050_F01568.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001345>
  1354662591.764216 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P089_F01567.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001541>
  1354662591.765828 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P010_F01594.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001358>
  1354662591.767252 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P045_F01569.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001396>
  1354662591.768715 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P036_F01592.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.002072>
  1354662591.770854 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P089_F01568.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001722>
  1354662591.772643 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P009_F01600.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001281>
  1354662591.773992 lstat64("TEW_STRESS_TEST_VM.1K_100P_1F.P022_F01583.txt",
  {st_mode=S_IFREG|0664, st_size=1000, ...}) = 0 <0.001413>

  We are using a 32-bit architecture. Could that be why the kernel does
  not have enough memory? Is there any way to change this behavior?
 
 
 

Re: [Ocfs2-users] ls taking ages on a directory containing 900000 files

2012-12-04 Thread Amaury Francois
Thank you very much for your answers!

Amaury FRANCOIS  •  Engineer
Mobile +33 (0)6 88 12 62 54
amaury.franc...@digora.com

Head office - 66 rue du Marché Gare - 67200 STRASBOURG
Tel.: 0 820 200 217 - +33 (0)3 88 10 49 20


