Re: [Gluster-devel] [Gluster-users] high load when copy directory with many files
Hi Marco,

sorry for the late reply. I've run some tests and I don't see any big difference between ls, stat and getfattr. Can you provide more details about which tests you ran?

It would also help to provide profile info for each test:

To start profiling:

    gluster volume profile VOLNAME start

Before each test:

    gluster volume profile VOLNAME info clear

After the test:

    gluster volume profile VOLNAME info >/some/file

Regards,

Xavi

On Mon, Apr 12, 2021 at 9:01 AM Xavi Hernandez wrote:

> On Sun, Apr 11, 2021 at 10:29 AM Amar Tumballi wrote:
>
>> Hi Marco, this is really good test/info. Thanks.
>>
>> One more thing to observe while you are running such tests is 'gluster
>> profile info', so that the bottleneck fop is listed.
>>
>> Mohit, Xavi, in these parallel operations, could the load be high due to
>> the inodelk used in the mds xattr update in dht? Or do you suspect
>> something else?
>
> A profile info would be very useful to know which fop gets more requests.
> I think inodelk by itself shouldn't be an issue (I guess we are setting mds
> only once, right?). In theory we shouldn't be sending any operation on an
> inode without a previous successful lookup, and in this case lookups should
> fail, so I don't clearly see what the difference is compared to a stat.
>
> We should investigate this. I'll try to do some experiments (not sure if
> this week, though).
>
> Regards,
>
> Xavi
>
>> Regards
>> Amar
>>
>> On Sat, 10 Apr, 2021, 11:45 pm Marco Lerda - FOREACH S.R.L.,
>> <marco.le...@foreach.it> wrote:
>>
>>> hi,
>>> we have isolated the problem (meanwhile some hardware upgrades and code
>>> optimization helped to limit it).
>>> It happens when many requests (HTTP over apache) come to a non-existent
>>> file.
>>> 30 concurrent requests to the same non-existent file make the load
>>> climb without limit.
>>> The same requests on existing files work fine.
>>> I have tried to simulate the apache file access without apache, using
>>> repeated commands on files with the same parallelism (30):
>>> - with ls it works fine, whether the file exists or not
>>> - with stat it works fine, whether the file exists or not
>>> - with xattr the load goes up, whether the file exists or not
>>>
>>> thank you
>>>
>>> On 05/10/2020 19.45, Marco Lerda - FOREACH S.R.L. wrote:
>>> > hi,
>>> > we use glusterfs for a php application that has many small php files,
>>> > images etc...
>>> > We use glusterfs in replication mode.
>>> > We have 2 nodes connected over fiber with 100MBps and less than 1 ms
>>> > latency.
>>> > We also have an arbiter on a slower network (but the issue is there
>>> > without the arbiter too).
>>> > When we copy a directory (cp command) with many files, cpu usage and
>>> > load explode rapidly, and
>>> > our application becomes inaccessible until the copy ends.
>>> >
>>> > I wonder if that is normal or if we have done something wrong.
>>> > I know that glusterfs is not recommended for many small files, and I
>>> > know that it slows down,
>>> > but I want to avoid a simple copy of a directory taking down
>>> > our application.
>>> >
>>> > Any suggestion?
>>> >
>>> > Thanks a lot
>>> >
>>> > Community Meeting Calendar:
>>> >
>>> > Schedule -
>>> > Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>> > Bridge: https://bluejeans.com/441850968
>>> >
>>> > Gluster-users mailing list
>>> > gluster-us...@gluster.org
>>> > https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>> --
>>> Marco Lerda
>>> FOREACH S.R.L.
>>> Via Laghi di Avigliana 115, 12022 - Busca (CN)
>>> Phone: 0171-1984102
>>> Switchboard/Fax: 0171-1984100
>>> Email: marco.le...@foreach.it
>>> Web: http://www.foreach.it

---

Community Meeting Calendar:
Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://meet.google.com/cpu-eiue-hvk

Gluster-devel mailing list
Gluster-devel@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-devel
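The profiling steps Xavi lists could be wrapped in a small helper so that each test gets its own capture. This is only a sketch under stated assumptions: the function name `profile_test`, the `GLUSTER` override variable, and the volume name and paths in the usage comment are illustrative and do not come from the thread. Profiling is assumed to have been started once on the volume beforehand.

```shell
#!/bin/sh
# Sketch only: profile_test and the GLUSTER variable are hypothetical helpers.
# GLUSTER can point at a stub for dry runs; it defaults to the real CLI.
GLUSTER="${GLUSTER:-gluster}"

# profile_test VOLNAME OUTFILE COMMAND [ARGS...]
# Clears the profile counters, runs the command, then saves the profile info.
profile_test() {
    if [ "$#" -lt 3 ]; then
        echo "usage: profile_test VOLNAME OUTFILE COMMAND [ARGS...]" >&2
        return 2
    fi
    vol="$1"
    outfile="$2"
    shift 2

    # Before each test: reset the counters.
    "$GLUSTER" volume profile "$vol" info clear >/dev/null || return 1

    # The test itself. Its exit status is ignored on purpose, because a
    # test against a non-existent file is expected to fail.
    "$@" >/dev/null 2>&1 || true

    # After the test: save the counters collected during this run.
    "$GLUSTER" volume profile "$vol" info >"$outfile"
}

# Example (hypothetical volume name and paths):
#   gluster volume profile myvol start
#   profile_test myvol /tmp/profile-stat.txt stat /mnt/gluster/missing-file
```

Keeping one output file per test makes it easier to compare which fop dominates in the ls, stat and getfattr cases.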
Re: [Gluster-devel] [Gluster-users] high load when copy directory with many files
thank you all,
I just tried setting performance.nl-cache to on and it worked.

--
Marco Lerda
FOREACH S.R.L.
Via Laghi di Avigliana 115, 12022 - Busca (CN)
Phone: 0171-1984102
Switchboard/Fax: 0171-1984100
Email: marco.le...@foreach.it
Web: http://www.foreach.it
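For readers who want to try the same setting, the change Marco describes might look like the sketch below. The option name `performance.nl-cache` comes from his message. The helper name `enable_nl_cache`, the `GLUSTER` override variable and the volume name in the usage comment are assumptions made for illustration. The option turns on Gluster's negative-lookup cache, which is meant to remember lookups that found no file, so it plausibly helps with the repeated requests for a missing file described earlier in the thread.

```shell
#!/bin/sh
# Sketch only: enable_nl_cache and the GLUSTER variable are hypothetical helpers.
# GLUSTER can point at a stub for dry runs; it defaults to the real CLI.
GLUSTER="${GLUSTER:-gluster}"

# enable_nl_cache VOLNAME
# Turns the negative-lookup cache on and prints the resulting option value.
enable_nl_cache() {
    vol="${1:-}"
    if [ -z "$vol" ]; then
        echo "usage: enable_nl_cache VOLNAME" >&2
        return 2
    fi

    # The setting reported in the thread.
    "$GLUSTER" volume set "$vol" performance.nl-cache on || return 1

    # Read the option back so the change can be confirmed.
    "$GLUSTER" volume get "$vol" performance.nl-cache
}

# Example (hypothetical volume name):
#   enable_nl_cache myvol
```

Capturing a profile before and after the change would show whether the number of lookups reaching the bricks actually drops.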
Re: [Gluster-devel] [Gluster-users] high load when copy directory with many files
On Sun, Apr 11, 2021 at 10:29 AM Amar Tumballi wrote:

> Hi Marco, this is really good test/info. Thanks.
>
> One more thing to observe while you are running such tests is 'gluster
> profile info', so that the bottleneck fop is listed.
>
> Mohit, Xavi, in these parallel operations, could the load be high due to
> the inodelk used in the mds xattr update in dht? Or do you suspect
> something else?

A profile info would be very useful to know which fop gets more requests. I think inodelk by itself shouldn't be an issue (I guess we are setting mds only once, right?). In theory we shouldn't be sending any operation on an inode without a previous successful lookup, and in this case lookups should fail, so I don't clearly see what the difference is compared to a stat.

We should investigate this. I'll try to do some experiments (not sure if this week, though).

Regards,

Xavi
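An experiment along the lines of Marco's test (the same command repeated with a parallelism of 30) could be driven by a loop like the one below. It is a rough sketch, not the commands he actually ran: the helper name `run_parallel`, the number of rounds, the mount path and the use of `getfattr -d` for the "xattr" case are all assumptions.

```shell
#!/bin/sh
# Sketch only: run_parallel is a hypothetical helper.

# run_parallel JOBS ROUNDS COMMAND [ARGS...]
# Starts JOBS background workers; each one runs COMMAND ROUNDS times.
run_parallel() {
    if [ "$#" -lt 3 ]; then
        echo "usage: run_parallel JOBS ROUNDS COMMAND [ARGS...]" >&2
        return 2
    fi
    njobs="$1"
    rounds="$2"
    shift 2

    i=0
    while [ "$i" -lt "$njobs" ]; do
        (
            r=0
            while [ "$r" -lt "$rounds" ]; do
                # Failures are expected when the target file does not exist.
                "$@" >/dev/null 2>&1 || true
                r=$((r + 1))
            done
        ) &
        i=$((i + 1))
    done

    # Block until every worker has finished.
    wait
}

# Examples (hypothetical path on the Gluster mount, 30 workers as in the thread):
#   run_parallel 30 100 ls /mnt/gluster/missing-file
#   run_parallel 30 100 stat /mnt/gluster/missing-file
#   run_parallel 30 100 getfattr -d /mnt/gluster/missing-file
```

Running each variant separately, with the profile counters cleared in between, should show whether the xattr case really sends different fops than stat does.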
Re: [Gluster-devel] [Gluster-users] high load when copy directory with many files
Hi Marco, this is really good test/info. Thanks.

One more thing to observe while you are running such tests is 'gluster profile info', so that the bottleneck fop is listed.

Mohit, Xavi, in these parallel operations, could the load be high due to the inodelk used in the mds xattr update in dht? Or do you suspect something else?

Regards
Amar