Re: [Gluster-devel] [Gluster-users] high load when copy directory with many files

2021-04-21 Thread Xavi Hernandez
Hi Marco,

sorry for the late reply.

I've run some tests and I don't see any big difference between ls, stat and
getfattr. Can you provide more details about which tests you ran?
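
For reference, a minimal sketch of the kind of parallel test being discussed, assuming a plain POSIX shell. The helper name run_parallel, the mount path and the iteration count are illustrative assumptions, not what was actually run in this thread:

```shell
# Sketch: run a command repeatedly from several parallel workers.
# Usage: run_parallel WORKERS ITERATIONS COMMAND [ARGS...]
run_parallel() {
    workers=$1
    iterations=$2
    shift 2
    w=0
    while [ "$w" -lt "$workers" ]; do
        (
            i=0
            while [ "$i" -lt "$iterations" ]; do
                # Errors are expected when the target does not exist.
                "$@" >/dev/null 2>&1 || true
                i=$((i + 1))
            done
        ) &
        w=$((w + 1))
    done
    # Block until every background worker has finished.
    wait
}

# Example (placeholder path, 30 workers as in the original report):
#   MISSING=/mnt/glustervol/does-not-exist
#   time run_parallel 30 100 ls "$MISSING"
#   time run_parallel 30 100 stat "$MISSING"
#   time run_parallel 30 100 getfattr -d "$MISSING"
```

Running the same helper against an existing and a non-existent path should make it easier to compare the three commands under identical parallelism.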

It would also help to provide profile info for each test:

To start profiling: gluster volume profile <VOLNAME> start
Before each test: gluster volume profile <VOLNAME> info clear
After the test: gluster volume profile <VOLNAME> info >/some/file
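
The per-test steps above could be wrapped in a small helper, sketched here under some assumptions: the function name profile_test, the GLUSTER override variable and the example volume name and paths are all hypothetical, and profiling must already have been started on the volume.

```shell
# GLUSTER can be overridden (e.g. GLUSTER=echo) to dry-run the sketch.
GLUSTER=${GLUSTER:-gluster}

# Usage: profile_test VOLNAME OUTFILE COMMAND [ARGS...]
profile_test() {
    volume=$1
    outfile=$2
    shift 2
    # Reset the counters so the report only covers this test.
    "$GLUSTER" volume profile "$volume" info clear >/dev/null || return 1
    # Run the workload; its exit status is deliberately ignored.
    "$@" || true
    # Save the per-fop statistics collected during the test.
    "$GLUSTER" volume profile "$volume" info >"$outfile"
}

# Example (volume name and paths are placeholders):
#   gluster volume profile myvol start
#   profile_test myvol /tmp/profile-stat.txt stat /mnt/myvol/missing-file
#   gluster volume profile myvol stop
```

Keeping one output file per test makes it simpler to see which fop dominates in each case.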

Regards,

Xavi

On Mon, Apr 12, 2021 at 9:01 AM Xavi Hernandez  wrote:

> On Sun, Apr 11, 2021 at 10:29 AM Amar Tumballi  wrote:
>
>> Hi Marco, this is really good test/info. Thanks.
>>
>> One more thing to observe while you are running such tests is 'gluster
>> profile info', so the bottleneck fop is listed.
>>
>> Mohit, Xavi, in these parallel operations, could the load be high due to
>> the inodelk used in the mds xattr update in dht? Or do you suspect
>> something else?
>>
>
> A profile info would be very useful to know which fop gets the most requests.
> I think inodelk by itself shouldn't be an issue (I guess we are setting mds
> only once, right?). In theory we shouldn't be sending any operation on an
> inode without a previous successful lookup, and in this case lookups should
> fail, so I don't clearly see what the difference is compared to a stat.
>
> We should investigate this. I'll try to do some experiments (not sure if
> this week, though).
>
> Regards,
>
> Xavi
>
>
>> Regards
>> Amar
>>
>> On Sat, 10 Apr, 2021, 11:45 pm Marco Lerda - FOREACH S.R.L., <
>> marco.le...@foreach.it> wrote:
>>
>>> hi,
>>> we have isolated the problem (meanwhile some hardware upgrades and code
>>> optimization helped to limit it).
>>> It happens when many requests (HTTP over apache) come for a non-existent
>>> file.
>>> 30 concurrent requests to the same non-existent file make the load
>>> grow without limit.
>>> The same requests on existing files work fine.
>>> I have tried to simulate the apache file access without apache, running
>>> repeated commands on files with the same parallelism (30):
>>> - with ls it works fine, whether the file exists or not
>>> - with stat it works fine, whether the file exists or not
>>> - with xattr the load goes up, whether the file exists or not
>>>
>>> thank you
>>>
>>>
>>> On 05/10/2020 19.45, Marco Lerda - FOREACH S.R.L. wrote:
>>> > hi,
>>> > we use glusterfs for a php application that has many small php files,
>>> > images, etc.
>>> > We use glusterfs in replication mode.
>>> > We have 2 nodes connected over fiber at 100MBps with less than 1 ms
>>> > latency.
>>> > We also have an arbiter on a slower network (but the issue is also there
>>> > without the arbiter).
>>> > When we copy a directory (cp command) with many files, cpu usage and
>>> > load explode rapidly, and
>>> > our application becomes inaccessible until the copy ends.
>>> >
>>> > I wonder whether that is normal or we have done something wrong.
>>> > I know that glusterfs is not recommended for many small files, and I
>>> > know that it slows down,
>>> > but I want to avoid a simple copy of a directory taking down
>>> > our application.
>>> >
>>> > Any suggestions?
>>> >
>>> > Thanks a lot
>>> >
>>> >
>>> >
>>> > 
>>> >
>>> >
>>> >
>>> > Community Meeting Calendar:
>>> >
>>> > Schedule -
>>> > Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>> > Bridge: https://bluejeans.com/441850968
>>> >
>>> > Gluster-users mailing list
>>> > gluster-us...@gluster.org
>>> > https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>>> --
>>>
>>> --
>>> Marco Lerda
>>> FOREACH S.R.L.
>>> Via Laghi di Avigliana 115, 12022 - Busca (CN)
>>> Phone: 0171-1984102
>>> Switchboard/Fax: 0171-1984100
>>> Email:  marco.le...@foreach.it
>>> Web: http://www.foreach.it
>>>
>>> 
>>>
>>>
>>>
>>> Community Meeting Calendar:
>>>
>>> Schedule -
>>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>>> Bridge: https://meet.google.com/cpu-eiue-hvk
>>> Gluster-users mailing list
>>> gluster-us...@gluster.org
>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>
>> ---
>>
>> Community Meeting Calendar:
>> Schedule -
>> Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
>> Bridge: https://meet.google.com/cpu-eiue-hvk
>>
>> Gluster-devel mailing list
>> Gluster-devel@gluster.org
>> https://lists.gluster.org/mailman/listinfo/gluster-devel
>>
>>



Re: [Gluster-devel] [Gluster-users] high load when copy directory with many files

2021-04-14 Thread Marco Lerda - FOREACH S.R.L.

  
  
thank you all,
I just tried setting performance.nl-cache to on and it worked.
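
For anyone landing here later, a minimal sketch of applying that setting. As far as I understand it, nl-cache caches negative lookups on the client, so repeated lookups of a non-existent file need not reach the bricks each time. The function name enable_nl_cache and the GLUSTER override variable are hypothetical helpers, and the volume name is a placeholder:

```shell
# GLUSTER can be overridden (e.g. GLUSTER=echo) to dry-run the sketch.
GLUSTER=${GLUSTER:-gluster}

# Usage: enable_nl_cache VOLNAME
enable_nl_cache() {
    # Turn on the negative-lookup cache for the volume.
    "$GLUSTER" volume set "$1" performance.nl-cache on || return 1
    # Print the resulting value to confirm the change.
    "$GLUSTER" volume get "$1" performance.nl-cache
}

# Example (placeholder volume name):
#   enable_nl_cache myvol
```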






Re: [Gluster-devel] [Gluster-users] high load when copy directory with many files

2021-04-12 Thread Xavi Hernandez
On Sun, Apr 11, 2021 at 10:29 AM Amar Tumballi  wrote:

> Hi Marco, this is really good test/info. Thanks.
>
> One more thing to observe while you are running such tests is 'gluster
> profile info', so the bottleneck fop is listed.
>
> Mohit, Xavi, in these parallel operations, could the load be high due to
> the inodelk used in the mds xattr update in dht? Or do you suspect
> something else?
>

A profile info would be very useful to know which fop gets the most requests.
I think inodelk by itself shouldn't be an issue (I guess we are setting mds
only once, right?). In theory we shouldn't be sending any operation on an
inode without a previous successful lookup, and in this case lookups should
fail, so I don't clearly see what the difference is compared to a stat.

We should investigate this. I'll try to do some experiments (not sure if
this week, though).

Regards,

Xavi





Re: [Gluster-devel] [Gluster-users] high load when copy directory with many files

2021-04-11 Thread Amar Tumballi
Hi Marco, this is really good test/info. Thanks.

One more thing to observe while you are running such tests is 'gluster profile
info', so the bottleneck fop is listed.

Mohit, Xavi, in these parallel operations, could the load be high due to the
inodelk used in the mds xattr update in dht? Or do you suspect something else?

Regards
Amar

>