Re: [Pulp-list] Content server Performance

2021-07-01 Thread Daniel Alley
Pulp 3.14 is out. It includes both the content-serving improvements and
"retry on error" downloads, which should help with the CDN errors you've
been experiencing.  Katello should have their RPMs ready in a couple of
days.

___
Pulp-list mailing list
[email protected]
https://listman.redhat.com/mailman/listinfo/pulp-list

Re: [Pulp-list] Content server Performance

2021-06-28 Thread Daniel Alley
Sorry Bin, this ended up in my spam somehow so I missed your update until a
second ago.

Realistically, it's probably getting bottlenecked on the database.  You can
definitely try increasing the workers further (beyond 50) but I'm not sure
how much it will help.  A lot of the improvements in 3.14 are oriented
around reducing our load on the database so it should help quite a bit.


Re: [Pulp-list] Content server Performance

2021-06-22 Thread Bin Li (BLOOMBERG/ 120 PARK)
We will look into upgrading from 3.7.3 to 3.14.
For now, I have increased the number of workers a few times, and we now have
50 workers running. I no longer see the timed-out messages, but the TIME_WAIT
count is still around 5k.

# netstat -an | grep -i TIME_WAIT |grep 24816 | wc -l
5473

I also noticed that the number of database connections is over 60:
=> select count(*) from pg_stat_activity where usename = 'pulp';
 count
-------
    63
(1 row)

Should I keep adding workers until the queue comes down? We still have plenty
of CPU and memory on the host.
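
The TIME_WAIT count above comes from a grep pipeline, which matches the port number anywhere on the line. A minimal Python sketch of a stricter count follows. It assumes Linux-style `netstat -an` TCP rows (protocol, Recv-Q, Send-Q, local address, foreign address, state); the function name and the sample rows are made up for illustration and are not taken from the host discussed in this thread.

```python
def count_time_wait(netstat_output: str, port: int) -> int:
    """Count TCP sockets in TIME_WAIT whose local or foreign address uses `port`."""
    suffix = f":{port}"
    count = 0
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed row layout: proto, recv-q, send-q, local, foreign, state
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if state == "TIME_WAIT" and (local.endswith(suffix) or foreign.endswith(suffix)):
            count += 1
    return count


# Hypothetical sample rows, for illustration only.
SAMPLE = """\
tcp        0      0 127.0.0.1:24816         127.0.0.1:51234         TIME_WAIT
tcp        0      0 127.0.0.1:24816         127.0.0.1:51236         ESTABLISHED
tcp        0      0 127.0.0.1:51240         127.0.0.1:24816         TIME_WAIT
tcp        0      0 10.0.0.5:443            10.0.0.9:52000          TIME_WAIT
tcp6       0      0 :::22                   :::*                    LISTEN
"""

if __name__ == "__main__":
    print(count_time_wait(SAMPLE, 24816))  # prints 2
```

On a real host the text could be captured with `netstat -an` and passed to the function; tracking the number over time is probably more informative than a single reading.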



Re: [Pulp-list] Content server Performance

2021-06-22 Thread Bin Li (BLOOMBERG/ 120 PARK)
We added API workers but not gunicorn workers. I noticed the content
service is started with:

ExecStart=/opt/utils/venv/pulp/3.7.3/bin/gunicorn pulpcore.content:server \
  --bind '127.0.0.1:24816' \
  --worker-class 'aiohttp.GunicornWebWorker' \
  -w 2 \
  --access-logfile -

I will update -w to 10 to see if it helps.


Re: [Pulp-list] Content server Performance

2021-06-22 Thread Brian Bouterse
On Tue, Jun 22, 2021 at 11:56 AM Danny Sauer wrote:

> You can certainly run multiple instances of the content server.  It just
> needs a connection to the database and access to the storage.
>
Agreed, you could deploy additional content servers and have your
nginx/apache load balance them.


> Have you tuned the number of worker processes in Gunicorn?  It defaults to
> 1, but should almost certainly be increased for any sort of volume.
> https://docs.gunicorn.org/en/stable/settings.html#worker-processes
>
Pulp changed the default gunicorn worker processes to 8 maybe a release or
two ago. See the `pulp_content_workers` variable in the installer here
https://pulp-installer.readthedocs.io/en/latest/roles/pulp_content/#role-variables

>
> There are several moving pieces, but that's really all I had to touch here.
>
> --Danny
>
For pulpcore 3.14 there is a significant performance improvement under
review now: https://pulp.plan.io/issues/8805. In addition to working around
the problem with methods like the above, when 3.14 comes out (scheduled for
June 29th) it would be great if you could report whether the improvements
helped you.


Re: [Pulp-list] Content server Performance

2021-06-22 Thread Daniel Alley
We've been working on efficiency and performance over the past few
releases, particularly the one that will be released in the next week
(3.14).  I wouldn't expect that changing the number of Nginx workers would
improve anything, but increasing the number of gunicorn workers *should*.
It sounds like you may have already done that?

In pulpcore-content.service, tweak the value of "workers":

ExecStart=/usr/local/lib/pulp/bin/gunicorn pulpcore.content:server \
  --bind '127.0.0.1:24816' \
  --worker-class 'aiohttp.GunicornWebWorker' \
  --workers 8 \
  --timeout 90 \
  --access-logfile -

But yes, this is a brute-force approach; for better efficiency you'll need
to upgrade.
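
For picking a first value for "workers", Gunicorn's own documentation suggests (2 x number of cores) + 1 as a general starting point. The sketch below only computes that number. It is not Pulp-specific guidance, the function name is hypothetical, and if the database turns out to be the real bottleneck, adding workers beyond this may not help.

```python
import multiprocessing


def suggested_workers(cpu_count: int) -> int:
    """Gunicorn's general starting point: (2 x cores) + 1."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return 2 * cpu_count + 1


if __name__ == "__main__":
    cores = multiprocessing.cpu_count()
    # Treat the result as a value to start tuning from, not a target.
    print(f"{cores} cores -> try --workers {suggested_workers(cores)}")
```

A host with 4 cores would start at 9 workers under this rule; from there, load testing is the only reliable way to find the point where more workers stop helping.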


Re: [Pulp-list] Content server Performance

2021-06-22 Thread Danny Sauer
You can certainly run multiple instances of the content server.  It just
needs a connection to the database and access to the storage.

Have you tuned the number of worker processes in Gunicorn?  It defaults to
1, but should almost certainly be increased for any sort of volume.
https://docs.gunicorn.org/en/stable/settings.html#worker-processes

There are several moving pieces, but that's really all I had to touch here.

--Danny

On Tue, Jun 22, 2021 at 10:34 AM Bin Li (BLOOMBERG/ 120 PARK) <
[email protected]> wrote:

> We recently added more clients that use the Pulp content server. The
> processes first ran out of file descriptors. We then raised the limit for
> both nginx and pulp-content by creating an override.conf:
> /etc/systemd/system/pulpcore-content.service.d # cat override.conf
> [Service]
> LimitNOFILE=65536
>
> and updated nginx.conf
> # Gunicorn docs suggest this value.
> worker_processes 1;
> events {
>     worker_connections 1; # increase if you have lots of clients
>     accept_mutex off; # set to 'on' if nginx worker_processes > 1
> }
>
> worker_rlimit_nofile 2;
>
>
> Now we keep getting this error:
> 2021/06/22 11:26:36 [error] 78373#0: *112823 upstream timed out (110:
> Connection timed out) while connecting to upstream, client:
>
> It looks like pulp-content server cannot keep up with requests. Is there
> anything we could do to increase the performance of the content server?