Not an expert here, as I only started working with Pulp recently, but I
tried to set up a 2-3 server configuration with a master and 1 or 2
child nodes. The aim was to spread the load (active-active, so no
clustered configuration) and to avoid a single point of failure.
Sadly I got stuck on the fact that nodes need OAuth authentication,
which was not working properly, while other documentation pages
declared OAuth deprecated and soon to be removed from Pulp.
How nodes are supposed to work is therefore a mystery to me. Since the
documentation contradicted itself and I didn't manage to make it work
(SSL issues even though I had disabled SSL everywhere), I opted for a
totally different approach:
I created a single Pulp server and mounted a NAS volume.
I moved /var/lib/pulp and /var/lib/mongodb onto the NAS and replaced
the original paths with NFS mounts. Symbolic links would work for
mongodb, but not for Pulp, as some paths need to be reachable by
Apache, which by default does not follow symlinks.
Once the Pulp data was on the NAS, I exported that volume to 2 more
Apache servers and made the same 'published' directory available
through those Apache servers (you can reuse the pulp.conf from
/etc/httpd/conf.d, it only needs minor changes). All the clients
connect to the Apache servers, so I can scale horizontally as much as
I want, and the Pulp server only does the repo syncs, so its load is
actually quite low.
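Roughly what that looks like on each extra Apache server, as a sketch;
the hostname and export name are again made up, and which parts of
pulp.conf you keep or strip depends on your install (anything that
points at Pulp components not present on the box has to go):

  # mount the same NAS export; read-only is enough on the web heads
  sudo mount -t nfs -o ro nas01.example.com:/export/pulp /var/lib/pulp

  # drop in the httpd config copied from the pulp server, then adjust
  # SSL cert paths etc. for this host
  sudo cp pulp.conf /etc/httpd/conf.d/pulp.conf
  sudo systemctl restart httpd

  # smoke test: the published tree should be reachable through this
  # host (expect a listing or a 403, not a connection error)
  curl -k https://localhost/pulp/repos/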
The good:
With this configuration the Pulp server can be restarted, reinstalled,
or shut down and the repos remain available to the hosts, because they
connect to the Apache servers. This makes Pulp maintenance easier.
Having Pulp unavailable only means that there will be no new syncs to
update the repositories; the repos themselves stay available.
The bad:
This is all nice, but only if you use Pulp as a pure RPM repo manager.
If you also use Pulp to register the hosts, then this configuration is
of no use to you: since the hosts have to register, they have to
connect to the Pulp server, and only Pulp can 'push' changes to the
hosts, so the single point of failure comes back.
The workaround (no, it's not "ugly" :) ):
In my work environment we use Puppet to define the server configuration
and the running services, so we can rebuild a server automatically
without manual intervention. This includes the repo configuration and
the installed packages, so we don't need to register hosts in specific
host groups, as Puppet does everything (better).
Actually, during my host registration tests I didn't like the logic
behind it. We host several thousand hosts and we need to be able to
reinstall them when needed without manual intervention. Puppet copes
with that, so when I looked into how to register a host I was surprised
that a host cannot register itself into a specific host group. You have
to do that by hand on the Pulp server (more exactly: using pulp-admin).
So every time a machine registers itself there is some manual task to
do on Pulp, which is not scalable for us. In the end we skipped this
part, used Pulp just as a local RPM repo, and continued to use Puppet
for the rest.
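For illustration, the repo definition Puppet ends up managing on every
client is roughly the file below (repo id, hostname and relative URL
are invented here; the point is that baseurl targets the load-balanced
Apache servers, not the Pulp server itself):

  # /etc/yum.repos.d/internal-el7.repo (values are examples)
  [internal-el7]
  name=Internal EL7 packages (pulp published, served by the apache heads)
  baseurl=https://repo.example.com/pulp/repos/el7-x86_64/os/
  enabled=1
  gpgcheck=1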
On 22/06/15 15:11, Sean Waite wrote:
By children, I'm referring to child nodes - the subservers that can
sync from a "parent" node.
Looking again at the resources, below is what I have. It does look
like the 1.7g proc is actually a worker.
Some statistics on what I have here (resident memory):
2 celery__main__worker procs listed as "resource_manager" - 41m memory each
2 celery__main__worker procs listed as "reserved_resource_worker" - 42m and 1.7g respectively
1 mongo process - 972m
1 celerybeat - 24m
a pile of httpd procs - 14m each
1 qpid - 21m
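(For reference, figures like these can be pulled with something along
the lines of the command below; RSS is in kilobytes, and the full
command line is what shows the resource_manager /
reserved_resource_worker role names.)

  # resident set size (KB) and full command line, biggest consumers first
  ps -eo rss,args --sort=-rss | head -20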
For disk utilization, the mongo db is around 3.8G and my directory
containing all of the rpms etc is around 95G.
We're on a system with only 3.5G available memory, which is probably
part of the problem. We're looking at expanding it, I'm just trying to
figure out how much to expand it by. From your numbers above, we'd
need 6-7G of memory + 2*N gigs for the workers. Should I expect maybe
3-4 workers at any one time? I've got 2 now, but that is at an idle state.
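(As a side note, the number of reserved_resource_worker processes is
normally fixed by configuration rather than growing with load; on many
Pulp 2 installs it defaults to the CPU count. A quick check, assuming
the stock config file locations, is something like the following.)

  # configured worker count (file location varies: /etc/default/pulp_workers
  # on Debian-style installs, /etc/sysconfig/pulp_workers on RHEL-style ones)
  grep PULP_CONCURRENCY /etc/default/pulp_workers /etc/sysconfig/pulp_workers 2>/dev/null

  # worker processes currently running
  pgrep -fc reserved_resource_worker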
On Mon, Jun 22, 2015 at 9:24 AM, Brian Bouterse <[email protected]> wrote:
Hi Sean,
I'm not really sure what you mean by the term 'children'. Maybe you
mean process or consumer?
I expect pulp_resource_manager to use less than 1.7G of memory, but
it's possible. Memory analysis can be a little bit tricky, so more
details are needed about how this is being measured to be sure.
The biggest memory process within Pulp by far is mongodb. If you can,
ensure that at least 4G of RAM is available on the machine that you
are running mongodb on.
I looked into the docs and we don't talk much about the memory
requirements. Feel free to file a bug on that if you want. Roughly I
expect the following amounts of RAM to be available per process:
pulp_celerybeat, 256MB - 512MB
pulp_resource_manager, 256MB - 512MB
pulp_workers. This process spawns N workers. Each worker could use
256MB - 2GB depending on what it's doing.
httpd, 1GB
mongodb, 4GB
qpidd/rabbitMQ, ???
Note that all the pulp_* processes have a parent and a child process;
for the numbers above I consider each parent/child pair together. I
usually show the inheritance using `sudo ps -awfux`.
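(For example, to see the tree and then total one parent/child pair,
something like the following works; 1234 is a placeholder PID taken
from the tree output.)

  # show the process tree with per-process memory
  sudo ps -awfux

  # total resident memory (MB) for the process with PID 1234 plus its children
  ps -o rss= -p 1234 --ppid 1234 | awk '{ kb += $1 } END { printf "%.0f MB\n", kb/1024 }'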
I'm interested to see what others think about these numbers too.
- -Brian
On 06/22/2015 08:46 AM, Sean Waite wrote:
> Hi,
>
> I've got a pulp server running, and I'd like to add some children.
> The server itself is a bit hard up on resources, so we're going to
> rebuild with a larger one. How many resources would the children
> use? Is it a fairly beefy process/memory hog?
>
> We've got a large number of repositories. pulp-resource-manager
> seems to be using 1.7G of memory, and mongodb about 0.7G.
>
> Any pointers on how much I might be able to expect?
>
> Thanks
>
> --
> Sean Waite
> [email protected]
> Cloud Operations Engineer GPG 3071E870
> TraceLink, Inc.
>
> Be Excellent to Each Other
>
>
--
Sean Waite [email protected] <mailto:[email protected]>
Cloud Operations Engineer GPG 3071E870
TraceLink, Inc.
Be Excellent to Each Other
_______________________________________________
Pulp-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/pulp-list