Re: [Pulp-list] Error during Upgrade

2019-05-28 Thread Sebastian Sonne
I think we’ve "fixed" it. The error was presumably caused by Puppet, which 
started the pulp services during the migration process. We dropped the 
unit_deb* collections from mongodb and the update went through.
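For reference, the cleanup amounted to dropping the half-migrated collections so a re-run of pulp-manage-db could rebuild them. A minimal pymongo sketch of that idea (a sketch only, not the exact commands we ran; `pulp_database` is the default database name and may differ on your install; stop the pulp services and take a mongodump first):

```python
# Sketch: remove the unit_deb* collections left behind by the failed
# 0002 migration so pulp-manage-db can recreate them on the next run.
# The database name "pulp_database" is the default and is an assumption.

def unit_deb_collections(names):
    """Filter a list of collection names down to the unit_deb* ones."""
    return [n for n in names if n.startswith('unit_deb')]

def drop_unit_deb_collections(db):
    """Drop every unit_deb* collection from the given pymongo database."""
    for name in unit_deb_collections(db.list_collection_names()):
        db.drop_collection(name)

# Usage (on the Pulp host, services stopped, backup taken):
#   from pymongo import MongoClient
#   drop_unit_deb_collections(MongoClient()['pulp_database'])
```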


> On 28.05.2019 at 10:06, Sebastian Sonne wrote:
>
> Hello everyone,
>
> I’ve just upgraded to 2.19.0 from 2.18.0. During pulp-manage-db I’ve received 
> the following error:
>
> Migration package pulp.server.db.migrations is up to date at version 29
> Applying pulp_deb.plugins.migrations version 2
> ***
> * Migrating Deb Package content...
> * Migrated units: 30181 of 301817
> * Migrated units: 60362 of 301817
> * Migrated units: 90543 of 301817
> * Migrated units: 120724 of 301817
> ***
> Applying migration 
> pulp_deb.plugins.migrations.0002_make_rel_fields_consistent failed.
>
> Halting migrations due to a migration failure.
> '$unset' is empty. You must specify a field like so: {$unset: {<field>: ...}}
> Traceback (most recent call last):
>  File "/usr/lib/python2.7/site-packages/pulp/server/db/manage.py", line 240, 
> in main
>return _auto_manage_db(options)
>  File "/usr/lib/python2.7/site-packages/pulp/server/db/manage.py", line 307, 
> in _auto_manage_db
>migrate_database(options)
>  File "/usr/lib/python2.7/site-packages/pulp/server/db/manage.py", line 135, 
> in migrate_database
>update_current_version=not options.test)
>  File "/usr/lib/python2.7/site-packages/pulp/server/db/migrate/models.py", 
> line 189, in apply_migration
>migration.migrate()
>  File 
> "/usr/lib/python2.7/site-packages/pulp_deb/plugins/migrations/0002_make_rel_fields_consistent.py",
>  line 99, in migrate
>{'$unset': remove_fields},)
>  File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 835, 
> in update_one
>bypass_doc_val=bypass_document_validation)
>  File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 710, 
> in _update
>_check_write_command_response([(0, result)])
>  File "/usr/lib64/python2.7/site-packages/pymongo/helpers.py", line 301, in 
> _check_write_command_response
>    raise WriteError(error.get("errmsg"), error.get("code"), error)
> WriteError: '$unset' is empty. You must specify a field like so: {$unset: 
> {<field>: ...}}
>
> How can I fix this?
>
> Regards,
> Sebastian
>
> --
> Sebastian Sonne
> IT Systems Engineer (Linux)
> Systems & Applications
>
> noris network AG
> Thomas-Mann-Straße 16-20
> 90471 Nürnberg
> Deutschland
>
> Tel +49 911 9352 1184
> Fax +49 911 9352 100
> Mobil +49 151 41466075
> Email sebastian.so...@noris.de
>
> noris network AG - Mehr Leistung als Standard
> Vorstand: Ingo Kraupa (Vorsitzender), Joachim Astel, Jürgen Städing
> Vorsitzender des Aufsichtsrats: Stefan Schnabel - AG Nürnberg HRB 17689
> ___
> Pulp-list mailing list
> Pulp-list@redhat.com
> https://www.redhat.com/mailman/listinfo/pulp-list



[Pulp-list] Error during Upgrade

2019-05-28 Thread Sebastian Sonne
Hello everyone,

I’ve just upgraded to 2.19.0 from 2.18.0. During pulp-manage-db I’ve received 
the following error:

Migration package pulp.server.db.migrations is up to date at version 29
Applying pulp_deb.plugins.migrations version 2
***
* Migrating Deb Package content...
* Migrated units: 30181 of 301817
* Migrated units: 60362 of 301817
* Migrated units: 90543 of 301817
* Migrated units: 120724 of 301817
***
Applying migration pulp_deb.plugins.migrations.0002_make_rel_fields_consistent 
failed.

Halting migrations due to a migration failure.
'$unset' is empty. You must specify a field like so: {$unset: {<field>: ...}}
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/pulp/server/db/manage.py", line 240, 
in main
return _auto_manage_db(options)
  File "/usr/lib/python2.7/site-packages/pulp/server/db/manage.py", line 307, 
in _auto_manage_db
migrate_database(options)
  File "/usr/lib/python2.7/site-packages/pulp/server/db/manage.py", line 135, 
in migrate_database
update_current_version=not options.test)
  File "/usr/lib/python2.7/site-packages/pulp/server/db/migrate/models.py", 
line 189, in apply_migration
migration.migrate()
  File 
"/usr/lib/python2.7/site-packages/pulp_deb/plugins/migrations/0002_make_rel_fields_consistent.py",
 line 99, in migrate
{'$unset': remove_fields},)
  File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 835, in 
update_one
bypass_doc_val=bypass_document_validation)
  File "/usr/lib64/python2.7/site-packages/pymongo/collection.py", line 710, in 
_update
_check_write_command_response([(0, result)])
  File "/usr/lib64/python2.7/site-packages/pymongo/helpers.py", line 301, in 
_check_write_command_response
raise WriteError(error.get("errmsg"), error.get("code"), error)
WriteError: '$unset' is empty. You must specify a field like so: {$unset: 
{<field>: ...}}

How can I fix this?

Regards,
Sebastian


Re: [Pulp-list] Sanity check w/r/t OpenSuSE's public 3rd party repos

2019-03-26 Thread Sebastian Sonne
Hi Kodia,

you’re using the wrong URLs to synchronize: the x86_64 directory directly 
contains the packages. The baseurl needs to contain a folder called "repodata", 
which in the case of the katacontainers repo would be 
http://download.opensuse.org/repositories/home:/katacontainers:/releases:/x86_64:/stable-1.6/RHEL_7/
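A quick way to verify a candidate baseurl before handing it to Pulp is to check that `repodata/repomd.xml` resolves beneath it. A small stdlib-only sketch (the helper names are mine, not Pulp's):

```python
# Sketch: a yum-style repository must serve repodata/repomd.xml directly
# under its baseurl, so probe that path before configuring the feed.
from urllib.parse import urljoin
from urllib.request import urlopen

def repomd_url(baseurl):
    """Build the repodata/repomd.xml URL for a candidate baseurl."""
    if not baseurl.endswith('/'):
        baseurl += '/'
    return urljoin(baseurl, 'repodata/repomd.xml')

def has_repodata(baseurl, timeout=10):
    """Return True if the baseurl serves repomd.xml (HTTP 200)."""
    try:
        with urlopen(repomd_url(baseurl), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False
```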

Regards,
Sebastian


> On 25.03.2019 at 22:50, Kodiak Firesmith wrote:
>
> Hi All,
> So..  I have a problem and it could be our enterprise proxy, or it could be 
> Pulp.  I've upgraded a test server to the latest version of Pulp GA (2.18.1) 
> to try to resolve, but no matter what, hosted YUM repos on 
> download.opensuse.org will *not* sync down to my pulp servers.  The metadata 
> syncs fine, then it pulls in zero RPMs and zero deltarpms, then fails with 
> 'error retrieving metadata'.
>
> A couple repositories I've tested are:
> http://download.opensuse.org/repositories/home:/katacontainers:/releases:/x86_64:/stable-1.6/RHEL_7/x86_64/
>
> http://download.opensuse.org/repositories/home:/polyconvex:/NM/openSUSE_Leap_42.3/x86_64/
>
> Here's what it looks like to the admin running a sync:
> https://paste.fedoraproject.org/paste/smQW3pfpc2AwdWvJ7ntuqQ
>
> And here's what's logged into syslog:
> https://paste.fedoraproject.org/paste/TPGMjKMpyJGrOaG4v8ufZg
>
> I have two theories (assuming it isn't our bluecoat proxies) that come solely 
> from a non-developer standpoint:
>
> 1.  something is wonky with Suse's implementation of CDN using something 
> called MirrorBrain that I know nothing about.
>
> 2.  Pulp doesn't like colons in URLs - there are a ton of colons in every 
> Opensuse YUM repo.
>
> So, would anyone else perhaps not behind a proxy try to sync one of those 
> repos?  And then of course if it is having problems even not behind a proxy, 
> I suppose my next step will be to file a story on pulp.plan.io.
>
> Thanks!
>  - Kodiak
>



Re: [Pulp-list] RHEL 8 Beta appstream 404

2019-01-02 Thread Sebastian Sonne
It’s already giving you a 404, so that’s your issue: Red Hat hasn’t published 
anything there. It wouldn’t exactly be the first time. We’ve recently tried 
syncing some kubernetes repo (I forget the details; it was most likely some 
openshift repo). It turned out the repo was just plain empty: the directories 
were there, the URL is within our subscriptions, but it just didn’t sync 
anything. Going through it manually confirmed what I suspected: empty repos.



> On 02.01.2019 at 19:57, Kodiak Firesmith wrote:
>
> Not strictly a Pulp issue but I'm hoping there might also be some RHEL 8 beta 
> Pulp users out there.  I've never gotten the appstream CDN ISOs repo to 
> mirror via Pulp.  All other RHEL 8 beta repos work fine.
>
> Eg:
>
> https://cdn.redhat.com/content/beta/rhel8/8/x86_64/appstream/iso
> ...fails w/ 404
>
> https://cdn.redhat.com/content/beta/rhel8/8/x86_64/baseos/iso
> ...works just fine.
>
> Anyone else getting this path to work?  Perhaps I've extracted the wrong path?
> Thanks!




Re: [Pulp-list] Pulp CLI feedback

2018-05-23 Thread Sebastian Sonne
Hi there,

> - What commands or functionality in the CLI do you rely on the most?
Synchronizing and copying repos; most of this runs through cronjobs now.

> - Are there things you wish the CLI had or did?
While repository groups exist, from my perspective nothing useful can be done 
with them. Being able to trigger a background synchronization for a whole 
group, instead of having to give the repos a specific name to grep for and 
iterate through, would help a lot.
Another feature I sorely miss is being able to pass the "--bg" flag to the repo 
copy operation. For now I have to sit there and press Ctrl+C to get the tasks 
distributed to the workers properly.
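Until the CLI grows such a flag, one workaround is to call the copy through Pulp 2's REST API, which returns a call report immediately instead of blocking in the foreground. A sketch (host, repo ids and authentication below are placeholders, not a tested client):

```python
# Sketch: trigger a repo copy via Pulp 2's v2 REST API. The associate
# action is asynchronous on the server side, so nothing has to be
# Ctrl+C'd. Host and repo ids are placeholders; auth handling omitted.
import json
from urllib.request import Request

def associate_request(host, dest_repo_id, source_repo_id):
    """Build the POST that copies all units from source into dest."""
    url = 'https://{}/pulp/api/v2/repositories/{}/actions/associate/'.format(
        host, dest_repo_id)
    body = json.dumps({'source_repo_id': source_repo_id}).encode()
    return Request(url, data=body,
                   headers={'Content-Type': 'application/json'})

# e.g. urlopen(associate_request('pulp01.example.com', 'rhel7-stage', 'rhel7'))
# with credentials added returns a call report with task ids to poll.
```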

> - Why do you use the CLI over using the REST API directly?
Sometimes you just want to do something quickly instead of looking up the API 
documentation. Basically, it’s faster.

> - Do you strictly use the CLI or do you use other things like Katello or the 
> REST API?
Most things are triggered via Ansible.

> - Would you prefer a CLI or a basic web UI?
A CLI is always preferable, even if all it does is keep the less experienced 
administrators too intimidated to try things on their own on production servers.

Regards,
Sebastian



Re: [Pulp-list] Old python2-debpkgr package causing debian / ubuntu not to synchronize successfully

2018-02-13 Thread Sebastian Sonne
In my opinion it can be closed. The package has been updated and contains the 
needed fix.

On 12.02.2018 at 16:32, Robin Chan <rc...@redhat.com> wrote:

So what should we do with the issue tracker that Sebastian wrote? 
https://pulp.plan.io/issues/3311

On Fri, Feb 9, 2018 at 5:03 PM, Patrick Creech <pcre...@redhat.com> wrote:
On Fri, 2018-02-09 at 07:19 -0500, Patrick Creech wrote:
> On Fri, 2018-02-09 at 10:30 +, Sebastian Sonne wrote:
> > Hello everyone,
> >
> > ever since the stable release of 2.15, I’ve tried to synchronize debian,
> > without any success. The error message given is "'md5sums' file not found,
> > can't list MD5 sums". In https://pulp.plan.io/issues/3311, I have finally
> > found out why: debpkgr expects an md5sums file, when apparently it is
> > optional, causing everything else to fail. The fix for this is present in
> > python-debpkgr version 1.0.2, but the only version currently available
> > from pulp is 1.0.1.
> >
> > As I’m unsure where to turn with the request to update python2-debpkgr,
> > I’ll do it here.
>
> Thanks for reporting this, Sebastian!
>
> As pulp's release engineer, I'll get right on updating python-debpkgr in our
> repos. I'll send an update when the fix has been pushed.

The latest python2-debpkgr has been published to pulp's repos




[Pulp-list] Old python2-debpkgr package causing debian / ubuntu not to synchronize successfully

2018-02-09 Thread Sebastian Sonne
Hello everyone,

ever since the stable release of 2.15, I’ve tried to synchronize debian, 
without any success. The error message given is "'md5sums' file not found, 
can't list MD5 sums". In https://pulp.plan.io/issues/3311, I have finally found 
out why: debpkgr expects an md5sums file, when apparently it is optional, 
causing everything else to fail. The fix for this is present in python-debpkgr 
version 1.0.2, but the only version currently available from pulp is 1.0.1.

As I’m unsure where to turn with the request to update python2-debpkgr, I’ll 
do it here.

Regards,
Sebastian



Re: [Pulp-list] Resource manager behaving differently between clusters

2018-01-10 Thread Sebastian Sonne
I am unsure whether this should actually be filed as a bug, at least with the 
scope I described. I’ve tested this on the good cluster. The results so far:

When node 01 is both the rabbitmq master and the active resource manager, and 
the VM is paused, everything goes down.

When node 01 is the rabbitmq master but node 02 holds the active resource 
manager, and node 02 is paused, the other two resource managers become active, 
but all the workers are down. This does not auto-resolve when the node is 
unpaused; instead it creates a separate rabbitmq partition.

The only way I can imagine the initially described behavior arising is if the 
hypervisor rapidly pauses and unpauses. That way, the rabbitmq and mongodb 
clusters could both stay intact, while the workers and resource manager slowly 
miss their heartbeats.




On 10.01.2018 at 14:25, Dennis Kliban <dkli...@redhat.com> wrote:

It sounds like you may be experiencing issue https://pulp.plan.io/issues/3135

From our conversation on IRC, I learned that the hypervisor is acting up and 
the VMs pause from time to time. So even though the system is not under heavy 
load, it still behaves as though it is. As a result, the inactive resource 
managers think that the active resource manager has gone down and start being 
active themselves. What I am still not clear on is why more than one resource 
manager is able to be active at a time. If that is actually happening, then 
this is a new bug. You could avoid this problem by only running 2 resource 
managers, though it would be good to find a reliable way to reproduce this 
problem and file a bug.

On Wed, Jan 10, 2018 at 6:37 AM, Sebastian Sonne 
<sebastian.so...@noris.de> wrote:
Hello everyone.

I have two pulp clusters, each containing three nodes, all systems are up to 
date (pulp 2.14.3). However, the cluster behavior differs greatly. Let's call 
the working cluster the external one, and the broken one internal.

The setup: Everything is virtualized. Both clusters are distributed over two 
datacenters, but they're on different ESX-clusters. All nodes are allowed to 
migrate between hypervisors.

On the external cluster, "celery status" gives me one resource manager; on the 
internal cluster I get either two or three resource managers. As far as I 
understand, I can run the resource manager on all nodes but should only see 
one in celery, because the other two nodes go into standby.

Running "ps fauxwww |grep resource_manage[r]" on the external cluster gives me 
four processes in the whole cluster. The currently active resource manager has 
two processes, the other ones have one process each. However, on the internal 
cluster I get six processes, two on each node.

From my understanding, the external cluster works correctly, as the active 
resource manager has one process to communicate with celery, and one to do 
work, with the other two nodes only having one active process to communicate 
with celery and become active in case the currently active resource manager 
goes down.

Oddly enough, celery also seems to disconnect its own workers:

"Jan 10 08:52:36 pulp02 pulp[101629]: celery.worker.consumer:INFO: missed 
heartbeat from reserved_resource_worker-1@pulp02". As such, I think we can 
eliminate the network.

I'm completely stumped and don't even have a real clue of what logs I could 
provide, or where to start looking into things.

Grateful for any help,
Sebastian









Re: [Pulp-list] Resource manager behaving differently between clusters

2018-01-10 Thread Sebastian Sonne
Update: we seem to have found the issue. Infrastructure told me there is a 
problem that can pause the VMs for anywhere from nanoseconds to seconds, 
possibly hundreds of times with only split seconds between the pauses. Thus, 
if the active manager pauses, a standby takes over; the paused manager then 
comes back, and we have two managers.

At that point, the only bug that’s actually pulp-related is that the active 
managers don’t check for other active managers, I guess.

Regards,
Sebastian

> On 10.01.2018 at 12:37, Sebastian Sonne <sebastian.so...@noris.de> wrote:
>
> Hello everyone.
>
> I have two pulp clusters, each containing three nodes, all systems are up to 
> date (pulp 2.14.3). However, the cluster behavior differs greatly. Let's call 
> the working cluster the external one, and the broken one internal.
>
> The setup: Everything is virtualized. Both clusters are distributed over two 
> datacenters, but they're on different ESX-clusters. All nodes are allowed to 
> migrate between hypervisors.
>
> On the external cluster, "celery status" gives me one resource manager; on 
> the internal cluster I get either two or three resource managers. As far as I 
> understand, I can run the resource manager on all nodes but should only see 
> one in celery, because the other two nodes go into standby.
>
> Running "ps fauxwww |grep resource_manage[r]" on the external cluster gives 
> me four processes in the whole cluster. The currently active resource manager 
> has two processes, the other ones have one process each. However, on the 
> internal cluster I get six processes, two on each node.
>
> From my understanding, the external cluster works correctly, as the active 
> resource manager has one process to communicate with celery, and one to do 
> work, with the other two nodes only having one active process to communicate 
> with celery and become active in case the currently active resource manager 
> goes down.
>
> Oddly enough, celery also seems to disconnect its own workers:
>
> "Jan 10 08:52:36 pulp02 pulp[101629]: celery.worker.consumer:INFO: missed 
> heartbeat from reserved_resource_worker-1@pulp02". As such, I think we can 
> eliminate the network.
>
> I'm completely stumped and don't even have a real clue of what logs I could 
> provide, or where to start looking into things.
>
> Grateful for any help,
> Sebastian
>
>




[Pulp-list] Resource manager behaving differently between clusters

2018-01-10 Thread Sebastian Sonne
Hello everyone.

I have two pulp clusters, each containing three nodes, all systems are up to 
date (pulp 2.14.3). However, the cluster behavior differs greatly. Let's call 
the working cluster the external one, and the broken one internal.

The setup: Everything is virtualized. Both clusters are distributed over two 
datacenters, but they're on different ESX-clusters. All nodes are allowed to 
migrate between hypervisors.

On the external cluster, "celery status" gives me one resource manager; on the 
internal cluster I get either two or three resource managers. As far as I 
understand, I can run the resource manager on all nodes but should only see 
one in celery, because the other two nodes go into standby.

Running "ps fauxwww |grep resource_manage[r]" on the external cluster gives me 
four processes in the whole cluster. The currently active resource manager has 
two processes, the other ones have one process each. However, on the internal 
cluster I get six processes, two on each node.

From my understanding, the external cluster works correctly, as the active 
resource manager has one process to communicate with celery, and one to do 
work, with the other two nodes only having one active process to communicate 
with celery and become active in case the currently active resource manager 
goes down.

Oddly enough, celery also seems to disconnect its own workers:

"Jan 10 08:52:36 pulp02 pulp[101629]: celery.worker.consumer:INFO: missed 
heartbeat from reserved_resource_worker-1@pulp02". As such, I think we can 
eliminate the network.

I'm completely stumped and don't even have a real clue of what logs I could 
provide, or where to start looking into things.

Grateful for any help,
Sebastian

