You need to understand that you have dangling messages on your 🐇.

They are old messages that have invalid JSON, or whose ids were already
picked up by the hybrid fallback mechanism, so they fail (which, I agree,
is not cool).
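
To illustrate the two failure modes, here is a minimal sketch in Python. It
is not Koha's actual Perl code: the drain function, the job_id field and the
"new"/"finished" status names are assumptions made up for the example. The
idea is that a consumer which discards a message it cannot use, instead of
dying on it, does not leave that message dangling in the queue.

```python
import json


def drain(messages, jobs):
    """Consume every message; return (processed_ids, discarded_reasons).

    messages: iterable of raw message bodies (bytes), hypothetical format.
    jobs: dict mapping job id -> status, standing in for the jobs table.
    """
    processed, discarded = [], []
    for body in messages:
        try:
            # json.loads raises ValueError subclasses for both invalid
            # UTF-8 bytes and malformed JSON.
            payload = json.loads(body)
            job_id = payload["job_id"]  # assumed field name
            if not isinstance(job_id, int):
                raise ValueError("job_id is not an integer")
        except (ValueError, KeyError, TypeError):
            discarded.append("unreadable message")
            continue  # the message is consumed, not retried forever

        if jobs.get(job_id) != "new":
            # Unknown id, or the job was already handled by another path.
            discarded.append(f"job {job_id} is not pending")
            continue

        jobs[job_id] = "finished"  # stand-in for the real processing
        processed.append(job_id)
    return processed, discarded


# Example: one good message, one with a Latin-1 byte, one stale id, one junk.
messages = [
    b'{"job_id": 1}',
    b'{"note": "r\xe9serv\xe9", "job_id": 2}',
    b'{"job_id": 3}',
    b"not json",
]
jobs = {1: "new", 2: "new", 3: "finished"}
processed, discarded = drain(messages, jobs)
```

In this sketch job 2 stays "new" because its message was unreadable, so
something else (for example a database poll) would still have to pick it up.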

Stop Koha, stop the worker, stop RabbitMQ and clean it. Start the worker
and let it exhaust the tasks that were already injected. Then stop the
worker, start RabbitMQ, start the worker and start Koha.

Sounds like too much, doesn't it?



On Wed, 21 Dec 2022 at 10:59, Philippe Blouin <philippe.blo...@inlibro.com>
wrote:

> Good evening, David,
>
> Thanks for the response.  Yours and David's and Michael's.  I feel less
> alone...
>
> I validated, and yes, all the patches you refer to are in our pile.  And
> until the problems arose, there were no customizations around that code.
>
> So yeah, even at 22.05.06, I get the JSON error and the race condition (we
> use ES).  And the *abandoned* children.  So I surmise, or dare I say
> postulate, that those issues are not as resolved as some would presume.
>
> I will revert background_jobs_worker.pl to its default, and shut down MQ
> everywhere, for now.  :(
> Philippe Blouin,
> Directeur de la technologie
>
> Tél.  : (833) 465-4276, poste 230
> philippe.blo...@inlibro.com
> inLibro | pour esprit libre | www.inLibro.com
> On 2022-12-20 17:55, David Cook wrote:
>
> Salut Philippe,
>
>
>
> That first issue should’ve been resolved in 22.05.00 by
> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=30172. I
> haven’t had any problems like that since applying that patch. Are you
> running Koha with or without customizations?
>
>
>
> As you say, bug 30654 discusses that second issue. And I obviously have my
> own opinion on that one 😉.
>
>
>
> That JSON issue should be fixed by Bug 31351 in Koha 22.05.06 as well I
> believe: https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=31351
>
>
>
> --
>
>
>
> The only issue I’ve had with the background jobs has been the one covered
> by Bug 30172. Otherwise, it’s been all fine for me, although I use Zebra
> rather than Elasticsearch. I think part of the reason I haven’t had issues
> is that I haven’t had many people using the background jobs either though.
>
>
>
> I’m actually planning on writing a background job system based on RabbitMQ
> for a different non-Koha system. The main difference is that I’ll reject or
> fail tasks where messages aren’t sent to RabbitMQ. I think that’ll make my
> system a bit more robust than Koha’s.
>
>
>
> The problem with the background jobs at the moment is that we haven’t
> fully committed to RabbitMQ. We’re trying to do this weird hybrid with the
> database fallback which is not the right direction in my mind. We should do
> one or the other but not try to do both.
>
>
>
> But that’s just my 2 cents.
>
>
>
> David Cook
>
> Senior Software Engineer
>
> Prosentient Systems
>
> Suite 7.03
>
> 6a Glen St
>
> Milsons Point NSW 2061
>
> Australia
>
>
>
> Office: 02 9212 0899
>
> Online: 02 8005 0595
>
>
>
> *From:* Koha-devel <koha-devel-boun...@lists.koha-community.org>
> <koha-devel-boun...@lists.koha-community.org> *On Behalf Of *Philippe
> Blouin
> *Sent:* Wednesday, 21 December 2022 6:13 AM
> *To:* koha-devel@lists.koha-community.org
> *Subject:* [Koha-devel] The many failings of background_jobs_worker.pl
>
>
>
> Howdy!
>
> Since moving a lot of our users to 22.05.06, we've installed the worker
> everywhere.  But the number of issues encountered is staggering.
>
> The first one was
>
> Can't call method "process" on an undefined value
>
> where the id received from MQ was not found in the DB, and the process is
> going straight to process_job and failing.  Absolutely no idea how that
> occurs, seems completely counterintuitive (the ID comes from the DB after
> all), but here it is.  Hacked the code to add a "sleep 1" to fix most of
> that one.
>
> Then came the fact that stored events were not checked if the connection
> to MQ was successful at startup.  Bug 30654 refers to it.  Hacked a little
> "$init" in there to clear that up at startup.
>
> Then came the
>
> malformed UTF-8 character in JSON string, at character offset 296 (before
> "\x{e9}serv\x{e9} au ...")
>
> at decode_json that crashes the whole process.  And for some reason, it
> never gets over it, gets the same problem at every restart, like the event
> is never "eaten" from the queue.  Hacked an eval then a try-catch over it...
>
> After coding a monitor to alert when a background job has been "new" for
> over 5 minutes in the DB, I was inundated by messages.  There's always one
> elasticsearch_update that escapes among the flurry, and they slowly add up.
>
> At this point, the only viable solution is to run the workers but disable
> RabbitMQ everywhere.  Are we really the only ones experiencing that?
>
> Regards,
>
> PS Our servers are well-above-average Debian 11 machines with a lot of
> firepower (ram, cpu, i/o...).
>
> --
>
> Philippe Blouin,
> Directeur de la technologie
>
> Tél.  : (833) 465-4276, poste 230
> philippe.blo...@inlibro.com
>
> inLibro | pour esprit libre | www.inLibro.com
>
> _______________________________________________
> Koha-devel mailing list
> Koha-devel@lists.koha-community.org
> https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
> website : https://www.koha-community.org/
> git : https://git.koha-community.org/
> bugs : https://bugs.koha-community.org/
>
