Today's Topics:
1. Re: scheduled jobs stop (Oliver Gorwits)
--- Begin Message ---
Hi Ricardo
Please can you also run:
perl -MSereal\ 999 -e 1
perl -MMCE::Queue\ 999 -e 1
Then run
cpanm Sereal MCE
and then let us know if the problem is still there?
thanks,
oliver.
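
A note on the two one-liners above: asking for version 999 is only a trick. No such version exists, so Perl aborts with a message of the form "Sereal version 999 required--this is only version X", which reveals the installed version. If you would rather have the versions printed directly, something like the sketch below should work. It assumes it is run as the netdisco user with the same Perl environment the backend uses (the logs point at /home/netdisco/perl5). The name module_version is made up for this example and is not part of Netdisco.

```shell
#!/bin/sh
# module_version is a hypothetical helper (not part of Netdisco).
# It prints the installed version of a Perl module, or "not installed"
# if the module cannot be loaded (or perl itself is not on the PATH).
module_version() {
    perl -M"$1" -e 'print $ARGV[0]->VERSION, "\n"' "$1" 2>/dev/null \
        || echo "not installed"
}

# The modules named in the errors and in the cpanm command above.
for m in Sereal MCE MCE::Queue; do
    printf '%s: %s\n' "$m" "$(module_version "$m")"
done
```

Running it once before and once after the cpanm step shows whether the upgrade actually changed anything.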
On Wed, 29 Jan 2020 at 14:15, Ricardo Stella <[email protected]> wrote:
>
> Well, it's definitely a bug with the latest versions. I upgraded the
> original instance I had which was running fine under 2.040006 since March
> of last year. This one also is exhibiting the same issues with jobs queued
> since 5:30pm yesterday.
>
> Error logs on that instance since last restart yesterday afternoon are:
>
> [7901] 2020-01-28 16:03:03 warn App::Netdisco 2.044011 backend
> Argument "" isn't numeric in read at
> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <$__ANONIO__> line 1.
> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at offset 1
> of input at srl_decoder.c line 580 at
> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <$__ANONIO__> line 1.
> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
> 1484, <$__ANONIO__> line 1753.
> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
> 1484, <$__ANONIO__> line 15984.
> Argument "" isn't numeric in read at
> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <$__ANONIO__> line 1.
> Can't call method "status" without a package or object reference at
> /home/netdisco/perl5/lib/perl5/App/Netdisco/Backend/Role/Poller.pm line 38,
> <$__ANONIO__> line 1.
> Argument "" isn't numeric in read at
> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <$__ANONIO__> line 1.
> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at offset 1
> of input at srl_decoder.c line 580 at
> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <$__ANONIO__> line 1.
>
>
>
> On Tue, Jan 28, 2020 at 11:18 AM Ricardo Stella <[email protected]> wrote:
>
>>
>> And just noticed that there's a newer version out there. Updated the new
>> instance (including wiping the perl5 directory) and right after I started
>> it, I got an error message. The old one was also updated but it's not
>> giving me any errors so far.
>>
>> [8849] 2020-01-28 16:13:41 warn App::Netdisco 2.044011 backend
>> Argument "" isn't numeric in read at
>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <__ANONIO__> line 1.
>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at offset
>> 1 of input at srl_decoder.c line 580 at
>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <__ANONIO__> line 1.
>> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
>> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
>> 1484, <__ANONIO__> line 32.
>>
>>
>>
>>
>> On Tue, Jan 28, 2020 at 9:56 AM Ricardo Stella <[email protected]> wrote:
>>
>>>
>>> Same here...
>>>
>>> backend status thinks it's running but jobs are queued since last night
>>> and not running. Here are the errors since last restart yesterday:
>>>
>>> [24657] 2020-01-27 16:00:58 warn App::Netdisco 2.044009 backend
>>> Argument "" isn't numeric in read at
>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <__ANONIO__> line 1.
>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at offset
>>> 1 of input at srl_decoder.c line 580 at
>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <__ANONIO__> line 1.
>>> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
>>> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
>>> 1484, <__ANONIO__> line 27785.
>>> Argument "" isn't numeric in read at
>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <__ANONIO__> line 1.
>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at offset
>>> 1 of input at srl_decoder.c line 580 at
>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <__ANONIO__> line 1.
>>>
>>> We also did a DB dump and restored it on a new instance, so I'm not sure
>>> whether this is related.
>>>
>>> I restarted our older instance. I will update Netdisco (running
>>> 2.040006) and see if there are any issues on this instance. Worst case,
>>> I'll redo the dump again.
>>>
>>>
>>> On Tue, Jan 28, 2020 at 4:33 AM <[email protected]> wrote:
>>>
>>>> Hi Ricardo
>>>> Sorry I forgot to sign my email
>>>> By the way I'm Marco
>>>>
>>>> It stopped again yesterday after a few hours.
>>>> ...
>>>> [5754] 2020-01-27 17:06:59 debug -> run worker
>>>> main/wirelessnodes/100
>>>> [5754] 2020-01-27 17:06:59 info pol (3): wrapping up macsuck
>>>> job(22425208) - status done at Mon Jan 27 18:06:59 2020
>>>> [5750] 2020-01-27 17:06:59 debug [172.17.119.6] macsuck - port
>>>> 1:43 vlan unknown : 1 nodes
>>>> Argument "PID_5754" isn't numeric in abs at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Core/Manager.pm line 206, <__ANONIO__>
>>>> line 32948.
>>>> Can't call method "_mce_m_pending" on an undefined value at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 679, <__ANONIO__> line
>>>> 32949.
>>>>
>>>> I enabled debug logging. It seems that some scheduled jobs (macsuck,
>>>> discoverall, etc.) trigger the error 'Argument "PID_####" isn't numeric',
>>>> which leaves the netdisco-backend child as a zombie:
>>>> ps aux | grep netd
>>>> netdisco 3428 0.0 0.3 22840 15848 ? S gen27 2:05
>>>> netdisco-backend
>>>> netdisco 3429 0.0 0.0 0 0 ? Z gen27 0:15
>>>> [nd2: master] <defunct>
>>>>
>>>> I can't say if it is caused by my new setup/configuration or something
>>>> else
>>>>
>>>> Marco
>>>>
>>>> > On 27 January 2020 at 17:03, Ricardo Stella <[email protected]>
>>>> wrote:
>>>> >
>>>> >
>>>> > Also happening here. I had also exported the DB in order to install
>>>> on a new VM with a new OS. I had a couple of problems that I posted
>>>> about, and saw this same error in the logs.
>>>> > Noticed all jobs queued for a couple of days and nothing running.
>>>> > Last message on logs was:
>>>> > Argument "" isn't numeric in read at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <__ANONIO__> line 1.
>>>> > Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>> offset 1 of input at srl_decoder.c line 580 at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <__ANONIO__> line 1.
>>>> >
>>>> > Restarting it seems to get the jobs running again.
>>>> >
>>>> >
>>>> > On Mon, Jan 27, 2020 at 10:54 AM marco via netdisco-users <
>>>> [email protected]> wrote:
>>>> > > Hi there
>>>> > >
>>>> > > I set up a new ND2 host on Debian buster some weeks ago
>>>> > > for experimental purposes.
>>>> > > I have another ND2 host that has been up and running for years.
>>>> > >
>>>> > > Software Version
>>>> > > App::Netdisco 2.44.4
>>>> > > SNMP::Info 3.70
>>>> > > DB Schema 61
>>>> > > PostgreSQL 12.00.1
>>>> > > Perl 5.28.1
>>>> > >
>>>> > > I restored the DB from the other ND2 host
>>>> > > and copied deployment.yml.
>>>> > > It worked.
>>>> > >
>>>> > > But I noticed that it stops running the scheduled jobs after some
>>>> time (days),
>>>> > > and I have to restart netdisco-backend.
>>>> > >
>>>> > >
>>>> > > Here is some info I collected:
>>>> > >
>>>> > > from netdisco-backend.log
>>>> > > ...
>>>> > > [392] 2020-01-24 15:15:18 debug mgr (2): getting potential jobs
>>>> for 1 workers
>>>> > > [2700] 2020-01-24 15:15:18 debug [172.17.185.50] arpnip -
>>>> processed 373 ARP Cache entries
>>>> > > [2700] 2020-01-24 15:15:18 debug [172.17.185.50] arpnip -
>>>> processed 0 IPv6 Neighbor Cache entries
>>>> > > [2700] 2020-01-24 15:15:18 info pol (3): wrapping up arpnip
>>>> job(22423168) - status done at Fri Jan 24 16:15:18 2020
>>>> > > [392] 2020-01-24 15:15:18 debug getsome: cancelled 0E0
>>>> duplicate(s) of job 22423235
>>>> > > [392] 2020-01-24 15:15:18 info mgr (2): job 22423235 booked
>>>> out for this processing node
>>>> > > Argument "PID_2700" isn't numeric in read at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 477, <__ANONIO__> line 31470.
>>>> > > Sereal: Error: Bad Sereal header: Not a valid Sereal document.
>>>> at offset 1 of input at srl_decoder.c line 580 at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 480, <__ANONIO__> line 31470.
>>>> > >
>>>> > > root@deb-netdisco:~# systemctl status netdisco-backend.service
>>>> > > ● netdisco-backend.service - Netdisco Backend Service
>>>> > > Loaded: loaded (/etc/systemd/system/netdisco-backend.service;
>>>> enabled; vendor preset: enabled)
>>>> > > Active: active (running) since Fri 2020-01-24 09:53:03 CET; 3
>>>> days ago
>>>> > > Process: 110 ExecStart=/home/netdisco/bin/netdisco-backend
>>>> start (code=exited, status=0/SUCCESS)
>>>> > > Main PID: 216 (netdisco-backen)
>>>> > > Tasks: 2 (limit: 4915)
>>>> > > Memory: 143.0M
>>>> > > CGroup: /system.slice/netdisco-backend.service
>>>> > > └─216 netdisco-backend
>>>> > >
>>>> > > gen 24 09:53:02 deb-netdisco systemd[1]: Starting Netdisco
>>>> Backend Service...
>>>> > > gen 24 09:53:03 deb-netdisco netdisco-backend[110]: Netdisco
>>>> Backend [Started]
>>>> > > gen 24 09:53:03 deb-netdisco netdisco-backend[110]: config
>>>> watcher: watching /home/netdisco/environments for updates.
>>>> > > gen 24 09:53:03 deb-netdisco systemd[1]: Started Netdisco
>>>> Backend Service.
>>>> > > gen 24 10:01:48 deb-netdisco netdisco-backend[110]: --
>>>> /home/netdisco/environments/deployment.yml updated.
>>>> > > gen 24 10:01:48 deb-netdisco netdisco-backend[110]: config
>>>> watcher: sending TERM to the server (pid:217)...
>>>> > >
>>>> > > root@deb-netdisco:~# ps aux | grep netd
>>>> > > netdisco 216 0.0 0.3 22840 16008 ? S gen24
>>>> 6:19 netdisco-backend
>>>> > > netdisco 281 0.0 0.3 20744 13680 ? S gen24
>>>> 0:00 perl /home/netdisco/bin/netdisco-web start
>>>> > > netdisco 282 0.0 0.3 22152 16696 ? S gen24
>>>> 0:47 starman master --disable-keepalive --user 1001 --group 1001
>>>> /home/netdisco/perl5/bin/netdisco-web-fg
>>>> > > netdisco 372 0.0 0.0 0 0 ? Z gen24
>>>> 0:16 [nd2: master] <defunct>
>>>> > > netdisco 373 0.0 2.7 135148 117200 ? S gen24
>>>> 0:06 starman worker --disable-keepalive --user 1001 --group 1001
>>>> /home/netdisco/perl5/bin/netdisco-web-fg
>>>> > > netdisco 374 0.0 2.8 136000 118000 ? S gen24
>>>> 0:06 starman worker --disable-keepalive --user 1001 --group 1001
>>>> /home/netdisco/perl5/bin/netdisco-web-fg
>>>> > > netdisco 375 0.0 2.7 133744 115940 ? S gen24
>>>> 0:06 starman worker --disable-keepalive --user 1001 --group 1001
>>>> /home/netdisco/perl5/bin/netdisco-web-fg
>>>> > > netdisco 376 0.0 2.8 137420 119504 ? S gen24
>>>> 0:06 starman worker --disable-keepalive --user 1001 --group 1001
>>>> /home/netdisco/perl5/bin/netdisco-web-fg
>>>> > > netdisco 377 0.0 2.7 133792 115996 ? S gen24
>>>> 0:05 starman worker --disable-keepalive --user 1001 --group 1001
>>>> /home/netdisco/perl5/bin/netdisco-web-fg
>>>> > > root 3405 0.0 0.0 6096 824 pts/0 S+ 10:59
>>>> 0:00 grep netd
>>>> > >
>>>> > >
>>>> > > after stop and start
>>>> > > root@deb-netdisco:~# systemctl start netdisco-backend.service
>>>> > >
>>>> > > it seems to work again
>>>> > > [392] 2020-01-24 15:15:18 info mgr (2): job 22423235 booked
>>>> out for this processing node
>>>> > > Argument "PID_2700" isn't numeric in read at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 477, <__ANONIO__> line 31470.
>>>> > > Sereal: Error: Bad Sereal header: Not a valid Sereal document.
>>>> at offset 1 of input at srl_decoder.c line 580 at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 480, <__ANONIO__> line 31470.
>>>> > > [3429] 2020-01-27 10:10:08 warn App::Netdisco 2.044004 backend
>>>> > > [3429] 2020-01-27 10:10:08 info resolving backend hostname...
>>>> > > [3433] 2020-01-27 10:10:08 info applying role Scheduler to
>>>> worker 1
>>>> > > [3436] 2020-01-27 10:10:08 info applying role Poller to worker
>>>> 4
>>>> > > ...
>>>> > >
>>>> > >
>>>> > > _______________________________________________
>>>> > > Netdisco mailing list
>>>> > > [email protected]
>>>> > > https://sourceforge.net/p/netdisco/mailman/netdisco-users/
>>>> >
>>>> > --
>>>> > °((( = (( ===°°° ((( ================================================
>>>>
>>>
>>>
>>> --
>>> °(((=((===°°°(((================================================
>>>
>>
>>
>> --
>> °(((=((===°°°(((================================================
>>
>
>
> --
> °(((=((===°°°(((================================================
--- End Message ---
_______________________________________________
Netdisco mailing list - Digest Mode
[email protected]
https://lists.sourceforge.net/lists/listinfo/netdisco-users