Today's Topics:
1. Re: scheduled jobs stop (Oliver Gorwits)
--- Begin Message ---
I was looking to see whether the issue is caused by an upstream library change
rather than by a change in Netdisco itself,
mainly because I'm scratching my head trying to work out what would cause
this, and I can't yet reproduce it.
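
In case it helps anyone else narrow this down, here is a minimal shell sketch for spotting the stuck state reported in the quoted messages below. It counts the Sereal/MCE errors in backend log text and lists zombie PIDs from `ps aux` output. The function names are hypothetical (they are not part of Netdisco), and the log path in the usage comment is an assumption about a default install.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not Netdisco commands.

# Count lines on stdin that match the errors reported in this thread
# ("Bad Sereal header", or "isn't numeric in read/int/abs at .../MCE/...").
# grep -c exits non-zero when the count is 0, so "|| true" keeps the status clean.
count_queue_errors() {
  grep -c -E "Bad Sereal header|isn't numeric in (read|int|abs) at .*MCE/" || true
}

# Print the PID (column 2) of every process whose STAT field (column 8 of
# `ps aux` output) starts with Z, i.e. zombie/defunct processes.
list_zombie_pids() {
  awk '$8 ~ /^Z/ { print $2 }'
}

# Usage (the log path is an assumption; adjust to your install):
#   count_queue_errors < ~/logs/netdisco-backend.log
#   ps aux | grep netd | list_zombie_pids
```

A non-zero error count together with a zombie "nd2: master" process would match the symptoms described in this thread, but it does not by itself identify the cause.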
On Wed, 29 Jan 2020 at 16:00, Ricardo Stella <[email protected]> wrote:
>
> Almost there...
>
> [netdisco@netdisco ~]$ ~/bin/localenv perl -MSereal\ 999 -e 1
> Sereal version 999 required--this is only version 4.007.
> BEGIN failed--compilation aborted.
> [netdisco@netdisco ~]$ ~/bin/localenv perl -MMCE::Queue\ 999 -e 1
> MCE::Queue version 999 required--this is only version 1.865.
> BEGIN failed--compilation aborted.
> [netdisco@netdisco ~]$ ~/bin/localenv cpanm Sereal MCE
> Sereal is up to date. (4.007)
> MCE is up to date. (1.865)
>
> I assume we are trying to delete them and force download?
>
>
>
> On Wed, Jan 29, 2020 at 10:52 AM Oliver Gorwits <[email protected]> wrote:
>
>> Sorry, my apologies, yes you would need to add "~/bin/localenv" to the
>> start of all those commands, I believe
>>
>>
>>
>> On Wed, 29 Jan 2020 at 15:17, Ricardo Stella <[email protected]> wrote:
>>
>>>
>>> Running as the netdisco user, I'm getting:
>>>
>>> Can't locate Sereal.pm in @INC (@INC contains: /usr/local/lib64/perl5
>>> /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl
>>> /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .).
>>> BEGIN failed--compilation aborted.
>>>
>>> Does it need --local-lib ~/perl5 or ~/bin/localenv first? And --notest?
>>>
>>>
>>>
>>>
>>> On Wed, Jan 29, 2020 at 9:47 AM Oliver Gorwits <[email protected]> wrote:
>>>
>>>> Hi Ricardo
>>>>
>>>> Please can you also run:
>>>> perl -MSereal\ 999 -e 1
>>>> perl -MMCE::Queue\ 999 -e 1
>>>>
>>>> Then run
>>>> cpanm Sereal MCE
>>>>
>>>> and then let us know if the problem is still there?
>>>>
>>>> thanks,
>>>> oliver.
>>>>
>>>> On Wed, 29 Jan 2020 at 14:15, Ricardo Stella <[email protected]> wrote:
>>>>
>>>>>
>>>>> Well, it's definitely a bug with the latest versions. I upgraded the
>>>>> original instance I had which was running fine under 2.040006 since March
>>>>> of last year. This one also is exhibiting the same issues with jobs queued
>>>>> since 5:30pm yesterday.
>>>>>
>>>>> Error logs on that instance since last restart yesterday afternoon are:
>>>>>
>>>>> [7901] 2020-01-28 16:03:03 warn App::Netdisco 2.044011 backend
>>>>> Argument "" isn't numeric in read at
>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <$__ANONIO__> line
>>>>> 1.
>>>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>>> offset 1 of input at srl_decoder.c line 580 at
>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <$__ANONIO__> line
>>>>> 1.
>>>>> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
>>>>> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
>>>>> 1484, <$__ANONIO__> line 1753.
>>>>> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
>>>>> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
>>>>> 1484, <$__ANONIO__> line 15984.
>>>>> Argument "" isn't numeric in read at
>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <$__ANONIO__> line
>>>>> 1.
>>>>> Can't call method "status" without a package or object reference at
>>>>> /home/netdisco/perl5/lib/perl5/App/Netdisco/Backend/Role/Poller.pm line
>>>>> 38,
>>>>> <$__ANONIO__> line 1.
>>>>> Argument "" isn't numeric in read at
>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <$__ANONIO__> line
>>>>> 1.
>>>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>>> offset 1 of input at srl_decoder.c line 580 at
>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <$__ANONIO__> line
>>>>> 1.
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jan 28, 2020 at 11:18 AM Ricardo Stella <[email protected]>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> And just noticed that there's a newer version out there. Updated the
>>>>>> new instance (including wiping the perl5 directory) and right after I
>>>>>> started it, I got an error message. The old one was also updated but it's
>>>>>> not giving me any errors so far.
>>>>>>
>>>>>> [8849] 2020-01-28 16:13:41 warn App::Netdisco 2.044011 backend
>>>>>> Argument "" isn't numeric in read at
>>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <__ANONIO__> line
>>>>>> 1.
>>>>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>>>> offset 1 of input at srl_decoder.c line 580 at
>>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <__ANONIO__> line
>>>>>> 1.
>>>>>> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
>>>>>> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
>>>>>> 1484, <__ANONIO__> line 32.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 28, 2020 at 9:56 AM Ricardo Stella <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> Same here...
>>>>>>>
>>>>>>> backend status thinks it's running but jobs are queued since last
>>>>>>> night and not running. Here are the errors since last restart yesterday:
>>>>>>>
>>>>>>> [24657] 2020-01-27 16:00:58 warn App::Netdisco 2.044009 backend
>>>>>>> Argument "" isn't numeric in read at
>>>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <__ANONIO__>
>>>>>>> line 1.
>>>>>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>>>>> offset 1 of input at srl_decoder.c line 580 at
>>>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <__ANONIO__>
>>>>>>> line 1.
>>>>>>> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
>>>>>>> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
>>>>>>> 1484, <__ANONIO__> line 27785.
>>>>>>> Argument "" isn't numeric in read at
>>>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <__ANONIO__>
>>>>>>> line 1.
>>>>>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>>>>> offset 1 of input at srl_decoder.c line 580 at
>>>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <__ANONIO__>
>>>>>>> line 1.
>>>>>>>
>>>>>>> We also did a DB dump and restored it on a new instance, so not sure
>>>>>>> if this is related.
>>>>>>>
>>>>>>> I restarted our older instance. I will update Netdisco (running
>>>>>>> 2.040006) and see if there are any issues on this instance. Worst case,
>>>>>>> I'll redo the dump again.
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jan 28, 2020 at 4:33 AM <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi Ricardo
>>>>>>>> Sorry I forgot to sign my email
>>>>>>>> By the way I'm Marco
>>>>>>>>
>>>>>>>> It stopped again yesterday after a few hours.
>>>>>>>> ...
>>>>>>>> [5754] 2020-01-27 17:06:59 debug -> run worker main/wirelessnodes/100
>>>>>>>> [5754] 2020-01-27 17:06:59 info pol (3): wrapping up macsuck job(22425208) - status done at Mon Jan 27 18:06:59 2020
>>>>>>>> [5750] 2020-01-27 17:06:59 debug [172.17.119.6] macsuck - port 1:43 vlan unknown : 1 nodes
>>>>>>>> Argument "PID_5754" isn't numeric in abs at /home/netdisco/perl5/lib/perl5/MCE/Core/Manager.pm line 206, <__ANONIO__> line 32948.
>>>>>>>> Can't call method "_mce_m_pending" on an undefined value at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 679, <__ANONIO__> line 32949.
>>>>>>>>
>>>>>>>> I enabled debug logging. It seems that some scheduled jobs (macsuck,
>>>>>>>> discoverall, etc.) cause the error "Argument "PID_####" isn't numeric",
>>>>>>>> after which the netdisco-backend child becomes a zombie:
>>>>>>>> ps aux | grep netd
>>>>>>>> netdisco 3428 0.0 0.3 22840 15848 ? S gen27 2:05 netdisco-backend
>>>>>>>> netdisco 3429 0.0 0.0 0 0 ? Z gen27 0:15 [nd2: master] <defunct>
>>>>>>>>
>>>>>>>> I can't say if it is caused by my new setup/configuration or
>>>>>>>> something else
>>>>>>>>
>>>>>>>> Marco
>>>>>>>>
>>>>>>>> > On 27 January 2020 at 17:03, Ricardo Stella <[email protected]> wrote:
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > Also happening here. I had also exported the DB in order to
>>>>>>>> > install on a new VM with a new OS. I had a couple of problems that I
>>>>>>>> > posted about, and saw this same error in the logs.
>>>>>>>> > Noticed all jobs queued for a couple of days and nothing running.
>>>>>>>> > Last message on logs was:
>>>>>>>> > Argument "" isn't numeric in read at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, line 1.
>>>>>>>> > Sereal: Error: Bad Sereal header: Not a valid Sereal document. at offset 1 of input at srl_decoder.c line 580 at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, line 1.
>>>>>>>> >
>>>>>>>> > Restarting it seems to get the jobs running again.
>>>>>>>> >
>>>>>>>> >
>>>>>>>> > On Mon, Jan 27, 2020 at 10:54 AM marco via netdisco-users <[email protected]> wrote:
>>>>>>>> > > Hi there
>>>>>>>> > >
>>>>>>>> > > I set up a new ND2 host on Debian buster some weeks ago
>>>>>>>> > > for experimental purposes.
>>>>>>>> > > I have another ND2 host that has been up and running for years.
>>>>>>>> > >
>>>>>>>> > > Software Version
>>>>>>>> > > App::Netdisco 2.44.4
>>>>>>>> > > SNMP::Info 3.70
>>>>>>>> > > DB Schema 61
>>>>>>>> > > PostgreSQL 12.00.1
>>>>>>>> > > Perl 5.28.1
>>>>>>>> > >
>>>>>>>> > > I restored the DB from another ND2 host
>>>>>>>> > > and copied deployment.yml.
>>>>>>>> > > It worked.
>>>>>>>> > >
>>>>>>>> > > But I noticed that it stops running the scheduled jobs after some time (days),
>>>>>>>> > > and I have to restart netdisco-backend.
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> > > Here is some info I collected:
>>>>>>>> > >
>>>>>>>> > > from netdisco-backend.log
>>>>>>>> > > ...
>>>>>>>> > > [392] 2020-01-24 15:15:18 debug mgr (2): getting potential jobs for 1 workers
>>>>>>>> > > [2700] 2020-01-24 15:15:18 debug [172.17.185.50] arpnip - processed 373 ARP Cache entries
>>>>>>>> > > [2700] 2020-01-24 15:15:18 debug [172.17.185.50] arpnip - processed 0 IPv6 Neighbor Cache entries
>>>>>>>> > > [2700] 2020-01-24 15:15:18 info pol (3): wrapping up arpnip job(22423168) - status done at Fri Jan 24 16:15:18 2020
>>>>>>>> > > [392] 2020-01-24 15:15:18 debug getsome: cancelled 0E0 duplicate(s) of job 22423235
>>>>>>>> > > [392] 2020-01-24 15:15:18 info mgr (2): job 22423235 booked out for this processing node
>>>>>>>> > > Argument "PID_2700" isn't numeric in read at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 477, line 31470.
>>>>>>>> > > Sereal: Error: Bad Sereal header: Not a valid Sereal document. at offset 1 of input at srl_decoder.c line 580 at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 480, line 31470.
>>>>>>>> > >
>>>>>>>> > > root@deb-netdisco:~# systemctl status netdisco-backend.service
>>>>>>>> > > ● netdisco-backend.service - Netdisco Backend Service
>>>>>>>> > > Loaded: loaded (/etc/systemd/system/netdisco-backend.service; enabled; vendor preset: enabled)
>>>>>>>> > > Active: active (running) since Fri 2020-01-24 09:53:03 CET; 3 days ago
>>>>>>>> > > Process: 110 ExecStart=/home/netdisco/bin/netdisco-backend start (code=exited, status=0/SUCCESS)
>>>>>>>> > > Main PID: 216 (netdisco-backen)
>>>>>>>> > > Tasks: 2 (limit: 4915)
>>>>>>>> > > Memory: 143.0M
>>>>>>>> > > CGroup: /system.slice/netdisco-backend.service
>>>>>>>> > > └─216 netdisco-backend
>>>>>>>> > >
>>>>>>>> > > gen 24 09:53:02 deb-netdisco systemd[1]: Starting Netdisco Backend Service...
>>>>>>>> > > gen 24 09:53:03 deb-netdisco netdisco-backend[110]: Netdisco Backend [Started]
>>>>>>>> > > gen 24 09:53:03 deb-netdisco netdisco-backend[110]: config watcher: watching /home/netdisco/environments for updates.
>>>>>>>> > > gen 24 09:53:03 deb-netdisco systemd[1]: Started Netdisco Backend Service.
>>>>>>>> > > gen 24 10:01:48 deb-netdisco netdisco-backend[110]: -- /home/netdisco/environments/deployment.yml updated.
>>>>>>>> > > gen 24 10:01:48 deb-netdisco netdisco-backend[110]: config watcher: sending TERM to the server (pid:217)...
>>>>>>>> > >
>>>>>>>> > > root@deb-netdisco:~# ps aux | grep netd
>>>>>>>> > > netdisco 216 0.0 0.3 22840 16008 ? S gen24 6:19 netdisco-backend
>>>>>>>> > > netdisco 281 0.0 0.3 20744 13680 ? S gen24 0:00 perl /home/netdisco/bin/netdisco-web start
>>>>>>>> > > netdisco 282 0.0 0.3 22152 16696 ? S gen24 0:47 starman master --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>>> > > netdisco 372 0.0 0.0 0 0 ? Z gen24 0:16 [nd2: master]
>>>>>>>> > > netdisco 373 0.0 2.7 135148 117200 ? S gen24 0:06 starman worker --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>>> > > netdisco 374 0.0 2.8 136000 118000 ? S gen24 0:06 starman worker --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>>> > > netdisco 375 0.0 2.7 133744 115940 ? S gen24 0:06 starman worker --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>>> > > netdisco 376 0.0 2.8 137420 119504 ? S gen24 0:06 starman worker --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>>> > > netdisco 377 0.0 2.7 133792 115996 ? S gen24 0:05 starman worker --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>>> > > root 3405 0.0 0.0 6096 824 pts/0 S+ 10:59 0:00 grep netd
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> > > after stop and start
>>>>>>>> > > root@deb-netdisco:~# systemctl start netdisco-backend.service
>>>>>>>> > >
>>>>>>>> > > it seems to work again
>>>>>>>> > > [392] 2020-01-24 15:15:18 info mgr (2): job 22423235 booked out for this processing node
>>>>>>>> > > Argument "PID_2700" isn't numeric in read at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 477, line 31470.
>>>>>>>> > > Sereal: Error: Bad Sereal header: Not a valid Sereal document. at offset 1 of input at srl_decoder.c line 580 at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 480, line 31470.
>>>>>>>> > > [3429] 2020-01-27 10:10:08 warn App::Netdisco 2.044004 backend
>>>>>>>> > > [3429] 2020-01-27 10:10:08 info resolving backend hostname...
>>>>>>>> > > [3433] 2020-01-27 10:10:08 info applying role Scheduler to worker 1
>>>>>>>> > > [3436] 2020-01-27 10:10:08 info applying role Poller to worker 4
>>>>>>>> > > ...
>>>>>>>> > >
>>>>>>>> > >
>>>>>>>> > > _______________________________________________
>>>>>>>> > > Netdisco mailing list
>>>>>>>> > > [email protected]
>>>>>>>> > > https://sourceforge.net/p/netdisco/mailman/netdisco-users/
>>>>>>>> >
>>>>>>>> > --
>>>>>>>> > °(((=((===°°°(((================================================
--- End Message ---
_______________________________________________
Netdisco mailing list - Digest Mode
[email protected]
https://lists.sourceforge.net/lists/listinfo/netdisco-users