Today's Topics:
1. Re: scheduled jobs stop (Ricardo Stella)
--- Begin Message ---
Almost there...
[netdisco@netdisco ~]$ ~/bin/localenv perl -MSereal\ 999 -e 1
Sereal version 999 required--this is only version 4.007.
BEGIN failed--compilation aborted.
[netdisco@netdisco ~]$ ~/bin/localenv perl -MMCE::Queue\ 999 -e 1
MCE::Queue version 999 required--this is only version 1.865.
BEGIN failed--compilation aborted.
[netdisco@netdisco ~]$ ~/bin/localenv cpanm Sereal MCE
Sereal is up to date. (4.007)
MCE is up to date. (1.865)
I assume we are trying to delete them and force download?
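
If a forced reinstall does turn out to be the next step (nothing in this thread confirms that), a minimal sketch might look like the following. It assumes cpanm's `--reinstall` option and the same `~/bin/localenv` wrapper used above. The `reinstall_modules` function and the `DRY_RUN` variable are hypothetical names introduced only for illustration.

```shell
# Hypothetical helper (not part of Netdisco): force-reinstall Perl modules
# inside the Netdisco local environment, even if they are already up to date.
# With DRY_RUN=1 it only prints the command it would run.
reinstall_modules() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$HOME/bin/localenv cpanm --reinstall $*"
  else
    "$HOME/bin/localenv" cpanm --reinstall "$@"
  fi
}
```

Running `DRY_RUN=1 reinstall_modules Sereal MCE` prints the command for review. Without `DRY_RUN` set, the same call runs it.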
On Wed, Jan 29, 2020 at 10:52 AM Oliver Gorwits <[email protected]> wrote:
> Sorry, my apologies, yes you would need to add "~/bin/localenv" to the
> start of all those commands, I believe
>
>
>
> On Wed, 29 Jan 2020 at 15:17, Ricardo Stella <[email protected]> wrote:
>
>>
>> Running as the netdisco user, I'm getting:
>>
>> Can't locate Sereal.pm in @INC (@INC contains: /usr/local/lib64/perl5
>> /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl
>> /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .).
>> BEGIN failed--compilation aborted.
>>
>> Does it need --local-lib ~/perl5 or ~/bin/localenv first? And --notest?
>>
>>
>>
>>
>> On Wed, Jan 29, 2020 at 9:47 AM Oliver Gorwits <[email protected]> wrote:
>>
>>> Hi Ricardo
>>>
>>> Please can you also run:
>>> perl -MSereal\ 999 -e 1
>>> perl -MMCE::Queue\ 999 -e 1
>>>
>>> Then run
>>> cpanm Sereal MCE
>>>
>>> and then let us know if the problem is still there?
>>>
>>> thanks,
>>> oliver.
>>>
>>> On Wed, 29 Jan 2020 at 14:15, Ricardo Stella <[email protected]> wrote:
>>>
>>>>
>>>> Well, it's definitely a bug with the latest versions. I upgraded the
>>>> original instance I had which was running fine under 2.040006 since March
>>>> of last year. This one also is exhibiting the same issues with jobs queued
>>>> since 5:30pm yesterday.
>>>>
>>>> Error logs on that instance since last restart yesterday afternoon are:
>>>>
>>>> [7901] 2020-01-28 16:03:03 warn App::Netdisco 2.044011 backend
>>>> Argument "" isn't numeric in read at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <$__ANONIO__> line
>>>> 1.
>>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>> offset 1 of input at srl_decoder.c line 580 at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <$__ANONIO__> line
>>>> 1.
>>>> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
>>>> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
>>>> 1484, <$__ANONIO__> line 1753.
>>>> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
>>>> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
>>>> 1484, <$__ANONIO__> line 15984.
>>>> Argument "" isn't numeric in read at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <$__ANONIO__> line
>>>> 1.
>>>> Can't call method "status" without a package or object reference at
>>>> /home/netdisco/perl5/lib/perl5/App/Netdisco/Backend/Role/Poller.pm line 38,
>>>> <$__ANONIO__> line 1.
>>>> Argument "" isn't numeric in read at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <$__ANONIO__> line
>>>> 1.
>>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>> offset 1 of input at srl_decoder.c line 580 at
>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <$__ANONIO__> line
>>>> 1.
>>>>
>>>>
>>>>
>>>> On Tue, Jan 28, 2020 at 11:18 AM Ricardo Stella <[email protected]>
>>>> wrote:
>>>>
>>>>>
>>>>> And just noticed that there's a newer version out there. Updated the
>>>>> new instance (including wiping the perl5 directory) and right after I
>>>>> started it, I got an error message. The old one was also updated but it's
>>>>> not giving me any errors so far.
>>>>>
>>>>> [8849] 2020-01-28 16:13:41 warn App::Netdisco 2.044011 backend
>>>>> Argument "" isn't numeric in read at
>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <__ANONIO__> line
>>>>> 1.
>>>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>>> offset 1 of input at srl_decoder.c line 580 at
>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <__ANONIO__> line
>>>>> 1.
>>>>> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
>>>>> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
>>>>> 1484, <__ANONIO__> line 32.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Jan 28, 2020 at 9:56 AM Ricardo Stella <[email protected]>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> Same here...
>>>>>>
>>>>>> backend status thinks it's running but jobs are queued since last
>>>>>> night and not running. Here are the errors since last restart yesterday:
>>>>>>
>>>>>> [24657] 2020-01-27 16:00:58 warn App::Netdisco 2.044009 backend
>>>>>> Argument "" isn't numeric in read at
>>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <__ANONIO__> line
>>>>>> 1.
>>>>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>>>> offset 1 of input at srl_decoder.c line 580 at
>>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <__ANONIO__> line
>>>>>> 1.
>>>>>> Argument "=M-srl^D\0A,{App::Netdisco::Backend::Job(*^Ofstatusfqueu..."
>>>>>> isn't numeric in int at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line
>>>>>> 1484, <__ANONIO__> line 27785.
>>>>>> Argument "" isn't numeric in read at
>>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, <__ANONIO__> line
>>>>>> 1.
>>>>>> Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>>>> offset 1 of input at srl_decoder.c line 580 at
>>>>>> /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, <__ANONIO__> line
>>>>>> 1.
>>>>>>
>>>>>> We also did a DB dump and restored it on a new instance, so not sure
>>>>>> if this is related.
>>>>>>
>>>>>> I restarted our older instance. I will update Netdisco (running
>>>>>> 2.040006) and see if there are any issues on this instance. Worst case,
>>>>>> I'll redo the dump again.
>>>>>>
>>>>>>
>>>>>> On Tue, Jan 28, 2020 at 4:33 AM <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Ricardo
>>>>>>> Sorry I forgot to sign my email
>>>>>>> By the way I'm Marco
>>>>>>>
>>>>>>> It stopped again yesterday after a few hours.
>>>>>>> ...
>>>>>>> [5754] 2020-01-27 17:06:59 debug -> run worker main/wirelessnodes/100
>>>>>>> [5754] 2020-01-27 17:06:59 info pol (3): wrapping up macsuck job(22425208) - status done at Mon Jan 27 18:06:59 2020
>>>>>>> [5750] 2020-01-27 17:06:59 debug [172.17.119.6] macsuck - port 1:43 vlan unknown : 1 nodes
>>>>>>> Argument "PID_5754" isn't numeric in abs at /home/netdisco/perl5/lib/perl5/MCE/Core/Manager.pm line 206, <__ANONIO__> line 32948.
>>>>>>> Can't call method "_mce_m_pending" on an undefined value at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 679, <__ANONIO__> line 32949.
>>>>>>>
>>>>>>> I enabled debug logging. It seems that some scheduled jobs (macsuck,
>>>>>>> discoverall, etc.) cause the error 'Argument "PID_####" isn't numeric',
>>>>>>> which leaves the netdisco-backend child as a zombie:
>>>>>>> ps aux | grep netd
>>>>>>> netdisco 3428 0.0 0.3 22840 15848 ? S gen27 2:05 netdisco-backend
>>>>>>> netdisco 3429 0.0 0.0 0 0 ? Z gen27 0:15 [nd2: master] <defunct>
>>>>>>>
>>>>>>> I can't say if it is caused by my new setup/configuration or
>>>>>>> something else
>>>>>>>
>>>>>>> Marco
>>>>>>>
>>>>>>> > On 27 January 2020 at 17:03, Ricardo Stella <[email protected]> wrote:
>>>>>>> >
>>>>>>> >
>>>>>>> > Also happening here. I had also exported the DB in order to install
>>>>>>> > it on a new VM with a new OS. I had a couple of problems that I posted
>>>>>>> > about, but I also had this same error in the logs.
>>>>>>> > Noticed all jobs queued for a couple of days and nothing running.
>>>>>>> > The last message in the logs was:
>>>>>>> > Argument "" isn't numeric in read at
>>>>>>> > /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1439, line 1.
>>>>>>> > Sereal: Error: Bad Sereal header: Not a valid Sereal document. at
>>>>>>> > offset 1 of input at srl_decoder.c line 580 at
>>>>>>> > /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 1445, line 1.
>>>>>>> >
>>>>>>> > Restarting it seems to get the jobs running again.
>>>>>>> >
>>>>>>> >
>>>>>>> > On Mon, Jan 27, 2020 at 10:54 AM marco via netdisco-users <[email protected]> wrote:
>>>>>>> > > Hi there
>>>>>>> > >
>>>>>>> > > I set up a new ND2 host on Debian buster some weeks ago
>>>>>>> > > for experimental purposes.
>>>>>>> > > I have another ND2 host that has been up and running for years.
>>>>>>> > >
>>>>>>> > > Software Version
>>>>>>> > > App::Netdisco 2.44.4
>>>>>>> > > SNMP::Info 3.70
>>>>>>> > > DB Schema 61
>>>>>>> > > PostgreSQL 12.00.1
>>>>>>> > > Perl 5.28.1
>>>>>>> > >
>>>>>>> > > I restored the DB from another ND2
>>>>>>> > > and copied deployment.yml.
>>>>>>> > > It worked.
>>>>>>> > >
>>>>>>> > > But I noticed that it stops running the scheduled jobs after
>>>>>>> > > some time (days), and I have to restart netdisco-backend.
>>>>>>> > >
>>>>>>> > >
>>>>>>> > > Here is some info I collected
>>>>>>> > >
>>>>>>> > > from netdisco-backend.log
>>>>>>> > > ...
>>>>>>> > > [392] 2020-01-24 15:15:18 debug mgr (2): getting potential jobs for 1 workers
>>>>>>> > > [2700] 2020-01-24 15:15:18 debug [172.17.185.50] arpnip - processed 373 ARP Cache entries
>>>>>>> > > [2700] 2020-01-24 15:15:18 debug [172.17.185.50] arpnip - processed 0 IPv6 Neighbor Cache entries
>>>>>>> > > [2700] 2020-01-24 15:15:18 info pol (3): wrapping up arpnip job(22423168) - status done at Fri Jan 24 16:15:18 2020
>>>>>>> > > [392] 2020-01-24 15:15:18 debug getsome: cancelled 0E0 duplicate(s) of job 22423235
>>>>>>> > > [392] 2020-01-24 15:15:18 info mgr (2): job 22423235 booked out for this processing node
>>>>>>> > > Argument "PID_2700" isn't numeric in read at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 477, line 31470.
>>>>>>> > > Sereal: Error: Bad Sereal header: Not a valid Sereal document. at offset 1 of input at srl_decoder.c line 580 at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 480, line 31470.
>>>>>>> > >
>>>>>>> > > root@deb-netdisco:~# systemctl status netdisco-backend.service
>>>>>>> > > ● netdisco-backend.service - Netdisco Backend Service
>>>>>>> > > Loaded: loaded (/etc/systemd/system/netdisco-backend.service; enabled; vendor preset: enabled)
>>>>>>> > > Active: active (running) since Fri 2020-01-24 09:53:03 CET; 3 days ago
>>>>>>> > > Process: 110 ExecStart=/home/netdisco/bin/netdisco-backend start (code=exited, status=0/SUCCESS)
>>>>>>> > > Main PID: 216 (netdisco-backen)
>>>>>>> > > Tasks: 2 (limit: 4915)
>>>>>>> > > Memory: 143.0M
>>>>>>> > > CGroup: /system.slice/netdisco-backend.service
>>>>>>> > > └─216 netdisco-backend
>>>>>>> > >
>>>>>>> > > gen 24 09:53:02 deb-netdisco systemd[1]: Starting Netdisco Backend Service...
>>>>>>> > > gen 24 09:53:03 deb-netdisco netdisco-backend[110]: Netdisco Backend [Started]
>>>>>>> > > gen 24 09:53:03 deb-netdisco netdisco-backend[110]: config watcher: watching /home/netdisco/environments for updates.
>>>>>>> > > gen 24 09:53:03 deb-netdisco systemd[1]: Started Netdisco Backend Service.
>>>>>>> > > gen 24 10:01:48 deb-netdisco netdisco-backend[110]: -- /home/netdisco/environments/deployment.yml updated.
>>>>>>> > > gen 24 10:01:48 deb-netdisco netdisco-backend[110]: config watcher: sending TERM to the server (pid:217)...
>>>>>>> > >
>>>>>>> > > root@deb-netdisco:~# ps aux | grep netd
>>>>>>> > > netdisco 216 0.0 0.3 22840 16008 ? S gen24 6:19 netdisco-backend
>>>>>>> > > netdisco 281 0.0 0.3 20744 13680 ? S gen24 0:00 perl /home/netdisco/bin/netdisco-web start
>>>>>>> > > netdisco 282 0.0 0.3 22152 16696 ? S gen24 0:47 starman master --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>> > > netdisco 372 0.0 0.0 0 0 ? Z gen24 0:16 [nd2: master]
>>>>>>> > > netdisco 373 0.0 2.7 135148 117200 ? S gen24 0:06 starman worker --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>> > > netdisco 374 0.0 2.8 136000 118000 ? S gen24 0:06 starman worker --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>> > > netdisco 375 0.0 2.7 133744 115940 ? S gen24 0:06 starman worker --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>> > > netdisco 376 0.0 2.8 137420 119504 ? S gen24 0:06 starman worker --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>> > > netdisco 377 0.0 2.7 133792 115996 ? S gen24 0:05 starman worker --disable-keepalive --user 1001 --group 1001 /home/netdisco/perl5/bin/netdisco-web-fg
>>>>>>> > > root 3405 0.0 0.0 6096 824 pts/0 S+ 10:59 0:00 grep netd
>>>>>>> > >
>>>>>>> > >
>>>>>>> > > after stop and start
>>>>>>> > > root@deb-netdisco:~# systemctl start netdisco-backend.service
>>>>>>> > >
>>>>>>> > > it seems to work again
>>>>>>> > > [392] 2020-01-24 15:15:18 info mgr (2): job 22423235 booked out for this processing node
>>>>>>> > > Argument "PID_2700" isn't numeric in read at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 477, line 31470.
>>>>>>> > > Sereal: Error: Bad Sereal header: Not a valid Sereal document. at offset 1 of input at srl_decoder.c line 580 at /home/netdisco/perl5/lib/perl5/MCE/Queue.pm line 480, line 31470.
>>>>>>> > > [3429] 2020-01-27 10:10:08 warn App::Netdisco 2.044004 backend
>>>>>>> > > [3429] 2020-01-27 10:10:08 info resolving backend hostname...
>>>>>>> > > [3433] 2020-01-27 10:10:08 info applying role Scheduler to worker 1
>>>>>>> > > [3436] 2020-01-27 10:10:08 info applying role Poller to worker 4
>>>>>>> > > ...
>>>>>>> > >
>>>>>>> > >
>>>>>>> > > _______________________________________________
>>>>>>> > > Netdisco mailing list
>>>>>>> > > [email protected]
>>>>>>> > > https://sourceforge.net/p/netdisco/mailman/netdisco-users/
>>>>>>> >
>>>>>>> > --
>>>>>>> > °((( = (( ===°°° (((
>>>>>>> ================================================
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> °(((=((===°°°(((================================================
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> °(((=((===°°°(((================================================
>>>>>
>>>>
>>>>
>>>> --
>>>> °(((=((===°°°(((================================================
>>>
>>>
>>
>> --
>> °(((=((===°°°(((================================================
>>
>
--
°(((=((===°°°(((================================================
--- End Message ---