On Sat, 21 Jun 2025, heasley wrote:
Wed, Jun 18, 2025 at 11:22:23PM +0000, Dan Mahoney (Gushi):
Hey there all,
Something's driving me batty.
I can only connect to my ASR-1001-X intermittently. Rancid (run
as the rancid user) always works from the command line, but rancid-run fails
for some reason.
When I watch rancid-run, I see several ssh processes start up, trying to
shell to the router in question, but of course, the output of those isn't
logged anywhere? Clogin works. Running all the commands in rancid -d works
(though of course there are many extra commands in there).
There should only be 1 ssh process per device, though it will try
rancid.conf:MAX_ROUNDS times.
Much of the output is filtered, but effort is made to log relevant
errors to rancid.conf:${LOGDIR}/<group>.<datestamp>
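A minimal sketch for pulling up the newest log for a group, assuming the
logs are plain files named <group>.<datestamp> under LOGDIR as described
above. The function name latest_rancid_log is hypothetical, not part of
rancid.

```shell
# latest_rancid_log LOGDIR GROUP
# Print the most recently modified log file for GROUP.
# Returns non-zero if no file matching GROUP.* exists in LOGDIR.
latest_rancid_log() {
    logdir=$1
    group=$2
    # ls -t sorts newest first; errors from a non-matching glob are discarded.
    newest=$(ls -t "$logdir"/"$group".* 2>/dev/null | head -n 1)
    [ -n "$newest" ] || return 1
    printf '%s\n' "$newest"
}
```

Usage would be something like less "$(latest_rancid_log "$LOGDIR" groupname)",
with LOGDIR set to the value from rancid.conf.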
It is possible that the device is simply slow executing some commands.
This is not unusual for older devices or because of bugs such as
memory leaks. Increasing the timeout can test this theory: either
increase the timeout for all devices of type cisco,
rancid.types.base: cisco;timeout;120
Interesting, this line wasn't in my existing rancid.types.base for type
cisco. I've added it at 300 in both the conf file and cloginrc.
But it seems not to be honored. For example, at the time of one of the
failures, I get:
$ time rancid-run
57.43 real 3.20 user 0.40 sys
And also, ps seems to report it's being hard-set at 90:
rancid 87909 2.1 0.1 18324 6952 0 S+ 17:40 0:00.06
/usr/local/bin/expect -- /usr/local/libexec/rancid/clogin -t 90 -c show
version;show redundancy secondary;show idprom backplane;show install
active;show env all;show rsp chassis-info;show gsr chassis;show diag
chassis-info;show boot;show bootvar;show variables boot;show license
udi;show license feature;show license;show license summary;show
activation-key (...)
Weirdly, sitting on the router and stalking "who" I see the rancid login
happen multiple times.
Adding a couple of quotes and running the full clogin command line always
runs quickly.
or for specific devices,
~rancid/.cloginrc: add timeout <name glob> {<seconds>}
But every time I call rancid-run groupname, I get the "routers have not been
contacted in over 24 hours" email. And only intermittently. (It's been a
little over 24 hours with no changes now).
Another thing to check, which would also be revealed in the
aforementioned logs, is that the repository is not buggered in
some manner that control_rancid cannot resolve.
su - rancid
cd <group>
<SCM> update or <SCM> status
and look for errors.
Those are the things that I would investigate or try first.
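For a CVS-backed group, the "look for errors" step could be sketched as a
filter over the output of a dry-run update. This assumes the usual cvs
update status letters (C for conflict, ? for a file unknown to CVS) and
that cvs diagnostics start with "cvs "; the function name
cvs_problem_lines is hypothetical.

```shell
# cvs_problem_lines: read "cvs -nq update" output on stdin and print only
# lines flagged C (conflict), ? (unknown file), or starting with "cvs "
# (diagnostics). Exit status is 0 if any such line was found, 1 otherwise.
cvs_problem_lines() {
    grep -E '^(C |\? |cvs )'
}
```

Run from the group directory as the rancid user, for example:
cvs -nq update 2>&1 | cvs_problem_lines
No output and a non-zero status would suggest the working copy is clean.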
cvs up/cvs status run clean.
I even deleted and re-added the file from cvs.
When it works, it works. This is what's confusing me.
===
(a few hours later)
I think I have one (silly) theory about what's going wrong. I have a bit
of ASCII art in the motd, and when I removed it, things started running
more fluidly. (It has # signs, carets, and slashes in it).
https://www.gushi.org/routerferret.png Too many weasels in the router.
I still don't know why this would only break things half the time, though.
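One rough way to poke at the motd theory is to list banner lines that could
be mistaken for a CLI prompt. This is only a heuristic under the unconfirmed
assumption that lines ending in # or > are the problem; it is not clogin's
actual prompt pattern, and the function name banner_prompt_like_lines is
hypothetical.

```shell
# banner_prompt_like_lines: read banner text on stdin and print, with line
# numbers, any line whose last non-blank character is # or >.
# Exit status is 0 if at least one such line was found, 1 otherwise.
banner_prompt_like_lines() {
    grep -nE '[#>][[:space:]]*$'
}
```

Feeding it a saved copy of the motd (banner_prompt_like_lines < motd.txt,
where motd.txt is a hypothetical file) would show which lines of the art
are worth trimming first.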
I still don't know why things always work fluidly when I just paste
commands in -- perhaps the clogin goes fine, but what happens after is
breaking.
I also still don't know why -t 90 is being reported if I've set an
explicit, longer timeout.
I'm also not sure why rancid does something like:
more system:running-config;show running-config view full;show
running-config;write term -- if multiple of these commands work, are they
post-processed/deduplicated down to a single config before they're
committed to CVS?
Does it make sense to pare these down to a single command-set that works
only on my version of IOS-XE, and define my own device type for it?
Rancid seems to have a very "throw all the commands at the wall and see
what sticks" point of view.
-Dan
--
--------Dan Mahoney--------
Techie, Sysadmin, WebGeek
Gushi on efnet/undernet IRC
FB: fb.com/DanielMahoneyIV
LI: linkedin.com/in/gushi
Site: http://www.gushi.org
---------------------------
_______________________________________________
Rancid-discuss mailing list
[email protected]
https://www.shrubbery.net/mailman/listinfo/rancid-discuss