Konstantin Ryabitsev <konstan...@linuxfoundation.org> wrote:
> On Tue, Nov 14, 2023 at 10:16:53PM +0000, Eric Wong wrote:
> > Konstantin Ryabitsev <konstan...@linuxfoundation.org> wrote:
> > >   └─-cindex -u --al,4432
> > >       ├─cidx shard[0],4646
> > >       └─cidx shard[1],4647
> > >
> > > Anything I can do to figure out why this is happening?
> >
> > You can show me strace and lsof +E of the processes (any other
> > processes (join|sort|awk|perl)?).  This code is highly in flux,
> > so it's also fine to rm the test for now since nothing
> > public-facing is using -cindex, yet...
>
> Yeah, but I figured I'll poke a bit in case it's helpful.
>
> I can't do +E because that's not available to me under CentOS7 (I can't wait
> until we move on, but just when we think the yak is fully shaved, we find more
> clumps of thick fur we hadn't considered). Is the output of the regular "lsof
> -p" helpful at all?

Sure.

> Strace for all three processes (-cindex, cidx shard[0], cidx shard[1]) just
> sits at:
>
> select(24, [13 16], NULL, NULL, NULL

OK, that's still useful.  One FD is signalfd, the others would
be a SOCK_SEQPACKET socket, I think...

> As far as I can see, there are no other processes other than cidx.

OK.  Hmm.. Perhaps `kill -CHLD' on the top-level cindex process
can move it along?  There's still some weird timeout/sleep
behavior leftover from Danga::Socket that should probably be
removed:

-------8<------
Subject: [PATCH] ds: run @post_loop_do if any user-queued events run

---
 We can probably kill more hacky wakeup behavior throughout cindex
 and maybe other places with this patch...

 lib/PublicInbox/DS.pm | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/PublicInbox/DS.pm b/lib/PublicInbox/DS.pm
index da26efc4..4c8b502f 100644
--- a/lib/PublicInbox/DS.pm
+++ b/lib/PublicInbox/DS.pm
@@ -144,13 +144,14 @@ sub next_tick () {
 		# https://rt.perl.org/Public/Bug/Display.html?id=114340
 		blessed($obj) ? $obj->event_step : $obj->();
 	}
+	1;
 }
 
 # runs timers and returns milliseconds for next one, or next event loop
 sub RunTimers {
-	next_tick();
+	my $ran = next_tick();
 
-	return ($nextq ? 0 : $loop_timeout) unless @Timers;
+	return ($nextq || $ran ? 0 : $loop_timeout) unless @Timers;
 
 	my $now = now();
 
@@ -159,10 +160,11 @@ sub RunTimers {
 		my $to_run = shift(@Timers);
 		delete $UniqTimer{$to_run->[1] // ''};
 		$to_run->[2]->(@$to_run[3..$#$to_run]);
+		$ran = 1;
 	}
 
 	# timers may enqueue into nextq:
-	return 0 if $nextq;
+	return 0 if $nextq || $ran;
 
 	return $loop_timeout unless @Timers;