[PATCH 1/4] lei: fix idempotent STDERR redirect in workers

2023-11-15 Thread Eric Wong
This is needed to support forking from already-forked lei workers and $lei->{2} is already STDERR. Fixes: e015c3742f91 (lei: use autodie where appropriate, 2023-10-17) --- lib/PublicInbox/LEI.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/LEI.pm

[PATCH 2/4] lei convert: fix repeat and idempotent v2 output

2023-11-15 Thread Eric Wong
We should be able to treat v2 outputs just like any other mail format, with the exception that content dedupe is always enforced by the v2 format. This allows users hosting v2 public-inboxes to catch up broken synchronization from alternate archives such as the mbox archives hosted by

[PATCH 4/3] xap_helper_cxx: accept leading spaces from pkg-config

2023-11-15 Thread Eric Wong
Eric Wong wrote: > Avoid mixing autodie use in different scopes since it's likely > to cause problems like it did in Gcf2. While none of these > fix known problems with test cases, it's likely worthwhile to > avoid it anyways to avoid future surprises. > lib/PublicInbox/XapHe

[PATCH] cindex: fix test when missing time(1) executable

2023-11-14 Thread Eric Wong
Eric Wong wrote: > +++ b/t/cindex.t > @@ -210,7 +210,7 @@ EOM > my $cmd = [ qw(-cindex -u --all --associate -d), "$tmp/ext", > '-I', $basic->{inboxdir} ]; > $cidx_out = $cidx_err = ''; > - ok(run_script($cmd, $env, $opt), 'ass

[PATCH 0/3] libgit2 fixes for CentOS 7.x users

2023-11-14 Thread Eric Wong
These only spring up with libgit2-devel installed, but 3/3 seems important for all users of older Perl in order to avoid future surprises. Eric Wong (3): gcf2client: add alias for PublicInbox::Git::fail gcf2: fix autodie usage for older Perl treewide: more autodie safety fixes for older

[PATCH 1/3] gcf2client: add alias for PublicInbox::Git::fail

2023-11-14 Thread Eric Wong
Ensure we can ->fail properly from other subs we call within Gcf2Client. This doesn't fix the test failures on CentOS 7.x, but tries to make it easier to fix underlying problems and report OOM errors and other things which the test suite doesn't touch on. --- lib/PublicInbox/Gcf2Client.pm | 1 +

[PATCH 3/3] treewide: more autodie safety fixes for older Perl

2023-11-14 Thread Eric Wong
Avoid mixing autodie use in different scopes since it's likely to cause problems like it did in Gcf2. While none of these fix known problems with test cases, it's likely worthwhile to avoid it anyways to avoid future surprises. For Process::IO, we'll add some additional tests in t/io.t to ensure

[PATCH 2/3] gcf2: fix autodie usage for older Perl

2023-11-14 Thread Eric Wong
At least on Perl v5.16.3 on CentOS 7.x, use-ing autodie within BEGIN {} affects all subroutines in that package, too. So just use autodie at the top-level and rely on CORE::* and try_cat to handle cases where autodie isn't desired. --- lib/PublicInbox/Gcf2.pm | 13 ++--- 1 file changed,
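
A minimal sketch of the pattern described above, not the actual Gcf2 code: autodie is enabled once at file scope, and the CORE:: prefix is used where a failure should be returned rather than thrown. The package name and both sub names are hypothetical.

```perl
use strict;
use warnings;

package Demo::TopLevelAutodie; # hypothetical package, for illustration only
use autodie qw(open close); # enabled once at the top level, not inside BEGIN {}

# autodie applies here: a failed open throws an exception
sub must_open {
	my ($path) = @_;
	open(my $fh, '<', $path);
	$fh;
}

# CORE::open bypasses autodie, so a failure just returns false
sub try_open {
	my ($path) = @_;
	CORE::open(my $fh, '<', $path) or return;
	$fh;
}

package main;

my $missing = "/nonexistent/$$/file";
my $fh = Demo::TopLevelAutodie::try_open($missing);
print 'try_open returned ', (defined($fh) ? 'a handle' : 'undef'), "\n";
my $lived = eval { Demo::TopLevelAutodie::must_open($missing); 1 };
print 'must_open ', ($lived ? 'returned' : 'threw an exception'), "\n";
```

Keeping a single `use autodie' per file means there is only one scope to reason about, which is the point of the change.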

Re: t/cindex.t "associate w/o search" test hangs for me

2023-11-14 Thread Eric Wong
Konstantin Ryabitsev wrote: > t/gcf2_client.t .. 1/? Can't locate object method "fail" via > package "PublicInbox::Gcf2Client" at > /home/mricon/public-inbox-test/blib/lib/PublicInbox/Git.pm line 269. > (in cleanup) Can't locate object method "fail" via package >

Re: t/cindex.t "associate w/o search" test hangs for me

2023-11-14 Thread Eric Wong
Konstantin Ryabitsev wrote: > There are two sources of potential discrepancy: > > - differences in CPAN module versions I have installed > - different git or xapian14 versions Could also be CPU count or CPU speeds since IPC bugs are sensitive to that. That said, I can't get cindex.t to fail

Re: t/cindex.t "associate w/o search" test hangs for me

2023-11-14 Thread Eric Wong
Konstantin Ryabitsev wrote: > Looks like the last time I am able to successfully run "make test" is before > this commit: > > b231d91f42d791becf7b6861e723833d71e73237 is the first bad commit > > The error I start getting after this commit is: > > t/extsearch.t 160/? > #

[PATCH 1/2] lei: use -signal numbers for old Perl

2023-11-14 Thread Eric Wong
Unlike modern Perls, Perl 5.16.3 on CentOS doesn't accept negative string signals like "-TERM" . This only became a problem since commit b231d91f42d7 (treewide: enable warnings in all exec-ed processes) made our code stricter by enabling more warnings. In both cases, the kill is probably
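
A sketch of the numeric form, assuming a POSIX system; this is not the code from the patch. A negative signal number makes Perl's kill target a process group, and it avoids relying on string forms like "-TERM".

```perl
use strict;
use warnings;
use POSIX qw(SIGTERM setpgid);

my $pid = fork;
defined($pid) or die "fork: $!";
if ($pid == 0) {
	setpgid(0, 0); # child becomes leader of its own process group
	sleep 60;
	POSIX::_exit(0);
}
setpgid($pid, $pid); # repeated in the parent to avoid racing the child

my $sig = -(SIGTERM); # negative number: signal the whole process group
kill($sig, $pid) or die "kill: $!";
waitpid($pid, 0);
my $termsig = $? & 127; # low 7 bits hold the terminating signal
print "child terminated by signal $termsig\n";
```

The parentheses in `-(SIGTERM)' matter: they force the constant to be evaluated to a number before negation instead of leaving a bareword next to the unary minus.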

[PATCH 0/2] some CentOS fixes

2023-11-14 Thread Eric Wong
Neither of these fix the t/cindex.t stuck problem Konstantin is encountering, though... Eric Wong (2): lei: use -signal numbers for old Perl t/lei-import: account for more verbose error lib/PublicInbox/LEI.pm| 2 +- lib/PublicInbox/LeiXSearch.pm | 2 +- t/lei-import.t

[PATCH 2/2] t/lei-import: account for more verbose error

2023-11-14 Thread Eric Wong
Perl 5.16.3 on CentOS seems more verbose in one of the EIO tests. Relax the regexp so we can account for extra errors reported by Perl. --- t/lei-import.t | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/t/lei-import.t b/t/lei-import.t index bd562617..b4446b56 100644 ---

Re: t/cindex.t "associate w/o search" test hangs for me

2023-11-14 Thread Eric Wong
Konstantin Ryabitsev wrote: > On Tue, Nov 14, 2023 at 10:46:57PM +0000, Eric Wong wrote: > > > I can't do +E because that's not available to me under CentOS7 (I can't > > > wait > > > until we move on, but just when we think the yak is fully shaved, we find

Re: t/cindex.t "associate w/o search" test hangs for me

2023-11-14 Thread Eric Wong
Konstantin Ryabitsev wrote: > On Tue, Nov 14, 2023 at 10:16:53PM +0000, Eric Wong wrote: > > Konstantin Ryabitsev wrote: > > > └─-cindex -u --al,4432 > > > ├─cidx shard[0],4646 > > > └─cidx shard[1],4647

Re: t/cindex.t "associate w/o search" test hangs for me

2023-11-14 Thread Eric Wong
Konstantin Ryabitsev wrote: > └─-cindex -u --al,4432 > ├─cidx shard[0],4646 > └─cidx shard[1],4647 > > Anything I can do to figure out why this is happening? You can show me strace and lsof +E of the processes (any other processes (join|sort|awk|perl)?). This

Re: [PATCH] TestCommon: older strace does not have --version

2023-11-14 Thread Eric Wong
Thanks, pushed as commit 58e6ee9df4f74b1078541c8924cf2918ceec0765

[PATCH] cindex: fix missing semicolon on broken $GIT_DIR/objects

2023-11-14 Thread Eric Wong
Noticed while working on another feature... --- lib/PublicInbox/CodeSearchIdx.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/CodeSearchIdx.pm b/lib/PublicInbox/CodeSearchIdx.pm index 4ed5ea64..9ceef16c 100644 --- a/lib/PublicInbox/CodeSearchIdx.pm +++

Re: [Question] review links are disappearing from the qemu-devel mailing-list

2023-11-14 Thread Eric Wong
Konstantin Ryabitsev wrote: > On Tue, Nov 14, 2023 at 04:36:29PM +0000, Eric Wong wrote: > > In any case, kernel.org folks should be able to import missing > > messages from GNU.org idempotently into lore/qemu-devel without > > having to resend: > > > > https:

Re: [Question] review links are disappearing from the qemu-devel mailing-list

2023-11-14 Thread Eric Wong
Salil Mehta wrote: > Rest other below link end up in 'Not found' > > https://yhbt.net/lore/qemu-devel/20231027150536.3c481...@imammedo.users.ipa.redhat.com/ > https://yhbt.net/lore/qemu-devel/20231027160814.3f47f...@imammedo.users.ipa.redhat.com/ >

Re: [Question] review links are disappearing from the qemu-devel mailing-list

2023-11-14 Thread Eric Wong
Salil Mehta wrote: > I have cross confirmed the behavior with other people across companies and > all of them are having issues in viewing above links. Surprising part is > these were present at the first instance when the review comments were floated > by Igor - I can assure you that. Does my

2.0 newsgroup name incompatibility (unlikely to affect anyone...)

2023-11-13 Thread Eric Wong
I don't think this affects anyone, but historically we supported uppercase characters in newsgroup names since (IIRC) the venerable INN server was case-sensitive. However, IMAP, POP3 require lowercase; and our extindex (and WIP -cindex) get confused/broken with uppercase characters. So I think

[PATCH 06/18] xap_helper_cxx: use -pipe by default in CXXFLAGS

2023-11-13 Thread Eric Wong
-ggdb3 is already used for g++ and clang, and -pipe is supported by clang even if it's a no-op. So just use it to speed up g++ since it saves me 30-40ms. We'll also get rid of the explicit `-O0' since it's the default for both clang and g++. --- lib/PublicInbox/XapHelperCxx.pm | 2 +- 1 file

[PATCH 09/18] spawn: don't append to scalarrefs on stdout/stderr

2023-11-13 Thread Eric Wong
None of our current code relies on it, and I can't imagine it's something we'd need in the future, actually... This keeps the door open for relying more on Spawn in TestCommon. --- lib/PublicInbox/Spawn.pm | 2 +- t/spawn.t| 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)

[PATCH 04/18] xap_helper_cxx: use write_file helper

2023-11-13 Thread Eric Wong
PublicInbox::IO already gets loaded by PublicInbox::Spawn, so there's no avoiding it even if we want fast startup time :< But startup time for this piece will be less relevant in the near future... --- lib/PublicInbox/XapHelperCxx.pm | 16 ++-- 1 file changed, 6 insertions(+), 10

[PATCH 02/18] tmpfile: check `stat' errors, use autodie for unlink

2023-11-13 Thread Eric Wong
`stat' can fail due to bugs on our end or ENOMEM, but there's no autodie support for it. So just die if `unlink' fails, since the FS wouldn't be usable for tmpfiles in that state, anyways. --- lib/PublicInbox/Tmpfile.pm | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git

[PATCH 18/18] cindex: support --associate-aggressive shortcut

2023-11-13 Thread Eric Wong
This is shorthand for enabling --associate with the most aggressive (and time-consuming) options available, starting from the Unix epoch and having an unlimited window to join on. --- lib/PublicInbox/CodeSearchIdx.pm | 5 + script/public-inbox-cindex | 1 + 2 files changed, 6

[PATCH 14/18] xap_helper: stricter and harsher error handling

2023-11-13 Thread Eric Wong
We'll require an error stream for dump_ibx and dump_roots commands; they're too important to ignore. Instead of writing code to provide diagnostics for errors, rely on abort(3) and the -ggdb3 compiler flag to generate nice core dumps for gdb since all commands sent to xap_helper are from internal

[PATCH 07/18] xap_client: spawn C++ xap_helper directly

2023-11-13 Thread Eric Wong
No need to suffer through an extra dose of slow Perl load times when we can drive the build in the big parent Perl process and get the executable path name to pass to spawn directly. --- lib/PublicInbox/XapClient.pm| 28 ++-- lib/PublicInbox/XapHelperCxx.pm | 9

[PATCH 13/18] cidx_xap_helper_aux: complain about truncated inputs

2023-11-13 Thread Eric Wong
This will help us notice bugs and system resource limitations sooner rather than later. --- lib/PublicInbox/CidxXapHelperAux.pm | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/lib/PublicInbox/CidxXapHelperAux.pm b/lib/PublicInbox/CidxXapHelperAux.pm index

[PATCH 10/18] cindex: imply --all with --associate w/o -I/--only

2023-11-13 Thread Eric Wong
I just forgot to use --all with --associate and it wasn't easily apparent what was wrong. We'll also show some extra progress while we're at it. --- lib/PublicInbox/CodeSearchIdx.pm | 26 +- 1 file changed, 21 insertions(+), 5 deletions(-) diff --git

[PATCH 17/18] cindex: rename associate-max => window

2023-11-13 Thread Eric Wong
"window" is probably a better term since it's an inexact thing to match on. --- lib/PublicInbox/CodeSearchIdx.pm | 10 +- script/public-inbox-cindex | 2 +- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/lib/PublicInbox/CodeSearchIdx.pm

[PATCH 11/18] cindex: delay associate until prune+indexing finish

2023-11-13 Thread Eric Wong
Prune can get rid of invalid commits while indexing can add new candidates for association, so we don't dump coderepo roots for association until those are squared away. However, we can dump inbox info since we don't touch inboxes while -cindex is running. --- lib/PublicInbox/CidxComm.pm |

[PATCH 08/18] treewide: update read_all to avoid eof|close checks

2023-11-13 Thread Eric Wong
read_all can be expanded to support FIFOs/pipes/sockets where read-until-EOF behavior is desired. We can also rely on wantarray to support splitting on EOL markers, but it's hard-coded to support only `$/ eq "\n"' since (AFAIK) it's the only way we use the wantarray form `readline'. ---

[PATCH 16/18] cindex: do not guess integer maximum for Xapian

2023-11-13 Thread Eric Wong
We can return an array to allow the caller to omit the internal `-m' arg entirely. We'll also allow any non-positive values to mean there's no limit; and we'll defer the "unlimited" case to the XapHelper implementation. This frees us of having to deal with mismatches between Perl and Xapian if

[PATCH 03/18] cindex: use `local' for pipes between processes

2023-11-13 Thread Eric Wong
We can let these pipes get auto-closed upon leaving the process subroutine scope. --- lib/PublicInbox/CodeSearchIdx.pm | 19 +-- 1 file changed, 9 insertions(+), 10 deletions(-) diff --git a/lib/PublicInbox/CodeSearchIdx.pm b/lib/PublicInbox/CodeSearchIdx.pm index

[PATCH 12/18] xap_helper: Perl dump_ibx respects `-m MAX'

2023-11-13 Thread Eric Wong
The C++ version does, so the Perl/XS version should, too; even if we intentionally avoid using it right now. --- lib/PublicInbox/XapHelper.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/XapHelper.pm b/lib/PublicInbox/XapHelper.pm index 1ee918e3..4157600f

[PATCH 15/18] xap_helper: better variable naming for key buffer

2023-11-13 Thread Eric Wong
We'll use `kbuf' for the search object key, since we already use the `fbuf' term in `struct fbuf'. This also adds an extra check for open_memstream(3) failures in case of ENOMEM. --- lib/PublicInbox/xap_helper.h | 33 - 1 file changed, 16 insertions(+), 17

[PATCH 05/18] xap_helper_cxx: make the build process ccache-friendly

2023-11-13 Thread Eric Wong
We need to have stable filenames and separate compilation from the linkage stage for ccache to hit. So avoid the use of a temporary directory and instead rely on a lock file to guard against parallel builds. --- lib/PublicInbox/XapHelperCxx.pm | 32 +++- 1 file

[PATCH 00/18] cindex: some --associate work

2023-11-13 Thread Eric Wong
refixes=patchid+dfblob But Perl doesn't ship with getsubopt(3) emulation out-of-the-box Eric Wong (18): cindex: check `say' errors w/ close or ->flush tmpfile: check `stat' errors, use autodie for unlink cindex: use `local' for pipes between processes xap_helper_cxx: use write_file help

[PATCH 01/18] cindex: check `say' errors w/ close or ->flush

2023-11-13 Thread Eric Wong
We actually need to rely on autodie `close' to check for errors, since error-checking with `say' is not useful due to perlio write buffering. We'll also stop relying on `say ... or die' since it's needless noise. Fixes: 19f9089343c9 (cindex: drop redundant close on regular FH) ---
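
A small illustration of why `say' can't be trusted for error checking, assuming a Linux system where /dev/full exists (every write(2) to it fails with ENOSPC). This is a sketch of the general behavior, not code from cindex.

```perl
use strict;
use warnings;
use v5.10; # for `say'
use autodie qw(open close);

my $path = '/dev/full'; # Linux-specific assumption
my ($say_ok, $close_err);
if (-w $path) {
	open(my $fh, '>', $path);
	# a small write only lands in the PerlIO buffer, so `say' reports success
	$say_ok = say {$fh} 'hello';
	# the buffer is flushed at close; autodie turns that failure into an exception
	eval { close($fh) };
	$close_err = $@;
	say 'say returned ', ($say_ok ? 'true' : 'false');
	say 'close ', ($close_err ? 'threw an exception' : 'succeeded');
} else {
	say "skipping: $path is not available";
}
```

An explicit ->flush would surface the same error earlier if the handle needs to stay open.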

[PATCH] xap_helper: reset getopt(3) properly in workers

2023-11-12 Thread Eric Wong
I only noticed this while doing a full -cindex --associate with --associate-date-range=30.years.ago and --associate-max=-1 (no limit for Xapian) between local mirrors of lore and git.kernel.org on my glibc-based system. Apparently, glibc requires `optind = 0' to reset getopt(3) in our workers.

Re: [Bug] lei: extra quotes inserted into query with AND/OR

2023-11-12 Thread Eric Wong
Henrik Grimler wrote: > Aha, I see, thanks for the explanation! Without the single quotes, and > after escaping parentheses, lei works as expected. Good to know. > For the record, I read some old posts where query was '' quoted, and > thought it was the way to do it (for example >

[PATCH] lei: don't read --stdin terminals from daemon

2023-11-12 Thread Eric Wong
We must use a foreground process to read from terminals on stdin, otherwise weird things like lost keystrokes and EIO can happen. So take advantage of ->send_exec_cmd to spawn `cat' in the same way we spawn MUAs, pagers, `git config --edit' and `git credential' from script/lei ---

Re: [Bug] lei: extra quotes inserted into query with AND/OR

2023-11-12 Thread Eric Wong
Henrik Grimler wrote: > Hi Eric, > > On Sun, Nov 12, 2023 at 12:10:50AM +0000, Eric Wong wrote: > > Henrik Grimler wrote: > > > Hi, > > > > > > I recently found out about lei and installed it through archlinux's > > > package manager

Re: [Bug] lei: extra quotes inserted into query with AND/OR

2023-11-11 Thread Eric Wong
Henrik Grimler wrote: > Hi, > > I recently found out about lei and installed it through archlinux's > package manager and am trying out queries. When using AND/OR extra > quotes are inserted in the curl command which messes it up, for > example: > > $ lei q -I https://lore.kernel.org/all/ -o

[PATCH 4/4] doc: update README.unsubscribe

2023-11-11 Thread Eric Wong
The whitelist was only used in the early days of its development and hasn't existed for a while. I've largely forgotten this thing exists since it's been working well... --- examples/README.unsubscribe | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git

[PATCH 3/4] mda: fix and test some usage problems

2023-11-11 Thread Eric Wong
-mda now honors `--help' properly and invocations missing ORIGINAL_RECIPIENT now fail with EX_NOUSER. Helped-by: Leah Neukirchen Link: https://public-inbox.org/meta/87msvlguqu@vuxu.org/ --- script/public-inbox-mda | 7 ++- t/mda.t | 18 ++ 2 files

[PATCH 1/4] learn: fix redundant ham import on dual matches

2023-11-11 Thread Eric Wong
When learning and injecting new messages as ham, we want to avoid wasting cycles importing the same message into an inbox twice (once for the To/Cc match and once for the List-Id match). Our existing %seen hash turned out to be ineffective since PublicInbox::Inbox refs get re-blessed to

[PATCH 0/4] support publicinboxImport.dropUniqueUnsubscribe

2023-11-11 Thread Eric Wong
noticed while working on 2. Eric Wong (4): learn: fix redundant ham import on dual matches mda|learn|watch: support dropUniqueUnsubscribe config mda: fix and test some usage problems doc: update README.unsubscribe Documentation/public-inbox-config.pod | 17 Documentation/public-inbox

[PATCH 2/4] mda|learn|watch: support dropUniqueUnsubscribe config

2023-11-11 Thread Eric Wong
List-Unsubscribe headers with unique identifiers (such as those generated by our examples/unsubscribe.milter) should not end up in public archives. Add a new config knob to strip List-Unsubscribe headers if they have the `List-Unsubscribe-Post: List-Unsubscribe=One-Click' header. Unfortunately,

[PATCH] t/lei-import: skip strace for restricted systems

2023-11-10 Thread Eric Wong
Systems with Yama can restrict ptrace(2) (the underlying syscall used by strace(1)) and make it difficult to test error handling via error injection. Just skip the tests on such systems since it's probably not worth the effort to start using prctl(2) to enable the test on such systems. ---

Re: [RFC v2] www: add topics_(new|active).(html|atom) endpoints

2023-11-10 Thread Eric Wong
Konstantin Ryabitsev wrote: > On Fri, Nov 10, 2023 at 03:09:59AM +0000, Eric Wong wrote: > > That said, the Atom feeds generated by this RFC includes full > > messages because that's the easiest way to tie into our existing > > Atom generation code, so it's currently

Re: [PATCH] public-inbox-mda: use status codes where applicable

2023-11-10 Thread Eric Wong
Leah Neukirchen wrote: > Many MTA understand these and map them to sensible SMTP error messages. > > Inability to find an inbox results in "5.1.1 user unknown". > Misformatted messages are rejected with "5.6.0 data format error". > Unsupported inbox versions are reported as "5.3.5 local

[RFC v2] www: add topics_(new|active).(html|atom) endpoints

2023-11-09 Thread Eric Wong
Konstantin Ryabitsev wrote: > On Thu, Nov 09, 2023 at 02:45:08AM +0000, Eric Wong wrote: > > This seems like an easy (but WWW-specific) way to get recent > > topics as suggested by Konstantin. Perhaps an Atom endpoint > > will also be useful. > > Yes, actually th

[PATCH 13/13] spawn: get rid of wantarray popen_rd/popen_wr

2023-11-09 Thread Eric Wong
We've updated all of our users to use Process::IO (and avoiding tied handles) so the trade-off for using the array context no longer exists. --- lib/PublicInbox/Spawn.pm | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm

[PATCH 10/13] lei_input: always close single `eml' inputs

2023-11-09 Thread Eric Wong
This matches the behavior we have for multi-message mbox files since we rely on ->close to detect errors on bad mboxes. This ensures we'll notice errors reading single messages from stdin. We'll also start relying more on strace error injection to test error handling. ---

[PATCH 12/13] lei: get rid of autoreap usage

2023-11-09 Thread Eric Wong
We can rely on Process::IO->DESTROY to close and reap in these cases. This is the final step in eliminating the wantarray invocations of popen_rd (and popen_wr). --- lib/PublicInbox/LeiInput.pm | 13 + lib/PublicInbox/LeiRemote.pm | 14 ++ lib/PublicInbox/LeiXSearch.pm

[PATCH 08/13] lei_mirror: note missing local manifests are non-fatal

2023-11-09 Thread Eric Wong
Sometimes seeing that warning is alarming. --- lib/PublicInbox/LeiMirror.pm | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/LeiMirror.pm b/lib/PublicInbox/LeiMirror.pm index 49febe9e..84266d03 100644 --- a/lib/PublicInbox/LeiMirror.pm +++

[PATCH 11/13] xapcmd: get rid of scalar wantarray popen_rd

2023-11-09 Thread Eric Wong
We can rely on Process::IO->attached_pid and work towards simplifying popen_rd. --- lib/PublicInbox/Xapcmd.pm | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/lib/PublicInbox/Xapcmd.pm b/lib/PublicInbox/Xapcmd.pm index c2b66e69..69f0af43 100644 ---

[PATCH 07/13] net: retry on EINTR and check for {quit} flag

2023-11-09 Thread Eric Wong
This should allow us to detect shutdown signals in -watch more quickly and not unnecessarily fail on inconsequential signals such as SIGWINCH. --- lib/PublicInbox/NetNNTPSocks.pm | 1 + lib/PublicInbox/NetReader.pm| 53 +++-- lib/PublicInbox/Watch.pm| 2

[PATCH 09/13] ipc: simplify partial sendmsg fallback

2023-11-09 Thread Eric Wong
In the rare case sendmsg(2) isn't able to send the full amount (due to buffers >=2GB on Linux), use print + (autodie)close to send the remainder and retry on EINTR. `substr' should be able to avoid a large malloc via offsets and CoW on modern Perl. --- lib/PublicInbox/IPC.pm | 13 +++--

[PATCH 06/13] lei ls-mail-source: gracefully handle network failures

2023-11-09 Thread Eric Wong
All network connections may fail, so try to emit a helpful error message instead of attempting to dispatch methods off `undef'. --- lib/PublicInbox/LeiLsMailSource.pm | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/LeiLsMailSource.pm

[PATCH 00/13] misc error handling stuff and simplifications

2023-11-09 Thread Eric Wong
::IO subclass. Eric Wong (13): lei_xsearch: put query in process title for debugging lei: use cached $daemon_pid when possible lei: reuse FDs atfork and close explicitly lei_up: use v5.12 net_nntp_socks: more comments around how it works lei ls-mail-source: gracefully handle network

[PATCH 05/13] net_nntp_socks: more comments around how it works

2023-11-09 Thread Eric Wong
This is convoluted as hell but I can't figure out a better way to make Net::NNTP work with SOCKS. --- lib/PublicInbox/NetNNTPSocks.pm | 11 ++- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/lib/PublicInbox/NetNNTPSocks.pm b/lib/PublicInbox/NetNNTPSocks.pm index

[PATCH 01/13] lei_xsearch: put query in process title for debugging

2023-11-09 Thread Eric Wong
Having queries in the process titles makes it easier to diagnose stuck queries due to IPC problems. This was used to diagnose commit e97a30e7624d (lei: fix SIGPIPE on large result sets to pager). --- lib/PublicInbox/LeiXSearch.pm | 12 +++- 1 file changed, 7 insertions(+), 5

[PATCH 02/13] lei: use cached $daemon_pid when possible

2023-11-09 Thread Eric Wong
->lei_daemon_pid can only be called in the top-level daemon process when $daemon_pid is valid, so avoid a getpid(2) syscall in those cases. --- lib/PublicInbox/LEI.pm | 2 +- lib/PublicInbox/LeiUp.pm | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/PublicInbox/LEI.pm

[PATCH 04/13] lei_up: use v5.12

2023-11-09 Thread Eric Wong
No unicode_strings dependencies here, AFAIK --- lib/PublicInbox/LeiUp.pm | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/lib/PublicInbox/LeiUp.pm b/lib/PublicInbox/LeiUp.pm index 0faa180d..9931f017 100644 --- a/lib/PublicInbox/LeiUp.pm +++ b/lib/PublicInbox/LeiUp.pm @@

[PATCH 03/13] lei: reuse FDs atfork and close explicitly

2023-11-09 Thread Eric Wong
We'll avoid having a redundant STDERR FD open in lei workers, and some explicit close() on `lei up' sockets reduces the likelihood of inadvertently open FDs causing processes to linger. --- lib/PublicInbox/LEI.pm | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git

[RFC] www: add topics.html endpoint [was: Query to see all new "topics"]

2023-11-08 Thread Eric Wong
Konstantin Ryabitsev wrote: > Hello: > > Following the discussion on the ksummit list [1], I wanted to give someone a > query > they could use to keep an eye on any new threads. Is there a xapian query that > can be used to effectively say "return just top-level messages and exclude any >

[PATCH] lei: fix SIGPIPE on large result sets to pager

2023-11-07 Thread Eric Wong
When dealing with large search results, we need to deal with EPIPE not just from the pager, but also EPIPE or ECONNRESET between lei_xsearch and lei2mail processes. Without this fix, lei_xsearch processes could linger and get stuck writing to dead lei2mail processes if a user aborts the pager

Re: [Question] review links are disappearing from the qemu-devel mailing-list

2023-11-06 Thread Eric Wong
"Michael S. Tsirkin" wrote: > On Mon, Nov 06, 2023 at 11:49:06AM -0500, Michael S. Tsirkin wrote: > > On Mon, Nov 06, 2023 at 02:58:18PM +, Salil Mehta wrote: > > > Hi Michael, > > > I have noticed something very strange. Review links have been > > > disappearing from the > > > Qemu-devel

[PATCH] lei_view_text: fix inverted condition

2023-11-03 Thread Eric Wong
This was causing `lei q -f text' output to be uncolored on color-capable terminals. Fixes: d3c55d072839 (treewide: use ->close to call ProcessIO->CLOSE) --- lib/PublicInbox/LeiViewText.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/LeiViewText.pm

[PATCH 15/14] ds: don't try ->close after ->accept_SSL failure

2023-11-02 Thread Eric Wong
Eric Wong wrote: > --- a/lib/PublicInbox/DS.pm > +++ b/lib/PublicInbox/DS.pm > @@ -341,8 +341,8 @@ sub greet { > my $ev = EPOLLIN; > my $wbuf; > if ($sock->can('accept_SSL') && !$sock->accept_SSL) { > - return CORE::close($sock) i

Re: lei - dfn filters for net/* catching drivers/net/*

2023-11-02 Thread Eric Wong
David Wei wrote: > Hi, > > I have a problem with lei dfn filters. Here is my query: > > lei q -o ~/Mail/overlay -I https://lore.kernel.org/all -t '(dfn:net/* OR > dfn:drivers/net/ethernet/mellanox/mlx5/* OR > dfn:drivers/net/ethernet/broadcom/bnxt/*) AND tc:net...@vger.kernel.org AND >

www: squash read_all usage fix

2023-11-02 Thread Eric Wong
Eric Wong wrote: > --- a/lib/PublicInbox/WWW.pm > +++ b/lib/PublicInbox/WWW.pm > @@ -588,7 +588,7 @@ sub stylesheets_prepare ($$) { > next; > }; > my $ctime = 0; > - my $loc

[PATCH 04/14] cindex: drop redundant close on regular FH

2023-11-02 Thread Eric Wong
There's no need to waste optree space on close() statements for file handles which are (effectively) read-only on their last use and incapable of error checking in our Perl code (since they're only read by git). Let Perl refcounting take care of it so we have less code to wade through when

[PATCH 06/14] multi_git: use autodie

2023-11-02 Thread Eric Wong
Trying to move away from half my code being "or die" statements... --- lib/PublicInbox/MultiGit.pm | 11 +-- 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/lib/PublicInbox/MultiGit.pm b/lib/PublicInbox/MultiGit.pm index 9074758a..1e8eb47a 100644 ---

[PATCH 10/14] spawn: support PerlIO layer in scalar redirects

2023-11-02 Thread Eric Wong
We have to deal with UTF-8 data for generating patches, so make it easier to pass Perl utf8 data to git, diff, sdiff, etc. to avoid "Wide character" warnings. --- lib/PublicInbox/MailDiff.pm | 3 +-- lib/PublicInbox/SearchIdx.pm | 2 +- lib/PublicInbox/Spawn.pm | 30

[PATCH 03/14] treewide: use ->close method rather than CORE::close

2023-11-02 Thread Eric Wong
It's easier-to-read and should open the door for us to get rid of `tie' for ProcessIO without performance penalties for more frequently-used perlop calls and ability to do `stat' directly on the object instead of the awkward `tied' thing. --- lib/PublicInbox/CodeSearchIdx.pm | 6 +++---

[PATCH 11/14] treewide: check alternates writes with eof + autodie

2023-11-02 Thread Eric Wong
We must use eof() combined with close() to detect errors in situations involving the readline()' op since `readline' (and most buffered I/O libraries) have weak error detection support. This fixes error detection for files opened for read/write access. The next commit will fix error detection

[PATCH 07/14] git_credential: use autodie where appropriate

2023-11-02 Thread Eric Wong
We can also rely on `say' in Perl 5.10+ to save us the trouble of printing a newline. --- lib/PublicInbox/GitCredential.pm | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/lib/PublicInbox/GitCredential.pm b/lib/PublicInbox/GitCredential.pm index ae2c..bb225ff3

[PATCH 14/14] t/cindex+extsearch: use write_file, autodie, etc.

2023-11-02 Thread Eric Wong
write_file is a new API which makes setting up config files more pleasant, while autodie and scalarref redirects (in tests) have been available for a while, now. So do what we can to reduce the code burden we have. --- t/cindex.t| 15 --- t/extsearch.t | 48

[PATCH 12/14] treewide: use eof and close to detect readline errors

2023-11-02 Thread Eric Wong
readline () isn't wrapped by autodie, and there's no way to know if read(2) errors truncated the readline output. IO::Handle->error isn't reliable on Perl < v5.34. Thus, combining the `eof' and `close' (combined with autodie) is the only way we can detect read(2) errors (injected via strace) when
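
One way the `eof' + `close' combination might look, as a sketch under assumptions: the sub name read_lines is hypothetical, and how reliably a failed read(2) shows up at close is not something this example demonstrates, since it only exercises the success path on a temporary file.

```perl
use strict;
use warnings;
use autodie qw(open close);
use File::Temp qw(tempfile);

# hypothetical helper: slurp lines, refusing to return silently truncated data
sub read_lines {
	my ($path) = @_;
	open(my $fh, '<', $path);
	my @lines = readline($fh); # returns whatever it managed to read
	# readline alone can't tell EOF apart from a failed read(2)
	eof($fh) or die "readline stopped before EOF on $path: $!";
	close($fh); # autodie throws if close reports an error
	@lines;
}

my ($tmp, $tmp_path) = tempfile(UNLINK => 1);
print $tmp "a\nb\n";
close($tmp);
my @lines = read_lines($tmp_path);
print scalar(@lines), " lines read\n";
```

Exercising the failure path needs fault injection (the message mentions strace), which is outside what a portable example can do.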

[PATCH 05/14] treewide: use ->close to call ProcessIO->CLOSE

2023-11-02 Thread Eric Wong
This will open the door for us to drop `tie' usage from ProcessIO completely in favor of OO method dispatch. While OO method dispatches (e.g. `$fh->close') are slower than normal subroutine calls, it hardly matters in this case since process teardown is a fairly rare operation and we continue to

[PATCH 09/14] io: introduce write_file helper sub

2023-11-02 Thread Eric Wong
This is a pretty convenient way to create files for diff generation in both WWW and lei. The test suite should also be able to take advantage of it. --- MANIFEST | 1 + lib/PublicInbox/IO.pm| 10 +- lib/PublicInbox/Import.pm| 6 ++

[PATCH 08/14] replace ProcessIO with untied PublicInbox::IO

2023-11-02 Thread Eric Wong
This fixes two major problems with the use of tie for filehandles: * no way to do fcntl, stat, etc. calls directly on the tied handle, forcing callers to use the `tied' perlop to access the underlying IO::Handle * needing separate classes to handle blocking and non-blocking I/O As a result,

[PATCH 13/14] move read_all, try_cat, and poll_in to PublicInbox::IO

2023-11-02 Thread Eric Wong
The IO package seems like a better home for I/O subs than the Git package. We lose the 60 second read timeout for `git cat-file --batch-*' processes since it's probably not necessary given how reliable the code has proven and things would fall over hard in other ways if the storage device were

[PATCH 02/14] ds: replace FD map hash table with array

2023-11-02 Thread Eric Wong
FDs are array indices into the kernel, anyways, so we can take advantage of space savings and speedups because the majority of FDs a big process has is going to end up in the array, anyways. --- lib/PublicInbox/DS.pm | 18 +- lib/PublicInbox/LeiStoreErr.pm | 2 +- 2
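The idea of mapping FDs with an array instead of a hash can be sketched as follows. This is an illustration only: `@fd_map', `fd_register', `fd_lookup' and `fd_forget' are hypothetical names, not the ones used in lib/PublicInbox/DS.pm.

```perl
use v5.12; # enables strict

# Hypothetical FD-to-object map: the array index is the integer file
# descriptor, so no hash key needs to be stored or hashed per entry.
my @fd_map;

# store $obj under the FD of its {sock} handle and return $obj
sub fd_register {
	my ($obj) = @_;
	$fd_map[fileno($obj->{sock})] = $obj;
}

# return the object registered for integer FD $_[0], or undef
sub fd_lookup { $fd_map[$_[0]] }

# drop the reference held for integer FD $_[0]
sub fd_forget { undef $fd_map[$_[0]] }
```

Since the kernel hands out the lowest available FD number, the array tends to stay dense, which is where the space savings mentioned above would come from.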

[PATCH 01/14] xap_helper.pm: use do_fork to Reset and reseed

2023-11-02 Thread Eric Wong
We may start using rand() in the worker someday if we need to seed a hash function for caching. It saves us some LoC in the meantime. --- lib/PublicInbox/XapHelper.pm | 7 +-- 1 file changed, 1 insertion(+), 6 deletions(-) diff --git a/lib/PublicInbox/XapHelper.pm

[PATCH 00/14] IO/IPC-related cleanups

2023-11-02 Thread Eric Wong
switch to sysread eventually to avoid the double-copy overhead of buffered bulk I/O, anyways. The only place we really benefit from userspace buffered disk reads is IdxStack, I think... The new write_file sub in 9/14 seems long overdue. Eric Wong (14): xap_helper.pm: use do_fork to Reset and

Re: [PATCH] git: reschedule cleanup if active

2023-11-01 Thread Eric Wong
> Subject: [PATCH] git: reschedule cleanup if activea There should be no trailing "a" :x

[PATCH] git: reschedule cleanup if activea

2023-11-01 Thread Eric Wong
This is necessary to reliably clean up cat-file processes for coderepos in long-lived -netd and -httpd processes if they haven't been accessed in a while. Followup-to: 33e99002c552 (git: cleanup un-associated coderepo processes) --- lib/PublicInbox/Git.pm | 3 ++- 1 file changed, 2 insertions(+),

[PATCH 1/6] ds: next_tick: shorten object lifetimes

2023-10-31 Thread Eric Wong
Drop reference counts ASAP in case it saves us some memory sooner rather than later. This ought to give us more predictable resource use and ensure OnDestroy callbacks fire sooner. There's no need to use `local' to clobber the arrayref anymore, either. AFAIK, this doesn't fix any known bug, but

[PATCH 2/6] ds: do not defer close

2023-10-31 Thread Eric Wong
We can map all integer FDs to Perl objects once ->ep_wait returns, so there's no need to play tricks elsewhere to ensure FDs can be mapped to objects within the same event loop iteration. --- lib/PublicInbox/DS.pm | 67 ++- 1 file changed, 22 insertions(+),

[PATCH 5/6] pop3: use SSL_shutdown(3ssl) if appropriate

2023-10-31 Thread Eric Wong
This allows us to support SSL session caching + reuse in the future. --- lib/PublicInbox/POP3.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/POP3.pm b/lib/PublicInbox/POP3.pm index 6d24b17c..06772069 100644 --- a/lib/PublicInbox/POP3.pm +++

[PATCH 4/6] watch: simplify DirIdle object cleanup

2023-10-31 Thread Eric Wong
There's no need to waste time nor reach into DS internals to map FDs to Perl objects, here. LEI.pm has never had to deal with integer FDs for DirIdle, either. --- lib/PublicInbox/Watch.pm | 11 --- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/lib/PublicInbox/Watch.pm

[PATCH 6/6] ds: make ->close behave like CORE::close

2023-10-31 Thread Eric Wong
Matching existing Perl IO semantics seems like a good idea to reduce confusion in the future. We'll also fix some outdated comments and update indentation to match the rest of our code base since we're far detached from Danga::Socket at this point. --- lib/PublicInbox/CidxComm.pm | 2 +-
