Re: [PATCH] TODO: add item for searching based on git-patch-id(1)

2019-10-01 Thread Konstantin Ryabitsev
On Tue, Oct 01, 2019 at 03:37:47AM +, Eric Wong wrote: > I forgot about this feature when I was implementing > blob-ID-based searches :x > --- > TODO | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/TODO b/TODO > index 2c525615..93054bb3 100644 > --- a/TODO > +++ b/TODO > @@ -112,3

Re: [PATCH] TODO: add an item for Python pygments

2019-10-08 Thread Konstantin Ryabitsev
On Tue, Oct 08, 2019 at 09:53:21PM +, Eric Wong wrote: > Konstantin Ryabitsev wrote: > > On Thu, Sep 26, 2019 at 01:59:53AM +, Eric Wong wrote: > > > I had my reservations about relying on highlight.pm; and this > > > confirms them, unfortunately :< Oh well

Re: workflow problems and possible public-inbox solutions

2019-10-18 Thread Konstantin Ryabitsev
On Fri, Oct 18, 2019 at 03:25:16AM +, Eric Wong wrote: It seems like there's a bunch of problems the workflows@vger list is trying to solve. Some can be solved independently of others. Here's a summary of them and where items in the https://public-inbox.org/TODO list can be possible solutions (I

Re: how's memory usage on public-inbox-httpd?

2019-10-18 Thread Konstantin Ryabitsev
On Wed, Oct 16, 2019 at 10:10:45PM +, Eric Wong wrote: This is an old-ish discussion, but we finally had a chance to run the httpd daemon for a long time without restarting it to add more lists, and the memory usage on it is actually surprising: $ ps -eF | grep public-inbox publici+ 17741

Re: how's memory usage on public-inbox-httpd?

2019-10-22 Thread Konstantin Ryabitsev
On Sat, Oct 19, 2019 at 12:11:44AM +, Eric Wong wrote: It's been definitely dramatically better. We keep adding lists to lore, so I haven't really been able to watch memory usage after a long period of daemon uptime, but it's never really gone very much above 1GB. In fact, we're downgrading

RFC: monthly epochs for v2

2019-10-24 Thread Konstantin Ryabitsev
Hi, all: With public-inbox now providing manifest files, it is easy to communicate to mirroring services when an epoch rolls over. What do you think if we make these roll-overs month-based instead of size-based? So, instead of: git/ 0.git 1.git 2.git it becomes git/ 201908.git 201909.
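
To make the proposed naming concrete, here is a minimal sketch of how a month-based epoch name could be derived from a date. The function names are hypothetical and only illustrate the `201908.git` pattern quoted in the message; nothing here is part of public-inbox itself.

```python
from datetime import date
from typing import List


def monthly_epoch_name(day: date) -> str:
    """Return a hypothetical month-based epoch directory name such as 201908.git."""
    return f"{day.year:04d}{day.month:02d}.git"


def epochs_between(first: date, last: date) -> List[str]:
    """List the monthly epoch names from the month of `first` through the month of `last`."""
    names = []
    year, month = first.year, first.month
    while (year, month) <= (last.year, last.month):
        names.append(f"{year:04d}{month:02d}.git")
        month += 1
        if month > 12:
            # Roll over into January of the following year.
            year, month = year + 1, 1
    return names
```

With this scheme a mirror could predict the next epoch name from the calendar alone, whereas size-based epochs (0.git, 1.git, ...) only appear once the previous one fills up.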

Re: RFC: monthly epochs for v2

2019-10-24 Thread Konstantin Ryabitsev
On Thu, Oct 24, 2019 at 08:35:03PM +, Eric Wong wrote: Epoch size should be configurable, yes. But I'm against time periods such as months or years being a factor for rollover. Many inboxes (including this one) can go idle for weeks/months; and activity can be unpredictable if there's surges

Re: RFC: monthly epochs for v2

2019-10-25 Thread Konstantin Ryabitsev
On Fri, Oct 25, 2019 at 12:22:14PM +, Eric Wong wrote: I'm not sure about a libpublicinbox... I have been really hesitant to depend on shared C/C++ libraries whenever I use Perl or Ruby because of build and install complexity; especially for stuff that's not-yet-available on distros. Well-de

Re: RFC: monthly epochs for v2

2019-10-29 Thread Konstantin Ryabitsev
On Tue, Oct 29, 2019 at 10:03:43AM -0500, Eric W. Biederman wrote: > So not monthly epochs. But it would be very handy to have a > public-inbox command that refreshes git mirrors. It would > be even more awesome if there was something like the IMAP IDLE command > in http that would let

Re: Archiving HTML mail

2019-11-12 Thread Konstantin Ryabitsev
On Tue, Nov 12, 2019 at 10:29:32PM +, Eric Wong wrote: > > You have to rewrite the HTML parts anyway, to resolve RFC 2392 cid: > > links, prior to handing them to web browsers. I don't think web > > browsers support them. Neither over HTTP, nor browsing locally. > > Yeah. I guess it could b

Re: Archiving HTML mail

2019-11-13 Thread Konstantin Ryabitsev
On Tue, Nov 12, 2019 at 11:10:36PM +, Eric Wong wrote: > > Now that public-inbox-mda supports list-id (THANK YOU!), my life > > moderating PI_EMERGENCY is much easier. For lore.kernel.org, > > emergency collects about a thousand messages a week. My Friday > > afternoon routine is usually to

Filter example for ML footers

2019-11-16 Thread Konstantin Ryabitsev
Hello: Groups.io adds a super-obnoxious footer to all outgoing messages, and I would like to be able to filter that out. Example: https://lore.kernel.org/keys/2019161300.hc7vb7rcb45gsqmg@chatter.i7.local/ The obnoxious footer can be either part of the main body (default "chaotic evil" versi

Limited-history local archives

2020-01-03 Thread Konstantin Ryabitsev
Hi, all: I wonder if it would be useful to have a feature allowing someone to run a limited-history local copy of a larger remote archive -- for example if someone only wanted a 3-month copy of LKML instead of the whole 20-year enchilada. It's possible to accomplish this with git already [^1],

Minimalist public-inbox feed: sendmail-pi-feed

2020-01-21 Thread Konstantin Ryabitsev
Hi, all: I was trying to create a "simplest possible" way to maintain a public-inbox developer feed, and this is the end-result: https://git.kernel.org/pub/scm/linux/kernel/git/mricon/korg-helpers.git/tree/sendmail-pi-feed It's written to be used with git-send-email, but can, theoretically, be

Attestation signatures in a separate ref

2020-02-07 Thread Konstantin Ryabitsev
ref containing just PGP-signed metadata of each message. refs/heads/master:m From: Foo Foo To: linux-ker...@vger.kernel.org Message-Id: Date: Fri, 7 Feb 2020 13:43:34 -0500 Subject: [PATCH] add foo to bar We need bar in foo! Signed-off-by: Konstantin Ryabitsev --- foo | 1

How to force stricter threading

2020-03-09 Thread Konstantin Ryabitsev
Hello: I think public-inbox currently does some heuristic-based threading, which may actually not be that useful. For example: https://lore.kernel.org/linux-renesas-soc/20200217101741.3758-1-geert+rene...@glider.be/ None of the [PATCH] messages have references or in-reply-to set, but for some

Re: How to force stricter threading

2020-03-19 Thread Konstantin Ryabitsev
On Thu, Mar 19, 2020 at 07:58:20AM +, Eric Wong wrote: > > So the "Patchwork summary for: linux-renesas-soc" message: > > > > https://lore.kernel.org/linux-renesas-soc/158229483332.12219.5639020605006542672.git-patchwork-summ...@kernel.org/raw > > > > has the following header: > > > > Refere

Re: mail header indexing additions

2020-04-22 Thread Konstantin Ryabitsev
On Mon, Apr 20, 2020 at 01:53:17AM +, Eric Wong wrote: > I'm probably going to start indexing List-Id: headers by > default, and have `lid:' be the search prefix for inboxes > which combine multiple lists and may have unstable email > addresses. This would be handy indeed! > Anything else tha

Re: [PATCH] doc: add clients.txt

2020-04-27 Thread Konstantin Ryabitsev
On Mon, Apr 27, 2020 at 08:57:08PM +, Eric Wong wrote: > +* kernel.org helpers, including get-lore-mbox and sendmail-pi-feed > + https://git.kernel.org/pub/scm/linux/kernel/git/mricon/korg-helpers.git I'd rather it listed b4 instead, as it's a successor to get-lore-mbox: https://git.kernel.or

Re: how's memory use? May 2020 edition

2020-05-14 Thread Konstantin Ryabitsev
On Tue, May 12, 2020 at 08:37:34AM +, Eric Wong wrote: > Hey all, if possible; I'd like to know the memory use of your > daemons (particularly -httpd), relevant pmap(1) (or equivalent) > output, and version of public-inbox in use. This is on lore.kernel.org. We upgraded to 1.5.0 yesterday, so

Search based on data in follow-ups

2020-05-26 Thread Konstantin Ryabitsev
Hello: I suspect this would be Pretty Hard To Do, but wanted to mention it on the list anyway, just as a "musing out loud." It would be cool to be able to exclude/include results based on conditions in thread follow-ups. E.g.: - (subject contains "PATCH") AND (follow-up from test...@example.co

Re: Search based on data in follow-ups

2020-05-27 Thread Konstantin Ryabitsev
On Tue, May 26, 2020 at 09:35:50PM +, Eric Wong wrote: > > - (subject contains "PATCH") AND (follow-up from test...@example.com > > that contains "Passed") AND NOT (follow-up from me that contains > > "Applied|NACK") > > > > I expect this would require client-side filtering, though, as I can

Re: thoughts on Git::Raw / libgit2?

2020-06-22 Thread Konstantin Ryabitsev
On Tue, Jun 16, 2020 at 09:40:51PM +, Eric Wong wrote: > Git::Raw is not packaged with CentOS 7.x; but cpan/cpanm is an > option. It is in Debian 10.x as libgit-raw-perl, so I can > report bugs via Debian's BTS[*]. FYI, even though lore.kernel.org runs on CentOS 7.x, most perl modules come f

Re: what storage system(s) are you using?

2020-08-06 Thread Konstantin Ryabitsev
On Wed, Aug 05, 2020 at 03:11:27AM +, Eric Wong wrote: > I've been mostly using ext4 on SSDs since I started public-inbox > and it works well. As you know, I hope to move lore.kernel.org to a system with a hybrid lvm-cache setup, specifically: 12 x 1.8TB rotational drives set up in a lvm rai

Re: Could public-inbox do something helpful with .mailmap?

2020-08-17 Thread Konstantin Ryabitsev
On Mon, Aug 17, 2020 at 08:17:37PM -0500, Eric W. Biederman wrote: > They have an update to their preferred email address in the .mailmap > in the linux-kernel source. Is there any chance public-inbox could > look at .mailmap and do something useful in the web interface? > > Perhaps display an al

Re: message bloat over time...

2020-09-02 Thread Konstantin Ryabitsev
On Wed, Sep 02, 2020 at 07:05:25PM +, Eric Wong wrote: > I've been indexing and reindexing a local mirror of > https://lore.kernel.org/lkml a bit, and it's kinda depressing to > see newer messages being more and more bloated even on a > plain-text-only mailing list :< > > The first column ("$X

Re: Converting Public-Inbox archived messages into mbox

2020-09-04 Thread Konstantin Ryabitsev
On Fri, Sep 04, 2020 at 10:40:24PM +0200, Simon Eigeldinger wrote: > Hi all, > > I know and I guess that has been asked a few times. > Is it possible to convert a git repo with messages to a mbox file and > how is that done? > > According to the install guide for Public-Inbox you don't need to >

Re: brain dump detached/external index so far...

2020-09-14 Thread Konstantin Ryabitsev
On Sun, Sep 13, 2020 at 06:55:50AM +, Eric Wong wrote: > Currently (and since the earliest days of this project > supporting Xapian), indices were per-inbox. This allowed > inboxes to be isolated, making it easy to add and remove > inboxes. > > The detached/external indices will allow a merg

Epoch roll-over with imap

2020-09-17 Thread Konstantin Ryabitsev
Good morning, and congratulations on 1.6.0! I'm starting to play with the new imapd mode (currently using the imap daemon on public-inbox.org), and I am curious how we can make it obvious to the clients that there is a new epoch available. For example, if someone configures mbsync to fetch thin

2 problems with listid matching

2020-09-21 Thread Konstantin Ryabitsev
Hello: Attempting to subscribe radio...@radiotap.org has highlighted two problems with list-id matching. When the email comes in from the mailing list, the header is set as: List-Id: radiotap.NetBSD.org Public-inbox doesn't find this because the above list-id header is not compliant with th
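
The header quoted above lacks the angle brackets that RFC 2919 places around the list identifier. The sketch below shows one way a strict and a lenient extraction could differ. The function name, the `lenient` flag, and the lowercasing used for comparison are assumptions made for illustration, not a description of how public-inbox matches list-ids.

```python
import re
from typing import Optional

# RFC 2919 form: optional phrase followed by <list-id>
_BRACKETED = re.compile(r"<\s*([^<>\s]+)\s*>")


def extract_list_id(header_value: str, lenient: bool = False) -> Optional[str]:
    """Return the list identifier from a List-Id header value, or None.

    Strict mode accepts only the bracketed form. Lenient mode (an assumption
    for this sketch) also accepts a bare token with no brackets or spaces.
    The result is lowercased so that comparisons ignore case.
    """
    match = _BRACKETED.search(header_value)
    if match:
        return match.group(1).lower()
    if lenient:
        bare = header_value.strip()
        if bare and not re.search(r"[\s<>]", bare):
            return bare.lower()
    return None
```

In strict mode the bare value `radiotap.NetBSD.org` yields no identifier, which matches the symptom described in the message; lenient mode would accept it.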

Thoughts on search-based imap mailboxes

2020-10-02 Thread Konstantin Ryabitsev
Hello: While discussing something else on the kernel.org users list, the question of "virtual inbox folders" came up when talking about imap and public-inbox. Here's how I imagine it could work in a way that doesn't require any kind of real user management. - any site visitor can create a save

Re: Thoughts on search-based imap mailboxes

2020-10-03 Thread Konstantin Ryabitsev
On Fri, Oct 02, 2020 at 08:08:30PM +, Eric Wong wrote: > Konstantin Ryabitsev wrote: > > Hello: > > > > While discussing something else on the kernel.org users list, the > > Btw, is this list public? It's not, because it's supposed to be just for pe

Subscribing to public-inbox lists using grokmirror + procmail

2020-10-07 Thread Konstantin Ryabitsev
Hi, all: I needed a way to pipe public-inbox straight into patchwork.kernel.org without having to manage yet another list subscription with postfix pipe integration. Then I realized that it's generally useful as a way to deliver straight from public-inbox archives into local inboxes -- similar

Announce: ezpi python library for writing to public-inbox v2 repos

2020-10-21 Thread Konstantin Ryabitsev
Hello: I am writing a tool that would provide an "audit feed" of all pushes performed to git.kernel.org, so I needed a way to write to public-inbox v2 format repositories from Python. I figured this may be useful as a standalone library, so I published it as "ezpi": https://pypi.org/project/ez

Re: [PATCH 00/52] detached external index: mostly

2020-10-27 Thread Konstantin Ryabitsev
On Tue, Oct 27, 2020 at 07:54:01AM +, Eric Wong wrote: > ...and mostly wired up for WWW, but requires manual config > editing atm. Needs docs and tests, and IMAP support. Great progress! I look forward to reading the forthcoming docs. :) -K

Re: WIP: searching all of lore

2020-12-01 Thread Konstantin Ryabitsev
On Thu, Nov 26, 2020 at 07:45:43PM +, Eric Wong wrote: > Requires Tor, for now: > > http://rskvuqcfnfizkjg6h5jvovwb3wkikzcwskf54lfpymus6mxrzw67b5ad.onion/all/ > http://lore.czquwvybam4bgbro.onion/all/ Thanks for this work, Eric, things are looking good in my tests, though I uncovered a bunch

Re: WIP: searching all of lore

2020-12-08 Thread Konstantin Ryabitsev
On Sat, Dec 05, 2020 at 08:07:17PM +, Eric Wong wrote: > Per-inbox search also uses subset search, so all the existing > inboxes should be searchable on an individual level, not just /all: > http://rskvuqcfnfizkjg6h5jvovwb3wkikzcwskf54lfpymus6mxrzw67b5ad.onion/lkml/ > http://rskvuqcfnfizkjg6h5j

Re: WIP: searching all of lore

2020-12-08 Thread Konstantin Ryabitsev
On Tue, Dec 08, 2020 at 06:02:32PM +, Eric Wong wrote: > > So, are things to the point where we only need a single xapian db > > for all lists, or do we still need to keep individual list indexes? > > Only indexlevel=basic (sqlite) for individual lists. This saves > a bunch of FDs and provid

Extra newline when retrieving messages

2020-12-10 Thread Konstantin Ryabitsev
Hello: While investigating why some of the messages retrieved via lore.kernel.org were failing DKIM checks, I realized that public-inbox-httpd appends an extra newline to message bodies. This newline isn't present in git backends, just in messages retrieved via (at least) public-inbox-httpd.

Re: Extra newline when retrieving messages

2020-12-10 Thread Konstantin Ryabitsev
On Thu, Dec 10, 2020 at 08:55:40PM +, Eric Wong wrote: > Konstantin Ryabitsev wrote: > > Hello: > > > > While investigating why some of the messages retrieved via > > lore.kernel.org were failing DKIM checks, I realized that > > public-inbox-httpd appends a

Re: Extra newline when retrieving messages

2020-12-10 Thread Konstantin Ryabitsev
On Thu, 10 Dec 2020 at 16:43, Konstantin Ryabitsev wrote: > This is what causes DKIM verification to fail, and NOT the newline: for the record, the DKIM RFC specifically deals with extra trailing newlines: The "simple" body canonicalization algorithm ignores all empty lines
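
As an illustration of the rule quoted above, here is a minimal sketch of the "simple" body canonicalization from the DKIM RFC: trailing empty lines are dropped and the body always ends in a single CRLF. The helper names are hypothetical, the input is assumed to already use CRLF line endings, and SHA-256 is assumed as the hash algorithm.

```python
import base64
import hashlib


def canonicalize_body_simple(body: bytes) -> bytes:
    """Apply DKIM "simple" body canonicalization to a CRLF-terminated body."""
    # Drop empty lines at the end of the body, one CRLF at a time.
    while body.endswith(b"\r\n\r\n"):
        body = body[:-2]
    # An empty body, or one lacking a final CRLF, gets a CRLF appended.
    if not body.endswith(b"\r\n"):
        body += b"\r\n"
    return body


def body_hash_sha256(body: bytes) -> str:
    """Return the base64 SHA-256 digest of the canonicalized body (the bh= value)."""
    digest = hashlib.sha256(canonicalize_body_simple(body)).digest()
    return base64.b64encode(digest).decode("ascii")
```

Under this rule `body_hash_sha256(b"hi\r\n")` and `body_hash_sha256(b"hi\r\n\r\n")` are equal, which is why an extra trailing newline alone should not break verification.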

Re: Extra newline when retrieving messages

2020-12-10 Thread Konstantin Ryabitsev
On Thu, Dec 10, 2020 at 10:38:47PM +, Eric Wong wrote: > Konstantin Ryabitsev wrote: > > > > That said, should public-inbox consider this case when generating the > > /raw and /t.mbox.gz messages? If the Archived-At and List-Archive > > headers are listed i

Re: are Perl regexps well-known enough for command-line use?

2020-12-15 Thread Konstantin Ryabitsev
On Mon, Dec 14, 2020 at 08:39:38PM +, Eric Wong wrote: > I've been thinking a bit about UI/UX for local command-line > tooling, and one thing I've been pondering is exposing Perl5 > regexps as a mechanism for filtering > mailboxes/newsgroups/URLs/pathnames, etc... I think it's best to stick to

public-inbox + mlmmj best practices?

2020-12-21 Thread Konstantin Ryabitsev
Hello: One of our projects is looking at mailing list hosting and I was wondering if I should steer them towards public-inbox + mlmmj as opposed to things like the moribund googlegroups, groups.io, etc. I know meta uses mlmmj, but there don't appear to be many docs on how things are organized beh

Re: About header filtering

2020-12-22 Thread Konstantin Ryabitsev
On Tue, Dec 22, 2020 at 08:37:04AM +0100, Uwe Kleine-König wrote: > I found that Konstantin Ryabitsev's tool to prepare an initial archive > from an already existing mailing list[1] filters some of these out, but > the instance on kernel.org has some of these details, too. (See for > example > http

Re: About header filtering

2020-12-23 Thread Konstantin Ryabitsev
On Tue, Dec 22, 2020 at 11:21:18PM +0100, Uwe Kleine-König wrote: > > 2. the goal of lore.kernel.org is maximum transparency, so we include > >everything that our own systems add to the headers in an attempt to show > >that "there's nothing up our sleeves" > > > > > I could handcraft a pre

Re: public-inbox + mlmmj best practices?

2020-12-28 Thread Konstantin Ryabitsev
On Tue, Dec 22, 2020 at 06:28:08AM +, Eric Wong wrote: > Eric Wong wrote: > > > > There's scripts/ssoma-replay which was v1-only and dependent on > > ssoma. I've been meaning to convert into something that reads > > NNTP so it's not locked into public-inbox. Maybe it could be > > part of `l

Re: thoughts improving duplicate message handling...

2020-12-29 Thread Konstantin Ryabitsev
On Mon, Dec 28, 2020 at 09:41:46PM +, Eric Wong wrote: > a) There are occasionally resent revisions of patches with the >same Message-ID This is broken and is unwanted. :) > b) More often, a cross-posted message has different trailers >depending on which list it was posted to. (And t

Re: public-inbox + mlmmj best practices?

2021-01-04 Thread Konstantin Ryabitsev
On Mon, Dec 28, 2020 at 09:31:39PM +, Eric Wong wrote: > AFAIK, V2Writable always does the right thing on -purge/-edit; > at least for WWW users(*). > > V2W does more work in rare cases when history gets rewritten, > but doesn't track anything beyond the latest indexed commit > hash. > > In t

Re: generic message-id redirector

2021-02-01 Thread Konstantin Ryabitsev
On Mon, Feb 01, 2021 at 02:26:30PM +0100, Uwe Kleine-König wrote: > > PublicInbox::NewsWWW fallback lets //$host/$message_id work (no /r/). > > It can be run as a standalone PSGI, too, see examples/newswww.psgi > > Huh, it seems I have to dig deeper into the internals of Plack. Thanks. > > > At l

Re: generic message-id redirector

2021-02-02 Thread Konstantin Ryabitsev
On Tue, Feb 02, 2021 at 09:08:10AM +0100, Uwe Kleine-König wrote: > (It seems from the outside I have to use /r/ though for lore.kernel.org, > https://lore.kernel.org/20201215212228.185517-2-clemens.gru...@pqgruber.com > at least doesn't work.) That's just a side-effect of our setup -- we define a

Re: [PATCH 3/3] t/www_listing: require grok-pull version 2 or later

2021-02-22 Thread Konstantin Ryabitsev
On Sun, Feb 21, 2021 at 10:20:13PM +, Eric Wong wrote: > > This was tested with the latest release of Grokmirror, v2.0.7. Note > > that the "pull" and "fsck" sections are required even though they're > > empty. Hmm... That grok-pull requires the [fsck] section is a bug that I introduced in on

watch a simple dir

2021-02-26 Thread Konstantin Ryabitsev
Hello: I'm playing around with using public-inbox as the archiving subsystem for mlmmj. I know it's possible to simply configure a local address to deliver to public-inbox-mda [1], but I wonder if there's a way to reduce complexity and simply configure public-inbox-watch to monitor mlmmj's "archiv

Re: watch a simple dir

2021-02-26 Thread Konstantin Ryabitsev
On Fri, 26 Feb 2021 at 09:38, Konstantin Ryabitsev wrote: > - it's a simple flat dir of numeric files corresponding to the number of the > message in the index > - each message is a valid rfc2822 document -- in fact, if I copy them into the > "new" folder of any ma
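
The directory layout described above (a flat directory of numerically named files, each a complete message) can be read in order with a few lines of code. This is only a sketch under that description; the function name and the directory argument are hypothetical, and it is not how public-inbox-watch works.

```python
import os
from email import policy
from email.message import EmailMessage
from email.parser import BytesParser
from typing import Iterator, Tuple


def iter_numbered_messages(archive_dir: str) -> Iterator[Tuple[int, EmailMessage]]:
    """Yield (number, message) pairs from a flat directory of numeric files.

    Files whose names are not plain ASCII digits are skipped. Messages are
    yielded in numeric order, so "10" comes after "2".
    """
    numbered = sorted(
        (int(name), name)
        for name in os.listdir(archive_dir)
        if name.isascii() and name.isdigit()
    )
    parser = BytesParser(policy=policy.default)
    for number, name in numbered:
        with open(os.path.join(archive_dir, name), "rb") as handle:
            yield number, parser.parse(handle)
```

Sorting by the integer value rather than the file name matters here, because a plain string sort would place "10" before "2".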

Re: release timelines (-extindex, JMAP, lei)

2021-03-08 Thread Konstantin Ryabitsev
On Fri, Mar 05, 2021 at 10:20:19PM +, Eric Wong wrote: > So I think the -extindex stuff might be stable and suitable for > general consumption. The HTML WWW UI around -extindex has some > rough edges but nothing that would take too much effort to fix. \o/ > But I'm deeply worried about unlea

Re: WIP: searching all of lore

2021-03-17 Thread Konstantin Ryabitsev
On Wed, Mar 17, 2021 at 01:11:16AM -0600, Eric Wong wrote: > Eric Wong wrote: > > Requires Tor, for now: > > > > http://rskvuqcfnfizkjg6h5jvovwb3wkikzcwskf54lfpymus6mxrzw67b5ad.onion/all/ > > http://lore.czquwvybam4bgbro.onion/all/ > > Also available without Tor: > > https://yhbt.net/lore

Re: WIP: searching all of lore

2021-03-17 Thread Konstantin Ryabitsev
On Wed, Mar 17, 2021 at 08:18:43PM +0200, Eric Wong wrote: > > Is that intentional, or can this be tweaked to show a single result for the > > same message-id? > > Not really. At least for the summary search results, it makes > no sense: > > https://public-inbox.org/meta/20210317181408.912

Re: upcoming perl v5.12 requirement...

2021-04-13 Thread Konstantin Ryabitsev
On Tue, Apr 13, 2021 at 03:44:35PM -0400, Eric Wong wrote: > On a side note, I'm strongly considering moving to Perl 5.12 > after public-inbox 1.7 is released. perl 5.12.4 will be a > decade old in a few months (however 5.12.5 was Nov 2012). As long as we can set the low bar at 5.16, I'm good wit

c1b912dea25 breaks make test on CentOS7

2021-04-20 Thread Konstantin Ryabitsev
Something I noticed today: $ make test PERL_DL_NONLAZY=1 "/usr/bin/perl" "-MExtUtils::Command::MM" "-MTest::Harness" "-e" "undef *Test::Harness::Switches; test_harness(0, 'blib/lib', 'blib/arch')" t/*.t t/address.t .. ok t/admin.t .

Re: [PATCH 0/2] CentOS 7 fixes

2021-04-20 Thread Konstantin Ryabitsev
On Wed, Apr 21, 2021 at 12:02:04AM +0500, Eric Wong wrote: > Konstantin Ryabitsev wrote: > > t/admin.t Warning: Use of "ref" without > > parentheses is ambiguous at > > /usr/local/share/public-inbox/blib/lib/PublicInbox/TestCommon.pm l

t/lei-daemon.t failure when PERL_INLINE_DIRECTORY is set

2021-04-20 Thread Konstantin Ryabitsev
While playing with libgit2/Gcf2, I've discovered that t/lei-daemon.t will fail when PERL_INLINE_DIRECTORY is set: t/lei-daemon.t ... 18/? # Failed test 'connect error noted' # at t/lei-daemon.t line 80. # '' # doesn't ma

Re: t/lei-daemon.t failure when PERL_INLINE_DIRECTORY is set

2021-04-20 Thread Konstantin Ryabitsev
On Tue, Apr 20, 2021 at 08:38:54PM +, Eric Wong wrote: > > t/lei-daemon.t ... 18/? > > # Failed test 'connect error noted' > > # at t/lei-daemon.t line 80. > > # '' > > # doesn't match '(?^:\bconnect\()' > > # Looks like you failed 1

Re: [PATCH] t/lei-daemon: skip inaccessible socket test as root

2021-04-21 Thread Konstantin Ryabitsev
On Tue, Apr 20, 2021 at 10:06:08PM +, Eric Wong wrote: > Konstantin Ryabitsev wrote: > > While poking around, I've discovered that it only fails when "make test" > > runs > > as root (don't judge -- this is in a throwaway lab VM): > > > Ho

newer xapian and git packages for CentOS7

2021-04-21 Thread Konstantin Ryabitsev
Sending a quick note here since this can be of interest to others. If you want to run public-inbox on CentOS-7 with newer git and xapian, you can use the following packages I am maintaining: https://copr.fedorainfracloud.org/coprs/icon/lfit/packages/ To enable on your system: yum install yum-pl

Re: newer xapian and git packages for CentOS7

2021-04-23 Thread Konstantin Ryabitsev
On Fri, Apr 23, 2021 at 02:02:24AM -0400, Eric Wong wrote: > > If you want to run public-inbox on CentOS-7 with newer git and xapian, you > > can > > use the following packages I am maintaining: > > > > https://copr.fedorainfracloud.org/coprs/icon/lfit/packages/ > > Btw, I noticed xapian14-bindi

Re: newer xapian and git packages for CentOS7

2021-04-23 Thread Konstantin Ryabitsev
On Fri, Apr 23, 2021 at 07:18:49PM +, Eric Wong wrote: > > > Btw, I noticed xapian14-bindings isn't packaged. If it were, it > > > would include the better-maintained Xapian.pm (SWIG) binding. > > > Getting Search::Xapian (XS) from CPAN would no longer be > > > necessary. > > > > Hmm... I mos

Setup pointers for extindex and /all

2021-04-23 Thread Konstantin Ryabitsev
Eric: I'm working on the new incarnation of lore.kernel.org (that will run on multiple frontends as opposed to the centralized version we have now) -- I hope to have everything ready to go by the time 1.7.x rolls out. I wonder if you can give some pointers for extindex and /all, specifically: - w

lei-managed pseudo mailing lists

2021-04-26 Thread Konstantin Ryabitsev
Hello: One of the services I think would be interesting to provide is the ability for people to subscribe to "curated saved searches". For example, a kernel subsystem maintainer can define a set of query parameters (a thread mentions these files/functions/terms, etc), and allow others to follow this s

Re: lei-managed pseudo mailing lists

2021-04-26 Thread Konstantin Ryabitsev
On Mon, Apr 26, 2021 at 05:37:26PM +, Eric Wong wrote: > > The latter is specifically something I think would be of interest to kernel > > folks, so I envision that we'd have something like the following: > > > > - a maintainer publishes a configuration file we can pass to lei > > The command

Re: lei-managed pseudo mailing lists

2021-04-26 Thread Konstantin Ryabitsev
On Mon, Apr 26, 2021 at 06:47:17PM +, Eric Wong wrote: > > I'm thinking we need the ability to make it a real clonable repository -- > > perhaps without its own xapian index? Actual git repositories aren't large, > > especially if they are only used for direct git operations. Disk space is > >

Recording archiver origins in git

2021-06-28 Thread Konstantin Ryabitsev
Hello: I'm working away on grokmirror+public-inbox replication, and I'm trying to come up with a good solution for passing the "archiver origins" info. In examples/grok-pull.post_update_hook.sh, we try to get this information out of a curl call to the clone origin, but this may not be reliable for

Re: Recording archiver origins in git

2021-06-29 Thread Konstantin Ryabitsev
On Mon, Jun 28, 2021 at 10:12:36PM +, Eric Wong wrote: Hope you're finding ways of staying cool and sane. It's hot here on the East coast, but a) we're used to it, and b) it's not yikes degrees. > > Imaginary code snippet: > > > > $ git show refs/meta/origins:i > > [metadata] > > source = sm

Re: empty /manifest.js.gz response as of 520be116

2021-06-29 Thread Konstantin Ryabitsev
On Sun, Jun 27, 2021 at 04:28:37PM -0400, Kyle Meyer wrote: > I recently upgraded a server from 08b649735 to 5860b498a and noticed > that grok-pull didn't bring in any updates. It looks like what's going > on is that the top-level /manifest.js.gz endpoint is now coming up > empty. BTW, I just add

Re: Recording archiver origins in git

2021-06-29 Thread Konstantin Ryabitsev
On Tue, Jun 29, 2021 at 07:59:57PM +, Eric Wong wrote: > > My thinking is that with mirrors of mirrors of mirrors, if someone submits a > > GDPR removal request, then there should be an easy way of figuring out where > > these requests should actually go. Maybe infourl can cover this, but it's

Restarting daemons on config file change

2021-07-19 Thread Konstantin Ryabitsev
Hello: Something I stumbled on today is the need to have the -httpd and -nntpd daemons reread the config file after we've mirrored and initialized new inboxdirs. The situation is: - public-inbox-{httpd,nntpd} are running as systemd services as user "publicinbox" - the mirroring and initializati

Re: Restarting daemons on config file change

2021-07-20 Thread Konstantin Ryabitsev
On Mon, Jul 19, 2021 at 08:49:35PM +, Eric Wong wrote: > > The best I can think of is a systemd watcher service that automatically > > restarts the daemons when the config file is modified, but I wanted to check > > here first to see if perhaps I'm missing something simpler. > > Yes, a systemd

Re: Restarting daemons on config file change

2021-07-20 Thread Konstantin Ryabitsev
On Tue, Jul 20, 2021 at 08:34:33PM +, Eric Wong wrote: > > Okay, let me see what I can come up with. Looks like the best course of > > action > > is to: > > > > 1. use a global blocking lock > > 2. copy the config file to a new location > > 3. make the necessary changes to the temporary confi
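
The first three steps quoted above (take a global blocking lock, copy the config, edit the copy) can be sketched as follows. The quoted list is cut off after step 3, so the final atomic rename into place is an assumption of this sketch, as are the function name, the `.tmp` suffix, and the separate lock file. It relies on `fcntl`, so it is Unix-only.

```python
import fcntl
import os
import shutil
from typing import Callable


def update_config_locked(config_path: str, lock_path: str,
                         edit: Callable[[str], None]) -> None:
    """Edit a config file through a temporary copy while holding an exclusive lock."""
    temp_path = config_path + ".tmp"  # hypothetical temporary location
    with open(lock_path, "a") as lock_file:
        # Step 1: global blocking lock; waits until any other holder releases it.
        fcntl.flock(lock_file, fcntl.LOCK_EX)
        try:
            # Step 2: copy the config file to a new location.
            shutil.copy2(config_path, temp_path)
            # Step 3: make the necessary changes to the temporary copy.
            edit(temp_path)
            # Assumed final step: atomically move the edited copy into place.
            os.replace(temp_path, config_path)
        finally:
            # Clean up the copy if the edit failed before the rename.
            if os.path.exists(temp_path):
                os.unlink(temp_path)
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

Because readers only ever see the old file or the fully edited one, a daemon that rereads its config cannot observe a half-written file.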

Re: Restarting daemons on config file change

2021-07-20 Thread Konstantin Ryabitsev
On Tue, Jul 20, 2021 at 09:07:24PM +, Eric Wong wrote: > > I figured as much, but we do want to set extra keys *and* write the config > > in > > a certain order (e.g. prioritizing some sources over others by listing them > > first). I currently do this via a list-id globbing match (e.g. > > --

Re: [PATCH 0/2] things to make mirroring easier

2021-07-21 Thread Konstantin Ryabitsev
On Wed, Jul 21, 2021 at 02:05:48PM +, Eric Wong wrote: > publicinbox..boost is now supported (it should be obvious > higher numbers are handled first, because OS scheduler > "priority" always confuses me :x) > > And -init handles arbitrary "-c KEY=VALUE" things like > git-config and allows mul

--batch-size and --jobs combination

2021-07-29 Thread Konstantin Ryabitsev
Hello: Is there any specific logic for mixing --batch-size and --jobs? On a system with plenty of CPUs and lots of RAM, does it make sense to have more --jobs, larger --batch-size, or some balance of both? -K

Re: --batch-size and --jobs combination

2021-07-29 Thread Konstantin Ryabitsev
On Thu, Jul 29, 2021 at 09:13:21PM +, Eric Wong wrote: > jobs will be bound by I/O capability for your case. SATA-2 vs > SATA-3 vs NVME will have a notable difference, as does the > quality of the device (MLC, TLC, QLC; cache/controller). So, on these systems with large lvmcache disks, large

Re: --batch-size and --jobs combination

2021-08-01 Thread Konstantin Ryabitsev
On Thu, Jul 29, 2021 at 10:06:29PM +, Eric Wong wrote: > My gut says 1g batch-size seems too high (Xapian has extra > overhead) and could still eat too much into the kernel cache > (and slow down reads). 100m might be a more reasonable limit > for jobs=4 and 128G RAM. Okay, I have things up an

boost and /all/

2021-08-02 Thread Konstantin Ryabitsev
Eric: I'm not sure if boost is quite working correctly (or it's possible I'm not doing something right). E.g. I have the following thread: https://x-lore.kernel.org/all/d8319fd18df7086b12cdcc23193c313893aa071a.1627362340.git.viresh.ku...@linaro.org/T/#u It was sent to several vger destinations,

Re: boost and /all/

2021-08-02 Thread Konstantin Ryabitsev
On Mon, Aug 02, 2021 at 09:16:56PM +, Eric Wong wrote: > > I believe this should assign boost value of 0, but the virtualization source > > seems to be "winning" both via the interface and when retrieving t.mbox.gz. > > Anything I'm not doing right? > > Correct, boost=0 is the default. Did yo

Re: [PATCH 0/2] extindex boost + over-flushing fixes

2021-08-04 Thread Konstantin Ryabitsev
On Wed, Aug 04, 2021 at 10:02:46AM +, Eric Wong wrote: > 2/2 fixes the boost bug reported by Konstantin, and 1/2 > fixes an accounting error which may improve indexing > performance. Thank you, trying it out now and will let you know once reindex completes. Best regards, -K

Re: --batch-size and --jobs combination

2021-08-05 Thread Konstantin Ryabitsev
On Thu, Aug 05, 2021 at 11:05:41AM +, Eric Wong wrote: > > - I will bring up the rest of the nodes throughout the week, so > > x-lore.kernel.org will become more geoip-balanced. I will share any other > > observations once I have more data. Once all 4 nodes are up, I will share > > this m

Boosts still not quite working

2021-08-14 Thread Konstantin Ryabitsev
Eric: The new x-lore systems are rebuilt and reindexed using the latest master (as of 2 days ago), but boosts still don't appear to be quite working as intended. For example, this thread: https://x-lore.kernel.org/all/20210809175620.720923-1-ltyker...@gmail.com/ It was sent to io...@lists.linux-

Re: Boosts still not quite working

2021-08-14 Thread Konstantin Ryabitsev
On Sat, Aug 14, 2021 at 08:46:33PM +, Eric Wong wrote: > > It was sent to io...@lists.linux-foundation.org (mailman) and to a bunch of > > vger lists, all of which have higher boosts in the configuration, e.g. > > netdev: > > boost doesn't come into effect due to the Mailman footer from > the
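
Note on the snippet above: the reply is truncated, so the exact mechanism is not visible here. One plausible reading, offered only as an assumption, is that a footer appended by Mailman changes the message body, so the Mailman copy and the vger copy no longer look like the same message to a content-based deduplicator, and boost has nothing to choose between. The sketch below illustrates that general idea with a hypothetical `content_key` helper; it is not public-inbox's actual hashing code, and the subject, body, and footer strings are invented for illustration.

```python
import hashlib


def content_key(subject: str, body: str) -> str:
    """Hypothetical dedup key: a digest over the fields that define 'same message'."""
    h = hashlib.sha256()
    for field in (subject, body):
        h.update(field.encode("utf-8"))
        h.update(b"\0")  # separator so field boundaries cannot blur together
    return h.hexdigest()


# Invented example data: the same patch as delivered by two list servers.
body = "Patch body\n"
footer = "\n_______________________________________________\nexample mailing list\n"

vger_copy = content_key("[PATCH] example", body)
mailman_copy = content_key("[PATCH] example", body + footer)

# The appended footer yields a different key, so the two copies would be
# treated as distinct messages rather than as duplicates to rank by boost.
copies_match = vger_copy == mailman_copy
```

Under this assumption, boost only matters when two copies are recognized as the same message; once the keys differ, each copy is indexed on its own.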

[PATCH] Duplicate base css definitions in stylesheets

2021-08-16 Thread Konstantin Ryabitsev
declaration into the contrib stylesheets makes sure that these styles are applied even with the strictest security policies in place. Signed-off-by: Konstantin Ryabitsev --- contrib/css/216dark.css | 3 ++- contrib/css/216light.css | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-) di

RFE: remove "- mbox.gz / Atom" from thread listing

2021-08-16 Thread Konstantin Ryabitsev
Hello: I was playing with mobile view and I think adding "- mbox.gz / Atom" on the thread listing page is both unnecessary and makes the output too busy. E.g. compare: [PATCH v2] drm: avoid races with modesetting rights 2021-08-16 15:20 UTC (2+ messages) - mbox.gz / Atom ` [

RFE: Long .onion URL breaks mobile view

2021-08-16 Thread Konstantin Ryabitsev
Hello: Passing more observations from people testing out x-lore.kernel.org: - the long .onion URL at the bottom of each page causes problems on mobile devices because it results in a very wide page with blank content on the right side. I suggest that we don't need to provide the "git clone"

Re: [PATCH] Duplicate base css definitions in stylesheets

2021-08-17 Thread Konstantin Ryabitsev
On Mon, Aug 16, 2021 at 10:21:48PM +, Eric Wong wrote: > > However, site security policies may deliberately prohibit execution of > > inline content such as scripts and stylesheets as an extra layer of > > protection against XSS vulnerabilities. For example, with the following > > HTTP headers
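
Note on the snippet above: the headers quoted in the message are cut off, so they are not reproduced here. As a rough sketch of the general point, the Python below checks whether a given Content-Security-Policy value would still let a plain inline `<style>` block apply. The function names and the sample header values in the usage lines are assumptions made for illustration, and the check is deliberately simplified (it does not evaluate whether a particular nonce or hash matches).

```python
def parse_csp(header: str) -> dict[str, list[str]]:
    """Split a Content-Security-Policy value into directive -> source list."""
    policy: dict[str, list[str]] = {}
    for part in header.split(";"):
        tokens = part.split()
        if tokens:
            # If a directive is repeated, only its first occurrence counts.
            policy.setdefault(tokens[0].lower(), [t.lower() for t in tokens[1:]])
    return policy


def inline_styles_allowed(header: str) -> bool:
    """True if an inline <style> block without a nonce would be applied."""
    policy = parse_csp(header)
    # style-src falls back to default-src when it is absent.
    sources = policy.get("style-src", policy.get("default-src"))
    if sources is None:
        return True  # no relevant directive, so inline styles are unrestricted
    # A nonce or hash source makes browsers ignore 'unsafe-inline'.
    hash_or_nonce = ("'nonce-", "'sha256-", "'sha384-", "'sha512-")
    if any(s.startswith(hash_or_nonce) for s in sources):
        return False
    return "'unsafe-inline'" in sources


# Hypothetical header values, not the ones from the original message.
strict = inline_styles_allowed("default-src 'self'")
relaxed = inline_styles_allowed("default-src 'self'; style-src 'self' 'unsafe-inline'")
```

With a strict policy like the first sample, only stylesheets loaded from an allowed origin apply, which is why base styles that live only in inline markup would be lost.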

Re: [PATCH] view: remove mbox.gz and Atom from topic view

2021-08-17 Thread Konstantin Ryabitsev
On Mon, Aug 16, 2021 at 11:35:20PM +, Eric Wong wrote: > > - I don't think most people would choose to download mbox.gz or click on the > > "Atom" link from the thread listing page -- there is a thread view for > > this > > purpose. > > Agreed, I don't think I've ever used them from the t

Bug: wwwlisting doesn't get css styling

2021-08-17 Thread Konstantin Ryabitsev
Hello: Just noticed that the wwwlisting page doesn't apply any css stylesheets (currently still on https://x-lore.kernel.org/lists.html ). It should do the same thing as any other page view, e.g. so it can properly respect the client prefers-color-scheme choices. -K

nntpd errors retrieving group list

2021-08-24 Thread Konstantin Ryabitsev
Hello: I tried the nntpd daemon on x-lore, but I seem to be hitting something odd while retrieving the group list. This is from syslog: public-inbox-nntpd: Can't call method "minmax" on an undefined value at /usr/local/share/perl5/PublicInbox/NNTP.pm line 264. public-inbox-nntpd: during long res

Re: nntpd errors retrieving group list

2021-08-24 Thread Konstantin Ryabitsev
On Tue, Aug 24, 2021 at 08:11:15PM +, Eric Wong wrote: > Any chance you're out-of-FDs or permissions are wrong? > > There's also an off chance a non-inbox (extindex) object is there. > Perhaps this debugging patch can shine a light on things: Aha, it did help identify the problem. There was a

Re: [PATCH 0/8] various WWW + extindex stuff

2021-08-26 Thread Konstantin Ryabitsev
On Thu, Aug 26, 2021 at 12:33:30PM +, Eric Wong wrote: > This hopefully makes the long .onion URL more usable on small > displays; It's still not quite fixing the problem (actually, for some reason it's now looking worse in the narrow mobile view for me). May I suggest the following: - The /

Re: [RFC] wwwlisting: support global CSS in HTML view

2021-08-26 Thread Konstantin Ryabitsev
On Thu, Aug 26, 2021 at 12:36:06PM +, Eric Wong wrote: > Eric Wong wrote: > > Maybe "/+/" (as in "/+/$FOO.css") is a good URL path prefix... > > Pushed as commit 26c635060dcae35feae836b02a18a6a11e408312 It looks good to me, thanks! -K

Add a way to search all from wwwlisting

2021-08-26 Thread Konstantin Ryabitsev
Hello: I switched the configuration to return wwwlisting as toplevel view (instead of redirecting / to /all/), but there's some discontent, because the easy interface to "search everything" is gone and unless someone knows about /all/, they wouldn't find out about it from the wwwlisting view. How

Re: [PATCH 0/2] wwwlisting shows /all/

2021-08-27 Thread Konstantin Ryabitsev
On Fri, Aug 27, 2021 at 12:08:43PM +, Eric Wong wrote: > I think that's too much vertical whitespace at the top of the > page, and multiple s or boxes at the top can get > confusing. > > Just making /all/ show up at the top like a normal inbox (and > letting the admin decide on description) s
