We don't want to blow up users' storage too badly when converting
v1 to v2 or break because they don't have Xapian bindings installed.
---
script/public-inbox-convert | 9 +
1 file changed, 9 insertions(+)
diff --git a/script/public-inbox-convert b/script/public-inbox-convert
index 906001c
SearchIdx always requires DBD::SQLite, so only require it
after we've passed `require_mods(qw(DBD::SQLite))'.
---
t/multi-mid.t | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/multi-mid.t b/t/multi-mid.t
index 94c0e0a2..df865efb 100644
--- a/t/multi-mid.t
+++ b/t/multi-mid.t
Many internal improvements to improve the developer experience,
long-term maintainability, ease-of-installation and compatibility.
There are also several bugfixes.
Some of the internal improvements involve avoiding Perl startup
time in tests. "make check" now runs about 50% faster than
before, an
A long overdue test for behavior established in 2016.
Fixes: 1b28cc7f00a866cb ("view: try assuming UTF-8 for bogus charsets")
---
MANIFEST | 1 +
t/msg_iter.t | 20
t/x-unknown-alpine.eml | 21 +
3 files changed, 42 insertions(+)
We already pre-populate the hashref when loading $smsg
(PublicInbox::SearchMsg) objects out of over.sqlite3 or Xapian,
so making expensive method calls isn't necessary in those cases.
We only need to use the method calls when SQLite or Xapian are
not available or are being populated (such as durin
We use `$top' in other places, so rename it to `$top_subj'
consistently for `$subj' and `$prev_subj' comparisons down
the function.
---
lib/PublicInbox/View.pm | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 45c
Avoid needlessly normalizing the subject when dumping, since
it's pushed into the @$topic array during accumulation in
normalized form.
We can also safely treat $smsg as a hashref and avoid
calling "->ds" as a method since we know we've got that
loaded via Over||Search and won't have to use Email:
While multi-Subject messages are unfortunate, try not to
generate confusing/invalid HTML with multiple elements
having the same HTML id attribute.
---
lib/PublicInbox/View.pm | 15 +++
1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInb
We need to escape ampersands (and some other characters for href
attributes), so introduce a `mid_href' sub to do just that.
'<', '>' and '"' were always escaped, so there's no risk of tag
or attribute injection, but creative Message-IDs could cause
confusion for some parsers and generate invalid
Pretty insignificant, but the diffstat makes me happy :>
Eric Wong (8):
view: remove mhref arg from multipart_text_as_html
view: single id="t" for multi-Subject messages
view: dump_topics: better naming of top Subject
view: cleanup topic accumulation and dumping
view,sear
No point in passing something on stack only to stash it
into the $ctx which holds most other parameters used for
rendering the HTML.
---
lib/PublicInbox/View.pm | 14 +++---
lib/PublicInbox/WwwAtomStream.pm | 3 ++-
xt/perf-msgview.t| 3 ++-
3 files changed, 11 i
The object-oriented Hval API turned out to be less useful and
more clunky than I envisioned years ago, so get rid of it.
We'll no longer strip trailing whitespace from From: headers in
the HTML display, but I doubt anybody cares.
---
lib/PublicInbox/Hval.pm | 21 -
lib/PublicIn
No need to use the over-engineered Hval OO API when the subject
is already normalized and there are no trailing spaces because of
normalization.
---
lib/PublicInbox/View.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm
index 033
Long URLs waste bandwidth and redundant query parameters
make caching more difficult and expensive.
Fixes: ddec19694cbf0e1d ("viewdiff: rewrite and simplify")
---
lib/PublicInbox/ViewDiff.pm | 7 ---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/lib/PublicInbox/ViewDiff.pm b/l
The blob regeneration (solving) part has been stable and
performant for over a year with no problems, even with web
crawlers constantly hitting it without needing rate limits.
All the other stuff is open to bikeshedding (as long as
my crappy hardware supports it :P)
---
Documentation/design_www.t
We don't need to hold onto the Email::MIME object across
multiple WwwResponse->getline calls, instead we can stuff
the rendered HTML of the first (and hopefully only) message
of the buffer into ctx->{-html_tip}.
---
lib/PublicInbox/View.pm | 58 -
1 file cha
Pushed as commit 9703d80efd848f582e5b265db1958e0f143d8712
I expect this to be significant in high-concurrency situations.
There are more changes on the horizon to further reduce memory
usage of the WWW interface :>
--
unsubscribe: one-click, see List-Unsubscribe header
archive: https://public-inbox.
Since v2 inboxes contain multiple git repositories, avoid the
use of the word "repository" when referring to inboxes as a
whole in most places.
---
Documentation/public-inbox-convert.pod | 6 +++---
Documentation/public-inbox-daemon.pod| 3 +--
Documentation/public-inbox-index.pod | 6 ++
Kyle Meyer wrote:
> Eric Wong writes:
>
> > Since v2 inboxes contain multiple git repositories, avoid the
> > use of the word "repository" when referring to inboxes as a
> > whole in most places.
> [...]
> > diff --git a/TODO b/TODO
> > index 9
Can't code without data structures, and we emphasize
data over code just about everywhere.
---
Documentation/technical/data_structures.txt | 228
MANIFEST| 1 +
2 files changed, 229 insertions(+)
create mode 100644 Documentation/technical
Since v2 inboxes can be made of several git repositories,
consistently call them "inboxes", instead.
---
INSTALL | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/INSTALL b/INSTALL
index bf1c821a..7d14ca55 100644
--- a/INSTALL
+++ b/INSTALL
@@ -21,9 +21,9 @@ Requirements
pub
We never lookup `$ctx->{-obfuscate}' anywhere, as the
correct key is `$ctx->{-obfs_ibx}' since some of the
address obfuscation stuff is inbox-specific.
Note: some of the obfuscation stuff still needs tests,
but it's low-priority at the moment since I don't think
it's a good feature after all.
---
There are lots of mboxes out there :)
Eric Wong (2):
import_vger_from_mbox: drop redundant "use" statements
import_vger_from_mbox: add --filter parameter
scripts/import_vger_from_mbox | 7 +++
1 file changed, 3 insertions(+), 4 deletions(-)
It shouldn't be hard to make this into a more generic
importer not specific to vger lists.
---
scripts/import_vger_from_mbox | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/scripts/import_vger_from_mbox b/scripts/import_vger_from_mbox
index b3aceb6b..d1ce7231 100644
--- a/sc
PublicInbox::InboxWritable takes care of those imports.
---
scripts/import_vger_from_mbox | 3 ---
1 file changed, 3 deletions(-)
diff --git a/scripts/import_vger_from_mbox b/scripts/import_vger_from_mbox
index 0e5ba6b4..b3aceb6b 100644
--- a/scripts/import_vger_from_mbox
+++ b/scripts/import_vge
The only caller of `flush_diff' is `add_text_body', and that
already did CRLF conversion on the text part. The regexps in
SolverGit still need to preserve CR, however, since that
actually applies patches (instead of rendering them), and we
need to preserve CRLF patches for CRLF files.
---
lib/Pub
Instead, we add CRLF conversion to the only remaining place
which needs it, ViewVCS. This saves many redundant ops in
many places.
The only other place where this mattered was in
View::add_text_body, but we already started doing CRLF
conversions when we added diff parsing and link generation fo
It was the only file in our tree which had CRLF line endings,
so make it consistent with the rest.
---
examples/nginx_proxy | 48 ++--
1 file changed, 24 insertions(+), 24 deletions(-)
diff --git a/examples/nginx_proxy b/examples/nginx_proxy
index 38e60643.
Redundant ops waste cycles and make the code more difficult to
follow. And 3/3 is an overdue cleanup which can also serve as
an impromptu test for solver...
Eric Wong (3):
hval: ascii_html: drop CRLF => LF conversion
viewdiff: remove optional CR handling
examples/nginx_proxy: convert C
improve some
v2writable behaviors while we're at it.
Eric Wong (2):
v2writable: make remove return-compatible w/ Import::remove
v2writable: lookup_content => content_exists
lib/PublicInbox/V2Writable.pm | 34 --
t/v2writable.t| 7 +--
It only needs to return a boolean, since none of the current
callers care about the return value. Thus avoid a hash table
assignment and use of `$smsg->{mime}', here.
---
lib/PublicInbox/V2Writable.pm | 11 +++
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/lib/PublicInbox/
It only needs to return a boolean, since none of the current
callers care about the return value, so avoid a hash assignment
and use of `$smsg->{mime}', here.
---
lib/PublicInbox/V2Writable.pm | 11 +++
1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/lib/PublicInbox/V2Writabl
Import::remove is a documented interface, and the return
value of the V2Writable work-alike should try to be compatible
with what Import implements.
---
lib/PublicInbox/V2Writable.pm | 23 +--
t/v2writable.t| 7 +--
2 files changed, 18 insertions(+), 12 del
Leah Neukirchen wrote:
> Hi,
>
> I've recently imported some sizable archives (~100k messages) of old
> mailing lists and noticed some slight inconveniences:
Thanks for the reports, will answer 2. separately.
> 1) RFC5322/822 invalid Date: headers should be parsed more gracefully
>
> Some old
Leah Neukirchen wrote:
> 2) Weird From: lines crash the whole import
>
> From: "=?iso-8859-1?Q?Jochen_K=FCpper?=
> This funny line broke import_maildir:
>
> fatal: Missing > in ident string: =?iso-8859-1?Q?Jochen_K=FCpper?= usenet
> <"=?iso-8859-1?Q?Jochen_K=FCpper?= 1101853296
> +0100
> fa
Perhaps 1.4.0 will be a small release, after all (and also
smaller in terms of memory use :)
---
Documentation/RelNotes/v1.4.0.eml | 6 +-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/Documentation/RelNotes/v1.4.0.eml
b/Documentation/RelNotes/v1.4.0.eml
index 6b1bc86e..0ebf8d6
`%over' could be confused for the overview SQLite DB
instance, so call it `%override', instead. There's
also no need to write a loop to override a hash when
the language can do it for us.
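
The "language can do it for us" remark can be illustrated with a minimal
sketch. The hash names and values below are hypothetical and not taken
from SearchView.pm; the sketch only shows that list flattening and hash
slices give the same result as an explicit loop:

```perl
use strict;
use warnings;

# Hypothetical query parameters and overrides (not from SearchView.pm)
my %params   = (q => 'foo', x => 't', o => 0);
my %override = (x => 'A', o => 25);

# Explicit loop: copy each override key by hand
my %looped = %params;
$looped{$_} = $override{$_} for keys %override;

# List flattening: when a key repeats, the later pair wins
my %merged = (%params, %override);

# Hash slice: updates an existing hash in place
my %sliced = %params;
@sliced{keys %override} = values %override;
```

The flattening form reads best when a fresh hash is wanted; the slice
form avoids copying when the original hash can be modified.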
---
lib/PublicInbox/SearchView.pm | 10 +++---
1 file changed, 3 insertions(+), 7 deletions(-)
diff --gi
Eric Wong wrote:
> Leah Neukirchen wrote:
> > 2) Weird From: lines crash the whole import
> >
> > From: "=?iso-8859-1?Q?Jochen_K=FCpper?= >
> > This funny line broke import_maildir:
> >
> > fatal: Missing > in ident string: =?iso-8859-
This isn't anything new and has been a part of the design
since the beginning, but it may not be apparent to some
folks.
---
Documentation/design_www.txt | 17 +
1 file changed, 17 insertions(+)
diff --git a/Documentation/design_www.txt b/Documentation/design_www.txt
index 240fa50
IO::Compress is required for v2 inboxes and overview
indices, after all, but it is often pulled in by
other packages (HTTP::Message via Plack::Test).
---
INSTALL | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/INSTALL b/INSTALL
index 7d14ca55..4f0217a3 100644
--- a/INST
Eric Wong wrote:
> Leah Neukirchen wrote:
> > 1) RFC5322/822 invalid Date: headers should be parsed more gracefully
> >
> > Some old mails had Date: headers without time zones, e.g.
> > Date: Sat, 27 Sep 1997 10:02:32
> >
> > This results in public-i
Debian 10.0 was released July 2019, so update our documentation
to reflect that. While we're at it, fixup a broken footnote
reference for Inline::C, too.
---
INSTALL | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/INSTALL b/INSTALL
index 4f0217a3..3984df71 100644
--- a/IN
Both the C and pure Perl implementations of `pi_fork_exec'
return `-1' on error, not `undef'.
---
lib/PublicInbox/Spawn.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/PublicInbox/Spawn.pm b/lib/PublicInbox/Spawn.pm
index 2d9f734c..ad6be187 100644
--- a/lib/PublicInbox/Sp
We rely on spawn/popen_rd for redirects, nowadays.
---
lib/PublicInbox/Git.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/PublicInbox/Git.pm b/lib/PublicInbox/Git.pm
index 7eaaeb8b..9c96b3f0 100644
--- a/lib/PublicInbox/Git.pm
+++ b/lib/PublicInbox/Git.pm
@@ -9,7 +9,7 @
When indexing messages without Date: and/or Received: headers,
fall back to using timestamps originally recorded by git in the
commit object. This allows git mirrors to preserve the import
datestamp and timestamp of a message according to what was fed
into git, instead of blindly falling back to t
Erm... sent prematurely :x Warns in tests.
We will occasionally see legit messages with zero lines,
be sure we index that count for NNTP clients.
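
The general pitfall with zero counts can be sketched as below. This is
a generic illustration of truthiness versus definedness in Perl, not the
actual SearchMsg.pm change; `lines_or' and `lines_dor' are hypothetical
helpers:

```perl
use strict;
use warnings;

# Lossy: 0 is false in Perl, so a legitimate zero count is replaced
sub lines_or { $_[0]->{lines} || '?' }

# Safer: only an undefined (missing) count is replaced
sub lines_dor { $_[0]->{lines} // '?' }

my $empty_body = { lines => 0 };   # a legit message with zero lines
my $unindexed  = {};               # no count recorded at all
```

With these inputs, only `lines_dor' keeps the zero for `$empty_body',
while both fall back to the placeholder for `$unindexed'.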
I'm not sure about bytes being zero (aside from purged
messages), but we should've dealt with that earlier up
the stack.
---
lib/PublicInbox/SearchMsg.pm | 4 ++--
t/v2mirror.t |
We can just create a ParentPipe and let PublicInbox::DS
manage its life cycle. While we're at it, favor `\&coderef'
over `*coderef' so we're explicit about it being a code ref
and not some other ref type.
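
A small sketch of the difference, using a hypothetical `on_signal' sub
rather than anything from Daemon.pm. A typeglob carries every slot
sharing a name, while `\&name' is unambiguously a code ref:

```perl
use strict;
use warnings;

sub on_signal { 'handled' }            # hypothetical callback
our $on_signal = 'unrelated scalar';   # same name, different slot

my $code = \&on_signal;   # explicitly a CODE reference
my $glob = *on_signal;    # typeglob: sub, scalar, array, hash, ... slots

my $code_type = ref($code);    # 'CODE'
my $glob_type = ref(\$glob);   # 'GLOB'

# The sub can still be dug out of the glob, but the reader has to know
# that the CODE slot is the one that matters
my $from_glob = *{$glob}{CODE};
```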
---
lib/PublicInbox/Daemon.pm | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff
Konstantin Ryabitsev wrote:
> Hello:
>
> I think public-inbox currently does some heuristic-based threading,
> which may actually not be that useful. For example:
>
> https://lore.kernel.org/linux-renesas-soc/20200217101741.3758-1-geert+rene...@glider.be/
>
> None of the [PATCH] messages have
Luke Kenneth Casson Leighton wrote:
> eric, hi,
>
> we're having difficulty understanding how to deploy public-inbox in a
> way that very simply and as a top and only priority records email in a
> public inbox, for the purposes of having it in a git repository, when
> that email is coming in via
Jeff King wrote:
> On Fri, Mar 13, 2020 at 09:25:31PM +0000, Eric Wong wrote:
>
> > > 6. Peff: this is all possible on the mailing list. I see things that look
> > > interesting, and have a to do folder. If someone replies, I’ll take it off
> > > the list. Once a
We need to favor "Transfer-Encoding: chunked" over the value of
the Content-Length header. We should also reject bogus,
duplicate and/or unreasonable values for both these, since they
can trigger unexpected behavior when combined with other HTTP
parsers in proxies such as varnish, nginx, haproxy,
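
One way to sketch such a check, assuming a PSGI-style `$env' hash.
`body_framing' is a hypothetical helper and `$MAX_LEN' is an arbitrary
cap chosen for illustration; neither is public-inbox's actual code:

```perl
use strict;
use warnings;

my $MAX_LEN = 2**31; # arbitrary cap chosen for this sketch

# Returns ('chunked'), ('length', $n), ('none') or ('error', $reason)
sub body_framing {
    my ($env) = @_;
    my $te = $env->{HTTP_TRANSFER_ENCODING};
    my $cl = $env->{CONTENT_LENGTH};
    if (defined $cl) {
        # "5, 5" (folded duplicates), "-1", "0x10" and the like fail here
        return ('error', 'bad Content-Length')
            if $cl !~ /\A[0-9]{1,10}\z/ || $cl > $MAX_LEN;
    }
    if (defined $te) {
        # anything other than a single "chunked" is refused
        return ('error', 'bad Transfer-Encoding') if lc($te) ne 'chunked';
        return ('chunked'); # takes priority over Content-Length
    }
    defined($cl) ? ('length', $cl + 0) : ('none');
}
```

The sketch validates Content-Length even when chunked encoding wins, so
a request carrying a bogus value is rejected instead of being passed on.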
lkcl wrote:
> hi eric we have things running, hooray, i thought you might appreciate
> it is a little different
> http://inbox.libre-riscv.org/libre-riscv-dev/new.html
Good to know! Btw, if you have DBD::SQLite (and optionally,
Search::Xapian), you can run `public-inbox-index $INBOX_DIR'
to get
RFC 5322 is the latest one in this line, but much documentation
and even command-line options in other programs (e.g. git) refer
to RFC 2822 or even RFC 822.
---
Documentation/standards.perl | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/Documentation/standards.perl b/Docum
Eric Wong wrote:
> So the "Patchwork summary for: linux-renesas-soc" message:
>
> https://lore.kernel.org/linux-renesas-soc/158229483332.12219.5639020605006542672.git-patchwork-summ...@kernel.org/raw
>
> has the following header:
>
> References: <20200217101
We'll also avoid explicitly loading standard library modules
like POSIX and Digest::SHA, here; instead we load our own
modules and let those load whatever non-PublicInbox:: modules
they need.
---
lib/PublicInbox/WWW.pm | 10 ++
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/
I'm wondering if WWW should just preload by
default. I'm not sure if anybody uses public-inbox.cgi (or
should be using it :P). It's not like we don't ship
public-inbox-httpd; and any PSGI implementation could be used
for smaller inboxes (or powerful-enough hardware).
Eric Won
"use" is also evaluated earlier than "require", so it is
favorable for compile-only checking.
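
The timing difference can be observed with a short sketch. The two core
modules are arbitrary picks for illustration, and the sketch assumes
nothing else has loaded Text::Wrap beforehand:

```perl
use strict;
use warnings;

# These run first at runtime, yet the `use' further down has already
# taken effect because it was processed while this file was compiled.
my $parsewords_early = exists $INC{'Text/ParseWords.pm'} ? 1 : 0;
my $wrap_early       = exists $INC{'Text/Wrap.pm'} ? 1 : 0;

use Text::ParseWords ();   # compile time, like BEGIN { require ... }
require Text::Wrap;        # runtime, only when execution reaches here

my $wrap_late = exists $INC{'Text/Wrap.pm'} ? 1 : 0;
```

Under `perl -c', only the `use' line would trigger a load, so a missing
module behind a plain `require' goes unnoticed until runtime.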
---
lib/PublicInbox/WwwListing.pm | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/lib/PublicInbox/WwwListing.pm b/lib/PublicInbox/WwwListing.pm
index c063fca6..a8aecaf7 100644
---
We can also avoid `o' regexp modifier, since it isn't
recommended by Perl upstream, anymore (although we don't
have any bugs or unintended behavior because of it).
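
For readers unfamiliar with the modifier, here is a generic sketch of
why `o' is discouraged and what `qr//' offers instead. It does not
reproduce any ViewDiff.pm pattern; `match_o' and `make_matcher' are
hypothetical:

```perl
use strict;
use warnings;

# With /o, the pattern is interpolated and compiled on the first call
# only; later calls silently keep using the first value of $pat.
sub match_o {
    my ($str, $pat) = @_;
    $str =~ /\A$pat\z/o ? 1 : 0;
}

# With qr//, each compiled pattern is an explicit value that can be
# stored and reused for as long as it is needed.
sub make_matcher {
    my ($pat) = @_;
    my $re = qr/\A$pat\z/;
    sub { $_[0] =~ $re ? 1 : 0 };
}

my $first  = match_o('abc', 'abc');   # pattern frozen as "abc" here
my $second = match_o('xyz', 'xyz');   # still matched against "abc"

my $is_xyz  = make_matcher('xyz');
my $qr_hit  = $is_xyz->('xyz');
my $qr_miss = $is_xyz->('abc');
```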
---
lib/PublicInbox/ViewDiff.pm | 47 -
1 file changed, 26 insertions(+), 21 deletions(-)
diff --
We already lazy-load WwwListing for the CGI script, and
hiding another layer of lazy-loading makes it difficult
to do WWW->preload.
We want long-lived processes to do all long-lived allocations up
front to avoid fragmentation in the allocator, but we'll still
support short-lived processes by l
Doing immortal allocations late can cause those allocations
to end up in places where it fragments the heap. So do more
things up front for long-lived daemons.
---
lib/PublicInbox/NNTPD.pm | 4
lib/PublicInbox/WWW.pm | 21 +
2 files changed, 21 insertions(+), 4 deletio
We want WWW->preload to get as many immortal allocations done
as possible, and the `state' feature from Perl 5.10 prevents that.
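
The timing issue can be sketched as follows. `build_table' and the
lookup subs are hypothetical stand-ins, not SolverGit or ViewDiff code;
the point is only when each initializer runs:

```perl
use strict;
use warnings;
use feature 'state';   # Perl 5.10+

our @init_log;   # records the order in which initializers run

sub build_table {
    my ($who) = @_;
    push @init_log, $who;
    my %table = (a => 1);
    return \%table;
}

# Initialized when this file is loaded, i.e. during a preload
my $EAGER = build_table('file-scoped my');

sub lookup_eager { $EAGER->{ $_[0] } }

sub lookup_state {
    # Initialized on the first call, possibly long after startup
    state $lazy = build_table('state');
    $lazy->{ $_[0] };
}

my @after_load = @init_log;    # only the eager table exists so far
my $val = lookup_state('a');   # first call allocates the state table
lookup_state('a');             # later calls reuse it
my @after_calls = @init_log;
```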
---
lib/PublicInbox/SolverGit.pm | 13 +++--
lib/PublicInbox/ViewDiff.pm | 6 +++---
2 files changed, 10 insertions(+), 9 deletions(-)
diff --git a/lib/Public
public-inbox-httpd should work with any PSGI files, so make
it more apparent to people reading .psgi examples.
---
examples/cgit.psgi | 5 -
examples/highlight.psgi| 4
examples/newswww.psgi | 5 -
examples/public-inbox.psgi | 5 +
4 files changed, 17 insertions(+
lkcl wrote:
> On Thu, Mar 19, 2020 at 3:06 AM Eric Wong wrote:
> > > a section to disable spam and also adding the listid to the config is
> > > critical otherwise public-inbox-mda fails silently.
> >
> > There's also '--no-precheck' on the com
We can pass fewer order-dependent args to V2Writable::do_idx and
SearchIdxShard::index_raw by passing the smsg object, instead.
---
lib/PublicInbox/SearchIdxShard.pm | 27 ++--
lib/PublicInbox/V2Writable.pm | 52 +++
2 files changed, 42 insertions(+), 37
mess left in 1 and 2,
Finally, patch 9 fixes the corner-case-of-corner-cases for
dealing with multi-MID messages which require a one-off queue to
store the git commit/author times instead of overloading msgmap.
Eric Wong (9):
index: use git commit times on missing Date/Received
v2writable: pres
This lets us store author and committer times for deferred
indexing messages with ambiguous Message-IDs. This allows
us to reproducibly reindex messages with the git commit
and author times when a rare message lacks Received and/or
Date headers while having ambiguous Message-IDs.
---
MANIFEST
Favor `$smsg->{mid}' instead of `$mid0' to reduce parameters
down-the-line, but favor passing the Email::MIME::Header object
around instead of relying on the bloat-prone `$smsg->{mime}'
and calling ->header_obj on it.
---
lib/PublicInbox/OverIdx.pm | 8 +++-
lib/PublicInbox/SearchIdx.pm | 2
No need to pass extra parameters to this method, since
smsg has universal meanings for {blob} and {mid}.
---
lib/PublicInbox/OverIdx.pm | 2 +-
lib/PublicInbox/SearchIdx.pm | 4 +++-
lib/PublicInbox/Smsg.pm | 7 +++
3 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/lib/Publ
We can finally get rid of the awkward, ad-hoc use of V2Writable,
SearchIdx, and OverIdx args for passing {cotime} and {autime}
between classes.
We'll still use those git time fields internally within
V2Writable and SearchIdx for (re)indexing, but that's not
worth avoiding as a fallback.
---
lib/P
We can pass blessed PublicInbox::Smsg objects to internal
indexing APIs instead of having long parameter lists in some
places. The end goal is to avoid parsing redundant information
each step of the way and hopefully make things more
understandable.
---
lib/PublicInbox/OverIdx.pm| 14 +++-
While v2 indexing is triggered immediately after writing the
commit to the git repository, there may be a gap between when
PublicInbox::Import generates a timestamp and when
PublicInbox::SearchIdx sees the message. So follow the mirror
indexing behavior and take the to-be-indexed (time|date)stamps
Since the introduction of over.sqlite3, SearchMsg is not tied to
our search functionality in any way, so stop confusing ourselves
and future hackers by just calling it "PublicInbox::Smsg".
Add a missing "use" in ExtMsg while we're at it.
---
Documentation/mknews.perl | 4 ++--
When indexing messages without Date: and/or Received: headers,
fall back to using timestamps originally recorded by git in the
commit object. This allows git mirrors to preserve the import
datestamp and timestamp of a message according to what was fed
into git, instead of blindly falling back to t
I made tests run so quickly that I missed some warnings :x
Eric Wong (2):
wwwlisting: use first successfully loaded JSON module
t/www_listing: avoid 'once' warnings
lib/PublicInbox/WwwListing.pm | 2 +-
t/www_listing.t | 2 +-
2 files changed, 2 insertions(+), 2
And not the last...
I only noticed this since JSON::PP::Boolean was spewing
redefinition warnings via overload.pm
Fixes: 8fb8fc52420ef669 ("wwwlisting: avoid lazy loading JSON module")
---
lib/PublicInbox/WwwListing.pm | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/Publi
We reach into the WwwListing package directly to retrieve
that JSON encoder/decoder object, and we can't rely on `use'
since WwwListing loading may fail if Plack is missing.
---
t/www_listing.t | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/t/www_listing.t b/t/www_listing.t
in
We'll be supporting gzipped sqlite3(1) dumps
for altid files in future commits.
In the future (and if we survive), we may replace
Plack::Middleware::Deflater with our own GzipFilter to work
better with asynchronous responses without relying on
memory-intensive anonymous subs.
---
MANIFEST
We only support searching on prefixes matching /\A\w+\z/ because
Xapian requires ':' to delimit the prefix and splits on spaces
without quotes.
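
A minimal sketch of that rule, with `valid_prefix' as a hypothetical
helper name. Only the regexp comes from the text above:

```perl
use strict;
use warnings;

# Accept only prefixes made entirely of word characters, so the
# ':' delimiter and whitespace can never be part of a prefix.
sub valid_prefix {
    my ($prefix) = @_;
    defined($prefix) && $prefix =~ /\A\w+\z/ ? 1 : 0;
}

my @ok  = grep { valid_prefix($_) } qw(gmane list_2 123);
my @bad = grep { !valid_prefix($_) } ('gmane:', 'my list', '', "gmane\n");
```

Using `\z' instead of `$' matters here, since `$' would also accept a
trailing newline.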
I've also verified Xapian supports multibyte UTF-8 characters,
underscores, and bare numbers as search prefixes, so there's
no need to restrict it beyond
While we don't currently reinitialize the query parser for
the lifetime of a PublicInbox::Search object and have no plans
to, it's incorrect to be appending to an existing array in
case we reinitialize the query parser in the future.
---
lib/PublicInbox/Search.pm | 2 +-
1 file changed, 1 insert
This ensures all our indexed data, including data from altid
searches (e.g. "gmane:$ARTNUM") is retrievable.
It uses a "POST" request to avoid wasting cycles when invoked by
crawlers, since it could potentially be several megabytes of
data not indexable by search engines.
---
MANIFEST
PublicInbox::HTTP will chunk, otherwise, and that's
extra overhead which isn't needed.
---
lib/PublicInbox/WwwStream.pm | 13 +
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm
index fceef745..985e0262 100644
---
To improve reproducibility in mirrors, altid dumps can be
exported via "POST /$INBOX_URL/$prefix.sql.gz". $prefix is
something like "gmane" (though the search prefix is "gmane:"
with a colon).
Eric Wong (11):
qspawn: reinstate filter support, add gzip filter
And show contact info when there's no indexing, at all.
Installations where Xapian is too expensive can still support
threading since it only depends on SQLite, so we need to inform
users of what's available.
---
lib/PublicInbox/WwwText.pm | 10 +-
1 file changed, 9 insertions(+), 1 deleti
As sqlite3(1) and other executables may become unavailable or
uninstalled while a daemon runs, we need to gracefully handle
errors in those cases.
---
lib/PublicInbox/Qspawn.pm | 58 ++-
t/httpd-corner.psgi | 7 +
t/httpd-corner.t | 25 ++
This makes the error page more consistent.
Not that it really matters since Compress::Raw::Zlib and
IO::Compress packages have been distributed with Perl since
5.10.x. Of course, zlib itself is also a dependency of git.
---
lib/PublicInbox/Mbox.pm | 16 +++-
1 file changed, 7 inserti
No reason to use the ->getline interface for small responses.
---
lib/PublicInbox/ExtMsg.pm| 4 ++--
lib/PublicInbox/WwwStream.pm | 7 ---
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/lib/PublicInbox/ExtMsg.pm b/lib/PublicInbox/ExtMsg.pm
index 44884ad2..74a95cf9 100644
--
zlib contexts are memory-intensive, particularly when used for
compression. Since the gzip filter may be sitting in a limiter
queue for a long period, delay the allocation until we actually have
data to translate, and not a moment sooner.
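
A rough sketch of the lazy-allocation idea, using the core
Compress::Raw::Zlib module. `LazyGzip' and its methods are hypothetical
and far simpler than PublicInbox::GzipFilter:

```perl
use strict;
use warnings;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

{
    package LazyGzip; # hypothetical name for this sketch
    use Compress::Raw::Zlib qw(Z_OK);

    # Cheap constructor: no zlib context is allocated yet
    sub new { bless { out => '' }, $_[0] }

    # Allocate the deflate context on first use only
    sub gz {
        my ($self) = @_;
        $self->{gz} //= do {
            my ($d, $err) = Compress::Raw::Zlib::Deflate->new(
                -WindowBits => 15 + 16, # +16 selects gzip framing
                -AppendOutput => 1);
            $err == Z_OK or die "deflate init failed: $err";
            $d;
        };
    }

    sub add {
        my ($self, $data) = @_;
        my $err = $self->gz->deflate($data, $self->{out});
        $err == Z_OK or die "deflate failed: $err";
    }

    sub finish {
        my ($self) = @_;
        my $err = $self->gz->flush($self->{out});
        $err == Z_OK or die "flush failed: $err";
        delete $self->{gz};  # release the context promptly
        delete $self->{out}; # returns the gzipped bytes
    }
}

my $f = LazyGzip->new;
my $allocated_early = exists $f->{gz} ? 1 : 0; # nothing allocated while idle
$f->add("hello ");
$f->add("world\n");
my $gzipped = $f->finish;
gunzip(\$gzipped => \my $plain) or die "gunzip failed: $GunzipError";
```

An object built this way can wait in a queue at the cost of one small
hash, and only pays for the zlib context once `add' is first called.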
---
lib/PublicInbox/GzipFilter.pm | 15 ++-
1 file c
The ->getline API is only useful for limiting memory use when
streaming responses containing multiple emails or log messages.
However it's unnecessary complexity and overhead for callers
(PublicInbox::HTTP) when there's only a single message.
---
lib/PublicInbox/ViewVCS.pm | 8 +---
lib/Pub
We reach into the WwwListing package directly to retrieve
that JSON encoder/decoder object, and we can't rely on `use'
since WwwListing loading may fail if Plack is missing.
---
*sigh* v1 was wrong :x
t/www_listing.t | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/t/www
Date::Parse falls back to using the local timezone when
it's missing from an email, so only test in a reasonable
TZ (UTC) for server software.
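
The effect of the local timezone on a zone-less timestamp can be shown
with core modules alone. This sketch uses POSIX::mktime rather than
Date::Parse, and borrows the zone-less date quoted elsewhere in this
listing; `epoch_in_tz' is a hypothetical helper:

```perl
use strict;
use warnings;
use POSIX qw(mktime tzset);
use Time::Local qw(timegm);

# Interpret a zone-less broken-down time as local time in zone $tz
sub epoch_in_tz {
    my ($tz, $sec, $min, $hour, $mday, $mon, $year) = @_;
    my $epoch;
    {
        local $ENV{TZ} = $tz;
        tzset();
        $epoch = mktime($sec, $min, $hour, $mday, $mon, $year - 1900);
    }
    tzset(); # restore the process-wide zone
    $epoch;
}

# "Sat, 27 Sep 1997 10:02:32" with no zone given (month is 0-based)
my @date = (32, 2, 10, 27, 8, 1997);

my $utc    = epoch_in_tz('UTC0', @date);
my $est    = epoch_in_tz('EST5', @date); # fixed UTC-5, no DST rules
my $expect = timegm(@date);
```

The same wall-clock time lands five hours apart depending on TZ, which
is why pinning the zone makes such a test deterministic.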
---
t/msgtime.t | 4
1 file changed, 4 insertions(+)
diff --git a/t/msgtime.t b/t/msgtime.t
index 7c95e547..3f09fb4e 100644
--- a/t/msgtime.t
+++ b/t
I noticed we never had tests for SIGUSR2, so I started
writing them and fixed two bugs.
Eric Wong (2):
daemon: fix SIGUSR2 upgrade with -W0 (no workers)
daemon: unlink .oldbin PID file correctly
lib/PublicInbox/Daemon.pm | 7 ++-
t/httpd-unix.t| 100
Disabling workers via `-W0' blesses the contents of the
@listeners array, so we need to ensure we call fcntl on
the GLOB ref in ->{sock}.
Add tests to ensure USR2 works regardless of whether workers
are enabled or not.
---
lib/PublicInbox/Daemon.pm | 3 ++
t/httpd-unix.t| 99
We need to track the PID file having ".oldbin" appended
to it while a SIGUSR2 upgrade is in progress and ensure
it is unlinked on SIGQUIT.
---
lib/PublicInbox/Daemon.pm | 4 ++--
t/httpd-unix.t| 1 +
2 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/lib/PublicInbox/Daemon.
Thomas Schneider wrote:
> Hi,
>
> I’ve created a package for Gentoo to ease installing public-inbox. It
> is located in my overlay[0], which can be activated with `layman -a qsx`
> or `eselect repository enable qsx`.
Thanks Thomas. I don't know much about Gentoo these days, but
hopefully it wa
We need to stop workers in the old process, check the socket and
ensure $new_pid is ready to receive signals before killing it.
---
t/httpd-unix.t | 6 ++
1 file changed, 6 insertions(+)
diff --git a/t/httpd-unix.t b/t/httpd-unix.t
index 939431f4..a0fe1e31 100644
--- a/t/httpd-unix.t
+++ b/t/
Exposing altid dumps will help ensure total reproducibility
of existing instances.
AFAIK, sqlite3(1) can't execute arbitrary code, so it's not
quite as fashionable as the "curl | bash" stuff the cool people
are doing, these days :P
---
lib/PublicInbox/WwwText.pm | 20 ++--
1 f
Seeing the example config linkified, some users may inevitably
try to follow it in a browser with a GET request. Provide
a helpful message to inform users to use POST instead of
attempting to treat /$INBOX/$ALTID.sql.gz as a Message-Id.
---
lib/PublicInbox/WWW.pm | 2 ++
lib/PublicInbox/
Provide helpful hints and pointers in existing config example
to reproduce altid DBs when mirroring.
Eric Wong (3):
inbox: altid_map becomes a method
wwwtext: show altid instructions in config
wwwaltid: inform users to use POST instead of GET
lib/PublicInbox/Inbox.pm| 15