[PATCH] view: escape From name properly for title

2016-06-07 Thread Eric Wong
Oops :x Add an additional test for live data for any unprintable characters, too, since this could be a dangerous source of HTML injection. --- lib/PublicInbox/View.pm | 3 ++- t/check-www-inbox.perl | 12 2 files changed, 14 insertions(+), 1 deletion(-) diff --git

[PATCH] view: remove trailing whitespace from reply command

2016-06-07 Thread Eric Wong
Oops, needless waste of space. --- lib/PublicInbox/View.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm index 0ba78fe..6b6597d 100644 --- a/lib/PublicInbox/View.pm +++ b/lib/PublicInbox/View.pm @@ -51,7 +51,7 @@ sub

[PATCH] view: be sure reply text describes plain-text

2016-06-07 Thread Eric Wong
While we may end up mirroring lists which allow HTML mail, encourage plain-text for compatibility since all current inboxes we host are text-only. --- lib/PublicInbox/View.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm

[PATCH] unsubscribe.milter: implement archive blacklist

2016-06-07 Thread Eric Wong
We don't want people following links from archivers and breaking archival. --- examples/unsubscribe.milter | 19 --- 1 file changed, 8 insertions(+), 11 deletions(-) diff --git a/examples/unsubscribe.milter b/examples/unsubscribe.milter index eb1717b..c245a5b 100644 ---

[PATCH] unsubscribe.psgi: disable confirmation

2016-06-07 Thread Eric Wong
This makes unsubscribing easier and frictionless. --- examples/unsubscribe.psgi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/unsubscribe.psgi b/examples/unsubscribe.psgi index 82e186b..beeab9f 100644 --- a/examples/unsubscribe.psgi +++ b/examples/unsubscribe.psgi

[PATCH] unsubscribe: HTML encode undecryptable username

2016-06-10 Thread Eric Wong
Otherwise, URLs can be crafted to inject HTML. --- lib/PublicInbox/Unsubscribe.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/PublicInbox/Unsubscribe.pm b/lib/PublicInbox/Unsubscribe.pm index 95348ea..239feea 100644 --- a/lib/PublicInbox/Unsubscribe.pm +++
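
As a rough illustration of the class of bug this patch describes (an untrusted string taken from a URL and echoed into a page), here is a minimal sketch in Python rather than the project's Perl. The function name `render_unknown_user` and the surrounding markup are hypothetical and are not taken from Unsubscribe.pm.

```python
import html


def render_unknown_user(raw_user: str) -> str:
    """Echo an untrusted string inside an HTML fragment.

    Hypothetical helper: the name and markup are assumptions for
    illustration, not the code in the patch.
    """
    # html.escape() converts &, <, > and (with quote=True) both quote
    # characters, so the value cannot open a tag or break an attribute.
    return "<p>user: " + html.escape(raw_user, quote=True) + "</p>"
```

Without the escaping step, a crafted URL component such as `<script>` would be emitted verbatim into the page.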

[PATCH 0/7] miscellaneous cleanups

2016-05-27 Thread Eric Wong
Only the last one (NewsGroup class removal for ::Inbox) is likely to cause problems but I'll be checking logs for errors. Eric Wong (7): t/plack: ensure we can cascade on common endpoints http: clarify comments about layering violation Makefile.PL: allow N to be overridden

[PATCH 1/7] t/plack: ensure we can cascade on common endpoints

2016-05-27 Thread Eric Wong
We don't serve things like robots.txt, favicon.ico, or .well-known/ endpoints ourselves, but ensure we can be used with Plack::App::Cascade for others. --- t/plack.t | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/t/plack.t b/t/plack.t index 04680b2..a4f3245

[PATCH 4/7] examples: config no longer supports atomUrl

2016-05-27 Thread Eric Wong
We build the atomUrl from url, which can change dynamically depending on what PSGI environment it is called under. --- examples/public-inbox-config | 1 - 1 file changed, 1 deletion(-) diff --git a/examples/public-inbox-config b/examples/public-inbox-config index 0c1db11..7fcbe0b 100644 ---

[PATCH] daemon: reset unused signal handlers to default in child

2016-06-11 Thread Eric Wong
They're effectively noops anyways, and we don't want to be holding a reference to the read end of the parent pipe. --- lib/PublicInbox/Daemon.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/PublicInbox/Daemon.pm b/lib/PublicInbox/Daemon.pm index b64ec87..b76b9ff 100644 ---

[PATCH] nntp: do not double-encode UTF-8 body

2016-06-14 Thread Eric Wong
Or whatever the appropriate Perl terminology is... And we will need to do something appropriate for other encodings, too. I still barely understand Perl Unicode despite attempting to understand the docs over the years... --- lib/PublicInbox/NNTP.pm | 17 - t/nntpd.t

[PATCH] examples: systemd socket and service definitions for daemons

2016-06-13 Thread Eric Wong
Since our daemons are built to take advantage of socket activation, provide example files to allow systems administrators to hit the ground running with systemd. Example init files for other systems greatly appreciated. --- examples/public-inbox-httpd.socket | 10 ++

[PATCH] view: msg_html uses getline body to reduce latency

2016-06-13 Thread Eric Wong
We need to ensure we show the message body ASAP since the thread generation via Xapian could take a while and maybe even raise an exception or crash. --- lib/PublicInbox/View.pm | 27 +-- lib/PublicInbox/WWW.pm | 2 +- t/view.t| 8 +++- 3 files

[PATCH] view: inline message reply into message view

2016-06-05 Thread Eric Wong
This should reduce link following for replies and improve visibility. This should also reduce cache overhead/footprint for crawlers. --- lib/PublicInbox/View.pm | 43 --- lib/PublicInbox/WWW.pm | 23 ++- t/view.t| 14

[PATCH] doc: update links to HTTPS sites in INSTALL and README

2016-06-08 Thread Eric Wong
Thanks to Let's Encrypt and getssl, we can afford to have HTTPS for our own hosting, and www.gnu.org has been accessible over HTTPS for a long while. While we're at it, update the copyright years, too. --- INSTALL | 6 +++--- README | 20 ++-- 2 files changed, 13 insertions(+),

[PATCH 2/2] nntp: fix for missing articles/bodies/heads

2016-05-28 Thread Eric Wong
Oops, we totally forgot to automate testing for this :x --- lib/PublicInbox/NNTP.pm | 2 +- t/nntpd.t | 8 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm index 58b86a8..232237c 100644 ---

[PATCH 2/2] git-http-backend: close pipe for generic PSGI on errors

2016-05-27 Thread Eric Wong
The generic PSGI code needs to avoid resource leaks if smart cloning is disabled (due to resource constraints). --- lib/PublicInbox/GitHTTPBackend.pm | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/lib/PublicInbox/GitHTTPBackend.pm b/lib/PublicInbox/GitHTTPBackend.pm

[PATCH 1/2] git-http-backend: move real close to GetlineBody

2016-05-27 Thread Eric Wong
This makes more sense as it keeps management of rpipe nice and neat. --- lib/PublicInbox/GetlineBody.pm| 12 lib/PublicInbox/GitHTTPBackend.pm | 1 - 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/lib/PublicInbox/GetlineBody.pm b/lib/PublicInbox/GetlineBody.pm

[PATCH] config: fix NewsWWW fallback for newsgroups in HTTP URLs

2016-05-27 Thread Eric Wong
Oops, added a test to prevent regressions while we're at it. --- lib/PublicInbox/Config.pm | 4 +++- lib/PublicInbox/NewsWWW.pm | 3 ++- t/plack.t | 15 +++ 3 files changed, 20 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/Config.pm

[PATCH 4/3] httpd/async: do not needlessly weaken

2016-05-27 Thread Eric Wong
The restart_read callback has no chance of circular reference, and weakening $self before we create it can cause $self to be undefined inside the callback (seen during stress testing). Fixes: 395406118cb2 ("httpd/async: prevent circular reference") --- lib/PublicInbox/HTTPD/Async.pm | 7 ++-

[PATCH 2/3] http: avoid circular reference for getline responses

2016-05-27 Thread Eric Wong
Lightly tested, this seems to work when mass-aborting responses. Will still need to automate the testing... --- lib/PublicInbox/HTTP.pm | 45 - 1 file changed, 28 insertions(+), 17 deletions(-) diff --git a/lib/PublicInbox/HTTP.pm

[PATCH 1/3] httpd/async: prevent circular reference

2016-05-27 Thread Eric Wong
We must avoid circular references which can cause leaks in long-running processes. This callback is dangerous since it may never be called to properly terminate everything. --- lib/PublicInbox/HTTPD/Async.pm | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git

[PATCH 3/3] git-http-backend: fix aborts for generic PSGI clone

2016-05-27 Thread Eric Wong
We need to avoid circular references in the generic PSGI layer, do it by abusing DESTROY. --- lib/PublicInbox/GetlineBody.pm| 31 +++ lib/PublicInbox/GitHTTPBackend.pm | 13 - 2 files changed, 35 insertions(+), 9 deletions(-) create mode 100644

[PATCH 0/3] http: another round EPIPE fixes

2016-05-27 Thread Eric Wong
Hopefully this is the end of resource leaks on prematurely aborted client connections. Eric Wong (3): httpd/async: prevent circular reference http: avoid circular reference for getline responses git-http-backend: fix aborts for generic PSGI clone lib/PublicInbox/GetlineBody.pm

[PATCH] www: force two element key-value pairs in query

2016-06-01 Thread Eric Wong
Oops, this quiets down a warning seen in logs. --- lib/PublicInbox/WWW.pm | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm index e211cd6..d26b69c 100644 --- a/lib/PublicInbox/WWW.pm +++ b/lib/PublicInbox/WWW.pm @@ -45,7 +45,8 @@

[PATCH ssoma] doc: do not override Makefile if POD2* is set

2016-05-29 Thread Eric Wong
MakeMaker will already set this, for us; so do not override it. --- Documentation/include.mk | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/Documentation/include.mk b/Documentation/include.mk index 7b64a19..7116f87 100644 --- a/Documentation/include.mk +++

[PATCH 2/3] www: remove gratuitous use of Plack::Request methods

2016-05-29 Thread Eric Wong
Accessing $env directly is faster and we will eventually remove all Plack::Request dependencies. --- lib/PublicInbox/WWW.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm index 88d4f6f..a820207 100644 ---

[PATCH 3/3] www: remove a few more Plack::Request dependencies

2016-05-29 Thread Eric Wong
Still a work in progress, but SearchView no longer depends on Plack::Request at all and Feed is getting there. We now parse all query parameters up front, but we may do that lazily again in the future. --- lib/PublicInbox/Feed.pm | 18 +- lib/PublicInbox/SearchView.pm | 14

[PATCH] http: yield body->getline running time

2016-05-29 Thread Eric Wong
We cannot let a client monopolize the single-threaded server even if it can drain the socket buffer faster than we can emit data. While we're at it, acknowledge this behavior (which happens naturally) in httpd/async. The same idea is present in NNTP for the long_response code. This is the

[PATCH] use utf8::{encode,decode} for in-place transforms

2016-05-29 Thread Eric Wong
No need to duplicate the string when transforming it; learned from studying SpamAssassin 3.4.1 --- lib/PublicInbox/NNTP.pm | 6 ++ lib/PublicInbox/SearchMsg.pm | 4 +--- lib/PublicInbox/View.pm | 1 - 3 files changed, 3 insertions(+), 8 deletions(-) diff --git

[PATCH] daemon: disable SIGWINCH unless explicitly daemonized

2016-06-21 Thread Eric Wong
Checking stdin/stdout/stderr is not sufficient as the daemon without setsid can still be under the control of a terminal. Unfortunately this means systemd users cannot use SIGWINCH, either. --- lib/PublicInbox/Daemon.pm | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git

[PATCH] view: fix topic threading when ghosts are present

2016-06-22 Thread Eric Wong
This fixes a bug where a message replying to a ghost would accidentally be added to the wrong topic in the index/topic view. Before commit 76d8f68dc273e54809ad69cfe49e141003f790ef ("view: avoid recursion in topic index"), we would refuse to indent a topic which started with a ghost which hid the

[PATCH 4/9] t/mda.t: remove senseless use of Email::Filter

2016-06-14 Thread Eric Wong
Totally unnecessary... --- t/mda.t | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/t/mda.t b/t/mda.t index fdba967..66ba859 100644 --- a/t/mda.t +++ b/t/mda.t @@ -4,7 +4,6 @@ use strict; use warnings; use Test::More; use Email::MIME; -use Email::Filter; use File::Temp

[PATCH 5/9] t/mda: use only Maildir for testing

2016-06-14 Thread Eric Wong
Remove mbox tests since mbox is unreliable due to raciness and incompatible implementations. We will drop support for mbox emergency destinations, soon. --- t/mda.t | 75 +++-- 1 file changed, 12 insertions(+), 63 deletions(-) diff

[PATCH 7/9] filter: begin work on a new filter API

2016-06-14 Thread Eric Wong
This filter API should be independent of Email::Filter and hopefully less intrusive to long running processes. --- lib/PublicInbox/Filter/Base.pm | 100 +++ lib/PublicInbox/Filter/Mirror.pm | 12 + lib/PublicInbox/Filter/Vger.pm | 33 +

[PATCH 3/9] learn: remove IPC::Run dependency

2016-06-14 Thread Eric Wong
We'll be relying on our spawn implementation, for now; since it'll be consistent with the rest of our code and can optionally take advantage of vfork. --- script/public-inbox-learn | 36 1 file changed, 24 insertions(+), 12 deletions(-) diff --git

[PATCH] search: increase limit for thread search

2016-06-16 Thread Eric Wong
Some threads are easily over 100 messages, so the 50 limit is not enough. It is likely that 1000 messages is not enough, either, and we will need to tune our threading to handle more messages and supply options for configurability. --- lib/PublicInbox/Search.pm | 2 ++ 1 file changed, 2

[PATCH] address: no commas in email addresses

2016-06-16 Thread Eric Wong
We only do loose parsing, here, and I don't think I've seen a comma in a valid email address, so let's not support them. --- lib/PublicInbox/Address.pm | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/PublicInbox/Address.pm b/lib/PublicInbox/Address.pm index ef4cbdc..8b3daf5

[PATCH] TODO: remove cookies for colors

2016-06-16 Thread Eric Wong
It would be too much of a burden for caching system when user-supplied CSS is more powerful. --- Documentation/design_www.txt | 5 ++--- TODO | 2 -- 2 files changed, 2 insertions(+), 5 deletions(-) diff --git a/Documentation/design_www.txt b/Documentation/design_www.txt

[PATCH 1/3] scripts/dc-dlvr: update copyright

2016-06-16 Thread Eric Wong
--- scripts/dc-dlvr | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/scripts/dc-dlvr b/scripts/dc-dlvr index ca64505..e0f3210 100755 --- a/scripts/dc-dlvr +++ b/scripts/dc-dlvr @@ -1,6 +1,6 @@ #!/bin/sh -# Copyright (C) 2008-2013, Eric Wong <e...@80x24.org> -# L

[PATCH] mda: support loading arbitrary filters

2016-06-16 Thread Eric Wong
Give users some rope to do their own filtering. --- script/public-inbox-mda | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/script/public-inbox-mda b/script/public-inbox-mda index 63096fe..26b70cf 100755 --- a/script/public-inbox-mda +++ b/script/public-inbox-mda @@ -57,7

[PATCH] filter/base: reject more types by default

2016-06-17 Thread Eric Wong
Try to be descriptive for some of these. --- lib/PublicInbox/Filter/Base.pm | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/lib/PublicInbox/Filter/Base.pm b/lib/PublicInbox/Filter/Base.pm index 37f1ee7..b2bb146 100644 --- a/lib/PublicInbox/Filter/Base.pm +++

[PATCH 5/4] watch_maildir: tighten up path checks

2016-06-18 Thread Eric Wong
Only mark seen messages as spam, otherwise it could be too aggressive and cause problems or over training. We wouldn't want a wayward FIFO ruining our day, either :) --- lib/PublicInbox/WatchMaildir.pm | 12 +--- t/watch_maildir.t | 3 ++- 2 files changed, 7 insertions(+),

[PATCH] www: undefined query string values are empty strings

2016-06-17 Thread Eric Wong
We use very short query parameters for search, so "" without a '=' implies truth for 'r' (relevance). --- lib/PublicInbox/WWW.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm index 78b8826..f88894a 100644 --- a/lib/PublicInbox/WWW.pm +++

[PATCH 2/1] http: constrain getline/close responses by time

2016-06-19 Thread Eric Wong
This allows us to yield control to other clients gracefully if getline takes too long to generate a chunk. This is more expensive but should not cost a syscall on modern 64-bit systems. --- lib/PublicInbox/HTTP.pm | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git

[PATCH 1/2] spawn: try to keep signals blocked in spawned child

2016-06-18 Thread Eric Wong
While we only want to stop our daemons and gracefully destroy subprocesses, it is common for 'Ctrl-C' from a terminal to kill the entire pgroup. Killing an entire pgroup nukes subprocesses like git-upload-pack and breaks graceful shutdown on long clones. Make a best effort to ensure git-upload-pack

[PATCH] http: avoid recursion when hitting write count limit

2016-06-19 Thread Eric Wong
Use the EvCleanup::asap handler to reschedule our writes after yielding to other clients. --- lib/PublicInbox/HTTP.pm | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/lib/PublicInbox/HTTP.pm b/lib/PublicInbox/HTTP.pm index 6df1c3f..e0ed2d1 100644 ---

[PATCH 0/2] graceful shutdown tweaks

2016-06-18 Thread Eric Wong
gnals from the master and instead fake signals via pipes. Eric Wong (2): spawn: try to keep signals blocked in spawned child daemon: be less misleading about graceful shutdown lib/PublicInbox/Daemon.pm | 3 ++- lib/PublicInbox/HTTPD/Async.pm | 3 --- lib/PublicInbox/Spawn.pm

[PATCH 2/2] daemon: be less misleading about graceful shutdown

2016-06-18 Thread Eric Wong
We do not need to count the httpd.async object against our running client count, that is tied to the socket of the actual client. This prevents misleading sysadmins about connected clients during shutdown. --- lib/PublicInbox/Daemon.pm | 3 ++- lib/PublicInbox/HTTPD/Async.pm | 3 --- 2

[PATCH 4/4] import: allow messages without subject

2016-06-18 Thread Eric Wong
Because our WatchMaildir module is liberal about what it accepts, we can potentially have messages without a subject. --- lib/PublicInbox/Import.pm | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/Import.pm b/lib/PublicInbox/Import.pm index

[PATCH 1/4] emergency: avoid needless mkpath dependency

2016-06-18 Thread Eric Wong
Be more explicit and slightly speed up tests. --- lib/PublicInbox/Emergency.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/Emergency.pm b/lib/PublicInbox/Emergency.pm index e402d30..4ee8621 100644 --- a/lib/PublicInbox/Emergency.pm +++

[PATCH 2/4] watch_maildir: add scan test

2016-06-18 Thread Eric Wong
This should be portable despite the intended use of this directory being non-portable. --- lib/PublicInbox/WatchMaildir.pm | 7 ++- t/watch_maildir.t | 39 +++ 2 files changed, 45 insertions(+), 1 deletion(-) create mode 100644

[PATCH 3/4] watch_maildir: spam removal support

2016-06-18 Thread Eric Wong
We can support spam removal by watching a special "spam" Maildir, too. We can run public-inbox-learn as a separate step, and that command will be improved to support auto-learning, too. --- lib/PublicInbox/WatchMaildir.pm | 134 ++-- t/watch_maildir.t

[PATCH 0/4] watch improvements for mirroring

2016-06-18 Thread Eric Wong
public-inbox-watch is more liberal than public-inbox-mda, so we need to make some adjustments about how it handles messages with missing Message-IDs, Subjects, and such. Eric Wong (4): emergency: avoid needless mkpath dependency watch_maildir: add scan test watch_maildir: spam

[PATCH] mbox: set gzip timestamp to the Unix epoch

2016-06-19 Thread Eric Wong
This allows consistency between different invocations from roughly the same period and is no worse for caching than any of our existing HTML and Atom feeds. We cannot set the timestamp to the end date since messages may be added to the repository while we are iterating (and this streaming
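
To illustrate why pinning the gzip timestamp helps consistency, here is a small sketch using Python's standard `gzip` module instead of the project's Perl. The helper name `gzip_bytes` is hypothetical. The gzip header carries a four-byte MTIME field, so two compressions of the same input normally differ whenever the clock has advanced; fixing the field at zero (the Unix epoch) removes that source of variation.

```python
import gzip
import io


def gzip_bytes(data: bytes) -> bytes:
    """Compress data with the gzip header MTIME field fixed at 0.

    Hypothetical helper for illustration only.
    """
    buf = io.BytesIO()
    # mtime=0 writes the Unix epoch into the header instead of "now".
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=0) as gz:
        gz.write(data)
    return buf.getvalue()
```

With the timestamp pinned, repeated runs over identical input within the same Python and zlib build produce identical output, which is friendlier to caches than a header that changes every second.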

[PATCH 2/3] view: introduce WwwStream interface

2016-06-17 Thread Eric Wong
This will allow us to commonalize HTML generation in the future and is the start of moving existing HTML generation to a "pull" streaming model (from the existing "push" one). Using the getline/close pull model is superior to the existing $fh->write streaming as it allows us to throttle response

[PATCH 0/3] introduce WwwStream for per-message views

2016-06-17 Thread Eric Wong
It will be considerably more work to port the rest of the stuff over, but this is a start... Eric Wong (3): feed: split out top-of-page generation view: introduce WwwStream interface view: minor tweaks to reduce long lines lib/PublicInbox/Feed.pm | 36

[PATCH 1/3] feed: split out top-of-page generation

2016-06-17 Thread Eric Wong
This will eventually allow us to reuse code to generate a common header. --- lib/PublicInbox/Feed.pm | 36 1 file changed, 20 insertions(+), 16 deletions(-) diff --git a/lib/PublicInbox/Feed.pm b/lib/PublicInbox/Feed.pm index 07774cb..045e495 100644 ---

[PATCH] view: consolidate per-message newline handling

2016-06-18 Thread Eric Wong
We don't want to blindly append a trailing newline if the message ends in quoted text leading to a , as a newline is already added to a ... --- lib/PublicInbox/View.pm | 22 +++--- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/lib/PublicInbox/View.pm

[PATCH 2/2] examples/*@.service: wait one day for graceful shutdown

2016-06-19 Thread Eric Wong
Because sometimes folks will want to download gigantic mboxes or make large clones over Tor which are not resume-friendly. Note: the timeout logic in nntpd is somewhat over-aggressive and can break some large slrnpulls. This ought to be easily recoverable on the client-side, though, since it's

[PATCH 0/2] robustness improvements

2016-06-19 Thread Eric Wong
Because I care about users downloading over 300 MB from the all.mbox.gz endpoint over Tor or cloning nearly 800 MB over git. It would be nice to be able to resume when a disconnect does happen, however... Eric Wong (2): search: reopen and retry on updated databases examples

[PATCH 1/2] search: reopen and retry on updated databases

2016-06-19 Thread Eric Wong
This seems like a nasty thing which breaks downloads of large mailboxes. --- lib/PublicInbox/Search.pm | 35 ++- 1 file changed, 22 insertions(+), 13 deletions(-) diff --git a/lib/PublicInbox/Search.pm b/lib/PublicInbox/Search.pm index d9fbc36..856c8c1 100644 ---

Re: [PATCH 0/2] robustness improvements

2016-06-19 Thread Eric Wong
Eric Wong <e...@80x24.org> wrote: > Because I care about users downloading over 300 MB from > the all.mbox.gz endpoint over Tor or cloning nearly > 800 MB over git. Poo. Seeing tor and varnishd processes wakeup constantly when idle is disappointing, though :< Well, I suppos

[PATCH 0/2] better ghost handling

2016-06-20 Thread Eric Wong
Improve the handling of ghosts in the WWW topic view, which led me to a related NNTP cleanup. Eric Wong (2): www: improve topic view by scanning for ghosts nntp: use lookup_mail instead of lookup_message lib/PublicInbox/NNTP.pm | 6 ++ lib/PublicInbox/Search.pm| 6

[PATCH 2/2] nntp: use lookup_mail instead of lookup_message

2016-06-20 Thread Eric Wong
lookup_mail is safer since it won't inadvertently load ghosts. --- lib/PublicInbox/NNTP.pm | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/lib/PublicInbox/NNTP.pm b/lib/PublicInbox/NNTP.pm index 93f654f..4b116a7 100644 --- a/lib/PublicInbox/NNTP.pm +++

[PATCH 1/2] www: improve topic view by scanning for ghosts

2016-06-20 Thread Eric Wong
This should help avoid having too many fake top-level messages in the topic view since we only have a partial window for threading results. --- lib/PublicInbox/Search.pm| 6 ++ lib/PublicInbox/SearchMsg.pm | 2 +- lib/PublicInbox/View.pm | 14 +++--- 3 files changed, 18

[PATCH] view: update git-send-email URL

2016-06-22 Thread Eric Wong
, at least. Junio C Hamano writes: > On Wed, Jun 22, 2016 at 12:00 PM, Eric Wong <e...@80x24.org> wrote: > > Just wondering, who updates > > https://kernel.org/pub/software/scm/git/docs/ > > and why hasn't it been updated in a while? > > (currently it says Last

[PATCH] searchview: fix Atom dump

2016-06-20 Thread Eric Wong
Ugh, and I will still need to write better tests for this (and a billion other things :x) Fixes: 4b313dc74bc9 ("feed: various object-orientation cleanups") --- lib/PublicInbox/SearchView.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/SearchView.pm

[PATCH 0/2] searchidx: fix ghost vivification

2016-06-20 Thread Eric Wong
These fix a message threading bug due to out-of-order message delivery, leading to messages not showing up properly in thread views. Unfortunately this currently requires an index update which cannot be done in place (yet). Eric Wong (2): searchidx: simplify ghost creation searchidx

[PATCH 2/2] searchidx: merge old thread id from ghosts

2016-06-20 Thread Eric Wong
We failed to discard old thread IDs when vivifying ghosts due to out-of-order message arrival. This rectifies the failure and will trigger a re-index. --- lib/PublicInbox/Search.pm| 3 ++- lib/PublicInbox/SearchIdx.pm | 5 +++-- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git

[PATCH 1/2] searchidx: simplify ghost creation

2016-06-20 Thread Eric Wong
Remove some worthless parameters and redundant no-ops to make the next (important) patch easier-to-review. --- lib/PublicInbox/SearchIdx.pm | 20 1 file changed, 4 insertions(+), 16 deletions(-) diff --git a/lib/PublicInbox/SearchIdx.pm b/lib/PublicInbox/SearchIdx.pm index

[PATCH] spawn: improve error checking for fork failures

2016-06-20 Thread Eric Wong
fork failures are unfortunately common when Xapian has gigabytes and gigabytes mmapped. --- lib/PublicInbox/Config.pm | 2 +- lib/PublicInbox/Git.pm | 5 - lib/PublicInbox/Qspawn.pm | 2 +- lib/PublicInbox/Spawn.pm | 8 ++-- lib/PublicInbox/SpawnPP.pm | 6 ++

[PATCH 0/7] www: avoid recursion for thread walking

2016-06-20 Thread Eric Wong
Deep message threads can cause problems for perl since stack seems to be much more expensive than arrays. Switch to a non-recursive thread walking design and commonalize some common idioms, too. Eric Wong (7): view: remove upfx parameter from thread skeleton dump view: remove dst

[PATCH 1/7] view: remove upfx parameter from thread skeleton dump

2016-06-20 Thread Eric Wong
This makes the string creation somewhat simpler and hopefully makes the code easier to reason about. --- lib/PublicInbox/View.pm | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm index dfae44f..9095c50 100644 ---

[PATCH 7/7] view: common thread walking interface

2016-06-20 Thread Eric Wong
Since we have a common pattern for walking threads, extract it into a function and reduce the amount of code we have. This will make it easier to switch to an event-driven interface for getline, too. --- lib/PublicInbox/SearchView.pm | 8 +--- lib/PublicInbox/View.pm | 35

[PATCH 2/7] view: remove dst parameter from thread skeleton dump

2016-06-20 Thread Eric Wong
We can stuff this into the state hash to reduce stack size and hopefully improve readability. --- lib/PublicInbox/View.pm | 15 +-- 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm index 9095c50..a1b45e9 100644 ---

[PATCH 5/7] searchview: remove recursion from thread view

2016-06-20 Thread Eric Wong
As before, recursion can cause problems sooner than unshifting objects into the head of a queue. --- lib/PublicInbox/SearchView.pm | 15 --- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/lib/PublicInbox/SearchView.pm b/lib/PublicInbox/SearchView.pm index

[PATCH 3/7] view: remove recursion from thread skeleton dump

2016-06-20 Thread Eric Wong
This should help prevent OOM errors from arbitrarily deep threads and will make our streaming interface easier-to-implement. --- lib/PublicInbox/View.pm | 13 +++-- 1 file changed, 7 insertions(+), 6 deletions(-) diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm index

[PATCH 4/7] view: remove recursion from expanded thread view

2016-06-20 Thread Eric Wong
This should let us generate HTML for arbitrarily deep threads without blowing the stack. How it renders on the client side is another matter... --- lib/PublicInbox/View.pm | 15 --- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/lib/PublicInbox/View.pm
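
The general idea behind this series (replacing recursive descent with an explicit work list) can be sketched as follows. This is Python, not the project's Perl, and the node layout (a dict with `id` and `children` keys) and the name `walk_thread` are assumptions made for the example; the actual View.pm data structures are not shown in this listing.

```python
def walk_thread(root):
    """Yield (depth, id) for every node in pre-order without recursion.

    Assumed node shape: {"id": ..., "children": [node, ...]}.
    """
    stack = [(0, root)]
    while stack:
        depth, node = stack.pop()
        yield depth, node["id"]
        # Push children in reverse so the first child is visited first.
        for child in reversed(node.get("children", [])):
            stack.append((depth + 1, child))
```

Because pending nodes live in an ordinary list rather than on the call stack, the depth of a thread is limited by available memory rather than by the interpreter's recursion limit. A generator of this shape also fits a pull-style (getline) consumer, since each step produces one item and then returns control.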

[PATCH] remove dependency on IPC::Run

2016-06-16 Thread Eric Wong
We no longer depend on it for the core code, and tests are optional for users. Hopefully this makes this easier-to-install. --- INSTALL | 1 - Makefile.PL | 1 - t/cgi.t | 5 +++-- t/mda.t | 21 +++-- 4 files changed, 14 insertions(+), 14 deletions(-) diff --git

[PATCH] TODO: add a few Xapian-related items

2016-06-25 Thread Eric Wong
"git cat-file --batch" seems expensive for big repos and loading 70K+ tree objects in git isn't all that fast. Ideas are cheap, time, code, and testing are not :P --- TODO | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/TODO b/TODO index 0d6f1a0..f29f2f0 100644 ---

[PATCH] view: safer and optional quoting for --in-reply-to arg

2016-06-25 Thread Eric Wong
Angle brackets around the --in-reply-to= arg for git send-email have been optional since git v1.5.3.2, so strip them and make the command-line argument easier to type. --- lib/PublicInbox/View.pm | 11 ++- t/view.t| 14 +- 2 files changed, 23 insertions(+), 2
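
A minimal sketch of the idea, in Python rather than the project's Perl: drop the surrounding angle brackets and only quote the Message-ID when the shell would otherwise misread it. The helper name `in_reply_to_arg` is hypothetical, and using `shlex.quote` is an assumption of this sketch, not necessarily how View.pm decides when to quote.

```python
import shlex


def in_reply_to_arg(message_id: str) -> str:
    """Build a --in-reply-to= argument for display in a reply hint.

    Hypothetical helper; the quoting policy here is an assumption.
    """
    mid = message_id.strip()
    # The brackets are optional for git send-email, so omit them.
    if mid.startswith("<") and mid.endswith(">"):
        mid = mid[1:-1]
    # shlex.quote() leaves plain tokens untouched and single-quotes
    # anything containing shell metacharacters.
    return "--in-reply-to=" + shlex.quote(mid)
```

Typical Message-IDs contain only letters, digits, `@`, `.` and `-`, so they come out unquoted and are easy to type; unusual ones are quoted so that copying the command stays safe.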

[PATCH] address: remove Address::from_name

2016-06-25 Thread Eric Wong
Address::names is sufficient to handle what from_name did. --- lib/PublicInbox/Address.pm | 14 -- lib/PublicInbox/Feed.pm | 2 +- lib/PublicInbox/SearchMsg.pm | 3 ++- lib/PublicInbox/View.pm | 4 ++-- t/mda.t | 2 +- 5 files changed, 6

[PATCH] www_stream: linkify cloneurl entries if they're HTTP/HTTPS

2016-06-25 Thread Eric Wong
They may be other public-inbox instances which are browseable, so provide a link to them to encourage their use as clones. --- lib/PublicInbox/WwwStream.pm | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/lib/PublicInbox/WwwStream.pm b/lib/PublicInbox/WwwStream.pm

[PATCH 5/6] watch_maildir: implement optional spam checking

2016-06-24 Thread Eric Wong
Mailing lists I watch and mirror may not have the best spam filtering, and an extra layer should not hurt. --- lib/PublicInbox/Import.pm | 6 +- lib/PublicInbox/WatchMaildir.pm | 34 -- t/import.t | 6 +- t/watch_maildir.t

[PATCH 6/6] watch_maildir: ignore Trash and Drafts, support Dovecot

2016-06-24 Thread Eric Wong
Trashed messages and drafts are probably not intended for importing, so do not import them. Dovecot uses extra flags via lowercase letters, so we must support those (as that's the server I use). --- lib/PublicInbox/WatchMaildir.pm | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-)
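
For readers unfamiliar with Maildir naming, a rough sketch of flag-based filtering follows, in Python rather than the project's Perl. Maildir filenames in cur/ end in `:2,` followed by flag letters, where `T` marks trashed and `D` marks draft messages; lowercase letters are the extra Dovecot flags mentioned above. The function name `should_import`, the regular expression, and the choice to reject names without an info suffix are all assumptions of this sketch, not the logic in WatchMaildir.pm.

```python
import re

# Info suffix: ":2," followed by uppercase standard flags and,
# for Dovecot, lowercase keyword letters.
_INFO_RE = re.compile(r":2,([A-Za-z]*)$")


def should_import(filename: str) -> bool:
    """Return True if a Maildir filename looks importable.

    Hypothetical helper: only names carrying an info suffix are
    considered, which is a simplification made for this example.
    """
    m = _INFO_RE.search(filename)
    if m is None:
        return False
    flags = m.group(1)
    # Skip trashed (T) and draft (D) messages.
    return "T" not in flags and "D" not in flags
```

Accepting lowercase letters in the pattern matters: a stricter pattern that allowed only the standard uppercase flags would fail to match files that Dovecot has tagged with keywords.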

[PATCH 4/6] watch_maildir: rename _check_spam => _remove_spam

2016-06-24 Thread Eric Wong
We do not actually do spam checking, here; but will do spam checking before adding a message in the future. --- lib/PublicInbox/WatchMaildir.pm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/WatchMaildir.pm b/lib/PublicInbox/WatchMaildir.pm index

[PATCH] www: unescape '+' in query parameter to space

2016-06-25 Thread Eric Wong
Fixes: fbcb7de93884b ("www: remove a few more Plack::Request dependencies") --- lib/PublicInbox/WWW.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/PublicInbox/WWW.pm b/lib/PublicInbox/WWW.pm index f1f4abd..d6b07bf 100644 --- a/lib/PublicInbox/WWW.pm +++ b/lib/PublicInbox/WWW.pm @@
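
The query-string handling touched by this and the earlier www patches in this listing ('+' decoded as a space, a key with no '=' treated as an empty string) can be sketched in Python as below. This is not the code in WWW.pm; the function name `parse_query` is hypothetical, and only '&' is handled as a separator here.

```python
from urllib.parse import unquote_plus


def parse_query(qs: str) -> dict:
    """Parse a query string into a dict of decoded keys and values.

    Hypothetical helper for illustration only.
    """
    params = {}
    for pair in qs.split("&"):
        if not pair:
            continue
        # partition() always returns three parts, so a key without
        # '=' gets an empty-string value rather than a missing one.
        key, _, value = pair.partition("=")
        # unquote_plus() turns '+' into a space and decodes %XX.
        params[unquote_plus(key)] = unquote_plus(value)
    return params
```

For example, `parse_query("q=hello+world&r")` returns `{"q": "hello world", "r": ""}`.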

[PATCH] watch_maildir: warn on spam check failures

2016-06-26 Thread Eric Wong
It would be nice to know about spamcheck failures. --- lib/PublicInbox/WatchMaildir.pm | 1 + 1 file changed, 1 insertion(+) diff --git a/lib/PublicInbox/WatchMaildir.pm b/lib/PublicInbox/WatchMaildir.pm index b25704e..145b363 100644 --- a/lib/PublicInbox/WatchMaildir.pm +++

[PATCH] inbox: do not weaken already-weak refs

2016-06-24 Thread Eric Wong
This quiets a (hopefully harmless) warning when a ref remains alive through several expiry timeouts. --- lib/PublicInbox/Inbox.pm | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/lib/PublicInbox/Inbox.pm b/lib/PublicInbox/Inbox.pm index 3f1b733..34191fc 100644 ---

[PATCH 3/4] mbox: reduce small packets for gzipped mboxes

2016-06-24 Thread Eric Wong
We want to avoid sending 10 or 20-byte gzip headers as separate TCP packets to reduce syscalls and avoid wasting bandwidth. --- lib/PublicInbox/Mbox.pm | 23 ++- 1 file changed, 10 insertions(+), 13 deletions(-) diff --git a/lib/PublicInbox/Mbox.pm b/lib/PublicInbox/Mbox.pm

[PATCH 2/4] evcleanup: micro-optimize asap function

2016-06-24 Thread Eric Wong
Instead of relying on a timer with immediate callback, arm a pipe to watch for writability, ensuring the callback always fires. --- lib/PublicInbox/EvCleanup.pm | 42 +- 1 file changed, 33 insertions(+), 9 deletions(-) diff --git

[PATCH 0/4] http + mbox: tiny optimizations

2016-06-24 Thread Eric Wong
For the gigantic $INBOX/all.mbox.gz response, this seems to slightly improve speeds from roughly 290K/s to roughly 330K/s when fetching out of a ~750MB aggressively-packed inbox. Eric Wong (4): http: always yield on getline/body evcleanup: micro-optimize asap function mbox

[PATCH 4/4] http: cork chunked responses for small savings

2016-06-24 Thread Eric Wong
This only affects Linux users with MSG_MORE support. We can avoid extra TCP overhead for sub-optimal chunk sizes by using MSG_MORE even with chunk trailers under Linux. This breaks real-time apps which require <= 200ms latency for streaming small packets (e.g. implementing "tail -F"), but the

[PATCH] githttpbackend: shallow clone workaround

2016-06-24 Thread Eric Wong
Apparently git-http-backend exits with a non-zero status on shallow clones (due to git-upload-pack), so there is a to-be-fixed bug in git.git http://mid.gmane.org/20160621112303.ga21...@dcvr.yhbt.net http://mid.gmane.org/20160621121041.ga29...@sigill.intra.peff.net ---

[PATCH 0/2] www: show To/Cc destinations in conversation view

2016-06-25 Thread Eric Wong
And improve our name/address extraction to do it. Eric Wong (2): address: beef up the module with name list extraction view: show To/Cc destinations in conversation view MANIFEST | 1 + lib/PublicInbox/Address.pm | 13 - lib/PublicInbox/View.pm| 18

[PATCH 2/2] view: show To/Cc destinations in conversation view

2016-06-25 Thread Eric Wong
It is important to show the decentralized nature of communication in our web views. --- lib/PublicInbox/View.pm | 18 +++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/lib/PublicInbox/View.pm b/lib/PublicInbox/View.pm index 9fa2a9b..d906276 100644 ---

[PATCH] mda: drop leading "From " lines again

2016-06-26 Thread Eric Wong
Oops... While we're at it, drop blank lines before the "From ", too, since it could happen. --- script/public-inbox-learn | 2 +- script/public-inbox-mda | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/script/public-inbox-learn b/script/public-inbox-learn index
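
For context, an mbox-style "From " separator line is not an RFC 822 header and should not survive into the stored message. A minimal sketch of stripping it, along with any blank lines in front of it, is shown below in Python rather than the project's Perl; the function name `strip_mbox_from` is hypothetical and this is not the code in public-inbox-mda.

```python
def strip_mbox_from(raw: str) -> str:
    """Drop a leading mbox "From " line and blank lines before it.

    Hypothetical helper for illustration only.
    """
    text = raw.lstrip("\r\n")
    # "From " (with a space) is the mbox separator; "From:" is a
    # real header and must be left alone.
    if text.startswith("From "):
        _, _, rest = text.partition("\n")
        return rest
    return raw
```

Messages that do not begin with a separator line are returned unchanged.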

[PATCH 2/1] inbox: ensure we do not show leading "From " lines

2016-06-26 Thread Eric Wong
Some messages will be misimported due to an old bug, clean them up and ensure we do not propagate the mistake. Followup-to: a0c07cba0e5d ("mda: drop leading "From " lines again") --- lib/PublicInbox/Inbox.pm | 4 +++- lib/PublicInbox/SearchIdx.pm | 2 ++ 2 files changed, 5 insertions(+), 1

[PATCH 3/6] document Filesys::Notify::Simple dependency

2016-06-24 Thread Eric Wong
And improve documentation for existing dependencies, too. --- INSTALL | 24 ++-- lib/PublicInbox/WatchMaildir.pm | 2 ++ t/watch_maildir.t | 5 + 3 files changed, 21 insertions(+), 10 deletions(-) diff --git a/INSTALL b/INSTALL
