Send inn-workers mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        https://lists.isc.org/mailman/listinfo/inn-workers
or, via email, send a message with subject or body 'help' to
        [email protected]

You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of inn-workers digest..."


Today's Topics:

   1. [PATCH 2/2] storage: Add "path" option, to store based on the
      path the article went (Christoph Biedl)
   2. Re: [PATCH 0/4] Improve delayer (Christoph Biedl)
   3. Re: [PATCH 0/2] More criteria for storage.conf (Julien ÉLIE)


----------------------------------------------------------------------

Message: 1
Date: Tue, 16 Jan 2024 20:00:00 +0100
From: Christoph Biedl <[email protected]>
To: [email protected]
Subject: [PATCH 2/2] storage: Add "path" option, to store based on the
        path the article went
Message-ID: <[email protected]>

---
 doc/pod/storage.conf.pod | 15 +++++++++++++++
 include/inn/storage.h    |  2 ++
 innd/art.c               |  2 ++
 storage/interface.c      | 35 +++++++++++++++++++++++++++++++++++
 storage/interface.h      |  5 +++++
 5 files changed, 59 insertions(+)

diff --git a/doc/pod/storage.conf.pod b/doc/pod/storage.conf.pod
index 52c8cc2c8..16dca3099 100644
--- a/doc/pod/storage.conf.pod
+++ b/doc/pod/storage.conf.pod
@@ -145,6 +145,21 @@ to true in F<inn.conf>.  This is a boolean value; C<true>, C<yes> and
 C<on> are usable to enable this key.  The case of these values is not
 significant.  The default is false.
 
+=item I<path>: <wildmat>
+
+Which articles are stored using this storage method, based on their path.
+<wildmat> is a I<uwildmat> pattern which is matched against the Path
+header field body of the article.  Poison wildmat expressions
+(expressions starting with C<@>) are allowed and can be used to exclude
+certain path patterns.  The <wildmat> pattern is matched in order.
+
+A typical use case might be to store articles from a spammy site in
+a small buffer so that overall retention does not suffer:
+
+    Path: "*!spam-site.example.com!not-for-mail"
+
+The default is to match all articles.
+
 =back
 
 If an article matches all of the constraints of an entry, it is stored
diff --git a/include/inn/storage.h b/include/inn/storage.h
index 2cdb1625c..d6db1d345 100644
--- a/include/inn/storage.h
+++ b/include/inn/storage.h
@@ -53,6 +53,8 @@ typedef struct {
     bool filtered;            /* Article was marked by the filter */
     char *groups;             /* Where Newsgroups header field body starts */
     int groupslen;            /* Length of Newsgroups header field body */
+    char *path;               /* Where Path header field body starts */
+    int pathlen;              /* Length of Path header field body */
     TOKEN *token;             /* A pointer to the article's TOKEN */
 } ARTHANDLE;
 
diff --git a/innd/art.c b/innd/art.c
index 0ef1ac06b..ffb1951b9 100644
--- a/innd/art.c
+++ b/innd/art.c
@@ -565,6 +565,8 @@ ARTstore(CHANNEL *cp, bool filtered)
         arth.groups = HDR(HDR__NEWSGROUPS);
         arth.groupslen = HDR_LEN(HDR__NEWSGROUPS);
     }
+    arth.path = HDR(HDR__PATH);
+    arth.pathlen = HDR_LEN(HDR__PATH);
 
     SMerrno = SMERR_NOERROR;
     result = SMstore(arth);
diff --git a/storage/interface.c b/storage/interface.c
index 34c0ebaec..41eae78aa 100644
--- a/storage/interface.c
+++ b/storage/interface.c
@@ -262,6 +262,7 @@ ParseTime(char *tmbuf)
 #define SMoptions    15
 #define SMexactmatch 16
 #define SMfiltered   17
+#define SMpath       18
 
 static CONFTOKEN smtoks[] = {
     {SMlbrace,     (char *) "{"          },
@@ -274,6 +275,7 @@ static CONFTOKEN smtoks[] = {
     {SMoptions,    (char *) "options:"   },
     {SMexactmatch, (char *) "exactmatch:"},
     {SMfiltered,   (char *) "filtered:"  },
+    {SMpath,       (char *) "path:"      },
     {0,            NULL                  }
 };
 
@@ -301,6 +303,7 @@ SMreadconfig(void)
     int inbrace;
     bool exactmatch = false;
     bool filtered = false;
+    char *path_pattern = NULL;
 
     /* if innconf isn't already read in, do so. */
     if (innconf == NULL) {
@@ -355,6 +358,7 @@ SMreadconfig(void)
             maxexpire = 0;
             exactmatch = false;
             filtered = false;
+            path_pattern = NULL;
 
         } else {
             type = tok->type;
@@ -412,6 +416,15 @@ SMreadconfig(void)
                         || strcasecmp(p, "on") == 0)
                         filtered = true;
                     break;
+                case SMpath:
+                    if (path_pattern)
+                        free(path_pattern);
+                    path_pattern = xstrdup(tok->name);
+                    /* Transform ! to | */
+                    for (char *p = path_pattern; *p; p++)
+                        if (*p == '!')
+                            *p = '|';
+                    break;
                 default:
                     SMseterror(SMERR_CONFIG,
                                "Unknown keyword in method declaration");
@@ -458,6 +471,7 @@ SMreadconfig(void)
             sub->maxexpire = maxexpire;
             sub->exactmatch = exactmatch;
             sub->filtered = filtered;
+            sub->path_pattern = path_pattern;
 
             free(method);
             method = 0;
@@ -632,6 +646,24 @@ MatchGroups(const char *g, int len, const char *pattern, bool exactmatch)
     return wanted;
 }
 
+static bool
+MatchPath(const char *p, int len, const char *pattern)
+{
+    char *path, *q;
+    int i, lastwhite;
+    enum uwildmat matched;
+
+    path = xmalloc(len + 1);
+    strncpy(path, p, len);
+    path[len] = '\0';
+    for (q = path; *q; q++)
+        if (*q == '!')
+            *q = '|';
+    matched = uwildmat_poison(path, pattern);
+    free(path);
+    return matched == UWILDMAT_MATCH;
+}
+
 STORAGE_SUB *
 SMgetsub(const ARTHANDLE article)
 {
@@ -654,6 +686,9 @@ SMgetsub(const ARTHANDLE article)
             && (!sub->minexpire || article.expires >= sub->minexpire)
             && (!sub->maxexpire || (article.expires <= sub->maxexpire))
             && (sub->filtered == article.filtered)
+            && (
+                sub->path_pattern == NULL ||
+                MatchPath(article.path, article.pathlen, sub->path_pattern))
             && MatchGroups(article.groups, article.groupslen, sub->pattern,
                            sub->exactmatch)) {
             if (InitMethod(typetoindex[sub->type]))
diff --git a/storage/interface.h b/storage/interface.h
index c67d1eb1e..6e3be1933 100644
--- a/storage/interface.h
+++ b/storage/interface.h
@@ -46,6 +46,11 @@ typedef struct __S_SUB__ {
     bool exactmatch;  /* all newsgroups to which article belongs
                          should match the patterns */
     bool filtered;    /* article was marked by the filter */
+    char *path_pattern;
+                      /* Wildmat pattern to check against the
+                         path to determine if the article
+                         should go to this method. NULL if any path
+                         should match. */
     struct __S_SUB__ *next;
 } STORAGE_SUB;
 
-- 
2.39.2
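
The matching idea behind MatchPath can be tried out on its own with the
sketch below.  It is only an illustration under stated assumptions, not the
patch's code: glob_match is a hypothetical stand-in for INN's
uwildmat_poison and understands only '*' and '?' (no comma-separated lists,
no negation, no poison entries), and bang_to_pipe and path_matches are
names made up for this example.  What it does show is the step the patch
relies on: '!' is rewritten to '|' in both the Path header field body and
the configured pattern before the comparison, so that '!' in a path is
never taken as wildmat syntax.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for uwildmat: supports only '*' and '?'. */
static bool
glob_match(const char *text, const char *pat)
{
    if (*pat == '\0')
        return *text == '\0';
    if (*pat == '*')
        return glob_match(text, pat + 1)
               || (*text != '\0' && glob_match(text + 1, pat));
    if (*text != '\0' && (*pat == '?' || *pat == *text))
        return glob_match(text + 1, pat + 1);
    return false;
}

/* Return a NUL-terminated copy of s[0..len) with every '!' turned into '|'. */
static char *
bang_to_pipe(const char *s, size_t len)
{
    char *copy = malloc(len + 1);

    if (copy == NULL)
        abort();
    for (size_t i = 0; i < len; i++)
        copy[i] = (s[i] == '!') ? '|' : s[i];
    copy[len] = '\0';
    return copy;
}

/* Normalize both the path and the pattern, then compare them. */
static bool
path_matches(const char *path, const char *pattern)
{
    char *p, *q;
    bool ok;

    assert(path != NULL && pattern != NULL);
    p = bang_to_pipe(path, strlen(path));
    q = bang_to_pipe(pattern, strlen(pattern));
    ok = glob_match(p, q);
    free(p);
    free(q);
    return ok;
}
```

With the pattern from the documentation hunk,
"*!spam-site.example.com!not-for-mail", path_matches accepts the path
"news.example.org!spam-site.example.com!not-for-mail" and rejects
"news.example.org!good.example.net!not-for-mail".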



------------------------------

Message: 2
Date: Tue, 16 Jan 2024 21:25:08 +0100
From: Christoph Biedl <[email protected]>
To: [email protected]
Subject: Re: [PATCH 0/4] Improve delayer
Message-ID: <[email protected]>
Content-Type: text/plain; charset=utf-8

Julien ÉLIE wrote...

> Now that the delayer script has proper documentation, been thoroughly
> tested, and got a few useful new flags, I suggest putting it into the
> backends directory and officially installing it.
> Any objection about that?

Certainly not.

> > +This program implements a delaying pipe. Lines sent to the process
> > +are spooled and printed after the given delay time has passed.
> If I understand well how the script works, all the spooled lines are printed
> at the same time when the given delay time has passed.  It means that
> articles are in fact delayed by a time between 0 and the next arrival of an
> article after the delay time.
> I can explicitly add that to the documentation if you confirm it.

Yes. In regular operation, the delay value is rather a minimum.[1] It may
increase to a huge amount of time if no other line arrives on stdin.
However, news.daily will restart all feeds, so that's the upper limit in
practice.

[1] The exception: when the input is closed (usually via some ctlinnd
command), the entire queue is sent right away. I implemented that storage
option for a reason.

> I'm wondering whether we should not implement a minimal delayed time for a
> line before being sent, for the script to be even more efficient.

Um, do you mean an upper time limit a line may be queued? Should be
doable, but I got burnt more than once from fiddling with alarm, read,
select and the things that need to be done for this. So I'll happily
leave that to someone else.

Thanks for your kind words,

    Christoph
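
The timing behaviour discussed above can be modelled in a few lines.  This
is a rough sketch under assumptions, not the delayer script itself: the
struct delay_queue type and the functions line_arrives and input_closed
are invented for illustration, times are plain seconds, and the model
assumes that a spooled line is released on the first arrival at which it
has waited at least the delay.  It shows why the delay is a minimum (output
only happens when a new line arrives) and why closing the input empties
the queue at once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define QUEUE_MAX 16 /* arbitrary capacity for this sketch */

struct delay_queue {
    long arrived[QUEUE_MAX]; /* arrival time of each spooled line, seconds */
    size_t count;
};

/* Drop the first n entries, keeping the rest in arrival order. */
static void
drop_front(struct delay_queue *q, size_t n)
{
    memmove(q->arrived, q->arrived + n,
            (q->count - n) * sizeof(q->arrived[0]));
    q->count -= n;
}

/* A new line arrives at time now: release every spooled line that has
   waited at least delay seconds, then spool the new line.  Returns the
   number of lines released. */
static size_t
line_arrives(struct delay_queue *q, long now, long delay)
{
    size_t due = 0;

    while (due < q->count && now - q->arrived[due] >= delay)
        due++;
    drop_front(q, due);
    assert(q->count < QUEUE_MAX);
    q->arrived[q->count++] = now;
    return due;
}

/* The input is closed: the whole queue is released right away. */
static size_t
input_closed(struct delay_queue *q)
{
    size_t released = q->count;

    q->count = 0;
    return released;
}
```

In this model, with a delay of 60 seconds, lines arriving at t=0 and t=30
are both still spooled at t=30; if the next line only arrives at t=500,
they are released then, having waited far longer than the configured
delay.  The line spooled at t=500 stays queued until another arrival or
until the input is closed.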


------------------------------

Message: 3
Date: Wed, 17 Jan 2024 12:49:50 +0100
From: Julien ÉLIE <[email protected]>
To: [email protected]
Subject: Re: [PATCH 0/2] More criteria for storage.conf
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8; format=flowed

Hi Christoph,

> These are two patches that enhance the criteria list for the storage
> decision of an individual article.

Thanks for these!


> Please check, I'm not sure whether the additional parameter to ARTstore
> is really necessary.

I'll have a look.


> I'm aware the root cause for implementing this will be gone
> in a few weeks. Still I'd like to have it in case something similar 
> happens again.

I totally agree this patch is worth integrating into INN.
The "filtered" option reminded me of something.  You already proposed it
two decades ago :) so it proves to be re-usable for different waves of spam!
     https://github.com/InterNetNews/inn/issues/38

It's now time to integrate it.  Thanks for the updated patch against the 
current version!


> This path is not shown in the cnfsstat output yet, I felt cnfsstat
> could use a rewrite in the output formatting. Still I could do a
> quick-and-dirty solution for the moment, just say the word.

You can do whatever you feel appropriate.


> The patches were developed against the 2.7.1 release, and in use for
> several weeks.

Not years? :)
Did you stop using your previous patch?

-- 
Julien ÉLIE

« Le temps change toute chose ; il n'y a aucune raison pour que la
   langue échappe à cette loi universelle. » (Saussure)


------------------------------


End of inn-workers Digest, Vol 157, Issue 10
********************************************
