Send inn-workers mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        https://lists.isc.org/mailman/listinfo/inn-workers
or, via email, send a message with subject or body 'help' to
        [email protected]

You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of inn-workers digest..."


Today's Topics:

   1. Re: [PATCH 0/4] Improve delayer (Julien ÉLIE)
   2. Re: tradspool tooken weirdness (Julien ÉLIE)
   3. [PATCH 0/2] More criteria for storage.conf (Christoph Biedl)
   4. [PATCH 1/2] storage: Add "filtered" option, to store based on
      the filter's decision (Christoph Biedl)


----------------------------------------------------------------------

Message: 1
Date: Tue, 16 Jan 2024 19:43:07 +0100
From: Julien ÉLIE <[email protected]>
To: [email protected]
Subject: Re: [PATCH 0/4] Improve delayer
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8; format=flowed

Hi Christoph,

> a while ago I re-discovered the delayer as a way to increase efficiency
> of NoCeM notice handling: By deliberately delaying articles from the
> well-known spam site, the NoCeMs will arrive first, blocking the spam's
> Message-IDs. As a result, users will not see the spam at all.
> 
> However, I found this little program could need some love, so there's
> some code cleanup, documentation, and a bugfix. And new features, most
> notably persistence for the lines not sent yet.

Thanks for your 4 patches.  I've quickly had a look, and they look fine 
to me.  I'll commit them within a few days.

Now that the delayer script has proper documentation, has been thoroughly 
tested, and has gained a few useful new flags, I suggest putting it into 
the backends directory and officially installing it.
Any objection to that?



> Also, as I'm not
> a native English speaker, my wording might be weird or misleading.

Neither am I :)



> +This program implements a delaying pipe. Lines sent to the process
> +are spooled and printed after the given delay time has passed.
If I understand correctly how the script works, all the spooled lines are 
printed at the same time once the given delay time has passed.  This means 
that articles are in fact delayed by a time between 0 and the next 
arrival of an article after the delay time.
I can explicitly add that to the documentation if you confirm it.

I'm wondering whether we should implement a minimum delay that each line 
must wait before being sent, to make the script even more efficient.
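
One way to picture the minimum-delay idea is the sketch below.  It is 
written in C purely for illustration (the delayer itself is a Perl 
script), and the names queued_line and flush_due are hypothetical, not 
taken from the patches.  It assumes each spooled line keeps its own 
arrival time, so that only lines that have waited long enough are 
released.

```c
#include <stddef.h>
#include <time.h>

/* Hypothetical queue entry: a spooled line and its arrival time. */
struct queued_line {
    const char *text;
    time_t arrived;
};

/* Return how many lines at the head of the queue (oldest first) have
   waited at least min_delay seconds at time now and may be printed.
   Lines behind the first one that is still too young stay spooled. */
static size_t
flush_due(const struct queued_line *q, size_t n, time_t now,
          time_t min_delay)
{
    size_t i = 0;

    while (i < n && now - q[i].arrived >= min_delay)
        i++;
    return i;
}
```

With such a check, a line that arrived just before a flush would stay 
in the queue instead of being printed along with the older ones.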

-- 
Julien ÉLIE

« – I see the world didn't end yesterday.
   – Are you sure? » (Alan Moore, _Watchmen_)


------------------------------

Message: 2
Date: Tue, 16 Jan 2024 20:45:40 +0100
From: Julien ÉLIE <[email protected]>
To: [email protected]
Subject: Re: tradspool tooken weirdness
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8; format=flowed

Hi Christoph,
> Heh, was thinking about 64bit article numbers as well ...
> 
> Major concern is "Is this really needed?". The highest article number I
> have here has seven digits (in control.cancel, no surprise), and the
> last number reset was at least 25 years ago. So hitting the 32 bit limit
> is not going to happen any time soon.

Some users speak about it from time to time.  Binary newsgroups are 
certainly the most likely ones to hit the current 2^31 limit first.



> Related, has anybody already tested whether clients can handle >32bit
> article numbers? I fear the worst ...

Last time I tested it, Thunderbird did not handle that.  It just hung 
upon receiving a large article number.
Some news readers like tin, slrn and flnews are reported to handle them.

FWIW, the idea would be not to return large article numbers unless the 
news client confirms it can handle them.  We discussed a MAXARTNUM 
capability to achieve that in December 2022 in news.software.nntp:

   <[email protected]>
   https://groups.google.com/g/news.software.nntp/c/4_KjHu9GlBg

I still haven't taken the time to properly submit an Internet-Draft, but 
it is on my to-do list.  I can re-prioritize it if need be.
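
The gating idea can be sketched roughly as follows.  This is only an 
illustration in C: the function name, the client_max parameter and the 
way a limit would be negotiated are assumptions, not the MAXARTNUM 
proposal itself, whose details are in the thread referenced above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Largest article number a legacy client is assumed to handle
   (2^31 - 1). */
#define LEGACY_MAX_ARTNUM UINT64_C(2147483647)

/* Return true when artnum may be shown to the client.  client_max is
   the (hypothetical) limit the client announced, or 0 when the client
   announced nothing and must be treated as a legacy client. */
static bool
artnum_visible(uint64_t artnum, uint64_t client_max)
{
    uint64_t limit = client_max > 0 ? client_max : LEGACY_MAX_ARTNUM;

    return artnum <= limit;
}
```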



> On the other hand, there's the main lesson from y2k and upcoming
> y2038: Software will be used for way longer times and at way bigger
> scale than initially anticipated - therefore adapt early.

As you're speaking about y2038, there's an open bug to support the 
upcoming 64-bit time_t transition for 32-bit archs that some 
distributions plan to do:
   https://github.com/InterNetNews/inn/issues/292
   https://wiki.debian.org/ReleaseGoals/64bit-time

The CAF file header should be converted to use uint64_t instead of 
time_t (as well as size_t and off_t).  Otherwise, 32-bit systems upgraded 
to 64-bit time_t won't be able to retrieve old articles from a timecaf 
news spool.
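
To illustrate why fixed-width types help, here is a minimal sketch in C.  
The two structs and their field names are hypothetical and are not the 
real CAF header; they only show that a layout built on time_t and size_t 
changes with the platform, whereas one built on uint64_t does not.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical on-disk header using platform-dependent types: its
   size and layout change when time_t grows from 32 to 64 bits. */
struct header_native {
    time_t last_cleaned;
    size_t free_blocks;
};

/* Hypothetical fixed-width layout: the same size on every build. */
struct header_fixed {
    uint64_t last_cleaned;
    uint64_t free_blocks;
};

/* Convert a timestamp to and from its fixed-width on-disk form. */
static uint64_t
time_to_disk(time_t t)
{
    return (uint64_t) t;
}

static time_t
time_from_disk(uint64_t v)
{
    return (time_t) v;
}
```

The sketch leaves out byte order and the migration of existing files, 
both of which a real conversion would have to address.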

-- 
Julien ÉLIE

« L'amour, c'est comme les maths : ça commence par un Bézout et ça finit
   par un Gauß. »


------------------------------

Message: 3
Date: Tue, 16 Jan 2024 20:00:00 +0100
From: Christoph Biedl <[email protected]>
To: [email protected]
Subject: [PATCH 0/2] More criteria for storage.conf
Message-ID: <[email protected]>

Hello,

These are two patches that enhance the criteria list for the storage
decision of an individual article.

The actual documentation is part of the patches, however I'd like to
provide some rationale here:


The "filtered" option arose when I wanted to debug my Perl filter, 
which requires access to filtered articles, so "dontrejectfiltered" 
was set to true. In times of heavy spam, however, this affects the 
retention of all articles. This option therefore allows filing such 
filtered articles differently, for example in a small CNFS buffer or 
anything else with a rather tight expiration policy.
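
A reduced sketch of how the new criterion takes part in matching, in C.  
The struct and function names are hypothetical stand-ins; only the 
equality comparison mirrors the condition that patch 1/2 adds to 
SMgetsub().

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for a storage.conf entry. */
struct entry {
    bool filtered; /* value of the "filtered" key, false by default */
};

/* Hypothetical, reduced stand-in for an incoming article. */
struct article {
    bool filtered; /* true when the filter marked the article */
};

/* Mirror the comparison used in the patch: the two flags must be
   equal for the entry to be a candidate for the article. */
static bool
entry_matches(const struct entry *e, const struct article *a)
{
    return e->filtered == a->filtered;
}
```

Because the comparison is an equality test, an entry without 
"filtered: true" would not match a filtered article, so at least one 
entry with the key set is presumably needed once "dontrejectfiltered" 
is enabled.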

Please check, I'm not sure whether the additional parameter to ARTstore 
is really necessary.


The "path" option was motivated by a lot of spam from a particular 
site. While NoCeMs kill it after a few minutes, the area in a CNFS 
buffer is still allocated, and retention suffers. By using this 
option, articles from a certain spammy site can be stored separately 
so they will only harm each other.

About the path option, I actually switched to tradspool with 30 days 
expiration for that spammy site. That results in a few thousand 
articles only - and in the tradspool token weirdness reported 
earlier :)

This path is not shown in the cnfsstat output yet; I felt cnfsstat 
could use a rewrite of its output formatting. Still, I could do a 
quick-and-dirty solution for the moment, just say the word.

Finally, I'm aware the root cause for implementing this will be gone in 
a few weeks. Still I'd like to have it in case something similar 
happens again.


The patches were developed against the 2.7.1 release and have been in 
use for several weeks. For this submission, they were forward-ported; 
the build and the tests passed.

Christoph Biedl (2):
  storage: Add "filtered" option, to store based on the filter's
    decision
  storage: Add "path" option, to store based on the path the article
    went

 doc/pod/storage.conf.pod | 23 ++++++++++++++++++++
 frontends/cnfsstat.in    | 35 +++++++++++++++++++-----------
 include/inn/storage.h    |  3 +++
 innd/art.c               |  7 ++++--
 storage/interface.c      | 46 ++++++++++++++++++++++++++++++++++++++++
 storage/interface.h      |  6 ++++++
 6 files changed, 106 insertions(+), 14 deletions(-)

-- 
2.39.2


------------------------------

Message: 4
Date: Tue, 16 Jan 2024 20:00:00 +0100
From: Christoph Biedl <[email protected]>
To: [email protected]
Subject: [PATCH 1/2] storage: Add "filtered" option, to store based on
        the filter's decision
Message-ID: <[email protected]>

---
 doc/pod/storage.conf.pod |  8 ++++++++
 frontends/cnfsstat.in    | 35 +++++++++++++++++++++++------------
 include/inn/storage.h    |  1 +
 innd/art.c               |  5 +++--
 storage/interface.c      | 11 +++++++++++
 storage/interface.h      |  1 +
 6 files changed, 47 insertions(+), 14 deletions(-)

diff --git a/doc/pod/storage.conf.pod b/doc/pod/storage.conf.pod
index 852bf589f..52c8cc2c8 100644
--- a/doc/pod/storage.conf.pod
+++ b/doc/pod/storage.conf.pod
@@ -137,6 +137,14 @@ wildmat as described above.)  This is a boolean value; C<true>, C<yes>
 and C<on> are usable to enable this key.  The case of these values is
 not significant.  The default is false.
 
+=item I<filtered>: <bool>
+
+If this key is set to true, the article must have been rejected by the
+Perl/Python filter.  This also requires that dontrejectfiltered is set
+to true in F<inn.conf>.  This is a boolean value; C<true>, C<yes> and
+C<on> are usable to enable this key.  The case of these values is not
+significant.  The default is false.
+
 =back
 
 If an article matches all of the constraints of an entry, it is stored
diff --git a/frontends/cnfsstat.in b/frontends/cnfsstat.in
index 667c24ccb..953f0c266 100644
--- a/frontends/cnfsstat.in
+++ b/frontends/cnfsstat.in
@@ -159,28 +159,29 @@ if ($lastconftime < $maxtime) {
 
 my $logline;
 my $header_printed = 0;
-my ($gr, $cl, $min, $max);
+my ($gr, $cl, $min, $max, $filtered);
 if ($oclass) {
     if ($class{$oclass}) {
         if (!$header_printed) {
             if ($stor{$oclass}) {
-                ($gr, $cl, $min, $max) = split(/:/, $stor{$oclass});
+                ($gr, $cl, $min, $max, undef, $filtered) = split(/:/, $stor{$oclass});
             } else {
-                ($gr, $cl, $min, $max) = ('', $oclass, 0, 0);
+                ($gr, $cl, $min, $max, $filtered) = ('', $oclass, 0, 0, 0);
             }
             # Remove leading and trailing double quotes, if present.
+            my $filtered_s = $filtered ? ", filtered only" : "";
             $gr =~ s/"?([^"]*)"?/$1/g;
             if ($use_syslog) {
                 if ($min || $max) {
                     $logline = sprintf(
                         "Class %s for groups matching \"%s\" "
-                          . "article size min/max: %d/%d",
-                        $oclass, $gr, $min, $max,
+                          . "article size min/max: %d/%d%s",
+                        $oclass, $gr, $min, $max, $filtered_s,
                     );
                 } else {
                     $logline = sprintf(
-                        "Class %s for groups matching \"%s\"",
-                        $oclass, $gr,
+                        "Class %s for groups matching \"%s\"%s",
+                        $oclass, $gr, $filtered_s,
                     );
                 }
             } else {
@@ -189,6 +190,9 @@ if ($oclass) {
                 if ($min || $max) {
                     print STDOUT ", article size min/max: $min/$max";
                 }
+                if ($filtered) {
+                    print STDOUT ", filtered articles only";
+                }
                 print STDOUT "\n";
             }
             $header_printed = 1;
@@ -213,19 +217,20 @@ if ($oclass) {
 } else {    # Print all Classes
     my %buffDone;
     foreach my $c (@storsort) {
-        ($gr, $cl, $min, $max) = split(/:/, $stor{$c});
+        ($gr, $cl, $min, $max, undef, $filtered) = split(/:/, $stor{$c});
+        my $filtered_s = $filtered ? ", filtered only" : "";
         # Remove leading and trailing double quotes, if present.
         $gr =~ s/"?([^"]*)"?/$1/g;
         if ($use_syslog) {
             if ($min || $max) {
                 $logline = sprintf(
                     "Class %s for groups matching \"%s\" "
-                      . "article size min/max: %d/%d",
-                    $c, $gr, $min, $max,
+                      . "article size min/max: %d/%d%s",
+                    $c, $gr, $min, $max, $filtered_s,
                 );
             } else {
                 $logline
-                  = sprintf("Class %s for groups matching \"%s\"", $c, $gr);
+                  = sprintf("Class %s for groups matching \"%s\"%s", $c, $gr, $filtered_s);
             }
         } else {
             print STDOUT "Class $c";
@@ -233,6 +238,9 @@ if ($oclass) {
             if ($min || $max) {
                 print STDOUT ", article size min/max: $min/$max";
             }
+            if ($filtered) {
+                print STDOUT ", filtered articles only";
+            }
             print STDOUT "\n";
         }
         @buffers = split(/,/, $class{$c});
@@ -357,10 +365,13 @@ sub read_storageconf {
 
             $key{'SIZE'} .= ",0" unless $key{'SIZE'} =~ /,/;
             $key{'SIZE'} =~ s/,/:/;
+            $key{'FILTERED'} =
+                defined $key{'FILTERED'} ?
+                $key{'FILTERED'} =~ /^(true|yes|on)$/i ? 1 : 0 : 0;
 
             if (!defined $stor{ $key{'OPTIONS'} }) {
                 $stor{ $key{'OPTIONS'} } = "$key{'NEWSGROUPS'}:$key{'CLASS'}:"
-                  . "$key{'SIZE'}:$key{'OPTIONS'}";
+                  . "$key{'SIZE'}:$key{'OPTIONS'}:$key{'FILTERED'}";
                 push(@storsort, $key{'OPTIONS'});
             }
         }
diff --git a/include/inn/storage.h b/include/inn/storage.h
index 56230792a..2cdb1625c 100644
--- a/include/inn/storage.h
+++ b/include/inn/storage.h
@@ -50,6 +50,7 @@ typedef struct {
     void *private;            /* A pointer to method specific data */
     time_t arrived;           /* The time when the article arrived */
     time_t expires;           /* The time when the article will be expired */
+    bool filtered;            /* Article was marked by the filter */
     char *groups;             /* Where Newsgroups header field body starts */
     int groupslen;            /* Length of Newsgroups header field body */
     TOKEN *token;             /* A pointer to the article's TOKEN */
diff --git a/innd/art.c b/innd/art.c
index be51e7bf1..0ef1ac06b 100644
--- a/innd/art.c
+++ b/innd/art.c
@@ -435,7 +435,7 @@ ARTheaderpcmp(const void *p1, const void *p2)
 /* Write an article using the storage api.  Put it together in memory and
    call out to the api. */
 static TOKEN
-ARTstore(CHANNEL *cp)
+ARTstore(CHANNEL *cp, bool filtered)
 {
     struct buffer *Article = &cp->In;
     ARTDATA *data = &cp->Data;
@@ -557,6 +557,7 @@ ARTstore(CHANNEL *cp)
     arth.arrived = (time_t) 0;
     arth.token = (TOKEN *) NULL;
     arth.expires = data->Expires;
+    arth.filtered = filtered;
     if (innconf->storeonxref) {
         arth.groups = data->Replic;
         arth.groupslen = data->ReplicLength;
@@ -2576,7 +2577,7 @@ ARTpost(CHANNEL *cp)
     for (i = 0; (ngp = GroupPointers[i]) != NULL; i++)
         ngp->PostCount = 0;
 
-    token = ARTstore(cp);
+    token = ARTstore(cp, Filtered);
     /* Change trailing '\r\n' to '\0\n' of all system header fields. */
     for (i = 0; i < MAX_ARTHEADER; i++) {
         if (HDR_FOUND(i)) {
diff --git a/storage/interface.c b/storage/interface.c
index 81e7a25b4..34c0ebaec 100644
--- a/storage/interface.c
+++ b/storage/interface.c
@@ -261,6 +261,7 @@ ParseTime(char *tmbuf)
 #define SMexpire     14
 #define SMoptions    15
 #define SMexactmatch 16
+#define SMfiltered   17
 
 static CONFTOKEN smtoks[] = {
     {SMlbrace,     (char *) "{"          },
@@ -272,6 +273,7 @@ static CONFTOKEN smtoks[] = {
     {SMexpire,     (char *) "expires:"   },
     {SMoptions,    (char *) "options:"   },
     {SMexactmatch, (char *) "exactmatch:"},
+    {SMfiltered,   (char *) "filtered:"  },
     {0,            NULL                  }
 };
 
@@ -298,6 +300,7 @@ SMreadconfig(void)
     char *options = 0;
     int inbrace;
     bool exactmatch = false;
+    bool filtered = false;
 
     /* if innconf isn't already read in, do so. */
     if (innconf == NULL) {
@@ -351,6 +354,7 @@ SMreadconfig(void)
             minexpire = 0;
             maxexpire = 0;
             exactmatch = false;
+            filtered = false;
 
         } else {
             type = tok->type;
@@ -403,6 +407,11 @@ SMreadconfig(void)
                         || strcasecmp(p, "on") == 0)
                         exactmatch = true;
                     break;
+                case SMfiltered:
+                    if (strcasecmp(p, "true") == 0 || strcasecmp(p, "yes") == 0
+                        || strcasecmp(p, "on") == 0)
+                        filtered = true;
+                    break;
                 default:
                     SMseterror(SMERR_CONFIG,
                                "Unknown keyword in method declaration");
@@ -448,6 +457,7 @@ SMreadconfig(void)
             sub->minexpire = minexpire;
             sub->maxexpire = maxexpire;
             sub->exactmatch = exactmatch;
+            sub->filtered = filtered;
 
             free(method);
             method = 0;
@@ -643,6 +653,7 @@ SMgetsub(const ARTHANDLE article)
             && (!sub->maxsize || (article.len <= sub->maxsize))
             && (!sub->minexpire || article.expires >= sub->minexpire)
             && (!sub->maxexpire || (article.expires <= sub->maxexpire))
+            && (sub->filtered == article.filtered)
             && MatchGroups(article.groups, article.groupslen, sub->pattern,
                            sub->exactmatch)) {
             if (InitMethod(typetoindex[sub->type]))
diff --git a/storage/interface.h b/storage/interface.h
index b3cc35ca6..c67d1eb1e 100644
--- a/storage/interface.h
+++ b/storage/interface.h
@@ -45,6 +45,7 @@ typedef struct __S_SUB__ {
                          method */
     bool exactmatch;  /* all newsgroups to which article belongs
                          should match the patterns */
+    bool filtered;    /* article was marked by the filter */
     struct __S_SUB__ *next;
 } STORAGE_SUB;
 
-- 
2.39.2



------------------------------

End of inn-workers Digest, Vol 157, Issue 9
*******************************************
