DLSauers posted on Fri, 16 Sep 2016 14:18:38 +0000 as excerpted:

> In one particular group I haunt the alot of cruft gets crossposted in
> for non related topics...
>
> I heavily filter this group, but could probably gut down on adding
> filters daily and/or the existing ones if I could just get pan to filter
> out things.
>
> What I am after is something along...
>
> Lets say the group is x
>
> If the post has MORE than group X and contains *.politics.* etc... mark
> it -99999999999999999999999999999999999999999999999999999 or what ever..
>
> None of the options for scoring rules seems to allow this or work, the
> only way to filter this stuff is set up a lot of rules like
>
> contains Hillary contains Trump contains gay contains ......
>
> Just being able to if post is xposted to more than 1 group ie X mark it
> -9999 would nuke a lot of stuff....
>
> Or is pan not able to setup such advanced scoring filters via the GUI
> and/
> or otherwise????
>
> This group is rather problematic, and always has been.. It has the
> biggest fitlering/killfile and well the only filtering and killfil I've
> used on Usenet in 30+ years!
>
> Any hints on getting more advanced filtering done???
First the general stuff. You didn't say whether you already knew this. If you're a list regular you might, as I've posted it here many times over the years, tho otherwise you likely won't, unless you've used the other clients and compared their scorefiles with pan's.

Pan's scorefile format is in general a less advanced implementation of SLRN's scorefile format (without the fancy stuff such as includes...):

http://slrn.sourceforge.net/docs/score.txt

... but with the case insensitivity (but not the other changes) of xnews (my link for that one is dead, but slrn is primary, so it's not worth trying to google or otherwise resurrect the xnews one).

Here's the abridged version of the format description I keep as comments in my own scorefile:

% [newsgroup.*] wildcard (not regex) format (~ negates).
% header lines regex. (~ negates).
% Score conditions, single : and, double :: or.
% Expires: immed. below score if present.
% Leading % indicates comment
% Leading whitespace and blank lines ignored.
% Regex and newsgroup matches case insensitive with
% keyword:, sensitive with keyword=.
% Newsgroup change delimits section,
% Score delimits "rule", multiple rules per section allowed.
% Comment after score becomes rule "name".
% Score levels: <=-9999 kill, -9998 to -1 low,
% 0, 1 - 4999 med, 5000 - 9998 high, >=9999 watch

** EXCEPT: Unfortunately the last time I investigated, pan's scoring had a bug, and would **NOT** do logical AND -- the single : was treated as OR (::) regardless.

Fortunately, most of my scoring (and I guess pretty much anyone else's) is single-shot OR logic anyway, so that's not as big a deal as it would be if OR were broken instead of AND. But it /does/ rather kill a direct implementation of your AND test above... if the bug still exists, which I suppose it does, tho I haven't tested recently.
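To make the difference between the two modes concrete, here is a minimal Python sketch. It is only a model of the behaviour described above, not pan's actual code: the `rule_matches` helper, the sample article, and the group names are all made up for illustration.

```python
import re

def rule_matches(headers, conditions, require_all):
    """Return True if the rule fires for this article.

    headers:     dict of header name -> value (a hypothetical article)
    conditions:  list of (header name, regex) pairs
    require_all: True models "Score:" (AND), False models "Score::" (OR)
    """
    results = [
        re.search(pattern, headers.get(name, ""), re.IGNORECASE) is not None
        for name, pattern in conditions
    ]
    return all(results) if require_all else any(results)

# Hypothetical article: crossposted, but not to a politics group.
article = {
    "Newsgroups": "rec.example.x,rec.example.y",
    "Subject": "Weekly FAQ",
}

# Intended rule: crossposted (a comma in Newsgroups) AND a politics group.
conditions = [
    ("Newsgroups", r","),
    ("Newsgroups", r"\.politics\."),
]

intended = rule_matches(article, conditions, require_all=True)   # AND
reported = rule_matches(article, conditions, require_all=False)  # OR
print(intended, reported)  # False True
```

Under AND the rule leaves this article alone, since only one condition matches. Under the OR behaviour reported above, the same rule fires on any crosspost at all.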
However, it's /somewhat/ possible to work around that limitation by judicious use of additive scoring. As an example, use two rules that each add -5000, so they combine to -10000 and trigger the kill level.

(Tho if you have other rules that add say 100 and a message triggers them as well, it'll end up at -9900 and not trigger kill. But that's a good thing, as it makes it far more flexible. Make the two -9998 each, so each one /almost/ kills, and any trivial +100s won't undo the kill of both combined, if you want that. Or make them both -4950 if you want a trivial -100 to be necessary as well to kill, or...)

The other thing that should stick out as pretty important from the format description above, once you understand that a leading % indicates a comment and you look at the rules pan's GUI creates, is that:

** Most of the lines pan adds to the scorefile are simply extra explanatory comments -- they don't actually affect the rules at all, and deleting many of them can massively shrink your scorefile without affecting the actual scoring logic.

Finally, if you've been using pan's GUI to create most of your scores and haven't edited or have only lightly edited the scorefile itself, and you do a LOT of scoring, you should be able to *greatly* optimize things with some rather more active manual scorefile formatting and editing.

For instance, a short excerpt from the alt.* spam-kill section of my own scorefile:

Warning, adult themed example!
%#####################################################################
%#####################################################################
[alt.*]
Score:: =-9999 %Alt kill
From: Seeking teens
From: teens seeker
From: ^LoLiTa <
From: ^GOBLIN <
From: sex coed
From: NudeGirls
From: voyeur only
From: amateur
From: SEXmag
From: teens
From: intermixed
From: rectal
Subject: adult movies
Subject: dupped
Subject: ^\([-0-9/]*\)
Subject: Use critical pack from Microsoft Corporation
Subject: R/-\\PE
Subject: R/-\|PE
Subject: Horny mom
Subject: rectal exam
Subject: body cavity
Subject: mature women
Subject: candid voyeur

Just imagine how many lines that would take if they were each individually added as separate rules, complete with multiple comment lines each, by pan's GUI. Here, they're both easy for a human to read and far easier and more efficient for pan to parse.

The down side to this level of scorefile editing, of course, is that in order to maintain it, you pretty much have to either add new entries manually, or pretty regularly go in and reoptimize all the entries you've added via the pan GUI since the last time you cleaned up. The up side is of course that once you have it cleaned up, it's dead easy to manually add an additional single-line entry.

Meanwhile, a few hints:

* Set a pan hotkey for the articles menu's "edit article's watch/ignore score" function. From there you can hit the close and rescore button, to rescore based on any manual edits you just made to the scorefile. That's the easiest way I've come up with to get pan to reapply freshly manually edited scores.

* Use %#### or similar comment lines to visually separate sections, as I did in the example above.

* Consider whether you want an expiring or permanent score. Permanent scores can be easily added manually to the nicely edited groups. Expiring scores are tougher to group, since the expires line will differ for each, so adding these via the pan GUI works well enough.
* Consider adding a %### separator line or two at the bottom of your permanent scores, so pan can append the expiring scores you add via the GUI, and it's easier to go in and clean up later since you know where the new ones start. Speaking of which...

* Pan doesn't clean up expired scores on its own. You'll have to go thru and weed them out once in a while. (After doing so a few times, you may find yourself not adding so many expiring scores, choosing instead to either add a permanent one or simply skip it, so you don't have to clean up the expired score later. But if you're like me you'll still add a few, for people irritating enough to want to score down temporarily, but who you think might still learn some maturity, in say a year or so, so you don't want to make it permanent just yet.)

* For expiring scores, I've found it helpful to keep pan's "created by Pan on <date>" comments. That way I know not only when the score expires but also when it was created, and thus have some idea of how irritated I was when I created the entry, based on how long I set it to last before expiring.

*** Pan can score based on any header, not just the ones the GUI allows you to score. However, headers that aren't in the overviews as sent from the server won't apply until the message is actually downloaded to cache. That makes them much less efficient, since you won't see the effect until the message is already downloaded and in cache. That's a limitation of the protocol (and overviews) that pan can't do anything about, but sometimes, having to download a message before it can be killed is still better than having to actually read it.
*** The above should let you manually add scores based on either the Newsgroups header (as opposed to the newsgroup you're actually in at the time, the [*] section head specifier), or the Xref header. Both contain the list of cross-posted groups. The Xref header lists only the groups carried on that server, along with the message number for the message in each of those groups. The Newsgroups header lists all the groups the message was posted to, regardless of whether that server carries them or not. However, I'm not sure whether these rules will apply before or after download, due to the above mentioned overviews issue.

Those last two hints should allow you to score based on crossposting to N+ groups, provided you know enough about the crossposted group names in advance to create a score for them. Alternatively, scoring on Xref and counting the number of colons should allow you to score on a message posted to N+ groups regardless of name, provided the server carries that many of the groups and thus crossposts the message to them.

But again, without actually testing it, I don't know for sure whether such scores could be applied before download, with only the overviews information available, or only after download. Either way, it should be possible, but one will obviously be far more convenient than the other.

And again, as I said above, tho I believe the AND logic bug will prevent combining an N-newsgroups test and a subject line filter into one rule requiring both, you should be able to approximate the same thing by using multiple scoring rules and adjusting the scores applied by each.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

_______________________________________________
Pan-users mailing list
https://lists.nongnu.org/mailman/listinfo/pan-users
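As a rough illustration of the crosspost counting and the additive-score approximation described in the reply above, here is a small Python sketch. It models the logic only and is not pan's code. The helper names, the sample header values, and the -5000 scores are assumptions chosen for the example, and whether pan sees the Newsgroups or Xref value before download is still the open question noted in the reply.

```python
import re

KILL_THRESHOLD = -9999  # scores at or below this are killed, per the score levels

def crosspost_count(newsgroups_header):
    """Count the groups in a comma-separated Newsgroups header value."""
    return len([g for g in newsgroups_header.split(",") if g.strip()])

def xref_count(xref_header):
    """Count group:number pairs in an Xref value by counting colons."""
    return xref_header.count(":")

def total_score(headers, rules):
    """Add up the score of every rule whose regex matches its header."""
    total = 0
    for name, pattern, score in rules:
        if re.search(pattern, headers.get(name, ""), re.IGNORECASE):
            total += score
    return total

# Two single-condition rules standing in for one AND rule.
rules = [
    ("Newsgroups", r",", -5000),             # crossposted to 2+ groups
    ("Newsgroups", r"\.politics\.", -5000),  # one of them is a politics group
]

both = {"Newsgroups": "rec.example.x,alt.politics.example"}
only_crossposted = {"Newsgroups": "rec.example.x,rec.example.y"}

print(total_score(both, rules) <= KILL_THRESHOLD)              # True
print(total_score(only_crossposted, rules) <= KILL_THRESHOLD)  # False
```

Only the article matching both rules reaches the kill level. The same regexes could be tried as header lines in the scorefile itself, and for the Xref variant a pattern such as `:.*:` matches any value containing at least two colons.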