On Mon, 25 Jun 2007 16:26:41 +0000, Duncan wrote:

> JCA <[EMAIL PROTECTED]> posted a10b0c8a0706250650x7e27a0scd042a801b659f6d-JsoAwUIsXosN [EMAIL PROTECTED], excerpted below, on Mon, 25 Jun 2007 06:50:26 -0700:
>
>> For the last few weeks some idiot has taken to flooding sci.crypt (and possibly other groups) with junk. The postings are spoofed to appear as coming from regulars in the group, and the contents of the postings are just random drivel.
>>
>> Anybody know a rule, or set of rules, to filter them out? It would appear that the bogus postings all come from a specific news provider - things like
>>
>> news.highwinds-media.com!hw-filter.lga!newsfe04.lga.POSTED!53ab2750
>>
>> but I don't know how to filter this out.
>
> <mode=rant>
>
> This is one reason I've pushed for a long time to have scoring/filtering (since before pan had scoring, when it was all binary decision filtering, that's how long) that could match anywhere in the post, in the body, or in headers not in the overviews. The problem is, the stuff in the overviews can generally be entirely controlled by the poster, so if they want to be deliberately disruptive and therefore deliberately and continuously modify this info, in order to evade scoring systems like pan's, unfortunately, there's not a lot that the poor users of such clients can do.
>
> The problem is, in order to score/filter on things not in the overviews, the post must be downloaded first. For better or for worse, Charles' position has always seemed to emphasize scoring in order to choose /what/ to download (and/or what to delete without downloading), simply trusting that the overview data used to make such decisions isn't going to be deliberately obfuscated, in order to prevent such scoring/filters from working.
>
> My position, OTOH, is that while it's a bonus if a useful score can be used to ignore (ultimately, to kill/delete) or watch (ultimately, to auto-download or at least mark for download) before downloading, just because the post must be downloaded first doesn't mean the war is already lost. It still takes time to view the message, and if automated tools (scoring/filtering) can be used to either prioritize the viewing (in the case of watch or positive scores), or to allow mark-read or deletion without actual viewing (in the case of ignore or negative scores), well, the war is still won, tho admittedly not as easily.
>
> Unfortunately, while I'd much rather have effective filtering based on /anything/ in the message than scoring still restricted to overview data only, and while I've been a very active volunteer here on the pan lists/groups, it seems your problem and mine don't hit enough people to be very high on the priority list.
>
> Back years ago, when I originally filed the request, Charles stated that yes, he agreed that sort of thing would be useful. However, it was for him pretty much in the "nice to have at some point" category, and thus was "blueskied" (aka "backburnered") into never-never-land.
>
> BTW, even the official slrn scorefile documentation (slrn's scorefile format is what pan uses) says non-overview headers can be matched, tho it goes to pains to point out that it's less efficient since the posts must be downloaded before those scores will match.
>
> Of course, Charles has always been quite open to patches, and I've little doubt if someone with the skills had submitted a patch to implement this functionality, we'd not be talking about it now as it'd work as well as overview scoring does. Unfortunately, that's not a set of skills I have, and no one else has seemed to have the itch to scratch, so the functionality remains "bluesky", nice to have "someday".
>
> OTOH, the very fact that I'm still here means regardless of whether this particular feature I'd sure like has been instituted or not, pan continues to work better for me than the alternatives, so I guess I can't complain too strenuously.
>
> </mode=rant>
>
> Meanwhile, despite the fact that we're left fighting with the equivalent of our hands tied behind our backs, there's still a slight chance you can find something useful to match. I assume you've already found nothing useful to match in the subject or author headers, and date, group, line-count, xref, etc, are too generic to be useful.
>
> That leaves one remaining possibility, the message-ID. If you are lucky and this guy isn't an expert at this yet, the message-ID header, which *IS* part of the overview headers, will contain something identifying that can be scored on, hopefully without matching a bunch of other posts in the process.
>
> Message-ID is (or is supposed to be) unique for each post, so you'll have to use contains- or regex-type matching. You'll also have to hand-edit the score in your scorefile, altho you can get it most of the way there using pan's GUI. Of course, you first have to see if there's part of the message-ID that's uniquely his, but matches all his messages. Turn view headers on and check that header in several of his messages. You will likely want to compare those of other regulars as well, just to be sure you won't over-match. If you find something useful to match, select one of his messages and add a score on it, based on the References header, which pan will auto-fill-out with the message-ID. You'll need to edit out the part that changes, of course. Once you have it set up, add the score (without rescore), but keep open the view scores dialog. Then load the scorefile in your favorite text editor and find the score (should be at the end). Edit the References line, changing it to Message-ID.
> Save the file, and back in pan, NOW hit the close and rescore in the view article's score window. If you got it right, that should do it, and won't match anyone else's real posts.
>
> As I said tho, the good attackers won't overlook message-ID and will already set it themselves so that their provider doesn't, and you'll have no reliable way to score their posts. The best attackers won't just fake the message-ID, they'll make it look like the one the regular author they are faking uses, so matching it will unfortunately match the regular author's posts as well.
>
> BTW, that highwinds-media entry looks familiar. My ISP (Cox) outsources from them, so all Cox users get that stamp. If it's a Cox user, however, not some other non-Cox user of the same server, a number of other headers will show up as well, including an unencrypted NNTP-Posting-Host, an X-Complaints-To header listing [EMAIL PROTECTED], and an X-Trace header listing the same user IP as the NNTP-Posting-Host and the same server as the posted entry. If it doesn't have those elements, it's probably not a Cox user, anyway. Unfortunately, none of those headers normally appear in the overviews, so pan can't properly score against them. =8^(
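
The over-match check described in the quoted text (compare the spoofer's Message-IDs against those of the regulars before committing a score) can be tried outside pan first. What follows is only a minimal Python sketch, not anything pan or slrn provides: the helper name `check_pattern`, the sample Message-IDs, and the candidate pattern are all invented for illustration, and the real values would have to be copied from the headers of actual posts.

```python
import re

def check_pattern(pattern, spoofed_ids, regular_ids):
    """Try a candidate Message-ID regex against two sets of samples.

    Returns (missed, overmatched): spoofed IDs the pattern fails to
    catch, and regulars' IDs it would wrongly catch.
    """
    rx = re.compile(pattern)
    missed = [m for m in spoofed_ids if not rx.search(m)]
    overmatched = [m for m in regular_ids if rx.search(m)]
    return missed, overmatched

# Hypothetical Message-IDs, made up for this example only.
spoofed = [
    "<abc123@newsfe04.example>",
    "<zzz999@newsfe04.example>",
]
regular = [
    "<1182780000.12345@g4g2000hsf.example.com>",
    "<pan.2007.06.25@host.example.net>",
]

# Hypothetical candidate: the part of the ID that stays constant.
missed, overmatched = check_pattern(r"@newsfe04\.example>$", spoofed, regular)
print("missed:", missed)
print("over-matched:", overmatched)
```

If both lists come back empty for a reasonable number of samples, the pattern is a plausible candidate for the hand-edited Message-ID line. A non-empty `overmatched` list means the pattern is too broad and would also hit real posts.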

Unfortunately, I also lack the skills to do any patching or programming, but I am with you: I'd like to have the ability to score on more than the overview headers. That's one of the drawbacks of Agent and Thunderbird as well.

-- 
Frank Tabor
Just to have it is enough.
_______________________________________________
Pan-users mailing list
[email protected]
http://lists.nongnu.org/mailman/listinfo/pan-users
