On 09.03.2017 at 16:05, Axb wrote:
> On 03/09/2017 03:11 PM, mar...@mejor.pl wrote:
>> On 09.03.2017 at 15:05, mar...@mejor.pl wrote:
>>> On 09.03.2017 at 14:42, Axb wrote:
>>>> On 03/09/2017 02:31 PM, mar...@mejor.pl wrote:
>>>>> On 08.03.2017 at 17:30, Axb wrote:
>>>>>> On 03/08/2017 04:55 PM, mar...@mejor.pl wrote:
>>>>>>> On 08.03.2017 at 16:33, Axb wrote:
>>>>>>>> On 03/08/2017 04:16 PM, mar...@mejor.pl wrote:
>>>>>>>>> On 08.03.2017 at 16:06, Axb wrote:
>>>>>>>>>> On 03/08/2017 03:58 PM, mar...@mejor.pl wrote:
>>>>>>>>>>> On 08.03.2017 at 15:27, Axb wrote:
>>>>>>>>>>>> As your command below shows you're using --reqpatlength 0
>>>>>>>>>>>>
>>>>>>>>>>>> Start off with something sane, for example --reqpatlength 40
>>>>>>>>>>>>
>>>>>>>>>>>> you may also want to play with --maxtextread
>>>>>>>>>>>> ( I use --maxtextread 8192 for FRAUD rules)
>>>>>>>>>>>
>>>>>>>>>>> But with --reqpatlength 10, 40, 100 or 1000 I've got no hit.
>>>>>>>>>>> Reading help
>>>>>>>>>>> ( "--reqpatlength: required pattern length, in characters (default: 0)" )
>>>>>>>>>>> I understand that the pattern in a generated rule will be longer
>>>>>>>>>>> than reqpatlength (shorter strings will be ignored). Do I correctly
>>>>>>>>>>> assume how the parameter works?
>>>>>>>>>>
>>>>>>>>>> --reqpatlength 40 tells seekphrases to ignore any phrases
>>>>>>>>>> which are smaller than 40 chars
>>>>>>>>>>
>>>>>>>>>> just checked by line which is using
>>>>>>>>>> --reqpatlength 37
>>>>>>>>>
>>>>>>>>> Any value >0 means that no rule is generated.
>>>>>>>>>
>>>>>>>>>> body __AXB_FRAUD_LAF076 /It has come to our attention that you /
>>>>>>>>>> body __AXB_FRAUD_UPVTRT / in order to confirm your disbursement\./
>>>>>>>>>> body __AXB_FRAUD_NOFUX2 / approval, your funds will be deposited directly into your /
>>>>>>>>>> body __AXB_FRAUD_Z4ZZ7D / in order to accept your disbursement\./
>>>>>>>>>> body __AXB_FRAUD_CUXJ6X / approval, your funds will be direct deposited into your /
>>>>>>>>>> body __AXB_FRAUD_NHWXKL /: You Are Eligible to Receive Funds up to \$.,000\. /
>>>>>>>>>>
>>>>>>>>>> hard to guess what is not working on your side without full
>>>>>>>>>> insight
>>>>>>>>>
>>>>>>>>> What can I do to help more? Should I share all_w.h and all_w.s
>>>>>>>>> files?
>>>>>>>>
>>>>>>>> before we go that way pls answer these questions
>>>>>>>>
>>>>>>>> how many spams/hams are you processing?
>>>>>>>
>>>>>>> ham: ~1400
>>>>>>> spam: ~8200
>>>>>>>
>>>>>>>> do you have a file named assemble.state ? if yes, how large?
>>>>>>>
>>>>>>> Yes, I've got this file, it is ~9MB in size.
>>>>>>>
>>>>>>>> and pls zip & send me the full script you're using to generate the
>>>>>>>> rules, OFFLIST! do NOT post to list
>>>>>>>
>>>>>>> Ok, I'll choose tar.bz2 ;)
>>>>>>> Thanks for help.
>>>>>>
>>>>>> replying on list as much as I can so it's archived FTR
>>>>>>
>>>>>> first thing I see is that your logs do not contain a list of rules
>>>>>> which hit on each message.
>>>>>>
>>>>>> for example my "w.s" file has lines which look like:
>>>>>>
>>>>>> 53
>>>>>> /home/mc/Maildir/cur/1487823401.M695422P29583.ruler,S=7602,W=7747:2,
>>>>>> ADVANCE_FEE_2_NEW_MONEY,ADVANCE_FEE_3_NEW,ADVANCE_FEE_3_NEW_MONEY,ADVANCE_FEE_4_NEW,ADVANCE_FEE_4_NEW_MONEY,ADVANCE_FEE_5_NEW,ADVANCE_FEE_5_NEW_MONEY,AXB_XM2600,AXB_XMAILER_MIMEOLE_OL_024C2,CM_XRCVD_VOOZER4,DEAR_WINNER,FORGED_MUA_OUTLOOK,FROM_MISSPACED,FROM_MISSP_MSFT,FROM_MISSP_REPLYTO,FROM_MISSP_URI,FSL_419_FP1,FSL_CTYPE_WIN1251,FSL_MISSP_REPLYTO,FSL_NEW_HELO_USER,FSL_RCVD_USER,FSL_UA,FSL_XM_419,HK_NAME_MR_MRS,LOTS_OF_MONEY,LOTTO_DEPT,MONEY_FRAUD_3,MONEY_FRAUD_5,MONEY_FROM_MISSP,MSOE_MID_WRONG_CASE,NSL_RCVD_HELO_USER,TO_NO_BRKTS_FROM_MSSP,T_AXB_XM2600,T_BIG_HEADERS_5K,T_CM_XRCVD_VOOZER4,T_FSL_FREEMAIL_1,T_FSL_HELO_NON_FQDN_2,T_HK_MUCHMONEY,T_LOTTO_AGENT,T_SINGLE_HEADER_1K,T_TO_NO_BRKTS_MSFT,__419_FROM_SIG,__ADVANCE_FEE_2_NEW,__ADVANCE_FEE_2_NEW_MONEY,__ADVANCE_FEE_3_NEW,__ADVANCE_FEE_3_NEW_MONEY,__ADVANCE_FEE_4_NEW,__ADVANCE_FEE_4_NEW_MONEY,__ADVANCE_FEE_5_NEW,__ADVANCE_FEE_5_NEW_MONEY,__AFF_LOTTERY,__ANY_OUTLOOK_MUA,__ANY_TEXT_ATTACH,__ANY_TEXT_ATTACH_DOC,__AXB_MO_OL_024C2,__AXB_MO_OL_D8ACC,__AXB_XM_OL_024C2,__AXB_XM_OL_080C4,__AXB_XM_OL_424A6,__AXB_XM_OL_B9D6C,__BOUNCE_RPATH_NULL,__CONGRADULAT,__CT,__CTE,__CTYPE_CHARSET_QUOTED,__CT_TEXT_PLAIN,__DOS_HAS_ANY_URI,__DOS_RCVD_THU,__DOS_RCVD_WED,__DOS_RELAYED_EXT,__FB_CONGRADS,__FH_HAS_XMSMAIL,__FH_HAS_XPRIORITY,__FORGED_OE,__FRAUD_DBI,__FRAUD_FCW,__FROM_FULL_NAME,__FROM_MISSPACED,__FROM_MISSP_REPLYTO,__FROM_MISSP_URI,__FROM_RUNON,__FSL_419_1,__FSL_419_2,__FSL_419_3,__FSL_419_4,__FSL_419_5,__FSL_HELO_USER_1,__FSL_HELO_USER_3,__FSL_UA_2,__HAS_ANY_EMAIL,__HAS_ANY_URI,__HAS_DATE,__HAS_FROM,__HAS_MESSAGE_ID,__HAS_MIMEOLE,__HAS_MSGID,__HAS_MSMAIL_PRI,__HAS_RCVD,__HAS_REPLY_TO,__HAS_SUBJECT,__HAS_TO,__HAS_URI,__HAS_XMAIL,__HAS_X_MAILER,__HK_NAME_MR_MRS,__LAST_EXTERNAL_RELAY_NO_AUTH,__LAST_UNTRUSTED_RELAY_NO_AUTH,__LOTSA_MONEY_04,__LOTTO_ADMITS,__LOTTO_ADMITS_1,__LOTTO_WIN_01,__MIMEOLE_MS,__MIME_VERSION,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__MONEY_FRAUD,__MONEY_FRAUD_3,__MONEY_FRAUD_5,__MONEY_LOTTERY,__MSGID_OK_DIGITS,__MSOE_MID_WRONG_CASE,__M_NOTIFIC,__NAKED_TO,__NONEMPTY_BODY,__NO_INR_YES_REF,__OE_MUA,__RCVD_VIA_APNIC_E,__RCVD_VIA_ARIN_E,__RCVD_VIA_RIPE,__RCVD_VIA_RIPE_E,__RDNS_SHORT,__REPLYTO_EXISTS,__REPLY_FREEMAIL,__SANE_MSGID,__SARE_FRAUD_BARRISTER,__SINGLE_HEADER_1K,__SUBJ_2UPPER,__SUBJ_4LOWER,__SUBJ_HAS_WORDS,__SUBJ_NOT_SHORT,__TOCC_EXISTS,__TO_NO_ARROWS_R,__TO_NO_BRKTS_FROM_MSSP,__TO_NO_BRKTS_FROM_RUNON,__TO_NO_BRKTS_MSFT,__TO_NO_BRKTS_NOTLIST,__TVD_BODY,__TVD_MIME_ATT_TP,__URI_MAILTO,__XM_MSOE6,__XM_MS_IN_GENERAL,__XM_OUTLOOK_EXPRESS,__XPRIO,__YOU_WON,__YOU_WON_01,__YOU_WON_02,__YOU_WON_SOMTIN,__hk_million,__hk_win_1,__hk_win_5,__hk_win_6,__hk_win_b
>>>>>>
>>>>>>
>>>>>> time=0,scantime=0,format=f,reuse=no,set=0
>>>>>>
>>>>>> so apparently your masschecker is not seeing rules.
>>>>>>
>>>>>> I don't use --cache & --cachedir (don't remember why) - for starters
>>>>>> maybe remove
>>>>>
>>>>> I started without cache.
>>>>>
>>>>>> I have --cf='use_bayes 0' (speeds up processing) and make sure
>>>>>> you use
>>>>>> --cf='required_score 5'
>>>>>>
>>>>>> you'll have to play with your setup till your logs show SA rule hits.
>>>>>
>>>>> There are no SA rules because the parameter "-C=/dev/null" is set.
>>>>>
>>>>> I don't understand something. Why do I need to check
>>>>> mails-that-i-classified-as-spam-or-ham against rules? If I understand
>>>>> how creating auto rules works, masscheck only dumps strings from ham
>>>>> and spam.
>>>>
>>>> the routine is supposed to create rules based on msgs in your spam
>>>> folder and needs the ham folder to counterweight against potential FPs
>>>> so for example, you don't start producing rules based on phrases in
>>>> disclaimers.
>>>>
>>>> in the log, each line starts with Y/N and a score - not sure how
>>>> necessary it is, I've always had it that way and it "works for me"
>>>>
>>>>> And next seek-phrases-in-log should create rules using found strings.
>>>>> I'm using the script from svn with some changes in path. So I assumed
>>>>> that it should be more or less working:)
>>>>
>>>> a wise man once said: "to assume is not to know"
>>>> why not try avoiding modifications till you get some useful results
>>>> and then start doing mods, one at a time.
>>>
>>> I just modified the "run" script, the other perl scripts are untouched.
>>>
>>>>> Btw, I removed -C=/dev/null , rule hits are in the logs but
>>>>> seek-phrases-in-log still returns no rules if I set --reqpatlength=
>>>>> to a non-zero value.
>>>>
>>>> I have no idea.
>>>> I'll send you a modified seek-phrases-in-log (offlist) for you to
>>>> try...
>>>
>>> I've got two pieces of news, bad and good.
>>> The good news is your version of the script works!
>>> The bad news is that the script in the official repo doesn't work.
>>> bugzilla?
>>
>> I see what is going on. Variable maxreqpatlength isn't initialized in
>> the original script...
>>
>
>
> Pls open a bug to track the changes for the future.
>
> And I've got good news :)
> I'll rename the one we now have in SVN and commit my working version as
> a replacement.

Hmm, could it be that https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6640 isn't properly fixed?
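
For the archive, a rough sketch of how an uninitialized upper bound could produce the symptom seen in this thread (rules with --reqpatlength 0, none with any value >0). This is only an illustration in Python, not the code from seek-phrases-in-log: the function name, the control flow, the sample phrases and the value 1024 are all assumptions. Only the option name --reqpatlength and the variable name maxreqpatlength come from the messages above.

```python
# Hypothetical illustration only. The names and the filter logic are
# assumptions and are NOT taken from the real seek-phrases-in-log script.

def select_phrases(phrases, reqpatlength=0, maxreqpatlength=None):
    """Return the phrases whose length passes the configured bounds."""
    # Mimic Perl, where an uninitialized variable counts as 0 in
    # numeric context.
    upper = 0 if maxreqpatlength is None else maxreqpatlength
    if reqpatlength == 0:
        # No length requirement at all: every phrase passes.
        return list(phrases)
    # With a requirement set, both bounds apply. An upper bound of 0
    # rejects every phrase.
    return [p for p in phrases if reqpatlength <= len(p) <= upper]


phrases = [
    "It has come to our attention that you ",
    " in order to confirm your disbursement.",
    "short one",
]

# No requirement: all phrases survive.
print(len(select_phrases(phrases)))
# Requirement set, upper bound left unset: nothing survives.
print(len(select_phrases(phrases, reqpatlength=37)))
# Requirement set and upper bound initialized (1024 is an arbitrary choice).
print(len(select_phrases(phrases, reqpatlength=37, maxreqpatlength=1024)))
```

If the real script behaves anything like this, giving the upper bound a sane default would be enough to get rules back with a non-zero --reqpatlength, but that needs checking against the version in SVN.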