On 09.03.2017 at 16:05, Axb wrote:
> On 03/09/2017 03:11 PM, mar...@mejor.pl wrote:
>> On 09.03.2017 at 15:05, mar...@mejor.pl wrote:
>>> On 09.03.2017 at 14:42, Axb wrote:
>>>> On 03/09/2017 02:31 PM, mar...@mejor.pl wrote:
>>>>> On 08.03.2017 at 17:30, Axb wrote:
>>>>>> On 03/08/2017 04:55 PM, mar...@mejor.pl wrote:
>>>>>>> On 08.03.2017 at 16:33, Axb wrote:
>>>>>>>> On 03/08/2017 04:16 PM, mar...@mejor.pl wrote:
>>>>>>>>> On 08.03.2017 at 16:06, Axb wrote:
>>>>>>>>>> On 03/08/2017 03:58 PM, mar...@mejor.pl wrote:
>>>>>>>>>>> On 08.03.2017 at 15:27, Axb wrote:
>>>>>>>>>>>> As your command below shows you're using --reqpatlength 0
>>>>>>>>>>>>
>>>>>>>>>>>> Start off with something sane, for example --reqpatlength 40
>>>>>>>>>>>>
>>>>>>>>>>>> you may also want to play with --maxtextread
>>>>>>>>>>>> ( I use --maxtextread 8192  for FRAUD rules)
>>>>>>>>>>>
>>>>>>>>>>> But with --reqpatlength 10, 40, 100 or 1000 I get no hits.
>>>>>>>>>>> Reading the help
>>>>>>>>>>> ( "--reqpatlength: required pattern length, in characters (default: 0)" )
>>>>>>>>>>> I understand that the pattern in a generated rule will be longer
>>>>>>>>>>> than reqpatlength (shorter strings will be ignored). Is my
>>>>>>>>>>> understanding of how the parameter works correct?
>>>>>>>>>>
>>>>>>>>>> --reqpatlength 40 tells seekphrases to ignore any phrases
>>>>>>>>>> which are shorter than 40 chars
>>>>>>>>>>
>>>>>>>>>> just checked my line, which is using
>>>>>>>>>>  --reqpatlength 37
>>>>>>>>>
>>>>>>>>> With any value > 0, no rule is generated.
>>>>>>>>>
>>>>>>>>>> body __AXB_FRAUD_LAF076  /It has come to our attention that you /
>>>>>>>>>> body __AXB_FRAUD_UPVTRT  / in order to confirm your disbursement\./
>>>>>>>>>> body __AXB_FRAUD_NOFUX2  / approval, your funds will be deposited directly into your /
>>>>>>>>>> body __AXB_FRAUD_Z4ZZ7D  / in order to accept your disbursement\./
>>>>>>>>>> body __AXB_FRAUD_CUXJ6X  / approval, your funds will be direct deposited into your /
>>>>>>>>>> body __AXB_FRAUD_NHWXKL  /: You Are Eligible to Receive Funds up to \$.,000\. /
>>>>>>>>>>
>>>>>>>>>> hard to guess what is not working on your side without full
>>>>>>>>>> insight
>>>>>>>>>
>>>>>>>>> What can I do to help more? Should I share all_w.h and all_w.s
>>>>>>>>> files?
>>>>>>>>
>>>>>>>> before we go that way pls answer these questions
>>>>>>>>
>>>>>>>> how many spams/hams are you processing?
>>>>>>>
>>>>>>> ham: ~1400
>>>>>>> spam: ~8200
>>>>>>>
>>>>>>>> do you have a file named assemble.state ? if yes, how large?
>>>>>>>
>>>>>>> Yes, I've got this file, it has ~9MB size.
>>>>>>>
>>>>>>>> and pls zip & send me the full script you're using to generate the
>>>>>>>> rules, OFFLIST! do NOT post to list
>>>>>>>
>>>>>>> Ok, I'll choose tar.bz2 ;)
>>>>>>> Thanks for help.
>>>>>>
>>>>>> replying on list as much as I can so it's  archived FTR
>>>>>>
>>>>>> first thing I see is that your logs do not contain a list of rules
>>>>>> which hit on each message.
>>>>>>
>>>>>> for example my "w.s" file has lines which look like:
>>>>>>
>>>>>>  53
>>>>>> /home/mc/Maildir/cur/1487823401.M695422P29583.ruler,S=7602,W=7747:2,
>>>>>> ADVANCE_FEE_2_NEW_MONEY,ADVANCE_FEE_3_NEW,ADVANCE_FEE_3_NEW_MONEY,ADVANCE_FEE_4_NEW,ADVANCE_FEE_4_NEW_MONEY,ADVANCE_FEE_5_NEW,ADVANCE_FEE_5_NEW_MONEY,AXB_XM2600,AXB_XMAILER_MIMEOLE_OL_024C2,CM_XRCVD_VOOZER4,DEAR_WINNER,FORGED_MUA_OUTLOOK,FROM_MISSPACED,FROM_MISSP_MSFT,FROM_MISSP_REPLYTO,FROM_MISSP_URI,FSL_419_FP1,FSL_CTYPE_WIN1251,FSL_MISSP_REPLYTO,FSL_NEW_HELO_USER,FSL_RCVD_USER,FSL_UA,FSL_XM_419,HK_NAME_MR_MRS,LOTS_OF_MONEY,LOTTO_DEPT,MONEY_FRAUD_3,MONEY_FRAUD_5,MONEY_FROM_MISSP,MSOE_MID_WRONG_CASE,NSL_RCVD_HELO_USER,TO_NO_BRKTS_FROM_MSSP,T_AXB_XM2600,T_BIG_HEADERS_5K,T_CM_XRCVD_VOOZER4,T_FSL_FREEMAIL_1,T_FSL_HELO_NON_FQDN_2,T_HK_MUCHMONEY,T_LOTTO_AGENT,T_SINGLE_HEADER_1K,T_TO_NO_BRKTS_MSFT,__419_FROM_SIG,__ADVANCE_FEE_2_NEW,__ADVANCE_FEE_2_NEW_MONEY,__ADVANCE_FEE_3_NEW,__ADVANCE_FEE_3_NEW_MONEY,__ADVANCE_FEE_4_NEW,__ADVANCE_FEE_4_NEW_MONEY,__ADVANCE_FEE_5_NEW,__ADVANCE_FEE_5_NEW_MONEY,__AFF_LOTTERY,__ANY_OUTLOOK_MUA,__ANY_TEXT_ATTACH,__ANY_TEXT_ATTACH_DOC,__AXB_MO_OL_024C2,__AXB_MO_OL_D8ACC,__AXB_XM_OL_024C2,__AXB_XM_OL_080C4,__AXB_XM_OL_424A6,__AXB_XM_OL_B9D6C,__BOUNCE_RPATH_NULL,__CONGRADULAT,__CT,__CTE,__CTYPE_CHARSET_QUOTED,__CT_TEXT_PLAIN,__DOS_HAS_ANY_URI,__DOS_RCVD_THU,__DOS_RCVD_WED,__DOS_RELAYED_EXT,__FB_CONGRADS,__FH_HAS_XMSMAIL,__FH_HAS_XPRIORITY,__FORGED_OE,__FRAUD_DBI,__FRAUD_FCW,__FROM_FULL_NAME,__FROM_MISSPACED,__FROM_MISSP_REPLYTO,__FROM_MISSP_URI,__FROM_RUNON,__FSL_419_1,__FSL_419_2,__FSL_419_3,__FSL_419_4,__FSL_419_5,__FSL_HELO_USER_1,__FSL_HELO_USER_3,__FSL_UA_2,__HAS_ANY_EMAIL,__HAS_ANY_URI,__HAS_DATE,__HAS_FROM,__HAS_MESSAGE_ID,__HAS_MIMEOLE,__HAS_MSGID,__HAS_MSMAIL_PRI,__HAS_RCVD,__HAS_REPLY_TO,__HAS_SUBJECT,__HAS_TO,__HAS_URI,__HAS_XMAIL,__HAS_X_MAILER,__HK_NAME_MR_MRS,__LAST_EXTERNAL_RELAY_NO_AUTH,__LAST_UNTRUSTED_RELAY_NO_AUTH,__LOTSA_MONEY_04,__LOTTO_ADMITS,__LOTTO_ADMITS_1,__LOTTO_WIN_01,__MIMEOLE_MS,__MIME_VERSION,__MISSING_REF,__MISSING_REPLY,__MISSING_THREAD,__MONEY_FRAUD,__MONEY_FRAUD_3,__MONEY_FRAUD_5,__MONEY_LOTTERY,__MSGID_OK_DIGITS,__MSOE_MID_WRONG_CASE,__M_NOTIFIC,__NAKED_TO,__NONEMPTY_BODY,__NO_INR_YES_REF,__OE_MUA,__RCVD_VIA_APNIC_E,__RCVD_VIA_ARIN_E,__RCVD_VIA_RIPE,__RCVD_VIA_RIPE_E,__RDNS_SHORT,__REPLYTO_EXISTS,__REPLY_FREEMAIL,__SANE_MSGID,__SARE_FRAUD_BARRISTER,__SINGLE_HEADER_1K,__SUBJ_2UPPER,__SUBJ_4LOWER,__SUBJ_HAS_WORDS,__SUBJ_NOT_SHORT,__TOCC_EXISTS,__TO_NO_ARROWS_R,__TO_NO_BRKTS_FROM_MSSP,__TO_NO_BRKTS_FROM_RUNON,__TO_NO_BRKTS_MSFT,__TO_NO_BRKTS_NOTLIST,__TVD_BODY,__TVD_MIME_ATT_TP,__URI_MAILTO,__XM_MSOE6,__XM_MS_IN_GENERAL,__XM_OUTLOOK_EXPRESS,__XPRIO,__YOU_WON,__YOU_WON_01,__YOU_WON_02,__YOU_WON_SOMTIN,__hk_million,__hk_win_1,__hk_win_5,__hk_win_6,__hk_win_b
>>>>>>
>>>>>>
>>>>>> time=0,scantime=0,format=f,reuse=no,set=0
>>>>>>
>>>>>> so apparently your masschecker is not seeing rules.
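[Archive note] A quick way to confirm whether a log actually contains rule hits is to look for the comma-separated rule list on each line. The following is only a minimal Python sketch, not part of mass-check or seek-phrases-in-log. It assumes the whitespace-separated layout suggested by the sample above (optional Y/N flag, score, message path, rule list, then metadata such as time=...) and that paths contain no spaces. The function names are made up for illustration.

```python
def rules_hit(log_line):
    """Return the rule names found on one mass-check style log line.

    Assumed layout (an assumption, based on the sample in this thread):
    [flag] score path [rule1,rule2,...] key=value,key=value,...
    """
    tokens = log_line.split()
    # Drop the leading result flag if the line has one.
    if tokens and tokens[0] in ("Y", "N", "."):
        tokens = tokens[1:]
    # Skip the score and the message path.
    rest = tokens[2:]
    # With only the metadata field left, no rule list was logged.
    if len(rest) < 2:
        return []
    return [name for name in rest[0].split(",") if name]


def lines_without_rules(lines):
    """Return the non-comment log lines that carry no rule hits."""
    return [
        ln for ln in lines
        if ln.strip() and not ln.startswith("#") and not rules_hit(ln)
    ]
```

If lines_without_rules() returns almost every line of a log, the mass-check run probably did not load any rules (for example because of -C=/dev/null).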
>>>>>>
>>>>>> I don't use --cache & --cachedir (don't remember why) - for starters
>>>>>> maybe remove them
>>>>>
>>>>> I started without cache.
>>>>>
>>>>>> I have  --cf='use_bayes 0' (speeds up processing) and make sure
>>>>>> you use
>>>>>>   --cf='required_score 5'
>>>>>>
>>>>>> you'll have to play with your setup till your logs show SA rule hits.
>>>>>
>>>>> There are no SA rules because the parameter "-C=/dev/null" is set.
>>>>>
>>>>> There is something I don't understand. Why do I need to check
>>>>> mails-that-i-classified-as-spam-or-ham against rules? As I understand
>>>>> how automatic rule creation works, masscheck only dumps strings from
>>>>> ham and spam.
>>>>
>>>> the routine is supposed to create rules based on msgs in your spam
>>>> folder and needs the ham folder as a counterweight against potential FPs
>>>> so that, for example, you don't start producing rules based on phrases in
>>>> disclaimers.
>>>>
>>>> in the log, each line starts with Y/N and a score - not sure how
>>>> necessary it is, I've always had it that way and it "works for me"
>>>>
>>>>> And then seek-phrases-in-log should create rules from the strings found.
>>>>> I'm using the script from svn with some path changes. So I assumed that
>>>>> it should be more or less working :)
>>>>
>>>> a wise man once said: "to assume is not to know"
>>>> why not try avoiding modifications till you get some useful results
>>>> and then start doing mods, one at a time.
>>>
>>> I only modified the "run" script; the other Perl scripts are untouched.
>>>
>>>>> Btw, I removed -C=/dev/null, and rule hits are now in the logs, but
>>>>> seek-phrases-in-log still returns no rules if I set --reqpatlength to
>>>>> a non-zero value.
>>>>
>>>> I have no idea.
>>>> I'll send you a modified seek-phrases-in-log (offlist) for you to
>>>> try...
>>>
>>> I've got two pieces of news, bad and good.
>>> The good news is that your version of the script works!
>>> The bad news is that the script in the official repo doesn't work.
>>> bugzilla?
>>
>> I see what is going on. The variable maxreqpatlength isn't initialized
>> in the original script...
>>
> 
> 
> Pls open a bug to track the changes for the future.
> 
> And I've got good news :)
> I'll rename the one we now have in SVN and commit my working version as
> a replacement.

Hmm, could it be that
https://bz.apache.org/SpamAssassin/show_bug.cgi?id=6640 isn't properly
fixed?
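
[Archive note] For readers following the thread, the filtering discussed above can be sketched in a few lines of Python. This is not the seek-phrases-in-log code (which is Perl). The function and parameter names are hypothetical, and exact string matching against ham stands in for whatever the real script does. The sketch only illustrates two ideas from the thread: phrases shorter than the required length are ignored, and phrases that also occur in ham are dropped to avoid FPs. The upper bound gets an explicit "no limit" default, on the guess (not verified against the script) that an unset upper bound is the kind of thing that could make every phrase fail the length check.

```python
def select_phrases(spam_phrases, ham_phrases, min_len=0, max_len=None):
    """Pick candidate rule phrases from spam, using ham as a counterweight.

    min_len plays the role of --reqpatlength in this sketch.
    max_len is a hypothetical upper bound; None means "no upper limit".
    """
    ham = set(ham_phrases)
    selected = []
    for phrase in spam_phrases:
        # Ignore phrases shorter than the required pattern length.
        if len(phrase) < min_len:
            continue
        # Apply the upper bound only when one was actually given.
        if max_len is not None and len(phrase) > max_len:
            continue
        # Phrases that also appear in ham (e.g. disclaimers) would cause FPs.
        if phrase in ham:
            continue
        selected.append(phrase)
    return selected
```

With min_len=40, a short phrase such as "click here" is skipped, and a disclaimer sentence that also appears in ham is skipped regardless of its length.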
