http://bugzilla.spamassassin.org/show_bug.cgi?id=3776
------- Additional Comments From [EMAIL PROTECTED] 2004-10-09 08:35 -------

>> I wonder if this would also apply to other
>> binary-in-the-body stuff, like old uuencoded
>> binaries in the body.

Hi Loren, the MIME decoder in SA3 should strip out inline uuencoded data just as it does with rfc822 attachments. The problem here is that this message has a malformed MIME structure, so SA3 and the other tools I have tried (Ripmime) decode it as they see it.

>> "It seems that the entire message body is being
>> passed in to create_lm, not just the malformed MIME
>> part that MUAs display in the body."

Sidney, isn't that what I said? I agree that your proposed patch is the better way to go: limit the total size that can be scanned rather than the total number of iterations of the for loop.

>> "Also, the slowness of that loop in TextCat doesn't
>> explain the memory blowup."

Sidney, after patching the create_lm call, I don't see any big increases in memory consumption. Without the patch, the for loop takes about 6 seconds because it is doing 28k iterations. The ngrams sort in the else statement takes 4-5 seconds. The splice (steps 5->6) takes 10-11 seconds. The return back to classify takes 4-5 seconds! It's not the for loop that causes the big increases in memory; it's the sorts and splices, and then having to return that big array back to classify().
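For illustration only, here is a minimal Python sketch of the size-cap idea discussed above: truncate the text before building the n-gram profile, so that the counting loop, the sort, and the final slice all scale with the cap instead of with the full message body. This is not the SpamAssassin patch and not the TextCat code. The function name create_lm_sketch, both constants, and the details of the n-gram scheme (underscore-padded words, lengths 1 to 5) are assumptions made for the example.

```python
from collections import Counter

MAX_INPUT_CHARS = 10_000  # hypothetical cap, not a value taken from SpamAssassin
MAX_NGRAMS = 400          # hypothetical number of n-grams kept in the profile


def create_lm_sketch(text, max_chars=MAX_INPUT_CHARS, max_ngrams=MAX_NGRAMS):
    """Build a ranked n-gram profile from at most max_chars of text."""
    # Truncate first: everything below then works on bounded input,
    # even if a malformed MIME part delivers a huge body.
    text = text[:max_chars]

    counts = Counter()
    for word in text.split():
        padded = "_" + word + "_"
        for n in range(1, 6):  # n-gram lengths 1..5 (assumed scheme)
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1

    # Rank by descending frequency, ties broken alphabetically,
    # then keep only the top max_ngrams entries.
    ranked = sorted(counts, key=lambda gram: (-counts[gram], gram))
    return ranked[:max_ngrams]
```

The point of the design is that the cap is applied to the input rather than to the loop counter: limiting only the iterations would still leave the sort and the slice working on whatever had been accumulated, while limiting the input bounds every later step.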
2004-10-09 10:31:41.246016500 debug: generic: going to textcat matches
2004-10-09 10:31:41.247064500 debug: generic: running TextCat::classify
2004-10-09 10:31:44.999990500 debug: generic: count was 28758
2004-10-09 10:31:45.000113500 debug: generic: 3 else sort ngrams
2004-10-09 10:31:49.693734500 debug: generic: 4 else sort ngrams is done
2004-10-09 10:31:49.693854500 debug: generic: 5 splice sorted
2004-10-09 10:31:59.008898500 debug: generic: 6 splice sorted is done
2004-10-09 10:31:59.009022500 debug: generic: 7 return sorted to classify()
2004-10-09 10:32:03.020713500 debug: generic: done running create_lm

After patching create_lm(), the sorts and splices are very fast:

2004-10-09 10:30:07.046344500 debug: generic: going to textcat matches
2004-10-09 10:30:07.047347500 debug: generic: running TextCat::classify
2004-10-09 10:30:07.413029500 debug: generic: count was 2501
2004-10-09 10:30:07.413138500 debug: generic: 3 else sort ngrams
2004-10-09 10:30:07.568822500 debug: generic: 4 else sort ngrams is done
2004-10-09 10:30:07.568935500 debug: generic: 5 splice sorted
2004-10-09 10:30:07.591153500 debug: generic: 6 splice sorted is done
2004-10-09 10:30:07.591260500 debug: generic: 7 return sorted to classify()
2004-10-09 10:30:07.651056500 debug: generic: done running create_lm

I am taking a weekend vacation with my wife right now, so I'm not sure if I can continue on this until Monday. I agree that a scan time of 4-6 seconds for this message is still too slow, and we need to figure out what is causing the remaining slowdown.

Thanks.

d

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
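The per-step cost in the debug output above can be read off by subtracting consecutive timestamps. A minimal Python sketch of that arithmetic follows, using the log lines of the unpatched run exactly as posted. The helper names parse_line and step_durations are made up for this example, and the line layout it assumes (date, time with nanoseconds, then "debug: generic: message") is inferred from the posted output only.

```python
from datetime import datetime

# Log lines of the unpatched run, copied from the comment above.
LOG = """\
2004-10-09 10:31:41.246016500 debug: generic: going to textcat matches
2004-10-09 10:31:41.247064500 debug: generic: running TextCat::classify
2004-10-09 10:31:44.999990500 debug: generic: count was 28758
2004-10-09 10:31:45.000113500 debug: generic: 3 else sort ngrams
2004-10-09 10:31:49.693734500 debug: generic: 4 else sort ngrams is done
2004-10-09 10:31:49.693854500 debug: generic: 5 splice sorted
2004-10-09 10:31:59.008898500 debug: generic: 6 splice sorted is done
2004-10-09 10:31:59.009022500 debug: generic: 7 return sorted to classify()
2004-10-09 10:32:03.020713500 debug: generic: done running create_lm
"""


def parse_line(line):
    """Split one log line into (timestamp, message)."""
    date, clock, rest = line.split(" ", 2)
    # The log carries nanoseconds; datetime stores microseconds,
    # so keep only the first six fractional digits.
    whole, frac = clock.split(".")
    stamp = datetime.strptime(f"{date} {whole}.{frac[:6]}", "%Y-%m-%d %H:%M:%S.%f")
    message = rest.split("generic: ", 1)[1]
    return stamp, message


def step_durations(log_text):
    """Return (message, seconds until the next log line) for each line but the last."""
    entries = [parse_line(line) for line in log_text.splitlines() if line.strip()]
    return [
        (message, (next_stamp - stamp).total_seconds())
        for (stamp, message), (next_stamp, _) in zip(entries, entries[1:])
    ]


if __name__ == "__main__":
    for message, seconds in step_durations(LOG):
        print(f"{seconds:10.6f}s  {message}")
```

Each duration is attributed to the message that precedes the gap, so the time listed for "5 splice sorted" is the time spent between markers 5 and 6.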
