Everything written there is wrong.
SA Bayes uses $pms->get_decoded_stripped_body_text_array(), which returns
the text that MUAs are supposed to display to the user, with the text/html
part rendered to text if one exists.

So use the stripped function, unless your engine handles MIME multipart,
HTML rendering etc. itself.

get_decoded_stripped_body_text_array is what 'body' rules process
get_decoded_body_text_array is what 'rawbody' rules process

I've written detailed info about the rule types here:
https://cwiki.apache.org/confluence/display/spamassassin/WritingRulesAdvanced

The PerMsgStatus docs are quite poor in this regard, I tried to describe
a bit more in current SVN versions..

Cheers,
Henrik

On Wed, Aug 21, 2019 at 03:12:22AM +0530, Shreyansh Shrivastava. wrote:
> Hey Kris,
> Thanks for the pointer. Will try to accommodate both the sections.
>
> Also, I found the answer. $pms->get_decoded_body_text_array() returns an array
> of strings where each string represents one newline-separated line of the
> body. Also since the newline gets converted into <br> in text/html, the whole
> text/html part becomes the last element of the array. Using pop() on the array
> will leave you with only the text/plain part.
>
> Thanks,
> Shreyansh Shrivastava
>
>
> On Wed, Aug 21, 2019 at 3:06 AM Kris Deugau <[1][email protected]> wrote:
>
> > Shreyansh Shrivastava. wrote:
> > > I wanted to process only the text/plain part of the mail hence I was
> > > looking for a sub in SA. The closest I could get was
> > > $pms->get_decoded_body_text_array () which returns an array of strings
> > > comprising both text/plain and text/html part of the mail.
> > >
> > > Is there any other way of retrieving the text/plain part only?
> >
> > I can't really answer what you're asking, but I will point out that the
> > text/plain part is often empty or at least different from the text/html
> > part - on both spam and ham. Looking only at the text/html would be
> > slightly better, but using both would be better still.
> >
> > The HTML formatting/structure itself is often valuable for spam signs
> > too, on top of whatever readable text content it contains.
> >
> > -kgd
>
> References:
>
> [1] mailto:[email protected]
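
For illustration, the 'body' / 'rawbody' distinction described at the top of
this message might look roughly like this in a local rules file. This is only
a sketch: the rule names, patterns and scores are made up for the example and
are not taken from any shipped ruleset.

```
# Hypothetical rule names and patterns, for illustration only.

# 'body' rules match against the rendered text, i.e. the data that
# get_decoded_stripped_body_text_array() returns (HTML already rendered).
body      LOCAL_EX_BODY     /\bclick here\b/i
describe  LOCAL_EX_BODY     Example: phrase in rendered body text
score     LOCAL_EX_BODY     0.1

# 'rawbody' rules match against the decoded but unrendered text parts,
# i.e. the data that get_decoded_body_text_array() returns, so HTML
# markup is still visible to the pattern.
rawbody   LOCAL_EX_RAWBODY  /<a\s[^>]*\bhref=/i
describe  LOCAL_EX_RAWBODY  Example: anchor tag in raw HTML part
score     LOCAL_EX_RAWBODY  0.1
```

A pattern that looks for markup such as a tag only makes sense as 'rawbody',
since the rendered text seen by 'body' rules no longer contains the tags.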
