Hey Henrik,

Thanks for pointing out the error. Will use the stripped function instead.
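
For the archives, roughly what I plan to change, as a minimal sketch only. The helper name rendered_body_text is my own and purely illustrative; the only SpamAssassin call in it is get_decoded_stripped_body_text_array(), and I am assuming it returns an array reference and that $pms is the PerMsgStatus object handed to the plugin.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is illustrative, not part of SpamAssassin).
# $pms is assumed to be the Mail::SpamAssassin::PerMsgStatus object that a
# plugin method receives.
sub rendered_body_text {
    my ($pms) = @_;

    # The text that 'body' rules process: the part meant to be displayed
    # to the user, with text/html rendered to text if it exists.
    # Assumption: the call returns an array reference.
    my $chunks = $pms->get_decoded_stripped_body_text_array();

    # Concatenate the elements unchanged. Whether an extra separator is
    # needed between elements is not checked here.
    return join('', @{ $chunks || [] });
}
```

The result would then be passed to my engine in place of the raw array, so there is no pop() and no MIME handling on my side.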

Regards,
Shreyansh Shrivastava

On Wed, 21 Aug 2019, 11:20 Henrik K, <h...@hege.li> wrote:

>
> Everything written there is wrong.
>
> SA Bayes uses $pms->get_decoded_stripped_body_text_array(), which returns
> the text that is supposed to be displayed to user / MUAs, with text/html
> part rendered to text if exists.
>
> So use the stripped function, unless your engine handles mime multipart,
> HTML rendering etc.
>
> get_decoded_stripped_body_text_array is what 'body' rules process
> get_decoded_body_text_array is what 'rawbody' rules process
>
> I've written detailed info about the rule types here:
>
> https://cwiki.apache.org/confluence/display/spamassassin/WritingRulesAdvanced
>
> The PerMsgStatus docs are quite poor in this regard, I tried to describe a
> bit more in current SVN versions..
>
> Cheers,
> Henrik
>
> On Wed, Aug 21, 2019 at 03:12:22AM +0530, Shreyansh Shrivastava. wrote:
> > Hey Kris,
> > Thanks for the pointer. Will try to accommodate both the sections.
> >
> > Also, I found the answer. $pms->get_decoded_body_text_array() returns an array
> > of strings where each string represented one newline-separated line of the
> > body. Also since the newline gets converted into <br> in text/html, the whole
> > text/html part becomes the last element of the array. Using pop() on the array
> > will leave you with only the text/plain part.
> >
> > Thanks,
> > Shreyansh Shrivastava
> >
> >
> > On Wed, Aug 21, 2019 at 3:06 AM Kris Deugau <[1]kdeu...@vianet.ca> wrote:
> >
> > Shreyansh Shrivastava. wrote:
> > > I wanted to process only the text/plain part of the mail hence I was
> > > looking for a sub in SA. The closest I could get was
> > > $pms->get_decoded_body_text_array () which returns an array of strings
> > > comprising both text/plain and text/html part of the mail.
> > >
> > > Is there any other way of retrieving the text/plain part only?
> >
> > I can't really answer what you're asking, but I will point out that the
> > text/plain part is often empty or at least different from the text/html
> > part - on both spam and ham. Looking only at the text/html would be
> > slightly better, but using both would be better still.
> >
> > The HTML formatting/structure itself is often valuable for spam signs
> > too, on top of whatever readable text content it contains.
> >
> > -kgd
> >
> >
> > References:
> >
> > [1] mailto:kdeu...@vianet.ca
>