Hey Henrik,

Thanks for pointing out the error. Will use the stripped function instead.
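
For the archives, a minimal sketch of what I plan to try. It assumes that
get_decoded_stripped_body_text_array() returns an array reference and that
$pms is the PerMsgStatus object handed to a plugin eval rule.
rendered_body_text is a hypothetical helper name of my own, not part of SA.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of SpamAssassin): gather the rendered body
# text, i.e. what 'body' rules process, into a single string.
sub rendered_body_text {
    my ($pms) = @_;

    # Assumption: the method returns an array reference of text chunks.
    my $chunks = $pms->get_decoded_stripped_body_text_array();

    # Fall back to an empty list if nothing came back, then join the
    # chunks with newlines.
    return join("\n", @{ $chunks || [] });
}
```

Inside a plugin eval rule this would be called as rendered_body_text($pms),
and the result passed on to my own processing.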

Regards,
Shreyansh Shrivastava

On Wed, 21 Aug 2019, 11:20 Henrik K, <h...@hege.li> wrote:

>
> Everything written there is wrong.
>
> SA Bayes uses $pms->get_decoded_stripped_body_text_array(), which returns
> the text that is supposed to be displayed to the user by MUAs, with the
> text/html part rendered to text if one exists.
>
> So use the stripped function, unless your engine handles MIME multipart,
> HTML rendering etc. itself.
>
> get_decoded_stripped_body_text_array is what 'body' rules process
> get_decoded_body_text_array is what 'rawbody' rules process
>
> I've written detailed info about the rule types here:
>
> https://cwiki.apache.org/confluence/display/spamassassin/WritingRulesAdvanced
>
> The PerMsgStatus docs are quite poor in this regard; I tried to describe a
> bit more in current SVN versions.
>
> Cheers,
> Henrik
>
> On Wed, Aug 21, 2019 at 03:12:22AM +0530, Shreyansh Shrivastava. wrote:
> > Hey Kris,
> > Thanks for the pointer. Will try to accommodate both the sections.
> >
> > Also, I found the answer. $pms->get_decoded_body_text_array() returns an
> > array of strings where each string represents one newline-separated line
> > of the body. Also, since the newline gets converted into <br> in
> > text/html, the whole text/html part becomes the last element of the
> > array. Using pop() on the array will leave you with only the text/plain
> > part.
> >
> > Thanks,
> > Shreyansh Shrivastava
> >
> >
> > On Wed, Aug 21, 2019 at 3:06 AM Kris Deugau <[1]kdeu...@vianet.ca> wrote:
> >
> >     Shreyansh Shrivastava. wrote:
> >     > I wanted to process only the text/plain part of the mail hence I
> >     > was looking for a sub in SA. The closest I could get was
> >     > $pms->get_decoded_body_text_array () which returns an array of
> >     > strings comprising both text/plain and text/html part of the mail.
> >     >
> >     > Is there any other way of retrieving the text/plain part only?
> >
> >     I can't really answer what you're asking, but I will point out that
> >     the text/plain part is often empty or at least different from the
> >     text/html part - on both spam and ham.  Looking only at the text/html
> >     would be slightly better, but using both would be better still.
> >
> >     The HTML formatting/structure itself is often valuable for spam signs
> >     too, on top of whatever readable text content it contains.
> >
> >     -kgd
> >
> >
> > References:
> >
> > [1] mailto:kdeu...@vianet.ca
>
