On Sat, Mar 06, 2004 at 05:41:21PM -0800, Dan Quinlan wrote:
> Given how we decode MIME data, I think this might make sense:

FYI, this exact discussion came up a month or so ago.  There was a
ticket about it, and some sa-users (or dev?) chatter.  I tried to dig
up a pointer to it, but strangely couldn't find it in my archive or in
Bugzilla. :(

>  * header - stays the same
>  * body - decoded and rendered text (unchanged)
>  * decoded - decoded text (by default, see below), not rendered (new
>              type, similar to the old "rawbody")
>  * raw - pristine body, no changes, (raw means raw, whee)
>  * full - pristine message, headers plus body (mostly for checksum tests)

Sounds fine to me, but see below.

BTW: the current "full" is essentially the same, except that the headers
get slightly cooked.  Specifically, whitespace folding is removed so that
regexp matches are simpler to write.
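
To illustrate why unfolding helps, here is a minimal sketch in Python
(not SpamAssassin's actual code).  The function name unfold_headers and
the choice to collapse each fold to a single space are assumptions made
for the example only.

```python
import re

def unfold_headers(raw_headers: str) -> str:
    """Join folded header continuation lines into single lines.

    A line break followed by a space or tab marks a continuation line.
    Collapsing the fold to one space is an assumption for this sketch.
    """
    return re.sub(r"\r?\n[ \t]+", " ", raw_headers)

folded = "Subject: a long\n\tsubject line\nFrom: someone@example.com\n"
unfolded = unfold_headers(folded)

# A single-line pattern only matches once the fold is gone.
pattern = re.compile(r"^Subject:.*subject line$", re.MULTILINE)
matches_folded = bool(pattern.search(folded))
matches_unfolded = bool(pattern.search(unfolded))
```

Against the folded input a rule would have to allow for a line break in
the middle of the header; against the unfolded input it does not.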

> For each test, make the default form of the data be a reference to an
> array like how body currently works.

Sure.

> Next, a set of modifiers for each:
> 
>   one common modifier for all 4 types:
>     - a 'join' (or 'string' or whatever) modifier to return the entire
>       data in a single string, performance-be-damned

ok.

>   one modifier for "decoded":
>     - a regex to select decoded versions of specific content-types, any
>       possible content type: text, application, image, etc.  "decoded"
>       would default to the same set of types that are ultimately
>       rendered as body, of course

I don't really like the modifier idea for this, and I think the same
thing would need to work for "raw" as well.  Plain "raw" would be the
whole pristine body, whereas "raw" with a modifier (or whatever we end
up calling it) could search specific parts, etc.

Things that you can search/deal with now trivially:

Content-Type (just the type, no reason to use the full header), Attachment
Name (/\.zip$/ for instance), decoded section offset (start the string at
offset X bytes), and decoded section length (only let me match against
the first Y decoded bytes).
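
A rough sketch of what those four selectors could look like, using
Python's standard email package rather than SpamAssassin's own parser.
The function select_decoded and its parameter names are hypothetical,
made up for illustration; this is not a proposed rule syntax.

```python
import re
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def select_decoded(msg, type_re=None, name_re=None, offset=0, length=None):
    """Return decoded bytes of each leaf MIME part that passes the selectors.

    All names here are hypothetical: type_re matches the bare content
    type, name_re matches the attachment filename, and offset/length
    limit which decoded bytes a rule would see.
    """
    selected = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers have no decoded payload of their own
        if type_re and not re.search(type_re, part.get_content_type()):
            continue
        if name_re and not re.search(name_re, part.get_filename() or ""):
            continue
        data = part.get_payload(decode=True) or b""
        end = None if length is None else offset + length
        selected.append(data[offset:end])
    return selected

# Build a small two-part test message: one text part, one zip attachment.
msg = MIMEMultipart()
msg.attach(MIMEText("hello world", "plain"))
zip_part = MIMEApplication(b"PK\x03\x04rest-of-archive", "zip")
zip_part.add_header("Content-Disposition", "attachment", filename="report.zip")
msg.attach(zip_part)

zip_magic = select_decoded(msg, name_re=r"\.zip$", length=4)
text_tail = select_decoded(msg, type_re=r"^text/", offset=6)
```

Here zip_magic holds only the first four decoded bytes of the .zip
attachment, and text_tail holds the text part starting at byte 6.  The
same selectors could apply to "raw" by skipping the decode step.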

Doing both raw and decoded in this form, btw, would be sufficient for
ticket 3010 as well imo.


BTW: When talking with Dan about this off-list earlier today, I was
strongly opposed because we need to get a release out sooner rather than
later.  But after thinking about it some more, it's really not a "minor
version" type of change, so we should either straighten this out now or
wait for 4.0...

-- 
Randomly Generated Tagline:
"You just reciprocate the small one ..."      - Peter Sagerson
