[issue13698] Mailbox module should support other mbox formats in addition to mboxo
Petri Lehtinen <pe...@digip.org> added the comment:

The default mode for reading mbox files should also be modified a bit to maximize support for different implementations. See #11728.

I think we should still use the mboxo format by default when writing, and the default format of RFC 4155 when reading. We could then add a format parameter to the mbox constructor to alter the writing and/or reading behavior to match a specific mbox format.

According to RFC 4155, the best reference for different mbox formats is http://qmail.org./man/man5/mbox.html.

--
components: +email
nosy: +barry

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue13698
___
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue13698] Mailbox module should support other mbox formats in addition to mboxo
endolith <endol...@gmail.com> added the comment:

> - If the mailbox is written using the mboxrd format and read using the mboxo format, lines that were meant to start with From are changed to From . This is a new type of corruption.

Well, yes. So the choices are:

mboxrd as default: Sometimes results in corruption
mboxo as default: Always results in corruption

Is there a way to reliably detect the format of the file and produce an error if it seems to be reading it wrong? If not, maybe just include a function that guesses the format so the correct option can be found easily?

If there are consecutive quoted lines, like this, for instance:

This is the body.
From my point of view
there are 3 lines.

then it was probably encoded with mboxrd? If instead you find:

This is the body.
From my point of view
there are 3 lines.

then it was probably encoded with mboxo?
[issue13698] Mailbox module should support other mbox formats in addition to mboxo
Petri Lehtinen <pe...@digip.org> added the comment:

endolith wrote:
> > - If the mailbox is written using the mboxrd format and read using
> >   the mboxo format, lines that were meant to start with From
> >   are changed to From . This is a new type of corruption.
>
> Well, yes. So the choices are:
>
> mboxrd as default: Sometimes results in corruption
> mboxo as default: Always results in corruption

I don't think so. Assuming that mboxo (the current default) was used to write the mailbox, both formats sometimes result in corruption.

mboxo as default: From lines get written (and subsequently read) as From .

mboxrd as default: From lines were written as From but are read as From .

Furthermore, if Python's mailbox module is used to write the mbox file and other software that only supports mboxo (e.g. mutt) is used to read it, having mboxrd as the default would cause From lines to be written as From . These lines would then be read as From by the reading software.

So, I'd like to keep the default as is, and add a parameter to change to mboxrd when it's OK for the use case at hand. We should also clearly document that mboxrd is recommended, as it never corrupts data if used for both reading and writing.

> Is there a way to reliably detect the format of the file and produce
> an error if it seems to be reading it wrong? If not, maybe just
> include a function that guesses the format so the correct option can
> be found easily?
>
> If there are consecutive quoted lines, like this, for instance:
>
> This is the body.
> From my point of view
> there are 3 lines.
>
> then it was probably encoded with mboxrd? If instead you find:
>
> This is the body.
> From my point of view
> there are 3 lines.
>
> then it was probably encoded with mboxo?

It's not possible to automatically detect the format. Guessing like you suggested is too fragile. It might work in some situations, but wouldn't work in others.

If it were possible to detect the format by guessing, I'm sure RFC 4155 would mention that, as it aims for the best possible outcome for reading any of the formats, without knowing which format is actually in use.
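A minimal sketch of why guessing is fragile, assuming the heuristic is "a body line that starts with two or more '>' followed by 'From ' implies mboxrd". The names guess_mbox_format and quote_mboxo are hypothetical and are not part of the mailbox module; the sample message is invented for illustration.

```python
import re


def quote_mboxo(line):
    """mboxo writer: prefix '>' only to lines that start with 'From '."""
    return ">" + line if line.startswith("From ") else line


def guess_mbox_format(body_lines):
    """Hypothetical heuristic: a '>>From ' line suggests mboxrd.

    Returns 'mboxrd' when such a line is seen, otherwise None (no evidence).
    """
    for line in body_lines:
        if re.match(r">>+From ", line):
            return "mboxrd"
    return None


# An original message that legitimately contains '>>From ' (for example a
# doubly quoted reply). mboxo leaves this line untouched when writing.
original = ["This is the body.", ">>From a nested reply", "Bye."]
stored_by_mboxo = [quote_mboxo(line) for line in original]

# The heuristic reports mboxrd although the file was written as mboxo.
guess = guess_mbox_format(stored_by_mboxo)
print(guess)
```

A mailbox with no such lines gives the heuristic nothing to work with at all, so it can be both wrong and silent.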
[issue13698] Mailbox module should support other mbox formats in addition to mboxo
Changes by Petri Lehtinen <pe...@digip.org>:

--
nosy: +petri.lehtinen
versions: +Python 3.4 -Python 3.3
[issue13698] Mailbox module should support other mbox formats in addition to mboxo
Petri Lehtinen <pe...@digip.org> added the comment:

I'm a little concerned about backwards compatibility. Someone might get upset if extra >'s start appearing in the messages when they read the mailbox contents with an application that uses the mboxo format.

A little analysis of the possible corruptions that happen with these formats:

- When the mailbox is both read and written using the mboxo format, lines starting with From are changed to From .

- When the mailbox is both read and written using the mboxrd format, no corruption happens.

- If the mailbox is written using the mboxo format and read using the mboxrd format, lines that were meant to start with From are changed to From . So we essentially get a slightly different corruption.

- If the mailbox is written using the mboxrd format and read using the mboxo format, lines that were meant to start with From are changed to From . This is a new type of corruption.
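The four write/read combinations listed in the message above can be simulated with a short sketch. It rests on stated assumptions: an mboxo reader strips one '>' from lines that start with exactly '>From ', and an mboxrd reader strips one '>' from any line matching '>+From '. Real readers differ in whether they unquote at all, so the results illustrate the formats as usually described rather than any particular program. The helpers quote, unquote and round_trip are hypothetical names.

```python
import re


def quote(line, fmt):
    """Quote a body line the way a writer of the given format would."""
    pattern = r"From " if fmt == "mboxo" else r">*From "
    return ">" + line if re.match(pattern, line) else line


def unquote(line, fmt):
    """Assumed reader behaviour: strip one '>' from a quoted From line."""
    pattern = r">From " if fmt == "mboxo" else r">+From "
    return line[1:] if re.match(pattern, line) else line


def round_trip(line, write_fmt, read_fmt):
    """Write a line in one format, then read it back in another."""
    return unquote(quote(line, write_fmt), read_fmt)


samples = ["From x", ">From x", ">>From x"]
for write_fmt in ("mboxo", "mboxrd"):
    for read_fmt in ("mboxo", "mboxrd"):
        # Collect only the samples that do not survive the round trip.
        damaged = {}
        for sample in samples:
            result = round_trip(sample, write_fmt, read_fmt)
            if result != sample:
                damaged[sample] = result
        print(f"write={write_fmt:6} read={read_fmt:6} damaged={damaged}")
```

Under these assumptions only the mboxrd/mboxrd pairing leaves all three samples intact. Writing mboxrd and reading mboxo is the one pairing in which lines gain an extra '>'.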
[issue13698] Mailbox module should support other mbox formats in addition to mboxo
endolith <endol...@gmail.com> added the comment:

Ok. I'm not sure what backwards compatibility issues would exist, though. The only difference is that mboxrd converts

\nFrom → \n>From
\n>From → \n>>From

making the conversion reversible, while mboxo does

\nFrom → \n>From
\n>From → \n>From (no change)

which is ambiguous, and both get converted back to \nFrom when converting back to text, corrupting the original message.

mboxrd is essentially a bugfix for mboxo rather than a fundamentally different format.
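The write-side difference described above can be sketched in a few lines. This is an illustration of the two quoting rules as stated in the message, not the mailbox module's code; quote_mboxo and quote_mboxrd are hypothetical names.

```python
import re


def quote_mboxo(line):
    """mboxo: prefix '>' only to lines that start with 'From '."""
    return ">" + line if line.startswith("From ") else line


def quote_mboxrd(line):
    """mboxrd: prefix '>' to 'From ' lines with zero or more leading '>'."""
    return ">" + line if re.match(r">*From ", line) else line


originals = ["From my point of view", ">From my point of view"]

# mboxo maps both originals to the same stored line, so a reader cannot
# tell which one was meant.
mboxo_written = [quote_mboxo(line) for line in originals]

# mboxrd keeps them distinct, so stripping one '>' restores each original.
mboxrd_written = [quote_mboxrd(line) for line in originals]

print(mboxo_written)
print(mboxrd_written)
```

Because two different inputs collapse to one stored line under mboxo, no unquoting rule can recover both, which is the ambiguity the message refers to.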
[issue13698] Mailbox module should support other mbox formats in addition to mboxo
R. David Murray <rdmur...@bitdance.com> added the comment:

If that's really the only difference we might indeed be able to treat it as a bug fix. I'd have to look at a proposed patch to be sure.
[issue13698] Mailbox module should support other mbox formats in addition to mboxo
R. David Murray <rdmur...@bitdance.com> added the comment:

Well, supporting the other variants would be good (I'll review any proposed patches), but I think the default will have to stay mboxo for backward compatibility reasons (unless the consensus is to go through the warning/deprecation cycle to change it).

As a new feature, this could only go into 3.3 or later.

--
nosy: +r.david.murray
stage: -> needs patch
title: Mailbox module should not use mboxo format -> Mailbox module should support other mbox formats in addition to mboxo
type: behavior -> enhancement
versions: -Python 2.7