On 19 August 2017 at 10:25, Daniel Gruno <[email protected]> wrote:
> On 08/19/2017 10:46 AM, sebb wrote:
>> -1
>>
>> Sorry, but changing an ID generator will break some Permalinks if the
>> messages are ever reparsed.
>
> I'm sorry, but I don't accept this as a valid reason for a veto.
My point is that you should not force a new ID strategy on existing
installations.
By all means add a new ID generator with the fix, but don't change the
status quo.
> The present first-chance generator is horribly broken as is, as it only
> relies on 'archived-at' (which may not exist and thus will default to
> $now which will always change!) and the list ID, meaning that
> reimporting is *already broken* for these messages if archived-at is
> either missing or not the same. If the DB gets lost or corrupted, and
> you re-import, you will already end up with new permalinks should they
> fall back to this generator due to a missing body.
>
> IMHO, the first generator also needs to change or it will not be
> reliable at all. but I have not looked at that yet.
>
> We start off on line 642 with:
> if not msg.get('archived-at'):
> msg.add_header('archived-at', email.utils.formatdate())
>
> then we generate a buggy ID on line 267 with:
> mid = hashlib.sha224(str("%s-%s" % (lid,
> msg_metadata['archived-at'])).encode('utf-8')).hexdigest() + "@" + (lid
> if lid else "none")
>
> This means, that if either the archived-at header differs, or if it's
> not present, you'll never get the same ID unless you reimport from the
> fixed-up mbox source from ES...which you probably don't have if you're
> trying to reimport an email. Reimporting from an external mbox file or
> other source will change the ID on every single import, as archived-at
> (if it's not present) will change to whatever time it is when you import.
>
> TL;DR: reimporting is already broken for messages without a main body, I
> don't believe there's technical merit in sticking with a broken format
> for the sake of maintaining status quo. status quo is broken, and no
> matter how we fix it, IDs may change as a result on a re-import from
> older installations. That is a consequence we should accept.
>
> With regards,
> Daniel.
>
>>
>> It will also cause duplicate messages if the messages are ever
>> exported and re-imported without deleting the originals.
>>
>> A possible work-round is to create a new version of the medium
>> generator that fixes this issue.
>> New installations can then use that if they wish.
>>
>> But existing installations cannot just change ID generator and hope
>> that things will be OK
>>
>>
>> On 19 August 2017 at 07:11, <[email protected]> wrote:
>>> Repository: incubator-ponymail
>>> Updated Branches:
>>> refs/heads/master be430c666 -> 58ce843d7
>>>
>>>
>>> Make sure we have a body, or ID generation will fail.
>>>
>>> If body is None, PM tries to encode it and fails, falling back
>>> on the legacy ID generator, which is not guaranteed to be
>>> the same across the board. Fix this by setting the body
>>> to an empty string if not found.
>>>
>>>
>>> Project: http://git-wip-us.apache.org/repos/asf/incubator-ponymail/repo
>>> Commit:
>>> http://git-wip-us.apache.org/repos/asf/incubator-ponymail/commit/58ce843d
>>> Tree:
>>> http://git-wip-us.apache.org/repos/asf/incubator-ponymail/tree/58ce843d
>>> Diff:
>>> http://git-wip-us.apache.org/repos/asf/incubator-ponymail/diff/58ce843d
>>>
>>> Branch: refs/heads/master
>>> Commit: 58ce843d79d413f3d3b49647d26f33b41ec9ff2d
>>> Parents: be430c6
>>> Author: Daniel Gruno <[email protected]>
>>> Authored: Sat Aug 19 08:11:00 2017 +0200
>>> Committer: Daniel Gruno <[email protected]>
>>> Committed: Sat Aug 19 08:11:00 2017 +0200
>>>
>>> ----------------------------------------------------------------------
>>> CHANGELOG.md | 1 +
>>> tools/generators.py | 4 ++++
>>> 2 files changed, 5 insertions(+)
>>> ----------------------------------------------------------------------
>>>
>>>
>>> http://git-wip-us.apache.org/repos/asf/incubator-ponymail/blob/58ce843d/CHANGELOG.md
>>> ----------------------------------------------------------------------
>>> diff --git a/CHANGELOG.md b/CHANGELOG.md
>>> index 05fab5d..e180827 100644
>>> --- a/CHANGELOG.md
>>> +++ b/CHANGELOG.md
>>> @@ -1,4 +1,5 @@
>>> ## CHANGES in 0.10:
>>> +- Fixed an issue with ID generation where an email body does not exist
>>> - Fixed an issue where favorites could contain null entries due to missing
>>> GC (#392)
>>> - Added sample configs for Pony Mail (#374)
>>> - Renamed ll.py to list-lists.py (#378)
>>>
>>> http://git-wip-us.apache.org/repos/asf/incubator-ponymail/blob/58ce843d/tools/generators.py
>>> ----------------------------------------------------------------------
>>> diff --git a/tools/generators.py b/tools/generators.py
>>> index 2ac2ca8..d8c373d 100644
>>> --- a/tools/generators.py
>>> +++ b/tools/generators.py
>>> @@ -55,6 +55,8 @@ def medium(msg, body, lid, attachments):
>>> attachments - list of attachments (not used)
>>> """
>>> # Use text body
>>> + if not body: # Make sure body is not None, which will fail.
>>> + body = ""
>>> xbody = body if type(body) is bytes else body.encode('ascii', 'ignore')
>>> # Use List ID
>>> xbody += bytes(lid, encoding='ascii')
>>> @@ -94,6 +96,8 @@ def redundant(msg, body, lid, attachments):
>>> attachments - list of attachments (uses the hashes)
>>> """
>>> # Use text body
>>> + if not body: # Make sure body is not None, which will fail.
>>> + body = ""
>>> xbody = body if type(body) is bytes else body.encode('ascii', 'ignore')
>>> # Use List ID
>>> xbody += bytes(lid, encoding='ascii')
>>>
>