On Tue, 25 Aug 2020 at 20:11, Daniel Gruno <[email protected]> wrote:
>
> On 25/08/2020 20.54, sebb wrote:
> > On Tue, 25 Aug 2020 at 19:42, Daniel Gruno <[email protected]> wrote:
> >>
> >> On 25/08/2020 20.35, sebb wrote:
> >>> On Tue, 25 Aug 2020 at 19:23, Daniel Gruno <[email protected]> wrote:
> >>>>
> >>>> On 25/08/2020 20.15, sebb wrote:
> >>>>> AFAICT this will generate different hashes for the same message if
> >>>>> they are loaded from a different source.
> >>>>
> >>>> Yeah, it will - at present, that is on purpose. We can look at doing
> >>>> something like using Sean's DKIM parser for this, and only hashing the
> >>>> output from that, with the x-archived-list-id added in from the command
> >>>> line --lid argument if different from the canonical list id.
> >>>>
> >>>>>
> >>>>> Whilst it should ensure that distinct messages don't clash, it won't
> >>>>> weed out actual duplicates.
> >>>>
> >>>> Right, aware of that. In most cases, if you are reloading, you are doing
> >>>> so with a fresh DB, and it won't matter much. In cases where you are
> >>>> "cascading" mbox files, it would make duplicates, but that's only a
> >>>> question of disk space for now, having duplicate source files won't
> >>>> cause malfunctions, just a few more bytes used and source alternatives.
> >>>
> >>> This has implications for the API and the UI.
> >>>
> >>> If there are multiple matches for a Permalink, in general one cannot
> >>> say which is correct, so all will have to be returned and displayed.
> >>
> >> I'm pondering how to address this. Currently, the prototype will return
> >> the first hit it finds that matches. This should really be fine, as they
> >> are all valid sources, so returning one or the other would not matter
> >> for the end-user.
> >
> > This assumes that the Permalink is sufficiently unique.
> > That is not true for some of the current designs.
> >
>
> This would be the case only if you lost your database and decided to
> re-image everything from scratch using foal with an older generator
> instead of the original pony mail, and two or more emails had collisions.
>
> I would strongly recommend against doing this unless you have no other
> choice or do not care about older permalinks that much.
>
> Foal is not meant as a drop-in replacement for the current Pony Mail. If
> you lose your old database and want complete assuredness against this,
> you should re-image using the old version first, and then migrate
> across. There will be differences in both the archiver and the UI that
> are not fully backwards compatible, as the 'old ways' are bugged here
> and there.
>
> The migrator will, once it's done, migrate everything over verbatim, so
> any overrides you had in the old system will apply to the new one as
> well, and you won't see multiple choices for old emails, only newly
> archived ones done with the foal archiver or importer.

If Foal is to support non-unique generators, it must use their Permalinks as the database Id, or it must support multiple matches.
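
To illustrate the second option, here is a rough sketch of a lookup that
returns every match for a Permalink. This is not Foal code: the Archive
class, its add/lookup methods and the in-memory dicts are made up for the
example, and the database Id is assumed to be a SHA-256 of the full source.

```python
import hashlib
from collections import defaultdict


class Archive:
    """Hypothetical in-memory store; stands in for the real database."""

    def __init__(self):
        self.docs = {}  # database Id -> raw message source
        self.by_permalink = defaultdict(list)  # Permalink -> database Ids

    def add(self, raw: bytes, permalink: str) -> str:
        # Assumption: the database Id is a digest of the whole source,
        # so it is unique even when the Permalink generator is not.
        dbid = hashlib.sha256(raw).hexdigest()
        if dbid not in self.docs:  # skip exact duplicates of the same source
            self.docs[dbid] = raw
            self.by_permalink[permalink].append(dbid)
        return dbid

    def lookup(self, permalink: str) -> list:
        # Return every message sharing this Permalink, not just the first hit.
        return [self.docs[d] for d in self.by_permalink.get(permalink, [])]


archive = Archive()
# Two distinct messages that a weak generator mapped to the same Permalink.
archive.add(b"Message-ID: <a@example.org>\n\nfirst", "abcd")
archive.add(b"Message-ID: <b@example.org>\n\nsecond", "abcd")
matches = archive.lookup("abcd")
```

With this shape the API/UI receives a list and has to decide how to present
more than one entry, rather than silently picking whichever was found first.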
