Stephen Finucane <step...@that.guru> writes: > On Sun, 2018-02-25 at 01:50 +1100, Daniel Axtens wrote: >> Parallel parsing would occasonally fail with: >> >> patchwork.models.MultipleObjectsReturned: get() returned more than one >> SeriesReference -- it returned 2! >> >> I think these are happening if you have different processes parsing >> e.g. 1/3 and 2/3 simultaneously: both will have a reference to 1/3, >> in the case of 1 it will be the msgid, in the case of 2 it will be >> in References. So when we come to parse 3/3, .get() finds 2 and >> throws the exception. >> >> This does not fix the creation of multiple series references; it >> just causes them to be ignored. We still have serious race conditions >> with series creation, but I don't yet have clear answers for them. >> With this patch, they will at least not stop patches from being >> processed - they'll just lead to wonky series, which we already have. >> >> Reviewed-by: Andrew Donnellan <andrew.donnel...@au1.ibm.com> >> Signed-off-by: Daniel Axtens <d...@axtens.net> > > Correct me if I'm wrong, but this seems to highlight two issues: > > * The race condition between searching for a series reference and > usingĀ it > * The fact we expect a series to be unique for a given msgid and > project, yet the 'unique_together' is for a given msgid and > *series*. > > The first of these is caught here (yet not fixed, as it's a non-trivial > thing to resolve). The second should still be fixed though. Should we > do this here, as part of this series?
My thinking is no, for these reasons: - I want a patch I can backport to stable without requiring a database migration, because: * that seems like a very unusual thing for a stable update * I don't trust my ability to get it right for a stable release * relatedly, it will require a lot of testing for me to trust it, and I want something sooner rather than later so that the Buildroot people can get on with their lives. * The different numbering of migrations will make the different upgrade paths a nightmare to describe, yet alone test - I have a long-term ongoing effort to move 'project' up into the various tables that use it so as to avoid JOINs and this would be a good thing to do in that effort. (It's the next big push on my list, it's just a nightmare to get the compatibility and migration code going.) Regards, Daniel > > Stephen > >> --- >> patchwork/parser.py | 23 +++++++++++++++++++++-- >> 1 file changed, 21 insertions(+), 2 deletions(-) >> >> diff --git a/patchwork/parser.py b/patchwork/parser.py >> index 56dc7006c811..5a7344cee93c 100644 >> --- a/patchwork/parser.py >> +++ b/patchwork/parser.py >> @@ -240,6 +240,13 @@ def _find_series_by_references(project, mail): >> msgid=ref[:255], series__project=project).series >> except SeriesReference.DoesNotExist: >> continue >> + except SeriesReference.MultipleObjectsReturned: >> + # FIXME: Open bug: this can happen when we're processing >> + # messages in parallel. Pick the first and log. >> + logger.error("Multiple SeriesReferences for %s in project %s!" % >> + (ref[:255], project.name)) >> + return SeriesReference.objects.filter( >> + msgid=ref[:255], series__project=project).first().series >> >> >> def _find_series_by_markers(project, mail, author): >> @@ -1037,6 +1044,9 @@ def parse_mail(mail, list_id=None): >> series__project=project) >> except SeriesReference.DoesNotExist: >> SeriesReference.objects.create(series=series, msgid=ref) >> + except SeriesReference.MultipleObjectsReturned: >> + logger.error("Multiple SeriesReferences for %s" >> + " in project %s!" % (ref, project.name)) >> >> # add to a series if we have found one, and we have a numbered >> # patch. Don't add unnumbered patches (for example diffs sent >> @@ -1075,6 +1085,11 @@ def parse_mail(mail, list_id=None): >> msgid=msgid, series__project=project).series >> except SeriesReference.DoesNotExist: >> series = None >> + except SeriesReference.MultipleObjectsReturned: >> + logger.error("Multiple SeriesReferences for %s" >> + " in project %s!" % (msgid, project.name)) >> + series = SeriesReference.objects.filter( >> + msgid=msgid, series__project=project).first().series >> >> if not series: >> series = Series(project=project, >> @@ -1087,8 +1102,12 @@ def parse_mail(mail, list_id=None): >> # we don't save the in-reply-to or references fields >> # for a cover letter, as they can't refer to the same >> # series >> - SeriesReference.objects.get_or_create(series=series, >> - msgid=msgid) >> + try: >> + SeriesReference.objects.get_or_create(series=series, >> + msgid=msgid) >> + except SeriesReference.MultipleObjectsReturned: >> + logger.error("Multiple SeriesReferences for %s" >> + " in project %s!" % (msgid, project.name)) >> >> cover_letter = CoverLetter( >> msgid=msgid, _______________________________________________ Patchwork mailing list Patchwork@lists.ozlabs.org https://lists.ozlabs.org/listinfo/patchwork