On 2016-10-26 09:24, Daniel Axtens wrote:
Hi Stephen,

+    for ref in find_references(mail, True):
+        # try parsing by RFC5322 fields first
+        try:
+            return SeriesReference.objects.get(msgid=ref).series
+        except SeriesReference.DoesNotExist:
+            pass

Hopefully I've understood what's going on here:
 - first, check for a Series that directly matches this message's
   message id (hence the True to find_references)
 - then, check for a series that matches an In-Reply-To
 - then, check for a series that matches the References, from most
   recent to least recent (hopefully!)


Is that right?

Sure is. We prefer locality to the current email where possible. I've added comments to clarify this.

I think I would prefer pulling out the check for the Message ID into a
different test, rather than the extra flag to find_references - I'm not
sure a message ID is a reference in the sense in the usual sense. But
I'm not particularly fussy - if you've considered this approach and
rejected it for some reason that's fine.

Nope - it was this way in previous revisions and this seemed cleaner. Reverted to the older approach.

+def _parse_prefixes(subject_prefixes, regex):
+    for prefix in subject_prefixes:
+        m = regex.match(prefix)
+        if m:
+            return m
+
+
_first_prefix_match or _first_matching_prefix perhaps?

Done.

+    regex = re.compile('^[vV](\d+)$')
+    m = _parse_prefixes(subject_prefixes, regex)
+    if m:
+        return int(m.group(1))
+
+    m = re.search(r'\([vV](\d+)\)', subject)
Should this regex be precompiled at the head of the file? I also notice
you've attempted to compile it at the head of the function, you're just
not using the compiled version here...

Is seems Python caches this for us anyway [1], so I see no real reason. We compile in the first case simply for readability's sake, however, the regexes are not the same so we can't reuse it for the second case (note the additional brackets).

[1] http://stackoverflow.com/a/452143/613428

+        series = find_series(mail)
+        if not series and n:  # the series markers indicates a series
+            series = SeriesRevision(date=date,
+                                    submitter=author,
+                                    version=version,
+                                    total=n)
+            series.save()
+
+            for ref in refs + [msgid]:  # save references for series
+                # we don't want duplicates
+                SeriesReference.objects.get_or_create(series=series,
msgid=ref)

I must admit I don't quite understand the benefit of creating all the
series references straight away. What is the advantage of creating them
here as opposed to when the subsequent patches are received? (I could be
completely misunderstanding some key part of the parsing here -
apologies if so!)

The key reason is to handle patches received out-of-order. Without storing references, we have no way to identify a relationship between the two patches as the RFC822 headers (References, Message-Id, In-Reply-To) only refer to older, prior mails. This fixes that issue.

FWIW, another idea was to store the root message ID against the series (I think the freedesktop fork does this), but doing so would require some magic for deeply threaded series (where each patch in sent in reply to the previous patch). In addition, I don't think any amount of magic would work for series sent in reply to a previous series.

+            try:
+ series = SeriesReference.objects.get(msgid=msgid).series
+            except SeriesReference.DoesNotExist:
+                series = None
django shortcuts has a get_object_or_None - I'll leave it to you to
weigh whether or not the brevity is worth the additional dependency.

I don't think it is. TBH, if we could get rid of the non-Django dependencies we do have, I'm sure we'd make some kernel.org sysadmins very happy. Alas, django-rest-framework is far too good to ignore.

The tests and the rest of the code look good!

Thanks for the reviews :)

Stephen
_______________________________________________
Patchwork mailing list
[email protected]
https://lists.ozlabs.org/listinfo/patchwork

Reply via email to