Sebb created COMDEV-292:
---------------------------

             Summary: Mailglomper does not handle renamed lists well
                 Key: COMDEV-292
                 URL: https://issues.apache.org/jira/browse/COMDEV-292
             Project: Community Development
          Issue Type: Bug
          Components: Reporter Tool
            Reporter: Sebb

The mailglomper script does not take account of renamed mailing lists.
This can result in double counting the activity for a project.

For example, commits@libcloud was renamed to notifications@libcloud in March 2014.
However the data in the maildata_extended.json file includes weekly epoch entries for commits:

1507161600 2017-10-05 00:00:00 UTC
to
1524096000 2018-04-19 00:00:00 UTC

whereas notifications has:

1515024000 2018-01-04 00:00:00 UTC
to
1531958400 2018-07-19 00:00:00 UTC

The weekly counts agree for the overlap period.

If the commits mbox files were still present up to April 2018, there would be an index entry for the list, and if there was also a redirect in place, the code would see the redirected files.

I think the code should probably ignore redirects if that's possible.

When a list is renamed, the old data ought to be dropped, otherwise it may be double-counted.
Also the obsolete entries will gradually accumulate.

This applies to both the maildata_weekly.json and maildata_extended.json files.
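
The clean-up suggested above could look roughly like the following. This is only a sketch under assumptions: the real layout of maildata_extended.json is not shown in this issue, so the code assumes a mapping of list name to {weekly epoch: count}, and the `renames` table, the function names and the counts in the example are all hypothetical. It is not mailglomper's actual code.

```python
from datetime import datetime, timezone


def epoch_to_utc(ts):
    """Format a weekly epoch the way the entries are quoted in this issue."""
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S UTC")


def overlapping_weeks(old, new):
    """Return the sorted weekly epochs that both lists report with equal counts.

    Equal counts in the same week suggest the same mail was seen twice,
    e.g. once directly and once through a redirect.
    """
    return sorted(week for week in old.keys() & new.keys() if old[week] == new[week])


def drop_renamed(maildata, renames):
    """Return a copy of maildata without the entries for renamed (old) lists.

    ASSUMPTION: maildata maps list name -> {weekly epoch: count} and
    renames maps old list name -> new list name. Both are hypothetical.
    """
    return {name: weeks for name, weeks in maildata.items() if name not in renames}


# Epochs are taken from the issue text; the counts are invented for illustration.
maildata = {
    "commits@libcloud": {1515024000: 12, 1524096000: 7},
    "notifications@libcloud": {1515024000: 12, 1524096000: 7, 1531958400: 9},
}
renames = {"commits@libcloud": "notifications@libcloud"}

duplicated = overlapping_weeks(
    maildata["commits@libcloud"], maildata["notifications@libcloud"]
)
cleaned = drop_renamed(maildata, renames)
```

With the sample data, `duplicated` holds the two weeks that would otherwise be counted twice, and `cleaned` keeps only the notifications list. Where the rename table would come from (a hand-maintained file, or detecting redirects) is left open here.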