On Fri, 2005-06-10 at 10:33 +1200, Simon Kitching wrote:
> On Thu, 2005-06-09 at 20:29 +0100, robert burrell donkin wrote:
> > On Thu, 2005-06-09 at 10:41 +1200, Simon Kitching wrote:
> > > On Wed, 2005-06-08 at 22:21 +0100, robert burrell donkin wrote:
> > > >
> > > > ===
> > > > > Maven reports:
> > > > >
> > > > > I would suggest disabling this report. Firstly, a log of the last 30
> > > > > days isn't of much use. And secondly, due to the import into SVN of
> > > > > back-dated CVS changes, date-based selection on the apache subversion
> > > > > repository is broken, so the report is not just useless but actively
> > > > > WRONG.
> > > > >
> > > > > I suggest that "Developer Activity" and "File Activity" reports are
> > > > > also useless, and (if based on SVN date selection) also wrong.
> > > >
> > > > AIUI the problem occurs only with the dates on the imported data. new
> > > > data is fine. i've checked the results and they look about right. i do
> > > > agree that they aren't all that useful but i know some users like them
> > > > so i'm inclined to keep them...
> > >
> > > Unfortunately the date problem is repository-wide, and long-lasting.
> > >
> > > When subversion is passed a date, it immediately converts this into a
> > > revision-number. And it does this by performing a binary search on its
> > > revisions.
> > >
> > > Assuming there are 1000 revisions currently in the repository, it
> > > * checks whether revision 500 is earlier or later than the desired date
> > > * 500 is earlier, so revision 750 is checked
> > > * 750 is earlier, so 875 is checked
> > > * 875 is later, so 812 is checked, etc
> > >
> > > This process is based on a fundamental assumption that revision X has an
> > > earlier date than revision X+1. When this assumption is broken, the
> > > binary search can go off in the wrong direction. And it looks to me like
> > > after a "problem" import date selections will continue to be broken
> > > until the revision# has at least doubled in size.
> > >
> > > Whether you actually get bitten by the problem for a particular
> > > date-based select is a bit of a lottery; if the binary search happens to
> > > hit "valid" nodes all the way down, the search will work correctly. But
> > > hit the wrong node and the select can be a long way out.
> > >
> > > Alas, it is just not possible to "insert" revisions into an existing
> > > repository; when importing data it can only be added as new revisions.
> > >
> > > So the current choice when importing CVS history is
> > > (1) stuff up all date-based selections *repository-wide*, or
> > > (2) discard all date information associated with imported CVS history,
> > > and put "current" dates against the revisions.
> >
> > thanks for the detailed information
> >
> > once all the CVS repositories have been imported, this should no longer
> > be a problem but until then i'll disable the reports.
>
> Alas, I don't think so. As I said above, I believe that after the last
> CVS import has been done, we will then need to wait for the # of
> revisions in the repository to double before date-based searching is
> completely reliable again.
>
> I expect that the revision# will reach about 200,000 by the time
> everything is imported into SVN. So when the count reaches 400,000
> everything will be right again. That should only take a decade or so :-(
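
For illustration, the binary search described in the quoted text can be
sketched in Python as below. This is a toy model under stated assumptions,
not Subversion's actual code: the function name dated_revision, the
ten-revision history and all of its dates are made up for the example.
Revisions 4 to 6 stand in for a back-dated CVS import that is assumed to
have been committed in mid-May 2005.

```python
from datetime import date


def dated_revision(dates, target):
    """Binary search for the highest revision dated on or before target.

    dates[i] is the recorded date of revision i. The search assumes the
    dates never decrease as the revision number grows. Returns -1 if no
    revision qualifies.
    """
    lo, hi = 0, len(dates) - 1
    result = -1
    while lo <= hi:
        mid = (lo + hi) // 2
        if dates[mid] <= target:
            result = mid      # candidate found, look for a later one
            lo = mid + 1
        else:
            hi = mid - 1      # too new, look at lower revisions
    return result


# Hypothetical history (illustrative dates only).
DATES = [
    date(2005, 1, 10),  # r0  ordinary commit
    date(2005, 2, 10),  # r1  ordinary commit
    date(2005, 3, 10),  # r2  ordinary commit
    date(2005, 4, 10),  # r3  ordinary commit
    date(2003, 5, 1),   # r4  imported CVS history, back-dated
    date(2003, 6, 1),   # r5  imported CVS history, back-dated
    date(2003, 7, 1),   # r6  imported CVS history, back-dated
    date(2005, 5, 20),  # r7  first ordinary commit after the import
    date(2005, 6, 1),   # r8  ordinary commit
    date(2005, 6, 8),   # r9  ordinary commit
]

if __name__ == "__main__":
    for target in (date(2005, 3, 15), date(2005, 5, 25), date(2005, 6, 5)):
        print(target, "-> r%d" % dated_revision(DATES, target))
```

With this toy history, the query for 2005-03-15 lands on r6 (imported data
that was not yet in the repository on that date) rather than r2, while the
two queries dated after the import land on r7 and r8 as expected.
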

once all the projects have been imported, this issue should only affect
the reliability of searches for dates before the date of the last import.
the binary search should work reliably for dates after that. AFAIK the
reports in question only require searches to function reliably for the
last 30 days.

(though some higher-numbered revisions from the import period may have
dates earlier than lower-numbered revisions, all of them must have dates
before the date being searched for. therefore, the binary search should
always choose the upper half until it reaches the region where the expected
ordering holds. the search will work as expected within that region, and so
it should prove reliable for dates after the last import.)

- robert
