Paul Burba wrote:
> Hi Mike and Julian,
> 
> Sorry for the delayed response; been working on/thinking about other
> mergey stuff lately...
> 
> A few inline comments and then a proposed course of action at the end.
> 
> On Thu, Apr 15, 2010 at 9:28 AM, C. Michael Pilato <cmpil...@collab.net> 
> wrote:
>> The question then becomes, "How do users deal with legitimate partial dumps
>> that will be loaded atop something other than loads from previous
>> incremental windows?"  I think they do this the same way they handle the
>> copy case:  either hand-hacking the dumpstream or using 'svndumpfilter'.  So
>> maybe 'svndumpfilter' grows the logic and options required to pare away
>> mergeinfo that doesn't meet some criteria (such as "ranges outside the dump
>> stream range")?
> 
> I vote for svndumpfilter growing the logic to filter ranges predating
> the start of the dump stream in the same way it strips mergeinfo with
> merge sources filtered from the dump stream.  Basically this means
> moving the r927243 functionality to svndumpfilter.  No really sure we
> need a new option, seems we could overload
> --skip-missing-merge-sources to include this behavior.  Thoughts?

+1

>> Julian Foad wrote:
>>> On Wed, 2010-04-14, C. Michael Pilato wrote:
>>>> Assuming similar behavior for mergeinfo handling, we have, at a minimum:
>>>>
>>>> 1.  'svnadmin dump' warns when it is in incremental mode and must generate
>>>> mergeinfo from a merge source that predates the beginning of the dump
>>>> window, but it's only a warning and the dump continues.
> 
> Why only --incremental mode?  We should warn for both --incremental
> dumps *and* non-incremental dumps because in the latter case we might
> be specifying a dump range -rX:Y where rX>0* and there might be
> mergeinfo that refers to revs < X.  And while we might want to be
> clever about skipping the warning when no range is specified (i.e.
> defaulting to -r0:HEAD), even then we should warn because issue #3020
> might already have resulted in a r1 with mergeinfo in the starting
> repository.

You're right, of course.  I was thinking "partial dump" and saying
"incremental".  My bad.

>>>> 2.  'svnadmin load' does nothing smart, trusting that the dump it's being
>>>> fed is a sensible one.
>>> Better: 'svnadmin load' tries to validate the mergeinfo, and issues a
>>> warning if it refers to a non-existent source.  The admin then gets to
>>> figure out whether it was bad in the first place or because of his/her
>>> partial-dump/partial-load scenario.
>>>
>>> However, it depends how efficient the checking is.  If that would make
>>> the 'load' really slow, I can see that not being wanted.

[...]

> No unsurprisingly, this took significantly (62%) longer, 00:05:39.609.

(My vote was for 'svnadmin load' does nothing smart, and I stand by it.)

> Taking what you've had to say into consideration, I propose the following:
> 
> 1) Revert http://svn.apache.org/viewvc?view=revision&revision=927243.
> This stops svnadmin load from automatically filtering mergeinfo
> referring to revisions outside of the load stream.
> 
> 2) Reapply or reimplement the part of r327243 that accounts for the
> case where the first loaded revision in a load stream has mergeinfo,
> see http://subversion.tigris.org/issues/show_bug.cgi?id=3020#desc10
> item #1.

(r927243, you mean.)

> 3) Keep the inline copy-source warnings as they exist today in
> svnadmin dump, but add a new end-of-process summary warning e.g.:
> 
>   * Dumped revision 504.
>   * Dumped revision 505.
>   * Dumped revision 506.
>   * Dumped revision 507.
>   WARNING: Referencing data in revision 250, which is older than the oldest
>   WARNING: dumped revision (504).  Loading this dump into an empty repository
>   WARNING: will fail.
>   * Dumped revision 58.
>   * Dumped revision 59.
>   * Dumped revision 510.
>   * Dumped revision 511.
>   .
>   .
>   .
>   WARNING: The range of revisions dumped contained references to copy
> sources outside that range.
> 
> 4) Implement inline and summary functionality analogous to 3) but for
> mergeinfo ranges outside of the dump stream, e.g.:
> 
>   * Dumped revision 504.
>   * Dumped revision 505.
>   * Dumped revision 506.
>   * Dumped revision 507.
>   WARNING: Mergeinfo referencing revision(s) prior to revision 250,
> which is older
>   WARNING: than the oldest dumped revision (504).  Loading this dump may 
> result
>   WARNING: in invalid mergeinfo.
>   * Dumped revision 58.
>   * Dumped revision 59.
>   * Dumped revision 510.
>   * Dumped revision 511.
>   .
>   .
>   .
>   WARNING: The range of revisions dumped contained references to
> mergeinfo outside that range.
> 
> Also, as mentioned earlier, I think this warning should kick in
> regardless of whether --incremental is used.
> 
> 5) Overload svndumpfilter's --skip-missing-merge-sources option to
> also remove mergeinfo that predates the starting revision of the dump
> stream.

+1 all over this stuff.

> 6) Open: How does svnadmin load deal with mergeinfo that doesn't exist
> in the load steam or the target repos:
> 
>    A) It doesn't.
> 
>    B) Only checks that the revs are valid (fast, but
>       can allow mergeinfo that describes invalid
>       mergesource:rev pairs)
> 
>    C) Find an efficient way to test for valid
>       mergesource:rev pairs

For now, A (or maybe B).  Then we have our Paul back.  :-)

-- 
C. Michael Pilato <cmpil...@collab.net>
CollabNet   <>   www.collab.net   <>   Distributed Development On Demand

Reply via email to