Re: svn commit: r985514 - stream_move_mark()

2010-08-18 Thread Stefan Sperling
On Tue, Aug 17, 2010 at 11:57:11PM +0200, Stefan Fuhrmann wrote:
 I overgeneralized my use-case here: I actually needed move forward only.
 The concept of skip N bytes, however, is perfectly in line with
 the general
 stream semantics. Because this also fits my needs, I changed the API in
 r986485 accordingly.

But the stream implementation may still need to read all bytes until N
anyway, to make sure the internal state is still valid after moving to
offset N.

Of course, we could say that a stream implementation is free to optimize
for this use case if possible, but allow wrapper streams to override
the move forward implementation.

For instance, a stream wrapping an APR file would support move forward,
but as soon as you wrap this APR file stream with a translation stream,
the translation stream's semantics take over and the APR file stream
will effectively be read byte-per-byte when the move forward method
is called on the translation stream. This would allow the translation stream
to keep its internal state intact during a move forward operation.

Performance benefits would then depend on the type of stream being used.

Does this make sense?

 The problem is that a lot of parser code would need to change at least its
 function signatures to string buffers and offsets instead of
 streams. I'm not
 even sure that all streams are based on strings; some may be parsed on
 APR file as well and use the same parser code (e.g. read hash from stream).

Sure, that's what I meant. The patch code for instance uses multiple streams
all mapped onto a subset of a file, and each such stream does special
filtering and transformations on the text read from the file.
See the definition of svn_hunk_t in include/svn_diff.h.

Stefan


Re: Branch audit

2010-08-18 Thread Stefan Sperling
On Tue, Aug 17, 2010 at 05:33:13PM -0500, Hyrum K. Wright wrote:
 It's that time again!  In looking at the list of current branches
 (excepting release and backport branches I've noticed several which
 look to be abandoned or unmaintained.  Branches which have not been
 touched in over a year include (along with author and date of last
 revision, stolen from ViewVC):
 
   artem-soc-work/  r863880 3 years djames  Really, 
 actually, merge
 r23802 and r23803 from trunk to the artem-soc-work branc…
   atomic-revprop/  r984264 6 days  danielshTweak the 
 behaviour of a
 recently-added API to allow conveniently passing errors…
   bdb-fixes/   r858217 4 years brane   Working on issue 
 #2449. Move
 error_info refcount management where it belongs, t…
   capabilities-abstraction/r869334 2 years kfogel  On the
 capabilities-abstraction branch: Start unifying some of the
 capabilities…
   issue-2699-dev/  r863182 3 years dlr On the 
 issue-2699-dev
 branch: Improve locking error message. * subversion/libsv…
   issue-2897/  r869309 2 years kameshj On the 
 issue-2897 branch:
 Improve documentation. * subversion/libsvn_client/me…
   issue-2897-take2/r875276 19 months   kameshj On 
 issue-2897-take2:
 Merge from ^/trunk revisions r35038 through r35201.
   issue-3081/  r869126 2 years kfogel  On the issue-3081 
 branch: *
 3081-repro/: New directory, with patches, repro scr…
   meta-data-versioning/r858153 4 years pmarek  Change 
 the live
 properties only if they have really changed. Without this chan…
   perl-bindings-improvements/  r867175 2 years jpeacock   
  Merge with
 tr...@27098
   record_exact_merge_and_commit_revs/  r868556 2 years 
 kameshj On
 the 'record_exact_merge_and_commit_revs' branch, merge 28395:28481
 from /trun…
   scheme-bindings/ r869778 2 years
   scons-build-system/  r862146 3 years danderson   Make 
 the source
 path creator safer. * build/builder.py (SvnBuild._make_src_…
   server-l10n/ r857339 4 years dlr * 
 subversion/mod_dav_svn/lang.c
 (sort_lang_pref): Swap return values to correc…
   status--filter/  r867402 2 years epg *** Just 
 stashing this away
 until after 1.5 branches. Add status --filter, for …
   svnserve-ssl/r859467 4 years mbk Make svnserve 
 SSL config items
 repo-conf-dir-relative, rather than absolute path…
   youngest-common-ancestor/r867405 2 years cmpilato   
  Begin work on
 a client-side youngest-common-ancestor API. NOTE: This code is…

That list is quite hard to read here.
Here's a shorter alternative (svn ls -v ^/subversion/branches):

 986608 stefan2   Aug 18 11:29 ./
 873205 hwright   Sep 17  2008 1.0.x/
 873092 cmpilato  Sep 10  2008 1.1.x/
 873092 cmpilato  Sep 10  2008 1.2.x/
 873208 hwright   Sep 17  2008 1.3.x/
 904181 philipJan 28  2010 1.4.x/
 904182 philipJan 28  2010 1.5.x/
 875595 stylesen  Jan 28  2009 1.5.x-issue2489/
 877232 pburbaApr 10  2009 1.5.x-issue3392/
 876852 pburbaMar 25  2009 1.5.x-r36775/
 877215 cmpilato  Apr 09  2009 1.5.x-r37137/
 877733 stylesen  May 09  2009 1.5.x-r37646/
 877856 pburbaMay 19  2009 1.5.x-r37779/
 879745 pburbaSep 29  2009 1.5.x-r39109/
 880457 stsp  Nov 05  2009 1.5.x-r40200/
 984934 danielsh  Aug 12 21:54 1.6.x/
 956644 pburbaJun 21 19:16 1.6.x-issue3242-reintegrate/
 958108 pburbaJun 25 22:42 1.6.x-issue3646/
 964472 pburbaJul 15 17:47 1.6.x-issue3648/
 966852 pburbaJul 22 22:43 1.6.x-issue3657/
 966542 stylesen  Jul 22 10:45 1.6.x-issue3683/
 879771 stylesen  Sep 30  2009 1.6.x-r39692/
 981459 stsp  Aug 02 13:25 1.6.x-r980811/
 981926 danielsh  Aug 03 17:23 1.6.x-r981921/
 929382 petersMar 31 07:13 1.6.x-wc-ng-error/
 863880 djamesMar 14  2007 artem-soc-work/
 984264 danielsh  Aug 11 01:26 atomic-revprop/
 881688 cmpilato  Nov 18  2009 authz-overhaul/
 858217 brane Jan 18  2006 bdb-fixes/
 869334 kfogelFeb 10  2008 capabilities-abstraction/
 929295 pburbaMar 30 23:22 fs-successor-ids/
 984925 hwright   Aug 12 21:26 ignore-mergeinfo/
 863182 dlr   Jan 19  2007 issue-2699-dev/
 869309 kameshj   Feb 08  2008 issue-2897/
 875276 kameshj   Jan 13  2009 issue-2897-take2/
 869126 kfogelJan 26  2008 issue-3081/
 892311 cmpilato  Dec 

Re: Branch audit

2010-08-18 Thread Hyrum K. Wright
On Wed, Aug 18, 2010 at 11:31 AM, Stefan Sperling s...@elego.de wrote:
 On Tue, Aug 17, 2010 at 05:33:13PM -0500, Hyrum K. Wright wrote:
 It's that time again!  In looking at the list of current branches
 (excepting release and backport branches I've noticed several which
 look to be abandoned or unmaintained.  Branches which have not been
 touched in over a year include (along with author and date of last
 revision, stolen from ViewVC):

   artem-soc-work/      r863880         3 years         djames  Really, 
 actually, merge
 r23802 and r23803 from trunk to the artem-soc-work branc…
   atomic-revprop/      r984264         6 days  danielsh        Tweak the 
 behaviour of a
 recently-added API to allow conveniently passing errors…
   bdb-fixes/   r858217         4 years         brane   Working on issue 
 #2449. Move
 error_info refcount management where it belongs, t…
   capabilities-abstraction/    r869334         2 years         kfogel  On the
 capabilities-abstraction branch: Start unifying some of the
 capabilities…
   issue-2699-dev/      r863182         3 years         dlr     On the 
 issue-2699-dev
 branch: Improve locking error message. * subversion/libsv…
   issue-2897/  r869309         2 years         kameshj         On the 
 issue-2897 branch:
 Improve documentation. * subversion/libsvn_client/me…
   issue-2897-take2/    r875276         19 months       kameshj         On 
 issue-2897-take2:
 Merge from ^/trunk revisions r35038 through r35201.
   issue-3081/  r869126         2 years         kfogel  On the issue-3081 
 branch: *
 3081-repro/: New directory, with patches, repro scr…
   meta-data-versioning/        r858153         4 years         pmarek  
 Change the live
 properties only if they have really changed. Without this chan…
   perl-bindings-improvements/  r867175         2 years         jpeacock      
   Merge with
 tr...@27098
   record_exact_merge_and_commit_revs/  r868556         2 years         
 kameshj         On
 the 'record_exact_merge_and_commit_revs' branch, merge 28395:28481
 from /trun…
   scheme-bindings/     r869778         2 years
   scons-build-system/  r862146         3 years         danderson       Make 
 the source
 path creator safer. * build/builder.py (SvnBuild._make_src_…
   server-l10n/         r857339         4 years         dlr     * 
 subversion/mod_dav_svn/lang.c
 (sort_lang_pref): Swap return values to correc…
   status--filter/      r867402         2 years         epg     *** Just 
 stashing this away
 until after 1.5 branches. Add status --filter, for …
   svnserve-ssl/        r859467         4 years         mbk     Make svnserve 
 SSL config items
 repo-conf-dir-relative, rather than absolute path…
   youngest-common-ancestor/    r867405         2 years         cmpilato      
   Begin work on
 a client-side youngest-common-ancestor API. NOTE: This code is…

 That list is quite hard to read here.
 Here's a shorter alternative (svn ls -v ^/subversion/branches):

  986608 stefan2               Aug 18 11:29 ./
  873205 hwright               Sep 17  2008 1.0.x/
  873092 cmpilato              Sep 10  2008 1.1.x/
  873092 cmpilato              Sep 10  2008 1.2.x/
  873208 hwright               Sep 17  2008 1.3.x/
  904181 philip                Jan 28  2010 1.4.x/
  904182 philip                Jan 28  2010 1.5.x/
  875595 stylesen              Jan 28  2009 1.5.x-issue2489/
  877232 pburba                Apr 10  2009 1.5.x-issue3392/
  876852 pburba                Mar 25  2009 1.5.x-r36775/
  877215 cmpilato              Apr 09  2009 1.5.x-r37137/
  877733 stylesen              May 09  2009 1.5.x-r37646/
  877856 pburba                May 19  2009 1.5.x-r37779/
  879745 pburba                Sep 29  2009 1.5.x-r39109/
  880457 stsp                  Nov 05  2009 1.5.x-r40200/
  984934 danielsh              Aug 12 21:54 1.6.x/
  956644 pburba                Jun 21 19:16 1.6.x-issue3242-reintegrate/
  958108 pburba                Jun 25 22:42 1.6.x-issue3646/
  964472 pburba                Jul 15 17:47 1.6.x-issue3648/
  966852 pburba                Jul 22 22:43 1.6.x-issue3657/
  966542 stylesen              Jul 22 10:45 1.6.x-issue3683/
  879771 stylesen              Sep 30  2009 1.6.x-r39692/
  981459 stsp                  Aug 02 13:25 1.6.x-r980811/
  981926 danielsh              Aug 03 17:23 1.6.x-r981921/
  929382 peters                Mar 31 07:13 1.6.x-wc-ng-error/
  863880 djames                Mar 14  2007 artem-soc-work/
  984264 danielsh              Aug 11 01:26 atomic-revprop/
  881688 cmpilato              Nov 18  2009 authz-overhaul/
  858217 brane                 Jan 18  2006 bdb-fixes/
  869334 kfogel                Feb 10  2008 capabilities-abstraction/
  929295 pburba                Mar 30 23:22 fs-successor-ids/
  984925 hwright               Aug 12 21:26 ignore-mergeinfo/
  863182 dlr                   Jan 19  2007 issue-2699-dev/
  869309 kameshj               Feb 08  2008 issue-2897/
  875276 kameshj               Jan 13  2009 

Re: svn commit: r985514 - stream_move_mark()

2010-08-18 Thread Julian Foad
On Wed, 2010-08-18, Stefan Sperling wrote:
 On Tue, Aug 17, 2010 at 11:57:11PM +0200, Stefan Fuhrmann wrote:
  I overgeneralized my use-case here: I actually needed move forward only.
  The concept of skip N bytes, however, is perfectly in line with
  the general
  stream semantics. Because this also fits my needs, I changed the API in
  r986485 accordingly.
 
 But the stream implementation may still need to read all bytes until N
 anyway, to make sure the internal state is still valid after moving to
 offset N.
 
 Of course, we could say that a stream implementation is free to optimize
 for this use case if possible, but allow wrapper streams to override
 the move forward implementation.
 
 For instance, a stream wrapping an APR file would support move forward,
 but as soon as you wrap this APR file stream with a translation stream,
 the translation stream's semantics take over and the APR file stream
 will effectively be read byte-per-byte when the move forward method
 is called on the translation stream. This would allow the translation stream
 to keep its internal state intact during a move forward operation.
 
 Performance benefits would then depend on the type of stream being used.
 
 Does this make sense?

Yes, that's exactly what I was thinking. Sounds good.

- Julian




Re: svn diff optimization to make blame faster?

2010-08-18 Thread Stefan Sperling
On Wed, Aug 18, 2010 at 12:59:21AM +0200, Johan Corveleyn wrote:
 Hi devs,
 
 While Looking to improve performance of svn annotate [1], I found
 that the current blame algorithm is mainly client-side bound, and that
 most of its time is spent on svn diff (calls to svn_diff_file_diff_2
 from add_file_blame in blame.c). Apart from avoiding to build
 full-texts and diffing them altogether (which is subject of further
 discussion in [1]), I'm wondering if optimization of svn diff
 wouldn't also be an interesting way to improve the speed of blame.
 
 So the main question is: is it worth it to spend time to analyze this
 further and try to improve performance? Or has this already been
 optimized in the past, or is it simply already as optimal as it can
 get? I have no idea really, so if anyone can shed some light ...
 
 Gut feeling tells me that there must be room for optimization, since
 GNU diff seems a lot faster than svn diff for the same large file
 (with one line changed) on my machine [1]. But maybe svn's diff
 algorithm is purposefully different (better? more accurate? ...) than
 GNU's, or there are specific things in the svn context so svn diff has
 to do more work.
 
 Any thoughts?

Can you show a profiler run that illustrates where the client is
spending most of its time during diff? That would probably help with
getting opinions from people, because it saves them from spending time
doing this research themselves.
You've already hinted at svn_diff__get_tokens() in another mail, but
a real profiler run would show more candidates.

Thanks,
Stefan


Re: Webpage showing 1.7 status?

2010-08-18 Thread Hyrum K. Wright
On Tue, Aug 17, 2010 at 6:26 PM, C. Michael Pilato cmpil...@collab.net wrote:
 On 08/17/2010 09:59 AM, Hyrum K. Wright wrote:
 Howdy all.

 All of the wandiscians are having a face-to-face this week, and one of
 the ideas floated was a public-facing page on our website listing the
 items left to complete before the 1.7 release.  In the past, we've had
 a STATUS-1.x in trunk, which has been useful for tracking the stages
 toward the release, and helping people who want to contribute know
 what needs to be done.  By putting that document on the web, we can
 also use it to communicate to the people who are asking about the
 status of 1.7.

 We (Julian, Philip, Erik, and I) can seed this document with the
 results of our discussions this week, and with the release plan I
 posted a few weeks ago.

 Thoughts?

 I'm no fan of spawning tons of little webpages whose meaningful lifespan is
 relatively short.  This is Subversion, so you can point folks directly into
 the version control repository for stuff like that.  However, if you can
 make a case for a webpage that can have a much longer lifespan and serve us
 far into the future, +1 from me.  For example, I can see us maintaining a
 subversion.apache.org/release-status.html page which serves multiple purposes:

   Status of Current Release

   ### Describe the status of the next .0 release we're working toward.
   ### This is the stuff you guys are suggesting.

   Subversion Release Maintenance Policy

   ### Explain our general policy of actively maintaining only one existing
   ### release line save for extremely critical security bug fixes that we
   ### might backport to even older release streams.

   Status of Prior Releases

   ### Note the status of each of our pre-1.7 release streams (1.6.x is open
   ### for maintenance, with a 1.6.13 release due on ?/?/2010; all others
   ### are closed to changes excepting severe security bug fixes.)

 The one downside I see to any of this is that history has shown the
 Subversion developers to be pretty slack about updating web pages.  I guess
 we just don't tend to think most of the time in terms of How can I better
 help Joe Suitwearer understand how and to what degree the coding work I'm
 doing is making an impact towards are next release?

I've added something to the roadmap page which attempts to categorize
the remaining items for 1.7.  It'd be good to include already
completed items as well, so we don't get too depressed with what left
(but remember how far we've come!)

Edits, reversions, etc., welcome.

-Hyrum


Re: Branch audit

2010-08-18 Thread Daniel Shahaf
   atomic-revprop/  r984264 6 days  danielshTweak the 
 behaviour of a recently-added API to allow conveniently passing errors…

Last touched 6 days ago, so I'm not sure why it's on your list.

I've discussed the plans on dev@ recently, the only thing blocking it
from moving forward is that I haven't had time to work on it yet.

Hyrum K. Wright wrote on Tue, Aug 17, 2010 at 17:33:13 -0500:
 It's that time again!  In looking at the list of current branches
 (excepting release and backport branches I've noticed several which
 look to be abandoned or unmaintained.  Branches which have not been
 touched in over a year include (along with author and date of last
 revision, stolen from ViewVC):
 
   artem-soc-work/  r863880 3 years djames  Really, 
 actually, merge r23802 and r23803 from trunk to the artem-soc-work branc…
   atomic-revprop/  r984264 6 days  danielshTweak the 
 behaviour of a recently-added API to allow conveniently passing errors…
   bdb-fixes/   r858217 4 years brane   Working on issue 
 #2449. Move error_info refcount management where it belongs, t…
   capabilities-abstraction/r869334 2 years kfogel  On the 
 capabilities-abstraction branch: Start unifying some of the capabilities…
   issue-2699-dev/  r863182 3 years dlr On the 
 issue-2699-dev branch: Improve locking error message. * subversion/libsv…
   issue-2897/  r869309 2 years kameshj On the 
 issue-2897 branch: Improve documentation. * subversion/libsvn_client/me…
   issue-2897-take2/r875276 19 months   kameshj On 
 issue-2897-take2: Merge from ^/trunk revisions r35038 through r35201.
   issue-3081/  r869126 2 years kfogel  On the issue-3081 
 branch: * 3081-repro/: New directory, with patches, repro scr…
   meta-data-versioning/r858153 4 years pmarek  Change 
 the live properties only if they have really changed. Without this chan…
   perl-bindings-improvements/  r867175 2 years jpeacock   
  Merge with tr...@27098
   record_exact_merge_and_commit_revs/  r868556 2 years 
 kameshj On the 'record_exact_merge_and_commit_revs' branch, merge 
 28395:28481 from /trun…
   scheme-bindings/ r869778 2 years
   scons-build-system/  r862146 3 years danderson   Make 
 the source path creator safer. * build/builder.py (SvnBuild._make_src_…
   server-l10n/ r857339 4 years dlr * 
 subversion/mod_dav_svn/lang.c (sort_lang_pref): Swap return values to correc…
   status--filter/  r867402 2 years epg *** Just 
 stashing this away until after 1.5 branches. Add status --filter, for …
   svnserve-ssl/r859467 4 years mbk Make svnserve 
 SSL config items repo-conf-dir-relative, rather than absolute path…
   youngest-common-ancestor/r867405 2 years cmpilato   
  Begin work on a client-side youngest-common-ancestor API. NOTE: This code is…
 
 Could folks take a look at these branches and justify their continued
 existence?  It would also be nice to include plans to continue working
 on them, or issues which keep them from being merged to trunk.
 
 Thanks,
 -Hyrum


Re: RFC: WCNG and Issue #2915: Merge tracking and missing subtrees due to non-svn removal

2010-08-18 Thread Daniel Shahaf
Paul Burba wrote on Tue, Aug 17, 2010 at 20:06:34 -0400:
 On Thu, Aug 12, 2010 at 2:51 PM, Daniel Shahaf d...@daniel.shahaf.name 
 wrote:
  Paul Burba wrote on Fri, Aug 06, 2010 at 10:30:50 -0400:
  As described in issue #2915, in 1.6 when a merge target is a missing
  subtree due to its removal by non-svn means, we try to record
  mergeinfo such that the missing subtree doesn't have (or inherit)
  mergeinfo describing the merge:
 
  (If you already have a vague idea of how this works you can skip to
  'You might suggest that it makes more sense' below for the RFC)
 
  ### A file is missing in our merge target
 
    1.6.13-devsvn st
    !       A_COPY\D\H\psi
 
  ### No initial mergeinfo
 
    1.6.13-devsvn pg svn:mergeinfo -vR
 
  ### Merge tries to apply change to missing file, but can't
  ### and reports it as skipped
 
    1.6.13-devsvn merge ^^/A A_COPY -r2:4
    --- Merging r3 through r4 into 'A_COPY':
    U    A_COPY\D\G\rho
    Skipped missing target: 'A_COPY\D\H\psi'
    Summary of conflicts:
      Skipped paths: 1
 
  ### Merge target gets mergeinfo describing the merge
  ### performed and the missing file gets empty override
  ### mergeinfo so it doesn't inherit the target's mergeinfo
 
    1.6.13-devsvn st
     M      A_COPY
    M       A_COPY\D\G\rho
    !M      A_COPY\D\H\psi
 
    1.6.13-devsvn pg svn:mergeinfo -vR
    Properties on 'A_COPY\D\H\psi':
      svn:mergeinfo
 
    Properties on 'A_COPY':
      svn:mergeinfo
        /A:3-4
 
  If the missing subtree was a directory we obviously can't set its
  properties, so we treat this as a tree conflict:
 
    1.6.13-devsvn st
    !       A_COPY\D\H
 
    1.6.13-devsvn merge ^^/A A_COPY -r2:4
    --- Merging r3 through r4 into 'A_COPY':
    U    A_COPY\D\G\rho
       C A_COPY\D\H
    Summary of conflicts:
      Tree conflicts: 1
 
    1.6.13-devsvn st
     M      A_COPY
    M       A_COPY\D\G\rho
    !     C A_COPY\D\H
             local delete, incoming edit upon merge
 
  ~
 
  You might suggest that it makes more sense to simply throw an error
  when a mergeinfo aware merge hits a missing subtree rather than
  jumping through these hoops.  I wouldn't disagree, but an even better
  solution was suggested to me recently by Bert: Restore the missing
  subtrees.  With wcng's single DB this should be easy with
  svn_wc_restore().
 
  Does anyone see a reason we should not restore missing subtrees
  encountered during a merge?
 
  Assuming for a moment there is general agreement about the goodness of
  this approach, which sounds better:
 
  A) Check for missing subtrees at the start of the merge and restore
  them prior to any editor drives.  This would be easy enough to do in
  libsvn_client/merge.c:get_mergeinfo_paths() which we call at the start
  of a merges to collect information about subtrees interesting from a
  merge-tracking perspective.  The drawback here is that we need to
  detect all the missing subtrees and that means a new call to
  svn_io_check_path/apr_stat for every path in the merge target.  Since
  99.99%* of merges don't involve missing subtrees, we are imposing a
  needless performance penalty on most users.
 
 
  Agreed: stat() on every file on, say, our trunk during a merge from a
  branch, is too expensive.
 
  B) Restore the missing subtrees on-the-fly when the svn_delta_editor_t
  callbacks (i.e. open_directory, open_file) actually encounter a
  missing subtree.  This keeps the svn_io_check_path() calls to a
  minimum.  The drawback is that missing subtrees untouched by the
  editor are still missing after the merge.
 
 
  *nod*
 
  Any preference or suggestions for a superior solution are appreciated.
 
 
  We could treat missing files as conflicts?  e.g., about the same as what
  we'd do if someone truncated the file to zero length.
 
  That way all information is present locally (so there will be no need to
  repeat the merge).
 
 (Sorry for the tardy reply, I've been on vacation and wonderfully out
 of the loop the last 4 days)
 
 That is certainly an option, but how is it better than restoring the
 missing file(s) and letting the merge complete?  WCNG with a single DB
 enabled allows us to do that, it seems a *much* better solution that
 raising what is almost certainly a spurious conflict?  No?  Or am I
 missing something?
 

If we mark it as a conflict, then the user can still obtain the merged
result by running 'svn resolve', no?

i.e., whoops, the file is missing.  we don't know if you wanted that or
not, so we'll make you take a small effort and decide explicitly.

 Paul


Re: svn commit: r985487 - in /subversion/branches/performance/subversion: include/svn_io.h libsvn_subr/io.c libsvn_subr/stream.c

2010-08-18 Thread Daniel Shahaf
Stefan Fuhrmann wrote on Tue, Aug 17, 2010 at 23:39:37 +0200:
 Daniel Shahaf wrote:
 stef...@apache.org wrote on Sat, Aug 14, 2010 at 13:20:22 -:
 URL: http://svn.apache.org/viewvc?rev=985487view=rev
 +++ subversion/branches/performance/subversion/include/svn_io.h Sat Aug 14 
 13:20:22 2010
 +svn_io_file_read_full2(apr_file_t *file,
 +   void *buf,
 +   apr_size_t nbytes,
 +   apr_size_t *bytes_read,
 +   svn_boolean_t eof_is_ok,
 +   apr_pool_t *pool);

 Why didn't you deprecate svn_io_file_read_full() in the same commit?
   
 By the time I wrote that code almost 3 months ago, I tried to minimize
 the impact (code churn) on the code base.

When an API is revved, marking its old version deprecated and rewriting
it as a wrapper isn't considered churn.

Upgrading all existing calls is a bit more noise, but in general we
still do it (although not in tests).  Usually I try to do that in
a separate commit.

 Usually we do it as follows:

 + the declaration of svn_io_file_read_full2() is *above* that
   of svn_io_file_read_full (not critical)
 + mark the old function SVN_DEPRECATED (preprocessor symbol) and
   doxygen @deprecated
 + change the doc string of the old function to be a diff against the new
   function's docs
 + in the appropriate *.c file, upgrade the definition in-place
 + re-define the old function in lib_$foo/deprecated.c as a wrapper around the
   new function
   
 Took me a number of commits but it should be o.k. by r986492.
 Why you didn't make svn_io_file_read_full() a deprecated wrapper around
 svn_io_file_read_full2()?
   
 See above ;)


Thanks!  And I hope you don't see this as procedural nitpicks --- I was
just highlighting the conventions we follow when revving an API.

 -- Stefan^2.


In other news: I'm a bit concerned about not seeing much
activity/discussions about merging some of your work to trunk.  Perhaps
it'll be a good idea to start reviewing and merging some parts of it
now?  (in parallel to your committing further new work to the branch)

For a change to be applied to trunk, you'll have to get a full
committer's +1 on that change.


1.7 and obliterate

2010-08-18 Thread C. Michael Pilato
What's the status of obliterate?

I get the sense that it exists in our codebase, for all practical purposes,
in name only.  If that's true, and if that's the state it's expected to be
in when 1.7 ships, then I'd really, really like to just see the feature not
ship at all.  We don't have to can all the code, but certainly de-publicize
the lot of it (no pre-obliterate.tmpl hook creation, no obliterate-related
public APIs, etc.)

Some might say I'm letting the perfect be the enemy of the good, here.  I'd
disagree, but openly admit that I'm trying to let the tolerable be the enemy
of the barely-working.

But maybe I'm misremembering the current utility of the feature?

-- 
C. Michael Pilato cmpil...@collab.net
CollabNet  www.collab.net  Distributed Development On Demand


Re: 1.7 and obliterate

2010-08-18 Thread Julian Foad
On Wed, 2010-08-18, C. Michael Pilato wrote:
 What's the status of obliterate?

Heh.  I started down a path that looked promising initially but became
ever more difficult.  The parts that are checked in are the easy parts -
hook script support and some ability to remove node-revs from the
repository in only a really trivial case that's not often useful.  Now
that I'm concentrating on the essential WC-NG task, reluctantly I have
left this work aside.

Philip Martin had a good idea for doing a simpler kind of obliteration,
tried out on the obliterate-like-deltify branch, which works in what
it does and appears to be more likely to be feasible to implement fully.
That certainly deserves more attention.

 I get the sense that it exists in our codebase, for all practical purposes,
 in name only.  If that's true, and if that's the state it's expected to be
 in when 1.7 ships, then I'd really, really like to just see the feature not
 ship at all.  We don't have to can all the code, but certainly de-publicize
 the lot of it (no pre-obliterate.tmpl hook creation, no obliterate-related
 public APIs, etc.)
 
 Some might say I'm letting the perfect be the enemy of the good, here.  I'd
 disagree, but openly admit that I'm trying to let the tolerable be the enemy
 of the barely-working.
 
 But maybe I'm misremembering the current utility of the feature?

You're about spot on.  It's not in a fit state to release and I'm no
longer working on it.

I'll go and pull the existing code out some time in the next few weeks.
We can always resurrect any parts of it that we want to work on again.

- Julian




Re: Webpage showing 1.7 status?

2010-08-18 Thread Daniel Shahaf
Hyrum K. Wright wrote on Wed, Aug 18, 2010 at 13:18:09 +0100:
 I've added something to the roadmap page which attempts to categorize
 the remaining items for 1.7.  It'd be good to include already
 completed items as well, so we don't get too depressed with what left
 (but remember how far we've come!)
 

I've added two features off the top of my head.

But this reminds me... we need to update CHANGES and write a release
notes.  How do we plan to do that?  (at least CHANGES; the table hwright
added is a good release-notes skeleton)

Daniel
(Have we ever finished the 1.6 release notes?)


 Edits, reversions, etc., welcome.
 
 -Hyrum


Re: svn_client_status5() and depth

2010-08-18 Thread Stefan Küng

On 17.08.2010 23:41, Hyrum K. Wright wrote:

On Tue, Aug 17, 2010 at 3:45 PM, C. Michael Pilatocmpil...@collab.net  wrote:

On 08/17/2010 01:42 PM, Stefan Küng wrote:

Maybe I don't understand that change:
--depth specifies a depth to use for the command. If I want the command
to use the depth of the working copy, I specify an unknown depth or none
at all. But if I specify a depth, I would assume the command to respect
that depth and return the info with that depth.
So why should the -u flag not use the specified depth?


--depth is a filtering option -- it can only reduce the scope of an
operation, it can not expand it.

Bert sez he'll make the behavior optional in the API call so you can make
use of it.  We won't expose it at all in the 'svn' command-line client, but
TortoiseSVN can get its behavior back.  Sound good?


sarcasm
Yay for another API flag!
/sarcasm


Well, it's either another flag, or use bitmasks. And since bitmasks are 
not allowed anymore, there aren't any more options.



I'm starting to wonder if the mission (at the API level) of 'st -u'
and vanilla 'st' has diverged enough that we may want to consider an
API named svn_client_remote_status() or something.  It seems a bit
silly to keep extending APIs using various boolean flags[1], in a
manner which eventually leads to both consumer and programmer
confusion.  This eventually leads to entry points whose behavior
wildly varies based upon the permutations of the set of flags.
Perhaps status5 (with its 13 parameters) has reached this point?


But that would lead to another problem: how would you combine the local 
status and the remote status data? Have the clients merge those two?


Bert:
I've checked r986510 and I like to add some comments:

In libsvn_client/deprecated.c, svn_client_status4() you pass FALSE for 
the new sticky param. But that's wrong: in 1.6.x and previous versions, 
the depth was always respected/enforced. That's why I noticed the 
changed behavior in the first place!

Shouldn't you pass TRUE there?

And what I don't understand: why the extra parameter at all?
If the param is set to TRUE unconditionally, then the behavior seems to 
be the very same as in 1.6.x:
If the depth is less than the depth of the WC, it behaves the same as if 
the param is FALSE. Only if the depth is deeper then the param has any 
effect - setting the param to TRUE *ignores* deeper depths.


So from my point of view, that parameter isn't necessary: the behavior 
can be configured by using the depth parameter alone.


Also, since this is a readonly operation, the naming 'depth_is_sticky' 
isn't very good. Even confusing.


Ok, reading my text above again, I think I'm not very clear why I think 
the parameter isn't necessary. So I'll try again:


depthsticky
= WC depthFALSE 1.6.x == 1.7 : data for depth
= WC depth   TRUE  1.6.x == 1.7 : data for depth
 WC depthFALSE 1.6.x data for depth, 1.7 data for WCdepth
 WC depthTRUE  1.6.x data for depth, 1.7 data for depth

Also, if a client wants the data only for the WC depth, passing a depth 
of 'unknown' will do exactly that, no matter what the sticky is set to.


So: every outcome of that API can be triggered by specifying the depth 
alone. So why not get rid of the sticky param and always do as if it was 
set to TRUE?


Stefan

--
   ___
  oo  // \\  De Chelonian Mobile
 (_,\/ \_/ \ TortoiseSVN
   \ \_/_\_/The coolest Interface to (Sub)Version Control
   /_/   \_\ http://tortoisesvn.net


Performance branch ready for review

2010-08-18 Thread Stefan Fuhrmann

Hi @all,

I just finished my porting work; the performance branch
is now fully synchronized with my prototype code.
From my point of view, review can start now.

According to my measurements, the code is now faster
than the original prototype. Large caches provided, a
single multi-threaded svnserve instance on a modern
quad-core machine should be able to saturate a 10Gb
server connection.

Open issues / things still to do

* there is an issue with log triggering an assertion()
 - I will investigate that next
* test mod_web_dav and add FSFS cache configuration
 parameters to it
* tune membuffer cache eviction strategy such that even
 small caches can have a large impact
* add tests for the new APIs
* provide APR patches.

There are many things I would like to do but they may
better be deferred to 1.8.

-- Stefan^2.


RE: svn_client_status5() and depth

2010-08-18 Thread Bert Huijben


 -Original Message-
 From: Stefan Küng [mailto:tortoise...@gmail.com]
 Sent: woensdag 18 augustus 2010 11:39
 To: Hyrum K. Wright
 Cc: C. Michael Pilato; Bert Huijben; Subversion Development
 Subject: Re: svn_client_status5() and depth
 
 On 17.08.2010 23:41, Hyrum K. Wright wrote:
  On Tue, Aug 17, 2010 at 3:45 PM, C. Michael Pilatocmpil...@collab.net
 wrote:
  On 08/17/2010 01:42 PM, Stefan Küng wrote:
  Maybe I don't understand that change:
  --depth specifies a depth to use for the command. If I want the
 command
  to use the depth of the working copy, I specify an unknown depth or
 none
  at all. But if I specify a depth, I would assume the command to
respect
  that depth and return the info with that depth.
  So why should the -u flag not use the specified depth?
 
  --depth is a filtering option -- it can only reduce the scope of an
  operation, it can not expand it.
 
  Bert sez he'll make the behavior optional in the API call so you can
make
  use of it.  We won't expose it at all in the 'svn' command-line client,
but
  TortoiseSVN can get its behavior back.  Sound good?
 
  sarcasm
  Yay for another API flag!
  /sarcasm
 
 Well, it's either another flag, or use bitmasks. And since bitmasks are
 not allowed anymore, there aren't any more options.
 
  I'm starting to wonder if the mission (at the API level) of 'st -u'
  and vanilla 'st' has diverged enough that we may want to consider an
  API named svn_client_remote_status() or something.  It seems a bit
  silly to keep extending APIs using various boolean flags[1], in a
  manner which eventually leads to both consumer and programmer
  confusion.  This eventually leads to entry points whose behavior
  wildly varies based upon the permutations of the set of flags.
  Perhaps status5 (with its 13 parameters) has reached this point?
 
 But that would lead to another problem: how would you combine the local
 status and the remote status data? Have the clients merge those two?
 
 Bert:
 I've checked r986510 and I like to add some comments:
 
 In libsvn_client/deprecated.c, svn_client_status4() you pass FALSE for
 the new sticky param. But that's wrong: in 1.6.x and previous versions,
 the depth was always respected/enforced. That's why I noticed the
 changed behavior in the first place!
 Shouldn't you pass TRUE there?

The not filtering of the nodes was never the intended behavior. It was a
bug, but had useful behavior for your specific case: explicitly pulling
unavailable nodes into the working copy.

 And what I don't understand: why the extra parameter at all?
 If the param is set to TRUE unconditionally, then the behavior seems to
 be the very same as in 1.6.x:
 If the depth is less than the depth of the WC, it behaves the same as if
 the param is FALSE. Only if the depth is deeper then the param has any
 effect - setting the param to TRUE *ignores* deeper depths.
 
 So from my point of view, that parameter isn't necessary: the behavior
 can be configured by using the depth parameter alone.

Well.. that depends on what is the ambient depth of the directory and what
you try to do with it. If the directory has depth children, passing depth
infinity should be seen as depth children and depth files as depth files.

svn status -u was designed to show what svn up would do, and that really
needs the current behavior to work correctly. I think that is the behavior
older clients want. (But thanks for opening the discussion on that!)

 Also, since this is a readonly operation, the naming 'depth_is_sticky'
 isn't very good. Even confusing.

Actually the name is 'depth_as_sticky'.

 
 Ok, reading my text above again, I think I'm not very clear why I think
 the parameter isn't necessary. So I'll try again:
 
 depthsticky
 = WC depth FALSE 1.6.x == 1.7 : data for depth
 = WC depth   TRUE  1.6.x == 1.7 : data for depth
   WC depthFALSE 1.6.x data for depth, 1.7 data for
WCdepth
   WC depthTRUE  1.6.x data for depth, 1.7 data for depth
 
 Also, if a client wants the data only for the WC depth, passing a depth
 of 'unknown' will do exactly that, no matter what the sticky is set to.

 So: every outcome of that API can be triggered by specifying the depth
 alone. So why not get rid of the sticky param and always do as if it was
 set to TRUE?

Note that depth is used for more than just this filtering on the ra layer.
It also applies on what nodes of your local working copy are shown. It did
work as a filter there.

Are you sure you can mimic that without the extra Boolean?
(I certainly don't see any of this in your table)

Bert

 
 Stefan
 
 --
 ___
oo  // \\  De Chelonian Mobile
   (_,\/ \_/ \ TortoiseSVN
 \ \_/_\_/The coolest Interface to (Sub)Version Control
 /_/   \_\ http://tortoisesvn.net



RE: svn commit: r986865 - /subversion/trunk/notes/wc-ng/node-data

2010-08-18 Thread Bert Huijben


 -Original Message-
 From: phi...@apache.org [mailto:phi...@apache.org]
 Sent: woensdag 18 augustus 2010 12:17
 To: comm...@subversion.apache.org
 Subject: svn commit: r986865 - /subversion/trunk/notes/wc-ng/node-data
 
 Author: philip
 Date: Wed Aug 18 19:16:59 2010
 New Revision: 986865
 
 URL: http://svn.apache.org/viewvc?rev=986865view=rev
 Log:
 Some initial thoughts about NODE_DATA from the Sheffield meeting.
 
 * notes/wc-ng/node-data: New.
 
 Added:
 subversion/trunk/notes/wc-ng/node-data   (with props)
 
 Added: subversion/trunk/notes/wc-ng/node-data
 URL: http://svn.apache.org/viewvc/subversion/trunk/notes/wc-ng/node-
 data?rev=986865view=auto
 ==
 
 --- subversion/trunk/notes/wc-ng/node-data (added)
 +++ subversion/trunk/notes/wc-ng/node-data Wed Aug 18 19:16:59 2010
 @@ -0,0 +1,255 @@
 +NODE_DATA Design (Sheffield 2010-08-18)
 +===
 +
 +Essentially it replaces BASE_NODE and WORKING_NODE by combining all
 +the existing columns with a new op_depth column where op_depth == 0 is
 +the old BASE_NODE and op_depth != 0 is the old WORKING_NODE. 

I don't think it *replaces* BASE_NODE and WORKING_NODE. It will contain data 
for both, but it doesn't replace these tables.

Maybe it can replace WORKING_NODE, but BASE_NODE has more information than the 
columns you list here. Thinks like the repos_relpath and copyfrom_* are only 
defined on BASE and/or an operation root.

That is why they are still modeled to stay on BASE_NODE and WORKING_NODE. And 
last time I looked at the design, translated_size and last_mod_time (used for 
optimizing away comparisions from things like 'svn status') were still on BASE 
and WORKING as they are only relevant for nodes that are in the wc.

 +
 +  Category Columns
 +
 +  indexing:wc_id, local_relpath, parent_relpath, op_depth
 +  status:  presence
 +  node-rev:repos_id, repos_relpath, revnum
 +  content: kind, properties, depth, target, checksum
 +  last-change: changed_rev, changed_date, changed_author
 +  wc-cache:translated_size, last_mod_time
 +  misc:dav_cache, file_external
 +
 +When op_depth == 0 the node-rev columns represent the checked-out
 +repository node, otherwise they represent the copyfrom node.
 +
 +Presence has the same six values as BASE_NODE/WORKING_NODE:
 normal,
 +incomplete, absent, excluded, not-present, base-deleted.  There are
 +some presence/op_depth constraints, e.g. base-deleted is not valid for
 +op_depth 0 and absent is not valid for op_depth != 0.
 +

snip

Bert



Re: svn_client_status5() and depth

2010-08-18 Thread Hyrum K. Wright
On Wed, Aug 18, 2010 at 7:39 PM, Stefan Küng tortoise...@gmail.com wrote:
 On 17.08.2010 23:41, Hyrum K. Wright wrote:

 On Tue, Aug 17, 2010 at 3:45 PM, C. Michael Pilatocmpil...@collab.net
  wrote:

 On 08/17/2010 01:42 PM, Stefan Küng wrote:

 Maybe I don't understand that change:
 --depth specifies a depth to use for the command. If I want the command
 to use the depth of the working copy, I specify an unknown depth or none
 at all. But if I specify a depth, I would assume the command to respect
 that depth and return the info with that depth.
 So why should the -u flag not use the specified depth?

 --depth is a filtering option -- it can only reduce the scope of an
 operation, it can not expand it.

 Bert sez he'll make the behavior optional in the API call so you can make
 use of it.  We won't expose it at all in the 'svn' command-line client,
 but
 TortoiseSVN can get its behavior back.  Sound good?

 sarcasm
 Yay for another API flag!
 /sarcasm

 Well, it's either another flag, or use bitmasks. And since bitmasks are not
 allowed anymore, there aren't any more options.

False dichotomy.  There are more options than just add parameters or
add bitmasks, one of which I mentioned below: rethinking the roles
of our various APIs, and augmenting *that* instead.  (But given your
comments below, I think you're right about this particular case.)

 I'm starting to wonder if the mission (at the API level) of 'st -u'
 and vanilla 'st' has diverged enough that we may want to consider an
 API named svn_client_remote_status() or something.  It seems a bit
 silly to keep extending APIs using various boolean flags[1], in a
 manner which eventually leads to both consumer and programmer
 confusion.  This eventually leads to entry points whose behavior
 wildly varies based upon the permutations of the set of flags.
 Perhaps status5 (with its 13 parameters) has reached this point?

 But that would lead to another problem: how would you combine the local
 status and the remote status data? Have the clients merge those two?

Good point.  I'm not claiming the above solution is optimal, but that
we need to do some hard thinking about these issues before adding Yet
Another Parameter to the APIs.

-Hyrum


Re: [PATCH] New dumpstream parser to check version number

2010-08-18 Thread Hyrum K. Wright
On Wed, Aug 18, 2010 at 8:59 PM, Ramkumar Ramachandra
artag...@gmail.com wrote:
 [[[
 * subversion/libsvn_repos/load.c
  (svn_repos_parse_dumpstream2, svn_repos_parse_dumpstream3): Rename
  the older function and add a version_number argument; error out if
  there's a version mismatch.

 * subversion/include/svn_repos.h
  (svn_repos_parse_dumpstream2, svn_repos_parse_dumpstream3): Add new
  function and mark svn_repos_parse_dumpstream2 as deprecated.

 * subversion/svnrdump/load_editor.c
  (drive_dumpstream_loader): Update to use the new API, and call it
  with version_number 3.
 ]]]

 Index: subversion/include/svn_repos.h
 ===
 --- subversion/include/svn_repos.h      (revision 986884)
 +++ subversion/include/svn_repos.h      (working copy)
 @@ -2646,6 +2646,10 @@ typedef svn_repos_parse_fns2_t svn_repos_parser_fn
  * @a cancel_baton as argument to see if the client wishes to cancel
  * the dump.
  *
 + * If @a version_number is -1, it is ignored and the dumpstream is
 + * parsed without this information. If not -1, the function checks the

We try to avoid magic numbers.  Is there a #define or constant
somewhere that you could use?  If not, I'd add a #define for
UNKNOWN_VERSION or something similar (look at other #define's in the
header files).

 + * dumpstream's version number, and errors out if there's a mismatch.
 + *
  * This parser has built-in knowledge of the dumpfile format, but only
  * in a general sense:
  *
 @@ -2661,9 +2665,25 @@ typedef svn_repos_parse_fns2_t svn_repos_parser_fn
  * This is enough knowledge to make it easy on vtable implementors,
  * but still allow expansion of the format:  most headers are ignored.
  *
 - * @since New in 1.1.
 + * @since New in 2.0.

Uh, we haven't (not do we have an immediate plans to) released 2.0 yet. :)

  */
  svn_error_t *
 +svn_repos_parse_dumpstream3(svn_stream_t *stream,
 +                            const svn_repos_parse_fns2_t *parse_fns,
 +                            void *parse_baton,
 +                            svn_cancel_func_t cancel_func,
 +                            void *cancel_baton,
 +                            int version_number,
 +                            apr_pool_t *pool);
 +
 +/**
 + * Similar to svn_repos_parse_dumpstream3(), but is dumpfile version
 + * agnostic.
 + *
 + * @deprecated Provided for backward compatibility with the 1.6 API.

Should maintain the @since tag from before.

 + */
 +SVN_DEPRECATED
 +svn_error_t *
  svn_repos_parse_dumpstream2(svn_stream_t *stream,
                             const svn_repos_parse_fns2_t *parse_fns,
                             void *parse_baton,
 Index: subversion/libsvn_repos/load.c
 ===
 --- subversion/libsvn_repos/load.c      (revision 986884)
 +++ subversion/libsvn_repos/load.c      (working copy)
 @@ -654,14 +654,27 @@ parse_format_version(const char *versionstring, in
  }


 +svn_error_t *
 +svn_repos_parse_dumpstream2(svn_stream_t *stream,
 +                            const svn_repos_parse_fns2_t *parse_fns,
 +                            void *parse_baton,
 +                            svn_cancel_func_t cancel_func,
 +                            void *cancel_baton,
 +                            apr_pool_t *pool)
 +{
 +  return svn_repos_parse_dumpstream3(stream, parse_fns, parse_baton,
 +                                     cancel_func, cancel_baton, -1, pool);
 +}

 +
  /* The Main Parser Logic */
  svn_error_t *
 -svn_repos_parse_dumpstream2(svn_stream_t *stream,
 +svn_repos_parse_dumpstream3(svn_stream_t *stream,
                             const svn_repos_parse_fns2_t *parse_fns,
                             void *parse_baton,
                             svn_cancel_func_t cancel_func,
                             void *cancel_baton,
 +                            int version_number,
                             apr_pool_t *pool)
  {
   svn_boolean_t eof;
 @@ -690,6 +703,11 @@ svn_error_t *
     return svn_error_createf(SVN_ERR_STREAM_MALFORMED_DATA, NULL,
                              _(Unsupported dumpfile version: %d), version);

 +  /* Error out if version doesn't match with the provided version_number */
 +  if (version_number != -1  version_number != version)

Another magic number.

 +    return svn_error_createf(SVN_ERR_STREAM_MALFORMED_DATA, NULL,
 +                             _(Unsupported dumpfile version: %d), version);
 +
   /* A dumpfile record is defined to be a header-block of
      rfc822-style headers, possibly followed by a content-block.

 Index: subversion/svnrdump/load_editor.c
 ===
 --- subversion/svnrdump/load_editor.c   (revision 986884)
 +++ subversion/svnrdump/load_editor.c   (working copy)
 @@ -540,8 +540,8 @@ drive_dumpstream_loader(svn_stream_t *stream,
   pb = parse_baton;

   SVN_ERR(svn_ra_get_repos_root2(session, (pb-root_url), pool));
 -  

Re: svn_client_status5() and depth

2010-08-18 Thread Stefan Küng

On 18.08.2010 21:58, Bert Huijben wrote:


In libsvn_client/deprecated.c, svn_client_status4() you pass FALSE for
the new sticky param. But that's wrong: in 1.6.x and previous versions,
the depth was always respected/enforced. That's why I noticed the
changed behavior in the first place!
Shouldn't you pass TRUE there?


The not filtering of the nodes was never the intended behavior. It was a
bug, but had useful behavior for your specific case: explicitly pulling
unavailable nodes into the working copy.


Well, it might have been intended, but it didn't work that way. It 
worked as if you would pass TRUE. Since you pass FALSE, the behavior is 
not the same as in 1.6.x.
And shouldn't the old APIs stay compatible? To stay compatible with the 
old APIs, you would have to pass TRUE in that wrapper.

Or am I missing something important here?




And what I don't understand: why the extra parameter at all?
If the param is set to TRUE unconditionally, then the behavior seems to
be the very same as in 1.6.x:
If the depth is less than the depth of the WC, it behaves the same as if
the param is FALSE. Only if the depth is deeper then the param has any
effect - setting the param to TRUE *ignores* deeper depths.

So from my point of view, that parameter isn't necessary: the behavior
can be configured by using the depth parameter alone.


Well.. that depends on what is the ambient depth of the directory and what
you try to do with it. If the directory has depth children, passing depth
infinity should be seen as depth children and depth files as depth files.


Why? If I pass depth infinity, I *want* the depth infinity, not depth 
children. If I want the depth of the WC, I can pass depth unknown. 
That's what the depth 'unknown' is for: to specify that the depth of the 
WC should be used instead. If I pass a specific depth, I want that depth 
and not something else.



svn status -u was designed to show what svn up would do, and that really
needs the current behavior to work correctly. I think that is the behavior
older clients want. (But thanks for opening the discussion on that!)


Also, since this is a readonly operation, the naming 'depth_is_sticky'
isn't very good. Even confusing.


Actually the name is 'depth_as_sticky'.



Ok, reading my text above again, I think I'm not very clear why I think
the parameter isn't necessary. So I'll try again:

depthsticky
= WC depthFALSE 1.6.x == 1.7 : data for depth
= WC depth   TRUE  1.6.x == 1.7 : data for depth
WC depthFALSE 1.6.x data for depth, 1.7 data for

WCdepth

WC depthTRUE  1.6.x data for depth, 1.7 data for depth

Also, if a client wants the data only for the WC depth, passing a depth
of 'unknown' will do exactly that, no matter what the sticky is set to.

So: every outcome of that API can be triggered by specifying the depth
alone. So why not get rid of the sticky param and always do as if it was
set to TRUE?


Note that depth is used for more than just this filtering on the ra layer.
It also applies on what nodes of your local working copy are shown. It did
work as a filter there.

Are you sure you can mimic that without the extra Boolean?
(I certainly don't see any of this in your table)


I don't understand: what exactly are you missing from that table above?
Can you maybe provide me that part of the table, and I'll fill in the 
data that's returned.


Stefan

--
   ___
  oo  // \\  De Chelonian Mobile
 (_,\/ \_/ \ TortoiseSVN
   \ \_/_\_/The coolest Interface to (Sub)Version Control
   /_/   \_\ http://tortoisesvn.net


Re: [PATCH] New dumpstream parser to check version number

2010-08-18 Thread Daniel Shahaf
Ramkumar Ramachandra wrote on Thu, Aug 19, 2010 at 01:29:02 +0530:
 +++ subversion/include/svn_repos.h(working copy)
 @@ -2661,9 +2665,25 @@ typedef svn_repos_parse_fns2_t svn_repos_parser_fn
   * This is enough knowledge to make it easy on vtable implementors,
   * but still allow expansion of the format:  most headers are ignored.
   *
 - * @since New in 1.1.
 + * @since New in 2.0.

s/2.0/1.7/

   */
  svn_error_t *
 +svn_repos_parse_dumpstream3(svn_stream_t *stream,
 +const svn_repos_parse_fns2_t *parse_fns,
 +void *parse_baton,
 +svn_cancel_func_t cancel_func,
 +void *cancel_baton,
 +int version_number,
 +apr_pool_t *pool);
 +
 +/**
 + * Similar to svn_repos_parse_dumpstream3(), but is dumpfile version
 + * agnostic.
 + *

Perhaps say: with @a version_number always set to -1 ?

 + * @deprecated Provided for backward compatibility with the 1.6 API.
 + */
 +SVN_DEPRECATED
 +svn_error_t *
  svn_repos_parse_dumpstream2(svn_stream_t *stream,
  const svn_repos_parse_fns2_t *parse_fns,
  void *parse_baton,
 Index: subversion/libsvn_repos/load.c
 ===
 --- subversion/libsvn_repos/load.c(revision 986884)
 +++ subversion/libsvn_repos/load.c(working copy)
 @@ -654,14 +654,27 @@ parse_format_version(const char *versionstring, in
  /* The Main Parser Logic */
  svn_error_t *
 -svn_repos_parse_dumpstream2(svn_stream_t *stream,
 +svn_repos_parse_dumpstream3(svn_stream_t *stream,
  const svn_repos_parse_fns2_t *parse_fns,
  void *parse_baton,
  svn_cancel_func_t cancel_func,
  void *cancel_baton,
 +int version_number,
  apr_pool_t *pool)
  {
svn_boolean_t eof;
 @@ -690,6 +703,11 @@ svn_error_t *
  return svn_error_createf(SVN_ERR_STREAM_MALFORMED_DATA, NULL,
   _(Unsupported dumpfile version: %d), version);
  
 +  /* Error out if version doesn't match with the provided version_number */
 +  if (version_number != -1  version_number != version)

svnrdump needs an inequality check here.  But 'svnadmin load', for
example, needs an is at most check.  So I don't think this is generic
enough: it would be better to provide something that 'svnadmin load' can
use too.

So, two options:

* Repeat the parse_format_version() trick (from load.c) in svnrdump.

* In the API, PARSE_FNS could grow a dumpfile_version_record() callback
  (like it has a uuid_record()) callback.  The caller can implement
  whatever check they want in that callback.  (possibly with a special
  error code that svn_repos_parse_dumpstreamN() recognizes; compare
  usage of SVN_ERR_CANCELLED)


NODE_DATA discussions

2010-08-18 Thread Julian Foad
Hyrum, Eric H., Philip M. and I met up at WANdisco's office in Sheffield
(England) yeterday and today.  One of the things we discussed was
NODE_DATA.

We discussed several sub-topics, of which here are two.  I've written
these up including my further thoughts which weren't part of the
discussion, so it's biased.


---

Observation:

The new NODE_DATA table can completely subsume the old BASE_NODE and
WORKING_NODE tables.

  BASE_NODE= NODE_DATA(op_depth==0),
  WORKING_NODE = NODE_DATA(op_depth==max).

A few of the columns do not make sense in every op_depth.
translated_size and last_mod_time make sense only on the top-most node.
But putting these columns into this table seems better than keeping them
in a separate table.

  BASE_NODE NODE_DATA   WORKING_NODE
    --  --
  Indexing  =  yes + op_depth=Indexing
  presence  =  yes   =presence
  Node-Rev  =  yes: original_*   =copyfrom_*
  Content   =  yes   =Content
  Last-Change   =  yes   =Last-Change
  translated_size   =  TODO  =translated_size
  last_mod_time =  TODO  =last_mod_time
  file_external x   no - obsolete
  dav_cache =  TODO
  incomplete_children   x   no - obsolete
no - not ready?   =moved_here
no - not ready?   =moved_to
no - obsolete  xkeep_local

By TODO I mean not yet in wc-metadata.sql.

By not ready? I mean we're not ready to fully define and use the
'moved_*' columns so it would be better to insert them with a WC format
upgrade when we are ready.


---

[RFC] Instead of recording the deleted state as a presence value in
the topmost operative layer, consider whether to record it by means of a
flag in the next-nearest layer *beneath*.

This initially sounded appealing, but I'm not so sure now.  It could
just be a premature optimisation side-track.  It's certainly a way down
the list of Important Things.

Example sequence of operations, showing representation in both schemes:

  Operation: clean checkout

  WC pathsop_depth=0  op_depth=1op_depth=0  op_depth=1
  --  ----  --
  A1/ norm  norm
   +- f.old   norm  norm
   +- f   norm  norm
  B1/ norm  norm
   +- f   norm  norm
   +- f.new   norm  norm

  Operation: delete ./A1

  WC pathsop_depth=0  op_depth=1op_depth=0  op_depth=1
  --  ----  --
  A1/ norm  Del!normbase_deleted
   +- f.old   norm  Del!normbase_deleted
   +- f   norm  Del!normbase_deleted
  B1/ norm  norm
   +- f   norm  norm
   +- f.new   norm  norm

  Operation: copy ^/B1 to ./A1, replacing ^/A1

  WC pathsop_depth=0  op_depth=1op_depth=0  op_depth=1
  --  ----  --
  A1/ norm  Del!  norm  normnorm
   +- f.old   norm  Del!normbase_deleted
   +- f   norm  Del!  norm  normnorm
   +- f.new   norm  norm
  B1/ norm  norm
   +- f   norm  norm
   +- f.new   norm  norm


Flag in layer beneath doesn't require a final base_deleted row in the
topmost layer.  The flag is required for (and only for) paths where a
row exists in the previous layer, so it seems more space-efficient to
store it there.

Flag in layer beneath allows simpler reverting of the add half of a
replacement: (remove all op_depth=N rows that are children of this op)
rather than (convert all these rows to a different presence value that
depends on their presence in layer N-1).  That's not considered an
important UI feature, but the fact that it can be done by a logically
simple operation is likely to result in simpler, less buggy code.

Flag in layer beneath makes reverting the whole operation harder: two
layers need to be modified.

Flag in layer beneath is redundant when overridden by a higher layer, in
other words when a node is present in the WC at this path.

Flag in layer beneath copes with intentional deletion, but what about
'absent' and 

Re: [PATCH] New dumpstream parser to check version number

2010-08-18 Thread Ramkumar Ramachandra
Hi Hyrum,

Hyrum K. Wright writes:
  + * If @a version_number is -1, it is ignored and the dumpstream is
  + * parsed without this information. If not -1, the function checks the
 
 We try to avoid magic numbers.  Is there a #define or constant
 somewhere that you could use?  If not, I'd add a #define for
 UNKNOWN_VERSION or something similar (look at other #define's in the
 header files).

Right, done.

  - * @since New in 1.1.
  + * @since New in 2.0.
 
 Uh, we haven't (not do we have an immediate plans to) released 2.0 yet. :)

Right. I seem to have forgotten about 1.7-1.9 :p

  +/**
  + * Similar to svn_repos_parse_dumpstream3(), but is dumpfile version
  + * agnostic.
  + *
  + * @deprecated Provided for backward compatibility with the 1.6 API.
 
 Should maintain the @since tag from before.

Ok.

  +  /* Error out if version doesn't match with the provided version_number */
  +  if (version_number != -1  version_number != version)
 
 Another magic number.

Fixed.

  -  SVN_ERR(svn_repos_parse_dumpstream2(stream, parser, parse_baton,
  -                                      NULL, NULL, pool));
  +  SVN_ERR(svn_repos_parse_dumpstream3(stream, parser, parse_baton,
  +                                      NULL, NULL, 3, pool));
 
 Another magic number.

Fixed.

-- Ram


Re: [PATCH] New dumpstream parser to check version number

2010-08-18 Thread Ramkumar Ramachandra
Hi Daniel,

Daniel Shahaf writes:
  - * @since New in 1.1.
  + * @since New in 2.0.
 
 s/2.0/1.7/

Fixed.

  +/**
  + * Similar to svn_repos_parse_dumpstream3(), but is dumpfile version
  + * agnostic.
  + *
 
 Perhaps say: with @a version_number always set to -1 ?

Fixed, along with Hyrum's suggestions.

  +  /* Error out if version doesn't match with the provided version_number */
  +  if (version_number != -1  version_number != version)
 
 svnrdump needs an inequality check here.  But 'svnadmin load', for
 example, needs an is at most check.  So I don't think this is generic
 enough: it would be better to provide something that 'svnadmin load' can
 use too.
 
 So, two options:
 
 * Repeat the parse_format_version() trick (from load.c) in svnrdump.

It's actually not possible. If I read one line of the dumpstream in
load_editor.c and then call svn_repos_parse_dumpstream2, the function
won't have the version number information because I've already read
that out from the dumpstream, and I can't rewind the stream. Note that
as you already pointed out on IRC, parse_format_version imposes a
maximum already (load.c:647).

 * In the API, PARSE_FNS could grow a dumpfile_version_record() callback
   (like it has a uuid_record()) callback.  The caller can implement
   whatever check they want in that callback.  (possibly with a special
   error code that svn_repos_parse_dumpstreamN() recognizes; compare
   usage of SVN_ERR_CANCELLED)

Excellent idea; too much work for one patch though- I'll add a TODO
comment about this and re-post the patch.

-- Ram


Re: [PATCH] Bug in svn_fs_paths_changed2() Python bindings?

2010-08-18 Thread Gavin Beau Baumanis
Ping. The renewed patch submission has received no comments.


Gavin.



On 12/08/2010, at 6:12 AM, Alexey Neyman wrote:

 On Wednesday, August 11, 2010 12:10:41 pm C. Michael Pilato wrote:
 On 08/10/2010 09:22 PM, Alexey Neyman wrote:
 Okay, try again:
 
 [[[
 Fix the type of structures returned in bindings from
 svn_fs_paths_changed2().
 
 * subversion/include/svn_fs.h
 
  (svn_fs_paths_changed2): Rename the argument from changed_paths_p to
  changed_paths_p2, so that it's different from argument to
  svn_fs_paths_changed().
 
 Minor nit -- I think 'changed_paths2_p' would be a better name.  To me, a
 number at the very end of a parameter name says differentiator (as in
 'strcmp(str1, str2)') whereas having that '2' closer to the 'changed_paths'
 name says version iterator.  But like I said, it's a minor nit.
 
 Otherwise, +1 on the patch.
 
 Agreed. Updated patch attached.
 
 Regards,
 Alexey.
 fs_paths_changed2.diff



Re: Performance branch ready for review

2010-08-18 Thread Johan Corveleyn
On Wed, Aug 18, 2010 at 9:14 PM, Stefan Fuhrmann
stefanfuhrm...@alice-dsl.de wrote:
 Hi @all,

 I just finished my porting work; the performance branch
 is now fully synchronized with my prototype code.
 From my point of view, review can start now.

 According to my measurements, the code is now faster
 than the original prototype. Large caches provided, a
 single multi-threaded svnserve instance on a modern
 quad-core machine should be able to saturate a 10Gb
 server connection.

 Open issues / things still to do

 * there is an issue with log triggering an assertion()
  - I will investigate that next
 * test mod_web_dav and add FSFS cache configuration
  parameters to it
 * tune membuffer cache eviction strategy such that even
  small caches can have a large impact
 * add tests for the new APIs
 * provide APR patches.

 There are many things I would like to do but they may
 better be deferred to 1.8.

I tried compiling your branch on Windows (XP) with Visual C++ Express
2008 (which I also use successfully to build trunk). I had a couple of
issues. FWIW I'm listing them here. I'm not an expert, just
pragmatically trying to get the thing to build, so some of these
things may be user error. Eventually, I was able to build svn.exe,
but svnadmin.exe and svnserve.exe still fail for me.

1) In build.conf, I added private\svn_temp_serializer.h and
private\svn_file_handle_cache.h to the list of msvc-export of
libsvn_subr. Otherwise, the linker gave problems:

libsvn_fs_fs-1.lib(fs_fs.obj) : error LNK2019: unresolved external
symbol _svn_file_handle_cache__has_file referenced in function
_svn_fs_fs__path_rev_absolute
libsvn_fs_fs-1.lib(fs_fs.obj) : error LNK2019: unresolved external
symbol _svn_stream__from_cached_file_handle referenced in function
_get_node_revision_body
libsvn_fs_fs-1.lib(fs_fs.obj) : error LNK2019: unresolved external
symbol _svn_file_handle_cache__open referenced in function
_get_node_revision_body
libsvn_fs_fs-1.lib(fs_fs.obj) : error LNK2019: unresolved external
symbol _svn_file_handle_cache__get_apr_handle referenced in function
_open_pack_or_rev_file
libsvn_fs_fs-1.lib(fs_fs.obj) : error LNK2019: unresolved external
symbol _svn_file_handle_cache__flush referenced in function
_sync_file_handle_cache
libsvn_fs_fs-1.lib(fs_fs.obj) : error LNK2019: unresolved external
symbol _svn_file_handle_cache__close referenced in function
_svn_fs_fs__rev_get_root
libsvn_fs_fs-1.lib(caching.obj) : error LNK2019: unresolved external
symbol _svn_file_handle_cache__create_cache referenced in function
_get_global_file_handle_cache
libsvn_fs_fs-1.lib(id.obj) : error LNK2019: unresolved external symbol
_svn_temp_serializer__pop referenced in function
_svn_fs_fs__id_serialize
libsvn_fs_fs-1.lib(temp_serializer.obj) : error LNK2001: unresolved
external symbol _svn_temp_serializer__pop
libsvn_fs_fs-1.lib(id.obj) : error LNK2019: unresolved external symbol
_svn_temp_serializer__push referenced in function
_svn_fs_fs__id_serialize
libsvn_fs_fs-1.lib(temp_serializer.obj) : error LNK2001: unresolved
external symbol _svn_temp_serializer__push
libsvn_fs_fs-1.lib(id.obj) : error LNK2019: unresolved external symbol
_svn_temp_serializer__add_string referenced in function
_serialize_id_private
libsvn_fs_fs-1.lib(temp_serializer.obj) : error LNK2001: unresolved
external symbol _svn_temp_serializer__add_string
libsvn_fs_fs-1.lib(dag.obj) : error LNK2001: unresolved external
symbol _svn_temp_serializer__add_string
libsvn_fs_fs-1.lib(id.obj) : error LNK2019: unresolved external symbol
_svn_temp_deserializer__resolve referenced in function
_svn_fs_fs__id_deserialize
libsvn_fs_fs-1.lib(temp_serializer.obj) : error LNK2001: unresolved
external symbol _svn_temp_deserializer__resolve
libsvn_fs_fs-1.lib(dag.obj) : error LNK2001: unresolved external
symbol _svn_temp_deserializer__resolve
libsvn_fs_fs-1.lib(temp_serializer.obj) : error LNK2019: unresolved
external symbol _svn_temp_serializer__get referenced in function
_svn_fs_fs__serialize_txdelta_window
libsvn_fs_fs-1.lib(dag.obj) : error LNK2001: unresolved external
symbol _svn_temp_serializer__get
libsvn_fs_fs-1.lib(temp_serializer.obj) : error LNK2019: unresolved
external symbol _svn_temp_serializer__init referenced in function
_svn_fs_fs__serialize_txdelta_window
libsvn_fs_fs-1.lib(dag.obj) : error LNK2001: unresolved external
symbol _svn_temp_serializer__init
libsvn_fs_fs-1.lib(temp_serializer.obj) : error LNK2019: unresolved
external symbol _svn_temp_deserializer__ptr referenced in function
_svn_fs_fs__extract_dir_entry
libsvn_fs_fs-1.lib(dag.obj) : error LNK2019: unresolved external
symbol _svn_temp_serializer__set_null referenced in function
_svn_fs_fs__dag_serialize
..\..\..\Debug\subversion\libsvn_fs\libsvn_fs-1.dll : fatal error
LNK1120: 15 unresolved externals

Here's a patch that solves this for me:
[[[
Index: build.conf
===
--- build.conf  (revision 986928)
+++ build.conf  (working copy)
@@ -322,6 +322,7 @@

Re: [PATCH] Bug in svn_fs_paths_changed2() Python bindings?

2010-08-18 Thread Alexey Neyman
There was a race condition :) C. Michael Pilato committed the same patch at 
the same time as I sent updated version, with less than 3 minutes difference.

Regards,
Alexey.

On Wednesday, August 18, 2010 03:54:43 pm Gavin Beau Baumanis wrote:
 Ping. The renewed patch submission has received no comments.
 
 
 Gavin.
 
 On 12/08/2010, at 6:12 AM, Alexey Neyman wrote:
  On Wednesday, August 11, 2010 12:10:41 pm C. Michael Pilato wrote:
  On 08/10/2010 09:22 PM, Alexey Neyman wrote:
  Okay, try again:
  
  [[[
  Fix the type of structures returned in bindings from
  svn_fs_paths_changed2().
  
  * subversion/include/svn_fs.h
  
   (svn_fs_paths_changed2): Rename the argument from changed_paths_p to
   changed_paths_p2, so that it's different from argument to
   svn_fs_paths_changed().
  
  Minor nit -- I think 'changed_paths2_p' would be a better name.  To me,
  a number at the very end of a parameter name says differentiator (as
  in 'strcmp(str1, str2)') whereas having that '2' closer to the
  'changed_paths' name says version iterator.  But like I said, it's a
  minor nit.
  
  Otherwise, +1 on the patch.
  
  Agreed. Updated patch attached.
  
  Regards,
  Alexey.
  fs_paths_changed2.diff


Re: 1.7 and obliterate

2010-08-18 Thread C. Michael Pilato
Thanks for the update, Julian.  Great that you were able to make progress at
all on such a difficult problem; bummer that Reality showed her dark side.
But, as you said, it's version control -- nothing is ever really lost!


On 08/18/2010 08:11 AM, Julian Foad wrote:
 On Wed, 2010-08-18, C. Michael Pilato wrote:
 What's the status of obliterate?
 
 Heh.  I started down a path that looked promising initially but became
 ever more difficult.  The parts that are checked in are the easy parts -
 hook script support and some ability to remove node-revs from the
 repository in only a really trivial case that's not often useful.  Now
 that I'm concentrating on the essential WC-NG task, reluctantly I have
 left this work aside.
 
 Philip Martin had a good idea for doing a simpler kind of obliteration,
 tried out on the obliterate-like-deltify branch, which works in what
 it does and appears to be more likely to be feasible to implement fully.
 That certainly deserves more attention.
 
 I get the sense that it exists in our codebase, for all practical purposes,
 in name only.  If that's true, and if that's the state it's expected to be
 in when 1.7 ships, then I'd really, really like to just see the feature not
 ship at all.  We don't have to can all the code, but certainly de-publicize
 the lot of it (no pre-obliterate.tmpl hook creation, no obliterate-related
 public APIs, etc.)

 Some might say I'm letting the perfect be the enemy of the good, here.  I'd
 disagree, but openly admit that I'm trying to let the tolerable be the enemy
 of the barely-working.

 But maybe I'm misremembering the current utility of the feature?
 
 You're about spot on.  It's not in a fit state to release and I'm no
 longer working on it.
 
 I'll go and pull the existing code out some time in the next few weeks.
 We can always resurrect any parts of it that we want to work on again.
 
 - Julian
 
 


-- 
C. Michael Pilato cmpil...@collab.net
CollabNet  www.collab.net  Distributed Development On Demand


Re: 1.7 and obliterate

2010-08-18 Thread i . grok
On Wed, Aug 18, 2010 at 05:17:58PM -0700, C. Michael Pilato wrote:
 Thanks for the update, Julian.  Great that you were able to make progress at
 all on such a difficult problem; bummer that Reality showed her dark side.
 But, as you said, it's version control -- nothing is ever really lost!
Until obliterate is available...

-- 
Beware of he who would deny you access to information, for in his heart, he
dreams himself your master. -- Alpha Centauri


RE: svn_client_status5() and depth

2010-08-18 Thread Bert Huijben


 -Original Message-
 From: Stefan Küng [mailto:tortoise...@gmail.com]
 Sent: woensdag 18 augustus 2010 13:37
 To: Bert Huijben
 Cc: 'Hyrum K. Wright'; 'C. Michael Pilato'; 'Subversion Development'
 Subject: Re: svn_client_status5() and depth


 I don't understand: what exactly are you missing from that table above?
 Can you maybe provide me that part of the table, and I'll fill in the
 data that's returned.

If I have a directory with ambient depth files, I want

svn status -u --depth children .

to show exactly what would change if I would do

svn up --depth children .

As this is the documented behavior for 'svn status -u'. *)

If status is patched as suggested by you, I would see all the directories
that are missing in my local working copy in the 'svn status', but I
wouldn't get them by calling the update. (not what I want)

If I do pass --depth unknown (by not passing --depth) I would not have depth
children, but depth infinity. (not what I want as that shows nodes inside
the directories below the current directory)

Theoretically. I could get the same result by calling something like 'svn
info' to retrieve the depth and then pass that depth.. 
So that would be

svn status -u --depth files.
But this also removes all the directories that I have explicitly checked out
from the status information.


So there is no way that I can use this documented behavior, without this
flag.


Bert

*) And that is the reason I didn't pass TRUE. The old behavior is a bug,
which we ought to fix in 1.6.. But then we break TortoiseSVN 1.6, which is
something that I try to avoid. So maybe I should forget that (which is
what I did)




[RFC/PATCH] Create a fresh svn_repos_parse_fns3_t

2010-08-18 Thread Ramkumar Ramachandra
Hi,

I sent a patch a while ago for svn_repos_parse_dumpstream3. While I
wait for approval, this is an RFC patch describing my future plan once
that patch gets approved. It can be described tersely as While at it
(adding the new callback), fix everything that's wrong with the
struct. I'm planning to fix a few other things as well, but this is
the basic sketch. See load_editor.c for the usecase- I actually
stuffed a scratch pool into the parse_baton so I could use it
everywhere.

-- Ram

Index: subversion/include/svn_repos.h
===
--- subversion/include/svn_repos.h  (revision 986884)
+++ subversion/include/svn_repos.h  (working copy)
@@ -2544,6 +2544,129 @@ svn_repos_load_fs(svn_repos_t *repos,
 
 
 /**
+ * A vtable that is driven by svn_repos_parse_dumpstream3().
+ *
+ * @since New in 1.7.
+ */
+typedef struct svn_repos_parse_fns3_t
+{
+  /** The parser has parsed the version information from header of
+   *  the dumpsteeam within the parsing session represented by
+   *  @parse_baton. The version information is available in @a
+   *  version, and a scratch pool @a pool is available for any
+   *  temporary allocations.
+   */
+  svn_error_t *(*dumpstream_version_record)(int version,
+void *parse_baton,
+apr_pool_t *pool);
+
+  /** The parser has discovered a new revision record within the
+   * parsing session represented by @a parse_baton.  All the headers are
+   * placed in @a headers (allocated in @a pool), which maps ttconst
+   * char */tt header-name == ttconst char */tt header-value.
+   * The @a revision_baton received back (also allocated in @a pool)
+   * represents the revision.
+   */
+  svn_error_t *(*new_revision_record)(void **revision_baton,
+  apr_hash_t *headers,
+  void *parse_baton,
+  apr_pool_t *pool);
+
+  /** The parser has discovered a new uuid record within the parsing
+   * session represented by @a parse_baton.  The uuid's value is
+   * @a uuid, and it is allocated in @a pool.
+   */
+  svn_error_t *(*uuid_record)(const char *uuid,
+  void *parse_baton,
+  apr_pool_t *pool);
+
+  /** The parser has discovered a new node record within the current
+   * revision represented by @a revision_baton.  All the headers are
+   * placed in @a headers (as with @c new_revision_record), allocated in
+   * @a pool.  The @a node_baton received back is allocated in @a pool
+   * and represents the node.
+   */
+  svn_error_t *(*new_node_record)(void **node_baton,
+  apr_hash_t *headers,
+  void *revision_baton,
+  apr_pool_t *pool);
+
+  /** For a given @a revision_baton, set a property @a name to @a
+   *  value. Scratch pool @a pool is available for any temporary
+   *  allocations.
+   */
+  svn_error_t *(*set_revision_property)(void *revision_baton,
+const char *name,
+const svn_string_t *value,
+apr_pool_t *pool);
+
+  /** For a given @a node_baton, set a property @a name to @a
+   *  value. Scratch pool @a pool is available for any temporary
+   *  allocations.
+   */
+  svn_error_t *(*set_node_property)(void *node_baton,
+const char *name,
+const svn_string_t *value,
+apr_pool_t *pool);
+
+  /** For a given @a node_baton, delete property @a name. Scratch pool
+* @a pool is available for any temporary allocations.
+*/
+  svn_error_t *(*delete_node_property)(void *node_baton,
+   const char *name,
+   apr_pool_t *pool);
+
+  /** For a given @a node_baton, remove all properties. Scratch pool
+  @a pool is available for any temporary allocations. */
+  svn_error_t *(*remove_node_props)(void *node_baton, apr_pool_t *pool);
+
+  /** For a given @a node_baton, receive a writable @a stream capable of
+   * receiving the node's fulltext.  After writing the fulltext, call
+   * the stream's close() function.
+   *
+   * If a @c NULL is returned instead of a stream, the vtable is
+   * indicating that no text is desired, and the parser will not
+   * attempt to send it.
+   *
+   * Scratch pool @a pool is available for any temporary allocations.
+   */
+  svn_error_t *(*set_fulltext)(svn_stream_t **stream,
+   void *node_baton,
+   apr_pool_t *pool);
+
+  /** For a given @a node_baton, set @a handler and @a handler_baton
+   * to a window handler and baton capable of receiving a delta
+   * against the node's previous 

[PATCH v2] New dumpstream parser to check version number

2010-08-18 Thread Ramkumar Ramachandra
[[[
* subversion/libsvn_repos/load.c
  (svn_repos_parse_dumpstream2, svn_repos_parse_dumpstream3): Rename
  the older function and add a version_number argument; error out if
  there's a version mismatch.

* subversion/include/svn_repos.h
  (svn_repos_parse_dumpstream2, svn_repos_parse_dumpstream3): Add new
  function and mark svn_repos_parse_dumpstream2 as deprecated.

* subversion/svnrdump/load_editor.c
  (drive_dumpstream_loader): Update to use the new API, and call it
  with version_number 3.
]]]

Index: subversion/include/svn_repos.h
===
--- subversion/include/svn_repos.h  (revision 986884)
+++ subversion/include/svn_repos.h  (working copy)
@@ -2286,6 +2286,7 @@ svn_repos_node_from_baton(void *edit_baton);
 /* The RFC822-style headers in our dumpfile format. */
 #define SVN_REPOS_DUMPFILE_MAGIC_HEADERSVN-fs-dump-format-version
 #define SVN_REPOS_DUMPFILE_FORMAT_VERSION   3
+#define SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION-1
 #define SVN_REPOS_DUMPFILE_UUID  UUID
 #define SVN_REPOS_DUMPFILE_CONTENT_LENGTHContent-length
 
@@ -2646,6 +2647,11 @@ typedef svn_repos_parse_fns2_t svn_repos_parser_fn
  * @a cancel_baton as argument to see if the client wishes to cancel
  * the dump.
  *
+ * If @a exact_version is SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION,
+ * it is ignored and the dumpstream is parsed without this
+ * information. Else, the function checks the dumpstream's version
+ * number, and errors out if there's a mismatch.
+ *
  * This parser has built-in knowledge of the dumpfile format, but only
  * in a general sense:
  *
@@ -2661,17 +2667,33 @@ typedef svn_repos_parse_fns2_t svn_repos_parser_fn
  * This is enough knowledge to make it easy on vtable implementors,
  * but still allow expansion of the format:  most headers are ignored.
  *
- * @since New in 1.1.
+ * @since New in 1.7.
  */
 svn_error_t *
-svn_repos_parse_dumpstream2(svn_stream_t *stream,
+svn_repos_parse_dumpstream3(svn_stream_t *stream,
 const svn_repos_parse_fns2_t *parse_fns,
 void *parse_baton,
 svn_cancel_func_t cancel_func,
 void *cancel_baton,
+int exact_version,
 apr_pool_t *pool);
 
+/**
+ * Similar to svn_repos_parse_dumpstream3(), but with @a exact_version
+ * always set to SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION.
+ *
+ * @since New in 1.1.
+ * @deprecated Provided for backward compatibility with the 1.6 API.
+ */
+SVN_DEPRECATED
+svn_error_t *svn_repos_parse_dumpstream2(svn_stream_t *stream,
+const svn_repos_parse_fns2_t *parse_fns,
+void *parse_baton,
+svn_cancel_func_t cancel_func,
+void *cancel_baton,
+apr_pool_t *pool);
 
+
 /**
  * Set @a *parser and @a *parse_baton to a vtable parser which commits new
  * revisions to the fs in @a repos.  The constructed parser will treat
Index: subversion/libsvn_repos/load.c
===
--- subversion/libsvn_repos/load.c  (revision 986884)
+++ subversion/libsvn_repos/load.c  (working copy)
@@ -654,14 +654,29 @@ parse_format_version(const char *versionstring, in
 }
 
 
+svn_error_t *
+svn_repos_parse_dumpstream2(svn_stream_t *stream,
+const svn_repos_parse_fns2_t *parse_fns,
+void *parse_baton,
+svn_cancel_func_t cancel_func,
+void *cancel_baton,
+apr_pool_t *pool)
+{
+  return svn_repos_parse_dumpstream3(stream, parse_fns, parse_baton,
+ cancel_func, cancel_baton,
+ SVN_REPOS_DUMPFILE_FORMAT_UNKNOWN_VERSION,
+ pool);
+}
 
+
 /* The Main Parser Logic */
 svn_error_t *
-svn_repos_parse_dumpstream2(svn_stream_t *stream,
+svn_repos_parse_dumpstream3(svn_stream_t *stream,
 const svn_repos_parse_fns2_t *parse_fns,
 void *parse_baton,
 svn_cancel_func_t cancel_func,
 void *cancel_baton,
+int exact_version,
 apr_pool_t *pool)
 {
   svn_boolean_t eof;
@@ -690,6 +705,15 @@ svn_error_t *
 return svn_error_createf(SVN_ERR_STREAM_MALFORMED_DATA, NULL,
  _(Unsupported dumpfile version: %d), version);
 
+  /* Error out if exact_version doesn't match version exactly */
+  /* ### TODO: Extend svn_parse_fns2_t to provide another callback for
+ ### version number; that way we can impose arbitrary constraints
+ ### on it, not just check for exact version */
+