On Thu, Apr 22, 2010 at 05:07:56PM -0400, Greg Stein wrote: > On Thu, Apr 22, 2010 at 15:48, Daniel Näslund <dan...@longitudo.com> wrote: > > >> read_info() returns changed_* information for copied nodes. A copied > >> node has a corresponding node in the repository, and the changed_* > >> information reflects that. (thus, your assemble_status should not be > >> wiping the values out... if the caller wants to interpret added nodes > >> that way, then it can) > > >> Personally, I would suggest that assemble_status() does NOTHING more > >> than returning read_info's REVISION result. That is the "ideal" value > >> for the revision of the node LOCAL_ABSPATH. > >> > >> If that revision is NOT what you want (ie. you have some other rules > >> like "well... if it is added, then ..." or "if it is deleted, then > >> ..."), then the caller should perform those calculations explicitly. > >> The status code should not pre-suppose what rules the caller may want > >> for revision handling. It should simply report the "ideal" revision > >> for that node and be done with it. > > > > WHAT? You're saying that all the scanning should take place in the > > client callback? > > No. I said "don't create strange semantics". > > The code you had in there had the following definition for status->revision: > > "The revision of the BASE node, unless that has been deleted, then: if > nothing was added, then still the same BASE revision, but if something > was added, then SVN_INVALID_REVNUM. BUT! If you delete a child of that > addition, then the BASE version MAY be returned again... we say "may" > because if a BASE doesn't correspond to the deleted child of the add, > then you'll get an error instead." > > The definition that read_info uses is: > > "The revision of the unmodified (BASE) node, or SVN_INVALID_REVNUM if > any (structural) changes have been made to that node." > > (where "structural" means an add/copy/move/delete; we don't mean > content edits like propchanges or editing the text) > > > That's a major change from what I've done so far. That > > I should just return SVN_INVALID_REVNUM for cases where I've scanned for > > deletion? Is that a behaviour that everyone agrees on? You're saying > > personally so I'm assuming there hasn't been a discussion about it. > > No, I don't think it should do a scan_add, nor a scan_deletion. Report > the add/delete, and leave it at that. > > repos_relpath, repos_root_url, and repos_uuid are all inherited > values, so I believe a scan_base_repos makes sense: you're simply > returning those values. You're not looking for further details about > operations (which the caller may not be interested in anyways). > > (and note that scan_base_repos doesn't even have to be used cuz > assemble_status could be passed the parent repos_* values; if the BASE > node hasn't been switched and (therefore) returns NULL for repos_*, > then its repos_* values are easily derived within wc/status.c; but > repos_* values will always be NULL for modified nodes (ie. if a > WORKING node exists)) > > Scanning additional rows is not "free", so we shouldn't be looking up > information from the ancestors unless/until we find that most/all of > the callers need that information. And in some cases, we can even make > status.c fill in the info, since it has seen the ancestors already. > > In the future, we may return added/copied/moved_here distinctly so > that a scan won't be needed to resolve that (tho you'd still have to > scan to locate the root of the operation, tho we may be able to > optimize that inside wc_db).
I set out to replace the entry field in svn_wc_statusX_t. I thought that ideally I could just keep the semantics from the entry but move out the fields to svn_wc_status2_t. I wanted to keep the current behaviour since the status code is involved in so _many_ tests and rewriting all of those would cause me to loose momentum. Guess, what? I've already lost that momentum. :) I guess I should do what you suggested: Make assemble_status() simpler to insure we that we can easily describe what it's returning. On the other hand, I don't see the scanning process as something expensive comparing to 'text compares'. And I question if the API we're exposing really should demand that the callers do any extra work for fetching something as fundamental as a revision. AFAIK, the only thing that is not clear is how we handle working changes below a deleted node. If we get that one settled, then why not return the _right_ revision at once? I mean, for a VCS where the notion of globally recognized identifiers for change sets is a central part, there should be little vagueness as to which one we really mean for a path! The users of our API will hardly want to use another behaviour than what the CLI exposes. Since I'm easily persuaded, below is my idea of what to do to get to the point where assemble_status() only returns the most basic info about the revision of a node. (I really should skip step (i) and rewrite it to be part (ii) at once but I've put so much effort into that patch. I just want to commit it! Hardly a good argument, though.) Suggested work flow ==================== i) Commit the current patch with a few mods to correct the behaviour for changed_rev, entries-checking and some '###' comments in assemble_status() clearly stating our intent to return a more well-defined revision (less ambiguity). ii) Create a svn_wc__node_get_revision_of_deleted() or similar to invoke when status->text_status == svn_wc_status_deleted. That one should be able to cope with changes below the base_del_abspath when returning the abspath. Remove relevant parts from assemble_status() and invoke the node function instead from svn/status.c. iii) Replace the status->entry->lock_{token, comment,...} with status->lock_{token, comment, ... } iv) Remove the entry field and use another way for checking if a path is versioned or not instead of just checking if status->entry is set. v) Create the svn_status2_from_3() logic to create an entry for svn_wc_status2_t. vi) Change the output of 'svn status' to set uncommitted paths as having rev ' ' instead of '0'. vii) Do the same for 'svn info' after removing the uses of entries in there. build_info_for_entry() is a rat nest of entry usages. Thanks, Daniel