Oh, and in case I wasn't clear, I'm referring to a Subversion repository, not a local copy. And I'm referring to the top-most API. If some of the lower layers are more restrictive than the top-most API, then they should use some encoding scheme (which one, I don't care) to shield this platform-specific restriction from the top-level API, which is what I thought Daniel was saying at first.

Garret


On 1/20/2012 7:28 PM, Garret Wilson wrote:
On 1/20/2012 7:00 PM, Daniel Shahaf wrote:
Garret Wilson wrote on Fri, Jan 20, 2012 at 18:18:24 -0800:
On 1/20/2012 6:14 PM, Daniel Shahaf wrote:
You don't care what FS backend the server runs. All you care is
that the endpoint of svn_ra_open4() implements the Subversion RA
API properly. Normal Subversion servers use svn_fs.h which in turn
presents the same API _regardless of which backend is used_. I'll
spell it out: the notion of 'valid pathname in a Subversion
filesystem' does not depend on the FS backend in use.
All that is good news. So I guess the important question is: what
spells out "the notion of 'valid pathname on a Subversion
filesystem'"? Is it "any valid Unicode code point?" What I'm getting
See my previous reply.

Right. So your previous reply said that a "valid pathname" is the same on all platforms, and that the underlying implementation will take care of the details. I'm asking what are the rules for a "valid pathname". I'm glad that these rules are the same across all platforms, but I don't know what the rules are. In other words, what goes in the following function?

boolean isValidSubversionPathname(String pathname);



at is that I need to know which characters, if any, I need to encode
before passing them to Subversion. If Subversion supports any
Unicode character, I can just pass the path decoded and sleep
soundly at night. If not, I need to know which ones to decode and
which ones to pass through.
Err, that depends on what API layer you're working with.  (For example:
svn_fs.h is perfectly happy with :,*,\n as part of the basename, but
libsvn_wc on windows, and the mergeinfo logic, aren't.)

Oh, that's bad news. In your previous reply you said, "the notion of 'valid pathname in a Subversion filesystem' does not depend on the FS backend in use." Now you seem to say "whether some pathname is valid or not depends on whether you're on Windows or some other platform." (Even worse, you seem to be saying that the notion of "valid pathname" isn't even consistent across the API.)

And 'what to encode/decode' is a rather vague question.  I'm not sure if
it means "Does `svn info uri:///foo bar` == `svn info uri:///foo%20bar`?"
or something else.  Can you be more concrete?

It doesn't matter. It's some black box that works like this:

String encode(String input);
String decode(String output);

I can come up with a thousand ways to encode/decode. I can use %hh. I can use ^0xhh. The only two requirements are that 1) encode() provides me with a string guaranteed to be a valid pathname, and 2) decode() will take the encoded string and give me back the decoded string I started with.
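
To make the black box concrete, here is a minimal sketch of the %hh variant. The class name PathCodec and the "safe" character set in isSafe() are assumptions made purely for illustration; they are not anything Subversion documents. The sketch only shows that both requirements can be met once a set of guaranteed-valid characters is known.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Hypothetical %hh codec; not part of any Subversion API. */
public final class PathCodec {

    // ASSUMPTION: these ASCII characters are valid in a pathname everywhere.
    // The real set is exactly what this thread is asking about.
    private static boolean isSafe(int b) {
        return (b >= 'a' && b <= 'z') || (b >= 'A' && b <= 'Z')
            || (b >= '0' && b <= '9')
            || b == '-' || b == '_' || b == '.' || b == '/';
    }

    /** Escapes every UTF-8 byte outside the safe set as %HH (including '%' itself). */
    public static String encode(String input) {
        StringBuilder sb = new StringBuilder();
        for (byte raw : input.getBytes(StandardCharsets.UTF_8)) {
            int b = raw & 0xFF;
            if (isSafe(b)) {
                sb.append((char) b);
            } else {
                sb.append('%').append(String.format("%02X", b));
            }
        }
        return sb.toString();
    }

    /** Reverses encode(); expects a string that encode() produced. */
    public static String decode(String encoded) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%') {
                // The two characters after '%' are the hex value of one byte.
                out.write(Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

Because '%' is not in the safe set, it is always escaped, so decode(encode(s)) returns s for any well-formed Unicode string. Everything else hinges on isSafe() matching the real rules.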

But to meet requirement #1, I have to know which characters are considered valid and which aren't. That's what I don't know, and that's what I'm asking:

 1. Does the API guarantee that a "valid pathname" (whatever that is)
    is the same across all platforms? I thought you said yes, but now
    it seems you're saying no. (If you say "no", then there's no point
    in answering question 2, because we're stuck---I can write code
    that may work with one repository on one platform, but suddenly
    fail when I move the same data to another platform.)
 2. What is the definition of "valid pathname"? Is it any Unicode
    character? Is it only XML name characters? Is it any Unicode
    character except control characters and NULL (\u0000)?

Sorry if I'm not clear. It's a very simple question, and I hope I'm not making it more complicated than it is.

Think about it this way: pretend you have an XML document with the element <a-b>. You walk the DOM of that document on Windows, and it works fine. But when you try to process the DOM on a Mac, it breaks, with your XML processor saying, "sorry, an XML name cannot have a '-' character". That will never happen. Why? Because (these are analogous questions to the ones above concerning Subversion):

 1. The XML specification guarantees that all XML processors agree on
    what an XML name is.
 2. Specifically, an XML name is composed of a NameStartChar followed
    by zero or more NameChars, as defined here:
    http://www.w3.org/TR/REC-xml/#NT-Name
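
To show what such a platform-independent definition looks like in code, here is a sketch of the XML Name production as a Java predicate. The class and method names are hypothetical; the character ranges are transcribed from the NameStartChar and NameChar productions at the URL above. This is the kind of function I would like to be able to write for Subversion pathnames.

```java
/** Hypothetical helper that checks the XML "Name" production. */
public final class XmlNames {

    // NameStartChar ranges from the XML 1.0 (Fifth Edition) grammar.
    static boolean isNameStartChar(int c) {
        return c == ':' || c == '_'
            || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')
            || (c >= 0xC0 && c <= 0xD6) || (c >= 0xD8 && c <= 0xF6)
            || (c >= 0xF8 && c <= 0x2FF) || (c >= 0x370 && c <= 0x37D)
            || (c >= 0x37F && c <= 0x1FFF) || (c >= 0x200C && c <= 0x200D)
            || (c >= 0x2070 && c <= 0x218F) || (c >= 0x2C00 && c <= 0x2FEF)
            || (c >= 0x3001 && c <= 0xD7FF) || (c >= 0xF900 && c <= 0xFDCF)
            || (c >= 0xFDF0 && c <= 0xFFFD) || (c >= 0x10000 && c <= 0xEFFFF);
    }

    // NameChar adds '-', '.', digits, and a few combining/extender ranges.
    static boolean isNameChar(int c) {
        return isNameStartChar(c) || c == '-' || c == '.'
            || (c >= '0' && c <= '9') || c == 0xB7
            || (c >= 0x300 && c <= 0x36F) || (c >= 0x203F && c <= 0x2040);
    }

    /** True if s matches Name ::= NameStartChar (NameChar)*. */
    public static boolean isName(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!isNameStartChar(first)) {
            return false;
        }
        // Iterate by code point so characters above U+FFFF are handled.
        for (int i = Character.charCount(first); i < s.length(); ) {
            int c = s.codePointAt(i);
            if (!isNameChar(c)) {
                return false;
            }
            i += Character.charCount(c);
        }
        return true;
    }
}
```

The result depends only on the code points in the string, never on the operating system, which is why "a-b" is a valid name on every conforming processor.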

Does that make sense? Can we answer those same two questions concerning Subversion pathnames?

Garret
