Oh, and in case I wasn't clear, I'm referring to a Subversion
repository, not a local copy. And I'm referring to the top-most API. If
some of the lower layers are more restrictive than the top-most API,
then they should use some encoding scheme (what, I don't care) to shield
this platform-specific restriction from the top-level API---which is
what I thought Daniel was saying at first.
Garret
On 1/20/2012 7:28 PM, Garret Wilson wrote:
On 1/20/2012 7:00 PM, Daniel Shahaf wrote:
Garret Wilson wrote on Fri, Jan 20, 2012 at 18:18:24 -0800:
On 1/20/2012 6:14 PM, Daniel Shahaf wrote:
You don't care what FS backend the server runs. All you care is
that the endpoint of svn_ra_open4() implements the Subversion RA
API properly. Normal Subversion servers use svn_fs.h which in turn
presents the same API _regardless of which backend is used_. I'll
spell it out: the notion of 'valid pathname in a Subversion
filesystem' does not depend on the FS backend in use.
All that is good news. So I guess the important question is: what
spells out "the notion of 'valid pathname on a Subversion
filesystem'"? Is it "any valid Unicode code point?" What I'm getting
See my previous reply.
Right. So your previous reply said that a "valid pathname" is the same
on all platforms, and that the underlying implementation will take
care of the details. I'm asking what are the rules for a "valid
pathname". I'm glad that these rules are the same across all
platforms, but I don't know what the rules are. In other words, what
goes in the following function?
boolean isValidSubversionPathname(String pathname);
at is that I need to know which characters, if any, I need to encode
before passing them to Subversion. If Subversion supports any
Unicode character, I can just pass the path decoded and sleep
soundly at night. If not, I need to know which ones to decode and
which ones to pass through.
Err, that depends on what API layer you're working with. (For example:
svn_fs.h is perfectly happy with :,*,\n as part of the basename, but
libsvn_wc on windows, and the mergeinfo logic, aren't.)
Oh, that's bad news. In your previous reply you said, "the notion of
'valid pathname in a Subversion
filesystem' does not depend on the FS backend in use." Now you seem to
say "whether some pathname is valid or not depends on whether you're
on Windows or some other platform." (Even worse, you seem to be
saying that the notion of "valid pathname" isn't even consistent
across the API.)
And 'what to encode/decode' is a rather vague question. I'm not sure if
it means "Does `svn info uri:///foo bar` == `svn info uri:///foo%20bar`?"
or something else. Can you be more concrete?
It doesn't matter. It's some black box that works like this:
String encode(String input);
String decode(String output);
I can come up with a thousand ways to encode/decode. I can use %hh. I
can use ^0xhh. The only two requirements are that 1) encode() provides
me with a string guaranteed to be a valid pathname, and 2) decode()
will take the encoded string and give me back the decoded string I
started with.
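To make the black box concrete, here is a minimal sketch of the %hh variant. The class name PathCodec and the ASSUMED_INVALID set are hypothetical placeholders: the set simply reuses the ":", "*" and "\n" examples mentioned earlier in this thread and is not a confirmed Subversion rule. The escape character '%' is always encoded too, so that decode() can undo encode() unambiguously.

```java
final class PathCodec {
    // ASSUMPTION: placeholder set of characters to escape. The real set is
    // exactly what this thread is trying to find out. ASCII only, so two hex
    // digits are always enough.
    private static final String ASSUMED_INVALID = ":*\n";

    /** Replaces '%' and every assumed-invalid character with %hh. */
    static String encode(String input) {
        StringBuilder sb = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '%' || ASSUMED_INVALID.indexOf(c) >= 0) {
                sb.append('%').append(String.format("%02X", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Reverses encode(): every %hh becomes the character with that code. */
    static String decode(String encoded) {
        StringBuilder sb = new StringBuilder(encoded.length());
        for (int i = 0; i < encoded.length(); i++) {
            char c = encoded.charAt(i);
            if (c == '%') {
                if (i + 2 >= encoded.length()) {
                    throw new IllegalArgumentException("truncated escape at index " + i);
                }
                sb.append((char) Integer.parseInt(encoded.substring(i + 1, i + 3), 16));
                i += 2; // skip the two hex digits just consumed
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

With this placeholder set, PathCodec.encode("a:b*c") gives "a%3Ab%2Ac", and decode() of that result gives back "a:b*c". Swapping in a different invalid set changes only the ASSUMED_INVALID constant, which is why the set itself is the open question.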
But to meet requirement #1, I have to know which characters are
considered valid and which aren't. That's what I don't know, and
that's what I'm asking:
1. Does the API guarantee that a "valid pathname" (whatever that is)
is the same across all platforms? I thought you said yes, but now
it seems you're saying no. (If you say "no", then there's no point
in answering question 2, because we're stuck---I can write code
that may work with one repository on one platform, but suddenly
fail when I move the same data to another platform.)
2. What is the definition of "valid pathname"? Is it any Unicode
character? Is it only XML name characters? Is it any Unicode
character except control characters and NULL (\u0000)?
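To show what an answer to question 2 would let me write, here are two of the candidate definitions above expressed as predicates. This is only a sketch: the class and method names are made up, and neither rule is confirmed to be what Subversion actually enforces. The XML-name candidate is left out for brevity.

```java
final class CandidatePathRules {
    /**
     * Candidate: "any Unicode character". A Java String can hold unpaired
     * surrogates, which are not Unicode characters, so those are rejected.
     */
    static boolean isAnyUnicode(String pathname) {
        for (int i = 0; i < pathname.length(); ) {
            int cp = pathname.codePointAt(i);
            // codePointAt returns the bare surrogate value when it is unpaired.
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    /**
     * Candidate: "any Unicode character except control characters and NULL".
     * Character.isISOControl covers U+0000..U+001F and U+007F..U+009F,
     * so \u0000 is already excluded by it.
     */
    static boolean isUnicodeWithoutControls(String pathname) {
        return isAnyUnicode(pathname)
                && pathname.codePoints().noneMatch(Character::isISOControl);
    }
}
```

The two candidates already disagree on a basename containing "\n": the first accepts it and the second rejects it, which is the kind of difference I need pinned down.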
Sorry if I'm not clear. It's a very simple question, and I hope I'm
not making it more complicated than it is.
Think about it this way: pretend you have an XML document with the
element <a-b>. You walk the DOM of that document on Windows, and it
works fine. But when you try to process the DOM on a Mac, it breaks, with your
XML processor saying, "sorry, an XML name cannot have a '-'
character". That will never happen. Why? Because (these are analogous
questions to the ones above concerning Subversion):
1. The XML specification guarantees that all XML processors agree on
what an XML name is.
2. Specifically, an XML name is composed of a NameStartChar followed
by zero or more NameChars, as defined here:
http://www.w3.org/TR/REC-xml/#NT-Name
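Because the XML rule is written down, anyone can turn it into code that behaves the same on every platform. The sketch below is my own transcription of the NameStartChar and NameChar ranges from the XML 1.0 (Fifth Edition) production linked above; the class name XmlNames is made up, and the ranges should be checked against the specification before relying on them.

```java
final class XmlNames {
    private static boolean in(int c, int lo, int hi) {
        return c >= lo && c <= hi;
    }

    /** NameStartChar, as transcribed from the XML 1.0 Name production. */
    static boolean isNameStartChar(int c) {
        return c == ':' || c == '_' || in(c, 'A', 'Z') || in(c, 'a', 'z')
                || in(c, 0xC0, 0xD6) || in(c, 0xD8, 0xF6) || in(c, 0xF8, 0x2FF)
                || in(c, 0x370, 0x37D) || in(c, 0x37F, 0x1FFF)
                || in(c, 0x200C, 0x200D) || in(c, 0x2070, 0x218F)
                || in(c, 0x2C00, 0x2FEF) || in(c, 0x3001, 0xD7FF)
                || in(c, 0xF900, 0xFDCF) || in(c, 0xFDF0, 0xFFFD)
                || in(c, 0x10000, 0xEFFFF);
    }

    /** NameChar: every NameStartChar plus '-', '.', digits and a few extras. */
    static boolean isNameChar(int c) {
        return isNameStartChar(c) || c == '-' || c == '.' || in(c, '0', '9')
                || c == 0xB7 || in(c, 0x0300, 0x036F) || in(c, 0x203F, 0x2040);
    }

    /** Name ::= NameStartChar (NameChar)* */
    static boolean isName(String s) {
        if (s.isEmpty()) {
            return false;
        }
        int first = s.codePointAt(0);
        if (!isNameStartChar(first)) {
            return false;
        }
        for (int i = Character.charCount(first); i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (!isNameChar(cp)) {
                return false;
            }
            i += Character.charCount(cp);
        }
        return true;
    }
}
```

Here XmlNames.isName("a-b") is true on Windows, on a Mac, and anywhere else, because the answer comes from the specification and not from the platform. That is the property I am asking about for Subversion pathnames.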
Does that make sense? Can we answer those same two questions
concerning Subversion pathnames?
Garret