URI-encoding on 1.7 repository?

Garret Wilson Fri, 20 Jan 2012 10:39:14 -0800

What is the canonical way to encode filenames, both in the API and in the underlying FSFS, in a Subversion 1.7 repository?

Let's say I have the file "a b.txt", which consists of "a" and "b" with a space in between. How should this be stored on the server? How should the various APIs give it to me?

Let me explain further. If I commit a file on Windows 7 Professional 64-bit on an NTFS partition using TortoiseSVN, and then turn around and read that repository using SVNKit, SVNDirEntry.getRelativePath() gives me "a b.txt". I don't know if on the back end these files are being stored as "a b.txt", or if they are being stored in canonical URI form (i.e. "a%20b.txt") and SVNKit is just being "helpful" by decoding them.
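To make the two possibilities concrete, here is a minimal sketch in Python. It is not Subversion or SVNKit code and says nothing about what either actually does; it only uses the standard urllib.parse module to show the decoded and percent-encoded forms of the same name, and why it matters to know which form an API returns. The variable names are purely illustrative.

```python
from urllib.parse import quote, unquote

# The filename as a user sees it (decoded form).
name = "a b.txt"

# The same name in percent-encoded URI form, using UTF-8 for the bytes.
encoded = quote(name)
print(encoded)            # a%20b.txt

# Decoding the URI form gives back the original name.
print(unquote(encoded))   # a b.txt

# The ambiguity: a file whose name literally contains "%20".
# If an API returns "a%20b.txt", it could be this file in decoded form
# or "a b.txt" in encoded form; the string alone cannot tell you which.
literal = "a%20b.txt"
print(quote(literal))     # a%2520b.txt
```

If the API always returns the decoded form, a caller holding a URI has to decode every escape exactly once; if it always returns the encoded form, no decoding is needed at all. Mixing the two is where names like the last one get corrupted.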

On my end I'm starting with 100% canonically encoded URIs. If Subversion is storing these things in decoded form on the back end, does it compensate for characters not supported by the underlying file system? When I take my URI and decode it just so I can save the filename the way Subversion likes, how do I know which characters to decode (those supported by the underlying file system, as if I, the client, know what that is!) and which characters to leave encoded (those not supported by the underlying file system on the server)?

Maybe someone can set me straight here. I'm hoping that Subversion stores everything as correctly UTF-8 encoded and escaped URIs in the back end and in its APIs, and that the real culprit here is SVNKit for being "helpful" and decoding the strings for me without asking. I suppose the other option that would work almost as well is if everything on the back end were stored in decoded form, but some tricks were pulled so that /all/ characters are supported, regardless of the underlying file system. The case I don't want to end up in is one where I have to encode some characters but not others based upon some file system implementation on the server that I don't know about.


Thanks for shedding some light on this.

Garret
