On 02.08.2011 at 08:02, Jonathan M Davis <[email protected]> wrote:

"file." and "file" do _not_ have the same extension. One has an empty extension whereas the other has none.

Still, I would expect a get-extension function to return the empty string for both. Why is that so? As Wikipedia states, the interpretation depends on the filesystem (or maybe on the originating OS, but nowadays you can use ext3 on Windows and NTFS on Linux).

But others seem to have problems as well:

Trailing dots disappear in Samba:
http://lists.samba.org/archive/rsync/2002-September/003636.html

On Windows files ending in a dot cannot be deleted:
http://cygwin.com/ml/cygwin/2004-01/msg00848.html
http://blog.dotsmart.net/2008/06/12/solved-cannot-read-from-the-source-file-or-disk/

Mozilla on Linux cannot open files ending in a dot:
https://bugzilla.mozilla.org/show_bug.cgi?id=149586

The file extension is whatever follows the last dot.
On Windows it cannot be empty, so 'foo.' will be an inaccessible file.
Yet 'foo..bar' is perfectly fine, which is causing us trouble now: 'foo.' is 'foo..bar' stripped of its extension, but 'foo.' itself, while valid on POSIX, is an ambiguous name on Windows.

Camp A thinks:
- it has no extension as long as the dot isn't followed by one
- changing the extension must result in 'foo..ext'
- getExtension should never return null, but be either '' or include the dot as in '.ext'
- disassembling and reassembling a filename by string concatenation should return the original filename in all cases
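
Camp A's rules can be sketched in a few lines of Python. This is only an illustration of the position as described above, not any library's actual behaviour; the names split_camp_a and set_extension_camp_a are made up for this example, and directory separators and leading-dot names are ignored.

```python
def split_camp_a(name):
    """Split name into (stem, ext) under Camp A's rules (hypothetical helper).

    ext is either '' or starts with a dot, so stem + ext == name always holds.
    Directory separators and leading-dot names are ignored in this sketch.
    """
    idx = name.rfind('.')
    # No dot, or a dot with nothing after it: there is no extension.
    if idx == -1 or idx == len(name) - 1:
        return name, ''
    return name[:idx], name[idx:]


def set_extension_camp_a(name, ext):
    """Replace the extension; ext is expected to include its dot, e.g. '.ext'."""
    stem, _ = split_camp_a(name)
    return stem + ext


for name in ('foo', 'foo.', 'foo.bar', 'foo..bar'):
    stem, ext = split_camp_a(name)
    assert stem + ext == name  # round trip by plain concatenation
    print(repr(name), '->', repr(stem), repr(ext))

print(set_extension_camp_a('foo.', '.ext'))  # foo..ext
```

Because a trailing dot counts as part of the stem here, 'foo.' keeps its dot when an extension is added.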

Camp B thinks:
- no dot = no extension, otherwise what follows the dot is the extension
- changing the extension must result in 'foo.ext'
- getExtension returns null if no dot is found, an empty string if the filename ends in a dot, and otherwise whatever follows the dot
- disassembling and reassembling a filename isn't a portable process
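
Camp B's rules, sketched the same way in Python. Again this is only an illustration with made-up names (get_extension_camp_b, set_extension_camp_b), with None standing in for null and directory separators ignored.

```python
def get_extension_camp_b(name):
    """Return None if there is no dot, else whatever follows the last dot.

    The result is '' when the name ends in a dot (hypothetical helper).
    """
    _, dot, ext = name.rpartition('.')
    return ext if dot else None


def set_extension_camp_b(name, ext):
    """Replace the extension; ext is given without a dot, e.g. 'ext'."""
    stem, dot, _ = name.rpartition('.')
    if not dot:
        # rpartition returns ('', '', name) when no dot is found.
        stem = name
    return stem + '.' + ext


for name in ('foo', 'foo.', 'foo.bar'):
    print(repr(name), '->', repr(get_extension_camp_b(name)))

# 'foo' and 'foo.' both end up as 'foo.ext', so the original name
# cannot be recovered by plain concatenation afterwards.
print(set_extension_camp_b('foo', 'ext'), set_extension_camp_b('foo.', 'ext'))
```

The distinction between "no extension" and "empty extension" survives only in the None versus '' return value, which is why reassembly is not a simple concatenation under these rules.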

I started in camp A, but now I'm really caught in the middle: camp B's arguments make just as much sense. Funnily enough, even Sun avoided file extension methods in their Java File class, so I checked Python for that matter (splitext returns a (root, ext) pair; only the extension part is shown):
os.path.splitext("foo.bar")[1] -> '.bar'
os.path.splitext("foo.")[1] -> '.'
os.path.splitext("foo")[1] -> ''
Although there is no routine to change the extension, the obvious approach would result in changeExt('foo.', '.bar') == 'foo.bar'.
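
A minimal sketch of that "obvious approach", assuming the new extension is passed with its leading dot. change_ext is a made-up helper, not part of the Python standard library; only os.path.splitext is real.

```python
import os.path


def change_ext(name, new_ext):
    """Swap the extension via os.path.splitext (hypothetical helper).

    new_ext is expected to include its leading dot, e.g. '.bar'.
    """
    root, _ = os.path.splitext(name)
    return root + new_ext


print(change_ext('foo.', '.bar'))      # foo.bar
print(change_ext('foo', '.bar'))       # foo.bar
print(change_ext('foo..bar', '.ext'))  # foo..ext
```

Since splitext treats the trailing '.' of 'foo.' as the extension, it is replaced along with everything else after the last dot.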

This is what Jonathan prefers, and I agree with this solution now that I have made up my mind. It's just inconvenient that under this convention you cannot change the extension of 'Keep my dot.' in a way that the result is 'Keep my dot..ext'.
