Hi Alistair, I think it is great that you are working on FileSystem and are trying to contribute. I appreciate your effort to find consensus.
But, and please take this as positive criticism, I feel a bit uneasy when I read your reasoning, but maybe I am wrong. You see, the way I understand FileSystem (what I think was the original design idea), it is a cross platform abstraction over concrete file systems and paths. A FileReference holds a FileSystem and a Path. A Path is just a collection of elements that leads to a location, the leaf possibly being interpreted as a file with an extension (although that last point is just a convention). It seems great effort was put into avoiding the use of separators. Pharo is an object oriented system with appropriate abstractions, FileSystem gives us one approach to this difficult problem. In your reasoning, your truth, your reference is always the Unix file system and the way paths are handled there. You expect a number of things based on that, but maybe that was/is not the design goal. I said this before, but I am not sure we should interpret the argument of #/. I think that FileLocator root / 'foo' / 'bar' / 'readme.txt'. is more abstract (fitting to the original goal) then '/foo' asFileReference / 'bar/readme.txt'. because it totally avoids a reference to the platform dependent path separator. The same goes for '..' as (FileLocator root / 'foo' / 'bar' / 'down') parent / 'readme.txt'. is a more object oriented way to write '/foo/bar/down/../readme.txt', IMO. Windows is an important target for Pharo, they use $\ not $/. Do you see my point ? BTW, I consider the fact that current, '/foo' asFileReference / 'bar/readme.txt' works with #exists even though 'bar/readme.txt' was not fully parsed/resolved a bug, or a happy coincidence at best. Another point is the distinction between internal/external and/or concrete/abstract paths. Should something like '..' remain part of a path. Never, always, only when it can be resolved ? What about special characters ? Should we support ~ ? Sometimes I would like that too, but maybe FileLocator home is enough. All this being said, I think FileSystem can and should be improved, but carefully. It would be good if more people joined this discussion. Sven > On 26 Jun 2017, at 10:12, Alistair Grant <[email protected]> wrote: > > Pharo's FileSystem behaviour is currently somewhat inconsistent in its > treatment of path strings. > > To be able to demonstrate the limitations and fixes, I'm assuming that > the following files and directories have been created (/dev/shm is a > memory based file system in Ubuntu): > > cd /dev/shm > mkdir -p d1/d2/d3/d4 > touch d1/t1.txt d1/d2/t2.txt d1/d2/d3/t3.txt d1/d2/d3/d4/t4.txt > pushd d1 && ln -s d2/d3 ./s3 && popd > > > > 1. Path's are canonicalised during initial creation, but are not when > extending the path, i.e. sending #/ > > E.g. > > '/dev/shm/d1/d2/../t1.txt' asFileReference " File @ /dev/shm/d1/t1.txt" > > '/dev/shm/d1/' asFileReference / 'd2/../t1.txt' " File @ > /dev/shm/d1/d2/../t1.txt" > > Automatic canonicalisation is problematic as the code doesn't handle > symbolic links in the path name: > > ('/dev/shm/d1/' asFileReference / 's3/../t2.txt') exists " true (correct > answer)" > > '/dev/shm/d1/s3/../t2.txt' asFileReference exists " false (wrong answer)" > > > 2. In an attempt to be completely general, the argument to #/ is assumed > to be a single directory / file. This is incorrect as the path > delimiter is not allowed to be part of the file name. The result is > that path comparisons and operations such as #parent give unexpected > results. > > E.g. > > ('/dev/shm/d1/' asFileReference / 'd2/d3/t3.txt') exists " true" > > ('/dev/shm/d1/' asFileReference / 'd2/d3/t3.txt') parent " File @ > /dev/shm/d1" > > > My PR modified FileSystem so that: > > 1. Canonicalisation is separated out as an operation that has to be > manually requested: > > '/dev/shm/d1/d2/../t1.txt' asFileReference " File @ /dev/shm/d1/d2/../t1.txt" > > '/dev/shm/d1/d2/../t1.txt' asFileReference canonicalize " File @ > /dev/shm/d1/t1.txt" > > > 2. The argument to #/ can be either a single file / directory name or a > path. > > ('/dev/shm/d1/' asFileReference / 'd2/d3/t3.txt') parent " File @ > /dev/shm/d1/d2/d3" > > > > 3. Adds unit tests to cover the changed behaviour. > > > While I believe that these changes improve Pharo overall, making it more > intuitive and consistent, modifying the path creation to remove > canonicalisation is not backward compatible and requires a change to > PathTest>>testRelativeFromStringNormalization and > PathTest>>testRelativeFromStringNormalizationParent to manually > canonicalise the paths. > > Before I submit the patch, does anyone have any strong objections to the > changes as described above? > > Thanks, > Alistair >
