I did not take any blame in wht you say. I just wanted to explain that yes comments can be wrong :)
I'm super happy to get some brains looking at improving FS. I'm fighting to see how we can start to integrate faster changes in P70. On Fri, Jun 30, 2017 at 10:58 AM, Alistair Grant <[email protected]> wrote: > Hi Stef, > > On Fri, Jun 30, 2017 at 09:46:44AM +0200, Stephane Ducasse wrote: >> Hi alistair >> >> I do not know if this is me that wrote a wrong path class comment. >> You should consider that theere were nearly no comment at all and I >> started to try to give more love to this great library. >> >> Stef > > No problem, I wasn't trying to blame anyone. The comments appear to > align with the design goals of the class. And I really do agree with > the goals, it is just that given the confusion it creates, in this > particular case it is more practical to parse the strings. > > Actually issue 18042[1] offers an alternative approach, which is to enforce > not allowing the directory delimiter. I would extend the error message > to suggest using #resolve: instead. I think that following the > programmer's rule of being strict in what you write and forgiving in > what you read, parsing the string is more practical, but I'm keen to see > what everyone else thinks. > > [1] > https://pharo.fogbugz.com/f/cases/18042/FileSystem-a-file-doesn-t-exist-but-still-exists > >> I'm not expert of file system. Now I like your idea to have explicit >> messages such as canonalize or expand. > > :-) > > Cheers, > Alistair > > > >> On Tue, Jun 27, 2017 at 8:56 PM, Alistair Grant <[email protected]> >> wrote: >> > Hi Sven, >> > >> > On Tue, Jun 27, 2017 at 01:31:42PM +0200, Sven Van Caekenberghe wrote: >> >> Hi Alistair, >> >> >> >> I think it is great that you are working on FileSystem and are trying >> >> to contribute. I appreciate your effort to find consensus. >> > >> > :-) >> > >> > >> >> But, and please take this as positive criticism, I feel a bit uneasy >> >> when I read your reasoning, but maybe I am wrong. >> > >> > Thanks for taking the trouble to reply. Hopefully I can address some of >> > your unease below. >> > >> > >> >> You see, the way I understand FileSystem (what I think was the >> >> original design idea), it is a cross platform abstraction over >> >> concrete file systems and paths. >> > >> > It does appear to be the case that effort was put in to avoid specifying >> > a separator, but it makes an assumption that is incorrect. The class >> > comments of path state that the seperator can be part of a file name, >> > and this is incorrect for every disk file system Pharo supports (as far >> > as I know). >> > >> > >> > >> >> A FileReference holds a FileSystem and a Path. A Path is just a >> >> collection of elements that leads to a location, the leaf possibly >> >> being interpreted as a file with an extension (although that last >> >> point is just a convention). It seems great effort was put into >> >> avoiding the use of separators. >> > >> >> Pharo is an object oriented system with appropriate abstractions, >> >> FileSystem gives us one approach to this difficult problem. >> >> >> >> In your reasoning, your truth, your reference is always the Unix file >> >> system and the way paths are handled there. You expect a number of >> >> things based on that, but maybe that was/is not the design goal. >> > >> > I do have a bias towards the Unix filesystem, and you're correct that >> > all the examples I provided use the Unix file system, but I've tried to >> > make sure that the changes I've made apply equally to Windows file >> > systems (more below). >> > >> > >> >> I said this before, but I am not sure we should interpret the argument of >> >> #/. >> > >> > I'll come back to this later :-) >> > >> > >> >> I think that >> >> >> >> FileLocator root / 'foo' / 'bar' / 'readme.txt'. >> >> >> >> is more abstract (fitting to the original goal) then >> >> >> >> '/foo' asFileReference / 'bar/readme.txt'. >> >> >> >> because it totally avoids a reference to the platform dependent path >> >> separator. >> > >> > I wasn't trying to suggest that code would normally be written this way. >> > It is more likely that '/foo' asFileReference is created early in the >> > code, and much later an input is supplied which is 'bar/readme.txt'. At >> > that point, the natural thing to do is send #/ as above. >> > >> > The problem with only supporting the first case above is that it puts >> > the onus of parsing the string on the user. We could add a another >> > method that parses the supplied string, but that will just make the >> > interface more confusing, and we already have a steady stream of >> > messages to the list from people getting confused by this. >> > >> > >> >> The same goes for '..' as >> >> >> >> (FileLocator root / 'foo' / 'bar' / 'down') parent / 'readme.txt'. >> >> >> >> is a more object oriented way to write '/foo/bar/down/../readme.txt', IMO. >> > >> > Agreed, but as mentioned above, we also have to deal with input strings >> > from external sources. >> > >> > >> >> Windows is an important target for Pharo, they use $\ not $/. >> > >> > The patch handles that, so on windows I can do: >> > >> > ('C:\cygwin' asFileReference / 'usr\local\bin') parent " File @ >> > C:\cygwin\usr\local" >> > >> > (just to be clear, this is an example of how the strings are handled, >> > I'm not expecting anyone to write code like the above). >> > >> > >> >> Do you see my point ? >> > >> > I think so, but... :-) >> > >> > >> >> BTW, I consider the fact that current, >> >> >> >> '/foo' asFileReference / 'bar/readme.txt' >> >> >> >> works with #exists even though 'bar/readme.txt' was not fully >> >> parsed/resolved a bug, or a happy coincidence at best. >> > >> > I agree that it is a happy coincedence, but it is also intuitive. >> > >> > >> >> Another point is the distinction between internal/external and/or >> >> concrete/abstract paths. Should something like '..' remain part of a >> >> path. Never, always, only when it can be resolved ? What about special >> >> characters ? >> > >> > I'm arguing that '..' should be left in the path unless explicitly told >> > to be removed, so that things like symbolic links are handled properly >> > (by the file system itself). (Thanks to Denis for reminding me about >> > this in an earlier email thread) >> > >> > While it's nice to be completely general, is there a supported file >> > system that doesn't interpret ".." as the parent directory? >> > >> > As an aside, another patch I'll be submitting fixes symbolic link >> > handling so we could conveivably write a #canonicalizeOnFileSystem that >> > would properly handle 'symboliclink/..', but it would be slow. >> > >> > >> >> Should we support ~ ? Sometimes I would like that too, but maybe >> >> >> >> FileLocator home >> >> >> >> is enough. >> > >> > As Subbu points out in a later message, things like '~' are handled by >> > the shell. We could consider adding something like #expand to handle >> > '~' and shell variables, e.g. $HOME, but I'd make that a separate patch. >> > >> > It would also need to be done carefully as $ and ~ are valid characters >> > within file names (only / and the null character are not allowed in >> > Posix file systems, Windows has more characters that are not allowed). >> > >> > >> > >> >> All this being said, I think FileSystem can and should be improved, >> >> but carefully. >> >> >> >> It would be good if more people joined this discussion. >> >> >> >> Sven >> > >> > I think I understand the original goals, and I don't disagree with them, >> > however: >> > >> > 1. they are based on an arguably incorrect assumption (the >> > separator character can be part of a file name), >> > >> > 2. they've caused quite a bit of confusion due to unexpected behaviour, >> > >> > 3. if support for a file system that allows the separator in file names >> > is ever added, we will still have to be able to parse a full path >> > correctly (which the current code wouldn't do) >> > >> > 4. I don't think these changes conflict with the original goals, all the >> > examples you provided above still function exactly the same. >> > >> > Does that help reduce your uneasiness? >> > >> > Thanks again, >> > Alistair >> > >> > >> > >> > >> >> > On 26 Jun 2017, at 10:12, Alistair Grant <[email protected]> wrote: >> >> > >> >> > Pharo's FileSystem behaviour is currently somewhat inconsistent in its >> >> > treatment of path strings. >> >> > >> >> > To be able to demonstrate the limitations and fixes, I'm assuming that >> >> > the following files and directories have been created (/dev/shm is a >> >> > memory based file system in Ubuntu): >> >> > >> >> > cd /dev/shm >> >> > mkdir -p d1/d2/d3/d4 >> >> > touch d1/t1.txt d1/d2/t2.txt d1/d2/d3/t3.txt d1/d2/d3/d4/t4.txt >> >> > pushd d1 && ln -s d2/d3 ./s3 && popd >> >> > >> >> > >> >> > >> >> > 1. Path's are canonicalised during initial creation, but are not when >> >> > extending the path, i.e. sending #/ >> >> > >> >> > E.g. >> >> > >> >> > '/dev/shm/d1/d2/../t1.txt' asFileReference " File @ /dev/shm/d1/t1.txt" >> >> > >> >> > '/dev/shm/d1/' asFileReference / 'd2/../t1.txt' " File @ >> >> > /dev/shm/d1/d2/../t1.txt" >> >> > >> >> > Automatic canonicalisation is problematic as the code doesn't handle >> >> > symbolic links in the path name: >> >> > >> >> > ('/dev/shm/d1/' asFileReference / 's3/../t2.txt') exists " true >> >> > (correct answer)" >> >> > >> >> > '/dev/shm/d1/s3/../t2.txt' asFileReference exists " false (wrong >> >> > answer)" >> >> > >> >> > >> >> > 2. In an attempt to be completely general, the argument to #/ is assumed >> >> > to be a single directory / file. This is incorrect as the path >> >> > delimiter is not allowed to be part of the file name. The result is >> >> > that path comparisons and operations such as #parent give unexpected >> >> > results. >> >> > >> >> > E.g. >> >> > >> >> > ('/dev/shm/d1/' asFileReference / 'd2/d3/t3.txt') exists " true" >> >> > >> >> > ('/dev/shm/d1/' asFileReference / 'd2/d3/t3.txt') parent " File @ >> >> > /dev/shm/d1" >> >> > >> >> > >> >> > My PR modified FileSystem so that: >> >> > >> >> > 1. Canonicalisation is separated out as an operation that has to be >> >> > manually requested: >> >> > >> >> > '/dev/shm/d1/d2/../t1.txt' asFileReference " File @ >> >> > /dev/shm/d1/d2/../t1.txt" >> >> > >> >> > '/dev/shm/d1/d2/../t1.txt' asFileReference canonicalize " File @ >> >> > /dev/shm/d1/t1.txt" >> >> > >> >> > >> >> > 2. The argument to #/ can be either a single file / directory name or a >> >> > path. >> >> > >> >> > ('/dev/shm/d1/' asFileReference / 'd2/d3/t3.txt') parent " File @ >> >> > /dev/shm/d1/d2/d3" >> >> > >> >> > >> >> > >> >> > 3. Adds unit tests to cover the changed behaviour. >> >> > >> >> > >> >> > While I believe that these changes improve Pharo overall, making it more >> >> > intuitive and consistent, modifying the path creation to remove >> >> > canonicalisation is not backward compatible and requires a change to >> >> > PathTest>>testRelativeFromStringNormalization and >> >> > PathTest>>testRelativeFromStringNormalizationParent to manually >> >> > canonicalise the paths. >> >> > >> >> > Before I submit the patch, does anyone have any strong objections to the >> >> > changes as described above? >> >> > >> >> > Thanks, >> >> > Alistair >> > >> >
