Re: Filesystem case sensitive check and java.io.File#equals
Thanks for the clarification, Alan.

-Jaikiran

On Thursday, November 8, 2018, Alan Bateman wrote:
> On 08/11/2018 12:59, Jaikiran Pai wrote:
>> A slightly related question - I used the example that Roger showed and it mostly worked. However, Files.isSameFile throws a java.nio.file.NoSuchFileException since path2 doesn't exist (on a case sensitive file system). I checked the javadoc of Files.isSameFile and couldn't find a clear statement on whether this is expected. It does talk about the existence checks that might be carried out in this implementation, but doesn't mention that it will throw an exception in the absence of either of the passed paths. Is this expected? Or should the implementation internally catch this exception and return false?
>
> IOException is correct; the methods in this API don't specify all the possible sub-classes. In a few places you will see specific exceptions specified as "optional specific exception" where it makes sense. We could potentially improve the spec for the "cannot access" or "does not exist" cases, but it hasn't been an issue to date.
>
> -Alan
Re: Filesystem case sensitive check and java.io.File#equals
On 08/11/2018 12:59, Jaikiran Pai wrote:
> A slightly related question - I used the example that Roger showed and it mostly worked. However, Files.isSameFile throws a java.nio.file.NoSuchFileException since path2 doesn't exist (on a case sensitive file system). I checked the javadoc of Files.isSameFile and couldn't find a clear statement on whether this is expected. It does talk about the existence checks that might be carried out in this implementation, but doesn't mention that it will throw an exception in the absence of either of the passed paths. Is this expected? Or should the implementation internally catch this exception and return false?

IOException is correct; the methods in this API don't specify all the possible sub-classes. In a few places you will see specific exceptions specified as "optional specific exception" where it makes sense. We could potentially improve the spec for the "cannot access" or "does not exist" cases, but it hasn't been an issue to date.

-Alan
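Since Files.isSameFile only declares IOException, a caller who wants "missing path means not the same file" has to decide that on the caller side. The following is a minimal sketch of such a wrapper; the class and method names (SameFileCheck, isSameFileOrFalse) are hypothetical and not part of the JDK, and the choice to map NoSuchFileException to false is an assumption about what the caller wants, not something the spec mandates.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

final class SameFileCheck {

    // Hypothetical helper (not a JDK API): treats a missing path as
    // "not the same file" instead of propagating NoSuchFileException.
    static boolean isSameFileOrFalse(Path a, Path b) throws IOException {
        try {
            return Files.isSameFile(a, b);
        } catch (NoSuchFileException e) {
            // At least one path does not locate an existing file, so the
            // two paths cannot be shown to locate the same file.
            return false;
        }
        // Any other IOException (for example an access failure) still
        // propagates, because it does not answer the question either way.
    }
}
```

Catching only NoSuchFileException, rather than IOException in general, keeps access failures visible to the caller instead of silently reporting "different files".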
Re: Filesystem case sensitive check and java.io.File#equals
A slightly related question - I used the example that Roger showed and it mostly worked. However, Files.isSameFile throws a java.nio.file.NoSuchFileException since path2 doesn't exist (on a case sensitive file system). I checked the javadoc of Files.isSameFile and couldn't find a clear statement on whether this is expected. It does talk about the existence checks that might be carried out in this implementation, but doesn't mention that it will throw an exception in the absence of either of the passed paths. Is this expected? Or should the implementation internally catch this exception and return false?

For now, in my implementation, I explicitly catch this NoSuchFileException around the Files.isSameFile call and take it to imply that the filesystem is case sensitive.

-Jaikiran

On 07/11/18 8:37 PM, Roger Riggs wrote:
> Hi Jaikiran,
>
> To check if two pathnames are the same file, java.nio.file.Files.isSameFile(path1, path2) is cleaner. It uses the file system specific mechanisms to determine if the two paths refer to the identical file, traversing symbolic links, etc.
>
> Something like:
>
>     Path dir = ...; // supply a directory to test a particular file system
>     Path path1 = Files.createTempFile(dir, "MixedCase", "");
>     Path path2 = Path.of(path1.toString().toLowerCase(Locale.US));
>     return Files.isSameFile(path1, path2);
>
> or similar...
>
> Roger
Re: Filesystem case sensitive check and java.io.File#equals
Hello Roger,

That indeed is a much cleaner approach. Thank you for that example.

-Jaikiran

On 07/11/18 8:37 PM, Roger Riggs wrote:
> Hi Jaikiran,
>
> To check if two pathnames are the same file, java.nio.file.Files.isSameFile(path1, path2) is cleaner. It uses the file system specific mechanisms to determine if the two paths refer to the identical file, traversing symbolic links, etc.
>
> Something like:
>
>     Path dir = ...; // supply a directory to test a particular file system
>     Path path1 = Files.createTempFile(dir, "MixedCase", "");
>     Path path2 = Path.of(path1.toString().toLowerCase(Locale.US));
>     return Files.isSameFile(path1, path2);
>
> or similar...
>
> Roger
Re: Filesystem case sensitive check and java.io.File#equals
Hi Jaikiran,

To check if two pathnames are the same file, java.nio.file.Files.isSameFile(path1, path2) is cleaner. It uses the file system specific mechanisms to determine if the two paths refer to the identical file, traversing symbolic links, etc.

Something like:

    Path dir = ...; // supply a directory to test a particular file system
    Path path1 = Files.createTempFile(dir, "MixedCase", "");
    Path path2 = Path.of(path1.toString().toLowerCase(Locale.US));
    return Files.isSameFile(path1, path2);

or similar...

Roger

On 11/07/2018 09:27 AM, Jaikiran Pai wrote:
> Hi Alan,
>
> On 07/11/18 7:15 PM, Alan Bateman wrote:
>> On 07/11/2018 13:13, Jaikiran Pai wrote:
>>> :
>>> My impression, based on that javadoc, was that the implementation of that API will use the underlying _filesystem_ to decide whether or not it's case sensitive. However, my experiments on macOS, which is case insensitive, and also a basic check of the implementation code in the JDK, show that this uses a lexicographic check on the string representation of the paths. Is that what this javadoc means by "underlying system"? I understand that there's a further sentence in that javadoc which says UNIX systems are case significant and Windows isn't. However, it wasn't fully clear to me that this API wouldn't delegate it to the underlying filesystem on the OS.
>> Maybe the javadoc could be clearer but having an equals method delegate to the underlying file system would be very problematic (think URL equals or cases where the file path is relative or doesn't locate a file).
>
> I see what you mean. About the javadoc, do you think it is worth improving and if so, should I file an enhancement at bugs.java.com?
>
>>> While we are at it, is there a better API that can be used to do such a case sensitivity check of the underlying filesystem? I checked the APIs in java.nio.file but couldn't find anything relevant. If there isn't any such API currently, is that something that can be introduced?
>> This has been looked into a couple of times over the years; suggestions have included FileStore exposing a comparator that uses the rules of a specific underlying file system or volume. This turns out to be very complicated as it means looking beyond case sensitivity issues and into topics such as case preservation, Unicode normalization forms, per volume case mapping tables, and handling of limits and truncation.
>
> So I guess that leaves us (the client applications) to rely on some kind of file creation tricks like the one below to figure this out:
>
>     // create a temp file in a fresh directory
>     final Path tmpDir = Files.createTempDirectory(null);
>     final Path tmpFile = Files.createTempFile(tmpDir, null, null);
>     tmpFile.toFile().deleteOnExit();
>     tmpDir.toFile().deleteOnExit();
>     // now check if a file with that same name but different case is considered to exist
>     final boolean existsAsLowerCase = Files.exists(Paths.get(tmpDir.toString(), tmpFile.getFileName().toString().toLowerCase()));
>     final boolean existsAsUpperCase = Files.exists(Paths.get(tmpDir.toString(), tmpFile.getFileName().toString().toUpperCase()));
>     // if the temp file that we created is found to not exist in a particular "case", then
>     // the filesystem is case sensitive
>     final boolean filesystemCaseSensitive = !existsAsLowerCase || !existsAsUpperCase;
>
> One final question - is there anywhere in the JDK code where files are auto created (for whatever purpose)? If such a construct exists, maybe during the initialization of each FileStore implementation, it could do such tricks (maybe in a much better and more performant way) and figure out this detail and then expose it in some way? It feels hacky to do it in the JDK, so I fully understand if such a construct won't be entertained.
>
> -Jaikiran
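Roger's snippet can be fleshed out into a self-contained probe along the following lines. This is a sketch under stated assumptions, not a vetted implementation: the class and method names (CaseSensitivityProbe, isCaseSensitive) are hypothetical, only the file name is lower-cased (Roger's snippet lower-cases the whole path string, which would also alter the directory components), and a NoSuchFileException from Files.isSameFile is taken to mean the lower-case name does not exist, hence that the file system distinguishes case.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Locale;

final class CaseSensitivityProbe {

    // Hypothetical helper: reports whether the file system holding `dir`
    // appears to treat names that differ only by case as different files.
    static boolean isCaseSensitive(Path dir) throws IOException {
        // The "MixedCase" prefix guarantees the generated name contains
        // upper-case letters, so lower-casing it yields a different string.
        Path mixed = Files.createTempFile(dir, "MixedCase", "");
        try {
            Path lower = mixed.resolveSibling(
                    mixed.getFileName().toString().toLowerCase(Locale.US));
            // Same file under both spellings: the file system ignores case.
            return !Files.isSameFile(mixed, lower);
        } catch (NoSuchFileException e) {
            // Assumption: the lower-case spelling does not exist, which is
            // interpreted here as "case sensitive".
            return true;
        } finally {
            // Remove the probe file so repeated calls leave nothing behind.
            Files.deleteIfExists(mixed);
        }
    }
}
```

The result only describes the directory that was probed; another volume or mount point may behave differently, so the directory passed in should be on the file system of interest.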
Re: Filesystem case sensitive check and java.io.File#equals
Hi Alan,

On 07/11/18 7:15 PM, Alan Bateman wrote:
> On 07/11/2018 13:13, Jaikiran Pai wrote:
>> :
>> My impression, based on that javadoc, was that the implementation of that API will use the underlying _filesystem_ to decide whether or not it's case sensitive. However, my experiments on macOS, which is case insensitive, and also a basic check of the implementation code in the JDK, show that this uses a lexicographic check on the string representation of the paths. Is that what this javadoc means by "underlying system"? I understand that there's a further sentence in that javadoc which says UNIX systems are case significant and Windows isn't. However, it wasn't fully clear to me that this API wouldn't delegate it to the underlying filesystem on the OS.
> Maybe the javadoc could be clearer but having an equals method delegate to the underlying file system would be very problematic (think URL equals or cases where the file path is relative or doesn't locate a file).

I see what you mean. About the javadoc, do you think it is worth improving and if so, should I file an enhancement at bugs.java.com?

>> While we are at it, is there a better API that can be used to do such a case sensitivity check of the underlying filesystem? I checked the APIs in java.nio.file but couldn't find anything relevant. If there isn't any such API currently, is that something that can be introduced?
> This has been looked into a couple of times over the years; suggestions have included FileStore exposing a comparator that uses the rules of a specific underlying file system or volume. This turns out to be very complicated as it means looking beyond case sensitivity issues and into topics such as case preservation, Unicode normalization forms, per volume case mapping tables, and handling of limits and truncation.

So I guess that leaves us (the client applications) to rely on some kind of file creation tricks like the one below to figure this out:

    // create a temp file in a fresh directory
    final Path tmpDir = Files.createTempDirectory(null);
    final Path tmpFile = Files.createTempFile(tmpDir, null, null);
    tmpFile.toFile().deleteOnExit();
    tmpDir.toFile().deleteOnExit();
    // now check if a file with that same name but different case is considered to exist
    final boolean existsAsLowerCase = Files.exists(Paths.get(tmpDir.toString(), tmpFile.getFileName().toString().toLowerCase()));
    final boolean existsAsUpperCase = Files.exists(Paths.get(tmpDir.toString(), tmpFile.getFileName().toString().toUpperCase()));
    // if the temp file that we created is found to not exist in a particular "case", then
    // the filesystem is case sensitive
    final boolean filesystemCaseSensitive = !existsAsLowerCase || !existsAsUpperCase;

One final question - is there anywhere in the JDK code where files are auto created (for whatever purpose)? If such a construct exists, maybe during the initialization of each FileStore implementation, it could do such tricks (maybe in a much better and more performant way) and figure out this detail and then expose it in some way? It feels hacky to do it in the JDK, so I fully understand if such a construct won't be entertained.

-Jaikiran
Re: Filesystem case sensitive check and java.io.File#equals
On 07/11/2018 13:13, Jaikiran Pai wrote:
> :
> My impression, based on that javadoc, was that the implementation of that API will use the underlying _filesystem_ to decide whether or not it's case sensitive. However, my experiments on macOS, which is case insensitive, and also a basic check of the implementation code in the JDK, show that this uses a lexicographic check on the string representation of the paths. Is that what this javadoc means by "underlying system"? I understand that there's a further sentence in that javadoc which says UNIX systems are case significant and Windows isn't. However, it wasn't fully clear to me that this API wouldn't delegate it to the underlying filesystem on the OS.

Maybe the javadoc could be clearer but having an equals method delegate to the underlying file system would be very problematic (think URL equals or cases where the file path is relative or doesn't locate a file).

> While we are at it, is there a better API that can be used to do such a case sensitivity check of the underlying filesystem? I checked the APIs in java.nio.file but couldn't find anything relevant. If there isn't any such API currently, is that something that can be introduced?

This has been looked into a couple of times over the years; suggestions have included FileStore exposing a comparator that uses the rules of a specific underlying file system or volume. This turns out to be very complicated as it means looking beyond case sensitivity issues and into topics such as case preservation, Unicode normalization forms, per volume case mapping tables, and handling of limits and truncation.

-Alan
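The behaviour discussed above can be observed without touching the disk at all, which illustrates that java.io.File#equals compares abstract pathnames rather than files. The sketch below uses a hypothetical class and method name (FileEqualsDemo, sameAbstractPath); none of the named files needs to exist.

```java
import java.io.File;

final class FileEqualsDemo {

    // Hypothetical helper: compares two pathname strings the way
    // java.io.File#equals does, i.e. without consulting the file system.
    static boolean sameAbstractPath(String a, String b) {
        return new File(a).equals(new File(b));
    }

    public static void main(String[] args) {
        // Per the File javadoc, case is significant on UNIX systems and not
        // on Windows; the thread reports that macOS follows the UNIX rule
        // even when the volume itself is case insensitive.
        System.out.println(sameAbstractPath("README", "readme"));

        // A relative path and its absolute form are different abstract
        // pathnames, even though they would locate the same file.
        File relative = new File("README");
        System.out.println(relative.equals(relative.getAbsoluteFile()));
    }
}
```

When the question is whether two paths locate the same file, Files.isSameFile is the call that actually consults the file system, at the cost of requiring the files to exist.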