Re: Filesystem case sensitive check and java.io.File#equals

2018-11-08 Thread Jaikiran Pai
Thanks for the clarification, Alan.

-Jaikiran

On Thursday, November 8, 2018, Alan Bateman wrote:
> On 08/11/2018 12:59, Jaikiran Pai wrote:
>>
>> A slightly related question - I used the example that Roger showed and
>> it mostly worked. However, Files.isSameFile throws a
>> java.nio.file.NoSuchFileException since the path2 doesn't exist (on a
>> case sensitive file system). I checked the javadoc of Files.isSameFile
>> and couldn't find a clear mention if this is expected. It does talk
>> about the existence checks that might be carried out in this
>> implementation, but doesn't mention that it will throw an exception in
>> the absence of either of the passed paths. Is this expected? Or should
>> the implementation internally catch this exception and return false?
>>
> IOException is correct; the methods in this API don't specify all the
> possible sub-classes. In a few places you will see specific exceptions
> specified as "optional specific exception" where it makes sense. We could
> potentially improve the spec for the cannot-access or does-not-exist cases,
> but it hasn't been an issue to date.
>
> -Alan
>


Re: Filesystem case sensitive check and java.io.File#equals

2018-11-08 Thread Alan Bateman

On 08/11/2018 12:59, Jaikiran Pai wrote:

> A slightly related question - I used the example that Roger showed and
> it mostly worked. However, Files.isSameFile throws a
> java.nio.file.NoSuchFileException since the path2 doesn't exist (on a
> case sensitive file system). I checked the javadoc of Files.isSameFile
> and couldn't find a clear mention if this is expected. It does talk
> about the existence checks that might be carried out in this
> implementation, but doesn't mention that it will throw an exception in
> the absence of either of the passed paths. Is this expected? Or should
> the implementation internally catch this exception and return false?

IOException is correct; the methods in this API don't specify all the
possible sub-classes. In a few places you will see specific exceptions
specified as "optional specific exception" where it makes sense. We could
potentially improve the spec for the cannot-access or does-not-exist
cases, but it hasn't been an issue to date.
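
To make the point about sub-classes concrete, here is a minimal sketch. Files.delete is one of the methods whose javadoc lists NoSuchFileException and DirectoryNotEmptyException as optional specific exceptions, so a caller can catch those before falling back to plain IOException. The names DeleteOutcome and tryDelete are made up for this illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

final class DeleteOutcome {
    /** Returns a short label describing how Files.delete(path) ended. */
    static String tryDelete(Path path) {
        try {
            Files.delete(path);
            return "deleted";
        } catch (NoSuchFileException e) {
            // listed for Files.delete as an optional specific exception
            return "missing";
        } catch (DirectoryNotEmptyException e) {
            // also listed as an optional specific exception
            return "not empty";
        } catch (IOException e) {
            // anything else the provider reports, e.g. AccessDeniedException
            return "I/O error: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("delete-outcome");
        Path file = Files.createFile(dir.resolve("a.txt"));
        System.out.println(tryDelete(dir.resolve("missing.txt")));
        System.out.println(tryDelete(dir));   // directory still holds a.txt
        System.out.println(tryDelete(file));
        System.out.println(tryDelete(dir));   // directory is empty now
    }
}
```

The specific catch clauses have to come before the IOException clause, since both exceptions are sub-classes of it.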


-Alan


Re: Filesystem case sensitive check and java.io.File#equals

2018-11-08 Thread Jaikiran Pai
A slightly related question - I used the example that Roger showed and
it mostly worked. However, Files.isSameFile throws a
java.nio.file.NoSuchFileException since the path2 doesn't exist (on a
case sensitive file system). I checked the javadoc of Files.isSameFile
and couldn't find a clear mention if this is expected. It does talk
about the existence checks that might be carried out in this
implementation, but doesn't mention that it will throw an exception in
the absence of either of the passed paths. Is this expected? Or should
the implementation internally catch this exception and return false?

For now, in my implementation, I explicitly catch this
NoSuchFileException around the Files.isSameFile call and take it to
imply that the filesystem is case sensitive.
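
A rough sketch of what that workaround might look like is below. It is not the actual implementation; CaseSensitivityProbe and isCaseSensitive are hypothetical names. It lower-cases only the file name, on the assumption that changing the case of the parent directories is not wanted, and it removes the probe file afterwards.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

final class CaseSensitivityProbe {
    /** Probes whether names in the given directory are treated case-sensitively. */
    static boolean isCaseSensitive(Path dir) throws IOException {
        // the prefix contains upper-case letters, so the lower-cased name differs
        Path original = Files.createTempFile(dir, "MixedCase", ".probe");
        try {
            String name = original.getFileName().toString();
            Path lower = original.resolveSibling(name.toLowerCase(Locale.ROOT));
            try {
                // true here means both spellings reach the same file
                return !Files.isSameFile(original, lower);
            } catch (NoSuchFileException e) {
                // the lower-case spelling does not locate any file
                return true;
            }
        } finally {
            Files.deleteIfExists(original);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println(dir + " case sensitive: " + isCaseSensitive(dir));
    }
}
```

The result only describes the directory that was probed; other mounts or volumes may behave differently.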

-Jaikiran


On 07/11/18 8:37 PM, Roger Riggs wrote:
> Hi Jaikiran,
>
> To check if two pathnames are the same file,
> java.nio.file.Files.isSameFile(path1, path2)
> is cleaner.  It uses the file system specific mechanisms to determine
> if the two paths refer to the identical file.  Traversing symbolic
> links, etc.
>
> Something like:
>     Path dir = ...;   // supply a directory to test a particular file system
>     Path path1 = Files.createTempFile(dir, "MixedCase", "");
>     Path path2 = Path.of(path1.toString().toLowerCase(Locale.US));
>     return Files.isSameFile(path1, path2);
>
> or similar...
>
> Roger

Re: Filesystem case sensitive check and java.io.File#equals

2018-11-07 Thread Jaikiran Pai
Hello Roger,

That indeed is a much cleaner approach. Thank you for that example.

-Jaikiran


On 07/11/18 8:37 PM, Roger Riggs wrote:
> Hi Jaikiran,
>
> To check if two pathnames are the same file,
> java.nio.file.Files.isSameFile(path1, path2)
> is cleaner.  It uses the file system specific mechanisms to determine
> if the two paths refer to the identical file.  Traversing symbolic
> links, etc.
>
> Something like:
>     Path dir = ...;   // supply a directory to test a particular file system
>     Path path1 = Files.createTempFile(dir, "MixedCase", "");
>     Path path2 = Path.of(path1.toString().toLowerCase(Locale.US));
>     return Files.isSameFile(path1, path2);
>
> or similar...
>
> Roger



Re: Filesystem case sensitive check and java.io.File#equals

2018-11-07 Thread Roger Riggs

Hi Jaikiran,

To check if two pathnames are the same file,
java.nio.file.Files.isSameFile(path1, path2)
is cleaner.  It uses the file system specific mechanisms to determine if
the two paths refer to the identical file.  Traversing symbolic links, etc.

Something like:
    Path dir = ...;   // supply a directory to test a particular file system
    Path path1 = Files.createTempFile(dir, "MixedCase", "");
    Path path2 = Path.of(path1.toString().toLowerCase(Locale.US));
    return Files.isSameFile(path1, path2);

or similar...
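
As an illustration of the difference between comparing paths and comparing files, the sketch below spells one file in two ways. Path.equals compares the paths themselves without accessing the file system, while Files.isSameFile asks the file system. The class and method names (SameFileVsEquals, describe) are invented for this example.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class SameFileVsEquals {
    /** Summarises both comparisons for two paths that must locate existing files. */
    static String describe(Path a, Path b) throws IOException {
        return "equals=" + a.equals(b) + ", isSameFile=" + Files.isSameFile(a, b);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("same-file-demo");
        Path direct = Files.createFile(dir.resolve("data.txt"));
        // the same file, spelled with a redundant "." name element
        Path roundabout = dir.resolve(".").resolve("data.txt");
        try {
            // the name elements differ, so equals is false, yet both locate one file
            System.out.println(describe(direct, roundabout));
        } finally {
            Files.delete(direct);
            Files.delete(dir);
        }
    }
}
```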

Roger








Re: Filesystem case sensitive check and java.io.File#equals

2018-11-07 Thread Jaikiran Pai
Hi Alan,

On 07/11/18 7:15 PM, Alan Bateman wrote:
> On 07/11/2018 13:13, Jaikiran Pai wrote:
>> :
>>
>>
>> My impression, based on that javadoc, was that the implementation of
>> that API will use the underlying _filesystem_ to decide whether or not
>> it's case sensitive. However, my experiments on macOS, which is case
>> insensitive, and also a basic check of the implementation code in the
>> JDK, show that this uses a lexicographic check on the string
>> representation of the paths. Is that what this javadoc means by
>> "underlying system"? I understand that there's a further sentence in
>> that javadoc which says UNIX systems are case significant and Windows
>> isn't. However, it wasn't fully clear to me that this API wouldn't
>> delegate it to the underlying filesystem on the OS.
> Maybe the javadoc could be clearer but having an equals method
> delegate to the underlying file system would be very problematic
> (think URL equals or cases where the file path is relative or doesn't
> locate a file).

I see what you mean. About the javadoc, do you think it is worth
improving and if so, should I file an enhancement at bugs.java.com?

>
>> While we are at it, is there a better API that can be used to do such a
>> case sensitivity check of the underlying filesystem? I checked the APIs
>> in java.nio.file but couldn't find anything relevant. If there isn't any
>> such API currently, is that something that can be introduced?
>>
> This has been looked into a couple of times over the years;
> suggestions have included FileStore exposing a comparator that uses
> the rules of a specific underlying file system or volume. This turns
> out to be very complicated as it means looking beyond case sensitivity
> issues and into topics such as case preservation, Unicode
> normalization forms, per volume case mapping tables, and handling of
> limits and truncation.
So I guess, that leaves us (the client applications) to rely on some
kind of file creation tricks like the one below to figure this out:

    // create a temp file in a fresh directory
    final Path tmpDir = Files.createTempDirectory(null);
    final Path tmpFile = Files.createTempFile(tmpDir, null, null);
    // deleteOnExit deletes in reverse order of registration, so register
    // the directory first; it is then deleted after the file it contains
    tmpDir.toFile().deleteOnExit();
    tmpFile.toFile().deleteOnExit();
    // now check if a file with that same name but different case is
    // considered to exist
    final boolean existsAsLowerCase = Files.exists(Paths.get(tmpDir.toString(),
            tmpFile.getFileName().toString().toLowerCase()));
    final boolean existsAsUpperCase = Files.exists(Paths.get(tmpDir.toString(),
            tmpFile.getFileName().toString().toUpperCase()));
    // if the temp file that we created is found to not exist in a
    // particular "case", then the filesystem is case sensitive
    final boolean filesystemCaseSensitive = !existsAsLowerCase || !existsAsUpperCase;

One final question - is there anywhere in the JDK code, where files are
auto created (for whatever purpose)? If such a construct exists, maybe
during the initialization of each FileStore implementation, it could do
such tricks (maybe in a much better and performant way) and figure out
this detail and then expose it some way? It feels hacky to do it in the
JDK, so I fully understand if such a construct won't be entertained.

-Jaikiran





Re: Filesystem case sensitive check and java.io.File#equals

2018-11-07 Thread Alan Bateman

On 07/11/2018 13:13, Jaikiran Pai wrote:

> :
>
> My impression, based on that javadoc, was that the implementation of
> that API will use the underlying _filesystem_ to decide whether or not
> it's case sensitive. However, my experiments on macOS, which is case
> insensitive, and also a basic check of the implementation code in the
> JDK, show that this uses a lexicographic check on the string
> representation of the paths. Is that what this javadoc means by
> "underlying system"? I understand that there's a further sentence in
> that javadoc which says UNIX systems are case significant and Windows
> isn't. However, it wasn't fully clear to me that this API wouldn't
> delegate it to the underlying filesystem on the OS.

Maybe the javadoc could be clearer but having an equals method delegate 
to the underlying file system would be very problematic (think URL 
equals or cases where the file path is relative or doesn't locate a file).
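
A small sketch of what this means in practice: java.io.File#equals compares abstract pathnames and never consults the disk, so the answer for two spellings follows the platform rule stated in the javadoc (case significant on UNIX systems, not on Windows), and a relative path is never equal to its absolute form. FileEqualsDemo and sameAbstractPath are names invented for this example.

```java
import java.io.File;

final class FileEqualsDemo {
    /** True if the two strings denote equal abstract pathnames according to File#equals. */
    static boolean sameAbstractPath(String a, String b) {
        return new File(a).equals(new File(b));
    }

    public static void main(String[] args) {
        // neither file needs to exist; no file system access takes place
        // the result depends on the platform rule, not on the volume in use
        System.out.println(sameAbstractPath("readme.txt", "README.TXT"));

        File relative = new File("readme.txt");
        File absolute = relative.getAbsoluteFile();
        // the pathname strings differ, so these are not equal even though
        // they would locate the same file
        System.out.println(relative.equals(absolute)); // false
    }
}
```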




> While we are at it, is there a better API that can be used to do such a
> case sensitivity check of the underlying filesystem? I checked the APIs
> in java.nio.file but couldn't find anything relevant. If there isn't any
> such API currently, is that something that can be introduced?

This has been looked into a couple of times over the years; suggestions
have included FileStore exposing a comparator that uses the rules of a
specific underlying file system or volume. This turns out to be very
complicated as it means looking beyond case sensitivity issues and into
topics such as case preservation, Unicode normalization forms, per
volume case mapping tables, and handling of limits and truncation.
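
Two of these topics can be seen with plain strings, without touching any file system. The sketch below only shows why a simple case-folding comparison of names falls short; it makes no claim about which rules a given file system or volume applies. The class and method names are made up for the example.

```java
import java.text.Normalizer;
import java.util.Locale;

final class NameComparisonPitfalls {
    /** Compares two names after converting both to Unicode normalization form NFC. */
    static boolean equalAfterNfc(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String composed = "caf\u00E9";     // "é" as a single code point
        String decomposed = "cafe\u0301";  // "e" followed by a combining acute accent
        System.out.println(composed.equals(decomposed));         // false
        System.out.println(equalAfterNfc(composed, decomposed)); // true

        // case mapping is not always reversible: "ß" upper-cases to "SS"
        String upper = "stra\u00DFe".toUpperCase(Locale.ROOT);
        System.out.println(upper);                               // STRASSE
        System.out.println(upper.toLowerCase(Locale.ROOT));      // strasse
    }
}
```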


-Alan