On Thu, 17 Apr 2025 04:26:38 GMT, Martin Balao <[email protected]> wrote:
>> Francisco Ferrari Bihurriet has updated the pull request with a new target
>> base due to a merge or a rebase. The incremental webrev excludes the
>> unrelated changes brought in by the merge/rebase. The pull request contains
>> three additional commits since the last revision:
>>
>> - Do minor adjustments
>>
>> Update copyright year, improve comments and use File::toPath to convert
>> back to Path.
>> - Merge openjdk/master into JDK-8352728
>> - 8352728: InternalError loading java.security due to Windows parent folder
>> permissions
>
> Looks like `File::getCanonicalPath` is more resilient to canonicalization and
> resolution failures. This observation makes me wonder the following:
>
> 1) Can a normalization failure (e.g. return of a partially normalized path)
> affect relative includes?
>
> 2) Can a failure in symlinks resolution (which seems to be ignored) affect
> relative includes?
>
> Startup will fail if the included file is not found, which is a safe
> behavior. The problematic scenario would be one in which a different file
> exists in a location that uses the partially normalized or resolved path as a
> base, and is not what the "include" properties writer intended. While
> unlikely, I want to understand if this is possible and a downside of using
> `File::getCanonicalPath`.
Hi @martinuy,
> Looks like `File::getCanonicalPath` is more resilient to canonicalization and
> resolution failures. This observation makes me wonder the following:
>
> 1. Can a normalization failure (e.g. return of a partially normalized path)
> affect relative includes?
> 2. Can a failure in symlinks resolution (which seems to be ignored) affect
> relative includes?
>
> Startup will fail if the included file is not found, which is a safe
> behavior. The problematic scenario would be one in which a different file
> exists in a location that uses the partially normalized or resolved path as a
> base, and is not what the "include" properties writer intended. While
> unlikely, I want to understand if this is possible and a downside of using
> `File::getCanonicalPath`.
By the time we are resolving a _relative include_, we already performed the
following two operations on the file issuing the `include` statement, in this
order:
* The path has been normalized, or partially normalized by
`File::getCanonicalFile`
* The file has been opened, and we are parsing it (so it's known to be readable
and exist)
I would like to put aside filesystem items creation/deletion race conditions,
which of course can occur, but would always be problematic, regardless of the
path canonicalization mechanism. Excluding such scenarios, we need to think of
edge cases where there is a partial normalization or a symlinks resolution
failure, **while at the same time, the file exists and is readable**.
### <span>#</span>1. Partial normalization, without symlinks resolution failure
As I understand it, in _Linux_, normalization involves resolving relative paths
against the current working directory, substituting `.` or `..` path items, and
resolving symlinks. In _Windows_, in addition to the _Linux_ normalization,
there is the resolution of absolute paths without drive/unit against the
current unit, and the case normalization for case-insensitive filesystems
(including the unit letter).
The only _Linux_ partial normalization case I'm aware of, includes symlinks
failures (`File::getCanonicalFile` performs `.` and `..` substitutions for
inaccessible and nonexistent paths, see examples of this below).
There could be partial normalization cases in _Windows_, when `FindFirstFileW`
fails due to parent directories permissions, and the actual filesystem items
case is normalized up to a certain point. However, in such cases the resulting
path is equivalent, and should work as expected for relative includes
resolution.
### <span>#</span>2. Partial normalization, with symlinks resolution failure
I can't think of a _Linux_ scenario where a link is readable but not
resolvable. I've tried the following.
mkdir /tmp/scratch
sudo mkdir /tmp/scratch/protected
sudo mkdir /tmp/scratch/protected/inner
sudo tee /tmp/scratch/protected/inner/regular.properties <<<'name=value'
>/dev/null
tee /tmp/scratch/target.properties <<<'name=value' >/dev/null
sudo ln -s /tmp/scratch/target.properties /tmp/scratch/protected/link.properties
sudo ln -s /tmp/scratch/target.properties
/tmp/scratch/protected/inner/link.properties
sudo chown -R $(whoami):$(whoami) /tmp/scratch/protected/inner
sudo chmod 766 /tmp/scratch/protected
This is the created directories structure:
fferrari@vmhost:~$ sudo tree -up /tmp/scratch
[drwxr-xr-x fferrari] /tmp/scratch
├── [drwxrw-rw- root ] protected
│ ├── [drwxr-xr-x fferrari] inner
│ │ ├── [lrwxrwxrwx fferrari] link.properties ->
/tmp/scratch/target.properties
│ │ └── [-rw-r--r-- fferrari] regular.properties
│ └── [lrwxrwxrwx root ] link.properties -> /tmp/scratch/target.properties
└── [-rw-r--r-- fferrari] target.properties
3 directories, 4 files
fferrari@vmhost:~$ cat /tmp/scratch/target.properties
name=value
Sice `protected` doesn't have execution permissions, we can't even open regular
files inside `inner`:
fferrari@vmhost:~$ cat /tmp/scratch/protected/inner/regular.properties
cat: /tmp/scratch/protected/inner/regular.properties: Permission denied
fferrari@vmhost:~$ cat /tmp/scratch/protected/inner/link.properties
cat: /tmp/scratch/protected/inner/link.properties: Permission denied
This means that if we are trying to load
`/tmp/scratch/protected/inner/link.properties`, even if
`/tmp/scratch/target.properties` had a relative include, we wouldn't face its
resolution, because `/tmp/scratch/protected/inner/link.properties` isn't
readable.
Even in those cases, `File::getCanonicalFile` gives a reasonable result, and is
able to substitute `.` and `..`:
fferrari@vmhost:~$ jshell -q
-<<<'System.out.println(Path.of("/tmp/scratch/protected/inner/./regular.properties").toFile().getCanonicalFile())'
/tmp/scratch/protected/inner/regular.properties
fferrari@vmhost:~$ jshell -q
-<<<'System.out.println(Path.of("/tmp/scratch/protected/inner/../inner/link.properties").toFile().getCanonicalFile())'
/tmp/scratch/protected/inner/link.properties
In _Windows_, as explained in the description, even if `FindFirstFileW` fails
at some point, `File::getCanonicalFile` will proceed with a [later symlinks
resolution](https://github.com/openjdk/jdk/blob/jdk-24-ga/src/java.base/windows/classes/java/io/WinNTFileSystem.java#L476).
This edge case is exercised by the new test added in
7abb62c069ad35f4ec34f6cd9b9f6d05febceecc. So I can't think of a scenario where
a symlink/soft-link that is readable isn't resolved.
### Final note
I might be missing something, so let me know if you have any other ideas to
try. I provided commands to recreate my experiments in case you want to build
on top of them.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/24465#issuecomment-2835992962