Ok, I thought it was over, but it is not over yet. Many problematic file
names are now correctly handled with explicit normalisation, but I just
got:
Caused by: java.nio.file.InvalidPathException: Malformed input or input
contains unmappable characters: M?sica Antigua Eduardo Paniagua/La
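
For what it's worth, on Unix-like systems the JDK raises this exact message when a path string cannot be encoded in the charset the JVM uses for file names. Whether that is what happens here is an assumption, since the thread does not show the locale settings. A minimal sketch to check it follows; the class and method names are made up for illustration, and `sun.jnu.encoding` is an implementation-specific property that may be absent on other JVMs.

```java
import java.nio.charset.Charset;

final class FileNameEncodingCheck {

    /** Returns true if every character of {@code name} can be encoded in {@code cs}. */
    static boolean canEncode(String name, Charset cs) {
        return cs.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // Implementation-specific property (OpenJDK / Oracle JDK); may be null elsewhere.
        String jnu = System.getProperty("sun.jnu.encoding");
        Charset cs = (jnu != null && Charset.isSupported(jnu))
                ? Charset.forName(jnu)
                : Charset.defaultCharset();

        // Sample name with a precomposed u-acute (U+00FA), as in the failing path.
        String name = "M\u00fasica Antigua";

        System.out.println("File name charset: " + cs);
        System.out.println("Can encode sample name: " + canEncode(name, cs));
    }
}
```

If this prints an ASCII-only charset (typical with `LANG=C` or an unset locale), starting the JVM under a UTF-8 locale would be the first thing to try.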
On Tue, 28 Apr 2015 16:40:36 +0200, Mike Hearn m...@plan99.net wrote:
I thought Mac OS X has a standard normalization for Unicode filenames.
Linux just treats whatever it gets as bytes, so it is up to the software
creating the file. Am I correct?
Looks like you are:
On Tue, 28 Apr 2015 15:11:55 +0200, Mike Hearn m...@plan99.net wrote:
They were rsynced from Mac OS X.
I said *original* app. Rsync is not the original app and most likely does
not attempt to re-encode or re-normalise Unicode strings.
Ok. The original app is iTunes.
On Mon, 27 Apr 2015 15:13:46 +0200, Mike Hearn m...@plan99.net wrote:
Thus this may not be a bug in Java so much as a design problem/oversight
with the operating systems themselves.
Note that the issue you're running into has *nothing* to do with encodings.
It's not a UTF-8 vs UTF-16 type of issue.
(e.g. see:
http://stackoverflow.com/questions/9757843/unicode-encoding-for-filesystem-in-mac-os-x-not-correct-in-python
)
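
To illustrate the normalization point: the same visible name can be stored as two different code point sequences, precomposed (NFC) or decomposed (NFD), and both are perfectly valid in UTF-8 or UTF-16. The sketch below uses the standard `java.text.Normalizer`; the class name and the `sameName` helper are hypothetical, and comparing names after normalizing both sides to NFC is one possible approach, not something the thread prescribes.

```java
import java.text.Normalizer;

final class NormalizationDemo {

    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    /** Compares two file names irrespective of their normalization form. */
    static boolean sameName(String a, String b) {
        return nfc(a).equals(nfc(b));
    }

    public static void main(String[] args) {
        String composed = "Cath\u00e9drale";     // e-acute as the single code point U+00E9
        String decomposed = "Cathe\u0301drale";  // 'e' followed by combining acute U+0301

        System.out.println(composed.equals(decomposed));    // false: different code points
        System.out.println(sameName(composed, decomposed)); // true: equal once normalized
    }
}
```

A name read from a directory listing can be passed through the same helper before it is compared with a name coming from elsewhere (a database, a playlist, user input).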
I thought Mac OS X has a standard normalization for Unicode filenames.
Linux just treats whatever it gets as bytes, so it is up to the software
creating the file. Am I correct?
Looks like you are:
https://developer.apple.com/legacy/library/technotes/tn/tn1150.html#UnicodeSubtleties
So HFS+
They were rsynced from Mac OS X.
I said *original* app. Rsync is not the original app and most likely does
not attempt to re-encode or re-normalise Unicode strings.
I feared that. In the end it might be even reasonably doable, if I can
take advantage of some preconditions... for instance:
Ok, I've run into many problems with diacritics in the past, as there were
some JDK problems, but I assumed they had all been fixed by now. But perhaps
there's something I'm not understanding.
I have several files with diacritics in their names, e.g. La
Cathédrale Engloutie.m4a. A