Re: JDK 1.8.0 33/40, diacritics and file problems

2015-05-10 Thread Fabrizio Giudici
Ok, I thought it was over, but it is not over yet. Many problematic file names are now correctly handled with explicit normalisation, but I just got: Caused by: java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: M?sica Antigua Eduardo Paniagua/La
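
A hedged aside on that exception: on Unix-like systems the JDK usually reports this message when a path string cannot be encoded in the charset the JVM uses for file names, which is normally derived from the locale. That would be a separate problem from normalisation, and whether it is the cause here is an assumption. The sketch below uses the standard `CharsetEncoder.canEncode` to check a name against a charset; the class name, method name and sample file name are hypothetical, and `sun.jnu.encoding` is an implementation-specific property that may not exist on every JVM.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class PathEncodingCheck {
    // True if every character of the name can be encoded in the given charset.
    static boolean isEncodable(String name, Charset charset) {
        return charset.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        String name = "M\u00fasica"; // hypothetical sample name containing U+00FA

        // Settings worth inspecting when the exception shows up.
        System.out.println("default charset:  " + Charset.defaultCharset());
        // Implementation-specific property; may print null on some JVMs.
        System.out.println("sun.jnu.encoding: " + System.getProperty("sun.jnu.encoding"));

        System.out.println("US-ASCII: " + isEncodable(name, StandardCharsets.US_ASCII));
        System.out.println("UTF-8:    " + isEncodable(name, StandardCharsets.UTF_8));
    }
}
```

If the check fails for the charset the JVM reports, running under a UTF-8 locale is one thing to try before looking further at normalisation.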

Re: JDK 1.8.0 33/40, diacritics and file problems

2015-04-29 Thread Fabrizio Giudici
On Tue, 28 Apr 2015 16:40:36 +0200, Mike Hearn m...@plan99.net wrote: I thought Mac OS X has a standard normalization for Unicode filenames. Linux just treats whatever it gets as bytes so it is up to the software creating the file. Am I correct? Looks like you are:

Re: JDK 1.8.0 33/40, diacritics and file problems

2015-04-28 Thread Fabrizio Giudici
On Tue, 28 Apr 2015 15:11:55 +0200, Mike Hearn m...@plan99.net wrote: They were rsynced from Mac OS X. I said *original* app. Rsync is not the original app and most likely does not attempt to re-encode or re-normalise Unicode strings. Ok. The original app is iTunes. I feared that. In

Re: JDK 1.8.0 33/40, diacritics and file problems

2015-04-28 Thread Fabrizio Giudici
On Mon, 27 Apr 2015 15:13:46 +0200, Mike Hearn m...@plan99.net wrote: Thus this may not be a bug in Java so much as a design problem/oversight with the operating systems themselves. Note that the issue you're running into is *not* to do with encodings. It's not a UTF-8 vs UTF-16 type issue.
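
To illustrate the distinction between encoding and normalisation, here is a minimal sketch using the standard `java.text.Normalizer`. It shows one visible name in two normalisation forms, both encoded as UTF-8, yielding different strings and different byte sequences. The class and helper names are hypothetical; the sample word is borrowed from the file name mentioned at the start of the thread.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

class NormalizationForms {
    // Composed form: base letter and accent merged into one code point where possible.
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    // Decomposed form: base letter followed by combining marks.
    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        String composed = "Cath\u00e9drale";    // U+00E9 as a single code point
        String decomposed = "Cathe\u0301drale"; // 'e' followed by combining acute U+0301

        System.out.println(composed.equals(decomposed));        // false
        System.out.println(nfc(decomposed).equals(composed));   // true
        System.out.println(nfd(composed).equals(decomposed));   // true
        // Same encoding (UTF-8), different byte counts: 11 vs 12
        System.out.println(utf8Length(composed) + " vs " + utf8Length(decomposed));
    }
}
```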

Re: JDK 1.8.0 33/40, diacritics and file problems

2015-04-28 Thread Scott Palmer
I thought Mac OS X has a standard normalization for Unicode filenames. Linux just treats whatever it gets as bytes so it is up to the software creating the file. Am I correct? (e.g. see: http://stackoverflow.com/questions/9757843/unicode-encoding-for-filesystem-in-mac-os-x-not-correct-in-python )

Re: JDK 1.8.0 33/40, diacritics and file problems

2015-04-28 Thread Mike Hearn
I thought Mac OS X has a standard normalization for Unicode filenames. Linux just treats whatever it gets as bytes so it is up to the software creating the file. Am I correct? Looks like you are: https://developer.apple.com/legacy/library/technotes/tn/tn1150.html#UnicodeSubtleties So HFS+
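
The linked technote describes HFS Plus as storing file names in a canonically decomposed form, while Linux file systems keep whatever bytes they are given. Under that premise, one possible workaround is to compare names only after normalising both sides, and then use the stored name unchanged when opening the file. The sketch below is an assumption about how that could look, not something proposed in the thread; `NormalizedLookup` and `findByNormalizedName` are hypothetical names, and the list of stored names stands in for a real directory listing.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class NormalizedLookup {
    // Returns the stored name (exactly as stored) whose NFC form matches
    // the NFC form of the wanted name, if there is one.
    static Optional<String> findByNormalizedName(List<String> storedNames, String wanted) {
        String key = Normalizer.normalize(wanted, Normalizer.Form.NFC);
        return storedNames.stream()
                .filter(n -> Normalizer.normalize(n, Normalizer.Form.NFC).equals(key))
                .findFirst();
    }

    public static void main(String[] args) {
        // Stand-in for a directory listing; the second name is in decomposed form.
        List<String> stored = Arrays.asList(
                "readme.txt",
                "La Cathe\u0301drale Engloutie.m4a");

        // The caller asks with the composed form.
        Optional<String> hit =
                findByNormalizedName(stored, "La Cath\u00e9drale Engloutie.m4a");
        System.out.println(hit.isPresent()); // true
    }
}
```

Returning the stored name rather than the normalised one matters on a file system that treats names as raw bytes, since only the stored form will resolve to the file.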

Re: JDK 1.8.0 33/40, diacritics and file problems

2015-04-28 Thread Mike Hearn
They were rsynced from Mac OS X. I said *original* app. Rsync is not the original app and most likely does not attempt to re-encode or re-normalise Unicode strings. I feared that. In the end it might be even reasonably doable, if I can take advantage of some preconditions... for instance:

JDK 1.8.0 33/40, diacritics and file problems

2015-04-24 Thread Fabrizio Giudici
Ok, I've run into many problems with diacritics in the past, as there were some JDK problems, but I assumed they had all been fixed by now. But perhaps there's something I'm not understanding. I have several files with diacritics in their names, e.g. La Cathédrale Engloutie.m4a. A