On Thu, Apr 14, 2011 at 18:27, William A. Rowe Jr. <[email protected]> wrote:

> On 4/14/2011 8:04 PM, Branko Čibej wrote:
> > On 15.04.2011 01:24, William A. Rowe Jr. wrote:
> >> On 4/14/2011 6:00 PM, Jonathan Leffler wrote:
> >>> Given that the second byte is in the range 0x40..0x7E (second para), and / is 0x2F, there
> >>> shouldn't be a problem with Shift-JIS. That's not to say there isn't another codeset
> >>> where there isn't a problem, but I don't think it is Shift-JIS and possibly not any of the
> >>> main Japanese codesets.
> >> Looking at the references to the EUC and Big5 encodings, it seems similarly safe to
> >> assume 20-3F are always the expected ASCII representations, while 0x40-0x7E seem dicey.
> >>
> >> Thanks for this research, it saved me a ton of headaches!
> >
> > This is what I remember from my messy days, too. Path handling would be
> > a nightmare otherwise, it's bad enough for poor Windows lusers that \
> > gets displayed as ¥ :)
>
> Ok, stupid question; is ¥ the pathname delimiter, for real? I don't think I have
> a working shift-jis environment set up.

The Wikipedia page for Shift-JIS (http://en.wikipedia.org/wiki/Shift_JIS)
shows the Yen symbol ¥ as appearing at 0x5C, which is where the backslash
appears in Unicode and ISO 8859-x codesets. It (backslash) also falls into
the danger zone identified earlier (0x40..0x7E).

Sorry - I didn't check backslash earlier.

-- 
Jonathan Leffler <[email protected]> #include <disclaimer.h>
Guardian of DBD::Informix - v2008.0513 - http://dbi.perl.org
"Blessed are we who can laugh at ourselves, for we shall never cease to be amused."
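
For anyone without a Shift-JIS environment to hand, the overlap discussed above can be checked with a rough Python sketch. It is an illustration only, not code from any project in this thread. The helper names `is_lead_byte` and `trail_byte_hits` are made up for the example, and the lead-byte ranges (0x81..0x9F and 0xE0..0xFC) are an assumption chosen to cover the common vendor extensions as well as plain Shift-JIS.

```python
# Sketch: find ASCII delimiter values that occur as the SECOND byte of a
# double-byte Shift-JIS character. Helper names are hypothetical.

def is_lead_byte(b):
    # Assumed lead-byte ranges for double-byte characters
    # (0x81..0x9F and 0xE0..0xFC, which includes vendor extensions).
    return 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xFC


def trail_byte_hits(data, delimiter):
    """Return offsets where `delimiter` appears as a trail byte."""
    hits = []
    i = 0
    while i < len(data):
        if is_lead_byte(data[i]) and i + 1 < len(data):
            # data[i + 1] is the trail byte of a double-byte character.
            if data[i + 1] == delimiter:
                hits.append(i + 1)
            i += 2
        else:
            # Single-byte character (ASCII or half-width katakana).
            i += 1
    return hits


# U+8868 encodes as 0x95 0x5C in Shift-JIS, so its trail byte has the
# same value as the backslash.
name = "\u8868.txt".encode("shift_jis")

backslash_hits = trail_byte_hits(name, 0x5C)  # backslash / Yen position
slash_hits = trail_byte_hits(name, 0x2F)      # forward slash

# A byte-wise split on backslash cuts the character in half.
naive_parts = name.split(b"\\")
```

With this input, `backslash_hits` is `[1]` and `slash_hits` is empty, which matches the observation that 0x2F lies outside the 0x40..0x7E trail-byte range while 0x5C lies inside it. `naive_parts` shows how a byte-wise split on backslash would break the file name in two.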
