Thanks for the tip. It's good to know that Tcl does everything in utf-8. I haven't had the need to localize this application yet but have been putting effort into writing it in such a way that it will lend itself to cross-platform compatibility and localization.
Scott

----------------------------------------------------------------------
Scott Parrill
Information Technology Specialist
Enterprise Technology Services /
Wyoming State Geological Survey
State of Wyoming
P.O. Box 1347
Laramie, WY 82073
Phone: 307-766-2286 x242
Fax: 307-766-2605
E-mail: scott.parr...@wyo.gov
----------------------------------------------------------------------
E-Mail to and from me, in connection with the transaction of public
business, is subject to the Wyoming Public Records Act and may be
disclosed to third parties.
----------------------------------------------------------------------

On Thu, Aug 1, 2013 at 1:39 PM, Jeff Hobbs <je...@activestate.com> wrote:

> Also note that you are dealing with the Perl<>Tcl bridge in this case.
> Tcl uses utf-8 exclusively under the covers, and will deliver that to
> Perl. This only will become important when you hit the funky filenames,
> but you'll get much more sensitive to it should you need to localize a user
> interface.
>
> Jeff
>
> On 2013-08-01, at 12:28 PM, Scott Parrill <scott.parr...@wyo.gov> wrote:
>
> Well that explains why I've never run across the problem before in the
> number of years I've been working with Perl scripts. This is the first
> major application I've written with Tkx that had to do any file handling.
> Always before, any file handling I did either didn't involve any of the
> oddball characters Windows allows in file names or used things like
> readdir to hide the behind-the-scenes magic.
>
> It's amazing what you learn when things don't work as expected. :)
>
> Thanks again.
> Scott
>
> On Thu, Aug 1, 2013 at 1:14 PM, Jeff Hobbs <je...@activestate.com> wrote:
>
>> Of course, I knew it would. ;)
>>
>> OpenFile1 is returning utf-8 encoded text, which needs the special
>> Win32 functions.
>> OpenFile2 is only obviating this with a special case
>> in readdir() for directory contents. The regular Perl functions for
>> file handling are otherwise not capable of general unicode-containing
>> filename operations on Windows.
>>
>> Jeff
>>
>> On Thu, Aug 1, 2013 at 12:07 PM, Scott Parrill <scott.parr...@wyo.gov>
>> wrote:
>> > Jeff,
>> >
>> > Interestingly enough, it appears that Win32::Unicode::File does actually
>> > seem to solve the problem. I changed OpenFile1 to
>> >
>> > sub OpenFile1 {
>> >     my $filename = Tkx::tk___getOpenFile();
>> >     print $filename . "\n";
>> >
>> >     @list = statW($filename);
>> >     print Dumper(\@list);
>> >
>> >     $fh = Win32::Unicode::File->new();
>> >     $fh->open('<', $filename) or die;
>> >     while ($line = $fh->readline()) {
>> >         print $line;
>> >     }
>> > }
>> >
>> > And the file opened and displayed correctly. How COOL!
>> >
>> > Thanks,
>> > Scott
>> >
>> > On Thu, Aug 1, 2013 at 10:01 AM, Scott Parrill <scott.parr...@wyo.gov>
>> > wrote:
>> >>
>> >> Jeff,
>> >>
>> >> The print statements in OpenFile1 and OpenFile2 display different
>> >> characters when opening the same file. OpenFile1 prints "╞Æ.txt" whereas
>> >> OpenFile2 prints "â.txt".
>> >> Incidentally, the Windows dir command and Windows
>> >> Explorer display the file name as "ƒ.txt".
>> >>
>> >> The difference between the way the file name prints in OpenFile1
>> >> (which fails to open the file) and OpenFile2 (which correctly opens the
>> >> file) makes me think this is a problem in the Tkx::tk___getOpenFile()
>> >> function.
>> >>
>> >> The difference between the way OpenFile2 and the Windows dir command
>> >> display the file name is most likely a Unicode translation problem, which
>> >> is only an issue when I try to store the file name in a database or
>> >> something of the sort. (I'll eventually have to work this one out as well.)
>> >>
>> >> Scott
>> >>
>> >> On Wed, Jul 31, 2013 at 6:13 PM, Jeff Hobbs <je...@activestate.com> wrote:
>> >>>
>> >>> It is my understanding that on Windows, you need to use
>> >>> Win32::Unicode::File/Dir functions to manipulate unicode filenames.
>> >>> If you print out (or display in a Unicode-compliant way) $filename
>> >>> before the stat() call, does it show the right unicode? If so, try
>> >>> the Win32::Unicode alternatives.
>> >>>
>> >>> Jeff
>> >>>
>> >>> On Wed, Jul 31, 2013 at 9:03 AM, Scott Parrill <scott.parr...@wyo.gov>
>> >>> wrote:
>> >>> > I've noted that the Tkx::tk___getOpenFile function does not seem to like
>> >>> > high-ASCII characters in file names on Windows. (I haven't had an
>> >>> > opportunity to test this on non-Windows platforms at this point.)
>> >>> >
>> >>> > Given the following code:
>> >>> > ---------------------------------------
>> >>> > use feature 'unicode_strings';
>> >>> >
>> >>> > use Encode qw(encode decode);
>> >>> > use Data::Dumper;
>> >>> > use Tkx;
>> >>> >
>> >>> > sub OpenFile1 {
>> >>> >     my $filename = Tkx::tk___getOpenFile();
>> >>> >     @list = stat($filename);
>> >>> >     print Dumper(\@list);
>> >>> >
>> >>> >     open IN, "<", $filename or die;
>> >>> >     while ($line = <IN>) {
>> >>> >         print $line;
>> >>> >     }
>> >>> >     close IN;
>> >>> > }
>> >>> >
>> >>> > sub OpenFile2 {
>> >>> >     opendir DIR, ".";
>> >>> >     while ($filename = readdir(DIR)) {
>> >>> >         next if ($filename eq '.' or $filename eq '..');
>> >>> >         if ($filename =~ /txt$/) {
>> >>> >             @list = stat($filename);
>> >>> >             print Dumper(\@list);
>> >>> >
>> >>> >             open IN, "<", $filename or die;
>> >>> >             while ($line = <IN>) {
>> >>> >                 print $line;
>> >>> >             }
>> >>> >             close IN;
>> >>> >         }
>> >>> >     }
>> >>> > }
>> >>> >
>> >>> > $mw = Tkx::widget->new(".");
>> >>> > $b = $mw->new_ttk__button(-text => 'Open File 1',
>> >>> >                           -command => sub { OpenFile1() },
>> >>> >                          );
>> >>> > $b->g_pack();
>> >>> >
>> >>> > $b2 = $mw->new_ttk__button(-text => 'Open File 2',
>> >>> >                            -command => sub { OpenFile2() },
>> >>> >                           );
>> >>> > $b2->g_pack();
>> >>> >
>> >>> > Tkx::MainLoop();
>> >>> > ---------------------------------------
>> >>> >
>> >>> > Now create a file in the working directory, so that OpenFile2() will
>> >>> > find it, with a name containing a high-ASCII character (I used
>> >>> > "\x{0092}.txt").
>> >>> > What I have found is that the OpenFile1() button will die on the
>> >>> > "open IN,..." statement at line 12 and the stat() function, on line 9,
>> >>> > returns no data. However, the OpenFile2() function will correctly open
>> >>> > and read the file and the stat() function, on line 24, returns correct
>> >>> > data for the file.
>> >>> > Using a file with unicode characters in the name (like "\x{0289}.txt")
>> >>> > seems to work correctly in both OpenFile1() and OpenFile2().
>> >>> >
>> >>> > Does anyone have a suggestion on how to either get
>> >>> > Tkx::tk___getOpenFile() to return the file name correctly or how to
>> >>> > work around the problem?
>> >>> >
>> >>> > Thanks,
>> >>> > Scott
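
The two different renderings of "ƒ.txt" reported in the thread can be reproduced with the core Encode module alone. The following is only a sketch under stated assumptions: that the name from Tkx reaches the console as UTF-8 bytes (as Jeff describes), that readdir() hands back bytes in the ANSI code page (assumed here to be cp1252), and that the console renders bytes as code page 437. The helper name as_cp437_console is hypothetical and used only for illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: interpret raw bytes the way a console set to
# code page 437 (an assumption about the machine in the thread) would.
sub as_cp437_console {
    my ($bytes) = @_;
    return decode('cp437', $bytes);
}

# U+0192 is the character that dir and Explorer show as the file name.
my $name = "\x{0192}.txt";

# Assumption: the Tkx result ends up on the console as UTF-8 bytes (C6 92).
my $from_tcl = encode('UTF-8', $name);

# Assumption: readdir() returns the name in the ANSI code page, cp1252 (83).
my $from_readdir = encode('cp1252', $name);

binmode(STDOUT, ':encoding(UTF-8)');
print as_cp437_console($from_tcl), "\n";       # prints "╞Æ.txt", as OpenFile1 did
print as_cp437_console($from_readdir), "\n";   # prints "â.txt", as OpenFile2 did
```

If those assumptions hold, this is consistent with Jeff's explanation: both names refer to the same file, but only the ANSI-encoded bytes are usable with Perl's regular open() and stat() on Windows, while the UTF-8 form needs the Win32::Unicode functions.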