I have a patch to change this vanilla file read:

    my $text = do { local $/; open my($fh), $file; <$fh> };

to this one using PerlIO's utf8 layer:

    my $text = do { local $/; open my($fh), "< :utf8", $file; <$fh> };

Aside from differences in Perl versions (that's easy enough to handle) and the stupidity of reading the whole file in one go, is this going to bite me in the butt with encodings? I have next to no experience with UTF-8 problems because we mostly use ASCII around here.

My main concerns are:

* Will a plist always be utf8? For instance, what about localizations for Japan, etc.? The encoding shows up in the `<?xml ...?>` portion, but am I going to have to open the file to find that, then re-open it?
* What happens if it's not utf8 but I tell Perl it is?
* Will whatever I do work on another machine? I wrote this module to read plist files on my FreeBSD machine, and I don't want to make users have to be on a Mac (or even the same Mac) to be able to parse a file.

I have some other ideas to get around this, but my inexperience with character encodings makes me cautious (i.e. it gives me an excuse to not do anything):

* Ignore it. A user can get the same thing by reading the file and using `parse_plist`, which takes a string.
* Add a `parse_plist_fh` method and let the user make their own filehandles. I'm going to do that because it's a good idea, and I can make `parse_plist_file` redirect to that if its arg is a filehandle. However, the user then needs to open their own filehandle and know things I'd like to be able to guess for them.
* If the plist is always going to be utf8, just make the change.
* Actually hook into the Mac libraries and let the OS parse it and give me back the data. That's the best solution, but only if I give up being pure-Perl and portable, which are both very important to me. I will gladly add something to do that if someone writes it for me, but alongside the pure-Perl version.
    * Can I hook into the Mac stuff with something like Inline::Java?
* Find a sucker to take over the module and figure it out for me. As P. T. Barnum said, "There's a new CPAN user registered every second".
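
To make the "open it, then re-open it" worry concrete, here is a minimal sketch of one way it might be avoided: read the file once as raw bytes, look for an `encoding` attribute in the XML declaration, and decode the bytes with the core `Encode` module. The helper names `slurp_plist_text` and `decode_plist_bytes` are hypothetical and not part of the module. The sketch assumes an ASCII-compatible encoding (the sniffing regex will not see a declaration in a UTF-16 file) and falls back to UTF-8 when nothing is declared. It decodes strictly, so bytes that are not valid in the chosen encoding die instead of slipping through, which, as far as I understand, the plain `:utf8` layer does not guarantee.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper: read the file once as raw bytes, then decode.
sub slurp_plist_text {
    my ($file) = @_;

    open my $fh, '<:raw', $file or die "Could not open $file: $!";
    my $bytes = do { local $/; <$fh> };
    close $fh;

    return decode_plist_bytes($bytes);
}

# Hypothetical helper: sniff the declared encoding and decode strictly.
sub decode_plist_bytes {
    my ($bytes) = @_;

    # Assumption: default to UTF-8 when the declaration names no encoding.
    my $name = 'UTF-8';
    if ( $bytes =~ /\A(?:\xEF\xBB\xBF)?<\?xml[^>]*?\bencoding\s*=\s*["']([A-Za-z][\w.\-]*)["']/ ) {
        $name = $1;
    }

    my $enc = find_encoding($name)
        or die "Unknown encoding '$name' in XML declaration";

    # FB_CROAK makes malformed input die rather than pass silently.
    my $text = $enc->decode( $bytes, Encode::FB_CROAK );

    # Drop a leading byte order mark if one survived decoding.
    $text =~ s/\A\x{FEFF}//;

    return $text;
}
```

The decoded string could then go to `parse_plist` as before, so nothing here depends on being on a Mac. Whether real-world plists ever declare something other than UTF-8 is still the open question.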