On 3/20/08 5:05 PM, Gunnar Hjalmarsson wrote:
> David Newman wrote:
>> I have some CSV input files that contain control and extended ASCII characters,
>>
>> <snip>
>>
>> The Text::CSV or Tie::Handle::CSV modules don't like these characters; the snippets below both return errors when they get to one.
>>
>> <snip>
>>
>> my $csv = Text::CSV->new();
>
> In the docs for Text::CSV, that way of creating a new object is mentioned at the top of the SYNOPSIS section. The solution to your problem is stated right after that.
>
> So, the usual recommendation:
>
> "Read the docs for the module you are using."
>
> is very much applicable. ;-)

<time passes, seasons change, children grow up>

OK, thanks for this polite RTFM.

However, it doesn't answer the root question, namely how to parse text that contains Western European characters such as accents and umlauts.

I see from the Text::CSV documentation that this module handles only characters between 0x20 and 0x7e. I also see there is a binary mode for any character, but the documentation does not describe whether the module parses binary-mode characters the same way as ASCII characters.

This seems like a fairly standard problem. What's the "right" way (or, given Perl culture, "a" way) to handle text outside the 0x20 to 0x7e range?
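
For reference, the binary mode mentioned above might be used roughly like this. It is only a minimal sketch, assuming a Text::CSV release that accepts the binary attribute in its constructor and input encoded as Latin-1 bytes; the sample line and variable names are made up for illustration and do not come from the real input files.

```perl
use strict;
use warnings;
use Text::CSV;

# binary => 1 asks the parser to accept bytes outside 0x20-0x7e in fields.
my $csv = Text::CSV->new({ binary => 1 })
    or die "Cannot create Text::CSV object: " . Text::CSV->error_diag();

# Hypothetical sample line containing Latin-1 bytes:
# e-acute (0xE9) and u-umlaut (0xFC).
my $line = "Ren\xe9,Z\xfcrich,42";

my @fields;
if ($csv->parse($line)) {
    @fields = $csv->fields();
} else {
    die "Parse failed on input: " . $csv->error_input();
}

# Report only the field count, to avoid writing raw bytes to the terminal.
printf "%d fields parsed\n", scalar @fields;
```

In this sketch the fields come back as byte strings, so the accented characters are still undecoded Latin-1 bytes. If character semantics are needed later, decoding each field with the core Encode module (for example Encode::decode('ISO-8859-1', $field)) is one possible follow-up step, again assuming the files really are Latin-1.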

Many thanks!

dn