Re: [Pythonmac-SIG] Unicode and split

Jeremy Reichman Fri, 23 May 2008 11:53:23 -0700

Thanks to everyone who replied!

I'll look further into the encoding of the file, since I'm interested in
that for other reasons. In the output I saw, u"\xe1" (and a few other
characters I found after sending my note) showed up frequently around the
splits.


For the moment, though, I've solved my immediate difficulty by splitting
twice. I really only need the space-delimited fields that appear after a tab
in each line, and the characters causing problems are always before that tab.
I split on the tab first, and then a normal split of the remainder gets me
the fields I need.
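
For anyone finding this in the archive, here is a rough sketch of the
two-step split described above. The function name and the sample line are
made up for illustration, and it assumes the fields of interest follow the
first tab on each line; it is not the exact code from my script.

```python
def fields_after_tab(line):
    """Return the whitespace-delimited fields that follow the first tab.

    Hypothetical helper: assumes the fields of interest come after the
    first tab and that any problem characters come before it.
    """
    # First split: cut the line in two at the first tab.
    parts = line.split(u"\t", 1)
    if len(parts) < 2:
        return []  # no tab on this line, so there is nothing to return
    # Second split: a plain split() on the remainder, which breaks on
    # runs of whitespace and drops the trailing newline.
    return parts[1].split()


# Made-up sample line with u"\xe1" before the tab.
sample = u"caf\xe1 name\talpha beta  gamma\n"
print(fields_after_tab(sample))
```

Because the part before the tab is thrown away, whatever split() does with
the characters in it no longer matters.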


-- 
Jeremy


_______________________________________________
Pythonmac-SIG maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/pythonmac-sig
