John Machin [EMAIL PROTECTED] wrote:
william tanksley [EMAIL PROTECTED] wrote:
Cool. Sorry for the misunderstanding. Thank you for helping again!
Postscript: your request to print the actual data did the trick.
I'd back inspecting actual data against armchair philosophy any
time :-)
Heh.
On Aug 2, 10:02 am, william tanksley [EMAIL PROTECTED] wrote:
Given that the input file was
Unicode,
You mean something like encoded in UTF-8.
Here's another reference for you to read: http://www.amk.ca/python/howto/unicode
--
http://mail.python.org/mailman/listinfo/python-list
william tanksley wrote:
william tanksley [EMAIL PROTECTED] wrote:
I'm still puzzled why I'm getting some non-Unicode out of an
ElementTree's text, though.
Now I know.
Okay, my answer is that cElementTree (in Python 2.5) is simply
deranged when it comes to Unicode. It assumes everything's
On Jul 31, 12:58 am, william tanksley [EMAIL PROTECTED] wrote:
Thank you for the response. Here's some more info, including a little
that you didn't ask me for but which might be useful.
John Machin [EMAIL PROTECTED] wrote:
william tanksley [EMAIL PROTECTED] wrote:
To ask another way: how
Stefan Behnel [EMAIL PROTECTED] wrote:
william tanksley wrote:
Okay, my answer is that ElementTree (in Python 2.5) is simply
deranged when it comes to Unicode. It assumes everything's ASCII.
It does not assume that. It *requires* byte strings to be ASCII.
You can't encode Unicode into an
John Machin [EMAIL PROTECTED] wrote:
william tanksley [EMAIL PROTECTED] wrote:
Buffett Time - Annual Shareholders\xc2\xa0L.mp3
1. This isn't Unicode; it's missing the u (I printed using repr).
2. It's got the UTF-8 bytes there in the middle.
In addition to the above results,
*WHAT*
william tanksley wrote:
I didn't
pass a string. I passed a file. It didn't error out; instead, it
produced bytestring-encoded output (not Unicode).
From my experience (and from the source code I have seen so far), ElementTree
does not return UTF-8 encoded strings at the API level. Can you
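A minimal sketch for rerunning this check today, in Python 3 rather than the Python 2.5 used in the thread. It is not the test case anyone posted; the XML snippet and its element names are made up for illustration. It parses UTF-8 bytes from a file-like object and prints the type and repr of the text that comes back.

```python
import io
import xml.etree.ElementTree as ET

# Made-up document: a UTF-8 encoded XML file whose text contains U+00A0
# (no-break space), the character behind the \xc2\xa0 bytes in the thread.
xml_bytes = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<dict><string>Annual Shareholders\u00a0L.mp3</string></dict>"
).encode("utf-8")

# Parse from a file-like object, as the poster did ("I passed a file").
tree = ET.parse(io.BytesIO(xml_bytes))
value = tree.getroot().find("string").text

# Printing type and repr is the "inspect the actual data" step.
print(type(value).__name__, repr(value))
```

If the repr shows \xa0 as a single character, the parser has already decoded the bytes and no manual decode is needed.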
On Jul 31, 11:54 pm, william tanksley [EMAIL PROTECTED] wrote:
John Machin [EMAIL PROTECTED] wrote:
william tanksley [EMAIL PROTECTED] wrote:
Buffett Time - Annual Shareholders\xc2\xa0L.mp3
1. This isn't Unicode; it's missing the u (I printed using repr).
2. It's got the UTF-8 bytes
John Machin [EMAIL PROTECTED] wrote:
william tanksley [EMAIL PROTECTED] wrote:
Let's try again:
Cool. Sorry for the misunderstanding. Thank you for helping again!
Postscript: your request to print the actual data did the trick. I'm
including the rest of my reply just to provide context, but
On Thu, Jul 31, 2008 at 9:44 AM, william tanksley [EMAIL PROTECTED] wrote:
I'm using a file, a file that's correctly encoded as UTF-8, and it
returns some text elements that are raw bytes (undecoded). I have to
manually decode them.
I can't reproduce this behavior. Here's a simple test case:
On Aug 1, 7:44 am, william tanksley [EMAIL PROTECTED] wrote:
John Machin [EMAIL PROTECTED] wrote:
william tanksley [EMAIL PROTECTED] wrote:
Let's try again:
Cool. Sorry for the misunderstanding. Thank you for helping again!
Postscript: your request to print the actual data did the trick.
If you want to convert the file names which use standard URL encoding
(with %20 for space, etc) use:
from urllib import unquote
new_filename = unquote(filename)
I have found this does not convert encoded characters of the form
'#CC;' so you may have to do that manually. I think these are just
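A short sketch of the unquote suggestion above, written for Python 3, where the function lives in urllib.parse (the Python 2 spelling is the one quoted in the message). The filename is a made-up example. The second half assumes the other form mentioned is an XML/HTML numeric character reference such as &#38;, which is a guess about what the poster meant.

```python
from urllib.parse import unquote  # Python 2: from urllib import unquote
import html

# Hypothetical percent-encoded filename.
filename = "Buffett%20Time%20-%20Annual%20Shareholders.mp3"
new_filename = unquote(filename)
print(new_filename)

# unquote leaves numeric character references alone; html.unescape
# handles them (assuming that is the form the message refers to).
still_escaped = unquote("Q&#38;A.mp3")
unescaped = html.unescape(still_escaped)
print(still_escaped, unescaped)
```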
Thank you for the response. Here's some more info, including a little
that you didn't ask me for but which might be useful.
John Machin [EMAIL PROTECTED] wrote:
william tanksley [EMAIL PROTECTED] wrote:
To ask another way: how do I convert from a file:// URL to a local
path in a standard
On Wed, Jul 30, 2008 at 10:58 AM, william tanksley
[EMAIL PROTECTED] wrote:
Here's one example. The others are similar -- they have the same
things that look like problems to me.
Buffett Time - Annual Shareholders\xc2\xa0L.mp3
Note some problems here:
1. This isn't Unicode; it's missing
Jerry Hill [EMAIL PROTECTED] wrote:
william tanksley [EMAIL PROTECTED] wrote:
Here's one example. The others are similar -- they have the same
things that look like problems to me.
Buffett Time - Annual Shareholders\xc2\xa0L.mp3
I tried doing track_id.encode('utf-8'), but it doesn't seem
william tanksley wrote:
Okay, so you decode to go from raw
bytes into a given encoding, and you encode to go from a given encoding
to raw bytes.
No, decoding goes from a byte sequence to a Unicode string and encoding goes
from a Unicode string to a byte sequence.
Unicode is not an encoding. A
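The direction of the two operations can be sketched with the bytes from the example filename in this thread. This is Python 3 syntax, where byte strings carry a b prefix and text strings are Unicode by default; in Python 2.5 the same calls go from str to unicode and back.

```python
# The byte sequence as it appeared in the repr output in the thread.
raw = b"Buffett Time - Annual Shareholders\xc2\xa0L.mp3"

# decode: byte sequence -> Unicode string (the encoding names how to read the bytes)
text = raw.decode("utf-8")

# encode: Unicode string -> byte sequence (the encoding names how to write the bytes)
back = text.encode("utf-8")

print(repr(text))            # the two bytes \xc2\xa0 are now the one character \xa0
print(len(raw), len(text))   # the byte string is one longer than the text
print(back == raw)
```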
On Wed, Jul 30, 2008 at 2:27 PM, william tanksley [EMAIL PROTECTED] wrote:
Awesome... Thank you! I had my mental model of Python turned around
backwards. That's an odd feeling. Okay, so you decode to go from raw
bytes into a given encoding, and you encode to go from a given encoding
to raw
Jerry Hill [EMAIL PROTECTED] wrote:
On Wed, Jul 30, 2008 at 2:27 PM, william tanksley [EMAIL PROTECTED] wrote:
Awesome... Thank you! I had my mental model of Python turned around
backwards. That's an odd feeling. Okay, so you decode to go from raw
bytes into a given encoding, and you encode
william tanksley [EMAIL PROTECTED] wrote:
I'm still puzzled why I'm getting some non-Unicode out of an
ElementTree's text, though.
Now I know.
Okay, my answer is that cElementTree (in Python 2.5) is simply
deranged when it comes to Unicode. It assumes everything's ASCII.
Reference:
To ask another way: how do I convert from a file:// URL to a local
path in a standard way, so that filepaths from two different sources
will work the same way in a dictionary?
Right now I'm using the following source:
track_id = url2pathname(urlparse(track_id).path)
url2pathname is from urllib;
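A sketch of the same conversion in Python 3, where urlparse is in urllib.parse and url2pathname is in urllib.request. The helper name url_to_key and the example URL are made up, and the normcase/normpath step is an added assumption about what "work the same way in a dictionary" needs, not something taken from the thread.

```python
import os.path
from urllib.parse import urlparse
from urllib.request import url2pathname


def url_to_key(url):
    """Hypothetical helper: turn a file:// URL into a normalized local path."""
    # Same call as in the message: take the URL's path part and convert it
    # (this also undoes %20-style escapes).
    path = url2pathname(urlparse(url).path)
    # Assumption: normalize separators and, on Windows, case, so that paths
    # from two sources compare equal when used as dictionary keys.
    return os.path.normcase(os.path.normpath(path))


track_url = "file://localhost/C:/Music/Buffett%20Time.mp3"  # made-up example
key = url_to_key(track_url)
print(key)
```

The exact result depends on the platform, since url2pathname only applies drive-letter handling on Windows; running the filenames from the other source through the same normalization keeps the two sides comparable.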
On Jul 30, 3:53 am, william tanksley [EMAIL PROTECTED] wrote:
To ask another way: how do I convert from a file:// URL to a local
path in a standard way, so that filepaths from two different sources
will work the same way in a dictionary?
Right now I'm using the following source:
track_id =
I'm trying to convert the URLs contained in iTunes' XML file into a
form comparable with the filenames returned by iTunes' COM interface.
I'm writing a podcast sorter in Python; I'm using iTunes under Windows
right now. iTunes' COM provides most of my data input and all of my
mp3/aac editing