tomer filiba wrote:
> # read 3 UTF8 *characters*
> f.read(3)
> 
> # this will seek by AT LEAST 7 *bytes*, until resynched
> f.substream.seekby(7)
> 
> # we can resume reading of UTF8 *characters*
> f.read(3)
> 
> heck, i even like this idea :)

Note that resyncing is a really tricky operation, and
should not be expected to work for all encodings. For
example, for the iso-2022 encodings, you have to know
which character set you are currently "in", and to find
out you have to read forward/backward until you find an
escape sequence that switches the character set.

There is an RFC-imposed requirement that each line
of input is "neutral" wrt. character set switching,
so you can typically synchronize at a line break. Still,
this could require skipping an arbitrary amount of text.
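
Synchronizing at a line break could look roughly like this in Python.
This is only a sketch: resync_at_line_break is a made-up name, and it
assumes the data really does return to the neutral (ASCII) state before
every line break, as Python's iso2022_jp encoder arranges for the sample
text below.

```python
def resync_at_line_break(data: bytes, pos: int) -> int:
    """Return the offset just past the next b"\\n" at or after pos.

    Hypothetical helper. If there is no further line break, return
    len(data): everything up to the end of the data has to be skipped.
    """
    newline = data.find(b"\n", pos)
    return len(data) if newline == -1 else newline + 1


data = "日本語\nabc 漢字\n".encode("iso2022_jp")

# Pretend a byte-level seek dropped us at offset 4, somewhere inside the
# first line, where we do not know which character set is active.
start = resync_at_line_break(data, 4)

# From the start of the next line the decoder state is known again.
resynced = data[start:].decode("iso2022_jp")
print(repr(resynced))
```

How far the search has to go depends entirely on the line length; with
a single very long line, the whole remainder of the data gets skipped.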

Regards,
Martin