I've attached a patch to Issue #42 as a first step towards real
streaming of input.

It changes the HTMLInputStream.dataStream into a codecs.StreamReader and
uses the StreamReader's read() method in HTMLInputStream.char().
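
To illustrate the idea, here is a minimal sketch, not the patch itself. It
assumes Python 3 and a UTF-8 byte stream, and iter_chars is a made-up name
for illustration. It wraps a byte stream in a codecs.StreamReader and pulls
decoded characters one at a time with read(1), much as char() would:

```python
import codecs
import io


def iter_chars(byte_stream, encoding="utf-8"):
    """Yield decoded characters one at a time from a byte stream.

    Hypothetical helper, not part of html5lib.
    """
    # codecs.getreader() returns the StreamReader class for the encoding;
    # instantiating it wraps the underlying byte stream.
    reader = codecs.getreader(encoding)(byte_stream)
    while True:
        # read(1) keeps pulling bytes until one full character is decoded,
        # so multi-byte sequences are never split.
        char = reader.read(1)
        if not char:  # empty string means end of stream
            break
        yield char


chars = list(iter_chars(io.BytesIO("h\u00e9!".encode("utf-8"))))
```

No seek() or tell() is needed on the underlying stream for this part.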

I also completely refactored how position() is computed (there's no more
'tell' variable, though I probably could have kept it). Actually, I haven't
understood exactly how 'tell' was managed (particularly in charsUntil()).
Maybe it has to do with the conversion of \r into \n?
In any case, 'tell' is only used internally (html5parser only uses position()).
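
For readers wondering how a position can be computed without a byte offset,
here is one possible approach. It is a sketch under my own assumptions and
not necessarily what the patch does; PositionTracker and feed() are
hypothetical names. The idea is to count lines and columns as characters are
consumed, treating \r, \n and \r\n each as a single line break:

```python
class PositionTracker:
    """Hypothetical helper: derives (line, column) from consumed characters."""

    def __init__(self):
        self.line = 1
        self.col = 0
        self._last_was_cr = False

    def feed(self, char):
        # A "\n" right after a "\r" belongs to the same line break,
        # which was already counted when the "\r" was seen.
        if char == "\n" and self._last_was_cr:
            self._last_was_cr = False
            return
        self._last_was_cr = (char == "\r")
        if char in ("\r", "\n"):
            self.line += 1
            self.col = 0
        else:
            self.col += 1

    def position(self):
        return (self.line, self.col)


tracker = PositionTracker()
for ch in "ab\r\ncd\re":
    tracker.feed(ch)
```

Because the counters only depend on the characters already read, this works
on streams that support neither seek() nor tell().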

I said above that this is a first step because you still cannot use a
non-seekable stream if you rely on encoding detection (which still uses seek()).
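
One possible way to lift that limitation later, sketched here purely as an
assumption (ReplayPrefixStream, peek_prefix() and the prefix size are all
made up, and none of this is in the patch), would be to buffer the first
bytes for sniffing and replay them on read() instead of seeking back:

```python
import io


class ReplayPrefixStream:
    """Hypothetical wrapper: peek at the start of a non-seekable stream,
    then read the whole content without calling seek()."""

    def __init__(self, raw, prefix_size=512):
        self._raw = raw
        self._prefix = raw.read(prefix_size)  # bytes kept for sniffing
        self._offset = 0                       # prefix bytes already replayed

    def peek_prefix(self):
        # Encoding detection would inspect these bytes (BOM, <meta>, ...).
        return self._prefix

    def read(self, size=-1):
        # Serve the buffered prefix first, then fall through to the raw stream.
        if size is None or size < 0:
            data = self._prefix[self._offset:] + self._raw.read()
            self._offset = len(self._prefix)
            return data
        chunk = self._prefix[self._offset:self._offset + size]
        self._offset += len(chunk)
        if len(chunk) < size:
            chunk += self._raw.read(size - len(chunk))
        return chunk


stream = ReplayPrefixStream(io.BytesIO(b"\xef\xbb\xbfhello"), prefix_size=4)
sniffed = stream.peek_prefix()
first = stream.read(2)
rest = stream.read()
```

The wrapped object could then be handed to a codecs.StreamReader once the
encoding has been chosen.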


Should I make a branch in the repository, or are you OK with
committing the patch to trunk?

-- 
Thomas Broyer
