from my hasty reading of libhtml, i think the tokenization is almost
correct. the only change needed is to stop translating \t to 8 spaces inside
<pre>. for rendering, perhaps the solution is to add a flag indicating that the
text is <pre>-formatted and just memcpy() it in render.
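
a rough sketch of the flag idea, in portable c rather than plan 9 c. the
Text struct, its pre field and rendertext() are made up for illustration;
they are not libhtml's real types or functions, and the trimming branch only
approximates what renderrunes() is described as doing.

```c
#include <stddef.h>
#include <string.h>

/* hypothetical text item; not libhtml's real Itext */
typedef struct Text Text;
struct Text {
	const char *s;	/* token text, tabs left as-is by the tokenizer */
	size_t n;	/* length of s in bytes */
	int pre;	/* nonzero if the text came from inside <pre> */
};

/*
 * copy t into out (capacity cap) and return the number of bytes written.
 * preformatted text is copied verbatim; other text loses its trailing
 * blanks first. output longer than cap is truncated.
 */
size_t
rendertext(const Text *t, char *out, size_t cap)
{
	size_t n;

	n = t->n;
	if(!t->pre)
		while(n > 0 && (t->s[n-1] == ' ' || t->s[n-1] == '\t'))
			n--;
	if(n > cap)
		n = cap;
	memcpy(out, t->s, n);
	return n;
}
```

with pre set, the two bytes of "1\t" come out untouched; with pre clear, only
the "1" survives. where the tab lands on screen is left to whatever draws
the text.
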
i was impressed with how little special casing there is in the tokenizing
and rendering code, given how ad hoc html is. otoh, maybe <pre> should be
handled in a special manner, with the tokenizer just converting character
sets and entities and treating the result as one big Bytes*.
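
a minimal sketch of that second idea, again in portable c. Buf stands in for
Bytes*, and the ents table and predecode() are hypothetical names, not
anything in libhtml. character set conversion is left out entirely and only
four entities are known.

```c
#include <stddef.h>
#include <string.h>

/* hypothetical stand-in for one big Bytes*: a length plus the data */
typedef struct Buf Buf;
struct Buf {
	size_t n;
	char data[256];
};

/* the handful of entities this sketch knows; a real table is much longer */
static const struct {
	const char *name;
	char c;
} ents[] = {
	{"&lt;", '<'},
	{"&gt;", '>'},
	{"&amp;", '&'},
	{"&quot;", '"'},
};

/*
 * copy the body of a <pre> element into b, replacing known entities
 * and leaving tabs, spaces and newlines untouched. input that would
 * overflow b is truncated.
 */
void
predecode(const char *s, Buf *b)
{
	size_t i, len, nents;

	nents = sizeof ents / sizeof ents[0];
	len = 0;
	b->n = 0;
	while(*s != '\0' && b->n < sizeof b->data){
		/* look for an entity name starting at s */
		for(i = 0; i < nents; i++){
			len = strlen(ents[i].name);
			if(strncmp(s, ents[i].name, len) == 0)
				break;
		}
		if(i < nents){
			b->data[b->n++] = ents[i].c;
			s += len;
		}else
			b->data[b->n++] = *s++;
	}
}
```

the renderer would then get the whole buffer as a single item and never see
per-word tokens for <pre>, so there is no trailing whitespace to eat.
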
erik
Federico Benavento <[EMAIL PROTECTED]> writes
|
| hi
|
| On 10/19/05, erik quanstrom <[EMAIL PROTECTED]> wrote:
| > to be more exact: this
| >
| > <pre>
| > 1 2
| > </pre>
| >
| > yields 2 tokens
| >
| > '1        ' ('1' + ' '*8)
| > '2'
| >
| > render() never sees the <pre> tag and renderrunes()
| > eats the trailing spaces yielding
| >
| > 1 2
| >
| I'm aware of this,
|
| > i'm not sure if this is the spec or not, but it's not what
| > one expects, based on most browsers.
| >
|
| You are right, there are a lot of things to be improved:
| <pre>, <table>, justified text, etc. To do this, the
| whole render process needs to be changed, and I still
| don't know how to solve this.
| Charon solves this by using the Line struct (layout.b),
| breaking the html items list into a list of Lines.
| I don't want to break the list of items. I'm short
| of ideas about how to do this, but I'm still thinking.
|
| Suggestions are welcome.