okay, i was using htmlfmt to try to turn some <pre>-formatted
character tables into something useful. this is what i did:
rcsdiff html.c
===================================================================
RCS file: RCS/html.c,v
retrieving revision 1.1
diff -r1.1 html.c
84a85,100
> static void
> renderprerunes(Bytes* b, Rune* r)
> {
> char *s;
> int nr;
>
> nr = r ? runestrlen(r) : 0;
> if (!nr)
> return;
> s = smprint("%.*S", runestrlen(r), r);
> growbytes(b, s, strlen(s));
> free(s);
> col = 0; // just print it.
> inword = 0;
> }
>
211c227,230
< renderrunes(t, it->s);
---
> if (il->state & IFwrap)
> renderrunes(t, it->s);
> else
> renderprerunes(t, it->s);
this does not fix the problem of converting \t → spaces,
and "col = 0" in renderprerunes() is wrong. but it doesn't
look like col is reset when il->state has IFbrk or IFbrksp
set, either.
i still think this would be easier if everything inside
the <pre> were one token. that way we would know
when we are done and could set the end state more
easily and accurately.
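
for the \t problem, here is a rough sketch of what i mean by handling the
whole <pre> block at once. this is portable c, not plan 9 c, and it is not
the htmlfmt code: expandpre() and Tabstop are names made up for illustration,
the tab width of 8 is an assumption, and it works on utf-8 bytes rather than
Rune*. the point is only that the column is tracked across the block, reset on
newline, and handed back to the caller as the end state.

```c
#include <stdlib.h>
#include <string.h>

enum { Tabstop = 8 };	/* assumed tab width */

/*
 * expand tabs in one whole <pre> block.
 * returns a malloc'd string (caller frees), or NULL if malloc fails.
 * *colp is the column on entry; on return it holds the column
 * after the last byte, so the caller knows the end state.
 */
char*
expandpre(const char *s, int *colp)
{
	const char *p;
	char *out, *q;
	int col, pad;

	/* worst case: every byte is a tab that becomes Tabstop spaces */
	out = malloc(strlen(s) * Tabstop + 1);
	if(out == NULL)
		return NULL;
	col = *colp;
	q = out;
	for(p = s; *p != '\0'; p++){
		switch(*p){
		case '\t':
			/* pad with spaces up to the next tab stop */
			pad = Tabstop - col % Tabstop;
			memset(q, ' ', pad);
			q += pad;
			col += pad;
			break;
		case '\n':
			*q++ = '\n';
			col = 0;
			break;
		default:
			*q++ = *p;
			/* count characters, not utf-8 continuation bytes */
			if(((unsigned char)*p & 0xC0) != 0x80)
				col++;
			break;
		}
	}
	*q = '\0';
	*colp = col;
	return out;
}
```

the caller starts with col = 0 (or whatever the renderer's column is at the
<pre>), passes &col, and gets the updated column back, so nothing has to guess
at the end state afterwards.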
erik
Federico Benavento <[EMAIL PROTECTED]> writes
|
| Now abaco handles <pre> text in a different way.
| This is a quick and dirty change, since when the line
| width is bigger than the screen width the text is not
| shown. I knew that this was easy to fix, but I didn't fix it
| because, right now, <table> is my problem.
|
| cheers
|
| On 10/21/05, Federico G. Benavento <[EMAIL PROTECTED]> wrote:
| > >in my hasty reading of libhtml i was thinking that the tokenization is
| > >almost correct. the only change needed is to not translate \t to 8 spaces.
| > >on output,
| >
| > I don't think this is needed, since when there is a <pre> tag
| > libhtml sets the Item->flag to IFwrap, so this item should be treated
| > in a different way; this is what abaco should do.
| >
| > >for rendering, perhaps the solution is to add a flag indicating that the
| > >output is <pre>-formatted and just memcpy() the text in render.
| >
| > As I said, the flag is already there. In my opinion libhtml is ok;
| > charon uses it (libhtml was part of the "I" web browser, which is a
| > translation of charon from Limbo to C). What needs to be improved is abaco.
| >
| > >i was impressed with how little the tokenizing and rendering code was
| > >special cased, given how ad hoc html is. however, otoh, maybe <pre> should
| > >be handled in a special manner, with the tokenizer just converting character
| > >sets and entities and treating that result as one big Bytes*.
| >
| > cheers
| >
| > Federico G.Benavento