On Thu, Oct 14, 2010 at 7:23 PM, William Leslie <
[email protected]> wrote:

> 2010/10/15 Ben Kloosterman <[email protected]>:
> > The main cons I see are, besides the tree index/reference cost, that each
> > substring would need a field (which may be aligned to 4-8 bytes) or char
> > to indicate the encoding, and the higher initial/final parse overhead.
>
> I think shap imagines that there are different types for leaf nodes
> with different encodings, so the encoding is determined by the type/gc
> tag. So a string with one encoding type would appear in memory as
>
> | utf-8 node tag + gc header | encoded data |
>

Close, but not quite. I would say that the *strand* encoding is as you
describe, and a *string* is defined as a sequence of strands.
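
To make the distinction concrete, here is a rough sketch in Python rather
than BitC. All of the names (Strand, Utf8Strand, Utf16Strand, String) are
made up for illustration, and the Python class of each strand merely stands
in for the type/gc tag; this is not a claim about the actual layout.

```python
class Strand:
    """Leaf node: one run of data in a single encoding.

    The concrete subclass stands in for the type/gc tag, so no
    per-strand encoding field is stored alongside the data.
    """
    codec = None  # set by each subclass (hypothetical tag stand-in)

    def __init__(self, data: bytes):
        self.data = data

    def text(self) -> str:
        # Decode this strand using the encoding implied by its type.
        return self.data.decode(self.codec)


class Utf8Strand(Strand):
    codec = "utf-8"


class Utf16Strand(Strand):
    codec = "utf-16-le"


class String:
    """A string is a sequence of strands, possibly in mixed encodings."""

    def __init__(self, strands):
        self.strands = list(strands)

    def text(self) -> str:
        # Concatenate the decoded strands in order.
        return "".join(s.text() for s in self.strands)


s = String([
    Utf8Strand("héllo ".encode("utf-8")),
    Utf16Strand("wörld".encode("utf-16-le")),
])
print(s.text())
```

The point of the sketch is only that the encoding lives in the strand's
type, while the string itself is just the ordered collection of strands.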

shap
_______________________________________________
bitc-dev mailing list
[email protected]
http://www.coyotos.org/mailman/listinfo/bitc-dev
