2009/4/17 Karsten Bräckelmann <guent...@rudersport.de>:
> On Fri, 2009-04-17 at 16:18 +0100, Matt wrote:
>> Karsten Bräckelmann wrote:
>
>> > Err, Matt, just had a very brief look at the code and the resulting
>> > metas, but -- how is that different? :)
>>
>> Blame that comment on lack of sleep - I read that as limiting the depth
>> of the tree and not being an n-ary tree.
>
> ;)
>
>> > Hmm, interesting, the results aren't linear. The 8-ary tree performs
>> > much better than the flat meta. However, the 4-ary tree with even fewer
>> > children per node (meta) doesn't improve this further.
>>
>> I haven't had a chance to look at the SA compile code yet to see how it
>> works.  I am going to run some more tests to see what impact the
>> different values have on the different parts of the sa-compile process.
>> I *think* that the actual compilation of the .c files was quicker
>> using 4 rather than 8 children.
>
> Which also seems more likely, from a naive point of view without me having
> any closer look at the code.
>
>> I also want to look at the caching code - as with smaller 4-ary trees
>> the chance of keeping the same blocks would increase - assuming there is
>> some intelligence in the groupings.
>
> By looking at the sub-rules' names I got the impression they are just
> random. But maybe they actually are somehow based on the rule's content?
> Never checked.  Justin?

yep, they're derived from a hash of the string.
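
To illustrate why that matters for caching, here is a rough Python sketch
of content-derived naming. It is not the actual sa-compile code: the
function name, the "__HASHED_" prefix, the choice of SHA-1 and the
8-character truncation are all assumptions made for the example. The only
point is that the same string always gets the same sub-rule name, so an
unchanged rule keeps its name between runs.

```python
import hashlib


def sub_rule_name(pattern: str, prefix: str = "__HASHED_") -> str:
    """Return a sub-rule name derived from the rule's content.

    Hypothetical helper: the prefix, SHA-1 and the 8-character
    truncation are illustrative choices, not what sa-compile does.
    """
    digest = hashlib.sha1(pattern.encode("utf-8")).hexdigest()
    return prefix + digest[:8].upper()


if __name__ == "__main__":
    # Identical content gives an identical name on every run.
    print(sub_rule_name("example pattern"))
    print(sub_rule_name("example pattern"))
    # Different content gives (almost certainly) a different name.
    print(sub_rule_name("another pattern"))
```

Names built this way look random, but they are stable for a given string.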

>> > Btw, there's a minor issue with the additional nodes not being non-
>> > scoring sub-rules and thus scoring a default 1.0. Just to point it out,
>> > I do realize this is a proof-of-concept hack. :)
>>
>> Missed that - ta!  Good job I am not running it in production!
>
> Exactly why I pointed it out. :)
>
>  guenther
>
>
> --
> char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
> main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
> (c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
>
>
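
On the scoring issue mentioned above: rules whose names begin with "__"
are treated as non-scoring sub-rules, so giving the intermediate nodes
such names avoids the default 1.0 score. Below is a rough Python sketch of
generating an n-ary tree of metas that way. It is not Matt's script; the
function name, the "__NODE" and "TOP_META" names, and combining children
with "||" are assumptions made for the example.

```python
def build_meta_tree(leaves, arity, top_name="TOP_META", node_prefix="__NODE"):
    """Return 'meta' lines forming an n-ary tree over the given rule names.

    Hypothetical helper. Intermediate nodes get names starting with "__"
    so they are non-scoring sub-rules; only the top meta carries a score.
    Combining children with "||" is an assumption for the example.
    """
    if arity < 2:
        raise ValueError("arity must be at least 2")
    if not leaves:
        raise ValueError("need at least one leaf rule")

    lines = []
    level = list(leaves)
    counter = 0
    # Group the current level into chunks of `arity` until one meta can
    # hold everything that is left.
    while len(level) > arity:
        next_level = []
        for i in range(0, len(level), arity):
            group = level[i:i + arity]
            name = f"{node_prefix}_{counter}"
            counter += 1
            lines.append(f"meta {name} ({' || '.join(group)})")
            next_level.append(name)
        level = next_level
    lines.append(f"meta {top_name} ({' || '.join(level)})")
    return lines


if __name__ == "__main__":
    rules = [f"__RULE_{n}" for n in range(10)]
    for line in build_meta_tree(rules, arity=4):
        print(line)
```

With 10 leaf rules and arity 4 this emits three "__NODE_*" metas plus the
single scoring top-level meta.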
