On Fri, 2009-04-17 at 16:18 +0100, Matt wrote:
> Karsten Bräckelmann wrote:
> > Err, Matt, just had a very brief look at the code and the resulting
> > metas, but -- how is that different? :)
>
> Blame that comment on lack of sleep - I read that as limiting the depth
> of the tree and not being an n-ary tree. ;)
>
> > Hmm, interesting, the results aren't linear. The 8-ary tree performs
> > much better than the flat meta. However, the 4-ary tree with even
> > fewer children per node (meta) doesn't improve this further.
>
> I haven't had a chance to look at the SA compile code yet to see how it
> works. I am going to run some more tests to see what impact the
> different values have on the different parts of the sa-compile process.
> I *think* that the actual compilation of the .c files was quicker
> using 4 rather than 8 children.

Which also seems more likely, from a naive point of view, without me
having had any closer look at the code.

> I also want to look at the caching code - as with smaller 4-ary trees
> the chance of keeping the same blocks would increase - assuming there is
> some intelligence in the groupings.

By looking at the sub-rules' names I got the impression they are just
random. But maybe they actually are somehow based on the rule's content?
Never checked. Justin?

> > Btw, there's a minor issue with the additional nodes not being
> > non-scoring sub-rules and thus scoring a default 1.0. Just to point
> > it out, I do realize this is a proof-of-concept hack. :)
>
> Missed that - ta! Good job I am not running it in production!

Exactly why I pointed it out. :)

  guenther


-- 
char *t="\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}