You know, this was my suspicion, Kirby.
Thanks for the heads-up... automaton rocks.
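
For anyone reading this in the archives, here is a minimal sketch of the
warmup effect Kirby describes. It is not the Nutch test: the class
WarmupBench, the rule pattern and the example.org URLs are all made up for
illustration, and it uses plain java.util.regex rather than either plugin.
It times the same batch sizes twice, once on a cold JVM and once after a
warmup loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class WarmupBench {
    // Hypothetical filter rule, NOT taken from the Nutch test resources.
    private static final Pattern RULE =
        Pattern.compile("^https?://[a-z0-9.]+/.*\\.html$");

    // Accumulates results so the JIT cannot discard the timed work.
    static int sink = 0;

    // Builds n synthetic URLs; even-numbered ones end in .html and match RULE.
    static List<String> makeUrls(int n) {
        List<String> urls = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            urls.add("http://example.org/page" + i + (i % 2 == 0 ? ".html" : ".gif"));
        }
        return urls;
    }

    // Counts how many URLs the rule accepts.
    static int countAccepted(List<String> urls) {
        int accepted = 0;
        for (String url : urls) {
            if (RULE.matcher(url).matches()) {
                accepted++;
            }
        }
        return accepted;
    }

    // Times a single pass over the URLs, in milliseconds.
    static double timeMillis(List<String> urls) {
        long start = System.nanoTime();
        sink += countAccepted(urls);
        return (System.nanoTime() - start) / 1e6;
    }

    public static void main(String[] args) {
        int[] sizes = {50, 100, 200, 400, 800};

        // Cold run: the first batch also pays for class loading and interpretation.
        for (int n : sizes) {
            System.out.printf("cold %4d urls: %.3f ms%n", n, timeMillis(makeUrls(n)));
        }

        // Warmup: repeat the work so the JIT has a chance to compile the hot path.
        List<String> warmup = makeUrls(800);
        for (int i = 0; i < 200; i++) {
            sink += countAccepted(warmup);
        }

        // Warm run: same batches, same order, after warmup.
        for (int n : sizes) {
            System.out.printf("warm %4d urls: %.3f ms%n", n, timeMillis(makeUrls(n)));
        }
        System.out.println("sink=" + sink);
    }
}
```

I would expect the cold 50-URL line to be the most expensive per URL on a
typical HotSpot JVM, but the exact numbers vary by machine and JVM version,
so treat the output as indicative only.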
Lewis


On Thu, May 23, 2013 at 5:06 PM, Kirby Bohling <[email protected]> wrote:

> Standard micro-benchmark issues with Java: run the 50-URL case last and it'll
> run faster. The first case pays for JVM warmup and JIT compilation, yadda,
> yadda, yadda.
>
>
> On Thu, May 23, 2013 at 1:57 PM, Lewis John Mcgibbney <
> [email protected]> wrote:
>
> > Hi All,
> > A really nice aspect of the regex (urlfilter-automaton and urlfilter-regex)
> > plugin implementations in Nutch is that there is a small but very useful
> > RegexURLFilterBaseTest [0] which compares benchmarks for simple regex
> > parsing.
> > The results we get are as follows:
> >
> > urls    automaton    regex
> > 50      343ms        210ms
> > 100     48ms         187ms
> > 200     65ms         363ms
> > 400     100ms        692ms
> > 800     165ms        1385ms
> >
> > The problem I have here is understanding why the first (50) bench appears
> > to be more expensive for both implementations?
> > Additionally, why does this same bench cost much more for automaton?
> >
> > Anyone have a clue?
> > Thanks
> > Lewis
> >
> > [0] http://svn.apache.org/viewvc/nutch/branches/2.x/src/plugin/lib-regex-filter/src/test/org/apache/nutch/urlfilter/api/RegexURLFilterBaseTest.java?view=markup
> >
> > --
> > *Lewis*
> >
>



-- 
*Lewis*
