Re: Index test on bigger dataset

Michael Carey Tue, 17 Sep 2013 11:29:56 -0700

Interesting! I haven't followed enough yet, but now you have my interest. :-)

Do you have an explanation for why your index wins even in the case of 100%?

(Not intuitive - maybe I am missing some details that would fill my intuition gap.)


On 9/17/13 10:59 AM, Steven Jacobs wrote:

I ran a test on one of Preston's real-world data sets (Weather collection) that had around 40,000 files. I am attaching the results. There are three graphs.
The first shows the time for returning the entire XML for all 40000 files. My index algorithm has huge gains over collection, no matter how much of the data is returned.
The second shows how the two algorithms perform as the number of files increases. Both linearly increase, but collection has a much higher slope.
The last is just a one-point comparison for returning paths that exist in only 100 out of the 40000 files. Once again, index has a huge advantage.
Steven
