Hi,

So we've implemented both suggestions and it made a big difference.
You can see a Beta sample at

http://www.fastbuzz.com/search/index.jsp

We have around 7,000,000 items in the index.

What we did:
1. Instead of using msec granularity, we're using hour granularity for
date searches. This reduced search times from tens of seconds to 2-5
seconds. Not ideal, but ...
2. We cache the results. So if you're looking for items in the last 15
days and then do a "next" it'll save the filter using
CachingWrapperFilter and reuse it, resulting in much faster times the
second time.
This reduces the times from the above 2-5 seconds to 0.2 - 0.8 msecs.
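
To illustrate point 1, here is a minimal sketch of rounding a millisecond
timestamp down to the hour before turning it into an index term. It is not
our actual indexing code: the class name HourTerms, the yyyyMMddHH term
format and the use of UTC are assumptions made for the example, and it uses
plain java.time rather than any Lucene date helper.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of Lucene: builds hour-granularity date terms.
final class HourTerms {
    private static final long MILLIS_PER_HOUR = 3_600_000L;

    // yyyyMMddHH sorts lexicographically in chronological order,
    // so a term range over these strings is also a time range.
    private static final DateTimeFormatter HOUR_FORMAT =
            DateTimeFormatter.ofPattern("yyyyMMddHH").withZone(ZoneOffset.UTC);

    // Round the timestamp down to the start of its hour and format it.
    static String toHourTerm(long epochMillis) {
        Instant hour = Instant.ofEpochMilli(epochMillis).truncatedTo(ChronoUnit.HOURS);
        return HOUR_FORMAT.format(hour);
    }

    // Number of distinct hour terms a range filter has to walk (both ends inclusive).
    static long distinctHourTerms(long fromMillis, long toMillis) {
        long firstHour = Math.floorDiv(fromMillis, MILLIS_PER_HOUR);
        long lastHour = Math.floorDiv(toMillis, MILLIS_PER_HOUR);
        return lastHour - firstHour + 1;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long fifteenDaysAgo = now - 15L * 24 * MILLIS_PER_HOUR;
        System.out.println(toHourTerm(now));
        System.out.println(distinctHourTerms(fifteenDaysAgo, now));
    }
}
```

With hour terms, a 15-day range covers at most 361 distinct terms, no
matter how many items fall in that window. With msec terms the number of
distinct terms can approach the number of items.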

One of the challenges though is that since the index is updated in real
time, we can't cache for very long. We'll probably have to set up a
mechanism to "seed" the cache before the "new" index becomes available.
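
A rough sketch of what such a seeding mechanism might look like. This is
only an idea under stated assumptions, not something we have built: it is
plain Java with a map standing in for the cached filters, and every name
in it (SeededFilterCache, publish, the range keys) is hypothetical rather
than Lucene API. The point is that the filters for the popular ranges get
computed against the new index first, and only then does the new
generation become visible to searches.

```java
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical cache of filter results, one generation per index version.
final class SeededFilterCache<R> {

    // Cached results plus the (expensive) function that computes them
    // against one particular version of the index.
    private static final class Generation<T> {
        final Function<String, T> computeFilter;
        final Map<String, T> results = new ConcurrentHashMap<>();

        Generation(Function<String, T> computeFilter) {
            this.computeFilter = computeFilter;
        }
    }

    private volatile Generation<R> active;

    SeededFilterCache(Function<String, R> computeFilter) {
        this.active = new Generation<>(computeFilter);
    }

    // Return the cached result for a range key, computing it on first use.
    R get(String rangeKey) {
        Generation<R> generation = active;
        return generation.results.computeIfAbsent(rangeKey, generation.computeFilter);
    }

    // Pre-compute the hot keys against the new index, then switch over.
    // Searches keep hitting the old generation until the seeding is done.
    void publish(Function<String, R> newComputeFilter, Collection<String> hotKeys) {
        Generation<R> next = new Generation<>(newComputeFilter);
        for (String key : hotKeys) {
            next.results.computeIfAbsent(key, newComputeFilter);
        }
        active = next;
    }
}
```

The trade-off is that the new index shows up a little later, by however
long it takes to compute the seeded filters, in exchange for the first
search after the switch not paying the 2-5 seconds.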

Regards,

Dror



On Sat, Nov 15, 2003 at 11:03:13AM -0800, Dror Matalon wrote:
> After posting the original email, I started wondering if that's the
> issue, the fact that we store timestamp up to the millisecond rather
> than a more reasonable granularity. Dates are too high a granularity for
> us, but minutes, and possibly hours should work.
> 
> I'll report once we've tested some more.
> 
> Regards,
> 
> Dror
> 
> On Sat, Nov 15, 2003 at 12:25:47PM -0500, Erik Hatcher wrote:
> > On Saturday, November 15, 2003, at 11:38  AM, Karsten Konrad wrote:
> > >If the number of different date terms causes this effect, why not 
> > >"round"
> > >the date to the nearest or next midnight while indexing. Thus, 
> > >filtering
> > >for the last  15 days would require walking over 15-17 different date 
> > >terms.
> > >If you don't do this, the number of different terms will be the same as
> > >the number of documents you indexed, explaining the slowing down when 
> > >you
> > >have more results.
> > 
> > I wholeheartedly concur.  And in fact I don't use the Keyword(String, 
> > Date) thing at all if I just need to represent a date.  I use YYYYMMDD 
> > as a String instead.  It's just too fiddly to deal with dates using the 
> > built-in handling of it.
> > 
> >     Erik
> > 
> > 
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
> > 
> 
> -- 
> Dror Matalon
> Zapatec Inc 
> 1700 MLK Way
> Berkeley, CA 94709
> http://www.fastbuzz.com
> http://www.zapatec.com
> 

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.fastbuzz.com
http://www.zapatec.com
