For the sake of date ranges, I'm storing dates as YYYYMMDD in my e-mail indexing application.
My users typically want to limit their queries to date ranges that include today, and the application indexes in real time. I gather I should prefer RangeQuery to ConstantScoreQuery+RangeFilter, because it is faster not to use a Filter. However, I sometimes have to combine my RangeQuery with a PrefixQuery, and then TooManyClauses exceptions arise when I exceed BooleanQuery.getMaxClauseCount(), which I've left at its default of 1024.

In a year of 365 days with e-mail messages arriving every day, can I assume that an inclusive date range of 20050713-20060713 in a RangeQuery is going to contribute 365 clauses to a BooleanQuery? Can I assume that 5 years would mean 5 x 365 = 1825 clauses? If so, how can I figure out how expensive it is, in terms of memory, to raise the maximum clause count to deal with 5-year ranges? i.e.

    // Increase the maximum clause count to cope with date ranges
    // up to 5 years - my worst case
    BooleanQuery.setMaxClauseCount(BooleanQuery.getMaxClauseCount() + 1825);

Do I need to consider whether this would significantly degrade performance too?

An alternative would be to assume that my users are mostly going to ask for e-mail arriving within the last day, two days, week, fortnight, month, quarter, year or 5 years, and to pre-cache filters for these typical ranges every time the clock rolls over, using a CachingWrapperFilter around a RangeFilter. I would then combine the cached filter with a TermQuery on today's date in a BooleanQuery. e.g.

    // Get the cached filter for a predetermined (i.e. already cached) date range,
    // which doesn't include today, because we are indexing all the time.
    // These ranges were pre-cached at midnight.
    CachingWrapperFilter wrapper = /* ... */;

    BooleanQuery dateRangeBooleanQuery = new BooleanQuery();
    dateRangeBooleanQuery.add(
        new ConstantScoreQuery(wrapper),  // wrapper already wraps the RangeFilter
        BooleanClause.Occur.SHOULD);
    dateRangeBooleanQuery.add(
        new TermQuery(new Term("date", "20060714")),  // i.e. today; "date" stands in for my date field name
        BooleanClause.Occur.SHOULD);

    BooleanQuery mainQuery = new BooleanQuery();
    mainQuery.add(dateRangeBooleanQuery, BooleanClause.Occur.MUST);

How can I figure out how expensive it is, in terms of memory, to retain CachingWrapperFilters for a set of date ranges?
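
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in plain Java (no Lucene calls). It rests on two assumptions of mine that I have not verified against the source: that a RangeQuery expands to one clause per distinct YYYYMMDD term in the range (so one per day if mail arrives every day), and that each cached filter holds one bit per document for a given IndexReader. The class name, method names and the 10,000,000 document count are made up for illustration.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class DateRangeEstimate {
    // Dates are indexed as YYYYMMDD strings, which BASIC_ISO_DATE parses.
    private static final DateTimeFormatter YYYYMMDD = DateTimeFormatter.BASIC_ISO_DATE;

    /** Distinct day terms in an inclusive range, assuming mail arrives every day. */
    static long termsInRange(String from, String to) {
        LocalDate start = LocalDate.parse(from, YYYYMMDD);
        LocalDate end = LocalDate.parse(to, YYYYMMDD);
        // +1 because both end points are included in the range.
        return ChronoUnit.DAYS.between(start, end) + 1;
    }

    /**
     * Bytes needed for cachedRanges bit sets of one bit per document,
     * rounded up to whole 64-bit words (assumed storage layout).
     */
    static long cachedFilterBytes(int maxDoc, int cachedRanges) {
        long wordsPerBitSet = (maxDoc + 63L) / 64L;
        return wordsPerBitSet * 8L * cachedRanges;
    }

    public static void main(String[] args) {
        System.out.println(termsInRange("20050713", "20060713"));
        System.out.println(termsInRange("20010713", "20060713"));
        // Hypothetical index of 10,000,000 documents and 8 cached ranges.
        System.out.println(cachedFilterBytes(10_000_000, 8));
    }
}
```

Under those assumptions the one-year inclusive range comes to 366 terms rather than 365, and the five-year range 20010713-20060713 comes to 1827 (it spans 29 February 2004), so adding 1825 to the limit would be slightly too little once the PrefixQuery clauses are counted as well. The eight cached ranges would cost roughly one byte per document in total.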