I also found results from a test with 1M docs:

258033ms. Buiding index of 1000000 docs
29703ms. Verifying data integrity with 100 docs
1821ms. Preparing 10000 random queries
2867284ms. Regex queries
18772ms. Regexp queries (new style)
29257ms. Wildcard queries
4920ms. Boolean queries
Totals: [1749708, 1744494, 1749708, 1744494]

On Fri, Nov 30, 2012 at 12:13 PM, Roman Chyla <roman.ch...@gmail.com> wrote:
> Hi,
>
> Some time ago we measured the performance of the
> regexp queries and found that they are VERY FAST! We can't be grateful
> enough, it saves many days/lives ;)
>
> This was an old Lenovo X61 laptop, Core 2 Duo, 1.7GHz, no special memory
> allocation, SSD disk:
>
>
> 51459ms. Buiding index of 100000 docs
> 181175ms. Verifying data integrity with 100 docs
> 315ms. Preparing 1000 random queries
>
> 61167ms. Regex queries - Stopping execution, # queries finished: 150
> 2795ms. Regexp queries (new style)
> 3936ms. Wildcard queries
> 777ms. Boolean queries
> 893ms. Boolean queries (truncated)
> 3596ms. Span queries
> 91751ms. Span queries (truncated)Stopping execution, # queries finished: 100
> 3937ms. Payload queries
> 93726ms. Payload queries (truncated)Stopping execution, # queries finished:
> 100
> Totals: [4865, 18284, 18286, 18284, 18405, 287934, 44375, 18284, 2489]
>
> Examples of queries:
> --------------------
> regex:bgiyodjrr, k\w* michael\w* jay\w* .*
> regexp:/bgiyodjrr, k\w* michael\w* jay\w* .*/
> wildcard:bgiyodjrr, k*1 michael*2 jay*3 *
> +n0:bgiyodjrr +n1:k +n2:michael +n3:jay
> +n0:bgiyodjrr +n1:k* +n2:m* +n3:j*
> spanNear([vectrfield:bgiyodjrr, vectrfield:k, vectrfield:michael,
> vectrfield:jay], 0, true)
> spanNear([vectrfield:bgiyodjrr, SpanMultiTermQueryWrapper(vectrfield:k*),
> SpanMultiTermQueryWrapper(vectrfield:m*),
> SpanMultiTermQueryWrapper(vectrfield:j*)], 0, true)
> spanPayCheck(spanNear([vectrfield:bgiyodjrr, vectrfield:k,
> vectrfield:michael, vectrfield:jay], 1, true), payloadRef:
> b[0]=48;b[0]=49;b[0]=50;b[0]=51;)
> spanPayCheck(spanNear([vectrfield:bgiyodjrr,
> SpanMultiTermQueryWrapper(vectrfield:k*),
> SpanMultiTermQueryWrapper(vectrfield:m*),
> SpanMultiTermQueryWrapper(vectrfield:j*)], 1, true), payloadRef:
> b[0]=48;b[0]=49;b[0]=50;b[0]=51;)
>
>
> The code is here:
>
> https://github.com/romanchyla/montysolr/blob/solr-trunk/contrib/adsabs/src/test/org/adsabs/lucene/BenchmarkAuthorSearch.java
>
> The benchmark should probably not be called a 'benchmark'. Do you think it
> may be too simplistic? Can we expect some bad surprises somewhere?
>
> Thanks,
>
> roman
>
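
To make the two runs easier to compare, here is a rough sketch (not part of BenchmarkAuthorSearch.java) that turns the logged totals into an average cost per query. It rests on two assumptions that the logs do not state outright: each time belongs to the label printed after it on the same line, and any run without a "Stopping execution" note finished all of its prepared queries (1000 or 10000). The names `RUN_100K`, `RUN_1M`, `ms_per_query` and `summarize` are made up for this illustration.

```python
# Timings copied from the two logs in this thread as (elapsed ms, queries finished).
# ASSUMPTION: each time belongs to the label that follows it on the same log line.
# ASSUMPTION: runs without "Stopping execution" finished every prepared query.
RUN_100K = {
    "regex": (61167, 150),  # stopped early; 150 queries finished per the log
    "regexp (new style)": (2795, 1000),
    "wildcard": (3936, 1000),
    "boolean": (777, 1000),
}
RUN_1M = {
    "regex": (2867284, 10000),
    "regexp (new style)": (18772, 10000),
    "wildcard": (29257, 10000),
    "boolean": (4920, 10000),
}


def ms_per_query(elapsed_ms, finished):
    """Average wall-clock milliseconds spent per finished query."""
    return elapsed_ms / finished


def summarize(run):
    """Map each query type to its average ms per query, rounded to 2 places."""
    return {name: round(ms_per_query(ms, n), 2) for name, (ms, n) in run.items()}


if __name__ == "__main__":
    for title, run in (("100K docs", RUN_100K), ("1M docs", RUN_1M)):
        print(title)
        for name, avg in summarize(run).items():
            print(f"  {name}: {avg} ms/query")
```

These are averages over random queries on a single machine, so they only give a rough sense of how each query type scales from 100K to 1M docs.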