Re: [lucy-user] Lucy Benchmarking

Nick Wellnhofer Wed, 01 Feb 2017 04:43:35 -0800

On 01/02/2017 01:44, Kasi Lakshman Karthi Anbumony wrote:

(1)  Is Lucy multithreaded or single threaded?


Single-threaded.

(2) Are "C" runtime and bindings stable?


Yes.

(2) Is there preexisting benchmark code written in "C" to measure Lucy 
performance?

No.

(3) I am seeing one under devel/benchmarks/indexers/LuceneIndexer.java. But 
this one is written in Java and looks like benchmarking Lucene not Lucy. Am I 
right in my observation?


The corresponding Perl benchmark script for Lucy is lucy_indexer.plx:


https://git1-us-west.apache.org/repos/asf?p=lucy.git;a=tree;f=devel/benchmarks/indexers;h=77626c37285602941376c5e5950a20e50683da40;hb=HEAD

(4) I was thinking of modifying the lucy/c/sample applications to serve as a 
benchmarking application. Is this a good strategy?
By the way, is there a good way to build the sample files? I had to modify the 
Makefile in the lucy/c/ directory to build them, and I am not sure if this is 
the correct way.

You can find some guidance on how to compile Lucy applications in the comment 
at the top of getting_started.c:



https://git1-us-west.apache.org/repos/asf?p=lucy.git;a=blob;f=c/sample/getting_started.c;h=6d6193d772f2ceaac86c67cc49169878b4d4d2f6;hb=HEAD

Basically, you have to run the Clownfish compiler "cfc" to generate header 
files, then you can compile your code and link against libclownfish and 
liblucy.

Benchmark results for the indexer will largely depend on the particular 
Analyzer chain and the total size of your index. The default EasyAnalyzer 
consists of


- StandardTokenizer
- Unicode Normalizer
- SnowballStemmer

StandardTokenizer is pretty fast, but Normalizer and Stemmer are 
CPU-intensive. Last time I checked, they account for about two-thirds of the 
processing time for small indices.
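
As a rough illustration of how such per-stage shares could be measured, here 
is a minimal C sketch of a timing harness. It does not call the Lucy API: the 
two stage functions (count_tokens, lowercase_ascii) and the names stage, 
time_stages and stage_share are hypothetical stand-ins that you would replace 
with calls into your own analyzer chain. It uses clock(), which reports CPU 
time and should be adequate for a single-threaded library.

```c
/* Sketch only: the stage functions are hypothetical stand-ins,
 * not Lucy API calls. */
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <time.h>

typedef size_t (*stage_fn)(char *buf, size_t len);

typedef struct {
    const char *name;     /* label used when reporting */
    stage_fn    run;      /* work to be timed */
    double      seconds;  /* CPU time measured by time_stages */
} stage;

/* Hypothetical tokenizer stand-in: counts whitespace-separated tokens. */
static size_t count_tokens(char *buf, size_t len) {
    size_t n = 0;
    int in_token = 0;
    for (size_t i = 0; i < len; i++) {
        if (isspace((unsigned char)buf[i])) {
            in_token = 0;
        } else if (!in_token) {
            in_token = 1;
            n++;
        }
    }
    return n;
}

/* Hypothetical normalizer stand-in: lowercases ASCII in place. */
static size_t lowercase_ascii(char *buf, size_t len) {
    for (size_t i = 0; i < len; i++) {
        buf[i] = (char)tolower((unsigned char)buf[i]);
    }
    return len;
}

/* Runs every stage `iterations` times over buf, stores the CPU time
 * spent in each stage, and returns the total in seconds. */
static double time_stages(stage *stages, size_t num_stages,
                          char *buf, size_t len, int iterations) {
    assert(iterations > 0);
    double total = 0.0;
    for (size_t s = 0; s < num_stages; s++) {
        volatile size_t sink = 0;  /* keeps the calls from being optimized away */
        clock_t start = clock();
        for (int i = 0; i < iterations; i++) {
            sink += stages[s].run(buf, len);
        }
        stages[s].seconds = (double)(clock() - start) / CLOCKS_PER_SEC;
        total += stages[s].seconds;
    }
    return total;
}

/* Fraction of the total time spent in one stage (0 if total is 0). */
static double stage_share(const stage *s, double total) {
    return total > 0.0 ? s->seconds / total : 0.0;
}
```

A caller would fill an array of stage entries, pass it to time_stages 
together with a representative document buffer, and then print stage_share 
for each entry. Timing each stage in isolation ignores interactions between 
stages, so treat the shares as an approximation.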


A better benchmarking framework would be a much-needed contribution.

Nick
