On Wed, 13 Feb 2008, Alexandre Fiori wrote:

i would like to know if it's possible to generate an index with pylucene-jcc
in such a way that it's compatible with pylucene-gcj.
that's because i figured out that pylucene-jcc is much faster creating the
index, but slower searching.

with the same data i generated two indexes, one using pylucene-jcc and
another using pylucene-gcj.
jcc can add 100.000 documents in ~56s, while gcj does the same in ~4m15s.
but searching is different. jcc takes ~0.718s while gcj takes only
~0.069s for the same query.

also, i've found that jcc version of pylucene can read/search indexes
created with gcj version, but not the opposite.
i would like to know if it's possible to generate indexes faster with gcj, or,
if it's possible to generate them using the jcc version and read/search with
the gcj version.

If you're using the same version of the underlying Java Lucene software to build both jcc- and gcj-PyLucene, I expect their indexes to be fully compatible. It is my understanding that newer Lucene versions are capable of reading older Lucene indexes but not the other way around.

The timing differences you're seeing are most likely due to the fact that a long-running task, such as index creation, gives the Java VM (embedded in jcc-PyLucene) a better chance to compile the bytecode. gcj-PyLucene is quicker at first but is eventually overtaken by jcc-PyLucene once the embedded JVM has had a chance to compile the bytecode. A single search query is too short for that to happen, which would explain why gcj-PyLucene looks faster there. If you were to run a sizeable bunch of search queries in both jcc-PyLucene and gcj-PyLucene, I'm not sure which one would come out ahead. I suspect that jcc-PyLucene might, actually.
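
To check this, one could time the first query separately from a larger batch run after some untimed warm-up calls. The following is only a rough sketch, not PyLucene code: time_queries and run_query are hypothetical names, and run_query stands for whatever function executes one search against your index in either build.

```python
import time


def time_queries(run_query, queries, warmup=0):
    """Return (first_call_seconds, mean_seconds_per_query_after_warmup).

    run_query is a hypothetical callable that executes one search;
    it is an assumption of this sketch, not part of any PyLucene API.
    """
    if not queries:
        raise ValueError("queries must not be empty")

    # Time the very first call on its own: in a JVM-backed build this
    # is where interpretation and JIT compilation costs would show up.
    start = time.perf_counter()
    run_query(queries[0])
    first = time.perf_counter() - start

    # Untimed warm-up calls, cycling through the query list.
    for i in range(warmup):
        run_query(queries[i % len(queries)])

    # Timed batch: average cost per query once the runtime is warm.
    start = time.perf_counter()
    for query in queries:
        run_query(query)
    mean = (time.perf_counter() - start) / len(queries)

    return first, mean


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without any index.
    def fake_search(query):
        return sum(ord(ch) for ch in query * 1000)

    first, mean = time_queries(fake_search, ["foo", "bar", "baz"], warmup=100)
    print("first call: %.6fs, warm mean: %.6fs" % (first, mean))
```

Running the same script under both builds, with run_query replaced by a real search, would show whether the warm mean for jcc-PyLucene drops below that of gcj-PyLucene.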

Andi..
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
