Hi Steven,
Are the indexes you get the same size? My guess is that the code isn't
really equivalent. Ferret should be faster than Lucene. Try this, which
indexes the file contents without storing them:

    include Ferret::Document

    @index = Index::Index.new(:path => inIndexPath)

    def createIndex(inRepositoryPath)
      Find.find(inRepositoryPath) do |path|
        if FileTest.file?(path)
          File.open(path) do |file|
            doc = Document.new()
            # Store the path so it can be read back; index it untokenized.
            doc << Field.new(:file, path,
                             Field::Store::YES, Field::Index::UNTOKENIZED)
            # Tokenize the content for searching, but don't store a copy.
            doc << Field.new(:content, file.readlines,
                             Field::Store::NO, Field::Index::TOKENIZED)
            @index << doc
          end
        end
      end
    end
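
To check the size question, a rough sketch like the one below might help.
It uses only Ruby's standard library and assumes each index is an ordinary
directory of files on disk. index_size is a made-up helper name, not part
of Ferret or Lucene.

```ruby
require 'find'

# Hypothetical helper (not a Ferret API): sums the size in bytes of
# every regular file found under the given directory.
def index_size(dir)
  total = 0
  Find.find(dir) do |path|
    total += File.size(path) if File.file?(path)
  end
  total
end
```

Calling index_size(inIndexPath) on the Ferret index, and the same on the
directory Lucene wrote to, should show whether one of them is storing a
lot more data than the other.
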
Let me know if this helps.
Cheers,
Dave
On 5/3/06, steven <[EMAIL PROTECTED]> wrote:
> Hi all,
>
> Have been looking at lucene and ferret.
>
> Have noticed that ferret takes ~463 seconds to index 200Mb of docs,
> whereas lucene takes ~60 seconds.
>
> I'm using the standard "get you started" sort of code provided by both
> libraries.
>
> My ruby code is: (abridged)
>
> @index = Index::Index.new(:path => inIndexPath)
>
> def createIndex(inRepositoryPath)
>   Find.find(inRepositoryPath) do |path|
>     if FileTest.file?(path)
>       File.open(path) do |file|
>         @index.add_document(:file => path,
>                             :content => file.readlines)
>       end
>
> My Java code is basically a direct port.
>
> Has anyone else noticed this difference in speed? Am I doing something
> wrong? Is this speed normal?
>
> Any advice gratefully received.
> Thanks,
> Steven
>
> --
> Posted via http://www.ruby-forum.com/.
> _______________________________________________
> Ferret-talk mailing list
> [email protected]
> http://rubyforge.org/mailman/listinfo/ferret-talk
>