Dear All
How much disk space is actually needed to optimize the index? The
explanation given in the documentation seems to be very different from the
practical situation.
I have an index file of size 18.6 GB and I am going to optimize it. I keep
this index on a mobile hard disk with … to 3 minutes when searching inside
this unoptimized index.
How about the memory consumption? Will it take a greater amount of memory
if using the optimized one?
Thanks a lot
Regards,
Maureen
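For a rough sense of the budget: the IndexWriter documentation of this era warns that optimize() can transiently need about twice the index size in free space (and up to three times if old readers are still open or the compound file format is in use), because the old and new segment copies coexist until the merge commits. A minimal pre-flight check along those lines, using only the standard library; the index path and size below are placeholders, not values confirmed by the thread:

```java
import java.io.File;

public class OptimizeSpaceCheck {
    // optimize() rewrites every segment into one, so budget roughly 2x the
    // index size in free space (3x to be safe with open readers or the
    // compound file format).
    static boolean enoughSpaceToOptimize(File indexDir, long indexBytes) {
        return indexDir.getUsableSpace() >= 2 * indexBytes; // Java 6+
    }

    public static void main(String[] args) {
        long indexBytes = 18600000000L;       // the 18.6 GB index discussed here
        File indexDir = new File("D:/index"); // hypothetical index location
        System.out.println("safe to optimize: "
                + enoughSpaceToOptimize(indexDir, indexBytes));
    }
}
```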
Michael McCandless [EMAIL PROTECTED] wrote:
maureen tanuwidjaja wrote
have the search results in 30 to 3 minutes, which is actually quite
unacceptable for the search engine I am building... Is there any recommendation
on how searching could be made faster?
Thanks,
Maureen
Michael McCandless [EMAIL PROTECTED] wrote: maureen tanuwidjaja wrote:
One
Oops, sorry, mistyping...
I have the search results in 30 SECONDS to 3 minutes, which is actually
quite unacceptable for the search engine I am building... Is there any
recommendation on how searching could be made faster?
maureen tanuwidjaja [EMAIL PROTECTED] wrote: Hi mike
The only
Hi Mike,
How do I disable/turn off norms? Is it done while indexing?
Thanks,
Maureen
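Yes, it is done at indexing time. A sketch against the Lucene API of this era, assuming lucene-core on the classpath (the field names here are made up); note that Field.Index.NO_NORMS also skips analysis, so for tokenized full-text fields the setOmitNorms(true) route is the one to use:

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;

public class NoNormsSketch {
    public static void main(String[] args) {
        Document doc = new Document();

        // Route 1: Field.Index.NO_NORMS indexes the value as a single
        // un-analyzed token AND disables norms -- suited to keyword fields.
        doc.add(new Field("id", "doc-42", Field.Store.YES, Field.Index.NO_NORMS));

        // Route 2: a normal tokenized field with norms switched off,
        // which is what you want for analyzed full-text fields.
        Field body = new Field("body", "some analyzed text",
                               Field.Store.NO, Field.Index.TOKENIZED);
        body.setOmitNorms(true);
        doc.add(body);
    }
}
```

Dropping norms saves one byte per field per document in memory at search time, which adds up at this index size.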
Hi all,
How do I disable the Lucene norm factor?
Thanks,
Maureen
OK Mike, I'll try it and see whether it works :) Then I will proceed to
optimize the index.
Well, then I guess it's fine to use the default value for maxMergeDocs, which
is Integer.MAX_VALUE?
Thanks a lot
Regards,
Maureen
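For reference, the defaults in question can also be set explicitly on the writer. A sketch (Lucene 2.x-era API; the path and analyzer are placeholders, and the default values in the comments are what the javadocs of this era state):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class WriterSettingsSketch {
    public static void main(String[] args) throws Exception {
        IndexWriter writer =
                new IndexWriter("D:/index", new StandardAnalyzer(), true);

        writer.setMergeFactor(10);                 // default: 10
        writer.setMaxMergeDocs(Integer.MAX_VALUE); // default: no limit
        writer.setMaxBufferedDocs(10);             // default in this era: 10

        // ... addDocument() calls go here ...
        writer.close();
    }
}
```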
Michael McCandless [EMAIL PROTECTED] wrote:
maureen
.
Thanks,
Xiaocheng
maureen tanuwidjaja wrote: Yeah... I think I will store it in the database so
that later it can be used in scoring/ranking for retrieval... :)
Another thing I would like to see is whether the precision or recall will be
much affected by this...
Regards,
Maureen
PROTECTED] wrote:
maureen tanuwidjaja wrote:
I had an existing index file of size 20.6 GB... I haven't done any
optimization on this index yet. Now I have an HDD of 100 GB, but apparently
when I create a program to optimize (which simply calls writer.optimize()
on this index file), it gives
Hi,
I had an existing index file of size 20.6 GB... I haven't done any
optimization on this index yet. Now I have an HDD of 100 GB, but apparently when I
create a program to optimize (which simply calls writer.optimize() on this
index file), it gives the error that there is not enough space.
I would also like to know whether searching in the index file eats a lot of
memory... I always run out of memory when searching, i.e. it gives the
exception "Java heap space" (although I have put -Xmx768 in the VM arguments)... Is
there any way to solve it?
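One detail worth checking in the message above: -Xmx takes a unit suffix, and without one the value is read in bytes, so "-Xmx768" asks for a 768-byte heap and most JVMs will refuse to start with it. What was presumably intended (the class name and classpath here are placeholders):

```shell
# 768 MB maximum heap -- note the trailing 'm'
java -Xmx768m -cp lucene-core.jar:. SearchApp
```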
Hi,
May I also ask whether there is a way to use writer.optimize() without
indexing the files from the beginning?
It took me about 17 hrs to finish building the unoptimized index (finished when
I call IndexWriter.close()). I just wonder whether this existing index can be
optimized...
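It can: optimize() does not require re-indexing. A sketch (Lucene 2.x-era API; the path is a placeholder): opening the IndexWriter with create=false attaches to the existing index, and optimize() then merges its segments down to one without touching the source XML files.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;

public class OptimizeExistingIndex {
    public static void main(String[] args) throws Exception {
        // create=false: open the EXISTING index instead of wiping it,
        // so the 17-hour build is reused as-is.
        IndexWriter writer =
                new IndexWriter("D:/index", new StandardAnalyzer(), false);
        writer.optimize(); // merge all segments into one
        writer.close();
    }
}
```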
Dear All,
I was indexing 660,000 XML documents. The unoptimized index file was
successfully built in about 17 hrs... This index file resides on my D drive,
which has 38 GB of free space. This space is insufficient for optimizing the
index file -- I read what the Lucene documentation said about its
: maureen tanuwidjaja [mailto:[EMAIL PROTECTED]
Sent: 01 February 2007 14:22
To: java-user@lucene.apache.org
Subject: Building lucene index using 100 Gb Mobile HardDisk
Dear All,
I was indexing 660,000 XML documents. The unoptimized index file was
successfully built in about 17 hrs... This index
factors and other IndexWriter settings it could
just be doing a really big merge.
: Date: Sat, 27 Jan 2007 09:40:47 -0800 (PST)
: From: maureen tanuwidjaja
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: My program stops indexing after 1th documents
triggering a thread dump to see what it was doing at that
point?
depending on your merge factors and other IndexWriter settings it could
just be doing a really big merge.
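For reference, the thread dump suggested above can be triggered without stopping the indexing JVM. Assuming a Sun JDK of this era (the pid is illustrative):

```shell
# Unix: send SIGQUIT; the dump appears on the process's own stdout
kill -QUIT <pid-of-indexing-jvm>

# Windows: press Ctrl+Break in the console window running the indexer

# JDK 6+: jstack prints the dump to your terminal instead
jstack <pid-of-indexing-jvm>
```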
OK, this is the printout of the stack trace while failing to index the
190,000th document
Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491886.xml
Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491887.xml
Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491891.xml
I think so... BTW, may I ask your opinion: will it be useful to optimize, let's
say, every 50,000-60,000 documents? I have a total of 660,000 docs...
Erik Hatcher [EMAIL PROTECTED] wrote:
On Jan 28, 2007, at 9:15 PM, maureen tanuwidjaja wrote:
OK, this is the printout of the stack trace while failing
Hi all,
Is there any limitation on the number of files that Lucene can handle?
I indexed a total of 3 XML documents; however, it stops at the 1th
document.
No warning, no error, and no exception either.
Indexing C:\sweetpea\wikipedia_xmlfiles\part-18\491876.xml
Indexing
Hi Mike and Erick and all,
I have fixed my code and yes, indexing is much faster than previously, when I
was doing such hammering with the IndexWriter.
However, I am now encountering this error while indexing:
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
This error
Er... where shall I put that -XX:MaxPermSize=128m?
Thanks, Pustovalov
Regards,
Maureen
Pustovalov Mikhail [EMAIL PROTECTED] wrote: try this:
-XX:MaxPermSize=128m
On Fri, 26 Jan 2007 19:32:45 +0300, maureen tanuwidjaja
wrote:
Hi Mike and Erick
oh thanks then:)
Pustovalov Mikhail [EMAIL PROTECTED] wrote: in your java
command line, of course :)
Example : java -Xms128m -Xmx1024m -server -Djava.awt.headless=true
-XX:MaxPermSize=128m protei.Starter
On Fri, 26 Jan 2007 19:39:13 +0300, maureen tanuwidjaja
wrote
Hi,
I am indexing thousands of XML documents, then it stops after indexing for
about 7 hrs
...
Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37003.xml
Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37004.xml
Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37008.xml
Indexing
and Best regards ^^
Maureen
maureen tanuwidjaja [EMAIL PROTECTED] wrote:
Thanks a lot Daniel :)
Regards,
Maureen
Daniel Noll wrote:
maureen tanuwidjaja wrote:
Before implementing this search engine, I designed the index build in such
a way that every XML tag is converted using
I dunno whether 7 hrs later it will raise the same
problem: "Lock obtain timed out"
4. I use the latest version of Lucene (nightly build)
Thanks and Regards,
Maureen
Michael McCandless [EMAIL PROTECTED] wrote: maureen tanuwidjaja wrote:
I am indexing thousands of XML documents
it in main is a recipe for
disaster. Trust me on this one, I've spent way more time than I'd like
to admit debugging this kind of problem.
Best
Erick
On 1/25/07, maureen tanuwidjaja wrote:
Hi Mike, thanks for the reply...
1. Here is the class that I use for indexing...
package
Thanks Doron =)
Regards,
Maureen
Doron Cohen [EMAIL PROTECTED] wrote: Hi Maureen,
Some relevant info in the file formats doc -
http://lucene.apache.org/java/docs/fileformats.html
Regards,
Doron
maureen tanuwidjaja wrote on 25/01/2007
01:31:25:
BTW Daniel, can you please give me
Hi...
I am a final-year undergrad. My final-year project is about a search engine for
XML documents. I am currently building this system using Lucene.
An example of an XML element from an XML document:
--
<article>
<body>
<section>