Related searches

2006-01-30 Thread Leon Chaddock
Hi, does anyone know if it is possible to show related searches with Lucene? For example, if someone searched for "car insurance" you could bring back the results and related searches like these: Automobile Insurance, Car Insurance Quote, Car Insurance Quotes, Auto Insurance, Cheap Car Insurance, Car

Memory problem

2006-02-01 Thread Leon Chaddock
Hi All, We have a Lucene index of over 10,000,000 docs at this time. When we try to run a search we get java.lang.OutOfMemoryError: Java heap space. We have tried setting -Xmx to 1 GB, but to no avail (the box has 4 GB of memory available). Is there any guidance on handling memory or

Re: Memory problem

2006-02-01 Thread Leon Chaddock
Hi Leon, I had a similar problem when doing a test import, which I believe was actually down to object churn in parsing the data to create the Documents. I achieved a quick fix by calling System.gc() every thousand documents. Cheers, Nick

Re: Memory problem

2006-02-02 Thread Leon Chaddock
issue if you have many indexed fields. Field norms take up one byte per doc per indexed field -- even if a doc doesn't have a value for that field, it still gets a norm for that field. There are options when indexing to prevent norms from being calculated, which can save a lot of space.
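To put the one-byte-per-doc-per-field rule into numbers, here is a rough sketch. The class and method names are hypothetical helpers, not part of Lucene, and the field count of 20 is an assumed value chosen only for illustration; the document count is the 10,000,000 mentioned earlier in this thread.

```java
// Hypothetical helper, not a Lucene API: applies the rule quoted above
// (one norm byte per document per indexed field) as plain arithmetic.
final class NormsMemoryEstimate {

    /** Estimated bytes held for norms: 1 byte x documents x indexed fields. */
    static long normsBytes(long numDocs, int indexedFields) {
        return numDocs * indexedFields;
    }

    public static void main(String[] args) {
        long numDocs = 10_000_000L; // index size mentioned in this thread
        int indexedFields = 20;     // assumed value, for illustration only
        long bytes = normsBytes(numDocs, indexedFields);
        System.out.printf("norms: %d bytes (about %.1f MB)%n",
                bytes, bytes / (1024.0 * 1024.0));
    }
}
```

With these assumed inputs the estimate comes to 200,000,000 bytes, which is a noticeable share of a 1 GB heap before any search has run.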

Size + memory restrictions

2006-02-14 Thread Leon Chaddock
Hi, we are having tremendous problems building a large Lucene index and querying it. The programmers are telling me that when the index file reaches 3.5 GB or 5 million docs, the index file can no longer grow any larger. To rectify this they have built index files in multiple directories. Now

Re: Size + memory restrictions

2006-02-14 Thread Leon Chaddock
Yes. We have the same problem. It is mainly because of TermInfosReader.java, which takes memory space to keep *.tii. Eugene

Re: Size + memory restrictions

2006-02-15 Thread Leon Chaddock
you will need enough memory to store a full sorting of your documents in memory. If you're trying to sort on a string or anything other than an int or float, this could require a lot of memory. I've used indices much bigger than 5 million docs/3.5 GB with less than 4 GB of RAM and had no prob
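As a back-of-the-envelope sketch of why sorting on a string can cost more than sorting on an int or float: the estimate below assumes one 4-byte value per document for a numeric sort, and for a string sort one 4-byte ordinal per document plus the distinct values at 2 bytes per character. The class, its methods, and the figures for unique values and average length are hypothetical, and object overhead is ignored, so real usage would be higher.

```java
// Hypothetical estimator, not a Lucene API. Assumes a sort cache holds
// one entry per document; ignores JVM object headers and array overhead.
final class SortCacheEstimate {

    /** int or float sort: one 4-byte value per document. */
    static long numericSortBytes(long numDocs) {
        return 4L * numDocs;
    }

    /**
     * String sort: one 4-byte ordinal per document, plus the distinct
     * values themselves at 2 bytes per character.
     */
    static long stringSortBytes(long numDocs, long uniqueValues, int avgChars) {
        return 4L * numDocs + uniqueValues * 2L * avgChars;
    }

    public static void main(String[] args) {
        long numDocs = 5_000_000L;     // size discussed in this thread
        long uniqueValues = 1_000_000; // assumed, for illustration only
        int avgChars = 20;             // assumed, for illustration only
        System.out.println("numeric sort bytes: " + numericSortBytes(numDocs));
        System.out.println("string sort bytes:  "
                + stringSortBytes(numDocs, uniqueValues, avgChars));
    }
}
```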

Re: Size + memory restrictions

2006-02-15 Thread Leon Chaddock
tch (IOException e) { log.error(ClassTool.getClassNameOnly(e) + ": " + e.getMessage(), e); } } mSearcher = new MultiSearcher(srs); changeTime = System.currentTimeMillis(); } } return mSearcher; } - Original Message - From: "Leon Chaddock" <[EMAIL PROTECTED]>

Re: Size + memory restrictions

2006-02-15 Thread Leon Chaddock
iSearcher(srs); : changeTime = System.currentTimeMillis(); :} : } : return mSearcher; : } : - Original Message - : From: "Leon Chaddock" <[EMAIL PROTECTED]> : To: : Sent: Wednesday, February 15, 2006 9:28 AM : Subject: Re: Size + memory restrictions : : : > Hi G

Re: Question

2006-03-07 Thread Leon Chaddock
Hi, I am very interested in this as well, as I wish to display related searches for users. Does anyone know if this work is open source, and is there an API available? Thanks, Leon

Changing ranking

2006-03-23 Thread Leon Chaddock
Hi, At present Lucene seems to rank very short documents over longer documents where the phrase occurs more regularly. For instance, with the search term "cat", "the cat went home" ranks higher than "the black cat when home past some other cats, on cat street". Is there any way I can change lu
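The effect described here can be reproduced with a simplified sketch of the two scoring factors involved, using the formulas of Lucene's default similarity (term frequency factor sqrt(freq), length normalization 1/sqrt(numTerms)). The class below is a hypothetical stand-in, not Lucene code. It assumes no stop-word removal and no stemming (so "cats" does not match "cat"), and it leaves out idf, query norms, and the lossy one-byte encoding of norms.

```java
// Simplified stand-in for the tf and length-norm factors only.
// Not Lucene code: idf, boosts, and norm encoding are deliberately omitted.
final class LengthNormSketch {

    /** Term-frequency factor: grows with the square root of the count. */
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    /** Length normalization: shorter fields get a larger factor. */
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    static double score(int freq, int numTerms) {
        return tf(freq) * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        // "the cat went home": 4 terms, "cat" occurs once
        double shortDoc = score(1, 4);
        // "the black cat when home past some other cats, on cat street":
        // 12 terms, "cat" occurs twice (assuming "cats" is not stemmed)
        double longDoc = score(2, 12);
        System.out.println("short doc: " + shortDoc);
        System.out.println("long doc:  " + longDoc);
    }
}
```

Under these assumptions the short document scores higher even though the term appears twice in the longer one. In Lucene versions of that period, one way to change the balance was to subclass DefaultSimilarity, override lengthNorm, and set the result on both the IndexWriter and the Searcher; treat that as a pointer to check against the version in use rather than a recipe.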

Re: Changing ranking

2006-03-24 Thread Leon Chaddock
Hi Chris, You said: "5 word occurrences in a 10 word document would probably score the same as those 5 words in a 20 word document". OK, so if I set this option, would this mean the number of occurrences was a major factor, so that: A phrase occurs 1 time in a 3 word document would be a lower rank than A