Re: Optional Terms in a single query

2005-02-21 Thread Paul Elschot
OR harry)* +olfaithfull:stillhere)) typo? (type:1 81) > I would really think to do this all in one Query. Is this even possible? How would you want to combine the results? Regards, Paul Elschot - To unsubscribe, e-m

Re: Query Tuning

2005-02-21 Thread Paul Elschot
in the current version that some scoring is done ahead for each clause into an unordered buffer. This helps for top level OR queries, but loses for OR queries that are subqueries of AND. The svn version does not score ahead. It reli

Re: Query Tuning

2005-02-21 Thread Paul Elschot
" and I rewrote the query as: > c AND (a OR b) > Would the query run faster? Exchanging the operands of AND would not make a noticeable difference in speed. Queries are evaluated by iterating the inverted term index entries for all query terms in p

Re: Lucene in the Humanities

2005-02-19 Thread Paul Elschot
On Saturday 19 February 2005 11:02, Erik Hatcher wrote: > > On Feb 19, 2005, at 3:52 AM, Paul Elschot wrote: > >>> By lowercasing the querytext and searching in title_lc ? > >> > >> Well sure, but how about this query: > >> > >&

Re: Lucene in the Humanities

2005-02-19 Thread Paul Elschot
Erik, On Saturday 19 February 2005 01:33, Erik Hatcher wrote: > > On Feb 18, 2005, at 6:37 PM, Paul Elschot wrote: > > > On Friday 18 February 2005 21:55, Erik Hatcher wrote: > >> > >> On Feb 18, 2005, at 3:47 PM, Paul Elschot wrote: > >> > >

Re: Lucene in the Humanities

2005-02-18 Thread Paul Elschot
On Friday 18 February 2005 21:55, Erik Hatcher wrote: > > On Feb 18, 2005, at 3:47 PM, Paul Elschot wrote: > > > Erik, > > > > Just curious: it would seem easier to use multiple fields for the > > original case and lowercase searching. Is there any parti

Re: Lucene in the Humanities

2005-02-18 Thread Paul Elschot
Erik, Just curious: it would seem easier to use multiple fields for the original case and lowercase searching. Is there any particular reason you analyzed the documents to multiple indexes instead of multiple fields? Regards, Paul Elschot

Re: Multiple Keywords/Keyphrases fields

2005-02-16 Thread Paul Elschot
AND like queries will match in the indexed field. A gap is implemented by providing a token stream from the analyzer that has a position increment that equals the gap for the first token in the stream. For the first field instance with the same name the gap is not needed. Regards, Paul Elschot >

Re: chained restrictive queries

2005-02-14 Thread Paul Elschot
e in IndexSearcher.search(). A profiler might tell you whether that is a bottleneck for your queries. If it is, there is some code in development that might help. In case it turns out that the memory occupied by the BitSet of the filter is a bottleneck, please check the (very) recen

Re: Problem searching Field.Keyword field

2005-02-10 Thread Paul Elschot
32 required/prohibited clauses in query". In the development version this restriction has gone. The limitation of the maximum clause count (default 1024, configurable) is still there. Regards, Paul Elschot

Re: HELP! JIT error when searching... Lucene 1.3 on Java 1.1

2005-02-08 Thread Paul Elschot
;(Ljava/io/File;Z)Lorg/apache/lucene/store/FSDirectory;': Interpreting > >>method. > >> Please report this error in detail to > >>http://java.sun.com/cgi-bin/bugreport.cgi Iirc java 1.1 had a switch to turn off JIT compilation. It did slow things d

Re: Searching for doc without a field

2005-02-04 Thread Paul Elschot
On Friday 04 February 2005 17:29, Bill Tschumy wrote: > > On Feb 4, 2005, at 10:19 AM, Bill Tschumy wrote: > > > > > On Feb 3, 2005, at 2:04 PM, Paul Elschot wrote: > > > >> On Thursday 03 February 2005 20:18, Bill Tschumy wrote: > >>> Is

Re: Searching for doc without a field

2005-02-03 Thread Paul Elschot
ining the field names of all (other) indexed fields in the document. Assuming there is always a primary key field the query is then: +fieldnames:primarykeyfield -fieldnames:specificfield Regards, Paul Elschot

Re: Rewrite causes BooleanQuery to loose required terms

2005-02-03 Thread Paul Elschot
it also be > broken? No. Currently, the "old" constructor for BooleanClause does not carry the old state forward. The "new" constructor does carry the new state backward. I'll post a fix in bugzilla later. Thanks, Paul Elschot. --

Re: Subversion conversion

2005-02-02 Thread Paul Elschot
ion 151042. So much for the few minutes instead of hours, Paul Elschot. - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]

Re: Compile lucene

2005-02-02 Thread Paul Elschot
empty beforehand): cvs -d :pserver:[EMAIL PROTECTED]:/home/cvspublic checkout -r lucene_1_4_3 -d lucene-1.4.3 jakarta_lucene In there you can correct the build.xml file and do: ant compile to compile the source code. Regards, Paul Elschot On Wednesday 02 February 2005 20:55, Helen Butler wrote: >

Re: Compile lucene

2005-02-02 Thread Paul Elschot
ilt has a 1.5 version number because of an incorrect version number in the 1.4.3 build.xml. You need to correct the version property in the build.xml file: Regards, Paul Elschot.

Re: Penalty for storing unrelated field?

2005-01-29 Thread Paul Elschot
On Friday 28 January 2005 22:30, Andy Goodell wrote: > You should be fine. For search performance, yes. But the extra field data does slow down optimization of a modified index because all the field (and index) data is read and written for that. When the extra data gets bulky, it's normally better

Re: Suggestions for documentation or LIA

2005-01-26 Thread Paul Elschot
C1 synC2 ...) the development version of BooleanQuery might be a bit faster than the current one. For an interesting twist in the use of idf please search for "fuzzy scoring changes" on lucene-dev at the end of 2004. Regards, Paul Elschot

Re: Filtering w/ Multiple Terms

2005-01-24 Thread Paul Elschot
> Setting bit on > Setting bit on > Setting bit on > Setting bit on > Leaving AccountFilter... > Leaving AccountFilter... > Leaving AccountFilter... I don't see any recursion in your code, but this output suggests nesting three deep

Re: Opening up one large index takes 940M or memory?

2005-01-22 Thread Paul Elschot
r to the way the MySQL index cache works... It would be possible to add another level of indexing to the terms. No one has done this yet, so I guess it's preferred to buy RAM instead... Regards, Paul Elschot

Re: StandardAnalyzer unit tests?

2005-01-17 Thread Paul Elschot
ne-dev? Regards, Paul Elschot

Re: Span Query Performance

2005-01-06 Thread Paul Elschot
Sorry for the duplicate on lucene-dev, it should have gone to lucene-user directly: A bit more: On Thursday 06 January 2005 10:22, Paul Elschot wrote: > On Thursday 06 January 2005 02:17, Andrew Cunningham wrote: > > Hi all, > > > > I'm currently doing a quer

Re: Span Query Performance

2005-01-06 Thread Paul Elschot
ndexing all word combinations that you're interested in. Regards, Paul Elschot

Re: searching while indexing.

2005-01-05 Thread Paul Elschot
nt or use batches of several documents, > but you cannot escape the need to serialize the writes. And while this updating is going on, you can keep another reader open for searching, it will not be affected by the updates. After all updates are done, close that reader and reopen another one to s

Re: document boost not showing up in Explanation

2004-12-28 Thread Paul Elschot
ore (assuming no hits in the changed field text.) Finally, a change in document score only influences the document ordering in the search results when another document has a score that is within the range of the change. Regards, Paul Elschot.

Re: Relevance percentage

2004-12-22 Thread Paul Elschot
d one). Then inherit from BooleanQuery.BooleanWeight to return the above Scorer. Then inherit from BooleanQuery to use the above Weight in createWeight(). Then inherit from QueryParser to use the above Query in getBooleanQuery(). Finally us

Re: Word co-occurrences counts

2004-12-22 Thread Paul Elschot
s and no further information from the matching documents, you may consider using your own HitCollector on the lower level search methods. Regards, Paul Elschot

Re: MergerIndex + Searchables

2004-12-21 Thread Paul Elschot
> > > Question : > > When on Search Process , How to Display that this relevant Document Id > Originated from Which MRG??? > > [ Some thing like this : - Search word 'ISBN12345' is available from > "MRGx" ] I thi

Re: index size doubled?

2004-12-21 Thread Paul Elschot
ot an error back from the file system. After that it has put the name of that segment in the deletable file, so it can try later to delete that segment. This is known behaviour on FAT file systems. These randomly take some time for themselves to finish closing a file after it has been corr

Re: Relevance percentage

2004-12-20 Thread Paul Elschot
a score that sorts by the number of matching clauses. Higher powers as above can come a long way, though. Regards, Paul Elschot > Thanks, > Gururaja > > Mike Snare <[EMAIL PROTECTED]> wrote: > I'm still new to Lucene, but wouldn't that be the coord()? My > understan

Re: Optimising A Security Filter

2004-12-20 Thread Paul Elschot
> approach where they could end up laboriously marking > the entire index as True? The filter is checked only for search results on the query over the whole index. The bit filters generally work well, except when you need a lot of very sparse filters and memory is a concern. Regards, Pau

Re: Permissioning Documents

2004-12-10 Thread Paul Elschot
p Filter could be the first time a user from a group queries an index after it is opened. Filters can be cached, see the recent discussion on CachingWrapperFilter and friends. Regards, Paul Elschot

Re: Retrieving all docs in the index

2004-12-09 Thread Paul Elschot
of the primary key field can serve as the constant value. Regards, Paul Elschot > -Original Message- > From: Aviran [mailto:[EMAIL PROTECTED] > Sent: Thursday, December 09, 2004 2:08 PM > To: 'Lucene Users List' > Subject: RE: Retrieving all docs in the index &

Re: restricting search result

2004-12-04 Thread Paul Elschot
a filter does reduce the search space. A filter might also be used to reduce the I/O for searching, but Lucene doesn't do that now, probably because there was little to gain. Regards, Paul Elschot. P.S. The code doing the filte

Re: restricting search result

2004-12-04 Thread Paul Elschot
Paul, On Friday 03 December 2004 23:31, you wrote: > Hi, > how would you restrict the search results for a certain user? I'm One way to restrict results is by using a Filter. > indexing all the existing data in my application but there are certain > access levels so some users should see more r

Re: IndexWriter.optimize and memory usage

2004-12-03 Thread Paul Elschot
On Friday 03 December 2004 08:43, Paul Elschot wrote: > On Friday 03 December 2004 07:50, Chris Hostetter wrote: ... > > So, If I'm understanding you (and the javadocs) correctly, the real key > > here is maxMergeDocs.  It seems like addDocument will never merge a > > s

Re: IndexWriter.optimize and memory usage

2004-12-02 Thread Paul Elschot
ergeFactor at 10, the 1000'th added document will create a segment of size 1000. With maxMergeDocs at a lower value than 1000, the last merge (of the 10 segments with 100 docs each) will not be done. optimize() uses minMergeDocs for its final merges, but it ignores maxMergeDocs. Regards, Pau
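The merge arithmetic in this message can be checked with a small model. The following is a minimal sketch, not Lucene's IndexWriter code: it assumes every added document starts as a one-document segment and that a merge happens whenever the last mergeFactor segments have equal size and the merged size does not exceed maxMergeDocs. The class and method names are hypothetical, and optimize() is not modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of segment merging while adding documents.
// Assumption: this is an illustration only, not the actual IndexWriter logic.
final class MergeModel {

    /** Returns the segment sizes (oldest first) after adding numDocs documents. */
    static List<Integer> segmentSizes(int numDocs, int mergeFactor, int maxMergeDocs) {
        if (mergeFactor < 2) {
            throw new IllegalArgumentException("mergeFactor must be at least 2");
        }
        List<Integer> segments = new ArrayList<>();
        for (int i = 0; i < numDocs; i++) {
            segments.add(1); // each added document starts as its own segment
            while (canMerge(segments, mergeFactor, maxMergeDocs)) {
                int size = segments.get(segments.size() - 1);
                for (int k = 0; k < mergeFactor; k++) {
                    segments.remove(segments.size() - 1); // drop the merged inputs
                }
                segments.add(size * mergeFactor); // add the merged segment
            }
        }
        return segments;
    }

    /** True when the last mergeFactor segments are equally sized and the merge fits. */
    private static boolean canMerge(List<Integer> segments, int mergeFactor, int maxMergeDocs) {
        int n = segments.size();
        if (n < mergeFactor) {
            return false;
        }
        int size = segments.get(n - 1);
        for (int k = 1; k <= mergeFactor; k++) {
            if (segments.get(n - k) != size) {
                return false;
            }
        }
        return (long) size * mergeFactor <= maxMergeDocs;
    }
}
```

Under these assumptions, segmentSizes(1000, 10, Integer.MAX_VALUE) yields a single segment of 1000 documents, while segmentSizes(1000, 10, 999) leaves ten segments of 100 documents each, because the last merge is not done.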

Re: Does Lucene perform ranking in the retrieved set?

2004-11-30 Thread Paul Elschot
ocs/api/org/apache/lucene/search/Similarity.html See also the DefaultSimilarity. Regards, Paul Elschot

Re: lucene Scorers

2004-11-24 Thread Paul Elschot
and/or Refactoring on how to get rid of the parallel class hierarchy. That could also involve some sort of accrual scorer and Lucene's Similarity. Regards, Paul Elschot > -Ken > > On Sat, 13 Nov 2004 12:07:05 +0100, Paul Elschot <[EMAIL PROTECTED]> wrote: > > On Frida

Re: URGENT: Help indexing large document set

2004-11-24 Thread Paul Elschot
ut that will probably not make a difference. Adding the documents can be done with multiple threads. Last time I checked that, there was a moderate speed up using three threads instead of one on a single CPU machine. Tuning the values of minMergeDocs and maxMergeDocs may also help t

Re: Numeric Range Restrictions: Queries vs Filters

2004-11-23 Thread Paul Elschot
Chris, On Tuesday 23 November 2004 03:25, Hoss wrote: > (NOTE: numbers in [] indicate Footnotes) > > I'm rather new to Lucene (and this list), so if I'm grossly > misunderstanding things, forgive me. > > One of my main needs as I investigate Search technologies is to restrict > results based on

Re: Using multiple analysers within a query

2004-11-22 Thread Paul Elschot
On Monday 22 November 2004 05:02, Kauler, Leto S wrote: > Hi Lucene list, > > We have the need for analysed and 'not analysed/not tokenised' clauses > within one query. Imagine an unparsed query like: > > +title:"Hello World" +path:Resources\Live\1 > > In the above example we would want the fir

Re: boolean/set operations on lucene queries

2004-11-18 Thread Paul Elschot
On Thursday 18 November 2004 16:57, Rupinder Singh Mazara wrote: > hi all > > I needed some help in solving the following problem > a user executes query1 and query2 > > both the queries( not result sets ) get stored, over time the user > wants to find > which documents from query1 are commo

Re: Lucene and SVD

2004-11-18 Thread Paul Elschot
; and with first 2 columns documents will be displayed in a 2D-space. > Does anyone work on a project like this? I don't know. Is there a good SVD package for Java? Regards, Paul Elschot

Re: COUNT SUBINDEX [IN MERGERINDEX]

2004-11-17 Thread Paul Elschot
On Wednesday 17 November 2004 07:10, Karthik N S wrote: > Hi guy's > > > Apologies. > > > So A Merged Index is again a Single [ addition of subIndexes... ), > > If that case , If One of the Field Types is of type 'Field.Keyword' > which is Unique across the subIndexes [Before Mergin

Re: Need help with filtering

2004-11-17 Thread Paul Elschot
On Wednesday 17 November 2004 01:20, Edwin Tang wrote: > Hello, > > I have been using DateFilter to limit my search results to a certain date > range. I am now asked to replace this filter with one where my search results > have document IDs greater than a given document ID. This document ID is >

Re: BooleanQuery - TooManyClauses Issue

2004-11-16 Thread Paul Elschot
g dates into day and time components. Once you approach 1000 days, you'll get the same problem again, so you might want to use a filter for the dates. See DateFilter and the archives on MMDD. Regards, Paul Elschot.

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-13 Thread Paul Elschot
ncations? To have only longer matches one can also use queries with multiple ? characters, each matching exactly one character. I think it would be better to encourage the users to use longer and maybe also more prefixes. This gives m

Re: lucene Scorers

2004-11-13 Thread Paul Elschot
On Friday 12 November 2004 22:56, Chuck Williams wrote: > I had a similar need and wrote MaxDisjunctionQuery and > MaxDisjunctionScorer. Unfortunately these are not available as a patch > but I've included the original message below that has the code (modulo > line breaks added by simple text emai

Re: lucene Scorers

2004-11-12 Thread Paul Elschot
to return that weight. - override QueryParser.getBooleanQuery() to return that query in the cases you want, that is when all clauses are optional. "replace" usually means "inherit from" in new code. When you need more info on this, try lucene-dev. Regards, Paul Elschot.

Re: Bug in the BooleanQuery optimizer? ..TooManyClauses

2004-11-12 Thread Paul Elschot
e actual tradeoff depends on the user requirements and the time and memory available on the server, so the users get what they pay for. Imposing a minimum prefix length can be done by overriding the method in QueryParser that provides a prefix query. Regards, Paul Elschot

Re: Query#rewrite Question

2004-11-11 Thread Paul Elschot
word 2 or "word3 word4"~4)"~2 SpanQueries can also enforce an order on the matching subqueries, but that is difficult to express in the current query syntax. Regards, Paul Elschot

Re: What is the difference between these searches?

2004-11-09 Thread Paul Elschot
On Tuesday 09 November 2004 23:14, Luke Francl wrote: > On Tue, 2004-11-09 at 16:00, Paul Elschot wrote: > > > Lucene has no provision for matching by being prohibited only. This can > > be achieved by indexing something for each document that can be > > used in queries t

Re: can lucene be backed to have an update field

2004-11-09 Thread Paul Elschot
replace the value efficiently. The only updates available are on the field norms. Regards, Paul Elschot

Re: What is the difference between these searches?

2004-11-09 Thread Paul Elschot
g not having + or - prefix is optional and only influences the score. In case there is nothing required by a + prefix, at least one of the things without prefix is required. Regards, Paul Elschot.
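The matching rule described in this message can be written down as a small predicate. This is a minimal sketch over plain term sets, not Lucene's BooleanQuery or its scorers; the class name and the representation of a document as a set of terms are assumptions made for illustration, and scoring is left out.

```java
import java.util.List;
import java.util.Set;

// Model of the +/-/no-prefix matching rule for a flat list of term clauses.
// Assumption: a document is reduced to the set of terms it contains.
final class BooleanClauseModel {

    static boolean matches(Set<String> docTerms,
                           List<String> required,    // clauses with a + prefix
                           List<String> optional,    // clauses without a prefix
                           List<String> prohibited)  // clauses with a - prefix
    {
        for (String term : prohibited) {
            if (docTerms.contains(term)) {
                return false; // any prohibited term excludes the document
            }
        }
        for (String term : required) {
            if (!docTerms.contains(term)) {
                return false; // every required term must be present
            }
        }
        if (!required.isEmpty()) {
            return true; // optional clauses then only influence the score
        }
        for (String term : optional) {
            if (docTerms.contains(term)) {
                return true; // nothing required: at least one optional must match
            }
        }
        return false; // prohibited-only (or empty) queries match nothing
    }
}
```

In this model a query consisting only of prohibited clauses never matches, which is consistent with the earlier remark in the same thread that Lucene has no provision for matching by being prohibited only.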

Re: Search speed

2004-11-02 Thread Paul Elschot
analyzer. It's an unusual solution, though. Regards, Paul Elschot

Re: Search speed

2004-11-02 Thread Paul Elschot
rms in the phrases. Another way is to avoid using the term positions by querying for words instead of phrases. In case you have hardware/resources there are more options like using faster disks and/or using RAM for critical parts of the index. Lucene can use extra RAM in

Re: When do document ids change

2004-10-29 Thread Paul Elschot
. Just retrieve what you need from the Lucene index in the order of the docId's. Try and store as little data per document as possible. > about updating the database when the documentID is created? To know the docId use an indexed primary key in lucene and search for it using IndexReader.termD

Re: threading and indexing......

2004-10-16 Thread Paul Elschot
ad, 10-15% iirc. More threads were of no use for me in that case. Regards, Paul Elschot > Otis > > --- Chris Fraschetti <[EMAIL PROTECTED]> wrote: > > if i have four threads all trying to call my index function, will > > lucene do what is necessary for each threa

Re: sorting and score ordering

2004-10-13 Thread Paul Elschot
or each term, and one scorer to combine the other two to provide the search results, usually a BooleanScorer or a ConjunctionScorer. For proximity queries, other scorers are used. Regards, Paul Elschot

Re: Special field values

2004-10-12 Thread Paul Elschot
On Tuesday 12 October 2004 19:27, Paul Elschot wrote: > > IndexReader.open(indexName).termDocs(new Term(term, > field)).skipTo(documentNr) > > returns the boolean indicating that. Well, almost. When it returns true one still needs to check the TermDocs for being at the documentNr
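The correction above is easy to miss, so here is a minimal sketch of why the extra check is needed. It does not use the Lucene API: a sorted int array stands in for the document numbers of one term, and skipTo is modelled as "advance to the first entry at or beyond the target". The class and method names are hypothetical.

```java
// Stand-in for a term's postings: document numbers in ascending order.
// Assumption: skipTo is modelled on a plain array, not on Lucene's TermDocs.
final class SkipToModel {

    /** Index of the first posting >= target, or -1 when there is none. */
    static int skipTo(int[] postings, int target) {
        for (int i = 0; i < postings.length; i++) {
            if (postings[i] >= target) {
                return i;
            }
        }
        return -1;
    }

    /** True only when the posting reached by skipTo is exactly documentNr. */
    static boolean termOccursInDoc(int[] postings, int documentNr) {
        int i = skipTo(postings, documentNr);
        // A successful skip may land on a later document, so compare explicitly.
        return i >= 0 && postings[i] == documentNr;
    }
}
```

With postings {2, 5, 9}, skipping to 4 succeeds but lands on document 5, so termOccursInDoc reports false for document 4 and true for document 5.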

Re: Special field values

2004-10-12 Thread Paul Elschot
erm is in a field of a document: IndexReader.open(indexName).termDocs(new Term(term, field)).skipTo(documentNr) returns the boolean indicating that. What do you need the {0,1} values for? Regards, Paul Elschot.

Re: How to pull document scoring values

2004-09-29 Thread Paul Elschot
ay to find out number of indexed terms for each > > document? By default, the stored norm is the inverse square root of the number of indexed terms of an indexed document field. The encoding/decoding is somewhat rough, though. Regards, Paul Elschot
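The relation stated here, norm = 1 / sqrt(number of indexed terms), can be inverted to estimate the term count. The sketch below shows only that arithmetic; it deliberately leaves out the lossy encoding of the stored norm that the message mentions, so real decoded norms would give rougher estimates. Class and method names are hypothetical.

```java
// Arithmetic of the default length norm and its inverse.
// Assumption: the lossy byte encoding of stored norms is ignored here.
final class LengthNormModel {

    /** Inverse square root of the number of indexed terms in a field. */
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    /** Estimates the number of indexed terms back from a norm value. */
    static long estimatedTermCount(float norm) {
        return Math.round(1.0 / ((double) norm * norm));
    }
}
```

For example, a field with 4 indexed terms gets a norm of 0.5, and a norm of 0.5 maps back to an estimate of 4 terms.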

Re: How to pull document scoring values

2004-09-29 Thread Paul Elschot
order of the search results much. Taking the square roots of the query term weights would have the query weights directly applied to the query term density in the document field, whereas now the weights seem to be applied to the square root of the density. The density value is an approximation

Re: displaying 'pages' of search results...

2004-09-21 Thread Paul Elschot
ed the 1.000.000 results? Regards, Paul Elschot

Re: WildCardQuery

2004-09-20 Thread Paul Elschot
dcard. As each clause ends up using some buffer memory internally, a maximum was introduced to avoid running out of memory. You can change the maximum nr of added clauses using BooleanQuery.setMaxClauseCount() but then it is advisable to monitor memory usage, and possibly i

Re: Too many boolean clauses

2004-09-20 Thread Paul Elschot
an IndexReader for this > particular search, where all other searches use the pool. Suggestions? You could use a map from the IndexSearcher back to the IndexReader that was used to create it. (It's a bit of a waste because the IndexSearcher has a reader attribute inte

Re: Too many boolean clauses

2004-09-20 Thread Paul Elschot
On Monday 20 September 2004 20:54, Shawn Konopinsky wrote: > Hey Paul, > > Thanks for the quick reply. Excuse my ignorance, but what do I do with the > generated BitSet? You can return it in the bits() method of the object implementing your org.apache.lucene.search.Filter (http://jakarta.apache

Similarity scores: tf(), lengthNorm(), sumOfSquaredWeights().

2004-09-20 Thread Paul Elschot
oots? This would allow a more straightforward comprehension of the term weights as directly weighing the term densities. Section 5 of the reference above has the full weighted p-Norm formulas. The OR p-Norm there is very close to the Lucene formula without coord(). Regards, Paul Elschot

Re: Too many boolean clauses

2004-09-20 Thread Paul Elschot
on the latest version to see the code). and iterate over your doc ids instead of over dates. This will give you a filter for the doc ids you want to query. Regards, Paul Elschot
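Building the bits for such a filter amounts to setting one bit per wanted document number. The following is a minimal sketch using only java.util.BitSet; the class name, the docIds array and the maxDoc parameter are assumptions for illustration, and wiring the result into the bits() method of a Filter subclass, as discussed in this thread, is not shown.

```java
import java.util.BitSet;

// Builds the bit set that a doc-id based filter would hand back.
// Assumption: docIds holds Lucene document numbers below maxDoc.
final class DocIdFilterModel {

    static BitSet bitsFor(int[] docIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int id : docIds) {
            if (id < 0 || id >= maxDoc) {
                throw new IllegalArgumentException("doc id out of range: " + id);
            }
            bits.set(id); // a set bit means the document passes the filter
        }
        return bits;
    }
}
```

The memory cost is roughly one bit per document in the index, regardless of how few documents pass, which is why many sparse filters can become a memory concern.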

Re: Possible to remove duplicate documents in sort API?

2004-09-06 Thread Paul Elschot
Kevin, On Sunday 05 September 2004 23:13, Kevin A. Burton wrote: > Paul Elschot wrote: > >Kevin, > > > >On Sunday 05 September 2004 10:16, Kevin A. Burton wrote: > >>I want to sort a result set but perform a group by as well... IE remove > >>duplicate

Re: Possible to remove duplicate documents in sort API?

2004-09-05 Thread Paul Elschot
case you can define another field that defines what is a duplicate by having the same value for duplicates, you can use it as one of the SortField's for sorting. Regards, Paul Elschot

Re: Build problems

2004-09-03 Thread Paul Elschot
.4.1 is out, but it's not available there yet. In case you want that version please ask on lucene-dev. Regards, Paul Elschot

Re: Question concerning speed of Lucene.....

2004-08-27 Thread Paul Elschot
l > be indexing about 400.000 messages per month. To easily keep the primary keys in sync between the SQL db and Lucene, I'd start by keeping the images and the full text only in the SQL db. Lucene optimisations (needed after adding/deleting docs) copy all

Re: Using 2nd Index to constraing Search

2004-08-27 Thread Paul Elschot
's and then use the 2nd index to qualify > the full text search over the document table. The reason I want to do > this is to reduce the numbers of documents that the full text query will > run. Regards, Paul Elschot

Re: How not to show results with the same score?

2004-08-25 Thread Paul Elschot
e outgoing URL's. Crawlers also keep track of multiple host names resolving to the same IP address. In case you need to crawl and index an intranet or more, have a look at Nutch. Regards, Paul Elschot

Re: Index Size

2004-08-19 Thread Paul Elschot
e web site. You can then see the total disk size of, for example, the stored fields. Regards, Paul Elschot

Re: Performance when computing computing a filter using hundreds of diff terms.

2004-08-06 Thread Paul Elschot
Kevin, On Thursday 05 August 2004 23:32, Kevin A. Burton wrote: > I'm trying to compute a filter to match documents in our index by a set > of terms. > > For example some documents have a given field 'category' so I need to > compute a filter with multiple categories. > > The problem is that our

Re: Question on number of fields in a document.

2004-08-04 Thread Paul Elschot
On Wednesday 04 August 2004 18:22, John Z wrote: > Hi > > I had a question related to number of fields in a document. Is there any > limit to the number of fields you can have in an index. > > We have around 25-30 fields per document at present, about 6 are keywords, > Around 6 stored, but not ind

Re: Caching of TermDocs

2004-07-26 Thread Paul Elschot
On Monday 26 July 2004 21:41, John Patterson wrote: > Is there any way to cache TermDocs? Is this a good idea? Lucene does this internally by buffering up to 32 document numbers in advance for a query Term. You can view the details here in case you're interested: http://cvs.apache.org/viewcvs.cg

Syntax for query parsers

2004-06-09 Thread Paul Elschot
ussions on creating new > query parsers (one size doesn't fit all, I don't think) and what syntax > should be used. > > Paul Elschot created a "surround" query parser that he posted about to > the list in April. > > Erik Here is a bit about the syn

Re: incomplete word match

2004-03-11 Thread Paul Elschot
On Thursday 11 March 2004 06:15, Tomcat Programmer wrote: > I have a situation where I need to be able to find > incomplete word matches, for example a search for the > string 'ape' would return matches for 'grapes' > 'naples' 'staples' etc. I have been searching the > archives of this user list a