Re: some thoughts about adding transactions.

2005-01-11 Thread Scott Ganyo
I didn't want to let this drop on the floor, but I haven't had the time to craft a response to it either. So, just for the record, I agree that transactions would be nice. I think that it is important that the solution address change visibility and concurrent transactions within multiple

Re: dotLucene (port of Jakarta Lucene to C#)

2004-12-01 Thread Scott Ganyo
Why does it seem to you that C# is faster than Java? In any case, generally the bottleneck isn't the VM. It's the I/O to the disks... Scott The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on

Re: BooleanQuery - Too Many Clases on date range.

2004-10-01 Thread Scott Ganyo
You can use: BooleanQuery.setMaxClauseCount(int maxClauseCount); to increase the limit. On Sep 30, 2004, at 8:24 PM, Chris Fraschetti wrote: I recently read in regards to my problem that date_field:[0820483200 TO 110448] is evaluated into a series of boolean queries ... which has a cap of 1024
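
Why a wide range runs into the cap can be shown with a small, self-contained sketch. It is not Lucene code: the class RangeExpansionSketch, its expand method, and the use of IllegalStateException are all hypothetical stand-ins, and it assumes only what the thread says, namely that a range is rewritten into one clause per matching term and that the default limit is 1024.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names come from Lucene.
class RangeExpansionSketch {
    static final int DEFAULT_MAX_CLAUSES = 1024; // the cap quoted in the thread

    // Emit one clause per indexed term inside [lower, upper]; fail once the cap is reached.
    static List<String> expand(TreeSet<String> terms, String lower, String upper, int maxClauses) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms.subSet(lower, true, upper, true)) {
            if (clauses.size() >= maxClauses) {
                throw new IllegalStateException("too many clauses: limit is " + maxClauses);
            }
            clauses.add(term);
        }
        return clauses;
    }

    public static void main(String[] args) {
        TreeSet<String> terms = new TreeSet<>();
        for (int i = 0; i < 2000; i++) {
            terms.add(String.format("%010d", i)); // 2000 distinct zero-padded timestamps
        }
        try {
            expand(terms, "0000000000", "0000001999", DEFAULT_MAX_CLAUSES);
        } catch (IllegalStateException e) {
            System.out.println("default cap hit: " + e.getMessage());
        }
        // Raising the limit (the analogue of setMaxClauseCount) lets the same range through.
        System.out.println(expand(terms, "0000000000", "0000001999", 4096).size());
    }
}
```

The number of clauses depends on how many distinct terms fall in the range, so coarser date values (for example, day rather than second resolution) would shrink the expansion as effectively as raising the limit.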

Re: Open-ended range queries

2004-06-10 Thread Scott ganyo
At one point it definitely supported null for either term. I think that has been removed/forgotten in the later revisions of the QueryParser... Scott On Jun 10, 2004, at 1:24 PM, Erik Hatcher wrote: On Jun 10, 2004, at 2:13 PM, Terry Steichen wrote: Actually, QueryParser does support

Re: Open-ended range queries

2004-06-10 Thread Scott ganyo
It looks to me like Revision 1.18 broke it. On Jun 10, 2004, at 3:26 PM, Erik Hatcher wrote: On Jun 10, 2004, at 4:07 PM, Terry Steichen wrote: Well, I'm using 1.4 RC3 and the null range upper limit works just fine for searches in two of my fields; one is in the form of a canonical date (eg,

Re: Open-ended range queries

2004-06-10 Thread Scott ganyo
Well, I do like the *, but apparently there are some people that are using this with the null... Scott On Jun 10, 2004, at 7:15 PM, Erik Hatcher wrote: On Jun 10, 2004, at 4:54 PM, Scott ganyo wrote: It looks to me like Revision 1.18 broke it. It seems this could be it: revision 1.18 date: 2002

Re: DocumentWriter, StopFilter should use HashMap... (patch)

2004-03-11 Thread Scott ganyo
I don't buy it. HashSet is but one implementation of a Set. By choosing the HashSet implementation you are not only tying the class to a hash-based implementation, you are tying the interface to *that specific* hash-based implementation or its subclasses. In the end, either you buy the

Re: Index advice...

2004-02-10 Thread Scott ganyo
I have. While document.add() itself doesn't increase over time, the merge does. Ways of partially overcoming this include increasing the mergeFactor (but this will increase the number of file handles used), or building blocks of the index in memory and then merging them to disk. This has

Re: BooleanQuery question

2004-01-16 Thread Scott ganyo
No, you don't need required or prohibited, but you can't have both. Here is a rundown: * A required clause will allow a document to be selected if and only if it contains that clause and will exclude any documents that don't. * A prohibited clause will exclude any documents that contain that
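
The two rules that survive in the snippet above can be sketched in plain Java. This is not Lucene's implementation: ClauseSketch and matches are hypothetical names, and the handling of clauses that are neither required nor prohibited (at least one must match when nothing is required) is an assumption, since the original rundown is cut off before it gets there.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a boolean clause; not a Lucene class.
class ClauseSketch {
    final String term;
    final boolean required;
    final boolean prohibited;

    ClauseSketch(String term, boolean required, boolean prohibited) {
        if (required && prohibited) {
            // "you can't have both"
            throw new IllegalArgumentException("a clause cannot be both required and prohibited");
        }
        this.term = term;
        this.required = required;
        this.prohibited = prohibited;
    }

    // Decide whether a document (modelled as a set of terms) is selected by the clauses.
    static boolean matches(Set<String> docTerms, List<ClauseSketch> clauses) {
        boolean hasRequired = false;
        boolean optionalHit = false;
        for (ClauseSketch c : clauses) {
            boolean present = docTerms.contains(c.term);
            if (c.prohibited) {
                if (present) return false;      // prohibited term found: excluded
            } else if (c.required) {
                hasRequired = true;
                if (!present) return false;     // required term missing: excluded
            } else if (present) {
                optionalHit = true;             // assumption: optional clauses only need one hit
            }
        }
        return hasRequired || optionalHit;
    }

    public static void main(String[] args) {
        List<ClauseSketch> query = Arrays.asList(
                new ClauseSketch("lucene", true, false),   // +lucene
                new ClauseSketch("spam", false, true));    // -spam
        System.out.println(matches(new HashSet<>(Arrays.asList("lucene", "index")), query));
        System.out.println(matches(new HashSet<>(Arrays.asList("lucene", "spam")), query));
    }
}
```

Under these assumptions the first document is selected and the second is excluded by the prohibited clause.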

Re: java.io.IOException: Bad file number

2003-11-10 Thread Scott Ganyo
I don't think adding extensive locking is necessary. What you are probably experiencing is that you've closed the index before you're done using it. If you aren't careful to close the index only after all searches on it have been completed, you'll get an error like this. Scott [EMAIL

Re: Multiple writers

2003-10-29 Thread Scott Ganyo
Offhand, I would say that using 2 directories and merging them is exactly what you want. It really shouldn't be all that complicated and Lucene should handle the synchronization for you... Scott Dror Matalon wrote: Hi folks, We're in the process of adding search to our online RSS

Re: Limit on number of required/prohibited clauses

2003-09-05 Thread Scott Ganyo
Hi Eugene, Yes. Doug (Cutting) added this to eliminate OutOfMemory errors that apparently some people were having. Unfortunately, it causes backward-compatibility issues if you were used to using version 1.2. So, you'll need to add a call like this:

Re: Reuse IndexSearcher?

2003-08-19 Thread Scott Ganyo
Yes. You can (and should for best performance) reuse an IndexSearcher as long as you don't need access to changes made to the index. An open IndexSearcher won't pick up changes to the index, so if you need to see the changes, you will need to open a new searcher at that point. Scott Aviran

Re: Make Lucene Index distributable

2003-08-18 Thread Scott Ganyo
Be careful with option 1. NFS and the Lucene file-based locking mechanism don't get along extremely well. (See the archives for details...) Scott Lienhard, Andrew wrote: I can think of three options: 1) Single index dir on a shared drive (NFS, etc.) which is mounted on each app server. 2)

Re: NLucene up to date ?

2003-07-31 Thread Scott Ganyo
Do these implementations maintain file compatibility with the Java version? Scott Erik Hatcher wrote: I'd love to see there be quality implementations of the Lucene API in other languages, that are up to date with the latest Java codebase. I'm embarking on a Ruby port, which I'm hosting at

Re: Luke - Lucene Index Browser

2003-07-14 Thread Scott Ganyo
Nifty cool! I'm gonna like this, I can tell already! I'm having a really hard time actually using Luke, though, as all the window panes and table columns are apparently of fixed size. Do you think you could throw in the ability to resize the various window panes and table columns? This

Re: Incremental indexing

2002-12-05 Thread Scott Ganyo
+1. Support for transactions in Lucene is high on my list of desirable features as well. I would love to have time to look into adding this, but lately... well, you know how that goes. Scott Eric Jain wrote: If you want to update a set of documents, you can remove their previous version

Re: How does delete work?

2002-11-22 Thread Scott Ganyo
It just marks the record as deleted. The record isn't actually removed until the index is optimized. Scott Rob Outar wrote: Hello all, I used the delete(Term) method, then I looked at the index files, only one file changed _1tx.del I found references to the file still in some of the
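
The mark-then-purge behaviour described in this reply can be modelled with a short sketch. It is a toy model, not Lucene's on-disk format: TombstoneIndexSketch and its methods are hypothetical, with a BitSet standing in for the .del file and optimize() standing in for the rewrite that finally drops marked records.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical toy index; none of these names come from Lucene.
class TombstoneIndexSketch {
    private List<String> docs = new ArrayList<>();
    private BitSet deleted = new BitSet(); // plays the role of the .del file

    void add(String doc) {
        docs.add(doc);
    }

    // Mark every matching record as deleted; nothing is physically removed yet.
    int delete(String doc) {
        int marked = 0;
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i) && docs.get(i).equals(doc)) {
                deleted.set(i);
                marked++;
            }
        }
        return marked;
    }

    int storedCount() {
        return docs.size();
    }

    int liveCount() {
        return docs.size() - deleted.cardinality();
    }

    // Rewrite the record list without the marked entries and clear the marks.
    void optimize() {
        List<String> kept = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (!deleted.get(i)) {
                kept.add(docs.get(i));
            }
        }
        docs = kept;
        deleted = new BitSet();
    }

    public static void main(String[] args) {
        TombstoneIndexSketch index = new TombstoneIndexSketch();
        index.add("a.txt");
        index.add("b.txt");
        index.add("c.txt");
        index.delete("b.txt");
        System.out.println(index.storedCount() + " stored, " + index.liveCount() + " live");
        index.optimize();
        System.out.println(index.storedCount() + " stored, " + index.liveCount() + " live");
    }
}
```

In this model the deleted record still occupies storage until optimize() runs, which is consistent with references to it remaining visible in the index files after a delete.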

Re: Fun project?

2002-11-21 Thread Scott Ganyo
I'm rather partial to Jini for distributed systems, but I agree that JXTA would definitely be the way to go on this type of peer-to-peer scenario. Scott [EMAIL PROTECTED] wrote: I'll be doing something very similar some time in the next 12 months for the project I'm working on. I'll be more

Re: Searching Ranges

2002-11-11 Thread Scott Ganyo
Hi Alex, I just looked at this and had the following thought: The RangeQuery must continue to iterate after the first match is found in order to match everything within the specified range. In other words, if you have a range of a to d, you can't stop with a, you need to continue to d. At

Re: Your experiences with Lucene

2002-10-29 Thread Scott Ganyo
Actually, 10k isn't very large. We have indexes with more than 1M records. It hasn't been a problem. Scott Tim Jones wrote: Hi, I am currently starting work on a project that requires indexing and searching on potentially thousands, maybe tens of thousands, of text documents. I'm hoping

RE: Using Filters in Lucene

2002-07-31 Thread Scott Ganyo
Cool. But instead of adding a new class, why not change Hits to inherit from Filter and add the bits() method to it? Then one could pipe the output of one Query into another search without modifying the Queries... Scott -Original Message- From: Doug Cutting [mailto:[EMAIL

RE: Too many open files?

2002-07-23 Thread Scott Ganyo
Are you closing the searcher after each search when done? No: Waiting for the garbage collector is not a good idea. Yes: It could be a timeout on the OS holding the file handles. Either way, the only real option is to avoid thrashing the searchers... Scott -Original Message- From: Hang

Forked files? was: RE: Too many open files?

2002-07-23 Thread Scott Ganyo
? It would seem that if there was an efficient implementation of a forked file, perhaps that could be used instead of the set of files that Lucene currently uses to represent a segment. Scott -Original Message- From: Scott Ganyo [mailto:[EMAIL PROTECTED]] Sent: Tuesday, July 23, 2002 10:13 AM

RE: CachedSearcher

2002-07-16 Thread Scott Ganyo
I'd like to see the finalize() methods removed from Lucene entirely. In a system with heavy load and lots of gc, using finalize() causes problems. To wit: 1) I was at a talk at JavaOne last year where the gc performance experts from Sun (the engineers actually writing the HotSpot gc) were

RE: CachedSearcher

2002-07-16 Thread Scott Ganyo
with them rather than allowing finalization to take care of it. Scott -Original Message- From: Doug Cutting [mailto:[EMAIL PROTECTED]] Sent: Tuesday, July 16, 2002 11:56 AM To: Lucene Users List Subject: Re: CachedSearcher Scott Ganyo wrote: I'd like to see the finalize

RE: IndexReader Pool

2002-07-08 Thread Scott Ganyo
Deadlocks could be created if the order in which locks are obtained is not consistent. Note, though, that the locks are obtained in the same order each time throughout. (BTW: The inner lock is merely needed because the wait/notify calls need to own the monitor.) Naturally, you are free to make

RE: Stress Testing Lucene

2002-06-27 Thread Scott Ganyo
- From: Scott Ganyo [mailto:[EMAIL PROTECTED]] Sent: Wednesday, June 26, 2002 7:15 PM To: 'Lucene Users List' Subject: RE: Stress Testing Lucene 1) Are you sure that the index is corrupted? Maybe the file handles just haven't been released yet. Did you try to reboot and try again

RE: Stress Testing Lucene

2002-06-26 Thread Scott Ganyo
1) Are you sure that the index is corrupted? Maybe the file handles just haven't been released yet. Did you try to reboot and try again? 2) To avoid the too-many files problem: a) increase the system file handle limits, b) make sure that you reuse IndexReaders as much as you can rather across

RE: Boolean Query + Memory Monster

2002-06-13 Thread Scott Ganyo
Use the java -Xmx option to increase your heap size. Scott -Original Message- From: Nader S. Henein [mailto:[EMAIL PROTECTED]] Sent: Thursday, June 13, 2002 12:20 PM To: [EMAIL PROTECTED] Subject: Boolean Query + Memory Monster I have 1 Gig of memory on the machine with the

RE: Queryparser croaking on [ and ]

2002-02-20 Thread Scott Ganyo
Actually, [] denotes an inclusive range of Terms. Anyway, why not change the syntax if this is bad...? Scott -Original Message- From: Brian Goetz [mailto:[EMAIL PROTECTED]] Sent: Wednesday, February 20, 2002 10:08 AM To: Lucene Users List Subject: Re: Queryparser croaking on [ and

RE: JDK 1.1 vs 1.2+

2002-01-22 Thread Scott Ganyo
+1 -Original Message- From: Matt Tucker [mailto:[EMAIL PROTECTED]] Sent: Tuesday, January 22, 2002 11:06 AM To: 'Lucene Users List' Subject: RE: JDK 1.1 vs 1.2+ Hey all, I'd just like to chime in support for dropping JDK 1.1, especially if it would aid i18n in Lucene.

Re: Industry Use of Lucene?

2001-12-06 Thread Scott Ganyo
We use Lucene extensively as a core part of our ASP product here at eTapestry. In fact, we've built our database query engine on top of it. We have been extremely pleased with the results. Scott Jeff Kunkle asks: Does anyone know of any companies or agencies using Lucene for their

RE: Problems with prohibited BooleanQueries

2001-11-02 Thread Scott Ganyo
of a BooleanQuery subtract. Sure, it works, but it ain't pretty... Scott -Original Message- From: Doug Cutting [mailto:[EMAIL PROTECTED]] Sent: Thursday, November 01, 2001 10:49 AM To: 'Lucene Users List' Subject: RE: Problems with prohibited BooleanQueries From: Scott Ganyo

RE: File Handles issue

2001-10-16 Thread Scott Ganyo
P.S. At one point I tried doing an in-memory index using the RAMDirectory and then merging it with an on-disk index and it didn't work. The RAMDirectory never flushed to disk... leaving me with an empty index. I think this is because of a bug in the mechanism that is supposed

RE: Trying To Understand Query Syntax Details

2001-10-16 Thread Scott Ganyo
Not sure about the rest, but if you've stored your dates in yyyymmdd format, you can use a RangeQuery like so: dateField:[20011001-null] This would return all dates on or after October 1, 2001. Scott -Original Message- From: W. Eliot Kimber [mailto:[EMAIL PROTECTED]] Sent: Tuesday,

RE: File Handles issue

2001-10-15 Thread Scott Ganyo
Thanks for the detailed information, Doug! That helps a lot. Based on what you've said and on taking a closer look at the code, it looks like by setting mergeFactor and maxMergeDocs to Integer.MAX_VALUE, an entire index will be built in a single segment completely in memory (using the

File Handles issue

2001-10-11 Thread Scott Ganyo
We're having a heck of a time with too many file handles around here. When we create large indexes, we often get thousands of temporary files in a given index! Even worse, we just plain run out of file handles--even on boxes where we've upped the limits as much as we think we can! We've played