UNIQUE FILE SEARCH

2004-11-28 Thread Karthik N S


Hi guys,


Apologies.



I have an index in which one of the fields is of field type 'Keyword'.

To this field I add unique file names.

On search, how can I display all the file names without any search keyword?

Thanks in advance.




  WITH WARM REGARDS
  HAVE A NICE DAY
  [ N.S.KARTHIK]




-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



dotLucene (port of Jakarta Lucene to C#)

2004-11-28 Thread George Aroush
Hi folks,

I am pleased to announce the availability of dotLucene 1.4.0 RC1.  dotLucene
is a complete port of Jakarta Lucene to C#.  The port is almost a
line-by-line port and it includes the demos as well as all the JUnit tests.
An index created by dotLucene is cross-compatible with Jakarta Lucene and
vice versa.

Please visit http://sourceforge.net/projects/dotlucene/ to learn more about
dotLucene and to download the source code.

Best regards,

-- George Aroush





IndexWriter.addIndexes efficiency

2004-11-28 Thread Yonik Seeley
I'd like to use addIndexes(Directory[] dirs) to add
batches of documents to a main index.

My main problem is that the addIndexes()
implementation calls optimize() at the beginning and
the end.

Now, my main index will be ~25GB in size, so adding a
single document and then doing an optimize will mean
rewriting 25GB of files, right?  That sounds like it
is going to be too expensive to do often.

What I would really like is to be able to control more
explicitly when an optimize happens.  Could
addIndexes() be easily rewritten to just call
maybeMergeSegments()?

-Yonik








Re: Index in RAM - is it really worthy?

2004-11-28 Thread Justin Swanhart
My indexes are stored on a NetApp filer via NFS.  The indexer process
updates the indexes over NFS.  I have multiple indexes.  My search
process determines whether the NFS indexes have been updated, and if they
have, it loads the index into a RAMDirectory.  RAMDirectory is of
course much faster than searching over NFS.  This way, I can also have
multiple search servers running easily.  The drawback of course is
startup time.  It takes a few minutes to start each search server
because it has to load the data into memory.  RAMDirectory also seems
to be rather memory inefficient, using a lot more memory than the
data actually consumes on disk.
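
A minimal sketch of the reload-on-update pattern described above, written in plain Java rather than against the Lucene API. The class name ReloadingHolder and both callbacks are hypothetical: currentVersion stands in for whatever tells the search process that the index on NFS has changed (a version number or a modification time, for example), and loader stands in for copying the index into memory.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Holds an in-memory copy of a resource and reloads it when its version stamp changes. */
final class ReloadingHolder<T> {
    private final LongSupplier currentVersion; // assumption: cheap check of the shared copy
    private final Supplier<T> loader;          // assumption: expensive copy into memory
    private long loadedVersion;
    private T value;

    ReloadingHolder(LongSupplier currentVersion, Supplier<T> loader) {
        this.currentVersion = currentVersion;
        this.loader = loader;
        // Read the version before loading, so an update that lands during the
        // load is detected on the next call to get().
        this.loadedVersion = currentVersion.getAsLong();
        this.value = loader.get();
    }

    /** Returns the in-memory copy, reloading it first if the shared copy has changed. */
    synchronized T get() {
        long version = currentVersion.getAsLong();
        if (version != loadedVersion) {
            value = loader.get();
            loadedVersion = version;
        }
        return value;
    }
}
```

Each search server would hold one such object per index and call get() before searching. The initial load in the constructor corresponds to the startup cost mentioned above.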


On Wed, 24 Nov 2004 14:26:40 -0800, Jonathan Hager <[EMAIL PROTECTED]> wrote:
> When comparing RAMDirectory and FSDirectory it is important to mention
> what OS you are using.  Linux caches recently accessed disk data in
> memory.  Here is a good article that describes its
> strategy: http://forums.gentoo.org/viewtopic.php?t=175419
> 
> The 2% difference you are seeing is the memory copy.  With other OSes
> you may see a speed up when using the RAMDirectory, because not all
> OSes contain a disk cache in memory and must access the disk to read
> the index.
> 
> Another consideration is that there is currently a 2GB limitation on the
> size of the RAMDirectory.  Indexes over 2GB cause an overflow in the
> int used to create the buffer.  [see int len = (int) is.length(); in
> RAMDirectory]
> 
> I ended up using RAM directory for a very different reason.  The index
> is 1 to 2MB and is rebuilt every few hours.  It takes 3 to 4 minutes
> to query the database and rebuild the index.  But the search should be
> available 100% of the time.  Since the index is so small I do the
> following:
> 
> on server startup:
> - look for semaphore, if it is there delete the index
> - if there is no index, build it to FSDirectory
> - load the index from FSDirectory into RAMDirectory
> 
> on reindex:
> - create semaphore
> - rebuild index to FSDirectory
> - delete semaphore
> - load index from FSDirectory into RAMDirectory
> 
> to search:
> - search the RAMDirectory
> 
> RAMDirectory could be replaced by a regular FSDirectory, but it seemed
> silly to copy the index from disk to disk, when it ultimately needs to
> be in memory.
> 
> FSDirectory could be replaced by a RAMDirectory, but this means that
> it would take the server 3 to 4 minutes longer to start up every time.
> By persisting the index, this time would only be necessary if indexing
> was interrupted.
> 
> Jonathan
> 
> On Mon, 22 Nov 2004 12:39:07 -0800, Kevin A. Burton <[EMAIL PROTECTED]> wrote:
> > Otis Gospodnetic wrote:
> >
> > >For the Lucene book I wrote some test cases that compare FSDirectory
> > >and RAMDirectory.  What I found was that with certain settings
> > >FSDirectory was almost as fast as RAMDirectory.  Personally, I would
> > >push FSDirectory and hope that the OS and the Filesystem do their share
> > >of work and caching for me before looking for ways to optimize my code.
> > >
> > >
> > Yes... I performed the same benchmark and in my situation RAMDirectory
> > for searches was about 2% slower.
> >
> > I'm willing to bet that it has to do with the fact that it's a Hashtable
> > and not a HashMap (which isn't synchronized).
> >
> > Also adding a constructor for the term size could make loading a
> > RAMDirectory faster since you could prevent rehash.
> >
> > If you're on a modern machine your filesystem cache will end up
> > buffering your disk anyway which I'm sure was happening in my situation.
> >
> > Kevin
> >
> 
>
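
The 2GB limit in Jonathan's quoted message comes down to narrowing a long length to an int. A small illustration in plain Java follows; the class and method names are hypothetical and this is not the RAMDirectory source. The plain cast silently wraps for lengths above Integer.MAX_VALUE, while Math.toIntExact (Java 8 or later) fails loudly instead.

```java
/** Illustrates what happens when a long file length is narrowed to an int. */
final class LengthCast {
    private LengthCast() {}

    /** Same shape as the quoted cast: lengths above Integer.MAX_VALUE wrap around silently. */
    static int truncatingLength(long length) {
        return (int) length;
    }

    /** Checked alternative: throws ArithmeticException when the length does not fit in an int. */
    static int checkedLength(long length) {
        return Math.toIntExact(length);
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024;
        // The truncated value is negative, so it cannot be used as a buffer size.
        System.out.println(truncatingLength(threeGiB));
    }
}
```

A negative length like this cannot be used to size a byte array, which fits the overflow described in the quote.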

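The startup and reindex steps in Jonathan's quoted message can be sketched roughly as follows. This is an illustration under simplifying assumptions, not his code: the index is a single file rather than a Lucene directory, the "semaphore" is an empty marker file, and the names MarkerGuardedIndex, Builder, and Loader are made up for the example. One deviation from the quoted steps: the build at startup also goes through the marker-guarded path, so a stale marker is cleared once a build completes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Keeps an in-memory copy of an on-disk index; a marker file flags interrupted rebuilds. */
final class MarkerGuardedIndex<T> {

    /** Hypothetical callback: writes a fresh index to the given file. */
    interface Builder {
        void build(Path indexFile) throws IOException;
    }

    /** Hypothetical callback: reads the on-disk index into memory. */
    interface Loader<V> {
        V load(Path indexFile) throws IOException;
    }

    private final Path indexFile;
    private final Path marker;
    private final Builder builder;
    private final Loader<T> loader;
    private volatile T inMemory;

    MarkerGuardedIndex(Path indexFile, Path marker, Builder builder, Loader<T> loader) {
        this.indexFile = indexFile;
        this.marker = marker;
        this.builder = builder;
        this.loader = loader;
    }

    /** On server startup. */
    void startup() throws IOException {
        if (Files.exists(marker)) {
            // A rebuild was interrupted, so the on-disk index cannot be trusted.
            Files.deleteIfExists(indexFile);
        }
        if (Files.notExists(indexFile)) {
            buildGuarded();
        }
        inMemory = loader.load(indexFile);
    }

    /** On reindex: rebuild on disk, then swap in a fresh in-memory copy. */
    void reindex() throws IOException {
        buildGuarded();
        inMemory = loader.load(indexFile);
    }

    /** To search: always use the in-memory copy. */
    T current() {
        return inMemory;
    }

    private void buildGuarded() throws IOException {
        if (Files.notExists(marker)) {
            Files.createFile(marker);
        }
        builder.build(indexFile);
        // Only reached if the build succeeded; otherwise the marker stays behind.
        Files.delete(marker);
    }
}
```

Searches keep using the previous in-memory copy while reindex() runs, which is how the search stays available during the 3 to 4 minute rebuild.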