Hello,
We're using the GermanAnalyzer/stemmer to index and search our (German)
website.
I have a few questions:
(1) Why is the GermanAnalyzer case-sensitive? None of the other
language analyzers seem to be. What does this feature add?
(2) With the German Analyzer, wildcard searches
Is it thread-safe to share one
instance of IndexSearcher between multiple threads?
-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
Jon,
I too found some problems with the German analyser recently. Here's what
may help:
1. You can try reading Joerg Caumanns' paper "A Fast and Simple Stemming
Algorithm for German Words". This paper describes the algorithm
implemented by GermanAnalyser.
2. I guess German nouns are all capitalized, so
I had to moderate both Jonathan and Jon's messages in to the list.
Please subscribe to the list and post to it from the address you've
subscribed with. I cannot always guarantee I'll catch moderation messages
and send them through in a timely fashion.
Erik
On Mar 1, 2005, at 6:18 AM,
I'm also interested in knowing what can change the doc numbers.
Does this happen frequently? Like Stanislav has been asking... what sort of
operations on the index cause the document number to change for any given
document? If the document numbers change frequently, is there a
straightforward
Hello, Volodymyr.
VB Additional question.
VB If I'm sharing one instance of IndexSearcher between different threads,
VB is it OK to just drop this instance and let the GC collect it?
VB Because I don't know whether some thread is still using this searcher
VB or is done with it.
It is safe to share one instance of IndexSearcher between multiple threads.
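A minimal sketch of this pattern, assuming the Lucene 1.4-era API (the index path, field name, and query term are made-up placeholders):

```java
// One IndexSearcher, opened once at startup and shared by all request
// threads; searching is thread-safe, so no external locking is needed.
IndexSearcher searcher = new IndexSearcher("/path/to/index");

// Any number of threads may call search() on it concurrently:
Hits hits = searcher.search(new TermQuery(new Term("contents", "lucene")));

// Close it only once you know no thread can still be using it.
searcher.close();
```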
Apologies Erik,
This must be one of those apostrophe in email address problems I always
get. Recently I removed the apostrophe from the email address I give out.
Our server recognizes both email addresses, but some of these mail lists
don't like the O'Connor clann!
Ciao,
Jonathan O'Connor
XCOM
I found something kind of weird about the way Lucene interprets boolean
expressions without parentheses.
When I run the query A AND B OR C, it returns only the documents that have A (in
other words, as if the query were just the term A).
When I run the query A OR B AND C, it returns only the
Hi,
I have a problem with IndexReader.delete(int doc):
it fails with a lock error.
Alex Kiselevski
+9.729.776.4346 (desk)
+9.729.776.1504 (fax)
AMDOCS INTEGRATED CUSTOMER MANAGEMENT
Additional question:
If I'm sharing one instance of IndexSearcher between different threads,
is it OK to just drop this instance and let the GC collect it?
Because I don't know whether some thread is still using this searcher
or is done with it.
Note that as long as one of the threads keeps a reference to the
I probably had the same trouble (but I'm not sure).
I ran a test program that created a lot of IndexSearchers (but also
closed and freed them).
It ended with an OutOfMemoryError.
But I'm not finished with that problem (I need to use a profiler).
But I have discovered one strange
Maybe you have an IndexWriter open at the same time you are trying to
delete the document.
Alex Kiselevski wrote:
Hi,
I have a problem with IndexReader.delete(int doc):
it fails with a lock error.
Alex Kiselevski
+9.729.776.4346 (desk)
+9.729.776.1504 (fax)
AMDOCS INTEGRATED CUSTOMER MANAGEMENT
Hello;
Anyone have an ideas on how to index the contents within zip files?
Thanks,
Luke
Hello
First, you need a parser for each file type: PDF, txt, Word, etc.
Then use the Java API to iterate over the zip contents; see:
http://java.sun.com/j2se/1.4.2/docs/api/java/util/zip/ZipInputStream.html
Use the getNextEntry() method.
A little example:
ZipInputStream zis = new ZipInputStream(fileInputStream);
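The snippet above can be fleshed out into a complete, runnable listing using only java.util.zip (the class name is made up; the main method builds a small archive in memory just so the example is self-contained):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipEntryLister {

    // Iterate a zip stream with getNextEntry() and collect the entry names.
    // Each entry's bytes could be handed to the matching parser here.
    public static List<String> listEntries(InputStream in) throws IOException {
        List<String> names = new ArrayList<String>();
        ZipInputStream zis = new ZipInputStream(in);
        ZipEntry entry;
        while ((entry = zis.getNextEntry()) != null) {
            names.add(entry.getName());
            zis.closeEntry();
        }
        zis.close();
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny two-entry archive in memory.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ZipOutputStream zos = new ZipOutputStream(bos);
        zos.putNextEntry(new ZipEntry("a.txt"));
        zos.write("hello".getBytes("UTF-8"));
        zos.closeEntry();
        zos.putNextEntry(new ZipEntry("b.txt"));
        zos.write("world".getBytes("UTF-8"));
        zos.closeEntry();
        zos.close();

        List<String> names = listEntries(new ByteArrayInputStream(bos.toByteArray()));
        System.out.println(names); // [a.txt, b.txt]
    }
}
```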
Hi,
just an idea on how to manage a large index that is updated very often.
There is often a need to update a document in the index. To update a
document you delete the old document from the index and then add the
new one. In most cases this requires you to open an IndexReader, delete the
document, close
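The delete-then-re-add cycle described here, sketched against the Lucene 1.4-era API (the path, field name, analyzer, and document variable are placeholders):

```java
// 1. Delete the old version via a unique key field. No IndexWriter may be
//    open on the same index at this point, or you'll hit a lock error.
IndexReader reader = IndexReader.open("/path/to/index");
reader.delete(new Term("id", "42"));   // deletes every doc with id:42
reader.close();                        // releases the lock

// 2. Re-add the updated document with an IndexWriter.
IndexWriter writer = new IndexWriter("/path/to/index", new StandardAnalyzer(), false);
writer.addDocument(updatedDoc);
writer.close();
```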
Thanks Ernesto.
The issue I'm working with now (this is more lack of experience than
anything) is getting an input I can index. All my indexing classes (doc,
pdf, xml, ppt) take a File object as a parameter and return a Lucene
Document containing all the fields I need.
I'm struggling with how I
Stanislav Jordanov wrote:
startTs = System.currentTimeMillis();
dummyMethod(hits.doc(nHits - 1));
stopTs = System.currentTimeMillis();
System.out.println("Last doc accessed in " + (stopTs - startTs));
Luke,
Look at the javadocs for java.io.ByteArrayInputStream - it wraps a
byte array and makes it accessible as an InputStream. Also see
java.util.zip.ZipFile. You should be able to read and parse all
contents of the zip file in memory.
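Following that advice, here is a self-contained sketch (class name made up) that treats a zip file already loaded into a byte array as a stream and pulls each entry's uncompressed bytes into memory, ready to wrap in a ByteArrayInputStream for a parser:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class InMemoryZipReader {

    // Read every entry of an in-memory zip archive, returning a map from
    // entry name to that entry's uncompressed bytes. Each value can then be
    // wrapped in a ByteArrayInputStream and handed to a PDF/Word/etc. parser.
    public static Map<String, byte[]> readAll(byte[] zipBytes) throws IOException {
        Map<String, byte[]> contents = new LinkedHashMap<String, byte[]>();
        ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(zipBytes));
        byte[] buf = new byte[4096];
        ZipEntry entry;
        while ((entry = zis.getNextEntry()) != null) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            int n;
            while ((n = zis.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            contents.put(entry.getName(), out.toByteArray());
        }
        zis.close();
        return contents;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny archive in memory so the example is self-contained.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ZipOutputStream zos = new ZipOutputStream(bos);
        zos.putNextEntry(new ZipEntry("doc.txt"));
        zos.write("some text to index".getBytes("UTF-8"));
        zos.closeEntry();
        zos.close();

        Map<String, byte[]> all = readAll(bos.toByteArray());
        System.out.println(new String(all.get("doc.txt"), "UTF-8")); // some text to index
    }
}
```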
I am looking for a solution to a problem I am having. We have a web-based asset
management solution where we manage customers' assets.
We have had requests from some clients who would like the ability to index
PDF files, now and possibly other text files in the future. The PDF files live
on a
See inlined comments below.
We have had requests from some clients who would like the ability to
index PDF files, now and possibly other text files in the future. The
PDF files live on a server and are in a structured environment. I would
like to somehow index the content inside the PDF and
Lucene Users,
We have a requirement for a new version of our software that it run in a
clustered environment. Any node should be able to go down but the
application must keep functioning.
Currently, we use Lucene on a single node, but this won't meet our
failover requirements. If we can't find a
Daniel Naber wrote:
After fixing this I can reproduce the problem with a local index that
contains about 220,000 documents (700 MB). Fetching the first document
takes, for example, 30 ms; fetching the last one takes 100 ms. Of course I
tested this with a query that returns many results (about
Hi
My site has two types of documents with different structures. I would
like to create an index for each type of document. What is the best
way to implement this?
I have been trying to implement this but found out that 90% of the
code is the same.
In the Lucene in Action book, there is a case study
Hi all,
I have a web-based application that we use to index text documents
as well as images; the indexed fields are either Field.UnStored or
Field.Keyword.
Currently, we plan to modify some of the index field names. For
example, if the index field name was
It's hard to answer such a general question with anything very precise,
so sorry if this doesn't hit the mark. Come back with more details and
we'll gladly assist though.
First, certainly do not copy/paste code. Use standard reuse practices,
perhaps the same program can build the two
6. Index locally and synchronize changes periodically. This is an
interesting idea and bears looking into. Lucene can combine multiple
indexes into a single one, which can be written out somewhere else, and
then distributed back to the search nodes to replace their existing
index.
This is a
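The combine-and-redistribute step mentioned above can be sketched against the Lucene 1.4-era API (the Directory variables and analyzer are placeholders):

```java
// Merge several locally built indexes into one combined index, which can
// then be written out and shipped back to the search nodes.
IndexWriter writer = new IndexWriter(mergedDir, new StandardAnalyzer(), true);
writer.addIndexes(new Directory[] { node1Dir, node2Dir });
writer.optimize();  // collapse the merged segments
writer.close();
```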
Is it true that for each index I have to create a separate instance
of FSDirectory, IndexWriter and IndexReader? Do I need to create a
separate locking mechanism as well?
I have already implemented a program using just one index.
Thanks,
Ben
On Tue, 1 Mar 2005 22:09:05 -0500, Erik Hatcher
Yonik Seeley wrote:
6. Index locally and synchronize changes periodically. This is an
interesting idea and bears looking into. Lucene can combine multiple
indexes into a single one, which can be written out somewhere else, and
then distributed back to the search nodes to replace their existing
Ben,
You do need to use a separate instance of those three classes for each
index, yes. But this is really just something like:
IndexWriter writer = new IndexWriter(directory, analyzer, false);
So it's the normal code-writing process; you don't really have to create
anything new, just use the existing Lucene API. As for locking,
This list is about to be moved to java-user at lucene.apache.org.
Please excuse the temporary inconvenience.
Cheers,
Roy T. Fielding, co-founder, The Apache Software Foundation
([EMAIL PROTECTED]) http://roy.gbiv.com/