Re: Strange Error while deleting Documents from index while indexing.

2007-07-27 Thread miztaken

Where shall I post this issue? I am new to Lucene.
About closing the IndexWriter, I am now trying it like this:

1. Open New IndexReader.
2. Delete Documents.
3. Close IndexReader.
4. Open New IndexWriter.
5. Write Documents.
6. Close IndexWriter.
7. Repeat the process n times; on the nth pass, optimize the index before
closing the IndexWriter (see the sketch just below).
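
A minimal sketch of that loop against the Lucene 2.x Java API (the index
path, the "uid" key field, and the two helper methods are hypothetical
stand-ins for the real application code):

    import java.io.IOException;
    import java.util.Collections;
    import java.util.List;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.Term;

    public class DeleteThenAddCycle {

        static final String INDEX_PATH = "/path/to/index"; // placeholder

        static void runBatches(int n) throws IOException {
            for (int i = 0; i < n; i++) {
                // Steps 1-3: open a reader, delete by unique key, close it.
                IndexReader reader = IndexReader.open(INDEX_PATH);
                for (String key : keysToDelete(i)) {
                    reader.deleteDocuments(new Term("uid", key));
                }
                reader.close();

                // Steps 4-6: open a writer, add this batch's documents, close it.
                IndexWriter writer =
                    new IndexWriter(INDEX_PATH, new StandardAnalyzer(), false);
                for (Document doc : documentsToAdd(i)) {
                    writer.addDocument(doc);
                }
                // Step 7: optimize once, on the final pass only.
                if (i == n - 1) {
                    writer.optimize();
                }
                writer.close();
            }
        }

        // Hypothetical helpers standing in for the poster's database reads.
        static List<String> keysToDelete(int batch) { return Collections.emptyList(); }
        static List<Document> documentsToAdd(int batch) { return Collections.emptyList(); }
    }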

Is this acceptable?
According to http://wiki.apache.org/lucene-java/ImproveIndexingSpeed, one
should use only a single instance of IndexWriter. But in my case, after
each iteration I have to close my IndexWriter, as you have suggested.
Is this the only way to do it, or does the single-IndexWriter advice only
apply when indexing without deleting documents?
Please make this clear to me.

And another question: what would a workable scenario look like for
threaded index writing and deleting?







RE: Search terms on a single instance of field

2007-07-27 Thread Ard Schrijvers
Hello,
 

> Company AB, ...). With this I'd like to search for documents that have
> "daniel" and "president" in the same field, because in the same text
> "daniel" and "president" can occur in different fields. Is this possible?

I'm not totally sure whether I understand your problem, because it does not
sound like a problem at all:

If you just have a query that looks like fieldA:termA + fieldA:termB, you
are looking for documents which have termA AND termB in fieldA.

Isn't that all you want/need?
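
A minimal sketch of that query built programmatically (Lucene 2.x; the
field and term names are placeholders):

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.TermQuery;

    public class BothTermsInField {
        // Both terms are required (MUST) in the same field; this is the
        // programmatic equivalent of the parser syntax
        // +fieldA:termA +fieldA:termB.
        public static Query build() {
            BooleanQuery q = new BooleanQuery();
            q.add(new TermQuery(new Term("fieldA", "termA")), BooleanClause.Occur.MUST);
            q.add(new TermQuery(new Term("fieldA", "termB")), BooleanClause.Occur.MUST);
            return q;
        }
    }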

Regards Ard
 
> I know that if I had an index where the Document is a phrase like this,
> it would solve my problem, but I'd like to stay with only one index.
>
> Hope I made myself clear.
>
> []s
> Rossini
 




Re: Strange Error while deleting Documents from index while indexing.

2007-07-27 Thread karl wettin


On 27 Jul 2007, at 10:50, miztaken wrote:



> My application simply shuts down. After that, when I try to open the same
> index using IndexReader and fetch the document, it says I am trying to
> access a deleted document. After getting such an error, I opened the
> IndexWriter, optimized, and then closed it. Then I tried to get the
> documents using IndexReader again, and everything works fine.
>
> What can be the problem?
>
> Well, the pseudo code will be like:


Can you please supply an isolated and working test case that demonstrates
your problem?


Can you use IndexWriter#deleteDocument instead?

--
karl




Re: Strange Error while deleting Documents from index while indexing.

2007-07-27 Thread miztaken

> Can you use IndexWriter#deleteDocument instead?

No, I can't use that method.
I don't know the docid, and I don't want to search for it; that would only
add extra time.
I am deleting documents on the basis of a unique key field.


> Can you please supply an isolated and working test case that demonstrates
> your problem?

Well, my application is used to index database rowsets, so I won't be able
to give you running code, but I have tried my best to keep it illustrative.
I have attached a C# file; please open it with any text editor of your
choice. Find the attachment here:
http://www.nabble.com/file/p11827583/Program2.cs

Please help me.






Re: Search terms on a single instance of field

2007-07-27 Thread Rafael Rossini
Actually, no.

   Because I'd like to retrieve terms that were computed on the same
instance of a Field. Taking your example to illustrate better: I have 2
documents. On documentA I structured one field, Field("fieldA", "termA
termB", customAnalyzer). On documentB I structured 2 fields, Field("fieldA",
"termA termC", customAnalyzer) and Field("fieldA", "termB termC",
customAnalyzer).

   The problem is, if I search as you suggested, fieldA:termA +
fieldA:termB, I will get both documents, but I want only documentA. For that
to happen, somehow, somewhere there should be information that tells us that
on documentA, termA and termB were indexed on the same instance of fieldA.
I'm guessing this is not possible, but it would be great if someone had an
idea to solve this.
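
One approach sometimes used for this problem, though not confirmed anywhere
in this thread, is to leave a large position gap between successive
instances of a field and then require the terms to fall within a span
narrower than that gap, so a match cannot cross instances. A sketch under
those assumptions (Lucene 2.x; the gap size, field name, and terms are
placeholders, and the query terms must match what the analyzer actually
emits):

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.spans.SpanNearQuery;
    import org.apache.lucene.search.spans.SpanQuery;
    import org.apache.lucene.search.spans.SpanTermQuery;

    public class SameInstanceMatch {
        // Index time: force a large position gap between instances of a
        // field; documents must be indexed with this analyzer.
        static final Analyzer GAP_ANALYZER = new StandardAnalyzer() {
            public int getPositionIncrementGap(String fieldName) {
                return 1000; // assumed larger than any one instance's token count
            }
        };

        // Search time: both terms must lie within one instance's positions.
        static SpanQuery sameInstanceQuery() {
            return new SpanNearQuery(
                new SpanQuery[] {
                    new SpanTermQuery(new Term("fieldA", "terma")),
                    new SpanTermQuery(new Term("fieldA", "termb"))
                },
                999,    // slop smaller than the gap, so spans cannot cross it
                false); // order within the instance does not matter
        }
    }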

Thanks for the response Ard.

[]s
Rossini


On 7/27/07, Ard Schrijvers [EMAIL PROTECTED] wrote:

> [Ard's reply quoted in full; snipped, see the message above.]




Size of field?

2007-07-27 Thread Eduardo Botelho
Hi guys,

I would like to know if there is a limit on the size of a document's
fields.

I have the following problem:
When a term occurs after a certain number of characters (approximately
87,300) in a field, the search does not find the occurrence.
If I divide my field into pages, the terms are found normally.
The problem occurs when I run an exact query (a phrase between quotes).

What can be happening?

I'm using BrazilianAnalyzer, and StandardAnalyzer for tests only, for both
search and indexing.
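
One plausible cause, not confirmed in this thread: IndexWriter indexes only
the first 10,000 terms of each field by default (DEFAULT_MAX_FIELD_LENGTH
in Lucene 2.x) and silently drops the rest, which for typical word lengths
would land in the same neighborhood as the ~87,300-character cutoff
observed. A minimal sketch of raising the limit (the index path is a
placeholder; BrazilianAnalyzer lives in the contrib analyzers jar):

    import org.apache.lucene.analysis.br.BrazilianAnalyzer;
    import org.apache.lucene.index.IndexWriter;

    public class RaiseFieldLimit {
        public static void main(String[] args) throws Exception {
            IndexWriter writer =
                new IndexWriter("/path/to/index", new BrazilianAnalyzer(), true);
            // By default only the first 10,000 terms per field are indexed;
            // terms beyond the cutoff are dropped without any error.
            writer.setMaxFieldLength(Integer.MAX_VALUE);
            writer.close();
        }
    }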

Thanks...

Sorry for my poor English...


Re: Strange Error while deleting Documents from index while indexing.

2007-07-27 Thread karl wettin


On 27 Jul 2007, at 13:43, miztaken wrote:




>> Can you use IndexWriter#deleteDocument instead?
>
> No, I can't use that method.
> I don't know the docid, and I don't want to search for it; that would
> only add extra time.
> I am deleting documents on the basis of a unique key field.


You can do that with IndexWriter#deleteDocuments(Term).

And if Lucene.Net does not support that, can you try doing it by searching
and deleting, just to see how it changes the outcome?
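
For reference, a minimal sketch of that call in the Java API (the "uid"
field name is a placeholder for the poster's unique key field):

    import java.io.IOException;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.Term;

    public class DeleteByKey {
        // Deletes every document whose unique-key field matches the given
        // value; no docid lookup or prior search is required.
        static void deleteByKey(IndexWriter writer, String key) throws IOException {
            writer.deleteDocuments(new Term("uid", key));
        }
    }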



>> Can you please supply an isolated and working test case that
>> demonstrates your problem?
>
> Find the attachment here:
> http://www.nabble.com/file/p11827583/Program2.cs


That is a fairly large amount of code, and as it is C# I have nothing to
test it on; it is too much for me to compile and run in my head. I was
hoping for a few lines of code that demonstrate the problem and nothing
else.



--
karl




Re: Payloads and PhraseQuery

2007-07-27 Thread Peter Keegan
I guess this also ties in with 'getPositionIncrementGap', which is relevant
to fields with multiple occurrences.

Peter

On 7/27/07, Peter Keegan [EMAIL PROTECTED] wrote:

 I have a question about the way fields are analyzed and inverted by the
 index writer. Currently, if a field has multiple occurrences in a document,
 each occurrence is analyzed separately (see DocumentsWriter.processField).
 Is it safe to assume that this behavior won't change in the future? The
 reason I ask is that my custom analyzer's 'tokenStream' method creates a
 custom filter which produces a payload based on the existence of each field
 occurrence. However, if DocumentsWriter was changed and combined all the
 occurrences before inversion, my scheme wouldn't work.  Since payloads are
 created by filters/tokenizers, it helps to keep things flexible.
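
 Peter's actual filter is not shown in the thread; below is a minimal
 sketch of the general idea against the Lucene 2.2 token API, with the
 one-byte occurrence tag invented purely for illustration:

    import java.io.IOException;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.index.Payload;

    // Tags every token with a one-byte payload recording which occurrence
    // of the field produced it. The analyzer's tokenStream() would create
    // one of these per field occurrence, incrementing the counter each time.
    class OccurrencePayloadFilter extends TokenFilter {
        private final byte occurrence;

        OccurrencePayloadFilter(TokenStream in, byte occurrence) {
            super(in);
            this.occurrence = occurrence;
        }

        public Token next() throws IOException {
            Token t = input.next();
            if (t != null) {
                t.setPayload(new Payload(new byte[] { occurrence }));
            }
            return t;
        }
    }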

 Thanks,
 Peter


 On 7/12/07, Grant Ingersoll [EMAIL PROTECTED] wrote:
 
 
  On Jul 12, 2007, at 6:12 PM, Chris Hostetter wrote:
 
 
  
   Hmm... okay so the issue is that in order to get the payload data, you
   have to have a TermPositions instance.
  
   Instead of adding getPayload methods to the Spans class (which, as Paul
   points out, can have nesting issues), perhaps more general solutions
   would be:
  
   a) a more high-level getPayload API that lets you get a payload
   arbitrarily for a doc/position (perhaps as part of the TermDocs
   API?) ... then for Spans you could use this new API with Spans.start()
   and Spans.end() (and all the positions in between)
 
  Not sure I follow this.  I don't see the fit w/ TermDocs.
  
   b) add a variation of the TermPositions class to allow people to
   iterate through the terms of a TermDoc in position order (TermPositions
   first iterates over the Terms and then over the positions) ... then you
   could seek(span.start()) to get the Payload data
  
   c) add methods to the Spans API to get the subspans (if any) ... this
   would be the Spans corollary to getTerms() and would always return
   TermSpans, which would have TermPositions for getting payload data.
 
 
  This could be a good alternative.
 
   When we first talked about payloads we wondered if we could just make
   all Queries into SpanQueries by passing TermPositions instead of term
   docs, but in the end decided not to do it because of performance
   issues (some of which are lessened by lazy loading of TermPositions).
 
  The thing is, I think, that the Spans is already moving you along in
  the term positions, so it just seems like a natural fit to have it
  there, even if there is nesting.  It doesn't seem like it would be
  that hard to then return back the nesting stuff b/c you are just
  collating the results from the underlying SpanTermQuery.  Having said
  that, I haven't looked into the actual code, so take that w/ a grain
  of salt.
 
  I will try to do some more investigation, as others are welcome to
  do.  Perhaps we should move this to dev?
 
  Cheers,
  Grant
 
 
 
 



NPE in MultiReader

2007-07-27 Thread testn

Every once in a while I get the following exception with Lucene 2.2. Do you
have any idea?

Thanks,

java.lang.NullPointerException
    at org.apache.lucene.index.MultiReader.getFieldNames(MultiReader.java:264)
    at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:180)
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:97)
    at org.apache.lucene.index.IndexWriter.mergeSegments(IndexWriter.java:1883)
    at org.apache.lucene.index.IndexWriter.flushRamSegments(IndexWriter.java:1741)
    at org.apache.lucene.index.IndexWriter.flushRamSegments(IndexWriter.java:1733)
    at org.apache.lucene.index.IndexWriter.maybeFlushRamSegments(IndexWriter.java:1727)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1004)
    at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:983)
    at org.springmodules.lucene.index.factory.SimpleLuceneIndexWriter.addDocument(SimpleLuceneIndexWriter.java:44)
    at org.springmodules.lucene.index.object.database.DefaultDatabaseIndexer.addDocumentsInIndex(DefaultDatabaseIndexer.java:274)
    at org.springmodules.lucene.index.object.database.DefaultDatabaseIndexer.doHandleRequest(DefaultDatabaseIndexer.java:306)
    at org.springmodules.lucene.index.object.database.DefaultDatabaseIndexer.index(DefaultDatabaseIndexer.java:354)






Re: NPE in MultiReader

2007-07-27 Thread Dmitry
What conditions are you running Lucene under - configuration, parameters?
Can you describe more?

thanks,
dt,
www.ejinz.com
Search Engine News

- Original Message - 
From: testn [EMAIL PROTECTED]

To: java-user@lucene.apache.org
Sent: Friday, July 27, 2007 7:50 PM
Subject: NPE in MultiReader




[testn's message and stack trace quoted in full; snipped, see above.]








Re: NPE in MultiReader

2007-07-27 Thread testn

- Using Spring Module 0.8a
- Using RAM directory
- Having about 100,000 documents
- Index all documents in one thread
- Perform the optimize only at the end of the indexing process
- Using Lucene 2.2
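
For concreteness, a minimal sketch of that setup with the Spring Modules
wiring stripped away (Lucene 2.2; the document list is a hypothetical
stand-in for the database indexer's output):

    import java.util.List;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.RAMDirectory;

    public class RamIndexingSetup {
        public static void index(List<Document> docs) throws Exception {
            RAMDirectory dir = new RAMDirectory();
            IndexWriter writer = new IndexWriter(dir, new StandardAnalyzer(), true);
            // ~100,000 documents, all added from a single thread.
            for (Document doc : docs) {
                writer.addDocument(doc);
            }
            writer.optimize(); // performed only at the end of indexing
            writer.close();
        }
    }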


Dmitry-17 wrote:
> What conditions are you running Lucene under - configuration,
> parameters? Can you describe more?
> [rest of the quoted exchange and stack trace snipped; see above.]






Lucene performance using a solid state disk (SSD)

2007-07-27 Thread Kent Fitch
Has anyone done any benchmarking of Lucene running with the index stored
on an SSD?

Given the performance characteristics quoted for, say, the SanDisk devices
(e.g.
http://www.sandisk.com/OEM/ProductCatalog(1321)-SanDisk_SSD_SATA_5000_25.aspx:
7,000 IOs/sec for 512-byte requests, 67 MB/sec sustained read rate), it
looks like a promising match for some Lucene applications.

Regards,

Kent Fitch
