Re: Extract the text that was indexed

2009-01-02 Thread Lebiram
.3.1.jar org.apache.lucene.index.FieldNormModifier C:/index -n field1 field2 field3 field4 field5 field6 field7

From: Chris Hostetter
To: java-user@lucene.apache.org
Sent: Friday, January 2, 2009 12:25:13 AM
Subject: Re: Extract the text that was indexed

: > Just wanted to reconstruct

Re: Extract the text that was indexed

2009-01-01 Thread Chris Hostetter
: > Just wanted to reconstruct a new index based on an existing index(but
: > turning off norms) that's all.
:
: If you want to create an identical index but without norms use
: FieldNormModifier in contrib/miscellaneous.

and that, ladies and gentlemen, is *exactly* the definition of an "X/Y Prob

Re: Extract the text that was indexed

2008-12-30 Thread Karl Wettin
On 30 Dec 2008 at 17:13, Lebiram wrote:

Hi Lebiram, contrib/misc contains a couple of tools that might be of help. Just wanted to reconstruct a new index based on an existing index(but turning off norms) that's all. If you want to create an identical index but without norms use FieldNormModi

Re: Extract the text that was indexed

2008-12-30 Thread Lebiram
guys!

From: Erick Erickson
To: java-user@lucene.apache.org
Sent: Tuesday, December 30, 2008 3:41:46 PM
Subject: Re: Extract the text that was indexed

Actually, you can reconstruct the text, but it's a lossy process. Stop words aren't in the index for instance. And

Re: Extract the text that was indexed

2008-12-30 Thread Erick Erickson
Actually, you can reconstruct the text, but it's a lossy process. Stop words aren't in the index for instance. And it's very time-consuming. Luke makes a "best guess" at this process, so you might want to take a look at that code. But even the very bright folks who put Luke together caution that it
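To make the "lossy" point concrete, here is a minimal sketch in plain Python. It is a toy model, not Lucene's index format and not the code Luke uses. The stop-word set, the function names `build_postings` and `reconstruct`, and the assumption that dropped stop words still consume a position are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical stop-word list; a real analyzer would define its own.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "that", "was"}


def build_postings(text):
    """Map each non-stop-word term to the token positions where it occurs."""
    postings = defaultdict(list)
    for position, token in enumerate(text.lower().split()):
        # Stop words are never written to the index, but (by assumption)
        # they still use up a position, leaving a gap.
        if token not in STOP_WORDS:
            postings[token].append(position)
    return dict(postings)


def reconstruct(postings, gap="_"):
    """Invert term -> positions into a position-ordered string of terms."""
    by_position = {
        pos: term for term, positions in postings.items() for pos in positions
    }
    if not by_position:
        return ""
    length = max(by_position) + 1
    # Positions with no term are the stop words that were dropped.
    return " ".join(by_position.get(i, gap) for i in range(length))


original = "Extract the text that was indexed"
postings = build_postings(original)
rebuilt = reconstruct(postings)
print(rebuilt)  # extract _ text _ _ indexed
```

The rebuilt string keeps the order of the surviving terms, but the stop words, the original casing, and anything else the analyzer changed are gone. Inverting every term's position list for every document is also why the process is slow on a real index.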

Re: Extract the text that was indexed

2008-12-30 Thread Greg Shackles
That is my understanding of it too. Terms in the index will point to the position of the tokens they map to. Since one index term can point at any number of tokens, this isn't a sequence map, but just a search map. If you still have the text that was indexed you could run it through an analyzer

Re: Extract the text that was indexed

2008-12-30 Thread Alexander Aristov
I am not sure, but from my understanding, fields that are only indexed and not stored do not keep positions. So even if you get back all terms for a field for a given document, you won't be able to reconstruct the original word sequence. And remember that not all words are indexed.

Alex

2008/12/30 Lebi