Re: Retrieving Tokens

2007-12-20 Thread Erick Erickson
I think that what Yonik wants is a higher-level response.
*Why* do you want to process the tokens later? What is the
use case you're trying to satisfy?

Best
Erick

On Dec 20, 2007 1:37 AM, Rishabh Joshi [EMAIL PROTECTED] wrote:

  What are you trying to do with the tokens?

 Yonik, we wanted a tokenizer that would tokenize the content of a
 document as per our requirements, and then store the tokens in the
 index so that we could retrieve them at search time for further
 processing in our application.

 Regards,
 Rishabh

 On Dec 19, 2007 10:02 PM, Yonik Seeley [EMAIL PROTECTED] wrote:

  On Dec 19, 2007 10:59 AM, Rishabh Joshi [EMAIL PROTECTED] wrote:
   I have created my own Tokenizer and I am indexing the documents
   using the same.
  
   I wanted to know if there is a way to retrieve the tokens (created
   by my custom tokenizer) from the index.
 
  If you want the tokens in the index, see the luke request handler.
 
  If you want the tokens for a specific document, it's more
  complicated... Lucene maintains an *inverted* index... terms point to
  documents, so by default there is no way to ask for all of the terms
  in a certain document.  One could ask lucene to store the terms for
  certain fields (called term vectors), but that requires extra space in
  the index, and solr doesn't yet have a way to ask that they be
  retrieved.
 
  What are you trying to do with the tokens?
 
  -Yonik
 
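[Editorial note: Yonik's distinction between the inverted index and stored term vectors can be sketched with plain Java collections. This is an illustrative data-structure sketch only; the names below are hypothetical and are not Lucene's actual classes or API.]

```java
import java.util.*;

public class TokenLookupSketch {
    // Inverted index: term -> posting set (ids of documents containing the term).
    // This answers "which documents contain term T?" efficiently, but not
    // the reverse question "which terms are in document N?".
    static Map<String, Set<Integer>> invertedIndex = new HashMap<>();

    // Term vectors: document id -> the terms stored for that document.
    // Storing these costs extra space per document, as Yonik notes.
    static Map<Integer, List<String>> termVectors = new HashMap<>();

    static void index(int docId, String text, boolean storeTermVector) {
        List<String> tokens = Arrays.asList(text.toLowerCase().split("\\s+"));
        for (String t : tokens) {
            invertedIndex.computeIfAbsent(t, k -> new TreeSet<>()).add(docId);
        }
        if (storeTermVector) {
            termVectors.put(docId, tokens); // extra per-document storage
        }
    }

    // A document's terms are only directly recoverable if a term vector
    // was stored for it at index time.
    static List<String> termsOf(int docId) {
        return termVectors.getOrDefault(docId, Collections.emptyList());
    }

    public static void main(String[] args) {
        index(0, "custom tokens in the index", true);
        index(1, "tokens retrieved at search time", false);
        System.out.println(invertedIndex.get("tokens")); // prints [0, 1]
        System.out.println(termsOf(0)); // stored term vector, terms recoverable
        System.out.println(termsOf(1)); // prints [] -- postings only, no vector
    }
}
```

The sketch shows why the extra storage is needed: without the second map, reconstructing a document's terms would require scanning every posting list in the index.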



Re: Retrieving Tokens

2007-12-20 Thread Eswar K
Yonik/Erick,

We are building a custom search that is done in two parts, executed at
different points in time. In the first step we want to tokenize the
information and store it; at a later point we want to retrieve it for
further processing and then store it back into the index. This
processed information is what we want users to be able to search on.

Regards,
Eswar

On Dec 20, 2007 8:15 PM, Erick Erickson [EMAIL PROTECTED] wrote:

 I think that what Yonik wants is a higher-level response.
 *Why* do you want to process the tokens later? What is the
 use case you're trying to satisfy?

 Best
 Erick




Retrieving Tokens

2007-12-19 Thread Rishabh Joshi
Hi,

I have created my own Tokenizer and I am indexing the documents using the
same.

I wanted to know if there is a way to retrieve the tokens (created by my
custom tokenizer) from the index.
Do we have to modify the code to get these tokens?

Regards,
Rishabh
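
[Editorial note: Yonik's pointer to the luke request handler can be tried roughly as follows. This is a sketch, not a verified config: the handler path, class name, and URL parameters below are assumptions about Solr of this era, so check your own solrconfig.xml and the Solr wiki for the exact registration and parameter names.]

```xml
<!-- Hypothetical registration in solrconfig.xml; verify the class name
     against your Solr version before relying on it. -->
<requestHandler name="/admin/luke"
                class="org.apache.solr.handler.admin.LukeRequestHandler" />
```

With the handler registered, a request along the lines of
http://localhost:8983/solr/admin/luke?fl=myfield&numTerms=20
(parameter and field names assumed, not verified) reports the top indexed
terms per field, which is one way to inspect what a custom tokenizer
actually produced at index time.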