browse terms of index

2009-10-15 Thread jfmelian
Hi,

I use a sample embedded Apache Solr to create a Lucene index with a few
documents for testing purposes.
The documents have text, string, sint, sfloat, bool, and date fields, all of
which are indexed.
For now they are also stored, but in the end only the document ids will be
stored.

I want to list the terms of the index. I did not find a way to do this with the
Solr API, so I tried Apache Luke (Lucene API).
Here is the Luke code that lists the terms of the index:

public void terms(String field) throws CorruptIndexException, IOException {
    validateIndexSet();
    validateOperationPossible();
    // sorted map of "field:text" -> document frequency
    SortedMap<String,Integer> termMap = new TreeMap<String,Integer>();
    IndexReader reader = null;
    try {
        reader = IndexReader.open(indexName);
        TermEnum terms = reader.terms(); // enumeration of all terms in the index
        while (terms.next()) {
            Term term = terms.term();
            // keep every term if no field was given, otherwise only that field's terms
            if ((field.trim().length() == 0) || field.equals(term.field())) {
                termMap.put(term.field() + ":" + term.text(),
                        new Integer(terms.docFreq()));
            }
        }
        int nkeys = 0;
        for (String key : termMap.keySet()) {
            Lucli.message(key + ": " + termMap.get(key));
            nkeys++;
            if (nkeys > Lucli.MAX_TERMS) {
                break;
            }
        }
    } finally {
        closeReader(reader);
    }
}

But for an sfloat field (the same goes for sint) I cannot see the value of the
term. Lucene's Term class has only two fields, both of type String (the field
name and the term text).

Here are the values returned for the dynamic field f_float of type sfloat:

f_float:┼?? 
f_float:┼?? 
f_float:┼?l 
f_float:┼?? 
f_float:┼?? 

So:
Is there a way to convert a term back to its proper type (int, date, float)?
Or is there a way to see the index terms with the Solr API?

Thanks for your help

Jean-François Melian
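
On the conversion question: the three unreadable characters per term are consistent with a sortable string encoding, in which the 32 bits of the number are packed into three chars so that string order matches numeric order. The sketch below assumes that sint and sfloat use the scheme found in Solr's org.apache.solr.util.NumberUtils, reproduced here from memory rather than quoted. The class and method names are hypothetical; inside Solr the field type's own indexedToReadable method would be the usual route.

```java
// Sketch only: assumes the sortable encoding described in the lead-in.
final class SortableFloatSketch {

    // Flip the sign bit so ints sort as unsigned, then split 32 bits into 8 + 12 + 12.
    static String encodeInt(int val) {
        val += Integer.MIN_VALUE;
        char[] out = new char[3];
        out[0] = (char) (val >>> 24);
        out[1] = (char) ((val >>> 12) & 0x0fff);
        out[2] = (char) (val & 0x0fff);
        return new String(out);
    }

    // Reassemble the 32 bits from the three chars and undo the sign-bit flip.
    static int decodeInt(String term) {
        int val = term.charAt(0) << 24;
        val |= term.charAt(1) << 12;
        val |= term.charAt(2);
        return val - Integer.MIN_VALUE;
    }

    // Negative floats have their magnitude bits inverted so that they sort correctly as ints.
    static String encodeFloat(float val) {
        int bits = Float.floatToRawIntBits(val);
        if (bits < 0) {
            bits ^= 0x7fffffff;
        }
        return encodeInt(bits);
    }

    // Inverse of encodeFloat: decode the int, then undo the inversion for negatives.
    static float decodeFloat(String term) {
        int bits = decodeInt(term);
        if (bits < 0) {
            bits ^= 0x7fffffff;
        }
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        for (float f : new float[] {-2.5f, 0.0f, 1.5f, 42.0f}) {
            String term = encodeFloat(f);
            System.out.println(f + " -> " + term.length() + " chars -> " + decodeFloat(term));
        }
    }
}
```

If the assumption holds, passing term.text() to decodeFloat (or decodeInt for sint) in the loop above would give back the readable value. Date and bool fields are stored differently and are not covered by this sketch.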

Re: browse terms of index

2009-10-15 Thread Grant Ingersoll

Have a look at http://wiki.apache.org/solr/TermsComponent
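
As a rough illustration of what a TermsComponent request looks like, here is a small sketch that only builds the request URL. The parameters terms.fl and terms.limit are the ones described on the wiki page; the base URL, the /terms handler path (it has to be registered in solrconfig.xml), and the class name are assumptions for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Sketch only: builds a TermsComponent request URL, it does not contact Solr.
final class TermsRequestSketch {

    // baseUrl and the "/terms" handler path are assumptions about the local setup.
    static String termsUrl(String baseUrl, String field, int limit)
            throws UnsupportedEncodingException {
        return baseUrl + "/terms?terms=true"
                + "&terms.fl=" + URLEncoder.encode(field, "UTF-8")
                + "&terms.limit=" + limit;
    }

    public static void main(String[] args) throws Exception {
        // Ask for the first 10 terms of the dynamic field from the question.
        System.out.println(termsUrl("http://localhost:8983/solr", "f_float", 10));
    }
}
```

With an embedded Solr there is no HTTP endpoint, but the same parameters should be usable in a query sent to the terms handler through the embedded server.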

--
Grant Ingersoll
http://www.lucidimagination.com/
