Hi,
lucene documentation seems to be very confusing... here is my predicament
I have an object like the following:
public class PropertyImpl implements Property {
String id;
List<String> names = new ArrayList<String>();
String address = "";
String city = "";
String street = "";
String state = "";
String cachedMatchingString;
...
..
public String toString() {
// save some time
if (cachedMatchingString == null) {
StringBuffer buf = new StringBuffer();
for (String name : names) {
//buf.append(name + ",");
buf.append(name + " ");
}
if (address != null) {
//buf.append(address + ",");
buf.append(address + " ");
}
if (city != null) {
//buf.append(city + ",");
buf.append(city + " ");
}
if (street != null) {
//buf.append(street + ",");
buf.append(street + " ");
}
if (state != null) {
//buf.append(state);
buf.append(state);
}
// get rid of any spaces
//cachedMatchingString = buf.toString().replace(" ",
"");
cachedMatchingString = buf.toString();
}
return cachedMatchingString;
}
A typical instance of this object would have the following data
id=123134
names= {'Emmett Walsh', 'John Walsh'}
address="Milltown."
city="Dublin"
street="Bankside Cottages"
state="Ireland"
cachedMatchingString="Emmett Walsh John Walsh Milltown Dublin Bankside
Cottages Ireland"
I have about 8000 of these objects which I store in a hashmap for direct
access based on there id. It is the id(s) that I need lucene to return to me
so that I can retireve the desired object(s) from the hashmap.
The cachedMatchingString string is what I use when indexing the information
like following:
e.g. so for each of the 8000 objects i do the following to build the index
idxWriter.addDocument(createDocument(prop));
private static Document createDocument(Property prop) {
Document doc = new Document();
Field field = new Field("id", prop.getId(), Store.YES,
Index.NO);
doc.add(field);
Field field2 = new Field("content", prop.toString(), Store.NO,
Index.TOKENIZED); //will tokenise our long
string
doc.add(field2);
return doc;
}
Searching
I have a text field in my app that typically would take in a strings like
"Main s" , which
would end up in a query like "Main AND s*" like follows
"B", which would end up in a query like "b*"
queryString = queryString.trim(); //the string
taken in from textbox
String[] strings = queryString.split(" ");
int numStrings = strings.length;
StringBuffer luceneQuery = new StringBuffer();
for(int i=0;i<numStrings;i++){
String aString = strings[i];
if(aString.equals("")){
continue;
}
luceneQuery.append(aString);
if((i+1) != numStrings){
luceneQuery.append(" AND ");
}else{
luceneQuery.append("*");
}
}
//BooleanQuery.setMaxClauseCount(2*1024);
QueryParser parser = new QueryParser("content",
new StandardAnalyzer());
org.apache.lucene.search.Query query =
parser.parse(luceneQuery.toString());
// Search for the query
Hits hits = searcher.search(query);
but i keep getting Toomanyclausesexceptions
Any suggestions on how to index or search better ?
--
View this message in context:
http://www.nabble.com/Indexing-and-searching-help-tf4020937.html#a11420608
Sent from the Lucene - Java Developer mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]