I need some help figuring out the following:

I was looking at BasicIndexingFilter.java, which contains:

// url is both stored and indexed, so it's both searchable and returned
doc.add(Field.Text("url", url));

// content is indexed, so that it's searchable, but not stored in index
doc.add(Field.UnStored("content", parse.getText()));

I'm stuck on what change to make here. I'm assuming doc.add is the
call that adds tokens to the index? How can a token (word, phrase) be
"searchable but not stored in the index"?

I'm basically trying to do the following, given two pages A and B:
A is written in an Eastern alphabet.
B is written in the Latin alphabet.
I would like to index page B as it is, and page A as it is plus the content
of page A translated to Latin.
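
To make the idea concrete, here is a rough sketch of the kind of helper I have in mind. It is only an illustration: the class name, the combinedContent method and the tiny character table are made up, and I'm assuming Cyrillic as the source alphabet purely for the example. A real table would need to cover the whole alphabet.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of Nutch or Lucene.
class LatinTransliterationSketch {

    // Assumed sample table: a handful of Cyrillic letters, written as
    // Unicode escapes so the source stays plain ASCII.
    private static final Map<Character, String> TABLE = new HashMap<>();
    static {
        TABLE.put('\u0434', "d"); // Cyrillic de
        TABLE.put('\u0430', "a"); // Cyrillic a
        TABLE.put('\u043D', "n"); // Cyrillic en
        TABLE.put('\u0435', "e"); // Cyrillic ie
        TABLE.put('\u0442', "t"); // Cyrillic te
    }

    // Replace each mapped character; leave everything else unchanged.
    static String myTranslationFunctionToLatin(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            String latin = TABLE.get(c);
            out.append(latin != null ? latin : String.valueOf(c));
        }
        return out.toString();
    }

    // Original text followed by its Latin form, separated by a space,
    // so both spellings would end up in the same "content" field.
    static String combinedContent(String original) {
        return original + " " + myTranslationFunctionToLatin(original);
    }

    public static void main(String[] args) {
        System.out.println(combinedContent("\u0434\u0430 \u043D\u0435\u0442"));
    }
}
```

The point is just that the string handed to the index would hold both the original words and their Latin spellings, so a query in either alphabet could match page A.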

Would I have to add something like:
String content = parse.getText();
content +=" ";
content += myTranslationFunctionToLatin(content);
doc.add(Field.Text("content", content));

Or would the last line be:
doc.add(Field.UnStored("content", content));

What's the difference between these two Field.* calls?


Regards,
EM
