Also, since the docs are so regular from a line file, to gain indexing
throughput you can re-use the Document & Field instances, and inside
the while loop just use Field.setValue(...) to change the value per
document.
Mike
Anshum wrote:
Hi,
How about just opening a file and parsing through it while adding
doing a
doc.add on each newline? That should be pretty straight and simple.
Just writing the snippet here, though this might have issues as
didnt try to
compile it.
IndexWriter writer = new IndexWriter(indexDir, new
StandardAnalyzer(),
true);
FileInputStream fstream = new FileInputStream("textfile.txt");
DataInputStream in = new DataInputStream(fstream);
BufferedReader br = new BufferedReader(new InputStreamReader(in));
String strLine;
while ((strLine = br.readLine()) != null) {
Document doc = new Document();
doc.add(new Field("filename",
f.getCanonicalPath(),Field.Store.YES,Field.Index.TOKENIZED));
doc.add(new Field("filename",
strLine,Field.Store.YES,Field.Index.TOKENIZED));//DEPENDING UPON HOW
YOU
WANT TO INDEX IT
writer.addDocument(doc);
}
in.close();
writer.close();
Also, I have tokenized the content and stored it so that it could be
fetched, you might just want to have a ref key instead of storing the
entrire content though. Upto you for implementation.
--
Anshum
http://ai-cafe.blogspot.com
On Thu, Aug 7, 2008 at 1:42 AM, Brittany Jacobs <[EMAIL PROTECTED]
>wrote:
Hello, I am new to all this. I need to read in a text file and
have each
line in the file be a document.
The LineDocMaker seems to be intended for this purpose. But I
can't figure
out how to read the data into it.
Any examples would be greatly appreciated.
--
--
The facts expressed here belong to everybody, the opinions to me.
The distinction is yours to draw............
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]