Dear Digy,

You cannot store the Field value when using a TokenStream, but you can store the term vector. To do so, create the Field instance like this:

    Field Field1 = new Field("Field1", DummyTokenStream1, Field.TermVector.YES);

Below is code that should work:

    import org.apache.lucene.analysis.WhitespaceAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.index.IndexWriter;

    // DummyTokenStream is the class from your sample; it must be visible
    // here (for example, moved out of MainClass into its own file).
    public class Main2Class {
        Document Doc = new Document();

        // A Field built from a TokenStream is indexed but never stored.
        DummyTokenStream DummyTokenStream1 = new DummyTokenStream();
        Field Field1 = new Field("Field1", DummyTokenStream1, Field.TermVector.YES);

        DummyTokenStream DummyTokenStream2 = new DummyTokenStream();
        Field Field2 = new Field("Field2", DummyTokenStream2, Field.TermVector.YES);

        public void Index() throws Exception {
            Doc.add(Field1);
            Doc.add(Field2);
            IndexWriter wr = new IndexWriter("testindex", new WhitespaceAnalyzer(), true);
            for (int i = 0; i < 100; i++) {
                PrepDoc();
                wr.addDocument(Doc);
            }
            wr.close();
        }

        // Reset each stream with new text, then hand it back to its Field.
        void PrepDoc() {
            DummyTokenStream1.SetText("test1");
            Field1.setValue(DummyTokenStream1);
            DummyTokenStream2.SetText("test2");
            Field2.setValue(DummyTokenStream2);
        }

        public static void main(String[] args) throws Exception {
            Main2Class m = new Main2Class();
            m.Index();
        }
    }

Cheers
Ajay

2008/7/8 Ajay Lakhani <[EMAIL PROTECTED]>:

> Dear Digy,
>
> To add on, I think that this may not be a glitch.
>
> A TokenStream is usually not stored.
> If you change your field attribute to
> org.apache.lucene.document.Field.Store.NO then there will be no issue.
>
> Developers, any thoughts on this?
>
> Cheers
> Ajay
>
> 2008/7/8 Ajay Lakhani <[EMAIL PROTECTED]>:
>
> Dear Digy,
>> As of Lucene 2.3, there are new setValue(...) methods that allow you to
>> change the value of a Field. However, there seems to be an issue with the
>> org.apache.lucene.index.FieldsWriter.writeField(...) API that stores the
>> string value for the field, which happens to be null in the case of a
>> TokenStream.
>>
>> The org.apache.lucene.index.FieldsWriter.writeField(...) API needs to be
>> changed to verify whether the Field Data is an instance of String, Reader or
>> a TokenStream and then retrieve the respective values. I shall patch this
>> soon.
>>
>> Is there a particular reason you are using a TokenStream?
>> I suggest you set the text value directly on the Field:
>> Field1.setValue("xxx");
>>
>> Moreover, it's best to create a single Document instance, then add
>> multiple Field instances to it, but hold onto these Field instances and
>> re-use them by changing their values for each added document. After the
>> document is added, you then directly change the Field values
>> (idField.setValue(...), etc), and then re-add your Document instance. You
>> cannot re-use a single Field instance within a Document, and you should not
>> change a Field's value until the Document containing that Field has been
>> added to the index.
>>
>> 2008/7/8 Digy <[EMAIL PROTECTED]>:
>>
>> Hi all,
>>>
>>> I am a Lucene.Net user. Since I need fast indexing in my current
>>> project, I am trying to use Lucene 2.3.2, which I converted to .Net with
>>> IKVM (since Lucene.Net is currently at v2.1), and I use the same instances
>>> of document and fields to gain some speed improvements.
>>>
>>> I use TokenStreams to set the value of fields.
>>>
>>> My problem is that I get a NullPointerException in "addDocument".
>>>
>>> Exception in thread "main" java.lang.NullPointerException
>>>     at org.apache.lucene.store.IndexOutput.writeString(IndexOutput.java:99)
>>>     at org.apache.lucene.index.FieldsWriter.writeField(FieldsWriter.java:127)
>>>     at org.apache.lucene.index.DocumentsWriter$ThreadState$FieldData.processField(DocumentsWriter.java:1418)
>>>     at org.apache.lucene.index.DocumentsWriter$ThreadState.processDocument(DocumentsWriter.java:1121)
>>>     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:2442)
>>>     at org.apache.lucene.index.DocumentsWriter.addDocument(DocumentsWriter.java:2424)
>>>     at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1464)
>>>     at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1442)
>>>     at MainClass.Test(MainClass.java:39)
>>>     at MainClass.main(MainClass.java:10)
>>>
>>> To show the same bug in Java I prepared a sample application (oh, that
>>> was hard since this is my second app. in java (first one was a "Hello World"
>>> app.))
>>>
>>> Is something wrong with my application or is it a bug in Lucene?
>>>
>>> Thanks,
>>>
>>> DIGY
>>>
>>> SampleCode:
>>>
>>> public class MainClass
>>> {
>>>     DummyTokenStream DummyTokenStream1 = new DummyTokenStream();
>>>     DummyTokenStream DummyTokenStream2 = new DummyTokenStream();
>>>
>>>     //use the same document&field instances for Indexing
>>>     org.apache.lucene.document.Document Doc = new org.apache.lucene.document.Document();
>>>
>>>     org.apache.lucene.document.Field Field1 = new org.apache.lucene.document.Field("Field1", "",
>>>         org.apache.lucene.document.Field.Store.YES,
>>>         org.apache.lucene.document.Field.Index.TOKENIZED);
>>>     org.apache.lucene.document.Field Field2 = new org.apache.lucene.document.Field("Field2", "",
>>>         org.apache.lucene.document.Field.Store.YES,
>>>         org.apache.lucene.document.Field.Index.TOKENIZED);
>>>
>>>     public MainClass()
>>>     {
>>>         Doc.add(Field1);
>>>         Doc.add(Field2);
>>>     }
>>>
>>>     public void Index() throws
>>>         org.apache.lucene.index.CorruptIndexException,
>>>         org.apache.lucene.store.LockObtainFailedException,
>>>         java.io.IOException
>>>     {
>>>         System.out.println("Index Started");
>>>         org.apache.lucene.index.IndexWriter wr = new org.apache.lucene.index.IndexWriter("testindex",
>>>             new org.apache.lucene.analysis.WhitespaceAnalyzer(), true);
>>>
>>>         for (int i = 0; i < 100; i++)
>>>         {
>>>             PrepDoc();
>>>             wr.addDocument(Doc);
>>>         }
>>>         wr.close();
>>>         System.out.println("Index Completed");
>>>     }
>>>
>>>     void PrepDoc()
>>>     {
>>>         DummyTokenStream1.SetText("test1"); //Set a new Text to Token Stream
>>>         Field1.setValue(DummyTokenStream1); //Set TokenStream to Field Value
>>>
>>>         DummyTokenStream2.SetText("test2"); //Set a new Text to Token Stream
>>>         Field2.setValue(DummyTokenStream2); //Set TokenStream to Field Value
>>>     }
>>>
>>>     public static void main(String[] args) throws
>>>         org.apache.lucene.index.CorruptIndexException,
>>>         org.apache.lucene.store.LockObtainFailedException,
>>>         java.io.IOException
>>>     {
>>>         MainClass m = new MainClass();
>>>         m.Index();
>>>     }
>>>
>>>     public class DummyTokenStream extends org.apache.lucene.analysis.TokenStream
>>>     {
>>>         String Text = "";
>>>         boolean EndOfStream = false;
>>>         org.apache.lucene.analysis.Token Token = new org.apache.lucene.analysis.Token();
>>>
>>>         //return "Text" as the first token and null as the second
>>>         public org.apache.lucene.analysis.Token next()
>>>         {
>>>             if (EndOfStream == false)
>>>             {
>>>                 EndOfStream = true;
>>>
>>>                 Token.setTermText(Text);
>>>                 Token.setStartOffset(0);
>>>                 Token.setEndOffset(Text.length() - 1);
>>>                 Token.setTermLength(Text.length());
>>>                 return Token;
>>>             }
>>>             return null;
>>>         }
>>>
>>>         public void SetText(String Text)
>>>         {
>>>             EndOfStream = false;
>>>             this.Text = Text;
>>>         }
>>>     }
>>> }
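
P.S. The type check proposed earlier in this thread for writeField(...) could look roughly like the sketch below. This is not Lucene's code and not the actual patch: StoredValueSketch, TokenSource and storedValueOf are hypothetical names made up for illustration, and TokenSource merely stands in for a TokenStream so that the example compiles without Lucene on the classpath. It assumes the simplest policy, namely that only String values can be stored and anything else is rejected up front with a readable message instead of failing later with a NullPointerException.

```java
import java.io.Reader;
import java.io.StringReader;

/** Illustrative stand-in only; this is NOT Lucene's FieldsWriter. */
final class StoredValueSketch {

    /** Hypothetical stand-in for a token stream: it yields terms, not one string. */
    interface TokenSource {
        String next(); // returns null when the stream is exhausted
    }

    /**
     * Returns the text to store for a field, dispatching on the runtime type
     * of its value instead of assuming the value is always a String.
     */
    static String storedValueOf(String fieldName, Object fieldData) {
        if (fieldData instanceof String) {
            return (String) fieldData;
        }
        if (fieldData instanceof Reader || fieldData instanceof TokenSource) {
            // Assumed policy: reader- and stream-valued fields carry no stored value.
            throw new IllegalArgumentException(
                "field '" + fieldName + "' has no string value and cannot be stored");
        }
        throw new IllegalArgumentException(
            "field '" + fieldName + "' has unsupported data: " + fieldData);
    }

    public static void main(String[] args) {
        // A String value is returned unchanged.
        System.out.println(storedValueOf("Field1", "test1"));

        // A Reader value is rejected with a clear message.
        try {
            storedValueOf("Field1", new StringReader("test1"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Whether a real fix should reject such fields or silently skip storing them is a design decision for the developers; the sketch only shows the dispatch on the value's type.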