Could you boil this down to a smallish test case, showing the term vector files getting incorrectly deleted?
Then we can test this test case against the current 3.x trunk, where LUCENE-3403 is fixed, to see if that fixes it.

Luke removing the files means that the files were "dead", i.e., unreferenced by any segments_N files in the index, which is bad if that index was produced by calling addIndexes into a new directory.

Mike McCandless

http://blog.mikemccandless.com

2011/8/29 <ari...@csk.com>:
>
> Hi, I don't know whether my problem has the same cause.
>
> When I merged several indexes into one, the term vectors went missing.
>
> The input indexes are stored in several different directories, for
> example /index1/, /index2/, /index3/.
> The merged output index is saved to another, new directory, for
> example /mergeindex/.
>
> addIndexes and optimize are used here.
>
> The result is that the index is merged correctly in /mergeindex/
> and searching works without any problem.
> The term vector files (tvd, tvf and tvx) also hold data, because
> their size is not 0 bytes.
>
> But after I opened this index using Luke, I found that the term vector
> files (tvd, tvf and tvx) had become 0 bytes in the Luke file list.
> And after I closed Luke, these three files disappeared.
>
> I checked a lot of documents and the Lucene index format explanation.
> I think the reason is that the [DocStoreOffset] is not set correctly in
> the segment file here.
> Because the [DocStoreOffset] is not set, Luke thought these three files
> did not exist.
>
> But if I merge these indexes into one that already exists, the
> result is correct and the term vectors are also laid out correctly.
>
> For example, the input is /index1/, /index2/, /index3/ and the merge
> output directory is /index1/.
>
> I don't know whether my case is the same as yours, and I don't know
> whether it is a bug in Lucene.
>
> In fact, I posted a question about it on Aug. 25, titled <The question
> about DocStoreOffset>.
>
> I would appreciate it if you could give me some advice about it.
>
> Thanks in advance.
>
> Best regards.
>
> Yali Hu
>
>
> From: "Michael McCandless (JIRA)" <j...@apache.org>
> Date: 2011/08/26 12:37 GMT
> Please reply to: dev@lucene.apache.org
> To: dev@lucene.apache.org
> cc:
> Subject: [jira] [Commented] (LUCENE-3403) Term vectors missing after
> addIndexes + optimize
>
>
> [ https://issues.apache.org/jira/browse/LUCENE-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091739#comment-13091739 ]
>
> Michael McCandless commented on LUCENE-3403:
> --------------------------------------------
>
> Phew nice catch Shai!
>
>> Term vectors missing after addIndexes + optimize
>> ------------------------------------------------
>>
>> Key: LUCENE-3403
>> URL: https://issues.apache.org/jira/browse/LUCENE-3403
>> Project: Lucene - Java
>> Issue Type: Bug
>> Components: core/index
>> Affects Versions: 3.3
>> Reporter: Shai Erera
>> Assignee: Shai Erera
>> Priority: Blocker
>> Fix For: 3.4, 4.0
>>
>> Attachments: LUCENE-3403.patch
>>
>>
>> I encountered a problem with addIndexes where term vectors disappeared
>> following optimize(). I wrote a simple test case which demonstrates the
>> problem. The bug appears with both addIndexes() versions, but does not
>> appear if addDocument is called twice, committing changes in between.
>> I think I tracked the problem down to IndexWriter.mergeMiddle() -- it
>> sets term vectors before merger.merge() was called. In the addDocs case,
>> merger.fieldInfos is already populated, while in the addIndexes case it is
>> empty, hence fieldInfos.hasVectors returns false.
>> Will post a patch shortly.
>
> --
> This message is automatically generated by JIRA.
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
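
The two observations in this thread (a has-vectors flag read before the merge has populated the field metadata, and files being removed because nothing references them) can be connected with a small toy model. The sketch below is plain Java and does not use Lucene; ToyFieldInfos, referencedFiles, deadFiles and mergeAndFindDead are hypothetical names invented for illustration, and the link between the early flag read and the "dead" files is an assumption drawn from the messages above, not a reading of IndexWriter's actual code.

```java
import java.util.Set;
import java.util.TreeSet;

/** Toy model only: none of these classes or methods are Lucene's. */
public class EarlyFlagReadSketch {

    /** Hypothetical stand-in for the merger's field metadata; starts empty. */
    static final class ToyFieldInfos {
        private boolean hasVectors = false;
        void add(boolean fieldStoresVectors) { hasVectors |= fieldStoresVectors; }
        boolean hasVectors() { return hasVectors; }
    }

    /** Files a (hypothetical) segment record claims to own. */
    static Set<String> referencedFiles(String segment, boolean hasVectors) {
        Set<String> files = new TreeSet<>();
        files.add(segment + ".fdt");
        files.add(segment + ".fdx");
        if (hasVectors) {
            // Term vector files are only listed when the flag says they exist.
            files.add(segment + ".tvx");
            files.add(segment + ".tvd");
            files.add(segment + ".tvf");
        }
        return files;
    }

    /** Files present on disk that no segment record references ("dead" files). */
    static Set<String> deadFiles(Set<String> onDisk, Set<String> referenced) {
        Set<String> dead = new TreeSet<>(onDisk);
        dead.removeAll(referenced);
        return dead;
    }

    /**
     * Simulates merging sources that store term vectors into an empty target.
     * readFlagBeforeMerge = true mimics the ordering described in the issue.
     */
    static Set<String> mergeAndFindDead(boolean readFlagBeforeMerge) {
        ToyFieldInfos infos = new ToyFieldInfos(); // empty, as in the addIndexes case
        boolean recorded = false;
        if (readFlagBeforeMerge) {
            recorded = infos.hasVectors(); // too early: metadata not populated yet
        }
        infos.add(true); // the merge populates the metadata ...
        Set<String> onDisk = referencedFiles("_0", true); // ... and writes vector files
        if (!readFlagBeforeMerge) {
            recorded = infos.hasVectors(); // read after the merge: sees the vectors
        }
        return deadFiles(onDisk, referencedFiles("_0", recorded));
    }

    public static void main(String[] args) {
        // prints: flag read before merge: dead = [_0.tvd, _0.tvf, _0.tvx]
        System.out.println("flag read before merge: dead = " + mergeAndFindDead(true));
        // prints: flag read after merge:  dead = []
        System.out.println("flag read after merge:  dead = " + mergeAndFindDead(false));
    }
}
```

In this model the vector files contain data on disk in both runs; only the recorded flag differs, which would match the report that the tvd, tvf and tvx files were non-empty until a tool cleaned up unreferenced files.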