OK, thanks!

----- Original Message -----
From: "kai.hu" <[EMAIL PROTECTED]>
To: <java-user@lucene.apache.org>
Sent: Sunday, May 04, 2008 4:27 PM
Subject: Re: Need addtional info for Field (hoping someone who reads Chinese can help me out)

> Search Google for "中文分词" (Chinese word segmentation). Apart from Che Dong's (车东) package
> there should be plenty of others. If you find one that segments better and runs more
> efficiently, please recommend it here too.
>
> --------------------------------------------------
> From: "kai.hu" <[EMAIL PROTECTED]>
> Sent: Sunday, May 04, 2008 4:20 PM
> To: <java-user@lucene.apache.org>
> Subject: Re: Need addtional info for Field (hoping someone who reads Chinese can help me out)
>
>> You only need to index and tokenize "下午去开会" ("going to a meeting this afternoon")
>> and store the corresponding time with it, for example:
>> document.add(new Field("sub","下午去开会",Field.Store.YES,Field.Index.TOKENIZED));
>> document.add(new Field("time","01:02:02",Field.Store.YES,Field.Index.UN_TOKENIZED));
>> Each document returned by a search will then contain both Fields.
>>
>> --------------------------------------------------
>> From: "Cedric Ho" <[EMAIL PROTECTED]>
>> Sent: Tuesday, April 22, 2008 3:36 PM
>> To: <java-user@lucene.apache.org>
>> Subject: Re: Need addtional info for Field (hoping someone who reads Chinese can help me out)
>>
>>> In that case you may want to index each:
>>>
>>> Field("Sub","下午去开会","01:02:02");
>>>
>>> as a separate document. So your document contains 3 fields:
>>> 1. title
>>> 2. time
>>> 3. sub
>>>
>>> Then you can get both title and time by searching the "sub" field.
>>>
>>> Cedric
>>>
>>> 2008/4/22 王建新 <[EMAIL PROTECTED]>:
>>>>
>>>> Thanks. I only search sub, not the time. When searching sub, I just want to get
>>>> the time that belongs to the matched Field.
>>>> It seems a payload cannot do that?
>>>>
>>>> ----- Original Message -----
>>>> From: <[EMAIL PROTECTED]>
>>>> To: <java-user@lucene.apache.org>
>>>> Sent: Tuesday, April 22, 2008 1:55 PM
>>>> Subject: RE: Need addtional info for Field (hoping someone who reads Chinese can help me out)
>>>>
>>>> Try using a payload, which is stored as additional information. Currently
>>>> Lucene only supports per-token payloads, but you can add an arbitrary
>>>> token for the time information.
>>>>
>>>> I am not sure what the query covers. Only the subtitle, or both
>>>> subtitle and time?
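>>>>
>>>> Regards,
>>>>
>>>> -----Original Message-----
>>>> From: 王建新 [mailto:[EMAIL PROTECTED]
>>>> Sent: Tuesday, April 22, 2008 1:06 PM
>>>> To: java-user
>>>> Subject: Need addtional info for Field (hoping someone who reads Chinese can help me out)
>>>>
>>>> I may not have described this very clearly in English, sorry :)
>>>>
>>>> ----- Original Message -----
>>>> From: 王建新
>>>> To: Chris
>>>> Sent: Tuesday, April 22, 2008 9:52 AM
>>>> Subject: Re: Need addtional info for Field
>>>>
>>>> Thanks.
>>>> My problem is this: I need to build an index over a batch of video files. Before
>>>> indexing, I have already worked out which subtitle text appears at which time in
>>>> each video.
>>>> In this setup one video program corresponds to one Document, and I need (would
>>>> like) to index the subtitles as follows:
>>>> Field("Sub","下午去开会","01:02:02");
>>>> Field("Sub","后天去开会","01:03:05");
>>>> [Note: "01:02:02" is the attached time; Lucene does not provide this usage.]
>>>>
>>>> These two Fields mean that, in the current video, the subtitle "下午去开会"
>>>> ("going to a meeting this afternoon") appears at 01:02:02 and "后天去开会"
>>>> ("going to a meeting the day after tomorrow") appears at 01:03:05. If the user
>>>> searches for "下午" ("afternoon"), the video matches, but only through the first
>>>> Field, so only the time "01:02:02" is needed. If the user searches for "开会"
>>>> ("meeting"), both Fields match, so both "01:02:02" and "01:03:05" are needed.
>>>> I am not sure whether I have made myself clear.
>>>>
>>>> I would like to know whether Lucene can solve this in some way and, if not, how
>>>> Lucene would need to be modified.
>>>>
>>>> 王建新
>>>>
>>>> ----- Original Message -----
>>>> From: Chris
>>>> To: 王建新
>>>> Sent: Monday, April 21, 2008 7:34 PM
>>>> Subject: Re: Need addtional info for Field
>>>>
>>>> Could you describe the feature a bit more clearly? Handling it this way seems
>>>> to require word segmentation....
>>>>
>>>> But I see you have not segmented the text, and if the field names are the same
>>>> with multi-pair values, they are not stored as a String.
>>>>
>>>> That's all.
>>>> Chris.
>>>>
>>>> 2008/4/21, 王建新 <[EMAIL PROTECTED]>:
>>>> Can you read Chinese?
>>>>
>>>> I don't quite understand what you mean.
>>>> Are you saying this can be solved with Lucene's existing features?
>>>>
>>>> ----- Original Message -----
>>>> From: Chris
>>>> To: 王建新
>>>> Sent: Monday, April 21, 2008 5:14 PM
>>>> Subject: Re: Need addtional info for Field
>>>>
>>>> This problem is not solved by Lucene as it stands, but another method will solve it.
>>>>
>>>> The structure is not defined like this either ......
>>>>
>>>> You may want to check it more carefully....
>>>>
>>>> That's all.
>>>> Chris.
>>>>
>>>> 2008/4/21, 王建新 <[EMAIL PROTECTED]>:
>>>> Hi Chris, it is me, "王建新".
>>>>
>>>> I have a new problem. Could you give me any advice? Thank you.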
>>>>
>>>> I want to use Lucene with some additional info attached to each field, like this:
>>>>
>>>> 1. index
>>>> Document additionalDoc = new Document();
>>>>
>>>> additionalDoc.add(new Field("field","AA BB","Additional info ..............."));
>>>> additionalDoc.add(new Field("field","BB CC","Additional info 222222222222222222222222..............."));
>>>>
>>>> writer.addDocument(additionalDoc);
>>>>
>>>> ........
>>>>
>>>> 2. search
>>>>
>>>> Searcher searcher;
>>>> ....
>>>>
>>>> searcher.search(new TermQuery(new Term("field","BB")));
>>>>
>>>> In this case I want Lucene to return additionalDoc and also tell me which
>>>> fields were matched, so that I can read the additional info from the
>>>> matched fields.
>>>>
>>>> Can Lucene do this in version 2.3.1?
>>>>
>>>> --
>>>> Chris Lin
>>>> [EMAIL PROTECTED]
>>>> Taipei , Taiwan.
>>>> -----------------------------------------------------------
>>>>
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>>>> For additional commands, e-mail: [EMAIL PROTECTED]
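
The layout suggested in the replies above (one document per subtitle line, with the time stored next to the searchable text) can be sketched roughly in plain Java. This is only an illustration under assumptions: it uses no Lucene classes, the names `SubtitleIndexSketch`, `SubtitleDoc` and `timesFor` are hypothetical, the title "video-1" is made up, and simple substring matching stands in for an analyzer plus a term query.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the "one document per subtitle line" layout.
// Nothing here is Lucene API; all names are hypothetical.
class SubtitleIndexSketch {

    // Mirrors a document with three fields: title, time, sub.
    static final class SubtitleDoc {
        final String title; // which video the line belongs to
        final String time;  // stored only, never searched
        final String sub;   // the searchable subtitle text

        SubtitleDoc(String title, String time, String sub) {
            this.title = title;
            this.time = time;
            this.sub = sub;
        }
    }

    private final List<SubtitleDoc> docs = new ArrayList<SubtitleDoc>();

    // Adds one entry per subtitle line, not one per video.
    void add(String title, String time, String sub) {
        docs.add(new SubtitleDoc(title, time, sub));
    }

    // Returns the time of every subtitle line whose text contains the query.
    // Assumption: substring matching replaces real tokenization and scoring.
    List<String> timesFor(String query) {
        List<String> times = new ArrayList<String>();
        for (SubtitleDoc doc : docs) {
            if (doc.sub.contains(query)) {
                times.add(doc.time);
            }
        }
        return times;
    }

    public static void main(String[] args) {
        SubtitleIndexSketch index = new SubtitleIndexSketch();
        index.add("video-1", "01:02:02", "AA BB");
        index.add("video-1", "01:03:05", "BB CC");
        System.out.println(index.timesFor("AA")); // [01:02:02]
        System.out.println(index.timesFor("BB")); // [01:02:02, 01:03:05]
    }
}
```

Because each subtitle line is its own entry, a hit always carries exactly one time, so there is no need to find out which of several same-named fields matched. In Lucene itself each entry would presumably be one Document with a tokenized "sub" field and a stored "time" field, as in kai.hu's snippet.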