Thank you for your response.
I have to update my code ^_^

On April 13, 2011 at 7:19 PM, Julien Nioche <[email protected]> wrote:

> Hi,
>
> Nutch has moved away from handling the indexing and search itself and now
> delegates that to SOLR as of versions 1.3 and 2.0 (both forthcoming). The
> issue you described won't be fixed as this part of the code has been
> removed. Users are encouraged to start using 1.3 and use SOLR for the
> indexing and search.
>
> Your comments should be useful to anyone having the same issue with Nutch
> <= 1.2, so thanks for sharing this.
>
> Julien
>
>
> 2011/4/13 Bupo Jung <[email protected]>
>
>> I use Nutch for Chinese search. When I enter a query string such as
>> "可爱的小女生" (a lovely little girl), the Chinese analyzer turns it into
>> three query tokens: 可爱, 小女, and 女生. When these tokens are used to build
>> the summary for the result page, a StringIndexOutOfBoundsException is
>> thrown. Here is the error log:
>>
>> 2010-12-15 12:18:43,505 ERROR searcher.NutchBean - Exception occured while
>> executing search: java.lang.RuntimeException:
>> java.util.concurrent.ExecutionException:
>> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>>
>> java.lang.RuntimeException: java.util.concurrent.ExecutionException:
>> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>>   at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:297)
>>   at org.apache.nutch.searcher.NutchBean.getSummary(NutchBean.java:350)
>>   at org.apache.nutch.searcher.NutchBean.main(NutchBean.java:410)
>> Caused by: java.util.concurrent.ExecutionException:
>> java.lang.StringIndexOutOfBoundsException: String index out of range: -1
>>   at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
>>   at java.util.concurrent.FutureTask.get(FutureTask.java:83)
>>   at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:292)
>>   … 2 more
>> Caused by: java.lang.StringIndexOutOfBoundsException: String index out of
>> range: -1
>>   at java.lang.String.substring(String.java:1937)
>>   at org.apache.nutch.summary.basic.BasicSummarizer.getSummary(BasicSummarizer.java:188)
>>   at org.apache.nutch.searcher.FetchedSegments.getSummary(FetchedSegments.java:263)
>>   at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:63)
>>   at org.apache.nutch.searcher.FetchedSegments$SummaryTask.call(FetchedSegments.java:53)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>   at java.lang.Thread.run(Thread.java:662)
>>
>> This happens because the two query tokens "小女" and "女生" overlap: they
>> share the character 女, so the second token starts before the first ends.
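
To make the failure mode concrete, here is a minimal standalone sketch that
mimics the offset bookkeeping of the loop quoted further down. It is not the
Nutch code: the class name, the method name, and the {start, end} offsets for
the three tokens are assumptions made for illustration (end is exclusive).

```java
final class OverlapDemo {
    /**
     * Walks the token spans the way the quoted loop does and returns the
     * index of the first token for which substring() throws, or -1 if none.
     * Each span is {startOffset, endOffset}, end exclusive (assumed layout).
     */
    static int firstFailingToken(String text, int[][] spans) {
        int offset = 0;
        for (int i = 0; i < spans.length; i++) {
            int start = spans[i][0];
            int end = spans[i][1];
            try {
                text.substring(offset, start); // fragment before the token
                text.substring(start, end);    // the highlighted token itself
            } catch (StringIndexOutOfBoundsException e) {
                return i; // offset > start: the previous token overlaps this one
            }
            offset = end;
        }
        return -1;
    }

    public static void main(String[] args) {
        String text = "可爱的小女生";
        // Assumed offsets: 可爱 = [0,2), 小女 = [3,5), 女生 = [4,6)
        int[][] spans = { {0, 2}, {3, 5}, {4, 6} };
        System.out.println(firstFailingToken(text, spans)); // prints 2
    }
}
```

The third token fails because offset is 5 after "小女" while "女生" starts at 4,
so substring(5, 4) is asked for a negative-length range.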
>>
>>
>> nutch/src/plugin/summary-basic/src/java/org/apache/nutch/summary/basic/BasicSummarizer.java
>>
>> line 188:
>>
>> if (highlight.contains(t.term())) {
>>   excerpt.addToken(t.term());
>>   // When two tokens overlap, offset > t.startOffset(),
>>   // so the next line is where the exception occurs.
>>   excerpt.add(new Fragment(text.substring(offset, t.startOffset())));
>>   excerpt.add(new Highlight(text.substring(t.startOffset(), t.endOffset())));
>>   offset = t.endOffset();
>>   endToken = Math.min(j + sumContext, tokens.length);
>> }
>>
>>
>> // Changed code to fix the error:
>> if (highlight.contains(t.term())) {
>>   excerpt.addToken(t.term());
>>   // bupo changed the code to fix the Chinese token overlap error, 2010.12.15
>>   if (offset < t.startOffset()) {
>>     excerpt.add(new Fragment(text.substring(offset, t.startOffset())));
>>     excerpt.add(new Highlight(text.substring(t.startOffset(), t.endOffset())));
>>   } else {
>>     // Tokens overlap: highlight only the part not already emitted.
>>     excerpt.add(new Highlight(text.substring(offset, t.endOffset())));
>>   } // bupo
>>   offset = t.endOffset();                             // as in the original block
>>   endToken = Math.min(j + sumContext, tokens.length); // as in the original block
>> }
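
For anyone who wants to try the idea outside Nutch, the same overlap handling
can be sketched as a small standalone method. This is only an illustration:
OverlapHighlighter, highlight(), and the square-bracket markers are
hypothetical names, and the spans are assumed to be sorted by start offset with
an exclusive end. It also skips tokens that are already fully covered, a case
the change above does not address.

```java
final class OverlapHighlighter {
    /**
     * Wraps each token span in [ ] and copies the text between spans as-is.
     * When a span overlaps the previous one, only the part that has not been
     * emitted yet is highlighted, mirroring the else branch of the fix.
     */
    static String highlight(String text, int[][] spans) {
        StringBuilder out = new StringBuilder();
        int offset = 0;
        for (int[] span : spans) {
            int start = span[0];
            int end = span[1];
            if (end <= offset) {
                continue; // token lies entirely inside text already emitted
            }
            if (offset < start) {
                out.append(text, offset, start);                       // plain fragment
                out.append('[').append(text, start, end).append(']');  // whole token
            } else {
                out.append('[').append(text, offset, end).append(']'); // remainder only
            }
            offset = end;
        }
        out.append(text, offset, text.length()); // trailing plain text
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumed offsets: 可爱 = [0,2), 小女 = [3,5), 女生 = [4,6)
        int[][] spans = { {0, 2}, {3, 5}, {4, 6} };
        System.out.println(highlight("可爱的小女生", spans)); // prints [可爱]的[小女][生]
    }
}
```

An alternative design would merge overlapping spans into one highlight
("小女生") before rendering; the sketch keeps the per-token behaviour of the
change above instead.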
>>
>> --
>>
>> Yizhong Zhuang
>> Beijing University of Posts and Telecommunications
>> Email:[email protected]
>>
>
>
>
> --
> Open Source Solutions for Text Engineering
>
> http://digitalpebble.blogspot.com/
> http://www.digitalpebble.com
>



-- 
Yizhong Zhuang
Beijing University of Posts and Telecommunications
Email:[email protected]
