[ 
https://issues.apache.org/jira/browse/NUTCH-787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12830534#action_12830534
 ] 

Dawid Weiss commented on NUTCH-787:
-----------------------------------

Definitely not an easy thing to do. I need to finish for today, but the code 
compiles. Here's a brief summary of the changes:

- modified all filters and streams to use token attributes instead of raw 
Tokens. In many places I tried to be as unintrusive as possible so that the 
patch can be easily reviewed and accepted; improvements resulting from the new 
API can follow,

- replaced deprecated constants with their new equivalents (UN_TOKENIZED, etc.),

- there are no compressed fields any more, so the related code is commented out.
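
On the compressed fields point: if the stored data still needs to be compressed, 
one option is to compress the bytes by hand and store them as a plain binary 
field. Below is a minimal sketch of that idea using only java.util.zip. The 
class name FieldCompression and its methods are made up for illustration and 
are not part of the patch; Lucene also ships a CompressionTools helper in 
org.apache.lucene.document that serves the same purpose.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper (not from the patch): compresses field content manually
// so it can be stored as an ordinary binary field.
public class FieldCompression {

    // Deflate the input bytes and return the compressed form.
    public static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Inflate bytes previously produced by compress().
    public static byte[] decompress(byte[] input) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length * 2);
        byte[] buf = new byte[1024];
        while (!inflater.finished()) {
            int n = inflater.inflate(buf);
            // No progress and nothing left to read: the input is truncated or invalid.
            if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                inflater.end();
                throw new DataFormatException("truncated or invalid input");
            }
            out.write(buf, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        byte[] original = "page content to be stored".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println("round trip ok: " + Arrays.equals(original, restored));
    }
}
```

The compressed bytes would then go into a stored binary field and be inflated 
again at read time. Whether Nutch needs this at all is a separate question.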

It would be great if as many people with Lucene/Nutch knowledge as possible 
could go through the patch and point out potential problems. At the moment one 
core test fails for me: TestIndexSorter. I don't know whether the difference 
in boosts is a result of Lucene changes or of a bug I introduced somewhere 
along the way. 



> Upgrade Lucene to 3.0.0.
> ------------------------
>
>                 Key: NUTCH-787
>                 URL: https://issues.apache.org/jira/browse/NUTCH-787
>             Project: Nutch
>          Issue Type: Task
>          Components: build
>            Reporter: Dawid Weiss
>            Priority: Trivial
>         Attachments: NUTCH-787.patch
>
>


