Hey Julien,

I'll take care of the ones with my name on them below (NUTCH-564 and NUTCH-825).

Cheers,
Chris

On Jan 5, 2011, at 8:36 AM, Julien Nioche (JIRA) wrote:

> 
>    [ https://issues.apache.org/jira/browse/NUTCH-951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977828#action_12977828 ]
> 
> Julien Nioche commented on NUTCH-951:
> -------------------------------------
> 
> NUTCH-894 : has been written for 2.0 and would need some effort to backport 
> to 1.3 
> I suggest that we leave it there. 
> 
> The list of things that IMHO are worth porting to 1.3 are now 
> 
>    * NUTCH-564 External parser supports encoding attribute (Antony Bowesman, 
> mattmann)
>    * NUTCH-825 Publish nutch artifacts to central maven repository (mattmann)
>    * NUTCH-872 Change the default fetcher.parse to FALSE (ab).
>    * NUTCH-876 Remove remaining robots/IP blocking code in lib-http (ab)
>    * NUTCH-884 FetcherJob should run more reduce tasks than default (ab)
>    * NUTCH-921 Reduce dependency of Nutch on config files (ab)
> 
> Any volunteers?
> 
> 
>> Backport changes from 2.0 into 1.3
>> ----------------------------------
>> 
>>                Key: NUTCH-951
>>                URL: https://issues.apache.org/jira/browse/NUTCH-951
>>            Project: Nutch
>>         Issue Type: Task
>>   Affects Versions: 1.3
>>           Reporter: Julien Nioche
>>           Priority: Blocker
>>            Fix For: 1.3
>> 
>> 
>> I've compared the changes from 2.0 with 1.3 and found the following 
>> differences (excluding anything specific to 2.0/GORA)
>>    *  NUTCH-564 External parser supports encoding attribute (Antony 
>> Bowesman, mattmann)
>>    *  NUTCH-714 Need a SFTP and SCP Protocol Handler (Sanjoy Ghosh, mattmann)
>>    *  NUTCH-825 Publish nutch artifacts to central maven repository 
>> (mattmann)
>>    *  NUTCH-851 Port logging to slf4j (jnioche)
>>    *  NUTCH-861 Renamed HTMLParseFilter into ParseFilter
>>    *  NUTCH-872 Change the default fetcher.parse to FALSE (ab).
>>    *  NUTCH-876 Remove remaining robots/IP blocking code in lib-http (ab)
>>    *  NUTCH-880 REST API for Nutch (ab)
>>    *  NUTCH-883 Remove unused parameters from nutch-default.xml (jnioche)
>>    *  NUTCH-884 FetcherJob should run more reduce tasks than default (ab)
>>    *  NUTCH-886 A .gitignore file for Nutch (dogacan)
>>    *  NUTCH-894 Move statistical language identification from indexing to 
>> parsing step
>>    *  NUTCH-921 Reduce dependency of Nutch on config files (ab)
>>    *  NUTCH-930 Remove remaining dependencies on Lucene API (ab)
>>    *  NUTCH-931 Simple admin API to fetch status and stop the service (ab)
>>    *  NUTCH-932 Bulk REST API to retrieve crawl results as JSON (ab)
>> Let's go through this and decide what to port to 1.3
> 
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
> 


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattm...@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
