+1

Regards

----- Chris Mattmann <[email protected]> wrote:
> ++1!
> 
> Sounds great.
> 
> Cheers,
> Chris
> 
> From: Sebastian Nagel <[email protected]>
> Reply-To: "[email protected]" <[email protected]>
> Date: Monday, June 11, 2018 at 7:35 AM
> To: "[email protected]" <[email protected]>
> Cc: "[email protected]" <[email protected]>
> Subject: Preparing to release Nutch 1.15 ?
> 
> Hi all,
> 
> almost 80 fixes and improvements are done now and include:
> 
> NUTCH-2375: upgrade to new MapReduce API
>   It was a huge change affecting more than 10,000 lines of code. Thanks, Omkar!
>   Well, there have been some regressions but those are resolved now. Tests in
>   pseudo-distributed mode [1] succeeded, as did a mid-size test crawl (180
>   million pages) on a Hadoop cluster.
>   It would be great if anybody is able to test the Nutch master in combination
>   with a non-HDFS file system (e.g. s3://)! Please let us know whether this
>   works. Thanks!
> 
> NUTCH-1480: Multiple index writer instances with different configurations
>   Thanks to Roannel it's now possible to index into multiple Solr or
>   Elasticsearch instances. With NUTCH- (needs to be reviewed), the routing of
>   documents to the index will also be configurable.
> 
> NUTCH-2583: Ralf contributed a huge upgrade of dependencies.
>    Nutch now compiles and runs on Java 9 and 10. Only errors in unit tests
>    remain; these need to be addressed in NUTCH-2596.
> 
> And two important issues are almost ready to be committed:
> 
> NUTCH-2549: a long list of fixes and improvements to protocol-http. Thanks to
>    Gerard Bouchard!
> 
> NUTCH-2576: plugin protocol-okhttp, a new HTTP protocol implementation based
>    on the okhttp library. Supports HTTP/2.
> 
> The full list of fixes and improvements is available at [2].
> 
> I plan to work through the remaining 70 open issues over the next few
> days and hope to commit/resolve 15-25 of them and move the remaining
> ones to Nutch 1.16.
> 
> Please vote for issues you want to get included. If there are open
> pull requests, it will help if these can be merged, the unit tests
> pass, and any review comments are addressed. Thanks!
> 
> If there are any objections or blockers, please also let us know!
> 
> I also plan to run a test crawl on Hadoop in the middle of this week,
> but any help in testing is welcome.
> 
> Note that the tutorial needs to be updated (will be done after 1.15
> is finally released) to reflect the changes related to NUTCH-1480.
> 
> Thanks,
> Sebastian
> 
> [1] https://github.com/sebastian-nagel/nutch-test-single-node-cluster
> [2] https://issues.apache.org/jira/projects/NUTCH/versions/12342302

