Re: [DISCUSS] Release Trunk
Hi,

+1 to release soon (this year, or early next year).

Julien wrote: "and probably a few others but they could also be done later."

At least these should be done before releasing:

NUTCH-1646 IndexerMapReduce to consider DB status
NUTCH-1413 Record response time

Sebastian

On 11/28/2013 05:49 PM, Julien Nioche wrote:

Hi Lewis

We've done quite a few things in 1.x since the previous release (e.g. generic deduplication, removing the indexer.solr package, etc.) and the next 2.x release will come after the changes to GORA have been made, tested and used on the Nutch side, so that could be quite a while. I am neutral as to whether we should do a 1.x release now. There are some minor issues that we could fix in 1.x before the next release, like:

* https://issues.apache.org/jira/browse/NUTCH-1360
* https://issues.apache.org/jira/browse/NUTCH-1676

and probably a few others, but they could also be done later. Let's hear what others think.

Thanks

Julien

On 28 November 2013 16:34, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote:

Hi Folks,

Thread says it all. There are some hot tickets over in Gora right now, so I think holding off a while on a 2.x release would be wise. I can spin the RC for trunk tonight/tomorrow/weekend if we get the thumbs up.

Ta

Lewis

--
/Lewis/

--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
RE: [DISCUSS] Release Trunk
Hi!

Well, we've been doing a release roughly every 6 months for over three years now, so it's about time indeed. I'll look into some open issues I have left when I have some spare time in the office. Hopefully soon, but I'm not too sure that is going to happen.

Cheers!

-Original message-
From: Sebastian Nagel wastl.na...@googlemail.com
Sent: Monday 2nd December 2013 22:02
To: dev@nutch.apache.org
Subject: Re: [DISCUSS] Release Trunk
[jira] [Created] (NUTCH-1678) Remove dependency on org.apache.oro
James Sullivan created NUTCH-1678:

Summary: Remove dependency on org.apache.oro
Key: NUTCH-1678
URL: https://issues.apache.org/jira/browse/NUTCH-1678
Project: Nutch
Issue Type: Improvement
Components: parser
Affects Versions: 2.2
Reporter: James Sullivan
Priority: Minor

org.apache.oro has been archived for three years and it may be good to remove the dependency, as Java has had built-in regexes for quite some time now. There doesn't seem to have been any specific Perl5 functionality needed in the regexes, so unless there are specific threading or performance reasons for continuing to use oro, it may be time to lose the dependency. The attached patch needs to be checked thoroughly, as I am rusty with Java and the unit tests are sparse.

--
This message was sent by Atlassian JIRA (v6.1#6144)
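For reference, a minimal sketch of the kind of java.util.regex code that could stand in for an oro-based match. It is not taken from the attached patch: the class name, method name and charset pattern are hypothetical and only illustrate the JDK API. On the threading question raised above, the JDK documents Pattern as immutable and safe to share between threads, while Matcher is not, so the sketch compiles once and creates a matcher per call.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; not part of Nutch or of the attached patch.
public class OroMigrationSketch {

    // Compiled once and shared: Pattern instances are immutable and thread-safe.
    // The pattern itself is an illustrative assumption, not a Nutch regex.
    private static final Pattern CHARSET =
        Pattern.compile("charset=\\s*([a-z][_\\-0-9a-z]*)", Pattern.CASE_INSENSITIVE);

    /** Returns the charset named in a Content-Type value, or null if there is none. */
    public static String extractCharset(String contentType) {
        // Matcher is not thread-safe, so a fresh one is created for every call.
        Matcher m = CHARSET.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extractCharset("text/html; charset=UTF-8")); // prints UTF-8
        System.out.println(extractCharset("text/html"));                // prints null
    }
}
```

Whether this performs as well as oro for Nutch's workloads is exactly the open question in the issue and would need measuring.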
[jira] [Updated] (NUTCH-1678) Remove dependency on org.apache.oro
[ https://issues.apache.org/jira/browse/NUTCH-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Sullivan updated NUTCH-1678:

Attachment: 2.x.patch

parse/OutlinkExtractor, index-more, parse-js, urlnormalizer-basic. Needs to be looked over and tested first.

Remove dependency on org.apache.oro

Key: NUTCH-1678
URL: https://issues.apache.org/jira/browse/NUTCH-1678
Project: Nutch
Issue Type: Improvement
Components: parser
Affects Versions: 2.2
Reporter: James Sullivan
Priority: Minor
Labels: newbie, patch
Attachments: 2.x.patch
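Since parse/OutlinkExtractor is among the places listed, here is a rough sketch of pulling URLs out of plain text with java.util.regex alone. The class name and the deliberately simple URL pattern are assumptions for illustration; they are not the regex Nutch uses and not the contents of 2.x.patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class; not the Nutch OutlinkExtractor.
public class OutlinkSketch {

    // Simplified, assumed pattern: http(s) scheme followed by any run of
    // characters that are not whitespace, angle brackets or quotes.
    // Trailing punctuation such as "." or "," is NOT stripped by this sketch.
    private static final Pattern URL =
        Pattern.compile("\\bhttps?://[^\\s<>\"']+", Pattern.CASE_INSENSITIVE);

    /** Returns every substring of text that matches the URL pattern, in order. */
    public static List<String> extractUrls(String text) {
        List<String> urls = new ArrayList<>();
        Matcher m = URL.matcher(text);
        // find() advances through the input; group() returns the current match.
        while (m.find()) {
            urls.add(m.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        String text = "See http://nutch.apache.org/ and "
            + "https://issues.apache.org/jira/browse/NUTCH-1678 for details";
        for (String url : extractUrls(text)) {
            System.out.println(url);
        }
    }
}
```

A real replacement would need the existing unit tests, sparse as they are, run against both implementations to confirm that the same outlinks are found.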
[jira] [Updated] (NUTCH-1678) Remove dependency on org.apache.oro
[ https://issues.apache.org/jira/browse/NUTCH-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Sullivan updated NUTCH-1678:

Description: org.apache.oro has been archived for three years and it may be good to remove the dependency as Java has had built in regexes for quite some time now. There don't seem to have been any specific Perl5 functionality needed in the regexes so unless there are specific threading or performance reasons for continuing to use oro it may be time to lose the dependency. Attached patch needs to be checked thoroughly as I am rusty with Java and the unit tests are sparse.

(was: org.apache.oro has been archived for three years and it may be good to remove the dependency as Java has had a built in regexes for quite some time now. There don't seem to have been any specific Perl5 functionality needed in the regexes so unless there are specific threading or performance reasons for continuing to use oro it may be time to lose the dependency. Attached patch needs to be checked thoroughly as I am rusty with Java and the unit tests are sparse.)