Re: [DISCUSS] Release Trunk

2013-12-02 Thread Sebastian Nagel
Hi,

+1 to release soon (this year, or early next year)

 and probably a few others but they could also be done later.
At least, these should be done before releasing:
NUTCH-1646 IndexerMapReduce to consider DB status
NUTCH-1413 Record response time

Sebastian

On 11/28/2013 05:49 PM, Julien Nioche wrote:
 Hi Lewis
 
 We've done quite a few things in 1.x since the previous release (e.g. generic deduplication,
 removing the indexer.solr package, etc.) and the next 2.x release will come after the changes
 to GORA have been made, tested and used on the Nutch side, so that could be quite a while.
 
 I am neutral as to whether we should do a 1.x release now. There are some minor issues that
 we could address in 1.x before the next release, such as:
 * https://issues.apache.org/jira/browse/NUTCH-1360
 * https://issues.apache.org/jira/browse/NUTCH-1676
 and probably a few others but they could also be done later.
 
 Let's hear what others think.
 
 Thanks
 
 Julien
 
 
 
 
 On 28 November 2013 16:34, Lewis John Mcgibbney lewis.mcgibb...@gmail.com wrote:
 
 Hi Folks,
 Thread says it all.
 There are some hot tickets over in Gora right now, so I think holding off for the next
 while on a 2.x release would be wise.
 I can spin the RC for trunk tonight/tomorrow/weekend if we get the thumbs up.
 Ta
 Lewis
 
 -- 
 /Lewis/
 
 
 
 
 -- 
 Open Source Solutions for Text Engineering
 
 http://digitalpebble.blogspot.com/
 http://www.digitalpebble.com
 http://twitter.com/digitalpebble



RE: [DISCUSS] Release Trunk

2013-12-02 Thread Markus Jelsma
Hi!

Well, we've been doing a release roughly every 6 months for over three years 
now, so it's about time indeed. I'll look into some open issues I have left when 
I have some spare time in the office. Hopefully soon, but I'm not too sure 
that's going to happen.

Cheers!

-----Original message-----
 From: Sebastian Nagel wastl.na...@googlemail.com
 Sent: Monday 2nd December 2013 22:02
 To: dev@nutch.apache.org
 Subject: Re: [DISCUSS] Release Trunk
 
 Hi,
 
 +1 to release soon (this year, or early next year)
 
  and probably a few others but they could also be done later.
 At least, these should be done before releasing:
 NUTCH-1646 IndexerMapReduce to consider DB status
 NUTCH-1413 Record response time
 
 Sebastian
 


[jira] [Created] (NUTCH-1678) Remove dependency on org.apache.oro

2013-12-02 Thread James Sullivan (JIRA)
James Sullivan created NUTCH-1678:
-

         Summary: Remove dependency on org.apache.oro
             Key: NUTCH-1678
             URL: https://issues.apache.org/jira/browse/NUTCH-1678
         Project: Nutch
      Issue Type: Improvement
      Components: parser
Affects Versions: 2.2
        Reporter: James Sullivan
        Priority: Minor


org.apache.oro has been archived for three years, and it may be good to remove 
the dependency since Java has had built-in regexes for quite some time now. 
There doesn't seem to have been any specific Perl5 functionality needed in the 
regexes, so unless there are specific threading or performance reasons for 
continuing to use oro, it may be time to lose the dependency. The attached patch 
needs to be checked thoroughly, as I am rusty with Java and the unit tests are 
sparse. 
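
As a rough illustration of what relying on the JDK's built-in regexes could look like, here is a minimal sketch. It is not taken from the attached patch: the class name, the method name and the URL pattern are hypothetical and much simpler than anything Nutch actually uses. It covers roughly the job that ORO's Perl5Compiler and Perl5Matcher do, using only java.util.regex.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexOutlinkSketch {

    // Hypothetical, deliberately simple URL pattern; NOT the expression used by Nutch.
    // Compiled once: java.util.regex.Pattern is immutable and safe to share across threads.
    private static final Pattern URL_PATTERN =
        Pattern.compile("https?://[^\\s<>\"']+", Pattern.CASE_INSENSITIVE);

    /** Returns every substring of text matched by URL_PATTERN, in order of appearance. */
    public static List<String> findUrls(String text) {
        List<String> urls = new ArrayList<>();
        // Matcher is stateful and not thread-safe, so a fresh one is created per call.
        Matcher matcher = URL_PATTERN.matcher(text);
        while (matcher.find()) {
            urls.add(matcher.group());
        }
        return urls;
    }

    public static void main(String[] args) {
        System.out.println(findUrls(
            "See http://nutch.apache.org/ and HTTPS://issues.apache.org/jira for details."));
    }
}
```

On the threading question raised above: a compiled Pattern can be shared freely between threads, while each Matcher should stay confined to one thread. Whether performance is comparable to oro for the real Nutch expressions would still need to be measured.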



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (NUTCH-1678) Remove dependency on org.apache.oro

2013-12-02 Thread James Sullivan (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Sullivan updated NUTCH-1678:
--

Attachment: 2.x.patch

parse/OutlinkExtractor
index-more
parse-js
urlnormalizer-basic

Needs to be looked over and tested first.

 Remove dependency on org.apache.oro
 ---

              Key: NUTCH-1678
              URL: https://issues.apache.org/jira/browse/NUTCH-1678
          Project: Nutch
       Issue Type: Improvement
       Components: parser
 Affects Versions: 2.2
         Reporter: James Sullivan
         Priority: Minor
           Labels: newbie, patch
      Attachments: 2.x.patch




[jira] [Updated] (NUTCH-1678) Remove dependency on org.apache.oro

2013-12-02 Thread James Sullivan (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Sullivan updated NUTCH-1678:
--

Description: org.apache.oro has been archived for three years, and it may be 
good to remove the dependency since Java has had built-in regexes for quite some 
time now. There doesn't seem to have been any specific Perl5 functionality needed 
in the regexes, so unless there are specific threading or performance reasons 
for continuing to use oro, it may be time to lose the dependency. The attached 
patch needs to be checked thoroughly, as I am rusty with Java and the unit tests 
are sparse.   (was: org.apache.oro has been archived for three years and it may be 
good to remove the dependency as Java has had a built in regexes for quite some 
time now. There don't seem to have been any specific Perl5 functionality needed 
in the regexes so unless there are specific threading or performance reasons 
for continuing to use oro it may be time to lose the dependency. Attached patch 
needs to be checked thoroughly as I am rusty with Java and the unit tests are 
sparse. )
