Re: Build failed in Jenkins: Nutch-trunk #1702

2011-12-26 Thread Lewis John Mcgibbney
No hassle Markus,

I am fully supportive of your initiatives to upgrade to the newer
Hadoop APIs. This was initially the reason I was trying to get
BIGTOP-284 [1] off the ground. As we all know, the Hadoop ecosystem is
moving at an alarming rate, so it's great to have guys with the drive to
keep up with it... or at least to try :0)

At least we know the failing tests can be resolved easily.

Thanks Markus.

[1] https://issues.apache.org/jira/browse/BIGTOP-284

On Mon, Dec 26, 2011 at 11:34 AM, Markus Jelsma
markus.jel...@openindex.io wrote:
 There was an upgrade to 0.22.0 but I'm likely going to downgrade it again to
 the latest 0.20.x version. Initial tests showed good results but 0.22 is
 very unstable when fetching large lists. I've asked about this on the mapred
 user list but have had no response so far.

 cheers

 Hi Guys,

 Our trunk builds have been broken since migrating to new Hadoop 0.20.2
 and migrating CrawlDBScanner to new MR API e.g. trunk build [1] 1698.
 Looking at the stack trace, I'm assuming that this has to do with how
 we are specifying the new file reads. Hopefully this shouldn't be too
 hard to solve so maybe we can get on to it at some stage in the near
 future.

 I just want to say Merry Christmas to EVERYONE celebrating and happy
 holidays to everyone else who may not be.

 Best

 Lewis

 [1] https://builds.apache.org/view/M-R/view/Nutch/job/Nutch-trunk/1698/

 On Sat, Dec 24, 2011 at 7:36 AM, Apache Jenkins Server
 jenk...@builds.apache.org wrote:
  See https://builds.apache.org/job/Nutch-trunk/1702/changes

  Changes:

  [markus] Updated pom to reflect Hadoop upgrade

  --
  [...truncated 2836 lines...]

  resolve-default:
  [ivy:resolve] :: loading settings :: file = /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/ivy/ivysettings.xml

  compile:
      [echo] Compiling plugin: urlfilter-validator
     [javac] /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/src/plugin/build-plugin.xml:117: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
     [javac] Compiling 1 source file to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/urlfilter-validator/classes

  jar:
       [jar] Building jar: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/urlfilter-validator/urlfilter-validator.jar

  deps-test:

  deploy:
      [copy] Copying 1 file to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/plugins/urlfilter-validator

  copy-generated-lib:
      [copy] Copying 1 file to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/plugins/urlfilter-validator

  init:
     [mkdir] Created dir: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/urlmeta
     [mkdir] Created dir: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/urlmeta/classes
     [mkdir] Created dir: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/urlmeta/test
     [mkdir] Created dir: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/plugins/urlmeta

  init-plugin:

  deps-jar:

  clean-lib:

  resolve-default:
  [ivy:resolve] :: loading settings :: file = /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/ivy/ivysettings.xml

  compile:
      [echo] Compiling plugin: urlmeta
     [javac] /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/src/plugin/build-plugin.xml:117: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
     [javac] Compiling 2 source files to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/urlmeta/classes

  jar:
       [jar] Building jar: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/urlmeta/urlmeta.jar

  deps-test:

  deploy:
      [copy] Copying 1 file to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/plugins/urlmeta

  copy-generated-lib:
      [copy] Copying 1 file to /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/plugins/urlmeta

  init:
     [mkdir] Created dir: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/urlnormalizer-basic
     [mkdir] Created dir: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/Nutch-trunk/trunk/build/urlnormalizer-basic/classes
     [mkdir] Created dir:

[jira] [Created] (NUTCH-1236) Add link to site documentation to download older versions of Nutch.

2011-12-26 Thread Lewis John McGibbney (Created) (JIRA)
Add link to site documentation to download older versions of Nutch.
---

 Key: NUTCH-1236
 URL: https://issues.apache.org/jira/browse/NUTCH-1236
 Project: Nutch
  Issue Type: Improvement
  Components: documentation
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Minor


As we are moving towards 2012 I thought it best to clear out my mailbox. I 
found an older email which requested the link to download older versions of 
Nutch. This was discussed and I think it would be best to get the link added to 
the site documentation. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (NUTCH-1236) Add link to site documentation to download older versions of Nutch.

2011-12-26 Thread Lewis John McGibbney (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1236:


Attachment: NUTCH-1236.patch

This small patch simply adds links to older Nutch releases as well as a link to 
the trunk Sonar Analysis page. I will commit this once the svn site 
area has been updated to accommodate 1.4 changes. Thanks

 Add link to site documentation to download older versions of Nutch.
 ---

 Key: NUTCH-1236
 URL: https://issues.apache.org/jira/browse/NUTCH-1236
 Project: Nutch
  Issue Type: Improvement
  Components: documentation
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Minor
 Attachments: NUTCH-1236.patch


 As we are moving towards 2012 I thought it best to clear out my mailbox. I 
 found an older email which requested the link to download older versions of 
 Nutch. This was discussed and I think it would be best to get the link added 
 to the site documentation. 





[jira] [Updated] (NUTCH-1236) Add link to site documentation to download older versions of Nutch.

2011-12-26 Thread Lewis John McGibbney (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1236?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1236:


Patch Info: Patch Available

 Add link to site documentation to download older versions of Nutch.
 ---

 Key: NUTCH-1236
 URL: https://issues.apache.org/jira/browse/NUTCH-1236
 Project: Nutch
  Issue Type: Improvement
  Components: documentation
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Minor
 Attachments: NUTCH-1236.patch


 As we are moving towards 2012 I thought it best to clear out my mailbox. I 
 found an older email which requested the link to download older versions of 
 Nutch. This was discussed and I think it would be best to get the link added 
 to the site documentation. 





[jira] [Commented] (NUTCH-1217) Update NOTICE.txt to drop some copyrights

2011-12-26 Thread Lewis John McGibbney (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175949#comment-13175949
 ] 

Lewis John McGibbney commented on NUTCH-1217:
-

Hi Guys, as I've looked deeper into this, the first patch is a load of drivel. 
As we are pulling the overwhelming majority of our dependencies from upstream 
repositories using Ivy, there is no need to include them in the NOTICE.txt 
declarations. We only ship with the JavaSWF & Automaton libraries (both of which 
are plugins). I'll commit this, do the same for Nutchgora, then shut this one 
off.

One last question: is anyone aware if our licences for the above two packages 
are OK? I am not aware but I am more than happy to have a word with the authors 
to find out. Thanks

 Update NOTICE.txt to drop some copyrights
 -

 Key: NUTCH-1217
 URL: https://issues.apache.org/jira/browse/NUTCH-1217
 Project: Nutch
  Issue Type: Improvement
  Components: documentation
Affects Versions: 1.4
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: nutchgora, 1.5

 Attachments: NUTCH-1217-trunk.patch


 We have many references to software copyrights which should be dropped. Most 
 of these relate to the Lucene legacy days.
 -Carrot2
 -saxpath
 -jaxen
 -jdom
 -snowball
 -violinstrings
 -Jena
 -bouncycastle
 -fontbox
 -jempbox
 -pdfbox
 -rome
 Also some need to be added
 -slf4j
 -activation
 -mortbay (jetty)
 -jline
 -junit
 -stax
 -wstx
 As I am unfamiliar with most of these, and as it is important to include all 
 references to software outside of the ASF, I would appreciate it if this list 
 could act as a starting point for completing this issue.





[jira] [Commented] (NUTCH-1081) ant tests fail

2011-12-26 Thread Lewis John McGibbney (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175956#comment-13175956
 ] 

Lewis John McGibbney commented on NUTCH-1081:
-

Hi Ferdy. There have been almost no problems within the CI testing environment 
for a number of weeks/months. Any failures seem to have been down to the 
project building on Ubuntu slaves as opposed to Solaris slaves; those failures 
are a result of incorrect envars being specified. I've added some more 
functionality to the nutchgora build characteristics, e.g. publishing the JUnit 
test result report and the Javadoc. So as agreed we will keep an eye on this.

 ant tests fail 
 ---

 Key: NUTCH-1081
 URL: https://issues.apache.org/jira/browse/NUTCH-1081
 Project: Nutch
  Issue Type: Bug
  Components: fetcher, generator, injector, storage
Affects Versions: nutchgora
 Environment: Ubuntu release 11.04 (natty)
 Kernel Linux 2.6.38-10-generic
 GNOME 2.32.1
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Critical
 Fix For: nutchgora


 The following tests fail when running ant test on trunk 2.0
 {code}
 [junit] Running org.apache.nutch.api.TestAPI
 [junit] Tests run: 4, Failures: 1, Errors: 0, Time elapsed: 11.028 sec
 [junit] Test org.apache.nutch.api.TestAPI FAILED
 [junit] Running org.apache.nutch.crawl.TestGenerator
 [junit] Tests run: 4, Failures: 0, Errors: 4, Time elapsed: 0.478 sec
 [junit] Test org.apache.nutch.crawl.TestGenerator FAILED
 [junit] Running org.apache.nutch.crawl.TestInjector
 [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.474 sec
 [junit] Test org.apache.nutch.crawl.TestInjector FAILED
 [junit] Running org.apache.nutch.fetcher.TestFetcher
 [junit] Tests run: 2, Failures: 0, Errors: 2, Time elapsed: 0.526 sec
 [junit] Test org.apache.nutch.fetcher.TestFetcher FAILED
 [junit] Running org.apache.nutch.storage.TestGoraStorage
 [junit] Tests run: 2, Failures: 0, Errors: 2, Time elapsed: 0.468 sec
 [junit] Test org.apache.nutch.storage.TestGoraStorage FAILED
 {code}
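
For anyone triaging output like the above, here is a minimal Python sketch (not part of Nutch or its build) that tallies the `[junit]` summary lines per suite. It assumes only the line layout shown in the issue description; the name `summarize_junit` is made up for illustration, and the sample is a two-suite excerpt of the output above. JUnit reports assertion failures as Failures and unexpected exceptions as Errors, so suites showing only Errors in well under a second are plausibly failing during setup rather than in the logic under test, though that reading is an assumption.

```python
import re

# Ant's per-suite lines, as they appear in the issue description.
RUNNING = re.compile(r"\[junit\] Running (\S+)")
SUMMARY = re.compile(r"\[junit\] Tests run: (\d+), Failures: (\d+), Errors: (\d+)")


def summarize_junit(log_text):
    """Return {test_class: (run, failures, errors)} for each suite in the log."""
    results = {}
    current = None
    for line in log_text.splitlines():
        running = RUNNING.search(line)
        if running:
            current = running.group(1)
            continue
        summary = SUMMARY.search(line)
        if summary and current is not None:
            results[current] = tuple(int(n) for n in summary.groups())
            current = None
    return results


# Excerpt copied from the issue description.
LOG = """\
 [junit] Running org.apache.nutch.api.TestAPI
 [junit] Tests run: 4, Failures: 1, Errors: 0, Time elapsed: 11.028 sec
 [junit] Test org.apache.nutch.api.TestAPI FAILED
 [junit] Running org.apache.nutch.crawl.TestGenerator
 [junit] Tests run: 4, Failures: 0, Errors: 4, Time elapsed: 0.478 sec
 [junit] Test org.apache.nutch.crawl.TestGenerator FAILED
"""

results = summarize_junit(LOG)
for name, (run, failures, errors) in results.items():
    print(f"{name}: run={run} failures={failures} errors={errors}")
```

Separating Failures from Errors this way makes it easier to see which suites need a code fix and which may only need the build environment corrected.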





[jira] [Updated] (NUTCH-1217) Update NOTICE.txt to drop some copyrights

2011-12-26 Thread Lewis John McGibbney (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney updated NUTCH-1217:


Attachment: NUTCH-1217-trunk-v2.patch

new patch which greatly simplifies the trunk NOTICE.txt file.

 Update NOTICE.txt to drop some copyrights
 -

 Key: NUTCH-1217
 URL: https://issues.apache.org/jira/browse/NUTCH-1217
 Project: Nutch
  Issue Type: Improvement
  Components: documentation
Affects Versions: 1.4
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: nutchgora, 1.5

 Attachments: NUTCH-1217-trunk-v2.patch, NUTCH-1217-trunk.patch


 We have many references to software copyrights which should be dropped. Most 
 of these relate to the Lucene legacy days.
 -Carrot2
 -saxpath
 -jaxen
 -jdom
 -snowball
 -violinstrings
 -Jena
 -bouncycastle
 -fontbox
 -jempbox
 -pdfbox
 -rome
 Also some need to be added
 -slf4j
 -activation
 -mortbay (jetty)
 -jline
 -junit
 -stax
 -wstx
 As I am unfamiliar with most of these, and as it is important to include all 
 references to software outside of the ASF, I would appreciate it if this list 
 could act as a starting point for completing this issue.





[jira] [Resolved] (NUTCH-1217) Update NOTICE.txt to drop some copyrights

2011-12-26 Thread Lewis John McGibbney (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney resolved NUTCH-1217.
-

Resolution: Fixed

Committed @ revision 1224748 in trunk
Committed @ revision 1224750 in nutchgora branch



 Update NOTICE.txt to drop some copyrights
 -

 Key: NUTCH-1217
 URL: https://issues.apache.org/jira/browse/NUTCH-1217
 Project: Nutch
  Issue Type: Improvement
  Components: documentation
Affects Versions: 1.4
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: nutchgora, 1.5

 Attachments: NUTCH-1217-trunk-v2.patch, NUTCH-1217-trunk.patch


 We have many references to software copyrights which should be dropped. Most 
 of these relate to the Lucene legacy days.
 -Carrot2
 -saxpath
 -jaxen
 -jdom
 -snowball
 -violinstrings
 -Jena
 -bouncycastle
 -fontbox
 -jempbox
 -pdfbox
 -rome
 Also some need to be added
 -slf4j
 -activation
 -mortbay (jetty)
 -jline
 -junit
 -stax
 -wstx
 As I am unfamiliar with most of these, and as it is important to include all 
 references to software outside of the ASF, I would appreciate it if this list 
 could act as a starting point for completing this issue.





[jira] [Closed] (NUTCH-1217) Update NOTICE.txt to drop some copyrights

2011-12-26 Thread Lewis John McGibbney (Closed) (JIRA)

 [ 
https://issues.apache.org/jira/browse/NUTCH-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lewis John McGibbney closed NUTCH-1217.
---


closing this one off subject to existing licences being OK. We can reopen if 
deemed an issue. Thanks

 Update NOTICE.txt to drop some copyrights
 -

 Key: NUTCH-1217
 URL: https://issues.apache.org/jira/browse/NUTCH-1217
 Project: Nutch
  Issue Type: Improvement
  Components: documentation
Affects Versions: 1.4
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: nutchgora, 1.5

 Attachments: NUTCH-1217-trunk-v2.patch, NUTCH-1217-trunk.patch


 We have many references to software copyrights which should be dropped. Most 
 of these relate to the Lucene legacy days.
 -Carrot2
 -saxpath
 -jaxen
 -jdom
 -snowball
 -violinstrings
 -Jena
 -bouncycastle
 -fontbox
 -jempbox
 -pdfbox
 -rome
 Also some need to be added
 -slf4j
 -activation
 -mortbay (jetty)
 -jline
 -junit
 -stax
 -wstx
 As I am unfamiliar with most of these, and as it is important to include all 
 references to software outside of the ASF, I would appreciate it if this list 
 could act as a starting point for completing this issue.





[jira] [Commented] (NUTCH-1218) Improve trunk API documentation

2011-12-26 Thread Lewis John McGibbney (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175966#comment-13175966
 ] 

Lewis John McGibbney commented on NUTCH-1218:
-

Does anyone have any objections to me hacking away at this, making commits when 
I can? The intention is to work my way through the core classes, providing a 
description of each package, then get into more detail with individual classes 
within the 'core' bunch of classes. 

After this I'll move on to the plugins. 

After that I'll move on to Nutchgora!!!

 Improve trunk API documentation
 ---

 Key: NUTCH-1218
 URL: https://issues.apache.org/jira/browse/NUTCH-1218
 Project: Nutch
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 1.4
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 1.5

 Attachments: NUTCH-1218.patch


 The trunk API Java documentation could do with some improving. This issue 
 should track that. It should however not seek to change any functionality 
 within the codebase, only to substantiate and improve the existing 
 documentation.  





[jira] [Commented] (NUTCH-1218) Improve trunk API documentation

2011-12-26 Thread Lewis John McGibbney (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175967#comment-13175967
 ] 

Lewis John McGibbney commented on NUTCH-1218:
-

Another thing: even if I make the changes to trunk, it would be great to view 
them dynamically on the trunk Javadoc site [1], e.g. publish them after every 
commit to see the actual changes at incremental stages. Any advice on this? 
From looking at build.xml, it appears that we only fully publish the 
Javadoc when releasing... Is this the case? If not then can someone please 
advise me otherwise? Thanks guys

 Improve trunk API documentation
 ---

 Key: NUTCH-1218
 URL: https://issues.apache.org/jira/browse/NUTCH-1218
 Project: Nutch
  Issue Type: Sub-task
  Components: documentation
Affects Versions: 1.4
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: 1.5

 Attachments: NUTCH-1218.patch


 The trunk API Java documentation could do with some improving. This issue 
 should track that. It should however not seek to change any functionality 
 within the codebase, only to substantiate and improve the existing 
 documentation.  





Re: [jira] [Commented] (NUTCH-1218) Improve trunk API documentation

2011-12-26 Thread Markus Jelsma
looks fine


 [
 https://issues.apache.org/jira/browse/NUTCH-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13175966#comment-13175966 ]
 
 Lewis John McGibbney commented on NUTCH-1218:
 -
 
 Does anyone have any objections to me hacking away at this, making commits
 when I can? The intention is to work my way through the core classes,
 providing a description of each package, then get into more detail with
 individual classes within the 'core' bunch of classes.
 
 After this I'll move on to the plugins.
 
 After that I'll move on to Nutchgora!!!
 
  Improve trunk API documentation
  ---
  
  Key: NUTCH-1218
  URL: https://issues.apache.org/jira/browse/NUTCH-1218
  
  Project: Nutch
   
   Issue Type: Sub-task
   Components: documentation
 
 Affects Versions: 1.4
 
 Reporter: Lewis John McGibbney
 Assignee: Lewis John McGibbney
 
  Fix For: 1.5
  
  Attachments: NUTCH-1218.patch
  
  The trunk API Java documentation could do with some improving. This issue
  should track that. It should however not seek to change any
  functionality within the codebase, only to substantiate and improve the
  existing documentation.
 


[jira] [Commented] (NUTCH-1138) remove LogUtil from trunk and nutch gora

2011-12-26 Thread Lewis John McGibbney (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176046#comment-13176046
 ] 

Lewis John McGibbney commented on NUTCH-1138:
-

Looking for logging irregularities in hadoop.log after running a medium-sized 
crawl over a mini MR cluster, I'm struggling to see any adverse behaviour produced 
as a result of applying this patch. Most WARNs can be attributed to the new MR API 
and I've a couple of java.net.SocketException: Connection reset ERRORs, which 
we must expect from time to time :0)
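
A check like this can be roughly automated. The following is a minimal Python sketch, not the procedure actually used here: it counts WARN/ERROR/FATAL lines per logger so that runs with and without the patch can be compared. It assumes a `<date> <time> <LEVEL> <logger> - <message>` line layout, which may not match a given log4j configuration; the function name `tally` and the sample lines are invented for illustration and are not taken from a real hadoop.log.

```python
import re
from collections import Counter

# Assumed layout: "<date> <time> <LEVEL> <logger> - <message>".
# Adjust the pattern if the log4j conversion pattern differs.
LINE = re.compile(r"^\S+ \S+ (WARN|ERROR|FATAL)\s+(\S+) - (.*)$")


def tally(lines):
    """Count WARN/ERROR/FATAL lines, keyed by (level, logger)."""
    counts = Counter()
    for line in lines:
        match = LINE.match(line)
        if match:
            level, logger, _message = match.groups()
            counts[(level, logger)] += 1
    return counts


# Invented sample lines, for illustration only.
SAMPLE = [
    "2011-12-26 11:34:00,123 WARN  mapred.JobClient - sample warning text",
    "2011-12-26 11:34:01,456 INFO  fetcher.Fetcher - sample info text",
    "2011-12-26 11:34:02,789 ERROR http.Http - java.net.SocketException: Connection reset",
    "2011-12-26 11:34:03,000 WARN  mapred.JobClient - sample warning text",
]

counts = tally(SAMPLE)
for (level, logger), n in counts.most_common():
    print(level, logger, n)
```

Because `tally` accepts any iterable of lines, an open file object can be passed to it directly, and the resulting counters from two runs can be subtracted to highlight loggers whose WARN or ERROR volume changed.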

 remove LogUtil from trunk and nutch gora
 

 Key: NUTCH-1138
 URL: https://issues.apache.org/jira/browse/NUTCH-1138
 Project: Nutch
  Issue Type: Improvement
Affects Versions: 1.4, nutchgora
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
Priority: Minor
 Fix For: nutchgora, 1.5

 Attachments: Document1.txt, NUTCH-1138-trunk-20111023.patch


 This should move towards the removal of the LogUtil class from both codebases 
 as per comments in NUTCH-1078.





Re: [jira] [Commented] (NUTCH-1138) remove LogUtil from trunk and nutch gora

2011-12-26 Thread Markus Jelsma
Sounds good enough to me. +1, so I'll pass it into production tomorrow. 

 [
 https://issues.apache.org/jira/browse/NUTCH-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176046#comment-13176046 ]
 
 Lewis John McGibbney commented on NUTCH-1138:
 -
 
 Looking for logging irregularities in hadoop.log after running a medium-sized
 crawl over a mini MR cluster, I'm struggling to see any adverse
 behaviour produced as a result of applying this patch. Most WARNs can be
 attributed to the new MR API and I've a couple of java.net.SocketException:
 Connection reset ERRORs, which we must expect from time to time :0)
 
  remove LogUtil from trunk and nutch gora
  
  
  Key: NUTCH-1138
  URL: https://issues.apache.org/jira/browse/NUTCH-1138
  
  Project: Nutch
   
   Issue Type: Improvement
 
 Affects Versions: 1.4, nutchgora
 
 Reporter: Lewis John McGibbney
 Assignee: Lewis John McGibbney
 Priority: Minor
 
  Fix For: nutchgora, 1.5
  
  Attachments: Document1.txt, NUTCH-1138-trunk-20111023.patch
  
  This should move towards the removal of the LogUtil class from both
  codebases as per comments in NUTCH-1078.
 


[jira] [Commented] (NUTCH-1217) Update NOTICE.txt to drop some copyrights

2011-12-26 Thread Hudson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-1217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13176081#comment-13176081
 ] 

Hudson commented on NUTCH-1217:
---

Integrated in Nutch-nutchgora #109 (See 
[https://builds.apache.org/job/Nutch-nutchgora/109/])
commit to address NUTCH-1217 and update to CHANGES.txt

lewismc : 
http://svn.apache.org/viewvc/nutch/branches/nutchgora/viewvc/?view=revroot=revision=1224750
Files : 
* /nutch/branches/nutchgora/CHANGES.txt
* /nutch/branches/nutchgora/NOTICE.txt


 Update NOTICE.txt to drop some copyrights
 -

 Key: NUTCH-1217
 URL: https://issues.apache.org/jira/browse/NUTCH-1217
 Project: Nutch
  Issue Type: Improvement
  Components: documentation
Affects Versions: 1.4
Reporter: Lewis John McGibbney
Assignee: Lewis John McGibbney
 Fix For: nutchgora, 1.5

 Attachments: NUTCH-1217-trunk-v2.patch, NUTCH-1217-trunk.patch


 We have many references to software copyrights which should be dropped. Most 
 of these relate to the Lucene legacy days.
 -Carrot2
 -saxpath
 -jaxen
 -jdom
 -snowball
 -violinstrings
 -Jena
 -bouncycastle
 -fontbox
 -jempbox
 -pdfbox
 -rome
 Also some need to be added
 -slf4j
 -activation
 -mortbay (jetty)
 -jline
 -junit
 -stax
 -wstx
 As I am unfamiliar with most of these, and as it is important to include all 
 references to software outside of the ASF, I would appreciate it if this list 
 could act as a starting point for completing this issue.
