[jira] [Closed] (NUTCH-342) Nutch commands log to nutch/logs/hadoop.logs by default

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil closed NUTCH-342. - Resolution: Won't Fix I agree with Lewis wrt closing this issue as won't fix. Nutch

[jira] [Commented] (NUTCH-213) checkstyle

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645400#comment-13645400 ] Tejas Patil commented on NUTCH-213: --- IMHO, I don't think that we are in dire need to have

[jira] [Commented] (NUTCH-427) protocol-smb: plugin protocol implementing the CIFS/SMB protocol. This protocol allows Nutch to crawl Microsoft Windows Shares remotely using the CIFS/SMB protocol implm

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645403#comment-13645403 ] Tejas Patil commented on NUTCH-427: --- As [~ab] mentioned earlier This plugin uses an LGPL

[jira] [Closed] (NUTCH-449) Format of junit output should be configurable

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil closed NUTCH-449. - Resolution: Implemented I have verified that current trunk and 2.x build files already have this change.

[jira] [Updated] (NUTCH-1514) Phase out the deprecated configuration properties (if possible)

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1514: --- Attachment: NUTCH-1514.2.x.patch Here is a corresponding patch for 2.x. Unless there are any

[jira] [Resolved] (NUTCH-802) Problems managing outlinks with large url length

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil resolved NUTCH-802. --- Resolution: Won't Fix Agree with Markus and Lewis. Hence marking this one as won't fix. If someone

[jira] [Updated] (NUTCH-1543) Display consistent usage of DBUpdaterJob with 1.X

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1543: --- Attachment: NUTCH-1543.v2.patch Hi [~amuseme], The patch will kill the current behavior wherein if

[jira] [Commented] (NUTCH-1273) Fix [deprecation] javac warnings

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645434#comment-13645434 ] Tejas Patil commented on NUTCH-1273: Hey [~lewismc], I still see deprecation warnings

[jira] [Comment Edited] (NUTCH-1273) Fix [deprecation] javac warnings

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645434#comment-13645434 ] Tejas Patil edited comment on NUTCH-1273 at 4/30/13 11:19 AM: --

[jira] [Updated] (NUTCH-1273) Fix [deprecation] javac warnings

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1273: --- Attachment: NUTCH-1249.2.x.v2.patch Fix [deprecation] javac warnings

[jira] [Commented] (NUTCH-1053) Parsing of RSS feeds fails

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645497#comment-13645497 ] Tejas Patil commented on NUTCH-1053: I have figured out the issue here but can't figure

[jira] [Resolved] (NUTCH-1529) Port nutch-mongdb-parser to trunk

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil resolved NUTCH-1529. Resolution: Won't Fix There was a similar jira related to mongodb which led to a discussion that

[jira] [Updated] (NUTCH-1273) Fix [deprecation] javac warnings

2013-04-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1273: Fix Version/s: 2.2 Fix [deprecation] javac warnings

[jira] [Commented] (NUTCH-1273) Fix [deprecation] javac warnings

2013-04-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645683#comment-13645683 ] Lewis John McGibbney commented on NUTCH-1273: - Hi [~tejasp], MimeUtil

[jira] [Updated] (NUTCH-649) Log list of files found but not crawled.

2013-04-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-649: --- Fix Version/s: 2.2 Log list of files found but not crawled.

[jira] [Commented] (NUTCH-649) Log list of files found but not crawled.

2013-04-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645693#comment-13645693 ] Lewis John McGibbney commented on NUTCH-649: What is the overhead of

[jira] [Commented] (NUTCH-1329) parser not extract outlinks to external web sites

2013-04-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645695#comment-13645695 ] Lewis John McGibbney commented on NUTCH-1329: - +1 Close. We can reopen if it

[jira] [Created] (NUTCH-1567) More useful logging for batch id (null) scenario

2013-04-30 Thread Lewis John McGibbney (JIRA)
Lewis John McGibbney created NUTCH-1567: --- Summary: More useful logging for batch id (null) scenario Key: NUTCH-1567 URL: https://issues.apache.org/jira/browse/NUTCH-1567 Project: Nutch

[jira] [Updated] (NUTCH-1545) capture batchId and remove references to segments in 2.x crawl script.

2013-04-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1545: Fix Version/s: (was: 2.3) 2.2 capture batchId and

[jira] [Updated] (NUTCH-1545) capture batchId and remove references to segments in 2.x crawl script.

2013-04-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1545: Patch Info: Patch Available capture batchId and remove references to segments

[jira] [Commented] (NUTCH-1273) Fix [deprecation] javac warnings

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645794#comment-13645794 ] Tejas Patil commented on NUTCH-1273: Hi [~lewismc], Yup. I went through the new API

[jira] [Commented] (NUTCH-649) Log list of files found but not crawled.

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645802#comment-13645802 ] Tejas Patil commented on NUTCH-649: --- Hi [~lewismc], Now that's an awesome idea...

[jira] [Commented] (NUTCH-1273) Fix [deprecation] javac warnings

2013-04-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645857#comment-13645857 ] Lewis John McGibbney commented on NUTCH-1273: - +1 Fix

[jira] [Commented] (NUTCH-649) Log list of files found but not crawled.

2013-04-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645860#comment-13645860 ] Lewis John McGibbney commented on NUTCH-649: Tejas, you can take most of what

[jira] [Commented] (NUTCH-1273) Fix [deprecation] javac warnings

2013-04-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645919#comment-13645919 ] Hudson commented on NUTCH-1273: --- Integrated in Nutch-nutchgora #589 (See

Build failed in Jenkins: Nutch-nutchgora #589

2013-04-30 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-nutchgora/589/changes Changes: [tejasp] NUTCH-1273 Fix [deprecation] javac warnings -- [...truncated 2902 lines...] resolve-default: [ivy:resolve] :: loading settings :: file =

[jira] [Commented] (NUTCH-213) checkstyle

2013-04-30 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645953#comment-13645953 ] Lewis John McGibbney commented on NUTCH-213: +1 close

[jira] [Resolved] (NUTCH-213) checkstyle

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil resolved NUTCH-213. --- Resolution: Won't Fix Closing this one as we don't need this. checkstyle

[jira] [Resolved] (NUTCH-1549) Fix deprecated use of Tika MimeType API in o.a.n.util.MimeUtil

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil resolved NUTCH-1549. Resolution: Fixed Assignee: Tejas Patil Ported the change from 2.x to trunk. Committed @

[jira] [Closed] (NUTCH-1329) parser not extract outlinks to external web sites

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil closed NUTCH-1329. -- Resolution: Cannot Reproduce Closing for now by marking it cannot reproduce parser

[jira] [Commented] (NUTCH-1549) Fix deprecated use of Tika MimeType API in o.a.n.util.MimeUtil

2013-04-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13645986#comment-13645986 ] Hudson commented on NUTCH-1549: --- Integrated in Nutch-trunk #2188 (See

Build failed in Jenkins: Nutch-trunk #2188

2013-04-30 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-trunk/2188/changes Changes: [tejasp] NUTCH-1549 Fix deprecated use of Tika MimeType API in o.a.n.util.MimeUtil -- [...truncated 4039 lines...] [javac]

[jira] [Resolved] (NUTCH-1334) NPE in FetcherOutputFormat

2013-04-30 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil resolved NUTCH-1334. Resolution: Fixed As the patch was trivial (null checks), I went ahead and committed to trunk @

[jira] [Commented] (NUTCH-1334) NPE in FetcherOutputFormat

2013-04-30 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13646046#comment-13646046 ] Hudson commented on NUTCH-1334: --- Integrated in Nutch-trunk #2189 (See

Build failed in Jenkins: Nutch-trunk #2189

2013-04-30 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-trunk/2189/changes Changes: [tejasp] NUTCH-1334 NPE in FetcherOutputFormat -- [...truncated 1096 lines...] AU src/plugin/subcollection/src/java/org/apache/nutch/collection/CollectionManager.java AU

Build failed in Jenkins: Nutch-trunk #2190

2013-04-30 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-trunk/2190/ -- [...truncated 1096 lines...] AU src/plugin/subcollection/src/java/org/apache/nutch/collection/CollectionManager.java AU

Build failed in Jenkins: Nutch-nutchgora #590

2013-04-30 Thread Apache Jenkins Server
See https://builds.apache.org/job/Nutch-nutchgora/590/ -- [...truncated 991 lines...] A src/plugin/language-identifier/src/java/org/apache/nutch/analysis/lang/HTMLLanguageParser.java A