[
https://issues.apache.org/jira/browse/NUTCH-2034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2034:
Fix Version/s: 1.12
> CrawlDB filtered documents coun
[
https://issues.apache.org/jira/browse/NUTCH-2032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2032:
Fix Version/s: 1.12
> Plugin to index the raw content of a readable docum
[
https://issues.apache.org/jira/browse/NUTCH-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2046:
Fix Version/s: 1.12
> The crawl script should be able to skip an initial inject
[
https://issues.apache.org/jira/browse/NUTCH-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-2046:
---
Assignee: Lewis John McGibbney
> The crawl script should be able to skip
[
https://issues.apache.org/jira/browse/NUTCH-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2005:
Labels: gsoc2016 (was: )
> Implement HTrace'ing
[
https://issues.apache.org/jira/browse/NUTCH-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141296#comment-15141296
]
Lewis John McGibbney commented on NUTCH-2144:
-
bq. [~chrismattmann] I am
[
https://issues.apache.org/jira/browse/NUTCH-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141213#comment-15141213
]
Lewis John McGibbney commented on NUTCH-2144:
-
Hi [~thammegowda], limitat
[
https://issues.apache.org/jira/browse/NUTCH-2144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2144:
Fix Version/s: 1.12
> Plugin to override db.ignore.external to exempt interest
[
https://issues.apache.org/jira/browse/NUTCH-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15137375#comment-15137375
]
Lewis John McGibbney commented on NUTCH-1314:
-
Committed @ revisions 172
[
https://issues.apache.org/jira/browse/NUTCH-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-1314:
---
Assignee: Lewis John McGibbney
> Impose a limit on the length of outl
Assistance Applications now open!
1271 by: lewis john mcgibbney
Administrivia:
-
To post to the list, e-mail: priv...@nutch.apache.org
To unsubscribe, e-mail: private-digest-unsubscr...@nutch.apache.org
For additional
[
https://issues.apache.org/jira/browse/NUTCH-1314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15129575#comment-15129575
]
Lewis John McGibbney commented on NUTCH-1314:
-
Yep, if someone
Hi Ammar,
I've given you write permissions for the wiki.
Feel free to create a page for your proposed work at the URL below
https://wiki.apache.org/nutch/GoogleSummerOfCode#A2016
On Fri, Jan 22, 2016 at 4:49 PM, Lewis John Mcgibbney <
lewis.mcgibb...@gmail.com> wrote:
> Hi A
[
https://issues.apache.org/jira/browse/NUTCH-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15118286#comment-15118286
]
Lewis John McGibbney commented on NUTCH-2206:
-
+1 [~sujenshah], th
[
https://issues.apache.org/jira/browse/NUTCH-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15117800#comment-15117800
]
Lewis John McGibbney commented on NUTCH-2206:
-
We should most likely
[
https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-1741.
-
Resolution: Fixed
Committed revision 1726853 in 2.X
Thank you to everyone that
[
https://issues.apache.org/jira/browse/NUTCH-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2208:
Attachment: TEST-org.apache.nutch.crawl.TestGenerator.txt
Attached is full test log
Lewis John McGibbney created NUTCH-2208:
---
Summary: Fix 4 skipped tests in TestGenerator
Key: NUTCH-2208
URL: https://issues.apache.org/jira/browse/NUTCH-2208
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1741:
Attachment: NUTCH-1741v7.patch
Managed to update this at the weekend and forgot to
Lewis John McGibbney created NUTCH-2207:
---
Summary: Remove class duplication and smarten-up
scoring-similarity plugin
Key: NUTCH-2207
URL: https://issues.apache.org/jira/browse/NUTCH-2207
Lewis John McGibbney created NUTCH-2206:
---
Summary: Provide example scoring.similarity.stopword.file
Key: NUTCH-2206
URL: https://issues.apache.org/jira/browse/NUTCH-2206
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15116491#comment-15116491
]
Lewis John McGibbney commented on NUTCH-2206:
-
CC [~sujenshah]
>
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2184:
Attachment: NUTCH-2184v2.patch
Updated patch for trunk. [~markus17], working to
[
https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113423#comment-15113423
]
Lewis John McGibbney commented on NUTCH-1741:
-
I'm nearly finished
[
https://issues.apache.org/jira/browse/NUTCH-1741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1741:
Assignee: cihad güzel
> Support of Sitemaps in Nutch
83.html) and
> doesn't have any reply so far.
> I would appreciate your suggestion.
>
> Warmest regards
> Ammar Shadiq
>
> On Tue, Nov 3, 2015 at 3:28 AM, Lewis John Mcgibbney <
> lewis.mcgibb...@gmail.com> wrote:
>
>> Hi Ammar,
>> I have a few s
[
https://issues.apache.org/jira/browse/NUTCH-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15113380#comment-15113380
]
Lewis John McGibbney commented on NUTCH-2171:
-
Hey [~jorgelbg] feel fre
Hi Folks,
!!Apologies for cross posting!!
The Apache Nutch PMC are pleased to announce the immediate release of
Apache Nutch v2.3.1. We advise all current users and developers of the 2.X
series to upgrade to this release.
Nutch is a well-matured, production-ready Web crawler. Nutch 2.X branch is
[
https://issues.apache.org/jira/browse/NUTCH-2202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110867#comment-15110867
]
Lewis John McGibbney commented on NUTCH-2202:
-
I agree [~robertmeusel],
Hi Folks,
I am bringing this VOTE to a close with the following results
[3] +1 Release this package as Apache Nutch 2.3.1.
Lewis John McGibbney*
Sebastian Nagel*
Chris Mattmann*
[0] -1 Do not release this package because…
*Nutch PMC Member
I am therefore really happy to announce that the VOTE
[
https://issues.apache.org/jira/browse/NUTCH-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110733#comment-15110733
]
Lewis John McGibbney commented on NUTCH-1325:
-
Nice Markus, the conversa
[
https://issues.apache.org/jira/browse/NUTCH-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110702#comment-15110702
]
Lewis John McGibbney commented on NUTCH-1325:
-
What a patch. Real nic
Hi user@, dev@,
PING on the Nutch 2.3.1 RC#2
Would really appreciate anyone who is able to review this release
candidate. It would mean a lot for our 2.X user base.
Thank you
Lewis
On Sun, Jan 10, 2016 at 7:01 AM, Lewis John Mcgibbney <
lewis.mcgibb...@gmail.com> wrote:
> Hi Folks,
>
Lewis John McGibbney created NUTCH-2200:
---
Summary: Establish process for publishing Docker containers
Key: NUTCH-2200
URL: https://issues.apache.org/jira/browse/NUTCH-2200
Project: Nutch
Any others able to review please?
On Sun, Jan 10, 2016 at 7:01 AM, Lewis John Mcgibbney <
lewis.mcgibb...@gmail.com> wrote:
> Hi Folks,
>
> A second candidate for the Nutch 2.3.1 release is available at:
>
> https://dist.apache.org/repos/dist/dev/nutch/2.3.1rc2/
>
>
Hi Seb,
Thanks for taking the time to review the release candidate.
Replies inline
On Tue, Jan 12, 2016 at 10:17 AM, wrote:
> +1
>
> - good signatures
> - tests pass
> - I've successfully run a test crawl (bin/crawl) using HBase 0.98.8
>
> Two minor points:
>
> - CHANGES.txt mentions the rc1 rel
[
https://issues.apache.org/jira/browse/NUTCH-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15091300#comment-15091300
]
Lewis John McGibbney commented on NUTCH-1186:
-
Hi [~markus17] I have sc
Hi Folks,
A second candidate for the Nutch 2.3.1 release is available at:
https://dist.apache.org/repos/dist/dev/nutch/2.3.1rc2/
The release candidate is a zip and tar.gz archive of the sources in:
http://svn.apache.org/repos/asf/nutch/tags/release-2.3.1rc2/
In addition, a staged maven
Lewis John McGibbney created NUTCH-2199:
---
Summary: Documentation for Nutch 2.X REST API
Key: NUTCH-2199
URL: https://issues.apache.org/jira/browse/NUTCH-2199
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1800:
Summary: Documentation for Nutch 1.X REST API (was: Documentation for
Nutch 1.X
[
https://issues.apache.org/jira/browse/NUTCH-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-1800:
Fix Version/s: (was: 2.3.1)
> Documentation for Nutch 1.X REST
[
https://issues.apache.org/jira/browse/NUTCH-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2094:
Fix Version/s: (was: 2.4)
2.3.1
> Stopping and Restartin
[
https://issues.apache.org/jira/browse/NUTCH-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2165:
Fix Version/s: (was: 2.4)
> FileDumper Util hard codes part-# folder n
[
https://issues.apache.org/jira/browse/NUTCH-2166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2166:
Fix Version/s: (was: 2.4)
> Add reverse URL format to dump t
[
https://issues.apache.org/jira/browse/NUTCH-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090337#comment-15090337
]
Lewis John McGibbney edited comment on NUTCH-2168 at 1/9/16 2:0
[
https://issues.apache.org/jira/browse/NUTCH-2168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15090337#comment-15090337
]
Lewis John McGibbney commented on NUTCH-2168:
-
+1 for commit [~wastl-n
[
https://issues.apache.org/jira/browse/NUTCH-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15087804#comment-15087804
]
Lewis John McGibbney commented on NUTCH-2143:
-
Tested v3 and confirmed to
[
https://issues.apache.org/jira/browse/NUTCH-1186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083138#comment-15083138
]
Lewis John McGibbney commented on NUTCH-1186:
-
Will scope and test [~mark
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074453#comment-15074453
]
Lewis John McGibbney commented on NUTCH-2184:
-
[~markus17] coming bac
[
https://issues.apache.org/jira/browse/NUTCH-1946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074319#comment-15074319
]
Lewis John McGibbney commented on NUTCH-1946:
-
Hi [~kalanya]
bq. Hey
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060155#comment-15060155
]
Lewis John McGibbney commented on NUTCH-2184:
-
Ack
On Wednesday, Decembe
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15060023#comment-15060023
]
Lewis John McGibbney commented on NUTCH-2184:
-
Excellent points Markus th
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059489#comment-15059489
]
Lewis John McGibbney commented on NUTCH-2184:
-
No, just the following
h
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059459#comment-15059459
]
Lewis John McGibbney commented on NUTCH-2184:
-
I've tested this on
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15058977#comment-15058977
]
Lewis John McGibbney commented on NUTCH-2184:
-
To describe what this p
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15058962#comment-15058962
]
Lewis John McGibbney commented on NUTCH-2184:
-
Issue is logged at NUTCH-
Lewis John McGibbney created NUTCH-2186:
---
Summary: -addBinaryContent flag can cause "String length must be a
multiple of four" error in IndexingJob
Key: NUTCH-2186
URL: https://issues.apache.org/j
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15058955#comment-15058955
]
Lewis John McGibbney commented on NUTCH-2184:
-
I am going to open ano
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2184:
Attachment: NUTCH-2184.patch
Patch for trunk. During testing this patch against
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2184:
Flags: Patch
Patch Info: Patch Available
> Enable IndexingJob to funct
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2184 stopped by Lewis John McGibbney.
---
> Enable IndexingJob to function with no craw
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15056690#comment-15056690
]
Lewis John McGibbney commented on NUTCH-2184:
-
This issue also impr
Lewis John McGibbney created NUTCH-2185:
---
Summary: protocol-soda-consumer plugin
Key: NUTCH-2185
URL: https://issues.apache.org/jira/browse/NUTCH-2185
Project: Nutch
Issue Type: Bug
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on NUTCH-2184 started by Lewis John McGibbney.
---
> Enable IndexingJob to function with no craw
Lewis John McGibbney created NUTCH-2184:
---
Summary: Enable IndexingJob to function with no crawldb
Key: NUTCH-2184
URL: https://issues.apache.org/jira/browse/NUTCH-2184
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15053975#comment-15053975
]
Lewis John McGibbney commented on NUTCH-2184:
-
Working on this right
[
https://issues.apache.org/jira/browse/NUTCH-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2183.
-
Resolution: Fixed
Committed @revision 1719006 in trunk. Thank you [~mjoyce] for
[
https://issues.apache.org/jira/browse/NUTCH-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2180.
-
Resolution: Fixed
Committed @revision 1719004 in trunk
> FileDumper dumps d
[
https://issues.apache.org/jira/browse/NUTCH-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049698#comment-15049698
]
Lewis John McGibbney commented on NUTCH-2183:
-
Would like to commit toda
[
https://issues.apache.org/jira/browse/NUTCH-2180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15048999#comment-15048999
]
Lewis John McGibbney commented on NUTCH-2180:
-
Harsha do you know
[
https://issues.apache.org/jira/browse/NUTCH-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2183:
Description:
The scenario is that you have a bunch of Nutch data which has been
[
https://issues.apache.org/jira/browse/NUTCH-2183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2183:
Attachment: NUTCH-2183.patch
Patch for trunk.
> Improvement to SegmentChecker
Lewis John McGibbney created NUTCH-2183:
---
Summary: Improvement to SegmentChecker for skipping non-segments
present in segments directory
Key: NUTCH-2183
URL: https://issues.apache.org/jira/browse/NUTCH-2183
--
-- Forwarded message --
From: lewis john mcgibbney
To:
Cc: "travel-assista...@apache.org"
Date: Mon, 7 Dec 2015 20:15:50 -0800
Subject: ApacheCon NA 2015 Travel Assistance Applications now open!
Hi pmcs@,
[
https://issues.apache.org/jira/browse/NUTCH-2181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2181:
Issue Type: Task (was: Bug)
> Add Webpage for 3rd Party Connectors/Libraries
Lewis John McGibbney created NUTCH-2181:
---
Summary: Add Webpage for 3rd Party Connectors/Libraries to Apache
Nutch
Key: NUTCH-2181
URL: https://issues.apache.org/jira/browse/NUTCH-2181
Project
Hello Folks,
07 December 2015 - Nutch 1.11 Release
The Apache Nutch PMC are pleased to announce the immediate release of
Apache Nutch v1.11. We advise all current users and developers of the 1.X
series to upgrade to this release.
What is Apache Nutch?
Nutch is a well matured, production ready W
Hi user@ dev@,
72hrs have elapsed so I would like to bring this thread to a close!
VOTEs were cast with the following RESULT
[7] +1 Release this package as Apache Nutch 1.11
Lewis John Mcgibbney*
Roannel Fernández Hernández
Sujen Shah*
Chris A Mattmann*
Julien Nioche*
Sebastian Nagel*
Jorge
-1.11-rc2/
All artifacts have been signed with the following signature as present
within KEYS
48BAEBF6 2013-10-28 Lewis John McGibbney (CODE SIGNING KEY) <
lewi...@apache.org>
In addition, a staged maven repository is available here:
https://repository.apache.org/content/repositories/orgapach
Hi Chris,
Can you please drop the Nutch 1.11RC#1 artifacts from repository.a.o and
from https://dist.apache.org/repos/dist/dev/nutch/1.11/
Thanks very much
Lewis
--
*Lewis*
[
https://issues.apache.org/jira/browse/NUTCH-2178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2178:
Fix Version/s: (was: 1.11)
1.12
> DeduplicationJob
[
https://issues.apache.org/jira/browse/NUTCH-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2128:
Fix Version/s: (was: 1.12)
1.11
> Refactor configuration
[
https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2149:
Fix Version/s: (was: 1.12)
1.11
> REST endpoint to r
[
https://issues.apache.org/jira/browse/NUTCH-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15038801#comment-15038801
]
Lewis John McGibbney commented on NUTCH-2172:
-
+1
> Parsing whitesp
[
https://issues.apache.org/jira/browse/NUTCH-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037471#comment-15037471
]
Lewis John McGibbney commented on NUTCH-2172:
-
[~wastl-nagel] this is a
[
https://issues.apache.org/jira/browse/NUTCH-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037470#comment-15037470
]
Lewis John McGibbney commented on NUTCH-2172:
-
I think that is the point
[
https://issues.apache.org/jira/browse/NUTCH-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15023141#comment-15023141
]
Lewis John McGibbney edited comment on NUTCH-2158 at 11/23/15 9:4
[
https://issues.apache.org/jira/browse/NUTCH-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15023141#comment-15023141
]
Lewis John McGibbney commented on NUTCH-2158:
-
I am +1 for this. If we
[
https://issues.apache.org/jira/browse/NUTCH-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15018544#comment-15018544
]
Lewis John McGibbney commented on NUTCH-2158:
-
Hi [~jnioche], I repro
[
https://issues.apache.org/jira/browse/NUTCH-2058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney resolved NUTCH-2058.
-
Resolution: Fixed
Tests are not failing as per recent local builds
https
Hi Folks,
Title says it all.
There is only one pending issue for 1.11.
https://issues.apache.org/jira/browse/NUTCH-2158
I am testing out the Tika 1.11 patch right now.
Do you guys want me to push a release if we can get the Tika patch committed?
I can do this tonight when I get home.
Ta
Lewis
--
*Lewis
[
https://issues.apache.org/jira/browse/NUTCH-2162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2162:
Fix Version/s: (was: 1.11)
1.12
> Nutch Webapp Crawl fa
[
https://issues.apache.org/jira/browse/NUTCH-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2069:
Fix Version/s: (was: 1.12)
1.11
> Ignore external li
[
https://issues.apache.org/jira/browse/NUTCH-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15015387#comment-15015387
]
Lewis John McGibbney commented on NUTCH-2069:
-
+1 for patch. Sorry a
[
https://issues.apache.org/jira/browse/NUTCH-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2069:
Fix Version/s: 1.12
> Ignore external links based on dom
Lewis John McGibbney created NUTCH-2171:
---
Summary: Upgrade Nutch Trunk to Java 1.8
Key: NUTCH-2171
URL: https://issues.apache.org/jira/browse/NUTCH-2171
Project: Nutch
Issue Type: Task
Hi Folks,
Mike Joyce and I have been working on a Tinkerpop implementation of
Node and NodeDB (generated through WebGraph) which builds a Vertex input,
used by Tinkerpop, subsequently Gremlin and persisted into a graph database
such as TitanDB.
We have analyzed the problem quite a bit and cam
[
https://issues.apache.org/jira/browse/NUTCH-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15005130#comment-15005130
]
Lewis John McGibbney commented on NUTCH-2157:
-
+1 commit, this looks
[
https://issues.apache.org/jira/browse/NUTCH-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney closed NUTCH-2170.
---
Resolution: Fixed
Hi prabhakar please go to our mailing lists and we can help you
[
https://issues.apache.org/jira/browse/NUTCH-2130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-2130:
Fix Version/s: (was: 2.4)
2.3.1
> copyField rawcont