[
https://issues.apache.org/jira/browse/NUTCH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486041#comment-13486041
]
Roberto Gardenier commented on NUTCH-585:
-
I have compiled nutch 1.5.1 with the pro
[
https://issues.apache.org/jira/browse/NUTCH-585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Roberto Gardenier updated NUTCH-585:
Comment: was deleted
(was: I have compiled nutch 1.5.1 with the provided plugin and used the
Julien Nioche created NUTCH-1482:
Summary: Rename HTMLParseFilter
Key: NUTCH-1482
URL: https://issues.apache.org/jira/browse/NUTCH-1482
Project: Nutch
Issue Type: Task
Components: p
[
https://issues.apache.org/jira/browse/NUTCH-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486106#comment-13486106
]
Lewis John McGibbney commented on NUTCH-1482:
-
Hi Julien. +1 for this
[
https://issues.apache.org/jira/browse/NUTCH-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1245:
---
Attachment: NUTCH-1245-578-TEST-1.patch
JUnit test to catch this problem and NUTCH-578: a lar
[
https://issues.apache.org/jira/browse/NUTCH-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reassigned NUTCH-1370:
---
Assignee: Lewis John McGibbney
> Expose exact number of urls injected @ru
In addition to this, can someone please explain why [0]
StorageUtils#getDataStoreClass is a private method in this class? The
reason I ask is that it would be nice to be able to log which Gora
class is being used to persist the Injected URLs.
Are there any security risks associated with making thi
Hi Lewis
see comments below
>
> So I thought I'd take this one on tonight and see if I can resolve.
> Basically, my high level question is as follows...
> Is each line of a text file (seed file) which we attempt to inject
> into the webdb considered as an individual map task?
>
no - each file in
[
https://issues.apache.org/jira/browse/NUTCH-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486144#comment-13486144
]
Sebastian Nagel commented on NUTCH-1482:
+1
> Rename HTMLParseFil
Hi Julien,
Thanks for the comments. Any additional ones regarding the accessibility of
the getDataStoreClass?
Thanks again
Lewis
On Mon, Oct 29, 2012 at 4:52 PM, Julien Nioche <
lists.digitalpeb...@gmail.com> wrote:
> Hi Lewis
>
> see comments below
>
>>
>> So I thought I'd take this one on to
[
https://issues.apache.org/jira/browse/NUTCH-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1245:
---
Attachment: NUTCH-1245-1.patch
FetchSchedule.setPageGoneSchedule is called exclusively for a
[
https://issues.apache.org/jira/browse/NUTCH-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486155#comment-13486155
]
Markus Jelsma commented on NUTCH-1482:
--
+0 I'm fine with such a change but this will
[
https://issues.apache.org/jira/browse/NUTCH-1482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486290#comment-13486290
]
Sebastian Nagel commented on NUTCH-1482:
Markus, you are right: I remember the API
[
https://issues.apache.org/jira/browse/NUTCH-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-1245:
---
Attachment: NUTCH-1245-2.patch
NUTCH-1245-578-TEST-2.patch
Improved patches
[
https://issues.apache.org/jira/browse/NUTCH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486484#comment-13486484
]
Sebastian Nagel commented on NUTCH-578:
---
NUTCH-1245 provides a test to catch this pro
[
https://issues.apache.org/jira/browse/NUTCH-578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-578:
--
Attachment: NUTCH-578_v5.patch
> URL fetched with 403 is generated over and over again
> ---