Hi Tom,
I have been using Nutch 1.x for the last 9 months or so and it works well
for large-scale crawls up to around a billion pages. However, the inherent
lack of random access in HDFS really starts to become a burden on our Hadoop
cluster when going through the whole
Julien, devs, users,
I'd like to see bugs fixed in 2.0 but some of them are way out of my league or
would cost me an absurd amount of time. I'd also really like to use Gora but
Gora must be maintained. Gora will play a fundamental role in 2.0 and if
something is broken there it is not trivial
Hi,
Without changing the flow of conversation and the points which have already
been touched upon, I would like to add:
I am really split here between a couple of decisions. I like the abstraction
that Gora provides, even though it is somewhat of a pain to configure, this
also presents a barrier
Hi,
I have been working on NUTCH-208 [1] in an attempt to clean up some dated
issues on our JIRA and have been considering Sami's comments regarding unit
tests for this specific improvement. My questions are as follows:
What is the procedure for defining whether a certain patch requires
Hi Julien, this has now been dealt with.
Any chance of checking when you get round to it?
Thank you
On Thu, Jul 28, 2011 at 7:28 PM, Julien Nioche (JIRA) j...@apache.org wrote:
[
https://issues.apache.org/jira/browse/NUTCH-208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-208:
---
Attachment: NUTCH-208-trunk-2.0-20110810.patch
Patch attached for trunk 2.0.
I am
Priority: Trivial
Labels: patch
Fix For: 1.4, 2.0
Attachments: NUTCH-208-branch-1.4-20110807.patch,
NUTCH-208-branch-1.4-20110809-v2.patch, NUTCH-208-trunk-2.0-20110810.patch,
patch.txt, patch.txt, proxy_exception_list-0.8.diff
I suggest that a parameter
[
https://issues.apache.org/jira/browse/NUTCH-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082292#comment-13082292
]
Lewis John McGibbney commented on NUTCH-258:
When I was viewing
[
https://issues.apache.org/jira/browse/NUTCH-258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082298#comment-13082298
]
Julien Nioche commented on NUTCH-258:
-
Lewis - this issue is closed and I am not sure
[
https://issues.apache.org/jira/browse/NUTCH-917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Julien Nioche closed NUTCH-917.
---
Resolution: Fixed
That's great, thanks Lewis
Website Navigation Links
[
https://issues.apache.org/jira/browse/NUTCH-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082304#comment-13082304
]
Markus Jelsma commented on NUTCH-1028:
--
Committed for 1.4 in rev. 1156132.
Log
Hi,
Just for information purposes, I committed our DOAP which can now be found
under trunk svn. I have been informed by site-dev@ that the system they use
does not support more than one DOAP file; however, I thought it best to keep
it in svn for the time being. If at some point in the future Nutch
[
https://issues.apache.org/jira/browse/NUTCH-208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-208:
---
Attachment: NUTCH-208-trunk-2.0-20110810-v2.patch
new patch for trunk 2.0
Upgrade all instances of commons logging to slf4j (with log4j backend)
--
Key: NUTCH-1078
URL: https://issues.apache.org/jira/browse/NUTCH-1078
Project: Nutch
Issue Type:
That's great, thanks!
On 10 August 2011 14:58, lewis john mcgibbney lewis.mcgibb...@gmail.com wrote:
Hi,
Just for information purposes, I committed our DOAP which can now be found
under trunk svn. I have been informed by site-dev@ that the system they
use does not support more than one DOAP
[
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082373#comment-13082373
]
Simão Fontes commented on NUTCH-296:
The GSoC did generate some code. There have been
[
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082388#comment-13082388
]
Lewis John McGibbney commented on NUTCH-296:
Hi Simão, any chance we could
The code developed was for integration on nutchwax. The link to the project is:
https://webarchive.jira.com/wiki/display/SOC06/Text-based+image+search+capability+for+NutchWAX
The code has been made available to checkout, but it works on a
previous version of nutch.
[
https://issues.apache.org/jira/browse/NUTCH-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-623:
---
Attachment: NUTCH-623-branch-1.4-20110810.patch
This patch for branch-1.4 simply
[
https://issues.apache.org/jira/browse/NUTCH-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney updated NUTCH-623:
---
Attachment: NUTCH-623-branch-1.4-20110810.patch
patch for trunk.
Both of the above
[
https://issues.apache.org/jira/browse/NUTCH-296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lewis John McGibbney reopened NUTCH-296:
This issue is back open...
The code developed was for integration on nutchwax. The
[
https://issues.apache.org/jira/browse/NUTCH-672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082558#comment-13082558
]
Lewis John McGibbney commented on NUTCH-672:
OK having tried to get this
[
https://issues.apache.org/jira/browse/NUTCH-1075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13082595#comment-13082595
]
Lewis John McGibbney commented on NUTCH-1075:
-
Hi Julien,
Would it be