Hello Max,
(Besides the fact that this client seems to have a broken random URL
generator)
Crawlers (like Nutch clients) may not always obey robot rules. If Nutch is not
configured properly, it will not recognize your Nutch entry in your robots.txt
file.
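As a rough illustration of the agent-name matching described above, here is a minimal sketch using Python's standard urllib.robotparser. It is not Nutch's own robots.txt handling, and the agent names and the example.com URL are made up for illustration. A crawler that identifies itself under a different name is not matched by an entry written for "Nutch":

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt with a rule aimed only at an agent named "Nutch".
ROBOTS_TXT = """\
User-agent: Nutch
Disallow: /
"""


def is_allowed(agent_name: str, url: str) -> bool:
    """Return True if the robots.txt above lets agent_name fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent_name, url)


if __name__ == "__main__":
    url = "http://example.com/page.html"  # placeholder URL
    print("Nutch allowed:", is_allowed("Nutch", url))
    print("MyCrawler allowed:", is_allowed("MyCrawler", url))
```

If the crawler announces a name other than the one in the robots.txt entry, the entry simply never applies to it, which is the situation a misconfigured agent name would produce.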
If the requests come from a
[
https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143038#comment-13143038
]
Julien Nioche commented on NUTCH-1098:
--
@Radim
Sounds like I am not going to is your
Hi,
If you look at the recently failing Nutch trunk builds, namely #1645, #1650,
#1651, a common denominator is the
org.apache.nutch.segment.TestSegmentMerger.testLargeMerge <https://builds.apache.org/job/Nutch-trunk/1651/testReport/org.apache.nutch.segment/TestSegmentMerger/testLargeMerge/>
which
DiskCheckerException usually smells like running out of disk space in the
designated tmp dir.
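A quick way to test that suspicion on a build machine is to look at the free space on the temp filesystem. The following is only a sketch: it checks Python's default temp directory as a stand-in, which is an assumption, since the directory the Hadoop tests actually use may differ, and the 1 GiB threshold is arbitrary.

```python
import shutil
import tempfile


def has_free_space(path: str, required_bytes: int) -> bool:
    """Return True if the filesystem holding path has at least required_bytes free."""
    return shutil.disk_usage(path).free >= required_bytes


if __name__ == "__main__":
    # Assumption: the system temp dir stands in for the build's tmp dir.
    tmp_dir = tempfile.gettempdir()
    usage = shutil.disk_usage(tmp_dir)
    gib = 2**30
    print(f"{tmp_dir}: {usage.free / gib:.1f} GiB free of {usage.total / gib:.1f} GiB")
    # Arbitrary illustrative threshold of 1 GiB.
    print("at least 1 GiB free:", has_free_space(tmp_dir, 1 * gib))
```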
On Thursday 03 November 2011 12:39:11 Lewis John Mcgibbney wrote:
Hi,
If you look at the recently failing Nutch trunk builds, namely #1645, #1650,
#1651, a common denominator is the
Would make logical sense Markus, thank you.
I think it's about time to try a more generic Jenkins build configuration
e.g. build on Ubuntu slaves as well as Solaris. I'll see what we can get
running over the next while.
On Thu, Nov 3, 2011 at 11:43 AM, Markus Jelsma
I can't tell what the exact cause is. Because tests run locally fine and
because the commits since the last successful build seem completely unrelated, I
would say yes, this is definitely caused by the Solaris build
environment. Unfortunately I'm still a novice in regard to the build
process so I'm not
Hello,
I would like to be subscribed to the nutch developers list.
Update job should impose an upper limit on the number of inlinks (nutchgora)
Key: NUTCH-1196
URL: https://issues.apache.org/jira/browse/NUTCH-1196
Project: Nutch
Hello,
I get the following output in hadoop.log when I configure Nutch 1.3 in
Eclipse (Win 7), but I don't know how to resolve it. Can you help me?
I'm new to Nutch, so please forgive any mistakes in terminology!
2011-11-03 16:51:53,300
[
https://issues.apache.org/jira/browse/NUTCH-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Markus Jelsma updated NUTCH-1140:
-
Fix Version/s: 1.5
index-more plugin, resetTitle method creates multiple values in the
Hi
Please use the user@nutch mailing list for user-related questions. This is for
development of Nutch itself.
Cheers
Hello dear :
I have the following running information from
hadoop.log when I configured Nutch 1.3 in Eclipse (Win 7), but I don't
know how to resolve
[
https://issues.apache.org/jira/browse/NUTCH-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13143503#comment-13143503
]
Joe Liedtke commented on NUTCH-1140:
Thanks!
index-more plugin,
[
https://issues.apache.org/jira/browse/NUTCH-1195?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki resolved NUTCH-1195.
--
Resolution: Fixed
Committed in rev. 1197319.
Add Solr 4x (trunk)
[
https://issues.apache.org/jira/browse/NUTCH-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Radim Kolar updated NUTCH-1194:
---
Comment: was deleted
(was: locking should be done in setup/cleanup task. Currently if you kill
[
https://issues.apache.org/jira/browse/NUTCH-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Radim Kolar updated NUTCH-1070:
---
Attachment: (was: nutch.bat)
Run nutch under native windows (no cygwin)
[
https://issues.apache.org/jira/browse/NUTCH-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Radim Kolar resolved NUTCH-1070.
Resolution: Won't Fix
Run nutch under native windows (no cygwin)
[
https://issues.apache.org/jira/browse/NUTCH-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Radim Kolar updated NUTCH-1070:
---
Attachment: (was: bash.c)
Run nutch under native windows (no cygwin)
[
https://issues.apache.org/jira/browse/NUTCH-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Radim Kolar updated NUTCH-1070:
---
Attachment: (was: chmod.c)
Run nutch under native windows (no cygwin)
Hi Lewis,
I guess in gora-cassandra/src/test/conf/gora.properties, the servers are
listed as:
gora.cassandrastore.servers=localhost:9160
In setting the properties for gora data stores, you have to supply the data
store that it applies to. The documentation at
Add statically configured field values to solrindex-mapping.xml
---
Key: NUTCH-1197
URL: https://issues.apache.org/jira/browse/NUTCH-1197
Project: Nutch
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/NUTCH-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andrzej Bialecki updated NUTCH-1197:
-
Attachment: NUTCH-1197.patch
Patch with the implementation. I added some javadocs, and a
I’ve been ‘out of it’ for a while. It used to be that Nutch had a localized
HTML search page that featured these
guys <http://upload.wikimedia.org/wikipedia/commons/5/53/Nutch.png>.
Did 1.3 bring this forward in some form that I cannot find (maybe involving
an XSL on search results?), or has this
See https://builds.apache.org/job/Nutch-trunk/1652/changes