[jira] [Created] (NUTCH-1074) topN is ignored with maxNumSegments

2011-08-01 Thread Markus Jelsma (JIRA)
topN is ignored with maxNumSegments. Key: NUTCH-1074; URL: https://issues.apache.org/jira/browse/NUTCH-1074; Project: Nutch; Issue Type: Bug; Components: generator; Affects Versions: 1.3

Re: Nutch 2 and Cassandra

2011-08-01 Thread Alexis
Hi, libthrift is a dependency of cassandra-thrift, as listed here: http://mvnrepository.com/artifact/org.apache.cassandra/cassandra-thrift/0.8.1 During Nutch build, you have to manually tweak the Ivy configuration depending on your choice of the Gora store, in this case Cassandra. Basically you

Re: Nutch 2 and Cassandra

2011-08-01 Thread Alexis
Ok this version of hector was properly resolved. Thanks! These are the logs: ~/java/workspace/Nutch/trunk/runtime/deploy$ bin/nutch inject ~/java/workspace/Nutch/seeds 11/08/01 15:17:45 INFO crawl.InjectorJob: InjectorJob: starting 11/08/01 15:17:45 INFO crawl.InjectorJob: InjectorJob: urlDir:

RE: Nutch 2 and Cassandra

2011-08-01 Thread Tom Davidson
OK... Are you running with a clustered version of Hadoop? I think you have to have your HADOOP_HOME env variable set. Otherwise it runs in local mode. I have been able to run in local mode, but not in deployed mode. -----Original Message----- From: Alexis [mailto:alexis.detregl...@gmail.com]

[jira] [Commented] (NUTCH-1044) Redirected URLs and possibly all of their outlinked URLs have invalid scores.

2011-08-01 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13076043#comment-13076043 ] Julien Nioche commented on NUTCH-1044: Will commit soon if there aren't any