The generator does not have filters either. As far as I know, its mapper also goes over all records. If you use Hadoop, you can check how many records go as input to the mappers, for example via the job's "Map input records" counter. Also see
https://issues.apache.org/jira/browse/GORA-119

Alex.

-----Original Message-----
From: Roland <[email protected]>
To: user <[email protected]>
Sent: Wed, Feb 20, 2013 11:47 am
Subject: Re: nutch with cassandra internal network usage

Hi Alex,

the GeneratorJob seems to have a solution for that; if not, it would iterate over all records too, am I right?

--Roland

On 20.02.2013 20:42, [email protected] wrote:
> Hi,
>
> This is because fetch's mapper goes over all records and selects those that have the given batchId. Currently the mappers of all Nutch commands do not have filters.
> It would be interesting to know whether you can select records with a given batchId in Cassandra without iterating over all records.
>
> Alex.
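
To illustrate the point discussed in this thread, here is a minimal Python sketch of the difference between a mapper that reads every record and filters by batchId itself, and a store that can return only the matching records. This is not Nutch or Gora code: the `Record` class, both functions, and the in-memory `index` dictionary are hypothetical stand-ins chosen only to show why the full scan reads far more records than it selects.

```python
from dataclasses import dataclass


@dataclass
class Record:
    # Hypothetical stand-in for a stored web page row.
    url: str
    batch_id: str


def map_without_filter(records, batch_id):
    """Mimic a mapper with no store-side filter: every record is read."""
    read = 0
    selected = []
    for rec in records:
        read += 1  # comparable to Hadoop's "Map input records" counter
        if rec.batch_id == batch_id:
            selected.append(rec.url)
    return selected, read


def map_with_index(index, batch_id):
    """Assumed alternative: the store hands back only matching records."""
    matches = index.get(batch_id, [])
    return [rec.url for rec in matches], len(matches)


# Twelve records, every fourth one belongs to the batch being fetched.
records = [
    Record(f"http://example.com/{i}", "batch-1" if i % 4 == 0 else "batch-0")
    for i in range(12)
]

# Build a batchId -> records lookup to play the role of a secondary index.
index = {}
for rec in records:
    index.setdefault(rec.batch_id, []).append(rec)

scan_urls, scan_read = map_without_filter(records, "batch-1")
idx_urls, idx_read = map_with_index(index, "batch-1")

print(scan_read, idx_read)
```

Both approaches select the same URLs, but the unfiltered scan reads all records while the indexed lookup reads only the ones in the batch. Whether Cassandra can serve such a lookup for batchId is exactly the open question raised in the thread.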

