> Can you read your db and see if there are any pages pending a fetch?

After inject:
[default@webpage] list f;
Using default limit of 100
-------------------
RowKey: 6c742e62616c7361732e7777773a687474702f
=> (column=6669, value=00278d00, timestamp=1348032953800000)
=> (column=73, value=3f800000, timestamp=1348032953802000)
=> (column=7473, value=00000139dd066f6b, timestamp=1348032953798000)
-------------------
RowKey: 6c742e6c72797461732e7777773a687474702f
=> (column=6669, value=00278d00, timestamp=1348032953811000)
=> (column=73, value=3f800000, timestamp=1348032953814000)
=> (column=7473, value=00000139dd066f6b, timestamp=1348032953809000)
-------------------
RowKey: 6c742e31356d696e2e7777773a687474702f
=> (column=6669, value=00278d00, timestamp=1348032953787000)
=> (column=73, value=3f800000, timestamp=1348032953789000)
=> (column=7473, value=00000139dd066f6b, timestamp=1348032953785000)
-------------------
RowKey: 6c742e64656c66692e7777773a687474702f
=> (column=6669, value=00278d00, timestamp=1348032953749000)
=> (column=73, value=3f800000, timestamp=1348032953752000)
=> (column=7473, value=00000139dd066f6b, timestamp=1348032953656000)

4 Rows Returned.

Then after fetch, list f returns a very, very long sequence like this:

....... d3e3c2f6469763e0a3c2f6469763e3c212d2d2064656c666920636f6e7461696e6572207772617070657220626567696e202d2d3e0a0a0a202020200a3c2f626f64793e0a3c2f68746d6c3e0a0a, timestamp=1347972430537000)
=> (column=6669, value=00278d00, timestamp=1347972384062000)
=> (column=707473, value=00000139d96a3c7a, timestamp=1347972430534000)
=> (column=73, value=3f800000, timestamp=1347972384065000)
=> (column=7374, value=00000002, timestamp=1347972430531000)
=> (column=7473, value=0000013b0e6877e8, timestamp=1347974904068000)
=> (column=747970, value=6170706c69636174696f6e2f7868746d6c2b786d6c, timestamp=1347972430640000)

4 Rows Returned.
Elapsed time: 10255 msec(s).

After parse, list p returns a similarly long sequence of byte codes.
After updatedb there are apparently no changes.
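In case it helps with reading the listing, here is a small Python sketch that decodes the hex that cassandra-cli prints. It is not part of Nutch or Gora. It assumes that row keys, column names and the content type are UTF-8 text, that the 4- and 8-byte values are big-endian integers, that the score is a big-endian float, and that the 8-byte values are milliseconds since the Unix epoch. The helper names are made up for this example.

```python
import struct
from datetime import datetime, timezone


def hex_to_text(h: str) -> str:
    """Decode a hex dump as UTF-8 text (row keys, column names, content type)."""
    return bytes.fromhex(h).decode("utf-8")


def hex_to_int(h: str) -> int:
    """Decode a hex dump as a big-endian integer (values here are all positive)."""
    return int.from_bytes(bytes.fromhex(h), "big")


def hex_to_float(h: str) -> float:
    """Decode 4 bytes as a big-endian IEEE 754 float."""
    return struct.unpack(">f", bytes.fromhex(h))[0]


def millis_to_utc(ms: int) -> datetime:
    """Treat an integer as milliseconds since the Unix epoch (an assumption)."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Values copied from the listing above.
print(hex_to_text("6c742e62616c7361732e7777773a687474702f"))  # lt.balsas.www:http/
print(hex_to_text("6669"), hex_to_int("00278d00"))            # fi 2592000
print(hex_to_text("73"), hex_to_float("3f800000"))            # s 1.0
print(hex_to_text("7374"), hex_to_int("00000002"))            # st 2
print(hex_to_text("747970"),
      hex_to_text("6170706c69636174696f6e2f7868746d6c2b786d6c"))

# The two 8-byte columns from the post-fetch listing.
pts = hex_to_int("00000139d96a3c7a")  # column 707473 ("pts")
ts = hex_to_int("0000013b0e6877e8")   # column 7473 ("ts")
print("pts", millis_to_utc(pts))
print("ts ", millis_to_utc(ts))
print("days between:", (ts - pts) / 86_400_000)
```

The row keys decode to reversed URLs (lt.balsas.www:http/ and so on), and the column names decode to fi, s, ts, pts, st and typ. My guess is that these are the short names from conf/gora-cassandra-mapping.xml for fetchInterval, score, fetchTime, prevFetchTime, status and contentType, but please check that against the mapping file. If the guess is right, fi is 2592000 seconds (30 days), and ts in the post-fetch listing is about 60 days later than pts.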

Then, starting a new generate, fetch, parse iteration, list f returns:

....02020200a3c2f626f64793e0a3c2f68746d6c3e0a0a, timestamp=1348033056939000)
=> (column=6669, value=00278d00, timestamp=1348032953749000)
=> (column=707473, value=00000139dd066f6b, timestamp=1348033056934000)
=> (column=73, value=3f800000, timestamp=1348032953752000)
=> (column=7374, value=00000002, timestamp=1348033056931000)
=> (column=7473, value=0000013a7786c6be, timestamp=1348033211184000)
=> (column=747970, value=6170706c69636174696f6e2f7868746d6c2b786d6c, timestamp=1348033056949000)

4 Rows Returned.
Elapsed time: 11825 msec(s).

Also, I have added these jars to the nutch lib; maybe the versions are not right?

cassandra-all-1.1.2.jar
cassandra-thrift-1.1.2.jar
gora-core-0.2.1.jar
gora-cassandra-0.2.1.jar
hector-core-1.1-0.jar
thrift-0.2.0.jar (not needed, I think; libthrift has everything necessary)
libthrift-0.7.0.jar

cassandra -v
1.0.11

That's a bit strange, because I downloaded v1.1.5 (I also tried the one
that installs via aptitude on Ubuntu).

On Tue, Sep 18, 2012 at 5:16 PM, Lewis John Mcgibbney <[email protected]> wrote:

> Hi,
>
> On Tue, Sep 18, 2012 at 2:34 PM, Žygimantas Medelis <[email protected]>
> wrote:
>
> > Commands I am issuing
> >
>
> Can you read your db and see if there are any pages pending a fetch?
>
> > Also I was getting NullPointerException on inject before
> > changing conf/gora-cassandra-mapping.xml
> > from: <class keyClass="java.lang.String"
> > name="org.apache.nutch.storage.WebPage">
> > to: <class keyClass="java.lang.String"
> > name="org.apache.nutch.storage.WebPage" keyspace="webpage">
>
> I've now fixed this in the 2.x branch. Thank you for reporting
>

