You're right, this is a dev issue for sure.
On Mon, Apr 1, 2013 at 2:45 PM, kaveh minooie <[email protected]> wrote:

> The patch NUTCH-1551 didn't solve my issue. I am still getting the same
> exact error when i try to run generate. (this was run in local mode) :
>

NUTCH-1551 is not supposed to fix this problem entirely. It merely attempts to make the WebTableReader tool backwards compatible, and it lets you check whether the accessor methods WebPage.getBatchID() and WebPage.getPrevModifiedTime() actually work for your use case. If you are able to check the webtable dump for the URL causing the NPE and provide feedback on it, that would be very valuable indeed.

>
> now the likely variable that is null seems to be 'mapkey' which is
> probably as a result of male formed URL ( thou I can't say that for sure )
>
> now the put function is being called from here
>
> this is from gora 2.1:
>
> gora/blob/0.2.1/gora-core/src/main/java/org/apache/gora/mapreduce/GoraRecordWriter.java:
>
> ...
>
>
> the same function in gora trunk is like this:
> ...
>
> which seems to me that would allow the code to recover from this kind of
> errors. now I get gora through ivy and I don't know how or if I can have
> ivy to fetch the trunk but regardless I still think the question remains
> whether it is a nutch issue or gora?
>
>

So it appears that some issues have been addressed and improved within Gora trunk (which is nice). You can pull a Gora SNAPSHOT from here [0], place it on your classpath, and try it out. Feedback would be greatly appreciated.

The underlying problem here is that not everyone using and developing Gora is also using and developing Nutch. We have been making good progress towards building diversity over in Gora so that it is not so heavily reliant upon Nutch users. This means the project can stand on its own two feet. The downside is that *some* bugs arising from *some* use cases are not discovered until a little later than we would like. Your feedback is really, really helpful.
Note that you can also patch your local copy of 2.x HEAD so that it does not contain the two offending issues we've previously discussed.

[0] https://repository.apache.org/content/repositories/snapshots/org/apache/gora/

