Hi,

I will shortly submit a patch that fixes the threading issue with the SQL store. As for your suggestion to change TestGoraStorage: I don't think it is up to Nutchgora to test all stores (or more than a single one), because it is really up to Gora itself to make sure that each store works as expected. Otherwise, where would you draw the line? Testing the default store (SQL) in Nutchgora is fine, because the basic read and write tests inherently also test Nutchgora-related functionality.

However, I do agree that Gora needs a better testing framework. Could we not use a base test class similar to TestGoraStorage in Gora too? I really like the idea that a data store simply extends a base test class and only provides a few necessary implementations. The test framework would then test basic reading, writing, multithreading and so on for every store implementation. Of course, every store would still be able to include specific testing code.
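To make the idea concrete, here is a rough sketch of what such a base test class could look like. All names in it (SimpleStore, StoreTestBase, InMemoryStoreTest) are made up for illustration and are not Gora's actual API; the in-memory store only stands in for a real backend, and plain checks are used instead of a test framework to keep the sketch self-contained.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a data store; NOT Gora's actual DataStore interface.
interface SimpleStore {
    void put(String key, String value);
    String get(String key);
}

// Shared base class: holds the generic read/write and multithreading checks.
abstract class StoreTestBase {
    // The only thing a store-specific test has to provide.
    protected abstract SimpleStore createStore();

    void testReadWrite() {
        SimpleStore store = createStore();
        store.put("key", "value");
        check("value".equals(store.get("key")), "read back the written value");
    }

    void testMultiThreaded() {
        final SimpleStore store = createStore();
        final int threads = 4;
        final int perThread = 100;
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final int id = t;
            // Each writer thread uses its own key range.
            pool.submit(() -> {
                for (int i = 0; i < perThread; i++) {
                    store.put(id + "-" + i, "value-" + i);
                }
            });
        }
        pool.shutdown();
        try {
            check(pool.awaitTermination(30, TimeUnit.SECONDS), "writers finished in time");
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError("interrupted while waiting for writers");
        }
        // Every value written by every thread must be readable afterwards.
        for (int t = 0; t < threads; t++) {
            for (int i = 0; i < perThread; i++) {
                check(("value-" + i).equals(store.get(t + "-" + i)), "value for key " + t + "-" + i);
            }
        }
    }

    static void check(boolean ok, String what) {
        if (!ok) {
            throw new AssertionError("check failed: " + what);
        }
    }
}

// Store-specific test: supplies the store and inherits all generic tests.
class InMemoryStoreTest extends StoreTestBase {
    @Override
    protected SimpleStore createStore() {
        final Map<String, String> data = new ConcurrentHashMap<>();
        return new SimpleStore() {
            public void put(String key, String value) { data.put(key, value); }
            public String get(String key) { return data.get(key); }
        };
    }

    public static void main(String[] args) {
        InMemoryStoreTest test = new InMemoryStoreTest();
        test.testReadWrite();
        test.testMultiThreaded();
    }
}
```

A test for a real store would override createStore() with its own setup and configuration, and could add extra store-specific test methods next to the inherited ones.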

Ferdy.

On 02/02/2012 01:16 PM, Lewis John Mcgibbney wrote:
Hi Guys,

We've recently hit a stumbling block of sorts with the above issue. It appears that there has been a good bit more interest in Gora recently, and I personally feel it is really important that users are able to utilize Nutchgora. Currently we have a problem: when NUTCH-1205 [1] is applied, TestGoraStorage breaks. However, Ferdy tracked this to a problem with Gora as opposed to Nutchgora...

We have plans in the pipeline to replace various libraries in the gora-sql module, which would solve some licensing issues and hopefully also fix the problem described above. I would therefore propose the following two options:

1) Commit the patch, giving Nutchgora-gora-hbase & Nutchgora-gora-cassandra users the ability to use the software and work towards fixing the other stuff as they see fit. This would mean that Nutchgora would be effectively broken until the latter happens...
2) Wait until the gora-sql module has been fixed and pushed to repository.apache.org, so that this dependency can be used in the existing tests.

Sorry to completely confuse things, but on a side note, would it not make sense to test HBase and Cassandra as well as SQL storage functionality within TestGoraStorage?

If this were suitable, I would propose changing from
.
`-- TestGoraStorage.java

to the following

.
`-- TestGoraSqlStorage.java
`-- TestGoraHbaseStorage.java
`-- TestGoraCassandraStorage.java
`-- TestGoraStorage.java

This way the testSingleThreaded, testMultiThreaded and testMultiProcess logic could remain within TestGoraStorage, and all three store tests would implement datastore-specific logic and configuration... the same as Gora currently does.

Any comments please? Thank you

[1] https://issues.apache.org/jira/browse/NUTCH-1205
--
/Lewis/
