Hi Frédéric,

Thanks for your info, and sorry for the late answer.

I'm currently waiting for a test system; it could still take a few weeks. :-(

Once it's up and running, I will first compare the HBase raw API with the 
DataNucleus ORM on different database sizes (20 GB to 100 GB).
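
A comparison like this could be driven by a small timing harness. The sketch 
below is only an illustration: the class name, the method names and the two 
workloads are hypothetical placeholders, not real HBase or DataNucleus calls, 
and a serious benchmark would also need realistic data volumes and a real 
cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical micro-benchmark harness: times interchangeable workloads.
// The workloads below are placeholders, not real HBase or DataNucleus calls.
class PersistenceBench {

    // Runs the task `warmup` times untimed, then `runs` times timed,
    // and returns the average wall-clock time per timed run in nanoseconds.
    static long averageNanos(Runnable task, int warmup, int runs) {
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / runs;
    }

    // Stand-in workload so the sketch runs without a cluster.
    static long busyWork(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i * 31L;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Runnable> candidates = new LinkedHashMap<>();
        // Placeholder: would wrap puts/gets through the raw HBase client.
        candidates.put("raw-api", () -> busyWork(1_000));
        // Placeholder: would wrap the same operations through the ORM.
        candidates.put("orm", () -> busyWork(5_000));

        for (Map.Entry<String, Runnable> e : candidates.entrySet()) {
            long avg = averageNanos(e.getValue(), 5, 20);
            System.out.println(e.getKey() + ": " + avg + " ns/run");
        }
    }
}
```

Keeping both candidates behind the same `Runnable` interface means every 
implementation is measured by exactly the same loop, so differences come from 
the persistence layer rather than from the harness.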

If the DataNucleus ORM is much slower than the HBase API, I will test the other 
ones you mentioned ;)

Should anyone be interested in the results, I will post them here.

I'm a bit surprised that there haven't been such benchmarks yet... or have I 
just missed them?

regards
Chris


________________________________
From: Frédéric Fondement <[email protected]>
To: [email protected]
Sent: 10:24 Tuesday, 18 October 2011
Subject: Re: Persist to HBase with JPA using HBql-JDBC-Driver (Examples)?

On 15/10/11 23:34, Christian Schäfer wrote:
> But nevertheless I will try using DataNucleus' JPA for HBase and run 
> some benchmarks to compare it with the HBase native interface ;-)
Hi there,

If you plan to do such a study, it would be great if you published the results 
(here?)!

What about proposing a simple application that all those guys who created an 
ORM (DataNucleus, Kundera, ...) could implement and submit (to you?) for a 
benchmark?

I'm one of those guys. We created n-orm (http://code.google.com/p/n-orm/) to 
separate responsibilities in our team (functional vs. non-functional), to 
centralize data management (improving separation of concerns, and thus 
maintainability), and to still understand what really happens under the hood 
(and still be able to change platform in case of problems). Our ORM treats 
POJOs as a kind of schema for the database (query-driven), so the philosophy is 
to use Java objects while keeping in mind how HBase should be used, in the hope 
of not losing too much of what HBase makes possible.

I agree with Michel that the HBase API is easy, but when it comes to 
details, it's really hard to think of everything, especially when it's 
interleaved with functional code (scan caching, inter-process schema 
management, compression, migration, error handling, new versions of the API, 
new possibilities... or just learning some important new thing that has to be 
integrated into the complete application!).

Nevertheless, as our application becomes more and more complex, it's 
inconceivable for us to re-implement it using only the HBase raw API. As a 
consequence, though, I have no real idea of the performance price we pay just 
to make development easier...

Another ORM that deserves attention is 
https://github.com/ghelmling/meetup.beeno, which is built on the same 
philosophy. We didn't choose it because it's too tightly coupled with 
HBase, but I guess it must perform really well (for that very reason).

I think the real danger of ORMs is designing your schema in a domain-driven 
(classical) fashion instead of a query-driven one. This danger may well be 
smaller when you use raw APIs.
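
To illustrate what "query-driven" can mean in practice, here is a minimal 
sketch in which the row key is shaped by the query to be served ("latest events 
of one user") rather than by the domain model. It is an assumption-laden toy: a 
`TreeMap` stands in for a store that keeps rows sorted by key (as HBase does), 
all class and method names are hypothetical, and user ids are assumed not to 
contain the `#` separator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: a TreeMap stands in for a store whose rows are sorted
// by key, as HBase rows are. All names here are hypothetical.
class QueryDrivenKeys {

    // Key designed for the query "latest events of one user":
    // user id first (enables a prefix scan), then a reversed, zero-padded
    // timestamp so that newer events sort before older ones.
    static String rowKey(String userId, long timestampMillis) {
        return userId + "#" + String.format("%019d", Long.MAX_VALUE - timestampMillis);
    }

    // Returns up to `limit` newest event payloads of the user with one range scan.
    static List<String> latestEvents(TreeMap<String, String> table, String userId, int limit) {
        String start = userId + "#";  // inclusive lower bound
        String stop = userId + "$";   // exclusive upper bound: '$' directly follows '#'
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> row : table.subMap(start, stop).entrySet()) {
            if (result.size() == limit) {
                break;
            }
            result.add(row.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        TreeMap<String, String> table = new TreeMap<>();
        table.put(rowKey("alice", 1000L), "login");
        table.put(rowKey("alice", 3000L), "logout");
        table.put(rowKey("bob", 2000L), "login");
        table.put(rowKey("alice", 2000L), "purchase");
        System.out.println(latestEvents(table, "alice", 2)); // prints [logout, purchase]
    }
}
```

A domain-driven design would more likely key each event by its own id and 
answer the same question with a full scan or a secondary index; here the key 
layout itself answers the query.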

Cheers,

Frédéric.
