Hi Renato,

See replies inline.

On Thu, Aug 22, 2019 at 5:52 PM Renato Marroquín Mogrovejo
<renatoj.marroq...@gmail.com> wrote:

> Hey Sheriffo,
>
> Thanks for the report and all the work!
> Gora performing worse when inserting data in the HBase case makes
> sense, I think, because Gora still needs to serialize every data bean
> through Avro (maybe some caching? but Sheriffo also deactivated this
> with gora.hbasestore.hbase.client.autoflush.enabled=true), so I guess
> the rest of the time is just Gora serialization.
>
I agree with you.

> Now for the reads in HBase-native and HBase-Gora, are we sure we are
> getting the same granularity of objects? I mean because of the mapping
> Gora does (different column families per attribute), maybe we are
> fetching the attributes in a different way than HBase is doing, maybe
> Gora fetches only some column families whereas HBase fetches
> everything.
>
I have done some basic tests to verify this; see the testUpdate() method in
the GoraClientTest file. Here, I insert some strings, retrieve them and
verify that they match the expected values.

> Did you run any correctness tests to know that we are retrieving the
> correct results in both cases? Something like inserting an integer as
> part of the attributes, and then summing them when retrieved to check
> that the sum is what we expect.
>
Thanks for this. I have added a new test case called testCorrectness() to
handle the issue you have raised. The results I got are consistent with
what we are expecting.

>
> Best,
>
> Renato M.
>
> On Thu, Aug 22, 2019 at 5:17, Sheriffo Ceesay
> (<sneceesa...@gmail.com>) wrote:
> >
> > Hi Furkan,
> >
> > Yes, it baffled me as well. I haven't made any specific performance
> > optimisation configuration to either of the setups, so I think these
> > results may not be final at this stage and would need further
> > investigation.
> >
> > The only setting I set for HBase for Apache Gora in the gora.properties
> > file is:
> >
> > gora.hbasestore.hbase.client.autoflush.enabled=true
> >
> > For the local HBase setup, I have followed the recommendations here [1]
> > to avoid any performance issues.
> >
> > [1] https://github.com/brianfrankcooper/YCSB/tree/master/hbase098
> >
> > Basically, the setups are fresh and simplified installations without any
> > major configuration for optimisation.
> >
> > Thank you.
> >
> > *Sheriffo Ceesay*
> >
> >
> > On Thu, Aug 22, 2019 at 12:45 PM Furkan KAMACI <furkankam...@gmail.com>
> > wrote:
> >>
> >> Hi Sheriffo,
> >>
> >> Thanks for the updates!
> >>
> >> By the way, I still wonder about the reason for the poor performance of
> >> the HBase native implementation.
> >>
> >> Kind Regards,
> >> Furkan KAMACI
> >>
> >> On Thu, Aug 22, 2019 at 2:37 PM Sheriffo Ceesay <sneceesa...@gmail.com>
> >> wrote:
> >>
> >> > Hi Furkan,
> >> > Thanks for your feedback.
> >> >
> >> > Please find replies to your comments inline.
> >> >
> >> > On Wed, Aug 21, 2019 at 6:19 PM Furkan KAMACI <furkankam...@gmail.com>
> >> > wrote:
> >> >
> >> > > Hi Sheriffo,
> >> > >
> >> > > Thanks for your great effort!
> >> > >
> >> > > 1) Could you separate charts for HBase and MongoDB? HBase charts
> >> > > suppress MongoDB ones.
> >> > >
> >> > Yes, this is now done. Can you please have a look?
> >> >
> >> > >
> >> > > 2) Report says that:
> >> > >
> >> > > *"In this work, we have time to include only three gora data stores
> >> > > (MongoDB, HBase and CouchDB)"*
> >> > >
> >> > > However, you have not run this benchmark for CouchDB as far as I
> >> > > know?
> >> > >
> >> >
> >> > Yes, you are right that it is not included in the benchmark results, but
> >> > I have included its implementation in the module. This includes
> >> > auto-generating mapping and related files.
> >> > Due to time factors, there was a
> >> > bit of discussion as to which datastores to include in the preliminary
> >> > benchmarking and we have decided to include HBase and MongoDB. In
> >> > future, I will work on adding more data stores and compare their
> >> > performance as well.
> >> >
> >> >
> >> > > 3) I don't think there is a need to add commit hashes and messages as
> >> > > Appendix. Especially if we consider that hashes will be changed once
> >> > > the PR is merged into the codebase.
> >> > >
> >> >
> >> > I have seen this as a good tip in the email sent by the GSoC team, but I
> >> > agree with you and I have now removed this.
> >> >
> >> > >
> >> > > Kind Regards,
> >> > > Furkan KAMACI
> >> >
> >> >
> >> > Thank you.
> >> > Sheriffo.
> >> >
> >> > > On Wed, Aug 21, 2019 at 7:42 PM Sheriffo Ceesay
> >> > > <sneceesa...@gmail.com> wrote:
> >> > >
> >> > > > All,
> >> > > >
> >> > > > My draft final report is available at
> >> > > >
> >> > > > https://cwiki.apache.org/confluence/display/GORA/Final+Report%3A+%5BGORA-532%5D+Benchmark+Module+For+Apache+Gora
> >> > > >
> >> > > > We have until the 26th of this month to submit the report. Please
> >> > > > let me know if you have any comments to improve it.
> >> > > >
> >> > > > Meanwhile, I will work on the documentation on how to run the
> >> > > > benchmark module and publish it on the Gora website.
> >> > > >
> >> > > > Thank you.
> >> > > >
> >> > > > **Sheriffo Ceesay**