[GitHub] [gora] LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast Jet execution engine support

2019-08-23 Thread GitBox
LahiruJayasekara commented on a change in pull request #175: GORA-546 Hazelcast 
Jet execution engine support
URL: https://github.com/apache/gora/pull/175#discussion_r317183898
 
 

 ##
 File path: pom.xml
 ##
 @@ -792,6 +792,7 @@
 gora-ignite
 gora-tutorial
 sources-dist
+gora-jet
 
 Review comment:
   Done


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


Re: Final Report

2019-08-23 Thread Sheriffo Ceesay
Hi Renato,

See replies inline.

On Thu, Aug 22, 2019 at 5:52 PM Renato Marroquín Mogrovejo <
renatoj.marroq...@gmail.com> wrote:

> Hey Sheriffo,
>
> Thanks for the report and all the work!
> I think it makes sense that Gora performs worse when inserting data in
> the HBase case, because Gora still needs to serialize every data bean
> through Avro (maybe some caching? But Sheriffo also deactivated this
> with gora.hbasestore.hbase.client.autoflush.enabled=true), so I guess
> the rest of the time is just Gora serialization.
>

I agree with you.


> Now for the reads in HBase-native and HBase-Gora, are we sure we are
> getting the same granularity of objects? I mean because of the mapping
> Gora does (different column families per attribute), maybe we are
> fetching the attributes in a different way than HBase is doing, maybe
> Gora fetches only some column families whereas HBase fetches
> everything.
>

I have done some basic tests to verify this; see the testUpdate() method in
the GoraClientTest file. There, I insert some strings, retrieve them, and
verify that they match the expected values.

> Did you run any correctness tests to know that we are retrieving the
> correct results in both cases? Something like inserting an integer as
> part of the attributes, and then summing them when retrieved to check
> that the sum is what we expect.
>

Thanks for this. I have added a new test case called testCorrectness() to
handle the issue you raised. The results I got are consistent with what we
expect.
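
For illustration, the sum-based check suggested above could be sketched
roughly as follows. This is a minimal, self-contained sketch and not the
actual testCorrectness() code: the class name SumCorrectnessSketch, its
helper methods, and the in-memory HashMap standing in for the datastore are
all hypothetical, and no Gora or HBase API is used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a HashMap stands in for the datastore under test.
class SumCorrectnessSketch {

    // Insert n records keyed "user0".."user{n-1}", each holding an
    // integer attribute equal to its index.
    static Map<String, Integer> insertRecords(int n) {
        Map<String, Integer> store = new HashMap<>();
        for (int i = 0; i < n; i++) {
            store.put("user" + i, i);
        }
        return store;
    }

    // Read every record back by key and sum the integer attribute.
    static long sumRetrieved(Map<String, Integer> store, int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            Integer value = store.get("user" + i);
            if (value == null) {
                throw new IllegalStateException("missing record user" + i);
            }
            sum += value;
        }
        return sum;
    }

    // Closed form for 0 + 1 + ... + (n - 1).
    static long expectedSum(int n) {
        return (long) n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        int n = 1000;
        Map<String, Integer> store = insertRecords(n);
        long actual = sumRetrieved(store, n);
        long expected = expectedSum(n);
        if (actual != expected) {
            throw new AssertionError("expected " + expected + " but got " + actual);
        }
        System.out.println("sum check passed: " + actual);
    }
}
```

Running the same insert and sum steps against both the native client and
the Gora client, and comparing each total with the closed form, would show
whether both paths return complete records.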

>
> Best,
>
> Renato M.
>
> On Thu, Aug 22, 2019 at 5:17, Sheriffo Ceesay wrote:
> >
> > Hi Furkan,
> >
> > Yes, it baffled me as well. I haven't applied any specific performance
> > optimisation configuration to either of the setups, so I think these
> > results may not be final at this stage and need further investigation.
> >
> > The only setting I set for HBase for Apache Gora in the gora.properties
> > file is:
> >
> > gora.hbasestore.hbase.client.autoflush.enabled=true
> >
> > For the local HBase setup, I have followed the recommendations here [1]
> > to avoid any performance issues.
> >
> > [1] https://github.com/brianfrankcooper/YCSB/tree/master/hbase098
> >
> > Basically, the setups are fresh and simplified installations without any
> > major configuration for optimisation.
> >
> > Thank you.
> >
> > *Sheriffo Ceesay*
> >
> >
> >
> > On Thu, Aug 22, 2019 at 12:45 PM Furkan KAMACI wrote:
> >>
> >> Hi Sheriffo,
> >>
> >> Thanks for the updates!
> >>
> >> By the way, I still wonder about the reason for the poor performance
> >> of the HBase native implementation.
> >>
> >> Kind Regards,
> >> Furkan KAMACI
> >>
> >> On Thu, Aug 22, 2019 at 2:37 PM Sheriffo Ceesay 
> >> wrote:
> >>
> >> > Hi Furkan,
> >> > Thanks for your feedback.
> >> >
> >> > Please find replies to your comments inline.
> >> >
> >> > On Wed, Aug 21, 2019 at 6:19 PM Furkan KAMACI
> >> > wrote:
> >> >
> >> > > Hi Sheriffo,
> >> > >
> >> > > Thanks for your great effort!
> >> > >
> >> > > 1) Could you separate the charts for HBase and MongoDB? The HBase
> >> > > charts suppress the MongoDB ones.
> >> > >
> >> > Yes, this is now done. Can you please have a look?
> >> >
> >> > >
> >> > > 2) Report says that:
> >> > >
> >> > > *"In this work, we have time to include only three gora data stores
> >> > > (MongoDB, HBase and CouchDB)"*
> >> > >
> >> > > However, you have not run this benchmark for CouchDB as far as I
> >> > > know?
> >> > >
> >> >
> >> > Yes, you are right that it is not included in the benchmark results,
> >> > but I have included its implementation in the module. This includes
> >> > auto-generating the mapping and related files. Due to time
> >> > constraints, there was a bit of discussion as to which data stores to
> >> > include in the preliminary benchmarking, and we decided to include
> >> > HBase and MongoDB. In future, I will work on adding more data stores
> >> > and compare their performance as well.
> >> >
> >> >
> >> > > 3) I don't think there is a need to add commit hashes and messages
> >> > > as an appendix, especially considering that the hashes will change
> >> > > once the PR is merged into the codebase.
> >> > >
> >> >
> >> > I saw this as a good tip in the email sent by the GSoC team, but I
> >> > agree with you and have now removed it.
> >> >
> >> > >
> >> > > Kind Regards,
> >> > > Furkan KAMACI
> >> >
> >> >
> >> > Thank you.
> >> > Sheriffo.
> >> >
> >> > >
> >> >
> >> >
> >> > > On Wed, Aug 21, 2019 at 7:42 PM Sheriffo Ceesay <sneceesa...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > All,
> >> > > >
> >> > > > My draft final report is available at
> >> > > >
> >> > > > https://cwiki.apache.org/confluence/display/GORA/Final+Report%3A+%5BGORA-532%5D+Benchmark+Module+For+Apache+Gora
> >> > > >
> >> > > > We have until the 26th of this month to submit the report. Please
> >> > > > let me know if you have any comments to improve it.
> >> > > >
> >> > > > Meanwhile, I will work on the documentation