Re: Apache Gora Benchmark

Sheriffo Ceesay Tue, 26 Mar 2019 07:32:37 -0700

I have updated the Benchmark Module proposal after some suggestions from
Renato. Basically, the suggestion was to consider extending YCSB to include
Gora, since YCSB already has implementations for other KV stores.


It would be great if a potential mentor could have a look at this and give
me some feedback. We are currently in the proposal submission period of
the GSoC timeline, so any comment on the document will really help.

Please find the link to the shared Google doc below.

https://docs.google.com/document/d/1djelY4yVwTuWPA310E_JBinOPnt5PJh3x67z0ZxgBLg/edit



**Sheriffo Ceesay**


On Mon, Mar 25, 2019 at 12:30 PM Sheriffo Ceesay <sneceesa...@gmail.com>
wrote:

> Hi Renato,
>
> Thanks for the reply and the comments on the Google doc.
>
> I think adding Gora to the YCSB framework will be the best approach. As I
> mentioned in the shared doc, I will dig more into this and update the
> proposal accordingly.
>
> Thank you.
>
>
> **Sheriffo Ceesay**
>
>
> On Mon, Mar 25, 2019 at 12:05 PM Renato Marroquín Mogrovejo <
> renatoj.marroq...@gmail.com> wrote:
>
>> Hey Sheriffo,
>>
>> Thanks for sharing this. I went quickly over it, and it looks good overall.
>> One question I have is the one I left on the proposal as well. The proposal
>> is about implementing a benchmarking module, but why aren't we
>> using/integrating with something like YCSB?
>>
>> I am asking this because it has a few benefits:
>> - Most of the operations one would be interested in for kv-stores are
>> already modeled by YCSB (as you know)
>> - With this we would already get support for most key-value stores and we
>> wouldn't have to implement it (or support it) later on.
>> - We get a benchmark module that is already accepted and understood by
>> people using key-value stores.
>>
>> The resulting deliverables could be the integration (adding Gora to YCSB,
>> the module could live in Gora and also could live in YCSB if they want to
>> take it), and the scripts to run it.
>> What do you guys think?
>>
>>
>> Best,
>>
>> Renato M.
>>
>> On Sun, Mar 24, 2019 at 13:05, Sheriffo Ceesay (<
>> sneceesa...@gmail.com>)
>> wrote:
>>
>> > Hi Renato,
>> >
>> > Thanks for the reply. As far as I am concerned all options are on the
>> > table. I have shared my draft project proposal with the dev email list
>> > for comments. I will visit it again and see how best your ideas can be
>> > added to the implementation.
>> >
>> > Below is the Google doc file, please feel free to add comments.
>> >
>> >
>> >
>> > https://docs.google.com/document/d/1djelY4yVwTuWPA310E_JBinOPnt5PJh3x67z0ZxgBLg/edit?usp=sharing
>> >
>> > Thank you.
>> >
>> > **Sheriffo Ceesay**
>> >
>> >
>> > On Sun, Mar 24, 2019 at 11:08 AM Renato Marroquín Mogrovejo <
>> > renatoj.marroq...@gmail.com> wrote:
>> >
>> > > Hi Sheriffo,
>> > >
>> > > Thanks for your interest in Gora and in this project.
>> > > We have discussed this a bit already, and the important bit is to
>> > > figure out Gora's overhead compared to using just the kv stores.
>> > > Obviously, we incur overheads, but it'd be interesting to know where
>> > > exactly (most likely serialization) and not just say how slow Gora is.
>> > > Ideally, one could fix the easy performance bugs; this might be out of
>> > > scope, but it would be nice anyway.
>> > > Another idea would be to actually get the final benchmark run as part of
>> > > CI, so we know how every change impacts performance.
>> > >
>> > >
>> > > Best,
>> > >
>> > > Renato M.
>> > > On Wed, Mar 20, 2019 at 17:15, sneceesa...@gmail.com (<
>> > > sneceesa...@gmail.com>) wrote:
>> > > >
>> > > >
>> > > >
>> > > > On 2017/12/23 20:17:12, Furkan KAMACI <furkankam...@gmail.com> wrote:
>> > > > > Hi Fellows,
>> > > > >
>> > > > > As you know, our project is defined as:
>> > > > >
>> > > > > "*The Apache Gora™ open source framework provides an in-memory data
>> > > > > model and persistence for big data.*[1]"
>> > > > >
>> > > > > I believe that Apache Gora is a special project and it touches many
>> > > > > projects. I always wonder about the performance of NoSQL DBs when
>> > > > > used individually versus when accessed via Apache Gora.
>> > > > >
>> > > > > I think that we should make a benchmark and publish it, and Yahoo!’s
>> > > > > Cloud Serving Benchmark (YCSB) [2] is the most suitable tool for such a
>> > > > > purpose. I found a recent study on the Object-NoSQL Database Mapper
>> > > > > (ONDM) benchmark [3], which includes Apache Gora, and the authors have
>> > > > > released the benchmark source code under the Apache License 2.0 [4].
>> > > > >
>> > > > > Here is an example from Apache Accumulo which is based on YCSB too [5].
>> > > > >
>> > > > > What do you think about it? Who wants to join that work apart from me?
>> > > > >
>> > > > > Kind Regards,
>> > > > > Furkan KAMACI
>> > > > >
>> > > > >
>> > > > > [1] https://gora.apache.org
>> > > > > [2] Cooper BF, Silberstein A, Tam E, Ramakrishnan R, Sears R.
>> > > > > Benchmarking cloud serving systems with YCSB. In: Proceedings of the
>> > > > > 1st ACM symposium on Cloud computing - SoCC ’10. Association for
>> > > > > Computing Machinery (ACM): 2010. p. 143–154,
>> > > > > doi:10.1145/1807128.1807152.
>> > > > > http://dx.doi.org/10.1145/1807128.1807152.
>> > > > > [3] https://doi.org/10.1186/s13174-016-0052-x
>> > > > > [4] https://github.com/vreniers/ONDM-Benchmarker
>> > > > > [5] https://accumulo.apache.org/papers/accumulo-benchmarking-2.1.pdf
>> > > > >
>> > > >
>> > > > Hi All, I was advised by Kevin Ratnasekera to start or reignite this
>> > > > discussion. I am currently going over the documentation and
>> > > > installation, and familiarising myself with the code base. Any good
>> > > > pointers here will be helpful.
>> > >
>> >
>>
>
