Re: [DISCUSSION]: Adding GridGain component in Bigtop

Konstantin Boudnik Wed, 26 Mar 2014 15:35:30 -0700

Oh, and BTW - don't forget to tweet with the #asfbigtop hashtag to RSVP for the
event ;)


On Wed, Mar 26, 2014 at 02:36PM, Dmitriy Setrakyan wrote:
> I plan to be at ApacheCon on Monday, April 7th. I hear that Bigtop will
> have a meetup there in the evening. Do you think it would be OK if I
> spent about 20 minutes there presenting GridGain GGFS and our overall approach
> to Hadoop acceleration? I think it would be interesting to go through a
> couple of architectural diagrams, which may spur a good discussion.
> 
> -Dmitriy
> 
> On Wed, Mar 26, 2014 at 8:35 AM, Jay Vyas <[email protected]> wrote:
> 
> > I love the fact that GridGain is going to be part of Bigtop! This will
> > give us two new compute paradigms, all packaged and testable under the
> > same umbrella. And now with our Vagrant recipes, people will be able to
> > demo GridGain by simply typing "vagrant up" into the console.
> >
> > And I'm pretty sure GridGain and Spark will drive each other forward, just
> > the same way Ceph, HDFS, and GlusterFS do.
> >
> > Dmitriy, will you be at ApacheCon? If so, why don't you come share your
> > thoughts with us at the two Bigtop meetups on the 7th and the 8th?
> >
> >
> >
> >
> >
> > On Wed, Mar 26, 2014 at 10:26 AM, Dmitriy Setrakyan <
> > [email protected]
> > > wrote:
> >
> > > Andrew,
> > >
> > > I agree with you. All I meant to say is that currently users of Hadoop
> > > who would like to improve performance of their deployments have to switch
> > > to Spark and code to Spark APIs. GridGain, on the other hand, will provide
> > > an option to accelerate existing Hadoop deployments without any changes in
> > > code.
> > >
> > > Regards,
> > > -Dmitriy
> > >
> > > On Tue, Mar 25, 2014 at 4:16 PM, Andrew Purtell <[email protected]>
> > > wrote:
> > >
> > > > Thank you.
> > > >
> > > > On this part of your response:
> > > >
> > > > > > GridGain is working on adding native MapReduce component which will
> > > > > > provide native complete Hadoop integration without changes in API, like
> > > > > > Spark currently forces you to do
> > > >
> > > > I'm not sure those flocking to Spark are doing so by force. Nor that
> > > > the Spark API should be considered a liability when compared to Hadoop
> > > > MapReduce. For your consideration.
> > > >
> > > >
> > > >
> > > > On Tue, Mar 25, 2014 at 12:08 AM, Dmitriy Setrakyan <
> > > > [email protected]
> > > > > wrote:
> > > >
> > > > > I think the feature set is pretty close and GGFS would be a good
> > > > > contrast to Tachyon for performance and reliability features.
> > > > >
> > > > > I am not an expert on Tachyon, but I think the main differences are:
> > > > >
> > > > > - GGFS allows read-through and write-through to/from the underlying HDFS
> > > > > or any other Hadoop-compliant file system with zero code change.
> > > > > Essentially, GGFS entirely removes the ETL step from integration.
> > > > >
> > > > > - GGFS has the ability to pick and choose what folders stay in memory,
> > > > > what folders stay on disk, and what folders get synchronized with the
> > > > > underlying (HD)FS either synchronously or asynchronously.
> > > > >
> > > > > - GridGain is working on adding native MapReduce component which will
> > > > > provide native complete Hadoop integration without changes in API, like
> > > > > Spark currently forces you to do. Essentially, GridGain MR+GGFS will
> > > > > allow bringing Hadoop completely or partially in-memory in Plug-n-Play
> > > > > fashion without any API changes.
> > > > >
> > > > > There are probably other differences that I am forgetting right now,
> > > > > but I think the above set lists the most significant ones.
> > > > >
> > > > > Regards,
> > > > > --
> > > > > Dmitriy Setrakyan, EVP Engineering
> > > > > *GridGain Systems*
> > > > > www.gridgain.com
> > > > >
> > > > >
> > > > > On Mon, Mar 24, 2014 at 11:53 PM, Andrew Purtell <
> > [email protected]
> > > > > >wrote:
> > > > >
> > > > > > Dmitriy,
> > > > > >
> > > > > > Would it be possible to contrast GGFS with Tachyon (
> > > > > > http://tachyon-project.org/)?
> > > > > >
> > > > > > Also, do you have any plans for Spark integration?
> > > > > >
> > > > > >
> > > > > > On Mon, Mar 24, 2014 at 11:35 PM, Dmitriy Setrakyan <
> > > > > > [email protected]
> > > > > > > wrote:
> > > > > >
> > > > > > > Hi Roman,
> > > > > > >
> > > > > > > At this point the integration is a pluggable in-memory file system,
> > > > > > > GGFS. It works just like HDFS (same API), but in reality serves as a
> > > > > > > caching layer on top of HDFS. GGFS caches the hottest file blocks and
> > > > > > > then synchronizes them with the underlying HDFS either synchronously
> > > > > > > or asynchronously, depending on configuration.
> > > > > > >
> > > > > > > Since GGFS implements the standard Hadoop File System API, it
> > > > > > > automatically integrates with other Hadoop ecosystem pieces via the
> > > > > > > File System API as well.
> > > > > > >
> > > > > > > Going forward, we are planning to add the same native API integration
> > > > > > > for the MapReduce component as well.
> > > > > > >
> > > > > > > Hope this answers your question.
> > > > > > >
> > > > > > > -Dmitriy
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Mar 24, 2014 at 11:11 PM, Roman Shaposhnik <
> > [email protected]
> > > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Dmitriy!
> > > > > > > >
> > > > > > > > Welcome to the Bigtop community!
> > > > > > > >
> > > > > > > > On Mon, Mar 24, 2014 at 10:43 PM, Konstantin Boudnik <
> > > > [email protected]
> > > > > >
> > > > > > > > wrote:
> > > > > > > > > >> One of the main pieces of our platform is our In-Memory Apache Hadoop
> > > > > > > > > >> Accelerator, which aims to accelerate HDFS and Map/Reduce by bringing
> > > > > > > > > >> both data and computations into memory. We do it with our GGFS, a
> > > > > > > > > >> Hadoop-compliant in-memory file system. For I/O-intensive jobs GridGain
> > > > > > > > > >> GGFS offers performance close to 100x faster than standard HDFS. More
> > > > > > > > > >> information can be found here:
> > > > > > > > > >> http://www.gridgain.org/features/hadoop-acceleration/
> > > > > > > > > >>
> > > > > > > > > >> We would like to have an opportunity to integrate our Apache Hadoop
> > > > > > > > > >> Accelerator with Apache Bigtop. Please let us know if this is possible
> > > > > > > > > >> and what steps are required of us.
> > > > > > > >
> > > > > > > > I've actually been fascinated by the in-memory analytics platforms
> > > > > > > > lately. Things like Apache Spark seem to be a really good addition to
> > > > > > > > the Hadoop ecosystem.
> > > > > > > >
> > > > > > > > Now, I understand that you've got a piece of technology that can
> > > > > > > > essentially serve as a replacement for HDFS, but could you please
> > > > > > > > elaborate on what other integration points you have between GridGain
> > > > > > > > and the rest of the Hadoop ecosystem?
> > > > > > > >
> > > > > > > > That, I think, would be a much wider discussion.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Roman.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Best regards,
> > > > > >
> > > > > >    - Andy
> > > > > >
> > > > > > Problems worthy of attack prove their worth by hitting back. - Piet
> > > > > > Hein (via Tom White)
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Best regards,
> > > >
> > > >    - Andy
> > > >
> > > > Problems worthy of attack prove their worth by hitting back. - Piet
> > > > Hein (via Tom White)
> > > >
> > >
> >
> >
> >
> > --
> > Jay Vyas
> > http://jayunit100.blogspot.com
> >
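
For readers following the thread: the caching behaviour described above for GGFS (read-through on a miss, and write-through to the underlying file system either synchronously or asynchronously) can be illustrated with a small toy model. This is only a sketch of the general technique, not GridGain code. The class name `CachingLayer`, its methods, and the use of a plain dict in place of HDFS are all assumptions made for illustration.

```python
import queue
import threading


class CachingLayer:
    """Toy in-memory cache in front of a slower backing store.

    Hypothetical illustration only: `backing` is a plain dict standing in
    for the underlying file system, and none of these names come from GGFS.
    """

    def __init__(self, backing, synchronous=True):
        self.backing = backing            # path -> bytes in the "slow" store
        self.synchronous = synchronous    # write-through mode
        self.memory = {}                  # path -> bytes held in memory
        self._pending = queue.Queue()     # writes waiting to be persisted
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def _drain(self):
        # Background thread: persist queued writes in arrival order.
        while True:
            path, data = self._pending.get()
            self.backing[path] = data
            self._pending.task_done()

    def read(self, path):
        # Read-through: on a miss, load from the backing store and keep a copy.
        if path not in self.memory:
            self.memory[path] = self.backing[path]
        return self.memory[path]

    def write(self, path, data):
        # Always update memory; persist now (sync) or later (async).
        self.memory[path] = data
        if self.synchronous:
            self.backing[path] = data
        else:
            self._pending.put((path, data))

    def flush(self):
        # Block until every queued asynchronous write has been persisted.
        self._pending.join()
```

In synchronous mode a write returns only after the backing store has been updated. In asynchronous mode it returns as soon as memory is updated, and `flush()` waits for the background thread to catch up. A real system would also need eviction, per-folder policies, and failure handling, none of which is modelled here.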
