Re: ignite-spark module in Hadoop Accelerator

Konstantin Boudnik Sat, 26 Nov 2016 13:17:12 -0800

In fact, I very much agree with you. Right now, running the "accelerator"
component in the Bigtop distro gives one a pretty much complete fabric anyway.
But in order to make just an accelerator component, we perform quite a bit of
voodoo magic during the packaging stage of the Bigtop build, shuffling jars
from here and there. And that's quite crazy, honestly ;)


Cos
 
On Mon, Nov 21, 2016 at 03:33PM, Valentin Kulichenko wrote:
> I tend to agree with Denis. I see only these differences between Hadoop
> Accelerator and Fabric builds (correct me if I miss something):
> 
>    - Limited set of available modules and no optional modules in Hadoop
>    Accelerator.
>    - No ignite-hadoop module in Fabric.
>    - Additional scripts, configs and instructions included in Hadoop
>    Accelerator.
> 
> And the list of included modules frankly looks very weird. Here are only
> some of the issues I noticed:
> 
>    - ignite-indexing and ignite-spark are mandatory. Even if we need them
>    for Hadoop Acceleration (which I doubt), are they really required, or can
>    they be optional?
>    - We force the use of the ignite-log4j module without providing other
>    logger options (e.g., SLF4J).
>    - We don't include ignite-aws module. How to use Hadoop Accelerator with
>    S3 discovery?
>    - Etc.
> 
> It seems to me that if we try to fix all these issues, there will be
> virtually no difference between the Fabric and Hadoop Accelerator builds
> except a couple of scripts and config files. If so, there is no reason to
> have two builds.
> 
> -Val
> 
> On Mon, Nov 21, 2016 at 3:13 PM, Denis Magda <[email protected]> wrote:
> 
> > > On a separate note, in Bigtop we are starting to look into changing
> > > the way we deliver Ignite, and we'll likely start offering the whole
> > > 'data fabric' experience instead of the mere "hadoop-acceleration".
> >
> > And you still will be using hadoop-accelerator libs of Ignite, right?
> >
> > I’m wondering whether there is a need to keep releasing Hadoop Accelerator
> > as a separate delivery.
> > What if we start releasing the accelerator as a part of the standard
> > fabric binary, putting the hadoop-accelerator libs under the ‘optional’
> > folder?
> >
> > —
> > Denis
> >
> > > On Nov 21, 2016, at 12:19 PM, Konstantin Boudnik <[email protected]> wrote:
> > >
> > > What Denis said: Spark was added to the Hadoop accelerator as a way to
> > > boost the performance of more than just the MR compute of the Hadoop
> > > stack, IIRC. For what it's worth, Spark is considered a part of Hadoop
> > > at large.
> > >
> > > On a separate note, in Bigtop we are starting to look into changing
> > > the way we deliver Ignite, and we'll likely start offering the whole
> > > 'data fabric' experience instead of the mere "hadoop-acceleration".
> > >
> > > Cos
> > >
> > > On Mon, Nov 21, 2016 at 09:54AM, Denis Magda wrote:
> > >> Val,
> > >>
> > >> The Ignite Hadoop module includes not only the map-reduce accelerator
> > >> but the Ignite Hadoop File System component as well. The latter can be
> > >> used in deployments like HDFS + IGFS + Ignite Spark + Spark.
> > >>
> > >> Considering this, I’m for the second solution you proposed: put both the
> > >> 2.10 and 2.11 ignite-spark modules under the ‘optional’ folder of the
> > >> Ignite Hadoop Accelerator distribution.
> > >> https://issues.apache.org/jira/browse/IGNITE-4254
> > >>
> > >> BTW, this task may be affected by or related to the following ones:
> > >> https://issues.apache.org/jira/browse/IGNITE-3596
> > >> https://issues.apache.org/jira/browse/IGNITE-3822
> > >>
> > >> —
> > >> Denis
> > >>
> > >>> On Nov 19, 2016, at 1:26 PM, Valentin Kulichenko <
> > [email protected]> wrote:
> > >>>
> > >>> Hadoop Accelerator is a plugin to Ignite, and this plugin is used by
> > >>> Hadoop when running its jobs. The ignite-spark module only provides
> > >>> IgniteRDD, which Hadoop obviously will never use.
> > >>>
> > >>> Is there another use case for Hadoop Accelerator which I'm missing?
> > >>>
> > >>> -Val
> > >>>
> > >>> On Sat, Nov 19, 2016 at 3:12 AM, Dmitriy Setrakyan <
> > [email protected]>
> > >>> wrote:
> > >>>
> > >>>> Why do you think that the Spark module is not needed in our Hadoop build?
> > >>>>
> > >>>> On Fri, Nov 18, 2016 at 5:44 PM, Valentin Kulichenko <
> > >>>> [email protected]> wrote:
> > >>>>
> > >>>>> Folks,
> > >>>>>
> > >>>>> Is there anyone who understands the purpose of including the
> > >>>>> ignite-spark module in the Hadoop Accelerator build? I can't figure
> > >>>>> out a use case for which it's needed.
> > >>>>>
> > >>>>> In case we actually need it there, there is an issue. We actually have
> > >>>>> two ignite-spark modules, for Scala 2.10 and 2.11. In the Fabric build
> > >>>>> everything is good: we put both in the 'optional' folder and the user
> > >>>>> can enable either one. But in Hadoop Accelerator there is only 2.11,
> > >>>>> which means that the build doesn't work with 2.10 out of the box.
> > >>>>>
> > >>>>> We should either remove the module from the build, or fix the issue.
> > >>>>>
> > >>>>> -Val
> > >>>>>
> > >>>>
> > >>
> >
> >
