I tend to agree with Denis. I see only these differences between the Hadoop
Accelerator and Fabric builds (correct me if I'm missing something):

   - Limited set of available modules and no optional modules in Hadoop
   Accelerator.
   - No ignite-hadoop module in Fabric.
   - Additional scripts, configs and instructions included in Hadoop
   Accelerator.

And the list of included modules frankly looks very weird. Here are just
some of the issues I noticed:

   - ignite-indexing and ignite-spark are mandatory. Even if we need them
   for Hadoop Acceleration (which I doubt), are they really required, or
   could they be optional?
   - We force users onto the ignite-log4j module without providing other
   logger options (e.g., SLF4J).
   - We don't include the ignite-aws module. How is one supposed to use the
   Hadoop Accelerator with S3 discovery?
   - Etc.
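For context on the ignite-aws point: S3-based discovery is configured through
the S3 IP finder that lives in the ignite-aws module, so without that module
on the classpath a config like the following sketch can't even be
instantiated (the bucket name is hypothetical, and AWS credentials are
omitted for brevity):

```xml
<!-- Sketch of S3 discovery; requires ignite-aws on the classpath. -->
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
      <property name="ipFinder">
        <!-- This class is packaged in the ignite-aws module. -->
        <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.s3.TcpDiscoveryS3IpFinder">
          <!-- Hypothetical bucket; credentials config omitted here. -->
          <property name="bucketName" value="my-discovery-bucket"/>
        </bean>
      </property>
    </bean>
  </property>
</bean>
```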

It seems to me that if we fix all these issues, there will be virtually no
difference between the Fabric and Hadoop Accelerator builds except a couple
of scripts and config files. If so, there is no reason to have two builds.

-Val

On Mon, Nov 21, 2016 at 3:13 PM, Denis Magda <dma...@apache.org> wrote:

> > On a separate note, in Bigtop we've started looking into changing the way we
> > deliver Ignite, and we'll likely start offering the whole 'data fabric'
> > experience instead of the mere "hadoop-acceleration".
>
> And you will still be using the hadoop-accelerator libs of Ignite, right?
>
> I’m wondering whether there is a need to keep releasing the Hadoop
> Accelerator as a separate delivery.
> What if we start releasing the accelerator as part of the standard
> fabric binary, putting the hadoop-accelerator libs under the ‘optional’ folder?
>
> —
> Denis
>
> > On Nov 21, 2016, at 12:19 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >
> > What Denis said: Spark has been added to the Hadoop accelerator as a way to
> > boost the performance of more than just MR compute of the Hadoop stack, IIRC.
> > For what it's worth, Spark is considered a part of Hadoop at large.
> >
> > On a separate note, in Bigtop we've started looking into changing the way we
> > deliver Ignite, and we'll likely start offering the whole 'data fabric'
> > experience instead of the mere "hadoop-acceleration".
> >
> > Cos
> >
> > On Mon, Nov 21, 2016 at 09:54AM, Denis Magda wrote:
> >> Val,
> >>
> >> The Ignite Hadoop module includes not only the map-reduce accelerator but
> >> also the Ignite Hadoop File System component. The latter can be used in
> >> deployments like HDFS + IGFS + Ignite Spark + Spark.
> >>
> >> Considering this I’m for the second solution proposed by you: put both the
> >> 2.10 and 2.11 ignite-spark modules under the ‘optional’ folder of the
> >> Ignite Hadoop Accelerator distribution.
> >> https://issues.apache.org/jira/browse/IGNITE-4254
> >>
> >> BTW, this task may be affected by or related to the following ones:
> >> https://issues.apache.org/jira/browse/IGNITE-3596
> >> https://issues.apache.org/jira/browse/IGNITE-3822
> >>
> >> —
> >> Denis
> >>
> >>> On Nov 19, 2016, at 1:26 PM, Valentin Kulichenko
> >>> <valentin.kuliche...@gmail.com> wrote:
> >>>
> >>> Hadoop Accelerator is a plugin to Ignite, and this plugin is used by
> >>> Hadoop when running its jobs. The ignite-spark module only provides
> >>> IgniteRDD, which Hadoop obviously will never use.
> >>>
> >>> Is there another use case for Hadoop Accelerator which I'm missing?
> >>>
> >>> -Val
> >>>
> >>> On Sat, Nov 19, 2016 at 3:12 AM, Dmitriy Setrakyan
> >>> <dsetrak...@apache.org> wrote:
> >>>
> >>>> Why do you think that the spark module is not needed in our Hadoop build?
> >>>>
> >>>> On Fri, Nov 18, 2016 at 5:44 PM, Valentin Kulichenko <
> >>>> valentin.kuliche...@gmail.com> wrote:
> >>>>
> >>>>> Folks,
> >>>>>
> >>>>> Is there anyone who understands the purpose of including the
> >>>>> ignite-spark module in the Hadoop Accelerator build? I can't figure out
> >>>>> a use case for which it's needed.
> >>>>>
> >>>>> If we actually do need it there, then there is an issue. We actually
> >>>>> have two ignite-spark modules, for 2.10 and 2.11. In the Fabric build
> >>>>> everything is fine: we put both in the 'optional' folder and the user
> >>>>> can enable either one. But the Hadoop Accelerator has only the 2.11 one,
> >>>>> which means that the build doesn't work with 2.10 out of the box.
> >>>>>
> >>>>> We should either remove the module from the build or fix the issue.
> >>>>>
> >>>>> -Val
> >>>>>
> >>>>
> >>
>
>
