Denis, I agree that at the moment there's no reason to split into fabric and hadoop editions.
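For context on the proposal discussed below, here is a minimal sketch of how the fabric binary's 'optional' folder works. The directory layout is a mock stand-in created in a temp dir (not a real Ignite installation), and the module name is just an example: modules under libs/optional are off the node classpath until they are copied into libs/.

```shell
# Mock stand-in for an Ignite fabric installation layout (assumption:
# a real install has the same libs/ and libs/optional/ structure).
IGNITE_HOME=$(mktemp -d)
mkdir -p "$IGNITE_HOME/libs/optional/ignite-hadoop"
touch "$IGNITE_HOME/libs/optional/ignite-hadoop/ignite-hadoop.jar"

# Modules under libs/optional are not picked up when a node starts;
# copying a module directory into libs/ puts its jars on the classpath.
cp -r "$IGNITE_HOME/libs/optional/ignite-hadoop" "$IGNITE_HOME/libs/"

# The module is now enabled alongside the core libs.
ls "$IGNITE_HOME/libs/ignite-hadoop"
```

This is why shipping the hadoop-accelerator libs under 'optional' in the standard fabric binary would let users opt in with a single copy, instead of maintaining a second edition.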
On Thu, Dec 1, 2016 at 4:45 AM, Denis Magda <dma...@apache.org> wrote:

> Hadoop Accelerator doesn't require any additional libraries compared to
> those we have in the fabric build. It only lacks some of them, as Val
> mentioned below.
>
> Wouldn't it be better to discontinue the Hadoop Accelerator edition and
> simply deliver the hadoop jar and its configs as a part of the fabric?
>
> —
> Denis
>
> > On Nov 27, 2016, at 3:12 PM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> >
> > The separate edition for the Hadoop Accelerator was primarily driven
> > by the default libraries. Hadoop Accelerator requires many more
> > libraries, as well as configuration settings, compared to the standard
> > fabric download.
> >
> > Now, as far as the Spark integration is concerned, I am not sure which
> > edition it belongs in, Hadoop Accelerator or standard fabric.
> >
> > D.
> >
> > On Sat, Nov 26, 2016 at 7:39 PM, Denis Magda <dma...@apache.org> wrote:
> >
> >> *Dmitriy*,
> >>
> >> I do believe that you should know why the community decided on a
> >> separate edition for the Hadoop Accelerator. What was the reason for
> >> that? Presently, as I see it, it brings more confusion and difficulty
> >> than benefit.
> >>
> >> —
> >> Denis
> >>
> >> On Nov 26, 2016, at 2:14 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>
> >> In fact, I very much agree with you. Right now, running the
> >> "accelerator" component in the Bigtop distro gives one a pretty much
> >> complete fabric anyway. But in order to make just an accelerator
> >> component, we perform quite a bit of voodoo magic during the packaging
> >> stage of the Bigtop build, shuffling jars from here and there. And
> >> that's quite crazy, honestly ;)
> >>
> >> Cos
> >>
> >> On Mon, Nov 21, 2016 at 03:33PM, Valentin Kulichenko wrote:
> >>
> >> I tend to agree with Denis.
> >> I see only these differences between the Hadoop Accelerator and
> >> Fabric builds (correct me if I missed something):
> >>
> >> - A limited set of available modules and no optional modules in
> >>   Hadoop Accelerator.
> >> - No ignite-hadoop module in Fabric.
> >> - Additional scripts, configs and instructions included in Hadoop
> >>   Accelerator.
> >>
> >> And the list of included modules frankly looks very weird. Here are
> >> only some of the issues I noticed:
> >>
> >> - ignite-indexing and ignite-spark are mandatory. Even if we need
> >>   them for Hadoop acceleration (which I doubt), are they really
> >>   required, or can they be optional?
> >> - We force the use of the ignite-log4j module without providing other
> >>   logger options (e.g., SLF4J).
> >> - We don't include the ignite-aws module. How does one use Hadoop
> >>   Accelerator with S3 discovery?
> >> - Etc.
> >>
> >> It seems to me that if we fix all these issues, there will be
> >> virtually no difference between the Fabric and Hadoop Accelerator
> >> builds except for a couple of scripts and config files. If so, there
> >> is no reason to have two builds.
> >>
> >> -Val
> >>
> >> On Mon, Nov 21, 2016 at 3:13 PM, Denis Magda <dma...@apache.org> wrote:
> >>
> >>> On a separate note, in Bigtop we have started looking into changing
> >>> the way we deliver Ignite, and we'll likely start offering the whole
> >>> 'data fabric' experience instead of the mere "hadoop-acceleration".
> >>
> >> And you still will be using the hadoop-accelerator libs of Ignite,
> >> right?
> >>
> >> I'm thinking about whether there is a need to keep releasing Hadoop
> >> Accelerator as a separate delivery. What if we start releasing the
> >> accelerator as a part of the standard fabric binary, putting the
> >> hadoop-accelerator libs under the 'optional' folder?
> >>
> >> —
> >> Denis
> >>
> >> On Nov 21, 2016, at 12:19 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>
> >> What Denis said: Spark has been added to the Hadoop Accelerator as a
> >> way to boost the performance of more than just the MR compute of the
> >> Hadoop stack, IIRC. For what it's worth, Spark is considered a part
> >> of Hadoop at large.
> >>
> >> On a separate note, in Bigtop we have started looking into changing
> >> the way we deliver Ignite, and we'll likely start offering the whole
> >> 'data fabric' experience instead of the mere "hadoop-acceleration".
> >>
> >> Cos
> >>
> >> On Mon, Nov 21, 2016 at 09:54AM, Denis Magda wrote:
> >>
> >> Val,
> >>
> >> The Ignite Hadoop module includes not only the map-reduce accelerator
> >> but the Ignite Hadoop File System component as well. The latter can
> >> be used in deployments like HDFS + IGFS + Ignite Spark + Spark.
> >>
> >> Considering this, I'm for the second solution you proposed: put both
> >> the 2.10 and 2.11 ignite-spark modules under the 'optional' folder of
> >> the Ignite Hadoop Accelerator distribution.
> >> https://issues.apache.org/jira/browse/IGNITE-4254
> >>
> >> BTW, this task may be affected by or related to the following ones:
> >> https://issues.apache.org/jira/browse/IGNITE-3596
> >> https://issues.apache.org/jira/browse/IGNITE-3822
> >>
> >> —
> >> Denis
> >>
> >> On Nov 19, 2016, at 1:26 PM, Valentin Kulichenko
> >> <valentin.kuliche...@gmail.com> wrote:
> >>
> >> Hadoop Accelerator is a plugin to Ignite, and this plugin is used by
> >> Hadoop when running its jobs. The ignite-spark module only provides
> >> IgniteRDD, which Hadoop obviously will never use.
> >>
> >> Is there another use case for Hadoop Accelerator that I'm missing?
> >>
> >> -Val
> >>
> >> On Sat, Nov 19, 2016 at 3:12 AM, Dmitriy Setrakyan
> >> <dsetrak...@apache.org> wrote:
> >>
> >> Why do you think that the spark module is not needed in our hadoop
> >> build?
> >>
> >> On Fri, Nov 18, 2016 at 5:44 PM, Valentin Kulichenko
> >> <valentin.kuliche...@gmail.com> wrote:
> >>
> >> Folks,
> >>
> >> Is there anyone who understands the purpose of including the
> >> ignite-spark module in the Hadoop Accelerator build? I can't figure
> >> out a use case for which it's needed.
> >>
> >> In case we actually need it there, there is an issue: we actually
> >> have two ignite-spark modules, for Scala 2.10 and 2.11. In the Fabric
> >> build everything is good, we put both in the 'optional' folder and
> >> the user can enable either one. But in Hadoop Accelerator there is
> >> only 2.11, which means that the build doesn't work with Scala 2.10
> >> out of the box.
> >>
> >> We should either remove the module from the build, or fix the issue.
> >>
> >> -Val

--
Sergey Kozlov
GridGain Systems
www.gridgain.com