Vovan, as one of the Hadoop maintainers, please share your point of view on this.
— Denis

> On Nov 30, 2016, at 10:49 PM, Sergey Kozlov <skoz...@gridgain.com> wrote:
>
> Denis
>
> I agree that at the moment there's no reason to split into fabric and
> hadoop editions.
>
> On Thu, Dec 1, 2016 at 4:45 AM, Denis Magda <dma...@apache.org> wrote:
>
>> Hadoop Accelerator doesn't require any additional libraries compared to
>> those we have in the fabric build. It only lacks some of them, as Val
>> mentioned below.
>>
>> Wouldn't it be better to discontinue the Hadoop Accelerator edition and
>> simply deliver the hadoop jar and its configs as part of the fabric?
>>
>> —
>> Denis
>>
>>> On Nov 27, 2016, at 3:12 PM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
>>>
>>> The separate edition for the Hadoop Accelerator was primarily driven by
>>> the default libraries. Hadoop Accelerator requires many more libraries,
>>> as well as configuration settings, compared to the standard fabric
>>> download.
>>>
>>> Now, as far as the Spark integration is concerned, I am not sure which
>>> edition it belongs in, Hadoop Accelerator or standard fabric.
>>>
>>> D.
>>>
>>> On Sat, Nov 26, 2016 at 7:39 PM, Denis Magda <dma...@apache.org> wrote:
>>>
>>>> *Dmitriy*,
>>>>
>>>> I do believe that you should know why the community decided on a
>>>> separate edition for the Hadoop Accelerator. What was the reason for
>>>> that? Presently, as I see it, it brings more confusion and difficulties
>>>> than benefit.
>>>>
>>>> —
>>>> Denis
>>>>
>>>> On Nov 26, 2016, at 2:14 PM, Konstantin Boudnik <c...@apache.org> wrote:
>>>>
>>>> In fact, I very much agree with you. Right now, running the
>>>> "accelerator" component in the Bigtop distro gives one a pretty much
>>>> complete fabric anyway. But in order to make just an accelerator
>>>> component, we perform quite a bit of voodoo magic during the packaging
>>>> stage of the Bigtop build, shuffling jars from here and there.
>>>> And that's quite crazy, honestly ;)
>>>>
>>>> Cos
>>>>
>>>> On Mon, Nov 21, 2016 at 03:33PM, Valentin Kulichenko wrote:
>>>>
>>>> I tend to agree with Denis. I see only these differences between the
>>>> Hadoop Accelerator and Fabric builds (correct me if I missed something):
>>>>
>>>> - A limited set of available modules and no optional modules in Hadoop
>>>>   Accelerator.
>>>> - No ignite-hadoop module in Fabric.
>>>> - Additional scripts, configs and instructions included in Hadoop
>>>>   Accelerator.
>>>>
>>>> And the list of included modules frankly looks very weird. Here are
>>>> only some of the issues I noticed:
>>>>
>>>> - ignite-indexing and ignite-spark are mandatory. Even if we need them
>>>>   for Hadoop acceleration (which I doubt), are they really required, or
>>>>   can they be optional?
>>>> - We force users to use the ignite-log4j module without providing other
>>>>   logger options (e.g., SLF4J).
>>>> - We don't include the ignite-aws module. How does one use the Hadoop
>>>>   Accelerator with S3 discovery?
>>>> - Etc.
>>>>
>>>> It seems to me that if we fix all these issues, there will be virtually
>>>> no difference between the Fabric and Hadoop Accelerator builds except a
>>>> couple of scripts and config files. If so, there is no reason to have
>>>> two builds.
>>>>
>>>> -Val
>>>>
>>>> On Mon, Nov 21, 2016 at 3:13 PM, Denis Magda <dma...@apache.org> wrote:
>>>>
>>>> On the separate note, in Bigtop we have started looking into changing
>>>> the way we deliver Ignite, and we'll likely start offering the whole
>>>> 'data fabric' experience instead of the mere "hadoop-acceleration".
>>>>
>>>> And you will still be using the hadoop-accelerator libs of Ignite,
>>>> right?
>>>>
>>>> I'm wondering whether there is a need to keep releasing the Hadoop
>>>> Accelerator as a separate delivery. What if we start releasing the
>>>> accelerator as part of the standard fabric binary, putting the
>>>> hadoop-accelerator libs under the 'optional' folder?
>>>> —
>>>> Denis
>>>>
>>>> On Nov 21, 2016, at 12:19 PM, Konstantin Boudnik <c...@apache.org> wrote:
>>>>
>>>> What Denis said: Spark has been added to the Hadoop accelerator as a
>>>> way to boost the performance of more than just the MR compute of the
>>>> Hadoop stack, IIRC. For what it's worth, Spark is considered a part of
>>>> Hadoop at large.
>>>>
>>>> On the separate note, in Bigtop we have started looking into changing
>>>> the way we deliver Ignite, and we'll likely start offering the whole
>>>> 'data fabric' experience instead of the mere "hadoop-acceleration".
>>>>
>>>> Cos
>>>>
>>>> On Mon, Nov 21, 2016 at 09:54AM, Denis Magda wrote:
>>>>
>>>> Val,
>>>>
>>>> The Ignite Hadoop module includes not only the map-reduce accelerator
>>>> but the Ignite Hadoop File System component as well. The latter can be
>>>> used in deployments like HDFS + IGFS + Ignite Spark + Spark.
>>>>
>>>> Considering this, I'm for the second solution you proposed: put both
>>>> the 2.10 and 2.11 ignite-spark modules under the 'optional' folder of
>>>> the Ignite Hadoop Accelerator distribution.
>>>> https://issues.apache.org/jira/browse/IGNITE-4254
>>>>
>>>> BTW, this task may be affected by or related to the following ones:
>>>> https://issues.apache.org/jira/browse/IGNITE-3596
>>>> https://issues.apache.org/jira/browse/IGNITE-3822
>>>>
>>>> —
>>>> Denis
>>>>
>>>> On Nov 19, 2016, at 1:26 PM, Valentin Kulichenko
>>>> <valentin.kuliche...@gmail.com> wrote:
>>>>
>>>> Hadoop Accelerator is a plugin to Ignite, and this plugin is used by
>>>> Hadoop when running its jobs. The ignite-spark module only provides
>>>> IgniteRDD, which Hadoop obviously will never use.
>>>>
>>>> Is there another use case for Hadoop Accelerator which I'm missing?
>>>> -Val
>>>>
>>>> On Sat, Nov 19, 2016 at 3:12 AM, Dmitriy Setrakyan
>>>> <dsetrak...@apache.org> wrote:
>>>>
>>>> Why do you think that the spark module is not needed in our hadoop
>>>> build?
>>>>
>>>> On Fri, Nov 18, 2016 at 5:44 PM, Valentin Kulichenko
>>>> <valentin.kuliche...@gmail.com> wrote:
>>>>
>>>> Folks,
>>>>
>>>> Is there anyone who understands the purpose of including the
>>>> ignite-spark module in the Hadoop Accelerator build? I can't figure out
>>>> a use case for which it's needed.
>>>>
>>>> In case we actually need it there, there is an issue. We actually have
>>>> two ignite-spark modules, for Scala 2.10 and 2.11. In the Fabric build
>>>> everything is good: we put both in the 'optional' folder and the user
>>>> can enable either one. But in the Hadoop Accelerator there is only
>>>> 2.11, which means that the build doesn't work with 2.10 out of the box.
>>>>
>>>> We should either remove the module from the build, or fix the issue.
>>>>
>>>> -Val
>
> --
> Sergey Kozlov
> GridGain Systems
> www.gridgain.com
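
For readers following the thread: the 'optional' folder mechanism that Val and Denis keep referring to can be sketched as below. This is a minimal mock stand-in for an unpacked Ignite binary distribution, not the real layout of any particular release (folder and module names here are illustrative); it only demonstrates the convention under discussion, namely that a module shipped under libs/optional/ stays off the node classpath until the user copies its folder one level up into libs/.

```shell
# Mock stand-in for an unpacked Ignite fabric distribution
# (in a real install, IGNITE_HOME points at the unpacked directory).
IGNITE_HOME=$(mktemp -d)
mkdir -p "$IGNITE_HOME/libs/optional/ignite-hadoop"
touch "$IGNITE_HOME/libs/optional/ignite-hadoop/ignite-hadoop.jar"

# Modules under libs/optional/ are not on the node classpath.
# Enabling one is a single copy of its folder up into libs/,
# after which the startup scripts pick up its jars:
cp -r "$IGNITE_HOME/libs/optional/ignite-hadoop" "$IGNITE_HOME/libs/"
```

Under Denis's proposal, enabling Hadoop acceleration from the fabric binary would become exactly this kind of one-step copy, the same way optional modules such as the two ignite-spark builds are enabled today.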