Dmitriy, I believe you should know why the community decided to create a separate edition for the Hadoop Accelerator. What was the reason for that? At present, as far as I can see, it brings more confusion and difficulty than benefit.
--
Denis

> On Nov 26, 2016, at 2:14 PM, Konstantin Boudnik <c...@apache.org> wrote:
>
> In fact, I very much agree with you. Right now, running the "accelerator"
> component in the Bigtop distro gives one a pretty much complete fabric
> anyway. But in order to make just an accelerator component, we perform
> quite a bit of voodoo magic during the packaging stage of the Bigtop build,
> shuffling jars from here and there. And that's quite crazy, honestly ;)
>
> Cos
>
> On Mon, Nov 21, 2016 at 03:33PM, Valentin Kulichenko wrote:
>> I tend to agree with Denis. I see only these differences between the
>> Hadoop Accelerator and Fabric builds (correct me if I missed something):
>>
>> - Limited set of available modules and no optional modules in Hadoop
>>   Accelerator.
>> - No ignite-hadoop module in Fabric.
>> - Additional scripts, configs and instructions included in Hadoop
>>   Accelerator.
>>
>> And the list of included modules frankly looks very weird. Here are only
>> some of the issues I noticed:
>>
>> - ignite-indexing and ignite-spark are mandatory. Even if we need them
>>   for Hadoop Acceleration (which I doubt), are they really required, or
>>   can they be optional?
>> - We force the use of the ignite-log4j module without providing other
>>   logger options (e.g., SLF4J).
>> - We don't include the ignite-aws module. How would one use Hadoop
>>   Accelerator with S3 discovery?
>> - Etc.
>>
>> It seems to me that if we try to fix all these issues, there will be
>> virtually no difference between the Fabric and Hadoop Accelerator builds
>> except a couple of scripts and config files. If so, there is no reason to
>> have two builds.
>>
>> -Val
>>
>> On Mon, Nov 21, 2016 at 3:13 PM, Denis Magda <dma...@apache.org> wrote:
>>
>>>> On a separate note, in Bigtop, we are starting to look into changing
>>>> the way we deliver Ignite, and we'll likely start offering the whole
>>>> 'data fabric' experience instead of the mere "hadoop-acceleration".
>>>
>>> And you will still be using the hadoop-accelerator libs of Ignite, right?
>>>
>>> I’m wondering whether there is a need to keep releasing Hadoop
>>> Accelerator as a separate delivery.
>>> What if we start releasing the accelerator as part of the standard
>>> fabric binary, putting the hadoop-accelerator libs under the ‘optional’
>>> folder?
>>>
>>> --
>>> Denis
>>>
>>>> On Nov 21, 2016, at 12:19 PM, Konstantin Boudnik <c...@apache.org> wrote:
>>>>
>>>> What Denis said: Spark has been added to the Hadoop accelerator as a
>>>> way to boost the performance of more than just the MR compute of the
>>>> Hadoop stack, IIRC. For what it's worth, Spark is considered a part of
>>>> Hadoop at large.
>>>>
>>>> On a separate note, in Bigtop, we are starting to look into changing
>>>> the way we deliver Ignite, and we'll likely start offering the whole
>>>> 'data fabric' experience instead of the mere "hadoop-acceleration".
>>>>
>>>> Cos
>>>>
>>>> On Mon, Nov 21, 2016 at 09:54AM, Denis Magda wrote:
>>>>> Val,
>>>>>
>>>>> The Ignite Hadoop module includes not only the map-reduce accelerator
>>>>> but the Ignite Hadoop File System component as well. The latter can be
>>>>> used in deployments like HDFS+IGFS+Ignite Spark + Spark.
>>>>>
>>>>> Considering this, I’m for the second solution you proposed: put both
>>>>> the 2.10 and 2.11 ignite-spark modules under the ‘optional’ folder of
>>>>> the Ignite Hadoop Accelerator distribution.
>>>>> https://issues.apache.org/jira/browse/IGNITE-4254
>>>>>
>>>>> BTW, this task may be affected by or related to the following ones:
>>>>> https://issues.apache.org/jira/browse/IGNITE-3596
>>>>> https://issues.apache.org/jira/browse/IGNITE-3822
>>>>>
>>>>> --
>>>>> Denis
>>>>>
>>>>>> On Nov 19, 2016, at 1:26 PM, Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:
>>>>>>
>>>>>> Hadoop Accelerator is a plugin to Ignite, and this plugin is used by
>>>>>> Hadoop when running its jobs. The ignite-spark module only provides
>>>>>> IgniteRDD, which Hadoop obviously will never use.
>>>>>>
>>>>>> Is there another use case for Hadoop Accelerator which I'm missing?
>>>>>>
>>>>>> -Val
>>>>>>
>>>>>> On Sat, Nov 19, 2016 at 3:12 AM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
>>>>>>
>>>>>>> Why do you think that the spark module is not needed in our hadoop
>>>>>>> build?
>>>>>>>
>>>>>>> On Fri, Nov 18, 2016 at 5:44 PM, Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Folks,
>>>>>>>>
>>>>>>>> Is there anyone who understands the purpose of including the
>>>>>>>> ignite-spark module in the Hadoop Accelerator build? I can't figure
>>>>>>>> out a use case for which it's needed.
>>>>>>>>
>>>>>>>> In case we actually need it there, there is an issue. We actually
>>>>>>>> have two ignite-spark modules, for 2.10 and 2.11. In the Fabric
>>>>>>>> build everything is good: we put both in the 'optional' folder and
>>>>>>>> the user can enable either one. But in Hadoop Accelerator there is
>>>>>>>> only 2.11, which means that the build doesn't work with 2.10 out of
>>>>>>>> the box.
>>>>>>>>
>>>>>>>> We should either remove the module from the build, or fix the issue.
>>>>>>>>
>>>>>>>> -Val