Guys, I just downloaded the Hadoop Accelerator, and here are the differences from the fabric edition that jump out at me right away:
- the "bin/" folder has "setup-hadoop" scripts
- the "config/" folder has a "hadoop" subfolder with the necessary Hadoop-related configuration
- the "lib/" folder has far fewer libraries than in fabric, simply because many dependencies don't make sense in a Hadoop environment

I currently don't see how we can merge the Hadoop Accelerator with the standard fabric edition.

D.

On Thu, Dec 1, 2016 at 9:54 AM, Denis Magda <dma...@apache.org> wrote:

> Vovan,
>
> As one of hadoop maintainers, please share your point of view on this.
>
> --
> Denis
>
> > On Nov 30, 2016, at 10:49 PM, Sergey Kozlov <skoz...@gridgain.com> wrote:
> >
> > Denis
> >
> > I agree that at the moment there's no reason to split into fabric and
> > hadoop editions.
> >
> > On Thu, Dec 1, 2016 at 4:45 AM, Denis Magda <dma...@apache.org> wrote:
> >
> >> Hadoop Accelerator doesn’t require any additional libraries in compare to
> >> those we have in the fabric build. It only lacks some of them as Val
> >> mentioned below.
> >>
> >> Wouldn’t it better to discontinue Hadoop Accelerator edition and simply
> >> deliver hadoop jar and its configs as a part of the fabric?
> >>
> >> --
> >> Denis
> >>
> >>> On Nov 27, 2016, at 3:12 PM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> >>>
> >>> Separate edition for the Hadoop Accelerator was primarily driven by the
> >>> default libraries. Hadoop Accelerator requires many more libraries as well
> >>> as configuration settings compared to the standard fabric download.
> >>>
> >>> Now, as far as spark integration is concerned, I am not sure which edition
> >>> it belongs in, Hadoop Accelerator or standard fabric.
> >>>
> >>> D.
> >>>
> >>> On Sat, Nov 26, 2016 at 7:39 PM, Denis Magda <dma...@apache.org> wrote:
> >>>
> >>>> *Dmitriy*,
> >>>>
> >>>> I do believe that you should know why the community decided to a separate
> >>>> edition for the Hadoop Accelerator. What was the reason for that?
> >>>> Presently, as I see, it brings more confusion and difficulties rather than
> >>>> benefit.
> >>>>
> >>>> --
> >>>> Denis
> >>>>
> >>>> On Nov 26, 2016, at 2:14 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>
> >>>> In fact I very much agree with you. Right now, running the "accelerator"
> >>>> component in Bigtop distro gives one a pretty much complete fabric anyway.
> >>>> But in order to make just an accelerator component we perform quite a bit of
> >>>> voodoo magic during the packaging stage of the Bigtop build, shuffling jars
> >>>> from here and there. And that's quite crazy, honestly ;)
> >>>>
> >>>> Cos
> >>>>
> >>>> On Mon, Nov 21, 2016 at 03:33PM, Valentin Kulichenko wrote:
> >>>>
> >>>> I tend to agree with Denis. I see only these differences between Hadoop
> >>>> Accelerator and Fabric builds (correct me if I miss something):
> >>>>
> >>>> - Limited set of available modules and no optional modules in Hadoop
> >>>>   Accelerator.
> >>>> - No ignite-hadoop module in Fabric.
> >>>> - Additional scripts, configs and instructions included in Hadoop
> >>>>   Accelerator.
> >>>>
> >>>> And the list of included modules frankly looks very weird. Here are only
> >>>> some of the issues I noticed:
> >>>>
> >>>> - ignite-indexing and ignite-spark are mandatory. Even if we need them
> >>>>   for Hadoop Acceleration (which I doubt), are they really required or can
> >>>>   they be optional?
> >>>> - We force the use of the ignite-log4j module without providing other logger
> >>>>   options (e.g., SLF4J).
> >>>> - We don't include ignite-aws module. How to use Hadoop Accelerator with
> >>>>   S3 discovery?
> >>>> - Etc.
> >>>>
> >>>> It seems to me that if we try to fix all these issues, there will be
> >>>> virtually no difference between Fabric and Hadoop Accelerator builds except
> >>>> a couple of scripts and config files. If so, there is no reason to have two
> >>>> builds.
> >>>>
> >>>> -Val
> >>>>
> >>>> On Mon, Nov 21, 2016 at 3:13 PM, Denis Magda <dma...@apache.org> wrote:
> >>>>
> >>>>> On the separate note, in the Bigtop, we start looking into changing the
> >>>>> way we deliver Ignite and we're likely to start offering the whole 'data
> >>>>> fabric' experience instead of the mere "hadoop-acceleration".
> >>>>
> >>>> And you still will be using hadoop-accelerator libs of Ignite, right?
> >>>>
> >>>> I’m wondering whether there is a need to keep releasing Hadoop Accelerator as
> >>>> a separate delivery.
> >>>> What if we start releasing the accelerator as a part of the standard
> >>>> fabric binary putting hadoop-accelerator libs under ‘optional’ folder?
> >>>>
> >>>> --
> >>>> Denis
> >>>>
> >>>> On Nov 21, 2016, at 12:19 PM, Konstantin Boudnik <c...@apache.org> wrote:
> >>>>
> >>>> What Denis said: spark has been added to the Hadoop accelerator as a way to
> >>>> boost the performance of more than just MR compute of the Hadoop stack,
> >>>> IIRC.
> >>>>
> >>>> For what it's worth, Spark is considered a part of Hadoop at large.
> >>>>
> >>>> On the separate note, in the Bigtop, we start looking into changing the
> >>>> way we deliver Ignite and we're likely to start offering the whole 'data
> >>>> fabric' experience instead of the mere "hadoop-acceleration".
> >>>>
> >>>> Cos
> >>>>
> >>>> On Mon, Nov 21, 2016 at 09:54AM, Denis Magda wrote:
> >>>>
> >>>> Val,
> >>>>
> >>>> Ignite Hadoop module includes not only the map-reduce accelerator but
> >>>> Ignite Hadoop File System component as well. The latter can be used in
> >>>> deployments like HDFS+IGFS+Ignite Spark + Spark.
> >>>>
> >>>> Considering this I’m for the second solution proposed by you: put both
> >>>> 2.10 and 2.11 ignite-spark modules under ‘optional’ folder of Ignite Hadoop
> >>>> Accelerator distribution.
> >>>> https://issues.apache.org/jira/browse/IGNITE-4254
> >>>>
> >>>> BTW, this task may be affected or related to the following ones:
> >>>> https://issues.apache.org/jira/browse/IGNITE-3596
> >>>> https://issues.apache.org/jira/browse/IGNITE-3822
> >>>>
> >>>> --
> >>>> Denis
> >>>>
> >>>> On Nov 19, 2016, at 1:26 PM, Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:
> >>>>
> >>>> Hadoop Accelerator is a plugin to Ignite and this plugin is used by Hadoop
> >>>> when running its jobs. ignite-spark module only provides IgniteRDD which
> >>>> Hadoop obviously will never use.
> >>>>
> >>>> Is there another use case for Hadoop Accelerator which I'm missing?
> >>>>
> >>>> -Val
> >>>>
> >>>> On Sat, Nov 19, 2016 at 3:12 AM, Dmitriy Setrakyan <dsetrak...@apache.org> wrote:
> >>>>
> >>>> Why do you think that spark module is not needed in our hadoop build?
> >>>>
> >>>> On Fri, Nov 18, 2016 at 5:44 PM, Valentin Kulichenko <valentin.kuliche...@gmail.com> wrote:
> >>>>
> >>>> Folks,
> >>>>
> >>>> Is there anyone who understands the purpose of including ignite-spark
> >>>> module in the Hadoop Accelerator build? I can't figure out a use case for
> >>>> which it's needed.
> >>>>
> >>>> In case we actually need it there, there is an issue then. We actually have
> >>>> two ignite-spark modules, for 2.10 and 2.11.
> >>>> In Fabric build everything is good, we put both in 'optional' folder and
> >>>> user can enable either one. But in Hadoop Accelerator there is only 2.11
> >>>> which means that the build doesn't work with 2.10 out of the box.
> >>>>
> >>>> We should either remove the module from the build, or fix the issue.
> >>>>
> >>>> -Val
> >
> > --
> > Sergey Kozlov
> > GridGain Systems
> > www.gridgain.com