Thank you for your suggestions; I am also considering Spark. Actually, I was hoping to compare the speed of Mahout's (MapReduce) and MLlib's (Spark) implementations of the LDA algorithm, but I am not sure whether MLlib's implementation is already available in the current version. I hope I will at least be able to try one of the implementations. Anyway, I don't want to spam your developer mailing list with this :-)
--David

On Thu, Mar 12, 2015 at 6:21 PM, Konstantin Boudnik <[email protected]> wrote:

> And speaking from my former academic background - it never hurts if your thesis is sexy. And Spark is quite hot at the moment ;)
>
> Cos
>
> On Thu, Mar 12, 2015 at 01:15PM, jay vyas wrote:
> > @David i like rj's idea on considering mllib, which is something which is guaranteed to be bigtop supported ! possibly consider that as an option if you want to build your thesis on bigtop
> >
> > On Thu, Mar 12, 2015 at 12:52 PM, Konstantin Boudnik <[email protected]> wrote:
> >
> > > On Thu, Mar 12, 2015 at 06:04AM, jay vyas wrote:
> > > > Hi david !
> > > >
> > > > We found that mahout 0.9 , iirc, was released incompatible with Yarn at the time, and there wasn't any commandline option that you could run when compiling which fixed that issue. So that really made us realize we needed community to participate with us.
> > > >
> > > > 1) I've reached out to the mahout community, and maybe they will join forces with us before it is dropped, but for us, we simply have too many other priorities and nobody from the mahout community was interested in collaborating with us on package testing in bigtop... So much like fedora, debian, and so , once the curators of the have no interest in packaging it, it becomes hard to keep in the distro.
> > > >
> > > > 2) Are you interested in maintaining mahout packaging in bigtop? That might be a nice addition to your thesis . It also would give you some interesting insight into the libraries that mahout uses, and how it uses hadoop APIs, etc... I'd be able to help you get up to speed with the basics of building bigtop if you have that interest.
> > > >
> > > > 3) RE: W/o bigtop, you can always build/compile/install mahout from source or from tarballs if need be.
> > > > however this tends to be an annoying thing to maintain and manually make sure it interoperates with your yarn distro etc....
> > >
> > > Not to say that the same compatibility issue between Hadoop 2.x and Mahout will still be there when you build it yourself.
> > >
> > > Cos
> > >
> > > > On Thu, Mar 12, 2015 at 1:57 AM, David Starina <[email protected]> wrote:
> > > >
> > > > > Hi guys,
> > > > >
> > > > > I'm just an observer, a passer-by you might say (for now) of this mailing list, so I hope you won't mind me commenting on this. I was planning to use Hadoop with Mahout in my thesis, so this thread kind of freaked me out. Since you are mentioning the two pieces of software are incompatible - does that mean it is not possible to get them to work together, or just that it requires some extra effort? Also, there are some algorithms that work with Spark - do you know whether those still work with recent versions of Spark? Is there a lot of work to manually install Mahout without Bigtop?
> > > > >
> > > > > Anyhow, hope the Mahout guys find their focus again.
> > > > >
> > > > > Best regards,
> > > > > David
> > > > >
> > > > > On Thursday, March 12, 2015, jay vyas <[email protected]> wrote:
> > > > >
> > > > >> okay, let's drop it... I'm fine with that.
> > > > >>
> > > > >> On Wed, Mar 11, 2015 at 7:49 PM, Konstantin Boudnik <[email protected]> wrote:
> > > > >>
> > > > >>> But the last time, back in 0.8, we found that runtime is pretty broken. So, is there any real reason to keep on pushing an incompatible piece of software?
> > > > >>>
> > > > >>> Cos
> > > > >>>
> > > > >>> On Tue, Mar 10, 2015 at 09:42AM, jay vyas wrote:
> > > > >>> > At this point we can just keep packaging as is, but if bugs crop up, drop it unless we can get help
> > > > >>> > On Mon, Mar 9, 2015 at 11:49 PM, Konstantin Boudnik <[email protected]> wrote:
> > > > >>> >
> > > > >>> > Should read
> > > > >>> >
> > > > >>> > So, anyone is interested to maintain Mahout OR a thing of similar nature?
> > > > >>> >
> > > > >>> > Sorry
> > > > >>> > On Mon, Mar 09, 2015 at 08:45PM, Konstantin Boudnik wrote:
> > > > >>> > > So, anyone is interested to maintain Mahout and a thing of similar nature?
> > > > >>> > >
> > > > >>> > > Cos
> > > > >>> > >
> > > > >>> > > On Sat, Mar 07, 2015 at 02:13AM, Konstantin Boudnik wrote:
> > > > >>> > > > I think it eventually boils down to who will be maintaining the component.
> > > > >>> > > >
> > > > >>> > > > As Jay said - there's maintainer for the component and if it will continue like this we might have no choice but delete it: I think right now it blocks the release.
> > > > >>> > > >
> > > > >>> > > > Cos
> > > > >>> > > >
> > > > >>> > > > On Fri, Mar 06, 2015 at 02:29PM, Ed - 0x1b wrote:
> > > > >>> > > > > some links to some of Mahout's replacements - not all Apache projects.
> > > > >>> > > > >
> > > > >>> > > > > https://gigaom.com/2014/03/27/apache-mahout-hadoops-original-machine-learning-project-is-moving-on-from-mapreduce/
> > > > >>> > > > > http://0xdata.com/
> > > > >>> > > > > https://spark.apache.org/mllib/
> > > > >>> > > > > https://databricks.com/blog/2014/06/30/sparkling-water-h20-spark.html
> > > > >>> > > > > https://github.com/apache/mahout/tree/master/h2o
> > > > >>> > > > >
> > > > >>> > > > > and
> > > > >>> > > > >
> > > > >>> > > > > https://gigaom.com/2014/02/28/cloudera-is-rebuilding-machine-learning-for-hadoop-with-oryx/
> > > > >>> > > > >
> > > > >>> > > > > On Fri, Mar 6, 2015 at 12:47 PM, Konstantin Boudnik <[email protected]> wrote:
> > > > >>> > > > > > Thanks man! I've heard that there's a new project that picks up where Mahout left off wrt Hadoop2.x support. But might be I am just delusional from hunger...?
> > > > >>> > > > > >
> > > > >>> > > > > > On Fri, Mar 06, 2015 at 02:32PM, jay vyas wrote:
> > > > >>> > > > > >> i sent an email to mahout-dev... maybe someone will ping back :)
> > > > >>> > > > > >> On Fri, Mar 6, 2015 at 2:25 PM, Jay Vyas <[email protected]> wrote:
> > > > >>> > > > > >>
> > > > >>> > > > > >> Iirc we don't have any maintainers for it.
> > > > >>> > > > > >> Is anyone interested in maintaining it?
> > > > >>> > > > > >> > On Mar 6, 2015, at 2:23 PM, Konstantin Boudnik <[email protected]> wrote:
> > > > >>> > > > > >> >
> > > > >>> > > > > >> > Does anyone know what's the story with Mahout?
> > > > >>> > > > > >> > Has it been fixed to be working with Hadoop2 or shall we remove it from the BOM?
> > > > >>> > > > > >> >
> > > > >>> > > > > >> > Cos
> > > > >>> > > > > >> >
> > > > >>> > > > > >> >> On Sat, Feb 28, 2015 at 06:56PM, Konstantin Boudnik wrote:
> > > > >>> > > > > >> >> Guys,
> > > > >>> > > > > >> >>
> > > > >>> > > > > >> >> It'd be great if we can have the next release ready by ApacheCon in April. Think about all the PR and publicity we can get without any effort on our own. And perhaps from the tactical standpoint we shall call this release 1.0?
> > > > >>> > > > > >> >>
> > > > >>> > > > > >> >> I believe the only major hurdle between us and the release is CI. Roman, I understand you're busy elsewhere, but could you please let us know what else needs to be done before we can start doing the regular builds and how the community can help. That's the highest priority, IMO.
> > > > >>> > > > > >> >>
> > > > >>> > > > > >> >> There a couple of the tickets left unfixed/unassigned on BIGTOP-1480, and if they aren't resolved on time we can move them farther.
> > > > >>> > > > > >> >> There's lesser than a half-dozen blockers and none of them look too big, honestly. And we have a whole lot of active committers and contributors to wrap-up the release in a couple of weeks.
> > > > >>> > > > > >> >>
> > > > >>> > > > > >> >> Do we want to try upgrade to HBase 1.x for this release or it might be too big of a distortion? Andrew, what do you think and do you have cycles to do that?
> > > > >>> > > > > >> >>
> > > > >>> > > > > >> >> What else we need to get done for this release? Suggestions?
> > > > >>> > > > > >> >>
> > > > >>> > > > > >> >> Is there anyone who wants to step up as the RM this time around? RM doesn't mean that you have to do all the job, but rather be an efficient with a stick ;)
> > > > >>> > > > > >> >>
> > > > >>> > > > > >> >> Thoughts?
> > > > >>> > > > > >> >> Cos
> > > > >>> > > > > >>
> > > > >>> > > > > >> --
> > > > >>> > > > > >> jay vyas
> > > > >>> >
> > > > >>> > --
> > > > >>> > jay vyas
> > > > >>
> > > > >> --
> > > > >> jay vyas
> > > >
> > > > --
> > > > jay vyas
> >
> > --
> > jay vyas
