Hello again,

I hope most of you have had time to read through my previous mail. It would mean a lot if you could answer those questions, at least in part.

Thanks,
Aditya

On Sat, Apr 1, 2017 at 5:17 AM, Aditya <adityasarma...@gmail.com> wrote:
> Hi everyone,
>
> I've been talking with Trevor over email, and he shared some documents
> with me. They contain content that he (along with a few others) was
> developing to make Mahout easily accessible to newbies like myself.
>
> I've gone through the planned blog posts titled "Why Mahout", "Getting
> Started with Mahout", "Algorithms Framework" and "Building Apache Mahout
> from Source", and I have a lot of questions. Since Trevor is on vacation
> and the deadline for final proposal submission is fast approaching, I
> thought I'd post my questions on the dev forum.
>
> So here goes the big list of my questions. I hope those of you who were
> or are involved in the development of these blog posts will be able to
> help me. Some of the questions are vague or abstract; I suggest you
> answer them as if you were explaining to a layman.
>
> 1. Could you explain the high-level structure of Mahout?
>
> 2. What are the plans in the pipeline for Mahout's development in the
> months to come?
>
> 3. How does contributing a new algorithm work in Mahout? When I was
> reading the doc "Getting Started with Mahout", the example implemented
> Ordinary Least Squares Regression in Samsara, Mahout's DSL. I had
> something different in mind before reading the blog posts. I had thought
> that I would be contributing the distributed algorithm to Mahout from
> scratch, written in Scala, and making it available as a package that
> Mahout users can import and use.
>
> 4. In general, is there a plan for all algorithms that become part of
> Mahout in the future to be written in Samsara only? If so, what will be
> the limitations and advantages of this decision?
>
> 5. What are the building blocks of Mahout that enable distributed
> processing? The blog post mentions the Distributed Row Matrix. Are there
> any other distributed data structures available? If not, won't the
> algorithms that can become part of the Mahout framework in the future be
> limited? I mean, what about algorithms that cannot be reduced to a
> linear algebra problem?
>
> 6. What is expected of a newbie in the community? What is the learning
> curve to become an active contributor to Mahout? Are there any specific
> books or blog posts that I can read that will make the process easier?
>
> 7. Also, could you give me some background on how the development of
> Mahout has been going? Not the motivation or inspiration that led to
> Mahout's conception, but something like what work has gone on between
> the previous release and the current release candidate.
>
> 8. What was the high-level motivation for developing Mahout's own DSL,
> Samsara?
>
> Regards,
> Aditya