On Tue, Apr 19, 2016 at 11:08 AM, Khurrum Nasim <[email protected]> wrote:
> Thank you Dmitriy.
>
> So is there an architectural blueprint for mahout ? What I mean is how
> can I get the 1000 feet overview ? Or the bird's-eye view of the project.
> I do see Mahout is very modularized - however I’m still trying to make
> heads or tails of it :)
>
> @Dmitriy -
> "my investigation points that there are architectural problems in spark that
> are hard to overcome at this point for high IO algorithms." - Can you
> share some more details about this - I’m just curious.
> Long story short - "Distributed != Scalable"
>
> > On Apr 18, 2016, at 8:18 PM, Dmitriy Lyubimov <[email protected]> wrote:
> >
> > Khurrum,
> >
> > mahout is so much a library at this point.
> >
> > if you mean if it can be used to build networks with 2d inputs, yes i did
> > some of that. multi-epoch SGD based systems should be easy enough to build,
> > and will probably have a reasonable performance -- although I think
> > dedicated CNN systems like Caffe would still run faster at this point. Full
> > batch trainers are somewhat slow for larger problems though, my
> > investigation points that there are architectural problems in spark that
> > are hard to overcome at this point for high IO algorithms.
> >
> > On Mon, Apr 18, 2016 at 11:49 AM, Khurrum Nasim <[email protected]>
> > wrote:
> >
> >> Hi Guys,
> >>
> >> Can Mahout be used for things like face detection ? Also which unit
> >> tests or integration tests do you recommend I should run just to get a
> >> better feel of the execution flow.
> >>
> >> I’m still slowly acclimating to the project. But hopefully should come up
> >> to speed soon.
> >>
> >> Many Thanks,
> >>
> >> Khurrum
> >>
> >>> On Mar 30, 2016, at 3:10 PM, Suneel Marthi <[email protected]> wrote:
> >>>
> >>> Thanks Khurrum for stepping up.
> >>>
> >>> You just need basic programming skills - Java/Scala to be able to
> >>> contribute. We can help you with the algorithms and linear algebra stuff.
> >>>
> >>> Welcome aboard !!
> >>>
> >>> On Wed, Mar 30, 2016 at 3:05 PM, Khurrum Nasim <[email protected]>
> >>> wrote:
> >>>
> >>>> Thanks for the advice Dmitriy. I’m already signed up on ASF jira. My
> >>>> handle is “nasimk”
> >>>>
> >>>> Do I need to be a linear algebra expert and/or math phd to contribute ?
> >>>> I have 10 plus years of computer programming experience. my background is
> >>>> comp sci.
> >>>>
> >>>> Khurrum
> >>>>
> >>>>> On Mar 30, 2016, at 2:57 PM, Dmitriy Lyubimov <[email protected]> wrote:
> >>>>>
> >>>>> PS You may also want to sign up with ASF Jira so we can assign issues to
> >>>>> yourself.
> >>>>>
> >>>>> On Wed, Mar 30, 2016 at 11:52 AM, Dmitriy Lyubimov <[email protected]>
> >>>>> wrote:
> >>>>>
> >>>>>> On Wed, Mar 30, 2016 at 11:43 AM, Khurrum Nasim <[email protected]>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Thanks Dmitriy.
> >>>>>>>
> >>>>>>> I'll take a look and see where I can start pitching in. Do I need
> >>>>>>> contributor access ? how would I create feature branch of my work ?
> >>>>>>>
> >>>>>>
> >>>>>> Khurrum,
> >>>>>>
> >>>>>> you only need github account. What you need is to create mahout's master
> >>>>>> fork in your github space and keep it in sync, as possible, with master as
> >>>>>> you go (by doing regular pulls). That way you have the most chance of
> >>>>>> having least conflicts possible.
> >>>>>>
> >>>>>> At any point in time (I recommend at perhaps when you feel you are about
> >>>>>> 50 to 70% done or just need a code advice), you can create a github pull
> >>>>>> request to the apache/mahout master. Make sure to include MAHOUT-XXX issue
> >>>>>> in the head of the pull request, that way ASF will automatically propagate
> >>>>>> code comments to jira, and so all discussion can be done entirely on github.
> >>>>>>
> >>>>>> Again, if you take on a significant contribution (such as a new numerical
> >>>>>> method contribution), I recommend to discuss the proposal on the @dev list
> >>>>>>
> >>>>>> thanks.
> >>>>>>
> >>>>>>>
> >>>>>>> Khurrum
> >>>>>>>
> >>>>>>>> On Mar 30, 2016, at 1:12 PM, Dmitriy Lyubimov <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>> Oh but of course! please do!
> >>>>>>>>
> >>>>>>>> You may work on any issue, this or any other of your choice, or even on any
> >>>>>>>> new issue you can think of (for sizeable contributions it is recommended to
> >>>>>>>> start discussion on the @dev list first though, to make sure to benefit
> >>>>>>>> from experience of others. Please file any new issue first to jira).
> >>>>>>>>
> >>>>>>>> On Wed, Mar 30, 2016 at 9:05 AM, shashi bushan dongur (JIRA) <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>>> [ https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218216#comment-15218216 ]
> >>>>>>>>>
> >>>>>>>>> shashi bushan dongur commented on MAHOUT-1788:
> >>>>>>>>> ----------------------------------------------
> >>>>>>>>>
> >>>>>>>>> Hello. I would like to start contributing to mahout. Can I work on this
> >>>>>>>>> issue?
> >>>>>>>>>
> >>>>>>>>>> spark-itemsimilarity integration test script cleanup
> >>>>>>>>>> ----------------------------------------------------
> >>>>>>>>>>
> >>>>>>>>>> Key: MAHOUT-1788
> >>>>>>>>>> URL: https://issues.apache.org/jira/browse/MAHOUT-1788
> >>>>>>>>>> Project: Mahout
> >>>>>>>>>> Issue Type: Improvement
> >>>>>>>>>> Components: cooccurrence
> >>>>>>>>>> Affects Versions: 0.11.0
> >>>>>>>>>> Reporter: Pat Ferrel
> >>>>>>>>>> Assignee: Pat Ferrel
> >>>>>>>>>> Priority: Trivial
> >>>>>>>>>> Fix For: 1.0.0
> >>>>>>>>>>
> >>>>>>>>>> binary release does not contain data for itemsimilarity tests, neither
> >>>>>>>>>> binary nor source versions will run on a cluster unless data is hand copied
> >>>>>>>>>> to hdfs.
> >>>>>>>>>> Clean this up so it copies data if needed and the data is in both
> >>>>>>>>>> versions.
> >>>>>>>>>
> >>>>>>>>> --
> >>>>>>>>> This message was sent by Atlassian JIRA
> >>>>>>>>> (v6.3.4#6332)
