Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

Khurrum Nasim Tue, 19 Apr 2016 08:39:51 -0700

okay thanks - i’ll run those tests. i actually ran a few others as well like 
the MatrixWritableTest.


> On Apr 18, 2016, at 8:22 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> 
> I am not sure of your question about tests...
> 
> there are in-memory tests which you can run with 'mvn test' in /math-scala
> module; distributed tests are done per engine under 'spark', 'h2o' or
> 'flink' modules.
> 
> 
> On Mon, Apr 18, 2016 at 5:19 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
> 
>> i meant "not so much a library"
>> 
>> On Mon, Apr 18, 2016 at 5:18 PM, Dmitriy Lyubimov <dlie...@gmail.com>
>> wrote:
>> 
>>> Khurrum,
>>> 
>>> mahout is so much  a library at this point.
>>> 
>>> if you mean whether it can be used to build networks with 2d inputs, yes i did
>>> some of that. multi-epoch SGD based systems should be easy enough to build,
>>> and will probably have reasonable performance -- although I think
>>> dedicated CNN systems like Caffe would still run faster at this point. Full
>>> batch trainers are somewhat slow for larger problems though; my
>>> investigation suggests that there are architectural problems in spark that
>>> are hard to overcome at this point for high IO algorithms.
>>> 
>>> On Mon, Apr 18, 2016 at 11:49 AM, Khurrum Nasim <khurrum.na...@useitc.com> wrote:
>>> 
>>>> Hi Guys,
>>>> 
>>>> Can Mahout be used for things like face detection?  Also, which unit
>>>> tests or integration tests do you recommend I run to get a
>>>> better feel for the execution flow?
>>>> 
>>>> I’m still slowly acclimating to the project.  But hopefully should come
>>>> up to speed soon.
>>>> 
>>>> 
>>>> Many Thanks,
>>>> 
>>>> Khurrum
>>>> 
>>>> 
>>>> 
>>>> 
>>>>> On Mar 30, 2016, at 3:10 PM, Suneel Marthi <smar...@apache.org> wrote:
>>>>> 
>>>>> Thanks Khurrum for stepping up.
>>>>> 
>>>>> You just need basic programming skills - Java/Scala to be able to
>>>>> contribute. We can help you with the algorithms and linear algebra
>>>>> stuff.
>>>>> 
>>>>> 
>>>>> Welcome aboard !!
>>>>> 
>>>>> 
>>>>> On Wed, Mar 30, 2016 at 3:05 PM, Khurrum Nasim <khurrum.na...@useitc.com>
>>>>> wrote:
>>>>> 
>>>>>> Thanks for the advice, Dmitriy.  I’m already signed up on ASF jira.  My
>>>>>> handle is “nasimk”.
>>>>>> 
>>>>>> Do I need to be a linear algebra expert and/or a math PhD to contribute?
>>>>>> I have 10-plus years of computer programming experience.  My background is
>>>>>> comp sci.
>>>>>> 
>>>>>> Khurrum
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Mar 30, 2016, at 2:57 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:
>>>>>>> 
>>>>>>> PS You may also want to sign up with ASF Jira so we can assign issues to
>>>>>>> you.
>>>>>>> 
>>>>>>> On Wed, Mar 30, 2016 at 11:52 AM, Dmitriy Lyubimov <dlie...@gmail.com>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Wed, Mar 30, 2016 at 11:43 AM, Khurrum Nasim <khurrum.na...@useitc.com>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Thanks, Dmitriy.
>>>>>>>>> 
>>>>>>>>> I’ll take a look and see where I can start pitching in.  Do I need
>>>>>>>>> contributor access?  How would I create a feature branch for my work?
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> Khurrum,
>>>>>>>> 
>>>>>>>> you only need a github account. What you need is to create a fork of
>>>>>>>> mahout's master in your github space and keep it in sync with master, as
>>>>>>>> far as possible, as you go (by doing regular pulls). That way you have the
>>>>>>>> best chance of having as few conflicts as possible.
>>>>>>>> 
>>>>>>>> At any point in time (I recommend perhaps when you feel you are about
>>>>>>>> 50 to 70% done or just need code advice), you can create a github pull
>>>>>>>> request to the apache/mahout master. Make sure to include the MAHOUT-XXX
>>>>>>>> issue in the head of the pull request; that way ASF will automatically
>>>>>>>> propagate code comments to jira, and so all discussion can be done
>>>>>>>> entirely on github.
>>>>>>>> 
>>>>>>>> Again, if you take on a significant contribution (such as a new numerical
>>>>>>>> method contribution), I recommend discussing the proposal on the @dev
>>>>>>>> list.
>>>>>>>> 
>>>>>>>> thanks.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Khurrum
>>>>>>>>> 
>>>>>>>>>> On Mar 30, 2016, at 1:12 PM, Dmitriy Lyubimov <dlie...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Oh but of course! please do!
>>>>>>>>>> 
>>>>>>>>>> You may work on any issue, this or any other of your choice, or even on
>>>>>>>>>> any new issue you can think of (for sizeable contributions it is
>>>>>>>>>> recommended to start discussion on the @dev list first though, to make
>>>>>>>>>> sure to benefit from experience of others. Please file any new issue
>>>>>>>>>> first to jira).
>>>>>>>>>> 
>>>>>>>>>> On Wed, Mar 30, 2016 at 9:05 AM, shashi bushan dongur (JIRA) <
>>>>>>>>>> j...@apache.org> wrote:
>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> [ https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15218216#comment-15218216 ]
>>>>>>>>>>> 
>>>>>>>>>>> shashi bushan dongur commented on MAHOUT-1788:
>>>>>>>>>>> ----------------------------------------------
>>>>>>>>>>> 
>>>>>>>>>>> Hello. I would like to start contributing to mahout. Can I work on this
>>>>>>>>>>> issue?
>>>>>>>>>>> 
>>>>>>>>>>>> spark-itemsimilarity integration test script cleanup
>>>>>>>>>>>> ----------------------------------------------------
>>>>>>>>>>>> 
>>>>>>>>>>>>             Key: MAHOUT-1788
>>>>>>>>>>>>             URL: https://issues.apache.org/jira/browse/MAHOUT-1788
>>>>>>>>>>>>         Project: Mahout
>>>>>>>>>>>>      Issue Type: Improvement
>>>>>>>>>>>>      Components: cooccurrence
>>>>>>>>>>>> Affects Versions: 0.11.0
>>>>>>>>>>>>        Reporter: Pat Ferrel
>>>>>>>>>>>>        Assignee: Pat Ferrel
>>>>>>>>>>>>        Priority: Trivial
>>>>>>>>>>>>         Fix For: 1.0.0
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> binary release does not contain data for itemsimilarity tests; neither
>>>>>>>>>>>> binary nor source versions will run on a cluster unless data is
>>>>>>>>>>>> hand-copied to hdfs.
>>>>>>>>>>>> Clean this up so it copies data if needed and the data is in both
>>>>>>>>>>>> versions.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> This message was sent by Atlassian JIRA
>>>>>>>>>>> (v6.3.4#6332)
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>> 
>>
