[jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-12-18 Thread Andrew Palumbo (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15760262#comment-15760262
 ] 

Andrew Palumbo commented on MAHOUT-1788:


[~pferrel], [~shashidongur] Is there anything to be done here?  Can we close it 
out or should we bump it to 0.14/1.0.0?

> spark-itemsimilarity integration test script cleanup
>
> Key: MAHOUT-1788
> URL: https://issues.apache.org/jira/browse/MAHOUT-1788
> Project: Mahout
> Issue Type: Improvement
> Components: cooccurrence
> Affects Versions: 0.11.0
> Reporter: Pat Ferrel
> Assignee: Pat Ferrel
> Priority: Trivial
> Fix For: 1.0.0
>
> The binary release does not contain data for the itemsimilarity tests, and
> neither the binary nor the source version will run on a cluster unless the
> data is hand-copied to HDFS.
> Clean this up so it copies data if needed and the data is in both versions.
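
The "copies data if needed" step in the description above could be sketched roughly as follows. This is only an illustration in Python, not the actual Mahout test script: the function name `ensure_test_data`, the injectable `run` parameter, and the paths in the usage note are all hypothetical. It relies on the Hadoop shell commands `hdfs dfs -test -e` (exit code 0 when the path exists) and `hdfs dfs -put`.

```python
import subprocess
from typing import Callable, Sequence

# A runner takes a command as an argv list and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Default runner: execute the command and return its exit code."""
    return subprocess.call(list(argv))


def ensure_test_data(local_path: str, hdfs_path: str,
                     run: Runner = run_command) -> bool:
    """Copy local_path to hdfs_path unless it is already there.

    Returns True if a copy was performed, False if the data already existed.
    (Hypothetical helper, not part of Mahout.)
    """
    # `hdfs dfs -test -e PATH` exits with 0 when PATH exists.
    if run(["hdfs", "dfs", "-test", "-e", hdfs_path]) == 0:
        return False
    # `hdfs dfs -put SRC DST` uploads a local file or directory.
    if run(["hdfs", "dfs", "-put", local_path, hdfs_path]) != 0:
        raise RuntimeError(f"could not copy {local_path} to {hdfs_path}")
    return True
```

A test script would call something like `ensure_test_data("data/input.csv", "/tmp/input.csv")` (placeholder paths) before launching the job. Passing the runner as a parameter keeps the check testable on a machine without a Hadoop installation.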



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-10-13 Thread Suneel Marthi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15574059#comment-15574059
 ] 

Suneel Marthi commented on MAHOUT-1788:
---

Is someone still working on this? 



Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-04-19 Thread Khurrum Nasim
Okay, thanks - I’ll run those tests. I actually ran a few others as well, like
MatrixWritableTest.


Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-04-19 Thread Suneel Marthi
On Tue, Apr 19, 2016 at 11:08 AM, Khurrum Nasim wrote:

> Thank you Dimitry.
>
> So is there an architectural blueprint for Mahout? What I mean is, how
> can I get the 1,000-foot overview, or the bird's-eye view, of the project?
> I do see Mahout is very modularized - however, I’m still trying to make
> heads or tails of it :)
>
> @Dimitry -
> "my investigation points that there are architectural problems in spark
> that are hard to overcome at this point for high IO algorithms." - Can you
> share some more details about this? I’m just curious.
>

Long story short - "Distributed != Scalable"


Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-04-18 Thread Dmitriy Lyubimov
I am not sure of your question about tests...

There are in-memory tests, which you can run with 'mvn test' in the math-scala
module; distributed tests are done per engine under the 'spark', 'h2o' or
'flink' modules.



Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-04-18 Thread Dmitriy Lyubimov
i meant "not so much a library"


Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-04-18 Thread Dmitriy Lyubimov
Khurrum,

mahout is so much  a library at this point.

If you mean whether it can be used to build networks with 2D inputs, yes, I did
some of that. Multi-epoch SGD-based systems should be easy enough to build
and will probably have reasonable performance, although I think
dedicated CNN systems like Caffe would still run faster at this point.
Full-batch trainers are somewhat slow for larger problems, though; my
investigation suggests that there are architectural problems in Spark that
are hard to overcome at this point for high-IO algorithms.


Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-04-18 Thread Khurrum Nasim
Hi Guys,

Can Mahout be used for things like face detection? Also, which unit tests or
integration tests do you recommend I run to get a better feel for
the execution flow?

I’m still slowly acclimating to the project, but hopefully I should come up to
speed soon.


Many Thanks,

Khurrum




> On Mar 30, 2016, at 3:10 PM, Suneel Marthi  wrote:
> 
> Thanks Khurrum for stepping up.
> 
> You just need basic programming skills - Java/Scala to be able to
> contribute. We can help you with the algorithms and linear algebra stuff.
> 
> 
> Welcome aboard !!
> 
> 
> On Wed, Mar 30, 2016 at 3:05 PM, Khurrum Nasim wrote:
>
>> Thanks for the advice Dimitry.  I’m already signed up on ASF jira. My
>> handle is “nasimk”
>>
>> Do I need to be a linear algebra expert and/or math PhD to contribute?
>> I have 10-plus years of computer programming experience. My background is
>> comp sci.
>>
>> Khurrum
>>
>>> On Mar 30, 2016, at 2:57 PM, Dmitriy Lyubimov wrote:
>>>
>>> PS You may also want to sign up with ASF Jira so we can assign issues to
>>> yourself.
>>>
>>> On Wed, Mar 30, 2016 at 11:52 AM, Dmitriy Lyubimov wrote:
>>>
>>>> On Wed, Mar 30, 2016 at 11:43 AM, Khurrum Nasim <khurrum.na...@useitc.com>
>>>> wrote:
>>>>
>>>>> Thanks Dimirtry.
>>>>>
>>>>> I'll take a look and see where I can start pitching in. Do I need
>>>>> contributor access? How would I create a feature branch of my work?
>>>>
>>>> Khurrum,
>>>>
>>>> you only need a github account. What you need is to create a fork of
>>>> mahout's master in your github space and keep it in sync, as far as
>>>> possible, with master as you go (by doing regular pulls). That way you
>>>> have the best chance of having the fewest conflicts possible.
>>>>
>>>> At any point in time (I recommend perhaps when you feel you are about
>>>> 50 to 70% done or just need code advice), you can create a github pull
>>>> request to the apache/mahout master. Make sure to include the MAHOUT-XXX
>>>> issue in the head of the pull request; that way ASF will automatically
>>>> propagate code comments to jira, and so all discussion can be done
>>>> entirely on github.
>>>>
>>>> Again, if you take on a significant contribution (such as a new numerical
>>>> method contribution), I recommend discussing the proposal on the @dev list
>>>>
>>>> thanks.
>>>>
>>>>> Khurrum
>>>>>
>>>>>> On Mar 30, 2016, at 1:12 PM, Dmitriy Lyubimov wrote:
>>>>>>
>>>>>> Oh but of course! please do!
>>>>>>
>>>>>> You may work on any issue, this or any other of your choice, or even on
>>>>>> any new issue you can think of (for sizeable contributions it is
>>>>>> recommended to start discussion on the @dev list first though, to make
>>>>>> sure to benefit from experience of others. Please file any new issue
>>>>>> first to jira).
>>>>>>
>>>>>> On Wed, Mar 30, 2016 at 9:05 AM, shashi bushan dongur (JIRA) <
>>>>>> j...@apache.org> wrote:
>>>>>>
>>>>>>> [
>>>>>>> https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218216#comment-15218216
>>>>>>> ]
>>>>>>>
>>>>>>> shashi bushan dongur commented on MAHOUT-1788:
>>>>>>> --
>>>>>>>
>>>>>>> Hello. I would like to start contributing to mahout. Can I work on this
>>>>>>> issue?
>>>>>>>
>>>>>>>> spark-itemsimilarity integration test script cleanup
>>>>>>>>
>>>>>>>> Key: MAHOUT-1788
>>>>>>>> URL: https://issues.apache.org/jira/browse/MAHOUT-1788
>>>>>>>> Project: Mahout
>>>>>>>> Issue Type: Improvement
>>>>>>>> Components: cooccurrence
>>>>>>>> Affects Versions: 0.11.0
>>>>>>>> Reporter: Pat Ferrel
>>>>>>>> Assignee: Pat Ferrel
>>>>>>>> Priority: Trivial
>>>>>>>> Fix For: 1.0.0
>>>>>>>>
>>>>>>>> binary release does not contain data for itemsimilarity tests, neither
>>>>>>>> binary nor source versions will run on a cluster unless data is hand
>>>>>>>> copied to hdfs.
>>>>>>>> Clean this up so it copies data if needed and the data is in both
>>>>>>>> versions.



[jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-04-04 Thread shashi bushan dongur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225099#comment-15225099
 ] 

shashi bushan dongur commented on MAHOUT-1788:
--

[~smarthi] I currently have Mahout installed and set up on my VM. I am digging
into the source code to understand how it works. I will post an update when I
start editing the code.

Is there any resource I can look at to learn how to efficiently edit and run
Mahout? I have followed the instructions on GitHub, but I am having a hard time
understanding how I can run and test the code. Any resource regarding that
would help hugely!

P.S. I am new to Apache open source, and to open source in general.



[jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-04-03 Thread Suneel Marthi (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15223282#comment-15223282
 ] 

Suneel Marthi commented on MAHOUT-1788:
---

[~shashidongur] Any progress on this yet? Do you need any help?



Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-03-31 Thread Khurrum Nasim
Thanks everyone - I’m glad to be a part of this.  

Khurrum


> On Mar 30, 2016, at 3:10 PM, Suneel Marthi  wrote:
> 
> Thanks Khurrum for stepping up.
> 
> You just need basic programming skills - Java/Scala to be able to
> contribute. We can help you with the algorithms and linear algebra stuff.
> 
> 
> Welcome aboard !!
> 
> 
> On Wed, Mar 30, 2016 at 3:05 PM, Khurrum Nasim 
> wrote:
> 
>> Thanks for the advice Dimitry.  I’m already signed up on ASF jira.My
>> handle is “nasimk”
>> 
>> Do I need to be a linear algebra expert and or math phd  to contribute ?
>> I have 10 plus years of computer programming experience.  my background is
>> comp sci.
>> 
>> Khurrum
>> 
>> 
>> 
>> 
>> 
>>> On Mar 30, 2016, at 2:57 PM, Dmitriy Lyubimov  wrote:
>>> 
>>> PS You may also want to sign up with ASF Jira so we can assign issues to
>>> yourself.
>>> 
>>> On Wed, Mar 30, 2016 at 11:52 AM, Dmitriy Lyubimov
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Mar 30, 2016 at 11:43 AM, Khurrum Nasim <
>> khurrum.na...@useitc.com>
>>>> wrote:
>>>>
>>>>> Thanks Dimirtry.
>>>>>
>>>>> I take a look at see where I can start pitching in.  Do I need
>>>>> contributor access ? how  would I create feature branch of my work ?
>>>>>
>>>>
>>>> Khurrum,
>>>>
>>>> you only need github account. What you need is to create mahout's master
>>>> fork in your github space and keep it in sync, as possible, with master
>> as
>>>> you go (by doing regular pulls). That way you have the most chance of
>>>> having least conflicts possible.
>>>>
>>>> At any point in time (I recommend at perhaps when you feel you are about
>>>> 50 to 70% done or just need a code advice), you can create a github pull
>>>> request to the apache/mahout master. Make sure to include MAHOUT-XXX
>> issue
>>>> in the head of the pull request, that way ASF will automatically
>> propagate
>>>> code comments to jira, and so all discussion can be done entirely on
>> github.
>>>>
>>>> Again, if you take on a signficant contribution (such as a new numerical
>>>> method contribution), I recommend to discuss the proposal on the @dev
>> list
>>>>
>>>> thanks.
>>>>
>>>>
>>>>>
>>>>> Khurrum
>>>>>
>>>>>> On Mar 30, 2016, at 1:12 PM, Dmitriy Lyubimov
>>>>> wrote:
>>>>>>
>>>>>> Oh but of course! please do!
>>>>>>
>>>>>> You may work on any issue, this or any other of your choice, or even
>> on
>>>>> any
>>>>>> new issue you can think of (for sizeable contributions it is
>>>>> recommended to
>>>>>> start discussion on the @dev list first though, to make sure to
>> benefit
>>>>>> from experience of others. Please file any new issue first to jira).
>>>>>>
>>>>>> On Wed, Mar 30, 2016 at 9:05 AM, shashi bushan dongur (JIRA) <
>>>>>> j...@apache.org> wrote:
>>>>>>
>>>>>>>
>>>>>>>  [
>>>>>>>
>>>>>
>> https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218216#comment-15218216
>>>>>>> ]
>>>>>>>
>>>>>>> shashi bushan dongur commented on MAHOUT-1788:
>>>>>>> --
>>>>>>>
>>>>>>> Hello. I would like to start contributing to mahout. Can I work on
>> this
>>>>>>> issue?
>>>>>>>
>>>>>>>> spark-itemsimilarity integration test script cleanup
>>>>>>>>
>>>>>>>>
>>>>>>>> Key: MAHOUT-1788
>>>>>>>> URL:
>> https://issues.apache.org/jira/browse/MAHOUT-1788
>>>>>>>> Project: Mahout
>>>>>>>> Issue Type: Improvement
>>>>>>>> Components: cooccurrence
>>>>>>>> Affects Versions: 0.11.0
>>>>>>>> Reporter: Pat Ferrel
>>>>>>>> Assignee: Pat Ferrel
>>>>>>>> Priority: Trivial
>>>>>>>> Fix For: 1.0.0
>>>>>>>>
>>>>>>>>
>>>>>>>> binary release does not contain data for itemsimilarity tests, neith
>>>>>>> binary nor source versions will run on a cluster unless data is hand
>>>>> copied
>>>>>>> to hdfs.
>>>>>>>> Clean this up so it copies data if needed and the data is in both
>>>>>>> versions.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> This message was sent by Atlassian JIRA
>>>>>>> (v6.3.4#6332)
>>>>>>>
>>>>>
>>>>>
>>>>
>>
>>



Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-03-30 Thread Suneel Marthi
Thanks Khurrum for stepping up.

You just need basic programming skills (Java/Scala) to be able to
contribute. We can help you with the algorithms and linear algebra stuff.


Welcome aboard!


On Wed, Mar 30, 2016 at 3:05 PM, Khurrum Nasim 
wrote:

> Thanks for the advice Dimitry.  I’m already signed up on ASF jira.My
> handle is “nasimk”
>
> Do I need to be a linear algebra expert and or math phd  to contribute ?
> I have 10 plus years of computer programming experience.  my background is
> comp sci.
>
> Khurrum
>
>
>
>
>
> > On Mar 30, 2016, at 2:57 PM, Dmitriy Lyubimov  wrote:
> >
> > PS You may also want to sign up with ASF Jira so we can assign issues to
> > yourself.
> >
> > On Wed, Mar 30, 2016 at 11:52 AM, Dmitriy Lyubimov 
> > wrote:
> >
> >>
> >>
> >> On Wed, Mar 30, 2016 at 11:43 AM, Khurrum Nasim <
> khurrum.na...@useitc.com>
> >> wrote:
> >>
> >>> Thanks Dimirtry.
> >>>
> >>> I take a look at see where I can start pitching in.  Do I need
> >>> contributor access ? how  would I create feature branch of my work ?
> >>>
> >>
> >> Khurrum,
> >>
> >> you only need github account. What you need is to create mahout's master
> >> fork in your github space and keep it in sync, as possible, with master
> as
> >> you go (by doing regular pulls). That way you have the most chance of
> >> having least conflicts possible.
> >>
> >> At any point in time (I recommend at perhaps when you feel you are about
> >> 50 to 70% done or just need a code advice), you can create a github pull
> >> request to the apache/mahout master. Make sure to include MAHOUT-XXX
> issue
> >> in the head of the pull request, that way ASF will automatically
> propagate
> >> code comments to jira, and so all discussion can be done entirely on
> github.
> >>
> >> Again, if you take on a signficant contribution (such as a new numerical
> >> method contribution), I recommend to discuss the proposal on the @dev
> list
> >>
> >> thanks.
> >>
> >>
> >>>
> >>> Khurrum
> >>>
> >>>> On Mar 30, 2016, at 1:12 PM, Dmitriy Lyubimov
> >>> wrote:
> >>>>
> >>>> Oh but of course! please do!
> >>>>
> >>>> You may work on any issue, this or any other of your choice, or even
> on
> >>> any
> >>>> new issue you can think of (for sizeable contributions it is
> >>> recommended to
> >>>> start discussion on the @dev list first though, to make sure to
> benefit
> >>>> from experience of others. Please file any new issue first to jira).
> >>>>
> >>>> On Wed, Mar 30, 2016 at 9:05 AM, shashi bushan dongur (JIRA) <
> >>>> j...@apache.org> wrote:
> >>>>
> >>>>>
> >>>>>   [
> >>>>>
> >>>
> https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218216#comment-15218216
> >>>>> ]
> >>>>>
> >>>>> shashi bushan dongur commented on MAHOUT-1788:
> >>>>> --
> >>>>>
> >>>>> Hello. I would like to start contributing to mahout. Can I work on
> this
> >>>>> issue?
> >>>>>
> >>>>>> spark-itemsimilarity integration test script cleanup
> >>>>>>
> >>>>>>
> >>>>>> Key: MAHOUT-1788
> >>>>>> URL:
> https://issues.apache.org/jira/browse/MAHOUT-1788
> >>>>>> Project: Mahout
> >>>>>> Issue Type: Improvement
> >>>>>> Components: cooccurrence
> >>>>>> Affects Versions: 0.11.0
> >>>>>> Reporter: Pat Ferrel
> >>>>>> Assignee: Pat Ferrel
> >>>>>> Priority: Trivial
> >>>>>> Fix For: 1.0.0
> >>>>>>
> >>>>>>
> >>>>>> binary release does not contain data for itemsimilarity tests, neith
> >>>>> binary nor source versions will run on a cluster unless data is hand
> >>> copied
> >>>>> to hdfs.
> >>>>>> Clean this up so it copies data if needed and the data is in both
> >>>>> versions.
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> This message was sent by Atlassian JIRA
> >>>>> (v6.3.4#6332)
> >>>>>
> >>>
> >>>
> >>
>
>


Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-03-30 Thread Khurrum Nasim
Thanks for the advice, Dmitriy. I’m already signed up on ASF JIRA. My handle
is “nasimk”.

Do I need to be a linear algebra expert and/or a math PhD to contribute?
I have 10-plus years of computer programming experience. My background is
comp sci.

Khurrum
 




> On Mar 30, 2016, at 2:57 PM, Dmitriy Lyubimov  wrote:
> 
> PS You may also want to sign up with ASF Jira so we can assign issues to
> yourself.
> 
> On Wed, Mar 30, 2016 at 11:52 AM, Dmitriy Lyubimov 
> wrote:
> 
>> 
>> 
>> On Wed, Mar 30, 2016 at 11:43 AM, Khurrum Nasim 
>> wrote:
>> 
>>> Thanks Dimirtry.
>>> 
>>> I take a look at see where I can start pitching in.  Do I need
>>> contributor access ? how  would I create feature branch of my work ?
>>> 
>> 
>> Khurrum,
>> 
>> you only need github account. What you need is to create mahout's master
>> fork in your github space and keep it in sync, as possible, with master as
>> you go (by doing regular pulls). That way you have the most chance of
>> having least conflicts possible.
>> 
>> At any point in time (I recommend at perhaps when you feel you are about
>> 50 to 70% done or just need a code advice), you can create a github pull
>> request to the apache/mahout master. Make sure to include MAHOUT-XXX issue
>> in the head of the pull request, that way ASF will automatically propagate
>> code comments to jira, and so all discussion can be done entirely on github.
>> 
>> Again, if you take on a signficant contribution (such as a new numerical
>> method contribution), I recommend to discuss the proposal on the @dev list
>> 
>> thanks.
>> 
>> 
>>> 
>>> Khurrum
>>> 
>>>> On Mar 30, 2016, at 1:12 PM, Dmitriy Lyubimov
>>> wrote:
>>>>
>>>> Oh but of course! please do!
>>>>
>>>> You may work on any issue, this or any other of your choice, or even on
>>> any
>>>> new issue you can think of (for sizeable contributions it is
>>> recommended to
>>>> start discussion on the @dev list first though, to make sure to benefit
>>>> from experience of others. Please file any new issue first to jira).
>>>>
>>>> On Wed, Mar 30, 2016 at 9:05 AM, shashi bushan dongur (JIRA) <
>>>> j...@apache.org> wrote:
>>>>
>>>>>
>>>>>   [
>>>>>
>>> https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218216#comment-15218216
>>>>> ]
>>>>>
>>>>> shashi bushan dongur commented on MAHOUT-1788:
>>>>> --
>>>>>
>>>>> Hello. I would like to start contributing to mahout. Can I work on this
>>>>> issue?
>>>>>
>>>>>> spark-itemsimilarity integration test script cleanup
>>>>>>
>>>>>>
>>>>>> Key: MAHOUT-1788
>>>>>> URL: https://issues.apache.org/jira/browse/MAHOUT-1788
>>>>>> Project: Mahout
>>>>>> Issue Type: Improvement
>>>>>> Components: cooccurrence
>>>>>> Affects Versions: 0.11.0
>>>>>> Reporter: Pat Ferrel
>>>>>> Assignee: Pat Ferrel
>>>>>> Priority: Trivial
>>>>>> Fix For: 1.0.0
>>>>>>
>>>>>>
>>>>>> binary release does not contain data for itemsimilarity tests, neith
>>>>> binary nor source versions will run on a cluster unless data is hand
>>> copied
>>>>> to hdfs.
>>>>>> Clean this up so it copies data if needed and the data is in both
>>>>> versions.
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> This message was sent by Atlassian JIRA
>>>>> (v6.3.4#6332)
>>>>>
>>> 
>>> 
>> 



Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-03-30 Thread Dmitriy Lyubimov
PS: You may also want to sign up with ASF JIRA so we can assign issues to
you.

On Wed, Mar 30, 2016 at 11:52 AM, Dmitriy Lyubimov 
wrote:

>
>
> On Wed, Mar 30, 2016 at 11:43 AM, Khurrum Nasim 
> wrote:
>
>> Thanks Dimirtry.
>>
>> I take a look at see where I can start pitching in.  Do I need
>> contributor access ? how  would I create feature branch of my work ?
>>
>
> Khurrum,
>
> you only need github account. What you need is to create mahout's master
> fork in your github space and keep it in sync, as possible, with master as
> you go (by doing regular pulls). That way you have the most chance of
> having least conflicts possible.
>
> At any point in time (I recommend at perhaps when you feel you are about
> 50 to 70% done or just need a code advice), you can create a github pull
> request to the apache/mahout master. Make sure to include MAHOUT-XXX issue
> in the head of the pull request, that way ASF will automatically propagate
> code comments to jira, and so all discussion can be done entirely on github.
>
> Again, if you take on a signficant contribution (such as a new numerical
> method contribution), I recommend to discuss the proposal on the @dev list
>
> thanks.
>
>
>>
>> Khurrum
>>
>> > On Mar 30, 2016, at 1:12 PM, Dmitriy Lyubimov 
>> wrote:
>> >
>> > Oh but of course! please do!
>> >
>> > You may work on any issue, this or any other of your choice, or even on
>> any
>> > new issue you can think of (for sizeable contributions it is
>> recommended to
>> > start discussion on the @dev list first though, to make sure to benefit
>> > from experience of others. Please file any new issue first to jira).
>> >
>> > On Wed, Mar 30, 2016 at 9:05 AM, shashi bushan dongur (JIRA) <
>> > j...@apache.org> wrote:
>> >
>> >>
>> >>[
>> >>
>> https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218216#comment-15218216
>> >> ]
>> >>
>> >> shashi bushan dongur commented on MAHOUT-1788:
>> >> --
>> >>
>> >> Hello. I would like to start contributing to mahout. Can I work on this
>> >> issue?
>> >>
>> >>> spark-itemsimilarity integration test script cleanup
>> >>> 
>> >>>
>> >>>Key: MAHOUT-1788
>> >>>URL: https://issues.apache.org/jira/browse/MAHOUT-1788
>> >>>Project: Mahout
>> >>> Issue Type: Improvement
>> >>> Components: cooccurrence
>> >>>   Affects Versions: 0.11.0
>> >>>   Reporter: Pat Ferrel
>> >>>   Assignee: Pat Ferrel
>> >>>   Priority: Trivial
>> >>>Fix For: 1.0.0
>> >>>
>> >>>
>> >>> binary release does not contain data for itemsimilarity tests, neither
>> >>> binary nor source versions will run on a cluster unless data is hand
>> >>> copied to hdfs.
>> >>> Clean this up so it copies data if needed and the data is in both
>> >>> versions.
>> >>
>> >>
>> >>
>> >> --
>> >> This message was sent by Atlassian JIRA
>> >> (v6.3.4#6332)
>> >>
>>
>>
>


Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-03-30 Thread Dmitriy Lyubimov
On Wed, Mar 30, 2016 at 11:43 AM, Khurrum Nasim 
wrote:

> Thanks Dimirtry.
>
> I take a look at see where I can start pitching in.  Do I need contributor
> access ? how  would I create feature branch of my work ?
>

Khurrum,

You only need a GitHub account. What you need to do is fork Mahout's master
into your GitHub space and keep it in sync with master, as far as possible,
as you go (by doing regular pulls). That way you have the best chance of
having as few conflicts as possible.

At any point in time (I recommend when you feel you are about 50 to 70%
done, or when you just need code advice), you can create a GitHub pull
request against apache/mahout master. Make sure to include the MAHOUT-XXX
issue key at the head of the pull request; that way ASF will automatically
propagate code comments to JIRA, so all discussion can be done entirely on
GitHub.

Again, if you take on a significant contribution (such as a new numerical
method), I recommend discussing the proposal on the @dev list.

Thanks.
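
The advice about putting the issue key at the head of the pull request can be checked mechanically. A small sketch in Python, assuming "head" means the start of the pull request title and that issue keys look like MAHOUT-1788; the function name `has_issue_key` is made up for illustration and is not part of any ASF tooling:

```python
import re

# Assumption: the JIRA key must open the title, e.g. "MAHOUT-1788: ...".
ISSUE_KEY = re.compile(r"MAHOUT-\d+\b")


def has_issue_key(title: str) -> bool:
    """Return True if the pull request title starts with a MAHOUT issue key."""
    # re.match only matches at the beginning of the (stripped) title.
    return ISSUE_KEY.match(title.strip()) is not None
```

A title such as "MAHOUT-1788: copy test data to HDFS if missing" passes this check, while one that only mentions the key later in the text does not.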


>
> Khurrum
>
> > On Mar 30, 2016, at 1:12 PM, Dmitriy Lyubimov  wrote:
> >
> > Oh but of course! please do!
> >
> > You may work on any issue, this or any other of your choice, or even on
> any
> > new issue you can think of (for sizeable contributions it is recommended
> to
> > start discussion on the @dev list first though, to make sure to benefit
> > from experience of others. Please file any new issue first to jira).
> >
> > On Wed, Mar 30, 2016 at 9:05 AM, shashi bushan dongur (JIRA) <
> > j...@apache.org> wrote:
> >
> >>
> >>[
> >>
> https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218216#comment-15218216
> >> ]
> >>
> >> shashi bushan dongur commented on MAHOUT-1788:
> >> --
> >>
> >> Hello. I would like to start contributing to mahout. Can I work on this
> >> issue?
> >>
> >>> spark-itemsimilarity integration test script cleanup
> >>> 
> >>>
> >>>Key: MAHOUT-1788
> >>>URL: https://issues.apache.org/jira/browse/MAHOUT-1788
> >>>Project: Mahout
> >>> Issue Type: Improvement
> >>> Components: cooccurrence
> >>>   Affects Versions: 0.11.0
> >>>   Reporter: Pat Ferrel
> >>>   Assignee: Pat Ferrel
> >>>   Priority: Trivial
> >>>Fix For: 1.0.0
> >>>
> >>>
> >>> binary release does not contain data for itemsimilarity tests, neither
> >>> binary nor source versions will run on a cluster unless data is hand
> >>> copied to hdfs.
> >>> Clean this up so it copies data if needed and the data is in both
> >>> versions.
> >>
> >>
> >>
> >> --
> >> This message was sent by Atlassian JIRA
> >> (v6.3.4#6332)
> >>
>
>


Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-03-30 Thread Khurrum Nasim
Thanks, Dmitriy.

I'll take a look and see where I can start pitching in. Do I need contributor
access? How would I create a feature branch for my work?

Khurrum

> On Mar 30, 2016, at 1:12 PM, Dmitriy Lyubimov  wrote:
> 
> Oh but of course! please do!
> 
> You may work on any issue, this or any other of your choice, or even on any
> new issue you can think of (for sizeable contributions it is recommended to
> start discussion on the @dev list first though, to make sure to benefit
> from experience of others. Please file any new issue first to jira).
> 
> On Wed, Mar 30, 2016 at 9:05 AM, shashi bushan dongur (JIRA) <
> j...@apache.org> wrote:
> 
>> 
>>[
>> https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218216#comment-15218216
>> ]
>> 
>> shashi bushan dongur commented on MAHOUT-1788:
>> --
>> 
>> Hello. I would like to start contributing to mahout. Can I work on this
>> issue?
>> 
>>> spark-itemsimilarity integration test script cleanup
>>> 
>>> 
>>>Key: MAHOUT-1788
>>>URL: https://issues.apache.org/jira/browse/MAHOUT-1788
>>>Project: Mahout
>>> Issue Type: Improvement
>>> Components: cooccurrence
>>>   Affects Versions: 0.11.0
>>>   Reporter: Pat Ferrel
>>>   Assignee: Pat Ferrel
>>>   Priority: Trivial
>>>Fix For: 1.0.0
>>> 
>>> 
>>> binary release does not contain data for itemsimilarity tests, neither
>>> binary nor source versions will run on a cluster unless data is hand copied
>>> to hdfs.
>>> Clean this up so it copies data if needed and the data is in both
>>> versions.
>> 
>> 
>> 
>> --
>> This message was sent by Atlassian JIRA
>> (v6.3.4#6332)
>> 



[jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-03-30 Thread Dmitriy Lyubimov (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218531#comment-15218531
 ] 

Dmitriy Lyubimov commented on MAHOUT-1788:
--

[~shashidongur]

Oh, but of course! Please do!

You may work on any issue, this one or any other of your choice, or even on
any new issue you can think of. (For sizeable contributions it is recommended
to start a discussion on the @dev list first, though, to benefit from the
experience of others. Please file any new issue in JIRA first.)


> spark-itemsimilarity integration test script cleanup
> 
>
> Key: MAHOUT-1788
> URL: https://issues.apache.org/jira/browse/MAHOUT-1788
> Project: Mahout
> Issue Type: Improvement
> Components: cooccurrence
> Affects Versions: 0.11.0
> Reporter: Pat Ferrel
> Assignee: Pat Ferrel
> Priority: Trivial
> Fix For: 1.0.0
>
>
> The binary release does not contain data for the itemsimilarity tests, and neither the
> binary nor the source version will run on a cluster unless the data is hand-copied to HDFS.
> Clean this up so the script copies the data if needed and the data is in both versions.





Re: [jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-03-30 Thread Dmitriy Lyubimov
Oh, but of course! Please do!

You may work on any issue, this one or any other of your choice, or even on
any new issue you can think of. (For sizeable contributions it is recommended
to start a discussion on the @dev list first, though, to benefit from the
experience of others. Please file any new issue in JIRA first.)

On Wed, Mar 30, 2016 at 9:05 AM, shashi bushan dongur (JIRA) <
j...@apache.org> wrote:

>
> [
> https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218216#comment-15218216
> ]
>
> shashi bushan dongur commented on MAHOUT-1788:
> --
>
> Hello. I would like to start contributing to mahout. Can I work on this
> issue?
>
> > spark-itemsimilarity integration test script cleanup
> > 
> >
> > Key: MAHOUT-1788
> > URL: https://issues.apache.org/jira/browse/MAHOUT-1788
> > Project: Mahout
> >  Issue Type: Improvement
> >  Components: cooccurrence
> >Affects Versions: 0.11.0
> >Reporter: Pat Ferrel
> >Assignee: Pat Ferrel
> >Priority: Trivial
> > Fix For: 1.0.0
> >
> >
> > binary release does not contain data for itemsimilarity tests, neither
> > binary nor source versions will run on a cluster unless data is hand copied
> > to hdfs.
> > Clean this up so it copies data if needed and the data is in both
> > versions.
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v6.3.4#6332)
>


[jira] [Commented] (MAHOUT-1788) spark-itemsimilarity integration test script cleanup

2016-03-30 Thread shashi bushan dongur (JIRA)

[ 
https://issues.apache.org/jira/browse/MAHOUT-1788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218216#comment-15218216
 ] 

shashi bushan dongur commented on MAHOUT-1788:
--

Hello. I would like to start contributing to Mahout. Can I work on this issue?

> spark-itemsimilarity integration test script cleanup
> 
>
> Key: MAHOUT-1788
> URL: https://issues.apache.org/jira/browse/MAHOUT-1788
> Project: Mahout
> Issue Type: Improvement
> Components: cooccurrence
> Affects Versions: 0.11.0
> Reporter: Pat Ferrel
> Assignee: Pat Ferrel
> Priority: Trivial
> Fix For: 1.0.0
>
>
> The binary release does not contain data for the itemsimilarity tests, and neither the
> binary nor the source version will run on a cluster unless the data is hand-copied to HDFS.
> Clean this up so the script copies the data if needed and the data is in both versions.


