Hey Manuel,

(I think I accidentally dropped dev@m.a.o when replying. I've added them
back.)
Let me open a WIP PR and then we can discuss it there.

In general though, the current form creates a docker image with Hadoop
and/or Spark and mounts the project directory in the docker image at
/opt/mahout (which is also Mahout Home).
A script also runs at startup that exercises a few of the examples/CLI
drivers.

We want:
- A script which runs through an exhaustive list of tests (cli
drivers/examples/etc)
- A way to tell whether those tests passed or failed (checking the output?)
- A way to fail the build if the examples/etc fail. (No idea how this
works; I've always tried to make builds succeed, never tried to fail one.)
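
To make the last two bullets concrete, here is a rough sketch, assuming a
process exit code is a good enough pass/fail signal (checking output would
need more than this). The check names and commands are placeholders I made
up for illustration, not real Mahout drivers:

```python
import subprocess
import sys

# Hypothetical placeholder commands. A real list would name the actual
# CLI drivers and examples to exercise.
CHECKS = {
    "always-passes": [sys.executable, "-c", "print('ok')"],
    "always-fails": [sys.executable, "-c", "import sys; sys.exit(3)"],
}


def run_checks(checks):
    """Run each command and return a dict of {name: exit_code}."""
    results = {}
    for name, cmd in checks.items():
        # Capture output so the summary lines stay readable.
        proc = subprocess.run(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
        )
        results[name] = proc.returncode
        status = "PASS" if proc.returncode == 0 else "FAIL"
        print(f"[{status}] {name} (exit code {proc.returncode})")
    return results


def overall_exit_code(results):
    """Return 0 only if every check exited with 0, otherwise 1."""
    return 0 if all(code == 0 for code in results.values()) else 1


def main():
    return overall_exit_code(run_checks(CHECKS))

# As a script entry point, end with: sys.exit(main())
```

If the build step that launches this treats a non-zero exit code as a
failure (I believe that is the usual behaviour, but it needs verifying for
whatever maven plugin we end up using), a single failing check would fail
the build.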




Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*


On Sat, Mar 18, 2017 at 6:55 PM, Manuel Sequino <mansequ...@gmail.com>
wrote:

> Hi Trevor,
> let's start with this task.
> I did some experiments with maven and docker, but I am still not
> comfortable with them.
>
> It looks clear now; if I have any doubts, I'll get back to you.
>
> Just one problem: I don't know how to write a JIRA or what to put in it.
> Could you point me in the right direction?
>
> Best regards,
> Manuel
>
> 2017-03-16 15:48 GMT+01:00 Trevor Grant <trevor.d.gr...@gmail.com>:
>
>> Hey Manuel,
>>
>> Awesome!!  I don't think I even started a JIRA yet.  I was literally just
>> toying- I saw some cool stuff when building Apache Streams-Incubating, and
>> copied it.  Having maven kick off docker images is a strange thing.
>>
>> https://github.com/rawkintrevo/mahout/tree/docker-based-its/dockerITs
>>
>> At this point I 1) Recognize it is a thing we should do to streamline our
>> testing 2) don't know enough to intelligently write a JIRA.
>> The idea is that there should be a maven phase where we fire up pseudo
>> Spark and Hadoop clusters, and then run all of the examples, CLI drivers,
>> and shell tests, and fail loudly should any of those tests fail.
>>
>> As I was telling Saikat, also kind of busy with 100 other things.  If you
>> want to take point on this, feel free to write a jira- copy or fork what
>> I've done so far and go.
>>
>> Again, also check out Apache Streams-incubating since I am admittedly
>> copying them.
>>
>> tg
>>
>>
>> Trevor Grant
>> Data Scientist
>> https://github.com/rawkintrevo
>> http://stackexchange.com/users/3002022/rawkintrevo
>> http://trevorgrant.org
>>
>> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>>
>>
>> On Wed, Mar 15, 2017 at 10:42 AM, Manuel Sequino <mansequ...@gmail.com>
>> wrote:
>>
>>> Hi Trevor,
>>> I'd like to contribute to Mahout, especially by working on something
>>> Docker-related. I am pretty new, but I think I could help.
>>>
>>> What about this bullet?
>>>
>>> "I have been toying with some docker based integration tests if you
>>> happen
>>> to be familiar with Dockers and using them for maven IT (or want to
>>> learn)"
>>>
>>> Where can I get more info? Jira doesn't contain the "docker" keyword.
>>>
>>> Best regards,
>>>
>>> ---------------------------------------
>>> Manuel Sequino
>>>
>>> Email: mansequ...@gmail.com
>>> Skype: manuel.sequino
>>> +39 320 4869904
>>>
>>> Linkedin page <https://it.linkedin.com/pub/manuel-sequino/96/261/494>
>>> --------------------------------------
>>>
>>> 2017-03-15 8:24 GMT+01:00 Trevor Grant <trevor.d.gr...@gmail.com>:
>>>
>>>> Hey Dustin!
>>>>
>>>> Welcome to the community.
>>>>
>>>> At the moment, we are in the middle of a release.  The most immediate
>>>> thing
>>>> you could help with would be to help us test the release candidate.  See
>>>> Andrew's email.
>>>>
>>>> Moving forward though, there are lots of opportunities.
>>>> Some things that have been kicked around on here over the last few
>>>> months include:
>>>> - Migrating the website to a git-based setup so that non-committers can
>>>> edit and contribute to the docs.
>>>> - Expanding the algorithms section (are there any algorithms you are
>>>> familiar with? Implementing one in Mahout would be a good start)
>>>> - I have been toying with some docker based integration tests if you
>>>> happen to be familiar with Dockers and using them for maven IT (or want
>>>> to learn)
>>>> - Beginner issues- at the moment there aren't many on the JIRA board bc
>>>> we fixed most in preparation for the release.
>>>>
>>>> Testing the release would be a good start point however, because it will
>>>> get you familiar with building Mahout ( a necessary first step).
>>>>
>>>> Items 1 and 3 are a bit advanced for someone just starting out- so
>>>> unless
>>>> you have some specific familiarity- I would direct you toward number 2.
>>>>
>>>> In that case- check out:
>>>> https://github.com/apache/mahout/tree/master/math-scala/src/main/scala/org/apache/mahout/math/algorithms
>>>>
>>>> There is the algorithm framework- look through it.  If there is an
>>>> algorithm you have in mind (try to start with an easy one), let us know
>>>> and
>>>> open a JIRA ticket!
>>>>
>>>> Best,
>>>>
>>>> tg
>>>>
>>>> Trevor Grant
>>>> Data Scientist
>>>> https://github.com/rawkintrevo
>>>> http://stackexchange.com/users/3002022/rawkintrevo
>>>> http://trevorgrant.org
>>>>
>>>> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>>>>
>>>>
>>>> On Tue, Mar 14, 2017 at 6:12 PM, dustin vanstee <
>>>> dustinvans...@gmail.com>
>>>> wrote:
>>>>
>>>> > Hi I have been looking into mahout and think it has some very nice
>>>> > ML/Linear alg capabilities.  I would like to contribute to the
>>>> > project, and I was hoping someone on the mailing list might be able
>>>> > to give me a few ideas about where I could start.  Thanks!
>>>> >
>>>>
>>>
>>>
>>
>
