Hey Manuel,

(I think I accidentally dropped dev@m.a.o when replying; I've added them back.)

Let me open a WIP PR and then we can discuss it there.
In general though, the current form creates a docker image with Hadoop and/or Spark, and mounts the project directory in the docker image at /opt/mahout (which is also Mahout Home). A script also runs at startup that runs a few of the examples/CLI drivers.

We want:
- A script which runs through an exhaustive list of tests (cli drivers/examples/etc)
- A way to tell whether those tests passed or failed (checking the output?)
- A way to fail the build if the examples/etc fail. (No idea how this works; I've always tried to make builds succeed, never tried to fail one.)

Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things." -Virgil*

On Sat, Mar 18, 2017 at 6:55 PM, Manuel Sequino <mansequ...@gmail.com> wrote:

> Hi Trevor,
> let's start with this task.
> I did some experiments with maven and docker but I am still not comfortable.
>
> Now it looks clear; if I have any doubts, I'll get back to you.
>
> Just one problem: I don't know how to write a jira or what to put in it. Could you direct me?
>
> Best regards,
> Manuel
>
> 2017-03-16 15:48 GMT+01:00 Trevor Grant <trevor.d.gr...@gmail.com>:
>
>> Hey Manuel,
>>
>> Awesome!! I don't think I even started a JIRA yet. I was literally just toying- I saw some cool stuff when building Apache Streams-Incubating, and copied it. Having maven kick off docker images is a strange thing.
>>
>> https://github.com/rawkintrevo/mahout/tree/docker-based-its/dockerITs
>>
>> At this point I 1) recognize it is a thing we should do to streamline our testing 2) don't know enough to intelligently write a JIRA.
>> The idea is, there should be a maven phase where we fire up pseudo spark and hadoop clusters, and then run all of the examples, cli drivers, and shell tests. And fail loudly should any of those tests fail.
>>
>> As I was telling Saikat, also kind of busy with 100 other things.
>> If you want to take point on this, feel free to write a jira- copy or fork what I've done so far and go.
>>
>> Again, also check out Apache Streams-incubating since I am admittedly copying them.
>>
>> tg
>>
>> Trevor Grant
>> Data Scientist
>> https://github.com/rawkintrevo
>> http://stackexchange.com/users/3002022/rawkintrevo
>> http://trevorgrant.org
>>
>> *"Fortunate is he, who is able to know the causes of things." -Virgil*
>>
>> On Wed, Mar 15, 2017 at 10:42 AM, Manuel Sequino <mansequ...@gmail.com> wrote:
>>
>>> Hi Trevor,
>>> I'd like to contribute to Mahout, especially working on something inherently docker. I am pretty new but I think I could give you help.
>>>
>>> What about this bullet?
>>>
>>> "I have been toying with some docker based integration tests if you happen to be familiar with Dockers and using them for maven IT (or want to learn)"
>>>
>>> Where can I get more info? Jira doesn't contain the "docker" keyword.
>>>
>>> Best regards,
>>>
>>> ---------------------------------------
>>> Manuel Sequino
>>>
>>> Email: mansequ...@gmail.com
>>> Skype: manuel.sequino
>>> +39 320 4869904
>>>
>>> Linkedin page <https://it.linkedin.com/pub/manuel-sequino/96/261/494>
>>> --------------------------------------
>>>
>>> 2017-03-15 8:24 GMT+01:00 Trevor Grant <trevor.d.gr...@gmail.com>:
>>>
>>>> Hey Dustin!
>>>>
>>>> Welcome to the community.
>>>>
>>>> At the moment, we are in the middle of a release. The most immediate thing you could help with would be to help us test the release candidate. See Andrew's email.
>>>>
>>>> Moving forward though, there are lots of opportunities-
>>>> Some things that have been kicked around on here over the last few months include:
>>>> - Migrating the website to a git-based setup so that non-committers can edit and contribute to the docs.
>>>> - Expanding the algorithms section (are there any algorithms you are familiar with?
>>>> Implementing in Mahout would be a good start)
>>>> - I have been toying with some docker based integration tests if you happen to be familiar with Dockers and using them for maven IT (or want to learn)
>>>> - Beginner issues- at the moment there aren't many on the JIRA board bc we fixed most in preparation for the release.
>>>>
>>>> Testing the release would be a good starting point however, because it will get you familiar with building Mahout (a necessary first step).
>>>>
>>>> Items 1 and 3 are a bit advanced for someone just starting out- so unless you have some specific familiarity- I would direct you toward number 2.
>>>>
>>>> In that case- check out:
>>>> https://github.com/apache/mahout/tree/master/math-scala/src/main/scala/org/apache/mahout/math/algorithms
>>>>
>>>> There is the algorithm framework- look through it. If there is an algorithm you have in mind (try to start with an easy one), let us know and open a JIRA ticket!
>>>>
>>>> Best,
>>>>
>>>> tg
>>>>
>>>> Trevor Grant
>>>> Data Scientist
>>>> https://github.com/rawkintrevo
>>>> http://stackexchange.com/users/3002022/rawkintrevo
>>>> http://trevorgrant.org
>>>>
>>>> *"Fortunate is he, who is able to know the causes of things." -Virgil*
>>>>
>>>> On Tue, Mar 14, 2017 at 6:12 PM, dustin vanstee <dustinvans...@gmail.com> wrote:
>>>>
>>>> > Hi, I have been looking into mahout and think it has some very nice ML/Linear alg capabilities. I would like to contribute to the project, and I was hoping someone on the mailing list might be able to give me a few ideas about where I could start. Thanks!
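
On the "fail the build" item at the top of this thread: the usual convention is that a script signals failure through a nonzero exit code, and whatever build step launched it fails in turn. Below is a minimal Python sketch of that idea, not the actual dockerITs script. The names run_checks and exit_code_for are made up for illustration, and the commands passed in would be whichever examples/CLI drivers end up on the list. It only looks at exit codes; checking the output would be a separate step.

```python
import subprocess
import sys


def run_checks(checks):
    """Run each (name, command) pair and return the names of the checks that failed.

    `checks` is a hypothetical list such as [("some-example", ["some", "command"])].
    """
    failed = []
    for name, command in checks:
        # A check counts as failed when its command exits with a nonzero status.
        result = subprocess.run(command, capture_output=True, text=True)
        status = "PASS" if result.returncode == 0 else "FAIL"
        print(f"[{status}] {name}")
        if result.returncode != 0:
            failed.append(name)
            # Surface the failing command's error output so the build log shows why.
            sys.stderr.write(result.stderr)
    return failed


def exit_code_for(failed):
    """Map the list of failed checks to a process exit code (0 means all passed)."""
    return 1 if failed else 0
```

A wrapper script would end with `sys.exit(exit_code_for(run_checks(checks)))`. If maven launches it through something like exec-maven-plugin, I believe a nonzero exit fails the build by default, but that is an assumption I haven't checked against this setup.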