Thanks Andrew. Since you're quite familiar with how the Mahout backends (Flink, Spark, H2O) bind and enable DRM, and with the API that Mahout/Samsara exposes, I think the end goal would be to surface a Java/Python API as well as outline a declarative syntax that the various runners can adhere to (perhaps via some kind of execution plan).
This would be added to the current design document within a 'low-level / linear algebra' section.

What are your thoughts (and Mahout's thoughts) on SystemML's approach to DRM and Declarative Machine Learning (DML) in general?
http://www.vldb.org/pvldb/vol9/p960-elgohary.pdf
Does Mahout plan on supporting SystemML compressed linear algebra (CLA)?
Would you be willing to help (along with others including Vladisav) in providing a common ML design document for Beam?

On Wed, Jan 11, 2017 at 12:07 PM, Andrew Musselman <andrew.mussel...@gmail.com> wrote:

> That's right; what other info do you think would be useful?
>
> On Tue, Jan 10, 2017 at 11:09 AM, Kam Kasravi <kamkasr...@gmail.com> wrote:
>
> > Thanks Andrew
> > I think more information about the DRM operations and how persistence
> > would be done at the runner level. It looks like HDFS or spark caching is
> > currently being used?
> >
> > On Monday, January 9, 2017 6:04 PM, Andrew Musselman <a...@apache.org> wrote:
> >
> > Hello Beam Team,
> >
> > Thought you might be interested in the work we've been doing on Mahout,
> > such as the distributed linear algebra DSL/front-end that can use multiple
> > back-ends for compute (Spark, Flink, H2O now). See
> > https://mahout.apache.org/users/environment/out-of-core-reference.html
> > for an intro.
> >
> > We also are working on native CPU/GPU hybrid support and we're close to an
> > initial release. Let us know if you'd like to know more.
> >
> > Thanks and best of luck!
> >
> > Best
> > Andrew Musselman
> >
> > On 2017-01-09 12:00 (-0800), Kam Kasravi <kamkasr...@gmail.com> wrote:
> > > Hi Vladisav
> > >
> > > I'm the author of the design document. An area we stalled on was creating
> > > a common low level linear algebra library that would also include
> > > optimizations like MKL but across platforms and GPUs.
> > > Additionally there are efforts underway that provide a scoring API vs a
> > > training API.
> > >
> > > - PredictionIO http://predictionio.incubator.apache.org/ (now part of
> > >   Salesforce)
> > > - MLeap https://github.com/combust/mleap
> > > - PFA - Portable Format for Analytics http://dmg.org/pfa/
> > >
> > > Any ML effort needs to also include deep learning and the ability to
> > > integrate various types of neural networks. Apache has several early
> > > efforts in this regard (mxnet, singa).
> > >
> > > Thanks
> > > Kam
> > >
> > > On Fri, Jan 6, 2017 at 7:07 AM, Vladisav Jelisavcic <vladis...@gmail.com> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > what is the current status on BEAM-478 and BEAM-303 (machine
> > > > learning DSL and related functions)?
> > > > I would like to start contributing in this direction.
> > > >
> > > > I found this design document:
> > > > https://docs.google.com/document/d/17cRZk_yqHm3C0fljivjN66MbLkeKS1yjo4PBECHb-xA/edit#heading=h.n51rhya8bv4f
> > > >
> > > > Are there any other docs/advances related to this?
> > > >
> > > > Best regards,
> > > > Vladisav