[ 
https://issues.apache.org/jira/browse/DRILL-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13847481#comment-13847481
 ] 

Henrik Behrens commented on DRILL-325:
--------------------------------------

I strongly support this feature for the following reasons:
•       MADlib already supports a wide range of algorithms for machine 
learning, data mining and statistics (see http://doc.madlib.net/latest/ for 
details)
•       MADlib is free and open source
•       MADlib is designed to eventually serve a role for scalable database 
systems that is similar to the CRAN library for R: a community repository of 
statistical methods, this time written with scale and parallelism in mind
•       MADlib is open for contributions of both new methods and ports to 
additional database platforms
•       MADlib is already supported on the Hadoop platform via HAWQ
•       MADlib is already being ported to Impala 
(http://blog.cloudera.com/blog/2013/10/how-to-use-madlib-pre-built-analytic-functions-with-impala/)
•       MADlib uses SQL and UDFs/UDAs for implementing analytical functions
•       MADlib supports iterative algorithms (in contrast to plain SQL)
•       MADlib supports templated queries (the same function can be applied to 
different tables, in contrast to plain SQL)
•       MADlib contains additional sophisticated features and abstractions 
(macro-programming, micro-programming, an abstraction layer for UDFs, convex 
optimization, and features for statistical text analysis)

For details, please read their excellent paper: 
http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-38.pdf

I think it is important that Drill make no design decisions now that would 
later make it difficult to port MADlib (e.g. missing support for iterative 
or templated queries).
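
To make "iterative" and "templated" concrete, here is a minimal sketch of the 
pattern: a driver loop that re-runs one SQL aggregate per iteration, with the 
table and column names substituted into the query text at call time. It is 
only an illustration, not MADlib's or Drill's API. The names quote_ident and 
fit_slope and the tables points and obs are hypothetical, and an in-memory 
SQLite database stands in for the real engine.

```python
import sqlite3


def quote_ident(name: str) -> str:
    """Quote an SQL identifier so it can be substituted into query text."""
    return '"' + name.replace('"', '""') + '"'


def fit_slope(conn, table, x_col, y_col, steps=200, lr=0.1):
    """Fit y ~ w * x by gradient descent (hypothetical helper).

    Each iteration issues one SQL aggregate; the loop itself lives in the
    driver because a single plain SQL statement cannot express it.
    """
    # Templated query: table and column names are filled in at call time,
    # so the same function works on any table with two numeric columns.
    sql = "SELECT AVG((? * {x} - {y}) * {x}) FROM {t}".format(
        x=quote_ident(x_col), y=quote_ident(y_col), t=quote_ident(table)
    )
    w = 0.0
    for _ in range(steps):
        # The aggregate returns the gradient of the mean squared error / 2.
        (grad,) = conn.execute(sql, (w,)).fetchone()
        w -= lr * grad
    return w


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE points (x REAL, y REAL)")
conn.executemany(
    "INSERT INTO points VALUES (?, ?)",
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
)
w = fit_slope(conn, "points", "x", "y")
print(round(w, 6))
```

The same function can then be pointed at a different table without changes, 
e.g. fit_slope(conn, "obs", "a", "b"). An engine that cannot re-run a 
parameterized query from a driver loop like this would make such a port hard.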


> Support for MADlib
> ------------------
>
>                 Key: DRILL-325
>                 URL: https://issues.apache.org/jira/browse/DRILL-325
>             Project: Apache Drill
>          Issue Type: New Feature
>            Reporter: Michael Hausenblas
>
> It should be possible to use MADlib (http://doc.madlib.net/latest/) with 
> Drill.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
