Hi,

Is anyone interested in supporting MADLib in Drill? MADLib is an open-source in-database analytics library written in C/C++ and Python. Impala and HAWQ both have ports of MADLib. There is a JIRA for this: https://issues.apache.org/jira/browse/DRILL-325
I have a personal interest in this project and would like to make it a Google Summer of Code project. Is anyone interested in mentoring it for GSoC? I've written a proposal here: http://bitly.com/MADDrill

Your comments are highly appreciated. In summary, the tasks include:

- implement the MADLib C++ abstraction layer for Drill
- generate Java wrapper code for the C++ UDFs/UDAs
- write the Python driver code that drives the module

I'd like to take the initial step of developing a framework to support MADLib in Drill. There are still several problems to work out. For example, as Jacques pointed out in the comments on DRILL-325, we need to find a way to work with workspace variables, which only support internal types for aggregation.

For your convenience, the GSoC mentor information for Apache is here: http://community.apache.org/gsoc.html

Disclaimer: I am a graduate student currently doing an internship at Simba (developing the ODBC driver for Drill :) ). This is a personal interest; I will mostly work on it on weekends and outside work hours.

Best,
Xiao
