Re: [DISCUSS] Looking to the future hivemall graduation
On Fri, Mar 3, 2017 at 9:39 AM, Makoto Yui wrote:

> Edward,
>
> Thank you for your suggestion. It's certainly an option for our
> project graduation.
> I'm discussing it with other PPMC members of Hivemall.
>
> My concerns are
> 1) Hivemall is not only for Hive but also targets Spark and Pig as the
> runtime.
>    - It has some Spark DataFrame related features
>      http://hivemall.incubator.apache.org/userguide/spark/misc/topk_join.html
> 2) Project management (e.g., release process) for a subproject.
>    - Artifacts would be better kept separate from Hive (Separation of Concerns)
>    - It seems that Apache DB subprojects are distinct ones.
>      https://db.apache.org/newproject.html
>
> We are now at a very early stage in Apache incubation, planning the
> first release in early Q2.
> It might be too early to discuss but we welcome your suggestion.
>
> Thanks,
> Makoto
>
> 2017-03-03 15:15 GMT+09:00 Edward Capriolo:
> > Hivemall in the incubator has a fairly impressive set of features that do
> > machine learning directly from Hive.
> >
> > http://hivemall.incubator.apache.org/overview.html
> > https://github.com/myui/hivemall/wiki/Logistic-regression-dataset-generation
> >
> > While we cannot put the cart before the horse, I can imagine that upon
> > graduation Hivemall would be a natural fit to become part of Hive (maybe as
> > a sub project).
> >
> > I could imagine we can set up like we did for hcat where we make a subtree
> > and give commit rights to the tree, eventually converting those interested
> > in other parts of Hive to Hive committers as well.
> >
> > In any case Hivemall devs, amazing work!
> >
> > Thanks,
> > Edward

Those are fair concerns. I can say this:

1) I believe this is not a large issue for us. I believe we have submodules that link to other things outside of Hive for testing.
2) Our storage-api, which lives inside the Hive source code, is released separately from Hive.

I understand that your graduation is far off, and when that happens you will make the choice that is right for your project (top-level, part of Hive, something else). I only wanted to say I would do my best to clear any technical or organizational concerns you have if you decide that landing in Hive is the right course for you.
Re: [DISCUSS] Looking to the future hivemall graduation
Edward,

Thank you for your suggestion. It's certainly an option for our project graduation.
I'm discussing it with other PPMC members of Hivemall.

My concerns are
1) Hivemall is not only for Hive but also targets Spark and Pig as the runtime.
   - It has some Spark DataFrame related features
     http://hivemall.incubator.apache.org/userguide/spark/misc/topk_join.html
2) Project management (e.g., release process) for a subproject.
   - Artifacts would be better kept separate from Hive (Separation of Concerns)
   - It seems that Apache DB subprojects are distinct ones.
     https://db.apache.org/newproject.html

We are now at a very early stage in Apache incubation, planning the first release in early Q2.
It might be too early to discuss but we welcome your suggestion.

Thanks,
Makoto

2017-03-03 15:15 GMT+09:00 Edward Capriolo:
> Hivemall in the incubator has a fairly impressive set of features that do
> machine learning directly from Hive.
>
> http://hivemall.incubator.apache.org/overview.html
> https://github.com/myui/hivemall/wiki/Logistic-regression-dataset-generation
>
> While we cannot put the cart before the horse, I can imagine that upon
> graduation Hivemall would be a natural fit to become part of Hive (maybe as
> a sub project).
>
> I could imagine we can set up like we did for hcat where we make a subtree
> and give commit rights to the tree, eventually converting those interested
> in other parts of Hive to Hive committers as well.
>
> In any case Hivemall devs, amazing work!
>
> Thanks,
> Edward
[DISCUSS] Looking to the future hivemall graduation
Hivemall in the incubator has a fairly impressive set of features that do machine learning directly from Hive.

http://hivemall.incubator.apache.org/overview.html
https://github.com/myui/hivemall/wiki/Logistic-regression-dataset-generation

While we cannot put the cart before the horse, I can imagine that upon graduation Hivemall would be a natural fit to become part of Hive (maybe as a sub project).

I could imagine we can set up like we did for hcat where we make a subtree and give commit rights to the tree, eventually converting those interested in other parts of Hive to Hive committers as well.

In any case Hivemall devs, amazing work!

Thanks,
Edward