Wojtek, Peder,

If I can help, please add me to the GitHub acct. I think it would be a huge boost for Drill.

V/R,

On Dec 28, 2015, at 10:59, Peder Jakobsen | gmail <[email protected]> wrote:
>
> Hi Wojtek, if you want to kick-start this project quickly, I would
> suggest you set up a project with a README.md in your GitHub account, and
> share the link with us. Then we can move the detailed discussion about
> features over there, and start collaborating.
>
> Personally, I would then start by selecting the Python test framework
> (some are better than others), then simply write some documented tests.
> These will fail, and we can then work on making them pass if they feel like
> they embody API calls that have the correct design.
>
> Personally, I would start using such an API immediately, so I'm quite
> motivated to help. And yes, it would make for a very nice master's thesis:
> good API design is a seriously useful thing to become an expert at :)
>
> Cheers,
>
> Peder Jakobsen
>
> On Mon, Dec 28, 2015 at 10:45 AM, Peder Jakobsen | gmail <[email protected]> wrote:
>
>> Two thumbs up for this project. An immediate benefit is the ability to
>> take advantage of the enhanced interactive features of the IPython shell.
>>
>> Perhaps the next step is to model the design after a similar REST API
>> wrapper, for example, python-twitter:
>> https://github.com/bear/python-twitter
>>
>> On Mon, Dec 28, 2015 at 8:45 AM, Charles Givre <[email protected]> wrote:
>>
>>> I’d second that and be willing to help.
>>>
>>>> On Dec 28, 2015, at 07:59, John Omernik <[email protected]> wrote:
>>>>
>>>> I think a Pythonic module with Drill could be a great contribution.
>>>> Using the REST API makes the most sense: wrapping it, and interfacing
>>>> with it using requests or something similar. Since everything is done
>>>> via JSON in the REST API, there could be nice interaction with the API,
>>>> doing things such as authentication (it's form based, so you have to use
>>>> a requests session or similar), query submission, results, error
>>>> handling, etc.
>>>>
>>>> You will want to determine what you want your driver to do. Do you want
>>>> an interface to support submitting new storage plugins? Do you want to
>>>> expose query-time settings (such as the JSON "read numbers as double"
>>>> option) via the driver, or just via a statement submitted by the user?
>>>> (One requires much more work, the other requires an eye towards
>>>> security.)
>>>>
>>>> Security is another thing: you want to ensure that if something is using
>>>> your module, say a Python Flask app, there is validation of SQL, and
>>>> other such concerns. Drill seems to be pretty good about it, but any
>>>> module you would write should be explicit about what it is and what it
>>>> isn't doing related to input sanitization/security.
>>>>
>>>> Another thing to think about would be something that would allow result
>>>> set objects in your Python driver to be easily moved to a pandas data
>>>> frame. I think the data science folks out there would love this, and you
>>>> would have a core set of users and other contributions very quickly
>>>> with that. The key to something like this would be ensuring it's as
>>>> Pythonic as possible and is trying to bridge the gap between the Python
>>>> language and the REST API. This allows you, the author, the most
>>>> flexibility to focus on your code, and not have to worry much about the
>>>> Drill code base, as everything is using the REST API (which is really
>>>> well designed, having used it myself in Python scripts).
>>>>
>>>> This is a great idea and I would be happy to contribute/assist!
>>>>
>>>> John
>>>>
>>>> On Mon, Dec 28, 2015 at 2:07 AM, Wojciech Nowak <[email protected]> wrote:
>>>>
>>>>> Dear Drill developers,
>>>>>
>>>>> Recently I was trying to use Drill from Python through the ODBC
>>>>> interface, based on the blog post at
>>>>> https://www.mapr.com/blog/using-drill-programmatically-python-r-and-perl
>>>>> It worked as expected, but what struck me was that it’s a lot of hassle
>>>>> to configure.
>>>>>
>>>>> That’s why, based on the Contribution Ideas page on your site
>>>>> (https://drill.apache.org/docs/apache-drill-contribution-ideas/), I
>>>>> decided to create a simpler solution for the Python community.
>>>>>
>>>>> My contribution would have two phases:
>>>>> 1. a client/driver for interacting with Drill
>>>>> 2. a DSL which will provide an easier and more idiomatic way to write
>>>>>    and manipulate queries using defined query set expressions.
>>>>>
>>>>> 1.
>>>>> Similarly to the official client for Elasticsearch
>>>>> (https://github.com/elastic/elasticsearch-py), I would like to use the
>>>>> REST API of Drill, for which I found documentation at
>>>>> https://drill.apache.org/docs/rest-api/
>>>>> sketch of usage:
>>>>> https://gist.github.com/PythonicNinja/9b4952b6cbc17572c7db#file-pydrill-py
>>>>>
>>>>> questions:
>>>>> 1.1 I was wondering if a Python driver for Drill could be based on the
>>>>> REST API. Do you see any problems?
>>>>> 1.2 Do you have any ideas or suggestions for that project?
>>>>>
>>>>> 2.
>>>>> It would be a separate package from the driver, which you can install
>>>>> as an optional package via the command:
>>>>> pip install pydrill-dsl
>>>>> so that it would have releases separate from the driver package (1).
>>>>> It would enhance the way of interacting with Drill via query-set-like
>>>>> expressions.
>>>>> sketch of usage:
>>>>> https://gist.github.com/PythonicNinja/9b4952b6cbc17572c7db#file-pydrill_dsl-py
>>>>>
>>>>> questions:
>>>>> 2.1 Should it be separated from the Python Drill driver package?
>>>>> 2.2 Do you have any ideas or suggestions for that project?
>>>>>
>>>>> This contribution would be part of my master's thesis, so any ideas are
>>>>> welcome. My thesis supervisor suggested contacting you to get the Drill
>>>>> core developers' perspective.
>>>>>
>>>>> I would be very grateful if you could provide me with your thoughts.
>>>>>
>>>>> kind regards,
>>>>> Wojtek Nowak
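
John's suggestion of wrapping the REST API could be sketched roughly as follows. This is a minimal, hypothetical sketch and not the design proposed in the linked gists: the DrillClient class and its method names are made up for illustration. It assumes a Drill REST endpoint at http://localhost:8047 that accepts a POST to /query.json with a JSON body holding "queryType" and "query" and answers with "columns" and "rows". It uses the standard library's urllib in place of requests to stay dependency-free, and it leaves out authentication and SQL validation entirely.

```python
import json
from urllib import request


class DrillClient:
    """Hypothetical minimal wrapper around Drill's REST query endpoint."""

    def __init__(self, base_url="http://localhost:8047", opener=None):
        # base_url is an assumption (a local Drill with default settings).
        self.base_url = base_url.rstrip("/")
        # opener is injectable so the client can be exercised without a
        # running Drill instance; it defaults to urllib's urlopen.
        self._open = opener or request.urlopen

    def build_request(self, sql):
        """Build the POST request carrying the SQL text as JSON."""
        body = json.dumps({"queryType": "SQL", "query": sql}).encode("utf-8")
        return request.Request(
            self.base_url + "/query.json",
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )

    def query(self, sql):
        """Submit the query and return (columns, rows).

        No sanitization is performed here: the SQL string is sent as given,
        so callers are responsible for validating user input.
        """
        with self._open(self.build_request(sql)) as resp:
            payload = json.loads(resp.read().decode("utf-8"))
        return payload.get("columns", []), payload.get("rows", [])


# Example usage (requires a reachable Drill instance, so left commented out):
# client = DrillClient()
# columns, rows = client.query("SELECT * FROM cp.`employee.json` LIMIT 5")
```

Since rows would come back as a list of dicts under these assumptions, it could be handed to pandas.DataFrame(rows) if pandas is available, which is one way to get the data-frame interop John mentions.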
