On Thu, Oct 13, 2016 at 11:01 PM, Greg Kocunik <g.kocu...@gmail.com> wrote:
> Hello,
>
> I would like to contribute pandas support in the python API.
>
> There is a jira ticket <https://issues.apache.org/jira/browse/KUDU-1276>
> regarding this however the level is quite technical and beyond my current
> abilities.
>
> I would like to get consensus if you are open to simpler solutions in the
> interim.
> To give you an idea, I was looking at doing something along the lines of:
>
> import pandas as pd
>
> scanner = table.scanner()
> scanner.open()
> data = scanner.read_all_tuples()
> pd.DataFrame(data,
>              columns=table.schema.names).set_index(table.schema.primary_keys())
>
> Please let me know if such solutions are welcome.
>

I'm always in favor of simple, but one question: if it's that simple then
what's the purpose of having the explicit support, versus asking people to
write the simple snippet?

Justin Birdsell probably has a good opinion here since he's way more active
than I am on Python.

-Todd
--
Todd Lipcon
Software Engineer, Cloudera