Hey Andy,

Thanks for the patch! I left you some specific feedback on the Gerrit review, but I want to discuss the high-level approach a bit. I think the patch as it's written now is going to have limited use, because it doesn't allow for specifying primary keys or partitioning, which are critical for correctness and performance. In the long run we will definitely want to be able to create tables through Spark SQL, but perhaps we should start off with just inserting/updating rows in existing tables.

It would be interesting to see how other databases have solved this problem, since I'm sure we're not the only ones with configuration options on table create. The relational databases in particular must have PK options.
- Dan

On Thu, May 5, 2016 at 5:51 PM, Andy Grove <[email protected]> wrote:

> Hi,
>
> I'm working with some colleagues at AgilData on Spark/Kudu integration and
> we expect to be able to contribute a number of features to the code base.
>
> To kick things off, here is a gerrit for discussion that adds support for
> persisting a DataFrame to a Kudu table. It would be great to hear feedback
> and feature requests for this capability.
>
> http://gerrit.cloudera.org:8080/#/c/2969/
>
> Thanks,
>
> Andy.
