Dan Burkert has posted comments on this change.

Change subject: add support for persisting Spark DataFrames to Kudu
......................................................................
Patch Set 1: (3 comments)

http://gerrit.cloudera.org:8080/#/c/2969/1/java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala:

Line 29: import org.kududb.client.{PartialRow, _}
Could you list the imports out explicitly?

Line 103: val session: KuduSession = client.newSession
Put the session into batch mode by calling "session.setFlushMode(FlushMode.AUTO_FLUSH_BACKGROUND)". Without this the session flushes each insert individually, which really hurts performance.

Line 125: session.close()
This should probably be in a finally block, just in case. I don't think anything above can throw, but it's hard to know. Also, check the returned OperationResponses to make sure there aren't any errors. I'm not sure what Spark SQL's mechanism for handling errors is; maybe an exception?

--
To view, visit http://gerrit.cloudera.org:8080/2969
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icdb58f4e707fa273ae50b93276c426ad77522e3b
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Andy Grove <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes
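[Editor's note] Taken together, the three comments describe one write pattern: batch the session, close it in a finally block, and check the responses for row errors. A rough sketch of that pattern in Scala, assuming the org.kududb-era Kudu Java client API (the `FlushMode` enum is nested in `SessionConfiguration` in the clients I know of; `writeRows` and its parameters are hypothetical names, not from the patch):

```scala
import org.kududb.client._
import scala.collection.JavaConverters._

// Hypothetical helper illustrating the reviewer's suggestions; building the
// Operations from DataFrame rows is elided.
def writeRows(client: KuduClient, operations: Seq[Operation]): Unit = {
  val session: KuduSession = client.newSession()
  // Batch mode: buffer operations and flush in the background instead of
  // round-tripping to the tablet server for every insert.
  session.setFlushMode(SessionConfiguration.FlushMode.AUTO_FLUSH_BACKGROUND)
  try {
    operations.foreach(session.apply)
  } finally {
    // close() flushes any remaining buffered operations and returns their
    // responses; surface per-row errors instead of silently dropping them.
    val responses = session.close().asScala
    val failed = responses.filter(r => r != null && r.hasRowError)
    if (failed.nonEmpty) {
      throw new RuntimeException(
        s"failed to write ${failed.size} rows; first error: ${failed.head.getRowError}")
    }
  }
}
```

Throwing from the helper matches the reviewer's guess that an exception is the likeliest way to report errors back through Spark SQL, but that choice is an assumption here, not something the thread settles.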
