[kudu-CR] Changes: In kudu-spark, added support to delete rows from Kudu table in DataSource API. This is done using the SaveMode.ErrorIfExists.

2016-07-01 Thread Ram Mettu (Code Review)
Ram Mettu has posted comments on this change.

Change subject: Changes: In kudu-spark, added support to delete rows from Kudu 
table  in DataSource API. This is done using the SaveMode.ErrorIfExists.
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/3552/2//COMMIT_MSG
Commit Message:

Line 7: Changes: In kudu-spark, added support to delete rows from Kudu table 
> please remove trailing whitespace here and below.
Sure will fix it


Line 14:   .mode(SaveMode.ErrorIfExists).kudu
> Why are you using 'ErrorIfExists' here? It's not what is documented for tha
There is no delete mode in the data source api. So we are trying to fit it into 
one of the 4 defined options. Dan is suggesting a different approach will take 
a look


Line 14:   .mode(SaveMode.ErrorIfExists).kudu
> Agreed, I think a better way to expose this would be an implicit method on 
Ok will look into it


Line 21: * If there are additional columns along with primary columns, 
> I don't think it will cause problems to ignore them.
Ran into issues having extra columns expect the primary cols in newDelete 
operation. I can post the error if delete is supposed to work fine with extra 
columns.


-- 
To view, visit http://gerrit.cloudera.org:8080/3552
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf7207d5ee525c07b54f7544e0ba63deb47e6acd
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ram Mettu 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Ram Mettu 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Changes: In kudu-spark, added support to delete rows from Kudu table in DataSource API. This is done using the SaveMode.ErrorIfExists.

2016-07-01 Thread Dan Burkert (Code Review)
Dan Burkert has posted comments on this change.

Change subject: Changes: In kudu-spark, added support to delete rows from Kudu 
table  in DataSource API. This is done using the SaveMode.ErrorIfExists.
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3552/2//COMMIT_MSG
Commit Message:

Line 7: Changes: In kudu-spark, added support to delete rows from Kudu table 
please remove trailing whitespace here and below.


Line 14:   .mode(SaveMode.ErrorIfExists).kudu
> Why are you using 'ErrorIfExists' here? It's not what is documented for tha
Agreed, I think a better way to expose this would be an implicit method on RDDs 
named something like "deleteFromKuduTable".  See the Cassandra spark 
connector's "saveToCassandra" for something similar.


Line 21: * If there are additional columns along with primary columns, 
> seems a little unintuitive here to ignore the extra columns, IMO. what do o
I don't think it will cause problems to ignore them.


-- 
To view, visit http://gerrit.cloudera.org:8080/3552
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf7207d5ee525c07b54f7544e0ba63deb47e6acd
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ram Mettu 
Gerrit-Reviewer: Dan Burkert 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Changes: In kudu-spark, added support to delete rows from Kudu table in DataSource API. This is done using the SaveMode.ErrorIfExists.

2016-07-01 Thread Todd Lipcon (Code Review)
Todd Lipcon has posted comments on this change.

Change subject: Changes: In kudu-spark, added support to delete rows from Kudu 
table  in DataSource API. This is done using the SaveMode.ErrorIfExists.
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/3552/2//COMMIT_MSG
Commit Message:

Line 7: Changes: In kudu-spark, added support to delete rows from Kudu table 
nit: please follow the standard commit message format (one-line short summary, 
followed by a blank line, followed by the list of changes). See 
http://chris.beams.io/posts/git-commit/


Line 11: Usage :
this usage info is probably better to put in the doc for the data frame, or 
perhaos in docs/developing.adoc


Line 14:   .mode(SaveMode.ErrorIfExists).kudu
Why are you using 'ErrorIfExists' here? It's not what is documented for that 
save mode ("ErrorIfExists mode means that when saving a DataFrame to a data 
source, if data already exists, an exception is expected to be thrown.")


Line 21: * If there are additional columns along with primary columns, 
seems a little unintuitive here to ignore the extra columns, IMO. what do other 
people think?


http://gerrit.cloudera.org:8080/#/c/3552/2/java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala
File java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala:

Line 229: for ((sparkIdx, kuduIdx) <- indices) {
this stuff is all duplicated - can you refactor it into a utility method?


-- 
To view, visit http://gerrit.cloudera.org:8080/3552
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf7207d5ee525c07b54f7544e0ba63deb47e6acd
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ram Mettu 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon 
Gerrit-HasComments: Yes


[kudu-CR] Changes: In kudu-spark, added support to delete rows from Kudu table in DataSource API. This is done using the SaveMode.ErrorIfExists.

2016-07-01 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: Changes: In kudu-spark, added support to delete rows from Kudu 
table  in DataSource API. This is done using the SaveMode.ErrorIfExists.
..


Patch Set 2:

Build Started http://104.196.14.100/job/kudu-gerrit/2146/

-- 
To view, visit http://gerrit.cloudera.org:8080/3552
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf7207d5ee525c07b54f7544e0ba63deb47e6acd
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ram Mettu 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No


[kudu-CR] Changes: In kudu-spark, added support to delete rows from Kudu table in DataSource API. This is done using the SaveMode.ErrorIfExists.

2016-07-01 Thread Ram Mettu (Code Review)
Ram Mettu has uploaded a new patch set (#2).

Change subject: Changes: In kudu-spark, added support to delete rows from Kudu 
table  in DataSource API. This is done using the SaveMode.ErrorIfExists.
..

Changes: In kudu-spark, added support to delete rows from Kudu table 
in DataSource API. This is done using
the SaveMode.ErrorIfExists.

Usage :

tableDataFrame.write.options(kuduOptions)
  .mode(SaveMode.ErrorIfExists).kudu

- this delete all rows in tableDataFrame
- tableDataFrame MUST provide all the cols which are part of the 
  primary key in Kudu table
* If any column which is part of primary key is missing in the 
  dataframe, it will give an error
* If there are additional columns along with primary columns, 
  it is okay, it will delete all the rows
* If tableDataFrame contains any rows which are not present in 
  the Kudu table, it will give an error for those missing rows

Change-Id: Ibf7207d5ee525c07b54f7544e0ba63deb47e6acd
---
M java/kudu-spark/src/main/scala/org/kududb/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
3 files changed, 145 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3552/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3552
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibf7207d5ee525c07b54f7544e0ba63deb47e6acd
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ram Mettu 
Gerrit-Reviewer: Kudu Jenkins


[kudu-CR] Changes:

2016-07-01 Thread Ram Mettu (Code Review)
Ram Mettu has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3552

Change subject: Changes:
..

Changes:

In kudu-spark, added support to delete rows from Kudu table in DataSource API. 
This is done using
the SaveMode.ErrorIfExists.

Usage :

tableDataFrame.write.options(kuduOptions).mode(SaveMode.ErrorIfExists).kudu

- this delete all rows in tableDataFrame
- tableDataFrame MUST provide all the columns which are part of the primary key 
in Kudu table
* If any column which is part of primary key is missing in the dataframe, 
it will give an error
* If there are additional columns along with primary columns, it is okay, 
it will delete all the rows
* If tableDataFrame contains any rows which are not present in the Kudu 
table, it will give an error
  for those missing rows

Change-Id: Ibf7207d5ee525c07b54f7544e0ba63deb47e6acd
---
M java/kudu-spark/src/main/scala/org/kududb/spark/kudu/DefaultSource.scala
M java/kudu-spark/src/main/scala/org/kududb/spark/kudu/KuduContext.scala
M java/kudu-spark/src/test/scala/org/kududb/spark/kudu/DefaultSourceTest.scala
3 files changed, 145 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/3552/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3552
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibf7207d5ee525c07b54f7544e0ba63deb47e6acd
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ram Mettu 


[kudu-CR] Changes:

2016-07-01 Thread Kudu Jenkins (Code Review)
Kudu Jenkins has posted comments on this change.

Change subject: Changes:
..


Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/2145/

-- 
To view, visit http://gerrit.cloudera.org:8080/3552
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf7207d5ee525c07b54f7544e0ba63deb47e6acd
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Ram Mettu 
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: No