[kudu-CR] [examples] Add basic Spark example written in Scala
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 8: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/8/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/8/examples/scala/spark-example/README.adoc@44 PS8, Line 44: If running a spark2-submit job, you will need to set this value to match the '--master' Spark > Yeah, the upstream is just 'spark-submit' but I thought we wanted this targ This is an Apache project and an Apache repository. Nothing about this change should be related to or specifically for CDH. All changes should target the Apache integrations. -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 8 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Mon, 29 Oct 2018 19:02:46 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 8: (2 comments) http://gerrit.cloudera.org:8080/#/c/11788/8/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/8/examples/scala/spark-example/README.adoc@40 PS8, Line 40: - KuduMasters: A String value consisting of a comma-separated list of Kudu Master Host addresses. > nit: trailing white space here and below. Done http://gerrit.cloudera.org:8080/#/c/11788/8/examples/scala/spark-example/README.adoc@44 PS8, Line 44: If running a spark2-submit job, you will need to set this value to match the '--master' Spark > I think this should be spark 2 `spark-submit` job. IIRC spark2-submit is a Yeah, the upstream is just 'spark-submit' but I thought we wanted this targeted for CDH - per Greg's update. Should I leave it as is to be CDH-specific, or change it to match upstream spark conventions? -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 8 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Mon, 29 Oct 2018 18:57:06 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 8: (4 comments) http://gerrit.cloudera.org:8080/#/c/11788/8/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/8/examples/scala/spark-example/README.adoc@40 PS8, Line 40: - KuduMasters: A String value consisting of a comma-separated list of Kudu Master Host addresses. nit: trailing white space here and below. http://gerrit.cloudera.org:8080/#/c/11788/8/examples/scala/spark-example/README.adoc@44 PS8, Line 44: If running a spark2-submit job, you will need to set this value to match the '--master' Spark I think this should be spark 2 `spark-submit` job. IIRC spark2-submit is a CDH 5 only command for their compatibility reasons. http://gerrit.cloudera.org:8080/#/c/11788/8/examples/scala/spark-example/README.adoc@59 PS8, Line 59: To run this against Spark2 On YARN, you can use the spark2-submit command as follows from the Same here. This should just be spark-submit. http://gerrit.cloudera.org:8080/#/c/11788/8/examples/scala/spark-example/README.adoc@66 PS8, Line 66: $ spark2-submit --class org.apache.kudu.examples.SparkExample --master yarn --deploy-mode --driver-java-options '-DKuduMasters=master.0:7051,master.1:7051,master.2:7051 -DTableName=test_table -DSparkMaster=yarn' target/kudu-spark-example-1.0-SNAPSHOT.jar Same here. This should just be spark-submit. 
-- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 8 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Mon, 29 Oct 2018 18:51:43 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Greg Solovyev has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 8 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Mon, 29 Oct 2018 18:16:54 + Gerrit-HasComments: No
[kudu-CR] [examples] Add basic Spark example written in Scala
Will Berkeley has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 8: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 8 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Mon, 29 Oct 2018 17:49:12 + Gerrit-HasComments: No
[kudu-CR] [examples] Add basic Spark example written in Scala
Will Berkeley has removed a vote on this change. Change subject: [examples] Add basic Spark example written in Scala .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 8 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley
[kudu-CR] [examples] Add basic Spark example written in Scala
Hello Will Berkeley, Attila Bukor, Kudu Jenkins, Adar Dembo, Grant Henke, Greg Solovyev, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11788 to look at the new patch set (#8). Change subject: [examples] Add basic Spark example written in Scala .. [examples] Add basic Spark example written in Scala This patch adds a basic Kudu client that utilizes both Kudu Java APIs, as well as Spark SQL APIs. It will allow customers to pull down the pom.xml and scala source, then build and execute from their local machine. Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f --- A examples/scala/spark-example/README.adoc A examples/scala/spark-example/pom.xml A examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala 3 files changed, 292 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/11788/8 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 8 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 7: The snag I hit previously turned out to be KUDU-2259, which caused the job to fail because the token was reissued by the master and it attempted to use a different name (despite there being no auth configured). It looks like this was supposed to be fixed in 1.7, but I had to update the pom to pull 1.7.1 or 1.8.0 to get it working. I don't know how changes in Kudu directly affect the kudu-spark lib, but the problem is resolved nonetheless. -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 7 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Sat, 27 Oct 2018 02:04:47 + Gerrit-HasComments: No
[kudu-CR] [examples] Add basic Spark example written in Scala
Hello Will Berkeley, Attila Bukor, Kudu Jenkins, Adar Dembo, Grant Henke, Greg Solovyev, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11788 to look at the new patch set (#7). Change subject: [examples] Add basic Spark example written in Scala .. [examples] Add basic Spark example written in Scala This patch adds a basic Kudu client that utilizes both Kudu Java APIs, as well as Spark SQL APIs. It will allow customers to pull down the pom.xml and scala source, then build and execute from their local machine. Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f --- A examples/scala/spark-example/README.adoc A examples/scala/spark-example/pom.xml A examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala 3 files changed, 287 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/11788/7 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 7 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 6: Pushed the additional instructions for running this as a spark2-submit job; however, I hit a small snag that I'm still working through. The job runs successfully and prints the expected output, but fails when shutting down because of how I define the SparkSession's 'master'. I'm working out how to mitigate that while keeping it executable both as a standalone Java app and via spark2-submit. -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 6 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Sat, 27 Oct 2018 01:17:42 + Gerrit-HasComments: No
[kudu-CR] [examples] Add basic Spark example written in Scala
Hello Will Berkeley, Attila Bukor, Kudu Jenkins, Adar Dembo, Grant Henke, Greg Solovyev, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11788 to look at the new patch set (#6). Change subject: [examples] Add basic Spark example written in Scala .. [examples] Add basic Spark example written in Scala This patch adds a basic Kudu client that utilizes both Kudu Java APIs, as well as Spark SQL APIs. It will allow customers to pull down the pom.xml and scala source, then build and execute from their local machine. Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f --- A examples/scala/spark-example/README.adoc A examples/scala/spark-example/pom.xml A examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala 3 files changed, 287 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/11788/6 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 6 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley
[kudu-CR] [examples] Add basic Spark example written in Scala
Hello Will Berkeley, Attila Bukor, Kudu Jenkins, Adar Dembo, Grant Henke, Greg Solovyev, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11788 to look at the new patch set (#5). Change subject: [examples] Add basic Spark example written in Scala .. [examples] Add basic Spark example written in Scala This patch adds a basic Kudu client that utilizes both Kudu Java APIs, as well as Spark SQL APIs. It will allow customers to pull down the pom.xml and scala source, then build and execute from their local machine. Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f --- A examples/scala/spark-example/README.adoc A examples/scala/spark-example/pom.xml A examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala 3 files changed, 287 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/11788/5 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 5 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: SparkMaster > I guess, my point is that it would be nice if this example was applicable t I've got this running with the System.getProperty() call on another Spark 2 cluster. I'll write up some instructions and include them here so that customers don't have to go through the same song and dance that we are :) -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 22:22:34 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Greg Solovyev has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: SparkMaster > I tried running this on Spark2 on Yarn after replacing how parameters are p I guess, my point is that it would be nice if this example was applicable to CDH. Otherwise, it should probably just use local execution. -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 22:03:04 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Greg Solovyev has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: SparkMaster > I ran the job locally - worked fine. Then, I tried running it against a Spa I tried running this on Spark2 on Yarn after replacing how parameters are passed (args instead of System.getProperty) and it almost worked. It broke with "java.lang.NoClassDefFoundError: org/apache/commons/dbcp/ConnectionFactory" -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 22:00:06 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: SparkMaster > I ran the job locally - worked fine. Then, I tried running it against a Spa We don't support Spark 1 anymore. That was dropped a few versions back. http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala File examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@71 PS4, Line 71: val upsertUsers = Array(User("newUserA", 1234), User("userC", )) > Yep, I am repeating a value from above (the id=1234). This will upsert 1234 oh, my bad I was looking at the names. -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 20:34:29 + Gerrit-HasComments: Yes
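The upsert behaviour referred to in the comment above (a repeated id updates the existing row, a new id inserts one) can be modelled roughly as follows. This is only an in-memory sketch in Java and uses no Kudu or Spark API: the class `UpsertModel`, the pre-existing name `userA`, and the id `7777` are hypothetical stand-ins, since the thread does not show those values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Plain in-memory model of upsert semantics keyed on id.
// Hypothetical illustration only; it does not call any Kudu or Spark API.
public class UpsertModel {

    // Inserts the row when the id is new, otherwise overwrites the stored name.
    public static void upsert(Map<Integer, String> table, int id, String name) {
        table.put(id, name);
    }

    public static void main(String[] args) {
        Map<Integer, String> table = new LinkedHashMap<>();
        table.put(1234, "userA");          // hypothetical row that already exists

        upsert(table, 1234, "newUserA");   // id already present: the row is updated
        upsert(table, 7777, "userC");      // hypothetical new id: a row is inserted

        System.out.println(table);         // prints {1234=newUserA, 7777=userC}
    }
}
```

The point of the model is that the match is made on the id, not on the name, which is why `newUserA` replaces the earlier row for id 1234 rather than adding a second one.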
[kudu-CR] [examples] Add basic Spark example written in Scala
Greg Solovyev has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: SparkMaster > This should work without issue in a spark-submit job, I tested that previou I ran the job locally and it worked fine. Then I tried running it against a Spark 1.6 standalone cluster (CDH 5.15.2) and got the error "java.io.StreamCorruptedException: invalid stream header: 01000C31", because of the mismatch between Spark 2 and Spark 1.6. I passed the SparkMaster URL exactly as your colleague described: java -DKuduMasters=greg-kudu-5152-1.vpc.cloudera.com:7051 -DSparkMaster=spark://greg-kudu-5152-1.vpc.cloudera.com:7077 -jar target/kudu-spark-example-1.0-SNAPSHOT.jar I haven't tried running it against Spark 2 on YARN yet; that will require rewriting the example code a bit. -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 20:18:00 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: SparkMaster > That's one way to address it. I wonder if there is a way to refactor the This should work without issue in a spark-submit job; I tested that previously. I talked with a colleague on the Spark team, who noted that the proper reference for a standalone Spark cluster is simply "spark://:7077", so that should be very easy to implement after all. Let me make those changes, then repeat the same spark-submit test to make sure it still works. Could you provide me with the steps you took to submit it, as well as the submit command you used? -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 19:45:39 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Greg Solovyev has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: SparkMaster > I'm starting to wonder if this is something we should even expose at this p That's one way to address it. I wonder whether the code can be refactored so that it can be submitted to Spark with spark-submit as well as run as a Java application. What makes me uncomfortable with the current example is that, while you can run it locally (without a Spark cluster), you cannot run it against Spark 2.x on YARN or against Spark 1.x standalone (I tried). As a result, you cannot run this example against a Spark cluster deployed with CDH or HDP. -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 19:28:06 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: SparkMaster > Could you provide a command-line example with proper SparkMaster URL that p I'm starting to wonder if this is something we should even expose at this point, given the purpose of this example was supposed to be as simple as possible. If we leave this local, there's really no point in exposing it and documenting how to change it. I can work on an example of how to target a remote cluster, but it will require a bit of reworking of the code as we'll need to add some additional values in order to gather the necessary YARN configuration files, etc. which might be more than a simple example like this should need. Thoughts? -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 18:26:12 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Greg Solovyev has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: SparkMaster Could you provide a command-line example with proper SparkMaster URL that points to a remote Spark Master? -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 18:03:58 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Greg Solovyev has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: If running locally (standalone), this must be set to 'local'. > Thanks for the feedback. Do you think this would be better? Yes, I think this is better. -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 17:58:05 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: If running locally (standalone), this must be set to 'local'. > This wording creates an impression that this parameter must be set explici Thanks for the feedback. Do you think this would be better? "This defaults to 'local', which is the required value if running the application locally (standalone)" -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 17:53:06 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Greg Solovyev has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@41 PS4, Line 41: If running locally (standalone), this must be set to 'local'. This wording creates an impression that this parameter must be set explicitly when running locally, but the way the code works (and what the example below shows) is that this parameter defaults to "local" and is not required to be set. -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Greg Solovyev Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 17:50:38 + Gerrit-HasComments: Yes
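Editorial sketch of the default behaviour described in the comment above, assuming the example reads its settings with `System.getProperty` as the quoted `KuduMasters` line does. The property name `SparkMaster` and the helper `setting` are assumptions made for illustration; they are not confirmed by the thread.

```scala
object ConfigDefaults {
  // Reads a JVM system property (set with -Dname=value), falling back to a default.
  def setting(name: String, default: String): String =
    System.getProperty(name, default)

  def main(args: Array[String]): Unit = {
    // "KuduMasters" and its localhost default follow the review discussion;
    // "SparkMaster" is an assumed property name used only for illustration.
    val kuduMasters = setting("KuduMasters", "localhost:7051")
    val sparkMaster = setting("SparkMaster", "local")
    println(s"KuduMasters=$kuduMasters SparkMaster=$sparkMaster")
  }
}
```

Run without any `-D` flags, both values fall back to their defaults, so a local run needs no explicit setting; passing `-DSparkMaster=...` on the JVM command line would override the second one.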
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala File examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@34 PS4, Line 34: import spark.implicits._ > It fails to build: This link explains it a bit better: https://stackoverflow.com/questions/39968707/spark-2-0-missing-spark-implicits -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 14:52:29 + Gerrit-HasComments: Yes
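Editorial sketch of why `import spark.implicits._` cannot sit with the file-level imports: the import path names a value, not a package, so it only resolves once that value exists. The `Session` class below is a hypothetical stand-in for `SparkSession`, written in plain Scala so the snippet runs without Spark on the classpath.

```scala
// Hypothetical stand-in for SparkSession: the implicits live on an instance member.
class Session(val appName: String) {
  object implicits {
    // An extension method that is only visible after importing from an instance.
    implicit class RichInt(val n: Int) {
      def doubled: Int = n * 2
    }
  }
}

object ImportFromInstance {
  def demo(): Int = {
    val spark = new Session("demo")
    // Legal only here: `spark` is a val in scope, so it is a stable import path.
    import spark.implicits._
    21.doubled
  }

  def main(args: Array[String]): Unit =
    println(demo())
}
```

Moving the import above `object ImportFromInstance` would fail to compile for the same reason as the build error quoted in the thread: at that point there is nothing named `spark` to import from.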
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (11 comments) http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@43 PS1, Line 43: To specify a value at execution time, you'll specify the parameter name in the '-D' format. For example, to set a different set of masters for the Kudu cluster from the command line and use a custom table name, set the property `KuduMasters` to a CSV of the master addresses in the form `host:port` and add a table name value, as shown: > I meant the default you are using in your code. For example, by default you Done http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@43 PS4, Line 43: To specify a value at execution time, you'll specify the parameter name in the '-D' format. For example, to set a different set of masters for the Kudu cluster from the command line and use a custom table name, set the property `KuduMasters` to a CSV of the master addresses in the form `host:port` and add a table name value, as shown: > Nit: line breaks. 
Done http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala File examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@16 PS4, Line 16: val KuduMasters: String = System.getProperty("KuduMasters","kudu.master1:7051,kudu.master2:7051,kudu.master3:7051") //kudu master address list > Can we use localhost:7051 as the default? Done http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@22 PS4, Line 22: //defining a class that we'll use to insert data into the table > Nit: periods at the end of comments. Done http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@27 PS4, Line 27: val logger = LoggerFactory.getLogger(SparkExample.getClass) > nit: this can be defined at the top of the object. Done http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@34 PS4, Line 34: import spark.implicits._ > Can you try moving this? It should be fine to be with the other imports. It fails to build: [ERROR] /Users/mbarnett/src/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala:14: error: not found: object spark [ERROR] import spark.implicits._ [ERROR]^ [INFO] [INFO] BUILD FAILURE because it's using the 'spark' instance we declare above it, which actually contains the lib. http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@39 PS4, Line 39:List( > nit: This indentation seams level looks like it is using 4 spaces. Sorry, didn't quite understand this one. Should this be simply indented, or level? 
http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@40 PS4, Line 40: StructField(IdCol,IntegerType,false), > nit: space after commas. Done http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@48 PS4, Line 48: kc.createTable(TableName, schema, Seq(IdCol), new CreateTableOptions().setNumReplicas(3).addHashPartitions(List(IdCol).asJava, 3)) > This still has a replication factor of 3. Missed the change. Completed now. http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@71 PS4, Line 71: val upsertUsers = Array(User("newUserA", 1234), User("userC", )) > To show the value of an upsert, should we re-use a key that was already ins Yep, I am repeating a value from above (the id=1234). This will upsert 1234 and change the name of 'userA' to 'newUserA'. If you think the change should be more drastic to call more attention to what the upsert did, let me know. I can have it change to something like "BrandNewUser" instead. http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@84 PS4, Line
[kudu-CR] [examples] Add basic Spark example written in Scala
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: (11 comments) http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@43 PS1, Line 43: To specify a value at execution time, you'll specify the parameter name in the '-D' format. For example, to set a different set of masters for the Kudu cluster from the command line and use a custom table name, set the property `KuduMasters` to a CSV of the master addresses in the form `host:port` and add a table name value, as shown: > I didn't see any default listed for this value - it's inclusion is only due I meant the default you are using in your code. For example, by default you set SPARK_MASTER to local if it isn't provided. http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/README.adoc@43 PS4, Line 43: To specify a value at execution time, you'll specify the parameter name in the '-D' format. For example, to set a different set of masters for the Kudu cluster from the command line and use a custom table name, set the property `KuduMasters` to a CSV of the master addresses in the form `host:port` and add a table name value, as shown: Nit: line breaks. 
http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala File examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala: http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@16 PS4, Line 16: val KuduMasters: String = System.getProperty("KuduMasters","kudu.master1:7051,kudu.master2:7051,kudu.master3:7051") //kudu master address list Can we use localhost:7051 as the default? http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@22 PS4, Line 22: //defining a class that we'll use to insert data into the table Nit: periods at the end of comments. http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@27 PS4, Line 27: val logger = LoggerFactory.getLogger(SparkExample.getClass) nit: this can be defined at the top of the object. http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@34 PS4, Line 34: import spark.implicits._ Can you try moving this? It should be fine to be with the other imports. http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@39 PS4, Line 39:List( nit: This indentation seams level looks like it is using 4 spaces. http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@40 PS4, Line 40: StructField(IdCol,IntegerType,false), nit: space after commas. 
http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@48 PS4, Line 48: kc.createTable(TableName, schema, Seq(IdCol), new CreateTableOptions().setNumReplicas(3).addHashPartitions(List(IdCol).asJava, 3)) This still has a replication factor of 3. http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@71 PS4, Line 71: val upsertUsers = Array(User("newUserA", 1234), User("userC", )) To show the value of an upsert, should we re-use a key that was already inserted? http://gerrit.cloudera.org:8080/#/c/11788/4/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@84 PS4, Line 84: finally try { just finally? -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Fri, 26 Oct 2018 13:59:50 + Gerrit-HasComments: Yes
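Editorial sketch of the upsert question raised above: re-using a key that was already inserted overwrites that row, while a new key adds a row. This is an in-memory model built on a Scala `Map`, not the Kudu or kudu-spark API; the `User` shape follows the quoted line, and every id other than 1234 is a made-up value for illustration.

```scala
final case class User(name: String, id: Int)

object UpsertSketch {
  // Upsert into a table keyed by id: replace the row if the key exists, else insert it.
  def upsert(table: Map[Int, User], rows: Seq[User]): Map[Int, User] =
    rows.foldLeft(table)((t, u) => t.updated(u.id, u))

  def main(args: Array[String]): Unit = {
    // Ids 5678 and 7777 are hypothetical; only 1234 appears in the thread.
    val inserted = upsert(Map.empty, Seq(User("userA", 1234), User("userB", 5678)))
    val upserted = upsert(inserted, Seq(User("newUserA", 1234), User("userC", 7777)))
    upserted.values.toSeq.sortBy(_.id).foreach(println)
  }
}
```

After the second call the table holds three rows, and the row with id 1234 carries the name `newUserA` instead of `userA`.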
[kudu-CR] [examples] Add basic Spark example written in Scala
Attila Bukor has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: Verified+1 unrelated TSAN failure -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 25 Oct 2018 23:16:47 + Gerrit-HasComments: No
[kudu-CR] [examples] Add basic Spark example written in Scala
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 25 Oct 2018 23:16:36 + Gerrit-HasComments: No
[kudu-CR] [examples] Add basic Spark example written in Scala
Adar Dembo has removed a vote on this change. Change subject: [examples] Add basic Spark example written in Scala .. Removed Code-Review+1 by Mitch Barnett -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley
[kudu-CR] [examples] Add basic Spark example written in Scala
Attila Bukor has removed a vote on this change. Change subject: [examples] Add basic Spark example written in Scala .. Removed Verified-1 by Kudu Jenkins (120) -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: deleteVote Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Adar Dembo Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 4: Code-Review+1 (11 comments) Updated per Grant's feedback. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@18 PS1, Line 18: = Kudu-Spark example README > nit: extra space Done http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@40 PS1, Line 40: - KuduMasters: A String value consisting of a comma-separated list of Kudu Master Host addresses. This will need to be pointed to the Kudu cluster you wish to target. > Is this needed? Given the code is creating the table, we have control over Good call - probably better to give less control so as not to cause additional failures. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@43 PS1, Line 43: To specify a value at execution time, you'll specify the parameter name in the '-D' format. For example, to set a different set of masters for the Kudu cluster from the command line and use a custom table name, set the property `KuduMasters` to a CSV of the master addresses in the form `host:port` and add a table name value, as shown: > nit: Move up near KUDU_MASTERS I didn't see any default listed for this value - its inclusion is only due to it being a requirement of the SparkSession create statement. I suppose you could infer that 'local' is the default value, since any other value would require a Spark master to be explicitly defined. 
http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala File examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala: http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@14 PS1, Line 14: object SparkExample { > nit: val KuduMasters: String = ... Done http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@19 PS1, Line 19: val NameCol = "name" > Should this be configurable like KUDU_MASTERS? I went ahead and made it configurable, to give those who want it the option of changing where it runs. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@32 PS1, Line 32: > Nit: I think the below syntax is easier Went ahead and made that syntax change - a bit easier to read. Also went ahead and explicitly defined 'false' for the StructFields instead of the NULLABLE constant. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@39 PS1, Line 39: > nit: Replication factor of 1 might be easier for examples. Done http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@43 PS1, Line 43: ) > Can this be moved up by the other imports. From what I've read in the Spark documentation and what I've seen in the code base, it can't be moved. It's being called directly from the SparkSession we instantiated above. I could move it higher in main(), but I can't take it up to the other imports unfortunately. 
http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@47 PS1, Line 47: if (!kc.tableExists(TableName)) { > Nit: Use slf4j logging. Done http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@77 PS1, Line 77: //create a DF and map the values of the KuduMaster and TableName values > No need to catch if there is no handling. Just let the exception bubble up. Done http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@87 PS1, Line 87: kc.deleteTable(TableName) > No need to catch here either. Done -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Mitch Barnett Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 25 Oct 2018 23:11:25 + Gerrit-HasComments: Yes
[kudu-CR] [examples] Add basic Spark example written in Scala
Hello Will Berkeley, Attila Bukor, Kudu Jenkins, Grant Henke, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11788 to look at the new patch set (#4). Change subject: [examples] Add basic Spark example written in Scala .. [examples] Add basic Spark example written in Scala This patch adds a basic Kudu client that utilizes both Kudu Java APIs, as well as Spark SQL APIs. It will allow customers to pull down the pom.xml and scala source, then build and execute from their local machine. Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f --- A examples/scala/spark-example/README.adoc A examples/scala/spark-example/pom.xml A examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala 3 files changed, 259 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/11788/4 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 4 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Will Berkeley
[kudu-CR] [examples] Add basic Spark example written in Scala
Hello Will Berkeley, Attila Bukor, Kudu Jenkins, Grant Henke, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11788 to look at the new patch set (#3). Change subject: [examples] Add basic Spark example written in Scala .. [examples] Add basic Spark example written in Scala This patch adds a basic Kudu client that utilizes both Kudu Java APIs, as well as Spark SQL APIs. It will allow customers to pull down the pom.xml and scala source, then build and execute from their local machine. Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f --- A examples/scala/spark-example/README.adoc A examples/scala/spark-example/pom.xml A examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala 3 files changed, 257 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/11788/3 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 3 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Will Berkeley
[kudu-CR] [examples] Add basic Spark example written in Scala
Hello Will Berkeley, Attila Bukor, Kudu Jenkins, Grant Henke, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11788 to look at the new patch set (#2). Change subject: [examples] Add basic Spark example written in Scala .. [examples] Add basic Spark example written in Scala This patch adds a basic Kudu client that utilizes both Kudu Java APIs, as well as Spark SQL APIs. It will allow customers to pull down the pom.xml and scala source, then build and execute from their local machine. Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f --- A examples/scala/spark-example/README.adoc A examples/scala/spark-example/pom.xml A examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala 3 files changed, 259 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/11788/2 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 2 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Will Berkeley
[kudu-CR] [examples] Add basic Spark example written in Scala
Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. Patch Set 1: (11 comments) Thanks for the contribution. I did a quick first review pass. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc File examples/scala/spark-example/README.adoc: http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@18 PS1, Line 18: = Kudu-Spark example README nit: extra space http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@40 PS1, Line 40: - ID_COL: String value containing the name of the Primary Key column. Is this needed? Given the code is creating the table, we have control over this. Same with NAME_COL. I think it complicates the example. Looking at the code, these aren't exposed as settable properties. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/README.adoc@43 PS1, Line 43: - SPARK_MASTER: String value which identifies the location of the Spark Master to be used. If running locally (standalone), this must be set to 'local'. nit: Move up near KUDU_MASTERS Can you specify if these have a default? http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala File examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala: http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@14 PS1, Line 14: val KUDU_MASTERS : String = System.getProperty("KUDU_MASTERS","kudu.master1:7051,kudu.master2:7051,kudu.master3:7051") //kudu master address list nit: val KuduMasters: String = ... In Scala constants are usually named like objects. The others below should be changed too. 
http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@19 PS1, Line 19: val SPARK_MASTER = "local" //location of spark master, specify 'local' if running on localhost Should this be configurable like KUDU_MASTERS? http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@32 PS1, Line 32: val schema : StructType = StructType { Nit: I think the below syntax is easier StructType( List( StructField("ID_COL", DataTypes.IntegerType,NULLABLE), StructField("NAME_COL", DataTypes.StringType,NULLABLE), )) ID columns can't be nullable. You can probably just hard-code whether a column is nullable or not. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@39 PS1, Line 39: ID_COL nit: Replication factor of 1 might be easier for examples. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@43 PS1, Line 43: import spark.implicits._ Can this be moved up by the other imports? http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@47 PS1, Line 47: System.out.println("Writing to table " + TABLE_NAME) Nit: Use slf4j logging. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@77 PS1, Line 77: case e: Exception => e.printStackTrace() No need to catch if there is no handling. Just let the exception bubble up. http://gerrit.cloudera.org:8080/#/c/11788/1/examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala@87 PS1, Line 87: case e: Exception => e.printStackTrace() No need to catch here either. 
-- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 1 Gerrit-Owner: Mitch Barnett Gerrit-Reviewer: Attila Bukor Gerrit-Reviewer: Grant Henke Gerrit-Reviewer: Kudu Jenkins (120) Gerrit-Reviewer: Will Berkeley Gerrit-Comment-Date: Thu, 25 Oct 2018 19:46:01 + Gerrit-HasComments: Yes
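Editorial sketch of the "no need to catch" advice in the review above: a `try`/`finally` with no `catch` clause still guarantees cleanup, and any exception simply propagates to the caller. The helper name `withCleanup` is hypothetical, and the cleanup step here only flips a flag in place of the real table deletion.

```scala
object CleanupSketch {
  // Runs `work`, then always runs `cleanup`; an exception from `work` is not swallowed.
  def withCleanup[A](work: => A)(cleanup: => Unit): A =
    try work finally cleanup

  def main(args: Array[String]): Unit = {
    var tableDeleted = false
    val rows = withCleanup {
      println("reading and writing rows")
      2
    } {
      // Stand-in for the cleanup step, e.g. deleting the example table.
      tableDeleted = true
    }
    println(s"rows=$rows tableDeleted=$tableDeleted")
  }
}
```

Compared with `case e: Exception => e.printStackTrace()`, this keeps the failure visible to whoever launched the job while still running the cleanup.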
[kudu-CR] [examples] Add basic Spark example written in Scala
Mitch Barnett has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11788 ) Change subject: [examples] Add basic Spark example written in Scala .. [examples] Add basic Spark example written in Scala This patch adds a basic Kudu client that utilizes both Kudu Java APIs, as well as Spark SQL APIs. It will allow customers to pull down the pom.xml and scala source, then build and execute from their local machine. Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f --- A examples/scala/spark-example/README.adoc A examples/scala/spark-example/pom.xml A examples/scala/spark-example/src/main/scala/org/apache/kudu/examples/SparkExample.scala 3 files changed, 259 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/88/11788/1 -- To view, visit http://gerrit.cloudera.org:8080/11788 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: kudu Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I9ba09f0118c054a07b951e241c31d66245c57d3f Gerrit-Change-Number: 11788 Gerrit-PatchSet: 1 Gerrit-Owner: Mitch Barnett