sekikn commented on a change in pull request #28063:
[SPARK-31293][DSTREAMS][Kinesis][DOC] Fix wrong examples and help messages for
Kinesis integration
URL: https://github.com/apache/spark/pull/28063#discussion_r399837065
##########
File path: docs/streaming-kinesis-integration.md
##########
@@ -246,8 +246,7 @@ To run the example,
</div>
<div data-lang="python" markdown="1">
- ./bin/spark-submit --jars external/kinesis-asl/target/scala-*/\
- spark-streaming-kinesis-asl-assembly_*.jar \
+ ./bin/spark-submit --jars 'external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_*.jar' \
Review comment:
If the path is quoted, the shell passes the pattern through unchanged, Spark
expands the wildcard itself, and the command works.
```
~/repos/spark/dist$ bash -x bin/spark-submit --verbose --jars
'../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_*.jar'
../external/kinesis-asl/src/main/python/examples/streaming/kinesis_wordcount_asl.py
myAppName mySparkStream https://kinesis.ap-northeast-1.amazonaws.com
ap-northeast-1
[snip]
+ exec /home/sekikn/repos/spark/dist/bin/spark-class
org.apache.spark.deploy.SparkSubmit --verbose --jars
'../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_*.jar'
../external/kinesis-asl/src/main/python/examples/streaming/kinesis_wordcount_asl.py
myAppName mySparkStream https://kinesis.ap-northeast-1.amazonaws.com
ap-northeast-1
[snip]
Parsed arguments:
[snip]
primaryResource
file:/home/sekikn/repos/spark/dist/../external/kinesis-asl/src/main/python/examples/streaming/kinesis_wordcount_asl.py
name kinesis_wordcount_asl.py
childArgs [myAppName mySparkStream
https://kinesis.ap-northeast-1.amazonaws.com ap-northeast-1]
jars
file:/home/sekikn/repos/spark/dist/../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_*.jar
[snip]
Spark config:
(spark.executor.extraJavaOptions,*********(redacted))
(spark.jars,file:/home/sekikn/repos/spark/external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT-sources.jar,file:/home/sekikn/repos/spark/external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT-test-sources.jar,file:/home/sekikn/repos/spark/external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT-tests.jar,file:/home/sekikn/repos/spark/external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT.jar)
(spark.app.name,kinesis_wordcount_asl.py)
[snip]
```
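To illustrate why the quoted form is safe, here is a minimal Python sketch of a program expanding a literal pattern on its own and joining the matches with commas, which resembles the `spark.jars` value in the verbose output above. This is not Spark's implementation; the helper name `expand_jars_pattern` and the shortened jar names are hypothetical.

```python
import glob
import os
import tempfile


def expand_jars_pattern(pattern):
    """Expand a glob pattern into a comma-separated list of matching paths.

    Hypothetical helper for illustration only; Spark's own resolution
    logic differs in detail.
    """
    matches = sorted(glob.glob(pattern))
    return ",".join(matches)


def demo():
    """Create a few empty jar files and expand a pattern that matches all of them."""
    with tempfile.TemporaryDirectory() as target:
        # Hypothetical, shortened stand-ins for the assembly jars in the log.
        for name in ["asm_2.12.jar", "asm_2.12-sources.jar", "asm_2.12-tests.jar"]:
            open(os.path.join(target, name), "w").close()
        # The program receives the literal pattern, as it does when the
        # argument is quoted on the command line.
        expanded = expand_jars_pattern(os.path.join(target, "asm_*.jar"))
        return [os.path.basename(p) for p in expanded.split(",")]


if __name__ == "__main__":
    print(demo())
```

Because the pattern arrives as a single argument, every matching jar ends up in one comma-separated value instead of spilling into the following positional arguments.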
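Without the quotes, the shell expands the wildcard before Spark sees it. If the
directory contains more than one jar matching the pattern, only the first match
becomes the `--jars` value, the second is taken as the primary resource, and the
rest become application arguments, so the command fails as follows.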
```
~/repos/spark/dist$ bash -x bin/spark-submit --verbose --jars
../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_*.jar
../external/kinesis-asl/src/main/python/examples/streaming/kinesis_wordcount_asl.py
myAppName mySparkStream https://kinesis.ap-northeast-1.amazonaws.com
ap-northeast-1
[snip]
+ exec /home/sekikn/repos/spark/dist/bin/spark-class
org.apache.spark.deploy.SparkSubmit --verbose --jars
../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT.jar
../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT-sources.jar
../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT-tests.jar
../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT-test-sources.jar
../external/kinesis-asl/src/main/python/examples/streaming/kinesis_wordcount_asl.py
myAppName mySparkStream https://kinesis.ap-northeast-1.amazonaws.com
ap-northeast-1
[snip]
Parsed arguments:
[snip]
primaryResource
file:/home/sekikn/repos/spark/dist/../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT-sources.jar
name
spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT-sources.jar
childArgs
[../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT-tests.jar
../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT-test-sources.jar
../external/kinesis-asl/src/main/python/examples/streaming/kinesis_wordcount_asl.py
myAppName mySparkStream https://kinesis.ap-northeast-1.amazonaws.com
ap-northeast-1]
jars
file:/home/sekikn/repos/spark/dist/../external/kinesis-asl-assembly/target/spark-streaming-kinesis-asl-assembly_2.12-3.1.0-SNAPSHOT.jar
[snip]
Spark properties used, including those specified through
--conf and those from the properties file
/home/sekikn/repos/spark/dist/conf/spark-defaults.conf:
(spark.executor.extraJavaOptions,*********(redacted))
(spark.driver.extraJavaOptions,*********(redacted))
[snip]
Exception in thread "main" org.apache.spark.SparkException: No main class set in JAR; please specify one with --class.
    at org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:940)
    at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:463)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:875)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1011)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1020)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
```
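The shift in argument positions seen in the second log can be reproduced with a toy model. The sketch below is a deliberately simplified parser, not `SparkSubmit`'s actual one: it assumes `--jars` takes exactly one value, the first remaining positional argument is the primary resource, and everything after it is passed to the application. The function name `parse_submit_args` and the shortened paths are hypothetical.

```python
def parse_submit_args(argv):
    """Toy model of option parsing (an assumption, not Spark's real parser).

    Returns (jars, primary_resource, child_args).
    """
    jars = None
    i = 0
    while i < len(argv) and argv[i].startswith("--"):
        if argv[i] == "--jars":
            jars = argv[i + 1]  # --jars consumes exactly one following value
            i += 2
        else:
            i += 1  # value-less flag such as --verbose
    primary = argv[i] if i < len(argv) else None
    child_args = argv[i + 1:]
    return jars, primary, child_args


# Quoted pattern: the program receives one literal argument.
quoted = ["--verbose", "--jars", "target/asm_*.jar", "app.py", "myAppName"]

# Unquoted pattern after the shell expanded it to two matching jars.
unquoted = [
    "--verbose", "--jars",
    "target/asm_2.12.jar", "target/asm_2.12-sources.jar",
    "app.py", "myAppName",
]

if __name__ == "__main__":
    print(parse_submit_args(quoted))
    print(parse_submit_args(unquoted))
```

In the unquoted case the second jar lands in the primary-resource slot and the Python script is demoted to an application argument, which matches the `primaryResource` and `childArgs` values in the failing run and explains the "No main class set in JAR" error.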