[ 
https://issues.apache.org/jira/browse/SPARK-28749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908751#comment-16908751
 ] 

Matt Foley edited comment on SPARK-28749 at 8/16/19 5:55 AM:
-------------------------------------------------------------

Hi [~hyukjin.kwon], thanks for looking at the issue.  I did try that, but it 
doesn't work for the following reason:
In {{python/pyspark/streaming/tests.py}}:
* {{ENABLE_KAFKA_0_8_TESTS}} is used to derive the boolean {{are_kafka_tests_enabled}}.
* The call to {{search_kafka_assembly_jar()}} is not guarded by {{are_kafka_tests_enabled}}.
* The failure exception is thrown from within {{search_kafka_assembly_jar()}}.

So making {{ENABLE_KAFKA_0_8_TESTS}} properly guard the call to 
{{search_kafka_assembly_jar()}} would be a similar bug fix.
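The guard described above could look roughly like this. This is only a minimal sketch: the helper body below is a hypothetical stand-in for the real {{search_kafka_assembly_jar()}} in {{tests.py}}, and the exact env-var check is assumed to match how {{are_kafka_tests_enabled}} is derived there.

```python
import os

# Hypothetical stand-in for search_kafka_assembly_jar(), which raises
# when the kafka-0-8 assembly jar cannot be found.
def search_kafka_assembly_jar():
    raise Exception("Failed to find Spark Streaming kafka assembly jar")

# Assumed derivation of the flag from the environment variable.
are_kafka_tests_enabled = (
    os.environ.get("ENABLE_KAFKA_0_8_TESTS") == "1")

# Guarded lookup: only search for the jar when the kafka-0-8 tests are
# enabled, so builds without kafka-0-8 no longer fail at this point.
kafka_assembly_jar = (
    search_kafka_assembly_jar() if are_kafka_tests_enabled else None)
```

With the guard in place, a Scala-2.12 test run that leaves {{ENABLE_KAFKA_0_8_TESTS}} unset skips the jar search entirely instead of throwing.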


> Fix PySpark tests not to require kafka-0-8 in branch-2.4
> --------------------------------------------------------
>
>                 Key: SPARK-28749
>                 URL: https://issues.apache.org/jira/browse/SPARK-28749
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark, Tests
>    Affects Versions: 2.4.3
>            Reporter: Matt Foley
>            Priority: Minor
>
> As noted in SPARK-27550 we want to encourage testing of Spark 2.4.x with 
> Scala-2.12, and kafka-0-8 does not support Scala-2.12.
> Currently, the PySpark tests invoked by `python/run-tests` demand the 
> presence of kafka-0-8 libraries. If not present, this failure message will be 
> generated:
> {code}
> Traceback (most recent call last):
>   File "/usr/lib64/python2.7/runpy.py", line 174, in _run_module_as_main
>     "__main__", fname, loader, pkg_name)
>   File "/usr/lib64/python2.7/runpy.py", line 72, in _run_code
>     exec code in run_globals
>   File "spark/python/pyspark/streaming/tests.py", line 1579, in <module>
>     kafka_assembly_jar = search_kafka_assembly_jar()
>   File "spark/python/pyspark/streaming/tests.py", line 1524, in search_kafka_assembly_jar
>     "You need to build Spark with "
> Exception: Failed to find Spark Streaming kafka assembly jar in spark/external/kafka-0-8-assembly. You need to build Spark with 'build/sbt -Pkafka-0-8 assembly/package streaming-kafka-0-8-assembly/assembly' or 'build/mvn -DskipTests -Pkafka-0-8 package' before running this test.
> Had test failures in pyspark.streaming.tests with spark/py_virtenv/bin/python; see logs.
> Process exited with code 255
> {code}
> This change is only targeted at branch-2.4, as most kafka-0-8 related 
> materials have been removed in master and this problem no longer occurs there.
> PROPOSED SOLUTION
> The proposed solution is to make the kafka-0-8 stream testing optional for 
> pyspark testing, exactly the same as the Kinesis stream testing currently is, 
> in file `python/pyspark/streaming/tests.py`. This is only a few lines of 
> change.
> Ideally it would be limited to when SPARK_SCALA_VERSION >= 2.12, but it turns 
> out to be somewhat onerous to reliably obtain that value from within the 
> python test env, and no other python test code currently does so. So my 
> proposed solution simply makes the use of the kafka-0-8 profile optional, and 
> leaves it to the tester to include it for Scala-2.11 test builds and exclude 
> it for Scala-2.12 test builds.
> PR will be available in a day or so.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
