[jira] [Updated] (SPARK-25014) When we tried to read kafka topic through spark streaming spark submit is getting failed with Python worker exited unexpectedly (crashed) error

Sean Owen (JIRA) Fri, 03 Aug 2018 08:48:32 -0700


     [ 
https://issues.apache.org/jira/browse/SPARK-25014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sean Owen updated SPARK-25014:
------------------------------
         Priority: Major  (was: Blocker)
    Fix Version/s:     (was: 2.3.2)

Please read [https://spark.apache.org/contributing.html]

For example, don't set Blocker.

Nothing about this rules out an env problem or code problem. Jira is for 
reporting issues narrowed down to Spark, rather than asking for support.

> When we tried to read kafka topic through spark streaming spark submit is 
> getting failed with Python worker exited unexpectedly (crashed) error 
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-25014
>                 URL: https://issues.apache.org/jira/browse/SPARK-25014
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.3.1
>            Reporter: KARTHIKEYAN RASIPALAYAM DURAIRAJ
>            Priority: Major
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Hi Team , 
>  
> TOPIC = 'NBC_APPS.TBL_MS_ADVERTISER'
> PARTITION = 0
> topicAndPartition = TopicAndPartition(TOPIC, PARTITION)
> fromOffsets1 = \{topicAndPartition:int(PARTITION)}
>  
> def handler(message):
>     records = message.collect()
>     for record in records:
>         value_all=record[1]
>         value_key=record[0]
> #        print(value_all)
>  
> schema_registry_client = 
> CachedSchemaRegistryClient(url='http://localhost:8081')
> serializer = MessageSerializer(schema_registry_client)
> sc = SparkContext(appName="PythonStreamingAvro")
> ssc = StreamingContext(sc, 10)
> kvs = KafkaUtils.createDirectStream(ssc, ['NBC_APPS.TBL_MS_ADVERTISER'], 
> \{"metadata.broker.list": 
> 'localhost:9092'},valueDecoder=serializer.decode_message)
> lines = kvs.map(lambda x: x[1])
> lines.pprint()
> kvs.foreachRDD(handler)
>  
> ssc.start()
> ssc.awaitTermination()
>  
> This is code we trying to pull the data from kafka topic . when we execute 
> through spark submit we are getting below error 
>  
>  
> 2018-08-03 11:10:40 INFO  VerifiableProperties:68 - Property 
> zookeeper.connect is overridden to 
> 2018-08-03 11:10:40 ERROR PythonRunner:91 - Python worker exited unexpectedly 
> (crashed)
> org.apache.spark.api.python.PythonException: Traceback (most recent call 
> last):
>   File 
> "/Users/KarthikeyanDurairaj/Desktop/Sparkshell/python/lib/pyspark.zip/pyspark/worker.py",
>  line 215, in main
>     eval_type = read_int(infile)
>   File 
> "/Users/KarthikeyanDurairaj/Desktop/Sparkshell/python/lib/pyspark.zip/pyspark/serializers.py",
>  line 685, in read_int
>     raise EOFError
> EOFError
>  
>  at 
> org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:298)
>  at 
> org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:438)
>  at 
> org.apache.spark.api.python.PythonRunner$$anon$1.read(PythonRunner.scala:421)
>  at 
> org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:252)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Updated] (SPARK-25014) When we tried to read kafka topic through spark streaming spark submit is getting failed with Python worker exited unexpectedly (crashed) error

Reply via email to