[ 
https://issues.apache.org/jira/browse/SPARK-28242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17014919#comment-17014919
 ] 

Hyokun Park commented on SPARK-28242:
-------------------------------------

Hi [~mcanes]

In my case, I resolved the problem by adding a configuration.

Please add "--conf 
spark.hadoop.dfs.client.block.write.replace-datanode-on-failure.enable=false" 
to your spark-submit command.
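
For reference, the same setting can also be applied when the SparkSession is 
built, since "spark.hadoop.*" keys are forwarded to the Hadoop client 
configuration. The snippet below is only an illustrative sketch (the class 
name, app name, and surrounding main method are placeholders); passing the 
flag to spark-submit as above is equivalent.

{code:java}
import org.apache.spark.sql.SparkSession;

public class ReplaceDatanodeConfExample {
  public static void main(String[] args) {
    // Illustrative sketch only: set the HDFS client property on the builder
    // instead of passing --conf to spark-submit. Keys prefixed with
    // "spark.hadoop." are copied into the Hadoop Configuration used by the
    // HDFS client.
    SparkSession spark = SparkSession.builder()
        .appName("kafka-to-hdfs-stream") // placeholder app name
        .config("spark.hadoop.dfs.client.block.write.replace-datanode-on-failure.enable",
                "false")
        .getOrCreate();

    // ... build the readStream/writeStream pipeline on `spark` as in the
    // snippet quoted from the issue description below.
  }
}
{code}

With this property set to false, the HDFS client stops trying to replace 
failed datanodes in an open write pipeline, which appears to be the retry 
path behind the DataStreamer errors quoted below.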

> DataStreamer keeps logging errors even after fixing writeStream output sink
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-28242
>                 URL: https://issues.apache.org/jira/browse/SPARK-28242
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.3
>         Environment: Hadoop 2.8.4
>  
>            Reporter: Miquel Canes
>            Priority: Minor
>
> I have been testing what happens to a running structured streaming query 
> that is writing to HDFS when all datanodes are down/stopped or the whole 
> cluster is down (including the namenode).
> So I created a structured stream from Kafka to a file output sink on HDFS 
> and tested some scenarios.
> We used a very simple streaming query:
> {code:java}
> spark.readStream()
>     .format("kafka")
>     .option("kafka.bootstrap.servers", "kafka.server:9092...")
>     .option("subscribe", "test_topic")
>     .load()
>     .select(col("value").cast(DataTypes.StringType))
>     .writeStream()
>     .format("text")
>     .option("path", "HDFS/PATH")
>     .option("checkpointLocation", "checkpointPath")
>     .start()
>     .awaitTermination();
> {code}
>  
> After stopping all the datanodes, the process starts logging the error that 
> the datanodes are bad.
> That is expected...
> {code:java}
> 2019-07-03 15:55:00 [spark-listener-group-eventLog] ERROR org.apache.spark.scheduler.AsyncEventQueue:91 - Listener EventLoggingListener threw an exception
> java.io.IOException: All datanodes [DatanodeInfoWithStorage[10.2.12.202:50010,DS-d2fba01b-28eb-4fe4-baaa-4072102a2172,DISK]] are bad. Aborting...
>     at org.apache.hadoop.hdfs.DataStreamer.handleBadDatanode(DataStreamer.java:1530)
>     at org.apache.hadoop.hdfs.DataStreamer.setupPipelineForAppendOrRecovery(DataStreamer.java:1465)
>     at org.apache.hadoop.hdfs.DataStreamer.processDatanodeError(DataStreamer.java:1237)
>     at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:657)
> {code}
> The problem is that even after the datanodes are started again, the process 
> keeps logging the same error all the time.
> We checked, and the writeStream to HDFS recovered successfully after the 
> datanodes were started, and the output sink worked again without problems.
> I have been trying some different HDFS configurations to be sure it is not a 
> client-configuration problem, but I have found no clue about how to fix it.
> It seems that something is stuck indefinitely in an error loop.
>  


