Hello,

I am struggling with a task that should be super simple:

I define a StructType to load JSON data from Kafka with Spark Structured
Streaming. Some fields may have no value in a given record. How can I set a
default value for such a field?

For example:

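import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Schema for the JSON payload; every field is optional
val schema = StructType(Array(
  StructField("a", StringType, nullable = true),
  StructField("b", StringType, nullable = true),
  StructField("c", StringType, nullable = true)
))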



spark
  .readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "localhost:9092")
  .option("subscribe", "input-topic")
  .option("failOnDataLoss", "false")
  .load()



df.writeStream
  .format(format)
  .option("checkpointLocation", checkpoint)
  .option("path", path)
  .outputMode(OutputMode.Append)
  .trigger(ProcessingTime("10 seconds"))
  .start()



If the input data has no b, how can I set a default value (xxx) for it? Is a UDF the only way?
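
To make the question concrete, here is a rough sketch of a non-UDF approach that might work, using the built-in from_json, coalesce and lit functions. It is untested against the real topic and rests on assumptions: the JSON sits in the Kafka "value" column, the object name DefaultValueSketch and the helper parseWithDefault are made up for illustration, and a small in-memory batch DataFrame stands in for the Kafka stream.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col, from_json, lit}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Hypothetical object name, for illustration only
object DefaultValueSketch {
  // Same schema as in the question
  val schema: StructType = StructType(Array(
    StructField("a", StringType, nullable = true),
    StructField("b", StringType, nullable = true),
    StructField("c", StringType, nullable = true)
  ))

  // Hypothetical helper: parse the JSON in `value`, then replace a null b with "xxx".
  // A field missing from the JSON comes out of from_json as null, so coalesce
  // covers both a missing b and an explicit "b": null.
  def parseWithDefault(raw: DataFrame): DataFrame =
    raw
      .select(from_json(col("value").cast("string"), schema).as("data"))
      .select("data.*")
      .withColumn("b", coalesce(col("b"), lit("xxx")))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("default-value-sketch")
      .getOrCreate()
    import spark.implicits._

    // Assumption: a batch DataFrame standing in for the Kafka `value` column
    val raw = Seq(
      """{"a":"1","c":"3"}""",
      """{"a":"1","b":"2","c":"3"}"""
    ).toDF("value")

    parseWithDefault(raw).show()
    spark.stop()
  }
}
```

If this is reasonable, the same select/withColumn chain would presumably apply to the DataFrame returned by readStream before it is passed to writeStream, since these are ordinary column expressions rather than a UDF.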






