[
https://issues.apache.org/jira/browse/SPARK-34163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17271056#comment-17271056
]
Hyukjin Kwon commented on SPARK-34163:
--------------------------------------
For questions, please use the Spark mailing lists before filing an issue.
Please also see https://spark.apache.org/community.html
> Spark Structured Streaming - Kafka avro transformation on optional field
> Failed
> -------------------------------------------------------------------------------
>
> Key: SPARK-34163
> URL: https://issues.apache.org/jira/browse/SPARK-34163
> Project: Spark
> Issue Type: Bug
> Components: Structured Streaming
> Affects Versions: 2.4.7
> Reporter: Felix Kizhakkel Jose
> Priority: Major
>
> Hello All,
> I have a Spark Structured Streaming job that ingests data from Kafka, where
> the messages are Avro-encoded.
> Some of the fields in the data are optional, and I have to perform a
> transformation only if those optional fields are present.
> So I tried to check whether the column exists with:
> from pyspark.sql.utils import AnalysisException
>
> def has_column(dataframe, col):
>     """
>     This function checks the existence of a given column in the given
>     DataFrame
>     :param dataframe: the dataframe
>     :type dataframe: DataFrame
>     :param col: the column name
>     :type col: str
>     :return: true if the column exists else false
>     :rtype: boolean
>     """
>     try:
>         # Column lookup raises AnalysisException if the name cannot be resolved
>         dataframe[col]
>         return True
>     except AnalysisException:
>         return False
> This works for a normal (batch) DataFrame: when a column is not present, the
> check returns False, so I can skip the transformation on the missing column.
> But on a streaming DataFrame *has_column* always returns True, so the
> transformation gets executed and causes an exception. What is the right approach
> to check for the existence of a column in a streaming DataFrame before
> performing a transformation?
> Why do streaming and normal DataFrames differ in behavior? How can I skip the
> transformation on a column if it doesn't exist?
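
A possible alternative to the try/except check quoted above, offered as a minimal sketch rather than a confirmed fix: test membership in DataFrame.columns, which is derived from the schema and is available for both batch and streaming DataFrames. The helper apply_if_present is a hypothetical name introduced here for illustration, and the check covers top-level columns only. One possibility, not confirmed in this thread, is that the Avro schema used for decoding already declares the optional fields, in which case the columns always exist (holding nulls) and a null check on the values would be needed instead of an existence check.

```python
def has_column(dataframe, col):
    """Return True if `col` is a top-level column of `dataframe`.

    Relies only on the `columns` attribute, so no query is executed.
    Nested fields (e.g. "a.b") are not handled by this sketch.
    """
    return col in dataframe.columns


def apply_if_present(dataframe, col, transform):
    """Apply `transform(dataframe)` only when `col` exists.

    `apply_if_present` and `transform` are hypothetical names, not part
    of the original report or of the Spark API.
    """
    if has_column(dataframe, col):
        return transform(dataframe)
    # Column is absent: return the DataFrame unchanged.
    return dataframe
```

With a real DataFrame df, this might be used as apply_if_present(df, "optional_field", lambda d: d.withColumn("optional_field", ...)), where "optional_field" stands in for whichever optional field the job needs to transform.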