Yousof Hosny created SPARK-47892:
------------------------------------

             Summary: XML: Stop ignoring CDATA within rows. 
                 Key: SPARK-47892
                 URL: https://issues.apache.org/jira/browse/SPARK-47892
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 4.0.0
            Reporter: Yousof Hosny
             Fix For: 4.0.0


This change ignores CDATA within row tags as well as outside of them. We should 
only ignore CDATA found outside of row tags, because CDATA within a row tag is 
considered data within the row.
[https://github.com/apache/spark/pull/45487]

 

NOTE: With the current parser implementation, once CDATA elements within row 
tags are no longer ignored, one edge case remains: a matching closing row tag 
that appears inside CDATA will still be parsed as a valid end tag. 
Example:
{code:xml}
<row> <![CDATA[ </row> ]]>
{code}
Once CDATA within rows is no longer ignored, the parser will match the closing 
tag inside the CDATA section in the example above as the end of the row, which 
is incorrect. 
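
The edge case can be illustrated with a small standalone sketch. This is not 
Spark's XML parser: find_row_end is a hypothetical helper written only for this 
illustration, and it assumes a flat string scan with no handling of comments, 
attributes, or nested rows. A second closing row tag is appended to the example 
input (also an assumption) so that a correct end of the row exists.

```python
CDATA_START = "<![CDATA["
CDATA_END = "]]>"


def find_row_end(text: str, row_tag: str, start: int = 0) -> int:
    """Return the index just past the first closing row tag that lies
    outside any CDATA section, or -1 if no such tag exists.

    Hypothetical helper for illustration only, not part of Spark.
    """
    end_tag = f"</{row_tag}>"
    pos = start
    while pos < len(text):
        if text.startswith(CDATA_START, pos):
            # Everything up to the CDATA terminator is character data,
            # so a closing row tag in here must not end the row.
            close = text.find(CDATA_END, pos + len(CDATA_START))
            if close == -1:
                return -1  # unterminated CDATA section
            pos = close + len(CDATA_END)
        elif text.startswith(end_tag, pos):
            return pos + len(end_tag)
        else:
            pos += 1
    return -1


doc = "<row> <![CDATA[ </row> ]]> </row>"

# Naive matching stops at the closing tag inside the CDATA section.
naive_end = doc.find("</row>") + len("</row>")
# CDATA-aware matching skips the section and finds the real end tag.
aware_end = find_row_end(doc, "row")

print(repr(doc[:naive_end]))
print(repr(doc[:aware_end]))
```

The naive search cuts the row off in the middle of the CDATA section, while the 
CDATA-aware scan returns the whole row. On the original example, which has no 
closing tag outside CDATA, the helper returns -1.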



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
