[spark] branch branch-3.3 updated: [SPARK-38574][DOCS] Enrich the documentation of option avroSchema

gengliang Mon, 21 Mar 2022 22:27:08 -0700

This is an automated email from the ASF dual-hosted git repository.

gengliang pushed a commit to branch branch-3.3
in repository https://gitbox.apache.org/repos/asf/spark.git



The following commit(s) were added to refs/heads/branch-3.3 by this push:
     new 8b90205  [SPARK-38574][DOCS] Enrich the documentation of option avroSchema
8b90205 is described below

commit 8b90205ae971eb0ef6e79d849abb14243bb7dc0f
Author: tianhanhu <adrianh...@gmail.com>
AuthorDate: Tue Mar 22 13:23:39 2022 +0800

    [SPARK-38574][DOCS] Enrich the documentation of option avroSchema
    
    ### What changes were proposed in this pull request?
    Enrich the Avro data source documentation to emphasize the difference
    between `avroSchema`, which is an option, and `jsonFormatSchema`, which is
    a parameter of the function `from_avro`.
    
    When using `from_avro`, the `avroSchema` option can be set to a compatible,
    evolved schema, while `jsonFormatSchema` must be the actual schema.
    Otherwise, the behavior is undefined.
    
    ### Why are the changes needed?
    Reduce confusion caused by the option and the parameter having similar names.
    
    ### Does this PR introduce _any_ user-facing change?
    Yes, Avro data source documentation is enriched a bit.
    
    ### How was this patch tested?
    No testing required; this is a documentation-only change.
    
    Closes #35880 from tianhanhu/SPARK-38574.
    
    Authored-by: tianhanhu <adrianh...@gmail.com>
    Signed-off-by: Gengliang Wang <gengli...@apache.org>
    (cherry picked from commit ee5121a56e10ba2c65ae67159da472713cc5edd4)
    Signed-off-by: Gengliang Wang <gengli...@apache.org>
---
 docs/sql-data-sources-avro.md | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/docs/sql-data-sources-avro.md b/docs/sql-data-sources-avro.md
index a26d56f..db3e03c 100644
--- a/docs/sql-data-sources-avro.md
+++ b/docs/sql-data-sources-avro.md
@@ -231,10 +231,11 @@ Data source options of Avro can be set via:
     <td>Optional schema provided by a user in JSON format.
       <ul>
         <li>
-          When reading Avro, this option can be set to an evolved schema, which is compatible but different with
+          When reading Avro files or calling function <code>from_avro</code>, this option can be set to an evolved schema, which is compatible but different with
           the actual Avro schema. The deserialization schema will be consistent with the evolved schema.
           For example, if we set an evolved schema containing one additional column with a default value,
-          the reading result in Spark will contain the new column too.
+          the reading result in Spark will contain the new column too. Note that when using this option with 
+          <code>from_avro</code>, you still need to pass the actual Avro schema as a parameter to the function.
         </li>
         <li>
           When writing Avro, this option can be set if the expected output Avro schema doesn't match the
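
The distinction described in this commit can be illustrated with a minimal PySpark sketch. It is not part of the patch. The `User` record, its fields, the helper `decode_users`, and the binary column name `value` are all hypothetical, and the sketch assumes the external `spark-avro` module is on the classpath. The actual (writer) schema is passed as the `jsonFormatSchema` parameter, while the evolved schema goes into the `avroSchema` option.

```python
import json

# Actual schema: the schema the Avro bytes were written with (hypothetical record).
actual_schema = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [{"name": "name", "type": "string"}],
})

# Evolved schema: compatible with the actual one, plus one column with a default.
evolved_schema = json.dumps({
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": "int", "default": -1},
    ],
})


def decode_users(df, column="value"):
    """Decode a binary Avro column of a DataFrame (hypothetical helper).

    jsonFormatSchema (second argument) receives the actual schema;
    the evolved schema is supplied through the avroSchema option.
    """
    # Imported lazily so the schema definitions above work without Spark installed.
    from pyspark.sql.avro.functions import from_avro

    options = {"avroSchema": evolved_schema}
    return df.select(from_avro(df[column], actual_schema, options).alias("user"))
```

Under these assumptions, the decoded `user` struct would be expected to carry the additional `age` column, as the updated documentation describes for an evolved schema with a default value.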
