dongjoon-hyun commented on a change in pull request #27489: 
[SPARK-30703][SQL][DOCS] Add a document for the ANSI mode
URL: https://github.com/apache/spark/pull/27489#discussion_r376739344
 
 

 ##########
 File path: docs/sql-ref-ansi-compliance.md
 ##########
 @@ -19,6 +19,87 @@ license: |
   limitations under the License.
 ---
 
+Spark SQL has two options to comply with the SQL standard: `spark.sql.ansi.enabled` and `spark.sql.storeAssignmentPolicy`.
+When `spark.sql.ansi.enabled` is set to `true` (`false` by default), Spark SQL follows the standard in basic behaviours (e.g., arithmetic operations, type conversion, and SQL parsing).
+Moreover, Spark SQL has an independent option to control implicit casting behaviours when inserting rows into a table.
+The casting behaviours are defined as store assignment rules in the standard.
+When `spark.sql.storeAssignmentPolicy` is set to `ANSI` (the default), Spark SQL complies with the ANSI store assignment rules.
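+
+A minimal sketch of setting both options in a SQL session with the standard `SET` command:
+
+{% highlight sql %}
+-- Enable ANSI-compliant arithmetic, type conversion, and SQL parsing
+SET spark.sql.ansi.enabled=true;
+-- Use the ANSI store assignment rules when writing to tables (the default)
+SET spark.sql.storeAssignmentPolicy=ANSI;
+{% endhighlight %}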
+
+The following subsections present behaviour changes in arithmetic operations, type conversions, and SQL parsing when the ANSI mode is enabled.
+
+### Arithmetic Operations
+
+In Spark SQL, arithmetic operations performed on numeric types (with the exception of decimal) are not checked for overflow by default.
+This means that if an operation causes an overflow, the result is the same as the result of the same operation in a Java/Scala program (e.g., if the sum of two integers exceeds the maximum representable value, the result is a negative number).
+On the other hand, Spark SQL returns null for decimal overflow.
+When `spark.sql.ansi.enabled` is set to `true` and an overflow occurs in a numeric or interval arithmetic operation, Spark SQL throws an arithmetic exception at runtime.
+
+{% highlight sql %}
+-- `spark.sql.ansi.enabled=true`
+SELECT 2147483647 + 1;
+
+  java.lang.ArithmeticException: integer overflow
+
+-- `spark.sql.ansi.enabled=false`
+SELECT 2147483647 + 1;
+
+  +----------------+
+  |(2147483647 + 1)|
+  +----------------+
+  |     -2147483648|
+  +----------------+
+
+{% endhighlight %}
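+
+The decimal case mentioned above can be sketched the same way (the oversized literal is only illustrative, and the exact exception message may vary):
+
+{% highlight sql %}
+-- `spark.sql.ansi.enabled=false`
+-- The product needs 39 digits, which does not fit the result type DECIMAL(38, 0),
+-- so the query returns null instead of a wrong value.
+SELECT 99999999999999999999999999999999999999 * 10;
+
+-- `spark.sql.ansi.enabled=true`
+-- The same query throws java.lang.ArithmeticException at runtime.
+{% endhighlight %}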
+
+### Type Conversion
+
+Spark SQL has three kinds of type conversions: explicit casting, type coercion, and store assignment casting.
+When `spark.sql.ansi.enabled` is set to `true`, explicit casting via `CAST` syntax throws a number-format exception at runtime for illegal cast patterns defined in the standard, e.g. casting a string to an integer.
+On the other hand, `INSERT INTO` syntax throws an analysis exception when the ANSI mode is enabled via `spark.sql.storeAssignmentPolicy=ANSI`.
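+
+For example, a sketch of the store assignment case (the table name `t` is hypothetical, and the exception message is abbreviated):
+
+{% highlight sql %}
+-- `spark.sql.storeAssignmentPolicy=ANSI` (the default)
+CREATE TABLE t (v INT);
+-- Writing a string into an INT column is not a legal ANSI store assignment
+INSERT INTO t VALUES ('1');
+
+  org.apache.spark.sql.AnalysisException: Cannot write incompatible data to table ...
+{% endhighlight %}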
+
+Currently, the ANSI mode affects explicit casting and assignment casting only.
+In future releases, the behaviour of type coercion might change along with the other two type conversion rules.
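+
+For instance, type coercion currently behaves the same regardless of the ANSI mode; a small sketch (the string operand is implicitly coerced to double in current Spark versions):
+
+{% highlight sql %}
+-- Same result whether `spark.sql.ansi.enabled` is `true` or `false`:
+-- the string '2' is implicitly coerced to double, yielding 3.0
+SELECT 1 + '2';
+{% endhighlight %}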
+
+{% highlight sql %}
+-- Examples of explicit casting
+
+-- `spark.sql.ansi.enabled=true`
+SELECT CAST('a' AS INT);
+
+  java.lang.NumberFormatException: invalid input syntax for type numeric: a
+
+-- `spark.sql.ansi.enabled=false` (This is a legacy behaviour until Spark 2.x)
 
 Review comment:
   This can mislead users. We may need to point out again that this is the default behavior in 3.0.0, too.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
