+1 (binding)

On Thu, Oct 10, 2019 at 5:11 PM, Takeshi Yamamuro <linguin....@gmail.com> wrote:
> Thanks for the great work, Gengliang!
>
> +1 for that.
> As I said before, the behaviour is pretty common in DBMSs, so the change
> helps DBMS users.
>
> Bests,
> Takeshi
>
>
> On Mon, Oct 7, 2019 at 5:24 PM Gengliang Wang <
> gengliang.w...@databricks.com> wrote:
>
>> Hi everyone,
>>
>> I'd like to call for a new vote on SPARK-28885
>> <https://issues.apache.org/jira/browse/SPARK-28885> "Follow ANSI store
>> assignment rules in table insertion by default" after revising the ANSI
>> store assignment policy (SPARK-29326
>> <https://issues.apache.org/jira/browse/SPARK-29326>).
>> When inserting a value into a column with a different data type, Spark
>> performs type coercion. Currently, we support 3 policies for the store
>> assignment rules: ANSI, legacy and strict, which can be set via the option
>> "spark.sql.storeAssignmentPolicy":
>> 1. ANSI: Spark performs the store assignment as per ANSI SQL. In
>> practice, the behavior is mostly the same as PostgreSQL. It disallows
>> certain unreasonable type conversions, such as converting `string` to `int`
>> and `double` to `boolean`. It throws a runtime exception if the value
>> is out of range (overflow).
>> 2. Legacy: Spark allows the store assignment as long as it is a valid
>> `Cast`, which is very loose. E.g., converting either `string` to `int` or
>> `double` to `boolean` is allowed. It is the current behavior in Spark 2.x
>> for compatibility with Hive. When inserting an out-of-range value into an
>> integral field, the low-order bits of the value are inserted (the same as
>> Java/Scala numeric type casting). For example, if 257 is inserted into a
>> field of Byte type, the result is 1.
>> 3. Strict: Spark doesn't allow any possible precision loss or data
>> truncation in store assignment, e.g., converting either `double` to `int`
>> or `decimal` to `double` is not allowed. The rules were originally designed
>> for the Dataset encoder. As far as I know, no mainstream DBMS uses this
>> policy by default.
>>
>> Currently, V1 data sources use the "Legacy" policy by default, while V2
>> uses "Strict". This proposal is to use the "ANSI" policy by default for
>> both V1 and V2 in Spark 3.0.
>>
>> This vote is open until Friday (Oct. 11).
>>
>> [ ] +1: Accept the proposal
>> [ ] +0
>> [ ] -1: I don't think this is a good idea because ...
>>
>> Thank you!
>>
>> Gengliang
>>
>
>
> --
> ---
> Takeshi Yamamuro
>
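
For anyone who wants to try the three policies side by side, below is a minimal
Scala sketch. Only "spark.sql.storeAssignmentPolicy" and its ANSI/LEGACY/STRICT
values come from the proposal above; the table name, column, and local-mode
session are hypothetical, and the comments restate the behavior described in the
proposal rather than guaranteeing output on any particular build.

import org.apache.spark.sql.SparkSession

object StoreAssignmentPolicyDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("store-assignment-policy-demo")
      .getOrCreate()

    // Hypothetical target table with a single BYTE (TINYINT) column.
    spark.sql("CREATE TABLE t (b BYTE) USING parquet")

    // Legacy (Spark 2.x behavior for V1 sources): any valid Cast is accepted.
    // Per the proposal, an out-of-range value keeps only its low-order bits,
    // so inserting 257 into a Byte column yields 1.
    spark.conf.set("spark.sql.storeAssignmentPolicy", "LEGACY")
    spark.sql("INSERT INTO t VALUES (257)")

    // ANSI (proposed default): the insert passes analysis, but an
    // out-of-range value raises a runtime exception instead of wrapping.
    spark.conf.set("spark.sql.storeAssignmentPolicy", "ANSI")
    // spark.sql("INSERT INTO t VALUES (257)")  // expected to fail at runtime

    // Strict (current V2 default): any potential precision loss or
    // truncation (e.g. INT -> BYTE) is rejected at analysis time.
    spark.conf.set("spark.sql.storeAssignmentPolicy", "STRICT")
    // spark.sql("INSERT INTO t VALUES (257)")  // expected to fail analysis

    spark.sql("SELECT * FROM t").show()
    spark.stop()
  }
}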