[jira] [Commented] (SPARK-28067) Incorrect results in decimal aggregation with whole-stage code gen enabled
[ https://issues.apache.org/jira/browse/SPARK-28067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870438#comment-16870438 ] Mark Sirek commented on SPARK-28067:

[~mgaido] Here is the physical plan I'm getting. Maybe yours is different? I tried on master this time...

{code:java}
msirek@skylake16:~/IdeaProjects/spark$ git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean
msirek@skylake16:~/IdeaProjects/spark$ git log -5 --pretty=format:"%h%x09%an%x09%ad%x09%s"
870f972dcc  Yuming Wang  Sat Jun 22 09:15:07 2019 -0700  [SPARK-28104][SQL] Implement Spark's own GetColumnsOperation
5ad1053f3e  Bryan Cutler  Sat Jun 22 11:20:35 2019 +0900  [SPARK-28128][PYTHON][SQL] Pandas Grouped UDFs skip empty partitions
113f8c8d13  HyukjinKwon  Fri Jun 21 10:47:54 2019 -0700  [SPARK-28132][PYTHON] Update document type conversion for Pandas UDFs (pyarrow 0.13.0, pandas 0.24.2, Python 3.7)
9b9d81b821  HyukjinKwon  Fri Jun 21 10:27:18 2019 -0700  [SPARK-28131][PYTHON] Update document type conversion between Python data and SQL types in normal UDFs (Python 3.7)
54da3bbfb2  Yesheng Ma  Thu Jun 20 19:45:59 2019 -0700  [SPARK-28127][SQL] Micro optimization on TreeNode's mapChildren method
msirek@skylake16:~/IdeaProjects/spark$ ./bin/spark-shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/msirek/IdeaProjects/spark/assembly/target/scala-2.12/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.16.1-1.cdh5.16.1.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/06/22 22:13:39 WARN util.Utils: Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
Spark context Web UI available at http://skylake16.home.colo:4041
Spark context available as 'sc' (master = local[*], app id = local-1561266819220).
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.0.0-SNAPSHOT
      /_/

Using Scala version 2.12.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_201)
Type in expressions to have them evaluated.
Type :help for more information.

scala> val df = Seq(
     |   (BigDecimal("1000"), 1),
     |   (BigDecimal("1000"), 1),
     |   (BigDecimal("1000"), 2),
     |   (BigDecimal("1000"), 2),
     |   (BigDecimal("1000"), 2),
     |   (BigDecimal("1000"), 2),
     |   (BigDecimal("1000"), 2),
     |   (BigDecimal("1000"), 2),
     |   (BigDecimal("1000"), 2),
     |   (BigDecimal("1000"), 2),
     |   (BigDecimal("1000"), 2),
     |   (BigDecimal("1000"), 2)).toDF("decNum", "intNum")
df: org.apache.spark.sql.DataFrame = [decNum: decimal(38,18), intNum: int]

scala> val df2 = df.withColumnRenamed("decNum", "decNum2").join(df, "intNum").agg(sum("decNum"))
df2: org.apache.spark.sql.DataFrame = [sum(decNum): decimal(38,18)]

scala> df2.explain
== Physical Plan ==
*(2) HashAggregate(keys=[], functions=[sum(decNum#14)])
+- Exchange SinglePartition
   +- *(1) HashAggregate(keys=[], functions=[partial_sum(decNum#14)])
      +- *(1) Project [decNum#14]
         +- *(1) BroadcastHashJoin [intNum#8], [intNum#15], Inner, BuildLeft
            :- BroadcastExchange HashedRelationBroadcastMode(List(cast(input[0, int, false] as bigint)))
            :  +- LocalTableScan [intNum#8]
            +- LocalTableScan [decNum#14, intNum#15]

scala> df2.show(40,false)
+---+
|sum(decNum) |
+---+
|4000.00|
+---+
{code}

> Incorrect results in decimal aggregation with whole-stage code gen enabled
> --
>
> Key: SPARK-28067
> URL: https://issues.apache.org/jira/browse/SPARK-28067
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.3.0, 2.4.0
> Environment: Ubuntu LTS 16.04
> Oracle Java 1.8.0_201
> spark-2.4.3-bin-without-hadoop
> spark-shell
> Reporter: Mark Sirek
> Priority: Minor
> Labels: correctness
>
> The following test case involving a join followed by a sum aggregation
> returns the wrong answer for the sum:
>
> {code:java}
> val df = Seq(
> (BigDecimal("1000"), 1),
> (BigDecimal("1000"), 1),
> (BigDecimal("1000"), 2),
>
[jira] [Updated] (SPARK-28141) Timestamp/Date type can not accept special values
[ https://issues.apache.org/jira/browse/SPARK-28141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang updated SPARK-28141:

Description:
||Input String||Valid Types||Description||
|{{epoch}}|{{date}}, {{timestamp}}|1970-01-01 00:00:00+00 (Unix system time zero)|
|{{infinity}}|{{date}}, {{timestamp}}|later than all other time stamps|
|{{-infinity}}|{{date}}, {{timestamp}}|earlier than all other time stamps|
|{{now}}|{{date}}, {{time}}, {{timestamp}}|current transaction's start time|
|{{today}}|{{date}}, {{timestamp}}|midnight today|
|{{tomorrow}}|{{date}}, {{timestamp}}|midnight tomorrow|
|{{yesterday}}|{{date}}, {{timestamp}}|midnight yesterday|
|{{allballs}}|{{time}}|00:00:00.00 UTC|

https://www.postgresql.org/docs/12/datatype-datetime.html

was:
||nput String||Valid Types||Description||
|{{epoch}}|{{date}}, {{timestamp}}|1970-01-01 00:00:00+00 (Unix system time zero)|
|{{infinity}}|{{date}}, {{timestamp}}|later than all other time stamps|
|{{-infinity}}|{{date}}, {{timestamp}}|earlier than all other time stamps|
|{{now}}|{{date}}, {{time}}, {{timestamp}}|current transaction's start time|
|{{today}}|{{date}}, {{timestamp}}|midnight today|
|{{tomorrow}}|{{date}}, {{timestamp}}|midnight tomorrow|
|{{yesterday}}|{{date}}, {{timestamp}}|midnight yesterday|
|{{allballs}}|{{time}}|00:00:00.00 UTC|

https://www.postgresql.org/docs/12/datatype-datetime.html

> Timestamp/Date type can not accept special values
> -
>
> Key: SPARK-28141
> URL: https://issues.apache.org/jira/browse/SPARK-28141
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Yuming Wang
> Priority: Major
>
> ||Input String||Valid Types||Description||
> |{{epoch}}|{{date}}, {{timestamp}}|1970-01-01 00:00:00+00 (Unix system time zero)|
> |{{infinity}}|{{date}}, {{timestamp}}|later than all other time stamps|
> |{{-infinity}}|{{date}}, {{timestamp}}|earlier than all other time stamps|
> |{{now}}|{{date}}, {{time}}, {{timestamp}}|current transaction's start time|
> |{{today}}|{{date}}, {{timestamp}}|midnight today|
> |{{tomorrow}}|{{date}}, {{timestamp}}|midnight tomorrow|
> |{{yesterday}}|{{date}}, {{timestamp}}|midnight yesterday|
> |{{allballs}}|{{time}}|00:00:00.00 UTC|
> https://www.postgresql.org/docs/12/datatype-datetime.html

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
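For reference, the semantics in the table above can be sketched in a few lines of Python. This is an illustrative analogy only — the function name and the `date.max`/`date.min` stand-ins for `infinity`/`-infinity` are assumptions, not Spark or PostgreSQL code:

```python
from datetime import date, datetime, timedelta, timezone

def parse_special_date(s, today=None):
    """Resolve PostgreSQL-style special date strings relative to 'today'."""
    today = today or datetime.now(timezone.utc).date()
    specials = {
        "epoch": date(1970, 1, 1),              # Unix system time zero
        "now": today,                           # transaction start, truncated to a date
        "today": today,                         # midnight today
        "tomorrow": today + timedelta(days=1),  # midnight tomorrow
        "yesterday": today - timedelta(days=1), # midnight yesterday
    }
    key = s.strip().lower()  # matching is case-insensitive, whitespace ignored
    # datetime.date has no infinities; use date.max/date.min as stand-ins
    if key == "infinity":
        return date.max
    if key == "-infinity":
        return date.min
    if key in specials:
        return specials[key]
    raise ValueError(f"unknown special value: {s!r}")
```

A cast implementing the table would run this before falling back to ordinary date parsing.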
[jira] [Resolved] (SPARK-28060) Float/Double type can not accept some special inputs
[ https://issues.apache.org/jira/browse/SPARK-28060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yuming Wang resolved SPARK-28060. - Resolution: Duplicate > Float/Double type can not accept some special inputs > > > Key: SPARK-28060 > URL: https://issues.apache.org/jira/browse/SPARK-28060 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Priority: Major > > ||Query||Spark SQL||PostgreSQL|| > |SELECT float('nan');|NULL|NaN| > |SELECT float(' NAN ');|NULL|NaN| > |SELECT float('infinity');|NULL|Infinity| > |SELECT float(' -INFINiTY ');|NULL|-Infinity| > ||Query||Spark SQL||PostgreSQL|| > |SELECT double('nan');|NULL|NaN| > |SELECT double(' NAN ');|NULL|NaN| > |SELECT double('infinity');|NULL|Infinity| > |SELECT double(' -INFINiTY ');|NULL|-Infinity| -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
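The PostgreSQL column in the table above matches what Python's built-in `float()` already does: the special spellings are case-insensitive and surrounding whitespace is ignored. A small sketch (the `parse_float` helper is hypothetical, shown only to mirror the table, not Spark's cast):

```python
import math

def parse_float(s):
    """Parse a float the way the PostgreSQL column above expects."""
    try:
        # float() accepts 'nan'/'infinity' case-insensitively and strips whitespace
        return float(s)
    except ValueError:
        return None  # mimic Spark's NULL for genuinely invalid input
```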
[jira] [Commented] (SPARK-28060) Float/Double type can not accept some special inputs
[ https://issues.apache.org/jira/browse/SPARK-28060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870423#comment-16870423 ] Yuming Wang commented on SPARK-28060: - OK. Thank you [~mgaido] > Float/Double type can not accept some special inputs > > > Key: SPARK-28060 > URL: https://issues.apache.org/jira/browse/SPARK-28060 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Priority: Major > > ||Query||Spark SQL||PostgreSQL|| > |SELECT float('nan');|NULL|NaN| > |SELECT float(' NAN ');|NULL|NaN| > |SELECT float('infinity');|NULL|Infinity| > |SELECT float(' -INFINiTY ');|NULL|-Infinity| > ||Query||Spark SQL||PostgreSQL|| > |SELECT double('nan');|NULL|NaN| > |SELECT double(' NAN ');|NULL|NaN| > |SELECT double('infinity');|NULL|Infinity| > |SELECT double(' -INFINiTY ');|NULL|-Infinity| -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-28139) DataSourceV2: Add AlterTable v2 implementation
[ https://issues.apache.org/jira/browse/SPARK-28139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28139: Assignee: (was: Apache Spark) > DataSourceV2: Add AlterTable v2 implementation > -- > > Key: SPARK-28139 > URL: https://issues.apache.org/jira/browse/SPARK-28139 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: Ryan Blue >Priority: Major > > SPARK-27857 updated the parser for v2 ALTER TABLE statements. This tracks > implementing those using a v2 catalog. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-28139) DataSourceV2: Add AlterTable v2 implementation
[ https://issues.apache.org/jira/browse/SPARK-28139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-28139: Assignee: Apache Spark > DataSourceV2: Add AlterTable v2 implementation > -- > > Key: SPARK-28139 > URL: https://issues.apache.org/jira/browse/SPARK-28139 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: Ryan Blue >Assignee: Apache Spark >Priority: Major > > SPARK-27857 updated the parser for v2 ALTER TABLE statements. This tracks > implementing those using a v2 catalog. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28114) Add Jenkins job for `Hadoop-3.2` profile
[ https://issues.apache.org/jira/browse/SPARK-28114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870381#comment-16870381 ] Yuming Wang commented on SPARK-28114: - Hi, [~shaneknapp]. For the failure of [https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2/2] and [https://amplab.cs.berkeley.edu/jenkins/job/spark-master-test-maven-hadoop-3.2/3], we need to set locale to {{en_US.UTF-8}}. Please see SPARK-27177 for more details. > Add Jenkins job for `Hadoop-3.2` profile > > > Key: SPARK-28114 > URL: https://issues.apache.org/jira/browse/SPARK-28114 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: shane knapp >Priority: Major > > Spark 3.0 is a major version change. We want to have the following new Jobs. > 1. SBT with hadoop-3.2 > 2. Maven with hadoop-3.2 (on JDK8 and JDK11) > Also, shall we have a limit for the concurrent run for the following existing > job? Currently, it invokes multiple jobs concurrently. We can save the > resource by limiting to 1 like the other jobs. > - > https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7-jdk-11-ubuntu-testing > We will drop four `branch-2.3` jobs at the end of August, 2019. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28135) ceil/ceiling/floor/power returns incorrect values
[ https://issues.apache.org/jira/browse/SPARK-28135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870360#comment-16870360 ] Marco Gaido commented on SPARK-28135:

[~Tonix517] Tickets are assigned only once the PR is merged and the ticket is closed. So please go ahead and submit the PR; the committer who eventually merges it will assign the ticket to you. Thanks.

> ceil/ceiling/floor/power returns incorrect values
> -
>
> Key: SPARK-28135
> URL: https://issues.apache.org/jira/browse/SPARK-28135
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Yuming Wang
> Priority: Major
>
> {noformat}
> spark-sql> select ceil(double(1.2345678901234e+200)), ceiling(double(1.2345678901234e+200)), floor(double(1.2345678901234e+200)), power('1', 'NaN');
> 9223372036854775807 9223372036854775807 9223372036854775807 NaN
> {noformat}
> {noformat}
> postgres=# select ceil(1.2345678901234e+200::float8), ceiling(1.2345678901234e+200::float8), floor(1.2345678901234e+200::float8), power('1', 'NaN');
> ceil | ceiling | floor | power
> --+--+--+---
> 1.2345678901234e+200 | 1.2345678901234e+200 | 1.2345678901234e+200 | 1
> (1 row)
> {noformat}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
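The two wrong results in the queries above have different causes, which a short Python sketch can separate. The variable names are illustrative, and the saturation line mimics (not reproduces) Spark's cast to LongType:

```python
import math

LONG_MAX = 2**63 - 1  # upper bound of a 64-bit signed long (Spark's LongType)

x = 1.2345678901234e+200

# ceil/floor return LongType in Spark, so any double beyond 2**63-1 saturates
# to Long.MaxValue; PostgreSQL keeps the result in double precision instead.
spark_like_ceil = min(math.ceil(x), LONG_MAX)
postgres_like_ceil = float(math.ceil(x))  # stays 1.2345678901234e+200

# IEEE 754 / C99 define pow(1, NaN) == 1, which PostgreSQL (and CPython)
# follow; Java's Math.pow(1.0, NaN) returns NaN, which is what Spark reports.
postgres_like_power = math.pow(1.0, float("nan"))
```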
[jira] [Commented] (SPARK-28114) Add Jenkins job for `Hadoop-3.2` profile
[ https://issues.apache.org/jira/browse/SPARK-28114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870325#comment-16870325 ] shane knapp commented on SPARK-28114: - the SBT and maven compile builds are green, but the maven tests builds (both jdks) are failing. the hadoop-2.7 maven test build is green... if we're still seeing failures that don't correlate across hadoop versions on monday, i'll dig a little deeper. > Add Jenkins job for `Hadoop-3.2` profile > > > Key: SPARK-28114 > URL: https://issues.apache.org/jira/browse/SPARK-28114 > Project: Spark > Issue Type: Improvement > Components: Project Infra >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: shane knapp >Priority: Major > > Spark 3.0 is a major version change. We want to have the following new Jobs. > 1. SBT with hadoop-3.2 > 2. Maven with hadoop-3.2 (on JDK8 and JDK11) > Also, shall we have a limit for the concurrent run for the following existing > job? Currently, it invokes multiple jobs concurrently. We can save the > resource by limiting to 1 like the other jobs. > - > https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-maven-hadoop-2.7-jdk-11-ubuntu-testing > We will drop four `branch-2.3` jobs at the end of August, 2019. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28135) ceil/ceiling/floor/power returns incorrect values
[ https://issues.apache.org/jira/browse/SPARK-28135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870319#comment-16870319 ] Tony Zhang commented on SPARK-28135: [~yumwang] Ok on my way. BTW how can I assign the ticket to myself? > ceil/ceiling/floor/power returns incorrect values > - > > Key: SPARK-28135 > URL: https://issues.apache.org/jira/browse/SPARK-28135 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Priority: Major > > {noformat} > spark-sql> select ceil(double(1.2345678901234e+200)), > ceiling(double(1.2345678901234e+200)), floor(double(1.2345678901234e+200)), > power('1', 'NaN'); > 9223372036854775807 9223372036854775807 9223372036854775807 NaN > {noformat} > {noformat} > postgres=# select ceil(1.2345678901234e+200::float8), > ceiling(1.2345678901234e+200::float8), floor(1.2345678901234e+200::float8), > power('1', 'NaN'); > ceil | ceiling|floor | power > --+--+--+--- > 1.2345678901234e+200 | 1.2345678901234e+200 | 1.2345678901234e+200 | 1 > (1 row) > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-28104) Implement Spark's own GetColumnsOperation
[ https://issues.apache.org/jira/browse/SPARK-28104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-28104. - Resolution: Fixed Fix Version/s: 3.0.0 > Implement Spark's own GetColumnsOperation > - > > Key: SPARK-28104 > URL: https://issues.apache.org/jira/browse/SPARK-28104 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Assignee: Yuming Wang >Priority: Major > Fix For: 3.0.0 > > > SPARK-24196 and SPARK-24570 implemented Spark's own {{GetSchemasOperation}} > and {{GetTablesOperation}}. We also need implement Spark's own > {{GetColumnsOperation}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-28104) Implement Spark's own GetColumnsOperation
[ https://issues.apache.org/jira/browse/SPARK-28104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-28104: --- Assignee: Yuming Wang > Implement Spark's own GetColumnsOperation > - > > Key: SPARK-28104 > URL: https://issues.apache.org/jira/browse/SPARK-28104 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Assignee: Yuming Wang >Priority: Major > > SPARK-24196 and SPARK-24570 implemented Spark's own {{GetSchemasOperation}} > and {{GetTablesOperation}}. We also need implement Spark's own > {{GetColumnsOperation}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28067) Incorrect results in decimal aggregation with whole-stage code gen enabled
[ https://issues.apache.org/jira/browse/SPARK-28067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870293#comment-16870293 ] Marco Gaido commented on SPARK-28067:

I cannot reproduce on master. It always returns null with whole-stage codegen enabled.

> Incorrect results in decimal aggregation with whole-stage code gen enabled
> --
>
> Key: SPARK-28067
> URL: https://issues.apache.org/jira/browse/SPARK-28067
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.3.0, 2.4.0
> Environment: Ubuntu LTS 16.04
> Oracle Java 1.8.0_201
> spark-2.4.3-bin-without-hadoop
> spark-shell
> Reporter: Mark Sirek
> Priority: Minor
> Labels: correctness
>
> The following test case involving a join followed by a sum aggregation
> returns the wrong answer for the sum:
>
> {code:java}
> val df = Seq(
> (BigDecimal("1000"), 1),
> (BigDecimal("1000"), 1),
> (BigDecimal("1000"), 2),
> (BigDecimal("1000"), 2),
> (BigDecimal("1000"), 2),
> (BigDecimal("1000"), 2),
> (BigDecimal("1000"), 2),
> (BigDecimal("1000"), 2),
> (BigDecimal("1000"), 2),
> (BigDecimal("1000"), 2),
> (BigDecimal("1000"), 2),
> (BigDecimal("1000"), 2)).toDF("decNum", "intNum")
> val df2 = df.withColumnRenamed("decNum", "decNum2").join(df, "intNum").agg(sum("decNum"))
> scala> df2.show(40,false)
> ---
> sum(decNum)
> ---
> 4000.00
> ---
>
> {code}
>
> The result should be 104000..
> It appears a partial sum is computed for each join key, as the result
> returned would be the answer for all rows matching intNum === 1.
> If only the rows with intNum === 2 are included, the answer given is null:
>
> {code:java}
> scala> val df3 = df.filter($"intNum" === lit(2))
> df3: org.apache.spark.sql.Dataset[org.apache.spark.sql.Row] = [decNum: decimal(38,18), intNum: int]
> scala> val df4 = df3.withColumnRenamed("decNum", "decNum2").join(df3, "intNum").agg(sum("decNum"))
> df4: org.apache.spark.sql.DataFrame = [sum(decNum): decimal(38,18)]
> scala> df4.show(40,false)
> ---
> sum(decNum)
> ---
> null
> ---
>
> {code}
>
> The correct answer, 10., doesn't fit in
> the DataType picked for the result, decimal(38,18), so an overflow occurs,
> which Spark then converts to null.
> The first example, which doesn't filter out the intNum === 1 values, should
> also return null, indicating overflow, but it doesn't. This may mislead the
> user to think a valid sum was computed.
> If whole-stage code gen is turned off:
> spark.conf.set("spark.sql.codegen.wholeStage", false)
> ... incorrect results are not returned because the overflow is caught as an
> exception:
> java.lang.IllegalArgumentException: requirement failed: Decimal precision 39
> exceeds max precision 38

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
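The overflow mechanics described in the report can be mimicked with Python's `decimal` module. This is an analogy only, none of it is Spark code: a checked context detects that the sum needs precision 39, like the non-codegen path's "Decimal precision 39 exceeds max precision 38" error, while an unchecked context would silently round, like the codegen path losing the overflow:

```python
from decimal import Decimal, Context, Inexact

# A decimal(38,18) slot has at most 20 integer digits and 18 fractional
# digits. Adding two values at that limit needs 39 significant digits,
# one more than the precision allows.
checked = Context(prec=38, traps=[Inexact])

big = Decimal("9" * 20 + "." + "9" * 18)  # widest decimal(38,18) value

try:
    checked.add(big, big)  # exact result needs 39 significant digits
    overflow_detected = False
except Inexact:
    overflow_detected = True  # the checked path reports the overflow
```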
[jira] [Commented] (SPARK-28060) Float/Double type can not accept some special inputs
[ https://issues.apache.org/jira/browse/SPARK-28060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870288#comment-16870288 ] Marco Gaido commented on SPARK-28060: - This is a duplicate of SPARK-27768, isn't it? Or better, SPARK-27768 is a subpart of this? Anyway, shall we close either this one or SPARK-27768? > Float/Double type can not accept some special inputs > > > Key: SPARK-28060 > URL: https://issues.apache.org/jira/browse/SPARK-28060 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Priority: Major > > ||Query||Spark SQL||PostgreSQL|| > |SELECT float('nan');|NULL|NaN| > |SELECT float(' NAN ');|NULL|NaN| > |SELECT float('infinity');|NULL|Infinity| > |SELECT float(' -INFINiTY ');|NULL|-Infinity| > ||Query||Spark SQL||PostgreSQL|| > |SELECT double('nan');|NULL|NaN| > |SELECT double(' NAN ');|NULL|NaN| > |SELECT double('infinity');|NULL|Infinity| > |SELECT double(' -INFINiTY ');|NULL|-Infinity| -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-27820) case insensitive resolver should be used in GetMapValue
[ https://issues.apache.org/jira/browse/SPARK-27820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870286#comment-16870286 ] Marco Gaido commented on SPARK-27820:

+1 for [~hyukjin.kwon]'s comment.

> case insensitive resolver should be used in GetMapValue
> ---
>
> Key: SPARK-27820
> URL: https://issues.apache.org/jira/browse/SPARK-27820
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.1
> Reporter: Michel Lemay
> Priority: Minor
>
> When extracting a key value from a MapType, Spark calls GetMapValue
> (complexTypeExtractors.scala) and only uses the map key type's ordering. It
> should use the resolver instead.
> Starting spark with: {{spark-shell --conf spark.sql.caseSensitive=false}}
> Given the dataframe:
> {{val df = List(Map("a" -> 1), Map("A" -> 2)).toDF("m")}}
> Executing any of these returns only one row: the column name is matched
> case-insensitively, but the map keys are matched case-sensitively.
> {{df.filter($"M.A".isNotNull).count}}
> {{df.filter($"M"("A").isNotNull).count}}
> {{df.filter($"M".getField("A").isNotNull).count}}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
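A resolver-based key lookup like the one the report asks for can be sketched in Python. The helper names are hypothetical, not Spark's actual GetMapValue: with a case-insensitive resolver, both the `Map("a" -> 1)` and `Map("A" -> 2)` rows match the key `"A"`:

```python
def resolver(a, b, case_sensitive=False):
    """Name-matching rule, analogous to Spark's analyzer resolver."""
    return a == b if case_sensitive else a.lower() == b.lower()

def get_map_value(m, key, case_sensitive=False):
    """Look up 'key' in dict 'm' using the resolver instead of plain equality."""
    for k, v in m.items():
        if resolver(k, key, case_sensitive):
            return v
    return None  # a missing key yields NULL

# The two rows from the report, as plain dicts
rows = [{"a": 1}, {"A": 2}]
insensitive_hits = [r for r in rows if get_map_value(r, "A") is not None]
sensitive_hits = [r for r in rows if get_map_value(r, "A", case_sensitive=True) is not None]
```

With `case_sensitive=False` both rows are counted; with the current behavior only the exact-case key matches.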
[jira] [Commented] (SPARK-28135) ceil/ceiling/floor/power returns incorrect values
[ https://issues.apache.org/jira/browse/SPARK-28135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870106#comment-16870106 ] Yuming Wang commented on SPARK-28135:

Thank you, [~Tonix517]. Could you submit a pull request to fix this issue?

> ceil/ceiling/floor/power returns incorrect values
> -
>
> Key: SPARK-28135
> URL: https://issues.apache.org/jira/browse/SPARK-28135
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Yuming Wang
> Priority: Major
>
> {noformat}
> spark-sql> select ceil(double(1.2345678901234e+200)), ceiling(double(1.2345678901234e+200)), floor(double(1.2345678901234e+200)), power('1', 'NaN');
> 9223372036854775807 9223372036854775807 9223372036854775807 NaN
> {noformat}
> {noformat}
> postgres=# select ceil(1.2345678901234e+200::float8), ceiling(1.2345678901234e+200::float8), floor(1.2345678901234e+200::float8), power('1', 'NaN');
> ceil | ceiling | floor | power
> --+--+--+---
> 1.2345678901234e+200 | 1.2345678901234e+200 | 1.2345678901234e+200 | 1
> (1 row)
> {noformat}

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org