[jira] [Created] (SPARK-36082) when the right side is small enough to use SingleColumn Null Aware Anti Join
mcdull_zhang created SPARK-36082:
------------------------------------

Summary: when the right side is small enough to use SingleColumn Null Aware Anti Join
Key: SPARK-36082
URL: https://issues.apache.org/jira/browse/SPARK-36082
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.2.0, 3.1.3
Reporter: mcdull_zhang
Fix For: 3.2.0

NULL-aware ANTI join (https://issues.apache.org/jira/browse/SPARK-32290) builds the right side into a HashMap. Code in SparkStrategies:

{code:java}
case j @ ExtractSingleColumnNullAwareAntiJoin(leftKeys, rightKeys) =>
  Seq(joins.BroadcastHashJoinExec(leftKeys, rightKeys, LeftAnti, BuildRight,
    None, planLater(j.left), planLater(j.right), isNullAwareAntiJoin = true))
{code}

We should add a size condition and use this optimization whenever the right side is small enough.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
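For context, the semantics this plan must preserve can be modeled outside Spark. The function below is a Python sketch (not Spark code; the name `null_aware_anti_join` is mine) of a single-column `NOT IN` anti join under SQL three-valued logic, where `None` stands for NULL:

```python
def null_aware_anti_join(left_keys, right_keys):
    """Rows of `left_keys` satisfying `key NOT IN (right_keys)` under SQL
    three-valued logic, for a single join column (None stands for NULL)."""
    right = list(right_keys)
    if not right:
        # Empty build side: NOT IN over an empty set is true for every row.
        return list(left_keys)
    if any(k is None for k in right):
        # Any NULL on the build side makes NOT IN at best "unknown": no rows.
        return []
    right_set = set(right)
    # A NULL probe key also compares as "unknown" and is filtered out.
    return [k for k in left_keys if k is not None and k not in right_set]
```

These NULL special cases are exactly why the planner needs the dedicated `isNullAwareAntiJoin` path rather than an ordinary hash anti join.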
[jira] [Commented] (SPARK-36081) The implementation of cast doesn't comply with its specification
[ https://issues.apache.org/jira/browse/SPARK-36081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378522#comment-17378522 ] Apache Spark commented on SPARK-36081: -- User 'sarutak' has created a pull request for this issue: https://github.com/apache/spark/pull/33287

> The implementation of cast doesn't comply with its specification
> Key: SPARK-36081
> URL: https://issues.apache.org/jira/browse/SPARK-36081
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.3, 3.1.2, 3.2.0, 3.3.0
> Reporter: Kousuke Saruta
> Assignee: Kousuke Saruta
> Priority: Major
>
> sql-migration-guide.md mentions about the behavior of cast like as follows.
> {code}
> In Spark 3.0, when casting string value to integral types(tinyint, smallint, int and bigint), datetime types(date, timestamp and interval) and boolean type, the leading and trailing whitespaces (<= ASCII 32) will be trimmed before converted to these type values, for example, `cast(' 1\t' as int)` results `1`, `cast(' 1\t' as boolean)` results `true`, `cast('2019-10-10\t as date)` results the date value `2019-10-10`. In Spark version 2.4 and below, when casting string to integrals and booleans, it does not trim the whitespaces from both ends; the foregoing results is `null`, while to datetimes, only the trailing spaces (= ASCII 32) are removed.
> {code}
> In fact, select cast('2019-10-10\b' as date); returns 2019-10-10 with Spark 3.0.0.
> But after 3.0.1, the query returns NULL and this behavior doesn't comply with the specification.
> The root cause seems to be
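The trimming rule quoted above is simple to model. Below is a hypothetical Python sketch (`cast_string_to_date` is my name, not a Spark API) of the documented behavior: trim leading and trailing characters <= ASCII 32, then parse, returning None (SQL NULL) on failure:

```python
from datetime import date

# Characters the Spark 3.0 rule trims: everything from '\x00' through ' '.
_TRIM_CHARS = ''.join(chr(c) for c in range(0x21))

def cast_string_to_date(s):
    """Sketch of the documented rule: trim chars <= ASCII 32, then parse;
    None models a SQL NULL result on parse failure."""
    try:
        return date.fromisoformat(s.strip(_TRIM_CHARS))
    except ValueError:
        return None
```

Under this rule, `'2019-10-10\b'` (backspace, ASCII 8) should be trimmed and parse to 2019-10-10, which is why the NULL returned since 3.0.1 contradicts the specification.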
[jira] [Assigned] (SPARK-36081) The implementation of cast doesn't comply with its specification
[ https://issues.apache.org/jira/browse/SPARK-36081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36081: Assignee: Kousuke Saruta (was: Apache Spark)
[jira] [Updated] (SPARK-36081) The implementation of cast doesn't comply with its specification
[ https://issues.apache.org/jira/browse/SPARK-36081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-36081: --- Description: sql-migration-guide.md mentions about the behavior of cast like as follows. {code} In Spark 3.0, when casting string value to integral types(tinyint, smallint, int and bigint), datetime types(date, timestamp and interval) and boolean type, the leading and trailing whitespaces (<= ASCII 32) will be trimmed before converted to these type values, for example, `cast(' 1\t' as int)` results `1`, `cast(' 1\t' as boolean)` results `true`, `cast('2019-10-10\t as date)` results the date value `2019-10-10`. In Spark version 2.4 and below, when casting string to integrals and booleans, it does not trim the whitespaces from both ends; the foregoing results is `null`, while to datetimes, only the trailing spaces (= ASCII 32) are removed. {code} In fact, select cast('2019-10-10\b' as date); returns 2019-10-10 in Spark 3.0.0. But after 3.0.1, the query returns NULL and this behavior doesn't comply with the specification. was: sql-migration-guide.md mentions about the behavior of cast like as follows. {code} In Spark 3.0, when casting string value to integral types(tinyint, smallint, int and bigint), datetime types(date, timestamp and interval) and boolean type, the leading and trailing whitespaces (<= ASCII 32) will be trimmed before converted to these type values, for example, `cast(' 1\t' as int)` results `1`, `cast(' 1\t' as boolean)` results `true`, `cast('2019-10-10\t as date)` results the date value `2019-10-10`. In Spark version 2.4 and below, when casting string to integrals and booleans, it does not trim the whitespaces from both ends; the foregoing results is `null`, while to datetimes, only the trailing spaces (= ASCII 32) are removed. {code} In fact, select cast('2019-10-10\b' as date); returns 2019-10-10 with Spark 3.0.0. 
But after 3.0.1, the query returns NULL and this behavior doesn't comply with the specification. The root cause seems to be
[jira] [Commented] (SPARK-36081) The implementation of cast doesn't comply with its specification
[ https://issues.apache.org/jira/browse/SPARK-36081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378521#comment-17378521 ] Apache Spark commented on SPARK-36081: -- User 'sarutak' has created a pull request for this issue: https://github.com/apache/spark/pull/33287
[jira] [Assigned] (SPARK-36081) The implementation of cast doesn't comply with its specification
[ https://issues.apache.org/jira/browse/SPARK-36081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36081: Assignee: Apache Spark (was: Kousuke Saruta)
[jira] [Updated] (SPARK-36081) The implementation of cast doesn't comply with its specification
[ https://issues.apache.org/jira/browse/SPARK-36081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-36081: --- Description: sql-migration-guide.md mentions about the behavior of cast like as follows. {code} In Spark 3.0, when casting string value to integral types(tinyint, smallint, int and bigint), datetime types(date, timestamp and interval) and boolean type, the leading and trailing whitespaces (<= ASCII 32) will be trimmed before converted to these type values, for example, `cast(' 1\t' as int)` results `1`, `cast(' 1\t' as boolean)` results `true`, `cast('2019-10-10\t as date)` results the date value `2019-10-10`. In Spark version 2.4 and below, when casting string to integrals and booleans, it does not trim the whitespaces from both ends; the foregoing results is `null`, while to datetimes, only the trailing spaces (= ASCII 32) are removed. {code} In fact, select cast('2019-10-10\b' as date); returns 2019-10-10 with Spark 3.0.0. But after 3.0.1, the query returns NULL and this behavior doesn't comply with the specification. The root cause seems to be was: sql-migration-guide.md mentions about the behavior of cast like as follows. {code} In Spark 3.0, when casting string value to integral types(tinyint, smallint, int and bigint), datetime types(date, timestamp and interval) and boolean type, the leading and trailing whitespaces (<= ASCII 32) will be trimmed before converted to these type values, for example, `cast(' 1\t' as int)` results `1`, `cast(' 1\t' as boolean)` results `true`, `cast('2019-10-10\t as date)` results the date value `2019-10-10`. In Spark version 2.4 and below, when casting string to integrals and booleans, it does not trim the whitespaces from both ends; the foregoing results is `null`, while to datetimes, only the trailing spaces (= ASCII 32) are removed. {code} In fact, select cast('2019-10-10\b' as date); returns 2019-10-10 with Spark 3.0.0. 
But after 3.0.1, the query returns NULL and this behavior doesn't comply with the specification.
[jira] [Updated] (SPARK-36081) The implementation of cast doesn't comply with its specification
[ https://issues.apache.org/jira/browse/SPARK-36081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-36081: --- Component/s: (was: Spark Core)
[jira] [Created] (SPARK-36081) The implementation of cast doesn't comply with its specification
Kousuke Saruta created SPARK-36081:
--------------------------------------

Summary: The implementation of cast doesn't comply with its specification
Key: SPARK-36081
URL: https://issues.apache.org/jira/browse/SPARK-36081
Project: Spark
Issue Type: Bug
Components: Spark Core, SQL
Affects Versions: 3.1.2, 3.0.3, 3.2.0, 3.3.0
Reporter: Kousuke Saruta
Assignee: Kousuke Saruta

sql-migration-guide.md mentions about the behavior of cast like as follows.

{code}
In Spark 3.0, when casting string value to integral types(tinyint, smallint, int and bigint), datetime types(date, timestamp and interval) and boolean type, the leading and trailing whitespaces (<= ASCII 32) will be trimmed before converted to these type values, for example, `cast(' 1\t' as int)` results `1`, `cast(' 1\t' as boolean)` results `true`, `cast('2019-10-10\t as date)` results the date value `2019-10-10`. In Spark version 2.4 and below, when casting string to integrals and booleans, it does not trim the whitespaces from both ends; the foregoing results is `null`, while to datetimes, only the trailing spaces (= ASCII 32) are removed.
{code}

In fact, select cast('2019-10-10\b' as date); returns 2019-10-10 with Spark 3.0.0. But after 3.0.1, the query returns NULL and this behavior doesn't comply with the specification.
[jira] [Commented] (SPARK-35508) job group and description do not apply on broadcasts
[ https://issues.apache.org/jira/browse/SPARK-35508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378483#comment-17378483 ] Shockang commented on SPARK-35508: -- It seems that this bug comes from this PR: [https://github.com/apache/spark/pull/24595], which overrides the job group and job description set in user code. Let me fix this issue.

> job group and description do not apply on broadcasts
> Key: SPARK-35508
> URL: https://issues.apache.org/jira/browse/SPARK-35508
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.0.0, 3.1.0
> Reporter: Lior Chaga
> Priority: Minor
> Attachments: spark2-image.png, spark3-image.png
>
> Given the following code:
> {code:java}
> SparkContext context = new SparkContext("local", "test");
> SparkSession session = new SparkSession(context);
> List<String> strings = Lists.newArrayList("a", "b", "c");
> List<String> otherString = Lists.newArrayList("b", "c", "d");
> Dataset<Row> broadcastedDf = session.createDataset(strings, Encoders.STRING()).toDF();
> Dataset<Row> dataframe = session.createDataset(otherString, Encoders.STRING()).toDF();
> context.setJobGroup("my group", "my job", false);
> dataframe.join(broadcast(broadcastedDf), "value").count();
> {code}
> Job group and description do not apply to the broadcasted dataframe.
> With Spark 2.x, broadcast creation is given the same job description as the query itself. This seems to be broken with Spark 3.x.
> See attached images: !spark3-image.png! !spark2-image.png!
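A simplified model of why per-thread job properties can get lost: when properties live in plain thread-local storage and the broadcast is computed on a different thread, the worker thread never sees them. The sketch below uses Python's `threading.local` purely as an illustration (all names are mine); Spark's real propagation uses inheritable thread-locals and cloned property maps, and is more involved.

```python
import threading

# Plain thread-local state: visible only to the thread that set it.
_local = threading.local()

def set_job_group(group, description):
    _local.props = {'group': group, 'description': description}

def current_props():
    return getattr(_local, 'props', {})

def run_on_other_thread(out):
    """Imitates a broadcast being computed on a separate thread pool."""
    def worker():
        out['props'] = current_props()
    t = threading.Thread(target=worker)
    t.start()
    t.join()

set_job_group('my group', 'my job')
seen = {}
run_on_other_thread(seen)
# seen['props'] is {}: the caller's job group/description were lost.
```

The fix for such bugs is typically to capture the submitting thread's properties and hand them explicitly to whatever thread runs the broadcast job.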
[jira] [Commented] (SPARK-36080) Broadcast join outer join stream side
[ https://issues.apache.org/jira/browse/SPARK-36080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378463#comment-17378463 ] Apache Spark commented on SPARK-36080: -- User 'wangyum' has created a pull request for this issue: https://github.com/apache/spark/pull/33288

> Broadcast join outer join stream side
> Key: SPARK-36080
> URL: https://issues.apache.org/jira/browse/SPARK-36080
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: Yuming Wang
> Priority: Major
[jira] [Commented] (SPARK-36080) Broadcast join outer join stream side
[ https://issues.apache.org/jira/browse/SPARK-36080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17378462#comment-17378462 ] Apache Spark commented on SPARK-36080: -- User 'wangyum' has created a pull request for this issue: https://github.com/apache/spark/pull/33288
[jira] [Assigned] (SPARK-36080) Broadcast join outer join stream side
[ https://issues.apache.org/jira/browse/SPARK-36080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36080: Assignee: (was: Apache Spark)
[jira] [Assigned] (SPARK-36080) Broadcast join outer join stream side
[ https://issues.apache.org/jira/browse/SPARK-36080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-36080: Assignee: Apache Spark
[jira] [Assigned] (SPARK-36047) Replace the handwriting compare methods with static compare methods in Java code
[ https://issues.apache.org/jira/browse/SPARK-36047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen reassigned SPARK-36047: Assignee: Yang Jie

> Replace the handwriting compare methods with static compare methods in Java code
> Key: SPARK-36047
> URL: https://issues.apache.org/jira/browse/SPARK-36047
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core
> Affects Versions: 3.3.0
> Reporter: Yang Jie
> Assignee: Yang Jie
> Priority: Trivial
>
> There are some handwritten compare methods like `ShuffleInMemorySorter.SortComparator`:
> {code:java}
> private static final class SortComparator implements Comparator<PackedRecordPointer> {
>   @Override
>   public int compare(PackedRecordPointer left, PackedRecordPointer right) {
>     int leftId = left.getPartitionId();
>     int rightId = right.getPartitionId();
>     return Integer.compare(leftId, rightId);
>   }
> }
> {code}
> These handwritten compare methods can be replaced with `Integer.compare()` and similar static methods available since Java 1.7.
[jira] [Resolved] (SPARK-36047) Replace the handwriting compare methods with static compare methods in Java code
[ https://issues.apache.org/jira/browse/SPARK-36047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean R. Owen resolved SPARK-36047. -- Fix Version/s: 3.3.0 Resolution: Fixed Issue resolved by pull request 33260 [https://github.com/apache/spark/pull/33260]
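A common motivation for preferring the static compare methods is that hand-rolled subtraction comparators (`return left - right;`) overflow and report the wrong sign on extreme values. The Python sketch below (function names are mine) reproduces Java's 32-bit signed wraparound to show the bug, alongside a safe comparison equivalent to `Integer.compare`:

```python
def subtract_compare_32bit(left, right):
    """The classic buggy `return left - right;` comparator, with Java's
    32-bit signed wraparound reproduced in Python."""
    d = (left - right) & 0xFFFFFFFF
    return d - 0x100000000 if d >= 0x80000000 else d

def safe_compare(left, right):
    """Behaves like Java's Integer.compare(left, right): -1, 0, or 1."""
    return (left > right) - (left < right)
```

For `left = Integer.MIN_VALUE` and `right = 1`, the subtraction wraps around to a positive value, while the safe version correctly reports "less than".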
[jira] [Updated] (SPARK-36073) EquivalentExpressions fixes and improvements
[ https://issues.apache.org/jira/browse/SPARK-36073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Toth updated SPARK-36073:
---
Description:
Currently `EquivalentExpressions` has 2 issues:
- identifying common expressions in conditional expressions is not correct in all cases
- transparently canonicalized expressions (like `PromotePrecision`) are considered common subexpressions

was:
Fixes an issue with identifying common expressions in conditional expressions (a side effect of the above). Fixes the issue of transparently canonicalized expressions (like PromotePrecision) are considered common subexpressions.

> EquivalentExpressions fixes and improvements
> Key: SPARK-36073
> URL: https://issues.apache.org/jira/browse/SPARK-36073
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: Peter Toth
> Priority: Minor
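The first issue, detecting common expressions under conditionals, can be sketched with a toy expression tree: a subexpression inside a conditional only counts toward elimination when it is guaranteed to be evaluated, i.e. it is in the predicate or present in every branch. This is a hypothetical Python model (all names are mine), not the `EquivalentExpressions` code:

```python
# Expressions are nested tuples such as ('add', x, y) or
# ('if', cond, then, else); plain strings like 'a' are leaves.

def collect(expr, counts):
    """Count subexpressions that are certain to be evaluated."""
    if not isinstance(expr, tuple):
        return
    counts[expr] = counts.get(expr, 0) + 1
    if expr[0] == 'if':
        _, cond, then_e, else_e = expr
        collect(cond, counts)  # the predicate always runs
        then_counts, else_counts = {}, {}
        collect(then_e, then_counts)
        collect(else_e, else_counts)
        # A branch subexpression is only guaranteed to run when it
        # appears in *every* branch, so count the intersection once.
        for sub in then_counts:
            if sub in else_counts:
                counts[sub] = counts.get(sub, 0) + 1
    else:
        for child in expr[1:]:
            collect(child, counts)

def common_subexprs(expr):
    """Subexpressions that are safe to evaluate once and reuse."""
    counts = {}
    collect(expr, counts)
    return {e for e, n in counts.items() if n > 1}
```

Treating branch-only occurrences as unconditional would hoist expressions that might never run, which is the kind of incorrectness the ticket describes.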
[jira] [Updated] (SPARK-36073) EquivalentExpressions fixes and improvements
[ https://issues.apache.org/jira/browse/SPARK-36073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Toth updated SPARK-36073: --- Issue Type: Bug (was: Improvement)
[jira] [Updated] (SPARK-36073) EquivalentExpressions fixes and improvements
[ https://issues.apache.org/jira/browse/SPARK-36073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Toth updated SPARK-36073: --- Priority: Major (was: Minor)
[jira] [Updated] (SPARK-36073) EquivalentExpressions fixes and improvements
[ https://issues.apache.org/jira/browse/SPARK-36073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Toth updated SPARK-36073:
---
Description:
Fixes an issue with identifying common expressions in conditional expressions (a side effect of the above). Fixes the issue of transparently canonicalized expressions (like PromotePrecision) are considered common subexpressions.

was:
SPARK-35410 (https://github.com/apache/spark/commit/9e1b204bcce4a8fe24c1edd8271197277b5017f4#diff-4d8c210a38fc808fef3e5c966b438591f225daa3c9fd69359446b94c351aa11eR106-R112) filters out all child expressions, but in some cases that is not necessary.
[jira] [Updated] (SPARK-36073) EquivalentExpressions fixes and improvements
[ https://issues.apache.org/jira/browse/SPARK-36073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Toth updated SPARK-36073: --- Summary: EquivalentExpressions fixes and improvements (was: SubExpr elimination should include common child exprs of conditional expressions)