[jira] [Created] (SPARK-2535) Add StringComparison case to NullPropagation.
Takuya Ueshin created SPARK-2535: Summary: Add StringComparison case to NullPropagation. Key: SPARK-2535 URL: https://issues.apache.org/jira/browse/SPARK-2535 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin {{NullPropagation}} could handle {{StringComparison}} expressions whose operands include a {{null}} literal. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2535) Add StringComparison case to NullPropagation.
[ https://issues.apache.org/jira/browse/SPARK-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14064381#comment-14064381 ] Takuya Ueshin commented on SPARK-2535: PR: https://github.com/apache/spark/pull/1451
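Such a rule can be sketched with a small self-contained model (the expression classes below are hypothetical stand-ins, not Catalyst's actual ones):

```scala
// Hypothetical, simplified model of a null-propagation case for string
// comparisons; Catalyst's real expression classes and rule differ.
sealed trait Expr
case class Lit(value: Any) extends Expr
case class Contains(left: Expr, right: Expr) extends Expr

// If either operand of a string comparison is a null literal, the whole
// comparison can be constant-folded to a null literal.
def propagateNulls(e: Expr): Expr = e match {
  case Contains(Lit(null), _) => Lit(null)
  case Contains(_, Lit(null)) => Lit(null)
  case other                  => other
}
```

In Catalyst the same idea would presumably be one more pattern in the {{NullPropagation}} rule's match, folding the comparison to {{Literal(null, e.dataType)}}.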
[jira] [Created] (SPARK-2504) Fix nullability of Substring expression.
Takuya Ueshin created SPARK-2504: Summary: Fix nullability of Substring expression. Key: SPARK-2504 URL: https://issues.apache.org/jira/browse/SPARK-2504 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin This is a follow-up of [#1359|https://github.com/apache/spark/pull/1359] with nullability narrowing.
[jira] [Commented] (SPARK-2504) Fix nullability of Substring expression.
[ https://issues.apache.org/jira/browse/SPARK-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063028#comment-14063028 ] Takuya Ueshin commented on SPARK-2504: PRed: https://github.com/apache/spark/pull/1426
[jira] [Commented] (SPARK-2509) Add optimization for Substring.
[ https://issues.apache.org/jira/browse/SPARK-2509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14063084#comment-14063084 ] Takuya Ueshin commented on SPARK-2509: PRed: https://github.com/apache/spark/pull/1428 Add optimization for Substring. Key: SPARK-2509 URL: https://issues.apache.org/jira/browse/SPARK-2509 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin {{NullPropagation}} could handle {{Substring}} expressions whose operands include a {{null}} literal.
[jira] [Created] (SPARK-2446) Add BinaryType support to Parquet I/O.
Takuya Ueshin created SPARK-2446: Summary: Add BinaryType support to Parquet I/O. Key: SPARK-2446 URL: https://issues.apache.org/jira/browse/SPARK-2446 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin To support {{BinaryType}}, the following changes are needed: - Make {{StringType}} use {{OriginalType.UTF8}} - Add {{BinaryType}} using {{PrimitiveTypeName.BINARY}} without {{OriginalType}}
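The intended mapping can be sketched as follows, assuming parquet-mr's schema API of that era (the package names and the `CatalystType` stand-ins are assumptions, not taken from the Spark patch):

```scala
// Sketch only: parquet-mr enums are assumed available on the classpath.
import parquet.schema.OriginalType
import parquet.schema.PrimitiveType.PrimitiveTypeName

sealed trait CatalystType
case object StringTypeModel extends CatalystType // stand-in for StringType
case object BinaryTypeModel extends CatalystType // stand-in for BinaryType

// StringType -> BINARY annotated with UTF8; BinaryType -> plain BINARY
// with no OriginalType annotation.
def toParquet(t: CatalystType): (PrimitiveTypeName, Option[OriginalType]) =
  t match {
    case StringTypeModel => (PrimitiveTypeName.BINARY, Some(OriginalType.UTF8))
    case BinaryTypeModel => (PrimitiveTypeName.BINARY, None)
  }
```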
[jira] [Created] (SPARK-2428) Add except and intersect methods to SchemaRDD.
Takuya Ueshin created SPARK-2428: Summary: Add except and intersect methods to SchemaRDD. Key: SPARK-2428 URL: https://issues.apache.org/jira/browse/SPARK-2428 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin
[jira] [Commented] (SPARK-2431) Refine StringComparison and related codes.
[ https://issues.apache.org/jira/browse/SPARK-2431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057640#comment-14057640 ] Takuya Ueshin commented on SPARK-2431: PRed: https://github.com/apache/spark/pull/1357 Refine StringComparison and related code. Key: SPARK-2431 URL: https://issues.apache.org/jira/browse/SPARK-2431 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin Refine {{StringComparison}} and related code as follows: - {{StringComparison}} could be made similar to {{StringRegexExpression}} or {{CaseConversionExpression}}. - Nullability of {{StringRegexExpression}} could depend on its children's nullabilities. - Add a case to {{LikeSimplification}} for LIKE patterns that contain no wildcard.
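The no-wildcard case of the proposed {{LikeSimplification}} change can be illustrated with a self-contained sketch (the helper below is hypothetical, not Spark code):

```scala
// Hypothetical sketch: a LIKE pattern containing neither '%' nor '_' can
// be evaluated as plain string equality instead of a pattern match.
def matchesLike(s: String, pattern: String): Boolean =
  if (!pattern.exists(c => c == '%' || c == '_')) {
    s == pattern // no wildcard: an equality test is enough
  } else {
    // General case: translate the LIKE pattern into a regex.
    val regex = pattern.flatMap {
      case '%' => ".*"
      case '_' => "."
      case c   => java.util.regex.Pattern.quote(c.toString)
    }
    s.matches(regex)
  }
```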
[jira] [Created] (SPARK-2415) RowWriteSupport should handle empty ArrayType correctly.
Takuya Ueshin created SPARK-2415: Summary: RowWriteSupport should handle empty ArrayType correctly. Key: SPARK-2415 URL: https://issues.apache.org/jira/browse/SPARK-2415 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin {{RowWriteSupport}} doesn't write empty {{ArrayType}} value, so the read value becomes {{null}}. It should write empty {{ArrayType}} value as it is.
[jira] [Commented] (SPARK-2415) RowWriteSupport should handle empty ArrayType correctly.
[ https://issues.apache.org/jira/browse/SPARK-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14055983#comment-14055983 ] Takuya Ueshin commented on SPARK-2415: PRed: https://github.com/apache/spark/pull/1339
[jira] [Commented] (SPARK-2386) RowWriteSupport should use the exact types to cast.
[ https://issues.apache.org/jira/browse/SPARK-2386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053537#comment-14053537 ] Takuya Ueshin commented on SPARK-2386: PRed: https://github.com/apache/spark/pull/1315 RowWriteSupport should use the exact types to cast. Key: SPARK-2386 URL: https://issues.apache.org/jira/browse/SPARK-2386 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin When executing {{saveAsParquetFile}} with a non-primitive type, {{RowWriteSupport}} uses the wrong type {{Int}} for {{ByteType}} and {{ShortType}}.
[jira] [Created] (SPARK-2366) Add column pruning for the right side of LeftSemi join.
Takuya Ueshin created SPARK-2366: Summary: Add column pruning for the right side of LeftSemi join. Key: SPARK-2366 URL: https://issues.apache.org/jira/browse/SPARK-2366 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin The right side of a {{LeftSemi}} join needs only the columns used in the join condition.
[jira] [Commented] (SPARK-2366) Add column pruning for the right side of LeftSemi join.
[ https://issues.apache.org/jira/browse/SPARK-2366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052265#comment-14052265 ] Takuya Ueshin commented on SPARK-2366: PRed: https://github.com/apache/spark/pull/1301
[jira] [Commented] (SPARK-2327) Fix nullabilities of Join/Generate/Aggregate.
[ https://issues.apache.org/jira/browse/SPARK-2327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047495#comment-14047495 ] Takuya Ueshin commented on SPARK-2327: PRed: https://github.com/apache/spark/pull/1266 Fix nullabilities of Join/Generate/Aggregate. Key: SPARK-2327 URL: https://issues.apache.org/jira/browse/SPARK-2327 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin Fix the nullabilities of Join/Generate/Aggregate because: - Output attributes of the opposite side of an outer join should be nullable. - Output attributes of the generator side of {{Generate}} should be nullable if {{join}} is {{true}} and {{outer}} is {{true}}. - The {{AttributeReference}}s of {{computedAggregates}} of {{Aggregate}} should be the same as the {{aggregateExpression}}'s.
[jira] [Created] (SPARK-2328) Add execution of `SHOW TABLES` before `TestHive.reset()`.
Takuya Ueshin created SPARK-2328: Summary: Add execution of `SHOW TABLES` before `TestHive.reset()`. Key: SPARK-2328 URL: https://issues.apache.org/jira/browse/SPARK-2328 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin Unfortunately, when {{PruningSuite}} is executed first among the Hive tests, {{TestHive.reset()}} breaks the test environment. To prevent this, we must run a query before calling {{reset()}} for the first time.
[jira] [Commented] (SPARK-2328) Add execution of `SHOW TABLES` before `TestHive.reset()`.
[ https://issues.apache.org/jira/browse/SPARK-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14047546#comment-14047546 ] Takuya Ueshin commented on SPARK-2328: PRed: https://github.com/apache/spark/pull/1268
[jira] [Created] (SPARK-2287) Make ScalaReflection be able to handle Generic case classes.
Takuya Ueshin created SPARK-2287: Summary: Make ScalaReflection be able to handle Generic case classes. Key: SPARK-2287 URL: https://issues.apache.org/jira/browse/SPARK-2287 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin
[jira] [Commented] (SPARK-2287) Make ScalaReflection be able to handle Generic case classes.
[ https://issues.apache.org/jira/browse/SPARK-2287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044389#comment-14044389 ] Takuya Ueshin commented on SPARK-2287: PRed: https://github.com/apache/spark/pull/1226
[jira] [Created] (SPARK-2295) Make JavaBeans nullability stricter.
Takuya Ueshin created SPARK-2295: Summary: Make JavaBeans nullability stricter. Key: SPARK-2295 URL: https://issues.apache.org/jira/browse/SPARK-2295 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin
[jira] [Commented] (SPARK-2295) Make JavaBeans nullability stricter.
[ https://issues.apache.org/jira/browse/SPARK-2295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14044950#comment-14044950 ] Takuya Ueshin commented on SPARK-2295: PRed: https://github.com/apache/spark/pull/1235
[jira] [Created] (SPARK-2254) ScalaReflection should mark primitive types as non-nullable.
Takuya Ueshin created SPARK-2254: Summary: ScalaReflection should mark primitive types as non-nullable. Key: SPARK-2254 URL: https://issues.apache.org/jira/browse/SPARK-2254 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin
[jira] [Commented] (SPARK-2254) ScalaReflection should mark primitive types as non-nullable.
[ https://issues.apache.org/jira/browse/SPARK-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14041837#comment-14041837 ] Takuya Ueshin commented on SPARK-2254: PRed: https://github.com/apache/spark/pull/1193
[jira] [Created] (SPARK-2196) Fix nullability of CaseWhen.
Takuya Ueshin created SPARK-2196: Summary: Fix nullability of CaseWhen. Key: SPARK-2196 URL: https://issues.apache.org/jira/browse/SPARK-2196 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin {{CaseWhen}} should use {{branches.length}} to check if `elseValue` is provided or not.
[jira] [Updated] (SPARK-2196) Fix nullability of CaseWhen.
[ https://issues.apache.org/jira/browse/SPARK-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-2196: - Description: {{CaseWhen}} should use {{branches.length}} to check if {{elseValue}} is provided or not. (was: {{CaseWhen}} should use {{branches.length}} to check if `elseValue` is provided or not.)
[jira] [Commented] (SPARK-2196) Fix nullability of CaseWhen.
[ https://issues.apache.org/jira/browse/SPARK-2196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037189#comment-14037189 ] Takuya Ueshin commented on SPARK-2196: PRed: https://github.com/apache/spark/pull/1133
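The shape of the check can be sketched with a hypothetical simplified model (Catalyst's actual {{CaseWhen}} class differs):

```scala
case class SimpleExpr(nullable: Boolean)

// Hypothetical model: children are laid out as
// Seq(cond1, val1, cond2, val2, ..., optional elseValue),
// so an odd branches.length means an elseValue is present.
case class SimpleCaseWhen(branches: Seq[SimpleExpr]) {
  private def elseProvided: Boolean = branches.length % 2 == 1

  // The result values are the odd positions plus the trailing elseValue.
  private def values: Seq[SimpleExpr] =
    branches.zipWithIndex.collect { case (e, i) if i % 2 == 1 => e } ++
      (if (elseProvided) Seq(branches.last) else Nil)

  // Without an elseValue, a row matching no branch evaluates to null, so
  // the expression must be nullable regardless of the value expressions.
  def nullable: Boolean = !elseProvided || values.exists(_.nullable)
}
```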
[jira] [Created] (SPARK-2093) NullPropagation should use exact type value.
Takuya Ueshin created SPARK-2093: Summary: NullPropagation should use exact type value. Key: SPARK-2093 URL: https://issues.apache.org/jira/browse/SPARK-2093 Project: Spark Issue Type: Bug Reporter: Takuya Ueshin {{NullPropagation}} should use a value of the exact type when transforming {{Count}} or {{Sum}}.
[jira] [Commented] (SPARK-2093) NullPropagation should use exact type value.
[ https://issues.apache.org/jira/browse/SPARK-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026232#comment-14026232 ] Takuya Ueshin commented on SPARK-2093: PRed: https://github.com/apache/spark/pull/1034
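The point can be sketched with a hypothetical model: the literal that replaces a folded-away aggregate must carry the aggregate's own data type, not a generic {{Int}} zero:

```scala
sealed trait DataTypeModel
case object IntTypeModel  extends DataTypeModel
case object LongTypeModel extends DataTypeModel

case class TypedLiteral(value: Any, dataType: DataTypeModel)

// Hypothetical sketch: pick the zero value whose runtime class matches the
// aggregate's data type; a bare Int 0 where a Long is expected can surface
// later as a ClassCastException during evaluation.
def zeroLiteral(dt: DataTypeModel): TypedLiteral = dt match {
  case LongTypeModel => TypedLiteral(0L, LongTypeModel)
  case IntTypeModel  => TypedLiteral(0, IntTypeModel)
}
```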
[jira] [Created] (SPARK-2036) CaseConversionExpression should check if the evaluated value is null.
Takuya Ueshin created SPARK-2036: Summary: CaseConversionExpression should check if the evaluated value is null. Key: SPARK-2036 URL: https://issues.apache.org/jira/browse/SPARK-2036 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin {{CaseConversionExpression}} should check if the evaluated value is {{null}}.
[jira] [Commented] (SPARK-2036) CaseConversionExpression should check if the evaluated value is null.
[ https://issues.apache.org/jira/browse/SPARK-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018928#comment-14018928 ] Takuya Ueshin commented on SPARK-2036: PRed: https://github.com/apache/spark/pull/982
[jira] [Created] (SPARK-2052) Add optimization for CaseConversionExpression's.
Takuya Ueshin created SPARK-2052: Summary: Add optimization for CaseConversionExpression's. Key: SPARK-2052 URL: https://issues.apache.org/jira/browse/SPARK-2052 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin Add optimization for {{CaseConversionExpression}}'s.
[jira] [Commented] (SPARK-2052) Add optimization for CaseConversionExpression's.
[ https://issues.apache.org/jira/browse/SPARK-2052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019578#comment-14019578 ] Takuya Ueshin commented on SPARK-2052: PRed: https://github.com/apache/spark/pull/990
[jira] [Created] (SPARK-2029) Bump pom.xml version number of master branch to 1.1.0-SNAPSHOT.
Takuya Ueshin created SPARK-2029: Summary: Bump pom.xml version number of master branch to 1.1.0-SNAPSHOT. Key: SPARK-2029 URL: https://issues.apache.org/jira/browse/SPARK-2029 Project: Spark Issue Type: Bug Reporter: Takuya Ueshin Bump pom.xml version number of master branch to 1.1.0-SNAPSHOT.
[jira] [Commented] (SPARK-2029) Bump pom.xml version number of master branch to 1.1.0-SNAPSHOT.
[ https://issues.apache.org/jira/browse/SPARK-2029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018492#comment-14018492 ] Takuya Ueshin commented on SPARK-2029: PRed: https://github.com/apache/spark/pull/974
[jira] [Created] (SPARK-2030) Bump SparkBuild.scala version number of branch-1.0 to 1.0.1-SNAPSHOT.
Takuya Ueshin created SPARK-2030: Summary: Bump SparkBuild.scala version number of branch-1.0 to 1.0.1-SNAPSHOT. Key: SPARK-2030 URL: https://issues.apache.org/jira/browse/SPARK-2030 Project: Spark Issue Type: Bug Reporter: Takuya Ueshin Bump SparkBuild.scala version number of branch-1.0 to 1.0.1-SNAPSHOT.
[jira] [Commented] (SPARK-2030) Bump SparkBuild.scala version number of branch-1.0 to 1.0.1-SNAPSHOT.
[ https://issues.apache.org/jira/browse/SPARK-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018493#comment-14018493 ] Takuya Ueshin commented on SPARK-2030: PRed: https://github.com/apache/spark/pull/975
[jira] [Commented] (SPARK-1947) Child of SumDistinct or Average should be widened to prevent overflows the same as Sum.
[ https://issues.apache.org/jira/browse/SPARK-1947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14010945#comment-14010945 ] Takuya Ueshin commented on SPARK-1947: PRed: https://github.com/apache/spark/pull/902 Child of SumDistinct or Average should be widened to prevent overflows the same as Sum. Key: SPARK-1947 URL: https://issues.apache.org/jira/browse/SPARK-1947 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin The child of {{SumDistinct}} or {{Average}} should be widened to prevent overflow, the same as for {{Sum}}.
[jira] [Created] (SPARK-1938) ApproxCountDistinctMergeFunction should return Int value.
Takuya Ueshin created SPARK-1938: Summary: ApproxCountDistinctMergeFunction should return Int value. Key: SPARK-1938 URL: https://issues.apache.org/jira/browse/SPARK-1938 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin {{ApproxCountDistinctMergeFunction}} should return {{Int}} value because the {{dataType}} of {{ApproxCountDistinct}} is {{IntegerType}}.
[jira] [Commented] (SPARK-1938) ApproxCountDistinctMergeFunction should return Int value.
[ https://issues.apache.org/jira/browse/SPARK-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14009555#comment-14009555 ] Takuya Ueshin commented on SPARK-1938: PRed: https://github.com/apache/spark/pull/893
[jira] [Commented] (SPARK-1926) Nullability of Max/Min/First should be true.
[ https://issues.apache.org/jira/browse/SPARK-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14008631#comment-14008631 ] Takuya Ueshin commented on SPARK-1926: PRed: https://github.com/apache/spark/pull/881 Nullability of Max/Min/First should be true. Key: SPARK-1926 URL: https://issues.apache.org/jira/browse/SPARK-1926 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin Nullability of {{Max}}/{{Min}}/{{First}} should be {{true}} because they return {{null}} if there are no rows.
[jira] [Created] (SPARK-1914) Simplify CountFunction not to traverse to evaluate all child expressions.
Takuya Ueshin created SPARK-1914: Summary: Simplify CountFunction not to traverse to evaluate all child expressions. Key: SPARK-1914 URL: https://issues.apache.org/jira/browse/SPARK-1914 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin
[jira] [Updated] (SPARK-1914) Simplify CountFunction not to traverse to evaluate all child expressions.
[ https://issues.apache.org/jira/browse/SPARK-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-1914: - Description: {{CountFunction}} should count up only if the child's evaluated value is not null. Because it currently traverses and evaluates all child expressions, it counts up if any one of the children is not null, even when the child being counted is null.
[jira] [Commented] (SPARK-1914) Simplify CountFunction not to traverse to evaluate all child expressions.
[ https://issues.apache.org/jira/browse/SPARK-1914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007033#comment-14007033 ] Takuya Ueshin commented on SPARK-1914: Pull-requested: https://github.com/apache/spark/pull/861
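The intended behavior can be sketched with a self-contained model (not Spark's actual class):

```scala
// Hypothetical model of the simplified CountFunction: evaluate the single
// child expression once and count the row only when the result is non-null,
// instead of traversing every child expression.
class SimpleCountFunction[Row](evalChild: Row => Any) {
  private var count: Long = 0L
  def update(row: Row): Unit =
    if (evalChild(row) != null) count += 1
  def result: Long = count
}
```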
[jira] [Commented] (SPARK-1915) AverageFunction should not count if the evaluated value is null.
[ https://issues.apache.org/jira/browse/SPARK-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14007136#comment-14007136 ] Takuya Ueshin commented on SPARK-1915: Pull-requested: https://github.com/apache/spark/pull/862 AverageFunction should not count if the evaluated value is null. Key: SPARK-1915 URL: https://issues.apache.org/jira/browse/SPARK-1915 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin Average values differ depending on whether the calculation is done partially or not, because {{AverageFunction}} (in the non-partial calculation) counts rows even when the evaluated value is null.
[jira] [Updated] (SPARK-1880) Eliminate unnecessary job executions.
[ https://issues.apache.org/jira/browse/SPARK-1880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin updated SPARK-1880: - Description: There are unnecessary job executions in {{BroadcastNestedLoopJoin}}. For an {{Inner}} or {{LeftOuter}} join, the preparation of {{rightOuterMatches}}, which is needed only for {{RightOuter}} or {{FullOuter}} joins, is not necessary. And for a {{RightOuter}} or {{FullOuter}} join, it should use {{fold}} instead of {{count}} followed by {{reduce}}. Eliminate unnecessary job executions. Key: SPARK-1880 URL: https://issues.apache.org/jira/browse/SPARK-1880 Project: Spark Issue Type: Improvement Components: SQL Reporter: Takuya Ueshin
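The {{fold}} point can be illustrated on a plain Scala collection standing in for the RDD (a sketch of the idea, not the actual join code):

```scala
// count-then-reduce needs a guard (reduce throws on an empty collection)
// and, on an RDD, runs two separate jobs:
val matched: Seq[Array[Boolean]] = Seq(Array(true, false), Array(false, true))
val or = (a: Array[Boolean], b: Array[Boolean]) =>
  a.zip(b).map { case (x, y) => x || y }

val viaCountReduce =
  if (matched.nonEmpty) matched.reduce(or) else Array.fill(2)(false)

// fold with a neutral zero element does the same work in one pass and
// handles the empty case for free:
val viaFold = matched.fold(Array.fill(2)(false))(or)
```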
[jira] [Commented] (SPARK-1845) Use AllScalaRegistrar for SparkSqlSerializer to register serializers of Scala collections.
[ https://issues.apache.org/jira/browse/SPARK-1845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998680#comment-13998680 ] Takuya Ueshin commented on SPARK-1845: Pull-requested: https://github.com/apache/spark/pull/790
[jira] [Created] (SPARK-1845) Use AllScalaRegistrar for SparkSqlSerializer to register serializers of Scala collections.
Takuya Ueshin created SPARK-1845: Summary: Use AllScalaRegistrar for SparkSqlSerializer to register serializers of Scala collections. Key: SPARK-1845 URL: https://issues.apache.org/jira/browse/SPARK-1845 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin When I execute {{orderBy}} or {{limit}} on a {{SchemaRDD}} that includes {{ArrayType}} or {{MapType}}, {{SparkSqlSerializer}} throws an exception such as: {quote} com.esotericsoftware.kryo.KryoException: Class cannot be created (missing no-arg constructor): scala.collection.immutable.$colon$colon {quote} or {quote} com.esotericsoftware.kryo.KryoException: Class cannot be created (missing no-arg constructor): scala.collection.immutable.Vector {quote} or {quote} com.esotericsoftware.kryo.KryoException: Class cannot be created (missing no-arg constructor): scala.collection.immutable.HashMap$HashTrieMap {quote} and so on. This is because serializer registrations for the concrete collection classes are missing in {{SparkSqlSerializer}}. I believe it should use {{AllScalaRegistrar}}, which covers serializers for the concrete {{Seq}} and {{Map}} classes used for {{ArrayType}} and {{MapType}}.
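With the chill library this amounts to one extra registration call when building the Kryo instance (a sketch of the idea, not the actual SparkSqlSerializer patch):

```scala
import com.esotericsoftware.kryo.Kryo
import com.twitter.chill.AllScalaRegistrar

// AllScalaRegistrar registers Kryo serializers for the concrete Scala
// collection classes (List's ::, Vector, HashMap$HashTrieMap, ...), which
// is exactly what the exceptions above report as missing.
def newKryo(): Kryo = {
  val kryo = new Kryo()
  new AllScalaRegistrar().apply(kryo)
  kryo
}
```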
[jira] [Created] (SPARK-1819) Fix GetField.nullable.
Takuya Ueshin created SPARK-1819: Summary: Fix GetField.nullable. Key: SPARK-1819 URL: https://issues.apache.org/jira/browse/SPARK-1819 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin {{GetField.nullable}} should be {{true}} not only when {{field.nullable}} is {{true}} but also when {{child.nullable}} is {{true}}.
[jira] [Commented] (SPARK-1819) Fix GetField.nullable.
[ https://issues.apache.org/jira/browse/SPARK-1819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13996201#comment-13996201 ] Takuya Ueshin commented on SPARK-1819: Pull-requested: https://github.com/apache/spark/pull/757
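The fix reduces to a one-line nullability definition, sketched here with hypothetical model classes (not Catalyst's actual ones):

```scala
case class FieldModel(nullable: Boolean)
case class ChildModel(nullable: Boolean)

// Hypothetical model: accessing a field of a possibly-null struct yields
// null, so nullability must follow the child as well as the field itself.
case class GetFieldModel(child: ChildModel, field: FieldModel) {
  def nullable: Boolean = child.nullable || field.nullable
}
```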
[jira] [Created] (SPARK-1778) Add 'limit' transformation to SchemaRDD.
Takuya Ueshin created SPARK-1778: Summary: Add 'limit' transformation to SchemaRDD. Key: SPARK-1778 URL: https://issues.apache.org/jira/browse/SPARK-1778 Project: Spark Issue Type: Improvement Reporter: Takuya Ueshin Add {{limit}} transformation to {{SchemaRDD}}.
[jira] [Commented] (SPARK-1778) Add 'limit' transformation to SchemaRDD.
[ https://issues.apache.org/jira/browse/SPARK-1778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13993461#comment-13993461 ] Takuya Ueshin commented on SPARK-1778: -- Pull-requested: https://github.com/apache/spark/pull/711 Add 'limit' transformation to SchemaRDD. Key: SPARK-1778 URL: https://issues.apache.org/jira/browse/SPARK-1778 Project: Spark Issue Type: Improvement Reporter: Takuya Ueshin Add {{limit}} transformation to {{SchemaRDD}}. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-1608) Cast.nullable should be true when cast from StringType to NumericType/TimestampType
[ https://issues.apache.org/jira/browse/SPARK-1608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979559#comment-13979559 ] Takuya Ueshin commented on SPARK-1608: -- Pull-requested: https://github.com/apache/spark/pull/532 Cast.nullable should be true when cast from StringType to NumericType/TimestampType --- Key: SPARK-1608 URL: https://issues.apache.org/jira/browse/SPARK-1608 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin Cast.nullable should be true when casting from StringType to NumericType or TimestampType, because if the StringType expression holds an illegal number string or an illegal timestamp string, the cast value becomes null. -- This message was sent by Atlassian JIRA (v6.2#6252)
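The semantics behind this nullability claim can be sketched in plain Scala (a hypothetical model, with {{None}} standing in for a null cast result): even when the input string is non-null, an unparsable value yields null, so the cast expression must report nullable = true.

```scala
// Model of the string-to-numeric cast semantics: an illegal number string
// does not raise an error to the query; it simply becomes null (None here).
object StringCastModel {
  def castToInt(s: String): Option[Int] =
    try Some(s.toInt)
    catch { case _: NumberFormatException => None }
}
```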
[jira] [Commented] (SPARK-1610) Cast from BooleanType to NumericType should use exact type value.
[ https://issues.apache.org/jira/browse/SPARK-1610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979618#comment-13979618 ] Takuya Ueshin commented on SPARK-1610: -- Pull-requested: https://github.com/apache/spark/pull/533 Cast from BooleanType to NumericType should use exact type value. - Key: SPARK-1610 URL: https://issues.apache.org/jira/browse/SPARK-1610 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin Casts from BooleanType to NumericType all use an Int value. This causes a ClassCastException when the cast value is used in a subsequent evaluation, as in the code below:
{quote}
scala> import org.apache.spark.sql.catalyst._
import org.apache.spark.sql.catalyst._

scala> import types._
import types._

scala> import expressions._
import expressions._

scala> Add(Cast(Literal(true), ShortType), Literal(1.toShort)).eval()
java.lang.ClassCastException: java.lang.Integer cannot be cast to java.lang.Short
 at scala.runtime.BoxesRunTime.unboxToShort(BoxesRunTime.java:102)
 at scala.math.Numeric$ShortIsIntegral$.plus(Numeric.scala:72)
 at org.apache.spark.sql.catalyst.expressions.Add$$anonfun$eval$2.apply(arithmetic.scala:58)
 at org.apache.spark.sql.catalyst.expressions.Add$$anonfun$eval$2.apply(arithmetic.scala:58)
 at org.apache.spark.sql.catalyst.expressions.Expression.n2(Expression.scala:114)
 at org.apache.spark.sql.catalyst.expressions.Add.eval(arithmetic.scala:58)
 at .<init>(<console>:17)
 at .<clinit>(<console>)
 at .<init>(<console>:7)
 at .<clinit>(<console>)
 at $print(<console>)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:483)
 at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
 at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983)
 at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
 at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604)
 at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568)
 at scala.tools.nsc.interpreter.ILoop.reallyInterpret$1(ILoop.scala:760)
 at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:805)
 at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:717)
 at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:581)
 at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:588)
 at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:591)
 at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:882)
 at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
 at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
 at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
 at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837)
 at scala.tools.nsc.MainGenericRunner.runTarget$1(MainGenericRunner.scala:83)
 at scala.tools.nsc.MainGenericRunner.process(MainGenericRunner.scala:96)
 at scala.tools.nsc.MainGenericRunner$.main(MainGenericRunner.scala:105)
 at scala.tools.nsc.MainGenericRunner.main(MainGenericRunner.scala)
{quote}
-- This message was sent by Atlassian JIRA (v6.2#6252)
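The boxing mismatch behind the ClassCastException above can be reproduced outside Spark (a hypothetical standalone model, not the Catalyst code): when a boolean cast to ShortType produces an {{Int}}, the value is boxed as {{java.lang.Integer}}, and a consumer that unboxes to {{Short}} fails, whereas producing the exact target type works.

```scala
// Model of the two cast behaviors: the buggy version always returns an
// Int (boxed as java.lang.Integer); the fixed version returns a value of
// the exact target type (boxed as java.lang.Short for ShortType).
object BoolCastModel {
  def castWrong(b: Boolean): Any = if (b) 1 else 0                 // boxed Integer
  def castExact(b: Boolean): Any = if (b) 1.toShort else 0.toShort // boxed Short
}
```

Unboxing the buggy result with `asInstanceOf[Short]` throws the same "java.lang.Integer cannot be cast to java.lang.Short" error shown in the transcript.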
[jira] [Created] (SPARK-1380) Add sort-merge based cogroup/joins.
Takuya Ueshin created SPARK-1380: Summary: Add sort-merge based cogroup/joins. Key: SPARK-1380 URL: https://issues.apache.org/jira/browse/SPARK-1380 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Takuya Ueshin I've written cogroup/joins based on the 'Sort-Merge' algorithm. -- This message was sent by Atlassian JIRA (v6.2#6252)
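The sort-merge idea behind this proposal can be sketched in standalone Scala (a hypothetical simplification, not the actual patch): sort both inputs by key, then make a single forward pass that pairs up equal-key runs, emitting the cross product of each run.

```scala
// Minimal sort-merge inner join over in-memory sequences.
object SortMergeSketch {
  def join[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)])
                   (implicit ord: Ordering[K]): Seq[(K, (V, W))] = {
    val l = left.sortBy(_._1)   // sort phase
    val r = right.sortBy(_._1)
    val out = Seq.newBuilder[(K, (V, W))]
    var i = 0
    var j = 0
    while (i < l.length && j < r.length) {  // merge phase
      val c = ord.compare(l(i)._1, r(j)._1)
      if (c < 0) i += 1
      else if (c > 0) j += 1
      else {
        // Equal keys: find the end of each run and emit the cross product.
        val key = l(i)._1
        var lEnd = i; while (lEnd < l.length && ord.equiv(l(lEnd)._1, key)) lEnd += 1
        var rEnd = j; while (rEnd < r.length && ord.equiv(r(rEnd)._1, key)) rEnd += 1
        for (a <- i until lEnd; b <- j until rEnd) out += ((key, (l(a)._2, r(b)._2)))
        i = lEnd
        j = rEnd
      }
    }
    out.result()
  }
}
```

After sorting, the merge is a single pass, which is the property that makes the approach attractive for large pre-sorted or spillable inputs.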
[jira] [Commented] (SPARK-1380) Add sort-merge based cogroup/joins.
[ https://issues.apache.org/jira/browse/SPARK-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956313#comment-13956313 ] Takuya Ueshin commented on SPARK-1380: -- Pull-requested: https://github.com/apache/spark/pull/283 Add sort-merge based cogroup/joins. --- Key: SPARK-1380 URL: https://issues.apache.org/jira/browse/SPARK-1380 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Takuya Ueshin I've written cogroup/joins based on the 'Sort-Merge' algorithm. -- This message was sent by Atlassian JIRA (v6.2#6252)