[jira] [Comment Edited] (SPARK-14516) Clustering evaluator

2023-03-27 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069815#comment-16069815 ] Marco Gaido edited comment on SPARK-14516 at 3/27/23 9:42 AM: -- Hello

[jira] [Commented] (SPARK-14516) Clustering evaluator

2023-03-27 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-14516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17705265#comment-17705265 ] Marco Gaido commented on SPARK-14516: - As there are some issues with the Google Doc sharing, I

[jira] [Commented] (SPARK-36673) Incorrect Unions of struct with mismatched field name case

2021-09-06 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-36673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17410488#comment-17410488 ] Marco Gaido commented on SPARK-36673: - AFAIK, in SQL the names in the struct are case sensitive,

[jira] [Commented] (SPARK-29667) implicitly convert mismatched datatypes on right side of "IN" operator

2019-12-04 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987712#comment-16987712 ] Marco Gaido commented on SPARK-29667: - I can agree more with you [~hyukjin.kwon]. I think that

[jira] [Commented] (SPARK-29123) DecimalType multiplication precision loss

2019-09-19 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933560#comment-16933560 ] Marco Gaido commented on SPARK-29123: - [~benny] the point here is: Spark can represent decimals with

[jira] [Commented] (SPARK-29123) DecimalType multiplication precision loss

2019-09-19 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16933128#comment-16933128 ] Marco Gaido commented on SPARK-29123: - You can set

[jira] [Commented] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-10 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926650#comment-16926650 ] Marco Gaido commented on SPARK-29038: - [~cltlfcjin] currently Spark has something similar, which

[jira] [Comment Edited] (SPARK-29038) SPIP: Support Spark Materialized View

2019-09-10 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16926650#comment-16926650 ] Marco Gaido edited comment on SPARK-29038 at 9/10/19 1:40 PM: -- [~cltlfcjin]

[jira] [Commented] (SPARK-29009) Returning pojo from udf not working

2019-09-07 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16924774#comment-16924774 ] Marco Gaido commented on SPARK-29009: - Why do you think this is a bug? If you want a struct to be

[jira] [Commented] (SPARK-28610) Support larger buffer for sum of long

2019-09-03 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16921471#comment-16921471 ] Marco Gaido commented on SPARK-28610: - Hi [~Gengliang.Wang]. That's a different thing. You are doing

[jira] [Updated] (SPARK-28939) SQL configuration are not always propagated

2019-09-01 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-28939: Description: The SQL configurations are propagated to executors in order to be effective.

[jira] [Created] (SPARK-28939) SQL configuration are not always propagated

2019-09-01 Thread Marco Gaido (Jira)
Marco Gaido created SPARK-28939: --- Summary: SQL configuration are not always propagated Key: SPARK-28939 URL: https://issues.apache.org/jira/browse/SPARK-28939 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-28916) Generated SpecificSafeProjection.apply method grows beyond 64 KB when use SparkSQL

2019-08-31 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920153#comment-16920153 ] Marco Gaido commented on SPARK-28916: - I think the problem is related to subexpression elimination.

[jira] [Commented] (SPARK-28916) Generated SpecificSafeProjection.apply method grows beyond 64 KB when use SparkSQL

2019-08-31 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920074#comment-16920074 ] Marco Gaido commented on SPARK-28916: - Thanks for reporting this. I am checking it. > Generated

[jira] [Commented] (SPARK-28934) Add `spark.sql.compatiblity.mode`

2019-08-31 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16920072#comment-16920072 ] Marco Gaido commented on SPARK-28934: - Hi [~smilegator]! Thanks for opening this. I am wondering

[jira] [Resolved] (SPARK-28610) Support larger buffer for sum of long

2019-08-28 Thread Marco Gaido (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-28610. - Resolution: Won't Fix Since the perf regression introduced by the change would be very high,

[jira] [Commented] (SPARK-28611) Histogram's height is diffrent

2019-08-04 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899585#comment-16899585 ] Marco Gaido commented on SPARK-28611: - Mmmh... that's weird! How can you get a different result than

[jira] [Created] (SPARK-28610) Support larger buffer for sum of long

2019-08-03 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-28610: --- Summary: Support larger buffer for sum of long Key: SPARK-28610 URL: https://issues.apache.org/jira/browse/SPARK-28610 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-28512) New optional mode: throw runtime exceptions on casting failures

2019-07-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16892692#comment-16892692 ] Marco Gaido commented on SPARK-28512: - Thanks for pinging me [~maropu]. It is not the same issue, I

[jira] [Commented] (SPARK-28470) Honor spark.sql.decimalOperations.nullOnOverflow in Cast

2019-07-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16889979#comment-16889979 ] Marco Gaido commented on SPARK-28470: - Thanks for checking this Wenchen! I will work on this ASAP.

[jira] [Comment Edited] (SPARK-28225) Unexpected behavior for Window functions

2019-07-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16889518#comment-16889518 ] Marco Gaido edited comment on SPARK-28225 at 7/20/19 2:43 PM: -- Let me cite

[jira] [Commented] (SPARK-28225) Unexpected behavior for Window functions

2019-07-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16889518#comment-16889518 ] Marco Gaido commented on SPARK-28225: - Let me cite PostgreSQL documentation to explain you the

[jira] [Commented] (SPARK-28386) Cannot resolve ORDER BY columns with GROUP BY and HAVING

2019-07-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16889422#comment-16889422 ] Marco Gaido commented on SPARK-28386: - I think this is a duplicate of SPARK-26741. I have a PR for

[jira] [Commented] (SPARK-23758) MLlib 2.4 Roadmap

2019-07-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16886406#comment-16886406 ] Marco Gaido commented on SPARK-23758: - [~dongjoon] seems weird to set the affected version to 3.0

[jira] [Commented] (SPARK-28222) Feature importance outputs different values in GBT and Random Forest in 2.3.3 and 2.4 pyspark version

2019-07-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884412#comment-16884412 ] Marco Gaido commented on SPARK-28222: - [~eneriwrt] do you have a simple repro for this? I can try

[jira] [Commented] (SPARK-28316) Decimal precision issue

2019-07-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884398#comment-16884398 ] Marco Gaido commented on SPARK-28316: - Well, IIUC, this is just the result of Postgres having no

[jira] [Commented] (SPARK-28324) The LOG function using 10 as the base, but Spark using E

2019-07-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884397#comment-16884397 ] Marco Gaido commented on SPARK-28324: - +1 for [~srowen]'s opinion. I don't think it is a good idea

[jira] [Commented] (SPARK-28348) Avoid cast twice for decimal type

2019-07-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883165#comment-16883165 ] Marco Gaido commented on SPARK-28348: - Mmmh... yes, you're right. > Avoid cast twice for decimal

[jira] [Commented] (SPARK-28348) Avoid cast twice for decimal type

2019-07-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882773#comment-16882773 ] Marco Gaido commented on SPARK-28348: - No, I don't think that's a good idea. Setting the result type

[jira] [Commented] (SPARK-28348) Avoid cast twice for decimal type

2019-07-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882741#comment-16882741 ] Marco Gaido commented on SPARK-28348: - There is no cast to `decimal(38, 6)`. The reason why the

[jira] [Comment Edited] (SPARK-28348) Avoid cast twice for decimal type

2019-07-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882741#comment-16882741 ] Marco Gaido edited comment on SPARK-28348 at 7/11/19 8:23 AM: -- There is no

[jira] [Commented] (SPARK-28322) DIV support decimal type

2019-07-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881798#comment-16881798 ] Marco Gaido commented on SPARK-28322: - Thanks for pinging me [~yumwang], I'll work on this on the

[jira] [Commented] (SPARK-28067) Incorrect results in decimal aggregation with whole-stage code gen enabled

2019-07-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880700#comment-16880700 ] Marco Gaido commented on SPARK-28067: - I cannot reproduce in 2.4.0 either: {code}

[jira] [Created] (SPARK-28235) Decimal sum return type

2019-07-02 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-28235: --- Summary: Decimal sum return type Key: SPARK-28235 URL: https://issues.apache.org/jira/browse/SPARK-28235 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-28222) Feature importance outputs different values in GBT and Random Forest in 2.3.3 and 2.4 pyspark version

2019-07-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877267#comment-16877267 ] Marco Gaido commented on SPARK-28222: - Mmmmh, there has been a bug fix for it (see SPARK-26721), but

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876509#comment-16876509 ] Marco Gaido commented on SPARK-28186: - You're right with that. The equivalent in Postgres is

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876357#comment-16876357 ] Marco Gaido commented on SPARK-28186: - Do you know of any SQL DB with the behavior you are

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-06-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875447#comment-16875447 ] Marco Gaido commented on SPARK-28186: - This is the right behavior AFAIK. Why are you saying it is

[jira] [Commented] (SPARK-28201) Revisit MakeDecimal behavior on overflow

2019-06-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874765#comment-16874765 ] Marco Gaido commented on SPARK-28201: - I'll create a PR for this ASAP. > Revisit MakeDecimal

[jira] [Created] (SPARK-28201) Revisit MakeDecimal behavior on overflow

2019-06-28 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-28201: --- Summary: Revisit MakeDecimal behavior on overflow Key: SPARK-28201 URL: https://issues.apache.org/jira/browse/SPARK-28201 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-28200) Overflow handling in `ExpressionEncoder`

2019-06-28 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-28200: --- Summary: Overflow handling in `ExpressionEncoder` Key: SPARK-28200 URL: https://issues.apache.org/jira/browse/SPARK-28200 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-28067) Incorrect results in decimal aggregation with whole-stage code gen enabled

2019-06-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870880#comment-16870880 ] Marco Gaido commented on SPARK-28067: - No, it is the same. Are you sure about your configs? {code}

[jira] [Commented] (SPARK-28135) ceil/ceiling/floor/power returns incorrect values

2019-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870360#comment-16870360 ] Marco Gaido commented on SPARK-28135: - [~Tonix517] tickets are assigned only once the PR is merged

[jira] [Commented] (SPARK-28067) Incorrect results in decimal aggregation with whole-stage code gen enabled

2019-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870293#comment-16870293 ] Marco Gaido commented on SPARK-28067: - I cannot reproduce on master. It always returns null with

[jira] [Commented] (SPARK-28060) Float/Double type can not accept some special inputs

2019-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870288#comment-16870288 ] Marco Gaido commented on SPARK-28060: - This is a duplicate of SPARK-27768, isn't it? Or better,

[jira] [Commented] (SPARK-27820) case insensitive resolver should be used in GetMapValue

2019-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870286#comment-16870286 ] Marco Gaido commented on SPARK-27820: - +1 for [~hyukjin.kwon]'s comment. > case insensitive

[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range

2019-06-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869879#comment-16869879 ] Marco Gaido commented on SPARK-28024: - [~joshrosen] thanks for linking them. Yes, I did try having

[jira] [Commented] (SPARK-24149) Automatic namespaces discovery in HDFS federation

2019-05-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851896#comment-16851896 ] Marco Gaido commented on SPARK-24149: - That's true, the point is: if you want to access a different

[jira] [Commented] (SPARK-24149) Automatic namespaces discovery in HDFS federation

2019-05-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848097#comment-16848097 ] Marco Gaido commented on SPARK-24149: - [~Dhruve Ashar] the use case for this change, for instance,

[jira] [Commented] (SPARK-27761) Make UDF nondeterministic by default(?)

2019-05-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843914#comment-16843914 ] Marco Gaido commented on SPARK-27761: - Yes, I think this is a good idea IMHO. The behavior by

[jira] [Commented] (SPARK-27684) Reduce ScalaUDF conversion overheads for primitives

2019-05-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840242#comment-16840242 ] Marco Gaido commented on SPARK-27684: - I can try and work on it, but most likely I will start

[jira] [Commented] (SPARK-27684) Reduce ScalaUDF conversion overheads for primitives

2019-05-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839297#comment-16839297 ] Marco Gaido commented on SPARK-27684: - I agree on this too. > Reduce ScalaUDF conversion overheads

[jira] [Commented] (SPARK-27685) `union` doesn't promote non-nullable columns of struct to nullable

2019-05-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839295#comment-16839295 ] Marco Gaido commented on SPARK-27685: - This is a duplicate of SPARK-26812. > `union` doesn't

[jira] [Resolved] (SPARK-27685) `union` doesn't promote non-nullable columns of struct to nullable

2019-05-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-27685. - Resolution: Duplicate > `union` doesn't promote non-nullable columns of struct to nullable >

[jira] [Commented] (SPARK-26182) Cost increases when optimizing scalaUDF

2019-05-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836486#comment-16836486 ] Marco Gaido commented on SPARK-26182: - Actually you just need to mark it {{asNondeterministic}} to

[jira] [Resolved] (SPARK-27089) Loss of precision during decimal division

2019-05-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-27089. - Resolution: Information Provided > Loss of precision during decimal division >

[jira] [Commented] (SPARK-27612) Creating a DataFrame in PySpark with ArrayType produces some Rows with Arrays of None

2019-05-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831097#comment-16831097 ] Marco Gaido commented on SPARK-27612: - I don't have a python3 env, sorry... > Creating a DataFrame

[jira] [Commented] (SPARK-27332) Filter Pushdown duplicates expensive ScalarSubquery (discarding result)

2019-05-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830947#comment-16830947 ] Marco Gaido commented on SPARK-27332: - [~dzklip] actually Spark was not using the ScalarSubquery

[jira] [Commented] (SPARK-27612) Creating a DataFrame in PySpark with ArrayType produces some Rows with Arrays of None

2019-05-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830945#comment-16830945 ] Marco Gaido commented on SPARK-27612: - I am not able to reproduce... {code} __ / __/__ ___

[jira] [Commented] (SPARK-27607) Improve performance of Row.toString()

2019-05-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830933#comment-16830933 ] Marco Gaido commented on SPARK-27607: - Hi [~joshrosen], are you working on it? If not I can take it.

[jira] [Commented] (SPARK-27287) PCAModel.load() does not honor spark configs

2019-04-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823876#comment-16823876 ] Marco Gaido commented on SPARK-27287: - [~dharmesh.kakadia] the point is: if you set a config on the

[jira] [Commented] (SPARK-26218) Throw exception on overflow for integers

2019-04-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816087#comment-16816087 ] Marco Gaido commented on SPARK-26218: - [~rxin] I see that. But the reasons for this are: - the

[jira] [Commented] (SPARK-27278) Optimize GetMapValue when the map is a foldable and the key is not

2019-04-07 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811808#comment-16811808 ] Marco Gaido commented on SPARK-27278: - [~huonw] I think the point is: in the existing case which is

[jira] [Commented] (SPARK-27287) PCAModel.load() does not honor spark configs

2019-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807780#comment-16807780 ] Marco Gaido commented on SPARK-27287: - I think the problem here is that the configurations are copied

[jira] [Commented] (SPARK-27283) BigDecimal arithmetic losing precision

2019-03-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803767#comment-16803767 ] Marco Gaido commented on SPARK-27283: - {quote} I guess I'm mostly frustrated that the SQL standard

[jira] [Commented] (SPARK-27282) Spark incorrect results when using UNION with GROUP BY clause

2019-03-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801930#comment-16801930 ] Marco Gaido commented on SPARK-27282: - Please do not use Blocker/Critical as they are reserved for

[jira] [Updated] (SPARK-27282) Spark incorrect results when using UNION with GROUP BY clause

2019-03-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-27282: Priority: Major (was: Blocker) > Spark incorrect results when using UNION with GROUP BY clause >

[jira] [Commented] (SPARK-27283) BigDecimal arithmetic losing precision

2019-03-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801778#comment-16801778 ] Marco Gaido commented on SPARK-27283: - [~Mats_SX] another issue which could happen using Decimal

[jira] [Created] (SPARK-27243) RuleExecutor throws exception when dumping time spent with no rule executed

2019-03-22 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-27243: --- Summary: RuleExecutor throws exception when dumping time spent with no rule executed Key: SPARK-27243 URL: https://issues.apache.org/jira/browse/SPARK-27243 Project:

[jira] [Updated] (SPARK-27193) CodeFormatter should format multi comment lines correctly

2019-03-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-27193: Priority: Trivial (was: Major) > CodeFormatter should format multi comment lines correctly >

[jira] [Commented] (SPARK-27089) Loss of precision during decimal division

2019-03-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794984#comment-16794984 ] Marco Gaido commented on SPARK-27089: - You can set:

[jira] [Commented] (SPARK-27018) Checkpointed RDD deleted prematurely when using GBTClassifier

2019-03-07 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786984#comment-16786984 ] Marco Gaido commented on SPARK-27018: - The PeriodicCheckpointer is still there in master, you can

[jira] [Commented] (SPARK-27018) Checkpointed RDD deleted prematurely when using GBTClassifier

2019-03-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781657#comment-16781657 ] Marco Gaido commented on SPARK-27018: - First the issue is fixed on master and then it is backported

[jira] [Commented] (SPARK-27018) Checkpointed RDD deleted prematurely when using GBTClassifier

2019-03-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781558#comment-16781558 ] Marco Gaido commented on SPARK-27018: - [~pkolaczk] thanks for reporting the issue. Spark works with

[jira] [Commented] (SPARK-26996) Scalar Subquery not handled properly in Spark 2.4

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778485#comment-16778485 ] Marco Gaido commented on SPARK-26996: - Thanks [~dongjoon]! > Scalar Subquery not handled properly

[jira] [Commented] (SPARK-26996) Scalar Subquery not handled properly in Spark 2.4

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778453#comment-16778453 ] Marco Gaido commented on SPARK-26996: - I have not been able to reproduce on current master though...

[jira] [Updated] (SPARK-26996) Scalar Subquery not handled properly in Spark 2.4

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-26996: Component/s: (was: Spark Core) SQL > Scalar Subquery not handled properly in

[jira] [Commented] (SPARK-26974) Invalid data in grouped cached dataset, formed by joining a large cached dataset with a small dataset

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1686#comment-1686 ] Marco Gaido commented on SPARK-26974: - Can you please try a newer Spark version (2.4.0)? If the

[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1665#comment-1665 ] Marco Gaido commented on SPARK-26947: - Could you also please provide the heap dump of the JVM? You

[jira] [Commented] (SPARK-26988) Spark overwrites spark.scheduler.pool if set in configs

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1655#comment-1655 ] Marco Gaido commented on SPARK-26988: - This seems indeed an issue for any property set using

[jira] [Commented] (SPARK-26911) Spark do not see column in table

2019-02-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770964#comment-16770964 ] Marco Gaido commented on SPARK-26911: - Could you please check that current master is still affected?

[jira] [Commented] (SPARK-26881) Scaling issue with Gramian computation for RowMatrix: too many results sent to driver

2019-02-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770881#comment-16770881 ] Marco Gaido commented on SPARK-26881: - This may have been fixed/improved by SPARK-26228, could you

[jira] [Commented] (SPARK-26894) Fix Alias handling in AggregateEstimation

2019-02-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769845#comment-16769845 ] Marco Gaido commented on SPARK-26894: - It will be assigned once the PR is merged. Thanks. > Fix

[jira] [Commented] (SPARK-26893) Allow pushdown of partition pruning subquery filters to file source

2019-02-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769474#comment-16769474 ] Marco Gaido commented on SPARK-26893: - Actually this was done intentionally in order to avoid

[jira] [Commented] (SPARK-26829) In place standard scaler so the column remains same after transformation

2019-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766939#comment-16766939 ] Marco Gaido commented on SPARK-26829: - You can set the output column name and you can rename it as

[jira] [Resolved] (SPARK-26782) Wrong column resolved when joining twice with the same dataframe

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-26782. - Resolution: Duplicate > Wrong column resolved when joining twice with the same dataframe >

[jira] [Commented] (SPARK-26782) Wrong column resolved when joining twice with the same dataframe

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755987#comment-16755987 ] Marco Gaido commented on SPARK-26782: - This is a duplicate of many others. I also started a thread

[jira] [Commented] (SPARK-26779) NullPointerException when disable wholestage codegen

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755881#comment-16755881 ] Marco Gaido commented on SPARK-26779: - I'd say this is most likely just a duplicate of SPARK-23731.

[jira] [Commented] (SPARK-25420) Dataset.count() every time is different.

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755837#comment-16755837 ] Marco Gaido commented on SPARK-25420: - [~jeffrey.mak] I cannot reproduce your issue on current

[jira] [Commented] (SPARK-25420) Dataset.count() every time is different.

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755831#comment-16755831 ] Marco Gaido commented on SPARK-25420: - [~jeffrey.mak] [~kabhwan] I agree that your case seems not

[jira] [Comment Edited] (SPARK-26767) Filter on a dropDuplicates dataframe gives inconsistency result

2019-01-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754882#comment-16754882 ] Marco Gaido edited comment on SPARK-26767 at 1/29/19 11:13 AM: --- IIRC there

[jira] [Commented] (SPARK-26767) Filter on a dropDuplicates dataframe gives inconsistency result

2019-01-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754882#comment-16754882 ] Marco Gaido commented on SPARK-26767: - IIRC there was a similar JIRA reported. May you please try in

[jira] [Commented] (SPARK-26752) Multiple aggregate methods in the same column in DataFrame

2019-01-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754719#comment-16754719 ] Marco Gaido commented on SPARK-26752: - I agree with you [~hyukjin.kwon]. Actually I'd rather propose

[jira] [Commented] (SPARK-18484) case class datasets - ability to specify decimal precision and scale

2019-01-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751422#comment-16751422 ] Marco Gaido commented on SPARK-18484: - [~bonazzaf] please do not delete comments, as they may be

[jira] [Commented] (SPARK-20162) Reading data from MySQL - Cannot up cast from decimal(30,6) to decimal(38,18)

2019-01-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16749920#comment-16749920 ] Marco Gaido commented on SPARK-20162: - [~bonazzaf] what you just reported is an invalid use case and

[jira] [Commented] (SPARK-26639) The reuse subquery function maybe does not work in SPARK SQL

2019-01-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16747501#comment-16747501 ] Marco Gaido commented on SPARK-26639: - [~Jk_Self] I checked and there is only one subquery node in

[jira] [Commented] (SPARK-26645) CSV infer schema bug infers decimal(9,-1)

2019-01-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16745146#comment-16745146 ] Marco Gaido commented on SPARK-26645: - The error is on python side, I will submit a PR shortly,

[jira] [Commented] (SPARK-26639) The reuse subquery function maybe does not work in SPARK SQL

2019-01-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744799#comment-16744799 ] Marco Gaido commented on SPARK-26639: - I see, then let me investigate this further. I think I already

[jira] [Commented] (SPARK-26639) The reuse subquery function maybe does not work in SPARK SQL

2019-01-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16744783#comment-16744783 ] Marco Gaido commented on SPARK-26639: - This may be a duplicate of SPARK-25482. Please may you try on

[jira] [Commented] (SPARK-26569) Fixed point for batch Operator Optimizations never reached when optimize logicalPlan

2019-01-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16742305#comment-16742305 ] Marco Gaido commented on SPARK-26569: - [~chenfan] may you please try a more recent version of Spark?
