[jira] [Commented] (SPARK-26308) Large BigDecimal value is converted to null when passed into a UDF

2018-12-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714782#comment-16714782 ] Marco Gaido commented on SPARK-26308: - I think that works. Maybe it is not a "perfect" solution but

[jira] [Commented] (SPARK-26308) Large BigDecimal value is converted to null when passed into a UDF

2018-12-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714879#comment-16714879 ] Marco Gaido commented on SPARK-26308: - I checked but it doesn't work because the cast is added

[jira] [Commented] (SPARK-26224) Results in stackOverFlowError when trying to add 3000 new columns using withColumn function of dataframe.

2018-12-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16714991#comment-16714991 ] Marco Gaido commented on SPARK-26224: - [~viirya] that is true, but here comes a question: do we

[jira] [Commented] (SPARK-26215) define reserved keywords after SQL standard

2018-11-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16703042#comment-16703042 ] Marco Gaido commented on SPARK-26215: - [~cloud_fan] thanks for pinging me. I agree on putting a

[jira] [Commented] (SPARK-26214) Add "broadcast" method to DataFrame

2018-11-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16702994#comment-16702994 ] Marco Gaido commented on SPARK-26214: - You can just use the {{broadcast}} function from

[jira] [Updated] (SPARK-23179) Support option to throw exception if overflow occurs

2018-11-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23179: Issue Type: Sub-task (was: Improvement) Parent: SPARK-26217 > Support option to throw

[jira] [Updated] (SPARK-26215) define reserved keywords after SQL standard

2018-11-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-26215: Issue Type: Sub-task (was: Improvement) Parent: SPARK-26217 > define reserved keywords

[jira] [Created] (SPARK-26218) Throw exception on overflow for integers

2018-11-29 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-26218: --- Summary: Throw exception on overflow for integers Key: SPARK-26218 URL: https://issues.apache.org/jira/browse/SPARK-26218 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-26217) Compliance to SQL standard (SQL:2011)

2018-11-29 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-26217: --- Summary: Compliance to SQL standard (SQL:2011) Key: SPARK-26217 URL: https://issues.apache.org/jira/browse/SPARK-26217 Project: Spark Issue Type: Umbrella

[jira] [Commented] (SPARK-26242) Leading slash breaks proxying

2018-12-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16705916#comment-16705916 ] Marco Gaido commented on SPARK-26242: - You can set {{spark.ui.proxyBase}} or the proxy can set the

[jira] [Commented] (SPARK-25959) Difference in featureImportances results on computed vs saved models

2018-11-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692861#comment-16692861 ] Marco Gaido commented on SPARK-25959: - [~srowen] what do you think about backporting this? Maybe 2.2

[jira] [Created] (SPARK-26127) Remove deprecated setImpurity from GBTClassificationModel, DecisionTreeRegressionModel, GBTRegressionModel, RandomForestRegressionModel

2018-11-20 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-26127: --- Summary: Remove deprecated setImpurity from GBTClassificationModel, DecisionTreeRegressionModel, GBTRegressionModel, RandomForestRegressionModel Key: SPARK-26127 URL:

[jira] [Updated] (SPARK-26127) Remove deprecated setImpurity from tree regression and classification models

2018-11-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-26127: Summary: Remove deprecated setImpurity from tree regression and classification models (was:

[jira] [Updated] (SPARK-26127) Remove deprecated setters from tree regression and classification models

2018-11-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-26127: Summary: Remove deprecated setters from tree regression and classification models (was: Remove

[jira] [Updated] (SPARK-26127) Remove deprecated setters from tree regression and classification models

2018-11-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-26127: Description: Many {{set***}} methods are present for the models of regression and classification

[jira] [Created] (SPARK-26535) Parsing literals as DOUBLE instead of DECIMAL

2019-01-04 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-26535: --- Summary: Parsing literals as DOUBLE instead of DECIMAL Key: SPARK-26535 URL: https://issues.apache.org/jira/browse/SPARK-26535 Project: Spark Issue Type:

[jira] [Commented] (SPARK-26458) OneHotEncoderModel verifies the number of category values incorrectly when tries to transform a dataframe.

2018-12-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16730188#comment-16730188 ] Marco Gaido commented on SPARK-26458: - What is the issue you are encountering? Can you provide a

[jira] [Updated] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25420: Priority: Major (was: Critical) > Dataset.count() every time is different. >

[jira] [Updated] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25420: Labels: SQL (was: SQL correctness) > Dataset.count() every time is different. >

[jira] [Updated] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-25420: Labels: SQL correctness (was: ) > Dataset.count() every time is different. >

[jira] [Commented] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613209#comment-16613209 ] Marco Gaido commented on SPARK-25420: - Please do not use Critical/Blocker as they are reserved for

[jira] [Comment Edited] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613209#comment-16613209 ] Marco Gaido edited comment on SPARK-25420 at 9/13/18 8:51 AM: -- Please do

[jira] [Commented] (SPARK-25420) Dataset.count() every time is different.

2018-09-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613217#comment-16613217 ] Marco Gaido commented on SPARK-25420: - I think the reason here is that since we don't enforce any

[jira] [Commented] (SPARK-24315) Multiple streaming jobs detected error causing job failure

2018-09-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16616643#comment-16616643 ] Marco Gaido commented on SPARK-24315: - [~joeyfezster] it was a while ago, so I may be wrong,

[jira] [Commented] (SPARK-25454) Division between operands with negative scale can cause precision loss

2018-09-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620192#comment-16620192 ] Marco Gaido commented on SPARK-25454: - [~bersprockets] you're right, the only "wrong" thing of your

[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null

2018-09-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16618892#comment-16618892 ] Marco Gaido commented on SPARK-22036: - [~bersprockets] first of all thank you for reporting this and

[jira] [Created] (SPARK-25457) IntegralDivide (div) should not always return long

2018-09-18 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25457: --- Summary: IntegralDivide (div) should not always return long Key: SPARK-25457 URL: https://issues.apache.org/jira/browse/SPARK-25457 Project: Spark Issue Type:

[jira] [Commented] (SPARK-22036) BigDecimal multiplication sometimes returns null

2018-09-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16619129#comment-16619129 ] Marco Gaido commented on SPARK-22036: - [~bersprockets] I created SPARK-25454 for tracking since I

[jira] [Created] (SPARK-25454) Division between operands with negative scale can cause precision loss

2018-09-18 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-25454: --- Summary: Division between operands with negative scale can cause precision loss Key: SPARK-25454 URL: https://issues.apache.org/jira/browse/SPARK-25454 Project: Spark

[jira] [Commented] (SPARK-27287) PCAModel.load() does not honor spark configs

2019-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16807780#comment-16807780 ] Marco Gaido commented on SPARK-27287: - I think the problem here is that the configurations are copied

[jira] [Commented] (SPARK-27283) BigDecimal arithmetic losing precision

2019-03-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801778#comment-16801778 ] Marco Gaido commented on SPARK-27283: - [~Mats_SX] another issue which could happen using Decimal

[jira] [Updated] (SPARK-27282) Spark incorrect results when using UNION with GROUP BY clause

2019-03-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-27282: Priority: Major (was: Blocker) > Spark incorrect results when using UNION with GROUP BY clause >

[jira] [Commented] (SPARK-27282) Spark incorrect results when using UNION with GROUP BY clause

2019-03-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801930#comment-16801930 ] Marco Gaido commented on SPARK-27282: - Please do not use Blocker/Critical as they are reserved for

[jira] [Commented] (SPARK-27283) BigDecimal arithmetic losing precision

2019-03-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803767#comment-16803767 ] Marco Gaido commented on SPARK-27283: - {quote} I guess I'm mostly frustrated that the SQL standard

[jira] [Commented] (SPARK-26988) Spark overwrites spark.scheduler.pool if set in configs

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1655#comment-1655 ] Marco Gaido commented on SPARK-26988: - This seems indeed an issue for any property set using

[jira] [Commented] (SPARK-26974) Invalid data in grouped cached dataset, formed by joining a large cached dataset with a small dataset

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1686#comment-1686 ] Marco Gaido commented on SPARK-26974: - Can you please try a newer Spark version (2.4.0)? If the

[jira] [Commented] (SPARK-27018) Checkpointed RDD deleted prematurely when using GBTClassifier

2019-03-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781558#comment-16781558 ] Marco Gaido commented on SPARK-27018: - [~pkolaczk] thanks for reporting the issue. Spark works with

[jira] [Commented] (SPARK-27018) Checkpointed RDD deleted prematurely when using GBTClassifier

2019-03-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16781657#comment-16781657 ] Marco Gaido commented on SPARK-27018: - First the issue is fixed on master and then it is backported

[jira] [Commented] (SPARK-26947) Pyspark KMeans Clustering job fails on large values of k

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1665#comment-1665 ] Marco Gaido commented on SPARK-26947: - Could you also please provide the heap dump of the JVM? You

[jira] [Commented] (SPARK-26996) Scalar Subquery not handled properly in Spark 2.4

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778485#comment-16778485 ] Marco Gaido commented on SPARK-26996: - Thanks [~dongjoon]! > Scalar Subquery not handled properly

[jira] [Commented] (SPARK-26996) Scalar Subquery not handled properly in Spark 2.4

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16778453#comment-16778453 ] Marco Gaido commented on SPARK-26996: - I have not been able to reproduce on current master though...

[jira] [Updated] (SPARK-26996) Scalar Subquery not handled properly in Spark 2.4

2019-02-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-26996: Component/s: (was: Spark Core) SQL > Scalar Subquery not handled properly in

[jira] [Commented] (SPARK-27018) Checkpointed RDD deleted prematurely when using GBTClassifier

2019-03-07 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16786984#comment-16786984 ] Marco Gaido commented on SPARK-27018: - The PeriodicCheckpointer is still there in master, you can

[jira] [Updated] (SPARK-27193) CodeFormatter should format multi comment lines correctly

2019-03-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-27193: Priority: Trivial (was: Major) > CodeFormatter should format multi comment lines correctly >

[jira] [Commented] (SPARK-27089) Loss of precision during decimal division

2019-03-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794984#comment-16794984 ] Marco Gaido commented on SPARK-27089: - You can set:

[jira] [Created] (SPARK-27243) RuleExecutor throws exception when dumping time spent with no rule executed

2019-03-22 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-27243: --- Summary: RuleExecutor throws exception when dumping time spent with no rule executed Key: SPARK-27243 URL: https://issues.apache.org/jira/browse/SPARK-27243 Project:

[jira] [Commented] (SPARK-26911) Spark do not see column in table

2019-02-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770964#comment-16770964 ] Marco Gaido commented on SPARK-26911: - Could you please check whether current master is still affected?

[jira] [Commented] (SPARK-26881) Scaling issue with Gramian computation for RowMatrix: too many results sent to driver

2019-02-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770881#comment-16770881 ] Marco Gaido commented on SPARK-26881: - This may have been fixed/improved by SPARK-26228, could you

[jira] [Commented] (SPARK-26829) In place standard scaler so the column remains same after transformation

2019-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16766939#comment-16766939 ] Marco Gaido commented on SPARK-26829: - You can set the output column name and you can rename it as

[jira] [Commented] (SPARK-26893) Allow pushdown of partition pruning subquery filters to file source

2019-02-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769474#comment-16769474 ] Marco Gaido commented on SPARK-26893: - Actually this was done intentionally in order to avoid

[jira] [Commented] (SPARK-26894) Fix Alias handling in AggregateEstimation

2019-02-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769845#comment-16769845 ] Marco Gaido commented on SPARK-26894: - It will be assigned once the PR is merged. Thanks. > Fix

[jira] [Commented] (SPARK-26752) Multiple aggregate methods in the same column in DataFrame

2019-01-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754719#comment-16754719 ] Marco Gaido commented on SPARK-26752: - I agree with you [~hyukjin.kwon]. Actually I'd rather propose

[jira] [Commented] (SPARK-26779) NullPointerException when disable wholestage codegen

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755881#comment-16755881 ] Marco Gaido commented on SPARK-26779: - I'd say this is most likely just a duplicate of SPARK-23731.

[jira] [Commented] (SPARK-25420) Dataset.count() every time is different.

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755831#comment-16755831 ] Marco Gaido commented on SPARK-25420: - [~jeffrey.mak] [~kabhwan] I agree that your case seems not

[jira] [Commented] (SPARK-25420) Dataset.count() every time is different.

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755837#comment-16755837 ] Marco Gaido commented on SPARK-25420: - [~jeffrey.mak] I cannot reproduce your issue on current

[jira] [Resolved] (SPARK-26782) Wrong column resolved when joining twice with the same dataframe

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-26782. - Resolution: Duplicate > Wrong column resolved when joining twice with the same dataframe >

[jira] [Commented] (SPARK-26782) Wrong column resolved when joining twice with the same dataframe

2019-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16755987#comment-16755987 ] Marco Gaido commented on SPARK-26782: - This is a duplicate of many others. I also started a thread

[jira] [Commented] (SPARK-26767) Filter on a dropDuplicates dataframe gives inconsistency result

2019-01-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754882#comment-16754882 ] Marco Gaido commented on SPARK-26767: - IIRC there was a similar JIRA reported. Could you please try in

[jira] [Comment Edited] (SPARK-26767) Filter on a dropDuplicates dataframe gives inconsistency result

2019-01-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16754882#comment-16754882 ] Marco Gaido edited comment on SPARK-26767 at 1/29/19 11:13 AM: --- IIRC there

[jira] [Commented] (SPARK-18484) case class datasets - ability to specify decimal precision and scale

2019-01-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751422#comment-16751422 ] Marco Gaido commented on SPARK-18484: - [~bonazzaf] please do not delete comments, as they may be

[jira] [Commented] (SPARK-27278) Optimize GetMapValue when the map is a foldable and the key is not

2019-04-07 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16811808#comment-16811808 ] Marco Gaido commented on SPARK-27278: - [~huonw] I think the point is: in the existing case which is

[jira] [Commented] (SPARK-26218) Throw exception on overflow for integers

2019-04-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16816087#comment-16816087 ] Marco Gaido commented on SPARK-26218: - [~rxin] I see that. But the reasons for this are: - the

[jira] [Commented] (SPARK-24149) Automatic namespaces discovery in HDFS federation

2019-05-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16848097#comment-16848097 ] Marco Gaido commented on SPARK-24149: - [~Dhruve Ashar] the use case for this change, for instance,

[jira] [Commented] (SPARK-24149) Automatic namespaces discovery in HDFS federation

2019-05-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16851896#comment-16851896 ] Marco Gaido commented on SPARK-24149: - That's true, the point is: if you want to access a different

[jira] [Commented] (SPARK-27820) case insensitive resolver should be used in GetMapValue

2019-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870286#comment-16870286 ] Marco Gaido commented on SPARK-27820: - +1 for [~hyukjin.kwon]'s comment. > case insensitive

[jira] [Commented] (SPARK-28135) ceil/ceiling/floor/power returns incorrect values

2019-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870360#comment-16870360 ] Marco Gaido commented on SPARK-28135: - [~Tonix517] tickets are assigned only once the PR is merged

[jira] [Commented] (SPARK-28060) Float/Double type can not accept some special inputs

2019-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870288#comment-16870288 ] Marco Gaido commented on SPARK-28060: - This is a duplicate of SPARK-27768, isn't it? Or better,

[jira] [Commented] (SPARK-28067) Incorrect results in decimal aggregation with whole-stage code gen enabled

2019-06-22 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870293#comment-16870293 ] Marco Gaido commented on SPARK-28067: - I cannot reproduce on master. It always returns null with

[jira] [Commented] (SPARK-28024) Incorrect numeric values when out of range

2019-06-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16869879#comment-16869879 ] Marco Gaido commented on SPARK-28024: - [~joshrosen] thanks for linking them. Yes, I did try having

[jira] [Commented] (SPARK-28067) Incorrect results in decimal aggregation with whole-stage code gen enabled

2019-06-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16870880#comment-16870880 ] Marco Gaido commented on SPARK-28067: - No, it is the same. Are you sure about your configs? {code}

[jira] [Resolved] (SPARK-27089) Loss of precision during decimal division

2019-05-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-27089. - Resolution: Information Provided > Loss of precision during decimal division >

[jira] [Commented] (SPARK-26182) Cost increases when optimizing scalaUDF

2019-05-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16836486#comment-16836486 ] Marco Gaido commented on SPARK-26182: - Actually you just need to mark it {{asNondeterministic}} to

[jira] [Commented] (SPARK-27684) Reduce ScalaUDF conversion overheads for primitives

2019-05-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839297#comment-16839297 ] Marco Gaido commented on SPARK-27684: - I agree on this too. > Reduce ScalaUDF conversion overheads

[jira] [Resolved] (SPARK-27685) `union` doesn't promote non-nullable columns of struct to nullable

2019-05-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-27685. - Resolution: Duplicate > `union` doesn't promote non-nullable columns of struct to nullable >

[jira] [Commented] (SPARK-27685) `union` doesn't promote non-nullable columns of struct to nullable

2019-05-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16839295#comment-16839295 ] Marco Gaido commented on SPARK-27685: - This is a duplicate of SPARK-26812. > `union` doesn't

[jira] [Commented] (SPARK-27761) Make UDF nondeterministic by default(?)

2019-05-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16843914#comment-16843914 ] Marco Gaido commented on SPARK-27761: - Yes, I think this is a good idea IMHO. The behavior by

[jira] [Commented] (SPARK-27684) Reduce ScalaUDF conversion overheads for primitives

2019-05-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840242#comment-16840242 ] Marco Gaido commented on SPARK-27684: - I can try and work on it, but most likely I will start

[jira] [Commented] (SPARK-27607) Improve performance of Row.toString()

2019-05-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830933#comment-16830933 ] Marco Gaido commented on SPARK-27607: - Hi [~joshrosen], are you working on it? If not I can take it.

[jira] [Commented] (SPARK-27332) Filter Pushdown duplicates expensive ScalarSubquery (discarding result)

2019-05-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830947#comment-16830947 ] Marco Gaido commented on SPARK-27332: - [~dzklip] actually Spark was not using the ScalarSubquery

[jira] [Commented] (SPARK-27612) Creating a DataFrame in PySpark with ArrayType produces some Rows with Arrays of None

2019-05-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830945#comment-16830945 ] Marco Gaido commented on SPARK-27612: - I am not able to reproduce... {code} __ / __/__ ___

[jira] [Commented] (SPARK-27612) Creating a DataFrame in PySpark with ArrayType produces some Rows with Arrays of None

2019-05-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16831097#comment-16831097 ] Marco Gaido commented on SPARK-27612: - I don't have a python3 env, sorry... > Creating a DataFrame

[jira] [Created] (SPARK-28200) Overflow handling in `ExpressionEncoder`

2019-06-28 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-28200: --- Summary: Overflow handling in `ExpressionEncoder` Key: SPARK-28200 URL: https://issues.apache.org/jira/browse/SPARK-28200 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-28201) Revisit MakeDecimal behavior on overflow

2019-06-28 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-28201: --- Summary: Revisit MakeDecimal behavior on overflow Key: SPARK-28201 URL: https://issues.apache.org/jira/browse/SPARK-28201 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-28201) Revisit MakeDecimal behavior on overflow

2019-06-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874765#comment-16874765 ] Marco Gaido commented on SPARK-28201: - I'll create a PR for this ASAP. > Revisit MakeDecimal

[jira] [Commented] (SPARK-28348) Avoid cast twice for decimal type

2019-07-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882741#comment-16882741 ] Marco Gaido commented on SPARK-28348: - There is no cast to `decimal(38, 6)`. The reason why the

[jira] [Comment Edited] (SPARK-28348) Avoid cast twice for decimal type

2019-07-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882741#comment-16882741 ] Marco Gaido edited comment on SPARK-28348 at 7/11/19 8:23 AM: -- There is no

[jira] [Commented] (SPARK-28348) Avoid cast twice for decimal type

2019-07-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882773#comment-16882773 ] Marco Gaido commented on SPARK-28348: - No, I don't think that's a good idea. Setting the result type

[jira] [Commented] (SPARK-28348) Avoid cast twice for decimal type

2019-07-11 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883165#comment-16883165 ] Marco Gaido commented on SPARK-28348: - Mmmh, yes, you're right. > Avoid cast twice for decimal

[jira] [Commented] (SPARK-28324) The LOG function using 10 as the base, but Spark using E

2019-07-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884397#comment-16884397 ] Marco Gaido commented on SPARK-28324: - +1 for [~srowen]'s opinion. I don't think it is a good idea

[jira] [Commented] (SPARK-28222) Feature importance outputs different values in GBT and Random Forest in 2.3.3 and 2.4 pyspark version

2019-07-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884412#comment-16884412 ] Marco Gaido commented on SPARK-28222: - [~eneriwrt] do you have a simple repro for this? I can try

[jira] [Commented] (SPARK-28316) Decimal precision issue

2019-07-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884398#comment-16884398 ] Marco Gaido commented on SPARK-28316: - Well, IIUC, this is just the result of Postgres having no

[jira] [Commented] (SPARK-28222) Feature importance outputs different values in GBT and Random Forest in 2.3.3 and 2.4 pyspark version

2019-07-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16877267#comment-16877267 ] Marco Gaido commented on SPARK-28222: - Mmmmh, there has been a bug fix for it (see SPARK-26721), but

[jira] [Created] (SPARK-28235) Decimal sum return type

2019-07-02 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-28235: --- Summary: Decimal sum return type Key: SPARK-28235 URL: https://issues.apache.org/jira/browse/SPARK-28235 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876357#comment-16876357 ] Marco Gaido commented on SPARK-28186: - Do you know of any SQL DB with the behavior you are

[jira] [Commented] (SPARK-28067) Incorrect results in decimal aggregation with whole-stage code gen enabled

2019-07-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16880700#comment-16880700 ] Marco Gaido commented on SPARK-28067: - I cannot reproduce in 2.4.0 either: {code}

[jira] [Commented] (SPARK-27287) PCAModel.load() does not honor spark configs

2019-04-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-27287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16823876#comment-16823876 ] Marco Gaido commented on SPARK-27287: - [~dharmesh.kakadia] the point is: if you set a config on the

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-07-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16876509#comment-16876509 ] Marco Gaido commented on SPARK-28186: - You're right with that. The equivalent in Postgres is

[jira] [Commented] (SPARK-28186) array_contains returns null instead of false when one of the items in the array is null

2019-06-29 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16875447#comment-16875447 ] Marco Gaido commented on SPARK-28186: - This is the right behavior AFAIK. Why are you saying it is

[jira] [Created] (SPARK-28610) Support larger buffer for sum of long

2019-08-03 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-28610: --- Summary: Support larger buffer for sum of long Key: SPARK-28610 URL: https://issues.apache.org/jira/browse/SPARK-28610 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-28611) Histogram's height is diffrent

2019-08-04 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16899585#comment-16899585 ] Marco Gaido commented on SPARK-28611: - Mmmh, that's weird! How can you get a different result than
