[jira] [Commented] (SPARK-23902) Provide an option in months_between UDF to disable rounding-off

2018-04-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16430214#comment-16430214 ] Marco Gaido commented on SPARK-23902: - I will work on this, thanks. > Provide an option in

[jira] [Created] (NIFI-5043) TailFile opens and never closes readers in Multiple Files mode

2018-04-05 Thread Marco Gaido (JIRA)
Marco Gaido created NIFI-5043: - Summary: TailFile opens and never closes readers in Multiple Files mode Key: NIFI-5043 URL: https://issues.apache.org/jira/browse/NIFI-5043 Project: Apache NiFi

[jira] [Commented] (SPARK-23835) When Dataset.as converts column from nullable to non-nullable type, null Doubles are converted silently to -1

2018-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23835?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422082#comment-16422082 ] Marco Gaido commented on SPARK-23835: - Actually this is not the first time we see this. Previously,

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422068#comment-16422068 ] Marco Gaido commented on SPARK-23791: - Yes, I think you're right [~maropu]. Do you want to reopen

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-04-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422064#comment-16422064 ] Marco Gaido commented on SPARK-23791: - Thanks, [~rednikotin]. The error you noticed in the range

[jira] [Commented] (SPARK-23791) Sub-optimal generated code for sum aggregating

2018-03-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420335#comment-16420335 ] Marco Gaido commented on SPARK-23791: - Hi [~rednikotin]. Thanks for reporting this. The error you

[jira] [Commented] (SPARK-23782) SHS should not show applications to user without read permission

2018-03-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16417149#comment-16417149 ] Marco Gaido commented on SPARK-23782: - [~vanzin] sorry but I cannot see any usability issue with

[jira] [Commented] (SPARK-23782) SHS should not show applications to user without read permission

2018-03-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412510#comment-16412510 ] Marco Gaido commented on SPARK-23782: - [~vanzin] thanks for the link. I see that in the discussion

[jira] [Commented] (SPARK-23782) SHS should not show applications to user without read permission

2018-03-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411836#comment-16411836 ] Marco Gaido commented on SPARK-23782: - [~vanzin] sorry but I have not been able to find any JIRA

[jira] [Created] (SPARK-23782) SHS should not show applications to user without read permission

2018-03-23 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23782: --- Summary: SHS should not show applications to user without read permission Key: SPARK-23782 URL: https://issues.apache.org/jira/browse/SPARK-23782 Project: Spark

[jira] [Commented] (SPARK-23739) Spark structured streaming long running problem

2018-03-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16411377#comment-16411377 ] Marco Gaido commented on SPARK-23739: - [~zsxwing] [~joseph.torres] [~c...@koeninger.org] I am not

[jira] [Commented] (SPARK-23739) Spark structured streaming long running problem

2018-03-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407112#comment-16407112 ] Marco Gaido commented on SPARK-23739: - Can you provide some more info about how you are getting this

[jira] [Commented] (NIFI-4631) Improve ListFile performance (using walkFileTree)

2018-03-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/NIFI-4631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16404970#comment-16404970 ] Marco Gaido commented on NIFI-4631: --- [~joewitt] yes, thanks. I created about 100.000 directories with

[jira] [Created] (SPARK-23644) SHS with proxy doesn't show applications

2018-03-10 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23644: --- Summary: SHS with proxy doesn't show applications Key: SPARK-23644 URL: https://issues.apache.org/jira/browse/SPARK-23644 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23598) WholeStageCodegen can lead to IllegalAccessError calling append for HashAggregateExec

2018-03-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16391373#comment-16391373 ] Marco Gaido commented on SPARK-23598: - [~dvogelbacher] the parameter you are talking about is taken

[jira] [Created] (SPARK-23628) WholeStageCodegen can generate methods with too many params

2018-03-08 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23628: --- Summary: WholeStageCodegen can generate methods with too many params Key: SPARK-23628 URL: https://issues.apache.org/jira/browse/SPARK-23628 Project: Spark

[jira] [Commented] (SPARK-23592) Add interpreted execution for DecodeUsingSerializer expression

2018-03-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388075#comment-16388075 ] Marco Gaido commented on SPARK-23592: - I will submit a PR as soon as SPARK-23591 gets merged, thanks

[jira] [Commented] (SPARK-23590) Add interpreted execution for CreateExternalRow expression

2018-03-06 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387721#comment-16387721 ] Marco Gaido commented on SPARK-23590: - I am working on this > Add interpreted execution for

[jira] [Commented] (SPARK-23598) WholeStageCodegen can lead to IllegalAccessError calling append for HashAggregateExec

2018-03-05 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16386234#comment-16386234 ] Marco Gaido commented on SPARK-23598: - thanks for reporting this. Actually the one which you designed

[jira] [Created] (SPARK-23568) Silhouette should get number of features from metadata if available

2018-03-02 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23568: --- Summary: Silhouette should get number of features from metadata if available Key: SPARK-23568 URL: https://issues.apache.org/jira/browse/SPARK-23568 Project: Spark

[jira] [Commented] (SPARK-23498) Accuracy problem in comparison with string and integer

2018-03-02 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16383373#comment-16383373 ] Marco Gaido commented on SPARK-23498: - I think we are seeing many of these issues with implicit

[jira] [Commented] (SPARK-23528) Expose vital statistics of GaussianMixtureModel

2018-02-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380426#comment-16380426 ] Marco Gaido commented on SPARK-23528: - The log likelihood is already available in the summary (eg.

[jira] [Commented] (SPARK-23535) MinMaxScaler return 0.5 for an all zero column

2018-02-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16380334#comment-16380334 ] Marco Gaido commented on SPARK-23535: - I checked and each tool behaves in its own way when this case

[jira] [Commented] (SPARK-23531) When explain, plan's output should include attribute type info

2018-02-28 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16379967#comment-16379967 ] Marco Gaido commented on SPARK-23531: - I am working on this. I will submit a PR soon. > When

[jira] [Created] (SPARK-23501) Refactor AllStagesPage in order to avoid redundant code

2018-02-23 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23501: --- Summary: Refactor AllStagesPage in order to avoid redundant code Key: SPARK-23501 URL: https://issues.apache.org/jira/browse/SPARK-23501 Project: Spark Issue

[jira] [Commented] (SPARK-23496) Locality of coalesced partitions can be severely skewed by the order of input partitions

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374530#comment-16374530 ] Marco Gaido commented on SPARK-23496: - [~ala.luszczak] thanks for your answer. Honestly I don't see

[jira] [Commented] (SPARK-23496) Locality of coalesced partitions can be severely skewed by the order of input partitions

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374439#comment-16374439 ] Marco Gaido commented on SPARK-23496: - I read that the proposed solution is to use random numbers

[jira] [Commented] (SPARK-23493) insert-into depends on columns order, otherwise incorrect data inserted

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374358#comment-16374358 ] Marco Gaido commented on SPARK-23493: - How can it know that you are not setting the partition column

[jira] [Commented] (SPARK-23493) insert-into depends on columns order, otherwise incorrect data inserted

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374258#comment-16374258 ] Marco Gaido commented on SPARK-23493: - I don't think so. Partition columns are always at the end. If

[jira] [Commented] (SPARK-23493) insert-into depends on columns order, otherwise incorrect data inserted

2018-02-23 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374146#comment-16374146 ] Marco Gaido commented on SPARK-23493: - I don't think this is an issue. I think this is the expected

[jira] [Created] (SPARK-23489) HiveExternalCatalogVersionsSuite flaky test

2018-02-22 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23489: --- Summary: HiveExternalCatalogVersionsSuite flaky test Key: SPARK-23489 URL: https://issues.apache.org/jira/browse/SPARK-23489 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-23475) The "stages" page doesn't show any completed stages

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371347#comment-16371347 ] Marco Gaido commented on SPARK-23475: - The reason for this behavior is that SKIPPED stages, which were

[jira] [Comment Edited] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371279#comment-16371279 ] Marco Gaido edited comment on SPARK-23473 at 2/21/18 11:53 AM: --- Your stack

[jira] [Commented] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371279#comment-16371279 ] Marco Gaido commented on SPARK-23473: - Your stack error points out which is the real issue: {code}

[jira] [Resolved] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23473. - Resolution: Invalid > spark.catalog.listTables error when database name starts with a number >

[jira] [Commented] (SPARK-23477) Misleading exception message when union fails due to metadata

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371278#comment-16371278 ] Marco Gaido commented on SPARK-23477: - [~kretes] yes. I think we can close this, do you agree? >

[jira] [Commented] (SPARK-23477) Misleading exception message when union fails due to metadata

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371238#comment-16371238 ] Marco Gaido commented on SPARK-23477: - I cannot reproduce this on master. > Misleading exception

[jira] [Updated] (SPARK-23473) spark.catalog.listTables error when database name starts with a number

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23473: Component/s: (was: Spark Core) SQL > spark.catalog.listTables error when

[jira] [Commented] (SPARK-23463) Filter operation fails to handle blank values and evicts rows that even satisfy the filtering condition

2018-02-21 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16371186#comment-16371186 ] Marco Gaido commented on SPARK-23463: - It changed Spark's implicit casting. Probably in 2.1.1

[jira] [Comment Edited] (NIFI-4367) InvokedScriptedProcessor does not support scripted processor that extends AbstractProcessor

2018-02-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/NIFI-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370090#comment-16370090 ] Marco Gaido edited comment on NIFI-4367 at 2/20/18 2:14 PM: [~frett27] sorry

[jira] [Commented] (NIFI-4367) InvokedScriptedProcessor does not support scripted processor that extends AbstractProcessor

2018-02-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/NIFI-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370090#comment-16370090 ] Marco Gaido commented on NIFI-4367: --- [~frett27] sorry did you have time to check the PR I sent you? Any

[jira] [Commented] (SPARK-23463) Filter operation fails to handle blank values and evicts rows that even satisfy the filtering condition

2018-02-20 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16370046#comment-16370046 ] Marco Gaido commented on SPARK-23463: - Hi [~m.bakshi11]. The problem is very easy. The column `val`

[jira] [Commented] (SPARK-23463) Filter operation fails to handle blank values and evicts rows that even satisfy the filtering condition

2018-02-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368968#comment-16368968 ] Marco Gaido commented on SPARK-23463: - Sorry, what do you mean by blank values? What is the type of

[jira] [Commented] (SPARK-23458) OrcSuite flaky test

2018-02-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16368295#comment-16368295 ] Marco Gaido commented on SPARK-23458: - cc [~dongjoon] > OrcSuite flaky test > --- >

[jira] [Created] (SPARK-23458) OrcSuite flaky test

2018-02-17 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23458: --- Summary: OrcSuite flaky test Key: SPARK-23458 URL: https://issues.apache.org/jira/browse/SPARK-23458 Project: Spark Issue Type: Task Components: SQL

[jira] [Created] (SPARK-23451) Deprecate KMeans computeCost

2018-02-16 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23451: --- Summary: Deprecate KMeans computeCost Key: SPARK-23451 URL: https://issues.apache.org/jira/browse/SPARK-23451 Project: Spark Issue Type: Task

[jira] [Commented] (SPARK-23439) Ambiguous reference when selecting column inside StructType with same name that outer colum

2018-02-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366945#comment-16366945 ] Marco Gaido commented on SPARK-23439: - [~cloud_fan] I think this comes from

[jira] [Commented] (SPARK-23442) Reading from partitioned and bucketed table uses only bucketSpec.numBuckets partitions in all cases

2018-02-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366898#comment-16366898 ] Marco Gaido commented on SPARK-23442: - I am not sure it is what you are looking for, but you can

[jira] [Commented] (SPARK-23399) Register a task completion listener first for OrcColumnarBatchReader

2018-02-16 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16366788#comment-16366788 ] Marco Gaido commented on SPARK-23399: - I think we should reopen this, it is still happening:

[jira] [Commented] (SPARK-23436) Incorrect Date column Inference in partition discovery

2018-02-15 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16365780#comment-16365780 ] Marco Gaido commented on SPARK-23436: - Thanks for reporting this. This affects also current branch. I

[jira] [Commented] (SPARK-23234) ML python test failure due to default outputCol

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16364680#comment-16364680 ] Marco Gaido commented on SPARK-23234: - [~josephkb] maybe it is not a blocker, but since this can

[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363741#comment-16363741 ] Marco Gaido commented on SPARK-23402: - Yes the table existed. please try with the current master. I

[jira] [Commented] (SPARK-23402) Dataset write method not working as expected for postgresql database

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363657#comment-16363657 ] Marco Gaido commented on SPARK-23402: - I tried with Postgres 10, driver 42.2.1 and I was unable to

[jira] [Commented] (SPARK-23420) Datasource loading not handling paths with regex chars.

2018-02-14 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16363634#comment-16363634 ] Marco Gaido commented on SPARK-23420: - I don't remember the ticket number but this may be solved. May

[jira] [Commented] (SPARK-23416) flaky test: org.apache.spark.sql.kafka010.KafkaSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362920#comment-16362920 ] Marco Gaido commented on SPARK-23416: - I see this failing also with this stacktrace: {code:java}

[jira] [Commented] (SPARK-23344) Add KMeans distanceMeasure param to PySpark

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362774#comment-16362774 ] Marco Gaido commented on SPARK-23344: - I see. It would be good indeed to decide in the community a

[jira] [Commented] (SPARK-23344) Add KMeans distanceMeasure param to PySpark

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362758#comment-16362758 ] Marco Gaido commented on SPARK-23344: - [~srowen] I did it this way because I have always seen it done this way. Not

[jira] [Commented] (SPARK-23411) Deprecate SparkContext.getExecutorStorageStatus

2018-02-13 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362597#comment-16362597 ] Marco Gaido commented on SPARK-23411: - I think this method was removed in SPARK-20659. So I think

[jira] [Created] (SPARK-23412) Add cosine distance measure to BisectingKMeans

2018-02-13 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23412: --- Summary: Add cosine distance measure to BisectingKMeans Key: SPARK-23412 URL: https://issues.apache.org/jira/browse/SPARK-23412 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23394) Storage info's Cached Partitions doesn't consider the replications (but sc.getRDDStorageInfo does)

2018-02-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360593#comment-16360593 ] Marco Gaido commented on SPARK-23394: - I think this is not an issue. `numCachedPartitions` is 20

[jira] [Commented] (SPARK-23393) Path is error when run test in local machine

2018-02-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16360471#comment-16360471 ] Marco Gaido commented on SPARK-23393: - I think this is a problem with your environment. The path is

[jira] [Commented] (SPARK-22105) Dataframe has poor performance when computing on many columns with codegen

2018-02-10 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16359420#comment-16359420 ] Marco Gaido commented on SPARK-22105: - [~WeichenXu123] what is the number of rows for the dataset

[jira] [Updated] (SPARK-23375) Optimizer should remove unneeded Sort

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23375: Description: As pointed out in SPARK-23368, as of now there is no rule to remove the Sort

[jira] [Created] (SPARK-23375) Optimizer should remove unneeded Sort

2018-02-09 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23375: --- Summary: Optimizer should remove unneeded Sort Key: SPARK-23375 URL: https://issues.apache.org/jira/browse/SPARK-23375 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23373. - Resolution: Cannot Reproduce > Can not execute "count distinct" queries on parquet formatted

[jira] [Commented] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358402#comment-16358402 ] Marco Gaido commented on SPARK-23373: - Then I think we can close this, thanks. > Can not execute

[jira] [Commented] (SPARK-23373) Can not execute "count distinct" queries on parquet formatted table

2018-02-09 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16358315#comment-16358315 ] Marco Gaido commented on SPARK-23373: - I cannot reproduce on current master... Could you try and check

[jira] [Commented] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357473#comment-16357473 ] Marco Gaido commented on SPARK-23244: - The change is related because your problem is caused by the

[jira] [Commented] (SPARK-23041) Inconsistent `drop`ing of columns in dataframes

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356850#comment-16356850 ] Marco Gaido commented on SPARK-23041: - yes I am unable to reproduce this problem in master branch. >

[jira] [Comment Edited] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356786#comment-16356786 ] Marco Gaido edited comment on SPARK-23244 at 2/8/18 10:47 AM: -- maybe we can

[jira] [Commented] (SPARK-23244) Incorrect handling of default values when deserializing python wrappers of scala transformers

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356786#comment-16356786 ] Marco Gaido commented on SPARK-23244: - maybe we can close this as a duplicate of SPARK-23234. Anyway,

[jira] [Commented] (SPARK-23338) Spark unable to run on HDP deployed Azure Blob File System

2018-02-08 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16356771#comment-16356771 ] Marco Gaido commented on SPARK-23338: - [~Subham] questions should be sent to the user mailing list,

[jira] [Created] (SPARK-23344) Add KMeans distanceMeasure param to PySpark

2018-02-06 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23344: --- Summary: Add KMeans distanceMeasure param to PySpark Key: SPARK-23344 URL: https://issues.apache.org/jira/browse/SPARK-23344 Project: Spark Issue Type:

[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348622#comment-16348622 ] Marco Gaido commented on SPARK-22575: - I think STS is the only Spark application where this can

[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348603#comment-16348603 ] Marco Gaido commented on SPARK-22575: - Then the problem is likely that the executors are killed in

[jira] [Comment Edited] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348548#comment-16348548 ] Marco Gaido edited comment on SPARK-22575 at 2/1/18 1:06 PM: - I am not able

[jira] [Commented] (SPARK-22575) Making Spark Thrift Server clean up its cache

2018-02-01 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16348548#comment-16348548 ] Marco Gaido commented on SPARK-22575: - I am not able to reproduce the issue. May I ask you to provide

[jira] [Commented] (NIFI-4367) InvokedScriptedProcessor does not support scripted processor that extends AbstractProcessor

2018-01-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/NIFI-4367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346780#comment-16346780 ] Marco Gaido commented on NIFI-4367: --- [~frett27] I sent you a PR, thanks. > InvokedScriptedProcessor does

[jira] [Commented] (SPARK-23273) Spark Dataset withColumn - schema column order isn't the same as case class paramether order

2018-01-31 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346758#comment-16346758 ] Marco Gaido commented on SPARK-23273: - [~viirya] I don't think that this would solve this problem.

[jira] [Resolved] (SPARK-22692) Reduce the number of generated mutable states

2018-01-30 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-22692. - Resolution: Fixed > Reduce the number of generated mutable states >

[jira] [Updated] (SPARK-23234) ML python test failure due to default outputCol

2018-01-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23234: Description: SPARK-22799 and SPARK-22797 are causing valid Python test failures. The reason is

[jira] [Updated] (SPARK-23234) ML python test failure due to default outputCol

2018-01-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23234: Summary: ML python test failure due to default outputCol (was: ML python test failure) > ML

[jira] [Updated] (SPARK-23234) ML python test failure

2018-01-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23234: Description: SPARK-22799 and SPARK-22797 are causing valid Python test failures. The reason is

[jira] [Created] (SPARK-23234) ML python test failure

2018-01-26 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23234: --- Summary: ML python test failure Key: SPARK-23234 URL: https://issues.apache.org/jira/browse/SPARK-23234 Project: Spark Issue Type: Bug Components:

[jira] [Resolved] (SPARK-23225) Spark is infering decimal values with wrong precision

2018-01-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23225. - Resolution: Duplicate > Spark is infering decimal values with wrong precision >

[jira] [Commented] (SPARK-23225) Spark is infering decimal values with wrong precision

2018-01-26 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16340813#comment-16340813 ] Marco Gaido commented on SPARK-23225: - I am not able to reproduce on master. Could you provide a sample

[jira] [Created] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23217: --- Summary: Add cosine distance measure to ClusteringEvaluator Key: SPARK-23217 URL: https://issues.apache.org/jira/browse/SPARK-23217 Project: Spark Issue Type:

[jira] [Updated] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23217: Attachment: (was: SPARK-23217.pages) > Add cosine distance measure to ClusteringEvaluator >

[jira] [Updated] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23217: Attachment: SPARK-23217.pdf > Add cosine distance measure to ClusteringEvaluator >

[jira] [Updated] (SPARK-23217) Add cosine distance measure to ClusteringEvaluator

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23217: Attachment: SPARK-23217.pages > Add cosine distance measure to ClusteringEvaluator >

[jira] [Resolved] (SPARK-23212) Casts the column to a different data type.

2018-01-25 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23212. - Resolution: Invalid This is not the right place. For questions, please use the user mailing

[jira] [Created] (SPARK-23179) Support option to throw exception if overflow occurs

2018-01-22 Thread Marco Gaido (JIRA)
Marco Gaido created SPARK-23179: --- Summary: Support option to throw exception if overflow occurs Key: SPARK-23179 URL: https://issues.apache.org/jira/browse/SPARK-23179 Project: Spark Issue

[jira] [Updated] (SPARK-23087) CheckCartesianProduct too restrictive when condition is constant folded to false/null

2018-01-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido updated SPARK-23087: Priority: Minor (was: Major) > CheckCartesianProduct too restrictive when condition is constant

[jira] [Commented] (SPARK-23156) Code of method "initialize(I)V" of class "org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection" grows beyond 64 KB

2018-01-19 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16332434#comment-16332434 ] Marco Gaido commented on SPARK-23156: - [~kzawisto] a lot of work on this has been done and it is both

[jira] [Created] (NIFI-4790) Support HTTPS proxy in InvokeHTTP

2018-01-18 Thread Marco Gaido (JIRA)
Marco Gaido created NIFI-4790: - Summary: Support HTTPS proxy in InvokeHTTP Key: NIFI-4790 URL: https://issues.apache.org/jira/browse/NIFI-4790 Project: Apache NiFi Issue Type: Improvement

[jira] [Assigned] (NIFI-2169) Improve RouteText performance with pre-compilation of RegEx in certain cases

2018-01-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/NIFI-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido reassigned NIFI-2169: - Assignee: Marco Gaido (was: Oleg Zhurakousky) > Improve RouteText performance with

[jira] [Commented] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)

2018-01-18 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16330269#comment-16330269 ] Marco Gaido commented on SPARK-23130: - [~seano] there is no JIRA for the pipeout issue and there

[jira] [Resolved] (SPARK-15401) Spark Thrift server creates empty directories in tmp directory on the driver

2018-01-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-15401. - Resolution: Duplicate > Spark Thrift server creates empty directories in tmp directory on the

[jira] [Commented] (SPARK-15401) Spark Thrift server creates empty directories in tmp directory on the driver

2018-01-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328738#comment-16328738 ] Marco Gaido commented on SPARK-15401: - This should have been fixed in SPARK-22793. > Spark Thrift

[jira] [Resolved] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)

2018-01-17 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23130. - Resolution: Duplicate > Spark Thrift does not clean-up temporary files (/tmp/*_resources and >
