[jira] [Commented] (SPARK-26222) Scan: track file listing time

2018-12-12 Thread Yuanjian Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719110#comment-16719110 ] Yuanjian Li commented on SPARK-26222: - Yes, I think I misunderstood your original intention. I'll

[jira] [Assigned] (SPARK-26222) Scan: track file listing time

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26222: Assignee: Apache Spark > Scan: track file listing time > - >

[jira] [Resolved] (SPARK-24102) RegressionEvaluator should use sample weight data

2018-12-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-24102. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 17085

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719231#comment-16719231 ] Imran Rashid commented on SPARK-26019: -- we chatted some offline about this. the summary here is

[jira] [Updated] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-26019: - Description: ORIGINAL REPORT = pyspark/accumulators.py: "TypeError: object of type

[jira] [Updated] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-26019: - Description: ORIGINAL REPORT = Started happening after 2.3.1 -> 2.3.2 upgrade.

[jira] [Commented] (SPARK-19827) spark.ml R API for PIC

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19827?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719069#comment-16719069 ] ASF GitHub Bot commented on SPARK-19827: srowen closed pull request #23292:

[jira] [Created] (SPARK-26349) Pyspark should not accept insecure p4yj gateways

2018-12-12 Thread Imran Rashid (JIRA)
Imran Rashid created SPARK-26349: Summary: Pyspark should not accept insecure p4yj gateways Key: SPARK-26349 URL: https://issues.apache.org/jira/browse/SPARK-26349 Project: Spark Issue Type:

[jira] [Commented] (SPARK-26348) make sure expression is resolved during test

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719115#comment-16719115 ] ASF GitHub Bot commented on SPARK-26348: cloud-fan opened a new pull request #23297:

[jira] [Commented] (SPARK-26222) Scan: track file listing time

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719122#comment-16719122 ] ASF GitHub Bot commented on SPARK-26222: xuanyuanking opened a new pull request #23298:

[jira] [Commented] (SPARK-24102) RegressionEvaluator should use sample weight data

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719133#comment-16719133 ] ASF GitHub Bot commented on SPARK-24102: srowen closed pull request #17085:

[jira] [Commented] (SPARK-26327) Metrics in FileSourceScanExec not update correctly while relation.partitionSchema is set

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719186#comment-16719186 ] ASF GitHub Bot commented on SPARK-26327: xuanyuanking opened a new pull request #23300:

[jira] [Assigned] (SPARK-26348) make sure expression is resolved during test

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26348: Assignee: Apache Spark (was: Wenchen Fan) > make sure expression is resolved during

[jira] [Assigned] (SPARK-26348) make sure expression is resolved during test

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26348: Assignee: Wenchen Fan (was: Apache Spark) > make sure expression is resolved during

[jira] [Updated] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-26019: - Description: pyspark's accumulator server expects a secure py4j connection between python and

[jira] [Updated] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-26019: - Description: pyspark's accumulator server expects a secure py4j connection between python and

[jira] [Commented] (SPARK-24152) SparkR CRAN feasibility check server problem

2018-12-12 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719247#comment-16719247 ] Marco Gaido commented on SPARK-24152: - [~viirya] [~hyukjin.kwon] I am seeing this again constantly:

[jira] [Updated] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-26019: - Target Version/s: 2.3.3, 2.4.1 > pyspark/accumulators.py: "TypeError: object of type 'NoneType'

[jira] [Updated] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-26019: - Description: pyspark's accumulator server expects a secure py4j connection between python and

[jira] [Created] (SPARK-26348) make sure expression is resolved during test

2018-12-12 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-26348: --- Summary: make sure expression is resolved during test Key: SPARK-26348 URL: https://issues.apache.org/jira/browse/SPARK-26348 Project: Spark Issue Type: Test

[jira] [Commented] (SPARK-26327) Metrics in FileSourceScanExec not update correctly while relation.partitionSchema is set

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719156#comment-16719156 ] ASF GitHub Bot commented on SPARK-26327: xuanyuanking opened a new pull request #23299:

[jira] [Assigned] (SPARK-26222) Scan: track file listing time

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26222: Assignee: (was: Apache Spark) > Scan: track file listing time >

[jira] [Commented] (SPARK-24152) SparkR CRAN feasibility check server problem

2018-12-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719254#comment-16719254 ] Dongjoon Hyun commented on SPARK-24152: --- [~felixcheung], [~shivaram] . Can we split out CRAN

[jira] [Assigned] (SPARK-24102) RegressionEvaluator should use sample weight data

2018-12-12 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-24102: - Assignee: Ilya Matiach > RegressionEvaluator should use sample weight data >

[jira] [Commented] (SPARK-25877) Put all feature-related code in the feature step itself

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719388#comment-16719388 ] ASF GitHub Bot commented on SPARK-25877: asfgit closed pull request #23220: [SPARK-25877][k8s]

[jira] [Commented] (SPARK-26350) Allow the user to override the group id of the Kafka's consumer

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719517#comment-16719517 ] ASF GitHub Bot commented on SPARK-26350: zsxwing opened a new pull request #23301:

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719308#comment-16719308 ] ASF GitHub Bot commented on SPARK-26019: vanzin closed pull request #23113:

[jira] [Created] (SPARK-26350) Allow the user to override the group id of the Kafka's consumer

2018-12-12 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-26350: Summary: Allow the user to override the group id of the Kafka's consumer Key: SPARK-26350 URL: https://issues.apache.org/jira/browse/SPARK-26350 Project: Spark

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719433#comment-16719433 ] Imran Rashid commented on SPARK-26019: -- Hi [~dongjoon], I don't think there is a zeppelin issue,

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719501#comment-16719501 ] Dongjoon Hyun commented on SPARK-26019: --- Thank you for the information. If it's considered

[jira] [Updated] (SPARK-25285) Add executor task metrics to track the number of tasks started and of tasks successfully completed

2018-12-12 Thread Luca Canali (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Luca Canali updated SPARK-25285: Description: The motivation for these additional metrics is to help in troubleshooting and

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-12-12 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719359#comment-16719359 ] Dongjoon Hyun commented on SPARK-26019: --- Hi, [~irashid]. Is there a corresponding Apache Zeppelin

[jira] [Resolved] (SPARK-25877) Put all feature-related code in the feature step itself

2018-12-12 Thread Matt Cheah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt Cheah resolved SPARK-25877. Resolution: Fixed Fix Version/s: 3.0.0 > Put all feature-related code in the feature step

[jira] [Assigned] (SPARK-26350) Allow the user to override the group id of the Kafka's consumer

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26350: Assignee: (was: Apache Spark) > Allow the user to override the group id of the

[jira] [Assigned] (SPARK-26350) Allow the user to override the group id of the Kafka's consumer

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26350: Assignee: Apache Spark > Allow the user to override the group id of the Kafka's consumer

[jira] [Assigned] (SPARK-24522) Centralize code to deal with security-related HTTP features

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24522: Assignee: (was: Apache Spark) > Centralize code to deal with security-related HTTP

[jira] [Assigned] (SPARK-24522) Centralize code to deal with security-related HTTP features

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24522: Assignee: Apache Spark > Centralize code to deal with security-related HTTP features >

[jira] [Assigned] (SPARK-25277) YARN applicationMaster metrics should not register static and JVM metrics

2018-12-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-25277: -- Assignee: Luca Canali > YARN applicationMaster metrics should not register static

[jira] [Resolved] (SPARK-25277) YARN applicationMaster metrics should not register static and JVM metrics

2018-12-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-25277. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22279

[jira] [Commented] (SPARK-24152) SparkR CRAN feasibility check server problem

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719578#comment-16719578 ] Hyukjin Kwon commented on SPARK-24152: -- This will be permanently prevented after Spark 3.0 release

[jira] [Assigned] (SPARK-26322) Simplify kafka delegation token sasl.mechanism configuration

2018-12-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-26322: -- Assignee: Gabor Somogyi > Simplify kafka delegation token sasl.mechanism

[jira] [Commented] (SPARK-26280) Spark will read entire CSV file even when limit is used

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719727#comment-16719727 ] Hyukjin Kwon commented on SPARK-26280: -- Is it CSV specific or does it happen in other datasources?

[jira] [Commented] (SPARK-26355) Add a workaround for PyArrow 0.11.

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719733#comment-16719733 ] ASF GitHub Bot commented on SPARK-26355: ueshin opened a new pull request #23305:

[jira] [Created] (SPARK-26351) Documented formula of precision at k does not match the actual code

2018-12-12 Thread Pablo J. Villacorta (JIRA)
Pablo J. Villacorta created SPARK-26351: --- Summary: Documented formula of precision at k does not match the actual code Key: SPARK-26351 URL: https://issues.apache.org/jira/browse/SPARK-26351

[jira] [Commented] (SPARK-24522) Centralize code to deal with security-related HTTP features

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719565#comment-16719565 ] ASF GitHub Bot commented on SPARK-24522: vanzin opened a new pull request #23302:

[jira] [Commented] (SPARK-26254) Move delegation token providers into a separate project

2018-12-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719574#comment-16719574 ] Marcelo Vanzin commented on SPARK-26254: bq. loaded the providers with ServiceLoader If you're

[jira] [Commented] (SPARK-26322) Simplify kafka delegation token sasl.mechanism configuration

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719604#comment-16719604 ] ASF GitHub Bot commented on SPARK-26322: asfgit closed pull request #23274: [SPARK-26322][SS]

[jira] [Created] (SPARK-26353) Add typed aggregate functions:max&

2018-12-12 Thread liuxian (JIRA)
liuxian created SPARK-26353: --- Summary: Add typed aggregate functions:max& Key: SPARK-26353 URL: https://issues.apache.org/jira/browse/SPARK-26353 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-26297) improve the doc of Distribution/Partitioning

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719701#comment-16719701 ] ASF GitHub Bot commented on SPARK-26297: asfgit closed pull request #23249: [SPARK-26297][SQL]

[jira] [Commented] (SPARK-26291) how to get the number of rows when i write or read use datasourceV2

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719730#comment-16719730 ] Hyukjin Kwon commented on SPARK-26291: -- Also, avoid to set fix version which is usually set when

[jira] [Updated] (SPARK-26351) Documented formula of precision at k does not match the actual code

2018-12-12 Thread Pablo J. Villacorta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo J. Villacorta updated SPARK-26351: Description: The formula of the *precision @ k* for measuring the quality of the

[jira] [Updated] (SPARK-26351) Documented formula of precision at k does not match the actual code

2018-12-12 Thread Pablo J. Villacorta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pablo J. Villacorta updated SPARK-26351: Description: The formula of the *precision @ k* for measuring the quality of the

[jira] [Created] (SPARK-26354) Ability to return schema prefix before dataframe column names

2018-12-12 Thread t oo (JIRA)
t oo created SPARK-26354: Summary: Ability to return schema prefix before dataframe column names Key: SPARK-26354 URL: https://issues.apache.org/jira/browse/SPARK-26354 Project: Spark Issue Type:

[jira] [Commented] (SPARK-18818) Window...orderBy() should accept an 'ascending' parameter just like DataFrame.orderBy()

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719686#comment-16719686 ] ASF GitHub Bot commented on SPARK-18818: HyukjinKwon closed pull request #22533:

[jira] [Commented] (SPARK-24152) SparkR CRAN feasibility check server problem

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719692#comment-16719692 ] Hyukjin Kwon commented on SPARK-24152: -- Adding [~vanzin]. R test failure was by this, FYI. >

[jira] [Created] (SPARK-26356) Remove SaveMode from data source v2 API

2018-12-12 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-26356: --- Summary: Remove SaveMode from data source v2 API Key: SPARK-26356 URL: https://issues.apache.org/jira/browse/SPARK-26356 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-26356) Remove SaveMode from data source v2 API

2018-12-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-26356: Priority: Blocker (was: Major) > Remove SaveMode from data source v2 API >

[jira] [Updated] (SPARK-26320) udf with multiple arrays as input

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-26320: - Description: Spark get GC out of memory error when passing many arrays when we use many arrays

[jira] [Commented] (SPARK-26320) udf with multiple arrays as input

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719754#comment-16719754 ] Hyukjin Kwon commented on SPARK-26320: -- Can you post a runnable self-reproducer? > udf with

[jira] [Commented] (SPARK-26308) Large BigDecimal value is converted to null when passed into a UDF

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719753#comment-16719753 ] Hyukjin Kwon commented on SPARK-26308: -- ^ I think that sounds okay. > Large BigDecimal value is

[jira] [Commented] (SPARK-26355) Add a workaround for PyArrow 0.11.

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719762#comment-16719762 ] ASF GitHub Bot commented on SPARK-26355: asfgit closed pull request #23305:

[jira] [Commented] (SPARK-23464) MesosClusterScheduler double-escapes parameters to bash command

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23464?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719598#comment-16719598 ] ASF GitHub Bot commented on SPARK-23464: vanzin closed pull request #20641: [SPARK-23464][MESOS]

[jira] [Created] (SPARK-26352) ReorderJoin should not change the order of columns

2018-12-12 Thread Kris Mok (JIRA)
Kris Mok created SPARK-26352: Summary: ReorderJoin should not change the order of columns Key: SPARK-26352 URL: https://issues.apache.org/jira/browse/SPARK-26352 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-23464) MesosClusterScheduler double-escapes parameters to bash command

2018-12-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-23464. Resolution: Duplicate > MesosClusterScheduler double-escapes parameters to bash command >

[jira] [Commented] (SPARK-26291) how to get the number of rows when i write or read use datasourceV2

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719729#comment-16719729 ] Hyukjin Kwon commented on SPARK-26291: -- Please avoid to set target version which is usually

[jira] [Updated] (SPARK-26291) how to get the number of rows when i write or read use datasourceV2

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-26291: - Fix Version/s: (was: 2.4.0) > how to get the number of rows when i write or read use

[jira] [Updated] (SPARK-26291) how to get the number of rows when i write or read use datasourceV2

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-26291: - Target Version/s: (was: 2.4.0) > how to get the number of rows when i write or read use

[jira] [Assigned] (SPARK-26355) Add a workaround for PyArrow 0.11.

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26355: Assignee: Apache Spark > Add a workaround for PyArrow 0.11. >

[jira] [Assigned] (SPARK-26355) Add a workaround for PyArrow 0.11.

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26355: Assignee: (was: Apache Spark) > Add a workaround for PyArrow 0.11. >

[jira] [Resolved] (SPARK-26348) make sure expression is resolved during test

2018-12-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-26348. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23297

[jira] [Reopened] (SPARK-24152) SparkR CRAN feasibility check server problem

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-24152: -- > SparkR CRAN feasibility check server problem

[jira] [Commented] (SPARK-25277) YARN applicationMaster metrics should not register static and JVM metrics

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719584#comment-16719584 ] ASF GitHub Bot commented on SPARK-25277: asfgit closed pull request #22279: [SPARK-25277][YARN]

[jira] [Commented] (SPARK-26353) Add typed aggregate functions:max&

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719624#comment-16719624 ] ASF GitHub Bot commented on SPARK-26353: 10110346 opened a new pull request #23304:

[jira] [Assigned] (SPARK-26353) Add typed aggregate functions:max&

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26353: Assignee: Apache Spark > Add typed aggregate functions:max& >

[jira] [Resolved] (SPARK-18818) Window...orderBy() should accept an 'ascending' parameter just like DataFrame.orderBy()

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-18818. -- Resolution: Won't Fix > Window...orderBy() should accept an 'ascending' parameter just like

[jira] [Resolved] (SPARK-26297) improve the doc of Distribution/Partitioning

2018-12-12 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-26297. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23249

[jira] [Resolved] (SPARK-26322) Simplify kafka delegation token sasl.mechanism configuration

2018-12-12 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26322. Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23274

[jira] [Commented] (SPARK-26352) ReorderJoin should not change the order of columns

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719620#comment-16719620 ] ASF GitHub Bot commented on SPARK-26352: rednaxelafx opened a new pull request #23303:

[jira] [Assigned] (SPARK-26352) ReorderJoin should not change the order of columns

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26352: Assignee: Apache Spark > ReorderJoin should not change the order of columns >

[jira] [Assigned] (SPARK-26352) ReorderJoin should not change the order of columns

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26352?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26352: Assignee: (was: Apache Spark) > ReorderJoin should not change the order of columns >

[jira] [Commented] (SPARK-26346) Upgrade parquet to 1.11.0

2018-12-12 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719631#comment-16719631 ] Yuming Wang commented on SPARK-26346: - Pending PARQUET-1434. > Upgrade parquet to 1.11.0 >

[jira] [Assigned] (SPARK-26353) Add typed aggregate functions:max&

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26353: Assignee: (was: Apache Spark) > Add typed aggregate functions:max& >

[jira] [Updated] (SPARK-26291) how to get the number of rows when i write or read use datasourceV2

2018-12-12 Thread webber (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] webber updated SPARK-26291: --- Shepherd: Gabor Somogyi > how to get the number of rows when i write or read use datasourceV2 >

[jira] [Created] (SPARK-26355) Add a workaround for PyArrow 0.11.

2018-12-12 Thread Takuya Ueshin (JIRA)
Takuya Ueshin created SPARK-26355: - Summary: Add a workaround for PyArrow 0.11. Key: SPARK-26355 URL: https://issues.apache.org/jira/browse/SPARK-26355 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-26291) how to get the number of rows when i write or read use datasourceV2

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26291. -- Resolution: Invalid > how to get the number of rows when i write or read use datasourceV2 >

[jira] [Commented] (SPARK-26348) make sure expression is resolved during test

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719751#comment-16719751 ] ASF GitHub Bot commented on SPARK-26348: asfgit closed pull request #23297:

[jira] [Updated] (SPARK-26338) Use scala-xml explicitly

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-26338: - Affects Version/s: (was: 2.4.0) 3.0.0 > Use scala-xml explicitly >

[jira] [Commented] (SPARK-26354) Ability to return schema prefix before dataframe column names

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719785#comment-16719785 ] Hyukjin Kwon commented on SPARK-26354: -- BTW, for your usecase how about simply just giving an alias

[jira] [Resolved] (SPARK-26331) Allow SQL UDF registration to recognize default function values from Scala

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26331. -- Resolution: Won't Fix > Allow SQL UDF registration to recognize default function values from

[jira] [Commented] (SPARK-26331) Allow SQL UDF registration to recognize default function values from Scala

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719769#comment-16719769 ] Hyukjin Kwon commented on SPARK-26331: -- Yea default value doesn't work under the hood in Java side.

[jira] [Commented] (SPARK-26332) Spark sql write orc table on viewFS throws exception

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719774#comment-16719774 ] Hyukjin Kwon commented on SPARK-26332: -- I think you can put some input at

[jira] [Created] (SPARK-26357) Expose executors' procfs metrics to Metrics system

2018-12-12 Thread Reza Safi (JIRA)
Reza Safi created SPARK-26357: - Summary: Expose executors' procfs metrics to Metrics system Key: SPARK-26357 URL: https://issues.apache.org/jira/browse/SPARK-26357 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-26357) Expose executors' procfs metrics to Metrics system

2018-12-12 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26357: Assignee: Apache Spark > Expose executors' procfs metrics to Metrics system >

[jira] [Commented] (SPARK-26357) Expose executors' procfs metrics to Metrics system

2018-12-12 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719801#comment-16719801 ] ASF GitHub Bot commented on SPARK-26357: rezasafi opened a new pull request #23306:

[jira] [Commented] (SPARK-26354) Ability to return schema prefix before dataframe column names

2018-12-12 Thread tooptoop4 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719831#comment-16719831 ] tooptoop4 commented on SPARK-26354: --- there are many existing (and regularly run) queries written

[jira] [Updated] (SPARK-26326) Cannot save a NaiveBayesModel with 48685 features and 5453 labels

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-26326: - Component/s: (was: Spark Core) ML > Cannot save a NaiveBayesModel with

[jira] [Resolved] (SPARK-26355) Add a workaround for PyArrow 0.11.

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-26355. -- Resolution: Fixed Fixed in https://github.com/apache/spark/pull/23305 > Add a workaround for

[jira] [Assigned] (SPARK-26355) Add a workaround for PyArrow 0.11.

2018-12-12 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-26355: Assignee: Takuya Ueshin > Add a workaround for PyArrow 0.11. >

[jira] [Commented] (SPARK-26315) auto cast threshold from Integer to Float in approxSimilarityJoin of BucketedRandomProjectionLSHModel

2018-12-12 Thread Jerry He (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719777#comment-16719777 ] Jerry He commented on SPARK-26315: -- Hi, [~bryanc]  I will try to provide a PR. Thanks! > auto cast

[jira] [Commented] (SPARK-26357) Expose executors' procfs metrics to Metrics system

2018-12-12 Thread Reza Safi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719788#comment-16719788 ] Reza Safi commented on SPARK-26357: --- I will send a pr soon > Expose executors' procfs metrics to
