[jira] [Commented] (SPARK-26792) Apply custom log URL to Spark UI

2019-01-31 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26792?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16758037#comment-16758037 ] Jungtaek Lim commented on SPARK-26792: -- Hi [~Thatboix45], looks like you voted this issue: do you

[jira] [Closed] (SPARK-24404) Increase currentEpoch when meet a EpochMarker in ContinuousQueuedDataReader.next() in CP mode based on PR #21353 #21332 #21293 and the latest master

2019-01-31 Thread Liangchang Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liangchang Zhu closed SPARK-24404. -- > Increase currentEpoch when meet a EpochMarker in > ContinuousQueuedDataReader.next() in CP

[jira] [Resolved] (SPARK-24404) Increase currentEpoch when meet a EpochMarker in ContinuousQueuedDataReader.next() in CP mode based on PR #21353 #21332 #21293 and the latest master

2019-01-31 Thread Liangchang Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liangchang Zhu resolved SPARK-24404. Resolution: Won't Fix > Increase currentEpoch when meet a EpochMarker in >

[jira] [Assigned] (SPARK-26525) Fast release memory of ShuffleBlockFetcherIterator

2019-01-31 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-26525: --- Assignee: liupengcheng > Fast release memory of ShuffleBlockFetcherIterator >

[jira] [Commented] (SPARK-24541) TCP based shuffle

2019-01-31 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757985#comment-16757985 ] Jungtaek Lim commented on SPARK-24541: -- Continuous processing requires "single stage" to let all

[jira] [Updated] (SPARK-26744) Support schema validation in File Source V2

2019-01-31 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-26744: --- Description: The internal API supportDataType in FileFormat validates the output/input

[jira] [Commented] (SPARK-24541) TCP based shuffle

2019-01-31 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757936#comment-16757936 ] Imran Rashid commented on SPARK-24541: -- can you explain what this means at all? regular spark

[jira] [Updated] (SPARK-26744) Support schema validation in File Source V2

2019-01-31 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-26744: --- Description: The method supportDataType in FileFormat helps to validate the output/input

[jira] [Resolved] (SPARK-26730) Strip redundant AssertNotNull expression for ExpressionEncoder's serializer

2019-01-31 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-26730. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23651

[jira] [Assigned] (SPARK-26730) Strip redundant AssertNotNull expression for ExpressionEncoder's serializer

2019-01-31 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-26730: --- Assignee: wuyi > Strip redundant AssertNotNull expression for ExpressionEncoder's

[jira] [Commented] (SPARK-26786) Handle to treat escaped newline characters('\r','\n') in spark csv

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757898#comment-16757898 ] Hyukjin Kwon commented on SPARK-26786: -- This behaviour is inherited from Univocity parser if I am

[jira] [Resolved] (SPARK-26787) Fix standardization error message in WeightedLeastSquares

2019-01-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26787. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23705

[jira] [Updated] (SPARK-24959) Do not invoke the CSV/JSON parser for empty schema

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24959: - Fix Version/s: (was: 2.4.0) > Do not invoke the CSV/JSON parser for empty schema >

[jira] [Updated] (SPARK-26745) Non-parsing Dataset.count() optimization causes inconsistent results for JSON inputs with empty lines

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-26745: - Fix Version/s: 2.4.1 > Non-parsing Dataset.count() optimization causes inconsistent results for

[jira] [Assigned] (SPARK-7721) Generate test coverage report from Python

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-7721: --- Assignee: Hyukjin Kwon > Generate test coverage report from Python >

[jira] [Resolved] (SPARK-7721) Generate test coverage report from Python

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-7721. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23117

[jira] [Created] (SPARK-26807) Confusing documentation regarding installation from PyPi

2019-01-31 Thread Emmanuel Arias (JIRA)
Emmanuel Arias created SPARK-26807: -- Summary: Confusing documentation regarding installation from PyPi Key: SPARK-26807 URL: https://issues.apache.org/jira/browse/SPARK-26807 Project: Spark

[jira] [Assigned] (SPARK-26808) Pruned schema should not change nullability

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26808: Assignee: (was: Apache Spark) > Pruned schema should not change nullability >

[jira] [Commented] (SPARK-26808) Pruned schema should not change nullability

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757880#comment-16757880 ] Apache Spark commented on SPARK-26808: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-26808) Pruned schema should not change nullability

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26808: Assignee: Apache Spark > Pruned schema should not change nullability >

[jira] [Created] (SPARK-26808) Pruned schema should not change nullability

2019-01-31 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-26808: --- Summary: Pruned schema should not change nullability Key: SPARK-26808 URL: https://issues.apache.org/jira/browse/SPARK-26808 Project: Spark Issue

[jira] [Updated] (SPARK-26795) Retry remote fileSegmentManagedBuffer when creating inputStream failed during shuffle read phase

2019-01-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26795: -- Flags: (was: Patch) Target Version/s: (was: 2.3.2, 2.4.0) Fix Version/s:

[jira] [Assigned] (SPARK-26787) Fix standardization error message in WeightedLeastSquares

2019-01-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26787: - Assignee: Brian Scannell > Fix standardization error message in WeightedLeastSquares >

[jira] [Resolved] (SPARK-25997) Python example code for Power Iteration Clustering in spark.ml

2019-01-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-25997. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 22996

[jira] [Assigned] (SPARK-25997) Python example code for Power Iteration Clustering in spark.ml

2019-01-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-25997: - Assignee: Huaxin Gao > Python example code for Power Iteration Clustering in spark.ml >

[jira] [Updated] (SPARK-26726) Synchronize the amount of memory used by the broadcast variable to the UI display

2019-01-31 Thread hantiantian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] hantiantian updated SPARK-26726: Description: The amount of memory used by the broadcast variable is not synchronized to the UI

[jira] [Comment Edited] (SPARK-26783) Kafka parameter documentation doesn't match with the reality (upper/lowercase)

2019-01-31 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757842#comment-16757842 ] Jungtaek Lim edited comment on SPARK-26783 at 2/1/19 1:01 AM: -- I'm not sure

[jira] [Commented] (SPARK-26783) Kafka parameter documentation doesn't match with the reality (upper/lowercase)

2019-01-31 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757842#comment-16757842 ] Jungtaek Lim commented on SPARK-26783: -- I'm not sure about what [~sindiri] left a comment on the

[jira] [Resolved] (SPARK-26793) Remove spark.shuffle.manager

2019-01-31 Thread liuxian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liuxian resolved SPARK-26793. - Resolution: Invalid > Remove spark.shuffle.manager > > >

[jira] [Commented] (SPARK-25136) unable to use HDFS checkpoint directories after driver restart

2019-01-31 Thread Robert Reid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757828#comment-16757828 ] Robert Reid commented on SPARK-25136: - [~gsomogyi] I haven't had a chance to retry it. Our build

[jira] [Commented] (SPARK-26783) Kafka parameter documentation doesn't match with the reality (upper/lowercase)

2019-01-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757824#comment-16757824 ] Shixiong Zhu commented on SPARK-26783: -- [~gsomogyi] This seems just an API document issue. Right?

[jira] [Updated] (SPARK-26806) EventTimeStats.merge doesn't handle "zero.merge(zero)" correctly

2019-01-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-26806: - Description: Right now, EventTimeStats.merge doesn't handle "zero.merge(zero)". This will make

[jira] [Assigned] (SPARK-26806) EventTimeStats.merge doesn't handle "zero.merge(zero)" correctly

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26806: Assignee: Apache Spark (was: Shixiong Zhu) > EventTimeStats.merge doesn't handle

[jira] [Updated] (SPARK-26806) EventTimeStats.merge doesn't handle "zero.merge(zero)" correctly

2019-01-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-26806: - Reporter: liancheng (was: Shixiong Zhu) > EventTimeStats.merge doesn't handle

[jira] [Assigned] (SPARK-26806) EventTimeStats.merge doesn't handle "zero.merge(zero)" correctly

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26806: Assignee: Shixiong Zhu (was: Apache Spark) > EventTimeStats.merge doesn't handle

[jira] [Updated] (SPARK-26806) EventTimeStats.merge doesn't handle "zero.merge(zero)" correctly

2019-01-31 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-26806: - Affects Version/s: 2.2.1 2.3.0 2.3.1

[jira] [Created] (SPARK-26806) EventTimeStats.merge doesn't handle "zero.merge(zero)" correctly

2019-01-31 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-26806: Summary: EventTimeStats.merge doesn't handle "zero.merge(zero)" correctly Key: SPARK-26806 URL: https://issues.apache.org/jira/browse/SPARK-26806 Project: Spark

[jira] [Commented] (SPARK-26654) Use Timestamp/DateFormatter in CatalogColumnStat

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757814#comment-16757814 ] Apache Spark commented on SPARK-26654: -- User 'MaxGekk' has created a pull request for this issue:

[jira] [Assigned] (SPARK-26654) Use Timestamp/DateFormatter in CatalogColumnStat

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26654: Assignee: (was: Apache Spark) > Use Timestamp/DateFormatter in CatalogColumnStat >

[jira] [Assigned] (SPARK-26654) Use Timestamp/DateFormatter in CatalogColumnStat

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26654: Assignee: Apache Spark > Use Timestamp/DateFormatter in CatalogColumnStat >

[jira] [Assigned] (SPARK-26805) Eliminate double checking of stringToDate and stringToTimestamp inputs

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26805: Assignee: Apache Spark > Eliminate double checking of stringToDate and stringToTimestamp

[jira] [Resolved] (SPARK-26757) GraphX EdgeRDDImpl and VertexRDDImpl `count` method cannot handle empty RDDs

2019-01-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-26757. --- Resolution: Fixed Fix Version/s: 2.3.4 2.4.1 3.0.0

[jira] [Assigned] (SPARK-26805) Eliminate double checking of stringToDate and stringToTimestamp inputs

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26805: Assignee: (was: Apache Spark) > Eliminate double checking of stringToDate and

[jira] [Assigned] (SPARK-26757) GraphX EdgeRDDImpl and VertexRDDImpl `count` method cannot handle empty RDDs

2019-01-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-26757: - Assignee: Huon Wilson > GraphX EdgeRDDImpl and VertexRDDImpl `count` method cannot handle

[jira] [Created] (SPARK-26805) Eliminate double checking of stringToDate and stringToTimestamp inputs

2019-01-31 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-26805: -- Summary: Eliminate double checking of stringToDate and stringToTimestamp inputs Key: SPARK-26805 URL: https://issues.apache.org/jira/browse/SPARK-26805 Project: Spark

[jira] [Resolved] (SPARK-26799) Make ANTLR v4 version consistent between Maven and SBT

2019-01-31 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-26799. - Resolution: Fixed Assignee: Chenxiao Mao Fix Version/s: 3.0.0 > Make ANTLR v4 version

[jira] [Assigned] (SPARK-26734) StackOverflowError on WAL serialization caused by large receivedBlockQueue

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26734: Assignee: (was: Apache Spark) > StackOverflowError on WAL serialization caused by

[jira] [Assigned] (SPARK-26734) StackOverflowError on WAL serialization caused by large receivedBlockQueue

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26734: Assignee: Apache Spark > StackOverflowError on WAL serialization caused by large

[jira] [Issue Comment Deleted] (SPARK-24432) Add support for dynamic resource allocation

2019-01-31 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-24432: --- Comment: was deleted (was: vanzin closed pull request #22722: [SPARK-24432][k8s] Add

[jira] [Assigned] (SPARK-24432) Add support for dynamic resource allocation

2019-01-31 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-24432: -- Assignee: Marcelo Vanzin > Add support for dynamic resource allocation >

[jira] [Assigned] (SPARK-24432) Add support for dynamic resource allocation

2019-01-31 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-24432: -- Assignee: (was: Marcelo Vanzin) > Add support for dynamic resource allocation >

[jira] [Created] (SPARK-26804) Spark sql carries newline char from last csv column when imported

2019-01-31 Thread Raj (JIRA)
Raj created SPARK-26804: --- Summary: Spark sql carries newline char from last csv column when imported Key: SPARK-26804 URL: https://issues.apache.org/jira/browse/SPARK-26804 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-26803) include sbin subdirectory in pyspark

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26803: Assignee: (was: Apache Spark) > include sbin subdirectory in pyspark >

[jira] [Updated] (SPARK-26803) include sbin subdirectory in pyspark

2019-01-31 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-26803: -- Shepherd: (was: Sean Owen) > include sbin subdirectory in pyspark >

[jira] [Assigned] (SPARK-26803) include sbin subdirectory in pyspark

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26803: Assignee: Apache Spark > include sbin subdirectory in pyspark >

[jira] [Created] (SPARK-26803) include sbin subdirectory in pyspark

2019-01-31 Thread Oliver Urs Lenz (JIRA)
Oliver Urs Lenz created SPARK-26803: --- Summary: include sbin subdirectory in pyspark Key: SPARK-26803 URL: https://issues.apache.org/jira/browse/SPARK-26803 Project: Spark Issue Type:

[jira] [Commented] (SPARK-24736) --py-files not functional for non local URLs. It appears to pass non-local URL's into PYTHONPATH directly.

2019-01-31 Thread Oleg Frenkel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24736?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757663#comment-16757663 ] Oleg Frenkel commented on SPARK-24736: -- Supporting local files with --py-files would be great. By

[jira] [Updated] (SPARK-26726) Synchronize the amount of memory used by the broadcast variable to the UI display

2019-01-31 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-26726: --- Fix Version/s: 2.3.3 > Synchronize the amount of memory used by the broadcast variable to

[jira] [Updated] (SPARK-26802) CVE-2018-11760: Apache Spark local privilege escalation vulnerability

2019-01-31 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-26802: - Description: Severity: Important Vendor: The Apache Software Foundation Versions affected:

[jira] [Created] (SPARK-26802) CVE-2018-11760: Apache Spark local privilege escalation vulnerability

2019-01-31 Thread Imran Rashid (JIRA)
Imran Rashid created SPARK-26802: Summary: CVE-2018-11760: Apache Spark local privilege escalation vulnerability Key: SPARK-26802 URL: https://issues.apache.org/jira/browse/SPARK-26802 Project: Spark

[jira] [Resolved] (SPARK-26802) CVE-2018-11760: Apache Spark local privilege escalation vulnerability

2019-01-31 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-26802. -- Resolution: Fixed > CVE-2018-11760: Apache Spark local privilege escalation vulnerability >

[jira] [Assigned] (SPARK-26726) Synchronize the amount of memory used by the broadcast variable to the UI display

2019-01-31 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-26726: -- Assignee: hantiantian > Synchronize the amount of memory used by the broadcast

[jira] [Created] (SPARK-26801) Spark unable to read valid avro types

2019-01-31 Thread Dhruve Ashar (JIRA)
Dhruve Ashar created SPARK-26801: Summary: Spark unable to read valid avro types Key: SPARK-26801 URL: https://issues.apache.org/jira/browse/SPARK-26801 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-26726) Synchronize the amount of memory used by the broadcast variable to the UI display

2019-01-31 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-26726. Resolution: Fixed Fix Version/s: 2.4.1 3.0.0 Issue resolved by

[jira] [Assigned] (SPARK-26744) Support schema validation in File Source V2

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26744: Assignee: (was: Apache Spark) > Support schema validation in File Source V2 >

[jira] [Assigned] (SPARK-26744) Support schema validation in File Source V2

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26744: Assignee: Apache Spark > Support schema validation in File Source V2 >

[jira] [Updated] (SPARK-26744) Support schema validation in File Source V2

2019-01-31 Thread Gengliang Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gengliang Wang updated SPARK-26744: --- Summary: Support schema validation in File Source V2 (was: Create API supportDataType in

[jira] [Assigned] (SPARK-26799) Make ANTLR v4 version consistent between Maven and SBT

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26799: Assignee: (was: Apache Spark) > Make ANTLR v4 version consistent between Maven and

[jira] [Updated] (SPARK-26800) JDBC - MySQL nullable option is ignored

2019-01-31 Thread Francisco Miguel Biete Banon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francisco Miguel Biete Banon updated SPARK-26800: - Description: Spark 2.4.0 MySQL 5.7.21 (docker official MySQL

[jira] [Assigned] (SPARK-26799) Make ANTLR v4 version consistent between Maven and SBT

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26799: Assignee: Apache Spark > Make ANTLR v4 version consistent between Maven and SBT >

[jira] [Created] (SPARK-26800) JDBC - MySQL nullable option is ignored

2019-01-31 Thread Francisco Miguel Biete Banon (JIRA)
Francisco Miguel Biete Banon created SPARK-26800: Summary: JDBC - MySQL nullable option is ignored Key: SPARK-26800 URL: https://issues.apache.org/jira/browse/SPARK-26800 Project:

[jira] [Created] (SPARK-26799) Make ANTLR v4 version consistent between Maven and SBT

2019-01-31 Thread Chenxiao Mao (JIRA)
Chenxiao Mao created SPARK-26799: Summary: Make ANTLR v4 version consistent between Maven and SBT Key: SPARK-26799 URL: https://issues.apache.org/jira/browse/SPARK-26799 Project: Spark Issue

[jira] [Assigned] (SPARK-26798) HandleNullInputsForUDF should trust nullability

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26798: Assignee: Wenchen Fan (was: Apache Spark) > HandleNullInputsForUDF should trust

[jira] [Assigned] (SPARK-26798) HandleNullInputsForUDF should trust nullability

2019-01-31 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-26798: Assignee: Apache Spark (was: Wenchen Fan) > HandleNullInputsForUDF should trust

[jira] [Created] (SPARK-26798) HandleNullInputsForUDF should trust nullability

2019-01-31 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-26798: --- Summary: HandleNullInputsForUDF should trust nullability Key: SPARK-26798 URL: https://issues.apache.org/jira/browse/SPARK-26798 Project: Spark Issue Type:

[jira] [Commented] (SPARK-17998) Reading Parquet files coalesces parts into too few in-memory partitions

2019-01-31 Thread Nicholas Resnick (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757326#comment-16757326 ] Nicholas Resnick commented on SPARK-17998: -- Going to answer my question: it is in fact a

[jira] [Comment Edited] (SPARK-17998) Reading Parquet files coalesces parts into too few in-memory partitions

2019-01-31 Thread Nicholas Resnick (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757326#comment-16757326 ] Nicholas Resnick edited comment on SPARK-17998 at 1/31/19 3:03 PM: ---

[jira] [Commented] (SPARK-26786) Handle to treat escaped newline characters('\r','\n') in spark csv

2019-01-31 Thread vishnuram selvaraj (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757327#comment-16757327 ] vishnuram selvaraj commented on SPARK-26786: We should be able to treat the escaped newlines

[jira] [Comment Edited] (SPARK-17998) Reading Parquet files coalesces parts into too few in-memory partitions

2019-01-31 Thread Nicholas Resnick (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757326#comment-16757326 ] Nicholas Resnick edited comment on SPARK-17998 at 1/31/19 3:02 PM: ---

[jira] [Commented] (SPARK-25153) Improve error messages for columns with dots/periods

2019-01-31 Thread Mikhail (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757286#comment-16757286 ] Mikhail commented on SPARK-25153: - Hello [~blavigne] Are you still working on this? Hello [~holdenk]

[jira] [Assigned] (SPARK-26673) File source V2 write: create framework and migrate ORC to it

2019-01-31 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-26673: --- Assignee: Gengliang Wang > File source V2 write: create framework and migrate ORC to it >

[jira] [Resolved] (SPARK-26673) File source V2 write: create framework and migrate ORC to it

2019-01-31 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-26673. - Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 23601

[jira] [Created] (SPARK-26797) Start using the new logical types API of Parquet 1.11.0 instead of the deprecated one

2019-01-31 Thread Zoltan Ivanfi (JIRA)
Zoltan Ivanfi created SPARK-26797: - Summary: Start using the new logical types API of Parquet 1.11.0 instead of the deprecated one Key: SPARK-26797 URL: https://issues.apache.org/jira/browse/SPARK-26797

[jira] [Comment Edited] (SPARK-26345) Parquet support Column indexes

2019-01-31 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757209#comment-16757209 ] Zoltan Ivanfi edited comment on SPARK-26345 at 1/31/19 1:02 PM: Please

[jira] [Commented] (SPARK-26345) Parquet support Column indexes

2019-01-31 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757209#comment-16757209 ] Zoltan Ivanfi commented on SPARK-26345: --- Please note that column indexes will automatically get

[jira] [Commented] (SPARK-25136) unable to use HDFS checkpoint directories after driver restart

2019-01-31 Thread Gabor Somogyi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16757199#comment-16757199 ] Gabor Somogyi commented on SPARK-25136: --- [~kerbylane] did you have time to check it? > unable to

[jira] [Updated] (SPARK-26795) Retry remote fileSegmentManagedBuffer when creating inputStream failed during shuffle read phase

2019-01-31 Thread feiwang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feiwang updated SPARK-26795: Labels: (was: shuffle) > Retry remote fileSegmentManagedBuffer when creating inputStream failed during

[jira] [Updated] (SPARK-26795) Retry remote fileSegmentManagedBuffer when creating inputStream failed during shuffle read phase

2019-01-31 Thread feiwang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feiwang updated SPARK-26795: Target Version/s: 2.4.0, 2.3.2 (was: 2.3.2, 2.4.0) Labels: shuffle (was: ) > Retry remote

[jira] [Updated] (SPARK-26795) Retry remote fileSegmentManagedBuffer when creating inputStream failed during shuffle read phase

2019-01-31 Thread feiwang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feiwang updated SPARK-26795: Description: There is a parameter *spark.maxRemoteBlockSizeFetchToMem*, which means the remote block

[jira] [Updated] (SPARK-26795) Retry remote fileSegmentManagedBuffer when creating inputStream failed during shuffle read phase

2019-01-31 Thread feiwang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feiwang updated SPARK-26795: Description: There is a parameter `spark.maxRemoteBlockSizeFetchToMem`, which means the remote block

[jira] [Updated] (SPARK-26795) Retry remote fileSegmentManagedBuffer when creating inputStream failed during shuffle read phase

2019-01-31 Thread feiwang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feiwang updated SPARK-26795: Description: There is a parameter spark.maxRemoteBlockSizeFetchToMem, which means the remote block will

[jira] [Updated] (SPARK-26795) Retry remote fileSegmentManagedBuffer when creating inputStream failed during shuffle read phase

2019-01-31 Thread feiwang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] feiwang updated SPARK-26795: Description: There is a parameter spark.maxRemoteBlockSizeFetchToMem, which means the remote block will

[jira] [Assigned] (SPARK-24023) Built-in SQL Functions improvement in SparkR

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-24023: Assignee: Huaxin Gao > Built-in SQL Functions improvement in SparkR >

[jira] [Resolved] (SPARK-24023) Built-in SQL Functions improvement in SparkR

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24023. -- Resolution: Done > Built-in SQL Functions improvement in SparkR >

[jira] [Assigned] (SPARK-24779) Add map_concat / map_from_entries / an option in months_between UDF to disable rounding-off

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reassigned SPARK-24779: Assignee: Huaxin Gao > Add map_concat / map_from_entries / an option in months_between

[jira] [Updated] (SPARK-24779) Add map_concat / map_from_entries / an option in months_between UDF to disable rounding-off

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24779: - Summary: Add map_concat / map_from_entries / an option in months_between UDF to disable

[jira] [Resolved] (SPARK-24779) Add map_concat / map_from_entries / an option in months_between UDF to disable rounding-off

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24779. -- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21835

[jira] [Updated] (SPARK-24779) Add sequence / map_concat / map_from_entries / an option in months_between UDF to disable rounding-off

2019-01-31 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24779: - Description: Add R versions of  * map_concat   -SPARK-23936- * map_from_entries   SPARK-23934

[jira] [Updated] (SPARK-26796) Testcases failing with "org.apache.hadoop.fs.ChecksumException" error

2019-01-31 Thread Anuja Jakhade (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anuja Jakhade updated SPARK-26796: -- Environment: Ubuntu 16.04  Java Version openjdk version "1.8.0_192" OpenJDK Runtime
