[jira] [Commented] (SPARK-19256) Hive bucketing support

2018-04-24 Thread Xianjin YE (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451689#comment-16451689 ] Xianjin YE commented on SPARK-19256: Hi [~tejasp] [~cloud_fan], are you still working on this? We

[jira] [Updated] (SPARK-24009) spark2.3.0 INSERT OVERWRITE LOCAL DIRECTORY '/home/spark/aaaaab'

2018-04-24 Thread chris_j (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] chris_j updated SPARK-24009: Description: local mode spark execute "INSERT OVERWRITE LOCAL DIRECTORY " successfully. on yarn spark

[jira] [Commented] (SPARK-24078) reduce with unionAll takes a long time

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451679#comment-16451679 ] Hyukjin Kwon commented on SPARK-24078: -- Would you be able to test this in higher versions? > reduce

[jira] [Resolved] (SPARK-24077) Why spark SQL not support `CREATE TEMPORARY FUNCTION IF NOT EXISTS`?

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-24077. -- Resolution: Invalid Fix Version/s: (was: 3.0.0) Target Version/s: (was:

[jira] [Commented] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451677#comment-16451677 ] Hyukjin Kwon commented on SPARK-24074: -- I haven't looked into this yet but doesn't that sound more

[jira] [Commented] (SPARK-5594) SparkException: Failed to get broadcast (TorrentBroadcast)

2018-04-24 Thread Spark User (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451666#comment-16451666 ] Spark User commented on SPARK-5594: --- In my case, this issue was happening when spark context doesn't

[jira] [Created] (SPARK-24081) Spark SQL drops the table while writing into table in "overwrite" mode.

2018-04-24 Thread Ashish (JIRA)
Ashish created SPARK-24081: -- Summary: Spark SQL drops the table while writing into table in "overwrite" mode. Key: SPARK-24081 URL: https://issues.apache.org/jira/browse/SPARK-24081 Project: Spark

[jira] [Commented] (SPARK-24036) Stateful operators in continuous processing

2018-04-24 Thread Jungtaek Lim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451653#comment-16451653 ] Jungtaek Lim commented on SPARK-24036: -- Hello, I'm quite interested to this issue since I just read

[jira] [Created] (SPARK-24080) Update the nullability of Filter output based on inferred predicates

2018-04-24 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-24080: Summary: Update the nullability of Filter output based on inferred predicates Key: SPARK-24080 URL: https://issues.apache.org/jira/browse/SPARK-24080

[jira] [Commented] (SPARK-24079) Update the nullability of Join output based on inferred predicates

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451650#comment-16451650 ] Apache Spark commented on SPARK-24079: -- User 'maropu' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24079) Update the nullability of Join output based on inferred predicates

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24079: Assignee: (was: Apache Spark) > Update the nullability of Join output based on

[jira] [Assigned] (SPARK-24079) Update the nullability of Join output based on inferred predicates

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24079: Assignee: Apache Spark > Update the nullability of Join output based on inferred

[jira] [Created] (SPARK-24079) Update the nullability of Join output based on inferred predicates

2018-04-24 Thread Takeshi Yamamuro (JIRA)
Takeshi Yamamuro created SPARK-24079: Summary: Update the nullability of Join output based on inferred predicates Key: SPARK-24079 URL: https://issues.apache.org/jira/browse/SPARK-24079 Project:

[jira] [Assigned] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-24070: --- Assignee: Takeshi Yamamuro > TPC-DS Performance Tests for Parquet 1.10.0 Upgrade >

[jira] [Updated] (SPARK-24078) reduce with unionAll takes a long time

2018-04-24 Thread zhangsongcheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangsongcheng updated SPARK-24078: --- Description: I try to sample the traning sets with each category,and then uion all samples

[jira] [Commented] (SPARK-23799) [CBO] FilterEstimation.evaluateInSet produces devision by zero in a case of empty table with analyzed statistics

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451609#comment-16451609 ] Apache Spark commented on SPARK-23799: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Updated] (SPARK-24078) reduce with unionAll takes a long time

2018-04-24 Thread zhangsongcheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangsongcheng updated SPARK-24078: --- Description: I try to sample the traning sets with each category,and then uion all samples

[jira] [Updated] (SPARK-24078) reduce with unionAll takes a long time

2018-04-24 Thread zhangsongcheng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24078?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhangsongcheng updated SPARK-24078: --- Description: I try to sample the traning sets with each category,and then uion all samples

[jira] [Created] (SPARK-24078) reduce with unionAll takes a long time

2018-04-24 Thread zhangsongcheng (JIRA)
zhangsongcheng created SPARK-24078: -- Summary: reduce with unionAll takes a long time Key: SPARK-24078 URL: https://issues.apache.org/jira/browse/SPARK-24078 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-24076) very bad performance when shuffle.partition = 8192

2018-04-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451540#comment-16451540 ] yucai commented on SPARK-24076: --- shuffle.partition = 8192 !p1.png! shuffle.partition = 8000 !p2.png! >

[jira] [Updated] (SPARK-24076) very bad performance when shuffle.partition = 8192

2018-04-24 Thread yucai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yucai updated SPARK-24076: -- Attachment: p2.png p1.png > very bad performance when shuffle.partition = 8192 >

[jira] [Created] (SPARK-24077) Why spark SQL not support `CREATE TEMPORARY FUNCTION IF NOT EXISTS`?

2018-04-24 Thread Benedict Jin (JIRA)
Benedict Jin created SPARK-24077: Summary: Why spark SQL not support `CREATE TEMPORARY FUNCTION IF NOT EXISTS`? Key: SPARK-24077 URL: https://issues.apache.org/jira/browse/SPARK-24077 Project: Spark

[jira] [Created] (SPARK-24076) very bad performance when shuffle.partition = 8192

2018-04-24 Thread yucai (JIRA)
yucai created SPARK-24076: - Summary: very bad performance when shuffle.partition = 8192 Key: SPARK-24076 URL: https://issues.apache.org/jira/browse/SPARK-24076 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-23821) High-order function: flatten(x) → array

2018-04-24 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin resolved SPARK-23821. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20938

[jira] [Assigned] (SPARK-23821) High-order function: flatten(x) → array

2018-04-24 Thread Takuya Ueshin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takuya Ueshin reassigned SPARK-23821: - Assignee: Marek Novotny > High-order function: flatten(x) → array >

[jira] [Created] (SPARK-24075) [Mesos] Supervised driver upon failure will be retried indefinitely unless explicitly killed

2018-04-24 Thread Yogesh Natarajan (JIRA)
Yogesh Natarajan created SPARK-24075: Summary: [Mesos] Supervised driver upon failure will be retried indefinitely unless explicitly killed Key: SPARK-24075 URL:

[jira] [Commented] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Nadav Samet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451499#comment-16451499 ] Nadav Samet commented on SPARK-24074: - I was only able to reproduce this problem with this particular

[jira] [Updated] (SPARK-24064) [Spark SQL] Create table using csv does not support binary column Type

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24064: - Target Version/s: (was: 2.3.1) Please avoid to set a target version which is usually set by a

[jira] [Commented] (SPARK-24068) CSV schema inferring doesn't work for compressed files

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451483#comment-16451483 ] Hyukjin Kwon commented on SPARK-24068: -- Hm, [~maxgekk], btw is this specific to CSV (not, for

[jira] [Updated] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-24074: - Priority: Major (was: Critical) Please avoid to set Critical+ which is usually reserved for

[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451466#comment-16451466 ] Takeshi Yamamuro commented on SPARK-24070: -- ok > TPC-DS Performance Tests for Parquet 1.10.0

[jira] [Resolved] (SPARK-24038) refactor continuous write exec to its own class

2018-04-24 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-24038. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21116

[jira] [Assigned] (SPARK-24038) refactor continuous write exec to its own class

2018-04-24 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das reassigned SPARK-24038: - Assignee: Jose Torres > refactor continuous write exec to its own class >

[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451427#comment-16451427 ] Xiao Li commented on SPARK-24070: - Yeah, please do it here. Thanks! If you have the bandwidth to write

[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451404#comment-16451404 ] Takeshi Yamamuro commented on SPARK-24070: -- ok, this ticket means we will put the performance

[jira] [Updated] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Nadav Samet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nadav Samet updated SPARK-24074: Environment: (was: {code:java} // code placeholder {code}) > Maven package resolver downloads

[jira] [Updated] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Nadav Samet (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nadav Samet updated SPARK-24074: Description: {code:java} // code placeholder {code} From some reason spark downloads a javadoc

[jira] [Created] (SPARK-24074) Maven package resolver downloads javadoc instead of jar

2018-04-24 Thread Nadav Samet (JIRA)
Nadav Samet created SPARK-24074: --- Summary: Maven package resolver downloads javadoc instead of jar Key: SPARK-24074 URL: https://issues.apache.org/jira/browse/SPARK-24074 Project: Spark Issue

[jira] [Commented] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-24 Thread Henry Robinson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451263#comment-16451263 ] Henry Robinson commented on SPARK-23852: Yes it has - the Parquet community are going to do a

[jira] [Resolved] (SPARK-24056) Make consumer creation lazy in Kafka source for Structured streaming

2018-04-24 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-24056. --- Resolution: Fixed Fix Version/s: 3.0.0 Issue resolved by pull request 21134

[jira] [Commented] (SPARK-24051) Incorrect results for certain queries using Java and Python APIs on Spark 2.3.0

2018-04-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451241#comment-16451241 ] Marco Gaido commented on SPARK-24051: - [~hvanhovell] I am not sure that the analysis barriers are the

[jira] [Assigned] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-20114: - Assignee: Weichen Xu > spark.ml parity for sequential pattern mining -

[jira] [Commented] (SPARK-23654) Cut jets3t as a dependency of spark-core

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450449#comment-16450449 ] Apache Spark commented on SPARK-23654: -- User 'steveloughran' has created a pull request for this

[jira] [Assigned] (SPARK-23654) Cut jets3t as a dependency of spark-core

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23654: Assignee: Apache Spark > Cut jets3t as a dependency of spark-core >

[jira] [Updated] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-20114: -- Shepherd: Joseph K. Bradley > spark.ml parity for sequential pattern mining -

[jira] [Assigned] (SPARK-23654) Cut jets3t as a dependency of spark-core

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23654: Assignee: (was: Apache Spark) > Cut jets3t as a dependency of spark-core >

[jira] [Updated] (SPARK-20114) spark.ml parity for sequential pattern mining - PrefixSpan

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-20114: -- Target Version/s: 2.4.0 > spark.ml parity for sequential pattern mining - PrefixSpan >

[jira] [Commented] (SPARK-24073) DataSourceV2: Rename DataReaderFactory back to ReadTask.

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450443#comment-16450443 ] Apache Spark commented on SPARK-24073: -- User 'rdblue' has created a pull request for this issue:

[jira] [Assigned] (SPARK-24073) DataSourceV2: Rename DataReaderFactory back to ReadTask.

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24073: Assignee: Apache Spark > DataSourceV2: Rename DataReaderFactory back to ReadTask. >

[jira] [Assigned] (SPARK-24073) DataSourceV2: Rename DataReaderFactory back to ReadTask.

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24073: Assignee: (was: Apache Spark) > DataSourceV2: Rename DataReaderFactory back to

[jira] [Updated] (SPARK-23654) Cut jets3t as a dependency of spark-core

2018-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran updated SPARK-23654: --- Summary: Cut jets3t as a dependency of spark-core (was: Cut jets3t as a dependency of

[jira] [Created] (SPARK-24073) DataSourceV2: Rename DataReaderFactory back to ReadTask.

2018-04-24 Thread Ryan Blue (JIRA)
Ryan Blue created SPARK-24073: - Summary: DataSourceV2: Rename DataReaderFactory back to ReadTask. Key: SPARK-24073 URL: https://issues.apache.org/jira/browse/SPARK-24073 Project: Spark Issue

[jira] [Commented] (SPARK-24043) InterpretedPredicate.eval fails if expression tree contains Nondeterministic expressions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450426#comment-16450426 ] Apache Spark commented on SPARK-24043: -- User 'bersprockets' has created a pull request for this

[jira] [Assigned] (SPARK-24043) InterpretedPredicate.eval fails if expression tree contains Nondeterministic expressions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24043: Assignee: Apache Spark > InterpretedPredicate.eval fails if expression tree contains

[jira] [Assigned] (SPARK-24043) InterpretedPredicate.eval fails if expression tree contains Nondeterministic expressions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24043: Assignee: (was: Apache Spark) > InterpretedPredicate.eval fails if expression tree

[jira] [Updated] (SPARK-24072) clearly define pushed filters

2018-04-24 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan updated SPARK-24072: Summary: clearly define pushed filters (was: remove unused DataSourceV2Relation.pushedFilters) >

[jira] [Resolved] (SPARK-23990) Instruments logging improvements - ML regression package

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-23990. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 21078

[jira] [Commented] (SPARK-24051) Incorrect results for certain queries using Java and Python APIs on Spark 2.3.0

2018-04-24 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450293#comment-16450293 ] Herman van Hovell commented on SPARK-24051: --- [~mgaido] do you have any idea why this is failing

[jira] [Resolved] (SPARK-23455) Default Params in ML should be saved separately

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley resolved SPARK-23455. --- Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20633

[jira] [Assigned] (SPARK-24072) remove unused DataSourceV2Relation.pushedFilters

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24072: Assignee: Wenchen Fan (was: Apache Spark) > remove unused

[jira] [Assigned] (SPARK-24072) remove unused DataSourceV2Relation.pushedFilters

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24072: Assignee: Apache Spark (was: Wenchen Fan) > remove unused

[jira] [Commented] (SPARK-24072) remove unused DataSourceV2Relation.pushedFilters

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450273#comment-16450273 ] Apache Spark commented on SPARK-24072: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Created] (SPARK-24072) remove unused DataSourceV2Relation.pushedFilters

2018-04-24 Thread Wenchen Fan (JIRA)
Wenchen Fan created SPARK-24072: --- Summary: remove unused DataSourceV2Relation.pushedFilters Key: SPARK-24072 URL: https://issues.apache.org/jira/browse/SPARK-24072 Project: Spark Issue Type:

[jira] [Commented] (SPARK-23933) High-order function: map(array, array) → map<K,V>

2018-04-24 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23933?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450262#comment-16450262 ] Kazuaki Ishizaki commented on SPARK-23933: -- cc [~smilegator] > High-order function: map(array,

[jira] [Commented] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450243#comment-16450243 ] Xiao Li commented on SPARK-24070: - cc [~maropu] > TPC-DS Performance Tests for Parquet 1.10.0 Upgrade >

[jira] [Created] (SPARK-24071) Micro-benchmark of Parquet Filter Pushdown

2018-04-24 Thread Xiao Li (JIRA)
Xiao Li created SPARK-24071: --- Summary: Micro-benchmark of Parquet Filter Pushdown Key: SPARK-24071 URL: https://issues.apache.org/jira/browse/SPARK-24071 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-24070) TPC-DS Performance Tests for Parquet 1.10.0 Upgrade

2018-04-24 Thread Xiao Li (JIRA)
Xiao Li created SPARK-24070: --- Summary: TPC-DS Performance Tests for Parquet 1.10.0 Upgrade Key: SPARK-24070 URL: https://issues.apache.org/jira/browse/SPARK-24070 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-23807) Add Hadoop 3 profile with relevant POM fix ups

2018-04-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-23807. Resolution: Fixed Fix Version/s: 2.4.0 Issue resolved by pull request 20923

[jira] [Assigned] (SPARK-23807) Add Hadoop 3 profile with relevant POM fix ups

2018-04-24 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin reassigned SPARK-23807: -- Assignee: Steve Loughran > Add Hadoop 3 profile with relevant POM fix ups >

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Julien Cuquemelle (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450186#comment-16450186 ] Julien Cuquemelle commented on SPARK-22683: --- Thanks for all your comments and proposals :) >

[jira] [Resolved] (SPARK-24052) Support spark version showing on environment page

2018-04-24 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-24052. --- Resolution: Not A Problem > Support spark version showing on environment page >

[jira] [Commented] (SPARK-23975) Allow Clustering to take Arrays of Double as input features

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450161#comment-16450161 ] Joseph K. Bradley commented on SPARK-23975: --- I merged

[jira] [Assigned] (SPARK-23975) Allow Clustering to take Arrays of Double as input features

2018-04-24 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-23975: - Assignee: Lu Wang > Allow Clustering to take Arrays of Double as input features

[jira] [Assigned] (SPARK-24069) Add array_max / array_min functions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24069: Assignee: (was: Apache Spark) > Add array_max / array_min functions >

[jira] [Commented] (SPARK-24069) Add array_max / array_min functions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450142#comment-16450142 ] Apache Spark commented on SPARK-24069: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-24069) Add array_max / array_min functions

2018-04-24 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-24069: Assignee: Apache Spark > Add array_max / array_min functions >

[jira] [Created] (SPARK-24069) Add array_max / array_min functions

2018-04-24 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-24069: Summary: Add array_max / array_min functions Key: SPARK-24069 URL: https://issues.apache.org/jira/browse/SPARK-24069 Project: Spark Issue Type: Sub-task

[jira] [Assigned] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves reassigned SPARK-22683: - Assignee: Julien Cuquemelle > DynamicAllocation wastes resources by allocating

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450131#comment-16450131 ] Thomas Graves commented on SPARK-22683: --- Note this added a new config 

[jira] [Resolved] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2018-04-24 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-22683. --- Resolution: Fixed Fix Version/s: 2.4.0 > DynamicAllocation wastes resources by

[jira] [Commented] (SPARK-24000) S3A: Create Table should fail on invalid AK/SK

2018-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450024#comment-16450024 ] Steve Loughran commented on SPARK-24000: We could consider whether or not to raise an

[jira] [Commented] (SPARK-23852) Parquet MR bug can lead to incorrect SQL results

2018-04-24 Thread Eric Maynard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16450018#comment-16450018 ] Eric Maynard commented on SPARK-23852: -- >There is no upstream release of Parquet that

[jira] [Commented] (SPARK-23519) Create View Commands Fails with The view output (col1,col1) contains duplicate column name

2018-04-24 Thread Eric Maynard (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449948#comment-16449948 ] Eric Maynard commented on SPARK-23519: -- Why is the fact that you dynamically generate the statement

[jira] [Comment Edited] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-04-24 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449935#comment-16449935 ] Li Jin edited comment on SPARK-22947 at 4/24/18 2:16 PM: - I came across this blog

[jira] [Commented] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-04-24 Thread Li Jin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449935#comment-16449935 ] Li Jin commented on SPARK-22947: I came across this blog today:

[jira] [Created] (SPARK-24068) CSV schema inferring doesn't work for compressed files

2018-04-24 Thread Maxim Gekk (JIRA)
Maxim Gekk created SPARK-24068: -- Summary: CSV schema inferring doesn't work for compressed files Key: SPARK-24068 URL: https://issues.apache.org/jira/browse/SPARK-24068 Project: Spark Issue

[jira] [Updated] (SPARK-23182) Allow enabling of TCP keep alive for master RPC connections

2018-04-24 Thread Petar Petrov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-23182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Petar Petrov updated SPARK-23182: - Affects Version/s: 2.2.2 > Allow enabling of TCP keep alive for master RPC connections >

[jira] [Commented] (SPARK-18673) Dataframes doesn't work on Hadoop 3.x; Hive rejects Hadoop version

2018-04-24 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449857#comment-16449857 ] Steve Loughran commented on SPARK-18673: It's a big hive patch, but most of it is hbase related.

[jira] [Updated] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-04-24 Thread Joachim Hereth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joachim Hereth updated SPARK-24067: --- Description: SPARK-17147 fixes a problem with non-consecutive Kafka Offsets. The  [PR

[jira] [Updated] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-04-24 Thread Joachim Hereth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joachim Hereth updated SPARK-24067: --- Affects Version/s: (was: 2.0.0) 2.3.0 > Backport SPARK-17147 to

[jira] [Updated] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-04-24 Thread Joachim Hereth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joachim Hereth updated SPARK-24067: --- Fix Version/s: (was: 2.4.0) > Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10

[jira] [Updated] (SPARK-24067) Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction))

2018-04-24 Thread Joachim Hereth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joachim Hereth updated SPARK-24067: --- Summary: Backport SPARK-17147 to 2.3 (Spark Streaming Kafka 0.10 Consumer Can't Handle

[jira] [Updated] (SPARK-24067) Spark 2.3 Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2018-04-24 Thread Joachim Hereth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joachim Hereth updated SPARK-24067: --- Description: Spark 2.3 Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets

[jira] [Updated] (SPARK-24067) Backport SPARK-17147 to 2.3

2018-04-24 Thread Joachim Hereth (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joachim Hereth updated SPARK-24067: --- Summary: Backport SPARK-17147 to 2.3 (was: Spark 2.3 Streaming Kafka 0.10 Consumer Can't

[jira] [Created] (SPARK-24067) Spark 2.3 Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction)

2018-04-24 Thread Joachim Hereth (JIRA)
Joachim Hereth created SPARK-24067: -- Summary: Spark 2.3 Streaming Kafka 0.10 Consumer Can't Handle Non-consecutive Offsets (i.e. Log Compaction) Key: SPARK-24067 URL:

[jira] [Comment Edited] (SPARK-13699) Spark SQL drops the table in "overwrite" mode while writing into table

2018-04-24 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449684#comment-16449684 ] Manish Kumar edited comment on SPARK-13699 at 4/24/18 11:50 AM: I am not

[jira] [Commented] (SPARK-13699) Spark SQL drops the table in "overwrite" mode while writing into table

2018-04-24 Thread Manish Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449684#comment-16449684 ] Manish Kumar commented on SPARK-13699: -- I am not sure whether the issue is resolved or not. But as a

[jira] [Commented] (SPARK-13699) Spark SQL drops the table in "overwrite" mode while writing into table

2018-04-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449678#comment-16449678 ] Hyukjin Kwon commented on SPARK-13699: -- Mind opening a separate JIRA with details and a reproducer

[jira] [Commented] (SPARK-24051) Incorrect results for certain queries using Java and Python APIs on Spark 2.3.0

2018-04-24 Thread Marco Gaido (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449670#comment-16449670 ] Marco Gaido commented on SPARK-24051: - I was able to reproduce the issue. It is due to the usage of

[jira] [Commented] (SPARK-13699) Spark SQL drops the table in "overwrite" mode while writing into table

2018-04-24 Thread Ashish (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449661#comment-16449661 ] Ashish commented on SPARK-13699: Is this issue going to be resolved? I am facing the same issue while writing to
