[jira] [Created] (SPARK-19605) Fail it if existing resource is not enough to run streaming job

2017-02-14 Thread Genmao Yu (JIRA)
Genmao Yu created SPARK-19605: Summary: Fail it if existing resource is not enough to run streaming job Key: SPARK-19605 URL: https://issues.apache.org/jira/browse/SPARK-19605 Project: Spark

[jira] [Commented] (SPARK-19594) StreamingQueryListener fails to handle QueryTerminatedEvent if more then one listeners exists

2017-02-14 Thread Eyal Zituny (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867409#comment-15867409 ] Eyal Zituny commented on SPARK-19594: - sure, i can do that, will it make sense to fix it by marking

[jira] [Commented] (SPARK-19598) Remove the alias parameter in UnresolvedRelation

2017-02-14 Thread Song Jun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867398#comment-15867398 ] Song Jun commented on SPARK-19598: -- OK~ I'd like to do this. Thank you very much! > Remove the alias

[jira] [Commented] (SPARK-16742) Kerberos support for Spark on Mesos

2017-02-14 Thread Abel Rincón (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867374#comment-15867374 ] Abel Rincón commented on SPARK-16742: - Hi all we are working on a solution with hadoop delegation

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-02-14 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867366#comment-15867366 ] Yun Ni commented on SPARK-18392: I agree with Seth. We need to first finish SPARK-18080 and SPARK-18450

[jira] [Comment Edited] (SPARK-19442) Unable to add column to the dataset using Dataset.WithColumn() api

2017-02-14 Thread Navya Krishnappa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867357#comment-15867357 ] Navya Krishnappa edited comment on SPARK-19442 at 2/15/17 7:04 AM: ---

[jira] [Commented] (SPARK-19442) Unable to add column to the dataset using Dataset.WithColumn() api

2017-02-14 Thread Navya Krishnappa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867357#comment-15867357 ] Navya Krishnappa commented on SPARK-19442: -- Thank you [~hyukjin.kwon]. It is satisfied my

[jira] [Assigned] (SPARK-19604) Log the start of every Python test

2017-02-14 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai reassigned SPARK-19604: Assignee: Yin Huai > Log the start of every Python test

[jira] [Comment Edited] (SPARK-19593) Records read per each kinesis transaction

2017-02-14 Thread Sarath Chandra Jiguru (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867214#comment-15867214 ] Sarath Chandra Jiguru edited comment on SPARK-19593 at 2/15/17 5:52 AM:

[jira] [Commented] (SPARK-19442) Unable to add column to the dataset using Dataset.WithColumn() api

2017-02-14 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867277#comment-15867277 ] Hyukjin Kwon commented on SPARK-19442: -- ping [~Navya Krishnappa], would this satisfy your demand?

[jira] [Assigned] (SPARK-19604) Log the start of every Python test

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19604: Assignee: Apache Spark > Log the start of every Python test >

[jira] [Commented] (SPARK-19593) Records read per each kinesis transaction

2017-02-14 Thread Sarath Chandra Jiguru (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867214#comment-15867214 ] Sarath Chandra Jiguru commented on SPARK-19593: --- See the Type of the ticket is question. In

[jira] [Commented] (SPARK-19604) Log the start of every Python test

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867212#comment-15867212 ] Apache Spark commented on SPARK-19604: -- User 'yhuai' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19604) Log the start of every Python test

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19604: Assignee: (was: Apache Spark) > Log the start of every Python test >

[jira] [Created] (SPARK-19604) Log the start of every Python test

2017-02-14 Thread Yin Huai (JIRA)
Yin Huai created SPARK-19604: Summary: Log the start of every Python test Key: SPARK-19604 URL: https://issues.apache.org/jira/browse/SPARK-19604 Project: Spark Issue Type: Test

[jira] [Updated] (SPARK-19593) Records read per each kinesis transaction

2017-02-14 Thread Sarath Chandra Jiguru (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sarath Chandra Jiguru updated SPARK-19593: -- Description: The question is related to spark streaming+kinesis integration

[jira] [Commented] (SPARK-19556) Broadcast data is not encrypted when I/O encryption is on

2017-02-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867161#comment-15867161 ] Marcelo Vanzin commented on SPARK-19556: We don't generally assign bugs. Leaving a message should

[jira] [Commented] (SPARK-19556) Broadcast data is not encrypted when I/O encryption is on

2017-02-14 Thread Genmao Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867152#comment-15867152 ] Genmao Yu commented on SPARK-19556: --- [~vanzin] I am working on this, could you please assign it to me?

[jira] [Commented] (SPARK-18113) Sending AskPermissionToCommitOutput failed, driver enter into task deadloop

2017-02-14 Thread xukun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867124#comment-15867124 ] xukun commented on SPARK-18113: --- [~aash] According my scenario and

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-02-14 Thread Mingjie Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867064#comment-15867064 ] Mingjie Tang commented on SPARK-18392: -- Sure, AND-amp is important and basic for current LSH. We can

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-02-14 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867049#comment-15867049 ] Seth Hendrickson commented on SPARK-18392: -- I would pretty strongly prefer to focus on adding

[jira] [Commented] (SPARK-18392) LSH API, algorithm, and documentation follow-ups

2017-02-14 Thread mingjie tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867041#comment-15867041 ] mingjie tang commented on SPARK-18392: -- [~yunn] are you working on the BitSampling &

[jira] [Commented] (SPARK-19603) Fix StreamingQuery explain command

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867037#comment-15867037 ] Apache Spark commented on SPARK-19603: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19603) Fix StreamingQuery explain command

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19603: Assignee: Apache Spark (was: Shixiong Zhu) > Fix StreamingQuery explain command >

[jira] [Assigned] (SPARK-19603) Fix StreamingQuery explain command

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19603: Assignee: Shixiong Zhu (was: Apache Spark) > Fix StreamingQuery explain command >

[jira] [Updated] (SPARK-19603) Fix StreamingQuery explain command

2017-02-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19603: - Summary: Fix StreamingQuery explain command (was: Fix the stream explain command) > Fix

[jira] [Created] (SPARK-19603) Fix the stream explain command

2017-02-14 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-19603: Summary: Fix the stream explain command Key: SPARK-19603 URL: https://issues.apache.org/jira/browse/SPARK-19603 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-19602) Unable to query using the fully qualified column name of the form ( ..)

2017-02-14 Thread Sunitha Kambhampati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867024#comment-15867024 ] Sunitha Kambhampati commented on SPARK-19602: - Attaching the design doc and the proposed

[jira] [Updated] (SPARK-19602) Unable to query using the fully qualified column name of the form ( ..)

2017-02-14 Thread Sunitha Kambhampati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunitha Kambhampati updated SPARK-19602: Attachment: Design_ColResolution_JIRA19602.docx > Unable to query using the fully

[jira] [Updated] (SPARK-19602) Unable to query using the fully qualified column name of the form ( ..)

2017-02-14 Thread Sunitha Kambhampati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunitha Kambhampati updated SPARK-19602: Description: 1) Spark SQL fails to analyze this query: select db1.t1.i1 from

[jira] [Updated] (SPARK-19602) Unable to query using the fully qualified column name of the form ( ..)

2017-02-14 Thread Sunitha Kambhampati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sunitha Kambhampati updated SPARK-19602: Description: 1) Spark SQL fails to analyze this query: {quote} select db1.t1.i1

[jira] [Created] (SPARK-19602) Unable to query using the fully qualified column name of the form ( ..)

2017-02-14 Thread Sunitha Kambhampati (JIRA)
Sunitha Kambhampati created SPARK-19602: --- Summary: Unable to query using the fully qualified column name of the form ( ..) Key: SPARK-19602 URL: https://issues.apache.org/jira/browse/SPARK-19602

[jira] [Resolved] (SPARK-14894) Python GaussianMixture summary

2017-02-14 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-14894. -- Resolution: Duplicate [~wangmiao1981], I guess we could take an action to JIRA too if we are

[jira] [Commented] (SPARK-19588) Allow putting keytab file to HDFS location specified in spark.yarn.keytab

2017-02-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15867000#comment-15867000 ] Ruslan Dautkhanov commented on SPARK-19588: --- Got it. Thanks [~vanzin] > Allow putting keytab

[jira] [Commented] (SPARK-14894) Python GaussianMixture summary

2017-02-14 Thread Miao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866999#comment-15866999 ] Miao Wang commented on SPARK-14894: --- This is a dup of JIRA-18282. Should be closed. > Python

[jira] [Commented] (SPARK-19528) external shuffle service would close while still have request from executor when dynamic allocation is enabled

2017-02-14 Thread satheessh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866962#comment-15866962 ] satheessh commented on SPARK-19528: --- I am also getting same error from container " ERROR

[jira] [Commented] (SPARK-19588) Allow putting keytab file to HDFS location specified in spark.yarn.keytab

2017-02-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866932#comment-15866932 ] Marcelo Vanzin commented on SPARK-19588: bq. driver/yarn#client holds keytab just to distribute

[jira] [Resolved] (SPARK-19318) Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle`

2017-02-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19318. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16891

[jira] [Assigned] (SPARK-19318) Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle`

2017-02-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19318: --- Assignee: Suresh Thalamati > Docker test case failure: `SPARK-16625: General data types to

[jira] [Commented] (SPARK-19588) Allow putting keytab file to HDFS location specified in spark.yarn.keytab

2017-02-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866839#comment-15866839 ] Ruslan Dautkhanov commented on SPARK-19588: --- driver/yarn#client holds keytab just to distribute

[jira] [Resolved] (SPARK-19275) Spark Streaming, Kafka receiver, "Failed to get records for ... after polling for 512"

2017-02-14 Thread Armin Braun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Armin Braun resolved SPARK-19275. - Resolution: Not A Problem > Spark Streaming, Kafka receiver, "Failed to get records for ...

[jira] [Assigned] (SPARK-16475) Broadcast Hint for SQL Queries

2017-02-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-16475: Assignee: Reynold Xin > Broadcast Hint for SQL Queries

[jira] [Resolved] (SPARK-16475) Broadcast Hint for SQL Queries

2017-02-14 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-16475. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16925

[jira] [Updated] (SPARK-19593) Records read per each kinesis transaction

2017-02-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19593: - Priority: Trivial (was: Critical) > Records read per each kinesis transaction >

[jira] [Updated] (SPARK-19593) Records read per each kinesis transaction

2017-02-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19593: - Component/s: (was: Structured Streaming) (was: Spark Core)

[jira] [Commented] (SPARK-19594) StreamingQueryListener fails to handle QueryTerminatedEvent if more then one listeners exists

2017-02-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866778#comment-15866778 ] Shixiong Zhu commented on SPARK-19594: -- Good catch. Would you like to submit a PR to fix it? >

[jira] [Resolved] (SPARK-19387) CRAN tests do not run with SparkR source package

2017-02-14 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shivaram Venkataraman resolved SPARK-19387. --- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 Issue

[jira] [Commented] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-02-14 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866768#comment-15866768 ] Timothy Hunter commented on SPARK-19208: Yes, I meant returning a struct and then projecting this

[jira] [Commented] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-02-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866755#comment-15866755 ] Nick Pentreath commented on SPARK-19208: Ah right I see - yes rewrite rules would be a good

[jira] [Comment Edited] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-02-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866755#comment-15866755 ] Nick Pentreath edited comment on SPARK-19208 at 2/14/17 9:42 PM: - Ah

[jira] [Updated] (SPARK-19523) Spark streaming+ insert into table leaves bunch of trash in table directory

2017-02-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu updated SPARK-19523: - Issue Type: Question (was: Improvement) > Spark streaming+ insert into table leaves bunch of

[jira] [Resolved] (SPARK-19523) Spark streaming+ insert into table leaves bunch of trash in table directory

2017-02-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19523. -- Resolution: Not A Bug > Spark streaming+ insert into table leaves bunch of trash in table

[jira] [Assigned] (SPARK-19497) dropDuplicates with watermark

2017-02-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-19497: Assignee: Shixiong Zhu > dropDuplicates with watermark

[jira] [Comment Edited] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-02-14 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866714#comment-15866714 ] Timothy Hunter edited comment on SPARK-19208 at 2/14/17 9:24 PM: - Thanks

[jira] [Commented] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-02-14 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866714#comment-15866714 ] Timothy Hunter commented on SPARK-19208: Thanks for the clarification [~mlnick]. I was a bit

[jira] [Assigned] (SPARK-19601) Fix CollapseRepartition rule to preserve shuffle-enabled Repartition

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19601: Assignee: Apache Spark (was: Xiao Li) > Fix CollapseRepartition rule to preserve

[jira] [Commented] (SPARK-19601) Fix CollapseRepartition rule to preserve shuffle-enabled Repartition

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866709#comment-15866709 ] Apache Spark commented on SPARK-19601: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19601) Fix CollapseRepartition rule to preserve shuffle-enabled Repartition

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19601: Assignee: Xiao Li (was: Apache Spark) > Fix CollapseRepartition rule to preserve

[jira] [Created] (SPARK-19601) Fix CollapseRepartition rule to preserve shuffle-enabled Repartition

2017-02-14 Thread Xiao Li (JIRA)
Xiao Li created SPARK-19601: --- Summary: Fix CollapseRepartition rule to preserve shuffle-enabled Repartition Key: SPARK-19601 URL: https://issues.apache.org/jira/browse/SPARK-19601 Project: Spark

[jira] [Commented] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-02-14 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866675#comment-15866675 ] Nick Pentreath commented on SPARK-19208: When I said "estimator-like", I didn't mean it should

[jira] [Commented] (SPARK-19600) ArrayIndexOutOfBoundsException in ALS

2017-02-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866544#comment-15866544 ] Sean Owen commented on SPARK-19600: --- You indicated it was the same issue, but, it's still not the right

[jira] [Comment Edited] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-02-14 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866535#comment-15866535 ] Timothy Hunter edited comment on SPARK-19208 at 2/14/17 8:04 PM: - I am

[jira] [Commented] (SPARK-19208) MultivariateOnlineSummarizer performance optimization

2017-02-14 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866535#comment-15866535 ] Timothy Hunter commented on SPARK-19208: I am not sure if we should follow the Estimator API for

[jira] [Commented] (SPARK-19518) IGNORE NULLS in first_value / last_value should be supported in SQL statements

2017-02-14 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866523#comment-15866523 ] Herman van Hovell commented on SPARK-19518: --- Go for it. > IGNORE NULLS in first_value /

[jira] [Comment Edited] (SPARK-19518) IGNORE NULLS in first_value / last_value should be supported in SQL statements

2017-02-14 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866523#comment-15866523 ] Herman van Hovell edited comment on SPARK-19518 at 2/14/17 8:00 PM:

[jira] [Resolved] (SPARK-19501) Slow checking if there are many spark.yarn.jars, which are already on HDFS

2017-02-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19501. Resolution: Fixed Assignee: Jong Wook Kim Fix Version/s: 2.2.0

[jira] [Commented] (SPARK-19600) ArrayIndexOutOfBoundsException in ALS

2017-02-14 Thread zhengxiang pan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866516#comment-15866516 ] zhengxiang pan commented on SPARK-19600: are you sure it is duplicated issue as SPARK-3080 even

[jira] [Resolved] (SPARK-19600) ArrayIndexOutOfBoundsException in ALS

2017-02-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19600. --- Resolution: Duplicate Questions go to the mailing list. Don't open duplicate JIRAs to re-ask. >

[jira] [Commented] (SPARK-19518) IGNORE NULLS in first_value / last_value should be supported in SQL statements

2017-02-14 Thread Sameer Abhyankar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866461#comment-15866461 ] Sameer Abhyankar commented on SPARK-19518: -- I have come across this issue recently with Spark

[jira] [Commented] (SPARK-19599) Clean up HDFSMetadataLog for Hadoop 2.6+

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866445#comment-15866445 ] Apache Spark commented on SPARK-19599: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19599) Clean up HDFSMetadataLog for Hadoop 2.6+

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19599: Assignee: Apache Spark > Clean up HDFSMetadataLog for Hadoop 2.6+ >

[jira] [Assigned] (SPARK-19599) Clean up HDFSMetadataLog for Hadoop 2.6+

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19599: Assignee: (was: Apache Spark) > Clean up HDFSMetadataLog for Hadoop 2.6+ >

[jira] [Resolved] (SPARK-19529) TransportClientFactory.createClient() shouldn't call awaitUninterruptibly()

2017-02-14 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-19529. Resolution: Fixed > TransportClientFactory.createClient() shouldn't call awaitUninterruptibly() >

[jira] [Created] (SPARK-19600) ArrayIndexOutOfBoundsException in ALS

2017-02-14 Thread zhengxiang pan (JIRA)
zhengxiang pan created SPARK-19600: -- Summary: ArrayIndexOutOfBoundsException in ALS Key: SPARK-19600 URL: https://issues.apache.org/jira/browse/SPARK-19600 Project: Spark Issue Type: Bug

[jira] [Updated] (SPARK-19529) TransportClientFactory.createClient() shouldn't call awaitUninterruptibly()

2017-02-14 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-19529: --- Fix Version/s: 1.6.4 > TransportClientFactory.createClient() shouldn't call awaitUninterruptibly() >

[jira] [Updated] (SPARK-19529) TransportClientFactory.createClient() shouldn't call awaitUninterruptibly()

2017-02-14 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-19529: --- Target Version/s: 1.6.4, 2.0.3, 2.1.1, 2.2.0 (was: 1.6.3, 2.0.3, 2.1.1, 2.2.0) >

[jira] [Updated] (SPARK-19599) Clean up HDFSMetadataLog for Hadoop 2.6+

2017-02-14 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin updated SPARK-19599: --- Affects Version/s: (was: 2.1.0) 2.2.0 > Clean up HDFSMetadataLog

[jira] [Updated] (SPARK-19529) TransportClientFactory.createClient() shouldn't call awaitUninterruptibly()

2017-02-14 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-19529: --- Fix Version/s: 2.2.0 2.1.1 2.0.3 >

[jira] [Created] (SPARK-19599) Clean up HDFSMetadataLog for Hadoop 2.6+

2017-02-14 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-19599: Summary: Clean up HDFSMetadataLog for Hadoop 2.6+ Key: SPARK-19599 URL: https://issues.apache.org/jira/browse/SPARK-19599 Project: Spark Issue Type:

[jira] [Assigned] (SPARK-19587) Disallow when sort columns are part of partitioning columns

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19587: Assignee: (was: Apache Spark) > Disallow when sort columns are part of partitioning

[jira] [Commented] (SPARK-19587) Disallow when sort columns are part of partitioning columns

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866437#comment-15866437 ] Apache Spark commented on SPARK-19587: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19587) Disallow when sort columns are part of partitioning columns

2017-02-14 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19587: Assignee: Apache Spark > Disallow when sort columns are part of partitioning columns >

[jira] [Commented] (SPARK-19592) Duplication in Test Configuration Relating to SparkConf Settings Should be Removed

2017-02-14 Thread Armin Braun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866428#comment-15866428 ] Armin Braun commented on SPARK-19592: - Imo this also relates to the ability to handle

[jira] [Commented] (SPARK-19598) Remove the alias parameter in UnresolvedRelation

2017-02-14 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866410#comment-15866410 ] Reynold Xin commented on SPARK-19598: - cc [~windpiger] are you interested in working on this? >

[jira] [Created] (SPARK-19598) Remove the alias parameter in UnresolvedRelation

2017-02-14 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-19598: --- Summary: Remove the alias parameter in UnresolvedRelation Key: SPARK-19598 URL: https://issues.apache.org/jira/browse/SPARK-19598 Project: Spark Issue Type:

[jira] [Resolved] (SPARK-19571) tests are failing to run on Windows with another instance Derby error with Hadoop 2.6.5

2017-02-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-19571. -- Resolution: Fixed Assignee: Hyukjin Kwon > tests are failing to run on Windows with

[jira] [Updated] (SPARK-19571) tests are failing to run on Windows with another instance Derby error with Hadoop 2.6.5

2017-02-14 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-19571: - Summary: tests are failing to run on Windows with another instance Derby error with Hadoop 2.6.5

[jira] [Resolved] (SPARK-19163) Lazy creation of the _judf

2017-02-14 Thread Maciej Szymkiewicz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maciej Szymkiewicz resolved SPARK-19163. Resolution: Fixed Fix Version/s: 2.2.0 > Lazy creation of the _judf >

[jira] [Commented] (SPARK-18541) Add pyspark.sql.Column.aliasWithMetadata to allow dynamic metadata management in pyspark SQL API

2017-02-14 Thread Shea Parkes (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866393#comment-15866393 ] Shea Parkes commented on SPARK-18541: - Thank you very much! > Add

[jira] [Commented] (SPARK-19592) Duplication in Test Configuration Relating to SparkConf Settings Should be Removed

2017-02-14 Thread Armin Braun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866367#comment-15866367 ] Armin Braun commented on SPARK-19592: - [~srowen] {quote} What about tests that make their own conf

[jira] [Resolved] (SPARK-19552) Upgrade Netty version to 4.1.8 final

2017-02-14 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19552?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19552. -- Resolution: Later > Upgrade Netty version to 4.1.8 final >

[jira] [Commented] (SPARK-19592) Duplication in Test Configuration Relating to SparkConf Settings Should be Removed

2017-02-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866311#comment-15866311 ] Sean Owen commented on SPARK-19592: --- What about tests that make their own conf or need to? I don't

[jira] [Resolved] (SPARK-19582) DataFrameReader conceptually inadequate

2017-02-14 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-19582. --- Resolution: Invalid I don't understand what this is describing. Is it a dependency conflict? If so,

[jira] [Commented] (SPARK-14523) Feature parity for Statistics ML with MLlib

2017-02-14 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866295#comment-15866295 ] Timothy Hunter commented on SPARK-14523: Also, the correlation is missing the multivariate case.

[jira] [Commented] (SPARK-4591) Algorithm/model parity for spark.ml (Scala)

2017-02-14 Thread Timothy Hunter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866288#comment-15866288 ] Timothy Hunter commented on SPARK-4591: --- [~josephkb] do you also want some subtasks for

[jira] [Assigned] (SPARK-18541) Add pyspark.sql.Column.aliasWithMetadata to allow dynamic metadata management in pyspark SQL API

2017-02-14 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk reassigned SPARK-18541: --- Assignee: Shea Parkes > Add pyspark.sql.Column.aliasWithMetadata to allow dynamic metadata

[jira] [Resolved] (SPARK-18541) Add pyspark.sql.Column.aliasWithMetadata to allow dynamic metadata management in pyspark SQL API

2017-02-14 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-18541. - Resolution: Fixed Fix Version/s: 2.2.0 > Add pyspark.sql.Column.aliasWithMetadata to allow

[jira] [Commented] (SPARK-13219) Pushdown predicate propagation in SparkSQL with join

2017-02-14 Thread Nick Dimiduk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866262#comment-15866262 ] Nick Dimiduk commented on SPARK-13219: -- I would implement this manually by materializing the smaller

[jira] [Commented] (SPARK-12957) Derive and propagate data constrains in logical plan

2017-02-14 Thread Nick Dimiduk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15866239#comment-15866239 ] Nick Dimiduk commented on SPARK-12957: -- [~sameerag] thanks for the comment. From a naive scan of the

[jira] [Resolved] (SPARK-19162) UserDefinedFunction constructor should verify that func is callable

2017-02-14 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] holdenk resolved SPARK-19162. - Resolution: Fixed Fix Version/s: 2.2.0 > UserDefinedFunction constructor should verify that func
