[jira] [Comment Edited] (SPARK-20199) GradientBoostedTreesModel doesn't have Column Sampling Rate Paramenter

2017-04-26 Thread 颜发才
[ https://issues.apache.org/jira/browse/SPARK-20199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984161#comment-15984161 ] Yan Facai (颜发才) edited comment on SPARK-20199 at 4/26/17 6:11 AM: -- The

[jira] [Comment Edited] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984173#comment-15984173 ] Liang-Chi Hsieh edited comment on SPARK-20392 at 4/26/17 6:43 AM: --

[jira] [Comment Edited] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984173#comment-15984173 ] Liang-Chi Hsieh edited comment on SPARK-20392 at 4/26/17 6:44 AM: --

[jira] [Created] (SPARK-20466) HadoopRDD#addLocalConfiguration throws NPE

2017-04-26 Thread liyunzhang_intel (JIRA)
liyunzhang_intel created SPARK-20466: Summary: HadoopRDD#addLocalConfiguration throws NPE Key: SPARK-20466 URL: https://issues.apache.org/jira/browse/SPARK-20466 Project: Spark Issue

[jira] [Assigned] (SPARK-20467) sbt-launch-lib.bash has lacked the ASF header.

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20467: Assignee: Apache Spark > sbt-launch-lib.bash has lacked the ASF header. >

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-26 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984270#comment-15984270 ] Liang-Chi Hsieh commented on SPARK-20392: - [~barrybecker4] Btw, the time applying the model_9756

[jira] [Assigned] (SPARK-20468) Refactor the ALS code

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20468: Assignee: Apache Spark > Refactor the ALS code > - > >

[jira] [Assigned] (SPARK-20468) Refactor the ALS code

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20468: Assignee: (was: Apache Spark) > Refactor the ALS code > - > >

[jira] [Created] (SPARK-20469) Add a method to display DataFrame schema in PipelineStage

2017-04-26 Thread darion yaphet (JIRA)
darion yaphet created SPARK-20469: - Summary: Add a method to display DataFrame schema in PipelineStage Key: SPARK-20469 URL: https://issues.apache.org/jira/browse/SPARK-20469 Project: Spark

[jira] [Assigned] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20392: Assignee: (was: Apache Spark) > Slow performance when calling fit on ML pipeline for

[jira] [Assigned] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20392: Assignee: Apache Spark > Slow performance when calling fit on ML pipeline for dataset

[jira] [Created] (SPARK-20470) Invalid json converting RDD row with Array of struct to json

2017-04-26 Thread Philip Adetiloye (JIRA)
Philip Adetiloye created SPARK-20470: Summary: Invalid json converting RDD row with Array of struct to json Key: SPARK-20470 URL: https://issues.apache.org/jira/browse/SPARK-20470 Project: Spark

[jira] [Commented] (SPARK-20468) Refactor the ALS code

2017-04-26 Thread Daniel Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984350#comment-15984350 ] Daniel Li commented on SPARK-20468: --- I'd appreciate if someone with permissions could assign this issue

[jira] [Assigned] (SPARK-20471) Remove AggregateBenchmark testsuite warning: Two level hashmap is disabled but vectorized hashmap is enabled.

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20471: Assignee: Apache Spark > Remove AggregateBenchmark testsuite warning: Two level hashmap

[jira] [Assigned] (SPARK-20471) Remove AggregateBenchmark testsuite warning: Two level hashmap is disabled but vectorized hashmap is enabled.

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20471: Assignee: (was: Apache Spark) > Remove AggregateBenchmark testsuite warning: Two

[jira] [Updated] (SPARK-20466) HadoopRDD#addLocalConfiguration throws NPE

2017-04-26 Thread liyunzhang_intel (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] liyunzhang_intel updated SPARK-20466: - Attachment: NPE_log NPE_log describes the detailed info. >

[jira] [Created] (SPARK-20467) sbt-launch-lib.bash has lacked the ASF header.

2017-04-26 Thread liuzhaokun (JIRA)
liuzhaokun created SPARK-20467: -- Summary: sbt-launch-lib.bash has lacked the ASF header. Key: SPARK-20467 URL: https://issues.apache.org/jira/browse/SPARK-20467 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-20468) Refactor the ALS code

2017-04-26 Thread Daniel Li (JIRA)
Daniel Li created SPARK-20468: - Summary: Refactor the ALS code Key: SPARK-20468 URL: https://issues.apache.org/jira/browse/SPARK-20468 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-20468) Refactor the ALS code

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984346#comment-15984346 ] Apache Spark commented on SPARK-20468: -- User 'danielyli' has created a pull request for this issue:

[jira] [Commented] (SPARK-20470) Invalid json converting RDD row with Array of struct to json

2017-04-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984403#comment-15984403 ] Sean Owen commented on SPARK-20470: --- What JSON do you expect? what JSON do you get? There's no relevant

[jira] [Updated] (SPARK-20470) Invalid json converting RDD row with Array of struct to json

2017-04-26 Thread Philip Adetiloye (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Adetiloye updated SPARK-20470: - Description: Trying to convert an RDD in pyspark containing Array of struct doesn't

[jira] [Created] (SPARK-20472) Support for Dynamic Configuration

2017-04-26 Thread Shahbaz Hussain (JIRA)
Shahbaz Hussain created SPARK-20472: --- Summary: Support for Dynamic Configuration Key: SPARK-20472 URL: https://issues.apache.org/jira/browse/SPARK-20472 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-20471) Remove AggregateBenchmark testsuite warning: Two level hashmap is disabled but vectorized hashmap is enabled.

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984451#comment-15984451 ] Apache Spark commented on SPARK-20471: -- User 'heary-cao' has created a pull request for this issue:

[jira] [Commented] (SPARK-11968) ALS recommend all methods spend most of time in GC

2017-04-26 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984533#comment-15984533 ] Nick Pentreath commented on SPARK-11968: Thanks - in the meantime I will take a look at the PR.

[jira] [Commented] (SPARK-20081) RandomForestClassifier doesn't seem to support more than 100 labels

2017-04-26 Thread Christian Reiniger (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984326#comment-15984326 ] Christian Reiniger commented on SPARK-20081: Thank you. All (potential) labels *are* in fact

[jira] [Resolved] (SPARK-20467) sbt-launch-lib.bash has lacked the ASF header.

2017-04-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20467. --- Resolution: Not A Problem This file should not have an ASF header. > sbt-launch-lib.bash has lacked

[jira] [Assigned] (SPARK-20400) Remove References to Third Party Vendors from Spark ASF Documentation

2017-04-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-20400: - Assignee: Bill Chambers > Remove References to Third Party Vendors from Spark ASF Documentation

[jira] [Resolved] (SPARK-20400) Remove References to Third Party Vendors from Spark ASF Documentation

2017-04-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20400. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17695

[jira] [Updated] (SPARK-20470) Invalid json converting RDD row with Array of struct to json

2017-04-26 Thread Philip Adetiloye (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Adetiloye updated SPARK-20470: - Description: Trying to convert an RDD in pyspark containing Array of struct doesn't

[jira] [Commented] (SPARK-20470) Invalid json converting RDD row with Array of struct to json

2017-04-26 Thread Philip Adetiloye (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984428#comment-15984428 ] Philip Adetiloye commented on SPARK-20470: -- [~srowen] I'm sorry, I added an example. > Invalid

[jira] [Created] (SPARK-20471) Remove AggregateBenchmark testsuite warning: Two level hashmap is disabled but vectorized hashmap is enabled.

2017-04-26 Thread caoxuewen (JIRA)
caoxuewen created SPARK-20471: - Summary: Remove AggregateBenchmark testsuite warning: Two level hashmap is disabled but vectorized hashmap is enabled. Key: SPARK-20471 URL:

[jira] [Commented] (SPARK-20472) Support for Dynamic Configuration

2017-04-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984461#comment-15984461 ] Sean Owen commented on SPARK-20472: --- I don't think this is generally possible, because some config is

[jira] [Created] (SPARK-20473) ColumnVector.Array is missing accessors for some types

2017-04-26 Thread Michal Szafranski (JIRA)
Michal Szafranski created SPARK-20473: - Summary: ColumnVector.Array is missing accessors for some types Key: SPARK-20473 URL: https://issues.apache.org/jira/browse/SPARK-20473 Project: Spark

[jira] [Commented] (SPARK-12965) Indexer setInputCol() doesn't resolve column names like DataFrame.col()

2017-04-26 Thread Calin Cocan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984540#comment-15984540 ] Calin Cocan commented on SPARK-12965: - I have encountered the same problem using StringIndexer and

[jira] [Commented] (SPARK-20473) ColumnVector.Array is missing accessors for some types

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984576#comment-15984576 ] Apache Spark commented on SPARK-20473: -- User 'michal-databricks' has created a pull request for this

[jira] [Created] (SPARK-20475) Whether use "broadcast join" depends on hive configuration

2017-04-26 Thread Lijia Liu (JIRA)
Lijia Liu created SPARK-20475: - Summary: Whether use "broadcast join" depends on hive configuration Key: SPARK-20475 URL: https://issues.apache.org/jira/browse/SPARK-20475 Project: Spark Issue

[jira] [Created] (SPARK-20474) OnHeapColumnVector realocation may not copy existing data

2017-04-26 Thread Michal Szafranski (JIRA)
Michal Szafranski created SPARK-20474: - Summary: OnHeapColumnVector realocation may not copy existing data Key: SPARK-20474 URL: https://issues.apache.org/jira/browse/SPARK-20474 Project: Spark

[jira] [Commented] (SPARK-20474) OnHeapColumnVector realocation may not copy existing data

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984577#comment-15984577 ] Apache Spark commented on SPARK-20474: -- User 'michal-databricks' has created a pull request for this

[jira] [Assigned] (SPARK-20473) ColumnVector.Array is missing accessors for some types

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20473: Assignee: Apache Spark > ColumnVector.Array is missing accessors for some types >

[jira] [Assigned] (SPARK-20473) ColumnVector.Array is missing accessors for some types

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20473: Assignee: (was: Apache Spark) > ColumnVector.Array is missing accessors for some

[jira] [Assigned] (SPARK-20473) ColumnVector.Array is missing accessors for some types

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20473: Assignee: Apache Spark > ColumnVector.Array is missing accessors for some types >

[jira] [Assigned] (SPARK-20474) OnHeapColumnVector realocation may not copy existing data

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20474: Assignee: (was: Apache Spark) > OnHeapColumnVector realocation may not copy existing

[jira] [Assigned] (SPARK-20473) ColumnVector.Array is missing accessors for some types

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20473: Assignee: (was: Apache Spark) > ColumnVector.Array is missing accessors for some

[jira] [Assigned] (SPARK-20474) OnHeapColumnVector realocation may not copy existing data

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20474: Assignee: Apache Spark > OnHeapColumnVector realocation may not copy existing data >

[jira] [Commented] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Sebastian Arzt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984776#comment-15984776 ] Sebastian Arzt commented on SPARK-18371: I deep dived into it and found a simple solution. The

[jira] [Comment Edited] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Sebastian Arzt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984776#comment-15984776 ] Sebastian Arzt edited comment on SPARK-18371 at 4/26/17 1:24 PM: - I deep

[jira] [Resolved] (SPARK-19812) YARN shuffle service fails to relocate recovery DB across NFS directories

2017-04-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-19812. --- Resolution: Fixed Fix Version/s: 2.3.0 2.2.0 > YARN shuffle

[jira] [Comment Edited] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Sebastian Arzt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984776#comment-15984776 ] Sebastian Arzt edited comment on SPARK-18371 at 4/26/17 1:23 PM: - I deep

[jira] [Assigned] (SPARK-20391) Properly rename the memory related fields in ExecutorSummary REST API

2017-04-26 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid reassigned SPARK-20391: Assignee: Saisai Shao > Properly rename the memory related fields in ExecutorSummary REST

[jira] [Commented] (SPARK-7481) Add spark-hadoop-cloud module to pull in object store support

2017-04-26 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984771#comment-15984771 ] Steve Loughran commented on SPARK-7481: --- I think we ended up going in circles on that PR. Sean has

[jira] [Resolved] (SPARK-20391) Properly rename the memory related fields in ExecutorSummary REST API

2017-04-26 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid resolved SPARK-20391. -- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17700

[jira] [Commented] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984962#comment-15984962 ] Apache Spark commented on SPARK-18371: -- User 'arzt' has created a pull request for this issue:

[jira] [Issue Comment Deleted] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Sebastian Arzt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Arzt updated SPARK-18371: --- Comment: was deleted (was: After fix) > Spark Streaming backpressure bug - generates a

[jira] [Issue Comment Deleted] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Sebastian Arzt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Arzt updated SPARK-18371: --- Comment: was deleted (was: Before fix) > Spark Streaming backpressure bug - generates a

[jira] [Updated] (SPARK-20476) Exception between "create table as" and "get_json_object"

2017-04-26 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-20476: -- Description: I encounter this problem when I want to create a table as select , get_json_object

[jira] [Updated] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Sebastian Arzt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Arzt updated SPARK-18371: --- Attachment: 01.png Before fix > Spark Streaming backpressure bug - generates a batch with

[jira] [Updated] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Sebastian Arzt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Arzt updated SPARK-18371: --- Attachment: 02.png After fix > Spark Streaming backpressure bug - generates a batch with

[jira] [Commented] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Sebastian Arzt (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984975#comment-15984975 ] Sebastian Arzt commented on SPARK-18371: Screenshots:

[jira] [Comment Edited] (SPARK-17403) Fatal Error: Scan cached strings

2017-04-26 Thread Paul Lysak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15982595#comment-15982595 ] Paul Lysak edited comment on SPARK-17403 at 4/26/17 3:23 PM: - Looks like we

[jira] [Assigned] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18371: Assignee: (was: Apache Spark) > Spark Streaming backpressure bug - generates a batch

[jira] [Assigned] (SPARK-18371) Spark Streaming backpressure bug - generates a batch with large number of records

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18371: Assignee: Apache Spark > Spark Streaming backpressure bug - generates a batch with large

[jira] [Commented] (SPARK-17403) Fatal Error: Scan cached strings

2017-04-26 Thread Paul Lysak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984987#comment-15984987 ] Paul Lysak commented on SPARK-17403: Hope that helps - finally managed to reproduce it without using

[jira] [Created] (SPARK-20476) Exception between "create table as" and "get_json_object"

2017-04-26 Thread cen yuhai (JIRA)
cen yuhai created SPARK-20476: - Summary: Exception between "create table as" and "get_json_object" Key: SPARK-20476 URL: https://issues.apache.org/jira/browse/SPARK-20476 Project: Spark Issue

[jira] [Updated] (SPARK-20476) Exception between "create table as" and "get_json_object"

2017-04-26 Thread cen yuhai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] cen yuhai updated SPARK-20476: -- Description: {code} create table spark_json_object as select get_json_object(deliver_geojson,'$.')

[jira] [Commented] (SPARK-20476) Exception between "create table as" and "get_json_object"

2017-04-26 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985175#comment-15985175 ] Xiao Li commented on SPARK-20476: - You can bypass it by {noformat} create table spark_json_object as

[jira] [Created] (SPARK-20478) Document LinearSVC in R programming guide

2017-04-26 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-20478: Summary: Document LinearSVC in R programming guide Key: SPARK-20478 URL: https://issues.apache.org/jira/browse/SPARK-20478 Project: Spark Issue Type:

[jira] [Created] (SPARK-20477) Document R bisecting k-means in R programming guide

2017-04-26 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-20477: Summary: Document R bisecting k-means in R programming guide Key: SPARK-20477 URL: https://issues.apache.org/jira/browse/SPARK-20477 Project: Spark Issue

[jira] [Commented] (SPARK-20478) Document LinearSVC in R programming guide

2017-04-26 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985227#comment-15985227 ] Felix Cheung commented on SPARK-20478: -- [~wangmiao1981] Would you like to add this? > Document

[jira] [Commented] (SPARK-20477) Document R bisecting k-means in R programming guide

2017-04-26 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985226#comment-15985226 ] Felix Cheung commented on SPARK-20477: -- [~wangmiao1981] Would you like to add this? > Document R

[jira] [Updated] (SPARK-20477) Document R bisecting k-means in R programming guide

2017-04-26 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-20477: - Issue Type: Documentation (was: Bug) > Document R bisecting k-means in R programming guide >

[jira] [Resolved] (SPARK-20473) ColumnVector.Array is missing accessors for some types

2017-04-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-20473. - Resolution: Fixed Assignee: Michal Szafranski Fix Version/s: 2.2.0 >

[jira] [Commented] (SPARK-20208) Document R fpGrowth support in vignettes, programming guide and code example

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985309#comment-15985309 ] Apache Spark commented on SPARK-20208: -- User 'zero323' has created a pull request for this issue:

[jira] [Assigned] (SPARK-20208) Document R fpGrowth support in vignettes, programming guide and code example

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20208: Assignee: Apache Spark (was: Maciej Szymkiewicz) > Document R fpGrowth support in

[jira] [Assigned] (SPARK-20208) Document R fpGrowth support in vignettes, programming guide and code example

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20208: Assignee: Maciej Szymkiewicz (was: Apache Spark) > Document R fpGrowth support in

[jira] [Updated] (SPARK-20470) Invalid json converting RDD row with Array of struct to json

2017-04-26 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-20470: Component/s: SQL > Invalid json converting RDD row with Array of struct to json >

[jira] [Commented] (SPARK-7481) Add spark-hadoop-cloud module to pull in object store support

2017-04-26 Thread Steve Loughran (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985040#comment-15985040 ] Steve Loughran commented on SPARK-7481: --- (This is a fairly long comment, but it tries to summarise

[jira] [Commented] (SPARK-20476) Exception between "create table as" and "get_json_object"

2017-04-26 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985096#comment-15985096 ] Xiao Li commented on SPARK-20476: - This sounds a bug to me. Let me double check it > Exception between

[jira] [Commented] (SPARK-20475) Whether use "broadcast join" depends on hive configuration

2017-04-26 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985100#comment-15985100 ] Xiao Li commented on SPARK-20475: - cc [~ZenWzh] > Whether use "broadcast join" depends on hive

[jira] [Commented] (SPARK-20467) sbt-launch-lib.bash has lacked the ASF header.

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984298#comment-15984298 ] Apache Spark commented on SPARK-20467: -- User 'liu-zhaokun' has created a pull request for this

[jira] [Assigned] (SPARK-20467) sbt-launch-lib.bash has lacked the ASF header.

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20467: Assignee: (was: Apache Spark) > sbt-launch-lib.bash has lacked the ASF header. >

[jira] [Commented] (SPARK-20468) Refactor the ALS code

2017-04-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984365#comment-15984365 ] Sean Owen commented on SPARK-20468: --- Please read http://spark.apache.org/contributing.html first, we

[jira] [Commented] (SPARK-20392) Slow performance when calling fit on ML pipeline for dataset with many columns but few rows

2017-04-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15984399#comment-15984399 ] Apache Spark commented on SPARK-20392: -- User 'viirya' has created a pull request for this issue:

[jira] [Updated] (SPARK-20470) Invalid json converting RDD row with Array of struct to json

2017-04-26 Thread Philip Adetiloye (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philip Adetiloye updated SPARK-20470: - Description: Trying to convert an RDD in pyspark containing Array of struct doesn't

[jira] [Reopened] (SPARK-20208) Document R fpGrowth support in vignettes, programming guide and code example

2017-04-26 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung reopened SPARK-20208: -- Actually, would you mind updating the R programming guide too? > Document R fpGrowth support in

[jira] [Updated] (SPARK-20015) Document R Structured Streaming (experimental) in R vignettes and R & SS programming guide, R example

2017-04-26 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-20015: - Summary: Document R Structured Streaming (experimental) in R vignettes and R & SS programming

[jira] [Commented] (SPARK-20184) performance regression for complex/long sql when enable whole stage codegen

2017-04-26 Thread Kazuaki Ishizaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985319#comment-15985319 ] Kazuaki Ishizaki commented on SPARK-20184: -- When # of the aggregated columns gets large, I saw

[jira] [Created] (SPARK-20479) Performance degradation for large number of hash-aggregated columns

2017-04-26 Thread Kazuaki Ishizaki (JIRA)
Kazuaki Ishizaki created SPARK-20479: Summary: Performance degradation for large number of hash-aggregated columns Key: SPARK-20479 URL: https://issues.apache.org/jira/browse/SPARK-20479 Project:

[jira] [Commented] (SPARK-20480) FileFormatWriter hides FetchFailedException from scheduler

2017-04-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985567#comment-15985567 ] Thomas Graves commented on SPARK-20480: --- exception in task manager looks like: 17/04/26 20:09:21

[jira] [Resolved] (SPARK-12868) ADD JAR via sparkSQL JDBC will fail when using a HDFS URL

2017-04-26 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12868?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-12868. Resolution: Fixed Assignee: Weiqing Yang Fix Version/s: 2.2.0 > ADD JAR

[jira] [Created] (SPARK-20480) FileFormatWriter hides FetchFailedException from scheduler

2017-04-26 Thread Thomas Graves (JIRA)
Thomas Graves created SPARK-20480: - Summary: FileFormatWriter hides FetchFailedException from scheduler Key: SPARK-20480 URL: https://issues.apache.org/jira/browse/SPARK-20480 Project: Spark

[jira] [Closed] (SPARK-20480) FileFormatWriter hides FetchFailedException from scheduler

2017-04-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves closed SPARK-20480. - Resolution: Duplicate > FileFormatWriter hides FetchFailedException from scheduler >

[jira] [Commented] (SPARK-20178) Improve Scheduler fetch failures

2017-04-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985447#comment-15985447 ] Thomas Graves commented on SPARK-20178: --- Another thing we should tie in here is handling preempted

[jira] [Updated] (SPARK-20480) FileFormatWriter hides FetchFailedException from scheduler

2017-04-26 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-20480: -- Description: I was running a large job where it was getting failures, noticed they were listed

[jira] [Assigned] (SPARK-20476) Exception between "create table as" and "get_json_object"

2017-04-26 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-20476: --- Assignee: Xiao Li > Exception between "create table as" and "get_json_object" >

[jira] [Updated] (SPARK-20454) Improvement of ShortestPaths in Spark GraphX

2017-04-26 Thread Ji Dai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ji Dai updated SPARK-20454: --- Target Version/s: (was: 2.1.0) > Improvement of ShortestPaths in Spark GraphX >

[jira] [Commented] (SPARK-20480) FileFormatWriter hides FetchFailedException from scheduler

2017-04-26 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15985575#comment-15985575 ] Mridul Muralidharan commented on SPARK-20480: - Shouldn't fix for SPARK-19276 by [~imranr] not
