[jira] [Updated] (SPARK-18841) PushProjectionThroughUnion exception when there are same column

2017-02-07 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18841: -- Fix Version/s: 2.1.1 > PushProjectionThroughUnion exception when there are same column

[jira] [Updated] (SPARK-18609) [SQL] column mixup with CROSS JOIN

2017-02-07 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-18609: -- Fix Version/s: 2.1.1 > [SQL] column mixup with CROSS JOIN >

[jira] [Commented] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Song Jun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857570#comment-15857570 ] Song Jun commented on SPARK-19496: -- [~hyukjin.kwon] > to_date with format has weird behavior >

[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Song Jun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857541#comment-15857541 ] Song Jun edited comment on SPARK-19496 at 2/8/17 7:18 AM: -- mysql: select

[jira] [Commented] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857561#comment-15857561 ] Wenchen Fan commented on SPARK-19496: - returning null looks better, [~smilegator] what do you think?

[jira] [Commented] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857554#comment-15857554 ] Hyukjin Kwon commented on SPARK-19496: -- Yea, thank you for explanation. I was just curious so tested

[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Song Jun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857541#comment-15857541 ] Song Jun edited comment on SPARK-19496 at 2/8/17 7:11 AM: -- mysql: select

[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Song Jun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857541#comment-15857541 ] Song Jun edited comment on SPARK-19496 at 2/8/17 7:09 AM: -- mysql: select

[jira] [Commented] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Song Jun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857541#comment-15857541 ] Song Jun commented on SPARK-19496: -- mysql: select str_to_date('2014-12-31','%Y-%d-%m') also return

[jira] [Assigned] (SPARK-19508) Improve error message when binding service fails

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19508: Assignee: (was: Apache Spark) > Improve error message when binding service fails >

[jira] [Commented] (SPARK-19508) Improve error message when binding service fails

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857533#comment-15857533 ] Apache Spark commented on SPARK-19508: -- User 'viirya' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19508) Improve error message when binding service fails

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19508: Assignee: Apache Spark > Improve error message when binding service fails >

[jira] [Updated] (SPARK-19508) Improve error message when binding service fails

2017-02-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh updated SPARK-19508: Description: Utils provides a helper function to bind service on port. This function can

[jira] [Created] (SPARK-19508) Improve error message when binding service fails

2017-02-07 Thread Liang-Chi Hsieh (JIRA)
Liang-Chi Hsieh created SPARK-19508: --- Summary: Improve error message when binding service fails Key: SPARK-19508 URL: https://issues.apache.org/jira/browse/SPARK-19508 Project: Spark Issue

[jira] [Resolved] (SPARK-19488) CSV infer schema does not take into account Inf,-Inf,NaN

2017-02-07 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19488. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16834

[jira] [Assigned] (SPARK-19488) CSV infer schema does not take into account Inf,-Inf,NaN

2017-02-07 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-19488: --- Assignee: Song Jun > CSV infer schema does not take into account Inf,-Inf,NaN >

[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857459#comment-15857459 ] Hyukjin Kwon edited comment on SPARK-19496 at 2/8/17 6:16 AM: -- - Hive

[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857459#comment-15857459 ] Hyukjin Kwon edited comment on SPARK-19496 at 2/8/17 6:16 AM: -- - Hive

[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857459#comment-15857459 ] Hyukjin Kwon edited comment on SPARK-19496 at 2/8/17 6:12 AM: -- - Hive

[jira] [Commented] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857473#comment-15857473 ] Hyukjin Kwon commented on SPARK-19496: -- Oh, yes. I just found and updated my comment. > to_date

[jira] [Commented] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857464#comment-15857464 ] Wenchen Fan commented on SPARK-19496: - The weird part is, Spark may have different behaviors depend

[jira] [Commented] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857459#comment-15857459 ] Hyukjin Kwon commented on SPARK-19496: -- - Hive {code} hive> SELECT to_date('2014-31-12');

[jira] [Commented] (SPARK-19413) Basic mapGroupsWithState API

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857439#comment-15857439 ] Apache Spark commented on SPARK-19413: -- User 'tdas' has created a pull request for this issue:

[jira] [Commented] (SPARK-19413) Basic mapGroupsWithState API

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857437#comment-15857437 ] Apache Spark commented on SPARK-19413: -- User 'tdas' has created a pull request for this issue:

[jira] [Commented] (SPARK-19507) pyspark.sql.types._verify_type() exceptions too broad to debug collections or nested data

2017-02-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857423#comment-15857423 ] Hyukjin Kwon commented on SPARK-19507: -- That is as you said private with the underbar prefix

[jira] [Commented] (SPARK-19474) SparkSQL unsupports to change hive table's name\dataType

2017-02-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857382#comment-15857382 ] Hyukjin Kwon commented on SPARK-19474: -- I have no information about this but I am less sure given

[jira] [Assigned] (SPARK-18873) New test cases for scalar subquery

2017-02-07 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li reassigned SPARK-18873: --- Assignee: Nattavut Sutyanyong > New test cases for scalar subquery >

[jira] [Closed] (SPARK-18824) Add optimizer rule to reorder expensive Filter predicates like ScalaUDF

2017-02-07 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Liang-Chi Hsieh closed SPARK-18824. --- Resolution: Won't Fix > Add optimizer rule to reorder expensive Filter predicates like

[jira] [Resolved] (SPARK-19499) Add more notes in the comments of Sink.addBatch()

2017-02-07 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19499. -- Resolution: Fixed Assignee: Nan Zhu Fix Version/s: 2.2.0

[jira] [Resolved] (SPARK-19413) Basic mapGroupsWithState API

2017-02-07 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19413. -- Resolution: Fixed Fix Version/s: 2.2.0 > Basic mapGroupsWithState API >

[jira] [Commented] (SPARK-19279) Disallow Users to Create a Hive Table With an Empty Schema

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857363#comment-15857363 ] Apache Spark commented on SPARK-19279: -- User 'gatorsmile' has created a pull request for this issue:

[jira] [Updated] (SPARK-19507) pyspark.sql.types._verify_type() exceptions too broad to debug collections or nested data

2017-02-07 Thread David Gingrich (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Gingrich updated SPARK-19507: --- Description: The private function pyspark.sql.types._verify_type() recursively checks an

[jira] [Created] (SPARK-19507) pyspark.sql.types._verify_type() exceptions too broad to debug collections or nested data

2017-02-07 Thread David Gingrich (JIRA)
David Gingrich created SPARK-19507: -- Summary: pyspark.sql.types._verify_type() exceptions too broad to debug collections or nested data Key: SPARK-19507 URL: https://issues.apache.org/jira/browse/SPARK-19507

[jira] [Assigned] (SPARK-17629) Add local version of Word2Vec findSynonyms for spark.ml

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-17629: - Shepherd: Joseph K. Bradley Assignee: Asher Krim

[jira] [Commented] (SPARK-19474) SparkSQL unsupports to change hive table's name\dataType

2017-02-07 Thread Xiaochen Ouyang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857238#comment-15857238 ] Xiaochen Ouyang commented on SPARK-19474: - ping [~hyukjin.kwon] > SparkSQL unsupports to change

[jira] [Commented] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2017-02-07 Thread Seth Hendrickson (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857230#comment-15857230 ] Seth Hendrickson commented on SPARK-17139: -- [~josephkb] Is [this more or less what you had in

[jira] [Commented] (SPARK-10721) Log warning when file deletion fails

2017-02-07 Thread meiyoula (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857225#comment-15857225 ] meiyoula commented on SPARK-10721: -- Hi [~srowen], may I ask a question. When File.delete() returns

[jira] [Resolved] (SPARK-19397) Option names of LIBSVM and TEXT are not case insensitive.

2017-02-07 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-19397. - Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 16737

[jira] [Assigned] (SPARK-19318) Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle`

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19318: Assignee: Apache Spark > Docker test case failure: `SPARK-16625: General data types to be

[jira] [Commented] (SPARK-19318) Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle`

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857191#comment-15857191 ] Apache Spark commented on SPARK-19318: -- User 'sureshthalamati' has created a pull request for this

[jira] [Assigned] (SPARK-19318) Docker test case failure: `SPARK-16625: General data types to be mapped to Oracle`

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19318: Assignee: (was: Apache Spark) > Docker test case failure: `SPARK-16625: General data

[jira] [Commented] (SPARK-19506) Missing warnings import in pyspark.ml.util

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857159#comment-15857159 ] Apache Spark commented on SPARK-19506: -- User 'zero323' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19506) Missing warnings import in pyspark.ml.util

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19506: Assignee: Apache Spark > Missing warnings import in pyspark.ml.util >

[jira] [Assigned] (SPARK-19506) Missing warnings import in pyspark.ml.util

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19506: Assignee: (was: Apache Spark) > Missing warnings import in pyspark.ml.util >

[jira] [Created] (SPARK-19506) Missing warnings import in pyspark.ml.util

2017-02-07 Thread Maciej Szymkiewicz (JIRA)
Maciej Szymkiewicz created SPARK-19506: -- Summary: Missing warnings import in pyspark.ml.util Key: SPARK-19506 URL: https://issues.apache.org/jira/browse/SPARK-19506 Project: Spark Issue

[jira] [Assigned] (SPARK-19505) AttributeError on Exception.message in Python3; hides true exceptions in cloudpickle.py and broadcast.py

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19505: Assignee: (was: Apache Spark) > AttributeError on Exception.message in Python3; hides

[jira] [Commented] (SPARK-19505) AttributeError on Exception.message in Python3; hides true exceptions in cloudpickle.py and broadcast.py

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857154#comment-15857154 ] Apache Spark commented on SPARK-19505: -- User 'dgingrich' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19505) AttributeError on Exception.message in Python3; hides true exceptions in cloudpickle.py and broadcast.py

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19505: Assignee: Apache Spark > AttributeError on Exception.message in Python3; hides true

[jira] [Created] (SPARK-19505) AttributeError on Exception.message in Python3; hides true exceptions in cloudpickle.py and broadcast.py

2017-02-07 Thread David Gingrich (JIRA)
David Gingrich created SPARK-19505: -- Summary: AttributeError on Exception.message in Python3; hides true exceptions in cloudpickle.py and broadcast.py Key: SPARK-19505 URL:

[jira] [Commented] (SPARK-19503) Execution Plan Optimizer: avoid sort or shuffle when it does not change end result such as df.sort(...).count()

2017-02-07 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857015#comment-15857015 ] Herman van Hovell commented on SPARK-19503: --- We could prune sort and distribute operators at

[jira] [Updated] (SPARK-19503) Execution Plan Optimizer: avoid sort or shuffle when it does not change end result such as df.sort(...).count()

2017-02-07 Thread R (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] R updated SPARK-19503: -- Summary: Execution Plan Optimizer: avoid sort or shuffle when it does not change end result such as

[jira] [Closed] (SPARK-18386) Batch mode SQL source for Kafka

2017-02-07 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu closed SPARK-18386. Resolution: Duplicate > Batch mode SQL source for Kafka > --- > >

[jira] [Resolved] (SPARK-18682) Batch Source for Kafka

2017-02-07 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-18682. -- Resolution: Fixed Fix Version/s: 2.2.0 2.1.1 > Batch Source for

[jira] [Created] (SPARK-19504) clearCache fails to delete orphan RDDs, especially in pyspark

2017-02-07 Thread R (JIRA)
R created SPARK-19504: - Summary: clearCache fails to delete orphan RDDs, especially in pyspark Key: SPARK-19504 URL: https://issues.apache.org/jira/browse/SPARK-19504 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-19503) Dumb Execution Plan

2017-02-07 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856931#comment-15856931 ] Sean Owen commented on SPARK-19503: --- Can you improve the JIRA? The title is uninformative. See

[jira] [Created] (SPARK-19503) Dumb Execution Plan

2017-02-07 Thread R (JIRA)
R created SPARK-19503: - Summary: Dumb Execution Plan Key: SPARK-19503 URL: https://issues.apache.org/jira/browse/SPARK-19503 Project: Spark Issue Type: Bug Components: Optimizer Affects

[jira] [Commented] (SPARK-19348) pyspark.ml.Pipeline gets corrupted under multi threaded use

2017-02-07 Thread Peter D Kirchner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856922#comment-15856922 ] Peter D Kirchner commented on SPARK-19348: -- Two things happen with this wrapper. First, it

[jira] [Commented] (SPARK-17498) StringIndexer.setHandleInvalid should have another option 'new'

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856918#comment-15856918 ] Joseph K. Bradley commented on SPARK-17498: --- Linking related issue for QuantileDiscretizer

[jira] [Assigned] (SPARK-19500) Fail to spill the aggregated hash map when radix sort is used

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19500: Assignee: Apache Spark (was: Davies Liu) > Fail to spill the aggregated hash map when

[jira] [Assigned] (SPARK-19500) Fail to spill the aggregated hash map when radix sort is used

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19500: Assignee: Davies Liu (was: Apache Spark) > Fail to spill the aggregated hash map when

[jira] [Commented] (SPARK-19500) Fail to spill the aggregated hash map when radix sort is used

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856896#comment-15856896 ] Apache Spark commented on SPARK-19500: -- User 'davies' has created a pull request for this issue:

[jira] [Updated] (SPARK-17498) StringIndexer.setHandleInvalid should have another option 'new'

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-17498: -- Summary: StringIndexer.setHandleInvalid should have another option 'new' (was:

[jira] [Commented] (SPARK-18841) PushProjectionThroughUnion exception when there are same column

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856854#comment-15856854 ] Apache Spark commented on SPARK-18841: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Commented] (SPARK-18609) [SQL] column mixup with CROSS JOIN

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856853#comment-15856853 ] Apache Spark commented on SPARK-18609: -- User 'hvanhovell' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-18967) Locality preferences should be used when scheduling even when delay scheduling is turned off

2017-02-07 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855615#comment-15855615 ] Kay Ousterhout edited comment on SPARK-18967 at 2/7/17 9:42 PM: Please

[jira] [Resolved] (SPARK-18841) PushProjectionThroughUnion exception when there are same column

2017-02-07 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18841. --- Resolution: Fixed Assignee: Herman van Hovell Fix Version/s: 2.2.0 >

[jira] [Resolved] (SPARK-18609) [SQL] column mixup with CROSS JOIN

2017-02-07 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell resolved SPARK-18609. --- Resolution: Fixed Assignee: Herman van Hovell Fix Version/s: 2.2.0 >

[jira] [Commented] (SPARK-18967) Locality preferences should be used when scheduling even when delay scheduling is turned off

2017-02-07 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856842#comment-15856842 ] Kay Ousterhout commented on SPARK-18967: Oops this was me [~rxin] sorry! > Locality preferences

[jira] [Comment Edited] (SPARK-18967) Locality preferences should be used when scheduling even when delay scheduling is turned off

2017-02-07 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855615#comment-15855615 ] Kay Ousterhout edited comment on SPARK-18967 at 2/7/17 9:40 PM: Oops this

[jira] [Commented] (SPARK-19304) Kinesis checkpoint recovery is 10x slow

2017-02-07 Thread Gaurav Shah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856772#comment-15856772 ] Gaurav Shah commented on SPARK-19304: - Went ahead with a compromised approach for better code. We can

[jira] [Assigned] (SPARK-19304) Kinesis checkpoint recovery is 10x slow

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19304: Assignee: (was: Apache Spark) > Kinesis checkpoint recovery is 10x slow >

[jira] [Commented] (SPARK-19304) Kinesis checkpoint recovery is 10x slow

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856771#comment-15856771 ] Apache Spark commented on SPARK-19304: -- User 'Gauravshah' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19304) Kinesis checkpoint recovery is 10x slow

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19304: Assignee: Apache Spark > Kinesis checkpoint recovery is 10x slow >

[jira] [Commented] (SPARK-19501) Slow checking if there are many spark.yarn.jars, which are already on HDFS

2017-02-07 Thread Jong Wook Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856741#comment-15856741 ] Jong Wook Kim commented on SPARK-19501: --- I am aware of that option, but I would like the

[jira] [Commented] (SPARK-19501) Slow checking if there are many spark.yarn.jars, which are already on HDFS

2017-02-07 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856729#comment-15856729 ] Marcelo Vanzin commented on SPARK-19501: Reducing the number of RPCs is nice, but you can

[jira] [Commented] (SPARK-19501) Slow checking if there are many spark.yarn.jars, which are already on HDFS

2017-02-07 Thread Jong Wook Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856718#comment-15856718 ] Jong Wook Kim commented on SPARK-19501: --- I will be happy to make a pull request for this, but

[jira] [Updated] (SPARK-19501) Slow checking if there are many spark.yarn.jars, which are already on HDFS

2017-02-07 Thread Jong Wook Kim (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jong Wook Kim updated SPARK-19501: -- Description: Hi, this is my first Spark issue submission and please excuse any

[jira] [Updated] (SPARK-19502) Remove unnecessary code to re-submit stages in the DAGScheduler

2017-02-07 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout updated SPARK-19502: --- Description: There are a [few lines of code in the

[jira] [Created] (SPARK-19502) Remove unnecessary code to re-submit stages in the DAGScheduler

2017-02-07 Thread Kay Ousterhout (JIRA)
Kay Ousterhout created SPARK-19502: -- Summary: Remove unnecessary code to re-submit stages in the DAGScheduler Key: SPARK-19502 URL: https://issues.apache.org/jira/browse/SPARK-19502 Project: Spark

[jira] [Created] (SPARK-19501) Slow checking if there are many spark.yarn.jars, which are already on HDFS

2017-02-07 Thread Jong Wook Kim (JIRA)
Jong Wook Kim created SPARK-19501: - Summary: Slow checking if there are many spark.yarn.jars, which are already on HDFS Key: SPARK-19501 URL: https://issues.apache.org/jira/browse/SPARK-19501

[jira] [Commented] (SPARK-19348) pyspark.ml.Pipeline gets corrupted under multi threaded use

2017-02-07 Thread Peter D Kirchner (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856700#comment-15856700 ] Peter D Kirchner commented on SPARK-19348: -- To save folks some time, the keyword_only decorator

[jira] [Assigned] (SPARK-19500) Fail to spill the aggregated hash map when radix sort is used

2017-02-07 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu reassigned SPARK-19500: -- Assignee: Davies Liu > Fail to spill the aggregated hash map when radix sort is used >

[jira] [Created] (SPARK-19500) Fail to spill the aggregated hash map when radix sort is used

2017-02-07 Thread Davies Liu (JIRA)
Davies Liu created SPARK-19500: -- Summary: Fail to spill the aggregated hash map when radix sort is used Key: SPARK-19500 URL: https://issues.apache.org/jira/browse/SPARK-19500 Project: Spark

[jira] [Commented] (SPARK-17714) ClassCircularityError is thrown when using org.apache.spark.util.Utils.classForName

2017-02-07 Thread Cheng Lian (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856638#comment-15856638 ] Cheng Lian commented on SPARK-17714: Although I've no idea why this error occurs, it seems that

[jira] [Commented] (SPARK-18871) New test cases for IN/NOT IN subquery

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856626#comment-15856626 ] Apache Spark commented on SPARK-18871: -- User 'kevinyu98' has created a pull request for this issue:

[jira] [Comment Edited] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856612#comment-15856612 ] Joseph K. Bradley edited comment on SPARK-17139 at 2/7/17 7:25 PM: ---

[jira] [Commented] (SPARK-17139) Add model summary for MultinomialLogisticRegression

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856612#comment-15856612 ] Joseph K. Bradley commented on SPARK-17139: --- I'll offer a few thoughts first: * A

[jira] [Commented] (SPARK-19495) Make SQLConf slightly more extensible

2017-02-07 Thread Jacek Laskowski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856571#comment-15856571 ] Jacek Laskowski commented on SPARK-19495: - What's the use case it tries to solve/address? What

[jira] [Updated] (SPARK-19499) Add more notes in the comments of Sink.addBatch()

2017-02-07 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-19499: Description: addBatch method in Sink trait is supposed to be a synchronous method to coordinate with the

[jira] [Assigned] (SPARK-19499) Add more notes in the comments of Sink.addBatch()

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19499: Assignee: Apache Spark > Add more notes in the comments of Sink.addBatch() >

[jira] [Assigned] (SPARK-19499) Add more notes in the comments of Sink.addBatch()

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19499: Assignee: (was: Apache Spark) > Add more notes in the comments of Sink.addBatch() >

[jira] [Commented] (SPARK-19499) Add more notes in the comments of Sink.addBatch()

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856508#comment-15856508 ] Apache Spark commented on SPARK-19499: -- User 'CodingCat' has created a pull request for this issue:

[jira] [Updated] (SPARK-19499) Add more notes in the comments of Sink.addBatch()

2017-02-07 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nan Zhu updated SPARK-19499: Summary: Add more notes in the comments of Sink.addBatch() (was: Add more description in the comments of

[jira] [Created] (SPARK-19499) Add more description in the comments of Sink.addBatch()

2017-02-07 Thread Nan Zhu (JIRA)
Nan Zhu created SPARK-19499: --- Summary: Add more description in the comments of Sink.addBatch() Key: SPARK-19499 URL: https://issues.apache.org/jira/browse/SPARK-19499 Project: Spark Issue Type:

[jira] [Commented] (SPARK-19409) Upgrade Parquet to 1.8.2

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19409?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856434#comment-15856434 ] Apache Spark commented on SPARK-19409: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Resolved] (SPARK-19495) Make SQLConf slightly more extensible

2017-02-07 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-19495. - Resolution: Fixed Fix Version/s: 2.2.0 > Make SQLConf slightly more extensible >

[jira] [Commented] (SPARK-19359) partition path created by Hive should be deleted after rename a partition with upper-case

2017-02-07 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856376#comment-15856376 ] Apache Spark commented on SPARK-19359: -- User 'cloud-fan' has created a pull request for this issue:

[jira] [Updated] (SPARK-10817) ML abstraction umbrella

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-10817: -- Priority: Major (was: Critical) > ML abstraction umbrella > --- >

[jira] [Updated] (SPARK-19498) Discussion: Making MLlib APIs extensible for 3rd party libraries

2017-02-07 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-19498: -- Description: Per the recent discussion on the dev list, this JIRA is for discussing

[jira] [Created] (SPARK-19498) Discussion: Making MLlib APIs extensible for 3rd party libraries

2017-02-07 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-19498: - Summary: Discussion: Making MLlib APIs extensible for 3rd party libraries Key: SPARK-19498 URL: https://issues.apache.org/jira/browse/SPARK-19498 Project:
