[jira] [Commented] (SPARK-12225) Support adding or replacing multiple columns at once in DataFrame API

2017-05-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005904#comment-16005904 ] Liang-Chi Hsieh commented on SPARK-12225: - Without knowing this issue, I've implemented a

[jira] [Assigned] (SPARK-20704) CRAN test should run single threaded

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20704: Assignee: Apache Spark > CRAN test should run single threaded >

[jira] [Assigned] (SPARK-20704) CRAN test should run single threaded

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20704: Assignee: (was: Apache Spark) > CRAN test should run single threaded >

[jira] [Commented] (SPARK-20704) CRAN test should run single threaded

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005899#comment-16005899 ] Apache Spark commented on SPARK-20704: -- User 'felixcheung' has created a pull request for this

[jira] [Created] (SPARK-20704) CRAN test should run single threaded

2017-05-10 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-20704: Summary: CRAN test should run single threaded Key: SPARK-20704 URL: https://issues.apache.org/jira/browse/SPARK-20704 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-20590) Map default input data source formats to inlined classes

2017-05-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005869#comment-16005869 ] Wenchen Fan commented on SPARK-20590: - We only prefer internal data source if the given name is a

[jira] [Commented] (SPARK-20590) Map default input data source formats to inlined classes

2017-05-10 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005864#comment-16005864 ] Felix Cheung commented on SPARK-20590: -- When the user explicitly specifies the package to use,

[jira] [Updated] (SPARK-20666) Flaky test - SparkListenerBus randomly failing java.lang.IllegalAccessError

2017-05-10 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-20666: - Description: seeing quite a bit of this on AppVeyor, aka Windows only,-> seems like in other

[jira] [Commented] (SPARK-20228) Random Forest instable results depending on spark.executor.memory

2017-05-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005844#comment-16005844 ] Hyukjin Kwon commented on SPARK-20228: -- gentle ping [~Ansgar Schulze] > Random Forest instable

[jira] [Commented] (SPARK-20369) pyspark: Dynamic configuration with SparkConf does not work

2017-05-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005842#comment-16005842 ] Hyukjin Kwon commented on SPARK-20369: -- I am resolving this as I can't reproduce as above and it

[jira] [Resolved] (SPARK-20369) pyspark: Dynamic configuration with SparkConf does not work

2017-05-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-20369. -- Resolution: Cannot Reproduce > pyspark: Dynamic configuration with SparkConf does not work >

[jira] [Commented] (SPARK-20606) ML 2.2 QA: Remove deprecated methods for ML

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005835#comment-16005835 ] Apache Spark commented on SPARK-20606: -- User 'yanboliang' has created a pull request for this issue:

[jira] [Commented] (SPARK-19354) Killed tasks are getting marked as FAILED

2017-05-10 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005823#comment-16005823 ] Imran Rashid commented on SPARK-19354: -- did a bit more searching -- isn't this fixed by SPARK-20217

[jira] [Commented] (SPARK-19354) Killed tasks are getting marked as FAILED

2017-05-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005817#comment-16005817 ] Thomas Graves commented on SPARK-19354: --- Right from what I've seen not a blacklisting bug. Bug with

[jira] [Commented] (SPARK-20608) Standby namenodes should be allowed to included in yarn.spark.access.namenodes to support HDFS HA

2017-05-10 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005795#comment-16005795 ] Marcelo Vanzin commented on SPARK-20608: You still haven't understood what I'm saying. You should

[jira] [Commented] (SPARK-20682) Support a new faster ORC data source based on Apache ORC

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005789#comment-16005789 ] Apache Spark commented on SPARK-20682: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Commented] (SPARK-20703) Add an operator for writing data out

2017-05-10 Thread Liang-Chi Hsieh (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005780#comment-16005780 ] Liang-Chi Hsieh commented on SPARK-20703: - [~rxin] Thanks for ping me. Sure. I'd love to take

[jira] [Created] (SPARK-20703) Add an operator for writing data out

2017-05-10 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-20703: --- Summary: Add an operator for writing data out Key: SPARK-20703 URL: https://issues.apache.org/jira/browse/SPARK-20703 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-20703) Add an operator for writing data out

2017-05-10 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005765#comment-16005765 ] Reynold Xin commented on SPARK-20703: - cc [~viirya] want to give this a try? > Add an operator for

[jira] [Commented] (SPARK-19354) Killed tasks are getting marked as FAILED

2017-05-10 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005767#comment-16005767 ] Imran Rashid commented on SPARK-19354: -- [~tgraves] I haven't run into this yet -- frankly I still

[jira] [Commented] (SPARK-20200) Flaky Test: org.apache.spark.rdd.LocalCheckpointSuite

2017-05-10 Thread Yuming Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005757#comment-16005757 ] Yuming Wang commented on SPARK-20200: - Can you check it again? it works for me. {code} build/sbt

[jira] [Assigned] (SPARK-20702) TaskContextImpl.markTaskCompleted should not hide the original error

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20702: Assignee: Apache Spark (was: Shixiong Zhu) > TaskContextImpl.markTaskCompleted should

[jira] [Assigned] (SPARK-20702) TaskContextImpl.markTaskCompleted should not hide the original error

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20702: Assignee: Shixiong Zhu (was: Apache Spark) > TaskContextImpl.markTaskCompleted should

[jira] [Commented] (SPARK-20702) TaskContextImpl.markTaskCompleted should not hide the original error

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005722#comment-16005722 ] Apache Spark commented on SPARK-20702: -- User 'zsxwing' has created a pull request for this issue:

[jira] [Created] (SPARK-20702) TaskContextImpl.markTaskCompleted should not hide the original error

2017-05-10 Thread Shixiong Zhu (JIRA)
Shixiong Zhu created SPARK-20702: Summary: TaskContextImpl.markTaskCompleted should not hide the original error Key: SPARK-20702 URL: https://issues.apache.org/jira/browse/SPARK-20702 Project: Spark

[jira] [Resolved] (SPARK-20685) BatchPythonEvaluation UDF evaluator fails for case of single UDF with repeated argument

2017-05-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-20685. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 2.1.2 >

[jira] [Created] (SPARK-20701) dataframe.show has wrong white space when containing Supplement Unicode character

2017-05-10 Thread Pingsan Song (JIRA)
Pingsan Song created SPARK-20701: Summary: dataframe.show has wrong white space when containing Supplement Unicode character Key: SPARK-20701 URL: https://issues.apache.org/jira/browse/SPARK-20701

[jira] [Assigned] (SPARK-20684) expose createGlobalTempView and dropGlobalTempView in SparkR

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20684: Assignee: Apache Spark > expose createGlobalTempView and dropGlobalTempView in SparkR >

[jira] [Assigned] (SPARK-20684) expose createGlobalTempView and dropGlobalTempView in SparkR

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20684: Assignee: (was: Apache Spark) > expose createGlobalTempView and dropGlobalTempView in

[jira] [Commented] (SPARK-20684) expose createGlobalTempView and dropGlobalTempView in SparkR

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005600#comment-16005600 ] Apache Spark commented on SPARK-20684: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Comment Edited] (SPARK-13210) NPE in Sort

2017-05-10 Thread David McWhorter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005591#comment-16005591 ] David McWhorter edited comment on SPARK-13210 at 5/10/17 10:52 PM: --- I

[jira] [Commented] (SPARK-13210) NPE in Sort

2017-05-10 Thread David McWhorter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005591#comment-16005591 ] David McWhorter commented on SPARK-13210: - I suppose that may be a different error actually... >

[jira] [Commented] (SPARK-13210) NPE in Sort

2017-05-10 Thread David McWhorter (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005588#comment-16005588 ] David McWhorter commented on SPARK-13210: - [~srowen] Here's the error from Spark 2.1.1 with

[jira] [Commented] (SPARK-20684) expose createGlobalTempView and dropGlobalTempView in SparkR

2017-05-10 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005553#comment-16005553 ] Dongjoon Hyun commented on SPARK-20684: --- Thank you for confirming! > expose createGlobalTempView

[jira] [Updated] (SPARK-20684) expose createGlobalTempView and dropGlobalTempView in SparkR

2017-05-10 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-20684: --- Summary: expose createGlobalTempView and dropGlobalTempView in SparkR (was: expose

[jira] [Commented] (SPARK-20684) expose createGlobalTempView in SparkR

2017-05-10 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005552#comment-16005552 ] Hossein Falaki commented on SPARK-20684: Yes I agree. > expose createGlobalTempView in SparkR >

[jira] [Commented] (SPARK-20684) expose createGlobalTempView in SparkR

2017-05-10 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005545#comment-16005545 ] Dongjoon Hyun commented on SPARK-20684: --- We need `dropGlobalTempView`, too. > expose

[jira] [Commented] (SPARK-20684) expose createGlobalTempView in SparkR

2017-05-10 Thread Dongjoon Hyun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005527#comment-16005527 ] Dongjoon Hyun commented on SPARK-20684: --- Hi, [~falaki]. I'll make a PR for this. > expose

[jira] [Updated] (SPARK-20700) InferFiltersFromConstraints stackoverflows for query (v2)

2017-05-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-20700: --- Description: The following (complicated) query eventually fails with a stack overflow during

[jira] [Updated] (SPARK-20700) InferFiltersFromConstraints stackoverflows for query (v2)

2017-05-10 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-20700: --- Summary: InferFiltersFromConstraints stackoverflows for query (v2) (was: Expression

[jira] [Created] (SPARK-20700) Expression canonicalization hits stack overflow for query

2017-05-10 Thread Josh Rosen (JIRA)
Josh Rosen created SPARK-20700: -- Summary: Expression canonicalization hits stack overflow for query Key: SPARK-20700 URL: https://issues.apache.org/jira/browse/SPARK-20700 Project: Spark Issue

[jira] [Assigned] (SPARK-20504) ML 2.2 QA: API: Java compatibility, docs

2017-05-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-20504: - Assignee: Weichen Xu (was: Joseph K. Bradley) > ML 2.2 QA: API: Java

[jira] [Created] (SPARK-20699) The end of Python stdout/stderr streams may be lost by PythonRunner

2017-05-10 Thread Nick Gates (JIRA)
Nick Gates created SPARK-20699: -- Summary: The end of Python stdout/stderr streams may be lost by PythonRunner Key: SPARK-20699 URL: https://issues.apache.org/jira/browse/SPARK-20699 Project: Spark

[jira] [Resolved] (SPARK-20698) =, ==, > is not working as expected when used in sql query

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20698. --- Resolution: Invalid Fix Version/s: (was: 1.6.2) This isn't a place to ask people to debug

[jira] [Updated] (SPARK-20698) =, ==, > is not working as expected when used in sql query

2017-05-10 Thread someshwar kale (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] someshwar kale updated SPARK-20698: --- Description: I have written below spark program- its not working as expected {code} package

[jira] [Updated] (SPARK-20698) =, ==, > is not working as expected when used in sql query

2017-05-10 Thread someshwar kale (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] someshwar kale updated SPARK-20698: --- Description: I have written below spark program- its not working as expected

[jira] [Created] (SPARK-20698) =, ==, > is not working as expected when used in sql query

2017-05-10 Thread someshwar kale (JIRA)
someshwar kale created SPARK-20698: -- Summary: =, ==, > is not working as expected when used in sql query Key: SPARK-20698 URL: https://issues.apache.org/jira/browse/SPARK-20698 Project: Spark

[jira] [Comment Edited] (SPARK-19354) Killed tasks are getting marked as FAILED

2017-05-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005189#comment-16005189 ] Thomas Graves edited comment on SPARK-19354 at 5/10/17 6:37 PM:

[jira] [Updated] (SPARK-19354) Killed tasks are getting marked as FAILED

2017-05-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-19354: -- Priority: Major (was: Minor) > Killed tasks are getting marked as FAILED >

[jira] [Commented] (SPARK-19354) Killed tasks are getting marked as FAILED

2017-05-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005189#comment-16005189 ] Thomas Graves commented on SPARK-19354: --- [~squito] wondering if you have seen the issue with

[jira] [Updated] (SPARK-19354) Killed tasks are getting marked as FAILED

2017-05-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-19354: -- Issue Type: Bug (was: Improvement) > Killed tasks are getting marked as FAILED >

[jira] [Commented] (SPARK-19354) Killed tasks are getting marked as FAILED

2017-05-10 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005182#comment-16005182 ] Thomas Graves commented on SPARK-19354: --- This is definitely causing issues with blacklisting.

[jira] [Commented] (SPARK-20687) mllib.Matrices.fromBreeze may crash when converting breeze CSCMatrix

2017-05-10 Thread Ignacio Bermudez Corrales (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16005179#comment-16005179 ] Ignacio Bermudez Corrales commented on SPARK-20687: --- Proposing a patch in PR

[jira] [Comment Edited] (SPARK-12297) Add work-around for Parquet/Hive int96 timestamp bug.

2017-05-10 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004607#comment-16004607 ] Zoltan Ivanfi edited comment on SPARK-12297 at 5/10/17 6:26 PM: bq. It'd

[jira] [Assigned] (SPARK-20504) ML 2.2 QA: API: Java compatibility, docs

2017-05-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley reassigned SPARK-20504: - Assignee: Joseph K. Bradley > ML 2.2 QA: API: Java compatibility, docs >

[jira] [Updated] (SPARK-20697) MSCK REPAIR TABLE resets the Storage Information for bucketed hive tables.

2017-05-10 Thread Abhishek Madav (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Madav updated SPARK-20697: --- Description: MSCK REPAIR TABLE used to recover partitions for a partitioned+bucketed table

[jira] [Created] (SPARK-20697) MSCK REPAIR TABLE resets the Storage Information for bucketed hive tables.

2017-05-10 Thread Abhishek Madav (JIRA)
Abhishek Madav created SPARK-20697: -- Summary: MSCK REPAIR TABLE resets the Storage Information for bucketed hive tables. Key: SPARK-20697 URL: https://issues.apache.org/jira/browse/SPARK-20697

[jira] [Closed] (SPARK-19213) FileSourceScanExec uses SparkSession from HadoopFsRelation creation time instead of the active session at execution time

2017-05-10 Thread Robert Kruszewski (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Kruszewski closed SPARK-19213. - Resolution: Won't Fix Not really necessary and can lead to confusing results >

[jira] [Commented] (SPARK-20687) mllib.Matrices.fromBreeze may crash when converting breeze CSCMatrix

2017-05-10 Thread Ignacio Bermudez Corrales (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004975#comment-16004975 ] Ignacio Bermudez Corrales commented on SPARK-20687: --- When you try to do operations like

[jira] [Resolved] (SPARK-20689) python doctest leaking bucketed table

2017-05-10 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-20689. - Resolution: Fixed Fix Version/s: 2.3.0 > python doctest leaking bucketed table >

[jira] [Commented] (SPARK-20680) Spark-sql do not support for void column datatype of view

2017-05-10 Thread Herman van Hovell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004911#comment-16004911 ] Herman van Hovell commented on SPARK-20680: --- [~jiangxb] Do you have time to work on this? >

[jira] [Commented] (SPARK-20608) Standby namenodes should be allowed to included in yarn.spark.access.namenodes to support HDFS HA

2017-05-10 Thread Yuechen Chen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004778#comment-16004778 ] Yuechen Chen commented on SPARK-20608: -- I know what you mean and that's exactly right. But since

[jira] [Resolved] (SPARK-20696) tf-idf document clustering with K-means in Apache Spark putting points into one cluster

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20696. --- Resolution: Invalid This isn't a good place to ask, as it's almost surely a question about your

[jira] [Created] (SPARK-20696) tf-idf document clustering with K-means in Apache Spark putting points into one cluster

2017-05-10 Thread Nassir (JIRA)
Nassir created SPARK-20696: -- Summary: tf-idf document clustering with K-means in Apache Spark putting points into one cluster Key: SPARK-20696 URL: https://issues.apache.org/jira/browse/SPARK-20696 Project:

[jira] [Commented] (SPARK-5594) SparkException: Failed to get broadcast (TorrentBroadcast)

2017-05-10 Thread Nick Hryhoriev (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004733#comment-16004733 ] Nick Hryhoriev commented on SPARK-5594: --- Hi, i have the same issue. But in spark 2.1. But i can't

[jira] [Updated] (SPARK-20622) Parquet partition discovery for non key=value named directories

2017-05-10 Thread Noam Asor (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Noam Asor updated SPARK-20622: -- Description: h4. Why There are cases where traditional M/R jobs and RDD based Spark jobs writes out

[jira] [Updated] (SPARK-20695) Running multiple TCP socket streams in Spark Shell causes driver error

2017-05-10 Thread Peter Mead (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peter Mead updated SPARK-20695: --- Do you guys ever read the issues? This is a simple spark shell script (dse -u -p yyy spark) which

[jira] [Resolved] (SPARK-20695) Running multiple TCP socket streams in Spark Shell causes driver error

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20695. --- Resolution: Invalid I don't believe that's anything to do with TCP; you are enabling Kryo

[jira] [Closed] (SPARK-20695) Running multiple TCP socket streams in Spark Shell causes driver error

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen closed SPARK-20695. - > Running multiple TCP socket streams in Spark Shell causes driver error >

[jira] [Commented] (SPARK-20542) Add an API into Bucketizer that can bin a lot of columns all at once

2017-05-10 Thread Barry Becker (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004688#comment-16004688 ] Barry Becker commented on SPARK-20542: -- @viirya, your implementation of MultipleBucketizer relies on

[jira] [Created] (SPARK-20695) Running multiple TCP socket streams in Spark Shell causes driver error

2017-05-10 Thread Peter Mead (JIRA)
Peter Mead created SPARK-20695: -- Summary: Running multiple TCP socket streams in Spark Shell causes driver error Key: SPARK-20695 URL: https://issues.apache.org/jira/browse/SPARK-20695 Project: Spark

[jira] [Commented] (SPARK-19447) Fix input metrics for range operator

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004631#comment-16004631 ] Apache Spark commented on SPARK-19447: -- User 'ala' has created a pull request for this issue:

[jira] [Commented] (SPARK-12297) Add work-around for Parquet/Hive int96 timestamp bug.

2017-05-10 Thread Zoltan Ivanfi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004607#comment-16004607 ] Zoltan Ivanfi commented on SPARK-12297: --- bq. It'd be great to consider this more holistically and

[jira] [Assigned] (SPARK-20678) Ndv for columns not in filter condition should also be updated

2017-05-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-20678: --- Assignee: Zhenhua Wang > Ndv for columns not in filter condition should also be updated >

[jira] [Assigned] (SPARK-20694) Document DataFrameWriter partitionBy, bucketBy and sortBy in SQL guide

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20694: Assignee: Apache Spark > Document DataFrameWriter partitionBy, bucketBy and sortBy in SQL

[jira] [Resolved] (SPARK-20678) Ndv for columns not in filter condition should also be updated

2017-05-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20678. - Resolution: Fixed Fix Version/s: 2.3.0 2.2.1 Issue resolved by pull

[jira] [Assigned] (SPARK-20694) Document DataFrameWriter partitionBy, bucketBy and sortBy in SQL guide

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20694: Assignee: (was: Apache Spark) > Document DataFrameWriter partitionBy, bucketBy and

[jira] [Commented] (SPARK-20694) Document DataFrameWriter partitionBy, bucketBy and sortBy in SQL guide

2017-05-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004523#comment-16004523 ] Apache Spark commented on SPARK-20694: -- User 'zero323' has created a pull request for this issue:

[jira] [Created] (SPARK-20694) Document DataFrameWriter partitionBy, bucketBy and sortBy in SQL guide

2017-05-10 Thread Maciej Szymkiewicz (JIRA)
Maciej Szymkiewicz created SPARK-20694: -- Summary: Document DataFrameWriter partitionBy, bucketBy and sortBy in SQL guide Key: SPARK-20694 URL: https://issues.apache.org/jira/browse/SPARK-20694

[jira] [Resolved] (SPARK-20688) correctly check analysis for scalar sub-queries

2017-05-10 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-20688. - Resolution: Fixed Fix Version/s: 2.1.2 2.3.0 2.2.1

[jira] [Created] (SPARK-20693) Kafka+SSL: path for security related files needs to be different for driver and executors

2017-05-10 Thread JIRA
Daniel Lanza García created SPARK-20693: --- Summary: Kafka+SSL: path for security related files needs to be different for driver and executors Key: SPARK-20693 URL:

[jira] [Resolved] (SPARK-20692) unknowing delay in event timeline

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20692. --- Resolution: Invalid This isn't appropriate for a JIRA. Questions should go to

[jira] [Resolved] (SPARK-20393) Strengthen Spark to prevent XSS vulnerabilities

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20393. --- Resolution: Fixed Fix Version/s: 2.3.0 Issue resolved by pull request 17686

[jira] [Assigned] (SPARK-20393) Strengthen Spark to prevent XSS vulnerabilities

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-20393: - Assignee: Nicholas Marion Labels: security (was: newbie security) Priority: Minor

[jira] [Updated] (SPARK-20692) unknowing delay in event timeline

2017-05-10 Thread Zhiwen Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiwen Sun updated SPARK-20692: --- Description: Spark streaming job with 1s interval. Process time of micro batch suddenly became to

[jira] [Updated] (SPARK-20692) unknowing delay in event timeline

2017-05-10 Thread Zhiwen Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiwen Sun updated SPARK-20692: --- Description: Spark streaming job with 1s interval. Process time of micro batch suddenly became to

[jira] [Updated] (SPARK-20692) unknowing delay in event timeline

2017-05-10 Thread Zhiwen Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Zhiwen Sun updated SPARK-20692: --- Attachment: screenshot-1.png > unknowing delay in event timeline > -

[jira] [Created] (SPARK-20692) unknowing delay in event timeline

2017-05-10 Thread Zhiwen Sun (JIRA)
Zhiwen Sun created SPARK-20692: -- Summary: unknowing delay in event timeline Key: SPARK-20692 URL: https://issues.apache.org/jira/browse/SPARK-20692 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-20433) Security issue with jackson-databind

2017-05-10 Thread David Hodeffi (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004417#comment-16004417 ] David Hodeffi commented on SPARK-20433: --- Did you upgrade json4s? since 3.2.1 is not compatible with

[jira] [Commented] (SPARK-20691) Difference between Storage Memory as seen internally and in web UI

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004413#comment-16004413 ] Sean Owen commented on SPARK-20691: --- I think that we have, unfortunately, not consistently

[jira] [Resolved] (SPARK-20433) Security issue with jackson-databind

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20433. --- Resolution: Not A Problem OK, I don't see evidence that this isn't just an instance of the problem

[jira] [Updated] (SPARK-20687) mllib.Matrices.fromBreeze may crash when converting breeze CSCMatrix

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-20687: -- Priority: Minor (was: Critical) > mllib.Matrices.fromBreeze may crash when converting breeze

[jira] [Commented] (SPARK-20687) mllib.Matrices.fromBreeze may crash when converting breeze CSCMatrix

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004372#comment-16004372 ] Sean Owen commented on SPARK-20687: --- This doesn't say what the problem is. What goes wrong? >

[jira] [Assigned] (SPARK-20637) MappedRDD, FilteredRDD, etc. are still referenced in code comments

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-20637: - Assignee: Michael Mior > MappedRDD, FilteredRDD, etc. are still referenced in code comments >

[jira] [Resolved] (SPARK-20637) MappedRDD, FilteredRDD, etc. are still referenced in code comments

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20637. --- Resolution: Fixed Fix Version/s: 2.2.1 Issue resolved by pull request 17900

[jira] [Resolved] (SPARK-20630) Thread Dump link available in Executors tab irrespective of spark.ui.threadDumpsEnabled

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-20630. --- Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17904

[jira] [Assigned] (SPARK-20630) Thread Dump link available in Executors tab irrespective of spark.ui.threadDumpsEnabled

2017-05-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reassigned SPARK-20630: - Assignee: Alex Bozarth Affects Version/s: (was: 2.3.0)

[jira] [Assigned] (SPARK-20631) LogisticRegression._checkThresholdConsistency should use values not Params

2017-05-10 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang reassigned SPARK-20631: --- Assignee: Maciej Szymkiewicz > LogisticRegression._checkThresholdConsistency should use

[jira] [Resolved] (SPARK-20631) LogisticRegression._checkThresholdConsistency should use values not Params

2017-05-10 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-20631. - Resolution: Fixed Fix Version/s: 2.2.0 2.1.2 2.0.3

[jira] [Commented] (SPARK-20638) Optimize the CartesianRDD to reduce repeatedly data fetching

2017-05-10 Thread Teng Jiang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16004342#comment-16004342 ] Teng Jiang commented on SPARK-20638: A further 88x improvement is shown in my PR comment.
