[jira] [Assigned] (SPARK-18726) Filesystem unnecessarily scanned twice during creation of non-catalog table

2017-03-02 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan reassigned SPARK-18726: --- Assignee: Song Jun > Filesystem unnecessarily scanned twice during creation of non-catalog

[jira] [Resolved] (SPARK-18726) Filesystem unnecessarily scanned twice during creation of non-catalog table

2017-03-02 Thread Wenchen Fan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenchen Fan resolved SPARK-18726. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17081

[jira] [Commented] (SPARK-19371) Cannot spread cached partitions evenly across executors

2017-03-02 Thread Paul Lysak (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893877#comment-15893877 ] Paul Lysak commented on SPARK-19371: I'm observing similar behavior in Spark 2.1 - unfortunately, due

[jira] [Commented] (SPARK-19339) StatFunctions.multipleApproxQuantiles can give NoSuchElementException: next on empty iterator

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19339?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893858#comment-15893858 ] Nick Pentreath commented on SPARK-19339: This should be addressed by SPARK-19573 - empty (or all

[jira] [Commented] (SPARK-19714) Bucketizer Bug Regarding Handling Unbucketed Inputs

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19714?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893821#comment-15893821 ] Nick Pentreath commented on SPARK-19714: If you feel that handling values outside the bucket

[jira] [Commented] (SPARK-19747) Consolidate code in ML aggregators

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893811#comment-15893811 ] Nick Pentreath commented on SPARK-19747: Also agree we should be able to extract out the penalty

[jira] [Commented] (SPARK-19747) Consolidate code in ML aggregators

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893810#comment-15893810 ] Nick Pentreath commented on SPARK-19747: [~yuhaoyan] for {{SGDClassifier}} it would be

[jira] [Closed] (SPARK-18478) Support codegen for Hive UDFs

2017-03-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18478?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Takeshi Yamamuro closed SPARK-18478. Resolution: Won't Fix > Support codegen for Hive UDFs

[jira] [Assigned] (SPARK-19779) structured streaming exist needless tmp file

2017-03-02 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu reassigned SPARK-19779: Assignee: Feng Gui > structured streaming exist needless tmp file >

[jira] [Resolved] (SPARK-19779) structured streaming exist needless tmp file

2017-03-02 Thread Shixiong Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shixiong Zhu resolved SPARK-19779. Resolution: Fixed Fix Version/s: 2.2.0, 2.1.1, 2.0.3

[jira] [Updated] (SPARK-19805) Log the row type when query result dose not match

2017-03-02 Thread Genmao Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Genmao Yu updated SPARK-19805: -- Summary: Log the row type when query result dose not match (was: Log the row type when query result

[jira] [Commented] (SPARK-19796) taskScheduler fails serializing long statements received by thrift server

2017-03-02 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893736#comment-15893736 ] Shivaram Venkataraman commented on SPARK-19796: --- I think (a) is worth exploring in a new

[jira] [Assigned] (SPARK-19806) PySpark GLR supports tweedie distribution

2017-03-02 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang reassigned SPARK-19806: --- Assignee: Yanbo Liang > PySpark GLR supports tweedie distribution >

[jira] [Assigned] (SPARK-19806) PySpark GLR supports tweedie distribution

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19806: Assignee: (was: Apache Spark) > PySpark GLR supports tweedie distribution >

[jira] [Assigned] (SPARK-19806) PySpark GLR supports tweedie distribution

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19806: Assignee: Apache Spark > PySpark GLR supports tweedie distribution >

[jira] [Commented] (SPARK-19806) PySpark GLR supports tweedie distribution

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893733#comment-15893733 ] Apache Spark commented on SPARK-19806: -- User 'yanboliang' has created a pull request for this issue:

[jira] [Created] (SPARK-19806) PySpark GLR supports tweedie distribution

2017-03-02 Thread Yanbo Liang (JIRA)
Yanbo Liang created SPARK-19806: --- Summary: PySpark GLR supports tweedie distribution Key: SPARK-19806 URL: https://issues.apache.org/jira/browse/SPARK-19806 Project: Spark Issue Type:

[jira] [Commented] (SPARK-15474) ORC data source fails to write and read back empty dataframe

2017-03-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893713#comment-15893713 ] Hyukjin Kwon commented on SPARK-15474: -- Let me leave some pointer -

[jira] [Commented] (SPARK-10294) When Parquet writer's close method throws an exception, we will call close again and trigger a NPE

2017-03-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893709#comment-15893709 ] Hyukjin Kwon commented on SPARK-10294: -- Maybe, we could resolve this as a duplicate of SPARK-13127

[jira] [Commented] (SPARK-10294) When Parquet writer's close method throws an exception, we will call close again and trigger a NPE

2017-03-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893706#comment-15893706 ] Hyukjin Kwon commented on SPARK-10294: -- Hi [~yhuai], it seems this issue refers PARQUET-544 which is

[jira] [Commented] (SPARK-15474) ORC data source fails to write and read back empty dataframe

2017-03-02 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893703#comment-15893703 ] Nicholas Chammas commented on SPARK-15474: -- cc [~owen.omalley] > ORC data source fails to

[jira] [Assigned] (SPARK-19805) Log the row type when query result dose match

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19805: Assignee: Apache Spark > Log the row type when query result dose match >

[jira] [Commented] (SPARK-19805) Log the row type when query result dose match

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893693#comment-15893693 ] Apache Spark commented on SPARK-19805: -- User 'uncleGen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19805) Log the row type when query result dose match

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19805: Assignee: (was: Apache Spark) > Log the row type when query result dose match >

[jira] [Created] (SPARK-19805) Log the row type when query result dose match

2017-03-02 Thread Genmao Yu (JIRA)
Genmao Yu created SPARK-19805: - Summary: Log the row type when query result dose match Key: SPARK-19805 URL: https://issues.apache.org/jira/browse/SPARK-19805 Project: Spark Issue Type:

[jira] [Commented] (SPARK-15474) ORC data source fails to write and read back empty dataframe

2017-03-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893691#comment-15893691 ] Hyukjin Kwon commented on SPARK-15474: -- It seems an issue related with Hive's {{OrcOutputFormat}}.

[jira] [Resolved] (SPARK-19745) SVCAggregator serializes coefficients

2017-03-02 Thread Yanbo Liang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yanbo Liang resolved SPARK-19745. - Resolution: Fixed Fix Version/s: 2.2.0 > SVCAggregator serializes coefficients >

[jira] [Assigned] (SPARK-19803) Flaky BlockManagerProactiveReplicationSuite tests

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19803: Assignee: Apache Spark > Flaky BlockManagerProactiveReplicationSuite tests >

[jira] [Commented] (SPARK-19803) Flaky BlockManagerProactiveReplicationSuite tests

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893640#comment-15893640 ] Apache Spark commented on SPARK-19803: -- User 'uncleGen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19803) Flaky BlockManagerProactiveReplicationSuite tests

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19803?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19803: Assignee: (was: Apache Spark) > Flaky BlockManagerProactiveReplicationSuite tests >

[jira] [Commented] (SPARK-18608) Spark ML algorithms that check RDD cache level for internal caching double-cache data

2017-03-02 Thread zhengruifeng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893596#comment-15893596 ] zhengruifeng commented on SPARK-18608: -- [~mlnick] [~yuhaoyan] [~srowen] I think if we use

[jira] [Commented] (SPARK-19796) taskScheduler fails serializing long statements received by thrift server

2017-03-02 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893584#comment-15893584 ] Mridul Muralidharan commented on SPARK-19796: - I would not prefer (b) - if we are worried

[jira] [Commented] (SPARK-19802) Remote History Server

2017-03-02 Thread Saisai Shao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893580#comment-15893580 ] Saisai Shao commented on SPARK-19802: - Spark's {{ApplicationHistoryProvider}} is pluggable, user

[jira] [Commented] (SPARK-19804) HiveClientImpl does not work with Hive 2.2.0 metastore

2017-03-02 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893579#comment-15893579 ] Marcelo Vanzin commented on SPARK-19804: For posterity, the error you get looks like this:

[jira] [Commented] (SPARK-14698) CREATE FUNCTION cloud not add function to hive metastore

2017-03-02 Thread poseidon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893565#comment-15893565 ] poseidon commented on SPARK-14698: -- [~azeroth2b] I think in spark 1.6.1, author do it on purpose. If

[jira] [Closed] (SPARK-19349) Check resource ready to avoid multiple receivers to be scheduled on the same node.

2017-03-02 Thread Genmao Yu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Genmao Yu closed SPARK-19349. - Resolution: Won't Fix > Check resource ready to avoid multiple receivers to be scheduled on the same >

[jira] [Resolved] (SPARK-19750) Spark UI http -> https redirect error

2017-03-02 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19750. Resolution: Fixed Assignee: Saisai Shao Fix Version/s: 2.1.1

[jira] [Commented] (SPARK-19771) Support OR-AND amplification in Locality Sensitive Hashing (LSH)

2017-03-02 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893508#comment-15893508 ] Yun Ni commented on SPARK-19771: [~merlin] What you are suggesting is to hash each AND hash vector into a

[jira] [Resolved] (SPARK-19276) FetchFailures can be hidden by user (or sql) exception handling

2017-03-02 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-19276. Resolution: Fixed Assignee: Imran Rashid Fix Version/s: 2.2.0 >

[jira] [Created] (SPARK-19804) HiveClientImpl does not work with Hive 2.2.0 metastore

2017-03-02 Thread Marcelo Vanzin (JIRA)
Marcelo Vanzin created SPARK-19804: -- Summary: HiveClientImpl does not work with Hive 2.2.0 metastore Key: SPARK-19804 URL: https://issues.apache.org/jira/browse/SPARK-19804 Project: Spark

[jira] [Commented] (SPARK-19796) taskScheduler fails serializing long statements received by thrift server

2017-03-02 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893376#comment-15893376 ] Kay Ousterhout commented on SPARK-19796: Do you think we should (separately) fix the underlying

[jira] [Assigned] (SPARK-19631) OutputCommitCoordinator should not allow commits for already failed tasks

2017-03-02 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout reassigned SPARK-19631: -- Assignee: Patrick Woody > OutputCommitCoordinator should not allow commits for

[jira] [Resolved] (SPARK-19631) OutputCommitCoordinator should not allow commits for already failed tasks

2017-03-02 Thread Kay Ousterhout (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19631?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kay Ousterhout resolved SPARK-19631. Resolution: Fixed Fix Version/s: 2.2.0 > OutputCommitCoordinator should not allow

[jira] [Commented] (SPARK-18113) Sending AskPermissionToCommitOutput failed, driver enter into task deadloop

2017-03-02 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893222#comment-15893222 ] Andrew Ash commented on SPARK-18113: We discovered another bug related to committing that causes task

[jira] [Commented] (SPARK-19771) Support OR-AND amplification in Locality Sensitive Hashing (LSH)

2017-03-02 Thread Mingjie Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893117#comment-15893117 ] Mingjie Tang commented on SPARK-19771: -- (1) because you need to explode each tuple. For example

[jira] [Comment Edited] (SPARK-19771) Support OR-AND amplification in Locality Sensitive Hashing (LSH)

2017-03-02 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893097#comment-15893097 ] Yun Ni edited comment on SPARK-19771 at 3/2/17 9:55 PM: [~merlin] (1) The

[jira] [Commented] (SPARK-19771) Support OR-AND amplification in Locality Sensitive Hashing (LSH)

2017-03-02 Thread Yun Ni (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893097#comment-15893097 ] Yun Ni commented on SPARK-19771: [~merlin] (1) The computation cost is NumHashFunctions because we go

[jira] [Commented] (SPARK-18454) Changes to improve Nearest Neighbor Search for LSH

2017-03-02 Thread Mingjie Tang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893087#comment-15893087 ] Mingjie Tang commented on SPARK-18454: -- [~yunn] the current multi-probe NNS can be improved without

[jira] [Commented] (SPARK-1693) Dependent on multiple versions of servlet-api jars lead to throw an SecurityException when Spark built for hadoop 2.3.0 , 2.4.0

2017-03-02 Thread Andrew Otto (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893074#comment-15893074 ] Andrew Otto commented on SPARK-1693: We just upgraded to CDH 5.10, which has Spark 1.6.0, Hadoop

[jira] [Comment Edited] (SPARK-1693) Dependent on multiple versions of servlet-api jars lead to throw an SecurityException when Spark built for hadoop 2.3.0 , 2.4.0

2017-03-02 Thread Andrew Otto (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15893074#comment-15893074 ] Andrew Otto edited comment on SPARK-1693 at 3/2/17 9:42 PM: We just upgraded

[jira] [Created] (SPARK-19803) Flaky BlockManagerProactiveReplicationSuite tests

2017-03-02 Thread Sital Kedia (JIRA)
Sital Kedia created SPARK-19803: --- Summary: Flaky BlockManagerProactiveReplicationSuite tests Key: SPARK-19803 URL: https://issues.apache.org/jira/browse/SPARK-19803 Project: Spark Issue Type:

[jira] [Created] (SPARK-19802) Remote History Server

2017-03-02 Thread Ben Barnard (JIRA)
Ben Barnard created SPARK-19802: --- Summary: Remote History Server Key: SPARK-19802 URL: https://issues.apache.org/jira/browse/SPARK-19802 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-19801) Remove JDK7 from Travis CI

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892817#comment-15892817 ] Apache Spark commented on SPARK-19801: -- User 'dongjoon-hyun' has created a pull request for this

[jira] [Assigned] (SPARK-19801) Remove JDK7 from Travis CI

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19801: Assignee: Apache Spark > Remove JDK7 from Travis CI

[jira] [Assigned] (SPARK-19801) Remove JDK7 from Travis CI

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19801: Assignee: (was: Apache Spark) > Remove JDK7 from Travis CI >

[jira] [Created] (SPARK-19801) Remove JDK7 from Travis CI

2017-03-02 Thread Dongjoon Hyun (JIRA)
Dongjoon Hyun created SPARK-19801: - Summary: Remove JDK7 from Travis CI Key: SPARK-19801 URL: https://issues.apache.org/jira/browse/SPARK-19801 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-19720) Redact sensitive information from SparkSubmit console output

2017-03-02 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcelo Vanzin resolved SPARK-19720. Resolution: Fixed Assignee: Mark Grover Fix Version/s: 2.2.0 > Redact

[jira] [Commented] (SPARK-11197) Run SQL query on files directly without create a table

2017-03-02 Thread Ladislav Jech (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-11197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892706#comment-15892706 ] Ladislav Jech commented on SPARK-11197: --- Great stuff! > Run SQL query on files directly without

[jira] [Commented] (SPARK-18699) Spark CSV parsing types other than String throws exception when malformed

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892679#comment-15892679 ] Apache Spark commented on SPARK-18699: -- User 'HyukjinKwon' has created a pull request for this

[jira] [Assigned] (SPARK-19800) Implement one kind of streaming sampling - reservoir sampling

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19800: Assignee: (was: Apache Spark) > Implement one kind of streaming sampling - reservoir

[jira] [Commented] (SPARK-19800) Implement one kind of streaming sampling - reservoir sampling

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892571#comment-15892571 ] Apache Spark commented on SPARK-19800: -- User 'uncleGen' has created a pull request for this issue:

[jira] [Assigned] (SPARK-19800) Implement one kind of streaming sampling - reservoir sampling

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19800: Assignee: Apache Spark > Implement one kind of streaming sampling - reservoir sampling >

[jira] [Created] (SPARK-19800) Implement one kind of streaming sampling - reservoir sampling

2017-03-02 Thread Genmao Yu (JIRA)
Genmao Yu created SPARK-19800: - Summary: Implement one kind of streaming sampling - reservoir sampling Key: SPARK-19800 URL: https://issues.apache.org/jira/browse/SPARK-19800 Project: Spark

[jira] [Commented] (SPARK-19796) taskScheduler fails serializing long statements received by thrift server

2017-03-02 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892547#comment-15892547 ] Imran Rashid commented on SPARK-19796: -- [~kayousterhout] [~shivaram] here's another example of

[jira] [Updated] (SPARK-19766) INNER JOIN on constant alias columns return incorrect results

2017-03-02 Thread Xiao Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li updated SPARK-19766: Fix Version/s: 2.0.3 > INNER JOIN on constant alias columns return incorrect results >

[jira] [Assigned] (SPARK-19796) taskScheduler fails serializing long statements received by thrift server

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19796: Assignee: Apache Spark > taskScheduler fails serializing long statements received by

[jira] [Assigned] (SPARK-19796) taskScheduler fails serializing long statements received by thrift server

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19796: Assignee: (was: Apache Spark) > taskScheduler fails serializing long statements

[jira] [Commented] (SPARK-19796) taskScheduler fails serializing long statements received by thrift server

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892482#comment-15892482 ] Apache Spark commented on SPARK-19796: -- User 'squito' has created a pull request for this issue:

[jira] [Created] (SPARK-19799) Support WITH clause in subqueries

2017-03-02 Thread Giambattista (JIRA)
Giambattista created SPARK-19799: Summary: Support WITH clause in subqueries Key: SPARK-19799 URL: https://issues.apache.org/jira/browse/SPARK-19799 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-19796) taskScheduler fails serializing long statements received by thrift server

2017-03-02 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892390#comment-15892390 ] Imran Rashid commented on SPARK-19796: -- Since it's a regression, I'm making this a blocker for 2.2.0

[jira] [Updated] (SPARK-19796) taskScheduler fails serializing long statements received by thrift server

2017-03-02 Thread Imran Rashid (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-19796: - Priority: Blocker (was: Major) > taskScheduler fails serializing long statements received by

[jira] [Commented] (SPARK-18890) Do all task serialization in CoarseGrainedExecutorBackend thread (rather than TaskSchedulerImpl)

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18890?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892363#comment-15892363 ] Apache Spark commented on SPARK-18890: -- User 'witgo' has created a pull request for this issue:

[jira] [Commented] (SPARK-17080) join reorder

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892325#comment-15892325 ] Apache Spark commented on SPARK-17080: -- User 'wzhfy' has created a pull request for this issue:

[jira] [Assigned] (SPARK-17080) join reorder

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17080: Assignee: Apache Spark > join reorder > > > Key: SPARK-17080

[jira] [Assigned] (SPARK-17080) join reorder

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-17080: Assignee: (was: Apache Spark) > join reorder > > > Key:

[jira] [Created] (SPARK-19798) Query returns stale results when tables are modified on other sessions

2017-03-02 Thread Giambattista (JIRA)
Giambattista created SPARK-19798: Summary: Query returns stale results when tables are modified on other sessions Key: SPARK-19798 URL: https://issues.apache.org/jira/browse/SPARK-19798 Project:

[jira] [Commented] (SPARK-18769) Spark to be smarter about what the upper bound is and to restrict number of executor when dynamic allocation is enabled

2017-03-02 Thread Thomas Graves (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892318#comment-15892318 ] Thomas Graves commented on SPARK-18769: --- I definitely understand there is an actual problem here,

[jira] [Resolved] (SPARK-19345) Add doc for "coldStartStrategy" usage in ALS

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-19345. Resolution: Fixed Fix Version/s: 2.2.0 > Add doc for "coldStartStrategy" usage in

[jira] [Updated] (SPARK-19345) Add doc for "coldStartStrategy" usage in ALS

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-19345: --- Priority: Minor (was: Major) > Add doc for "coldStartStrategy" usage in ALS >

[jira] [Commented] (SPARK-19797) ML pipelines document error

2017-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892216#comment-15892216 ] Sean Owen commented on SPARK-19797: --- Yes, it's not true of scoring though, and the difference in

[jira] [Comment Edited] (SPARK-19797) ML pipelines document error

2017-03-02 Thread Zhe Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892189#comment-15892189 ] Zhe Sun edited comment on SPARK-19797 at 3/2/17 12:52 PM: -- Hi Sean, thanks for

[jira] [Commented] (SPARK-19797) ML pipelines document error

2017-03-02 Thread Zhe Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892189#comment-15892189 ] Zhe Sun commented on SPARK-19797: - Hi Sean, thanks for your quick reply. bq. If the Pipeline had more

[jira] [Assigned] (SPARK-19797) ML pipelines document error

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19797: Assignee: Apache Spark > ML pipelines document error

[jira] [Assigned] (SPARK-19797) ML pipelines document error

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19797: Assignee: (was: Apache Spark) > ML pipelines document error >

[jira] [Commented] (SPARK-19503) Execution Plan Optimizer: avoid sort or shuffle when it does not change end result such as df.sort(...).count()

2017-03-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892169#comment-15892169 ] Takeshi Yamamuro commented on SPARK-19503: -- I'm not sure this should be fixed though, postgresql

[jira] [Comment Edited] (SPARK-19503) Execution Plan Optimizer: avoid sort or shuffle when it does not change end result such as df.sort(...).count()

2017-03-02 Thread Takeshi Yamamuro (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892169#comment-15892169 ] Takeshi Yamamuro edited comment on SPARK-19503 at 3/2/17 12:39 PM: --- I'm

[jira] [Commented] (SPARK-19797) ML pipelines document error

2017-03-02 Thread Zhe Sun (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892170#comment-15892170 ] Zhe Sun commented on SPARK-19797: - A pull request was created https://github.com/apache/spark/pull/17137

[jira] [Commented] (SPARK-19797) ML pipelines document error

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892168#comment-15892168 ] Apache Spark commented on SPARK-19797: -- User 'ymwdalex' has created a pull request for this issue:

[jira] [Commented] (SPARK-19797) ML pipelines document error

2017-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892157#comment-15892157 ] Sean Owen commented on SPARK-19797: --- Hm, on second look, the placement of the sentence suggest it

[jira] [Resolved] (SPARK-19778) alais cannot use in group by

2017-03-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-19778. -- Resolution: Duplicate I am resolving this as a duplicate of SPARK-14471 Please reopen this if

[jira] [Commented] (SPARK-19797) ML pipelines document error

2017-03-02 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892149#comment-15892149 ] Sean Owen commented on SPARK-19797: --- I don't think that's true. The resulting pipeline would contain a

[jira] [Assigned] (SPARK-19783) Treat shorter/longer lengths of tokens as malformed records in CSV parser

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19783: Assignee: Apache Spark > Treat shorter/longer lengths of tokens as malformed records in

[jira] [Assigned] (SPARK-19783) Treat shorter/longer lengths of tokens as malformed records in CSV parser

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-19783: Assignee: (was: Apache Spark) > Treat shorter/longer lengths of tokens as malformed

[jira] [Commented] (SPARK-19783) Treat shorter/longer lengths of tokens as malformed records in CSV parser

2017-03-02 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15892146#comment-15892146 ] Apache Spark commented on SPARK-19783: -- User 'maropu' has created a pull request for this issue:

[jira] [Created] (SPARK-19797) ML pipelines document error

2017-03-02 Thread Zhe Sun (JIRA)
Zhe Sun created SPARK-19797: --- Summary: ML pipelines document error Key: SPARK-19797 URL: https://issues.apache.org/jira/browse/SPARK-19797 Project: Spark Issue Type: Bug Components: ML

[jira] [Updated] (SPARK-19704) AFTSurvivalRegression should support numeric censorCol

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-19704: --- Fix Version/s: 2.2.0 > AFTSurvivalRegression should support numeric censorCol >

[jira] [Assigned] (SPARK-19704) AFTSurvivalRegression should support numeric censorCol

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-19704: -- Assignee: zhengruifeng > AFTSurvivalRegression should support numeric censorCol >

[jira] [Resolved] (SPARK-19704) AFTSurvivalRegression should support numeric censorCol

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-19704. Resolution: Fixed > AFTSurvivalRegression should support numeric censorCol >

[jira] [Assigned] (SPARK-19733) ALS performs unnecessary casting on item and user ids

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath reassigned SPARK-19733: -- Assignee: Vasilis Vryniotis > ALS performs unnecessary casting on item and user ids >

[jira] [Resolved] (SPARK-19733) ALS performs unnecessary casting on item and user ids

2017-03-02 Thread Nick Pentreath (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-19733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath resolved SPARK-19733. Resolution: Fixed Fix Version/s: 2.2.0 Issue resolved by pull request 17059
