[GitHub] spark pull request #15954: [WIP][SPARK-18516][SQL] Split state and progress ...

2016-11-29 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15954#discussion_r90083413 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQuery.scala --- @@ -38,11 +40,11 @@ trait StreamingQuery { def name: String

[GitHub] spark pull request #15954: [WIP][SPARK-18516][SQL] Split state and progress ...

2016-11-29 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15954#discussion_r90082045 --- Diff: python/pyspark/sql/streaming.py --- @@ -87,6 +88,24 @@ def awaitTermination(self, timeout=None): else: return self._j

[GitHub] spark pull request #15954: [WIP][SPARK-18516][SQL] Split state and progress ...

2016-11-29 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15954#discussion_r90084842 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala --- @@ -279,3 +287,8 @@ class StreamingQueryManager private[sq

[GitHub] spark pull request #15954: [WIP][SPARK-18516][SQL] Split state and progress ...

2016-11-29 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15954#discussion_r90085518 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryProgress.scala --- @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15954: [WIP][SPARK-18516][SQL] Split state and progress ...

2016-11-29 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15954#discussion_r90085377 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryProgress.scala --- @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15954: [WIP][SPARK-18516][SQL] Split state and progress ...

2016-11-29 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15954#discussion_r90084970 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryProgress.scala --- @@ -0,0 +1,113 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #15954: [WIP][SPARK-18516][SQL] Split state and progress ...

2016-11-29 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15954#discussion_r90083627 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQuery.scala --- @@ -51,7 +53,7 @@ trait StreamingQuery { def sparkSession:

[GitHub] spark pull request #15954: [WIP][SPARK-18516][SQL] Split state and progress ...

2016-11-29 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15954#discussion_r90086100 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamTest.scala --- @@ -669,55 +658,48 @@ trait StreamTest extends QueryTest with SharedS

[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] head() and show() for Columns

2016-11-29 Thread falaki
Github user falaki commented on the issue: https://github.com/apache/spark/pull/11336 I did another pass. It looks good to me. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature ena

[GitHub] spark pull request #15954: [WIP][SPARK-18516][SQL] Split state and progress ...

2016-11-29 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15954#discussion_r90084420 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala --- @@ -81,30 +83,30 @@ object StreamingQueryListener {

[GitHub] spark pull request #15954: [WIP][SPARK-18516][SQL] Split state and progress ...

2016-11-29 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15954#discussion_r90084872 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala --- @@ -279,3 +287,8 @@ class StreamingQueryManager private[sq

[GitHub] spark issue #15972: [SPARK-18319][ML][QA2.1] 2.1 QA: API: Experimental, Deve...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15972 Merged build finished. Test PASSed.

[GitHub] spark issue #15972: [SPARK-18319][ML][QA2.1] 2.1 QA: API: Experimental, Deve...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15972 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69338/ Test PASSed. ---

[GitHub] spark issue #15972: [SPARK-18319][ML][QA2.1] 2.1 QA: API: Experimental, Deve...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15972 **[Test build #69338 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69338/consoleFull)** for PR 15972 at commit [`1907019`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16030 Merged build finished. Test PASSed.

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16030 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69336/ Test PASSed. ---

[GitHub] spark pull request #15923: [SPARK-4105] retry the fetch or stage if shuffle ...

2016-11-29 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/15923#discussion_r90085570 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -56,8 +59,10 @@ final class ShuffleBlockFetcherIterator(

[GitHub] spark issue #15923: [SPARK-4105] retry the fetch or stage if shuffle block i...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15923 **[Test build #3443 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3443/consoleFull)** for PR 15923 at commit [`b3e1786`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #15923: [SPARK-4105] retry the fetch or stage if shuffle ...

2016-11-29 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/15923#discussion_r90085604 --- Diff: core/src/main/scala/org/apache/spark/storage/ShuffleBlockFetcherIterator.scala --- @@ -305,40 +312,82 @@ final class ShuffleBlockFetcherIterator(

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16030 **[Test build #69336 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69336/consoleFull)** for PR 16030 at commit [`43b4eb0`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16061: [SPARK-18278] [Scheduler] Support native submission of s...

2016-11-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/16061 @erikerlandson For the RAT failure, you may either add the Apache license header to the newly added files or add the files to `dev/.rat-excludes`.

[GitHub] spark issue #16044: [Spark-18614][SQL] Incorrect predicate pushdown from Exi...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16044 **[Test build #69341 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69341/consoleFull)** for PR 16044 at commit [`d4002c7`](https://github.com/apache/spark/commit/d

[GitHub] spark issue #16062: [SPARK-18629][SQL] Fix numPartition of JDBCSuite Testcas...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16062 **[Test build #69340 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69340/consoleFull)** for PR 16062 at commit [`30c5d6f`](https://github.com/apache/spark/commit/3

[GitHub] spark pull request #16062: [SPARK-18629][SQL] Fix numPartition of JDBCSuite ...

2016-11-29 Thread weiqingy
GitHub user weiqingy opened a pull request: https://github.com/apache/spark/pull/16062 [SPARK-18629][SQL] Fix numPartition of JDBCSuite Testcase ## What changes were proposed in this pull request? Fix numPartition of JDBCSuite Testcase. ## How was this patch tested?

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15979 My only concern is that "non-flat" type is neither intuitive nor a well-known term. In fact, this PR only prevents `Option[T <: Product]` from being a top-level Dataset type. How about just calling them "P

[GitHub] spark pull request #16044: [Spark-18614][SQL] Incorrect predicate pushdown f...

2016-11-29 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16044#discussion_r90083465 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala --- @@ -575,6 +575,24 @@ class JoinSuite extends QueryTest with SharedSQLContext {

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread liancheng
Github user liancheng commented on the issue: https://github.com/apache/spark/pull/15979 LGTM, merging to master. Thanks!

[GitHub] spark issue #15843: [SPARK-18274][ML][PYSPARK] Memory leak in PySpark JavaWr...

2016-11-29 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15843 I agree, for a follow up (so we don't lose track of it) - I've created SPARK-18630 but option 1 for now is a strict improvement over the current situation. Thanks for all of your work on this @techa

[GitHub] spark issue #15505: [SPARK-17931][CORE] taskScheduler has some unneeded seri...

2016-11-29 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/15505 @witgo I don't mind moving the serialization out of resourceOffer, but I do think it's helpful to separate that from the too-many-objects-serialized issue. Smaller PRs are easier for folks to

[GitHub] spark issue #16048: [DO_NOT_MERGE]Test kafka deletion

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16048 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69339/ Test FAILed. ---

[GitHub] spark issue #16048: [DO_NOT_MERGE]Test kafka deletion

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16048 **[Test build #69339 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69339/consoleFull)** for PR 16048 at commit [`9ff2ed4`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16048: [DO_NOT_MERGE]Test kafka deletion

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16048 Merged build finished. Test FAILed.

[GitHub] spark issue #16038: [SPARK-18471][CORE] New treeAggregate overload for big l...

2016-11-29 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16038 I see, you are saying that encoding an actually-dense vector as sparse is somewhat less efficient, and if the implementations happened to make that choice given sparse input, could be bad. In that ca

[GitHub] spark issue #15817: [SPARK-18366][PYSPARK][ML] Add handleInvalid to Pyspark ...

2016-11-29 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/15817 LGTM given our planned follow up to update the documentation for both Python and Scala.

[GitHub] spark issue #8318: [SPARK-1267][PYSPARK] Adds pip installer for pyspark

2016-11-29 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/8318 Since https://github.com/apache/spark/pull/15659 got merged, would you be ok with closing this @alope107? Thanks for your work on this :)

[GitHub] spark issue #16048: [DO_NOT_MERGE]Test kafka deletion

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16048 **[Test build #69339 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69339/consoleFull)** for PR 16048 at commit [`9ff2ed4`](https://github.com/apache/spark/commit/9

[GitHub] spark pull request #16017: [SPARK-18592][ML] Move DT/RF/GBT Param setter met...

2016-11-29 Thread jkbradley
Github user jkbradley commented on a diff in the pull request: https://github.com/apache/spark/pull/16017#discussion_r90072953 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/treeParams.scala --- @@ -107,25 +107,41 @@ private[ml] trait DecisionTreeParams extends PredictorPa

[GitHub] spark issue #16058: [SPARK-18291][SparkR][ML][FOLLOW-UP] Encode probability ...

2016-11-29 Thread felixcheung
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/16058 LGTM

[GitHub] spark issue #16055: [SPARK-17897] [SQL] Attribute is not NullIntolerant

2016-11-29 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16055 The above fix does not cover all the cases. Found the root cause. The `constraints` of an operator are the expressions that evaluate to `true` for all the rows produced. That means, the ex

[GitHub] spark issue #15972: [SPARK-18319][ML][QA2.1] 2.1 QA: API: Experimental, Deve...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15972 **[Test build #69338 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69338/consoleFull)** for PR 15972 at commit [`1907019`](https://github.com/apache/spark/commit/1

[GitHub] spark issue #16061: [SPARK-18278] [Scheduler] Support native submission of s...

2016-11-29 Thread tnachen
Github user tnachen commented on the issue: https://github.com/apache/spark/pull/16061 @rxin Makes sense. @srowen also talked about starting the discussion of having better support for external cluster managers as well.

[GitHub] spark issue #16061: [SPARK-18278] [Scheduler] Support native submission of s...

2016-11-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16061 One thing: can we submit a separate PR to move all resource managers into resource-managers/yarn, resource-managers/mesos?

[GitHub] spark issue #16061: [SPARK-18278] [Scheduler] Support native submission of s...

2016-11-29 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16061 This is pretty cool.

[GitHub] spark issue #15998: [SPARK-18572][SQL] Add a method `listPartitionNames` to ...

2016-11-29 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/15998 @mallman I'll take a look today. On Tue, Nov 29, 2016, 9:45 AM Michael Allman wrote: > Hi Guys, > > Repeating my comment/query for @ericl. I'm

[GitHub] spark issue #16038: [SPARK-18471][CORE] New treeAggregate overload for big l...

2016-11-29 Thread AnthonyTruchet
Github user AnthonyTruchet commented on the issue: https://github.com/apache/spark/pull/16038 Actually, aggregating (and thus sending over the network) quite dense SparseVectors with tens of millions of elements is not to be taken lightly. This would require serious benchmarking. What I tell

[GitHub] spark issue #13909: [SPARK-16213][SQL] Reduce runtime overhead of a program ...

2016-11-29 Thread kiszk
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/13909 ping @cloud-fan

[GitHub] spark issue #16038: [SPARK-18471][CORE] New treeAggregate overload for big l...

2016-11-29 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16038 Right now, everything is dense here, right? That's the worst case. Your goal is to avoid serializing a dense zero vector and I say it can just be sparse, which solves the immediate problem. From ther

[GitHub] spark pull request #16046: [SPARK-18582][SQL] Whitelist LogicalPlan operator...

2016-11-29 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16046#discussion_r90065066 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1120,47 +1173,54 @@ class Analyzer( }

[GitHub] spark issue #15982: [SPARK-18546][core] Fix merging shuffle spills when usin...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15982 **[Test build #69337 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69337/consoleFull)** for PR 15982 at commit [`2e03ee6`](https://github.com/apache/spark/commit/2

[GitHub] spark issue #16038: [SPARK-18471][CORE] New treeAggregate overload for big l...

2016-11-29 Thread AnthonyTruchet
Github user AnthonyTruchet commented on the issue: https://github.com/apache/spark/pull/16038 The big doubt I have is about *Of course, the first call to axpy should produce a dense vector, but, that's already on the executor*: the other operand is Sparse and has to be sparse, and this a

[GitHub] spark pull request #16046: [SPARK-18582][SQL] Whitelist LogicalPlan operator...

2016-11-29 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16046#discussion_r90064101 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1120,47 +1173,54 @@ class Analyzer( }

[GitHub] spark issue #15998: [SPARK-18572][SQL] Add a method `listPartitionNames` to ...

2016-11-29 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/15998 Hi Guys, Repeating my comment/query for @ericl. I'm hoping someone can provide affirmation/refutation to my question before I proceed with new unit tests. I've run some tests to com

[GitHub] spark pull request #14638: [SPARK-11374][SQL] Support `skip.header.line.coun...

2016-11-29 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14638#discussion_r90063348 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -113,6 +113,10 @@ class HadoopTableReader( val tablePath =

[GitHub] spark pull request #14638: [SPARK-11374][SQL] Support `skip.header.line.coun...

2016-11-29 Thread srowen
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/14638#discussion_r90063628 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala --- @@ -122,10 +126,20 @@ class HadoopTableReader( val attrsWithIndex =

[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...

2016-11-29 Thread tnachen
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/16061#discussion_r90063528 --- Diff: kubernetes/src/main/scala/org/apache/spark/scheduler/cluster/kubernetes/KubernetesClusterScheduler.scala --- @@ -0,0 +1,236 @@ +/* + * Lic

[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...

2016-11-29 Thread tnachen
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/16061#discussion_r90063456 --- Diff: kubernetes/src/main/scala/org/apache/spark/scheduler/cluster/kubernetes/KubernetesClusterSchedulerBackend.scala --- @@ -0,0 +1,222 @@ +/*

[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...

2016-11-29 Thread tnachen
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/16061#discussion_r90063380 --- Diff: kubernetes/src/main/scala/org/apache/spark/scheduler/cluster/kubernetes/KubernetesClusterSchedulerBackend.scala --- @@ -0,0 +1,222 @@ +/*

[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69334/ Test PASSed. ---

[GitHub] spark issue #16038: [SPARK-18471][CORE] New treeAggregate overload for big l...

2016-11-29 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16038 It might be as simple as writing `Vectors.sparse(n, Seq())` as the zero value. Everything else appears to operate on a Vector that's either sparse or dense then. (Of course, the first call to axpy sh

[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14638 **[Test build #69334 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69334/consoleFull)** for PR 14638 at commit [`3c06aa6`](https://github.com/apache/spark/commit/

[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...

2016-11-29 Thread tnachen
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/16061#discussion_r90062695 --- Diff: dev/make-distribution.sh --- @@ -154,7 +154,9 @@ export MAVEN_OPTS="${MAVEN_OPTS:--Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCac # Store the

[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...

2016-11-29 Thread tnachen
Github user tnachen commented on a diff in the pull request: https://github.com/apache/spark/pull/16061#discussion_r90062639 --- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala --- @@ -596,6 +599,26 @@ object SparkSubmit extends CommandLineUtils { }

[GitHub] spark issue #15924: [SPARK-18498] [SQL] Revise HDFSMetadataLog API for bette...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15924 Merged build finished. Test PASSed.

[GitHub] spark issue #15924: [SPARK-18498] [SQL] Revise HDFSMetadataLog API for bette...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15924 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69333/ Test PASSed. ---

[GitHub] spark issue #15924: [SPARK-18498] [SQL] Revise HDFSMetadataLog API for bette...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15924 **[Test build #69333 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69333/consoleFull)** for PR 15924 at commit [`8e3d705`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16038: [SPARK-18471][CORE] New treeAggregate overload for big l...

2016-11-29 Thread AnthonyTruchet
Github user AnthonyTruchet commented on the issue: https://github.com/apache/spark/pull/16038 Yes, sure, the zero vector is very sparse. But I do not get your suggestion. I see no way to pass a Sparse Vector as zero and have the type of vector change along the way to Dense Vector with onl

[GitHub] spark issue #15120: [SPARK-4563][core] Allow driver to advertise a different...

2016-11-29 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/15120 > Do you know if there are any alternatives while the commit isn't released? Not really.

[GitHub] spark pull request #16060: [SPARK-18220][SQL] read Hive orc table with varch...

2016-11-29 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16060#discussion_r90060096 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala --- @@ -51,9 +51,12 @@ private[spark] object HiveUtils extends Logging { sc

[GitHub] spark pull request #16060: [SPARK-18220][SQL] read Hive orc table with varch...

2016-11-29 Thread rxin
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/16060#discussion_r90059903 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala --- @@ -51,9 +51,12 @@ private[spark] object HiveUtils extends Logging { sc

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16030 **[Test build #69336 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69336/consoleFull)** for PR 16030 at commit [`43b4eb0`](https://github.com/apache/spark/commit/4

[GitHub] spark pull request #16057: [SPARK-18624][SQL] Implicit cast complex types

2016-11-29 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16057#discussion_r8343 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -714,6 +714,17 @@ object TypeCoercion {

[GitHub] spark pull request #16057: [SPARK-18624][SQL] Implicit cast complex types

2016-11-29 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16057#discussion_r90012747 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -714,6 +714,17 @@ object TypeCoercion {

[GitHub] spark pull request #16057: [SPARK-18624][SQL] Implicit cast complex types

2016-11-29 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16057#discussion_r90011683 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala --- @@ -714,6 +714,17 @@ object TypeCoercion {

[GitHub] spark issue #16055: [SPARK-17897] [SQL] Attribute is not NullIntolerant

2016-11-29 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16055 This does not resolve all the cases. Will submit a better fix today. :)

[GitHub] spark pull request #16055: [SPARK-17897] [SQL] Attribute is not NullIntolera...

2016-11-29 Thread gatorsmile
Github user gatorsmile closed the pull request at: https://github.com/apache/spark/pull/16055

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Merged build finished. Test PASSed.

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15979 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69328/ Test PASSed. ---

[GitHub] spark issue #15979: [SPARK-18251][SQL] the type of Dataset can't be Option o...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15979 **[Test build #69328 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69328/consoleFull)** for PR 15979 at commit [`01b072d`](https://github.com/apache/spark/commit/

[GitHub] spark issue #15780: [SPARK-18284][SQL] Make ExpressionEncoder.serializer.nul...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15780 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69332/ Test FAILed. ---

[GitHub] spark issue #15780: [SPARK-18284][SQL] Make ExpressionEncoder.serializer.nul...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15780 Merged build finished. Test FAILed.

[GitHub] spark issue #15780: [SPARK-18284][SQL] Make ExpressionEncoder.serializer.nul...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15780 **[Test build #69332 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69332/consoleFull)** for PR 15780 at commit [`d409d46`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16061: [SPARK-18278] [Scheduler] Support native submission of s...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16061 **[Test build #69335 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69335/consoleFull)** for PR 16061 at commit [`8584913`](https://github.com/apache/spark/commit/

[GitHub] spark issue #16061: [SPARK-18278] [Scheduler] Support native submission of s...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16061 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69335/ Test FAILed. ---

[GitHub] spark issue #16061: [SPARK-18278] [Scheduler] Support native submission of s...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16061 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature e

[GitHub] spark issue #16061: [SPARK-18278] [Scheduler] Support native submission of s...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16061 **[Test build #69335 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69335/consoleFull)** for PR 16061 at commit [`8584913`](https://github.com/apache/spark/commit/8

[GitHub] spark pull request #16046: [SPARK-18582][SQL] Whitelist LogicalPlan operator...

2016-11-29 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request: https://github.com/apache/spark/pull/16046#discussion_r90043228 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1120,47 +1173,54 @@ class Analyzer( }

[GitHub] spark pull request #16061: [SPARK-18278] [Scheduler] Support native submissi...

2016-11-29 Thread erikerlandson
GitHub user erikerlandson opened a pull request: https://github.com/apache/spark/pull/16061 [SPARK-18278] [Scheduler] Support native submission of spark jobs to a kubernetes cluster ## What changes were proposed in this pull request? Add support for native submission of spark jo

[GitHub] spark pull request #16044: [Spark-18614][SQL] Incorrect predicate pushdown f...

2016-11-29 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/16044#discussion_r90042593 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala --- @@ -575,6 +575,24 @@ class JoinSuite extends QueryTest with SharedSQLContext

[GitHub] spark issue #7075: [SPARK-8674] [MLlib] Implementation of a 2 sample Kolmogo...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/7075 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69329/ Test FAILed. --- I

[GitHub] spark issue #7075: [SPARK-8674] [MLlib] Implementation of a 2 sample Kolmogo...

2016-11-29 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/7075 Build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled a

[GitHub] spark issue #7075: [SPARK-8674] [MLlib] Implementation of a 2 sample Kolmogo...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/7075 [Test build #69329 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69329/console) for PR 7075 at commit [`feacda0`](https://github.com/apache/spark/commit/feacda

[GitHub] spark pull request #16044: [Spark-18614][SQL] Incorrect predicate pushdown f...

2016-11-29 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16044#discussion_r90040908 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/JoinSuite.scala --- @@ -575,6 +575,24 @@ class JoinSuite extends QueryTest with SharedSQLContext {

[GitHub] spark pull request #15780: [SPARK-18284][SQL] Make ExpressionEncoder.seriali...

2016-11-29 Thread kiszk
Github user kiszk commented on a diff in the pull request: https://github.com/apache/spark/pull/15780#discussion_r90040682 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala --- @@ -590,7 +591,11 @@ object ScalaReflection extends ScalaReflect

[GitHub] spark pull request #16046: [SPARK-18582][SQL] Whitelist LogicalPlan operator...

2016-11-29 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16046#discussion_r90039212 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1077,10 +1077,54 @@ class Analyzer( // Si

[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-11-29 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14638 **[Test build #69334 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/69334/consoleFull)** for PR 14638 at commit [`3c06aa6`](https://github.com/apache/spark/commit/3

[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-11-29 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/14638 Retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes

[GitHub] spark issue #16038: [SPARK-18471][CORE] New treeAggregate overload for big l...

2016-11-29 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16038 Yes, but the result is inherently dense -- not going to be many zeroes in the gradient if any. There's no way around that. The discussion has been about the initial value, the zero vector, right? Tha

[GitHub] spark pull request #16046: [SPARK-18582][SQL] Whitelist LogicalPlan operator...

2016-11-29 Thread nsyca
Github user nsyca commented on a diff in the pull request: https://github.com/apache/spark/pull/16046#discussion_r90033852 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala --- @@ -1120,47 +1173,54 @@ class Analyzer( } else

[GitHub] spark issue #15279: SPARK-12347 [ML][WIP] Add a script to test Spark ML exam...

2016-11-29 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15279 Hi, @ethanluoyc . Could you add Apache License, too? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does no

[GitHub] spark issue #16038: [SPARK-18471][CORE] New treeAggregate overload for big l...

2016-11-29 Thread AnthonyTruchet
Github user AnthonyTruchet commented on the issue: https://github.com/apache/spark/pull/16038 No this does not do the trick as the result of the aggregation IS dense. And the zero in (tree)aggregate has the same type as the result. Said otherwise, in L-BFGS, we do aggregate vectors th
