date:20161011

[GitHub] spark issue #14959: [SPARK-17387][PYSPARK] Creating SparkContext() from pyth...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14959 **[Test build #66758 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66758/consoleFull)** for PR 14959 at commit

[GitHub] spark issue #14959: [SPARK-17387][PYSPARK] Creating SparkContext() from pyth...

2016-10-11 Thread vanzin

Github user vanzin commented on the issue: https://github.com/apache/spark/pull/14959 retest this please --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so,

[GitHub] spark pull request #13194: [SPARK-15402] [ML] [PySpark] PySpark ml.evaluatio...

2016-10-11 Thread BryanCutler

Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/13194#discussion_r82877590 --- Diff: python/pyspark/ml/evaluation.py --- @@ -311,19 +330,25 @@ def setParams(self, predictionCol="prediction", labelCol="label", if

[GitHub] spark pull request #15338: [SPARK-11653][Deploy] Allow spark-daemon.sh to ru...

2016-10-11 Thread jodersky

Github user jodersky commented on a diff in the pull request: https://github.com/apache/spark/pull/15338#discussion_r82866433 --- Diff: sbin/spark-daemon.sh --- @@ -122,6 +123,35 @@ if [ "$SPARK_NICENESS" = "" ]; then export SPARK_NICENESS=0 fi

[GitHub] spark pull request #15338: [SPARK-11653][Deploy] Allow spark-daemon.sh to ru...

2016-10-11 Thread jodersky

Github user jodersky commented on a diff in the pull request: https://github.com/apache/spark/pull/15338#discussion_r82874206 --- Diff: sbin/spark-daemon.sh --- @@ -146,13 +176,11 @@ run_command() { case "$mode" in (class) - nohup nice -n

[GitHub] spark pull request #15338: [SPARK-11653][Deploy] Allow spark-daemon.sh to ru...

2016-10-11 Thread jodersky

Github user jodersky commented on a diff in the pull request: https://github.com/apache/spark/pull/15338#discussion_r82874084 --- Diff: sbin/spark-daemon.sh --- @@ -146,13 +176,11 @@ run_command() { case "$mode" in (class) - nohup nice -n

[GitHub] spark pull request #15338: [SPARK-11653][Deploy] Allow spark-daemon.sh to ru...

2016-10-11 Thread jodersky

Github user jodersky commented on a diff in the pull request: https://github.com/apache/spark/pull/15338#discussion_r82875044 --- Diff: sbin/spark-daemon.sh --- @@ -122,6 +123,35 @@ if [ "$SPARK_NICENESS" = "" ]; then export SPARK_NICENESS=0 fi

[GitHub] spark pull request #15338: [SPARK-11653][Deploy] Allow spark-daemon.sh to ru...

2016-10-11 Thread jodersky

Github user jodersky commented on a diff in the pull request: https://github.com/apache/spark/pull/15338#discussion_r82871978 --- Diff: sbin/spark-daemon.sh --- @@ -122,6 +123,35 @@ if [ "$SPARK_NICENESS" = "" ]; then export SPARK_NICENESS=0 fi

[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Build finished. Test FAILed.

[GitHub] spark pull request #15338: [SPARK-11653][Deploy] Allow spark-daemon.sh to ru...

2016-10-11 Thread jodersky

Github user jodersky commented on a diff in the pull request: https://github.com/apache/spark/pull/15338#discussion_r82872476 --- Diff: sbin/spark-daemon.sh --- @@ -122,6 +123,35 @@ if [ "$SPARK_NICENESS" = "" ]; then export SPARK_NICENESS=0 fi

[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15307 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66757/ Test FAILed.

[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307 **[Test build #66757 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66757/consoleFull)** for PR 15307 at commit

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-10-11 Thread ajbozarth

Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/15410 Just to raise an idea that would possibly mean less code change, would simply having a flag that causes a "currently processing applications" type message to display without an actual count with

[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307 **[Test build #66757 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66757/consoleFull)** for PR 15307 at commit

[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-11 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9766#discussion_r82876529 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala --- @@ -412,6 +419,63 @@ class UDFRegistration private[sql] (functionRegistry:

[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-11 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9766#discussion_r82876442 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala --- @@ -412,6 +419,63 @@ class UDFRegistration private[sql] (functionRegistry:

[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-11 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9766#discussion_r82876419 --- Diff: python/pyspark/sql/context.py --- @@ -202,6 +202,26 @@ def registerFunction(self, name, f, returnType=StringType()): """

[GitHub] spark pull request #9766: [SPARK-11775][PYSPARK][SQL] Allow PySpark to regis...

2016-10-11 Thread marmbrus

Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/9766#discussion_r82876509 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala --- @@ -17,9 +17,15 @@ package org.apache.spark.sql +

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-11 Thread ericl

Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r82875191 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -225,13 +225,16 @@ case class FileSourceScanExec( }

[GitHub] spark issue #15375: [SPARK-17790][SPARKR] Support for parallelizing R data.f...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15375 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66750/ Test FAILed.

[GitHub] spark issue #15425: [SPARK-17816] [Core] [Branch-2.0] Fix ConcurrentModifica...

2016-10-11 Thread zsxwing

Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15425 LGTM. Merging to 2.0. Thanks!

[GitHub] spark issue #15375: [SPARK-17790][SPARKR] Support for parallelizing R data.f...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15375 Merged build finished. Test FAILed.

[GitHub] spark issue #15375: [SPARK-17790][SPARKR] Support for parallelizing R data.f...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15375 **[Test build #66750 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66750/consoleFull)** for PR 15375 at commit

[GitHub] spark pull request #15431: [SPARK-15153] [ML] [SparkR] Fix SparkR spark.naiv...

2016-10-11 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15431

[GitHub] spark issue #15431: [SPARK-15153] [ML] [SparkR] Fix SparkR spark.naiveBayes ...

2016-10-11 Thread jkbradley

Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15431 I'll go ahead and merge with master. Thanks!

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread brkyvz

Github user brkyvz commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82872469 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -176,7 +184,9 @@ class StreamExecution(

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14690 **[Test build #66751 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66751/consoleFull)** for PR 14690 at commit

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14690 Merged build finished. Test FAILed.

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14690 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66751/ Test FAILed.

[GitHub] spark issue #15408: [SPARK-17839][CORE] Use Nio's directbuffer instead of Bu...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15408 Merged build finished. Test FAILed.

[GitHub] spark issue #15408: [SPARK-17839][CORE] Use Nio's directbuffer instead of Bu...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15408 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66749/ Test FAILed.

[GitHub] spark issue #15408: [SPARK-17839][CORE] Use Nio's directbuffer instead of Bu...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15408 **[Test build #66749 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66749/consoleFull)** for PR 15408 at commit

[GitHub] spark pull request #15148: [SPARK-5992][ML] Locality Sensitive Hashing

2016-10-11 Thread Yunni

Github user Yunni commented on a diff in the pull request: https://github.com/apache/spark/pull/15148#discussion_r82871238 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/RandomProjection.scala --- @@ -0,0 +1,146 @@ +/* + * Licensed to the Apache Software

[GitHub] spark pull request #15335: [SPARK-17769][Core][Scheduler]Some FetchFailure r...

2016-10-11 Thread markhamstra

Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/15335#discussion_r82869819 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1255,27 +1255,46 @@ class DAGScheduler( s"longer

[GitHub] spark pull request #15382: [SPARK-17810] [SQL] Default spark.sql.warehouse.d...

2016-10-11 Thread avulanov

Github user avulanov commented on a diff in the pull request: https://github.com/apache/spark/pull/15382#discussion_r82869165 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -757,7 +758,10 @@ private[sql] class SQLConf extends Serializable with

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread wangmiao1981

Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15421 It might not be an R-specific issue. I am trying to create a test case on the Scala side in SQLUtilsSuite.scala.

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread wangmiao1981

Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15421 I think we should find out the root cause of the negative length of the "NA" field. Yesterday, I debugged the R side and have not found the reason yet.

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread wangmiao1981

Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15421 New test on Mac: > df <- data.frame(Date = as.Date(c(rep("2016-01-10", 10), "NA", "NA")), id = 1:12) > > dim(createDataFrame(df)) 16/10/11 12:10:30 ERROR Executor:

[GitHub] spark issue #13440: [SPARK-15699] [ML] Implement a Chi-Squared test statisti...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/13440 **[Test build #66756 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66756/consoleFull)** for PR 13440 at commit

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-10-11 Thread tgravescs

Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/15410 Ah, yeah, startup would definitely be a good case for this, and like I mentioned, it's better than nothing, so I'm OK with the concept. I wonder for the other use case where it hasn't looked in ~

[GitHub] spark pull request #15421: [SPARK-17811] SparkR cannot parallelize data.fram...

2016-10-11 Thread wangmiao1981

Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15421#discussion_r82865118 --- Diff: core/src/main/scala/org/apache/spark/api/r/SerDe.scala --- @@ -125,15 +125,24 @@ private[spark] object SerDe { } def

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread wangmiao1981

Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15421 MacBook Pro (Retina, 15-inch, Mid 2015). This is the machine that I tested the patch on.

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15421 **[Test build #66755 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66755/consoleFull)** for PR 15421 at commit

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread wangmiao1981

Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15421 @falaki I saw the exception on Mac too, but I haven't found the root cause of the negative length in the input stream. Catching the exception will solve the problem. Do you want to explore the reason

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread falaki

Github user falaki commented on the issue: https://github.com/apache/spark/pull/15421 @wangmiao1981 thanks for testing on Windows. I added a check for this. Would you please try again and let me know? Unfortunately, I don't have access to a Windows box for testing.

[GitHub] spark pull request #15421: [SPARK-17811] SparkR cannot parallelize data.fram...

2016-10-11 Thread falaki

Github user falaki commented on a diff in the pull request: https://github.com/apache/spark/pull/15421#discussion_r82864117 --- Diff: core/src/main/scala/org/apache/spark/api/r/SerDe.scala --- @@ -125,15 +125,24 @@ private[spark] object SerDe { } def

[GitHub] spark pull request #15335: [SPARK-17769][Core][Scheduler]Some FetchFailure r...

2016-10-11 Thread markhamstra

Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/15335#discussion_r82864060 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1255,27 +1255,46 @@ class DAGScheduler( s"longer

[GitHub] spark pull request #15421: [SPARK-17811] SparkR cannot parallelize data.fram...

2016-10-11 Thread wangmiao1981

Github user wangmiao1981 commented on a diff in the pull request: https://github.com/apache/spark/pull/15421#discussion_r82863921 --- Diff: core/src/main/scala/org/apache/spark/api/r/SerDe.scala --- @@ -125,15 +125,24 @@ private[spark] object SerDe { } def

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-11 Thread mallman

Github user mallman commented on the issue: https://github.com/apache/spark/pull/14690 >> Finally, this would require us to read the schema files. That's something I'm trying to avoid in this patch. > Not sure what you mean here, but the parquet change should be execution

[GitHub] spark issue #15431: [SPARK-15153] [ML] [SparkR] Fix SparkR spark.naiveBayes ...

2016-10-11 Thread jkbradley

Github user jkbradley commented on the issue: https://github.com/apache/spark/pull/15431 LGTM2 Thanks! Is it fine with you if this just gets fixed in master, not branch-2.0 (since the other PR is not in branch-2.0 since it adds a new public API)?

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread wangmiao1981

Github user wangmiao1981 commented on the issue: https://github.com/apache/spark/pull/15421 @falaki I patched your fix to a clean build. I still see the following error: > df <- data.frame(Date = as.Date(c(rep("2016-01-10", 10), "NA", "NA")), id = 1:12) > >

[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11336 Merged build finished. Test FAILed.

[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11336 **[Test build #66753 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66753/consoleFull)** for PR 11336 at commit

[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/11336 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66753/ Test FAILed.

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread felixcheung

Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15421 hmm, still the same error in the new test case in appveyor ``` Failed - 1. Error: SPARK-17811: can create

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-11 Thread ericl

Github user ericl commented on the issue: https://github.com/apache/spark/pull/14690 > For one thing, a ListingFileCatalog performs a file tree traversal right off the bat. However, the external catalog returns the locations of partitions as part of the listPartitionsByFilter call. I

[GitHub] spark issue #15431: [SPARK-15153] [ML] [SparkR] Fix SparkR spark.naiveBayes ...

2016-10-11 Thread felixcheung

Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/15431 LGTM

[GitHub] spark pull request #15389: [SPARK-17817][PySpark] PySpark RDD Repartitioning...

2016-10-11 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15389

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15421 Merged build finished. Test PASSed.

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15421 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66746/ Test PASSed.

[GitHub] spark issue #15295: [SPARK-17720][SQL] introduce static SQL conf

2016-10-11 Thread rxin

Github user rxin commented on the issue: https://github.com/apache/spark/pull/15295 LGTM except that one comment on naming.

[GitHub] spark pull request #15295: [SPARK-17720][SQL] introduce static SQL conf

2016-10-11 Thread rxin

Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15295#discussion_r82860005 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RuntimeConfig.scala --- @@ -132,4 +136,9 @@ class RuntimeConfig private[sql](sqlConf: SQLConf = new

[GitHub] spark pull request #15295: [SPARK-17720][SQL] introduce static SQL conf

2016-10-11 Thread rxin

Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/15295#discussion_r82859976 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/RuntimeConfig.scala --- @@ -36,6 +37,7 @@ class RuntimeConfig private[sql](sqlConf: SQLConf = new

[GitHub] spark issue #15421: [SPARK-17811] SparkR cannot parallelize data.frame with ...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15421 **[Test build #66746 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66746/consoleFull)** for PR 15421 at commit

[GitHub] spark issue #14719: [SPARK-17154][SQL] Wrong result can be returned or Analy...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14719 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66747/ Test PASSed.

[GitHub] spark issue #14719: [SPARK-17154][SQL] Wrong result can be returned or Analy...

2016-10-11 Thread AmplabJenkins

Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14719 Merged build finished. Test PASSed.

[GitHub] spark issue #14719: [SPARK-17154][SQL] Wrong result can be returned or Analy...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14719 **[Test build #66747 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66747/consoleFull)** for PR 14719 at commit

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null as input seed

2016-10-11 Thread rxin

Github user rxin commented on the issue: https://github.com/apache/spark/pull/15432 hm - maybe we should just cast any NullType input into some concrete type defined by an ExpectsInputTypes expression?

[GitHub] spark issue #15437: [SPARK-17876] Write StructuredStreaming WAL to a stream ...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15437 **[Test build #66754 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66754/consoleFull)** for PR 15437 at commit

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82858522 --- Diff: python/pyspark/sql/streaming.py --- @@ -189,6 +189,282 @@ def resetTerminated(self): self._jsqm.resetTerminated() +class

[GitHub] spark pull request #15437: [SPARK-17876] Write StructuredStreaming WAL to a ...

2016-10-11 Thread brkyvz

GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/15437 [SPARK-17876] Write StructuredStreaming WAL to a stream instead of materializing all at once ## What changes were proposed in this pull request? The CompactibleFileStreamLog materializes

[GitHub] spark issue #15375: [SPARK-17790][SPARKR] Support for parallelizing R data.f...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15375 **[Test build #3324 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3324/consoleFull)** for PR 15375 at commit

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread koeninger

Github user koeninger commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82857280 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala --- @@ -0,0 +1,244 @@ +/* + * Licensed to the

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82856942 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -530,7 +692,7 @@ class StreamExecution( case

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82857011 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -516,12 +563,127 @@ class StreamExecution(

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82856924 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -221,8 +247,15 @@ class StreamExecution(

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82856802 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -105,11 +105,21 @@ class StreamExecution( var

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82856363 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala --- @@ -0,0 +1,244 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-11 Thread mallman

Github user mallman commented on the issue: https://github.com/apache/spark/pull/14690 I believe that using a method like `TableFileCatalog.filterPartitions` to build a new file catalog restricted to some pruned partitions is a sound approach, however I'm starting to reconsider the

[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11336 **[Test build #66753 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66753/consoleFull)** for PR 11336 at commit

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82855608 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -516,12 +563,127 @@ class StreamExecution(

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-11 Thread tdas

Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r82855520 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala --- @@ -176,7 +184,9 @@ class StreamExecution( //

[GitHub] spark pull request #15436: [SPARK-17875] [BUILD] Remove unneeded direct depe...

2016-10-11 Thread srowen

Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15436#discussion_r82854783 --- Diff: NOTICE --- @@ -162,7 +162,7 @@ Please visit the Netty web site for more information: * http://netty.io/ -Copyright 2011 The

[GitHub] spark pull request #15436: [SPARK-17875] [BUILD] Remove unneeded direct depe...

2016-10-11 Thread srowen

Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/15436#discussion_r82854900 --- Diff: dev/deps/spark-deps-hadoop-2.3 --- @@ -130,7 +130,6 @@ metrics-json-3.1.2.jar metrics-jvm-3.1.2.jar minlog-1.3.0.jar mx4j-3.0.2.jar

[GitHub] spark issue #15436: [SPARK-17875] [BUILD] Remove unneeded direct dependence ...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15436 **[Test build #66752 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66752/consoleFull)** for PR 15436 at commit

[GitHub] spark pull request #15436: [SPARK-17875] [BUILD] Remove unneeded direct depe...

2016-10-11 Thread srowen

GitHub user srowen opened a pull request: https://github.com/apache/spark/pull/15436 [SPARK-17875] [BUILD] Remove unneeded direct dependence on Netty 3.x ## What changes were proposed in this pull request? Remove unneeded direct dependency on Netty 3.x. I left the

[GitHub] spark issue #15190: [SPARK-17620][SQL] Determine Serde by hive.default.filef...

2016-10-11 Thread dilipbiswal

Github user dilipbiswal commented on the issue: https://github.com/apache/spark/pull/15190 @yhuai We will use the Parquet format in your example. We look at the `spark.sql.sources.default` configuration to decide on the format to use? Here is the output for your perusal.

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-10-11 Thread vijoshi

Github user vijoshi commented on the issue: https://github.com/apache/spark/pull/15410 @tgravescs - you're right - for newer logs that are generated, there could be a window of time (10 secs or whatever the user configures) where the new logs are not picked up for replay and the UI

[GitHub] spark issue #15422: [SPARK-17850][Core]HadoopRDD should not catch EOFExcepti...

2016-10-11 Thread zsxwing

Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15422 > For example, in MR you have the ability to even set the percentage of bad records you want to tolerate (we don't have that in Spark). I may be wrong. But in MR, I think bad records just

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-11 Thread SparkQA

Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14690 **[Test build #66751 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66751/consoleFull)** for PR 14690 at commit

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-11 Thread mallman

Github user mallman commented on the issue: https://github.com/apache/spark/pull/14690 Ah cripes. I committed something I didn't want to. I'm rebasing again in a few...

[GitHub] spark issue #15435: [SPARK-17139][ML] Add model summary for MultinomialLogis...

2016-10-11 Thread sethah

Github user sethah commented on the issue: https://github.com/apache/spark/pull/15435 I'll try to take a look before too long. For now, I see there are no tests. Could you please add some, using the summary tests for binary classification as a guide? Thanks!

[GitHub] spark pull request #15384: [SPARK-17346][SQL][Tests]Fix the flaky topic dele...

2016-10-11 Thread asfgit

Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15384

[GitHub] spark issue #15384: [SPARK-17346][SQL][Tests]Fix the flaky topic deletion in...

2016-10-11 Thread tdas

Github user tdas commented on the issue: https://github.com/apache/spark/pull/15384 Merging it to master, and branch 2.0

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-11 Thread mallman

Github user mallman commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r82850487 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala --- @@ -225,13 +225,16 @@ case class FileSourceScanExec( }

[GitHub] spark issue #15422: [SPARK-17850][Core]HadoopRDD should not catch EOFExcepti...

2016-10-11 Thread srowen

Github user srowen commented on the issue: https://github.com/apache/spark/pull/15422 @mridulm for the scenario you're imagining, maybe the data is OK, sure. That doesn't mean it's true in all cases. Yeah, this is really to work around bad input, which you can to some degree do at

[GitHub] spark issue #13194: [SPARK-15402] [ML] [PySpark] PySpark ml.evaluation shoul...

2016-10-11 Thread BryanCutler

Github user BryanCutler commented on the issue: https://github.com/apache/spark/pull/13194 Just one question, otherwise LGTM

[GitHub] spark pull request #13194: [SPARK-15402] [ML] [PySpark] PySpark ml.evaluatio...

2016-10-11 Thread BryanCutler

Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/13194#discussion_r82849461 --- Diff: python/pyspark/ml/evaluation.py --- @@ -21,7 +21,8 @@ from pyspark.ml.wrapper import JavaParams from pyspark.ml.param import Param,

[GitHub] spark pull request #15398: [SPARK-17647][SQL] Fix backslash escaping in 'LIK...

2016-10-11 Thread jodersky

Github user jodersky commented on a diff in the pull request: https://github.com/apache/spark/pull/15398#discussion_r82849408 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/StringUtils.scala --- @@ -25,26 +25,25 @@ object StringUtils { //

[GitHub] spark issue #15422: [SPARK-17850][Core]HadoopRDD should not catch EOFExcepti...

2016-10-11 Thread mridulm

Github user mridulm commented on the issue: https://github.com/apache/spark/pull/15422 @marmbrus +1 on logging, that is definitely something which was probably missed here.
