[GitHub] spark pull request #15429: [SPARK-17840] [DOCS] Add some pointers for wiki/C...

2016-10-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15429 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is

[GitHub] spark issue #15048: [SPARK-17409] [SQL] Do Not Optimize Query in CTAS More T...

2016-10-12 Thread yhuai
Github user yhuai commented on the issue: https://github.com/apache/spark/pull/15048 Also, can we add a test for hive tables?

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-12 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/14690 Btw I've noticed a significant performance difference between ListingFileCatalog and TableFileCatalog's implementation of ListFiles. The difference seems to be that ListingFileCatalog parallelizes

[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...

2016-10-12 Thread tdas
Github user tdas commented on the issue: https://github.com/apache/spark/pull/15307 This was a flaky DStream test. ^^^

[GitHub] spark issue #15408: [SPARK-17839][CORE] Use Nio's directbuffer instead of Bu...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15408 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66830/ Test PASSed.

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-12 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/14690 > Btw I've noticed a significant performance difference between ListingFileCatalog and TableFileCatalog's implementation of ListFiles. The difference seems to be that ListingFileCatalog

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-12 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/14690 >> Btw I've noticed a significant performance difference between ListingFileCatalog and TableFileCatalog's implementation of ListFiles. The difference seems to be that ListingFileCatalog

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 Indeed, I have not taken the Anaconda environment into account, first because this tool provides a quick and efficient way of having an almost-good working environment to run jobs with numpy, pandas, and

[GitHub] spark issue #15451: [BUILD] Closing stale PRs

2016-10-12 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15451 Actually I'm getting an error when merging. Not sure why.

[GitHub] spark issue #15452: minor doc fix for Row.scala

2016-10-12 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15452 LGTM

[GitHub] spark issue #15429: [SPARK-17840] [DOCS] Add some pointers for wiki/CONTRIBU...

2016-10-12 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15429 Thanks - merging in master.

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15432 @gatorsmile Yes (for https://github.com/apache/spark/pull/15432#issuecomment-253291901), it is, and sure, I should add more tests. I actually intended to show what it looks like.

[GitHub] spark issue #15432: [SPARK-17854][SQL] rand/randn allows null/long as input ...

2016-10-12 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/15432 I should have added `[WIP]` maybe. If it looks okay in general, I will try to follow your suggestions and also the case you gave.

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/14963 I tested this on macosx, and got some pylint errors: + echo 'Checking Pylint...' Checking Pylint... + for to_be_checked in '"$PATHS_TO_CHECK"' + '[' false == true ']'

[GitHub] spark issue #15408: [SPARK-17839][CORE] Use Nio's directbuffer instead of Bu...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15408 **[Test build #66830 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66830/consoleFull)** for PR 15408 at commit

[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307 **[Test build #3335 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3335/consoleFull)** for PR 15307 at commit

[GitHub] spark issue #15408: [SPARK-17839][CORE] Use Nio's directbuffer instead of Bu...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15408 Merged build finished. Test PASSed.

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14690 **[Test build #66833 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66833/consoleFull)** for PR 14690 at commit

[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11336 **[Test build #66834 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66834/consoleFull)** for PR 11336 at commit

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-12 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r83071819 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/TableFileCatalog.scala --- @@ -55,10 +55,16 @@ class TableFileCatalog(

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-12 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r83065199 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -616,6 +617,44 @@ private[spark] class HiveExternalCatalog(conf:

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-12 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r83067223 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/TableFileCatalog.scala --- @@ -0,0 +1,103 @@ +/* + * Licensed to the

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-12 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r83072226 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala --- @@ -0,0 +1,72 @@ +/* + * Licensed

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-12 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r83065354 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala --- @@ -616,6 +617,44 @@ private[spark] class HiveExternalCatalog(conf:

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-12 Thread davies
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r83071827 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala --- @@ -0,0 +1,72 @@ +/* + * Licensed

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 I have just tested this pull request on MacOS X with latest version of Spark, no error: ``` ... Checking Pep8... PEP8 checks passed. Checking Pylint... Pylint checks

[GitHub] spark pull request #15335: [SPARK-17769][Core][Scheduler]Some FetchFailure r...

2016-10-12 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/15335#discussion_r83073697 --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala --- @@ -1255,27 +1255,46 @@ class DAGScheduler( s"longer

[GitHub] spark issue #15446: [SPARK-17882][SparkR] Fix swallowed exception in RBacken...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15446 **[Test build #66831 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66831/consoleFull)** for PR 15446 at commit

[GitHub] spark issue #15048: [SPARK-17409] [SQL] Do Not Optimize Query in CTAS More T...

2016-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/15048 Yeah, based on my understanding, it should cover the hive serde table. I will submit a PR to make sure of it and also include the test case you provided above. Thank you!

[GitHub] spark issue #15446: [SPARK-17882][SparkR] Fix swallowed exception in RBacken...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15446 Merged build finished. Test PASSed.

[GitHub] spark issue #15451: [BUILD] Closing stale PRs

2016-10-12 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/15451 Let me try...

[GitHub] spark issue #15446: [SPARK-17882][SparkR] Fix swallowed exception in RBacken...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15446 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66831/ Test PASSed.

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14963 **[Test build #66836 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66836/consoleFull)** for PR 14963 at commit

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14963 **[Test build #66838 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66838/consoleFull)** for PR 14963 at commit

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 Here is my proposal: I leave the environment and let the script create the virtual env, plus a minor improvement to the linter script.

[GitHub] spark issue #15452: minor doc fix for Row.scala

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15452 **[Test build #3334 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3334/consoleFull)** for PR 15452 at commit

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread holdenk
Github user holdenk commented on the issue: https://github.com/apache/spark/pull/14963 @Stibbons the jenkins environment for Spark is not OS X, but since a lot of the developers work in OS X I figured it would be good to test there too. I think we should probably figure out why

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-12 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/14690 I determined the performance regression was introduced by a commit I hadn't pushed to this PR. Sorry for the false alarm. 😞 Needless to say, I'm not pushing that commit.

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-12 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r83078271 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/TableFileCatalog.scala --- @@ -55,10 +55,16 @@ class TableFileCatalog(

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14963 **[Test build #66837 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66837/consoleFull)** for PR 14963 at commit

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-12 Thread mallman
Github user mallman commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r83085625 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/TableFileCatalog.scala --- @@ -0,0 +1,103 @@ +/* + * Licensed to the

[GitHub] spark issue #15445: [SPARK-17817][PySpark][FOLLOWUP] PySpark RDD Repartition...

2016-10-12 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/15445 Changes related to performance should be verified by a benchmark, unless the effect is obvious.

[GitHub] spark issue #15422: [SPARK-17850][Core]Add a flag to ignore corrupt files

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15422 **[Test build #66839 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66839/consoleFull)** for PR 15422 at commit

[GitHub] spark issue #13571: [SPARK-15369][WIP][RFC][PySpark][SQL] Expose potential t...

2016-10-12 Thread davies
Github user davies commented on the issue: https://github.com/apache/spark/pull/13571 Thanks for the prototype and sending the PR out, this looks interesting. I don't know how mature Jython currently is; I have never heard of a company that uses it. The last release took two years from

[GitHub] spark pull request #15453: [SPARK-17770] [CATALYST] making ObjectType public

2016-10-12 Thread bdrillard
GitHub user bdrillard opened a pull request: https://github.com/apache/spark/pull/15453 [SPARK-17770] [CATALYST] making ObjectType public ## What changes were proposed in this pull request? In order to facilitate the writing of additional Encoders, I proposed opening up

[GitHub] spark issue #15453: [SPARK-17770] [CATALYST] making ObjectType public

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15453 Can one of the admins verify this patch?

[GitHub] spark issue #15422: [SPARK-17850][Core]Add a flag to ignore corrupt files

2016-10-12 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/15422 Merged - had an issue with pip (new laptop, sigh), so the JIRA and PR did not get closed.

[GitHub] spark issue #14690: [SPARK-16980][SQL] Load only catalog table partition met...

2016-10-12 Thread ericl
Github user ericl commented on the issue: https://github.com/apache/spark/pull/14690 Conf flag here: https://github.com/VideoAmp/spark-public/pull/3

[GitHub] spark pull request #15422: [SPARK-17850][Core]Add a flag to ignore corrupt f...

2016-10-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15422

[GitHub] spark pull request #15437: [SPARK-17876] Write StructuredStreaming WAL to a ...

2016-10-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15437#discussion_r83096455 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala --- @@ -17,19 +17,17 @@ package

[GitHub] spark pull request #15437: [SPARK-17876] Write StructuredStreaming WAL to a ...

2016-10-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15437#discussion_r83097076 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSourceLog.scala --- @@ -17,6 +17,7 @@ package

[GitHub] spark issue #15307: [SPARK-17731][SQL][STREAMING] Metrics for structured str...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15307 **[Test build #3335 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3335/consoleFull)** for PR 15307 at commit

[GitHub] spark issue #15453: [SPARK-17770] [CATALYST] making ObjectType public

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15453 **[Test build #66840 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66840/consoleFull)** for PR 15453 at commit

[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15399 **[Test build #66841 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66841/consoleFull)** for PR 15399 at commit

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 You and I agree, actually: - PySpark can run inside Anaconda, and indeed this is greatly valuable. This will make available to the "driver" all the packages provided by Anaconda (in client

[GitHub] spark pull request #15422: [SPARK-17850][Core]Add a flag to ignore corrupt f...

2016-10-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15422#discussion_r83093287 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala --- @@ -588,6 +588,12 @@ object SQLConf { .doubleConf

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-12 Thread marmbrus
Github user marmbrus commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r83096825 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala --- @@ -0,0 +1,240 @@ +/* + * Licensed to the

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 I really think using Spark with Anaconda is a **must have**. Deploying jobs that run inside a Conda environment is so fast and efficient. I really want to push for this pull request #14180 that

[GitHub] spark pull request #15437: [SPARK-17876] Write StructuredStreaming WAL to a ...

2016-10-12 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/15437#discussion_r83096847 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala --- @@ -93,20 +95,26 @@ abstract class

[GitHub] spark issue #15453: [SPARK-17770] [CATALYST] making ObjectType public

2016-10-12 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/15453 ok to test

[GitHub] spark issue #15427: [SPARK-17866][SPARK-17867][SQL] Fix Dataset.dropduplicat...

2016-10-12 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15427 1. `Dataset.dropDuplicates()` should definitely drop duplicates for all columns. 2. `Dataset.dropDuplicates(col: String)` should also drop duplicates for all columns matching the name.

[GitHub] spark issue #15376: [SPARK-17796][SQL] Support wildcard character in filenam...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15376 **[Test build #66842 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66842/consoleFull)** for PR 15376 at commit

[GitHub] spark issue #11336: [SPARK-9325][SPARK-R] collect() head() and show() for Co...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/11336 **[Test build #66844 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66844/consoleFull)** for PR 11336 at commit

[GitHub] spark pull request #11105: [SPARK-12469][CORE] Data Property accumulators fo...

2016-10-12 Thread squito
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/11105#discussion_r83101779 --- Diff: core/src/test/scala/org/apache/spark/DataPropertyAccumulatorSuite.scala --- @@ -0,0 +1,361 @@ +/* + * Licensed to the Apache Software

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14963 **[Test build #66835 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66835/consoleFull)** for PR 14963 at commit

[GitHub] spark issue #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in the UI

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15441 **[Test build #66843 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66843/consoleFull)** for PR 15441 at commit

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14963 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66835/ Test PASSed.

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14963 Merged build finished. Test PASSed.

[GitHub] spark issue #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in the UI

2016-10-12 Thread ajbozarth
Github user ajbozarth commented on the issue: https://github.com/apache/spark/pull/15441 Addressed @markhamstra's comments for both `JobsTab` and `StagesTab`

[GitHub] spark issue #15422: [SPARK-17850][Core]Add a flag to ignore corrupt files

2016-10-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15422 I will work on a patch for 1.6.

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14963 **[Test build #66838 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66838/consoleFull)** for PR 14963 at commit

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14963 Merged build finished. Test FAILed.

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14963 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66838/ Test FAILed.

[GitHub] spark issue #14638: [SPARK-11374][SQL] Support `skip.header.line.count` opti...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14638 **[Test build #66845 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66845/consoleFull)** for PR 14638 at commit

[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15399 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66841/ Test PASSed.

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread Stibbons
Github user Stibbons commented on the issue: https://github.com/apache/spark/pull/14963 Oh, I have these pylint errors on my Ubuntu! I probably did not rebase correctly. Fixed with new ignored checks: - deprecated-method - unsubscriptable-object, -

[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15399 **[Test build #66841 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66841/consoleFull)** for PR 15399 at commit

[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15399 Merged build finished. Test PASSed.

[GitHub] spark issue #9973: [SPARK-11989][SQL] Only use commit in JDBC data source if...

2016-10-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/9973 The basic problem is that multiple connections work on the same transaction. It is doable but might not be applicable as a general JDBC data source connector. Let us keep it as an open problem. If

[GitHub] spark issue #15401: [SPARK-17782][STREAMING][KAFKA] alternative eliminate ra...

2016-10-12 Thread koeninger
Github user koeninger commented on the issue: https://github.com/apache/spark/pull/15401 @zsxwing so poll is only called in consumer strategy in situations in which starting offsets have been provided, and seek is called immediately thereafter for those offsets. What is the specific

[GitHub] spark issue #15401: [SPARK-17782][STREAMING][KAFKA] alternative eliminate ra...

2016-10-12 Thread zsxwing
Github user zsxwing commented on the issue: https://github.com/apache/spark/pull/15401 @koeninger sorry for the delay. Actually, my concern is that `poll(0)` in `ConsumerStrategy.onStart` may update the offsets.

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14963 **[Test build #66846 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66846/consoleFull)** for PR 14963 at commit

[GitHub] spark pull request #15441: [SPARK-4411] [Web UI] Add "kill" link for jobs in...

2016-10-12 Thread markhamstra
Github user markhamstra commented on a diff in the pull request: https://github.com/apache/spark/pull/15441#discussion_r83105534 --- Diff: core/src/main/scala/org/apache/spark/ui/jobs/JobsTab.scala --- @@ -35,4 +37,20 @@ private[ui] class JobsTab(parent: SparkUI) extends

[GitHub] spark pull request #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSet...

2016-10-12 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15249

[GitHub] spark issue #14567: [SPARK-16992][PYSPARK] Python Pep8 formatting and import...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14567 **[Test build #66847 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66847/consoleFull)** for PR 14567 at commit

[GitHub] spark issue #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSets

2016-10-12 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/15249 merged to master, thanks everyone

[GitHub] spark issue #15399: [SPARK-17819][SQL] Support default database in connectio...

2016-10-12 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/15399 Hi, @rxin. Since this kind of test needs to change its URI and also assumes that the non-default database already exists, I made a new test suite for this. Could you review this again?

[GitHub] spark pull request #14690: [SPARK-16980][SQL] Load only catalog table partit...

2016-10-12 Thread ericl
Github user ericl commented on a diff in the pull request: https://github.com/apache/spark/pull/14690#discussion_r83106495 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/TableFileCatalog.scala --- @@ -55,10 +55,16 @@ class TableFileCatalog(

[GitHub] spark issue #15249: [SPARK-17675] [CORE] Expand Blacklist for TaskSets

2016-10-12 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/15249 Awesome nice work!! Exciting to see this in! Let me know when the other component, which blacklists across different stages, is ready for review.

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14963 **[Test build #66848 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66848/consoleFull)** for PR 14963 at commit

[GitHub] spark issue #15436: [SPARK-17875] [BUILD] Remove unneeded direct dependence ...

2016-10-12 Thread rabbitonweb
Github user rabbitonweb commented on the issue: https://github.com/apache/spark/pull/15436 My question would be: what about updating netty 4.x version as well? Right now it's `4.0.29.Final` if I recall correctly, but we could update it to `4.1.3.Final`

[GitHub] spark issue #12951: [SPARK-15176][Core] Add maxShares setting to Pools

2016-10-12 Thread kayousterhout
Github user kayousterhout commented on the issue: https://github.com/apache/spark/pull/12951 @njwhite do you have time to work on this and implement maxShares? If not, can you close the PR?

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14963 **[Test build #66836 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66836/consoleFull)** for PR 14963 at commit

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14963 Merged build finished. Test PASSed.

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14963 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66836/ Test PASSed.

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14963 **[Test build #66837 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/66837/consoleFull)** for PR 14963 at commit

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14963 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/66837/ Test PASSed.

[GitHub] spark issue #15410: [SPARK-17843][Web UI] Indicate event logs pending for pr...

2016-10-12 Thread tgravescs
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/15410 If we are going to list the actual applications being loaded in the table then you have to rely on the filename to know the application id. This may be ok, but it's then the interface and we

[GitHub] spark issue #14963: [SPARK-16992][PYSPARK] Virtualenv for Pylint and pep8 in...

2016-10-12 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14963 Merged build finished. Test PASSed.

[GitHub] spark pull request #15307: [SPARK-17731][SQL][STREAMING] Metrics for structu...

2016-10-12 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/15307#discussion_r83108563 --- Diff: external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala --- @@ -264,6 +266,44 @@ class KafkaSourceSuite extends
