[GitHub] spark issue #16297: [SPARK-18888] partitionBy in DataStreamWriter in Python ...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16297 **[Test build #70205 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70205/testReport)** for PR 16297 at commit

[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70206 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70206/testReport)** for PR 16272 at commit

[GitHub] spark issue #16297: [SPARK-18888] partitionBy in DataStreamWriter in Python ...

2016-12-15 Thread brkyvz
Github user brkyvz commented on the issue: https://github.com/apache/spark/pull/16297 cc @tdas --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the

[GitHub] spark pull request #16297: [SPARK-18888] partitionBy in DataStreamWriter in ...

2016-12-15 Thread brkyvz
GitHub user brkyvz opened a pull request: https://github.com/apache/spark/pull/16297 [SPARK-18888] partitionBy in DataStreamWriter in Python throws _to_seq not defined ## What changes were proposed in this pull request? `_to_seq` wasn't imported. ## How was this

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread mridulm
Github user mridulm commented on the issue: https://github.com/apache/spark/pull/16281 I agree with @srowen, forking should be the last resort.

[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70204 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70204/testReport)** for PR 16272 at commit

[GitHub] spark pull request #16272: [SPARK-18850][SS]Make StreamExecution serializabl...

2016-12-15 Thread zsxwing
Github user zsxwing commented on a diff in the pull request: https://github.com/apache/spark/pull/16272#discussion_r92690628 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryStatusAndProgressSuite.scala --- @@ -137,12 +146,13 @@ object

[GitHub] spark issue #16287: [SPARK-18868][FLAKY-TEST] Deflake StreamingQueryListener...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16287 **[Test build #70203 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70203/testReport)** for PR 16287 at commit

[GitHub] spark pull request #16289: [SPARK-18870] Disallowed Distinct Aggregations on...

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/16289

[GitHub] spark pull request #16272: [SPARK-18850][SS]Make StreamExecution serializabl...

2016-12-15 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16272#discussion_r92687106 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQuerySuite.scala --- @@ -439,6 +440,37 @@ class StreamingQuerySuite extends

[GitHub] spark pull request #16272: [SPARK-18850][SS]Make StreamExecution serializabl...

2016-12-15 Thread tdas
Github user tdas commented on a diff in the pull request: https://github.com/apache/spark/pull/16272#discussion_r92687025 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamSuite.scala --- @@ -300,6 +302,48 @@ class StreamSuite extends StreamTest {

[GitHub] spark issue #16287: [SPARK-18868][FLAKY-TEST] Deflake StreamingQueryListener...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16287 **[Test build #70202 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70202/testReport)** for PR 16287 at commit

[GitHub] spark issue #16251: [SPARK-18826][SS]Add 'latestFirst' option to FileStreamS...

2016-12-15 Thread tdas
Github user tdas commented on the issue: https://github.com/apache/spark/pull/16251 LGTM, pending tests.

[GitHub] spark issue #16289: [SPARK-18870] Disallowed Distinct Aggregations on Stream...

2016-12-15 Thread tdas
Github user tdas commented on the issue: https://github.com/apache/spark/pull/16289 Merging to 2.1 and master

[GitHub] spark issue #16289: [SPARK-18870] Disallowed Distinct Aggregations on Stream...

2016-12-15 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/16289 LGTM

[GitHub] spark issue #16291: [SPARK-18838][WIP] Use separate executor service for eac...

2016-12-15 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/16291 Hmm... I took a quick look and I'm not sure I understand exactly what's going on. It seems you're wrapping each listener with a `ListenerEvenProcessor` (note the typo), and each processor has its

[GitHub] spark issue #16285: [SPARK-18867] [SQL] Throw cause if IsolatedClientLoad ca...

2016-12-15 Thread jojochuang
Github user jojochuang commented on the issue: https://github.com/apache/spark/pull/16285 Good point @rxin. That seems possible. https://docs.oracle.com/javase/7/docs/api/java/lang/reflect/InvocationTargetException.html#getCause()

[GitHub] spark issue #16142: [SPARK-18716][CORE] Restrict the disk usage of spark eve...

2016-12-15 Thread vanzin
Github user vanzin commented on the issue: https://github.com/apache/spark/pull/16142 I'm not such a big fan of this feature, but mostly I'm not a big fan of the current implementation. For the feature, it feels like it's trying to make the SHS more like a "log management

[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70201 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70201/testReport)** for PR 16272 at commit

[GitHub] spark issue #16228: [WIP] [SPARK-17076] [SQL] Cardinality estimation for joi...

2016-12-15 Thread Tagar
Github user Tagar commented on the issue: https://github.com/apache/spark/pull/16228 @wzhfy, I think overestimating cardinality could be as bad as underestimating. For example, the optimizer could prematurely switch to SortMergeJoin when it could have used a broadcast hash join. But I

[GitHub] spark issue #16251: [SPARK-18826][SS]Add 'latestFirst' option to FileStreamS...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16251 **[Test build #70200 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70200/testReport)** for PR 16251 at commit

[GitHub] spark issue #16290: [SPARK-18817] [SPARKR] [SQL] Set default warehouse dir t...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16290 **[Test build #70199 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70199/testReport)** for PR 16290 at commit

[GitHub] spark issue #16290: [SPARK-18817] [SPARKR] [SQL] Set default warehouse dir t...

2016-12-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16290 retest this please

[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70198 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70198/testReport)** for PR 16272 at commit

[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16272 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70198/ Test FAILed.

[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16272 Merged build finished. Test FAILed.

[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70198 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70198/testReport)** for PR 16272 at commit

[GitHub] spark issue #16272: [SPARK-18850][SS]Make StreamExecution serializable

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16272 **[Test build #70197 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70197/consoleFull)** for PR 16272 at commit

[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92669686 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllReceiversResource.scala --- @@ -0,0 +1,77 @@ +/* + * Licensed to the

[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92668634 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllBatchesResource.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed to the

[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92670326 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/StreamingStatisticsResource.scala --- @@ -0,0 +1,65 @@ +/* + * Licensed

[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92669331 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllOutputOperationsResource.scala --- @@ -0,0 +1,65 @@ +/* + * Licensed

[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92668159 --- Diff: project/MimaExcludes.scala --- @@ -116,7 +116,10 @@ object MimaExcludes {

[GitHub] spark pull request #16253: [SPARK-18537][Web UI] Add a REST api to serve spa...

2016-12-15 Thread vanzin
Github user vanzin commented on a diff in the pull request: https://github.com/apache/spark/pull/16253#discussion_r92669743 --- Diff: streaming/src/main/scala/org/apache/spark/status/api/v1/streaming/AllReceiversResource.scala --- @@ -0,0 +1,77 @@ +/* + * Licensed to the

[GitHub] spark issue #15505: [SPARK-17931][CORE] taskScheduler has some unneeded seri...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15505 **[Test build #70196 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70196/consoleFull)** for PR 15505 at commit

[GitHub] spark pull request #15717: [SPARK-17910][SQL] Allow users to update the comm...

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/15717

[GitHub] spark pull request #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-15 Thread asfgit
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/14079

[GitHub] spark issue #16276: [SPARK-18855][CORE] Add RDD flatten function

2016-12-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16276 Do we really need to add this? For the time we spent we can work on more impactful things ...

[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-12-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/15717 Merging in master.

[GitHub] spark issue #16291: [SPARK-18838][WIP] Use separate executor service for eac...

2016-12-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16291 This was initially introduced by @kayousterhout

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16281 @nsync you raised an excellent question on test coverage. The kind of bugs we have seen in the past weren't really integration bugs, but bugs in parquet-mr. Technically it should be the jobs of

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 @dongjoon-hyun I just checked the code changes in `1.2.1.spark2` compared with the official Hive 1.2.1: https://github.com/JoshRosen/hive/commits/release-1.2.1-spark2 Very

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16281 Btw issues are not just performance, but often correctness as well. As the default format, a bug in Parquet is much worse than a bug in say ORC.

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/16281 We haven't really added much to Hive though, and as a matter of fact the dependency on Hive is decreasing. Parquet is a much more manageable piece of code to fork. In the past we have seen fairly

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16281 Yep. Spark Thrift Server is different, but it's not actively maintained. For example, the default database feature was only recently added. I mean this one by `Spark Hive`. ```

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16281 Has anyone even asked for a new 1.8.x build from Parquet and been told it won't happen? You don't stop consuming non-fix changes by forking. You do that by staying on a maintenance branch.

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16030 Can you also update the title? And the description has a mistake: the logical layer trusts the data schema to infer the type of the overlapped partition columns, and, on the other hand, the physical

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 We are adding major code changes in Spark Thrift Server? What is the Spark Hive?

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 I think we are not adding new features into Parquet. The fixes must be small. To avoid the cost and risk, we need to reject all the major fixes in our special build. At the same time, we also

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16281 Yep. At the beginning, it starts like that. But, please look at Spark Hive or Spark Thrift Server. I don't think we are maintaining that well or visibly.

[GitHub] spark pull request #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax...

2016-12-15 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/16296#discussion_r92655270 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -69,16 +69,19 @@ statement | ALTER DATABASE

[GitHub] spark issue #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for da...

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16296 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70195/ Test FAILed.

[GitHub] spark issue #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for da...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16296 **[Test build #70195 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70195/testReport)** for PR 16296 at commit

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16281 I agree, but, in a long term perspective, the risk and cost of forking could be the worst.

[GitHub] spark issue #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for da...

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16296 Merged build finished. Test FAILed.

[GitHub] spark issue #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for da...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16296 **[Test build #70195 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70195/testReport)** for PR 16296 at commit

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 @dongjoon-hyun What kind of questions/requests should we ask in dev mailing list? IMO, the risk and cost are small if we make a special build by ourselves. We can get the bug fixes

[GitHub] spark issue #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for da...

2016-12-15 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/16296 cc @yhuai

[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14079 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70194/ Test PASSed.

[GitHub] spark pull request #16296: [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax...

2016-12-15 Thread cloud-fan
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/16296 [SPARK-18885][SQL][WIP] unify CREATE TABLE syntax for data source and hive serde tables ## What changes were proposed in this pull request? Today we have different syntax to create data

[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/14079 Merged build finished. Test PASSed.

[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14079 **[Test build #70194 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70194/testReport)** for PR 14079 at commit

[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15915 **[Test build #3500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3500/testReport)** for PR 15915 at commit

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/16281 Actually, this PR is about Apache Spark 2.2, whose RC1 is expected in late March. We have a lot of time to discuss. Why don't we discuss that on the dev mailing list?

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 Basically, the idea is to make a special build of Parquet 1.8.1 with the needed fixes ourselves. Upgrading to a newer version like Parquet 1.9.0 is risky. Parquet 1.9.0 was just

[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16057 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70193/ Test PASSed.

[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16057 Merged build finished. Test PASSed.

[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16057 **[Test build #70193 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70193/testReport)** for PR 16057 at commit

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16281 I'd much rather lobby to release 1.8.2 and help with the legwork than do all that legwork and more to maintain a fork. It's still not clear to me that upgrading to 1.9.0 is not a solution?

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-15 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/15018 Just add commits here.

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 The problem is the Parquet community will not create a branch 1.8.2+ for us. Upgrading to a newer version such as 1.9 or 2.0 is always risky. Based on the history, we hit the bugs and performance

[GitHub] spark issue #16282: [DO_NOT_MERGE]Try to fix kafka

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16282 **[Test build #3501 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3501/testReport)** for PR 16282 at commit

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread srowen
Github user srowen commented on the issue: https://github.com/apache/spark/pull/16281 @gatorsmile @rdblue also works directly on Parquet. I am not seeing "unfixable" Parquet problems here. You're just pointing at problems that can and should be fixed, preferably in one place. Forking

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16281 @srowen Even if we fork our own version, it does not mean we will give up the upgrading to the newer version. We just added a few fixes. This is very normal in the mission-critical

[GitHub] spark issue #15018: [SPARK-17455][MLlib] Improve PAVA implementation in Isot...

2016-12-15 Thread neggert
Github user neggert commented on the issue: https://github.com/apache/spark/pull/15018 Found another input that triggers non-polynomial time with the code in this PR. I'm again borrowing from scikit-learn. I think this is the case they found that led them to re-write their

[GitHub] spark issue #16134: [SPARK-18703] [SQL] Drop Staging Directories and Data Fi...

2016-12-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/16134 The staging directory and files will not be removed when users hit an abnormal termination of the JVM. In addition, if the JVM does not stop, these temporary files could still consume a lot of

[GitHub] spark issue #1980: [SPARK-2750] support https in spark web ui

2016-12-15 Thread pritpalm
Github user pritpalm commented on the issue: https://github.com/apache/spark/pull/1980 I want to enable https on spark UI. I added following config to spark-defaults.config, but when we access spark ui via https::/:8080 or https://:443 or https://:8480, it's not able to connect.

[GitHub] spark issue #5664: [SPARK-2750][WEB UI]Add Https support for Web UI

2016-12-15 Thread pritpalm
Github user pritpalm commented on the issue: https://github.com/apache/spark/pull/5664 I want to enable https on spark UI. I added following config to spark-defaults.config, but when we access spark ui via https::/:8080 or https://:443 or https://:8480, it's not able to connect.

[GitHub] spark issue #10238: [SPARK-2750][WEB UI] Add https support to the Web UI

2016-12-15 Thread pritpalm
Github user pritpalm commented on the issue: https://github.com/apache/spark/pull/10238 I want to enable https on spark UI. I added following config to spark-defaults.config, but when we access spark ui via https::/:8080 or https://:443 or https://:8480, it's not able to connect.

[GitHub] spark issue #16281: [SPARK-13127][SQL] Update Parquet to 1.9.0

2016-12-15 Thread nsyca
Github user nsyca commented on the issue: https://github.com/apache/spark/pull/16281 My two cents: - Do we have a Parquet specific test suite **with sufficient coverage** to run and back us up that this upgrade won't cause any regressions? I think simply moving up the version of

[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15915 **[Test build #3500 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/3500/testReport)** for PR 15915 at commit

[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15915 Merged build finished. Test FAILed.

[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/15915 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70191/ Test FAILed.

[GitHub] spark issue #15915: [SPARK-18485][CORE] Underlying integer overflow when cre...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/15915 **[Test build #70191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70191/testReport)** for PR 15915 at commit

[GitHub] spark issue #16271: [SPARK-18845][GraphX] PageRank has incorrect initializat...

2016-12-15 Thread aray
Github user aray commented on the issue: https://github.com/apache/spark/pull/16271 Yes the improvement is from the sum of magnitudes of initial values being closer to the (known) sum of the solution. Fiddling with resetProb controls a completely different thing. The current

[GitHub] spark pull request #16271: [SPARK-18845][GraphX] PageRank has incorrect init...

2016-12-15 Thread aray
Github user aray commented on a diff in the pull request: https://github.com/apache/spark/pull/16271#discussion_r92621591 --- Diff: graphx/src/test/scala/org/apache/spark/graphx/lib/PageRankSuite.scala --- @@ -70,10 +70,10 @@ class PageRankSuite extends SparkFunSuite with

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16030 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70190/ Test PASSed.

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16030 Merged build finished. Test PASSed.

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16030 **[Test build #70190 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70190/testReport)** for PR 16030 at commit

[GitHub] spark issue #14079: [SPARK-8425][CORE] Application Level Blacklisting

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/14079 **[Test build #70194 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70194/testReport)** for PR 14079 at commit

[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16057 **[Test build #70193 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70193/testReport)** for PR 16057 at commit

[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16057 **[Test build #70192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70192/testReport)** for PR 16057 at commit

[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16057 Merged build finished. Test FAILed.

[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16057 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70192/ Test FAILed.

[GitHub] spark issue #16057: [SPARK-18624][SQL] Implicit cast ArrayType(InternalType)

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16057 **[Test build #70192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70192/testReport)** for PR 16057 at commit

[GitHub] spark issue #15717: [SPARK-17910][SQL] Allow users to update the comment of ...

2016-12-15 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/15717 @gatorsmile I've updated the PR description, thanks!

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-15 Thread maropu
Github user maropu commented on the issue: https://github.com/apache/spark/pull/16030 @cloud-fan okay, I updated the desc.

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16030 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/70186/ Test PASSed.

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/16030 Merged build finished. Test PASSed.

[GitHub] spark issue #16030: [SPARK-18108][SQL] Fix a bug to fail partition schema in...

2016-12-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/16030 **[Test build #70186 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/70186/testReport)** for PR 16030 at commit
