[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-55852899 @andrewor14 @JoshRosen I am not sure if the test failure is related to the patch. Can you have a look at the failure? Or just retest it? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2330#issuecomment-55854376 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20452/consoleFull) for PR 2330 at commit [`088ed8a`](https://github.com/apache/spark/commit/088ed8ac46bc59bf3d13d4a0be1d4c616a22d698). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55855022 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/121/consoleFull) for PR 2378 at commit [`1fccf1a`](https://github.com/apache/spark/commit/1fccf1adc91e78a6c9e65f4ae14ba770a7eecd2c). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3491] [WIP] [MLlib] [PySpark] use pickl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55855160 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20449/consoleFull) for PR 2378 at commit [`19d0967`](https://github.com/apache/spark/commit/19d096783b60e741173f48f2944d91f650616140). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3477] Clean up code in Yarn Client / Cl...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2350#issuecomment-55855267 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20450/consoleFull) for PR 2350 at commit [`a3b9693`](https://github.com/apache/spark/commit/a3b9693884b99622b626890fba08b2111d661e93). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3491] [MLlib] [PySpark] use pickle to s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55855348 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/122/consoleFull) for PR 2378 at commit [`e431377`](https://github.com/apache/spark/commit/e431377170172571974aadcae7ff42d3a79e2cd9). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3491] [MLlib] [PySpark] use pickle to s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55855377 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20453/consoleFull) for PR 2378 at commit [`e431377`](https://github.com/apache/spark/commit/e431377170172571974aadcae7ff42d3a79e2cd9). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-55856032 This looks like a failure due to a known flaky test:
```
[info] SparkSinkSuite:
[info] - Success with ack *** FAILED ***
[info]   4000 did not equal 5000 (SparkSinkSuite.scala:195)
```
I'm going to let Jenkins re-test this overnight.
[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-55856209 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/123/consoleFull) for PR 1616 at commit [`f9330d4`](https://github.com/apache/spark/commit/f9330d447dd749d530de96739f5e3598f973bd60). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2330#issuecomment-55856201 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20451/consoleFull) for PR 2330 at commit [`a79a259`](https://github.com/apache/spark/commit/a79a25918a96171c4b20c5c9153e5815bc23698e). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-55856239 @JoshRosen thanks
[GitHub] spark pull request: [Docs] Correct spark.files.fetchTimeout defaul...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/2406#issuecomment-55856872 LGTM; thanks for updating the title! I'm going to merge this now.
[GitHub] spark pull request: [Docs] Correct spark.files.fetchTimeout defaul...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2406
[GitHub] spark pull request: [SPARK-2594][SQL] Support CACHE TABLE name A...
Github user liancheng commented on a diff in the pull request: https://github.com/apache/spark/pull/2397#discussion_r17648914
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/commands.scala ---
@@ -166,3 +166,20 @@ case class DescribeCommand(child: SparkPlan, output: Seq[Attribute])( child.output.map(field => Row(field.name, field.dataType.toString, null)) } }
+
+/**
+ * :: DeveloperApi ::
+ */
+@DeveloperApi
+case class CacheTableAsSelectCommand(tableName: String, plan: LogicalPlan)
+  extends LeafNode with Command {
+
+  override protected[sql] lazy val sideEffectResult = {
+    sqlContext.catalog.registerTable(None, tableName, sqlContext.executePlan(plan).analyzed)
--- End diff --
(Probably my final comment on this PR :) ) As described in PR #2382, we shouldn't store analyzed logical plans when registering tables any more (see [here](https://github.com/apache/spark/pull/2382/files?diff=split#diff-5)). To prevent duplicated code, I'd suggest importing `SQLContext._` so that we can leverage [the implicit conversion](https://github.com/apache/spark/blob/008a5ed4808d1467b47c1d6fa4d950cc6c4976b7/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala#L78-L85) from `LogicalPlan` to `SchemaRDD`, and then simply do this:
```scala
sqlContext.executePlan(plan).logical.registerTempTable(tableName)
```
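The implicit-conversion pattern liancheng points to can be sketched with hypothetical stand-in classes (these are not Spark's actual `LogicalPlan`/`SchemaRDD` implementations): once an implicit conversion is in scope, a `SchemaRDD` method can be invoked directly on a `LogicalPlan`.

```scala
import scala.language.implicitConversions

// Hypothetical stand-ins, only to illustrate the pattern.
class LogicalPlan(val name: String)

class SchemaRDD(val plan: LogicalPlan) {
  def registerTempTable(tableName: String): String =
    s"registered ${plan.name} as $tableName"
}

// With this implicit in scope (in Spark, via `import SQLContext._`),
// LogicalPlan picks up SchemaRDD's methods.
implicit def planToSchemaRDD(plan: LogicalPlan): SchemaRDD =
  new SchemaRDD(plan)

val msg = new LogicalPlan("query").registerTempTable("t1")
```

This is why the one-liner in the comment above works even though `registerTempTable` is not defined on the plan itself.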
[GitHub] spark pull request: [SPARK-2594][SQL] Support CACHE TABLE name A...
Github user liancheng commented on the pull request: https://github.com/apache/spark/pull/2397#issuecomment-55857224 LGTM except for the analyzed logical plan issue as mentioned in my last comment. Thanks for working on this!
[GitHub] spark pull request: [SPARK-3319] [SPARK-3338] Resolve Spark submit...
Github user sryza commented on the pull request: https://github.com/apache/spark/pull/2232#issuecomment-55858105 Could this change behavior in cases where spark.yarn.dist.files is configured with no scheme? Without this change, it would interpret no scheme to mean the file is on HDFS, and with it, it would interpret no scheme to mean the file is on the local FS?
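sryza's concern hinges on how a scheme-less path is qualified. A small sketch with a hypothetical helper (this is not Spark's resolution code) shows the two interpretations:

```scala
import java.net.URI

// Hypothetical helper: a path with an explicit scheme is kept as-is;
// a scheme-less path is qualified against the given default filesystem.
def qualify(path: String, defaultScheme: String): URI = {
  val uri = new URI(path)
  if (uri.getScheme != null) uri // explicit scheme wins
  else new URI(defaultScheme, uri.getSchemeSpecificPart, uri.getFragment)
}

// Before the change: no scheme => assumed to be on HDFS.
val before = qualify("/data/part-0000", "hdfs")
// After the change: no scheme => assumed to be on the local FS.
val after = qualify("/data/part-0000", "file")
```

The behavior change sryza asks about is exactly the difference between passing `"hdfs"` and `"file"` as the default here.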
[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2330#issuecomment-55858115 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20452/consoleFull) for PR 2330 at commit [`088ed8a`](https://github.com/apache/spark/commit/088ed8ac46bc59bf3d13d4a0be1d4c616a22d698). * This patch **fails** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2873] [SQL] using ExternalAppendOnlyMap...
Github user guowei2 commented on the pull request: https://github.com/apache/spark/pull/2029#issuecomment-55858585 I've run a micro benchmark on my local machine with 5 records, 1500 keys.

| Type | OnHeapAggregate | ExternalAggregate (10 spills happen) |
| --- | --- | --- |
| First run | 876ms | 16.9s |
| Stabilized runs | 150ms | 15.0s |
[GitHub] spark pull request: [SPARK-3218, SPARK-3219, SPARK-3261, SPARK-342...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2419#issuecomment-55858790 @derrickburns I think these notes can go in code comments? (They each generate their own email too.) This is also a big-bang change covering several issues, some of which seem like more focused bug fixes or improvements. I would think it would be easier to break this down further if possible, and get in clear easy changes first.
[GitHub] spark pull request: [SPARK-3550][MLLIB] fix a unresolved reference...
GitHub user OdinLin opened a pull request: https://github.com/apache/spark/pull/2423 [SPARK-3550][MLLIB] fix an unresolved reference to variable 'nb' The variable `nb` is an unresolved reference in the raise log message. You can merge this pull request into a Git repository by running: $ git pull https://github.com/OdinLin/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2423.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2423 commit 2748cd6b8db8cb9c549b9a9dae1e6889c4995739 Author: linyicong linyic...@dkhs.com Date: 2014-09-17T07:33:15Z fix a unresolved reference bug
[GitHub] spark pull request: [MLLIB] fix a unresolved reference variable 'n...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2423#issuecomment-55860478 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-3491] [MLlib] [PySpark] use pickle to s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55860743 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/122/consoleFull) for PR 2378 at commit [`e431377`](https://github.com/apache/spark/commit/e431377170172571974aadcae7ff42d3a79e2cd9). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3491] [MLlib] [PySpark] use pickle to s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2378#issuecomment-55860805 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20453/consoleFull) for PR 2378 at commit [`e431377`](https://github.com/apache/spark/commit/e431377170172571974aadcae7ff42d3a79e2cd9). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2713] Executors of same application in ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1616#issuecomment-55861628 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/123/consoleFull) for PR 1616 at commit [`f9330d4`](https://github.com/apache/spark/commit/f9330d447dd749d530de96739f5e3598f973bd60). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3564] Display App ID on HistoryPage
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/2424 [SPARK-3564] Display App ID on HistoryPage You can merge this pull request into a Git repository by running: $ git pull https://github.com/sarutak/spark display-appid-on-webui Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2424.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2424 commit 417fe90ab77dfb3f8be5c2755b4738f86e74d5ba Author: Kousuke Saruta saru...@oss.nttdata.co.jp Date: 2014-09-17T08:04:35Z Added App ID column to HistoryPage
[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...
Github user colorant commented on a diff in the pull request: https://github.com/apache/spark/pull/2330#discussion_r17650877
--- Diff: core/src/main/scala/org/apache/spark/network/netty/BlockClientFactory.scala ---
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.network.netty
+
+import java.io.Closeable
+import java.util.concurrent.{ConcurrentHashMap, TimeoutException}
+
+import io.netty.bootstrap.Bootstrap
+import io.netty.buffer.PooledByteBufAllocator
+import io.netty.channel._
+import io.netty.channel.epoll.{Epoll, EpollEventLoopGroup, EpollSocketChannel}
+import io.netty.channel.nio.NioEventLoopGroup
+import io.netty.channel.oio.OioEventLoopGroup
+import io.netty.channel.socket.SocketChannel
+import io.netty.channel.socket.nio.NioSocketChannel
+import io.netty.channel.socket.oio.OioSocketChannel
+import io.netty.util.internal.PlatformDependent
+
+import org.apache.spark.{Logging, SparkConf}
+import org.apache.spark.util.Utils
+
+
+/**
+ * Factory for creating [[BlockClient]] by using createClient.
+ *
+ * The factory maintains a connection pool to other hosts and should return the same [[BlockClient]]
+ * for the same remote host. It also shares a single worker thread pool for all [[BlockClient]]s.
+ */
+private[netty]
+class BlockClientFactory(val conf: NettyConfig) extends Logging with Closeable {
+
+  def this(sparkConf: SparkConf) = this(new NettyConfig(sparkConf))
+
+  /** A thread factory so the threads are named (for debugging). */
+  private[this] val threadFactory = Utils.namedThreadFactory("spark-netty-client")
+
+  /** Socket channel type, initialized by [[init]] depending ioMode. */
+  private[this] var socketChannelClass: Class[_ <: Channel] = _
+
+  /** Thread pool shared by all clients. */
+  private[this] var workerGroup: EventLoopGroup = _
+
+  private[this] val connectionPool = new ConcurrentHashMap[(String, Int), BlockClient]
+
+  // The encoders are stateless and can be shared among multiple clients.
+  private[this] val encoder = new ClientRequestEncoder
+  private[this] val decoder = new ServerResponseDecoder
+
+  init()
+
+  /** Initialize [[socketChannelClass]] and [[workerGroup]] based on ioMode. */
+  private def init(): Unit = {
+    def initOio(): Unit = {
+      socketChannelClass = classOf[OioSocketChannel]
+      workerGroup = new OioEventLoopGroup(0, threadFactory)
+    }
+    def initNio(): Unit = {
+      socketChannelClass = classOf[NioSocketChannel]
+      workerGroup = new NioEventLoopGroup(0, threadFactory)
+    }
+    def initEpoll(): Unit = {
+      socketChannelClass = classOf[EpollSocketChannel]
+      workerGroup = new EpollEventLoopGroup(0, threadFactory)
+    }
+
+    // For auto mode, first try epoll (only available on Linux), then nio.
+    conf.ioMode match {
+      case "nio" => initNio()
+      case "oio" => initOio()
+      case "epoll" => initEpoll()
+      case "auto" => if (Epoll.isAvailable) initEpoll() else initNio()
+    }
+  }
+
+  /**
+   * Create a new BlockFetchingClient connecting to the given remote host / port.
+   *
+   * This blocks until a connection is successfully established.
+   *
+   * Concurrency: This method is safe to call from multiple threads.
+   */
+  def createClient(remoteHost: String, remotePort: Int): BlockClient = {
+    // Get connection from the connection pool first.
+    // If it is not found or not active, create a new one.
+    val cachedClient = connectionPool.get((remoteHost, remotePort))
+    if (cachedClient != null && cachedClient.isActive) {
+      return cachedClient
+    }
+
+    logInfo(s"Creating new connection to $remoteHost:$remotePort")
+
+    // There is a chance two threads are creating two different clients connecting to the same host.
+    // But that's probably ok ...
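The race the trailing comment alludes to (two threads each creating a client for the same host) is commonly closed with `ConcurrentHashMap.putIfAbsent`: the loser discards its freshly built client and adopts the winner's. A simplified sketch with a stand-in `Client` class (not the PR's actual code):

```scala
import java.util.concurrent.ConcurrentHashMap

// Stand-in for BlockClient, just to make the pooling logic concrete.
class Client(val host: String, val port: Int) {
  @volatile var closed = false
  def isActive: Boolean = !closed
  def close(): Unit = { closed = true }
}

val pool = new ConcurrentHashMap[(String, Int), Client]

def getOrCreate(host: String, port: Int): Client = {
  val cached = pool.get((host, port))
  if (cached != null && cached.isActive) return cached

  val fresh = new Client(host, port)
  // putIfAbsent is atomic: if another thread won the race, reuse its client
  // and close ours so we never leak a second connection to the same host.
  val winner = pool.putIfAbsent((host, port), fresh)
  if (winner != null) { fresh.close(); winner } else fresh
}
```

The tradeoff versus the PR's "probably ok" approach is one extra `close()` on the losing side instead of two live connections to the same peer.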
[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2330#discussion_r17650983
--- Diff: core/src/main/scala/org/apache/spark/network/netty/BlockClientFactory.scala --- (same `createClient` hunk of `BlockClientFactory.scala` as quoted in the previous comment)
[GitHub] spark pull request: [SPARK-3564] Display App ID on HistoryPage
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2424#issuecomment-55862451 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20454/consoleFull) for PR 2424 at commit [`417fe90`](https://github.com/apache/spark/commit/417fe90ab77dfb3f8be5c2755b4738f86e74d5ba). * This patch merges cleanly. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: add spark.driver.memory to config docs
Github user ash211 commented on the pull request: https://github.com/apache/spark/pull/2410#issuecomment-55862389 One note about this setting: I'm not sure it works in all cases -- if you pass it to the driver as a parameter, it's too late to take effect (the JVM has already started). ```./bin/spark-shell --driver-java-options -Dspark.driver.memory=1576m``` doesn't actually change driver memory. Maybe the docs should mention where it does and doesn't work?
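The reason the flag is "too late" is that a JVM's heap ceiling is fixed when the process launches; once the driver JVM is up, a `-D` option can only set a system property, and setting a property never resizes the heap. A minimal illustration (the property name is set purely for show -- nothing in the snippet reads it):

```java
// The JVM heap ceiling is fixed at process launch. Setting a system
// property afterwards -- which is all -Dspark.driver.memory passed via
// --driver-java-options can do once the driver JVM is already running --
// cannot move it.
class HeapIsFixed {
    static boolean heapUnchangedAfterProperty() {
        long before = Runtime.getRuntime().maxMemory();
        System.setProperty("spark.driver.memory", "1576m"); // too late: heap already sized
        long after = Runtime.getRuntime().maxMemory();
        return before == after;
    }
}
```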
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
GitHub user ScrapCodes opened a pull request: https://github.com/apache/spark/pull/2425 [SPARK-3543] Write TaskContext in Java and expose it through a static accessor. You can merge this pull request into a Git repository by running: $ git pull https://github.com/ScrapCodes/spark-1 SPARK-3543/withTaskContext Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2425.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2425 commit 333c7d644973c721f0b0509399579723d2a43446 Author: Prashant Sharma prashan...@imaginea.com Date: 2014-09-17T07:40:47Z Translated Task context from scala to java. commit f716fd1b0d84bcb08889b62af2d5f7a6d14b1cab Author: Prashant Sharma prashan...@imaginea.com Date: 2014-09-17T08:15:39Z introduced thread local for getting the task context.
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651529

--- Diff: core/src/main/java/org/apache/spark/TaskContext.java ---
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import scala.Function0;
+import scala.Function1;
+import scala.Unit;
+import scala.collection.JavaConversions;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.executor.TaskMetrics;
+import org.apache.spark.util.TaskCompletionListener;
+import org.apache.spark.util.TaskCompletionListenerException;
+
+@DeveloperApi
+public class TaskContext implements Serializable {
+
+  public Integer stageId;
+  public Integer partitionId;
+  public Long attemptId;
+  public Boolean runningLocally;
+  public TaskMetrics taskMetrics;
+
+  public TaskContext(Integer stageId, Integer partitionId, Long attemptId, Boolean runningLocally,
+      TaskMetrics taskMetrics) {
+    this.attemptId = attemptId;
+    this.partitionId = partitionId;
+    this.runningLocally = runningLocally;
+    this.stageId = stageId;
+    this.taskMetrics = taskMetrics;
+    taskContext.set(this);
+  }
+
+  public TaskContext(Integer stageId, Integer partitionId, Long attemptId,
+      Boolean runningLocally) {
+    this.attemptId = attemptId;
+    this.partitionId = partitionId;
+    this.runningLocally = runningLocally;
+    this.stageId = stageId;
+    this.taskMetrics = TaskMetrics.empty();
+    taskContext.set(this);
+  }
+
+  public TaskContext(Integer stageId, Integer partitionId, Long attemptId) {
+    this.attemptId = attemptId;
+    this.partitionId = partitionId;
+    this.runningLocally = false;
+    this.stageId = stageId;
+    this.taskMetrics = TaskMetrics.empty();
+    taskContext.set(this);
+  }
+
--- End diff --

extra line
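The pattern the quoted constructors rely on -- stashing `this` in a static `ThreadLocal` so that a static `get()` returns the context belonging to the current thread -- can be sketched in isolation. `Ctx` below is a hypothetical, stripped-down stand-in for `TaskContext`:

```java
// Minimal thread-local "current context" pattern: each thread that
// constructs a Ctx registers it for itself; Ctx.get() reads the
// registration for whichever thread calls it.
class Ctx {
    private static final ThreadLocal<Ctx> current = new ThreadLocal<>();

    private final int stageId;

    Ctx(int stageId) {
        this.stageId = stageId;
        current.set(this); // visible only to the constructing thread
    }

    static Ctx get() {
        return current.get();
    }

    int stageId() {
        return stageId;
    }
}
```

A thread that never constructed a `Ctx` sees `null` from `get()`, which is exactly why a thread-local works for per-task state: tasks on different executor threads cannot observe each other's context.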
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651524

--- Diff: core/src/main/java/org/apache/spark/TaskContext.java ---
+  private static ThreadLocal<TaskContext> taskContext =
--- End diff --

u don't need to wrap this, do u?
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651575

--- Diff: core/src/main/java/org/apache/spark/TaskContext.java ---
@@ -0,0 +1,176 @@
(license header, package, imports, public fields, and the three constructors as in the hunk quoted above, then:)
+
+
+  private static ThreadLocal<TaskContext> taskContext =
+      new ThreadLocal<TaskContext>();
+
+  public static TaskContext get() {
+    return taskContext.get();
+  }
+
+  // List of callback functions to execute when the task completes.
+  private transient List<TaskCompletionListener> onCompleteCallbacks =
+      new ArrayList<TaskCompletionListener>();
+
+  // Whether the corresponding task has been killed.
+  private volatile Boolean interrupted = false;
+
+  // Whether the task has completed.
+  private volatile Boolean completed = false;
+
+  /**
+   * Checks whether the task has completed.
+   */
+  public Boolean isCompleted() {
+    return completed;
+  }
+
+  /**
+   * Checks whether the task has been killed.
+   */
+  public Boolean isInterrupted() {
+    return interrupted;
+  }
+
+
+  /**
+   * Add a (Java friendly) listener to be executed on task completion.
+   * This will be called in all situations - success, failure, or cancellation.
+   *
+   * An example use is for HadoopRDD to register a callback to close the input stream.
+   */
+  public TaskContext addTaskCompletionListener(TaskCompletionListener listener){
+    onCompleteCallbacks.add(listener);
+    return this;
+  }
+
+
+  /**
+   * Add a listener in the form of a Scala closure to be executed on task completion.
+   * This will be called in all situations - success, failure, or cancellation.
+   *
+   * An example use is for HadoopRDD to register a callback to close the input stream.
+   */
+  public TaskContext addTaskCompletionListener(final
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2425#issuecomment-55863345 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20455/consoleFull) for PR 2425 at commit [`f716fd1`](https://github.com/apache/spark/commit/f716fd1b0d84bcb08889b62af2d5f7a6d14b1cab). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651616

--- Diff: core/src/main/java/org/apache/spark/TaskContext.java ---
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651635

--- Diff: core/src/main/java/org/apache/spark/TaskContext.java ---
+  public Integer stageId;
--- End diff --

do we do 2 space or 4 space indent?
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651668

--- Diff: core/src/main/java/org/apache/spark/TaskContext.java ---
+  public Integer stageId;
--- End diff --

actually don't make them public - this breaks binary compatibility right now. you should make them private, and create a public accessor stageId()
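The suggestion above -- a private field behind a `stageId()` accessor -- keeps the public surface a method call rather than a field read, so the field's representation can change later without breaking already-compiled callers (a field access compiles to a direct `getfield`; a method call survives the field becoming computed or renamed). A sketch under that suggestion, with a hypothetical class name:

```java
// Private field + public accessor: the accessor is the binary contract,
// the field is a hidden implementation detail that may change freely.
class TaskInfo {
    private final int stageId; // not public: callers bind to the method, not the field

    public TaskInfo(int stageId) {
        this.stageId = stageId;
    }

    // Callers compiled against this method keep working even if the
    // field later becomes derived, lazily computed, or differently typed.
    public int stageId() {
        return stageId;
    }
}
```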
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651760

--- Diff: core/src/main/java/org/apache/spark/TaskContext.java ---
+  public Boolean isInterrupted() {
+    return interrupted;
+  }
+
--- End diff --

can u go through the file and use only one line to separate functions
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651763

--- Diff: core/src/main/java/org/apache/spark/TaskContext.java ---
+  public TaskContext addTaskCompletionListener(TaskCompletionListener listener){
--- End diff --

space before {
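The `addTaskCompletionListener` method under review appends a listener and returns `this`, so registrations chain. A simplified sketch of that register-then-fire pattern; the firing details (invocation order, error handling) are assumptions for illustration, not taken from the quoted diff:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-method listener, lambda-friendly.
interface CompletionListener {
    void onTaskCompletion();
}

class ListenerContext {
    private final List<CompletionListener> callbacks = new ArrayList<>();
    private boolean completed = false;

    // Returns this so registrations chain, as in the quoted API.
    ListenerContext addTaskCompletionListener(CompletionListener l) {
        callbacks.add(l);
        return this;
    }

    // Fire every registered listener once the task finishes -- the quoted
    // javadoc promises this in all situations: success, failure, or
    // cancellation.
    void markTaskCompleted() {
        completed = true;
        for (CompletionListener l : callbacks) {
            l.onTaskCompletion();
        }
    }

    boolean isCompleted() {
        return completed;
    }
}
```

This is the shape the quoted javadoc hints at with HadoopRDD: register a callback that closes the input stream, and it runs no matter how the task ends.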
[GitHub] spark pull request: [SPARK-3566] [BUILD] .gitignore and .rat-exclu...
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/2426 [SPARK-3566] [BUILD] .gitignore and .rat-excludes should consider cmd file and Emacs' backup files You can merge this pull request into a Git repository by running: $ git pull https://github.com/sarutak/spark emacs-metafiles-ignore Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2426.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2426 commit 8cade06dce9ea0ed09c47c01e31efa9e92312fcb Author: Kousuke Saruta saru...@oss.nttdata.co.jp Date: 2014-09-16T00:40:29Z Modified .gitignore to ignore emacs lock file and backup file commit 897da6321001bc3f9344ee0b501628d5c579d28a Author: Kousuke Saruta saru...@oss.nttdata.co.jp Date: 2014-09-17T08:12:51Z Merge branch 'master' of git://git.apache.org/spark into emacs-metafiles-ignore commit 6a0a5eb6a352009dd8c6b902a5253b8846b1bffc Author: Kousuke Saruta saru...@oss.nttdata.co.jp Date: 2014-09-17T08:22:28Z Added cmd file entry to .rat-excludes and .gitignore
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17651774

--- Diff: core/src/main/java/org/apache/spark/TaskContext.java ---
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import scala.Function0;
+import scala.Function1;
+import scala.Unit;
+import scala.collection.JavaConversions;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.executor.TaskMetrics;
+import org.apache.spark.util.TaskCompletionListener;
+import org.apache.spark.util.TaskCompletionListenerException;
+
+@DeveloperApi
+public class TaskContext implements Serializable {
+
+  public Integer stageId;
+  public Integer partitionId;
+  public Long attemptId;
+  public Boolean runningLocally;
+  public TaskMetrics taskMetrics;
+
+  public TaskContext(Integer stageId, Integer partitionId, Long attemptId, Boolean runningLocally,
+      TaskMetrics taskMetrics) {
+    this.attemptId = attemptId;
+    this.partitionId = partitionId;
+    this.runningLocally = runningLocally;
+    this.stageId = stageId;
+    this.taskMetrics = taskMetrics;
+    taskContext.set(this);
+  }
+
+  public TaskContext(Integer stageId, Integer partitionId, Long attemptId,
+      Boolean runningLocally) {
+    this.attemptId = attemptId;
+    this.partitionId = partitionId;
+    this.runningLocally = runningLocally;
+    this.stageId = stageId;
+    this.taskMetrics = TaskMetrics.empty();
+    taskContext.set(this);
+  }
+
+  public TaskContext(Integer stageId, Integer partitionId, Long attemptId) {
+    this.attemptId = attemptId;
+    this.partitionId = partitionId;
+    this.runningLocally = false;
+    this.stageId = stageId;
+    this.taskMetrics = TaskMetrics.empty();
+    taskContext.set(this);
+  }
+
+  private static ThreadLocal<TaskContext> taskContext =
+    new ThreadLocal<TaskContext>();
+
+  public static TaskContext get() {
+    return taskContext.get();
+  }
+
+  // List of callback functions to execute when the task completes.
+  private transient List<TaskCompletionListener> onCompleteCallbacks =
+    new ArrayList<TaskCompletionListener>();
+
+  // Whether the corresponding task has been killed.
+  private volatile Boolean interrupted = false;
+
+  // Whether the task has completed.
+  private volatile Boolean completed = false;
+
+  /**
+   * Checks whether the task has completed.
+   */
+  public Boolean isCompleted() {
+    return completed;
+  }
+
+  /**
+   * Checks whether the task has been killed.
+   */
+  public Boolean isInterrupted() {
+    return interrupted;
+  }
+
+  /**
+   * Add a (Java friendly) listener to be executed on task completion.
+   * This will be called in all situation - success, failure, or cancellation.
+   *
+   * An example use is for HadoopRDD to register a callback to close the input stream.
+   */
+  public TaskContext addTaskCompletionListener(TaskCompletionListener listener){
+    onCompleteCallbacks.add(listener);
+    return this;
+  }
+
+  /**
+   * Add a listener in the form of a Scala closure to be executed on task completion.
+   * This will be called in all situation - success, failure, or cancellation.
+   *
+   * An example use is for HadoopRDD to register a callback to close the input stream.
+   */
+  public TaskContext addTaskCompletionListener(final
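The listener-registration API quoted in the diff above can be exercised outside Spark with a minimal stand-in. The following is a hedged sketch, not Spark's actual TaskContext: `MiniTaskContext`, `CompletionListener`, and `markTaskCompleted` are hypothetical names standing in for the PR's TaskContext, TaskCompletionListener, and whatever the task runner invokes on completion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for org.apache.spark.util.TaskCompletionListener.
interface CompletionListener {
  void onTaskCompletion(MiniTaskContext context);
}

// Minimal sketch of the completion-listener pattern shown in the diff.
class MiniTaskContext {
  private final List<CompletionListener> onCompleteCallbacks = new ArrayList<>();
  private volatile boolean completed = false;

  // Returns this so registrations can be chained, mirroring the PR's API.
  MiniTaskContext addTaskCompletionListener(CompletionListener listener) {
    onCompleteCallbacks.add(listener);
    return this;
  }

  boolean isCompleted() {
    return completed;
  }

  // Called by the (hypothetical) task runner on success, failure, or cancellation.
  void markTaskCompleted() {
    completed = true;
    for (CompletionListener listener : onCompleteCallbacks) {
      listener.onTaskCompletion(this);
    }
  }
}
```

Because the callbacks fire regardless of how the task ends, a consumer such as HadoopRDD can rely on this hook to close its input stream in all three outcomes.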
[GitHub] spark pull request: [SPARK-3565]Fix configuration item not consist...
GitHub user WangTaoTheTonic opened a pull request: https://github.com/apache/spark/pull/2427

[SPARK-3565]Fix configuration item not consistent with document

https://issues.apache.org/jira/browse/SPARK-3565

spark.ports.maxRetries should be spark.port.maxRetries. Make the configuration keys in document and code consistent.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/WangTaoTheTonic/spark fixPortRetries

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2427.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2427

commit 3700dbaef56b573b67c8ca6824a0b5037a70608d
Author: WangTaoTheTonic barneystin...@aliyun.com
Date: 2014-09-17T08:23:45Z
Fix configuration item not consistent with document
[GitHub] spark pull request: [SPARK-3007][SQL]Add Dynamic Partition suppo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2226#issuecomment-55863834 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20456/consoleFull) for PR 2226 at commit [`5033928`](https://github.com/apache/spark/commit/5033928ee8709c3d1b758856efe2c6cb951ea574). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3565]Fix configuration item not consist...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2427#issuecomment-55864303 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20457/consoleFull) for PR 2427 at commit [`3700dba`](https://github.com/apache/spark/commit/3700dbaef56b573b67c8ca6824a0b5037a70608d). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3566] [BUILD] .gitignore and .rat-exclu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2426#issuecomment-55864329 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20458/consoleFull) for PR 2426 at commit [`6a0a5eb`](https://github.com/apache/spark/commit/6a0a5eb6a352009dd8c6b902a5253b8846b1bffc). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-2745 [STREAMING] Add Java friendly metho...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/2403#issuecomment-55864499

I'm not sure what to make of the MIMA errors:

* the type hierarchy of object org.apache.spark.streaming.Duration has changed in new version. Missing types {scala.runtime.AbstractFunction1}
* method apply(java.lang.Object)java.lang.Object in object org.apache.spark.streaming.Duration's type has changed; was (java.lang.Object)java.lang.Object, is now: (Long)org.apache.spark.streaming.Duration
* method toString()java.lang.String in object org.apache.spark.streaming.Duration does not have a correspondent in new version

I assume this is a side-effect of adding `object Duration`, but I don't see why this would have changed the `apply` method this way. `toString` still definitely exists. So I wonder if this is a false positive and should be suppressed?
[GitHub] spark pull request: [SPARK-3485][SQL] Use GenericUDFUtils.Conversi...
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/2407#issuecomment-55864941 I added a test case to this file; the current master fails it because `Short` was missing from the `primitiveTypes` here. With this new solution we can get rid of maintaining `primitiveTypes` and eliminate potential bugs here.
[GitHub] spark pull request: [SPARK-3485][SQL] Use GenericUDFUtils.Conversi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2407#issuecomment-55865221 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20459/consoleFull) for PR 2407 at commit [`15762d2`](https://github.com/apache/spark/commit/15762d2f3835644397d9dbf5ba87bedd9bb91e1c). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...
Github user colorant commented on a diff in the pull request: https://github.com/apache/spark/pull/2330#discussion_r17652889

--- Diff: core/src/main/scala/org/apache/spark/network/netty/BlockServer.scala ---
@@ -121,18 +91,18 @@ class BlockServer(conf: NettyConfig, dataProvider: BlockDataProvider) extends Lo
       bootstrap.option[java.lang.Integer](ChannelOption.SO_BACKLOG, backLog)
     }
     conf.receiveBuf.foreach { receiveBuf =>
-      bootstrap.option[java.lang.Integer](ChannelOption.SO_RCVBUF, receiveBuf)
+      bootstrap.childOption[java.lang.Integer](ChannelOption.SO_RCVBUF, receiveBuf)
     }
     conf.sendBuf.foreach { sendBuf =>
-      bootstrap.option[java.lang.Integer](ChannelOption.SO_SNDBUF, sendBuf)
+      bootstrap.childOption[java.lang.Integer](ChannelOption.SO_SNDBUF, sendBuf)
     }
     bootstrap.childHandler(new ChannelInitializer[SocketChannel] {
       override def initChannel(ch: SocketChannel): Unit = {
         ch.pipeline
-          .addLast("frameDecoder", new LineBasedFrameDecoder(1024))  // max block id length 1024
-          .addLast("stringDecoder", new StringDecoder(CharsetUtil.UTF_8))
-          .addLast("blockHeaderEncoder", new BlockHeaderEncoder)
+          .addLast("frameDecoder", ProtocolUtils.createFrameDecoder())
+          .addLast("clientRequestDecoder", new ClientRequestDecoder)
+          .addLast("serverResponseEncoder", new ServerResponseEncoder)
           .addLast("handler", new BlockServerHandler(dataProvider))
--- End diff --

Should this handler run on a separate EventLoopGroup, since GetBlockData might block?
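The concern raised above, that a blocking GetBlockData call would stall an I/O thread, is commonly addressed by handing blocking work to a dedicated executor so the event loop stays responsive (in Netty this is typically done by registering the handler with its own executor group via an `addLast` overload). The following library-free sketch illustrates the general offloading idea only; `BlockingWorkOffloader` and the pool size are assumptions for illustration, not part of the PR.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Sketch: keep the event-loop thread responsive by handing potentially
// blocking work (such as reading block data from disk) to a dedicated pool.
class BlockingWorkOffloader {
  // Pool size of 4 is an arbitrary assumption for illustration.
  private final ExecutorService blockingPool = Executors.newFixedThreadPool(4);

  // Called from the event loop; submits the work and returns immediately.
  void handle(Runnable blockingTask) {
    blockingPool.submit(blockingTask);
  }

  // Drain outstanding work on shutdown.
  void shutdown() throws InterruptedException {
    blockingPool.shutdown();
    blockingPool.awaitTermination(5, TimeUnit.SECONDS);
  }
}
```

The trade-off is an extra thread hop per request, which is why servers usually offload only the handlers that can actually block.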
[GitHub] spark pull request: [SPARK-3453] [WIP] Refactor Netty module to us...
Github user colorant commented on a diff in the pull request: https://github.com/apache/spark/pull/2330#discussion_r17653172

--- Diff: core/src/main/scala/org/apache/spark/network/netty/BlockClientFactory.scala ---
@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.network.netty
+
+import java.io.Closeable
+import java.util.concurrent.{ConcurrentHashMap, TimeoutException}
+
+import io.netty.bootstrap.Bootstrap
+import io.netty.buffer.PooledByteBufAllocator
+import io.netty.channel._
+import io.netty.channel.epoll.{Epoll, EpollEventLoopGroup, EpollSocketChannel}
+import io.netty.channel.nio.NioEventLoopGroup
+import io.netty.channel.oio.OioEventLoopGroup
+import io.netty.channel.socket.SocketChannel
+import io.netty.channel.socket.nio.NioSocketChannel
+import io.netty.channel.socket.oio.OioSocketChannel
+import io.netty.util.internal.PlatformDependent
+
+import org.apache.spark.{Logging, SparkConf}
+import org.apache.spark.util.Utils
+
+
+/**
+ * Factory for creating [[BlockClient]] by using createClient.
+ *
+ * The factory maintains a connection pool to other hosts and should return the same [[BlockClient]]
+ * for the same remote host. It also shares a single worker thread pool for all [[BlockClient]]s.
+ */
+private[netty]
+class BlockClientFactory(val conf: NettyConfig) extends Logging with Closeable {
+
+  def this(sparkConf: SparkConf) = this(new NettyConfig(sparkConf))
+
+  /** A thread factory so the threads are named (for debugging). */
+  private[this] val threadFactory = Utils.namedThreadFactory("spark-netty-client")
+
+  /** Socket channel type, initialized by [[init]] depending ioMode. */
+  private[this] var socketChannelClass: Class[_ <: Channel] = _
+
+  /** Thread pool shared by all clients. */
+  private[this] var workerGroup: EventLoopGroup = _
+
+  private[this] val connectionPool = new ConcurrentHashMap[(String, Int), BlockClient]
+
+  // The encoders are stateless and can be shared among multiple clients.
+  private[this] val encoder = new ClientRequestEncoder
+  private[this] val decoder = new ServerResponseDecoder
+
+  init()
+
+  /** Initialize [[socketChannelClass]] and [[workerGroup]] based on ioMode. */
+  private def init(): Unit = {
+    def initOio(): Unit = {
+      socketChannelClass = classOf[OioSocketChannel]
+      workerGroup = new OioEventLoopGroup(0, threadFactory)
+    }
+    def initNio(): Unit = {
+      socketChannelClass = classOf[NioSocketChannel]
+      workerGroup = new NioEventLoopGroup(0, threadFactory)
+    }
+    def initEpoll(): Unit = {
+      socketChannelClass = classOf[EpollSocketChannel]
+      workerGroup = new EpollEventLoopGroup(0, threadFactory)
+    }
+
+    // For auto mode, first try epoll (only available on Linux), then nio.
+    conf.ioMode match {
+      case "nio" => initNio()
+      case "oio" => initOio()
+      case "epoll" => initEpoll()
+      case "auto" => if (Epoll.isAvailable) initEpoll() else initNio()
+    }
+  }
+
+  /**
+   * Create a new BlockFetchingClient connecting to the given remote host / port.
+   *
+   * This blocks until a connection is successfully established.
+   *
+   * Concurrency: This method is safe to call from multiple threads.
+   */
+  def createClient(remoteHost: String, remotePort: Int): BlockClient = {
+    // Get connection from the connection pool first.
+    // If it is not found or not active, create a new one.
+    val cachedClient = connectionPool.get((remoteHost, remotePort))
+    if (cachedClient != null && cachedClient.isActive) {
+      return cachedClient
+    }
+
+    logInfo(s"Creating new connection to $remoteHost:$remotePort")
+
+    // There is a chance two threads are creating two different clients connecting to the same host.
+    // But that's probably ok ...
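The race acknowledged in that last comment (two threads each building a client for the same host) can be made harmless with `ConcurrentHashMap.putIfAbsent`, which keeps exactly one winner. The following is a hypothetical standalone sketch of that check-then-create shape; `PooledClient` and `ClientPool` are illustration names, not Spark's BlockClient or BlockClientFactory.

```java
import java.util.concurrent.ConcurrentHashMap;

// Illustration stand-in for a remote-connection handle (not Spark's BlockClient).
class PooledClient {
  final String host;
  final int port;
  PooledClient(String host, int port) { this.host = host; this.port = port; }
  boolean isActive() { return true; }  // simplified: connections never go stale here
}

// Check-then-create pooling in the style of createClient above. putIfAbsent
// resolves the two-thread race: the loser discards its freshly built client.
class ClientPool {
  private final ConcurrentHashMap<String, PooledClient> pool = new ConcurrentHashMap<>();

  PooledClient getOrCreate(String host, int port) {
    String key = host + ":" + port;
    PooledClient cached = pool.get(key);
    if (cached != null && cached.isActive()) {
      return cached;
    }
    PooledClient fresh = new PooledClient(host, port);
    PooledClient existing = pool.putIfAbsent(key, fresh);
    return existing != null ? existing : fresh;
  }
}
```

In real code the losing thread would also close its discarded connection rather than leak it; that cleanup is omitted here for brevity.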
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/spark/pull/2425#discussion_r17653450

--- Diff: core/src/main/java/org/apache/spark/TaskContext.java ---
@@ -0,0 +1,176 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark;
+
+import java.io.Serializable;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import scala.Function0;
+import scala.Function1;
+import scala.Unit;
+import scala.collection.JavaConversions;
+
+import org.apache.spark.annotation.DeveloperApi;
+import org.apache.spark.executor.TaskMetrics;
+import org.apache.spark.util.TaskCompletionListener;
+import org.apache.spark.util.TaskCompletionListenerException;
+
+@DeveloperApi
+public class TaskContext implements Serializable {
+
+  public Integer stageId;
+  public Integer partitionId;
+  public Long attemptId;
+  public Boolean runningLocally;
+  public TaskMetrics taskMetrics;
+
+  public TaskContext(Integer stageId, Integer partitionId, Long attemptId, Boolean runningLocally,
+      TaskMetrics taskMetrics) {
+    this.attemptId = attemptId;
+    this.partitionId = partitionId;
+    this.runningLocally = runningLocally;
+    this.stageId = stageId;
+    this.taskMetrics = taskMetrics;
+    taskContext.set(this);
+  }
+
+  public TaskContext(Integer stageId, Integer partitionId, Long attemptId,
+      Boolean runningLocally) {
+    this.attemptId = attemptId;
+    this.partitionId = partitionId;
+    this.runningLocally = runningLocally;
+    this.stageId = stageId;
+    this.taskMetrics = TaskMetrics.empty();
+    taskContext.set(this);
+  }
+
+  public TaskContext(Integer stageId, Integer partitionId, Long attemptId) {
+    this.attemptId = attemptId;
+    this.partitionId = partitionId;
+    this.runningLocally = false;
+    this.stageId = stageId;
+    this.taskMetrics = TaskMetrics.empty();
+    taskContext.set(this);
+  }
+
+  private static ThreadLocal<TaskContext> taskContext =
--- End diff --

You mean text wrap? Yes, in one line they are 114 chars.
[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...
Github user adrian-wang closed the pull request at: https://github.com/apache/spark/pull/2355
[GitHub] spark pull request: [SPARK-3485][SQL] should check parameter type ...
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/2355#issuecomment-55867500 Maybe #2407 is better; I'll close this.
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/2425#issuecomment-55867575 @rxin One side question: the Java8ApiSuite(s) don't compile; it looks like we have been overlooking them for a while. Maybe we could just remove them?
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2425#issuecomment-55868820 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20461/consoleFull) for PR 2425 at commit [`edf945e`](https://github.com/apache/spark/commit/edf945e6765314f0b87092d460f71aa70feecdc5). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3007][SQL]Add Dynamic Partition suppo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2226#issuecomment-55868821 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20462/consoleFull) for PR 2226 at commit [`096bbbc`](https://github.com/apache/spark/commit/096bbbc261e6e5015d7ff4b90f299ea321ba7651). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3566] [BUILD] .gitignore and .rat-exclu...
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/2426#issuecomment-55868873 You can have a global gitignore: https://help.github.com/articles/ignoring-files. I am not sure how many editors and such we are going to support. Mind closing this PR?
[GitHub] spark pull request: [SPARK-3564] Display App ID on HistoryPage
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2424#issuecomment-55868956 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20454/consoleFull) for PR 2424 at commit [`417fe90`](https://github.com/apache/spark/commit/417fe90ab77dfb3f8be5c2755b4738f86e74d5ba). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3565]Fix configuration item not consist...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2427#issuecomment-55869240 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20457/consoleFull) for PR 2427 at commit [`3700dba`](https://github.com/apache/spark/commit/3700dbaef56b573b67c8ca6824a0b5037a70608d). * This patch **fails** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3566] [BUILD] .gitignore and .rat-exclu...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2426#issuecomment-55869359 @ScrapCodes Thanks for your comment. Vim/vi and Emacs are the most popular editors, and the current .gitignore already covers Vim's .swp files, so Emacs should be considered too. Also, this PR includes the modification for Windows batch files, so I cannot close this PR, sorry.
[GitHub] spark pull request: [SPARK-3566] [BUILD] .gitignore and .rat-exclu...
Github user ScrapCodes commented on the pull request: https://github.com/apache/spark/pull/2426#issuecomment-55869795 There is nothing like spark-env.cmd in the code base? Also, Emacs is my editor of choice too, and I add those excludes to my global gitignore simply because I cannot go and update the .gitignore of every open source project I work on.
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2425#issuecomment-55869923 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20455/consoleFull) for PR 2425 at commit [`f716fd1`](https://github.com/apache/spark/commit/f716fd1b0d84bcb08889b62af2d5f7a6d14b1cab). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2344#issuecomment-55870259 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20463/consoleFull) for PR 2344 at commit [`413f946`](https://github.com/apache/spark/commit/413f946ea2d9488b08186307bed550fa298a8a23). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3566] [BUILD] .gitignore and .rat-exclu...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2426#issuecomment-55870586 The current code base doesn't include spark-env.cmd, but users can create one, and it is used by spark-class2.cmd, run-examples2.cmd, compute-classpath.cmd and pyspark2.cmd. And by that logic, the entry for .swp files should be removed from .gitignore as well.
[GitHub] spark pull request: [SPARK-3007][SQL]Add Dynamic Partition suppo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2226#issuecomment-55872750 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20456/consoleFull) for PR 2226 at commit [`5033928`](https://github.com/apache/spark/commit/5033928ee8709c3d1b758856efe2c6cb951ea574). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2425#issuecomment-55873286 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20464/consoleFull) for PR 2425 at commit [`a7d5e23`](https://github.com/apache/spark/commit/a7d5e23330a159e91581a35d46e1846770eb421d). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2709] A tool to output all public API o...
Github user ScrapCodes closed the pull request at: https://github.com/apache/spark/pull/1688
[GitHub] spark pull request: [SPARK-3485][SQL] Use GenericUDFUtils.Conversi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2407#issuecomment-55874010 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20459/consoleFull) for PR 2407 at commit [`15762d2`](https://github.com/apache/spark/commit/15762d2f3835644397d9dbf5ba87bedd9bb91e1c). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3567] appId field in SparkDeploySchedul...
GitHub user sarutak opened a pull request: https://github.com/apache/spark/pull/2428 [SPARK-3567] appId field in SparkDeploySchedulerBackend should be volatile You can merge this pull request into a Git repository by running: $ git pull https://github.com/sarutak/spark appid-volatile-modification Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2428.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2428 commit c7d890d1557254ff7ea01263a6512a3d846d4270 Author: Kousuke Saruta saru...@oss.nttdata.co.jp Date: 2014-09-17T10:19:30Z Added volatile modifier to appId field in SparkDeploySchedulerBackend
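The patch above is a one-word change; the concurrency rationale can be sketched outside Spark as follows (a minimal illustration, not Spark's actual code — class and method names here are invented). A field written by a cluster-registration callback thread and read by another thread needs `volatile` so the reader is guaranteed to see the latest write:

```java
// Minimal sketch: one thread publishes an application id, another reads it.
// Without volatile, the Java memory model permits the reader thread to see
// a stale null even after the writer has stored the id.
public class AppIdHolder {
    // volatile: writes by the callback thread become visible to all threads
    private volatile String appId = null;

    // invoked from a registration-callback thread
    public void connected(String id) { appId = id; }

    // invoked from a different (e.g. scheduler) thread
    public String applicationId() { return appId; }

    public static void main(String[] args) throws InterruptedException {
        AppIdHolder holder = new AppIdHolder();
        Thread callback = new Thread(() -> holder.connected("app-20140917-0001"));
        callback.start();
        callback.join();  // join also establishes happens-before for this demo
        System.out.println(holder.applicationId());  // prints app-20140917-0001
    }
}
```

The demo's `join()` would make the write visible even without `volatile`; the modifier matters in the real scenario where the reader does not synchronize with the writer.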
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2425#issuecomment-55875586 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20461/consoleFull) for PR 2425 at commit [`edf945e`](https://github.com/apache/spark/commit/edf945e6765314f0b87092d460f71aa70feecdc5). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3567] appId field in SparkDeploySchedul...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2428#issuecomment-55875701 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20465/consoleFull) for PR 2428 at commit [`c7d890d`](https://github.com/apache/spark/commit/c7d890d1557254ff7ea01263a6512a3d846d4270). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-3566] [BUILD] .gitignore and .rat-exclu...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2426#issuecomment-55876167 **[Tests timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20458/consoleFull)** after a configured wait of `120m`.
[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2344#issuecomment-55877253 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20463/consoleFull) for PR 2344 at commit [`413f946`](https://github.com/apache/spark/commit/413f946ea2d9488b08186307bed550fa298a8a23). * This patch **fails** unit tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class Cast(child: Expression, dataType: DataType) extends UnaryExpression with Logging ` * `public class DateType extends DataType `
[GitHub] spark pull request: [SPARK-3007][SQL]Add Dynamic Partition suppo...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2226#issuecomment-55877476 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20462/consoleFull) for PR 2226 at commit [`096bbbc`](https://github.com/apache/spark/commit/096bbbc261e6e5015d7ff4b90f299ea321ba7651). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3543] Write TaskContext in Java and exp...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2425#issuecomment-55879258 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20464/consoleFull) for PR 2425 at commit [`a7d5e23`](https://github.com/apache/spark/commit/a7d5e23330a159e91581a35d46e1846770eb421d). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3567] appId field in SparkDeploySchedul...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2428#issuecomment-55881519 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20465/consoleFull) for PR 2428 at commit [`c7d890d`](https://github.com/apache/spark/commit/c7d890d1557254ff7ea01263a6512a3d846d4270). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2594][SQL] Support CACHE TABLE name A...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/spark/pull/2397#discussion_r17659871 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/commands.scala --- @@ -166,3 +166,20 @@ case class DescribeCommand(child: SparkPlan, output: Seq[Attribute])( child.output.map(field => Row(field.name, field.dataType.toString, null)) } } + +/** + * :: DeveloperApi :: + */ +@DeveloperApi +case class CacheTableAsSelectCommand(tableName: String, plan: LogicalPlan) + extends LeafNode with Command { + + override protected[sql] lazy val sideEffectResult = { +sqlContext.catalog.registerTable(None, tableName, sqlContext.executePlan(plan).analyzed) --- End diff -- Thank you for your comment. It is a good idea to import ```sqlContext._```. If we import it, we can simplify the code as below. Please comment on it. ``` import sqlContext._ plan.registerTempTable(tableName) cacheTable(tableName) ```
[GitHub] spark pull request: SPARK-3223 runAsSparkUser cannot change HDFS w...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2126#issuecomment-55887296 @jongyoul sorry I don't know what you mean by on Aug?
[GitHub] spark pull request: [SPARK-3407][SQL]Add Date type support
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/2344#issuecomment-55889454 Just rebased the code; the failure is in spark-core and not in this patch.
[GitHub] spark pull request: SPARK-3223 runAsSparkUser cannot change HDFS w...
Github user jongyoul commented on the pull request: https://github.com/apache/spark/pull/2126#issuecomment-55889694 @tgravescs Not seriously; I'll deal with this issue if it's not fixed until Sep. On Wednesday, September 17, 2014, Tom Graves notificati...@github.com wrote: @jongyoul https://github.com/jongyoul sorry I don't know what you mean by on Aug? — Reply to this email directly or view it on GitHub https://github.com/apache/spark/pull/2126#issuecomment-55887296. -- 이종열, Jongyoul Lee, http://madeng.net
[GitHub] spark pull request: SPARK-3223 runAsSparkUser cannot change HDFS w...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2126#issuecomment-55892735 Ok, sounds good.
[GitHub] spark pull request: [SPARK-2778] [yarn] Add yarn integration tests...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2257#issuecomment-55895625 @pwendell @andrewor14 Any concerns with the time on this? I know our unit tests already take a long time to run, but this doesn't seem too bad. I think going forward, before we add a bunch more tests, we might need to figure out how to run them separately.
[GitHub] spark pull request: [SPARK-3304] [YARN] ApplicationMaster's Finish...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2198#issuecomment-55895905 @sarutak it is going to be another day or 2 before I can get time to look at this in detail.
[GitHub] spark pull request: add support for zipping a sequence of RDDs
GitHub user mohitjaggi opened a pull request: https://github.com/apache/spark/pull/2429 add support for zipping a sequence of RDDs a proposed fix for https://issues.apache.org/jira/browse/SPARK-3489 You can merge this pull request into a Git repository by running: $ git pull https://github.com/AyasdiOpenSource/spark master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2429.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2429 commit ff0ae81ccd1549db9aa58f5152871c34d23d7c22 Author: Mohit Jaggi mo...@ayasdi.com Date: 2014-09-17T13:55:07Z add support for zipping a sequence of RDDs
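The element-wise zip over an arbitrary number of inputs that this PR proposes can be sketched outside Spark (plain Java lists stand in for RDDs here; the method name `zipAll` is illustrative, not part of the patch or the Spark API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of zipping a sequence of collections element-wise, the idea behind
// zipping a Seq of RDDs: the i-th output row collects the i-th element of
// every input. All inputs must have the same length, mirroring RDD.zip's
// requirement of identical partitioning and element counts.
public class ZipSeq {
    static <T> List<List<T>> zipAll(List<List<T>> inputs) {
        if (inputs.isEmpty()) return new ArrayList<>();
        int n = inputs.get(0).size();
        for (List<T> in : inputs) {
            if (in.size() != n)
                throw new IllegalArgumentException("all inputs must have the same length");
        }
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<T> row = new ArrayList<>();
            for (List<T> in : inputs) row.add(in.get(i));
            out.add(row);
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<Integer>> zipped = zipAll(Arrays.asList(
                Arrays.asList(1, 2, 3),
                Arrays.asList(4, 5, 6),
                Arrays.asList(7, 8, 9)));
        System.out.println(zipped);  // [[1, 4, 7], [2, 5, 8], [3, 6, 9]]
    }
}
```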
[GitHub] spark pull request: add support for zipping a sequence of RDDs
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2429#issuecomment-55896797 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-3218, SPARK-3219, SPARK-3261, SPARK-342...
Github user derrickburns commented on the pull request: https://github.com/apache/spark/pull/2419#issuecomment-55898846 I agree that some of my comments should go in the code. As for the Big Bang change, I understand your concern. The distance function touches practically everything. The change in the treatment of the number of clusters is also a broad change. So, while I would prefer to make small incremental changes, these two changes required touching lots of code. Sent from my iPhone On Sep 17, 2014, at 12:33 AM, Sean Owen notificati...@github.com wrote: @derrickburns I think these notes can go in code comments? (They each generate their own email too.) This is also a big-bang change covering several issues, some of which seem like more focused bug fixes or improvements. I would think it would be easier to break this down further if possible, and get in clear easy changes first. — Reply to this email directly or view it on GitHub.
[GitHub] spark pull request: [MLLIB] [spark-2352] Implementation of an 1-hi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1290#issuecomment-55900054 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20466/consoleFull) for PR 1290 at commit [`f1af6cf`](https://github.com/apache/spark/commit/f1af6cfc9c50470c94ce178d8acf80228d496071). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-1455] [SPARK-3534] [Build] When possibl...
Github user nchammas commented on the pull request: https://github.com/apache/spark/pull/2420#issuecomment-55900507 cc @marmbrus @pwendell Dunno if you guys got notified via JIRA of this pull request or not.
[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...
Github user willb commented on the pull request: https://github.com/apache/spark/pull/2401#issuecomment-55902476 Here's a somewhat-related concern: it seems like JVM overhead is unlikely to scale linearly with requested heap size, so maybe a straight-up 15% isn't a great default? (If you have hard data on how heap requirements grow with job size, I'd be interested in seeing it.) Perhaps it would make more sense to reserve whichever is smaller of 15% or some fixed but reasonable amount.
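The "smaller of 15% or a fixed amount" idea from the comment above can be sketched as follows (a hypothetical illustration of the proposal, not code from the patch; both constants are invented for the example):

```java
// Sketch of capping the memory overhead: take 15% of the heap for small
// heaps, but stop growing the reservation past a fixed ceiling, reflecting
// the observation that JVM overhead doesn't scale linearly with heap size.
public class MemoryOverhead {
    static final double OVERHEAD_FRACTION = 0.15; // the PR's 15% figure
    static final int OVERHEAD_CAP_MB = 384;       // hypothetical fixed cap

    static int overheadMb(int heapMb) {
        return (int) Math.min(heapMb * OVERHEAD_FRACTION, OVERHEAD_CAP_MB);
    }

    public static void main(String[] args) {
        System.out.println(overheadMb(1024)); // 153: the 15% term wins for small heaps
        System.out.println(overheadMb(8192)); // 384: the fixed cap wins for large heaps
    }
}
```

Note that Spark later settled on the opposite combination, a *max* of a fraction and a floor, so that small heaps still get a minimum reservation; the min shown here is strictly what this comment proposes.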
[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...
Github user brndnmtthws commented on the pull request: https://github.com/apache/spark/pull/2401#issuecomment-55903580 That implies that as you grow the heap, you're not adding threads (or other things that use off-heap memory). I'm not familiar with Spark's execution model, but I would assume that as you scale up, the workers will need more threads to serve cache requests and what have you.
[GitHub] spark pull request: [SPARK-3535][Mesos] Add 15% task memory overhe...
Github user brndnmtthws commented on the pull request: https://github.com/apache/spark/pull/2401#issuecomment-55903894 Oh, and one more thing you may want to think about is the OS filesystem buffers. Again, as you scale up the heap, you may want to proportionally reserve a slice of off-heap memory just for the OS.
[GitHub] spark pull request: [SPARK-3304] [YARN] ApplicationMaster's Finish...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/2198#issuecomment-55904701 @tgravescs Thank you so much!
[GitHub] spark pull request: SPARK-3069 [DOCS] Build instructions in README...
Github user nchammas commented on a diff in the pull request: https://github.com/apache/spark/pull/2014#discussion_r17668529 --- Diff: CONTRIBUTING.md --- @@ -0,0 +1,12 @@ +## Contributing to Spark --- End diff -- Having this is pretty nice! I like the banner you get when opening a new PR.
[GitHub] spark pull request: [Docs] minor grammar fix
GitHub user nchammas opened a pull request: https://github.com/apache/spark/pull/2430 [Docs] minor grammar fix You can merge this pull request into a Git repository by running: $ git pull https://github.com/nchammas/spark patch-2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/2430.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2430 commit d476bfb21b7cbe066aa313128644866e99204aaf Author: Nicholas Chammas nicholas.cham...@gmail.com Date: 2014-09-17T14:55:03Z [Docs] minor grammar fix
[GitHub] spark pull request: [Docs] minor grammar fix
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2430#issuecomment-55906606 [QA tests have started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20467/consoleFull) for PR 2430 at commit [`d476bfb`](https://github.com/apache/spark/commit/d476bfb21b7cbe066aa313128644866e99204aaf). * This patch merges cleanly.
[GitHub] spark pull request: add spark.driver.memory to config docs
Github user nartz commented on the pull request: https://github.com/apache/spark/pull/2410#issuecomment-55909584 using `./bin/spark-submit --driver-memory 3g myscript.py` on the command line works for me
[GitHub] spark pull request: SPARK-3177 (on Master Branch)
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/2204#issuecomment-55910777 looks good. I tested it on hadoop 0.23 and it works there too.
[GitHub] spark pull request: SPARK-3177 (on Master Branch)
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/2204
[GitHub] spark pull request: [MLLIB] [spark-2352] Implementation of an 1-hi...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/1290#issuecomment-55911175 [QA tests have finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/20466/consoleFull) for PR 1290 at commit [`f1af6cf`](https://github.com/apache/spark/commit/f1af6cfc9c50470c94ce178d8acf80228d496071). * This patch **passes** unit tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-2418] Custom checkpointing with an exte...
Github user darabos commented on the pull request: https://github.com/apache/spark/pull/1345#issuecomment-55912866 @Forevian, can you please update it to merge cleanly? Then hunt down a reviewer! It would be great to have this in 1.2. It would make our code significantly more efficient. (Currently we save to S3 and load from S3 to checkpoint. With your change I think we could avoid the unnecessary loading.)