date:20151001

[GitHub] spark pull request: [SPARK-5569] [STREAMING] fix ObjectInputStream...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8955#issuecomment-144644388
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] spark pull request: [SPARK-5569] [STREAMING] fix ObjectInputStream...

2015-10-01 Thread maxwellzdm

GitHub user maxwellzdm opened a pull request:

https://github.com/apache/spark/pull/8955

[SPARK-5569] [STREAMING] fix ObjectInputStreamWithLoader for supporting 
load array classes.

When using the Kafka DirectStream API to create a checkpoint and then
restoring the saved checkpoint on restart, a ClassNotFoundException occurs.

The reason for this error is that ObjectInputStreamWithLoader extends the
ObjectInputStream class and overrides its resolveClass method. But instead of
using Class.forName(desc, false, loader), Spark uses loader.loadClass(desc) to
load the class, which does not work with array classes.

For example:

Class.forName("[Lorg.apache.spark.streaming.kafka.OffsetRange;", false, loader)
works well, while
loader.loadClass("[Lorg.apache.spark.streaming.kafka.OffsetRange;") throws
a ClassNotFoundException.

Details of the difference between Class.forName and loader.loadClass can be
found here:
http://bugs.java.com/view_bug.do?bug_id=6446627
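
The difference can be reproduced outside Spark. The following is a minimal
sketch, not the patch itself: it uses a java.lang.String array and the system
class loader as stand-ins for OffsetRange and Spark's loader, and the names
ArrayClassLoading, viaForName and viaLoadClass are made up for illustration.

```java
public class ArrayClassLoading {
    // Resolve a name with Class.forName, which understands array
    // descriptors such as "[Ljava.lang.String;". Returns null if not found.
    public static Class<?> viaForName(String name, ClassLoader loader) {
        try {
            return Class.forName(name, false, loader);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    // Resolve a name with ClassLoader.loadClass, which expects the binary
    // name of a class and does not handle array descriptors.
    // Returns null if not found.
    public static Class<?> viaLoadClass(String name, ClassLoader loader) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        String arrayName = "[Ljava.lang.String;";
        System.out.println("forName:   " + viaForName(arrayName, loader));
        System.out.println("loadClass: " + viaLoadClass(arrayName, loader));
    }
}
```

On a current JDK the forName call should resolve the array class while the
loadClass call should report it as not found, which matches the failure
described above.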


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maxwellzdm/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8955.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8955


commit 929fec4445d8857e1d7833c9c848ad18226f60e9
Author: DEMING ZHU 
Date:   2015-10-01T07:24:38Z

fix ObjectInputStreamWithLoader for supporting load array classes.





[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8922#issuecomment-144645064
  
Merged build started.



[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8922#issuecomment-144645045
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8922#issuecomment-144646240
  
  [Test build #43147 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43147/consoleFull)
 for   PR 8922 at commit 
[`388de88`](https://github.com/apache/spark/commit/388de88069f335a3db55aae604918e52d26a4071).



[GitHub] spark pull request: [SPARK-7012][SQL] Add support for NOT NULL mod...

2015-10-01 Thread smola

Github user smola commented on the pull request:

https://github.com/apache/spark/pull/8746#issuecomment-144693706
  
@sabhyankar Great! The implementation looks good. Could you add a test case 
for it?



[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8956#issuecomment-144701972
  
  [Test build #43149 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43149/consoleFull)
 for   PR 8956 at commit 
[`f27288e`](https://github.com/apache/spark/commit/f27288e29211d47e24767cf7731914cdf9865bc1).



[GitHub] spark pull request: [SPARK-9570][Docs][YARN]Consistent recommendat...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8385#issuecomment-144707850
  
@nssalian I'd like to resolve this at last. This has outstanding comments 
and needs a rebase. Do you want to do that or should I take over?



[GitHub] spark pull request: [SPARK-10724] [SQL] SQL's floor() returns DOUB...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8893#issuecomment-144707576
  
It looks like @chenghao-intel is farther along towards a simpler fix in 
https://github.com/apache/spark/pull/8933/files  Do you mind closing this PR?



[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8956#issuecomment-144726375
  
  [Test build #43149 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43149/console)
 for   PR 8956 at commit 
[`f27288e`](https://github.com/apache/spark/commit/f27288e29211d47e24767cf7731914cdf9865bc1).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `  case class StringFilter(`




[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...

2015-10-01 Thread viirya

GitHub user viirya opened a pull request:

https://github.com/apache/spark/pull/8956

[SPARK-10895][SQL] Push down string filters to Parquet

JIRA: https://issues.apache.org/jira/browse/SPARK-10895


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viirya/spark-1 parquet-stringfilter-pushdown

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8956.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8956


commit f27288e29211d47e24767cf7731914cdf9865bc1
Author: Liang-Chi Hsieh 
Date:   2015-10-01T11:14:14Z

Push down string filters to Parquet.





[GitHub] spark pull request: SPARK-10883: use relative location of scalasty...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8949#issuecomment-144708587
  
Per comments in the JIRA, I think this is unnecessary as you can use the 
standard Maven syntax to build modules correctly. Do you mind closing this PR?



[GitHub] spark pull request: [SPARK-10886] [Documentation] Random RDD creat...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8951#issuecomment-144708911
  
Do you mind closing this PR?



[GitHub] spark pull request: [SPARK-9570][Docs][YARN]Consistent recommendat...

2015-10-01 Thread nssalian

Github user nssalian commented on the pull request:

https://github.com/apache/spark/pull/8385#issuecomment-144723093
  
@srowen please go ahead.
 Thank you.



[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8956#issuecomment-144726522
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43149/
Test PASSed.



[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8956#issuecomment-144726518
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...

2015-10-01 Thread viirya

Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/8922#issuecomment-144697959
  
ping @liancheng @marmbrus 



[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8956#issuecomment-144700733
  
Merged build started.



[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8758#issuecomment-144706624
  
Yeah, the current script does blindly 'pass' the first arg as an
environment variable. It never passed any args at all to the arg-parsing code,
which seems like an oversight. Instead it sent a dummy (?) argument 1 for some
reason -- is this historical?

So I generally agree with plumbing through the arguments to the argument 
parsing code. By the way, don't we need to remove that "1" argument then?

From there it seemed straightforward to attempt to retain backwards 
compatibility for the script, such that passing just a dir as the single arg 
still works (and generates a warning). This formulation would in fact stop at 
the first such argument.

I would not be against just removing support for this naked argument, as it 
has been long since deprecated, if anyone felt strongly about it.

Aside from the "1" issue, this LGTM.



[GitHub] spark pull request: [SPARK-10895][SQL] Push down string filters to...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8956#issuecomment-144700717
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10515] When killing executor, the pendi...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8668#issuecomment-144707186
  
@KaiXinXiaoLei do you mind closing this PR?



[GitHub] spark pull request: [SPARK-10772][Streaming][Scala]: NullPointerEx...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8881#issuecomment-144707086
  
  [Test build #1833 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1833/consoleFull)
 for   PR 8881 at commit 
[`cba60ed`](https://github.com/apache/spark/commit/cba60ed77e1c4812617667f5d1d3e73e588e9f96).



[GitHub] spark pull request: [SPARK-10582] using dynamic-executor-allocatio...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8737#issuecomment-144707252
  
@KaiXinXiaoLei are you working on this, or else do you mind closing this 
PR? I'm also not clear if it's the same thing as 
https://github.com/apache/spark/pull/8945



[GitHub] spark pull request: [SPARK-10772][Streaming][Scala]: NullPointerEx...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8881#issuecomment-144707377
  
  [Test build #1833 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1833/console)
 for   PR 8881 at commit 
[`cba60ed`](https://github.com/apache/spark/commit/cba60ed77e1c4812617667f5d1d3e73e588e9f96).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8954#issuecomment-144732292
  
  [Test build #43150 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43150/consoleFull)
 for   PR 8954 at commit 
[`a72d3ec`](https://github.com/apache/spark/commit/a72d3ec526898f41b88ad907c468c843962bb965).



[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8758#issuecomment-144738730
  
Aha you're right about "1". It can stay:
```
usage="Usage: spark-daemon.sh [--config ] 
(start|stop|submit|status)   "
```
Yes, it doesn't pass args now, but isn't that a problem? Clearly
`HistoryServer` has code to parse args and I don't see how those are plumbed
through.



[GitHub] spark pull request: [SPARK-10058][Core][Tests]Fix the flaky tests ...

2015-10-01 Thread asfgit

Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8946



[GitHub] spark pull request: [SPARK-10889] [Streaming] Bump KCL to add Mill...

2015-10-01 Thread akatz

GitHub user akatz opened a pull request:

https://github.com/apache/spark/pull/8957

[SPARK-10889] [Streaming] Bump KCL to add MillisBehindLatest metric

I don't believe the API changed at all.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/akatz/spark kcl-upgrade

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8957.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8957


commit 6abc571a1a1d0a06565a0ec9fefa7bcb1ce69cfe
Author: Avrohom Katz 
Date:   2015-09-30T22:26:36Z

Bump KCL to add MillisBehindLatest metric





[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8954#issuecomment-144734249
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8954#issuecomment-144734253
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43150/
Test FAILed.



[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...

2015-10-01 Thread zsxwing

GitHub user zsxwing opened a pull request:

https://github.com/apache/spark/pull/8958

[SPARK-10900][Streaming]Add output operation events to StreamingListener

Add output operation events to StreamingListener so as to implement the 
following UI features:

1. Progress bar of a batch in the batch list.
2. Be able to display output operation `description` and `duration` when 
there is no spark job in a Streaming job.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zsxwing/spark output-operation-events

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8958.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8958


commit 1ffae9302bc699bba750693ba3d08327e0b62f57
Author: zsxwing 
Date:   2015-10-01T14:42:58Z

Add output operation events to StreamingListener





[GitHub] spark pull request: [SPARK-10889] [Streaming] Bump KCL to add Mill...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8957#issuecomment-144731855
  
Can one of the admins verify this patch?



[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...

2015-10-01 Thread vanzin

Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8758#issuecomment-144732338
  
You guys are misinterpreting the script.

That `1` is not an argument to the HistoryServer, it's an argument to 
`spark-daemon.sh`. The script never passes any arguments to the HistoryServer 
itself. This change is *breaking* the command line parsing in 
`HistoryServerArguments`.



[GitHub] spark pull request: [SPARK-10300] [build] [tests] Add support for ...

2015-10-01 Thread vanzin

Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8437#issuecomment-144733008
  
@JoshRosen #8775



[GitHub] spark pull request: [SPARK-10058][Core][Tests]Fix the flaky tests ...

2015-10-01 Thread vanzin

Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8946#issuecomment-144738827
  
Merged to master and branch-1.5, thanks!



[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...

2015-10-01 Thread vanzin

Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8758#issuecomment-144740121
  
The change in `start-history-server.sh` adds the command line parameters, 
and those are propagated to the java process. That part of the change is fine.

The broken part is the one I commented on. It's breaking command line
parsing, because if you provide an invalid argument, it treats it as the log
directory and stops parsing the rest of the command line. That behavior has
never existed and is actually broken.



[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8758#issuecomment-144741977
  
Good, yes passing through the args is right. Right now if you run 
`start-history-server.sh foo` you will successfully set the log directory to 
foo because of what the script does. That's what I'm trying to preserve. Or am 
I also missing something there? 

Right now there is no other arg parsing to break, right? Nothing is passed
or parsed otherwise.

I tend to agree it's a little janky, but hey, it generates a warning.
Compatibility is good if it's cheap, and it seems easy here. But then again,
it's been deprecated forever. So I don't feel strongly about keeping it if
there are strong feelings for retiring this.



[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8958#issuecomment-144754459
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8958#issuecomment-144754489
  
Merged build started.



[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...

2015-10-01 Thread zsxwing

Github user zsxwing commented on the pull request:

https://github.com/apache/spark/pull/8958#issuecomment-14475
  
/cc @tdas



[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8958#issuecomment-144756156
  
  [Test build #43151 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43151/consoleFull)
 for   PR 8958 at commit 
[`ba8f9b8`](https://github.com/apache/spark/commit/ba8f9b8a9aa53a88a87592cc833180ee4aaf6ee8).



[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8954#issuecomment-144730502
  
Merged build started.



[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8954#issuecomment-144730473
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10058][Core][Tests]Fix the flaky tests ...

2015-10-01 Thread vanzin

Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8946#issuecomment-144732736
  
LGTM.



[GitHub] spark pull request: [WIP][SPARK-10891][STREAMING][KINESIS] Add Mes...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8954#issuecomment-144734228
  
  [Test build #43150 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43150/console)
 for   PR 8954 at commit 
[`a72d3ec`](https://github.com/apache/spark/commit/a72d3ec526898f41b88ad907c468c843962bb965).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `class KinesisBackedBlockRDD[T](`




[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...

2015-10-01 Thread vanzin

Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8758#issuecomment-144742060
  
Hmm, I think I get what you're trying to do. You're trying to implement the 
script's command line handling in `HistoryServerArguments` (the old `if [ $# != 
0 ]; then` code you're removing). I don't think you should do that.

Instead, the script itself should handle this backwards compatibility, much 
like it did before. Instead of setting an env variable, it can add command line 
arguments to the history server.

But the change in `HistoryServerArguments` is broken the way it is now.



[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...

2015-10-01 Thread vanzin

Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8758#issuecomment-144749562
  
Hi, me again. Looking at the code once more, I think it's ok if you want to 
make this change in the scala code and not the shell script (so, e.g., you can 
unit test it), but it cannot be done in the current spot. Basically, you need 
something like this:

if (args.length == 1) {
  // Print deprecation warning, set log dir.
} else {
  parse(args.toList)
}

That way the existing command-line parsing is not broken. With your change, 
something like "--properties-file foo blah --dir /path" will successfully be 
parsed, and the log directory will be set to "blah", which is not what should 
happen, since that's either an invalid command line, or the directory should be 
set to "/path". (I vote for invalid command line.)
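
A minimal, self-contained sketch of the shape suggested above. It is not the
actual `HistoryServerArguments` code: `ParsedArgs`, `ArgsSketch`, `parseArgs`,
and the `usedDeprecatedForm` flag are hypothetical names, and the extra check
that the lone argument does not start with `-` is an added assumption. The idea
is only that a single bare argument takes the deprecated path, while anything
else goes through strict parsing that rejects unknown tokens.

```scala
// Hypothetical stand-in for the parsed result; not Spark's real class.
final case class ParsedArgs(
    logDir: Option[String] = None,
    propertiesFile: Option[String] = None,
    usedDeprecatedForm: Boolean = false)

object ArgsSketch {
  // Strict parser: every token must belong to a known option,
  // otherwise parsing stops with an error instead of guessing.
  @annotation.tailrec
  def parse(
      args: List[String],
      acc: ParsedArgs = ParsedArgs()): Either[String, ParsedArgs] =
    args match {
      case Nil => Right(acc)
      case ("--dir" | "-d") :: value :: tail =>
        parse(tail, acc.copy(logDir = Some(value)))
      case "--properties-file" :: value :: tail =>
        parse(tail, acc.copy(propertiesFile = Some(value)))
      case other :: _ => Left(s"Unrecognized argument: $other")
    }

  // Entry point: only a lone, non-option argument is treated as the
  // deprecated "log directory" form (a real implementation would also
  // print a deprecation warning here).
  def parseArgs(args: Array[String]): Either[String, ParsedArgs] =
    if (args.length == 1 && !args(0).startsWith("-")) {
      Right(ParsedArgs(logDir = Some(args(0)), usedDeprecatedForm = true))
    } else {
      parse(args.toList)
    }
}
```

With this shape, `ArgsSketch.parseArgs(Array("--properties-file", "foo", "blah", "--dir", "/path"))`
yields a `Left` naming `blah` as unrecognized, rather than silently using it as
the log directory.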



[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8958#issuecomment-144771166
  
  [Test build #43151 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43151/console)
 for   PR 8958 at commit 
[`ba8f9b8`](https://github.com/apache/spark/commit/ba8f9b8a9aa53a88a87592cc833180ee4aaf6ee8).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class StreamingListenerOutputOperationStarted(`
  * `case class StreamingListenerOutputOperationCompleted(`




[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread shivaram

Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144788137
  
Yeah that would be great. 



[GitHub] spark pull request: [SPARK-10886] [Documentation] Random RDD creat...

2015-10-01 Thread jayantshekhar

Github user jayantshekhar closed the pull request at:

https://github.com/apache/spark/pull/8951



[GitHub] spark pull request: [SPARK-10886] [Documentation] Random RDD creat...

2015-10-01 Thread jayantshekhar

Github user jayantshekhar commented on the pull request:

https://github.com/apache/spark/pull/8951#issuecomment-144787366
  
Thanks Sean! Closing it.



[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8941#issuecomment-144790511
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread shivaram

Github user shivaram commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144782174
  
Thanks @NarineK - I left some inline comments regarding the roxygen docs.
Regarding the sqlContext reuse, I think we should do that in a separate JIRA.
Could you file one for that?



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread NarineK

Github user NarineK commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144787827
  
Hi Shivaram,
Should I change the example for createDataFrame to use iris too?
Thanks



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144790493
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8941#issuecomment-144790574
  
Merged build started.



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144790538
  
Merged build started.



[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...

2015-10-01 Thread viirya

Github user viirya commented on the pull request:

https://github.com/apache/spark/pull/8922#issuecomment-144776282
  
I will post performance comparison later.



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread shivaram

Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/8952#discussion_r40933250
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -149,6 +149,26 @@ createDataFrame <- function(sqlContext, data, schema = NULL, samplingRatio = 1.0
   dataFrame(sdf)
 }
 
+#' Create a DataFrame from an RDD
+#'
+#' Converts an RDD to a DataFrame by infer the types.
+#'
+#' @param sqlContext A SQLContext
+#' @param data An RDD or list or data.frame
+#' @param schema a list of column names or named list (StructType), optional
+#' @return an DataFrame
+#' @export
+#' @examples
+#'\dontrun{
+#' sc <- sparkR.init()
+#' sqlContext <- sparkRSQL.init(sc)
+#' rdd <- lapply(parallelize(sc, 1:10), function(x) list(a=x, b=as.character(x)))
+#' df <- as.DataFrame(sqlContext, rdd)
+#' }
+as.DataFrame <- function(sqlContext, data, schema = NULL, samplingRatio = 1.0){
--- End diff --

Yeah, I think we can do that, but let's do this change in a separate JIRA / PR.



[GitHub] spark pull request: [SPARK-10841][SQL] Add pushdown support of UDF...

2015-10-01 Thread marmbrus

Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/8922#issuecomment-144772135
  
I'm a little skeptical that this is worth the complexity. Do you have real
workloads that this speeds up significantly?



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread shivaram

Github user shivaram commented on a diff in the pull request:

https://github.com/apache/spark/pull/8952#discussion_r40933374
  
--- Diff: R/pkg/R/SQLContext.R ---
@@ -149,6 +149,26 @@ createDataFrame <- function(sqlContext, data, schema = NULL, samplingRatio = 1.0
   dataFrame(sdf)
 }
 
+#' Create a DataFrame from an RDD
+#'
+#' Converts an RDD to a DataFrame by infer the types.
--- End diff --

Could you remove the reference to RDD in the comments here (and in
createDataFrame)? You could just make it `Converts R data.frame or list into
DataFrame`.

In the same spirit, could you use `iris` or something like that in the
example (instead of the lapply)?



[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8958#issuecomment-144771352
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43151/



[GitHub] spark pull request: [SPARK-10900][Streaming]Add output operation e...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8958#issuecomment-144771351
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144792949
  
  [Test build #43152 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43152/consoleFull)
 for   PR 8952 at commit 
[`336b3a9`](https://github.com/apache/spark/commit/336b3a9d94dec34db4e5eea3a21a7d2ac1d0ce1a).



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread NarineK

Github user NarineK commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144794849
  
Here is the jira for making sqlContext global.
https://issues.apache.org/jira/browse/SPARK-10903



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144796830
  
Merged build finished. Test PASSed.



[GitHub] spark pull request: [SPARK-10261][Documentation, ML] Fixed @Since ...

2015-10-01 Thread tijoparacka

Github user tijoparacka commented on the pull request:

https://github.com/apache/spark/pull/8554#issuecomment-144803543
  
yu-iskw, could you please review this?



[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8941#issuecomment-144803339
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread felixcheung

Github user felixcheung commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144816142
  
looks good.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-01 Thread vanzin

Github user vanzin commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-144818093
  
Hi @tgravescs ,

This fixes the problem but I think it's actually just masking a subtle bug 
elsewhere. In `getUserClasspath`, there's this code:

val mainUri = mainJar.orElse(Some(APP_JAR)).map(new URI(_))

That is not actually doing the right thing in certain cases. When invoked
from a `Client` instance, the `mainJar` argument comes from
`ClientArguments.userJar`, so it's never going to be `None` (and thus the
code always returns the name of the original jar instead of `APP_JAR`).

The "cleanest" thing would be to have just a single version of 
`getUserClasspath` that gets things from `SparkConf`, but that runs into the 
problem that the conf has not yet been updated when `populateClasspath` is 
called.

I think changing that `map` call to something like the following would fix 
the source of the problem:

.map { path =>
  val uri = new URI(path)
  if (uri.getScheme == LOCAL_SCHEME) new URI(uri.getPath()) else uri
}

(Not tested.) Does that make sense? Could you try that out?
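
For illustration only, the suggested `map` body can be pulled out into a
standalone function and exercised on its own. This is a sketch under
assumptions, not the YARN client code: `ClasspathSketch`, `toClasspathUri`, and
`LocalScheme` are hypothetical names, and the value `"local"` for the scheme is
assumed here rather than taken from the comment above.

```scala
import java.net.URI

object ClasspathSketch {
  // Assumption: "local" marks a path that already exists on every node.
  val LocalScheme = "local"

  // Mirrors the suggested map body: strip the scheme from "local:" URIs
  // so that only the file path remains; leave every other URI untouched.
  def toClasspathUri(path: String): URI = {
    val uri = new URI(path)
    // getScheme returns null for scheme-less paths; == is null-safe in Scala.
    if (uri.getScheme == LocalScheme) new URI(uri.getPath) else uri
  }
}
```

For example, `ClasspathSketch.toClasspathUri("local:/opt/app.jar")` gives a URI
whose string form is `/opt/app.jar`, while an `hdfs://` URI passes through
unchanged. Note that `getPath` returns the decoded path, so a path containing
characters that need escaping (such as spaces) would need extra handling.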



[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8941#issuecomment-144792979
  
  [Test build #43153 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43153/consoleFull)
 for   PR 8941 at commit 
[`2b2c643`](https://github.com/apache/spark/commit/2b2c643436584203357405e8921fe65be1af9286).



[GitHub] spark pull request: [SPARK-10889] [Streaming] Bump KCL to add Mill...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8957#issuecomment-144793932
  
  [Test build #1834 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/1834/consoleFull)
 for   PR 8957 at commit 
[`6abc571`](https://github.com/apache/spark/commit/6abc571a1a1d0a06565a0ec9fefa7bcb1ce69cfe).



[GitHub] spark pull request: [SPARK-10709] [SQL] When loading a json datase...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8899#issuecomment-144794227
  
@navis can you follow up on this PR or close it?



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144796688
  
  [Test build #43152 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43152/console)
 for   PR 8952 at commit 
[`336b3a9`](https://github.com/apache/spark/commit/336b3a9d94dec34db4e5eea3a21a7d2ac1d0ce1a).
 * This patch **passes all tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10317] [Core] Compatibility between his...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8758#issuecomment-144796916
  
I like that better since it focuses narrowly on supporting one arg. 
@rekhajoshm what do you think? The rest looks good.



[GitHub] spark pull request: [SPARK-10888] [SparkR] Added as.DataFrame as a...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8952#issuecomment-144796833
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43152/
Test PASSed.



[GitHub] spark pull request: [SPARK-10810] [WIP] [SQL] Improve session mana...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8909#issuecomment-144798668
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-144810614
  
Merged build started.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-144810576
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10810] [SQL] Improve session management...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8909#issuecomment-144812918
  
Merged build started.



[GitHub] spark pull request: [SPARK-10810] [SQL] Improve session management...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8909#issuecomment-144812883
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-01 Thread tgravescs

Github user tgravescs commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-144821230
  

Where are you suggesting putting this `.map`, just so it's clear? In 
`getUserPath`, on `mainUri`? I'm not seeing how your `map` call fixes anything, 
so I'm guessing I'm missing the context.



[GitHub] spark pull request: [SPARK-10889] [Streaming] Bump KCL to add Mill...

2015-10-01 Thread srowen

Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/8957#issuecomment-144793663
  
Seems OK to me, but the usual question is simply: does it introduce any 
potential problems too? Changed dependencies, incompatible behavior, etc.? You 
might skim the release notes and commits, if you can, as a sanity check.



[GitHub] spark pull request: [SPARK-10810] [WIP] [SQL] Improve session mana...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8909#issuecomment-144797066
  
Merged build started.



[GitHub] spark pull request: [SPARK-10810] [WIP] [SQL] Improve session mana...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8909#issuecomment-144797039
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10810] [SQL] Improve session management...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8909#issuecomment-144799890
  
 Merged build triggered.



[GitHub] spark pull request: [SPARK-10810] [SQL] Improve session management...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8909#issuecomment-144799919
  
Merged build started.



[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8941#issuecomment-144803367
  
Merged build started.



[GitHub] spark pull request: [SPARK-10261][Documentation, ML] Fixed @Since ...

2015-10-01 Thread tijoparacka

Github user tijoparacka commented on the pull request:

https://github.com/apache/spark/pull/8554#issuecomment-144803700
  
@yu-iskw could you please review this?



[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8941#issuecomment-144805650
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43153/
Test FAILed.



[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8941#issuecomment-144805491
  
  [Test build #43153 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43153/console)
 for   PR 8941 at commit 
[`2b2c643`](https://github.com/apache/spark/commit/2b2c643436584203357405e8921fe65be1af9286).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.



[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8941#issuecomment-144805646
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10901] [YARN] spark.yarn.user.classpath...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8959#issuecomment-144812185
  
  [Test build #43157 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43157/consoleFull)
 for   PR 8959 at commit 
[`08f9382`](https://github.com/apache/spark/commit/08f93822de809946e565793adab7c30c8c8c3430).



[GitHub] spark pull request: [SPARK-10264][Documentation, ML] Added Since a...

2015-10-01 Thread tijoparacka

Github user tijoparacka commented on the pull request:

https://github.com/apache/spark/pull/8532#issuecomment-144802854
  
Could any of you review this and merge it? I may lose track of it if it is 
delayed further.



[GitHub] spark pull request: [SPARK-10671] [SQL] Throws an analysis excepti...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8941#issuecomment-144804910
  
  [Test build #43156 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43156/consoleFull)
 for   PR 8941 at commit 
[`666dd63`](https://github.com/apache/spark/commit/666dd630dd8882b16b864f2dcf3b994b70894ef8).



[GitHub] spark pull request: [SPARK-9570][Docs][YARN]Consistent recommendat...

2015-10-01 Thread tgravescs

Github user tgravescs commented on a diff in the pull request:

https://github.com/apache/spark/pull/8385#discussion_r40947175
  
--- Diff: docs/running-on-yarn.md ---
@@ -16,37 +16,51 @@ containers used by the application use the same 
configuration. If the configurat
 Java system properties or environment variables not managed by YARN, they 
should also be set in the
 Spark application's configuration (driver, executors, and the AM when 
running in client mode).
 
-There are two deploy modes that can be used to launch Spark applications 
on YARN. In `yarn-cluster` mode, the Spark driver runs inside an application 
master process which is managed by YARN on the cluster, and the client can go 
away after initiating the application. In `yarn-client` mode, the driver runs 
in the client process, and the application master is only used for requesting 
resources from YARN.
+There are two deploy modes that can be used to launch Spark applications 
on YARN. In `cluster` mode, the Spark driver runs inside an application master 
process which is managed by YARN on the cluster, and the client can go away 
after initiating the application. In `client` mode, the driver runs in the 
client process, and the application master is only used for requesting 
resources from YARN.
 
-Unlike in Spark standalone and Mesos mode, in which the master's address 
is specified in the `--master` parameter, in YARN mode the ResourceManager's 
address is picked up from the Hadoop configuration. Thus, the `--master` 
parameter is `yarn-client` or `yarn-cluster`. 
-To launch a Spark application in `yarn-cluster` mode:
+Unlike in Spark standalone and Mesos mode, in which the master's address 
is specified in the `--master` parameter, in YARN mode the ResourceManager's 
address is picked up from the Hadoop configuration. Thus, the `--master` 
parameter is `yarn` and `--deploy-mode` can be `client` or `cluster` to select 
the YARN deployment mode.
+To launch a Spark application in YARN in `cluster` mode:
 
-   `$ ./bin/spark-submit --class path.to.your.Class --master yarn-cluster 
[options]  [app options]`
-
+   `$ ./bin/spark-submit --class path.to.your.Class --master yarn 
--deploy-mode cluster [options]  [app options]`
+   
 For example:
 
 $ ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
---master yarn-cluster \
+--master yarn \
+--deploy-mode cluster
 --num-executors 3 \
 --driver-memory 4g \
 --executor-memory 2g \
 --executor-cores 1 \
 --queue thequeue \
 lib/spark-examples*.jar \
-10
 
-The above starts a YARN client program which starts the default 
Application Master. Then SparkPi will be run as a child thread of Application 
Master. The client will periodically poll the Application Master for status 
updates and display them in the console. The client will exit once your 
application has finished running.  Refer to the "Debugging your Application" 
section below for how to see driver and executor logs.
+The above example starts a YARN client program which starts the default 
Application Master. Then SparkPi will be run as a child thread of Application 
Master. The client will periodically poll the Application Master for status 
updates and display them in the console. The client will exit once your 
application has finished running.  Refer to the "Debugging your Application" 
section below for how to see driver and executor logs.
+
+To launch a Spark application in `client` mode, do the same, but replace 
`cluster` with `client` in the `--deploy-mode` argument.  
+To run spark-shell:
 
-To launch a Spark application in `yarn-client` mode, do the same, but 
replace `yarn-cluster` with `yarn-client`.  To run spark-shell:
+$ ./bin/spark-shell --master yarn --deploy-mode client 
 
-$ ./bin/spark-shell --master yarn-client
+For example:
 
+$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi \
+--master yarn-cluster \
--- End diff --

still using yarn-cluster instead of deploy-mode
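
To make the intended equivalence explicit, here is a small sketch of how the legacy combined master strings correspond to the `--master` / `--deploy-mode` pair used in the revised docs. The helper is hypothetical, written only for illustration, and is not part of Spark or spark-submit:

```scala
object YarnMasterExample {
  // Hypothetical helper: splits a legacy combined master string into the
  // (--master, --deploy-mode) pair that the revised documentation recommends.
  def splitLegacyMaster(master: String): (String, Option[String]) = master match {
    case "yarn-cluster" => ("yarn", Some("cluster"))
    case "yarn-client"  => ("yarn", Some("client"))
    case other          => (other, None) // anything else is left unchanged
  }
}
```

So the flagged `--master yarn-cluster` line would presumably become `--master yarn` plus `--deploy-mode cluster`, matching the earlier example in the same diff.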



[GitHub] spark pull request: [SPARK-10836] [SparkR] Added sort(x, decreasin...

2015-10-01 Thread olarayej

Github user olarayej commented on the pull request:

https://github.com/apache/spark/pull/8920#issuecomment-144809819
  
@shivaram @felixcheung @sun-rui Thanks for your feedback!

I totally see your point with the naming (sort vs. arrange), but @NarineK's 
implementation has two advantages:

1) It supports string column names in both asc and desc order. In SparkR's 
current implementation of arrange(), I couldn't do that:

arrange(df, desc("Species")) # fails

2) The Boolean parameter 'decreasing' is useful. Right now, if you were to sort 
by 100 columns, all of them in descending order, you would need to write desc() 
100 times, once per column: desc(data$col1), ..., desc(data$col100), whereas in 
@NarineK's implementation it suffices to specify decreasing=T.

I'm aware that plyr also takes functions asc/desc, probably because R was 
not designed with big data in mind. We've seen customer use cases with hundreds 
of thousands of columns.

Bottom line: I think these are two valid additions to SparkR, and since 
the code is ready and tested, it won't hurt. Let the user decide which function 
to use.



[GitHub] spark pull request: [SPARK-10810] [SQL] Improve session management...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8909#issuecomment-144816852
  
Merged build finished. Test FAILed.



[GitHub] spark pull request: [SPARK-10810] [WIP] [SQL] Improve session mana...

2015-10-01 Thread SparkQA

Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/8909#issuecomment-144798664
  
  [Test build #43154 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43154/console)
 for   PR 8909 at commit 
[`2e365ad`](https://github.com/apache/spark/commit/2e365ada232e42d88692c89634e5ed2ceb741beb).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class HyperLogLogPlusPlus(child: Expression, relativeSD: Double = 
0.05)`




[GitHub] spark pull request: [SPARK-10810] [WIP] [SQL] Improve session mana...

2015-10-01 Thread AmplabJenkins

Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8909#issuecomment-144798673
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/43154/
Test FAILed.


