Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1390#issuecomment-48832990
@yhuai can you take a look?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1390#discussion_r14856885
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala ---
@@ -157,21 +161,60 @@ class HadoopTableReader(@transient _tableDesc:
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1238#discussion_r14857013
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala ---
@@ -21,16 +21,27 @@ import java.util.Properties
import
Github user concretevitamin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1238#discussion_r14857017
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala ---
@@ -21,16 +21,27 @@ import java.util.Properties
import
Github user XuTingjun closed the pull request at:
https://github.com/apache/spark/pull/1284
---
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1238#discussion_r14857026
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala ---
@@ -21,16 +21,27 @@ import java.util.Properties
import
Github user rxin commented on a diff in the pull request:
https://github.com/apache/spark/pull/1238#discussion_r14857044
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -47,6 +47,13 @@ private[sql] abstract class SparkStrategies
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1259#issuecomment-48834061
I brought this up to date. @andrewor14 can you take a look at this? I'd
want to merge this quickly so I can submit my other scheduler patches too.
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1259#issuecomment-48834107
QA tests have started for PR 1259. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16600/consoleFull
---
Github user aarondav commented on the pull request:
https://github.com/apache/spark/pull/1385#issuecomment-48834426
Is this related to the other conf-related concurrency issue that was fixed
recently? https://github.com/apache/spark/pull/1273
---
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835312
We have gone over this in the past... it is suboptimal to make it a linear
function of executor/driver memory.
Overhead is a function of the number of executors,
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835447
That makes sense, but then it doesn't explain why a constant amount works
for a given job when executor memory is low, and then doesn't work when it is
high. This has
Github user nishkamravi2 commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835560
Sean, the memory_overhead is fairly substantial. More than 2GB for a 30GB
executor. Less than 400MB for a 2GB executor.
---
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835566
The default constant is actually a lower bound to account for other
overheads (since YARN will aggressively kill tasks)... Unfortunately we
have not sized this
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1259#issuecomment-48835596
QA results for PR 1259:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
class
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835618
That would be a function of your jobs.
Other apps would have drastically different characteristics... which is
why we can't generalize to a simple fraction of
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835656
The basic issue is that you are trying to model overhead using the wrong
variable... It has no correlation with executor memory actually (other than
VM overheads as heap
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835727
Yes of course, lots of settings' best or even usable values are ultimately
app-specific. Ideally, defaults work for lots of cases. A flat value is the
simplest of models,
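The two overhead models being debated in this thread can be sketched as follows. This is a minimal illustration only; the names and the 0.07 fraction are assumptions for the sake of the example, not Spark's actual settings at the time:

```scala
// Hedged sketch of the two models debated above. `defaultOverheadMb` matches
// the "lower bound" constant mentioned in the thread; the 0.07 fraction is
// purely illustrative.
object OverheadModels {
  val defaultOverheadMb = 384

  // Flat model: the same overhead regardless of executor size.
  def flatOverhead(executorMemMb: Int): Int = defaultOverheadMb

  // Linear model: a fraction of executor memory, floored at the constant.
  def linearOverhead(executorMemMb: Int, fraction: Double = 0.07): Int =
    math.max(defaultOverheadMb, (executorMemMb * fraction).toInt)
}
```

Under this sketch a 2 GB executor gets 384 MB either way, while a 30 GB executor gets ~2.1 GB from the linear model, which is roughly the proportionality nishkamravi2 reports; mridulm's counterpoint is that neither variable truly drives the overhead.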
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835769
You are lucky :-) for some of our jobs, in a 8gb container, overhead is
1.8gb !
On 13-Jul-2014 2:41 pm, nishkamravi2 notificati...@github.com wrote:
Github user nishkamravi2 commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835852
Experimented with three different workloads and noticed common patterns of
proportionality.
Other parameters were left unchanged and only executor size was
Github user nishkamravi2 commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835881
That's why the parameter is configurable. If you have jobs that cause
20-25% memory_overhead, default values will not help.
---
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48836123
You are missing my point I think... To give an unscientific anecdotal
example: our GBDT experiments, which run on about 22 nodes, need no tuning.
While our
Github user nishkamravi2 commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48836220
Mridul, I think you are missing the point. We understand that this
parameter will in a lot of cases have to be specified by the developer, since
there is no easy
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48836408
On Jul 13, 2014 3:16 PM, nishkamravi2 notificati...@github.com wrote:
Mridul, I think you are missing the point. We understand that this
parameter will in a
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48836619
Hmm, looks like some of my responses to Sean via mail reply have not shown
up here... Maybe mail gateway delays?
---
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48836879
Since this is a recurring nightmare for our users, let me try to list down
the factors which influence overhead, given the current Spark codebase state,
in the JIRA when
Github user ScrapCodes commented on the pull request:
https://github.com/apache/spark/pull/1354#issuecomment-48837932
@rxin done !
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1354#issuecomment-48837923
QA tests have started for PR 1354. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16601/consoleFull
---
GitHub user YanTangZhai opened a pull request:
https://github.com/apache/spark/pull/1392
[SPARK-2290] Worker should directly use its own sparkHome instead of
appDesc.sparkHome when LaunchExecutor
Worker should directly use its own sparkHome instead of appDesc.sparkHome
when
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1392#issuecomment-48839494
Can one of the admins verify this patch?
---
Github user YanTangZhai commented on the pull request:
https://github.com/apache/spark/pull/1392#issuecomment-48839557
#1244
---
Github user YanTangZhai commented on the pull request:
https://github.com/apache/spark/pull/1392#issuecomment-48839668
fix #1244
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1354#issuecomment-48839833
QA results for PR 1354:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user YanTangZhai commented on the pull request:
https://github.com/apache/spark/pull/1244#issuecomment-48839912
I've fixed the compile problem. Please review and test again. Thanks very
much.
---
Github user YanTangZhai commented on the pull request:
https://github.com/apache/spark/pull/1281#issuecomment-48840373
Hi @ash211, I think this change is needed. Since the method
Utils.getLocalDir is used by some function such as HttpBroadcast, which is
different from
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1313#issuecomment-48840541
QA tests have started for PR 1313. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16602/consoleFull
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1387#issuecomment-48841019
Currently, `SparkContext.cleaner` does not take executor memory usage into
account. This can cause Spark to fail when memory runs short.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1387#issuecomment-48841151
@srowen
[Executor.scala#L253](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L253)
handles exceptions. But the
GitHub user srowen opened a pull request:
https://github.com/apache/spark/pull/1393
SPARK-2465. Use long as user / item ID for ALS
I'd like to float this for consideration: use longs instead of ints for
user and product IDs in the ALS implementation.
The main reason for this is
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1393#issuecomment-48842883
QA tests have started for PR 1393. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16603/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1313#issuecomment-48843020
QA results for PR 1313:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1393#issuecomment-48843123
How much does memory usage increase overall? Is there a detailed comparison?
---
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/1393#issuecomment-48843270
I think the most significant change is the Rating object. It goes from 8
(ref) + 8 (object) + 4 (int) + 4 (int) + 8 (double) = 32 bytes to 8 (ref) + 8
(object) + 4
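The arithmetic in the comment above can be written out as a back-of-the-envelope sketch, assuming a 64-bit JVM with an 8-byte object header and 8-byte references, and ignoring alignment padding:

```scala
// Rough per-Rating sizing from the discussion above: 64-bit JVM assumed,
// alignment padding ignored.
val header = 8; val ref = 8
val intField = 4; val longField = 8; val doubleField = 8

// case class Rating(user: Int, product: Int, rating: Double)
val ratingWithInts = ref + header + 2 * intField + doubleField

// case class Rating(user: Long, product: Long, rating: Double)
val ratingWithLongs = ref + header + 2 * longField + doubleField
```

So switching the two ID fields from Int to Long adds 8 bytes per Rating (32 to 40) before any padding, about a 25% increase for this object alone.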
Github user mateiz commented on the pull request:
https://github.com/apache/spark/pull/1191#issuecomment-48845073
A few comments on this:
- We probably can't break the existing combineByKey through a config
setting. If people want to use this directly, they'll need to use another
Github user yhuai commented on the pull request:
https://github.com/apache/spark/pull/1390#issuecomment-48845188
I am reviewing it. Will comment on it later today.
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1393#issuecomment-48845420
QA results for PR 1393:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
case class
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1393#issuecomment-48846073
QA tests have started for PR 1393. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16604/consoleFull
---
Github user srowen closed the pull request at:
https://github.com/apache/spark/pull/906
---
Github user srowen commented on the pull request:
https://github.com/apache/spark/pull/906#issuecomment-48846227
Obsoleted by SBT build changes.
---
Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/1313#issuecomment-48846392
@mridulm I updated the patch, now, the order is
PROCESS_LOCAL-NODE_LOCAL-noPref / Speculative-RACK_LOCAL-NON_LOCAL
---
GitHub user srowen opened a pull request:
https://github.com/apache/spark/pull/1394
SPARK-2363. Clean MLlib's sample data files
(Just made a PR for this, @mengxr was the reporter of:)
MLlib has sample data under several folders:
1) data/mllib
2) data/
3)
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1394#issuecomment-48846547
QA tests have started for PR 1394. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16605/consoleFull
---
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1259#discussion_r14859597
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -490,19 +488,19 @@ private[spark] class TaskSetManager(
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1259#discussion_r14859634
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -429,9 +425,11 @@ private[spark] class TaskSetManager(
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1259#discussion_r14859639
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -521,14 +519,13 @@ private[spark] class TaskSetManager(
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1259#discussion_r14859645
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -429,9 +425,11 @@ private[spark] class TaskSetManager(
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1259#discussion_r14859653
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -570,19 +561,17 @@ private[spark] class TaskSetManager(
Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/1259#discussion_r14859650
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala ---
@@ -536,23 +533,17 @@ private[spark] class TaskSetManager(
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1360#issuecomment-48848235
QA tests have started for PR 1360. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16606/consoleFull
---
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/1259#issuecomment-48848292
Hi @rxin, I took a pass over the patch and the changes mostly look good. On
a higher level point, I notice that we log this pattern `0.0:4.0 (TID 4 ...)`
quite often,
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1393#issuecomment-48848503
QA results for PR 1393:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
case class
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1313#issuecomment-48849034
Hi @CodingCat looks good to me.
My only doubt, which we discussed last, was whether we want to
differentiate between tasks which have no locations at all vs tasks
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1394#issuecomment-48849070
QA results for PR 1394:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1360#issuecomment-48850708
QA results for PR 1360:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1393#issuecomment-48850704
QA tests have started for PR 1393. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16607/consoleFull
---
Github user falaki commented on the pull request:
https://github.com/apache/spark/pull/1351#issuecomment-48850882
This is not a bad idea, especially considering that a file can be split
across partitions. @marmbrus you suggested this feature. What do you think
about Reynold's
GitHub user staple opened a pull request:
https://github.com/apache/spark/pull/1395
[SPARK-546] Add full outer join to RDD and DStream.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/staple/spark SPARK-546
Alternatively you
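The semantics the PR title describes can be sketched on plain Scala pair sequences, with no Spark dependency. This mirrors the (K, (Option[V], Option[W])) result shape an RDD full outer join would produce; the function name and types here are illustrative, not the PR's actual code:

```scala
// Minimal full-outer-join semantics on pair sequences: every key from either
// side appears in the output; missing sides become None, and keys present on
// both sides yield the cross product of their values.
def fullOuterJoin[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (Option[V], Option[W]))] = {
  val l: Map[K, Seq[V]] = left.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
  val r: Map[K, Seq[W]] = right.groupBy(_._1).map { case (k, kws) => k -> kws.map(_._2) }
  (l.keySet ++ r.keySet).toSeq.flatMap { k =>
    val vs: Seq[Option[V]] = l.get(k).map(_.map(Option(_))).getOrElse(Seq(None))
    val ws: Seq[Option[W]] = r.get(k).map(_.map(Option(_))).getOrElse(Seq(None))
    for (v <- vs; w <- ws) yield (k, (v, w))
  }
}
```

On RDDs the natural building block for this is `cogroup`, which already groups both sides by key in one shuffle.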
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/1395#issuecomment-48851025
Can one of the admins verify this patch?
---
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1351#issuecomment-48851059
Note that there are multiple problems. We can solve the out-of-memory
problem by simply limiting the length of a record. Ideally, csvRDD(RDD[String])
should just be one
GitHub user marmbrus opened a pull request:
https://github.com/apache/spark/pull/1396
[SQL] Whitelist more Hive tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/marmbrus/spark moreTests
Alternatively you can review and
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1396#issuecomment-48851429
QA tests have started for PR 1396. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16608/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1393#issuecomment-48852700
QA results for PR 1393:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds the following public classes (experimental):
case class
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1396#issuecomment-48853489
QA results for PR 1396:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/1392#issuecomment-48853888
Jenkins, test this please
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1392#issuecomment-48854002
QA tests have started for PR 1392. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16609/consoleFull
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1392#issuecomment-48855862
QA results for PR 1392:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user miccagiann commented on the pull request:
https://github.com/apache/spark/pull/1311#issuecomment-48855958
Hello guys,
I have provided Java examples for the following documentation files:
mllib-clustering.md
mllib-collaborative-filtering.md
Github user marmbrus commented on the pull request:
https://github.com/apache/spark/pull/1377#issuecomment-48857036
test this please
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1377#issuecomment-48857093
QA tests have started for PR 1377. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16610/consoleFull
---
Github user jerryshao commented on the pull request:
https://github.com/apache/spark/pull/1210#issuecomment-48859519
Hi Matei, thanks a lot for your review, I will change the code according to
your comments.
---
Github user chenghao-intel commented on the pull request:
https://github.com/apache/spark/pull/1390#issuecomment-48859675
The code looks good to me. However, I think we can avoid the workaround
solution (deserializing with the partition SerDe and then serializing with the
table SerDe)
Github user aarondav commented on the pull request:
https://github.com/apache/spark/pull/1259#issuecomment-48859674
If we actually want people to get information out of all those numbers, can
we consider using a human readable format such as `Task(stageId = 1, taskId =
5, attempt =
Github user chenghao-intel commented on the pull request:
https://github.com/apache/spark/pull/1390#issuecomment-48859842
And since the Hive SerDe actually provides lazy parsing, during the
conversion of the raw object to a `Row` we need to support column
pruning
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1387#issuecomment-48859861
QA tests have started for PR 1387. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16611/consoleFull
---
Github user lirui-intel commented on the pull request:
https://github.com/apache/spark/pull/1313#issuecomment-48859854
This looks good to me :)
Just a reminder that when TaskSchedulerImpl calls
TaskSetManager.resourceOffer, the maxLocality (changed to preferredLocality in
this
Github user yhuai commented on the pull request:
https://github.com/apache/spark/pull/1390#issuecomment-48860018
@chenghao-intel I am not sure I understand your comment on column pruning.
I think for a Hive table, we should use `ColumnProjectionUtils` to set needed
columns. So,
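The `ColumnProjectionUtils` approach mentioned above could look roughly like the following. This is a hedged sketch assuming the Hive 0.12-era API; the column IDs and variable names are purely illustrative:

```scala
// Hedged config sketch: register the columns a scan actually needs so a lazy
// SerDe can skip deserializing the rest. Assumes Hive's
// ColumnProjectionUtils.appendReadColumnIDs(Configuration, List[Integer])
// from that era; IDs 0 and 2 are illustrative.
import java.util.{ArrayList => JArrayList}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hive.serde2.ColumnProjectionUtils

val hiveConf = new Configuration()
val neededColumnIDs = new JArrayList[Integer]()
neededColumnIDs.add(0)  // e.g. only the 1st and 3rd table columns
neededColumnIDs.add(2)
ColumnProjectionUtils.appendReadColumnIDs(hiveConf, neededColumnIDs)
```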
Github user yhuai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1390#discussion_r14862289
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala ---
@@ -157,21 +161,60 @@ class HadoopTableReader(@transient _tableDesc:
Github user yhuai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1390#discussion_r14862300
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala ---
@@ -157,21 +161,60 @@ class HadoopTableReader(@transient _tableDesc:
Github user yhuai commented on a diff in the pull request:
https://github.com/apache/spark/pull/1390#discussion_r14862338
--- Diff:
sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala ---
@@ -157,21 +161,60 @@ class HadoopTableReader(@transient _tableDesc:
Github user mengxr commented on the pull request:
https://github.com/apache/spark/pull/1394#issuecomment-48860407
@srowen This looks good to me and thank you for updating the docs as well!
---
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/1394
---
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1377#issuecomment-48860618
QA results for PR 1377:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1397#issuecomment-48861087
QA tests have started for PR 1397. This patch merges cleanly. View progress:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16612/consoleFull
---
Github user scwf commented on the pull request:
https://github.com/apache/spark/pull/1385#issuecomment-48861711
@rxin and @aarondav, yeah, the master branch deadlocks; it seems the locks of
#1273 and HADOOP-10456 lead to the problem. When running a HiveQL self-join ---
hql(SELECT t1.a,
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1311#discussion_r14862906
--- Diff: docs/mllib-clustering.md ---
@@ -69,7 +69,54 @@ println("Within Set Sum of Squared Errors = " + WSSSE)
All of MLlib's methods use Java-friendly
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1311#discussion_r14862907
--- Diff: docs/mllib-clustering.md ---
@@ -69,7 +69,54 @@ println("Within Set Sum of Squared Errors = " + WSSSE)
All of MLlib's methods use Java-friendly
Github user mengxr commented on a diff in the pull request:
https://github.com/apache/spark/pull/1311#discussion_r14862911
--- Diff: docs/mllib-collaborative-filtering.md ---
@@ -99,7 +99,88 @@ val model = ALS.trainImplicit(ratings, rank,
numIterations, alpha)
All of MLlib's