Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3677#issuecomment-66618650
cc /@mengxr
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3222#issuecomment-66876208
@avulanov
I'm sorry, my English is poor. I am very willing to join you. The main
focus of this PR is RBM and DBN.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-66876255
@ankurdave
#3677 should help improve this PR's performance.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-66876521
@avulanov I will submit a new PR about `AdaDelta` and `AdaGrad` next week.
It should be usable in this PR.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3222#issuecomment-67108573
@avulanov The label is a one-hot vector. See the conversion code in
`MinstDatasetSuite`. The latest code has fixed this bug.
---
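For readers following along, here is a minimal sketch of the one-hot label convention referred to above (hypothetical helper; not the actual conversion code in `MinstDatasetSuite`):

```scala
// Minimal one-hot encoding sketch (illustrative only; not the
// actual MinstDatasetSuite conversion code).
object OneHot {
  /** Convert a 0-based class label into a one-hot vector of length numClasses. */
  def encode(label: Int, numClasses: Int): Array[Double] = {
    require(label >= 0 && label < numClasses, s"label $label out of range")
    val v = Array.fill(numClasses)(0.0)
    v(label) = 1.0
    v
  }
}
```

For MNIST-style data with 10 classes, the digit 3 becomes a length-10 vector with a single 1.0 at index 3.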
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3222#issuecomment-67109137
@avulanov
If the `MLP` contains multiple layers, L2 regularization is usually not
needed (see: [Dropout: A simple way to prevent neural networks from
overfitting](http
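The core idea of the dropout paper cited above can be sketched in a few lines (inverted-dropout variant; an assumed form for illustration, not code from this PR):

```scala
import scala.util.Random

// Sketch of inverted dropout, the regularizer referenced above:
// each activation is zeroed with probability p at training time, and
// the survivors are scaled by 1/(1-p) so no rescaling is needed at
// test time. Illustrative only; not the code from this PR.
object Dropout {
  def apply(activations: Array[Double], p: Double, rng: Random): Array[Double] = {
    require(p >= 0.0 && p < 1.0, "drop probability must be in [0, 1)")
    activations.map(a => if (rng.nextDouble() < p) 0.0 else a / (1.0 - p))
  }
}
```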
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3222#issuecomment-67268417
This parameter is used for performance optimization and basically does not
affect the experimental results. It can be set to any number between 1 and
100.
---
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/3729
[WIP][SPARK-4844][MLLIB]SGD should support custom sampling.
JIRA: [SPARK-4844](https://issues.apache.org/jira/browse/SPARK-4844)
You can merge this pull request into a Git repository by running
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3222#issuecomment-67438369
@avulanov `AdaDelta` and `AdaGrad` have been placed in a separate
file. They can be used in #1290.
---
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/3729
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3729#issuecomment-67603907
The PR is temporarily closed.
---
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/3744
[SPARK-4902][CORE] gap-sampling performance optimization
jira: [SPARK-4902](https://issues.apache.org/jira/browse/SPARK-4902)
cc @mengxr
You can merge this pull request into a Git repository
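Gap sampling avoids drawing one random number per element: under Bernoulli(f) sampling, the distance to the next accepted element is geometrically distributed, so the sampler can jump ahead. A rough sketch of the idea (an assumed form for illustration; not the actual patch in this PR):

```scala
import scala.util.Random

// Sketch of gap sampling: instead of flipping a coin per element, draw
// the geometrically distributed gap to the next accepted element and
// skip ahead. Illustrative only; not the actual code from this PR.
object GapSampling {
  def sample(items: IndexedSeq[Int], f: Double, rng: Random): Seq[Int] = {
    require(f > 0.0 && f < 1.0, "fraction must be in (0, 1)")
    val lnQ = math.log1p(-f) // log(1 - f), the per-element skip probability
    // 1 - nextDouble() is in (0, 1], so log never hits -Infinity.
    def gap(): Int = (math.log(1.0 - rng.nextDouble()) / lnQ).toInt
    val out = scala.collection.mutable.ArrayBuffer.empty[Int]
    var i = gap() // index of the first accepted element
    while (i < items.length) {
      out += items(i)
      i += 1 + gap() // jump over the next geometric gap
    }
    out.toSeq
  }
}
```

Each element still ends up selected with probability f, but the sampler only consumes one random number per *accepted* element, which is where the speedup comes from when f is small.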
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3051#issuecomment-67723783
retest this please
---
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/1518#discussion_r22171070
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/optimization/Regularizer.scala ---
@@ -0,0 +1,140 @@
+/*
+ * Licensed to the Apache Software
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/3788
[SPARK-4952][Core]Handle ConcurrentModificationExceptions in
SparkEnv.environmentDetails
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/witgo
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3707#issuecomment-68324363
@brennonyork Sorry, I tried many times but could not reproduce the issue.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3707#issuecomment-68417443
The issue seems to be caused by the command-line process being killed.
But I tested on both Mac OS X and CentOS, and could not reproduce the
issue.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/2388#issuecomment-72621141
@mengxr
I created a JIRA:
[SPARK-5556](https://issues.apache.org/jira/browse/SPARK-5556).
---
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/3051
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3222#issuecomment-70984655
witgo#qq.com
---
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/1387
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/4047#issuecomment-70030602
There are two questions:
1. How to cover long-tail topic features.
2. The algorithm's performance at 10K-100K topics.
Here is a relevant
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/4263#issuecomment-72040666
I tested on Mac OS X and CentOS 6.4; there is no problem.
---
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/4263
[SPARK-5474][Build]curl should support URL redirection in build/mvn
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/witgo/spark SPARK-5474
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/2388#issuecomment-72069846
Here is a sample faster branch(work in progress):
https://github.com/witgo/spark/tree/lda_MH
---
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/3069
---
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/3867
---
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/3989
[Minor]Resolve sbt warnings during build (MQTTStreamSuite.scala).
cc @andrewor14
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/witgo/spark
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3051#issuecomment-69137057
The test failure seems to be unrelated.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3051#issuecomment-69125932
retest this please
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-82685857
MLlib's existing infrastructure (BSP) is not suitable for large-scale
distributed neural nets.
There is a JIRA about the parameter server:
[SPARK-4590](https
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/4908#issuecomment-77556717
@srowen
You can merge this PR. For the other changes I will create a new JIRA and PR.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/4807#issuecomment-78486179
@jkbradley, @mengxr Do you have time to take a look at this?
---
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/4807#discussion_r26189615
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDA.scala
---
@@ -311,165 +319,319 @@ private[clustering] object LDA {
private
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/4807#issuecomment-78089533
@EntilZha @mengxr Can this branch be merged into master?
I want to merge the PR into
[LightLDA](https://github.com/witgo/spark/tree/LightLDA) and
[lda_Gibbs](https
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/4807#discussion_r26188380
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDA.scala
---
@@ -311,165 +319,319 @@ private[clustering] object LDA {
private
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/4807#discussion_r26188379
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDA.scala
---
@@ -311,165 +319,319 @@ private[clustering] object LDA {
private
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/4807#discussion_r26187041
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDA.scala
---
@@ -311,165 +319,319 @@ private[clustering] object LDA {
private
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/4807#discussion_r26186996
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/LDA.scala
---
@@ -311,165 +319,319 @@ private[clustering] object LDA {
private
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/4807#issuecomment-78189159
@EntilZha thx.
@mengxr what do you think?
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/5204#issuecomment-86417578
ok, I have closed this PR
---
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/5204
---
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/5204
[Hot Fix][SQL] a compilation error in InsertIntoHiveTableSuite.scala
cc @marmbrus
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/witgo/spark
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/3222
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3222#issuecomment-84746525
Well, I have to close it
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-84750038
The latest code is in the branch [Sent2vec
(WIP)](https://github.com/witgo/spark/tree/Sent2vec).
This branch includes two new classes
[Sentence2vec](https
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/4908
Resolve sbt warnings: postfix operator second should be enabled
Resolve sbt warnings:
```
[warn]
spark/streaming/src/main/scala/org/apache/spark/streaming/util
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/4908#issuecomment-77376561
Yes.
This patch only resolves the postfix build warnings.
I can try to resolve more build warnings; we should create a JIRA.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/855#issuecomment-76114183
@andrewor14 @tdas The code has been updated.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3744#issuecomment-76126729
Two months ago I talked to @mengxr in an email; I will post the
performance test results.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/3744#issuecomment-76143785
```scala
test("bernoulli sampling benchmark") {
  class BernoulliSamplerBenchmark(val fraction: Double, items: () =>
      Iterator[Int]) extends scala.testing.Benchmark
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-82767499
This seems to be worth a try. @avulanov what do you think?
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-82750982
Parameter server + matrix calculations is the more common approach.
The performance of matrix calculations is better.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/4263#issuecomment-72165108
@pwendell It seems that only this URL (the Maven download) might return a 3xx.
---
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/5591#discussion_r28685417
--- Diff:
mllib/src/test/scala/org/apache/spark/mllib/regression/FactorizationMachineSuite.scala
---
@@ -0,0 +1,245 @@
+package
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/5591#discussion_r28685371
--- Diff:
mllib/src/main/scala/org/apache/spark/mllib/regression/FactorizationMachine.scala
---
@@ -0,0 +1,539 @@
+package
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1482#issuecomment-95117217
Jenkins, retest this please.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/5528#issuecomment-97298740
`curl -d "id=3&terminate=true" http://host:4040/stages/stage/kill/` does
not work. Is there a better way?
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/5528#issuecomment-97126406
The Hadoop version of my test cluster is `2.3.0-cdh5.0.1`. I'm not sure;
tomorrow I'll test what you said.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/5716#issuecomment-96550921
cc @vanzin
---
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/5716
[SPARK-7162][YARN]Launcher error in yarn-client
jira: https://issues.apache.org/jira/browse/SPARK-7162
You can merge this pull request into a Git repository by running:
$ git pull https
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/5837#issuecomment-98146620
The kill link works in yarn-client. LGTM.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/5528#issuecomment-97619733
No, I get blanks, but
`curl -d "id=3&terminate=true"
http://host:9082/proxy/application_1429108701044_0377/stages/stage/kill/` gets
a 405.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/5528#issuecomment-96947031
@srowen This PR seems to have a bug in yarn-client:
```
HTTP ERROR 405
Problem accessing /proxy/application_1429108701044_0316/stages/stage/kill
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/1482#discussion_r29440041
--- Diff: core/src/main/scala/org/apache/spark/executor/Executor.scala ---
@@ -280,13 +280,18 @@ private[spark] class Executor(
m
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/5548
[SPARK-6963][CORE]Flaky test: o.a.s.ContextCleanerSuite automatically
cleanup checkpoint
cc @andrewor14
You can merge this pull request into a Git repository by running:
$ git pull https
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/5548#discussion_r28581367
--- Diff: core/src/test/scala/org/apache/spark/ContextCleanerSuite.scala ---
@@ -245,7 +245,7 @@ class ContextCleanerSuite extends
ContextCleanerSuiteBase
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/5548#discussion_r28584757
--- Diff: core/src/test/scala/org/apache/spark/ContextCleanerSuite.scala ---
@@ -427,12 +429,17 @@ class CleanerTester(
def broadcastCleaned
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/5548#discussion_r28583156
--- Diff: core/src/test/scala/org/apache/spark/ContextCleanerSuite.scala ---
@@ -245,7 +245,7 @@ class ContextCleanerSuite extends
ContextCleanerSuiteBase
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/5456
---
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/5305#discussion_r27656092
--- Diff:
yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
---
@@ -127,23 +127,11 @@ private[spark] class
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1290#issuecomment-104111324
@ankurdave
The `DBN` related code:
https://github.com/witgo/spark/tree/ann-interface-gemm-dbn
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/6717#issuecomment-110612334
LGTM
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/6393#issuecomment-105154271
Is this code unused?
https://github.com/apache/spark/tree/master/sql/hive
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/7621#issuecomment-126402467
LGTM
---
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/7621#discussion_r35607344
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala
---
@@ -0,0 +1,130 @@
+/*
+ * Licensed
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/7621#discussion_r35390543
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/ann/Layer.scala ---
@@ -0,0 +1,856 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/7621#discussion_r35408460
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/ann/Layer.scala ---
@@ -0,0 +1,856 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/7621#discussion_r35409451
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/ann/Layer.scala ---
@@ -0,0 +1,856 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/7621#discussion_r35410425
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/ann/Layer.scala ---
@@ -0,0 +1,856 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/7621#discussion_r35433813
--- Diff:
mllib/src/main/scala/org/apache/spark/ml/classification/MultilayerPerceptronClassifier.scala
---
@@ -0,0 +1,130 @@
+/*
+ * Licensed
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/7621#discussion_r35436475
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/ann/Layer.scala ---
@@ -0,0 +1,856 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/7621#discussion_r35435404
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/ann/Layer.scala ---
@@ -0,0 +1,857 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/7279#issuecomment-122477288
This is very good performance for ML. I will test this feature on
[cloudml/zen](https://github.com/cloudml/zen).
---
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/1482
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/7279#issuecomment-119401465
This is a very cool PR.
---
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/8520
[SPARK-10350] [Minor] [Doc] Fix SQL Programming Guide
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/witgo/spark SPARK-10350
Alternatively you
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/8467#discussion_r38283165
--- Diff: docs/sql-programming-guide.md ---
@@ -1371,6 +1380,26 @@ Configuration of Parquet can be done using the
`setConf` method on `SQLContext`
/p
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/1482#issuecomment-136971893
I think it is necessary to merge the PR into master.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/11041#issuecomment-184034549
@srowen 0.8.0 is the latest.
---
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/11041#issuecomment-184175637
@srowen I've run some simple Spark SQL cases, and it doesn't seem to have
any issues.
---
Github user witgo closed the pull request at:
https://github.com/apache/spark/pull/11986
---
GitHub user witgo reopened a pull request:
https://github.com/apache/spark/pull/11986
[CORE][SPARK-14178]DAGScheduler should get map output statuses directly.
## What changes were proposed in this pull request?
Add a new method to `MapOutputTracker` class
```scala
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/11986#issuecomment-203695553
When there are lots of map and reduce tasks, MapStatus deserialization may
take a few seconds.
---
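To see why that cost matters, here is a toy sketch of the trade-off being discussed: a driver-local reader can be handed the status array directly, while a remote reader must pay a serialize/deserialize round trip (hypothetical names; not the actual `MapOutputTracker` code):

```scala
import java.io._

// Toy sketch of the trade-off discussed above: the local path returns the
// status array directly, while the remote path pays a full Java
// serialize/deserialize round trip. Hypothetical names; not the actual
// MapOutputTracker implementation.
object StatusTrackerSketch {
  final case class Status(mapId: Int, sizeBytes: Long) extends Serializable

  class Tracker {
    private val statuses = scala.collection.mutable.ArrayBuffer.empty[Status]
    def register(s: Status): Unit = synchronized { statuses += s }

    /** Remote path: serialize the whole array for shipping over the wire. */
    def serialized: Array[Byte] = synchronized {
      val bos = new ByteArrayOutputStream()
      val oos = new ObjectOutputStream(bos)
      oos.writeObject(statuses.toArray)
      oos.close()
      bos.toByteArray
    }

    /** Local path: return a snapshot of the array with no serialization. */
    def direct: Array[Status] = synchronized { statuses.toArray }
  }

  def deserialize(bytes: Array[Byte]): Array[Status] =
    new ObjectInputStream(new ByteArrayInputStream(bytes))
      .readObject().asInstanceOf[Array[Status]]
}
```

With many thousands of statuses, the `direct` path skips both the byte copy and the object-graph reconstruction, which is the saving the comment above is after.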
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/12113#discussion_r58711271
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -492,16 +624,51 @@ private[spark] object MapOutputTracker extends
Logging
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/12113#discussion_r58641966
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -296,10 +290,89 @@ private[spark] class MapOutputTrackerMaster(conf:
SparkConf
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/12113#discussion_r58637905
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -428,40 +503,93 @@ private[spark] class MapOutputTrackerMaster(conf:
SparkConf
Github user witgo commented on a diff in the pull request:
https://github.com/apache/spark/pull/12113#discussion_r58643596
--- Diff: core/src/main/scala/org/apache/spark/MapOutputTracker.scala ---
@@ -492,16 +624,51 @@ private[spark] object MapOutputTracker extends
Logging
GitHub user witgo opened a pull request:
https://github.com/apache/spark/pull/11986
DAGScheduler should get map output statuses directly.
## What changes were proposed in this pull request?
Add a new method to `MapOutputTracker` class
```scala
def
Github user witgo commented on the pull request:
https://github.com/apache/spark/pull/11986#issuecomment-202179133
In actual use of MapStatus, elements are only added to or removed from the
array stored in MapStatus; the elements themselves are not modified.
Therefore, this array can be used outside