Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5580#issuecomment-100652864
@tgravescs Sorry for my late reply. I think
https://github.com/apache/spark/pull/6022/ addresses SPARK-7485.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well.
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5580#issuecomment-98372096
@tgravescs Yes, I think some other unit tests failed. Jenkins, retest this
please.
---
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/5580#discussion_r29369790
--- Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/Client.scala ---
@@ -341,6 +342,17 @@ private[spark] class Client(
env
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/5580#discussion_r29370287
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -328,6 +328,46 @@ object SparkSubmit
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5580#issuecomment-97547542
@Sephiroth-Lin @tgravescs I have updated it and now zip the pyspark archives
in Maven's pom.xml and sbt's build.scala.
---
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5580#issuecomment-97553254
@vanzin The code below is very important.
pyArchives = pyArchives.split(",").map { localPath =>
  val localURI = Utils.resolveURI(localPath)
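The fragment above splits the comma-separated archive list and resolves each entry to a URI. A minimal Python sketch of the same idea, assuming entries without a URI scheme default to local files (`resolve_uri` is a hypothetical stand-in for Spark's `Utils.resolveURI`, not its actual implementation):

```python
from urllib.parse import urlparse

def resolve_uri(local_path):
    # Hypothetical stand-in for Utils.resolveURI: entries with no
    # URI scheme are treated as local filesystem paths.
    if urlparse(local_path).scheme:
        return local_path
    return "file:" + local_path

def resolve_archives(py_archives):
    # Split the comma-separated list and resolve each entry, mirroring
    # pyArchives.split(",").map { ... } in the Scala fragment.
    return [resolve_uri(p) for p in py_archives.split(",") if p]

print(resolve_archives("/tmp/pyspark.zip,hdfs://nn/libs/py4j.zip"))
# → ['file:/tmp/pyspark.zip', 'hdfs://nn/libs/py4j.zip']
```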
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/5580#discussion_r29162587
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -328,6 +328,42 @@ object SparkSubmit
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/5580#discussion_r29169827
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -328,6 +328,42 @@ object SparkSubmit
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5580#issuecomment-96752326
@tgravescs Yes, I agree with your comments and have updated it. Can you
review it again? Thanks.
---
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5580#issuecomment-96369904
@andrewor14 For the second question, I added two things: one is zipping the
pyspark archives into pyspark/lib when we build the Spark jar; the other is in submit
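A minimal sketch of the "zip pyspark archives" build step described above, assuming a `pyspark` package directory as input (paths and the helper name are illustrative; this is not the actual pom.xml/build.scala logic):

```python
import os
import zipfile

def zip_py_package(src_dir, out_zip):
    # Walk the package directory and store each .py file under a path
    # relative to the package's parent, so the resulting zip is
    # importable when placed on PYTHONPATH.
    base = os.path.dirname(src_dir.rstrip(os.sep))
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                if name.endswith(".py"):
                    path = os.path.join(root, name)
                    zf.write(path, os.path.relpath(path, base))
```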
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5580#issuecomment-96369994
@tgravescs I think this PR is useful for you; you can try it.
---
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4474#issuecomment-95907922
@pwendell I think we cannot kill the JVM directly when this occurs. When it is
a Hive server, where one driver serves many jobs, killing the JVM would affect
the other jobs on this driver
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4474#issuecomment-95773327
@pwendell @andrewor14 I think I should reopen this PR, because I got this
error yesterday when I used collect() from many executors; the
task-result-getter thread
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/5580#discussion_r28743589
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -328,6 +328,14 @@ object SparkSubmit
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5580#issuecomment-94609007
@andrewor14 The first question is: why not just put it on --py-files? Because
in yarn-client mode we cannot use --py-files, so if we put it on --py-files for
yarn
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/5580
[SPARK-6869][PySpark] Add pyspark archives path to PYTHONPATH
Based on https://github.com/apache/spark/pull/5478, which provides a
PYSPARK_ARCHIVES_PATH env var. With this PR, we just need to export
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5478#issuecomment-94111781
@Sephiroth-Lin I think I will later submit my PR based on this one; please
help me review it then. Thanks.
---
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5478#issuecomment-93762922
Yes, I think in SparkSubmit we can automatically add PYSPARK_ARCHIVES_PATH
to the distributed files, and then Client and ExecutorRunnable can set
PYTHONPATH according
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5478#issuecomment-93873704
@sryza We can export
PYSPARK_ARCHIVES_PATH=local://xx/pyspark.zip;local://xx/py4j.zip in
spark-env.sh, and we can also export
PYSPARK_ARCHIVES_PATH=hdfs://xx
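A rough sketch of how such an env var could be consumed, assuming comma-separated entries and that `local://` marks archives already present on every node while other schemes must be distributed (illustrative only, not Spark's actual parsing code):

```python
import os

def classify_archives(env_value):
    # Partition archive URIs into those already on each node (local://)
    # and those that must be distributed first (e.g. hdfs:// paths).
    on_node, to_distribute = [], []
    for uri in filter(None, env_value.split(",")):
        (on_node if uri.startswith("local://") else to_distribute).append(uri)
    return on_node, to_distribute

env = os.environ.get(
    "PYSPARK_ARCHIVES_PATH",
    "local:///opt/spark/python/lib/pyspark.zip,hdfs://nn/libs/py4j.zip",
)
print(classify_archives(env))
```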
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5461#issuecomment-91741359
LGTM @andrewor14
---
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3438#discussion_r28191449
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/TieredDiskMerger.scala ---
@@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3438#discussion_r28191464
--- Diff:
core/src/main/scala/org/apache/spark/util/collection/TieredDiskMerger.scala ---
@@ -0,0 +1,232 @@
+/*
+ * Licensed to the Apache
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5209#issuecomment-86504920
LGTM
---
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/5168
[SPARK-5763][Core]add Sort-Merge Join to resolve skewed data
Add sort-merge join to resolve skewed data.
I provide three interfaces to implement the join operator using SortMergeJoinRDD
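The sort-merge approach the PR describes can be sketched in miniature: with both sides sorted by key, one forward pass joins them without building a hash table over either side, which is what makes it robust to skewed keys (a toy single-machine illustration, not the SortMergeJoinRDD interface itself):

```python
def merge_join(left, right):
    # left and right: lists of (key, value) pairs, each sorted by key.
    # Emits (key, left_value, right_value) for every matching pair,
    # advancing two cursors instead of hashing a whole side.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Collect the run of equal keys on the right, then pair it
            # with each equal key on the left (handles duplicate keys).
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            while i < len(left) and left[i][0] == lk:
                out.extend((lk, left[i][1], right[m][1]) for m in range(j, j_end))
                i += 1
            j = j_end
    return out

print(merge_join([(1, "a"), (2, "b"), (2, "c")], [(2, "x"), (3, "y")]))
# → [(2, 'b', 'x'), (2, 'c', 'x')]
```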
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3976#issuecomment-85542318
@tgravescs Thanks, I know what you mean. I will take a look at it.
---
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3976#issuecomment-85531984
@tgravescs
"does this mean that spark has to be installed on all the nodes? That
shouldn't be needed on YARN."
Yes, currently we need to put the SPARK_HOME dir on all
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4823#discussion_r27089597
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala ---
@@ -42,6 +42,18 @@ private[ui] class AllJobsPage(parent: JobsTab) extends
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/5159#issuecomment-85416065
I will close this because my branch is based on spark-1.2; later I will use
master to test and then address the comments.
---
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/5159
---
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4823#discussion_r27006782
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/AllJobsPage.scala ---
@@ -42,6 +42,18 @@ private[ui] class AllJobsPage(parent: JobsTab) extends
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/5159
[SPARK-5763][Core]Add Sort-Merge Join to resolve skewed data
add sort-merge join to resolve skewed data
@rxin @sryza
You can merge this pull request into a Git repository by running
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4846
[SPARK-6103][Graphx]remove unused class to import in EdgeRDDImpl
Class TaskContext is unused in EdgeRDDImpl, so we need to remove it from the
import list.
You can merge this pull request
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4823#issuecomment-76584418
@srowen I have updated per your comments. Can you take a look again? Thanks.
---
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4823
[SPARK-4411][UI]Add kill link for jobs in the UI
We should have a kill link for each job, similar to what we have for
each stage, so it's easier for users to kill jobs in the UI
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4813#issuecomment-76362065
LGTM
---
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4363#discussion_r25312597
--- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala ---
@@ -17,33 +17,84 @@
package org.apache.spark
-import
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3694#issuecomment-75903702
I do not think a global default ratio is right, because within a job the
size of each stage is different, and they are not monotonically increasing or
decreasing. If we define
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3173#issuecomment-74461781
@justinuang I think you would be interested in SPARK-5763.
---
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4554#issuecomment-74034822
@sryza Thanks. I think I can use SparkException to wrap the Exception. Could
you review this again?
---
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4554
[SPARK-5759][Yarn]ExecutorRunnable should catch YarnException while
NMClient start contain...
Sometimes, for various reasons, an exception is raised while NMClient
starts
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4392#issuecomment-74030105
I think for yarn-cluster mode we should also catch the
ApplicationNotFoundException. @pwendell @andrewor14
---
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/4474
---
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4168#discussion_r24406500
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -224,59 +240,90 @@ private[spark] class ExecutorAllocationManager
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4066#discussion_r24315166
--- Diff: core/src/main/scala/org/apache/spark/SparkHadoopWriter.scala ---
@@ -105,24 +106,61 @@ class SparkHadoopWriter(@transient jobConf: JobConf
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4474
[SPARK-5687][Core]TaskResultGetter need to catch OutOfMemoryError.
Because in enqueueSuccessfulTask another thread fetches the result, if the
result is large it may throw an OutOfMemoryError
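The failure mode described here (a result-fetch thread dying on an out-of-memory error and taking the process with it) can be illustrated with a hedged Python analogue, using MemoryError in place of the JVM's OutOfMemoryError; the names `enqueue_successful_task` and `huge_result` are illustrative, not Spark's API:

```python
def enqueue_successful_task(fetch_result, on_failure):
    # Mirrors the PR's idea: catch the memory error raised while
    # fetching a large task result and convert it into a task failure,
    # instead of letting it kill the result-getter thread.
    try:
        return fetch_result()
    except MemoryError as e:
        on_failure("Result too large to fetch: %s" % e)
        return None

failures = []
def huge_result():
    # Simulates deserializing a result that exceeds available memory.
    raise MemoryError("result exceeds available heap")

enqueue_successful_task(huge_result, failures.append)
print(failures)
# → ['Result too large to fetch: result exceeds available heap']
```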
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3525#issuecomment-73535044
Earlier I thought both configs could coexist. From @tgravescs's comments I
think overhead is more necessary than OverheadFraction, because sometimes it
has very
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3976#issuecomment-73541324
@tgravescs Your thought is right, but the only difference is in YARN's
internal Client; it works the same way in spark-submit with YARN client and
YARN cluster. So I think
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4363#issuecomment-73626671
@andrewor14 @rxin Yes, I agree with you. For the other modes we will later
need to implement killing executors, so this PR unifies failure detection
between the BlockManager
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4367#issuecomment-73625959
Yes, I will close this PR. Thanks all.
---
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/4367
---
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4168#discussion_r24313087
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -413,6 +418,7 @@ private[spark] class ExecutorAllocationManager
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4474#issuecomment-73627335
@andrewor14 @pwendell Yes, now we have
conf.get("spark.driver.maxResultSize", "1g") to control the driver's memory,
but once an OOM happens in TaskResultGetter
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3525#issuecomment-73520398
I think memoryOverheadFraction sounds good to me.
---
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4430
[SPARK-5653][YARN] In ApplicationMaster rename isDriver to isClusterMode
In ApplicationMaster, rename isDriver to isClusterMode, because Client uses
isClusterMode and ApplicationMaster should
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4141#discussion_r24241097
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala ---
@@ -124,10 +123,12 @@ private[yarn] class YarnAllocator
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4066#discussion_r24240721
--- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala ---
@@ -148,6 +148,20 @@ case object TaskKilled extends TaskFailedReason
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4409#issuecomment-73188040
LGTM
---
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4369
[SPARK-5593][Core]Replace BlockManagerListener with ExecutorListener in
ExecutorAllocationListener
More strictly, in ExecutorAllocationListener, we need to replace
onBlockManagerAdded
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4367
[SPARK-5529][Core]Replace blockManager's timeoutChecking with executor's
timeoutChecking
The phenomenon is: a BlockManagerSlave times out and BlockManagerMasterActor
will remove
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4258#issuecomment-72780420
Jenkins, retest this please.
---
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4309#issuecomment-72455127
LGTM
---
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3976#issuecomment-72578648
@andrewor14 Thank you. About SPARK_HOME, we need to consider compatibility in
two places. The first is communication between the Python context and the
Scala context
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3903#discussion_r23982799
--- Diff:
core/src/main/scala/org/apache/spark/storage/BlockManagerMasterActor.scala ---
@@ -52,11 +52,7 @@ class BlockManagerMasterActor(val isLocal
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3976#discussion_r23904474
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -134,12 +136,29 @@ object SparkSubmit
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3976#discussion_r23904929
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -430,6 +430,10 @@ private[spark] class ApplicationMaster(args
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3976#discussion_r23905022
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -134,12 +136,29 @@ object SparkSubmit
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3976#discussion_r23904612
--- Diff:
yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala ---
@@ -98,6 +121,9 @@ class YarnClusterSuite extends FunSuite
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3976#issuecomment-72401201
For a Python application, if the SPARK_HOME of the submission node is
different from the NodeManager's, it cannot work in my test. Example: the
submission node's version is 1.2
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3976#discussion_r23904494
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -267,10 +292,20 @@ object SparkSubmit {
// In yarn-cluster mode
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3976#discussion_r23904843
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -134,12 +136,29 @@ object SparkSubmit
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4292#discussion_r2348
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -820,7 +822,10 @@ class SparkContext(config: SparkConf) extends Logging
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4155#discussion_r23888903
--- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
---
@@ -808,6 +810,7 @@ class DAGScheduler(
// will be posted, which
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3976#issuecomment-72323941
@JoshRosen Can you help me? I have now added a unit test for a Python
application in yarn cluster mode, but it fails. I think the reason is the environment
Github user lianhuiwang closed the pull request at:
https://github.com/apache/spark/pull/3976
---
GitHub user lianhuiwang reopened a pull request:
https://github.com/apache/spark/pull/3976
[SPARK-5173]support python application running on yarn cluster mode
Currently, when we run a Python application in yarn cluster mode through
spark-submit, spark-submit does not support Python
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3976#discussion_r23888565
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -430,6 +430,10 @@ private[spark] class ApplicationMaster(args
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3976#discussion_r23884756
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -267,10 +277,22 @@ object SparkSubmit {
// In yarn-cluster mode
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3976#discussion_r23885383
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -165,6 +168,13 @@ object SparkSubmit
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4258#issuecomment-72298982
@sryza Thank you. I have updated per your comment. Can you review again?
---
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3976#discussion_r23885537
--- Diff:
core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala ---
@@ -172,7 +172,8 @@ private[spark] class SparkSubmitArguments(args
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3976#issuecomment-72302620
@sryza @andrewor14 Thanks for your reviews. I have updated per your
comments. Later I will add a YarnClusterSuite test.
---
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3233#discussion_r23826046
--- Diff: core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala ---
@@ -327,8 +326,14 @@ object SparkSubmit {
printStream.println(\n
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4168#discussion_r23682980
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -226,50 +249,32 @@ private[spark] class ExecutorAllocationManager
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4168#discussion_r23682749
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -438,6 +444,7 @@ private[spark] class ExecutorAllocationManager
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4168#discussion_r23682774
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -470,6 +477,7 @@ private[spark] class ExecutorAllocationManager
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4168#discussion_r23683544
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -226,50 +249,32 @@ private[spark] class ExecutorAllocationManager
GitHub user lianhuiwang opened a pull request:
https://github.com/apache/spark/pull/4258
[SPARK-5470][Core]use defaultClassLoader to load classes of
classesToRegister in KryoSeria...
Currently KryoSerializer loads the classes in classesToRegister at the time
of its initialization. When we
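The gist of the fix in this PR (resolve registered class names with the right class loader when a serializer instance is built, not when the config is first read) can be sketched in Python, with importlib standing in for the JVM class loader; `LazyRegistrar` and its methods are a hedged analogue, not Kryo's API:

```python
import importlib

class LazyRegistrar:
    # Store class *names* at construction time; resolve them only when
    # a serializer instance is actually built, so classes that are not
    # yet loadable at config time (e.g. from user jars) still register.
    def __init__(self, class_names):
        self.class_names = list(class_names)

    def new_serializer_classes(self):
        resolved = []
        for name in self.class_names:
            module, _, cls = name.rpartition(".")
            resolved.append(getattr(importlib.import_module(module), cls))
        return resolved

reg = LazyRegistrar(["collections.OrderedDict", "decimal.Decimal"])
print(reg.new_serializer_classes())
```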
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3962#issuecomment-71953371
@andrewor14 Thanks for your help.
---
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3962#discussion_r23740893
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -231,6 +231,25 @@ private[spark] class ApplicationMaster(args
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4207#issuecomment-71633632
First, we need to rename the PR title to [SPARK-5324][SQL] Implement
Describe Table for SQLContext. I find that the code in this PR is not the
latest code. I do not
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3962#issuecomment-71574630
@andrewor14 If we don't finish with SUCCEEDED on driver disassociation, the AM
should finish with a non-zero status. Example: if the driver's main class
throws an exception and exits
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3962#discussion_r23581653
--- Diff:
yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala ---
@@ -231,6 +231,25 @@ private[spark] class ApplicationMaster(args
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4055#issuecomment-71434453
@JoshRosen I think that's OK, because the code change is very small and has
no influence on the current logic.
---
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4051#issuecomment-71362007
@sryza @andrewor14 I find that setting minExecutors to initialExecutors is
best for the following situation: when DAGScheduler submits the missing tasks
of the first stage
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/3962#discussion_r23511995
--- Diff:
core/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala
---
@@ -94,12 +91,14 @@ private[spark] abstract class
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3962#issuecomment-71411565
In cluster mode, the AMActor does not need to subscribe to the disassociated
event, because sometimes the driver has errors and the AMActor does not
understand what happened
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3962#issuecomment-71422216
@andrewor14 I have looked at it in depth. YarnSchedulerActor works well in
both yarn cluster and yarn client mode, and I have tested both modes. Now we
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/3962#issuecomment-71404828
@andrewor14 Do you have any feedback on this PR? Thanks.
---
Github user lianhuiwang commented on the pull request:
https://github.com/apache/spark/pull/4051#issuecomment-71404736
@andrewor14 I think replacing maxExecutors with --num-executors is more
reasonable, because when dynamic allocation is not enabled, --num-executors
Github user lianhuiwang commented on a diff in the pull request:
https://github.com/apache/spark/pull/4168#discussion_r23502021
--- Diff:
core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala ---
@@ -199,14 +199,31 @@ private[spark] class ExecutorAllocationManager