[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-24 Thread Astralidea
Github user Astralidea commented on the issue:

https://github.com/apache/spark/pull/15588
  
@jerryshao  I agree that the waiting wastes CPU time, and I have tested the 
solution @lw-lin mentioned; it does not work in my environment. 
OK, if there is no better solution or advice, I will close this PR next 
week.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-24 Thread jerryshao
Github user jerryshao commented on the issue:

https://github.com/apache/spark/pull/15588
  
I think this fix cannot really handle the imbalanced receiver allocation 
problem, and it also blindly wastes CPU time.

What @lw-lin mentioned is a feasible solution: wait for executors to be 
registered. `ReceiverSchedulingPolicy` should probably also handle this problem 
well, but a strictly even distribution is hard to guarantee and very costly, 
especially when a cluster has intensive resource contention.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-23 Thread Astralidea
Github user Astralidea commented on the issue:

https://github.com/apache/spark/pull/15588
  
@lw-lin 
spark.scheduler.minRegisteredResourcesRatio does not work.
The reason may be that I use Mesos, which runs the executors without going 
through the driver,
but I still need to make sure sufficient resources have registered.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-22 Thread Astralidea
Github user Astralidea commented on the issue:

https://github.com/apache/spark/pull/15588
  
@lw-lin
Thanks for your reply. Running Spark in my private cluster is a little 
different (I start the driver & executors myself).
I had tried maxRegisteredWaitingTime, but I had not tried 
minRegisteredResourcesRatio.
I thought minRegisteredResourcesRatio would not work if 
maxRegisteredWaitingTime didn't work.
Maybe it works; I will try spark.scheduler.minRegisteredResourcesRatio 
tomorrow.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-22 Thread lw-lin
Github user lw-lin commented on the issue:

https://github.com/apache/spark/pull/15588
  
Spark Streaming would run a very simple dummy job to ensure that all slaves 
have registered before scheduling the `Receiver`s; please see 
https://github.com/apache/spark/blob/v2.0.0/streaming/src/main/scala/org/apache/spark/streaming/scheduler/ReceiverTracker.scala#L436-L447.

@Astralidea, `spark.scheduler.minRegisteredResourcesRatio` is the minimum 
ratio of registered resources to wait for before the dummy job begins. In our 
private clusters, configuring it to `0.9` or even `1.0` helps a lot to 
balance our 100+ `Receiver`s. Maybe you could also give it a try.
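
For reference, a minimal sketch of how such a setting might be passed at submit 
time. The master URL and application JAR are placeholders, not taken from this 
thread, and `spark.scheduler.maxRegisteredResourcesWaitingTime` is the related 
config that caps how long the scheduler waits before giving up on the ratio:

```shell
# Wait until 100% of the requested executor resources have registered
# before scheduling begins (and hence before the dummy job that
# distributes the Receivers runs), but give up after 120 seconds.
spark-submit \
  --master spark://master:7077 \
  --conf spark.scheduler.minRegisteredResourcesRatio=1.0 \
  --conf spark.scheduler.maxRegisteredResourcesWaitingTime=120s \
  streaming-app.jar
```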




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-22 Thread Astralidea
Github user Astralidea commented on the issue:

https://github.com/apache/spark/pull/15588
  
@srowen But in my cluster I tested 10 times: 9 succeeded and 1 failed. 
Why is it not necessary? Receiver scheduling balance affects performance. 
If a new executor registers with the driver late, the receivers won't be 
scheduled again. Or is there any other solution?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-22 Thread srowen
Github user srowen commented on the issue:

https://github.com/apache/spark/pull/15588
  
I don't think it's necessarily true that you want to wait for all receivers 
to begin processing. This change won't work in any event.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #15588: [SPARK-18039][Scheduler] fix bug maxRegisteredWaitingTim...

2016-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/15588
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org