Re: Feedback on MLlib roadmap process proposal

2017-01-26 Thread Joseph Bradley
Sean has given a great explanation. A few more comments: Roadmap: I have been creating roadmap JIRAs, but the goal really is to have all committers working on MLlib help to set that roadmap, based on either their knowledge of current maintenance/internal needs of the project or the feedback

Re: Why two makeOffers in CoarseGrainedSchedulerBackend? Duplication?

2017-01-26 Thread Jacek Laskowski
Hi Imran, OK, that makes sense for performance reasons. Thanks for bearing with me and explaining that code with so much patience. Appreciated! Regards, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark 2.0 https://bit.ly/mastering-apache-spark Follow me at

Re: Why two makeOffers in CoarseGrainedSchedulerBackend? Duplication?

2017-01-26 Thread Imran Rashid
It is a small difference, but think about what this means on a cluster running 10k tasks (perhaps 1k executors with 10 cores each). When one task completes, you would have to go through 1k more executors. On top of that, with a large cluster, task completions happen far more
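To put rough numbers on the point above, here is a minimal Python sketch of the cost difference. It is only a back-of-the-envelope model, not Spark code: the function name `executors_examined` is hypothetical, and it assumes each of the 10k tasks completes exactly once and that a cluster-wide offer visits every executor once.

```python
def executors_examined(num_executors, completions, single_executor_offer):
    """Count executor entries visited while making offers (simplified model).

    single_executor_offer=True  -> only the executor that freed a core is checked.
    single_executor_offer=False -> every executor is checked on each completion.
    """
    per_completion = 1 if single_executor_offer else num_executors
    return per_completion * completions


num_executors = 1_000          # "1k executors"
cores_per_executor = 10        # "10 cores each"
completions = num_executors * cores_per_executor  # 10k tasks, each finishing once (assumption)

full_scan = executors_examined(num_executors, completions, single_executor_offer=False)
single = executors_examined(num_executors, completions, single_executor_offer=True)

print(f"cluster-wide offer per completion: {full_scan:,} executor checks")
print(f"single-executor offer per completion: {single:,} executor checks")
```

Under these assumptions the per-completion scan does a thousand times more work, which is the kind of overhead the single-executor variant avoids.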

Re: Why two makeOffers in CoarseGrainedSchedulerBackend? Duplication?

2017-01-26 Thread Jacek Laskowski
Hi Imran, Thanks a lot for your detailed explanation, but IMHO the difference is so small that I'm surprised it merits two versions -- both check whether an executor is alive -- executorIsAlive(executorId) vs executorDataMap.filterKeys(executorIsAlive) A bit fishy, isn't it? But, on the other

Re: Why two makeOffers in CoarseGrainedSchedulerBackend? Duplication?

2017-01-26 Thread Imran Rashid
One is used when exactly one task has finished -- that means you now have free resources on just that one executor, so you only need to look for something to schedule on that one. The other one is used when you want to schedule everything you can across the entire cluster. For example, you have
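The distinction described above can be illustrated with a small Python model. This is a sketch under stated assumptions, not the actual CoarseGrainedSchedulerBackend code: `ExecutorData`, `make_offers_all`, `make_offers_one`, and the `pending_removal` set are hypothetical names chosen for illustration, and an "offer" is reduced to a plain tuple.

```python
from dataclasses import dataclass


@dataclass
class ExecutorData:
    host: str
    free_cores: int


def executor_is_alive(executor_id, pending_removal):
    # Assumption: an executor counts as alive unless it is marked for removal.
    return executor_id not in pending_removal


def make_offers_all(executor_data_map, pending_removal):
    """Cluster-wide variant: build an offer for every live executor."""
    return [
        (executor_id, data.host, data.free_cores)
        for executor_id, data in executor_data_map.items()
        if executor_is_alive(executor_id, pending_removal)
    ]


def make_offers_one(executor_id, executor_data_map, pending_removal):
    """Single-executor variant: build an offer only for the executor that just freed resources."""
    if executor_id not in executor_data_map:
        return []
    if not executor_is_alive(executor_id, pending_removal):
        return []
    data = executor_data_map[executor_id]
    return [(executor_id, data.host, data.free_cores)]


executors = {
    "exec-1": ExecutorData("host-a", 2),
    "exec-2": ExecutorData("host-b", 0),
    "exec-3": ExecutorData("host-c", 4),
}
pending_removal = {"exec-3"}

print(make_offers_all(executors, pending_removal))
print(make_offers_one("exec-1", executors, pending_removal))
```

Both variants apply the same liveness check; they differ only in how many executors they look at, which is why the two functions look nearly identical.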

Why two makeOffers in CoarseGrainedSchedulerBackend? Duplication?

2017-01-26 Thread Jacek Laskowski
Hi, Why are there two (almost) identical makeOffers in CoarseGrainedSchedulerBackend [1] and [2]? I can't seem to figure out why they are there and am leaning towards considering one a duplicate. WDYT? [1]

Re: A question about creating persistent table when in-memory catalog is used

2017-01-26 Thread Shuai Lin
I see, thanks for the info! On Mon, Jan 23, 2017 at 4:12 PM, Xiao Li wrote: > Reynold mentioned the direction we are heading. You can see many PRs the > community submitted are for this target. To achieve this, a lot of works we > need to do. > > For example, for some