On 06/07/2016 09:57 AM, Du, Fan wrote:


On 2016/6/6 21:27, james wrote:
Hello,


@Stephen:: I guess Stephen is bringing up the 'security' aspect of who
gets access to the information, particularly cluster/cloud devops,
customers or interlopers....?

ACLs should come into play here to address security concerns.

YES, and so much more! I know folks whose primary (in-house cluster) usage is deep packet inspection on the cluster....
With a cluster (inside) there is no limit to new tools that can be
judiciously altered to benefit from cluster codes....



@Fan:: As a consultant, most of my customers either have or are
planning hybrid installations, where some codes run on a local cluster
or use 'the cloud' for dynamic load requirements. I would think your
proposed scheme needs to be very flexible, both in application to a
campus or Metropolitan Area Network, if not massively distributed around
the globe. What about different resource types (racks of arm64, GPU-
centric hardware, DSPs, FPGAs, etc.)? Hardware diversity brings many
benefits to the cluster/cloud capabilities.


This also raises the question of hardware management (boot/config/online)
of the various hardware, such as is built into coreOS. Are several
applications going to be supported? Standards track? Just Mesos DC/OS
centric?

It depends on whether this proposal is accepted by Mesos. If you think
this feature is useful, let's discuss detailed requirements under
MESOS-5545.

OK. Take a look at 'Rackview' on sourceforge::
'http://rackview.sourceforge.net/'


Do I have access to the jira system by default on joining this list,
or do I have to request permission somewhere? (Sorry, jira is new to me,
so recommendations on jira, per mesos, in a document, would be welcome.)


btw, I have limited knowledge of CoreOS, will look into it.

CoreOS has some great ideas. But many of their codes are not current
(when compared to the gentoo portage tree) and thus many are suspect
for security/function.

I thought the purpose was to get more folks involved here in discussions
and then better formulated ideas can migrate to the ticket (5545) and repos.



TIMING DATA:: This is the main issue I see. Once you start 'vectoring
in resources' you need to add timing (latency) data to encourage robust
and diversified use of this data. For HPC, this could be very
valuable for rDMA abusive algorithms where memory constrained workloads
not only need the knowledge of additional nearby memory resources, but
the approximated (based on previous data collected) latency and
bandwidth constraints to use those additional resources.
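
To illustrate the idea, here is a minimal sketch of ranking nearby memory resources by an estimated transfer time built from previously collected latency and bandwidth figures. This is not a Mesos API: `RemoteMemory`, `estimated_transfer_s`, `rank_candidates` and all field names are hypothetical, and the cost model (latency plus size over bandwidth) is only one assumed way to combine the two measurements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteMemory:
    """Hypothetical description of memory offered by a nearby node."""
    node: str
    rack: str
    free_mb: int
    latency_us: float       # assumed: derived from previously collected data
    bandwidth_mb_s: float   # assumed: derived from previously collected data


def estimated_transfer_s(resource: RemoteMemory, size_mb: int) -> float:
    """Approximate time to move size_mb: fixed latency plus size / bandwidth."""
    return resource.latency_us / 1e6 + size_mb / resource.bandwidth_mb_s


def rank_candidates(candidates, size_mb):
    """Drop resources that are too small, then sort cheapest-first."""
    usable = [r for r in candidates if r.free_mb >= size_mb]
    return sorted(usable, key=lambda r: estimated_transfer_s(r, size_mb))
```

A memory-constrained workload could call `rank_candidates(offers, 1024)` and take the first entry; a resource on the same rack would usually win, but only because its measured latency and bandwidth are better, not because of the rack label itself.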

Out of curiosity, which open-source Mesos framework do you/your
customers use to run MPI?

Easy dude. Most of this work is tightly held and there is nothing to publish
or open up yet. It's a mess (my professional opinion) right now and
I'm testing a variety of tools just to be able to have better instrumentation on these codes. Still, rDMA is very attractive so it does warrant much attention and extreme, internal, excitement.




Mesos can support an MPI framework, but AFAIK, it's immature [1][2].

YEP.

I think this part of the work should be investigated in the future.

[1]: https://github.com/apache/mesos/tree/master/mpi   <- mpd ring version
[2]: https://github.com/mesosphere/mesos-hydra         <- hydra version

Many codes floating around. Much excitement on new compiler features. Lots of hard work and testing going on. That said, the point I was trying to make is that "vectoring in" resources, with a variety of parameters as a companion to your idea, is warranted for these aforementioned use cases
and other opportunities.


Great idea. I do like it very much.

hth,
James


On 06/06/2016 05:06 AM, Stephen Gran wrote:
Hi,

This looks potentially interesting.  How does it work in a public cloud
deployment scenario?  I assume you would just have to disable this
feature, or not enable it?

Cheers,

On 06/06/16 10:17, Du, Fan wrote:
Hi, Mesos folks

I’ve been thinking about Mesos rack awareness support for a while.
It’s a common interest for lots of data center applications to provide
data locality, fault tolerance and better task placement. I created
MESOS-5545 to track the story, and here is the initial design doc [1]
to support rack awareness in Mesos.

Looking forward to hearing any comments from end users and other
developers.

Thanks!

[1]:
https://docs.google.com/document/d/1rql_LZSwtQzBPALnk0qCLsmxcT3-zB7X7aJp-H3xxyE/edit?usp=sharing








