[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-09-13 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766124#comment-13766124
 ] 

Eli Collins commented on YARN-1024:
---

bq. keeping virtual cores to express parallelism sounds good as it is clear it 
is not a real core.

Hm, I read this the other way. If a framework asks for three vcores on a host, 
it intends to run some code on three real physical cores at the same time. If a 
long-lived framework wants to reserve 2 cores per host, it would ask for 2 cores 
(and 100% YCU per core).

Sandy's proposal, switching to cores and YCU instead of just vcores, is 
equivalent to the proposal above of getting rid of vcores and supporting 
fractional cores. A vcore becomes a core and YCU is just a way to express 
that you want a fraction of a core. Sounds good to me.

 Define a virtual core unambigiously
 ---

 Key: YARN-1024
 URL: https://issues.apache.org/jira/browse/YARN-1024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Arun C Murthy
 Attachments: CPUasaYARNresource.pdf


 We need to define the meaning of a virtual core unambiguously so that 
 it's easy to migrate applications between clusters.
 For example, here is Amazon's definition of the EC2 Compute Unit (ECU): 
 http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
 Essentially we need to clearly define a YARN Virtual Core (YVC).
 Equivalently, we can use ECU itself: *One EC2 Compute Unit provides the 
 equivalent CPU capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-26 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750263#comment-13750263
 ] 

Chris Riccomini commented on YARN-1024:
---

Jumping into this late. I'm thinking about this discussion from an end user's 
perspective.

1. It seems to me that the only time you'd want a YCU value that's not -1 is 
when you're running a thread that uses less than 100% of the CPU. Is that a 
correct statement?

2. As an end user, how do I know what YCU value is reasonable for my job? In 
the distcp example, how do I figure out that 500 YCUs is reasonable? Are we 
expecting users to run their job on an isolated box, run top, then do some 
arithmetic? Alternatively, are we expecting them to run their job repeatedly, 
and tune down their YCU request until the point that it becomes too 

Instinctively, I kind of have the feeling that the concept of the YCU is more 
useful if it encompasses more than just CPU. The benefit of Amazon's ECU is 
that it's fairly straightforward to reason about. You get a pre-defined slice 
of memory, CPU, disk, and network. If the primary goal is simplicity (stated 
above), why wouldn't you go that route, vs. limiting YCUs to being a strictly 
CPU-related concept? This leads to (perhaps significantly) worse cluster 
utilization, but it's a simpler model for the end user. As I understand it, 
this is kind of how memory was being treated prior to adding CPU resources 
(i.e. asking for 20% of the memory on a host is really just a proxy for 20% of 
machine resources as a whole).



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-26 Thread Chris Riccomini (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750270#comment-13750270
 ] 

Chris Riccomini commented on YARN-1024:
---

Realized I'm conflating ECU with AWS machine instances. I retract the whole 
last paragraph. :)



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-26 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13750286#comment-13750286
 ] 

Sandy Ryza commented on YARN-1024:
--

bq. It seems to me that the only time you'd want a YCU value that's not -1 is 
when you're running a thread that uses less than 100% of the CPU. Is that a 
correct statement?
That's correct.  This is common for data-intensive tasks that can be more 
I/O-bound than CPU-bound.

bq. As an end user, how do I know what YCU value is reasonable for my job?
I think selecting the right value is an inherently difficult task. I think we 
would expect different users with different amounts of technical proficiency to 
do it in different ways.  Something like:
* Simple: Use the default value on the cluster.
* Intermediate: Notice your tasks are running too slow and increase YCUs.  Or 
notice your tasks aren't getting scheduled enough and decrease them.
* Advanced: Do the thing with top.
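
A rough sketch of what the "advanced" route might amount to: sample one thread's CPU usage with top on an otherwise idle box, average it, and scale it to YCUs. The function name and the per-core YCU figure passed in are assumptions for illustration; nothing in this thread fixes a YCU scale.

```python
import math


def estimate_ycus_per_core(cpu_percent_samples, node_ycus_per_core):
    """Estimate a per-core YCU request from %CPU samples of one thread.

    cpu_percent_samples: readings for a single thread, where 100.0 means
        one fully busy core (e.g. copied by hand from top).
    node_ycus_per_core: YCUs one core of the measurement box provides
        (hypothetical cluster setting, not defined in this thread).
    """
    if not cpu_percent_samples:
        raise ValueError("need at least one %CPU sample")
    mean_percent = sum(cpu_percent_samples) / len(cpu_percent_samples)
    # A single thread cannot use more than one core's worth of YCUs.
    capped_percent = min(mean_percent, 100.0)
    # Round up so the request never undershoots the measured average.
    return math.ceil(capped_percent / 100.0 * node_ycus_per_core)


# A thread that averages half a core on a 1000-YCU-per-core box.
print(estimate_ycus_per_core([40.0, 60.0, 50.0], 1000))  # 500
```

Averaging suits I/O-bound tasks that burst and idle; sizing for the peak sample instead would be the more conservative choice.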



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-22 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13747701#comment-13747701
 ] 

Sandy Ryza commented on YARN-1024:
--

Filed YARN-1089 for adding YCUs.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-22 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13748062#comment-13748062
 ] 

Sandy Ryza commented on YARN-1024:
--

I wrote up a more detailed proposal and attached a PDF of it.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-15 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741051#comment-13741051
 ] 

Arun C Murthy commented on YARN-1024:
-

[~sandyr] Thanks for taking the time to clearly elucidate our long, 
always-half-confused discussion, i.e. the longwindedness! I think we could be 
close to a solution here, I really do, though I'm not betting my house yet. 
*smile*

To paraphrase your proposal (mainly for my own benefit):
# Split current (get,set)VirtualCores into (get,set)YCUPerCore and 
(get,set)Cores.
# There is a cluster-wide constant of maxYCUPerCore
# The schedulers use {{core * YCUPerCore}} to do resource-allocation 
comparisons.

The one issue that we need to think about is that we'll need to enhance the 
schedulers to track how many YCUs are available on which core on any given 
node... you could have 6 YCUs in a node but split 3-2-1 across 3 cores. Any 
good ideas on how to get to this?
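
To make the concern concrete, here is a small sketch comparing a node-wide check with a per-core check, using the 3-2-1 split as the free YCUs on three cores. The function names are hypothetical, and the per-core check assumes each requested core must be served by a single physical core, which is exactly the assumption under discussion.

```python
def fits_aggregate(free_per_core, cores, ycus_per_core):
    """Check the request against the node's total free YCUs only."""
    return sum(free_per_core) >= cores * ycus_per_core


def fits_per_core(free_per_core, cores, ycus_per_core):
    """Check that enough individual cores each have the requested YCUs free.

    Assumes one requested core maps onto exactly one physical core.
    """
    eligible = [free for free in free_per_core if free >= ycus_per_core]
    return len(eligible) >= cores


free = [3, 2, 1]  # free YCUs on each of the node's three cores

# One core at 4 YCUs: the node total looks sufficient, but no single core is.
print(fits_aggregate(free, 1, 4), fits_per_core(free, 1, 4))  # True False

# Two cores at 2 YCUs each: both checks agree.
print(fits_aggregate(free, 2, 2), fits_per_core(free, 2, 2))  # True True
```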






[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-15 Thread Alejandro Abdelnur (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741081#comment-13741081
 ] 

Alejandro Abdelnur commented on YARN-1024:
--

Instead of cores, could we talk about parallelism, given that there are CPUs 
with hyper-threading and a physical core may be seen as more than one from a 
parallelism perspective? (I.e., a single-threaded MR task would consume at most 
1/2 of a core.)



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741191#comment-13741191
 ] 

Sandy Ryza commented on YARN-1024:
--

bq. To paraphrase your proposal...
Thanks for summarizing.  That captures it perfectly.

bq. The one issue that we need to think about is that we'll need to enhance the 
schedulers to track how many YCUs are available on which core on any given 
node... you could have 6 YCUs in a node but split 3-2-1 across 3 cores. Any 
good ideas on how to get to this?
Correct me if I'm wrong, but you're talking about availability based on usage, 
not heterogeneous cores within a node, right?  If my assumptions about how we 
can view CPUs are sufficient, I'm thinking we shouldn't need this, at least for 
a start.  I.e. if we have two threads to run and two cores to run them on, we 
can be agnostic to whether the OS scheduler is running each on its own core or 
splitting both across two. The CGroups properties discussed in YARN-810 allow 
you to limit the total processing power that a process gets without pinning its 
threads to cores.  Assigning tasks to cores might matter for things like cache 
performance, so I agree it's a useful thing to work on eventually. But I think 
any solution will either end up with a decent amount of fragmentation or 
require doing some NP-hard combinatorial optimization repeatedly.
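
As an illustration of limiting total processing power without pinning, the Linux cgroup v1 CPU controller exposes cpu.cfs_period_us and cpu.cfs_quota_us, and a quota larger than the period allows more than one core's worth of CPU time. The sketch below only computes the quota value; the function name and the YCU-to-core conversion are assumptions, and it is not meant to describe what YARN-810 actually does.

```python
CFS_PERIOD_US = 100_000  # default cpu.cfs_period_us (100 ms)


def cfs_quota_us(cores, ycus_per_core, node_ycus_per_core,
                 period_us=CFS_PERIOD_US):
    """Return a cpu.cfs_quota_us value for a (cores, YCUs per core) request.

    The request is worth cores * ycus_per_core YCUs in total; dividing by
    the node's YCUs per core gives the number of physical cores' worth of
    CPU time the container may use per period, on whichever cores the OS
    scheduler picks.
    """
    requested_ycus = cores * ycus_per_core
    # Integer arithmetic avoids float truncation surprises.
    return period_us * requested_ycus // node_ycus_per_core


# Two threads at half a core each on a 1000-YCU-per-core node:
# one core's worth of CPU time per period, not tied to any core.
print(cfs_quota_us(2, 500, 1000))  # 100000
```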



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-15 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741213#comment-13741213
 ] 

Arun C Murthy commented on YARN-1024:
-

bq. Correct me if I'm wrong, but you're talking about availability based on 
usage, not heterogeneous cores within a node, right? If my assumptions about 
how we can view CPUs are sufficient, I'm thinking we shouldn't need this, at 
least for a start.

Ah, good point, although I would like to really think through the implications 
of heterogeneous nodes in the cluster. In spite of that, I don't think there is 
anything here we'd be blocked on. Does anyone disagree?

-

Now, the important question. 

If we agree, in a broad sense, on [~sandyr]'s proposal - are we happy with our 
current APIs, particularly in light of 2.1.0-beta?

One option is for us to use the current (get,set)VirtualCores as the basis for 
'cores' or 'parallelism' going forward and introduce a new (get,set)YCUPerCore.

Is that ok? What do you guys think? Thanks.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-15 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741289#comment-13741289
 ] 

Sandy Ryza commented on YARN-1024:
--

My opinion is that we shouldn't delay the release for adding in both, and that 
we can add in YCUs in 2.1.1.  If we want to change virtual cores to 'cores' or 
'parallelism', I could post a refactoring patch by EOD.  I also wouldn't cry if 
we left it as virtual cores.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-15 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741518#comment-13741518
 ] 

Robert Joseph Evans commented on YARN-1024:
---

I am fine with that too.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-14 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739853#comment-13739853
 ] 

Robert Joseph Evans commented on YARN-1024:
---

{quote}Sorry for the longwindedness.{quote}

From what people have told me you still have a long ways to go before you 
approach me for longwindedness :).

My initial gut reaction is that only having two numbers to express the request 
seems too simplified, but the more I think about it the more I am OK with it, 
although I think I would change the numbers to be total YCUs requested and 
minimum YCUs per core.  This gives the user better visibility into how the 
scheduler is treating these numbers so they can better reason about them. The 
total YCUs is the value used for scheduling.  The minimum YCUs per core is 
compared to the maxComputeUnitsPerCore, as was suggested, to reject a request 
as not possible, or in the case of a heterogeneous environment to restrict the 
hosts that this container can run on.  Although I am OK with the original 
proposal too.
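
A minimal sketch of how the "minimum YCUs per core" number might be used, assuming each host advertises a single YCUs-per-core figure. The function name and the host map are made up for illustration.

```python
def eligible_hosts(min_ycus_per_core, host_ycus_per_core):
    """Return the hosts whose cores are strong enough for the request.

    host_ycus_per_core: dict mapping a host name to the YCUs each of its
        cores provides (hypothetical per-node setting).
    An empty result means the request cannot be satisfied anywhere and
    would be rejected as not possible.
    """
    return sorted(
        host
        for host, per_core in host_ycus_per_core.items()
        if per_core >= min_ycus_per_core
    )


hosts = {"fast": 1500, "slow": 800}

print(eligible_hosts(1000, hosts))  # ['fast']
print(eligible_hosts(2000, hosts))  # []
```

The total YCUs requested would then be the only number charged against the capacity of whichever eligible host is chosen.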

I would also like us to have a flag that would either limit the container to 
the requested CPU and let it have no more even when more is available, or would 
let it expand to use whatever CPU was free, but would be guaranteed to get at 
least the YCUs requested.  This is likely something that would have to be done 
on a separate JIRA though.  Without this I don't see a way to really get 
simplicity, predictability, or consistency.  1 MB of RAM is fairly simple to 
understand.  It can be measured without too much of a problem just by running 
the process.  Most users do a simple search for the correct value: run with the 
default, and if it does not work, increase the amount and run again.  1 YCU is 
very complex to measure for an application.  If I cannot restrict a container 
to never use more than what was requested I cannot consistently predict how 
long it will take to run later.  Without this I don't know how to answer the 
question I know will come up.

What should I set these values to?




[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-14 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13739900#comment-13739900
 ] 

Sandy Ryza commented on YARN-1024:
--

bq. I would also like us to have a flag that would either limit the container 
to the requested CPU and let it have no more even when more is available, or 
would let it expand to use whatever CPU was free, but would be guaranteed to 
get at least the YCUs requested.
YARN-810 should handle this.  The plan is to make it a cluster config, but feel 
free to chime in there if you think it needs to be an app config. 

bq. 1 YCU is very complex to measure for an application.
Agreed that YCUs are very complex to measure and set for applications, and I 
don't think there is any good way around this. YARN-810 will help considerably, 
but still won't make it close to as easy as configuring memory.

bq. although I think I would change the numbers to be total YCUs requested and 
minimum YCUs per core.
Because of the complexity discussed above in dealing with YCUs, I strongly 
believe that we should keep one of the parameters as just number of cores, 
which allows a user to separate the concerns of "how much parallelism can my 
task take advantage of?" and "how CPU-bound is my task?".  This will also give 
us something in common with every other cluster resource manager I have 
surveyed (Condor, Maui, Torque, etc.)



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-13 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13738985#comment-13738985
 ] 

Sandy Ryza commented on YARN-1024:
--

I've been thinking a lot about this, and wanted to propose a modified approach, 
inspired by an offline discussion with Arun and his max-vcores idea 
(https://issues.apache.org/jira/browse/YARN-1024?focusedCommentId=13730074&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13730074).

First, my assumptions about how CPUs work:
* A CPU is essentially a bathtub full of processing power that can be doled out 
to threads, with a limit per thread based on the power of each core within it.
* To give X processing power to a thread means that within a standard unit of 
time, roughly some number of instructions proportional to X can be executed for 
that thread. 
* No more than a certain amount of processing power (the amount of processing 
power per core) can be given to each thread.
* We can use CGroups to say that a task gets some fraction of the system's 
processing power.
* This means that if we have 5 cores with Y processing power each, we can give 
5 threads Y processing power each, or 6 threads 5Y/6 processing power each, but 
we can't give 4 threads 5Y/4 processing power each.
* It never makes sense to use CGroups to assign a higher fraction of the 
system's processing power than (numthreads the task can take advantage of / 
number of cores) to a task.
* Equivalently, if my CPU has X processing power per core, it never makes sense 
to assign more than (numthreads the task can take advantage of) * X processing 
power to a task.
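
The arithmetic behind these assumptions can be sketched as follows, with processing power in arbitrary integer units. The function names are hypothetical, and capping the useful fraction at 1 is an added assumption (a task cannot be given more than the whole machine).

```python
from fractions import Fraction


def max_power_per_thread(num_cores, power_per_core, num_threads):
    """Largest equal share each thread can get.

    The bathtub (num_cores * power_per_core) is split evenly across the
    threads, but no thread can receive more than one core's power.
    """
    even_share = Fraction(num_cores * power_per_core, num_threads)
    return min(Fraction(power_per_core), even_share)


def useful_power_cap(task_threads, power_per_core):
    """Most processing power a task with task_threads threads can use."""
    return task_threads * power_per_core


def max_useful_fraction(task_threads, num_cores):
    """Largest fraction of the system worth assigning to the task."""
    return min(Fraction(1), Fraction(task_threads, num_cores))


Y = 1  # power per core, arbitrary unit
print(max_power_per_thread(5, Y, 5))  # 1    (5 threads get Y each)
print(max_power_per_thread(5, Y, 6))  # 5/6  (6 threads get 5Y/6 each)
print(max_power_per_thread(5, Y, 4))  # 1    (4 threads cannot get 5Y/4)
```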

So as long as we account for that last constraint, we can essentially view 
processing power as a fluid resource like memory.  With this in mind, we can:
1. Split virtual cores into cores and yarnComputeUnitsPerCore.  Requests can 
include both and nodes can be configured with both.
2. Have a cluster-defined maxComputeUnitsPerCore, which would be the smallest 
yarnComputeUnitsPerCore on any node.  We min all yarnComputeUnitsPerCore 
requests with this number when they hit the RM.
3. Use YCUs, not cores, for scheduling.  I.e. the scheduler thinks of a node's 
CPU capacity in terms of the number of YCUs it can handle and thinks of a 
resource's CPU request in terms of its (normalized yarnComputeUnitsPerCore * # 
cores).  We use YCUs for DRF.
4. If we make YCUs small enough, no need for fractional anything.

This reduces to a number-of-cores-based approach if all containers are 
requested with yarnComputeUnitsPerCore=infinity, and reduces to a YCU approach 
if maxComputeUnitsPerCore is set to infinity.  Predictability, simplicity, and 
scheduling flexibility can be traded off per cluster without overloading the 
same concept with multiple definitions.
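
A small sketch of steps (2) and (3), and of the two limiting cases just described. The function names are illustrative, and math.inf stands in for "infinity"; none of this is an existing YARN API.

```python
import math


def cluster_max_ycus_per_core(node_ycus_per_core):
    """Step 2: the smallest yarnComputeUnitsPerCore on any node."""
    return min(node_ycus_per_core)


def normalized_cpu_request(cores, ycus_per_core, max_ycus_per_core):
    """Step 3: the YCU total the scheduler would use for a request.

    The per-core request is capped at the cluster-wide maximum, then
    multiplied by the number of cores asked for.
    """
    return cores * min(ycus_per_core, max_ycus_per_core)


max_per_core = cluster_max_ycus_per_core([1000, 1200, 1500])  # 1000

# Every container asks for "infinite" YCUs per core: the result depends
# only on the number of cores, i.e. a number-of-cores approach.
print(normalized_cpu_request(2, math.inf, max_per_core))  # 2000

# The cluster maximum is infinite: the request is taken at face value,
# i.e. a pure YCU approach.
print(normalized_cpu_request(2, 300, math.inf))  # 600
```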

This doesn't take into account heterogeneous hardware within a cluster, but I 
think (2) can be tweaked to handle this by holding a value for each node (can 
elaborate on how this would work).  It also doesn't take into account pinning 
threads to CPUs, but I don't think it's any less extensible for ultimately 
dealing with this than other proposals.

Sorry for the longwindedness.  Bobby, would this provide the flexibility you're 
looking for?



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-12 Thread Robert Joseph Evans (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736919#comment-13736919
 ] 

Robert Joseph Evans commented on YARN-1024:
---

Perhaps I am missing something here.  The goals Arun has asked for are 
simplicity, predictability, and consistency.  Simplicity I totally agree with, 
but I do not totally agree with always having predictability and consistency 
after simplicity, and I do not agree that they are always required.  These two 
come with a trade-off with utilization, and this is something that Sandy 
brought up, although not directly.  For HBase, guaranteed resources, in terms 
of both parallelism and raw CPU speed, are important because it is using those 
to provide a service where predictability and consistency are needed. If the 
HBase AM cannot truly express to YARN what it needs because of simplicity, 
HBase on YARN will not be used, because it will not behave the way users 
need/expect it to.  Similarly, if HBase is allowed to steal resources from 
others, you can easily request too few resources on an underutilized cluster, 
and when the cluster is under load it falls apart.

This is similar for me with my desire for Storm on YARN.  I am happy to use a 
complex API to express my needs if it means that I get what I need.  On the 
other hand, if I am doing MR batch processing most of the time (but not all of 
it) I am doing single threaded processing and I really just want it to fill in 
the gaps and use as much unused CPU as it can.  Yes, some MR jobs have strict 
SLAs but most do not and it is best if we can provide a scheduler that can 
balance both.

I also don't agree that because YARN lacks the ability to schedule everything 
that impacts performance, including network and disk IO, that we should skip 
doing CPU correctly.  Some applications are truly CPU bound and they will 
benefit.  For other resources we can add them to YARN as they are needed until 
we do meet the goal of predictability and consistency.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-06 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730622#comment-13730622
 ] 

Junping Du commented on YARN-1024:
--

I would also prefer #1 for scheduling resources, as #2 is only meaningful for 
charging/billing, as [~philip] mentioned above. 
For #2, a simple calculation like ECU (it was released in 2006/2007 but hasn't 
changed in 7 years, which goes against Moore's law :)) runs into two commonly 
questioned scenarios:
- Assigning multiple slow p-cores (4 x 1G) to a single-threaded task asking 
for a fast core (1 x 4G, mapped to multiple vcores) cannot help performance and 
wastes CPU resources: the unused cores still consume timer interrupts, and the 
idle loop costs resources too. In addition, maintaining a consistent memory 
view among multiple vCPUs consumes resources. All of these are unnecessary. 
Another problem is that the OS CPU scheduler may migrate a single-threaded 
workload among multiple vCPUs, thereby losing cache locality.
- Assigning a single faster p-core (1 x 4G) to a multi-threaded task asking 
for multiple slow cores (4 x 1G) will cause performance issues, as Steve 
mentioned above and in YARN-972: too much overhead from process context 
switches and cache misses.
#1 sounds more reasonable. 1 vcore doesn't have to be 1 pcore; it could be 
mapped to 1 vCPU under virtualization and can be overcommitted later (with a 
configured ratio) by the virtualization platform.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-06 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730961#comment-13730961
 ] 

Eli Collins commented on YARN-1024:
---

bq. vcores are optional anyway (only used in DRF) 

Sandy corrected me offline that while this is true for the CS it is not true 
for the FS, which by default (w/o DRF) will not schedule more containers worth 
of vcores than configured vcores (which seems like it could lead to 
under-utilization given that the default resource calculator only uses memory 
and not every container needs a whole core). By default the # vcores is the # 
cores on the machine and MR asks for containers w/ 1 vcore, so we effectively 
have vcore=pcore today as the default (reinforced by the decision to remove the 
notion of pcore in YARN-782).
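
The under-utilization concern can be illustrated with a little arithmetic. This is only a sketch: the node size (8 cores, 48 GB) and the container size (1 vcore, 2 GB) are made-up numbers, and `max_containers` is a hypothetical helper, not a YARN API.

```python
def max_containers(node_mem_mb, node_vcores, req_mem_mb, req_vcores, enforce_vcores):
    """Number of identical containers that fit on one node."""
    by_memory = node_mem_mb // req_mem_mb
    if not enforce_vcores:
        # Memory-only accounting: vcores are ignored entirely.
        return by_memory
    by_vcores = node_vcores // req_vcores
    return min(by_memory, by_vcores)


# Hypothetical node: 8 cores (so 8 vcores by default) and 48 GB of memory.
# Hypothetical MR container: 1 vcore, 2 GB.
memory_only = max_containers(48 * 1024, 8, 2 * 1024, 1, enforce_vcores=False)
with_vcores = max_containers(48 * 1024, 8, 2 * 1024, 1, enforce_vcores=True)
print(memory_only, with_vcores)  # 24 8
```

Under these assumptions, counting vcores limits the node to 8 containers even though memory alone would admit 24, which only matters if most containers need well under a full core.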



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-05 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729724#comment-13729724
 ] 

Arun C Murthy commented on YARN-1024:
-

bq. If I were to package my simulator and give it to other people on other 
clusters, it would still be true that it spins one CPU. Its runtime, however, 
would vary depending on the horsepower.

I don't see the conflict.

If you don't care about predictable runtime, you could still say I want to run 
on 1 virtual-core. By the above non-requirement on predictability, whether it's 
1 (virtual) core out of 16 physical cores or 1024 virtual cores is immaterial, 
isn't it? And yes, you still get only 1 physical core since the virtual core is 
mapped to a single physical core.

The point about specifying a virtual core is that you get predictable 
performance when you migrate your application between clusters and other 
goodness.

What am I missing here?



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-05 Thread Steve Loughran (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729833#comment-13729833
 ] 

Steve Loughran commented on YARN-1024:
--

I was the one trying to convince Sandy that a uniform core metric is dangerous, 
it's like when a MIP was a VAX-equivalent Million Instructions.

# different parts have different performance in terms of FPU and memory IO 
bandwidth, even if the integer perf is the same. (hence people like to get 
Intel parts over AMD parts on EC2 allocations). 
# there's also the hyperthreading issue; is an HT core the equivalent of a real 
core (no, but Linux treats them the same, AFAIK).
# over time, as 2007 gets further away, the metric becomes less relevant.
# EC2 also includes RAM (e.g. m1.small has the same CPU as m1.medium, only less 
RAM; AWS considers medium as having 2x ECUs).

One thing I was arguing against in YARN-972 is allocating fractions of a real 
core: if I say 1 core, I get a single core, irrespective of performance. If 
ECUs are used, and I ask for 1 ECU, does that mean that I get 0.50 of a bigger 
core, or a free upgrade?

I'm happy if I ask for 8 ECUs and get a guarantee of not being on a CPU with 
fewer than 8 ECUs, making it a minimum requirement of the CPU perf.




[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-05 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729839#comment-13729839
 ] 

Sandy Ryza commented on YARN-1024:
--

If I am used to running my single-threaded task on a fast core (let's say rated 
at 250 YVCs), and then I migrate it to another cluster with slower cores (let's 
say rated at 150 YVCs), and still request 250 YVCs, my task will run no faster 
than if I had requested it with 150 YVCs.  I won't get predictable performance, 
and, from a scheduling perspective, I'd be better off requesting 150 YVCs on 
the slower cluster.
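
A rough way to put numbers on this, assuming each thread can use at most one core's worth of YVCs at a time (`usable_yvcs` is a hypothetical helper, and the ratings are the ones from the example above):

```python
def usable_yvcs(requested_yvcs, threads, yvcs_per_core):
    """YVCs a task can actually use: each thread is capped by one core's rating."""
    return min(requested_yvcs, threads * yvcs_per_core)


print(usable_yvcs(250, threads=1, yvcs_per_core=250))  # fast cluster -> 250
print(usable_yvcs(250, threads=1, yvcs_per_core=150))  # slow cluster, same request -> 150
print(usable_yvcs(150, threads=1, yvcs_per_core=150))  # slow cluster, smaller request -> 150
```

On the slower cluster the 250 and 150 YVC requests give the same usable capacity, so the extra 100 YVCs are reserved but never used.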

In a single pcore-to-vcore world, if I know that my task is CPU-bound and uses 
X threads, I know that each vcore I ask for up to X vcores will predictably 
improve its performance, whatever cluster I am running on.  In a world where 
different cores have different YVCs, I don't get a clear concept of when I 
should increase my YVCs requested, and the advantage of doing so depends mostly 
on the cluster I am running on.

A virtual core definition based on processing power masks the fact that two 1.5 
GHz cores mean something very different than three 1.0 GHz cores, and it makes 
it very hard to reason about how many virtual cores to request.
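
To make the last point concrete, here is a deliberately idealised model. It assumes a CPU-bound task whose threads scale perfectly and can each use at most one core; `throughput` is a hypothetical helper, not anything in YARN.

```python
def throughput(threads, cores, ghz_per_core):
    """Idealised CPU throughput (in GHz) of a CPU-bound task.

    Assumes each thread can use at most one core at a time and that
    threads scale perfectly; both are simplifications.
    """
    return min(threads, cores) * ghz_per_core


for threads in (1, 2, 3):
    two_fast = throughput(threads, cores=2, ghz_per_core=1.5)
    three_slow = throughput(threads, cores=3, ghz_per_core=1.0)
    print(threads, two_fast, three_slow)
```

Both allocations add up to 3.0 GHz, yet in this model a single-threaded task sees 1.5 vs 1.0 and a two-threaded task sees 3.0 vs 2.0; only with three threads do they come out equal.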




[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-05 Thread Jason Lowe (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13729895#comment-13729895
 ] 

Jason Lowe commented on YARN-1024:
--

Agree that the example posed by [~sandyr] shows that a single unit in the 
request cannot properly convey the ask.  Chatted briefly about this offline 
with [~revans2] and [~nroberts] and we think in general there needs to be a way 
to show the parallelism needed along with some performance guarantee from those 
threads.  That basically leads us to a path where in the generalized case we're 
asking for a list of vcore units, where the number of entries in the list 
represents the desired hardware parallelism and the value of each entry 
represents the performance needed for that execution thread.

Using this with Sandy's example, asking for a single unit of 250 YVCs means it 
would not be allocated on the node with three cores each rated at 150 YVCs 
because none of the cores meets the single-threaded performance needed by the 
container.  If another job came along and asked for three cores each at 100 
YVCs, that could still run on a node that only has a single core rated at 500 
YVCs because that core likely has enough horsepower to multitask the three 
threads and get them each the required performance.
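
A minimal sketch of how such a list-of-asks check might look. The `fits` function and the first-fit-decreasing strategy are assumptions made for illustration; nothing here is an existing scheduler API, and a real scheduler would also have to track what is already allocated on the node.

```python
def fits(asks, core_ratings):
    """First-fit-decreasing check: can every ask be placed on some core?

    asks: per-thread YVC requirements (list length = desired parallelism).
    core_ratings: YVC rating of each physical core on the node.
    This is a heuristic; it can reject a placement that an exact
    bin-packing search would find.
    """
    free = sorted(core_ratings, reverse=True)
    for ask in sorted(asks, reverse=True):
        for i, capacity in enumerate(free):
            if capacity >= ask:
                # Place this thread on core i and shrink its remaining capacity.
                free[i] = capacity - ask
                break
        else:
            # No single core can satisfy this thread's ask.
            return False
    return True


print(fits([250], [150, 150, 150]))  # False: no core meets the 250 YVC ask
print(fits([100, 100, 100], [500]))  # True: one 500 YVC core multitasks all three
```

Note that the second case trades real hardware parallelism for time-slicing on a single fast core, which is acceptable only if the per-thread rating is what the job actually cares about.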

I understand where [~ste...@apache.org] is coming from re: dangers of 
developing one unit to rule them all, but I also think there needs to be 
*some* way to convey performance requirements.  Sandy's example shows that just 
because a job ran fine with one core on some box doesn't mean the job is going 
to run fine with one core on another.  We will not be able to develop a metric 
that will cover all the hardware architecture differences, but if a metric 
works in the vast majority of cases then I think that's a net win over no 
metric.

The APIs are already set for 2.1, and I believe the common case will be jobs 
where a single thread dominates the overall CPU request of the container.  In 
that sense, we can map the existing API call to a single vcore ask and add 
another API where the ask can be a list/array of vcore asks.  This could get 
complicated in the scheduler for an architecture where the effective vcore 
rating of the processors is not homogeneous (brings up the spectre of 
processor-pinning and per-processor scheduling), but I don't think this will be 
a common architecture.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-05 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730074#comment-13730074
 ] 

Arun C Murthy commented on YARN-1024:
-

bq. If I am used to running my single-threaded task on a fast core (let's say 
rated at 250 YVCs), and then I migrate it to another cluster with slower cores 
(let's say rated at 150 YVCs), and still request 250 YVCs, my task will run no 
faster than if I had requested it with 150 YVCs.

[~sandyr] That is why you'd set a max-vcores in CS/FS of 150. This prevents 
users from falling into that trap. So, that should solve it - correct?
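
As a sketch of the idea, assuming the scheduler simply caps an oversized ask at the configured maximum (whether a real scheduler caps or rejects such a request is not stated here, and `normalize_request` is a hypothetical name):

```python
def normalize_request(requested_vcores, max_vcores):
    """Cap a request at the scheduler's configured maximum (assumed behaviour)."""
    return min(requested_vcores, max_vcores)


print(normalize_request(250, max_vcores=150))  # capped to 150
print(normalize_request(100, max_vcores=150))  # unchanged: 100
```

With the maximum set to the per-core rating of the slower cluster, the 250 YVC ask from the example can never reserve more than a single core there can deliver.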



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-05 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730077#comment-13730077
 ] 

Arun C Murthy commented on YARN-1024:
-

[~jlowe] Yep, it does make sense to talk about a more explicit 'vector of 
cores' model as we've discussed in the past - that said, I agree it's too early.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-05 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730080#comment-13730080
 ] 

Arun C Murthy commented on YARN-1024:
-

Overall, yes, there are certainly issues with a strict definition of vcore etc., 
but we need to do *just enough* for now - not solve all possible permutations.

Basic requirements are simplicity, predictability and consistency - in that 
order.




[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-05 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730195#comment-13730195
 ] 

Sandy Ryza commented on YARN-1024:
--

Jason, Steve, and Arun, you bring up good points that I think have helped me 
understand some of my assumptions.   I agree that simplicity, predictability, 
and consistency are our most important requirements.  I agree with Jason that 
at least two values -  processing power per core and # of cores - are required 
to fully express a request, and that, in spite of this, we should not use both 
and that a single value is better than nothing.

We have a tradeoff between
* A definition that offers some predictability between clusters, but only makes 
sense for requests for a single physical core or less per container.
* A definition that offers predictability only on homogeneous hardware, but 
that functions sensibly for requests for both more and less than a single 
physical core.

I thought that one of the exciting things about allowing requests for CPU would 
be that YARN would be able to better accommodate multi-threaded CPU-intensive 
frameworks like MPI and Storm.  Predictability between clusters seems to matter 
a lot less to me. A ton of other factors interfere with this kind of 
predictability.  The speed that hardware permits a task to read from disk or 
over the network has can have just as large an impact on the processing power 
it consumes as whatever the task is doing.  I don't believe that we will be 
able to attain predictability to the degree that it will provide much value.




[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-05 Thread Sandy Ryza (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730197#comment-13730197
 ] 

Sandy Ryza commented on YARN-1024:
--

bq. The speed that hardware permits a task to read from disk or over the 
network has can have just as large an impact on the processing power it 
consumes as whatever the task is doing.
Meant: The speed that hardware permits a task to read from disk or over the 
network can have just as large an impact on the processing power it consumes as 
whatever the task is doing.



[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-05 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730301#comment-13730301
 ] 

Eli Collins commented on YARN-1024:
---

I agree we need to define the meaning of a virtual core unambiguously 
(otherwise we won't be able to support two different frameworks on the same 
cluster that may have differing ideas of what a vcore is). I also agree with 
Phil that there are essentially two high-level use cases:

1. Jobs that want to express how much CPU capacity the job needs. Real world 
example - a distcp job wants to express it needs 100 containers but only a 
fraction of a CPU for each since it will spend most of its time blocking on IO.

2. Services - ie long-lived frameworks (ie support 2-level scheduling) - that 
want to request cores on many machines on a cluster and want to express 
CPU-level parallelism and aggregate demand (because they will schedule 
fine-grain requests w/in their long-lived containers). Eg a framework should be 
able to ask for two containers on a host, each with one core, so it can get two 
containers that can execute in parallel on a full core. This is assuming we 
plan to support long-running services in Yarn (YARN-896), which is hopefully 
not controversial. Real world example is HBase which may want 2 guaranteed 
cores per host on a given set of hosts.

Seems like there are two high-level approaches:

1. Get rid of vcores. If we define 1vcore=1pcore (1vcore=1vcpu for virtual 
environments) and support fractional cores (YARN-972) then services can ask for 
1 or more vcores knowing they're getting real cores and jobs just ask for what 
fraction of a vcore they think they need. This is really abandoning the concept 
of a virtual core because it's actually expressing a physical requirement 
(like memory, we assume Yarn is not dramatically over-committing the host). We 
can handle heterogeneous CPUs via attributes (as discussed in other Yarn jiras) 
since most clusters in my experience don't have wildly different processors (eg 
1 or 2 generations is common), and attributes are sufficient to express 
policies like all my cores should have equal/comparable performance.

2. Keep going with vcores as a CPU unit of measurement. If we define 
1vcore=1ECU (works 1:1 for virtual environments) then services (#2) need to 
understand the power of a core so they can ask for that many vcores - 
essentially they are just undoing the virtualization. YARN would need to make 
sure two containers each with 1 pcore's worth of vcores do in fact give you 
two cores (just like hypervisors schedule vcpus for the same VM on different 
pcores to ensure parallelism), but there would be no guarantee that two 
containers on the same host each w/ one vcore would run in parallel. Jobs that 
want fractional cores would just express 1vcore per container and work their 
way up based on the experience running on the cluster (or also undo the 
virtualization by calculating vcore/pcore if they know what fraction of a pcore 
they want). Heterogeneous CPUs do not fall out naturally (still need 
attributes) since there's no guarantee you can describe the difference between 
two CPUs as roughly 1 or more vcores (eg 2.4 vs 2.0 GHz is less than 1 ECU), 
however there's no need for fractional vcores.
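
The "undoing the virtualization" step in approach 2 amounts to a unit conversion. A small sketch, assuming a host whose physical cores are each rated at 4 ECUs (a made-up figure) and a hypothetical helper `vcores_for_pcores`:

```python
import math


def vcores_for_pcores(pcores, ecus_per_pcore):
    """Approach 2: whole vcores (1 vcore = 1 ECU) needed to cover `pcores` physical cores."""
    return math.ceil(pcores * ecus_per_pcore)


# Hypothetical host whose physical cores are each rated at 4 ECUs.
print(vcores_for_pcores(2, 4))     # a service reserving two full cores -> 8
print(vcores_for_pcores(0.25, 4))  # a job wanting a quarter of a core -> 1
```

Under approach 1 the same two asks would simply be 2 cores and 0.25 of a core, with no per-host rating needed to form the request.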

I think either is reasonable and can be made to work, though I think #1 is 
preferable because:
- Some frameworks want to express containers in physical resources (this is 
consistent with how YARN handles memory)
- You can support jobs that don't want a full core via fractional cores (or 
slightly over-committing cores)
- You can support heterogeneous cores via attributes (I want equivalent 
containers)
- vcores are optional anyway (only used in DRF) and therefore only need to be 
expressed if you care about physical cores because you need to reserve them or 
say you want a fraction of one

Either way I think vcore is the wrong name, because in #1 
1vcore=1pcore so there's no virtualization and in #2 1 vcore is not a 
virtualization of a core (10 vcores does not give me 10 levels of parallelism), 
it's _just a unit_ (like an ECU).


[jira] [Commented] (YARN-1024) Define a virtual core unambigiously

2013-08-04 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/YARN-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13728952#comment-13728952
 ] 

Arun C Murthy commented on YARN-1024:
-

We need to push on YARN-160 and normalize to YARN Virtual Core (YVC) or ECU 
itself.

 Define a virtual core unambigiously
 ---

 Key: YARN-1024
 URL: https://issues.apache.org/jira/browse/YARN-1024
 Project: Hadoop YARN
  Issue Type: Improvement
Reporter: Arun C Murthy
Assignee: Arun C Murthy

 We need to clearly define the meaning of a virtual core unambiguously so that 
 it's easy to migrate applications between clusters.
 For e.g. here is Amazon EC2 definition of ECU: 
 http://aws.amazon.com/ec2/faqs/#What_is_an_EC2_Compute_Unit_and_why_did_you_introduce_it
 We can use ECU itself: *One EC2 Compute Unit provides the equivalent CPU 
 capacity of a 1.0-1.2 GHz 2007 Opteron or 2007 Xeon processor.*
