[jira] [Commented] (MESOS-2744) MasterAuthorizationTest.SlaveRemoved is flaky

2015-05-18 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548067#comment-14548067
 ] 

haosdent commented on MESOS-2744:
-

[~lackita] I filed this ticket from the user email list. The user's operating 
system is Linux kopernikus-u 3.13.0-52-generic #86-Ubuntu SMP x86_64 GNU/Linux. 
You can find more details in that email. I don't have Ubuntu, and I could not 
reproduce this issue on CentOS.

 MasterAuthorizationTest.SlaveRemoved is flaky
 -

 Key: MESOS-2744
 URL: https://issues.apache.org/jira/browse/MESOS-2744
 Project: Mesos
  Issue Type: Bug
Reporter: haosdent
  Labels: flaky, flaky-test

 See (1) and (2), just executed in that order.
 From a black-box point of view, the results make no sense to me at all. My 
 two cents/theory: the tests themselves (i.e. the frameworks they use) seem 
 to affect each other.
 I will file an issue in your JIRA. Please provide info for accessing/handling 
 your JIRA, e.g. is this email as a description enough information for your 
 investigation?
 (1)
 joma@kopernikus-u:~/dev/programme/mesos/build/mesos/build$ make check 
 GTEST_FILTER=MasterAuthorizationTest.SlaveRemoved GTEST_REPEAT=1000 
 GTEST_BREAK_ON_FAILURE=1
 ...
 Repeating all tests (iteration 1000) . . .
 Note: Google Test filter = 
 MasterAuthorizationTest.SlaveRemoved-DockerContainerizerTest.ROOT_DOCKER_Launch_Executor:DockerContainerizerTest.ROOT_DOCKER_Launch_Executor_Bridged:DockerContainerizerTest.ROOT_DOCKER_Launch:DockerContainerizerTest.ROOT_DOCKER_Kill:DockerContainerizerTest.ROOT_DOCKER_Usage:DockerContainerizerTest.ROOT_DOCKER_Update:DockerContainerizerTest.DISABLED_ROOT_DOCKER_Recover:DockerContainerizerTest.ROOT_DOCKER_SkipRecoverNonDocker:DockerContainerizerTest.ROOT_DOCKER_Logs:DockerContainerizerTest.ROOT_DOCKER_Default_CMD:DockerContainerizerTest.ROOT_DOCKER_Default_CMD_Override:DockerContainerizerTest.ROOT_DOCKER_Default_CMD_Args:DockerContainerizerTest.ROOT_DOCKER_SlaveRecoveryTaskContainer:DockerContainerizerTest.DISABLED_ROOT_DOCKER_SlaveRecoveryExecutorContainer:DockerContainerizerTest.ROOT_DOCKER_PortMapping:DockerContainerizerTest.ROOT_DOCKER_LaunchSandboxWithColon:DockerContainerizerTest.ROOT_DOCKER_DestroyWhileFetching:DockerContainerizerTest.ROOT_DOCKER_DestroyWhilePulling:DockerTest.ROOT_DOCKER_interface:DockerTest.ROOT_DOCKER_CheckCommandWithShell:DockerTest.ROOT_DOCKER_CheckPortResource:DockerTest.ROOT_DOCKER_CancelPull:CpuIsolatorTest/1.UserCpuUsage:CpuIsolatorTest/1.SystemCpuUsage:LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs:LimitedCpuIsolatorTest.ROOT_CGROUPS_Cfs_Big_Quota:MemIsolatorTest/0.MemUsage:MemIsolatorTest/1.MemUsage:MemIsolatorTest/2.MemUsage:PerfEventIsolatorTest.ROOT_CGROUPS_Sample:SharedFilesystemIsolatorTest.ROOT_RelativeVolume:SharedFilesystemIsolatorTest.ROOT_AbsoluteVolume:NamespacesPidIsolatorTest.ROOT_PidNamespace:UserCgroupIsolatorTest/0.ROOT_CGROUPS_UserCgroup:UserCgroupIsolatorTest/1.ROOT_CGROUPS_UserCgroup:UserCgroupIsolatorTest/2.ROOT_CGROUPS_UserCgroup:MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PerfRollForward:MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PidNamespaceForward:MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PidNamespaceBackward:SlaveTest.ROOT_RunTaskWithCommandInfoWithoutUser:SlaveTest.DISABLED_ROOT_RunTaskWithCommandInfoWithUser:ContainerizerTest.ROOT_CGROUPS_BalloonFramework:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Enabled:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Subsystems:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Mounted:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Get:CgroupsAnyHierarchyTest.ROOT_CGROUPS_NestedCgroups:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Tasks:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Read:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Write:CgroupsAnyHierarchyTest.ROOT_CGROUPS_Cfs_Big_Quota:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Busy:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_SubsystemsHierarchy:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FindCgroupSubsystems:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_MountedSubsystems:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_CreateRemove:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_Listen:CgroupsAnyHierarchyWithCpuMemoryTest.ROOT_CGROUPS_FreezeNonFreezer:CgroupsNoHierarchyTest.ROOT_CGROUPS_NOHIERARCHY_M
 

[jira] [Commented] (MESOS-2637) Consolidate 'foo', 'bar', ... string constants in test and example code

2015-05-18 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548158#comment-14548158
 ] 

Niklas Quarfot Nielsen commented on MESOS-2637:
---

Exactly :)

 Consolidate 'foo', 'bar', ... string constants in test and example code
 ---

 Key: MESOS-2637
 URL: https://issues.apache.org/jira/browse/MESOS-2637
 Project: Mesos
  Issue Type: Bug
  Components: technical debt
Reporter: Niklas Quarfot Nielsen
Assignee: Colin Williams

 We are using 'foo', 'bar', ... string constants and pairs in 
 src/tests/master_tests.cpp, src/tests/slave_tests.cpp, 
 src/tests/hook_tests.cpp and src/examples/test_hook_module.cpp for label and 
 hooks tests. We should consolidate them to make the call sites less prone to 
 forgetting to update all call sites.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2637) Consolidate 'foo', 'bar', ... string constants in test and example code

2015-05-18 Thread Colin Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547960#comment-14547960
 ] 

Colin Williams commented on MESOS-2637:
---

When you say that they should be consolidated, are you referring to extracting 
the value into a string in the same test or pulling all of the duplicate label 
creation/checking into a function?



[jira] [Assigned] (MESOS-2637) Consolidate 'foo', 'bar', ... string constants in test and example code

2015-05-18 Thread Colin Williams (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Williams reassigned MESOS-2637:
-

Assignee: Colin Williams



[jira] [Commented] (MESOS-2744) MasterAuthorizationTest.SlaveRemoved is flaky

2015-05-18 Thread Colin Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547942#comment-14547942
 ] 

Colin Williams commented on MESOS-2744:
---

I've run this test a couple thousand times and haven't been able to replicate 
the issue. Can you provide any information about what environment you're 
running this in?


[jira] [Updated] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-18 Thread Ian Downes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Downes updated MESOS-2652:
--
Sprint: Twitter Q2 Sprint 3 - 5/11

 Update Mesos containerizer to understand revocable cpu resources
 

 Key: MESOS-2652
 URL: https://issues.apache.org/jira/browse/MESOS-2652
 Project: Mesos
  Issue Type: Task
Reporter: Vinod Kone
Assignee: Ian Downes
  Labels: twitter

 The CPU isolator needs to properly set limits for revocable and non-revocable 
 containers.
 The proposed strategy is to use a two-way split of the cpu cgroup hierarchy 
 -- normal (non-revocable) and low priority (revocable) subtrees -- and to use 
 a biased split of CFS cpu.shares across the subtrees, e.g., a 20:1 split 
 (TBD). Containers would be present in only one of the subtrees. CFS quotas 
 will *not* be set on subtree roots, only cpu.shares. Each container would set 
 CFS quota and shares as done currently.





[jira] [Updated] (MESOS-2633) Move implementations of Framework struct functions out of master.hpp

2015-05-18 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2633:
---
Assignee: (was: Marco Massenzio)

 Move implementations of Framework struct functions out of master.hpp
 

 Key: MESOS-2633
 URL: https://issues.apache.org/jira/browse/MESOS-2633
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Joris Van Remoortere
Priority: Trivial
  Labels: beginner, master, tech-debt, trivial

 To help reduce compile time and keep the header easy to read, let's move the 
 implementations of the Framework struct functions out of master.hpp





[jira] [Commented] (MESOS-2743) Include ExecutorInfos in master/state.json

2015-05-18 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2743?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548299#comment-14548299
 ] 

haosdent commented on MESOS-2743:
-

Patch: https://reviews.apache.org/r/34362/

[~adam-mesos] I added the executors information to the Framework model, just 
like slave/http.cpp does. Or should I add it under other nodes?

 Include ExecutorInfos in master/state.json
 --

 Key: MESOS-2743
 URL: https://issues.apache.org/jira/browse/MESOS-2743
 Project: Mesos
  Issue Type: Improvement
  Components: json api
Reporter: Adam B
Assignee: haosdent
  Labels: mesosphere

 The slave/state.json already reports executorInfos:
 https://github.com/apache/mesos/blob/0.22.1/src/slave/http.cpp#L215-219
 Would be great to see this in the master/state.json as well, so external 
 tools don't have to query each slave to find out executor resources, sandbox 
 directories, etc.





[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-18 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548394#comment-14548394
 ] 

Timothy Chen commented on MESOS-2652:
-

Just chatted with Ian offline. In the future we should consider letting 
frameworks express some priority; even tasks using non-revocable resources 
could then be put at low priority as well. That would be a nice balance, since 
I think splitting only on [non-]revocable might be too limiting.



[jira] [Updated] (MESOS-2350) Add support for MesosContainerizerLaunch to chroot to a specified path

2015-05-18 Thread Ian Downes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Downes updated MESOS-2350:
--
Sprint: Twitter Mesos Q1 Sprint 3, Twitter Mesos Q1 Sprint 4, Twitter Mesos 
Q1 Sprint 5, Twitter Mesos Q1 Sprint 6, Twitter Q2 Sprint 1 - 4/13, Twitter Q2 
Sprint 2, Twitter Q2 Sprint 3 - 5/11  (was: Twitter Mesos Q1 Sprint 3, Twitter 
Mesos Q1 Sprint 4, Twitter Mesos Q1 Sprint 5, Twitter Mesos Q1 Sprint 6, 
Twitter Q2 Sprint 1 - 4/13, Twitter Q2 Sprint 2)

 Add support for MesosContainerizerLaunch to chroot to a specified path
 --

 Key: MESOS-2350
 URL: https://issues.apache.org/jira/browse/MESOS-2350
 Project: Mesos
  Issue Type: Improvement
  Components: isolation
Affects Versions: 0.21.1, 0.22.0
Reporter: Ian Downes
Assignee: Ian Downes
  Labels: twitter

 In preparation for the MesosContainerizer to support a filesystem isolator 
 the MesosContainerizerLauncher must support chrooting. Optionally, it should 
 also configure the chroot environment by (re-)mounting special filesystems 
 such as /proc and /sys and making device nodes such as /dev/zero, etc., such 
 that the chroot environment is functional.





[jira] [Updated] (MESOS-2540) mesos containerizer should provide scheduler specified rootfs

2015-05-18 Thread Ian Downes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Downes updated MESOS-2540:
--
Sprint: Twitter Q2 Sprint 3 - 5/11

 mesos containerizer should provide scheduler specified rootfs
 -

 Key: MESOS-2540
 URL: https://issues.apache.org/jira/browse/MESOS-2540
 Project: Mesos
  Issue Type: Story
  Components: containerization
Reporter: Jay Buffington
Assignee: Ian Downes

 The mesos containerizer already supports cgroups and namespaces.  MESOS-2350 
 is being actively worked on now to allow for an operator to specify a fixed 
 rootfs to chroot into.
 Let’s extend these features and provide the ability for a scheduler to 
 specify the rootfs.  Schedulers should be able to specify a ContainerInfo[1] 
 that includes type = mesos and an image URI.  The mesos containerizer should 
 fetch that rootfs using the mesos-fetcher then chroot into it before starting 
 the task.
 [1] 
 https://github.com/apache/mesos/blob/7bdb559/include/mesos/mesos.proto#L992





[jira] [Assigned] (MESOS-2540) mesos containerizer should provide scheduler specified rootfs

2015-05-18 Thread Ian Downes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Downes reassigned MESOS-2540:
-

Assignee: Ian Downes



[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-18 Thread Ian Downes (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548302#comment-14548302
 ] 

Ian Downes commented on MESOS-2652:
---

CFS bandwidth quota provides an upper bound on CPU time for a task. If the 
non-revocable workload is variable then we can increase utilization by removing 
that bound for revocable CPU, given that we immediately preempt for 
non-revocable. Then we just use cpu.shares to balance between the revocable 
tasks.



[jira] [Updated] (MESOS-2729) Update DRF sorter to not explicitly keep track of total resources

2015-05-18 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-2729:
--
Sprint: Twitter Q2 Sprint 3 - 5/11
  Assignee: Vinod Kone
Issue Type: Improvement  (was: Bug)

 Update DRF sorter to not explicitly keep track of total resources
 -

 Key: MESOS-2729
 URL: https://issues.apache.org/jira/browse/MESOS-2729
 Project: Mesos
  Issue Type: Improvement
Reporter: Vinod Kone
Assignee: Vinod Kone
  Labels: twitter

 DRF sorter currently keeps track of allocated resources and total resources. 
 This becomes confusing with oversubscribed resources because the total 
 allocated resources might be greater than total resources on the slave.
 The plan is to get rid of the total resources tracking in DRF sorter because 
 it is not strictly necessary. The share of each client can still be 
 calculated by doing the ratio of allocation of a client to the total 
 allocations.





[jira] [Commented] (MESOS-2652) Update Mesos containerizer to understand revocable cpu resources

2015-05-18 Thread Timothy Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548342#comment-14548342
 ] 

Timothy Chen commented on MESOS-2652:
-

I see, and you also set SCHED_IDLE on the revocable tasks, right?
I was just wondering whether SCHED_IDLE becomes a limiting factor: any other 
SCHED_OTHER task, even one that is not more important, can easily overwhelm the 
tasks running on oversubscribed resources, since there isn't a way to express 
task priorities when we launch anything.



[jira] [Commented] (MESOS-328) HTTP headers should be considered case-insensitive.

2015-05-18 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548235#comment-14548235
 ] 

haosdent commented on MESOS-328:


Patch: 
https://reviews.apache.org/r/33792/ (Extend hashmap to support custom equality 
and hash)
https://reviews.apache.org/r/34068/ (The test case of extend hashmap to support 
custom equality and hash)
https://reviews.apache.org/r/33793/ (HTTP headers should be considered 
case-insensitive.)

Ping [~bmahler] 

 HTTP headers should be considered case-insensitive.
 ---

 Key: MESOS-328
 URL: https://issues.apache.org/jira/browse/MESOS-328
 Project: Mesos
  Issue Type: Bug
  Components: libprocess
Reporter: Benjamin Mahler
Assignee: haosdent
Priority: Minor
  Labels: twitter

 I found this when writing some tests for the decoder in libprocess.
 Message header names should be case-insensitive:
 http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
 Creating this issue to track it, I'm going to add some TODOs for now.
 Most clients tend to use Camel-Case for the headers so this is not urgent.





[jira] [Commented] (MESOS-2637) Consolidate 'foo', 'bar', ... string constants in test and example code

2015-05-18 Thread Colin Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548287#comment-14548287
 ] 

Colin Williams commented on MESOS-2637:
---

Alright, I've put a change up for review (https://reviews.apache.org/r/34361/) 
representing what I think is wanted from this issue; let me know if I should 
change anything.



[jira] [Commented] (MESOS-2637) Consolidate 'foo', 'bar', ... string constants in test and example code

2015-05-18 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548250#comment-14548250
 ] 

haosdent commented on MESOS-2637:
-

I also see a lot of xxx1 xxx2 in test cases. LoL



[jira] [Commented] (MESOS-2744) MasterAuthorizationTest.SlaveRemoved is flaky

2015-05-18 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548269#comment-14548269
 ] 

haosdent commented on MESOS-2744:
-

[~lackita] Thank you for checking on Ubuntu. Let me send an email to the user 
mailing list and confirm again.


[jira] [Assigned] (MESOS-2596) Update allocator docs

2015-05-18 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov reassigned MESOS-2596:
--

Assignee: Alexander Rukletsov

 Update allocator docs
 -

 Key: MESOS-2596
 URL: https://issues.apache.org/jira/browse/MESOS-2596
 Project: Mesos
  Issue Type: Task
  Components: allocation, documentation, modules
Reporter: Alexander Rukletsov
Assignee: Alexander Rukletsov
  Labels: mesosphere

 Once the Allocator interface changes, so does the way new allocators are 
 written. This should be reflected in the Mesos docs: the modules doc should 
 explain how to write and use allocator modules, and the configuration doc 
 should mention the new {{--allocator}} flag.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2744) MasterAuthorizationTest.SlaveRemoved is flaky

2015-05-18 Thread Colin Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548241#comment-14548241
 ] 

Colin Williams commented on MESOS-2744:
---

I'm on 3.13.0-35-generic; maybe something changed between those two kernels? 
Does anybody else have any ideas?

 MasterAuthorizationTest.SlaveRemoved is flaky
 -

 Key: MESOS-2744
 URL: https://issues.apache.org/jira/browse/MESOS-2744
 Project: Mesos
  Issue Type: Bug
Reporter: haosdent
  Labels: flaky, flaky-test

 

[jira] [Resolved] (MESOS-2702) Compare split/flattened cgroup hierarchy for CPU oversubscription

2015-05-18 Thread Ian Downes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Downes resolved MESOS-2702.
---
Resolution: Won't Fix

Changed approach to use SCHED_IDLE scheduler policy for revocable cpu.

 Compare split/flattened cgroup hierarchy for CPU oversubscription
 -

 Key: MESOS-2702
 URL: https://issues.apache.org/jira/browse/MESOS-2702
 Project: Mesos
  Issue Type: Task
  Components: isolation
Reporter: Ian Downes
  Labels: twitter

 Investigate if a flat hierarchy is sufficient for oversubscription of CPU or 
 if a two-way split is necessary/preferred.





[jira] [Commented] (MESOS-2633) Move implementations of Framework struct functions out of master.hpp

2015-05-18 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548291#comment-14548291
 ] 

Marco Massenzio commented on MESOS-2633:


This was eventually suggested as the best way forward:
{quote}
Per the offline discussion, how about we create a master/framework.hpp (and 
master/slave.hpp later), much like we did for master/metrics.hpp? Having 
definitions in master.hpp that are defined in framework.cpp is a bit 
unintuitive (I've seen a number of people get confused about this approach in 
master/http.cpp).

Note that originally a master/metrics.cpp file was added on the assumption that 
it would speed up build times, which likely didn't hold. Since you didn't find 
a compile time decrease from the current approach, I'd suggest just keeping all 
the code together in a master/framework.hpp header. Note also that this lets 
you forward declare 'Framework'.
{quote}
The original review has been discarded and a new one will be created.

 Move implementations of Framework struct functions out of master.hpp
 

 Key: MESOS-2633
 URL: https://issues.apache.org/jira/browse/MESOS-2633
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Joris Van Remoortere
Assignee: Marco Massenzio
Priority: Trivial
  Labels: beginner, master, tech-debt, trivial

 To help reduce compile time and keep the header easy to read, let's move the 
 implementations of the Framework struct functions out of master.hpp





[jira] [Comment Edited] (MESOS-2633) Move implementations of Framework struct functions out of master.hpp

2015-05-18 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548291#comment-14548291
 ] 

Marco Massenzio edited comment on MESOS-2633 at 5/18/15 5:03 PM:
-

This was eventually suggested as the best way forward:
{quote}
Per the offline discussion, how about we create a master/framework.hpp (and 
master/slave.hpp later), much like we did for master/metrics.hpp? Having 
definitions in master.hpp that are defined in framework.cpp is a bit 
unintuitive (I've seen a number of people get confused about this approach in 
master/http.cpp).

Note that originally a master/metrics.cpp file was added on the assumption that 
it would speed up build times, which likely didn't hold. Since you didn't find 
a compile time decrease from the current approach, I'd suggest just keeping all 
the code together in a master/framework.hpp header. Note also that this lets 
you forward declare 'Framework'.
{quote}
The original review has been discarded and a new one will be created.


was (Author: marco-mesos):
This was eventually suggested the best way forward:
{quote}
Per the offline discussion, how about we create a master/framework.hpp (and 
master/slave.hpp later), much like we did for master/metrics.hpp? Having 
definitions in master.hpp that are defined in framework.cpp is a bit 
unintuitive (I've seen a number of people get confused about this approach in 
master/http.cpp).

Note that originally a master/metrics.cpp file was added on the assumption that 
it would speed up build times, which likely didn't hold. Since you didn't find 
a compile time decrease from the current approach, I'd suggest just keeping all 
the code together in a master/framework.hpp header. Note also that this lets 
you forward declare 'Framework'.
{quote}
The original review has been discarded and a new one will be created.

 Move implementations of Framework struct functions out of master.hpp
 

 Key: MESOS-2633
 URL: https://issues.apache.org/jira/browse/MESOS-2633
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Joris Van Remoortere
Assignee: Marco Massenzio
Priority: Trivial
  Labels: beginner, master, tech-debt, trivial

 To help reduce compile time and keep the header easy to read, let's move the 
 implementations of the Framework struct functions out of master.hpp





[jira] [Resolved] (MESOS-2700) Determine CFS behavior with biased cpu.shares subtrees

2015-05-18 Thread Ian Downes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Downes resolved MESOS-2700.
---
Resolution: Won't Fix

Changed approach to use SCHED_IDLE scheduler policy for revocable cpu.

 Determine CFS behavior with biased cpu.shares subtrees
 --

 Key: MESOS-2700
 URL: https://issues.apache.org/jira/browse/MESOS-2700
 Project: Mesos
  Issue Type: Task
  Components: isolation
Affects Versions: 0.22.0
Reporter: Ian Downes
  Labels: twitter

 See this [ticket|https://issues.apache.org/jira/browse/MESOS-2652] for 
 context.
 * Understand the relationship between cpu.shares and CFS quota.
 * Determine range of possible bias splits
 * Determine how to achieve bias, e.g., should 20:1 be 20480:1024 or ~1024:50
 * Rigorous testing of behavior under varying loads, particularly the 
 combination of latency-sensitive loads for highly biased (non-revocable) 
 tasks and cpu-intensive loads for low-biased (revocable) tasks.
 * Discover any performance edge cases.





[jira] [Resolved] (MESOS-2701) Implement bi-level cpu.shares subtrees in cgroups/cpu isolator.

2015-05-18 Thread Ian Downes (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ian Downes resolved MESOS-2701.
---
Resolution: Won't Fix

Changed approach to use SCHED_IDLE scheduler policy for revocable cpu.

 Implement bi-level cpu.shares subtrees in cgroups/cpu isolator.
 ---

 Key: MESOS-2701
 URL: https://issues.apache.org/jira/browse/MESOS-2701
 Project: Mesos
  Issue Type: Task
  Components: isolation
Affects Versions: 0.22.0
Reporter: Ian Downes
  Labels: twitter

 See this [ticket|https://issues.apache.org/jira/browse/MESOS-2652] for 
 context.
 # Configurable bias
 # Change cgroup layout
 ** Implement roll-forward migration path in isolator recover
 ** Document roll-back migration path





[jira] [Commented] (MESOS-2637) Consolidate 'foo', 'bar', ... string constants in test and example code

2015-05-18 Thread Colin Williams (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548240#comment-14548240
 ] 

Colin Williams commented on MESOS-2637:
---

I'm now very confused: which one did you mean?

 Consolidate 'foo', 'bar', ... string constants in test and example code
 ---

 Key: MESOS-2637
 URL: https://issues.apache.org/jira/browse/MESOS-2637
 Project: Mesos
  Issue Type: Bug
  Components: technical debt
Reporter: Niklas Quarfot Nielsen
Assignee: Colin Williams

 We are using 'foo', 'bar', ... string constants and pairs in 
 src/tests/master_tests.cpp, src/tests/slave_tests.cpp, 
 src/tests/hook_tests.cpp and src/examples/test_hook_module.cpp for label and 
 hook tests. We should consolidate them so that it is harder to forget to 
 update all call sites.





[jira] [Comment Edited] (MESOS-2744) MasterAuthorizationTest.SlaveRemoved is flaky

2015-05-18 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548067#comment-14548067
 ] 

haosdent edited comment on MESOS-2744 at 5/18/15 4:46 PM:
--

[~lackita] I filed this ticket from the user mailing list. The user's 
operating system is Linux kopernikus-u 3.13.0-52-generic #86-Ubuntu SMP x86_64 
GNU/Linux. You can find further details in this email: 
http://search-hadoop.com/m/0Vlr6anAdW1kgvuT. I don't have Ubuntu, and I could 
not reproduce this issue on CentOS.


was (Author: haosd...@gmail.com):
[~lackita] I fill this ticket from user email list. The operation system of 
user is Linux kopernikus-u 3.13.0-52-generic #86-Ubuntu SMP x86_64 GNU/Linux. 
You could find from details from this email. And I don't have ubuntu, I could 
not reproduce this issue in CentOS.

 MasterAuthorizationTest.SlaveRemoved is flaky
 -

 Key: MESOS-2744
 URL: https://issues.apache.org/jira/browse/MESOS-2744
 Project: Mesos
  Issue Type: Bug
Reporter: haosdent
  Labels: flaky, flaky-test

 

[jira] [Updated] (MESOS-1303) ExamplesTest.{TestFramework, NoExecutorFramework} flaky

2015-05-18 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff updated MESOS-1303:
--
Shepherd: Vinod Kone

 ExamplesTest.{TestFramework, NoExecutorFramework} flaky
 ---

 Key: MESOS-1303
 URL: https://issues.apache.org/jira/browse/MESOS-1303
 Project: Mesos
  Issue Type: Bug
  Components: test
Reporter: Ian Downes
Assignee: Till Toenshoff
  Labels: flaky

 I'm having trouble reproducing this but I did observe it once on my OSX 
 system:
 {noformat}
 [==] Running 2 tests from 1 test case.
 [--] Global test environment set-up.
 [--] 2 tests from ExamplesTest
 [ RUN  ] ExamplesTest.TestFramework
 ../../src/tests/script.cpp:81: Failure
 Failed
 test_framework_test.sh terminated with signal 'Abort trap: 6'
 [  FAILED  ] ExamplesTest.TestFramework (953 ms)
 [ RUN  ] ExamplesTest.NoExecutorFramework
 [   OK ] ExamplesTest.NoExecutorFramework (10162 ms)
 [--] 2 tests from ExamplesTest (5 ms total)
 [--] Global test environment tear-down
 [==] 2 tests from 1 test case ran. (11121 ms total)
 [  PASSED  ] 1 test.
 [  FAILED  ] 1 test, listed below:
 [  FAILED  ] ExamplesTest.TestFramework
 {noformat}
 when investigating a failed make check for https://reviews.apache.org/r/20971/
 {noformat}
 [--] 6 tests from ExamplesTest
 [ RUN  ] ExamplesTest.TestFramework
 [   OK ] ExamplesTest.TestFramework (8643 ms)
 [ RUN  ] ExamplesTest.NoExecutorFramework
 tests/script.cpp:81: Failure
 Failed
 no_executor_framework_test.sh terminated with signal 'Aborted'
 [  FAILED  ] ExamplesTest.NoExecutorFramework (7220 ms)
 [ RUN  ] ExamplesTest.JavaFramework
 [   OK ] ExamplesTest.JavaFramework (11181 ms)
 [ RUN  ] ExamplesTest.JavaException
 [   OK ] ExamplesTest.JavaException (5624 ms)
 [ RUN  ] ExamplesTest.JavaLog
 [   OK ] ExamplesTest.JavaLog (6472 ms)
 [ RUN  ] ExamplesTest.PythonFramework
 [   OK ] ExamplesTest.PythonFramework (14467 ms)
 [--] 6 tests from ExamplesTest (53607 ms total)
 {noformat}





[jira] [Updated] (MESOS-2709) Design Master discovery functionality for HTTP-only clients

2015-05-18 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2709:
---
Sprint:   (was: Mesosphere Sprint 10 - 5/25)

 Design Master discovery functionality for HTTP-only clients
 ---

 Key: MESOS-2709
 URL: https://issues.apache.org/jira/browse/MESOS-2709
 Project: Mesos
  Issue Type: Improvement
  Components: java api
Reporter: Marco Massenzio
Assignee: Marco Massenzio

 When building clients that do not bind to {{libmesos}} and only use the HTTP 
 API (via pure language bindings, e.g., Java-only), there is no simple way to 
 discover the Master's IP address to connect to.
 Rather than relying on 'out-of-band' configuration mechanisms, we would like 
 to be able to interrogate the ZooKeeper ensemble to discover the Master's IP 
 address (and, possibly, other information) to which HTTP API requests can be 
 addressed.





[jira] [Updated] (MESOS-2709) Design Master discovery functionality for HTTP-only clients

2015-05-18 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2709:
---
Sprint: Mesosphere Sprint 10 - 5/25

 Design Master discovery functionality for HTTP-only clients
 ---

 Key: MESOS-2709
 URL: https://issues.apache.org/jira/browse/MESOS-2709
 Project: Mesos
  Issue Type: Improvement
  Components: java api
Reporter: Marco Massenzio
Assignee: Marco Massenzio






[jira] [Updated] (MESOS-2709) Design Master discovery functionality for HTTP-only clients

2015-05-18 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-2709:
---
Story Points: 3

 Design Master discovery functionality for HTTP-only clients
 ---

 Key: MESOS-2709
 URL: https://issues.apache.org/jira/browse/MESOS-2709
 Project: Mesos
  Issue Type: Improvement
  Components: java api
Reporter: Marco Massenzio
Assignee: Marco Massenzio






[jira] [Assigned] (MESOS-2716) Add non-const reference version of Option&lt;T&gt;::get.

2015-05-18 Thread Mark Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mark Wang reassigned MESOS-2716:


Assignee: Mark Wang

 Add non-const reference version of Option&lt;T&gt;::get.
 --

 Key: MESOS-2716
 URL: https://issues.apache.org/jira/browse/MESOS-2716
 Project: Mesos
  Issue Type: Improvement
  Components: stout
Reporter: Benjamin Mahler
Assignee: Mark Wang
  Labels: newbie

 Currently Option only provides a const reference to the underlying object:
 {code}
 template <typename T>
 class Option
 {
   ...
   const T& get() const;
   ...
 };
 {code}
 Since we use Option as a replacement for NULL, we often have optional 
 variables that we need to perform non-const operations on. However, this 
 requires taking a copy:
 {code}
 static void cleanup(const Response& response)
 {
   if (response.type == Response::PIPE) {
 CHECK_SOME(response.reader);
 http::Pipe::Reader reader = response.reader.get(); // Remove const.
 reader.close();
   }
 }
 {code}
 Taking a copy is hacky, but works for shared objects and some other copyable 
 objects. Since Option represents a mutable variable, it makes sense to add 
 non-const reference access to the underlying value:
 {code}
 template <typename T>
 class Option
 {
   ...
   const T& get() const;
   T& get();
   ...
 };
 {code}





[jira] [Created] (MESOS-2746) As a Framework User I want to be able to discover my Task's IP

2015-05-18 Thread Marco Massenzio (JIRA)
Marco Massenzio created MESOS-2746:
--

 Summary: As a Framework User I want to be able to discover my 
Task's IP
 Key: MESOS-2746
 URL: https://issues.apache.org/jira/browse/MESOS-2746
 Project: Mesos
  Issue Type: Story
Affects Versions: 0.22.1
Reporter: Marco Massenzio
Assignee: Joris Van Remoortere


The information exposed by the Framework via {{WebUIUrl}} does not always 
resolve to a routable endpoint (e.g., when the {{hostname}} is not publicly 
resolvable, or not resolvable at all).

In order to facilitate service discovery (e.g., via the Marathon UI), we want 
to add this information to {{FrameworksPid}} via the {{/state-summary}} endpoint.





[jira] [Commented] (MESOS-2735) Change the interaction between the slave and the resource estimator from polling to pushing

2015-05-18 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14548606#comment-14548606
 ] 

Jie Yu commented on MESOS-2735:
---

Sorry, I just noticed this reply. I have already committed the patch, but we 
can certainly revert it if you have strong objections.

 If the estimator never updates the last estimate in the slave, is the same 
 effect - no?

Not the same effect. In the push model the slave won't be blocked, meaning 
that it will still be able to process all messages (e.g., runTask). In the 
polling model, a bad resource estimator can block the slave's event queue.

 Is the problem, that the current design doesn't support the multiple firing 
 problem, where the estimator updates while the callback is being executed?

Could you please elaborate on the multiple-firing problem? I am curious which 
example you have in mind that makes you think a push model is harder to use 
than the polling model.

 Change the interaction between the slave and the resource estimator from 
 polling to pushing 
 

 Key: MESOS-2735
 URL: https://issues.apache.org/jira/browse/MESOS-2735
 Project: Mesos
  Issue Type: Bug
Reporter: Jie Yu
Assignee: Jie Yu
  Labels: twitter

 This will make the semantics clearer: the resource estimator can control the 
 rate at which resource estimates are sent to the slave.
 To avoid a cyclic dependency, the slave will register a callback with the 
 resource estimator, and the resource estimator will simply invoke that 
 callback when there's a new estimate ready. The callback will be a defer to 
 the slave's main event queue.





[jira] [Commented] (MESOS-809) External control of the ip that Mesos components publish to zookeeper

2015-05-18 Thread Bjoern Metzdorf (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549089#comment-14549089
 ] 

Bjoern Metzdorf commented on MESOS-809:
---

Hi,

does the patch look good?

 External control of the ip that Mesos components publish to zookeeper
 -

 Key: MESOS-809
 URL: https://issues.apache.org/jira/browse/MESOS-809
 Project: Mesos
  Issue Type: Improvement
  Components: framework, master, slave
Affects Versions: 0.14.2
Reporter: Khalid Goudeaux
Assignee: Anindya Sinha
Priority: Minor

 With tools like Docker making containers more manageable, it's tempting to 
 use containers for all software installation. The CoreOS project is an 
 example of this.
 When an application is run inside a container it sees a different ip/hostname 
 from the host system running the container. That ip is only valid from inside 
 that host, no other machine can see it.
 From inside a container, the Mesos master and slave publish that private ip 
 to zookeeper and as a result they can't find each other if they're on 
 different machines. The --ip option can't help because the public ip isn't 
 available for binding from within a container.
 Essentially, from inside the container, mesos processes don't know the ip 
 they're available at (they may not know the port either).
 It would be nice to bootstrap the processes with the correct ip for them to 
 publish to zookeeper.





[jira] [Commented] (MESOS-2587) libprocess should allow configuration of ip/port separate from the ones it binds to

2015-05-18 Thread Bjoern Metzdorf (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549090#comment-14549090
 ] 

Bjoern Metzdorf commented on MESOS-2587:


[~nnielsen] There's a patch for MESOS-809 now.

 libprocess should allow configuration of ip/port separate from the ones it 
 binds to
 ---

 Key: MESOS-2587
 URL: https://issues.apache.org/jira/browse/MESOS-2587
 Project: Mesos
  Issue Type: Bug
  Components: libprocess
Reporter: Cosmin Lehene

 Currently libprocess will advertise {{LIBPROCESS_IP}}:{{LIBPROCESS_PORT}}, but 
 if a framework runs in a container without an interface that has a publicly 
 accessible IP (e.g., a container in bridge mode), it will advertise an IP 
 that will not be reachable by the master.
 With this, we could advertise the external IP of the bridge (reachable from 
 the master) from within a container. 
 This should allow frameworks running in containers to work in the safer 
 bridged mode.





[jira] [Commented] (MESOS-809) External control of the ip that Mesos components publish to zookeeper

2015-05-18 Thread Anindya Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549100#comment-14549100
 ] 

Anindya Sinha commented on MESOS-809:
-

There were a couple of minor comments. I will push out an update based on them 
by EOD today.

 External control of the ip that Mesos components publish to zookeeper
 -

 Key: MESOS-809
 URL: https://issues.apache.org/jira/browse/MESOS-809
 Project: Mesos
  Issue Type: Improvement
  Components: framework, master, slave
Affects Versions: 0.14.2
Reporter: Khalid Goudeaux
Assignee: Anindya Sinha
Priority: Minor






[jira] [Commented] (MESOS-2340) Publish JSON in ZK instead of serialized MasterInfo

2015-05-18 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549118#comment-14549118
 ] 

Marco Massenzio commented on MESOS-2340:


The [design 
doc](https://docs.google.com/document/d/1i2pWJaIjnFYhuR-000NG-AC1rFKKrRh3Wn47Y2G6lRE/edit#)
 has almost been finalized: it outlines the current chosen strategy.

 Publish JSON in ZK instead of serialized MasterInfo
 ---

 Key: MESOS-2340
 URL: https://issues.apache.org/jira/browse/MESOS-2340
 Project: Mesos
  Issue Type: Improvement
Reporter: Zameer Manji
Assignee: haosdent

 Currently to discover the master a client needs the ZK node location and 
 access to the MasterInfo protobuf so it can deserialize the binary blob in 
 the node.
 I think it would be nice to publish JSON (like Twitter's ServerSets) so 
 clients are not tied to protobuf to do service discovery.





[jira] [Comment Edited] (MESOS-2340) Publish JSON in ZK instead of serialized MasterInfo

2015-05-18 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549118#comment-14549118
 ] 

Marco Massenzio edited comment on MESOS-2340 at 5/18/15 8:12 PM:
-

The [design 
doc|https://docs.google.com/document/d/1i2pWJaIjnFYhuR-000NG-AC1rFKKrRh3Wn47Y2G6lRE/edit#]
 has almost been finalized: it outlines the current chosen strategy.


was (Author: marco-mesos):
The [design 
doc](https://docs.google.com/document/d/1i2pWJaIjnFYhuR-000NG-AC1rFKKrRh3Wn47Y2G6lRE/edit#)
 has almost been finalized: it outlines the current chosen strategy.

 Publish JSON in ZK instead of serialized MasterInfo
 ---

 Key: MESOS-2340
 URL: https://issues.apache.org/jira/browse/MESOS-2340
 Project: Mesos
  Issue Type: Improvement
Reporter: Zameer Manji
Assignee: haosdent

 Currently to discover the master a client needs the ZK node location and 
 access to the MasterInfo protobuf so it can deserialize the binary blob in 
 the node.
 I think it would be nice to publish JSON (like Twitter's ServerSets) so 
 clients are not tied to protobuf to do service discovery.





[jira] [Resolved] (MESOS-2729) Update DRF sorter to not explicitly keep track of total resources

2015-05-18 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone resolved MESOS-2729.
---
Resolution: Won't Fix

Actually, looking at the DRF paper and the sorter tests, it is vital for the 
DRF algorithm to keep track of the total resources, not just the total 
allocated resources. This is because the dominant resource is decided based on 
the total resources on the box.

Example: 
Host with 100 cpus and 10G mem
Framework 1's allocation: 1 cpu and 1G mem 
Framework 2's allocation: 2 cpus and 1G mem

According to DRF:
Dominant share of Framework 1 is *0.1 mem*, because mem share (0.1 = 1/10) > 
cpu share (0.01 = 1/100)
Dominant share of Framework 2 is also *0.1 mem*

But if we only account for total allocated resources (3 cpus and 2G mem):
Dominant share of Framework 1 is *0.5 mem*, because mem share (0.5 = 1/2) > cpu 
share (0.3 = 1/3)
Dominant share of Framework 2 is *0.7 cpu*




 Update DRF sorter to not explicitly keep track of total resources
 -

 Key: MESOS-2729
 URL: https://issues.apache.org/jira/browse/MESOS-2729
 Project: Mesos
  Issue Type: Improvement
Reporter: Vinod Kone
Assignee: Vinod Kone
  Labels: twitter

 DRF sorter currently keeps track of allocated resources and total resources. 
 This becomes confusing with oversubscribed resources because the total 
 allocated resources might be greater than total resources on the slave.
 The plan is to get rid of the total resources tracking in DRF sorter because 
 it is not strictly necessary. The share of each client can still be 
 calculated by doing the ratio of allocation of a client to the total 
 allocations.





[jira] [Updated] (MESOS-2729) Update DRF sorter to not explicitly keep track of total resources

2015-05-18 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-2729:
--
Story Points: 1  (was: 3)

 Update DRF sorter to not explicitly keep track of total resources
 -

 Key: MESOS-2729
 URL: https://issues.apache.org/jira/browse/MESOS-2729
 Project: Mesos
  Issue Type: Improvement
Reporter: Vinod Kone
Assignee: Vinod Kone
  Labels: twitter

 DRF sorter currently keeps track of allocated resources and total resources. 
 This becomes confusing with oversubscribed resources because the total 
 allocated resources might be greater than total resources on the slave.
 The plan is to get rid of the total resources tracking in DRF sorter because 
 it is not strictly necessary. The share of each client can still be 
 calculated by doing the ratio of allocation of a client to the total 
 allocations.





[jira] [Commented] (MESOS-2746) As a Framework User I want to be able to discover my Task's IP

2015-05-18 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549217#comment-14549217
 ] 

Joris Van Remoortere commented on MESOS-2746:
-

https://reviews.apache.org/r/34371

 As a Framework User I want to be able to discover my Task's IP
 --

 Key: MESOS-2746
 URL: https://issues.apache.org/jira/browse/MESOS-2746
 Project: Mesos
  Issue Type: Story
Affects Versions: 0.22.1
Reporter: Marco Massenzio
Assignee: Joris Van Remoortere

 The information exposed by the Framework via the {{WebUIUrl}} does not always 
 resolve to a routable endpoint (e.g., when the {{hostname}} is not publicly 
 resolvable, or not resolvable at all).
 In order to facilitate service discovery (e.g., via the Marathon UI) we want 
 to add the information in {{FrameworksPid}} via the {{/state-summary}} 
 endpoint.





[jira] [Comment Edited] (MESOS-809) External control of the ip that Mesos components publish to zookeeper

2015-05-18 Thread Anindya Sinha (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-809?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549357#comment-14549357
 ] 

Anindya Sinha edited comment on MESOS-809 at 5/18/15 10:12 PM:
---

Review republished with changes:

https://reviews.apache.org/r/34128/
https://reviews.apache.org/r/34129/


was (Author: anindya.sinha):
Review published with changes:

https://reviews.apache.org/r/34128/
https://reviews.apache.org/r/34129/

 External control of the ip that Mesos components publish to zookeeper
 -

 Key: MESOS-809
 URL: https://issues.apache.org/jira/browse/MESOS-809
 Project: Mesos
  Issue Type: Improvement
  Components: framework, master, slave
Affects Versions: 0.14.2
Reporter: Khalid Goudeaux
Assignee: Anindya Sinha
Priority: Minor

 With tools like Docker making containers more manageable, it's tempting to 
 use containers for all software installation. The CoreOS project is an 
 example of this.
 When an application is run inside a container it sees a different ip/hostname 
 from the host system running the container. That ip is only valid from inside 
 that host, no other machine can see it.
 From inside a container, the Mesos master and slave publish that private ip 
 to zookeeper and as a result they can't find each other if they're on 
 different machines. The --ip option can't help because the public ip isn't 
 available for binding from within a container.
 Essentially, from inside the container, mesos processes don't know the ip 
 they're available at (they may not know the port either).
 It would be nice to bootstrap the processes with the correct ip for them to 
 publish to zookeeper.





[jira] [Created] (MESOS-2747) Add watch to the state abstraction

2015-05-18 Thread Connor Doyle (JIRA)
Connor Doyle created MESOS-2747:
---

 Summary: Add watch to the state abstraction
 Key: MESOS-2747
 URL: https://issues.apache.org/jira/browse/MESOS-2747
 Project: Mesos
  Issue Type: Wish
  Components: c++ api, java api
Reporter: Connor Doyle
Priority: Minor


Use case: Frameworks that intend to survive failover tend to implement leader 
election.  Watchable storage could be a first step towards reusable leader 
election libraries that don't depend on a particular backing store.

cc [~kozyraki]





[jira] [Commented] (MESOS-2670) Update existing lambdas to meet style guide

2015-05-18 Thread Benjamin Hindman (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549259#comment-14549259
 ] 

Benjamin Hindman commented on MESOS-2670:
-

commit b26a2e1a716fdc775c301fab15f3fa991b070867
Author: haosdent huang haosd...@gmail.com
Date:   Mon May 18 14:20:39 2015 -0700

Update some existing lambdas to meet style guide.

Review: https://reviews.apache.org/r/34018

commit d46c0d7eb1295ef4a3a2494ca2f323c067f91f45
Author: haosdent huang haosd...@gmail.com
Date:   Mon May 18 14:16:26 2015 -0700

Update some existing lambdas to meet style guide.

Review: https://reviews.apache.org/r/34017

 Update existing lambdas to meet style guide
 ---

 Key: MESOS-2670
 URL: https://issues.apache.org/jira/browse/MESOS-2670
 Project: Mesos
  Issue Type: Task
Reporter: Joris Van Remoortere
Assignee: haosdent
  Labels: c++11

 There are already some lambdas in C++11 specific files. Modify these to meet 
 the updated style guide.





[jira] [Commented] (MESOS-2340) Publish JSON in ZK instead of serialized MasterInfo

2015-05-18 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549134#comment-14549134
 ] 

Marco Massenzio commented on MESOS-2340:


The challenge here is that we write the znodes as {{ephemeral sequential}} - so 
the {{Master}} can only write one kind (currently, it uses the {{info}} label 
by default): it can't write multiple labels/formats, nor can it write multiple 
znodes and expect them (in general) to have the same sequence number:
{noformat}
info_1
json.info_1 -- it may (or may not) be from the same Master as info_1
info_2
json.info_2 -- ditto (2)
info_3
json.info_3 -- ditto (3)
{noformat}

One possible approach would be to have one (and only one) separate process 
(running, e.g., on the elected Leader) that _watches_ the given ZK _path_ and 
monitors creation/deletion of znodes; once it detects a new one (or a change to 
an existing one - is this even possible?), it would simply create an 
identically named znode (but with, e.g., a {{json}} prefix) carrying the same 
info.

Similarly for node removals. 

 Publish JSON in ZK instead of serialized MasterInfo
 ---

 Key: MESOS-2340
 URL: https://issues.apache.org/jira/browse/MESOS-2340
 Project: Mesos
  Issue Type: Improvement
Reporter: Zameer Manji
Assignee: haosdent

 Currently to discover the master a client needs the ZK node location and 
 access to the MasterInfo protobuf so it can deserialize the binary blob in 
 the node.
 I think it would be nice to publish JSON (like Twitter's ServerSets) so 
 clients are not tied to protobuf to do service discovery.





[jira] [Updated] (MESOS-2600) Add /reserve and /unreserve endpoints on the master for dynamic reservation

2015-05-18 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-2600:

Summary: Add /reserve and /unreserve endpoints on the master for dynamic 
reservation  (was: Add a /reserve endpoint on the master for dynamic 
reservation)

 Add /reserve and /unreserve endpoints on the master for dynamic reservation
 ---

 Key: MESOS-2600
 URL: https://issues.apache.org/jira/browse/MESOS-2600
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Michael Park
Assignee: Michael Park
  Labels: mesosphere

 Enable operators to manage dynamic reservations by introducing the 
 {{/reserve}} and {{/unreserve}} HTTP endpoints on the master.





[jira] [Updated] (MESOS-2600) Add a /reserve endpoint on the master for dynamic reservation

2015-05-18 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-2600:

Summary: Add a /reserve endpoint on the master for dynamic reservation  
(was: Introduce /reserve endpoint on the master)

 Add a /reserve endpoint on the master for dynamic reservation
 -

 Key: MESOS-2600
 URL: https://issues.apache.org/jira/browse/MESOS-2600
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Michael Park
Assignee: Michael Park
  Labels: mesosphere

 Enable operators to manage dynamic reservations by introducing the 
 {{/reserve}} and {{/unreserve}} HTTP endpoints on the master.





[jira] [Updated] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread Littlestar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Littlestar updated MESOS-2749:
--
Description: 
I use marathon in docker, https://github.com/mesosphere/marathon
docker build -t marathon-head .

When I run marathon in docker, it crashed.

Stack: [0x7fe1641c8000,0x7fe1642c9000],  sp=0x7fe1642c6b18,  free 
space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libc.so.6+0x7b53c]  cfree+0x1c

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  
org.apache.mesos.MesosSchedulerDriver.start()Lorg/apache/mesos/Protos$Status;+0
j  org.apache.mesos.MesosSchedulerDriver.run()Lorg/apache/mesos/Protos$Status;+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Lorg/apache/mesos/SchedulerDriver;)Lorg/apache/mesos/Protos$Status;+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
j  scala.Option.foreach(Lscala/Function1;)V+12
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply$mcV$sp()V+15
j  mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()V+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()Ljava/lang/Object;+1
j  
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1()Lscala/util/Try;+8
j  scala.concurrent.impl.Future$PromiseCompletingRunnable.run()V+5
j  
java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub


  was:
I use on marathon in docker.
https://github.com/mesosphere/marathon

docker build -t marathon-head .it crased.

Stack: [0x7fe1641c8000,0x7fe1642c9000],  sp=0x7fe1642c6b18,  free 
space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libc.so.6+0x7b53c]  cfree+0x1c

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  
org.apache.mesos.MesosSchedulerDriver.start()Lorg/apache/mesos/Protos$Status;+0
j  org.apache.mesos.MesosSchedulerDriver.run()Lorg/apache/mesos/Protos$Status;+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Lorg/apache/mesos/SchedulerDriver;)Lorg/apache/mesos/Protos$Status;+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
j  scala.Option.foreach(Lscala/Function1;)V+12
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply$mcV$sp()V+15
j  mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()V+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()Ljava/lang/Object;+1
j  
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1()Lscala/util/Try;+8
j  scala.concurrent.impl.Future$PromiseCompletingRunnable.run()V+5
j  
java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub



 Mesos 0.22.1 cause marathon crashed
 ---

 Key: MESOS-2749
 URL: https://issues.apache.org/jira/browse/MESOS-2749
 Project: Mesos
  Issue Type: Bug
  Components: java api
Affects Versions: 0.22.1
Reporter: Littlestar

 I use marathon in docker, https://github.com/mesosphere/marathon
 docker build -t marathon-head .
 When I run marathon in docker, it crashed.
 Stack: [0x7fe1641c8000,0x7fe1642c9000],  sp=0x7fe1642c6b18,  free 
 space=1018k
 Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
 code)
 C  [libc.so.6+0x7b53c]  cfree+0x1c
 Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
 j  
 org.apache.mesos.MesosSchedulerDriver.start()Lorg/apache/mesos/Protos$Status;+0
 j  
 org.apache.mesos.MesosSchedulerDriver.run()Lorg/apache/mesos/Protos$Status;+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Lorg/apache/mesos/SchedulerDriver;)Lorg/apache/mesos/Protos$Status;+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
 j  scala.Option.foreach(Lscala/Function1;)V+12
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply$mcV$sp()V+15
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()V+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()Ljava/lang/Object;+1
 j  
 

[jira] [Commented] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549660#comment-14549660
 ] 

Littlestar commented on MESOS-2749:
---

JDK 1.7.0u60 and 1.8.0u40 core dump the same way; 100% reproducible in my 
environment.
 Mesos 0.22.1 cause marathon crashed
 ---

 Key: MESOS-2749
 URL: https://issues.apache.org/jira/browse/MESOS-2749
 Project: Mesos
  Issue Type: Bug
  Components: java api
Affects Versions: 0.22.1
Reporter: Littlestar
Priority: Critical

 I use marathon in docker, https://github.com/mesosphere/marathon
 docker build -t marathon-head .
 When I run marathon in docker, it crashed.
 Stack: [0x7fe1641c8000,0x7fe1642c9000],  sp=0x7fe1642c6b18,  free 
 space=1018k
 Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
 code)
 C  [libc.so.6+0x7b53c]  cfree+0x1c
 Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
 j  
 org.apache.mesos.MesosSchedulerDriver.start()Lorg/apache/mesos/Protos$Status;+0
 j  
 org.apache.mesos.MesosSchedulerDriver.run()Lorg/apache/mesos/Protos$Status;+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Lorg/apache/mesos/SchedulerDriver;)Lorg/apache/mesos/Protos$Status;+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
 j  scala.Option.foreach(Lscala/Function1;)V+12
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply$mcV$sp()V+15
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()V+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()Ljava/lang/Object;+1
 j  
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1()Lscala/util/Try;+8
 j  scala.concurrent.impl.Future$PromiseCompletingRunnable.run()V+5
 j  
 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
 j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
 j  java.lang.Thread.run()V+11
 v  ~StubRoutines::call_stub
 =
 (gdb) where
 #0  0x003768e32625 in raise () from /lib64/libc.so.6
 #1  0x003768e33e05 in abort () from /lib64/libc.so.6
 #2  0x7f42a227c509 in os::abort(bool) () from 
 /home/test/jdk8/jre/lib/amd64/server/libjvm.so
 #3  0x7f42a2424dd5 in VMError::report_and_die() () from 
 /home/test/jdk8/jre/lib/amd64/server/libjvm.so
 #4  0x7f42a2283a31 in JVM_handle_linux_signal () from 
 /home/test/jdk8/jre/lib/amd64/server/libjvm.so
 #5  0x7f42a227ad33 in signalHandler(int, siginfo*, void*) () from 
 /home/test/jdk8/jre/lib/amd64/server/libjvm.so
 #6  &lt;signal handler called&gt;
 #7  0x003768e7b53c in free () from /lib64/libc.so.6
 #8  0x003768ecf630 in freeaddrinfo () from /lib64/libc.so.6
 #9  0x7f419689fbaf in getIP () from 
 /home/test/mesos/lib/libmesos-0.22.1.so
 #10 0x7f41968da76a in operator () from 
 /home/test/mesos/lib/libmesos-0.22.1.so
 #11 0x7f41968da0c3 in UPID () from /home/test/mesos/lib/libmesos-0.22.1.so
 #12 0x7f4195fb97f1 in create () from 
 /home/test/mesos/lib/libmesos-0.22.1.so
 #13 0x7f419619508f in start () from 
 /home/test/mesos/lib/libmesos-0.22.1.so
 #14 0x7f419697a721 in Java_org_apache_mesos_MesosSchedulerDriver_start () 
 from /home/test/mesos/lib/libmesos-0.22.1.so
 #15 0x7f428d015134 in ?? ()
 #16 0x7f428d014e82 in ?? ()
 #17 0x7f424c1f54a8 in ?? ()
 #18 0x7f424c4a6a30 in ?? ()
 #19 0x7f424c1f5508 in ?? ()
 #20 0x7f424c4a77f0 in ?? ()
 #21 0x in ?? ()





[jira] [Commented] (MESOS-2636) Segfault in inline Try&lt;IP&gt; getIP(const std::string&amp; hostname, int family)

2015-05-18 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549744#comment-14549744
 ] 

Littlestar commented on MESOS-2636:
---

{{hostname()}} in net.hpp has the same problem:

{noformat}
inline Try&lt;std::string&gt; hostname()
{
  char host[512];

  if (gethostname(host, sizeof(host)) &lt; 0) {
    return ErrnoError();
  }

  // TODO(evelinad): Add AF_UNSPEC when we will support IPv6
  struct addrinfo hints = createAddrInfo(SOCK_STREAM, AF_INET, AI_CANONNAME);
  struct addrinfo *result;

  int error = getaddrinfo(host, NULL, &amp;hints, &amp;result);

  if (error != 0 || result == NULL) {
    if (result != NULL) {
      freeaddrinfo(result);
    }
    return Error(gai_strerror(error));
  }

  std::string hostname = result-&gt;ai_canonname;
  freeaddrinfo(result);

  return hostname;
}
{noformat}

 Segfault in inline Try&lt;IP&gt; getIP(const std::string&amp; hostname, int family)
 -

 Key: MESOS-2636
 URL: https://issues.apache.org/jira/browse/MESOS-2636
 Project: Mesos
  Issue Type: Bug
Reporter: Chi Zhang
Assignee: Chi Zhang
  Labels: twitter
 Fix For: 0.23.0


 We saw a segfault in production. Attaching the coredump, we see:
 Core was generated by `/usr/local/sbin/mesos-slave --port=5051 
 --resources=cpus:23;mem:70298;ports:[31'.
 Program terminated with signal 11, Segmentation fault.
 #0  0x7f639867c77e in free () from /lib64/libc.so.6
 (gdb) bt
 #0  0x7f639867c77e in free () from /lib64/libc.so.6
 #1  0x7f63986c25d0 in freeaddrinfo () from /lib64/libc.so.6
 #2  0x7f6399deeafa in net::getIP (hostname=redacted, family=2) at 
 ./3rdparty/stout/include/stout/net.hpp:201
 #3  0x7f6399e1f273 in process::initialize (delegate=Unhandled dwarf 
 expression opcode 0xf3
 ) at src/process.cpp:837
 #4  0x0042342f in main ()





[jira] [Created] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread Littlestar (JIRA)
Littlestar created MESOS-2749:
-

 Summary: Mesos 0.22.1 cause marathon crashed
 Key: MESOS-2749
 URL: https://issues.apache.org/jira/browse/MESOS-2749
 Project: Mesos
  Issue Type: Bug
  Components: java api
Affects Versions: 0.22.1
Reporter: Littlestar


I use marathon in docker.
https://github.com/mesosphere/marathon

docker build -t marathon-head . It crashed.

Stack: [0x7fe1641c8000,0x7fe1642c9000],  sp=0x7fe1642c6b18,  free 
space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libc.so.6+0x7b53c]  cfree+0x1c

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  
org.apache.mesos.MesosSchedulerDriver.start()Lorg/apache/mesos/Protos$Status;+0
j  org.apache.mesos.MesosSchedulerDriver.run()Lorg/apache/mesos/Protos$Status;+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Lorg/apache/mesos/SchedulerDriver;)Lorg/apache/mesos/Protos$Status;+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
j  scala.Option.foreach(Lscala/Function1;)V+12
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply$mcV$sp()V+15
j  mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()V+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()Ljava/lang/Object;+1
j  
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1()Lscala/util/Try;+8
j  scala.concurrent.impl.Future$PromiseCompletingRunnable.run()V+5
j  
java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub






[jira] [Comment Edited] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549636#comment-14549636
 ] 

Littlestar edited comment on MESOS-2749 at 5/19/15 1:44 AM:


I checked libmesos.so; I think it should expose only the needed symbols:
{noformat}
{
 global:
JNI_OnLoad;
JNI_OnUnload;
*Java_org_apache_mesos*;
 local:
   *;
};
{noformat}


was (Author: cnstar9988):
I checked libmesos.so, I think It must just expose needed symbol only.
{
 global:
JNI_OnLoad;
JNI_OnUnload;
*Java_org_apache_mesos*;
 local:
   *;
};


 Mesos 0.22.1 cause marathon crashed
 ---

 Key: MESOS-2749
 URL: https://issues.apache.org/jira/browse/MESOS-2749
 Project: Mesos
  Issue Type: Bug
  Components: java api
Affects Versions: 0.22.1
Reporter: Littlestar

 I use marathon in docker.
 https://github.com/mesosphere/marathon
 docker build -t marathon-head . It crashed.
 Stack: [0x7fe1641c8000,0x7fe1642c9000],  sp=0x7fe1642c6b18,  free 
 space=1018k
 Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
 code)
 C  [libc.so.6+0x7b53c]  cfree+0x1c
 Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
 j  
 org.apache.mesos.MesosSchedulerDriver.start()Lorg/apache/mesos/Protos$Status;+0
 j  
 org.apache.mesos.MesosSchedulerDriver.run()Lorg/apache/mesos/Protos$Status;+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Lorg/apache/mesos/SchedulerDriver;)Lorg/apache/mesos/Protos$Status;+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
 j  scala.Option.foreach(Lscala/Function1;)V+12
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply$mcV$sp()V+15
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()V+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()Ljava/lang/Object;+1
 j  
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1()Lscala/util/Try;+8
 j  scala.concurrent.impl.Future$PromiseCompletingRunnable.run()V+5
 j  
 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
 j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
 j  java.lang.Thread.run()V+11
 v  ~StubRoutines::call_stub





[jira] [Comment Edited] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549636#comment-14549636
 ] 

Littlestar edited comment on MESOS-2749 at 5/19/15 2:04 AM:


I think libmesos has a bug in getIP?

Another thing: I checked libmesos.so; I think it should expose only the needed 
symbols:
{noformat}
{
 global:
JNI_OnLoad;
JNI_OnUnload;
*Java_org_apache_mesos*;
 local:
   *;
};
{noformat}


was (Author: cnstar9988):
I checked libmesos.so, I think It must just expose needed symbol only.
{noformat}
{
 global:
JNI_OnLoad;
JNI_OnUnload;
*Java_org_apache_mesos*;
 local:
   *;
};
{noformat}

 Mesos 0.22.1 cause marathon crashed
 ---

 Key: MESOS-2749
 URL: https://issues.apache.org/jira/browse/MESOS-2749
 Project: Mesos
  Issue Type: Bug
  Components: java api
Affects Versions: 0.22.1
Reporter: Littlestar
Priority: Critical

 I use marathon in docker, https://github.com/mesosphere/marathon
 docker build -t marathon-head .
 When I run marathon in docker, it crashed.
 Stack: [0x7fe1641c8000,0x7fe1642c9000],  sp=0x7fe1642c6b18,  free 
 space=1018k
 Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
 code)
 C  [libc.so.6+0x7b53c]  cfree+0x1c
 Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
 j  
 org.apache.mesos.MesosSchedulerDriver.start()Lorg/apache/mesos/Protos$Status;+0
 j  
 org.apache.mesos.MesosSchedulerDriver.run()Lorg/apache/mesos/Protos$Status;+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Lorg/apache/mesos/SchedulerDriver;)Lorg/apache/mesos/Protos$Status;+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
 j  scala.Option.foreach(Lscala/Function1;)V+12
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply$mcV$sp()V+15
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()V+1
 j  
 mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()Ljava/lang/Object;+1
 j  
 scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1()Lscala/util/Try;+8
 j  scala.concurrent.impl.Future$PromiseCompletingRunnable.run()V+5
 j  
 java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
 j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
 j  java.lang.Thread.run()V+11
 v  ~StubRoutines::call_stub
 =
 (gdb) where
 #0  0x003768e32625 in raise () from /lib64/libc.so.6
 #1  0x003768e33e05 in abort () from /lib64/libc.so.6
 #2  0x7f42a227c509 in os::abort(bool) () from 
 /home/test/jdk8/jre/lib/amd64/server/libjvm.so
 #3  0x7f42a2424dd5 in VMError::report_and_die() () from 
 /home/test/jdk8/jre/lib/amd64/server/libjvm.so
 #4  0x7f42a2283a31 in JVM_handle_linux_signal () from 
 /home/test/jdk8/jre/lib/amd64/server/libjvm.so
 #5  0x7f42a227ad33 in signalHandler(int, siginfo*, void*) () from 
 /home/test/jdk8/jre/lib/amd64/server/libjvm.so
 #6  signal handler called
 #7  0x003768e7b53c in free () from /lib64/libc.so.6
 #8  0x003768ecf630 in freeaddrinfo () from /lib64/libc.so.6
 #9  0x7f419689fbaf in getIP () from 
 /home/test/mesos/lib/libmesos-0.22.1.so
 #10 0x7f41968da76a in operator () from 
 /home/test/mesos/lib/libmesos-0.22.1.so
 #11 0x7f41968da0c3 in UPID () from /home/test/mesos/lib/libmesos-0.22.1.so
 #12 0x7f4195fb97f1 in create () from 
 /home/test/mesos/lib/libmesos-0.22.1.so
 #13 0x7f419619508f in start () from 
 /home/test/mesos/lib/libmesos-0.22.1.so
 #14 0x7f419697a721 in Java_org_apache_mesos_MesosSchedulerDriver_start () 
 from /home/test/mesos/lib/libmesos-0.22.1.so
 #15 0x7f428d015134 in ?? ()
 #16 0x7f428d014e82 in ?? ()
 #17 0x7f424c1f54a8 in ?? ()
 #18 0x7f424c4a6a30 in ?? ()
 #19 0x7f424c1f5508 in ?? ()
 #20 0x7f424c4a77f0 in ?? ()
 #21 0x in ?? ()



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2131) Add a reverse proxy endpoint to mesos

2015-05-18 Thread Cody Maloney (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549538#comment-14549538
 ] 

Cody Maloney commented on MESOS-2131:
-

This is stalled at the moment (I haven't been working on it and am heading out 
of town). I can talk to someone about the remaining issues and the path forward 
if they want to resurrect it.

 Add a reverse proxy endpoint to mesos
 -

 Key: MESOS-2131
 URL: https://issues.apache.org/jira/browse/MESOS-2131
 Project: Mesos
  Issue Type: Improvement
  Components: master, slave
Reporter: Cody Maloney
Assignee: Cody Maloney
Priority: Minor
  Labels: mesosphere

 A new libprocess Process inside Mesos that allows attaching/detaching known 
 endpoints at a specific path.
 Ideally I want to be able to do things like attach a 'slave-id' and pass HTTP 
 requests on to that slave.
 Sample endpoint actions:
 C++ API:
 attach(std::string name, Node target): add a new reverse proxy path
 detach(std::string name): remove an established reverse proxy path
 HTTP endpoints:
 /proxy/go/{name}
  - Prefix matches a path, forwards the remaining path onto the remote endpoint.
 /proxy/debug.json
  - Prints out all attached endpoints.





[jira] [Commented] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549674#comment-14549674
 ] 

haosdent commented on MESOS-2749:
-

[~cnstar9988] I think this problem was fixed by 
https://issues.apache.org/jira/browse/MESOS-2636

 Mesos 0.22.1 cause marathon crashed
 ---

 Key: MESOS-2749
 URL: https://issues.apache.org/jira/browse/MESOS-2749
 Project: Mesos
  Issue Type: Bug
  Components: java api
Affects Versions: 0.22.1
Reporter: Littlestar
Priority: Critical






[jira] [Updated] (MESOS-2600) Introduce /reserve endpoint on the master

2015-05-18 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-2600:

Summary: Introduce /reserve endpoint on the master  (was: Introduce 
reservation HTTP endpoints on the master)

 Introduce /reserve endpoint on the master
 -

 Key: MESOS-2600
 URL: https://issues.apache.org/jira/browse/MESOS-2600
 Project: Mesos
  Issue Type: Task
  Components: master
Reporter: Michael Park
Assignee: Michael Park
  Labels: mesosphere

 Enable operators to manage dynamic reservations by introducing the 
 {{/reserve}} and {{/unreserve}} HTTP endpoints on the master.





[jira] [Created] (MESOS-2748) /help generated links point to wrong URLs

2015-05-18 Thread Marco Massenzio (JIRA)
Marco Massenzio created MESOS-2748:
--

 Summary: /help generated links point to wrong URLs
 Key: MESOS-2748
 URL: https://issues.apache.org/jira/browse/MESOS-2748
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.22.1
Reporter: Marco Massenzio
Priority: Minor


As reported by Michael Lunøe mlu...@mesosphere.io (see also MESOS-329 and 
MESOS-913 for background):

{quote}
In {{mesos/3rdparty/libprocess/src/help.cpp}} a markdown file is created, which 
is then converted to HTML through a JavaScript library.

All endpoints point to {{/help/...}}; they need to work dynamically for the 
reverse proxy to do its thing. {{/mesos/help}} works and displays the 
endpoints, but each needs to go to its respective {{/mesos/help/...}} endpoint.

Note that this needs to work both for the master and for slaves. I think the 
route to slave help is something like 
{{/mesos/slaves/20150518-210216-1695027628-5050-1366-S0/help}}, but please 
double-check this.
{quote}

The fix appears to be not too complex (it would simply require manipulating 
the generated URL), but a quick skim of the code suggests that something 
more substantial may be desirable.





[jira] [Updated] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread Littlestar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Littlestar updated MESOS-2749:
--
Description: 
I run marathon in docker: https://github.com/mesosphere/marathon
docker build -t marathon-head .

When I run marathon in docker, it crashed.

Stack: [0x7fe1641c8000,0x7fe1642c9000],  sp=0x7fe1642c6b18,  free 
space=1018k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libc.so.6+0x7b53c]  cfree+0x1c

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
j  
org.apache.mesos.MesosSchedulerDriver.start()Lorg/apache/mesos/Protos$Status;+0
j  org.apache.mesos.MesosSchedulerDriver.run()Lorg/apache/mesos/Protos$Status;+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Lorg/apache/mesos/SchedulerDriver;)Lorg/apache/mesos/Protos$Status;+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1$$anonfun$apply$mcV$sp$1.apply(Ljava/lang/Object;)Ljava/lang/Object;+5
j  scala.Option.foreach(Lscala/Function1;)V+12
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply$mcV$sp()V+15
j  mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()V+1
j  
mesosphere.marathon.MarathonSchedulerService$$anonfun$runDriver$1.apply()Ljava/lang/Object;+1
j  
scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1()Lscala/util/Try;+8
j  scala.concurrent.impl.Future$PromiseCompletingRunnable.run()V+5
j  
java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
j  java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
j  java.lang.Thread.run()V+11
v  ~StubRoutines::call_stub

=
(gdb) where
#0  0x003768e32625 in raise () from /lib64/libc.so.6
#1  0x003768e33e05 in abort () from /lib64/libc.so.6
#2  0x7f42a227c509 in os::abort(bool) () from 
/home/test/jdk8/jre/lib/amd64/server/libjvm.so
#3  0x7f42a2424dd5 in VMError::report_and_die() () from 
/home/test/jdk8/jre/lib/amd64/server/libjvm.so
#4  0x7f42a2283a31 in JVM_handle_linux_signal () from 
/home/test/jdk8/jre/lib/amd64/server/libjvm.so
#5  0x7f42a227ad33 in signalHandler(int, siginfo*, void*) () from 
/home/test/jdk8/jre/lib/amd64/server/libjvm.so
#6  signal handler called
#7  0x003768e7b53c in free () from /lib64/libc.so.6
#8  0x003768ecf630 in freeaddrinfo () from /lib64/libc.so.6
#9  0x7f419689fbaf in getIP () from /home/test/mesos/lib/libmesos-0.22.1.so
#10 0x7f41968da76a in operator () from 
/home/test/mesos/lib/libmesos-0.22.1.so
#11 0x7f41968da0c3 in UPID () from /home/test/mesos/lib/libmesos-0.22.1.so
#12 0x7f4195fb97f1 in create () from /home/test/mesos/lib/libmesos-0.22.1.so
#13 0x7f419619508f in start () from /home/test/mesos/lib/libmesos-0.22.1.so
#14 0x7f419697a721 in Java_org_apache_mesos_MesosSchedulerDriver_start () 
from /home/test/mesos/lib/libmesos-0.22.1.so
#15 0x7f428d015134 in ?? ()
#16 0x7f428d014e82 in ?? ()
#17 0x7f424c1f54a8 in ?? ()
#18 0x7f424c4a6a30 in ?? ()
#19 0x7f424c1f5508 in ?? ()
#20 0x7f424c4a77f0 in ?? ()
#21 0x in ?? ()

  was:
I run marathon in docker: https://github.com/mesosphere/marathon
docker build -t marathon-head .

When I run marathon in docker, it crashed.



 Mesos 0.22.1 cause marathon crashed
 ---

 Key: MESOS-2749
   

[jira] [Commented] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549685#comment-14549685
 ] 

Littlestar commented on MESOS-2749:
---

mesos-0.22.1/3rdparty/libprocess/3rdparty/stout/include/stout/net.hpp
{noformat}
// Returns a Try of the IP for the provided hostname or an error if no
// IP is obtained.
inline Try<uint32_t> getIP(const std::string& hostname, sa_family_t family)
{
  struct addrinfo hints, *result;
  hints = createAddrInfo(SOCK_STREAM, family, 0);

  result = NULL; // This initialization is needed: when error != 0,
                 // result is otherwise a wild pointer.
  int error = getaddrinfo(hostname.c_str(), NULL, &hints, &result);
  if (error != 0 || result == NULL) {
    if (result != NULL) {
      freeaddrinfo(result);
    }
    return Error(gai_strerror(error));
  }

  if (result->ai_addr == NULL) {
    freeaddrinfo(result);
    return Error("Got no addresses for '" + hostname + "'");
  }

  uint32_t ip = ((struct sockaddr_in*)(result->ai_addr))->sin_addr.s_addr;
  freeaddrinfo(result);

  return ip;
}
{noformat}


 Mesos 0.22.1 cause marathon crashed
 ---

 Key: MESOS-2749
 URL: https://issues.apache.org/jira/browse/MESOS-2749
 Project: Mesos
  Issue Type: Bug
  Components: java api
Affects Versions: 0.22.1
Reporter: Littlestar
Priority: Critical






[jira] [Commented] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549690#comment-14549690
 ] 

Littlestar commented on MESOS-2749:
---

Thanks to haosdent; it's the same problem: 
https://issues.apache.org/jira/browse/MESOS-2636


 Mesos 0.22.1 cause marathon crashed
 ---

 Key: MESOS-2749
 URL: https://issues.apache.org/jira/browse/MESOS-2749
 Project: Mesos
  Issue Type: Bug
  Components: java api
Affects Versions: 0.22.1
Reporter: Littlestar
Priority: Critical






[jira] [Closed] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread Littlestar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Littlestar closed MESOS-2749.
-
Resolution: Fixed

 Mesos 0.22.1 cause marathon crashed
 ---

 Key: MESOS-2749
 URL: https://issues.apache.org/jira/browse/MESOS-2749
 Project: Mesos
  Issue Type: Bug
  Components: java api
Affects Versions: 0.22.1
Reporter: Littlestar
Priority: Critical






[jira] [Commented] (MESOS-2749) Mesos 0.22.1 cause marathon crashed

2015-05-18 Thread Littlestar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14549636#comment-14549636
 ] 

Littlestar commented on MESOS-2749:
---

I checked libmesos.so; I think it should expose only the needed symbols:
{noformat}
{
 global:
    JNI_OnLoad;
    JNI_OnUnload;
    *Java_org_apache_mesos*;
 local:
    *;
};
{noformat}


 Mesos 0.22.1 cause marathon crashed
 ---

 Key: MESOS-2749
 URL: https://issues.apache.org/jira/browse/MESOS-2749
 Project: Mesos
  Issue Type: Bug
  Components: java api
Affects Versions: 0.22.1
Reporter: Littlestar

 I run marathon in docker:
 https://github.com/mesosphere/marathon
 docker build -t marathon-head .
 It crashed.





[jira] [Updated] (MESOS-2728) Introduce concept of cluster wide resources.

2015-05-18 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-2728:
--
Labels: mesosphere  (was: )

 Introduce concept of cluster wide resources.
 

 Key: MESOS-2728
 URL: https://issues.apache.org/jira/browse/MESOS-2728
 Project: Mesos
  Issue Type: Epic
Reporter: Joerg Schad
  Labels: mesosphere

 There are resources that are not provided by a single node. Consider, for 
 example, the external network bandwidth of a cluster. Being a limited 
 resource, it makes sense for Mesos to manage it, yet it is not a resource 
 offered by a single node.
 Use cases:
 1. Network bandwidth
 2. IP addresses
 3. Global service ports
 4. Distributed file system storage
 5. Software licenses





[jira] [Commented] (MESOS-2596) Update allocator docs

2015-05-18 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547826#comment-14547826
 ] 

Alexander Rukletsov commented on MESOS-2596:


Absolutely!

 Update allocator docs
 -

 Key: MESOS-2596
 URL: https://issues.apache.org/jira/browse/MESOS-2596
 Project: Mesos
  Issue Type: Task
  Components: allocation, documentation, modules
Reporter: Alexander Rukletsov
  Labels: mesosphere

 Once the Allocator interface changes, so does the way new allocators are 
 written. This should be reflected in the Mesos docs. The modules doc should 
 mention how to write and use allocator modules, and the configuration doc 
 should mention the new {{--allocator}} flag.





[jira] [Issue Comment Deleted] (MESOS-2596) Update allocator docs

2015-05-18 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-2596:
---
Comment: was deleted

(was: Absolutely!)

 Update allocator docs
 -

 Key: MESOS-2596
 URL: https://issues.apache.org/jira/browse/MESOS-2596
 Project: Mesos
  Issue Type: Task
  Components: allocation, documentation, modules
Reporter: Alexander Rukletsov
  Labels: mesosphere

 Once the Allocator interface changes, so does the way new allocators are 
 written. This should be reflected in the Mesos docs. The modules doc should 
 mention how to write and use allocator modules, and the configuration doc 
 should mention the new {{--allocator}} flag.





[jira] [Commented] (MESOS-2596) Update allocator docs

2015-05-18 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547827#comment-14547827
 ] 

Alexander Rukletsov commented on MESOS-2596:


Absolutely!

 Update allocator docs
 -

 Key: MESOS-2596
 URL: https://issues.apache.org/jira/browse/MESOS-2596
 Project: Mesos
  Issue Type: Task
  Components: allocation, documentation, modules
Reporter: Alexander Rukletsov
  Labels: mesosphere

 Once the Allocator interface changes, so does the way new allocators are 
 written. This should be reflected in the Mesos docs. The modules doc should 
 mention how to write and use allocator modules, and the configuration doc 
 should mention the new {{--allocator}} flag.





[jira] [Commented] (MESOS-2369) Segfault when mesos-slave tries to clean up docker containers on startup

2015-05-18 Thread Herman Schistad (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14547867#comment-14547867
 ] 

Herman Schistad commented on MESOS-2369:


Hi. Had this issue as well. Running {{mesos-slave}} with 
{{--containerizers=docker}} yielded a _Segmentation fault_ after {{I0518 
11:28:17.503280 38714 detector.cpp:452] A new leading master 
(UPID=master@**) is detected}} in the log files. This was with Mesos 0.22.1 
and Docker 1.5.0.

Running with {{strace}} showed that SIGSEGV was sent.

Turns out mesos doesn't like it when there's too many dangling and exited 
containers. I had several thousands of them, so I ran:

{{docker rm $(docker ps -qa -f status=exited)}}
{{docker rmi $(docker images -q -f dangling=true)}}

Waited for it to clean up everything and then it worked again. In the future 
I'll run some of my docker containers with the {{--rm}} flag, so they'll clean 
up after themselves.
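
The two cleanup commands above can be wrapped in a small guard script. This is only a sketch of the manual workaround from this comment, assuming the stock {{docker}} CLI; the {{DOCKER}} override and the {{cleanup_docker}} function name are illustrative additions, not from the thread.

```shell
#!/bin/sh
# Sketch of the manual cleanup described above. DOCKER can be overridden
# (e.g. for testing); it defaults to the real docker CLI.
DOCKER="${DOCKER:-docker}"

cleanup_docker() {
    # Collect and remove all exited containers, if there are any.
    exited=$("$DOCKER" ps -qa -f status=exited)
    if [ -n "$exited" ]; then
        "$DOCKER" rm $exited      # word-splitting on the ids is intentional
    fi

    # Collect and remove dangling images left behind by old pulls/builds.
    dangling=$("$DOCKER" images -q -f dangling=true)
    if [ -n "$dangling" ]; then
        "$DOCKER" rmi $dangling
    fi
}
```

With several thousand stale containers this still takes a while; the {{--rm}} flag mentioned above avoids the buildup in the first place.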

 Segfault when mesos-slave tries to clean up docker containers on startup
 

 Key: MESOS-2369
 URL: https://issues.apache.org/jira/browse/MESOS-2369
 Project: Mesos
  Issue Type: Bug
  Components: docker
Affects Versions: 0.21.1
 Environment: Debian Jessie, mesos package 0.21.1-1.2.debian77 
 docker 1.3.2 build 39fa2fa
Reporter: Pas

 I did a gdb backtrace; it looks like a stack overflow due to a bit too much 
 recursion.
 The interesting aspect is that after running mesos-slave under strace -f -b 
 execve, it successfully proceeded with the docker cleanup. However, there were 
 a few strace sessions (on other slaves) where I was able to observe the 
 SIGSEGV, and it happened around (or a bit before) the docker ps -a call: 
 docker briefly got a broken pipe, then was killed by the propagating SIGSEGV 
 signal.
 {code}
 
 #59296 0x76e7cd98 in process::Future<std::string> process::Future<unsigned long>::then<std::string>(std::tr1::function<process::Future<std::string> (unsigned long const&)> const&) const () from /usr/local/lib/libmesos-0.21.1.so
 #59297 0x76e4f5d3 in process::io::internal::_read(int, std::tr1::shared_ptr<std::string> const&, boost::shared_array<char> const&, unsigned long) () from /usr/local/lib/libmesos-0.21.1.so
 #59298 0x76e5012c in process::io::internal::__read(unsigned long, int, std::tr1::shared_ptr<std::string> const&, boost::shared_array<char> const&, unsigned long) () from /usr/local/lib/libmesos-0.21.1.so
 #59299 0x76e53000 in std::tr1::_Function_handler<process::Future<std::string> (unsigned long const&), std::tr1::_Bind<process::Future<std::string> (*(std::tr1::_Placeholder<1>, int, std::tr1::shared_ptr<std::string>, boost::shared_array<char>, unsigned long))(unsigned long, int, std::tr1::shared_ptr<std::string> const&, boost::shared_array<char> const&, unsigned long)> >::_M_invoke(std::tr1::_Any_data const&, unsigned long const&) () from /usr/local/lib/libmesos-0.21.1.so
 #59300 0x76e7d23b in void process::internal::thenf<unsigned long, std::string>(std::tr1::shared_ptr<process::Promise<std::string> > const&, std::tr1::function<process::Future<std::string> (unsigned long const&)> const&, process::Future<unsigned long> const&) () from /usr/local/lib/libmesos-0.21.1.so
 #59301 0x7689ee60 in process::Future<unsigned long>::onAny(std::tr1::function<void (process::Future<unsigned long> const&)> const&) const () from /usr/local/lib/libmesos-0.21.1.so
 #59302 0x76e7cd98 in process::Future<std::string> process::Future<unsigned long>::then<std::string>(std::tr1::function<process::Future<std::string> (unsigned long const&)> const&) const () from /usr/local/lib/libmesos-0.21.1.so
 #59303 0x76e4f5d3 in process::io::internal::_read(int, std::tr1::shared_ptr<std::string> const&, boost::shared_array<char> const&, unsigned long) () from /usr/local/lib/libmesos-0.21.1.so
 #59304 0x76e5012c in process::io::internal::__read(unsigned long, int, std::tr1::shared_ptr<std::string> const&, boost::shared_array<char> const&, unsigned long) () from /usr/local/lib/libmesos-0.21.1.so
 #59305 0x76e53000 in std::tr1::_Function_handler<process::Future<std::string> (unsigned long const&), std::tr1::_Bind<process::Future<std::string> (*(std::tr1::_Placeholder<1>, int, std::tr1::shared_ptr<std::string>, boost::shared_array<char>, unsigned long))(unsigned long, int, std::tr1::shared_ptr<std::string> const&, boost::shared_array<char> const&, unsigned long)> >::_M_invoke(std::tr1::_Any_data const&, unsigned long const&) () from /usr/local/lib/libmesos-0.21.1.so
 #59306 0x76e7d23b in void process::internal::thenf<unsigned long, std::string>(std::tr1::shared_ptr<process::Promise<std::string> > const&, std::tr1::function<process::Future<std::string> (unsigned long const&)> const&, process::Future<unsigned long> const&) ()