[jira] [Commented] (MESOS-10011) Operation feedback with stale agent ID crashes the master

2019-10-11 Thread Yan Xu (Jira)
[ https://issues.apache.org/jira/browse/MESOS-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949656#comment-16949656 ] Yan Xu commented on MESOS-10011: {{removeOperation}} is probably called from [here|http

[jira] [Commented] (MESOS-10011) Operation feedback with stale agent ID crashes the master

2019-10-10 Thread Yan Xu (Jira)
[ https://issues.apache.org/jira/browse/MESOS-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949147#comment-16949147 ] Yan Xu commented on MESOS-10011: To be most compatible with the agent checkpointed resou

[jira] [Commented] (MESOS-10011) Operation feedback with stale agent ID crashes the master

2019-10-10 Thread Yan Xu (Jira)
[ https://issues.apache.org/jira/browse/MESOS-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949139#comment-16949139 ] Yan Xu commented on MESOS-10011: [~greggomann] any thoughts on how this should be addres

[jira] [Commented] (MESOS-10011) Operation feedback with stale agent ID crashes the master

2019-10-10 Thread Yan Xu (Jira)
[ https://issues.apache.org/jira/browse/MESOS-10011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16949069#comment-16949069 ] Yan Xu commented on MESOS-10011: In our environment we only use old style RESERVE/CREATE

[jira] [Created] (MESOS-10011) Operation feedback with stale agent ID crashes the master

2019-10-10 Thread Yan Xu (Jira)
Yan Xu created MESOS-10011: -- Summary: Operation feedback with stale agent ID crashes the master Key: MESOS-10011 URL: https://issues.apache.org/jira/browse/MESOS-10011 Project: Mesos Issue Type: Bug

[jira] [Commented] (MESOS-9768) Allow operators to mount the container rootfs with the `nosuid` flag

2019-05-17 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-9768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16842819#comment-16842819 ] Yan Xu commented on MESOS-9768: --- What we are primarily interested in is to set it for t

[jira] [Commented] (MESOS-9368) The agent can be resending status updates too aggressively and the backoff is not configurable

2018-11-02 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-9368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673503#comment-16673503 ] Yan Xu commented on MESOS-9368: --- cc [~fiu] > The agent can be resending status updates too

[jira] [Commented] (MESOS-9368) The agent can be resending status updates too aggressively and the backoff is not configurable

2018-11-02 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-9368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16673497#comment-16673497 ] Yan Xu commented on MESOS-9368: --- [~ipronin] [~jasonlai] do you guys feel similarly for your

[jira] [Created] (MESOS-9368) The agent can be resending status updates too aggressively and the backoff is not configurable

2018-11-02 Thread Yan Xu (JIRA)
Yan Xu created MESOS-9368: - Summary: The agent can be resending status updates too aggressively and the backoff is not configurable Key: MESOS-9368 URL: https://issues.apache.org/jira/browse/MESOS-9368 Projec

[jira] [Commented] (MESOS-9178) Add a metric for master failover time.

2018-09-12 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16612870#comment-16612870 ] Yan Xu commented on MESOS-9178: --- So my proposal is that, we have the following metrics:..

[jira] [Commented] (MESOS-9178) Add a metric for master failover time.

2018-08-22 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589463#comment-16589463 ] Yan Xu commented on MESOS-9178: --- +1. Yup that's the approach we talked about. Sorry the JIR

[jira] [Created] (MESOS-9171) Mesos agent crashes

2018-08-21 Thread Yan Xu (JIRA)
Yan Xu created MESOS-9171: - Summary: Mesos agent crashes Key: MESOS-9171 URL: https://issues.apache.org/jira/browse/MESOS-9171 Project: Mesos Issue Type: Bug Affects Versions: 1.7.0 R

[jira] [Commented] (MESOS-8897) ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill is flaky

2018-05-09 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16469112#comment-16469112 ] Yan Xu commented on MESOS-8897: --- cc [~hdost] > ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWith

[jira] [Created] (MESOS-8897) ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill is flaky

2018-05-09 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8897: - Summary: ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill is flaky Key: MESOS-8897 URL: https://issues.apache.org/jira/browse/MESOS-8897 Project: Mesos Issue Type: Bug

[jira] [Commented] (MESOS-8750) Check failed: !slaves.registered.contains(task->slave_id)

2018-05-03 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16463379#comment-16463379 ] Yan Xu commented on MESOS-8750: --- {code:title=} commit 520b729857223aeade345cbdf61209ec4f395a

[jira] [Assigned] (MESOS-8630) All subsequent registry operations fail after the registrar is aborted after a failed update

2018-05-01 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8630: - Assignee: Xudong Ni > All subsequent registry operations fail after the registrar is aborted after > a f

[jira] [Commented] (MESOS-8618) ReconciliationTest.ReconcileStatusUpdateTaskState is flaky.

2018-04-30 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459442#comment-16459442 ] Yan Xu commented on MESOS-8618: --- {noformat:title=} commit 1c6d9e5e6d7439444c77d6c91b18642f69

[jira] [Created] (MESOS-8855) Change TaskStatus.Reason's default value to something

2018-04-30 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8855: - Summary: Change TaskStatus.Reason's default value to something Key: MESOS-8855 URL: https://issues.apache.org/jira/browse/MESOS-8855 Project: Mesos Issue Type: Bug

[jira] [Assigned] (MESOS-8824) Send the task's latest "status update state" to frameworks when an unreachable agent reregisters.

2018-04-23 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8824: - Assignee: Yan Xu > Send the task's latest "status update state" to frameworks when an > unreachable agen

[jira] [Commented] (MESOS-8618) ReconciliationTest.ReconcileStatusUpdateTaskState is flaky.

2018-04-23 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16448971#comment-16448971 ] Yan Xu commented on MESOS-8618: --- This test failed because we didn't enable replicated log re

[jira] [Created] (MESOS-8824) Send the task's latest "status update state" to frameworks when an unreachable agent reregisters.

2018-04-23 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8824: - Summary: Send the task's latest "status update state" to frameworks when an unreachable agent reregisters. Key: MESOS-8824 URL: https://issues.apache.org/jira/browse/MESOS-8824 Pro

[jira] [Assigned] (MESOS-8618) ReconciliationTest.ReconcileStatusUpdateTaskState is flaky.

2018-04-23 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8618: - Assignee: Yan Xu > ReconciliationTest.ReconcileStatusUpdateTaskState is flaky. >

[jira] [Commented] (MESOS-8630) All subsequent registry operations fail after the registrar is aborted after a failed update

2018-04-06 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16429105#comment-16429105 ] Yan Xu commented on MESOS-8630: --- A first step could be to identify all the places that updat

[jira] [Commented] (MESOS-8636) Master should store `completed` frameworks for lifecycle enforcement separately from that for webUI and endpoints

2018-03-05 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16386503#comment-16386503 ] Yan Xu commented on MESOS-8636: --- This is in line with what Mesos already does for [gone age

[jira] [Created] (MESOS-8636) Master should store `completed` frameworks for lifecycle enforcement separately from that for webUI and endpoints

2018-03-05 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8636: - Summary: Master should store `completed` frameworks for lifecycle enforcement separately from that for webUI and endpoints Key: MESOS-8636 URL: https://issues.apache.org/jira/browse/MESOS-8636

[jira] [Commented] (MESOS-6422) cgroups_tests not correctly tearing down testing hierarchies

2018-03-02 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384307#comment-16384307 ] Yan Xu commented on MESOS-6422: --- Sorry this is low priority for me right now so I am unassig

[jira] [Assigned] (MESOS-6422) cgroups_tests not correctly tearing down testing hierarchies

2018-03-02 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-6422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-6422: - Assignee: (was: Yan Xu) > cgroups_tests not correctly tearing down testing hierarchies >

[jira] [Created] (MESOS-8630) All subsequent registry operations fail after the registrar is aborted after a failed update

2018-03-02 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8630: - Summary: All subsequent registry operations fail after the registrar is aborted after a failed update Key: MESOS-8630 URL: https://issues.apache.org/jira/browse/MESOS-8630 Project:

[jira] [Created] (MESOS-8622) Agent should send a task status update when upon receiving the task

2018-02-27 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8622: - Summary: Agent should send a task status update when upon receiving the task Key: MESOS-8622 URL: https://issues.apache.org/jira/browse/MESOS-8622 Project: Mesos Issue Ty

[jira] [Created] (MESOS-8602) Subscribers::send incorrectly assumes frameworks are registered

2018-02-22 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8602: - Summary: Subscribers::send incorrectly assumes frameworks are registered Key: MESOS-8602 URL: https://issues.apache.org/jira/browse/MESOS-8602 Project: Mesos Issue Type:

[jira] [Commented] (MESOS-8602) Subscribers::send incorrectly assumes frameworks are registered

2018-02-22 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16373661#comment-16373661 ] Yan Xu commented on MESOS-8602: --- /cc [~greggomann] > Subscribers::send incorrectly assumes

[jira] [Commented] (MESOS-8595) Mesos agent's use of /tmp for overlayfs could be confusing

2018-02-19 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8595?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16369442#comment-16369442 ] Yan Xu commented on MESOS-8595: --- /cc [~gilbert] [~zhitao] > Mesos agent's use of /tmp for o

[jira] [Created] (MESOS-8595) Mesos agent

2018-02-19 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8595: - Summary: Mesos agent Key: MESOS-8595 URL: https://issues.apache.org/jira/browse/MESOS-8595 Project: Mesos Issue Type: Bug Reporter: Yan Xu With MESOS-6000 Me

[jira] [Created] (MESOS-8544) Required mesos.Task.state doesn't support upgrades.

2018-02-05 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8544: - Summary: Required mesos.Task.state doesn't support upgrades. Key: MESOS-8544 URL: https://issues.apache.org/jira/browse/MESOS-8544 Project: Mesos Issue Type: Bug

[jira] [Commented] (MESOS-8232) SlaveTest.RegisteredAgentReregisterAfterFailover is flaky.

2018-01-31 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16347285#comment-16347285 ] Yan Xu commented on MESOS-8232: --- [~alexr] thanks a lot for diligently cleaning up flaky test

[jira] [Assigned] (MESOS-8232) SlaveTest.RegisteredAgentReregisterAfterFailover is flaky.

2018-01-31 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8232: - Assignee: Yan Xu > SlaveTest.RegisteredAgentReregisterAfterFailover is flaky. > -

[jira] [Commented] (MESOS-8507) SLRP discards reservations when the agent is discarded, which could lead to leaked volumes.

2018-01-29 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344362#comment-16344362 ] Yan Xu commented on MESOS-8507: --- /cc [~chhsia0] [~jieyu] > SLRP discards reservations when

[jira] [Created] (MESOS-8507) SLRP discards reservations when the agent is discarded, which could lead to leaked volumes.

2018-01-29 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8507: - Summary: SLRP discards reservations when the agent is discarded, which could lead to leaked volumes. Key: MESOS-8507 URL: https://issues.apache.org/jira/browse/MESOS-8507 Project:

[jira] [Updated] (MESOS-8507) SLRP discards reservations when the agent is discarded, which could lead to leaked volumes.

2018-01-29 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-8507: -- Description: In the current SLRP implementation the reservations for new SLRP/CSI backed volumes are checkpoint

[jira] [Comment Edited] (MESOS-5368) Consider introducing persistent agent ID

2018-01-29 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344190#comment-16344190 ] Yan Xu edited comment on MESOS-5368 at 1/29/18 11:27 PM: - [~vinodk

[jira] [Commented] (MESOS-5368) Consider introducing persistent agent ID

2018-01-29 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344190#comment-16344190 ] Yan Xu commented on MESOS-5368: --- [~vinodkone] It still seems to me that the proposal to tie

[jira] [Commented] (MESOS-8337) Invalid state transition attempted when agent is lost.

2018-01-12 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16324742#comment-16324742 ] Yan Xu commented on MESOS-8337: --- {noformat:title=} commit 35ac2f047abf2c0ea452b98a249c3dbb90

[jira] [Commented] (MESOS-8125) Agent should properly handle recovering an executor when its pid is reused

2018-01-09 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16319818#comment-16319818 ] Yan Xu commented on MESOS-8125: --- We used to not need to handle recovering executors after a

[jira] [Assigned] (MESOS-8334) PartitionedSlaveReregistrationMasterFailover is flaky.

2018-01-03 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8334: - Assignee: Yan Xu (was: Megha Sharma) > PartitionedSlaveReregistrationMasterFailover is flaky. >

[jira] [Commented] (MESOS-8334) PartitionedSlaveReregistrationMasterFailover is flaky.

2018-01-03 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16310404#comment-16310404 ] Yan Xu commented on MESOS-8334: --- The agent reregistered before one scheduler hence the statu

[jira] [Commented] (MESOS-6406) Send latest status for partition-aware tasks when agent reregisters

2017-12-12 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16288018#comment-16288018 ] Yan Xu commented on MESOS-6406: --- {noformat:title=} commit 5e5a8102c3281db25a37157dac123b0ca5

[jira] [Commented] (MESOS-8306) Restrict which agents can statically reserve resources for which roles

2017-12-11 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286429#comment-16286429 ] Yan Xu commented on MESOS-8306: --- After investigating it I found that it makes more sense of

[jira] [Comment Edited] (MESOS-8306) Restrict which agents can statically reserve resources for which roles

2017-12-11 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286857#comment-16286857 ] Yan Xu edited comment on MESOS-8306 at 12/12/17 12:35 AM: -- https:

[jira] [Commented] (MESOS-8306) Restrict which agents can statically reserve resources for which roles

2017-12-11 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16286750#comment-16286750 ] Yan Xu commented on MESOS-8306: --- So in order to authorize the static reservations, the maste

[jira] [Assigned] (MESOS-621) `HierarchicalAllocatorProcess::removeSlave` doesn't properly handle framework allocations/resources

2017-12-06 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-621: Assignee: (was: Yan Xu) > `HierarchicalAllocatorProcess::removeSlave` doesn't properly handle framework

[jira] [Assigned] (MESOS-8306) Restrict which agents can statically reserve resources for which roles

2017-12-06 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8306: - Assignee: Yan Xu > Restrict which agents can statically reserve resources for which roles > -

[jira] [Created] (MESOS-8306) Restrict which agents can statically reserve resources for which roles

2017-12-06 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8306: - Summary: Restrict which agents can statically reserve resources for which roles Key: MESOS-8306 URL: https://issues.apache.org/jira/browse/MESOS-8306 Project: Mesos Issue

[jira] [Commented] (MESOS-8223) Master crashes when suppressed on subscribe is enabled.

2017-12-01 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275169#comment-16275169 ] Yan Xu commented on MESOS-8223: --- {noformat:title=} commit 8c2f972b5c0c42e1519d09275cc26e1765

[jira] [Commented] (MESOS-8200) Suppressed roles are not honoured for v1 scheduler subscribe requests.

2017-12-01 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16275165#comment-16275165 ] Yan Xu commented on MESOS-8200: --- {noformat:title=} commit 3711233fcec761be8625af6a028a228fe9

[jira] [Commented] (MESOS-6406) Send latest status for partition-aware tasks when agent reregisters

2017-11-29 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16271862#comment-16271862 ] Yan Xu commented on MESOS-6406: --- [~ipronin] no if the agent's entry was GCed. The master doe

[jira] [Commented] (MESOS-6406) Send latest status for partition-aware tasks when agent reregisters

2017-11-29 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16271772#comment-16271772 ] Yan Xu commented on MESOS-6406: --- So I think we can probably improve on the approach stated i

[jira] [Created] (MESOS-8276) Benchmark agent reregistration after master failover with connected frameworks.

2017-11-29 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8276: - Summary: Benchmark agent reregistration after master failover with connected frameworks. Key: MESOS-8276 URL: https://issues.apache.org/jira/browse/MESOS-8276 Project: Mesos

[jira] [Commented] (MESOS-8185) Tasks can be known to the agent but unknown to the master.

2017-11-27 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16267891#comment-16267891 ] Yan Xu commented on MESOS-8185: --- [~ipronin] sure and Megha just submitted a RR for MESOS-640

[jira] [Updated] (MESOS-6406) Send latest status for partition-aware tasks when agent reregisters

2017-11-27 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-6406: -- Shepherd: Yan Xu (was: Vinod Kone) > Send latest status for partition-aware tasks when agent reregisters >

[jira] [Assigned] (MESOS-6406) Send latest status for partition-aware tasks when agent reregisters

2017-11-27 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-6406: - Assignee: Megha Sharma (was: Neil Conway) > Send latest status for partition-aware tasks when agent rere

[jira] [Commented] (MESOS-7711) Master updates registry for reregistering agents even when they haven't been unreachable

2017-11-22 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-7711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16263224#comment-16263224 ] Yan Xu commented on MESOS-7711: --- Clarification on the fix: by not calling registrar in the m

[jira] [Commented] (MESOS-8185) Tasks can be known to the agent but unknown to the master.

2017-11-17 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257398#comment-16257398 ] Yan Xu commented on MESOS-8185: --- I think so. [~ipronin] with MESOS-7215 no tasks will be kil

[jira] [Updated] (MESOS-8200) Suppressed roles are not honoured for v1 scheduler subscribe requests.

2017-11-15 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-8200: -- Affects Version/s: 1.4.0 > Suppressed roles are not honoured for v1 scheduler subscribe requests. >

[jira] [Commented] (MESOS-8223) Master crashes when suppressed on subscribe is enabled.

2017-11-14 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16252014#comment-16252014 ] Yan Xu commented on MESOS-8223: --- The problem is that this [code|https://github.com/apache/m

[jira] [Assigned] (MESOS-8223) Master crashes when suppressed on subscribe is enabled.

2017-11-14 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8223: - Assignee: Yan Xu > Master crashes when suppressed on subscribe is enabled. >

[jira] [Created] (MESOS-8223) Master crashes when suppressed on subscribe is enabled.

2017-11-14 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8223: - Summary: Master crashes when suppressed on subscribe is enabled. Key: MESOS-8223 URL: https://issues.apache.org/jira/browse/MESOS-8223 Project: Mesos Issue Type: Bug

[jira] [Commented] (MESOS-8200) Suppressed roles are not honoured for v1 scheduler subscribe requests.

2017-11-13 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250233#comment-16250233 ] Yan Xu commented on MESOS-8200: --- [~vinodkone] yes. I have the patch for the devolve code rea

[jira] [Commented] (MESOS-8200) Suppressed roles are not honoured for v1 scheduler subscribe requests.

2017-11-10 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247947#comment-16247947 ] Yan Xu commented on MESOS-8200: --- The easiest fix is probably to just change the tag to 3: {{re

[jira] [Assigned] (MESOS-8178) UnreachableAgentReregisterAfterFailover is flaky.

2017-11-07 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8178?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8178: - Assignee: Yan Xu > UnreachableAgentReregisterAfterFailover is flaky. > --

[jira] [Commented] (MESOS-8160) Support idempotent framework registration

2017-11-06 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16241612#comment-16241612 ] Yan Xu commented on MESOS-8160: --- /cc [~adam-mesos] this is relevant to the point you made on

[jira] [Commented] (MESOS-8098) Benchmark Master failover performance

2017-11-03 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16238131#comment-16238131 ] Yan Xu commented on MESOS-8098: --- {noformat:title=} commit ac0fa281472c2ba891f7bd0837fbd728ac

[jira] [Updated] (MESOS-8098) Benchmark Master failover performance

2017-11-03 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-8098: -- Attachment: withoutperfpatches.perf.svg withperfpatches.perf.svg Attaching two flame graphs comp

[jira] [Created] (MESOS-8160) Support idempotent framework registration

2017-11-01 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8160: - Summary: Support idempotent framework registration Key: MESOS-8160 URL: https://issues.apache.org/jira/browse/MESOS-8160 Project: Mesos Issue Type: Bug Reporte

[jira] [Commented] (MESOS-8138) Master can fail to detect HTTP framework disconnection if it disconnects very fast

2017-10-26 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221487#comment-16221487 ] Yan Xu commented on MESOS-8138: --- {quote} the master realizes the disconnection when it tries

[jira] [Updated] (MESOS-8138) Master can fail to detect HTTP framework disconnection if it disconnects very fast

2017-10-26 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-8138: -- Description: What we've observed is that if the framework disconnects before the master actor processes the ini

[jira] [Commented] (MESOS-8138) Master can fail to detect HTTP framework disconnection if it disconnects very fast

2017-10-26 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16221369#comment-16221369 ] Yan Xu commented on MESOS-8138: --- /cc [~anandmazumdar] who implemented MESOS-2294. > Master

[jira] [Created] (MESOS-8138) Master can fail to detect HTTP framework disconnection if it disconnects very fast

2017-10-26 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8138: - Summary: Master can fail to detect HTTP framework disconnection if it disconnects very fast Key: MESOS-8138 URL: https://issues.apache.org/jira/browse/MESOS-8138 Project: Mesos

[jira] [Commented] (MESOS-5368) Consider introducing persistent agent ID

2017-10-25 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16219668#comment-16219668 ] Yan Xu commented on MESOS-5368: --- /cc [~anandmazumdar] does my comment above make sense? > C

[jira] [Assigned] (MESOS-8085) No point in deallocate() for a framework for maintenance if it is deactivated.

2017-10-16 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8085: - Assignee: Yan Xu > No point in deallocate() for a framework for maintenance if it is deactivated. > -

[jira] [Created] (MESOS-8098) Benchmark Master failover performance

2017-10-16 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8098: - Summary: Benchmark Master failover performance Key: MESOS-8098 URL: https://issues.apache.org/jira/browse/MESOS-8098 Project: Mesos Issue Type: Task Components:

[jira] [Assigned] (MESOS-8098) Benchmark Master failover performance

2017-10-16 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8098: - Assignee: Yan Xu > Benchmark Master failover performance > - > >

[jira] [Updated] (MESOS-8085) No point in deallocate() for a framework for maintenance if it is deactivated.

2017-10-16 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-8085: -- Description: The {{UnavailableResources}} sent from the allocator to the master are going to be dropped by the

[jira] [Updated] (MESOS-8085) No point in deallocate() for a framework for maintenance if it is deactivated.

2017-10-12 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-8085: -- Summary: No point in deallocate() for a framework for maintenance if it is deactivated. (was: No point in deall

[jira] [Created] (MESOS-8085) No point in deallocate() for a framework for maintenance it is deactivated.

2017-10-12 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8085: - Summary: No point in deallocate() for a framework for maintenance it is deactivated. Key: MESOS-8085 URL: https://issues.apache.org/jira/browse/MESOS-8085 Project: Mesos

[jira] [Updated] (MESOS-8085) No point in deallocate() for a framework for maintenance it is deactivated.

2017-10-12 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-8085: -- Labels: maintenance (was: ) > No point in deallocate() for a framework for maintenance it is deactivated. > ---

[jira] [Commented] (MESOS-5368) Consider introducing persistent agent ID

2017-10-12 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202636#comment-16202636 ] Yan Xu commented on MESOS-5368: --- Also, how does this relate to MESOS-8008? From there it sou

[jira] [Created] (MESOS-8083) Mesos containerizer should run isolate() sequentially.

2017-10-12 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8083: - Summary: Mesos containerizer should run isolate() sequentially. Key: MESOS-8083 URL: https://issues.apache.org/jira/browse/MESOS-8083 Project: Mesos Issue Type: Improvemen

[jira] [Commented] (MESOS-5368) Consider introducing persistent agent ID

2017-10-12 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-5368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16202473#comment-16202473 ] Yan Xu commented on MESOS-5368: --- [~vinodkone] This sounds good to me, just a few details whi

[jira] [Updated] (MESOS-8076) PersistentVolumeTest.SharedPersistentVolumeRescindOnDestroy is flaky.

2017-10-12 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-8076: -- Shepherd: Alexander Rukletsov > PersistentVolumeTest.SharedPersistentVolumeRescindOnDestroy is flaky. >

[jira] [Assigned] (MESOS-8076) PersistentVolumeTest.SharedPersistentVolumeRescindOnDestroy is flaky.

2017-10-12 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu reassigned MESOS-8076: - Assignee: Yan Xu > PersistentVolumeTest.SharedPersistentVolumeRescindOnDestroy is flaky. > --

[jira] [Updated] (MESOS-8062) Master sends messages to the agent before it reregisters

2017-10-09 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-8062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-8062: -- Component/s: master > Master sends messages to the agent before it reregisters > ---

[jira] [Created] (MESOS-8062) Master sends messages to the agent before it reregisters

2017-10-09 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8062: - Summary: Master sends messages to the agent before it reregisters Key: MESOS-8062 URL: https://issues.apache.org/jira/browse/MESOS-8062 Project: Mesos Issue Type: Bug

[jira] [Commented] (MESOS-6918) Prometheus exporter endpoints for metrics

2017-10-05 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-6918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16194161#comment-16194161 ] Yan Xu commented on MESOS-6918: --- [~bmahler] let's chat about the reviews? [~jpe...@apache.or

[jira] [Commented] (MESOS-1280) Add replace task primitive

2017-10-02 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-1280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16188488#comment-16188488 ] Yan Xu commented on MESOS-1280: --- Probably not all fields in the TaskInfo make equal sense to

[jira] [Commented] (MESOS-7215) Race condition on re-registration of non-partition-aware frameworks

2017-09-29 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-7215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16186286#comment-16186286 ] Yan Xu commented on MESOS-7215: --- [~megha.sharma] Per offline discussion, we should probably

[jira] [Commented] (MESOS-7964) Heavy-duty GC makes the agent unresponsive

2017-09-26 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16181817#comment-16181817 ] Yan Xu commented on MESOS-7964: --- {noformat:title=master} commit 06341309e61a5cee702ea3c7b6d3

[jira] [Updated] (MESOS-7964) Heavy-duty GC makes the agent unresponsive

2017-09-26 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-7964: -- Fix Version/s: 1.5.0 > Heavy-duty GC makes the agent unresponsive > -- >

[jira] [Updated] (MESOS-7964) Heavy-duty GC makes the agent unresponsive

2017-09-26 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-7964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yan Xu updated MESOS-7964: -- Affects Version/s: 1.4.0 > Heavy-duty GC makes the agent unresponsive >

[jira] [Commented] (MESOS-7921) process::EventQueue sometimes crashes

2017-09-06 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155706#comment-16155706 ] Yan Xu commented on MESOS-7921: --- Tried out the patch and it seemed to work. I had run mesos-

[jira] [Commented] (MESOS-7921) process::EventQueue sometimes crashes

2017-09-01 Thread Yan Xu (JIRA)
[ https://issues.apache.org/jira/browse/MESOS-7921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151119#comment-16151119 ] Yan Xu commented on MESOS-7921: --- So libprocess GC would delete the managed process upon thei
