[jira] [Commented] (MESOS-7971) PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove test if flaky

2018-05-09 Thread Chun-Hung Hsiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469885#comment-16469885
 ] 

Chun-Hung Hsiao commented on MESOS-7971:


Failed on Apache CI: 
https://builds.apache.org/job/Mesos-Buildbot/5273/BUILDTOOL=cmake,COMPILER=clang,CONFIGURATION=--verbose%20--disable-libtool-wrappers,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1%20MESOS_TEST_AWAIT_TIMEOUT=60secs,OS=ubuntu%3A16.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)&&(!H21)&&(!H23)&&(!H26)&&(!H27)/console

> PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove test if flaky
> -
>
> Key: MESOS-7971
> URL: https://issues.apache.org/jira/browse/MESOS-7971
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 1.4.0, 1.6.0
>Reporter: Vinod Kone
>Priority: Major
>  Labels: flaky-test, mesosphere
>
> Saw this when testing 1.4.0-rc5
> {code}
> [ RUN  ] PersistentVolumeEndpointsTest.EndpointCreateThenOfferRemove
> I0912 05:40:27.335222 30860 cluster.cpp:162] Creating default 'local' 
> authorizer
> I0912 05:40:27.338429 30867 master.cpp:442] Master 
> 2bd1e8eb-e314-4181-9ed3-d397ec1dbede (6aa774430302) started on 
> 172.17.0.3:54639
> I0912 05:40:27.338472 30867 master.cpp:444] Flags at startup: --acls="" 
> --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" 
> --allocation_interval="50ms" --allocator="HierarchicalDRF" 
> --authenticate_agents="true" --authenticate_frameworks="true" 
> --authenticate_http_frameworks="true" --authenticate_http_readonly="true" 
> --authenticate_http_readwrite="true" --authenticators="crammd5" 
> --authorizers="local" --credentials="/tmp/hH0YXe/credentials" 
> --filter_gpu_resources="true" --framework_sorter="drf" --help="false" 
> --hostname_lookup="true" --http_authenticators="basic" 
> --http_framework_authenticators="basic" --initialize_driver_logging="true" 
> --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" 
> --max_agent_ping_timeouts="5" --max_completed_frameworks="50" 
> --max_completed_tasks_per_framework="1000" 
> --max_unreachable_tasks_per_framework="1000" --port="5050" --quiet="false" 
> --recovery_agent_removal_limit="100%" --registry="in_memory" 
> --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" 
> --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" 
> --registry_store_timeout="100secs" --registry_strict="false" --roles="role1" 
> --root_submissions="true" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/hH0YXe/master" 
> --zk_session_timeout="10secs"
> I0912 05:40:27.338778 30867 master.cpp:494] Master only allowing 
> authenticated frameworks to register
> I0912 05:40:27.338788 30867 master.cpp:508] Master only allowing 
> authenticated agents to register
> I0912 05:40:27.338793 30867 master.cpp:521] Master only allowing 
> authenticated HTTP frameworks to register
> I0912 05:40:27.338799 30867 credentials.hpp:37] Loading credentials for 
> authentication from '/tmp/hH0YXe/credentials'
> I0912 05:40:27.353009 30867 master.cpp:566] Using default 'crammd5' 
> authenticator
> I0912 05:40:27.353183 30867 http.cpp:1026] Creating default 'basic' HTTP 
> authenticator for realm 'mesos-master-readonly'
> I0912 05:40:27.353364 30867 http.cpp:1026] Creating default 'basic' HTTP 
> authenticator for realm 'mesos-master-readwrite'
> I0912 05:40:27.353482 30867 http.cpp:1026] Creating default 'basic' HTTP 
> authenticator for realm 'mesos-master-scheduler'
> I0912 05:40:27.353588 30867 master.cpp:646] Authorization enabled
> W0912 05:40:27.353605 30867 master.cpp:709] The '--roles' flag is deprecated. 
> This flag will be removed in the future. See the Mesos 0.27 upgrade notes for 
> more information
> I0912 05:40:27.353742 30868 hierarchical.cpp:171] Initialized hierarchical 
> allocator process
> I0912 05:40:27.353775 30872 whitelist_watcher.cpp:77] No whitelist given
> I0912 05:40:27.356655 30873 master.cpp:2163] Elected as the leading master!
> I0912 05:40:27.356675 30873 master.cpp:1702] Recovering from registrar
> I0912 05:40:27.356868 30874 registrar.cpp:347] Recovering registrar
> I0912 05:40:27.357390 30874 registrar.cpp:391] Successfully fetched the 
> registry (0B) in 494080ns
> I0912 05:40:27.357483 30874 registrar.cpp:495] Applied 1 operations in 
> 31911ns; attempting to update the registry
> I0912 05:40:27.357919 30874 registrar.cpp:552] Successfully updated the 
> registry in 391936ns
> I0912 05:40:27.358018 30874 registrar.cpp:424] Successfully recovered 
> registrar
> I0912 05:40:27.358413 30868 master.cpp:1801] Recovered 0 agents from the 
> registry (129B); allowing 10mins for agents to re-register
> I0912 05:40:27.358482 30867 hierarchical.cpp:209] Skipping recovery of 
> hierarchical allocator: 

[jira] [Created] (MESOS-8901) os::children in stout is unnecessarily expensive

2018-05-09 Thread Jie Yu (JIRA)
Jie Yu created MESOS-8901:
-

 Summary: os::children in stout is unnecessarily expensive
 Key: MESOS-8901
 URL: https://issues.apache.org/jira/browse/MESOS-8901
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: Jie Yu


It uses os::processes, which gets all process information from the proc 
filesystem. Essentially, we just need the pid. This call is used in the container 
launch path, so we should consider optimizing it.
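
A minimal sketch of the idea, not the stout implementation: finding children only needs each process's pid and parent pid, so the lookup can work from (pid, ppid) pairs alone. The names `Pid`, `ParentMap` and `childrenOf` are hypothetical, and collecting the pairs (for example, from the fourth field of `/proc/<pid>/stat` on Linux) is left out.

```cpp
#include <map>
#include <queue>
#include <set>

// Hypothetical stand-in for pid_t, to keep the sketch portable.
using Pid = int;

// Hypothetical: maps each pid to its parent pid.
using ParentMap = std::map<Pid, Pid>;

// Returns the direct children of `pid`, or all descendants if `recursive`.
std::set<Pid> childrenOf(Pid pid, const ParentMap& parents, bool recursive)
{
  // Invert the map once: parent -> direct children.
  std::multimap<Pid, Pid> direct;
  for (const auto& entry : parents) {
    direct.emplace(entry.second, entry.first);
  }

  std::set<Pid> result;
  std::queue<Pid> pending;
  pending.push(pid);

  while (!pending.empty()) {
    Pid current = pending.front();
    pending.pop();

    auto range = direct.equal_range(current);
    for (auto it = range.first; it != range.second; ++it) {
      // Only descend into pids that have not been seen before.
      if (result.insert(it->second).second && recursive) {
        pending.push(it->second);
      }
    }
  }

  return result;
}
```

With parents `{2→1, 3→1, 4→2}`, `childrenOf(1, parents, false)` yields `{2, 3}` and the recursive call also includes `4`.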



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (MESOS-8900) Container statistics should be exposed as Metrics.

2018-05-09 Thread Jie Yu (JIRA)
Jie Yu created MESOS-8900:
-

 Summary: Container statistics should be exposed as Metrics.
 Key: MESOS-8900
 URL: https://issues.apache.org/jira/browse/MESOS-8900
 Project: Mesos
  Issue Type: Improvement
Reporter: Jie Yu


Currently, those are not exposed as metrics, but rather in a customized format:
https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L1642

It would be nice to expose those as metrics with proper labeling. For instance, 
using the Prometheus format:
https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md

That way, it'll be much easier to plug into a metrics system (e.g., 
Prometheus). Also, some stats like xxx_p50, xxx_p99 can be abstracted into a 
more standard metrics concept such as a quantile.
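
A minimal sketch of what a labeled sample looks like in the Prometheus text exposition format (`name{label="value",...} value`). The function names and the metric and label names used with them are hypothetical; nothing here is an existing Mesos API.

```cpp
#include <map>
#include <sstream>
#include <string>

// Escapes a label value as the text format requires:
// backslash, double quote, and line feed.
std::string escapeLabelValue(const std::string& value)
{
  std::string escaped;
  for (char c : value) {
    switch (c) {
      case '\\': escaped += "\\\\"; break;
      case '"':  escaped += "\\\""; break;
      case '\n': escaped += "\\n";  break;
      default:   escaped += c;      break;
    }
  }
  return escaped;
}

// Renders one sample line: name{label="value",...} value
std::string formatSample(
    const std::string& name,
    const std::map<std::string, std::string>& labels,
    double value)
{
  std::ostringstream out;
  out << name;
  if (!labels.empty()) {
    out << '{';
    bool first = true;
    for (const auto& label : labels) {
      if (!first) {
        out << ',';
      }
      out << label.first << "=\"" << escapeLabelValue(label.second) << '"';
      first = false;
    }
    out << '}';
  }
  out << ' ' << value;
  return out.str();
}
```

A stat such as xxx_p50 could then become one sample of a single metric carrying a `quantile="0.5"` label, rather than a separately named field.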





[jira] [Created] (MESOS-8899) Add benchmarks to test framework/allocator scalability

2018-05-09 Thread Kapil Arya (JIRA)
Kapil Arya created MESOS-8899:
-

 Summary: Add benchmarks to test framework/allocator scalability
 Key: MESOS-8899
 URL: https://issues.apache.org/jira/browse/MESOS-8899
 Project: Mesos
  Issue Type: Task
Reporter: Kapil Arya


The benchmark should launch several test frameworks to measure how frequently 
offers are generated and delivered to frameworks. In particular, this will test 
whether some frameworks are starving because others decline offers with a 
smaller timeout.





[jira] [Comment Edited] (MESOS-8798) Build the "unsecure" gRPC libraries to remove SSL dependency.

2018-05-09 Thread Chun-Hung Hsiao (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468235#comment-16468235
 ] 

Chun-Hung Hsiao edited comment on MESOS-8798 at 5/9/18 11:14 PM:
-

Reviews:
https://reviews.apache.org/r/67017/
https://reviews.apache.org/r/67018/
https://reviews.apache.org/r/66996/
https://reviews.apache.org/r/66997/


was (Author: chhsia0):
Reviews:
https://reviews.apache.org/r/67017/
https://reviews.apache.org/r/67018/
https://reviews.apache.org/r/66996/
https://reviews.apache.org/r/66997/

> Build the "unsecure" gRPC libraries to remove SSL dependency.
> -
>
> Key: MESOS-8798
> URL: https://issues.apache.org/jira/browse/MESOS-8798
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 1.4.1, 1.5.0, 1.6.0
>Reporter: Chun-Hung Hsiao
>Assignee: Chun-Hung Hsiao
>Priority: Critical
>  Labels: mesosphere, storage
> Fix For: 1.7.0
>
>
> gRPC can be built without SSL (the "unsecure" libraries) so we should use 
> these libraries to avoid a build dependency between gRPC and OpenSSL.





[jira] [Commented] (MESOS-8857) Fix subprocess(flags) logic on Windows to handle arguments with quotes

2018-05-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469564#comment-16469564
 ] 

ASF GitHub Bot commented on MESOS-8857:
---

Github user radhikaj closed the pull request at:

https://github.com/apache/mesos/pull/288


> Fix subprocess(flags) logic on Windows to handle arguments with quotes
> --
>
> Key: MESOS-8857
> URL: https://issues.apache.org/jira/browse/MESOS-8857
> Project: Mesos
>  Issue Type: Bug
>Reporter: Andrew Schwartzmeyer
>Assignee: Radhika Jandhyala
>Priority: Major
>  Labels: flags, libprocess, windows
>
> In the {{SubprocessTest.Flags}} unit test, a bug was discovered where the 
> flags argument {{flags.s3 = "\"geek\"";}} does not round-trip back to 
> the test. This is because the {{stringify_args}} logic in {{shell.hpp}} 
> purposefully (correctly?) surrounds an argument that contains a double quote 
> with a pair of double quotes. Thus the final command-line flag looks like 
> {{"\"--s3=\\\"geek\\\"\""}}, which {{flags.load()}} then fails to reparse. 
> The same problem occurs for the (more complicated) JSON flag.
> I believe this is because the original logic was expecting the shell to drop 
> the quotes: {{echo "--s3=\"geek\""}} in Bash prints {{--s3="geek"}}, but 
> {{cmd.exe}} echoes {{"--s3=\"geek\""}}, exactly what was passed. Maybe the 
> test just needs to be fixed; maybe the stringifier shouldn't add more quotes; 
> maybe {{flags.load()}} needs to parse the quotes and escapes.
> For now, we're enabling the rest of the test by turning off those two checks.
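
The quoting described above can be sketched as follows, together with the inverse step a parser would need for the value to round-trip. This assumes the documented backslash and double-quote rules of `CommandLineToArgvW`. The names `quoteArg` and `unquoteArg` are hypothetical and are not the `stringify_args` or `flags.load()` code; unlike the logic described above, this sketch always adds the surrounding quotes.

```cpp
#include <cstddef>
#include <string>

// Quotes one argument so that a Windows command-line splitter sees it as a
// single argument with its embedded double quotes preserved.
std::string quoteArg(const std::string& arg)
{
  std::string quoted = "\"";
  std::size_t backslashes = 0;
  for (char c : arg) {
    if (c == '\\') {
      ++backslashes;
      continue;
    }
    if (c == '"') {
      // Double the pending backslashes, then escape the quote itself.
      quoted.append(backslashes * 2 + 1, '\\');
    } else {
      // Backslashes not followed by a quote stay literal.
      quoted.append(backslashes, '\\');
    }
    quoted += c;
    backslashes = 0;
  }
  // Trailing backslashes are doubled so they do not escape the closing quote.
  quoted.append(backslashes * 2, '\\');
  quoted += '"';
  return quoted;
}

// Inverse of quoteArg, valid only for a single argument produced by it.
std::string unquoteArg(const std::string& quoted)
{
  std::string arg;
  std::size_t backslashes = 0;
  // Skip the surrounding quotes added by quoteArg.
  for (std::size_t i = 1; i + 1 < quoted.size(); ++i) {
    char c = quoted[i];
    if (c == '\\') {
      ++backslashes;
      continue;
    }
    if (c == '"') {
      // 2n+1 backslashes before a quote: n backslashes and a literal quote.
      arg.append(backslashes / 2, '\\');
      if (backslashes % 2 == 1) {
        arg += '"';
      }
    } else {
      arg.append(backslashes, '\\');
      arg += c;
    }
    backslashes = 0;
  }
  // Backslashes before the closing quote were doubled by quoteArg.
  arg.append(backslashes / 2, '\\');
  return arg;
}
```

For the flag in the report, `quoteArg("--s3=\"geek\"")` produces the string written above as `"\"--s3=\\\"geek\\\"\""`, and `unquoteArg` recovers the original value.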





[jira] [Commented] (MESOS-8857) Fix subprocess(flags) logic on Windows to handle arguments with quotes

2018-05-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469563#comment-16469563
 ] 

ASF GitHub Bot commented on MESOS-8857:
---

GitHub user radhikaj opened a pull request:

https://github.com/apache/mesos/pull/288

Fixtestflags

https://issues.apache.org/jira/browse/MESOS-8857



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/radhikaj/mesos fixtestflags

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/mesos/pull/288.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #288


commit 9d895daf4bcd27e6929b03a95e7cef8002876f59
Author: Radhika Jandhyala 
Date:   2018-05-09T21:09:34Z

Fix TEST_F(SubprocessTest, Flags)

commit 0c60122764d5923c8e7255516042bd26787a1ff5
Author: Radhika Jandhyala 
Date:   2018-05-09T21:51:20Z

run clang-format









[jira] [Created] (MESOS-8898) Move scheduler HTTP API parsing out of master process

2018-05-09 Thread Dario Rexin (JIRA)
Dario Rexin created MESOS-8898:
--

 Summary: Move scheduler HTTP API parsing out of master process
 Key: MESOS-8898
 URL: https://issues.apache.org/jira/browse/MESOS-8898
 Project: Mesos
  Issue Type: Task
  Components: HTTP API, master
Affects Versions: 1.5.0
Reporter: Dario Rexin


Today all calls sent to Mesos via the v1 API are parsed in the master process, 
even though none of the information contained in that process is required for 
this. The master process is already doing the majority of the work and is the 
biggest bottleneck, so anything that can be done elsewhere would improve the 
situation. Parsing of the calls (and some validation) could be done in separate 
processes and in parallel for multiple frameworks, which should improve overall 
responsiveness and throughput of the master process.





[jira] [Assigned] (MESOS-8857) Fix subprocess(flags) logic on Windows to handle arguments with quotes

2018-05-09 Thread Radhika Jandhyala (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Radhika Jandhyala reassigned MESOS-8857:


Assignee: Radhika Jandhyala






[jira] [Commented] (MESOS-8895) Document known differences between Docker and Mesos containerizers

2018-05-09 Thread Gilbert Song (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469467#comment-16469467
 ] 

Gilbert Song commented on MESOS-8895:
-

1. Default working directory: the sandbox in Mesos vs. the root directory in Docker.
2. Host volume with an absolute host path: the path must already exist in Mesos, 
whereas Docker creates it (mkdir) if it does not exist.

> Document known differences between Docker and Mesos containerizers
> --
>
> Key: MESOS-8895
> URL: https://issues.apache.org/jira/browse/MESOS-8895
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, documentation
>Reporter: Greg Mann
>Priority: Major
>  Labels: documentaion
>
> To help developers who create applications which may be run using both the 
> Docker and Mesos containerizers, we should document all known differences 
> from a developer's perspective.
> For example:
> * By default, Docker starts the task process in {{/}} while Mesos 
> containerizer starts the task process in the Mesos task sandbox.
> * When {{command.shell}} is {{false}}, Mesos containerizer launches the 
> command via {{execvp}} while Docker passes the command as arguments to the 
> Docker CLI. One difference this causes is that {{$@}} will be empty when the 
> Mesos containerizer is used and the executable is a bash script.





[jira] [Commented] (MESOS-8897) ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill is flaky

2018-05-09 Thread Harold Dost III (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469186#comment-16469186
 ] 

Harold Dost III commented on MESOS-8897:


[~jpe...@apache.org] Do we want to round up then? Or do we want to modify the 
test to check for greater than or equal to?

To round up, we can do:
{code}
   // If the soft limit is exceeded the container should be killed.
   if (quotaInfo->used > quotaInfo->softLimit) {
     Resource resource;
     resource.set_name("disk");
     resource.set_type(Value::SCALAR);
     resource.mutable_scalar()->set_value(
-        quotaInfo->used.bytes() / Bytes::MEGABYTES);
+        quotaInfo->used.bytes() / Bytes::MEGABYTES +
+        (quotaInfo->used.bytes() % Bytes::MEGABYTES > 0 ? 1 : 0));

     info->limitation.set(
         protobuf::slave::createContainerLimitation(
             Resources(resource),
             "Disk usage (" + stringify(quotaInfo->used) +
             ") exceeds quota (" +
             stringify(quotaInfo->softLimit) + ")",
             TaskStatus::REASON_CONTAINER_LIMITATION_DISK));
   }
 }
{code}
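
The rounding in the snippet above can be checked in isolation. This is a minimal sketch with plain integers; `MEGABYTES` is a stand-in for `Bytes::MEGABYTES` and is assumed to be 2^20 bytes, and both function names are hypothetical.

```cpp
#include <cstdint>

// Stand-in for Bytes::MEGABYTES; assumed to be 2^20 bytes.
constexpr std::uint64_t MEGABYTES = 1024 * 1024;

// Truncating conversion, as in the existing code.
constexpr std::uint64_t floorMegabytes(std::uint64_t bytes)
{
  return bytes / MEGABYTES;
}

// Rounds any partial megabyte up to the next whole megabyte.
constexpr std::uint64_t ceilMegabytes(std::uint64_t bytes)
{
  return bytes / MEGABYTES + (bytes % MEGABYTES > 0 ? 1 : 0);
}
```

One byte over 1MB is reported as 1MB by the truncating version and as 2MB by the rounded-up one, while exact multiples are unchanged.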

> ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill is flaky
> -
>
> Key: MESOS-8897
> URL: https://issues.apache.org/jira/browse/MESOS-8897
> Project: Mesos
>  Issue Type: Bug
>  Components: flaky, test
>Reporter: Yan Xu
>Priority: Major
>
> {noformat:title=}
> [ RUN ] ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill
> meta-data=/dev/loop0 isize=256 agcount=2, agsize=5120 blks
>  = sectsz=512 attr=2, projid32bit=1
>  = crc=0
> data = bsize=4096 blocks=10240, imaxpct=25
>  = sunit=0 swidth=0 blks
> naming =version 2 bsize=4096 ascii-ci=0
> log =internal log bsize=4096 blocks=1200, version=2
>  = sectsz=512 sunit=0 blks, lazy-count=1
> realtime =none extsz=4096 blocks=0, rtextents=0
> I0508 17:55:12.353438 13453 exec.cpp:162] Version: 1.7.0
> I0508 17:55:12.370332 13451 exec.cpp:236] Executor registered on agent 
> 49668ffa-2a69-4867-b31a-4972b4ac13d2-S0
> I0508 17:55:12.376093 13447 executor.cpp:178] Received SUBSCRIBED event
> I0508 17:55:12.376771 13447 executor.cpp:182] Subscribed executor on 
> mesos.vagrant
> I0508 17:55:12.377038 13447 executor.cpp:178] Received LAUNCH event
> I0508 17:55:12.381901 13447 executor.cpp:665] Starting task 
> edb798b4-1b16-4de4-828c-0db132df70ab
> I0508 17:55:12.387936 13447 executor.cpp:485] Running 
> '/tmp/mesos-build/mesos/build/src/mesos-containerizer launch 
> '
> I0508 17:55:12.392854 13447 executor.cpp:678] Forked command at 13456
> 2+0 records in
> 2+0 records out
> 2097152 bytes (2.1 MB) copied, 0.00404074 s, 519 MB/s
> ../../src/tests/containerizer/xfs_quota_tests.cpp:618: Failure
> Expected: (limit.disk().get()) > (Megabytes(1)), actual: 1MB vs 1MB
> [ FAILED ] ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill (1182 ms)
> {noformat}
> [~jpe...@apache.org] mentioned that 
> {code}
>   // If the soft limit is exceeded the container should be killed.
>   if (quotaInfo->used > quotaInfo->softLimit) {
>     Resource resource;
>     resource.set_name("disk");
>     resource.set_type(Value::SCALAR);
>     resource.mutable_scalar()->set_value(
>         quotaInfo->used.bytes() / Bytes::MEGABYTES);
>
>     info->limitation.set(
>         protobuf::slave::createContainerLimitation(
>             Resources(resource),
>             "Disk usage (" + stringify(quotaInfo->used) +
>             ") exceeds quota (" +
>             stringify(quotaInfo->softLimit) + ")",
>             TaskStatus::REASON_CONTAINER_LIMITATION_DISK));
>   }
> }
> {code}
> Converting to MB rounds down, so we report less space than was actually 
> used.





[jira] [Commented] (MESOS-8897) ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill is flaky

2018-05-09 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-8897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469112#comment-16469112
 ] 

Yan Xu commented on MESOS-8897:
---

cc [~hdost]






[jira] [Created] (MESOS-8897) ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill is flaky

2018-05-09 Thread Yan Xu (JIRA)
Yan Xu created MESOS-8897:
-

 Summary: ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill is flaky
 Key: MESOS-8897
 URL: https://issues.apache.org/jira/browse/MESOS-8897
 Project: Mesos
  Issue Type: Bug
  Components: flaky, test
Reporter: Yan Xu


{noformat:title=}
[ RUN ] ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill
meta-data=/dev/loop0 isize=256 agcount=2, agsize=5120 blks
 = sectsz=512 attr=2, projid32bit=1
 = crc=0
data = bsize=4096 blocks=10240, imaxpct=25
 = sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0
log =internal log bsize=4096 blocks=1200, version=2
 = sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
I0508 17:55:12.353438 13453 exec.cpp:162] Version: 1.7.0
I0508 17:55:12.370332 13451 exec.cpp:236] Executor registered on agent 
49668ffa-2a69-4867-b31a-4972b4ac13d2-S0
I0508 17:55:12.376093 13447 executor.cpp:178] Received SUBSCRIBED event
I0508 17:55:12.376771 13447 executor.cpp:182] Subscribed executor on 
mesos.vagrant
I0508 17:55:12.377038 13447 executor.cpp:178] Received LAUNCH event
I0508 17:55:12.381901 13447 executor.cpp:665] Starting task 
edb798b4-1b16-4de4-828c-0db132df70ab
I0508 17:55:12.387936 13447 executor.cpp:485] Running 
'/tmp/mesos-build/mesos/build/src/mesos-containerizer launch 
'
I0508 17:55:12.392854 13447 executor.cpp:678] Forked command at 13456
2+0 records in
2+0 records out
2097152 bytes (2.1 MB) copied, 0.00404074 s, 519 MB/s
../../src/tests/containerizer/xfs_quota_tests.cpp:618: Failure
Expected: (limit.disk().get()) > (Megabytes(1)), actual: 1MB vs 1MB
[ FAILED ] ROOT_XFS_QuotaTest.DiskUsageExceedsQuotaWithKill (1182 ms)
{noformat}

[~jpe...@apache.org] mentioned that 

{code}
  // If the soft limit is exceeded the container should be killed.
  if (quotaInfo->used > quotaInfo->softLimit) {
    Resource resource;
    resource.set_name("disk");
    resource.set_type(Value::SCALAR);
    resource.mutable_scalar()->set_value(
        quotaInfo->used.bytes() / Bytes::MEGABYTES);

    info->limitation.set(
        protobuf::slave::createContainerLimitation(
            Resources(resource),
            "Disk usage (" + stringify(quotaInfo->used) +
            ") exceeds quota (" +
            stringify(quotaInfo->softLimit) + ")",
            TaskStatus::REASON_CONTAINER_LIMITATION_DISK));
  }
}
{code}

Converting to MB rounds down, so we report less space than was actually 
used.
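
The failure can be reproduced with plain numbers. This is a minimal sketch, not the isolator code: `kMegabyte` mirrors `Bytes::MEGABYTES` and is assumed to be 2^20 bytes, all function names are hypothetical, and whether a fractional scalar would be acceptable downstream is not established here.

```cpp
#include <cstdint>

// Assumed size of one megabyte (2^20 bytes), mirroring Bytes::MEGABYTES.
constexpr std::uint64_t kMegabyte = 1024 * 1024;

// Integer division drops the fractional megabyte.
constexpr double truncatedMegabytes(std::uint64_t usedBytes)
{
  return static_cast<double>(usedBytes / kMegabyte);
}

// Dividing as floating point keeps it.
constexpr double fractionalMegabytes(std::uint64_t usedBytes)
{
  return static_cast<double>(usedBytes) / static_cast<double>(kMegabyte);
}

// Mirrors the shape of the failing check: reported usage must exceed the limit.
constexpr bool exceedsLimit(double reportedMegabytes, double limitMegabytes)
{
  return reportedMegabytes > limitMegabytes;
}
```

With 1.5MB used against a 1MB limit, the truncated value is 1, so the strict comparison fails exactly as in the log above ("1MB vs 1MB"), while the fractional value of 1.5 passes.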





[jira] [Assigned] (MESOS-8836) Add a tests of recovery of the resource provider manager registrars.

2018-05-09 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-8836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-8836:
---

Assignee: Benjamin Bannier

> Add a tests of recovery of the resource provider manager registrars.
> 
>
> Key: MESOS-8836
> URL: https://issues.apache.org/jira/browse/MESOS-8836
> Project: Mesos
>  Issue Type: Bug
>  Components: test
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>Priority: Major
>
> See https://reviews.apache.org/r/66545/





[jira] [Created] (MESOS-8896) 'ZooKeeperMasterContenderDetectorTest.NonRetryableFrrors' is flaky

2018-05-09 Thread Jan Schlicht (JIRA)
Jan Schlicht created MESOS-8896:
---

 Summary: 'ZooKeeperMasterContenderDetectorTest.NonRetryableFrrors' 
is flaky
 Key: MESOS-8896
 URL: https://issues.apache.org/jira/browse/MESOS-8896
 Project: Mesos
  Issue Type: Bug
  Components: flaky
Reporter: Jan Schlicht


This was a test failure on macOS with SSL enabled. Not sure yet if other 
systems might be affected as well:
{noformat}
[ RUN  ] ZooKeeperMasterContenderDetectorTest.NonRetryableFrrors
I0509 01:36:35.181434 2992141120 zookeeper_test_server.cpp:156] Started 
ZooKeeperTestServer on port 58450
2018-05-09 01:36:35,181:44641(0x79f15000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-05-09 01:36:35,181:44641(0x79f15000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-05-09 01:36:35,181:44641(0x79f15000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-05-09 01:36:35,181:44641(0x79f15000):ZOO_INFO@log_env@765: Client 
environment:os.arch=17.4.0
2018-05-09 01:36:35,181:44641(0x79f15000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 
2017; root:xnu-4570.41.2~1/RELEASE_X86_64
2018-05-09 01:36:35,181:44641(0x79f15000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-05-09 01:36:35,181:44641(0x79f15000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-05-09 01:36:35,181:44641(0x79f15000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-05-09 01:36:35,181:44641(0x79f15000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:58450 sessionTimeout=1 
watcher=0x1148b6680 sessionId=0 sessionPasswd= context=0x7fe697de7590 
flags=0
2018-05-09 01:36:35,182:44641(0x7aa42000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:58450]
2018-05-09 01:36:35,185:44641(0x7aa42000):ZOO_INFO@check_events@1811: 
session establishment complete on server [127.0.0.1:58450], 
sessionId=0x163440b82ec, negotiated timeout=1
I0509 01:36:35.186167 167882752 group.cpp:341] Group process 
(zookeeper-group(14)@10.0.49.4:57595) connected to ZooKeeper
I0509 01:36:35.186213 167882752 group.cpp:831] Syncing group operations: queue 
size (joins, cancels, datas) = (1, 0, 0)
I0509 01:36:35.186226 167882752 group.cpp:395] Authenticating with ZooKeeper 
using digest
2018-05-09 
01:36:38,534:44641(0x7aa42000):ZOO_INFO@auth_completion_func@1327: 
Authentication scheme digest succeeded
I0509 01:36:38.534493 167882752 group.cpp:419] Trying to create path '/mesos' 
in ZooKeeper
2018-05-09 01:36:38,540:44641(0x7a121000):ZOO_INFO@log_env@753: Client 
environment:zookeeper.version=zookeeper C client 3.4.8
2018-05-09 01:36:38,540:44641(0x7a121000):ZOO_INFO@log_env@757: Client 
environment:host.name=Jenkinss-Mac-mini.local
2018-05-09 01:36:38,540:44641(0x7a121000):ZOO_INFO@log_env@764: Client 
environment:os.name=Darwin
2018-05-09 01:36:38,540:44641(0x7a121000):ZOO_INFO@log_env@765: Client 
environment:os.arch=17.4.0
2018-05-09 01:36:38,540:44641(0x7a121000):ZOO_INFO@log_env@766: Client 
environment:os.version=Darwin Kernel Version 17.4.0: Sun Dec 17 09:19:54 PST 
2017; root:xnu-4570.41.2~1/RELEASE_X86_64
2018-05-09 01:36:38,540:44641(0x7a121000):ZOO_INFO@log_env@774: Client 
environment:user.name=jenkins
2018-05-09 01:36:38,540:44641(0x7a121000):ZOO_INFO@log_env@782: Client 
environment:user.home=/Users/jenkins
2018-05-09 01:36:38,540:44641(0x7a121000):ZOO_INFO@log_env@794: Client 
environment:user.dir=/Users/jenkins/workspace/workspace/mesos/Mesos_CI-build/FLAG/SSL/label/mac/mesos/build
2018-05-09 01:36:38,540:44641(0x7a121000):ZOO_INFO@zookeeper_init@827: 
Initiating client connection, host=127.0.0.1:58450 sessionTimeout=1 
watcher=0x1148b6680 sessionId=0 sessionPasswd= context=0x7fe6999c1fe0 
flags=0
I0509 01:36:38.540652 166273024 contender.cpp:152] Joining the ZK group
2018-05-09 01:36:38,540:44641(0x7b463000):ZOO_INFO@check_events@1764: 
initiated connection to server [127.0.0.1:58450]
2018-05-09 01:36:38,542:44641(0x7b463000):ZOO_INFO@check_events@1811: 
session establishment complete on server [127.0.0.1:58450], 
sessionId=0x163440b82ec0001, negotiated timeout=1
I0509 01:36:38.542425 168955904 group.cpp:341] Group process 
(zookeeper-group(15)@10.0.49.4:57595) connected to ZooKeeper
I0509 01:36:38.542466 168955904 group.cpp:831] Syncing group operations: queue 
size (joins, cancels, datas) = (1, 0, 0)
I0509 01:36:38.542480 168955904 group.cpp:395] Authenticating with ZooKeeper 
using digest
2018-05-09 01:36:50,559:44641(0x7aa42000):ZOO_WARN@zookeeper_interest@1597: 
Exceeded deadline by 8687ms
2018-05-09