[jira] [Commented] (MESOS-9080) Port mapping isolator leaks ephemeral ports when a container is destroyed during preparation
[ https://issues.apache.org/jira/browse/MESOS-9080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16751362#comment-16751362 ] Ilya Pronin commented on MESOS-9080: [67936| https://reviews.apache.org/r/67936/] should fix the problem. We can close this. > Port mapping isolator leaks ephemeral ports when a container is destroyed > during preparation > > > Key: MESOS-9080 > URL: https://issues.apache.org/jira/browse/MESOS-9080 > Project: Mesos > Issue Type: Bug > Components: containerization >Affects Versions: 1.6.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Major > > The {{network/port_mapping}} isolator leaks ephemeral ports during container > cleanup if {{Isolator::isolate()}} was not called, i.e. the container is > being destroyed during preparation. If the isolator doesn't know the main > container's PID, it skips filter cleanup (filters should not exist in this case) > and ephemeral port deallocation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
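The cleanup ordering described in the bug can be sketched with a small toy model (Python rather than the isolator's actual C++; `EphemeralPortAllocator`, `cleanup_buggy`, and `cleanup_fixed` are illustrative names, not Mesos APIs):

```python
class EphemeralPortAllocator:
    """Toy stand-in for the isolator's ephemeral port range pool."""

    def __init__(self, ranges):
        self.free = set(ranges)
        self.allocated = {}  # container id -> port range

    def allocate(self, container_id):
        r = self.free.pop()
        self.allocated[container_id] = r
        return r

    def deallocate(self, container_id):
        r = self.allocated.pop(container_id, None)
        if r is not None:
            self.free.add(r)


def cleanup_buggy(allocator, container_id, pid):
    # Mirrors the reported bug: when the PID is unknown (isolate() was
    # never called), the whole cleanup path is skipped, including the
    # ephemeral port deallocation.
    if pid is None:
        return
    # ... remove traffic control filters here ...
    allocator.deallocate(container_id)


def cleanup_fixed(allocator, container_id, pid):
    # Filters can only exist once isolate() ran, so skip just that step.
    if pid is not None:
        pass  # ... remove traffic control filters here ...
    allocator.deallocate(container_id)  # always reclaim the port range
```

The point of the fix is simply that port deallocation is unconditional, while filter removal stays conditional on knowing the container's PID.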
[jira] [Commented] (MESOS-9476) XFS project IDs aren't released upon task completion
[ https://issues.apache.org/jira/browse/MESOS-9476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16720643#comment-16720643 ] Ilya Pronin commented on MESOS-9476: This change was introduced in 1.7. The isolator now periodically (every {{\-\-disk_watch_interval}}) checks which container sandboxes and persistent volumes were removed (e.g. by disk GC) and reclaims their project IDs. The main reason for doing so was the fact that project IDs cannot be removed from symlinks, which may lead to weird accounting. Also, currently isolators don't get notified when a persistent volume is removed, so {{disk/xfs}} can only do periodic scans to reclaim volume project IDs. See MESOS-5158 and MESOS-9007 for more information. XFS project IDs are 16- or 32-bit integers, so usually there should be plenty of them available. Can you give your Mesos agents a larger ID range? > XFS project IDs aren't released upon task completion > > > Key: MESOS-9476 > URL: https://issues.apache.org/jira/browse/MESOS-9476 > Project: Mesos > Issue Type: Bug > Components: agent >Affects Versions: 1.7.0 > Environment: Centos 7.1 > Mesos 1.7 >Reporter: Omar AitMous >Priority: Major > Attachments: Vagrantfile, build.sh > > > The XFS isolation doesn't release project IDs when a task finishes on Mesos > 1.7 (branch 1.7.x), and once all project IDs are taken, scheduling new tasks > fails with: > {code:java} > Failed to assign project ID, range exhausted > {code} > > Attached is a vagrant configuration that sets up a VM with an XFS disk > (mounted on /var/opt/mesos), zookeeper 3.4.12, mesos 1.7 and marathon 1.6. 
> Once the box is ready, start zookeeper, mesos-master, mesos-agent (using the > XFS disk) and marathon: > {code:java} > sudo bin/zkServer.sh start > sudo /home/vagrant/mesos/build/bin/mesos-master.sh --ip=192.168.33.10 > --work_dir=/mnt/mesos > sudo /home/vagrant/mesos/build/bin/mesos-agent.sh --master=192.168.33.10:5050 > --work_dir=/var/opt/mesos --enforce_container_disk_quota --isolation=disk/xfs > --xfs_project_range=[5000-5009] > sudo > MESOS_NATIVE_JAVA_LIBRARY="/home/vagrant/mesos/build/src/.libs/libmesos.so" > sbt 'run --master 192.168.33.10:5050 --zk zk://localhost:2181/marathon' > {code} > > Create an app on marathon, for example: > {code:java} > {"id": "/test", "cmd": "sleep 3600", "cpus": 0.01, "mem": 32, "disk": 1, > "instances": 5} > {code} > > You should see 5 project IDs being used: > {code:java} > $ sudo xfs_quota -x -c "report -a -n -L 5000 -U 5009" | grep '^#[1-9][0-9]*' > #5000 4 1024 1024 00 [] > #5001 4 1024 1024 00 [] > #5002 4 1024 1024 00 [] > #5003 4 1024 1024 00 [] > #5004 4 1024 1024 00 [] > {code} > > If you scale down to 0 instances, the project IDs aren't released. > If you scale back up to 8 instances, only 5 of them will start, the remaining > 3 will fail with errors like this: > {code:java} > E1213 14:38:36.190430 20813 slave.cpp:6204] Container > '064b8a6b-c42d-4905-b2a7-632318aa2b83' for executor > 'test.c5e88a67-fee4-11e8-9cc6-0800278a1a98' of framework > 0473e272-04f7-4b1d-ae1d-f7177940e295- failed to start: Failed to assign > project ID, range exhausted > {code} > > I've tested on Mesos 1.4, the project IDs are properly released when the task > finishes. > (I haven't tested other versions) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (MESOS-9451) Libprocess endpoints can ignore required gzip compression
[ https://issues.apache.org/jira/browse/MESOS-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16709355#comment-16709355 ] Ilya Pronin edited comment on MESOS-9451 at 12/4/18 10:32 PM: -- Per [RFC 7231|https://tools.ietf.org/html/rfc7231#section-5.3.4] {{Accept-Encoding}} header field is an advertisement that a particular encoding is supported by the requester. The server may still use {{identity}} encoding (no encoding) unless the client forbids it with {{identity;q=0}}. I think it's OK for libprocess to continue to apply body length threshold as long as it checks that {{identity}}'s weight is not 0. was (Author: ipronin): Per [RFC 7231|https://tools.ietf.org/html/rfc7231#section-5.3.4] {{Accept-Encoding}} header field is an advertisement that a particular encoding is supported by the requestor. The server may still use {{identity}} encoding (no encoding) unless the client forbids it with {{identity;q=0}}. I think it's OK for libprocess to continue to apply body length threshold as long as it checks that {{identity}}'s weight is not 0. > Libprocess endpoints can ignore required gzip compression > - > > Key: MESOS-9451 > URL: https://issues.apache.org/jira/browse/MESOS-9451 > Project: Mesos > Issue Type: Bug >Reporter: Benno Evers >Priority: Major > Labels: libprocess > > Currently, libprocess decides whether a response should be compressed by the > following conditional: > {noformat} > if (response.type == http::Response::BODY && > response.body.length() >= GZIP_MINIMUM_BODY_LENGTH && > !headers.contains("Content-Encoding") && > request.acceptsEncoding("gzip")) { > [...] > {noformat} > However, this implies that a request sent with the header "Accept-Encoding: > gzip" can not rely on actually getting a gzipped response, e.g. 
when the > response size is below the threshold: > {noformat} > $ nc localhost 5050 > GET /tasks HTTP/1.1 > Accept-Encoding: gzip > HTTP/1.1 200 OK > Date: Tue, 04 Dec 2018 12:49:56 GMT > Content-Type: application/json > Content-Length: 12 > {"tasks":[]} > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-9451) Libprocess endpoints can ignore required gzip compression
[ https://issues.apache.org/jira/browse/MESOS-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16709355#comment-16709355 ] Ilya Pronin commented on MESOS-9451: Per [RFC 7231|https://tools.ietf.org/html/rfc7231#section-5.3.4] {{Accept-Encoding}} header field is an advertisement that a particular encoding is supported by the requestor. The server may still use {{identity}} encoding (no encoding) unless the client forbids it with {{identity;q=0}}. I think it's OK for libprocess to continue to apply body length threshold as long as it checks that {{identity}}'s weight is not 0. > Libprocess endpoints can ignore required gzip compression > - > > Key: MESOS-9451 > URL: https://issues.apache.org/jira/browse/MESOS-9451 > Project: Mesos > Issue Type: Bug >Reporter: Benno Evers >Priority: Major > Labels: libprocess > > Currently, libprocess decides whether a response should be compressed by the > following conditional: > {noformat} > if (response.type == http::Response::BODY && > response.body.length() >= GZIP_MINIMUM_BODY_LENGTH && > !headers.contains("Content-Encoding") && > request.acceptsEncoding("gzip")) { > [...] > {noformat} > However, this implies that a request sent with the header "Accept-Encoding: > gzip" can not rely on actually getting a gzipped response, e.g. when the > response size is below the threshold: > {noformat} > $ nc localhost 5050 > GET /tasks HTTP/1.1 > Accept-Encoding: gzip > HTTP/1.1 200 OK > Date: Tue, 04 Dec 2018 12:49:56 GMT > Content-Type: application/json > Content-Length: 12 > {"tasks":[]} > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
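The q-value check proposed in the comment can be sketched as follows (a minimal Python sketch of RFC 7231 section 5.3.4 semantics, not libprocess code; `encoding_weights` and `may_send_identity` are hypothetical helpers):

```python
def encoding_weights(accept_encoding):
    """Parse an Accept-Encoding header value into {encoding: qvalue}."""
    weights = {}
    for part in accept_encoding.split(","):
        part = part.strip()
        if not part:
            continue
        name, q = part, 1.0
        if ";" in part:
            name, params = part.split(";", 1)
            for p in params.split(";"):
                p = p.strip()
                if p.startswith("q="):
                    q = float(p[2:])
        weights[name.strip().lower()] = q
    return weights


def may_send_identity(accept_encoding):
    """True unless the client forbids uncompressed responses with
    identity;q=0 (or a matching *;q=0)."""
    w = encoding_weights(accept_encoding)
    if "identity" in w:
        return w["identity"] > 0
    if "*" in w:
        return w["*"] > 0
    return True  # per RFC 7231, identity is acceptable by default
```

Under this reading, a plain `Accept-Encoding: gzip` request still permits an uncompressed small response, so the existing body-length threshold stays valid as long as the server also honors `identity;q=0`.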
[jira] [Created] (MESOS-9382) mesos-gtest-runner doesn't work on systems without ulimit binary
Ilya Pronin created MESOS-9382: -- Summary: mesos-gtest-runner doesn't work on systems without ulimit binary Key: MESOS-9382 URL: https://issues.apache.org/jira/browse/MESOS-9382 Project: Mesos Issue Type: Bug Components: test Reporter: Ilya Pronin {{mesos-gtest-runner.py}} fails on systems without a separate ulimit binary (e.g. CentOS 7). {noformat} /home/ipronin/mesos/build/../support/mesos-gtest-runner.py --sequential=*ROOT_* ./mesos-tests Could not check compatibility of ulimit settings: [Errno 2] No such file or directory: 'ulimit' {noformat} The problem arises in [this call|https://github.com/apache/mesos/blob/630d8938462381e8e7b0f44fa6434e47460fb178/support/mesos-gtest-runner.py#L209]. It seems that it can be fixed by passing a {{shell=True}} argument to {{subprocess.check_output()}}. Another problem is {{ROOT_*}} tests, which should be run as root. For root, {{ulimit -u}} will most likely return "unlimited", which will again crash the runner. {noformat} Could not check compatibility of ulimit settings: invalid literal for int() with base 10: b'unlimited\n' {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
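The suggested fix could look roughly like this (a hypothetical sketch, not the actual `mesos-gtest-runner.py` code; `max_processes` and `parse_ulimit` are illustrative names). `shell=True` makes a shell resolve `ulimit` as a builtin instead of searching for an `ulimit` executable, and the parser treats root's "unlimited" specially instead of crashing on `int()`:

```python
import subprocess


def parse_ulimit(text):
    """Parse `ulimit -u` output: an integer, or None for "unlimited"
    (what root typically sees)."""
    text = text.strip()
    if text == "unlimited":
        return None
    return int(text)


def max_processes():
    """Query the per-user process limit.

    `ulimit` is a shell builtin rather than a standalone binary on
    e.g. CentOS 7, so the command must run through a shell
    (shell=True) instead of being exec'd directly."""
    out = subprocess.check_output("ulimit -u", shell=True)
    return parse_ulimit(out.decode())
```

Callers would then need to decide what a `None` (unlimited) result means for the compatibility check, rather than comparing it numerically.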
[jira] [Commented] (MESOS-9118) Add port mapping isolator and network ports isolator support to CMake
[ https://issues.apache.org/jira/browse/MESOS-9118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560280#comment-16560280 ] Ilya Pronin commented on MESOS-9118: Duplicate of MESOS-8993? I can take this one. > Add port mapping isolator and network ports isolator support to CMake > - > > Key: MESOS-9118 > URL: https://issues.apache.org/jira/browse/MESOS-9118 > Project: Mesos > Issue Type: Task >Reporter: Andrew Schwartzmeyer >Priority: Major > > These fall under the same issue because they are very similar, and both > require that {{libnl-3}} be checked for as a third-party dependency. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-184) Log has a space leak
[ https://issues.apache.org/jira/browse/MESOS-184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560273#comment-16560273 ] Ilya Pronin commented on MESOS-184: --- Discussion at the dev@ mailing list: https://lists.apache.org/thread.html/a0a58e42cbb8dd92dcebbedaa2556e8460a005d4a7cdb2ce1205d04a@%3Cdev.mesos.apache.org%3E Review requests: https://reviews.apache.org/r/68089/ https://reviews.apache.org/r/68090/ > Log has a space leak > > > Key: MESOS-184 > URL: https://issues.apache.org/jira/browse/MESOS-184 > Project: Mesos > Issue Type: Bug > Components: c++ api, replicated log >Affects Versions: 0.9.0, 0.14.0, 0.14.1, 0.14.2, 0.15.0, 0.16.0, 0.17.0, > 0.18.0, 0.18.1, 0.18.2, 0.19.0 >Reporter: John Sirois >Assignee: Ilya Pronin >Priority: Minor > Labels: twitter > > In short, the access pattern of the Log of the underlying LevelDB storage is > such that background compactions are ineffective and a long running Log will > have a space leak on disk even in the presence of otherwise apparently > sufficient Log::Writer::truncate calls. > It seems the right thing to do is to issue a DB::CompactRange(NULL, > Slice(truncateToKey)) after a replica learns an Action::TRUNCATE Record. The > cost here is a synchronous compaction stall on every truncate so maybe this > should be a configuration option or even an explicit api. > === > Snip of email explanation: > I spent some time understanding what was going on here and our use pattern of > leveldb does in fact defeat the background compaction algorithm. > The docs are here: http://leveldb.googlecode.com/svn/trunk/doc/impl.html in > the 'Compactions' section, but in short the gist is compaction operates on an > uncompacted file from a level (1 file) + all files overlapping its key range > in the next level. 
Since we write sequential keys with no randomness at all, > by definition the only overlap we ever can get is in level 0 which is the > only level that leveldb allows for overlap in sstables in the 1st place. > That leaves the question of why no compaction on open. Looking there: > http://code.google.com/p/leveldb/source/browse/db/db_impl.cc#1376 > I see a call to MaybeScheduleCompaction, but following that trail, that just > leads to > http://code.google.com/p/leveldb/source/browse/db/version_set.cc?spec=svnbc1ee4d25e09b04e074db330a41f54ef4af0e31b=36a5f8ed7f9fb3373236d5eace4f5fea369856ee#1156 > which implements the compaction strategy I tried to summarize above, and > thus background compactions for our case are limited to level0 -> level 1 > compactions and level1 and higher never compact automatically. > This seems borne out by the LOG files. For example, from smf1-prod - restarts > after your manual compaction fix in bold: > [jsirois@smf1-ajb-35-sr1 ~]$ grep Compacting > /var/lib/mesos/scheduler_db/mesos_log/LOG.old > 2012/04/13-00:24:20.356673 44c1e940 Compacting 3@0 + 4@1 files > 2012/04/13-00:24:20.490113 44c1e940 Compacting 5@1 + 281@2 files > 2012/04/13-00:24:25.824995 44c1e940 Compacting 1@1 + 0@2 files > 2012/04/13-00:24:26.008857 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.196877 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.312465 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.429817 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.533483 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.631044 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.733702 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.832787 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.949864 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.052502 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.164623 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.275621 44c1e940 Compacting 1@2 
+ 0@3 files > 2012/04/13-00:24:27.376748 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.477728 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.611332 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:28.050275 44c1e940 Compacting 50@2 + 242@3 files > 2012/04/13-00:24:32.455665 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:32.538566 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:32.819205 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.052064 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.198850 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.350893 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.521784 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.693531 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.847151 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:34.034277 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:34.225582 44c1e940 Compacting 1@3 + 0@4 files >
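The leak mechanism above can be modeled with a toy store (plain Python standing in for LevelDB; `ToyLog` and its methods are illustrative, with `compact()` playing the role of the proposed `DB::CompactRange(NULL, &truncateToKey)` call):

```python
class ToyLog:
    """Toy append-only store mimicking the replicated log's LevelDB usage:
    monotonically increasing keys, truncation by position, and an explicit
    compaction step that actually reclaims space.

    In real LevelDB, truncated entries linger in sstables because
    sequential keys never overlap across levels, so background
    compaction rarely touches them; here that is modeled by truncate()
    moving bytes to a "dead" counter that only compact() clears."""

    def __init__(self):
        self.entries = {}    # position -> payload (live data)
        self.dead_bytes = 0  # space still held by truncated entries

    def append(self, pos, payload):
        self.entries[pos] = payload

    def truncate(self, to):
        # Like Log::Writer::truncate: entries before `to` become garbage,
        # but without compaction their disk space is not reclaimed.
        for pos in [p for p in self.entries if p < to]:
            self.dead_bytes += len(self.entries.pop(pos))

    def compact(self):
        # Like the proposed explicit CompactRange after a TRUNCATE record:
        # drop the garbage and return how much space was reclaimed.
        reclaimed = self.dead_bytes
        self.dead_bytes = 0
        return reclaimed
```

The trade-off noted in the issue carries over directly: calling `compact()` on every truncate reclaims space promptly but stalls the writer, which is why the issue suggests making it configurable or an explicit API.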
[jira] [Assigned] (MESOS-184) Log has a space leak
[ https://issues.apache.org/jira/browse/MESOS-184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-184: - Assignee: Ilya Pronin > Log has a space leak > > > Key: MESOS-184 > URL: https://issues.apache.org/jira/browse/MESOS-184 > Project: Mesos > Issue Type: Bug > Components: c++ api, replicated log >Affects Versions: 0.9.0, 0.14.0, 0.14.1, 0.14.2, 0.15.0, 0.16.0, 0.17.0, > 0.18.0, 0.18.1, 0.18.2, 0.19.0 >Reporter: John Sirois >Assignee: Ilya Pronin >Priority: Minor > Labels: twitter > > In short, the access pattern of the Log of the underlying LevelDB storage is > such that background compactions are ineffective and a long running Log will > have a space leak on disk even in the presence of otherwise apparently > sufficient Log::Writer::truncate calls. > It seems the right thing to do is to issue a DB::CompactRange(NULL, > Slice(truncateToKey)) after a replica learns an Action::TRUNCATE Record. The > cost here is a synchronous compaction stall on every truncate so maybe this > should be a configuration option or even an explicit api. > === > Snip of email explanation: > I spent some time understanding what was going on here and our use pattern of > leveldb does in fact defeat the background compaction algorithm. > The docs are here: http://leveldb.googlecode.com/svn/trunk/doc/impl.html in > the 'Compactions' section, but in short the gist is compaction operates on an > uncompacted file from a level (1 file) + all files overlapping its key range > in the next level. Since we write sequential keys with no randomness at all, > by definition the only overlap we ever can get is in level 0 which is the > only level that leveldb allows for overlap in sstables in the 1st place. > That leaves the question of why no compaction on open. 
Looking there: > http://code.google.com/p/leveldb/source/browse/db/db_impl.cc#1376 > I see a call to MaybeScheduleCompaction, but following that trail, that just > leads to > http://code.google.com/p/leveldb/source/browse/db/version_set.cc?spec=svnbc1ee4d25e09b04e074db330a41f54ef4af0e31b=36a5f8ed7f9fb3373236d5eace4f5fea369856ee#1156 > which implements the compaction strategy I tried to summarize above, and > thus background compactions for our case are limited to level0 -> level 1 > compactions and level1 and higher never compact automatically. > This seems borne out by the LOG files. For example, from smf1-prod - restarts > after your manual compaction fix in bold: > [jsirois@smf1-ajb-35-sr1 ~]$ grep Compacting > /var/lib/mesos/scheduler_db/mesos_log/LOG.old > 2012/04/13-00:24:20.356673 44c1e940 Compacting 3@0 + 4@1 files > 2012/04/13-00:24:20.490113 44c1e940 Compacting 5@1 + 281@2 files > 2012/04/13-00:24:25.824995 44c1e940 Compacting 1@1 + 0@2 files > 2012/04/13-00:24:26.008857 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.196877 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.312465 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.429817 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.533483 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.631044 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.733702 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.832787 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:26.949864 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.052502 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.164623 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.275621 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.376748 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.477728 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:27.611332 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:28.050275 44c1e940 Compacting 50@2 + 
242@3 files > 2012/04/13-00:24:32.455665 44c1e940 Compacting 1@2 + 0@3 files > 2012/04/13-00:24:32.538566 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:32.819205 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.052064 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.198850 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.350893 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.521784 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.693531 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:33.847151 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:34.034277 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:34.225582 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:34.390228 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:34.554127 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:34.715242 44c1e940 Compacting 1@3 + 0@4 files > 2012/04/13-00:24:34.852110 44c1e940 Compacting 1@3 + 0@4 files >
[jira] [Comment Edited] (MESOS-9007) XFS disk isolator doesn't clean up project ID from symlinks
[ https://issues.apache.org/jira/browse/MESOS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16543695#comment-16543695 ] Ilya Pronin edited comment on MESOS-9007 at 7/24/18 12:43 AM: -- Review requests: https://reviews.apache.org/r/67915/ https://reviews.apache.org/r/67914/ https://reviews.apache.org/r/68029/ was (Author: ipronin): Review requests: https://reviews.apache.org/r/67915/ https://reviews.apache.org/r/67914/ > XFS disk isolator doesn't clean up project ID from symlinks > --- > > Key: MESOS-9007 > URL: https://issues.apache.org/jira/browse/MESOS-9007 > Project: Mesos > Issue Type: Bug > Components: containerization >Affects Versions: 1.5.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > Upon container destruction its project ID is unallocated by the isolator and > removed from the container work directory. However the removing function > skips symbolic links and because of that the project still exists until the > container directory is garbage collected. If the project ID is reused for a > new container, any lingering symlinks that still have that project ID will > contribute to disk usage of the new container. Typically symlinks don't take > much space, but still this leads to inaccuracy in disk space usage accounting. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (MESOS-9080) Port mapping isolator leaks ephemeral ports when a container is destroyed during preparation
Ilya Pronin created MESOS-9080: -- Summary: Port mapping isolator leaks ephemeral ports when a container is destroyed during preparation Key: MESOS-9080 URL: https://issues.apache.org/jira/browse/MESOS-9080 Project: Mesos Issue Type: Bug Components: containerization Affects Versions: 1.6.0 Reporter: Ilya Pronin Assignee: Ilya Pronin The {{network/port_mapping}} isolator leaks ephemeral ports during container cleanup if {{Isolator::isolate()}} was not called, i.e. the container is being destroyed during preparation. If the isolator doesn't know the main container's PID, it skips filter cleanup (filters should not exist in this case) and ephemeral port deallocation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-9007) XFS disk isolator doesn't clean up project ID from symlinks
[ https://issues.apache.org/jira/browse/MESOS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16543695#comment-16543695 ] Ilya Pronin commented on MESOS-9007: Review requests: https://reviews.apache.org/r/67915/ https://reviews.apache.org/r/67914/ > XFS disk isolator doesn't clean up project ID from symlinks > --- > > Key: MESOS-9007 > URL: https://issues.apache.org/jira/browse/MESOS-9007 > Project: Mesos > Issue Type: Bug > Components: containerization >Affects Versions: 1.5.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > Upon container destruction its project ID is unallocated by the isolator and > removed from the container work directory. However the removing function > skips symbolic links and because of that the project still exists until the > container directory is garbage collected. If the project ID is reused for a > new container, any lingering symlinks that still have that project ID will > contribute to disk usage of the new container. Typically symlinks don't take > much space, but still this leads to inaccuracy in disk space usage accounting. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (MESOS-9007) XFS disk isolator doesn't clean up project ID from symlinks
[ https://issues.apache.org/jira/browse/MESOS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-9007: -- Assignee: Ilya Pronin > XFS disk isolator doesn't clean up project ID from symlinks > --- > > Key: MESOS-9007 > URL: https://issues.apache.org/jira/browse/MESOS-9007 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > Upon container destruction its project ID is unallocated by the isolator and > removed from the container work directory. However the removing function > skips symbolic links and because of that the project still exists until the > container directory is garbage collected. If the project ID is reused for a > new container, any lingering symlinks that still have that project ID will > contribute to disk usage of the new container. Typically symlinks don't take > much space, but still this leads to inaccuracy in disk space usage accounting. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (MESOS-9007) XFS disk isolator doesn't clean up project ID from symlinks
[ https://issues.apache.org/jira/browse/MESOS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516486#comment-16516486 ] Ilya Pronin edited comment on MESOS-9007 at 6/19/18 12:27 AM: -- Per [discussion at the XFS mailing list|https://www.spinics.net/lists/linux-xfs/msg19197.html] it is not possible to unset project ID from a symlink. So we need to change the approach to project ID deallocation. XFS project IDs are 16- or 32-bit integers, so we have plenty of them. We can leave project IDs on container sandboxes until they are GCed. The question is how to track when a directory is GCed, since there's no GC hook that the isolator could use. The simplest way would be to periodically check the work dir to see if any of the sandboxes was removed and the project ID associated with it can be deallocated. Or we can use the inotify mechanism. Or add a hook :) was (Author: ipronin): Per [discussion at the XFS mailing list|https://www.spinics.net/lists/linux-xfs/msg19197.html] it is not possible to unset project ID from a symlink. So we need to change the approach to project ID deallocation. XFS project IDs are 32-bit integers, so we have plenty of them. We can leave project IDs on container sandboxes until they are GCed. The question is how to track when a directory is GCed, since there's no GC hook that the isolator could use. The simplest way would be to periodically check the work dir to see if any of the sandboxes was removed and the project ID associated with it can be deallocated. Or we can use the inotify mechanism. Or add a hook :) > XFS disk isolator doesn't clean up project ID from symlinks > --- > > Key: MESOS-9007 > URL: https://issues.apache.org/jira/browse/MESOS-9007 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Ilya Pronin >Priority: Minor > > Upon container destruction its project ID is unallocated by the isolator and > removed from the container work directory. 
However the removing function > skips symbolic links and because of that the project still exists until the > container directory is garbage collected. If the project ID is reused for a > new container, any lingering symlinks that still have that project ID will > contribute to disk usage of the new container. Typically symlinks don't take > much space, but still this leads to inaccuracy in disk space usage accounting. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-9007) XFS disk isolator doesn't clean up project ID from symlinks
[ https://issues.apache.org/jira/browse/MESOS-9007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516486#comment-16516486 ] Ilya Pronin commented on MESOS-9007: Per [discussion at the XFS mailing list|https://www.spinics.net/lists/linux-xfs/msg19197.html] it is not possible to unset project ID from a symlink. So we need to change the approach to project ID deallocation. XFS project IDs are 32-bit integers, so we have plenty of them. We can leave project IDs on container sandboxes until they are GCed. The question is how to track when a directory is GCed, since there's no GC hook that the isolator could use. The simplest way would be to periodically check the work dir to see if any of the sandboxes was removed and the project ID associated with it can be deallocated. Or we can use the inotify mechanism. Or add a hook :) > XFS disk isolator doesn't clean up project ID from symlinks > --- > > Key: MESOS-9007 > URL: https://issues.apache.org/jira/browse/MESOS-9007 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Ilya Pronin >Priority: Minor > > Upon container destruction its project ID is unallocated by the isolator and > removed from the container work directory. However the removing function > skips symbolic links and because of that the project still exists until the > container directory is garbage collected. If the project ID is reused for a > new container, any lingering symlinks that still have that project ID will > contribute to disk usage of the new container. Typically symlinks don't take > much space, but still this leads to inaccuracy in disk space usage accounting. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
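The "periodically check the work dir" option can be sketched as follows (hypothetical Python, not the `disk/xfs` isolator's API; `reclaim_project_ids` and its arguments are illustrative names):

```python
import os


def reclaim_project_ids(assignments, free_ids):
    """One pass of the periodic scan.

    `assignments` maps project ID -> sandbox (or persistent volume)
    directory. Any ID whose directory has been garbage collected is
    removed from the map and returned to the `free_ids` pool, so it
    can be reused for a new container without lingering symlinks
    polluting its accounting."""
    for project_id, path in list(assignments.items()):
        if not os.path.exists(path):
            del assignments[project_id]
            free_ids.add(project_id)
```

A real implementation would run this on a timer (the `--disk_watch_interval` mentioned for MESOS-9476) or hook it up to inotify; the scan itself is the same either way.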
[jira] [Created] (MESOS-9007) XFS disk isolator doesn't clean up project ID from symlinks
Ilya Pronin created MESOS-9007: -- Summary: XFS disk isolator doesn't clean up project ID from symlinks Key: MESOS-9007 URL: https://issues.apache.org/jira/browse/MESOS-9007 Project: Mesos Issue Type: Bug Components: containerization Reporter: Ilya Pronin Upon container destruction its project ID is unallocated by the isolator and removed from the container work directory. However the removing function skips symbolic links and because of that the project still exists until the container directory is garbage collected. If the project ID is reused for a new container, any lingering symlinks that still have that project ID will contribute to disk usage of the new container. Typically symlinks don't take much space, but still this leads to inaccuracy in disk space usage accounting. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-8993) `network/ports` isolator missing from CMake build
[ https://issues.apache.org/jira/browse/MESOS-8993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16512788#comment-16512788 ] Ilya Pronin commented on MESOS-8993: Please note that {{network/port_mapping}} (hidden behind {{\-\-with-network-isolator}} flag in Autotools based build) and {{network/ports}} are 2 different isolators. libnl is a dependency of the former and {{src/tests/containerizer/port_mapping_tests.cpp}} contains its tests. I can help with it. > `network/ports` isolator missing from CMake build > - > > Key: MESOS-8993 > URL: https://issues.apache.org/jira/browse/MESOS-8993 > Project: Mesos > Issue Type: Bug > Components: cmake >Affects Versions: 1.7.0 > Environment: Linux with CMake >Reporter: Andrew Schwartzmeyer >Assignee: James Peach >Priority: Major > Labels: cmake > > The `network/ports` isolator is completely missing from the CMake build. It > looks like it needs libnl-3 and the files > src/tests/containerizer/ports_isolator_tests.cpp, > tests/containerizer/port_mapping_tests.cpp, and > src/slave/containerizer/mesos/isolators/network/ports.cpp, along with a > configuration option network-ports-isolator. > Note that this was discovered due to a build break in the associated tests as > they were missing from the build I ran as a test. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-7069) The linux filesystem isolator should set mode and ownership for host volumes.
[ https://issues.apache.org/jira/browse/MESOS-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16357316#comment-16357316 ] Ilya Pronin commented on MESOS-7069: [~jieyu] I believe this was fixed in https://reviews.apache.org/r/61122/. Closing this issue. > The linux filesystem isolator should set mode and ownership for host volumes. > - > > Key: MESOS-7069 > URL: https://issues.apache.org/jira/browse/MESOS-7069 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Gilbert Song >Assignee: Ilya Pronin >Priority: Major > Labels: filesystem, linux, volumes > > If the host path is a relative path, the linux filesystem isolator should set > the mode and ownership for this host volume since it allows non-root user to > write to the volume. Note that this is the case of sharing the host > fileysystem (without rootfs). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (MESOS-7698) Libprocess doesn't handle IP changes
[ https://issues.apache.org/jira/browse/MESOS-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349554#comment-16349554 ] Ilya Pronin edited comment on MESOS-7698 at 2/2/18 12:16 AM: - [~greggomann], libprocess looks up the host address it's running on upon process startup and remembers that address for the lifetime of the process. If the {{--advertise_ip}} flag is not provided, then this address is used as the return address in inter-libprocess communication (the {{User-Agent: libprocess/*}} header field). When I encountered the described problem, the IP address of one of our hosts had changed due to network maintenance. The agent on that host tried to re-register with the master, telling it that it was located at addr1, while in reality it was at addr2. Because of that return-address logic, the master was sending its responses to the wrong host at addr1. I never tried to reproduce the problem, but I suppose it should be relatively easy to reproduce by changing the IP address of the interface used by the agent for communicating with the master. Maybe we could make the usage of return addresses in libprocess-libprocess communication more "relaxed": if the user doesn't want libprocess to advertise a specific address, the sending libprocess can omit the address in the {{User-Agent}} field and the receiver will use the return address from the connection. I can work on a patch if somebody can shepherd this work. was (Author: ipronin): [~greggomann], libprocess looks up the host address its running on upon process startup and remembers that address for the lifetime of the process. If {{--advertise_ip}} flag is not provided, then this address is used as a return address in inter-libprocess communication ({{User-Agent: libprocess/*}} header field). When I encountered the described problem, the IP address of one of our hosts has changed due to network maintenance.
The agent on that host tried to re-register with the master, telling him that he was located at addr1, while in reality he was at addr2. Because of that logic with return address, the master was sending his responses to a wrong host at addr1. I never tried to reproduce the problem, but I suppose it should be relatively easy reproduced by changing the IP address of the interface used by the agent for communicating with the master. Maybe we could make the usage of return addresses in libprocess-libprocess communication more "relaxed". If the user doesn't want libprocess to advertise a specific address, sending libprocess can omit the address in the {{User-Agent}} field and the receiver will use return address from the connection? > Libprocess doesn't handle IP changes > > > Key: MESOS-7698 > URL: https://issues.apache.org/jira/browse/MESOS-7698 > Project: Mesos > Issue Type: Bug > Components: libprocess >Affects Versions: 1.2.0 >Reporter: Ilya Pronin >Priority: Major > > If a host IP address changes libprocess will never learn about it and will > continue to send messages "from" the old IP. > This will cause weird situations. E.g. an agent will indefinitely try to > reregister with a master pretending that it can be reached by an old IP. The > master will send {{SlaveReregisteredMessage}} to the wrong host (potentially > a different agent), using an IP from the {{User-Agent: libprocess/*}} header. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-7698) Libprocess doesn't handle IP changes
[ https://issues.apache.org/jira/browse/MESOS-7698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16349554#comment-16349554 ] Ilya Pronin commented on MESOS-7698: [~greggomann], libprocess looks up the host address it's running on upon process startup and remembers that address for the lifetime of the process. If the {{--advertise_ip}} flag is not provided, then this address is used as the return address in inter-libprocess communication (the {{User-Agent: libprocess/*}} header field). When I encountered the described problem, the IP address of one of our hosts had changed due to network maintenance. The agent on that host tried to re-register with the master, telling it that it was located at addr1, while in reality it was at addr2. Because of that return-address logic, the master was sending its responses to the wrong host at addr1. I never tried to reproduce the problem, but I suppose it should be relatively easy to reproduce by changing the IP address of the interface used by the agent for communicating with the master. Maybe we could make the usage of return addresses in libprocess-libprocess communication more "relaxed": if the user doesn't want libprocess to advertise a specific address, the sending libprocess can omit the address in the {{User-Agent}} field and the receiver will use the return address from the connection. > Libprocess doesn't handle IP changes > > > Key: MESOS-7698 > URL: https://issues.apache.org/jira/browse/MESOS-7698 > Project: Mesos > Issue Type: Bug > Components: libprocess >Affects Versions: 1.2.0 >Reporter: Ilya Pronin >Priority: Major > > If a host IP address changes libprocess will never learn about it and will > continue to send messages "from" the old IP. > This will cause weird situations. E.g. an agent will indefinitely try to > reregister with a master pretending that it can be reached by an old IP.
The > master will send {{SlaveReregisteredMessage}} to the wrong host (potentially > a different agent), using an IP from the {{User-Agent: libprocess/*}} header. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
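The "relaxed" return-address scheme proposed in the comment above could look something like the following hypothetical helper. This is an illustrative sketch only, not actual libprocess code; the function name and signature are invented:

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical sketch of the proposed "relaxed" return-address logic:
// if the sender did not explicitly advertise an address (no
// --advertise_ip, so no address in the `User-Agent: libprocess/*`
// header), the receiver falls back to the peer address observed on the
// connection instead of trusting a possibly stale configured address.
std::string resolveReturnAddress(
    const std::optional<std::string>& advertisedAddress,
    const std::string& connectionPeerAddress)
{
  if (advertisedAddress.has_value()) {
    // The user explicitly asked to advertise this address; honor it.
    return *advertisedAddress;
  }

  // No advertised address: trust the connection, which reflects the
  // sender's current IP even if it changed after startup.
  return connectionPeerAddress;
}
```

With this fallback, an agent whose IP changed after startup would still receive replies on the connection it actually opened, rather than at the remembered (stale) address.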
[jira] [Updated] (MESOS-8493) Master task state metrics discrepancy
[ https://issues.apache.org/jira/browse/MESOS-8493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-8493: --- Description: Currently if the task status update has no reason we don't increment {{master///}} metric. Because of that {{master/task_/*}} counters may not sum up to {{master/tasks_}} counter, if for example a custom executor doesn't set the reason in its status updates. Since the zero value in {{Reason}} enum is already taken it is possible to just count status updates with no reason under an artificial {{reason_unknown}} name. was: Currently if the task status update has no reason we don't increment {{master///}} metric. Because of that {{master/task_/*}} counters may not sum up to {{master/tasks_}} counter, if for example a custom executor doesn't set a reason to its status updates. Since the zero value in {{Reason}} enum is already taken it is possible to just count status updates with no reason under an artificial {{reason_unknown}} name. > Master task state metrics discrepancy > - > > Key: MESOS-8493 > URL: https://issues.apache.org/jira/browse/MESOS-8493 > Project: Mesos > Issue Type: Improvement > Components: master >Affects Versions: 1.2.0 >Reporter: Ilya Pronin >Priority: Trivial > > Currently if the task status update has no reason we don't increment > {{master///}} metric. Because of that > {{master/task_/*}} counters may not sum up to {{master/tasks_}} > counter, if for example a custom executor doesn't set the reason in its > status updates. > Since the zero value in {{Reason}} enum is already taken it is possible to > just count status updates with no reason under an artificial > {{reason_unknown}} name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (MESOS-8493) Master task state metrics discrepancy
[ https://issues.apache.org/jira/browse/MESOS-8493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-8493: --- Component/s: master > Master task state metrics discrepancy > - > > Key: MESOS-8493 > URL: https://issues.apache.org/jira/browse/MESOS-8493 > Project: Mesos > Issue Type: Improvement > Components: master >Affects Versions: 1.2.0 >Reporter: Ilya Pronin >Priority: Trivial > > Currently if the task status update has no reason we don't increment > {{master///}} metric. Because of that > {{master/task_/*}} counters may not sum up to > {{master/tasks_}} counter, if for example a custom executor doesn't > set a reason to its status updates. > Since the zero value in {{Reason}} enum is already taken it is possible to > just count status updates with no reason under an artificial > {{reason_unknown}} name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (MESOS-8493) Master task state metrics discrepancy
Ilya Pronin created MESOS-8493: -- Summary: Master task state metrics discrepancy Key: MESOS-8493 URL: https://issues.apache.org/jira/browse/MESOS-8493 Project: Mesos Issue Type: Improvement Affects Versions: 1.2.0 Reporter: Ilya Pronin Currently if the task status update has no reason we don't increment {{master///}} metric. Because of that {{master/task_/*}} counters may not sum up to {{master/tasks_}} counter, if for example a custom executor doesn't set a reason in its status updates. Since the zero value in {{Reason}} enum is already taken it is possible to just count status updates with no reason under an artificial {{reason_unknown}} name. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
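The counting scheme proposed in the description can be sketched as follows. All names here are illustrative, not the actual Mesos metrics code: a status update that carries no reason is tallied under an artificial "reason_unknown" key, so the per-reason counters always sum up to the per-state counter:

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Illustrative counter map (made-up names, not the Mesos metric code).
class ReasonCounters
{
public:
  // Count one status update; updates without a reason (e.g. from a
  // custom executor) are folded into the "reason_unknown" bucket.
  void count(const std::optional<std::string>& reason)
  {
    counters[reason.value_or("reason_unknown")]++;
  }

  int get(const std::string& key) const
  {
    auto it = counters.find(key);
    return it == counters.end() ? 0 : it->second;
  }

private:
  std::map<std::string, int> counters;
};
```

Because every update increments exactly one bucket, summing all per-reason buckets now matches the per-state total, which is the discrepancy the issue describes.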
[jira] [Commented] (MESOS-6985) os::getenv() can segfault
[ https://issues.apache.org/jira/browse/MESOS-6985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16338076#comment-16338076 ] Ilya Pronin commented on MESOS-6985: [~vinodkone], sorry, I missed the comment somehow. I have a POC-like patch for this but didn't have time to finish it. I'll try to finish it, maybe next week. Feel free to reassign if somebody would like to work on it before that. > os::getenv() can segfault > - > > Key: MESOS-6985 > URL: https://issues.apache.org/jira/browse/MESOS-6985 > Project: Mesos > Issue Type: Bug > Components: stout > Environment: ASF CI, Ubuntu 14.04 and CentOS 7 both with and without > libevent/SSL >Reporter: Greg Mann >Assignee: Ilya Pronin >Priority: Major > Labels: flaky-test, reliability, stout > Attachments: > MasterMaintenanceTest.InverseOffersFilters-truncated.txt, > MasterTest.MultipleExecutors.txt > > > This was observed on ASF CI. The segfault first showed up on CI on 9/20/16 > and has been produced by the tests {{MasterTest.MultipleExecutors}} and > {{MasterMaintenanceTest.InverseOffersFilters}}.
In both cases, > {{os::getenv()}} segfaults with the same stack trace: > {code} > *** Aborted at 1485241617 (unix time) try "date -d @1485241617" if you are > using GNU date *** > PC: @ 0x2ad59e3ae82d (unknown) > I0124 07:06:57.422080 28619 exec.cpp:162] Version: 1.2.0 > *** SIGSEGV (@0xf0) received by PID 28591 (TID 0x2ad5a7b87700) from PID 240; > stack trace: *** > I0124 07:06:57.422336 28615 exec.cpp:212] Executor started at: > executor(75)@172.17.0.2:45752 with pid 28591 > @ 0x2ad5ab953197 (unknown) > @ 0x2ad5ab957479 (unknown) > @ 0x2ad59e165330 (unknown) > @ 0x2ad59e3ae82d (unknown) > @ 0x2ad594631358 os::getenv() > @ 0x2ad59aba6acf mesos::internal::slave::executorEnvironment() > @ 0x2ad59ab845c0 mesos::internal::slave::Framework::launchExecutor() > @ 0x2ad59ab818a2 mesos::internal::slave::Slave::_run() > @ 0x2ad59ac1ec10 > _ZZN7process8dispatchIN5mesos8internal5slave5SlaveERKNS_6FutureIbEERKNS1_13FrameworkInfoERKNS1_12ExecutorInfoERK6OptionINS1_8TaskInfoEERKSF_INS1_13TaskGroupInfoEES6_S9_SC_SH_SL_EEvRKNS_3PIDIT_EEMSP_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_ENKUlPNS_11ProcessBaseEE_clES16_ > @ 0x2ad59ac1e6bf > _ZNSt17_Function_handlerIFvPN7process11ProcessBaseEEZNS0_8dispatchIN5mesos8internal5slave5SlaveERKNS0_6FutureIbEERKNS5_13FrameworkInfoERKNS5_12ExecutorInfoERK6OptionINS5_8TaskInfoEERKSJ_INS5_13TaskGroupInfoEESA_SD_SG_SL_SP_EEvRKNS0_3PIDIT_EEMST_FvT0_T1_T2_T3_T4_ET5_T6_T7_T8_T9_EUlS2_E_E9_M_invokeERKSt9_Any_dataS2_ > @ 0x2ad59bce2304 std::function<>::operator()() > @ 0x2ad59bcc9824 process::ProcessBase::visit() > @ 0x2ad59bd4028e process::DispatchEvent::visit() > @ 0x2ad594616df1 process::ProcessBase::serve() > @ 0x2ad59bcc72b7 process::ProcessManager::resume() > @ 0x2ad59bcd567c > process::ProcessManager::init_threads()::$_2::operator()() > @ 0x2ad59bcd5585 > _ZNSt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvE3$_2vEE9_M_invokeIJEEEvSt12_Index_tupleIJXspT_EEE > @ 0x2ad59bcd std::_Bind_simple<>::operator()() > @ 0x2ad59bcd552c 
std::thread::_Impl<>::_M_run() > @ 0x2ad59d9e6a60 (unknown) > @ 0x2ad59e15d184 start_thread > @ 0x2ad59e46d37d (unknown) > make[4]: *** [check-local] Segmentation fault > {code} > Find attached the full log from a failed run of > {{MasterTest.MultipleExecutors}} and a truncated log from a failed run of > {{MasterMaintenanceTest.InverseOffersFilters}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (MESOS-8377) RecoverTest.CatchupTruncated is flaky.
[ https://issues.apache.org/jira/browse/MESOS-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310511#comment-16310511 ] Ilya Pronin edited comment on MESOS-8377 at 1/4/18 12:14 AM: - Review request: https://reviews.apache.org/r/64938/ I couldn't reproduce the issue on my machine with {{--gtest_repeat=1000 --gtest_break_on_failure=1}}, but I suspect that it has something to do with the fact that the test uses {{Shared}}, which can probably still be retained by the "managed" {{CatchupProcess}} at the moment when I [try to recreate the replica| https://github.com/apache/mesos/blob/master/src/tests/log_tests.cpp#L2096]. Because of that the DB cannot be closed and LevelDB complains that the process still holds the DB lock. I've added code to make sure that the test code is the only owner of {{replica3}} before proceeding to recreate it. was (Author: ipronin): Review request: https://reviews.apache.org/r/64938/ I couldn't reproduce the issue on my machine with `--gtest_repeat=1000 --gtest_break_on_failure=1`, but I suspect that it has something to do with the fact the test uses {{Shared}} which probably can still be retained by "managed" {{CatchupProcess}} at the moment when I [try to recreate the replica| https://github.com/apache/mesos/blob/master/src/tests/log_tests.cpp#L2096]. Because of that the DB can not be closed and LevelDB complaints that the process still holds the DB lock. I've added the code to make sure that the test code is the only owner of {{replica3}} before proceeding to recreate it. > RecoverTest.CatchupTruncated is flaky. > -- > > Key: MESOS-8377 > URL: https://issues.apache.org/jira/browse/MESOS-8377 > Project: Mesos > Issue Type: Bug > Components: replicated log >Reporter: Alexander Rukletsov >Assignee: Ilya Pronin > Labels: flaky-test > Attachments: CatchupTruncated-badrun.txt, > RecoverTest.CatchupTruncated-badrun2.txt > > > Observing regularly in our CI. Logs attached. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
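The "only owner before recreating" idea from the comment above can be illustrated with a std::shared_ptr analogue. The actual test uses libprocess's Shared; this sketch only demonstrates the ownership check, and all names are invented:

```cpp
#include <cassert>
#include <memory>

// Stand-in for the replica object; in the real test the replica holds
// a LevelDB handle whose file lock survives until the object dies.
struct Replica
{
  bool dbLockHeld = true;
};

// The replica can only be safely destroyed and recreated when the
// caller is the sole remaining owner; otherwise another component
// (e.g. a still-running catch-up process) keeps the DB lock alive and
// LevelDB refuses to reopen the database.
bool soleOwner(const std::shared_ptr<Replica>& replica)
{
  return replica.use_count() == 1;
}
```

This mirrors the fix in the test: wait until no other process retains the shared replica, then tear it down and recreate it.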
[jira] [Assigned] (MESOS-8377) RecoverTest.CatchupTruncated is flaky.
[ https://issues.apache.org/jira/browse/MESOS-8377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-8377: -- Assignee: Ilya Pronin > RecoverTest.CatchupTruncated is flaky. > -- > > Key: MESOS-8377 > URL: https://issues.apache.org/jira/browse/MESOS-8377 > Project: Mesos > Issue Type: Bug > Components: replicated log >Reporter: Alexander Rukletsov >Assignee: Ilya Pronin > Labels: flaky-test > Attachments: CatchupTruncated-badrun.txt, > RecoverTest.CatchupTruncated-badrun2.txt > > > Observing regularly in our CI. Logs attached. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7973) Non-leading VOTING replica catch-up
[ https://issues.apache.org/jira/browse/MESOS-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310132#comment-16310132 ] Ilya Pronin commented on MESOS-7973: Documentation patches: https://reviews.apache.org/r/64921/ https://reviews.apache.org/r/64922/ https://reviews.apache.org/r/64923/ > Non-leading VOTING replica catch-up > --- > > Key: MESOS-7973 > URL: https://issues.apache.org/jira/browse/MESOS-7973 > Project: Mesos > Issue Type: Improvement > Components: replicated log >Reporter: Ilya Pronin >Assignee: Ilya Pronin > Fix For: 1.5.0 > > > Currently it is not possible to perform consistent reads from non-leading > replicas due to the fact that if a non-leading replica is partitioned it may > miss some log positions and will not make any attempt to “fill” those holes. > If a non-leading replica could catch-up missing log positions it would be > able to serve eventually consistent reads to the framework. This would make > it possible to do additional work on non-leading framework replicas (e.g. > offload some reading from a leader to standbys or reduce failover time by > keeping in-memory storage represented by the log “hot”). > Design doc: > https://docs.google.com/document/d/1dERXJeAsi3Lnq9Akt82JGWK4pKNeJ6k7PTVCpM9ic_8/edit?usp=sharing -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (MESOS-8120) Benchmark scheduler API performance
[ https://issues.apache.org/jira/browse/MESOS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273056#comment-16273056 ] Ilya Pronin edited comment on MESOS-8120 at 11/30/17 5:54 PM: -- Posted the benchmarks described in the doc for review: https://reviews.apache.org/r/64217 was (Author: ipronin): Added the benchmarks described in the doc for review: https://reviews.apache.org/r/64217 > Benchmark scheduler API performance > --- > > Key: MESOS-8120 > URL: https://issues.apache.org/jira/browse/MESOS-8120 > Project: Mesos > Issue Type: Task >Reporter: Ilya Pronin >Priority: Minor > Attachments: revive.master.lockfree.8threads.svg, > revive.master.lockfree.svg, revive.master.svg > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8120) Benchmark scheduler API performance
[ https://issues.apache.org/jira/browse/MESOS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16273056#comment-16273056 ] Ilya Pronin commented on MESOS-8120: Added the benchmarks described in the doc for review: https://reviews.apache.org/r/64217 > Benchmark scheduler API performance > --- > > Key: MESOS-8120 > URL: https://issues.apache.org/jira/browse/MESOS-8120 > Project: Mesos > Issue Type: Task >Reporter: Ilya Pronin >Priority: Minor > Attachments: revive.master.lockfree.8threads.svg, > revive.master.lockfree.svg, revive.master.svg > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-6406) Send latest status for partition-aware tasks when agent reregisters
[ https://issues.apache.org/jira/browse/MESOS-6406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16271799#comment-16271799 ] Ilya Pronin commented on MESOS-6406: What if the agent becomes unreachable, then a master failover happens, and then the agent re-registers? Let's pretend that the agent's entry was GC'd from the registry. In this case the framework will not know that the task came back, right? > Send latest status for partition-aware tasks when agent reregisters > --- > > Key: MESOS-6406 > URL: https://issues.apache.org/jira/browse/MESOS-6406 > Project: Mesos > Issue Type: Bug >Reporter: Neil Conway >Assignee: Megha Sharma > Labels: mesosphere > > When an agent reregisters, we should notify frameworks about the current > status of any partition-aware tasks that were/are running on the agent -- > i.e., report the current state of the task at the agent to the framework. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8185) Tasks can be known to the agent but unknown to the master.
[ https://issues.apache.org/jira/browse/MESOS-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16259755#comment-16259755 ] Ilya Pronin commented on MESOS-8185: [~xujyan] it should, but we'll need MESOS-6406 so we don't have to rely on explicit reconciliation to get notified about the tasks that came back. > Tasks can be known to the agent but unknown to the master. > -- > > Key: MESOS-8185 > URL: https://issues.apache.org/jira/browse/MESOS-8185 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin > Labels: reliability > > Currently, when a master re-registers an agent that was marked unreachable, > it shutdowns all not partition-aware frameworks on that agent. When a master > re-registers an agent that is already registered, it doesn't check that all > tasks from the slave's re-registration message are known to it. > It is possible that due to a transient loss of connectivity an agent may miss > {{SlaveReregisteredMessage}} along with {{ShutdownFrameworkMessage}} and thus > will not kill not partition-aware tasks. But the master will mark the agent > as registered and will not re-add tasks that it thought will be killed. The > agent may re-register again, this time successfully, before becoming marked > unreachable while never having terminated tasks of not partition-aware > frameworks. The master will simply forget those tasks ever existed, because > it has "removed" them during the previous re-registration. > Example scenario: > # Connection from the master to the agent stops working > # Agent doesn't see pings from the master and attempts to re-register > # Master sends {{SlaveRegisteredMessage}} and {{ShutdownSlaveMessage}}, which > don't get to the agent because of the connection failure. Agent is marked > registered. > # Network issue resolves, connection breaks. Agent retries re-registration. 
> # Master thinks that the agent was registered since step (3) and just > re-sends {{SlaveRegisteredMessage}}. Tasks remain running on the agent. > One of the possible solutions would be to compare the list of tasks that the > already registered agent reports in {{ReregisterSlaveMessage}} and the list > of tasks the master has. In this case anything that the master doesn't know > about should not exist on the agent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (MESOS-8185) Tasks can be known to the agent but unknown to the master.
[ https://issues.apache.org/jira/browse/MESOS-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-8185: -- Assignee: Ilya Pronin > Tasks can be known to the agent but unknown to the master. > -- > > Key: MESOS-8185 > URL: https://issues.apache.org/jira/browse/MESOS-8185 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin > > Currently, when a master re-registers an agent that was marked unreachable, > it shutdowns all not partition-aware frameworks on that agent. When a master > re-registers an agent that is already registered, it doesn't check that all > tasks from the slave's re-registration message are known to it. > It is possible that due to a transient loss of connectivity an agent may miss > {{SlaveReregisteredMessage}} along with {{ShutdownFrameworkMessage}} and thus > will not kill not partition-aware tasks. But the master will mark the agent > as registered and will not re-add tasks that it thought will be killed. The > agent may re-register again, this time successfully, before becoming marked > unreachable while never having terminated tasks of not partition-aware > frameworks. The master will simply forget those tasks ever existed, because > it has "removed" them during the previous re-registration. > Example scenario: > # Connection from the master to the agent stops working > # Agent doesn't see pings from the master and attempts to re-register > # Master sends {{SlaveRegisteredMessage}} and {{ShutdownSlaveMessage}}, which > don't get to the agent because of the connection failure. Agent is marked > registered. > # Network issue resolves, connection breaks. Agent retries re-registration. > # Master thinks that the agent was registered since step (3) and just > re-sends {{SlaveRegisteredMessage}}. Tasks remain running on the agent. 
> One of the possible solutions would be to compare the list of tasks that the > already registered agent reports in {{ReregisterSlaveMessage}} and the list > of tasks the master has. In this case anything that the master doesn't know > about should not exist on the agent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (MESOS-8185) Tasks can be known to the agent but unknown to the master.
[ https://issues.apache.org/jira/browse/MESOS-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249889#comment-16249889 ] Ilya Pronin edited comment on MESOS-8185 at 11/13/17 6:00 PM: -- [~bmahler] I see you've already changed it. Thanks! Sure, let's discuss all available options. I suggested killing, because current contract between master and not {{PARTITION_AWARE}} frameworks is that {{LOST}} tasks will be killed by master. was (Author: ipronin): [~bmahler] I see you've already changed it. Thanks! Sure, let's discuss all available options. I suggested killing, because current contract between master and not {{PARTITION_AWARE}} frameworks is that {{LOST}} tasks will be killed by master. > Tasks can be known to the agent but unknown to the master. > -- > > Key: MESOS-8185 > URL: https://issues.apache.org/jira/browse/MESOS-8185 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Ilya Pronin > > Currently, when a master re-registers an agent that was marked unreachable, > it shutdowns all not partition-aware frameworks on that agent. When a master > re-registers an agent that is already registered, it doesn't check that all > tasks from the slave's re-registration message are known to it. > It is possible that due to a transient loss of connectivity an agent may miss > {{SlaveReregisteredMessage}} along with {{ShutdownFrameworkMessage}} and thus > will not kill not partition-aware tasks. But the master will mark the agent > as registered and will not re-add tasks that it thought will be killed. The > agent may re-register again, this time successfully, before becoming marked > unreachable while never having terminated tasks of not partition-aware > frameworks. The master will simply forget those tasks ever existed, because > it has "removed" them during the previous re-registration. 
> Example scenario: > # Connection from the master to the agent stops working > # Agent doesn't see pings from the master and attempts to re-register > # Master sends {{SlaveRegisteredMessage}} and {{ShutdownSlaveMessage}}, which > don't get to the agent because of the connection failure. Agent is marked > registered. > # Network issue resolves, connection breaks. Agent retries re-registration. > # Master thinks that the agent was registered since step (3) and just > re-sends {{SlaveRegisteredMessage}}. Tasks remain running on the agent. > One of the possible solutions would be to compare the list of tasks that the > already registered agent reports in {{ReregisterSlaveMessage}} and the list > of tasks the master has. In this case anything that the master doesn't know > about should not exist on the agent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8185) Tasks can be known to the agent but unknown to the master.
[ https://issues.apache.org/jira/browse/MESOS-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16249889#comment-16249889 ] Ilya Pronin commented on MESOS-8185: [~bmahler] I see you've already changed it. Thanks! Sure, let's discuss all available options. I suggested killing, because current contract between master and not {{PARTITION_AWARE}} frameworks is that {{LOST}} tasks will be killed by master. > Tasks can be known to the agent but unknown to the master. > -- > > Key: MESOS-8185 > URL: https://issues.apache.org/jira/browse/MESOS-8185 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Ilya Pronin > > Currently, when a master re-registers an agent that was marked unreachable, > it shutdowns all not partition-aware frameworks on that agent. When a master > re-registers an agent that is already registered, it doesn't check that all > tasks from the slave's re-registration message are known to it. > It is possible that due to a transient loss of connectivity an agent may miss > {{SlaveReregisteredMessage}} along with {{ShutdownFrameworkMessage}} and thus > will not kill not partition-aware tasks. But the master will mark the agent > as registered and will not re-add tasks that it thought will be killed. The > agent may re-register again, this time successfully, before becoming marked > unreachable while never having terminated tasks of not partition-aware > frameworks. The master will simply forget those tasks ever existed, because > it has "removed" them during the previous re-registration. > Example scenario: > # Connection from the master to the agent stops working > # Agent doesn't see pings from the master and attempts to re-register > # Master sends {{SlaveRegisteredMessage}} and {{ShutdownSlaveMessage}}, which > don't get to the agent because of the connection failure. Agent is marked > registered. > # Network issue resolves, connection breaks. Agent retries re-registration. 
> # Master thinks that the agent was registered since step (3) and just > re-sends {{SlaveRegisteredMessage}}. Tasks remain running on the agent. > One of the possible solutions would be to compare the list of tasks that the > already registered agent reports in {{ReregisterSlaveMessage}} and the list > of tasks the master has. In this case anything that the master doesn't know > about should not exist on the agent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8185) Master should kill tasks that are unknown to it after registered agent re-registers
[ https://issues.apache.org/jira/browse/MESOS-8185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16245048#comment-16245048 ] Ilya Pronin commented on MESOS-8185: [~bmahler] can you shepherd this, please? > Master should kill tasks that are unknown to it after registered agent > re-registers > --- > > Key: MESOS-8185 > URL: https://issues.apache.org/jira/browse/MESOS-8185 > Project: Mesos > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Ilya Pronin > > Currently, when a master re-registers an agent that was marked unreachable, > it shutdowns all not partition-aware frameworks on that agent. When a master > re-registers an agent that is already registered, it doesn't check that all > tasks from the slave's re-registration message are known to it. > It is possible that due to a transient loss of connectivity an agent may miss > {{SlaveReregisteredMessage}} along with {{ShutdownFrameworkMessage}} and thus > will not kill not partition-aware tasks. But the master will mark the agent > as registered and will not re-add tasks that it thought will be killed. The > agent may re-register again, this time successfully, before becoming marked > unreachable while never having terminated tasks of not partition-aware > frameworks. The master will simply forget those tasks ever existed, because > it has "removed" them during the previous re-registration. > Example scenario: > # Connection from the master to the agent stops working > # Agent doesn't see pings from the master and attempts to re-register > # Master sends {{SlaveRegisteredMessage}} and {{ShutdownSlaveMessage}}, which > don't get to the agent because of the connection failure. Agent is marked > registered. > # Network issue resolves, connection breaks. Agent retries re-registration. > # Master thinks that the agent was registered since step (3) and just > re-sends {{SlaveRegisteredMessage}}. Tasks remain running on the agent. 
> One of the possible solutions would be to compare the list of tasks that the > already registered agent reports in {{ReregisterSlaveMessage}} and the list > of tasks the master has. In this case anything that the master doesn't know > about should not exist on the agent. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (MESOS-8185) Master should kill tasks that are unknown to it after registered agent re-registers
Ilya Pronin created MESOS-8185: -- Summary: Master should kill tasks that are unknown to it after registered agent re-registers Key: MESOS-8185 URL: https://issues.apache.org/jira/browse/MESOS-8185 Project: Mesos Issue Type: Bug Affects Versions: 1.2.0 Reporter: Ilya Pronin Currently, when a master re-registers an agent that was marked unreachable, it shuts down all non-partition-aware frameworks on that agent. When a master re-registers an agent that is already registered, it doesn't check that all tasks from the slave's re-registration message are known to it. It is possible that due to a transient loss of connectivity an agent may miss {{SlaveReregisteredMessage}} along with {{ShutdownFrameworkMessage}} and thus will not kill non-partition-aware tasks. But the master will mark the agent as registered and will not re-add tasks that it thought would be killed. The agent may re-register again, this time successfully, before being marked unreachable, while never having terminated tasks of non-partition-aware frameworks. The master will simply forget those tasks ever existed, because it has "removed" them during the previous re-registration. Example scenario: # Connection from the master to the agent stops working # Agent doesn't see pings from the master and attempts to re-register # Master sends {{SlaveReregisteredMessage}} and {{ShutdownFrameworkMessage}}, which don't get to the agent because of the connection failure. Agent is marked registered. # Network issue resolves and the connection is restored. Agent retries re-registration. # Master thinks that the agent has been registered since step (3) and just re-sends {{SlaveReregisteredMessage}}. Tasks remain running on the agent. One of the possible solutions would be to compare the list of tasks that the already registered agent reports in {{ReregisterSlaveMessage}} and the list of tasks the master has. In this case anything that the master doesn't know about should not exist on the agent. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
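The comparison proposed above can be sketched as a set difference over task IDs. This is an illustrative stand-alone function under assumed names (`tasksToKill` is made up, not actual Master code): anything the already registered agent reports that the master does not know about should be killed on the agent.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical sketch of the proposed fix: when an already registered
// agent re-registers, diff the tasks it reports against the tasks the
// master knows about and kill the unknown ones on the agent.
std::vector<std::string> tasksToKill(
    const std::set<std::string>& reportedByAgent,
    const std::set<std::string>& knownToMaster) {
  std::vector<std::string> unknown;
  for (const std::string& task : reportedByAgent) {
    if (knownToMaster.count(task) == 0) {
      // The master "removed" this task during a previous
      // re-registration; it should no longer exist on the agent.
      unknown.push_back(task);
    }
  }
  return unknown;
}
```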
[jira] [Created] (MESOS-8165) TASK_UNKNOWN status is ambiguous
Ilya Pronin created MESOS-8165: -- Summary: TASK_UNKNOWN status is ambiguous Key: MESOS-8165 URL: https://issues.apache.org/jira/browse/MESOS-8165 Project: Mesos Issue Type: Bug Affects Versions: 1.4.0 Reporter: Ilya Pronin Assignee: Ilya Pronin Priority: Major There's an ambiguity in the definition of the {{TASK_UNKNOWN}} status. Currently it is sent by the master during explicit reconciliation when it doesn't know about the task. This covers 2 situations that should be handled differently by frameworks: # Task is unknown, agent is unknown - we don't know the fate of the task, it's reasonable for the framework to wait until the task comes back (if SLA allows); # Task is unknown, agent is registered - the task is definitely in a terminal state and won't come back, the framework should reschedule it. The second situation should produce the {{TASK_GONE}} status. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
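The split described above can be sketched as a small decision function. The names here are illustrative, not the actual reconciliation code in the master:

```cpp
#include <string>

// Sketch of the proposed reconciliation split for a task the master
// does not know about: the answer depends on whether the agent is
// registered.
std::string reconcileUnknownTask(bool agentRegistered) {
  if (agentRegistered) {
    // Agent is registered but the task is unknown: the task is
    // terminal and won't come back, so the framework should
    // reschedule it.
    return "TASK_GONE";
  }
  // Agent is unknown too: the task's fate is unknown, and the
  // framework may reasonably wait for it to come back.
  return "TASK_UNKNOWN";
}
```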
[jira] [Updated] (MESOS-8120) Benchmark scheduler API performance
[ https://issues.apache.org/jira/browse/MESOS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-8120: --- Attachment: revive.master.lockfree.8threads.svg > Benchmark scheduler API performance > --- > > Key: MESOS-8120 > URL: https://issues.apache.org/jira/browse/MESOS-8120 > Project: Mesos > Issue Type: Task >Reporter: Ilya Pronin >Priority: Minor > Attachments: revive.master.lockfree.8threads.svg, > revive.master.lockfree.svg, revive.master.svg > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (MESOS-8120) Benchmark scheduler API performance
[ https://issues.apache.org/jira/browse/MESOS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213109#comment-16213109 ] Ilya Pronin edited comment on MESOS-8120 at 10/20/17 7:42 PM: -- Results: https://docs.google.com/document/d/1gGxrrqC9ahN1oYFUpxpE_b27Xc5ld_04x1Bl86DROqY Flamegraph for master during "revive" benchmark: [^revive.master.svg] Flamegraph for master with lock-free event queue and run queue during "revive" benchmark: [^revive.master.lockfree.svg] was (Author: ipronin): Results: https://docs.google.com/document/d/1gGxrrqC9ahN1oYFUpxpE_b27Xc5ld_04x1Bl86DROqY > Benchmark scheduler API performance > --- > > Key: MESOS-8120 > URL: https://issues.apache.org/jira/browse/MESOS-8120 > Project: Mesos > Issue Type: Task >Reporter: Ilya Pronin >Priority: Minor > Attachments: revive.master.lockfree.svg, revive.master.svg > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8120) Benchmark scheduler API performance
[ https://issues.apache.org/jira/browse/MESOS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-8120: --- Attachment: revive.master.lockfree.svg revive.master.svg > Benchmark scheduler API performance > --- > > Key: MESOS-8120 > URL: https://issues.apache.org/jira/browse/MESOS-8120 > Project: Mesos > Issue Type: Task >Reporter: Ilya Pronin >Priority: Minor > Attachments: revive.master.lockfree.svg, revive.master.svg > > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-8120) Benchmark scheduler API performance
[ https://issues.apache.org/jira/browse/MESOS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16213109#comment-16213109 ] Ilya Pronin commented on MESOS-8120: Results: https://docs.google.com/document/d/1gGxrrqC9ahN1oYFUpxpE_b27Xc5ld_04x1Bl86DROqY > Benchmark scheduler API performance > --- > > Key: MESOS-8120 > URL: https://issues.apache.org/jira/browse/MESOS-8120 > Project: Mesos > Issue Type: Task >Reporter: Ilya Pronin >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (MESOS-8120) Benchmark scheduler API performance
Ilya Pronin created MESOS-8120: -- Summary: Benchmark scheduler API performance Key: MESOS-8120 URL: https://issues.apache.org/jira/browse/MESOS-8120 Project: Mesos Issue Type: Task Reporter: Ilya Pronin Priority: Minor Results: https://docs.google.com/document/d/1gGxrrqC9ahN1oYFUpxpE_b27Xc5ld_04x1Bl86DROqY -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-8120) Benchmark scheduler API performance
[ https://issues.apache.org/jira/browse/MESOS-8120?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-8120: --- Description: (was: Results: https://docs.google.com/document/d/1gGxrrqC9ahN1oYFUpxpE_b27Xc5ld_04x1Bl86DROqY) > Benchmark scheduler API performance > --- > > Key: MESOS-8120 > URL: https://issues.apache.org/jira/browse/MESOS-8120 > Project: Mesos > Issue Type: Task >Reporter: Ilya Pronin >Priority: Minor > -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (MESOS-7973) Non-leading VOTING replica catch-up
[ https://issues.apache.org/jira/browse/MESOS-7973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164926#comment-16164926 ] Ilya Pronin edited comment on MESOS-7973 at 10/3/17 11:19 PM: -- Review requests: https://reviews.apache.org/r/62283/ https://reviews.apache.org/r/62284/ https://reviews.apache.org/r/62285/ https://reviews.apache.org/r/62760/ https://reviews.apache.org/r/62761/ https://reviews.apache.org/r/62286/ https://reviews.apache.org/r/62287/ https://reviews.apache.org/r/62288/ was (Author: ipronin): Review requests: https://reviews.apache.org/r/62283/ https://reviews.apache.org/r/62284/ https://reviews.apache.org/r/62285/ https://reviews.apache.org/r/62286/ https://reviews.apache.org/r/62287/ https://reviews.apache.org/r/62288/ > Non-leading VOTING replica catch-up > --- > > Key: MESOS-7973 > URL: https://issues.apache.org/jira/browse/MESOS-7973 > Project: Mesos > Issue Type: Improvement > Components: replicated log >Reporter: Ilya Pronin >Assignee: Ilya Pronin > > Currently it is not possible to perform consistent reads from non-leading > replicas due to the fact that if a non-leading replica is partitioned it may > miss some log positions and will not make any attempt to “fill” those holes. > If a non-leading replica could catch up on missing log positions it would be > able to serve eventually consistent reads to the framework. This would make > it possible to do additional work on non-leading framework replicas (e.g. > offload some reading from a leader to standbys or reduce failover time by > keeping in-memory storage represented by the log “hot”). > Design doc: > https://docs.google.com/document/d/1dERXJeAsi3Lnq9Akt82JGWK4pKNeJ6k7PTVCpM9ic_8/edit?usp=sharing -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (MESOS-7973) Non-leading VOTING replica catch-up
Ilya Pronin created MESOS-7973: -- Summary: Non-leading VOTING replica catch-up Key: MESOS-7973 URL: https://issues.apache.org/jira/browse/MESOS-7973 Project: Mesos Issue Type: Improvement Components: replicated log Reporter: Ilya Pronin Assignee: Ilya Pronin Currently it is not possible to perform consistent reads from non-leading replicas due to the fact that if a non-leading replica is partitioned it may miss some log positions and will not make any attempt to “fill” those holes. If a non-leading replica could catch up on missing log positions it would be able to serve eventually consistent reads to the framework. This would make it possible to do additional work on non-leading framework replicas (e.g. offload some reading from a leader to standbys or reduce failover time by keeping in-memory storage represented by the log “hot”). Design doc: https://docs.google.com/document/d/1dERXJeAsi3Lnq9Akt82JGWK4pKNeJ6k7PTVCpM9ic_8/edit?usp=sharing -- This message was sent by Atlassian JIRA (v6.4.14#64029)
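The hole-filling idea above can be sketched as follows. This is an illustrative stand-alone function (the name `missingPositions` is made up, not the replicated log API): a partitioned non-leading replica knows which log positions it has persisted and can compute which positions up to the known end of the log are missing, then explicitly catch those up before serving eventually consistent reads.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Sketch: given the set of persisted log positions and the last known
// position, return the "holes" a non-leading replica would need to
// catch up on.
std::vector<uint64_t> missingPositions(
    const std::set<uint64_t>& persisted, uint64_t end) {
  std::vector<uint64_t> holes;
  for (uint64_t pos = 0; pos <= end; ++pos) {
    if (persisted.count(pos) == 0) {
      holes.push_back(pos);
    }
  }
  return holes;
}
```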
[jira] [Assigned] (MESOS-6971) Use arena allocation to improve protobuf message passing performance.
[ https://issues.apache.org/jira/browse/MESOS-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-6971: -- Assignee: (was: Ilya Pronin) > Use arena allocation to improve protobuf message passing performance. > - > > Key: MESOS-6971 > URL: https://issues.apache.org/jira/browse/MESOS-6971 > Project: Mesos > Issue Type: Improvement >Reporter: Benjamin Mahler > Labels: mesosphere, performance, tech-debt > > The protobuf message passing provided by {{ProtobufProcess}} provide const > access of the message and/or its fields to the handler function. > This means that we can leverage the [arena > allocator|https://developers.google.com/protocol-buffers/docs/reference/arenas] > provided by protobuf to reduce the memory allocation cost during > de-serialization and improve cache efficiency. > This would require using protobuf 3.x with "proto2" syntax (which appears to > be the default if unspecified) to maintain our existing "proto2" > requirements. The upgrade to protobuf 3.x while keeping "proto2" syntax > should be tackled via a separate ticket that blocks this one. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (MESOS-7895) ZK session timeout is unconfigurable in agent and scheduler drivers
[ https://issues.apache.org/jira/browse/MESOS-7895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128827#comment-16128827 ] Ilya Pronin edited comment on MESOS-7895 at 8/29/17 12:51 PM: -- [~vinodkone] can you shepherd this please? Review requests for agents: https://reviews.apache.org/r/61689/ https://reviews.apache.org/r/61690/ https://reviews.apache.org/r/61965/ was (Author: ipronin): [~vinodkone] can you shepherd this please? Review requests for agents: https://reviews.apache.org/r/61689/ https://reviews.apache.org/r/61690/ > ZK session timeout is unconfigurable in agent and scheduler drivers > --- > > Key: MESOS-7895 > URL: https://issues.apache.org/jira/browse/MESOS-7895 > Project: Mesos > Issue Type: Improvement >Affects Versions: 1.3.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > {{ZooKeeperMasterDetector}} in agents and scheduler drivers uses the default > ZK session timeout (10 secs). This timeout may have to be increased to cope > with long ZK upgrades or ZK GC pauses (with local ZK sessions these can cause > lots of {{TASK_LOST}}, because sessions expire on disconnection after > {{session_timeout * 2 / 3}}). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (MESOS-7867) Master doesn't handle scheduler driver downgrade from HTTP based to PID based
[ https://issues.apache.org/jira/browse/MESOS-7867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118527#comment-16118527 ] Ilya Pronin edited comment on MESOS-7867 at 8/18/17 2:47 PM: - We can add those metrics back in {{Master::failoverFramework(Framework*, const UPID&)}} or just remove the metrics removal from {{Master::failoverFramework(Framework*, const HttpConnection&)}}. {{Master::addFramework()}} adds those metrics regardless of the scheduler driver type. [~anandmazumdar], can you shepherd this please? was (Author: ipronin): We can add those metrics back in {{Master::failoverFramework(Framework*, const UPID&)}} or just remove the metrics removal from {{Master::failoverFramework(Framework*, const HttpConnection&)}}. {{Master::addFramework()}} adds those metrics regardless of the scheduler driver type. > Master doesn't handle scheduler driver downgrade from HTTP based to PID based > - > > Key: MESOS-7867 > URL: https://issues.apache.org/jira/browse/MESOS-7867 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 1.3.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin > > When a framework upgrades from a PID based driver to an HTTP based driver, > master removes its per-framework-principal metrics ({{messages_received}} and > {{messages_processed}}) in {{Master::failoverFramework}}. When the same > framework downgrades back to a PID based driver, the master doesn't reinstate > those metrics. This causes a crash when the master receives a message from > the failed over framework and increments {{messages_received}} counter in > {{Master::visit(const MessageEvent&)}}. > {noformat} > I0807 18:17:45.713220 19095 master.cpp:2916] Framework > 70822e80-ca38-4470-916e-e6da073a4742- (TwitterScheduler) failed over > F0807 18:18:20.725908 19079 master.cpp:1451] Check failed: > metrics->frameworks.contains(principal.get()) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (MESOS-7896) Use std::error_code for reporting platform-dependent errors
Ilya Pronin created MESOS-7896: -- Summary: Use std::error_code for reporting platform-dependent errors Key: MESOS-7896 URL: https://issues.apache.org/jira/browse/MESOS-7896 Project: Mesos Issue Type: Improvement Reporter: Ilya Pronin Priority: Minor It may be useful to return an error code from various functions to be able to distinguish different kinds of errors. E.g. for being able to ignore {{ENOENT}} from {{unlink()}}. This can be achieved by returning {{Try}}, but this is not portable. Since C++11 STL has {{std::error_code}} that hides platform-dependent error code behind a portable error condition. We can use it for error reporting. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
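The idea in MESOS-7896 can be sketched with standard C++ only. The function name `removeFile` is illustrative: the platform `errno` is wrapped in {{std::error_code}} with the generic category, so callers can test portable error conditions such as ignoring "no such file" from an unlink-style call.

```cpp
#include <cerrno>
#include <cstdio>
#include <system_error>

// Sketch: report failures via std::error_code instead of a raw,
// platform-dependent errno value.
std::error_code removeFile(const char* path) {
  if (std::remove(path) != 0) {
    // Capture errno portably; generic_category maps it to portable
    // std::errc conditions for comparison.
    return std::error_code(errno, std::generic_category());
  }
  return std::error_code();  // Success: a default error_code is falsy.
}
```

A caller can then ignore the specific condition mentioned in the ticket: `if (ec && ec != std::errc::no_such_file_or_directory) { /* a real error */ }`.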
[jira] [Commented] (MESOS-7895) ZK session timeout is unconfigurable in agent and scheduler drivers
[ https://issues.apache.org/jira/browse/MESOS-7895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128827#comment-16128827 ] Ilya Pronin commented on MESOS-7895: [~vinodkone] can you shepherd this please? Review requests for agents: https://reviews.apache.org/r/61689/ https://reviews.apache.org/r/61690/ > ZK session timeout is unconfigurable in agent and scheduler drivers > --- > > Key: MESOS-7895 > URL: https://issues.apache.org/jira/browse/MESOS-7895 > Project: Mesos > Issue Type: Improvement >Affects Versions: 1.3.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > {{ZooKeeperMasterDetector}} in agents and scheduler drivers uses the default > ZK session timeout (10 secs). This timeout may have to be increased to cope > with long ZK upgrades or ZK GC pauses (with local ZK sessions these can cause > lots of {{TASK_LOST}}, because sessions expire on disconnection after > {{session_timeout * 2 / 3}}). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (MESOS-7895) ZK session timeout is unconfigurable in agent and scheduler drivers
Ilya Pronin created MESOS-7895: -- Summary: ZK session timeout is unconfigurable in agent and scheduler drivers Key: MESOS-7895 URL: https://issues.apache.org/jira/browse/MESOS-7895 Project: Mesos Issue Type: Improvement Affects Versions: 1.3.0 Reporter: Ilya Pronin Assignee: Ilya Pronin Priority: Minor {{ZooKeeperMasterDetector}} in agents and scheduler drivers uses the default ZK session timeout (10 secs). This timeout may have to be increased to cope with long ZK upgrades or ZK GC pauses (with local ZK sessions these can cause lots of {{TASK_LOST}}, because sessions expire on disconnection after {{session_timeout * 2 / 3}}). -- This message was sent by Atlassian JIRA (v6.4.14#64029)
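A worked example of the timing in the parenthetical above: with ZK local sessions, a disconnected client's session is treated as expired after {{session_timeout * 2 / 3}}, so with the 10 second default a ZK pause of roughly 6.6 seconds can already expire sessions. This tiny sketch (illustrative, not Mesos code) just makes that arithmetic explicit:

```cpp
// Effective expiry window for a disconnected client with ZK local
// sessions: session_timeout * 2 / 3, in milliseconds.
int effectiveExpiryMs(int sessionTimeoutMs) {
  return sessionTimeoutMs * 2 / 3;
}
```

With a 30 second session timeout, the window grows to 20 seconds, which is the kind of headroom a configurable timeout would buy during long ZK upgrades or GC pauses.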
[jira] [Commented] (MESOS-7795) Remove "latest" symlink after agent reboot
[ https://issues.apache.org/jira/browse/MESOS-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16127328#comment-16127328 ] Ilya Pronin commented on MESOS-7795: Review requests: https://reviews.apache.org/r/61661/ https://reviews.apache.org/r/61662/ > Remove "latest" symlink after agent reboot > -- > > Key: MESOS-7795 > URL: https://issues.apache.org/jira/browse/MESOS-7795 > Project: Mesos > Issue Type: Improvement > Components: agent >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > Currently when the agent detects that the host was rebooted it doesn't > recover agent info. New agent info is not checkpointed until the agent > successfully registers with a master. If the agent crashes before > registering, on restart it will recover the old agent info that was > checkpointed before host reboot. > This can lead to problems. E.g. the agent may flap due to incompatible agent > info, if its resources somehow change after reboot. Or the usage of the old > agent ID in reregistration process may cause crashes like MESOS-7432. > We can remove the "latest" symlink when we detect that current boot ID is > different from the checkpointed one in order to prevent the agent from > recovering stale info after we checkpoint new boot ID. Or we can postpone > boot ID checkpointing until we checkpointed new agent info. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (MESOS-7867) Master doesn't handle scheduler driver downgrade from HTTP based to PID based
[ https://issues.apache.org/jira/browse/MESOS-7867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16118527#comment-16118527 ] Ilya Pronin commented on MESOS-7867: We can add those metrics back in {{Master::failoverFramework(Framework*, const UPID&)}} or just remove the metrics removal from {{Master::failoverFramework(Framework*, const HttpConnection&)}}. {{Master::addFramework()}} adds those metrics regardless of the scheduler driver type. > Master doesn't handle scheduler driver downgrade from HTTP based to PID based > - > > Key: MESOS-7867 > URL: https://issues.apache.org/jira/browse/MESOS-7867 > Project: Mesos > Issue Type: Bug > Components: master >Affects Versions: 1.3.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin > > When a framework upgrades from a PID based driver to an HTTP based driver, > master removes its per-framework-principal metrics ({{messages_received}} and > {{messages_processed}}) in {{Master::failoverFramework}}. When the same > framework downgrades back to a PID based driver, the master doesn't reinstate > those metrics. This causes a crash when the master receives a message from > the failed over framework and increments {{messages_received}} counter in > {{Master::visit(const MessageEvent&)}}. > {noformat} > I0807 18:17:45.713220 19095 master.cpp:2916] Framework > 70822e80-ca38-4470-916e-e6da073a4742- (TwitterScheduler) failed over > F0807 18:18:20.725908 19079 master.cpp:1451] Check failed: > metrics->frameworks.contains(principal.get()) > {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (MESOS-7867) Master doesn't handle scheduler driver downgrade from HTTP based to PID based
Ilya Pronin created MESOS-7867: -- Summary: Master doesn't handle scheduler driver downgrade from HTTP based to PID based Key: MESOS-7867 URL: https://issues.apache.org/jira/browse/MESOS-7867 Project: Mesos Issue Type: Bug Components: master Affects Versions: 1.3.0 Reporter: Ilya Pronin Assignee: Ilya Pronin When a framework upgrades from a PID based driver to an HTTP based driver, master removes its per-framework-principal metrics ({{messages_received}} and {{messages_processed}}) in {{Master::failoverFramework}}. When the same framework downgrades back to a PID based driver, the master doesn't reinstate those metrics. This causes a crash when the master receives a message from the failed over framework and increments {{messages_received}} counter in {{Master::visit(const MessageEvent&)}}. {noformat} I0807 18:17:45.713220 19095 master.cpp:2916] Framework 70822e80-ca38-4470-916e-e6da073a4742- (TwitterScheduler) failed over F0807 18:18:20.725908 19079 master.cpp:1451] Check failed: metrics->frameworks.contains(principal.get()) {noformat} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
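The failure mode and the proposed fix can be sketched with a plain map of per-principal counters. This is an illustrative model, not the actual Master code (the names `failoverToPidBased` and `onMessage` are made up): reinstating the entry on downgrade means the next message increment cannot hit a missing entry, which is what the CHECK failure above amounts to.

```cpp
#include <map>
#include <string>

// Sketch: per-framework-principal message counters, keyed by principal.
struct Counters { long received = 0; long processed = 0; };

std::map<std::string, Counters> frameworks;

// Re-add the metrics entry if a previous HTTP failover removed it.
void failoverToPidBased(const std::string& principal) {
  frameworks.emplace(principal, Counters());  // No-op if already present.
}

// Models Master::visit(const MessageEvent&): incrementing a counter
// for a principal whose entry was removed is what crashes the master.
long onMessage(const std::string& principal) {
  return ++frameworks[principal].received;
}
```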
[jira] [Updated] (MESOS-7795) Remove "latest" symlink after agent reboot
[ https://issues.apache.org/jira/browse/MESOS-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-7795: --- Shepherd: Yan Xu > Remove "latest" symlink after agent reboot > -- > > Key: MESOS-7795 > URL: https://issues.apache.org/jira/browse/MESOS-7795 > Project: Mesos > Issue Type: Improvement > Components: agent >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > Currently when the agent detects that the host was rebooted it doesn't > recover agent info. New agent info is not checkpointed until the agent > successfully registers with a master. If the agent crashes before > registering, on restart it will recover the old agent info that was > checkpointed before host reboot. > This can lead to problems. E.g. the agent may flap due to incompatible agent > info, if its resources somehow change after reboot. Or the usage of the old > agent ID in reregistration process may cause crashes like MESOS-7432. > We can remove the "latest" symlink when we detect that current boot ID is > different from the checkpointed one in order to prevent the agent from > recovering stale info after we checkpoint new boot ID. Or we can postpone > boot ID checkpointing until we checkpointed new agent info. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (MESOS-7795) Remove "latest" symlink after agent reboot
[ https://issues.apache.org/jira/browse/MESOS-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-7795: -- Assignee: Ilya Pronin > Remove "latest" symlink after agent reboot > -- > > Key: MESOS-7795 > URL: https://issues.apache.org/jira/browse/MESOS-7795 > Project: Mesos > Issue Type: Improvement > Components: agent >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > Currently when the agent detects that the host was rebooted it doesn't > recover agent info. New agent info is not checkpointed until the agent > successfully registers with a master. If the agent crashes before > registering, on restart it will recover the old agent info that was > checkpointed before host reboot. > This can lead to problems. E.g. the agent may flap due to incompatible agent > info, if its resources somehow change after reboot. Or the usage of the old > agent ID in reregistration process may cause crashes like MESOS-7432. > We can remove the "latest" symlink when we detect that current boot ID is > different from the checkpointed one in order to prevent the agent from > recovering stale info after we checkpoint new boot ID. Or we can postpone > boot ID checkpointing until we checkpointed new agent info. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-1216) Attributes comparator operator should allow multiple attributes of same name and type
[ https://issues.apache.org/jira/browse/MESOS-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-1216: --- Shepherd: Anand Mazumdar > Attributes comparator operator should allow multiple attributes of same name > and type > - > > Key: MESOS-1216 > URL: https://issues.apache.org/jira/browse/MESOS-1216 > Project: Mesos > Issue Type: Bug >Reporter: Vinod Kone >Assignee: Ilya Pronin > Labels: tech-debt > > The fact that we currently don't support SET type in Attribute means that > operators might end up having multiple attributes (e.g., slave attributes) > with the same name and type but different value. > But the comparator operator for Attributes expects unique (name, type) > Attributes. This results in slave recovery failure when comparing > checkpointed attributes with those set via flags. > https://issues.apache.org/jira/browse/MESOS-1215 adds SET support, but for > backwards compatibility we should fix the comparator operator first. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7795) Remove "latest" symlink after agent reboot
[ https://issues.apache.org/jira/browse/MESOS-7795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-7795: --- Description: Currently when the agent detects that the host was rebooted it doesn't recover agent info. New agent info is not checkpointed until the agent successfully registers with a master. If the agent crashes before registering, on restart it will recover the old agent info that was checkpointed before host reboot. This can lead to problems. E.g. the agent may flap due to incompatible agent info, if its resources somehow change after reboot. Or the usage of the old agent ID in reregistration process may cause crashes like MESOS-7432. We can remove the "latest" symlink when we detect that current boot ID is different from the checkpointed one in order to prevent the agent from recovering stale info after we checkpoint new boot ID. Or we can postpone boot ID checkpointing until we checkpointed new agent info. was: Currently when the agent detects that the host was rebooted it doesn't recover agent info. New agent info is not checkpointed until the agent successfully registers with a master. If the agent crashes before registering, on restart it will recover the old agent info that was checkpointed before host reboot. This can lead to problems. E.g. the agent may flap due to incompatible agent info, if its resources somehow change after reboot. Or the usage of the old agent ID in reregistration process may cause crashes like MESOS-7432. We can remove the "latest" symlink when we detect that current boot ID is different from the checkpointed one in order to prevent the agent from recovering stale info after we checkpoint new boot ID. 
> Remove "latest" symlink after agent reboot > -- > > Key: MESOS-7795 > URL: https://issues.apache.org/jira/browse/MESOS-7795 > Project: Mesos > Issue Type: Improvement > Components: agent >Reporter: Ilya Pronin >Priority: Minor > > Currently when the agent detects that the host was rebooted it doesn't > recover agent info. New agent info is not checkpointed until the agent > successfully registers with a master. If the agent crashes before > registering, on restart it will recover the old agent info that was > checkpointed before host reboot. > This can lead to problems. E.g. the agent may flap due to incompatible agent > info, if its resources somehow change after reboot. Or the usage of the old > agent ID in reregistration process may cause crashes like MESOS-7432. > We can remove the "latest" symlink when we detect that current boot ID is > different from the checkpointed one in order to prevent the agent from > recovering stale info after we checkpoint new boot ID. Or we can postpone > boot ID checkpointing until we checkpointed new agent info. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (MESOS-7795) Remove "latest" symlink after agent reboot
Ilya Pronin created MESOS-7795: -- Summary: Remove "latest" symlink after agent reboot Key: MESOS-7795 URL: https://issues.apache.org/jira/browse/MESOS-7795 Project: Mesos Issue Type: Improvement Components: agent Reporter: Ilya Pronin Priority: Minor Currently when the agent detects that the host was rebooted it doesn't recover agent info. New agent info is not checkpointed until the agent successfully registers with a master. If the agent crashes before registering, on restart it will recover the old agent info that was checkpointed before host reboot. This can lead to problems. E.g. the agent may flap due to incompatible agent info, if its resources somehow change after reboot. Or the usage of the old agent ID in reregistration process may cause crashes like MESOS-7432. We can remove the "latest" symlink when we detect that current boot ID is different from the checkpointed one in order to prevent the agent from recovering stale info after we checkpoint new boot ID. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
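The first option above reduces to a boot ID comparison during agent recovery. This is a minimal sketch under assumed names (`shouldRemoveLatestSymlink` is illustrative, not the agent's actual recovery code); the caller would remove the "latest" symlink before checkpointing the new boot ID when this returns true.

```cpp
#include <string>

// Sketch: a differing boot ID means the host was rebooted, so the
// "latest" symlink points at agent info checkpointed before the
// reboot, which must not be recovered after a crash.
bool shouldRemoveLatestSymlink(
    const std::string& checkpointedBootId,
    const std::string& currentBootId) {
  return checkpointedBootId != currentBootId;
}
```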
[jira] [Assigned] (MESOS-5361) Consider introducing TCP KeepAlive for Libprocess sockets.
[ https://issues.apache.org/jira/browse/MESOS-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-5361: -- Assignee: (was: Ilya Pronin) > Consider introducing TCP KeepAlive for Libprocess sockets. > -- > > Key: MESOS-5361 > URL: https://issues.apache.org/jira/browse/MESOS-5361 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: Anand Mazumdar > Labels: mesosphere > > We currently don't use TCP KeepAlive's when creating sockets in libprocess. > This might benefit master - scheduler, master - agent connections i.e. we can > detect if any of them failed faster. > Currently, if the master process goes down. If for some reason the {{RST}} > sequence did not reach the scheduler, the scheduler can only come to know > about the disconnection when it tries to do a {{send}} itself. > The default TCP keep alive values on Linux are of little use in a real world > application: > {code} > . This means that the keepalive routines wait for two hours (7200 secs) > before sending the first keepalive probe, and then resend it every 75 > seconds. If no ACK response is received for nine consecutive times, the > connection is marked as broken. > {code} > However, for long running instances of scheduler/agent this still can be > beneficial. Also, operators might start tuning the values for their clusters > explicitly once we start supporting it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (MESOS-1216) Attributes comparator operator should allow multiple attributes of same name and type
[ https://issues.apache.org/jira/browse/MESOS-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-1216: -- Assignee: Ilya Pronin > Attributes comparator operator should allow multiple attributes of same name > and type > - > > Key: MESOS-1216 > URL: https://issues.apache.org/jira/browse/MESOS-1216 > Project: Mesos > Issue Type: Bug >Reporter: Vinod Kone >Assignee: Ilya Pronin > Labels: tech-debt > > The fact that we currently don't support SET type in Attribute means that > operators might end up having multiple attributes (e.g., slave attributes) > with the same name and type but different value. > But the comparator operator for Attributes expects unique (name, type) > Attributes. This results in slave recovery failure when comparing > checkpointed attributes with those set via flags. > https://issues.apache.org/jira/browse/MESOS-1215 adds SET support, but for > backwards compatibility we should fix the comparator operator first. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
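The comparator fix discussed in MESOS-1216 amounts to comparing attribute lists as multisets rather than requiring unique (name, type) pairs. This stand-alone sketch models attributes as (name, value) pairs under a made-up name (`equalAsMultisets`); the real Attributes comparator would also account for the type:

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Sketch: two attribute lists are equal as multisets, so repeated
// attributes with the same name but different values (e.g. two
// rack:* TEXT attributes) compare correctly during agent recovery.
using Attribute = std::pair<std::string, std::string>;  // (name, value)

bool equalAsMultisets(std::vector<Attribute> a, std::vector<Attribute> b) {
  std::sort(a.begin(), a.end());
  std::sort(b.begin(), b.end());
  return a == b;
}
```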
[jira] [Commented] (MESOS-4092) Try to re-establish connection on ping timeouts with agent before removing it
[ https://issues.apache.org/jira/browse/MESOS-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065357#comment-16065357 ] Ilya Pronin commented on MESOS-4092: Looks like our problem here is that we use our health-check for detecting remote-peer failure and link failure, but don't distinguish them. When a connection breaks, libprocess issues {{ExitedEvent}} and opens a new connection when required. But in the case of a network problem a relatively long time may pass before TCP retransmissions limit is reached and the connection is declared dead. One possible solution can be to try using the aforementioned "relink" functionality at some point during agent pinging. We can use a strategy similar to the one used by TCP: after N consecutive failed pings "relink" before sending the next ping. Plus a similar thing on the agent's side. Another possible solution can be to use TCP keepalive mechanism tuned to "detect" broken connections faster than {{agent_ping_timeout * max_agent_ping_timeouts}}. Or we can mess with TCP user timeout, but IMO it's a road to hell and AFAIK user timeout is available only on Linux. > Try to re-establish connection on ping timeouts with agent before removing it > - > > Key: MESOS-4092 > URL: https://issues.apache.org/jira/browse/MESOS-4092 > Project: Mesos > Issue Type: Improvement > Components: master >Affects Versions: 0.25.0 >Reporter: Ian Downes > > The SlaveObserver will trigger an agent to be removed after > {{flags.max_slave_ping_timeouts}} timeouts of {{flags.slave_ping_timeout}}. > This can occur because of transient network failures, e.g., gray failures of > a switch uplink exhibiting heavy or total packet loss. Some network > architectures are designed to tolerate such gray failures and support > multiple paths between hosts. This can be implemented with equal-cost > multi-path routing (ECMP) where flows are hashed by their 5-tuple to multiple > possible uplinks. 
In such networks re-establishing a TCP connection will > almost certainly use a new source port and thus will likely be hashed to a > different uplink, avoiding the failed uplink and re-establishing connectivity > with the agent. > After failing to receive pongs the SlaveObserver should next try to > re-establish a TCP connection (with exponential back-off) before declaring > the agent as lost. This can avoid significant disruption where large numbers > of agents reached through a single failed link could be removed unnecessarily > while still ensuring that agents that are truly lost are recognized as such. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
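The ECMP argument in this ticket can be shown with a toy model: hashing the flow 5-tuple onto a set of uplinks makes it clear why a reconnect, which picks a fresh ephemeral source port, will very likely land on a different path. This is an illustrative sketch with made-up names, not how any real switch hashes flows:

```python
import hashlib

def ecmp_uplink(src_ip, src_port, dst_ip, dst_port, proto, n_uplinks):
    # Toy ECMP: deterministically hash the flow 5-tuple onto an uplink.
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
    return int(hashlib.sha256(key).hexdigest(), 16) % n_uplinks

# Reconnecting picks a new ephemeral source port, so the new flow almost
# certainly hashes onto a different uplink, dodging a failed one.
paths = {ecmp_uplink("10.0.0.1", port, "10.0.0.2", 5051, "tcp", 4)
         for port in range(32768, 32778)}
```

Only the source port varies between the old and new connection, but that is enough to change the hash, which is the whole basis of the "relink to reroute" proposal.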
[jira] [Comment Edited] (MESOS-4092) Try to re-establish connection on ping timeouts with agent before removing it
[ https://issues.apache.org/jira/browse/MESOS-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065357#comment-16065357 ] Ilya Pronin edited comment on MESOS-4092 at 6/27/17 7:43 PM: - Seems that our problem here is that we use our health-check for detecting remote-peer failure and link failure, but don't distinguish them. When a connection breaks, libprocess issues {{ExitedEvent}} and opens a new connection when required. But in the case of a network problem a relatively long time may pass before TCP retransmissions limit is reached and the connection is declared dead. One possible solution can be to try using the aforementioned "relink" functionality at some point during agent pinging. We can use a strategy similar to the one used by TCP: after N consecutive failed pings "relink" before sending the next ping. Plus a similar thing on the agent's side. Another possible solution can be to use TCP keepalive mechanism tuned to "detect" broken connections faster than {{agent_ping_timeout * max_agent_ping_timeouts}}. Or we can mess with TCP user timeout, but IMO it's a road to hell and AFAIK user timeout is available only on Linux. was (Author: ipronin): Looks like our problem here is that we use our health-check for detecting remote-peer failure and link failure, but don't distinguish them. When a connection breaks, libprocess issues {{ExitedEvent}} and opens a new connection when required. But in the case of a network problem a relatively long time may pass before TCP retransmissions limit is reached and the connection is declared dead. One possible solution can be to try using the aforementioned "relink" functionality at some point during agent pinging. We can use a strategy similar to the one used by TCP: after N consecutive failed pings "relink" before sending the next ping. Plus a similar thing on the agent's side. 
Another possible solution can be to use TCP keepalive mechanism tuned to "detect" broken connections faster than {{agent_ping_timeout * max_agent_ping_timeouts}}. Or we can mess with TCP user timeout, but IMO it's a road to hell and AFAIK user timeout is available only on Linux. > Try to re-establish connection on ping timeouts with agent before removing it > - > > Key: MESOS-4092 > URL: https://issues.apache.org/jira/browse/MESOS-4092 > Project: Mesos > Issue Type: Improvement > Components: master >Affects Versions: 0.25.0 >Reporter: Ian Downes > > The SlaveObserver will trigger an agent to be removed after > {{flags.max_slave_ping_timeouts}} timeouts of {{flags.slave_ping_timeout}}. > This can occur because of transient network failures, e.g., gray failures of > a switch uplink exhibiting heavy or total packet loss. Some network > architectures are designed to tolerate such gray failures and support > multiple paths between hosts. This can be implemented with equal-cost > multi-path routing (ECMP) where flows are hashed by their 5-tuple to multiple > possible uplinks. In such networks re-establishing a TCP connection will > almost certainly use a new source port and thus will likely be hashed to a > different uplink, avoiding the failed uplink and re-establishing connectivity > with the agent. > After failing to receive pongs the SlaveObserver should next try to > re-establish a TCP connection (with exponential back-off) before declaring > the agent as lost. This can avoid significant disruption where large numbers > of agents reached through a single failed link could be removed unnecessarily > while still ensuring that agents that are truly lost are recognized as such. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (MESOS-5361) Consider introducing TCP KeepAlive for Libprocess sockets.
[ https://issues.apache.org/jira/browse/MESOS-5361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-5361: -- Assignee: Ilya Pronin > Consider introducing TCP KeepAlive for Libprocess sockets. > -- > > Key: MESOS-5361 > URL: https://issues.apache.org/jira/browse/MESOS-5361 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: Anand Mazumdar >Assignee: Ilya Pronin > Labels: mesosphere > > We currently don't use TCP KeepAlive's when creating sockets in libprocess. > This might benefit master - scheduler, master - agent connections i.e. we can > detect if any of them failed faster. > Currently, if the master process goes down. If for some reason the {{RST}} > sequence did not reach the scheduler, the scheduler can only come to know > about the disconnection when it tries to do a {{send}} itself. > The default TCP keep alive values on Linux are of little use in a real world > application: > {code} > . This means that the keepalive routines wait for two hours (7200 secs) > before sending the first keepalive probe, and then resend it every 75 > seconds. If no ACK response is received for nine consecutive times, the > connection is marked as broken. > {code} > However, for long running instances of scheduler/agent this still can be > beneficial. Also, operators might start tuning the values for their clusters > explicitly once we start supporting it. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (MESOS-7688) Improve master failover performance by reducing unnecessary agent retries.
[ https://issues.apache.org/jira/browse/MESOS-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059531#comment-16059531 ] Ilya Pronin edited comment on MESOS-7688 at 6/22/17 3:59 PM: - Attached a perf script ([^reregistration.perf.gz]) and a flamegraph ([^reregistration.svg]) for a 2 minute sample of 33k agents reregistration after master (1.2.0 with [r58355|https://reviews.apache.org/r/58355/] backported) failover. was (Author: ipronin): Attached a perf script ([^reregistration.perf.gz]) and a flamegraph ([^reregistration.svg]) for a 2 minute sample of 33k agents reregistration after master failover. > Improve master failover performance by reducing unnecessary agent retries. > -- > > Key: MESOS-7688 > URL: https://issues.apache.org/jira/browse/MESOS-7688 > Project: Mesos > Issue Type: Improvement > Components: agent, master >Reporter: Benjamin Mahler > Labels: scalability > Attachments: 1.2.0.png, reregistration.perf.gz, reregistration.svg > > > Currently, during a failover the agents will (re-)register with the master. > While the master is recovering, the master may drop messages from the agents, > and so the agents must retry registration using a backoff mechanism. For > large clusters, there can be a lot of overhead in processing unnecessary > retries from the agents, given that these messages must be deserialized and > contain all of the task / executor information many times over. > In order to reduce this overhead, the idea is to avoid the need for agents to > blindly retry (re-)registration with the master. Two approaches for this are: > (1) Update the MasterInfo in ZK when the master is recovered. This is a bit > of an abuse of MasterInfo unfortunately, but the idea is for agents to only > (re-)register when they see that the master reaches a recovered state. Once > recovered, the master will not drop messages, and therefore agents only need > to retry when the connection breaks. 
> (2) Have the master reply with a retry message when it's in the recovering > state, so that agents get a clear signal that their messages were dropped. > The agents only retry when the connection breaks or they get a retry message. > This one is less optimal, because the master may have to process a lot of > messages and send retries, but once the master is recovered, the master will > process only a single (re-)registration from each agent. The number of > (re-)registrations that occur while the master is recovering can be reduced > to 1 in this approach if the master sends the retry message only after the > master completes recovery. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Comment Edited] (MESOS-7688) Improve master failover performance by reducing unnecessary agent retries.
[ https://issues.apache.org/jira/browse/MESOS-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16059531#comment-16059531 ] Ilya Pronin edited comment on MESOS-7688 at 6/22/17 3:32 PM: - Attached a perf script ([^reregistration.perf.gz]) and a flamegraph ([^reregistration.svg]) for a 2 minute sample of 33k agents reregistration after master failover. was (Author: ipronin): Attached a perf script ([^reregistration.perf.gz]) and a flamegraph ([^reregistration.svg]) for a 2 minute sample of agents reregistration after master failover. > Improve master failover performance by reducing unnecessary agent retries. > -- > > Key: MESOS-7688 > URL: https://issues.apache.org/jira/browse/MESOS-7688 > Project: Mesos > Issue Type: Improvement > Components: agent, master >Reporter: Benjamin Mahler > Labels: scalability > Attachments: 1.2.0.png, reregistration.perf.gz, reregistration.svg > > > Currently, during a failover the agents will (re-)register with the master. > While the master is recovering, the master may drop messages from the agents, > and so the agents must retry registration using a backoff mechanism. For > large clusters, there can be a lot of overhead in processing unnecessary > retries from the agents, given that these messages must be deserialized and > contain all of the task / executor information many times over. > In order to reduce this overhead, the idea is to avoid the need for agents to > blindly retry (re-)registration with the master. Two approaches for this are: > (1) Update the MasterInfo in ZK when the master is recovered. This is a bit > of an abuse of MasterInfo unfortunately, but the idea is for agents to only > (re-)register when they see that the master reaches a recovered state. Once > recovered, the master will not drop messages, and therefore agents only need > to retry when the connection breaks. 
> (2) Have the master reply with a retry message when it's in the recovering > state, so that agents get a clear signal that their messages were dropped. > The agents only retry when the connection breaks or they get a retry message. > This one is less optimal, because the master may have to process a lot of > messages and send retries, but once the master is recovered, the master will > process only a single (re-)registration from each agent. The number of > (re-)registrations that occur while the master is recovering can be reduced > to 1 in this approach if the master sends the retry message only after the > master completes recovery. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (MESOS-7688) Improve master failover performance by reducing unnecessary agent retries.
[ https://issues.apache.org/jira/browse/MESOS-7688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-7688: --- Attachment: reregistration.svg reregistration.perf.gz Attached a perf script ([^reregistration.perf.gz]) and a flamegraph ([^reregistration.svg]) for a 2 minute sample of agents reregistration after master failover. > Improve master failover performance by reducing unnecessary agent retries. > -- > > Key: MESOS-7688 > URL: https://issues.apache.org/jira/browse/MESOS-7688 > Project: Mesos > Issue Type: Improvement > Components: agent, master >Reporter: Benjamin Mahler > Labels: scalability > Attachments: 1.2.0.png, reregistration.perf.gz, reregistration.svg > > > Currently, during a failover the agents will (re-)register with the master. > While the master is recovering, the master may drop messages from the agents, > and so the agents must retry registration using a backoff mechanism. For > large clusters, there can be a lot of overhead in processing unnecessary > retries from the agents, given that these messages must be deserialized and > contain all of the task / executor information many times over. > In order to reduce this overhead, the idea is to avoid the need for agents to > blindly retry (re-)registration with the master. Two approaches for this are: > (1) Update the MasterInfo in ZK when the master is recovered. This is a bit > of an abuse of MasterInfo unfortunately, but the idea is for agents to only > (re-)register when they see that the master reaches a recovered state. Once > recovered, the master will not drop messages, and therefore agents only need > to retry when the connection breaks. > (2) Have the master reply with a retry message when it's in the recovering > state, so that agents get a clear signal that their messages were dropped. > The agents only retry when the connection breaks or they get a retry message. 
> This one is less optimal, because the master may have to process a lot of > messages and send retries, but once the master is recovered, the master will > process only a single (re-)registration from each agent. The number of > (re-)registrations that occur while the master is recovering can be reduced > to 1 in this approach if the master sends the retry message only after the > master completes recovery. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (MESOS-7698) Libprocess doesn't handle IP changes
Ilya Pronin created MESOS-7698: -- Summary: Libprocess doesn't handle IP changes Key: MESOS-7698 URL: https://issues.apache.org/jira/browse/MESOS-7698 Project: Mesos Issue Type: Bug Components: libprocess Affects Versions: 1.2.0 Reporter: Ilya Pronin If a host's IP address changes, libprocess will never learn about it and will continue to send messages "from" the old IP. This causes odd situations: e.g. an agent will indefinitely try to reregister with a master while claiming that it can be reached at the old IP. The master will send {{SlaveReregisteredMessage}} to the wrong host (potentially a different agent), using the IP from the {{User-Agent: libprocess/*}} header. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (MESOS-6971) Use arena allocation to improve protobuf message passing performance.
[ https://issues.apache.org/jira/browse/MESOS-6971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-6971: -- Assignee: Ilya Pronin > Use arena allocation to improve protobuf message passing performance. > - > > Key: MESOS-6971 > URL: https://issues.apache.org/jira/browse/MESOS-6971 > Project: Mesos > Issue Type: Improvement >Reporter: Benjamin Mahler >Assignee: Ilya Pronin > Labels: tech-debt > > The protobuf message passing provided by {{ProtobufProcess}} provides const > access to the message and/or its fields to the handler function. > This means that we can leverage the [arena > allocator|https://developers.google.com/protocol-buffers/docs/reference/arenas] > provided by protobuf to reduce the memory allocation cost during > de-serialization and improve cache efficiency. > This would require using protobuf 3.x with "proto2" syntax (which appears to > be the default if unspecified) to maintain our existing "proto2" > requirements. The upgrade to protobuf 3.x while keeping "proto2" syntax > should be tackled via a separate ticket that blocks this one. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7461) balloon test and disk full framework test relies on possibly unavailable ports
[ https://issues.apache.org/jira/browse/MESOS-7461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15998289#comment-15998289 ] Ilya Pronin commented on MESOS-7461: We solved this problem in our environment by randomizing the port number :) > balloon test and disk full framework test relies on possibly unavailable ports > -- > > Key: MESOS-7461 > URL: https://issues.apache.org/jira/browse/MESOS-7461 > Project: Mesos > Issue Type: Bug > Components: test >Reporter: Zhitao Li > > balloon_framework_test.sh and disk_full_framework_test.sh both have code that > listens directly on port {{5432}}, but in our environment that port is > reserved by something else. > A possible fix is to write some utility that tries to find an unused port and > use it for the master. It's not perfect though, as there could still be > a race condition. > Another possible fix is to move the listen "port" to a domain socket, once that's > supported. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
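The "utility to find an unused port" suggested in the description is usually approximated by binding to port 0 and letting the kernel pick. A minimal sketch (not the actual test-harness code), which still has the race condition the ticket points out:

```python
import socket

def free_port(host="127.0.0.1"):
    # Bind to port 0 and let the kernel choose an unused ephemeral port.
    # Not race-free: another process can grab the port between close()
    # here and the eventual listen() by the test's master.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))
        return s.getsockname()[1]
```

This is why the ticket calls the approach "not perfect": only handing over a still-bound socket (or a domain socket, as suggested) fully closes the window.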
[jira] [Commented] (MESOS-7387) ZK master contender and detector don't respect zk_session_timeout option
[ https://issues.apache.org/jira/browse/MESOS-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15972777#comment-15972777 ] Ilya Pronin commented on MESOS-7387: Added one more patch that updates the ZK session timeouts part of the high availability doc: https://reviews.apache.org/r/58506/ > ZK master contender and detector don't respect zk_session_timeout option > > > Key: MESOS-7387 > URL: https://issues.apache.org/jira/browse/MESOS-7387 > Project: Mesos > Issue Type: Improvement > Components: master >Affects Versions: 1.3.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > {{ZooKeeperMasterContender}} and {{ZooKeeperMasterDetector}} are using > hardcoded ZK session timeouts ({{MASTER_CONTENDER_ZK_SESSION_TIMEOUT}} and > {{MASTER_DETECTOR_ZK_SESSION_TIMEOUT}}) and do not respect > {{--zk_session_timeout}} master option. This is unexpected and doesn't play > well with ZK updates that take longer than 10 secs. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7387) ZK master contender and detector don't respect zk_session_timeout option
[ https://issues.apache.org/jira/browse/MESOS-7387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15967767#comment-15967767 ] Ilya Pronin commented on MESOS-7387: Review request: https://reviews.apache.org/r/58421/ [~vinodkone], [~bmahler] can you shepherd this please? > ZK master contender and detector don't respect zk_session_timeout option > > > Key: MESOS-7387 > URL: https://issues.apache.org/jira/browse/MESOS-7387 > Project: Mesos > Issue Type: Improvement > Components: master >Affects Versions: 1.3.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > {{ZooKeeperMasterContender}} and {{ZooKeeperMasterDetector}} are using > hardcoded ZK session timeouts ({{MASTER_CONTENDER_ZK_SESSION_TIMEOUT}} and > {{MASTER_DETECTOR_ZK_SESSION_TIMEOUT}}) and do not respect > {{--zk_session_timeout}} master option. This is unexpected and doesn't play > well with ZK updates that take longer than 10 secs. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (MESOS-7387) ZK master contender and detector don't respect zk_session_timeout option
Ilya Pronin created MESOS-7387: -- Summary: ZK master contender and detector don't respect zk_session_timeout option Key: MESOS-7387 URL: https://issues.apache.org/jira/browse/MESOS-7387 Project: Mesos Issue Type: Improvement Components: master Affects Versions: 1.3.0 Reporter: Ilya Pronin Assignee: Ilya Pronin Priority: Minor {{ZooKeeperMasterContender}} and {{ZooKeeperMasterDetector}} are using hardcoded ZK session timeouts ({{MASTER_CONTENDER_ZK_SESSION_TIMEOUT}} and {{MASTER_DETECTOR_ZK_SESSION_TIMEOUT}}) and do not respect {{--zk_session_timeout}} master option. This is unexpected and doesn't play well with ZK updates that take longer than 10 secs. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (MESOS-7376) Long registry updates when the number of agents is high
[ https://issues.apache.org/jira/browse/MESOS-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964547#comment-15964547 ] Ilya Pronin edited comment on MESOS-7376 at 4/11/17 3:49 PM: - Review request: https://reviews.apache.org/r/58355/ {{Registry}} is being copied as a part of {{state::protobuf::State::Variable}} or as a bound parameter in {{state::protobuf::State::store()}}. The former can be mitigated by adding move support to {{Variable}} (using {{Swap()}} for protobuf message). The latter - by using {{Owned}}. But that's not enough because {{Variable}} will still be copied in return value propagation through the {{Future}}-s chain. So in my patch I bypassed {{state::protobuf::State}}. [~bmahler] can you shepherd this please? h4. Benchmark results Before: {noformat} I0411 10:04:11.726016 11802 registrar.cpp:508] Successfully updated the registry in 89.478144ms I0411 10:04:13.860827 11803 registrar.cpp:508] Successfully updated the registry in 216.688896ms I0411 10:04:15.167768 11803 registrar.cpp:508] Successfully updated the registry in 1.29364096secs I0411 10:04:18.967394 11803 registrar.cpp:508] Successfully updated the registry in 3.696552192secs I0411 10:04:25.631009 11803 registrar.cpp:508] Successfully updated the registry in 6.267425024secs I0411 10:04:42.625507 11803 registrar.cpp:508] Successfully updated the registry in 15.876419072secs I0411 10:04:44.209377 11787 registrar_tests.cpp:1262] Admitted 5 agents in 30.479743816secs I0411 10:05:04.446650 11820 registrar.cpp:508] Successfully updated the registry in 18.338545152secs I0411 10:05:21.171001 11820 registrar.cpp:508] Successfully updated the registry in 15.31903872secs I0411 10:05:37.592319 11820 registrar.cpp:508] Successfully updated the registry in 14.863101952secs I0411 10:05:39.099174 11787 registrar_tests.cpp:1276] Marked 5 agents reachable in 53.593596102secs ../../src/tests/registrar_tests.cpp:1287: Failure Failed to wait 15secs for registry {noformat} After: 
{noformat} I0411 15:19:12.228904 40643 registrar.cpp:524] Successfully updated the registry in 91.262208ms I0411 15:19:14.543190 40660 registrar.cpp:524] Successfully updated the registry in 377.45408ms I0411 15:19:15.707006 40660 registrar.cpp:524] Successfully updated the registry in 1.138724096secs I0411 15:19:18.267305 40660 registrar.cpp:524] Successfully updated the registry in 2.466145792secs I0411 15:19:19.092073 40660 registrar.cpp:524] Successfully updated the registry in 523.11296ms I0411 15:19:20.809330 40648 registrar.cpp:524] Successfully updated the registry in 892.141824ms I0411 15:19:21.194135 40622 registrar_tests.cpp:1262] Admitted 5 agents in 6.938952085secs I0411 15:19:26.973904 40637 registrar.cpp:524] Successfully updated the registry in 3.938064128secs I0411 15:19:28.631865 40637 registrar.cpp:524] Successfully updated the registry in 1.116326144secs I0411 15:19:30.222944 40660 registrar.cpp:524] Successfully updated the registry in 911.86688ms I0411 15:19:30.678509 40622 registrar_tests.cpp:1276] Marked 5 agents reachable in 8.249523305secs I0411 15:19:35.138797 40645 registrar.cpp:524] Successfully updated the registry in 815.439104ms I0411 15:19:41.783651 40622 registrar_tests.cpp:1288] Recovered 5 agents (8238915B) in 10.963297677secs I0411 15:19:47.431670 40657 registrar.cpp:524] Successfully updated the registry in 3.960920064secs I0411 15:20:13.769872 40657 registrar.cpp:524] Successfully updated the registry in 1.169234944secs I0411 15:21:49.685801 40657 registrar.cpp:524] Successfully updated the registry in 264.850688ms Removed 5 agents in 2.12256788111667mins {noformat} Similar picture in scale testing: {noformat} I0411 13:04:27.598438 41549 registrar.cpp:537] Successfully updated the registry in 2.68846208secs I0411 13:04:30.716615 41552 registrar.cpp:537] Successfully updated the registry in 2.61457792secs I0411 13:04:33.828133 41554 registrar.cpp:537] Successfully updated the registry in 2.644827904secs I0411 13:04:37.634577 
41553 registrar.cpp:537] Successfully updated the registry in 3.338414848secs I0411 13:04:40.723475 41546 registrar.cpp:537] Successfully updated the registry in 2.629734144secs {noformat} was (Author: ipronin): Review request: https://reviews.apache.org/r/58355/ {{Registry}} is being copied as a part of {{state::protobuf::State::Variable}} or as a bound parameter in {{state::protobuf::State::store()}}. The former can be mitigated by adding move support to {{Variable}} (using {{Swap()}} for protobuf message). The latter - by using {{Owned}}. But that's not enough because {{Variable}} will still be copied in return value propagation through the {{Future}}-s chain. So in my patch I bypassed {{state::protobuf::State}}. [~bmahler] can you shepherd this please? h4. Benchmark results Before: {noformat} I0411 10:04:11.726016 11802 registrar.cpp:508]
[jira] [Commented] (MESOS-7376) Long registry updates when the number of agents is high
[ https://issues.apache.org/jira/browse/MESOS-7376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964547#comment-15964547 ] Ilya Pronin commented on MESOS-7376: Review request: https://reviews.apache.org/r/58355/ {{Registry}} is being copied as a part of {{state::protobuf::State::Variable}} or as a bound parameter in {{state::protobuf::State::store()}}. The former can be mitigated by adding move support to {{Variable}} (using {{Swap()}} for protobuf message). The latter - by using {{Owned}}. But that's not enough because {{Variable}} will still be copied in return value propagation through the {{Future}}-s chain. So in my patch I bypassed {{state::protobuf::State}}. [~bmahler] can you shepherd this please? h4. Benchmark results Before: {noformat} I0411 10:04:11.726016 11802 registrar.cpp:508] Successfully updated the registry in 89.478144ms I0411 10:04:13.860827 11803 registrar.cpp:508] Successfully updated the registry in 216.688896ms I0411 10:04:15.167768 11803 registrar.cpp:508] Successfully updated the registry in 1.29364096secs I0411 10:04:18.967394 11803 registrar.cpp:508] Successfully updated the registry in 3.696552192secs I0411 10:04:25.631009 11803 registrar.cpp:508] Successfully updated the registry in 6.267425024secs I0411 10:04:42.625507 11803 registrar.cpp:508] Successfully updated the registry in 15.876419072secs I0411 10:04:44.209377 11787 registrar_tests.cpp:1262] Admitted 5 agents in 30.479743816secs I0411 10:05:04.446650 11820 registrar.cpp:508] Successfully updated the registry in 18.338545152secs I0411 10:05:21.171001 11820 registrar.cpp:508] Successfully updated the registry in 15.31903872secs I0411 10:05:37.592319 11820 registrar.cpp:508] Successfully updated the registry in 14.863101952secs I0411 10:05:39.099174 11787 registrar_tests.cpp:1276] Marked 5 agents reachable in 53.593596102secs ../../src/tests/registrar_tests.cpp:1287: Failure Failed to wait 15secs for registry {noformat} After: {noformat} I0411 
15:19:12.228904 40643 registrar.cpp:524] Successfully updated the registry in 91.262208ms I0411 15:19:14.543190 40660 registrar.cpp:524] Successfully updated the registry in 377.45408ms I0411 15:19:15.707006 40660 registrar.cpp:524] Successfully updated the registry in 1.138724096secs I0411 15:19:18.267305 40660 registrar.cpp:524] Successfully updated the registry in 2.466145792secs I0411 15:19:19.092073 40660 registrar.cpp:524] Successfully updated the registry in 523.11296ms I0411 15:19:20.809330 40648 registrar.cpp:524] Successfully updated the registry in 892.141824ms I0411 15:19:21.194135 40622 registrar_tests.cpp:1262] Admitted 5 agents in 6.938952085secs I0411 15:19:26.973904 40637 registrar.cpp:524] Successfully updated the registry in 3.938064128secs I0411 15:19:28.631865 40637 registrar.cpp:524] Successfully updated the registry in 1.116326144secs I0411 15:19:30.222944 40660 registrar.cpp:524] Successfully updated the registry in 911.86688ms I0411 15:19:30.678509 40622 registrar_tests.cpp:1276] Marked 5 agents reachable in 8.249523305secs I0411 15:19:35.138797 40645 registrar.cpp:524] Successfully updated the registry in 815.439104ms I0411 15:19:41.783651 40622 registrar_tests.cpp:1288] Recovered 5 agents (8238915B) in 10.963297677secs I0411 15:19:47.431670 40657 registrar.cpp:524] Successfully updated the registry in 3.960920064secs I0411 15:20:13.769872 40657 registrar.cpp:524] Successfully updated the registry in 1.169234944secs I0411 15:21:49.685801 40657 registrar.cpp:524] Successfully updated the registry in 264.850688ms Removed 5 agents in 2.12256788111667mins {noformat} > Long registry updates when the number of agents is high > --- > > Key: MESOS-7376 > URL: https://issues.apache.org/jira/browse/MESOS-7376 > Project: Mesos > Issue Type: Improvement > Components: master >Affects Versions: 1.3.0 >Reporter: Ilya Pronin >Assignee: Ilya Pronin > > During scale testing we discovered that as the number of registered agents > grows the time it takes to 
update the registry grows to unacceptable values > very fast. At some point it starts exceeding {{registry_store_timeout}} which > doesn't fire. > With 55k agents we saw this ({{registry_store_timeout=20secs}}): > {noformat} > I0331 17:11:21.227442 36472 registrar.cpp:473] Applied 69 operations in > 3.138843387secs; attempting to update the registry > I0331 17:11:24.441409 36464 log.cpp:529] LogStorage.set: acquired the lock in > 74461ns > I0331 17:11:24.441541 36464 log.cpp:543] LogStorage.set: started in 51770ns > I0331 17:11:26.869323 36462 log.cpp:628] LogStorage.set: wrote append at > position=6420881 in 2.41043644secs > I0331 17:11:26.869454 36462 state.hpp:179] State.store: storage.set has > finished in 2.428189561secs (b=1) > I0331
[jira] [Created] (MESOS-7376) Long registry updates when the number of agents is high
Ilya Pronin created MESOS-7376: -- Summary: Long registry updates when the number of agents is high Key: MESOS-7376 URL: https://issues.apache.org/jira/browse/MESOS-7376 Project: Mesos Issue Type: Improvement Components: master Affects Versions: 1.3.0 Reporter: Ilya Pronin Assignee: Ilya Pronin During scale testing we discovered that as the number of registered agents grows the time it takes to update the registry grows to unacceptable values very fast. At some point it starts exceeding {{registry_store_timeout}} which doesn't fire. With 55k agents we saw this ({{registry_store_timeout=20secs}}): {noformat} I0331 17:11:21.227442 36472 registrar.cpp:473] Applied 69 operations in 3.138843387secs; attempting to update the registry I0331 17:11:24.441409 36464 log.cpp:529] LogStorage.set: acquired the lock in 74461ns I0331 17:11:24.441541 36464 log.cpp:543] LogStorage.set: started in 51770ns I0331 17:11:26.869323 36462 log.cpp:628] LogStorage.set: wrote append at position=6420881 in 2.41043644secs I0331 17:11:26.869454 36462 state.hpp:179] State.store: storage.set has finished in 2.428189561secs (b=1) I0331 17:11:56.199453 36469 registrar.cpp:518] Successfully updated the registry in 34.971944192secs {noformat} This is caused by repeated {{Registry}} copying which involves copying a big object graph that takes roughly 0.4 sec (with 55k agents). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (MESOS-5172) Registry puller cannot fetch blobs correctly from http Redirect 3xx urls.
[ https://issues.apache.org/jira/browse/MESOS-5172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964075#comment-15964075 ] Ilya Pronin edited comment on MESOS-5172 at 4/11/17 2:36 PM: - This issue looks related: https://issues.apache.org/jira/browse/MESOS-6561 was (Author: ipronin): This issue can be related: https://issues.apache.org/jira/browse/MESOS-6561 > Registry puller cannot fetch blobs correctly from http Redirect 3xx urls. > - > > Key: MESOS-5172 > URL: https://issues.apache.org/jira/browse/MESOS-5172 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Gilbert Song >Assignee: Gilbert Song >Priority: Blocker > Labels: containerizer, mesosphere > > When the registry puller is pulling a private repository from some private > registry (e.g., quay.io), errors may occur when fetching blobs, at which > point fetching the manifest of the repo is finished correctly. The error > message is `Unexpected HTTP response '400 Bad Request' when trying to > download the blob`. This may arise from the logic of fetching blobs, or > incorrect format of uri when requesting blobs. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-5172) Registry puller cannot fetch blobs correctly from http Redirect 3xx urls.
[ https://issues.apache.org/jira/browse/MESOS-5172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964075#comment-15964075 ] Ilya Pronin commented on MESOS-5172: This issue can be related: https://issues.apache.org/jira/browse/MESOS-6561 > Registry puller cannot fetch blobs correctly from http Redirect 3xx urls. > - > > Key: MESOS-5172 > URL: https://issues.apache.org/jira/browse/MESOS-5172 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Gilbert Song >Assignee: Gilbert Song >Priority: Blocker > Labels: containerizer, mesosphere > > When the registry puller is pulling a private repository from some private > registry (e.g., quay.io), errors may occur when fetching blobs, at which > point fetching the manifest of the repo is finished correctly. The error > message is `Unexpected HTTP response '400 Bad Request' when trying to > download the blob`. This may arise from the logic of fetching blobs, or > incorrect format of uri when requesting blobs. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (MESOS-6127) Implement support for HTTP/2
[ https://issues.apache.org/jira/browse/MESOS-6127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin reassigned MESOS-6127: -- Assignee: Ilya Pronin > Implement support for HTTP/2 > - > > Key: MESOS-6127 > URL: https://issues.apache.org/jira/browse/MESOS-6127 > Project: Mesos > Issue Type: Epic > Components: HTTP API, libprocess >Reporter: Aaron Wood >Assignee: Ilya Pronin > Labels: performance > > HTTP/2 will allow us to take advantage of connection multiplexing, header > compression, streams, server push, etc. Add support for communication over > HTTP/2 between masters and agents, framework endpoints, etc. > Should we support HTTP/2 without TLS? The spec allows for this but most major > browser vendors, libraries, and implementations aren't supporting it unless > TLS is used. If we do require TLS, what can be done to reduce the performance > hit of the TLS handshake? Might need to change more code to make sure that we > are taking advantage of connection sharing so that we can (ideally) only ever > have a one-time TLS handshake per shared connection. > Some ideas for libs: > https://nghttp2.org/documentation/package_README.html - Has encoders/decoders > supporting HPACK https://nghttp2.org/documentation/tutorial-hpack.html > https://nghttp2.org/documentation/libnghttp2_asio.html - Currently marked as > experimental by the nghttp2 docs -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (MESOS-7281) Backwards incompatible UpdateFrameworkMessage handling
[ https://issues.apache.org/jira/browse/MESOS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936154#comment-15936154 ] Ilya Pronin edited comment on MESOS-7281 at 3/22/17 12:47 PM: -- Review requests: https://reviews.apache.org/r/57836/ https://reviews.apache.org/r/57838 [~mcypark] since you wrote the original patch, can you shepherd this please? was (Author: ipronin): Review request: https://reviews.apache.org/r/57836/ [~mcypark] since you wrote the original patch, can you shepherd this please? > Backwards incompatible UpdateFrameworkMessage handling > -- > > Key: MESOS-7281 > URL: https://issues.apache.org/jira/browse/MESOS-7281 > Project: Mesos > Issue Type: Bug > Components: agent >Reporter: Ilya Pronin >Assignee: Ilya Pronin > > Patch in [r/57108|https://reviews.apache.org/r/57108/] introduced framework > info updates. Agents are using a new {{framework_info}} field without > checking that it's present. If a patched agent is used with an unpatched > master it will get a default-initialized {{framework_info}}. This will cause > agent failures later, e.g. an abort on framework ID validation when it tries to > launch a new task for the updated framework. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7281) Backwards incompatible UpdateFrameworkMessage handling
[ https://issues.apache.org/jira/browse/MESOS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15936154#comment-15936154 ] Ilya Pronin commented on MESOS-7281: Review request: https://reviews.apache.org/r/57836/ [~mcypark] since you wrote the original patch, can you shepherd this please? > Backwards incompatible UpdateFrameworkMessage handling > -- > > Key: MESOS-7281 > URL: https://issues.apache.org/jira/browse/MESOS-7281 > Project: Mesos > Issue Type: Bug > Components: agent >Reporter: Ilya Pronin >Assignee: Ilya Pronin > > Patch in [r/57108|https://reviews.apache.org/r/57108/] introduced framework > info updates. Agents are using a new {{framework_info}} field without > checking that it's present. If a patched agent is used with an unpatched > master it will get a default-initialized {{framework_info}}. This will cause > agent failures later, e.g. an abort on framework ID validation when it tries to > launch a new task for the updated framework. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (MESOS-7281) Backwards incompatible UpdateFrameworkMessage handling
[ https://issues.apache.org/jira/browse/MESOS-7281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-7281: --- Shepherd: Michael Park > Backwards incompatible UpdateFrameworkMessage handling > -- > > Key: MESOS-7281 > URL: https://issues.apache.org/jira/browse/MESOS-7281 > Project: Mesos > Issue Type: Bug > Components: agent >Reporter: Ilya Pronin >Assignee: Ilya Pronin > > Patch in [r/57108|https://reviews.apache.org/r/57108/] introduced framework > info updates. Agents are using a new {{framework_info}} field without > checking that it's present. If a patched agent is used with an unpatched > master it will get a default-initialized {{framework_info}}. This will cause > agent failures later, e.g. an abort on framework ID validation when it tries to > launch a new task for the updated framework. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (MESOS-7281) Backwards incompatible UpdateFrameworkMessage handling
Ilya Pronin created MESOS-7281: -- Summary: Backwards incompatible UpdateFrameworkMessage handling Key: MESOS-7281 URL: https://issues.apache.org/jira/browse/MESOS-7281 Project: Mesos Issue Type: Bug Components: agent Reporter: Ilya Pronin Assignee: Ilya Pronin Patch in [r/57108|https://reviews.apache.org/r/57108/] introduced framework info updates. Agents are using a new {{framework_info}} field without checking that it's present. If a patched agent is used with an unpatched master it will get a default-initialized {{framework_info}}. This will cause agent failures later, e.g. an abort on framework ID validation when it tries to launch a new task for the updated framework. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7256) Replace Boost Type Traits leftovers with STL
[ https://issues.apache.org/jira/browse/MESOS-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928008#comment-15928008 ] Ilya Pronin commented on MESOS-7256: Review requests: https://reviews.apache.org/r/57689/ https://reviews.apache.org/r/57690/ https://reviews.apache.org/r/57691/ [~mcypark] can you shepherd this, please? > Replace Boost Type Traits leftovers with STL > > > Key: MESOS-7256 > URL: https://issues.apache.org/jira/browse/MESOS-7256 > Project: Mesos > Issue Type: Improvement > Components: libprocess, stout >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > {{boost::enable_if}} and {{boost::is_*}} from Boost Type Traits and Utility > are still being used in some places in Stout and libprocess. They can be > replaced with their C++11 STL counterparts. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (MESOS-7256) Replace Boost Type Traits leftovers with STL
Ilya Pronin created MESOS-7256: -- Summary: Replace Boost Type Traits leftovers with STL Key: MESOS-7256 URL: https://issues.apache.org/jira/browse/MESOS-7256 Project: Mesos Issue Type: Improvement Components: libprocess, stout Reporter: Ilya Pronin Assignee: Ilya Pronin Priority: Minor {{boost::enable_if}} and {{boost::is_*}} from Boost Type Traits and Utility are still being used in some places in Stout and libprocess. They can be replaced with their C++11 STL counterparts. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-2824) Support pre-fetching images
[ https://issues.apache.org/jira/browse/MESOS-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901585#comment-15901585 ] Ilya Pronin commented on MESOS-2824: Review requests: https://reviews.apache.org/r/57425/ ({{Containerizer::pull}} method) https://reviews.apache.org/r/57426/ ({{PULL_CONTAINER_IMAGE}} agent API call) https://reviews.apache.org/r/57427/ (authorization for {{PULL_CONTAINER_IMAGE}} API call) > Support pre-fetching images > --- > > Key: MESOS-2824 > URL: https://issues.apache.org/jira/browse/MESOS-2824 > Project: Mesos > Issue Type: Improvement > Components: isolation >Affects Versions: 0.23.0 >Reporter: Ian Downes >Assignee: Ilya Pronin >Priority: Minor > Labels: mesosphere, twitter > > Default container images can be specified with the --default_container_info > flag to the slave. This may be a large image that will take a long time to > initially fetch/hash/extract when the first container is provisioned. Add > optional support to start fetching the image when the slave starts and > consider not registering until the fetch is complete. > To extend that, we should support an operator endpoint so that operators can > specify images to pre-fetch. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (MESOS-7089) Local Docker Resolver for Mesos Containerizer
[ https://issues.apache.org/jira/browse/MESOS-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ilya Pronin updated MESOS-7089: --- Description: Docker’s mutable tags serve as a layer of indirection which can be used to point a tag to different image digests (concrete immutable images) at different points in time. For instance the `latest` tag can point to `digest-0` at t0 and then to `digest-1` at t1. Mesos has support for a local docker registry, where the images are files on the local filesystem, named either as `repo:tag` or `repo@digest`. This approach trims the degree of freedom provided by the indirection mentioned above (from Docker’s mutable tags), which can be essential in some cases. For instance, it might be useful in cases where the operator of a cluster would like to roll out image updates without having the customers update their task configuration. (was: Docker’s mutable tags serve as a layer of indirection which can be used to point a tag to different image digests (concrete immutable images) at different points in time. For instance the `latest` tag can point to `digest-0` at t0 and then to `digest-1` at t1. Mesos has support for a local docker registry, where the images are files on the local filesystem, named either as `repo:tag` or `repo:digest`. This approach trims the degree of freedom provided by the indirection mentioned above (from Docker’s mutable tags), which can be essential in some cases. For instance, it might be useful in cases where the operator of a cluster would like to roll out image updates without having the customers update their task configuration.) 
> Local Docker Resolver for Mesos Containerizer > - > > Key: MESOS-7089 > URL: https://issues.apache.org/jira/browse/MESOS-7089 > Project: Mesos > Issue Type: Story >Reporter: Santhosh Kumar Shanmugham > > Docker’s mutable tags serve as a layer of indirection which can be used to > point a tag to different image digests (concrete immutable images) at > different points in time. For instance the `latest` tag can point to `digest-0` at > t0 and then to `digest-1` at t1. Mesos has support for a local docker registry, > where the images are files on the local filesystem, named either as > `repo:tag` or `repo@digest`. This approach trims the degree of freedom > provided by the indirection mentioned above (from Docker’s mutable tags), > which can be essential in some cases. For instance, it might be useful in > cases where the operator of a cluster would like to roll out image updates > without having the customers update their task configuration. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7089) Local Docker Resolver for Mesos Containerizer
[ https://issues.apache.org/jira/browse/MESOS-7089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859778#comment-15859778 ] Ilya Pronin commented on MESOS-7089: [~santhk], I'm afraid people won't be able to access docs in our domain using the link. You need to post it from a personal account. > Local Docker Resolver for Mesos Containerizer > - > > Key: MESOS-7089 > URL: https://issues.apache.org/jira/browse/MESOS-7089 > Project: Mesos > Issue Type: Story >Reporter: Santhosh Kumar Shanmugham > > Docker’s mutable tags serve as a layer of indirection which can be used to > point a tag to different image digests (concrete immutable images) at > different points in time. For instance the `latest` tag can point to `digest-0` at > t0 and then to `digest-1` at t1. Mesos has support for a local docker registry, > where the images are files on the local filesystem, named either as > `repo:tag` or `repo:digest`. This approach trims the degree of freedom > provided by the indirection mentioned above (from Docker’s mutable tags), > which can be essential in some cases. For instance, it might be useful in > cases where the operator of a cluster would like to roll out image updates > without having the customers update their task configuration. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Comment Edited] (MESOS-7069) The linux filesystem isolator should set mode and ownership for host volumes.
[ https://issues.apache.org/jira/browse/MESOS-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858068#comment-15858068 ] Ilya Pronin edited comment on MESOS-7069 at 2/8/17 2:56 PM: Internally we tried adding the same functionality that the {{filesystem/shared}} isolator had (described in [my comment in MESOS-6563|https://issues.apache.org/jira/browse/MESOS-6563?focusedCommentId=15683941=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15683941]). This can be the first step. Also the {{Volume}} protobuf has the {{mode}} field. It can be used for setting permissions on the mounted host directory. was (Author: ipronin): Internally we added the same functionality that {{filesystem/shared}} isolator had (described in [my comment in MESOS-6563|https://issues.apache.org/jira/browse/MESOS-6563?focusedCommentId=15683941=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15683941]). This can be the first step. Also {{Volume}} protobuf has the {{mode}} field. It can be used for setting permissions on the mounted host directory. > The linux filesystem isolator should set mode and ownership for host volumes. > - > > Key: MESOS-7069 > URL: https://issues.apache.org/jira/browse/MESOS-7069 > Project: Mesos > Issue Type: Bug > Components: isolation >Reporter: Gilbert Song > Labels: filesystem, linux, volumes > > If the host path is a relative path, the linux filesystem isolator should set > the mode and ownership for this host volume since it allows a non-root user to > write to the volume. Note that this is the case of sharing the host > filesystem (without rootfs). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7069) The linux filesystem isolator should set mode and ownership for host volumes.
[ https://issues.apache.org/jira/browse/MESOS-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15858068#comment-15858068 ] Ilya Pronin commented on MESOS-7069: Internally we added the same functionality that the {{filesystem/shared}} isolator had (described in [my comment in MESOS-6563|https://issues.apache.org/jira/browse/MESOS-6563?focusedCommentId=15683941=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15683941]). This can be the first step. Also the {{Volume}} protobuf has the {{mode}} field. It can be used for setting permissions on the mounted host directory. > The linux filesystem isolator should set mode and ownership for host volumes. > - > > Key: MESOS-7069 > URL: https://issues.apache.org/jira/browse/MESOS-7069 > Project: Mesos > Issue Type: Bug > Components: isolation >Reporter: Gilbert Song > Labels: filesystem, linux, volumes > > If the host path is a relative path, the linux filesystem isolator should set > the mode and ownership for this host volume since it allows a non-root user to > write to the volume. Note that this is the case of sharing the host > filesystem (without rootfs). -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7045) Skip already stored layers in local Docker puller
[ https://issues.apache.org/jira/browse/MESOS-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855894#comment-15855894 ] Ilya Pronin commented on MESOS-7045: Fixed tests in https://reviews.apache.org/r/56284/ > Skip already stored layers in local Docker puller > - > > Key: MESOS-7045 > URL: https://issues.apache.org/jira/browse/MESOS-7045 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > {{slave::docker::LocalPuller}} can skip extracting layers that are already > present in the store. {{slave::docker::RegistryPuller}} already does this and > {{slave::docker::Store}} is ready for it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-2824) Support pre-fetching images
[ https://issues.apache.org/jira/browse/MESOS-2824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854053#comment-15854053 ] Ilya Pronin commented on MESOS-2824: I've created a small design doc for this feature: https://docs.google.com/document/d/1TdrF-EFNvxlEYou_CCmW0LnCTPoBcUHasRNALXVS3hY/edit?usp=sharing Please, comment. > Support pre-fetching images > --- > > Key: MESOS-2824 > URL: https://issues.apache.org/jira/browse/MESOS-2824 > Project: Mesos > Issue Type: Improvement > Components: isolation >Affects Versions: 0.23.0 >Reporter: Ian Downes >Assignee: Ilya Pronin >Priority: Minor > Labels: mesosphere, twitter > > Default container images can be specified with the --default_container_info > flag to the slave. This may be a large image that will take a long time to > initially fetch/hash/extract when the first container is provisioned. Add > optional support to start fetching the image when the slave starts and > consider not registering until the fetch is complete. > To extend that, we should support an operator endpoint so that operators can > specify images to pre-fetch. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7006) Launch docker containers with --cpus instead of cpu-shares
[ https://issues.apache.org/jira/browse/MESOS-7006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854002#comment-15854002 ] Ilya Pronin commented on MESOS-7006: On Linux the {{--cpus}} option uses only CFS, with {{cpu.cfs_period_us}} set to 100ms: https://github.com/docker/docker/blob/master/daemon/daemon_unix.go#L115-L121 > Launch docker containers with --cpus instead of cpu-shares > -- > > Key: MESOS-7006 > URL: https://issues.apache.org/jira/browse/MESOS-7006 > Project: Mesos > Issue Type: Improvement >Affects Versions: 1.1.0 >Reporter: Craig W >Assignee: Tomasz Janiszewski >Priority: Minor > > docker 1.13 was recently released and it now has a new --cpus flag which > allows a user to specify how many cpus a container should have. This is much > simpler for users to reason about. > mesos should switch to starting a container with --cpus instead of > --cpu-shares, or at least make it configurable. > https://blog.docker.com/2017/01/cpu-management-docker-1-13/ -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (MESOS-7059) Unnecessary mkdirs in ProvisionerDockerLocalStoreTest.*
Ilya Pronin created MESOS-7059: -- Summary: Unnecessary mkdirs in ProvisionerDockerLocalStoreTest.* Key: MESOS-7059 URL: https://issues.apache.org/jira/browse/MESOS-7059 Project: Mesos Issue Type: Bug Components: tests Reporter: Ilya Pronin Assignee: Ilya Pronin Priority: Minor {{ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar}} and {{ProvisionerDockerLocalStoreTest.PullingSameImageSimutanuously}} start with creating directories that were already created by {{SetUp()}} and directories that are not used. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7059) Unnecessary mkdirs in ProvisionerDockerLocalStoreTest.*
[ https://issues.apache.org/jira/browse/MESOS-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852093#comment-15852093 ] Ilya Pronin commented on MESOS-7059: Review request: https://reviews.apache.org/r/56291/ > Unnecessary mkdirs in ProvisionerDockerLocalStoreTest.* > --- > > Key: MESOS-7059 > URL: https://issues.apache.org/jira/browse/MESOS-7059 > Project: Mesos > Issue Type: Bug > Components: tests >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > {{ProvisionerDockerLocalStoreTest.LocalStoreTestWithTar}} and > {{ProvisionerDockerLocalStoreTest.PullingSameImageSimutanuously}} start with > creating directories that were already created by {{SetUp()}} and directories > that are not used. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7046) Simplify AppC provisioner's cache keys comparison
[ https://issues.apache.org/jira/browse/MESOS-7046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848704#comment-15848704 ] Ilya Pronin commented on MESOS-7046: Review requests: https://reviews.apache.org/r/56086/ https://reviews.apache.org/r/56087/ > Simplify AppC provisioner's cache keys comparison > - > > Key: MESOS-7046 > URL: https://issues.apache.org/jira/browse/MESOS-7046 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > {{appc::Cache::Key::operator==()}} does a manual map comparison by looking up > all elements of one container in the other and vice versa. > {{std::map::operator==()}} does this more efficiently by checking sizes and > doing an element-by-element comparison. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (MESOS-7046) Simplify AppC provisioner's cache keys comparison
Ilya Pronin created MESOS-7046: -- Summary: Simplify AppC provisioner's cache keys comparison Key: MESOS-7046 URL: https://issues.apache.org/jira/browse/MESOS-7046 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Ilya Pronin Assignee: Ilya Pronin Priority: Minor {{appc::Cache::Key::operator==()}} does a manual map comparison by looking up all elements of one container in the other and vice versa. {{std::map::operator==()}} does this more efficiently by checking sizes and doing an element-by-element comparison. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (MESOS-7045) Skip already stored layers in local Docker puller
[ https://issues.apache.org/jira/browse/MESOS-7045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848486#comment-15848486 ] Ilya Pronin commented on MESOS-7045: Review request: https://reviews.apache.org/r/56174/ > Skip already stored layers in local Docker puller > - > > Key: MESOS-7045 > URL: https://issues.apache.org/jira/browse/MESOS-7045 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Ilya Pronin >Assignee: Ilya Pronin >Priority: Minor > > {{slave::docker::LocalPuller}} can skip extracting layers that are already > present in the store. {{slave::docker::RegistryPuller}} already does this and > {{slave::docker::Store}} is ready for it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (MESOS-7045) Skip already stored layers in local Docker puller
Ilya Pronin created MESOS-7045: -- Summary: Skip already stored layers in local Docker puller Key: MESOS-7045 URL: https://issues.apache.org/jira/browse/MESOS-7045 Project: Mesos Issue Type: Improvement Components: containerization Reporter: Ilya Pronin Assignee: Ilya Pronin Priority: Minor {{slave::docker::LocalPuller}} can skip extracting layers that are already present in the store. {{slave::docker::RegistryPuller}} already does this and {{slave::docker::Store}} is ready for it. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] (MESOS-7034) Mesos agent needs to attempt overlayfs module
[ https://issues.apache.org/jira/browse/MESOS-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847197#comment-15847197 ] Ilya Pronin edited comment on MESOS-7034 at 1/31/17 5:45 PM: - It doesn't feel 100% right to me either. But the overlay module doesn't have any parameters, so the only concern is when it gets loaded. Also it looks like Docker does the same thing, though their docs say that the module should already be loaded: https://github.com/docker/docker/blob/master/daemon/graphdriver/overlay2/overlay.go#L224 was (Author: ipronin): It doesn't feel 100% right with me either. But "overlay" doesn't have any parameters so the only concern will be when it is loaded. Also looks like Docker does the same thing though in their docs they tell that the module should already be loaded: https://github.com/docker/docker/blob/master/daemon/graphdriver/overlay2/overlay.go#L224 > Mesos agent needs to attempt overlayfs module > - > > Key: MESOS-7034 > URL: https://issues.apache.org/jira/browse/MESOS-7034 > Project: Mesos > Issue Type: Bug > Components: agent >Reporter: Santhosh Kumar Shanmugham >Priority: Minor > > Mesos agent reads {{/proc/filesystems}} to check if a filesystem is > supported. However for optional filesystems such as {{overlayfs}}, the > modules are not loaded by default. Hence attempt to run a {{modprobe > overlayfs}} command before reading the {{/proc/filesystems}} file. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] (MESOS-7034) Mesos agent needs to attempt overlayfs module
[ https://issues.apache.org/jira/browse/MESOS-7034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15847197#comment-15847197 ] Ilya Pronin commented on MESOS-7034: It doesn't feel 100% right to me either. But "overlay" doesn't have any parameters, so the only concern is when it gets loaded. Also it looks like Docker does the same thing, though their docs say that the module should already be loaded: https://github.com/docker/docker/blob/master/daemon/graphdriver/overlay2/overlay.go#L224 > Mesos agent needs to attempt overlayfs module > - > > Key: MESOS-7034 > URL: https://issues.apache.org/jira/browse/MESOS-7034 > Project: Mesos > Issue Type: Bug > Components: agent >Reporter: Santhosh Kumar Shanmugham >Priority: Minor > > Mesos agent reads {{/proc/filesystems}} to check if a filesystem is > supported. However for optional filesystems such as {{overlayfs}}, the > modules are not loaded by default. Hence attempt to run a {{modprobe > overlayfs}} command before reading the {{/proc/filesystems}} file. -- This message was sent by Atlassian JIRA (v6.3.15#6346)