[jira] [Commented] (MESOS-3046) Stout's UUID re-seeds a new random generator during each call to UUID::random.
[ https://issues.apache.org/jira/browse/MESOS-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805172#comment-14805172 ]

Nikita Vetoshkin commented on MESOS-3046:
-----------------------------------------

Seems like a clang + libstdc++ issue:
{quote}
thread_local support currently requires the C++ runtime library from g++-4.8 or later.
{quote}
(from http://clang.llvm.org/cxx_status.html). This seems to be a frequent issue; there is a corresponding [fix|https://github.com/textmate/textmate/commit/172ce9d4282e408fe60b699c432390b9f6e3f74a] for TextMate.

> Stout's UUID re-seeds a new random generator during each call to UUID::random.
> ------------------------------------------------------------------------------
>
>                 Key: MESOS-3046
>                 URL: https://issues.apache.org/jira/browse/MESOS-3046
>             Project: Mesos
>          Issue Type: Bug
>          Components: stout
>            Reporter: Benjamin Mahler
>            Assignee: Klaus Ma
>              Labels: newbie, twitter
>             Fix For: 0.25.0
>
>         Attachments: tl.cpp
>
> Per [~StephanErb] and [~kevints]'s observations on MESOS-2940, stout's UUID
> abstraction is re-seeding the random generator during each call to
> {{UUID::random()}}, which is really expensive.
> This is confirmed in the perf graph from MESOS-2940.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (MESOS-3046) Stout's UUID re-seeds a new random generator during each call to UUID::random.
[ https://issues.apache.org/jira/browse/MESOS-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647282#comment-14647282 ]

Nikita Vetoshkin commented on MESOS-3046:
-----------------------------------------

Out of curiosity, took a look at this one. In MESOS-2940 it was suggested to make the generator {{thread_local}}, like this:
{noformat}
diff --git a/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp b/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp
index e8ebe0b..b0facb2 100644
--- a/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp
+++ b/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp
@@ -28,7 +28,7 @@ struct UUID : boost::uuids::uuid
 public:
   static UUID random()
   {
-    return UUID(boost::uuids::random_generator()());
+    return UUID(random_generator());
   }

   static UUID fromBytes(const std::string s)
@@ -62,6 +62,7 @@ public:
 private:
   explicit UUID(const boost::uuids::uuid uuid) : boost::uuids::uuid(uuid) {}
+  static thread_local boost::uuids::random_generator random_generator;
 };

 #endif // __STOUT_UUID_HPP__
{noformat}
This fails with GCC 5.1.1 on Fedora x64:
{noformat}
./.libs/libmesos.so: error: undefined reference to 'TLS init function for UUID::random_generator'
./.libs/libmesos.so: error: undefined reference to 'UUID::random_generator'
{noformat}
However, putting the static inside the function does work:
{noformat}
--- a/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp
+++ b/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp
@@ -28,7 +28,8 @@ struct UUID : boost::uuids::uuid
 public:
   static UUID random()
   {
-    return UUID(boost::uuids::random_generator()());
+    static thread_local boost::uuids::random_generator random_generator;
+    return UUID(random_generator());
   }

   static UUID fromBytes(const std::string s)
{noformat}
But I wonder whether this contradicts the "no static objects with non-trivial constructors" policy, because the lifetime of such an object is even more sophisticated than an ordinary {{static}}: there is a per-thread list of destructors to call upon thread exit, and so on.
[jira] [Commented] (MESOS-2340) Publish JSON in ZK instead of serialized MasterInfo
[ https://issues.apache.org/jira/browse/MESOS-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557118#comment-14557118 ]

Nikita Vetoshkin commented on MESOS-2340:
-----------------------------------------

[~xujyan], that sounds really great and is pretty doable, unlike introducing ZooKeeper transactions with {{multi}}.

Publish JSON in ZK instead of serialized MasterInfo
---------------------------------------------------

                Key: MESOS-2340
                URL: https://issues.apache.org/jira/browse/MESOS-2340
             Project: Mesos
          Issue Type: Improvement
            Reporter: Zameer Manji
            Assignee: haosdent

Currently to discover the master a client needs the ZK node location and access to the MasterInfo protobuf so it can deserialize the binary blob in the node.
I think it would be nice to publish JSON (like Twitter's ServerSets) so clients are not tied to protobuf to do service discovery.
[jira] [Commented] (MESOS-2340) Publish JSON in ZK instead of serialized MasterInfo
[ https://issues.apache.org/jira/browse/MESOS-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549923#comment-14549923 ]

Nikita Vetoshkin commented on MESOS-2340:
-----------------------------------------

Won't the {{multi}} operation (a.k.a. transaction) help with creating multiple nodes simultaneously?
[jira] [Comment Edited] (MESOS-2340) Publish JSON in ZK instead of serialized MasterInfo
[ https://issues.apache.org/jira/browse/MESOS-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549923#comment-14549923 ]

Nikita Vetoshkin edited comment on MESOS-2340 at 5/19/15 6:46 AM:
------------------------------------------------------------------

Won't the {{multi}} operation (a.k.a. transaction) help with simultaneous creation of multiple nodes?

was (Author: nekto0n):
Won't {{multi}} operation (a.k.a transaction) help with multiple nodes creation simultaneously?
[jira] [Commented] (MESOS-330) Add a shutdownExecutor() method to the scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381455#comment-14381455 ]

Nikita Vetoshkin commented on MESOS-330:
----------------------------------------

I wonder if a graceful shutdown timeout parameter could be added to this method's arguments, to override the one provided via the slave flag.

Add a shutdownExecutor() method to the scheduler driver
-------------------------------------------------------

                Key: MESOS-330
                URL: https://issues.apache.org/jira/browse/MESOS-330
             Project: Mesos
          Issue Type: Improvement
            Reporter: Vinod Kone
            Priority: Minor

This will let the framework control when to shut down the executor.
[jira] [Commented] (MESOS-2205) Add user documentation for reservations
[ https://issues.apache.org/jira/browse/MESOS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360103#comment-14360103 ]

Nikita Vetoshkin commented on MESOS-2205:
-----------------------------------------

Other codes that come to mind are:
* {{400 Bad Request}} for invalid arguments.
* {{412 Precondition Failed}}

I think {{409 Conflict}} should be used for something like a concurrent-update issue, when someone has already modified the item you wish to update. A Mesos example that comes to mind is an attempt to {{launchTasks}} with optimistic offers. I like the way errors are specified in gRPC; e.g. here is the [Java version|https://github.com/grpc/grpc-java/blob/master/core/src/main/java/io/grpc/Status.java#L130]. In our case we are interested in {{INVALID_ARGUMENT}}, {{FAILED_PRECONDITION}} and {{OUT_OF_RANGE}}. Anyway, HTTP codes are not strict, and which one to choose can be argued about; specifying which code was chosen for which case is a must :)

Add user documentation for reservations
---------------------------------------

                Key: MESOS-2205
                URL: https://issues.apache.org/jira/browse/MESOS-2205
             Project: Mesos
          Issue Type: Documentation
          Components: documentation, framework
            Reporter: Michael Park
            Assignee: Michael Park
              Labels: mesosphere

Add a user guide for reservations which describes their basic usage, how ACLs are used to specify who can unreserve whose resources, and a few advanced use cases.
[jira] [Commented] (MESOS-1921) Design and implement protobuf storage of IP addresses
[ https://issues.apache.org/jira/browse/MESOS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358716#comment-14358716 ]

Nikita Vetoshkin commented on MESOS-1921:
-----------------------------------------

I'd vote for string + address family enum:
* Easily represented in JSON. Using a human-readable representation won't require special handling to encode protobuf to JSON. This becomes more important with the upcoming HTTP API for frameworks/executors.
* A lot of languages do just fine with IP addresses in string format. They actually require addresses to be strings. Some of them are:
** golang
** python
** java
* The parsing-overhead argument doesn't seem strong: it doesn't seem that there are, or will be, hot loops doing {{inet_pton}}. There's a good saying: don't optimize what you haven't profiled :)

Design and implement protobuf storage of IP addresses
-----------------------------------------------------

                Key: MESOS-1921
                URL: https://issues.apache.org/jira/browse/MESOS-1921
             Project: Mesos
          Issue Type: Task
            Reporter: Dominic Hamon
            Assignee: Evelina Dumitrescu

We can use the {{bytes}} type or statements like {{repeated uint32 data = 4 [packed=true];}}. {{string}} representations might again add some parsing overhead. An additional field might be necessary to specify the protocol family type (to distinguish between IPv4/IPv6). For example, if we don't specify the family type we can't distinguish between these IP addresses in the byte/array representation: 0:0:0:0:0:0:IPv4 and IPv4 (see http://tools.ietf.org/html/rfc4291#page-10)
[jira] [Commented] (MESOS-2205) Add user documentation for reservations
[ https://issues.apache.org/jira/browse/MESOS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358684#comment-14358684 ]

Nikita Vetoshkin commented on MESOS-2205:
-----------------------------------------

Easy to read, great job!
{quote}
If the reserve operation fails, the user receives a Reserve Operation Failed HTTP response.
{quote}
{quote}
If the unreserve operation fails, the user receives a Unreserve Operation Failed HTTP response.
{quote}
Can you clarify which HTTP status(es) will be used? What happens if a framework's attempt to reserve/unreserve fails (if it can fail)?
[jira] [Commented] (MESOS-1729) LogZooKeeperTest.WriteRead fails due to SIGPIPE (escalated to SIGABRT)
[ https://issues.apache.org/jira/browse/MESOS-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112903#comment-14112903 ]

Nikita Vetoshkin commented on MESOS-1729:
-----------------------------------------

[~mcypark] is right, at least about the Linux issue. The trace (the signal is delivered synchronously, thus we have a correct traceback) suggests that the cause of the SIGPIPE is JVM code itself, and the presence of user-defined signal handlers (if I haven't misread the JVM code) makes the JVM hand signal processing to the user handler first (see e.g. [this|https://github.com/awh/openjdk7/blob/master/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp#L237]). Disabling signal handlers fixed the issue for me.
P.S. It seems that on Mac OS signals, even SIGPIPE, can be delivered asynchronously, so there's little hope of catching the right stack trace.

LogZooKeeperTest.WriteRead fails due to SIGPIPE (escalated to SIGABRT)
----------------------------------------------------------------------

                Key: MESOS-1729
                URL: https://issues.apache.org/jira/browse/MESOS-1729
             Project: Mesos
          Issue Type: Bug
          Components: build, test
    Affects Versions: 0.21.0
         Environment: OSX 10.9.4, clang 3.4. Same or very similar results on Linux
            Reporter: Till Toenshoff
            Assignee: Jie Yu
              Labels: test

The following is reported and 100% reproducible when running {{make check}} on my OSX box.
{noformat}
[ RUN ] LogZooKeeperTest.WriteRead
I0821 21:18:34.960811 2078368528 jvm.cpp:572] Looking up method init(Ljava/lang/String;)V
I0821 21:18:34.960934 2078368528 jvm.cpp:572] Looking up method deleteOnExit()V
I0821 21:18:34.961335 2078368528 jvm.cpp:572] Looking up method init(Ljava/io/File;Ljava/io/File;)V
log4j:WARN No appenders could be found for logger (org.apache.zookeeper.server.persistence.FileTxnSnapLog).
log4j:WARN Please initialize the log4j system properly.
I0821 21:18:35.004449 2078368528 jvm.cpp:572] Looking up method init()V I0821 21:18:35.005053 2078368528 jvm.cpp:572] Looking up method init(Lorg/apache/zookeeper/server/persistence/FileTxnSnapLog;Lorg/apache/zookeeper/server/ZooKeeperServer$DataTreeBuilder;)V I0821 21:18:35.025753 2078368528 jvm.cpp:572] Looking up method init()V I0821 21:18:35.032670 2078368528 jvm.cpp:572] Looking up method init(I)V I0821 21:18:35.032873 2078368528 jvm.cpp:572] Looking up method configure(Ljava/net/InetSocketAddress;I)V I0821 21:18:35.038020 2078368528 jvm.cpp:572] Looking up method startup(Lorg/apache/zookeeper/server/ZooKeeperServer;)V I0821 21:18:35.093870 2078368528 jvm.cpp:572] Looking up method getClientPort()I I0821 21:18:35.093925 2078368528 zookeeper_test_server.cpp:158] Started ZooKeeperTestServer on port 52772 I0821 21:18:35.094081 2078368528 log_tests.cpp:1945] Using temporary directory '/tmp/LogZooKeeperTest_WriteRead_F8UzYv' I0821 21:18:35.095954 2078368528 leveldb.cpp:176] Opened db in 1815us I0821 21:18:35.096392 2078368528 leveldb.cpp:183] Compacted db in 428us I0821 21:18:35.096420 2078368528 leveldb.cpp:198] Created db iterator in 7us I0821 21:18:35.096432 2078368528 leveldb.cpp:204] Seeked to beginning of db in 8us I0821 21:18:35.096442 2078368528 leveldb.cpp:273] Iterated through 0 keys in the db in 8us I0821 21:18:35.096462 2078368528 replica.cpp:741] Replica recovered with log positions 0 - 0 with 1 holes and 0 unlearned I0821 21:18:35.097043 107220992 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 184us I0821 21:18:35.097075 107220992 replica.cpp:320] Persisted replica status to VOTING I0821 21:18:35.099768 2078368528 leveldb.cpp:176] Opened db in 1673us I0821 21:18:35.100049 2078368528 leveldb.cpp:183] Compacted db in 270us I0821 21:18:35.100070 2078368528 leveldb.cpp:198] Created db iterator in 6us I0821 21:18:35.100080 2078368528 leveldb.cpp:204] Seeked to beginning of db in 5us I0821 21:18:35.100088 2078368528 leveldb.cpp:273] Iterated 
through 0 keys in the db in 5us I0821 21:18:35.100097 2078368528 replica.cpp:741] Replica recovered with log positions 0 - 0 with 1 holes and 0 unlearned I0821 21:18:35.100411 108294144 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 159us I0821 21:18:35.100435 108294144 replica.cpp:320] Persisted replica status to VOTING I0821 21:18:35.101984 2078368528 leveldb.cpp:176] Opened db in 1224us I0821 21:18:35.102934 2078368528 leveldb.cpp:183] Compacted db in 942us I0821 21:18:35.102958 2078368528 leveldb.cpp:198] Created db iterator in 8us I0821 21:18:35.102972 2078368528 leveldb.cpp:204] Seeked to beginning of db in 8us I0821 21:18:35.102984 2078368528 leveldb.cpp:273] Iterated through 1 keys in the db in 9us I0821 21:18:35.102994 2078368528 replica.cpp:741] Replica recovered with log positions 0 - 0 with 1 holes and 0 unlearned 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@712:
[jira] [Commented] (MESOS-1729) LogZooKeeperTest.WriteRead fails on OSX
[ https://issues.apache.org/jira/browse/MESOS-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107061#comment-14107061 ]

Nikita Vetoshkin commented on MESOS-1729:
-----------------------------------------

Having a similar issue on a Linux box.
{noformat}
[ RUN ] ZooKeeperMasterContenderDetectorTest.MasterContenders
2014-08-22 11:41:08,959:24379(0x7f4bb77fe700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:49138] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
2014-08-22 11:41:12,296:24379(0x7f4bb77fe700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:49138] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
2014-08-22 11:41:15,632:24379(0x7f4bb77fe700):ZOO_ERROR@handle_socket_error_msg@1697: Socket [127.0.0.1:49138] zk retcode=-4, errno=111(Connection refused): server refused to accept the client
W0822 11:41:16.446760 26535 glog.hpp:52] RAW: Received signal SIGPIPE; escalating to SIGABRT
*** Aborted at 1408725676 (unix time) try "date -d @1408725676" if you are using GNU date ***
PC: @ 0x7f4bf739d62b raise
*** SIGABRT (@0x3e85f3b) received by PID 24379 (TID 0x7f4bbc4ef700) from PID 24379; stack trace: ***
    @ 0x7f4bf739d750 (unknown)
    @ 0x7f4bf739d62b raise
    @ 0x7f4bf9aa0a47 internal::handler()
    @ 0x7f4bc359ec39 os::Linux::chained_handler()
    @ 0x7f4bc35a4a7a JVM_handle_linux_signal
    @ 0x7f4bf739d750 (unknown)
    @ 0x7f4bf739c81d __libc_write
    @ 0x7f4bbcb03882 Java_sun_nio_ch_FileDispatcherImpl_write0
    @ 0x7f4bbf73fd98 (unknown)
{noformat}
Seems like a random {{SIGPIPE}} from JVM internals, not the ZooKeeper bindings.
[jira] [Commented] (MESOS-1729) LogZooKeeperTest.WriteRead fails on OSX
[ https://issues.apache.org/jira/browse/MESOS-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107086#comment-14107086 ]

Nikita Vetoshkin commented on MESOS-1729:
-----------------------------------------

I guess I'm using Oracle JDK 1.7.0_65.
[jira] [Comment Edited] (MESOS-1729) LogZooKeeperTest.WriteRead fails on OSX
[ https://issues.apache.org/jira/browse/MESOS-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107086#comment-14107086 ]

Nikita Vetoshkin edited comment on MESOS-1729 at 8/22/14 5:02 PM:
------------------------------------------------------------------

I guess I'm using Oracle JDK 1.7.0_65 (Linux x64).

was (Author: nekto0n):
I guess I'm using Oracle JDK 1.7.0_65
[jira] [Created] (MESOS-1728) Libprocess: report bind parameters on failure
Nikita Vetoshkin created MESOS-1728:
---------------------------------------

             Summary: Libprocess: report bind parameters on failure
                 Key: MESOS-1728
                 URL: https://issues.apache.org/jira/browse/MESOS-1728
             Project: Mesos
          Issue Type: Improvement
          Components: libprocess
            Reporter: Nikita Vetoshkin
            Assignee: Nikita Vetoshkin
            Priority: Trivial

When you attempt to start a slave or master and there's another one already running there, it is nice to report the actual parameters of the {{bind}} call that failed.
[jira] [Created] (MESOS-1722) Wrong attributes separator in slave --help
Nikita Vetoshkin created MESOS-1722:
---------------------------------------

             Summary: Wrong attributes separator in slave --help
                 Key: MESOS-1722
                 URL: https://issues.apache.org/jira/browse/MESOS-1722
             Project: Mesos
          Issue Type: Bug
          Components: slave
            Reporter: Nikita Vetoshkin
            Assignee: Nikita Vetoshkin
            Priority: Trivial

{{mesos-slave --help}} says ',' should be used as the attributes separator:
{noformat}
  --attributes=VALUE     Attributes of machine, in the form:
                         rack:2 or 'rack:2,u:1'
{noformat}
But that doesn't work: according to the sources ({{src/common/attributes.cpp}}), the string is tokenized by ';'. Thus the help text should be trivially fixed.
[jira] [Comment Edited] (MESOS-1199) Subprocess is slow - gated by process::reap poll interval
[ https://issues.apache.org/jira/browse/MESOS-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14087322#comment-14087322 ] Nikita Vetoshkin edited comment on MESOS-1199 at 8/6/14 6:30 AM: - Just a quick note: polling the pid of a non-child process is racy. The process can die and an unrelated new one with the same pid can spin up between poll attempts. I wonder if we could extend the executor protocol - e.g. ask the executor to bind a specified Unix domain socket. This socket can be polled and reconnected, and the slave will receive a disconnect when the executor dies. Any thoughts? was (Author: nekto0n): Just a quick note: polling pid of non-children is a racy deal. Process can die and a new one unrelated with the same pid can spin up in between poll attempts. I wonder if we could extend executors protocol - e.g. to bind specified Unix Domain sockets. They can be polled, reconnected and slave will receive disconnect when executor dies. Any thoughts? Subprocess is slow - gated by process::reap poll interval Key: MESOS-1199 URL: https://issues.apache.org/jira/browse/MESOS-1199 Project: Mesos Issue Type: Improvement Affects Versions: 0.18.0 Reporter: Ian Downes Assignee: Craig Hansen-Sturm Attachments: wiatpid.pdf Subprocess uses process::reap to wait on the subprocess pid and set the exit status. However, process::reap polls with a one-second interval, resulting in a delay of up to the interval duration before the status future is set. This means that if you need to wait for the subprocess to complete, you get hit with E(delay) = 0.5 seconds, independent of the execution time. For example, the MesosContainerizer uses mesos-fetcher in a Subprocess to fetch the executor during launch. At Twitter we fetch a local file, i.e., a very fast operation, but the launch is blocked until the mesos-fetcher pid is reaped - adding 0 to 1 seconds to every launch!
The problem is even worse with a chain of short Subprocesses, because after the first Subprocess completes you'll be synchronized with the reap interval and you'll see nearly the full interval before notification, i.e., 10 Subprocesses each of 1 second duration will take ~10 seconds! This has become particularly apparent in some new tests I'm working on, where test durations are now greatly extended, with each test taking several seconds. -- This message was sent by Atlassian JIRA (v6.2#6252)
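The delay arithmetic in the report can be captured in a back-of-the-envelope model (assumptions: completion times land uniformly within the poll interval, and each subprocess in a chain resynchronizes with the interval, as described above):

```cpp
// Model of the process::reap polling delay discussed in MESOS-1199.
// With poll interval T, a subprocess that finishes at a uniformly random
// point within the interval waits an expected T/2 before its status
// future is set.
double expectedSingleDelaySecs(double pollIntervalSecs)
{
  return pollIntervalSecs / 2.0;
}

// A chain of N short subprocesses, each resynchronized with the reap
// interval, can accumulate close to a full interval of delay per link.
double worstCaseChainDelaySecs(int chainLength, double pollIntervalSecs)
{
  return chainLength * pollIntervalSecs;
}
```

With the one-second interval, this reproduces the numbers in the ticket: E(delay) = 0.5 seconds for a single launch, and roughly 10 seconds of added latency for a chain of 10 subprocesses.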
[jira] [Created] (MESOS-1554) Persistent resources support for storage-like services
Nikita Vetoshkin created MESOS-1554: --- Summary: Persistent resources support for storage-like services Key: MESOS-1554 URL: https://issues.apache.org/jira/browse/MESOS-1554 Project: Mesos Issue Type: Story Components: general, hadoop Reporter: Nikita Vetoshkin Priority: Minor This question came up in the [dev mailing list|http://mail-archives.apache.org/mod_mbox/mesos-dev/201406.mbox/%3CCAK8jAgNDs9Fe011Sq1jeNr0h%3DE-tDD9rak6hAsap3PqHx1y%3DKQ%40mail.gmail.com%3E]. It seems reasonable for storage-like services (e.g. HDFS or Cassandra) to use Mesos to manage their instances. But right now, if we'd like to restart an instance (e.g. to spin up a new version), all of the previous instance's sandbox filesystem resources will be recycled by the slave's garbage collector. At the moment filesystem resources can be managed out of band - i.e. instances can save their data in some database-specific place that various instances can share (e.g. {{/var/lib/cassandra}}). [~benjaminhindman] suggested an idea in the mailing list (though it still needs some fleshing out): {quote} The idea originally came about because, even today, if we allocate some file system space to a task/executor, and then that task/executor terminates, we haven't officially freed those file system resources until after we garbage collect the task/executor sandbox! (We keep the sandbox around so a user/operator can get the stdout/stderr or anything else left around from their task/executor.) To solve this problem we wanted to be able to let a task/executor terminate but not *give up* all of its resources, hence: persistent resources. Pushing this concept even further you could imagine always reallocating resources to a framework that had already been allocated those resources for a previous task/executor. Looked at from another perspective, these are late-binding, or lazy, resource reservations. At one point in time we had considered just doing 'right-of-first-refusal' for allocations after a task/executor terminates.
But this is really insufficient for supporting storage-like frameworks well (and likely even harder to reliably implement than 'persistent resources' IMHO). There are a ton of things that need to get worked out in this model, including (but not limited to): how should a file system (or disk) be exposed in order to be made persistent? How should persistent resources be returned to a master? How many persistent resources can a framework get allocated? {quote} -- This message was sent by Atlassian JIRA (v6.2#6252)