[jira] [Commented] (MESOS-3046) Stout's UUID re-seeds a new random generator during each call to UUID::random.

2015-09-18 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14805172#comment-14805172
 ] 

Nikita Vetoshkin commented on MESOS-3046:
-

Seems like a clang + libstdc++ issue:

{quote}
thread_local support currently requires the C++ runtime library from g++-4.8 or 
later.
{quote}

from http://clang.llvm.org/cxx_status.html.

This seems to be a frequent issue; there is a corresponding 
[fix|https://github.com/textmate/textmate/commit/172ce9d4282e408fe60b699c432390b9f6e3f74a]
 for TextMate.
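
For reference, a common workaround pattern (a minimal sketch assuming plain 
{{__thread}} is available - this is not the actual TextMate or Mesos patch) is 
to avoid C++11 {{thread_local}} for types with non-trivial constructors and 
keep a {{__thread}} pointer instead:
{noformat}
#include <boost/uuid/uuid_generators.hpp>

// `__thread` only supports trivially-constructed types, so store a
// plain pointer and heap-allocate the generator on first use in each
// thread. Note: nothing destroys the generator on thread exit, so this
// leaks one object per thread unless a destructor is registered.
static boost::uuids::random_generator& perThreadGenerator()
{
  static __thread boost::uuids::random_generator* generator = nullptr;
  if (generator == nullptr) {
    generator = new boost::uuids::random_generator();
  }
  return *generator;
}
{noformat}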

> Stout's UUID re-seeds a new random generator during each call to UUID::random.
> --
>
> Key: MESOS-3046
> URL: https://issues.apache.org/jira/browse/MESOS-3046
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Benjamin Mahler
>Assignee: Klaus Ma
>  Labels: newbie, twitter
> Fix For: 0.25.0
>
> Attachments: tl.cpp
>
>
> Per [~StephanErb] and [~kevints]'s observations on MESOS-2940, stout's UUID 
> abstraction is re-seeding the random generator during each call to 
> {{UUID::random()}}, which is really expensive.
> This is confirmed in the perf graph from MESOS-2940.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-3046) Stout's UUID re-seeds a new random generator during each call to UUID::random.

2015-07-30 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647282#comment-14647282
 ] 

Nikita Vetoshkin commented on MESOS-3046:
-

Out of curiosity, I took a look at this one.
In MESOS-2940 it was suggested to put the generator in a {{thread_local}}, 
like this:
{noformat}
diff --git a/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp b/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp
index e8ebe0b..b0facb2 100644
--- a/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp
+++ b/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp
@@ -28,7 +28,7 @@ struct UUID : boost::uuids::uuid
 public:
   static UUID random()
   {
-    return UUID(boost::uuids::random_generator()());
+    return UUID(random_generator());
   }
 
   static UUID fromBytes(const std::string s)
@@ -62,6 +62,7 @@ public:
 private:
   explicit UUID(const boost::uuids::uuid uuid)
     : boost::uuids::uuid(uuid) {}
+  static thread_local boost::uuids::random_generator random_generator;
 };
 
 #endif // __STOUT_UUID_HPP__
{noformat}
This fails with GCC 5.1.1 on Fedora x64:
{noformat}
./.libs/libmesos.so: error: undefined reference to 'TLS init function for 
UUID::random_generator'
./.libs/libmesos.so: error: undefined reference to 'UUID::random_generator'
{noformat}
However, putting the {{static}} inside the function does work:
{noformat}
--- a/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp
+++ b/3rdparty/libprocess/3rdparty/stout/include/stout/uuid.hpp
@@ -28,7 +28,8 @@ struct UUID : boost::uuids::uuid
 public:
   static UUID random()
   {
-    return UUID(boost::uuids::random_generator()());
+    static thread_local boost::uuids::random_generator random_generator;
+    return UUID(random_generator());
   }
 
   static UUID fromBytes(const std::string s)
{noformat}
But I wonder whether this contradicts the no-static-objects-with-non-trivial-
constructors policy, because the lifetime of such an object is even more 
sophisticated than that of an ordinary {{static}}: there is a per-thread list 
of destructors to call upon thread exit, and so on.
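
For reference, a self-contained sketch of the difference (my own example, not 
the attached {{tl.cpp}}): constructing {{boost::uuids::random_generator}} 
seeds from the OS entropy source, so hoisting it out of the per-call path is 
the entire win.
{noformat}
#include <boost/uuid/uuid.hpp>
#include <boost/uuid/uuid_generators.hpp>

// Slow: a fresh generator - and a fresh seed - on every call.
boost::uuids::uuid slow()
{
  return boost::uuids::random_generator()();
}

// Fast: one generator per thread, seeded once on first use.
boost::uuids::uuid fast()
{
  static thread_local boost::uuids::random_generator generator;
  return generator();
}
{noformat}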

 Stout's UUID re-seeds a new random generator during each call to UUID::random.
 --

 Key: MESOS-3046
 URL: https://issues.apache.org/jira/browse/MESOS-3046
 Project: Mesos
  Issue Type: Bug
  Components: stout
Reporter: Benjamin Mahler
  Labels: newbie, twitter

 Per [~StephanErb] and [~kevints]'s observations on MESOS-2940, stout's UUID 
 abstraction is re-seeding the random generator during each call to 
 {{UUID::random()}}, which is really expensive.
 This is confirmed in the perf graph from MESOS-2940.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2340) Publish JSON in ZK instead of serialized MasterInfo

2015-05-22 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557118#comment-14557118
 ] 

Nikita Vetoshkin commented on MESOS-2340:
-

[~xujyan], that sounds really great and pretty doable, unlike introducing 
ZooKeeper transactions with {{multi}}.
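
For illustration, the JSON node might look something like this (field names 
follow the {{MasterInfo}} protobuf; the values and the exact schema here are 
assumptions on my part):
{noformat}
{
  "id": "20150522-143507-16777343-5050-12345",
  "ip": 16777343,
  "port": 5050,
  "pid": "master@127.0.0.1:5050",
  "hostname": "master1.example.com",
  "version": "0.23.0"
}
{noformat}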

 Publish JSON in ZK instead of serialized MasterInfo
 ---

 Key: MESOS-2340
 URL: https://issues.apache.org/jira/browse/MESOS-2340
 Project: Mesos
  Issue Type: Improvement
Reporter: Zameer Manji
Assignee: haosdent

 Currently to discover the master a client needs the ZK node location and 
 access to the MasterInfo protobuf so it can deserialize the binary blob in 
 the node.
 I think it would be nice to publish JSON (like Twitter's ServerSets) so 
 clients are not tied to protobuf to do service discovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2340) Publish JSON in ZK instead of serialized MasterInfo

2015-05-19 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549923#comment-14549923
 ] 

Nikita Vetoshkin commented on MESOS-2340:
-

Won't the {{multi}} operation (a.k.a. transaction) help with creating multiple 
nodes simultaneously?
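
For concreteness, a minimal sketch with the ZooKeeper C client (>= 3.4; the 
paths and payloads are illustrative): both znodes are created atomically, so a 
client never observes one without the other.
{noformat}
#include <string.h>
#include <zookeeper/zookeeper.h>

// Create the protobuf node and the JSON node in a single multi op:
// either both appear or neither does.
int create_both(zhandle_t* zh, const char* pb, const char* js)
{
  zoo_op_t ops[2];
  zoo_op_result_t results[2];
  char buf1[128], buf2[128];

  zoo_create_op_init(&ops[0], "/mesos/info", pb, (int) strlen(pb),
                     &ZOO_OPEN_ACL_UNSAFE, ZOO_EPHEMERAL,
                     buf1, sizeof(buf1));
  zoo_create_op_init(&ops[1], "/mesos/json.info", js, (int) strlen(js),
                     &ZOO_OPEN_ACL_UNSAFE, ZOO_EPHEMERAL,
                     buf2, sizeof(buf2));

  return zoo_multi(zh, 2, ops, results);  // ZOK on success
}
{noformat}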

 Publish JSON in ZK instead of serialized MasterInfo
 ---

 Key: MESOS-2340
 URL: https://issues.apache.org/jira/browse/MESOS-2340
 Project: Mesos
  Issue Type: Improvement
Reporter: Zameer Manji
Assignee: haosdent

 Currently to discover the master a client needs the ZK node location and 
 access to the MasterInfo protobuf so it can deserialize the binary blob in 
 the node.
 I think it would be nice to publish JSON (like Twitter's ServerSets) so 
 clients are not tied to protobuf to do service discovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (MESOS-2340) Publish JSON in ZK instead of serialized MasterInfo

2015-05-19 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14549923#comment-14549923
 ] 

Nikita Vetoshkin edited comment on MESOS-2340 at 5/19/15 6:46 AM:
--

Won't the {{multi}} operation (a.k.a. transaction) help with the simultaneous 
creation of multiple nodes?


was (Author: nekto0n):
Won't the {{multi}} operation (a.k.a. transaction) help with creating multiple 
nodes simultaneously?

 Publish JSON in ZK instead of serialized MasterInfo
 ---

 Key: MESOS-2340
 URL: https://issues.apache.org/jira/browse/MESOS-2340
 Project: Mesos
  Issue Type: Improvement
Reporter: Zameer Manji
Assignee: haosdent

 Currently to discover the master a client needs the ZK node location and 
 access to the MasterInfo protobuf so it can deserialize the binary blob in 
 the node.
 I think it would be nice to publish JSON (like Twitter's ServerSets) so 
 clients are not tied to protobuf to do service discovery.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-330) Add a shutdownExecutor() method to the scheduler driver

2015-03-26 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14381455#comment-14381455
 ] 

Nikita Vetoshkin commented on MESOS-330:


I wonder if a graceful shutdown timeout parameter could be added to this 
method's arguments, to override the one provided via the slave flag.
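
Something along these lines, perhaps (a hypothetical signature, not an 
agreed-upon API):
{noformat}
// Hypothetical: an optional per-call grace period overriding the
// slave's --executor_shutdown_grace_period flag.
virtual Status shutdownExecutor(
    const ExecutorID& executorId,
    const SlaveID& slaveId,
    const Option<Duration>& gracePeriod = None()) = 0;
{noformat}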

 Add a shutdownExecutor() method to the scheduler driver
 ---

 Key: MESOS-330
 URL: https://issues.apache.org/jira/browse/MESOS-330
 Project: Mesos
  Issue Type: Improvement
Reporter: Vinod Kone
Priority: Minor

 This will let the framework control when to shutdown the executor.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2205) Add user documentation for reservations

2015-03-13 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360103#comment-14360103
 ] 

Nikita Vetoshkin commented on MESOS-2205:
-

Other codes that come to mind are:
  * {{400 Bad Request}} for invalid arguments.
  * {{412 Precondition Failed}}

I think that {{409 Conflict}} should be used for something like a concurrent 
update issue, when someone has already modified the item you wish to update. A 
Mesos example that comes to mind is an attempt to {{launchTasks}} with 
optimistic offers.
I like the way errors are specified in gRPC, e.g. here is the [Java 
version|https://github.com/grpc/grpc-java/blob/master/core/src/main/java/io/grpc/Status.java#L130].
 In our case we are interested in {{INVALID_ARGUMENT}}, {{FAILED_PRECONDITION}} 
and {{OUT_OF_RANGE}}.

Anyway, HTTP codes are not strict, and one can argue about which to choose. 
Documenting which code was chosen for which case is a must :)
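
To make that concrete, one possible mapping (an illustration of the suggestion 
above, not a decided convention):
{noformat}
400 Bad Request          malformed or invalid arguments in the request
409 Conflict             concurrent modification, e.g. launchTasks on an
                         offer that was already used
412 Precondition Failed  a stated precondition no longer holds
{noformat}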

 Add user documentation for reservations
 ---

 Key: MESOS-2205
 URL: https://issues.apache.org/jira/browse/MESOS-2205
 Project: Mesos
  Issue Type: Documentation
  Components: documentation, framework
Reporter: Michael Park
Assignee: Michael Park
  Labels: mesosphere

 Add a user guide for reservations which describes basic usage of them, how 
 ACLs are used to specify who can unreserve whose resources, and few advanced 
 usage cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-1921) Design and implement protobuf storage of IP addresses

2015-03-12 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358716#comment-14358716
 ] 

Nikita Vetoshkin commented on MESOS-1921:
-

I'd vote for a string + an address family enum (sketch after this list):
  * Easily represented in JSON.
  Using a human-readable representation won't require special handling to 
encode protobuf to JSON. This becomes more important with the upcoming HTTP 
API for frameworks/executors.
  * A lot of languages do just fine with IP addresses in string format; they 
actually require addresses to be strings.
  Some of them are:
 ** golang
 ** python
 ** java
  * The parsing overhead doesn't seem like a strong argument.
  It doesn't seem that there are, or will be, hot loops doing {{inet_pton}}.
  There's a good saying: don't optimize what you haven't profiled :)
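
A sketch of the representation suggested above (illustrative message and field 
names, not a final schema):
{noformat}
message IPAddress {
  enum Family {
    IPV4 = 1;
    IPV6 = 2;
  }
  required Family family = 1;
  required string address = 2;  // e.g. "192.0.2.1" or "2001:db8::1"
}
{noformat}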


 Design and implement protobuf storage of IP addresses
 -

 Key: MESOS-1921
 URL: https://issues.apache.org/jira/browse/MESOS-1921
 Project: Mesos
  Issue Type: Task
Reporter: Dominic Hamon
Assignee: Evelina Dumitrescu

 We can use {{bytes}} type or statements like {{repeated uint32 data = 
 4[packed=true];}}
 {{string}} representations might add again some parsing overhead. An 
 additional field might be necessary to specify the protocol family type 
 (distinguish between IPv4/IPv6). For example, if we don't specify the family 
 type we can't distinguish between these IP addresses in the case of 
 byte/array representation: 0:0:0:0:0:0:IPV4 and IPv4 (see 
 http://tools.ietf.org/html/rfc4291#page-10)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2205) Add user documentation for reservations

2015-03-12 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358684#comment-14358684
 ] 

Nikita Vetoshkin commented on MESOS-2205:
-

Easy to read, great job!
{quote}
If the reserve operation fails, the user receives a Reserve Operation Failed 
HTTP response.
{quote}
{quote}
If the unreserve operation fails, the user receives a Unreserve Operation 
Failed HTTP response.
{quote}
Can you clarify which HTTP status(es) will be used?

What happens if a framework's attempt to reserve/unreserve fails (if it can 
fail)?

 Add user documentation for reservations
 ---

 Key: MESOS-2205
 URL: https://issues.apache.org/jira/browse/MESOS-2205
 Project: Mesos
  Issue Type: Documentation
  Components: documentation, framework
Reporter: Michael Park
Assignee: Michael Park
  Labels: mesosphere

 Add a user guide for reservations which describes basic usage of them, how 
 ACLs are used to specify who can unreserve whose resources, and few advanced 
 usage cases.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-1729) LogZooKeeperTest.WriteRead fails due to SIGPIPE (escalated to SIGABRT)

2014-08-27 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14112903#comment-14112903
 ] 

Nikita Vetoshkin commented on MESOS-1729:
-

[~mcypark] is right, at least about the Linux issue. The trace (the signal is 
delivered synchronously, so we have a correct traceback) suggests that the 
cause of the SIGPIPE is JVM code itself, and the presence of user-defined 
signal handlers (if I haven't misread the JVM code) makes the JVM hand signal 
processing to the user handler first (see e.g. 
[this|https://github.com/awh/openjdk7/blob/master/hotspot/src/os_cpu/linux_x86/vm/os_linux_x86.cpp#L237]).

Disabling signal handlers fixed the issue for me.

P.S. It seems that on Mac OS signals, even SIGPIPE, can be delivered 
asynchronously, so there's little hope of catching the right stack trace.
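
For reference, a minimal sketch of one option (an assumption on my part, not 
necessarily the fix that lands in Mesos): ignore SIGPIPE process-wide, so 
writes to a closed socket fail with EPIPE instead of raising a signal that the 
embedded JVM's handlers may escalate.
{noformat}
#include <signal.h>

int main(int argc, char** argv)
{
  // Writes to closed sockets/pipes now return -1 with errno == EPIPE
  // instead of delivering SIGPIPE.
  signal(SIGPIPE, SIG_IGN);

  // ... run the tests ...
  return 0;
}
{noformat}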

 LogZooKeeperTest.WriteRead fails due to SIGPIPE (escalated to SIGABRT)
 --

 Key: MESOS-1729
 URL: https://issues.apache.org/jira/browse/MESOS-1729
 Project: Mesos
  Issue Type: Bug
  Components: build, test
Affects Versions: 0.21.0
 Environment: OSX 10.9.4, clang 3.4.
 Same or very similar results on Linux
Reporter: Till Toenshoff
Assignee: Jie Yu
  Labels: test

 The following is reported and 100% reproducible when running {{make check}} 
 on my OSX box.
 {noformat}
 [ RUN  ] LogZooKeeperTest.WriteRead
 I0821 21:18:34.960811 2078368528 jvm.cpp:572] Looking up method 
 init(Ljava/lang/String;)V
 I0821 21:18:34.960934 2078368528 jvm.cpp:572] Looking up method 
 deleteOnExit()V
 I0821 21:18:34.961335 2078368528 jvm.cpp:572] Looking up method 
 init(Ljava/io/File;Ljava/io/File;)V
 log4j:WARN No appenders could be found for logger 
 (org.apache.zookeeper.server.persistence.FileTxnSnapLog).
 log4j:WARN Please initialize the log4j system properly.
 I0821 21:18:35.004449 2078368528 jvm.cpp:572] Looking up method init()V
 I0821 21:18:35.005053 2078368528 jvm.cpp:572] Looking up method 
 init(Lorg/apache/zookeeper/server/persistence/FileTxnSnapLog;Lorg/apache/zookeeper/server/ZooKeeperServer$DataTreeBuilder;)V
 I0821 21:18:35.025753 2078368528 jvm.cpp:572] Looking up method init()V
 I0821 21:18:35.032670 2078368528 jvm.cpp:572] Looking up method init(I)V
 I0821 21:18:35.032873 2078368528 jvm.cpp:572] Looking up method 
 configure(Ljava/net/InetSocketAddress;I)V
 I0821 21:18:35.038020 2078368528 jvm.cpp:572] Looking up method 
 startup(Lorg/apache/zookeeper/server/ZooKeeperServer;)V
 I0821 21:18:35.093870 2078368528 jvm.cpp:572] Looking up method 
 getClientPort()I
 I0821 21:18:35.093925 2078368528 zookeeper_test_server.cpp:158] Started 
 ZooKeeperTestServer on port 52772
 I0821 21:18:35.094081 2078368528 log_tests.cpp:1945] Using temporary 
 directory '/tmp/LogZooKeeperTest_WriteRead_F8UzYv'
 I0821 21:18:35.095954 2078368528 leveldb.cpp:176] Opened db in 1815us
 I0821 21:18:35.096392 2078368528 leveldb.cpp:183] Compacted db in 428us
 I0821 21:18:35.096420 2078368528 leveldb.cpp:198] Created db iterator in 7us
 I0821 21:18:35.096432 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 8us
 I0821 21:18:35.096442 2078368528 leveldb.cpp:273] Iterated through 0 keys in 
 the db in 8us
 I0821 21:18:35.096462 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 I0821 21:18:35.097043 107220992 leveldb.cpp:306] Persisting metadata (8 
 bytes) to leveldb took 184us
 I0821 21:18:35.097075 107220992 replica.cpp:320] Persisted replica status to 
 VOTING
 I0821 21:18:35.099768 2078368528 leveldb.cpp:176] Opened db in 1673us
 I0821 21:18:35.100049 2078368528 leveldb.cpp:183] Compacted db in 270us
 I0821 21:18:35.100070 2078368528 leveldb.cpp:198] Created db iterator in 6us
 I0821 21:18:35.100080 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 5us
 I0821 21:18:35.100088 2078368528 leveldb.cpp:273] Iterated through 0 keys in 
 the db in 5us
 I0821 21:18:35.100097 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 I0821 21:18:35.100411 108294144 leveldb.cpp:306] Persisting metadata (8 
 bytes) to leveldb took 159us
 I0821 21:18:35.100435 108294144 replica.cpp:320] Persisted replica status to 
 VOTING
 I0821 21:18:35.101984 2078368528 leveldb.cpp:176] Opened db in 1224us
 I0821 21:18:35.102934 2078368528 leveldb.cpp:183] Compacted db in 942us
 I0821 21:18:35.102958 2078368528 leveldb.cpp:198] Created db iterator in 8us
 I0821 21:18:35.102972 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 8us
 I0821 21:18:35.102984 2078368528 leveldb.cpp:273] Iterated through 1 keys in 
 the db in 9us
 I0821 21:18:35.102994 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@712: 

[jira] [Commented] (MESOS-1729) LogZooKeeperTest.WriteRead fails on OSX

2014-08-22 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107061#comment-14107061
 ] 

Nikita Vetoshkin commented on MESOS-1729:
-

I'm having a similar issue on a Linux box.
{noformat}
[ RUN  ] ZooKeeperMasterContenderDetectorTest.MasterContenders
2014-08-22 
11:41:08,959:24379(0x7f4bb77fe700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:49138] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2014-08-22 
11:41:12,296:24379(0x7f4bb77fe700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:49138] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
2014-08-22 
11:41:15,632:24379(0x7f4bb77fe700):ZOO_ERROR@handle_socket_error_msg@1697: 
Socket [127.0.0.1:49138] zk retcode=-4, errno=111(Connection refused): server 
refused to accept the client
W0822 11:41:16.446760 26535 glog.hpp:52] RAW: Received signal SIGPIPE; 
escalating to SIGABRT
*** Aborted at 1408725676 (unix time) try "date -d @1408725676" if you are 
using GNU date ***
PC: @ 0x7f4bf739d62b raise
*** SIGABRT (@0x3e85f3b) received by PID 24379 (TID 0x7f4bbc4ef700) from 
PID 24379; stack trace: ***
@ 0x7f4bf739d750 (unknown)
@ 0x7f4bf739d62b raise
@ 0x7f4bf9aa0a47 internal::handler()
@ 0x7f4bc359ec39 os::Linux::chained_handler()
@ 0x7f4bc35a4a7a JVM_handle_linux_signal
@ 0x7f4bf739d750 (unknown)
@ 0x7f4bf739c81d __libc_write
@ 0x7f4bbcb03882 Java_sun_nio_ch_FileDispatcherImpl_write0
@ 0x7f4bbf73fd98 (unknown)
{noformat}
Seems like it's a random {{SIGPIPE}} from JVM internals, not the ZooKeeper 
bindings.

 LogZooKeeperTest.WriteRead fails on OSX
 ---

 Key: MESOS-1729
 URL: https://issues.apache.org/jira/browse/MESOS-1729
 Project: Mesos
  Issue Type: Bug
  Components: build, test
Affects Versions: 0.21.0
 Environment: OSX 10.9.4, clang 3.4
Reporter: Till Toenshoff
  Labels: test

 The following is reported and 100% reproducible when running {{make check}} 
 on my OSX box.
 {noformat}
 [ RUN  ] LogZooKeeperTest.WriteRead
 I0821 21:18:34.960811 2078368528 jvm.cpp:572] Looking up method 
 init(Ljava/lang/String;)V
 I0821 21:18:34.960934 2078368528 jvm.cpp:572] Looking up method 
 deleteOnExit()V
 I0821 21:18:34.961335 2078368528 jvm.cpp:572] Looking up method 
 init(Ljava/io/File;Ljava/io/File;)V
 log4j:WARN No appenders could be found for logger 
 (org.apache.zookeeper.server.persistence.FileTxnSnapLog).
 log4j:WARN Please initialize the log4j system properly.
 I0821 21:18:35.004449 2078368528 jvm.cpp:572] Looking up method init()V
 I0821 21:18:35.005053 2078368528 jvm.cpp:572] Looking up method 
 init(Lorg/apache/zookeeper/server/persistence/FileTxnSnapLog;Lorg/apache/zookeeper/server/ZooKeeperServer$DataTreeBuilder;)V
 I0821 21:18:35.025753 2078368528 jvm.cpp:572] Looking up method init()V
 I0821 21:18:35.032670 2078368528 jvm.cpp:572] Looking up method init(I)V
 I0821 21:18:35.032873 2078368528 jvm.cpp:572] Looking up method 
 configure(Ljava/net/InetSocketAddress;I)V
 I0821 21:18:35.038020 2078368528 jvm.cpp:572] Looking up method 
 startup(Lorg/apache/zookeeper/server/ZooKeeperServer;)V
 I0821 21:18:35.093870 2078368528 jvm.cpp:572] Looking up method 
 getClientPort()I
 I0821 21:18:35.093925 2078368528 zookeeper_test_server.cpp:158] Started 
 ZooKeeperTestServer on port 52772
 I0821 21:18:35.094081 2078368528 log_tests.cpp:1945] Using temporary 
 directory '/tmp/LogZooKeeperTest_WriteRead_F8UzYv'
 I0821 21:18:35.095954 2078368528 leveldb.cpp:176] Opened db in 1815us
 I0821 21:18:35.096392 2078368528 leveldb.cpp:183] Compacted db in 428us
 I0821 21:18:35.096420 2078368528 leveldb.cpp:198] Created db iterator in 7us
 I0821 21:18:35.096432 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 8us
 I0821 21:18:35.096442 2078368528 leveldb.cpp:273] Iterated through 0 keys in 
 the db in 8us
 I0821 21:18:35.096462 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 I0821 21:18:35.097043 107220992 leveldb.cpp:306] Persisting metadata (8 
 bytes) to leveldb took 184us
 I0821 21:18:35.097075 107220992 replica.cpp:320] Persisted replica status to 
 VOTING
 I0821 21:18:35.099768 2078368528 leveldb.cpp:176] Opened db in 1673us
 I0821 21:18:35.100049 2078368528 leveldb.cpp:183] Compacted db in 270us
 I0821 21:18:35.100070 2078368528 leveldb.cpp:198] Created db iterator in 6us
 I0821 21:18:35.100080 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 5us
 I0821 21:18:35.100088 2078368528 leveldb.cpp:273] Iterated through 0 keys in 
 the db in 5us
 I0821 21:18:35.100097 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 I0821 

[jira] [Commented] (MESOS-1729) LogZooKeeperTest.WriteRead fails on OSX

2014-08-22 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107086#comment-14107086
 ] 

Nikita Vetoshkin commented on MESOS-1729:
-

I guess I'm using Oracle JDK 1.7.0_65

 LogZooKeeperTest.WriteRead fails on OSX
 ---

 Key: MESOS-1729
 URL: https://issues.apache.org/jira/browse/MESOS-1729
 Project: Mesos
  Issue Type: Bug
  Components: build, test
Affects Versions: 0.21.0
 Environment: OSX 10.9.4, clang 3.4
Reporter: Till Toenshoff
  Labels: test

 The following is reported and 100% reproducible when running {{make check}} 
 on my OSX box.
 {noformat}
 [ RUN  ] LogZooKeeperTest.WriteRead
 I0821 21:18:34.960811 2078368528 jvm.cpp:572] Looking up method 
 init(Ljava/lang/String;)V
 I0821 21:18:34.960934 2078368528 jvm.cpp:572] Looking up method 
 deleteOnExit()V
 I0821 21:18:34.961335 2078368528 jvm.cpp:572] Looking up method 
 init(Ljava/io/File;Ljava/io/File;)V
 log4j:WARN No appenders could be found for logger 
 (org.apache.zookeeper.server.persistence.FileTxnSnapLog).
 log4j:WARN Please initialize the log4j system properly.
 I0821 21:18:35.004449 2078368528 jvm.cpp:572] Looking up method init()V
 I0821 21:18:35.005053 2078368528 jvm.cpp:572] Looking up method 
 init(Lorg/apache/zookeeper/server/persistence/FileTxnSnapLog;Lorg/apache/zookeeper/server/ZooKeeperServer$DataTreeBuilder;)V
 I0821 21:18:35.025753 2078368528 jvm.cpp:572] Looking up method init()V
 I0821 21:18:35.032670 2078368528 jvm.cpp:572] Looking up method init(I)V
 I0821 21:18:35.032873 2078368528 jvm.cpp:572] Looking up method 
 configure(Ljava/net/InetSocketAddress;I)V
 I0821 21:18:35.038020 2078368528 jvm.cpp:572] Looking up method 
 startup(Lorg/apache/zookeeper/server/ZooKeeperServer;)V
 I0821 21:18:35.093870 2078368528 jvm.cpp:572] Looking up method 
 getClientPort()I
 I0821 21:18:35.093925 2078368528 zookeeper_test_server.cpp:158] Started 
 ZooKeeperTestServer on port 52772
 I0821 21:18:35.094081 2078368528 log_tests.cpp:1945] Using temporary 
 directory '/tmp/LogZooKeeperTest_WriteRead_F8UzYv'
 I0821 21:18:35.095954 2078368528 leveldb.cpp:176] Opened db in 1815us
 I0821 21:18:35.096392 2078368528 leveldb.cpp:183] Compacted db in 428us
 I0821 21:18:35.096420 2078368528 leveldb.cpp:198] Created db iterator in 7us
 I0821 21:18:35.096432 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 8us
 I0821 21:18:35.096442 2078368528 leveldb.cpp:273] Iterated through 0 keys in 
 the db in 8us
 I0821 21:18:35.096462 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 I0821 21:18:35.097043 107220992 leveldb.cpp:306] Persisting metadata (8 
 bytes) to leveldb took 184us
 I0821 21:18:35.097075 107220992 replica.cpp:320] Persisted replica status to 
 VOTING
 I0821 21:18:35.099768 2078368528 leveldb.cpp:176] Opened db in 1673us
 I0821 21:18:35.100049 2078368528 leveldb.cpp:183] Compacted db in 270us
 I0821 21:18:35.100070 2078368528 leveldb.cpp:198] Created db iterator in 6us
 I0821 21:18:35.100080 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 5us
 I0821 21:18:35.100088 2078368528 leveldb.cpp:273] Iterated through 0 keys in 
 the db in 5us
 I0821 21:18:35.100097 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 I0821 21:18:35.100411 108294144 leveldb.cpp:306] Persisting metadata (8 
 bytes) to leveldb took 159us
 I0821 21:18:35.100435 108294144 replica.cpp:320] Persisted replica status to 
 VOTING
 I0821 21:18:35.101984 2078368528 leveldb.cpp:176] Opened db in 1224us
 I0821 21:18:35.102934 2078368528 leveldb.cpp:183] Compacted db in 942us
 I0821 21:18:35.102958 2078368528 leveldb.cpp:198] Created db iterator in 8us
 I0821 21:18:35.102972 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 8us
 I0821 21:18:35.102984 2078368528 leveldb.cpp:273] Iterated through 1 keys in 
 the db in 9us
 I0821 21:18:35.102994 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@712: Client 
 environment:zookeeper.version=zookeeper C client 3.4.5
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@716: Client 
 environment:host.name=lobomacpro2.fritz.box
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@723: Client 
 environment:os.name=Darwin
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@724: Client 
 environment:os.arch=13.3.0
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@725: Client 
 environment:os.version=Darwin Kernel Version 13.3.0: Tue Jun  3 21:27:35 PDT 
 2014; root:xnu-2422.110.17~1/RELEASE_X86_64
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@733: Client 
 environment:user.name=till
 2014-08-21 

[jira] [Comment Edited] (MESOS-1729) LogZooKeeperTest.WriteRead fails on OSX

2014-08-22 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14107086#comment-14107086
 ] 

Nikita Vetoshkin edited comment on MESOS-1729 at 8/22/14 5:02 PM:
--

I guess I'm using Oracle JDK 1.7.0_65 (Linux x64)


was (Author: nekto0n):
I guess I'm using Oracle JDK 1.7.0_65

 LogZooKeeperTest.WriteRead fails on OSX
 ---

 Key: MESOS-1729
 URL: https://issues.apache.org/jira/browse/MESOS-1729
 Project: Mesos
  Issue Type: Bug
  Components: build, test
Affects Versions: 0.21.0
 Environment: OSX 10.9.4, clang 3.4
Reporter: Till Toenshoff
  Labels: test

 The following is reported and 100% reproducible when running {{make check}} 
 on my OSX box.
 {noformat}
 [ RUN  ] LogZooKeeperTest.WriteRead
 I0821 21:18:34.960811 2078368528 jvm.cpp:572] Looking up method 
 init(Ljava/lang/String;)V
 I0821 21:18:34.960934 2078368528 jvm.cpp:572] Looking up method 
 deleteOnExit()V
 I0821 21:18:34.961335 2078368528 jvm.cpp:572] Looking up method 
 init(Ljava/io/File;Ljava/io/File;)V
 log4j:WARN No appenders could be found for logger 
 (org.apache.zookeeper.server.persistence.FileTxnSnapLog).
 log4j:WARN Please initialize the log4j system properly.
 I0821 21:18:35.004449 2078368528 jvm.cpp:572] Looking up method init()V
 I0821 21:18:35.005053 2078368528 jvm.cpp:572] Looking up method 
 init(Lorg/apache/zookeeper/server/persistence/FileTxnSnapLog;Lorg/apache/zookeeper/server/ZooKeeperServer$DataTreeBuilder;)V
 I0821 21:18:35.025753 2078368528 jvm.cpp:572] Looking up method init()V
 I0821 21:18:35.032670 2078368528 jvm.cpp:572] Looking up method init(I)V
 I0821 21:18:35.032873 2078368528 jvm.cpp:572] Looking up method 
 configure(Ljava/net/InetSocketAddress;I)V
 I0821 21:18:35.038020 2078368528 jvm.cpp:572] Looking up method 
 startup(Lorg/apache/zookeeper/server/ZooKeeperServer;)V
 I0821 21:18:35.093870 2078368528 jvm.cpp:572] Looking up method 
 getClientPort()I
 I0821 21:18:35.093925 2078368528 zookeeper_test_server.cpp:158] Started 
 ZooKeeperTestServer on port 52772
 I0821 21:18:35.094081 2078368528 log_tests.cpp:1945] Using temporary 
 directory '/tmp/LogZooKeeperTest_WriteRead_F8UzYv'
 I0821 21:18:35.095954 2078368528 leveldb.cpp:176] Opened db in 1815us
 I0821 21:18:35.096392 2078368528 leveldb.cpp:183] Compacted db in 428us
 I0821 21:18:35.096420 2078368528 leveldb.cpp:198] Created db iterator in 7us
 I0821 21:18:35.096432 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 8us
 I0821 21:18:35.096442 2078368528 leveldb.cpp:273] Iterated through 0 keys in 
 the db in 8us
 I0821 21:18:35.096462 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 I0821 21:18:35.097043 107220992 leveldb.cpp:306] Persisting metadata (8 
 bytes) to leveldb took 184us
 I0821 21:18:35.097075 107220992 replica.cpp:320] Persisted replica status to 
 VOTING
 I0821 21:18:35.099768 2078368528 leveldb.cpp:176] Opened db in 1673us
 I0821 21:18:35.100049 2078368528 leveldb.cpp:183] Compacted db in 270us
 I0821 21:18:35.100070 2078368528 leveldb.cpp:198] Created db iterator in 6us
 I0821 21:18:35.100080 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 5us
 I0821 21:18:35.100088 2078368528 leveldb.cpp:273] Iterated through 0 keys in 
 the db in 5us
 I0821 21:18:35.100097 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 I0821 21:18:35.100411 108294144 leveldb.cpp:306] Persisting metadata (8 
 bytes) to leveldb took 159us
 I0821 21:18:35.100435 108294144 replica.cpp:320] Persisted replica status to 
 VOTING
 I0821 21:18:35.101984 2078368528 leveldb.cpp:176] Opened db in 1224us
 I0821 21:18:35.102934 2078368528 leveldb.cpp:183] Compacted db in 942us
 I0821 21:18:35.102958 2078368528 leveldb.cpp:198] Created db iterator in 8us
 I0821 21:18:35.102972 2078368528 leveldb.cpp:204] Seeked to beginning of db 
 in 8us
 I0821 21:18:35.102984 2078368528 leveldb.cpp:273] Iterated through 1 keys in 
 the db in 9us
 I0821 21:18:35.102994 2078368528 replica.cpp:741] Replica recovered with log 
 positions 0 - 0 with 1 holes and 0 unlearned
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@712: Client 
 environment:zookeeper.version=zookeeper C client 3.4.5
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@716: Client 
 environment:host.name=lobomacpro2.fritz.box
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@723: Client 
 environment:os.name=Darwin
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@724: Client 
 environment:os.arch=13.3.0
 2014-08-21 21:18:35,103:6420(0x106641000):ZOO_INFO@log_env@725: Client 
 environment:os.version=Darwin Kernel Version 13.3.0: Tue Jun  3 21:27:35 PDT 
 2014; root:xnu-2422.110.17~1/RELEASE_X86_64
 

[jira] [Created] (MESOS-1728) Libprocess: report bind parameters on failure

2014-08-21 Thread Nikita Vetoshkin (JIRA)
Nikita Vetoshkin created MESOS-1728:
---

 Summary: Libprocess: report bind parameters on failure
 Key: MESOS-1728
 URL: https://issues.apache.org/jira/browse/MESOS-1728
 Project: Mesos
  Issue Type: Improvement
  Components: libprocess
Reporter: Nikita Vetoshkin
Assignee: Nikita Vetoshkin
Priority: Trivial


When you attempt to start a slave or master and there's another one already 
running there, it is nice to report the actual parameters of the {{bind}} 
call that failed.
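
A minimal sketch of the improvement (not the actual libprocess patch; 
{{PLOG}} is glog's variant that appends the {{errno}} description):
{noformat}
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <glog/logging.h>

// Include the address and port in the failure message so that
// "Address already in use" is actionable.
void bindOrDie(int s, const struct sockaddr_in& addr)
{
  if (bind(s, (const struct sockaddr*) &addr, sizeof(addr)) < 0) {
    PLOG(FATAL) << "Failed to bind on " << inet_ntoa(addr.sin_addr)
                << ":" << ntohs(addr.sin_port);
  }
}
{noformat}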



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MESOS-1722) Wrong attributes separator in slave --help

2014-08-19 Thread Nikita Vetoshkin (JIRA)
Nikita Vetoshkin created MESOS-1722:
---

 Summary: Wrong attributes separator in slave --help
 Key: MESOS-1722
 URL: https://issues.apache.org/jira/browse/MESOS-1722
 Project: Mesos
  Issue Type: Bug
  Components: slave
Reporter: Nikita Vetoshkin
Assignee: Nikita Vetoshkin
Priority: Trivial


{{mesos-slave --help}} says ',' should be used as the attributes separator:
{noformat}
  --attributes=VALUE Attributes of machine, in the form:
 rack:2 or 'rack:2,u:1'

{noformat}
But that doesn't work: according to the sources ({{src/common/attributes.cpp}}), 
the string is tokenized by ';'. Thus the help text can be trivially fixed.
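
For the record, the form that actually works - ';' between attributes, ':' 
between a key and its value:
{noformat}
mesos-slave --attributes='rack:2;u:1' ...
{noformat}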



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (MESOS-1199) Subprocess is slow - gated by process::reap poll interval

2014-08-06 Thread Nikita Vetoshkin (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14087322#comment-14087322
 ] 

Nikita Vetoshkin edited comment on MESOS-1199 at 8/6/14 6:30 AM:
-

Just a quick note: polling the pid of a non-child is a racy deal. The process 
can die, and a new, unrelated one with the same pid can spin up in between 
poll attempts.
I wonder if we could extend the executor protocol - e.g. ask the executor to 
bind a specified Unix domain socket. This socket can be polled and 
reconnected, and the slave will receive a disconnect when the executor dies. 
Any thoughts?


was (Author: nekto0n):
Just a quick note: polling the pid of a non-child is a racy deal. The process 
can die, and a new, unrelated one with the same pid can spin up in between 
poll attempts.
I wonder if we could extend the executor protocol - e.g. to bind specified 
Unix domain sockets. They can be polled and reconnected, and the slave will 
receive a disconnect when the executor dies. Any thoughts?
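
A minimal sketch of the socket-based liveness idea suggested above (an 
assumed design, not existing Mesos code):
{noformat}
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

// The slave keeps one end of a socketpair(); the executor inherits the
// other end across fork()/exec(). When the executor process exits, the
// kernel closes its end and poll() wakes immediately - no polling
// interval and no pid-reuse race.
int waitForExecutorExit(int fds[2])
{
  close(fds[1]);  // the parent keeps only its own end
  struct pollfd pfd = {fds[0], POLLIN, 0};
  return poll(&pfd, 1, -1);  // wakes with POLLIN/POLLHUP on peer exit
}
{noformat}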

 Subprocess is slow - gated by process::reap poll interval
 

 Key: MESOS-1199
 URL: https://issues.apache.org/jira/browse/MESOS-1199
 Project: Mesos
  Issue Type: Improvement
Affects Versions: 0.18.0
Reporter: Ian Downes
Assignee: Craig Hansen-Sturm
 Attachments: wiatpid.pdf


 Subprocess uses process::reap to wait on the subprocess pid and set the exit 
 status. However, process::reap polls with a one second interval resulting in 
 a delay up to the interval duration before the status future is set.
 This means if you need to wait for the subprocess to complete you get hit 
 with E(delay) = 0.5 seconds, independent of the execution time. For example, 
 the MesosContainerizer uses mesos-fetcher in a Subprocess to fetch the 
 executor during launch. At Twitter we fetch a local file, i.e., a very fast 
 operation, but the launch is blocked until the mesos-fetcher pid is reaped - 
 adding 0 to 1 seconds for every launch!
 The problem is even worse with a chain of short Subprocesses because after 
 the first Subprocess completes you'll be synchronized with the reap interval 
 and you'll see nearly the full interval before notification, i.e., 10 
 Subprocesses each of  1 second duration with take ~10 seconds!
 This has become particularly apparent in some new tests I'm working on where 
 test durations are now greatly extended with each taking several seconds.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (MESOS-1554) Persistent resources support for storage-like services

2014-06-29 Thread Nikita Vetoshkin (JIRA)
Nikita Vetoshkin created MESOS-1554:
---

 Summary: Persistent resources support for storage-like services
 Key: MESOS-1554
 URL: https://issues.apache.org/jira/browse/MESOS-1554
 Project: Mesos
  Issue Type: Story
  Components: general, hadoop
Reporter: Nikita Vetoshkin
Priority: Minor


This question came up in [dev mailing 
list|http://mail-archives.apache.org/mod_mbox/mesos-dev/201406.mbox/%3CCAK8jAgNDs9Fe011Sq1jeNr0h%3DE-tDD9rak6hAsap3PqHx1y%3DKQ%40mail.gmail.com%3E].
It seems reasonable for storage-like services (e.g. HDFS or Cassandra) to use 
Mesos to manage their instances. But right now, if we'd like to restart an 
instance (e.g. to spin up a new version), all of the previous instance's 
sandbox filesystem resources will be recycled by the slave's garbage 
collector.

At the moment filesystem resources can be managed out of band - i.e. instances 
can save their data in some database-specific place that various instances 
can share (e.g. {{/var/lib/cassandra}}).

[~benjaminhindman] suggested an idea in the mailing list (though it still needs 
some fleshing out):
{quote}
The idea originally came about because, even today, if we allocate some
file system space to a task/executor, and then that task/executor
terminates, we haven't officially freed those file system resources until
after we garbage collect the task/executor sandbox! (We keep the sandbox
around so a user/operator can get the stdout/stderr or anything else left
around from their task/executor.)

To solve this problem we wanted to be able to let a task/executor terminate
but not *give up* all of its resources, hence: persistent resources.

Pushing this concept even further you could imagine always reallocating
resources to a framework that had already been allocated those resources
for a previous task/executor. Looked at from another perspective, these are
late-binding, or lazy, resource reservations.

At one point in time we had considered just doing 'right-of-first-refusal'
for allocations after a task/executor terminates. But this is really
insufficient for supporting storage-like frameworks well (and likely even
harder to reliably implement than 'persistent resources' IMHO).

There are a ton of things that need to get worked out in this model,
including (but not limited to), how should a file system (or disk) be
exposed in order to be made persistent? How should persistent resources be
returned to a master? How many persistent resources can a framework get
allocated?
{quote}



--
This message was sent by Atlassian JIRA
(v6.2#6252)