Hi Kaveh,

No worries, I'm glad I can help tracking down those issues.

I think we're almost there ;-)
All 3 agents and the first primary DBServer now start correctly, but I'm 
not able to start an additional primary DBServer or Coordinator:

Second DBServer:

bart@laptop ~/test2/arangodb $ ./build/bin/arangod 
--server.authentication=false --server.endpoint tcp://0.0.0.0:8530 
--cluster.my-address tcp://127.0.0.1:8530 --cluster.my-local-info db2 
--cluster.my-role PRIMARY --cluster.agency-endpoint tcp://127.0.0.1:5001 
primary2
2016-06-28T09:25:02Z [2076] INFO using SSL options: 
SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2016-06-28T09:25:02Z [2076] FATAL unable to determine unambiguous role for 
server ''. No role configured in agency (http+tcp://127.0.0.1:5001, 
http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001)

Coordinator:

bart@laptop ~/test2/arangodb $ ./build/bin/arangod 
--server.authentication=false --server.endpoint tcp://0.0.0.0:8531 
--cluster.my-address tcp://127.0.0.1:8531 --cluster.my-local-info coord1 
--cluster.my-role COORDINATOR --cluster.agency-endpoint 
tcp://127.0.0.1:5001 coordinator
2016-06-28T09:27:53Z [2276] INFO using SSL options: 
SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2016-06-28T09:27:53Z [2276] FATAL unable to determine unambiguous role for 
server ''. No role configured in agency (http+tcp://127.0.0.1:5001, 
http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001)

Bart


On Tuesday, June 28, 2016 at 10:55:48 AM UTC+2, Kaveh Vahedipour wrote:
>
> Hi again, 
>
> I think we’re there. I’ve tested in Linux Mint in a VM and everything 
> looks really happy. 
>
> So, cd into the directory where you cloned arangodb from github: 
>
> cd <arangodb-dir> 
> git pull 
> cd build 
> make 
>
> And you will have the build with the fixes. No worries, unless you have 
> not deleted the build directory, it’s gonna be a considerably shorted build 
> time compared to last night. 
>
> I also don’t know how much experience you have with building projects of 
> this size. But if you have a couple of processors and don’t need the box 
> you are building on for any other purposes for the compile time, you might 
> want to consider to replace the last of the above lines by 'make -j<ncpu>', 
> where <ncpu> is the number of hyperthreaded cores you have available. I 
> generally build with make -j8 for example. 
>
> Anyway, many thanks for hanging in there. Contributions like yours advance 
> the development considerably. 
>
> Keep us posted, please. 
>
> Kind regards, 
> Kaveh. 
>
> > On 28 Jun 2016, at 09:38, Bart DS <[email protected] <javascript:>> 
> wrote: 
> > 
> > Ok, I did some tests and the agents are running fine now. 
> > At least, initially... 
> > 
> > When I start my first server via the following command: 
> > 
> > ./build/bin/arangod --server.authentication=false --server.endpoint 
> tcp://0.0.0.0:8529 --cluster.my-address tcp://127.0.0.1:8529 
> --cluster.my-local-info db1 --cluster.my-role PRIMARY 
> --cluster.agency-endpoint tcp://127.0.0.1:5001 primary1 
> > 
> > I get following output: 
> > 
> > bart@laptop ~/test/arangodb $ ./build/bin/arangod 
> --server.authentication=false --server.endpoint tcp://0.0.0.0:8529 
> --cluster.my-address tcp://127.0.0.1:8529 --cluster.my-local-info db1 
> --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://127.0.0.1:5001 
> primary1 
> > 2016-06-28T07:06:38Z [24113] INFO using SSL options: 
> SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG 
> > 2016-06-28T07:06:38Z [24113] INFO file-descriptors (nofiles) hard limit 
> is 4096, soft limit is 1024 
> > 2016-06-28T07:06:38Z [24113] INFO created database directory 'primary1'. 
> > 2016-06-28T07:06:38Z [24113] INFO WAL directory 'primary1/journals' does 
> not exist. creating it... 
> > 2016-06-28T07:06:38Z [24113] INFO ArangoDB 3.0.x-devel [linux] 64bit, 
> using VPack 0.1.30, ICU 54.1, V8 5.0.71.39, OpenSSL 1.0.1f 6 Jan 2014 
> > 2016-06-28T07:06:38Z [24113] INFO loaded database '_system' from 
> 'primary1/databases/database-1' 
> > 2016-06-28T07:06:38Z [24113] INFO the server has 4 (hyper) cores, using 
> 1 scheduler thread(s), 4 dispatcher thread(s) 
> > 2016-06-28T07:06:39Z [24113] INFO JavaScript using startup './js', 
> application './js/apps' 
> > 2016-06-28T07:06:39Z [24113] INFO changing state of PRIMARY server from 
> UNDEFINED to STARTUP 
> > 2016-06-28T07:06:39Z [24113] INFO Cluster feature is turned on. Agency 
> version: , Agency endpoints: http+tcp://127.0.0.1:5002, http+tcp://
> 127.0.0.1:5002, http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5001, 
> http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, server id: 
> 'DBServer001', internal address: tcp://127.0.0.1:8529, role: PRIMARY 
> > 2016-06-28T07:06:39Z [24113] INFO using heartbeat interval value '1000 
> ms' from agency 
> > 2016-06-28T07:06:39Z [24113] INFO changing state of PRIMARY server from 
> STARTUP to SERVING 
> > 2016-06-28T07:06:39Z [24113] INFO In database '_system': No version 
> information file found in database directory. 
> > 2016-06-28T07:06:39Z [24113] INFO In database '_system': Database is 
> up-to-date (30000/cluster-local/init) 
> > 2016-06-28T07:06:40Z [24113] INFO using endpoint 'http+tcp://
> 0.0.0.0:8529' for non-encrypted requests 
> > 2016-06-28T07:06:40Z [24113] INFO Authentication is turned off 
> > 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001 
> > 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001 
> > 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001 
> > 2016-06-28T07:06:40Z [24113] INFO In database '_system': Database is 
> up-to-date (-/db-server-local/init) 
> > 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001 
> > 2016-06-28T07:06:40Z [24113] INFO ArangoDB (version 3.0.x-devel [linux]) 
> is ready for business. Have fun! 
> > 2016-06-28T07:06:41Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:06:51Z [24113] ERROR error details: 
> {"code":307,"errorNum":0,"errorMessage":"Temporary Redirect (Temporary 
> Redirect)","error":true} 
> > 2016-06-28T07:06:51Z [24113] ERROR error stack: ArangoError: Temporary 
> Redirect (Temporary Redirect)\n    at Error (native)\n    at writeLocked 
> (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:1663:41)\n 
>    at createLocalDatabases 
> (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:348:9)\n   
>  at handleDatabaseChanges 
> (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:461:3)\n   
>  at handleChanges 
> (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:1460:3)\n 
>    at handlePlanChange 
> (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:1675:24) 
> > 2016-06-28T07:06:51Z [24113] ERROR plan change handling failed 
> > 2016-06-28T07:06:52Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:06:53Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:06:54Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:06:55Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:06:56Z [24113] WARNING {heartbeat} heartbeat could not be 
> sent to agency endpoints (http+tcp://127.0.0.1:5002, http+tcp://
> 127.0.0.1:5002, http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5003, 
> http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5002, http+tcp://
> 127.0.0.1:5001, http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001): 
> http code: 307, body: 
> > 2016-06-28T07:06:56Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:06:57Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:06:58Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:06:59Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:00Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:01Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:02Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:03Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:04Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:05Z [24113] WARNING {heartbeat} heartbeat could not be 
> sent to agency endpoints (http+tcp://127.0.0.1:5002, http+tcp://
> 127.0.0.1:5003, http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5003, 
> http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5002, http+tcp://
> 127.0.0.1:5001, http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001): 
> http code: 307, body: 
> > 2016-06-28T07:07:05Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:06Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:07Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:09Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:10Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:13Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:13Z [24113] ERROR {heartbeat} Could not read 
> Current/Version from agency. 
> > 2016-06-28T07:07:28Z [24113] INFO plan change handling successful 
> > 
> > 
> > And the agents start using all cpu resources until the system becomes 
> almost unresponsive. 
> > 
> > At that time, the agents are logging following messages: 
> > 
> > 2016-06-28T07:06:52Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:06:52Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:06:54Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:06:57Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:06:58Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:06:58Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:07:02Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:07:02Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:07:10Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:07:13Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:07:25Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:07:30Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:07:30Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:07:33Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:07:43Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:08:24Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:08:34Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:08:54Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:08:55Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:09:14Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:09:25Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 
> > 
> > When I stop the primary server again via ctrl-c, after some time the cpu 
> usage starts to decrease but the memory is increasing rapidly, finally 
> taking up all system memory and swap, in the end freezing the whole system. 
> > At this point the agents start to log following messages: 
> > 
> > 2016-06-28T07:12:34Z [23478] WARNING {queries} slow query: 'FOR s in 
> @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', 
> took 26.345204 
> > 2016-06-28T07:12:43Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:13:05Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:13:10Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:13:20Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:13:23Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:13:39Z [23997] WARNING {agency} I have a higher term than 
> RPC caller. 
> > 2016-06-28T07:15:23Z [23478] WARNING {queries} slow query: 'FOR s in 
> @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', 
> took 22.091859 
> > 2016-06-28T07:21:06Z [23478] WARNING {queries} slow query: 'FOR s in 
> @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', 
> took 17.283082 
> > 2016-06-28T07:22:26Z [23478] WARNING {queries} slow query: 'FOR s in 
> @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', 
> took 19.252867 
> > 2016-06-28T07:22:26Z [23478] WARNING {queries} slow query: 'FOR s in 
> @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', 
> took 40.744055 
> > [...] 
> > 
> > 
> > So the initial issue with the agents seems to be resolved, but it's 
> definitely still not working correctly 
> > 
> > Bart 
> > 
> > 
> > 
> > -- 
> > You received this message because you are subscribed to a topic in the 
> Google Groups "ArangoDB" group. 
> > To unsubscribe from this topic, visit 
> https://groups.google.com/d/topic/arangodb/p30FN_TXp30/unsubscribe. 
> > To unsubscribe from this group and all its topics, send an email to 
> [email protected] <javascript:>. 
> > For more options, visit https://groups.google.com/d/optout. 
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"ArangoDB" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/d/optout.

Reply via email to