Hi Kaveh,

No worries, I'm glad I can help track down those issues.
I think we're almost there ;-)

All 3 agents and the first primary DBServer now start correctly, but I'm not able to start an additional primary DBServer or Coordinator:

Second DBServer:

bart@laptop ~/test2/arangodb $ ./build/bin/arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8530 --cluster.my-address tcp://127.0.0.1:8530 --cluster.my-local-info db2 --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://127.0.0.1:5001 primary2
2016-06-28T09:25:02Z [2076] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2016-06-28T09:25:02Z [2076] FATAL unable to determine unambiguous role for server ''. No role configured in agency (http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001)

Coordinator:

bart@laptop ~/test2/arangodb $ ./build/bin/arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8531 --cluster.my-address tcp://127.0.0.1:8531 --cluster.my-local-info coord1 --cluster.my-role COORDINATOR --cluster.agency-endpoint tcp://127.0.0.1:5001 coordinator
2016-06-28T09:27:53Z [2276] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
2016-06-28T09:27:53Z [2276] FATAL unable to determine unambiguous role for server ''. No role configured in agency (http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001)

Bart

On Tuesday, June 28, 2016 at 10:55:48 AM UTC+2, Kaveh Vahedipour wrote:
>
> Hi again,
>
> I think we’re there. I’ve tested in Linux Mint in a VM and everything looks really happy.
>
> So, cd into the directory where you cloned arangodb from github:
>
> cd <arangodb-dir>
> git pull
> cd build
> make
>
> And you will have the build with the fixes. No worries, as long as you have not deleted the build directory, it’s gonna be a considerably shorter build time compared to last night.
>
> I also don’t know how much experience you have with building projects of this size.
> But if you have a couple of processors and don’t need the box you are building on for any other purposes for the duration of the compile, you might want to consider replacing the last of the above lines with 'make -j<ncpu>', where <ncpu> is the number of hyperthreaded cores you have available. I generally build with make -j8 for example.
>
> Anyway, many thanks for hanging in there. Contributions like yours advance the development considerably.
>
> Keep us posted, please.
>
> Kind regards,
> Kaveh.
>
>
> On 28 Jun 2016, at 09:38, Bart DS <[email protected]> wrote:
> >
> > Ok, I did some tests and the agents are running fine now.
> > At least, initially...
> >
> > When I start my first server via the following command:
> >
> > ./build/bin/arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8529 --cluster.my-address tcp://127.0.0.1:8529 --cluster.my-local-info db1 --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://127.0.0.1:5001 primary1
> >
> > I get the following output:
> >
> > bart@laptop ~/test/arangodb $ ./build/bin/arangod --server.authentication=false --server.endpoint tcp://0.0.0.0:8529 --cluster.my-address tcp://127.0.0.1:8529 --cluster.my-local-info db1 --cluster.my-role PRIMARY --cluster.agency-endpoint tcp://127.0.0.1:5001 primary1
> > 2016-06-28T07:06:38Z [24113] INFO using SSL options: SSL_OP_CIPHER_SERVER_PREFERENCE, SSL_OP_TLS_ROLLBACK_BUG
> > 2016-06-28T07:06:38Z [24113] INFO file-descriptors (nofiles) hard limit is 4096, soft limit is 1024
> > 2016-06-28T07:06:38Z [24113] INFO created database directory 'primary1'.
> > 2016-06-28T07:06:38Z [24113] INFO WAL directory 'primary1/journals' does not exist. creating it...
> > 2016-06-28T07:06:38Z [24113] INFO ArangoDB 3.0.x-devel [linux] 64bit, using VPack 0.1.30, ICU 54.1, V8 5.0.71.39, OpenSSL 1.0.1f 6 Jan 2014
> > 2016-06-28T07:06:38Z [24113] INFO loaded database '_system' from 'primary1/databases/database-1'
> > 2016-06-28T07:06:38Z [24113] INFO the server has 4 (hyper) cores, using 1 scheduler thread(s), 4 dispatcher thread(s)
> > 2016-06-28T07:06:39Z [24113] INFO JavaScript using startup './js', application './js/apps'
> > 2016-06-28T07:06:39Z [24113] INFO changing state of PRIMARY server from UNDEFINED to STARTUP
> > 2016-06-28T07:06:39Z [24113] INFO Cluster feature is turned on. Agency version: , Agency endpoints: http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, server id: 'DBServer001', internal address: tcp://127.0.0.1:8529, role: PRIMARY
> > 2016-06-28T07:06:39Z [24113] INFO using heartbeat interval value '1000 ms' from agency
> > 2016-06-28T07:06:39Z [24113] INFO changing state of PRIMARY server from STARTUP to SERVING
> > 2016-06-28T07:06:39Z [24113] INFO In database '_system': No version information file found in database directory.
> > 2016-06-28T07:06:39Z [24113] INFO In database '_system': Database is up-to-date (30000/cluster-local/init)
> > 2016-06-28T07:06:40Z [24113] INFO using endpoint 'http+tcp://0.0.0.0:8529' for non-encrypted requests
> > 2016-06-28T07:06:40Z [24113] INFO Authentication is turned off
> > 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001
> > 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001
> > 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001
> > 2016-06-28T07:06:40Z [24113] INFO In database '_system': Database is up-to-date (-/db-server-local/init)
> > 2016-06-28T07:06:40Z [24113] INFO bootstraped DB server DBServer001
> > 2016-06-28T07:06:40Z [24113] INFO ArangoDB (version 3.0.x-devel [linux]) is ready for business. Have fun!
> > 2016-06-28T07:06:41Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:06:51Z [24113] ERROR error details: {"code":307,"errorNum":0,"errorMessage":"Temporary Redirect (Temporary Redirect)","error":true}
> > 2016-06-28T07:06:51Z [24113] ERROR error stack: ArangoError: Temporary Redirect (Temporary Redirect)\n at Error (native)\n at writeLocked (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:1663:41)\n at createLocalDatabases (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:348:9)\n at handleDatabaseChanges (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:461:3)\n at handleChanges (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:1460:3)\n at handlePlanChange (/home/bart/test/arangodb/js/server/modules/@arangodb/cluster.js:1675:24)
> > 2016-06-28T07:06:51Z [24113] ERROR plan change handling failed
> > 2016-06-28T07:06:52Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:06:53Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:06:54Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:06:55Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:06:56Z [24113] WARNING {heartbeat} heartbeat could not be sent to agency endpoints (http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001): http code: 307, body:
> > 2016-06-28T07:06:56Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:06:57Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:06:58Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:06:59Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:00Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:01Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:02Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:03Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:04Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:05Z [24113] WARNING {heartbeat} heartbeat could not be sent to agency endpoints (http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5003, http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5002, http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001, http+tcp://127.0.0.1:5001): http code: 307, body:
> > 2016-06-28T07:07:05Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:06Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:07Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:09Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:10Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:13Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:13Z [24113] ERROR {heartbeat} Could not read Current/Version from agency.
> > 2016-06-28T07:07:28Z [24113] INFO plan change handling successful
> >
> >
> > And the agents start using all cpu resources until the system becomes almost unresponsive.
> >
> > At that time, the agents are logging the following messages:
> >
> > 2016-06-28T07:06:52Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:06:52Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:06:54Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:06:57Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:06:58Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:06:58Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:07:02Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:07:02Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:07:10Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:07:13Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:07:25Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:07:30Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:07:30Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:07:33Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:07:43Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:08:24Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:08:34Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:08:54Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:08:55Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:09:14Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:09:25Z [23997] WARNING {agency} I have a higher term than RPC caller.
> >
> >
> > When I stop the primary server again via ctrl-c, after some time the cpu usage starts to decrease but the memory is increasing rapidly, finally taking up all system memory and swap, in the end freezing the whole system.
> > At this point the agents start to log the following messages:
> >
> > 2016-06-28T07:12:34Z [23478] WARNING {queries} slow query: 'FOR s in @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', took 26.345204
> > 2016-06-28T07:12:43Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:13:05Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:13:10Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:13:20Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:13:23Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:13:39Z [23997] WARNING {agency} I have a higher term than RPC caller.
> > 2016-06-28T07:15:23Z [23478] WARNING {queries} slow query: 'FOR s in @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', took 22.091859
> > 2016-06-28T07:21:06Z [23478] WARNING {queries} slow query: 'FOR s in @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', took 17.283082
> > 2016-06-28T07:22:26Z [23478] WARNING {queries} slow query: 'FOR s in @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', took 19.252867
> > 2016-06-28T07:22:26Z [23478] WARNING {queries} slow query: 'FOR s in @@collection FILTER s.time >= @start SORT s.time desc LIMIT 1 RETURN s', took 40.744055
> > [...]
> >
> >
> > So the initial issue with the agents seems to be resolved, but it's definitely still not working correctly
> >
> > Bart
> >
> > --
> > You received this message because you are subscribed to a topic in the Google Groups "ArangoDB" group.
> > To unsubscribe from this topic, visit https://groups.google.com/d/topic/arangodb/p30FN_TXp30/unsubscribe.
> > To unsubscribe from this group and all its topics, send an email to [email protected].
> > For more options, visit https://groups.google.com/d/optout.
