I am using ArangoDB to build a database for several GB of data. I have imported two document collections, say A and B, and am now importing the edges from A to B (_from, _to) with arangoimp. The import is driven by a shell script that loops over the files to be imported. Every file is smaller than 11 MB. The shell script looks like this:

for filename in /path/<folder_name>/*.csv; do
    arangoimp --server.endpoint tcp://**********/ \
        --server.username root --server.password ****** \
        --file "$filename" --type csv \
        --server.database TestDB --on-duplicate ignore \
        --collection <collectionname> > "${filename}_log"
    rm "${filename}_log"
done

*The Problem:*

The import speed is very low, frankly under 1 Mbps. I am running it on a cluster of three servers: one is set up as the coordinator and the other two are DB servers. I checked the logs and found a warning. Here is the log entry for reference:

WARNING {queries} slow query: 'FOR x IN @@collection LET att = APPEND(SLICE(ATTRIBUTES(x), 0, 25), '_key', true) LIMIT @offset, @count RETURN KEEP(x, att)', took: 14.240174

1. For the document collections I use my own keys (so that I can create edges easily with _key).
2. I have multiple edges between specific pairs of vertices. To be precise, if v1 is a document in A and v2 is a document in B, there may be 5 edges between these two documents (vertices).
3. I made sure the self-generated keys contain only the recommended characters.
4. I use self-generated keys in all cases: collections A and B and the edges.

Please help me improve the speed, or point out what is wrong with my approach.
