Re: [Neo4j] ArangoDB vs. Neo4j -- what's up? article of Jun 04, 2015

Frank Celler Wed, 10 Jun 2015 09:56:38 -0700

Hi Michael,

thanks for sharing your preliminary findings. I'll incorporate them into 
the benchmark suite and rerun the tests. I've seen that there is a 30-day 
trial for the enterprise edition, so I can test that as well.


Is it possible to upload the database where you changed the AGE attribute? 
Or is there an easy Cypher command to change the type?

Thanks
  Frank


On Wednesday, 10 June 2015 17:27:05 UTC+2, Michael Hunger wrote:
>
> I also did some experiments but didn't have the time to finish yet, here 
> are my observations so far:
>
> *ArangoDB Measurement*
>
> - index -> constraint `CREATE CONSTRAINT ON (p:PROFILES) ASSERT p._key IS 
> UNIQUE;`
> - seraph -> replace with node-neo4j 2.0.RC1 
>   - uses 2 year old /cypher api, doesn't send X-Stream:true header
>   - does not do efficient auth (encode creds on every call)
>   - doesn't do pooling
> - suboptimal queries
> - make sure the concurrency level is adequate for the setup (utilize all 
> cores but don't flood, use e.g. async.eachLimit)
> - warmup with nodes and rels `MATCH ()--() return count(*);`
> - enterprise with better vertical read/write scalability vs. community
> - Use 12G-24G heap, 2G new gen (-Xmn2G)
> - pagecache to 2.5G + growth (e.g. another 2.5G)
> - in 2.2 set cache_type = soft or cache_type=none depending on available 
> heap
> - fix property encoding, e.g. AGE as int not string, don't store "null" !!
>   -> affects esp. aggregate query
> - don't re-run the benchmark on the same store, start at the initial one
>   -> creating and deleting the additional PROFILES_TEMP nodes affects 
> repeatability of results
>
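
The concurrency advice above (use all cores without flooding the server) can be sketched without any library. This is only an illustration: the helper name `eachLimit` and its signature are assumptions made here, not code from the benchmark repository, and `worker` stands in for whatever function sends one request.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit`
// calls in flight at any time. `limit` is expected to be >= 1.
async function eachLimit(items, limit, worker) {
  let next = 0; // index of the next item that has not been started yet

  // Each lane pulls items one after another until none are left.
  async function lane() {
    while (next < items.length) {
      const index = next++;
      await worker(items[index], index);
    }
  }

  // Start `limit` lanes (fewer if there are fewer items) and wait for all.
  const lanes = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    lanes.push(lane());
  }
  await Promise.all(lanes);
}
```

A caller would pick `limit` to match the setup, for example a small multiple of the server's core count, and measure rather than assume that the chosen value does not flood the server.
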
> correct datatypes:
>
> * "null" should *never be stored*
> * int: public, gender, completion_percentage, AGE,
> * long/time: last_login, registration 
> * optionally as label: gender, public
>
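
One way to apply the datatype notes above is to clean each record on the client before it is loaded. The following is a minimal sketch under stated assumptions: the function names are made up for illustration, the field lists are taken from the notes above, and timestamps in the form `2012-05-25 11:20:00.0` are assumed to be UTC, which the thread does not state.

```javascript
// Field lists follow the "correct datatypes" notes; everything else is kept as-is.
const INT_FIELDS = ['public', 'gender', 'completion_percentage', 'AGE'];
const TIME_FIELDS = ['last_login', 'registration'];

// Parse "YYYY-MM-DD hh:mm:ss..." into epoch milliseconds.
// Treating the value as UTC is an assumption of this sketch.
function toEpochMillis(text) {
  const m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})/.exec(text);
  if (!m) return null;
  return Date.UTC(+m[1], +m[2] - 1, +m[3], +m[4], +m[5], +m[6]);
}

// Hypothetical helper: return a copy of `raw` with "null" values dropped,
// integer fields stored as numbers and time fields as epoch milliseconds.
function normalizeProfile(raw) {
  const out = {};
  for (const [name, value] of Object.entries(raw)) {
    // Never store "null": skip the property entirely.
    if (value === null || value === undefined || value === 'null') continue;
    if (INT_FIELDS.includes(name)) {
      const n = parseInt(value, 10);
      if (!Number.isNaN(n)) out[name] = n;
    } else if (TIME_FIELDS.includes(name)) {
      const t = toEpochMillis(String(value));
      if (t !== null) out[name] = t;
    } else {
      out[name] = value;
    }
  }
  return out;
}
```

Values that cannot be parsed are dropped rather than stored as strings, so a property ends up with a single type across all nodes.
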
>   -> test repository (WIP): with changes in *description.js and 
> benchmark.js*
>
> https://github.com/jexp/nosql-tests/tree/node-neo4j
>
> queries for neo4j-shell:
>
> export from="P/P1"
> export to="P/P277"
>
> export key="P/P1"
>
> // warmup
> MATCH ()--() return count(*);
> // 61.245.128 rows
>
> MATCH (s:PROFILES) return count(*);
> // 1.632.803 profiles
> // 1.15 s
>
> profile
>
> MATCH (s:PROFILES {_key:{key}})-[*1..2]->(n:PROFILES) RETURN DISTINCT 
> n._key;
> // 295 rows 5 ms
>
>
> // 1st degree neighbours
> MATCH (:PROFILES {_key:{key}})-->(n) RETURN n._key;
> // 14 rows 1ms 
>
> // 2nd degree neighbours
> MATCH (s:PROFILES {_key:{key}})-->(x)
> MATCH (x)-->(n:PROFILES)
> RETURN DISTINCT n._key;
> // 283 rows 6 ms
>
> // shortest path
> MATCH (s:PROFILES {_key:{from}}),(t:PROFILES {_key:{to}}), 
> p = shortestPath((s)-[*..15]->(t)) RETURN [x in nodes(p) | x._key] as path;
> // 1 ms; return only the keys rather than the full data, as for the other DBs
>
> // aggregation
> MATCH (f:PROFILES) RETURN f.AGE, count(*);
> // 22 s -> should be more like 1.5 s
>
> // single read
> MATCH (f:PROFILES) WHERE f._key = {key} RETURN f;
> // or
> MATCH (s:PROFILES {_key:{key}}) RETURN s;
> // 1 row with 59 properties 1 ms
>
> // single writes
> CREATE (s:PROFILES_TEMP {data}) RETURN id(s);
>
> // delete all nodes with a certain label
> // loop until returns 0
> MATCH (n:PROFILES_TEMP) WITH n LIMIT 5000 OPTIONAL MATCH (n)-[r]-() DELETE 
> n,r RETURN count(*) as deleted
> ----
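
The "loop until returns 0" note for the batched delete above could be driven from the benchmark client roughly as follows. This is a sketch only: `runQuery` is a hypothetical function supplied by the caller that sends one Cypher statement and resolves to the `deleted` value of the single result row; it is not an API of node-neo4j or seraph.

```javascript
// The batched delete statement from the query list above.
const DELETE_BATCH =
  'MATCH (n:PROFILES_TEMP) WITH n LIMIT 5000 ' +
  'OPTIONAL MATCH (n)-[r]-() DELETE n,r RETURN count(*) as deleted';

// Hypothetical helper: repeat the batch until the server reports 0.
// Resolves to the sum of the reported counts. count(*) counts result rows,
// so a node with several relationships contributes more than once.
async function deleteAllTemp(runQuery) {
  let total = 0;
  for (;;) {
    const deleted = await runQuery(DELETE_BATCH);
    if (deleted === 0) return total;
    total += deleted;
  }
}
```

Deleting in batches keeps each transaction small; the loop ends on the first batch that finds no remaining PROFILES_TEMP nodes.
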
>
> MATCH (s:PROFILES {_key:{key}})-[*1..2]->(n:PROFILES) WITH DISTINCT n._key 
> as key RETURN count(*);
> // 295 count 5-6ms
>
> MATCH (f:PROFILES) return id(f) % 140, count(*);
> // 140 rows -> 1502 ms that's how it should be
>
> sample data:
>
> _key:"P/P1",
> public:"1",
> completion_percentage:"14",
> gender:"1",
> region:"zilinsky kraj, zilina",
> last_login:"2012-05-25 11:20:00.0",
> registration:"2005-04-03 00:00:00.0",
> AGE:26,
> body:"185 cm, 90 kg",
> I_am_working_in_field:"it",
> spoken_languages:"anglicky",
> hobbies:"sportovanie, spanie, kino, jedlo, pocuvanie hudby, priatelia, 
> divadlo",
> I_most_enjoy_good_food:"v dobrej restauracii",
> pets:"mam psa",
> body_type:"null",
> my_eyesight:"null",
> eye_color:"null",
> hair_color:"null",
> hair_type:"null",
> completed_level_of_education:"null",
> favourite_color:"null",
> relation_to_smoking:"null",
> relation_to_alcohol:"null",
> sign_in_zodiac:"null",
> on_pokec_i_am_looking_for:"null",
> love_is_for_me:"null",
> relation_to_casual_sex:"null",
> my_partner_should_be:"null",
> marital_status:"null",
> children:"null",
> relation_to_children:"null",
> I_like_movies:"null",
> I_like_watching_movie:"null",
> I_like_music:"null",
> I_mostly_like_listening_to_music:"null",
> the_idea_of_good_evening:"null",
> I_like_specialties_from_kitchen:"null",
> fun:"null",
> I_am_going_to_concerts:"null",
> my_active_sports:"null",
> my_passive_sports:"null",
> profession:"null",
> I_like_books:"null",
> life_style:"null",
> music:"null",
> cars:"null",
> politics:"null",
> relationships:"null",
> art_culture:"null",
> hobbies_interests:"null",
> science_technologies:"null",
> computers_internet:"null",
> education:"null",
> sport:"null",
> movies:"null",
> travelling:"null",
> health:"null",
> companies_brands:"null",
> more:"null"
>
>
> neo4j-server.properties:
> org.neo4j.server.database.location=/Users/mh/support/arangodb/db/data
> org.neo4j.server.webserver.port=8474
> dbms.security.auth_enabled=false
>
>
> neo4j-wrapper.conf:
> wrapper.java.initmemory=8000
> wrapper.java.maxmemory=8000
> wrapper.java.additional=-Xmn2G
>
> neo4j.properties:
> dbms.pagecache.memory=5G
> keep_logical_logs=false
> remote_shell_enabled=false
> cache_type=soft
> online_backup_enabled=false
>
>

