Re: [Neo4j] ArangoDB vs. Neo4j -- what's up? article of Jun 04, 2015

jsc Fri, 12 Jun 2015 01:19:01 -0700

Hi Michael,

In general, should one use coalesce with a default value (empty string, 0, 
0.0, etc.) for string/int/float properties so as to avoid saving null?


On Wednesday, June 10, 2015 at 11:27:05 AM UTC-4, Michael Hunger wrote:
>
> I also did some experiments but didn't have the time to finish yet, here 
> are my observations so far:
>
> *ArangoDB Measurement*
>
> - index -> constraint `CREATE CONSTRAINT ON (p:PROFILES) ASSERT p._key IS 
> UNIQUE;`
> - seraph -> replace with node-neo4j 2.0.RC1 
>   - uses 2 year old /cypher api, doesn't send X-Stream:true header
>   - does not do efficient auth (encode creds on every call)
>   - doesn't do pooling
> - suboptimal queries
> - make sure the concurrency level is adequate for the setup (utilize all 
> cores but don't flood, use e.g. async.eachLimit)
> - warmup with nodes and rels `MATCH ()--() return count(*);`
> - enterprise with better vertical read/write scalability vs. community
> - Use 12G-24G heap, 2G new gen (-Xmn2G)
> - pagecache to 2.5G + growth (e.g. another 2.5G)
> - in 2.2 set cache_type = soft or cache_type=none depending on available 
> heap
> - fix property encoding, e.g. AGE as int not string, don't store "null" !!
>   -> affects esp. aggregate query
> - don't re-run the benchmark on the same store, start at the initial one
>   -> creating and deleting the additional PROFILES_TEMP nodes affects 
> repeatability of results
>
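
To make the concurrency point above concrete, here is a minimal sketch of a limited-concurrency loop in plain Node.js. It is not taken from the benchmark repository; `eachLimit` here is a hypothetical helper that only mimics the idea behind the async library's `eachLimit`, and the limit to use would have to be found by measuring on the actual machine.

```javascript
// Run `worker(item)` for every item with at most `limit` calls in flight.
// Hypothetical helper written for illustration, not the async library itself.
async function eachLimit(items, limit, worker) {
  let next = 0; // index of the next item to hand out

  // Each lane pulls items one at a time until none are left.
  async function lane() {
    while (next < items.length) {
      const item = items[next++];
      await worker(item);
    }
  }

  const lanes = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    lanes.push(lane());
  }
  // Rejects as soon as any worker rejects; other lanes keep draining items.
  await Promise.all(lanes);
}

// Usage idea (runQuery is an assumed function that sends one Cypher statement):
// await eachLimit(keys, 8, (key) => runQuery(statement, { key }));
```

Starting with a limit close to the number of cores and adjusting from there seems to match the "utilize all cores but don't flood" advice.
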
> correct datatypes:
>
> * "null" should *never be stored*
> * int: public, gender, completion_percentage, AGE,
> * long/time: last_login, registration 
> * optionally as label: gender, public
>
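
As a sketch of what the datatype list above could mean on the import side: the function below drops missing values and the literal string "null" instead of replacing them with a default, and converts the listed fields to numbers. The function names are hypothetical, and treating the timestamps as UTC is an assumption, since the data set does not state a time zone.

```javascript
// Fields the list above says should be stored as integers.
const INT_FIELDS = ["public", "gender", "completion_percentage", "AGE"];
// Fields holding timestamps such as "2012-05-25 11:20:00.0".
const TIME_FIELDS = ["last_login", "registration"];

// Parse "YYYY-MM-DD hh:mm:ss.f" into epoch milliseconds.
// Assumption: timestamps are UTC. Fractional seconds are ignored.
function toEpochMillis(text) {
  const m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})/.exec(text);
  if (!m) return undefined;
  const [, y, mo, d, h, mi, s] = m.map(Number);
  return Date.UTC(y, mo - 1, d, h, mi, s);
}

// Return a copy of `raw` that is safe to pass as node properties.
function normalizeProfile(raw) {
  const out = {};
  for (const [name, value] of Object.entries(raw)) {
    // Leave the property off entirely rather than storing null or "null".
    if (value === null || value === undefined || value === "null") continue;
    if (INT_FIELDS.includes(name)) {
      const n = parseInt(value, 10);
      if (!Number.isNaN(n)) out[name] = n;
    } else if (TIME_FIELDS.includes(name)) {
      const t = toEpochMillis(String(value));
      if (t !== undefined) out[name] = t;
    } else {
      out[name] = value;
    }
  }
  return out;
}
```

With this approach a property that has no value is simply absent from the node, so no coalesce default is needed at write time; a query can still supply a default at read time if it wants one.
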
>   -> test repository (WIP): with changes in *description.js and 
> benchmark.js*
>
> https://github.com/jexp/nosql-tests/tree/node-neo4j
>
> queries for neo4j-shell:
>
> export from="P/P1"
> export to="P/P277"
>
> export key="P/P1"
>
> // warmup
> MATCH ()--() return count(*);
> // 61.245.128 rows
>
> MATCH (s:PROFILES) return count(*);
> // 1.632.803 profiles
> // 1.15 s
>
> profile
>
> MATCH (s:PROFILES {_key:{key}})-[*1..2]->(n:PROFILES) RETURN DISTINCT 
> n._key;
> // 295 rows 5 ms
>
>
> // 1st degree neighbours
> MATCH (:PROFILES {_key:{key}})-->(n) RETURN n._key;
> // 14 rows 1ms 
>
> // 2nd degree neighbours
> MATCH (s:PROFILES {_key:{key}})-->(x)
> MATCH (x)-->(n:PROFILES)
> RETURN DISTINCT n._key;
> // 283 rows 6 ms
>
> // shortest path
> MATCH (s:PROFILES {_key:{from}}),(t:PROFILES {_key:{to}}), 
> p = shortestPath((s)-[*..15]->(t)) RETURN [x in nodes(p) | x._key] as path;
> // 1 ms; don't return the full data, only the keys, as for the other DBs
>
> // aggregation
> MATCH (f:PROFILES) RETURN f.AGE, count(*);
> // 22s -> should be rather 1.5s
>
> // single read
> MATCH (f:PROFILES) WHERE f._key = {key} RETURN f;
> // or
> MATCH (s:PROFILES {_key:{key}}) RETURN s;
> // 1 row with 59 properties 1 ms
>
> // single writes
> CREATE (s:PROFILES_TEMP {data}) RETURN id(s);
>
> // delete all nodes with a certain label
> // loop until returns 0
> MATCH (n:PROFILES_TEMP) WITH n LIMIT 5000 OPTIONAL MATCH (n)-[r]-() DELETE 
> n,r RETURN count(*) as deleted
> ----
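
The "loop until returns 0" step for the batched delete above could be driven from Node.js roughly as follows. This is only a sketch: `runQuery` is a hypothetical function supplied by the caller that sends one statement and resolves to the `deleted` value of the single result row, and it is not part of any particular driver.

```javascript
// The batched delete statement quoted above, unchanged.
const DELETE_BATCH =
  "MATCH (n:PROFILES_TEMP) WITH n LIMIT 5000 " +
  "OPTIONAL MATCH (n)-[r]-() DELETE n,r RETURN count(*) as deleted";

// Repeat the batch until the server reports 0.
// Returns the sum of the reported counts. Note that count(*) counts result
// rows (node/relationship pairs), so the sum can exceed the number of nodes.
async function deleteAllTemp(runQuery) {
  let total = 0;
  for (;;) {
    const deleted = await runQuery(DELETE_BATCH);
    if (deleted === 0) return total;
    total += deleted;
  }
}
```

Keeping each batch at a few thousand nodes keeps the individual transactions small, which is presumably why the statement uses LIMIT 5000.
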
>
> MATCH (s:PROFILES {_key:{key}})-[*1..2]->(n:PROFILES) WITH DISTINCT n._key 
> as key RETURN count(*);
> // 295 count 5-6ms
>
> MATCH (f:PROFILES) return id(f) % 140, count(*);
> // 140 rows -> 1502 ms that's how it should be
>
> sample data:
>
> _key:"P/P1",
> public:"1",
> completion_percentage:"14",
> gender:"1",
> region:"zilinsky kraj, zilina",
> last_login:"2012-05-25 11:20:00.0",
> registration:"2005-04-03 00:00:00.0",
> AGE:26,
> body:"185 cm, 90 kg",
> I_am_working_in_field:"it",
> spoken_languages:"anglicky",
> hobbies:"sportovanie, spanie, kino, jedlo, pocuvanie hudby, priatelia, 
> divadlo",
> I_most_enjoy_good_food:"v dobrej restauracii",
> pets:"mam psa",
> body_type:"null",
> my_eyesight:"null",
> eye_color:"null",
> hair_color:"null",
> hair_type:"null",
> completed_level_of_education:"null",
> favourite_color:"null",
> relation_to_smoking:"null",
> relation_to_alcohol:"null",
> sign_in_zodiac:"null",
> on_pokec_i_am_looking_for:"null",
> love_is_for_me:"null",
> relation_to_casual_sex:"null",
> my_partner_should_be:"null",
> marital_status:"null",
> children:"null",
> relation_to_children:"null",
> I_like_movies:"null",
> I_like_watching_movie:"null",
> I_like_music:"null",
> I_mostly_like_listening_to_music:"null",
> the_idea_of_good_evening:"null",
> I_like_specialties_from_kitchen:"null",
> fun:"null",
> I_am_going_to_concerts:"null",
> my_active_sports:"null",
> my_passive_sports:"null",
> profession:"null",
> I_like_books:"null",
> life_style:"null",
> music:"null",
> cars:"null",
> politics:"null",
> relationships:"null",
> art_culture:"null",
> hobbies_interests:"null",
> science_technologies:"null",
> computers_internet:"null",
> education:"null",
> sport:"null",
> movies:"null",
> travelling:"null",
> health:"null",
> companies_brands:"null",
> more:"null"
>
>
> neo4j-server.properties:
> org.neo4j.server.database.location=/Users/mh/support/arangodb/db/data
> org.neo4j.server.webserver.port=8474
> dbms.security.auth_enabled=false
>
>
> neo4j-wrapper.conf:
> wrapper.java.initmemory=8000
> wrapper.java.maxmemory=8000
> wrapper.java.additional=-Xmn2G
>
> neo4j.properties:
> dbms.pagecache.memory=5G
> keep_logical_logs=false
> remote_shell_enabled=false
> cache_type=soft
> online_backup_enabled=false
>
>
