Hi Michael,

In general, should one use coalesce with a default value (an empty string for strings, 0 for ints, 0.0 for floats, etc.) so as to avoid saving null?

On Wednesday, June 10, 2015 at 11:27:05 AM UTC-4, Michael Hunger wrote:
>
> I also did some experiments but didn't have the time to finish yet, here
> are my observations so far:
>
> *Arangodb Measurement*
>
> - index -> constraint `CREATE CONSTRAINT ON (p:PROFILES) ASSERT p._key IS UNIQUE;`
> - seraph -> replace with node-neo4j 2.0.RC1
> - uses 2 year old /cypher api, doesn't send X-Stream:true header
> - does not do efficient auth (encode creds on every call)
> - doesn't do pooling
> - suboptimal queries
> - make sure the concurrency level is adequate for the setup (utilize all cores but don't flood, use e.g. async.eachWithLimit)
> - warmup with nodes and rels `MATCH ()--() return count(*);`
> - enterprise with better vertical read/write scalability vs. community
> - Use 12G-24G heap, 2G new gen (-Xmn2G)
> - pagecache to 2.5G + growth (e.g. another 2.5G)
> - in 2.2 set cache_type = soft or cache_type=none depending on available heap
> - fix property encoding, e.g. AGE as int not string, don't store "null" !! -> affects esp. aggregate query
> - don't re-run the benchmark on the same store, start at the initial one -> creating and deleting the additional PROFILES_TEMP nodes affects repeatability of results
>
> correct datatypes:
>
> * "null" should *never be stored*
> * int: public, gender, completion_percentage, AGE
> * long/time: last_login, registration
> * optionally as label: gender, public
>
> -> test repository (WIP): with changes in *description.js and benchmark.js*
>
> https://github.com/jexp/nosql-tests/tree/node-neo4j
>
> queries for neo4j-shell:
>
> export from="P/P1"
> export to="P/P277"
>
> export key="P/P1"
>
> // warmup
> MATCH ()--() return count(*);
> // 61.245.128 rows
>
> MATCH (s:PROFILES) return count(*);
> // 1.632.803 profiles
> // 1.15 s
>
> profile
>
> MATCH (s:PROFILES {_key:{key}})-[*1..2]->(n:PROFILES) RETURN DISTINCT n._key;
> // 295 rows 5 ms
>
> // 1st degree neighbours
> MATCH (:PROFILES {_key:{key}})-->(n) RETURN n._key;
> // 14 rows 1ms
>
> // 2nd degree neighbours
> MATCH (s:PROFILES {_key:{key}})-->(x)
> MATCH (x)-->(n:PROFILES)
> RETURN DISTINCT n._key;
> // 283 rows 6 ms
>
> // shortest path
> MATCH (s:PROFILES {_key:{from}}),(t:PROFILES {_key:{to}}),
> p = shortestPath((s)-[*..15]->(t)) RETURN [x in nodes(p) | x._key] as path;
> // 1 ms, don't return the full data only keys like in the other db's
>
> // aggregation
> MATCH (f:PROFILES) RETURN f.AGE, count(*);
> // 22s -> should be rather 1.5s
>
> // single read
> MATCH (f:PROFILES) WHERE f._key = {key} RETURN f;
> // or
> MATCH (s:PROFILES {_key:{key}}) RETURN s;
> // 1 row with 59 properties 1 ms
>
> // single writes
> CREATE (s:PROFILES_TEMP {data}) RETURN id(s);
>
> // delete all nodes with a certain label
> // loop until returns 0
> MATCH (n:PROFILES_TEMP) WITH n LIMIT 5000 OPTIONAL MATCH (n)-[r]-() DELETE n,r RETURN count(*) as deleted
> ----
>
> MATCH (s:PROFILES {_key:{key}})-[*1..2]->(n:PROFILES) WITH DISTINCT n._key as key RETURN count(*);
> // 295 count 5-6ms
>
> MATCH (f:PROFILES) return id(f) % 140, count(*);
> // 140 rows -> 1502 ms that's how it should be
>
> sample data:
>
> _key:"P/P1",
> public:"1",
> completion_percentage:"14",
> gender:"1",
> region:"zilinsky kraj, zilina",
> last_login:"2012-05-25 11:20:00.0",
> registration:"2005-04-03 00:00:00.0",
> AGE:26,
> body:"185 cm, 90 kg",
> I_am_working_in_field:"it",
> spoken_languages:"anglicky",
> hobbies:"sportovanie, spanie, kino, jedlo, pocuvanie hudby, priatelia, divadlo",
> I_most_enjoy_good_food:"v dobrej restauracii",
> pets:"mam psa",
> body_type:"null",
> my_eyesight:"null",
> eye_color:"null",
> hair_color:"null",
> hair_type:"null",
> completed_level_of_education:"null",
> favourite_color:"null",
> relation_to_smoking:"null",
> relation_to_alcohol:"null",
> sign_in_zodiac:"null",
> on_pokec_i_am_looking_for:"null",
> love_is_for_me:"null",
> relation_to_casual_sex:"null",
> my_partner_should_be:"null",
> marital_status:"null",
> children:"null",
> relation_to_children:"null",
> I_like_movies:"null",
> I_like_watching_movie:"null",
> I_like_music:"null",
> I_mostly_like_listening_to_music:"null",
> the_idea_of_good_evening:"null",
> I_like_specialties_from_kitchen:"null",
> fun:"null",
> I_am_going_to_concerts:"null",
> my_active_sports:"null",
> my_passive_sports:"null",
> profession:"null",
> I_like_books:"null",
> life_style:"null",
> music:"null",
> cars:"null",
> politics:"null",
> relationships:"null",
> art_culture:"null",
> hobbies_interests:"null",
> science_technologies:"null",
> computers_internet:"null",
> education:"null",
> sport:"null",
> movies:"null",
> travelling:"null",
> health:"null",
> companies_brands:"null",
> more:"null"
>
> neo4j-server.properties:
> org.neo4j.server.database.location=/Users/mh/support/arangodb/db/data
> org.neo4j.server.webserver.port=8474
> dbms.security.auth_enabled=false
>
> neo4j-wrapper.conf:
> wrapper.java.initmemory=8000
> wrapper.java.maxmemory=8000
> wrapper.java.additional=-Xmn2G
>
> neo4j.properties:
> dbms.pagecache.memory=5G
> keep_logical_logs=false
> remote_shell_enabled=false
> cache_type=soft
> online_backup_enabled=false
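
For reference, the "correct datatypes" notes in the quoted message could be applied on the client side before a record is sent as the {data} parameter. The following is only a rough sketch of that idea, not code from the linked repository: the names normalizeProfile, parseTimestamp, INT_FIELDS and TIME_FIELDS are made up for illustration, and treating the timestamps as UTC is an assumption, since the sample data carries no time zone. Properties whose value is null or the string "null" are left out entirely instead of being replaced by a default.

```javascript
// Hypothetical helper, not part of the benchmark repository.
// Field lists are taken from the "correct datatypes" notes in the quoted message.
const INT_FIELDS = ['public', 'gender', 'completion_percentage', 'AGE'];
const TIME_FIELDS = ['last_login', 'registration'];

// Parses a value such as "2012-05-25 11:20:00.0" and returns epoch milliseconds.
// Assumption: the timestamp is interpreted as UTC. Returns undefined when the
// value does not match the expected layout.
function parseTimestamp(value) {
  const m = /^(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})(?:\.\d+)?$/.exec(String(value));
  if (!m) return undefined;
  const [, y, mo, d, h, mi, s] = m.map(Number);
  return Date.UTC(y, mo - 1, d, h, mi, s);
}

// Returns a cleaned copy of one raw profile record:
// - null, undefined and the literal string "null" are dropped (property absent)
// - integer fields are stored as numbers
// - time fields are stored as epoch milliseconds (a long)
// - everything else is copied unchanged
function normalizeProfile(raw) {
  const clean = {};
  for (const [key, value] of Object.entries(raw)) {
    if (value === null || value === undefined || value === 'null') continue;
    if (INT_FIELDS.includes(key)) {
      const n = parseInt(value, 10);
      if (!Number.isNaN(n)) clean[key] = n;
    } else if (TIME_FIELDS.includes(key)) {
      const t = parseTimestamp(value);
      if (t !== undefined) clean[key] = t;
    } else {
      clean[key] = value;
    }
  }
  return clean;
}
```

Applied to the sample record in the quoted message, this would keep _key and the free-text properties as strings, turn public, gender, completion_percentage and AGE into integers, turn last_login and registration into long values, and omit every property whose value is "null".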
