Great work! Congrats guys.

Pawel

On Monday, February 3, 2014 2:06:17 PM UTC+1, Andrey Lomakin wrote:
>
> Hi,
> We are glad to announce a new implementation of relationships in the graph database.
> According to a data loading benchmark, it can be 15 times faster than the current implementation.
>
> First, a few words about the new implementation. It is based on a new data structure, sbtree, and is optimized for use not only with embedded but also with remote storage.
> To achieve this kind of optimization we have introduced a new data type, LINKBAG. It represents a set of RIDs but allows duplicate values, and it does not implement the Collection interface.
> LINKBAG has two binary representations: a modified btree, and a collection embedded in the document. The embedded collection is deserialized only on demand, for example when it is iterated.
>
> Because this data structure is based on sbtree, it can be used in plocal and remote storages only. We plan to implement dmemory storage (memory storage that uses direct memory rather than the heap) so that memory db users will also be able to use this data structure.
> If you still want to work on local or memory storage, you should set the ridBag.embeddedToSbtreeBonsaiThreshold property to the MAX_INTEGER value, but of course this will lead to performance degradation.
>
> Below are the results of a comparison of load speed when importing the Wikipedia page structure (without page content), which consists of 130 million vertices and more than 1 billion edges.
> To prevent duplication of vertices we used a unique index on the page key. The data were taken from http://downloads.dbpedia.org/3.6/en/page_links_en.nt.bz2.
> The load test was run on a PC with 24 GB RAM, a 7500 RPM HDD, and an Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz.
>
> The first test loads the data in tx mode. The blue line is the current Graph DB implementation; the red line is the LINKBAG-based implementation.
> The X axis is the number of imported pages; the Y axis is the time spent importing these pages (sorry that the image is big; that is the only way to show the numbers).
>
> [image: Inline image 1]
> As you can see, after 6,300,000 imported records the current implementation suffers a dramatic slowdown, so there was no point in continuing the test.
>
> The second test imports the same data, but in non-TX mode.
> [image: Inline image 2]
>
> The last graph is a comparison of importing the whole Wikipedia data set using the new LINKBAG implementation.
>
> [image: Inline image 3]
>
> In non-TX mode the test completed in 6.5 hours, and in tx mode it completed in 14 hours.
>
> This implementation will be part of the first 1.7 release candidate, which will be published in the middle of the week.
>
> We are planning to have several release candidates.
>
> 1. RC1 - new graph database implementation without distributed storage support + new record level lock feature (https://github.com/orientechnologies/orientdb/issues/1056).
> 2. RC2 - support for distributed storage in the new graph database implementation + fix for OOM when restoring a big transaction (https://github.com/orientechnologies/orientdb/issues/1604).
> 3. RC3 - binary compatibility with old versions will be restored in OrientDB + migration tool for the new graph database structure.
> 4. 1.7 release.
>
>
> --
> Best regards,
> Andrey Lomakin.
>
> Orient Technologies
> the Company behind OrientDB
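
For readers trying to picture the LINKBAG behaviour described in the announcement, here is a minimal toy sketch in Java. It is not OrientDB code. The class name `LinkBagSketch`, its methods, and the use of a `TreeMap` as a stand-in for the sbtree are all assumptions made purely for illustration. It only models the idea of a bag of RIDs that allows duplicates, starts out embedded, and moves to a tree-backed form once it grows past a threshold. Setting the threshold to `Integer.MAX_VALUE` keeps it embedded forever, which presumably mirrors what the `ridBag.embeddedToSbtreeBonsaiThreshold` setting mentioned above is for.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy model only: NOT OrientDB's LINKBAG. All names here are hypothetical.
class LinkBagSketch {
    private final int embeddedToTreeThreshold;
    // Small bags: RIDs kept in a plain list, duplicates allowed.
    private List<String> embedded = new ArrayList<>();
    // Large bags: RID -> occurrence count; a TreeMap stands in for the sbtree.
    private TreeMap<String, Integer> tree = null;

    LinkBagSketch(int embeddedToTreeThreshold) {
        this.embeddedToTreeThreshold = embeddedToTreeThreshold;
    }

    void add(String rid) {
        if (tree != null) {
            tree.merge(rid, 1, Integer::sum);
            return;
        }
        embedded.add(rid);
        // Convert once the embedded form grows past the threshold.
        if (embedded.size() > embeddedToTreeThreshold) {
            tree = new TreeMap<>();
            for (String r : embedded) {
                tree.merge(r, 1, Integer::sum);
            }
            embedded = null;
        }
    }

    // How many times the given RID occurs in the bag.
    int count(String rid) {
        if (tree != null) {
            return tree.getOrDefault(rid, 0);
        }
        int n = 0;
        for (String r : embedded) {
            if (r.equals(rid)) n++;
        }
        return n;
    }

    // Total number of entries, counting duplicates.
    int size() {
        if (tree == null) {
            return embedded.size();
        }
        int total = 0;
        for (int c : tree.values()) total += c;
        return total;
    }

    boolean isEmbedded() {
        return tree == null;
    }
}
```

With a threshold of 2, adding a third RID switches the bag to the tree-backed form; with `Integer.MAX_VALUE` the switch never happens, so every lookup scans the embedded list, which hints at why the announcement warns about performance degradation in that configuration.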
