Re: [orientdb] paper about OrientDB and other graph-db

Luca Garulli Thu, 14 Aug 2014 00:48:07 -0700

On 14 August 2014 00:51, Valerio Schiavoni <[email protected]>
wrote:


> Hello Luca,
>
>
>> 60M nodes and 400M edges is not that big; you could manage the
>> entire graph on 1 machine, but it depends on the kind of operations you
>> run against the graph:
>>
>>    - is it mostly reads rather than writes?
>>
>>
> Yes. The database is write-once, but reads follow unpredictable patterns.
>

So you import the database once at the beginning and then you only run
traversals, right?

>
>>    - what kind of traversal will you do?
>>
> The main operation will be a shortest path between any two vertices.
> The number of hops between any two vertices follows a scale-free
> distribution.
>

Ok


>
>>    - what's the expected performance in terms of query/sec?
>>
> I don't need to comply with a particularly stringent SLA, but it has to be
> fast...
> For instance: internally, does OrientDB memoize traversal results?
>
> [2] - http://en.wikipedia.org/wiki/Memoization
>

We don't support this out of the box, but you could create in-memory graphs
as the result of a traversal, and then reuse them for further queries.
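
As an illustration of that idea, here is a minimal Python sketch that keeps
shortest-path results in memory and reuses them on repeated queries. It is
not OrientDB code: the `PathCache` class, its `hits` counter and the
adjacency-dict input are all assumptions made for the example, and the
search is a plain breadth-first search on an unweighted graph.

```python
from collections import deque


class PathCache:
    """Hypothetical helper: memoizes BFS shortest paths over an in-memory graph."""

    def __init__(self, adjacency):
        # adjacency: dict mapping a vertex id to an iterable of neighbour ids
        self.adjacency = adjacency
        self._cache = {}
        self.hits = 0  # number of queries answered from the cache

    def shortest_path(self, source, target):
        key = (source, target)
        if key in self._cache:
            self.hits += 1
            return self._cache[key]
        path = self._bfs(source, target)
        self._cache[key] = path
        return path

    def _bfs(self, source, target):
        # Breadth-first search; parents lets us rebuild the path at the end.
        parents = {source: None}
        queue = deque([source])
        while queue:
            vertex = queue.popleft()
            if vertex == target:
                path = []
                while vertex is not None:
                    path.append(vertex)
                    vertex = parents[vertex]
                return path[::-1]
            for neighbour in self.adjacency.get(vertex, ()):
                if neighbour not in parents:
                    parents[neighbour] = vertex
                    queue.append(neighbour)
        return None  # target is not reachable from source
```

Because the database in this thread is write-once, cached paths never need
to be invalidated; with a graph that changes, the cache would have to be
cleared on every write.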


>
>  Can OrientDB automatically shard (via some kind of semi/automatic
>>> graph-clustering) a large graph across
>>> a set of OrientDB nodes ?
>>>
>>
>> You can shard the graph across multiple nodes. We have 3 strategies to
>> distribute data across different machines (round-robin, balanced and
>> sticky), but 99% of the time it's better to let the application decide
>> where to store the nodes, because it knows the domain better.
>>
>
> Can you point me to the documentation where these 3 strategies are
> described, if any ?
>

https://github.com/orientechnologies/orientdb/wiki/SQL-Create-Class#cluster-selection-strategy
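
To give a feel for what such strategies do, here is a rough Python sketch of
the general idea behind round-robin and balanced selection. It only
illustrates the concepts and is not OrientDB's implementation: the function
names and the `cluster_sizes` dict are made up for the example, and "sticky"
is left out because this thread does not describe how it behaves.

```python
from itertools import cycle


def round_robin_selector(cluster_names):
    """Return a function that hands out clusters in a fixed rotation."""
    rotation = cycle(cluster_names)
    return lambda: next(rotation)


def balanced_selector(cluster_sizes):
    """Return a function that picks the cluster holding the fewest records.

    cluster_sizes is a dict (cluster name -> record count) that the caller
    keeps up to date; it is read again on every call.
    """
    def select():
        return min(cluster_sizes, key=cluster_sizes.get)
    return select
```

Round-robin ignores how full each cluster is, while balanced always looks at
the current sizes, so the two only differ when the clusters are uneven.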


>
>
>>  How the nodes communicate to each other ?
>>>
>>
>> All nodes are connected via TCP/IP, so the traversal is transparent, but
>> it could be costly if you have many hops back and forth between machines.
>>
>
> Are connections established once and for all, or they are created on the
> fly only when a given edge is traversed, and closed once the traversal has
> completed ?
>

Connections remain open until you stop a server or it becomes unreachable.


>  If the vertices are sharded following a min-cut clustering, this cost
> should be low...right?
>

Right
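
A quick way to check this for a given placement is to count the edges whose
endpoints sit on different machines, since each of those is a potential
network hop during a traversal. The Python sketch below does just that; the
function name and the `shard_of` mapping are assumptions for the example,
and it does not compute a min-cut itself, it only scores a partition you
already have.

```python
def cross_shard_edges(edges, shard_of):
    """Count edges whose two endpoints are stored on different shards.

    edges: iterable of (u, v) vertex pairs
    shard_of: dict mapping each vertex to the shard (machine) that holds it
    """
    return sum(1 for u, v in edges if shard_of[u] != shard_of[v])


# Two triangles joined by the single edge (3, 4).
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]

# Placement that follows the natural clusters: only the bridge is cut.
clustered = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}

# Placement that alternates vertices between the two shards.
alternating = {1: 0, 2: 1, 3: 0, 4: 1, 5: 0, 6: 1}
```

Here `cross_shard_edges(edges, clustered)` gives 1 and
`cross_shard_edges(edges, alternating)` gives 5, so the clustered placement
leaves far fewer edges that need a remote call.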


>
>
>>  Is there some rebalancing mechanism in the case a graph-cluster becomes
>>> too big ?
>>>
>>
>> With 2.0 we have a "move vertex" command to move vertices to nodes. This
>> is very useful to move one vertex or entire clusters.
>>
>
> Interesting. Given latest version I see is 1.7.8, the question is due:
> when do you plan to release 2.0 ?
>

2.0-rc1 is scheduled for the beginning of September, with the final release
within September. 2.0 has more new features, such as a new binary protocol
serialization that saves 20-60% of space on disk, better multi-core support,
a new Studio with Graph Render (pretty cool), better performance on
distributed architectures and much more.


>  1 -
> http://www.odbms.org/wp-content/uploads/2014/05/an-empirical-comparison-of-graph-databases.pdf
>
>>
>> This is a pretty old benchmark with the old version of OrientDB that used
>> local and mvrb+tree indexes. Now OrientDB is totally different, and it
>> scales much better.
>>
>
> I understand your concerns. But the paper is not so old after all,
> officially published back in September 2013 here:
>
> http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=6693403&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D6693403
>

They tested OrientDB 1.3, released on December 19th, 2012 (
https://code.google.com/p/orient/downloads/detail?name=orientdb-1.3.0.tar.gz&can=2&q=#makechanges
).

Recently I found another benchmark, published 2 months ago, where they used
YCSB with recent versions of NoSQL products and... OrientDB 1.0.1...

Lvc@


>
>
>
>>  By the way, I asked the authors of the paper to rerun the very same
>> benchmark against 1.7.x, but they never responded to me...
>>
>
> Too bad :-(
>
>
>  --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>
