Re: [orientdb] Re: New benchmarks from Arango

Luca Garulli Mon, 22 Jun 2015 02:57:03 -0700

Hey guys,
We sent the PR, but it seems to be taking a long time to get it merged and
the tests re-run...

However, Enrico (maggiolo00) took a look at this benchmark and optimized the
OrientDB implementation in the following ways:

   - Using OrientDB 2.1-rc4 instead of 2.0.x. This seemed fairer, since
   the Arango release used was the latest alpha
   - The database had been created without lightweight edges. Once we
   re-imported it (by the way, we used OrientDB ETL and a couple of JSON
   files), everything was much faster
   - The singleWrite test used an SQL statement, but this is not the
   fastest way, so Enrico used direct document creation. On this test we
   are the fastest DBMS on insertion

Anyone can clone this repository and run the benchmark themselves ;-)

https://github.com/maggiolo00/nosql-tests

And this is the only commit Enrico made:
https://github.com/maggiolo00/nosql-tests/commit/b8fdddf9662322d748dfdbc5dd18787db2b75416

However, since we only recently adopted the OrientJS driver from the
Oriento project, we found some issues and bottlenecks that made the
difference in these tests. For example, on "singleRead" OrientDB was very
slow. On my PC it took about 60 secs to execute 100k reads, but once we
profiled the time spent we discovered that 70% of the time was spent in the
Node.js driver and only 30% executing the query! While the Node.js driver
is good enough at marshalling, it is very slow at unmarshalling. For this
reason we're going to improve the unmarshalling in the coming weeks. Stay
tuned for updates on this.
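
To put those figures in perspective, here is a back-of-the-envelope
breakdown in Python. It uses only the rough numbers quoted above (about
60 secs, 100k reads, a 70/30 split); the variable names are made up for
this sketch, and the figures are approximate measurements from one PC, not
benchmark results.

```python
# Rough figures quoted above (single PC, approximate).
total_seconds = 60.0   # wall time for the whole singleRead run
reads = 100_000        # number of reads executed
driver_share = 0.70    # fraction of time spent in the Node.js driver
query_share = 0.30     # fraction of time spent executing the query

# Split the total time according to the profiled shares.
driver_seconds = total_seconds * driver_share
query_seconds = total_seconds * query_share

# Average cost of one read, overall and for the query part alone.
per_read_ms = total_seconds / reads * 1000
per_read_query_ms = query_seconds / reads * 1000

print(f"driver: {driver_seconds:.0f} s, query: {query_seconds:.0f} s")
print(f"per read: {per_read_ms:.2f} ms overall, "
      f"{per_read_query_ms:.2f} ms in the query itself")
```

In other words, if the driver overhead could be removed entirely, the same
run would take roughly 18 secs instead of 60 on that machine.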

NOTE: it's not that the OrientJS (forked from the Oriento project)
implementation was bad; we think Node.js is not so good at manipulating
chars/bytes, so we're considering new options to do that ;-)

We ran the benchmark multiple times and OrientDB was the fastest of all the
DBMSs tested, except on "singleRead" (see above) and "neighbors2" (on
"neighbors" OrientDB is actually the fastest). Looking at what this
benchmark does, we understood why: it's not what you would expect from a
classic 2nd-level neighbors query, because it returns only the IDs. On
Arango, as in any relational DBMS, primary keys are kept in indexes.

So that particular query uses the index without even fetching the real
documents. That's why it seems faster, but retrieving only the IDs is an
edge case without much meaning in a real use case. If you're looking for
neighbors, you are usually interested in some information about the
neighbors, like name, city, etc. Not just the IDs.
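
The difference between the two query shapes can be sketched with a toy
in-memory graph in Python. This is only an illustration: the data, the
function names and the reading of "neighbors2" as "distinct vertices within
two outgoing hops, excluding the start vertex" are all assumptions, and it
is not how OrientDB or Arango execute the query, nor does it measure
anything.

```python
# Toy data invented for illustration: outgoing edges plus full documents.
edges = {
    "P1": ["P2", "P3"],
    "P2": ["P4"],
    "P3": ["P4", "P5"],
    "P4": [],
    "P5": ["P1"],
}
documents = {
    "P1": {"name": "Ann", "city": "Rome"},
    "P2": {"name": "Bob", "city": "London"},
    "P3": {"name": "Cid", "city": "Berlin"},
    "P4": {"name": "Dee", "city": "Paris"},
    "P5": {"name": "Eve", "city": "Madrid"},
}

def neighbors2_ids(start):
    """Distinct vertices within two hops of start, as IDs only."""
    first = set(edges.get(start, []))
    second = {n2 for n1 in first for n2 in edges.get(n1, [])}
    return (first | second) - {start}

def neighbors2_documents(start):
    """Same traversal, but every ID is resolved to its full document."""
    return {vid: documents[vid] for vid in neighbors2_ids(start)}

print(sorted(neighbors2_ids("P1")))
print(neighbors2_documents("P1")["P4"]["city"])
```

The first function never touches `documents`, which is the shape of the
benchmarked query; the second adds one document lookup per neighbor, which
is the work a typical application would also need.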

However, even though this benchmark was created by a vendor to demonstrate
that its product is the fastest, the benchmark itself is very simple. So I'd
call it a micro-benchmark. Furthermore, other vendors weren't invited to do
any tuning, so I see this as a mere marketing move to make a lot of noise.

Best Regards,

Luca Garulli
CEO at Orient Technologies LTD
the Company behind OrientDB <http://orientdb.com>

On 21 June 2015 at 23:33, Marvin Froeder <[email protected]> wrote:

> I would love to see this PR... Just to make sure I'm not making the same
> mistakes :)
>
> --
>
> ---
> You received this message because you are subscribed to the Google Groups
> "OrientDB" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

