Hi Luca,
thanks for your response.
Yeah, it's kinda too late now. The Hibernate/Postgres guys won, this time at
least. We had to decide today because of a current project.
But for future projects, the jury is still out. We still want, and need, to
build a large, flexible network of related information for our library.
Regarding your response, I'm confused and terrified at the same time. But
at least now I know that I know nothing about graph databases. (lol)
Even though I had zero experience with graph databases, the whole concept of
links and navigating through the data with "java-like access-paths"
(like accessing java-class properties) felt super natural to me.
I was able to write queries within a few days without much learning.
I experimented a little with TRAVERSE and was able to build a query which
performs way better than the others:
select from (
  traverse * from (
    select from RelationImpl
    where predicate.uniqueKey = 'IS_PUB_PLACE_OF'
      and subject.relationNodeContainers contains (uniqueKey = 'MILANO')
  ) while $depth <= 3
) where @class = 'RecordImpl' and type.uniqueKey = 'TOME'
order by sortIndex asc
Less than 2 seconds with order by, compared to 14 seconds for the first
query I've posted.
TBH, I have no idea what kind of black magic I've done there.
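My best guess, as a toy sketch only: the traverse query starts from the few
relations that match and walks outward a limited number of steps, instead of
testing every RecordImpl. The plain-Java model below is NOT the OrientDB API
and says nothing certain about its internals; the names TraversalSketch, Node,
"Place", buildRecords and the book-N keys are all made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy in-memory graph. Hypothetical names throughout, not OrientDB classes.
class TraversalSketch {

    static final class Node {
        final String className;
        final String key;
        final List<Node> links = new ArrayList<>();

        Node(String className, String key) {
            this.className = className;
            this.key = key;
        }
    }

    // Store the link on both ends so it can be crossed in either direction.
    static void link(Node a, Node b) {
        a.links.add(b);
        b.links.add(a);
    }

    // Demo data: record -> relation -> place; every 4th record points to 'milano'.
    static List<Node> buildRecords(Node milano, Node other, int count) {
        List<Node> records = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Node record = new Node("RecordImpl", "book-" + i);
            Node relation = new Node("RelationImpl", "rel-" + i);
            link(record, relation);
            link(relation, i % 4 == 3 ? milano : other);
            records.add(record);
        }
        return records;
    }

    // Strategy 1: test every record by following its links outward.
    static List<String> scanAllRecords(List<Node> records, String placeKey) {
        List<String> hits = new ArrayList<>();
        for (Node record : records) {
            boolean match = false;
            for (Node relation : record.links) {
                for (Node target : relation.links) {
                    if (target.className.equals("Place") && target.key.equals(placeKey)) {
                        match = true;
                    }
                }
            }
            if (match) {
                hits.add(record.key);
            }
        }
        return hits;
    }

    // Strategy 2: start at one known node and walk outward, breadth first,
    // up to maxDepth steps, keeping only nodes of the wanted class.
    static List<String> traverseFrom(Node seed, int maxDepth, String wantedClass) {
        List<String> hits = new ArrayList<>();
        Set<Node> seen = new HashSet<>();
        Deque<Node> frontier = new ArrayDeque<>();
        seen.add(seed);
        frontier.add(seed);
        for (int depth = 0; depth <= maxDepth && !frontier.isEmpty(); depth++) {
            Deque<Node> next = new ArrayDeque<>();
            for (Node node : frontier) {
                if (node.className.equals(wantedClass)) {
                    hits.add(node.key);
                }
                for (Node neighbour : node.links) {
                    if (seen.add(neighbour)) {
                        next.add(neighbour);
                    }
                }
            }
            frontier = next;
        }
        return hits;
    }

    public static void main(String[] args) {
        Node milano = new Node("Place", "MILANO");
        Node roma = new Node("Place", "ROMA");
        List<Node> records = buildRecords(milano, roma, 10);
        System.out.println(scanAllRecords(records, "MILANO"));
        System.out.println(traverseFrom(milano, 2, "RecordImpl"));
    }
}
```

Both calls return the same two books, but the scan touches all ten records
while the traversal only touches what hangs off MILANO. If something similar
happens inside the database, that would explain the difference in timing.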
Anyways, we have to move on for now.
Best regards & thanks again,
Sebastian
On Friday, July 31, 2015 at 3:32:41 PM UTC+2, l.garulli wrote:
>
> Hi Sebastian,
> Sorry to have seen this now, I hope it's not too late.
>
> Starting from your last query (about 7 seconds):
>
> select from RecordImpl
> where
>   type contains [#46:0]
>   and (
>     relationNode.relations
>     contains (
>       predicate contains [#34:0] and subject contains [#30:18]
>     )
>   )
>
> I see the bottleneck is the expression: *relationNode.relations contains
> ( predicate contains [#34:0] and subject contains [#30:18] )*. In fact,
> with such an expression OrientDB does a full scan of many records. You can
> check this by prefixing the query with *EXPLAIN*:
>
> *explain* select from RecordImpl where type contains [#46:0] and
> ( relationNode.relations contains ( predicate contains [#34:0] and subject
> contains [#30:18] ) )
>
> The secret to fast queries, in any DBMS, is using indexes as much as you
> can. When you use dot notation (.), OrientDB can't use the indexes.
> Looking at the original query:
>
> select from RecordImpl
> where
>   type.uniqueKey = 'TOME'
>   and
>   relationNode.relations
>   contains (
>     (predicate.uniqueKey = 'IS_PUB_PLACE_OF')
>     and
>     subject.relationNodeContainers contains (uniqueKey = 'MILANO')
>   )
>
> You have 3 conditions to match. If you used the Graph API, you'd have
> bidirectional edges, so you could start from any point in the graph and
> cross it in any direction. For example, you could look up all the places of
> type "IS_PUB_PLACE_OF" and start crossing the graph, matching the other
> conditions. Or you could do the same with "MILANO".
>
> To help you more I'd need the schema of the entities involved in this
> query.
>
> Best Regards,
>
> Luca Garulli
> Founder & CEO
> OrientDB <http://orientdb.com/>
>
>
> On 31 July 2015 at 12:46, Zapp El wrote:
>
>>
>> So, we officially gave up on OrientDB.
>>
>> Performance is just too bad for our use case. We did try a couple more
>> things, but none of them helped. Even with a very small amount of data
>> (2 GB, ~4,859,173 records), performance is abysmal.
>>
>> That is really a pity. I totally like the basic concept of OrientDB's
>> ObjectDB.
>>
>> Since I work with several hundred GB of index data in Lucene and Solr
>> on a daily basis and have never had a problem achieving decent performance,
>> my only guess right now is that OrientDB's ObjectDB suffers from poorly
>> written query optimization.
>>
>> And BTW, we really hoped that one of the developers would chime in here.
>> In the end, we are potential customers, but as long as we can't get a
>> basic proof of concept going, or at least get confirmation that our data
>> isn't a complete mismatch for OrientDB, we won't buy any licences.
>>
>> Best regards,
>>
>> Sebastian
>>
>>
>>
>>
>> On Tuesday, July 28, 2015 at 7:13:06 PM UTC+2, Zapp El wrote:
>>>
>>> Hello Community,
>>> hello OrientDB developers!
>>>
>>> I work in public services, and currently we're evaluating different
>>> technologies (OrientDB, Hibernate/postgres, Fedora4 and neo4j) in order to
>>> find the best possible backend solution for our data.
>>> Over the course of the last month we developed a fairly straightforward
>>> Java class model that we'd like to use regardless of the underlying
>>> technology.
>>>
>>> In future applications we're going to have more than 1.5 million objects
>>> to persist, manage and retrieve.
>>>
>>> Handling Java objects directly seems so much more intuitive and flexible
>>> than mapping them with JPA, so we were eager to try something new,
>>> for example OrientDB's Object Database functionality.
>>>
>>> But somehow we can't figure out how to get decent performance out of
>>> our experimental setup.
>>>
>>> So far we've persisted about 95,601 of our objects (books and other
>>> media), resulting in
>>> 46,995,663 ORecords (see screen-shot) and about 18.8 GB of data on our
>>> NAS.
>>> Our test-system:
>>>
>>> - Virtual Machine on VMware
>>> - SUSE 10 OS
>>> - 1 TB of NAS.
>>> - Java 1.8
>>> - OrientDB PE Version 2.0.12
>>> - QuadCore CPU
>>>
>>>
>>>
>>> Select a book with a specific relation (like a triple):
>>>
>>> select from RecordImpl
>>> where
>>>   type.uniqueKey = 'TOME'
>>>   and
>>>   relationNode.relations
>>>   contains (
>>>     (predicate.uniqueKey = 'IS_PUB_PLACE_OF')
>>>     and
>>>     subject.relationNodeContainers contains (uniqueKey = 'MILANO')
>>>   )
>>> Query executed in 9.26 sec. Returned 20 record(s)
>>>
>>> 9.26 sec. How can we accelerate this query? We have indexes on all
>>> the uniqueKeys.
>>>
>>>
>>>
>>> We managed to accelerate this query a little by rewriting the statement
>>> like this:
>>>
>>> select from RecordImpl
>>> where
>>>   type contains [#46:0]
>>>   and (
>>>     relationNode.relations
>>>     contains (
>>>       predicate contains [#34:0] and subject contains [#30:18]
>>>     )
>>>   )
>>> Query executed in 7.122 sec. Returned 20 record(s)
>>>
>>> 7.122 sec. is sadly not acceptable, and that is one of the simpler
>>> questions we'd like to get answered in a decent time.
>>>
>>>
>>>
>>> Now this one with a simple order by:
>>>
>>> select from RecordImpl
>>> where
>>>   type contains [#46:0]
>>>   and (
>>>     relationNode.relations
>>>     contains (
>>>       predicate contains [#34:0] and subject contains [#30:18]
>>>     )
>>>   )
>>> order by sortIndex desc
>>> Query executed in 133.423 sec. Returned 20 record(s)
>>>
>>>
>>>
>>> So, any ideas how we could accelerate our queries? What are we doing wrong?
>>>
>>>
>>> Best regards & thanks,
>>>
>>> Sebastian
>>>
>>>
>>> Edit: Added number of cores (4) at sys specs
>>>
>> --
>>
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "OrientDB" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>