Re: [orientdb] Indexing and Queries with Java API

Luca Garulli Thu, 18 Aug 2016 14:08:39 -0700

Cool, glad you solved it.

Anyway, we have to improve the docs, because I'm sure many users drop
OrientDB after hitting their first problem, even when the cause is something
trivial like this ;-)



Best Regards,

Luca Garulli
Founder & CEO
OrientDB LTD <http://orientdb.com/>

Want to share your opinion about OrientDB?
Rate & review us at Gartner's Software Review
<https://www.gartner.com/reviews/survey/home>


On 18 August 2016 at 15:29, John J. Szucs <[email protected]> wrote:

> New issue opened at https://github.com/orientechnologies/orientdb/issues/6589.
>
> BTW, the performance test results I shared yesterday were obtained while
> running under the debugger and with extensive instrumentation. Here are the
> "clean" results. Wow!
>
> Created 10000 entities in 00:00:05.840, 1712.33 per second
> Retrieving 10000 entities...
> Retrieved 10000 entities in 00:00:01.561, 6406.15 per second
> Deleting 10000 entities...
> Deleted 10000 entities in 00:00:01.960, 5102.04 per second
>
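The per-second figures above follow directly from the entity count divided by the elapsed time. A minimal sketch of that arithmetic in Java; the class and method names here are hypothetical and are not taken from John's test harness:

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper, for illustration only.
class ThroughputReport {
    /** Entities processed per second, computed from millisecond-resolution elapsed time. */
    static double perSecond(long count, Duration elapsed) {
        return count * 1000.0 / elapsed.toMillis();
    }

    /** Formats a line in the same shape as the results quoted in this thread. */
    static String format(String verb, long count, Duration elapsed) {
        long ms = elapsed.toMillis();
        return String.format(Locale.ROOT,
                "%s %d entities in %02d:%02d:%02d.%03d, %.2f per second",
                verb, count,
                ms / 3_600_000,      // hours
                (ms / 60_000) % 60,  // minutes
                (ms / 1000) % 60,    // seconds
                ms % 1000,           // milliseconds
                perSecond(count, elapsed));
    }

    public static void main(String[] args) {
        System.out.println(format("Created", 10000, Duration.ofMillis(5840)));
        System.out.println(format("Retrieved", 10000, Duration.ofMillis(1561)));
        System.out.println(format("Deleted", 10000, Duration.ofMillis(1960)));
    }
}
```

Running `main` reproduces the three result lines quoted above from their elapsed times.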
> Thanks again!
>
> -- John
>
> On Wednesday, August 17, 2016 at 11:55:51 PM UTC-4, l.garulli wrote:
>>
>> Hi John,
>>
>> Happy to help. Yes, please, could you open a new issue for the
>> documentation?
>>
>> Best Regards,
>>
>> Luca Garulli
>>
>> On 17 August 2016 at 07:45, John J. Szucs <[email protected]> wrote:
>>
>>> Luca,
>>>
>>> I just tried this. The only change was:
>>>
>>> Iterable<Vertex> vertices=graph.getVertices("identifier", myUriStr);
>>>
>>> to:
>>>
>>> Iterable<Vertex> vertices=graph.getVertices("Identifier.identifier",
>>> myUriStr);
>>>
>>>
>>> The results speak for themselves:
>>>
>>> Created 10000 entities in 00:02:05.755, 79.52 per second
>>>
>>>
>>> This is the kind of performance I was expecting!
>>>
>>> Thank you!!!
>>>
>>> I will note that this was a very subtle change. Essentially, it seems
>>> that for the graph API's getVertices() method to use the indices, the
>>> property names have to be qualified with the vertex type name. Would you
>>> like me to open an issue on GitHub to improve the documentation around
>>> this?
>>>
>>> Thanks again!
>>>
>>> -- John
>>>
>>> On Tuesday, August 16, 2016 at 7:01:10 PM UTC-4, l.garulli wrote:
>>>>
>>>> It looks like you're not using the index from the Graph API. Look at
>>>> the documentation:
>>>>
>>>> http://orientdb.com/docs/last/Performance-Tuning-Graph.html#use-indexes-to-lookup-vertices-by-an-id
>>>>
>>>> If it's not clear, please write here again and we will help you with this ;-)
>>>>
>>>> Best Regards,
>>>>
>>>> Luca Garulli
>>>>
>>>> On 16 August 2016 at 17:26, John J. Szucs <[email protected]> wrote:
>>>>
>>>>> In my OrientDB-based application, I need to do an INSERT-IF-NOT-EXISTS
>>>>> operation using the Java (TinkerPop) API.
>>>>>
>>>>> I have created a vertex type "Identifier". It has a single property,
>>>>> "identifier", which contains a URI (effectively a String for purposes of
>>>>> this discussion).
>>>>>
>>>>> I have also created an index like this:
>>>>>
>>>>> ParametersBuilder builder=new ParametersBuilder();
>>>>>
>>>>> builder.add("class", "Identifier");
>>>>>
>>>>> builder.add("type", "UNIQUE_HASH_INDEX");
>>>>>
>>>>> graph.createKeyIndex("identifier", Vertex.class, builder.build());
>>>>>
>>>>>
>>>>> Then, I perform the INSERT-IF-NOT-EXISTS operation in a loop like
>>>>> this. This snippet is using the Google Guava libraries and is obviously a
>>>>> simplification of our real application:
>>>>>
>>>>> int n=10000;
>>>>> for (int i=0; i<n; i++)
>>>>> {
>>>>>     // i is a primitive int, so concatenate it directly (i.toString() does not compile)
>>>>>     String myUriStr="http://example.org/"+i;
>>>>>
>>>>>     Iterable<Vertex> vertices=graph.getVertices("identifier", myUriStr);
>>>>>
>>>>>     // The two-argument overload returns the default (null) when nothing matches;
>>>>>     // the one-argument overload would throw NoSuchElementException instead
>>>>>     Vertex vertex=Iterables.getOnlyElement(vertices, null);
>>>>>
>>>>>     if (null==vertex)
>>>>>     {
>>>>>         // Create vertex
>>>>>         ...
>>>>>     }
>>>>>
>>>>>     // Use vertex
>>>>>     ...
>>>>> }
>>>>>
>>>>>
>>>>> What I am seeing is that the throughput of this loop rapidly
>>>>> diminishes as more vertices are added, like this (with the throughput
>>>>> relative to the n=1,000 baseline):
>>>>>
>>>>>
>>>>> n=1,000 throughput=100%
>>>>> n=2,000 throughput=58.8%
>>>>> n=5,000 throughput=29.7%
>>>>>
>>>>> n=10,000 throughput=16.5%
>>>>>
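Throughput that keeps falling as n grows, as in the figures above, is the pattern one would expect if each lookup walks the existing vertices instead of going through the hash index. A minimal, self-contained sketch of that difference follows. It uses plain Java collections as a stand-in for the database, so it is a model of the two access patterns and not OrientDB code; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: compares the work done by insert-if-not-exists
// when lookups scan every stored key versus when they use a hash index.
class LookupCost {
    /** Unindexed store: each lookup compares against existing keys one by one. */
    static long scanComparisons(int n) {
        List<String> store = new ArrayList<>();
        long comparisons = 0;
        for (int i = 0; i < n; i++) {
            String key = "http://example.org/" + i;
            boolean found = false;
            for (String existing : store) {
                comparisons++;
                if (existing.equals(key)) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                store.add(key);
            }
        }
        return comparisons;
    }

    /** Hash-indexed store: each lookup is a single index probe. */
    static long indexedLookups(int n) {
        Set<String> index = new HashSet<>();
        long lookups = 0;
        for (int i = 0; i < n; i++) {
            String key = "http://example.org/" + i;
            lookups++;
            if (!index.contains(key)) {
                index.add(key);
            }
        }
        return lookups;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1000, 2000, 5000, 10000}) {
            System.out.println("n=" + n
                    + " scan comparisons=" + scanComparisons(n)
                    + " indexed lookups=" + indexedLookups(n));
        }
    }
}
```

With n distinct keys the scanning variant performs n(n-1)/2 comparisons in total, so the work per insert grows with the number of stored keys, while the indexed variant performs one probe per insert regardless of size.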
>>>>>
>>>>> This obviously suggests that indexing is not working, so I tried a SQL
>>>>> EXPLAIN command.
>>>>>
>>>>> *explain select from identifier where identifier='http://example.org/1'*
>>>>> documentReads=1
>>>>> fullySortedByIndex=false
>>>>> documentAnalyzedCompatibleClass=1
>>>>> recordReads=1
>>>>> fetchingFromTargetElapsed=0
>>>>> indexIsUsedInOrderBy=false
>>>>> compositeIndexUsed=1
>>>>> current=Identifier#153:0{identifier:http://example.org/1, out_id:[size=1]} v2
>>>>> involvedIndexes=[Identifier.identifier]
>>>>> limit=-1
>>>>> evaluated=1
>>>>> user=#5:0
>>>>> elapsed=2.387001
>>>>> resultType=collection
>>>>> resultSize=1
>>>>>
>>>>>
>>>>> The documentation at http://orientdb.com/docs/master/SQL-Explain.html does
>>>>> not seem to be 100% current on how to interpret the output of the EXPLAIN
>>>>> command, but my interpretation is that the query did recognize and use the
>>>>> index that I created.
>>>>>
>>>>> I also tried some profiling (with JProfiler) and see a hot spot at
>>>>> com.tinkerpop.blueprints.impls.orient.OrientElementIterator.hasNext.
>>>>>
>>>>> All of this is with OrientDB running in embedded mode, on a fairly
>>>>> high-end Linux machine and with a fresh, empty database at the beginning
>>>>> of each test.
>>>>>
>>>>> I have to believe I am doing something wrong to see such a rapid
>>>>> drop-off in query performance under such relatively small data volumes.
>>>>>
>>>>> I have been struggling with this for several days off-and-on now and
>>>>> it's time to ask for help. Has anyone else encountered a similar issue?
>>>>> What can I do to address this?
>>>>>
>>>>> Thanks in advance!
>>>>>
>>>>> -- John
>>>>>
>>>>> --
>>>>>
>>>>> ---
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "OrientDB" group.
>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>> an email to [email protected].
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
