Re: [orientdb] Indexing and Queries with Java API

John J. Szucs Thu, 18 Aug 2016 14:15:50 -0700


---
John J. Szucs (on my iPhone)


> On Aug 18, 2016, at 17:07, Luca Garulli <[email protected]> wrote:
> 
> Cool, you solved it.
> 
> Anyway we have to improve the docs, because I'm sure many users just drop 
> OrientDB after the first problem and maybe it's something trivial like this 
> ;-)
> 
> 
> Best Regards,
> 
> Luca Garulli
> Founder & CEO
> OrientDB LTD
> 
> Want to share your opinion about OrientDB?
> Rate & review us at Gartner's Software Review
> 
> 
>> On 18 August 2016 at 15:29, John J. Szucs <[email protected]> wrote:
>> New issue opened at 
>> https://github.com/orientechnologies/orientdb/issues/6589.
>> 
>> BTW, the performance test results I shared yesterday were running under the 
>> debugger and with extensive instrumentation. Here are the "clean" results. 
>> Wow!
>> 
>> Created 10000 entities in 00:00:05.840, 1712.33 per second
>> Retrieving 10000 entities...
>> Retrieved 10000 entities in 00:00:01.561, 6406.15 per second
>> Deleting 10000 entities...
>> Deleted 10000 entities in 00:00:01.960, 5102.04 per second
>> 
>> Thanks again!
>> 
>> -- John
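
For anyone checking the figures above, here is a minimal sketch of the arithmetic behind the "per second" numbers. The class and method names (`Throughput`, `parseElapsed`, `perSecond`) are made up for illustration and are not part of John's test harness; the only assumption is that the elapsed time is printed as HH:MM:SS.mmm.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper, not taken from the thread: recomputes entities per second
// from an elapsed time printed as HH:MM:SS.mmm.
class Throughput {
    // Parses an elapsed time such as "00:00:05.840" into a Duration.
    public static Duration parseElapsed(String text) {
        String[] parts = text.split("[:.]");
        long hours = Long.parseLong(parts[0]);
        long minutes = Long.parseLong(parts[1]);
        long seconds = Long.parseLong(parts[2]);
        long millis = Long.parseLong(parts[3]);
        return Duration.ofHours(hours)
                .plusMinutes(minutes)
                .plusSeconds(seconds)
                .plusMillis(millis);
    }

    // Entities processed per second of elapsed time.
    public static double perSecond(long count, Duration elapsed) {
        return count * 1000.0 / elapsed.toMillis();
    }

    public static void main(String[] args) {
        String[] runs = {"00:00:05.840", "00:00:01.561", "00:00:01.960"};
        for (String elapsed : runs) {
            double rate = perSecond(10000, parseElapsed(elapsed));
            System.out.printf(Locale.ROOT, "%s -> %.2f per second%n", elapsed, rate);
        }
    }
}
```

Run against the three elapsed times quoted above with 10000 entities each, this gives 1712.33, 6406.15 and 5102.04 per second, matching the reported rates.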
>> 
>>> On Wednesday, August 17, 2016 at 11:55:51 PM UTC-4, l.garulli wrote:
>>> Hi John,
>>> 
>>> Happy to help. Yes, please, could you open a new issue for the 
>>> documentation?
>>> 
>>> Best Regards,
>>> 
>>> Luca Garulli
>>> Founder & CEO
>>> OrientDB LTD
>>> 
>>> 
>>>> On 17 August 2016 at 07:45, John J. Szucs <[email protected]> wrote:
>>>> Luca,
>>>> 
>>>> I just tried this. The only change was:
>>>> Iterable<Vertex> vertices=graph.getVertices("identifier", myUriStr);
>>>> to:
>>>> 
>>>> Iterable<Vertex> vertices=graph.getVertices("Identifier.identifier", 
>>>> myUriStr);
>>>> 
>>>> The results speak for themselves:
>>>> 
>>>> Created 10000 entities in 00:02:05.755, 79.52 per second
>>>> 
>>>> This is the kind of performance I was expecting!
>>>> 
>>>> Thank you!!!
>>>> 
>>>> I will note that this was a very subtle change. Essentially, it seems that 
>>>> for the graph API's getVertices() method to use the indices, the property 
>>>> names have to be qualified with the vertex type name. Would you like for 
>>>> me to add an issue on GitHub to improve the documentation around this?
>>>> 
>>>> Thanks again!
>>>> 
>>>> -- John
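
To make the effect of that one-line change concrete, here is a toy model in plain Java. It is only a sketch and not OrientDB's implementation: it assumes the unqualified lookup degenerates into a scan over every stored record (which the `OrientElementIterator.hasNext` hot spot in the original post suggests, but the thread does not confirm), while the qualified lookup resolves to a single hash-index probe. `LookupModel` and all of its members are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a hypothetical in-memory store, not OrientDB's implementation.
class LookupModel {
    private final List<String> records = new ArrayList<>();     // every stored value
    private final Map<String, String> index = new HashMap<>();  // stands in for the unique hash index
    private long examined = 0;                                  // records touched by lookups

    // Models the unqualified key: every existing record is examined.
    String findByScan(String value) {
        for (String record : records) {
            examined++;
            if (record.equals(value)) {
                return record;
            }
        }
        return null;
    }

    // Models the qualified key: a single probe into the hash index.
    String findByIndex(String value) {
        examined++;
        return index.get(value);
    }

    void insert(String value) {
        records.add(value);
        index.put(value, value);
    }

    // Runs n insert-if-not-exists operations and returns how many
    // records the lookups examined in total.
    public static long run(int n, boolean useIndex) {
        LookupModel model = new LookupModel();
        for (int i = 0; i < n; i++) {
            String uri = "http://example.org/" + i;
            String found = useIndex ? model.findByIndex(uri) : model.findByScan(uri);
            if (found == null) {
                model.insert(uri);
            }
        }
        return model.examined;
    }

    public static void main(String[] args) {
        for (int n : new int[] {1000, 2000, 5000, 10000}) {
            System.out.println("n=" + n + " scan=" + run(n, false) + " index=" + run(n, true));
        }
    }
}
```

In this model the scan path examines n(n-1)/2 records over the whole loop while the index path examines n, so the work per insert grows with the number of vertices already stored in one case and stays flat in the other. That is the same shape as the throughput drop reported in the original post, although the real numbers also include fixed per-operation costs that the model ignores.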
>>>> 
>>>>> On Tuesday, August 16, 2016 at 7:01:10 PM UTC-4, l.garulli wrote:
>>>>> It looks like you're not using the index from the Graph API. Look at the 
>>>>> documentation:
>>>>> 
>>>>> http://orientdb.com/docs/last/Performance-Tuning-Graph.html#use-indexes-to-lookup-vertices-by-an-id
>>>>> 
>>>>> If it's not clear, please write here again, we will help you on this ;-)
>>>>> 
>>>>> Best Regards,
>>>>> 
>>>>> Luca Garulli
>>>>> Founder & CEO
>>>>> OrientDB LTD
>>>>> 
>>>>> 
>>>>>> On 16 August 2016 at 17:26, John J. Szucs <[email protected]> wrote:
>>>>>> In my OrientDB-based application, I need to do an INSERT-IF-NOT-EXISTS 
>>>>>> operation using the Java (TinkerPop) API.
>>>>>> 
>>>>>> I have created a vertex type "Identifier." It has a single property, 
>>>>>> "identifier," which contains a URI (effectively a String for purposes of 
>>>>>> this discussion).
>>>>>> 
>>>>>> I have also created an index like this:
>>>>>> 
>>>>>> ParametersBuilder builder=new ParametersBuilder(); 
>>>>>> builder.add("class", "Identifier"); 
>>>>>> builder.add("type", "UNIQUE_HASH_INDEX");
>>>>>> graph.createKeyIndex("identifier", Vertex.class, builder.build());
>>>>>> 
>>>>>> Then, I perform the INSERT-IF-NOT-EXISTS operation in a loop like this. 
>>>>>> This snippet is using the Google Guava libraries and is obviously a 
>>>>>> simplification of our real application:
>>>>>> 
>>>>>> int n=10000;
>>>>>> for (int i=0; i<n; i++)
>>>>>> {
>>>>>>     // int is a primitive, so concatenate it rather than calling i.toString()
>>>>>>     String myUriStr="http://example.org/"+i;
>>>>>>     Iterable<Vertex> vertices=graph.getVertices("identifier", myUriStr);
>>>>>>     // Pass a default so an empty result yields null instead of throwing
>>>>>>     Vertex vertex=Iterables.getOnlyElement(vertices, null);
>>>>>>     if (null==vertex)
>>>>>>     {
>>>>>>         // Create vertex
>>>>>>         ...
>>>>>>     }
>>>>>>     // Use vertex
>>>>>>     ...
>>>>>> }
>>>>>> 
>>>>>> What I am seeing is that the throughput of this loop rapidly diminishes 
>>>>>> as more vertices are added, like this (with the throughput relative to 
>>>>>> the n=1,000 baseline):
>>>>>> 
>>>>>> n=1,000 throughput=100%
>>>>>> n=2,000 throughput=58.8%
>>>>>> n=5,000 throughput=29.7%
>>>>>> n=10,000 throughput=16.5%
>>>>>> 
>>>>>> This obviously suggests that indexing is not working, so I tried a SQL 
>>>>>> EXPLAIN command.
>>>>>> 
>>>>>> explain select from identifier where identifier='http://example.org/1'
>>>>>> documentReads=1
>>>>>> fullySortedByIndex=false
>>>>>> documentAnalyzedCompatibleClass=1
>>>>>> recordReads=1
>>>>>> fetchingFromTargetElapsed=0
>>>>>> indexIsUsedInOrderBy=false
>>>>>> compositeIndexUsed=1
>>>>>> current=Identifier#153:0{identifier:http://example.org/1,out_id:[size=1]} v2
>>>>>> involvedIndexes=[Identifier.identifier]
>>>>>> limit=-1
>>>>>> evaluated=1
>>>>>> user=#5:0
>>>>>> elapsed=2.387001
>>>>>> resultType=collection
>>>>>> resultSize=1 
>>>>>>  
>>>>>> The documentation at http://orientdb.com/docs/master/SQL-Explain.html 
>>>>>> does not seem to be 100% current on how to interpret the output of the 
>>>>>> EXPLAIN command, but my interpretation is that the query did recognize 
>>>>>> and use the index that I created.
>>>>>> 
>>>>>> I also tried some profiling (with JProfiler) and see a hot spot at 
>>>>>> com.tinkerpop.blueprints.impls.orient.OrientElementIterator.hasNext.
>>>>>> 
>>>>>> All of this is with OrientDB running in embedded mode, on a fairly 
>>>>>> high-end Linux machine and with a fresh, empty database at the beginning 
>>>>>> of each test.
>>>>>> 
>>>>>> I have to believe I am doing something wrong to see such a rapid 
>>>>>> drop-off in query performance under such relatively small data volumes.
>>>>>> 
>>>>>> I have been struggling with this for several days off-and-on now and 
>>>>>> it's time to ask for help. Has anyone else encountered a similar issue? 
>>>>>> What can I do to address this?
>>>>>> 
>>>>>> Thanks in advance!
>>>>>> 
>>>>>> -- John
>>>>>> 
>>>>>> -- 
>>>>>> 
>>>>>> --- 
>>>>>> You received this message because you are subscribed to the Google 
>>>>>> Groups "OrientDB" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>>> an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>> 
>> 
> 
