Re: [orientdb] Indexing and Queries with Java API

Luca Garulli Wed, 17 Aug 2016 20:56:50 -0700

Hi John,

Happy to help. Yes, please, could you open a new issue for the
documentation?


Best Regards,

Luca Garulli
Founder & CEO
OrientDB LTD <http://orientdb.com/>

Want to share your opinion about OrientDB?
Rate & review us at Gartner's Software Review
<https://www.gartner.com/reviews/survey/home>


On 17 August 2016 at 07:45, John J. Szucs <[email protected]> wrote:

> Luca,
>
> I just tried this. The only change was:
>
> Iterable<Vertex> vertices=graph.getVertices("identifier", myUriStr);
>
> to:
>
> Iterable<Vertex> vertices=graph.getVertices("Identifier.identifier", myUriStr);
>
>
> The results speak for themselves:
>
> Created 10000 entities in 00:02:05.755, 79.52 per second
>
>
> This is the kind of performance I was expecting!
>
> Thank you!!!
>
> I will note that this was a very subtle change. Essentially, it seems that
> for the graph API's getVertices() method to use the indices, the property
> names have to be qualified with the vertex type name. Would you like for me
> to add an issue on GitHub to improve the documentation around this?
>
> Thanks again!
>
> -- John
>
> On Tuesday, August 16, 2016 at 7:01:10 PM UTC-4, l.garulli wrote:
>>
>> It looks like you're not using the index from the Graph API. Look at the
>> documentation:
>>
>> http://orientdb.com/docs/last/Performance-Tuning-Graph.html#use-indexes-to-lookup-vertices-by-an-id
>>
>> If it's not clear, please write here again, we will help you on this ;-)
>>
>> Best Regards,
>>
>> Luca Garulli
>> Founder & CEO
>> OrientDB LTD <http://orientdb.com/>
>>
>> On 16 August 2016 at 17:26, John J. Szucs <[email protected]> wrote:
>>
>>> In my OrientDB-based application, I need to do an INSERT-IF-NOT-EXISTS
>>> operation using the Java (TinkerPop) API.
>>>
>>> I have created a vertex type "Identifier." It has a single property,
>>> "identifier," which contains a URI (effectively a String for purposes of
>>> this discussion).
>>>
>>> I have also created an index like this:
>>>
>>> ParametersBuilder builder=new ParametersBuilder();
>>> builder.add("class", "Identifier");
>>> builder.add("type", "UNIQUE_HASH_INDEX");
>>> graph.createKeyIndex("identifier", Vertex.class, builder.build());
>>>
>>> Then, I perform the INSERT-IF-NOT-EXISTS operation in a loop like this.
>>> This snippet is using the Google Guava libraries and is obviously a
>>> simplification of our real application:
>>>
>>> int n=10000;
>>> for (int i=0; i<n; i++)
>>> {
>>>     // String concatenation converts the primitive int; i.toString() does not compile
>>>     String myUriStr="http://example.org/"+i;
>>>     Iterable<Vertex> vertices=graph.getVertices("identifier", myUriStr);
>>>     // The default-value overload returns null when no vertex matches
>>>     // (the one-argument form throws NoSuchElementException on an empty result)
>>>     Vertex vertex=Iterables.getOnlyElement(vertices, null);
>>>     if (null==vertex)
>>>     {
>>>         // Create vertex
>>>         ...
>>>     }
>>>     // Use vertex
>>>     ...
>>> }
>>>
>>> What I am seeing is that the throughput of this loop rapidly diminishes
>>> as more vertices are added, like this (with the throughput relative to the
>>> n=1,000 baseline):
>>>
>>>
>>> n=1,000   throughput=100%
>>> n=2,000   throughput=58.8%
>>> n=5,000   throughput=29.7%
>>> n=10,000  throughput=16.5%
>>>
>>>
>>> This obviously suggests that indexing is not working, so I tried a SQL
>>> EXPLAIN command.
>>>
>>> explain select from identifier where identifier='http://example.org/1'
>>> documentReads=1
>>> fullySortedByIndex=false
>>> documentAnalyzedCompatibleClass=1
>>> recordReads=1
>>> fetchingFromTargetElapsed=0
>>> indexIsUsedInOrderBy=false
>>> compositeIndexUsed=1
>>> current=Identifier#153:0{identifier:http://example.org/1,out_id:[size=1]} v2
>>> involvedIndexes=[Identifier.identifier]
>>> limit=-1
>>> evaluated=1
>>> user=#5:0
>>> elapsed=2.387001
>>> resultType=collection
>>> resultSize=1
>>>
>>>
>>> The documentation at http://orientdb.com/docs/master/SQL-Explain.html does
>>> not seem to be 100% current on how to interpret the output of the EXPLAIN
>>> command, but my interpretation is that the query did recognize and use the
>>> index that I created.
>>>
>>> I also tried some profiling (with JProfiler) and see a hot spot at
>>> com.tinkerpop.blueprints.impls.orient.OrientElementIterator.hasNext.
>>>
>>> All of this is with OrientDB running in embedded mode, on a fairly
>>> high-end Linux machine and with a fresh, empty database at the beginning of
>>> each test.
>>>
>>> I have to believe I am doing something wrong to see such a rapid
>>> drop-off in query performance under such relatively small data volumes.
>>>
>>> I have been struggling with this for several days off-and-on now and
>>> it's time to ask for help. Has anyone else encountered a similar issue?
>>> What can I do to address this?
>>>
>>> Thanks in advance!
>>>
>>> -- John
>>>
>>> --
>>>
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "OrientDB" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>

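The slowdown reported at the start of this thread is what one would expect if each lookup scans the existing vertices instead of probing the unique hash index. Below is a minimal, library-free sketch of that difference. It does not use OrientDB at all: ScanStore and IndexedStore are hypothetical stand-ins for "lookup without the index" and "lookup through the index", and the counters exist only to make the cost visible.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InsertIfNotExistsSketch {

    /** Stand-in for a vertex of type "Identifier" with a single property. */
    static final class Identifier {
        final String identifier;

        Identifier(String identifier) {
            this.identifier = identifier;
        }
    }

    /** Hypothetical store that finds a record by scanning every existing one. */
    static final class ScanStore {
        final List<Identifier> records = new ArrayList<>();
        long comparisons = 0;

        Identifier getOrCreate(String uri) {
            for (Identifier r : records) {
                comparisons++;
                if (r.identifier.equals(uri)) {
                    return r; // already exists
                }
            }
            Identifier created = new Identifier(uri);
            records.add(created);
            return created;
        }
    }

    /** Hypothetical store that finds a record through a unique hash index. */
    static final class IndexedStore {
        final Map<String, Identifier> index = new HashMap<>();
        long probes = 0;

        Identifier getOrCreate(String uri) {
            probes++;
            // One hash probe: returns the existing record or creates it.
            return index.computeIfAbsent(uri, Identifier::new);
        }
    }

    public static void main(String[] args) {
        int n = 10000;
        ScanStore scan = new ScanStore();
        IndexedStore indexed = new IndexedStore();
        for (int i = 0; i < n; i++) {
            String uri = "http://example.org/" + i;
            scan.getOrCreate(uri);
            indexed.getOrCreate(uri);
        }
        System.out.println("scan comparisons: " + scan.comparisons);
        System.out.println("index probes:     " + indexed.probes);
    }
}
```

With n distinct URIs, the scanning variant performs n(n-1)/2 comparisons in total (49,995,000 for n=10,000) while the indexed variant performs n probes. Per-insert cost that grows with the number of stored records matches the shape of the throughput drop in the original report.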