Andrey,

Sorry for the delayed response. I had to focus on a milestone, which was 
followed by a holiday weekend here in the US.

The exception is thrown from OIndexUnique.put(Object, OIdentifiable). In 
this particular test run, the existing record ID (in the variable "value" 
in this method) is #85:12. The new record ID (in the variable 
"iSingleValue") is #100:11.

Both of those records have a value of 
https://en.wikipedia.org/wiki/fédération_anarchiste for 
Identifier.identifier.
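
For anyone comparing the two spellings of this identifier that appear in this thread (the decoded form above and the percent-encoded, lower-cased form in the exception message), a small helper like the following may be useful. It is only a sketch: the class and method names are hypothetical, it does not call any OrientDB API, and it assumes that a case-insensitive index behaves roughly like lower-casing the key.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.Locale;

/** Hypothetical diagnostic helper; not part of OrientDB. */
class IdentifierCompare {

    /** Percent-decodes a URI string as UTF-8 (note: URLDecoder also maps '+' to a space). */
    static String decode(String uri) {
        try {
            return URLDecoder.decode(uri, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    /** Assumed canonical form: percent-decoded, then lower-cased. */
    static String canonical(String uri) {
        return decode(uri).toLowerCase(Locale.ROOT);
    }

    /** True if both spellings collapse to the same assumed index key. */
    static boolean sameKey(String a, String b) {
        return canonical(a).equals(canonical(b));
    }

    public static void main(String[] args) {
        String decoded = "https://en.wikipedia.org/wiki/f\u00e9d\u00e9ration_anarchiste";
        String encoded = "https://en.wikipedia.org/wiki/f%c3%a9d%c3%a9ration_anarchiste";
        System.out.println("raw equal:       " + decoded.equals(encoded));
        System.out.println("canonical equal: " + sameKey(decoded, encoded));
    }
}
```

If two raw values differ but share a canonical form, they would collide in a unique index that applies the same folding.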

In this area of the code, there is a statement that checks for a 
"mergeKeys" property in the index's metadata, but the metadata is null when 
this happens.

Looking at the problem from another angle, this problem occurs in the 
context of a fairly large transaction. As you may have gathered, I am 
ingesting data from Wikipedia (or other MediaWiki-based wikis). Each page 
and all of its links (specifically, hyperlinks and narrower/broader 
category links) are processed in a single transaction.

Often (especially early in an import, for obvious reasons) those links 
refer to other pages which I have not yet ingested, so I create a stub 
Identifier vertex for them. In this particular example case, I have created 
such a stub Identifier vertex for the URI in question *in the scope of a 
still-pending transaction*.
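
To make the stub-creation pattern concrete, here is a minimal in-memory sketch of a "get or create" lookup scoped to one transaction. It is not OrientDB code: the StubRegistry class, the "stub-N" handles, and the fold() method are hypothetical stand-ins, and fold() assumes that the case-insensitive index compares keys by simple lower-casing.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for tracking stub Identifier vertices created in one transaction. */
class StubRegistry {
    // Folded identifier -> placeholder handle for the stub created in this transaction.
    private final Map<String, String> pending = new HashMap<>();
    private int nextId = 0;

    /** Folds an identifier the way a case-insensitive index is assumed to (lower-casing). */
    static String fold(String identifier) {
        return identifier.toLowerCase(Locale.ROOT);
    }

    /** Returns the pending stub for this identifier, creating one only if none exists yet. */
    String getOrCreateStub(String identifier) {
        return pending.computeIfAbsent(fold(identifier), key -> "stub-" + (nextId++));
    }

    /** Number of distinct stubs pending in this transaction. */
    int size() {
        return pending.size();
    }

    public static void main(String[] args) {
        StubRegistry tx = new StubRegistry();
        String first = tx.getOrCreateStub("https://en.wikipedia.org/wiki/F%C3%A9d%C3%A9ration_anarchiste");
        String second = tx.getOrCreateStub("https://en.wikipedia.org/wiki/f%c3%a9d%c3%a9ration_anarchiste");
        System.out.println(first.equals(second) ? "one stub reused" : "two stubs created");
    }
}
```

The point of keying the map by the folded value is that a lookup made with the raw value could miss a stub that the index would later treat as a duplicate.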

The stack trace also seems to suggest that this may be related to the 
transaction context because what the app is actually trying to do is just 
create an edge between two vertices that, as far as the app is concerned, 
already exist. Looking at the stack trace below, though, you can see that 
this makes OrientDB try to commit a pending index transaction that, for 
some reason, duplicates an existing index entry.

com.orientechnologies.orient.core.storage.ORecordDuplicatedException: Cannot index record #100:11: found duplicated key 'https://en.wikipedia.org/wiki/f%c3%a9d%c3%a9ration_anarchiste' in index 'Identifier.identifier' previously assigned to the record #85:12
DB name="kb" INDEX=Identifier.identifier RID=#85:12
    at com.orientechnologies.orient.core.index.OIndexUnique.put(OIndexUnique.java:64)
    at com.orientechnologies.orient.core.index.OIndexUnique.put(OIndexUnique.java:34)
    at com.orientechnologies.orient.core.index.OIndexAbstract.putInSnapshot(OIndexAbstract.java:930)
    at com.orientechnologies.orient.core.index.OIndexAbstract.applyIndexTxEntry(OIndexAbstract.java:762)
    at com.orientechnologies.orient.core.index.OIndexAbstract.addTxOperation(OIndexAbstract.java:735)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commitIndexes(OAbstractPaginatedStorage.java:1499)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commit(OAbstractPaginatedStorage.java:1464)
    at com.orientechnologies.orient.core.tx.OTransactionOptimistic.doCommit(OTransactionOptimistic.java:566)
    at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:106)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2733)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.executeOutsideTx(OrientBaseGraph.java:1770)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1434)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1385)
    at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1360)
    at com.tinkerpop.blueprints.impls.orient.OrientGraph.addEdgeInternal(OrientGraph.java:318)
    at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:717)
    at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:656)
    ... <my code> ...

BTW, I upgraded to OrientDB 2.2.9 before running the latest tests, just to 
be current. There is no change in the behavior regarding this issue.

Is this input helpful? Do you have any further insight into a fix or 
workaround?

-- John

On Thursday, September 1, 2016 at 11:34:27 AM UTC-4, John J. Szucs wrote:
>
> Andrey,
>
> I am up against a deadline today (using my case-folding workaround for 
> now) and time zone differences are working against us. I will get back to 
> you with the results of this test tomorrow or over the weekend.
>
> Thanks for your patience!
>
> -- John
>
> On Thursday, September 1, 2016 at 5:41:42 AM UTC-4, Andrey Lomakin wrote:
>>
>> Hi John,
>>
>> Strange issue. 
>> Could you do the following:
>>
>> 1. Get the source code of the database.
>> 2. Set a breakpoint on ORecordDuplicatedException and check the values of 
>> the new and existing records when the exception is about to be thrown.
>>
>> WDYT ?
>>
>>
>> On Wed, Aug 31, 2016 at 6:39 PM John J. Szucs <[email protected]> 
>> wrote:
>>
>>> Andrey,
>>>
>>> Thanks for responding.
>>>
>>> The RID in question changes every time I run this test case. Here are 
>>> some results from my current run. The way my environment is set up, I 
>>> can't really run the OrientDB console or Studio tool, so I wrote a little 
>>> "db" command in my app that allows me to execute SQL commands for 
>>> testing/debugging. You can see this being used below.
>>>
>>> com.orientechnologies.orient.core.storage.ORecordDuplicatedException: 
>>> Cannot index record #100:14: found duplicated key '
>>> https://en.wikipedia.org/wiki/fédération_anarchiste' in index 
>>> 'Identifier.identifier' previously assigned to the record #85:13
>>> ...
>>> db "select * from #85:13"
>>> 0 results. 
>>>
>>>  
>>>
>>> db "select * from #100:14"
>>> 0 results.
>>>
>>>
>>> db "select * from Identifier"
>>> Identifier#81:0{identifier:
>>> https://en.wikipedia.org/wiki/AccessibleComputing,out_id:[size=1]} v2
>>> Identifier#82:0{identifier:
>>> https://en.wikipedia.org/wiki/Computer_accessibility,out_id:[size=1]} v1
>>> Identifier#83:0{identifier:
>>> https://en.wikipedia.org/wiki/Anarchism,out_id:[size=1]} v1
>>> Identifier#84:0{identifier:
>>> https://en.wikipedia.org/wiki/political_philosophy,out_id:[size=1]} v1
>>> Identifier#85:0{identifier:
>>> https://en.wikipedia.org/wiki/AfghanistanHistory,out_id:[size=1]} v1
>>> Identifier#86:0{identifier:
>>> https://en.wikipedia.org/wiki/History_of_Afghanistan,out_id:[size=1]} v1
>>> 6 results.
>>>
>>>
>>> Note that neither #85:13 nor #100:14 appears to have actually been 
>>> committed to the database.
>>>
>>> -- John
>>>
>>> On Wednesday, August 31, 2016 at 5:43:04 AM UTC-4, Andrey Lomakin wrote:
>>>
>>>> Hi John,
>>>>
>>>> Could you send us the content of the record with RID #109:13 (the value 
>>>> of the indexed field will be enough, I think)?
>>>>
>>>> On Tue, Aug 30, 2016 at 7:02 PM John J. Szucs <[email protected]> 
>>>> wrote:
>>>>
>>>>> Thanks for pointing that out. I double-checked the actual code and it is 
>>>>> using the correct "collate"="ci" Parameter pair for the Java API.
>>>>>
>>>>>
>>>>> On Tuesday, August 30, 2016 at 11:54:21 AM UTC-4, 
>>>>> [email protected] wrote:
>>>>>>
>>>>>> Hi,
>>>>>> this will not solve the problem, but I think the correct keyword for 
>>>>>> creating a case-insensitive index is COLLATE:
>>>>>>
>>>>>> CREATE INDEX <name> [ON <class-name> (prop-names [COLLATE <collate>
>>>>>> ])] <type> [<key-type>] [METADATA {JSON Index Metadata Document}]
>>>>>>
>>>>>> Example:
>>>>>>
>>>>>> create index User.name on User (name collate ci) UNIQUE
>>>>>>
>>>>>> Kind regards,
>>>>>> Alessandro
>>>>>>
>>>>> -- 
>>>>>
>>>>> --- 
>>>>> You received this message because you are subscribed to the Google 
>>>>> Groups "OrientDB" group.
>>>>>
>>>>> To unsubscribe from this group and stop receiving emails from it, send 
>>>>> an email to [email protected].
>>>>>
>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>
>>>> -- 
>>>> Best regards,
>>>> Andrey Lomakin, R&D lead. 
>>>> OrientDB Ltd
>>>>
>>>> twitter: @Andrey_Lomakin 
>>>> linkedin: https://ua.linkedin.com/in/andreylomakin
>>>> blogger: http://andreylomakin.blogspot.com/ 
>>>>
>> -- 
>> Best regards,
>> Andrey Lomakin, R&D lead. 
>> OrientDB Ltd
>>
>> twitter: @Andrey_Lomakin 
>> linkedin: https://ua.linkedin.com/in/andreylomakin
>> blogger: http://andreylomakin.blogspot.com/ 
>>
>
