Bump. Any further thoughts or progress on this topic?
The workarounds for this issue are spreading through my code base like an infection and I'd like to be able to cure the disease instead of the symptoms.

-- John

On Tuesday, September 6, 2016 at 2:29:24 PM UTC-4, John J. Szucs wrote:
>
> Andrey,
>
> Sorry for the delayed response. I had to focus on a milestone, which was
> followed by a holiday weekend here in the US.
>
> The exception is thrown from OIndexUnique.put(Object, OIdentifiable). In
> this particular test run, the existing record ID (in the variable "value"
> in this method) is #85:12. The new record ID (in the variable
> "iSingleValue") is #100:11.
>
> Both of those records have a value of
> https://en.wikipedia.org/wiki/fédération_anarchiste
> for Identifier.identifier.
>
> In this area of the code, there is a statement that checks for a
> "mergeKeys" property in the index's metadata, but the metadata is null when
> this happens.
>
> Looking at the problem from another angle, this problem occurs in the
> context of a fairly large transaction. As you may have gathered, I am
> ingesting data from Wikipedia (or other MediaWiki-based wikis). Each page
> and all of its links (specifically, hyperlinks and narrower/broader
> category links) are processed in a single transaction.
>
> Often (especially early in an import, for obvious reasons) those links
> refer to other pages which I have not yet ingested, so I create a stub
> Identifier vertex for them. In this particular example case, I have created
> such a stub Identifier vertex for the URI in question *in the scope of a
> still-pending transaction*.
>
> The stack trace also seems to suggest that this may be related to the
> transaction context because what the app is actually trying to do is just
> create an edge between two vertices that, as far as the app is concerned,
> already exist.
> Looking at the stack trace below, though, you can see that
> this makes OrientDB try to commit a pending index transaction that, for
> some reason, duplicates an existing index entry.
>
> com.orientechnologies.orient.core.storage.ORecordDuplicatedException: Cannot index record #100:11: found duplicated key 'https://en.wikipedia.org/wiki/f%c3%a9d%c3%a9ration_anarchiste' in index 'Identifier.identifier' previously assigned to the record #85:12
>     DB name="kb" INDEX=Identifier.identifier RID=#85:12
>     at com.orientechnologies.orient.core.index.OIndexUnique.put(OIndexUnique.java:64)
>     at com.orientechnologies.orient.core.index.OIndexUnique.put(OIndexUnique.java:34)
>     at com.orientechnologies.orient.core.index.OIndexAbstract.putInSnapshot(OIndexAbstract.java:930)
>     at com.orientechnologies.orient.core.index.OIndexAbstract.applyIndexTxEntry(OIndexAbstract.java:762)
>     at com.orientechnologies.orient.core.index.OIndexAbstract.addTxOperation(OIndexAbstract.java:735)
>     at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commitIndexes(OAbstractPaginatedStorage.java:1499)
>     at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.commit(OAbstractPaginatedStorage.java:1464)
>     at com.orientechnologies.orient.core.tx.OTransactionOptimistic.doCommit(OTransactionOptimistic.java:566)
>     at com.orientechnologies.orient.core.tx.OTransactionOptimistic.commit(OTransactionOptimistic.java:106)
>     at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.commit(ODatabaseDocumentTx.java:2733)
>     at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.executeOutsideTx(OrientBaseGraph.java:1770)
>     at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1434)
>     at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1385)
>     at com.tinkerpop.blueprints.impls.orient.OrientBaseGraph.createEdgeType(OrientBaseGraph.java:1360)
>     at com.tinkerpop.blueprints.impls.orient.OrientGraph.addEdgeInternal(OrientGraph.java:318)
>     at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:717)
>     at com.tinkerpop.blueprints.impls.orient.OrientVertex.addEdge(OrientVertex.java:656)
>     ... <my code> ...
>
> BTW, I upgraded to OrientDB 2.2.9 before running the latest tests, just to
> be current. There is no change in the behavior regarding this issue.
>
> Is this input helpful? Do you have any further insights as to a fix or
> work-around?
>
> -- John
>
> On Thursday, September 1, 2016 at 11:34:27 AM UTC-4, John J. Szucs wrote:
>>
>> Andrey,
>>
>> I am up against a deadline today (using my case-folding work-around for
>> now) and time zone differences are working against us. I will get back to
>> you with the results of this test tomorrow or over the weekend.
>>
>> Thanks for your patience!
>>
>> -- John
>>
>> On Thursday, September 1, 2016 at 5:41:42 AM UTC-4, Andrey Lomakin wrote:
>>>
>>> Hi John,
>>>
>>> Strange issue.
>>> Could you do the following:
>>>
>>> 1. Get the source code of the database.
>>> 2. Set a breakpoint on ORecordDuplicatedException and check the values of
>>> the new and existing records when the exception is about to be thrown.
>>>
>>> WDYT?
>>>
>>>
>>> On Wed, Aug 31, 2016 at 6:39 PM John J. Szucs <[email protected]>
>>> wrote:
>>>
>>>> Andrey,
>>>>
>>>> Thanks for responding.
>>>>
>>>> The RID in question changes every time I run this test case. Here are
>>>> some results from my current run. The way that my environment is set up, I
>>>> can't really run the OrientDB console or Studio tool, so I wrote a little
>>>> "db" command in my app that allows me to execute SQL commands for
>>>> testing/debugging. You can see this being used below.
>>>>
>>>> com.orientechnologies.orient.core.storage.ORecordDuplicatedException: Cannot index record #100:14: found duplicated key 'https://en.wikipedia.org/wiki/fédération_anarchiste' in index 'Identifier.identifier' previously assigned to the record #85:13
>>>> ...
>>>> db "select * from #85:13"
>>>> 0 results.
>>>>
>>>> db "select * from #100:14"
>>>> 0 results.
>>>>
>>>> db "select * from Identifier"
>>>> Identifier#81:0{identifier:https://en.wikipedia.org/wiki/AccessibleComputing,out_id:[size=1]} v2
>>>> Identifier#82:0{identifier:https://en.wikipedia.org/wiki/Computer_accessibility,out_id:[size=1]} v1
>>>> Identifier#83:0{identifier:https://en.wikipedia.org/wiki/Anarchism,out_id:[size=1]} v1
>>>> Identifier#84:0{identifier:https://en.wikipedia.org/wiki/political_philosophy,out_id:[size=1]} v1
>>>> Identifier#85:0{identifier:https://en.wikipedia.org/wiki/AfghanistanHistory,out_id:[size=1]} v1
>>>> Identifier#86:0{identifier:https://en.wikipedia.org/wiki/History_of_Afghanistan,out_id:[size=1]} v1
>>>> 6 results.
>>>>
>>>> Note that neither #85:13 nor #100:14 appears to have actually been
>>>> committed to the database.
>>>>
>>>> -- John
>>>>
>>>> On Wednesday, August 31, 2016 at 5:43:04 AM UTC-4, Andrey Lomakin wrote:
>>>>
>>>>> Hi John,
>>>>>
>>>>> Could you send us content of record with rid #109:13 (value of
>>>>> indexed field will be enough I think) ?
>>>>>
>>>>> On Tue, Aug 30, 2016 at 7:02 PM John J. Szucs <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Thanks for pointing that out. I double-checked the actual code and it
>>>>>> is using the correct "collate"="ci" Parameter pair for the Java API.
>>>>>>
>>>>>>
>>>>>> On Tuesday, August 30, 2016 at 11:54:21 AM UTC-4, [email protected] wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>> this will not solve the problem, but I think the correct way to
>>>>>>> create a case-insensitive index is with COLLATE:
>>>>>>>
>>>>>>> CREATE INDEX <name> [ON <class-name> (prop-names [COLLATE <collate>])] <type> [<key-type>] [METADATA {JSON Index Metadata Document}]
>>>>>>>
>>>>>>> Example:
>>>>>>>
>>>>>>> create index User.name on User (name collate ci) UNIQUE
>>>>>>>
>>>>>>> Kind regards,
>>>>>>> Alessandro
>>>>>>>
>>>>>> --
>>>>>>
>>>>>> ---
>>>>>> You received this message because you are subscribed to the Google
>>>>>> Groups "OrientDB" group.
>>>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>>>> an email to [email protected].
>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>
>>>>> --
>>>>> Best regards,
>>>>> Andrey Lomakin, R&D lead.
>>>>> OrientDB Ltd
>>>>>
>>>>> twitter: @Andrey_Lomakin
>>>>> linkedin: https://ua.linkedin.com/in/andreylomakin
>>>>> blogger: http://andreylomakin.blogspot.com/
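
For anyone landing on this thread: the case-folding work-around mentioned above is not shown in any of the messages, so what follows is only a rough sketch of what such a work-around might look like, under stated assumptions. The idea is to fold each identifier to lower case on the application side and to remember the stubs already created in the still-pending transaction, so that two links differing only in case resolve to the same stub instead of producing a second index entry. StubCache, fold and getOrCreate are hypothetical names invented for this illustration; none of them is part of the OrientDB or Blueprints API, and the sketch does not address the underlying index behavior discussed in the thread.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not an OrientDB class: remembers the stubs created
// inside one pending transaction, keyed by the case-folded identifier.
final class StubCache<V> {
    private final Map<String, V> pending = new HashMap<>();

    // Fold case on the application side so that identifiers differing only
    // in letter case map to the same key (roughly mirroring a "ci" collation).
    static String fold(String identifier) {
        return identifier.toLowerCase(Locale.ROOT);
    }

    // Return the stub already created for this identifier in the current
    // transaction, or build one with the supplied factory. The factory
    // receives the folded identifier, so that is the value that gets stored.
    V getOrCreate(String identifier, Function<String, V> factory) {
        return pending.computeIfAbsent(fold(identifier), factory);
    }

    // Call after commit or rollback: the cache only describes pending work.
    void clear() {
        pending.clear();
    }

    int size() {
        return pending.size();
    }
}
```

In an importer like the one described above, the factory passed to getOrCreate would be whatever code creates the stub Identifier vertex, and clear() would be called once the page's transaction has been committed or rolled back.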
