[orientdb] [2.0.14] ETL : Duplicate created for Edge with Composite Index

Richard Jones Fri, 21 Aug 2015 05:17:24 -0700

Hello,

If I use ETL to import an edge with a compound unique index, the duplicate 
edge is still created, even if the index has metadata:{mergeSameKeys:true} 
and the edge has skipDuplicates:true.


If I run two sequential ETL imports with this data: 
https://gist.github.com/bmcgavin/69be0b2ebbf535fdf3e9 and this import json: 
https://gist.github.com/bmcgavin/177b549a68c7e38e6626 I would expect 
there to be one Link and two Articles. Effectively, the second ETL import 
should be a no-op because the data is already present.

However I get two Articles and two Links, because the duplicate Edge is not 
deleted.

I think the problem is in OIndexUnique.java, lines 69-71:

             if (mergeSameKey != null && mergeSameKey)
                // IGNORE IT, THE EXISTENT KEY HAS BEEN MERGED
                ;

But at this point the Edge has already been created; only the index update 
is skipped. I can't confirm this, because I'm not sure how to get 
the contents of a compound index. SELECT FROM index:Article.articleId WHERE 
key = '12345' works, but I don't know the working version of SELECT FROM 
index:Link.in_out WHERE key = '#11:1_#11:2'. However, SELECT FROM Link 
returns two Edges with matching in and out values, which the unique index 
should not allow.

I have built a version of core with this change:

              if (mergeSameKey != null && mergeSameKey)
                // Need to delete the newly created edge
                iSingleValue.getRecord().delete();

Now, with two ETL runs, I get two Articles and one Link, as expected. 
I'm not sure if deleting the edge is the right thing to do, or if there's a 
way to check the index before creating the Edge?
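
To make the "check the index before creating the Edge" alternative concrete, 
here is a rough sketch of the ordering I have in mind. It is a plain 
in-memory stand-in, not OrientDB code: EdgeDedupSketch, addLink, linkCount 
and the string composite key are all hypothetical names made up for 
illustration, and the real index and edge classes may need a different 
approach.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the Link class and its unique
// (in, out) index. None of these names are OrientDB API.
public class EdgeDedupSketch {
    // Composite key -> position of the stored edge.
    private final Map<String, Integer> inOutIndex = new HashMap<>();
    // Stored edges as {in, out} pairs.
    private final List<String[]> links = new ArrayList<>();

    // Build the composite key from both endpoints.
    private static String key(String in, String out) {
        return in + "|" + out;
    }

    // Look the key up first and only create the edge when it is absent.
    // Returns true if a new edge was stored, false if a duplicate was skipped.
    public boolean addLink(String in, String out) {
        String k = key(in, out);
        if (inOutIndex.containsKey(k)) {
            return false; // existing key: nothing is created, nothing to delete
        }
        links.add(new String[] {in, out});
        inOutIndex.put(k, links.size() - 1);
        return true;
    }

    public int linkCount() {
        return links.size();
    }

    public static void main(String[] args) {
        EdgeDedupSketch db = new EdgeDedupSketch();
        // Two "ETL runs" over the same single row.
        for (int run = 0; run < 2; run++) {
            db.addLink("#11:1", "#11:2");
        }
        System.out.println(db.linkCount()); // prints 1
    }
}
```

The difference from my patch is only the order of operations: the lookup 
happens before anything is written, so there is no orphaned Edge to clean up 
afterwards.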

Thanks,
Rich
