I'm following up on a serious problem that we reported privately. I'm
hoping that others may have seen related behavior.
The critical issue is that we have found what seem to be malformed /
corrupt edges in data that our application has loaded - but no errors
occurred during loading. (We have scanned all of our code to be sure that
we are not accidentally discarding errors.)
We are very concerned because we have just gone live with a site where
users can upload their data and we now believe that their data is at risk -
or may already have been compromised - in unpredictable ways.
*Problem summary:*
In 1.7.9-SNAPSHOT, a small fraction of edges created by our data loader
using the Graph API via a PLOCAL connection can be found and counted by
tools such as "load record" in the console, BUT they are not found by
traversal queries.
We previously reported (what we think was) the same behavior on a REMOTE
connection and then switched to PLOCAL (see Example 2 below). Initially
we thought the problem was resolved by switching to PLOCAL, but on closer
examination there are still error cases, such as Example 1 below.
This discrepancy was difficult to find because the edge counts for the
objects matched our expectations and we assumed the data was correct. But
when queried, the edges were not found and so our exporter application
produced output with missing elements.
Below is an example of a record which, when examined by load record, shows
4 incoming edges of type in_nSupport but a traverse query returns only 2
edges.
Is this a known issue - perhaps related to some of the recent posts?
Thanks,
Dexter
*Example 1:*
orientdb {db=ndex}> load record #22:6402
--------------------------------------------------
ODocument - Class: support id: #22:6402 v.9
--------------------------------------------------
id : 62894
text : In contrast to protein-binding inhibition, ...
out_citeFrom : citation#21:460{id:62623,title:Trends in molecular medicine,idType:URI,identifier:pmid:15350900,authors:[2],in_citations:#26:9,in_citeFrom:[size=27],in_eCitation:[size=82],in_nCitation:[size=6]} v117
in_supports : network#26:9{...} v5984
in_nSupport : [size=4]
in_eSupport : [size=2]
orientdb {db=ndex}> traverse in_nSupport from #22:6402
----+---------+-----+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------------------------------
#   |@RID     |id   |in_suppor|in_eSuppo|out_citeF|out_nSupp|out_nCita|out_ndexP|out_repre|in_networ|in_nSuppo|text
----+---------+-----+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------------------------------
0   |#22:6402 |62894|#26:9    |[size=2] |#21:460  |null     |null     |null     |null     |null     |[#27:1...|In contrast to protein-binding...
1   |#27:12743|62898|null     |null     |null     |[size=2] |[size=2] |#14:42061|#24:14391|#26:9    |null     |null
2   |#27:12744|62900|null     |null     |null     |[size=2] |[size=2] |#14:42062|#24:14392|#26:9    |null     |null
----+---------+-----+---------+---------+---------+---------+---------+---------+---------+---------+---------+---------------------------------
3 item(s) found. Traverse executed in 0.002 sec(s).
*Example 2: (includes code example. Problem is more pervasive with REMOTE
connection)*
We are using the 1.7.9-SNAPSHOT version through remote access. In our
application, all edges are created by the Graph API functions. The code that
creates the edges looks like this:
    ODocument citationDoc = elementIdCache.get(citationId);
    if (citationDoc == null)
        throw new NdexException("Citation Id:" + citationId +
            " was not found in elementIdCache.");
    OrientVertex citationV = graph.getVertex(citationDoc);
    edgeVertex.addEdge("eCitation", citationV);
No error or warning is reported when loading data into the db, but we get
different results when we query the same data from different directions.
For example:
orientdb {db=ndex}> select count(*) from (traverse in_eCitation from #21:6) where @class='edge';
----+-----+-----
# |@RID |count
----+-----+-----
0 |#-2:0|91
----+-----+-----
1 item(s) found. Query executed in 0.042 sec(s).
orientdb {db=ndex}> select count(*) from edge where out_eCitation=#21:6 ;
----+-----+-----
# |@RID |count
----+-----+-----
0 |#-2:0|94
----+-----+-----
1 item(s) found. Query executed in 0.795 sec(s).
From one direction we got 91 edges, but from the other direction we got 94
edges.
I ran our loader on the same data multiple times, starting from an empty
database each time, and the error seems to be consistent.
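
Since the counts alone hid the problem for us, one way to narrow it down is to compare the two directions link by link instead of by count. Below is a minimal sketch in plain Java. It assumes the links are stored directly on both endpoints (out_eCitation on the source, in_eCitation on the target), which is what the two queries above suggest. The class name EdgeSymmetryCheck, the map-based input, and the RIDs in main are hypothetical and not part of OrientDB; filling the maps from the database is left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical helper: reports links that are recorded on only one endpoint. */
public class EdgeSymmetryCheck {

    /**
     * @param outLinks source RID -> target RIDs, as read from the out_ field of each source
     * @param inLinks  target RID -> source RIDs, as read from the in_ field of each target
     * @return "source->target" pairs that appear in one direction but not the other
     */
    public static SortedSet<String> findAsymmetric(Map<String, Set<String>> outLinks,
                                                   Map<String, Set<String>> inLinks) {
        // Every link as seen from the source side.
        SortedSet<String> fromOut = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : outLinks.entrySet()) {
            for (String target : e.getValue()) {
                fromOut.add(e.getKey() + "->" + target);
            }
        }
        // Every link as seen from the target side.
        SortedSet<String> fromIn = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : inLinks.entrySet()) {
            for (String source : e.getValue()) {
                fromIn.add(source + "->" + e.getKey());
            }
        }
        // Symmetric difference: the union minus the links both sides agree on.
        SortedSet<String> both = new TreeSet<>(fromOut);
        both.retainAll(fromIn);
        SortedSet<String> result = new TreeSet<>(fromOut);
        result.addAll(fromIn);
        result.removeAll(both);
        return result;
    }

    public static void main(String[] args) {
        // Made-up RIDs for illustration only: three sources point at #21:6,
        // but the target's in_ list contains only two of them.
        Map<String, Set<String>> outLinks = new HashMap<>();
        outLinks.put("#24:1", new HashSet<>(Arrays.asList("#21:6")));
        outLinks.put("#24:2", new HashSet<>(Arrays.asList("#21:6")));
        outLinks.put("#24:3", new HashSet<>(Arrays.asList("#21:6")));

        Map<String, Set<String>> inLinks = new HashMap<>();
        inLinks.put("#21:6", new HashSet<>(Arrays.asList("#24:1", "#24:2")));

        System.out.println(findAsymmetric(outLinks, inLinks));
    }
}
```

Listing the individual mismatched pairs, rather than only the 91 vs 94 totals, would show exactly which records to inspect with load record.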