Hi,
We are building a contact management application. Each contact is a node.
If two or more contacts turn out to be duplicates, we want to provide
the ability to merge them into a single node. Additionally, we want to
keep the pre-merge node states, so that we can undo the merge if
required (*).
We propose to model this by creating a new node, linking each old node
to it with a "merged_into" edge, and setting a status property on the old
nodes to "removed".
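To make the proposal concrete, here is a minimal in-memory sketch in Python. It is not OrientDB code: the Graph and Node classes and the "active" status value are hypothetical names used only for illustration. The unmerge check follows the restriction in the footnote (*).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    id: str
    status: str = "active"  # "active" is an assumed default; the post only names "removed"
    props: dict = field(default_factory=dict)


class Graph:
    """Hypothetical in-memory stand-in for the graph database."""

    def __init__(self):
        self.nodes = {}     # node id -> Node
        self.edges = set()  # (source id, label, target id)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = Node(node_id, props=props)
        return self.nodes[node_id]

    def merge(self, new_id, old_ids, **props):
        # Create the merged node, then mark each old node and link it forward.
        self.add_node(new_id, **props)
        for old_id in old_ids:
            self.nodes[old_id].status = "removed"
            self.edges.add((old_id, "merged_into", new_id))

    def unmerge(self, new_id):
        incoming = {e for e in self.edges
                    if e[1] == "merged_into" and e[2] == new_id}
        others = {e for e in self.edges if new_id in (e[0], e[2])} - incoming
        if others:
            # Footnote (*): no unmerge once the new node has edges of its own.
            raise ValueError("node has edges of its own; unmerge not allowed")
        for old_id, _, _ in incoming:
            self.nodes[old_id].status = "active"
        self.edges -= incoming
        del self.nodes[new_id]
```

Because the old nodes are never modified apart from the status flag, undoing the merge only needs to drop the new node and its incoming "merged_into" edges.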
That leaves us with two options:
1. We copy all the existing edges from the merged nodes to the new node.
2. We don't, and the edges stay on the old nodes only.
Option 2 gives a simpler data structure, but it makes all our queries
much more complex, because we have to travel back through potentially
multiple levels of merged nodes to fetch all the edges.
Option 1 would keep the queries the same, but will introduce a lot of extra
edges.
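As a rough illustration of what Option 1 costs, here is a small Python sketch that copies edges onto the new node. Edges are plain (source, label, target) tuples, and the function name copy_edges is made up for this example; it is a sketch of the idea, not of how the database would do it.

```python
def copy_edges(edges, old_ids, new_id):
    """Return the edge set plus copies of the old nodes' edges on new_id."""
    old = set(old_ids)
    copies = set()
    for src, label, dst in edges:
        if label == "merged_into":
            continue  # bookkeeping edges are never copied
        if src in old and dst in old:
            continue  # edges between the merged nodes are skipped in this sketch
        if src in old:
            copies.add((new_id, label, dst))
        elif dst in old:
            copies.add((src, label, new_id))
    return set(edges) | copies
```

Every further merge repeats this copying for the already-merged node, so the number of extra edges grows with the depth of the merge chain.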
We are also considering a third option: creating a copy of the full
database with all the merged nodes collapsed, i.e. just a view of the
current contacts. This would need to be kept in sync with the main database.
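A sketch of how such a collapsed view could be derived, again in plain Python with (source, label, target) tuples. The names resolve and collapsed_view are hypothetical, and the sketch assumes each node is merged into at most one other node.

```python
def resolve(node_id, merged_into):
    """Follow merged_into links forward to the node that is current today."""
    seen = set()
    while node_id in merged_into:
        if node_id in seen:
            raise ValueError("cycle in merged_into links")
        seen.add(node_id)
        node_id = merged_into[node_id]
    return node_id


def collapsed_view(edges):
    """Re-point every ordinary edge at the current node of each endpoint."""
    merged_into = {src: dst for src, label, dst in edges
                   if label == "merged_into"}
    return {
        (resolve(src, merged_into), label, resolve(dst, merged_into))
        for src, label, dst in edges
        if label != "merged_into"
    }
```

Rebuilding the whole view like this is the simplest way to keep it in sync; an incremental update after each merge would be the obvious refinement.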
We would appreciate any advice or suggestions on the best way to handle this.
I'd also like to suggest a new "collapse" query feature, which would make
Option 2 easier to work with. Something like this:
select out("attended_class") collapse("merged_into") from #10:12
This would collapse the specified edges until there are no further
outbound "merged_into" edges, and thus retrieve all the edges attached to
the previous (pre-merge) nodes.
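The intended semantics can be sketched in Python as follows. This is only a model of the proposed feature, not an existing OrientDB function; collapse_out is a made-up name. It assumes "merged_into" edges point from the old node to the new one, so starting from the current node the walk goes against the edge direction.

```python
from collections import deque


def collapse_out(start, edges, out_label, collapse_label="merged_into"):
    """Targets of out_label edges from start and all nodes merged into it."""
    # Index the collapse edges by target so we can walk back to old nodes.
    merged_from = {}
    for src, label, dst in edges:
        if label == collapse_label:
            merged_from.setdefault(dst, []).append(src)

    # Breadth-first walk over every level of earlier merges.
    group = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for old in merged_from.get(node, []):
            if old not in group:
                group.add(old)
                queue.append(old)

    return {dst for src, label, dst in edges
            if label == out_label and src in group}
```

With this, the query above would correspond to collapse_out(start, edges, "attended_class"), where start is the record the query selects from.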
* To keep things *simple* we won't allow the unmerge operation after any
edges have been defined on the new node.
Kind Regards
Swami Kevala