Hi,

We are building a contact management application. Each contact is a node. 
If 2 or more contacts are discovered to be duplicates we want to provide 
the ability to merge them into a single node. Additionally, we want to 
maintain the pre-merge node states, so that we can undo the merge if 
required (*).

We propose to model this by creating a new node, linking each old node to 
it with a "merged_into" edge, and setting a status property on the old 
nodes to "removed".
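
To make the proposed model concrete, here is a minimal in-memory sketch in Python. It is not OrientDB code: the Graph class, the merge() helper and the "active" status value are assumptions made up for illustration, and only the "merged_into" label and the "removed" status come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy graph (hypothetical, not an OrientDB API)."""
    props: dict = field(default_factory=dict)  # node id -> property dict
    edges: set = field(default_factory=set)    # (source, label, target) triples

    def add_node(self, node_id, **props):
        # "active" is an assumed default status for live contacts.
        self.props[node_id] = {"status": "active", **props}

    def add_edge(self, src, label, dst):
        self.edges.add((src, label, dst))


def merge(graph, old_ids, new_id, **props):
    """Create new_id, link every old node to it, and mark the old nodes removed."""
    graph.add_node(new_id, **props)
    for old in old_ids:
        graph.add_edge(old, "merged_into", new_id)
        graph.props[old]["status"] = "removed"
    return new_id
```

The old nodes keep all their own properties and edges, which is what would make an undo possible later.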

Now we have two options:

1. We copy all the existing edges from the merged nodes to the new node.

2. We don't.

Option 2 gives a simpler data structure, but it makes all our queries 
much more complex, because we have to travel back through potentially 
multiple levels of merged nodes to fetch all the edges.

Option 1 would keep the queries the same, but will introduce a lot of extra 
edges.
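
As a rough illustration of what Option 1 costs, this Python sketch copies every edge that touches an old node onto the new node. Edges are plain (source, label, target) tuples and copy_edges() is a hypothetical helper, not an OrientDB function. The originals are kept so that the pre-merge state survives.

```python
def copy_edges(edges, old_ids, new_id, skip_label="merged_into"):
    """Return the original edges plus copies re-pointed at new_id.

    Every edge that touches an old node is duplicated with the old
    endpoint replaced by new_id. The merge bookkeeping edges
    (skip_label) are not copied.
    """
    old = set(old_ids)
    copies = set()
    for src, label, dst in edges:
        if label == skip_label:
            continue
        if src in old or dst in old:
            copies.add((
                new_id if src in old else src,
                label,
                new_id if dst in old else dst,
            ))
    return set(edges) | copies
```

Each merge adds up to one extra edge per edge on the old nodes, and the cost repeats at every further merge level. Note that in this sketch an edge between two of the merged nodes becomes a self-loop on the new node.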

We are also considering a third option: creating a copy of the full 
database with all the merged nodes collapsed, i.e. just a view of the 
current contacts. This would need to be kept in sync with the main database.

Would appreciate any advice/suggestions on the best way to handle this.

I'd also like to suggest a new "collapse" query feature, which would enable 
Option 2 to work more easily.... something like this:

select out("attended_class") collapse("merged_into") from #10:12

which would collapse the specified edges until there are no further 
outbound "merged_into" edges, and thus retrieve all the edges attached to 
the previous (pre-merged) nodes.
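
To pin down the semantics I have in mind, here is a small Python sketch of the collapse step over an in-memory edge list. It assumes the query starts at the merged node and walks "merged_into" edges backwards through every merge level. collapse_out() and the tuple layout are made up for illustration and are not an existing OrientDB feature.

```python
from collections import deque


def collapse_out(edges, start, label, collapse_label="merged_into"):
    """Targets of `label` edges from start and all nodes merged into it."""
    # Index the merge edges: target node -> nodes that were merged into it.
    merged_from = {}
    for src, lbl, dst in edges:
        if lbl == collapse_label:
            merged_from.setdefault(dst, []).append(src)

    # Breadth-first walk back through every level of merged nodes.
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for prev in merged_from.get(node, []):
            if prev not in seen:
                seen.add(prev)
                queue.append(prev)

    # Outgoing `label` edges of the start node and all of its ancestors.
    return {dst for src, lbl, dst in edges if src in seen and lbl == label}
```

For a node built from two levels of merges, a call such as collapse_out(edges, "e", "attended_class") would return the classes attended by "e" itself and by every contact that was merged into it, directly or indirectly.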


* To keep things *simple*, we won't allow the unmerge operation after any 
edges have been defined on the new node.
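
The unmerge restriction could be sketched as follows, again in plain Python rather than OrientDB. It assumes Option 2 (no copied edges), so the only edges on the new node should be its inbound "merged_into" edges. unmerge() and the "active" status value are hypothetical.

```python
def unmerge(props, edges, new_id):
    """Undo a merge, but only if new_id has no edges of its own.

    props maps node id -> property dict and edges is a set of
    (source, label, target) tuples. Both are modified in place.
    """
    links = {e for e in edges if e[1] == "merged_into" and e[2] == new_id}
    own = [e for e in edges if new_id in (e[0], e[2]) and e not in links]
    if own:
        raise ValueError(f"{new_id} already has edges of its own; cannot unmerge")

    restored = [src for src, _, _ in links]
    for old in restored:
        props[old]["status"] = "active"  # assumed status for live contacts
    edges.difference_update(links)       # drop the merge bookkeeping edges
    del props[new_id]                    # the merged node no longer exists
    return restored
```

A node that has itself been merged into another node is also refused here, because its outgoing "merged_into" edge counts as an edge of its own.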

Kind Regards

Swami Kevala
