Hello,

Apologies, this is quite a long problem description.

I've got a two-node distributed setup. My database already has data (imported 
offline through ETL and synced to the other node when ODB is started), but 
I need to import daily updates (deltas) to that data, and I can't get any 
approach to work. Below are the ways I've tried to import a delta of 273 
changes (each of which updates multiple vertices and edges). The import is a 
bit messy: fewer than 100 changes actually need to be made, and the rest are 
duplicates or changes that are made and then undone in sequence. So it's 
quite understandable that there's a transactional problem, but I would 
hope there's a way to overcome it.
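
To illustrate what I mean by the delta being messy, here is a minimal sketch of how it could be reduced to its net changes before import. The (record_id, field, old, new) tuple shape and the name collapse_changes are assumptions made up for this example; they are not the actual ETL input format.

```python
def collapse_changes(changes):
    """Reduce an ordered list of (record_id, field, old, new) tuples
    to the net change per (record_id, field).

    Hypothetical helper: the tuple layout is an assumption for illustration.
    """
    net = {}  # (record_id, field) -> [first old value, latest new value]
    for record_id, field, old, new in changes:
        key = (record_id, field)
        if key in net:
            # Later change to the same field: only the latest value matters.
            net[key][1] = new
        else:
            net[key] = [old, new]
    # Drop entries whose final value equals the starting value (made, then undone).
    return [
        (record_id, field, old, new)
        for (record_id, field), (old, new) in net.items()
        if old != new
    ]


example = [
    ("A", "name", "x", "y"),
    ("A", "name", "x", "y"),             # exact duplicate
    ("B", "status", "open", "closed"),
    ("B", "status", "closed", "open"),   # undoes the previous change
    ("C", "size", 1, 2),
    ("C", "size", 2, 3),                 # sequential updates
]
net_changes = collapse_changes(example)
```

In this made-up example, six raw changes collapse to two net changes (one for A, one for C), which is roughly the kind of reduction I'd expect for my real delta.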

First, I tried an online ETL import to a remote:localhost:2424/database URL 
with writeQuorum 2, tx true and batchCommit 1000 (defaults). I get a 
writeQuorum failure in the ETL output, and the server log on the node that 
I'm not running ETL on shows this:

Error on updating record #11:390 (cluster: com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster@365e3fb7)
java.lang.NullPointerException
    at com.orientechnologies.orient.core.index.sbtreebonsai.local.OSBTreeBonsaiLocal.load(OSBTreeBonsaiLocal.java:455)
    at com.orientechnologies.orient.core.db.record.ridbag.sbtree.OSBTreeCollectionManagerShared.loadTree(OSBTreeCollectionManagerShared.java:95)
    at com.orientechnologies.orient.core.db.record.ridbag.sbtree.OSBTreeCollectionManagerAbstract.loadSBTree(OSBTreeCollectionManagerAbstract.java:98)
    at com.orientechnologies.orient.core.db.record.ridbag.sbtree.OSBTreeCollectionManagerProxy.loadSBTree(OSBTreeCollectionManagerProxy.java:52)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.ORidBagUpdateSerializationOperation.loadTree(ORidBagUpdateSerializationOperation.java:73)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.ORidBagUpdateSerializationOperation.execute(ORidBagUpdateSerializationOperation.java:54)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.ORecordSerializationContext.executeOperations(ORecordSerializationContext.java:99)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doUpdateRecord(OAbstractPaginatedStorage.java:1567)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.updateRecord(OAbstractPaginatedStorage.java:728)
    at com.orientechnologies.orient.server.distributed.ODistributedStorage.updateRecord(ODistributedStorage.java:599)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeSaveRecord(ODatabaseDocumentTx.java:1757)
    at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveRecord(OTransactionNoTx.java:102)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2316)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2150)
    at com.orientechnologies.orient.server.distributed.task.OUpdateRecordTask.execute(OUpdateRecordTask.java:95)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.executeOnLocalNode(OHazelcastPlugin.java:753)
    at com.orientechnologies.orient.server.hazelcast.ODistributedWorker.onMessage(ODistributedWorker.java:298)
    at com.orientechnologies.orient.server.hazelcast.ODistributedWorker.run(ODistributedWorker.java:121)

This appears to be trying to get a page out of the cache. Any suggestions as 
to what might be the problem here? I was going to set the embedded/OSB 
threshold very high to force all edges to be embedded, but according 
to the various warnings 
(e.g. 
http://orientdb.com/docs/last/Concurrency.html#concurrency-on-adding-edges) 
edges are always embedded in distributed mode, so this would have no 
effect. That makes me wonder why the OSBTreeBonsai code is being executed 
at all; perhaps I misunderstand what it's used for. Or maybe it's because I 
imported the initial dataset offline, so the edges were created as OSBTree 
before the database was somehow flagged as distributed? This may be the 
case, because if I import the delta into an empty database (so there has 
been no offline ETL import), I get a different exception 
(OWOWCache.java:865, see below).

The end result of this import is that I have one vertex (#11:390, mentioned 
in the error above) with different versions (out by one) on each node, so 
the database is unusable and I don't know how to fix it. Before trying each 
of the import methods described here, I reset the DB to its state after the 
offline import.

I then tried shutting down one ODB server and performing an offline ETL 
import of the delta. This works fine, but when I start the server up with 
the new data, it receives a copy of the (old) data from the other node, 
presumably because the offline node is rejoining an active cluster and so 
needs to get new data? NB: no inserts/updates (or even selects) are 
happening other than my ETL imports, so the node that stayed online has not 
received any changes. How can I make the node that received the offline ETL 
import the 'master', so that it syncs its database to the other node?

One way I've managed to import the data is to turn off one node, run the 
online ETL on the remaining node, then bring the other node back online. 
This causes the node that was offline to receive a copy of the updated 
database. But this isn't ideal, as it's difficult to orchestrate and 
automate, and it means I'll only have one DB node running (which will be 
busy) for the period of the import.

If I try to import with writeQuorum 1, the import completes, but on the 
node I'm not running ETL on, the log contains many errors like the stack 
trace above, and also this:

Error on updating record #55:10 (cluster: com.orientechnologies.orient.core.storage.impl.local.paginated.OPaginatedCluster@6d66a58c)
java.lang.NullPointerException
    at com.orientechnologies.orient.core.index.hashindex.local.cache.OWOWCache.fileNameById(OWOWCache.java:865)
    at com.orientechnologies.orient.core.index.hashindex.local.cache.OReadWriteDiskCache.fileNameById(OReadWriteDiskCache.java:236)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.atomicoperations.OAtomicOperation.fileNameById(OAtomicOperation.java:281)
    at com.orientechnologies.orient.core.db.record.ridbag.sbtree.OSBTreeCollectionManagerShared.loadTree(OSBTreeCollectionManagerShared.java:89)
    at com.orientechnologies.orient.core.db.record.ridbag.sbtree.OSBTreeCollectionManagerAbstract.loadSBTree(OSBTreeCollectionManagerAbstract.java:98)
    at com.orientechnologies.orient.core.db.record.ridbag.sbtree.OSBTreeCollectionManagerProxy.loadSBTree(OSBTreeCollectionManagerProxy.java:52)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.ORidBagUpdateSerializationOperation.loadTree(ORidBagUpdateSerializationOperation.java:73)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.ORidBagUpdateSerializationOperation.execute(ORidBagUpdateSerializationOperation.java:54)
    at com.orientechnologies.orient.core.storage.impl.local.paginated.ORecordSerializationContext.executeOperations(ORecordSerializationContext.java:99)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.doUpdateRecord(OAbstractPaginatedStorage.java:1567)
    at com.orientechnologies.orient.core.storage.impl.local.OAbstractPaginatedStorage.updateRecord(OAbstractPaginatedStorage.java:728)
    at com.orientechnologies.orient.server.distributed.ODistributedStorage.updateRecord(ODistributedStorage.java:599)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.executeSaveRecord(ODatabaseDocumentTx.java:1757)
    at com.orientechnologies.orient.core.tx.OTransactionNoTx.saveRecord(OTransactionNoTx.java:102)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2316)
    at com.orientechnologies.orient.core.db.document.ODatabaseDocumentTx.save(ODatabaseDocumentTx.java:2150)
    at com.orientechnologies.orient.server.distributed.task.OUpdateRecordTask.execute(OUpdateRecordTask.java:95)
    at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.executeOnLocalNode(OHazelcastPlugin.java:753)
    at com.orientechnologies.orient.server.hazelcast.ODistributedWorker.onMessage(ODistributedWorker.java:298)
    at com.orientechnologies.orient.server.hazelcast.ODistributedWorker.run(ODistributedWorker.java:121)

The database is then (understandably) completely out of sync and unusable.

I also get the OSBTreeBonsaiLocal.java:455 stack trace if I have a database 
with writeQuorum 2, and an ETL import with tx true and batchCommit 1, which 
I would expect to be slow but safe.

I'm a bit stuck at the moment. Unless anyone has suggestions on the 
NullPointerExceptions, or on how to let the node I ran the offline ETL 
import on sync its data to the other node, I can't really see a 
way past this problem. Is ETL the wrong tool for this problem? Should I be 
batching up inserts and running them via another client API?
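
As a rough sketch of the batching idea, independent of any particular client library: the helper below splits a list of changes into fixed-size groups and hands each group to a caller-supplied function. Both apply_in_batches and the apply_batch callback are hypothetical names; apply_batch stands in for whatever client call would commit one group, and is not an OrientDB API.

```python
def batches(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def apply_in_batches(changes, size, apply_batch):
    """Pass each batch to `apply_batch` (a hypothetical commit callback)
    and return the number of changes handed over."""
    applied = 0
    for batch in batches(changes, size):
        apply_batch(batch)  # assumption: commits one group as a unit
        applied += len(batch)
    return applied


# Usage with a stand-in callback that just records what it was given.
seen = []
total = apply_in_batches(list(range(5)), 2, seen.append)
```

With a batch size of 1 this would behave like my batchCommit 1 attempt, just driven from client code instead of ETL.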

I can try to provide a minimal test case if anyone is interested and the 
stack traces are not enough.

Thanks,
Rich
