jaredwinick commented on issue #660:
URL: https://github.com/apache/fluo/issues/660#issuecomment-909401013
I am not sure whether this is exactly the same issue, but we ran into something
that looks similar on Fluo 2.0.0-SNAPSHOT. It occurred on a single test server
that was likely heavily overloaded at the time of the failure. When trying to
scan, we see the following exception:
```
root@fluo-oracle:/# fluo scan -a crucible -p dataset_offsets:Alerts:
dataset_offsets:Alerts:0 offset 21179
dataset_offsets:Alerts:1 offset 21179
dataset_offsets:Alerts:2 offset 21179
dataset_offsets:Alerts:3 offset 21179
Exception in thread "main" java.lang.IllegalStateException: can not abort : record:Alerts:4:000021178 10 143265458 (UNKNOWN)
    at org.apache.fluo.core.impl.LockResolver.resolveLocks(LockResolver.java:201)
    at org.apache.fluo.core.impl.SnapshotScanner$SnapIter.resolveLock(SnapshotScanner.java:184)
    at org.apache.fluo.core.impl.SnapshotScanner$SnapIter.getNext(SnapshotScanner.java:221)
    at org.apache.fluo.core.impl.SnapshotScanner$SnapIter.hasNext(SnapshotScanner.java:131)
    at com.google.common.collect.TransformedIterator.hasNext(TransformedIterator.java:42)
    at org.apache.fluo.core.util.ScanUtil.scan(ScanUtil.java:124)
    at org.apache.fluo.core.util.ScanUtil.scanFluo(ScanUtil.java:152)
    at org.apache.fluo.command.FluoScan.execute(FluoScan.java:109)
    at org.apache.fluo.command.FluoProgram.runFluoCommand(FluoProgram.java:69)
    at org.apache.fluo.command.FluoProgram.main(FluoProgram.java:33)
```
Looking at the raw data, we see a lock that points to the primary:
```
...
dataset_offsets:Alerts:4 :offset [] 113249162-WRITE 113249161
dataset_offsets:Alerts:4 :offset [] 113247916-WRITE 113247915
dataset_offsets:Alerts:4 :offset [] 113246900-WRITE 113246897
dataset_offsets:Alerts:4 :offset [] 85418754-WRITE 85418753
dataset_offsets:Alerts:4 :offset [] 143265458-LOCK record:Alerts:4:000021178 10 WRITE NOT_DELETE NOT_TRIGGER c
dataset_offsets:Alerts:4 :offset [] 143265458-DATA 21178
dataset_offsets:Alerts:4 :offset [] 143265003-DATA 21177
dataset_offsets:Alerts:4 :offset [] 114952314-DATA 21159
...
```
But in this case the primary does seem to exist:
```
record:Alerts:4:000021178 :10 [] 143265458-LOCK record:Alerts:4:000021178 10 WRITE NOT_DELETE NOT_TRIGGER c
```
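For cross-checking locks against their primaries in a larger raw dump, here is a minimal Python sketch. It is not part of Fluo. It assumes each raw entry is printed on one line as `row column [] timestamp-TYPE value`, and that the first two tokens of a LOCK value are the primary row and column, which is inferred only from the output pasted here. The names `parse` and `locks_with_primary_status` are hypothetical.

```python
import re
from typing import Iterable, List, NamedTuple, Optional, Tuple

# Assumed layout, inferred from the pasted dump: "row column [] ts-TYPE value"
ENTRY = re.compile(r"^(\S+) (\S+) \[\] (\d+)-([A-Z_]+)(?: (.*))?$")


class Entry(NamedTuple):
    row: str
    col: str
    ts: int
    kind: str
    value: str


def parse(line: str) -> Optional[Entry]:
    """Parse one raw line; return None for anything that does not match."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    row, col, ts, kind, value = m.groups()
    # The dump prints the column as ":10" while the lock value says "10",
    # so drop the leading colon to make the two comparable.
    return Entry(row, col.lstrip(":"), int(ts), kind, value or "")


def locks_with_primary_status(
    lines: Iterable[str],
) -> List[Tuple[Entry, bool, bool]]:
    """For each LOCK entry, report (entry, is_primary, primary_lock_present)."""
    entries = [e for e in map(parse, lines) if e is not None]
    lock_keys = {(e.row, e.col, e.ts) for e in entries if e.kind == "LOCK"}
    result = []
    for e in entries:
        if e.kind != "LOCK":
            continue
        parts = e.value.split()
        if len(parts) < 2:
            continue  # value does not carry a primary row and column
        primary_row, primary_col = parts[0], parts[1]
        is_primary = (e.row, e.col) == (primary_row, primary_col)
        primary_present = (primary_row, primary_col, e.ts) in lock_keys
        result.append((e, is_primary, primary_present))
    return result
```

Fed the lines from both dumps above, this would report the `dataset_offsets:Alerts:4` lock as a secondary whose primary lock at timestamp 143265458 is present. It only inspects the text dump and does not change anything in the table.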
Are there any recovery tools or processes for cleaning up a situation like
this? Thanks for any advice. cc @wjsl