[ https://issues.apache.org/jira/browse/ACCUMULO-1998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13859766#comment-13859766 ]
Josh Elser commented on ACCUMULO-1998:
--------------------------------------
So you pad the cipher stream to force it to flush, so that you don't lose
data (or have old data reintroduced)? Have you noticed any performance change
from this? I assume the flush() calls in a real system are far enough apart
that it's not terrible, provided the buffers aren't super small?
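
To make the question concrete, here is a minimal sketch of why flush() alone may not push data through a block cipher, and how padding to a block boundary changes that. It is not taken from the attached patches: the class and method names (CipherFlushSketch, visibleAfterFlush), the AES/CBC/NoPadding transformation, and the all-zero key and IV are assumptions chosen only for the demonstration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.CipherOutputStream;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical demo class; not part of Accumulo.
public class CipherFlushSketch {
    // AES block size in bytes.
    static final int BLOCK = 16;

    // Writes payloadLen bytes through a CipherOutputStream, optionally pads
    // with zero bytes up to the next block boundary, calls flush(), and
    // returns how many encrypted bytes reached the underlying sink.
    static int visibleAfterFlush(int payloadLen, boolean padToBlock)
            throws GeneralSecurityException, IOException {
        Cipher cipher = Cipher.getInstance("AES/CBC/NoPadding");
        // All-zero key and IV: acceptable for a demo only, never for real data.
        cipher.init(Cipher.ENCRYPT_MODE,
                new SecretKeySpec(new byte[BLOCK], "AES"),
                new IvParameterSpec(new byte[BLOCK]));

        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        CipherOutputStream out = new CipherOutputStream(sink, cipher);

        out.write(new byte[payloadLen]);
        if (padToBlock) {
            int remainder = payloadLen % BLOCK;
            if (remainder != 0) {
                // Fill the partial block so the cipher can emit it.
                out.write(new byte[BLOCK - remainder]);
            }
        }
        // flush() forwards only what the cipher has already produced;
        // a partial block stays buffered inside the Cipher object.
        out.flush();
        return sink.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("5 bytes, no padding: " + visibleAfterFlush(5, false));
        System.out.println("5 bytes, padded:     " + visibleAfterFlush(5, true));
    }
}
```

In this sketch a partial block never reaches the sink on flush(), while the padded write does. The cost is up to one block of extra bytes per flush, and a reader would need some way to recognize and skip the padding, which the sketch does not attempt to show.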
> Encrypted WALogs seem to be excessively buffering
> -------------------------------------------------
>
> Key: ACCUMULO-1998
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1998
> Project: Accumulo
> Issue Type: Bug
> Reporter: Michael Allen
> Assignee: John Vines
> Priority: Blocker
> Fix For: 1.6.0
>
> Attachments:
> 0001-ACCUMULO-1998-Working-around-the-cipher-s-buffer-by-.patch,
> 0001-ACCUMULO-1998-forcing-Buffered-crypto-stream-to-flus.patch,
> 0002-ACCUMULO-1998-forcing-Buffered-crypto-stream-to-flus.patch,
> 0003-ACCUMULO-1998-forcing-Buffered-crypto-stream-to-flus.patch,
> 0004-ACCUMULO-1998-forcing-Buffered-crypto-stream-to-flus.patch
>
>
> The reproduction steps around this are a little bit fuzzy but basically we
> ran a moderate workload against a 1.6.0 server. Encryption happened to be
> turned on but that doesn't seem to be germane to the problem. After doing a
> moderate amount of work, Accumulo is refusing to start up, spewing this error
> over and over to the log:
> {noformat}
> 2013-12-10 10:23:02,529 [tserver.TabletServer] WARN : exception while doing multi-scan
> java.lang.RuntimeException: java.io.IOException: Failed to open hdfs://10.10.1.115:9000/accumulo/tables/!0/table_info/A000042x.rf
> at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:1125)
> at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> at org.apache.accumulo.trace.instrument.TraceRunnable.run(TraceRunnable.java:47)
> at org.apache.accumulo.core.util.LoggingRunnable.run(LoggingRunnable.java:34)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.IOException: Failed to open hdfs://10.10.1.115:9000/accumulo/tables/!0/table_info/A000042x.rf
> at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:333)
> at org.apache.accumulo.tserver.FileManager.access$500(FileManager.java:58)
> at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:478)
> at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFileRefs(FileManager.java:466)
> at org.apache.accumulo.tserver.FileManager$ScanFileManager.openFiles(FileManager.java:486)
> at org.apache.accumulo.tserver.Tablet$ScanDataSource.createIterator(Tablet.java:2027)
> at org.apache.accumulo.tserver.Tablet$ScanDataSource.iterator(Tablet.java:1989)
> at org.apache.accumulo.core.iterators.system.SourceSwitchingIterator.seek(SourceSwitchingIterator.java:163)
> at org.apache.accumulo.tserver.Tablet.lookup(Tablet.java:1565)
> at org.apache.accumulo.tserver.Tablet.lookup(Tablet.java:1672)
> at org.apache.accumulo.tserver.TabletServer$ThriftClientHandler$LookupTask.run(TabletServer.java:1114)
> ... 6 more
> Caused by: java.io.FileNotFoundException: File does not exist: /accumulo/tables/!0/table_info/A000042x.rf
> at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:2006)
> at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1975)
> at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1967)
> at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:735)
> at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:436)
> at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBCFile(CachableBlockFile.java:256)
> at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.access$000(CachableBlockFile.java:143)
> at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader$MetaBlockLoader.get(CachableBlockFile.java:212)
> at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:313)
> at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:367)
> at org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:143)
> at org.apache.accumulo.core.file.rfile.RFile$Reader.<init>(RFile.java:825)
> at org.apache.accumulo.core.file.rfile.RFileOperations.openReader(RFileOperations.java:79)
> at org.apache.accumulo.core.file.DispatchingFileFactory.openReader(FileOperations.java:119)
> at org.apache.accumulo.tserver.FileManager.reserveReaders(FileManager.java:314)
> ... 16 more
> {noformat}
> Here are some other pieces of context:
> HDFS contents:
> {noformat}
> ubuntu@ip-10-10-1-115:/data0/logs/accumulo$ hadoop fs -lsr /accumulo/tables/
> drwxr-xr-x - accumulo hadoop 0 2013-12-10 00:32 /accumulo/tables/!0
> drwxr-xr-x - accumulo hadoop 0 2013-12-10 01:06 /accumulo/tables/!0/default_tablet
> drwxr-xr-x - accumulo hadoop 0 2013-12-10 10:49 /accumulo/tables/!0/table_info
> -rw-r--r-- 5 accumulo hadoop 1698 2013-12-10 00:34 /accumulo/tables/!0/table_info/F0000000.rf
> -rw-r--r-- 5 accumulo hadoop 43524 2013-12-10 01:53 /accumulo/tables/!0/table_info/F000062q.rf
> drwxr-xr-x - accumulo hadoop 0 2013-12-10 00:32 /accumulo/tables/+r
> drwxr-xr-x - accumulo hadoop 0 2013-12-10 10:45 /accumulo/tables/+r/root_tablet
> -rw-r--r-- 5 accumulo hadoop 2070 2013-12-10 10:45 /accumulo/tables/+r/root_tablet/A0000738.rf
> drwxr-xr-x - accumulo hadoop 0 2013-12-10 00:33 /accumulo/tables/1
> drwxr-xr-x - accumulo hadoop 0 2013-12-10 00:33 /accumulo/tables/1/default_tablet
> {noformat}
> ZooKeeper entries:
> {noformat}
> [zk: localhost:2181(CONNECTED) 6] get /accumulo/371cfa3e-fe96-4a50-92e9-da7572589ffa/root_tablet/dir
> hdfs://10.10.1.115:9000/accumulo/tables/+r/root_tablet
> cZxid = 0x1b
> ctime = Tue Dec 10 00:32:56 EST 2013
> mZxid = 0x1b
> mtime = Tue Dec 10 00:32:56 EST 2013
> pZxid = 0x1b
> cversion = 0
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 54
> numChildren = 0
> {noformat}
> I'm going to preserve the state of this machine in HDFS for a while but not
> forever, so if there are other pieces of context people need, let me know.