I'm not an expert on bloom filters, but I asked a colleague and they think
that what may be happening is that the file is opened for read (to support
a scan, probably), but then the file is closed before the background bloom
filter thread can load the bloom filters to optimize future queries of that
file.

This could happen for a number of reasons. Keep in mind that this is not
an ERROR or WARN message, but a DEBUG one, so it may be safe to ignore;
if it happens frequently, though, it may indicate that there's room for
further system tuning to optimize your use of bloom filters.
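
If you want to see what you're currently running with before tuning
anything, the Accumulo shell's `config` command can filter properties by
name. This is just a sketch of how that might look; the "mytable" table
name below is an assumption about your setup:

    # show the current tserver file-handling and bloom loader settings
    config -f tserver.scan.files.open
    config -f tserver.files.open.idle
    config -f tserver.bloom.load
    # show the table-level bloom settings for your table (replace "mytable")
    config -t mytable -f table.bloom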

Some things you can try (example shell commands follow the list) are:

* Modify `tserver.scan.files.open.max` to increase it so that files don't
get evicted and closed as quickly.
* Modify `tserver.files.open.idle` to increase how long a file may sit
idle after it was last read before it is closed (in case the background
bloom filter threads need more time to load bloom filters, and so the
file is still open the next time it is read).
* Modify `tserver.bloom.load.concurrent.max` to increase the number of
background threads for loading bloom filters (in case they aren't getting
loaded fast enough to be used). Or, set it to 0 to force it to load in the
foreground instead of the background.
* Modify other `table.bloom.*` parameters to make bloom filters smaller
so they load faster, or so they are used more effectively for your
workload and access patterns.
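
For reference, here is roughly what those changes could look like from
the Accumulo shell. The values below are placeholder assumptions rather
than recommendations, and depending on your version some of the
`tserver.*` properties may need to be set in accumulo-site.xml and
require a tserver restart instead of being changed live:

    # let more files stay open before they are evicted and closed
    config -s tserver.scan.files.open.max=200
    # keep idle files open longer before closing them
    config -s tserver.files.open.idle=2m
    # more background bloom-loading threads, or 0 to load in the foreground
    config -s tserver.bloom.load.concurrent.max=8
    # table-level bloom tuning (replace "mytable"; values are examples only)
    # a higher error rate makes a smaller, faster-loading filter
    config -t mytable -s table.bloom.error.rate=1%
    config -t mytable -s table.bloom.size=524288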

Other possibilities might involve changing how big your RFiles are, the
compaction ratio, or other settings that reduce the number of files open
concurrently on the tablet servers; a rough example follows.
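
For example (again just a sketch with arbitrary values;
`table.split.threshold` and `table.compaction.major.ratio` are the usual
per-table knobs for this):

    # larger tablets generally mean fewer files open at once
    config -t mytable -s table.split.threshold=2G
    # a lower ratio makes major compactions more aggressive, keeping
    # file counts down
    config -t mytable -s table.compaction.major.ratio=2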

On Thu, Dec 21, 2017 at 10:34 AM vLex Systems <vlexsyst...@vlex.com> wrote:

> Hi
>
> We've activated the bloom filter on an Accumulo table to see if it
> helped with the CPU usage, and we're seeing these messages in our
> tserver debug log:
>
> 2017-12-20 12:08:28,800 [impl.CachableBlockFile] DEBUG: Error full
> blockRead for file
> hdfs://10.0.32.143:9000/accumulo/tables/6/t-0000013/F0008k42.rf for
> block acu_bloom
> java.io.IOException: Stream is closed!
> at org.apache.hadoop.hdfs.DFSInputStream.seek(DFSInputStream.java:1404)
> at org.apache.hadoop.fs.FSDataInputStream.seek(FSDataInputStream.java:63)
> at
> org.apache.accumulo.core.file.rfile.bcfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:98)
> at
> org.apache.hadoop.io.compress.DecompressorStream.getCompressedData(DecompressorStream.java:159)
> at
> org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:143)
> at
> org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:85)
> at java.io.BufferedInputStream.read1(BufferedInputStream.java:273)
> at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
> at java.io.DataInputStream.readFully(DataInputStream.java:195)
> at java.io.DataInputStream.readFully(DataInputStream.java:169)
> at
> org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.cacheBlock(CachableBlockFile.java:335)
> at
> org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getBlock(CachableBlockFile.java:318)
> at
> org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:368)
> at
> org.apache.accumulo.core.file.blockfile.impl.CachableBlockFile$Reader.getMetaBlock(CachableBlockFile.java:137)
> at
> org.apache.accumulo.core.file.rfile.RFile$Reader.getMetaStore(RFile.java:974)
> at
> org.apache.accumulo.core.file.BloomFilterLayer$BloomFilterLoader$1.run(BloomFilterLayer.java:211)
> at
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at
> org.apache.accumulo.fate.util.LoggingRunnable.run(LoggingRunnable.java:35)
> at java.lang.Thread.run(Thread.java:745)
> 2017-12-20 12:08:28,801 [file.BloomFilterLayer] DEBUG: Can't open
> BloomFilter, file closed : Stream is closed!
>
>
> Does anyone know what these mean or what is causing them?
>
> Thank you.
>
