[jira] [Updated] (ACCUMULO-4669) RFile can create very large blocks when key statistics are not uniform
     [ https://issues.apache.org/jira/browse/ACCUMULO-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ACCUMULO-4669:
-------------------------------------
    Labels: pull-request-available  (was: )

> RFile can create very large blocks when key statistics are not uniform
> -----------------------------------------------------------------------
>
>                 Key: ACCUMULO-4669
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-4669
>             Project: Accumulo
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.7.2, 1.7.3, 1.8.0, 1.8.1
>            Reporter: Adam Fuchs
>            Assignee: Keith Turner
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 1.7.4, 1.8.2, 2.0.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> RFile.Writer.append checks for giant keys and avoids writing them as index
> blocks. This check is flawed and can result in multi-GB blocks. In our case,
> a 20GB compressed RFile had one block with over 2GB raw size. This happened
> because the key size statistics changed partway through the file. The code
> in question follows:
> {code}
> private boolean isGiantKey(Key k) {
>   // consider a key thats more than 3 standard deviations from previously seen key sizes as giant
>   return k.getSize() > keyLenStats.getMean() + keyLenStats.getStandardDeviation() * 3;
> }
> ...
> if (blockWriter == null) {
>   blockWriter = fileWriter.prepareDataBlock();
> } else if (blockWriter.getRawSize() > blockSize) {
>   ...
>   if ((prevKey.getSize() <= avergageKeySize || blockWriter.getRawSize() > maxBlockSize)
>       && !isGiantKey(prevKey)) {
>     closeBlock(prevKey, false);
>   ...
> {code}
> Before closing a block that has grown beyond the target block size, we check
> that the key is below average in size or that the block has grown past
> maxBlockSize (1.1 times the target block size), and we also check that the
> key is not a "giant" key, i.e. more than 3 standard deviations above the
> mean of the key sizes seen so far.
> Our RFiles often have one row of data with different column families
> representing various forward and inverted indexes. This is a table design
> similar to the WikiSearch example. The first column family in this case had
> very uniform, relatively small key sizes. It comprised gigabytes of data,
> split up into roughly 100KB blocks. When we switched to the next column
> family the keys grew in size, but were still under about 100 bytes. The
> statistics of the first column family had firmly established a smaller mean
> and a tiny standard deviation (approximately 0), and it took over 2GB of
> larger keys to bring the standard deviation up enough that keys were no
> longer considered "giant" and the block could be closed.
> Now that we're aware, we see large blocks (more than 10x the target block
> size) in almost every RFile we write. This only became a glaring problem
> when we got OOM exceptions trying to decompress the block, but it also shows
> up in a number of subtler performance problems, like high variance in
> latencies when looking up particular keys.
> The fix for this should produce bounded RFile block sizes, limited to the
> greater of 2x the maximum key/value size in the block and some configurable
> threshold, such as 1.1 times the compressed block size. We need a firm cap
> to be able to reason about memory usage in various applications.
> The following code produces arbitrarily large RFile blocks:
> {code}
> FileSKVWriter writer = RFileOperations.getInstance().openWriter(filename, fs, conf, acuconf);
> writer.startDefaultLocalityGroup();
> SummaryStatistics keyLenStats = new SummaryStatistics();
> Random r = new Random();
> byte[] buffer = new byte[minRowSize];
> for (int i = 0; i < 10; i++) {
>   byte[] valBytes = new byte[valLength];
>   r.nextBytes(valBytes);
>   r.nextBytes(buffer);
>   ByteBuffer.wrap(buffer).putInt(i);
>   Key k = new Key(buffer, 0, buffer.length, emptyBytes, 0, 0, emptyBytes, 0, 0, emptyBytes, 0, 0, 0);
>   Value v = new Value(valBytes);
>   writer.append(k, v);
>   keyLenStats.addValue(k.getSize());
>   int newBufferSize = Math.max(buffer.length,
>       (int) Math.ceil(keyLenStats.getMean() + keyLenStats.getStandardDeviation() * 4 + 0.0001));
>   buffer = new byte[newBufferSize];
>   if (keyLenStats.getSum() > targetSize)
>     break;
> }
> writer.close();
> {code}
> One telltale symptom of this bug is an OutOfMemoryError thrown from a
> readahead thread with the message "Requested array size exceeds VM limit".
> This will only happen if the block cache size is big enough to hold the
> expected raw block size, 2GB in our case. This message is rare, and really
> only happens when allocating an array of size Integer.MAX_VALUE or
> Integer.MAX_VALUE-1 on the hotspot JVM. Integer.MAX_VALUE happens in this
> case due to some strange handling of raw
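To make the failure mode concrete: below is a minimal, self-contained sketch of the quoted isGiantKey() arithmetic, using the same commons-math3 SummaryStatistics class as the reproduction above. The class name, key sizes, and counts are illustrative assumptions, not Accumulo code.

{code}
import org.apache.commons.math3.stat.descriptive.SummaryStatistics;

public class GiantKeyCheckDemo {

  // Mirrors the isGiantKey() logic quoted from RFile.Writer above.
  static boolean isGiantKey(SummaryStatistics keyLenStats, long keySize) {
    return keySize > keyLenStats.getMean() + keyLenStats.getStandardDeviation() * 3;
  }

  public static void main(String[] args) {
    SummaryStatistics keyLenStats = new SummaryStatistics();

    // First column family: a long run of perfectly uniform 20-byte keys.
    // The mean settles at 20 and the standard deviation at 0.
    for (int i = 0; i < 1_000_000; i++) {
      keyLenStats.addValue(20);
    }

    // Second column family: 80-byte keys. These are small by any reasonable
    // measure, yet mean + 3 * stddev starts near 20, so every one of them is
    // flagged "giant" and the block cannot close.
    long keySize = 80;
    long bytesAppended = 0;
    while (isGiantKey(keyLenStats, keySize)) {
      keyLenStats.addValue(keySize);
      bytesAppended += keySize;
    }

    // Roughly 110,000 keys (~9 MB) with this history; with gigabytes of
    // history, as in the report, the same effect holds a block open for
    // gigabytes.
    System.out.printf("bytes of 80-byte keys appended before the check relaxed: %d%n",
        bytesAppended);
  }
}
{code}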
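For comparison, here is one possible shape of the firm cap the last paragraph of the description asks for, written as a standalone policy class. It is a sketch under the bound stated in the report (the greater of 2x the maximum key/value size and 1.1x the target block size), not the patch that was committed; all field and method names are hypothetical stand-ins for state RFile.Writer already tracks.

{code}
// Hypothetical names throughout; a sketch of the requested bound, not the
// committed Accumulo patch.
class BoundedBlockPolicy {
  private final long targetBlockSize;    // e.g. table.file.compress.blocksize
  private long maxSeenKeyValueSize = 0;  // largest key+value appended so far
  private double keyLenSum = 0;
  private long keyCount = 0;

  BoundedBlockPolicy(long targetBlockSize) {
    this.targetBlockSize = targetBlockSize;
  }

  // Called once per append so the policy can track sizes.
  void recordAppend(long keyLen, long valueLen) {
    maxSeenKeyValueSize = Math.max(maxSeenKeyValueSize, keyLen + valueLen);
    keyLenSum += keyLen;
    keyCount++;
  }

  boolean shouldCloseBlock(long prevKeySize, long rawBlockSize) {
    if (rawBlockSize <= targetBlockSize) {
      return false; // under target: keep filling
    }
    // Past the target we still prefer to close on a below-average key, since
    // the closing key becomes an index entry, but the deferral is bounded by
    // a firm cap: the greater of 2x the largest key/value pair seen and 1.1x
    // the target block size. Unlike the isGiantKey() check, the cap guarantees
    // the block closes no matter how the key-size distribution shifts.
    long hardCap = Math.max(2 * maxSeenKeyValueSize, (long) (1.1 * targetBlockSize));
    double averageKeySize = keyLenSum / keyCount;
    return prevKeySize <= averageKeySize || rawBlockSize >= hardCap;
  }
}
{code}

The point of this shape is that the worst-case raw block size becomes an explicit function of observed maxima rather than of distribution statistics, which is what lets applications reason about memory usage.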
[jira] [Updated] (ACCUMULO-4669) RFile can create very large blocks when key statistics are not uniform
     [ https://issues.apache.org/jira/browse/ACCUMULO-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner updated ACCUMULO-4669:
-----------------------------------
    Priority: Blocker  (was: Critical)
[jira] [Updated] (ACCUMULO-4669) RFile can create very large blocks when key statistics are not uniform
     [ https://issues.apache.org/jira/browse/ACCUMULO-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner updated ACCUMULO-4669:
-----------------------------------
    Fix Version/s: 2.0.0
                   1.8.2
                   1.7.4
[jira] [Updated] (ACCUMULO-4669) RFile can create very large blocks when key statistics are not uniform
     [ https://issues.apache.org/jira/browse/ACCUMULO-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keith Turner updated ACCUMULO-4669:
-----------------------------------
    Affects Version/s: 1.7.2
                       1.7.3
                       1.8.0
                       1.8.1