[jira] [Commented] (ACCUMULO-4669) RFile can create very large blocks when key statistics are not uniform

2017-06-29 Thread Christopher Tubbs (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069323#comment-16069323
 ] 

Christopher Tubbs commented on ACCUMULO-4669:
---------------------------------------------

No, it wouldn't prevent them from getting into the index entirely... but it 
would allow a best-effort close within some reasonable distance of the 
user-configured block size (after the soft limit, before the hard limit).
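
For illustration, a best-effort close in that spirit might look roughly like the 
sketch below; softLimit, hardLimit, and maybeCloseBlock are hypothetical names, 
not the actual RFile.Writer code.

{code}
// Hypothetical sketch of the soft-limit/hard-limit idea; not the actual implementation.
private void maybeCloseBlock(Key prevKey) throws IOException {
  long rawSize = blockWriter.getRawSize();
  if (rawSize <= softLimit) {
    return; // below the soft limit: keep filling the block
  }
  if (rawSize >= hardLimit) {
    closeBlock(prevKey, false); // at the hard limit: always close, even on a "giant" key
    return;
  }
  // between the limits: best effort, prefer to close on a small, non-giant key
  if (prevKey.getSize() <= keyLenStats.getMean() && !isGiantKey(prevKey)) {
    closeBlock(prevKey, false);
  }
}
{code}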

> RFile can create very large blocks when key statistics are not uniform
> -----------------------------------------------------------------------
>
> Key: ACCUMULO-4669
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4669
> Project: Accumulo
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.7.2, 1.7.3, 1.8.0, 1.8.1
>Reporter: Adam Fuchs
>Assignee: Keith Turner
>Priority: Blocker
> Fix For: 1.7.4, 1.8.2, 2.0.0
>
>
> RFile.Writer.append checks for giant keys and avoids writing them as index 
> blocks. This check is flawed and can result in multi-GB blocks. In our case, 
> a 20GB compressed RFile had one block with over 2GB raw size. This happened 
> because the key size statistics changed after some point in the file. The 
> code in question follows:
> {code}
> private boolean isGiantKey(Key k) {
>   // consider a key thats more than 3 standard deviations from previously seen key sizes as giant
>   return k.getSize() > keyLenStats.getMean() + keyLenStats.getStandardDeviation() * 3;
> }
> ...
>   if (blockWriter == null) {
>     blockWriter = fileWriter.prepareDataBlock();
>   } else if (blockWriter.getRawSize() > blockSize) {
> ...
>     if ((prevKey.getSize() <= avergageKeySize || blockWriter.getRawSize() > maxBlockSize) && !isGiantKey(prevKey)) {
>       closeBlock(prevKey, false);
> ...
> {code}
> Before closing a block that has grown beyond the target block size, we check 
> that the key is below average in size or that the block is more than 1.1 times 
> the target block size (maxBlockSize), and we check that the key isn't a 
> "giant" key, i.e. more than 3 standard deviations above the mean of key sizes 
> seen so far.
> Our RFiles often have one row of data with different column families 
> representing various forward and inverted indexes. This is a table design 
> similar to the WikiSearch example. The first column family in this case had 
> very uniform, relatively small key sizes. This first column family comprised 
> gigabytes of data, split up into roughly 100KB blocks. When we switched to 
> the next column family the keys grew in size, but were still under about 100 
> bytes. The statistics of the first column family had firmly established a 
> smaller mean and tiny standard deviation (approximately 0), and it took over 
> 2GB of larger keys to bring the standard deviation up enough so that keys 
> were no longer considered "giant" and the block could be closed.
> Now that we're aware, we see large blocks (more than 10x the target block 
> size) in almost every RFile we write. This only became a glaring problem when 
> we got OOM exceptions trying to decompress the block, but it also shows up in 
> a number of subtle performance problems, like high variance in latencies for 
> looking up particular keys.
> The fix for this should produce bounded RFile block sizes, limited to the 
> greater of 2x the maximum key/value size in the block and some configurable 
> threshold, such as 1.1 times the compressed block size. We need a firm cap to 
> be able to reason about memory usage in various applications.
> The following code produces arbitrarily large RFile blocks:
> {code}
>   FileSKVWriter writer = RFileOperations.getInstance().openWriter(filename, fs, conf, acuconf);
>   writer.startDefaultLocalityGroup();
>   SummaryStatistics keyLenStats = new SummaryStatistics();
>   Random r = new Random();
>   byte[] buffer = new byte[minRowSize];
>   for (int i = 0; i < 10; i++) {
>     byte[] valBytes = new byte[valLength];
>     r.nextBytes(valBytes);
>     r.nextBytes(buffer);
>     ByteBuffer.wrap(buffer).putInt(i);
>     Key k = new Key(buffer, 0, buffer.length, emptyBytes, 0, 0, emptyBytes, 0, 0, emptyBytes, 0, 0, 0);
>     Value v = new Value(valBytes);
>     writer.append(k, v);
>     keyLenStats.addValue(k.getSize());
>     int newBufferSize = Math.max(buffer.length, (int) Math.ceil(keyLenStats.getMean() + keyLenStats.getStandardDeviation() * 4 + 0.0001));
>     buffer = new byte[newBufferSize];
>     if (keyLenStats.getSum() > targetSize)
>       break;
>   }
>   writer.close();
> {code}
> One telltale symptom of this bug is an OutOfMemoryError thrown from a 
> readahead thread with the message "Requested array size exceeds VM limit". This 
> will only happen if the block cache size is big enough to hold the expected 
> raw block size, 2GB in our case.
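
For illustration only, the bounded-block-size fix proposed above might reduce to a 
hard cap along these lines; blockSize, maxKeyValueSize, and the method names are 
assumptions here, not the actual Accumulo implementation.

{code}
// Sketch of the proposed hard cap: close once the raw block size exceeds the greater
// of a configurable multiple of the target block size and 2x the largest key/value
// appended to the block so far. All names are hypothetical.
private long hardBlockCap() {
  long configuredCap = (long) (blockSize * 1.1); // e.g. 1.1x the target block size
  long entryCap = 2 * maxKeyValueSize;           // assumes the writer tracks the largest entry
  return Math.max(configuredCap, entryCap);
}

// Unlike the isGiantKey heuristic, this check depends only on the current block,
// so a shift in key-size statistics cannot keep a block open indefinitely.
private boolean mustCloseBlock() {
  return blockWriter.getRawSize() > hardBlockCap();
}
{code}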

[jira] [Commented] (ACCUMULO-4669) RFile can create very large blocks when key statistics are not uniform

2017-06-29 Thread Keith Turner (JIRA)

[ 
https://issues.apache.org/jira/browse/ACCUMULO-4669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16069287#comment-16069287
 ] 

Keith Turner commented on ACCUMULO-4669:


{quote}
 I think the isGiant heuristic is worth keeping, but we should reintroduce a 
hard limit on the data block size. 
{quote}

The goal of the isGiant check was to prevent really large (megabyte-sized) keys 
from making it into the index. I was dealing with keys that followed a 
Zipfian distribution. The hard limit would not prevent these keys from making 
it into the index. When these keys get into the index, it can be a disaster 
because they may be pushed up the index tree. Maybe the isGiant check should 
be dropped if it can't accomplish its intended goal without creating large data 
blocks.
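
One way to reconcile the two goals, keeping megabyte-sized keys out of the index 
while still bounding block size, is sketched below; this is illustrative only, not 
a committed change.

{code}
// Illustrative only: defer closing on a giant key while the block is under the
// hard cap, but force the close once the cap is reached so block size stays bounded.
if (blockWriter.getRawSize() > blockSize) {
  boolean pastHardCap = blockWriter.getRawSize() > maxBlockSize;
  if (pastHardCap || (prevKey.getSize() <= avergageKeySize && !isGiantKey(prevKey))) {
    closeBlock(prevKey, false);
  }
}
{code}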

[jira] [Updated] (ACCUMULO-4672) NPE extracting samplerConfiguration from InputSplit

2017-06-29 Thread Christopher Tubbs (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Christopher Tubbs updated ACCUMULO-4672:

Fix Version/s: 2.0.0

> NPE extracting samplerConfiguration from InputSplit
> ----------------------------------------------------
>
> Key: ACCUMULO-4672
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4672
> Project: Accumulo
>  Issue Type: Bug
>  Components: mapreduce
>Reporter: Josh Elser
>Assignee: Josh Elser
>Priority: Minor
> Fix For: 1.8.2, 2.0.0
>
>
> {noformat}
> Caused by: java.lang.NullPointerException
>   at org.apache.accumulo.core.client.mapred.AbstractInputFormat$AbstractRecordReader.initialize(AbstractInputFormat.java:608)
>   at org.apache.accumulo.core.client.mapred.AccumuloRowInputFormat$1.initialize(AccumuloRowInputFormat.java:60)
>   at org.apache.accumulo.core.client.mapred.AccumuloRowInputFormat.getRecordReader(AccumuloRowInputFormat.java:84)
> {noformat}
> I still need to dig into this one and try to write a test case for it that 
> doesn't involve Hive (as it may have just been something that I was doing). 
> Best as I can tell...
> AbstractInputFormat extracts a default table configuration object from the 
> Job's Configuration:
> {code}
>   InputTableConfig tableConfig = getInputTableConfig(job, baseSplit.getTableName());
> {code}
> Eventually, the same class tries to extract the samplerConfiguration from 
> this tableConfig (after noticing it is not present in the InputSplit) and 
> this throws an NPE. Somehow the tableConfig was null. It very well could be 
> that Hive was to blame, I just wanted to make sure that this was captured 
> before I forgot about it.
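
A minimal defensive guard for this failure mode might look like the following 
sketch; the error message and its placement are assumptions, not the actual patch.

{code}
// Sketch only: fail with a descriptive error instead of an NPE when no table
// configuration can be found for the split's table.
InputTableConfig tableConfig = getInputTableConfig(job, baseSplit.getTableName());
if (tableConfig == null) {
  throw new IOException("No table configuration found for table "
      + baseSplit.getTableName() + "; cannot determine sampler configuration");
}
{code}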



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (ACCUMULO-4043) start-up needs to scale with the number of servers

2017-06-29 Thread Ivan Bella (JIRA)

 [ 
https://issues.apache.org/jira/browse/ACCUMULO-4043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Bella reassigned ACCUMULO-4043:


Assignee: (was: Ivan Bella)

> start-up needs to scale with the number of servers
> ---------------------------------------------------
>
> Key: ACCUMULO-4043
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4043
> Project: Accumulo
>  Issue Type: Bug
>  Components: master
>Affects Versions: 1.6.4
> Environment: very large production cluster
>Reporter: Eric Newton
> Fix For: 2.0.0
>
>
> When starting a very large production cluster, the master can start up fast 
> enough that it loads the metadata table before all the tservers have 
> registered and are running. The result is a very long balancing period after 
> all servers have started. The wait period for tablet server stabilization 
> needs to scale somewhat with the number of tablet servers.
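
As a sketch of that idea only (the constants and method name here are hypothetical, 
not the master's actual configuration), the stabilization wait could grow with the 
number of tablet servers that have registered:

{code}
// Illustrative only: scale the startup wait with the number of tablet servers
// registered so far, bounded below and above by fixed limits.
static long stabilizationWaitMillis(int registeredTabletServers) {
  final long baseMillis = 30_000;   // minimum wait on any cluster
  final long perServerMillis = 10;  // additional wait per registered server
  final long maxMillis = 300_000;   // upper bound so startup is never unbounded
  return Math.min(maxMillis, baseMillis + perServerMillis * registeredTabletServers);
}
{code}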



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)