keith-turner commented on a change in pull request #373: ACCUMULO-4709 sanity check in Mutation
URL: https://github.com/apache/accumulo/pull/373#discussion_r167305930
 
 

 ##########
 File path: core/src/main/java/org/apache/accumulo/core/data/Mutation.java
 ##########
 @@ -306,6 +318,8 @@ private void put(byte[] cf, int cfLength, byte[] cq, int cqLength, byte[] cv, bo
     if (buffer == null) {
       throw new IllegalStateException("Can not add to mutation after serializing it");
     }
+    estimatedSize += cfLength + cqLength + (hasts ? 8 : 0) + valLength + 2 * 1 + 4 * SERIALIZATION_OVERHEAD;
+    Preconditions.checkArgument(estimatedSize < MAX_MUTATION_SIZE && estimatedSize >= 0, "Maximum mutation size must be less than 2GB ");
 
 Review comment:
   I was thinking of a slightly different approach that would give more accurate size estimates. Something like the following:
   
    * Use `estimatedSize` to track only the row and large values.
    * Add a `size()` method to `UnsynchronizedBuffer.Writer` that returns its offset. Calling `buffer.size()` then gives an accurate count of the data written so far.
   
   Then this check could be something like
   
   ```java
   long estSizeAfterPut = estimatedSize + buffer.size() + cfLength + cqLength + (hasts ? 8 : 0) + valLength + 2 * 1 + 4 * SERIALIZATION_OVERHEAD;
   Preconditions.checkArgument(estSizeAfterPut < MAX_MUTATION_SIZE && estSizeAfterPut >= 0, "Maximum mutation size must be less than 2GB ");
   ```
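
For illustration only, the suggestion above can be sketched as a small standalone program. This is not the Accumulo implementation: the nested `Writer` class is a hypothetical stand-in for `UnsynchronizedBuffer.Writer`, the values of `MAX_MUTATION_SIZE` and `SERIALIZATION_OVERHEAD` are assumed, and a plain `if` replaces Guava's `Preconditions.checkArgument` so the example has no dependencies.

```java
import java.util.Arrays;

final class MutationSizeSketch {
    // Assumed values for illustration; the real constants are defined in Mutation.java.
    static final long MAX_MUTATION_SIZE = 1L << 31; // 2GB
    static final int SERIALIZATION_OVERHEAD = 5;

    /** Hypothetical stand-in for UnsynchronizedBuffer.Writer with the proposed size() method. */
    static final class Writer {
        private byte[] data = new byte[64];
        private int offset = 0;

        /** Appends len bytes of src, starting at off, growing the backing array if needed. */
        void add(byte[] src, int off, int len) {
            if (offset + len > data.length) {
                data = Arrays.copyOf(data, Math.max(data.length * 2, offset + len));
            }
            System.arraycopy(src, off, data, offset, len);
            offset += len;
        }

        /** Proposed accessor: the number of bytes written so far. */
        int size() {
            return offset;
        }
    }

    /** Size the mutation would have after the put; computed in long arithmetic because estimatedSize is a long. */
    static long sizeAfterPut(long estimatedSize, Writer buffer, int cfLength, int cqLength,
                             boolean hasts, int valLength) {
        return estimatedSize + buffer.size() + cfLength + cqLength + (hasts ? 8 : 0)
            + valLength + 2 * 1 + 4 * SERIALIZATION_OVERHEAD;
    }

    /** Rejects a put that would reach the limit; the >= 0 test mirrors the overflow guard in the comment. */
    static void checkPut(long estimatedSize, Writer buffer, int cfLength, int cqLength,
                         boolean hasts, int valLength) {
        long estSizeAfterPut = sizeAfterPut(estimatedSize, buffer, cfLength, cqLength, hasts, valLength);
        if (!(estSizeAfterPut < MAX_MUTATION_SIZE && estSizeAfterPut >= 0)) {
            throw new IllegalArgumentException("Maximum mutation size must be less than 2GB ");
        }
    }

    public static void main(String[] args) {
        Writer buffer = new Writer();
        buffer.add(new byte[100], 0, 100);
        // estimatedSize (first argument) stands for the row plus any large values tracked separately.
        checkPut(10L, buffer, 3, 4, true, 5);
        System.out.println("bytes written: " + buffer.size());
        System.out.println("size after put: " + sizeAfterPut(10L, buffer, 3, 4, true, 5));
    }
}
```

Because the bytes already written are read from the buffer rather than accumulated per call, `estimatedSize` only has to cover data that is not in the buffer, which is what makes the estimate more accurate.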

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
