cshannon commented on issue #3348:
URL: https://github.com/apache/accumulo/issues/3348#issuecomment-1537170960

   > One thing to keep in mind, that may make this easier to address: 
SplitUtils is not public API and is not intended for direct consumption. It is 
used internally to help us approximate relative split sizes when calculating 
InputSplits. So, it doesn't matter if its method returns a negative number, as 
long as the places where it's used check its sign to ensure that they handle 
that situation appropriately.
   > 
   > We could also try to come up with better approximation methods for split 
sizes, but I think addressing the negative in the places where it's used is the 
quickest and easiest way to fix this, without completely rewriting the 
approximation algorithm.
   
   So do you think it's appropriate to just use Long.MAX_VALUE as the range 
length if a negative is detected in the spots where the method is used? 
(Currently that looks like 
[BatchInputSplit](https://github.com/apache/accumulo/blob/56d49f15a05db9a46dbceb845918497760601c11/hadoop-mapreduce/src/main/java/org/apache/accumulo/hadoopImpl/mapreduce/BatchInputSplit.java#L113)
 and RangeInputSplit.)
   
   We could of course just handle a returned negative in both places, but it 
still seems like the simplest thing is to have that method return 
Long.MAX_VALUE as the computed range length if it detects an overflow/negative. 
Otherwise, every time we call the method we have to handle a potential negative 
return value, which could lead to inconsistencies.
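
To make the second option concrete, here is a minimal sketch of clamping inside the method itself. It is not the actual SplitUtils code: the class `RangeLengthSketch`, the stand-in `approximateLength`, and the use of plain `long` endpoints instead of row bytes are all assumptions made for illustration. The only point is that the sign check lives in one place, so callers never see a negative.

```java
// Hypothetical sketch only; names and arithmetic are illustrative,
// not the real SplitUtils implementation.
final class RangeLengthSketch {
  private RangeLengthSketch() {}

  // Stand-in for an approximation whose arithmetic can wrap past
  // Long.MAX_VALUE and come back as a negative long.
  static long approximateLength(long start, long stop) {
    return stop - start + 1;
  }

  // Clamp once, inside the method, so every caller gets a non-negative
  // length without having to check the sign itself.
  static long clampedLength(long start, long stop) {
    long raw = approximateLength(start, stop);
    return raw < 0 ? Long.MAX_VALUE : raw;
  }

  public static void main(String[] args) {
    // Small range: no overflow, value passes through unchanged.
    System.out.println(clampedLength(0L, 9L));
    // Long.MAX_VALUE - 0 + 1 wraps to Long.MIN_VALUE, so it is clamped.
    System.out.println(clampedLength(0L, Long.MAX_VALUE));
  }
}
```

With this shape, BatchInputSplit and RangeInputSplit would not need their own negative checks; the trade-off is that a clamped value no longer distinguishes "very large" from "overflowed".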

