[ 
https://issues.apache.org/jira/browse/HADOOP-15204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350889#comment-16350889
 ] 

Chris Douglas commented on HADOOP-15204:
----------------------------------------

bq. I find getTimeDuration API extremely intuitive, hence imitating that for 
this API
Soright; I only mention it as relevant context.

bq. Rounding causes a significant loss when we convert from x bytes to y 
exabytes. Hence I voted for the least element of surprise and decided to return 
double
Is it sufficient for the caller to provide the precision? If the caller wants 
petabytes with some decimal places, then they can request terabytes. If they 
want to ensure the conversion is within some epsilon, then they can request the 
value with high precision and measure the loss. Even rounding decisions can be 
left to the caller. Instead of passing this context into {{Configuration}} (or 
setting defaults), its role can be limited to converting and scaling the 
stringly-typed value.

Similarly:
bq. That would let me return something like -1 to mean "no value set", which I 
can't do with the current API.
{{Configuration}} supports that with a raw {{get(key)}}. It's only where we 
have the default in hand that it provides typed getters.

bq. This is the curse of writing a unit as a library; we need to be cognizant 
of that single use case which will break us. Hence I have used bigDecimal to be 
safe and correct and return doubles. It yields values that people expect.
Sure, I only mention it because it differs from {{getTimeDuration}}. With 
{{TimeUnit}}, a caller could, with low false positives, check if the result was 
equal to max to detect overflow. Doing that here would have a higher false 
positive rate, so the {{BigDecimal}} approach with explicit precision is 
superior.

Minor:
* In this overload:
{noformat}
+  public double getStorageSize(String name, String defaultValue,
+      StorageUnit targetUnit) {
{noformat}
Does this come up often? The {{getTimeDuration}} assumes that the default will 
be in the same unit as the conversion. So in your example, one would write 
{{getTimeDuration("key", 5000, MEGABYTES)}}. It's less elegant, but it type 
checks.
* I'd lean toward {{MB}} instead of {{MEGABYTES}}, and similar. Even as a 
static import, those are unlikely to collide and they're equally readable.

> Add Configuration API for parsing storage sizes
> -----------------------------------------------
>
>                 Key: HADOOP-15204
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15204
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: conf
>    Affects Versions: 3.1.0
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>            Priority: Minor
>             Fix For: 3.1.0
>
>         Attachments: HADOOP-15204.001.patch, HADOOP-15204.002.patch
>
>
> Hadoop has a lot of configurations that specify memory and disk size. This 
> JIRA proposes to add an API like {{Configuration.getStorageSize}} which will 
> allow users
>  to specify units like KB, MB, GB etc. This is JIRA is inspired by 
> HADOOP-8608 and Ozone. Adding {{getTimeDuration}} support was a great 
> improvement for ozone code base, this JIRA hopes to do the same thing for 
> configs that deal with disk and memory usage.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to