[ https://issues.apache.org/jira/browse/HADOOP-15204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16350889#comment-16350889 ]
Chris Douglas commented on HADOOP-15204: ---------------------------------------- bq. I find getTimeDuration API extremely intuitive, hence imitating that for this API Soright; I only mention it as relevant context. bq. Rounding causes a significant loss when we convert from x bytes to y exabytes. Hence I voted for the least element of surprise and decided to return double Is it sufficient for the caller to provide the precision? If the caller wants petabytes with some decimal places, then they can request terabytes. If they want to ensure the conversion is within some epsilon, then they can request the value with high precision and measure the loss. Even rounding decisions can be left to the caller. Instead of passing this context into {{Configuration}} (or setting defaults), its role can be limited to converting and scaling the stringly-typed value. Similarly: bq. That would let me return something like -1 to mean "no value set", which I can't do with the current API. {{Configuration}} supports that with a raw {{get(key)}}. It's only where we have the default in hand that it provides typed getters. bq. This is the curse of writing a unit as a library; we need to be cognizant of that single use case which will break us. Hence I have used bigDecimal to be safe and correct and return doubles. It yields values that people expect. Sure, I only mention it because it differs from {{getTimeDuration}}. With {{TimeUnit}}, a caller could, with low false positives, check if the result was equal to max to detect overflow. Doing that here would have a higher false positive rate, so the {{BigDecimal}} approach with explicit precision is superior. Minor: * In this overload: {noformat} + public double getStorageSize(String name, String defaultValue, + StorageUnit targetUnit) { {noformat} Does this come up often? The {{getTimeDuration}} assumes that the default will be in the same unit as the conversion. So in your example, one would write {{getTimeDuration("key", 5000, MEGABYTES)}}. It's less elegant, but it type checks. * I'd lean toward {{MB}} instead of {{MEGABYTES}}, and similar. Even as a static import, those are unlikely to collide and they're equally readable. > Add Configuration API for parsing storage sizes > ----------------------------------------------- > > Key: HADOOP-15204 > URL: https://issues.apache.org/jira/browse/HADOOP-15204 > Project: Hadoop Common > Issue Type: Improvement > Components: conf > Affects Versions: 3.1.0 > Reporter: Anu Engineer > Assignee: Anu Engineer > Priority: Minor > Fix For: 3.1.0 > > Attachments: HADOOP-15204.001.patch, HADOOP-15204.002.patch > > > Hadoop has a lot of configurations that specify memory and disk size. This > JIRA proposes to add an API like {{Configuration.getStorageSize}} which will > allow users > to specify units like KB, MB, GB etc. This is JIRA is inspired by > HADOOP-8608 and Ozone. Adding {{getTimeDuration}} support was a great > improvement for ozone code base, this JIRA hopes to do the same thing for > configs that deal with disk and memory usage. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org