[
https://issues.apache.org/jira/browse/HADOOP-17146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ahmed Hussein updated HADOOP-17146:
-----------------------------------
Description:
Hadoop source code uses {{com.google.common.base.Splitter}}. We need to
analyze the performance overhead of Splitter and consider alternative
implementations such as apache-commons.
Hadoop already has its own implementation,
{{org.apache.hadoop.util.StringUtils.split()}}. Therefore, all split() calls
should go through that wrapper in the common package. This makes the utility
calls less invasive and less confusing, because the behavior of apache-commons
StringUtils is not the same as Guava's.
Once the wrapper is in place and all calls use it, we can decide to switch to
apache-commons or apply specific optimizations without changing the entire
source code.
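As a rough illustration of the wrapper idea described above, the sketch below
routes every split through a single facade so the backing implementation can be
swapped later. The class {{SplitFacade}} and its two methods are hypothetical
names chosen for this example; they are not Hadoop's actual
{{StringUtils}} API, and the JDK-only implementation is just one possible
choice. It keeps empty tokens by default, which is how Guava's Splitter behaves,
whereas commons-lang's {{StringUtils.split}} treats adjacent separators as one.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical facade for string splitting (an assumption for illustration,
 * not Hadoop's real StringUtils). Callers depend only on this class, so the
 * implementation behind it can change without touching call sites.
 */
final class SplitFacade {
  private SplitFacade() {
  }

  /** Splits on a single character and keeps empty tokens. */
  static List<String> split(String input, char separator) {
    List<String> parts = new ArrayList<>();
    int start = 0;
    for (int i = 0; i < input.length(); i++) {
      if (input.charAt(i) == separator) {
        parts.add(input.substring(start, i));
        start = i + 1;
      }
    }
    // The text after the last separator is always a token, even if empty.
    parts.add(input.substring(start));
    return parts;
  }

  /**
   * Trims each token with String.trim() and drops empty ones. This is only
   * roughly comparable to Splitter.on(c).trimResults().omitEmptyStrings(),
   * since Guava uses a broader definition of whitespace.
   */
  static List<String> splitTrimmed(String input, char separator) {
    List<String> parts = new ArrayList<>();
    for (String token : split(input, separator)) {
      String trimmed = token.trim();
      if (!trimmed.isEmpty()) {
        parts.add(trimmed);
      }
    }
    return parts;
  }
}
```

With such a facade, a call site would use {{SplitFacade.split(value, ',')}}
instead of importing Splitter directly, and a later switch to apache-commons or
a hand-tuned implementation would be confined to this one class.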
{code:bash}
Targets
    Occurrences of 'import com.google.common.base.Splitter;' in project with mask '*.java'
Found Occurrences (18 usages found)
    org.apache.hadoop.crypto (1 usage found)
        CryptoCodec.java (1 usage found)
            34 import com.google.common.base.Splitter;
    org.apache.hadoop.mapred.nativetask.kvtest (1 usage found)
        KVTest.java (1 usage found)
            44 import com.google.common.base.Splitter;
    org.apache.hadoop.mapreduce.v2.util (1 usage found)
        MRWebAppUtil.java (1 usage found)
            20 import com.google.common.base.Splitter;
    org.apache.hadoop.metrics2.impl (1 usage found)
        MetricsConfig.java (1 usage found)
            32 import com.google.common.base.Splitter;
    org.apache.hadoop.registry.client.impl.zk (1 usage found)
        RegistrySecurity.java (1 usage found)
            22 import com.google.common.base.Splitter;
    org.apache.hadoop.security.authentication.server (1 usage found)
        MultiSchemeAuthenticationHandler.java (1 usage found)
            34 import com.google.common.base.Splitter;
    org.apache.hadoop.security.token.delegation.web (1 usage found)
        MultiSchemeDelegationTokenAuthenticationHandler.java (1 usage found)
            40 import com.google.common.base.Splitter;
    org.apache.hadoop.tools.dynamometer (1 usage found)
        Client.java (1 usage found)
            22 import com.google.common.base.Splitter;
    org.apache.hadoop.tools.dynamometer.workloadgenerator.audit (2 usages found)
        AuditLogDirectParser.java (1 usage found)
            20 import com.google.common.base.Splitter;
        AuditReplayThread.java (1 usage found)
            20 import com.google.common.base.Splitter;
    org.apache.hadoop.util (2 usages found)
        TestApplicationClassLoader.java (1 usage found)
            44 import com.google.common.base.Splitter;
        ZKUtil.java (1 usage found)
            31 import com.google.common.base.Splitter;
    org.apache.hadoop.yarn.api.records.timeline (1 usage found)
        TimelineEntityGroupId.java (1 usage found)
            27 import com.google.common.base.Splitter;
    org.apache.hadoop.yarn.server.resourcemanager.scheduler (1 usage found)
        QueueMetrics.java (1 usage found)
            55 import com.google.common.base.Splitter;
    org.apache.hadoop.yarn.util (1 usage found)
        StringHelper.java (1 usage found)
            21 import com.google.common.base.Splitter;
    org.apache.hadoop.yarn.webapp (1 usage found)
        WebApp.java (1 usage found)
            37 import com.google.common.base.Splitter;
    org.apache.hadoop.yarn.webapp.hamlet (1 usage found)
        HamletImpl.java (1 usage found)
            22 import com.google.common.base.Splitter;
    org.apache.hadoop.yarn.webapp.hamlet2 (1 usage found)
        HamletImpl.java (1 usage found)
            22 import com.google.common.base.Splitter;
{code}
was:
Hadoop source code uses {{com.google.common.base.Splitter}} . We need to
analyze the performance overhead of splitter and consider different
implementations such as apache-commons.
Hadoop has an implementation {{org.apache.hadoop.util.StringUtils.split()}}.
Therefore, we should all split() calls use the wrapper in common. This will
make the utility calls less invasive and confusing as the behavior of
apache-commons.stringUtils is not the same as guava.
Once we have the wrapper, and all calls are using that wrapper, we can decide
to use apache-commons or do specific optimizations without changing the entire
source code.
> Replace Guava.Splitter with common.util.StringUtils
> ---------------------------------------------------
>
> Key: HADOOP-17146
> URL: https://issues.apache.org/jira/browse/HADOOP-17146
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: common
> Reporter: Ahmed Hussein
> Priority: Major