[jira] [Commented] (HADOOP-17115) Replace Guava Sets usage by Hadoop's own Sets in hadoop-common and hadoop-tools

2021-05-25 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350966#comment-17350966
 ] 

Viraj Jasani commented on HADOOP-17115:
---

Thank you [~ahussein] and [~busbey] for all your reviews. They were a great help.

> Replace Guava Sets usage by Hadoop's own Sets in hadoop-common and 
> hadoop-tools
> ---
>
> Key: HADOOP-17115
> URL: https://issues.apache.org/jira/browse/HADOOP-17115
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ahmed Hussein
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 8h 10m
> Remaining Estimate: 0h
>
> Unjustified usage of Guava API to initialize a {{HashSet}}. This should be 
> replaced by Java APIs.
> {code:java}
> Targets
> Occurrences of 'Sets.newHashSet' in project
> Found Occurrences  (223 usages found)
> org.apache.hadoop.crypto.key  (2 usages found)
> TestValueQueue.java  (2 usages found)
> testWarmUp()  (2 usages found)
> 106 Assert.assertEquals(Sets.newHashSet("k1", "k2", "k3"),
> 107 Sets.newHashSet(fillInfos[0].key,
> org.apache.hadoop.crypto.key.kms  (6 usages found)
> TestLoadBalancingKMSClientProvider.java  (6 usages found)
> testCreation()  (6 usages found)
> 86 assertEquals(Sets.newHashSet("http://host1:9600/kms/foo/v1/"),
> 87 Sets.newHashSet(providers[0].getKMSUrl()));
> 95 assertEquals(Sets.newHashSet("http://host1:9600/kms/foo/v1/",
> 98 Sets.newHashSet(providers[0].getKMSUrl(),
> 108 assertEquals(Sets.newHashSet("http://host1:9600/kms/foo/v1/",
> 111 Sets.newHashSet(providers[0].getKMSUrl(),
> org.apache.hadoop.crypto.key.kms.server  (1 usage found)
> KMSAudit.java  (1 usage found)
> 59 static final Set AGGREGATE_OPS_WHITELIST = Sets.newHashSet(
> org.apache.hadoop.fs.s3a  (1 usage found)
> TestS3AAWSCredentialsProvider.java  (1 usage found)
> testFallbackToDefaults()  (1 usage found)
> 183 Sets.newHashSet());
> org.apache.hadoop.fs.s3a.auth  (1 usage found)
> AssumedRoleCredentialProvider.java  (1 usage found)
> AssumedRoleCredentialProvider(URI, Configuration)  (1 usage found)
> 113 Sets.newHashSet(this.getClass()));
> org.apache.hadoop.fs.s3a.commit.integration  (1 usage found)
> ITestS3ACommitterMRJob.java  (1 usage found)
> test_200_execute()  (1 usage found)
> 232 Set expectedKeys = Sets.newHashSet();
> org.apache.hadoop.fs.s3a.commit.staging  (5 usages found)
> TestStagingCommitter.java  (3 usages found)
> testSingleTaskMultiFileCommit()  (1 usage found)
> 341 Set keys = Sets.newHashSet();
> runTasks(JobContext, int, int)  (1 usage found)
> 603 Set uploads = Sets.newHashSet();
> commitTask(StagingCommitter, TaskAttemptContext, int)  (1 usage found)
> 640 Set files = Sets.newHashSet();
> TestStagingPartitionedTaskCommit.java  (2 usages found)
> verifyFilesCreated(PartitionedStagingCommitter)  (1 usage found)
> 148 Set files = Sets.newHashSet();
> buildExpectedList(StagingCommitter)  (1 usage found)
> 188 Set expected = Sets.newHashSet();
> org.apache.hadoop.hdfs  (5 usages found)
> DFSUtil.java  (2 usages found)
> getNNServiceRpcAddressesForCluster(Configuration)  (1 usage found)
> 615 Set availableNameServices = Sets.newHashSet(conf
> getNNLifelineRpcAddressesForCluster(Configuration)  (1 usage found)
> 660 Set availableNameServices = Sets.newHashSet(conf
> MiniDFSCluster.java  (1 usage found)
> 597 private Set fileSystems = Sets.newHashSet();
> TestDFSUtil.java  (2 usages found)
> testGetNNServiceRpcAddressesForNsIds()  (2 usages found)
> 1046 assertEquals(Sets.newHashSet("nn1"), internal);
> 1049 assertEquals(Sets.newHashSet("nn1", "nn2"), all);
> org.apache.hadoop.hdfs.net  (5 usages found)
> TestDFSNetworkTopology.java  (5 usages found)
> testChooseRandomWithStorageType()  (4 usages found)
> 277 Sets.newHashSet("host2", "host4", "host5", "host6");
> 278 Set archiveUnderL1 = Sets.newHashSet("host1", "host3");
> 279 Set ramdiskUnderL1 = Sets.newHashSet("host7");
> 280 Set ssdUnderL1 = Sets.newHashSet("host8");
> testChooseRandomWithSt

[jira] [Commented] (HADOOP-17115) Replace Guava Sets usage by Hadoop's own Sets in hadoop-common and hadoop-tools

2021-05-24 Thread Ahmed Hussein (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17350467#comment-17350467
 ] 

Ahmed Hussein commented on HADOOP-17115:


Apologies for not following up with reviews as I was OOO for the last two weeks.
Thanks [~busbey] and [~vjasani] for getting this code merged.

> Replace Guava Sets usage by Hadoop's own Sets in hadoop-common and 
> hadoop-tools
> ---
>
> Key: HADOOP-17115
> URL: https://issues.apache.org/jira/browse/HADOOP-17115
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ahmed Hussein
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 8h 10m
> Remaining Estimate: 0h
>

[jira] [Commented] (HADOOP-17115) Replace Guava Sets usage by Hadoop's own Sets

2021-05-16 Thread Sean Busbey (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17345689#comment-17345689
 ] 

Sean Busbey commented on HADOOP-17115:
--

Also, there is no license issue with copying code out of Guava so long as we 
attribute it properly. We have existing classes where this was done wholesale, 
e.g. LimitInputStream.

> Replace Guava Sets usage by Hadoop's own Sets
> -
>
> Key: HADOOP-17115
> URL: https://issues.apache.org/jira/browse/HADOOP-17115
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ahmed Hussein
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 10m
> Remaining Estimate: 0h
>

[jira] [Commented] (HADOOP-17115) Replace Guava Sets usage by Hadoop's own Sets

2021-05-16 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17345685#comment-17345685
 ] 

Viraj Jasani commented on HADOOP-17115:
---

Thanks for your interest, [~quanli]. Reducing the Guava dependency in Hadoop 
code is a big and worthwhile initiative started by [~ahussein] on the parent 
Jira HADOOP-17098. Please feel free to take a detailed look; a similar 
discussion has already been covered there.
{quote}In general, these changes have little or no advantage.
{quote}
I disagree with this comment because we have had to bump Guava multiple times 
due to many CVEs, and the problem keeps coming back. Every time a new CVE is 
reported, we need to perform not just a Hadoop release but a thirdparty 
release first, followed by Hadoop, and some users (even those who do not wish 
to upgrade on every patch release) are forced to keep upgrading because of 
various security vulnerabilities.

Anyway, please feel free to take a look at the parent Jira HADOOP-17098. A lot 
of work has already been done by [~ahussein], and I am just helping out with 
some remaining work, trying our best to get rid of this cycle of redundant 
upgrades due to CVEs in Guava.
{quote}I saw a comment suggesting copying code from Guava; is there no licence 
issue here?
{quote}
The patch introduces a new Sets class in the Hadoop code base, but it just 
copies the required method signatures from Guava Sets. The internal logic is 
not really copied; the implementation is restricted to our use case only.
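
To make the idea concrete, here is a minimal sketch of what such a utility could look like. It is not the class from the patch: the name SetsSketch is hypothetical, and only the three newHashSet signatures are modelled on Guava's Sets. The bodies are written from scratch on top of java.util.

```java
import java.util.Collections;
import java.util.HashSet;

/**
 * Hypothetical stand-in for Guava's Sets, for illustration only.
 * Method signatures follow Guava's Sets.newHashSet overloads; the
 * bodies rely solely on java.util and are not copied from Guava.
 */
final class SetsSketch {

  private SetsSketch() {
    // static utility class, never instantiated
  }

  /** Returns a new, empty, mutable HashSet. */
  static <E> HashSet<E> newHashSet() {
    return new HashSet<>();
  }

  /** Returns a new, mutable HashSet containing the given elements. */
  @SafeVarargs
  static <E> HashSet<E> newHashSet(E... elements) {
    HashSet<E> set = new HashSet<>();
    Collections.addAll(set, elements);
    return set;
  }

  /** Returns a new, mutable HashSet containing everything in the Iterable. */
  static <E> HashSet<E> newHashSet(Iterable<? extends E> elements) {
    HashSet<E> set = new HashSet<>();
    for (E element : elements) {
      set.add(element);
    }
    return set;
  }
}
```

With a class of this shape, a call site such as Sets.newHashSet("k1", "k2", "k3") would only need its import changed, which is presumably why keeping the Guava signatures is attractive when there are a couple of hundred usages.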

Thanks

> Replace Guava Sets usage by Hadoop's own Sets
> -
>
> Key: HADOOP-17115
> URL: https://issues.apache.org/jira/browse/HADOOP-17115
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ahmed Hussein
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 10m
> Remaining Estimate: 0h
>

[jira] [Commented] (HADOOP-17115) Replace Guava Sets usage by Hadoop's own Sets

2021-05-16 Thread Quan Li (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17345672#comment-17345672
 ] 

Quan Li commented on HADOOP-17115:
--

[~busbey]/[~vjasani]
Can you share the performance improvement numbers for this change, or can you 
point me to the person or mailing list where I can get an explanation? In 
general, these changes have little or no advantage.
I saw a comment suggesting copying code from Guava; is there no licence issue 
here? And has any other project also done this work? This is not, in general, 
a code logic improvement, right? The general practice is to share improvement 
numbers, not theory, before concluding.

> Replace Guava Sets usage by Hadoop's own Sets
> -
>
> Key: HADOOP-17115
> URL: https://issues.apache.org/jira/browse/HADOOP-17115
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ahmed Hussein
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 10m
> Remaining Estimate: 0h
>

[jira] [Commented] (HADOOP-17115) Replace Guava Sets usage by Hadoop's own Sets

2021-05-08 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HADOOP-17115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17341217#comment-17341217
 ] 

Viraj Jasani commented on HADOOP-17115:
---

[~ahussein] The QA result is available, if you would like to take a look at the PR.

Thanks

> Replace Guava Sets usage by Hadoop's own Sets
> -
>
> Key: HADOOP-17115
> URL: https://issues.apache.org/jira/browse/HADOOP-17115
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ahmed Hussein
> Assignee: Viraj Jasani
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>