[
https://issues.apache.org/jira/browse/TEZ-4557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17859483#comment-17859483
]
László Bodor commented on TEZ-4557:
-----------------------------------
I was thinking about this, and I feel that we might want to remove this
exclusion, i.e. add the httpclient transitive dependency back.
I remember that back when TEZ-4303 was merged, we tended to follow a pattern of
removing every transitive dependency from Tez that caused CVE scan issues,
because we - from the Tez side - just wanted to get rid of them. But at the
same time we violated a more important rule: to ship a standalone, OOTB
tez.tar.gz that works in most cases. Here in the stacktrace it's clearly
visible that Hadoop's KMSClientProvider needs this library, so excluding it is
a hack again, and it doesn't improve the stack (Hadoop will still have the same
"bad" dependency).
Today I would solve CVE issues like this:
1. if something is required by Hadoop, let's try not to mess with it
2. if a transitive dependency causes a CVE warning, let's push the solution to
the Hadoop folks, and upgrade the Hadoop dependency when it's available -> this
way the whole Hadoop stack will leverage our efforts (see the version-bump
sketch below)
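For point 2, the fix on our side would then be a one-line version bump instead of a jar exclusion. A sketch, assuming the usual hadoop.version property convention; the property name and version numbers are placeholders, not from the Tez pom:
{code:xml}
<!-- Sketch: pick up the CVE fix by moving to the Hadoop release that carries
     it, instead of excluding jars on the Tez side. Versions are placeholders. -->
<properties>
  <!-- was e.g. 3.3.6; bump once the patched Hadoop release is available -->
  <hadoop.version>3.4.0</hadoop.version>
</properties>
{code}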
I'm open to any objections here: [~ayushtkn], [~jeagles]
> Revert TEZ-4303, NoClassDefFoundError because of missing httpclient jar
> -----------------------------------------------------------------------
>
> Key: TEZ-4557
> URL: https://issues.apache.org/jira/browse/TEZ-4557
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Raghav Aggarwal
> Assignee: Raghav Aggarwal
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Inserting data into a table located in an encryption zone using Hive with Tez
> fails, because the httpclient jar has been excluded from the Hadoop transitive
> dependencies. The same query passes with MR.
> Tez: 0.10.2, 0.10.3
> Hadoop: 3.3.6
> Hive: 3.1.2
>
> Steps to reproduce the issue:
> 1. Create an encryption key using the ranger keyadmin user.
> 2. hdfs crypto -createZone -keyName test_key -path /user/raghav/encrypt_zone
> 3. create table tbl(id int) location '/user/raghav/encrypt_zone';
> 4. insert into tbl values(1);
>
> Stacktrace:
> {code:java}
> Caused by: java.lang.NoClassDefFoundError: org/apache/http/client/utils/URIBuilder
> at org.apache.hadoop.crypto.key.kms.KMSClientProvider.createURL(KMSClientProvider.java:468)
> at org.apache.hadoop.crypto.key.kms.KMSClientProvider.decryptEncryptedKey(KMSClientProvider.java:823)
> at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:354)
> at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider$5.call(LoadBalancingKMSClientProvider.java:350)
> at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.doOp(LoadBalancingKMSClientProvider.java:175)
> at org.apache.hadoop.crypto.key.kms.LoadBalancingKMSClientProvider.decryptEncryptedKey(LoadBalancingKMSClientProvider.java:350)
> at org.apache.hadoop.crypto.key.KeyProviderCryptoExtension.decryptEncryptedKey(KeyProviderCryptoExtension.java:535)
> at org.apache.hadoop.hdfs.HdfsKMSUtil.decryptEncryptedDataEncryptionKey(HdfsKMSUtil.java:216)
> at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:1002)
> at org.apache.hadoop.hdfs.DFSClient.createWrappedOutputStream(DFSClient.java:983)
> at org.apache.hadoop.hdfs.DistributedFileSystem.safelyCreateWrappedOutputStream(DistributedFileSystem.java:734)
> at org.apache.hadoop.hdfs.DistributedFileSystem.access$300(DistributedFileSystem.java:149)
> at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:572)
> at org.apache.hadoop.hdfs.DistributedFileSystem$8.doCall(DistributedFileSystem.java:566)
> at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
> at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:580)
> at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:507)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1233)
> at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:1109)
> at org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat.getHiveRecordWriter(HiveIgnoreKeyTextOutputFormat.java:81)
> at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getRecordWriter(HiveFileFormatUtils.java:297)
> at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getHiveRecordWriter(HiveFileFormatUtils.java:282)
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketForFileIdx(FileSinkOperator.java:801)
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.createBucketFiles(FileSinkOperator.java:752)
> at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:922)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
> at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.forwardUDTFOutput(UDTFOperator.java:133)
> at org.apache.hadoop.hive.ql.udf.generic.UDTFCollector.collect(UDTFCollector.java:45)
> at org.apache.hadoop.hive.ql.udf.generic.GenericUDTF.forward(GenericUDTF.java:110)
> at org.apache.hadoop.hive.ql.udf.generic.GenericUDTFInline.process(GenericUDTFInline.java:64)
> at org.apache.hadoop.hive.ql.exec.UDTFOperator.process(UDTFOperator.java:116)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:926)
> at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:95)
> at org.apache.hadoop.hive.ql.exec.Operator.baseForward(Operator.java:993)
> at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:939)
> at org.apache.hadoop.hive.ql.exec.TableScanOperator.process(TableScanOperator.java:125)
> at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:154)
> at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:556)
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.processRow(MapRecordSource.java:92){code}