[
https://issues.apache.org/jira/browse/SPARK-38330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
André F. updated SPARK-38330:
-----------------------------
Description:
Trying to run any job after bumping our Spark version from 3.1.2 to 3.2.1 leads to the following exception while reading files on S3:
{code:java}
org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on s3a://<bucket>/<path>.parquet: com.amazonaws.SdkClientException: Unable to execute HTTP request: Certificate for <bucket.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]: Unable to execute HTTP request: Certificate for <bucket> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:208)
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:170)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3351)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3185)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.isDirectory(S3AFileSystem.java:4277)
	at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:54)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:274)
	at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:245)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:245)
	at org.apache.spark.sql.DataFrameReader.parquet(DataFrameReader.scala:596)
{code}
{code:java}
Caused by: javax.net.ssl.SSLPeerUnverifiedException: Certificate for <bucket.s3.amazonaws.com> doesn't match any of the subject alternative names: [*.s3.amazonaws.com, s3.amazonaws.com]
	at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.verifyHostname(SSLConnectionSocketFactory.java:507)
	at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.createLayeredSocket(SSLConnectionSocketFactory.java:437)
	at com.amazonaws.thirdparty.apache.http.conn.ssl.SSLConnectionSocketFactory.connectSocket(SSLConnectionSocketFactory.java:384)
	at com.amazonaws.thirdparty.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
	at com.amazonaws.thirdparty.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
	at sun.reflect.GeneratedMethodAccessor36.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
	at com.amazonaws.http.conn.$Proxy16.connect(Unknown Source)
	at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
	at com.amazonaws.thirdparty.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
	at com.amazonaws.thirdparty.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	at com.amazonaws.thirdparty.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	at com.amazonaws.thirdparty.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
	at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1333)
	at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
{code}
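For context on the `SSLPeerUnverifiedException` above: TLS hostname verification applies the usual wildcard rule (RFC 6125), where `*` in `*.s3.amazonaws.com` covers exactly one DNS label. That is why HADOOP-17017 associates this exact message with bucket names containing dots. A minimal sketch of that matching rule, using a hypothetical `san_matches` helper for illustration (this is not the AWS SDK's actual code):

```python
def san_matches(pattern: str, hostname: str) -> bool:
    """Return True if the certificate SAN `pattern` covers `hostname`.

    A `*` label matches exactly one DNS label, per the common
    RFC 6125 interpretation used by hostname verifiers.
    """
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    # A wildcard never spans more than one label, so label counts must agree.
    if len(p_labels) != len(h_labels):
        return False
    return all(p == "*" or p == h for p, h in zip(p_labels, h_labels))


# The wildcard covers a plain bucket name...
print(san_matches("*.s3.amazonaws.com", "bucket.s3.amazonaws.com"))      # True
# ...but a dotted bucket name adds an extra label and fails the check.
print(san_matches("*.s3.amazonaws.com", "my.bucket.s3.amazonaws.com"))   # False
```

Since we don't use dots in our bucket names, `<bucket>.s3.amazonaws.com` should satisfy this rule, which makes the verification failure after the 3.2.1 bump surprising.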
We found similar problems reported in the following tickets, but neither matches our case:
- https://issues.apache.org/jira/browse/HADOOP-17017 (we don't use `.` in our bucket names)
- [https://github.com/aws/aws-sdk-java-v2/issues/1786] (we tried to override it by building Spark with `httpclient:4.5.10` or `httpclient:4.5.8`, with no effect; we also made sure we are using the same `httpclient` version in our main jar)
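One mitigation we could try (an assumption on our side, not a verified fix for this bug): forcing S3A path-style access, which keeps the bucket name out of the TLS hostname so the certificate's `*.s3.amazonaws.com` SAN is never asked to cover `<bucket>.s3.amazonaws.com`:

```
# spark-defaults.conf fragment (hypothetical mitigation, untested here).
# Path-style access requests https://s3.amazonaws.com/<bucket>/... instead of
# https://<bucket>.s3.amazonaws.com/..., so the hostname always matches the SAN.
spark.hadoop.fs.s3a.path.style.access  true
```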
> Certificate doesn't match any of the subject alternative names:
> [*.s3.amazonaws.com, s3.amazonaws.com]
> ------------------------------------------------------------------------------------------------------
>
> Key: SPARK-38330
> URL: https://issues.apache.org/jira/browse/SPARK-38330
> Project: Spark
> Issue Type: Bug
> Components: EC2
> Affects Versions: 3.2.1
> Environment: Spark 3.2.1 built with `hadoop-cloud` flag.
> Direct access to s3 using default file committer.
> JDK8.
>
> Reporter: André F.
> Priority: Major
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]