[ https://issues.apache.org/jira/browse/HADOOP-18839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-18839:
------------------------------------
    Summary: SSLException is raised after very long timeout "Unsupported or unrecognized SSL message"  (was: SSLException is raised after very long timeout)

> SSLException is raised after very long timeout "Unsupported or unrecognized SSL message"
> ----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-18839
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18839
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 3.3.4
>            Reporter: Maxim Martynov
>           Priority: Minor
>        Attachments: host.log, ssl.log
>
> I tried to connect from PySpark to Minio running in Docker.
> Installing PySpark and starting Minio:
> {code:bash}
> pip install pyspark==3.4.1
> docker run --rm -d --hostname minio --name minio -p 9000:9000 -p 9001:9001 \
>   -e MINIO_ACCESS_KEY=access \
>   -e MINIO_SECRET_KEY=Eevoh2wo0ui6ech0wu8oy3feiR3eicha \
>   -e MINIO_ROOT_USER=admin \
>   -e MINIO_ROOT_PASSWORD=iepaegaigi3ofa9TaephieSo1iecaesh \
>   bitnami/minio:latest
> docker exec minio mc mb test-bucket
> {code}
> Then create a Spark session:
> {code:python}
> from pyspark.sql import SparkSession
> spark = SparkSession.builder \
>     .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.3.4") \
>     .config("spark.hadoop.fs.s3a.endpoint", "localhost:9000") \
>     .config("spark.hadoop.fs.s3a.access.key", "access") \
>     .config("spark.hadoop.fs.s3a.secret.key", "Eevoh2wo0ui6ech0wu8oy3feiR3eicha") \
>     .config("spark.hadoop.fs.s3a.aws.credentials.provider", "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider") \
>     .getOrCreate()
> spark.sparkContext.setLogLevel("debug")
> {code}
> And try to access an object in the bucket:
> {code:python}
> import time
> begin = time.perf_counter()
> spark.read.format("csv").load("s3a://test-bucket/fake")
> end = time.perf_counter()
>
> py4j.protocol.Py4JJavaError: An error occurred while calling o40.load.
> : org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on s3a://test-bucket/fake: com.amazonaws.SdkClientException: Unable to execute HTTP request: Unsupported or unrecognized SSL message: Unable to execute HTTP request: Unsupported or unrecognized SSL message
> ...
> {code}
> [^ssl.log]
> {code:python}
> >>> print((end-begin)/60)
> 14.72387898775002
> {code}
> I waited almost *15 minutes* to get the exception from Spark. The reason was that I connected to the endpoint with {{fs.s3a.connection.ssl.enabled=true}}, while Minio is configured to listen for plain HTTP only.
> Is there any way to raise an exception immediately if an SSL connection cannot be established?
> If I pass a wrong endpoint, like {{localhos:9000}}, I get an exception like this in just 5 seconds:
> {code:java}
> : org.apache.hadoop.fs.s3a.AWSClientIOException: getFileStatus on s3a://test-bucket/fake: com.amazonaws.SdkClientException: Unable to execute HTTP request: test-bucket.localhos: Unable to execute HTTP request: test-bucket.localhos
> ...
> {code}
> [^host.log]
> {code:python}
> >>> print((end-begin)/60)
> 0.09500707178334172
> >>> end-begin
> 5.700424307000503
> {code}
> I know about options like {{fs.s3a.attempts.maximum}} and {{fs.s3a.retry.limit}}; setting them to 1 raises the exception almost immediately. But this does not look right.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
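[Editor's note] The fail-fast settings the reporter mentions can be collected into one place. A minimal sketch, assuming the reporter's setup: the `fs.s3a.connection.ssl.enabled`, `fs.s3a.attempts.maximum`, and `fs.s3a.retry.limit` keys are from the issue itself; the establish-timeout value and the `apply_conf` helper are illustrative assumptions, not a verified fix.

```python
# Hedged sketch: S3A settings that should make the client fail fast instead
# of retrying for ~15 minutes. Values are illustrative, not recommendations.
fail_fast_conf = {
    # Match the endpoint's real protocol: this Minio listens on plain HTTP.
    "spark.hadoop.fs.s3a.connection.ssl.enabled": "false",
    # Cut retries at both the AWS SDK layer and the S3A layer.
    "spark.hadoop.fs.s3a.attempts.maximum": "1",
    "spark.hadoop.fs.s3a.retry.limit": "1",
    # Bound a single connection attempt, in milliseconds (assumed value).
    "spark.hadoop.fs.s3a.connection.establish.timeout": "5000",
}

def apply_conf(builder, conf):
    """Apply each key/value pair to a SparkSession builder-like object."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder

# Usage (requires pyspark to be installed):
# from pyspark.sql import SparkSession
# spark = apply_conf(SparkSession.builder, fail_fast_conf).getOrCreate()
```

Values are strings because Hadoop configuration entries are string-typed; Spark passes them through to the Hadoop `Configuration` object unchanged.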