This is not necessarily an issue with the readStream / read API. As long as you have correctly imported the needed dependencies and set up the Spark config, you should be able to readStream from an S3 path.
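As a rough sketch of the kind of setup I mean (the hadoop-aws package version, endpoint, credentials, file format, and the `input/` prefix below are all placeholders you would replace with your own values):

```python
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType

spark = (
    SparkSession.builder
    .appName("minio-readstream")
    # hadoop-aws provides S3AFileSystem; the version must match your Hadoop build
    .config("spark.jars.packages", "org.apache.hadoop:hadoop-aws:3.3.4")
    .config("spark.hadoop.fs.s3a.endpoint", "http://minio:9000")   # placeholder
    .config("spark.hadoop.fs.s3a.access.key", "ACCESS_KEY")        # placeholder
    .config("spark.hadoop.fs.s3a.secret.key", "SECRET_KEY")        # placeholder
    # MinIO is usually addressed with path-style access rather than virtual hosts
    .config("spark.hadoop.fs.s3a.path.style.access", "true")
    .getOrCreate()
)

tempschema = StructType([StructField("value", StringType(), True)])

# Use the s3a:// scheme (s3:// has no registered FileSystem here), and point
# at a prefix under the bucket rather than the bare bucket root
initDF = (
    spark.readStream
    .schema(tempschema)
    .format("json")                  # assumption: adjust to your file format
    .load("s3a://bucketname/input/")
)
```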
See https://stackoverflow.com/questions/46740670/no-filesystem-for-scheme-s3-with-pyspark

Kleckner, Jade <jade.kleck...@ipp.mpg.de> wrote on Tue, Aug 5, 2025 at 10:21:
> Hello all,
>
> I'm developing a pipeline to possibly read a stream from a MinIO bucket.
> I have no issues setting Hadoop s3a variables and reading files, but when I
> try to create a bucket for Spark to use as a readStream location it
> produces the following errors:
>
> Example code:
> initDF = spark.readStream.schema(tempschema).option("path", "s3://bucketname").load()
>
> The below I have used for the bucket path:
>
> s3 -> py4j.protocol.Py4JJavaError: An error occurred while calling o436.load.
> : org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3"
>
> s3a -> pyspark.errors.exceptions.captured.IllegalArgumentException: path must be absolute
>
> Absolute path -> pyspark.errors.exceptions.captured.UnsupportedOperationException: None
>
> I'm curious if readStream has any support for s3 buckets at all? Any
> help/guidance would be appreciated, thank you for your time.
>
> Sincerely,
>
> Jade Kleckner