Subject: Re: Structured Streaming Initial Listing Issue
Hello.
AFAIK the problem is not limited to streaming and can't be mitigated by
raising only maxResultSize for such input. Spark has to bring all file paths
into driver memory even in the streaming case (
https://github.com/apache/spark/blob/37028fafc4f9fc873195a88f0840ab69edcf9d2b/sql/core/src/main/scala
That 1073.3 MiB isn't much bigger than spark.driver.maxResultSize.
Can't you just increase that config to a larger value?
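
A minimal sketch of what that could look like in PySpark. It assumes the limit was left at Spark's default of 1g (the thread does not state the configured value), and the "2g" replacement and the helper names `driver_conf` and `build_session` are purely illustrative:

```python
# Assumption: spark.driver.maxResultSize was left at its default of 1g.
DEFAULT_MAX_RESULT_SIZE_MIB = 1024.0
# Figure quoted in this thread.
reported_result_size_mib = 1073.3

# How far the serialized result overshoots the assumed limit.
overshoot_mib = reported_result_size_mib - DEFAULT_MAX_RESULT_SIZE_MIB


def driver_conf(max_result_size: str = "2g") -> dict:
    """Config entries to hand to SparkSession.builder.config(key, value)."""
    return {"spark.driver.maxResultSize": max_result_size}


def build_session(app_name: str = "initial-listing-test"):
    """Create a session with the raised limit (hypothetical app name)."""
    # Imported lazily so the arithmetic above runs without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in driver_conf().items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


print(f"overshoot: {overshoot_mib:.1f} MiB")
```

The setting has to be in place before the session is created, for example here or via `--conf` on spark-submit.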
/ Wei
On Wed, Apr 16, 2025 at 03:37, Anastasiia Sokhova wrote:
> Dear Spark Community,
>
> I run a Structured Streaming Query to read json files from S3 into an
> Iceberg table
Dear Spark Community,
I run a Structured Streaming Query to read json files from S3 into an Iceberg
table. This is my query:
```python
stream_reader = (
spark_session.readStream.format("json")
.schema(schema)
.option("maxFilesPerTrigger", 256_000)
.option("basePath", f"s3a://tes