dongjoon-hyun commented on PR #249:
URL: https://github.com/apache/spark-kubernetes-operator/pull/249#issuecomment-2980831031
> > @melin , it's a warning, not an error.
> > > The Spark event log cannot be directly written to s3. There are the following errors:
> >
> > FYI, Apache Spark 3.2+ has the following by default.
> >
> > * [[SPARK-35868][CORE] Add fs.s3a.downgrade.syncable.exceptions if not set spark#33044](https://github.com/apache/spark/pull/33044)
>
> spark.eventLog.dir cannot be directly set to an s3 path, which is not very convenient. There are two solutions:
>
> 1. Write directly to the nfs shared storage
> 2. First, write to the pod locally. Once the task is completed, upload it to s3.

It seems that the above are not questions. So, to be clear: no, your personal assessments and proposed solutions sound wrong to me, @melin. They don't make sense at all for long-running jobs and streaming jobs. In other words, that's not a community recommendation. The Apache Spark community has been using S3 directly as an Apache Spark event log directory for a long time (see the configuration sketch below).

BTW, an Apache Spark PR is not a good place for this kind of Q&A. I'd recommend sending this to [[email protected]](https://lists.apache.org/[email protected]) if you have any difficulties or want to discuss it further.
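For context, here is a minimal sketch of the direct-to-S3 event log setup described above. The bucket name is a placeholder, and it assumes `hadoop-aws` and valid S3 credentials are already available to the cluster. On Spark 3.2+ the SPARK-35868 setting is applied by default, so the explicit line is only relevant for older versions.

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch: writing the Spark event log directly to S3 via the
// s3a connector. "my-bucket" is a hypothetical bucket; hadoop-aws and
// AWS credentials are assumed to be configured for the environment.
object EventLogToS3Example {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("event-log-to-s3-example")
      .config("spark.eventLog.enabled", "true")
      // Event log directory pointed straight at an S3 path.
      .config("spark.eventLog.dir", "s3a://my-bucket/spark-events/")
      // Spark 3.2+ sets this Hadoop property to true by default
      // (SPARK-35868); shown explicitly only for older versions, where
      // S3A's lack of Syncable hflush/hsync support would otherwise
      // surface as exceptions while the event log is being written.
      .config("spark.hadoop.fs.s3a.downgrade.syncable.exceptions", "true")
      .getOrCreate()

    spark.range(10).count() // trivial job so an event log gets written
    spark.stop()
  }
}
```

With this configuration the Spark History Server can read completed and in-progress logs from the same `s3a://` directory, without any NFS staging or copy-on-completion step.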
