Hi list,
I am having an issue distributing a pandas_udf to my workers.
I'm using Spark 2.4.1 in standalone mode.
*Submit*:
- via SparkLauncher as a separate process. I do add the py-files with the
self-executable zip (with a .py extension) before launching the application.
- The whole applica
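For reference, one way to build such a py-files archive so that workers can import the pandas_udf's module is sketched below. This is a minimal, hedged sketch: the directory layout and names (`build_pyfiles_zip`, `src_dir`) are illustrative, not taken from the original setup.

```python
import os
import zipfile

def build_pyfiles_zip(src_dir, out_path):
    """Zip every .py file under src_dir, preserving the package layout,
    so the archive can be shipped to workers via --py-files."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in files:
                if name.endswith(".py"):
                    full = os.path.join(root, name)
                    # Store paths relative to src_dir so `import pkg.mod`
                    # resolves the same way on the workers.
                    zf.write(full, os.path.relpath(full, src_dir))
    return out_path
```

The resulting archive can then be registered before launch, e.g. with SparkLauncher.addPyFile(...) or spark-submit --py-files, so the module defining the pandas_udf is importable on every executor.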
I have Spark code that writes a batch to Kafka as specified here:
https://spark.apache.org/docs/2.4.0/structured-streaming-kafka-integration.html
The code looks like the following:
df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)") \
  .write \
  .format("kafka") \
  .option("kafka.bootstrap.servers", "host1:port1,host2:port2") \
  .option("topic", "topic1") \
  .save()