gsudhanshu opened a new issue, #9968:
URL: https://github.com/apache/hudi/issues/9968
I am using the Spark-Hadoop 3.3.2 bundle.
On IP 192.168.1.4x I have started the Spark master (port 7077) and a worker.
On IP 192.168.1.22y I have my webapp.py, which:
a. creates a Spark session (config below):
```
from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("dataHudi") \
    .master('spark://192.168.1.40:7077') \
    .config("spark.submit.deployMode", "client") \
    .config('spark.driver.bindAddress', '192.168.1.40') \
    .config('spark.driver.host', '192.168.1.40') \
    .config('spark.driver.port', '33037') \
    .config('spark.jars.packages', 'org.apache.hudi:hudi-spark3.3-bundle_2.12:0.13.1') \
    .config('spark.serializer', 'org.apache.spark.serializer.KryoSerializer') \
    .config('spark.sql.catalog.spark_catalog', 'org.apache.spark.sql.hudi.catalog.HoodieCatalog') \
    .config('spark.sql.extensions', 'org.apache.spark.sql.hudi.HoodieSparkSessionExtension') \
    .getOrCreate()
```
b. submits a write job:
```
spark_df = spark.createDataFrame(ingested_df)
spark_df.write \
    .format("org.apache.hudi") \
    .options(**hudi_options) \
    .mode("append") \
    .save(basePath_ID + "/" + f"{unique_filename}")
```
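The `hudi_options` dict is not shown above; for reference it is a standard set of Hudi write options, roughly along these lines (the table name and the record key / precombine / partition field names below are illustrative placeholders, not the actual schema):
```
# Illustrative sketch only -- field names and table name are placeholders.
hudi_options = {
    "hoodie.table.name": "dataHudi_table",
    "hoodie.datasource.write.recordkey.field": "uuid",
    "hoodie.datasource.write.precombine.field": "ts",
    "hoodie.datasource.write.partitionpath.field": "partition",
    "hoodie.datasource.write.operation": "upsert",
}
```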
When I check the web UIs at 192.168.1.4x:8080 and 192.168.1.4x:8081, the application shows as running, but executors keep exiting and restarting. When I then look at the executor stderr and stdout logs, I see that Spark is trying to connect to a port, e.g. 33037 (the value set for spark.driver.port in the config above), and the connection on that port fails.
When I run both Spark and the application on the same machine (IP 192.168.1.22y), everything works. But across the LAN it fails. We tried different configurations, changing spark.driver.bindAddress, spark.driver.host, spark.driver.port, etc., along the lines of the sketch below.
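For illustration, the variants we experimented with looked roughly like this sketch (addresses and ports are placeholders, not a confirmed fix): the intent was to point spark.driver.host at the machine the driver actually runs on and to pin the driver/block-manager ports so they can be opened between the two hosts.
```
from pyspark.sql import SparkSession

# Sketch only -- placeholder addresses/ports.
# Assumes the driver (webapp.py) runs on 192.168.1.22y and the
# master/worker run on 192.168.1.4x.
spark = (
    SparkSession.builder
    .appName("dataHudi")
    .master("spark://192.168.1.40:7077")
    .config("spark.submit.deployMode", "client")
    # Address the executors use to connect back to the driver:
    .config("spark.driver.host", "192.168.1.22y")
    # Local interface the driver binds to:
    .config("spark.driver.bindAddress", "0.0.0.0")
    # Fixed ports so they can be allowed through any firewall between the hosts:
    .config("spark.driver.port", "33037")
    .config("spark.blockManager.port", "33038")
    .getOrCreate()
)
```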
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]