Hopefully, this StackOverflow answer can solve your problem:
https://stackoverflow.com/questions/47523037/how-do-i-configure-pyspark-to-write-to-hdfs-by-default

Spark itself doesn't control how an unqualified path is resolved; that is
delegated to the Hadoop FileSystem layer and decided by configuration such
as fs.defaultFS (from core-site.xml or the equivalent spark.hadoop.*
properties).
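
For example, here is a minimal PySpark sketch of pinning the default
filesystem at session startup. The NameNode host and port are placeholders,
and this assumes fs.defaultFS set through spark.hadoop.* is honored when
the database location is qualified:

    from pyspark.sql import SparkSession

    # spark.hadoop.* properties are forwarded to the Hadoop Configuration,
    # so this sets fs.defaultFS for the session. Replace namenode:8020
    # with your cluster's actual NameNode address.
    spark = (
        SparkSession.builder
        .appName("hdfs-default-fs")
        .config("spark.hadoop.fs.defaultFS", "hdfs://namenode:8020")
        .getOrCreate()
    )

    # With fs.defaultFS pointing at HDFS, the unqualified location
    # resolves to hdfs://namenode:8020/user/hive instead of file:/...
    spark.sql("CREATE DATABASE test LOCATION '/user/hive'")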

On Tue, Jan 11, 2022 at 3:03 AM Pablo Langa Blanco <soy...@gmail.com> wrote:

> Hi Pralabh,
>
> If it helps, it is probably related to this change
> https://github.com/apache/spark/pull/28527
>
> Regards
>
> On Mon, Jan 10, 2022 at 10:42 AM Pralabh Kumar <pralabhku...@gmail.com>
> wrote:
>
>> Hi Spark Team
>>
>> When creating a database via Spark 3.0 on Hive
>>
>> 1) spark.sql("create database test location '/user/hive'") creates the
>> database location on HDFS, as expected.
>>
>> 2) When running the same command on 3.1, the database is created on the
>> local file system by default. I have to prefix the path with hdfs:// to
>> create the database on HDFS (see the sketch below).
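>>
>> A minimal sketch of that workaround (the NameNode host and port below
>> are placeholders for the cluster's actual values):
>>
>>     # Spark 3.1: an unqualified path resolves against the default
>>     # filesystem, so the HDFS URI has to be spelled out explicitly.
>>     spark.sql(
>>         "CREATE DATABASE test LOCATION 'hdfs://namenode:8020/user/hive'"
>>     )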
>>
>> Why is there a difference in behavior? Can you please point me to the
>> JIRA that caused this change?
>>
>> Note: spark.sql.warehouse.dir and hive.metastore.warehouse.dir both have
>> their default values (neither is explicitly set).
>>
>> Regards
>> Pralabh Kumar
>>
>
