Hi all,

If my understanding is correct, Spark now supports setting some options on the
Hadoop configuration instance via the read/write.option(..) API.
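
Just to make sure I am describing the same thing, here is a rough sketch of
what I mean (the Hadoop key, the path, and the session setup below are only
for illustration):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("hadoop-conf-vs-options")
      .getOrCreate()

    // One way: set a key directly on the shared Hadoop configuration instance.
    spark.sparkContext.hadoopConfiguration
      .set("mapreduce.fileoutputcommitter.algorithm.version", "2")

    // Another way: pass a key as a per-write option; as far as I know, options
    // given here are copied into the Hadoop configuration used for this write.
    spark.range(10).toDF("id").write
      .mode("overwrite")
      .option("mapreduce.fileoutputcommitter.algorithm.version", "2")
      .parquet("/tmp/hadoop_conf_vs_options")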

However, I recently saw some comments and opinions about this. If I
understood them correctly, they were as below:

   - Respect all of the configurations in the Hadoop configuration instance,
     including the ones that Spark configurations/options also cover, and let
     the Spark configurations/options override them when equivalents are set
     in both places.

   - Do not respect the configurations in the instance that Spark
     configurations/options cover, meaning the Spark default values are used
     regardless of what is set in the instance for the equivalent keys.

For example, Spark currently supports compression for ORC as an option, but
we are not respecting orc.compress; it is simply being ignored (i.e., the
latter case).
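
Concretely, I mean something like this (reusing the session from the snippet
above; the path is only an example):

    // orc.compress is set on the Hadoop configuration instance ...
    spark.sparkContext.hadoopConfiguration.set("orc.compress", "SNAPPY")

    // ... but, as described above, the write below ignores it and falls back
    // to Spark's default ORC codec; only the Spark-side option is respected.
    spark.range(10).toDF("id").write
      .mode("overwrite")
      .option("compression", "zlib")   // this one takes effect
      .orc("/tmp/orc_compress_example")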

Maybe I understood those comments and opinions wrongly and this might be a
dumb question coming from my misunderstanding, but I would really appreciate
it if anyone could help me figure out which one is correct.

Thanks!
