[ https://issues.apache.org/jira/browse/SPARK-30411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010799#comment-17010799 ]
Sanket Reddy commented on SPARK-30411:
--------------------------------------

Setting spark.sql.orc.impl=hive does not work.

[~yumwang] [~hyukjin.kwon] do we want to revisit [PR-22078|https://github.com/apache/spark/pull/22078#issuecomment-458851287], or is there a good reason not to proceed with it given my concerns above? I could make an internal patch, but I wanted to hear your thoughts first.

> saveAsTable does not honor spark.hadoop.hive.warehouse.subdir.inherit.perms
> ---------------------------------------------------------------------------
>
>                 Key: SPARK-30411
>                 URL: https://issues.apache.org/jira/browse/SPARK-30411
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.4.4
>            Reporter: Sanket Reddy
>            Priority: Minor
>
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp | grep my_databases
> drwxr-x--T - redsanket users 0 2019-12-04 20:15 /tmp/my_databases
> {code}
> {code}
> >>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc");
> {code}
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
> drwxr-x--T - redsanket users 0 2019-12-04 20:20 /tmp/my_databases/example
> {code}
> Now after {{saveAsTable}}:
> {code}
> >>> data = [('First', 1), ('Second', 2), ('Third', 3), ('Fourth', 4), ('Fifth', 5)]
> >>> df = spark.createDataFrame(data)
> >>> df.write.format("orc").mode('overwrite').saveAsTable('redsanket_db.example')
> {code}
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
> drwx------ - redsanket users 0 2019-12-04 20:23 /tmp/my_databases/example
> {code}
> This overwrites the permissions of the table directory.
> {{insertInto}}, by contrast, preserves the parent directory permissions.
> {code}
> >>> spark.sql("DROP table redsanket_db.example");
> DataFrame[]
> >>> spark.sql("CREATE TABLE redsanket_db.example(bcookie string, ip int) STORED AS orc");
> DataFrame[]
> >>> df.write.format("orc").insertInto('redsanket_db.example')
> {code}
> {code}
> -bash-4.2$ hdfs dfs -ls /tmp/my_databases | grep example
> drwxr-x--T - redsanket users 0 2019-12-04 20:43 /tmp/my_databases/example
> {code}
> Either this is a limitation of the API that depends on the write mode, in which
> case the behavior has to be documented, or it needs to be fixed.
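To make the difference between the listings in the quoted issue easier to compare, here is a minimal sketch that converts an ls-style permission string into an octal mode and reports which bits were dropped. The helper names mode_from_ls and lost_bits are hypothetical; they are not part of Spark or Hadoop and only parse the text that `hdfs dfs -ls` printed in the report.

```python
def mode_from_ls(perm):
    """Convert an ls-style permission string (e.g. 'drwxr-x--T') to an int mode.

    Hypothetical helper for comparing listings; not a Spark or Hadoop API.
    """
    if len(perm) != 10:
        raise ValueError("expected 10 characters, got %r" % perm)
    mode = 0
    for i, ch in enumerate(perm[1:]):
        bit = 1 << (8 - i)
        if i % 3 < 2:
            # read/write positions: any letter means the bit is set
            if ch != "-":
                mode |= bit
        elif ch in "xst":
            # execute position: lowercase letters imply execute is set,
            # uppercase 'S'/'T' mean the special bit is set without execute
            mode |= bit
    if perm[3] in "sS":
        mode |= 0o4000  # setuid
    if perm[6] in "sS":
        mode |= 0o2000  # setgid
    if perm[9] in "tT":
        mode |= 0o1000  # sticky bit
    return mode


def lost_bits(before, after):
    """Return the permission bits present in `before` but missing in `after`."""
    return mode_from_ls(before) & ~mode_from_ls(after)


# Strings copied from the listings in the issue description.
before = "drwxr-x--T"  # after CREATE TABLE and after insertInto
after = "drwx------"   # after mode('overwrite').saveAsTable

print(oct(mode_from_ls(before)))      # 0o1750
print(oct(mode_from_ls(after)))       # 0o700
print(oct(lost_bits(before, after)))  # 0o1050
```

Read this way, the overwrite path dropped the group read and execute bits and the sticky bit (0o1050), which matches the change from drwxr-x--T to drwx------ shown in the report.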