[
https://issues.apache.org/jira/browse/SPARK-19881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Dongjoon Hyun resolved SPARK-19881.
-----------------------------------
Resolution: Won't Fix
As mentioned in the
[comment|https://github.com/apache/spark/pull/17223#issuecomment-286608743], we
will not set Hive conf values dynamically, in order to preserve session isolation.
{quote}
Since hive client is shared among all sessions, we can't set hive conf
dynamically, to keep session isolation. I think we should treat hive conf as
static sql conf, and throw exception when users try to change them.
{quote}
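To make the proposed direction concrete, here is a minimal sketch (hypothetical names such as `StaticConfGuard` and `staticHiveConfKeys`; not the actual Spark internals) of treating Hive conf keys as static SQL confs and rejecting runtime changes:
{code}
// Hypothetical sketch of the approach quoted above: keep Hive conf keys
// static and fail fast when a session tries to change them.
object StaticConfGuard {
  // Hypothetical key list; the real set would live in Spark's conf handling.
  private val staticHiveConfKeys: Set[String] = Set(
    "hive.exec.dynamic.partition.mode",
    "hive.exec.max.dynamic.partitions")

  def setConf(settings: scala.collection.mutable.Map[String, String],
              key: String,
              value: String): Unit = {
    if (staticHiveConfKeys.contains(key)) {
      throw new UnsupportedOperationException(
        s"Cannot modify the value of a static config: $key")
    }
    settings.put(key, value)
  }
}
{code}
With a guard like this, `SET hive.exec.max.dynamic.partitions=1001` would fail immediately instead of appearing to succeed while the shared Hive client keeps its old value.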
> Support Dynamic Partition Inserts params with SET command
> ---------------------------------------------------------
>
> Key: SPARK-19881
> URL: https://issues.apache.org/jira/browse/SPARK-19881
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.0.0, 2.1.0
> Reporter: Dongjoon Hyun
> Priority: Minor
>
> Since Spark 2.0.0, `SET` commands do not pass their values to `HiveClient`. In
> most cases Spark handles this well. However, for dynamic partition inserts,
> users run into the following misleading situation.
> {code}
> scala> spark.range(1001).selectExpr("id as key", "id as value").registerTempTable("t1001")
> scala> sql("create table p (value int) partitioned by (key int)").show
> scala> sql("insert into table p partition(key) select key, value from t1001")
> org.apache.spark.SparkException:
> Dynamic partition strict mode requires at least one static partition column.
> To turn this off set hive.exec.dynamic.partition.mode=nonstrict
> scala> sql("set hive.exec.dynamic.partition.mode=nonstrict")
> scala> sql("insert into table p partition(key) select key, value from t1001")
> org.apache.hadoop.hive.ql.metadata.HiveException:
> Number of dynamic partitions created is 1001, which is more than 1000.
> To solve this try to set hive.exec.max.dynamic.partitions to at least 1001.
> scala> sql("set hive.exec.max.dynamic.partitions=1001")
> scala> sql("set hive.exec.max.dynamic.partitions").show(false)
> +--------------------------------+-----+
> |key |value|
> +--------------------------------+-----+
> |hive.exec.max.dynamic.partitions|1001 |
> +--------------------------------+-----+
> scala> sql("insert into table p partition(key) select key, value from t1001")
> org.apache.hadoop.hive.ql.metadata.HiveException:
> Number of dynamic partitions created is 1001, which is more than 1000.
> To solve this try to set hive.exec.max.dynamic.partitions to at least 1001.
> {code}
> The last error is the same as the previous one: `HiveClient` never sees the
> new value 1001. There is no way to change the value of
> `hive.exec.max.dynamic.partitions` in `HiveClient` with the `SET` command.
> The root cause is that `hive` parameters are passed to `HiveClient` only when
> it is created. The workaround is therefore to pass `--hiveconf` when starting
> `spark-shell`, but the value still cannot be changed afterwards from inside
> `spark-shell`. We had better handle this case without misleading error
> messages that send users into an endless retry loop.
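> As a concrete illustration of the workaround above (a sketch, assuming the
> standard `spark.hadoop.*` passthrough into the Hadoop configuration from which
> the Hive client is created; `--hiveconf` serves the same purpose where the
> entry point accepts it):
> {code}
> # Pass the Hive settings at start-up, before HiveClient is created.
> spark-shell \
>   --conf spark.hadoop.hive.exec.dynamic.partition.mode=nonstrict \
>   --conf spark.hadoop.hive.exec.max.dynamic.partitions=1001
> {code}
> Because these values are fixed when the Hive client is created, they still
> cannot be changed later from inside the running session, which is exactly the
> limitation described above.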
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]