[
https://issues.apache.org/jira/browse/HIVE-25303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sai Hemanth Gantasala updated HIVE-25303:
-----------------------------------------
Description:
Under legacy table creation mode (hive.create.as.external.legacy=true), when a
database has been created with a specific LOCATION and a session is using that
database, a table created with the following command:
{code:java}
CREATE TABLE <tablename> AS SELECT <select statement>{code}
should inherit the HDFS path from the database's location. Instead, Hive is
trying to write the table data into
/warehouse/tablespace/managed/hive/<database_directory_name>/<table_name>
+Design+:
In a CTAS query, the data is first written to the target directory (this
happens in HS2) and then the table is created (this happens in HMS). Two
decisions are therefore being made: i) the target directory location and ii)
how the table should be created (table type, storage descriptor (sd), etc.).
When HS2 needs the target location, it makes a create-table dry-run call to
HMS (where table translation happens). Decisions i) and ii) are made within
HMS, which returns a table object. HS2 then uses the location set by HMS for
placing the data.
The patch for this issue addresses the table location being incorrect and the
table data being empty in the following cases: 1) when the external legacy
config is set, i.e., hive.create.as.external.legacy=true; 2) when the table is
created with the transactional property set to false, i.e., TBLPROPERTIES
('transactional'='false').
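The design and the two cases above can be sketched roughly as follows. This is a simplified, hypothetical model and not the code from the patch: the class CtasLocationSketch, its methods, and the example paths are made up for illustration, and treating the resulting table as external is an assumption. It only shows HS2 asking for the decisions first and then writing to the returned location.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical, simplified model of the CTAS flow described above.
// None of these classes or methods are Hive APIs.
final class CtasLocationSketch {

    // Managed warehouse root as it appears in the path from the report.
    static final String MANAGED_ROOT = "/warehouse/tablespace/managed/hive";

    // Result of the modeled dry run: decision i) location, decision ii) table type.
    static final class TableDecision {
        final String tableType;
        final String location;

        TableDecision(String tableType, String location) {
            this.tableType = tableType;
            this.location = location;
        }
    }

    // Models the HMS side: translate the request and return the decisions
    // without creating anything.
    static TableDecision dryRunCreate(String dbLocation, String dbDirName, String tableName,
                                      boolean legacyExternal, Map<String, String> tblProps) {
        boolean nonTransactional = "false".equalsIgnoreCase(tblProps.get("transactional"));
        if (legacyExternal || nonTransactional) {
            // The two cases named in the report: inherit the database location.
            // Treating the result as an external table is an assumption.
            return new TableDecision("EXTERNAL_TABLE", dbLocation + "/" + tableName);
        }
        // Assumed default for this sketch: a managed table under the managed root.
        return new TableDecision("MANAGED_TABLE",
                MANAGED_ROOT + "/" + dbDirName + "/" + tableName);
    }

    // Models the HS2 side: ask for the decisions first, then write data there.
    static String ctasTargetDirectory(String dbLocation, String dbDirName, String tableName,
                                      boolean legacyExternal, Map<String, String> tblProps) {
        TableDecision decision =
                dryRunCreate(dbLocation, dbDirName, tableName, legacyExternal, tblProps);
        return decision.location; // HS2 would write the SELECT output here
    }

    public static void main(String[] args) {
        Map<String, String> none = Collections.emptyMap();
        Map<String, String> nonAcid = Collections.singletonMap("transactional", "false");
        // Case 1: legacy external mode is on.
        System.out.println(ctasTargetDirectory("/data/sales", "sales.db", "t1", true, none));
        // Case 2: TBLPROPERTIES ('transactional'='false').
        System.out.println(ctasTargetDirectory("/data/sales", "sales.db", "t1", false, nonAcid));
        // Neither case applies: the sketch falls back to the managed path.
        System.out.println(ctasTargetDirectory("/data/sales", "sales.db", "t1", false, none));
    }
}
```

Because the location comes from a single dry-run decision, the directory HS2 writes to and the location recorded for the table cannot diverge in this model, which is the property the design is after.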
was:
Under legacy table creation mode (hive.create.as.external.legacy=true), when a
database has been created with a specific LOCATION and a session is using that
database, a table created with the following command:
{code:java}
CREATE TABLE <tablename> AS SELECT <select statement>{code}
should inherit the HDFS path from the database's location. Instead, Hive is
trying to write the table data into
/warehouse/tablespace/managed/hive/<database_directory_name>/<table_name>
+Design+:
In a CTAS query, the data is first written to the target directory (this
happens in HS2) and then the table is created (this happens in HMS). Two
decisions are therefore being made: i) the target directory location and ii)
how the table should be created (table type, storage descriptor (sd), etc.).
When HS2 needs the target location, it makes a create-table dry-run call to
HMS (where table translation happens). Decisions i) and ii) are made within
HMS, which returns a table object. HS2 then uses the location set by HMS for
placing the data.
The patch for issue addresses the table location
> CTAS hive.create.as.external.legacy tries to place data files in managed WH
> path
> --------------------------------------------------------------------------------
>
> Key: HIVE-25303
> URL: https://issues.apache.org/jira/browse/HIVE-25303
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2, Standalone Metastore
> Reporter: Sai Hemanth Gantasala
> Assignee: Sai Hemanth Gantasala
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Time Spent: 3.5h
> Remaining Estimate: 0h