[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=612403&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-612403 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 21/Jun/21 00:08 Start Date: 21/Jun/21 00:08 Worklog Time Spent: 10m Work Description: github-actions[bot] closed pull request #2089: URL: https://github.com/apache/hive/pull/2089 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 612403) Time Spent: 2h 50m (was: 2h 40m) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=610501&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-610501 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 14/Jun/21 07:59 Start Date: 14/Jun/21 07:59 Worklog Time Spent: 10m Work Description: github-actions[bot] commented on pull request #2089: URL: https://github.com/apache/hive/pull/2089#issuecomment-860291360 This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Feel free to reach out on the d...@hive.apache.org list if the patch is in need of reviews. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 610501) Time Spent: 2h 40m (was: 2.5h) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h 40m > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=582660&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-582660 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 14/Apr/21 16:12 Start Date: 14/Apr/21 16:12 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r613387158 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ## @@ -2419,9 +2419,9 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req) } if (!TableType.VIRTUAL_VIEW.toString().equals(tbl.getTableType())) { Review comment: only managed tables would have visibilityId populated -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 582660) Time Spent: 2.5h (was: 2h 20m) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2.5h > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=582658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-582658 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 14/Apr/21 16:11 Start Date: 14/Apr/21 16:11 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r613386562 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ## @@ -2419,9 +2419,9 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req) } if (!TableType.VIRTUAL_VIEW.toString().equals(tbl.getTableType())) { -if (tbl.getSd().getLocation() == null -|| tbl.getSd().getLocation().isEmpty()) { - tblPath = wh.getDefaultTablePath(db, tbl); +if (tbl.getSd().getLocation() == null || tbl.getSd().getLocation().isEmpty()) { + String relPath = tbl.getTableName() + (tbl.isSetTxnId() ? "_v" + tbl.getTxnId() : ""); + tblPath = wh.getDefaultTablePath(db, relPath, isExternal(tbl)); Review comment: moved this logic to the Warehouse.getDefaultTablePath -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 582658) Time Spent: 2h 20m (was: 2h 10m) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. 
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=582573&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-582573 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 14/Apr/21 14:54 Start Date: 14/Apr/21 14:54 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r613322059 ## File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift ## @@ -593,8 +593,8 @@ struct Table { 22: optional byte accessType, 23: optional list requiredReadCapabilities, 24: optional list requiredWriteCapabilities - 25: optional i64 id, // id of the table. It will be ignored if set. It's only for -// read purposed + 25: optional i64 id, // id of the table. It will be ignored if set. It's only for read purposes + 26: optional i64 txnId, // txnId associated with the table creation Review comment: That could work, but it would scatter table path generation logic. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 582573) Time Spent: 2h 10m (was: 2h) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h 10m > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=582391&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-582391 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 14/Apr/21 11:17 Start Date: 14/Apr/21 11:17 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r613157631 ## File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift ## @@ -593,8 +593,8 @@ struct Table { 22: optional byte accessType, 23: optional list requiredReadCapabilities, 24: optional list requiredWriteCapabilities - 25: optional i64 id, // id of the table. It will be ignored if set. It's only for -// read purposed + 25: optional i64 id, // id of the table. It will be ignored if set. It's only for read purposes + 26: optional i64 txnId, // txnId associated with the table creation Review comment: Sorry. My mistake. Instead of: > Why not just override the tableName? I wanted to ask: Why not just override the table location? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 582391) Time Spent: 2h (was: 1h 50m) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 2h > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=582376&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-582376 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 14/Apr/21 10:53 Start Date: 14/Apr/21 10:53 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r613143904 ## File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift ## @@ -593,8 +593,8 @@ struct Table { 22: optional byte accessType, 23: optional list requiredReadCapabilities, 24: optional list requiredWriteCapabilities - 25: optional i64 id, // id of the table. It will be ignored if set. It's only for -// read purposed + 25: optional i64 id, // id of the table. It will be ignored if set. It's only for read purposes + 26: optional i64 txnId, // txnId associated with the table creation Review comment: AFAIK, in multiple places, we are referencing the table by its name and not id, like when checking if a table exists. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 582376) Time Spent: 1h 50m (was: 1h 40m) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h 50m > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=582316&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-582316 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 14/Apr/21 09:21 Start Date: 14/Apr/21 09:21 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r613083121 ## File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift ## @@ -593,8 +593,8 @@ struct Table { 22: optional byte accessType, 23: optional list requiredReadCapabilities, 24: optional list requiredWriteCapabilities - 25: optional i64 id, // id of the table. It will be ignored if set. It's only for -// read purposed + 25: optional i64 id, // id of the table. It will be ignored if set. It's only for read purposes + 26: optional i64 txnId, // txnId associated with the table creation Review comment: > @pvary, could you please elaborate on that. We can't just override tableName if that's what you are thinking. Why not just override the tableName? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 582316) Time Spent: 1h 40m (was: 1.5h) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h 40m > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=582314&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-582314 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 14/Apr/21 09:19 Start Date: 14/Apr/21 09:19 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r613082011 ## File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift ## @@ -593,8 +593,8 @@ struct Table { 22: optional byte accessType, 23: optional list requiredReadCapabilities, 24: optional list requiredWriteCapabilities - 25: optional i64 id, // id of the table. It will be ignored if set. It's only for -// read purposed + 25: optional i64 id, // id of the table. It will be ignored if set. It's only for read purposes + 26: optional i64 txnId, // txnId associated with the table creation Review comment: @pvary, could you please elaborate on that. We can't just override tableName if that's what you are thinking. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 582314) Time Spent: 1.5h (was: 1h 20m) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=582310&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-582310 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 14/Apr/21 09:14 Start Date: 14/Apr/21 09:14 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r613076327 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ## @@ -2419,9 +2419,9 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req) } if (!TableType.VIRTUAL_VIEW.toString().equals(tbl.getTableType())) { -if (tbl.getSd().getLocation() == null -|| tbl.getSd().getLocation().isEmpty()) { - tblPath = wh.getDefaultTablePath(db, tbl); +if (tbl.getSd().getLocation() == null || tbl.getSd().getLocation().isEmpty()) { + String relPath = tbl.getTableName() + (tbl.isSetTxnId() ? "_v" + tbl.getTxnId() : ""); Review comment: I think, /v_ would be even more confusing + we would need to take care of the folder if 0 versions remained. Also when removing old style table path we would need to filter out new table version folders if any. >/.db//v_12/delta_13_15 >/.db//delta_1_11 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 582310) Time Spent: 1h 20m (was: 1h 10m) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h 20m > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
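The filtering concern raised in this comment can be illustrated with a small sketch. It is not Hive code: the class and method names are hypothetical, and the `v_<number>` pattern is an assumption based on the `v_12` example path quoted above. It shows that, with a nested layout, cleaning up an old-style table directory would have to skip any version folders living next to the old delta directories.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
final class NestedLayoutCleanupSketch {

  // Assumed naming of version folders in the nested layout, e.g. "v_12".
  private static final Pattern VERSION_DIR = Pattern.compile("v_\\d+");

  /**
   * Given the child names of a table directory, returns the entries that
   * belong to the old-style (unversioned) table and may be deleted.
   * Version folders are left untouched.
   */
  static List<String> oldStyleEntriesToDelete(List<String> children) {
    return children.stream()
        .filter(name -> !VERSION_DIR.matcher(name).matches())
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Mirrors the example above: an old-style delta next to a version folder.
    System.out.println(oldStyleEntriesToDelete(List.of("delta_1_11", "v_12")));
  }
}
```

With the flat `_v` suffix, the old and new table directories are siblings, so no such filter is needed.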
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=582309&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-582309 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 14/Apr/21 09:11 Start Date: 14/Apr/21 09:11 Worklog Time Spent: 10m Work Description: deniskuzZ commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r613076327 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ## @@ -2419,9 +2419,9 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req) } if (!TableType.VIRTUAL_VIEW.toString().equals(tbl.getTableType())) { -if (tbl.getSd().getLocation() == null -|| tbl.getSd().getLocation().isEmpty()) { - tblPath = wh.getDefaultTablePath(db, tbl); +if (tbl.getSd().getLocation() == null || tbl.getSd().getLocation().isEmpty()) { + String relPath = tbl.getTableName() + (tbl.isSetTxnId() ? "_v" + tbl.getTxnId() : ""); Review comment: I think, /v_ would be even more confusing + we would need to take care of the folder if 0 versions remained: >/.db//v_12/delta_13_15 >/.db//delta_1_11 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 582309) Time Spent: 1h 10m (was: 1h) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h 10m > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. 
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=578437&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578437 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 07/Apr/21 16:06 Start Date: 07/Apr/21 16:06 Worklog Time Spent: 10m Work Description: kgyrtkirk commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r608793866 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ## @@ -2419,9 +2419,9 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req) } if (!TableType.VIRTUAL_VIEW.toString().equals(tbl.getTableType())) { -if (tbl.getSd().getLocation() == null -|| tbl.getSd().getLocation().isEmpty()) { - tblPath = wh.getDefaultTablePath(db, tbl); +if (tbl.getSd().getLocation() == null || tbl.getSd().getLocation().isEmpty()) { + String relPath = tbl.getTableName() + (tbl.isSetTxnId() ? "_v" + tbl.getTxnId() : ""); + tblPath = wh.getDefaultTablePath(db, relPath, isExternal(tbl)); Review comment: can't we have a variation of this method which accepts a full "table" - and can decide the path ; and the externalability based on that...so that we don't spread the above line everywhere this method is invoked (I'm also working on a patch which invokes this method) ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ## @@ -2419,9 +2419,9 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req) } if (!TableType.VIRTUAL_VIEW.toString().equals(tbl.getTableType())) { Review comment: this conditional doesn't seem to only match "managed" tables - could that cause any issues? with the "translator" in place I think we are mangling the location at too many places... -- This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 578437) Time Spent: 1h (was: 50m) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
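The variation suggested in the first comment, a method that takes the whole table and decides both the relative path and whether the table is external, might look roughly like the sketch below. This is an assumption-laden illustration, not Hive's implementation: `TableInfo`, the two warehouse roots, the `EXTERNAL_TABLE` type check and the `<root>/<db>.db/<relPath>` layout are hypothetical stand-ins for `Table`, `Warehouse` and `isExternal(tbl)`. Only the `"_v" + txnId` suffix rule is taken from the diff quoted above.

```java
// Hypothetical sketch; none of these names are Hive APIs.
final class DefaultTablePathSketch {

  /** Minimal stand-in for the metastore Table object. */
  static final class TableInfo {
    final String dbName;
    final String tableName;
    final String tableType;
    final Long txnId; // null means "txnId not set"

    TableInfo(String dbName, String tableName, String tableType, Long txnId) {
      this.dbName = dbName;
      this.tableName = tableName;
      this.tableType = tableType;
      this.txnId = txnId;
    }
  }

  // Assumed warehouse roots; real deployments configure these.
  static final String MANAGED_ROOT = "/warehouse/managed";
  static final String EXTERNAL_ROOT = "/warehouse/external";

  static boolean isExternal(TableInfo tbl) {
    return "EXTERNAL_TABLE".equals(tbl.tableType);
  }

  /** Same rule as the diff: append "_v<txnId>" only when a txnId is set. */
  static String relPath(TableInfo tbl) {
    return tbl.tableName + (tbl.txnId != null ? "_v" + tbl.txnId : "");
  }

  /** Single entry point, so callers never build the suffix themselves. */
  static String defaultTablePath(TableInfo tbl) {
    String root = isExternal(tbl) ? EXTERNAL_ROOT : MANAGED_ROOT;
    return root + "/" + tbl.dbName + ".db/" + relPath(tbl);
  }

  public static void main(String[] args) {
    TableInfo managed = new TableInfo("sales", "tab_acid", "MANAGED_TABLE", 15L);
    System.out.println(defaultTablePath(managed));
  }
}
```

Keeping the suffix rule behind one method is what lets other callers of the path helper pick up the new layout without repeating the `relPath` expression.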
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=578345&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578345 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 07/Apr/21 13:04 Start Date: 07/Apr/21 13:04 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r608632826 ## File path: standalone-metastore/metastore-common/src/main/thrift/hive_metastore.thrift ## @@ -593,8 +593,8 @@ struct Table { 22: optional byte accessType, 23: optional list requiredReadCapabilities, 24: optional list requiredWriteCapabilities - 25: optional i64 id, // id of the table. It will be ignored if set. It's only for -// read purposed + 25: optional i64 id, // id of the table. It will be ignored if set. It's only for read purposes + 26: optional i64 txnId, // txnId associated with the table creation Review comment: Quick question: Why not do the whole stuff on client side, and without the HMS API change? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 578345) Time Spent: 50m (was: 40m) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=578344&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578344 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 07/Apr/21 13:03 Start Date: 07/Apr/21 13:03 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r608632190 ## File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java ## @@ -2419,9 +2419,9 @@ private void create_table_core(final RawStore ms, final CreateTableRequest req) } if (!TableType.VIRTUAL_VIEW.toString().equals(tbl.getTableType())) { -if (tbl.getSd().getLocation() == null -|| tbl.getSd().getLocation().isEmpty()) { - tblPath = wh.getDefaultTablePath(db, tbl); +if (tbl.getSd().getLocation() == null || tbl.getSd().getLocation().isEmpty()) { + String relPath = tbl.getTableName() + (tbl.isSetTxnId() ? "_v" + tbl.getTxnId() : ""); Review comment: What does the new location look like? ``` /.db/_v ``` This might get mixed up if there are old tables, or tables created with `hive.txn.nonblocking.droptable.enabled=false`, whose name ends with _v1234. Small chance, but... What about using directories instead, like: ``` /.db//v_ ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 578344) Time Spent: 40m (was: 0.5h) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
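The name clash described in this comment can be reproduced with plain string handling. The sketch below is illustrative only: `flatLayout` and `nestedLayout` are hypothetical helpers, and the table name `orders` is made up. `flatLayout` follows the `"_v" + txnId` rule from the diff, while `nestedLayout` follows the `v_` directory alternative proposed here.

```java
// Hypothetical sketch of the two candidate layouts.
final class SuffixCollisionSketch {

  /** Flat layout from the diff: "<table>_v<txnId>", or "<table>" when no txnId is set. */
  static String flatLayout(String tableName, Long txnId) {
    return txnId == null ? tableName : tableName + "_v" + txnId;
  }

  /** Directory layout proposed in the comment: "<table>/v_<txnId>". */
  static String nestedLayout(String tableName, Long txnId) {
    return txnId == null ? tableName : tableName + "/v_" + txnId;
  }

  public static void main(String[] args) {
    // An old-style table whose own name happens to end with _v1234 ...
    String legacyFlat = flatLayout("orders_v1234", null);
    // ... and a new-style table "orders" created by transaction 1234.
    String suffixedFlat = flatLayout("orders", 1234L);
    System.out.println(legacyFlat.equals(suffixedFlat)); // true: both are "orders_v1234"

    String legacyNested = nestedLayout("orders_v1234", null);
    String suffixedNested = nestedLayout("orders", 1234L);
    System.out.println(legacyNested.equals(suffixedNested)); // false: "orders_v1234" vs "orders/v_1234"
  }
}
```

The flat layout collides only in the unlikely case described above; the nested layout avoids it at the cost of an extra directory level.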
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=578339&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578339 ] ASF GitHub Bot logged work on HIVE-24906: - Author: ASF GitHub Bot Created on: 07/Apr/21 12:56 Start Date: 07/Apr/21 12:56 Worklog Time Spent: 10m Work Description: pvary commented on a change in pull request #2089: URL: https://github.com/apache/hive/pull/2089#discussion_r608627251 ## File path: ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java ## @@ -3239,4 +3247,72 @@ public void testFullTableReadLock() throws Exception { checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "tab_acid", null, locks); checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "tab_not_acid", null, locks); } -} + + @Test + public void testNonBlockingDropAndReCreateTable() throws Exception { +dropTable(new String[] {"tab_acid"}); + +conf.setBoolVar(HiveConf.ConfVars.HIVE_TXN_NON_BLOCKING_DROP_TABLE, true); +driver2.getConf().setBoolVar( HiveConf.ConfVars.HIVE_TXN_NON_BLOCKING_DROP_TABLE, true); + +driver.run("create table if not exists tab_acid (a int, b int) partitioned by (p string) " + + "stored as orc TBLPROPERTIES ('transactional'='true')"); +driver.run("insert into tab_acid partition(p) (a,b,p) values(1,2,'foo'),(3,4,'bar')"); + +driver.compileAndRespond("select * from tab_acid"); + +DbTxnManager txnMgr2 = (DbTxnManager) TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf); +swapTxnManager(txnMgr2); +driver2.run("drop table if exists tab_acid"); + +swapTxnManager(txnMgr); +driver.run(); + +FileSystem fs = FileSystem.get(conf); +FileStatus[] stat = fs.listStatus(new Path(Paths.get("target/warehouse").toUri()), + p -> p.getName().startsWith("tab_acid_v")); +if (1 == stat.length) { + Assert.fail("Table data was not removed from FS"); +} + +List res = new ArrayList(); +driver.getFetchTask().fetch(res); +Assert.assertEquals("Non-empty resultset", 0, res.size()); + +try { + 
driver.run("select * from tab_acid"); +} catch (CommandProcessorException ex) { + Assert.assertEquals(ErrorMsg.INVALID_TABLE.getErrorCode(), ex.getResponseCode()); + Assert.assertTrue(ex.getMessage().contains(ErrorMsg.INVALID_TABLE.getMsg("'tab_acid'"))); +} + +//re-create table with the same name +driver.compileAndRespond("create table if not exists tab_acid (a int, b int) partitioned by (p string) " + + "stored as orc TBLPROPERTIES ('transactional'='true')"); +long txnId = txnMgr.getCurrentTxnId(); +driver.run(); +driver.run("insert into tab_acid partition(p) (a,b,p) values(1,2,'foo'),(3,4,'bar')"); + +driver.run("select * from tab_acid "); +res = new ArrayList(); +driver.getFetchTask().fetch(res); +Assert.assertEquals("No records found", 2, res.size()); Review comment: nit: The message can be a little bit misleading, since we do not check for empty table -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 578339) Time Spent: 0.5h (was: 20m) > Suffix the table location with UUID/txnId > - > > Key: HIVE-24906 > URL: https://issues.apache.org/jira/browse/HIVE-24906 > Project: Hive > Issue Type: Sub-task >Reporter: Denys Kuzmenko >Assignee: Denys Kuzmenko >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Suffixing the table location during create table with UUID/txnId can help in > deleting the data in asynchronous fashion. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=578338&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-578338 ]

ASF GitHub Bot logged work on HIVE-24906:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 07/Apr/21 12:56
Start Date: 07/Apr/21 12:56
Worklog Time Spent: 10m
Work Description: pvary commented on a change in pull request #2089:
URL: https://github.com/apache/hive/pull/2089#discussion_r608626658

## File path: ql/src/test/org/apache/hadoop/hive/ql/lockmgr/TestDbTxnManager2.java ##
@@ -3239,4 +3247,72 @@ public void testFullTableReadLock() throws Exception {
     checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "tab_acid", null, locks);
     checkLock(LockType.SHARED_READ, LockState.ACQUIRED, "default", "tab_not_acid", null, locks);
   }
-}
+
+  @Test
+  public void testNonBlockingDropAndReCreateTable() throws Exception {
+    dropTable(new String[] {"tab_acid"});
+
+    conf.setBoolVar(HiveConf.ConfVars.HIVE_TXN_NON_BLOCKING_DROP_TABLE, true);
+    driver2.getConf().setBoolVar( HiveConf.ConfVars.HIVE_TXN_NON_BLOCKING_DROP_TABLE, true);
+
+    driver.run("create table if not exists tab_acid (a int, b int) partitioned by (p string) " +
+      "stored as orc TBLPROPERTIES ('transactional'='true')");
+    driver.run("insert into tab_acid partition(p) (a,b,p) values(1,2,'foo'),(3,4,'bar')");
+
+    driver.compileAndRespond("select * from tab_acid");
+
+    DbTxnManager txnMgr2 = (DbTxnManager) TxnManagerFactory.getTxnManagerFactory().getTxnManager(conf);
+    swapTxnManager(txnMgr2);
+    driver2.run("drop table if exists tab_acid");
+
+    swapTxnManager(txnMgr);
+    driver.run();
+
+    FileSystem fs = FileSystem.get(conf);
+    FileStatus[] stat = fs.listStatus(new Path(Paths.get("target/warehouse").toUri()),
+      p -> p.getName().startsWith("tab_acid_v"));
+    if (1 == stat.length) {
+      Assert.fail("Table data was not removed from FS");
+    }
+
+    List res = new ArrayList();
+    driver.getFetchTask().fetch(res);
+    Assert.assertEquals("Non-empty resultset", 0, res.size());
+
+    try {
+      driver.run("select * from tab_acid");
+    } catch (CommandProcessorException ex) {
+      Assert.assertEquals(ErrorMsg.INVALID_TABLE.getErrorCode(), ex.getResponseCode());
+      Assert.assertTrue(ex.getMessage().contains(ErrorMsg.INVALID_TABLE.getMsg("'tab_acid'")));
+    }
+
+    //re-create table with the same name
+    driver.compileAndRespond("create table if not exists tab_acid (a int, b int) partitioned by (p string) " +

Review comment:
I would remove `if not exists` from the SQL, so we can detect if there is a problem with the test

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 578338)
Time Spent: 20m (was: 10m)

> Suffix the table location with UUID/txnId
> -----------------------------------------
>
> Key: HIVE-24906
> URL: https://issues.apache.org/jira/browse/HIVE-24906
> Project: Hive
> Issue Type: Sub-task
> Reporter: Denys Kuzmenko
> Assignee: Denys Kuzmenko
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Suffixing the table location during create table with UUID/txnId can help in
> deleting the data in asynchronous fashion.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (HIVE-24906) Suffix the table location with UUID/txnId
[ https://issues.apache.org/jira/browse/HIVE-24906?focusedWorklogId=568530&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-568530 ]

ASF GitHub Bot logged work on HIVE-24906:
-----------------------------------------

Author: ASF GitHub Bot
Created on: 18/Mar/21 17:47
Start Date: 18/Mar/21 17:47
Worklog Time Spent: 10m
Work Description: deniskuzZ opened a new pull request #2089:
URL: https://github.com/apache/hive/pull/2089

### What changes were proposed in this pull request?
Suffixes managed table location with txnId that created the table.

### Why are the changes needed?
Part of non-blocking drop table implementation. Could resolve concurrency issues between ongoing compaction and table re-create operation.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Unit tests

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

Worklog Id: (was: 568530)
Remaining Estimate: 0h
Time Spent: 10m

> Suffix the table location with UUID/txnId
> -----------------------------------------
>
> Key: HIVE-24906
> URL: https://issues.apache.org/jira/browse/HIVE-24906
> Project: Hive
> Issue Type: Sub-task
> Reporter: Denys Kuzmenko
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Suffixing the table location during create table with UUID/txnId can help in
> deleting the data in asynchronous fashion.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
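
Illustration (not part of the original message): the PR description says the managed table location is suffixed with the txnId that created the table. The sketch below is one way such a suffix could be computed. It is not taken from the patch: the class name TableLocationSuffix, the method withTxnId, and the exact "_v" + txnId format are assumptions. The only hint in this thread is that the quoted test looks for directories whose names start with "tab_acid_v".

```java
// Hypothetical sketch; none of these names come from the Hive code base.
final class TableLocationSuffix {

  // Assumed marker. The quoted test only shows names starting with "tab_acid_v".
  private static final String SUFFIX_MARKER = "_v";

  private TableLocationSuffix() {
  }

  // Returns baseLocation with the creating transaction id appended, so that a
  // table re-created under the same name gets a different directory.
  static String withTxnId(String baseLocation, long txnId) {
    if (txnId <= 0) {
      throw new IllegalArgumentException("txnId must be positive: " + txnId);
    }
    // Drop trailing slashes so the suffix attaches to the last path component.
    String trimmed = baseLocation;
    while (trimmed.length() > 1 && trimmed.endsWith("/")) {
      trimmed = trimmed.substring(0, trimmed.length() - 1);
    }
    return trimmed + SUFFIX_MARKER + txnId;
  }
}
```

Under these assumptions, two create-table transactions for the same table name yield two distinct directories, which is what would let the data of a dropped table be removed asynchronously without touching the re-created table.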