[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

     [ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Szita updated HIVE-9995:
-----------------------------
       Resolution: Fixed
    Fix Version/s: 4.0.0
           Status: Resolved  (was: Patch Available)

> ACID compaction tries to compact a single file
> ----------------------------------------------
>
>                 Key: HIVE-9995
>                 URL: https://issues.apache.org/jira/browse/HIVE-9995
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Denys Kuzmenko
>            Priority: Major
>             Fix For: 4.0.0
>
>         Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch,
> HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch,
> HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.08.patch,
> HIVE-9995.09.patch, HIVE-9995.10.patch, HIVE-9995.WIP.patch
>
>
> Consider TestWorker.minorWithOpenInMiddle()
> since there is an open txnId=23, this doesn't have any meaningful minor
> compaction work to do.  The system still tries to compact a single delta file
> for 21-22 id range, and effectively copies the file onto itself.
> This is 1. inefficient and 2. can potentially affect a reader.
> (from a real cluster)
> Suppose we start with
> {noformat}
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:03 /user/hive/warehouse/t/base_016
> -rw-r--r--   1 ekoifman staff        602 2016-06-09 16:03 /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff        588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_017_017_
> -rw-r--r--   1 ekoifman staff        514 2016-06-09 16:06 /user/hive/warehouse/t/delta_017_017_/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff        612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with
> {noformat}
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff        588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018
> -rw-r--r--   1 ekoifman staff        500 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff        612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> So compaction created a new dir _/user/hive/warehouse/t/delta_018_018_



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

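
The situation in the description (one compactable delta sitting below an open transaction) can be sketched as a simple guard. This is only an illustration under stated assumptions, not the fix carried by the attached patches: `MinorCompactionGuard`, `DeltaDir`, `compactable` and `hasMinorCompactionWork` are hypothetical names, and the rule "a minor compaction needs at least two deltas below the lowest open transaction id" is an assumed reading of the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from Hive's code base.
final class MinorCompactionGuard {

    /** Illustrative stand-in for a delta directory covering ids [minId, maxId]. */
    static final class DeltaDir {
        final long minId;
        final long maxId;

        DeltaDir(long minId, long maxId) {
            this.minId = minId;
            this.maxId = maxId;
        }
    }

    /** Returns the deltas that end below the lowest open transaction id. */
    static List<DeltaDir> compactable(List<DeltaDir> deltas, long minOpenTxnId) {
        List<DeltaDir> result = new ArrayList<>();
        for (DeltaDir d : deltas) {
            if (d.maxId < minOpenTxnId) {
                result.add(d);
            }
        }
        return result;
    }

    /** Merging needs at least two inputs; a single delta would only be copied. */
    static boolean hasMinorCompactionWork(List<DeltaDir> deltas, long minOpenTxnId) {
        return compactable(deltas, minOpenTxnId).size() >= 2;
    }

    public static void main(String[] args) {
        // From the issue: txnId=23 is open and one delta covers ids 21-22.
        // The 24-25 delta is an assumed extra directory above the open txn.
        List<DeltaDir> deltas = Arrays.asList(new DeltaDir(21, 22), new DeltaDir(24, 25));
        System.out.println("work to do: " + hasMinorCompactionWork(deltas, 23));
    }
}
```

With a guard of this kind the compactor would skip the request instead of rewriting the lone 21-22 delta onto itself, which addresses both concerns raised in the report: the wasted work and the window in which a reader could observe the rewritten directory.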
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment: HIVE-9995.10.patch

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment: HIVE-9995.09.patch

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment: HIVE-9995.08.patch

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment: HIVE-9995.07.patch

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment:     (was: HIVE-9995.07.patch)

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment: HIVE-9995.07.patch

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment:     (was: HIVE-9995.07.patch)

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment: HIVE-9995.07.patch

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment:     (was: HIVE-9995.07.patch)

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment: HIVE-9995.07.patch

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file

Denys Kuzmenko updated HIVE-9995:
---------------------------------
    Attachment:     (was: HIVE-9995.07.patch)

[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: HIVE-9995.07.patch

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.WIP.patch
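The issue description says that a minor compaction with only one eligible delta has no meaningful work to do and just copies the file onto itself. A minimal sketch of that check follows, assuming directory names of the form shown in the listing (`base_N`, `delta_MIN_MAX` with an optional trailing suffix). The function names, the `first_open_txn` parameter, and the "at least two deltas" rule are illustrative assumptions, not the logic of the attached patches.

```python
import re
from typing import List, Optional

# Name patterns as they appear in the listing, e.g. "base_017", "delta_018_018_".
_BASE = re.compile(r"^base_(\d+)$")
_DELTA = re.compile(r"^delta_(\d+)_(\d+)(?:_\d*)?$")


def eligible_deltas(dir_names: List[str],
                    first_open_txn: Optional[int] = None) -> List[str]:
    """Return the delta directories a minor compaction could merge (hypothetical helper)."""
    bases = [int(m.group(1)) for m in map(_BASE.match, dir_names) if m]
    newest_base = max(bases, default=0)
    result = []
    for name in dir_names:
        m = _DELTA.match(name)
        if not m:
            continue
        max_txn = int(m.group(2))
        if max_txn <= newest_base:
            continue  # already covered by the newest base
        if first_open_txn is not None and max_txn >= first_open_txn:
            continue  # range reaches the open transaction
        result.append(name)
    return result


def minor_compaction_has_work(dir_names: List[str],
                              first_open_txn: Optional[int] = None) -> bool:
    """Assumed rule: merging is only worthwhile with two or more eligible deltas."""
    return len(eligible_deltas(dir_names, first_open_txn)) >= 2
```

Applied to the "real cluster" listing, only `delta_018_018_` is newer than `base_017`, so this sketch would skip the compaction instead of rewriting that single delta.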
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: (was: HIVE-9995.07.patch)

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, HIVE-9995.06.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: HIVE-9995.07.patch

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, HIVE-9995.06.patch, HIVE-9995.07.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: HIVE-9995.06.patch

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, HIVE-9995.06.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: HIVE-9995.05.patch

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.05.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: (was: HIVE-9995.04.patch)

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: HIVE-9995.04.patch

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: HIVE-9995.04.patch

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: (was: HIVE-9995.04.patch)

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: HIVE-9995.04.patch

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.04.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: HIVE-9995.03.patch

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.03.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Kuzmenko updated HIVE-9995:
    Attachment: HIVE-9995.02.patch

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Denys Kuzmenko
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.02.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-9995:
    Target Version/s: 4.0.0
    Status: Patch Available (was: Open)

> ACID compaction tries to compact a single file
>
> Key: HIVE-9995
> URL: https://issues.apache.org/jira/browse/HIVE-9995
> Project: Hive
> Issue Type: Bug
> Components: Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Eugene Koifman
> Priority: Major
> Attachments: HIVE-9995.01.patch, HIVE-9995.WIP.patch
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-9995:
---------------------------------
    Attachment: HIVE-9995.01.patch

> ACID compaction tries to compact a single file
> ----------------------------------------------
>
>                 Key: HIVE-9995
>                 URL: https://issues.apache.org/jira/browse/HIVE-9995
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>            Priority: Major
>         Attachments: HIVE-9995.01.patch, HIVE-9995.WIP.patch
>
>
> Consider TestWorker.minorWithOpenInMiddle()
> since there is an open txnId=23, this doesn't have any meaningful minor compaction work to do. The system still tries to compact a single delta file for 21-22 id range, and effectively copies the file onto itself.
> This is 1. inefficient and 2. can potentially affect a reader.
> (from a real cluster)
> Suppose we start with
> {noformat}
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:03 /user/hive/warehouse/t/base_016
> -rw-r--r--   1 ekoifman staff        602 2016-06-09 16:03 /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff        588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_017_017_
> -rw-r--r--   1 ekoifman staff        514 2016-06-09 16:06 /user/hive/warehouse/t/delta_017_017_/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff        612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with
> {noformat}
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff        588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018
> -rw-r--r--   1 ekoifman staff        500 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff        612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> So compaction created a new dir _/user/hive/warehouse/t/delta_018_018_

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
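
The situation described in the issue (an open transaction leaves only one delta eligible, so a minor compaction would just copy that file onto itself) can be illustrated with a minimal sketch. This is not Hive's code and not the attached patch. The function names, the simplified delta_<min>_<max> directory naming, and the rule that a delta is eligible only when its highest id is below the lowest open transaction id are all assumptions made for illustration.

```python
import re
from typing import List, Optional, Tuple

# Hypothetical, simplified directory naming: delta_<minId>_<maxId>.
DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)$")


def parse_delta(name: str) -> Optional[Tuple[int, int]]:
    """Return (min_id, max_id) for a delta directory name, or None otherwise."""
    match = DELTA_RE.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def eligible_deltas(dir_names: List[str], min_open_txn: Optional[int]) -> List[str]:
    """Deltas a minor compaction could merge (assumed rule: max id < lowest open txn)."""
    eligible = []
    for name in dir_names:
        id_range = parse_delta(name)
        if id_range is None:
            continue  # base directories and anything else are ignored
        if min_open_txn is not None and id_range[1] >= min_open_txn:
            continue  # overlaps or follows an open transaction
        eligible.append(name)
    return sorted(eligible, key=parse_delta)


def minor_compaction_has_work(dir_names: List[str], min_open_txn: Optional[int]) -> bool:
    """Merging fewer than two deltas only rewrites a file onto itself."""
    return len(eligible_deltas(dir_names, min_open_txn)) >= 2


if __name__ == "__main__":
    # Hypothetical layout loosely modeled on the test case: txn 23 is open,
    # so only the 21-22 delta is eligible and there is nothing to merge.
    dirs = ["base_20", "delta_21_22", "delta_23_25"]
    print(minor_compaction_has_work(dirs, min_open_txn=23))
```

A guard of this kind would let a worker skip the job when it returns False, which avoids both the wasted copy and the window in which a reader could see the rewritten directory.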
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-9995:
---------------------------------
    Attachment: HIVE-9995.WIP.patch

> ACID compaction tries to compact a single file
> ----------------------------------------------
>
>                 Key: HIVE-9995
>                 URL: https://issues.apache.org/jira/browse/HIVE-9995
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>         Attachments: HIVE-9995.WIP.patch
>
>
> Consider TestWorker.minorWithOpenInMiddle()
> since there is an open txnId=23, this doesn't have any meaningful minor compaction work to do. The system still tries to compact a single delta file for 21-22 id range, and effectively copies the file onto itself.
> This is 1. inefficient and 2. can potentially affect a reader.
> (from a real cluster)
> Suppose we start with
> {noformat}
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:03 /user/hive/warehouse/t/base_016
> -rw-r--r--   1 ekoifman staff        602 2016-06-09 16:03 /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff        588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_017_017_
> -rw-r--r--   1 ekoifman staff        514 2016-06-09 16:06 /user/hive/warehouse/t/delta_017_017_/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff        612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with
> {noformat}
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff        588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018
> -rw-r--r--   1 ekoifman staff        500 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff        612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> So compaction created a new dir _/user/hive/warehouse/t/delta_018_018_

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Updated] (HIVE-9995) ACID compaction tries to compact a single file
[ https://issues.apache.org/jira/browse/HIVE-9995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eugene Koifman updated HIVE-9995:
---------------------------------
    Description:
Consider TestWorker.minorWithOpenInMiddle()
since there is an open txnId=23, this doesn't have any meaningful minor compaction work to do. The system still tries to compact a single delta file for 21-22 id range, and effectively copies the file onto itself.
This is 1. inefficient and 2. can potentially affect a reader.
(from a real cluster)
Suppose we start with
{noformat}
drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:03 /user/hive/warehouse/t/base_016
-rw-r--r--   1 ekoifman staff        602 2016-06-09 16:03 /user/hive/warehouse/t/base_016/bucket_0
drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
-rw-r--r--   1 ekoifman staff        588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_017_017_
-rw-r--r--   1 ekoifman staff        514 2016-06-09 16:06 /user/hive/warehouse/t/delta_017_017_/bucket_0
drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
-rw-r--r--   1 ekoifman staff        612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
{noformat}
then do _alter table T compact 'minor';_
then we end up with
{noformat}
drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
-rw-r--r--   1 ekoifman staff        588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018
-rw-r--r--   1 ekoifman staff        500 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018/bucket_0
drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
-rw-r--r--   1 ekoifman staff        612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
{noformat}
So compaction created a new dir _/user/hive/warehouse/t/delta_018_018_

  was:
Consider TestWorker.minorWithOpenInMiddle()
since there is an open txnId=23, this doesn't have any meaningful minor compaction work to do. The system still tries to compact a single delta file for 21-22 id range, and effectively copies the file onto itself.
This is 1. inefficient and 2. can potentially affect a reader.

> ACID compaction tries to compact a single file
> ----------------------------------------------
>
>                 Key: HIVE-9995
>                 URL: https://issues.apache.org/jira/browse/HIVE-9995
>             Project: Hive
>          Issue Type: Bug
>          Components: Transactions
>    Affects Versions: 1.0.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>
> Consider TestWorker.minorWithOpenInMiddle()
> since there is an open txnId=23, this doesn't have any meaningful minor compaction work to do. The system still tries to compact a single delta file for 21-22 id range, and effectively copies the file onto itself.
> This is 1. inefficient and 2. can potentially affect a reader.
> (from a real cluster)
> Suppose we start with
> {noformat}
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:03 /user/hive/warehouse/t/base_016
> -rw-r--r--   1 ekoifman staff        602 2016-06-09 16:03 /user/hive/warehouse/t/base_016/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff        588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_017_017_
> -rw-r--r--   1 ekoifman staff        514 2016-06-09 16:06 /user/hive/warehouse/t/delta_017_017_/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r--   1 ekoifman staff        612 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_/bucket_0
> {noformat}
> then do _alter table T compact 'minor';_
> then we end up with
> {noformat}
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/base_017
> -rw-r--r--   1 ekoifman staff        588 2016-06-09 16:07 /user/hive/warehouse/t/base_017/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018
> -rw-r--r--   1 ekoifman staff        500 2016-06-09 16:11 /user/hive/warehouse/t/delta_018_018/bucket_0
> drwxr-xr-x   - ekoifman staff          0 2016-06-09 16:07 /user/hive/warehouse/t/delta_018_018_
> -rw-r--r-- 1