[jira] [Updated] (HIVE-21075) Metastore: Drop partition performance downgrade with Postgres DB
[ https://issues.apache.org/jira/browse/HIVE-21075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yongzhi Chen updated HIVE-21075: Fix Version/s: 4.0.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Metastore: Drop partition performance downgrade with Postgres DB > > > Key: HIVE-21075 > URL: https://issues.apache.org/jira/browse/HIVE-21075 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Yongzhi Chen >Assignee: Oleksiy Sayankin >Priority: Major > Labels: pull-request-available > Fix For: 4.0.0 > > Attachments: HIVE-21075.2.patch > > Time Spent: 7h 50m > Remaining Estimate: 0h > > In order to workaround oracle not supporting limit statement caused > performance issue, HIVE-9447 makes all the backend DB run select count(1) > from SDS where SDS.CD_ID=? to check if the specific CD_ID is referenced in > SDS table before drop a partition. This select count(1) statement does not > scale well in Postgres, and there is no index for CD_ID column in SDS table. > For a SDS table with with 1.5 million rows, select count(1) has average 700ms > without index, while in 10-20ms with index. But the statement before > HIVE-9447( SELECT * FROM "SDS" "A0" WHERE "A0"."CD_ID" = $1 limit 1) uses > less than 10ms . -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HIVE-21075) Metastore: Drop partition performance downgrade with Postgres DB
[ https://issues.apache.org/jira/browse/HIVE-21075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleksiy Sayankin updated HIVE-21075: Status: Patch Available (was: In Progress) > Metastore: Drop partition performance downgrade with Postgres DB > > > Key: HIVE-21075 > URL: https://issues.apache.org/jira/browse/HIVE-21075 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Yongzhi Chen >Assignee: Oleksiy Sayankin >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21075.2.patch > > Time Spent: 2.5h > Remaining Estimate: 0h > > In order to workaround oracle not supporting limit statement caused > performance issue, HIVE-9447 makes all the backend DB run select count(1) > from SDS where SDS.CD_ID=? to check if the specific CD_ID is referenced in > SDS table before drop a partition. This select count(1) statement does not > scale well in Postgres, and there is no index for CD_ID column in SDS table. > For a SDS table with with 1.5 million rows, select count(1) has average 700ms > without index, while in 10-20ms with index. But the statement before > HIVE-9447( SELECT * FROM "SDS" "A0" WHERE "A0"."CD_ID" = $1 limit 1) uses > less than 10ms . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-21075) Metastore: Drop partition performance downgrade with Postgres DB
[ https://issues.apache.org/jira/browse/HIVE-21075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Oleksiy Sayankin updated HIVE-21075: Attachment: HIVE-21075.2.patch > Metastore: Drop partition performance downgrade with Postgres DB > > > Key: HIVE-21075 > URL: https://issues.apache.org/jira/browse/HIVE-21075 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Yongzhi Chen >Assignee: Oleksiy Sayankin >Priority: Major > Labels: pull-request-available > Attachments: HIVE-21075.2.patch > > Time Spent: 2.5h > Remaining Estimate: 0h > > In order to workaround oracle not supporting limit statement caused > performance issue, HIVE-9447 makes all the backend DB run select count(1) > from SDS where SDS.CD_ID=? to check if the specific CD_ID is referenced in > SDS table before drop a partition. This select count(1) statement does not > scale well in Postgres, and there is no index for CD_ID column in SDS table. > For a SDS table with with 1.5 million rows, select count(1) has average 700ms > without index, while in 10-20ms with index. But the statement before > HIVE-9447( SELECT * FROM "SDS" "A0" WHERE "A0"."CD_ID" = $1 limit 1) uses > less than 10ms . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (HIVE-21075) Metastore: Drop partition performance downgrade with Postgres DB
[ https://issues.apache.org/jira/browse/HIVE-21075?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] ASF GitHub Bot updated HIVE-21075: -- Labels: pull-request-available (was: ) > Metastore: Drop partition performance downgrade with Postgres DB > > > Key: HIVE-21075 > URL: https://issues.apache.org/jira/browse/HIVE-21075 > Project: Hive > Issue Type: Bug > Components: Metastore >Affects Versions: 3.0.0 >Reporter: Yongzhi Chen >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > In order to workaround oracle not supporting limit statement caused > performance issue, HIVE-9447 makes all the backend DB run select count(1) > from SDS where SDS.CD_ID=? to check if the specific CD_ID is referenced in > SDS table before drop a partition. This select count(1) statement does not > scale well in Postgres, and there is no index for CD_ID column in SDS table. > For a SDS table with with 1.5 million rows, select count(1) has average 700ms > without index, while in 10-20ms with index. But the statement before > HIVE-9447( SELECT * FROM "SDS" "A0" WHERE "A0"."CD_ID" = $1 limit 1) uses > less than 10ms . -- This message was sent by Atlassian Jira (v8.3.4#803005)