[ https://issues.apache.org/jira/browse/PHOENIX-7484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sanjeet Malhotra updated PHOENIX-7484:
--------------------------------------
    Description:
Upserts over a tenant connection on a multi-tenant table take 5K-6K% more time than upserts over a non-tenant connection for 2M rows. Time here means the total time spent in the `executeUpdate()` and `commit()` calls. The batch size and schema were the same for the tenant and non-tenant connection tests.

Further analysis showed that, for the same 2M-row upsert on a multi-tenant table, 13K-14K% more time was spent in the `executeUpdate()` call over a tenant connection than over a non-tenant connection. The entire regression comes from the mutation plan creation phase of `executeUpdate()`.

The root cause is that, with a tenant connection, we always hit SYSCAT to get the PTable object during mutation plan creation. Every call to `executeUpdate()` over a tenant connection therefore performs a PTable lookup from SYSCAT, adding ~1 ms per call; for 2M rows this accumulates to 29-33 mins.

For multi-tenant tables, the PTableKey in the metadata cache has a null tenant Id, because the table was created over a non-tenant connection. When upserting over a tenant connection, the PTableKey used to look up the PTableRef in the client-side metadata cache carries the tenant Id of the connection, i.e. non-null. The lookup for the PTableRef therefore results in a cache miss, and we immediately fall back to `getTableNoCache()`, which ends up hitting SYSCAT. Instead, we should first fall back to looking in the metadata cache again with a null tenant Id in the PTableKey, and only if the PTableRef is still not found fall back to `getTableNoCache()`.
Code pointer: https://github.com/apache/phoenix/blob/7682e3cee82e9cecb952eddaade1c544e6bd502d/phoenix-core-client/src/main/java/org/apache/phoenix/jdbc/PhoenixConnection.java#L766-L768
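
The lookup order proposed in the description can be illustrated with a small standalone sketch. This is not Phoenix code: `TableKey`, `MetadataClient`, `resolveWithoutFallback`, `resolveWithFallback` and `loadFromCatalog` are hypothetical names standing in for PTableKey, the client-side metadata cache and the `getTableNoCache()` path, and the sketch assumes the slow path does not populate the cache under the tenant-specific key, which is what the "always hitting SYSCAT" observation suggests.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class TenantCacheFallbackSketch {

    /** Hypothetical cache key of (tenantId, tableName); tenantId may be null. */
    static final class TableKey {
        final String tenantId;
        final String tableName;

        TableKey(String tenantId, String tableName) {
            this.tenantId = tenantId;
            this.tableName = tableName;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof TableKey)) {
                return false;
            }
            TableKey other = (TableKey) o;
            return Objects.equals(tenantId, other.tenantId)
                    && Objects.equals(tableName, other.tableName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(tenantId, tableName);
        }
    }

    /** Hypothetical client: an in-memory cache plus a counter for slow catalog lookups. */
    static final class MetadataClient {
        private final Map<TableKey, String> cache = new HashMap<>();
        private int catalogLookups = 0;

        void cacheTable(String tenantId, String tableName, String definition) {
            cache.put(new TableKey(tenantId, tableName), definition);
        }

        /** Reported behaviour: one probe with the connection's tenant Id, then the slow path. */
        String resolveWithoutFallback(String tenantId, String tableName) {
            String table = cache.get(new TableKey(tenantId, tableName));
            return table != null ? table : loadFromCatalog(tableName);
        }

        /** Proposed behaviour: retry the cache with a null tenant Id before the slow path. */
        String resolveWithFallback(String tenantId, String tableName) {
            String table = cache.get(new TableKey(tenantId, tableName));
            if (table == null && tenantId != null) {
                table = cache.get(new TableKey(null, tableName));
            }
            return table != null ? table : loadFromCatalog(tableName);
        }

        /** Stand-in for the expensive catalog read; assumed not to fill the cache. */
        private String loadFromCatalog(String tableName) {
            catalogLookups++;
            return "definition-of-" + tableName;
        }

        int catalogLookups() {
            return catalogLookups;
        }
    }

    public static void main(String[] args) {
        MetadataClient client = new MetadataClient();
        // The multi-tenant table was created over a non-tenant connection: null tenant Id.
        client.cacheTable(null, "EVENTS", "definition-of-EVENTS");

        for (int i = 0; i < 1000; i++) {
            client.resolveWithoutFallback("tenant1", "EVENTS");
        }
        int withoutFallback = client.catalogLookups();

        for (int i = 0; i < 1000; i++) {
            client.resolveWithFallback("tenant1", "EVENTS");
        }
        int withFallback = client.catalogLookups() - withoutFallback;

        System.out.println("catalog lookups without fallback: " + withoutFallback
                + ", with fallback: " + withFallback);
    }
}
```

In this sketch every tenant-connection lookup without the fallback reaches the slow path, while the fallback variant is served entirely from the cache. At roughly 1 ms per slow lookup, 2M such calls would cost about 2,000 s (around 33 mins), which is in line with the 29-33 mins reported above.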
> Upserts on a multi-tenant tables using tenant connection are taking 5K-6K % more time than non-tenant connection
> ----------------------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-7484
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7484
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Sanjeet Malhotra
>            Assignee: Sanjeet Malhotra
>            Priority: Major

--
This message was sent by Atlassian Jira
(v8.20.10#820010)