joshelser commented on code in PR #80:
URL: https://github.com/apache/phoenix-connectors/pull/80#discussion_r874129143


##########
phoenix-spark-base/src/main/java/org/apache/phoenix/spark/datasource/v2/reader/PhoenixInputPartitionReader.java:
##########
@@ -94,6 +99,10 @@ private QueryPlan getQueryPlan() throws SQLException {
         }
         try (Connection conn = DriverManager.getConnection(
                JDBC_PROTOCOL + JDBC_PROTOCOL_SEPARATOR + zkUrl, overridingProps)) {
+            PTable pTable = PTable.parseFrom(options.getTableBytes());
+            org.apache.phoenix.schema.PTable table = PTableImpl.createFromProto(pTable);
+            PhoenixConnection phoenixConnection = conn.unwrap(PhoenixConnection.class);
+            phoenixConnection.addTable(table, System.currentTimeMillis());

Review Comment:
   Please add a log message that we are adding a cached version of the PTable here (rather than resolving the real table).
   
   On the off chance that there is some problem (from when we read the table until when we ran this code), it would be good to remind anyone looking at the log that we're doing this trick.
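   
   For example, something along these lines (just a sketch; it assumes the class already has an slf4j `LOGGER`, which this diff doesn't show):
   ```java
   // Make it obvious in the logs that we are installing the pre-serialized PTable
   // into the connection's metadata cache instead of resolving the table from the server.
   LOGGER.info("Adding cached PTable {} to the connection instead of resolving it",
           table.getName().getString());
   phoenixConnection.addTable(table, System.currentTimeMillis());
   ```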



##########
phoenix-spark-base/src/main/java/org/apache/phoenix/spark/datasource/v2/reader/PhoenixInputPartitionReader.java:
##########
@@ -81,7 +86,7 @@ Properties getOverriddenPropsFromOptions() {
         return options.getOverriddenProps();
     }
 
-    private QueryPlan getQueryPlan() throws SQLException {
+    private QueryPlan getQueryPlan() throws SQLException, InvalidProtocolBufferException {

Review Comment:
   I think you could catch the `InvalidProtocolBufferException` and rethrow it as a `RuntimeException` in `getQueryPlan()`, rather than doing that in `initialize()`.
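   
   Something like this, roughly (a sketch of the shape of that change, not the exact code; `PTable` here is the protobuf-generated class, as in the diff):
   ```java
   // Convert the protobuf parse failure into an unchecked exception inside
   // getQueryPlan(), so the method keeps throwing only SQLException and
   // initialize() doesn't need to handle InvalidProtocolBufferException itself.
   org.apache.phoenix.schema.PTable table;
   try {
       table = PTableImpl.createFromProto(PTable.parseFrom(options.getTableBytes()));
   } catch (InvalidProtocolBufferException e) {
       throw new RuntimeException("Unable to parse the serialized PTable from the options", e);
   }
   ```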



##########
phoenix-spark-base/src/main/java/org/apache/phoenix/spark/datasource/v2/reader/PhoenixInputPartitionReader.java:
##########
@@ -94,6 +99,10 @@ private QueryPlan getQueryPlan() throws SQLException {
         }
         try (Connection conn = DriverManager.getConnection(
                JDBC_PROTOCOL + JDBC_PROTOCOL_SEPARATOR + zkUrl, overridingProps)) {
+            PTable pTable = PTable.parseFrom(options.getTableBytes());
+            org.apache.phoenix.schema.PTable table = PTableImpl.createFromProto(pTable);
+            PhoenixConnection phoenixConnection = conn.unwrap(PhoenixConnection.class);
+            phoenixConnection.addTable(table, System.currentTimeMillis());

Review Comment:
   Also, do we want to record the time at which the PTable was cached and use that rather than `System.currentTimeMillis()`?
   
   My thinking is probably "no". If we use the original time we read the PTable and a long time has elapsed (really big job, starvation on the Executor side), it could be hours until this code is actually executed. If `UPDATE_CACHE_FREQUENCY` isn't increased to a large value, that might cause the Phoenix client to re-fetch the PTable from the server? (just guessing)
   
   Have you thought about this case already?
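   
   To illustrate the two options being weighed (a sketch only; `options.getTableTimestamp()` is hypothetical, standing in for a timestamp captured when the PTable was originally read, which this PR doesn't ship):
   ```java
   // (a) What the PR does: treat the PTable as freshly cached when the Executor runs this code.
   phoenixConnection.addTable(table, System.currentTimeMillis());
   
   // (b) Alternative: reuse the time the PTable was read on the driver. If hours have passed,
   //     UPDATE_CACHE_FREQUENCY may treat the entry as stale and the client could end up
   //     resolving the table from the server anyway.
   // phoenixConnection.addTable(table, options.getTableTimestamp());  // hypothetical accessor
   ```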


