stoty commented on code in PR #80:
URL: https://github.com/apache/phoenix-connectors/pull/80#discussion_r874344921


##########
phoenix-spark-base/src/main/java/org/apache/phoenix/spark/datasource/v2/reader/PhoenixInputPartitionReader.java:
##########
@@ -94,6 +99,10 @@ private QueryPlan getQueryPlan() throws SQLException {
         }
         try (Connection conn = DriverManager.getConnection(
                 JDBC_PROTOCOL + JDBC_PROTOCOL_SEPARATOR + zkUrl, overridingProps)) {
+            PTable pTable = PTable.parseFrom(options.getTableBytes());
+            org.apache.phoenix.schema.PTable table = PTableImpl.createFromProto(pTable);
+            PhoenixConnection phoenixConnection = conn.unwrap(PhoenixConnection.class);
+            phoenixConnection.addTable(table, System.currentTimeMillis());

Review Comment:
   Interesting point about the timestamp.
   
   The point of this patch is to avoid hammering the system tables with a huge number of parallel requests.
   I think that if we have executor starvation, then the jobs will not start immediately, and the syscat (SYSTEM.CATALOG) load is not really a problem.
   
   Can you think of a case where the jobs are delayed enough for this to matter, yet enough of them still start up simultaneously for the generated load to be a problem? (I don't know enough about Spark to tell.)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to