ChinmaySKulkarni commented on a change in pull request #4: PHOENIX-5238 Provide 
an option to pass hints with PhoenixRDD and Data…
URL: https://github.com/apache/phoenix-connectors/pull/4#discussion_r279934637
 
 

 ##########
 File path: phoenix-spark/src/main/java/org/apache/phoenix/spark/datasource/v2/reader/PhoenixDataSourceReader.java
 ##########
 @@ -148,6 +150,9 @@ public StructType readSchema() {
             // Optimize the query plan so that we potentially use secondary indexes
             final QueryPlan queryPlan = pstmt.optimizeQuery(selectStatement);
             final Scan scan = queryPlan.getContext().getScan();
+            if (this.disableBlockCache) {
+                scan.setCacheBlocks(false);
 
 Review comment:
   The `scan` variable is unused, so you can remove it. The hint has to be set 
on each scan in the queryPlan; otherwise the scans run by the Spark executors 
will not have it. Instead of iterating over each scan here, it may be easier to 
carry the setting in `PhoenixDataSourceReadOptions`. We create an instance of 
this when `PhoenixDataSourceReader#planInputPartitions()` is called on the 
driver, and it is embedded in each of our InputPartitions, so the read options 
are available on the Spark executors (see 
`PhoenixInputPartitionReader#initialize()`). That method already iterates over 
the scans, so you can read the value from the read options there and call 
`setCacheBlocks(false)` on each scan.
   
   Also, if this hint is provided, make sure any other scan objects used on 
the driver also have this property set, for example the scan that we use on 
the driver side to get the region locations.
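
   To illustrate the suggestion, here is a minimal, self-contained sketch of 
the pattern rather than the actual connector code. `Scan` below is a stand-in 
for `org.apache.hadoop.hbase.client.Scan` that models only `setCacheBlocks` and 
`getCacheBlocks`. `ReadOptions`, its `disableBlockCache` field and the 
`applyBlockCacheHint` helper are hypothetical names used for illustration; the 
real `PhoenixDataSourceReadOptions` may look different.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class ScanHintSketch {

    // Stand-in for org.apache.hadoop.hbase.client.Scan (assumption: only the
    // block-cache accessors are modeled; HBase scans cache blocks by default).
    static class Scan {
        private boolean cacheBlocks = true;

        Scan setCacheBlocks(boolean cacheBlocks) {
            this.cacheBlocks = cacheBlocks;
            return this;
        }

        boolean getCacheBlocks() {
            return cacheBlocks;
        }
    }

    // Hypothetical slice of the read options that travel from the driver to
    // the executors inside each input partition, hence Serializable.
    static class ReadOptions implements Serializable {
        private static final long serialVersionUID = 1L;
        private final boolean disableBlockCache;

        ReadOptions(boolean disableBlockCache) {
            this.disableBlockCache = disableBlockCache;
        }

        boolean isDisableBlockCache() {
            return disableBlockCache;
        }
    }

    // Hypothetical helper: apply the hint to every scan, not just the first.
    // The same call could be reused for driver-side scans.
    static void applyBlockCacheHint(List<Scan> scans, ReadOptions options) {
        if (!options.isDisableBlockCache()) {
            return; // hint not requested, leave the scans untouched
        }
        for (Scan scan : scans) {
            scan.setCacheBlocks(false);
        }
    }

    public static void main(String[] args) {
        List<Scan> scans = new ArrayList<>();
        scans.add(new Scan());
        scans.add(new Scan());

        applyBlockCacheHint(scans, new ReadOptions(true));

        for (Scan scan : scans) {
            System.out.println("cacheBlocks=" + scan.getCacheBlocks());
        }
    }
}
```

   Keeping the flag in the read options means the driver decides once and 
every executor-side scan picks it up from its own partition, which avoids 
mutating a single `Scan` that is never shipped to the executors.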
