MaxNevermind commented on code in PR #1335:
URL: https://github.com/apache/parquet-java/pull/1335#discussion_r1704892705


##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/rewrite/RewriteOptions.java:
##########
@@ -350,6 +436,28 @@ public Builder indexCacheStrategy(IndexCache.CacheStrategy 
cacheStrategy) {
       return this;
     }
 
+    /**
+     * Set a flag whether columns from join files need to overwrite columns 
from input files.

Review Comment:
   > What is the overwrite strategy? If the input file joins file A and joins 
file B. All those files have a column: col, which has the highest priority?
   
   All inputFilesToJoin files are currently expected to have the same schema. 
Additionally, there is a flag `overwriteInputWithJoinColumns` which, if set to 
`true`, prioritizes columns from inputFilesToJoin over those from inputFiles. It 
is `false` by default, so by default the inputFiles columns are preserved.
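
   To make the precedence rule concrete, here is a minimal sketch in plain Java. It is not the rewriter's actual code: `ColumnPrecedenceSketch` and `resolveColumns` are hypothetical names, the maps simply record which side a column would be taken from, and it assumes that join-side columns missing from inputFiles are appended to the output.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ColumnPrecedenceSketch {

    // Hypothetical helper, not part of parquet-java.
    // Keys are column names; values name the side the column is taken from.
    static Map<String, String> resolveColumns(
            Map<String, String> inputColumns,
            Map<String, String> joinColumns,
            boolean overwriteInputWithJoinColumns) {
        // Start from the inputFiles columns, keeping their order.
        Map<String, String> result = new LinkedHashMap<>(inputColumns);
        for (Map.Entry<String, String> e : joinColumns.entrySet()) {
            if (overwriteInputWithJoinColumns) {
                // Join side wins on a name clash.
                result.put(e.getKey(), e.getValue());
            } else {
                // Input side wins; join columns are only added when absent (assumption).
                result.putIfAbsent(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> input = new LinkedHashMap<>();
        input.put("id", "inputFiles");
        input.put("col", "inputFiles");

        Map<String, String> join = new LinkedHashMap<>();
        join.put("col", "inputFilesToJoin");
        join.put("extra", "inputFilesToJoin");

        // Flag false (the default): "col" stays with inputFiles.
        System.out.println(resolveColumns(input, join, false));
        // Flag true: "col" is taken from inputFilesToJoin.
        System.out.println(resolveColumns(input, join, true));
    }
}
```

   In this sketch the shared column `col` is the only one affected by the flag; `id` and `extra` resolve the same way in both cases.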



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

