shardulm94 commented on a change in pull request #2281:
URL: https://github.com/apache/iceberg/pull/2281#discussion_r584210600
##########
File path:
spark3/src/main/java/org/apache/iceberg/spark/source/SparkBatchScan.java
##########
@@ -150,24 +169,39 @@ public PartitionReaderFactory createReaderFactory() {
.allMatch(fileScanTask -> fileScanTask.file().format().equals(FileFormat.ORC)));
+ boolean hasNoRowFilters =
+ tasks().stream()
+ .allMatch(combinedScanTask -> !combinedScanTask.isDataTask() && combinedScanTask.files()
+ .stream()
+ .allMatch(fileScanTask -> OrcRowFilterUtils.rowFilterFromTask(fileScanTask) == null));
Review comment:
This code was also not touched by
https://github.com/linkedin/iceberg/pull/48 and does not exist in
apache/iceberg. Can you remove this?
There are several other changes in the files which are LinkedIn-specific and
were not touched by https://github.com/linkedin/iceberg/pull/48.
##########
File path: spark2/src/main/java/org/apache/iceberg/spark/source/Reader.java
##########
@@ -136,7 +143,7 @@
if (io.getValue() instanceof HadoopFileIO) {
String fsscheme = "no_exist";
try {
- Configuration conf = SparkSession.active().sessionState().newHadoopConf();
+ Configuration conf = new Configuration(activeSparkSession().sessionState().newHadoopConf());
Review comment:
Many changes in this file seem to have been copied over from LinkedIn's fork
and are not relevant to apache/iceberg. Can you remove these?
The PR on linkedin/iceberg does not have these changes either, so I am not
sure how they were copied over.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.