Github user parthchandra commented on a diff in the pull request:
https://github.com/apache/drill/pull/1214#discussion_r183919198
--- Diff:
exec/java-exec/src/main/java/org/apache/drill/exec/ops/BaseOperatorContext.java
---
@@ -158,25 +159,26 @@ public void close() {
} catch (RuntimeException e) {
ex = ex == null ? e : ex;
}
- try {
- if (fs != null) {
+
+ for (DrillFileSystem fs : fileSystems) {
+ try {
fs.close();
- fs = null;
- }
- } catch (IOException e) {
+ } catch (IOException e) {
throw UserException.resourceError(e)
- .addContext("Failed to close the Drill file system for " +
getName())
- .build(logger);
+ .addContext("Failed to close the Drill file system for " +
getName())
+ .build(logger);
+ }
}
+
if (ex != null) {
throw ex;
}
}
@Override
public DrillFileSystem newFileSystem(Configuration conf) throws
IOException {
- Preconditions.checkState(fs == null, "Tried to create a second
FileSystem. Can only be called once per OperatorContext");
- fs = new DrillFileSystem(conf, getStats());
+ DrillFileSystem fs = new DrillFileSystem(conf, getStats());
--- End diff --
I don't get why you need multiple DrillFileSystems per operator context?
The reason for the DrillFileSystem abstraction (and the reason for tying it to
the operator context) is to track the time a (scan) operator was waiting for a
file system call to return. This is reported in the wait time for the operator
in the query profile. For scans this is a critical number as the time spent
waiting for a disk read determines if the query is disk bound.
Associating multiple file system objects with a single operator context
will throw the math out of whack. I think.
---