sakshamgangwar commented on a change in pull request #1079:
URL: https://github.com/apache/phoenix/pull/1079#discussion_r562080082
##########
File path:
phoenix-core/src/it/java/org/apache/phoenix/end2end/TableSnapshotReadsMapReduceIT.java
##########
@@ -274,6 +293,9 @@ private void configureJob(Job job, String tableName, String
inputQuery, String c
assertFalse("Should only have stored" + result.size() + "rows in the
table for the timestamp!", rs.next());
} finally {
+ if (isSnapshotRestoreDoneExternally) {
+ assertRestoreDirCount(conf, tmpDir.toString(), 1);
Review comment:
@shahrs87 There were two levels of subdirectories getting created for
snapshot restore on every scan:
https://github.com/apache/phoenix/blob/55f1362fc52eeabed139728dae153518883743a5/phoenix-core/src/main/java/org/apache/phoenix/mapreduce/util/PhoenixMapReduceUtil.java#L185
https://github.com/apache/phoenix/blob/55f1362fc52eeabed139728dae153518883743a5/phoenix-core/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java#L81
I have removed those in the original flow so now the directory gets cleaned
up every single scan and gets created again for the next scan with the same
directory structure.
I can assert here no existence of the restore directory in the original
flow.
##########
File path:
phoenix-core/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java
##########
@@ -78,8 +80,7 @@ public TableSnapshotResultIterator(Configuration
configuration, Scan scan, ScanM
this.scan = scan;
this.scanMetricsHolder = scanMetricsHolder;
this.scanIterator = UNINITIALIZED_SCANNER;
- this.restoreDir = new
Path(configuration.get(PhoenixConfigurationUtil.RESTORE_DIR_KEY),
- UUID.randomUUID().toString());
+ this.restoreDir = new
Path(configuration.get(PhoenixConfigurationUtil.RESTORE_DIR_KEY));
Review comment:
@shahrs87 I don't think so. I believe the original flow was faulty in
itself and for every scan, we were creating a subdirectory, not instead of
that, we have the same subdirectory for restore and every scan restore happens
there and clean up also happens per scan. So there should not be any issue. We
want to get rid of this structure because when we do an external restore or
external cleanup, we need to provide exact restore directory for reading it.
##########
File path:
phoenix-core/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java
##########
@@ -78,8 +80,7 @@ public TableSnapshotResultIterator(Configuration
configuration, Scan scan, ScanM
this.scan = scan;
this.scanMetricsHolder = scanMetricsHolder;
this.scanIterator = UNINITIALIZED_SCANNER;
- this.restoreDir = new
Path(configuration.get(PhoenixConfigurationUtil.RESTORE_DIR_KEY),
- UUID.randomUUID().toString());
+ this.restoreDir = new
Path(configuration.get(PhoenixConfigurationUtil.RESTORE_DIR_KEY));
Review comment:
@gjacoby126I missed that part, thanks for catching that. I will guard
the directories with the same config as well.
##########
File path:
phoenix-core/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java
##########
@@ -78,8 +80,7 @@ public TableSnapshotResultIterator(Configuration
configuration, Scan scan, ScanM
this.scan = scan;
this.scanMetricsHolder = scanMetricsHolder;
this.scanIterator = UNINITIALIZED_SCANNER;
- this.restoreDir = new
Path(configuration.get(PhoenixConfigurationUtil.RESTORE_DIR_KEY),
- UUID.randomUUID().toString());
+ this.restoreDir = new
Path(configuration.get(PhoenixConfigurationUtil.RESTORE_DIR_KEY));
Review comment:
@gjacoby126 Can you review again? I have kept old behavior as it is, as
well as in the assertion for restore directory I am asserting whether or not
snapshot directory exists (in case of externally cleaned up) or not (cleaned up
per scan).
##########
File path:
phoenix-core/src/main/java/org/apache/phoenix/iterate/TableSnapshotResultIterator.java
##########
@@ -80,7 +81,12 @@ public TableSnapshotResultIterator(Configuration
configuration, Scan scan, ScanM
this.scan = scan;
this.scanMetricsHolder = scanMetricsHolder;
this.scanIterator = UNINITIALIZED_SCANNER;
- this.restoreDir = new
Path(configuration.get(PhoenixConfigurationUtil.RESTORE_DIR_KEY));
+ if (PhoenixConfigurationUtil.isMRSnapshotManagedExternally(configuration))
{
+ this.restoreDir = new
Path(configuration.get(PhoenixConfigurationUtil.RESTORE_DIR_KEY));
+ } else {
+ this.restoreDir = new
Path(configuration.get(PhoenixConfigurationUtil.RESTORE_DIR_KEY),
Review comment:
@gjacoby126 that existed before this change as well, this is not a
necessary code, I believe at job level in PhoenixMapReduceUtil we update the
configuration restoreDir to a new path so as to make sure that even if
restoreDir configuration sent from the caller is same for multiple jobs, it
should be able to handle it internally by adding a new sub-directory (which we
are also doing at scan level too :) ). Whereas in externally managed now we
take full responsibility for the restoreDir creation/usage/deletion.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]