steveloughran commented on issue #1359: HADOOP-16430. S3AFilesystem.delete to incrementally update s3guard with deletions
URL: https://github.com/apache/hadoop/pull/1359#issuecomment-526350261

Thanks. I've just pushed up another commit for you to look at, and I am making sure I run it without s3guard as well as with.

Tested: S3 Ireland.

# Undeleted files

An `ITestS3AContractRootDir` run failed with an undeleted file "fork-0005/test/testFile". That filename is used in too many tests to identify the problem, so the latest patch uses a unique name for each test case; if the problem recurs, I can start tracking down the issue. I don't think it's related to my changes to deletion, but given how critical delete is for cleanup, I am not ignoring it. At the same time, I think this patch actually makes cleanup more rigorous: we use s3guard to identify and delete all files, even when S3 is being inconsistent, and when the client is using s3guard as an authoritative store, we still bypass it for a final bit of due diligence.

```
[ERROR] testListEmptyRootDirectory(org.apache.hadoop.fs.contract.s3a.ITestS3AContractRootDir)  Time elapsed: 38.181 s  <<< FAILURE!
java.lang.AssertionError: Expected no results from listFiles(/, true), but got 1 elements:
S3ALocatedFileStatus{path=s3a://hwdev-steve-ireland-new/fork-0005/test/testFile; isDirectory=false; length=1; replication=1; blocksize=33554432; modification_time=1567098043000; access_time=0; owner=stevel; group=stevel; permission=rw-rw-rw-; isSymlink=false; hasAcl=false; isEncrypted=true; isErasureCoded=false}[eTag='55a54008ad1ba589aa210d2629c1df41', versionId='']
	at org.junit.Assert.fail(Assert.java:88)
	at org.apache.hadoop.fs.contract.AbstractContractRootDirectoryTest.assertNoElements(AbstractContractRootDirectoryTest.java:218)
	at org.apache.hadoop.fs.contract.AbstractContractRootDirectoryTest.testListEmptyRootDirectory(AbstractContractRootDirectoryTest.java:200)
	at org.apache.hadoop.fs.contract.s3a.ITestS3AContractRootDir.testListEmptyRootDirectory(ITestS3AContractRootDir.java:82)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)
```

BTW, note in the stack trace how S3ALocatedFileStatus now prints the etag and version ID. There's now lossless conversion between it and S3AFileStatus, which I needed so the isDirectory flag doesn't get lost, as that flag is used in the delete operation to choose whether to add a "/" to a path. This is why there's a new package-private constructor for S3AFileStatus.

With the changes in, the file at fault appears to come from `testFailedMetadataUpdate` in `ITestS3AMetadataPersistenceException`; doing more cleanup there. That test deliberately creates failures in the metastore update process; maybe if the normal test FS doesn't know about the file, then its test cleanup doesn't find it.

Now that I know empty dir markers will stop a scan for and delete of objects, I am starting to wonder if that is the cause of some intermittent test failures we've had in the past, though those could just have come from S3 list inconsistency not finding files to delete.

## Speed

I added a section in the DeleteOperation about opportunities to speed up that process through better parallelisation. I also made it clear that such changes should only be derived from data collected in benchmarks run within EC2 itself. If you test remotely, latency can dominate, and you are also less prone to encountering throttling on AWS services, because you are not generating enough load.

I don't immediately plan to do such performance tuning, as I can see opportunities in speeding up directory listings, which are on the critical path for query planning. This patch already makes things faster, especially for very large directories. A question I have no information on: how important is directory deletion performance to applications? That is, for large directories. There's always going to be a fixed amount of needless overhead for deleting even a small directory: the HEAD, LIST and DELETE calls, plus setting up fake parent directories as needed.
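To illustrate why the conversion needs to be lossless, here's a minimal sketch. The class shapes and names (`FileStatus`, `LocatedStatus`, `deleteKey`) are invented stand-ins for this example, not the actual Hadoop classes:

```java
// Simplified sketch (NOT the real S3AFileStatus/S3ALocatedFileStatus):
// the delete path needs isDirectory to decide whether to append "/" to
// the object key, so the round-trip must not drop it (or etag/versionId).
public class StatusRoundTrip {

    /** Stand-in for S3AFileStatus: the fields delete cares about. */
    static final class FileStatus {
        final String path;
        final boolean isDirectory;
        final String etag;
        final String versionId;

        FileStatus(String path, boolean isDirectory, String etag, String versionId) {
            this.path = path;
            this.isDirectory = isDirectory;
            this.etag = etag;
            this.versionId = versionId;
        }
    }

    /** Stand-in for S3ALocatedFileStatus: wraps without dropping fields. */
    static final class LocatedStatus {
        final FileStatus status;

        LocatedStatus(FileStatus status) {
            this.status = status;
        }

        /** Lossless conversion back: nothing is recomputed or defaulted. */
        FileStatus toFileStatus() {
            return status;
        }
    }

    /** Directory markers are deleted under "path/", plain files under "path". */
    static String deleteKey(FileStatus st) {
        return st.isDirectory && !st.path.endsWith("/") ? st.path + "/" : st.path;
    }

    public static void main(String[] args) {
        FileStatus dir = new FileStatus("fork-0005/test", true, null, null);
        FileStatus file = new FileStatus("fork-0005/test/testFile",
            false, "55a54008ad1ba589aa210d2629c1df41", "");

        // Round-trip through the located form, then compute the delete key.
        System.out.println(deleteKey(new LocatedStatus(dir).toFileStatus()));
        System.out.println(deleteKey(new LocatedStatus(file).toFileStatus()));
    }
}
```

If the wrapper dropped isDirectory on the way back, the directory marker would be deleted under the wrong key.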
It's at larger directories that the DynamoDB calls come to dominate, and only once you have many thousands of files in a directory tree do the advantages of more parallel batches of deletions probably start to deliver benefits.
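For context, a rough sketch of what parallel batched deletion with incremental metastore updates could look like. `BatchedDelete`, `toBatches`, and the in-memory set are invented for illustration: the set stands in for the DynamoDB metastore, and the per-batch runnable stands in for the S3 bulk delete call (which caps a request at 1000 keys):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch only, not the DeleteOperation code: split keys into
// batches, delete batches in parallel, and update the metastore as each
// batch completes rather than in one bulk update at the end.
public class BatchedDelete {

    /** Split a key listing into fixed-size batches. */
    static List<List<String>> toBatches(List<String> keys, int batchSize) {
        List<List<String>> batches = new ArrayList<>();
        for (int i = 0; i < keys.size(); i += batchSize) {
            batches.add(keys.subList(i, Math.min(i + batchSize, keys.size())));
        }
        return batches;
    }

    /** Delete all keys; remove each batch from the metastore as it completes. */
    static void deleteAll(List<String> keys, int batchSize, int threads,
                          Set<String> metastore) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> pending = new ArrayList<>();
        for (List<String> batch : toBatches(keys, batchSize)) {
            pending.add(pool.submit(() -> {
                // stand-in for the bulk S3 delete of this batch, followed
                // by the incremental metastore update for just these keys
                metastore.removeAll(batch);
            }));
        }
        for (Future<?> f : pending) {
            f.get();   // surface any failure from a batch
        }
        pool.shutdown();
    }

    public static void main(String[] args) throws Exception {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            keys.add("dir/file-" + i);
        }
        Set<String> metastore = ConcurrentHashMap.newKeySet();
        metastore.addAll(keys);
        deleteAll(keys, 10, 4, metastore);
        System.out.println("entries left: " + metastore.size());  // entries left: 0
    }
}
```

The incremental update is what keeps the metastore consistent if the process dies mid-delete: only the batches actually deleted have been removed from it.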
