[ 
https://issues.apache.org/jira/browse/HADOOP-15079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16322219#comment-16322219
 ] 

Steve Loughran commented on HADOOP-15079:
-----------------------------------------

bq. Looks gift horse in the mouth "How did these unrelated annotation changes 
get in here

Because I went through the entire call chain of what rename was up to and what 
their policy was, when things were unmarked their policy was reviewed & 
updated. As noted, it's some s3guard translation which is what we need to wrap 
up (straightforward).

bq. us I think you've changed listLocatedStatus() to Mixed.

OK, I thought it was superfluous, but I will (a) trace it throough and (b) 
reinstate anyway. (Done)

Also: added annotations for Once/Retry ExceptionsSwallowed, and tagged where 
this is done in the dir/finishWrite calls.
I can see this becoming a maintenance PITA, but it does force us to be 
consistent. I wonder if you could do some tool which
analysed transient use of invoked methods for consistency...but you'd need to 
look at try/catch logic too. Hard.

bq. Do you want an {{if (outcome) }} condition around maybeCreateFakeParent()?

good idea. Done

bq. . Now that finishedWrite() does S3Guard.addAncestors() (IIRC it did not 
originally), maybe we don't need this and the whole makeDirsOrdered() thing can 
go away. A FS with real directories would still probably want to use that 
function but .. it'll always be in git history. I'm fine with trying it out 
here, or doing a separate JIRA, 'sup to you.

Let's do it here, but then we'll need the extra review of that. Note that 
removing it reduces the IOPs used, helping keep DDB costs down: tangible 
benefits

bq. Wow, good catch. Looks like that has been there since S3A was committed.

We just have to accept that sometimes, code you consider stable turns out to 
have bugs of some kind or other. Like, say, CPU parts :)

This is something we can/should backport on its own, as it is adding time & 
cost to every rename. Once this is in I'll do a micro-backport of the != fix, 
not worrying about test coverage.

bq. Your tests sharing a path instance implies the you were renaming from root 
dir to root dir (the only case Path#getParent() doesn't allocate a new Path). 
Sound right?

Actually, I don't think we have a test for that anywhere; there's nothing in 
abstract contract root directory test. Should be though, really. I just assumed 
that the mismatch was part of the cause of extra delete calls --maybe it was 
just unrelated but while doing the due diligence I noticed it in the debugger.

* Added a new test {{ITestS3AFileOperationCost#testCostOfRootRename()}} to see 
what goes on there. Not a lot, thankfully.
* Given this was the first time we've looked at that test suite for a while, 
did a quick cleanup, moving to LTU.intercept() where appropriate, and having 
the entire suite skipped during fault injection; too much risk of false test 
failures.

Also, we don't need to skip that test with s3guard on. now know things are 
fixed. 


w.r.t {{makeDirsOrdered}} in innerMkDirs: I've cut it, everything seems good.

But I've left the method in S3Guard for now, because it discusses issues there 
which are interesting. Maybe 

h3. Checkstyle
Fixed important, cut comment too long, and also moved innerRename to 
maybeAddTrailingSlash where appropriate...that keeps the line count down there. 

h3. Tests

Tested {{ITestS3AFileOperationCost}} the  with s3guard, s3guard + auth, s3guard 
+ dynamo, s3guard + dynamo + auth. The set.

Did full suite test run with: {{-Dparallel-tests -DtestsThreadCount=12 
-Ds3guard -Dauth -Dscale}}

I did *not* see the problems that Sean saw, which I can't even point to a cause 
of, especially the {{AbstractS3GuardToolTestBase.run:95 ยป IllegalArgument}} 
cases. 

Sean, can you do a clean build & test & stick up stack traces? 

> ITestS3AFileOperationCost#testFakeDirectoryDeletion failing after 
> OutputCommitter patch
> ---------------------------------------------------------------------------------------
>
>                 Key: HADOOP-15079
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15079
>             Project: Hadoop Common
>          Issue Type: Sub-task
>    Affects Versions: 3.1.0
>            Reporter: Sean Mackrory
>            Assignee: Steve Loughran
>            Priority: Critical
>         Attachments: HADOOP-15079-001.patch, HADOOP-15079-002.patch
>
>
> I see this test failing with "object_delete_requests expected:<1> but 
> was:<2>". I printed stack traces whenever this metric was incremented, and 
> found the root cause to be that innerMkdirs is now causing two calls to 
> delete fake directories when it previously caused only one. It is called once 
> inside createFakeDirectory, and once directly inside innerMkdirs later:
> {code}
>         at 
> org.apache.hadoop.fs.s3a.S3AInstrumentation.incrementCounter(S3AInstrumentation.java:454)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.incrementStatistic(S3AFileSystem.java:1108)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$deleteObjects$8(S3AFileSystem.java:1369)
>         at 
> org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:313)
>         at 
> org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:279)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.deleteObjects(S3AFileSystem.java:1366)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeys(S3AFileSystem.java:1625)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.deleteUnnecessaryFakeDirectories(S3AFileSystem.java:2634)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.finishedWrite(S3AFileSystem.java:2599)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.putObjectDirect(S3AFileSystem.java:1498)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$createEmptyObject$11(S3AFileSystem.java:2684)
>         at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:108)
>         at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$3(Invoker.java:259)
>         at 
> org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:313)
>         at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:255)
>         at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:230)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.createEmptyObject(S3AFileSystem.java:2682)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.createFakeDirectory(S3AFileSystem.java:2657)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:2021)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:1956)
>         at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2305)
>         at 
> org.apache.hadoop.fs.contract.AbstractFSContractTestBase.mkdirs(AbstractFSContractTestBase.java:338)
>         at 
> org.apache.hadoop.fs.s3a.ITestS3AFileOperationCost.testFakeDirectoryDeletion(ITestS3AFileOperationCost.java:209)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>         at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>         at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>         at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>         at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74
> {code}
> {code}
>         at 
> org.apache.hadoop.fs.s3a.S3AInstrumentation.incrementCounter(S3AInstrumentation.java:454)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.incrementStatistic(S3AFileSystem.java:1108)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$deleteObjects$8(S3AFileSystem.java:1369)
>         at 
> org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:313)
>         at 
> org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:279)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.deleteObjects(S3AFileSystem.java:1366)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.removeKeys(S3AFileSystem.java:1625)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.deleteUnnecessaryFakeDirectories(S3AFileSystem.java:2634)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.innerMkdirs(S3AFileSystem.java:2025)
>         at 
> org.apache.hadoop.fs.s3a.S3AFileSystem.mkdirs(S3AFileSystem.java:1956)
>         at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2305)
>         at 
> org.apache.hadoop.fs.contract.AbstractFSContractTestBase.mkdirs(AbstractFSContractTestBase.java:338)
>         at 
> org.apache.hadoop.fs.s3a.ITestS3AFileOperationCost.testFakeDirectoryDeletion(ITestS3AFileOperationCost.java:209)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:498)
>         at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>         at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>         at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>         at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>         at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>         at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>         at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>         at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to