[ 
https://issues.apache.org/jira/browse/HADOOP-14215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Mackrory updated HADOOP-14215:
-----------------------------------
    Attachment: HADOOP-14215-HADOOP-13345.002.patch

Attaching a patch with the test added. The test borrows heavily from how 
ITestS3AConcurrentOps ended up doing things.

The characteristics of the test seem to vary wildly, but I think I'm almost 
satisfied. I'm going to do a bunch more test runs just to be more certain. Last 
night I was able to reproduce the problem in 1 or 2 threads in each iteration 
within 1 or 2 iterations. This morning I had a bunch of test runs that couldn't 
reproduce the problem. This afternoon I saw all but 1 thread (sometimes all 
threads - which was weird - see below) in every single iteration experience the 
error.

Immediately after getting a bunch of test runs that hit the problem very 
severely, I applied your patch, and the results started being complete success 
75% of the time, with a single thread failing once in a while. I then 
realized I was missing your retry-until-version-marker-is-present fix. I added 
that and I haven't had a failure since. I mention all that detail because I'm 
afraid I didn't think to really dig into the logs on the occasional cases when 
a single thread failed, so I'm only guessing that those were the cases that 
didn't hit the UPDATING state but did lose the race with the version marker.
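
For readers who have not seen the patch, a retry-until-present loop of the 
kind mentioned above might look roughly like the following. This is only a 
hypothetical sketch of the general pattern, not the code from the patch: the 
class name VersionMarkerRetry, the method retryUntilPresent, and the 
Supplier-based lookup are all assumptions made for illustration.

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical helper; names and signature are illustrative assumptions. */
class VersionMarkerRetry {

    /**
     * Calls the lookup until it yields a value or maxAttempts is exhausted,
     * sleeping sleepMillis between attempts.
     */
    static <T> Optional<T> retryUntilPresent(Supplier<Optional<T>> lookup,
                                             int maxAttempts,
                                             long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = lookup.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Simulated marker lookup: absent on the first two calls, present after.
        int[] calls = {0};
        Optional<String> marker = retryUntilPresent(
            () -> ++calls[0] < 3 ? Optional.<String>empty() : Optional.of("marker"),
            5, 10);
        System.out.println("found=" + marker.isPresent() + " calls=" + calls[0]);
    }
}
```

The point of the pattern is that a reader who loses the race with the writer 
of the marker treats "not there yet" as transient rather than as a failure, 
but still gives up after a bounded number of attempts.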

So again, I'm going to do a bunch more test runs to be sure, but I think things 
are looking good. +1 to your changes based on a visual code review, too.

> DynamoDB client should waitForActive on existing tables
> -------------------------------------------------------
>
>                 Key: HADOOP-14215
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14215
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sean Mackrory
>            Assignee: Sean Mackrory
>            Priority: Critical
>         Attachments: HADOOP-14215-HADOOP-13345.000.patch, 
> HADOOP-14215-HADOOP-13345.001.patch, HADOOP-14215-HADOOP-13345.002.patch
>
>
> I saw a case where 2 separate applications tried to use the same 
> non-pre-existing table with table.create = true at about the same time. One 
> failed with a ResourceInUse exception. If a table does not exist, we attempt 
> to create it and then wait for it to enter the active state. If another 
> application jumps in in the middle of that, the table may already exist, so 
> we bypass our call to waitForActive() and then try to use the table 
> immediately.
> While we're at it, let's also make sure that the race condition where a table 
> might get created between checking if it exists and attempting to create it 
> is handled gracefully.
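
The behaviour requested in the description above can be sketched as follows. 
This is a self-contained simulation under stated assumptions, not the patch 
and not the AWS SDK: FakeTableService, its Status enum, its local 
ResourceInUseException, and TableInitializer are all hypothetical stand-ins. 
The two points it illustrates are (1) always waiting for the active state, 
even when the table already exists, and (2) tolerating a create that loses 
the race to another client.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical init logic: create if missing, then always wait for ACTIVE. */
class TableInitializer {

    static void initTable(FakeTableService svc, String name, long timeoutMillis) {
        if (svc.describe(name) == null) {
            try {
                svc.create(name);
            } catch (FakeTableService.ResourceInUseException e) {
                // Another client created the table between our existence
                // check and our create call; fall through and wait for it.
            }
        }
        // Wait even if the table already existed: it may still be CREATING.
        waitForActive(svc, name, timeoutMillis);
    }

    static void waitForActive(FakeTableService svc, String name, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (svc.describe(name) != FakeTableService.Status.ACTIVE) {
            if (System.nanoTime() - deadline > 0) {
                throw new IllegalStateException("Timed out waiting for " + name);
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted waiting for " + name, e);
            }
        }
    }

    public static void main(String[] args) {
        FakeTableService svc = new FakeTableService(100);
        svc.create("demo");              // simulates another application
        initTable(svc, "demo", 5000);    // must not use the table until ACTIVE
        System.out.println("status=" + svc.describe("demo"));
    }
}

/** Hypothetical in-memory stand-in for the table service. */
class FakeTableService {
    enum Status { CREATING, ACTIVE }

    static class ResourceInUseException extends RuntimeException {
        ResourceInUseException(String msg) { super(msg); }
    }

    // Table name -> nanoTime at which the table becomes ACTIVE.
    private final ConcurrentMap<String, Long> activeAt = new ConcurrentHashMap<>();
    private final long creationDelayNanos;

    FakeTableService(long creationDelayMillis) {
        this.creationDelayNanos = creationDelayMillis * 1_000_000L;
    }

    /** Returns null if the table does not exist. */
    Status describe(String name) {
        Long at = activeAt.get(name);
        if (at == null) {
            return null;
        }
        return System.nanoTime() - at >= 0 ? Status.ACTIVE : Status.CREATING;
    }

    /** Fails if the table already exists, whether CREATING or ACTIVE. */
    void create(String name) {
        Long prev = activeAt.putIfAbsent(name, System.nanoTime() + creationDelayNanos);
        if (prev != null) {
            throw new ResourceInUseException("Table already exists: " + name);
        }
    }
}
```

In this sketch the wait is unconditional, so a client that finds the table 
already present behaves the same as the client that created it.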



