[ https://issues.apache.org/jira/browse/ACCUMULO-4542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Christopher Tubbs resolved ACCUMULO-4542.
-----------------------------------------
Resolution: Cannot Reproduce
Can't reproduce, and this is OBE (overtaken by events) now that 2.0 has the new bulk import API.
> Tablet left in bad state after bulk import timeout
> --------------------------------------------------
>
> Key: ACCUMULO-4542
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4542
> Project: Accumulo
> Issue Type: Bug
> Affects Versions: 1.7.2
> Reporter: John Vines
> Priority: Major
>
> On a cluster we saw a large number of network issues at one point. The cause
> still has not been pinpointed, but it resulted in us seeing a lot of RPC
> exceptions and the like.
> While these network issues were happening, a bulk import was kicked off for a
> single file. This single file was assigned to two tablets (which both
> happened to be on the same server). Unfortunately, in the 3 attempts the bulk
> import made to assign this file to this tablet, there were 3 RPC exceptions
> due to a socket timeout. After the three failures the bulk import went ahead,
> moved this file to the failures directory, and carried on.
> Unfortunately, this file was actually assigned to the tablet successfully on
> the first attempt. The following 2 attempts logged that the server had
> already been assigned this file. Shortly afterward a query came in
> (and later major compactions), which complained that the file
> could not be found because the bulk import had moved it to the failures directory.
> I think in this event we need some sort of final validation that the record didn't
> end up in the metadata table before we move the file to the failures directory.
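
The final validation suggested in the report could look roughly like the sketch below. It is a self-contained simulation, not Accumulo code: `MetadataView`, `FlakyServer`, `importWithValidation`, and `Outcome` are hypothetical names invented for illustration, and the in-memory set only stands in for the metadata table. The assumption is that an RPC timeout is ambiguous (the assignment may have been applied even though the reply was lost), so the client checks for an existing reference before treating the file as failed.

```java
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;

final class BulkImportFailureCheck {

    /** Hypothetical stand-in for a metadata table lookup; not an Accumulo API. */
    interface MetadataView {
        boolean isFileReferenced(String file);
    }

    enum Outcome { LOADED, MOVED_TO_FAILURES }

    /** Hypothetical server stub whose replies always time out. */
    static final class FlakyServer implements MetadataView {
        private final Set<String> referenced = new HashSet<>();
        private final boolean appliesBeforeTimeout;

        FlakyServer(boolean appliesBeforeTimeout) {
            this.appliesBeforeTimeout = appliesBeforeTimeout;
        }

        void assign(String file) throws IOException {
            if (appliesBeforeTimeout) {
                referenced.add(file); // the assignment takes effect on the server
            }
            // The reply never reaches the client in this simulation.
            throw new IOException("socket timeout");
        }

        @Override
        public boolean isFileReferenced(String file) {
            return referenced.contains(file);
        }
    }

    static Outcome importWithValidation(FlakyServer server, String file, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                server.assign(file);
                return Outcome.LOADED;
            } catch (IOException e) {
                // Ambiguous failure: the server may or may not have applied the assignment.
            }
        }
        // Final validation: only move the file if nothing references it.
        return server.isFileReferenced(file) ? Outcome.LOADED : Outcome.MOVED_TO_FAILURES;
    }

    public static void main(String[] args) {
        // Mirrors the report: the assignment succeeded, but every reply timed out.
        System.out.println(importWithValidation(new FlakyServer(true), "file1.rf", 3));
        // Control case: the assignment never took effect, so the file really failed.
        System.out.println(importWithValidation(new FlakyServer(false), "file2.rf", 3));
    }
}
```

In the first case the check finds the reference and the file stays in place; in the second, nothing references the file and moving it to the failures directory is safe.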
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)