Brian Loss created ACCUMULO-727:
-----------------------------------
Summary: Bulk Import retry time needs to be longer/configurable
Key: ACCUMULO-727
URL: https://issues.apache.org/jira/browse/ACCUMULO-727
Project: Accumulo
Issue Type: Bug
Components: tserver
Affects Versions: 1.4.1
Reporter: Brian Loss
Assignee: Keith Turner
Bulk import retries far too quickly, at least under some circumstances. The
master killed one of our tablet servers (we were overloading it with ingest,
and its hold time grew too large). At the same time, a bulk import operation
had begun, and several map files were assigned to the server that had just
been killed. The bulk import retried three times within an 8-second span,
each attempt failing with a connection refused error, and then gave up,
failing the file completely. Meanwhile, the master took about 1m 20s to
reassign the tablet to another server.
The bulk import process should account for this possibility. Either it should
recognize that a tablet server it cannot connect to is probably down and that
its tablets will be reassigned elsewhere, or it should wait longer, so that
the default maximum wait time exceeds the average tablet reassignment time. In
the latter case, the retry interval should also be made configurable.
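To illustrate the second option, here is a rough sketch of a retry loop with a configurable interval and an overall time budget. It is not taken from the Accumulo code base: the class BulkImportRetry, the RetryConfig fields, the Sleeper interface, and the doubling backoff are all hypothetical names and choices made up for this example, and the real fix may look quite different.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical sketch only; none of these names exist in Accumulo.
class BulkImportRetry {

    /** Assumed settings that a configurable retry might expose. */
    static final class RetryConfig {
        final long initialWaitMs; // pause before the first retry
        final long maxWaitMs;     // upper bound on any single pause
        final long maxTotalMs;    // total time to keep retrying before giving up

        RetryConfig(long initialWaitMs, long maxWaitMs, long maxTotalMs) {
            this.initialWaitMs = initialWaitMs;
            this.maxWaitMs = maxWaitMs;
            this.maxTotalMs = maxTotalMs;
        }
    }

    /** Abstracts the pause so the loop can be exercised without real sleeping. */
    interface Sleeper {
        void sleep(long ms) throws InterruptedException;
    }

    /**
     * Calls op until it succeeds or the total wait budget is used up.
     * Only connection failures are retried; any other exception propagates.
     */
    static <T> T callWithRetry(Callable<T> op, RetryConfig cfg, Sleeper sleeper)
            throws Exception {
        long waited = 0;
        long wait = cfg.initialWaitMs;
        while (true) {
            try {
                return op.call();
            } catch (ConnectException e) {
                if (waited >= cfg.maxTotalMs) {
                    throw e; // budget exhausted, report the last failure
                }
                // Never pause past the end of the total budget.
                long pause = Math.min(wait, cfg.maxTotalMs - waited);
                sleeper.sleep(pause);
                waited += pause;
                // Double the pause for next time, up to the configured cap.
                wait = Math.min(wait * 2, cfg.maxWaitMs);
            }
        }
    }
}
```

With assumed settings of a 1 s initial wait, a 30 s cap, and a 5 minute budget, the pauses would be 1, 2, 4, 8, 16, 30, and 30 s, so a caller passing Thread::sleep as the Sleeper would still be retrying after the roughly 1m 20s reassignment described above. The defaults themselves would need to be chosen against measured reassignment times.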