[
https://issues.apache.org/jira/browse/ACCUMULO-4090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15069981#comment-15069981
]
Eric Newton commented on ACCUMULO-4090:
---------------------------------------
I wrote a throw-away test to try to simulate the behavior.
I modified TabletServer to close any update sessions in flush.
I instrumented TabletServerBatchWriter (TSBW) to keep a count of its live
instances.
I wrote a client tester that wrote data and let it flush on latency (because
that's what the production client is doing). Then I wrote some more mutations
and closed it. As expected, I got the MutationsRejectedException. Then I
closed the batchWriter and got another MutationsRejectedException.
I repeated the process 10 times, ran the GC, waited, ran the GC again, and
verified that all the TSBWs were gone.
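For anyone who wants to repeat the check, here is a minimal sketch of the kind of instance counting described above. It is not the actual instrumentation used in the test: InstanceTracker, register, liveCount and awaitAllCollected are hypothetical names, and the sketch assumes the tracked class calls register(this) from its constructor. It only uses java.lang.ref.WeakReference from the JDK, so the tracker itself never keeps an instance alive.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical test helper; not part of Accumulo. */
public class InstanceTracker {
    // Weak references do not prevent the tracked objects from being collected.
    private final List<WeakReference<Object>> refs = new ArrayList<>();

    /** Assumed to be called from the constructor of the class being counted. */
    public synchronized void register(Object instance) {
        refs.add(new WeakReference<>(instance));
    }

    /** Number of registered instances that have not been garbage collected yet. */
    public synchronized int liveCount() {
        refs.removeIf(ref -> ref.get() == null);
        return refs.size();
    }

    /**
     * Requests a GC, waits, and re-checks, up to maxAttempts times.
     * System.gc() is only a hint to the JVM, so a false result is not proof of a leak.
     */
    public boolean awaitAllCollected(int maxAttempts, long waitMillis) {
        for (int i = 0; i < maxAttempts; i++) {
            if (liveCount() == 0) {
                return true;
            }
            System.gc();
            try {
                Thread.sleep(waitMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and report the current state.
                Thread.currentThread().interrupt();
                return liveCount() == 0;
            }
        }
        return liveCount() == 0;
    }
}
```

After the write/close cycles, something like tracker.awaitAllCollected(2, 1000) mirrors the "ran the GC, waited, ran the GC again" step. If it stays false while liveCount() keeps growing, a heap dump is the next place to look for whatever still references the instances.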
> BatchWriter close not cleaning up all resources
> -----------------------------------------------
>
> Key: ACCUMULO-4090
> URL: https://issues.apache.org/jira/browse/ACCUMULO-4090
> Project: Accumulo
> Issue Type: Bug
> Components: client
> Affects Versions: 1.7.0
> Reporter: Eric Newton
> Assignee: Eric Newton
>
> I'm debugging an issue with a long-running ingestor, similar to the
> TraceServer.
> After realizing that BatchWriter close needs to be called when a
> MutationsRejectedException occurs (see ACCUMULO-4088), a close was added, and
> the client became more stable.
> However, after a day or so, the client became sluggish. When inspecting a
> heap dump, many TabletServerBatchWriter objects were still referenced. This
> server should only have two BatchWriter instances at any one time, and this
> server had >100.
> Still debugging.
> The error that initiates the issue is a SessionID not found, presumably
> because the session timed out. This is the cause of the
> MutationsRejectedException seen by the client.