[
https://issues.apache.org/jira/browse/SOLR-8575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144779#comment-15144779
]
Yonik Seeley commented on SOLR-8575:
------------------------------------
I've started testing this morning with this patch... it will be a few hours at
least before I know if it's fixed for me as well.
One of the errors caused by premature EOF that I was seeing happened after the
re-open, so the constructor changes should not matter in that specific failure.
But an important addition was made in this current patch, which calls
fos.flushBuffer() in the reopen... that was missing in the previous patch.
Actually, it looks like this patch fixed more than just performance... that
missing fos.flushBuffer() wasn't just missing from the previous patch, it was
never there in the code to begin with! This appears to mean that prior to this
JIRA, buffering while replaying could sometimes prematurely abort (by getting
an EOF) because a partial record was written. Simply adding a flushBuffer
would not have been sufficient though... by using the actual size of the file
(unsynchronized) as the point to read up to, we can get premature EOFs as well.
Given we're using 64K write buffers, the odds of seeing issues due to this are
related to the size of the documents being indexed as well as the throughput.
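
A minimal in-memory sketch of the failure mode described above, not the actual
Solr/HDFS code. Everything in it is an assumption made for illustration: the
class and method names (PartialRecordDemo, BufferedLog, readRecords), the
length-prefixed record format, and a 16-byte buffer standing in for the 64K
one. It shows a reader hitting EOF when it reads up to the raw file size, both
when no flush happens and when a flush is followed by further buffered writes,
while reading up to the position recorded at the last flush stays safe.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Illustrative only: all names here are hypothetical, not Solr's classes.
class PartialRecordDemo {

    /** Append-only in-memory "file" written through a small fixed-size buffer. */
    static class BufferedLog {
        private final ByteArrayOutputStream file = new ByteArrayOutputStream();
        private final byte[] buf;
        private int used = 0;
        private long flushedRecordEnd = 0; // last position known to end on a record boundary

        BufferedLog(int bufferSize) {
            buf = new byte[bufferSize];
        }

        /** Writes a 4-byte big-endian length followed by the payload bytes. */
        void writeRecord(String payload) {
            byte[] body = payload.getBytes(StandardCharsets.UTF_8);
            int len = body.length;
            writeByte((byte) (len >>> 24));
            writeByte((byte) (len >>> 16));
            writeByte((byte) (len >>> 8));
            writeByte((byte) len);
            for (byte b : body) {
                writeByte(b);
            }
        }

        private void writeByte(byte b) {
            if (used == buf.length) {
                spill(); // buffer full: bytes can reach the file mid-record
            }
            buf[used++] = b;
        }

        private void spill() {
            file.write(buf, 0, used);
            used = 0;
        }

        /** Pushes buffered bytes out and remembers the position. Call between records. */
        void flushBuffer() {
            spill();
            flushedRecordEnd = file.size();
        }

        long fileSize() { return file.size(); }
        long flushedRecordEnd() { return flushedRecordEnd; }
        byte[] snapshot() { return file.toByteArray(); }
    }

    /** Reads whole records from data[0, limit); throws EOFException on a partial record. */
    static int readRecords(byte[] data, long limit) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data, 0, (int) limit));
        int count = 0;
        while (in.available() > 0) {
            byte[] body = new byte[in.readInt()];
            in.readFully(body);
            count++;
        }
        return count;
    }

    /** True if reading up to limit runs into a partially written record. */
    static boolean hitsPrematureEof(byte[] data, long limit) throws IOException {
        try {
            readRecords(data, limit);
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        BufferedLog log = new BufferedLog(16);
        log.writeRecord("aaaaaaaa");
        log.writeRecord("bbbbbbbb"); // buffer spills partway through this record
        System.out.println("no flush, read to file size, EOF: "
                + hitsPrematureEof(log.snapshot(), log.fileSize()));

        log.flushBuffer();
        log.writeRecord("cccccccccccccccc"); // spills again after the flush
        System.out.println("after flush, read to file size, EOF: "
                + hitsPrematureEof(log.snapshot(), log.fileSize()));
        System.out.println("after flush, records up to flushed position: "
                + readRecords(log.snapshot(), log.flushedRecordEnd()));
    }
}
```

The point of the design is that the reader's limit comes from the writer (a
position recorded at a record boundary) rather than from whatever length the
file happens to report at that moment.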
> Fix HDFSLogReader replay status numbers and a performance bug where we can
> reopen FSDataInputStream too often.
> --------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-8575
> URL: https://issues.apache.org/jira/browse/SOLR-8575
> Project: Solr
> Issue Type: Bug
> Reporter: Mark Miller
> Assignee: Mark Miller
> Fix For: master
>
> Attachments: SOLR-8575.patch, SOLR-8575.patch
>
>
> [[email protected]] noticed some funny transaction log replay status
> logging a while back:
> active=true starting pos=444978 current pos=2855956 current size=16262 % read=17562
> active=true starting pos=444978 current pos=5748869 current size=16262 % read=35352
> 17562% read? Current size does not change as expected in this case?
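
For reference, the two logged percentages are consistent with dividing the
current position by the (unchanging) current size and rounding. The formula
below is an assumption inferred from the numbers in the log lines, not taken
from the Solr source, and the class and method names are hypothetical.

```java
// Illustrative only: an assumed formula that reproduces the logged values.
class ReplayPercentDemo {

    /** Percent read as round(100 * currentPos / currentSize). */
    static long percentRead(long currentPos, long currentSize) {
        return Math.round(100.0 * currentPos / currentSize);
    }

    public static void main(String[] args) {
        // Values copied from the two log lines above.
        System.out.println(percentRead(2855956L, 16262L)); // prints 17562
        System.out.println(percentRead(5748869L, 16262L)); // prints 35352
    }
}
```

With a size that stays at 16262 while the position keeps growing, a percentage
computed this way climbs far past 100.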