[ 
https://issues.apache.org/jira/browse/SOLR-8575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144882#comment-15144882
 ] 

Yonik Seeley commented on SOLR-8575:
------------------------------------

Yeah, if HDFS had reported the correct length, the old code (prior to this 
JIRA) would have attempted to read partial records and hit EOFs where it 
shouldn't.

For others following along... the key part of the current patch is this:
{code}
        synchronized (HdfsTransactionLog.this) {
          fos.flushBuffer();
          sz = fos.size();
        }
{code}

The synchronization (on the same monitor used to write records) means that 
our recorded "sz" always falls on a record boundary and is hence safe to read 
up to.
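
To illustrate the point, here is a minimal sketch under stated assumptions. 
RecordLog, writeRecord and safeReadLimit are hypothetical names for a toy 
in-memory log, not Solr's HdfsTransactionLog, and the length-prefixed record 
layout is only an assumption for the example. It shows why sampling the size 
under the writer's monitor can only ever observe a whole number of records:

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

// Hypothetical toy log; NOT the real HdfsTransactionLog.
class RecordLog {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    // A writer holds the log's monitor for the entire record
    // (4-byte length prefix followed by the payload).
    public synchronized void writeRecord(byte[] payload) {
        byte[] header = ByteBuffer.allocate(4).putInt(payload.length).array();
        out.write(header, 0, header.length);
        out.write(payload, 0, payload.length);
    }

    // A reader samples the size under the same monitor, so no writer can be
    // halfway through a record when the size is taken.
    public long safeReadLimit() {
        synchronized (this) {
            return out.size();
        }
    }
}
```

With fixed 16-byte payloads every record occupies 20 bytes, so any value 
returned by safeReadLimit() is a multiple of 20, even while other threads 
are calling writeRecord(). Without the shared monitor a reader could observe 
a size that ends in the middle of a record.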


> Fix HDFSLogReader replay status numbers, a performance bug where we can 
> reopen FSDataInputStream much too often, and an hdfs tlog data integrity bug.
> -----------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-8575
>                 URL: https://issues.apache.org/jira/browse/SOLR-8575
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: master
>
>         Attachments: SOLR-8575.patch, SOLR-8575.patch
>
>
> [~pdvo...@cloudera.com] noticed some funny transaction log replay status 
> logging a while back:
> active=true starting pos=444978 current pos=2855956 current size=16262 % 
> read=17562
> active=true starting pos=444978 current pos=5748869 current size=16262 % 
> read=35352
> 17562% read? Current size does not change as expected in this case?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

