[ https://issues.apache.org/jira/browse/SOLR-8575?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15144882#comment-15144882 ]
Yonik Seeley commented on SOLR-8575:
------------------------------------

Yeah, if HDFS had reported the correct length, the old code (prior to this JIRA) would have attempted to read partial records and get EOFs where it shouldn't.

For others following along... the key thing to the current patch is this:
{code}
synchronized (HdfsTransactionLog.this) {
  fos.flushBuffer();
  sz = fos.size();
}
{code}

The synchronization (which is the same monitor used to write records) means that our recorded "sz" represents a whole record and is hence safe to read up to.

> Fix HDFSLogReader replay status numbers, a performance bug where we can
> reopen FSDataInputStream much too often, and an hdfs tlog data integrity bug.
> ------------------------------------------------------------------------------
>
>                 Key: SOLR-8575
>                 URL: https://issues.apache.org/jira/browse/SOLR-8575
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Mark Miller
>            Assignee: Mark Miller
>             Fix For: master
>
>         Attachments: SOLR-8575.patch, SOLR-8575.patch
>
>
> [~pdvo...@cloudera.com] noticed some funny transaction log replay status
> logging a while back:
> active=true starting pos=444978 current pos=2855956 current size=16262 % read=17562
> active=true starting pos=444978 current pos=5748869 current size=16262 % read=35352
> 17562% read? Current size does not change as expected in this case?
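
For readers following along, the synchronization argument in the comment above can be illustrated with a small standalone sketch. This is not the Solr code: `ToyLog`, `writeRecord`, `snapshotSize`, and the length-prefixed record layout are hypothetical names and assumptions made up for the example. The only point it tries to show is that when the size is sampled under the same monitor that writers hold for a whole record, the sampled size cannot fall in the middle of a record.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a transaction log; NOT Solr's HdfsTransactionLog.
class ToyLog {
    private final ByteArrayOutputStream sink = new ByteArrayOutputStream();
    // Buffered on purpose, so that a flush is needed before the size is meaningful.
    private final DataOutputStream out =
        new DataOutputStream(new BufferedOutputStream(sink));

    // Writers append one length-prefixed record while holding the log's monitor.
    void writeRecord(byte[] payload) throws IOException {
        synchronized (this) {
            out.writeInt(payload.length); // 4-byte length prefix (assumed layout)
            out.write(payload);
        }
    }

    // Readers flush and sample the size under the SAME monitor, so the
    // returned value always lands on a record boundary.
    long snapshotSize() throws IOException {
        synchronized (this) {
            out.flush();
            return sink.size();
        }
    }
}

class WholeRecordDemo {
    public static void main(String[] args) throws Exception {
        ToyLog log = new ToyLog();
        final int recordBytes = 4 + 16; // length prefix + fixed 16-byte payload

        Thread writer = new Thread(() -> {
            try {
                for (int i = 0; i < 10_000; i++) {
                    log.writeRecord(new byte[16]);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        writer.start();

        // Sample the size while the writer is running; every sample should be
        // a whole number of records.
        boolean allWhole = true;
        while (writer.isAlive()) {
            allWhole &= log.snapshotSize() % recordBytes == 0;
        }
        writer.join();

        System.out.println("all snapshots on record boundaries: " + allWhole);
        System.out.println("final size: " + log.snapshotSize());
    }
}
```

If the `synchronized` block were dropped from `snapshotSize`, a sample could in principle observe the length prefix of a record without its payload, which is the partial-record situation the comment describes.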