[ https://issues.apache.org/jira/browse/HADOOP-19557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17948469#comment-17948469 ]
ASF GitHub Bot commented on HADOOP-19557:
-----------------------------------------

ahmarsuhail commented on code in PR #7662:
URL: https://github.com/apache/hadoop/pull/7662#discussion_r2068635417

##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java:
##########
@@ -829,7 +829,8 @@ public boolean hasCapability(String capability) {
   @Override
   public void hflush() throws IOException {
     statistics.hflushInvoked();
-    handleSyncableInvocation();
+    // do not reject these, but downgrade to a no-oop
+    LOG.debug("Hflush invoked");

Review Comment:
   @steveloughran is Parquet the only reader calling hflush? I think this changes behaviour for everyone. Is this something we need to care about?

> S3A: S3ABlockOutputStream to never log/reject hflush(): calls
> -------------------------------------------------------------
>
>                 Key: HADOOP-19557
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19557
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Critical
>              Labels: pull-request-available
>
> Parquet's GH-3204 patch calls hflush() just before close().
> This is needless and hurts write performance on HDFS.
> For S3A it will trigger a warning log ("Syncable is not supported") or an
> actual failure if fs.s3a.downgrade.syncable.exceptions is false.
> Proposed: hflush() logs at debug only; log/reject only on hsync(), which is
> the real place where the semantics cannot be met.
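
The behaviour proposed in the issue can be illustrated with a small standalone sketch. This is not the Hadoop code: the class SyncableSketch, its in-memory log list and the message strings are hypothetical, chosen only to show the split between hflush() (always accepted, recorded at debug) and hsync() (warned about or rejected, depending on a flag that plays the role of fs.s3a.downgrade.syncable.exceptions).

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical stand-in for an output stream that cannot honour Syncable.
 * Not the Hadoop class: names and messages are illustrative assumptions.
 */
public class SyncableSketch {

    /** Plays the role of fs.s3a.downgrade.syncable.exceptions. */
    private final boolean downgradeSyncableExceptions;

    /** Captured log lines, so behaviour can be inspected without a logging framework. */
    private final List<String> logLines = new ArrayList<>();

    public SyncableSketch(boolean downgradeSyncableExceptions) {
        this.downgradeSyncableExceptions = downgradeSyncableExceptions;
    }

    /** Proposed behaviour: never reject and never warn; record at debug only. */
    public void hflush() {
        logLines.add("DEBUG hflush invoked");
    }

    /** hsync promises durability that cannot be delivered, so warn or fail. */
    public void hsync() {
        if (!downgradeSyncableExceptions) {
            throw new UnsupportedOperationException("Syncable is not supported");
        }
        logLines.add("WARN Syncable is not supported");
    }

    public List<String> getLogLines() {
        return logLines;
    }

    public static void main(String[] args) {
        SyncableSketch strict = new SyncableSketch(false);
        strict.hflush(); // accepted even when exceptions are not downgraded
        try {
            strict.hsync(); // rejected when exceptions are not downgraded
        } catch (UnsupportedOperationException e) {
            System.out.println("hsync rejected: " + e.getMessage());
        }

        SyncableSketch lenient = new SyncableSketch(true);
        lenient.hsync(); // downgraded to a warning
        System.out.println(strict.getLogLines());
        System.out.println(lenient.getLogLines());
    }
}
```

Under this split, a writer that calls hflush() just before close() neither sees a warning nor fails, whatever the flag is set to, while callers that rely on hsync() for durability are still told that the guarantee is not available.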