[ https://issues.apache.org/jira/browse/HADOOP-19557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17948510#comment-17948510 ]
ASF GitHub Bot commented on HADOOP-19557:
-----------------------------------------

steveloughran commented on code in PR #7662:
URL: https://github.com/apache/hadoop/pull/7662#discussion_r2068923434


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java:
##########
@@ -829,7 +829,8 @@ public boolean hasCapability(String capability) {
   @Override
   public void hflush() throws IOException {
     statistics.hflushInvoked();
-    handleSyncableInvocation();
+    // do not reject these, but downgrade to a no-oop
+    LOG.debug("Hflush invoked");

Review Comment:
   Look at the fs spec. We say "don't use the api and highlight the inconsistent outcomes".

   The semantics of hflush say "visible to all" but no persistence, so this is not changing any durability semantics. Are we changing the visibility semantics? We're certainly not meeting them.

   I remember having a long talk with others about hflush, as in "what does it do?" The answer is "nothing you can rely on".

   When exceptions are downgraded (the default), all that happens is that the log message is removed, which reduces confusion.
   When exceptions are not downgraded (the call is rejected), the failure goes away.

   The one I want to fail here is hsync(), and that still holds. AFAIK nobody runs with that flag on except for some of our test setups.


--

> S3A: S3ABlockOutputStream to never log/reject hflush(): calls
> -------------------------------------------------------------
>
>                 Key: HADOOP-19557
>                 URL: https://issues.apache.org/jira/browse/HADOOP-19557
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.4.1
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Critical
>              Labels: pull-request-available
>
> Parquet's GH-3204 patch uses hflush() just before close().
> This is needless and hurts write performance on HDFS.
> For S3A it will trigger a warning log (Syncable is not supported) or an
> actual failure if fs.s3a.downgrade.syncable.exceptions is false.
>
> Proposed: hflush to log at debug; only log/reject on hsync, which is the real
> place where semantics cannot be met.
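
The proposed behaviour can be sketched in a few lines. This is only an illustration, not the code in the PR: `SyncableSketch` and its counters are hypothetical names, the checked `IOException` of the real methods is left out, and the use of `UnsupportedOperationException` for the rejected case is an assumption. The boolean field stands in for `fs.s3a.downgrade.syncable.exceptions`.

```java
import java.util.logging.Logger;

/**
 * Hypothetical stand-in for a block output stream; NOT the real
 * S3ABlockOutputStream. Checked IOExceptions are omitted for brevity.
 */
class SyncableSketch {
  private static final Logger LOG =
      Logger.getLogger(SyncableSketch.class.getName());

  /** Plays the role of fs.s3a.downgrade.syncable.exceptions. */
  private final boolean downgradeSyncableExceptions;

  /** Simple counters standing in for stream statistics. */
  int hflushCalls;
  int hsyncCalls;

  SyncableSketch(boolean downgradeSyncableExceptions) {
    this.downgradeSyncableExceptions = downgradeSyncableExceptions;
  }

  /** hflush(): counted and logged at debug level, never rejected. */
  void hflush() {
    hflushCalls++;
    LOG.fine("Hflush invoked");
  }

  /** hsync(): warn when downgraded, otherwise fail (exception type assumed). */
  void hsync() {
    hsyncCalls++;
    if (!downgradeSyncableExceptions) {
      throw new UnsupportedOperationException("Syncable is not supported");
    }
    LOG.warning("Syncable is not supported; hsync() did nothing");
  }

  public static void main(String[] args) {
    // Downgrade enabled: both calls return normally.
    SyncableSketch lenient = new SyncableSketch(true);
    lenient.hflush();
    lenient.hsync();

    // Downgrade disabled: hflush() still passes, hsync() is rejected.
    SyncableSketch strict = new SyncableSketch(false);
    strict.hflush();
    try {
      strict.hsync();
    } catch (UnsupportedOperationException e) {
      System.out.println("hsync rejected: " + e.getMessage());
    }
  }
}
```

With this split, a caller such as a writer that invokes hflush() just before close() sees neither a warning nor a failure, while hsync() remains the one call that reports that its guarantees cannot be met.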