[ 
https://issues.apache.org/jira/browse/HADOOP-18458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17703520#comment-17703520
 ] 

ASF GitHub Bot commented on HADOOP-18458:
-----------------------------------------

wujinhu commented on code in PR #4912:
URL: https://github.com/apache/hadoop/pull/4912#discussion_r1144329309


##########
hadoop-tools/hadoop-aliyun/src/main/java/org/apache/hadoop/fs/aliyun/oss/AliyunOSSBlockOutputStream.java:
##########
@@ -138,64 +168,74 @@ public synchronized void write(int b) throws IOException {
   @Override
   public synchronized void write(byte[] b, int off, int len)
       throws IOException {
-    if (closed) {
-      throw new IOException("Stream closed.");
+    OSSDataBlocks.validateWriteArgs(b, off, len);
+    checkOpen();
+    if (len == 0) {
+      return;
     }
-    blockStream.write(b, off, len);
-    blockWritten += len;
-    if (blockWritten >= blockSize) {
-      uploadCurrentPart();
-      blockWritten = 0L;
+    OSSDataBlocks.DataBlock block = createBlockIfNeeded();
+    int written = block.write(b, off, len);
+    blockWritten += written;
+    int remainingCapacity = block.remainingCapacity();
+    if (written < len) {
+      // not everything was written — the block has run out
+      // of capacity
+      // Trigger an upload then process the remainder.
+      LOG.debug("writing more data than block has capacity -triggering upload");
+      uploadCurrentBlock();
+      // tail recursion is mildly expensive, but given buffer sizes must be MB.
+      // it's unlikely to recurse very deeply.
+      this.write(b, off + written, len - written);

Review Comment:
   Good suggestion, will optimize the code.
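
   For illustration only: the reviewer's actual suggestion is not quoted in this message, so the following is a guess at one way the recursive call in the hunk above could be replaced with a loop. It is a minimal, self-contained sketch. `BlockWriteLoopSketch`, `Block`, and `Sink` are hypothetical stand-ins, not the `OSSDataBlocks` classes from the PR, and the "upload" only appends the block's bytes to an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BlockWriteLoopSketch {

  /** Hypothetical stand-in for a fixed-capacity data block. */
  static final class Block {
    private final byte[] buf;
    private int used;

    Block(int capacity) {
      buf = new byte[capacity];
    }

    /** Copies as many bytes as fit and returns the number accepted. */
    int write(byte[] b, int off, int len) {
      int n = Math.min(len, buf.length - used);
      System.arraycopy(b, off, buf, used, n);
      used += n;
      return n;
    }

    int remainingCapacity() {
      return buf.length - used;
    }

    byte[] toByteArray() {
      return Arrays.copyOf(buf, used);
    }
  }

  /** Hypothetical stand-in for the output stream; uploads are recorded in memory. */
  static final class Sink {
    private final int blockSize;
    private final List<byte[]> uploads = new ArrayList<>();
    private Block active;

    Sink(int blockSize) {
      if (blockSize <= 0) {
        throw new IllegalArgumentException("blockSize must be positive");
      }
      this.blockSize = blockSize;
    }

    /** Iterative write: fill the current block, upload it when full, repeat. */
    void write(byte[] b, int off, int len) {
      while (len > 0) {
        if (active == null) {
          active = new Block(blockSize);
        }
        int written = active.write(b, off, len);
        off += written;
        len -= written;
        if (active.remainingCapacity() == 0) {
          uploadCurrentBlock();
        }
      }
    }

    private void uploadCurrentBlock() {
      uploads.add(active.toByteArray());
      active = null;
    }

    List<byte[]> uploads() {
      return uploads;
    }

    int buffered() {
      return active == null ? 0 : active.used;
    }
  }

  public static void main(String[] args) {
    Sink sink = new Sink(4);
    byte[] data = new byte[10];
    sink.write(data, 0, data.length);
    System.out.println("uploads=" + sink.uploads().size()
        + " buffered=" + sink.buffered());
  }
}
```

   The loop keeps stack depth constant regardless of how many blocks a single write spans. Whether the PR adopted this shape is not stated here.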





> AliyunOSS: AliyunOSSBlockOutputStream to support heap/off-heap buffer before 
> uploading data to OSS
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-18458
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18458
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs/oss
>    Affects Versions: 3.0.3, 3.1.4, 2.10.2, 3.2.4, 3.3.4
>            Reporter: wujinhu
>            Assignee: wujinhu
>            Priority: Major
>              Labels: pull-request-available
>
> Recently, our customers raised a requirement: AliyunOSSBlockOutputStream 
> should support heap/off-heap buffers before uploading data to OSS.
> Currently, AliyunOSSBlockOutputStream buffers data in a local directory before 
> uploading to OSS, which is less efficient than buffering in memory.
> Changes:
>  # Adds heap/off-heap buffers
>  # Adds a limit on memory used, with fallback to disk



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

