[ https://issues.apache.org/jira/browse/HADOOP-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829186#comment-16829186 ]
Sammi Chen edited comment on HADOOP-15616 at 4/29/19 12:11 PM:
---------------------------------------------------------------
Comments for 009.patch,

1. CosNConfigKeys, the default pool sizes are too big. CosN will compete for CPU resources with other modules in Hadoop and with other services running in the big data system. Be conservative when allocating resources. Suggest 4 threads for the upload pool and 8 threads for the copy pool.
{code:java}
public static final int DEFAULT_UPLOAD_THREAD_POOL_SIZE =
    Runtime.getRuntime().availableProcessors() * 5;
public static final int DEFAULT_COPY_THREAD_POOL_SIZE =
    Runtime.getRuntime().availableProcessors() * 3;
{code}
Also add a comment documenting the time unit of
{code:java}
public static final long DEFAULT_THREAD_KEEP_ALIVE_TIME = 60L;
{code}
2. Function formatting, prefer a compact format: keep as many parameters on one line as possible. There are many such places in the patch to be fixed, not just the following one.
{code:java}
public CosNFileReadTask(
    Configuration conf,
    String key,
    NativeFileSystemStore store,
    CosNInputStream.ReadBuffer readBuffer) {
{code}
preferred:
{code:java}
public CosNFileReadTask(Configuration conf, String key,
    NativeFileSystemStore store, CosNInputStream.ReadBuffer readBuffer) {
{code}
3. CosNFileSystem#initialize, suggest using two variables to control the pool size and the waiting queue size. The pool size should be a relatively small number, while the waiting queue can be a bigger number, such as 256, etc.
{code:java}
this.boundedIOThreadPool = BlockingThreadPoolExecutorService.newInstance(
    ioThreadPoolSize / 2, ioThreadPoolSize,
    threadKeepAlive, TimeUnit.SECONDS,
    "cos-transfer-thread-pool");
this.boundedCopyThreadPool = BlockingThreadPoolExecutorService.newInstance(
    copyThreadPoolSize / 2, copyThreadPoolSize,
    threadKeepAliveTime, TimeUnit.SECONDS,
    "cos-copy-thread-pool");
{code}
4. CosNFileSystem, use PATH_DELIMITER to replace all "/". Also, the following guard is not necessary and can be removed to keep the code concise.
{code:java}
if (LOG.isDebugEnabled())
{code}
5.
ByteBufferWrapper#close: use nested try/catch to guarantee every resource gets the chance to be released.
{code:java}
void close() throws IOException {
  if (null != this.byteBuffer) {
    this.byteBuffer.clear();
  }
  if (null != randomAccessFile) {
    this.randomAccessFile.close();
  }
  if (this.byteBuffer instanceof MappedByteBuffer) {
    munmap((MappedByteBuffer) this.byteBuffer);
  }
  if (null != this.file && this.file.exists()) {
    this.file.delete();
  }
}
{code}
6. CosNOutputStream#uploadPart, the byte buffer is not released if (store).uploadPart throws any exception.
7. Enhance the unit tests. You can leverage the code coverage report function in IntelliJ.
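The bounded-pool idea in point 3 can be sketched with a plain `ThreadPoolExecutor`: a small fixed pool plus a separately sized waiting queue, with `CallerRunsPolicy` so a full queue throttles submitters instead of rejecting work. The class and constant names below are illustrative only, not the actual CosN code; Hadoop's `BlockingThreadPoolExecutorService` achieves a similar effect internally.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPoolSketch {
  // Conservative defaults as suggested in point 1; hypothetical names,
  // not the real CosNConfigKeys constants.
  static final int DEFAULT_UPLOAD_THREAD_POOL_SIZE = 4;
  static final int DEFAULT_COPY_THREAD_POOL_SIZE = 8;
  // Unit: seconds -- the kind of comment point 1 asks for.
  static final long DEFAULT_THREAD_KEEP_ALIVE_TIME = 60L;

  // Small pool, bigger bounded queue: the pool caps CPU use while the
  // queue absorbs submission bursts.
  static ThreadPoolExecutor newBoundedPool(int poolSize, int queueSize,
      long keepAliveSeconds) {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(poolSize, poolSize,
        keepAliveSeconds, TimeUnit.SECONDS,
        new ArrayBlockingQueue<>(queueSize),
        // When the queue is full the submitting thread runs the task
        // itself, back-pressuring producers.
        new ThreadPoolExecutor.CallerRunsPolicy());
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  public static void main(String[] args) {
    ThreadPoolExecutor pool = newBoundedPool(
        DEFAULT_UPLOAD_THREAD_POOL_SIZE, 256, DEFAULT_THREAD_KEEP_ALIVE_TIME);
    System.out.println(pool.getMaximumPoolSize());
    System.out.println(pool.getQueue().remainingCapacity());
    pool.shutdown();
  }
}
```

With pool size 4 and queue size 256, at most 4 transfers run concurrently while up to 256 wait, which keeps CPU use predictable alongside other Hadoop services.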
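For point 5, a nested try/finally version of close() might look like the sketch below: even if an earlier step throws, every later resource still gets its chance to be released. This is a simplified illustration, not the real ByteBufferWrapper; it omits the MappedByteBuffer/munmap step since munmap is a private helper of that class. The same try/finally pattern also addresses point 6 (releasing the byte buffer when uploadPart throws).

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

public class NestedCloseSketch implements Closeable {
  // Illustrative fields mirroring ByteBufferWrapper's shape.
  private final ByteBuffer byteBuffer;
  private final RandomAccessFile randomAccessFile;
  private final File file;

  NestedCloseSketch(ByteBuffer buf, RandomAccessFile raf, File f) {
    this.byteBuffer = buf;
    this.randomAccessFile = raf;
    this.file = f;
  }

  @Override
  public void close() throws IOException {
    try {
      // Step 1: may throw, but the finally blocks below still run.
      if (null != this.byteBuffer) {
        this.byteBuffer.clear();
      }
    } finally {
      try {
        // Step 2: close the file handle even if step 1 failed.
        if (null != this.randomAccessFile) {
          this.randomAccessFile.close();
        }
      } finally {
        // Step 3: delete the backing file even if step 2 threw.
        if (null != this.file && this.file.exists()) {
          this.file.delete();
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    File tmp = File.createTempFile("sketch", ".tmp");
    NestedCloseSketch w =
        new NestedCloseSketch(ByteBuffer.allocate(8), null, tmp);
    w.close();
    // The backing file is gone even though randomAccessFile was null.
    System.out.println(tmp.exists());
  }
}
```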
> Incorporate Tencent Cloud COS File System Implementation
> --------------------------------------------------------
>
>                 Key: HADOOP-15616
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15616
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/cos
>            Reporter: Junping Du
>            Assignee: YangY
>            Priority: Major
>         Attachments: HADOOP-15616.001.patch, HADOOP-15616.002.patch, HADOOP-15616.003.patch, HADOOP-15616.004.patch, HADOOP-15616.005.patch, HADOOP-15616.006.patch, HADOOP-15616.007.patch, HADOOP-15616.008.patch, HADOOP-15616.009.patch, Tencent-COS-Integrated-v2.pdf, Tencent-COS-Integrated.pdf
>
> Tencent cloud is a top-2 cloud vendor in the China market, and its object store COS ([https://intl.cloud.tencent.com/product/cos]) is widely used among China's cloud users, but it is currently hard for Hadoop users to access data laid on COS storage as there is no native support for COS in Hadoop.
> This work aims to integrate Tencent cloud COS with Hadoop/Spark/Hive, just like what we did before for S3, ADL, OSS, etc. With simple configuration, Hadoop applications can read/write data from COS without any code change.