[ https://issues.apache.org/jira/browse/HADOOP-15616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16829186#comment-16829186 ]
Sammi Chen edited comment on HADOOP-15616 at 4/29/19 12:11 PM:
---------------------------------------------------------------
Comments for 009.patch,

1. CosNConfigKeys, the default pool sizes are too big. CosN will compete for CPU resources with other modules in Hadoop and with other services running in the big data system. Be conservative when allocating resources. Suggest 4 threads for the upload pool and 8 threads for the copy pool.
{code:java}
public static final int DEFAULT_UPLOAD_THREAD_POOL_SIZE =
    Runtime.getRuntime().availableProcessors() * 5;
public static final int DEFAULT_COPY_THREAD_POOL_SIZE =
    Runtime.getRuntime().availableProcessors() * 3;
{code}
Also add a comment documenting the time unit of
{code:java}
public static final long DEFAULT_THREAD_KEEP_ALIVE_TIME = 60L;
{code}
2. Function formatting, prefer a compact format: keep as many parameters on one line as possible. There are many such places in the patch to be fixed, not just the following one.
{code:java}
public CosNFileReadTask(
    Configuration conf,
    String key,
    NativeFileSystemStore store,
    CosNInputStream.ReadBuffer readBuffer) {
{code}
preferred:
{code:java}
public CosNFileReadTask(Configuration conf, String key,
    NativeFileSystemStore store, CosNInputStream.ReadBuffer readBuffer) {
{code}
3. CosNFileSystem#initialize, suggest using two variables to control the pool size and the waiting queue size. The pool size should be a relatively small number, while the waiting queue can be a bigger number, such as 256, etc.
{code:java}
this.boundedIOThreadPool = BlockingThreadPoolExecutorService.newInstance(
    ioThreadPoolSize / 2, ioThreadPoolSize,
    threadKeepAlive, TimeUnit.SECONDS,
    "cos-transfer-thread-pool");
this.boundedCopyThreadPool = BlockingThreadPoolExecutorService.newInstance(
    copyThreadPoolSize / 2, copyThreadPoolSize,
    threadKeepAliveTime, TimeUnit.SECONDS,
    "cos-copy-thread-pool");
{code}
4. CosNFileSystem, use PATH_DELIMITER to replace all "/". Also, the following guard is not necessary and can be removed to keep the code concise.
{code:java}
if (LOG.isDebugEnabled())
{code}
5.
ByteBufferWrapper#close: use nested try/catch to guarantee every resource gets the chance to be released.
{code:java}
void close() throws IOException {
  if (null != this.byteBuffer) {
    this.byteBuffer.clear();
  }
  if (null != randomAccessFile) {
    this.randomAccessFile.close();
  }
  if (this.byteBuffer instanceof MappedByteBuffer) {
    munmap((MappedByteBuffer) this.byteBuffer);
  }
  if (null != this.file && this.file.exists()) {
    this.file.delete();
  }
}
{code}
6. CosNOutputStream#uploadPart, the byte buffer is not released if (store).uploadPart throws any exception.
7. Enhance the unit tests. You can leverage the code coverage report function in IntelliJ.
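The bounded-pool idea in point 3 can be sketched with a plain `ThreadPoolExecutor`: a small fixed pool plus a separately sized waiting queue, with `CallerRunsPolicy` so a full queue throttles submitters instead of rejecting work. The class and constant names below are illustrative only, not the actual CosN code; Hadoop's `BlockingThreadPoolExecutorService` achieves a similar effect internally.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class BoundedPoolSketch {
  // Conservative defaults as suggested in point 1; hypothetical names,
  // not the real CosNConfigKeys constants.
  static final int DEFAULT_UPLOAD_THREAD_POOL_SIZE = 4;
  static final int DEFAULT_COPY_THREAD_POOL_SIZE = 8;
  // Unit: seconds -- the kind of comment point 1 asks for.
  static final long DEFAULT_THREAD_KEEP_ALIVE_TIME = 60L;

  // Small pool, bigger bounded queue: the pool caps CPU use while the
  // queue absorbs submission bursts.
  static ThreadPoolExecutor newBoundedPool(int poolSize, int queueSize,
      long keepAliveSeconds) {
    ThreadPoolExecutor pool = new ThreadPoolExecutor(poolSize, poolSize,
        keepAliveSeconds, TimeUnit.SECONDS,
        new ArrayBlockingQueue<>(queueSize),
        // When the queue is full the submitting thread runs the task
        // itself, back-pressuring producers.
        new ThreadPoolExecutor.CallerRunsPolicy());
    pool.allowCoreThreadTimeOut(true);
    return pool;
  }

  public static void main(String[] args) {
    ThreadPoolExecutor pool = newBoundedPool(
        DEFAULT_UPLOAD_THREAD_POOL_SIZE, 256, DEFAULT_THREAD_KEEP_ALIVE_TIME);
    System.out.println(pool.getMaximumPoolSize());
    System.out.println(pool.getQueue().remainingCapacity());
    pool.shutdown();
  }
}
```

With pool size 4 and queue size 256, at most 4 transfers run concurrently while up to 256 wait, which keeps CPU use predictable alongside other Hadoop services.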
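For point 5, a nested try/finally version of close() might look like the sketch below: even if an earlier step throws, every later resource still gets its chance to be released. This is a simplified illustration, not the real ByteBufferWrapper; it omits the MappedByteBuffer/munmap step since munmap is a private helper of that class. The same try/finally pattern also addresses point 6 (releasing the byte buffer when uploadPart throws).

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;

public class NestedCloseSketch implements Closeable {
  // Illustrative fields mirroring ByteBufferWrapper's shape.
  private final ByteBuffer byteBuffer;
  private final RandomAccessFile randomAccessFile;
  private final File file;

  NestedCloseSketch(ByteBuffer buf, RandomAccessFile raf, File f) {
    this.byteBuffer = buf;
    this.randomAccessFile = raf;
    this.file = f;
  }

  @Override
  public void close() throws IOException {
    try {
      // Step 1: may throw, but the finally blocks below still run.
      if (null != this.byteBuffer) {
        this.byteBuffer.clear();
      }
    } finally {
      try {
        // Step 2: close the file handle even if step 1 failed.
        if (null != this.randomAccessFile) {
          this.randomAccessFile.close();
        }
      } finally {
        // Step 3: delete the backing file even if step 2 threw.
        if (null != this.file && this.file.exists()) {
          this.file.delete();
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    File tmp = File.createTempFile("sketch", ".tmp");
    NestedCloseSketch w =
        new NestedCloseSketch(ByteBuffer.allocate(8), null, tmp);
    w.close();
    // The backing file is gone even though randomAccessFile was null.
    System.out.println(tmp.exists());
  }
}
```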
> Incorporate Tencent Cloud COS File System Implementation
> --------------------------------------------------------
>
>                 Key: HADOOP-15616
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15616
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/cos
>            Reporter: Junping Du
>            Assignee: YangY
>            Priority: Major
>         Attachments: HADOOP-15616.001.patch, HADOOP-15616.002.patch, HADOOP-15616.003.patch, HADOOP-15616.004.patch, HADOOP-15616.005.patch, HADOOP-15616.006.patch, HADOOP-15616.007.patch, HADOOP-15616.008.patch, HADOOP-15616.009.patch, Tencent-COS-Integrated-v2.pdf, Tencent-COS-Integrated.pdf
>
> Tencent cloud is a top-2 cloud vendor in the China market, and its object store COS ([https://intl.cloud.tencent.com/product/cos]) is widely used among China's cloud users, but it is currently hard for Hadoop users to access data laid on COS storage as there is no native support for COS in Hadoop.
> This work aims to integrate Tencent cloud COS with Hadoop/Spark/Hive, just like what we did before for S3, ADL, OSS, etc. With simple configuration, Hadoop applications can read/write data from COS without any code change.