[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16961604#comment-16961604 ]

Li Cheng edited comment on HDDS-2356 at 10/29/19 6:53 AM:
----------------------------------------------------------

[~bharat] I'm using Python to write to an OS path, something like:
{code:python}
import os

# src_dir is the local source directory (the "dir" in my earlier snippet);
# dest_dir is the goofys mount path of the Ozone bucket. Both end with "/".
sub_files = os.listdir(src_dir)
num = 0
output = ""
for sub_file in sub_files:
    sub_file_path = src_dir + sub_file
    with open(dest_dir + sub_file_path, "w") as fw:
        with open(sub_file_path, "r") as fr:
            line = fr.readline()
            while line:
                num += 1
                output += line
                # flush to the mount in 2000-line batches
                if num >= 2000:
                    fw.write(output)
                    output = ""
                    num = 0
                line = fr.readline()
        # write out whatever is left for this file and reset the batch
        fw.write(output)
        output = ""
        num = 0
{code}
 

Also, I'm using the S3 gateway to connect to Ozone and mounting a local file path 
via FUSE (goofys). Have you tested the S3 gateway? Most unit tests go through RPC.
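
To isolate whether this is specific to the goofys path, something roughly like the following could drive a multipart upload through the S3 gateway directly with boto3. This is only a sketch: the endpoint (assuming the gateway's default port 9878), bucket name, key and credentials below are placeholders, not my actual setup.
{code:python}
# Sketch: exercise multipart upload straight through the Ozone S3 gateway
# with boto3, bypassing goofys. Endpoint, bucket, key and credentials are
# placeholders/assumptions.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9878",  # assumed S3 gateway default port
    aws_access_key_id="testuser",          # placeholder credentials
    aws_secret_access_key="testsecret",
)

bucket, key = "test-bucket", "mpu-test-key"
part_size = 5 * 1024 * 1024  # 5 MB, the S3 minimum for non-final parts

mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
parts = []
for part_number in range(1, 5):
    resp = s3.upload_part(
        Bucket=bucket,
        Key=key,
        UploadId=mpu["UploadId"],
        PartNumber=part_number,
        Body=b"x" * part_size,
    )
    parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})

s3.complete_multipart_upload(
    Bucket=bucket,
    Key=key,
    UploadId=mpu["UploadId"],
    MultipartUpload={"Parts": parts},
)
{code}
If a single upload like this works, running several uploads (or the parts of one upload) from multiple threads might be closer to what goofys does and to the ConcurrentModificationException in the trace below.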



> Multipart upload report errors while writing to ozone Ratis pipeline
> --------------------------------------------------------------------
>
>                 Key: HDDS-2356
>                 URL: https://issues.apache.org/jira/browse/HDDS-2356
>             Project: Hadoop Distributed Data Store
>          Issue Type: Bug
>          Components: Ozone Manager
>    Affects Versions: 0.4.1
>         Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>            Reporter: Li Cheng
>            Assignee: Bharat Viswanadham
>            Priority: Blocker
>             Fix For: 0.5.0
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, then read data from VM0's local disk and write it to the 
> mount path. The dataset contains ~50,000 files ranging from 0 bytes to 
> GB-level in size.
> Writing is slow (~10 minutes per GB) and stops after around 4 GB. Looking at 
> the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors related 
> to multipart upload. This error eventually causes the writing to terminate 
> and OM to shut down.
>  
> 2019-10-24 16:01:59,527 [OMDoubleBufferFlushThread] ERROR - Terminating with exit status 2: OMDoubleBuffer flush thread OMDoubleBufferFlushThread encountered Throwable error
> java.util.ConcurrentModificationException
>  at java.util.TreeMap.forEach(TreeMap.java:1004)
>  at org.apache.hadoop.ozone.om.helpers.OmMultipartKeyInfo.getProto(OmMultipartKeyInfo.java:111)
>  at org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:38)
>  at org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:31)
>  at org.apache.hadoop.hdds.utils.db.CodecRegistry.asRawData(CodecRegistry.java:68)
>  at org.apache.hadoop.hdds.utils.db.TypedTable.putWithBatch(TypedTable.java:125)
>  at org.apache.hadoop.ozone.om.response.s3.multipart.S3MultipartUploadCommitPartResponse.addToDBBatch(S3MultipartUploadCommitPartResponse.java:112)
>  at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.lambda$flushTransactions$0(OzoneManagerDoubleBuffer.java:137)
>  at java.util.Iterator.forEachRemaining(Iterator.java:116)
>  at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:135)
>  at java.lang.Thread.run(Thread.java:745)
> 2019-10-24 16:01:59,629 [shutdown-hook-0] INFO - SHUTDOWN_MSG:
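
For anyone trying to reproduce the workload described above, here is a rough sketch, with made-up paths and a file count scaled well down from the real ~50,000-file dataset:
{code:python}
# Sketch of the reproduction workload: generate files of mixed sizes and copy
# them onto the goofys mount. Paths, sizes and counts are assumptions, scaled
# down from the real dataset (~50,000 files, 0 bytes up to GB-level).
import os
import random
import shutil

source_dir = "/data/source"   # assumed local dataset directory on VM0
mount_dir = "/mnt/ozone"      # assumed goofys mount of the Ozone bucket

os.makedirs(source_dir, exist_ok=True)
sizes = [0, 4 * 1024, 1024 * 1024, 16 * 1024 * 1024]  # 0 B up to 16 MB here

for i in range(100):  # the real run used ~50,000 files
    path = os.path.join(source_dir, f"file_{i:05d}")
    with open(path, "wb") as f:
        f.write(os.urandom(random.choice(sizes)))

# Copy everything to the mount; the larger files should go to the S3 gateway
# as multipart uploads via goofys.
for name in os.listdir(source_dir):
    shutil.copy(os.path.join(source_dir, name), os.path.join(mount_dir, name))
{code}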



