Hi Michel,

Thanks for trying out Ozone! For Ozone 0.5:

- Can you please try another run after increasing the config value for dfs.ratis.client.request.retry.interval to 10 or 15 seconds? The default value for this config is 1 second.

> ozone.datanode.pipeline.limit

- Can you try a smaller value like 1 or 2 for the above config, with a datanode heap size of 4GB? Please check GC pressure on the datanode with this config.
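For reference, a minimal ozone-site.xml sketch of the two suggestions above. The values shown (15s and a pipeline limit of 2) are the suggested experiments, not defaults; please verify the property names and the duration-unit syntax against your Ozone version before applying:

```xml
<!-- Hypothetical ozone-site.xml fragment illustrating the suggested tuning -->
<configuration>
  <!-- Raise the Ratis client retry interval from the 1-second default -->
  <property>
    <name>dfs.ratis.client.request.retry.interval</name>
    <value>15s</value>
  </property>
  <!-- Lower the per-datanode pipeline limit to reduce heap/GC pressure -->
  <property>
    <name>ozone.datanode.pipeline.limit</name>
    <value>2</value>
  </property>
</configuration>
```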
Some improvements have gone in recently, after the Ozone 0.5 release. I would also recommend trying the latest Ozone.

Thanks
Lokesh

> On 29-Jul-2020, at 12:57 AM, Michel Sumbul <michelsum...@gmail.com> wrote:
>
> I forgot to mention that I set ozone.datanode.pipeline.limit to 5.
> Michel
>
> On Tue, 28 Jul 2020 at 20:22, Michel Sumbul <michelsum...@gmail.com
> <mailto:michelsum...@gmail.com>> wrote:
> Hi guys,
>
> I would like to know if you have any tips/tricks to get the best
> performance out of Ozone (memory tuning / threads / specific settings / etc.).
>
> I ran a few teragen/terasort jobs on it and the results are really surprising
> compared to HDFS: Ozone (using the Hadoop FS interface) is almost 2 times
> slower than HDFS.
>
> The clusters were exactly the same for both:
> - 3 masters and 4 slaves (8 cores / 32GB) (it's a small cluster, but that
> shouldn't matter here)
> - Backend storage is a CEPH cluster (80 servers)
> - NIC: 2 x 25Gb/s
> - Ozone version 0.5
> - Each job was executed 5 times
>
> HDFS and Ozone were installed on the same nodes; one was down while the other
> was up, to guarantee no configuration differences other than the distributed
> FS.
>
> I was not expecting a big difference like this. Do you have any idea how to
> improve this, or what the reason might be? I saw a few JIRAs regarding data
> locality at read time; could it be linked to that?
>
> Thanks,
> Michel