Hi Michel

Can you please try running the benchmarks with 92 mappers in the same 
environment?

A couple of other suggestions:
1. It's a good idea to use separate disks for datanode metadata and data.
These are configurable using dfs.container.ratis.datanode.storage.dir and
hdds.datanode.dir (see the sketch after this list).
2. Can you also try running the benchmark in a cluster where the Ozone and
HDFS datanodes do not share physical hosts with the YARN NodeManagers?
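
For example, a sketch of the relevant settings (shown as key = value; they
live in ozone-site.xml, and the mount points below are hypothetical; use
whichever separate disks you have):

    # dedicated disk for the Ratis write-ahead log / metadata
    dfs.container.ratis.datanode.storage.dir = /mnt/disk1/ozone/ratis
    # separate disk for the container data itself
    hdds.datanode.dir = /mnt/disk2/ozone/data

Keeping the Ratis log off the data disks stops the two write streams from
competing for the same IO queue.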

Thanks
Lokesh

> On 06-Aug-2020, at 4:54 PM, Michel Sumbul <michelsum...@gmail.com> wrote:
> 
> Hi Lokesh,
> 
> Thanks for the tips. It had only the -Xmx and not the -Xms, so the heap was
> not at 8GB; adding -Xms fixed it. Increasing the young gen to 2GB helps
> decrease the total GC pause to a few seconds over the whole test, which I
> think is good.
> The test is a teragen and a terasort of 100GB with 24 mappers.
> Even with the GC pause improvement, I still get a difference of 35% for
> teragen and 61% for terasort. Here are the detailed results:
> Teragen:
> 
> Run    HDFS     Ozone
> 1      173      243
> 2      176      243
> 3      179      242
> 4      185      244
> 5      182      246
> avg    179      243.6
> 
> 
> Terasort:
> 
> Run    HDFS     Ozone
> 1      433      653
> 2      431      682
> 3      418      685
> 4      432      671
> 5      429      691
> avg    428.6    676.4
> 
> I used exactly the same command when running against HDFS and Ozone:
> 
> time yarn jar \
>   /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.3266817/jars/hadoop-mapreduce-examples-3.1.1.7.1.1.0-565.jar \
>   teragen -Dmapred.map.tasks=24 1000000000 /data/tera/gen
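> 
> The terasort step then sorts that output, along the lines of (the sort
> output path here is illustrative):
> 
> time yarn jar \
>   /opt/cloudera/parcels/CDH-7.1.1-1.cdh7.1.1.p0.3266817/jars/hadoop-mapreduce-examples-3.1.1.7.1.1.0-565.jar \
>   terasort /data/tera/gen /data/tera/sort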
> 
> Any idea how to improve the read performance?
> 
> Thanks,
> 
> Michel
> 
> On Fri, Jul 31, 2020 at 07:14, Lokesh Jain <lokeshjainl...@gmail.com> wrote:
> 
>> Thanks for sharing the GC link, Michel!
>> 
>> Memory (usage / allocated max):
>> Total heap      1,501.7M (81.7%)   / 1,837.4M
>> Tenured heap    1,361.7M (80.7%)   / 1,686.7M
>> Young heap      150.6M   (100.0%)  / 150.6M
>> 
>> It looks like the heap settings are not taking effect. Can you please try
>> again? Please set both -Xms and -Xmx, to 8g each. Also please try setting
>> the minimum young gen allocation to 2GB.
>> Can you also please mention which tests you are comparing?
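>> 
>> Concretely, the datanode JVM flags would look something like this (where
>> exactly they go depends on your deployment, e.g. the datanode Java
>> options in Cloudera Manager):
>> 
>>     -Xms8g -Xmx8g -XX:NewSize=2g -XX:MaxNewSize=2g
>> 
>> Setting -Xms equal to -Xmx pins the heap at 8g and avoids resize pauses;
>> the fixed 2g young gen gives short-lived write buffers room before they
>> are promoted to tenured space.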
>> 
>> Regards
>> Lokesh
>> 
>>> On 30-Jul-2020, at 8:33 PM, Michel Sumbul <michelsum...@gmail.com> wrote:
>>> 
>>> Hi Lokesh,
>>> 
>>> Thanks for your advice.
>>> I changed the heap size from 3GB to 4GB and then to 8GB, set
>>> dfs.ratis.client.request.retry.interval to 10 seconds, and decreased the
>>> number of pipelines from 5 to 2.
>>> That gives 10-15% better performance, but it's still 35% slower than
>>> HDFS. I also noticed that the data is spread less evenly over the 4
>>> nodes with only 2 pipelines.
>>> 
>>> I had a look at the GC log during the execution of the teragen (copy
>>> attached to this email). I'm not an expert at all in GC tuning, so I
>>> loaded the log into the analyzer below, and it tells me that over the 4
>>> minutes of the job's execution the datanode was paused for 29 seconds,
>>> across 1260 GC events. That looks pretty high to me, but maybe not?
>>> 
>>> http://gcloganalyzer.com/?file=70338453-4505-4238-a900-c0206a2d52f4test.gc
>>> 
>>> I also made sure that the YARN job doesn't request more than 80% of the
>>> VM resources (I even tried 60%, but it didn't change anything).
>>> 
>>> Do you think there is something else I can do to improve it, or should I
>>> stop here? Is it possible that the difference comes from the
>>> short-circuit read feature of HDFS?
>>> 
>>> Thanks a lot,
>>> Michel
>>> 
>>> On Wed, Jul 29, 2020 at 10:54, Lokesh Jain <lj...@apache.org> wrote:
>>> Hi Michel
>>> 
>>> Thanks for trying out Ozone!
>>> For Ozone 0.5: can you please try another run after increasing the
>>> config value dfs.ratis.client.request.retry.interval to 10 or 15
>>> seconds? The default value for this config is 1 second.
>>>> ozone.datanode.pipeline.limit
>>> Can you try a smaller value like 1 or 2 for the above config, with a
>>> datanode heap size of 4GB? Please check GC pressure on the datanode
>>> with this config.
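>>> 
>>> In ozone-site.xml terms, that would be something like (a sketch; I'm
>>> assuming the usual time-suffix syntax for the duration value):
>>> 
>>>     dfs.ratis.client.request.retry.interval = 15s
>>>     ozone.datanode.pipeline.limit = 2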
>>> 
>>> There are some improvements that have gone in since the Ozone 0.5
>>> release. I would also recommend trying the latest Ozone.
>>> 
>>> Thanks
>>> Lokesh
>>> 
>>> 
>>>> On 29-Jul-2020, at 12:57 AM, Michel Sumbul <michelsum...@gmail.com> wrote:
>>>> 
>>>> I forgot to mention that I set ozone.datanode.pipeline.limit to 5.
>>>> Michel
>>>> 
>>>> On Tue, Jul 28, 2020 at 20:22, Michel Sumbul <michelsum...@gmail.com> wrote:
>>>> Hi guys,
>>>> 
>>>> I would like to know if you have any advice or tips/tricks to get the
>>>> best performance out of Ozone (memory tuning / threads / specific
>>>> settings / etc.).
>>>> 
>>>> I did a few teragen/terasort runs on it and the results are really
>>>> surprising compared to HDFS: Ozone (using the Hadoop FS interface) is
>>>> almost 2 times slower than HDFS.
>>>> 
>>>> The clusters were exactly the same for both:
>>>> - 3 masters and 4 slaves (8 cores / 32GB each; it's a small cluster,
>>>>   but that shouldn't matter here)
>>>> - Backend storage is a Ceph cluster (80 servers)
>>>> - NIC: 2 x 25Gb/s
>>>> - Ozone version 0.5
>>>> - Each job was executed 5 times
>>>> 
>>>> HDFS and Ozone were installed on the same nodes; one was down while the
>>>> other was up, to guarantee that there was no configuration difference
>>>> other than the distributed FS itself.
>>>> 
>>>> I was not expecting a big difference like this. Do you have any idea
>>>> how to improve it, or what the reason could be? I saw a few JIRAs
>>>> about data locality on reads; could it be linked to that?
>>>> 
>>>> Thanks,
>>>> Michel
>>>> 


---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-dev-h...@hadoop.apache.org
