Hi Daegyu,

It’s interesting. IMHO, we could also explore the impact of latency: the
latency of a remote NVMe target storage device connected over the networking
fabric vs. the latency of an NVMe storage device on the local server's PCIe
bus.

Could you please share the network utilization during your NVMeOF test?
How about increasing the network bandwidth, perhaps to >= 10 Gbps?
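
For example (just a rough sketch, assuming the sysstat package is installed
on the datanodes), you could sample per-interface throughput while the
benchmark runs:

    sar -n DEV 1    # rxkB/s and txkB/s per interface, sampled every second

That would tell us whether the fabric link itself is close to saturation.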

Yes, you can increase the parallelism in TestDFSIO. One idea is to play with
the "-nrFiles" argument and run more mappers.
For example, you can test 500GB of cluster data with "-nrFiles 100 -fileSize
5GB" or with "-nrFiles 5 -fileSize 100GB"; the parallelism against HDFS is
different in the two cases.
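
For reference, the invocations would look roughly like this (untested sketch;
the jobclient tests jar path below is an assumption about your install):

    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar \
        TestDFSIO -write -nrFiles 100 -fileSize 5GB
    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar \
        TestDFSIO -write -nrFiles 5 -fileSize 100GB

Follow each -write run with the matching -read run to measure the read path as
well, and use -clean between runs.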

Could you please elaborate on your cluster and your TestDFSIO benchmark setup?

What type of test are you doing:
- sequential read/write
- random read/write

Have you connected the NVMe device in DAX mode?
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/configuring-persistent-memory-for-use-in-device-dax-mode
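
The commands in that guide are roughly of the following form (sketch only;
device DAX applies to persistent-memory namespaces, and namespace0.0 is just a
placeholder for the namespace reported by 'ndctl list'):

    ndctl list                                                   # show existing namespaces
    ndctl create-namespace --force --reconfig=namespace0.0 --mode=devdax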

Thanks,
Rakesh
From: Wei-Chiu Chuang [mailto:weic...@apache.org]
Sent: Wednesday, June 26, 2019 1:00 AM
To: Daegyu Han <hdg9...@gmail.com>
Cc: Anu Engineer <aengin...@cloudera.com>; user.hadoop <user@hadoop.apache.org>
Subject: Re: NVMe Over fabric performance on HDFS

There are a few Intel folks who contributed NVMe-related features in HDFS. They
are probably the best source for these questions.

Without access to the NVMe hardware, it is hard to tell. I learned that GCE
offers instances with Intel Optane DC Persistent Memory attached; those could
be used for tests if anyone is interested.

I personally have not received reports of unexpected performance issues with
NVMe on HDFS. A lot of test tuning can result in better performance; file size
can have a great impact on TestDFSIO, for example. You should also make sure
you saturate the local NVMe rather than the network bandwidth. Try setting
replication factor = 1? With the default replication factor you pretty much
saturate the network rather than the storage, I would guess.
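
For example (untested sketch; assuming TestDFSIO picks up generic -D options
via ToolRunner, and the usual jobclient tests jar path), something like:

    hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-*-tests.jar \
        TestDFSIO -D dfs.replication=1 -write -nrFiles 100 -fileSize 5GB

would write single-replica data, so you measure the NVMe path rather than the
replication pipeline over the network.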

The Intel folks elected to implement DCPMM support as an HDFS cache rather than
as a storage tier. There's probably some consideration behind that.

On Tue, Jun 25, 2019 at 10:29 AM Daegyu Han
<hdg9...@gmail.com> wrote:
Hi Anu,

Each datanode has its own Samsung NVMe SSD, which resides on a storage node.
In other words, the compute nodes and the storage (NVMe SSDs) are simply separated.

I know that the maximum bandwidth of my Samsung NVMe SSD is about 3 GB/s.

Experimental results from TestDFSIO and HDFS_API show that the local
NVMe SSD reaches up to 2 GB/s, while the NVMeOF SSD delivers only
500~800 MB/s.
Even IPoIB over InfiniBand provides about 1 GB/s of bandwidth.

In research papers evaluating NVMeOF with FIO or KV-store
applications, the performance of NVMeOF is similar to that of a local
SSD.
They also say that, to bring NVMeOF performance up to the local level,
parallel I/O is required.
Why is the NVMeOF I/O bandwidth in HDFS not as good as local?

Regards,
Daegyu

On Wed, Jun 26, 2019 at 12:04 AM, Anu Engineer
<aengin...@cloudera.com> wrote:
>
> Is your NVMe shared, with all datanodes sending I/O to the same set of disks?
> Is it possible for you to see the I/O queue length of the NVMe devices?
> I would suggest that you try to find out what is causing the perf issue, and
> once we know in the ballpark where the issue is -- that is, whether it is the
> disks or HDFS -- it might be possible to see what we can do.
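>
> A quick way to check that (rough sketch; assuming sysstat's iostat is
> available, and nvme0n1 is a placeholder for your device name) would be:
>
>     iostat -x nvme0n1 1    # watch the queue size (aqu-sz / avgqu-sz) and %util columns
>
> If the queue stays near empty during the benchmark, HDFS is not issuing
> enough parallel I/O to the device.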
>
>
>
> Thanks
> Anu
>
>
> On Tue, Jun 25, 2019 at 7:20 AM Daegyu Han
> <hdg9...@gmail.com> wrote:
>>
>> Hi all,
>>
>> I am using storage disaggregation by mounting NVMe SSDs on the storage node.
>>
>> When we connect the compute node and the storage node with NVMe over
>> Fabrics (NVMeOF) and test it, the performance is much lower than that of
>> local storage (DAS).
>>
>> In general, we know that applications need to increase I/O parallelism
>> and I/O size to improve NVMeOF performance.
>>
>> How can I change HDFS settings specifically to improve the I/O
>> performance of NVMeOF in HDFS?
>>
>> Best regards,
>> Daegyu
>>
