Hi David,
Good to hear things work now.
1) Technically, you can use the RdmaStorageTier "directly" with an SSD,
since it allocates its data in "datapath" (and then mmaps it). This path is
typically a hugetlbfs mount, but it can be a standard mount point. However,
there are a few drawbacks with this approach: all I/O is buffered, so you
have no control over when it is written to the SSD, and since RDMA requires
that all memory is pinned, you have to allocate as much memory as your SSD
has capacity. So overall that is not really feasible.
My recommendation is to use the NVMf storage tier locally.
2) Correct, at the moment that is the only way you can do this: start
multiple instances of SPDK, or use SPDK RAID0 if you just want to use
multiple devices in the same storage class.
FYI, the shuffle plugin also supports configuring the storage class it
should write to: "spark.crail.shuffle.storageclass" (set it in the Spark
config).
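As a rough illustration of passing that setting at submit time, here is a
minimal sketch. Only the property name comes from this thread; the helper
name crail_shuffle_conf_args is hypothetical, the non-negative check is an
assumption about valid class numbers, and class 1 is picked only because it
is the local NvmfStorageTier in the slaves file quoted below.

```python
from typing import List

# Property name as given in this thread; everything else is illustrative.
STORAGE_CLASS_KEY = "spark.crail.shuffle.storageclass"


def crail_shuffle_conf_args(storage_class: int) -> List[str]:
    """Build spark-submit arguments selecting a Crail storage class for
    shuffle data (hypothetical helper, not part of Crail or Spark)."""
    # Assumption: storage classes are numbered from 0 upwards.
    if storage_class < 0:
        raise ValueError("storage class must be a non-negative integer")
    return ["--conf", f"{STORAGE_CLASS_KEY}={storage_class}"]


if __name__ == "__main__":
    # Class 1 is the local NvmfStorageTier in the quoted slaves file.
    print(" ".join(["spark-submit"] + crail_shuffle_conf_args(1)))
```

The same setting should also work as a line in spark-defaults.conf
("spark.crail.shuffle.storageclass 1"), which avoids repeating it on every
submit.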
Regards,
Jonas
On Tue, 9 Jul 2019 01:05:37 +0000
David Crespi <david.cre...@storedgesystems.com> wrote:
Hi,
I wanted to ask whether there is a way of using a local SSD via the
RdmaStorageTier, so I have a couple of questions.
The blog example had these three classes:
crail@clustermaster:~$ cat $CRAIL_HOME/conf/slaves
clusternode1 -t org.apache.crail.storage.rdma.RdmaStorageTier -c 0
clusternode1 -t org.apache.crail.storage.nvmf.NvmfStorageTier -c 1
disaggnode -t org.apache.crail.storage.nvmf.NvmfStorageTier -c 2
1. Is there a way of using the RdmaStorageTier directly with an SSD
that is local to the server “clusternode1”?
Or does the local SSD have to be included in an NVMf subsystem on that
local server, so that the NvmfStorageTier is used on that same server
to access the SSD locally via an NVMf subsystem?
2. I asked a question a few days ago about how to use the same
Subsystem NQN, which I can’t do with a single instance of SPDK. Is this
how using the same NQN is possible: different instances of SPDK would
be used, one on each server (i.e. clusternode1 & clusternode2), each
with their own “version” of that same Subsystem?
BTW…
I have my environment all running now, and all in containers.
Everything appears to be working as advertised.
The Spark shuffle seems to be filling up the memory tier, then
continuing on to the SSD tier. I haven’t done anything over 300G yet,
but it’s coming. I’m clarifying the above to be sure I’m not missing
out on one of the configs. I’m currently also using HDFS for the tmp
results, as I currently have only one instance of SPDK, so NVMf
classes 1 and 2 can’t both exist for me (assuming the answers above,
that is 😊).
Regards,
David