Hi Meiko,

I don't see anything immediately wrong with your setup. Have you tried running 
rping (to test whether the RDMA CM works in general) or ibv_devices (to check 
whether the verbs device works) in the container?
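
A rough sketch of such a check, to be run from a shell inside the container. 
It assumes the image contains a POSIX shell and that the entrypoint can be 
overridden (for example with docker's --entrypoint option); neither is 
confirmed for the crail-rdma image. The function name check_rdma_env is made 
up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: report whether the RDMA device nodes passed via
# --device are visible and whether the test tools are on PATH.
# Usage: check_rdma_env [DEVDIR]   (DEVDIR defaults to /dev/infiniband)
check_rdma_env() {
    devdir="${1:-/dev/infiniband}"
    missing=0
    # Device nodes taken from the docker run command in the original mail.
    for node in rdma_cm uverbs0; do
        if [ -e "$devdir/$node" ]; then
            echo "ok: $devdir/$node"
        else
            echo "missing: $devdir/$node"
            missing=1
        fi
    done
    # Tools suggested above; they may not be installed in the image.
    for tool in ibv_devices rping; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

If everything is reported as ok, running ibv_devices inside the container and 
comparing its output with the host's list is the next step.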

Jonas

On Wednesday, January 29th, 2025 at 07:15, Meiko Prilop 
<meiko.prilop-...@ibm.com.INVALID> wrote:

> 
> 
> Dear Sir or Madam,
> 
> My name is Meiko Prilop and I am currently writing my master's thesis in the 
> field of caching using RDMA at the Vienna University of Technology. In the 
> course of my research, I came across your Crail project, which is exactly the 
> system I need for my master's thesis. I found your contact in the GitHub 
> project incubator-crail and wanted to try my luck asking questions about the 
> project. In the past, I was able to get in touch with Dr Metzler through my 
> job at IBM, who was able to give me information regarding Soft-iWARP. 
> However, I am now faced with the hurdle of setting up and testing Crail.
> 
> For my setup I have Ubuntu 18.04 instances running, on which I set up 
> Soft-iWARP according to the instructions in
> 
>       
> https://github.com/animeshtrivedi/blog/blob/master/post/2019-06-26-siw.md
> 
> The command ibv_devices then lists three devices:
> 
>     device          node GUID
>     ------          ----------------
>     siw_docker0     0242187be7d30000
>     siw_lo          7369775f6c6f0000
>     siw_enp1s0      525400afe5890000
> 
> And more precisely, using rdma link show:
> 
>     1/1: siw_lo/1: state ACTIVE physical_state LINK_UP
>     2/1: siw_enp1s0/1: state ACTIVE physical_state LINK_UP
>     3/1: siw_docker0/1: state ACTIVE physical_state LINK_UP
> 
> I have been able to test rping functionality between two instances so far.
> 
> Next, I followed the steps described in:
>       https://crail.readthedocs.io/en/latest/docker.html
> This appears to be the old location of the README files, but the links given 
> in the incubator-crail GitHub repository are no longer available. As far as I 
> can tell, these setup instructions are still the same as those in the doc 
> folder.
> 
> I cloned the repository found at:
>       https://github.com/apache/incubator-crail/tree/master
> and created an image using the Dockerfile found at /docker/RDMA.
> 
> I then tried to run the following command:
> 
> sudo docker run -it --network host -e NAMENODE_HOST=rdma0 -e INTERFACE=enp1s0 
> --cap-add=IPC_LOCK --device=/dev/infiniband/uverbs0 
> --device=/dev/infiniband/rdma_cm -v /dev/hugepages:/dev/hugepages crail-rdma 
> namenode
> 
> I also verified that uverbs0 and rdma_cm are available at the given paths.
> 
> When running the command, I get this error:
> 
> Exception in thread "main" java.io.IOException: j2c::createEventChannel: rdma_create_event_channel failed: No such device
>     at com.ibm.disni.verbs.impl.NativeDispatcher._createEventChannel(Native Method)
>     at com.ibm.disni.verbs.impl.RdmaCmNat.createEventChannel(RdmaCmNat.java:60)
>     at com.ibm.disni.verbs.RdmaEventChannel.createEventChannel(RdmaEventChannel.java:67)
>     at com.ibm.disni.RdmaCmProcessor.<init>(RdmaCmProcessor.java:48)
>     at com.ibm.disni.RdmaEndpointGroup.<init>(RdmaEndpointGroup.java:61)
>     at com.ibm.darpc.DaRPCEndpointGroup.<init>(DaRPCEndpointGroup.java:47)
>     at com.ibm.darpc.DaRPCServerGroup.<init>(DaRPCServerGroup.java:58)
>     at com.ibm.darpc.DaRPCServerGroup.createServerGroup(DaRPCServerGroup.java:52)
>     at org.apache.crail.namenode.rpc.darpc.DaRPCNameNodeServer.init(DaRPCNameNodeServer.java:56)
>     at org.apache.crail.namenode.NameNode.main(NameNode.java:92)
> 
> 
> Unfortunately, I have not been able to fix this issue so far. I hope you can 
> help me with this problem, as Crail seems to be a fundamental building block 
> for my master's thesis!
> 
> Thank you very much for your time and I look forward to your feedback.
> 
> 
> Yours sincerely
> 
> Meiko Prilop
