Bounced on the first attempt.

Regards,
David

From: David Crespi<mailto:david.cre...@storedgesystems.com>
Sent: Monday, July 1, 2019 5:27 PM
To: dev@crail.apache.org<mailto:dev@crail.apache.org>; Jonas Pfefferle<mailto:peppe...@japf.ch>
Subject: RE: Setting up storage class 1 and 2

Jonas,

Just wanted to be sure I'm doing things correctly. It runs okay without adding in the NVMf datanode (i.e. completes teragen). When I add the NVMf node in, even without using it on the run, it hangs during the terasort, with nothing being written to the datanode - only the metadata is created (i.e. /spark).

My config is:
1 namenode container
1 rdma datanode storage class 1 container
1 nvmf datanode storage class 1 container.

The namenode is showing that both datanode are starting up as Type 0 to storage class 0... is that correct?

NameNode log at startup:
19/07/01 17:18:16 INFO crail: initalizing namenode
19/07/01 17:18:16 INFO crail: crail.version 3101
19/07/01 17:18:16 INFO crail: crail.directorydepth 16
19/07/01 17:18:16 INFO crail: crail.tokenexpiration 10
19/07/01 17:18:16 INFO crail: crail.blocksize 1048576
19/07/01 17:18:16 INFO crail: crail.cachelimit 0
19/07/01 17:18:16 INFO crail: crail.cachepath /dev/hugepages/cache
19/07/01 17:18:16 INFO crail: crail.user crail
19/07/01 17:18:16 INFO crail: crail.shadowreplication 1
19/07/01 17:18:16 INFO crail: crail.debug true
19/07/01 17:18:16 INFO crail: crail.statistics false
19/07/01 17:18:16 INFO crail: crail.rpctimeout 1000
19/07/01 17:18:16 INFO crail: crail.datatimeout 1000
19/07/01 17:18:16 INFO crail: crail.buffersize 1048576
19/07/01 17:18:16 INFO crail: crail.slicesize 65536
19/07/01 17:18:16 INFO crail: crail.singleton true
19/07/01 17:18:16 INFO crail: crail.regionsize 1073741824
19/07/01 17:18:16 INFO crail: crail.directoryrecord 512
19/07/01 17:18:16 INFO crail: crail.directoryrandomize true
19/07/01 17:18:16 INFO crail: crail.cacheimpl org.apache.crail.memory.MappedBufferCache
19/07/01 17:18:16 INFO crail: crail.locationmap
19/07/01 17:18:16 INFO crail: crail.namenode.address crail://minnie:9060?id=0&size=1
19/07/01 17:18:16 INFO crail: crail.namenode.blockselection roundrobin
19/07/01 17:18:16 INFO crail: crail.namenode.fileblocks 16
19/07/01 17:18:16 INFO crail: crail.namenode.rpctype org.apache.crail.namenode.rpc.tcp.TcpNameNode
19/07/01 17:18:16 INFO crail: crail.namenode.log
19/07/01 17:18:16 INFO crail: crail.storage.types org.apache.crail.storage.nvmf.NvmfStorageTier,org.apache.crail.storage.rdma.RdmaStorageTier
19/07/01 17:18:16 INFO crail: crail.storage.classes 2
19/07/01 17:18:16 INFO crail: crail.storage.rootclass 1
19/07/01 17:18:16 INFO crail: crail.storage.keepalive 2
19/07/01 17:18:16 INFO crail: round robin block selection
19/07/01 17:18:16 INFO crail: round robin block selection
19/07/01 17:18:16 INFO narpc: new NaRPC server group v1.0, queueDepth 32, messageSize 512, nodealy true, cores 2
19/07/01 17:18:16 INFO crail: crail.namenode.tcp.queueDepth 32
19/07/01 17:18:16 INFO crail: crail.namenode.tcp.messageSize 512
19/07/01 17:18:16 INFO crail: crail.namenode.tcp.cores 2
19/07/01 17:18:17 INFO crail: new connection from /192.168.1.164:39260
19/07/01 17:18:17 INFO narpc: adding new channel to selector, from /192.168.1.164:39260
19/07/01 17:18:17 INFO crail: adding datanode /192.168.3.100:4420 of type 0 to storage class 0
19/07/01 17:18:17 INFO crail: new connection from /192.168.1.164:39262
19/07/01 17:18:17 INFO narpc: adding new channel to selector, from /192.168.1.164:39262
19/07/01 17:18:18 INFO crail: adding datanode /192.168.3.100:50020 of type 0 to storage class 0

The RDMA datanode - it is set to have 4x1GB hugepages:
19/07/01 17:18:17 INFO crail: crail.version 3101
19/07/01 17:18:17 INFO crail: crail.directorydepth 16
19/07/01 17:18:17 INFO crail: crail.tokenexpiration 10
19/07/01 17:18:17 INFO crail: crail.blocksize 1048576
19/07/01 17:18:17 INFO crail: crail.cachelimit 0
19/07/01 17:18:17 INFO crail: crail.cachepath /dev/hugepages/cache
19/07/01 17:18:17 INFO crail: crail.user crail
19/07/01 17:18:17 INFO crail: crail.shadowreplication 1
19/07/01 17:18:17 INFO crail: crail.debug true
19/07/01 17:18:17 INFO crail: crail.statistics false
19/07/01 17:18:17 INFO crail: crail.rpctimeout 1000
19/07/01 17:18:17 INFO crail: crail.datatimeout 1000
19/07/01 17:18:17 INFO crail: crail.buffersize 1048576
19/07/01 17:18:17 INFO crail: crail.slicesize 65536
19/07/01 17:18:17 INFO crail: crail.singleton true
19/07/01 17:18:17 INFO crail: crail.regionsize 1073741824
19/07/01 17:18:17 INFO crail: crail.directoryrecord 512
19/07/01 17:18:17 INFO crail: crail.directoryrandomize true
19/07/01 17:18:17 INFO crail: crail.cacheimpl org.apache.crail.memory.MappedBufferCache
19/07/01 17:18:17 INFO crail: crail.locationmap
19/07/01 17:18:17 INFO crail: crail.namenode.address crail://minnie:9060
19/07/01 17:18:17 INFO crail: crail.namenode.blockselection roundrobin
19/07/01 17:18:17 INFO crail: crail.namenode.fileblocks 16
19/07/01 17:18:17 INFO crail: crail.namenode.rpctype org.apache.crail.namenode.rpc.tcp.TcpNameNode
19/07/01 17:18:17 INFO crail: crail.namenode.log
19/07/01 17:18:17 INFO crail: crail.storage.types org.apache.crail.storage.rdma.RdmaStorageTier
19/07/01 17:18:17 INFO crail: crail.storage.classes 1
19/07/01 17:18:17 INFO crail: crail.storage.rootclass 1
19/07/01 17:18:17 INFO crail: crail.storage.keepalive 2
19/07/01 17:18:17 INFO disni: creating RdmaProvider of type 'nat'
19/07/01 17:18:17 INFO disni: jverbs jni version 32
19/07/01 17:18:17 INFO disni: sock_addr_in size mismatch, jverbs size 28, native size 16
19/07/01 17:18:17 INFO disni: IbvRecvWR size match, jverbs size 32, native size 32
19/07/01 17:18:17 INFO disni: IbvSendWR size mismatch, jverbs size 72, native size 128
19/07/01 17:18:17 INFO disni: IbvWC size match, jverbs size 48, native size 48
19/07/01 17:18:17 INFO disni: IbvSge size match, jverbs size 16, native size 16
19/07/01 17:18:17 INFO disni: Remote addr offset match, jverbs size 40, native size 40
19/07/01 17:18:17 INFO disni: Rkey offset match, jverbs size 48, native size 48
19/07/01 17:18:17 INFO disni: createEventChannel, objId 140349068383088
19/07/01 17:18:17 INFO disni: passive endpoint group, maxWR 32, maxSge 4, cqSize 3200
19/07/01 17:18:17 INFO disni: createId, id 140349068429968
19/07/01 17:18:17 INFO disni: new server endpoint, id 0
19/07/01 17:18:17 INFO disni: launching cm processor, cmChannel 0
19/07/01 17:18:17 INFO disni: bindAddr, address /192.168.3.100:50020
19/07/01 17:18:17 INFO disni: listen, id 0
19/07/01 17:18:17 INFO disni: allocPd, objId 140349068679808
19/07/01 17:18:17 INFO disni: setting up protection domain, context 100, pd 1
19/07/01 17:18:17 INFO disni: PD value 1
19/07/01 17:18:17 INFO crail: crail.storage.rdma.interface enp94s0f1
19/07/01 17:18:17 INFO crail: crail.storage.rdma.port 50020
19/07/01 17:18:17 INFO crail: crail.storage.rdma.storagelimit 4294967296
19/07/01 17:18:17 INFO crail: crail.storage.rdma.allocationsize 1073741824
19/07/01 17:18:17 INFO crail: crail.storage.rdma.datapath /dev/hugepages/rdma
19/07/01 17:18:17 INFO crail: crail.storage.rdma.localmap true
19/07/01 17:18:17 INFO crail: crail.storage.rdma.queuesize 32
19/07/01 17:18:17 INFO crail: crail.storage.rdma.type passive
19/07/01 17:18:17 INFO crail: crail.storage.rdma.backlog 100
19/07/01 17:18:17 INFO crail: crail.storage.rdma.connecttimeout 1000
19/07/01 17:18:17 INFO narpc: new NaRPC server group v1.0, queueDepth 32, messageSize 512, nodealy true
19/07/01 17:18:17 INFO crail: crail.namenode.tcp.queueDepth 32
19/07/01 17:18:17 INFO crail: crail.namenode.tcp.messageSize 512
19/07/01 17:18:17 INFO crail: crail.namenode.tcp.cores 2
19/07/01 17:18:17 INFO crail: rdma storage server started, address /192.168.3.100:50020, persistent false, maxWR 32, maxSge 4, cqSize 3200
19/07/01 17:18:17 INFO disni: starting accept
19/07/01 17:18:18 INFO crail: connected to namenode(s) minnie/192.168.1.164:9060
19/07/01 17:18:18 INFO crail: datanode statistics, freeBlocks 1024
19/07/01 17:18:18 INFO crail: datanode statistics, freeBlocks 2048
19/07/01 17:18:19 INFO crail: datanode statistics, freeBlocks 3072
19/07/01 17:18:19 INFO crail: datanode statistics, freeBlocks 4096
19/07/01 17:18:19 INFO crail: datanode statistics, freeBlocks 4096

NVMf datanode is showing 1TB.
19/07/01 17:23:57 INFO crail: datanode statistics, freeBlocks 1048576

Regards,
David
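
For anyone checking the same thing on their own setup, here is a minimal Python sketch (not part of Crail) that pulls the "adding datanode ... of type N to storage class M" lines out of a NameNode log and works out the capacity implied by a freeBlocks count. The function names and the regular expression are assumptions made for illustration, based only on the log format quoted above; the block size default is the crail.blocksize value from the same log.

```python
import re

# Matches the NameNode registration line as it appears in the log above.
# The pattern is an assumption derived from that one log format.
REGISTRATION = re.compile(
    r"adding datanode /(?P<host>[\d.]+):(?P<port>\d+) "
    r"of type (?P<type>\d+) to storage class (?P<cls>\d+)"
)


def datanode_registrations(log_text):
    """Return (address, type, storage class) for each registered datanode."""
    return [
        (f"{m['host']}:{m['port']}", int(m["type"]), int(m["cls"]))
        for m in REGISTRATION.finditer(log_text)
    ]


def capacity_bytes(free_blocks, block_size=1048576):
    """Capacity implied by a freeBlocks count at the given crail.blocksize."""
    return free_blocks * block_size


sample = """\
19/07/01 17:18:17 INFO crail: adding datanode /192.168.3.100:4420 of type 0 to storage class 0
19/07/01 17:18:18 INFO crail: adding datanode /192.168.3.100:50020 of type 0 to storage class 0
"""

for address, storage_type, storage_class in datanode_registrations(sample):
    print(address, "type", storage_type, "class", storage_class)

print(capacity_bytes(4096))     # RDMA datanode: 4096 blocks of 1 MiB
print(capacity_bytes(1048576))  # NVMf datanode: 1048576 blocks of 1 MiB
```

With the values from the logs, 4096 free blocks of 1048576 bytes come to 4294967296 bytes, which matches crail.storage.rdma.storagelimit, and 1048576 free blocks come to 1 TiB, consistent with the NVMf datanode showing 1TB.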