Oh, and while I’m thinking about it Jonas: when I added the patches you
provided the other day, I only added them to the Spark containers (the
clients), not to the Crail containers running on my storage server.

Should the patches have been added to all of the containers?


Regards,

           David





________________________________
From: Jonas Pfefferle <peppe...@japf.ch>
Sent: Friday, June 28, 2019 12:54:27 AM
To: dev@crail.apache.org; David Crespi
Subject: Re: Setting up storage class 1 and 2

Hi David,


At the moment it is possible to add an NVMf datanode even if only the
RDMA storage type is specified in the config. As you have seen, this
goes wrong as soon as a client tries to connect to that datanode. Make
sure to start the RDMA datanode with the appropriate classname, see:
https://incubator-crail.readthedocs.io/en/latest/run.html
The correct classname is org.apache.crail.storage.rdma.RdmaStorageTier.
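
As a minimal sketch (assuming the standard launcher script described in
the docs above; adjust paths to your deployment), the RDMA datanode
would be started as:

  $CRAIL_HOME/bin/crail datanode -t org.apache.crail.storage.rdma.RdmaStorageTier

And if you later want both tiers available again, both classnames need
to be listed (comma-separated) in crail-site.conf, e.g.:

  crail.storage.types  org.apache.crail.storage.rdma.RdmaStorageTier,org.apache.crail.storage.nvmf.NvmfStorageTier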

Regards,
Jonas

  On Thu, 27 Jun 2019 23:09:26 +0000
  David Crespi <david.cre...@storedgesystems.com> wrote:
> Hi,
> I’m trying to integrate the storage classes, and I’m hitting another
> issue when running terasort and just using the crail-shuffle with
> HDFS as the tmp storage. The program just sits after the following
> message:
>
> 19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection
> to NameNode-1/192.168.3.7:54310 from hduser: closed
> 19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection
> to NameNode-1/192.168.3.7:54310 from hduser: stopped, remaining
> connections 0
>
> During this run, I’ve removed the two crail nvmf (class 1 and 2)
> containers from the server, and I’m only running the namenode and an
> rdma storage class 1 datanode. My spark configuration is also now
> only looking at the rdma class. It looks as though it’s picking up
> the NVMf IP and port in the INFO messages seen below. I must be
> configuring something wrong, but I’ve not been able to track it
> down. Any thoughts?
>
>
> ************************************
>         TeraSort
> ************************************
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
>[jar:file:/crail/jars/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
>[jar:file:/crail/jars/jnvmf-1.6-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
>[jar:file:/crail/jars/disni-2.1-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
>[jar:file:/usr/spark-2.4.2/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 19/06/27 15:59:07 WARN NativeCodeLoader: Unable to load
>native-hadoop library for your platform... using builtin-java classes
>where applicable
> 19/06/27 15:59:07 INFO SparkContext: Running Spark version 2.4.2
> 19/06/27 15:59:07 INFO SparkContext: Submitted application: TeraSort
> 19/06/27 15:59:07 INFO SecurityManager: Changing view acls to:
>hduser
> 19/06/27 15:59:07 INFO SecurityManager: Changing modify acls to:
>hduser
> 19/06/27 15:59:07 INFO SecurityManager: Changing view acls groups
>to:
> 19/06/27 15:59:07 INFO SecurityManager: Changing modify acls groups
>to:
> 19/06/27 15:59:07 INFO SecurityManager: SecurityManager:
>authentication disabled; ui acls disabled; users  with view
>permissions: Set(hduser); groups with view permissions: Set(); users
> with modify permissions: Set(hduser); groups with modify
>permissions: Set()
> 19/06/27 15:59:08 DEBUG InternalLoggerFactory: Using SLF4J as the
>default logging framework
> 19/06/27 15:59:08 DEBUG InternalThreadLocalMap:
>-Dio.netty.threadLocalMap.stringBuilder.initialSize: 1024
> 19/06/27 15:59:08 DEBUG InternalThreadLocalMap:
>-Dio.netty.threadLocalMap.stringBuilder.maxSize: 4096
> 19/06/27 15:59:08 DEBUG MultithreadEventLoopGroup:
>-Dio.netty.eventLoopThreads: 112
> 19/06/27 15:59:08 DEBUG PlatformDependent0: -Dio.netty.noUnsafe:
>false
> 19/06/27 15:59:08 DEBUG PlatformDependent0: Java version: 8
> 19/06/27 15:59:08 DEBUG PlatformDependent0:
>sun.misc.Unsafe.theUnsafe: available
> 19/06/27 15:59:08 DEBUG PlatformDependent0:
>sun.misc.Unsafe.copyMemory: available
> 19/06/27 15:59:08 DEBUG PlatformDependent0: java.nio.Buffer.address:
>available
> 19/06/27 15:59:08 DEBUG PlatformDependent0: direct buffer
>constructor: available
> 19/06/27 15:59:08 DEBUG PlatformDependent0: java.nio.Bits.unaligned:
>available, true
> 19/06/27 15:59:08 DEBUG PlatformDependent0:
>jdk.internal.misc.Unsafe.allocateUninitializedArray(int): unavailable
>prior to Java9
> 19/06/27 15:59:08 DEBUG PlatformDependent0:
>java.nio.DirectByteBuffer.<init>(long, int): available
> 19/06/27 15:59:08 DEBUG PlatformDependent: sun.misc.Unsafe:
>available
> 19/06/27 15:59:08 DEBUG PlatformDependent: -Dio.netty.tmpdir: /tmp
>(java.io.tmpdir)
> 19/06/27 15:59:08 DEBUG PlatformDependent: -Dio.netty.bitMode: 64
>(sun.arch.data.model)
> 19/06/27 15:59:08 DEBUG PlatformDependent:
>-Dio.netty.noPreferDirect: false
> 19/06/27 15:59:08 DEBUG PlatformDependent:
>-Dio.netty.maxDirectMemory: 1029177344 bytes
> 19/06/27 15:59:08 DEBUG PlatformDependent:
>-Dio.netty.uninitializedArrayAllocationThreshold: -1
> 19/06/27 15:59:08 DEBUG CleanerJava6: java.nio.ByteBuffer.cleaner():
>available
> 19/06/27 15:59:08 DEBUG NioEventLoop:
>-Dio.netty.noKeySetOptimization: false
> 19/06/27 15:59:08 DEBUG NioEventLoop:
>-Dio.netty.selectorAutoRebuildThreshold: 512
> 19/06/27 15:59:08 DEBUG PlatformDependent:
>org.jctools-core.MpscChunkedArrayQueue: available
> 19/06/27 15:59:08 DEBUG ResourceLeakDetector:
>-Dio.netty.leakDetection.level: simple
> 19/06/27 15:59:08 DEBUG ResourceLeakDetector:
>-Dio.netty.leakDetection.targetRecords: 4
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.numHeapArenas: 9
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.numDirectArenas: 10
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.pageSize: 8192
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.maxOrder: 11
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.chunkSize: 16777216
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.tinyCacheSize: 512
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.smallCacheSize: 256
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.normalCacheSize: 64
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.maxCachedBufferCapacity: 32768
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.cacheTrimInterval: 8192
> 19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
>-Dio.netty.allocator.useCacheForAllThreads: true
> 19/06/27 15:59:08 DEBUG DefaultChannelId: -Dio.netty.processId: 2236
>(auto-detected)
> 19/06/27 15:59:08 DEBUG NetUtil: -Djava.net.preferIPv4Stack: false
> 19/06/27 15:59:08 DEBUG NetUtil: -Djava.net.preferIPv6Addresses:
>false
> 19/06/27 15:59:08 DEBUG NetUtil: Loopback interface: lo (lo,
>127.0.0.1)
> 19/06/27 15:59:08 DEBUG NetUtil: /proc/sys/net/core/somaxconn: 128
> 19/06/27 15:59:08 DEBUG DefaultChannelId: -Dio.netty.machineId:
>02:42:ac:ff:fe:1b:00:02 (auto-detected)
> 19/06/27 15:59:08 DEBUG ByteBufUtil: -Dio.netty.allocator.type:
>pooled
> 19/06/27 15:59:08 DEBUG ByteBufUtil:
>-Dio.netty.threadLocalDirectBufferSize: 65536
> 19/06/27 15:59:08 DEBUG ByteBufUtil:
>-Dio.netty.maxThreadLocalCharBufferSize: 16384
> 19/06/27 15:59:08 DEBUG TransportServer: Shuffle server started on
>port: 36915
> 19/06/27 15:59:08 INFO Utils: Successfully started service
>'sparkDriver' on port 36915.
> 19/06/27 15:59:08 DEBUG SparkEnv: Using serializer: class
>org.apache.spark.serializer.KryoSerializer
> 19/06/27 15:59:08 INFO SparkEnv: Registering MapOutputTracker
> 19/06/27 15:59:08 DEBUG MapOutputTrackerMasterEndpoint: init
> 19/06/27 15:59:08 INFO CrailShuffleManager: crail shuffle started
> 19/06/27 15:59:08 INFO SparkEnv: Registering BlockManagerMaster
> 19/06/27 15:59:08 INFO BlockManagerMasterEndpoint: Using
>org.apache.spark.storage.DefaultTopologyMapper for getting topology
>information
> 19/06/27 15:59:08 INFO BlockManagerMasterEndpoint:
>BlockManagerMasterEndpoint up
> 19/06/27 15:59:08 INFO DiskBlockManager: Created local directory at
>/tmp/blockmgr-15237510-f459-40e3-8390-10f4742930a5
> 19/06/27 15:59:08 DEBUG DiskBlockManager: Adding shutdown hook
> 19/06/27 15:59:08 INFO MemoryStore: MemoryStore started with
>capacity 366.3 MB
> 19/06/27 15:59:08 INFO SparkEnv: Registering OutputCommitCoordinator
> 19/06/27 15:59:08 DEBUG
>OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: init
> 19/06/27 15:59:08 DEBUG SecurityManager: Created SSL options for ui:
>SSLOptions{enabled=false, port=None, keyStore=None,
>keyStorePassword=None, trustStore=None, trustStorePassword=None,
>protocol=None, enabledAlgorithms=Set()}
> 19/06/27 15:59:08 INFO Utils: Successfully started service 'SparkUI'
>on port 4040.
> 19/06/27 15:59:08 INFO SparkUI: Bound SparkUI to 0.0.0.0, and
>started at http://192.168.1.161:4040
> 19/06/27 15:59:08 INFO SparkContext: Added JAR
>file:/spark-terasort/target/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar
>at
>spark://master:36915/jars/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar
>with timestamp 1561676348562
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint:
>Connecting to master spark://master:7077...
> 19/06/27 15:59:08 DEBUG TransportClientFactory: Creating new
>connection to master/192.168.3.13:7077
> 19/06/27 15:59:08 DEBUG AbstractByteBuf:
>-Dio.netty.buffer.bytebuf.checkAccessible: true
> 19/06/27 15:59:08 DEBUG ResourceLeakDetectorFactory: Loaded default
>ResourceLeakDetector: io.netty.util.ResourceLeakDetector@5b1bb5d2
> 19/06/27 15:59:08 DEBUG TransportClientFactory: Connection to
>master/192.168.3.13:7077 successful, running bootstraps...
> 19/06/27 15:59:08 INFO TransportClientFactory: Successfully created
>connection to master/192.168.3.13:7077 after 41 ms (0 ms spent in
>bootstraps)
> 19/06/27 15:59:08 DEBUG Recycler:
>-Dio.netty.recycler.maxCapacityPerThread: 32768
> 19/06/27 15:59:08 DEBUG Recycler:
>-Dio.netty.recycler.maxSharedCapacityFactor: 2
> 19/06/27 15:59:08 DEBUG Recycler: -Dio.netty.recycler.linkCapacity:
>16
> 19/06/27 15:59:08 DEBUG Recycler: -Dio.netty.recycler.ratio: 8
> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Connected to
>Spark cluster with app ID app-20190627155908-0005
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
>added: app-20190627155908-0005/0 on
>worker-20190627152154-192.168.3.11-8882 (192.168.3.11:8882) with 2
>core(s)
> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor
>ID app-20190627155908-0005/0 on hostPort 192.168.3.11:8882 with 2
>core(s), 1024.0 MB RAM
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
>added: app-20190627155908-0005/1 on
>worker-20190627152150-192.168.3.12-8881 (192.168.3.12:8881) with 2
>core(s)
> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor
>ID app-20190627155908-0005/1 on hostPort 192.168.3.12:8881 with 2
>core(s), 1024.0 MB RAM
> 19/06/27 15:59:08 DEBUG TransportServer: Shuffle server started on
>port: 39189
> 19/06/27 15:59:08 INFO Utils: Successfully started service
>'org.apache.spark.network.netty.NettyBlockTransferService' on port
>39189.
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
>added: app-20190627155908-0005/2 on
>worker-20190627152203-192.168.3.9-8884 (192.168.3.9:8884) with 2
>core(s)
> 19/06/27 15:59:08 INFO NettyBlockTransferService: Server created on
>master:39189
> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor
>ID app-20190627155908-0005/2 on hostPort 192.168.3.9:8884 with 2
>core(s), 1024.0 MB RAM
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
>added: app-20190627155908-0005/3 on
>worker-20190627152158-192.168.3.10-8883 (192.168.3.10:8883) with 2
>core(s)
> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor
>ID app-20190627155908-0005/3 on hostPort 192.168.3.10:8883 with 2
>core(s), 1024.0 MB RAM
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
>added: app-20190627155908-0005/4 on
>worker-20190627152207-192.168.3.8-8885 (192.168.3.8:8885) with 2
>core(s)
> 19/06/27 15:59:08 INFO BlockManager: Using
>org.apache.spark.storage.RandomBlockReplicationPolicy for block
>replication policy
> 19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor
>ID app-20190627155908-0005/4 on hostPort 192.168.3.8:8885 with 2
>core(s), 1024.0 MB RAM
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
>updated: app-20190627155908-0005/0 is now RUNNING
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
>updated: app-20190627155908-0005/3 is now RUNNING
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
>updated: app-20190627155908-0005/4 is now RUNNING
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
>updated: app-20190627155908-0005/1 is now RUNNING
> 19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
>updated: app-20190627155908-0005/2 is now RUNNING
> 19/06/27 15:59:08 INFO BlockManagerMaster: Registering BlockManager
>BlockManagerId(driver, master, 39189, None)
> 19/06/27 15:59:08 DEBUG DefaultTopologyMapper: Got a request for
>master
> 19/06/27 15:59:08 INFO BlockManagerMasterEndpoint: Registering block
>manager master:39189 with 366.3 MB RAM, BlockManagerId(driver,
>master, 39189, None)
> 19/06/27 15:59:08 INFO BlockManagerMaster: Registered BlockManager
>BlockManagerId(driver, master, 39189, None)
> 19/06/27 15:59:08 INFO BlockManager: Initialized BlockManager:
>BlockManagerId(driver, master, 39189, None)
> 19/06/27 15:59:09 INFO StandaloneSchedulerBackend: SchedulerBackend
>is ready for scheduling beginning after reached
>minRegisteredResourcesRatio: 0.0
> 19/06/27 15:59:09 DEBUG SparkContext: Adding shutdown hook
> 19/06/27 15:59:09 DEBUG BlockReaderLocal:
>dfs.client.use.legacy.blockreader.local = false
> 19/06/27 15:59:09 DEBUG BlockReaderLocal:
>dfs.client.read.shortcircuit = false
> 19/06/27 15:59:09 DEBUG BlockReaderLocal:
>dfs.client.domain.socket.data.traffic = false
> 19/06/27 15:59:09 DEBUG BlockReaderLocal: dfs.domain.socket.path =
> 19/06/27 15:59:09 DEBUG RetryUtils: multipleLinearRandomRetry = null
> 19/06/27 15:59:09 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER,
>rpcRequestWrapperClass=class
>org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper,
>rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@23f3dbf0
> 19/06/27 15:59:09 DEBUG Client: getting client out of cache:
>org.apache.hadoop.ipc.Client@3ed03652
> 19/06/27 15:59:09 DEBUG PerformanceAdvisory: Both short-circuit
>local reads and UNIX domain socket are disabled.
> 19/06/27 15:59:09 DEBUG DataTransferSaslUtil: DataTransferProtocol
>not using SaslPropertiesResolver, no QOP found in configuration for
>dfs.data.transfer.protection
> 19/06/27 15:59:10 INFO MemoryStore: Block broadcast_0 stored as
>values in memory (estimated size 288.9 KB, free 366.0 MB)
> 19/06/27 15:59:10 DEBUG BlockManager: Put block broadcast_0 locally
>took  115 ms
> 19/06/27 15:59:10 DEBUG BlockManager: Putting block broadcast_0
>without replication took  117 ms
> 19/06/27 15:59:10 INFO MemoryStore: Block broadcast_0_piece0 stored
>as bytes in memory (estimated size 23.8 KB, free 366.0 MB)
> 19/06/27 15:59:10 INFO BlockManagerInfo: Added broadcast_0_piece0 in
>memory on master:39189 (size: 23.8 KB, free: 366.3 MB)
> 19/06/27 15:59:10 DEBUG BlockManagerMaster: Updated info of block
>broadcast_0_piece0
> 19/06/27 15:59:10 DEBUG BlockManager: Told master about block
>broadcast_0_piece0
> 19/06/27 15:59:10 DEBUG BlockManager: Put block broadcast_0_piece0
>locally took  6 ms
> 19/06/27 15:59:10 DEBUG BlockManager: Putting block
>broadcast_0_piece0 without replication took  6 ms
> 19/06/27 15:59:10 INFO SparkContext: Created broadcast 0 from
>newAPIHadoopFile at TeraSort.scala:60
> 19/06/27 15:59:10 DEBUG Client: The ping interval is 60000 ms.
> 19/06/27 15:59:10 DEBUG Client: Connecting to
>NameNode-1/192.168.3.7:54310
> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
>to NameNode-1/192.168.3.7:54310 from hduser: starting, having
>connections 1
> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
>to NameNode-1/192.168.3.7:54310 from hduser sending #0
> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
>to NameNode-1/192.168.3.7:54310 from hduser got value #0
> 19/06/27 15:59:10 DEBUG ProtobufRpcEngine: Call: getFileInfo took
>31ms
> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
>to NameNode-1/192.168.3.7:54310 from hduser sending #1
> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
>to NameNode-1/192.168.3.7:54310 from hduser got value #1
> 19/06/27 15:59:10 DEBUG ProtobufRpcEngine: Call: getListing took 5ms
> 19/06/27 15:59:10 DEBUG FileInputFormat: Time taken to get
>FileStatuses: 134
> 19/06/27 15:59:10 INFO FileInputFormat: Total input paths to process
>: 2
> 19/06/27 15:59:10 DEBUG FileInputFormat: Total # of splits generated
>by getSplits: 2, TimeTaken: 139
> 19/06/27 15:59:10 DEBUG FileCommitProtocol: Creating committer
>org.apache.spark.internal.io.HadoopMapReduceCommitProtocol; job 1;
>output=hdfs://NameNode-1:54310/tmp/data_sort; dynamic=false
> 19/06/27 15:59:10 DEBUG FileCommitProtocol: Using (String, String,
>Boolean) constructor
> 19/06/27 15:59:10 INFO FileOutputCommitter: File Output Committer
>Algorithm version is 1
> 19/06/27 15:59:10 DEBUG DFSClient: /tmp/data_sort/_temporary/0:
>masked=rwxr-xr-x
> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
>to NameNode-1/192.168.3.7:54310 from hduser sending #2
> 19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
>to NameNode-1/192.168.3.7:54310 from hduser got value #2
> 19/06/27 15:59:10 DEBUG ProtobufRpcEngine: Call: mkdirs took 3ms
> 19/06/27 15:59:10 DEBUG ClosureCleaner: Cleaning lambda:
>$anonfun$write$1
> 19/06/27 15:59:10 DEBUG ClosureCleaner:  +++ Lambda closure
>($anonfun$write$1) is now cleaned +++
> 19/06/27 15:59:10 INFO SparkContext: Starting job: runJob at
>SparkHadoopWriter.scala:78
> 19/06/27 15:59:10 INFO CrailDispatcher: CrailStore starting version
>400
> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.deleteonclose
>false
> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.deleteOnStart
>true
> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.preallocate 0
> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.writeAhead 0
> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.debug false
> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.serializer
>org.apache.spark.serializer.CrailSparkSerializer
> 19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.shuffle.affinity
>true
> 19/06/27 15:59:10 INFO CrailDispatcher:
>spark.crail.shuffle.outstanding 1
> 19/06/27 15:59:10 INFO CrailDispatcher:
>spark.crail.shuffle.storageclass 0
> 19/06/27 15:59:10 INFO CrailDispatcher:
>spark.crail.broadcast.storageclass 0
> 19/06/27 15:59:10 INFO crail: creating singleton crail file system
> 19/06/27 15:59:10 INFO crail: crail.version 3101
> 19/06/27 15:59:10 INFO crail: crail.directorydepth 16
> 19/06/27 15:59:10 INFO crail: crail.tokenexpiration 10
> 19/06/27 15:59:10 INFO crail: crail.blocksize 1048576
> 19/06/27 15:59:10 INFO crail: crail.cachelimit 0
> 19/06/27 15:59:10 INFO crail: crail.cachepath /dev/hugepages/cache
> 19/06/27 15:59:10 INFO crail: crail.user crail
> 19/06/27 15:59:10 INFO crail: crail.shadowreplication 1
> 19/06/27 15:59:10 INFO crail: crail.debug true
> 19/06/27 15:59:10 INFO crail: crail.statistics true
> 19/06/27 15:59:10 INFO crail: crail.rpctimeout 1000
> 19/06/27 15:59:10 INFO crail: crail.datatimeout 1000
> 19/06/27 15:59:10 INFO crail: crail.buffersize 1048576
> 19/06/27 15:59:10 INFO crail: crail.slicesize 65536
> 19/06/27 15:59:10 INFO crail: crail.singleton true
> 19/06/27 15:59:10 INFO crail: crail.regionsize 1073741824
> 19/06/27 15:59:10 INFO crail: crail.directoryrecord 512
> 19/06/27 15:59:10 INFO crail: crail.directoryrandomize true
> 19/06/27 15:59:10 INFO crail: crail.cacheimpl
>org.apache.crail.memory.MappedBufferCache
> 19/06/27 15:59:10 INFO crail: crail.locationmap
> 19/06/27 15:59:10 INFO crail: crail.namenode.address
>crail://192.168.1.164:9060
> 19/06/27 15:59:10 INFO crail: crail.namenode.blockselection
>roundrobin
> 19/06/27 15:59:10 INFO crail: crail.namenode.fileblocks 16
> 19/06/27 15:59:10 INFO crail: crail.namenode.rpctype
>org.apache.crail.namenode.rpc.tcp.TcpNameNode
> 19/06/27 15:59:10 INFO crail: crail.namenode.log
> 19/06/27 15:59:10 INFO crail: crail.storage.types
>org.apache.crail.storage.rdma.RdmaStorageTier
> 19/06/27 15:59:10 INFO crail: crail.storage.classes 1
> 19/06/27 15:59:10 INFO crail: crail.storage.rootclass 0
> 19/06/27 15:59:10 INFO crail: crail.storage.keepalive 2
> 19/06/27 15:59:10 INFO crail: buffer cache, allocationCount 0,
>bufferCount 1024
> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.interface eth0
> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.port 50020
> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.storagelimit
>4294967296
> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.allocationsize
>1073741824
> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.datapath
>/dev/hugepages/rdma
> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.localmap true
> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.queuesize 32
> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.type passive
> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.backlog 100
> 19/06/27 15:59:10 INFO crail: crail.storage.rdma.connecttimeout 1000
> 19/06/27 15:59:10 INFO narpc: new NaRPC server group v1.0,
>queueDepth 32, messageSize 512, nodealy true
> 19/06/27 15:59:10 INFO crail: crail.namenode.tcp.queueDepth 32
> 19/06/27 15:59:10 INFO crail: crail.namenode.tcp.messageSize 512
> 19/06/27 15:59:10 INFO crail: crail.namenode.tcp.cores 1
> 19/06/27 15:59:10 INFO crail: connected to namenode(s)
>/192.168.1.164:9060
> 19/06/27 15:59:10 INFO CrailDispatcher: creating main dir /spark
> 19/06/27 15:59:10 INFO crail: lookupDirectory: path /spark
> 19/06/27 15:59:10 INFO CrailDispatcher: creating main dir /spark
> 19/06/27 15:59:10 INFO crail: createNode: name /spark, type
>DIRECTORY, storageAffinity 0, locationAffinity 0
> 19/06/27 15:59:10 INFO crail: CoreOutputStream, open, path /, fd 0,
>streamId 1, isDir true, writeHint 0
> 19/06/27 15:59:10 INFO crail: passive data client
> 19/06/27 15:59:10 INFO disni: creating  RdmaProvider of type 'nat'
> 19/06/27 15:59:10 INFO disni: jverbs jni version 32
> 19/06/27 15:59:10 INFO disni: sock_addr_in size mismatch, jverbs
>size 28, native size 16
> 19/06/27 15:59:10 INFO disni: IbvRecvWR size match, jverbs size 32,
>native size 32
> 19/06/27 15:59:10 INFO disni: IbvSendWR size mismatch, jverbs size
>72, native size 128
> 19/06/27 15:59:10 INFO disni: IbvWC size match, jverbs size 48,
>native size 48
> 19/06/27 15:59:10 INFO disni: IbvSge size match, jverbs size 16,
>native size 16
> 19/06/27 15:59:10 INFO disni: Remote addr offset match, jverbs size
>40, native size 40
> 19/06/27 15:59:10 INFO disni: Rkey offset match, jverbs size 48,
>native size 48
> 19/06/27 15:59:10 INFO disni: createEventChannel, objId
>139811924587312
> 19/06/27 15:59:10 INFO disni: passive endpoint group, maxWR 32,
>maxSge 4, cqSize 64
> 19/06/27 15:59:10 INFO disni: launching cm processor, cmChannel 0
> 19/06/27 15:59:10 INFO disni: createId, id 139811924676432
> 19/06/27 15:59:10 INFO disni: new client endpoint, id 0, idPriv 0
> 19/06/27 15:59:10 INFO disni: resolveAddr, addres
>/192.168.3.100:4420
> 19/06/27 15:59:10 INFO disni: resolveRoute, id 0
> 19/06/27 15:59:10 INFO disni: allocPd, objId 139811924679808
> 19/06/27 15:59:10 INFO disni: setting up protection domain, context
>467, pd 1
> 19/06/27 15:59:10 INFO disni: setting up cq processor
> 19/06/27 15:59:10 INFO disni: new endpoint CQ processor
> 19/06/27 15:59:10 INFO disni: createCompChannel, context
>139810647883744
> 19/06/27 15:59:10 INFO disni: createCQ, objId 139811924680688, ncqe
>64
> 19/06/27 15:59:10 INFO disni: createQP, objId 139811924691192,
>send_wr size 32, recv_wr_size 32
> 19/06/27 15:59:10 INFO disni: connect, id 0
> 19/06/27 15:59:10 INFO disni: got event type + UNKNOWN, srcAddress
>/192.168.3.13:43273, dstAddress /192.168.3.100:4420
> 19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:
>Registered executor NettyRpcEndpointRef(spark-client://Executor)
>(192.168.3.11:35854) with ID 0
> 19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:
>Registered executor NettyRpcEndpointRef(spark-client://Executor)
>(192.168.3.12:44312) with ID 1
> 19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:
>Registered executor NettyRpcEndpointRef(spark-client://Executor)
>(192.168.3.8:34774) with ID 4
> 19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:
>Registered executor NettyRpcEndpointRef(spark-client://Executor)
>(192.168.3.9:58808) with ID 2
> 19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for
>192.168.3.11
> 19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block
>manager 192.168.3.11:41919 with 366.3 MB RAM, BlockManagerId(0,
>192.168.3.11, 41919, None)
> 19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for
>192.168.3.12
> 19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block
>manager 192.168.3.12:46697 with 366.3 MB RAM, BlockManagerId(1,
>192.168.3.12, 46697, None)
> 19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for
>192.168.3.8
> 19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block
>manager 192.168.3.8:37281 with 366.3 MB RAM, BlockManagerId(4,
>192.168.3.8, 37281, None)
> 19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for
>192.168.3.9
> 19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block
>manager 192.168.3.9:43857 with 366.3 MB RAM, BlockManagerId(2,
>192.168.3.9, 43857, None)
> 19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:
>Registered executor NettyRpcEndpointRef(spark-client://Executor)
>(192.168.3.10:40100) with ID 3
> 19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for
>192.168.3.10
> 19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block
>manager 192.168.3.10:38527 with 366.3 MB RAM, BlockManagerId(3,
>192.168.3.10, 38527, None)
> 19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection
>to NameNode-1/192.168.3.7:54310 from hduser: closed
> 19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection
>to NameNode-1/192.168.3.7:54310 from hduser: stopped, remaining
>connections 0
>
>
> Regards,
>
>           David
>

