Hi David,

Can you clarify which unpatched version you are talking about? Do you mean the NVMf thread fix, where I sent you a link to a branch in my repository, or the fix we provided earlier for the Spark hang in the Crail master?

Generally, if you update, update everything: the clients as well as the datanode/namenode.

Regards,
Jonas

 On Fri, 28 Jun 2019 17:59:32 +0000
 David Crespi <david.cre...@storedgesystems.com> wrote:
Jonas,
FYI - I went back to using the unpatched version of Crail on the clients, and it appears to work okay now with the shuffle and RDMA, with only the RDMA containers running on the server.

Regards,

          David


________________________________
From: David Crespi
Sent: Friday, June 28, 2019 7:49:51 AM
To: Jonas Pfefferle; dev@crail.apache.org
Subject: RE: Setting up storage class 1 and 2


Oh, and while I'm thinking about it, Jonas: when I added the patches you provided the other day, I only added them to the Spark containers (clients), not to the Crail containers running on my storage server. Should the patches have been added to all of the containers?


Regards,


          David


________________________________
From: Jonas Pfefferle <peppe...@japf.ch>
Sent: Friday, June 28, 2019 12:54:27 AM
To: dev@crail.apache.org; David Crespi
Subject: Re: Setting up storage class 1 and 2

Hi David,


At the moment, it is possible to add an NVMf datanode even if only the RDMA storage type is specified in the config. As you have seen, this goes wrong as soon as a client tries to connect to that datanode. Make sure to start the RDMA datanode with the appropriate classname, see:
https://incubator-crail.readthedocs.io/en/latest/run.html
The correct classname is org.apache.crail.storage.rdma.RdmaStorageTier.
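
As a sketch (assuming you use the standard launch script from the docs inside your containers; adjust paths to your setup), the tier class is passed with -t:

  # RDMA storage tier
  $CRAIL_HOME/bin/crail datanode -t org.apache.crail.storage.rdma.RdmaStorageTier

  # NVMf storage tier, only on the nodes that should actually serve NVMf
  $CRAIL_HOME/bin/crail datanode -t org.apache.crail.storage.nvmf.NvmfStorageTier

and every tier type the deployment uses needs to be listed (comma-separated) under crail.storage.types in crail-site.conf:

  crail.storage.types org.apache.crail.storage.rdma.RdmaStorageTier,org.apache.crail.storage.nvmf.NvmfStorageTier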

Regards,
Jonas

 On Thu, 27 Jun 2019 23:09:26 +0000
 David Crespi <david.cre...@storedgesystems.com> wrote:
Hi,
I'm trying to integrate the storage classes, and I'm hitting another issue when running terasort with just the crail-shuffle and HDFS as the tmp storage. The program just sits after the following message:
19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser: closed
19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser: stopped, remaining
connections 0

During this run, I've removed the two Crail NVMf (class 1 and 2) containers from the server, and I'm only running the namenode and an RDMA storage class 1 datanode. My Spark configuration is also now only looking at the RDMA class. Yet it looks as though it's still picking up the NVMf IP and port in the INFO messages seen below (the disni line "resolveAddr, addres /192.168.3.100:4420"; 4420 is the standard NVMe-oF port). I must be configuring something wrong, but I've not been able to track it down. Any thoughts?
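
For reference, the relevant settings I'm running with now look like this (these match the crail.* and spark.crail.* values echoed in the INFO output below; the shuffle manager line is per the crail-spark-io docs, addresses are from my setup):

  # conf/crail-site.conf on the clients
  crail.namenode.address crail://192.168.1.164:9060
  crail.storage.types org.apache.crail.storage.rdma.RdmaStorageTier
  crail.storage.classes 1

  # spark-defaults.conf
  spark.shuffle.manager org.apache.spark.shuffle.crail.CrailShuffleManager
  spark.crail.shuffle.storageclass 0
  spark.crail.broadcast.storageclass 0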


************************************
        TeraSort
************************************
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/crail/jars/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/crail/jars/jnvmf-1.6-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/crail/jars/disni-2.1-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/usr/spark-2.4.2/jars/slf4j-log4j12-1.7.16.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
19/06/27 15:59:07 WARN NativeCodeLoader: Unable to load
native-hadoop library for your platform... using builtin-java classes
where applicable
19/06/27 15:59:07 INFO SparkContext: Running Spark version 2.4.2
19/06/27 15:59:07 INFO SparkContext: Submitted application: TeraSort
19/06/27 15:59:07 INFO SecurityManager: Changing view acls to:
hduser
19/06/27 15:59:07 INFO SecurityManager: Changing modify acls to:
hduser
19/06/27 15:59:07 INFO SecurityManager: Changing view acls groups
to:
19/06/27 15:59:07 INFO SecurityManager: Changing modify acls groups
to:
19/06/27 15:59:07 INFO SecurityManager: SecurityManager:
authentication disabled; ui acls disabled; users  with view
permissions: Set(hduser); groups with view permissions: Set(); users
with modify permissions: Set(hduser); groups with modify
permissions: Set()
19/06/27 15:59:08 DEBUG InternalLoggerFactory: Using SLF4J as the
default logging framework
19/06/27 15:59:08 DEBUG InternalThreadLocalMap:
-Dio.netty.threadLocalMap.stringBuilder.initialSize: 1024
19/06/27 15:59:08 DEBUG InternalThreadLocalMap:
-Dio.netty.threadLocalMap.stringBuilder.maxSize: 4096
19/06/27 15:59:08 DEBUG MultithreadEventLoopGroup:
-Dio.netty.eventLoopThreads: 112
19/06/27 15:59:08 DEBUG PlatformDependent0: -Dio.netty.noUnsafe:
false
19/06/27 15:59:08 DEBUG PlatformDependent0: Java version: 8
19/06/27 15:59:08 DEBUG PlatformDependent0:
sun.misc.Unsafe.theUnsafe: available
19/06/27 15:59:08 DEBUG PlatformDependent0:
sun.misc.Unsafe.copyMemory: available
19/06/27 15:59:08 DEBUG PlatformDependent0: java.nio.Buffer.address:
available
19/06/27 15:59:08 DEBUG PlatformDependent0: direct buffer
constructor: available
19/06/27 15:59:08 DEBUG PlatformDependent0: java.nio.Bits.unaligned:
available, true
19/06/27 15:59:08 DEBUG PlatformDependent0:
jdk.internal.misc.Unsafe.allocateUninitializedArray(int): unavailable
prior to Java9
19/06/27 15:59:08 DEBUG PlatformDependent0:
java.nio.DirectByteBuffer.<init>(long, int): available
19/06/27 15:59:08 DEBUG PlatformDependent: sun.misc.Unsafe:
available
19/06/27 15:59:08 DEBUG PlatformDependent: -Dio.netty.tmpdir: /tmp
(java.io.tmpdir)
19/06/27 15:59:08 DEBUG PlatformDependent: -Dio.netty.bitMode: 64
(sun.arch.data.model)
19/06/27 15:59:08 DEBUG PlatformDependent:
-Dio.netty.noPreferDirect: false
19/06/27 15:59:08 DEBUG PlatformDependent:
-Dio.netty.maxDirectMemory: 1029177344 bytes
19/06/27 15:59:08 DEBUG PlatformDependent:
-Dio.netty.uninitializedArrayAllocationThreshold: -1
19/06/27 15:59:08 DEBUG CleanerJava6: java.nio.ByteBuffer.cleaner():
available
19/06/27 15:59:08 DEBUG NioEventLoop:
-Dio.netty.noKeySetOptimization: false
19/06/27 15:59:08 DEBUG NioEventLoop:
-Dio.netty.selectorAutoRebuildThreshold: 512
19/06/27 15:59:08 DEBUG PlatformDependent:
org.jctools-core.MpscChunkedArrayQueue: available
19/06/27 15:59:08 DEBUG ResourceLeakDetector:
-Dio.netty.leakDetection.level: simple
19/06/27 15:59:08 DEBUG ResourceLeakDetector:
-Dio.netty.leakDetection.targetRecords: 4
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.numHeapArenas: 9
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.numDirectArenas: 10
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.pageSize: 8192
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.maxOrder: 11
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.chunkSize: 16777216
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.tinyCacheSize: 512
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.smallCacheSize: 256
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.normalCacheSize: 64
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.maxCachedBufferCapacity: 32768
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.cacheTrimInterval: 8192
19/06/27 15:59:08 DEBUG PooledByteBufAllocator:
-Dio.netty.allocator.useCacheForAllThreads: true
19/06/27 15:59:08 DEBUG DefaultChannelId: -Dio.netty.processId: 2236
(auto-detected)
19/06/27 15:59:08 DEBUG NetUtil: -Djava.net.preferIPv4Stack: false
19/06/27 15:59:08 DEBUG NetUtil: -Djava.net.preferIPv6Addresses:
false
19/06/27 15:59:08 DEBUG NetUtil: Loopback interface: lo (lo,
127.0.0.1)
19/06/27 15:59:08 DEBUG NetUtil: /proc/sys/net/core/somaxconn: 128
19/06/27 15:59:08 DEBUG DefaultChannelId: -Dio.netty.machineId:
02:42:ac:ff:fe:1b:00:02 (auto-detected)
19/06/27 15:59:08 DEBUG ByteBufUtil: -Dio.netty.allocator.type:
pooled
19/06/27 15:59:08 DEBUG ByteBufUtil:
-Dio.netty.threadLocalDirectBufferSize: 65536
19/06/27 15:59:08 DEBUG ByteBufUtil:
-Dio.netty.maxThreadLocalCharBufferSize: 16384
19/06/27 15:59:08 DEBUG TransportServer: Shuffle server started on
port: 36915
19/06/27 15:59:08 INFO Utils: Successfully started service
'sparkDriver' on port 36915.
19/06/27 15:59:08 DEBUG SparkEnv: Using serializer: class
org.apache.spark.serializer.KryoSerializer
19/06/27 15:59:08 INFO SparkEnv: Registering MapOutputTracker
19/06/27 15:59:08 DEBUG MapOutputTrackerMasterEndpoint: init
19/06/27 15:59:08 INFO CrailShuffleManager: crail shuffle started
19/06/27 15:59:08 INFO SparkEnv: Registering BlockManagerMaster
19/06/27 15:59:08 INFO BlockManagerMasterEndpoint: Using
org.apache.spark.storage.DefaultTopologyMapper for getting topology
information
19/06/27 15:59:08 INFO BlockManagerMasterEndpoint:
BlockManagerMasterEndpoint up
19/06/27 15:59:08 INFO DiskBlockManager: Created local directory at
/tmp/blockmgr-15237510-f459-40e3-8390-10f4742930a5
19/06/27 15:59:08 DEBUG DiskBlockManager: Adding shutdown hook
19/06/27 15:59:08 INFO MemoryStore: MemoryStore started with
capacity 366.3 MB
19/06/27 15:59:08 INFO SparkEnv: Registering OutputCommitCoordinator
19/06/27 15:59:08 DEBUG
OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: init
19/06/27 15:59:08 DEBUG SecurityManager: Created SSL options for ui:
SSLOptions{enabled=false, port=None, keyStore=None,
keyStorePassword=None, trustStore=None, trustStorePassword=None,
protocol=None, enabledAlgorithms=Set()}
19/06/27 15:59:08 INFO Utils: Successfully started service 'SparkUI'
on port 4040.
19/06/27 15:59:08 INFO SparkUI: Bound SparkUI to 0.0.0.0, and
started at http://192.168.1.161:4040
19/06/27 15:59:08 INFO SparkContext: Added JAR
file:/spark-terasort/target/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar
at
spark://master:36915/jars/spark-terasort-1.1-SNAPSHOT-jar-with-dependencies.jar
with timestamp 1561676348562
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint:
Connecting to master spark://master:7077...
19/06/27 15:59:08 DEBUG TransportClientFactory: Creating new
connection to master/192.168.3.13:7077
19/06/27 15:59:08 DEBUG AbstractByteBuf:
-Dio.netty.buffer.bytebuf.checkAccessible: true
19/06/27 15:59:08 DEBUG ResourceLeakDetectorFactory: Loaded default
ResourceLeakDetector: io.netty.util.ResourceLeakDetector@5b1bb5d2
19/06/27 15:59:08 DEBUG TransportClientFactory: Connection to
master/192.168.3.13:7077 successful, running bootstraps...
19/06/27 15:59:08 INFO TransportClientFactory: Successfully created
connection to master/192.168.3.13:7077 after 41 ms (0 ms spent in
bootstraps)
19/06/27 15:59:08 DEBUG Recycler:
-Dio.netty.recycler.maxCapacityPerThread: 32768
19/06/27 15:59:08 DEBUG Recycler:
-Dio.netty.recycler.maxSharedCapacityFactor: 2
19/06/27 15:59:08 DEBUG Recycler: -Dio.netty.recycler.linkCapacity:
16
19/06/27 15:59:08 DEBUG Recycler: -Dio.netty.recycler.ratio: 8
19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Connected to
Spark cluster with app ID app-20190627155908-0005
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
added: app-20190627155908-0005/0 on
worker-20190627152154-192.168.3.11-8882 (192.168.3.11:8882) with 2
core(s)
19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor
ID app-20190627155908-0005/0 on hostPort 192.168.3.11:8882 with 2
core(s), 1024.0 MB RAM
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
added: app-20190627155908-0005/1 on
worker-20190627152150-192.168.3.12-8881 (192.168.3.12:8881) with 2
core(s)
19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor
ID app-20190627155908-0005/1 on hostPort 192.168.3.12:8881 with 2
core(s), 1024.0 MB RAM
19/06/27 15:59:08 DEBUG TransportServer: Shuffle server started on
port: 39189
19/06/27 15:59:08 INFO Utils: Successfully started service
'org.apache.spark.network.netty.NettyBlockTransferService' on port
39189.
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
added: app-20190627155908-0005/2 on
worker-20190627152203-192.168.3.9-8884 (192.168.3.9:8884) with 2
core(s)
19/06/27 15:59:08 INFO NettyBlockTransferService: Server created on
master:39189
19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor
ID app-20190627155908-0005/2 on hostPort 192.168.3.9:8884 with 2
core(s), 1024.0 MB RAM
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
added: app-20190627155908-0005/3 on
worker-20190627152158-192.168.3.10-8883 (192.168.3.10:8883) with 2
core(s)
19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor
ID app-20190627155908-0005/3 on hostPort 192.168.3.10:8883 with 2
core(s), 1024.0 MB RAM
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
added: app-20190627155908-0005/4 on
worker-20190627152207-192.168.3.8-8885 (192.168.3.8:8885) with 2
core(s)
19/06/27 15:59:08 INFO BlockManager: Using
org.apache.spark.storage.RandomBlockReplicationPolicy for block
replication policy
19/06/27 15:59:08 INFO StandaloneSchedulerBackend: Granted executor
ID app-20190627155908-0005/4 on hostPort 192.168.3.8:8885 with 2
core(s), 1024.0 MB RAM
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
updated: app-20190627155908-0005/0 is now RUNNING
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
updated: app-20190627155908-0005/3 is now RUNNING
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
updated: app-20190627155908-0005/4 is now RUNNING
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
updated: app-20190627155908-0005/1 is now RUNNING
19/06/27 15:59:08 INFO StandaloneAppClient$ClientEndpoint: Executor
updated: app-20190627155908-0005/2 is now RUNNING
19/06/27 15:59:08 INFO BlockManagerMaster: Registering BlockManager
BlockManagerId(driver, master, 39189, None)
19/06/27 15:59:08 DEBUG DefaultTopologyMapper: Got a request for
master
19/06/27 15:59:08 INFO BlockManagerMasterEndpoint: Registering block
manager master:39189 with 366.3 MB RAM, BlockManagerId(driver,
master, 39189, None)
19/06/27 15:59:08 INFO BlockManagerMaster: Registered BlockManager
BlockManagerId(driver, master, 39189, None)
19/06/27 15:59:08 INFO BlockManager: Initialized BlockManager:
BlockManagerId(driver, master, 39189, None)
19/06/27 15:59:09 INFO StandaloneSchedulerBackend: SchedulerBackend
is ready for scheduling beginning after reached
minRegisteredResourcesRatio: 0.0
19/06/27 15:59:09 DEBUG SparkContext: Adding shutdown hook
19/06/27 15:59:09 DEBUG BlockReaderLocal:
dfs.client.use.legacy.blockreader.local = false
19/06/27 15:59:09 DEBUG BlockReaderLocal:
dfs.client.read.shortcircuit = false
19/06/27 15:59:09 DEBUG BlockReaderLocal:
dfs.client.domain.socket.data.traffic = false
19/06/27 15:59:09 DEBUG BlockReaderLocal: dfs.domain.socket.path =
19/06/27 15:59:09 DEBUG RetryUtils: multipleLinearRandomRetry = null
19/06/27 15:59:09 DEBUG Server: rpcKind=RPC_PROTOCOL_BUFFER,
rpcRequestWrapperClass=class
org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper,
rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@23f3dbf0
19/06/27 15:59:09 DEBUG Client: getting client out of cache:
org.apache.hadoop.ipc.Client@3ed03652
19/06/27 15:59:09 DEBUG PerformanceAdvisory: Both short-circuit
local reads and UNIX domain socket are disabled.
19/06/27 15:59:09 DEBUG DataTransferSaslUtil: DataTransferProtocol
not using SaslPropertiesResolver, no QOP found in configuration for
dfs.data.transfer.protection
19/06/27 15:59:10 INFO MemoryStore: Block broadcast_0 stored as
values in memory (estimated size 288.9 KB, free 366.0 MB)
19/06/27 15:59:10 DEBUG BlockManager: Put block broadcast_0 locally
took  115 ms
19/06/27 15:59:10 DEBUG BlockManager: Putting block broadcast_0
without replication took  117 ms
19/06/27 15:59:10 INFO MemoryStore: Block broadcast_0_piece0 stored
as bytes in memory (estimated size 23.8 KB, free 366.0 MB)
19/06/27 15:59:10 INFO BlockManagerInfo: Added broadcast_0_piece0 in
memory on master:39189 (size: 23.8 KB, free: 366.3 MB)
19/06/27 15:59:10 DEBUG BlockManagerMaster: Updated info of block
broadcast_0_piece0
19/06/27 15:59:10 DEBUG BlockManager: Told master about block
broadcast_0_piece0
19/06/27 15:59:10 DEBUG BlockManager: Put block broadcast_0_piece0
locally took  6 ms
19/06/27 15:59:10 DEBUG BlockManager: Putting block
broadcast_0_piece0 without replication took  6 ms
19/06/27 15:59:10 INFO SparkContext: Created broadcast 0 from
newAPIHadoopFile at TeraSort.scala:60
19/06/27 15:59:10 DEBUG Client: The ping interval is 60000 ms.
19/06/27 15:59:10 DEBUG Client: Connecting to
NameNode-1/192.168.3.7:54310
19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser: starting, having
connections 1
19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser sending #0
19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser got value #0
19/06/27 15:59:10 DEBUG ProtobufRpcEngine: Call: getFileInfo took
31ms
19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser sending #1
19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser got value #1
19/06/27 15:59:10 DEBUG ProtobufRpcEngine: Call: getListing took 5ms
19/06/27 15:59:10 DEBUG FileInputFormat: Time taken to get
FileStatuses: 134
19/06/27 15:59:10 INFO FileInputFormat: Total input paths to process
: 2
19/06/27 15:59:10 DEBUG FileInputFormat: Total # of splits generated
by getSplits: 2, TimeTaken: 139
19/06/27 15:59:10 DEBUG FileCommitProtocol: Creating committer
org.apache.spark.internal.io.HadoopMapReduceCommitProtocol; job 1;
output=hdfs://NameNode-1:54310/tmp/data_sort; dynamic=false
19/06/27 15:59:10 DEBUG FileCommitProtocol: Using (String, String,
Boolean) constructor
19/06/27 15:59:10 INFO FileOutputCommitter: File Output Committer
Algorithm version is 1
19/06/27 15:59:10 DEBUG DFSClient: /tmp/data_sort/_temporary/0:
masked=rwxr-xr-x
19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser sending #2
19/06/27 15:59:10 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser got value #2
19/06/27 15:59:10 DEBUG ProtobufRpcEngine: Call: mkdirs took 3ms
19/06/27 15:59:10 DEBUG ClosureCleaner: Cleaning lambda:
$anonfun$write$1
19/06/27 15:59:10 DEBUG ClosureCleaner:  +++ Lambda closure
($anonfun$write$1) is now cleaned +++
19/06/27 15:59:10 INFO SparkContext: Starting job: runJob at
SparkHadoopWriter.scala:78
19/06/27 15:59:10 INFO CrailDispatcher: CrailStore starting version
400
19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.deleteonclose
false
19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.deleteOnStart
true
19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.preallocate 0
19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.writeAhead 0
19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.debug false
19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.serializer
org.apache.spark.serializer.CrailSparkSerializer
19/06/27 15:59:10 INFO CrailDispatcher: spark.crail.shuffle.affinity
true
19/06/27 15:59:10 INFO CrailDispatcher:
spark.crail.shuffle.outstanding 1
19/06/27 15:59:10 INFO CrailDispatcher:
spark.crail.shuffle.storageclass 0
19/06/27 15:59:10 INFO CrailDispatcher:
spark.crail.broadcast.storageclass 0
19/06/27 15:59:10 INFO crail: creating singleton crail file system
19/06/27 15:59:10 INFO crail: crail.version 3101
19/06/27 15:59:10 INFO crail: crail.directorydepth 16
19/06/27 15:59:10 INFO crail: crail.tokenexpiration 10
19/06/27 15:59:10 INFO crail: crail.blocksize 1048576
19/06/27 15:59:10 INFO crail: crail.cachelimit 0
19/06/27 15:59:10 INFO crail: crail.cachepath /dev/hugepages/cache
19/06/27 15:59:10 INFO crail: crail.user crail
19/06/27 15:59:10 INFO crail: crail.shadowreplication 1
19/06/27 15:59:10 INFO crail: crail.debug true
19/06/27 15:59:10 INFO crail: crail.statistics true
19/06/27 15:59:10 INFO crail: crail.rpctimeout 1000
19/06/27 15:59:10 INFO crail: crail.datatimeout 1000
19/06/27 15:59:10 INFO crail: crail.buffersize 1048576
19/06/27 15:59:10 INFO crail: crail.slicesize 65536
19/06/27 15:59:10 INFO crail: crail.singleton true
19/06/27 15:59:10 INFO crail: crail.regionsize 1073741824
19/06/27 15:59:10 INFO crail: crail.directoryrecord 512
19/06/27 15:59:10 INFO crail: crail.directoryrandomize true
19/06/27 15:59:10 INFO crail: crail.cacheimpl
org.apache.crail.memory.MappedBufferCache
19/06/27 15:59:10 INFO crail: crail.locationmap
19/06/27 15:59:10 INFO crail: crail.namenode.address
crail://192.168.1.164:9060
19/06/27 15:59:10 INFO crail: crail.namenode.blockselection
roundrobin
19/06/27 15:59:10 INFO crail: crail.namenode.fileblocks 16
19/06/27 15:59:10 INFO crail: crail.namenode.rpctype
org.apache.crail.namenode.rpc.tcp.TcpNameNode
19/06/27 15:59:10 INFO crail: crail.namenode.log
19/06/27 15:59:10 INFO crail: crail.storage.types
org.apache.crail.storage.rdma.RdmaStorageTier
19/06/27 15:59:10 INFO crail: crail.storage.classes 1
19/06/27 15:59:10 INFO crail: crail.storage.rootclass 0
19/06/27 15:59:10 INFO crail: crail.storage.keepalive 2
19/06/27 15:59:10 INFO crail: buffer cache, allocationCount 0,
bufferCount 1024
19/06/27 15:59:10 INFO crail: crail.storage.rdma.interface eth0
19/06/27 15:59:10 INFO crail: crail.storage.rdma.port 50020
19/06/27 15:59:10 INFO crail: crail.storage.rdma.storagelimit
4294967296
19/06/27 15:59:10 INFO crail: crail.storage.rdma.allocationsize
1073741824
19/06/27 15:59:10 INFO crail: crail.storage.rdma.datapath
/dev/hugepages/rdma
19/06/27 15:59:10 INFO crail: crail.storage.rdma.localmap true
19/06/27 15:59:10 INFO crail: crail.storage.rdma.queuesize 32
19/06/27 15:59:10 INFO crail: crail.storage.rdma.type passive
19/06/27 15:59:10 INFO crail: crail.storage.rdma.backlog 100
19/06/27 15:59:10 INFO crail: crail.storage.rdma.connecttimeout 1000
19/06/27 15:59:10 INFO narpc: new NaRPC server group v1.0,
queueDepth 32, messageSize 512, nodealy true
19/06/27 15:59:10 INFO crail: crail.namenode.tcp.queueDepth 32
19/06/27 15:59:10 INFO crail: crail.namenode.tcp.messageSize 512
19/06/27 15:59:10 INFO crail: crail.namenode.tcp.cores 1
19/06/27 15:59:10 INFO crail: connected to namenode(s)
/192.168.1.164:9060
19/06/27 15:59:10 INFO CrailDispatcher: creating main dir /spark
19/06/27 15:59:10 INFO crail: lookupDirectory: path /spark
19/06/27 15:59:10 INFO CrailDispatcher: creating main dir /spark
19/06/27 15:59:10 INFO crail: createNode: name /spark, type
DIRECTORY, storageAffinity 0, locationAffinity 0
19/06/27 15:59:10 INFO crail: CoreOutputStream, open, path /, fd 0,
streamId 1, isDir true, writeHint 0
19/06/27 15:59:10 INFO crail: passive data client
19/06/27 15:59:10 INFO disni: creating  RdmaProvider of type 'nat'
19/06/27 15:59:10 INFO disni: jverbs jni version 32
19/06/27 15:59:10 INFO disni: sock_addr_in size mismatch, jverbs
size 28, native size 16
19/06/27 15:59:10 INFO disni: IbvRecvWR size match, jverbs size 32,
native size 32
19/06/27 15:59:10 INFO disni: IbvSendWR size mismatch, jverbs size
72, native size 128
19/06/27 15:59:10 INFO disni: IbvWC size match, jverbs size 48,
native size 48
19/06/27 15:59:10 INFO disni: IbvSge size match, jverbs size 16,
native size 16
19/06/27 15:59:10 INFO disni: Remote addr offset match, jverbs size
40, native size 40
19/06/27 15:59:10 INFO disni: Rkey offset match, jverbs size 48,
native size 48
19/06/27 15:59:10 INFO disni: createEventChannel, objId
139811924587312
19/06/27 15:59:10 INFO disni: passive endpoint group, maxWR 32,
maxSge 4, cqSize 64
19/06/27 15:59:10 INFO disni: launching cm processor, cmChannel 0
19/06/27 15:59:10 INFO disni: createId, id 139811924676432
19/06/27 15:59:10 INFO disni: new client endpoint, id 0, idPriv 0
19/06/27 15:59:10 INFO disni: resolveAddr, addres
/192.168.3.100:4420
19/06/27 15:59:10 INFO disni: resolveRoute, id 0
19/06/27 15:59:10 INFO disni: allocPd, objId 139811924679808
19/06/27 15:59:10 INFO disni: setting up protection domain, context
467, pd 1
19/06/27 15:59:10 INFO disni: setting up cq processor
19/06/27 15:59:10 INFO disni: new endpoint CQ processor
19/06/27 15:59:10 INFO disni: createCompChannel, context
139810647883744
19/06/27 15:59:10 INFO disni: createCQ, objId 139811924680688, ncqe
64
19/06/27 15:59:10 INFO disni: createQP, objId 139811924691192,
send_wr size 32, recv_wr_size 32
19/06/27 15:59:10 INFO disni: connect, id 0
19/06/27 15:59:10 INFO disni: got event type + UNKNOWN, srcAddress
/192.168.3.13:43273, dstAddress /192.168.3.100:4420
19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:
Registered executor NettyRpcEndpointRef(spark-client://Executor)
(192.168.3.11:35854) with ID 0
19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:
Registered executor NettyRpcEndpointRef(spark-client://Executor)
(192.168.3.12:44312) with ID 1
19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:
Registered executor NettyRpcEndpointRef(spark-client://Executor)
(192.168.3.8:34774) with ID 4
19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:
Registered executor NettyRpcEndpointRef(spark-client://Executor)
(192.168.3.9:58808) with ID 2
19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for
192.168.3.11
19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block
manager 192.168.3.11:41919 with 366.3 MB RAM, BlockManagerId(0,
192.168.3.11, 41919, None)
19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for
192.168.3.12
19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block
manager 192.168.3.12:46697 with 366.3 MB RAM, BlockManagerId(1,
192.168.3.12, 46697, None)
19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for
192.168.3.8
19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block
manager 192.168.3.8:37281 with 366.3 MB RAM, BlockManagerId(4,
192.168.3.8, 37281, None)
19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for
192.168.3.9
19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block
manager 192.168.3.9:43857 with 366.3 MB RAM, BlockManagerId(2,
192.168.3.9, 43857, None)
19/06/27 15:59:11 INFO CoarseGrainedSchedulerBackend$DriverEndpoint:
Registered executor NettyRpcEndpointRef(spark-client://Executor)
(192.168.3.10:40100) with ID 3
19/06/27 15:59:11 DEBUG DefaultTopologyMapper: Got a request for
192.168.3.10
19/06/27 15:59:11 INFO BlockManagerMasterEndpoint: Registering block
manager 192.168.3.10:38527 with 366.3 MB RAM, BlockManagerId(3,
192.168.3.10, 38527, None)
19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser: closed
19/06/27 15:59:20 DEBUG Client: IPC Client (1998371610) connection
to NameNode-1/192.168.3.7:54310 from hduser: stopped, remaining
connections 0


Regards,

          David


