Did a fresh pull in the morning.
I am running with Spark 2.4.2, so I had to make changes to the pom.xml file.
This is the Crail version: v1.1-7-ga6e622f
Does this look correct?
Regards,
David
C: 714-476-2692
________________________________
From: Adrian Schüpbach Gribex <adrian.schuepb...@gribex.net>
Sent: Wednesday, June 19, 2019 7:27:52 AM
To: dev@crail.apache.org; David Crespi; dev@crail.apache.org
Subject: RE: Crail-Spark Shuffle Manager config error
Hi David
Do you use the latest Apache Crail from master?
It works only with this version.
Regards
Adrian
On June 19, 2019, 16:19:05 CEST, David Crespi
<david.cre...@storedgesystems.com> wrote:
Adrian,
Did you change the code in the crail-spark-io?
I’m getting a build error now.
[INFO] /crail-spark-io/src/main/scala:-1: info: compiling
[INFO] Compiling 12 source files to /crail-spark-io/target/classes at 1560953910105
[ERROR] /crail-spark-io/src/main/scala/org/apache/spark/storage/CrailDispatcher.scala:119: error: value createConfigurationFromFile is not a member of object org.apache.crail.conf.CrailConfiguration
[ERROR] val crailConf = CrailConfiguration.createConfigurationFromFile();
[ERROR]                                    ^
[ERROR] one error found
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 15.007 s
[INFO] Finished at: 2019-06-19T07:18:33-07:00
Regards,
David
________________________________
From: Adrian Schuepbach <adrian.schuepb...@gribex.net>
Sent: Wednesday, June 19, 2019 5:28:30 AM
To: dev@crail.apache.org
Subject: Re: Crail-Spark Shuffle Manager config error
Hi David
I changed the code to use the new API to create the Crail configuration.
Please pull, build, and install the newest version.
Please also remove the old jars from the directory the classpath points
to: if you have multiple jars of different versions on the classpath, it
is unclear which one will be taken.
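Adrian's point about stale jars can be checked mechanically. Below is a small sketch (the directory path and artifact names are only examples, not taken from this thread) that flags a classpath directory still holding several versions of the same artifact:

```scala
import java.io.File

// Group jar files by artifact name (version suffix stripped) and report
// artifacts that appear more than once, i.e. stale duplicate jars.
object JarVersionCheck {
  // Strip a trailing version suffix such as "-1.0.jar" or
  // "-1.2-incubating-SNAPSHOT.jar" to recover the artifact name.
  private def artifactName(jar: String): String =
    jar.replaceAll("-\\d[^/]*\\.jar$", "")

  def duplicates(jarNames: Seq[String]): Map[String, Seq[String]] =
    jarNames
      .filter(_.endsWith(".jar"))
      .groupBy(artifactName)
      .filter { case (_, versions) => versions.length > 1 }

  def main(args: Array[String]): Unit = {
    // Example directory; pass your real classpath directory as an argument.
    val dir = new File(args.headOption.getOrElse("/crail/jars"))
    val names = Option(dir.listFiles()).getOrElse(Array.empty[File]).map(_.getName).toSeq
    duplicates(names).foreach { case (artifact, jars) =>
      println(s"multiple versions of $artifact: ${jars.mkString(", ")}")
    }
  }
}
```

Any artifact this prints should be cleaned down to a single jar before rebuilding.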
Best regards
Adrian
On 6/19/19 13:43, Adrian Schuepbach wrote:
Hi David
This is caused by the API change for creating a Crail configuration
object. The new API has three different static methods to create the
Crail configuration instead of the empty constructor. I am adapting the
dependent repositories to the new API.
What is a bit unclear to me is why you hit this. The crail-dispatcher's
dependency is on crail-client 1.0, but the new API is only available on
the current master (version 1.2-incubating-SNAPSHOT). If you built
Apache Crail from source, you get 1.2-incubating-SNAPSHOT, not the 1.0
version. I would have expected that you could not even build
crail-spark-io.
In any case, the fix will be ready shortly.
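To illustrate the shape of the change, the sketch below mimics the pattern with a stand-in class. Only the factory name createConfigurationFromFile is taken from the build log in this thread; the class body and map contents are invented placeholders, not the real org.apache.crail.conf.CrailConfiguration:

```scala
// Stand-in for the new-style configuration class: the constructor is
// private, so `new ConfSketch()` no longer compiles for callers.
final class ConfSketch private (val props: Map[String, String])

object ConfSketch {
  // One of the static factory methods that replace the empty constructor.
  // (The real API presumably reads the Crail config file; this stand-in
  // just returns a fixed placeholder map.)
  def createConfigurationFromFile(): ConfSketch =
    new ConfSketch(Map("crail.example" -> "value"))
}

object DispatcherSketch {
  // Old (fails against the new API):  val crailConf = new ConfSketch()
  // New (the one-line fix):
  val crailConf: ConfSketch = ConfSketch.createConfigurationFromFile()
}
```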
Regards
Adrian
On 6/19/19 09:21, Jonas Pfefferle wrote:
Hi David,
I assume you are running with the latest Crail master. We just pushed a
change to the CrailConfiguration initialization which we have not yet
adapted in the shuffle plugin (should be a one-line fix).
@Adrian, can you take a look?
Regards,
Jonas
On Tue, 18 Jun 2019 23:24:48 +0000
David Crespi <david.cre...@storedgesystems.com> wrote:
Hi,
I’m getting what looks to be a configuration error when trying to use
the CrailShuffleManager
(spark.shuffle.manager org.apache.spark.shuffle.crail.CrailShuffleManager).
It seems like a basic error, but other things run okay until I add the
line above into my spark-defaults.conf file.
I have my environment variable for Crail home set, as well as for the
DiSNI libs, using:
LD_LIBRARY_PATH=/usr/local/lib
$ ls -l /usr/local/lib/
total 156
-rwxr-xr-x 1 root root    947 Jun 18 08:11 libdisni.la
lrwxrwxrwx 1 root root     17 Jun 18 08:11 libdisni.so -> libdisni.so.0.0.0
lrwxrwxrwx 1 root root     17 Jun 18 08:11 libdisni.so.0 -> libdisni.so.0.0.0
-rwxr-xr-x 1 root root 149784 Jun 18 08:11 libdisni.so.0.0.0
I also have an environment variable for the classpath set:
CLASSPATH=/disni/target/*:/jNVMf/target/*:/crail/jars/*
Could the classpath variable be the issue?
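One way to answer this directly is to ask the JVM which jar a class was actually resolved from, since a wildcard classpath like the one above can contain several versions of the same artifact. A minimal sketch (the class names queried are only examples):

```scala
// Report the code source (jar or directory) a class was loaded from.
// Bootstrap classes (e.g. java.lang.String) have no code source, hence Option.
object WhichJar {
  def locationOf(className: String): Option[String] =
    Option(Class.forName(className).getProtectionDomain.getCodeSource)
      .map(_.getLocation.toString)

  def main(args: Array[String]): Unit = {
    val name = args.headOption.getOrElse("scala.Option")
    println(s"$name loaded from ${locationOf(name).getOrElse("<bootstrap class loader>")}")
  }
}
```

Running this for the Crail configuration class would show which of the jars on the wildcard classpath actually won.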
19/06/18 15:59:47 DEBUG Client: getting client out of cache: org.apache.hadoop.ipc.Client@7bebcd65
19/06/18 15:59:47 DEBUG PerformanceAdvisory: Both short-circuit local reads and UNIX domain socket are disabled.
19/06/18 15:59:47 DEBUG DataTransferSaslUtil: DataTransferProtocol not using SaslPropertiesResolver, no QOP found in configuration for dfs.data.transfer.protection
19/06/18 15:59:48 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 288.9 KB, free 366.0 MB)
19/06/18 15:59:48 DEBUG BlockManager: Put block broadcast_0 locally took 123 ms
19/06/18 15:59:48 DEBUG BlockManager: Putting block broadcast_0 without replication took 125 ms
19/06/18 15:59:48 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 23.8 KB, free 366.0 MB)
19/06/18 15:59:48 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on master:34103 (size: 23.8 KB, free: 366.3 MB)
19/06/18 15:59:48 DEBUG BlockManagerMaster: Updated info of block broadcast_0_piece0
19/06/18 15:59:48 DEBUG BlockManager: Told master about block broadcast_0_piece0
19/06/18 15:59:48 DEBUG BlockManager: Put block broadcast_0_piece0 locally took 7 ms
19/06/18 15:59:48 DEBUG BlockManager: Putting block broadcast_0_piece0 without replication took 8 ms
19/06/18 15:59:48 INFO SparkContext: Created broadcast 0 from newAPIHadoopFile at TeraSort.scala:60
19/06/18 15:59:48 DEBUG Client: The ping interval is 60000 ms.
19/06/18 15:59:48 DEBUG Client: Connecting to NameNode-1/192.168.3.7:54310
19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to NameNode-1/192.168.3.7:54310 from hduser: starting, having connections 1
19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to NameNode-1/192.168.3.7:54310 from hduser sending #0
19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to NameNode-1/192.168.3.7:54310 from hduser got value #0
19/06/18 15:59:48 DEBUG ProtobufRpcEngine: Call: getFileInfo took 56ms
19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to NameNode-1/192.168.3.7:54310 from hduser sending #1
19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to NameNode-1/192.168.3.7:54310 from hduser got value #1
19/06/18 15:59:48 DEBUG ProtobufRpcEngine: Call: getListing took 3ms
19/06/18 15:59:48 DEBUG FileInputFormat: Time taken to get FileStatuses: 142
19/06/18 15:59:48 INFO FileInputFormat: Total input paths to process : 2
19/06/18 15:59:48 DEBUG FileInputFormat: Total # of splits generated by getSplits: 2, TimeTaken: 145
19/06/18 15:59:48 DEBUG FileCommitProtocol: Creating committer org.apache.spark.internal.io.HadoopMapReduceCommitProtocol; job 1; output=hdfs://NameNode-1:54310/tmp/data_sort; dynamic=false
19/06/18 15:59:48 DEBUG FileCommitProtocol: Using (String, String, Boolean) constructor
19/06/18 15:59:48 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
19/06/18 15:59:48 DEBUG DFSClient: /tmp/data_sort/_temporary/0: masked=rwxr-xr-x
19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to NameNode-1/192.168.3.7:54310 from hduser sending #2
19/06/18 15:59:48 DEBUG Client: IPC Client (199041063) connection to NameNode-1/192.168.3.7:54310 from hduser got value #2
19/06/18 15:59:48 DEBUG ProtobufRpcEngine: Call: mkdirs took 3ms
19/06/18 15:59:48 DEBUG ClosureCleaner: Cleaning lambda: $anonfun$write$1
19/06/18 15:59:48 DEBUG ClosureCleaner: +++ Lambda closure ($anonfun$write$1) is now cleaned +++
19/06/18 15:59:48 INFO SparkContext: Starting job: runJob at SparkHadoopWriter.scala:78
19/06/18 15:59:48 INFO CrailDispatcher: CrailStore starting version 400
19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.deleteonclose false
19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.deleteOnStart true
19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.preallocate 0
19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.writeAhead 0
19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.debug false
19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.serializer org.apache.spark.serializer.CrailSparkSerializer
19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.shuffle.affinity true
19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.shuffle.outstanding 1
19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.shuffle.storageclass 0
19/06/18 15:59:48 INFO CrailDispatcher: spark.crail.broadcast.storageclass 0
Exception in thread "dag-scheduler-event-loop" java.lang.IllegalAccessError: tried to access method org.apache.crail.conf.CrailConfiguration.<init>()V from class org.apache.spark.storage.CrailDispatcher
        at org.apache.spark.storage.CrailDispatcher.org$apache$spark$storage$CrailDispatcher$$init(CrailDispatcher.scala:119)
        at org.apache.spark.storage.CrailDispatcher$.get(CrailDispatcher.scala:662)
        at org.apache.spark.shuffle.crail.CrailShuffleManager.registerShuffle(CrailShuffleManager.scala:52)
        at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:94)
        at org.apache.spark.rdd.ShuffledRDD.getDependencies(ShuffledRDD.scala:87)
        at org.apache.spark.rdd.RDD.$anonfun$dependencies$2(RDD.scala:240)
        at scala.Option.getOrElse(Option.scala:138)
        at org.apache.spark.rdd.RDD.dependencies(RDD.scala:238)
        at org.apache.spark.scheduler.DAGScheduler.getShuffleDependencies(DAGScheduler.scala:512)
        at org.apache.spark.scheduler.DAGScheduler.getOrCreateParentStages(DAGScheduler.scala:461)
        at org.apache.spark.scheduler.DAGScheduler.createResultStage(DAGScheduler.scala:448)
        at org.apache.spark.scheduler.DAGScheduler.handleJobSubmitted(DAGScheduler.scala:962)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2067)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2059)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2048)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
Regards,
David
--
Adrian Schüpbach, Dr. sc. ETH Zürich