Need help on building a cube with Spark

Lingaraja A J Wed, 11 Dec 2019 20:04:23 -0800

Hi team,

Greetings for the day!


I'm using Kylin to build an OLAP cube system for my project. I have created a 
cube with the engine type set to Spark and am trying to build it. The build 
consists of 11 steps, and the Spark job runs at step 6. At this step, it 
throws the following error:

Running org.apache.kylin.engine.spark.SparkCubingByLayer -hiveTable default.kylin_intermediate_table -output gs://bucket/kylin/kylin_metadata/kylin-b52a579f-3084-1ced-3544-fdf5b3eccb8a/model/cuboid/ -input gs://bucket/kylin/kylin_metadata/kylin-b52a579f-3084-1ced-3544-fdf5b3eccb8a/kylin_intermediate_table -segmentId 9047b119-14ad-ecb8-bb87-c76d255907df -metaUrl kylin_metadata@hdfs,path=gs://bucket/kylin/kylin_metadata/kylin-b52a579f-3084-1ced-3544-fdf5b3eccb8a/model/metadata -cubename cube
Exception in thread "main" java.lang.NoSuchMethodError: org.apache.kylin.common.persistence.ResourceStore.getResource(Ljava/lang/String;Ljava/lang/Class;Lorg/apache/kylin/common/persistence/Serializer;)Lorg/apache/kylin/common/persistence/RootPersistentEntity;
    at org.apache.kylin.cube.CubeManager.loadCubeInstance(CubeManager.java:682)
    at org.apache.kylin.cube.CubeManager.loadAllCubeInstance(CubeManager.java:671)
    at org.apache.kylin.cube.CubeManager.<init>(CubeManager.java:128)
    at org.apache.kylin.cube.CubeManager.getInstance(CubeManager.java:97)
    at org.apache.kylin.engine.spark.SparkCubingByLayer.execute(SparkCubingByLayer.java:137)
    at org.apache.kylin.common.util.AbstractApplication.execute(AbstractApplication.java:37)
    at org.apache.kylin.common.util.SparkEntry.main(SparkEntry.java:44)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
The command is:
export HADOOP_CONF_DIR=/etc/hadoop/conf && /usr/lib/spark/bin/spark-submit --class org.apache.kylin.common.util.SparkEntry --conf spark.executor.instances=40 --conf spark.yarn.queue=default --conf spark.history.fs.logDirectory=hdfs:///kylin/spark-history --conf spark.io.compression.codec=org.apache.spark.io.SnappyCompressionCodec --conf spark.master=yarn --conf spark.hadoop.yarn.timeline-service.enabled=false --conf spark.executor.memory=4G --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=hdfs:///kylin/spark-history --conf spark.yarn.executor.memoryOverhead=1024 --conf spark.driver.memory=2G --conf spark.shuffle.service.enabled=true --jars /etc/kylin/lib/kylin-job-2.6.4.jar /etc/kylin/lib/kylin-job-2.6.4.jar -className org.apache.kylin.engine.spark.SparkCubingByLayer -hiveTable default.kylin_intermediate_sol_v3_new_9047b119_14ad_ecb8_bb87_c76d255907df -output gs://bucket/kylin/kylin_metadata/kylin-b52a579f-3084-1ced-3544-fdf5b3eccb8a/model/cuboid/ -input gs://bucket/kylin/kylin_metadata/kylin-b52a579f-3084-1ced-3544-fdf5b3eccb8a/kylin_intermediate_table -segmentId 9047b119-14ad-ecb8-bb87-c76d255907df -metaUrl kylin_metadata@hdfs,path=gs://bucket/kylin/kylin_metadata/kylin-b52a579f-3084-1ced-3544-fdf5b3eccb8a/model/metadata -cubename cube
    at org.apache.kylin.engine.spark.SparkExecutable.doWork(SparkExecutable.java:347)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
    at org.apache.kylin.job.execution.DefaultChainedExecutable.doWork(DefaultChainedExecutable.java:71)
    at org.apache.kylin.job.execution.AbstractExecutable.execute(AbstractExecutable.java:167)
    at org.apache.kylin.job.impl.threadpool.DefaultScheduler$JobRunner.run(DefaultScheduler.java:114)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)



I checked the kylin-core-common-2.6.4.jar file that I placed in both the 
/usr/lib/spark/jars path and the /etc/kylin/spark/jars path. The source code 
in that jar does contain getResource() implementations in the ResourceStore 
class:



public final <T extends RootPersistentEntity> T getResource(final String resPath, final Serializer<T> serializer) throws IOException {
    return this.getResource(resPath, (org.apache.kylin.common.persistence.ContentReader<T>)new ContentReader((Serializer)serializer));
}

public final <T extends RootPersistentEntity> T getResource(String resPath, final ContentReader<T> reader) throws IOException {
    resPath = this.norm(resPath);
    final RawResource res = this.getResourceWithRetry(resPath);
    if (res == null) {
        return null;
    }
    return (T)reader.readContent(res);
}

public final RawResource getResource(final String resPath) throws IOException {
    return this.getResourceWithRetry(this.norm(resPath));
}


I'm not sure why Kylin is unable to find this method. This is critical to 
deciding whether to use Kylin in my project in the near future. Please let me 
know as soon as possible if I'm missing something here or if you need more 
details.
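
One detail worth checking: the descriptor in the NoSuchMethodError names a 
three-argument getResource(String, Class, Serializer), whereas the overloads 
quoted above take at most two arguments. That pattern usually means the 
calling class (CubeManager here) was compiled against a different version of 
kylin-core-common than the ResourceStore that is loaded at runtime. Below is a 
minimal sketch for checking this, assuming it is run with the same classpath 
as the Spark driver. The class ClasspathProbe and its helper methods are 
hypothetical names introduced for illustration, not part of Kylin or Spark; 
only standard JDK reflection calls are used. It prints where a class was 
loaded from and which overloads of a given method that class exposes.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper; not part of Kylin or Spark.
public class ClasspathProbe {

    // Returns the jar or directory a class was loaded from, or a placeholder
    // when the JVM reports no code source (e.g. JDK bootstrap classes).
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    // Lists the public overloads of methodName using erased types, which is
    // the form that appears in a NoSuchMethodError descriptor.
    public static List<String> overloadsOf(Class<?> cls, String methodName) {
        List<String> out = new ArrayList<>();
        for (Method m : cls.getMethods()) {
            if (!m.getName().equals(methodName)) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append(m.getReturnType().getName()).append(' ')
              .append(methodName).append('(');
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(params[i].getName());
            }
            out.add(sb.append(')').toString());
        }
        Collections.sort(out);
        return out;
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults are taken from the error message above.
        String className = args.length > 0
                ? args[0] : "org.apache.kylin.common.persistence.ResourceStore";
        String methodName = args.length > 1 ? args[1] : "getResource";
        // initialize=false: load the class without running static initializers.
        Class<?> cls = Class.forName(
                className, false, ClasspathProbe.class.getClassLoader());
        System.out.println(className + " loaded from " + locationOf(cls));
        for (String sig : overloadsOf(cls, methodName)) {
            System.out.println("  " + sig);
        }
    }
}
```

Running it once for org.apache.kylin.common.persistence.ResourceStore and once 
for org.apache.kylin.cube.CubeManager (with getInstance as the method name) 
would show whether the two classes come from jars of different versions.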


Thanks & regards,

Lingaraja A J
Senior Software Engineer
Tredence Analytics Solutions Pvt. Ltd.

