[ https://issues.apache.org/jira/browse/KYLIN-4473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17094194#comment-17094194 ]

Xiaoxiang Yu commented on KYLIN-4473:
-------------------------------------

Hi, [~JerryVerghesec], could you please attach your complete kylin.log? Or, if 
the log is not shareable, you may attach the output of this:

 
{code:java}
jinfo <KYLIN_PID>{code}
 
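For reference, a minimal sketch of capturing that output (assuming a standard deployment where {{jps}} is on the PATH and the Kylin server JVM is identifiable by its main class; the grep pattern is an assumption, adjust it for your setup):

```shell
# Hypothetical helper: locate the Kylin server JVM and save its jinfo output.
# The awk pattern ("kylin|catalina") is an assumption about the main class name.
KYLIN_PID=$(jps -l 2>/dev/null | awk '/kylin|catalina/ {print $1; exit}')
if [ -n "$KYLIN_PID" ]; then
  jinfo "$KYLIN_PID" > kylin-jinfo.txt
else
  echo "Kylin JVM not found; check jps -l output manually"
fi
```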

> Issue while writing HFiles using Apache Spark in EMR
> ----------------------------------------------------
>
>                 Key: KYLIN-4473
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4473
>             Project: Kylin
>          Issue Type: Bug
>          Components: Spark Engine, Storage - HBase
>    Affects Versions: v3.0.1
>         Environment: EMR 
>            Reporter: Jerry Verghese Cheruvathoor
>            Priority: Major
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> While trying to build a cube using the Spark engine, I am facing an issue 
> creating the HFiles.
> Spark command executed from Kylin:
> export HADOOP_CONF_DIR=/etc/hadoop/conf && 
> /usr/local/kylin/apache-kylin-3.0.1-bin-hbase1x/spark/bin/spark-submit 
> --class org.apache.kylin.common.util.SparkEntry --name "Convert Cuboid Data 
> to HFile" --conf spark.executor.instances=40 --conf spark.yarn.queue=default 
> --conf spark.history.fs.logDirectory=hdfs:///kylin/spark-history --conf 
> spark.master=yarn --conf spark.hadoop.yarn.timeline-service.enabled=false 
> --conf spark.executor.memory=4G --conf spark.eventLog.enabled=true --conf 
> spark.eventLog.dir=hdfs:///kylin/spark-history --conf 
> spark.yarn.executor.memoryOverhead=1024 --conf spark.driver.memory=2G --conf 
> spark.shuffle.service.enabled=true --jars 
> /usr/lib/hbase/lib/hbase-common-1.4.10.jar,/usr/lib/hbase/lib/hbase-server-1.4.10.jar,/usr/lib/hbase/lib/hbase-client-1.4.10.jar,/usr/lib/hbase/lib/hbase-protocol-1.4.10.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.10.jar,/usr/lib/hbase/lib/htrace-core-3.1.0-incubating.jar,/usr/lib/hbase/lib/metrics-core-2.2.0.jar,/usr/lib/hbase/lib/hbase-hadoop-compat-1.4.10.jar,/usr/lib/hbase/lib/hbase-hadoop2-compat-1.4.10.jar,
>  /usr/local/kylin/apache-kylin-3.0.1-bin-hbase1x/lib/kylin-job-3.0.1.jar 
> -className org.apache.kylin.storage.hbase.steps.SparkCubeHFile -partitions 
> hdfs://ip-10-234-119-182.w2.ngap2dev.nike.com:8020/kylin/kylin_metadata4/kylin-75dc6473-b974-c5c7-5d9b-bc42142d4ee9/test1/rowkey_stats/part-r-00000_hfile
>  -counterOutput 
> hdfs://ip-10-234-119-182.w2.ngap2dev.nike.com:8020/kylin/kylin_metadata4/kylin-75dc6473-b974-c5c7-5d9b-bc42142d4ee9/test1/counter
>  -cubename test1 -output 
> hdfs://ip-10-234-119-182.w2.ngap2dev.nike.com:8020/kylin/kylin_metadata4/kylin-75dc6473-b974-c5c7-5d9b-bc42142d4ee9/test1/hfile
>  -input 
> hdfs://ip-10-234-119-182.w2.ngap2dev.nike.com:8020/kylin/kylin_metadata4/kylin-75dc6473-b974-c5c7-5d9b-bc42142d4ee9/test1/cuboid/
>  -segmentId af2fe8db-6132-4b53-1680-7f93ac622f25 -metaUrl 
> kylin_metadata4@hdfs,path=hdfs://ip-10-234-119-182.w2.ngap2dev.nike.com:8020/kylin/kylin_metadata4/kylin-75dc6473-b974-c5c7-5d9b-bc42142d4ee9/test1/metadata
>  -hbaseConfPath 
> hdfs://ip-10-234-119-182.w2.ngap2dev.nike.com:8020/kylin/kylin_metadata4/kylin-75dc6473-b974-c5c7-5d9b-bc42142d4ee9/hbase-conf.xml
> Below is the error log:
> 2020-04-28 05:21:01 WARN  TaskSetManager:66 - 
> Lost task 0.0 in stage 1.0 (TID 2, ip-10-234-115-209.w2.ngap2dev.nike.com, 
> executor 4): org.apache.spark.SparkException: Task failed while writing rows 
> at 
> org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:155)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:83)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:78)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at 
> org.apache.spark.scheduler.Task.run(Task.scala:109) at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)Caused by: 
> java.lang.NoClassDefFoundError: 
> Lorg/apache/hadoop/hbase/metrics/MetricRegistry; at 
> java.lang.Class.getDeclaredFields0(Native Method) at 
> java.lang.Class.privateGetDeclaredFields(Class.java:2583) at 
> java.lang.Class.getDeclaredFields(Class.java:1916) at 
> org.apache.hadoop.util.ReflectionUtils.getDeclaredFieldsIncludingInherited(ReflectionUtils.java:323)
>  at 
> org.apache.hadoop.metrics2.lib.MetricsSourceBuilder.initRegistry(MetricsSourceBuilder.java:92)
>  at 
> org.apache.hadoop.metrics2.lib.MetricsSourceBuilder.<init>(MetricsSourceBuilder.java:56)
>  at 
> org.apache.hadoop.metrics2.lib.MetricsAnnotations.newSourceBuilder(MetricsAnnotations.java:43)
>  at 
> org.apache.hadoop.metrics2.impl.MetricsSystemImpl.register(MetricsSystemImpl.java:224)
>  at 
> org.apache.hadoop.hbase.metrics.BaseSourceImpl.<init>(BaseSourceImpl.java:115)
>  at 
> org.apache.hadoop.hbase.io.MetricsIOSourceImpl.<init>(MetricsIOSourceImpl.java:44)
>  at 
> org.apache.hadoop.hbase.io.MetricsIOSourceImpl.<init>(MetricsIOSourceImpl.java:36)
>  at 
> org.apache.hadoop.hbase.regionserver.MetricsRegionServerSourceFactoryImpl.createIO(MetricsRegionServerSourceFactoryImpl.java:73)
>  at org.apache.hadoop.hbase.io.MetricsIO.<init>(MetricsIO.java:32) at 
> org.apache.hadoop.hbase.io.hfile.HFile.<clinit>(HFile.java:191) at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:914)
>  at 
> org.apache.hadoop.hbase.regionserver.StoreFile$Writer.<init>(StoreFile.java:834)
>  at 
> org.apache.hadoop.hbase.regionserver.StoreFile$WriterBuilder.build(StoreFile.java:764)
>  at 
> org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.getNewWriter(HFileOutputFormat3.java:224)
>  at 
> org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:181)
>  at 
> org.apache.kylin.storage.hbase.steps.HFileOutputFormat3$1.write(HFileOutputFormat3.java:153)
>  at 
> org.apache.spark.internal.io.HadoopMapReduceWriteConfigUtil.write(SparkHadoopWriter.scala:356)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:130)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$4.apply(SparkHadoopWriter.scala:127)
>  at 
> org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1415)
>  at 
> org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:139)
>  ... 8 moreCaused by: java.lang.ClassNotFoundException: 
> org.apache.hadoop.hbase.metrics.MetricRegistry at 
> java.net.URLClassLoader.findClass(URLClassLoader.java:382) at 
> java.lang.ClassLoader.loadClass(ClassLoader.java:419) at 
> java.lang.ClassLoader.loadClass(ClassLoader.java:352) ... 33 more
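For what it's worth, the {{NoClassDefFoundError}} on {{org.apache.hadoop.hbase.metrics.MetricRegistry}} in the trace above looks like a classpath gap on the executors: that class ships in the {{hbase-metrics-api}} jar in HBase 1.4.x, and no such jar appears in the {{--jars}} list of the spark-submit command. A sketch for checking this on the EMR master follows; the lib path is taken from the reporter's command, and whether appending the jar resolves the build is an assumption to verify:

```shell
# Check whether the hbase-metrics jars exist under the HBase lib directory.
# If they do, they likely need to be appended to the spark-submit --jars list
# (or to kylin.engine.spark-conf.spark.yarn.jars in kylin.properties).
HBASE_LIB=${HBASE_LIB:-/usr/lib/hbase/lib}
FOUND=$(ls "$HBASE_LIB"/hbase-metrics*.jar 2>/dev/null)
if [ -n "$FOUND" ]; then
  echo "$FOUND"
else
  echo "no hbase-metrics jars found in $HBASE_LIB"
fi
```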



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
