Hi,

Why are you setting carbon.kettle.home=/opt/data-integration? It is supposed to be <carbondata>/processing/carbonplugins, right? It seems `/opt/data-integration` contains some other plugins as well, which is why it is throwing this exception. Please keep only carbonplugins as the Kettle home.
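For example, the corrected entry in carbon.properties would look like the following. This is only a sketch: it assumes your CarbonData checkout is at /opt/incubator-carbondata (as your logs suggest), so adjust the path to wherever the carbonplugins directory actually lives on your machine.

#Mandatory. path to kettle home
#Point this at the carbonplugins directory shipped with CarbonData,
#not at a full data-integration installation (assumed checkout path below).
carbon.kettle.home=/opt/incubator-carbondata/processing/carbonplugins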
Thanks & Regards,
Ravindra.

On 4 August 2016 at 14:01, 金铸 <[email protected]> wrote:

> I have a question: if I upload the CSV file to HDFS and then load it from
> spark-shell, the load does not succeed.
>
> carbon.properties:
>
> #Mandatory. Carbon Store path
> carbon.storelocation=hdfs://master:9100/opt/CarbonStore
> #Base directory for Data files
> carbon.ddl.base.hdfs.url=hdfs://master:9100/opt/data
> #Path where the bad records are stored
> carbon.badRecords.location=/opt/Carbon/Spark/badrecords
> #Mandatory. path to kettle home
> #carbon.kettle.home=$<SPARK_HOME>/carbonlib/carbonplugins
> carbon.kettle.home=/opt/data-integration
>
> data-integration version: 4.4
>
> [hadoop@slave2 bin]$ ./carbon-spark-shell
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/opt/incubator-carbondata/assembly/target/scala-2.10/carbondata_2.10-0.1.0-SNAPSHOT-shade-hadoop2.2.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/opt/spark/lib/spark-assembly-1.6.2-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>
> scala> cc.sql("load data inpath 'hdfs://master:9100/opt/carbondata/sample.csv' into table test_table options('FILEHEADER'='id,name,city,age')")
>
> INFO 05-08 00:11:59,938 - main Property file path: /opt/incubator-carbondata/bin/../../../conf/carbon.properties
> INFO 05-08 00:11:59,938 - main ------Using Carbon.properties --------
> INFO 05-08 00:11:59,938 - main {carbon.graph.rowset.size=100000, carbon.enable.quick.filter=false, carbon.number.of.cores=4, carbon.sort.file.buffer.size=20, carbon.kettle.home=/opt/data-integration, carbon.number.of.cores.while.compacting=2, carbon.compaction.level.threshold=4,3, carbon.number.of.cores.while.loading=6, carbon.badRecords.location=/opt/Carbon/Spark/badrecords, carbon.sort.size=500000, carbon.inmemory.record.size=120000, carbon.enableXXHash=true, carbon.ddl.base.hdfs.url=hdfs://master:9100/opt/data, carbon.major.compaction.size=1024, carbon.storelocation=hdfs://master:9100/opt/CarbonStore}
> INFO 05-08 00:11:59,939 - main Query [LOAD DATA INPATH 'HDFS://MASTER:9100/OPT/CARBONDATA/SAMPLE.CSV' INTO TABLE TEST_TABLE OPTIONS('FILEHEADER'='ID,NAME,CITY,AGE')]
>
> ERROR 05-08 00:12:06,268 - Exception in task 1.0 in stage 3.0 (TID 6)
> java.lang.NoClassDefFoundError: org/scannotation/AnnotationDB
>     at org.pentaho.di.core.plugins.BasePluginType.findAnnotatedClassFiles(BasePluginType.java:237)
>     at org.pentaho.di.core.plugins.BasePluginType.registerPluginJars(BasePluginType.java:500)
>     at org.pentaho.di.core.plugins.BasePluginType.searchPlugins(BasePluginType.java:115)
>     at org.pentaho.di.core.plugins.PluginRegistry.init(PluginRegistry.java:420)
>     at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:121)
>     at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:68)
>     at org.carbondata.processing.graphgenerator.GraphGenerator.validateAndInitialiseKettelEngine(GraphGenerator.java:304)
>     at org.carbondata.processing.graphgenerator.GraphGenerator.generateGraph(GraphGenerator.java:278)
>     at org.carbondata.spark.load.CarbonLoaderUtil.generateGraph(CarbonLoaderUtil.java:118)
>     at org.carbondata.spark.load.CarbonLoaderUtil.executeGraph(CarbonLoaderUtil.java:173)
>     at org.carbondata.spark.rdd.CarbonDataLoadRDD$$anon$1.<init>(CarbonDataLoadRDD.scala:196)
>     at org.carbondata.spark.rdd.CarbonDataLoadRDD.compute(CarbonDataLoadRDD.scala:155)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>     at org.apache.spark.scheduler.Task.run(Task.scala:89)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.ClassNotFoundException: org.scannotation.AnnotationDB
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>     ... 20 more
>
> WARN 05-08 00:12:05,419 - Lost task 0.0 in stage 3.0 (TID 5, slave1): org.carbondata.processing.etl.DataLoadingException: Internal Errors
>     at org.carbondata.processing.csvload.DataGraphExecuter.execute(DataGraphExecuter.java:212)
>     at org.carbondata.processing.csvload.DataGraphExecuter.executeGraph(DataGraphExecuter.java:144)
>     at org.carbondata.spark.load.CarbonLoaderUtil.executeGraph(CarbonLoaderUtil.java:176)
>     at org.carbondata.spark.rdd.CarbonDataLoadRDD$$anon$1.<init>(CarbonDataLoadRDD.scala:196)
>     at org.carbondata.spark.rdd.CarbonDataLoadRDD.compute(CarbonDataLoadRDD.scala:155)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>     at org.apache.spark.scheduler.Task.run(Task.scala:89)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
>
> ========
> ERROR 05-08 00:12:05,564 - [test_table: Graph - MDKeyGentest_table][partitionID:0] Problem while copying file from local store to carbon store
> org.carbondata.processing.store.writer.exception.CarbonDataWriterException: Problem while copying file from local store to carbon store
>     at org.carbondata.processing.store.writer.AbstractFactDataWriter.copyCarbonDataFileToCarbonStorePath(AbstractFactDataWriter.java:598)
>     at org.carbondata.processing.store.writer.AbstractFactDataWriter.closeWriter(AbstractFactDataWriter.java:504)
>     at org.carbondata.processing.store.CarbonFactDataHandlerColumnar.closeHandler(CarbonFactDataHandlerColumnar.java:860)
>     at org.carbondata.processing.mdkeygen.MDKeyGenStep.processingComplete(MDKeyGenStep.java:240)
>     at org.carbondata.processing.mdkeygen.MDKeyGenStep.processRow(MDKeyGenStep.java:229)
>     at org.pentaho.di.trans.step.RunThread.run(RunThread.java:50)
>     at java.lang.Thread.run(Thread.java:745)
>
> Thanks a lot!
--
Thanks & Regards,
Ravi
