I have a question. After uploading the CSV file to HDFS, I try to load it
from carbon-spark-shell, but the load fails.
carbon.properties:
#Mandatory. Carbon Store path
carbon.storelocation=hdfs://master:9100/opt/CarbonStore
#Base directory for Data files
carbon.ddl.base.hdfs.url=hdfs://master:9100/opt/data
#Path where the bad records are stored
carbon.badRecords.location=/opt/Carbon/Spark/badrecords
#Mandatory. path to kettle home
#carbon.kettle.home=$<SPARK_HOME>/carbonlib/carbonplugins
carbon.kettle.home=/opt/data-integration
data-integration version: 4.4
[hadoop@slave2 bin]$ ./carbon-spark-shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/incubator-carbondata/assembly/target/scala-2.10/carbondata_2.10-0.1.0-SNAPSHOT-shade-hadoop2.2.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/spark/lib/spark-assembly-1.6.2-hadoop2.6.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
scala> cc.sql("load data inpath 'hdfs://master:9100/opt/carbondata/sample.csv' into table test_table options('FILEHEADER'='id,name,city,age')")
INFO 05-08 00:11:59,938 - main Property file path: /opt/incubator-carbondata/bin/../../../conf/carbon.properties
INFO 05-08 00:11:59,938 - main ------Using Carbon.properties --------
INFO 05-08 00:11:59,938 - main {carbon.graph.rowset.size=100000, carbon.enable.quick.filter=false, carbon.number.of.cores=4, carbon.sort.file.buffer.size=20, carbon.kettle.home=/opt/data-integration, carbon.number.of.cores.while.compacting=2, carbon.compaction.level.threshold=4,3, carbon.number.of.cores.while.loading=6, carbon.badRecords.location=/opt/Carbon/Spark/badrecords, carbon.sort.size=500000, carbon.inmemory.record.size=120000, carbon.enableXXHash=true, carbon.ddl.base.hdfs.url=hdfs://master:9100/opt/data, carbon.major.compaction.size=1024, carbon.storelocation=hdfs://master:9100/opt/CarbonStore}
INFO 05-08 00:11:59,939 - main Query [LOAD DATA INPATH 'HDFS://MASTER:9100/OPT/CARBONDATA/SAMPLE.CSV' INTO TABLE TEST_TABLE OPTIONS('FILEHEADER'='ID,NAME,CITY,AGE')]
ERROR 05-08 00:12:06,268 - Exception in task 1.0 in stage 3.0 (TID 6)
java.lang.NoClassDefFoundError: org/scannotation/AnnotationDB
    at org.pentaho.di.core.plugins.BasePluginType.findAnnotatedClassFiles(BasePluginType.java:237)
    at org.pentaho.di.core.plugins.BasePluginType.registerPluginJars(BasePluginType.java:500)
    at org.pentaho.di.core.plugins.BasePluginType.searchPlugins(BasePluginType.java:115)
    at org.pentaho.di.core.plugins.PluginRegistry.init(PluginRegistry.java:420)
    at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:121)
    at org.pentaho.di.core.KettleEnvironment.init(KettleEnvironment.java:68)
    at org.carbondata.processing.graphgenerator.GraphGenerator.validateAndInitialiseKettelEngine(GraphGenerator.java:304)
    at org.carbondata.processing.graphgenerator.GraphGenerator.generateGraph(GraphGenerator.java:278)
    at org.carbondata.spark.load.CarbonLoaderUtil.generateGraph(CarbonLoaderUtil.java:118)
    at org.carbondata.spark.load.CarbonLoaderUtil.executeGraph(CarbonLoaderUtil.java:173)
    at org.carbondata.spark.rdd.CarbonDataLoadRDD$$anon$1.<init>(CarbonDataLoadRDD.scala:196)
    at org.carbondata.spark.rdd.CarbonDataLoadRDD.compute(CarbonDataLoadRDD.scala:155)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException: org.scannotation.AnnotationDB
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 20 more
WARN 05-08 00:12:05,419 - Lost task 0.0 in stage 3.0 (TID 5, slave1): org.carbondata.processing.etl.DataLoadingException: Internal Errors
    at org.carbondata.processing.csvload.DataGraphExecuter.execute(DataGraphExecuter.java:212)
    at org.carbondata.processing.csvload.DataGraphExecuter.executeGraph(DataGraphExecuter.java:144)
    at org.carbondata.spark.load.CarbonLoaderUtil.executeGraph(CarbonLoaderUtil.java:176)
    at org.carbondata.spark.rdd.CarbonDataLoadRDD$$anon$1.<init>(CarbonDataLoadRDD.scala:196)
    at org.carbondata.spark.rdd.CarbonDataLoadRDD.compute(CarbonDataLoadRDD.scala:155)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
========
ERROR 05-08 00:12:05,564 - [test_table: Graph - MDKeyGentest_table][partitionID:0] Problem while copying file from local store to carbon store
org.carbondata.processing.store.writer.exception.CarbonDataWriterException: Problem while copying file from local store to carbon store
    at org.carbondata.processing.store.writer.AbstractFactDataWriter.copyCarbonDataFileToCarbonStorePath(AbstractFactDataWriter.java:598)
    at org.carbondata.processing.store.writer.AbstractFactDataWriter.closeWriter(AbstractFactDataWriter.java:504)
    at org.carbondata.processing.store.CarbonFactDataHandlerColumnar.closeHandler(CarbonFactDataHandlerColumnar.java:860)
    at org.carbondata.processing.mdkeygen.MDKeyGenStep.processingComplete(MDKeyGenStep.java:240)
    at org.carbondata.processing.mdkeygen.MDKeyGenStep.processRow(MDKeyGenStep.java:229)
    at org.pentaho.di.trans.step.RunThread.run(RunThread.java:50)
    at java.lang.Thread.run(Thread.java:745)
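In case it helps narrow down the NoClassDefFoundError above, here is a minimal sketch for checking whether org.scannotation.AnnotationDB is visible to a JVM. classVisible is a hypothetical helper, not part of CarbonData or Kettle, and the executor-side variant shown in the comments assumes the `sc` SparkContext that the shell predefines. I am not reporting any output from it here.

```scala
// Hypothetical helper (not a CarbonData or Kettle API): returns true if the
// named class can be loaded by the current class loader, false otherwise.
def classVisible(name: String): Boolean =
  try {
    Class.forName(name)
    true
  } catch {
    // ClassNotFoundException: the class itself is missing from the classpath.
    // LinkageError (includes NoClassDefFoundError): a dependency failed to load.
    case _: ClassNotFoundException | _: LinkageError => false
  }

// Driver-side check for the class named in the stack trace.
val className = "org.scannotation.AnnotationDB"
println(s"$className visible on driver: ${classVisible(className)}")

// Executor-side check, assuming this is pasted into the shell where `sc`
// exists (left commented out so the snippet also runs without Spark):
// sc.parallelize(1 to 4, 4)
//   .map(_ => (java.net.InetAddress.getLocalHost.getHostName, classVisible(className)))
//   .distinct()
//   .collect()
//   .foreach(println)
```

If the class turns out to be missing on the executors only, that would suggest the Kettle libraries under carbon.kettle.home are not present on every node, but that is a guess on my side.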
Thanks a lot!