WARN 02-08 18:37:46,529 - Lost task 1.0 in stage 6.0 (TID 20, master):
org.carbondata.processing.graphgenerator.GraphGeneratorException: Error While Initializing the Kettel Engine
    at org.carbondata.processing.graphgenerator.GraphGenerator.validateAndInitialiseKettelEngine(GraphGenerator.java:309)
    at org.carbondata.processing.graphgenerator.GraphGenerator.generateGraph(GraphGenerator.java:278)
    at org.carbondata.spark.load.CarbonLoaderUtil.generateGraph(CarbonLoaderUtil.java:118)
    at org.carbondata.spark.load.CarbonLoaderUtil.executeGraph(CarbonLoaderUtil.java:173)
    at org.carbondata.spark.rdd.CarbonDataLoadRDD$$anon$1.<init>(CarbonDataLoadRDD.scala:196)
    at org.carbondata.spark.rdd.CarbonDataLoadRDD.compute(CarbonDataLoadRDD.scala:155)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.pentaho.di.core.exception.KettleException:
Unable to read file '/opt/incubator-carbondata/processing/carbonplugins/.kettle/kettle.properties'
/opt/incubator-carbondata/processing/carbonplugins/.kettle/kettle.properties (No such file or directory)
    at org.pentaho.di.core.util.EnvUtil.readProperties(EnvUtil.java:65)
    at org.pentaho.di.core.util.EnvUtil.environmentInit(EnvUtil.java:95)
    at org.carbondata.processing.graphgenerator.GraphGenerator.validateAndInitialiseKettelEngine(GraphGenerator.java:303)
    ... 13 more
Caused by: java.io.FileNotFoundException: /opt/incubator-carbondata/processing/carbonplugins/.kettle/kettle.properties (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:146)
    at java.io.FileInputStream.<init>(FileInputStream.java:101)
    at org.pentaho.di.core.util.EnvUtil.readProperties(EnvUtil.java:60)
    ... 15 more
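The FileNotFoundException above points at a missing Kettle configuration file rather than at the data itself: every node that executes a load task needs carbonplugins/.kettle/kettle.properties on its local disk, at the path shown in the trace. A minimal sketch of one way to create it, assuming the CarbonData tree sits at /opt/incubator-carbondata on every node and that the hosts are named master, slave1 and slave2 (both are assumptions; substitute your own paths and host list):

    # run once from any box that can ssh to all Spark nodes
    for host in master slave1 slave2; do
      ssh $host 'mkdir -p /opt/incubator-carbondata/processing/carbonplugins/.kettle &&
                 touch /opt/incubator-carbondata/processing/carbonplugins/.kettle/kettle.properties'
    done

An empty kettle.properties should be enough to get past EnvUtil.readProperties; the carbon.properties entry carbon.kettle.home should also point at the same carbonplugins directory (worth verifying against the CarbonData docs for your version).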
On 2016/8/1 11:04, 金铸 wrote:
[hadoop@master ~]$ hadoop fs -cat /opt/incubator-carbondata/sample.csv
16/08/01 18:11:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
id,name,city,age
1,david,shenzhen,31
2,eason,shenzhen,27
3,jarry,wuhan,35

[hadoop@master ~]$ spark-sql
spark-sql> load data inpath '../sample.csv' into table test_table;
INFO 01-08 18:19:08,914 - Job 7 finished: collect at CarbonDataRDDFactory.scala:731, took 0.088371 s
INFO 01-08 18:19:08,915 - ********starting clean up**********
INFO 01-08 18:19:08,915 - ********clean up done**********
AUDIT 01-08 18:19:08,915 - [master][hadoop][Thread-1]Data load is failed for default.test_table
INFO 01-08 18:19:08,915 - task runtime:(count: 2, mean: 58.000000, stdev: 20.000000, max: 78.000000, min: 38.000000)
INFO 01-08 18:19:08,915 - 0% 5% 10% 25% 50% 75% 90% 95% 100%
INFO 01-08 18:19:08,915 - 38.0 ms 38.0 ms 38.0 ms 38.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms
INFO 01-08 18:19:08,916 - task result size:(count: 2, mean: 948.000000, stdev: 0.000000, max: 948.000000, min: 948.000000)
INFO 01-08 18:19:08,916 - 0% 5% 10% 25% 50% 75% 90% 95% 100%
INFO 01-08 18:19:08,916 - 948.0 B 948.0 B 948.0 B 948.0 B 948.0 B 948.0 B 948.0 B 948.0 B 948.0 B
WARN 01-08 18:19:08,915 - Unable to write load metadata file
ERROR 01-08 18:19:08,917 - main java.lang.Exception: Dataload failure
    at org.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:791)
    at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:1161)
    at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
    at org.carbondata.spark.rdd.CarbonDataFrameRDD.<init>(CarbonDataFrameRDD.scala:23)
    at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:131)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:311)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
    at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver$.main(CarbonSQLCLIDriver.scala:40)
    at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver.main(CarbonSQLCLIDriver.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Thanks & Regards,
金铸

On 2016/7/31 11:32, Ravindra Pesala wrote:
Hi,
The exception says the input path '/opt/incubator-carbondata/sample.csv' does not exist, so please make sure of the following:
1. Whether the sample.csv file is present at the location '/opt/incubator-carbondata/'.
2. Whether you are running Spark in local mode or cluster mode. (If it is running in cluster mode, please keep the csv file in HDFS.)
3. Please try keeping the csv file in HDFS and loading the data from there (see the sketch after this list).

Thanks & Regards,
Ravindra
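A minimal sketch of point 3, using a hypothetical /tmp/carbondata HDFS directory and assuming the NameNode is reachable as hdfs://master:9000 (both are placeholders; check fs.defaultFS in your core-site.xml for the real value):

    [hadoop@master ~]$ hadoop fs -mkdir -p /tmp/carbondata
    [hadoop@master ~]$ hadoop fs -put /opt/incubator-carbondata/sample.csv /tmp/carbondata/
    [hadoop@master ~]$ hadoop fs -ls /tmp/carbondata

    spark-sql> load data inpath 'hdfs://master:9000/tmp/carbondata/sample.csv' into table test_table;

A fully qualified hdfs:// URI also removes the working-directory ambiguity of a relative path such as '../sample.csv'.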
On 31 July 2016 at 07:37, 金铸 <[email protected]> wrote:
Hi Liang:
Thanks for your reply. I have already used '/opt/incubator-carbondata/sample.csv' (the absolute path) and it reported the same error.

On 2016/7/30 22:44, Liang Big data wrote:
Hi jinzhu 金铸:
Please check the error below: the input path has some issue. Please use the absolute path and try it again.
-----------------------------------------------
ERROR 29-07 16:39:46,904 - main org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /opt/incubator-carbondata/sample.csv

Regards
Liang

2016-07-29 8:47 GMT+08:00 金铸 <[email protected]>:
[hadoop@slave2 ~]$ cat /opt/incubator-carbondata/sample.csv
id,name,city,age
1,david,shenzhen,31
2,eason,shenzhen,27
3,jarry,wuhan,35

[hadoop@slave2 ~]$
> load data inpath '../sample.csv' into table test_table;
INFO 29-07 16:39:46,087 - main Property file path: /opt/incubator-carbondata/bin/../../../conf/carbon.properties
INFO 29-07 16:39:46,087 - main ------Using Carbon.properties --------
INFO 29-07 16:39:46,087 - main {}
INFO 29-07 16:39:46,088 - main Query [LOAD DATA INPATH '../SAMPLE.CSV' INTO TABLE TEST_TABLE]
INFO 29-07 16:39:46,527 - Successfully able to get the table metadata file lock
INFO 29-07 16:39:46,537 - main Initiating Direct Load for the Table : (default.test_table)
INFO 29-07 16:39:46,541 - Generate global dictionary from source data files!
INFO 29-07 16:39:46,569 - [Block Distribution]
INFO 29-07 16:39:46,569 - totalInputSpaceConsumed : 74 , defaultParallelism : 6
INFO 29-07 16:39:46,569 - mapreduce.input.fileinputformat.split.maxsize : 16777216
INFO 29-07 16:39:46,689 - Block broadcast_0 stored as values in memory (estimated size 232.6 KB, free 232.6 KB)
INFO 29-07 16:39:46,849 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 19.7 KB, free 252.3 KB)
INFO 29-07 16:39:46,850 - Added broadcast_0_piece0 in memory on 192.168.241.223:41572 (size: 19.7 KB, free: 511.5 MB)
INFO 29-07 16:39:46,856 - Created broadcast 0 from NewHadoopRDD at CarbonTextFile.scala:45
ERROR 29-07 16:39:46,904 - generate global dictionary failed
ERROR 29-07 16:39:46,904 - main org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /opt/incubator-carbondata/sample.csv
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:321)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:385)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:120)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1307)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
    at org.apache.spark.rdd.RDD.take(RDD.scala:1302)
    at com.databricks.spark.csv.CarbonCsvRelation.firstLine$lzycompute(CarbonCsvRelation.scala:175)
    at com.databricks.spark.csv.CarbonCsvRelation.firstLine(CarbonCsvRelation.scala:170)
    at com.databricks.spark.csv.CarbonCsvRelation.inferSchema(CarbonCsvRelation.scala:141)
    at com.databricks.spark.csv.CarbonCsvRelation.<init>(CarbonCsvRelation.scala:71)
    at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:142)
    at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:44)
    at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
    at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
    at org.carbondata.spark.util.GlobalDictionaryUtil$.loadDataFrame(GlobalDictionaryUtil.scala:365)
    at org.carbondata.spark.util.GlobalDictionaryUtil$.generateGlobalDictionary(GlobalDictionaryUtil.scala:676)
    at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:1159)
    at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
    at org.carbondata.spark.rdd.CarbonDataFrameRDD.<init>(CarbonDataFrameRDD.scala:23)
    at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:131)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:311)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
    at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver$.main(CarbonSQLCLIDriver.scala:40)
    at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver.main(CarbonSQLCLIDriver.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

--
金铸
--
金铸
