Hi,

The log says /opt/incubator-carbondata/processing/carbonplugins does not
exist. Can you check that you have set the Kettle home path correctly?

Thanks,
Ravindra
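For reference, a minimal sketch of the setting Ravindra is referring to
(the path below is an assumption for this particular setup; carbon.kettle.home
is the key CarbonData reads from carbon.properties):

    # carbon.properties -- point CarbonData at the Kettle plugin directory
    carbon.kettle.home=/opt/incubator-carbondata/processing/carbonplugins

    # sanity check: the directory must exist at this path on each node
    ls -d /opt/incubator-carbondata/processing/carbonplugins

Since the failing task ran on an executor ("Lost task 1.0 ... (TID 20,
master)" below), the directory has to be present on the executor hosts as
well, not only on the machine where spark-sql was launched.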
On Tue, 2 Aug 2016 8:24 am 金铸, <[email protected]> wrote:

> WARN  02-08 18:37:46,529 - Lost task 1.0 in stage 6.0 (TID 20, master):
> org.carbondata.processing.graphgenerator.GraphGeneratorException:
> Error While Initializing the Kettel Engine
>     at org.carbondata.processing.graphgenerator.GraphGenerator.validateAndInitialiseKettelEngine(GraphGenerator.java:309)
>     at org.carbondata.processing.graphgenerator.GraphGenerator.generateGraph(GraphGenerator.java:278)
>     at org.carbondata.spark.load.CarbonLoaderUtil.generateGraph(CarbonLoaderUtil.java:118)
>     at org.carbondata.spark.load.CarbonLoaderUtil.executeGraph(CarbonLoaderUtil.java:173)
>     at org.carbondata.spark.rdd.CarbonDataLoadRDD$$anon$1.<init>(CarbonDataLoadRDD.scala:196)
>     at org.carbondata.spark.rdd.CarbonDataLoadRDD.compute(CarbonDataLoadRDD.scala:155)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
>     at org.apache.spark.scheduler.Task.run(Task.scala:89)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:227)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.pentaho.di.core.exception.KettleException:
> Unable to read file '/opt/incubator-carbondata/processing/carbonplugins/.kettle/kettle.properties'
> /opt/incubator-carbondata/processing/carbonplugins/.kettle/kettle.properties (No such file or directory)
>     at org.pentaho.di.core.util.EnvUtil.readProperties(EnvUtil.java:65)
>     at org.pentaho.di.core.util.EnvUtil.environmentInit(EnvUtil.java:95)
>     at org.carbondata.processing.graphgenerator.GraphGenerator.validateAndInitialiseKettelEngine(GraphGenerator.java:303)
>     ... 13 more
> Caused by: java.io.FileNotFoundException: /opt/incubator-carbondata/processing/carbonplugins/.kettle/kettle.properties (No such file or directory)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:146)
>     at java.io.FileInputStream.<init>(FileInputStream.java:101)
>     at org.pentaho.di.core.util.EnvUtil.readProperties(EnvUtil.java:60)
>     ... 15 more
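The FileNotFoundException above is the actual failure: Kettle's environment
init insists on reading <kettle home>/.kettle/kettle.properties. Besides
fixing the Kettle home path itself, one workaround (an assumption on my
part, not an official step) is to create the file Kettle is looking for, on
every node that runs load tasks:

    # hypothetical workaround: give Kettle the .kettle directory and a
    # (possibly empty) kettle.properties where it expects them
    mkdir -p /opt/incubator-carbondata/processing/carbonplugins/.kettle
    touch /opt/incubator-carbondata/processing/carbonplugins/.kettle/kettle.properties

An empty properties file should be enough for EnvUtil.readProperties to
proceed; the proper fix is still pointing carbon.kettle.home at a directory
that actually contains the .kettle folder.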
> On 2016/8/1 11:04, 金铸 wrote:
> > [hadoop@master ~]$ hadoop fs -cat /opt/incubator-carbondata/sample.csv
> > 16/08/01 18:11:00 WARN util.NativeCodeLoader: Unable to load
> > native-hadoop library for your platform... using builtin-java classes
> > where applicable
> > id,name,city,age
> > 1,david,shenzhen,31
> > 2,eason,shenzhen,27
> > 3,jarry,wuhan,35
> > [hadoop@master ~]$
> >
> > spark-sql> load data inpath '../sample.csv' into table test_table;
> >
> > INFO  01-08 18:19:08,914 - Job 7 finished: collect at CarbonDataRDDFactory.scala:731, took 0.088371 s
> > INFO  01-08 18:19:08,915 - ********starting clean up**********
> > INFO  01-08 18:19:08,915 - ********clean up done**********
> > AUDIT 01-08 18:19:08,915 - [master][hadoop][Thread-1]Data load is failed for default.test_table
> > INFO  01-08 18:19:08,915 - task runtime:(count: 2, mean: 58.000000, stdev: 20.000000, max: 78.000000, min: 38.000000)
> > INFO  01-08 18:19:08,915 - 0%       5%       10%      25%      50%      75%      90%      95%      100%
> > INFO  01-08 18:19:08,915 - 38.0 ms  38.0 ms  38.0 ms  38.0 ms  78.0 ms  78.0 ms  78.0 ms  78.0 ms  78.0 ms
> > INFO  01-08 18:19:08,916 - task result size:(count: 2, mean: 948.000000, stdev: 0.000000, max: 948.000000, min: 948.000000)
> > INFO  01-08 18:19:08,916 - 0%       5%       10%      25%      50%      75%      90%      95%      100%
> > INFO  01-08 18:19:08,916 - 948.0 B  948.0 B  948.0 B  948.0 B  948.0 B  948.0 B  948.0 B  948.0 B  948.0 B
> > WARN  01-08 18:19:08,915 - Unable to write load metadata file
> > ERROR 01-08 18:19:08,917 - main
> > java.lang.Exception: Dataload failure
> >     at org.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:791)
> >     at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:1161)
> >     at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
> >     at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
> >     at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
> >     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
> >     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
> >     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
> >     at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
> >     at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
> >     at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
> >     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
> >     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
> >     at org.carbondata.spark.rdd.CarbonDataFrameRDD.<init>(CarbonDataFrameRDD.scala:23)
> >     at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:131)
> >     at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
> >     at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:311)
> >     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> >     at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
> >     at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver$.main(CarbonSQLCLIDriver.scala:40)
> >     at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver.main(CarbonSQLCLIDriver.scala)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >     at java.lang.reflect.Method.invoke(Method.java:606)
> >     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> >     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> >     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> >     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> >     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> >
> > Thanks & Regards,
> >
> > 金铸
> > On 2016/7/31 11:32, Ravindra Pesala wrote:
> >> Hi,
> >>
> >> The exception says the input path '/opt/incubator-carbondata/sample.csv'
> >> does not exist, so please check the following:
> >> 1. Whether the sample.csv file is present in the location
> >>    '/opt/incubator-carbondata/'.
> >> 2. Whether you are running Spark in local mode or cluster mode. (If it
> >>    is running in cluster mode, please keep the csv file in HDFS.)
> >> 3. Please try to keep the csv file in HDFS and load the data.
> >>
> >> Thanks & Regards,
> >> Ravindra
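Putting the file in HDFS, as points 2 and 3 above suggest, would look
roughly like this (a sketch, not verified against this cluster: the target
directory /user/hadoop and the NameNode URI hdfs://master:9000 are
assumptions; an unqualified absolute path also works, since it resolves
against the default filesystem):

    # run as the hadoop user: copy the csv from local disk into HDFS
    hadoop fs -mkdir -p /user/hadoop
    hadoop fs -put /opt/incubator-carbondata/sample.csv /user/hadoop/sample.csv
    hadoop fs -ls /user/hadoop/sample.csv    # confirm it landed

    # then, from the spark-sql shell, load it with an absolute HDFS path
    spark-sql> load data inpath 'hdfs://master:9000/user/hadoop/sample.csv' into table test_table;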
> >> On 31 July 2016 at 07:37, 金铸 <[email protected]> wrote:
> >>
> >>> Hi Liang:
> >>>
> >>> Thanks for your reply.
> >>>
> >>> I have already tried the absolute path
> >>> "/opt/incubator-carbondata/sample.csv" and it reported the same error.
> >>>
> >>> On 2016/7/30 22:44, Liang Big data wrote:
> >>>
> >>>> Hi jinzhu 金铸:
> >>>>
> >>>> Please check the error below: the input path has some issue.
> >>>> Please use the absolute path and try it again.
> >>>> -----------------------------------------------
> >>>> ERROR 29-07 16:39:46,904 - main
> >>>> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
> >>>> does not exist: /opt/incubator-carbondata/sample.csv
> >>>>
> >>>> Regards
> >>>> Liang
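One detail worth spelling out (my reading of the error Liang quotes, not
something stated in the thread): FileInputFormat resolves an unqualified
path against the cluster's default filesystem, so a file that is visible to
a local `cat` can still be "missing" as far as the load is concerned. A
quick way to check which filesystem actually has the file:

    ls /opt/incubator-carbondata/sample.csv             # local disk on this node
    hadoop fs -ls /opt/incubator-carbondata/sample.csv  # the same path string in HDFS

If the first command succeeds and the second fails, a load in cluster mode
will keep reporting "Input path does not exist".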
> >>>> 2016-07-29 8:47 GMT+08:00 金铸 <[email protected]>:
> >>>>
> >>>>> [hadoop@slave2 ~]$ cat /opt/incubator-carbondata/sample.csv
> >>>>> id,name,city,age
> >>>>> 1,david,shenzhen,31
> >>>>> 2,eason,shenzhen,27
> >>>>> 3,jarry,wuhan,35
> >>>>> [hadoop@slave2 ~]$
> >>>>>
> >>>>> > load data inpath '../sample.csv' into table test_table;
> >>>>> INFO  29-07 16:39:46,087 - main Property file path: /opt/incubator-carbondata/bin/../../../conf/carbon.properties
> >>>>> INFO  29-07 16:39:46,087 - main ------Using Carbon.properties --------
> >>>>> INFO  29-07 16:39:46,087 - main {}
> >>>>> INFO  29-07 16:39:46,088 - main Query [LOAD DATA INPATH '../SAMPLE.CSV' INTO TABLE TEST_TABLE]
> >>>>> INFO  29-07 16:39:46,527 - Successfully able to get the table metadata file lock
> >>>>> INFO  29-07 16:39:46,537 - main Initiating Direct Load for the Table : (default.test_table)
> >>>>> INFO  29-07 16:39:46,541 - Generate global dictionary from source data files!
> >>>>> INFO  29-07 16:39:46,569 - [Block Distribution]
> >>>>> INFO  29-07 16:39:46,569 - totalInputSpaceConsumed : 74 , defaultParallelism : 6
> >>>>> INFO  29-07 16:39:46,569 - mapreduce.input.fileinputformat.split.maxsize : 16777216
> >>>>> INFO  29-07 16:39:46,689 - Block broadcast_0 stored as values in memory (estimated size 232.6 KB, free 232.6 KB)
> >>>>> INFO  29-07 16:39:46,849 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 19.7 KB, free 252.3 KB)
> >>>>> INFO  29-07 16:39:46,850 - Added broadcast_0_piece0 in memory on 192.168.241.223:41572 (size: 19.7 KB, free: 511.5 MB)
> >>>>> INFO  29-07 16:39:46,856 - Created broadcast 0 from NewHadoopRDD at CarbonTextFile.scala:45
> >>>>> ERROR 29-07 16:39:46,904 - generate global dictionary failed
> >>>>> ERROR 29-07 16:39:46,904 - main
> >>>>> org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /opt/incubator-carbondata/sample.csv
> >>>>>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:321)
> >>>>>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
> >>>>>     at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:385)
> >>>>>     at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:120)
> >>>>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
> >>>>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
> >>>>>     at scala.Option.getOrElse(Option.scala:120)
> >>>>>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
> >>>>>     at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
> >>>>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
> >>>>>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
> >>>>>     at scala.Option.getOrElse(Option.scala:120)
> >>>>>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
> >>>>>     at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1307)
> >>>>>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
> >>>>>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
> >>>>>     at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
> >>>>>     at org.apache.spark.rdd.RDD.take(RDD.scala:1302)
> >>>>>     at com.databricks.spark.csv.CarbonCsvRelation.firstLine$lzycompute(CarbonCsvRelation.scala:175)
> >>>>>     at com.databricks.spark.csv.CarbonCsvRelation.firstLine(CarbonCsvRelation.scala:170)
> >>>>>     at com.databricks.spark.csv.CarbonCsvRelation.inferSchema(CarbonCsvRelation.scala:141)
> >>>>>     at com.databricks.spark.csv.CarbonCsvRelation.<init>(CarbonCsvRelation.scala:71)
> >>>>>     at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:142)
> >>>>>     at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:44)
> >>>>>     at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
> >>>>>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
> >>>>>     at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
> >>>>>     at org.carbondata.spark.util.GlobalDictionaryUtil$.loadDataFrame(GlobalDictionaryUtil.scala:365)
> >>>>>     at org.carbondata.spark.util.GlobalDictionaryUtil$.generateGlobalDictionary(GlobalDictionaryUtil.scala:676)
> >>>>>     at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:1159)
> >>>>>     at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
> >>>>>     at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
> >>>>>     at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
> >>>>>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
> >>>>>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
> >>>>>     at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
> >>>>>     at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
> >>>>>     at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
> >>>>>     at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
> >>>>>     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
> >>>>>     at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
> >>>>>     at org.carbondata.spark.rdd.CarbonDataFrameRDD.<init>(CarbonDataFrameRDD.scala:23)
> >>>>>     at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:131)
> >>>>>     at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
> >>>>>     at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:311)
> >>>>>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
> >>>>>     at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
> >>>>>     at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver$.main(CarbonSQLCLIDriver.scala:40)
> >>>>>     at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver.main(CarbonSQLCLIDriver.scala)
> >>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> >>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> >>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> >>>>>     at java.lang.reflect.Method.invoke(Method.java:606)
> >>>>>     at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
> >>>>>     at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
> >>>>>     at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
> >>>>>     at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
> >>>>>     at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> >>>>>
> >>>>> --
> >>>>> 金铸
>
> --
> 金铸
> Technology Development Department (TDD)
> Neusoft Corporation
> Neusoft Park A2-105A, No. 2 Xinxiu Street, Hunnan New District, Shenyang
> Postcode: 110179
> Tel: (86 24) 8366 2049
> Mobile: 13897999526
