Re: load data error

金铸 Sun, 31 Jul 2016 20:06:07 -0700

[hadoop@master ~]$ hadoop fs -cat /opt/incubator-carbondata/sample.csv

16/08/01 18:11:00 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

id,name,city,age
1,david,shenzhen,31
2,eason,shenzhen,27
3,jarry,wuhan,35
[hadoop@master ~]$


spark-sql> load data inpath '../sample.csv' into table test_table;

INFO  01-08 18:19:08,914 - Job 7 finished: collect at CarbonDataRDDFactory.scala:731, took 0.088371 s

INFO  01-08 18:19:08,915 - ********starting clean up**********
INFO  01-08 18:19:08,915 - ********clean up done**********

AUDIT 01-08 18:19:08,915 - [master][hadoop][Thread-1]Data load is failed for default.test_table
INFO  01-08 18:19:08,915 - task runtime:(count: 2, mean: 58.000000, stdev: 20.000000, max: 78.000000, min: 38.000000)
INFO  01-08 18:19:08,915 - 0% 5% 10% 25% 50% 75% 90% 95% 100%
INFO  01-08 18:19:08,915 - 38.0 ms 38.0 ms 38.0 ms 38.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms 78.0 ms
INFO  01-08 18:19:08,916 - task result size:(count: 2, mean: 948.000000, stdev: 0.000000, max: 948.000000, min: 948.000000)
INFO  01-08 18:19:08,916 - 0% 5% 10% 25% 50% 75% 90% 95% 100%
INFO  01-08 18:19:08,916 - 948.0 B 948.0 B 948.0 B 948.0 B 948.0 B 948.0 B 948.0 B 948.0 B 948.0 B

WARN  01-08 18:19:08,915 - Unable to write load metadata file
ERROR 01-08 18:19:08,917 - main
java.lang.Exception: Dataload failure

    at org.carbondata.spark.rdd.CarbonDataRDDFactory$.loadCarbonData(CarbonDataRDDFactory.scala:791)
    at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:1161)
    at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
    at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
    at org.carbondata.spark.rdd.CarbonDataFrameRDD.<init>(CarbonDataFrameRDD.scala:23)
    at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:131)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:311)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
    at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver$.main(CarbonSQLCLIDriver.scala:40)
    at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver.main(CarbonSQLCLIDriver.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)


Thanks & Regards,


金铸


On 2016/7/31 11:32, Ravindra Pesala wrote:

Hi,

The exception says that the input path '/opt/incubator-carbondata/sample.csv'
does not exist. Please check the following:
1. Whether the sample.csv file is present in the location
'/opt/incubator-carbondata/'.
2. Whether you are running Spark in local mode or in cluster mode. (If it is
running in cluster mode, please keep the CSV file in HDFS.)
3. Please try keeping the CSV file in HDFS and then loading the data.
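
As a rough sketch of point 3, the snippet below builds a fully qualified HDFS URI and the matching LOAD DATA statement, so that the path no longer depends on the local file system of any single node. The NameNode address `master:9000` and the helper names `hdfs_uri` and `load_statement` are assumptions made for illustration only; the real address would come from `fs.defaultFS` on your cluster.

```python
def hdfs_uri(namenode: str, path: str) -> str:
    """Return a fully qualified HDFS URI for an absolute path."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path, got: " + path)
    return "hdfs://" + namenode + path


def load_statement(uri: str, table: str) -> str:
    """Build the LOAD DATA statement to paste into spark-sql."""
    return "LOAD DATA INPATH '" + uri + "' INTO TABLE " + table + ";"


# "master:9000" is a hypothetical NameNode address, not taken from this thread.
uri = hdfs_uri("master:9000", "/opt/incubator-carbondata/sample.csv")
print(load_statement(uri, "test_table"))
```

The CSV file itself would first have to be copied into HDFS, for example with `hadoop fs -put`.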

Thanks & Regards,
Ravindra

On 31 July 2016 at 07:37, 金铸 <[email protected]> wrote:

Hi Liang:

        Thanks for your reply.

        I have already used "/opt/incubator-carbondata/sample.csv",
and it reported the same error.



On 2016/7/30 22:44, Liang Big data wrote:

Hi jinzhu金铸:


Please check the error below: the input path has a problem.
Please try again using the absolute path.
-----------------------------------------------
ERROR 29-07 16:39:46,904 - main
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path
does not exist: /opt/incubator-carbondata/sample.csv
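
To illustrate why the relative path matters: a path such as '../sample.csv' is resolved against a base directory before the file is looked up, which is how it can end up as the absolute path shown in the error. The sketch below only mimics that resolution. The working directory `/opt/incubator-carbondata/bin` is an assumption (the thread does not state where spark-sql was started), and `resolve_inpath` is a hypothetical helper, not a CarbonData function.

```python
import posixpath


def resolve_inpath(working_dir: str, inpath: str) -> str:
    """Resolve inpath against working_dir and collapse any '..' segments."""
    return posixpath.normpath(posixpath.join(working_dir, inpath))


# Assumed working directory; an absolute inpath would be returned unchanged.
print(resolve_inpath("/opt/incubator-carbondata/bin", "../sample.csv"))
```

Passing an absolute path removes the dependence on the working directory, but the file must still exist on the file system that Spark actually reads from.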

Regards
Liang

2016-07-29 8:47 GMT+08:00 金铸 <[email protected]>:

[hadoop@slave2 ~]$ cat /opt/incubator-carbondata/sample.csv

id,name,city,age
1,david,shenzhen,31
2,eason,shenzhen,27
3,jarry,wuhan,35
[hadoop@slave2 ~]$

     > load data inpath '../sample.csv' into table test_table;
INFO  29-07 16:39:46,087 - main Property file path: /opt/incubator-carbondata/bin/../../../conf/carbon.properties
INFO  29-07 16:39:46,087 - main ------Using Carbon.properties --------
INFO  29-07 16:39:46,087 - main {}
INFO  29-07 16:39:46,088 - main Query [LOAD DATA INPATH '../SAMPLE.CSV' INTO TABLE TEST_TABLE]
INFO  29-07 16:39:46,527 - Successfully able to get the table metadata file lock
INFO  29-07 16:39:46,537 - main Initiating Direct Load for the Table : (default.test_table)
INFO  29-07 16:39:46,541 - Generate global dictionary from source data files!
INFO  29-07 16:39:46,569 - [Block Distribution]
INFO  29-07 16:39:46,569 - totalInputSpaceConsumed : 74 , defaultParallelism : 6
INFO  29-07 16:39:46,569 - mapreduce.input.fileinputformat.split.maxsize : 16777216
INFO  29-07 16:39:46,689 - Block broadcast_0 stored as values in memory (estimated size 232.6 KB, free 232.6 KB)
INFO  29-07 16:39:46,849 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 19.7 KB, free 252.3 KB)
INFO  29-07 16:39:46,850 - Added broadcast_0_piece0 in memory on 192.168.241.223:41572 (size: 19.7 KB, free: 511.5 MB)
INFO  29-07 16:39:46,856 - Created broadcast 0 from NewHadoopRDD at CarbonTextFile.scala:45
ERROR 29-07 16:39:46,904 - generate global dictionary failed
ERROR 29-07 16:39:46,904 - main
org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: /opt/incubator-carbondata/sample.csv
      at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:321)
      at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:264)
      at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:385)
      at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:120)
      at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
      at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
      at scala.Option.getOrElse(Option.scala:120)
      at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
      at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
      at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
      at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
      at scala.Option.getOrElse(Option.scala:120)
      at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
      at org.apache.spark.rdd.RDD$$anonfun$take$1.apply(RDD.scala:1307)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
      at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
      at org.apache.spark.rdd.RDD.take(RDD.scala:1302)
      at com.databricks.spark.csv.CarbonCsvRelation.firstLine$lzycompute(CarbonCsvRelation.scala:175)
      at com.databricks.spark.csv.CarbonCsvRelation.firstLine(CarbonCsvRelation.scala:170)
      at com.databricks.spark.csv.CarbonCsvRelation.inferSchema(CarbonCsvRelation.scala:141)
      at com.databricks.spark.csv.CarbonCsvRelation.<init>(CarbonCsvRelation.scala:71)
      at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:142)
      at com.databricks.spark.csv.newapi.DefaultSource.createRelation(DefaultSource.scala:44)
      at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
      at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
      at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:109)
      at org.carbondata.spark.util.GlobalDictionaryUtil$.loadDataFrame(GlobalDictionaryUtil.scala:365)
      at org.carbondata.spark.util.GlobalDictionaryUtil$.generateGlobalDictionary(GlobalDictionaryUtil.scala:676)
      at org.apache.spark.sql.execution.command.LoadTable.run(carbonTableSchema.scala:1159)
      at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult$lzycompute(commands.scala:58)
      at org.apache.spark.sql.execution.ExecutedCommand.sideEffectResult(commands.scala:56)
      at org.apache.spark.sql.execution.ExecutedCommand.doExecute(commands.scala:70)
      at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
      at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
      at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
      at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
      at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
      at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
      at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:145)
      at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:130)
      at org.carbondata.spark.rdd.CarbonDataFrameRDD.<init>(CarbonDataFrameRDD.scala:23)
      at org.apache.spark.sql.CarbonContext.sql(CarbonContext.scala:131)
      at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:63)
      at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:311)
      at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:376)
      at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:226)
      at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver$.main(CarbonSQLCLIDriver.scala:40)
      at org.apache.spark.sql.hive.cli.CarbonSQLCLIDriver.main(CarbonSQLCLIDriver.scala)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      at java.lang.reflect.Method.invoke(Method.java:606)
      at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
      at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
      at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
      at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
      at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

--
金铸







---------------------------------------------------------------------------------------------------
Confidentiality Notice: The information contained in this e-mail and any
accompanying attachment(s)
is intended only for the use of the intended recipient and may be
confidential and/or privileged of
Neusoft Corporation, its subsidiaries and/or its affiliates. If any
reader
of this communication is
not the intended recipient, unauthorized use, forwarding, printing,
storing, disclosure or copying
is strictly prohibited, and may be unlawful.If you have received this
communication in error,please
immediately notify the sender by return e-mail, and delete the original
message and all copies from
your system. Thank you.


---------------------------------------------------------------------------------------------------



