I loaded the data following the Quick Start, but the select query from the Quick Start returns no rows. Below is the log from the load and from the query:
scala> cc.sql(s"load data inpath 'hdfs://localhost:9000/test/sample.csv' into
table table1")
INFO 30-06 22:15:13,860 - main Query [LOAD DATA INPATH
'HDFS://LOCALHOST:9000/TEST/SAMPLE.CSV' INTO TABLE TABLE1]
INFO 30-06 22:15:13,867 - Successfully able to get the table metadata file lock
INFO 30-06 22:15:13,876 - main Initiating Direct Load for the Table :
(default.table1)
INFO 30-06 22:15:13,913 - [Block Distribution]
INFO 30-06 22:15:13,924 - totalInputSpaceConsumed : 74 , defaultParallelism : 1
INFO 30-06 22:15:13,924 - mapreduce.input.fileinputformat.split.maxsize :
16777216
INFO 30-06 22:15:13,928 - Block broadcast_31 stored as values in memory
(estimated size 211.1 KB, free 480.3 KB)
INFO 30-06 22:15:13,948 - Block broadcast_31_piece0 stored as bytes in memory
(estimated size 19.6 KB, free 499.9 KB)
INFO 30-06 22:15:13,948 - Added broadcast_31_piece0 in memory on
localhost:52028 (size: 19.6 KB, free: 511.1 MB)
INFO 30-06 22:15:13,950 - Created broadcast 31 from NewHadoopRDD at
CarbonTextFile.scala:45
INFO 30-06 22:15:13,963 - Total input paths to process : 1
INFO 30-06 22:15:13,974 - Starting job: take at CarbonCsvRelation.scala:175
INFO 30-06 22:15:13,979 - Got job 16 (take at CarbonCsvRelation.scala:175)
with 1 output partitions
INFO 30-06 22:15:13,980 - Final stage: ResultStage 21 (take at
CarbonCsvRelation.scala:175)
INFO 30-06 22:15:13,980 - Parents of final stage: List()
INFO 30-06 22:15:13,980 - Missing parents: List()
INFO 30-06 22:15:13,980 - Submitting ResultStage 21 (MapPartitionsRDD[91] at
map at CarbonTextFile.scala:55), which has no missing parents
INFO 30-06 22:15:13,982 - Block broadcast_32 stored as values in memory
(estimated size 2.6 KB, free 502.5 KB)
INFO 30-06 22:15:13,985 - Block broadcast_32_piece0 stored as bytes in memory
(estimated size 1608.0 B, free 504.0 KB)
INFO 30-06 22:15:13,986 - Added broadcast_32_piece0 in memory on
localhost:52028 (size: 1608.0 B, free: 511.1 MB)
INFO 30-06 22:15:13,987 - Created broadcast 32 from broadcast at
DAGScheduler.scala:1006
INFO 30-06 22:15:13,987 - Submitting 1 missing tasks from ResultStage 21
(MapPartitionsRDD[91] at map at CarbonTextFile.scala:55)
INFO 30-06 22:15:13,987 - Adding task set 21.0 with 1 tasks
INFO 30-06 22:15:13,988 - Starting task 0.0 in stage 21.0 (TID 224, localhost,
partition 0,ANY, 2342 bytes)
INFO 30-06 22:15:13,988 - Running task 0.0 in stage 21.0 (TID 224)
INFO 30-06 22:15:13,989 - Input split:
hdfs://localhost:9000/test/sample.csv:0+74
INFO 30-06 22:15:13,996 - Finished task 0.0 in stage 21.0 (TID 224). 2126
bytes result sent to driver
INFO 30-06 22:15:13,997 - Finished task 0.0 in stage 21.0 (TID 224) in 9 ms on
localhost (1/1)
INFO 30-06 22:15:13,997 - ResultStage 21 (take at CarbonCsvRelation.scala:175)
finished in 0.009 s
INFO 30-06 22:15:13,997 - Removed TaskSet 21.0, whose tasks have all
completed, from pool
INFO 30-06 22:15:13,998 - Job 16 finished: take at
CarbonCsvRelation.scala:175, took 0.020150 s
INFO 30-06 22:15:14,028 - [Block Distribution]
INFO 30-06 22:15:14,028 - totalInputSpaceConsumed : 74 , defaultParallelism : 1
INFO 30-06 22:15:14,028 - mapreduce.input.fileinputformat.split.maxsize :
16777216
INFO 30-06 22:15:14,035 - Block broadcast_33 stored as values in memory
(estimated size 211.1 KB, free 715.2 KB)
INFO 30-06 22:15:14,059 - Block broadcast_33_piece0 stored as bytes in memory
(estimated size 19.6 KB, free 734.8 KB)
INFO 30-06 22:15:14,060 - Added broadcast_33_piece0 in memory on
localhost:52028 (size: 19.6 KB, free: 511.1 MB)
INFO 30-06 22:15:14,060 - Created broadcast 33 from NewHadoopRDD at
CarbonTextFile.scala:45
INFO 30-06 22:15:14,086 - Starting job: collect at
GlobalDictionaryUtil.scala:512
INFO 30-06 22:15:14,088 - Total input paths to process : 1
INFO 30-06 22:15:14,091 - Registering RDD 100 (RDD at
CarbonGlobalDictionaryRDD.scala:154)
INFO 30-06 22:15:14,091 - Got job 17 (collect at
GlobalDictionaryUtil.scala:512) with 3 output partitions
INFO 30-06 22:15:14,091 - Final stage: ResultStage 23 (collect at
GlobalDictionaryUtil.scala:512)
INFO 30-06 22:15:14,091 - Parents of final stage: List(ShuffleMapStage 22)
INFO 30-06 22:15:14,091 - Missing parents: List(ShuffleMapStage 22)
INFO 30-06 22:15:14,091 - Submitting ShuffleMapStage 22
(CarbonBlockDistinctValuesCombineRDD[100] at RDD at
CarbonGlobalDictionaryRDD.scala:154), which has no missing parents
INFO 30-06 22:15:14,093 - Block broadcast_34 stored as values in memory
(estimated size 11.8 KB, free 746.6 KB)
INFO 30-06 22:15:14,094 - Block broadcast_34_piece0 stored as bytes in memory
(estimated size 6.2 KB, free 752.8 KB)
INFO 30-06 22:15:14,095 - Added broadcast_34_piece0 in memory on
localhost:52028 (size: 6.2 KB, free: 511.0 MB)
INFO 30-06 22:15:14,095 - Created broadcast 34 from broadcast at
DAGScheduler.scala:1006
INFO 30-06 22:15:14,095 - Submitting 1 missing tasks from ShuffleMapStage 22
(CarbonBlockDistinctValuesCombineRDD[100] at RDD at
CarbonGlobalDictionaryRDD.scala:154)
INFO 30-06 22:15:14,095 - Adding task set 22.0 with 1 tasks
INFO 30-06 22:15:14,096 - Starting task 0.0 in stage 22.0 (TID 225, localhost,
partition 0,ANY, 2331 bytes)
INFO 30-06 22:15:14,096 - Running task 0.0 in stage 22.0 (TID 225)
INFO 30-06 22:15:14,098 - Input split:
hdfs://localhost:9000/test/sample.csv:0+74
INFO 30-06 22:15:14,115 - Finished task 0.0 in stage 22.0 (TID 225). 2400
bytes result sent to driver
INFO 30-06 22:15:14,116 - Finished task 0.0 in stage 22.0 (TID 225) in 21 ms
on localhost (1/1)
INFO 30-06 22:15:14,116 - ShuffleMapStage 22 (RDD at
CarbonGlobalDictionaryRDD.scala:154) finished in 0.021 s
INFO 30-06 22:15:14,116 - Removed TaskSet 22.0, whose tasks have all
completed, from pool
INFO 30-06 22:15:14,116 - looking for newly runnable stages
INFO 30-06 22:15:14,116 - running: Set()
INFO 30-06 22:15:14,116 - waiting: Set(ResultStage 23)
INFO 30-06 22:15:14,116 - failed: Set()
INFO 30-06 22:15:14,116 - Submitting ResultStage 23
(CarbonGlobalDictionaryGenerateRDD[102] at RDD at
CarbonGlobalDictionaryRDD.scala:215), which has no missing parents
INFO 30-06 22:15:14,117 - Block broadcast_35 stored as values in memory
(estimated size 5.2 KB, free 758.0 KB)
INFO 30-06 22:15:14,118 - Block broadcast_35_piece0 stored as bytes in memory
(estimated size 3.0 KB, free 761.0 KB)
INFO 30-06 22:15:14,119 - Added broadcast_35_piece0 in memory on
localhost:52028 (size: 3.0 KB, free: 511.0 MB)
INFO 30-06 22:15:14,119 - Created broadcast 35 from broadcast at
DAGScheduler.scala:1006
INFO 30-06 22:15:14,119 - Submitting 3 missing tasks from ResultStage 23
(CarbonGlobalDictionaryGenerateRDD[102] at RDD at
CarbonGlobalDictionaryRDD.scala:215)
INFO 30-06 22:15:14,119 - Adding task set 23.0 with 3 tasks
INFO 30-06 22:15:14,120 - Starting task 0.0 in stage 23.0 (TID 226, localhost,
partition 0,NODE_LOCAL, 2065 bytes)
INFO 30-06 22:15:14,120 - Running task 0.0 in stage 23.0 (TID 226)
INFO 30-06 22:15:14,121 - Getting 1 non-empty blocks out of 1 blocks
INFO 30-06 22:15:14,121 - Started 0 remote fetches in 0 ms
INFO 30-06 22:15:14,122 - Finished task 0.0 in stage 23.0 (TID 226). 1369
bytes result sent to driver
INFO 30-06 22:15:14,122 - Starting task 1.0 in stage 23.0 (TID 227, localhost,
partition 1,NODE_LOCAL, 2065 bytes)
INFO 30-06 22:15:14,123 - Running task 1.0 in stage 23.0 (TID 227)
INFO 30-06 22:15:14,123 - Finished task 0.0 in stage 23.0 (TID 226) in 3 ms on
localhost (1/3)
INFO 30-06 22:15:14,124 - Getting 1 non-empty blocks out of 1 blocks
INFO 30-06 22:15:14,124 - Started 0 remote fetches in 0 ms
INFO 30-06 22:15:14,125 - Finished task 1.0 in stage 23.0 (TID 227). 1369
bytes result sent to driver
INFO 30-06 22:15:14,125 - Starting task 2.0 in stage 23.0 (TID 228, localhost,
partition 2,NODE_LOCAL, 2065 bytes)
INFO 30-06 22:15:14,125 - Running task 2.0 in stage 23.0 (TID 228)
INFO 30-06 22:15:14,125 - Finished task 1.0 in stage 23.0 (TID 227) in 3 ms on
localhost (2/3)
INFO 30-06 22:15:14,127 - Getting 1 non-empty blocks out of 1 blocks
INFO 30-06 22:15:14,127 - Started 0 remote fetches in 0 ms
INFO 30-06 22:15:14,127 - Finished task 2.0 in stage 23.0 (TID 228). 1369
bytes result sent to driver
INFO 30-06 22:15:14,128 - Finished task 2.0 in stage 23.0 (TID 228) in 3 ms on
localhost (3/3)
INFO 30-06 22:15:14,128 - ResultStage 23 (collect at
GlobalDictionaryUtil.scala:512) finished in 0.009 s
INFO 30-06 22:15:14,128 - Removed TaskSet 23.0, whose tasks have all
completed, from pool
INFO 30-06 22:15:14,128 - Job 17 finished: collect at
GlobalDictionaryUtil.scala:512, took 0.041827 s
INFO 30-06 22:15:14,129 - generate global dictionary successfully
AUDIT 30-06 22:15:14,135 - [MacBook-Pro.local]The data load request has been
received.
INFO 30-06 22:15:14,139 - main compaction need status is false
INFO 30-06 22:15:14,172 - [Block Distribution]
INFO 30-06 22:15:14,172 - totalInputSpaceConsumed : 74 , defaultParallelism : 1
INFO 30-06 22:15:14,172 - mapreduce.input.fileinputformat.split.maxsize :
16777216
INFO 30-06 22:15:14,174 - Total input paths to process : 1
INFO 30-06 22:15:14,177 - Total no of blocks : 1, No.of Nodes : 1
INFO 30-06 22:15:14,177 - #Node: 192.168.1.105 no.of.blocks: 1
INFO 30-06 22:15:14,185 - Starting job: collect at
CarbonDataRDDFactory.scala:646
INFO 30-06 22:15:14,186 - Got job 18 (collect at
CarbonDataRDDFactory.scala:646) with 1 output partitions
INFO 30-06 22:15:14,186 - Final stage: ResultStage 24 (collect at
CarbonDataRDDFactory.scala:646)
INFO 30-06 22:15:14,186 - Parents of final stage: List()
INFO 30-06 22:15:14,186 - Missing parents: List()
INFO 30-06 22:15:14,186 - Submitting ResultStage 24 (CarbonDataLoadRDD[103] at
RDD at CarbonDataLoadRDD.scala:94), which has no missing parents
INFO 30-06 22:15:14,186 - Prefered Location for split : 192.168.1.105
INFO 30-06 22:15:14,187 - Block broadcast_36 stored as values in memory
(estimated size 8.2 KB, free 769.2 KB)
INFO 30-06 22:15:14,188 - Block broadcast_36_piece0 stored as bytes in memory
(estimated size 4.0 KB, free 773.3 KB)
INFO 30-06 22:15:14,188 - Added broadcast_36_piece0 in memory on
localhost:52028 (size: 4.0 KB, free: 511.0 MB)
INFO 30-06 22:15:14,189 - Created broadcast 36 from broadcast at
DAGScheduler.scala:1006
INFO 30-06 22:15:14,189 - Submitting 1 missing tasks from ResultStage 24
(CarbonDataLoadRDD[103] at RDD at CarbonDataLoadRDD.scala:94)
INFO 30-06 22:15:14,189 - Adding task set 24.0 with 1 tasks
INFO 30-06 22:15:14,190 - Starting task 0.0 in stage 24.0 (TID 229, localhost,
partition 0,ANY, 2430 bytes)
INFO 30-06 22:15:14,190 - Running task 0.0 in stage 24.0 (TID 229)
INFO 30-06 22:15:14,191 - Input split: 192.168.1.105
INFO 30-06 22:15:14,191 - The Block Count in this node :1
INFO 30-06 22:15:14,192 - [Executor task launch
worker-3][partitionID:default_table1_6ed3887d-5633-4a60-8eb5-7fbbde9fb5b8]
************* Is Columnar Storagetrue
INFO 30-06 22:15:14,212 - [Executor task launch
worker-3][partitionID:default_table1_6ed3887d-5633-4a60-8eb5-7fbbde9fb5b8]
Kettle environment initialized
INFO 30-06 22:15:14,233 - [Executor task launch
worker-3][partitionID:default_table1_6ed3887d-5633-4a60-8eb5-7fbbde9fb5b8] **
Using csv file **
INFO 30-06 22:15:14,242 - [Executor task launch
worker-3][partitionID:default_table1_6ed3887d-5633-4a60-8eb5-7fbbde9fb5b8]
Graph execution is started
/var/folders/pj/8xxn8_tx6g32wfl93yz1m9940000gn/T//188149868655074/0/etl/default/table1/3/0/table1.ktr
INFO 30-06 22:15:14,242 - table1: Graph - CSV Input *****************Started
ALL ALL csv reading***********
INFO 30-06 22:15:14,243 -
[pool-61-thread-1][partitionID:PROCESS_BLOCKS;queryID:pool-61-thread-1]
*****************started csv reading by thread***********
INFO 30-06 22:15:14,248 -
[pool-61-thread-1][partitionID:PROCESS_BLOCKS;queryID:pool-61-thread-1]
*****************Completed csv reading by thread***********
INFO 30-06 22:15:14,251 - [table1: Graph - Sort Key: Sort
keystable1][partitionID:0] Sort size for cube: 100000
INFO 30-06 22:15:14,251 - [table1: Graph - Sort Key: Sort
keystable1][partitionID:0] Number of intermediate file to be merged: 10
INFO 30-06 22:15:14,251 - [table1: Graph - Sort Key: Sort
keystable1][partitionID:0] File Buffer Size: 1048576
INFO 30-06 22:15:14,251 - [table1: Graph - Sort Key: Sort
keystable1][partitionID:0] temp file
location/var/folders/pj/8xxn8_tx6g32wfl93yz1m9940000gn/T//188149868655074/0/default/table1/Fact/Part0/Segment_3/0/sortrowtmp
INFO 30-06 22:15:14,449 - table1: Graph - CSV Input *****************Completed
ALL ALL csv reading***********
INFO 30-06 22:15:14,555 - [table1: Graph - Carbon Surrogate Key
Generator][partitionID:0] Level cardinality file written to :
/var/folders/pj/8xxn8_tx6g32wfl93yz1m9940000gn/T//188149868655074/0/default/table1/Fact/Part0/Segment_3/0/levelmetadata_table1.metadata
INFO 30-06 22:15:14,555 - [table1: Graph - Carbon Surrogate Key
Generator][partitionID:0] Record Procerssed For table: table1
INFO 30-06 22:15:14,555 - [table1: Graph - Carbon Surrogate Key
Generator][partitionID:0] Summary: Carbon CSV Based Seq Gen Step : 3: Write: 3
INFO 30-06 22:15:14,557 - [table1: Graph - Sort Key: Sort
keystable1][partitionID:0] File based sorting will be used
INFO 30-06 22:15:14,558 - [table1: Graph - Sort Key: Sort
keystable1][partitionID:0] Record Processed For table: table1
INFO 30-06 22:15:14,558 - [table1: Graph - Sort Key: Sort
keystable1][partitionID:0] Summary: Carbon Sort Key Step: Read: 3: Write: 3
INFO 30-06 22:15:14,561 - [table1: Graph - MDKeyGentable1][partitionID:0]
Initializing writer executors
INFO 30-06 22:15:14,561 - [table1: Graph - MDKeyGentable1][partitionID:0]
Blocklet Size: 120000
INFO 30-06 22:15:14,562 - [table1: Graph - MDKeyGentable1][partitionID:0]
Total file size: 1073741824 and dataBlock Size: 966367642
INFO 30-06 22:15:14,562 - [table1: Graph - MDKeyGentable1][partitionID:0]
Number of temp file: 1
INFO 30-06 22:15:14,562 - [table1: Graph - MDKeyGentable1][partitionID:0] File
Buffer Size: 10485760
INFO 30-06 22:15:14,562 - [table1: Graph - MDKeyGentable1][partitionID:0]
Started adding first record from each file
INFO 30-06 22:15:14,564 - [table1: Graph - MDKeyGentable1][partitionID:0] Heap
Size1
INFO 30-06 22:15:14,565 - pool-66-thread-1 Number Of records processed: 3
INFO 30-06 22:15:14,566 - [table1: Graph - MDKeyGentable1][partitionID:0]
Record Procerssed For table: table1
INFO 30-06 22:15:14,566 - [table1: Graph - MDKeyGentable1][partitionID:0]
Finished Carbon Mdkey Generation Step: Read: 3: Write: 3
INFO 30-06 22:15:14,622 - [table1: Graph - MDKeyGentable1][partitionID:0] All
blocklets have been finished writing
INFO 30-06 22:15:14,622 - [table1: Graph - MDKeyGentable1][partitionID:0]
Copying
/var/folders/pj/8xxn8_tx6g32wfl93yz1m9940000gn/T//188149868655074/0/default/table1/Fact/Part0/Segment_3/0/part-0-0-1467296114000.carbondata
--> ./carbondata/store/default/table1/Fact/Part0/Segment_3
INFO 30-06 22:15:14,622 - [table1: Graph - MDKeyGentable1][partitionID:0]
Total copy time (ms) to copy file
/var/folders/pj/8xxn8_tx6g32wfl93yz1m9940000gn/T//188149868655074/0/default/table1/Fact/Part0/Segment_3/0/part-0-0-1467296114000.carbondata
is 0
INFO 30-06 22:15:14,624 - [table1: Graph - Carbon Slice
Mergertable1][partitionID:table1] Record Procerssed For table: table1
INFO 30-06 22:15:14,624 - [table1: Graph - Carbon Slice
Mergertable1][partitionID:table1] Summary: Carbon Slice Merger Step: Read: 1:
Write: 0
INFO 30-06 22:15:14,624 - [Executor task launch
worker-3][partitionID:default_table1_6ed3887d-5633-4a60-8eb5-7fbbde9fb5b8]
Graph execution is finished.
INFO 30-06 22:15:14,624 - [Executor task launch
worker-3][partitionID:default_table1_6ed3887d-5633-4a60-8eb5-7fbbde9fb5b8]
Graph execution task is over with No error.
INFO 30-06 22:15:14,625 - DataLoad complete
INFO 30-06 22:15:14,625 - Data Loaded successfully with LoadCount:3
INFO 30-06 22:15:14,625 - Finished task 0.0 in stage 24.0 (TID 229). 1267
bytes result sent to driver
INFO 30-06 22:15:14,626 - Finished task 0.0 in stage 24.0 (TID 229) in 437 ms
on localhost (1/1)
INFO 30-06 22:15:14,626 - ResultStage 24 (collect at
CarbonDataRDDFactory.scala:646) finished in 0.437 s
INFO 30-06 22:15:14,626 - Removed TaskSet 24.0, whose tasks have all
completed, from pool
INFO 30-06 22:15:14,626 - Job 18 finished: collect at
CarbonDataRDDFactory.scala:646, took 0.440783 s
AUDIT 30-06 22:15:14,628 - [Pro.local]The data loading is successful.
INFO 30-06 22:15:14,628 - Table MetaData Unlocked Successfully after data load
res17: org.apache.spark.sql.SchemaRDD = []
scala> cc.sql("select * from table1").show
INFO 30-06 22:16:21,625 - main Query [SELECT * FROM TABLE1]
INFO 30-06 22:16:21,628 - Parsing command: select * from table1
INFO 30-06 22:16:21,628 - Parse Completed
INFO 30-06 22:16:21,629 - Parsing command: select * from table1
INFO 30-06 22:16:21,629 - Parse Completed
INFO 30-06 22:16:21,732 - No valid segments found to scan
INFO 30-06 22:16:21,734 - Starting job: show at <console>:33
INFO 30-06 22:16:21,734 - Got job 19 (show at <console>:33) with 1 output
partitions
INFO 30-06 22:16:21,734 - Final stage: ResultStage 25 (show at <console>:33)
INFO 30-06 22:16:21,734 - Parents of final stage: List()
INFO 30-06 22:16:21,734 - Missing parents: List()
INFO 30-06 22:16:21,735 - Submitting ResultStage 25 (MapPartitionsRDD[111] at
show at <console>:33), which has no missing parents
INFO 30-06 22:16:21,736 - Block broadcast_37 stored as values in memory
(estimated size 14.3 KB, free 787.6 KB)
INFO 30-06 22:16:21,738 - Block broadcast_37_piece0 stored as bytes in memory
(estimated size 7.1 KB, free 794.7 KB)
INFO 30-06 22:16:21,738 - Added broadcast_37_piece0 in memory on
localhost:52028 (size: 7.1 KB, free: 511.0 MB)
INFO 30-06 22:16:21,738 - Created broadcast 37 from broadcast at
DAGScheduler.scala:1006
INFO 30-06 22:16:21,738 - Submitting 1 missing tasks from ResultStage 25
(MapPartitionsRDD[111] at show at <console>:33)
INFO 30-06 22:16:21,739 - Adding task set 25.0 with 1 tasks
INFO 30-06 22:16:21,739 - Starting task 0.0 in stage 25.0 (TID 230, localhost,
partition 0,ANY, 2251 bytes)
INFO 30-06 22:16:21,740 - Running task 0.0 in stage 25.0 (TID 230)
INFO 30-06 22:16:21,742 - ********************** Total Time Taken to execute
the query in Carbon Side: 1467296181742
INFO 30-06 22:16:21,744 - Finished task 0.0 in stage 25.0 (TID 230). 1085
bytes result sent to driver
INFO 30-06 22:16:21,746 - Finished task 0.0 in stage 25.0 (TID 230) in 6 ms on
localhost (1/1)
INFO 30-06 22:16:21,746 - Removed TaskSet 25.0, whose tasks have all
completed, from pool
INFO 30-06 22:16:21,747 - ResultStage 25 (show at <console>:33) finished in
0.008 s
INFO 30-06 22:16:21,748 - Job 19 finished: show at <console>:33, took 0.014400
s
+---+----+----+---+
| id|name|city|age|
+---+----+----+---+
+---+----+----+---+
------------------ Original Message ------------------
From: <[email protected]>
Date: 30 June 2016 (Thursday) 9:59 PM
To: "caiqiang"<[email protected]>; "Liang Big data"<[email protected]>
Cc: "dev"<[email protected]>
Subject: Re: Fwd: carbondata question
After changing the value after STORED BY to lowercase, the create table statement went through, and I then continued with the Quick Start steps.
------------------ Original Message ------------------
From: "caiqiang"<[email protected]>
Date: 30 June 2016 (Thursday) 7:30 PM
To: "Liang Big data"<[email protected]>; <[email protected]>
Cc: "dev"<[email protected]>
Subject: Re: Fwd: carbondata question
Please convert 'ORG.APACHE.CARBONDATA.FORMAT' to lowercase.
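For example, with the table schema from the Quick Start, the statement with the format string in lowercase would read:
scala> cc.sql("create table if not exists table1 (id string, name string, city string, age Int) STORED BY 'org.apache.carbondata.format'")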
------------------ Original Message ------------------
From: Liang Big data <[email protected]>
Date: 30 June 2016 18:59
To: qiangcai <[email protected]>, 33822323 <[email protected]>
Subject: Fwd: carbondata question
Please, caiqiang, provide the needed help.
---------- Forwarded message ----------
From: <[email protected]>
Date: 2016-06-30 13:04 GMT+05:30
Subject: carbondata question
To: dev <[email protected]>
I was following the guide at https://github.com/HuaweiBigData/carbondata/wiki/Quick-Start. When I execute the create table statement:
scala> cc.sql("create table if not exists table1 (id string, name string,
city string, age Int) STORED BY 'org.apache.carbondata.format'")
the following error occurs:
INFO 30-06 15:32:28,408 - main Query [CREATE TABLE IF NOT EXISTS TABLE1 (ID
STRING, NAME STRING, CITY STRING, AGE INT) STORED BY
'ORG.APACHE.CARBONDATA.FORMAT']
INFO 30-06 15:32:28,420 - Parsing command: create table if not exists table1
(id string, name string, city string, age Int) STORED BY
'org.apache.carbondata.format'
INFO 30-06 15:32:28,421 - Parse Completed
AUDIT 30-06 15:32:28,426 - [Pro.local]Creating Table with Database name
[default] and Table name [table1]
INFO 30-06 15:32:28,526 - Table table1 for Database default created
successfully.
INFO 30-06 15:32:28,527 - main Table table1 for Database default created
successfully.
INFO 30-06 15:32:28,527 - main Query [CREATE TABLE DEFAULT.TABLE1 USING
ORG.APACHE.SPARK.SQL.CARBONSOURCE OPTIONS (TABLENAME "DEFAULT.TABLE1",
TABLEPATH "./CARBONDATA/STORE/DEFAULT/TABLE1/METADATA") ]
WARN 30-06 15:32:28,605 - Couldn't find corresponding Hive SerDe for data
source provider org.apache.spark.sql.CarbonSource. Persisting data source
relation `default`.`table1` into Hive metastore in Spark SQL specific format,
which is NOT compatible with Hive.
WARN 30-06 15:32:49,830 - MetaStoreClient lost connection. Attempting to
reconnect.
MetaException(message:javax.jdo.JDODataStoreException: An exception was thrown
while adding/validating class(es) : Specified key was too long; max key length
is 767 bytes
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Specified key was
too long; max key length is 767 bytes
at sun.reflect.GeneratedConstructorAccessor30.newInstance(Unknown
Source)
at
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
at com.mysql.jdbc.Util.getInstance(Util.java:386)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1053)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4096)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:4028)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:2490)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2651)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2728)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2678)
at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:894)
at com.mysql.jdbc.StatementImpl.execute(StatementImpl.java:732)
at com.jolbox.bonecp.StatementHandle.execute(StatementHandle.java:254)
at
org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatement(AbstractTable.java:760)
at
org.datanucleus.store.rdbms.table.AbstractTable.executeDdlStatementList(AbstractTable.java:711)
at
org.datanucleus.store.rdbms.table.AbstractTable.create(AbstractTable.java:425)
at
org.datanucleus.store.rdbms.table.AbstractTable.exists(AbstractTable.java:488)
at
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.performTablesValidation(RDBMSStoreManager.java:3380)
at
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.addClassTablesAndValidate(RDBMSStoreManager.java:3190)
at
org.datanucleus.store.rdbms.RDBMSStoreManager$ClassAdder.run(RDBMSStoreManager.java:2841)
at
org.datanucleus.store.rdbms.AbstractSchemaTransaction.execute(AbstractSchemaTransaction.java:122)
at
org.datanucleus.store.rdbms.RDBMSStoreManager.addClasses(RDBMSStoreManager.java:1605)
at
org.datanucleus.store.AbstractStoreManager.addClass(AbstractStoreManager.java:954)
at
org.datanucleus.store.rdbms.RDBMSStoreManager.getDatastoreClass(RDBMSStoreManager.java:679)
at
org.datanucleus.store.rdbms.RDBMSStoreManager.getPropertiesForGenerator(RDBMSStoreManager.java:2045)
at
org.datanucleus.store.AbstractStoreManager.getStrategyValue(AbstractStoreManager.java:1365)
at
org.datanucleus.ExecutionContextImpl.newObjectId(ExecutionContextImpl.java:3827)
at
org.datanucleus.state.JDOStateManager.setIdentity(JDOStateManager.java:2571)
at
org.datanucleus.state.JDOStateManager.initialiseForPersistentNew(JDOStateManager.java:513)
at
org.datanucleus.state.ObjectProviderFactoryImpl.newForPersistentNew(ObjectProviderFactoryImpl.java:232)
at
org.datanucleus.ExecutionContextImpl.newObjectProviderForPersistentNew(ExecutionContextImpl.java:1414)
at
org.datanucleus.ExecutionContextImpl.persistObjectInternal(ExecutionContextImpl.java:2218)
at
org.datanucleus.ExecutionContextImpl.persistObjectWork(ExecutionContextImpl.java:2065)
at
org.datanucleus.ExecutionContextImpl.persistObject(ExecutionContextImpl.java:1913)
at
org.datanucleus.ExecutionContextThreadedImpl.persistObject(ExecutionContextThreadedImpl.java:217)
at
org.datanucleus.api.jdo.JDOPersistenceManager.jdoMakePersistent(JDOPersistenceManager.java:727)
at
org.datanucleus.api.jdo.JDOPersistenceManager.makePersistent(JDOPersistenceManager.java:752)
at
org.apache.hadoop.hive.metastore.ObjectStore.createTable(ObjectStore.java:814)
at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:114)
at com.sun.proxy.$Proxy2.createTable(Unknown Source)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:1416)
at
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_with_environment_context(HiveMetaStore.java:1449)
at sun.reflect.GeneratedMethodAccessor31.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
at com.sun.proxy.$Proxy4.create_table_with_environment_context(Unknown
Source)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:9200)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table_with_environment_context.getResult(ThriftHiveMetastore.java:9184)
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
at
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:110)
at
org.apache.hadoop.hive.metastore.TUGIBasedProcessor$1.run(TUGIBasedProcessor.java:106)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1656)
at
org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:118)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:285)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Could you tell me what is causing this?
--
Regards
Liang