Weizhong created SPARK-15335:
--------------------------------
Summary: In Spark 2.0 TRUNCATE TABLE is unsupported
Key: SPARK-15335
URL: https://issues.apache.org/jira/browse/SPARK-15335
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 2.0.0
Reporter: Weizhong
Priority: Minor
Tested on a Spark build based on commit
b3930f74a0929b2cdcbbe5cbe34f0b1d35eb01cc. The result is shown below:
{noformat}
spark-sql> create table truncateTT(c string);
16/05/16 10:23:15 INFO execution.SparkSqlParser: Parsing command: create table
truncateTT(c string)
16/05/16 10:23:15 INFO metastore.HiveMetaStore: 0: get_database: default
16/05/16 10:23:15 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr
cmd=get_database: default
16/05/16 10:23:15 INFO metastore.HiveMetaStore: 0: get_database: default
16/05/16 10:23:15 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr
cmd=get_database: default
16/05/16 10:23:15 INFO metastore.HiveMetaStore: 0: create_table:
Table(tableName:truncatett, dbName:default, owner:root, createTime:1463365395,
lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:c,
type:string, comment:null)], location:null,
inputFormat:org.apache.hadoop.mapred.TextInputFormat,
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{serialization.format=1}), bucketCols:[], sortCols:[],
parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[],
skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{},
viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE,
privileges:PrincipalPrivilegeSet(userPrivileges:{}, groupPrivileges:null,
rolePrivileges:null))
16/05/16 10:23:15 INFO HiveMetaStore.audit: ugi=root ip=unknown-ip-addr
cmd=create_table: Table(tableName:truncatett, dbName:default, owner:root,
createTime:1463365395, lastAccessTime:0, retention:0,
sd:StorageDescriptor(cols:[FieldSchema(name:c, type:string, comment:null)],
location:null, inputFormat:org.apache.hadoop.mapred.TextInputFormat,
outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat,
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
serializationLib:org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe,
parameters:{serialization.format=1}), bucketCols:[], sortCols:[],
parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[],
skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{},
viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE,
privileges:PrincipalPrivilegeSet(userPrivileges:{}, groupPrivileges:null,
rolePrivileges:null))
16/05/16 10:23:15 INFO common.FileUtils: Creating directory if it doesn't
exist: hdfs://vm001:9000/opt/apache/spark/spark-warehouse/truncatett
16/05/16 10:23:16 INFO spark.SparkContext: Starting job: processCmd at
CliDriver.java:376
16/05/16 10:23:16 INFO scheduler.DAGScheduler: Got job 1 (processCmd at
CliDriver.java:376) with 1 output partitions
16/05/16 10:23:16 INFO scheduler.DAGScheduler: Final stage: ResultStage 1
(processCmd at CliDriver.java:376)
16/05/16 10:23:16 INFO scheduler.DAGScheduler: Parents of final stage: List()
16/05/16 10:23:16 INFO scheduler.DAGScheduler: Missing parents: List()
16/05/16 10:23:16 INFO scheduler.DAGScheduler: Submitting ResultStage 1
(MapPartitionsRDD[5] at processCmd at CliDriver.java:376), which has no missing
parents
16/05/16 10:23:16 INFO memory.MemoryStore: Block broadcast_1 stored as values
in memory (estimated size 3.2 KB, free 1823.2 MB)
16/05/16 10:23:16 INFO memory.MemoryStore: Block broadcast_1_piece0 stored as
bytes in memory (estimated size 1965.0 B, free 1823.2 MB)
16/05/16 10:23:16 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in
memory on 192.168.151.146:47228 (size: 1965.0 B, free: 1823.2 MB)
16/05/16 10:23:16 INFO spark.SparkContext: Created broadcast 1 from broadcast
at DAGScheduler.scala:1012
16/05/16 10:23:16 INFO scheduler.DAGScheduler: Submitting 1 missing tasks from
ResultStage 1 (MapPartitionsRDD[5] at processCmd at CliDriver.java:376)
16/05/16 10:23:16 INFO cluster.YarnScheduler: Adding task set 1.0 with 1 tasks
16/05/16 10:23:16 INFO scheduler.TaskSetManager: Starting task 0.0 in stage 1.0
(TID 1, vm001, partition 0, PROCESS_LOCAL, 5387 bytes)
16/05/16 10:23:16 INFO cluster.YarnClientSchedulerBackend: Launching task 1 on
executor id: 1 hostname: vm001.
16/05/16 10:23:16 INFO storage.BlockManagerInfo: Added broadcast_1_piece0 in
memory on vm001:35665 (size: 1965.0 B, free: 4.4 GB)
16/05/16 10:23:18 INFO scheduler.TaskSetManager: Finished task 0.0 in stage 1.0
(TID 1) in 2105 ms on vm001 (1/1)
16/05/16 10:23:18 INFO cluster.YarnScheduler: Removed TaskSet 1.0, whose tasks
have all completed, from pool
16/05/16 10:23:18 INFO scheduler.DAGScheduler: ResultStage 1 (processCmd at
CliDriver.java:376) finished in 2.105 s
16/05/16 10:23:18 INFO scheduler.DAGScheduler: Job 1 finished: processCmd at
CliDriver.java:376, took 2.121866 s
Time taken: 2.691 seconds
16/05/16 10:23:18 INFO CliDriver: Time taken: 2.691 seconds
spark-sql> truncate table truncateTT;
16/05/16 10:23:32 INFO execution.SparkSqlParser: Parsing command: truncate
table truncateTT
Error in query:
Unsupported SQL statement
== SQL ==
truncate table truncateTT
spark-sql>
{noformat}
Before 2.0, Spark ran 'TRUNCATE TABLE ...' as a Hive native command, so it
worked. Spark 2.0 should support it as well.