Hi all,

I am testing the performance of Hive on Spark SQL. The existing table is created with:

ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
  'input.regex' = '(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)',
  'output.format.string' = '%1$s %2$s %3$s %4$s %5$s %16$s %7$s %8$s %9$s %10$s %11$s %12$s %13$s %14$s %15$s %16$s %17$s '
)
STORED AS TEXTFILE
LOCATION '/data/BaseData/wx/xx/xx/xx/xx';
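For context on what the SerDe is doing, here is a minimal standalone sketch of how that input.regex splits a row. The sample line and the 3-group pattern are my own illustration (the real table uses 17 groups); the double backslashes in the DDL escape the regex metacharacters | and ^ so that the literal delimiter |^| is matched:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexSerDeDemo {
    public static void main(String[] args) {
        // Hypothetical 3-field sample row; the table's real pattern has 17 groups.
        String line = "a|^|b|^|c";
        // Same shape as the table's input.regex, escaped once (Java string),
        // not twice (HiveQL string literal plus regex).
        Pattern p = Pattern.compile("(.*?)\\|\\^\\|(.*?)\\|\\^\\|(.*?)");
        Matcher m = p.matcher(line);
        if (m.matches()) {
            // Each capturing group becomes one column value.
            System.out.println(m.group(1)); // a
            System.out.println(m.group(2)); // b
            System.out.println(m.group(3)); // c
        }
    }
}
```

The output.format.string then reassembles columns with Java Formatter positional arguments (%1$s, %2$s, ...).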
When I use Spark SQL (spark-shell) to query the existing table, I get an exception like this:

Caused by: MetaException(message:java.lang.ClassNotFoundException Class org.apache.hadoop.hive.contrib.serde2.RegexSerDe not found)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:382)
    at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)
I added the jar dependency to the spark-shell command, but it still does not work:

SPARK_SUBMIT_OPTS="-XX:MaxPermSize=256m" ./bin/spark-shell --jars \
  /data/dbcenter/cdh5/spark-1.4.0-bin-hadoop2.4/hive-contrib-0.13.1-cdh5.2.0.jar,postgresql-9.2-1004-jdbc41.jar
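One variant worth trying (an assumption on my part, not verified against this exact Spark 1.4 setup): --jars distributes the jar to executors, but the metastore deserializer lookup in the stack trace above happens on the driver, so the jar may also need to be on the driver classpath:

```shell
# Hypothetical variant of the same command: additionally put hive-contrib on
# the driver classpath, since the SerDe class is resolved on the driver.
SPARK_SUBMIT_OPTS="-XX:MaxPermSize=256m" ./bin/spark-shell \
  --driver-class-path /data/dbcenter/cdh5/spark-1.4.0-bin-hadoop2.4/hive-contrib-0.13.1-cdh5.2.0.jar \
  --jars /data/dbcenter/cdh5/spark-1.4.0-bin-hadoop2.4/hive-contrib-0.13.1-cdh5.2.0.jar,postgresql-9.2-1004-jdbc41.jar
```

Another route some setups use is running ADD JAR with the same path from inside the shell session before issuing the query.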
How should I fix the problem?

Cheers