First, the JAR needs to be deployed using the --jars argument. Then, in your HQL code, you need to use the DeprecatedParquetInputFormat and DeprecatedParquetOutputFormat as described here: https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Hive0.10-0.12
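For example, something along these lines (just a sketch; the table and column names are placeholders, and the JAR path is the one from your own steps below):

    spark-shell --jars /home/hduser/hive/lib/parquet-hive-bundle-1.5.0.jar

and then in the shell:

    val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
    import hiveContext._

    // Hive 0.12 does not understand STORED AS PARQUET, so name the SerDe
    // and the deprecated input/output formats from the bundle explicitly.
    hql("""CREATE TABLE part_parquet (p_partkey INT, p_name STRING)
           ROW FORMAT SERDE 'parquet.hive.serde.ParquetHiveSerDe'
           STORED AS
             INPUTFORMAT 'parquet.hive.DeprecatedParquetInputFormat'
             OUTPUTFORMAT 'parquet.hive.DeprecatedParquetOutputFormat'""")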
This is because Spark SQL is based on Hive 0.12. That's what's worked for me.

Thanks,
Silvio

On 8/18/14, 3:14 PM, "lyc" <yanchen....@huawei.com> wrote:

>I followed your instructions to try to load data in Parquet format through
>hiveContext but failed. Do you happen to know what I did wrong in the
>following steps?
>
>The steps I am following are:
>
>1. Download "parquet-hive-bundle-1.5.0.jar".
>2. Revise hive-site.xml to include this:
>
>  <property>
>    <name>hive.jar.directory</name>
>    <value>/home/hduser/hive/lib/parquet-hive-bundle-1.5.0.jar</value>
>    <description>
>      This is the location hive in tez mode will look for to find a site wide
>      installed hive instance. If not set, the directory under
>      hive.user.install.directory corresponding to current user name will be used.
>    </description>
>  </property>
>
>3. Copy hive-site.xml to all nodes.
>4. Start spark-shell, then try to create the table:
>
>  val hiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>  import hiveContext._
>  hql("create table part (P_PARTKEY INT, P_NAME STRING, P_MFGR STRING, P_BRAND STRING, P_TYPE STRING, P_SIZE INT, P_CONTAINER STRING, P_RETAILPRICE DOUBLE, P_COMMENT STRING) STORED AS PARQUET")
>
>Then I got this error:
>
>14/08/18 19:09:00 ERROR Driver: FAILED: SemanticException Unrecognized file format in STORED AS clause: PARQUET
>org.apache.hadoop.hive.ql.parse.SemanticException: Unrecognized file format in STORED AS clause: PARQUET
>    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.handleGenericFileFormat(BaseSemanticAnalyzer.java:569)
>    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:8968)
>    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8313)
>    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:284)
>    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:441)
>    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:342)
>    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:977)
>    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888)
>    at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:186)
>    at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:160)
>    at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:250)
>    at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:247)
>    at org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:85)
>    at org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:90)
>    at $line44.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:18)
>    at $line44.$read$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:23)
>    at $line44.$read$$iwC$$iwC$$iwC$$iwC.<init>(<console>:25)
>    at $line44.$read$$iwC$$iwC$$iwC.<init>(<console>:27)
>    at $line44.$read$$iwC$$iwC.<init>(<console>:29)
>    at $line44.$read$$iwC.<init>(<console>:31)
>    at $line44.$read.<init>(<console>:33)
>    at $line44.$read$.<init>(<console>:37)
>    at $line44.$read$.<clinit>(<console>)
>    at $line44.$eval$.<init>(<console>:7)
>    at $line44.$eval$.<clinit>(<console>)
>    at $line44.$eval.$print(<console>)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    at java.lang.reflect.Method.invoke(Method.java:601)
>    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:788)
>    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1056)
>    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:614)
>    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:645)
>    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:609)
>    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:796)
>    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:841)
>    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:753)
>    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:601)
>    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:608)
>    at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:611)
>    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:936)
>    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
>    at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:884)
>    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:884)
>    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:982)
>    at org.apache.spark.repl.Main$.main(Main.scala:31)
>    at org.apache.spark.repl.Main.main(Main.scala)
>    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>    at java.lang.reflect.Method.invoke(Method.java:601)
>    at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:292)
>    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:55)
>    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
>Many thanks for help!