You can run Hive-style queries with spark-avro, but you cannot query the Hive 
view through spark-avro, because the view is stored in the Hive metadata.
What do you mean by "the right version of Spark"? Did that fix the "can't 
determine table schema" problem? I faced this problem before, and my guess is 
that a Hive library mismatch causes it, but I'm not sure.
I never faced your second problem. Can you post the whole stack trace for that 
error?
Most of our datasets are also in Avro format.
Yong

Date: Thu, 27 Aug 2015 09:45:45 -0700
Subject: Re: query avro hive table in spark sql
From: gpatc...@gmail.com
To: java8...@hotmail.com
CC: mich...@databricks.com; user@spark.apache.org

Can we run Hive queries using spark-avro?
In our case it's not just reading the Avro file. We have a view in Hive that is 
based on multiple tables.
On Thu, Aug 27, 2015 at 9:41 AM, Giri P <gpatc...@gmail.com> wrote:
We are using Hive 1.1.
I was able to fix the error below when I used the right version of Spark:
15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
        at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
        at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
        at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
        at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)

But I still see this error when querying some Hive Avro tables:
15/08/26 17:51:27 WARN scheduler.TaskSetManager: Lost task 30.0 in stage 0.0 (TID 14, dtord01hdw0227p.dc.dotomi.net): org.apache.hadoop.hive.serde2.avro.BadSchemaException
        at org.apache.hadoop.hive.serde2.avro.AvroSerDe.deserialize(AvroSerDe.java:91)
        at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:321)
        at org.apache.spark.sql.hive.HadoopTableReader$$anonfun$fillObject$1.apply(TableReader.scala:320)

I haven't tried spark-avro. We are using SQLContext to run queries in our 
application.
Any idea whether this issue might be caused by querying across data written 
with different schema versions?
Thanks,
Giri
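
[Editor's sketch] On the schema-version question above: one way two Avro schema versions can conflict is described by Avro's schema resolution rules, where a field that exists only in the reader schema and has no default makes older records unreadable. The Python sketch below checks just that one rule. The schemas and the function name `missing_fields_without_default` are hypothetical, aliases and type promotion are ignored, and nothing in this thread establishes that this is what triggers the BadSchemaException.

```python
import json


def missing_fields_without_default(writer_schema, reader_schema):
    """List reader fields that are absent from the writer schema and have no default.

    Under Avro's schema resolution rules, such fields make a record
    written with writer_schema unreadable with reader_schema.
    """
    writer_names = {field["name"] for field in writer_schema["fields"]}
    return [
        field["name"]
        for field in reader_schema["fields"]
        if field["name"] not in writer_names and "default" not in field
    ]


# Hypothetical schemas: v1 wrote the older files, v2 is what the table declares now.
writer_v1 = json.loads("""
{"type": "record", "name": "Event", "fields": [
  {"name": "id", "type": "long"}
]}
""")
reader_v2 = json.loads("""
{"type": "record", "name": "Event", "fields": [
  {"name": "id", "type": "long"},
  {"name": "country", "type": "string"},
  {"name": "source", "type": ["null", "string"], "default": null}
]}
""")

# "country" is new and has no default; "source" is new but has one.
print(missing_fields_without_default(writer_v1, reader_v2))
```

If the list is non-empty for any pair of versions present in the table's files, adding defaults to the newer schema is the usual remedy.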
On Thu, Aug 27, 2015 at 5:39 AM, java8964 <java8...@hotmail.com> wrote:



What version of Hive are you using? And did you build Spark against the right 
version of Hive?
BTW, spark-avro works great in our experience, but some non-technical people 
still just want to use Spark as a SQL shell, like the Hive CLI.
Yong

From: mich...@databricks.com
Date: Wed, 26 Aug 2015 17:48:44 -0700
Subject: Re: query avro hive table in spark sql
To: gpatc...@gmail.com
CC: user@spark.apache.org

I'd suggest looking at http://spark-packages.org/package/databricks/spark-avro
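
[Editor's sketch] For readers who have not used the package: spark-avro reads the Avro files directly, without going through the Hive metastore, and can expose them as a temporary table for SQL. The Python sketch below only builds the statement; it assumes the spark-avro package is on the classpath and that a SQLContext already exists. The helper name `avro_temp_table_sql`, the table name, and the path are hypothetical.

```python
def avro_temp_table_sql(table_name, avro_path):
    """Build a Spark SQL statement that exposes Avro files as a temporary table."""
    if not table_name.replace("_", "").isalnum():
        raise ValueError("unexpected table name: %r" % table_name)
    if '"' in avro_path:
        raise ValueError("path must not contain double quotes")
    return (
        'CREATE TEMPORARY TABLE %s\n'
        'USING com.databricks.spark.avro\n'
        'OPTIONS (path "%s")' % (table_name, avro_path)
    )


# Hypothetical table name and path, for illustration only.
stmt = avro_temp_table_sql("events", "/data/events/avro")
print(stmt)

# With a live SQLContext (not created in this sketch) one would then run:
#   sqlContext.sql(stmt)
#   sqlContext.sql("SELECT COUNT(*) FROM events").show()
```

Because the table is registered from files rather than from Hive metadata, views defined in Hive are not visible this way.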
On Wed, Aug 26, 2015 at 11:32 AM, gpatcham <gpatc...@gmail.com> wrote:
Hi,

I'm trying to query a Hive table that is based on Avro in Spark SQL, and I'm 
seeing the errors below.

15/08/26 17:51:12 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException determining schema. Returning signal schema to indicate problem
org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
        at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:68)
        at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:93)
        at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:60)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:375)
        at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:249)

It's not able to determine the schema. The Hive table points to the Avro schema 
using a URL. I'm stuck and couldn't find more info on this.
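
[Editor's sketch] The warning says that neither `avro.schema.literal` nor `avro.schema.url` was found in the properties the SerDe was given. The Python sketch below models only that lookup; it is not Hive's code, and `find_avro_schema_source` and the example dictionaries are hypothetical. Since the stack trace passes through `Partition.getDeserializer`, it may be worth checking the partition-level properties as well as the table-level ones.

```python
# Simplified model of the lookup behind the warning; not Hive's implementation.
SCHEMA_LITERAL = "avro.schema.literal"
SCHEMA_URL = "avro.schema.url"


def find_avro_schema_source(properties):
    """Return (property_name, value) for the first schema property that is set.

    Raises ValueError when neither property is present, which is the
    situation the AvroSerdeException message describes.
    """
    # In this sketch the literal is checked before the URL.
    for key in (SCHEMA_LITERAL, SCHEMA_URL):
        value = (properties.get(key) or "").strip()
        if value:
            return key, value
    raise ValueError(
        "Neither %s nor %s specified, can't determine table schema"
        % (SCHEMA_LITERAL, SCHEMA_URL)
    )


# Hypothetical properties, e.g. copied by hand from DESCRIBE FORMATTED output.
table_props = {"avro.schema.url": "hdfs:///schemas/example.avsc"}
partition_props = {}  # a partition that does not carry the property

print(find_avro_schema_source(table_props))
try:
    find_avro_schema_source(partition_props)
except ValueError as err:
    print("partition:", err)
```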



Any pointers?