Thanks for reporting this! https://issues.apache.org/jira/browse/SPARK-1964 https://github.com/apache/spark/pull/913
If you could test out that PR and see if it fixes your problems I'd really appreciate it!

Michael

On Thu, May 29, 2014 at 9:09 AM, Andrew Ash <and...@andrewash.com> wrote:

> I can confirm that the commit is included in the 1.0.0 release candidates
> (it was committed before branch-1.0 split off from master), but I can't
> confirm that it works in PySpark. Generally the Python and Java interfaces
> lag a little behind the Scala interface to Spark, but we're working to keep
> that diff much smaller going forward.
>
> Can you try the same thing in Scala?
>
>
> On Thu, May 29, 2014 at 8:54 AM, dataginjaninja <rickett.stepha...@gmail.com> wrote:
>
> > Can anyone verify which rc [SPARK-1360] Add Timestamp Support for SQL #275
> > <https://github.com/apache/spark/pull/275> is included in? I am running
> > rc3, but receiving errors with TIMESTAMP as a datatype in my Hive tables
> > when trying to use them in pyspark.
> >
> > *The error I get:*
> >
> > 14/05/29 15:44:47 INFO ParseDriver: Parsing command: SELECT COUNT(*) FROM aol
> > 14/05/29 15:44:48 INFO ParseDriver: Parse Completed
> > 14/05/29 15:44:48 INFO metastore: Trying to connect to metastore with URI thrift:
> > 14/05/29 15:44:48 INFO metastore: Waiting 1 seconds before next connection attempt.
> > 14/05/29 15:44:49 INFO metastore: Connected to metastore.
> > Traceback (most recent call last):
> >   File "<stdin>", line 1, in <module>
> >   File "/opt/spark-1.0.0-rc3/python/pyspark/sql.py", line 189, in hql
> >     return self.hiveql(hqlQuery)
> >   File "/opt/spark-1.0.0-rc3/python/pyspark/sql.py", line 183, in hiveql
> >     return SchemaRDD(self._ssql_ctx.hiveql(hqlQuery), self)
> >   File "/opt/spark-1.0.0-rc3/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py", line 537, in __call__
> >   File "/opt/spark-1.0.0-rc3/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py", line 300, in get_return_value
> > py4j.protocol.Py4JJavaError: An error occurred while calling o14.hiveql.
> > : java.lang.RuntimeException: Unsupported dataType: timestamp
> >
> > *The table I loaded:*
> >
> > DROP TABLE IF EXISTS aol;
> > CREATE EXTERNAL TABLE aol (
> >   userid STRING,
> >   query STRING,
> >   query_time TIMESTAMP,
> >   item_rank INT,
> >   click_url STRING)
> > ROW FORMAT DELIMITED
> > FIELDS TERMINATED BY '\t'
> > LOCATION '/tmp/data/aol';
> >
> > *The pyspark commands:*
> >
> > from pyspark.sql import HiveContext
> > hctx= HiveContext(sc)
> > results = hctx.hql("SELECT COUNT(*) FROM aol").collect()
> >
> > --
> > View this message in context:
> > http://apache-spark-developers-list.1001551.n3.nabble.com/Timestamp-support-in-v1-0-tp6850.html
> > Sent from the Apache Spark Developers List mailing list archive at Nabble.com.