[ https://issues.apache.org/jira/browse/SPARK-1964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Armbrust reassigned SPARK-1964:
---------------------------------------

    Assignee: Michael Armbrust

> Timestamp missing from HiveMetastore types parser
> -------------------------------------------------
>
>                 Key: SPARK-1964
>                 URL: https://issues.apache.org/jira/browse/SPARK-1964
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.0.0
>            Reporter: Michael Armbrust
>            Assignee: Michael Armbrust
>
> {code}
> ---------- Forwarded message ----------
> From: dataginjaninja <rickett.stepha...@gmail.com>
> Date: Thu, May 29, 2014 at 8:54 AM
> Subject: Timestamp support in v1.0
> To: d...@spark.incubator.apache.org
>
> Can anyone verify which rc [SPARK-1360] Add Timestamp Support for SQL #275
> <https://github.com/apache/spark/pull/275> is included in? I am running
> rc3, but receiving errors with TIMESTAMP as a datatype in my Hive tables
> when trying to use them in pyspark.
>
> *The error I get:*
>
> 14/05/29 15:44:47 INFO ParseDriver: Parsing command: SELECT COUNT(*) FROM
> aol
> 14/05/29 15:44:48 INFO ParseDriver: Parse Completed
> 14/05/29 15:44:48 INFO metastore: Trying to connect to metastore with URI
> thrift:
> 14/05/29 15:44:48 INFO metastore: Waiting 1 seconds before next connection
> attempt.
> 14/05/29 15:44:49 INFO metastore: Connected to metastore.
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/opt/spark-1.0.0-rc3/python/pyspark/sql.py", line 189, in hql
>     return self.hiveql(hqlQuery)
>   File "/opt/spark-1.0.0-rc3/python/pyspark/sql.py", line 183, in hiveql
>     return SchemaRDD(self._ssql_ctx.hiveql(hqlQuery), self)
>   File
> "/opt/spark-1.0.0-rc3/python/lib/py4j-0.8.1-src.zip/py4j/java_gateway.py",
> line 537, in __call__
>   File
> "/opt/spark-1.0.0-rc3/python/lib/py4j-0.8.1-src.zip/py4j/protocol.py", line
> 300, in get_return_value
> py4j.protocol.Py4JJavaError: An error occurred while calling o14.hiveql.
> : java.lang.RuntimeException: Unsupported dataType: timestamp
>
> *The table I loaded:*
>
> DROP TABLE IF EXISTS aol;
> CREATE EXTERNAL TABLE aol (
>   userid STRING,
>   query STRING,
>   query_time TIMESTAMP,
>   item_rank INT,
>   click_url STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> LOCATION '/tmp/data/aol';
>
> *The pyspark commands:*
>
> from pyspark.sql import HiveContext
> hctx= HiveContext(sc)
> results = hctx.hql("SELECT COUNT(*) FROM aol").collect()
> {code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
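
For illustration only: the issue title says the metastore type-string parser has no case for timestamp, and the traceback shows the resulting "Unsupported dataType: timestamp" error. The sketch below is a hypothetical, simplified stand-in written in Python, not Spark's actual parser. The names PRIMITIVES, split_top_level and parse_type are invented for this example. It shows how a parser for Hive-style type strings falls through to that error when a primitive name is absent from its table, and how adding "timestamp" to the table resolves it.

```python
# Hypothetical sketch, NOT Spark's implementation. All names are illustrative.

# Primitive type names the parser accepts. Per the issue title, "timestamp"
# is the entry that was missing; removing it here reproduces the error text.
PRIMITIVES = {
    "string", "int", "bigint", "smallint", "tinyint", "float", "double",
    "boolean", "binary", "timestamp",
}


def split_top_level(s, sep=","):
    """Split s on sep, ignoring separators nested inside <...>."""
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(s):
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif ch == sep and depth == 0:
            parts.append(s[start:i])
            start = i + 1
    parts.append(s[start:])
    return parts


def parse_type(text):
    """Parse a Hive-style type string into a nested tuple description."""
    t = text.strip().lower()
    if t in PRIMITIVES:
        return t
    if t.startswith("array<") and t.endswith(">"):
        return ("array", parse_type(t[len("array<"):-1]))
    if t.startswith("map<") and t.endswith(">"):
        parts = split_top_level(t[len("map<"):-1])
        if len(parts) == 2:
            return ("map", parse_type(parts[0]), parse_type(parts[1]))
    if t.startswith("struct<") and t.endswith(">"):
        fields = []
        for field in split_top_level(t[len("struct<"):-1]):
            # Each field looks like "name:type"; split at the first colon.
            name, _, field_type = field.partition(":")
            fields.append((name.strip(), parse_type(field_type)))
        return ("struct", tuple(fields))
    # Anything unrecognized ends up here, mirroring the message in the report.
    raise RuntimeError("Unsupported dataType: " + text)
```

With "timestamp" in the table, parse_type("timestamp") and nested forms such as parse_type("struct<a:int,b:timestamp>") succeed. A name that is not in the table, for example "decimal" in this sketch, raises RuntimeError with the same "Unsupported dataType:" prefix.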