Hi all, I've got a (very) basic Spark application in Python that selects some basic information from my Phoenix table. However, I can't figure out how (or even whether I can) select dynamic columns through it.
Here's what I have:

    from pyspark import SparkContext, SparkConf
    from pyspark.sql import SQLContext

    conf = SparkConf().setAppName("pysparkPhoenixLoad").setMaster("local")
    sc = SparkContext(conf=conf)
    sqlContext = SQLContext(sc)

    df = sqlContext.read.format("org.apache.phoenix.spark") \
        .option("table", """MYTABLE("dynamic_column" VARCHAR)""") \
        .option("zkUrl", "127.0.0.1:2181:/hbase-unsecure") \
        .load()

    df.show()
    df.printSchema()

This fails with an org.apache.phoenix.schema.TableNotFoundException. If I instead register the data frame as a table and query it with SQL:

    sqlContext.registerDataFrameAsTable(df, "test")
    sqlContext.sql("""SELECT * FROM test("dynamic_column" VARCHAR)""")

I get a rather strange exception:

    py4j.protocol.Py4JJavaError: An error occurred while calling o37.sql.
    : java.lang.RuntimeException: [1.19] failure: ``union'' expected but `(' found

    SELECT * FROM test("dynamic_column" VARCHAR)

Does anybody have a pointer on whether this is supported, and how I might be able to query a dynamic column? I haven't found much information on the wider Internet about Spark + Phoenix integration for this kind of thing. Simple selects (without dynamic columns) are working.

Final note: I have (rather stupidly) lower-cased my column names in Phoenix, so I need to quote them when I execute a query (I'll be changing this as soon as possible).

Any assistance would be appreciated :)

*-- Craig*
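P.S. For reference, here's a small helper (illustrative only, names from my schema) showing the shape of the dynamic-column query I'm ultimately trying to get Spark to issue; Phoenix's SQL grammar accepts this form, so the question is really how to get it through the connector:

    # Illustrative helper: builds the Phoenix dynamic-column SELECT I want.
    # Table and column names below are from my schema; the helper itself is
    # just for this post, not something from the Phoenix or Spark APIs.

    def dynamic_column_query(table, dynamic_cols):
        """Build SELECT * FROM table("col" TYPE, ...) for Phoenix dynamic columns.

        Column names are double-quoted because mine are lower-cased in Phoenix.
        """
        clause = ", ".join('"{0}" {1}'.format(name, sql_type)
                           for name, sql_type in dynamic_cols.items())
        return 'SELECT * FROM {0}({1})'.format(table, clause)

    query = dynamic_column_query("MYTABLE", {"dynamic_column": "VARCHAR"})
    # query == 'SELECT * FROM MYTABLE("dynamic_column" VARCHAR)'

If the phoenix-spark connector's "table" option can't accept this form, I'm wondering whether pushing the query through the Phoenix JDBC driver instead is the right route.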