Hi, I'm having some issues with the Phoenix Spark connector. I'm using the phoenix-for-cloudera [1] build, so I'm not sure whether these are bugs that have already been fixed.
1. I can't use the connector to load tables that have lowercase names. Suppose I have a view of an HBase table called 'test' (lowercase). Loading it with

    val df = sqlContext.load(
      "org.apache.phoenix.spark",
      Map("table" -> "test", "zkUrl" -> zk)
    )

fails with the exception:

    org.apache.phoenix.schema.TableNotFoundException: ERROR 1012 (42M03): Table undefined. tableName=TEST
        at org.apache.phoenix.schema.PMetaDataImpl.getTableRef(PMetaDataImpl.java:71)
        at org.apache.phoenix.jdbc.PhoenixConnection.getTable(PhoenixConnection.java:452)
        at org.apache.phoenix.util.PhoenixRuntime.getTable(PhoenixRuntime.java:399)
        at org.apache.phoenix.util.PhoenixRuntime.generateColumnInfo(PhoenixRuntime.java:425)
        at org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil.getSelectColumnMetadataList(PhoenixConfigurationUtil.java:281)
        at org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:106)
        at org.apache.phoenix.spark.PhoenixRelation.schema(PhoenixRelation.scala:60)
        at org.apache.spark.sql.execution.datasources.LogicalRelation.<init>(LogicalRelation.scala:40)

2. A column whose name contains a period cannot be used. For example, I added a column named 'test.test' to a table. Through phoenix-sqlline.py I can use this column without a problem, but when I load the table with the Phoenix Spark connector as before:

    val df = sqlContext.load(
      "org.apache.phoenix.spark",
      Map("table" -> "TEST", "zkUrl" -> zk)
    )

then I cannot even view the table:

    scala> df.show
    org.apache.phoenix.schema.ColumnFamilyNotFoundException: ERROR 1001 (42I01): Undefined column family. familyName=test
        at org.apache.phoenix.schema.PTableImpl.getColumnFamily(PTableImpl.java:921)
        at org.apache.phoenix.util.PhoenixRuntime.getColumnInfo(PhoenixRuntime.java:494)
        at org.apache.phoenix.util.PhoenixRuntime.generateColumnInfo(PhoenixRuntime.java:440)
        at org.apache.phoenix.mapreduce.util.PhoenixConfigurationUtil.getSelectColumnMetadataList(PhoenixConfigurationUtil.java:281)
        at org.apache.phoenix.spark.PhoenixRDD.toDataFrame(PhoenixRDD.scala:106)
        at org.apache.phoenix.spark.PhoenixRelation.buildScan(PhoenixRelation.scala:47)
        at org.apache.spark.sql.execution.datasources.DataSourceStrategy$$anonfun$11.apply(DataSourceStrategy.scala:336)

[1] https://github.com/chiastic-security/phoenix-for-cloudera
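For what it's worth, Phoenix upper-cases unquoted SQL identifiers (which is why the error reports tableName=TEST), and a period in an unquoted name is interpreted as a family.qualifier separator, so both failures look like identifier-quoting problems in the column metadata the connector generates. Below is the workaround I would expect to work, assuming the connector passes the quoted identifiers through to Phoenix verbatim — I have not been able to confirm that it does:

```scala
// Sketch of a possible workaround, untested: double-quote the
// identifiers so Phoenix preserves case instead of upper-casing them.
// The escaped quotes are assumed to be forwarded verbatim to Phoenix.
val df = sqlContext.load(
  "org.apache.phoenix.spark",
  Map("table" -> "\"test\"", "zkUrl" -> zk)
)

// At the SQL level (e.g. in sqlline) quoting handles both cases:
//   SELECT "test.test" FROM "test";
// Without the quotes, Phoenix would look up TEST and would treat
// the prefix before the period as a column family name.
```

Is there a way to pass the column list to the connector in already-quoted form, or does this need a fix in PhoenixConfigurationUtil/PhoenixRuntime?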