[
https://issues.apache.org/jira/browse/PHOENIX-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Samarth Jain updated PHOENIX-1405:
----------------------------------
Attachment: PHOENIX-1405_V2.patch
IMO, we should use SchemaUtil.normalizeIdentifier(columnName.trim()) instead.
I assume users are already expected to put column names in double quotes in
their Pig commands when they want those names treated as case sensitive.
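To make the difference concrete, here is a minimal standalone sketch of the two behaviours. It does not call Phoenix: the class IdentifierSketch and both method names are made up for illustration, and the rule it encodes (a double-quoted name keeps its case and loses the quotes, anything else is upper-cased) is my assumption about what normalizeIdentifier does, not a copy of its code.

```java
import java.util.Locale;

// Hypothetical stand-in for illustration only; not part of Phoenix.
final class IdentifierSketch {

    // Assumed normalization rule: a name wrapped in double quotes is case
    // sensitive, so strip the quotes and keep the case; otherwise upper-case.
    static String normalize(String name) {
        if (name == null) {
            return null;
        }
        String trimmed = name.trim();
        boolean quoted = trimmed.length() >= 2
                && trimmed.startsWith("\"")
                && trimmed.endsWith("\"");
        if (quoted) {
            return trimmed.substring(1, trimmed.length() - 1);
        }
        return trimmed.toUpperCase(Locale.ROOT);
    }

    // The behaviour described in the issue: unconditional trim + upper-case,
    // which can never produce a lower-case column name such as id.
    static String upperCaseOnly(String name) {
        return name.trim().toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize(" id "));        // prints ID
        System.out.println(normalize("\"id\""));      // prints id
        System.out.println(upperCaseOnly("\"id\""));  // prints "ID" (quotes kept)
    }
}
```

Under that assumption, a caller would pass "\"id\"" to select the lower-case column from the report, while a bare id would still resolve to ID as it does today.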
I couldn't run the Pig tests to verify that this doesn't break anything,
though. The build fails with:
Failed to execute goal
org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on
project phoenix-core: Fatal error compiling: invalid target release: 1.7 ->
[Help 1]
[~jamestaylor]
> Problem referencing lower-case column names with Phoenix / Pig / Spark
> ----------------------------------------------------------------------
>
> Key: PHOENIX-1405
> URL: https://issues.apache.org/jira/browse/PHOENIX-1405
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 3.2
> Reporter: Robert Roland
> Attachments: PHOENIX-1405.patch, PHOENIX-1405_V2.patch
>
>
> Given the following table definition:
> {noformat}
> CREATE TABLE "mytable" (
>     "id" VARCHAR NOT NULL
>     CONSTRAINT pk PRIMARY KEY ("id")
> ) SALT_BUCKETS=16
> {noformat}
> And the following code setting up a PhoenixPigConfiguration:
> {noformat}
> val phoenixConf = new PhoenixPigConfiguration(new Configuration())
> phoenixConf.setSelectStatement("SELECT \"id\" FROM \"mytable\"")
> phoenixConf.setSelectColumns("id")
> phoenixConf.setSchemaType(SchemaType.QUERY)
> phoenixConf.configure("127.0.0.1", "\"mytable\"", 100)
> val phoenixRDD = sc.newAPIHadoopRDD(phoenixConf.getConfiguration,
>   classOf[PhoenixInputFormat],
>   classOf[NullWritable],
>   classOf[PhoenixRecord])
> {noformat}
> The above seems to work, but when I later call
> phoenixConf.getSelectColumnMetadataList, I get the following error:
> {noformat}
> java.sql.SQLException: Unable to resolve these column names:
> id
> Available columns with column families:
> _SALT,id
>   at org.apache.phoenix.util.PhoenixRuntime.generateColumnInfo(PhoenixRuntime.java:354)
>   at org.apache.phoenix.pig.PhoenixPigConfiguration$PhoenixPigConfigurationUtil.getSelectColumnMetadataList(PhoenixPigConfiguration.java:269)
>   at org.apache.phoenix.pig.PhoenixPigConfiguration.getSelectColumnMetadataList(PhoenixPigConfiguration.java:157)
>   at com.simplymeasured.spark.PhoenixRDD.toSchemaRDD(PhoenixRDD.scala:52)
>   at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply$mcV$sp(PhoenixRDDTest.scala:35)
>   at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply(PhoenixRDDTest.scala:31)
>   at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply(PhoenixRDDTest.scala:31)
>   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> {noformat}
> Looking at PhoenixRuntime, getColumnInfo() applies trim().toUpperCase() to
> the column name, which doesn't seem valid for lower-case (quoted) columns:
> https://github.com/apache/phoenix/blob/3.0/phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java#L374
> I'm attempting to use this from within Spark, and I would like to rely on
> getSelectColumnMetadataList to build a Schema RDD.