[jira] [Updated] (PHOENIX-1405) Problem referencing lower-case column names with Phoenix / Pig / Spark

Samarth Jain (JIRA) Thu, 06 Nov 2014 23:37:48 -0800

     [ https://issues.apache.org/jira/browse/PHOENIX-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Samarth Jain updated PHOENIX-1405:
----------------------------------
    Attachment: PHOENIX-1405_V2.patch

IMO, we should be using SchemaUtil.normalizeIdentifier(columnName.trim()) 
instead. I am guessing users are already expected to put column names in 
quotes in their Pig commands if they want the column names to be case 
sensitive. 
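
To illustrate the difference, here is a minimal, self-contained sketch. It does not call Phoenix: upperCaseOnly and normalizeQuoteAware are hypothetical stand-ins, and the quote-aware rule (strip surrounding double quotes and keep case, otherwise upper-case) is an assumption about how SchemaUtil.normalizeIdentifier behaves, not a copy of it. The available column list is taken from the error message in the issue.

```java
import java.util.Arrays;
import java.util.List;

public class IdentifierNormalizationSketch {

    // Hypothetical stand-in for the behavior described in the issue:
    // the name is trimmed and upper-cased whether or not it is quoted.
    static String upperCaseOnly(String columnName) {
        return columnName.trim().toUpperCase();
    }

    // Hypothetical stand-in for quote-aware normalization (assumed rule):
    // a name wrapped in double quotes keeps its case and loses the quotes,
    // any other name is upper-cased.
    static String normalizeQuoteAware(String columnName) {
        String name = columnName.trim();
        boolean quoted = name.length() >= 2
                && name.startsWith("\"") && name.endsWith("\"");
        return quoted ? name.substring(1, name.length() - 1) : name.toUpperCase();
    }

    public static void main(String[] args) {
        // Columns reported as available in the issue's error message.
        List<String> available = Arrays.asList("_SALT", "id");

        // Upper-casing turns "id" into "ID", which is not in the list.
        System.out.println(available.contains(upperCaseOnly("id")));            // false
        // A quoted name keeps its case and resolves.
        System.out.println(available.contains(normalizeQuoteAware("\"id\"")));  // true
        // An unquoted name is still upper-cased and does not resolve.
        System.out.println(available.contains(normalizeQuoteAware("id")));      // false
    }
}
```

Under this assumption, a caller who wants the lower-case column would pass the quoted form, for example setSelectColumns("\"id\""), while unquoted names would keep resolving to upper case as before.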

I couldn't run the Pig tests to verify that it doesn't break anything, 
though. I am getting this error:

Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:3.0:compile (default-compile) on 
project phoenix-core: Fatal error compiling: invalid target release: 1.7 -> 
[Help 1]

[~jamestaylor]

> Problem referencing lower-case column names with Phoenix / Pig / Spark
> ----------------------------------------------------------------------
>
>                 Key: PHOENIX-1405
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1405
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 3.2
>            Reporter: Robert Roland
>         Attachments: PHOENIX-1405.patch, PHOENIX-1405_V2.patch
>
>
> Given the following table definition:
> {noformat}
> CREATE TABLE "mytable" (
>   "id" VARCHAR NOT NULL
>   CONSTRAINT pk PRIMARY KEY ("id")
> ) SALT_BUCKETS=16
> {noformat}
> And the following code setting up a PhoenixPigConfiguration:
> {noformat}
> val phoenixConf = new PhoenixPigConfiguration(new Configuration())
> phoenixConf.setSelectStatement("SELECT \"id\" FROM \"mytable\"")
> phoenixConf.setSelectColumns("id")
> phoenixConf.setSchemaType(SchemaType.QUERY)
> phoenixConf.configure("127.0.0.1", "\"mytable\"", 100)
> val phoenixRDD = sc.newAPIHadoopRDD(phoenixConf.getConfiguration,
>   classOf[PhoenixInputFormat],
>   classOf[NullWritable],
>   classOf[PhoenixRecord])
> {noformat}
> The above seems to work, but when I later call 
> phoenixConf.getSelectColumnMetadataList, I get the following error:
> {noformat}
>   java.sql.SQLException: Unable to resolve these column names:
> id
> Available columns with column families:
> _SALT,id
>   at org.apache.phoenix.util.PhoenixRuntime.generateColumnInfo(PhoenixRuntime.java:354)
>   at org.apache.phoenix.pig.PhoenixPigConfiguration$PhoenixPigConfigurationUtil.getSelectColumnMetadataList(PhoenixPigConfiguration.java:269)
>   at org.apache.phoenix.pig.PhoenixPigConfiguration.getSelectColumnMetadataList(PhoenixPigConfiguration.java:157)
>   at com.simplymeasured.spark.PhoenixRDD.toSchemaRDD(PhoenixRDD.scala:52)
>   at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply$mcV$sp(PhoenixRDDTest.scala:35)
>   at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply(PhoenixRDDTest.scala:31)
>   at com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply(PhoenixRDDTest.scala:31)
>   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> {noformat}
> Looking at PhoenixRuntime, getColumnInfo() performs a trim().toUpperCase() 
> on the column name, which doesn't seem valid for lower-case column names: 
> https://github.com/apache/phoenix/blob/3.0/phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java#L374
> I'm attempting to use this from within Spark, and I would like to rely on 
> getSelectColumnMetadataList to build a Schema RDD.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
