[jira] [Updated] (PHOENIX-1405) Problem referencing lower-case column names with Phoenix / Pig / Spark

James Taylor (JIRA) Fri, 07 Nov 2014 00:25:15 -0800

     [ 
https://issues.apache.org/jira/browse/PHOENIX-1405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


James Taylor updated PHOENIX-1405:
----------------------------------
    Attachment: PHOENIX-1405_V3.patch

[~robertroland] - would you mind trying this patch? The tests are in the 
phoenix-pig module. Rather than having the utility function do any 
normalization, the caller should do it. In your case, you'd want to surround 
any case-sensitive column names with double quotes. For example, for the id 
column, I think you'd want to do this:
{code}
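import org.apache.hadoop.conf.Configuration
import org.apache.phoenix.pig.PhoenixPigConfiguration

val phoenixConf = new PhoenixPigConfiguration(new Configuration())
phoenixConf.setSelectStatement("SELECT \"id\" FROM \"mytable\"")
// Quote the column name so that its lower case is preserved.
phoenixConf.setSelectColumns("\"id\"")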
{code}
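
To illustrate why the quotes matter, here is a minimal Scala sketch of SQL-style identifier normalization: a double-quoted name keeps its case once the quotes are stripped, while an unquoted name is folded to upper case. The object IdentifierSketch and its methods are hypothetical names invented for this illustration; they are not part of Phoenix, and the patch itself may behave differently.

```scala
import java.util.Locale

// Hypothetical helper, not Phoenix code: a sketch of SQL-style
// identifier normalization under the assumptions stated above.
object IdentifierSketch {
  /** True when the name is wrapped in double quotes, i.e. case-sensitive. */
  def isQuoted(name: String): Boolean =
    name.length >= 2 && name.startsWith("\"") && name.endsWith("\"")

  /** Quoted names keep their case (quotes stripped); unquoted names are upper-cased. */
  def normalize(name: String): String = {
    val trimmed = name.trim
    if (isQuoted(trimmed)) trimmed.substring(1, trimmed.length - 1)
    else trimmed.toUpperCase(Locale.ROOT)
  }

  /** Caller-side quoting for a column whose case must be preserved. */
  def quote(name: String): String = "\"" + name + "\""
}
```

Under these assumptions, IdentifierSketch.normalize("\"id\"") yields id, which matches the lower-case column, whereas IdentifierSketch.normalize("id") yields ID, which does not.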


> Problem referencing lower-case column names with Phoenix / Pig / Spark
> ----------------------------------------------------------------------
>
>                 Key: PHOENIX-1405
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1405
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 3.2
>            Reporter: Robert Roland
>         Attachments: PHOENIX-1405.patch, PHOENIX-1405_V2.patch, 
> PHOENIX-1405_V3.patch
>
>
> Given the following table definition:
> {noformat}
> CREATE TABLE "mytable" (
>   "id" VARCHAR NOT NULL
>   CONSTRAINT pk PRIMARY KEY ("id")
> ) SALT_BUCKETS=16
> {noformat}
> And the following code setting up a PhoenixPigConfiguration:
> {noformat}
> val phoenixConf = new PhoenixPigConfiguration(new Configuration())
> phoenixConf.setSelectStatement("SELECT \"id\" FROM \"mytable\"")
> phoenixConf.setSelectColumns("id")
> phoenixConf.setSchemaType(SchemaType.QUERY)
> phoenixConf.configure("127.0.0.1", "\"mytable\"", 100)
> val phoenixRDD = sc.newAPIHadoopRDD(phoenixConf.getConfiguration,
>   classOf[PhoenixInputFormat],
>   classOf[NullWritable],
>   classOf[PhoenixRecord])
> {noformat}
> The above seems to work, but when I later call 
> phoenixConf.getSelectColumnMetadataList, I get the following error:
> {noformat}
>   java.sql.SQLException: Unable to resolve these column names:
> id
> Available columns with column families:
> _SALT,id
>   at 
> org.apache.phoenix.util.PhoenixRuntime.generateColumnInfo(PhoenixRuntime.java:354)
>   at 
> org.apache.phoenix.pig.PhoenixPigConfiguration$PhoenixPigConfigurationUtil.getSelectColumnMetadataList(PhoenixPigConfiguration.java:269)
>   at 
> org.apache.phoenix.pig.PhoenixPigConfiguration.getSelectColumnMetadataList(PhoenixPigConfiguration.java:157)
>   at com.simplymeasured.spark.PhoenixRDD.toSchemaRDD(PhoenixRDD.scala:52)
>   at 
> com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply$mcV$sp(PhoenixRDDTest.scala:35)
>   at 
> com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply(PhoenixRDDTest.scala:31)
>   at 
> com.simplymeasured.spark.PhoenixRDDTest$$anonfun$3.apply(PhoenixRDDTest.scala:31)
>   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
> {noformat}
> Looking at PhoenixRuntime, within getColumnInfo(), it's performing a 
> trim().toUpperCase(), which doesn't seem valid: 
> https://github.com/apache/phoenix/blob/3.0/phoenix-core/src/main/java/org/apache/phoenix/util/PhoenixRuntime.java#L374
> I'm attempting to use this from within Spark, and I would like to rely on 
> getSelectColumnMetadataList to build a Schema RDD.
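
The mismatch described in the report can be reproduced with a toy lookup. This is only a sketch: ColumnLookupSketch is a hypothetical name, the column list is copied from the error message, and the exact-match lookup is an assumption about how resolution works rather than Phoenix's actual code.

```scala
import java.util.Locale

// Hypothetical model of the reported behaviour, not Phoenix code.
object ColumnLookupSketch {
  // Column names as listed in the error message.
  val available: Seq[String] = Seq("_SALT", "id")

  /** Trims and upper-cases the requested name before an exact-match lookup. */
  def resolveUpperCased(requested: String): Option[String] = {
    val key = requested.trim.toUpperCase(Locale.ROOT)
    available.find(_ == key)
  }

  /** Exact-match lookup without case folding, for comparison. */
  def resolveExact(requested: String): Option[String] =
    available.find(_ == requested.trim)
}
```

In this model, resolveUpperCased("id") finds nothing because the key becomes ID, while resolveExact("id") finds the column. An upper-case name such as _SALT resolves either way, which is why the problem only shows up for lower-case columns.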



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)