Re: UUID coming as int while using SPARK SQL

Laing, Michael Tue, 24 May 2016 03:41:11 -0700

Try converting that int from decimal to hex and inserting dashes in the
appropriate spots - or go the other way.
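
A minimal sketch of that conversion in Python, assuming the int Spark hands
back is simply the 128-bit value of the UUID. The helper names below are
made up for illustration; only the standard-library uuid module is used.
Because the UUID in your mail looks anonymized, the output will not
necessarily match it character for character.

```python
import uuid


def int_to_uuid(value):
    """Interpret a 128-bit integer as a UUID object."""
    return uuid.UUID(int=value)


def int_to_uuid_manual(value):
    """Same conversion by hand: decimal -> 32 hex digits -> 8-4-4-4-12 dashes."""
    hex_digits = format(value, "032x")  # zero-padded, lower-case hex
    parts = [
        hex_digits[0:8],
        hex_digits[8:12],
        hex_digits[12:16],
        hex_digits[16:20],
        hex_digits[20:32],
    ]
    return "-".join(parts)


def uuid_to_int(text):
    """Go the other way: dashed UUID string -> integer."""
    return uuid.UUID(text).int


# The value printed by the Spark job in the quoted mail.
spark_id = 293946894141093607334963674332192894528

print(int_to_uuid(spark_id))
print(int_to_uuid_manual(spark_id))
```

Both functions should print the same dashed string; uuid_to_int() turns a
dashed UUID back into the integer if you would rather compare in that
direction.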


Also, you are looking at different rows: the CQL query filters on
dept='blah', while the Spark query filters on workflow='testWK'.

ml

On Tue, May 24, 2016 at 6:23 AM, Rajesh Radhakrishnan <
rajesh.radhakrish...@phe.gov.uk> wrote:

> Hi,
>
>
> I have a Cassandra keyspace, but reading the data (especially the UUID)
> via Spark SQL using Python does not return the correct value.
>
> Cassandra:
> --------------
> My table 'SAM' is described below:
>
> CREATE TABLE ks.sam (id uuid, dept text, workflow text, type double,
> PRIMARY KEY (id, dept));
>
> SELECT id, workflow FROM sam WHERE dept='blah';
>
> The example CQL above gives me the following:
> id                                   | workflow
> --------------------------------------+------------
>  9547v26c-f528-12e5-da8b-001a4q3dac10 |       testWK
>
>
> Spark/Python:
> ------------------
> from pyspark import SparkConf
> from pyspark.sql import SQLContext
> import pyspark_cassandra
> from pyspark_cassandra import CassandraSparkContext
>
> ....
> conf = (SparkConf()
>         .set("spark.cassandra.connection.host", IP_ADDRESS)
>         .set("spark.cassandra.connection.native.port", PORT_NUMBER))
> sparkContext = CassandraSparkContext(conf = conf)
> sqlContext = SQLContext(sparkContext)
>
> samTable = sparkContext.cassandraTable("ks", "sam").select('id', 'dept', 'workflow')
> samTable.cache()
>
> samdf.registerTempTable("samd")
>
> sparkSQLl = "SELECT DISTINCT id, dept, workflow FROM samd WHERE workflow='testWK'"
> new_df = sqlContext.sql(sparkSQLl)
> results = new_df.collect()
> for row in results:
>     print "dept=", row.dept
>     print "wk=", row.workflow
>     print "id=", row.id
> ...
> The Python code above prints the following:
> dept=Biology
> wk=testWK
> id=293946894141093607334963674332192894528
>
>
> You can see that the id (uuid), whose correct value in Cassandra is
> '9547v26c-f528-12e5-da8b-001a4q3dac10', comes back via Spark as the int
> '293946894141093607334963674332192894528'.
> What am I doing wrong here? How do I get the correct UUID value from
> Cassandra via Spark/Python? Please help me.
>
> Thank you
> Rajesh R
>
