Hi!
I have some data in a table created using thrift. In cassandra-cli, the
'show schema' output for this table is:
create column family Users
with column_type = 'Standard'
and comparator = 'AsciiType'
and default_validation_class = 'UTF8Type'
and key_validation_class = 'LexicalUUIDType'
and column_metadata = [
{column_name : 'date_created',
validation_class : LongType},
{column_name : 'active',
validation_class : IntegerType,
index_name : 'Users_active_idx_1',
index_type : 0},
{column_name : 'email',
validation_class : UTF8Type,
index_name : 'Users_email_idx_1',
index_type : 0},
{column_name : 'username',
validation_class : UTF8Type,
index_name : 'Users_username_idx_1',
index_type : 0},
{column_name : 'default_account_id',
validation_class : LexicalUUIDType}];
From cqlsh, it looks like this:
[cqlsh 4.1.1 | Cassandra 2.0.11 | CQL spec 3.1.1 | Thrift protocol 19.39.0]
Use HELP for help.
cqlsh:test> describe table Users;
CREATE TABLE "Users" (
key 'org.apache.cassandra.db.marshal.LexicalUUIDType',
column1 ascii,
active varint,
date_created bigint,
default_account_id 'org.apache.cassandra.db.marshal.LexicalUUIDType',
email text,
username text,
value text,
PRIMARY KEY ((key), column1)
) WITH COMPACT STORAGE;
CREATE INDEX Users_active_idx_12 ON "Users" (active);
CREATE INDEX Users_email_idx_12 ON "Users" (email);
CREATE INDEX Users_username_idx_12 ON "Users" (username);
Now, when I try to extract data from this table using cqlsh or the
python-driver, I have no problems getting data for the columns which are
actually UTF8, but for those whose validation_class in column_metadata
is set to something else, there's trouble. Example using the python driver:
-- snip --
In [8]: u = uuid.UUID("a6b07340-047c-4d4c-9a02-1b59eabf611c")
In [9]: sess.execute('SELECT column1,value from "Users" where key = %s and column1 = %s', [u, 'username'])
Out[9]: [Row(column1='username', value=u'uc6vf')]
In [10]: sess.execute('SELECT column1,value from "Users" where key = %s and column1 = %s', [u, 'date_created'])
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-10-d06f98a160e1> in <module>()
----> 1 sess.execute('SELECT column1,value from "Users" where key = %s and column1 = %s', [u, 'date_created'])
/home/forsberg/dev/virtualenvs/ospapi/local/lib/python2.7/site-packages/cassandra/cluster.pyc in execute(self, query, parameters, timeout, trace)
1279 future = self.execute_async(query, parameters, trace)
1280 try:
-> 1281 result = future.result(timeout)
1282 finally:
1283 if trace:
/home/forsberg/dev/virtualenvs/ospapi/local/lib/python2.7/site-packages/cassandra/cluster.pyc in result(self, timeout)
2742 return PagedResult(self, self._final_result)
2743 elif self._final_exception:
-> 2744 raise self._final_exception
2745 else:
2746 raise OperationTimedOut(errors=self._errors, last_host=self._current_host)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf3 in position 6: unexpected end of data
-- snap --
cqlsh gives me similar errors.
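My guess at what is going on, as a sketch rather than a confirmed diagnosis: CQL sees "value" as text (from default_validation_class), so the 8 raw bytes of a LongType cell end up being decoded as UTF-8. The timestamp below is made up purely for illustration (chosen so that byte 6 is 0xf3); it is not data from my table.

```python
import struct

# Hypothetical millisecond timestamp, NOT taken from the real table.
made_up_millis = 0x0000014A1234F380

# A LongType cell is 8 bytes, big-endian, signed.
raw = struct.pack(">q", made_up_millis)

try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# 0xf3 opens a 4-byte UTF-8 sequence, but only one more byte follows it,
# so the decoder runs out of data at position 6.
print(error)
```

If that is right, the failure depends on the stored number, not on the query itself.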
Can I tell the python driver to parse some column values as integers, or
is this an unsupported case?
For sure this is an ugly table, but I have data in it, and I would like
to avoid having to rewrite all my tools at once, so if I could support
it from CQL that would be great.
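If there turns out to be a way to get at the undecoded bytes of "value", decoding them on the client side looks simple enough. A minimal sketch, assuming LongType is an 8-byte big-endian signed integer and IntegerType is a variable-length big-endian two's-complement integer. The names decode_long and decode_varint are ones I made up here, not python-driver APIs.

```python
import binascii
import struct


def decode_long(raw):
    """Decode a LongType cell: 8 bytes, big-endian, signed."""
    if len(raw) != 8:
        raise ValueError("expected 8 bytes, got %d" % len(raw))
    return struct.unpack(">q", raw)[0]


def decode_varint(raw):
    """Decode an IntegerType cell: big-endian two's complement, any length."""
    if not raw:
        raise ValueError("empty value")
    value = int(binascii.hexlify(raw), 16)
    # A set high bit in the first byte means the number is negative.
    if bytearray(raw)[0] & 0x80:
        value -= 1 << (8 * len(raw))
    return value
```

With these, decode_varint(b"\x01") gives 1 (what I would expect for an "active" flag) and decode_long on an 8-byte cell gives the date_created number. The part I am missing is how to make the driver hand me the raw bytes in the first place.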
Regards,
\EF