Hi everyone, I have a Cassandra super column family (SCF) in which each super column's name is a TimeUUID assigned dynamically when that super column is inserted into the database:
    create column family CF with key_validation_class = UTF8Type and comparator = TimeUUIDType and subcomparator = UTF8Type and column_type = 'Super';

Now, I'm trying to write a Pig script that would automatically calculate the number of new super columns added to the database during a specified period of time (let's say, in the last hour). For that, I thought it would be nice to be able to do something along the lines of:

    last_hour_data = LOAD 'cassandra://Keyspace/ColumnFamily&slice_start=Time(one hour ago)&slice_end=Time(now)' USING CassandraStorage()...

However,

1) I'm not sure what the real syntax for "Time(one hour ago)" and "Time(now)" is (so that those times get translated into TimeUUIDs that Cassandra understands), and

2) The LOAD line above, which I took from the bottom of http://svn.apache.org/repos/asf/cassandra/trunk/contrib/pig/README.txt, produces an error: it treats 'CF&slice_start...' as one gigantic column family name (which of course does not exist).

Alternatively, I could try selecting my specified range of columns in Pig after loading the whole database. But looking at the data, the super column names look like 'S.?,uF? ?B#q' or ' ??VuI??-gFd?' instead of "normal-looking" UUIDs like '275564bc4f52f81573b4cfe0ea615ae0', even when I try to load the super column names as chararrays. I'm thinking it's because the raw binary form of a UUID differs from its string representation, but is there a way to load it into Pig the "normal-looking" way?

Thank you in advance for your time!

Dan F.
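
P.S. To make the last question concrete, here is a minimal sketch in Python (not Pig) of the conversion I have in mind. It assumes the super column names arrive as the raw 16-byte form of a version 1 (time-based) UUID, which is my guess at why they print as garbage. The helper names raw_to_uuid and uuid1_to_datetime are my own and are not part of Pig or Cassandra.

```python
import uuid
from datetime import datetime, timedelta, timezone

# Number of 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def raw_to_uuid(raw: bytes) -> uuid.UUID:
    """Turn the 16 raw bytes of a UUID into a uuid.UUID (str() gives the readable form)."""
    if len(raw) != 16:
        raise ValueError("expected exactly 16 bytes, got %d" % len(raw))
    return uuid.UUID(bytes=raw)


def uuid1_to_datetime(u: uuid.UUID) -> datetime:
    """Recover the UTC timestamp embedded in a version 1 (time-based) UUID."""
    if u.version != 1:
        raise ValueError("not a version 1 (time-based) UUID")
    ticks_since_unix_epoch = u.time - UUID_EPOCH_OFFSET  # still in 100 ns units
    unix_epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return unix_epoch + timedelta(microseconds=ticks_since_unix_epoch // 10)


if __name__ == "__main__":
    sample = uuid.uuid1()          # stand-in for a super column name
    raw = sample.bytes             # the unreadable 16-byte form
    readable = raw_to_uuid(raw)    # the "normal-looking" form
    created = uuid1_to_datetime(readable)
    one_hour_ago = datetime.now(timezone.utc) - timedelta(hours=1)
    print(readable, created, created >= one_hour_ago)
```

If the names really are raw TimeUUID bytes, the same idea (decode the bytes, read the embedded timestamp, compare it against "one hour ago") is what I would like to express in Pig, whether through slice parameters or a filter after loading.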
