Hi,
I have the following table structure in Cassandra 1.2.1:

* TimeStamp
* MACAddress
* Data Transfer
* LocationID
* PRIMARY KEY (TimeStamp, MACAddress) // composite key, partitioned on TimeStamp

There are close to 500K distinct MAC addresses and 10K timestamps, so about 5 billion records in total. Each record is 50 bytes, so the total data size is about 250 GB. All of this data is stored on a 4-node cluster with no replication.

Here are my requirements:

* I want to retrieve all the records for a particular timestamp quickly (say, in < 10 seconds). With the numbers above, that means retrieving about 500K entries, or around 25 MB of data. I think this is only possible if the table is partitioned on TimeStamp.
* I want to fetch all the data for a particular MAC address in real time (say, in < 5 seconds). With the numbers above, there are 10K entries (~0.5 MB) per MAC address.

While the first query runs within an acceptable time, the second query takes a long time to return. One way to improve this would be to add an index on MACAddress. However, I found that if the primary key is (TimeStamp, MACAddress), it is not possible to create an index on MACAddress.

Keeping the above requirements in mind, is there anything I could do in Cassandra to satisfy both?

Thanks,
Pushkar
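
P.S. In case it helps to check my numbers, here is a minimal Python sketch of the sizing arithmetic above. The variable names are only for illustration, and I'm assuming decimal units (1 MB = 10^6 bytes) and a flat 50 bytes per record, ignoring any storage overhead Cassandra adds on disk.

```python
# Back-of-the-envelope sizing for the table described above.
# Assumptions (not measured): decimal units, a flat 50 bytes per record,
# and no per-row or per-column storage overhead.
NUM_MACS = 500_000          # distinct MAC addresses
NUM_TIMESTAMPS = 10_000     # distinct timestamps
RECORD_BYTES = 50           # approximate size of one record

total_records = NUM_MACS * NUM_TIMESTAMPS
total_bytes = total_records * RECORD_BYTES

# Query 1: every record for one timestamp (one entry per MAC address).
per_timestamp_bytes = NUM_MACS * RECORD_BYTES

# Query 2: every record for one MAC address (one entry per timestamp).
per_mac_bytes = NUM_TIMESTAMPS * RECORD_BYTES

print(f"total records:      {total_records:,}")
print(f"total size:         {total_bytes / 1e9:.0f} GB")
print(f"per-timestamp read: {per_timestamp_bytes / 1e6:.0f} MB")
print(f"per-MAC read:       {per_mac_bytes / 1e6:.1f} MB")
```

So the per-MAC query returns far less data than the per-timestamp one, yet with the current key those 10K entries are spread across all 10K timestamp partitions.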