NULL values
Hi, How does Cassandra handle NULL values? I want to know how I can see rows where a certain column has no values. For example if I set the TTL for columns is it possible to select rows where the ttl has expired for deletion. Regards Hans-Peter Dit bericht is vertrouwelijk en kan geheime informatie bevatten enkel bestemd voor de geadresseerde. Indien dit bericht niet voor u is bestemd, verzoeken wij u dit onmiddellijk aan ons te melden en het bericht te vernietigen. Aangezien de integriteit van het bericht niet veilig gesteld is middels verzending via internet, kan Atos Nederland B.V. niet aansprakelijk worden gehouden voor de inhoud daarvan. Hoewel wij ons inspannen een virusvrij netwerk te hanteren, geven wij geen enkele garantie dat dit bericht virusvrij is, noch aanvaarden wij enige aansprakelijkheid voor de mogelijke aanwezigheid van een virus in dit bericht. Op al onze rechtsverhoudingen, aanbiedingen en overeenkomsten waaronder Atos Nederland B.V. goederen en/of diensten levert zijn met uitsluiting van alle andere voorwaarden de Leveringsvoorwaarden van Atos Nederland B.V. van toepassing. Deze worden u op aanvraag direct kosteloos toegezonden. This e-mail and the documents attached are confidential and intended solely for the addressee; it may also be privileged. If you receive this e-mail in error, please notify the sender immediately and destroy it. As its integrity cannot be secured on the Internet, the Atos Nederland B.V. group liability cannot be triggered for the message content. Although the sender endeavours to maintain a computer virus-free network, the sender does not warrant that this transmission is virus-free and will not be liable for any damages resulting from any virus transmitted. On all offers and agreements under which Atos Nederland B.V. supplies goods and/or services of whatever nature, the Terms of Delivery from Atos Nederland B.V. exclusively apply. The Terms of Delivery shall be promptly submitted to you on your request. 
Atos Nederland B.V. / Utrecht KvK Utrecht 30132762
Re: NULL values
Cassandra (C*) has no NULL values. C* is schemaless at the column level, meaning each row of the same ColumnFamily (CF) can have different columns. So if you want to check whether a certain column is NULL for a row, you just check whether it exists. By the way, you can store a column with a name and no value (an empty value). This empty value doesn't take any disk space AFAIK. As for TTLs, their point is precisely to keep columns for a predefined time. C* deletes them without any action needed from the client; it's internal work. Alain
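The "check whether it exists" idea can be illustrated with a toy in-memory model. This is only a sketch: the dictionary layout, the row keys, and the `rows_missing` helper are hypothetical and are not a Cassandra client API. The check runs on the client side, over rows that have already been read.

```python
# Toy model of a column family: row key -> {column name: value}.
# Rows in the same column family may carry different columns.
rows = {
    "user1": {"name": "Ann", "email": "ann@example.org"},
    "user2": {"name": "Bob"},               # no "email" column at all
    "user3": {"name": "Eve", "email": ""},  # column present, empty value
}


def rows_missing(rows, column):
    """Return the keys of rows that do not contain the given column."""
    return [key for key, columns in rows.items() if column not in columns]


# "NULL" here simply means the column is absent from the row.
missing = rows_missing(rows, "email")
```

Note that an empty value (user3) still counts as an existing column in this model; only user2 lacks the column.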
Re: NULL values
You may also be interested in this: https://issues.apache.org/jira/browse/CASSANDRA-3783 Also, I *guess* that a deleted column takes some (very small?) space with its tombstone, until it's removed. But I leave the details to someone else, as I'm definitely not an expert. -- Marco Matarazzo
RE: NULL values
But how do you check whether it exists? Can I select rows from a column family which do not have a given column set? What if I have set a TTL on all columns? After the expiration everything will be removed except the key. How can I determine the keys that have no additional columns and delete them?
Re: NULL values
Cassandra does not store row keys separately from their columns; once all of the columns in a row have been deleted, there is nothing left to delete. The row may still appear in some queries (with no columns) until the tombstones for those columns have been removed, which occurs during compaction once gc_grace_seconds has passed.
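The lifecycle described above can be sketched as a small state function. This is a simplified model, not Cassandra's implementation: the `Column` class and `column_state` function are hypothetical names, and the model assumes the grace period starts counting when the TTL expires.

```python
from dataclasses import dataclass


@dataclass
class Column:
    written_at: int  # write time, in seconds
    ttl: int         # time to live, in seconds


def column_state(column, now, gc_grace_seconds):
    """Classify a TTL column as live, tombstone, or purgeable (simplified)."""
    expires_at = column.written_at + column.ttl
    if now < expires_at:
        return "live"
    if now < expires_at + gc_grace_seconds:
        # Expired, but the tombstone must be kept until the grace period ends.
        return "tombstone"
    # Eligible for removal the next time compaction processes it.
    return "purgeable"


GC_GRACE_SECONDS = 864000  # 10 days
col = Column(written_at=0, ttl=60)
states = [column_state(col, t, GC_GRACE_SECONDS) for t in (30, 60, 60 + 864000)]
```

In this model a row whose columns are all in the "tombstone" state is the case where the key can still show up in a query with no columns.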
Re: NULL values
On 27.02.2013 10:57, Marco Matarazzo wrote:
> You may also be interested in this: https://issues.apache.org/jira/browse/CASSANDRA-3783
CASSANDRA-3783 might not apply here. The question is about using null in SELECT statements, which would require modifications to the secondary index code. That is unlikely to be done now or soon (see Sylvain's comments in CASSANDRA-3783 and CASSANDRA-5081) and will probably be left to a follow-up ticket, while CASSANDRA-3783 will focus on INSERT/UPDATE statements (which would be very useful together with prepared statements [CASSANDRA-5081]). M.
range queries
Hello, I have what is perhaps a silly question. Column family other2 has a varchar as primary key and a uuid column. I have inserted 2000 rows. All row keys start with 'nl' followed by other characters. To my surprise, when I do:
select count(*) from other2 where key 'z';
It shows:
count
---
1947
All rows start with a character smaller than 1. But it becomes even more strange:
cqlsh:demo> select count(*) from other2 where key 'zz';
count
---
1415
cqlsh:demo> select count(*) from other2 where key 'zzz';
count
---
1820
-- now the row count has even increased. What am I doing wrong here? Regards
Re: range queries
> What am I doing wrong here?
You are probably using RandomPartitioner (or Murmur3Partitioner), which randomizes key placement to avoid hot spots. Basically, you can't run a meaningful range query over row keys, because rows are ordered by a hash of the key (md5('nlxx') in the case of RandomPartitioner) rather than by 'nlxx' itself. You would do better to change your model to use column slices, which are ordered. Another solution, which is not recommended at all because it leads to hot spots, is to use OrderPreservingPartitioner. But once again, I think you shouldn't do it. I don't have time to go deeper into the explanation, but with what I've told you, you should be able to find more details yourself if needed. Alain
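The effect of hash ordering on a key range can be sketched in a few lines. This is an illustration under stated assumptions, not Cassandra code: `md5_token` is a hypothetical stand-in for how a RandomPartitioner-style token is derived, the generated `nl...` keys are made up, and the comparison operator is chosen only for the example.

```python
import hashlib


def md5_token(key):
    """Hypothetical token: the MD5 digest of the key, read as an integer."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def count_after(keys, bound, order):
    """Count keys that sort after `bound` under the given ordering function."""
    bound_position = order(bound)
    return sum(1 for key in keys if order(key) > bound_position)


# 2000 made-up keys that all start with 'nl'.
keys = [f"nl{i:04d}" for i in range(2000)]

# Ordered by the key itself: nothing starting with 'nl' sorts after 'z'.
lexical_count = count_after(keys, "z", order=lambda key: key)

# Ordered by token: the result depends only on where md5('z') happens to fall.
token_count = count_after(keys, "z", order=md5_token)
```

Comparing by token makes the count depend on the hash of the bound, which is why changing the bound from 'zz' to 'zzz' can move the count in either direction.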
Re: range queries
Things you can find by searching the web: http://wiki.apache.org/cassandra/DataModel#Range_queries
Re: Understanding system.log
Hi Victor. AFAIK, there is nothing like that. But I am quite sure that if you don't understand a log entry, someone on this mailing list will be able to help you with it. Alain
2013/2/25 Víctor Hugo Oliveira Molinar wrote:
> Hello everyone! I'd like to know if there is any guide or description of the Cassandra server log (system.log). I mean, how should I interpret each log event, and what information can I get from it?
misreports on nodetool ring command on 1.1.4
I just installed 1.1.4 as I need to test upgrade to 1.2.2. I have an existing 6 node cluster which shows 50% ownership on each node which makes sense since RF=3 on everything I have. I brought up all 4 nodes in this cluster and ran nodetool ring and it shows every node with 25%. Then, I create a keyspace and run again and it shows every node at 0%. Exact output was the following…..

[cassandra@sdi-prod-04 ~]$ nodetool ring
Note: Ownership information does not include topology, please specify a keyspace.
Address     DC   Rack  Status  State   Load      Owns    Token
                                                         127605887595351923798765477786913079295
10.20.5.82  DC1  RAC1  Up      Normal  11.37 KB  25.00%  0
10.20.5.83  DC1  RAC1  Up      Normal  11.14 KB  25.00%  42535295865117307932921825928971026431
10.20.5.84  DC1  RAC1  Up      Normal  11.14 KB  25.00%  85070591730234615865843651857942052863
10.20.5.85  DC1  RAC1  Up      Normal  15.65 KB  25.00%  127605887595351923798765477786913079295

After I created the keyspace, ran it again….

[cassandra@sdi-prod-04 ~]$ nodetool ring
Address     DC   Rack  Status  State   Load      Effective-Ownership  Token
                                                                      127605887595351923798765477786913079295
10.20.5.82  DC1  RAC1  Up      Normal  15.95 KB  0.00%                0
10.20.5.83  DC1  RAC1  Up      Normal  15.73 KB  0.00%                42535295865117307932921825928971026431
10.20.5.84  DC1  RAC1  Up      Normal  15.73 KB  0.00%                85070591730234615865843651857942052863
10.20.5.85  DC1  RAC1  Up      Normal  20.24 KB  0.00%                127605887595351923798765477786913079295

Thanks, Dean
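For reference, the ownership figures one would expect from these tokens can be worked out by hand. The sketch below is an assumption-laden model, not what nodetool runs: it assumes a RandomPartitioner token space of 2**127 and a simple replica placement where each range is also stored on the next RF-1 nodes around the ring. Both function names are hypothetical.

```python
RING_SIZE = 2 ** 127  # assumed RandomPartitioner token space


def raw_ownership(tokens):
    """Share of the ring per token: the range (previous token, token]."""
    ordered = sorted(tokens)
    shares = {}
    for i, token in enumerate(ordered):
        previous = ordered[i - 1]  # index -1 wraps around to the last token
        width = (token - previous) % RING_SIZE or RING_SIZE
        shares[token] = width / RING_SIZE
    return shares


def effective_ownership(tokens, replication_factor):
    """Share of data per node when each range lives on RF consecutive nodes."""
    ordered = sorted(tokens)
    raw = raw_ownership(ordered)
    n = len(ordered)
    copies = min(replication_factor, n)
    return {
        token: sum(raw[ordered[(i - k) % n]] for k in range(copies))
        for i, token in enumerate(ordered)
    }


tokens = [
    0,
    42535295865117307932921825928971026431,
    85070591730234615865843651857942052863,
    127605887595351923798765477786913079295,
]
raw_percent = [round(100 * s, 2) for s in raw_ownership(tokens).values()]
effective_percent = [round(100 * s, 2) for s in effective_ownership(tokens, 3).values()]
```

Under these assumptions the four tokens give 25% raw ownership each, matching the first output, and RF=3 would give 75% effective ownership per node rather than the 0.00% reported. The same arithmetic gives 50% per node for six evenly spaced tokens with RF=3.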
System.log
Cassandra rotates system.log when it reaches 20MB. We see that old logs are kept for over a month. Is Cassandra going to delete or compress these logs when a certain threshold is reached or are we supposed to do it ourselves?
Re: System.log
The conf/ directory of cassandra contains log4j-server.properties. I would assume cassandra just rides on top of whatever log4j does. -- Akshay On Wednesday, February 27, 2013 at 12:44 PM, Andy Stec wrote: Cassandra rotates system.log when it reaches 20MB. We see that old logs are kept for over a month. Is Cassandra going to delete or compress these logs when a certain threshold is reached or are we supposed to do it ourselves?
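If log4j is left to rotate the files, log4j 1.2's RollingFileAppender has MaxFileSize and MaxBackupIndex properties that control how large each file gets and how many rotated files are kept. For anyone who would rather tidy up from cron, here is a minimal Python sketch. It assumes the rotated files sit next to system.log and are named system.log.1, system.log.2 and so on; the function name and the day thresholds are made up for illustration, not anything Cassandra ships.

```python
import gzip
import os
import shutil
import time


def tidy_rotated_logs(log_dir, compress_after_days=1, delete_after_days=30, now=None):
    """Compress, then eventually delete, rotated system.log.N files in log_dir.

    Hypothetical helper: the file naming and thresholds are assumptions.
    Returns a list of (action, filename) tuples describing what was done.
    """
    now = time.time() if now is None else now
    seconds_per_day = 86400
    actions = []
    for name in sorted(os.listdir(log_dir)):
        # Only touch rotated files; never the live system.log itself.
        if not name.startswith("system.log."):
            continue
        path = os.path.join(log_dir, name)
        age_days = (now - os.path.getmtime(path)) / seconds_per_day
        if age_days > delete_after_days:
            os.remove(path)
            actions.append(("deleted", name))
        elif age_days > compress_after_days and not name.endswith(".gz"):
            stat = os.stat(path)
            with open(path, "rb") as src, gzip.open(path + ".gz", "wb") as dst:
                shutil.copyfileobj(src, dst)
            # Carry the original mtime over so the age check still works later.
            os.utime(path + ".gz", (stat.st_atime, stat.st_mtime))
            os.remove(path)
            actions.append(("compressed", name))
    return actions
```

Calling tidy_rotated_logs("/var/log/cassandra") once a day would be one way to use it; the path is only an example.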
data model advice needed
Hi, I would like to get some advice on how to model column families for storing firewall logs. The columns are listed further below. All the possibilities confuse me a bit (super columns, secondary indexes, etc.). My main question is: how can I create the column family so that I can get slices of data by the timestamp column, combined with getting this data for a specific host or some other column? In SQL this would be:

select * from traffic where ts between ... and .. and host = 'xxx' and source_ip = 'xx.xx.xx.xx' and severity = 'xx'

Probably any combination can be useful (the slice/between and host are probably the most important). Hopefully someone can shed some light.

Regards, Hans-Peter

CREATE COLUMNFAMILY traffic (
  key uuid primary key,
  host varchar,
  facility varchar,
  priority varchar,
  severity varchar,
  tag varchar,
  ts timestamp,
  program varchar,
  msg varchar,
  protocol varchar,
  policy varchar,
  sourcezone varchar,
  sourceip varchar,
  destzone varchar,
  destip varchar,
  destport varchar
);
Re: data model advice needed
One possibility would be to use dynamic columns, with each column name being a composite made from a timestamp, and the value of each containing serialized json of the details. The host could be the key. Then you could slice the data by column name. Ken
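To make the wide-row idea concrete, here is a toy in-memory sketch in Python. It is not Cassandra client code: the class and method names are invented for illustration, and the composite column name is assumed to be (timestamp in milliseconds, sequence number), with the sequence number there only to keep two events in the same millisecond apart. It just shows why a time-range query for one host becomes a contiguous slice of sorted column names.

```python
import bisect
import json
from collections import defaultdict


class WideRowStore:
    """Toy stand-in for one column family: row key -> columns sorted by name."""

    def __init__(self):
        self._names = defaultdict(list)  # host -> sorted list of (ts_millis, seq)
        self._values = {}                # (host, (ts_millis, seq)) -> JSON string

    def insert(self, host, ts_millis, seq, details):
        """Store one log event as a column whose name is a composite."""
        name = (ts_millis, seq)
        if (host, name) not in self._values:
            bisect.insort(self._names[host], name)
        self._values[(host, name)] = json.dumps(details)

    def slice(self, host, start_ms, end_ms):
        """Return events for one host with start_ms <= ts <= end_ms, in order."""
        names = self._names.get(host, [])
        # (start_ms,) sorts before every (start_ms, seq), so this finds the
        # first column at or after the start of the range.
        lo = bisect.bisect_left(names, (start_ms,))
        hi = bisect.bisect_left(names, (end_ms + 1,))
        return [(n, json.loads(self._values[(host, n)])) for n in names[lo:hi]]


store = WideRowStore()
store.insert("fw1", 1000, 0, {"severity": "info"})
store.insert("fw1", 2000, 0, {"severity": "warn"})
store.insert("fw1", 2000, 1, {"severity": "crit"})
store.insert("fw1", 3000, 0, {"severity": "info"})
store.insert("fw2", 2000, 0, {"severity": "warn"})
recent = store.slice("fw1", 2000, 2500)
```

Note what this layout does not give you: filtering on severity or source IP still means reading the slice and filtering in the client, or keeping a second row keyed by that attribute.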
Re: data model advice needed
There are many different patterns in noSQL with 90% being different than an RDBMS. Check out this page for some things to get you thinking http://buffalosw.com/wiki/Patterns-Page/ If you ever consider playorm and you can figure out how to partition your data (perhaps by month), you can do queries into the partitions very easily on any of your data. The issue is though if you design wrong, you may have to query multiple partitions. You can do stuff like that with or without playorm. Playorm just does some heavy lifting to make your life easier in some cases. Later, Dean
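As a rough illustration of partitioning by month without any particular library, the Python sketch below builds a row key from the host and the calendar month, and lists which partitions a time-range query has to touch. This is not playorm's API; the key format "host:YYYY-MM" and both function names are assumptions made up for the example.

```python
from datetime import datetime


def partition_key(host, ts):
    """Row key for the partition holding events of one host in one month."""
    return f"{host}:{ts.year:04d}-{ts.month:02d}"


def partitions_for_range(host, start, end):
    """All partition keys a query between start and end (inclusive) must read."""
    keys = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        keys.append(f"{host}:{year:04d}-{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys


keys = partitions_for_range("fw1", datetime(2012, 12, 15), datetime(2013, 2, 27))
```

A query that stays inside one month reads a single partition; a query spanning several months has to read one partition per month, which is the "query multiple partitions" cost mentioned above.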
is upgradesstables required for 1.1.4 to 1.2.2? (I don't think it is)
My script to upgrade our first node in QA is thus (basically: snapshot, drain, stop, then switch over, then start):

#!/bin/bash
export NODE=$1
export VERSION=1.1.4
export USER=cassandra
# NOTE: This script requires you have cassandra 1.2.2 in /opt/cassandra-1.2.2 but
# feel free to modify if you like

# Move the newest cassandra.yaml to the node
scp cassandra.yaml $USER@$NODE:/opt/cassandra/conf

# As cassandra user, snapshot then drain the node
# and finally shut down cassandra on that node
ssh $USER@$NODE <<\EOF
nodetool snapshot $VERSION
nodetool drain
pkill -f 'java.*cassandra'
EOF

# Now, our .bashrc for cassandra has /opt/cassandra/bin in its path,
# so we unlink and then link to the new cassandra as root, since only root has
# access to the opt directory.
ssh root@$NODE <<\EOF
rm /opt/cassandra
ln -s /opt/cassandra-1.2.2 /opt/cassandra
EOF

# We should probably start cassandra ourselves so we can watch the cluster as it joins the node,
# especially for the very first node we do...
# Now as cassandra user, start up the cassandra node and then do manual health checks
#ssh $USER@$NODE <<\EOF
# cassandra
#EOF
Re: is upgradesstables required for 1.1.4 to 1.2.2? (I don't think it is)
Yes, it's required between majors. Which your upgrade would be. Copy, by Barracuda, helps you store, protect, and share all your amazing things. Start today: www.copy.com.
Re: is upgradesstables required for 1.1.4 to 1.2.2? (I don't think it is)
Hmmm, I have this info from Aaron, but what about bringing up version 1.2.2 with thrift off so I can run upgradesstables before I rejoin the ring? Quote from Aaron... In pre 1.2 add these jvm startup params -Dcassandra.join_ring=false -Dcassandra.start_rpc=false Thanks, Dean On 2/27/13 12:00 PM, Michael Kjellman mkjell...@barracuda.com wrote: Yes, it's required between majors. Which your upgrade would be.
Re: is upgradesstables required for 1.1.4 to 1.2.2? (I don't think it is)
Hmm, wouldn't I have to run upgradesstables BEFORE I start the 1.2.2 node? But running upgradesstables, as I recall, requires Cassandra to be running, so I suspect it somehow understands the old format when it starts? I am thinking I just keep the node out of the ring while I run upgradesstables, correct? But of course I am not sure how to start a 1.2.2 node such that it does not join the cluster. Thanks, Dean
Re: misreports on nodetool ring command on 1.1.4
What are the replication settings for the keyspace you created? Perhaps you used NTS (NetworkTopologyStrategy) with a bad DC name? -- Tyler Hobbs DataStax http://datastax.com/
Reading old data problem
Hello, I need some help managing my live cluster! I'm currently running a cluster with 2 nodes, RF:2, CL:1. Because of hardware upgrade constraints, I'm not able to increase my ConsistencyLevel for now. Anyway, I ran a full repair on each node of the cluster, followed by a flush, yet I'm still reading old data when performing queries. It's known that I might read old data during normal operations, but shouldn't it be in sync after the full anti-entropy repair? What am I missing? Thanks in advance!
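For background on the "I might read old data during normal operations" part, the usual rule of thumb is that a read is only guaranteed to overlap the latest write when the number of replicas written plus the number of replicas read exceeds the replication factor. A small Python sketch of that arithmetic follows; the function names are made up for illustration, and it does not by itself explain stale reads that persist after a completed repair.

```python
def quorum(replication_factor):
    """Number of replicas that make up a quorum for a given RF."""
    return replication_factor // 2 + 1


def read_overlaps_latest_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must share at least one replica with every write set."""
    return write_replicas + read_replicas > replication_factor


# RF=2 with reads and writes at CL ONE: 1 + 1 is not greater than 2.
rf2_one_one = read_overlaps_latest_write(2, 1, 1)
# RF=2 with writes at ALL (2 replicas) and reads at ONE: 2 + 1 is greater than 2.
rf2_all_one = read_overlaps_latest_write(2, 2, 1)
```

With RF=2 a quorum is also 2 replicas, so on a two-node cluster QUORUM behaves like ALL and needs both nodes to be up.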
best way to clean up a column family? 60Gig of dangling data
Okay, we had 6 nodes of 130Gig and it was slowly increasing. Through our operations to modify bloomfilter fp chance, we screwed something up, as trying to relieve memory pressure was tough. Anyways, somehow this caused nodes 1, 2, and 3 to jump to around 200Gig, while our incoming data stream is completely constant at around 260 points/second. Sooo, we know this dangling data (around 60Gigs) is in one single column family. Nodes 1, 2, and 3 are for the first token range according to ringdescribe. It is almost like the issue is now replicated to the other two nodes. Is there any way we can go about debugging this and release the 60 gigs of disk space?

Also, upgradesstables when memory is already close to max is not working too well. Can we do this instead (i.e., is it safe)?

1. Bring down the node
2. Move all the *Index.db files to another directory
3. Start the node and run upgradesstables

We know this relieves a ton of memory out of the gate for us. We are trying to get memory back down by a gig, then upgrade to 1.2.2 and switch to leveled compaction, as we have ZERO I/O really going on most of the time and really just have this bad, bad memory bottleneck (iostat shows nothing typically, as we are bottlenecked by memory). Thanks, Dean
Re: misreports on nodetool ring command on 1.1.4
I finally gave up, as it was supposed to be creating SimpleStrategy by default but was creating NTS by default, so eventually I forced it to SimpleStrategy, which did not have the issue. I never really figured out what was wrong there, but my SimpleStrategy keyspace correctly shows every node owns 75%, which is what I would expect for RF=3. Thanks, Dean
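The percentages in this thread follow from simple arithmetic. Here is a minimal Python sketch, assuming evenly spaced tokens and SimpleStrategy in a single data center; the function name is made up for illustration.

```python
def effective_ownership(replication_factor, node_count):
    """Fraction of the data each node holds when tokens are evenly spaced.

    Each node is a replica for replication_factor of the node_count equal
    ranges, capped at 100% once RF reaches the number of nodes.
    """
    return min(1.0, replication_factor / node_count)


six_nodes_rf3 = effective_ownership(3, 6)   # the existing 6 node cluster
four_nodes_rf3 = effective_ownership(3, 4)  # the 4 node test cluster
four_nodes_raw = effective_ownership(1, 4)  # raw token share, no replication counted
```

That gives 50% for 6 nodes at RF=3, 75% for 4 nodes at RF=3, and 25% per node when only the raw token ranges are counted, which matches the output shown before any keyspace was specified.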
Re: no other nodes seen on priam cluster
Off the top of my head, I would check that the Autoscaling Group (ASG) you created is restricted to a single Availability Zone. Also, Priam sets the number of EC2 instances it expects based on the maximum instance count you set on your scaling group (it did this last time I checked a few months ago; its behaviour may have changed). So I would make sure the desired, min and max instance counts for your scaling group are all the same, make sure your ASG is restricted to a single availability zone (e.g. us-east-1b), and then (if you are able to and there is no data in your cluster) delete all the SimpleDB entries Priam has created and possibly also clear out the cassandra data directory. Other than that, I see you've raised it as an issue on the Priam project page, so see what they say ;) Cheers Ben

On Thu, Feb 28, 2013 at 3:40 AM, Marcelo Elias Del Valle mvall...@gmail.com wrote:

One additional important piece of info: I checked here and the seeds seem to be really different on each node. The command

echo `curl http://127.0.0.1:8080/Priam/REST/v1/cassconfig/get_seeds`

returns ip2 on the first node and ip1,ip1 on the second node. Any idea why? It's probably what is causing cassandra to die, right?

2013/2/27 Marcelo Elias Del Valle mvall...@gmail.com

Hello Ben, thanks for the willingness to help.

2013/2/27 Ben Bromhead b...@instaclustr.com

Have you added the priam java agent to cassandra's JVM arguments (e.g. -javaagent:$CASS_HOME/lib/priam-cass-extensions-1.1.15.jar), and does the web container running priam have permissions to write to the cassandra config directory? Also, what do the priam logs say?

I put the priam log of the first node below. Yes, I have added priam-cass-extensions to the java args and Priam IS actually writing to the cassandra dir.

If you want to get up and running quickly with cassandra, AWS and priam, check out www.instaclustr.com. We deploy Cassandra under your AWS account and you have full root access to the nodes if you want to explore and play around, plus there is a free tier which is great for experimenting and trying Cassandra out.

That sounded really great. I am not sure if it would apply to our case (will consider it though), but some partners would have a great benefit from it, for sure! I will send your link to them.

What priam says:

2013-02-27 14:14:58.0614 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/public-hostname returns: ec2-174-129-59-107.compute-1.amazonaws.com
2013-02-27 14:14:58.0615 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/public-ipv4 returns: 174.129.59.107
2013-02-27 14:14:58.0618 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/instance-id returns: i-88b32bfb
2013-02-27 14:14:58.0618 INFO pool-2-thread-1 com.netflix.priam.utils.SystemUtils Calling URL API: http://169.254.169.254/latest/meta-data/instance-type returns: c1.medium
2013-02-27 14:14:59.0614 INFO pool-2-thread-1 com.netflix.priam.defaultimpl.PriamConfiguration REGION set to us-east-1, ASG Name set to dmp_cluster-useast1b
2013-02-27 14:14:59.0746 INFO pool-2-thread-1 com.netflix.priam.defaultimpl.PriamConfiguration appid used to fetch properties is: dmp_cluster
2013-02-27 14:14:59.0843 INFO pool-2-thread-1 org.quartz.simpl.SimpleThreadPool Job execution threads will use class loader of thread: pool-2-thread-1
2013-02-27 14:14:59.0861 INFO pool-2-thread-1 org.quartz.core.SchedulerSignalerImpl Initialized Scheduler Signaller of type: class org.quartz.core.SchedulerSignalerImpl
2013-02-27 14:14:59.0862 INFO pool-2-thread-1 org.quartz.core.QuartzScheduler Quartz Scheduler v.1.7.3 created.
2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.simpl.RAMJobStore RAMJobStore initialized.
2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.impl.StdSchedulerFactory Quartz scheduler 'DefaultQuartzScheduler' initialized from default resource file in Quartz package: 'quartz.properties'
2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.impl.StdSchedulerFactory Quartz scheduler version: 1.7.3
2013-02-27 14:14:59.0864 INFO pool-2-thread-1 org.quartz.core.QuartzScheduler JobFactory set to: com.netflix.priam.scheduler.GuiceJobFactory@1b6a1c4
2013-02-27 14:15:00.0239 INFO pool-2-thread-1 com.netflix.priam.aws.AWSMembership Querying Amazon returned following instance in the ASG: us-east-1b -- i-8eb32bfd,i-88b32bfb
2013-02-27 14:15:01.0470 INFO Timer-0 org.quartz.utils.UpdateChecker New update(s) found: 1.8.5 [ http://www.terracotta.org/kit/reflector?kitID=default&pageID=QuartzChangeLog ]
2013-02-27 14:15:10.0925 INFO pool-2-thread-1 com.netflix.priam.identity.InstanceIdentity Found dead instances: i-d49a0da7
2013-02-27 14:15:11.0397
Re: is upgradesstables required for 1.1.4 to 1.2.2? (I don't think it is)
I'm currently migrating 1.1.0 to 1.2.1, and on our small CI cluster, where I was testing some things, it seems that running upgradesstables is not required (this doc doesn't mention it either: http://www.datastax.com/docs/1.2/install/upgrading, although the docs for previous versions did). Of course I'd like to upgrade them sooner or later (in case of another C* upgrade or so), but to me it seems like it's just going to work ("Cassandra is able to read data files created by the previous version, but the inverse is not always true.") and compactions will slowly convert old-version SSTables to new ones if I don't do it manually. M.
RE: data model advice needed
What would be the best book to read about data modeling in Cassandra? I have ‘Cassandra the definitive guide’ but that is relatively old and has only a very limited example of how to design a model. Hans-Peter From: ka...@comcast.net [mailto:ka...@comcast.net] Sent: woensdag 27 februari 2013 19:12 To: user@cassandra.apache.org Subject: Re: data model advice needed One possibility would be to use dynamic columns, with each column name being a composite made from a timestamp, and the value of each containing serialized json of the details. The host could be the key. Then you could slice the data by column name. Ken From: Hans-Peter Sloot hans-peter.sl...@atos.netmailto:hans-peter.sl...@atos.net To: user@cassandra.apache.orgmailto:user@cassandra.apache.org Sent: Wednesday, February 27, 2013 1:01:24 PM Subject: data model advice needed Hi, I would like to get some advice on how to model columnfamilies for storing log of firewalls. The columns are listed further below. All the possibilities confuse me a bit (super columns, secondary indexes etc). My main question is how can I create the columnfamily in order to be able to get slices of data by the timestamp column combined with getting this data for a specific host or some other column In sql this would be select * from traffic where ts between ... and .. and host = ' xxx' and source_ip = 'xx.xx.xx.xx' and severity = 'xx' Probably any combination can be usefull (the slice/between and host probably the most important. Hopefully someone can shed some light. Regards Hans-Peter CREATE COLUMNFAMILY traffic (key uuid primary key, host varchar, facility varchar, priority varchar, severity varchar, tag varchar, ts timestamp, program varchar, msg varchar, protocol varchar, policy varchar, sourcezone varchar, sourceip varchar, destzone varchar, destip varchar, destport varchar ); Dit bericht is vertrouwelijk en kan geheime informatie bevatten enkel bestemd voor de geadresseerde. 