mixed linux/windows cluster in Cassandra-1.2
Hello! Is a mixed Linux/Windows cluster configuration supported in 1.2? Cheers, Ilya Shipitsin
Re: mixed linux/windows cluster in Cassandra-1.2
The technical reason is the path separator, which differs between Linux and Windows. If you search the mailing list you will find evidence that it does not work and is not supported. The most recent notice I found, however, was about 0.7, and there was no JIRA bug number. Just unsupported.

On Tuesday, 22 October 2013, Robert Coli wrote:
On Mon, Oct 21, 2013 at 12:55 AM, Илья Шипицин chipits...@gmail.com wrote: is a mixed linux/windows cluster configuration supported in 1.2? I don't think it's officially supported in any version; you would be among a very small number of people operating in this way. However, there is no technical reason it shouldn't work. =Rob
Re: mixed linux/windows cluster in Cassandra-1.2
We want to migrate a cluster of hundreds of gigabytes from Windows to Linux without interrupting operation, i.e. node by node.

On Tuesday, 22 October 2013, Jon Haddad wrote:
I can't imagine any situation where this would be practical. What would be the reason to even consider this?

On Oct 21, 2013, at 11:06 AM, Robert Coli rc...@eventbrite.com wrote: On Mon, Oct 21, 2013 at 12:55 AM, Илья Шипицин chipits...@gmail.com wrote: is a mixed linux/windows cluster configuration supported in 1.2? I don't think it's officially supported in any version; you would be among a very small number of people operating in this way. However, there is no technical reason it shouldn't work. =Rob
how to determine RF on the fly ?
Hello! Is there an easy way to determine the current RF, for instance via mx4j? Cheers, Ilya Shipitsin
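For anyone landing on this thread: besides mx4j, the replication factor is visible in the schema itself via cassandra-cli's describe command. A rough sketch of a 1.x-era session; the keyspace name and the exact output shape are illustrative and vary across versions:

```
[default@unknown] describe MyKeyspace;
Keyspace: MyKeyspace:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
    Options: [replication_factor:3]
```

With NetworkTopologyStrategy the Options line lists a per-datacenter factor instead of a single number.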
running Cassandra in dual stack (ipv4 + ipv6)
Hello! Is it possible to use both IPv4 and IPv6 for a Cassandra cluster? Cheers, Ilya Shipitsin
Running cassandra across nat?
Hello! Is it possible to run a cluster across two datacenters that are not routable to each other? Each datacenter runs its own LAN prefixes, but the LANs are not routable across datacenters. Cheers, Ilya Shipitsin
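One commonly cited setup for this situation (assuming each node also has a NAT'd address that the other datacenter can reach, as with EC2 multi-region clusters): bind to the private address but gossip the routable one. A hypothetical cassandra.yaml sketch; the addresses are made up, and availability of broadcast_address depends on your Cassandra version:

```yaml
# cassandra.yaml (illustrative values)
listen_address: 10.0.1.5         # private LAN address the node binds to
broadcast_address: 203.0.113.5   # routable address other datacenters use to reach this node
endpoint_snitch: GossipingPropertyFileSnitch
```

If no mutually routable addresses exist at all, a VPN/tunnel between the two LANs is the usual alternative, since gossip requires every node to be able to reach every other node.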
any reason for distributing Cassandra binaries without mx4j-tools.jar
Hello! Is there any reason why Cassandra is shipped without mx4j-tools.jar? A memory leak? A licensing issue? Cheers, Ilya Shipitsin
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
It was a good idea to have a look at StorageProxy :-) StorageProxy counters from the performance tests:

1.0.10:
  RangeOperations: 546
  ReadOperations: 694563
  TotalHints: 0
  TotalRangeLatencyMicros: 4469484
  TotalReadLatencyMicros: 245669679
  TotalWriteLatencyMicros: 57819722
  WriteOperations: 208741

0.7.10:
  RangeOperations: 520
  ReadOperations: 671476
  TotalRangeLatencyMicros: 2208902
  TotalReadLatencyMicros: 162186009
  TotalWriteLatencyMicros: 33911222
  WriteOperations: 204806

2012/9/3 aaron morton aa...@thelastpickle.com:
The whole test run is taking longer? So it could be slower queries, or slower test setup / tear down. If you are creating and truncating the KS for each of the 500 tests, is that taking longer? (Schema code has changed a lot between 0.7 and 1.0.) Can you log the execution time for tests and find the ones that are taking longer? There are full request metrics available on the StorageProxy JMX object. Cheers - Aaron Morton, Freelance Developer, @aaronmorton, http://www.thelastpickle.com
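For readers comparing those counters: dividing the latency totals by the operation counts gives the average per-operation latency, which at the StorageProxy level puts the regression at roughly 1.5x for reads and 1.7x for writes, not 3x. A quick sanity check computed from the numbers quoted in the message:

```python
# Average per-operation latency from the StorageProxy totals quoted above
# (TotalReadLatencyMicros / ReadOperations, TotalWriteLatencyMicros / WriteOperations).

def avg_micros(total_micros, ops):
    """Average per-operation latency in microseconds."""
    return total_micros / ops

# 1.0.10 counters from the message
read_v10 = avg_micros(245669679, 694563)
write_v10 = avg_micros(57819722, 208741)

# 0.7.10 counters from the message
read_v07 = avg_micros(162186009, 671476)
write_v07 = avg_micros(33911222, 204806)

print(f"read:  {read_v07:.0f} -> {read_v10:.0f} us  (x{read_v10 / read_v07:.2f})")
print(f"write: {write_v07:.0f} -> {write_v10:.0f} us  (x{write_v10 / write_v07:.2f})")
```

These counters are cumulative since node start, so the comparison assumes both runs executed the same workload, which the test harness described in the thread provides.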
Re: are asynchronous schema updates possible ?
Is it OK if multiple servers create/update the same CF at once? I'm looking into dynamic schema updates during application deploy/update.

On Tuesday, 4 September 2012, Sylvain Lebresne wrote:
To add to Aaron's response, you can update a CF concurrently in 1.1 already. However, you cannot create multiple CFs concurrently just yet; that will be fixed in 1.2. -- Sylvain

On Sun, Aug 26, 2012 at 11:04 PM, aaron morton aa...@thelastpickle.com wrote: Concurrent schema changes are coming in 1.2. I could not find a single issue that covered it; that may be my bad search fu. The issues for 1.2 are here: https://issues.apache.org/jira/browse/CASSANDRA/fixforversion/12319262 Cheers - Aaron Morton, Freelance Developer, @aaronmorton, http://www.thelastpickle.com

On 24/08/2012, at 7:06 PM, Илья Шипицин chipits...@gmail.com wrote: Hello! We are looking into concurrent schema updates (when multiple instances of an application create CFs at once). At http://wiki.apache.org/cassandra/MultiTenant there's ticket 1391, which is said to be still open; however, JIRA says it is fixed in 1.1.0. Can the schema be updated asynchronously on 1.1.x or not, if multiple servers create the same CF? Cheers, Ilya Shipitsin
Re: are asynchronous schema updates possible ?
What kind of problems? Nodes that do not agree about the schema, an exception on the later node, or something worse?

2012/9/5 Sylvain Lebresne sylv...@datastax.com:
On Tue, Sep 4, 2012 at 8:23 PM, Илья Шипицин chipits...@gmail.com wrote: Is it OK if multiple servers create/update the same CF at once? I'm looking into dynamic schema updates during application deploy/update. As said above, it is OK to update the same CF concurrently in 1.1, but *not* to create it (if you create CFs concurrently, whether it is the same CF or not, you might have problems). The latter will be fixed in 1.2, however. -- Sylvain
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
All tests use similar data access patterns, so every test on 1.0.11 is slower than on 0.7.8; the Recent* micros confirm that.

2012/9/5 aaron morton aa...@thelastpickle.com:
That's slower. The Recent* metrics are the best to look at; they reset each time you read them. So read them, then run the test, then read them again. You'll need to narrow it down still, e.g. is there a single test taking a very long time, or are all tests running slower? The Histogram stats can help with that, as they provide a spread of latencies. Cheers - Aaron Morton, Freelance Developer, @aaronmorton, http://www.thelastpickle.com
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
We are running a somewhat queue-like workload with aggressive write-read patterns. I was looking for a way to script queries from a live Cassandra installation, but I didn't find any. Is there something like a thrift-proxy or another query logging/scripting engine?

2012/8/30 aaron morton aa...@thelastpickle.com:
> in terms of our high-rate write load cassandra-1.0.11 is about 3 (three!!) times slower than cassandra-0.7.8
We've not had any reports of a performance drop-off. All tests so far have shown improvements in both read and write performance.
> I agree, such digests save some network IO, but they seem to be very bad in terms of CPU and disk IO.
The sha1 is created so we can diagnose corruptions in the -Data component of the SSTables. It is not used to save network IO. It is calculated while streaming the Memtable to disk, so it has no impact on disk IO. While not the fastest algorithm, I would assume its CPU overhead in this case is minimal.
> there's already a relatively small Bloom filter file, which can be used for saving network traffic instead of the sha1 digest.
Bloom filters are used to test whether a row key may exist in an SSTable.
> any explanation?
If you can provide some more information on your use case we may be able to help. Cheers - Aaron Morton, Freelance Developer, @aaronmorton, http://www.thelastpickle.com
Re: performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
We are using functional tests (~500 tests per run). It is hard to tell which query is slower; it is slower in general. Same hardware: 1 node, 32 GB RAM, 8 GB heap, default Cassandra settings. As we are talking about functional tests, we recreate the KS just before the tests are run. I do not know how to record the queries (there are a lot of them); if you are interested, I can set up a special stand for you.

2012/8/31 aaron morton aa...@thelastpickle.com:
> we are running a somewhat queue-like workload with aggressive write-read patterns.
We'll need some more details... How much data? How many machines? What is the machine spec? How many clients? Is there an example of a slow request? How are you measuring that it's slow? Is there anything unusual in the log? Cheers - Aaron Morton, Freelance Developer, @aaronmorton, http://www.thelastpickle.com

On 31/08/2012, at 3:30 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
If you move from 0.7.X to 0.8.X or 1.0.X you have to rebuild sstables as soon as possible. If you have large bloom filters you can hit a bug where the bloom filters will not work properly.
performance is drastically degraded after 0.7.8 -- 1.0.11 upgrade
In terms of our high-rate write load, cassandra-1.0.11 is about 3 (three!!) times slower than cassandra-0.7.8. After some investigation I noticed files with a sha1 extension (which are missing in Cassandra-0.7.8); in the maybeWriteDigest() function I see no option for switching sha1 digests off. I agree such digests save some network IO, but they seem to be very bad in terms of CPU and disk IO. Why use one more digest (which has to be calculated)? There's already a relatively small Bloom filter file, which could be used for saving network traffic instead of the sha1 digest. Any explanation? Ilya Shipitsin
are asynchronous schema updates possible ?
Hello! We are looking into concurrent schema updates (when multiple instances of an application create CFs at once). At http://wiki.apache.org/cassandra/MultiTenant there's ticket 1391, which is said to be still open; however, JIRA says it is fixed in 1.1.0. Can the schema be updated asynchronously on 1.1.x or not, if multiple servers create the same CF? Cheers, Ilya Shipitsin
is it possible to disable compaction per CF ?
Hello! If we are dealing with an append-only data model, what happens if I disable compaction on a certain CF? Any side effects? Can I do it with update column family with compaction_strategy = null? Cheers, Ilya Shipitsin
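Setting compaction_strategy to null is not accepted, but a workaround often mentioned for the cassandra-cli era is to set both compaction thresholds to 0, which stops minor compactions for that CF. A sketch under that assumption; verify the behavior on your version before relying on it:

```
[default@MyKeyspace] update column family MyCF
    with min_compaction_threshold = 0
    and max_compaction_threshold = 0;
```

The same thresholds can be changed at runtime with nodetool setcompactionthreshold <keyspace> <cfname> <min> <max>. Note that even an append-only workload flushes many small SSTables, so reads will touch progressively more files once compaction is off.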
is upgradesstables required (or recommended) upon update column family ?
Hello! Is upgradesstables required upon update column family with compression_options (or compaction_strategy)? Cheers, Ilya Shipitsin
how to disable compression ?
Hello! How can I run an update command on a column family to disable compression (without re-creating the CF)? Cheers, Ilya Shipitsin
Re: how to disable compression ?
[default@XXXKeyspace] update column family YYY with compression_options =[{}];
Command not found: `update column family YYY with compression_options =[{}];`. Type 'help;' or '?' for help.
[default@XXXKeyspace]

2012/7/20 Viktor Jevdokimov viktor.jevdoki...@adform.com:
First you update the schema for the CF, then you run nodetool upgradesstables on each node:

nodetool -h [HOST] -p [JMXPORT] upgradesstables [keyspace] [cfnames]

For me it sometimes works only after a node restart (the upgrade leaves the previous format, compressed or uncompressed). Best regards, Viktor Jevdokimov, Senior Developer
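The `=[{}]` form is the part the cli rejects. The form usually cited for disabling compression in cassandra-cli is assigning null to compression_options; a sketch, assuming a 1.x cli (syntax acceptance varies by version):

```
[default@XXXKeyspace] update column family YYY with compression_options = null;
```

If null is rejected on your version, another variant seen on the list is an empty algorithm name, compression_options = {sstable_compression: ''}. Either way, existing SSTables stay compressed until upgradesstables (or normal compaction) rewrites them, which is what Viktor's nodetool step above is for.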
create if not exists ? create or update ?
Hello! Is it possible to write a CQL statement that creates a ColumnFamily in a "create if not exists" manner? Or a "create or update" manner? Cheers, Ilya Shipitsin
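For later readers: CQL3 gained IF NOT EXISTS for schema statements in Cassandra 2.0, but there is no single "create or update" statement; the usual pattern is a guarded create followed by separate ALTERs. A sketch with a hypothetical table:

```sql
-- guarded create: a no-op if the table already exists (Cassandra 2.0+)
CREATE TABLE IF NOT EXISTS users (
    id uuid PRIMARY KEY,
    name text
);

-- subsequent schema changes are separate statements
ALTER TABLE users ADD email text;
```

On 1.1.x there is no such guard, so applications typically check the schema (e.g. describe it) before issuing the CREATE.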
Re: how to get list of snapshots
I've seen that guide. It's missing several important things:
1) OK, I can schedule snapshots using cron (the snapshot's name will be generated from the current date). How can I remove snapshots older than a week?
2) OK, I can enable incremental backups. How can I remove incremental SSTables older than 1 week? It's more tricky than with snapshots.
It will lead me to several find/cron/bash scripts. A single mistake and I could delete Cassandra data entirely.

2012/5/23 aaron morton aa...@thelastpickle.com:
> 1) is there any good guide for scheduling backups ?
This: http://www.datastax.com/docs/1.0/operations/backup_restore ?
> 2) is there way to get list of snapshots ? (without ls in directory)
No. Cheers - Aaron Morton, Freelance Developer, @aaronmorton, http://www.thelastpickle.com

On 23/05/2012, at 5:06 PM, Илья Шипицин wrote: Hello! I'm about to schedule backups in the following way: a) snapshots are done daily; b) incremental backups are enabled. So the backup will be consistent; very old snapshots must be removed (I guess a week's depth should be enough). A couple of questions: 1) Is there any good guide for scheduling backups? 2) Is there a way to get a list of snapshots (without ls in the directory)? Cheers, Ilya Shipitsin
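On the retention worry: nodetool of that era has clearsnapshot but no age-based selection, so a small script over the snapshot directories is the usual answer. A minimal sketch, assuming the standard layout data_dir/<keyspace>/<cf>/snapshots/<snapshot_name> and using directory mtime as the snapshot's age; paths and the 7-day policy are illustrative. It deliberately only ever deletes under a snapshots/ directory, so a single mistake cannot touch live SSTables:

```python
import os
import shutil
import time

def purge_old_snapshots(data_dir, max_age_days=7):
    """Delete snapshot directories older than max_age_days.

    Walks data_dir/<keyspace>/<cf>/snapshots/<name> and removes any
    snapshot directory whose mtime is older than the cutoff.
    Returns the list of removed snapshot paths.
    """
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for ks in os.listdir(data_dir):
        ks_path = os.path.join(data_dir, ks)
        if not os.path.isdir(ks_path):
            continue
        for cf in os.listdir(ks_path):
            snap_root = os.path.join(ks_path, cf, "snapshots")
            if not os.path.isdir(snap_root):
                continue
            for snap in os.listdir(snap_root):
                snap_path = os.path.join(snap_root, snap)
                if os.path.isdir(snap_path) and os.path.getmtime(snap_path) < cutoff:
                    shutil.rmtree(snap_path)  # only ever removes .../snapshots/<name>
                    removed.append(snap_path)
    return removed
```

The same walk, with the age test applied to individual files under each CF's backups/ directory, covers the incremental-backup case from question 2.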
how to get list of snapshots
Hello! I'm about to schedule backups in the following way: a) snapshots are done daily; b) incremental backups are enabled. So the backup will be consistent; very old snapshots must be removed (I guess a week's depth should be enough). A couple of questions: 1) Is there any good guide for scheduling backups? 2) Is there a way to get a list of snapshots (without ls in the directory)? Cheers, Ilya Shipitsin