Re: [problem with OOM in nodes]
Splitting one report to multiple rows is uncomfortable

WHY? Reading from N disks is way faster than reading from 1 disk. I think in terms of PlayOrm and then explain the model you can use, so I think in objects first:

Report {
    String uniqueId;
    String reportName;    // may be indexable and queryable
    String description;
    CursorToMany<ReportRow> rows;
}

ReportRow {
    String uniqueId;
    String someData;
    String someMoreData;
}

Each Report in PlayOrm is backed by two rows in the database in this special case of using CursorToMany:

Report row - reportName=somename, description=some desc
CursorToMany row in the index table - reportRowKey56, reportRowKey78, reportRowKey89

(There are NO values in this row, and this row can hold fewer than 10 million entries; if your report is beyond 10 million, let me know and I have a different design.) Each report row is then basically the same structure as above. You can then:

1. Read in the report.
2. As you read from CursorToMany, it does a BATCH slice into the CursorToMany row AND then does a MULTIGET in parallel to fetch the report rows (i.e. it is all in parallel, so you get lots of rows from many disks really fast).
3. Print the rows out.

If you have more than 10 million rows in a report, let me know. You can of course do what PlayOrm does yourself ;).

Later,
Dean

On 9/23/12 11:14 PM, Denis Gabaydulin gaba...@gmail.com wrote:

What exactly is the problem with big rows? And how should we place our data in this case (see the schema in the previous replies)? Splitting one report to multiple rows is uncomfortable :-(
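For readers not using PlayOrm, the read path Dean describes (slice an index row, then multiget the referenced rows) can be approximated by hand with Hector. The sketch below is an illustration of the pattern under assumptions, not PlayOrm's actual implementation: two hypothetical standard column families, ReportRowIndex (column names are report-row keys, values empty) and ReportRows (the row data), both with String keys and columns.

// Sketch only: read the index row, then multiget the referenced report rows.
// CF names ("ReportRowIndex", "ReportRows") and the String layout are assumptions.
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.ColumnSlice;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.beans.Rows;
import me.prettyprint.hector.api.factory.HFactory;
import java.util.ArrayList;
import java.util.List;

public class ReportReader {
    private static final StringSerializer SS = StringSerializer.get();

    public List<String> readReport(Keyspace ks, String reportId) {
        // 1. Slice the index row: column names are the keys of the report rows.
        ColumnSlice<String, String> index = HFactory
                .createSliceQuery(ks, SS, SS, SS)
                .setColumnFamily("ReportRowIndex")
                .setKey(reportId)
                .setRange("", "", false, 10000)   // page through this in real code
                .execute().get();

        List<String> rowKeys = new ArrayList<String>();
        for (HColumn<String, String> c : index.getColumns()) {
            rowKeys.add(c.getName());
        }

        // 2. Multiget the report rows; the request fans out across replicas,
        // so the data is pulled from many disks at once.
        Rows<String, String, String> rows = HFactory
                .createMultigetSliceQuery(ks, SS, SS, SS)
                .setColumnFamily("ReportRows")
                .setKeys(rowKeys.toArray(new String[rowKeys.size()]))
                .setRange("", "", false, 100)
                .execute().get();

        List<String> result = new ArrayList<String>();
        for (Row<String, String, String> r : rows) {
            for (HColumn<String, String> c : r.getColumnSlice().getColumns()) {
                result.add(c.getValue());
            }
        }
        return result;
    }
}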
Re: [problem with OOM in nodes]
Thanks a lot for helping. We came to the same decision: splitting one report across multiple Cassandra rows (sorted buckets of report rows) and managing the buckets on the client side.

On Tue, Sep 25, 2012 at 5:28 AM, aaron morton aa...@thelastpickle.com wrote:

But I would recommend creating a model where row size is constrained in space.
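For what it's worth, the client-side bucketing Denis describes (and Aaron sketches in his reply below) might look roughly like this with Hector on the write side. The bucket size of 100, the CF name "ReportRows", and the "report_id:first_row_number" key format are illustrative assumptions, not the actual production code.

// Sketch only: write each report row into a bucket row of at most BUCKET_SIZE columns.
// CF name, bucket size, and key format are assumptions.
import me.prettyprint.cassandra.serializers.BytesArraySerializer;
import me.prettyprint.cassandra.serializers.LongSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class BucketedReportWriter {
    private static final int BUCKET_SIZE = 100;
    private static final StringSerializer SS = StringSerializer.get();
    private static final LongSerializer LS = LongSerializer.get();

    public void writeRow(Keyspace ks, String reportId, long rowNumber, byte[] serializedRow) {
        // All report rows numbered [N*100, N*100+99] share one Cassandra row,
        // so no single row ever grows beyond 100 columns.
        long bucketStart = (rowNumber / BUCKET_SIZE) * BUCKET_SIZE;
        String bucketKey = reportId + ":" + bucketStart;

        Mutator<String> mutator = HFactory.createMutator(ks, SS);
        // Column name = report row number, column value = serialized report row.
        mutator.addInsertion(bucketKey, "ReportRows",
                HFactory.createColumn(rowNumber, serializedRow, LS, BytesArraySerializer.get()));
        mutator.execute();
    }
}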
Re: [problem with OOM in nodes]
What exactly is the problem with big rows?

During compaction the row will be passed through slower two-pass processing, which adds to IO pressure. Counting big rows requires that the entire row be read. Repairing big rows requires that the entire row be repaired. I generally avoid rows above a few tens of MB, as they result in more memory churn and create the admin problems above.

What exactly is the problem with big rows? And how should we place our data in this case (see the schema in the previous replies)? Splitting one report to multiple rows is uncomfortable :-(

Looking at your row sizes below, the question is "How do I store an object which may be up to 3.5GB in size?" AFAIK there are no hard limits that would prevent you putting that in one row, and avoiding super columns may save some space. You could have a simple CF where each report is one row, each report row is one column, and the report row is serialised (with JSON or protobufs etc.) and stored in the column value.

But I would recommend creating a model where row size is constrained in space. E.g.

Report CF:
* one report per row
* one column per report row
* column value is empty

Report Rows CF:
* one row per 100 report rows, e.g. report_id : first_row_number
* column name is report row number
* column value is report data

(Or use composite column names, e.g. row_number : report_column.)

You can still do ranges, but you have to do some client-side work to work it out.

Hope that helps.

- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
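Reading a range of report rows against a bucketed layout like the one Aaron describes then means working out, on the client, which bucket rows overlap the requested range and slicing each of them. A rough sketch under the same assumptions as the write-side sketch earlier in the thread (Hector, bucket size 100, a "ReportRows" CF, "report_id:first_row_number" keys):

// Sketch only: fetch report rows [fromRow, toRow] by slicing each bucket row that
// overlaps the range. CF name, key format, and bucket size are assumptions.
import me.prettyprint.cassandra.serializers.BytesArraySerializer;
import me.prettyprint.cassandra.serializers.LongSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.factory.HFactory;
import java.util.ArrayList;
import java.util.List;

public class BucketedReportReader {
    private static final int BUCKET_SIZE = 100;

    public List<byte[]> readRange(Keyspace ks, String reportId, long fromRow, long toRow) {
        List<byte[]> result = new ArrayList<byte[]>();
        long firstBucket = (fromRow / BUCKET_SIZE) * BUCKET_SIZE;
        long lastBucket  = (toRow / BUCKET_SIZE) * BUCKET_SIZE;

        for (long bucket = firstBucket; bucket <= lastBucket; bucket += BUCKET_SIZE) {
            // Each bucket row holds at most 100 columns, so every slice stays small.
            List<HColumn<Long, byte[]>> cols = HFactory
                    .createSliceQuery(ks, StringSerializer.get(), LongSerializer.get(),
                                      BytesArraySerializer.get())
                    .setColumnFamily("ReportRows")
                    .setKey(reportId + ":" + bucket)
                    .setRange(Math.max(fromRow, bucket),
                              Math.min(toRow, bucket + BUCKET_SIZE - 1),
                              false, BUCKET_SIZE)
                    .execute().get().getColumns();
            for (HColumn<Long, byte[]> c : cols) {
                result.add(c.getValue());
            }
        }
        return result;
    }
}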
Re: [problem with OOM in nodes]
/var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E "[0-9]+ bytes" -o | cut -d " " -f 1 | awk '{ foo = $1 / 1024 / 1024 ; print foo "MB" }' | sort -nr | head -n 50

Is it bad signal?

Sorry, I do not know what this is outputting.

As I can see in cfstats, compacted row maximum size: 386857368 !

Yes. Having rows in the 100's of MB will cause problems. Doubly so if they are large super columns.

Cheers

- Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
Re: [problem with OOM in nodes]
On Sun, Sep 23, 2012 at 10:41 PM, aaron morton aa...@thelastpickle.com wrote:

/var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E "[0-9]+ bytes" -o | cut -d " " -f 1 | awk '{ foo = $1 / 1024 / 1024 ; print foo "MB" }' | sort -nr | head -n 50

Is it bad signal?

Sorry, I do not know what this is outputting.

This is outputting the size of the big rows which Cassandra had compacted before.

As I can see in cfstats, compacted row maximum size: 386857368 !

Yes. Having rows in the 100's of MB will cause problems. Doubly so if they are large super columns.

What exactly is the problem with big rows? And how should we place our data in this case (see the schema in the previous replies)? Splitting one report to multiple rows is uncomfortable :-(
Re: [problem with OOM in nodes]
Reports is a SuperColumnFamily. Each report has a unique identifier (report_id), which is the key of the SuperColumnFamily, and each report is saved in a separate row. A report consists of report rows (may vary between 1 and 50, but most are small). Each report row is saved in a separate super column.

Hector based code:

superCfMutator.addInsertion(
    report_id,
    "Reports",
    HFactory.createSuperColumn(
        report_row_id,
        mapper.convertObject(object),
        columnDefinition.getTopSerializer(),
        columnDefinition.getSubSerializer(),
        inferringSerializer
    )
);

We have two frequent operations:

1. Count report rows by report_id (calculate the number of super columns in the row).
2. Get report rows by report_id and a range predicate (get super columns from the row with a range predicate).

I can't see big super columns here :-(
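For reference, those two read operations roughly correspond to a super column count query and a super slice query in Hector. The sketch below is an assumption-laden illustration (String keys and super column names, a CF named "Reports"), not the production code:

// Sketch only: the two frequent reads against the existing super column model.
// Serializers and the CF name are assumptions based on the schema described above.
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HSuperColumn;
import me.prettyprint.hector.api.factory.HFactory;
import java.util.List;

public class ReportQueries {
    private static final StringSerializer SS = StringSerializer.get();

    // 1. Count report rows = count the super columns in the report's row.
    public int countReportRows(Keyspace ks, String reportId) {
        return HFactory.createSuperCountQuery(ks, SS, SS)
                .setColumnFamily("Reports")
                .setKey(reportId)
                .setRange("", "", Integer.MAX_VALUE)
                .execute().get();
    }

    // 2. Get report rows by range predicate = slice of super columns.
    // Note: Cassandra still deserializes each matching super column in full.
    public List<HSuperColumn<String, String, String>> getReportRows(
            Keyspace ks, String reportId, String fromRowId, String toRowId, int limit) {
        return HFactory.createSuperSliceQuery(ks, SS, SS, SS, SS)
                .setColumnFamily("Reports")
                .setKey(reportId)
                .setRange(fromRowId, toRowId, false, limit)
                .execute().get().getSuperColumns();
    }
}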
Re: [problem with OOM in nodes]
Found one more interesting fact. As I can see in cfstats, compacted row maximum size: 386857368 !
Re: [problem with OOM in nodes]
And some stuff from the log:

/var/log/cassandra$ cat system.log | grep "Compacting large" | grep -E "[0-9]+ bytes" -o | cut -d " " -f 1 | awk '{ foo = $1 / 1024 / 1024 ; print foo "MB" }' | sort -nr | head -n 50

3821.55MB 3337.85MB 1221.64MB 1128.67MB 930.666MB 916.4MB 861.114MB 843.325MB 711.813MB 706.992MB 674.282MB 673.861MB 658.305MB 557.756MB 531.577MB 493.112MB 492.513MB 492.291MB 484.484MB 479.908MB 465.742MB 464.015MB 459.95MB 454.472MB 441.248MB 428.763MB 424.028MB 416.663MB 416.191MB 409.341MB 406.895MB 397.314MB 388.27MB 376.714MB 371.298MB 368.819MB 366.92MB 361.371MB 360.509MB 356.168MB 355.012MB 354.897MB 354.759MB 347.986MB 344.109MB 335.546MB 329.529MB 326.857MB 326.252MB 326.237MB

Is it bad signal?
[problem with OOM in nodes]
Hi, all!

We have a cluster of 7 virtual nodes (disk storage is connected to the nodes with iSCSI).

The storage schema is:

Reports: {
  1: {
    1: {value1: some val, value2: some val},
    2: {value1: some val, value2: some val}
    ...
  },
  2: {
    1: {value1: some val, value2: some val},
    2: {value1: some val, value2: some val}
    ...
  }
  ...
}

create keyspace osmp_reports
  with placement_strategy = 'SimpleStrategy'
  and strategy_options = {replication_factor : 4}
  and durable_writes = true;

use osmp_reports;

create column family QueryReportResult
  with column_type = 'Super'
  and comparator = 'BytesType'
  and subcomparator = 'BytesType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'BytesType'
  and read_repair_chance = 1.0
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 432000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY';

Read/Write CL: 2

Most of the reports are small, but some of them could have half a million rows (xml).

Typical operations on this dataset are: count report rows by report_id (top-level id of a super column); get columns (report_rows) by range predicate and limit for a given report_id. Data is written once and is never updated.

From time to time a couple of nodes crash with an OOM exception. A heap dump shows that we have a lot of super columns in memory. For example, I can see that one of the reports is in memory entirely. How could that be possible? If we don't load the whole report, could Cassandra be doing this for some internal reason? What should we do to avoid OOMs?
Re: [problem with OOM in nodes]
p.s. Cassandra 1.1.4
Re: [problem with OOM in nodes]
I'm not 100% sure that I understand your data model and read patterns correctly, but it sounds like you have large supercolumns and are requesting some of the subcolumns from individual super columns. If that's the case, the issue is that Cassandra must deserialize the entire supercolumn in memory whenever you read *any* of the subcolumns. This is one of the reasons why composite columns are recommended over supercolumns.

--
Tyler Hobbs
DataStax
http://datastax.com/
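To illustrate the composite-column alternative Tyler mentions: a report can stay in a single standard CF row, with composite column names of the form (report_row_id, field), so that a range of report rows becomes a plain column slice and no super column ever has to be materialized in full. A minimal sketch with Hector's Composite type; the CF name "ReportRowsComposite", its assumed comparator CompositeType(LongType, UTF8Type), and the field layout are all illustrative assumptions:

// Sketch only: one standard CF row per report, composite column name = (report_row_id, field).
// CF name, comparator, and field names are assumptions.
import me.prettyprint.cassandra.serializers.CompositeSerializer;
import me.prettyprint.cassandra.serializers.LongSerializer;
import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.Composite;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class CompositeReportWriter {
    private static final StringSerializer SS = StringSerializer.get();

    public void writeField(Keyspace ks, String reportId, long reportRowId,
                           String field, String value) {
        // Composite column name: (report_row_id, field). Slicing on the first
        // component later returns all fields of a contiguous range of report rows.
        Composite name = new Composite();
        name.addComponent(reportRowId, LongSerializer.get());
        name.addComponent(field, SS);

        Mutator<String> mutator = HFactory.createMutator(ks, SS);
        mutator.addInsertion(reportId, "ReportRowsComposite",
                HFactory.createColumn(name, value, new CompositeSerializer(), SS));
        mutator.execute();
    }
}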