[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2843:
    Attachment: 2843_h.patch

bq. the IColumnMap name when it does not implement Map interface, and some things it has in common with Map (iteration) it changes semantics of (iterating values instead of keys). not sure what to use instead though, since we already have an IColumnContainer. Maybe ISortedColumns?

Yeah, I'm not sure I have a better name either. Maybe ISortedColumnHolder, but I'm not sure it's better than ISortedColumns, so the attached rebased patch simply renames ColumnMap -> SortedColumns.

bq. TSCM and ALCM extending instead of wrapping CSLM/AL, respectively

The idea was to save one object creation. I admit this is probably not a huge deal, but in this case extending instead of wrapping was no big deal either, so it felt worth optimizing. I still stand by that choice, but I have no good argument against the criticism that it is possibly premature.

bq. unrelated reformatting

If we're talking about the ones in SuperColumn.java, sorry: I mistakenly forced re-indentation on the file, which rewrote the tabs to spaces. The new patch keeps the old formatting.

I'd mention that there are also a few places where I've rewritten cf.getSortedColumns().iterator() to cf.iterator(), which is arguably a bit gratuitous for this patch, but I figured this avoids creating a new Collection in the case of CSLM, and there are not so many occurrences.
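For readers following the extend-versus-wrap point in the comment above, here is a minimal sketch of the two shapes being discussed. The class names ExtendingColumns and WrappingColumns are hypothetical and are not taken from the patch; column names and values are plain strings for brevity. Both variants iterate over values rather than keys, which is the semantic difference from Map that the reviewer pointed out.

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentSkipListMap;

/** Variant 1 (hypothetical): the container IS the skip list map, so only one object is allocated. */
final class ExtendingColumns extends ConcurrentSkipListMap<String, String> implements Iterable<String> {
    private static final long serialVersionUID = 1L;

    /** Iterates column values in name order, not the keys as a Map-like API might suggest. */
    @Override
    public Iterator<String> iterator() {
        return values().iterator();
    }
}

/** Variant 2 (hypothetical): the container HOLDS a skip list map, costing one extra object. */
final class WrappingColumns implements Iterable<String> {
    private final ConcurrentSkipListMap<String, String> map = new ConcurrentSkipListMap<>();

    void put(String name, String value) {
        map.put(name, value);
    }

    /** Same iteration semantics as the extending variant. */
    @Override
    public Iterator<String> iterator() {
        return map.values().iterator();
    }
}
```

The extending variant saves one allocation per container but exposes the whole map API to callers; the wrapping variant keeps the public surface narrow. Which trade-off is better is exactly what the thread leaves open.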
better performance on long row read
---
Key: CASSANDRA-2843
URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
Project: Cassandra
Issue Type: New Feature
Reporter: Yang Yang
Fix For: 1.0
Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, 2843_h.patch, fix.diff, microBenchmark.patch, patch_timing, std_timing

Currently, if a row contains 1000 columns, the run time becomes considerably slow (my test of a row with 3000 columns (standard, regular), each with 8 bytes in name and 40 bytes in value, takes about 16ms). This is all running in memory; no disk read is involved.

Through debugging we can find that most of this time is spent on:

[Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
[Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
[Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)

ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values. This structure is slow for two reasons: it needs to do synchronization, and it needs to maintain a more complex map structure.

But if we look at the whole read path, Thrift already defines the read output to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. On the synchronization side, since the returned CF is never going to be shared with or modified by other threads, we know the access is always single-threaded, so no synchronization is needed.

But these 2 features are indeed needed for ColumnFamily in other cases, particularly writes. So we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(), so that getTopLevelColumnFamily no longer always creates the standard ColumnFamily but takes a provided returnCF, whose cost is much cheaper.

The provided patch is for demonstration now; I will work on it further once we agree on the general direction. CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. The main work is to let the FastColumnFamily use an array for internal storage. At first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have an invariant that the inserted columns come in sorted order (I still have an issue to resolve with descending versus ascending order, but ascending works). So the current logic is simply to compare the new column against the end column in the array: if the names are not equal, append; if equal, reconcile.

Slight temporary hacks are made on getTopLevelColumnFamily so that we have 2 flavors of the method, one accepting a returnCF. But we could definitely think about what is the better way to provide this returnCF.

This patch compiles fine; no tests are provided yet. But I tested it in my application, and the performance improvement is dramatic: it offers about a 50% reduction in read time in the 3000-column case.

Thanks,
Yang
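The append-or-reconcile logic from the issue description might look roughly like the sketch below. This is an illustration under stated assumptions, not the code in the attached patches: SketchColumn and AppendOnlyColumns are hypothetical names, column names are plain strings, and reconcile is reduced to "higher timestamp wins", which is simpler than Cassandra's real reconciliation rules. The sketch only covers the ascending case that the description says works.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a column: a name, a value and a write timestamp. */
final class SketchColumn {
    final String name;
    final String value;
    final long timestamp;

    SketchColumn(String name, String value, long timestamp) {
        this.name = name;
        this.value = value;
        this.timestamp = timestamp;
    }

    /** Simplified reconcile (an assumption): the column with the higher timestamp wins. */
    SketchColumn reconcile(SketchColumn other) {
        return other.timestamp > timestamp ? other : this;
    }
}

/** Array-backed, unsynchronized container that relies on columns arriving in ascending name order. */
final class AppendOnlyColumns {
    private final List<SketchColumn> columns = new ArrayList<>();

    void addColumn(SketchColumn column) {
        if (columns.isEmpty()) {
            columns.add(column);
            return;
        }
        int last = columns.size() - 1;
        int cmp = columns.get(last).name.compareTo(column.name);
        if (cmp == 0) {
            // Same name as the end column: reconcile in place.
            columns.set(last, columns.get(last).reconcile(column));
        } else if (cmp < 0) {
            // Strictly greater name: plain append, no search and no locking.
            columns.add(column);
        } else {
            // The sketch rejects out-of-order input instead of silently breaking the ordering.
            throw new IllegalArgumentException("column out of order: " + column.name);
        }
    }

    List<SketchColumn> asList() {
        return columns;
    }
}
```

Each add is a single comparison against the last element followed by an amortized constant-time append, whereas a skip list insert has to search for the position and coordinate with concurrent writers. That difference is the saving the description is after on the single-threaded read path.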
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2843:
    Attachment: (was: 2843_f.patch)

Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, fix.diff, microBenchmark.patch, patch_timing, std_timing

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2843:
    Attachment: 2843_g.patch

Attaching 2843_g.patch. It is just a rebase with Yang's last small fix added.

Attachments: 2843.patch, 2843_d.patch, 2843_g.patch, fix.diff, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
    Attachment: fix.diff

OK, the last-reported error is likely orthogonal to this JIRA... a simple fix.

Attachments: 2843.patch, 2843_d.patch, 2843_f.patch, fix.diff, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2843:
    Attachment: 2843_e.patch

Attaching rebased and updated patch.

This patch fixes reversed slices. It turns out the previous patch wasn't handling them correctly. This is also a little bit more annoying to do than one would hope, because the code assumes that the ColumnFamily object is itself sorted in forward sorted order while the insertions are made in reverse sorted order (changing that is much more involved than it appears, unfortunately), so it's not as simple as feeding a reverse comparator to the map. So the patch takes a hint at the construction of the ColumnFamily, indicating how the inserts will be done, but it does not influence the sorting of the cf itself. Internally, AL keeps the elements in reverse sorted order as one could expect, but its iterator still has to return elements in sorted order, so there is a bit of boilerplate involved.

bq. Actually I just realized that there is one place where we do add a column after a read not at the end of the CF.

I was a little bit quick on that: the aforementioned place actually uses addAll(). There is no real efficiency problem with addAll(), however (it does a merge). So the patch actually adds an assert in add() that the added column is in sorted order. I found another place, though, where we did an add() not in sorted order. This is due to a case where we actually want to replace a column, and we end up removing then adding. I've added a replace method to handle that case instead, since it's the right thing to do anyway. Note that this example was actually on the compaction path (for counters), not the read one, but it feels better to fix it now as it is related.

bq. Is there a reason we can't a test for that path?

Not really. I've attached such a test to CASSANDRA-2945.

The patch also fixes the code style in the added test, adds a few more tests, and adds a bunch of comments. All the tests (unit and system) are passing (that is, excluding the unrelated ones that happen to fail in trunk right now).

Attachments: 2843.patch, 2843_d.patch, 2843_e.patch, microBenchmark.patch, patch_timing, std_timing
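The reversed-slice handling described in Sylvain's 2843_e.patch comment (a construction-time hint for insertion order, an in-order check in add(), a separate replace method, and an iterator that always yields forward sorted order) can be sketched as follows. This is a hedged illustration only: HintedSortedColumns, its Entry type and all method signatures are hypothetical, names and values are strings, and the order check throws an exception rather than using a Java assert so that it fires even when assertions are disabled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/** List-backed sorted columns; the constructor hint says in which direction inserts will arrive. */
final class HintedSortedColumns implements Iterable<HintedSortedColumns.Entry> {
    static final class Entry {
        final String name;
        final String value;

        Entry(String name, String value) {
            this.name = name;
            this.value = value;
        }
    }

    private final List<Entry> backing = new ArrayList<>();
    private final boolean reversedInsert;

    HintedSortedColumns(boolean reversedInsert) {
        this.reversedInsert = reversedInsert;
    }

    /** Appends to the backing list; rejects names that break the hinted insertion order. */
    void add(String name, String value) {
        if (!backing.isEmpty()) {
            int cmp = backing.get(backing.size() - 1).name.compareTo(name);
            boolean inOrder = reversedInsert ? cmp > 0 : cmp < 0;
            if (!inOrder)
                throw new IllegalStateException("add() out of order: " + name);
        }
        backing.add(new Entry(name, value));
    }

    /** Overwrites an existing column in place instead of remove-then-add. */
    boolean replace(String name, String newValue) {
        for (int i = 0; i < backing.size(); i++) {
            if (backing.get(i).name.equals(name)) {
                backing.set(i, new Entry(name, newValue));
                return true;
            }
        }
        return false;
    }

    /** Always iterates in forward sorted order, whatever the insertion direction was. */
    @Override
    public Iterator<Entry> iterator() {
        if (!reversedInsert)
            return backing.iterator();
        return new Iterator<Entry>() {
            private int next = backing.size() - 1;

            @Override
            public boolean hasNext() {
                return next >= 0;
            }

            @Override
            public Entry next() {
                if (next < 0)
                    throw new NoSuchElementException();
                return backing.get(next--);
            }
        };
    }
}
```

Keeping the hint separate from the comparator means callers still see one logical ordering, while reversed reads keep the cheap append-at-the-end insert. In this sketch a duplicate name is treated as out of order; whether the patch's assert is that strict is not stated in the thread.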
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2843:
    Attachment: (was: 2843_e.patch)

Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2843:
    Attachment: 2843_f.patch

Forgot a few goodies in my previous attachment. The previous patch was only using AL for local reads; the new patch also uses it when deserializing the cf in ReadResponse. This should amount to some more speedup on reads involving more than one node.

There are probably a few other places where AL makes sense (compaction clearly comes to mind), but that's a little more involved, so let's focus on the read path for this one.

Attachments: 2843.patch, 2843_d.patch, 2843_f.patch, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
    Attachment: 2843_d.patch

the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937

new patch uploaded here

Attachments: 2843.patch, 2843_c.patch, 2843_d.patch, 2843_d.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: 2843_d.patch

the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937
new patch uploaded here

Attachments: 2843.patch, 2843_d.patch, 2843_d.patch, 2843_d.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: (was: 2843_d.patch)

Attachments: 2843.patch, 2843_d.patch, 2843_d.patch, 2843_d.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: (was: 2843_c.patch)

Attachments: 2843.patch, 2843_d.patch, 2843_d.patch, 2843_d.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: 2843_d.patch

the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937
new patch uploaded here

Attachments: 2843.patch, 2843_d.patch, 2843_d.patch, 2843_d.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: (was: 2843_d.patch)

Attachments: 2843.patch, 2843_d.patch, 2843_d.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: (was: incremental.diff)

Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Comment: was deleted

(was: the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937
new patch uploaded here)

Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: (was: 2843_d.patch)

Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Comment: was deleted

(was: the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937
new patch uploaded here)

Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Comment: was deleted (was: the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937 new patch uploaded here) better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: 2843_d.patch the DeletionInfo private=protected change is moved to https://issues.apache.org/jira/browse/CASSANDRA-2937 new patch uploaded here better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, 2843_c.patch, 2843_d.patch, 2843_d.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: fast_cf_081_trunk.diff) better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, patch_timing, std_timing
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: 2843_b.patch) better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: 2843_c.patch rebased , against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48) better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: (was: 2843_c.patch) better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Attachment: 2843_c.patch fixed a bug in my newly added test; also the DeletionInfo class in AbstractColumnContainer somehow gives compile error in eclipse, had to change that into protected. better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yang Yang updated CASSANDRA-2843: - Comment: was deleted (was: rebased , against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48)) better performance on long row read --- Key: CASSANDRA-2843 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843 Project: Cassandra Issue Type: New Feature Reporter: Yang Yang Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
    Attachment: (was: 2843_c.patch)

Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
    Comment: was deleted
(was: rebased against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48). Also fixed a bug in my newly added test. Also, the DeletionInfo class in AbstractColumnContainer somehow gives a compile error in Eclipse, so I had to change it to protected.)

Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
    Attachment: 2843_c.patch

Rebased against 4629648899e637e8e03938935f126689cce5ad48. Also fixed a bug in my test. AbstractColumnContainer.DeletionInfo has to be protected, otherwise Eclipse gives a compile error.

Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
    Comment: was deleted
(was: I just downloaded the patch and transferred it to a computer to read. Sylvain's approach of comparing against the last element is quite clean; I see no problems. The only problem was due to me: in the binary search, high=mid-1 should be changed to high=mid. With this error fixed, you also don't need to special-case 1 and 2 at the end of the binary search.)

Attachments: 2843.patch, fast_cf_081_trunk.diff, microBenchmark.patch
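As an illustration of the two points in the deleted comment above (compare against the last element first, and use high=mid in the binary search), here is a minimal sketch. It is not the code from any attached patch: `SortedNames`, `lowerBound` and `add` are hypothetical names, and only column names (as strings) are stored. With an exclusive upper bound, setting `high = mid` keeps the candidate at `mid` in range, so the loop ends with `low == high` at the insertion point and needs no special handling of the last one or two elements.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sorted container of column names; values are omitted to keep the sketch short.
final class SortedNames {
    private final List<String> names = new ArrayList<>();

    // First index whose name is >= key, or names.size() if every name is smaller.
    int lowerBound(String key) {
        int low = 0;
        int high = names.size();          // exclusive upper bound
        while (low < high) {
            int mid = (low + high) >>> 1;
            if (names.get(mid).compareTo(key) < 0) {
                low = mid + 1;            // names[mid] is too small, drop it
            } else {
                high = mid;               // names[mid] may be the answer, keep it in range
            }
        }
        return low;
    }

    // Returns true if the name was inserted, false if it was already present.
    boolean add(String name) {
        int size = names.size();
        // Fast path: input in sorted order only ever needs the tail comparison.
        if (size == 0 || names.get(size - 1).compareTo(name) < 0) {
            names.add(name);
            return true;
        }
        // Slow path for out-of-order (or equal-to-tail) input.
        int pos = lowerBound(name);
        if (pos < size && names.get(pos).equals(name)) {
            return false;                 // a real container would reconcile here
        }
        names.add(pos, name);
        return true;
    }

    List<String> asList() {
        return names;
    }
}
```

For already sorted input this costs one extra comparison per add compared with a pure append, while unsorted input still leaves the container sorted.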
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
    Attachment: incremental.diff

The minor typo fix *on top of* Sylvain's patch, for easier reading.

Attachments: 2843.patch, 2843_b.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
    Attachment: 2843_b.patch

Patch against the same base as Sylvain's patch, fixing one typo.

Attachments: 2843.patch, 2843_b.patch, fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
    Comment: was deleted
(was: Thanks Sylvain. It's great to see this taking shape. I'm on vacation with a phone, so only very cursory comments.

If this implementation should not expect the input to always come in already sorted order, then it would probably be difficult to achieve a lot of perf gain easily. There are mainly two areas of attack: synchronization and a cheaper data structure. Actually, during my tests I found that synchronization is not the main issue: going from CSLM to TreeMap is actually slower. Your tests show differently, so we'd better confirm. If my tests were correct, then not assuming special circumstances would be like finding a generic silver bullet, which is very difficult.

[...] ColumnFamily during is a good idea. It is clear that avoiding synchronization will be faster, and given the type of operations we do during reads (insertion in sorted order and iteration), an ArrayList backed solution is sure to be faster too. It will also be much gentler on the GC than the linked list ConcurrentSkipListMap uses. I think that all those will help even with relatively small reads. So let's focus on that for this ticket and leave other potential improvements to other tickets, especially if it is unclear they bear any noticeable speedup.

[...] is quite frankly ugly and will be a maintenance nightmare (you'll have to check you did overwrite every function that touches the map (which is not the case in the patch) and every update to ColumnFamily has to be aware that it should update FastColumnFamily as well).

[...] functional ColumnFamily implementation (albeit not synchronized). That is, we can't assume that additions will always be in strict increasing order, otherwise again this will be too hard to use. Granted, I don't think it is used in the read path, but I think that the new ColumnFamily implementation could advantageously be used during compaction (by preCompactedRow typically, and possibly other places where concurrent access is not an issue) where this would matter.

[...] the remarks above. The patch is against trunk (not the 0.8 branch), because it builds on the recently committed refactor of ColumnFamily. It refactors ColumnFamily (AbstractColumnContainer actually) to allow for a pluggable backing column map. The ConcurrentSkipListMap implementation is named ThreadSafeColumnMap and the new one is called ArrayBackedColumnMap (which I prefer to FastSomething since it's not a very helpful name).

[...] getTopLevelColumns, I pass along a factory (that each backing implementation provides). The main goal was to avoid creating a columnFamily when it's useless (if row cache is enabled on the CF -- btw, this ticket only improves reads for column families with no cache).

[...] (addition of column + iteration), the ArrayBacked implementation is faster than the ConcurrentSkipListMap based one. Interestingly though, this is mainly true when some reconciliation of columns happens. That is, if you only add columns with different names, the ArrayBacked implementation is faster, but not dramatically so. If you start adding columns that have to be resolved, the ArrayBacked implementation becomes much faster, even with a reasonably small number of columns (inserting 100 columns with only 10 unique column names, the ArrayBacked is already 30% faster). And this is mostly due to the overhead of synchronization (of replace()): a TreeMap based implementation is slightly slower than the ArrayBacked one but not by a lot, and thus is much faster than the ConcurrentSkipListMap implementation.

[...] use a few unit tests for the new ArrayBacked implementation).)
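The "pluggable backing column map" with a per-implementation factory mentioned in the quoted comment could look roughly like the sketch below. All names here (`ColumnStore`, `ThreadSafeStore`, `ArrayStore`, `ReadPath.collect`) are hypothetical and are not the interfaces from the patch; columns are reduced to string name/value pairs and reconciliation to "last write wins".

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical minimal view of a backing column container.
interface ColumnStore {
    void put(String name, String value);

    Collection<String> sortedNames();

    // Each backing implementation supplies its own factory.
    interface Factory {
        ColumnStore create();
    }
}

// Thread-safe flavour backed by a ConcurrentSkipListMap (what the write path needs).
final class ThreadSafeStore implements ColumnStore {
    static final ColumnStore.Factory FACTORY = ThreadSafeStore::new;
    private final ConcurrentSkipListMap<String, String> map = new ConcurrentSkipListMap<>();

    public void put(String name, String value) {
        map.put(name, value);
    }

    public Collection<String> sortedNames() {
        return map.keySet();
    }
}

// Unsynchronized flavour backed by parallel ArrayLists kept sorted by name.
final class ArrayStore implements ColumnStore {
    static final ColumnStore.Factory FACTORY = ArrayStore::new;
    private final List<String> names = new ArrayList<>();
    private final List<String> values = new ArrayList<>();

    public void put(String name, String value) {
        int pos = Collections.binarySearch(names, name);
        if (pos >= 0) {
            values.set(pos, value);        // same name: last write wins in this sketch
        } else {
            names.add(-pos - 1, name);     // insert at the position that keeps names sorted
            values.add(-pos - 1, value);
        }
    }

    public Collection<String> sortedNames() {
        return names;
    }
}

final class ReadPath {
    // The caller picks the factory; this method never names a concrete class.
    static ColumnStore collect(ColumnStore.Factory factory, String[][] nameValuePairs) {
        ColumnStore result = factory.create();
        for (String[] pair : nameValuePairs) {
            result.put(pair[0], pair[1]);
        }
        return result;
    }
}
```

Passing the factory rather than a ready-made container means the read path only allocates a container when it actually needs one, and a single-threaded caller can choose the cheaper implementation without the read code changing.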
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
    Comment: was deleted
(was: My comment before last was not quite correct. The array implementation with binary search also works without special assumptions. It is just that with the assumption there are a lot of further gains.)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sylvain Lebresne updated CASSANDRA-2843:
Attachment: microBenchmark.patch

bq. this implementation should not expect the input always come in already sorted order

Yeah, that's not really what I said. I said that it should not *assume* it. Which doesn't mean it cannot optimize for it. If you look at the version I attached, at least as far as addColumn is concerned, it does the exact same thing as your version, with the only difference that I first check whether adding at the tail is legit and fall back to a binary search if that is not the case. That is, as long as the input is in sorted order, it will be as fast as your implementation (there is one more ByteBuffer comparison, but I'm willing to bet that it has no noticeable impact on performance). But it won't create an unsorted CF if the input is not in sorted order.

Btw, Yang, can you try to fix your last 3 comments?

bq. If the main benefit is avoiding synchronization, shouldn't we just stick w/ TreeMap for simplicity?

I'm attaching a second patch with the micro-benchmark that I've used and the TreeMap implementation, so that people can look for themselves. The test simply creates a CF, adds columns to it (in sorted order) and does a simple iteration at the end. I've also added a delete at the end because, at least in the case of super columns, we do call removeDeleted, so the goal was to see whether this has a significant impact (the deletes are made at the beginning of the CF, which is the worst case for the ArrayBacked solution). The test also allows some column overlap (to exercise reconciliation). Note that when that happens, the input is not in strict sorted order anymore, but that is mostly to the disadvantage of the ArrayBacked implementation too.

Playing with the parameters (number of columns added, number that overlap, number of deletes), the results seem to always be the same. The ArrayBacked implementation is consistently faster than the TreeMap one, which is itself consistently faster than the CSLM one. Now, what I meant is that the difference between ArrayBacked and TreeMap is generally not as big as the one with CSLM, but it is still often very noticeable. This is no surprise in the end: the ArrayBacked solution is optimized for insertion in sorted order, where the insertion is O(1) with a small constant factor because we're using ArrayList. TreeMap can't beat that.

Given this, and given that ColumnFamily is one of our core data structures, I think we should choose the more efficient implementation for each use case. And truth is, the ArrayBacked implementation is really not very complicated; that's basic stuff.

bq. That's odd, because adding in sorted order is actually worst-case for CSLM, and that's what we do on reads.

Yeah, I was a bit quick on that statement. Rerunning my micro-benchmarks does show that we're much, much faster even without reconciliation happening.
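For readers following along, the tail check with a binary-search fallback described in the comment above can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the attached patches: the class name ArrayBackedColumnsSketch, the nested Col type (String names instead of ByteBuffer) and the "newest timestamp wins" reconcile rule are all hypothetical simplifications.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative sketch only; names and types are assumptions, not the patch's API. */
final class ArrayBackedColumnsSketch {
    /** Hypothetical stand-in for a column: name, value and write timestamp. */
    static final class Col implements Comparable<Col> {
        final String name;
        final String value;
        final long timestamp;

        Col(String name, String value, long timestamp) {
            this.name = name;
            this.value = value;
            this.timestamp = timestamp;
        }

        @Override
        public int compareTo(Col other) {
            return name.compareTo(other.name); // ordering is by column name only
        }
    }

    private final List<Col> columns = new ArrayList<>();

    /** Simplified reconciliation: keep whichever column has the newer timestamp. */
    private static Col reconcile(Col existing, Col incoming) {
        return incoming.timestamp >= existing.timestamp ? incoming : existing;
    }

    void addColumn(Col c) {
        int last = columns.size() - 1;
        int cmp = last < 0 ? 1 : c.compareTo(columns.get(last));
        if (cmp > 0) {
            columns.add(c); // fast path: input arrives in sorted order, O(1) append
        } else if (cmp == 0) {
            columns.set(last, reconcile(columns.get(last), c)); // same name as the tail
        } else {
            // Slow path: input was not sorted, so locate the slot by binary search.
            int pos = Collections.binarySearch(columns, c);
            if (pos >= 0) {
                columns.set(pos, reconcile(columns.get(pos), c));
            } else {
                columns.add(-pos - 1, c); // binarySearch encodes the insertion point
            }
        }
    }

    List<String> names() {
        List<String> out = new ArrayList<>(columns.size());
        for (Col c : columns) out.add(c.name);
        return out;
    }

    String valueOf(String name) {
        int pos = Collections.binarySearch(columns, new Col(name, null, 0L));
        return pos >= 0 ? columns.get(pos).value : null;
    }

    public static void main(String[] args) {
        ArrayBackedColumnsSketch cf = new ArrayBackedColumnsSketch();
        cf.addColumn(new Col("a", "1", 1L));
        cf.addColumn(new Col("c", "3", 1L));
        cf.addColumn(new Col("b", "2", 1L));  // out of order: takes the slow path
        cf.addColumn(new Col("c", "33", 2L)); // same name as the tail: reconciled
        System.out.println(cf.names() + " c=" + cf.valueOf("c")); // prints "[a, b, c] c=33"
    }
}
```

With sorted input only the first two branches are ever taken, so the extra cost over an append-only version is the single tail comparison mentioned above; the binary search only runs when the ordering assumption is violated.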
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: fast_cf.diff

diff file

better performance on long row read
---
Key: CASSANDRA-2843
URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
Project: Cassandra
Issue Type: New Feature
Reporter: Yang Yang
Attachments: fast_cf.diff

Currently, if a row contains 1000 columns, reads become considerably slow: my test of a row with 3000 columns (standard, regular), each with 8 bytes in the name and 40 bytes in the value, takes about 16ms. This all runs in memory; no disk read is involved.

Through debugging we can find that most of this time is spent in:

[Wall Time] org.apache.cassandra.db.Table.getRow(QueryFilter)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, int, ColumnFamily)
[Wall Time] org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily, Iterator, int)
[Wall Time] org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer, Iterator, int)
[Wall Time] org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)

ColumnFamily.addColumn() is slow because it inserts into an internal ConcurrentSkipListMap that maps column names to values. This structure is slow for two reasons: it needs to do synchronization, and it needs to maintain a more complex map structure.

But if we look at the whole read path, Thrift already defines the read output to be List<ColumnOrSuperColumn>, so it does not make sense to use a luxury map data structure in the interim and finally convert it to a list. On the synchronization side, since the returned CF is never going to be shared or modified by other threads, we know the access is always single-threaded, so no synchronization is needed.

But these 2 features are indeed needed for ColumnFamily in other cases, particularly writes. So we can provide a different ColumnFamily to CFS.getTopLevelColumnFamily(), so that getTopLevelColumnFamily no longer always creates the standard ColumnFamily but takes a provided returnCF, which is much cheaper.

The provided patch is for demonstration now; I will work on it further once we agree on the general direction. CFS, ColumnFamily, and Table are changed; a new FastColumnFamily is provided. The main work is to let FastColumnFamily use an array for internal storage. At first I used binary search to insert new columns in addColumn(), but later I found that even this is not necessary, since all calling scenarios of ColumnFamily.addColumn() have the invariant that the inserted columns come in sorted order (I still have an issue to resolve with descending vs. ascending order, but ascending works). So the current logic is simply to compare the new column against the last column in the array: if the names are not equal, append; if equal, reconcile.

Slight temporary hacks are made to getTopLevelColumnFamily, so we have 2 flavors of the method, one accepting a returnCF. But we could definitely think about a better way to provide this returnCF.

This patch compiles fine; no tests are provided yet. But I tested it in my application, and the performance improvement is dramatic: it offers about a 50% reduction in read time in the 3000-column case.

thanks
Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
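To make the returnCF idea in the description above a little more concrete, here is a minimal sketch, assuming a heavily simplified model: columns are plain name/value String pairs, reconciliation is "the later value wins", and the names ReadPathSketch, ColumnSink, SkipListSink and AppendOnlySink are hypothetical and do not appear in the patch. It only shows the shape of the change: the read path fills whatever container the caller passes in, and the cheap container appends or reconciles against its last element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

/** Illustrative sketch only; every name here is an assumption, not Cassandra's API. */
final class ReadPathSketch {
    /** Hypothetical minimal result container; the real ColumnFamily API is far larger. */
    interface ColumnSink {
        void addColumn(String name, String value);
        List<String> toList();
    }

    /** Mirrors the existing behaviour: a thread-safe sorted map, flattened at the end. */
    static final class SkipListSink implements ColumnSink {
        private final ConcurrentSkipListMap<String, String> map = new ConcurrentSkipListMap<>();

        @Override
        public void addColumn(String name, String value) {
            map.put(name, value); // later value wins (simplified reconciliation)
        }

        @Override
        public List<String> toList() {
            List<String> out = new ArrayList<>();
            for (Map.Entry<String, String> e : map.entrySet()) {
                out.add(e.getKey() + "=" + e.getValue());
            }
            return out;
        }
    }

    /** The cheaper container: assumes callers add columns in ascending name order. */
    static final class AppendOnlySink implements ColumnSink {
        private final List<String> names = new ArrayList<>();
        private final List<String> values = new ArrayList<>();

        @Override
        public void addColumn(String name, String value) {
            int last = names.size() - 1;
            if (last >= 0 && names.get(last).equals(name)) {
                values.set(last, value); // same name as the last column: reconcile
            } else {
                names.add(name);         // different name: append
                values.add(value);
            }
        }

        @Override
        public List<String> toList() {
            List<String> out = new ArrayList<>(names.size());
            for (int i = 0; i < names.size(); i++) {
                out.add(names.get(i) + "=" + values.get(i));
            }
            return out;
        }
    }

    /** The read path fills the container supplied by the caller (the "returnCF"). */
    static List<String> collect(List<String[]> sortedColumns, ColumnSink returnCF) {
        for (String[] c : sortedColumns) {
            returnCF.addColumn(c[0], c[1]);
        }
        return returnCF.toList();
    }

    public static void main(String[] args) {
        List<String[]> input = Arrays.asList(
                new String[] {"a", "1"}, new String[] {"b", "2"},
                new String[] {"b", "3"}, new String[] {"c", "4"});
        System.out.println(collect(input, new SkipListSink()));   // prints "[a=1, b=3, c=4]"
        System.out.println(collect(input, new AppendOnlySink())); // prints "[a=1, b=3, c=4]"
    }
}
```

Both containers give the same list for sorted input; the append-only one silently produces an unsorted result if the ordering invariant is ever broken, which is the risk raised elsewhere in this thread.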
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: b.tar.gz

just untar this file into the 0.8.0-rc1 source tree, then compile
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: (was: fast_cf.diff)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: (was: b.tar.gz)
[jira] [Updated] (CASSANDRA-2843) better performance on long row read
[ https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yang Yang updated CASSANDRA-2843:
Attachment: fast_cf_081_trunk.diff