subject:"\[jira\] \[Updated\] \(CASSANDRA\-2843\) better performance on long row read"


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_d.patch

the DeletionInfo private=protected change is moved to 
https://issues.apache.org/jira/browse/CASSANDRA-2937

new patch uploaded here

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, 2843_d.patch, 2843_d.patch, 
 fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, 
 std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_d.patch

the DeletionInfo private=protected change is moved to 
https://issues.apache.org/jira/browse/CASSANDRA-2937

new patch uploaded here

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, 2843_d.patch, 2843_d.patch, 
 fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, 
 std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: 2843_d.patch)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, 2843_d.patch, 2843_d.patch, 
 fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, 
 std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: 2843_c.patch)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, 2843_d.patch, 2843_d.patch, 
 fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, 
 std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_d.patch

the DeletionInfo private=protected change is moved to 
https://issues.apache.org/jira/browse/CASSANDRA-2937

new patch uploaded here

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, 2843_d.patch, 2843_d.patch, 
 fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, 
 std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: 2843_d.patch)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, 2843_d.patch, 
 fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, 
 std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: incremental.diff)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, 
 patch_timing, std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Comment: was deleted

(was: the DeletionInfo private=protected change is moved to 
https://issues.apache.org/jira/browse/CASSANDRA-2937

new patch uploaded here)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, 
 patch_timing, std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: 2843_d.patch)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, 
 patch_timing, std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Comment: was deleted

(was: the DeletionInfo private=protected change is moved to 
https://issues.apache.org/jira/browse/CASSANDRA-2937

new patch uploaded here)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, 
 patch_timing, std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Comment: was deleted

(was: the DeletionInfo private=protected change is moved to 
https://issues.apache.org/jira/browse/CASSANDRA-2937

new patch uploaded here)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, 
 patch_timing, std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_d.patch

the DeletionInfo private=protected change is moved to 
https://issues.apache.org/jira/browse/CASSANDRA-2937

new patch uploaded here

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, 2843_d.patch, 2843_d.patch, 
 fast_cf_081_trunk.diff, incremental.diff, microBenchmark.patch, patch_timing, 
 std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: fast_cf_081_trunk.diff)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_d.patch, microBenchmark.patch, 
 patch_timing, std_timing


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: 2843_b.patch)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, 
 microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_c.patch

rebased , against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: 2843_c.patch)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, 
 microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_c.patch

fixed a bug in my newly added test; also the DeletionInfo class in 
AbstractColumnContainer somehow gives compile error in eclipse, had to change 
that into protected.

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Comment: was deleted

(was: rebased , against HEAD of trunk 
(4629648899e637e8e03938935f126689cce5ad48))

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: 2843_c.patch)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, 
 microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Comment: was deleted

(was: 

rebased against against HEAD of trunk (4629648899e637e8e03938935f126689cce5ad48)


also fixed a bug in my newly added test; also the DeletionInfo class in 
AbstractColumnContainer somehow gives compile error in eclipse, had to change 
that into protected. )

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, fast_cf_081_trunk.diff, incremental.diff, 
 microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_c.patch

rebased against 4629648899e637e8e03938935f126689cce5ad48

also fixed a bug in my test,
the AbstractColumnContainer.DeletionInfo has to be protected, otherwise eclipse 
gives a compile error

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_c.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-11 Thread Yang Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Comment: was deleted

(was: i just got down the patch and transfered it to a computer to read
Sylvain's approach to compare the last element is quite clean  i see no
problems

the only problem was due to me: the bin search high=mid-1  should be changed
to high=mid
also with this error fixed , you dont need to special case 1 2 in the end of
binsearch


)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, fast_cf_081_trunk.diff, microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-11 Thread Yang Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: incremental.diff

the minor typo fix *on top of** Sylvain's patch, for easier reading

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_b.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-11 Thread Yang Yang (JIRA)


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: 2843_b.patch

patch against the same base as Sylvain's patch, fixing one typo

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: 2843.patch, 2843_b.patch, fast_cf_081_trunk.diff, 
 incremental.diff, microBenchmark.patch


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-07 Thread Yang Yang (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Yang Yang updated CASSANDRA-2843:
-

Comment: was deleted

(was: Thanks Sylvain

Its great to see this taking shape

Im on vacation with a pbone so very cursory comments

this implementation should not expect the input always come in already
sorted order then probably it would be difficult to achieve a lot of per
gain easily. Mainly there are two areas of attack. Synch and cheaper data
structure. Actually during my tests I found that sync is not the main issue:
going from CSLM to treemap is actually slower. Your tests show different so
we'd better confirm. If I were correct in tests. Then that means not
assuming special circumstances would be like finding a generic silver bullet
which is very difficult
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
ColumnFamily during is a good idea. It is clear that avoiding
synchronization will be faster, and given the type of operations we do
during reads (insertion in sorted order and iteration), an ArrayList backed
solution is sure to be faster too. I will also be much gentle on the GC that
the linked list ConcurrentSkipListMap uses. I think that all those will help
even with relatively small reads. So let's focus on that for this ticket and
let other potential improvement to other ticket, especially if it is unclear
they bear any noticeable speedup.
is quite frankly ugly and will be a maintenance nightmare (you'll have to
check you did overwrite every function that touch the map (which is not the
case in the patch) and every update to ColumnFamily have to be aware that it
should update FastColumnFamily as well).
functionnal ColumnFamily implementation (albeit not synchronized). That is,
we can't assume that addition will always be in strict increasing order,
otherwise again this will be too hard to use.
Granted, I don't think it is used in the read path, but I think that the new
ColumnFamily implementation could advantageously be used during compaction
(by preCompactedRow typically, and possibly other places where concurrent
access is not an issue) where this would matter.
the remarks above. The patch is against trunk (not 0.8 branch), because it
build on the recently committed refactor of ColumnFamily. It refactors
ColumnFamily (AbstractColumnContainer actually) to allow for a pluggable
backing column map. The ConcurrentSkipListMap implemn is name
ThreadSafeColumnMap and the new one is called ArrayBackedColumnMap (which I
prefer to FastSomething since it's not a very helpful name).
getTopLevelColumns, I pass along a factory (that each backing implementation
provides). The main goal was to avoid creating a columnFamily when it's
useless (if row cache is enabled on the CF -- btw, this ticket only improve
on read for column family with no cache).
(addition of column + iteration), the ArrayBacked implementation is faster
than the ConcurrentSkipListMap based one. Interestingly though, this is
mainly true when some reconciliation of columns happens. That is, if you
only add columns with different names, the ArrayBacked implementation is
faster, but not dramatically so. If you start adding column that have to be
resolved, the ArrayBacked implementation becomes much faster, even with a
reasonably small number of columns (inserting 100 columns with only 10
unique column names, the ArrayBacked is already 30% faster). And this
mostly due to the overhead of synchronization (of replace()): a TreeMap
based implementation is slightly slower than the ArrayBacked one but not by
a lot and thus is much faster than the ConcurrentSkipListMap implementation.
use a few unit test for the new ArrayBacked implementation).
considerably slow (my test of
and 40 bytes in value, is about 16ms.
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter,
ColumnFamily)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int,
ColumnFamily)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter,
int, ColumnFamily)
org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
Iterator, int)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
Iterator, int)
concurrentSkipListMap() that maps column names to values.
it needs to maintain a more complex structure of map.
output to be ListColumnOrSuperColumn so it does not make sense to use a
luxury map data structure in the interium and finally convert it to a list.
on the synchronization side, since the return CF is never going to be
shared/modified by other threads, we know the access is always single
thread, so no synchronization is needed.
particularly write. so we can provide a different ColumnFamily to
CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always
creates the standard ColumnFamily,

[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-07 Thread Yang Yang (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Yang Yang updated CASSANDRA-2843:
-

Comment: was deleted

(was: My comment before last was not quite correct

The array implementation with binary search also works without special
assumptions. Just that with assumption alot futher gains
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
ColumnFamily during is a good idea. It is clear that avoiding
synchronization will be faster, and given the type of operations we do
during reads (insertion in sorted order and iteration), an ArrayList backed
solution is sure to be faster too. I will also be much gentle on the GC that
the linked list ConcurrentSkipListMap uses. I think that all those will help
even with relatively small reads. So let's focus on that for this ticket and
let other potential improvement to other ticket, especially if it is unclear
they bear any noticeable speedup.
is quite frankly ugly and will be a maintenance nightmare (you'll have to
check you did overwrite every function that touch the map (which is not the
case in the patch) and every update to ColumnFamily have to be aware that it
should update FastColumnFamily as well).
functionnal ColumnFamily implementation (albeit not synchronized). That is,
we can't assume that addition will always be in strict increasing order,
otherwise again this will be too hard to use.
Granted, I don't think it is used in the read path, but I think that the new
ColumnFamily implementation could advantageously be used during compaction
(by preCompactedRow typically, and possibly other places where concurrent
access is not an issue) where this would matter.
the remarks above. The patch is against trunk (not 0.8 branch), because it
build on the recently committed refactor of ColumnFamily. It refactors
ColumnFamily (AbstractColumnContainer actually) to allow for a pluggable
backing column map. The ConcurrentSkipListMap implemn is name
ThreadSafeColumnMap and the new one is called ArrayBackedColumnMap (which I
prefer to FastSomething since it's not a very helpful name).
getTopLevelColumns, I pass along a factory (that each backing implementation
provides). The main goal was to avoid creating a columnFamily when it's
useless (if row cache is enabled on the CF -- btw, this ticket only improve
on read for column family with no cache).
(addition of column + iteration), the ArrayBacked implementation is faster
than the ConcurrentSkipListMap based one. Interestingly though, this is
mainly true when some reconciliation of columns happens. That is, if you
only add columns with different names, the ArrayBacked implementation is
faster, but not dramatically so. If you start adding column that have to be
resolved, the ArrayBacked implementation becomes much faster, even with a
reasonably small number of columns (inserting 100 columns with only 10
unique column names, the ArrayBacked is already 30% faster). And this
mostly due to the overhead of synchronization (of replace()): a TreeMap
based implementation is slightly slower than the ArrayBacked one but not by
a lot and thus is much faster than the ConcurrentSkipListMap implementation.
use a few unit test for the new ArrayBacked implementation).
considerably slow (my test of
and 40 bytes in value, is about 16ms.
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter,
ColumnFamily)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int,
ColumnFamily)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter,
int, ColumnFamily)
org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
Iterator, int)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
Iterator, int)
concurrentSkipListMap() that maps column names to values.
it needs to maintain a more complex structure of map.
output to be ListColumnOrSuperColumn so it does not make sense to use a
luxury map data structure in the interium and finally convert it to a list.
on the synchronization side, since the return CF is never going to be
shared/modified by other threads, we know the access is always single
thread, so no synchronization is needed.
particularly write. so we can provide a different ColumnFamily to
CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always
creates the standard ColumnFamily, but take a provided returnCF, whose cost
is much cheaper.
agree on the general direction.
provided. the main work is to let the FastColumnFamily use an array for
internal storage. at first I used binary search to insert new columns in
addColumn(), but later I found that even this is not necessary, since all
calling scenarios of ColumnFamily.addColumn() has an invariant that the
inserted columns come in sorted order (I still have an issue to resolve
descending or ascending

[jira] [Updated] (CASSANDRA-2843) better performance on long row read

2011-07-06 Thread Sylvain Lebresne (JIRA)

[
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Sylvain Lebresne updated CASSANDRA-2843:

Attachment: microBenchmark.patch

bq. this implementation should not expect the input always come in already
sorted order

Yeah, that's not really what I said. I said that it should not *assume* it.
Which doesn't mean it cannot optimize for it. If you look at the version I
attached, at least as far a addColumn is concerned, it does the exact same
thing as your version, with the only difference that I first check if adding at
the tail is legit and fail back to a binary search if that is not the case.
That is, as long as the input is in sorted order, it will be as fast as your
implementation (there is one more bytebuffer comparison but I'm willing to bet
that it has no noticeable impact on performance). But it won't create unsorted
CF if the input is not in sorted order.

Btw, Yang, can you try to fix you 3 last comments ?

bq. If the main benefit is avoiding synchronization, shouldn't we just stick w/
TreeMap for simplicity?

I'm attaching a second patch with the micro-benchmark that I've used and the
TreeMap implementation so that people can look for themselves. The test simply
creates a CF, add columns to it (in sorted order) and do a simple iteration at
the end. I've also add a delete at the end because at least in the case of
super columns, we do call removeDeleted so the goal was to see if this has a
significant impact (the deletes are made at the beginning of the CF, which is
the worst case for the ArrayBacked solution). The test also allow to have some
column overwrap (to exercise reconciliation). Not that when that happens, the
input is not in strict sorted order anymore, but it's mostly at the
disadvantage of the ArrayBack implementation there too. Playing with the
parameters (number of columns added, number that overlaps, number of deletes)
the results seems to always be the same. The ArrayBacked is consistently faster
than the TreeMap one that is itself consistently faster than the CSLM one. Now
what I meant is that the difference between ArrayBacked and TreeMap is
generally not as big as the one with CSLM, but it is still often very
noticeable.

This is no surprise in the end, the ArrayBacked solution is optimized for
insertion in sorted order: the insertion is then O(1) and with a small constant
factor because we're using ArrayList. TreeMap can't beat that. Given this, and
given that ColumnFamily is one of our core data structure, I think we should
choose the more efficient implementation for each use case. And truth is, the
ArrayBacked implementation is really not very complicated, that's basic stuff.

bq. That's odd, because adding in sorted order is actually worst-case for CSLM,
and that's what we do on reads.

Yeah, I was a bit quick on that statement. Rerunning my micro-benchmarks does
show that we're much much faster even without reconciliation happening.

better performance on long row read
---

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: fast_cf.diff

diff file

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: fast_cf.diff


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: b.tar.gz

just untar this file into the 0.8.0-rc1  source tree, then compile

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang
 Attachments: b.tar.gz, fast_cf.diff


 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: fast_cf.diff)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang

 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read


 [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yang Yang updated CASSANDRA-2843:
-

Attachment: (was: b.tar.gz)

 better performance on long row read
 ---

 Key: CASSANDRA-2843
 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
 Project: Cassandra
  Issue Type: New Feature
Reporter: Yang Yang

 currently if a row contains  1000 columns, the run time becomes considerably 
 slow (my test of 
 a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
 40 bytes in value, is about 16ms.
 this is all running in memory, no disk read is involved.
 through debugging we can find
 most of this time is spent on 
 [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
 ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
 int, ColumnFamily)
 [Wall Time]  
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
  Iterator, int)
 [Wall Time]  
 org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
  Iterator, int)
 [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
 ColumnFamily.addColumn() is slow because it inserts into an internal 
 concurrentSkipListMap() that maps column names to values.
 this structure is slow for two reasons: it needs to do synchronization; it 
 needs to maintain a more complex structure of map.
 but if we look at the whole read path, thrift already defines the read output 
 to be ListColumnOrSuperColumn so it does not make sense to use a luxury map 
 data structure in the interium and finally convert it to a list. on the 
 synchronization side, since the return CF is never going to be 
 shared/modified by other threads, we know the access is always single thread, 
 so no synchronization is needed.
 but these 2 features are indeed needed for ColumnFamily in other cases, 
 particularly write. so we can provide a different ColumnFamily to 
 CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
 creates the standard ColumnFamily, but take a provided returnCF, whose cost 
 is much cheaper.
 the provided patch is for demonstration now, will work further once we agree 
 on the general direction. 
 CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
 provided. the main work is to let the FastColumnFamily use an array  for 
 internal storage. at first I used binary search to insert new columns in 
 addColumn(), but later I found that even this is not necessary, since all 
 calling scenarios of ColumnFamily.addColumn() has an invariant that the 
 inserted columns come in sorted order (I still have an issue to resolve 
 descending or ascending  now, but ascending works). so the current logic is 
 simply to compare the new column against the end column in the array, if 
 names not equal, append, if equal, reconcile.
 slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
 flavors of the method, one accepting a returnCF. but we could definitely 
 think about what is the better way to provide this returnCF.
 this patch compiles fine, no tests are provided yet. but I tested it in my 
 application, and the performance improvement is dramatic: it offers about 50% 
 reduction in read time in the 3000-column case.
 thanks
 Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (CASSANDRA-2843) better performance on long row read