[jira] [Issue Comment Edited] (CASSANDRA-2843) better performance on long row read

Yang Yang (JIRA) Thu, 07 Jul 2011 01:36:57 -0700

    [ 
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061134#comment-13061134
 ]


Yang Yang edited comment on CASSANDRA-2843 at 7/7/11 8:34 AM:
--------------------------------------------------------------

i just got down the patch and transfered it to a computer to read
Sylvain's approach to compare the last element is quite clean  i see no
problems

the only problem was due to me: the bin search high=mid-1  should be changed
to high=mid
also with this error fixed , you dont need to special case 1 2 in the end of
binsearch






https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
already sorted order"
it.  Which doesn't mean it cannot optimize for it. If you look at the
version I attached, at least as far a addColumn is concerned, it does the
exact same thing as your version, with the only difference that I first
check if adding at the tail is legit and fail back to a binary search if
that is not the case.  That is, as long as the input is in sorted order, it
will be as fast as your implementation (there is one more bytebuffer
comparison but I'm willing to bet that it has no noticeable impact on
performance). But it won't create unsorted CF if the input is not in sorted
order.
stick w/ TreeMap for simplicity?
the TreeMap implementation so that people can look for themselves. The test
simply creates a CF, add columns to it (in sorted order) and do a simple
iteration at the end. I've also add a delete at the end because at least in
the case of super columns, we do call removeDeleted so the goal was to see
if this has a significant impact (the deletes are made at the beginning of
the CF, which is the worst case for the ArrayBacked solution). The test also
allow to have some column overwrap (to exercise reconciliation). Not that
when that happens, the input is not in strict sorted order anymore, but it's
mostly at the disadvantage of the ArrayBack implementation there too.
 Playing with the parameters (number of columns added, number that overlaps,
number of deletes) the results seems to always be the same. The ArrayBacked
is consistently faster than the TreeMap one that is itself consistently
faster than the CSLM one. Now what I meant is that the difference between
ArrayBacked and TreeMap is generally not as big as the one with CSLM, but it
is still often very noticeable.
insertion in sorted order: the insertion is then O(1) and with a small
constant factor because we're using ArrayList. TreeMap can't beat that.
Given this, and given that ColumnFamily is one of our core data structure, I
think we should choose the more efficient implementation for each use case.
And truth is, the ArrayBacked implementation is really not very complicated,
that's basic stuff.
CSLM, and that's what we do on reads.
does show that we're much much faster even without reconciliation happening.
https://issues.apache.org/jira/browse/CASSANDRA-2843
microBenchmark.patch
considerably slow (my test of
and 40 bytes in value, is about 16ms.
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter,
ColumnFamily)
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int,
ColumnFamily)
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter,
int, ColumnFamily)
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
Iterator, int)
 
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
Iterator, int)
concurrentSkipListMap() that maps column names to values.
it needs to maintain a more complex structure of map.
output to be List<ColumnOrSuperColumn> so it does not make sense to use a
luxury map data structure in the interium and finally convert it to a list.
on the synchronization side, since the return CF is never going to be
shared/modified by other threads, we know the access is always single
thread, so no synchronization is needed.
particularly write. so we can provide a different ColumnFamily to
CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always
creates the standard ColumnFamily, but take a provided returnCF, whose cost
is much cheaper.
agree on the general direction.
provided. the main work is to let the FastColumnFamily use an array  for
internal storage. at first I used binary search to insert new columns in
addColumn(), but later I found that even this is not necessary, since all
calling scenarios of ColumnFamily.addColumn() has an invariant that the
inserted columns come in sorted order (I still have an issue to resolve
descending or ascending  now, but ascending works). so the current logic is
simply to compare the new column against the end column in the array, if
names not equal, append, if equal, reconcile.
flavors of the method, one accepting a returnCF. but we could definitely
think about what is the better way to provide this returnCF.
my application, and the performance improvement is dramatic: it offers about
50% reduction in read time in the 3000-column case.


      was (Author: yangyangyyy):
    i just got down the patch and transfered it to a computer to read
Sylvain's approach to compare the last element is quite clean  i see no
problems

the only problem was due to me: the bin search high=mid-1  should be changed
to high=mid
also with this error fixed , you dont need to special case 1 2 in the end of
binsearch
https://issues.apache.org/jira/browse/CASSANDRA-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]
already sorted order"
it.  Which doesn't mean it cannot optimize for it. If you look at the
version I attached, at least as far a addColumn is concerned, it does the
exact same thing as your version, with the only difference that I first
check if adding at the tail is legit and fail back to a binary search if
that is not the case.  That is, as long as the input is in sorted order, it
will be as fast as your implementation (there is one more bytebuffer
comparison but I'm willing to bet that it has no noticeable impact on
performance). But it won't create unsorted CF if the input is not in sorted
order.
stick w/ TreeMap for simplicity?
the TreeMap implementation so that people can look for themselves. The test
simply creates a CF, add columns to it (in sorted order) and do a simple
iteration at the end. I've also add a delete at the end because at least in
the case of super columns, we do call removeDeleted so the goal was to see
if this has a significant impact (the deletes are made at the beginning of
the CF, which is the worst case for the ArrayBacked solution). The test also
allow to have some column overwrap (to exercise reconciliation). Not that
when that happens, the input is not in strict sorted order anymore, but it's
mostly at the disadvantage of the ArrayBack implementation there too.
 Playing with the parameters (number of columns added, number that overlaps,
number of deletes) the results seems to always be the same. The ArrayBacked
is consistently faster than the TreeMap one that is itself consistently
faster than the CSLM one. Now what I meant is that the difference between
ArrayBacked and TreeMap is generally not as big as the one with CSLM, but it
is still often very noticeable.
insertion in sorted order: the insertion is then O(1) and with a small
constant factor because we're using ArrayList. TreeMap can't beat that.
Given this, and given that ColumnFamily is one of our core data structure, I
think we should choose the more efficient implementation for each use case.
And truth is, the ArrayBacked implementation is really not very complicated,
that's basic stuff.
CSLM, and that's what we do on reads.
does show that we're much much faster even without reconciliation happening.
https://issues.apache.org/jira/browse/CASSANDRA-2843
microBenchmark.patch
considerably slow (my test of
and 40 bytes in value, is about 16ms.
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter,
ColumnFamily)
 org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int,
ColumnFamily)
 org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter,
int, ColumnFamily)
 org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
Iterator, int)
 
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
Iterator, int)
concurrentSkipListMap() that maps column names to values.
it needs to maintain a more complex structure of map.
output to be List<ColumnOrSuperColumn> so it does not make sense to use a
luxury map data structure in the interium and finally convert it to a list.
on the synchronization side, since the return CF is never going to be
shared/modified by other threads, we know the access is always single
thread, so no synchronization is needed.
particularly write. so we can provide a different ColumnFamily to
CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always
creates the standard ColumnFamily, but take a provided returnCF, whose cost
is much cheaper.
agree on the general direction.
provided. the main work is to let the FastColumnFamily use an array  for
internal storage. at first I used binary search to insert new columns in
addColumn(), but later I found that even this is not necessary, since all
calling scenarios of ColumnFamily.addColumn() has an invariant that the
inserted columns come in sorted order (I still have an issue to resolve
descending or ascending  now, but ascending works). so the current logic is
simply to compare the new column against the end column in the array, if
names not equal, append, if equal, reconcile.
flavors of the method, one accepting a returnCF. but we could definitely
think about what is the better way to provide this returnCF.
my application, and the performance improvement is dramatic: it offers about
50% reduction in read time in the 3000-column case.

  
> better performance on long row read
> -----------------------------------
>
>                 Key: CASSANDRA-2843
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2843
>             Project: Cassandra
>          Issue Type: New Feature
>            Reporter: Yang Yang
>         Attachments: 2843.patch, fast_cf_081_trunk.diff, microBenchmark.patch
>
>
> currently if a row contains > 1000 columns, the run time becomes considerably 
> slow (my test of 
> a row with 30 00 columns (standard, regular) each with 8 bytes in name, and 
> 40 bytes in value, is about 16ms.
> this is all running in memory, no disk read is involved.
> through debugging we can find
> most of this time is spent on 
> [Wall Time]  org.apache.cassandra.db.Table.getRow(QueryFilter)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, 
> ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(QueryFilter, int, 
> ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(QueryFilter, 
> int, ColumnFamily)
> [Wall Time]  
> org.apache.cassandra.db.filter.QueryFilter.collectCollatedColumns(ColumnFamily,
>  Iterator, int)
> [Wall Time]  
> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(IColumnContainer,
>  Iterator, int)
> [Wall Time]  org.apache.cassandra.db.ColumnFamily.addColumn(IColumn)
> ColumnFamily.addColumn() is slow because it inserts into an internal 
> concurrentSkipListMap() that maps column names to values.
> this structure is slow for two reasons: it needs to do synchronization; it 
> needs to maintain a more complex structure of map.
> but if we look at the whole read path, thrift already defines the read output 
> to be List<ColumnOrSuperColumn> so it does not make sense to use a luxury map 
> data structure in the interium and finally convert it to a list. on the 
> synchronization side, since the return CF is never going to be 
> shared/modified by other threads, we know the access is always single thread, 
> so no synchronization is needed.
> but these 2 features are indeed needed for ColumnFamily in other cases, 
> particularly write. so we can provide a different ColumnFamily to 
> CFS.getTopLevelColumnFamily(), so getTopLevelColumnFamily no longer always 
> creates the standard ColumnFamily, but take a provided returnCF, whose cost 
> is much cheaper.
> the provided patch is for demonstration now, will work further once we agree 
> on the general direction. 
> CFS, ColumnFamily, and Table  are changed; a new FastColumnFamily is 
> provided. the main work is to let the FastColumnFamily use an array  for 
> internal storage. at first I used binary search to insert new columns in 
> addColumn(), but later I found that even this is not necessary, since all 
> calling scenarios of ColumnFamily.addColumn() has an invariant that the 
> inserted columns come in sorted order (I still have an issue to resolve 
> descending or ascending  now, but ascending works). so the current logic is 
> simply to compare the new column against the end column in the array, if 
> names not equal, append, if equal, reconcile.
> slight temporary hacks are made on getTopLevelColumnFamily so we have 2 
> flavors of the method, one accepting a returnCF. but we could definitely 
> think about what is the better way to provide this returnCF.
> this patch compiles fine, no tests are provided yet. but I tested it in my 
> application, and the performance improvement is dramatic: it offers about 50% 
> reduction in read time in the 3000-column case.
> thanks
> Yang

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (CASSANDRA-2843) better performance on long row read

Reply via email to