[
https://issues.apache.org/jira/browse/CASSANDRA-13482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16044202#comment-16044202
]
Alex Petrov commented on CASSANDRA-13482:
-----------------------------------------
I've composed a patch to mitigate the problem.
In order to fix it, we have to allow concatenating iterators with different
numbers of columns, while making sure that the cases with wrapped iterators,
limits, stopping and empty iterators all behave predictably.
While working on this problem I have also discovered a slight inconsistency
in the way concatenation and {{MoreRows}} work right now: the {{DataLimits}}
filter
[here|https://github.com/apache/cassandra/blob/a87b15d1d6c42f4247c84b460ed39899d8813a6f/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java#L423]
will call {{stopInPartition}}, which would effectively "stop" the original
{{iter}} iterator, and make it return {{hasNext() => false}}, even though only
one row was consumed.
Right now, it works only because {{concat}} internally takes {{input}}
from the {{BaseIterator}} and discards this {{isStopped}} state in
[tryGetMoreContent|https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/db/transform/BaseIterator.java#L124].
In the context of this patch this would mean that we'd get only the cached
results and skip reading from memtables and sstables.
In other words, the current behaviour can be described as follows:
{code}
iter1 = /* some iterator yielding: 1, 2, 3 */;
iter2 = /* some iterator yielding: 3, 4, 5, 6, 7, 8, 9 */;
concatenated = concat(iter1, DataLimits.cqlLimits(3).filter(iter2));
{code}
This results in {{concatenated}} yielding 1 through 9, which is incorrect:
the limit of 3 applied to {{iter2}} is not honoured.
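To make the difference concrete, here is a minimal, self-contained sketch of the two behaviours. It is a toy model built on plain {{java.util.Iterator}}, not the Cassandra classes: {{Limited}}, {{concatIgnoringStop}} and {{drain}} are hypothetical names invented for this illustration, and {{Limited}} only stands in for the idea of a limit filter that stops its source.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Toy model only: none of these types are Cassandra's.
class ConcatLimitSketch {
    /** Toy limit filter: reports itself as exhausted once `limit` items were returned. */
    static final class Limited<T> implements Iterator<T> {
        final Iterator<T> input; // the wrapped source, which may still hold unread items
        private final int limit;
        private int returned = 0;

        Limited(Iterator<T> input, int limit) {
            this.input = input;
            this.limit = limit;
        }

        @Override public boolean hasNext() { return returned < limit && input.hasNext(); }

        @Override public T next() {
            if (!hasNext()) throw new NoSuchElementException();
            returned++;
            return input.next();
        }
    }

    /** Concatenation that asks each iterator itself, so a stopped iterator stays stopped. */
    static <T> Iterator<T> concat(Iterator<T> first, Iterator<T> second) {
        return new Iterator<T>() {
            @Override public boolean hasNext() { return first.hasNext() || second.hasNext(); }
            @Override public T next() { return first.hasNext() ? first.next() : second.next(); }
        };
    }

    /** Concatenation that unwraps the limited iterator and therefore loses its "stopped" state. */
    static <T> Iterator<T> concatIgnoringStop(Iterator<T> first, Limited<T> second) {
        return concat(first, second.input);
    }

    /** Collects everything an iterator yields into a list. */
    static <T> List<T> drain(Iterator<T> it) {
        List<T> out = new ArrayList<>();
        while (it.hasNext()) out.add(it.next());
        return out;
    }

    public static void main(String[] args) {
        List<Integer> a = Arrays.asList(1, 2, 3);
        List<Integer> b = Arrays.asList(3, 4, 5, 6, 7, 8, 9);

        // Limit is lost: prints [1, 2, 3, 3, 4, 5, 6, 7, 8, 9]
        System.out.println(drain(concatIgnoringStop(a.iterator(), new Limited<Integer>(b.iterator(), 3))));

        // Limit is honoured: prints [1, 2, 3, 3, 4, 5]
        System.out.println(drain(concat(a.iterator(), new Limited<Integer>(b.iterator(), 3))));
    }
}
```

The first call mirrors the situation described above, where the stop is discarded and everything from the second source comes through; the second call is what one would expect once the stop is respected.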
The patch implements the following changes:
* {{concat}} [now
allows|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13482-trunk#diff-57d0dfa95504bfd17d30539b3b338c0cL205]
iterators with different numbers of columns; the returned columns are the
union of the columns of the two iterators
* {{BaseIterator}} [would
now|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13482-trunk#diff-d14a4b314544d3720010343e330e7e3cR125]
take the {{stop}} from the iterator, which means that if the iterator was
stopped, it will not yield additional data even if it still has contents
* {{cacheIterator}} [is
now|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13482-trunk#diff-2e17efa5977a71330df6651d3bec0d12R424]
using a custom wrapped iterator that will not call {{stop}} on the wrapping
iterator; the previous problem with {{Unfiltered}} is now fixed
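
The column union mentioned in the first bullet can be pictured with a small sketch. This is only an illustration under simplifying assumptions: columns are modelled as plain strings in sorted sets, and {{mergeColumns}} is a hypothetical helper, not the method the patch uses on Cassandra's own column types.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model only: column names as strings instead of Cassandra's column metadata.
class ColumnUnionSketch {
    /** Returns the sorted union of the columns seen by the two iterators. */
    static SortedSet<String> mergeColumns(Set<String> left, Set<String> right) {
        SortedSet<String> merged = new TreeSet<>(left);
        merged.addAll(right);
        return merged;
    }

    public static void main(String[] args) {
        Set<String> fromCache = new TreeSet<>(Arrays.asList("v2"));
        Set<String> fromStorage = new TreeSet<>(Arrays.asList("v2", "v3"));

        // Different column sets: prints [v2, v3]
        System.out.println(mergeColumns(fromCache, fromStorage));

        // One side empty (e.g. an empty iterator): prints [v2, v3]
        System.out.println(mergeColumns(Collections.<String>emptySet(), fromStorage));
    }
}
```

The point is that neither side has to match the other exactly: an empty or narrower iterator contributes fewer columns, and the concatenated result exposes everything either side might return.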
|[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...ifesdjeen:13482-3.0]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13482-3.0-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13482-3.0-dtest/]|
|[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...ifesdjeen:13482-3.11]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13482-3.11-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13482-3.11-dtest/]|
|[trunk|https://github.com/apache/cassandra/compare/trunk...ifesdjeen:13482-trunk]|[testall|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13482-trunk-testall/]|[dtest|http://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-13482-trunk-dtest/]|
> NPE on non-existing row read when row cache is enabled
> ------------------------------------------------------
>
> Key: CASSANDRA-13482
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13482
> Project: Cassandra
> Issue Type: Bug
> Reporter: Alex Petrov
> Assignee: Alex Petrov
>
> The problem is reproducible on 3.0 with:
> {code}
> -# row_cache_class_name: org.apache.cassandra.cache.OHCProvider
> +row_cache_class_name: org.apache.cassandra.cache.OHCProvider
> -row_cache_size_in_mb: 0
> +row_cache_size_in_mb: 100
> {code}
> Table setup:
> {code}
> CREATE TABLE cache_tables (pk int, v1 int, v2 int, v3 int, primary key (pk, v1))
> WITH CACHING = { 'keys': 'ALL', 'rows_per_partition': '1' };
> {code}
> No data is required, only a head query (or any pk/ck query but with full
> partitions cached).
> {code}
> select * from cache_tables where pk = 10000 ;
> {code}
> {code}
> java.lang.AssertionError: null
>     at org.apache.cassandra.db.rows.UnfilteredRowIterators.concat(UnfilteredRowIterators.java:193) ~[main/:na]
>     at org.apache.cassandra.db.SinglePartitionReadCommand.getThroughCache(SinglePartitionReadCommand.java:461) ~[main/:na]
>     at org.apache.cassandra.db.SinglePartitionReadCommand.queryStorage(SinglePartitionReadCommand.java:358) ~[main/:na]
>     at org.apache.cassandra.db.ReadCommand.executeLocally(ReadCommand.java:395) ~[main/:na]
>     at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1794) ~[main/:na]
>     at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2472) ~[main/:na]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_121]
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:164) ~[main/:na]
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:136) [main/:na]
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:105) [main/:na]
>     at java.lang.Thread.run(Thread.java:745) [na:1.8.0_121]
> {code}