Re: Row cache functionality - Some confusion

2018-03-13 Thread Rahul Singh
It’s pretty clear to me that the only thing that gets put into the cache is 
the top N rows.

https://github.com/apache/cassandra/blob/0db88242c66d3a7193a9ad836f9a515b3ac7f9fa/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java#L523

It may fetch more, but it doesn’t cache it. It may get more if it’s not the full 
partition cache, but there’s no code that inserts into the CacheService except:

https://github.com/apache/cassandra/blob/0db88242c66d3a7193a9ad836f9a515b3ac7f9fa/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java#L528
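
To make the behavior described above concrete, here is a toy Python model. It is a sketch of my reading of the code, not Cassandra's implementation. It assumes a partition is just a list of rows in clustering order and that a "head" query is one that starts at the first row. The names ToyRowCache, offset, cached_heads and disk_reads are made up for illustration; real reads use clustering filters, not offsets.

```python
class ToyRowCache:
    """Toy model: only the first N rows of a partition are ever cached."""

    def __init__(self, rows_per_partition):
        self.rows_per_partition = rows_per_partition
        self.cached_heads = {}   # partition key -> first N rows of that partition
        self.disk_reads = 0      # counts reads that had to bypass the cache

    def read(self, table, key, offset=0, limit=None):
        is_head_query = offset == 0
        head = self.cached_heads.get(key)

        if head is not None and is_head_query:
            # Fewer cached rows than N means the whole partition fit in the cache.
            covers_partition = len(head) < self.rows_per_partition
            if covers_partition or (limit is not None and limit <= len(head)):
                return head[:limit]

        # Cache cannot answer the query: read the partition "from disk".
        self.disk_reads += 1
        rows = table[key]

        # Only a head query on an uncached partition populates the cache,
        # and it stores the first N rows regardless of the query's own limit.
        if head is None and is_head_query:
            self.cached_heads[key] = rows[: self.rows_per_partition]

        stop = None if limit is None else offset + limit
        return rows[offset:stop]


table = {"p": ["r0", "r1", "r2", "r3", "r4"]}
cache = ToyRowCache(rows_per_partition=2)
cache.read(table, "p", offset=4, limit=1)  # reads only the last row
cache.read(table, "p", limit=1)            # head query, populates the cache
```

In this model, reading only the last row of the partition leaves cached_heads empty, while the later head query stores the first two rows even though it asked for one.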



--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Mar 12, 2018, 8:56 AM -0400, Hannu Kröger wrote:
>
> > On 12 Mar 2018, at 14:45, Rahul Singh wrote:
> >
> > I may be wrong, but what I’ve read and used in the past assumes that the 
> > “first” N rows are cached and the clustering key design is how I change 
> > what N rows are put into memory. Looking at the code, it seems that’s the 
> > case.
>
> So we agree that the row cache is storing only N rows from the beginning of 
> the partition. So if only the last row in a partition is read, then it 
> probably doesn’t get cached assuming there are more than N rows in a 
> partition?
>
> > The language of the comment basically says that it holds in cache what 
> > satisfies the query if and only if it’s the head of the partition, if not 
> > it fetches it and saves it - I don’t interpret it differently from what I 
> > have seen in the documentation.
>
> Hmm, I’m trying to understand this. Does it mean that it stores the results 
> in cache if it is head and if not, it will fetch the head and store that 
> (instead of the results for the query)?
>
> Hannu


Re: Row cache functionality - Some confusion

2018-03-12 Thread Hannu Kröger

> On 12 Mar 2018, at 14:45, Rahul Singh wrote:
> 
> I may be wrong, but what I’ve read and used in the past assumes that the 
> “first” N rows are cached and the clustering key design is how I change what 
> N rows are put into memory. Looking at the code, it seems that’s the case. 

So we agree that the row cache is storing only N rows from the beginning of the 
partition. So if only the last row in a partition is read, then it probably 
doesn’t get cached assuming there are more than N rows in a partition?

> The language of the comment basically says that it holds in cache what 
> satisfies the query if and only if it’s the head of the partition, if not it 
> fetches it and saves it - I don’t interpret it differently from what I have 
> seen in the documentation. 

Hmm, I’m trying to understand this. Does it mean that it stores the results in 
cache if it is head and if not, it will fetch the head and store that (instead 
of the results for the query)?

Hannu

Re: Row cache functionality - Some confusion

2018-03-12 Thread Rahul Singh
I may be wrong, but what I’ve read and used in the past assumes that the 
“first” N rows are cached and the clustering key design is how I change what N 
rows are put into memory. Looking at the code, it seems that’s the case.

The language of the comment basically says that it holds in cache what 
satisfies the query if and only if it’s the head of the partition, if not it 
fetches it and saves it - I don’t interpret it differently from what I have seen 
in the documentation.



--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Mar 12, 2018, 7:13 AM -0400, Hannu Kröger wrote:
>
> rows_per_partition


Re: Row cache functionality - Some confusion

2018-03-12 Thread Hannu Kröger
Hi,

My goal is to make sure that I understand the functionality correctly and that 
the documentation is accurate.

The question, in other words: is the documentation or the comment in the code 
wrong (or inaccurate)?

Hannu

> On 12 Mar 2018, at 13:00, Rahul Singh wrote:
> 
> What’s the goal? How big are your partitions, in MB and in rows?
> 
> --
> Rahul Singh
> rahul.si...@anant.us
> 
> Anant Corporation
> 
> On Mar 12, 2018, 6:37 AM -0400, Hannu Kröger wrote:
>> Anyone?
>> 
>>> On 4 Mar 2018, at 20:45, Hannu Kröger wrote:
>>> 
>>> Hello,
>>> 
>>> I am trying to verify and understand fully the functionality of row cache 
>>> in Cassandra.
>>> 
>>> I have been using mainly two different sources for information:
>>> https://github.com/apache/cassandra/blob/0db88242c66d3a7193a9ad836f9a515b3ac7f9fa/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java#L476
>>> AND
>>> http://cassandra.apache.org/doc/latest/cql/ddl.html#caching-options
>>> 
>>> and based on what I read, the documentation is not correct.
>>> 
>>> The documentation says:
>>> “rows_per_partition: The amount of rows to cache per partition (“row 
>>> cache”). If an integer n is specified, the first n queried rows of a 
>>> partition will be cached. Other possible options are ALL, to cache all rows 
>>> of a queried partition, or NONE to disable row caching.”
>>> 
>>> The problematic part is “the first n queried rows of a partition will be 
>>> cached”. Shouldn’t it be that the first N rows in a partition will be 
>>> cached? Not first N that are queried?
>>> 
>>> If this is the case, I’m more than happy to create a ticket (and maybe even 
>>> create a patch) for the doc update.
>>> 
>>> BR,
>>> Hannu
>>> 
>> 



Re: Row cache functionality - Some confusion

2018-03-12 Thread Rahul Singh
What’s the goal? How big are your partitions, in MB and in rows?

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Mar 12, 2018, 6:37 AM -0400, Hannu Kröger wrote:
> Anyone?
>
> > On 4 Mar 2018, at 20:45, Hannu Kröger wrote:
> >
> > Hello,
> >
> > I am trying to verify and understand fully the functionality of row cache 
> > in Cassandra.
> >
> > I have been using mainly two different sources for information:
> > https://github.com/apache/cassandra/blob/0db88242c66d3a7193a9ad836f9a515b3ac7f9fa/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java#L476
> > AND
> > http://cassandra.apache.org/doc/latest/cql/ddl.html#caching-options
> >
> > and based on what I read, the documentation is not correct.
> >
> > The documentation says:
> > “rows_per_partition: The amount of rows to cache per partition (“row 
> > cache”). If an integer n is specified, the first n queried rows of a 
> > partition will be cached. Other possible options are ALL, to cache all rows 
> > of a queried partition, or NONE to disable row caching.”
> >
> > The problematic part is “the first n queried rows of a partition will be 
> > cached”. Shouldn’t it be that the first N rows in a partition will be 
> > cached? Not first N that are queried?
> >
> > If this is the case, I’m more than happy to create a ticket (and maybe even 
> > create a patch) for the doc update.
> >
> > BR,
> > Hannu
> >
>


Re: Row cache functionality - Some confusion

2018-03-12 Thread Hannu Kröger
Anyone?

> On 4 Mar 2018, at 20:45, Hannu Kröger wrote:
> 
> Hello,
> 
> I am trying to verify and understand fully the functionality of row cache in 
> Cassandra.
> 
> I have been using mainly two different sources for information:
> https://github.com/apache/cassandra/blob/0db88242c66d3a7193a9ad836f9a515b3ac7f9fa/src/java/org/apache/cassandra/db/SinglePartitionReadCommand.java#L476
> AND
> http://cassandra.apache.org/doc/latest/cql/ddl.html#caching-options
> 
> and based on what I read, the documentation is not correct.
> 
> The documentation says:
> “rows_per_partition: The amount of rows to cache per partition (“row cache”). 
> If an integer n is specified, the first n queried rows of a partition will be 
> cached. Other possible options are ALL, to cache all rows of a queried 
> partition, or NONE to disable row caching.”
> 
> The problematic part is “the first n queried rows of a partition will be 
> cached”. Shouldn’t it be that the first N rows in a partition will be cached? 
> Not first N that are queried?
> 
> If this is the case, I’m more than happy to create a ticket (and maybe even 
> create a patch) for the doc update.
> 
> BR,
> Hannu
>