[
https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15141660#comment-15141660
]
Paulo Motta edited comment on CASSANDRA-10990 at 2/10/16 9:33 PM:
------------------------------------------------------------------
Thanks for the comments [~yukim].
bq. What's the difference between MemoryCachedInputStream and
BufferedInputStream?
The main difference between {{MemoryCachedInputStream}} and
{{BufferedInputStream}} is that the former has the ability to mark/reset a
parent/source stream when it runs out of capacity without losing its mark
state, allowing us to cascade a {{FileCachedInputStream}} with a
{{MemoryCachedInputStream}} to provide a multi-tiered cached input stream.
Another, less relevant difference is that {{BufferedInputStream}} always does
buffered reads of up to the capacity of its buffer, while
{{MemoryCachedInputStream}} only buffers reads while it is marked, and only the
amount that was consumed via its {{read}}/{{skip}} methods.
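To make the cascading behaviour concrete, here is a minimal sketch of the idea. It is not the code from the patch: {{SpillingMarkInputStream}} is a hypothetical name, a plain {{BufferedInputStream}} stands in for {{FileCachedInputStream}}, the {{readLimit}} argument is ignored, and re-marking in the middle of a replay is not supported.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: caches consumed bytes while marked, then spills to the parent's mark. */
class SpillingMarkInputStream extends FilterInputStream {
    private final byte[] cache;           // bytes consumed since mark(), up to capacity
    private int cached = 0;               // number of valid bytes in cache
    private int replayPos = 0;            // next cached byte to replay after reset()
    private boolean marked = false;
    private boolean parentMarked = false; // true once caching was delegated to the parent

    SpillingMarkInputStream(InputStream parent, int capacity) {
        super(parent);
        this.cache = new byte[capacity];
    }

    @Override public boolean markSupported() { return true; }

    @Override public synchronized void mark(int readLimit) {
        // Simplification of this sketch: readLimit is ignored, no re-mark mid-replay.
        if (replayPos < cached) throw new IllegalStateException("cannot re-mark during replay");
        marked = true;
        cached = 0;
        replayPos = 0;
        parentMarked = false;
    }

    @Override public synchronized void reset() throws IOException {
        if (!marked) throw new IOException("stream not marked");
        if (parentMarked) in.reset();     // rewind the parent to where our cache ended
        replayPos = 0;                    // replay our own cache first
    }

    @Override public synchronized int read() throws IOException {
        if (replayPos < cached) return cache[replayPos++] & 0xFF;
        if (marked && !parentMarked && cached == cache.length) {
            in.mark(Integer.MAX_VALUE);   // out of capacity: let the parent cache from here on
            parentMarked = true;
        }
        int b = in.read();
        if (b >= 0 && marked && !parentMarked) {
            cache[cached++] = (byte) b;   // only bytes actually consumed are cached
            replayPos = cached;
        }
        return b;
    }

    @Override public synchronized int read(byte[] b, int off, int len) throws IOException {
        if (len == 0) return 0;
        int n = 0;
        for (; n < len; n++) {
            int c = read();
            if (c < 0) break;
            b[off + n] = (byte) c;
        }
        return n == 0 ? -1 : n;
    }

    @Override public synchronized long skip(long n) throws IOException {
        long skipped = 0;
        while (skipped < n && read() >= 0) skipped++; // skipped bytes count as consumed
        return skipped;
    }

    private static String readAscii(InputStream in, int n) throws IOException {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            int c = in.read();
            if (c < 0) break;
            sb.append((char) c);
        }
        return sb.toString();
    }

    /** Marks, reads past the 3-byte capacity, resets, and reads everything again. */
    static String demo() {
        try {
            byte[] data = "abcdefgh".getBytes(StandardCharsets.US_ASCII);
            InputStream parent = new BufferedInputStream(new ByteArrayInputStream(data));
            SpillingMarkInputStream in = new SpillingMarkInputStream(parent, 3);
            in.mark(0);
            String first = readAscii(in, 6);
            in.reset();
            String second = readAscii(in, 8);
            return first + "|" + second;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "abcdef|abcdefgh"
    }
}
```

In the demo the first three bytes are replayed from the in-memory cache and the rest from the parent's own mark, which is the two-tier behaviour described above.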
bq. Why can't we use the latter?
I tried extending {{BufferedInputStream}} to add the ability to mark a parent
stream when it runs out of capacity, but that involved reimplementing and/or
changing most of its methods since {{BufferedInputStream}} always reads from
its internal buffer and re-fills it when necessary and most of its methods rely
on that logic. Reading from a parent stream when the buffer is full would
break this assumption, which would require a significant refactor of most of
its methods. I'm open to suggestions if you see a way of easily adapting
{{BufferedInputStream}} to fulfil that requirement.
bq. {{MemoryCachedInputStream}} uses default {{ByteArrayOutputStream}}
constructor which has only size of 32 bytes. Isn't this too small to use for
cache?
Probably, I will try to find a better value for this. Do you happen to remember
whether there is a way to retrieve the average partition size for a given
table? I remember seeing something along those lines, but I'm not sure where it is.
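As a rough sketch of what that change could look like: {{ByteArrayOutputStream}} also has a constructor that takes an initial size, so the cache buffer could be pre-sized from a hint. The names below and the 64 KiB cap are assumptions for illustration only; where the hint would come from (for example an average partition size) is still the open question above.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical helper; names and the 64 KiB cap are illustrative assumptions. */
class CacheBufferSizing {
    static final int JDK_DEFAULT_CAPACITY = 32;       // what new ByteArrayOutputStream() uses
    static final int ASSUMED_MAX_INITIAL = 64 * 1024; // arbitrary cap chosen for this sketch

    /** Clamps a size hint into [32, 64 KiB] so a bad hint cannot over-allocate. */
    static int initialCapacityFor(long expectedBytes) {
        return (int) Math.max(JDK_DEFAULT_CAPACITY, Math.min(expectedBytes, ASSUMED_MAX_INITIAL));
    }

    /** Creates the in-memory cache buffer pre-sized from the hint. */
    static ByteArrayOutputStream newCacheBuffer(long expectedBytes) {
        return new ByteArrayOutputStream(initialCapacityFor(expectedBytes));
    }
}
```

The buffer still grows on demand, so the initial size only affects how many reallocations happen while the cache fills.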
I will start work on the remaining TODO points and review comments. Please let
me know if you have something to add.
was (Author: pauloricardomg):
Thanks for the comments.
bq. What's the difference between MemoryCachedInputStream and
BufferedInputStream? Why can't we use the latter?
The main difference between {{MemoryCachedInputStream}} and
{{BufferedInputStream}} is that the former has the ability to mark/reset a
parent/source stream when it runs out of capacity without losing its mark
state, allowing us to cascade a {{FileCachedInputStream}} with a
{{MemoryCachedInputStream}} to provide a multi-tiered cached input stream.
> Support streaming of older version sstables in 3.0
> --------------------------------------------------
>
> Key: CASSANDRA-10990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10990
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Reporter: Jeremy Hanna
> Assignee: Paulo Motta
>
> In 2.0 we introduced support for streaming older versioned sstables
> (CASSANDRA-5772). In 3.0, because of the rewrite of the storage layer, this
> became no longer supported. So currently, while 3.0 can read sstables in the
> 2.1/2.2 format, it cannot stream the older versioned sstables. We should do
> some work to make this still possible to be consistent with what
> CASSANDRA-5772 provided.