[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15196492#comment-15196492 ]

Paulo Motta commented on CASSANDRA-10990:
-----------------------------------------

dtests look good. [dtest PR|https://github.com/riptano/cassandra-dtest/pull/858] submitted. Marking as ready to commit.

> Support streaming of older version sstables in 3.0
> --------------------------------------------------
>
> Key: CASSANDRA-10990
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10990
> Project: Cassandra
> Issue Type: Bug
> Components: Streaming and Messaging
> Reporter: Jeremy Hanna
> Assignee: Paulo Motta
>
> In 2.0 we introduced support for streaming older versioned sstables
> (CASSANDRA-5772). In 3.0, because of the rewrite of the storage layer, this
> became no longer supported. So currently, while 3.0 can read sstables in the
> 2.1/2.2 format, it cannot stream the older versioned sstables. We should do
> some work to make this still possible to be consistent with what
> CASSANDRA-5772 provided.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15195544#comment-15195544 ]

Yuki Morishita commented on CASSANDRA-10990:
--------------------------------------------

+1. Thanks for your work. I will commit soon.

(Can you change the status to Patch Available so that it can then be moved to Ready to Commit?)
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15188030#comment-15188030 ]

Paulo Motta commented on CASSANDRA-10990:
-----------------------------------------

Updated patch and submitted tests.

||3.0||3.5||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:3.0-10990]|[branch|https://github.com/apache/cassandra/compare/cassandra-3.5...pauloricardomg:3.5-10990]|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:trunk-10990]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:10990]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-10990-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.5-10990-testall/lastCompletedBuild/testReport/]|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-10990-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-10990-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.5-10990-dtest/lastCompletedBuild/testReport/]|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-trunk-10990-dtest/lastCompletedBuild/testReport/]|

Note: the dtests are configured to use the [updated dtest branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:10990].

Commit info: conflict on 3.5, merges cleanly to trunk.
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15187223#comment-15187223 ]

Yuki Morishita commented on CASSANDRA-10990:
--------------------------------------------

I think the patch looks good to me, with the following small changes:

* getTempBufferFile should throw IOException instead of RuntimeException when the buffer file cannot be created, so the stream won't be retried.
* RewindableDISP.reset should throw IOException instead of IllegalStateException so that it fits the InputStream interface.

Can you also create a patch for the 3.0 branch and run the dtests?
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15181464#comment-15181464 ]

Paulo Motta commented on CASSANDRA-10990:
-----------------------------------------

Updated the branch with a reworked {{RewindableDataInputStreamPlus}}, which now treats the spill file as a circular buffer with a max size. The space required for the spill buffer file of the legacy {{StreamDeserializer}} is now {{max(sstableSize, MAX_INT)}}, capping the max required space for streaming older version sstables at ~2GB. I also updated the documentation and tests. WDYT [~yukim]?

Besides that, I added a new [dtest|https://github.com/pauloricardomg/cassandra-dtest/commit/b59ea329a5b9302372c57956f7015111fd47a4ae] for testing repair of older version sstables on 3.0+, and it seems to work. Now we have the following dtests:

* [bootstrap dtests|https://github.com/pauloricardomg/cassandra-dtest/blob/10990/upgrade_8099_test.py#L340]
* [sstable loader dtests|https://github.com/pauloricardomg/cassandra-dtest/blob/10990/sstable_generation_loading_test.py#L181]
* [repair dtests|https://github.com/pauloricardomg/cassandra-dtest/commit/b59ea329a5b9302372c57956f7015111fd47a4ae]

For simplicity, I opted to write the old streamed sstables in the new format without {{EncodingStats}} (since there is no {{SerializationHeader}} available), which, from what I understood, may make these sstables less optimized in terms of storage space. Do you think we should construct these stats when receiving the sstable, or not bother, [~slebresne]? Also, do you recall any other edge cases we should watch for here? Thanks!
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15172237#comment-15172237 ]

Yuki Morishita commented on CASSANDRA-10990:
--------------------------------------------

The current check requires free disk space of 2 * the size of the SSTable being transferred. For example, if we need to receive a 100GB SSTable file at bootstrap, we need 200GB of free space beforehand on the bootstrapping node, even though we may only write 2 MB of buffer file. I think {{totalSize + Integer.MAX_VALUE}} can do the job here.

We want to do

bq. An alternative approach is to overwrite the spill file for each partition, so we need disk space proportional to the max partition size with the cap.

bq. but this will probably cause write amplification.

We don't need to overwrite for every partition; we can apply the cap here too. The implementation may become a bit complicated though. WDYT?

bq. I set the initial capacity to 32KB, do you think this is sufficient or do you prefer 128KB ?

32KB is fine; users can change it through the system property if they really need to.

BTW, you can use {{Integer.getInteger}} instead of {{Integer.parseInt(System.getProperty)}}.
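The two concrete suggestions in this comment can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class name, the property name, and the helper methods are hypothetical and are not taken from the patch. Only {{Integer.getInteger}} and the {{totalSize + Integer.MAX_VALUE}} expression come from the comment itself.

```java
// Sketch only: SpillConfig, CAPACITY_PROPERTY and both helpers are hypothetical names.
class SpillConfig {
    static final String CAPACITY_PROPERTY = "example.streaming.spill_initial_capacity";

    // Integer.getInteger looks up a system property and parses it in one call,
    // returning the default when the property is unset or not a valid integer.
    static int initialCapacity() {
        return Integer.getInteger(CAPACITY_PROPERTY, 32 * 1024);
    }

    // Free space to look for before streaming: the sstable itself plus at most
    // Integer.MAX_VALUE bytes of spill file, rather than twice the sstable size.
    static long requiredFreeSpace(long sstableSize) {
        return sstableSize + Integer.MAX_VALUE;
    }
}
```

With this shape, a 100GB sstable would need roughly 102GB of free space instead of 200GB.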
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15171565#comment-15171565 ]

xiaodong wang commented on CASSANDRA-10990:
-------------------------------------------

Thanks for the prompt response, Yuki. I did try to upgrade the SSTable and it failed. I then figured out that the SSTable has to be under ///. After the upgrade, a few tables could be loaded. I will post an update if I hit further issues.
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170295#comment-15170295 ]

Paulo Motta commented on CASSANDRA-10990:
-----------------------------------------

Thanks for the review, [~yukim]! Follow-up below:

bq. StreamDeserializer picks a tmp directory on a disk that can hold 2 * the size of the streaming SSTable, but isn't that too much? An SSTable can be a few hundred GB. You need to cap that.

With the current implementation, each partition larger than 1MB is spilled to disk so it can be rewound later. In the worst case, if all partitions are >> 1MB, the spill file will be proportional to the sstable size. This is mostly a safety measure to check that the disk currently has space to hold the actual sstable plus, in the worst case, all partitions spilled to disk. Of course, even with this check, other compactions and streams in progress mean nothing is guaranteed: the disk can occasionally fill up and chaos will follow. An alternative approach is to overwrite the spill file for each partition, so we need disk space proportional to the max partition size, but this will probably cause write amplification. In any case, I was previously not checking whether the spill file returned by {{cfs.getDirectories().getTemporaryWriteableDirectoryAsFile}} was null (meaning there is no space), so I added that check and now throw an exception.

bq. Speaking of capping the size of tmp buffer file, we may want to respect the parameter of InputStream#mark(int readLimit). We just set read limit to 2.147GB and make it the limitation of this functionality. Users can always upgradesstable if needed.

I don't see why we should cap this if there is enough disk space, which is usually the case when bootstrapping/sstableloading. While I'm happy to add the cap if you feel strongly about it, I think this will render the functionality useless for dense nodes with {{STCS}}, which could still be served as long as there is sufficient disk space. Furthermore, I think in most cases partitions will be <= 1MB, so we will only rarely spill to disk (and the disk space check should be sufficient protection for the other cases).

bq. RewindableInputStream only deletes buffer file at close, but it is not closed when stream finishes. You cannot close with RewindableInputStream#close because that will cause underlying socket to be closed.

Nice catch, thanks! I added a {{cleanup}} method to {{StreamDeserializer}} that performs the cleanup in the {{finally}} clause of {{StreamReader.read}} and {{CompressedStreamReader.read}} (unfortunately I couldn't reuse the {{close}} method of {{StreamDeserializer}} because it's called between partitions by the try-with-resources clause of {{BigTableWriter.append}}). I also updated the [dtests|https://github.com/pauloricardomg/cassandra-dtest/tree/10990] to check that temp spill files are cleaned up correctly.

bq. RewindableInputStream - I don't think we need SyncUtil.sync on buffer file.

Removed, thanks!

bq. RewindableDataInputStreamPlus - bytesPastMark returns underlying stream's available(), but I don't think that is suitable there. available (by contract) just returns estimated bytes to be read without blocking. You should return pos from RewindableInputStream. That said, I think it is better to integrate RewindableDataInputStreamPlus and RewindableInputStream.

Fixed {{bytesPastMark}} and integrated {{RewindableInputStream}} into {{RewindableDataInputStreamPlus}}.

bq. Initial capacity seems to be set as 128 bytes, not 128 Kbytes. Also I prefer passing this value to constructor as well so you don't have to deal with System.setProperty in unit test.

I set the initial capacity to 32KB; do you think this is sufficient or do you prefer 128KB? I added the initial capacity to the constructor so we can use it in unit tests, but left the system properties in place so we can set them to a lower value in the [dtests|https://github.com/pauloricardomg/cassandra-dtest/tree/10990].

Updated the above branch with the review follow-up and resubmitted tests.
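The cleanup arrangement described in this follow-up (delete the spill file in a {{finally}} clause, without closing the underlying stream) can be sketched roughly as below. This is an assumption-laden illustration: {{SpillingReader}} and its methods are hypothetical stand-ins, not the actual {{StreamDeserializer}} or {{StreamReader}} code.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-in for a deserializer that owns a temporary spill file.
class SpillingReader {
    private final InputStream source; // e.g. a socket stream owned by the caller
    private final File spillFile;

    SpillingReader(InputStream source, File spillFile) {
        this.source = source;
        this.spillFile = spillFile;
    }

    // Consumes the stream and returns the number of bytes read.
    long readAll() throws IOException {
        long total = 0;
        while (source.read() >= 0) total++;
        return total;
    }

    // Deletes only the spill file; deliberately does not close the source.
    void cleanup() {
        spillFile.delete();
    }

    // Cleanup runs in finally, so the spill file is removed on success or failure.
    static long read(InputStream source, File spillFile) throws IOException {
        SpillingReader reader = new SpillingReader(source, spillFile);
        try {
            return reader.readAll();
        } finally {
            reader.cleanup();
        }
    }
}
```

Keeping {{cleanup}} separate from {{close}} is what lets the caller keep using the source stream (the socket, in the streaming case) after one file has been received.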
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167951#comment-15167951 ]

Yuki Morishita commented on CASSANDRA-10990:
--------------------------------------------

Thanks for the update. Here are my review comments.

* StreamDeserializer picks a tmp directory on a disk that can hold 2 * the size of the streaming SSTable, but isn't that too much? An SSTable can be a few hundred GB. You need to cap that.
* Speaking of capping the size of the tmp buffer file, we may want to respect the parameter of {{InputStream#mark(int readLimit)}}. We just set the read limit to 2.147GB and make it the limitation of this functionality. Users can always upgradesstable if needed.
* RewindableInputStream - I don't think we need {{SyncUtil.sync}} on the buffer file.
* RewindableInputStream only deletes the buffer file at close, but it is not closed when the stream finishes. You cannot close with {{RewindableInputStream#close}} because that will cause the underlying socket to be closed.
* RewindableDataInputStreamPlus - {{bytesPastMark}} returns the underlying stream's {{available()}}, but I don't think that is suitable there. {{available}} (by contract) just returns the estimated number of bytes that can be read without blocking. You should return {{pos}} from {{RewindableInputStream}}. That said, I think it is better to integrate {{RewindableDataInputStreamPlus}} and {{RewindableInputStream}}.
* Initial capacity seems to be set to 128 bytes, not 128 Kbytes. Also, I prefer passing this value to the constructor as well, so you don't have to deal with {{System.setProperty}} in unit tests.
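To illustrate the {{bytesPastMark}} point: {{available()}} reports how many bytes can be read without blocking, which says nothing about how far the reader has moved since {{mark}}. A position counter does. The wrapper below is a minimal sketch with a hypothetical class name; it is not the patch's {{RewindableDataInputStreamPlus}}.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper that tracks the read position relative to the last mark.
class MarkCountingStream extends FilterInputStream {
    private long pos = 0;      // bytes consumed from the wrapped stream so far
    private long markPos = -1; // position at the last mark(), or -1 if none

    MarkCountingStream(InputStream in) {
        super(in);
    }

    @Override public int read() throws IOException {
        int b = super.read();
        if (b >= 0) pos++;
        return b;
    }

    // read(byte[]) in FilterInputStream delegates here, so bytes are counted once.
    @Override public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) pos += n;
        return n;
    }

    @Override public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        pos += skipped;
        return skipped;
    }

    @Override public synchronized void mark(int readLimit) {
        super.mark(readLimit);
        markPos = pos;
    }

    @Override public synchronized void reset() throws IOException {
        if (markPos < 0) throw new IOException("mark not set");
        super.reset();
        pos = markPos;
    }

    // Bytes read since the last mark, derived from positions rather than available().
    long bytesPastMark() {
        return markPos < 0 ? 0 : pos - markPos;
    }
}
```

Over an in-memory source, {{available()}} would report the bytes remaining in the whole stream, while {{bytesPastMark()}} reports only what was consumed after the mark.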
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15165679#comment-15165679 ]

Paulo Motta commented on CASSANDRA-10990:
-----------------------------------------

Thanks for the review and comments [~yukim]! Follow-up below.

bq. Isn't it simpler to just have CachedInputStream that first fills on heap buffer and writes out to file as buffer gets full?

+1, merged {{MemoryCachedInputStream}} and {{FileCachedInputStream}} into a single class, {{RewindableInputStream}}, and also made some simplifications to cover only the use case we need. Previously I was trying to make it too general, which is why it became unnecessarily complex. Also updated the tests to reflect the changes.

bq. I think we can just set to be more sane value (few hundreds kilobytes?), the use case here is for static columns only.

Made the initial capacity 128KB and the max memory capacity for storing read partitions 1MB. After that, read bytes are spilled to disk. Also removed {{ByteArrayOutputStream}} and {{ByteArrayInputStream}}; it now works with a growing {{byte[]}} buffer.

Besides that, I addressed the github comments, finished the previous TODO list and fixed all tests. Patch and tests are available below.

||trunk||dtest||
|[branch|https://github.com/apache/cassandra/compare/trunk...pauloricardomg:10990]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:10990]|
|[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10990-testall/lastCompletedBuild/testReport/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10990-dtest/lastCompletedBuild/testReport/]|

I will provide the 3.0 patch after review, as well as update CHANGES, etc.

[~philipthompson] if I used the same C* and dtests branches as before, do you need to update the cassci job or can I just resubmit?
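The idea discussed in this comment (record bytes after {{mark}} in a growing on-heap {{byte[]}}, spill to a file once a memory cap is reached, and replay both on {{reset}}) can be sketched as follows. This is a simplified illustration and not the patch's {{RewindableInputStream}}: the class name is hypothetical, the read limit is ignored, only single-byte reads are implemented, and calling {{mark}} in the middle of a replay is not supported.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Illustrative only; hypothetical class, not the implementation from the patch.
class SpillingRewindableStream extends InputStream {
    private final InputStream source;
    private final File spillFile;
    private final int maxMemory;
    private byte[] memory;
    private int memoryCount;        // recorded bytes held on heap
    private long spilled;           // recorded bytes written to spillFile
    private OutputStream spillOut;  // open while recording to disk
    private InputStream spillIn;    // open while replaying from disk
    private boolean recording;
    private long replayPos = -1;    // -1 when not replaying

    SpillingRewindableStream(InputStream source, int initialCapacity, int maxMemory, File spillFile) {
        this.source = source;
        this.memory = new byte[initialCapacity];
        this.maxMemory = maxMemory;
        this.spillFile = spillFile;
    }

    @Override public boolean markSupported() { return true; }

    // Sketch limitations: readLimit is ignored; do not call mark() mid-replay.
    @Override public synchronized void mark(int readLimit) {
        try { closeSpill(); } catch (IOException e) { throw new UncheckedIOException(e); }
        memoryCount = 0; spilled = 0; replayPos = -1; recording = true;
    }

    @Override public synchronized void reset() throws IOException {
        if (!recording) throw new IOException("mark not set");
        closeSpill();               // flushes any spilled bytes before replay
        recording = false;
        replayPos = 0;
    }

    @Override public int read() throws IOException {
        if (replayPos >= 0) {
            if (replayPos < memoryCount) return memory[(int) replayPos++] & 0xFF;
            if (replayPos < memoryCount + spilled) {
                if (spillIn == null) spillIn = new BufferedInputStream(new FileInputStream(spillFile));
                replayPos++;
                return spillIn.read();
            }
            replayPos = -1;         // replay exhausted: continue from the source
            closeSpill();
        }
        int b = source.read();
        if (b >= 0 && recording) record(b);
        return b;
    }

    // Stores one byte on heap while under the cap, otherwise appends it to the spill file.
    private void record(int b) throws IOException {
        if (memoryCount < maxMemory) {
            if (memoryCount == memory.length)
                memory = Arrays.copyOf(memory, Math.min(maxMemory, Math.max(1, memory.length * 2)));
            memory[memoryCount++] = (byte) b;
        } else {
            if (spillOut == null) spillOut = new BufferedOutputStream(new FileOutputStream(spillFile));
            spillOut.write(b);
            spilled++;
        }
    }

    private void closeSpill() throws IOException {
        if (spillOut != null) { spillOut.close(); spillOut = null; }
        if (spillIn != null) { spillIn.close(); spillIn = null; }
    }

    // Removes the spill file but leaves the source stream open.
    void cleanup() throws IOException {
        closeSpill();
        spillFile.delete();
    }
}
```

A single class keeps the memory-to-disk handoff in one place, which is the simplification the comment argues for over cascading two separate cached streams.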
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159424#comment-15159424 ]

Yuki Morishita commented on CASSANDRA-10990:
--------------------------------------------

Did you try running sstableupgrade from 3.3 against 2.1 SSTables?
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15159412#comment-15159412 ]

xiaodong wang commented on CASSANDRA-10990:
-------------------------------------------

Hi,

I learnt that this issue is being actively worked on. Could you please share any updates, or an ETA for the release containing the fix? I am currently running into the same issue and am blocked when trying to migrate data from C* 2.1.x to C* 3.3.

I tried a few approaches:

* Running "sstableupgrade" and "sstableloader" on the 2.1.x snapshots won't work because of this issue;
* Due to CASSANDRA-8110, running "rebuild" on a 3.3 DC within the same 2.1.x cluster won't work;
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15154677#comment-15154677 ]

Yuki Morishita commented on CASSANDRA-10990:
--------------------------------------------

One more suggestion. Isn't it simpler to just have a CachedInputStream that first fills an on-heap buffer and writes out to a file as the buffer gets full?

bq. Do you easily remember if there is a way to retrieve the average partition size for a given table? I remember seeing something along those lines but I'm not sure where it is.

I think we can just set it to a saner value (a few hundred kilobytes?); the use case here is for static columns only.
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15141660#comment-15141660 ]

Paulo Motta commented on CASSANDRA-10990:
-----------------------------------------

Thanks for the comments.

bq. What's the difference between MemoryCachedInputStream and BufferedInputStream? Why can't we use the latter?

The main difference between {{MemoryCachedInputStream}} and {{BufferedInputStream}} is that the former has the ability to mark/reset a parent/source stream when it runs out of capacity without losing its mark state, allowing us to cascade a {{FileCachedInputStream}} with a {{MemoryCachedInputStream}} to provide a multi-tiered cached input stream.
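For context on the {{BufferedInputStream}} side of this comparison, the standard JDK class drops its mark once the reader has consumed more than both the read limit and the internal buffer can cover, after which {{reset()}} throws an {{IOException}}. The small demo below shows only that JDK behavior; the class and method names are hypothetical and it says nothing about how the patch's classes are implemented.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;

// Hypothetical demo class exercising java.io.BufferedInputStream mark/reset.
class BufferedMarkDemo {
    // Returns true if reset() still works after reading toRead bytes past a mark
    // set with the given read limit, false if the mark has been invalidated.
    static boolean canResetAfter(int bufferSize, int readLimit, int toRead) throws IOException {
        BufferedInputStream in =
            new BufferedInputStream(new ByteArrayInputStream(new byte[1024]), bufferSize);
        in.mark(readLimit);
        for (int i = 0; i < toRead; i++) in.read();
        try {
            in.reset();
            return true;
        } catch (IOException invalidMark) {
            return false;
        }
    }
}
```

With an 8-byte buffer and a read limit of 4, reading 4 bytes and resetting works, while reading 100 bytes loses the mark. A stream that has to rewind a whole partition of unknown size therefore needs somewhere else to keep the bytes, which is the role the cascaded cache plays here.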
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15134400#comment-15134400 ]

Philip Thompson commented on CASSANDRA-10990:
---------------------------------------------

Running at http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-3.0-10990-dtest/3/
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15134391#comment-15134391 ] Paulo Motta commented on CASSANDRA-10990: - Initial version is ready for review. Feedback on approach and correctness will be greatly appreciated. *Patch Overview* The patch adds support for streaming pre-3.0 sstables and a comprehensive test suite around it. Adding support to non-static-compact tables was simple, basically wokaround the lack of serialization header by using a header with no stats and deserialize clustering prefix with old format deserializer while serializing in new format. The main challenge was to provide support to streaming compact static tables, because in the new format the static columns must be the first columns in a partition while in the previous format they can be in any position of the partition. This means that each partition must be traversed to search for static columns and then rewinded to search for remaining non-static columns. In order to solve this I added a new {{CachedInputStream}} that adds mark/reset functionality to a source stream and allows to cooperatively cascade multiple {{CachedInputStream}} with different capacities to create an input stream cache hierarchy. For instance, I used this feature on {{StreamDeserializer}} for pre-3.0 sstables that uses a {{MemoryCachedInputStream}} that falls back to a {{FileCachedInputStream}} when it runs out of capacity in memory. The {{FileCachedInputStream}} may write a temporary buffer file to a data directory and remove it once the file is successfully streamed or if it fails. This approach allow us to use the {{OldFormatDeserializer}} transparently, and the same code path for reading pre-3.0 sstables is used to stream pre-3.0 sstables. Note that the {{CachedInputStream}} is only used to stream pre-3.0 sstables in order to provide rewind functionality and will not affect existing behavior. 
Please note that performance was not the objective here, but mostly support streaming functionality of pre-3.0 sstables. Compact static tables may suffer a slight performance hit due to buffer copying and rewinding, but non-compact static tables will not have performance affected since the stream cache will not be used. *Tests* * *Unit tests*: Extended {{LegacySStableTest}} to test streaming of legacy compact sstables since jb version. ** Add comprehensive test suite for different {{CachedInputStream}} variants on {{RewindableDataInputStreamPlusTest}} * *SStable loader dtests*: Extended {{sstable_generation_loading_test}} to sstableload 2.1 (ka) sstables with different compression settings. * *Upgrade dtests*: Extended CASSANDRA-10563 upgrade dtests to bootstrap soon after upgrading, to test bootstrap streaming of legacy sstables. *TODO* * Cleanup of leftover buffer files on startup. * Improve documentation of {{CachedInputStream}}, {{MemoryCachedInputStream}} and {{FileCachedInputStream}} * Make max memory buffer size a system property and change it on dtests * {{LegacySSTableTest}} passes when executed individually but fails when executed on a suite, probably some leftovers from previous test that need to be cleaned up. * Add la sstables to {{sstable_generation_loading_test}} * Fix {{upgrade_8099_test.py:TestBootstrapAfterUpgrade.upgrade_with_wide_partition_test}} ||3.0||dtest|| |[branch|https://github.com/apache/cassandra/compare/cassandra-3.0...pauloricardomg:10990]|[branch|https://github.com/riptano/cassandra-dtest/compare/master...pauloricardomg:10990]| |[testall|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10990-testall/lastCompletedBuild/testReport/]| |[dtest|http://cassci.datastax.com/view/Dev/view/paulomotta/job/pauloricardomg-10990-dtest/lastCompletedBuild/testReport/]| [~philipthompson] when you have time, could you please setup a custom dtest run with the dtest branch above? Thanks! 
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15135308#comment-15135308 ]

Yuki Morishita commented on CASSANDRA-10990:

[~pauloricardomg] Overall, I like your approach. I'm still looking through your code and have left some comments on GitHub. I have one question for now.

What's the difference between {{MemoryCachedInputStream}} and {{BufferedInputStream}}? Why can't we use the latter? {{MemoryCachedInputStream}} uses the default {{ByteArrayOutputStream}} constructor, which has an initial capacity of only 32 bytes. Isn't this too small to use for a cache?
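
As background for this question, the two JDK behaviors it refers to can be shown with a small standalone sketch. It uses only standard {{java.io}} classes; the class name {{MarkResetDemo}} and the chosen sizes are made up for illustration, and it says nothing about how the patch itself behaves. {{BufferedInputStream}} keeps everything read since {{mark(readlimit)}} on the heap and is allowed to drop the mark once more than {{readlimit}} bytes have been read, while {{ByteArrayOutputStream}} can be pre-sized through its {{int}} constructor instead of starting at 32 bytes.

```java
import java.io.*;

final class MarkResetDemo {
    /** Marks, reads toRead bytes, then reports whether reset() still works. */
    static boolean canRewind(int bufferSize, int readLimit, int toRead) {
        InputStream source = new ByteArrayInputStream(new byte[4096]);
        try (BufferedInputStream in = new BufferedInputStream(source, bufferSize)) {
            in.mark(readLimit);
            for (int i = 0; i < toRead; i++) in.read();
            in.reset();          // throws IOException if the mark was invalidated
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    /** The initial capacity only affects how often the backing array is regrown. */
    static int fill(ByteArrayOutputStream out, int bytes) {
        for (int i = 0; i < bytes; i++) out.write(i);
        return out.size();
    }

    public static void main(String[] args) {
        // Read limit larger than what is read: the buffer grows and reset() succeeds.
        System.out.println(canRewind(16, 1000, 500));
        // Read limit smaller than what is read: the mark is dropped on buffer refill.
        System.out.println(canRewind(16, 8, 500));
        // Default constructor (32-byte initial capacity) versus a pre-sized buffer.
        System.out.println(fill(new ByteArrayOutputStream(), 100_000));
        System.out.println(fill(new ByteArrayOutputStream(64 * 1024), 100_000));
    }
}
```

Both {{fill}} calls end up holding the same number of bytes; the pre-sized buffer simply avoids the repeated array copies while growing.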
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15112830#comment-15112830 ]

Paulo Motta commented on CASSANDRA-10990:

Many thanks for the detailed explanation and clarification [~slebresne]!

I believe the purpose of this ticket is more to allow streaming of old sstables to 3.0 vnodes via {{sstableloader}}, and to support bootstrap and move on 3.0 nodes that did not complete {{upgradesstables}} (dense nodes), than to actually provide full 2.x-3.0 streaming compatibility (we can leave that to CASSANDRA-8110). There seem to be additional complications with repair due to the different digest formats, so we can address that in a separate ticket if necessary.

I was able to achieve streaming of pre-3.0 sstables fairly transparently with the addition of a {{RewindableDataInputStreamPlus}} wrapper input stream that allows rewinding a source stream, so we can leverage exactly the same code path used for reading pre-3.0 sstables ({{OldFormatIterator}}) to do the static compact table handling. Currently, the {{RewindableDataInputStreamPlus}} works in memory, so the next step is to spill the buffer to disk if its size goes above a certain threshold.

I agree we need to do extensive testing before calling this supported. I propose to extend the CASSANDRA-10563 upgrade dtests to perform a bootstrap of a new node, without performing {{upgradesstables}} on the upgraded nodes (so they will contain only old sstables), and check that all streamed data is readable via thrift and/or cql.

I updated the previous branch with the latest changes.
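
The rewind idea in this comment amounts to a two-pass read over each partition. A toy sketch, under heavy assumptions: the partition encoding (a cell count followed by UTF-encoded cell names), the class {{TwoPassPartitionReader}} and its nested {{InMemoryRewindable}} wrapper are all invented here and are unrelated to the real {{RewindableDataInputStreamPlus}} or to the sstable format. The wrapper is purely in-memory and fails once its capacity is exceeded, which is the point where a disk spill would be needed.

```java
import java.io.*;
import java.util.*;

final class TwoPassPartitionReader {
    /** In-memory rewindable wrapper; fails once more than `capacity` bytes are recorded. */
    static final class InMemoryRewindable extends InputStream {
        private final InputStream source;
        private final int capacity;
        private final ByteArrayOutputStream recorded = new ByteArrayOutputStream();
        private ByteArrayInputStream replay;
        private boolean marked;

        InMemoryRewindable(InputStream source, int capacity) {
            this.source = source;
            this.capacity = capacity;
        }

        @Override public boolean markSupported() { return true; }

        @Override public void mark(int readLimitIgnored) {
            if (replay != null && replay.available() > 0)
                throw new IllegalStateException("mark() during replay is not handled here");
            recorded.reset();
            replay = null;
            marked = true;
        }

        @Override public int read() throws IOException {
            if (replay != null) {
                int b = replay.read();
                if (b >= 0) return b;
                replay = null;   // replay exhausted, continue with the source
            }
            if (marked && recorded.size() >= capacity)
                throw new IOException("rewind buffer exceeded " + capacity + " bytes");
            int b = source.read();
            if (b >= 0 && marked) recorded.write(b);
            return b;
        }

        @Override public void reset() throws IOException {
            if (!marked) throw new IOException("reset() without mark()");
            replay = new ByteArrayInputStream(recorded.toByteArray());
        }

        @Override public void close() throws IOException { source.close(); }
    }

    /** Toy encoding: cell count, then one UTF string per cell name. */
    static byte[] encode(String... names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(names.length);
            for (String n : names) out.writeUTF(n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Pass 1 collects cells of declared (static) columns, pass 2 the remaining ones. */
    static List<String> staticsFirst(byte[] partition, Set<String> declared, int capacity) {
        try (DataInputStream in = new DataInputStream(
                 new InMemoryRewindable(new ByteArrayInputStream(partition), capacity))) {
            int cells = in.readInt();
            in.mark(Integer.MAX_VALUE);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < cells; i++) {
                String name = in.readUTF();
                if (declared.contains(name)) out.add(name);
            }
            in.reset();          // rewind to the first cell for the second pass
            for (int i = 0; i < cells; i++) {
                String name = in.readUTF();
                if (!declared.contains(name)) out.add(name);
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, calling {{staticsFirst}} on {{encode("a", "name", "b", "age")}} with {{name}} and {{age}} declared yields the declared cells first, followed by the others in their original order.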
[jira] [Commented] (CASSANDRA-10990) Support streaming of older version sstables in 3.0
[ https://issues.apache.org/jira/browse/CASSANDRA-10990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110707#comment-15110707 ]

Sylvain Lebresne commented on CASSANDRA-10990:

The problem will be compact tables. With them, old nodes have their cells for declared columns intermingled with other cells, and we need to get all the former ones first for the new format (as they become static columns). When reading old sstables, we end up sometimes having to double-read partitions during compaction (thankfully it's only for some tables and it's a one-time thing) and do some tricks for reads (which are too long to explain here, but one can look at {{ThriftResultsMerger}} if interested). Anyway, we can't easily do this for streaming. The only viable solution I can see would be to first write all the "dynamic" cells to some temporary location, collecting the "static" ones while doing it, then write the static ones, and then copy back the dynamic ones. Doable, though it's getting complicated.

I also want to note that streaming sstables is in itself useless. What we care about is the features that use streaming, i.e. bootstrap, moving and repair. And at least for repair, this won't work out of the box, because 3.0 nodes don't compute the same digest as 2.X nodes. Now, 3.0 nodes are actually able to compute the 2.X ones (which we use for digest queries), but we need to add the code so they pick the right version to compute. As for bootstrap/moving, they "should" work out of the box in a mixed cluster (provided streaming works), but we've changed enough in 3.0, including with schemas, that we need a good battery of tests before claiming it does.

So anyway, we can likely solve all those problems, but it's worth noting that this ticket is probably a rather big one if you include the necessary testing (and I think it's better to say "sorry, but you can't do streaming-based operations during upgrade" than to pretend it works but have it break easily in practice).
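
The single-pass alternative outlined in this comment (park the "dynamic" cells in a temporary location, collect the "static" ones, then write statics first and copy the dynamic ones back) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the toy partition encoding (a cell count followed by UTF-encoded cell names), the class {{StaticFirstRewriter}}, and the use of a temp file in the default temp directory. It is not how Cassandra serializes or streams partitions.

```java
import java.io.*;
import java.nio.file.*;
import java.util.*;

final class StaticFirstRewriter {
    /** Toy encoding: cell count, then one UTF string per cell name. */
    static byte[] encode(String... names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(names.length);
            for (String n : names) out.writeUTF(n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static List<String> decode(byte[] partition) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(partition))) {
            int cells = in.readInt();
            List<String> names = new ArrayList<>();
            for (int i = 0; i < cells; i++) names.add(in.readUTF());
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** One pass over the input: dynamic cells go to a temp file, static cells stay in memory. */
    static byte[] rewrite(byte[] partition, Set<String> declared) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(partition))) {
            int cells = in.readInt();
            List<String> statics = new ArrayList<>();
            Path tmp = Files.createTempFile("dynamic-cells", ".tmp");
            try {
                try (DataOutputStream spill = new DataOutputStream(
                         new BufferedOutputStream(Files.newOutputStream(tmp)))) {
                    for (int i = 0; i < cells; i++) {
                        String name = in.readUTF();
                        if (declared.contains(name)) statics.add(name);
                        else spill.writeUTF(name);
                    }
                }
                ByteArrayOutputStream bytes = new ByteArrayOutputStream();
                DataOutputStream out = new DataOutputStream(bytes);
                out.writeInt(cells);
                for (String s : statics) out.writeUTF(s);  // static cells first
                out.flush();
                Files.copy(tmp, bytes);                    // then the parked dynamic cells
                return bytes.toByteArray();
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Compared with a rewind-based reader, this variant never re-reads the source, at the cost of writing every dynamic cell twice.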