[ https://issues.apache.org/jira/browse/CASSANDRA-14556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16553030#comment-16553030 ]
ASF GitHub Bot commented on CASSANDRA-14556: -------------------------------------------- Github user iamaleksey commented on a diff in the pull request: https://github.com/apache/cassandra/pull/239#discussion_r204457225 --- Diff: src/java/org/apache/cassandra/db/streaming/CassandraBlockStreamReader.java --- @@ -0,0 +1,184 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.cassandra.db.streaming; + +import java.io.File; +import java.io.IOException; +import java.util.List; +import java.util.Set; + +import com.google.common.base.Throwables; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.cassandra.db.ColumnFamilyStore; +import org.apache.cassandra.db.DecoratedKey; +import org.apache.cassandra.db.Directories; +import org.apache.cassandra.db.SerializationHeader; +import org.apache.cassandra.db.lifecycle.LifecycleTransaction; +import org.apache.cassandra.io.sstable.Component; +import org.apache.cassandra.io.sstable.Descriptor; +import org.apache.cassandra.io.sstable.SSTableMultiWriter; +import org.apache.cassandra.io.sstable.format.SSTableFormat; +import org.apache.cassandra.io.sstable.format.Version; +import org.apache.cassandra.io.sstable.format.big.BigTableBlockWriter; +import org.apache.cassandra.io.util.DataInputPlus; +import org.apache.cassandra.schema.TableId; +import org.apache.cassandra.streaming.ProgressInfo; +import org.apache.cassandra.streaming.StreamReceiver; +import org.apache.cassandra.streaming.StreamSession; +import org.apache.cassandra.streaming.messages.StreamMessageHeader; +import org.apache.cassandra.utils.Collectors3; +import org.apache.cassandra.utils.FBUtilities; + +/** + * CassandraBlockStreamReader reads SSTable off the wire and writes it to disk. + */ +public class CassandraBlockStreamReader implements IStreamReader +{ + private static final Logger logger = LoggerFactory.getLogger(CassandraBlockStreamReader.class); + protected final TableId tableId; + protected final StreamSession session; + protected final int sstableLevel; --- End diff -- It has taken me some time (and @krummas's help) to prove that this wasn't a correctness issue, but at its best this is confusing/misleading code. We extract `sstableLevel` from the header, but don't use it anywhere. Instead, since we stream `StatsMetadata` directly, we also inherit the level from there - regardless of whether `CassandraOutgoingStream.keepSSTableLevel` is set to `true`. If `LeveledManifest.canAddSSTable` check didn't exist, we'd be in trouble here. For clarity, I would probably look at that flag, and explicitly reset the level to `L0` if `keepSSTableLevel` is set to `false`. P.S. What's the deal with all these `protected` fields? > Optimize streaming path in Cassandra > ------------------------------------ > > Key: CASSANDRA-14556 > URL: https://issues.apache.org/jira/browse/CASSANDRA-14556 > Project: Cassandra > Issue Type: Improvement > Components: Streaming and Messaging > Reporter: Dinesh Joshi > Assignee: Dinesh Joshi > Priority: Major > Labels: Performance > Fix For: 4.x > > > During streaming, Cassandra reifies the sstables into objects. This creates > unnecessary garbage and slows down the whole streaming process as some > sstables can be transferred as a whole file rather than individual > partitions. The objective of the ticket is to detect when a whole sstable can > be transferred and skip the object reification. We can also use a zero-copy > path to avoid bringing data into user-space on both sending and receiving > side. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org For additional commands, e-mail: commits-h...@cassandra.apache.org