[ https://issues.apache.org/jira/browse/CASSANDRA-14556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16556429#comment-16556429 ]

ASF GitHub Bot commented on CASSANDRA-14556:
--------------------------------------------

Github user dineshjoshi commented on a diff in the pull request:

    https://github.com/apache/cassandra/pull/239#discussion_r205297558
  
    --- Diff: src/java/org/apache/cassandra/db/streaming/ComponentManifest.java ---
    @@ -0,0 +1,129 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.cassandra.db.streaming;
    +
    +import java.io.IOException;
    +import java.util.Collections;
    +import java.util.LinkedHashMap;
    +import java.util.LinkedHashSet;
    +import java.util.Map;
    +import java.util.Objects;
    +import java.util.Set;
    +
    +import org.apache.cassandra.db.TypeSizes;
    +import org.apache.cassandra.io.IVersionedSerializer;
    +import org.apache.cassandra.io.sstable.Component;
    +import org.apache.cassandra.io.util.DataInputPlus;
    +import org.apache.cassandra.io.util.DataOutputPlus;
    +
    +public class ComponentManifest
    +{
    +    private final LinkedHashMap<Component.Type, Long> manifest;
    +    private final Set<Component> components = new LinkedHashSet<>(Component.Type.values().length);
    +    private final long totalSize;
    +
    +    public ComponentManifest(Map<Component.Type, Long> componentManifest)
    +    {
    +        this.manifest = new LinkedHashMap<>(componentManifest);
    +
    +        long size = 0;
    +        for (Map.Entry<Component.Type, Long> entry : this.manifest.entrySet())
    +        {
    +            size += entry.getValue();
    +            this.components.add(Component.parse(entry.getKey().repr));
    +        }
    +
    +        this.totalSize = size;
    +    }
    +
    +    public Long getSizeForType(Component.Type type)
    +    {
    +        return manifest.get(type);
    +    }
    +
    +    public long getTotalSize()
    +    {
    +        return totalSize;
    +    }
    +
    +    public Set<Component> getComponents()
    +    {
    +        return Collections.unmodifiableSet(components);
    +    }
    +
    +    public boolean equals(Object o)
    +    {
    +        if (this == o) return true;
    +        if (o == null || getClass() != o.getClass()) return false;
    +        ComponentManifest that = (ComponentManifest) o;
    +        return totalSize == that.totalSize &&
    +               Objects.equals(manifest, that.manifest);
    +    }
    +
    +    public int hashCode()
    +    {
    +
    +        return Objects.hash(manifest, totalSize);
    +    }
    +
    +    public static final IVersionedSerializer<ComponentManifest> serializer = new IVersionedSerializer<ComponentManifest>()
    +    {
    +        public void serialize(ComponentManifest manifest, DataOutputPlus out, int version) throws IOException
    +        {
    +            out.writeInt(manifest.manifest.size());
    +            for (Map.Entry<Component.Type, Long> entry : manifest.manifest.entrySet())
    +                serialize(entry.getKey(), entry.getValue(), out);
    +        }
    +
    +        public ComponentManifest deserialize(DataInputPlus in, int version) throws IOException
    +        {
    +            LinkedHashMap<Component.Type, Long> components = new LinkedHashMap<>(Component.Type.values().length);
    +
    +            int size = in.readInt();
    +            assert size >= 0 : "Invalid number of components";
    +
    +            for (int i = 0; i < size; i++)
    +            {
    +                Component.Type type = Component.Type.fromRepresentation(in.readByte());
    +                long length = in.readLong();
    +                components.put(type, length);
    +            }
    +
    +            return new ComponentManifest(components);
    +        }
    +
    +        public long serializedSize(ComponentManifest manifest, int version)
    +        {
    +            long size = 0;
    +            size += TypeSizes.sizeof(manifest.manifest.size());
    +            for (Map.Entry<Component.Type, Long> entry : manifest.manifest.entrySet())
    +            {
    +                size += TypeSizes.sizeof(entry.getKey().id);
    +                size += TypeSizes.sizeof(entry.getValue());
    +            }
    +            return size;
    +        }
    +
    +        private void serialize(Component.Type type, long size, DataOutputPlus out) throws IOException
    +        {
    +            out.writeByte(type.id);
    --- End diff --
    
    @jasobrown and I debated this. We chose the byte approach because it 
seemed better than using ordinals. In degenerate cases with a large number of 
SSTables, this will make a difference, as we will be sending a lot of these 
messages around. Unless you strongly object, I am inclined to leave this 
as-is.
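
To make the trade-off above concrete, here is a minimal sketch of the wire layout the diff appears to use: an int entry count followed by one (byte id, long length) pair per component. This is one reading of the comment, not the patch itself. `SketchType`, its ids, and `ManifestWireSketch` are hypothetical stand-ins for `Component.Type` and the serializer, and plain `java.io` streams replace `DataOutputPlus`/`DataInputPlus`. An explicit byte id costs one byte per entry and, unlike `ordinal()`, does not change if the enum constants are reordered.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for Component.Type; names and ids are illustrative only.
enum SketchType
{
    DATA(0), PRIMARY_INDEX(1), FILTER(2);

    final byte id; // explicit wire id, independent of declaration order

    SketchType(int id)
    {
        this.id = (byte) id;
    }

    static SketchType fromId(byte id)
    {
        for (SketchType type : values())
            if (type.id == id)
                return type;
        throw new IllegalArgumentException("Unknown type id: " + id);
    }
}

final class ManifestWireSketch
{
    // Layout: int count, then (byte id, long length) for each entry.
    static byte[] write(Map<SketchType, Long> manifest)
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes))
        {
            out.writeInt(manifest.size());
            for (Map.Entry<SketchType, Long> entry : manifest.entrySet())
            {
                out.writeByte(entry.getKey().id);
                out.writeLong(entry.getValue());
            }
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static Map<SketchType, Long> read(byte[] wire)
    {
        Map<SketchType, Long> manifest = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire)))
        {
            int count = in.readInt();
            if (count < 0)
                throw new IOException("Invalid number of components: " + count);
            for (int i = 0; i < count; i++)
            {
                SketchType type = SketchType.fromId(in.readByte());
                manifest.put(type, in.readLong());
            }
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return manifest;
    }
}
```

With this layout a manifest of n entries occupies 4 + 9n bytes, so the per-entry overhead of the type tag is a single byte however many manifests are exchanged.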


> Optimize streaming path in Cassandra
> ------------------------------------
>
>                 Key: CASSANDRA-14556
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-14556
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Streaming and Messaging
>            Reporter: Dinesh Joshi
>            Assignee: Dinesh Joshi
>            Priority: Major
>              Labels: Performance
>             Fix For: 4.x
>
>
> During streaming, Cassandra reifies the sstables into objects. This creates 
> unnecessary garbage and slows down the whole streaming process as some 
> sstables can be transferred as a whole file rather than individual 
> partitions. The objective of the ticket is to detect when a whole sstable can 
> be transferred and skip the object reification. We can also use a zero-copy 
> path to avoid bringing data into user-space on both the sending and 
> receiving sides.
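
As an illustration of what a zero-copy path can look like on the JVM, the sketch below sends a whole file with `FileChannel.transferTo`, which lets the operating system move bytes to the target channel without first copying them into a user-space buffer where the platform supports it. This is an assumption-laden sketch, not the ticket's implementation: `ZeroCopySketch` and `transferWholeFile` are hypothetical names, and nothing in this message states which API the patch uses.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class ZeroCopySketch
{
    // Transfers the entire file to the target channel and returns the bytes sent.
    static long transferWholeFile(Path file, WritableByteChannel target)
    {
        try (FileChannel source = FileChannel.open(file, StandardOpenOption.READ))
        {
            long size = source.size();
            long position = 0;
            // transferTo may send fewer bytes than requested, so loop until done.
            while (position < size)
            {
                long sent = source.transferTo(position, size - position, target);
                if (sent <= 0)
                    break; // stop rather than spin if no progress is made
                position += sent;
            }
            return position;
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a streaming setting the target would typically be a socket channel; the caller should compare the returned count with the expected file size, since the loop stops early if the target accepts no more bytes.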



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
