[ https://issues.apache.org/jira/browse/HDDS-328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16574842#comment-16574842 ]

Elek, Marton commented on HDDS-328:
-----------------------------------

Thank you both for the very valuable review.

I uploaded the new patch where I fixed most of the items:

[~bharatviswa]:
{quote}1. KeyValueContainer.java:
 Line 172: Typo in the comment. realted -> related
 2. In KeyValueContainer.java, in importContainerData, we have to update the 
following fields of KeyValueContainerData from the original containerData: 
checksum, dbType, state.
 3. The changes in KeyValueContainerData.java only touch imports; is this 
change needed?
 5. In KeyValueHandler populateContainerPathFields(), we need to acquire the 
volumeSet lock, so that if some volume gets removed during chooseVolume, it 
will avoid doing that.
 6. Checkstyle issues in TestContainerPacker. Lines 108 and 161 need an 
uppercase L.
 7. In TestKeyValueContainer, after unpack, can we verify all the fields of the 
newly replaced containerData against the old containerData fields? (No need to 
compare the checksum, as it will change.)
 8. Also in TestKeyValueContainer, the size parameter passed should be in GB. 
Line 205.
{quote}

All of them are fixed, thanks. Regarding 2.: I copy the dbType/state; the 
checksum and the counters are recalculated.
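
A minimal sketch of that field copy (the setter names here are assumptions, 
not necessarily the exact patch code):

{code}
// Hypothetical sketch of the copy in KeyValueContainer#importContainerData:
// dbType and state come from the original descriptor; the checksum and the
// key/byte counters are recalculated after the files are unpacked.
containerData.setContainerDBType(
    originalContainerData.getContainerDBType());
containerData.setState(originalContainerData.getState());
{code}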

{quote}
4. We can use these constants from OzoneConsts.java instead of redefining 
them. If any of these get changed in the future, it will all be in one place.
{quote}

Not sure about this. If it's about the TarContainerPacker, the directory names 
inside the tar files are independent of the structure on the disk; they are 
more of a logical structure. For example, there is no "db" directory on the 
disk. I started to use the same "chunk" folder name from OzoneConsts, but I 
can't change the names of the db and container file entries. I changed them to 
private to make it clear that they are internal.
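
For illustration, the private constants could look like the sketch below (the 
exact names and values are assumptions, not the patch itself):

{code}
// Logical entry names inside the container tar archive. They describe the
// archive layout, not the on-disk layout, so they are private to
// TarContainerPacker.
private static final String CHUNKS_DIR_NAME = "chunks"; // same name as in OzoneConsts
private static final String DB_DIR_NAME = "db";
private static final String CONTAINER_FILE_NAME = "container.yaml";
{code}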

[~hanishakoneru]:

{quote}
 * Do we allow copying open containers? I am thinking not. We should add a 
check that the container is closed before packing it.
 * Is there any reason for making the container file in the tarball a hidden 
file?
 * Can we change the *_METADATADB_DIR_NAME to indicate that it is the db file? 
Currently, we have both the .container file and the .db file under the 
metadata directory, so it might get confusing what "metadata" denotes.
 * We should acquire the container write lock before unpacking and updating in 
KeyValueContainer#importContainerData.
{quote}

You are right; these are fixed now. I also renamed the .container file to 
container.yaml.
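
For the closed-container check, a minimal sketch (the exception type and the 
error code are assumptions; the patch may differ):

{code}
// Hypothetical guard in the export path: only closed containers may be
// packed and copied to another datanode.
if (containerData.getState()
    != ContainerProtos.ContainerLifeCycleState.CLOSED) {
  throw new StorageContainerException(
      "Only closed containers can be exported: "
          + containerData.getContainerID(),
      ContainerProtos.Result.UNCLOSED_CONTAINER_IO);
}
{code}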

{quote}
 * When creating a container object and choosing the volume for it, we need to 
specify the max container size. The max size is a configurable parameter, so 
the destination node might have a different max size setting than the source 
container.
 For example, let's say the container to be copied has a 5GB max size and is 
full. We want to copy it to a node with max size set to 2GB, and as such we 
choose a volume with 2.5GB of space left. When actually copying the container 
data, we would get a disk-out-of-space exception, as we are trying to copy 5GB 
while we have only 2.5GB of space on the disk.
 To avoid this, we should first copy and extract the container tar into the 
new node and then update the path variables.
{quote}

Thanks, understood. I modified the behaviour in HDDS-75 (in this patch it 
works because I set the same size in the unit test):

1. I added a new method to the ContainerPacker which extracts just the yaml 
descriptor file from the archive as a string
2. I parse the yaml string into the original ContainerData
3. I create the container and allocate the path based on the original max size
4. I copy the data to the destination paths
5. I refresh the required fields/metadata (see the sketch below)
6. I add the container to the containerSet
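
For steps 4-5, a rough sketch of the lock-protected import (the method body 
here is illustrative, not the exact patch code):

{code}
// Hypothetical shape of KeyValueContainer#importContainerData: the container
// write lock is held while the archive is unpacked and the metadata is
// refreshed, per the review comments above.
public void importContainerData(InputStream input,
    ContainerPacker<KeyValueContainerData> packer) throws IOException {
  writeLock();
  try {
    // extract the chunks, the db and the descriptor file from the archive
    packer.unpackContainerData(this, input);
    // persist the refreshed container descriptor (container.yaml)
    update(containerData.getMetadata(), true);
  } finally {
    writeUnlock();
  }
}
{code}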

{quote} 
 * Once the container has been successfully copied, we should update its other 
metrics such as keyCount
{quote}
 
Yep, I added a KeyValueContainerUtil.parseKVContainerData call after the 
import and adjusted the unit test. Now it works for the keyCount and the used 
bytes.
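
Roughly like this (parseKVContainerData is the real call; the surrounding 
lines and variable names are only illustrative):

{code}
// After unpacking the archive, re-read the key count and the used bytes
// from the imported container DB.
container.importContainerData(tarStream, packer);
KeyValueContainerUtil.parseKVContainerData(containerData, conf);
{code}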

[~bharatviswa]

{quote}
1. In KeyValueHandler populateContainerPathFields(), how will it be used 
during this copyContainer flow? And when do we add it to the ContainerSet 
after copying it to the datanode?
{quote}

I just rebased the HDDS-75 patch on top of this patch, where you can check the 
usage in ReplicateContainerCommandHandler. The simplified version is something 
like this:

{code}
// read only the container descriptor (yaml) from the archive
String containerDescriptorYaml =
    packer.readContainerDescriptorFromArchive(tempContainerTarStream);

// parse the descriptor of the original (source) container
originalContainerData = ContainerDataYaml.readContainer(
    new ByteArrayInputStream(containerDescriptorYaml.getBytes()));

// create a new containerData with the original max size
KeyValueContainerData containerData =
    new KeyValueContainerData(containerID,
        originalContainerData.getMaxSizeGB());

KeyValueContainer container = new KeyValueContainer(containerData, conf);

// allocate the on-disk paths on the destination datanode
keyValueHandler.populateContainerPathFields(container,
    originalContainerData.getMaxSizeGB());

// extract the archive and refresh the metadata fields
container.importContainerData(tempContainerTarStream, packer);

containerSet.addContainer(container);
{code}

I was not sure which is the safer way: import the original containerData yaml 
file and use that, but overwrite the path fields; or create a new 
containerData and overwrite all the valuable fields. Currently I follow the 
second approach, but the opposite would also be possible...

Thanks again for the review, I learned a lot...

> Support export and import of the KeyValueContainer
> --------------------------------------------------
>
>                 Key: HDDS-328
>                 URL: https://issues.apache.org/jira/browse/HDDS-328
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>          Components: Ozone Datanode
>            Reporter: Elek, Marton
>            Assignee: Elek, Marton
>            Priority: Critical
>             Fix For: 0.2.1
>
>         Attachments: HDDS-328.002.patch, HDDS-328.003.patch
>
>
> In HDDS-75 we pack the container data to an archive file, copy it to other 
> datanodes and create the container from the archive.
> As I wrote in a comment on HDDS-75, I propose to separate the patch to make 
> it easier to review.
> In this patch we need to extend the existing Container interface by adding 
> export/import methods to save the container data to a single binary 
> input/output stream.
>  


