[ 
https://issues.apache.org/jira/browse/HDFS-11769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16013279#comment-16013279
 ] 

Chen Liang commented on HDFS-11769:
-----------------------------------

I have a question about the usage of ksm.db that I did not notice earlier. 
Please correct me if any statement below is wrong.
When a volume gets created, there is
{code}
batch.put(DFSUtil.string2Bytes(args.getVolume()), volumeInfo.toByteArray());
{code}
and
{code}
batch.put(DFSUtil.string2Bytes(dbUserName), newVolList.toByteArray());
{code}
so there are two types of key-value information pushed/updated in ksm.db on 
each volume creation:
{{volumeName -> volumeInfo}}
{{userName -> volumeListOfThisUser}}

My concern is that if we try to iterate over ksm.db (say, for debugging), we 
may have a hard time figuring out what type of information the current entry 
holds. For example, given an entry from the iterator, is it a 
"volumeName -> volumeInfo" entry or a "userName -> volumeListOfThisUser" entry? 
There seems to be no way to tell, short of blindly trying to parse it and 
catching the error. I ran into this while trying to get the DEBUG CLI to work 
for ksm.db. In addition, if we allow the odd case where a user name is the 
same as an existing volume name (or vice versa), we will either overwrite 
things that we should not, or have to reject the insertion. 
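
To make the overwrite case concrete, here is a minimal sketch. It is only an 
illustration: a plain in-memory TreeMap stands in for ksm.db, string values 
stand in for the serialized protobufs, and the class and method names are 
hypothetical, not KSM code.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; a TreeMap stands in for ksm.db. */
final class KeyCollisionSketch {

  /** Mimics the two puts done on volume creation, in one flat key space. */
  static Map<String, String> createVolume(Map<String, String> db,
      String volume, String user) {
    // volumeName -> volumeInfo (placeholder string instead of a protobuf)
    db.put(volume, "volumeInfo(" + volume + ")");
    // userName -> volumeListOfThisUser (placeholder string as well)
    db.put(user, "volumeList(" + user + ")");
    return db;
  }

  public static void main(String[] args) {
    Map<String, String> db = new TreeMap<>();
    // A volume named "alice" owned by a user who is also named "alice".
    createVolume(db, "alice", "alice");
    // Both puts used the same key, so the second replaced the first.
    System.out.println(db);
  }
}
```

With both names equal, the second put replaces the first, so the volume info 
is lost unless the insertion is rejected up front.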

I checked the design doc in HDFS-11768, and it seems that maintaining a single 
ksm.db instance is by design. In that case this is not just about this JIRA 
but about everything in KSM, and we will have more than two types of 
information in ksm.db in the future. So I wonder: should we have separate db 
instances per entry type, or is it better to have some way to distinguish the 
entries within the one DB?

I think one easy way to distinguish the entry types is to add a prefix to the 
key for each type of entry, assuming keys are all strings. For example, for a 
"volumeName -> volumeInfo" entry, we would actually insert 
"VOLUMENAME_volumeName -> volumeInfo", i.e. add the prefix "VOLUMENAME_" to the 
volume name. (It doesn't have to be as long as "VOLUMENAME"; it can be as 
short as a single character.) This way, when iterating the db, we will know 
this is a volume info entry. 
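
A rough sketch of the prefix idea follows. The prefixes "V_" and "U_", the 
EntryType enum, and the helper names are all assumptions made up for this 
example, and a TreeMap again stands in for ksm.db. In the real code the 
prefixed string would still go through DFSUtil.string2Bytes as in the snippets 
above.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of type-prefixed keys; not the KSM implementation. */
final class PrefixedKeySketch {

  enum EntryType { VOLUME, USER, UNKNOWN }

  // Short prefixes chosen arbitrarily for this sketch.
  static final String VOLUME_PREFIX = "V_";
  static final String USER_PREFIX = "U_";

  /** Key for a "volumeName -> volumeInfo" entry. */
  static String volumeKey(String volumeName) {
    return VOLUME_PREFIX + volumeName;
  }

  /** Key for a "userName -> volumeListOfThisUser" entry. */
  static String userKey(String userName) {
    return USER_PREFIX + userName;
  }

  /** Recovers the entry type from a stored key by inspecting its prefix. */
  static EntryType typeOf(String dbKey) {
    if (dbKey.startsWith(VOLUME_PREFIX)) {
      return EntryType.VOLUME;
    }
    if (dbKey.startsWith(USER_PREFIX)) {
      return EntryType.USER;
    }
    return EntryType.UNKNOWN;
  }

  public static void main(String[] args) {
    Map<String, String> db = new TreeMap<>();
    // Same volume name and user name, but the keys no longer collide.
    db.put(volumeKey("alice"), "volumeInfo(alice)");
    db.put(userKey("alice"), "volumeList(alice)");
    // While iterating, the prefix tells us how to parse each value.
    for (Map.Entry<String, String> e : db.entrySet()) {
      System.out.println(typeOf(e.getKey()) + ": " + e.getKey()
          + " -> " + e.getValue());
    }
  }
}
```

Because every key of one type shares a prefix, a user and a volume with the 
same name end up under different keys, and an iterator can decide how to parse 
each value without trial and error.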

Any comments? cc. [~xyao], [~msingh], [~anu], [~cheersyang]

> Ozone: KSM: Add createVolume API
> --------------------------------
>
>                 Key: HDFS-11769
>                 URL: https://issues.apache.org/jira/browse/HDFS-11769
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>    Affects Versions: HDFS-7240
>            Reporter: Anu Engineer
>            Assignee: Mukul Kumar Singh
>             Fix For: HDFS-7240
>
>         Attachments: HDFS-11769-HDFS-7240.001.patch, 
> HDFS-11769-HDFS-7240.002.patch, HDFS-11769-HDFS-7240.003.patch, 
> HDFS-11769-HDFS-7240.004.patch, HDFS-11769-HDFS-7240.005.patch, 
> HDFS-11769-HDFS-7240.006.patch, HDFS-11769-HDFS-7240.007.patch, 
> HDFS-11769-HDFS-7240.008.patch
>
>
> Add createVolume API in KSM.
> createVolume API allows administrators to create a volume. The arguments to 
> the API are:
> * Admin Name - The name of the administrator who is creating this volume. 
> Volumes can be created only by administrators.
> *  User Name - The name of the owner of this volume.
> * Quota - Quota information for this volume.
> This JIRA proposes to add the protobuf layer for createVolume and handling in 
> KSM. We will file a followup JIRA for the KSM client.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
