[
https://issues.apache.org/jira/browse/HDFS-7056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14253948#comment-14253948
]
Colin Patrick McCabe commented on HDFS-7056:
--------------------------------------------
{code}
public boolean truncate(String src, long newLength, String clientName)
{code}
How about calling the first argument "target" rather than "src"? "src" and
"dst" make sense for {{rename}}, but not so much for truncate, since there is
no "destination".
I would still like to see these functions return an enum like:
{code}
enum TruncateResult { FINISHED, IN_PROGRESS }
{code}
The problem with returning a boolean is that people will assume that if it
returns false, the operation failed. This is how {{File#delete}} works, how
{{File#mkdir}} works, how pretty much everything in the Java standard library
works. We have some stuff like this in Hadoop's {{FileSystem.java}} as well.
So it will be confusing to users if our API returns a boolean that means
something different. But by returning an enum, we make it clear what is going
on. Of course, we can still use booleans in the protobuf if that's more
convenient. This is just a user API thing.
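To make the distinction concrete, here is a minimal sketch of the enum-returning shape being suggested. The class, the toy block size, and the boundary check are all illustrative stand-ins, not the actual HDFS signature:

```java
// Sketch of the suggested user-facing truncate API. Names and the toy
// block-boundary logic are illustrative, not the real HDFS code.
public class TruncateApiSketch {

    // Returning an enum makes the "still recovering" case explicit,
    // unlike a boolean that readers may mistake for success/failure.
    public enum TruncateResult { FINISHED, IN_PROGRESS }

    public static TruncateResult truncate(long currentLength, long newLength) {
        if (newLength == currentLength) {
            return TruncateResult.FINISHED;   // nothing to do
        }
        // A truncate that lands mid-block needs block recovery first,
        // so the new length only becomes visible once recovery completes.
        boolean onBlockBoundary = newLength % 128 == 0; // toy block size
        return onBlockBoundary ? TruncateResult.FINISHED
                               : TruncateResult.IN_PROGRESS;
    }
}
```

With this shape, a caller who sees {{IN_PROGRESS}} knows to wait for recovery rather than conclude the call failed.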
{code}
* @param bc file
+ * @param lastBlockDelta num of bytes to remove from block
* @return the last block locations if the block is partial or null otherwise
*/
public LocatedBlock convertLastBlockToUnderConstruction(
- BlockCollection bc) throws IOException {
+ BlockCollection bc, long lastBlockDelta) throws IOException {
BlockInfo oldBlock = bc.getLastBlock();
if(oldBlock == null ||
- bc.getPreferredBlockSize() == oldBlock.getNumBytes())
+ bc.getPreferredBlockSize() == oldBlock.getNumBytes() - lastBlockDelta)
{code}
How about calling this {{bytesToRemove}} rather than {{lastBlockDelta}}? When
I think of "delta," I think of something that could be either positive or
negative.
{code}
}
+    // If we are performing a truncate recovery then set recovery fields
+    // to old block.
+ boolean truncateRecovery = b.getRecoveryBlock() != null;
+ boolean copyOnTruncateRecovery = truncateRecovery &&
+ b.getRecoveryBlock().getBlockId() != b.getBlockId();
+ ExtendedBlock primaryBlock = (copyOnTruncateRecovery) ?
+ new ExtendedBlock(blockPoolId, b.getRecoveryBlock()) :
+ new ExtendedBlock(blockPoolId, b);
    // If we only get 1 replica after eliminating stale nodes, then choose all
{code}
{{recoveryBlock}} seems like a very generic name for this. It seems that
{{BlockInfoUnderConstruction#recoveryBlock}} is only used for truncate, never
for anything else. So let's call it {{BlockInfoUnderConstruction#truncateDst}}
or something.
{code}
message RecoveringBlockProto {
required uint64 newGenStamp = 1; // New genstamp post recovery
required LocatedBlockProto block = 2; // Block to be recovered
+ optional BlockProto newBlock = 3; // New block for recovery (truncate)
}
{code}
Same comment here. Since this is only for truncate, put truncate in the name
somewhere.
{code}
--- a/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/InterDatanodeProtocol.proto
+++ b/hadoop-hdfs-project/hadoop-hdfs/src/main/proto/InterDatanodeProtocol.proto
@@ -59,6 +59,7 @@ message UpdateReplicaUnderRecoveryRequestProto {
required ExtendedBlockProto block = 1; // Block identifier
required uint64 recoveryId = 2; // New genstamp of the replica
required uint64 newLength = 3; // New length of the replica
+ optional uint64 newBlockId = 4; // New blockId for copy (truncate)
}
{code}
There isn't any default given here for {{newBlockId}}, so what's going to
happen in a mixed-version scenario, where older datanodes are communicating
with newer ones? Like in this code:
{code}
storageID = impl.updateReplicaUnderRecovery(
PBHelper.convert(request.getBlock()), request.getRecoveryId(),
request.getNewBlockId(), request.getNewLength());
{code}
It will either throw a NullPointerException or return 0 unexpectedly, right?
I would say just pass in a {{Long}} rather than a {{long}}, and check for
null. If it's {{null}}, then don't do anything to the id.
This would also mean that you could just pass null (i.e. don't pass anything
for the optional value) rather than re-setting the block's ID to itself, when
doing a non-truncate recovery.
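A minimal sketch of the null-check being suggested. It assumes the standard proto2 pattern where an optional field gets {{hasNewBlockId()}}/{{getNewBlockId()}} accessors, so the caller can map "field absent" to {{null}}; the class and method names here are illustrative:

```java
// Sketch of handling an optional protobuf field with a boxed Long.
// A request from an old datanode simply won't set newBlockId, and the
// caller would translate that absence into null, e.g.:
//   Long newBlockId = request.hasNewBlockId()
//       ? request.getNewBlockId() : null;
public class UpdateReplicaSketch {

    // Boxed Long distinguishes "field absent" from a real block id,
    // instead of silently reading the primitive default of 0.
    public static long resolveBlockId(long currentBlockId, Long newBlockId) {
        if (newBlockId == null) {
            return currentBlockId;  // non-truncate recovery: keep the id
        }
        return newBlockId;          // copy-on-truncate: switch to new id
    }
}
```

This also gives the non-truncate path a natural encoding: just don't set the optional field, rather than re-sending the block's own id.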
{code}
boolean copyOnTruncate = rur.getBlockId() != newBlockId;
{code}
Can we have an accessor for this in {{ReplicaUnderRecovery}}? I see this
pattern in a lot of places.
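The suggested accessor could look something like the following sketch. This is a hypothetical minimal stand-in for {{ReplicaUnderRecovery}}, reduced to just the two ids involved:

```java
// Hypothetical minimal stand-in for ReplicaUnderRecovery, showing the
// suggested accessor that centralizes the copy-on-truncate check.
public class ReplicaUnderRecoverySketch {
    private final long blockId;
    private final long recoveryBlockId;

    public ReplicaUnderRecoverySketch(long blockId, long recoveryBlockId) {
        this.blockId = blockId;
        this.recoveryBlockId = recoveryBlockId;
    }

    // One named method replaces the "ids differ => copy-on-truncate"
    // comparison that the patch currently repeats at several call sites.
    public boolean isCopyOnTruncate() {
        return blockId != recoveryBlockId;
    }
}
```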
{code}
* 2.) INode length is truncated now – clients can only read up to new length.
{code}
Should be "*new* clients can only read up to the truncated length."
Old clients may be able to read above the truncated length for a while.
{code}
@Override
public void writeFields(DataOutputStream out) throws IOException {
FSImageSerialization.writeString(src, out);
FSImageSerialization.writeString(clientName, out);
FSImageSerialization.writeString(clientMachine, out);
FSImageSerialization.writeLong(newLength, out);
FSImageSerialization.writeLong(timestamp, out);
int size = truncateBlock != null ? 1 : 0;
Block[] blocks = new Block[size];
if (truncateBlock != null) {
blocks[0] = truncateBlock;
}
FSImageSerialization.writeCompactBlockArray(blocks, out);
}
{code}
This code writes out all of the fields in truncateBlock: {{blockId}},
{{numBytes}}, {{generationStamp}}. But we only care about {{blockId}} and
{{generationStamp}}, right? It might be better just to manually write out
{{blockId}} and {{generationStamp}}, rather than using
{{FSImageSerialization.writeCompactBlockArray}}.
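For illustration, a sketch of what writing only the two needed fields might look like. The presence flag and field layout are assumptions for this sketch, not the actual FSImage edit-log format:

```java
import java.io.DataOutputStream;
import java.io.IOException;

// Sketch of writing only blockId and generationStamp, instead of
// serializing a whole Block (including numBytes) via
// writeCompactBlockArray. Layout here is illustrative only.
public class TruncateBlockWriterSketch {

    public static void writeTruncateBlock(DataOutputStream out,
            Long blockId, Long genStamp) throws IOException {
        boolean present = blockId != null;
        out.writeBoolean(present);      // presence flag replaces the 0/1-length array
        if (present) {
            out.writeLong(blockId);     // only the fields we care about
            out.writeLong(genStamp);    // numBytes intentionally omitted
        }
    }
}
```

The reader side would mirror this: read the flag, then conditionally read the two longs back.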
{code}
if(oldLength == newLength)
return true;
if(oldLength < newLength)
throw new HadoopIllegalArgumentException(
"Cannot truncate to a larger file size. Current size: " + oldLength +
", truncate size: " + newLength + ".");
{code}
Missing braces around the "if". The same applies in a few other places.
{{INodeFile}}: can these changes be split out into a separate patch? They seem
like they could be done as a standalone patch since they just add utility
methods and so on. Same with the changes in {{AbstractINodeDiffList}} and some
of the other snapshot stuff.
> Snapshot support for truncate
> -----------------------------
>
> Key: HDFS-7056
> URL: https://issues.apache.org/jira/browse/HDFS-7056
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Affects Versions: 3.0.0
> Reporter: Konstantin Shvachko
> Assignee: Plamen Jeliazkov
> Attachments: HDFS-3107-HDFS-7056-combined.patch,
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch,
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch,
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch,
> HDFS-3107-HDFS-7056-combined.patch, HDFS-3107-HDFS-7056-combined.patch,
> HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch,
> HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch, HDFS-7056.patch,
> HDFSSnapshotWithTruncateDesign.docx
>
>
> Implementation of truncate in HDFS-3107 does not allow truncating files
> which are in a snapshot. It is desirable to be able to truncate a file and
> still keep its old state in the snapshot.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)