dlmarion commented on PR #2808:
URL: https://github.com/apache/accumulo/pull/2808#issuecomment-1192889478

   LGTM. There is an open question as to whether or not we should add this 
utility to the `accumulo` command. Based on the comments above and a discussion 
with @EdColeman, I am of the opinion that the utility should be removed. As 
described 
[here](https://hadoop.apache.org/docs/r3.3.3/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html#Replica_Placement:_The_First_Baby_Steps),
the HDFS write pipeline will put a replica on the local machine if the writer is 
on a DataNode. Blocks for a file will only be local if a compaction has occurred 
and *all* of the following are true:
   
     1. The TabletServer is running on a host with a DataNode.
     2. The Tablet has not been re-hosted on a different TabletServer on a 
different host.
     3. In the case of an External Compaction, the Compactor running the 
compaction is on the same host as the Tablet's TabletServer.
     4. The Hadoop configuration uses a [block placement 
policy](https://hadoop.apache.org/docs/r3.3.3/hadoop-project-dist/hadoop-hdfs/HdfsBlockPlacementPolicies.html)
 that places a block local to the writer.
     5. Erasure Coding is not being used.
     6. The HDFS balancer has not moved any of the file's blocks off the local 
node.
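
To make the "all of the following" requirement concrete, here is a minimal sketch of the conditions above as a single predicate. It is illustrative only: `LocalityConditions`, its field names, and `blocks_expected_local` are hypothetical and do not correspond to any Accumulo or Hadoop API. It just shows how one failing condition is enough to lose locality.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LocalityConditions:
    """Hypothetical flags, one per condition in the list above."""
    compaction_occurred: bool
    tserver_on_datanode_host: bool       # condition 1
    tablet_not_rehosted: bool            # condition 2
    external_compaction: bool            # whether condition 3 applies at all
    compactor_on_tserver_host: bool      # condition 3
    placement_policy_writes_local: bool  # condition 4
    erasure_coding_enabled: bool         # condition 5 (must be False)
    balancer_moved_blocks: bool          # condition 6 (must be False)


def blocks_expected_local(c: LocalityConditions) -> bool:
    """Return True only when every condition for local blocks holds."""
    # Condition 3 only matters when the compaction ran externally.
    compactor_ok = (not c.external_compaction) or c.compactor_on_tserver_host
    return all([
        c.compaction_occurred,
        c.tserver_on_datanode_host,
        c.tablet_not_rehosted,
        compactor_ok,
        c.placement_policy_writes_local,
        not c.erasure_coding_enabled,
        not c.balancer_moved_blocks,
    ])


# Assumed best case: an internal compaction with every condition satisfied.
best_case = LocalityConditions(
    compaction_occurred=True,
    tserver_on_datanode_host=True,
    tablet_not_rehosted=True,
    external_compaction=False,
    compactor_on_tserver_host=False,
    placement_policy_writes_local=True,
    erasure_coding_enabled=False,
    balancer_moved_blocks=False,
)

# Flipping a single flag (here, the balancer moving blocks) breaks locality.
after_balancer = replace(best_case, balancer_moved_blocks=True)
```

With these assumed inputs, `blocks_expected_local(best_case)` is true and `blocks_expected_local(after_balancer)` is false, which is the point of the list: locality is a conjunction of conditions, several of which are outside Accumulo's control.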


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

