[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-11-24 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15025185#comment-15025185
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8246:
---

Changed Resolution from "Auto Closed" to "Won't Fix".

> Get HDFS file name based on block pool id and block id
> --
>
> Key: HDFS-8246
> URL: https://issues.apache.org/jira/browse/HDFS-8246
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client, namenode
>Reporter: feng xu
>Assignee: feng xu
>  Labels: BB2015-05-TBR
> Attachments: HDFS-8246.0.patch
>
>
> This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
> name based on block pool id and block id.
> 1. The Java API in class DistributedFileSystem
> public String getFileName(String poolId, long blockId) throws IOException
> 2. The C API in hdfs.c
> char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
> 3. The HDFS shell command 
>  hdfs dfs [generic options] -fn  
> This feature is useful if you have HDFS block file name in local file system 
> and want to  find out the related HDFS file name in HDFS name space 
> (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
>   Each HDFS block file name in local file system contains both block pool id 
> and block id, for sample HDFS block file name 
> /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
>   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
> is 1073741825. The block  pool id is uniquely related to a HDFS name 
> node/name space,  and the block id is uniquely related to a HDFS file within 
> a HDFS name node/name space, so the combination of block pool id and a block 
> id is uniquely related a HDFS file name. 
> The shell command and C/Java API do not map the block pool id to name node, 
> so it’s user’s responsibility to talk to the correct name node in federation 
> environment that has multiple name nodes. The block pool id is used by name 
> node to check if the user is talking with the correct name node.
> The implementation is straightforward. The client request to get HDFS file 
> name reaches the new method String getFileName(String poolId, long blockId) 
> in FSNamesystem in name node through RPC,  and the new method does the 
> followings,
> (1)   Validate the block pool id.
> (2)   Create Block  based on the block id.
> (3)   Get BlockInfoContiguous from Block.
> (4)   Get BlockCollection from BlockInfoContiguous.
> (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-17 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14590879#comment-14590879
 ] 

Colin Patrick McCabe commented on HDFS-8246:


I agree that fsck solves the admin wants to know what file includes a block 
use-case.  What are the other use cases?

Also, if this API is available to normal users, how do we deal with this case:

1. snapshot S1 happens: block B is in file F
2. permissions of F get changed so that only superuser can access it
3. non-superuser asks for what files contain B

Should the non-superuser be able to know that F still contains B in step 3?  
Even though he doesn't have permission to access F?  It certainly seems like he 
should know that it contained B in snapshot S1.

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-15 Thread feng xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586244#comment-14586244
 ] 

feng xu commented on HDFS-8246:
---

If hdfsListDirectory() is enhanced to get HDFS file name and snapshots from 
block id, it will cover more use cases and benefit more people and 
applications. I think it’s better than fsck.

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-15 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587323#comment-14587323
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8246:
---

(1) and (5) are just empty statements

For (2) and (3), we probably do not want ls to show block IDs as Unix/Linux 
doesn't.  Also, it becomes a problem for files having a lot of blocks.

For (4), namenode is enforcing access control already.

It seems that we don't really have a use case for the proposed API.

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-15 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14586649#comment-14586649
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8246:
---

What are the more use cases?

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-15 Thread feng xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587313#comment-14587313
 ] 

feng xu commented on HDFS-8246:
---

With the enhancement to DistributedFileSystem::listStatus()/hdfsListDirectory(),
 (1) Any application calling the API will get the functionality without any 
change. 
 (2) The existing  commands like hdfs dfs –ls will get the functionality 
without any change. This is similar to fsck command.
 (3) The WebHDFS list directory option LISTSTATUS will get the functionality 
without any change. This is the case that fsck does not cover.
 (4) Any local file system extension can track the file access in HDFS name 
space in real time, and make security decision properly with user, process and 
time information.
 (5) ………..


 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-15 Thread feng xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14587364#comment-14587364
 ] 

feng xu commented on HDFS-8246:
---

For (2) and (3), it will show HDFS file name and snapshots based on block ID, 
could you explain why it becomes a problem for files having a lot of blocks?  
For (4), an user can bypass HDFS and access the blocks files directly from 
local file system, can namenode enforce access control in this case? Instead  a 
local file system extension  can be complementary to HDFS security. 

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-12 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584190#comment-14584190
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8246:
---

 At least this feature can help security software ...

 ... However performance critical applications need C API to get HDFS file 
 name ...

So do you have a performance critical security software application in your 
mind?

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-12 Thread feng xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14584133#comment-14584133
 ] 

feng xu commented on HDFS-8246:
---

It’s very nice to see HDFS-6663, which enhances fsck command to have –blockId 
option for admin purpose. However performance critical applications need C API 
to get HDFS file name from pool id and block id programmatically , and those 
applications usually have to cache the HDFS filesystem handle to avoid JVM  
instantiation again and again.

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-12 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14583902#comment-14583902
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8246:
---

We actually already have such feature; see HDFS-6663.

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-08 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14578122#comment-14578122
 ] 

Colin Patrick McCabe commented on HDFS-8246:


I think this API should be accessible to everyone, not just admins.  It can 
help eliminate race conditions.

Let's say you have some daemon that runs with superuser permissions that is 
trying to verify that some user U has permission to access /a/b.  Then you have 
an access pattern like this:

1. daemon checks /a to ensure U has permissions to access it
2. daemon checks /a/b

A malicious user might be able to insert a step 1a.

1a. malicious user moves /a to /a2 and moves /something-else to /a

Then the malicious user would get access to /something-else/b even if he 
doesn't have access to /something-else.

There are a lot of scenarios you can set up like this.  Any time you have a 
check of a path, followed by a use of that path, there can be a race condition 
if you are not refering to inodes by number.

bq. What should it return if a block belong to multiple files (current file and 
snapshotted files, or hard linked files when we support hard link in the 
future)?

Then I think it should return multiple paths, including some to snapshot 
directories.

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-05 Thread feng xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575343#comment-14575343
 ] 

feng xu commented on HDFS-8246:
---

At least this feature can help security software in local file system to trace 
IOs back to HDFS name space, understand the context better and take actions 
more accurate, which is very useful.

On Windows how about the Volume Management Control Code  
FSCTL_LOOKUP_STREAM_FROM_CLUSTER and the command “fsutil volume querycluster”? 
On Unix/Linux it’s file system specific, I think some file systems have fsdb 
tool, for example “xfs_db blockuse” on xfs?


 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-05 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14575381#comment-14575381
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8246:
---

Thanks for the info.  I think the new API could be useful for fixing problems.  
It should be added as an admin API instead of a user API.  The API should 
return the full path instead of the file name.

What should it return if a block belong to multiple files (current file and 
snapshotted files, or hard linked files when we support hard link in the 
future)?

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-06-04 Thread Tsz Wo Nicholas Sze (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573494#comment-14573494
 ] 

Tsz Wo Nicholas Sze commented on HDFS-8246:
---

What is the use case of this?  Block pool id and block id are HDFS internal so 
that they should not be exposed to users.

BTW, is there a way to get the file name(s) from a block id in 
unix/linux/windows?  It seems not a common supported operation.

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-05-14 Thread feng xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14544325#comment-14544325
 ] 

feng xu commented on HDFS-8246:
---

I don’t see in-memory structure that maps a block to snapshots directly, and it 
may take multiple round trips to get snapshots for a given block, so I'd not 
return snapshots for a performance critical application that does not need 
them. How about map /.reserved/.blockIdToFiles/poolid_blockid to current hdfs 
file, and /.reserved/.blockIdToFiles/poolid_blockid/all to current hdfs file 
plus snapshots?

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-05-08 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14535502#comment-14535502
 ] 

Colin Patrick McCabe commented on HDFS-8246:


From the C client, if you wanted to know what files block ID 123 was in, you 
could do {{hdfsListDirectory(fs, path=/.reserved/.blockIdToFiles/123, 
...)}}.  I think one of the advantages of having a path in .reserved instead 
of a new API is everything just works for the C client, C++ client, webhdfs, 
etc. etc.

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-05-06 Thread feng xu (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14531663#comment-14531663
 ] 

feng xu commented on HDFS-8246:
---

Thank you Colin, could you elaborate on your review comment? For example how 
could I use the feature with HDFS C API and shell command?

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
  Labels: BB2015-05-TBR
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-05-05 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14529256#comment-14529256
 ] 

Colin Patrick McCabe commented on HDFS-8246:


How about having {{/.reserved/.blockIdToFiles/$ID}} map to a directory 
containing the hdfs files which have the given block ID?  I think this would be 
a lot better than having a whole other set of APIs.  Remember also that 
multiple snapshotted files can contain the same block ID.

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)   Get BlockInfoContiguous from Block.
 (4)   Get BlockCollection from BlockInfoContiguous.
 (5)   Get file name from BlockCollection.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HDFS-8246) Get HDFS file name based on block pool id and block id

2015-04-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-8246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14518034#comment-14518034
 ] 

Hadoop QA commented on HDFS-8246:
-

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch |  14m 26s | Pre-patch trunk compilation is 
healthy. |
| {color:green}+1{color} | @author |   0m  0s | The patch does not contain any 
@author tags. |
| {color:green}+1{color} | tests included |   0m  0s | The patch appears to 
include 3 new or modified test files. |
| {color:green}+1{color} | whitespace |   0m  0s | The patch has no lines that 
end in whitespace. |
| {color:green}+1{color} | javac |   7m 29s | There were no new javac warning 
messages. |
| {color:green}+1{color} | javadoc |   9m 35s | There were no new javadoc 
warning messages. |
| {color:green}+1{color} | release audit |   0m 22s | The applied patch does 
not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle |   7m 50s | The applied patch generated  
10  additional checkstyle issues. |
| {color:green}+1{color} | install |   1m 33s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse |   0m 33s | The patch built with 
eclipse:eclipse. |
| {color:green}+1{color} | findbugs |   4m 44s | The patch does not introduce 
any new Findbugs (version 2.0.3) warnings. |
| {color:green}+1{color} | common tests |  22m 54s | Tests passed in 
hadoop-common. |
| {color:green}+1{color} | hdfs tests | 165m  8s | Tests passed in hadoop-hdfs. 
|
| | | 234m 40s | |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | 
http://issues.apache.org/jira/secure/attachment/12728038/HDFS-8246.0.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 99fe03e |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10431/artifact/patchprocess/checkstyle-result-diff.txt
 |
| hadoop-common test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10431/artifact/patchprocess/testrun_hadoop-common.txt
 |
| hadoop-hdfs test log | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10431/artifact/patchprocess/testrun_hadoop-hdfs.txt
 |
| Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10431/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/10431/console |


This message was automatically generated.

 Get HDFS file name based on block pool id and block id
 --

 Key: HDFS-8246
 URL: https://issues.apache.org/jira/browse/HDFS-8246
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: HDFS, hdfs-client, namenode
Reporter: feng xu
Assignee: feng xu
 Attachments: HDFS-8246.0.patch


 This feature provides HDFS shell command and C/Java API to retrieve HDFS file 
 name based on block pool id and block id.
 1. The Java API in class DistributedFileSystem
 public String getFileName(String poolId, long blockId) throws IOException
 2. The C API in hdfs.c
 char* hdfsGetFileName(hdfsFS fs, const char* poolId, int64_t blockId)
 3. The HDFS shell command 
  hdfs dfs [generic options] -fn poolId blockId
 This feature is useful if you have HDFS block file name in local file system 
 and want to  find out the related HDFS file name in HDFS name space 
 (http://stackoverflow.com/questions/10881449/how-to-find-file-from-blockname-in-hdfs-hadoop).
   Each HDFS block file name in local file system contains both block pool id 
 and block id, for sample HDFS block file name 
 /hdfs/1/hadoop/hdfs/data/current/BP-97622798-10.3.11.84-1428081035160/current/finalized/subdir0/subdir0/blk_1073741825,
   the block pool id is BP-97622798-10.3.11.84-1428081035160 and the block id 
 is 1073741825. The block  pool id is uniquely related to a HDFS name 
 node/name space,  and the block id is uniquely related to a HDFS file within 
 a HDFS name node/name space, so the combination of block pool id and a block 
 id is uniquely related a HDFS file name. 
 The shell command and C/Java API do not map the block pool id to name node, 
 so it’s user’s responsibility to talk to the correct name node in federation 
 environment that has multiple name nodes. The block pool id is used by name 
 node to check if the user is talking with the correct name node.
 The implementation is straightforward. The client request to get HDFS file 
 name reaches the new method String getFileName(String poolId, long blockId) 
 in FSNamesystem in name node through RPC,  and the new method does the 
 followings,
 (1)   Validate the block pool id.
 (2)   Create Block  based on the block id.
 (3)