When you say 'scan blocks on that datanode', what do you mean by
'scan'? If you merely want a list of blocks per DN at a given time,
there are ways to get that. However, if you then want to perform
operations on each of those blocks remotely, that's not possible to
do.

In any case, you can run whatever program you wish, HDFS-agnostically,
on any DN: point it at the DN's dfs.datanode.data.dir directories
(take them from its config) and visit all files whose names match
^blk_<ID number>$.
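
If it helps, here is a rough sketch of such a standalone scanner in
plain Java. It does not use any HDFS API; it just walks whatever
directories you pass in. The on-disk layout (subdirectory nesting, the
blk_<ID>_<genstamp>.meta checksum files next to the block files)
varies across versions, so treat the recursive walk and the file-name
pattern as assumptions to verify against your own deployment:

import java.io.File;
import java.util.regex.Pattern;

public class BlockLister {
    // Block data files are named blk_<ID>; the matching
    // blk_<ID>_<genstamp>.meta checksum files are skipped by this
    // pattern. The "-?" allows for negative IDs, which some versions
    // generate.
    private static final Pattern BLOCK_FILE =
        Pattern.compile("^blk_-?\\d+$");

    public static void main(String[] args) {
        // Pass each dfs.datanode.data.dir entry as an argument, e.g.
        //   java BlockLister /data/1/dfs/dn /data/2/dfs/dn
        // (the paths above are placeholders; use the values from your
        // DN's config).
        for (String dir : args) {
            scan(new File(dir));
        }
    }

    private static void scan(File dir) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return;
        }
        for (File f : entries) {
            if (f.isDirectory()) {
                scan(f);
            } else if (BLOCK_FILE.matcher(f.getName()).matches()) {
                System.out.println(f.getAbsolutePath());
            }
        }
    }
}

Whatever per-block work you intend to do would go where the println
is, but remember such a program only ever sees the local replicas held
by that one DN.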

We can help you better if you tell us what exactly you are attempting
to do, for which you need a list of all the blocks per DN.

On Fri, Jul 6, 2012 at 7:58 PM, Yaron Gonen <yaron.go...@gmail.com> wrote:
> Hi,
> I'm trying to write an agent that will run on a datanode and will scan
> blocks on that datanode.
> The logical thing to do is to look in the DataBlockScanner code, which lists
> all the blocks on a node, which is what I did.
> The problem is that the DataBlockScanner object is instantiated during the
> start-up of a DataNode, so a lot of the objects it needs (like FSDataSet) are
> already instantiated by then.
> Then I tried DataNode.getDataNode(), but it returned null (needless to
> say, the node is up and running).
> I'd be grateful if you can refer me to the right object or to a guide.
>
> I'm new to hdfs, so I'm sorry if it's a trivial question.
>
> Thanks,
> Yaron



-- 
Harsh J
