There are two ways to do this:

A. If the remote node has access to all the HDFS machines (NN + all DNs):

Simply do a "hadoop dfs -put" to push the data in.
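
For example (the paths below are just placeholders for your own, and
this assumes fs.default.name in the client's configs already points at
your NN):

    # Push a local file into HDFS from the remote node.
    hadoop dfs -put /local/path/bigfile.bin /user/tariq/bigfile.bin

    # Or spell out the NN explicitly if the client configs are bare
    # (8020 is only the common default port; yours may differ).
    hadoop dfs -put /local/path/bigfile.bin hdfs://namenode:8020/user/tariq/bigfile.bin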

B. If the remote node has no access to HDFS, set up a bastion box with
Hoop and write to HDFS through it. Hoop provides a REST API for this.

Some write examples can be found here:
http://cloudera.github.com/hoop/docs/latest/HttpRestApi.html (see the
"File System Operations" section, and the Write example in particular).

The box you set up must be reachable from the remote node, and the box
itself should be able to access your HDFS in the regular fashion (NN +
all DNs), so that it can relay your writes.

Hoop also has security support, so you can use it against secured
clusters and block writes from unauthenticated users. The same link
carries instructions for this as well.
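
If the cluster is Kerberized, the usual client-side pattern would be to
kinit first and let curl negotiate SPNEGO. Again just a sketch: the
principal and realm below are made up, and the specifics are in the doc
linked above:

    # Get a ticket, then let curl handle the SPNEGO handshake.
    kinit tariq@EXAMPLE.COM
    curl --negotiate -u : -X POST \
      "http://hoop-host:14000/user/tariq/bigfile.bin?op=create" \
      --header "Content-Type: application/octet-stream" \
      --data-binary @/local/path/bigfile.bin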

On Wed, Jan 25, 2012 at 12:31 AM, Mohammad Tariq <donta...@gmail.com> wrote:
> Hey Ron,
>
>   Thanks for the response. No, the remote machine is not a part of our
> Hadoop ecosystem.
>
> Regards,
>     Mohammad Tariq
>
>
>
> On Tue, Jan 24, 2012 at 10:23 PM, Ronald Petty <ronald.pe...@gmail.com> wrote:
>> Mohammed,
>>
>> Is this remote machine part of the HDFS system?
>>
>> Ron
>>
>>
>> On Tue, Jan 24, 2012 at 7:30 AM, Mohammad Tariq <donta...@gmail.com> wrote:
>>>
>>> Hello list,
>>>
>>>    I have a situation wherein I have to move large binary files (~TB)
>>> from remote machines into HDFS. While looking for a way to do this I
>>> came across Hoop. Could anyone tell me whether it fits my use case?
>>> If so, where can I find proper help to learn about Hoop in detail, or
>>> some place with demo apps or code that performs similar tasks? I am
>>> going through the documentation at
>>> http://cloudera.github.com/hoop/docs/latest/index.html, but it mostly
>>> talks about configuration. I need some help. Many thanks.
>>>
>>> Regards,
>>>     Mohammad Tariq
>>
>>



-- 
Harsh J
Customer Ops. Engineer, Cloudera
