You don't need a VPC with a hardware VPN if all you need is an easy way to
connect to the cluster from your local machine - a SOCKS proxy over SSH will
work just fine.

My recommendation would be to use standard EC2 and start a proxy on
your local machine. See the "Run a proxy" section in the guide:
http://whirr.apache.org/docs/0.8.2/quick-start-guide.html
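The proxy the guide describes is just an SSH dynamic port forward. A minimal
sketch of the equivalent command (the key path and namenode hostname below are
placeholders, not real values - Whirr itself generates a ready-made
hadoop-proxy.sh under ~/.whirr/<cluster-name>/ that does the same thing):

```shell
# Placeholders: substitute your own key and the cluster's namenode address.
KEY=~/.ssh/id_rsa_whirr
NAMENODE=ec2-203-0-113-10.compute-1.amazonaws.com

# -D 6666 opens a local SOCKS proxy on port 6666; -N skips a remote shell.
PROXY_CMD="ssh -i $KEY -D 6666 -N $NAMENODE"
echo "$PROXY_CMD"
```

With the proxy up, you point the local Hadoop client at it by setting
hadoop.socks.server to localhost:6666 and
hadoop.rpc.socket.factory.class.default to
org.apache.hadoop.net.SocksSocketFactory in your client configuration.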

You also need to make sure you are running the same version of Hadoop on
your local machine as on the cluster.
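For the private-IP symptom described further down in the thread, one
client-side workaround (assuming Hadoop 2.x / CDH4, where this property
exists) is to tell the HDFS client to connect to datanodes by hostname
rather than by the IP the namenode reports, so name resolution on your side
can map to a reachable address:

```
<!-- hdfs-site.xml on the local client machine -->
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
</property>
```

This only helps if the datanode hostnames resolve to public addresses from
your machine; otherwise the SOCKS proxy approach above is the simpler path.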

Are you going to transfer large amounts of data? What's the end goal?

--
Andrei Savu

On Mon, Oct 28, 2013 at 10:36 PM, Samarth Gupta
<[email protected]>wrote:

> the only reason I had to use VPC was the same as
> http://stackoverflow.com/questions/14544055/copyfromlocalfile-doesnt-work-in-cdh4
>
> While using the HDFS API to transfer data, the namenode returns the private
> IP of the datanode, so any method that deals with the datanode cannot
> communicate with it since the two machines are not on the same network. If
> I can get transfer to HDFS working, I can do away with VPC.
>
> Thanks..
>
>
> On Tue, Oct 29, 2013 at 10:44 AM, Andrei Savu <[email protected]> wrote:
>
>> Hi Samarth -
>>
>>
>> AFAIK we've never actually tested Whirr with VPC. I guess you can make it
>> work eventually, but you will probably need to make many changes.
>>
>> Can you tell me a bit more about your use case? Why VPC and not standard
>> EC2?
>>
>> --
>> Andrei Savu - https://www.linkedin.com/in/sandrei
>>
>> On Mon, Oct 28, 2013 at 10:10 PM, Samarth Gupta <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> I launched a cluster using Whirr and tried to use the HDFS APIs to
>>> transfer data to the started cluster. However, I got the following error:
>>>
>>> "There are 1 datanode(s) running and 1 node(s) are excluded from the
>>> operation"
>>>
>>> which is the same issue as
>>> http://stackoverflow.com/questions/14544055/copyfromlocalfile-doesnt-work-in-cdh4
>>>
>>>
>>> On setting up a VPC and VPN and launching the cluster through Whirr, I
>>> faced the problem of "No default VPC assigned", for which I made a change
>>> to the Whirr source code and passed the VPC ID while launching clusters.
>>>
>>> I was able to start the cluster that way, but ran into trouble:
>>>
>>> 1. The configuration scripts for installing Hadoop were unsuccessful.
>>> 2. The security groups API was throwing an error, since the VPC already
>>> had an assigned security group and Whirr was trying to create one more.
>>>
>>>
>>> I wanted to check whether anyone is using Whirr with AWS VPC and could
>>> help me with the same...
>>>
>>> Thanks,
>>> Samarth
>>>
>>
>>
>
