Why do you need to use Spark or Flume for this? You can just use curl
and hdfs:

  curl ftp://blah | hdfs dfs -put - /blah
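For instance, something along these lines (hostname, credentials and
paths below are made up, adjust for your setup):

  # Stream the file from the FTP server straight into HDFS. "-put -"
  # tells hdfs to read from stdin, so the file is never held in memory
  # or staged on local disk, no matter how large it is.
  curl -s ftp://user:pass@ftp.example.com/exports/big-file.dat \
    | hdfs dfs -put - /data/incoming/big-file.dat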
On Fri, Aug 14, 2015 at 1:15 PM, Varadhan, Jawahar
<varad...@yahoo.com.invalid> wrote:
> What is the best way to bring such a huge file from an FTP server into
> Hadoop to persist in HDFS? Since a single JVM process might run out of
> memory, I was wondering if I can use Spark or Flume to do this. Any
> help on this matter is appreciated.
>
> I prefer an application/process running inside Hadoop which does this
> transfer.
>
> Thanks.

-- 
Marcelo