Why do you need to use Spark or Flume for this?

You can just use curl and hdfs:

  curl ftp://blah | hdfs dfs -put - /blah

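If the FTP server needs credentials or the target directory doesn't exist yet,
a minimal sketch (hypothetical host, paths, and credentials for illustration)
might look like:

  # hypothetical server, credentials, and paths
  hdfs dfs -mkdir -p /data/incoming
  curl -sS "ftp://user:password@ftp.example.com/exports/big-file.csv" \
    | hdfs dfs -put - /data/incoming/big-file.csv

curl streams the bytes and the pipe pushes them straight into HDFS, so no
single process ever has to hold the whole file in memory.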

On Fri, Aug 14, 2015 at 1:15 PM, Varadhan, Jawahar <
varad...@yahoo.com.invalid> wrote:

> What is the best way to bring such a huge file from an FTP server into
> Hadoop to persist in HDFS? Since a single JVM process might run out of
> memory, I was wondering if I can use Spark or Flume to do this. Any help on
> this matter is appreciated.
>
> I would prefer an application/process running inside Hadoop that does this
> transfer.
>
> Thanks.
>



-- 
Marcelo
