Hi Guys,

Depending on the *type* of ingestion you are trying to do into HDFS, the combination of Apache OODT (http://oodt.apache.org/) and Apache Tika (http://tika.apache.org/) may do the trick.

Cheers,
Chris

-----Original Message-----
From: Bing Jiang <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Monday, November 4, 2013 2:34 AM
To: "[email protected]" <[email protected]>
Subject: Re: best solution for data ingestion

>Apache Pig is also a solution for data ingestion; it offers more flexibility
>in functionality and more efficiency in development.
>
>Regards.
>Bing
>
>2013/11/2 Marcel Mitsuto F. S. <[email protected]>
>
>I've done some testing with Flume, but ended up using syslog-ng: it is more
>flexible, more reliable, and has a lower footprint.
>
>On Fri, Nov 1, 2013 at 3:57 PM, Mirko Kämpf <[email protected]> wrote:
>
>Have a look at Sqoop for data from an RDBMS, or at Flume if data is flowing
>in and multiple sources have to be used.
>Best wishes
>Mirko
>
>2013/11/1 Siddharth Tiwari <[email protected]>
>
>Hi team,
>
>Seeking your advice on what could be the best way to ingest a lot of data
>into Hadoop. Also, what are your views on FUSE?
>
>*------------------------*
>Cheers !!!
>SiddharthTiwari
>Have a refreshing day !!!
>"Every duty is holy, and devotion to duty is the highest form of worship
>of God."
>
>"Maybe other people will try to limit me but I don't limit myself"
>
>--
>Bing Jiang
>Tel:(86)134-2619-1361
>weibo: http://weibo.com/jiangbinglover
>BLOG: www.binospace.com <http://www.binospace.com>
>BLOG: http://blog.sina.com.cn/jiangbinglover
>
>Focus on distributed computing, HDFS/HBase
