Thanks. But is it going to create one big file in HDFS? I am currently considering writing my own Cascading job for this.
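(Assuming a Sqoop-based import with its defaults, it should not produce one big file: each map task writes its own part file (part-m-00000, part-m-00001, ...), so the -m flag controls how many files land in HDFS. A minimal sketch, with hypothetical host, credentials, table, and split column:

    # Hypothetical connection details; adjust the host, service name,
    # user, table, and split column for your Oracle instance.
    # -m 10 runs ten parallel map tasks, yielding ten part files of
    # roughly equal size instead of one 5 GB file.
    sqoop import \
      --connect jdbc:oracle:thin:@//dbhost:1521/ORCL \
      --username scott -P \
      --table MYTABLE \
      --split-by ID \
      -m 10 \
      --hive-import

With --hive-import, Sqoop also creates the matching Hive table definition, as Sarah notes below.)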
thx, T

On Wed, Jul 7, 2010 at 6:06 PM, Sarah Sproehnle <[email protected]> wrote:
> Hi Todd,
>
> Are you planning to use Sqoop to do this import? If not, you should.
> :) It will do a parallel import, using MapReduce, to load the table
> into Hadoop. With the --hive-import option, it will also create the
> Hive table definition.
>
> Cheers,
> Sarah
>
> On Wed, Jul 7, 2010 at 5:51 PM, Todd Lee <[email protected]> wrote:
> > Hi,
> > I am new to Hive and Hadoop in general. I have a table in Oracle that
> > has millions of rows and I'd like to export it into HDFS so that I can
> > run some Hive queries. My first question is: is it recommended to
> > export the entire table as a single file (possibly 5 GB), or as several
> > smaller files (10 files of 500 MB each)? Also, does it matter if I put
> > the files under different sub-directories before I do the data load in
> > Hive, or does everything have to be under the same folder?
> > Thanks,
> > T
> > p.s. I am sorry if this post is submitted twice.
>
> --
> Sarah Sproehnle
> Educational Services
> Cloudera, Inc
> http://www.cloudera.com/training
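(On the layout half of the original question: Hive treats a table as a directory of files and reads every file sitting directly under it, so ten 500 MB files and one 5 GB file behave the same at query time. By default, though, it does not recurse into nested sub-directories, so it is safest to keep all the files in the table's one folder. A sketch with hypothetical table, column, and path names:

    # Hypothetical external table over an HDFS directory that already
    # holds the exported files; Hive reads all files directly under
    # LOCATION, regardless of how many there are.
    hive -e "CREATE EXTERNAL TABLE my_table (id INT, name STRING)
             ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
             LOCATION '/user/todd/my_table';"

For MapReduce, a handful of files around the HDFS block size is generally preferable to one huge file or to very many tiny ones.)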
