Hi, I am new to Hive and Hadoop in general. I have a table in Oracle with millions of rows, and I'd like to export it into HDFS so that I can run some Hive queries against it. My first question: is it recommended to export the entire table as a single file (possibly 5 GB), or as several smaller files (say, 10 files of 500 MB each)? Also, does it matter if I put the files under different subdirectories before I load the data into Hive, or does everything have to be under the same folder?
Thanks,
T

P.S. I am sorry if this post was submitted twice.
