Re: loading data from HDFS or local file to

Zheng Shao Wed, 22 Jul 2009 13:59:23 -0700

If the huge file is already on HDFS (LOAD DATA without LOCAL), Hive
will just *move* the file into the table's directory. (NOTE: that means
the user won't be able to see the file in its original directory
afterwards.)

If you don't want that to happen, you might want to use
CREATE EXTERNAL TABLE .... LOCATION '/user/myname/myfiledir';
so that the table points at the directory where the file already is.
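
A minimal sketch of such a statement, assuming the files under that
directory are plain text with one record per line. The table name
my_logs and its single STRING column are made up for illustration:

```sql
-- Hypothetical table: one STRING column holding each raw line.
-- LOCATION names the HDFS directory that already contains the file(s).
CREATE EXTERNAL TABLE my_logs (line STRING)
LOCATION '/user/myname/myfiledir';
```

With an external table the files stay where they are, and dropping the
table removes only the table metadata, not the files.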

If the huge file is on the local file system, you will have to use
LOAD DATA LOCAL, and Hive will *copy* the file into the table's
directory, leaving the local original in place.
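
To make the two load forms concrete, here is a sketch of each. The
table name my_table and both file paths are made up for illustration,
and the table is assumed to exist already:

```sql
-- File already on HDFS: no LOCAL keyword, so the file is moved
-- out of its original directory and into the table.
LOAD DATA INPATH '/user/myname/myfiledir/bigfile.txt'
INTO TABLE my_table;

-- File on the local file system: LOCAL keyword, so the file is
-- copied and the local original is left untouched.
LOAD DATA LOCAL INPATH '/home/myname/bigfile.txt'
INTO TABLE my_table;
```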

Zheng

On Wed, Jul 22, 2009 at 12:25 AM, Manhee Jo wrote:
> Hi all,
>
> What really happens when a huge file (e.g. some tens of TB) is loaded with
> "LOAD DATA (LOCAL) INPATH ... INTO TABLE"? Does Hive need to scan the entire
> file before processing anything, even something very simple (e.g. a select)?
> If so, are there any ways to decrease the number of disk accesses? Is
> partitioning a way to do it?
>
> Many Thanks,
> Manhee
>

-- 
Yours,
Zheng
