Hi all,

What actually happens when a huge file (e.g. some tens of TB) is loaded with "LOAD DATA (LOCAL) INPATH ... INTO TABLE"? Does Hive need to scan the entire file before processing even a very simple query (e.g. a SELECT)? If so, are there any ways to reduce the number of disk accesses? Is partitioning one way to do it?

Many thanks,
Manhee
