Re: Files does not exist error: concurrency control on hive queries...

He Yongqiang Tue, 08 Sep 2009 22:31:51 -0700

Hi Eva,
    Can you open a new JIRA for this? Let's discuss and resolve the issue
there.
My guess is that the partition metadata is added before the data is
available.
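
To make the suspected failure mode concrete, here is a minimal sketch of a reader whose file list goes stale while a writer replaces the files in a partition. It is only an illustration under assumptions: it uses a local temporary directory as a stand-in for an HDFS partition directory, and the names list_partition, overwrite_partition, and simulate_race are hypothetical helpers, not Hive or Hadoop APIs. It does not confirm the root cause.

```python
import os
import tempfile


def list_partition(part_dir):
    """Reader planning step: snapshot the file paths currently in the partition."""
    return sorted(os.path.join(part_dir, name) for name in os.listdir(part_dir))


def overwrite_partition(part_dir, merged_name="merged-00000"):
    """Writer: replace all small files in the partition with one merged file."""
    chunks = []
    for name in sorted(os.listdir(part_dir)):
        path = os.path.join(part_dir, name)
        with open(path) as f:
            chunks.append(f.read())
        os.remove(path)  # the old file disappears from under any reader
    with open(os.path.join(part_dir, merged_name), "w") as f:
        f.write("".join(chunks))


def simulate_race():
    """Return the names of planned files that vanished before they were read."""
    with tempfile.TemporaryDirectory() as part_dir:
        # Hypothetical small files standing in for reducer outputs.
        for i in range(3):
            with open(os.path.join(part_dir, f"part-r-{i:05d}"), "w") as f:
                f.write(f"row{i}\n")

        planned = list_partition(part_dir)  # the select plans its inputs
        overwrite_partition(part_dir)       # the insert overwrite swaps files

        missing = []
        for path in planned:                # the select's tasks start reading
            try:
                with open(path) as f:
                    f.read()
            except FileNotFoundError:
                missing.append(os.path.basename(path))
        return missing
```

In this sketch every path the reader planned is gone by the time it is opened, which matches the shape of the reported exception. A real fix would need the swap to be coordinated with readers inside Hive.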


Thanks
Yongqiang
On 09-9-9 1:18 PM, "Eva Tse" <[email protected]> wrote:

> 
> We are planning to enable ad-hoc querying on our Hive warehouse. While
> testing concurrent queries, we found the following issue:
> 
> Query 1 – runs ‘insert overwrite table yyy .... partition (dateint = xxx)
> select ...  from yyy where dateint = xxx’. This is done to merge small files
> within a partition of table yyy.
> Query 2 – runs a select on the same table, joined with another table.
> 
> We found that query 2 fails with the following exception in multiple
> reducers.
> java.io.FileNotFoundException: File does not exist:
> hdfs://ip-10-251-98-80.ec2.internal:9000/user/hive/dataeng/warehouse/nccp_session_facts/dateint=20090908/hour=9/sessionsFacts_P20090909T021823L20090908T09-r-00006
>  at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:457)
>  at org.apache.hadoop.fs.FileSystem.getLength(FileSystem.java:671)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1417)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1412)
>  at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
>  at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:63)
>  at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:236)
>  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:336)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>  at org.apache.hadoop.mapred.Child.main(Child.java:170)
> 
> Is this expected? If so, is there a JIRA for it, or is it planned to be
> addressed? We are trying to think of a workaround, but haven’t come up with
> a good one, as the swapping of files would ideally be handled inside Hive.
> 
> Please let us know your feedback.
> 
> Thanks,
> Eva.
