[jira] Commented: (HIVE-524) ExecDriver adds 0 byte file to input paths

Namit Jain (JIRA) Fri, 29 May 2009 12:31:08 -0700

    [ https://issues.apache.org/jira/browse/HIVE-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714561#action_12714561 ]


Namit Jain commented on HIVE-524:
---------------------------------

The problem is that, without the 0 byte file, downstream map-reduce jobs can fail.

For example, consider the query:


select .... from
(query 1  union all query 2);

It will result in 3 map-reduce jobs: query 1, query 2, and the outer query, 
which depends on query 1 and query 2.

If query 2 has only empty partitions and we do not allow the dummy file, the 
outer query will fail because the output for query 2 has not been created.

That is why we create a dummy file.
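
To illustrate the idea, here is a minimal sketch in plain Java. It uses 
java.nio.file on the local file system rather than the Hadoop FileSystem API, 
and every name in it (PlaceholderInputs, hasNoData, resolveInputs, the 
"emptyFile" name) is made up for this example. It is not the actual ExecDriver 
code, only an approximation of the behaviour described above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the behaviour discussed above; not Hive code.
class PlaceholderInputs {

    // True when dir is missing or holds no regular file with at least one byte.
    static boolean hasNoData(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return true;
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry) && Files.size(entry) > 0) {
                    return false;
                }
            }
        }
        return true;
    }

    // Builds the job's input list. Each partition without data is replaced by
    // a scratch directory holding one 0 byte file, so the job still has an
    // input for it and still produces an output for downstream jobs.
    // Assumes scratchDir is new or empty.
    static List<Path> resolveInputs(List<Path> partitionDirs, Path scratchDir)
            throws IOException {
        List<Path> inputs = new ArrayList<>();
        int emptyCount = 0;
        for (Path dir : partitionDirs) {
            if (hasNoData(dir)) {
                Path holder = Files.createDirectories(
                        scratchDir.resolve("empty-" + emptyCount++));
                Files.createFile(holder.resolve("emptyFile"));
                inputs.add(holder);
            } else {
                inputs.add(dir);
            }
        }
        return inputs;
    }

    // Small demo: one partition with data, one empty, one missing.
    static List<Path> demo() {
        try {
            Path root = Files.createTempDirectory("hive524");
            Path full = Files.createDirectories(root.resolve("p1"));
            Files.write(full.resolve("data"), new byte[] {1, 2, 3});
            Path empty = Files.createDirectories(root.resolve("p2"));
            Path missing = root.resolve("p3");
            return resolveInputs(Arrays.asList(full, empty, missing),
                    root.resolve("scratch"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (Path input : demo()) {
            System.out.println(input);
        }
    }
}
```

In this sketch the placeholder is always 0 bytes, which is exactly the 
hard-coded behaviour that trips up an input format expecting a specific file 
layout.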

The correct fix would be to create a file based on the table descriptor instead 
of some hard-coded value. Then, the custom input format can be attached to the 
table descriptor and will work fine.
I am already in the process of implementing that as part of map-join, and will 
merge it in soon.
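
As a rough sketch of what "based on the table descriptor" could look like, 
the example below lets a descriptor carry its own way of writing a file with 
zero records. All types here (TableDesc, EmptyFileWriter, the "DEMO1" header 
format) are hypothetical and written in plain Java for illustration. They are 
not Hive's classes and not the implementation referred to above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical types for illustration only; these are not Hive's classes.
class DescriptorPlaceholders {

    // Knows how to write a valid file with zero records in one storage format.
    interface EmptyFileWriter {
        void writeEmpty(Path file) throws IOException;
    }

    // Stand-in for a table descriptor that carries its storage format.
    static final class TableDesc {
        final String name;
        final EmptyFileWriter emptyWriter;

        TableDesc(String name, EmptyFileWriter emptyWriter) {
            this.name = name;
            this.emptyWriter = emptyWriter;
        }
    }

    // Plain text: a 0 byte file is already a valid empty file.
    static final EmptyFileWriter TEXT = file -> Files.write(file, new byte[0]);

    // Made-up container format whose reader expects a header even when the
    // file has no records.
    static final byte[] MAGIC = "DEMO1".getBytes(StandardCharsets.US_ASCII);
    static final EmptyFileWriter HEADERED = file -> Files.write(file, MAGIC);

    // Writes the placeholder in the table's own format instead of a
    // hard-coded 0 byte file.
    static Path createPlaceholder(TableDesc table, Path dir) {
        try {
            Files.createDirectories(dir);
            Path file = dir.resolve("emptyFile");
            table.emptyWriter.writeEmpty(file);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a placeholder for the given table under a fresh temp directory.
    static Path demo(TableDesc table) {
        try {
            Path scratch = Files.createTempDirectory("hive524-desc");
            return createPlaceholder(table, scratch.resolve(table.name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(new TableDesc("t_text", TEXT)));
        System.out.println(demo(new TableDesc("t_headered", HEADERED)));
    }
}
```

The point of the design is that the code creating the placeholder no longer 
needs to know the file format; a table using a custom input format would 
supply a matching writer through its descriptor.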

> ExecDriver adds 0 byte file to input paths
> ------------------------------------------
>
>                 Key: HIVE-524
>                 URL: https://issues.apache.org/jira/browse/HIVE-524
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.4.0
>            Reporter: Johan Oskarsson
>             Fix For: 0.4.0
>
>
> In the addInputPaths method in ExecDriver:
> If the input path of a partition cannot be found or contains no files with 
> data in them, a 0 byte file is created and added to the job instead. This 
> causes our custom InputFormat to throw an exception since it is asked to 
> process an unknown file format (not an lzo file).

