[ 
https://issues.apache.org/jira/browse/HIVE-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12789088#action_12789088
 ] 

Paul Yang commented on HIVE-976:
--------------------------------

I'm looking at sample7.q.out, which contains the lines quoted in the 
description. As I understand it, the test should check lines of the form 
'file:/[directory]/[file]' and verify that the file part (the last token) 
matches the one in the reference test files. But aren't there lines in the 
test output beginning with 'file:/' that should be ignored completely?

Anyway, assuming that we can look at just the last token of lines that begin 
with 'file:/', there doesn't seem to be a way to handle this case with diff 
alone. The man page for diff does not describe any option that would help. I 
have two ideas:

1. Instead of using diff directly, write a script that uses diff plus some 
additional logic to detect this condition. This will probably be a little 
slower.

2. In mapredWork, add a function that displays only the last token of each 
path. It's sort of an ugly idea, but it would work with diff in its current 
form.
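
To make idea 1 a bit more concrete, here is a rough sketch in Python of what 
such a script might do. It is only an illustration, not what the test harness 
does today: the function names (normalize, diff_outputs) are made up, and it 
assumes that a 'file:/' line holds nothing but the path, so the last token is 
whatever follows the final '/'. It reduces every 'file:/' line to its last 
token and then diffs the normalized lines.

```python
import difflib


def normalize(line):
    """Reduce a 'file:/...' line to its last path token; leave other lines alone.

    Assumption: the path is the entire line (no trailing annotations).
    """
    stripped = line.rstrip("\n")
    if stripped.startswith("file:/"):
        # Keep only the last token, e.g. 'srcbucket0.txt' or 'srcbucket'.
        last_token = stripped.rstrip("/").rsplit("/", 1)[-1]
        return "file:/.../" + last_token
    return stripped


def diff_outputs(actual_lines, expected_lines):
    """Return unified-diff lines for the normalized outputs (empty if they match)."""
    actual = [normalize(line) for line in actual_lines]
    expected = [normalize(line) for line in expected_lines]
    return list(difflib.unified_diff(expected, actual, lineterm=""))
```

With this, two outputs that differ only in the directory prefix compare as 
equal, while a path ending in 'srcbucket' no longer matches one ending in 
'srcbucket0.txt'. Lines beginning with 'file:/' that should be ignored 
completely would still need separate handling.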

> test outputs should compare the file/directory for sampling 
> ------------------------------------------------------------
>
>                 Key: HIVE-976
>                 URL: https://issues.apache.org/jira/browse/HIVE-976
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Testing Infrastructure
>            Reporter: Namit Jain
>            Assignee: Paul Yang
>             Fix For: 0.5.0
>
>
> Currently, all lines starting with file: are ignored.
> It means that
> file:/Users/heyongqiang/Documents/workspace/Hive-Test/build/ql/test/data/warehouse/srcbucket/srcbucket0.txt
>  [s]
> and
> file:/Users/heyongqiang/Documents/workspace/Hive-Test/build/ql/test/data/warehouse/srcbucket
>  [s]
> are same - that is not good because it will hide some of the optimizations of 
> sampling.
> This should be changed to compare the last token.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.