[ 
https://issues.apache.org/jira/browse/HADOOP-6513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805338#action_12805338
 ] 

Todd Lipcon commented on HADOOP-6513:
-------------------------------------

Hi Robert. Would you mind posting your test program as an attachment, rather 
than in the JIRA description? It's difficult to read code in this context. The 
best option would be a failing unit test case that shows the problem as you 
understand it.

> SequenceFile.Sorter  design issue and class-check bug
> -----------------------------------------------------
>
>                 Key: HADOOP-6513
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6513
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1
>         Environment: hadoop 20.1, java 1.6.0_17,fedora
>            Reporter: robert Cook
>
> SequenceFile.Writer takes key/value classes as creation arguments and checks 
> for validity on every append.
> Reader does not take class arguments on creation because they are derived 
> from the input file.
> Sorter also takes key/value classes as creation arguments, which seems 
> pointless: they should be derived from the input file.
> In any case, SortPass does not compare the Sorter key/value classes with the 
> input file classes.
> No error is given for the following:
>         private static void writeTest4(FileSystem fs, int count, int seed, Path file,
>                         SequenceFile.CompressionType compressionType,
>                         CompressionCodec codec, Configuration conf)
>           throws IOException {
>           fs.delete(file, true);
>           LOG.info("creating " + count + " records with " + compressionType +
>                    " compression");
>           // The file is written with StringWritable keys and FloatWritable values.
>           SequenceFile.Writer writer =
>             SequenceFile.createWriter(fs, conf, file,
>                         StringWritable.class, FloatWritable.class,
>                         compressionType, codec);
>           FloatWritable x = new FloatWritable();
>           StringWritable y = new StringWritable();
>           for (int i = count - 1; i >= 0; i--) {
>             x.set(i);
>             y.set("" + i);
>             writer.append(y, x);
>           }
>           writer.close();
>         }
>         private static void sortTest(FileSystem fs, int count, int megabytes,
>                         int factor, boolean fast, Path file, Configuration conf)
>           throws IOException {
>           fs.delete(new Path(file + ".sorted"), true);
>           SequenceFile.Sorter sorter = newSorter(fs, fast, megabytes, factor, conf);
>           LOG.debug("sorting " + count + " records");
>           sorter.sort(file, file.suffix(".sorted"));
>           LOG.info("done sorting " + count + " debug");
>         }
>         
>         private static SequenceFile.Sorter newSorter(FileSystem fs, boolean fast,
>                         int megabytes, int factor, Configuration conf) {
>           // The Sorter is given FloatWritable/IntWritable, which does not match the file.
>           SequenceFile.Sorter sorter =
>             fast
>             ? new SequenceFile.Sorter(fs, new IntWritable.Comparator(),
>                         FloatWritable.class, IntWritable.class, conf)
>             : new SequenceFile.Sorter(fs, FloatWritable.class, IntWritable.class, conf);
>           sorter.setMemory(megabytes * 1024 * 1024);
>           sorter.setFactor(factor);
>           return sorter;
>         }
> ---------------------
> Note: the String/Float classes written to the file do not match the 
> Float/Int classes given to the Sorter.
> Macintosh-2:datanode bobcook$ od -c file          
> 0000000    S   E   Q 006 016   S   t   r   i   n   g   W   r   i   t   a
> 0000020    b   l   e  \r   F   l   o   a   t   W   r   i   t   a   b   l
> 0000040    e  \0  \0  \0  \0  \0  \0 203   `   n   E   J   z 272   d 352
> 0000060    w 177 373  \n 364   M 276  \0  \0  \0  \n  \0  \0  \0 006  \0
> 0000100   \0  \0 001  \0   4   @ 200  \0  \0  \0  \0  \0  \n  \0  \0  \0
> 0000120  006  \0  \0  \0 001  \0   3   @   @  \0  \0  \0  \0  \0  \n  \0
> 0000140   \0  \0 006  \0  \0  \0 001  \0   2   @  \0  \0  \0  \0  \0  \0
> 0000160   \n  \0  \0  \0 006  \0  \0  \0 001  \0   1   ? 200  \0  \0  \0
> 0000200   \0  \0  \n  \0  \0  \0 006  \0  \0  \0 001  \0   0  \0  \0  \0
> *
> 0000220
> Macintosh-2:datanode bobcook$ od -c file.sorted
> 0000000    S   E   Q 006  \r   F   l   o   a   t   W   r   i   t   a   b
> 0000020    l   e  \v   I   n   t   W   r   i   t   a   b   l   e  \0  \0
> 0000040   \0  \0  \0  \0   6 364 343  \r 256   h   U 222 365   T   7   l
> 0000060  357   i   ~   }  \0  \0  \0  \n  \0  \0  \0 006  \0  \0  \0 001
> 0000100   \0   4   @ 200  \0  \0  \0  \0  \0  \n  \0  \0  \0 006  \0  \0
> 0000120   \0 001  \0   3   @   @  \0  \0  \0  \0  \0  \n  \0  \0  \0 006
> 0000140   \0  \0  \0 001  \0   2   @  \0  \0  \0  \0  \0  \0  \n  \0  \0
> 0000160   \0 006  \0  \0  \0 001  \0   1   ? 200  \0  \0  \0  \0  \0  \n
> 0000200   \0  \0  \0 006  \0  \0  \0 001  \0   0  \0  \0  \0  \0        
> 0000216
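> NOTE: THE OUTPUT FILE IS TOTALLY TOASTED (its header says FloatWritable/IntWritable, 
> while the records are still the String/Float data), but no error was generated!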
> PS: Your evaluation of my previous bug reports was enlightening.
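
For illustration only, here is a minimal sketch of the kind of class check 
the description says SortPass is missing. It does not use the Hadoop API: 
the SeqHeaderCheck class and all of its methods are hypothetical names. It 
reads the key/value class names straight from the header bytes, as laid out 
in the od dumps above ("SEQ", a version byte, then two length-prefixed 
names), and assumes each name is shorter than 128 bytes so that its length 
fits in a single byte. It throws IllegalArgumentException rather than 
IOException only to keep the sketch free of checked exceptions.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hadoop: reads the key/value class names
// from the start of a SequenceFile and compares them with expected names.
final class SeqHeaderCheck {

    /** Class names recorded in a SequenceFile header. */
    static final class HeaderClasses {
        final String keyClass;
        final String valueClass;

        HeaderClasses(String keyClass, String valueClass) {
            this.keyClass = keyClass;
            this.valueClass = valueClass;
        }
    }

    /** Parses "SEQ", the version byte, and the two length-prefixed class names. */
    static HeaderClasses readHeaderClasses(byte[] fileStart) {
        ByteBuffer buf = ByteBuffer.wrap(fileStart);
        if (buf.get() != 'S' || buf.get() != 'E' || buf.get() != 'Q') {
            throw new IllegalArgumentException("not a SequenceFile");
        }
        buf.get(); // version byte (6 in the dumps above); not checked here
        String keyClass = readShortString(buf);
        String valueClass = readShortString(buf);
        return new HeaderClasses(keyClass, valueClass);
    }

    /** Reads a string whose length is stored in one byte (assumed < 128). */
    private static String readShortString(ByteBuffer buf) {
        int len = buf.get();
        if (len < 0) {
            throw new IllegalArgumentException(
                "multi-byte length not handled in this sketch");
        }
        byte[] bytes = new byte[len];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** Fails if the header's classes differ from the ones the caller expects. */
    static void checkClasses(byte[] fileStart, String expectedKey,
                             String expectedValue) {
        HeaderClasses h = readHeaderClasses(fileStart);
        if (!h.keyClass.equals(expectedKey)) {
            throw new IllegalArgumentException(
                "wrong key class: " + h.keyClass + " is not " + expectedKey);
        }
        if (!h.valueClass.equals(expectedValue)) {
            throw new IllegalArgumentException(
                "wrong value class: " + h.valueClass + " is not " + expectedValue);
        }
    }

    /** Builds only the leading header bytes, for the demonstration below. */
    static byte[] header(String keyClass, String valueClass) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write('S');
        out.write('E');
        out.write('Q');
        out.write(6);
        for (String name : new String[] {keyClass, valueClass}) {
            byte[] b = name.getBytes(StandardCharsets.US_ASCII);
            out.write(b.length);
            out.write(b, 0, b.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Input written as String/Float, checked against the Sorter's Float/Int.
        byte[] input = header("StringWritable", "FloatWritable");
        try {
            checkClasses(input, "FloatWritable", "IntWritable");
            System.out.println("classes match");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In a real fix the names would presumably come from the Reader that the sort 
pass already opens on the input, rather than from re-parsing raw bytes; the 
byte-level version is used here only so the sketch runs without Hadoop on 
the classpath.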

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
