[ 
https://issues.apache.org/jira/browse/HADOOP-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506727#comment-13506727
 ] 

Todd Lipcon commented on HADOOP-9103:
-------------------------------------

bq. Nice. Should we test for rejecting 5-byte and 6-byte sequences, since I 
notice you added some code to do that?

I added a test for an invalid sequence. I didn't think it was necessary to add 
a separate test for a 5-byte sequence, since it would trigger the same 
"invalid" code path. Got an example hex sequence you think we should test 
against?
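
(For reference only, a sketch of what such sequences look like, not taken from the patch:
RFC 3629 limits UTF-8 to four bytes per character, so lead bytes 0xF8-0xFD, which begin
the old 5- and 6-byte forms, should be rejected outright. Hypothetical examples:)

    // Illustrative only: forbidden 5- and 6-byte sequences a strict UTF-8
    // decoder should reject rather than decode.
    byte[] fiveByteSeq = { (byte) 0xF8, (byte) 0x88, (byte) 0x80, (byte) 0x80, (byte) 0x80 };
    byte[] sixByteSeq  = { (byte) 0xFC, (byte) 0x84, (byte) 0x80, (byte) 0x80,
                           (byte) 0x80, (byte) 0x80 };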

bq. I'm also a little scared by the idea that we have differently-encoded 
byte[] running around for the same file name strings. We have to be very 
careful about this. 
bq. ...However, we could do that in a separate JIRA, not here
Agreed. Let's open a separate HDFS JIRA and use this for the Common-side fix. 
This patch alone was enough to successfully restart a NN which had an open file 
with a 4-byte codepoint.
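
(For illustration, a minimal standalone sketch, not part of this patch: it shows how the
same supplementary-plane string yields different byte[] under standard UTF-8
(String.getBytes) and Java's modified UTF-8 (DataOutputStream.writeUTF), which is the
kind of divergence described above.)

    import java.io.ByteArrayOutputStream;
    import java.io.DataOutputStream;
    import java.util.Arrays;

    public class SupplementaryEncodingDemo {
      public static void main(String[] args) throws Exception {
        // U+10400 is outside the BMP, i.e. a 4-byte codepoint in standard UTF-8.
        String name = new String(Character.toChars(0x10400));

        // Standard UTF-8: F0 90 90 80 (4 bytes).
        byte[] standardUtf8 = name.getBytes("UTF-8");

        // Modified UTF-8 (writeUTF): each surrogate encoded separately,
        // ED A0 81 ED B0 80 (6 bytes), preceded by a 2-byte length we strip off.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        new DataOutputStream(bos).writeUTF(name);
        byte[] modifiedUtf8 = Arrays.copyOfRange(bos.toByteArray(), 2, bos.size());

        System.out.println("standard UTF-8: " + toHex(standardUtf8));
        System.out.println("modified UTF-8: " + toHex(modifiedUtf8));
        System.out.println("equal? " + Arrays.equals(standardUtf8, modifiedUtf8)); // false
      }

      private static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
          sb.append(String.format("%02X ", b));
        }
        return sb.toString().trim();
      }
    }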
                
> UTF8 class does not properly decode Unicode characters outside the basic 
> multilingual plane
> -------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9103
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9103
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: io
>    Affects Versions: 0.20.1
>         Environment: SUSE LINUX
>            Reporter: yixiaohua
>            Assignee: Todd Lipcon
>         Attachments: FSImage.java, hadoop-9103.txt, ProblemString.txt, 
> TestUTF8AndStringGetBytes.java, TestUTF8AndStringGetBytes.java
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> This is the log information of the exception from the SecondaryNameNode: 
> 2012-03-28 00:48:42,553 ERROR 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: 
> java.io.IOException: Found lease for
>  non-existent file 
> /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/????@???????????????
> ??????????tor.qzone.qq.com/keypart-00174
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFilesUnderConstruction(FSImage.java:1211)
>         at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:959)
>         at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:589)
>         at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:473)
>         at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:350)
>         at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:314)
>         at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:225)
>         at java.lang.Thread.run(Thread.java:619)
> This is the log information about the file from the NameNode:
> 2012-03-28 00:32:26,528 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=boss,boss 
> ip=/10.131.16.34        cmd=create      
> src=/user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?            tor.qzone.qq.com/keypart-00174 dst=null        
> perm=boss:boss:rw-r--r--
> 2012-03-28 00:37:42,387 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* 
> NameSystem.allocateBlock: 
> /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?            tor.qzone.qq.com/keypart-00174. 
> blk_2751836614265659170_184668759
> 2012-03-28 00:37:42,696 INFO org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.completeFile: file 
> /user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?            tor.qzone.qq.com/keypart-00174 is closed by 
> DFSClient_attempt_201203271849_0016_r_000174_0
> 2012-03-28 00:37:50,315 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=boss,boss 
> ip=/10.131.16.34        cmd=rename      
> src=/user/boss/pgv/fission/task16/split/_temporary/_attempt_201203271849_0016_r_000174_0/
>   @?            tor.qzone.qq.com/keypart-00174 
> dst=/user/boss/pgv/fission/task16/split/  @?            
> tor.qzone.qq.com/keypart-00174  perm=boss:boss:rw-r--r--
> After checking the code that saves the FSImage, I found a problem that may be 
> a bug in the HDFS code; I paste it below:
> -------------This is the saveFSImage method in FSImage.java; I marked the 
> problem code------------
> /**
>    * Save the contents of the FS image to the file.
>    */
>   void saveFSImage(File newFile) throws IOException {
>     FSNamesystem fsNamesys = FSNamesystem.getFSNamesystem();
>     FSDirectory fsDir = fsNamesys.dir;
>     long startTime = FSNamesystem.now();
>     //
>     // Write out data
>     //
>     DataOutputStream out = new DataOutputStream(
>         new BufferedOutputStream(new FileOutputStream(newFile)));
>     try {
>       .........
>     
>       // save the rest of the nodes
>       saveImage(strbuf, 0, fsDir.rootDir, out);------------------problem
>       fsNamesys.saveFilesUnderConstruction(out);------------------problem, detail is below
>       strbuf = null;
>     } finally {
>       out.close();
>     }
>     LOG.info("Image file of size " + newFile.length() + " saved in " 
>         + (FSNamesystem.now() - startTime)/1000 + " seconds.");
>   }
>  /**
>    * Save file tree image starting from the given root.
>    * This is a recursive procedure, which first saves all children of
>    * a current directory and then moves inside the sub-directories.
>    */
>   private static void saveImage(ByteBuffer parentPrefix,
>                                 int prefixLength,
>                                 INodeDirectory current,
>                                 DataOutputStream out) throws IOException {
>     int newPrefixLength = prefixLength;
>     if (current.getChildrenRaw() == null)
>       return;
>     for(INode child : current.getChildren()) {
>       // print all children first
>       parentPrefix.position(prefixLength);
>       parentPrefix.put(PATH_SEPARATOR).put(child.getLocalNameBytes());------------------problem
>       saveINode2Image(parentPrefix, child, out);
>     }
>    ..........
>   }
>  // Helper function that writes an INodeUnderConstruction
>   // into the input stream
>   //
>   static void writeINodeUnderConstruction(DataOutputStream out,
>                                            INodeFileUnderConstruction cons,
>                                            String path) 
>                                            throws IOException {
>     writeString(path, out);------------------problem
>     ..........
>   }
>   
>   static private final UTF8 U_STR = new UTF8();
>   static void writeString(String str, DataOutputStream out) throws IOException {
>     U_STR.set(str);
>     U_STR.write(out);------------------problem 
>   }
>   /**
>    * Converts a string to a byte array using UTF8 encoding.
>    */
>   static byte[] string2Bytes(String str) {
>     try {
>       return str.getBytes("UTF8");------------------problem 
>     } catch(UnsupportedEncodingException e) {
>       assert false : "UTF8 encoding is not supported ";
>     }
>     return null;
>   }
> ------------------------------------------Below is the explanation------------------------
> In the saveImage method, child.getLocalNameBytes() produces its bytes via 
> str.getBytes("UTF8");
> but in writeINodeUnderConstruction, the bytes are produced via the UTF8 class.
> I ran a test with our garbled file name and found that the two byte arrays 
> are not equal. After changing both to use the UTF8 class, the problem disappeared.
> I think this is a bug in HDFS or in UTF8.
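> (A minimal illustrative sketch of that comparison, not the actual test; it assumes 
> the static helper org.apache.hadoop.io.UTF8.getBytes(String) from Hadoop Common and 
> a hypothetical file name containing a codepoint outside the BMP:)
>
>     // Hypothetical comparison: a name with a supplementary codepoint encodes
>     // differently under String.getBytes("UTF8") and the Hadoop UTF8 class.
>     String name = new String(Character.toChars(0x10400));
>     byte[] viaString = name.getBytes("UTF8");                    // what string2Bytes uses
>     byte[] viaUTF8   = org.apache.hadoop.io.UTF8.getBytes(name); // roughly what U_STR/writeString encodes, minus the length prefix
>     System.out.println(java.util.Arrays.equals(viaString, viaUTF8)); // not equal, per the test described above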

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
