[ https://issues.apache.org/jira/browse/HDFS-10892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15527560#comment-15527560 ]

Mingliang Liu edited comment on HDFS-10892 at 9/27/16 10:12 PM:
----------------------------------------------------------------

The {{hadoop.hdfs.TestDFSShell#testUtf8Encoding}} test does not pass on 
Jenkins. The stack trace is:
{quote}
Error Message

Malformed input or input contains unmappable characters: /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/1/RFrIWvn1nt/TestDFSShell/哈杜普.txt

Stacktrace

java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/1/RFrIWvn1nt/TestDFSShell/哈杜普.txt
        at sun.nio.fs.UnixPath.encode(UnixPath.java:147)
        at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
        at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
        at java.io.File.toPath(File.java:2234)
        at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getLastAccessTime(RawLocalFileSystem.java:662)
        at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.<init>(RawLocalFileSystem.java:673)
        at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:643)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:871)
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:635)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:435)
        at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:360)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2093)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2061)
        at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2026)
        at org.apache.hadoop.hdfs.TestDFSShell.testUtf8Encoding(TestDFSShell.java:3885)
{quote}

The test passes locally without any problem. Perhaps it has something to do 
with the {{LANG}} environment variable, according to discussions on the 
Internet. I can confirm that my local test machine is using the following 
settings:
{code}
$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
{code}

I also think Yetus is setting the locale correctly. [~aw] Do you have any idea 
about this? Thanks.
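For what it's worth, a minimal diagnostic that could be run on a Jenkins slave 
(a sketch in plain Java, no Hadoop dependencies; the class name 
{{EncodingDiagnostics}} is made up) to see which encodings the JVM resolved at 
start-up and whether the native path encoding reproduces the failure:

```java
import java.nio.charset.Charset;
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

/** Prints the encoding settings the JVM resolved at start-up. */
public class EncodingDiagnostics {
  public static void main(String[] args) {
    System.out.println("file.encoding    = " + System.getProperty("file.encoding"));
    System.out.println("sun.jnu.encoding = " + System.getProperty("sun.jnu.encoding"));
    System.out.println("defaultCharset   = " + Charset.defaultCharset());
    try {
      // Same code path (sun.nio.fs.UnixPath.encode) that fails in the
      // stack trace above.
      Paths.get("哈杜普.txt");
      System.out.println("native path encoding handles the UTF-8 name");
    } catch (InvalidPathException e) {
      System.out.println("reproduced: " + e.getMessage());
    }
  }
}
```

On a machine with the {{en_US.UTF-8}} locale above, this should report that 
the UTF-8 name is handled; on a slave with a {{POSIX}}/{{C}} locale it should 
reproduce the {{InvalidPathException}}.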

As this test case is not tightly related to the "-tail" or "-stat" commands, I 
removed it in the latest patch. If anyone suggests a working approach, I'd like 
to file a separate JIRA to track this test. Otherwise, we can test it elsewhere 
(say, in nightly system tests). For what it's worth, the test code is:
{code}
  /**
   * Test that the file name and content can have UTF-8 chars.
   */
  @Test (timeout = 30000)
  public void testUtf8Encoding() throws Exception {
    final int blockSize = 1024;
    final Configuration conf = new HdfsConfiguration();
    conf.setInt(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, blockSize);

    try (MiniDFSCluster cluster =
             new MiniDFSCluster.Builder(conf).numDataNodes(3).build()) {
      cluster.waitActive();
      final DistributedFileSystem dfs = cluster.getFileSystem();
      final Path workDir = new Path("/testUtf8Encoding");
      dfs.mkdirs(workDir);

      // NB: these properties are read once at JVM start-up, so setting them
      // here may be too late to affect the cached default/native charsets.
      System.setProperty("sun.jnu.encoding", "UTF-8");
      System.setProperty("file.encoding", "UTF-8");
      final String chineseStr = "哈杜普.txt";
      final File testFile = new File(TEST_ROOT_DIR, chineseStr);
      // create a local file; its content contains the Chinese file name
      createLocalFile(testFile);
      dfs.copyFromLocalFile(new Path(testFile.getPath()), workDir);
      assertTrue(dfs.exists(new Path(workDir, testFile.getName())));

      final ByteArrayOutputStream out = new ByteArrayOutputStream();
      System.setOut(new PrintStream(out));

      final String[] argv = {"-cat", workDir + "/" + testFile.getName()};
      final int ret = ToolRunner.run(new FsShell(conf), argv);
      assertEquals(Arrays.toString(argv) + " returned non-zero status " + ret,
          0, ret);
      assertTrue("Unexpected -cat output: " + out,
          out.toString().contains(chineseStr));
    }
  }
{code}
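If we do want to keep the test somewhere, one possible guard (only a sketch, 
untested on Jenkins; {{NativeEncodingCheck}} is a hypothetical helper, not part 
of any patch) is to probe whether the native encoding can represent the file 
name, and skip the test otherwise:

```java
import java.nio.file.InvalidPathException;
import java.nio.file.Paths;

/** Hypothetical helper; not part of the current patch. */
public final class NativeEncodingCheck {

  private NativeEncodingCheck() {
  }

  /** True if the JVM's native file-name encoding can represent the name. */
  public static boolean canRepresent(String fileName) {
    try {
      // Paths.get() goes through the same UnixPath.encode() call that
      // throws InvalidPathException in the Jenkins stack trace.
      Paths.get(fileName);
      return true;
    } catch (InvalidPathException e) {
      return false;
    }
  }
}
```

The test body could then start with JUnit's 
{{Assume.assumeTrue(NativeEncodingCheck.canRepresent(chineseStr))}}, which 
would turn an incompatible {{sun.jnu.encoding}} into a skipped test rather 
than a Jenkins failure.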



> Add unit tests for HDFS command 'dfs -tail' and 'dfs -stat'
> -----------------------------------------------------------
>
>                 Key: HDFS-10892
>                 URL: https://issues.apache.org/jira/browse/HDFS-10892
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: fs, shell, test
>            Reporter: Mingliang Liu
>            Assignee: Mingliang Liu
>         Attachments: HDFS-10892.000.patch, HDFS-10892.001.patch, 
> HDFS-10892.002.patch, HDFS-10892.003.patch, HDFS-10892.004.patch
>
>
> I did not find unit tests in {{trunk}} code for the following cases:
> - HDFS command {{dfs -tail}}
> - HDFS command {{dfs -stat}}
> I think it still merits having them, though the commands have served us for 
> years.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
