[
https://issues.apache.org/jira/browse/HDFS-10892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15527560#comment-15527560
]
Mingliang Liu edited comment on HDFS-10892 at 9/27/16 10:12 PM:
----------------------------------------------------------------
The {{hadoop.hdfs.TestDFSShell#testUtf8Encoding}} test did not pass on
Jenkins. The stack trace is as follows:
{quote}
Error Message
Malformed input or input contains unmappable characters: /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/1/RFrIWvn1nt/TestDFSShell/哈杜普.txt
Stacktrace
java.nio.file.InvalidPathException: Malformed input or input contains unmappable characters: /testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/1/RFrIWvn1nt/TestDFSShell/哈杜普.txt
    at sun.nio.fs.UnixPath.encode(UnixPath.java:147)
    at sun.nio.fs.UnixPath.<init>(UnixPath.java:71)
    at sun.nio.fs.UnixFileSystem.getPath(UnixFileSystem.java:281)
    at java.io.File.toPath(File.java:2234)
    at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.getLastAccessTime(RawLocalFileSystem.java:662)
    at org.apache.hadoop.fs.RawLocalFileSystem$DeprecatedRawLocalFileStatus.<init>(RawLocalFileSystem.java:673)
    at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:643)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:871)
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:635)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:435)
    at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:360)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2093)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2061)
    at org.apache.hadoop.fs.FileSystem.copyFromLocalFile(FileSystem.java:2026)
    at org.apache.hadoop.hdfs.TestDFSShell.testUtf8Encoding(TestDFSShell.java:3885)
{quote}
The test passes locally for me without any problem. According to discussions
on the Internet, this may have something to do with the {{LANG}} environment
variable. I can confirm that my local test machine uses the following locale
settings.
{code}
$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
{code}
I also think Yetus sets the locale correctly. [~aw] Do you have any idea
about this? Thanks.
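For reference, here is a minimal standalone sketch (mine, not part of any patch) that reproduces the same {{InvalidPathException}} when the JVM is started under a non-UTF-8 locale (e.g. {{LANG=C}}). Note that setting {{sun.jnu.encoding}} via {{System.setProperty}} after startup is likely too late, since the NIO path encoder is initialized from the startup value.
{code}
import java.io.File;
import java.nio.file.InvalidPathException;

// Sketch only: run once with LANG=C and once with LANG=en_US.UTF-8 to compare.
public class Utf8PathProbe {
  public static void main(String[] args) {
    // The file-name charset the JVM picked up from the environment at startup.
    System.out.println("sun.jnu.encoding = "
        + System.getProperty("sun.jnu.encoding"));
    try {
      // Same call chain as RawLocalFileSystem in the stack trace above:
      // File#toPath() -> UnixPath#encode().
      new File("哈杜普.txt").toPath();
      System.out.println("File name is encodable in this locale");
    } catch (InvalidPathException e) {
      System.out.println("Failed as on Jenkins: " + e.getMessage());
    }
  }
}
{code}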
As this test case is not tightly related to the {{-tail}} or {{-stat}}
commands, I removed it in the latest patch. If anyone suggests a working
approach, I'd like to file a separate JIRA to track this test; otherwise, we
can test it elsewhere (say, in nightly system tests). For what it's worth, the
test code is:
{code}
/**
 * Test that the file name and content can have UTF-8 chars.
 */
@Test (timeout = 30000)
public void testUtf8Encoding() throws Exception {
  final int blockSize = 1024;
  final Configuration conf = new HdfsConfiguration();
  conf.setInt(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, blockSize);
  try (MiniDFSCluster cluster =
      new MiniDFSCluster.Builder(conf).numDataNodes(3).build()) {
    cluster.waitActive();
    final DistributedFileSystem dfs = cluster.getFileSystem();
    final Path workDir = new Path("/testUtf8Encoding");
    dfs.mkdirs(workDir);
    System.setProperty("sun.jnu.encoding", "UTF-8");
    System.setProperty("file.encoding", "UTF-8");
    final String chineseStr = "哈杜普.txt";
    final File testFile = new File(TEST_ROOT_DIR, chineseStr);
    // create a local file; its content contains the Chinese file name
    createLocalFile(testFile);
    dfs.copyFromLocalFile(new Path(testFile.getPath()), workDir);
    assertTrue(dfs.exists(new Path(workDir, testFile.getName())));
    final ByteArrayOutputStream out = new ByteArrayOutputStream();
    System.setOut(new PrintStream(out));
    final String[] argv = new String[] {
        "-cat", workDir + "/" + testFile.getName()};
    final int ret = ToolRunner.run(new FsShell(conf), argv);
    assertEquals(Arrays.toString(argv) + " returned non-zero status " + ret,
        0, ret);
    assertTrue("Unexpected -cat output: " + out,
        out.toString().contains(chineseStr));
  }
}
{code}
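If we do revive this test in a follow-up JIRA, one option would be to skip it instead of failing it when the JVM's file-name encoding cannot represent the name. This is a sketch only (the {{assumeUtf8FileNameEncoding}} helper is hypothetical and not in any patch), using JUnit's {{Assume}}:
{code}
import java.nio.charset.Charset;
import org.junit.Assume;

// Sketch only: would be added to TestDFSShell and called at the top of
// testUtf8Encoding(), e.g. assumeUtf8FileNameEncoding(chineseStr);
private static void assumeUtf8FileNameEncoding(String fileName) {
  // The charset the JVM uses for file names, falling back to the default.
  final String jnuEncoding = System.getProperty("sun.jnu.encoding",
      Charset.defaultCharset().name());
  // Skip (rather than fail) the test when the name cannot be encoded.
  Assume.assumeTrue("Locale encoding " + jnuEncoding
      + " cannot encode file name " + fileName,
      Charset.forName(jnuEncoding).newEncoder().canEncode(fileName));
}
{code}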
was (Author: liuml07):
The {{hadoop.hdfs.TestDFSShell#testUtf8Encoding}} test was not able to run on
Jenkins. It passes locally for me without any problem. According to
discussions on the Internet, this may have something to do with the {{LANG}}
environment variable. I can confirm that my local test machine uses the
following locale settings. I also think Yetus sets the locale correctly. [~aw]
Do you have any idea about this? Thanks.
{code}
$ locale
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
{code}
As this test case is not tightly related to the {{-tail}} or {{-stat}}
commands, I removed it in the latest patch. For what it's worth, the test
code is:
{code}
/**
 * Test that the file name and content can have UTF-8 chars.
 */
@Test (timeout = 30000)
public void testUtf8Encoding() throws Exception {
  final int blockSize = 1024;
  final Configuration conf = new HdfsConfiguration();
  conf.setInt(DFSConfigKeys.DFS_BLOCK_SIZE_KEY, blockSize);
  try (MiniDFSCluster cluster =
      new MiniDFSCluster.Builder(conf).numDataNodes(3).build()) {
    cluster.waitActive();
    final DistributedFileSystem dfs = cluster.getFileSystem();
    final Path workDir = new Path("/testUtf8Encoding");
    dfs.mkdirs(workDir);
    System.setProperty("sun.jnu.encoding", "UTF-8");
    System.setProperty("file.encoding", "UTF-8");
    final String chineseStr = "哈杜普.txt";
    final File testFile = new File(TEST_ROOT_DIR, chineseStr);
    // create a local file; its content contains the Chinese file name
    createLocalFile(testFile);
    dfs.copyFromLocalFile(new Path(testFile.getPath()), workDir);
    assertTrue(dfs.exists(new Path(workDir, testFile.getName())));
    final ByteArrayOutputStream out = new ByteArrayOutputStream();
    System.setOut(new PrintStream(out));
    final String[] argv = new String[] {
        "-cat", workDir + "/" + testFile.getName()};
    final int ret = ToolRunner.run(new FsShell(conf), argv);
    assertEquals(Arrays.toString(argv) + " returned non-zero status " + ret,
        0, ret);
    assertTrue("Unexpected -cat output: " + out,
        out.toString().contains(chineseStr));
  }
}
{code}
If anyone suggests a working approach, I'd like to file a separate JIRA to
track this test; otherwise, we can test it elsewhere (say, in nightly system tests).
> Add unit tests for HDFS command 'dfs -tail' and 'dfs -stat'
> -----------------------------------------------------------
>
> Key: HDFS-10892
> URL: https://issues.apache.org/jira/browse/HDFS-10892
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: fs, shell, test
> Reporter: Mingliang Liu
> Assignee: Mingliang Liu
> Attachments: HDFS-10892.000.patch, HDFS-10892.001.patch,
> HDFS-10892.002.patch, HDFS-10892.003.patch, HDFS-10892.004.patch
>
>
> I did not find unit tests in {{trunk}} code for the following cases:
> - HDFS command {{dfs -tail}}
> - HDFS command {{dfs -stat}}
> I think it still merits having them, even though the commands have served us
> for years.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)