[
https://issues.apache.org/jira/browse/HDFS-12862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895407#comment-16895407
]
Wei-Chiu Chuang commented on HDFS-12862:
----------------------------------------
It appears to me that TestDiskBalancer keeps failing after this patch is applied.
I tried 3-4 times and it failed every time, which is interesting because the
test seems unrelated.
Specifically:
{noformat}
[INFO] Running org.apache.hadoop.hdfs.server.diskbalancer.TestDiskBalancer
[ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 456.083 s <<< FAILURE! - in org.apache.hadoop.hdfs.server.diskbalancer.TestDiskBalancer
[ERROR] testDiskBalancerWithFedClusterWithOneNameServiceEmpty(org.apache.hadoop.hdfs.server.diskbalancer.TestDiskBalancer)  Time elapsed: 81.176 s <<< FAILURE!
java.lang.AssertionError
	at org.junit.Assert.fail(Assert.java:86)
	at org.junit.Assert.assertTrue(Assert.java:41)
	at org.junit.Assert.assertTrue(Assert.java:52)
	at org.apache.hadoop.hdfs.server.diskbalancer.TestDiskBalancer.testDiskBalancerWithFedClusterWithOneNameServiceEmpty(TestDiskBalancer.java:278)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
	at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
{noformat}
> CacheDirective becomes invalid when NN restart or failover
> ----------------------------------------------------------
>
> Key: HDFS-12862
> URL: https://issues.apache.org/jira/browse/HDFS-12862
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: caching, hdfs
> Affects Versions: 2.7.1
> Environment:
> Reporter: Wang XL
> Assignee: Wang XL
> Priority: Major
> Labels: patch
> Fix For: 3.3.0, 3.2.1
>
> Attachments: HDFS-12862-branch-2.7.1.001.patch,
> HDFS-12862-trunk.002.patch, HDFS-12862-trunk.003.patch,
> HDFS-12862-trunk.004.patch, HDFS-12862.005.patch, HDFS-12862.006.patch,
> HDFS-12862.007.patch, HDFS-12862.branch-3.1.patch
>
>
> The logic in FSNDNCacheOp#modifyCacheDirective is not correct: when modifying
> a cache directive, the expiration in the directive may be a relative expiry
> time, and the EditLog will serialize that relative expiry time as-is.
> {code:java}
>   static void modifyCacheDirective(
>       FSNamesystem fsn, CacheManager cacheManager, CacheDirectiveInfo directive,
>       EnumSet<CacheFlag> flags, boolean logRetryCache) throws IOException {
>     final FSPermissionChecker pc = getFsPermissionChecker(fsn);
>     cacheManager.modifyDirective(directive, pc, flags);
>     // The directive may still carry a relative expiration here, but it is
>     // logged without being converted to an absolute time first.
>     fsn.getEditLog().logModifyCacheDirectiveInfo(directive, logRetryCache);
>   }
> {code}
> But when the SBN replays the log, it invokes
> FSImageSerialization#readCacheDirectiveInfo, which interprets the value as an
> absolute expiry time. This results in an inconsistency.
> {code:java}
>   public static CacheDirectiveInfo readCacheDirectiveInfo(DataInput in)
>       throws IOException {
>     CacheDirectiveInfo.Builder builder = new CacheDirectiveInfo.Builder();
>     builder.setId(readLong(in));
>     int flags = in.readInt();
>     if ((flags & 0x1) != 0) {
>       builder.setPath(new Path(readString(in)));
>     }
>     if ((flags & 0x2) != 0) {
>       builder.setReplication(readShort(in));
>     }
>     if ((flags & 0x4) != 0) {
>       builder.setPool(readString(in));
>     }
>     if ((flags & 0x8) != 0) {
>       // The stored long is always interpreted as an absolute expiry time.
>       builder.setExpiration(
>           CacheDirectiveInfo.Expiration.newAbsolute(readLong(in)));
>     }
>     if ((flags & ~0xF) != 0) {
>       throw new IOException("unknown flags set in " +
>           "ModifyCacheDirectiveInfoOp: " + flags);
>     }
>     return builder.build();
>   }
> {code}
> In other words, fsn.getEditLog().logModifyCacheDirectiveInfo(directive,
> logRetryCache) may serialize a relative expiry time, but
> builder.setExpiration(CacheDirectiveInfo.Expiration.newAbsolute(readLong(in)))
> reads it back as an absolute expiry time.
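The mismatch described above can be sketched in a minimal, self-contained way. The class and method names below are hypothetical, not actual HDFS APIs; this only models writing a relative expiry long unchanged and reading it back as an absolute epoch timestamp, which makes the directive appear already expired after a restart or failover.

```java
// Minimal sketch of the bug (hypothetical names, not HDFS APIs):
// the write path logs a relative expiry as-is, while the replay path
// always interprets the stored long as an absolute epoch-millis time.
public class ExpiryMismatchDemo {
    // Writer side: a correct implementation would convert a relative
    // expiry to absolute before logging; the buggy one writes it as-is.
    static long logExpiry(long expiryMs, boolean isRelative, long nowMs) {
        return expiryMs; // buggy: ignores isRelative and nowMs
    }

    // Replay side: the stored value is always treated as absolute.
    static long readExpiryAsAbsolute(long stored) {
        return stored;
    }

    public static void main(String[] args) {
        long now = 1_700_000_000_000L; // some current epoch millis
        long relativeExpiry = 30_000L; // "expire 30 s from now"

        long stored = logExpiry(relativeExpiry, true, now);
        long replayed = readExpiryAsAbsolute(stored);

        // The replayed "absolute" time is 30000 ms after the 1970 epoch,
        // far earlier than 'now', so the directive is already expired.
        System.out.println(replayed < now); // prints "true"

        // The fix is to normalize on the write path: log an absolute time.
        long fixed = now + relativeExpiry;
        System.out.println(fixed > now);    // prints "true"
    }
}
```

The point of the sketch is that the invariant must be enforced on the write path: once the edit log only ever contains absolute expiry times, newAbsolute on replay is correct.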
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)