[jira] [Created] (HDFS-4482) ReplicationMonitor thread can exit with NPE due to the race between delete and replication of same file.
Uma Maheswara Rao G created HDFS-4482:

Summary: ReplicationMonitor thread can exit with NPE due to the race between delete and replication of same file.
Key: HDFS-4482
URL: https://issues.apache.org/jira/browse/HDFS-4482
Project: Hadoop HDFS
Issue Type: Bug
Reporter: Uma Maheswara Rao G
Priority: Blocker

Trace:
{noformat}
java.lang.NullPointerException
 at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getFullPathName(FSDirectory.java:1442)
 at org.apache.hadoop.hdfs.server.namenode.INode.getFullPathName(INode.java:269)
 at org.apache.hadoop.hdfs.server.namenode.INodeFile.getName(INodeFile.java:163)
 at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.chooseTarget(BlockPlacementPolicy.java:131)
 at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1157)
 at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1063)
 at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3085)
 at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3047)
 at java.lang.Thread.run(Thread.java:619)
{noformat}

What I am seeing here is:
1) Create a file and write with 2 DNs.
2) Close the file.
3) Kill one DN.
4) Let replication start.

Info:
{code}
// choose replication targets: NOT HOLDING THE GLOBAL LOCK
// It is costly to extract the filename for which chooseTargets is called,
// so for now we pass in the block collection itself.
rw.targets = blockplacement.chooseTarget(rw.bc,
    rw.additionalReplRequired, rw.srcNode, rw.liveReplicaNodes,
    excludedNodes, rw.block.getNumBytes());
{code}

Here we choose the targets outside the global lock. Inside, chooseTarget tries to get the src path from the blockCollection (which is just the INodeFile here). See the code for FSDirectory#getFullPathName: it first increments the depth for as long as the inode has a parent, and later it iterates and accesses the parent again in the next loop. If the client deletes the file between these two loops, the parent will have been set to null. So accessing the parent here can cause an NPE, because it is not done under the lock.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
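The race reported in HDFS-4482 can be illustrated with a small, self-contained sketch. Everything in it is hypothetical: `Node`, `PathSketch` and `fullPathOrNull` are stand-ins invented for illustration, not the real `INode` or `FSDirectory#getFullPathName`. It shows only one possible defensive approach, which is to walk the parent chain a single time, copy each parent reference into a local, and report a detached file as null instead of dereferencing a parent that a concurrent delete has cleared. The report does not say whether the actual fix should do this or should resolve the path under the lock.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for an inode; NOT the real HDFS INode class.
final class Node {
    final String name;
    // volatile: in this sketch a concurrent delete may null this out at any time
    volatile Node parent;

    Node(String name, Node parent) {
        this.name = name;
        this.parent = parent;
    }
}

final class PathSketch {
    // Walks the parent chain exactly once, reading each parent reference a
    // single time. Returns null if the chain does not end at the given root,
    // which is what happens once a delete has detached the node.
    static String fullPathOrNull(Node node, Node root) {
        Deque<String> parts = new ArrayDeque<>();
        Node cur = node;
        while (cur != null && cur != root) {
            parts.addFirst(cur.name);
            cur = cur.parent; // one read per level, never re-read later
        }
        if (cur != root) {
            return null; // chain was cut before reaching the root
        }
        StringBuilder sb = new StringBuilder();
        for (String part : parts) {
            sb.append('/').append(part);
        }
        return sb.length() == 0 ? "/" : sb.toString();
    }

    public static void main(String[] args) {
        Node root = new Node("", null);
        Node dir = new Node("dir", root);
        Node file = new Node("file", dir);
        System.out.println(fullPathOrNull(file, root));
        file.parent = null; // simulate the client deleting the file
        System.out.println(fullPathOrNull(file, root));
    }
}
```

A caller such as the replication path would then have to treat a null result as "file is gone, skip this block" rather than pass it on.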
Build failed in Jenkins: Hadoop-Hdfs-0.23-Build #519
See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/519/ -- [...truncated 17130 lines...] Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/lang//XException.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/lang//XException.ERROR.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/servlet//FileSystemReleaseFilter.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/servlet//HostnameFilter.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/servlet//MDCFilter.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/servlet//ServerWebApp.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/util//Check.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/util//ConfigurationUtils.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/overview-frame.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/client//package-frame.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/client//package-summary.html... 
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/client//package-tree.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/server//package-frame.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/server//package-summary.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/server//package-tree.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/lang//package-frame.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/lang//package-summary.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/lang//package-tree.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/server//package-frame.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/server//package-summary.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/server//package-tree.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service//package-frame.html... 
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service//package-summary.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service//package-tree.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service/hadoop//package-frame.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service/hadoop//package-summary.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service/hadoop//package-tree.html... Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service/instrumentation//package-frame.html...
Build failed in Jenkins: Hadoop-Hdfs-trunk #1310
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1310/changes Changes: [atm] HDFS-4471. Namenode WebUI file browsing does not work with wildcard addresses configured. Contributed by Andrew Wang. [szetszwo] YARN-377. Use the new StringUtils methods added by HADOOP-9252 and fix TestContainersMonitor. Contributed by Chris Nauroth [suresh] HDFS-4470. Several HDFS tests attempt file operations on invalid HDFS paths when running on Windows. Contributed by Chris Nauroth. [suresh] HADOOP-9277. Improve javadoc for FileContext. Contributed by Andrew Wang. [sseth] YARN-385. Add missing fields - location and #containers to ResourceRequestPBImpl's toString(). Contributed by Sandy Ryza. [sseth] YARN-383. AMRMClientImpl should handle null rmClient in stop(). Contributed by Hitesh Shah. [suresh] HADOOP-9253. Capture ulimit info in the logs at service start time. Contributed by Arpit Gupta. -- [...truncated 13652 lines...] [INFO] Using default encoding to copy filtered resources. [INFO] [INFO] --- maven-compiler-plugin:2.5.1:compile (default-compile) @ hadoop-hdfs-httpfs --- [INFO] Compiling 56 source files to https://builds.apache.org/job/Hadoop-Hdfs-trunk/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/classes [INFO] [INFO] --- maven-antrun-plugin:1.6:run (create-web-xmls) @ hadoop-hdfs-httpfs --- [INFO] Executing tasks main: [mkdir] Created dir: https://builds.apache.org/job/Hadoop-Hdfs-trunk/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/test-classes/webapp [copy] Copying 1 file to https://builds.apache.org/job/Hadoop-Hdfs-trunk/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/test-classes/webapp [INFO] Executed tasks [INFO] [INFO] --- maven-resources-plugin:2.2:testResources (default-testResources) @ hadoop-hdfs-httpfs --- [INFO] Using default encoding to copy filtered resources. 
[INFO] [INFO] --- maven-compiler-plugin:2.5.1:testCompile (default-testCompile) @ hadoop-hdfs-httpfs --- [INFO] Compiling 46 source files to https://builds.apache.org/job/Hadoop-Hdfs-trunk/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/test-classes [INFO] [INFO] --- maven-surefire-plugin:2.12.3:test (default-test) @ hadoop-hdfs-httpfs --- [INFO] Surefire report directory: https://builds.apache.org/job/Hadoop-Hdfs-trunk/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/surefire-reports --- T E S T S --- --- T E S T S --- Running org.apache.hadoop.test.TestDirHelper Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.048 sec Running org.apache.hadoop.test.TestJettyHelper Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.053 sec Running org.apache.hadoop.test.TestHdfsHelper Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.06 sec Running org.apache.hadoop.test.TestHTestCase Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.196 sec Running org.apache.hadoop.test.TestExceptionHelper Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.047 sec Running org.apache.hadoop.test.TestHFSTestCase Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.981 sec Running org.apache.hadoop.lib.service.instrumentation.TestInstrumentationService Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.671 sec Running org.apache.hadoop.lib.service.scheduler.TestSchedulerService Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.196 sec Running org.apache.hadoop.lib.service.security.TestProxyUserService Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.914 sec Running org.apache.hadoop.lib.service.security.TestDelegationTokenManagerService Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.728 sec Running org.apache.hadoop.lib.service.security.TestGroupsService Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.205 sec 
Running org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.744 sec Running org.apache.hadoop.lib.server.TestServerConstructor Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.115 sec Running org.apache.hadoop.lib.server.TestServer Tests run: 30, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.402 sec Running org.apache.hadoop.lib.server.TestBaseService Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.27 sec Running org.apache.hadoop.lib.lang.TestRunnableCallable Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.064 sec Running org.apache.hadoop.lib.lang.TestXException Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.059 sec Running org.apache.hadoop.lib.wsrs.TestParam Tests run: 8, Failures: 0, Errors: 0, Skipped: 0,
Hadoop-Hdfs-trunk - Build # 1310 - Still Failing
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1310/ ### ## LAST 60 LINES OF THE CONSOLE ### [...truncated 13845 lines...] at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165) at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75) Running org.apache.hadoop.fs.http.client.TestHttpFSWithHttpFSFileSystem Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.026 sec Running org.apache.hadoop.fs.http.client.TestHttpFSFileSystemLocalFileSystem Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.223 sec Results : Tests in error: testOperation[7](org.apache.hadoop.fs.http.client.TestHttpFSFWithWebhdfsFileSystem) testOperationDoAs[7](org.apache.hadoop.fs.http.client.TestHttpFSFWithWebhdfsFileSystem) Tests run: 283, Failures: 0, Errors: 2, Skipped: 0 [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Hadoop HDFS SUCCESS [1:27:42.034s] [INFO] Apache Hadoop HttpFS .. FAILURE [1:45.030s] [INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED [INFO] Apache Hadoop HDFS Project SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 1:29:27.856s [INFO] Finished at: Fri Feb 08 13:02:56 UTC 2013 [INFO] Final Memory: 50M/525M [INFO] [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on project hadoop-hdfs-httpfs: There are test failures. [ERROR] [ERROR] Please refer to /home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/surefire-reports for the individual test results. [ERROR] - [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. 
[ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn goals -rf :hadoop-hdfs-httpfs Build step 'Execute shell' marked build as failure Archiving artifacts Updating HDFS-4470 Updating YARN-377 Updating HADOOP-9252 Updating HDFS-4471 Updating HADOOP-9253 Updating YARN-383 Updating YARN-385 Updating HADOOP-9277 Sending e-mails to: hdfs-dev@hadoop.apache.org Email was triggered for: Failure Sending email for trigger: Failure ### ## FAILED TESTS (if any) ## No tests ran.
[jira] [Created] (HDFS-4483) Refactor NN WebUI to no longer pass IP addresses in the URL
Andrew Wang created HDFS-4483: - Summary: Refactor NN WebUI to no longer pass IP addresses in the URL Key: HDFS-4483 URL: https://issues.apache.org/jira/browse/HDFS-4483 Project: Hadoop HDFS Issue Type: Bug Reporter: Andrew Wang Assignee: Andrew Wang Right now, the namenode passes its RPC address in WebUI URLs when it redirects to datanodes for things like browsing the filesystem. This is brittle and fails in different ways when wildcard addresses are configured (see HDFS-3932 and HDFS-4471). A better solution would be to instead pass the NN's nameservice ID in the URL, and make DNs look up the appropriate RPC address for the nameservice from their conf. This fixes the wildcard issues and has the additional benefit of making browsing work after a NN failover.
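The lookup proposed in HDFS-4483 could look roughly like the sketch below. It is an illustration under assumptions, not the patch: a plain `Map` stands in for the DN's Hadoop configuration, `NameserviceLookupSketch` and `rpcAddressFor` are hypothetical names, and the host names are made up. The `dfs.namenode.rpc-address.<nameservice ID>` key form is the one used for federated setups; HA setups append a further NameNode ID, which this sketch ignores.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a plain Map stands in for the DN's configuration.
final class NameserviceLookupSketch {
    // Key prefix for per-nameservice RPC addresses (HA adds a NameNode-ID
    // suffix that this sketch does not handle).
    static final String RPC_ADDRESS_PREFIX = "dfs.namenode.rpc-address.";

    // Resolves the RPC address for a nameservice ID taken from the URL.
    static String rpcAddressFor(Map<String, String> conf, String nameserviceId) {
        String addr = conf.get(RPC_ADDRESS_PREFIX + nameserviceId);
        if (addr == null) {
            throw new IllegalArgumentException(
                "No RPC address configured for nameservice " + nameserviceId);
        }
        return addr;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(RPC_ADDRESS_PREFIX + "ns1", "nn1.example.com:8020"); // made-up host
        System.out.println(rpcAddressFor(conf, "ns1"));
    }
}
```

Because the URL carries only the nameservice ID, a wildcard bind address on the NN never reaches the DN; the DN resolves whatever address its own conf holds at request time.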
[jira] [Created] (HDFS-4484) libwebhdfs compilation broken with gcc 4.6.2
Colin Patrick McCabe created HDFS-4484: -- Summary: libwebhdfs compilation broken with gcc 4.6.2 Key: HDFS-4484 URL: https://issues.apache.org/jira/browse/HDFS-4484 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 2.0.4-beta Environment: OpenSUSE 12.1, x86_64, gcc 4.6.2 Reporter: Colin Patrick McCabe Assignee: Colin Patrick McCabe Priority: Minor libwebhdfs doesn't compile with gcc 4.6.2. {code} /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c: In function ‘main’: /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:260:9: error: ‘for’ loop initial declarations are only allowed in C99 mode /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:260:9: note: use option -std=c99 or -std=gnu99 to compile your code /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:284:13: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘tOffset’ [-Wformat] /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:285:13: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘tOffset’ [-Wformat] /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:308:17: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘tOffset’ [-Wformat] /home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:309:17: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘tOffset’ [-Wformat] {code}
[jira] [Created] (HDFS-4485) HDFS-347: DN should chmod socket path a+w
Todd Lipcon created HDFS-4485: - Summary: HDFS-347: DN should chmod socket path a+w Key: HDFS-4485 URL: https://issues.apache.org/jira/browse/HDFS-4485 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Todd Lipcon Assignee: Colin Patrick McCabe Priority: Critical In cluster-testing HDFS-347, we found that in clusters where the MR job doesn't run as the same user as HDFS, clients wouldn't use short circuit read because of a 'permission denied' error connecting to the socket. It turns out that, in order to connect to a socket, clients need write permissions on the socket file. The DN should set these permissions automatically after it creates the socket.
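As a rough illustration of the "chmod a+w" step requested in HDFS-4485, the sketch below adds write permission for owner, group and others to an existing path using the standard `java.nio.file` API. It is not the DataNode code: `SocketPermsSketch`, `addWriteForAll` and `demo` are hypothetical names, a regular temp file stands in for the domain socket file, and the calls throw `UnsupportedOperationException` on file systems without POSIX permissions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFilePermission;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper; a regular file stands in for the socket path.
final class SocketPermsSketch {
    // Equivalent of "chmod a+w": keeps the existing bits, adds write for all.
    static Set<PosixFilePermission> addWriteForAll(Path path) throws IOException {
        Set<PosixFilePermission> perms = EnumSet.noneOf(PosixFilePermission.class);
        perms.addAll(Files.getPosixFilePermissions(path));
        perms.add(PosixFilePermission.OWNER_WRITE);
        perms.add(PosixFilePermission.GROUP_WRITE);
        perms.add(PosixFilePermission.OTHERS_WRITE);
        Files.setPosixFilePermissions(path, perms);
        return Files.getPosixFilePermissions(path);
    }

    // Creates a temp file, applies the change, and reports whether
    // "others" can now write to it. The temp file is removed afterwards.
    static boolean demo() {
        try {
            Path tmp = Files.createTempFile("dn_socket_sketch", ".tmp");
            try {
                return addWriteForAll(tmp).contains(PosixFilePermission.OTHERS_WRITE);
            } finally {
                Files.delete(tmp);
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Setting the bits explicitly after creation matters because a file created under a restrictive umask would otherwise stay unwritable for other users, which matches the 'permission denied' symptom described above.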
[jira] [Created] (HDFS-4486) Add log category for long-running DFSClient notices
Todd Lipcon created HDFS-4486: - Summary: Add log category for long-running DFSClient notices Key: HDFS-4486 URL: https://issues.apache.org/jira/browse/HDFS-4486 Project: Hadoop HDFS Issue Type: Improvement Reporter: Todd Lipcon Priority: Minor There are a number of features in the DFS client which are transparent but can make a fairly big difference for performance -- two in particular are short circuit reads and native checksumming. Because we don't want log spew for clients like hadoop fs -cat we currently log only at DEBUG level when these features are disabled. This makes it difficult to troubleshoot/verify for long-running perf-sensitive clients like HBase. One simple solution is to add a new log category - eg o.a.h.h.DFSClient.PerformanceAdvisory - which long-running clients could enable at DEBUG level without getting the full debug spew.
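The idea in HDFS-4486 of a separate, narrower log category can be sketched as follows. To stay self-contained the sketch uses `java.util.logging` rather than the logging stack Hadoop itself uses, so `Level.FINE` plays the role of DEBUG here; the class name `PerfAdvisorySketch` and its method are hypothetical, and the category name is the one suggested in the issue, spelled out in full.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch using java.util.logging as a stand-in logging API.
final class PerfAdvisorySketch {
    static final String CLIENT = "org.apache.hadoop.hdfs.DFSClient";
    static final String ADVISORY = CLIENT + ".PerformanceAdvisory";

    // Strong references so the configured loggers are not garbage collected.
    static final Logger CLIENT_LOG = Logger.getLogger(CLIENT);
    static final Logger ADVISORY_LOG = Logger.getLogger(ADVISORY);

    // Keeps the general client category at INFO and enables debug-level
    // (FINE) output only for the advisory sub-category.
    static void enableAdvisoriesOnly() {
        CLIENT_LOG.setLevel(Level.INFO);
        ADVISORY_LOG.setLevel(Level.FINE);
    }

    public static void main(String[] args) {
        enableAdvisoriesOnly();
        System.out.println("advisory debug enabled: " + ADVISORY_LOG.isLoggable(Level.FINE));
        System.out.println("client debug enabled: " + CLIENT_LOG.isLoggable(Level.FINE));
    }
}
```

A long-running client such as HBase could then turn on just the advisory category and see, for example, that short circuit reads are disabled, without enabling the rest of the client's debug output.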
[jira] [Resolved] (HDFS-4480) Eliminate the file snapshot circular linked list
[ https://issues.apache.org/jira/browse/HDFS-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE resolved HDFS-4480. -- Resolution: Fixed Fix Version/s: Snapshot (HDFS-2802) Hadoop Flags: Reviewed Thanks Jing for reviewing the patches. I have committed this. Eliminate the file snapshot circular linked list Key: HDFS-4480 URL: https://issues.apache.org/jira/browse/HDFS-4480 Project: Hadoop HDFS Issue Type: Sub-task Components: namenode Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: Snapshot (HDFS-2802) Attachments: h4480_20130207.patch, h4480_20130208.patch With HDFS-4446, all file changes can be recorded using file diff so that the circular linked list can be eliminated.
[jira] [Resolved] (HDFS-976) Hot Standby for NameNode
[ https://issues.apache.org/jira/browse/HDFS-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J resolved HDFS-976. -- Resolution: Duplicate A working HDFS HA mode has been implemented via HDFS-1623. Closing this one out as a 'dupe'. Hot Standby for NameNode Key: HDFS-976 URL: https://issues.apache.org/jira/browse/HDFS-976 Project: Hadoop HDFS Issue Type: New Feature Components: namenode Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: 0001-0.20.3_rc2-AvatarNode.patch, AvatarNode.20.patch, AvatarNodeDescription.txt, AvatarNode.patch, AvatarPatch.2.patch This is a place holder to share our code and experiences about implementing a Hot Standby for the HDFS NameNode for hadoop 0.20.
Re: [VOTE] Release hadoop-2.0.3-alpha
+1 (binding) I downloaded the src tar ball, built it with the native bits enabled, started up a little cluster, and ran some sample jobs. Things worked as expected. I also verified the signatures on the source artifact. I did bump into one little issue, but I don't think it should be considered a blocker. When I first tried to start up the RM, it failed to start with this error: 13/02/08 16:00:31 FATAL resourcemanager.ResourceManager: Error starting ResourceManager java.lang.IllegalStateException: Queue configuration missing child queue names for root at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:328) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:255) at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:220) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.init(ResourceManager.java:226) at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:710) And then this on shutdown: 13/02/08 16:00:31 INFO service.CompositeService: Error stopping ResourceManager java.lang.NullPointerException at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.stop(ResourceManager.java:590) at org.apache.hadoop.yarn.service.CompositeService$CompositeServiceShutdownHook.run(CompositeService.java:122) at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54) Presumably this is because I don't have the CapacityScheduler queues configured at all, and the default scheduler is now the CapacityScheduler. To work around this for my testing, I switched to the FairScheduler and the RM came up just fine. -- Aaron T. 
Myers Software Engineer, Cloudera On Wed, Feb 6, 2013 at 7:59 PM, Arun C Murthy a...@hortonworks.com wrote: Folks, I've created a release candidate (rc0) for hadoop-2.0.3-alpha that I would like to release. This release contains several major enhancements such as QJM for HDFS HA, multi-resource scheduling for YARN, YARN ResourceManager restart etc. Also YARN has achieved significant stability at scale (more details from Y! folks here: http://s.apache.org/VYO). The RC is available at: http://people.apache.org/~acmurthy/hadoop-2.0.3-alpha-rc0/ The RC tag in svn is here: http://svn.apache.org/viewvc/hadoop/common/tags/release-2.0.3-alpha-rc0/ The maven artifacts are available via repository.apache.org. Please try the release and vote; the vote will run for the usual 7 days. thanks, Arun -- Arun C. Murthy Hortonworks Inc. http://hortonworks.com/
Re: [VOTE] Release hadoop-2.0.3-alpha
The issue with the configuration is raised (and addressed) in https://issues.apache.org/jira/browse/BIGTOP-841 Cos
[jira] [Resolved] (HDFS-4485) HDFS-347: DN should chmod socket path a+w
[ https://issues.apache.org/jira/browse/HDFS-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers resolved HDFS-4485. -- Resolution: Fixed I've just committed this to the HDFS-347 branch. Thanks a lot for the contribution, Colin. HDFS-347: DN should chmod socket path a+w - Key: HDFS-4485 URL: https://issues.apache.org/jira/browse/HDFS-4485 Project: Hadoop HDFS Issue Type: Sub-task Components: datanode Reporter: Todd Lipcon Assignee: Colin Patrick McCabe Priority: Critical Attachments: HDFS-4485.001.patch, HDFS-4485.003.patch In cluster-testing HDFS-347, we found that in clusters where the MR job doesn't run as the same user as HDFS, clients wouldn't use short circuit read because of a 'permission denied' error connecting to the socket. It turns out that, in order to connect to a socket, clients need write permissions on the socket file. The DN should set these permissions automatically after it creates the socket.
Re: [VOTE] Release hadoop-2.0.3-alpha
+1 Deployed on more than 100 nodes. Ran 30TB teragen/terasort. Will run a few more over the weekend to test the scheduler. Things look stable. I do see a few failures, but I believe those are hardware problems. Thanks, @lohitvijayarenu -- Have a Nice Day! Lohit