[jira] [Created] (HDFS-4482) ReplicationMonitor thread can exit with NPE due to the race between delete and replication of same file.

2013-02-08 Thread Uma Maheswara Rao G (JIRA)
Uma Maheswara Rao G created HDFS-4482:
-

 Summary: ReplicationMonitor thread can exit with NPE due to the 
race between delete and replication of same file.
 Key: HDFS-4482
 URL: https://issues.apache.org/jira/browse/HDFS-4482
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Uma Maheswara Rao G
Priority: Blocker


Trace:

{noformat}
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.namenode.FSDirectory.getFullPathName(FSDirectory.java:1442)
    at org.apache.hadoop.hdfs.server.namenode.INode.getFullPathName(INode.java:269)
    at org.apache.hadoop.hdfs.server.namenode.INodeFile.getName(INodeFile.java:163)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.chooseTarget(BlockPlacementPolicy.java:131)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWorkForBlocks(BlockManager.java:1157)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReplicationWork(BlockManager.java:1063)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:3085)
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$ReplicationMonitor.run(BlockManager.java:3047)
    at java.lang.Thread.run(Thread.java:619)

{noformat}

What I am seeing here is:

1) Create a file and write it with 2 DNs.
2) Close the file.
3) Kill one DN.
4) Let replication start.
  Info:
{code}
// choose replication targets: NOT HOLDING THE GLOBAL LOCK
// It is costly to extract the filename for which chooseTargets is called,
// so for now we pass in the block collection itself.
rw.targets = blockplacement.chooseTarget(rw.bc,
    rw.additionalReplRequired, rw.srcNode, rw.liveReplicaNodes,
    excludedNodes, rw.block.getNumBytes());
{code}
Here we choose the replication targets outside the global lock. Inside 
chooseTarget, we try to get the src path from the blockCollection (which is 
nothing but the INodeFile here).

See the code for FSDirectory#getFullPathName: it first increments the depth 
as long as the inode has a parent, and then iterates again, accessing the 
parents a second time in the next loop.

If the client deletes the file in between, the parent will have been set to 
null. So accessing the parent here can throw an NPE, because it is not done 
under the lock.
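
To make the race concrete, here is a simplified sketch of that two-pass 
traversal (illustrative only -- names approximate the real FSDirectory code, 
not a verbatim copy):

{code}
static String getFullPathName(INode inode) {
  // Pass 1: count the depth by walking the parent pointers -- no lock held.
  int depth = 0;
  for (INode i = inode; i != null; i = i.getParent()) {
    depth++;
  }
  INode[] inodes = new INode[depth];
  // Pass 2: walk the parent pointers again to fill the array. If a
  // concurrent delete set a parent to null between the two passes,
  // 'inode' goes null before 'depth' steps complete, and the next
  // getParent() call below throws NullPointerException.
  for (int i = 0; i < depth; i++) {
    inodes[depth - i - 1] = inode;
    inode = inode.getParent();
  }
  // Concatenate the local names from root to leaf into "/a/b/c".
  StringBuilder path = new StringBuilder();
  for (int i = 1; i < depth; i++) {
    path.append('/').append(inodes[i].getLocalName());
  }
  return path.toString();
}
{code}

Either taking the lock for the traversal or re-checking for a deleted inode 
(null parent) and skipping the block would avoid the crash.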


2) 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Build failed in Jenkins: Hadoop-Hdfs-0.23-Build #519

2013-02-08 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/519/

--
[...truncated 17130 lines...]
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/lang//XException.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/lang//XException.ERROR.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/servlet//FileSystemReleaseFilter.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/servlet//HostnameFilter.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/servlet//MDCFilter.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/servlet//ServerWebApp.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/util//Check.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/util//ConfigurationUtils.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/overview-frame.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/client//package-frame.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/client//package-summary.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/client//package-tree.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/server//package-frame.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/server//package-summary.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/fs/http/server//package-tree.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/lang//package-frame.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/lang//package-summary.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/lang//package-tree.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/server//package-frame.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/server//package-summary.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/server//package-tree.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service//package-frame.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service//package-summary.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service//package-tree.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service/hadoop//package-frame.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service/hadoop//package-summary.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service/hadoop//package-tree.html...
Generating https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/org/apache/hadoop/lib/service/instrumentation//package-frame.html...

Build failed in Jenkins: Hadoop-Hdfs-trunk #1310

2013-02-08 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1310/changes

Changes:

[atm] HDFS-4471. Namenode WebUI file browsing does not work with wildcard 
addresses configured. Contributed by Andrew Wang.

[szetszwo] YARN-377. Use the new StringUtils methods added by HADOOP-9252 and 
fix TestContainersMonitor.  Contributed by Chris Nauroth

[suresh] HDFS-4470. Several HDFS tests attempt file operations on invalid HDFS 
paths when running on Windows. Contributed by Chris Nauroth.

[suresh] HADOOP-9277. Improve javadoc for FileContext. Contributed by Andrew 
Wang.

[sseth] YARN-385. Add missing fields - location and #containers to 
ResourceRequestPBImpl's toString(). Contributed by Sandy Ryza.

[sseth] YARN-383. AMRMClientImpl should handle null rmClient in stop(). 
Contributed by Hitesh Shah.

[suresh] HADOOP-9253. Capture ulimit info in the logs at service start time. 
Contributed by Arpit Gupta.

--
[...truncated 13652 lines...]
[INFO] Using default encoding to copy filtered resources.
[INFO] 
[INFO] --- maven-compiler-plugin:2.5.1:compile (default-compile) @ hadoop-hdfs-httpfs ---
[INFO] Compiling 56 source files to https://builds.apache.org/job/Hadoop-Hdfs-trunk/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/classes
[INFO] 
[INFO] --- maven-antrun-plugin:1.6:run (create-web-xmls) @ hadoop-hdfs-httpfs ---
[INFO] Executing tasks

main:
    [mkdir] Created dir: https://builds.apache.org/job/Hadoop-Hdfs-trunk/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/test-classes/webapp
     [copy] Copying 1 file to https://builds.apache.org/job/Hadoop-Hdfs-trunk/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/test-classes/webapp
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-resources-plugin:2.2:testResources (default-testResources) @ hadoop-hdfs-httpfs ---
[INFO] Using default encoding to copy filtered resources.
[INFO] 
[INFO] --- maven-compiler-plugin:2.5.1:testCompile (default-testCompile) @ hadoop-hdfs-httpfs ---
[INFO] Compiling 46 source files to https://builds.apache.org/job/Hadoop-Hdfs-trunk/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/test-classes
[INFO] 
[INFO] --- maven-surefire-plugin:2.12.3:test (default-test) @ hadoop-hdfs-httpfs ---
[INFO] Surefire report directory: https://builds.apache.org/job/Hadoop-Hdfs-trunk/ws/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/surefire-reports

---
 T E S T S
---

Running org.apache.hadoop.test.TestDirHelper
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.048 sec
Running org.apache.hadoop.test.TestJettyHelper
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.053 sec
Running org.apache.hadoop.test.TestHdfsHelper
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.06 sec
Running org.apache.hadoop.test.TestHTestCase
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.196 sec
Running org.apache.hadoop.test.TestExceptionHelper
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.047 sec
Running org.apache.hadoop.test.TestHFSTestCase
Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.981 sec
Running org.apache.hadoop.lib.service.instrumentation.TestInstrumentationService
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.671 sec
Running org.apache.hadoop.lib.service.scheduler.TestSchedulerService
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.196 sec
Running org.apache.hadoop.lib.service.security.TestProxyUserService
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.914 sec
Running org.apache.hadoop.lib.service.security.TestDelegationTokenManagerService
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.728 sec
Running org.apache.hadoop.lib.service.security.TestGroupsService
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.205 sec
Running org.apache.hadoop.lib.service.hadoop.TestFileSystemAccessService
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.744 sec
Running org.apache.hadoop.lib.server.TestServerConstructor
Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.115 sec
Running org.apache.hadoop.lib.server.TestServer
Tests run: 30, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.402 sec
Running org.apache.hadoop.lib.server.TestBaseService
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.27 sec
Running org.apache.hadoop.lib.lang.TestRunnableCallable
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.064 sec
Running org.apache.hadoop.lib.lang.TestXException
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.059 sec
Running org.apache.hadoop.lib.wsrs.TestParam
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, 

Hadoop-Hdfs-trunk - Build # 1310 - Still Failing

2013-02-08 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1310/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 13845 lines...]
    at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
    at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
    at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
    at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
    at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

Running org.apache.hadoop.fs.http.client.TestHttpFSWithHttpFSFileSystem
Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.026 sec
Running org.apache.hadoop.fs.http.client.TestHttpFSFileSystemLocalFileSystem
Tests run: 32, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.223 sec

Results :

Tests in error: 
  testOperation[7](org.apache.hadoop.fs.http.client.TestHttpFSFWithWebhdfsFileSystem)
  testOperationDoAs[7](org.apache.hadoop.fs.http.client.TestHttpFSFWithWebhdfsFileSystem)

Tests run: 283, Failures: 0, Errors: 2, Skipped: 0

[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  SUCCESS [1:27:42.034s]
[INFO] Apache Hadoop HttpFS .. FAILURE [1:45.030s]
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS Project  SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 1:29:27.856s
[INFO] Finished at: Fri Feb 08 13:02:56 UTC 2013
[INFO] Final Memory: 50M/525M
[INFO] 
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:2.12.3:test (default-test) on project hadoop-hdfs-httpfs: There are test failures.
[ERROR] 
[ERROR] Please refer to /home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/trunk/hadoop-hdfs-project/hadoop-hdfs-httpfs/target/surefire-reports for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <goals> -rf :hadoop-hdfs-httpfs
Build step 'Execute shell' marked build as failure
Archiving artifacts
Updating HDFS-4470
Updating YARN-377
Updating HADOOP-9252
Updating HDFS-4471
Updating HADOOP-9253
Updating YARN-383
Updating YARN-385
Updating HADOOP-9277
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Created] (HDFS-4483) Refactor NN WebUI to no longer pass IP addresses in the URL

2013-02-08 Thread Andrew Wang (JIRA)
Andrew Wang created HDFS-4483:
-

 Summary: Refactor NN WebUI to no longer pass IP addresses in the 
URL
 Key: HDFS-4483
 URL: https://issues.apache.org/jira/browse/HDFS-4483
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Andrew Wang
Assignee: Andrew Wang


Right now, the namenode passes its RPC address in WebUI URLs when it redirects 
to datanodes for things like browsing the filesystem. This is brittle and fails 
in different ways when wildcard addresses are configured (see HDFS-3932 and 
HDFS-4471).

A better solution would be to instead pass the NN's nameservice ID in the URL, 
and make DNs look up the appropriate RPC address for the nameservice from their 
conf. This fixes the wildcard issues and has the additional benefit of making 
browsing work after a NN failover.
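
A rough sketch of the DN-side lookup this implies (class name, method, and 
fallback key here are illustrative assumptions, not from an actual patch):

{code}
import java.net.InetSocketAddress;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.net.NetUtils;

public class NNAddressResolver {
  /** Resolve the NN RPC address for a nameservice ID from local conf. */
  static InetSocketAddress resolve(Configuration conf, String nameserviceId) {
    // Per-nameservice key convention: dfs.namenode.rpc-address.<nsId>
    String addr = conf.get("dfs.namenode.rpc-address." + nameserviceId);
    if (addr == null) {
      // Fall back to the plain, non-federated key.
      addr = conf.get("dfs.namenode.rpc-address");
    }
    return NetUtils.createSocketAddr(addr);
  }
}
{code}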



[jira] [Created] (HDFS-4484) libwebhdfs compilation broken with gcc 4.6.2

2013-02-08 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-4484:
--

 Summary: libwebhdfs compilation broken with gcc 4.6.2
 Key: HDFS-4484
 URL: https://issues.apache.org/jira/browse/HDFS-4484
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.0.4-beta
 Environment: OpenSUSE 12.1, x86_64, gcc 4.6.2
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe
Priority: Minor


libwebhdfs doesn't compile with gcc 4.6.2.

{code}
/home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c: In function ‘main’:
/home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:260:9: error: ‘for’ loop initial declarations are only allowed in C99 mode
/home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:260:9: note: use option -std=c99 or -std=gnu99 to compile your code
/home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:284:13: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘tOffset’ [-Wformat]
/home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:285:13: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘tOffset’ [-Wformat]
/home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:308:17: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘tOffset’ [-Wformat]
/home/cmccabe/hadoop1/hadoop-hdfs-project/hadoop-hdfs/src/contrib/libwebhdfs/src/test_libwebhdfs_ops.c:309:17: warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has type ‘tOffset’ [-Wformat]
{code}



[jira] [Created] (HDFS-4485) HDFS-347: DN should chmod socket path a+w

2013-02-08 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-4485:
-

 Summary: HDFS-347: DN should chmod socket path a+w
 Key: HDFS-4485
 URL: https://issues.apache.org/jira/browse/HDFS-4485
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Reporter: Todd Lipcon
Assignee: Colin Patrick McCabe
Priority: Critical


In cluster-testing HDFS-347, we found that in clusters where the MR job doesn't 
run as the same user as HDFS, clients wouldn't use short circuit read because 
of a 'permission denied' error connecting to the socket. It turns out that, in 
order to connect to a socket, clients need write permissions on the socket file.

The DN should set these permissions automatically after it creates the socket.
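
A minimal sketch of the idea in plain Java (illustrative; the real DN code 
may well set the mode through native or FileUtil helpers instead):

{code}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.PosixFilePermission;
import java.util.EnumSet;

public class SocketPerms {
  /** chmod the freshly bound domain socket file so any user can connect. */
  static void makeConnectable(String socketPath) throws IOException {
    Path p = Paths.get(socketPath);
    // Clients need the write bit to connect(), so grant a+w (plus read).
    Files.setPosixFilePermissions(p, EnumSet.of(
        PosixFilePermission.OWNER_READ,  PosixFilePermission.OWNER_WRITE,
        PosixFilePermission.GROUP_READ,  PosixFilePermission.GROUP_WRITE,
        PosixFilePermission.OTHERS_READ, PosixFilePermission.OTHERS_WRITE));
  }
}
{code}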



[jira] [Created] (HDFS-4486) Add log category for long-running DFSClient notices

2013-02-08 Thread Todd Lipcon (JIRA)
Todd Lipcon created HDFS-4486:
-

 Summary: Add log category for long-running DFSClient notices
 Key: HDFS-4486
 URL: https://issues.apache.org/jira/browse/HDFS-4486
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Todd Lipcon
Priority: Minor


There are a number of features in the DFS client which are transparent but can 
make a fairly big difference for performance -- two in particular are 
short-circuit reads and native checksumming. Because we don't want log spew for 
clients like {{hadoop fs -cat}}, we currently log only at DEBUG level when these 
features are disabled. This makes it difficult to troubleshoot/verify them for 
long-running, perf-sensitive clients like HBase.

One simple solution is to add a new log category -- e.g. 
o.a.h.h.DFSClient.PerformanceAdvisory -- which long-running clients could enable 
at DEBUG level without getting the full debug spew.
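
A sketch of the pattern (the category name follows the suggestion above; the 
surrounding class and message are illustrative):

{code}
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

public class PerfAdvisoryExample {
  // A dedicated category, separate from the DFSClient class logger, so it
  // can be enabled at DEBUG without the rest of the client's debug output.
  private static final Log PERF_LOG =
      LogFactory.getLog("org.apache.hadoop.hdfs.DFSClient.PerformanceAdvisory");

  void advise(String reason) {
    if (PERF_LOG.isDebugEnabled()) {
      PERF_LOG.debug("Short-circuit local reads are disabled: " + reason);
    }
  }
}
{code}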



[jira] [Resolved] (HDFS-4480) Eliminate the file snapshot circular linked list

2013-02-08 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ https://issues.apache.org/jira/browse/HDFS-4480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE resolved HDFS-4480.
--

   Resolution: Fixed
Fix Version/s: Snapshot (HDFS-2802)
 Hadoop Flags: Reviewed

Thanks Jing for reviewing the patches.

I have committed this.

 Eliminate the file snapshot circular linked list
 

 Key: HDFS-4480
 URL: https://issues.apache.org/jira/browse/HDFS-4480
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: namenode
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE
 Fix For: Snapshot (HDFS-2802)

 Attachments: h4480_20130207.patch, h4480_20130208.patch


 With HDFS-4446, all file changes can be recorded using file diff so that the 
 circular linked list can be eliminated.



[jira] [Resolved] (HDFS-976) Hot Standby for NameNode

2013-02-08 Thread Harsh J (JIRA)

 [ https://issues.apache.org/jira/browse/HDFS-976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J resolved HDFS-976.
--

Resolution: Duplicate

A working HDFS HA mode has been implemented via HDFS-1623. Closing this one out 
as a 'dupe'.

 Hot Standby for NameNode
 

 Key: HDFS-976
 URL: https://issues.apache.org/jira/browse/HDFS-976
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: namenode
Reporter: dhruba borthakur
Assignee: Dmytro Molkov
 Attachments: 0001-0.20.3_rc2-AvatarNode.patch, AvatarNode.20.patch, 
 AvatarNodeDescription.txt, AvatarNode.patch, AvatarPatch.2.patch


 This is a place holder to share our code and experiences about implementing a 
 Hot Standby for the HDFS NameNode for hadoop 0.20. 



Re: [VOTE] Release hadoop-2.0.3-alpha

2013-02-08 Thread Aaron T. Myers
+1 (binding)

I downloaded the src tar ball, built it with the native bits enabled,
started up a little cluster, and ran some sample jobs. Things worked as
expected. I also verified the signatures on the source artifact.

I did bump into one little issue, but I don't think it should be considered
a blocker. When I first tried to start up the RM, it failed to start with
this error:

13/02/08 16:00:31 FATAL resourcemanager.ResourceManager: Error starting ResourceManager
java.lang.IllegalStateException: Queue configuration missing child queue names for root
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.parseQueue(CapacityScheduler.java:328)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.initializeQueues(CapacityScheduler.java:255)
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.reinitialize(CapacityScheduler.java:220)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.init(ResourceManager.java:226)
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:710)

And then this on shutdown:

13/02/08 16:00:31 INFO service.CompositeService: Error stopping ResourceManager
java.lang.NullPointerException
    at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.stop(ResourceManager.java:590)
    at org.apache.hadoop.yarn.service.CompositeService$CompositeServiceShutdownHook.run(CompositeService.java:122)
    at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)

Presumably this is because I don't have the CapacityScheduler queues
configured at all, and the default scheduler is now the CapacityScheduler.
To work around this for my testing, I switched to the FairScheduler and the
RM came up just fine.
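
For anyone who hits the same thing, the workaround was a one-property change
in yarn-site.xml to select the FairScheduler (property and class name as I
used them; double-check against your build):

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>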


--
Aaron T. Myers
Software Engineer, Cloudera


On Wed, Feb 6, 2013 at 7:59 PM, Arun C Murthy a...@hortonworks.com wrote:

 Folks,

 I've created a release candidate (rc0) for hadoop-2.0.3-alpha that I would
 like to release.

 This release contains several major enhancements such as QJM for HDFS HA,
 multi-resource scheduling for YARN, YARN ResourceManager restart etc.
 Also YARN has achieved significant stability at scale (more details from
 Y! folks here: http://s.apache.org/VYO).

 The RC is available at:
 http://people.apache.org/~acmurthy/hadoop-2.0.3-alpha-rc0/
 The RC tag in svn is here:
 http://svn.apache.org/viewvc/hadoop/common/tags/release-2.0.3-alpha-rc0/

 The maven artifacts are available via repository.apache.org.

 Please try the release and vote; the vote will run for the usual 7 days.

 thanks,
 Arun



 --
 Arun C. Murthy
 Hortonworks Inc.
 http://hortonworks.com/





Re: [VOTE] Release hadoop-2.0.3-alpha

2013-02-08 Thread Konstantin Boudnik
The issue with the configuration is raised (and addressed) in 
  https://issues.apache.org/jira/browse/BIGTOP-841

Cos



[jira] [Resolved] (HDFS-4485) HDFS-347: DN should chmod socket path a+w

2013-02-08 Thread Aaron T. Myers (JIRA)

 [ https://issues.apache.org/jira/browse/HDFS-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aaron T. Myers resolved HDFS-4485.
--

Resolution: Fixed

I've just committed this to the HDFS-347 branch.

Thanks a lot for the contribution, Colin.

 HDFS-347: DN should chmod socket path a+w
 -

 Key: HDFS-4485
 URL: https://issues.apache.org/jira/browse/HDFS-4485
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: datanode
Reporter: Todd Lipcon
Assignee: Colin Patrick McCabe
Priority: Critical
 Attachments: HDFS-4485.001.patch, HDFS-4485.003.patch


 In cluster-testing HDFS-347, we found that in clusters where the MR job 
 doesn't run as the same user as HDFS, clients wouldn't use short circuit read 
 because of a 'permission denied' error connecting to the socket. It turns out 
 that, in order to connect to a socket, clients need write permissions on the 
 socket file.
 The DN should set these permissions automatically after it creates the socket.



Re: [VOTE] Release hadoop-2.0.3-alpha

2013-02-08 Thread lohit
+1
Deployed on more than 100 nodes.
Ran a 30TB teragen/terasort. Will run a few more over the weekend to test the
scheduler.
Things look stable. I do see a few failures, but I believe those are hardware
problems.

Thanks,
@lohitvijayarenu




-- 
Have a Nice Day!
Lohit