[jira] [Created] (HDFS-12139) liststatus returns incorrect pathSuffix for path of file

2017-07-13 Thread Yongjun Zhang (JIRA)
Yongjun Zhang created HDFS-12139:


 Summary: liststatus returns incorrect pathSuffix for path of file
 Key: HDFS-12139
 URL: https://issues.apache.org/jira/browse/HDFS-12139
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang


Per the following logs, we can see that liststatus returns the same pathSuffix 
"test.txt" for /tmp/yj/yj1 and /tmp/yj/yj1/test.txt, which is wrong: the 
pathSuffix for the latter should be empty.

[thost ~]$ hadoop fs -copyFromLocal test.txt /tmp/yj/yj1
[thost ~]$ curl 
"http://thost.x.y:14000/webhdfs/v1/tmp/yj/yj1?op=LISTSTATUS&user.name=tuser"
{"FileStatuses":{"FileStatus":[{"pathSuffix":"test.txt","type":"FILE","length":16,"owner":"tuser","group":"supergroup","permission":"644","accessTime":157684989,"modificationTime":157685286,"blockSize":134217728,"replication":3}]}}
[thost ~]$ curl 
"http://thost.x.y:14000/webhdfs/v1/tmp/yj/yj1/test.txt?op=LISTSTATUS&user.name=tuser"
{"FileStatuses":{"FileStatus":[{"pathSuffix":"test.txt","type":"FILE","length":16,"owner":"tuser","group":"supergroup","permission":"644","accessTime":157684989,"modificationTime":157685286,"blockSize":134217728,"replication":3}]}}
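The expected contract can be sketched as follows (illustrative Java only, not Hadoop source; the class and method names are made up): when LISTSTATUS is applied to a directory, each entry's pathSuffix is the child's name, but when it is applied directly to a file, the single entry denotes the listed path itself, so its pathSuffix should be empty.

```java
// Illustrative sketch only (not Hadoop source; names are hypothetical):
// the pathSuffix contract that the second LISTSTATUS response violates.
public class PathSuffixContract {

    // Expected pathSuffix for an entry returned by LISTSTATUS on listedPath.
    static String expectedPathSuffix(String listedPath, String entryFullPath) {
        if (listedPath.equals(entryFullPath)) {
            return "";  // listing a file: the entry is the listed path itself
        }
        // listing a directory: the suffix is the child's final path component
        return entryFullPath.substring(entryFullPath.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // directory listing: suffix is the child name
        check(expectedPathSuffix("/tmp/yj/yj1", "/tmp/yj/yj1/test.txt"), "test.txt");
        // file listing: suffix should be empty -- the behavior this bug breaks
        check(expectedPathSuffix("/tmp/yj/yj1/test.txt", "/tmp/yj/yj1/test.txt"), "");
        System.out.println("ok");
    }

    private static void check(String actual, String expected) {
        if (!actual.equals(expected)) {
            throw new AssertionError(actual + " != " + expected);
        }
    }
}
```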




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: 2.8 Release activities

2017-07-13 Thread Junping Du
Haohui,
I am waiting for a special security release on branch-2.8 to get out before 
resuming the release work for a production release of 2.8. You should be on the 
security alias and can ask for an update there.

Thanks,

Junping


From: Haohui Mai 
Sent: Wednesday, July 12, 2017 11:48 PM
To: Hadoop Common; yarn-...@hadoop.apache.org; Hdfs-dev
Subject: 2.8 Release activities

Hi,

Just curious -- what is the current status of the 2.8 release? It looks
like the release process has stalled for some time.

There are 5 or 6 blocker / critical bugs targeting the upcoming 2.8 release:

https://issues.apache.org/jira/browse/YARN-6654?jql=project%20in%20(HDFS%2C%20HADOOP%2C%20MAPREDUCE%2C%20YARN)%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Reopened%2C%20%22Patch%20Available%22)%20AND%20priority%20in%20(Blocker%2C%20Critical)%20AND%20%22Target%20Version%2Fs%22%20in%20(2.8.2%2C%202.8.3)

I think we can address them with a reasonable amount of effort.

We are interested in putting 2.8.x in production and it would be great to
have a maintenance Apache release for the 2.8 line.

I wonder, are there any concerns blocking the release from getting out? We
might be able to get some help internally to fix the issues in the 2.8 line. I
can also volunteer to be the release manager for 2.8.2 if more effort is
needed to coordinate and push the release out.

Regards,
Haohui






Re: About 2.7.4 Release

2017-07-13 Thread Konstantin Shvachko
Hi everybody.

We have been doing some internal testing of Hadoop 2.7.4. The testing is
going well.
Did not find any major issues on our workloads.
Used an internal tool called Dynamometer to check NameNode performance on
real cluster traces. Good.
Overall test cluster performance looks good.
Some more testing is still going on.

I plan to build an RC next week, if there are no objections.

Thanks,
--Konst

On Thu, Jun 15, 2017 at 4:42 PM, Konstantin Shvachko 
wrote:

> Hey guys.
>
> An update on 2.7.4 progress.
> We are down to 4 blockers. There is some work remaining on those.
> https://issues.apache.org/jira/browse/HDFS-11896?filter=12340814
> Would be good if people could follow up on review comments.
>
> I looked through nightly Jenkins build results for 2.7.4 both on Apache
> Jenkins and internal.
> Some tests fail intermittently, but there are no consistent failures. I filed
> HDFS-11985 to track some of them.
> https://issues.apache.org/jira/browse/HDFS-11985
> I do not currently consider these failures as blockers. LMK if some of
> them are.
>
> We started internal testing of branch-2.7 on one of our smallish (100+
> nodes) test clusters.
> Will update on the results.
>
> There is a plan to enable BigTop for 2.7.4 testing.
>
> Akira, Brahma thank you for setting up a wiki page for 2.7.4 release.
> Thank you everybody for contributing to this effort.
>
> Regards,
> --Konstantin
>
>
> On Tue, May 30, 2017 at 12:08 AM, Akira Ajisaka 
> wrote:
>
>> Sure.
>> If you want to edit the wiki, please tell me your ASF confluence account.
>>
>> -Akira
>>
>> On 2017/05/30 15:31, Rohith Sharma K S wrote:
>>
>>> A couple more JIRAs need to be back-ported for the 2.7.4 release. These will
>>> solve RM HA instability issues.
>>> https://issues.apache.org/jira/browse/YARN-5333
>>> https://issues.apache.org/jira/browse/YARN-5988
>>> https://issues.apache.org/jira/browse/YARN-6304
>>>
>>> I will raise JIRAs to back-port them.
>>>
>>> @Akira , could  you help to add these JIRAs into wiki?
>>>
>>> Thanks & Regards
>>> Rohith Sharma K S
>>>
>>> On 29 May 2017 at 12:19, Akira Ajisaka  wrote:
>>>
>>> Created a page for 2.7.4 release.
 https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+2.7.4

 If you want to edit this wiki, please ping me.

 Regards,
 Akira


 On 2017/05/23 4:42, Brahma Reddy Battula wrote:

 Hi Konstantin Shvachko
>
>
> How about creating a wiki page for 2.7.4 release status, like the ones for
> 2.8 and trunk at the following link?
>
>
> https://cwiki.apache.org/confluence/display/HADOOP
>
>
> 
> From: Konstantin Shvachko 
> Sent: Saturday, May 13, 2017 3:58 AM
> To: Akira Ajisaka
> Cc: Hadoop Common; Hdfs-dev; mapreduce-...@hadoop.apache.org;
> yarn-...@hadoop.apache.org
> Subject: Re: About 2.7.4 Release
>
> Latest update on the links and filters. Here is the correct link for
> the
> filter:
> https://issues.apache.org/jira/secure/IssueNavigator.jspa?
> requestId=12340814
>
> Also updated: https://s.apache.org/Dzg4
>
> Had to do some Jira debugging. Sorry for confusion.
>
> Thanks,
> --Konstantin
>
> On Wed, May 10, 2017 at 2:30 PM, Konstantin Shvachko <
> shv.had...@gmail.com>
> wrote:
>
> Hey Akira,
>
>>
>> I didn't have private filters. Most probably Jira caches something.
>> Your filter is in the right direction, but for some reason it lists
>> only
>> 22 issues, while mine has 29.
>> It misses e.g. YARN-5543 <https://issues.apache.org/jira/browse/YARN-5543>.
>>
>> Anyways, I created a Jira filter now "Hadoop 2.7.4 release blockers",
>> shared it with "everybody", and updated my link to point to that
>> filter.
>> So
>> you can use any of the three methods below to get the correct list:
>> 1. Go to https://s.apache.org/Dzg4
>> 2. Go to the filter via
>> https://issues.apache.org/jira/issues?filter=12340814
>>or by finding "Hadoop 2.7.4 release blockers" filter in the jira
>> 3. On Advanced issues search page paste this:
>> project in (HDFS, HADOOP, YARN, MAPREDUCE) AND labels =
>> release-blocker
>> AND "Target Version/s" = 2.7.4
>>
>> Hope this solves the confusion for which issues are included.
>> Please LMK if it doesn't, as it is important.
>>
>> Thanks,
>> --Konstantin
>>
>> On Tue, May 9, 2017 at 9:58 AM, Akira Ajisaka 
>> wrote:
>>
>> Hi Konstantin,
>>
>>>
>>> Thank you for volunteering as release manager!
>>>
>>> Actually the original link works fine: https://s.apache.org/Dzg4
>>>

I couldn't see the link. Maybe it is a private filter?
>>>
>>> Here is a link I generated: https://s.apache.org/ehKy

[jira] [Created] (HDFS-12138) Remove redundant 'public' modifiers from BlockCollection

2017-07-13 Thread Chen Liang (JIRA)
Chen Liang created HDFS-12138:
-

 Summary: Remove redundant 'public' modifiers from BlockCollection
 Key: HDFS-12138
 URL: https://issues.apache.org/jira/browse/HDFS-12138
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Chen Liang
Assignee: Chen Liang
Priority: Trivial


The 'public' modifiers of the methods in {{BlockCollection}} are redundant, 
since this is a public interface. Running checkstyle against it also complains 
about this.
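As a small illustration (hypothetical interfaces standing in for BlockCollection, not the real code): methods declared on a Java interface are implicitly public and abstract, so the spelled-out modifier changes nothing, which is exactly what checkstyle's RedundantModifier rule flags.

```java
import java.lang.reflect.Modifier;

// Hypothetical interfaces, not Hadoop's BlockCollection: an interface
// method is implicitly public and abstract, so spelling out 'public' is
// redundant (and flagged by checkstyle's RedundantModifier rule).
public class RedundantModifierDemo {
    interface WithRedundant {
        public int size();  // redundant 'public'
    }

    interface Preferred {
        int size();         // identical visibility, no redundant modifier
    }

    public static void main(String[] args) throws Exception {
        int redundant = WithRedundant.class.getMethod("size").getModifiers();
        int preferred = Preferred.class.getMethod("size").getModifiers();
        // both methods are public and abstract either way
        System.out.println(Modifier.isPublic(redundant) && Modifier.isPublic(preferred));
        // prints "true"
    }
}
```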






Apache Hadoop qbt Report: trunk+JDK8 on Linux/ppc64le

2017-07-13 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/

[Jul 12, 2017 8:52:56 PM] (jlowe) YARN-6797. TimelineWriter does not fully 
consume the POST response.
[Jul 12, 2017 9:15:04 PM] (szetszwo) HDFS-6874. Add GETFILEBLOCKLOCATIONS 
operation to HttpFS.  Contributed
[Jul 12, 2017 10:40:45 PM] (xgong) YARN-6689. PlacementRule should be 
configurable. (Jonathan Hung via
[Jul 12, 2017 11:26:19 PM] (xyao) HDFS-11502. Datanode UI should display 
hostname based on JMX bean
[Jul 13, 2017 11:18:29 AM] (sunilg) YARN-5731. Preemption calculation is not 
accurate when reserved
[Jul 13, 2017 12:41:43 PM] (iwasakims) HADOOP-14646.




-1 overall


The following subsystems voted -1:
compile mvninstall unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc javac


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.hdfs.server.namenode.TestDecommissioningStatus 
   hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewer 
   hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations 
   hadoop.hdfs.server.balancer.TestBalancerRPCDelay 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure 
   hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation 
   hadoop.hdfs.TestDFSStripedInputStreamWithRandomECPolicy 
   hadoop.yarn.server.nodemanager.recovery.TestNMLeveldbStateStoreService 
   hadoop.yarn.server.nodemanager.TestNodeManagerShutdown 
   hadoop.yarn.server.timeline.TestRollingLevelDB 
   hadoop.yarn.server.timeline.TestTimelineDataManager 
   hadoop.yarn.server.timeline.TestLeveldbTimelineStore 
   hadoop.yarn.server.timeline.recovery.TestLeveldbTimelineStateStore 
   hadoop.yarn.server.timeline.TestRollingLevelDBTimelineStore 
   
hadoop.yarn.server.applicationhistoryservice.TestApplicationHistoryServer 
   hadoop.yarn.server.resourcemanager.TestRMEmbeddedElector 
   hadoop.yarn.server.resourcemanager.recovery.TestLeveldbRMStateStore 
   hadoop.yarn.server.resourcemanager.TestRMRestart 
   hadoop.yarn.server.TestDiskFailures 
   hadoop.yarn.server.TestMiniYarnClusterNodeUtilization 
   hadoop.yarn.server.TestContainerManagerSecurity 
   hadoop.yarn.client.api.impl.TestAMRMClient 
   hadoop.yarn.server.timeline.TestLevelDBCacheTimelineStore 
   hadoop.yarn.server.timeline.TestOverrideTimelineStoreYarnClient 
   hadoop.yarn.server.timeline.TestEntityGroupFSTimelineStore 
   hadoop.yarn.applications.distributedshell.TestDistributedShell 
   hadoop.mapred.TestShuffleHandler 
   hadoop.mapreduce.v2.hs.TestHistoryServerLeveldbStateStoreService 

Timed out junit tests :

   org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache 
   org.apache.hadoop.yarn.server.resourcemanager.TestRMStoreCommands 
   
org.apache.hadoop.yarn.server.resourcemanager.TestReservationSystemWithRMHA 
   
org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA 
   
org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA 
   org.apache.hadoop.yarn.server.resourcemanager.TestRMHAForNodeLabels 
  

   mvninstall:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/artifact/out/patch-mvninstall-root.txt
  [620K]

   compile:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/artifact/out/patch-compile-root.txt
  [20K]

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/artifact/out/patch-compile-root.txt
  [20K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/artifact/out/patch-compile-root.txt
  [20K]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/artifact/out/patch-unit-hadoop-assemblies.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [680K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-nodemanager.txt
  [56K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-applicationhistoryservice.txt
  [64K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-resourcemanager.txt
  [80K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/374/artifact/out/patch-unit-hadoop-yarn-project_hadoop-yarn_hadoop-yarn-server_hadoop-yarn-server-tests.txt
  

[jira] [Created] (HDFS-12137) DN dataset lock should be fair

2017-07-13 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-12137:
--

 Summary: DN dataset lock should be fair
 Key: HDFS-12137
 URL: https://issues.apache.org/jira/browse/HDFS-12137
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.8.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical


The dataset lock is very highly contended.  The unfair nature can be especially 
harmful to the heartbeat handling.  Under high loads, partially exposed by 
HDFS-12136 introducing disk i/o within the lock, the heartbeat-handling thread 
may process commands so slowly due to the contention that the node becomes 
stale or is falsely declared dead.  The unfair lock is not helping and appears 
to be causing frequent starvation under load.
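For context, the fairness knob in question on java.util.concurrent.locks.ReentrantLock can be sketched as follows (illustrative only, not the actual FsDatasetImpl change):

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch only, not the actual datanode code: ReentrantLock's constructor
// takes a fairness flag. The default (unfair) lock allows barging, which
// can starve long-waiting threads such as the heartbeat handler; a fair
// lock hands the lock to the longest-waiting thread, trading some raw
// throughput for bounded waiting.
public class FairLockSketch {
    public static void main(String[] args) {
        ReentrantLock unfair = new ReentrantLock();    // default: unfair, barging allowed
        ReentrantLock fair = new ReentrantLock(true);  // fair: FIFO-ish hand-off
        System.out.println(unfair.isFair() + " " + fair.isFair()); // prints "false true"
    }
}
```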






[jira] [Created] (HDFS-12136) BlockSender performance regression due to volume scanner edge case

2017-07-13 Thread Daryn Sharp (JIRA)
Daryn Sharp created HDFS-12136:
--

 Summary: BlockSender performance regression due to volume scanner 
edge case
 Key: HDFS-12136
 URL: https://issues.apache.org/jira/browse/HDFS-12136
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Affects Versions: 2.8.0
Reporter: Daryn Sharp
Assignee: Daryn Sharp
Priority: Critical


HDFS-11160 attempted to fix a volume scan race for a file appended mid-scan by 
reading the last checksum of finalized blocks within the {{BlockSender}} ctor.  
Unfortunately it holds the exclusive dataset lock while opening and reading the 
metafile multiple times, so block sender instantiation becomes serialized.

Performance completely collapses under heavy disk i/o utilization or high 
xceiver activity, e.g. lost-node replication, balancing, or decommissioning.  
The xceiver threads congest creating block senders and impair the heartbeat 
processing that is contending for the same lock.  Combined with other lock 
contention issues, pipelines break and nodes sporadically go dead.
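The general pattern at issue can be sketched with made-up names (this is not the actual fix): resolve the in-memory state under the dataset lock, then perform the slow disk read after releasing it, so the critical section stays short and other threads are not serialized behind I/O.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative pattern only (hypothetical names, not Hadoop code): keep
// disk I/O outside the shared lock so lock hold time stays short.
public class ShortCriticalSection {
    private final ReentrantLock datasetLock = new ReentrantLock();
    private final Map<String, String> blockToMetaPath = new HashMap<>();

    ShortCriticalSection() {
        blockToMetaPath.put("blk_1", "/data/current/blk_1.meta"); // hypothetical path
    }

    String lastChecksumOf(String blockId) {
        String metaPath;
        datasetLock.lock();
        try {
            // under the lock: only the cheap in-memory lookup
            metaPath = blockToMetaPath.get(blockId);
        } finally {
            datasetLock.unlock();
        }
        // outside the lock: the (potentially slow) disk read, simulated here
        return readChecksumFromDisk(metaPath);
    }

    private String readChecksumFromDisk(String metaPath) {
        return "checksum-of:" + metaPath;   // stand-in for real file I/O
    }

    public static void main(String[] args) {
        System.out.println(new ShortCriticalSection().lastChecksumOf("blk_1"));
    }
}
```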






Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-07-13 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/463/

[Jul 12, 2017 9:37:39 AM] (stevel) HADOOP-14581. Restrict setOwner to list of 
user when security is enabled
[Jul 12, 2017 10:38:32 AM] (aajisaka) YARN-6809. Fix typo in 
ResourceManagerHA.md. Contributed by Yeliang
[Jul 12, 2017 8:52:56 PM] (jlowe) YARN-6797. TimelineWriter does not fully 
consume the POST response.
[Jul 12, 2017 9:15:04 PM] (szetszwo) HDFS-6874. Add GETFILEBLOCKLOCATIONS 
operation to HttpFS.  Contributed
[Jul 12, 2017 10:40:45 PM] (xgong) YARN-6689. PlacementRule should be 
configurable. (Jonathan Hung via
[Jul 12, 2017 11:26:19 PM] (xyao) HDFS-11502. Datanode UI should display 
hostname based on JMX bean




-1 overall


The following subsystems voted -1:
compile findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs-client 
   Possible exposure of partially initialized object in 
org.apache.hadoop.hdfs.DFSClient.initThreadsNumForStripedReads(int) At 
DFSClient.java:object in 
org.apache.hadoop.hdfs.DFSClient.initThreadsNumForStripedReads(int) At 
DFSClient.java:[line 2888] 
   org.apache.hadoop.hdfs.server.protocol.SlowDiskReports.equals(Object) 
makes inefficient use of keySet iterator instead of entrySet iterator At 
SlowDiskReports.java:keySet iterator instead of entrySet iterator At 
SlowDiskReports.java:[line 105] 

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus() due to 
return value of called method Dereferenced at 
JournalNode.java:org.apache.hadoop.hdfs.qjournal.server.JournalNode.getJournalsStatus()
 due to return value of called method Dereferenced at JournalNode.java:[line 
302] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setClusterId(String)
 unconditionally sets the field clusterId At HdfsServerConstants.java:clusterId 
At HdfsServerConstants.java:[line 193] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForce(int)
 unconditionally sets the field force At HdfsServerConstants.java:force At 
HdfsServerConstants.java:[line 217] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setForceFormat(boolean)
 unconditionally sets the field isForceFormat At 
HdfsServerConstants.java:isForceFormat At HdfsServerConstants.java:[line 229] 
   
org.apache.hadoop.hdfs.server.common.HdfsServerConstants$StartupOption.setInteractiveFormat(boolean)
 unconditionally sets the field isInteractiveFormat At 
HdfsServerConstants.java:isInteractiveFormat At HdfsServerConstants.java:[line 
237] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File, File, 
int, HardLink, boolean, File, List) due to return value of called method 
Dereferenced at 
DataStorage.java:org.apache.hadoop.hdfs.server.datanode.DataStorage.linkBlocksHelper(File,
 File, int, HardLink, boolean, File, List) due to return value of called method 
Dereferenced at DataStorage.java:[line 1339] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method Dereferenced at 
NNStorageRetentionManager.java:org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager.purgeOldLegacyOIVImages(String,
 long) due to return value of called method Dereferenced at 
NNStorageRetentionManager.java:[line 258] 
   Possible null pointer dereference in 
org.apache.hadoop.hdfs.server.namenode.NNUpgradeUtil$1.visitFile(Path, 
BasicFileAttributes) due to return value of called method Dereferenced at 
NNUpgradeUtil.java:org.apache.hadoop.hdfs.server.namenode.NNUpgradeUtil$1.visitFile(Path,
 BasicFileAttributes) due to return value of called method Dereferenced at 
NNUpgradeUtil.java:[line 133] 
   Useless condition:argv.length >= 1 at this point At DFSAdmin.java:[line 
2085] 
   Useless condition:numBlocks == -1 at this point At 
ImageLoaderCurrent.java:[line 727] 

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
   Useless object stored in variable removedNullContainers of method 
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.removeOrTrackCompletedContainersFromContext(List)
 At NodeStatusUpdaterImpl.java:removedNullContainers of method 
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.removeOrTrackCompletedContainersFromContext(List)
 At NodeStatusUpdaterImpl.java:[line 642] 
   

[jira] [Created] (HDFS-12135) Invalid -z option used for nc in org.apache.hadoop.ha.SshFenceByTcpPort under CentOS 7

2017-07-13 Thread Luigi Di Fraia (JIRA)
Luigi Di Fraia created HDFS-12135:
-

 Summary: Invalid -z option used for nc in 
org.apache.hadoop.ha.SshFenceByTcpPort under CentOS 7
 Key: HDFS-12135
 URL: https://issues.apache.org/jira/browse/HDFS-12135
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: 2.8.0
 Environment: [hadoop@namenode01 ~]$ cat /etc/redhat-release
CentOS Linux release 7.3.1611 (Core)
[hadoop@namenode01 ~]$ uname -a
Linux namenode01 3.10.0-514.10.2.el7.x86_64 #1 SMP Fri Mar 3 00:04:05 UTC 2017 
x86_64 x86_64 x86_64 GNU/Linux
[hadoop@namenode01 ~]$ java -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
Reporter: Luigi Di Fraia


During a failover scenario caused by manually killing the active NameNode 
process, after fuser had failed in the first instance:

2017-07-13 15:59:36,851 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: 
SSH_MSG_NEWKEYS sent
2017-07-13 15:59:36,851 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: 
SSH_MSG_NEWKEYS received
2017-07-13 15:59:36,860 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: 
SSH_MSG_SERVICE_REQUEST sent
2017-07-13 15:59:36,861 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: 
SSH_MSG_SERVICE_ACCEPT received
2017-07-13 15:59:36,871 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: 
Authentications that can continue: 
gssapi-with-mic,publickey,keyboard-interactive,password
2017-07-13 15:59:36,871 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Next 
authentication method: gssapi-with-mic
2017-07-13 15:59:36,876 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: 
Authentications that can continue: publickey,keyboard-interactive,password
2017-07-13 15:59:36,876 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: Next 
authentication method: publickey
2017-07-13 15:59:37,048 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: 
Authentication succeeded (publickey).
2017-07-13 15:59:37,049 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Connected 
to namenode02
2017-07-13 15:59:37,049 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Looking 
for process running on port 8020
2017-07-13 15:59:37,502 INFO org.apache.hadoop.ha.SshFenceByTcpPort: 
Indeterminate response from trying to kill service. Verifying whether it is 
running using nc...
2017-07-13 15:59:37,556 WARN org.apache.hadoop.ha.SshFenceByTcpPort: nc -z 
namenode02 8020 via ssh: nc: invalid option -- 'z'
2017-07-13 15:59:37,556 WARN org.apache.hadoop.ha.SshFenceByTcpPort: nc -z 
namenode02 8020 via ssh: Ncat: Try `--help' or man(1) ncat for more 
information, usage options and help. QUITTING.
2017-07-13 15:59:37,557 INFO org.apache.hadoop.ha.SshFenceByTcpPort: Verified 
that the service is down.
2017-07-13 15:59:37,557 INFO org.apache.hadoop.ha.SshFenceByTcpPort.jsch: 
Disconnecting from namenode02 port 22

This was raised with HDFS-11308 previously, which was closed as a duplicate of 
HDFS-3618; the latter does not seem to have been resolved itself (PATCH AVAILABLE).

Also, the use of fuser is mentioned in the documentation 
(https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html)
 but the use of nc (as a fallback?) is not mentioned.
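For context, `nc -z host port` is just a TCP connect probe, which Ncat (the default nc on CentOS 7) does not accept. The same check can be expressed directly; here is an illustrative Java sketch (not Hadoop's fencing code):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Illustrative only: the semantics of 'nc -z' is a plain TCP connect
// probe -- succeed if something accepts the connection, fail otherwise.
public class PortProbe {
    static boolean isListening(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;   // connect succeeded: something is listening
        } catch (IOException e) {
            return false;  // refused or timed out: treat as not listening
        }
    }

    public static void main(String[] args) {
        // port 1 on localhost is almost certainly closed
        System.out.println(isListening("127.0.0.1", 1, 500));
    }
}
```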






[jira] [Created] (HDFS-12134) libhdfs++: Add a synchronization interface for the GSSAPI

2017-07-13 Thread James Clampffer (JIRA)
James Clampffer created HDFS-12134:
--

 Summary: libhdfs++: Add a synchronization interface for the GSSAPI
 Key: HDFS-12134
 URL: https://issues.apache.org/jira/browse/HDFS-12134
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: James Clampffer
Assignee: James Clampffer


Bits of the GSSAPI that Cyrus Sasl uses aren't thread safe.  There needs to be 
a way for a client application to share a lock with this library in order to 
prevent race conditions.  It can be done using event callbacks through the C 
API but we can provide something more robust (RAII) in the C++ API.

Proposed: a client-supplied lock, pretty much the C++17 Lockable concept, with 
a default used if one isn't provided.  This would be scoped at the process level 
since it's unlikely that multiple instances of libgssapi exist unless someone 
puts some effort in with dlopen/dlsym.

{code}
class LockProvider
{
public:
  virtual ~LockProvider() {}
  // allow the client application to deny access to the lock
  virtual bool try_lock() = 0;
  virtual void unlock() = 0;
};
{code}








[jira] [Created] (HDFS-12133) Correct ContentSummaryComputationContext Logger class name.

2017-07-13 Thread Surendra Singh Lilhore (JIRA)
Surendra Singh Lilhore created HDFS-12133:
-

 Summary: Correct ContentSummaryComputationContext Logger class 
name.
 Key: HDFS-12133
 URL: https://issues.apache.org/jira/browse/HDFS-12133
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 3.0.0-alpha4
Reporter: Surendra Singh Lilhore
Assignee: Surendra Singh Lilhore
Priority: Minor


Currently it is
{code}
public static final Log LOG = LogFactory.getLog(INode.class);
{code}
It should use {{ContentSummaryComputationContext.class}} instead.
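A tiny illustration of why the class argument matters (java.util.logging here for self-containment; Hadoop itself uses commons-logging, and these nested classes are stand-ins):

```java
import java.util.logging.Logger;

// Stand-in classes, not Hadoop's: a logger created with the wrong class
// misattributes every log line to that class's name.
public class LoggerNameDemo {
    static class INode {}

    static class ContentSummaryComputationContext {
        // wrong: log lines appear under INode's name
        static final Logger WRONG = Logger.getLogger(INode.class.getName());
        // right: named after the class that owns it
        static final Logger RIGHT =
            Logger.getLogger(ContentSummaryComputationContext.class.getName());
    }

    public static void main(String[] args) {
        System.out.println(ContentSummaryComputationContext.WRONG.getName());
        System.out.println(ContentSummaryComputationContext.RIGHT.getName());
    }
}
```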






[jira] [Created] (HDFS-12132) Both two NameNodes become Standby because the ZKFC exception

2017-07-13 Thread Yang Jiandan (JIRA)
Yang Jiandan created HDFS-12132:
---

 Summary: Both two NameNodes become Standby because the ZKFC 
exception
 Key: HDFS-12132
 URL: https://issues.apache.org/jira/browse/HDFS-12132
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: auto-failover
Affects Versions: 2.8.1
Reporter: Yang Jiandan


Both NameNodes became Standby because of a ZKFC exception while 
rolling-upgrading Hadoop from Hadoop-2.6.5 to Hadoop-2.8.0; this left HDFS 
unavailable. The exception-handling logic in ZKFC seems to be problematic: 
ZKFC should guarantee that there is an active NameNode.

Before upgrading, the cluster was deployed with HA: NN1 was active and NN2 was 
standby.
The configuration before upgrading is as follows:

{code:java}
dfs.namenode.rpc-address.nameservice.nn1 nn1: 8020
dfs.namenode.rpc-address.nameservice.nn2 nn2: 8020
{code}

After upgrading, the configuration for the separate RPC services was added:
{code:java}
dfs.namenode.rpc-address.nameservice.nn1 nn1: 8020
dfs.namenode.rpc-address.nameservice.nn2 nn2: 8020
dfs.namenode.servicerpc-address.nameservice.nn1 nn1: 8021
dfs.namenode.servicerpc-address.nameservice.nn2 nn2: 8021
dfs.namenode.lifeline.rpc-address.nameservice.nn1 nn1: 8022
dfs.namenode.lifeline.rpc-address.nameservice.nn2 nn2: 8022
{code}

The upgrade steps are as follows:
1. Upgrade NN2: restart NameNode process on NN2
2. Upgrade NN1: restart the NameNode process on NN1, then NN2 becomes active, 
NN1 is standby
3. Restart both ZKFC on NN1 and NN2

After restarting ZKFC, both ZKFCs threw the same exception and both NameNodes 
became Standby. The exception log is:

{code:java}
2017-07-11 18:49:44,311 WARN [main-EventThread] 
org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of 
election
java.lang.RuntimeException: Mismatched address stored in ZK for NameNode at 
nn2/xx.xxx.xx.xxx:8022: Stored protobuf was nameserviceId: "nameservice"
namenodeId: "nn2"
hostname: "nn2_hostname"
port: 8020
zkfcPort: 8019
, address from our own configuration for this NameNode was 
nn2_hostname/xx.xxx.xx.xxx:8021
at 
org.apache.hadoop.hdfs.tools.DFSZKFailoverController.dataToTarget(DFSZKFailoverController.java:87)
at 
org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:506)
at 
org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:61)
at 
org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:895)
at 
org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:985)
at 
org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:882)
at 
org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
at 
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2017-07-11 18:49:44,311 INFO [main-EventThread] 
org.apache.hadoop.ha.ActiveStandbyElector: Trying to re-establish ZK session
2017-07-11 18:49:44,311 INFO [main-EventThread] org.apache.zookeeper.ZooKeeper: 
Session: 0x15c3ada0ec319aa closed
{code}





