[jira] [Commented] (HDFS-14118) Use DNS to resolve Namenodes and Routers

2019-02-22 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775817#comment-16775817
 ] 

Yongjun Zhang commented on HDFS-14118:
--

Hi [~fengnanli],

It seems we have quite a few flaky tests; I manually re-ran all of the failed 
tests and they passed. I also saw one more place to fix in hdfs-default.xml in 
rev 23, so I'm uploading rev 24 instead of asking you to iterate again. The 
change in rev 24 is very minor; it just makes sure the version we are committing 
is the same as the last version uploaded here.

Hi [~elgoiri], I saw your comment 
[here|https://issues.apache.org/jira/browse/HDFS-14118?focusedCommentId=16761094=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16761094],
 so I am taking it as a +1 from you, and I will go ahead and commit it soon.

Thanks.

> Use DNS to resolve Namenodes and Routers
> 
>
> Key: HDFS-14118
> URL: https://issues.apache.org/jira/browse/HDFS-14118
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: DNS testing log, HDFS design doc_ Single domain name for 
> clients - Google Docs-1.pdf, HDFS design doc_ Single domain name for clients 
> - Google Docs.pdf, HDFS-14118.001.patch, HDFS-14118.002.patch, 
> HDFS-14118.003.patch, HDFS-14118.004.patch, HDFS-14118.005.patch, 
> HDFS-14118.006.patch, HDFS-14118.007.patch, HDFS-14118.008.patch, 
> HDFS-14118.009.patch, HDFS-14118.010.patch, HDFS-14118.011.patch, 
> HDFS-14118.012.patch, HDFS-14118.013.patch, HDFS-14118.014.patch, 
> HDFS-14118.015.patch, HDFS-14118.016.patch, HDFS-14118.017.patch, 
> HDFS-14118.018.patch, HDFS-14118.019.patch, HDFS-14118.020.patch, 
> HDFS-14118.021.patch, HDFS-14118.022.patch, HDFS-14118.023.patch, 
> HDFS-14118.patch
>
>
> Clients need to know about the routers in order to talk to the HDFS cluster 
> (obviously), and any update to the set of routers (adding/removing) currently 
> requires a change on every client, which is a painful process.
> DNS can be used here to resolve the single domain name that clients know into 
> the list of routers in the current config. However, DNS cannot restrict 
> resolution to only the healthy routers based on certain health thresholds.
> There are a few ways this can be solved. One way is to have a separate script 
> regularly check the status of each router and update the DNS records when a 
> router fails the health thresholds; security has to be carefully considered 
> for this approach. Another way is to have the client do the normal 
> connecting/failover after it gets the list of routers, which requires changing 
> the current failover proxy provider.
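> For illustration only (not the attached patch): resolving a single configured 
> domain name into the current set of router addresses can be done with plain 
> JDK DNS lookups; the domain name and port below are made-up example values.
> {code:java}
> import java.net.InetAddress;
> import java.net.InetSocketAddress;
> import java.net.UnknownHostException;
> import java.util.ArrayList;
> import java.util.List;
> 
> public class RouterDnsResolveSketch {
>   // Resolve the domain name to every address behind it and pair each
>   // address with the router RPC port.
>   static List<InetSocketAddress> resolveRouters(String domain, int rpcPort)
>       throws UnknownHostException {
>     List<InetSocketAddress> routers = new ArrayList<>();
>     for (InetAddress addr : InetAddress.getAllByName(domain)) {
>       routers.add(new InetSocketAddress(addr, rpcPort));
>     }
>     return routers;
>   }
> 
>   public static void main(String[] args) throws UnknownHostException {
>     // Clients only need to know the single domain name; DNS returns whatever
>     // routers are currently registered behind it.
>     System.out.println(resolveRouters("routers.example.com", 8888));
>   }
> }
> {code}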



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14118) Use DNS to resolve Namenodes and Routers

2019-02-22 Thread Yongjun Zhang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang updated HDFS-14118:
-
Attachment: HDFS-14118.024.patch

> Use DNS to resolve Namenodes and Routers
> 
>
> Key: HDFS-14118
> URL: https://issues.apache.org/jira/browse/HDFS-14118
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: DNS testing log, HDFS design doc_ Single domain name for 
> clients - Google Docs-1.pdf, HDFS design doc_ Single domain name for clients 
> - Google Docs.pdf, HDFS-14118.001.patch, HDFS-14118.002.patch, 
> HDFS-14118.003.patch, HDFS-14118.004.patch, HDFS-14118.005.patch, 
> HDFS-14118.006.patch, HDFS-14118.007.patch, HDFS-14118.008.patch, 
> HDFS-14118.009.patch, HDFS-14118.010.patch, HDFS-14118.011.patch, 
> HDFS-14118.012.patch, HDFS-14118.013.patch, HDFS-14118.014.patch, 
> HDFS-14118.015.patch, HDFS-14118.016.patch, HDFS-14118.017.patch, 
> HDFS-14118.018.patch, HDFS-14118.019.patch, HDFS-14118.020.patch, 
> HDFS-14118.021.patch, HDFS-14118.022.patch, HDFS-14118.023.patch, 
> HDFS-14118.024.patch, HDFS-14118.patch
>
>
> Clients need to know about the routers in order to talk to the HDFS cluster 
> (obviously), and any update to the set of routers (adding/removing) currently 
> requires a change on every client, which is a painful process.
> DNS can be used here to resolve the single domain name that clients know into 
> the list of routers in the current config. However, DNS cannot restrict 
> resolution to only the healthy routers based on certain health thresholds.
> There are a few ways this can be solved. One way is to have a separate script 
> regularly check the status of each router and update the DNS records when a 
> router fails the health thresholds; security has to be carefully considered 
> for this approach. Another way is to have the client do the normal 
> connecting/failover after it gets the list of routers, which requires changing 
> the current failover proxy provider.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14314) fullBlockReportLeaseId should be reset after

2019-02-22 Thread star (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

star updated HDFS-14314:

Description: 
  Since HDFS-7923, to rate-limit DN block reports, a DN asks the active NN for a 
full block report lease id before sending a full block report, and then sends the 
report together with that lease id. If the lease id is invalid, the NN rejects the 
full block report and logs "not in the pending set".

  Consider the case where a DN is doing full block reporting while the NN is 
restarted. The DN may later send a full block report with a lease id acquired from 
the previous NN instance, which is invalid to the new NN instance. Although the DN 
recognizes the new NN instance via heartbeat and re-registers itself, it does not 
reset the lease id obtained from the previous instance.

  The issue may cause DNs to temporarily go dead, making it unsafe to restart the 
NN, especially in Hadoop clusters with a large number of DNs. HDFS-12914 reported 
the issue without any clue as to why it occurred, and it remained unsolved.

   To make it clear, look at the code below, taken from the offerService method of 
class BPServiceActor (some code is elided to focus on the current issue). 
fullBlockReportLeaseId is a local variable that holds the lease id obtained from 
the NN. When the NN restarts, the blockReport call throws an exception, which is 
caught by the catch block in the while loop, so fullBlockReportLeaseId is not 
reset to 0. After the NN has restarted, the DN sends a full block report that is 
rejected by the new NN instance, and the DN will not send another full block 
report until the next scheduled one, about an hour later.

  The solution is simple: reset fullBlockReportLeaseId to 0 after any exception, 
or after re-registering with the NN, so that the DN asks the new NN instance for a 
valid fullBlockReportLeaseId.
{code:java}
private void offerService() throws Exception {

  long fullBlockReportLeaseId = 0;

  //
  // Now loop for a long time
  //
  while (shouldRun()) {
try {
  final long startTime = scheduler.monotonicNow();

  //
  // Every so often, send heartbeat or block-report
  //
  final boolean sendHeartbeat = scheduler.isHeartbeatDue(startTime);
  HeartbeatResponse resp = null;
  if (sendHeartbeat) {
  
boolean requestBlockReportLease = (fullBlockReportLeaseId == 0) &&
scheduler.isBlockReportDue(startTime);
scheduler.scheduleNextHeartbeat();
if (!dn.areHeartbeatsDisabledForTests()) {
  resp = sendHeartBeat(requestBlockReportLease);
  assert resp != null;
  if (resp.getFullBlockReportLeaseId() != 0) {
if (fullBlockReportLeaseId != 0) {
  LOG.warn(nnAddr + " sent back a full block report lease " +
  "ID of 0x" +
  Long.toHexString(resp.getFullBlockReportLeaseId()) +
  ", but we already have a lease ID of 0x" +
  Long.toHexString(fullBlockReportLeaseId) + ". " +
  "Overwriting old lease ID.");
}
fullBlockReportLeaseId = resp.getFullBlockReportLeaseId();
  }
 
}
  }
   
 
  if ((fullBlockReportLeaseId != 0) || forceFullBr) {
//Exception occurred here when NN restarting
cmds = blockReport(fullBlockReportLeaseId);
fullBlockReportLeaseId = 0;
  }
  
} catch(RemoteException re) {
  
  } // while (shouldRun())
} // offerService{code}
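
A minimal, self-contained sketch of the proposed behavior (not the actual 
BPServiceActor code; NameNodeStub is a made-up stand-in for the NN RPC interface): 
after any failure of the block report call, the cached lease id is cleared so the 
next heartbeat requests a fresh one from the (possibly new) NN instance.
{code:java}
public class LeaseResetSketch {
  private long fullBlockReportLeaseId = 0;

  /** Hypothetical stand-in for the NameNode RPC interface. */
  interface NameNodeStub {
    long requestLease();
    void blockReport(long leaseId) throws Exception;
  }

  void offerServiceOnce(NameNodeStub nn) {
    try {
      if (fullBlockReportLeaseId == 0) {
        // Normally piggy-backed on the heartbeat.
        fullBlockReportLeaseId = nn.requestLease();
      }
      // Throws if the NN was restarted and no longer knows this lease.
      nn.blockReport(fullBlockReportLeaseId);
      fullBlockReportLeaseId = 0; // reset after a successful report
    } catch (Exception e) {
      // Proposed fix: also reset on failure so a stale lease from the
      // previous NN instance is never reused against the new instance.
      fullBlockReportLeaseId = 0;
    }
  }
}
{code}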
 

  was:
  Since HDFS-7923, to rate-limit DN block reports, a DN asks the active NN for a 
full block report lease id before sending a full block report, and then sends the 
report together with that lease id. If the lease id is invalid, the NN rejects the 
full block report and logs "not in the pending set".

  Consider the case where a DN is doing full block reporting while the NN is 
restarted. The DN may later send a full block report with a lease id acquired from 
the previous NN instance, which is invalid to the new NN instance. Although the DN 
recognizes the new NN instance via heartbeat and re-registers itself, it does not 
reset the lease id obtained from the previous instance.

  The issue may cause DNs to temporarily go dead, making it unsafe to restart the 
NN, especially in Hadoop clusters with a large number of DNs. HDFS-12914 reported 
the issue without any clue as to why it occurred, and it remained unsolved.

   To make it clear, look at code below. We take it from method 
offerService of class BPServiceActor. We eliminate some code to focus on 
current issue. fullBlockReportLeaseId is a local variable. Exceptions will 
occur at blockReport call when NN restarting, which will be caught by catch 
block in while loop. Thus fullBlockReportLeaseId will not be set to 0. After NN 
restarted, DN will send full block report which will be rejected by the new NN 
instance. DN will never send full block report until the next full block 

[jira] [Updated] (HDFS-14314) fullBlockReportLeaseId should be reset after

2019-02-22 Thread star (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

star updated HDFS-14314:

Description: 
  Since HDFS-7923, to rate-limit DN block reports, a DN asks the active NN for a 
full block report lease id before sending a full block report, and then sends the 
report together with that lease id. If the lease id is invalid, the NN rejects the 
full block report and logs "not in the pending set".

  Consider the case where a DN is doing full block reporting while the NN is 
restarted. The DN may later send a full block report with a lease id acquired from 
the previous NN instance, which is invalid to the new NN instance. Although the DN 
recognizes the new NN instance via heartbeat and re-registers itself, it does not 
reset the lease id obtained from the previous instance.

  The issue may cause DNs to temporarily go dead, making it unsafe to restart the 
NN, especially in Hadoop clusters with a large number of DNs. HDFS-12914 reported 
the issue without any clue as to why it occurred, and it remained unsolved.

   To make it clear, look at the code below, taken from the offerService method of 
class BPServiceActor (some code is elided to focus on the current issue). 
fullBlockReportLeaseId is a local variable. When the NN restarts, the blockReport 
call throws an exception, which is caught by the catch block in the while loop, so 
fullBlockReportLeaseId is not reset to 0. After the NN has restarted, the DN sends 
a full block report that is rejected by the new NN instance, and the DN will not 
send another full block report until the next scheduled one, about an hour later.

  The solution is simple: reset fullBlockReportLeaseId to 0 after any exception, 
or after re-registering with the NN, so that the DN asks the new NN instance for a 
valid fullBlockReportLeaseId.
{code:java}
private void offerService() throws Exception {

  long fullBlockReportLeaseId = 0;

  //
  // Now loop for a long time
  //
  while (shouldRun()) {
try {
  final long startTime = scheduler.monotonicNow();

  //
  // Every so often, send heartbeat or block-report
  //
  final boolean sendHeartbeat = scheduler.isHeartbeatDue(startTime);
  HeartbeatResponse resp = null;
  if (sendHeartbeat) {
  
boolean requestBlockReportLease = (fullBlockReportLeaseId == 0) &&
scheduler.isBlockReportDue(startTime);
scheduler.scheduleNextHeartbeat();
if (!dn.areHeartbeatsDisabledForTests()) {
  resp = sendHeartBeat(requestBlockReportLease);
  assert resp != null;
  if (resp.getFullBlockReportLeaseId() != 0) {
if (fullBlockReportLeaseId != 0) {
  LOG.warn(nnAddr + " sent back a full block report lease " +
  "ID of 0x" +
  Long.toHexString(resp.getFullBlockReportLeaseId()) +
  ", but we already have a lease ID of 0x" +
  Long.toHexString(fullBlockReportLeaseId) + ". " +
  "Overwriting old lease ID.");
}
fullBlockReportLeaseId = resp.getFullBlockReportLeaseId();
  }
 
}
  }
   
 
  if ((fullBlockReportLeaseId != 0) || forceFullBr) {
//Exception occurred here when NN restarting
cmds = blockReport(fullBlockReportLeaseId);
fullBlockReportLeaseId = 0;
  }
  
} catch(RemoteException re) {
  
  } // while (shouldRun())
} // offerService{code}
 

  was:
  Since HDFS-7923, to rate-limit DN block reports, a DN asks the active NN for a 
full block report lease id before sending a full block report, and then sends the 
report together with that lease id. If the lease id is invalid, the NN rejects the 
full block report and logs "not in the pending set".

  Consider the case where a DN is doing full block reporting while the NN is 
restarted. The DN may later send a full block report with a lease id acquired from 
the previous NN instance, which is invalid to the new NN instance. Although the DN 
recognizes the new NN instance via heartbeat and re-registers itself, it does not 
reset the lease id obtained from the previous instance.

  The issue may cause DNs to temporarily go dead, making it unsafe to restart the 
NN, especially in Hadoop clusters with a large number of DNs.

   To make it clear, look at code below. We take it from method 
offerService of class BPServiceActor. We eliminate some code to focus on 
current issue. fullBlockReportLeaseId is a local variable. Exceptions will 
occur at blockReport call when NN restarting, which will be caught by catch 
block in while loop. Thus fullBlockReportLeaseId will not be set to 0. After NN 
restarted, DN will send full block report which will be rejected by the new NN 
instance. DN will never send full block report until the next full block report 
schedule, about an hour later. 

  Solution is simple, just reset fullBlockReportLeaseId to 0 after any 

[jira] [Created] (HDFS-14314) fullBlockReportLeaseId should be reset after

2019-02-22 Thread star (JIRA)
star created HDFS-14314:
---

 Summary: fullBlockReportLeaseId should be reset after 
 Key: HDFS-14314
 URL: https://issues.apache.org/jira/browse/HDFS-14314
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.8.0
 Environment:  

 

 
Reporter: star


  Since HDFS-7923, to rate-limit DN block reports, a DN asks the active NN for a 
full block report lease id before sending a full block report, and then sends the 
report together with that lease id. If the lease id is invalid, the NN rejects the 
full block report and logs "not in the pending set".

  Consider the case where a DN is doing full block reporting while the NN is 
restarted. The DN may later send a full block report with a lease id acquired from 
the previous NN instance, which is invalid to the new NN instance. Although the DN 
recognizes the new NN instance via heartbeat and re-registers itself, it does not 
reset the lease id obtained from the previous instance.

  The issue may cause DNs to temporarily go dead, making it unsafe to restart the 
NN, especially in Hadoop clusters with a large number of DNs.

   To make it clear, look at the code below, taken from the offerService method of 
class BPServiceActor (some code is elided to focus on the current issue). 
fullBlockReportLeaseId is a local variable. When the NN restarts, the blockReport 
call throws an exception, which is caught by the catch block in the while loop, so 
fullBlockReportLeaseId is not reset to 0. After the NN has restarted, the DN sends 
a full block report that is rejected by the new NN instance, and the DN will not 
send another full block report until the next scheduled one, about an hour later.

  The solution is simple: reset fullBlockReportLeaseId to 0 after any exception, 
or after re-registering with the NN, so that the DN asks the new NN instance for a 
valid fullBlockReportLeaseId.
{code:java}
private void offerService() throws Exception {

  long fullBlockReportLeaseId = 0;

  //
  // Now loop for a long time
  //
  while (shouldRun()) {
try {
  final long startTime = scheduler.monotonicNow();

  //
  // Every so often, send heartbeat or block-report
  //
  final boolean sendHeartbeat = scheduler.isHeartbeatDue(startTime);
  HeartbeatResponse resp = null;
  if (sendHeartbeat) {
  
boolean requestBlockReportLease = (fullBlockReportLeaseId == 0) &&
scheduler.isBlockReportDue(startTime);
scheduler.scheduleNextHeartbeat();
if (!dn.areHeartbeatsDisabledForTests()) {
  resp = sendHeartBeat(requestBlockReportLease);
  assert resp != null;
  if (resp.getFullBlockReportLeaseId() != 0) {
if (fullBlockReportLeaseId != 0) {
  LOG.warn(nnAddr + " sent back a full block report lease " +
  "ID of 0x" +
  Long.toHexString(resp.getFullBlockReportLeaseId()) +
  ", but we already have a lease ID of 0x" +
  Long.toHexString(fullBlockReportLeaseId) + ". " +
  "Overwriting old lease ID.");
}
fullBlockReportLeaseId = resp.getFullBlockReportLeaseId();
  }
 
}
  }
   
 
  if ((fullBlockReportLeaseId != 0) || forceFullBr) {
//Exception occurred here when NN restarting
cmds = blockReport(fullBlockReportLeaseId);
fullBlockReportLeaseId = 0;
  }
  
} catch(RemoteException re) {
  
  } // while (shouldRun())
} // offerService{code}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14313) Get hdfs used space from FsDatasetImpl#volumeMap instead of df/du

2019-02-22 Thread Lisheng Sun (JIRA)
Lisheng Sun created HDFS-14313:
--

 Summary: Get hdfs used space from FsDatasetImpl#volumeMap instead 
of df/du
 Key: HDFS-14313
 URL: https://issues.apache.org/jira/browse/HDFS-14313
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode, performance
Affects Versions: 3.0.0-alpha1, 2.8.0
Reporter: Lisheng Sun


The two existing ways of getting used space, DU and DF, are insufficient:
 # Running DU across lots of disks is very expensive, and running all of the 
processes at the same time creates a noticeable IO spike.
 # Running DF is inaccurate when the disk is shared by multiple DataNodes or 
other servers.

 Getting the HDFS used space from FsDatasetImpl#volumeMap instead is very cheap 
and accurate.
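
A rough, self-contained sketch of the idea (not FsDatasetImpl code; ReplicaStub 
is a made-up stand-in for the replica entries tracked in 
FsDatasetImpl#volumeMap): instead of shelling out to du/df, sum the on-disk 
length of every replica the DataNode already tracks in memory.
{code:java}
import java.util.Collection;

public class UsedSpaceSketch {

  /** Hypothetical view of one replica entry from the in-memory volume map. */
  interface ReplicaStub {
    long getBytesOnDisk(); // length of the replica as stored on disk
  }

  /** Used space = sum of the on-disk lengths of all tracked replicas. */
  static long usedSpace(Collection<? extends ReplicaStub> replicas) {
    long used = 0;
    for (ReplicaStub r : replicas) {
      used += r.getBytesOnDisk();
    }
    return used;
  }
}
{code}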



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14118) Use DNS to resolve Namenodes and Routers

2019-02-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775793#comment-16775793
 ] 

Hadoop QA commented on HDFS-14118:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
35s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
58s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 18m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 39s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 17m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 32s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
36s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
22s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}129m 10s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
27s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}259m 42s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReconstructStripedFileWithRandomECPolicy 
|
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14118 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12959859/HDFS-14118.023.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |

[jira] [Commented] (HDFS-14123) NameNode failover doesn't happen when running fsfreeze for the NameNode dir (dfs.namenode.name.dir)

2019-02-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775790#comment-16775790
 ] 

Hadoop QA commented on HDFS-14123:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 23s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}117m 34s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
45s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}181m 26s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestBPOfferService |
|   | hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.TestSafeMode |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14123 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12959863/HDFS-14123.01.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 96c2b7017a86 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 05bce33 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26310/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26310/testReport/ |
| Max. 

[jira] [Commented] (HDDS-1165) Document generation in maven should be configured on execution level

2019-02-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775770#comment-16775770
 ] 

Hadoop QA commented on HDDS-1165:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
26s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
34m 37s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
23s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 38s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
14s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
17s{color} | {color:green} docs in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 43s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDDS-1165 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12959868/HDDS-1165.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  xml  |
| uname | Linux e5fc20cb72ac 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f19c844 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2344/testReport/ |
| Max. process+thread count | 340 (vs. ulimit of 1) |
| modules | C: hadoop-hdds/docs U: hadoop-hdds/docs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2344/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Document generation in maven should be configured on execution level 
> -
>
> Key: HDDS-1165
> URL: https://issues.apache.org/jira/browse/HDDS-1165
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Anu Engineer
>Priority: Major
> Attachments: HDDS-1165.001.patch
>
>
> Documentation of Ozone/Hdds project 

[jira] [Updated] (HDDS-1072) Implement RetryProxy and FailoverProxy for OM client

2019-02-22 Thread Hanisha Koneru (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDDS-1072:
-
Summary: Implement RetryProxy and FailoverProxy for OM client  (was: 
Propagate OM Ratis NotLeaderException to Client)

> Implement RetryProxy and FailoverProxy for OM client
> 
>
> Key: HDDS-1072
> URL: https://issues.apache.org/jira/browse/HDDS-1072
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: OM
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: HDDS-1072.001.patch
>
>
> The RPC Client should implement a retry and failover proxy provider to fail 
> over between OM Ratis clients. The failover should occur in two scenarios:
> # When the client is unable to connect to the OM (either because of network 
> issues or because the OM is down). The client retry proxy provider should 
> fail over to the next OM in the cluster.
> # When the OM Ratis Client receives a response from the Ratis server for its 
> request, it also gets the LeaderId of the server that processed the request 
> (the current Leader OM nodeId). This information should be propagated back to 
> the client, and the Client failover Proxy provider should fail over to the 
> leader OM node. This helps avoid an extra hop from the Follower OM Ratis 
> Client to the Leader OM Ratis server for every request.
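> A minimal sketch of what such a provider could look like, built on Hadoop's 
> generic org.apache.hadoop.io.retry.FailoverProxyProvider interface (this is 
> not the attached patch; ProxyFactory and the simple round-robin policy are 
> assumptions for illustration):
> {code:java}
> import java.io.IOException;
> import java.util.List;
> import org.apache.hadoop.io.retry.FailoverProxyProvider;
> 
> public class OMFailoverProxyProviderSketch<T> implements FailoverProxyProvider<T> {
> 
>   /** Hypothetical factory that opens an RPC proxy to the OM at the given address. */
>   public interface ProxyFactory<T> {
>     T createProxy(String omAddress) throws IOException;
>   }
> 
>   private final List<String> omAddresses; // all OM nodes in the service
>   private final ProxyFactory<T> factory;
>   private final Class<T> iface;
>   private int currentIndex = 0;
> 
>   public OMFailoverProxyProviderSketch(List<String> omAddresses,
>       ProxyFactory<T> factory, Class<T> iface) {
>     this.omAddresses = omAddresses;
>     this.factory = factory;
>     this.iface = iface;
>   }
> 
>   @Override
>   public ProxyInfo<T> getProxy() {
>     String addr = omAddresses.get(currentIndex);
>     try {
>       return new ProxyInfo<>(factory.createProxy(addr), addr);
>     } catch (IOException e) {
>       throw new RuntimeException("Could not connect to OM " + addr, e);
>     }
>   }
> 
>   @Override
>   public void performFailover(T currentProxy) {
>     // Scenario 1: current OM unreachable -> move to the next OM.
>     // Scenario 2 (leader hint from Ratis) would jump straight to the leader.
>     currentIndex = (currentIndex + 1) % omAddresses.size();
>   }
> 
>   @Override
>   public Class<T> getInterface() {
>     return iface;
>   }
> 
>   @Override
>   public void close() throws IOException {
>     // Closing any cached proxies is omitted in this sketch.
>   }
> }
> {code}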



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1072) Propagate OM Ratis NotLeaderException to Client

2019-02-22 Thread Hanisha Koneru (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDDS-1072:
-
Attachment: HDDS-1072.001.patch

> Propagate OM Ratis NotLeaderException to Client
> ---
>
> Key: HDDS-1072
> URL: https://issues.apache.org/jira/browse/HDDS-1072
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: OM
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: HDDS-1072.001.patch
>
>
> When the OM Ratis Client receives a response from the Ratis server for its 
> request, it also gets the LeaderId of the server that processed the request 
> (the current Leader OM nodeId). This information should be propagated to the 
> RPC client, which should send subsequent requests to the Ratis Client on the 
> Leader OM. This helps avoid an extra hop from the Follower OM Ratis Client to 
> the Leader OM Ratis server for every request.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1072) Implement RetryProxy and FailoverProxy for OM client

2019-02-22 Thread Hanisha Koneru (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDDS-1072:
-
Status: Patch Available  (was: Open)

> Implement RetryProxy and FailoverProxy for OM client
> 
>
> Key: HDDS-1072
> URL: https://issues.apache.org/jira/browse/HDDS-1072
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: OM
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: HDDS-1072.001.patch
>
>
> The RPC Client should implement a retry and failover proxy provider to fail 
> over between OM Ratis clients. The failover should occur in two scenarios:
> # When the client is unable to connect to the OM (either because of network 
> issues or because the OM is down). The client retry proxy provider should 
> fail over to the next OM in the cluster.
> # When the OM Ratis Client receives a response from the Ratis server for its 
> request, it also gets the LeaderId of the server that processed the request 
> (the current Leader OM nodeId). This information should be propagated back to 
> the client, and the Client failover Proxy provider should fail over to the 
> leader OM node. This helps avoid an extra hop from the Follower OM Ratis 
> Client to the Leader OM Ratis server for every request.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1072) Propagate OM Ratis NotLeaderException to Client

2019-02-22 Thread Hanisha Koneru (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDDS-1072:
-
Description: 
The RPC Client should implement a retry and failover proxy provider to fail over 
between OM Ratis clients. The failover should occur in two scenarios:
# When the client is unable to connect to the OM (either because of network 
issues or because the OM is down). The client retry proxy provider should fail 
over to the next OM in the cluster.
# When the OM Ratis Client receives a response from the Ratis server for its 
request, it also gets the LeaderId of the server that processed the request (the 
current Leader OM nodeId). This information should be propagated back to the 
client, and the Client failover Proxy provider should fail over to the leader OM 
node. This helps avoid an extra hop from the Follower OM Ratis Client to the 
Leader OM Ratis server for every request.

  was:When the OM Ratis Client receives a response from the Ratis server for its 
request, it also gets the LeaderId of the server that processed the request (the 
current Leader OM nodeId). This information should be propagated to the RPC 
client, which should send subsequent requests to the Ratis Client on the Leader 
OM. This helps avoid an extra hop from the Follower OM Ratis Client to the Leader 
OM Ratis server for every request.


> Propagate OM Ratis NotLeaderException to Client
> ---
>
> Key: HDDS-1072
> URL: https://issues.apache.org/jira/browse/HDDS-1072
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: OM
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Major
> Attachments: HDDS-1072.001.patch
>
>
> The RPC Client should implement a retry and failover proxy provider to fail 
> over between OM Ratis clients. The failover should occur in two scenarios:
> # When the client is unable to connect to the OM (either because of network 
> issues or because the OM is down). The client retry proxy provider should 
> fail over to the next OM in the cluster.
> # When the OM Ratis Client receives a response from the Ratis server for its 
> request, it also gets the LeaderId of the server that processed the request 
> (the current Leader OM nodeId). This information should be propagated back to 
> the client, and the Client failover Proxy provider should fail over to the 
> leader OM node. This helps avoid an extra hop from the Follower OM Ratis 
> Client to the Leader OM Ratis server for every request.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1158) TestOzoneManagerHA.testTwoOMNodesDown is failing with ratis error

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1158?focusedWorklogId=202967=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202967
 ]

ASF GitHub Bot logged work on HDDS-1158:


Author: ASF GitHub Bot
Created on: 23/Feb/19 02:33
Start Date: 23/Feb/19 02:33
Worklog Time Spent: 10m 
  Work Description: hanishakoneru commented on pull request #512: 
HDDS-1158. TestOzoneManagerHA.testTwoOMNodesDown is failing with ratis error.
URL: https://github.com/apache/hadoop/pull/512#discussion_r259561458
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java
 ##
 @@ -150,17 +152,27 @@ private void testCreateVolume(boolean checkSuccess) 
throws Exception {
 createVolumeArgs.setUserName(userName);
 createVolumeArgs.setAdminName(adminName);
 
-storageHandler.createVolume(createVolumeArgs);
+try {
+  storageHandler.createVolume(createVolumeArgs);
 
-VolumeArgs getVolumeArgs = new VolumeArgs(volumeName, userArgs);
-VolumeInfo retVolumeinfo = storageHandler.getVolumeInfo(getVolumeArgs);
+  VolumeArgs getVolumeArgs = new VolumeArgs(volumeName, userArgs);
+  VolumeInfo retVolumeinfo = storageHandler.getVolumeInfo(getVolumeArgs);
 
-if (checkSuccess) {
-  Assert.assertTrue(retVolumeinfo.getVolumeName().equals(volumeName));
-  Assert.assertTrue(retVolumeinfo.getOwner().getName().equals(userName));
-} else {
-  // Verify that the request failed
-  Assert.assertTrue(retVolumeinfo.getVolumeName().isEmpty());
+  if (checkSuccess) {
+Assert.assertTrue(retVolumeinfo.getVolumeName().equals(volumeName));
+Assert.assertTrue(retVolumeinfo.getOwner().getName().equals(userName));
+  } else {
+// Verify that the request failed
+Assert.assertTrue(retVolumeinfo.getVolumeName().isEmpty());
+Assert.fail("There is no quorum. Request should have failed");
+  }
+} catch (OMException e) {
+  if (!checkSuccess) {
+GenericTestUtils.assertExceptionContains(
+"RaftRetryFailureException", e);
 
 Review comment:
   Hi @bharatviswa504,
   This particular test case is for when we have 2 OM nodes down, so even with 
retry and failover logic implemented, the request should eventually fail. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202967)
Time Spent: 50m  (was: 40m)

> TestOzoneManagerHA.testTwoOMNodesDown is failing with ratis error
> -
>
> Key: HDDS-1158
> URL: https://issues.apache.org/jira/browse/HDDS-1158
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> h3. Error Message
> {code:java}
> org.apache.ratis.protocol.RaftRetryFailureException: Failed 
> RaftClientRequest:client-4D77D2A8F653->omNode-3@group-523986131536, cid=9, 
> seq=0 RW, 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisClient$$Lambda$373/2067504307@6afa0221
>  for 10 attempts with RetryLimited(maxAttempts=10, sleepTime=100ms){code}
> Stacktrace
> {code:java}
> INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: 
> org.apache.ratis.protocol.RaftRetryFailureException: Failed 
> RaftClientRequest:client-4D77D2A8F653->omNode-3@group-523986131536, cid=9, 
> seq=0 RW, 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisClient$$Lambda$373/2067504307@6afa0221
>  for 10 attempts with RetryLimited(maxAttempts=10, sleepTime=100ms) at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:586)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.createVolume(OzoneManagerProtocolClientSideTranslatorPB.java:230)
>  at 
> org.apache.hadoop.ozone.web.storage.DistributedStorageHandler.createVolume(DistributedStorageHandler.java:179)
>  at 
> org.apache.hadoop.ozone.om.TestOzoneManagerHA.testCreateVolume(TestOzoneManagerHA.java:153)
>  at 
> org.apache.hadoop.ozone.om.TestOzoneManagerHA.testTwoOMNodesDown(TestOzoneManagerHA.java:138)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> 

[jira] [Updated] (HDDS-1165) Document generation in maven should be configured on execution level

2019-02-22 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-1165:
---
Status: Patch Available  (was: Open)

> Document generation in maven should be configured on execution level 
> -
>
> Key: HDDS-1165
> URL: https://issues.apache.org/jira/browse/HDDS-1165
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Priority: Major
> Attachments: HDDS-1165.001.patch
>
>
> Documentation of Ozone/Hdds project is generated from maven with the help of 
> maven exec plugin.
> There are multiple ways to configure plugins in maven. Plugin can be 
> configured on plugin level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>     </execution>
>   </executions>
>   <configuration>
>     ...
>   </configuration>
> </plugin>
> {code}
> In this case not only the specific execution but all executions are configured 
> (even if the plugin is triggered via mvn exec:exec).
> Or it can be configured on the execution level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>       <configuration>
>         ...
>       </configuration>
>     </execution>
>   </executions>
> </plugin>
> {code}
> In this case the configuration applies only to this specific execution, which 
> is bound to a specific phase (compile in this case).
> Unfortunately it's configured the wrong way in hadoop-hdds/docs/pom.xml: the 
> first approach should be replaced with the second by moving the configuration 
> inside the execution.
> Without this change yetus can't detect the dependency order.
> How to test:
> The easiest way to reproduce the problem is to execute:
> {code:java}
> mvn  -fae exec:exec -Dexec.executable=pwd -Dexec.args='' -Phdds{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14285) libhdfs hdfsRead copies entire array even if its only partially filled

2019-02-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775752#comment-16775752
 ] 

Hudson commented on HDFS-14285:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16036 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16036/])
HDFS-14285. libhdfs hdfsRead copies entire array even if its only (weichiu: rev 
f19c844e7515c00b5a11e4fd971e45d98629b1a6)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfs/hdfs.c


> libhdfs hdfsRead copies entire array even if its only partially filled
> --
>
> Key: HDFS-14285
> URL: https://issues.apache.org/jira/browse/HDFS-14285
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14285.001.patch, HDFS-14285.002.patch
>
>
> There is a bug in libhdfs {{hdfsRead}}
> {code:java}
> jthr = invokeMethod(env, , INSTANCE, jInputStream, HADOOP_ISTRM,
>"read", "([B)I", jbRarray);
> if (jthr) {
> destroyLocalReference(env, jbRarray);
> errno = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
> "hdfsRead: FSDataInputStream#read");
> return -1;
> }
> if (jVal.i < 0) {
> // EOF
> destroyLocalReference(env, jbRarray);
> return 0;
> } else if (jVal.i == 0) {
> destroyLocalReference(env, jbRarray);
> errno = EINTR;
> return -1;
> }
> (*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes, buffer);
> {code}
> The method makes a call to {{FSInputStream#read(byte[])}} to fill in the Java 
> byte array; however, {{#read(byte[])}} is not guaranteed to fill up the 
> entire array, instead it returns the number of bytes written to the array 
> (which could be less than the size of the array). Yet {{GetByteArrayRegion}} 
> decides to copy the entire contents of the {{jbArray}} into the buffer 
> ({{noReadBytes}} is initialized to the length of the buffer and is never 
> updated). So if {{FSInputStream#read(byte[])}} decides to read less data than 
> the size of the byte array, the call to {{GetByteArrayRegion}} will 
> essentially copy more bytes than necessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1165) Document generation in maven should be configured on execution level

2019-02-22 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775751#comment-16775751
 ] 

Anu Engineer commented on HDDS-1165:


[~elek]  Thanks for writing the JIRA description like a code patch; I just 
blindly followed your suggestion. Be gentle on me, this is my first maven patch 
:)

> Document generation in maven should be configured on execution level 
> -
>
> Key: HDDS-1165
> URL: https://issues.apache.org/jira/browse/HDDS-1165
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Anu Engineer
>Priority: Major
> Attachments: HDDS-1165.001.patch
>
>
> Documentation of Ozone/Hdds project is generated from maven with the help of 
> maven exec plugin.
> There are multiple ways to configure plugins in maven. Plugin can be 
> configured on plugin level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>     </execution>
>   </executions>
>   <configuration>
>     ...
>   </configuration>
> </plugin>
> {code}
> In this case not only the specific execution but all executions are configured 
> (even if the plugin is triggered via mvn exec:exec).
> Or it can be configured on the execution level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>       <configuration>
>         ...
>       </configuration>
>     </execution>
>   </executions>
> </plugin>
> {code}
> In this case the configuration applies only to this specific execution, which 
> is bound to a specific phase (compile in this case).
> Unfortunately it's configured the wrong way in hadoop-hdds/docs/pom.xml: the 
> first approach should be replaced with the second by moving the configuration 
> inside the execution.
> Without this change yetus can't detect the dependency order.
> How to test:
> The easiest way to reproduce the problem is to execute:
> {code:java}
> mvn  -fae exec:exec -Dexec.executable=pwd -Dexec.args='' -Phdds{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1165) Document generation in maven should be configured on execution level

2019-02-22 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-1165:
---
Attachment: HDDS-1165.001.patch

> Document generation in maven should be configured on execution level 
> -
>
> Key: HDDS-1165
> URL: https://issues.apache.org/jira/browse/HDDS-1165
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Priority: Major
> Attachments: HDDS-1165.001.patch
>
>
> Documentation of Ozone/Hdds project is generated from maven with the help of 
> maven exec plugin.
> There are multiple ways to configure plugins in maven. Plugin can be 
> configured on plugin level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>     </execution>
>   </executions>
>   <configuration>
>     ...
>   </configuration>
> </plugin>
> {code}
> In this case not only the specific execution but all executions are configured 
> (even if the plugin is triggered via mvn exec:exec).
> Or it can be configured on the execution level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>       <configuration>
>         ...
>       </configuration>
>     </execution>
>   </executions>
> </plugin>
> {code}
> In this case the configuration applies only to this specific execution, which 
> is bound to a specific phase (compile in this case).
> Unfortunately it's configured the wrong way in hadoop-hdds/docs/pom.xml: the 
> first approach should be replaced with the second by moving the configuration 
> inside the execution.
> Without this change yetus can't detect the dependency order.
> How to test:
> The easiest way to reproduce the problem is to execute:
> {code:java}
> mvn  -fae exec:exec -Dexec.executable=pwd -Dexec.args='' -Phdds{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-1165) Document generation in maven should be configured on execution level

2019-02-22 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer reassigned HDDS-1165:
--

Assignee: Anu Engineer

> Document generation in maven should be configured on execution level 
> -
>
> Key: HDDS-1165
> URL: https://issues.apache.org/jira/browse/HDDS-1165
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Anu Engineer
>Priority: Major
> Attachments: HDDS-1165.001.patch
>
>
> Documentation of Ozone/Hdds project is generated from maven with the help of 
> maven exec plugin.
> There are multiple ways to configure plugins in maven. Plugin can be 
> configured on plugin level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>     </execution>
>   </executions>
>   <configuration>
>     ...
>   </configuration>
> </plugin>
> {code}
> In this case not only the specific execution but all executions are configured 
> (even if the plugin is triggered via mvn exec:exec).
> Or it can be configured on the execution level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>       <configuration>
>         ...
>       </configuration>
>     </execution>
>   </executions>
> </plugin>
> {code}
> In this case the configuration applies only to this specific execution, which 
> is bound to a specific phase (compile in this case).
> Unfortunately it's configured the wrong way in hadoop-hdds/docs/pom.xml: the 
> first approach should be replaced with the second by moving the configuration 
> inside the execution.
> Without this change yetus can't detect the dependency order.
> How to test:
> The easiest way to reproduce the problem is to execute:
> {code:java}
> mvn  -fae exec:exec -Dexec.executable=pwd -Dexec.args='' -Phdds{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14111) hdfsOpenFile on HDFS causes unnecessary IO from file offset 0

2019-02-22 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775747#comment-16775747
 ] 

Wei-Chiu Chuang commented on HDFS-14111:


I think using StreamCapabilities is the perfect solution.
I am less familiar with libhdfs, so [~stakiar] could you double check to make 
sure the failed CTEST is unrelated?

{quote}
[ RUN  ] HdfsExtTest.TestReadStats
Formatting using clusterid: testClusterID
/testptch/hadoop/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/libhdfspp/tests/hdfs_ext_test.cc:506:
 Failure
Value of: (*__errno_location ())
  Actual: 2
Expected: 0
[  FAILED  ] HdfsExtTest.TestReadStats (1655 ms)
{quote}
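
For illustration, a minimal sketch of a StreamCapabilities-based probe; the 
capability key string below is an assumption for illustration, not necessarily 
the constant the final patch uses:
{code:java}
// Ask the stream whether it supports ByteBuffer reads via StreamCapabilities
// instead of probing with a 0-length readDirect().
FSDataInputStream in = fs.open(path);
if (in.hasCapability("in:readbytebuffer")) {
  // safe to use the ByteBuffer (direct) read path
} else {
  // fall back to the regular byte[] read path without issuing read(0)
}
{code}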

> hdfsOpenFile on HDFS causes unnecessary IO from file offset 0
> -
>
> Key: HDFS-14111
> URL: https://issues.apache.org/jira/browse/HDFS-14111
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, libhdfs
>Affects Versions: 3.2.0
>Reporter: Todd Lipcon
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14111.001.patch, HDFS-14111.002.patch
>
>
> hdfsOpenFile() calls readDirect() with a 0-length argument in order to check 
> whether the underlying stream supports bytebuffer reads. With DFSInputStream, 
> the read(0) isn't short circuited, and results in the DFSClient opening a 
> block reader. In the case of a remote block, the block reader will actually 
> issue a read of the whole block, causing the datanode to perform unnecessary 
> IO and network transfers in order to fill up the client's TCP buffers. This 
> causes performance degradation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14285) libhdfs hdfsRead copies entire array even if its only partially filled

2019-02-22 Thread Wei-Chiu Chuang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-14285:
---
   Resolution: Fixed
Fix Version/s: 3.3.0
   Status: Resolved  (was: Patch Available)

Pushed to trunk. Thanks [~stakiar] for the patch!

> libhdfs hdfsRead copies entire array even if its only partially filled
> --
>
> Key: HDFS-14285
> URL: https://issues.apache.org/jira/browse/HDFS-14285
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14285.001.patch, HDFS-14285.002.patch
>
>
> There is a bug in libhdfs {{hdfsRead}}
> {code:java}
> jthr = invokeMethod(env, &jVal, INSTANCE, jInputStream, HADOOP_ISTRM,
>"read", "([B)I", jbRarray);
> if (jthr) {
> destroyLocalReference(env, jbRarray);
> errno = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
> "hdfsRead: FSDataInputStream#read");
> return -1;
> }
> if (jVal.i < 0) {
> // EOF
> destroyLocalReference(env, jbRarray);
> return 0;
> } else if (jVal.i == 0) {
> destroyLocalReference(env, jbRarray);
> errno = EINTR;
> return -1;
> }
> (*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes, buffer);
> {code}
> The method calls {{FSInputStream#read(byte[])}} to fill in the Java byte 
> array. However, {{#read(byte[])}} is not guaranteed to fill the entire 
> array; instead it returns the number of bytes written to the array (which 
> can be less than the size of the array). Yet {{GetByteArrayRegion}} copies 
> the entire contents of {{jbRarray}} into the buffer ({{noReadBytes}} is 
> initialized to the length of the buffer and never updated). So if 
> {{FSInputStream#read(byte[])}} reads less data than the size of the byte 
> array, the call to {{GetByteArrayRegion}} copies more bytes than necessary.
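
To make the read contract concrete, a small plain-Java illustration (not the 
libhdfs fix itself; {{process}} is a placeholder method):
{code:java}
// read(byte[]) may fill only part of the array; only the first n bytes are
// valid, so the JNI wrapper should copy jVal.i bytes, not the array length.
byte[] buf = new byte[8192];
int n = in.read(buf);   // n can be anywhere in [-1, buf.length]
if (n > 0) {
  process(buf, 0, n);   // bytes past index n are stale and must not be copied
}
{code}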



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14176) Replace incorrect use of system property user.name

2019-02-22 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775746#comment-16775746
 ] 

Wei-Chiu Chuang commented on HDFS-14176:


Looks almost good to me. Just one nit: could you also log the exception in the 
message? That should help with troubleshooting should it ever happen.
{code}
  LOG.warn("Unable to get user name. Fall back to system property " +
  "user.name", ex);
{code}

+1 after the change.

> Replace incorrect use of system property user.name
> --
>
> Key: HDFS-14176
> URL: https://issues.apache.org/jira/browse/HDFS-14176
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.3.0
> Environment: Kerberized
>Reporter: Wei-Chiu Chuang
>Assignee: Dinesh Chitlangia
>Priority: Major
> Attachments: HDFS-14176.01.patch, HDFS-14176.02.patch, 
> HDFS-14176.03.patch
>
>
> Looking at the Hadoop source code, there are a few places where the code 
> assumes the user name can be acquired from Java's system property 
> {{user.name}}.
> For example,
> {code:java|title=FileSystem}
> /** Return the current user's home directory in this FileSystem.
>  * The default implementation returns {@code "/user/$USER/"}.
>  */
> public Path getHomeDirectory() {
>   return this.makeQualified(
>       new Path(USER_HOME_PREFIX + "/" + System.getProperty("user.name")));
> }
> {code}
> This is incorrect, as in a Kerberized environment, a user may login as a user 
> principal different from its system login account.
> It would be better to use 
> {{UserGroupInformation.getCurrentUser().getShortUserName()}}, similar to 
> HDFS-12485.
> Unfortunately, I am seeing this improper use in YARN, HDFS federation, 
> SFTPFileSystem and Ozone code (tests are ignored).
> The impact should be small, since it only affects the case where the system 
> is Kerberized and the user principal is different from the system login account.
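
A minimal sketch of the suggested direction (illustrative only, not the 
attached patch):
{code:java}
// Prefer the Kerberos-aware UGI short name; fall back to the JVM property
// only if the UGI lookup fails.
public Path getHomeDirectory() {
  String userName;
  try {
    userName = UserGroupInformation.getCurrentUser().getShortUserName();
  } catch (IOException ex) {
    LOG.warn("Unable to get user name. Fall back to system property " +
        "user.name", ex);
    userName = System.getProperty("user.name");
  }
  return this.makeQualified(new Path(USER_HOME_PREFIX + "/" + userName));
}
{code}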



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14285) libhdfs hdfsRead copies entire array even if its only partially filled

2019-02-22 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775744#comment-16775744
 ] 

Wei-Chiu Chuang commented on HDFS-14285:


+1 committing the patch now.

> libhdfs hdfsRead copies entire array even if its only partially filled
> --
>
> Key: HDFS-14285
> URL: https://issues.apache.org/jira/browse/HDFS-14285
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, libhdfs, native
>Reporter: Sahil Takiar
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-14285.001.patch, HDFS-14285.002.patch
>
>
> There is a bug in libhdfs {{hdfsRead}}
> {code:java}
> jthr = invokeMethod(env, &jVal, INSTANCE, jInputStream, HADOOP_ISTRM,
>"read", "([B)I", jbRarray);
> if (jthr) {
> destroyLocalReference(env, jbRarray);
> errno = printExceptionAndFree(env, jthr, PRINT_EXC_ALL,
> "hdfsRead: FSDataInputStream#read");
> return -1;
> }
> if (jVal.i < 0) {
> // EOF
> destroyLocalReference(env, jbRarray);
> return 0;
> } else if (jVal.i == 0) {
> destroyLocalReference(env, jbRarray);
> errno = EINTR;
> return -1;
> }
> (*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes, buffer);
> {code}
> The method calls {{FSInputStream#read(byte[])}} to fill in the Java byte 
> array. However, {{#read(byte[])}} is not guaranteed to fill the entire 
> array; instead it returns the number of bytes written to the array (which 
> can be less than the size of the array). Yet {{GetByteArrayRegion}} copies 
> the entire contents of {{jbRarray}} into the buffer ({{noReadBytes}} is 
> initialized to the length of the buffer and never updated). So if 
> {{FSInputStream#read(byte[])}} reads less data than the size of the byte 
> array, the call to {{GetByteArrayRegion}} copies more bytes than necessary.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-3246) pRead equivalent for direct read path

2019-02-22 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775734#comment-16775734
 ] 

Wei-Chiu Chuang edited comment on HDFS-3246 at 2/23/19 1:50 AM:


I am not an expert in client-side input stream implementations, but it seems 
you would also want CryptoInputStream to implement 
ByteBufferPositionedReadable, otherwise it won't help encrypted clusters. (I 
looked for input stream implementations that implement both ByteBufferReadable 
and PositionedReadable.)


was (Author: jojochuang):
I am not an expert in client side input stream implementations. But it seems 
you would also want CryptoInputStream to implement  
ByteBufferPositionedReadable, otherwise it won't help encrypted cluster.

> pRead equivalent for direct read path
> -
>
> Key: HDFS-3246
> URL: https://issues.apache.org/jira/browse/HDFS-3246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, performance
>Affects Versions: 3.0.0-alpha1
>Reporter: Henry Robinson
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-3246.001.patch, HDFS-3246.002.patch, 
> HDFS-3246.003.patch, HDFS-3246.004.patch
>
>
> There is no pread equivalent in ByteBufferReadable. We should consider adding 
> one. It would be relatively easy to implement for the distributed case 
> (certainly compared to HDFS-2834), since DFSInputStream does most of the 
> heavy lifting.
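
For context, a rough sketch of what a positioned ByteBuffer read could look 
like (names and javadoc are illustrative; the exact API is defined by the 
patches attached to this issue):
{code:java}
public interface ByteBufferPositionedReadable {
  /**
   * Reads up to buf.remaining() bytes into buf starting at the given file
   * position, without changing the stream's current offset.
   * @return the number of bytes read, or -1 at end of stream
   */
  int read(long position, ByteBuffer buf) throws IOException;
}
{code}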



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-3246) pRead equivalent for direct read path

2019-02-22 Thread Wei-Chiu Chuang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775734#comment-16775734
 ] 

Wei-Chiu Chuang commented on HDFS-3246:
---

I am not an expert in client-side input stream implementations, but it seems 
you would also want CryptoInputStream to implement 
ByteBufferPositionedReadable, otherwise it won't help encrypted clusters.

> pRead equivalent for direct read path
> -
>
> Key: HDFS-3246
> URL: https://issues.apache.org/jira/browse/HDFS-3246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, performance
>Affects Versions: 3.0.0-alpha1
>Reporter: Henry Robinson
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-3246.001.patch, HDFS-3246.002.patch, 
> HDFS-3246.003.patch, HDFS-3246.004.patch
>
>
> There is no pread equivalent in ByteBufferReadable. We should consider adding 
> one. It would be relatively easy to implement for the distributed case 
> (certainly compared to HDFS-2834), since DFSInputStream does most of the 
> heavy lifting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14123) NameNode failover doesn't happen when running fsfreeze for the NameNode dir (dfs.namenode.name.dir)

2019-02-22 Thread Toshihiro Suzuki (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Toshihiro Suzuki updated HDFS-14123:

Attachment: HDFS-14123.01.patch

> NameNode failover doesn't happen when running fsfreeze for the NameNode dir 
> (dfs.namenode.name.dir)
> ---
>
> Key: HDFS-14123
> URL: https://issues.apache.org/jira/browse/HDFS-14123
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha
>Reporter: Toshihiro Suzuki
>Assignee: Toshihiro Suzuki
>Priority: Major
> Attachments: HDFS-14123.01.patch, HDFS-14123.01.patch
>
>
> I ran fsfreeze for the NameNode dir (dfs.namenode.name.dir) in my cluster for 
> test purposes, but NameNode failover didn't happen.
> {code}
> fsfreeze -f /mnt
> {code}
> /mnt is a separate filesystem partition from /. And the NameNode dir 
> "dfs.namenode.name.dir" is /mnt/hadoop/hdfs/namenode.
> I checked the source code, and I found monitorHealth RPC from ZKFC doesn't 
> fail even if the NameNode dir is frozen. I think that's why the failover 
> doesn't happen.
> Also if the NameNode dir is frozen, it looks like FSImage.rollEditLog() gets 
> stuck like the following, and it keeps holding the write lock of 
> FSNamesystem, which causes HDFS service down:
> {code}
> "IPC Server handler 5 on default port 8020" #53 daemon prio=5 os_prio=0 
> tid=0x7f56b96e2000 nid=0x5042 in Object.wait() [0x7f56937bb000]
>java.lang.Thread.State: TIMED_WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync$SyncEdit.logSyncWait(FSEditLogAsync.java:317)
> - locked <0xc58ca268> (a 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.logSyncAll(FSEditLogAsync.java:147)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.endCurrentLogSegment(FSEditLog.java:1422)
> - locked <0xc58ca268> (a 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLog.rollEditLog(FSEditLog.java:1316)
> - locked <0xc58ca268> (a 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.rollEditLog(FSImage.java:1322)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.rollEditLog(FSNamesystem.java:4740)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.rollEditLog(NameNodeRpcServer.java:1307)
> at 
> org.apache.hadoop.hdfs.protocolPB.NamenodeProtocolServerSideTranslatorPB.rollEditLog(NamenodeProtocolServerSideTranslatorPB.java:148)
> at 
> org.apache.hadoop.hdfs.protocol.proto.NamenodeProtocolProtos$NamenodeProtocolService$2.callBlockingMethod(NamenodeProtocolProtos.java:14726)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:898)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:844)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2727)
>Locked ownable synchronizers:
> - <0xc5f4ca10> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$FairSync)
> {code}
> I believe NameNode failover should happen in this case. One idea is to check 
> whether the NameNode dir is working when the NameNode receives the 
> monitorHealth RPC from ZKFC.
> I will attach a patch for this idea.
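
A minimal sketch of that idea (illustrative names, not the attached patch; it 
would need java.io and java.util.concurrent imports): probe the name dir on a 
helper thread and treat a timeout as a failed health check, since writes to a 
frozen filesystem block instead of failing.
{code:java}
private void checkNameDirWritable(File nameDir, long timeoutMs)
    throws HealthCheckFailedException {
  ExecutorService executor = Executors.newSingleThreadExecutor();
  try {
    Future<Void> probe = executor.submit(() -> {
      File f = new File(nameDir, ".health-probe");
      try (FileOutputStream out = new FileOutputStream(f)) {
        out.write(0);
      }
      f.delete();
      return null;
    });
    // A frozen filesystem makes the write block, so the probe times out.
    probe.get(timeoutMs, TimeUnit.MILLISECONDS);
  } catch (Exception e) {
    throw new HealthCheckFailedException(
        "NameNode dir " + nameDir + " is not usable", e);
  } finally {
    executor.shutdownNow();
  }
}
{code}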



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14118) Use DNS to resolve Namenodes and Routers

2019-02-22 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775724#comment-16775724
 ] 

Yongjun Zhang commented on HDFS-14118:
--

Thanks [~fengnanli]! +1 on rev 23 pending the Jenkins test.

Hi [~elgoiri], I wonder if you have further comments? Thanks.

> Use DNS to resolve Namenodes and Routers
> 
>
> Key: HDFS-14118
> URL: https://issues.apache.org/jira/browse/HDFS-14118
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: DNS testing log, HDFS design doc_ Single domain name for 
> clients - Google Docs-1.pdf, HDFS design doc_ Single domain name for clients 
> - Google Docs.pdf, HDFS-14118.001.patch, HDFS-14118.002.patch, 
> HDFS-14118.003.patch, HDFS-14118.004.patch, HDFS-14118.005.patch, 
> HDFS-14118.006.patch, HDFS-14118.007.patch, HDFS-14118.008.patch, 
> HDFS-14118.009.patch, HDFS-14118.010.patch, HDFS-14118.011.patch, 
> HDFS-14118.012.patch, HDFS-14118.013.patch, HDFS-14118.014.patch, 
> HDFS-14118.015.patch, HDFS-14118.016.patch, HDFS-14118.017.patch, 
> HDFS-14118.018.patch, HDFS-14118.019.patch, HDFS-14118.020.patch, 
> HDFS-14118.021.patch, HDFS-14118.022.patch, HDFS-14118.023.patch, 
> HDFS-14118.patch
>
>
> Clients will need to know about routers to talk to the HDFS cluster 
> (obviously), and having routers updating (adding/removing) will have to make 
> every client change, which is a painful process.
> DNS can be used here to resolve the single domain name clients knows to a 
> list of routers in the current config. However, DNS won't be able to consider 
> only resolving to the working router based on certain health thresholds.
> There are some ways about how this can be solved. One way is to have a 
> separate script to regularly check the status of the router and update the 
> DNS records if a router fails the health thresholds. In this way, security 
> might be carefully considered for this way. Another way is to have the client 
> do the normal connecting/failover after they get the list of routers, which 
> requires the change of current failover proxy provider.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1158) TestOzoneManagerHA.testTwoOMNodesDown is failing with ratis error

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1158?focusedWorklogId=202950=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202950
 ]

ASF GitHub Bot logged work on HDDS-1158:


Author: ASF GitHub Bot
Created on: 23/Feb/19 00:33
Start Date: 23/Feb/19 00:33
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #512: 
HDDS-1158. TestOzoneManagerHA.testTwoOMNodesDown is failing with ratis error.
URL: https://github.com/apache/hadoop/pull/512#discussion_r259551047
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java
 ##
 @@ -150,17 +152,27 @@ private void testCreateVolume(boolean checkSuccess) 
throws Exception {
 createVolumeArgs.setUserName(userName);
 createVolumeArgs.setAdminName(adminName);
 
-storageHandler.createVolume(createVolumeArgs);
+try {
+  storageHandler.createVolume(createVolumeArgs);
 
-VolumeArgs getVolumeArgs = new VolumeArgs(volumeName, userArgs);
-VolumeInfo retVolumeinfo = storageHandler.getVolumeInfo(getVolumeArgs);
+  VolumeArgs getVolumeArgs = new VolumeArgs(volumeName, userArgs);
+  VolumeInfo retVolumeinfo = storageHandler.getVolumeInfo(getVolumeArgs);
 
-if (checkSuccess) {
-  Assert.assertTrue(retVolumeinfo.getVolumeName().equals(volumeName));
-  Assert.assertTrue(retVolumeinfo.getOwner().getName().equals(userName));
-} else {
-  // Verify that the request failed
-  Assert.assertTrue(retVolumeinfo.getVolumeName().isEmpty());
+  if (checkSuccess) {
+Assert.assertTrue(retVolumeinfo.getVolumeName().equals(volumeName));
+Assert.assertTrue(retVolumeinfo.getOwner().getName().equals(userName));
+  } else {
+// Verify that the request failed
+Assert.assertTrue(retVolumeinfo.getVolumeName().isEmpty());
+Assert.fail("There is no quorum. Request should have failed");
+  }
+} catch (OMException e) {
+  if (!checkSuccess) {
+GenericTestUtils.assertExceptionContains(
+"RaftRetryFailureException", e);
 
 Review comment:
   Hi @hanishakoneru 
   Will this be fixed once the client can find the leader and has retry logic?
   
   This logic can be removed when we have client retry logic in place, so 
can we add a TODO here?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202950)
Time Spent: 40m  (was: 0.5h)

> TestOzoneManagerHA.testTwoOMNodesDown is failing with ratis error
> -
>
> Key: HDDS-1158
> URL: https://issues.apache.org/jira/browse/HDDS-1158
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> h3. Error Message
> {code:java}
> org.apache.ratis.protocol.RaftRetryFailureException: Failed 
> RaftClientRequest:client-4D77D2A8F653->omNode-3@group-523986131536, cid=9, 
> seq=0 RW, 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisClient$$Lambda$373/2067504307@6afa0221
>  for 10 attempts with RetryLimited(maxAttempts=10, sleepTime=100ms){code}
> Stacktrace
> {code:java}
> INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: 
> org.apache.ratis.protocol.RaftRetryFailureException: Failed 
> RaftClientRequest:client-4D77D2A8F653->omNode-3@group-523986131536, cid=9, 
> seq=0 RW, 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisClient$$Lambda$373/2067504307@6afa0221
>  for 10 attempts with RetryLimited(maxAttempts=10, sleepTime=100ms) at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:586)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.createVolume(OzoneManagerProtocolClientSideTranslatorPB.java:230)
>  at 
> org.apache.hadoop.ozone.web.storage.DistributedStorageHandler.createVolume(DistributedStorageHandler.java:179)
>  at 
> org.apache.hadoop.ozone.om.TestOzoneManagerHA.testCreateVolume(TestOzoneManagerHA.java:153)
>  at 
> org.apache.hadoop.ozone.om.TestOzoneManagerHA.testTwoOMNodesDown(TestOzoneManagerHA.java:138)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> 

[jira] [Work logged] (HDDS-1145) Add optional web server to the Ozone freon test tool

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1145?focusedWorklogId=202945=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202945
 ]

ASF GitHub Bot logged work on HDDS-1145:


Author: ASF GitHub Bot
Created on: 23/Feb/19 00:30
Start Date: 23/Feb/19 00:30
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #505: 
HDDS-1145. Add optional web server to the Ozone freon test tool
URL: https://github.com/apache/hadoop/pull/505#discussion_r259551416
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
 @@ -1890,4 +1890,63 @@
   the servlet.
 
   
+
+  
+ozone.freon.http-address
+0.0.0.0:9884
+OZONE, MANAGEMENT
+
+  The address and the base port where the FREON web ui will listen on.
+
+  If the port is 0 then the server will start on a free port.
+
+  
+  
+ozone.freon.http-bind-host
+0.0.0.0
+OZONE, MANAGEMENT
+
+  The actual address the Freon web server will bind to. If this
+  optional address is set, it overrides only the hostname portion of
+  ozone.freon.http-address.
+
+  
+  
+ozone.freon.http.enabled
+true
+OZONE, MANAGEMENT
+
+  Property to enable or disable FREON web ui.
+
+  
+  
+ozone.freon.https-address
+0.0.0.0:9885
+OZONE, MANAGEMENT
+
+  The address and the base port where the Freon web server will listen
+  on using HTTPS.
+
+  If the port is 0 then the server will start on a free port.
+
+  
+  
+ozone.freon.https-bind-host
+0.0.0.0
+OZONE, MANAGEMENT
+
+  The actual address the Freon web server will bind to using HTTPS.
+  If this optional address is set, it overrides only the hostname portion 
of
+  ozone.freon.http-address.
+
+  
+  
+ozone.freon.http.kerberos.principal
+HTTP/_h...@example.com
 
 Review comment:
   Can we add the SECURITY tag for all the security-related properties?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202945)
Time Spent: 20m  (was: 10m)

> Add optional web server to the Ozone freon test tool
> 
>
> Key: HDDS-1145
> URL: https://issues.apache.org/jira/browse/HDDS-1145
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Recently we improved the default HttpServer to support prometheus monitoring 
> and java profiling.
> It would be very useful to enable the same options for freon testing:
>  1. We need a simple way to profile freon and check the problems
>  2. Long running freons should be monitored
> We can create a new optional FreonHttpServer which includes all the required 
> servlets by default.
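
A minimal sketch of how the optional server could be wired in (the 
FreonHttpServer constructor and the config key are assumptions for 
illustration, matching the properties proposed in the pull request):
{code:java}
// Start the web server only when the optional flag is enabled, so plain
// freon runs are unaffected.
if (conf.getBoolean("ozone.freon.http.enabled", true)) {
  FreonHttpServer httpServer = new FreonHttpServer(conf);
  httpServer.start();   // serves the prometheus/profiler servlets by default
}
{code}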



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1145) Add optional web server to the Ozone freon test tool

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1145?focusedWorklogId=202948=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202948
 ]

ASF GitHub Bot logged work on HDDS-1145:


Author: ASF GitHub Bot
Created on: 23/Feb/19 00:30
Start Date: 23/Feb/19 00:30
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #505: 
HDDS-1145. Add optional web server to the Ozone freon test tool
URL: https://github.com/apache/hadoop/pull/505#discussion_r259552022
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
 @@ -1890,4 +1890,63 @@
   the servlet.
 
   
+
+  
+ozone.freon.http-address
+0.0.0.0:9884
+OZONE, MANAGEMENT
+
+  The address and the base port where the FREON web ui will listen on.
+
 
 Review comment:
   NIT pick: Unwanted extra line
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202948)
Time Spent: 50m  (was: 40m)

> Add optional web server to the Ozone freon test tool
> 
>
> Key: HDDS-1145
> URL: https://issues.apache.org/jira/browse/HDDS-1145
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Recently we improved the default HttpServer to support prometheus monitoring 
> and java profiling.
> It would be very useful to enable the same options for freon testing:
>  1. We need a simple way to profile freon and check the problems
>  2. Long running freons should be monitored
> We can create a new optional FreonHttpServer which includes all the required 
> servlets by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1145) Add optional web server to the Ozone freon test tool

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1145?focusedWorklogId=202946=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202946
 ]

ASF GitHub Bot logged work on HDDS-1145:


Author: ASF GitHub Bot
Created on: 23/Feb/19 00:30
Start Date: 23/Feb/19 00:30
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #505: 
HDDS-1145. Add optional web server to the Ozone freon test tool
URL: https://github.com/apache/hadoop/pull/505#discussion_r259552006
 
 

 ##
 File path: hadoop-hdds/common/src/main/resources/ozone-default.xml
 ##
 @@ -1890,4 +1890,63 @@
   the servlet.
 
   
+
+  
+ozone.freon.http-address
+0.0.0.0:9884
+OZONE, MANAGEMENT
+
+  The address and the base port where the FREON web ui will listen on.
+
+  If the port is 0 then the server will start on a free port.
+
+  
+  
+ozone.freon.http-bind-host
+0.0.0.0
+OZONE, MANAGEMENT
+
+  The actual address the Freon web server will bind to. If this
+  optional address is set, it overrides only the hostname portion of
+  ozone.freon.http-address.
+
+  
+  
+ozone.freon.http.enabled
+true
+OZONE, MANAGEMENT
+
+  Property to enable or disable FREON web ui.
+
+  
+  
+ozone.freon.https-address
+0.0.0.0:9885
+OZONE, MANAGEMENT
+
+  The address and the base port where the Freon web server will listen
+  on using HTTPS.
+
 
 Review comment:
   NIT pick: Unwanted extra line
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202946)
Time Spent: 0.5h  (was: 20m)

> Add optional web server to the Ozone freon test tool
> 
>
> Key: HDDS-1145
> URL: https://issues.apache.org/jira/browse/HDDS-1145
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Recently we improved the default HttpServer to support prometheus monitoring 
> and java profiling.
> It would be very useful to enable the same options for freon testing:
>  1. We need a simple way to profile freon and check the problems
>  2. Long running freons should be monitored
> We can create a new optional FreonHttpServer which includes all the required 
> servlets by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1145) Add optional web server to the Ozone freon test tool

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1145?focusedWorklogId=202947=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202947
 ]

ASF GitHub Bot logged work on HDDS-1145:


Author: ASF GitHub Bot
Created on: 23/Feb/19 00:30
Start Date: 23/Feb/19 00:30
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #505: 
HDDS-1145. Add optional web server to the Ozone freon test tool
URL: https://github.com/apache/hadoop/pull/505#discussion_r259551810
 
 

 ##
 File path: hadoop-ozone/tools/src/main/resources/webapps/freon/.gitkeep
 ##
 @@ -0,0 +1,17 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#  http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
 
 Review comment:
   Why do we need to do this?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202947)
Time Spent: 40m  (was: 0.5h)

> Add optional web server to the Ozone freon test tool
> 
>
> Key: HDDS-1145
> URL: https://issues.apache.org/jira/browse/HDDS-1145
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Tools
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Recently we improved the default HttpServer to support prometheus monitoring 
> and java profiling.
> It would be very useful to enable the same options for freon testing:
>  1. We need a simple way to profile freon and check the problems
>  2. Long running freons should be monitored
> We can create a new optional FreonHttpServer which includes all the required 
> servlets by default.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1158) TestOzoneManagerHA.testTwoOMNodesDown is failing with ratis error

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1158?focusedWorklogId=202944=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202944
 ]

ASF GitHub Bot logged work on HDDS-1158:


Author: ASF GitHub Bot
Created on: 23/Feb/19 00:21
Start Date: 23/Feb/19 00:21
Worklog Time Spent: 10m 
  Work Description: bharatviswa504 commented on pull request #512: 
HDDS-1158. TestOzoneManagerHA.testTwoOMNodesDown is failing with ratis error.
URL: https://github.com/apache/hadoop/pull/512#discussion_r259551047
 
 

 ##
 File path: 
hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/om/TestOzoneManagerHA.java
 ##
 @@ -150,17 +152,27 @@ private void testCreateVolume(boolean checkSuccess) 
throws Exception {
 createVolumeArgs.setUserName(userName);
 createVolumeArgs.setAdminName(adminName);
 
-storageHandler.createVolume(createVolumeArgs);
+try {
+  storageHandler.createVolume(createVolumeArgs);
 
-VolumeArgs getVolumeArgs = new VolumeArgs(volumeName, userArgs);
-VolumeInfo retVolumeinfo = storageHandler.getVolumeInfo(getVolumeArgs);
+  VolumeArgs getVolumeArgs = new VolumeArgs(volumeName, userArgs);
+  VolumeInfo retVolumeinfo = storageHandler.getVolumeInfo(getVolumeArgs);
 
-if (checkSuccess) {
-  Assert.assertTrue(retVolumeinfo.getVolumeName().equals(volumeName));
-  Assert.assertTrue(retVolumeinfo.getOwner().getName().equals(userName));
-} else {
-  // Verify that the request failed
-  Assert.assertTrue(retVolumeinfo.getVolumeName().isEmpty());
+  if (checkSuccess) {
+Assert.assertTrue(retVolumeinfo.getVolumeName().equals(volumeName));
+Assert.assertTrue(retVolumeinfo.getOwner().getName().equals(userName));
+  } else {
+// Verify that the request failed
+Assert.assertTrue(retVolumeinfo.getVolumeName().isEmpty());
+Assert.fail("There is no quorum. Request should have failed");
+  }
+} catch (OMException e) {
+  if (!checkSuccess) {
+GenericTestUtils.assertExceptionContains(
+"RaftRetryFailureException", e);
 
 Review comment:
   Hi @hanishakoneru 
   Will this be fixed once the client can find the leader and has retry logic?
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202944)
Time Spent: 0.5h  (was: 20m)

> TestOzoneManagerHA.testTwoOMNodesDown is failing with ratis error
> -
>
> Key: HDDS-1158
> URL: https://issues.apache.org/jira/browse/HDDS-1158
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Hanisha Koneru
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> h3. Error Message
> {code:java}
> org.apache.ratis.protocol.RaftRetryFailureException: Failed 
> RaftClientRequest:client-4D77D2A8F653->omNode-3@group-523986131536, cid=9, 
> seq=0 RW, 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisClient$$Lambda$373/2067504307@6afa0221
>  for 10 attempts with RetryLimited(maxAttempts=10, sleepTime=100ms){code}
> Stacktrace
> {code:java}
> INTERNAL_ERROR org.apache.hadoop.ozone.om.exceptions.OMException: 
> org.apache.ratis.protocol.RaftRetryFailureException: Failed 
> RaftClientRequest:client-4D77D2A8F653->omNode-3@group-523986131536, cid=9, 
> seq=0 RW, 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerRatisClient$$Lambda$373/2067504307@6afa0221
>  for 10 attempts with RetryLimited(maxAttempts=10, sleepTime=100ms) at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:586)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.createVolume(OzoneManagerProtocolClientSideTranslatorPB.java:230)
>  at 
> org.apache.hadoop.ozone.web.storage.DistributedStorageHandler.createVolume(DistributedStorageHandler.java:179)
>  at 
> org.apache.hadoop.ozone.om.TestOzoneManagerHA.testCreateVolume(TestOzoneManagerHA.java:153)
>  at 
> org.apache.hadoop.ozone.om.TestOzoneManagerHA.testTwoOMNodesDown(TestOzoneManagerHA.java:138)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at 

[jira] [Assigned] (HDDS-1132) Ozone serialization codec for Ozone S3 secret table

2019-02-22 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-1132:


Assignee: Bharat Viswanadham

> Ozone serialization codec for Ozone S3 secret table
> ---
>
> Key: HDDS-1132
> URL: https://issues.apache.org/jira/browse/HDDS-1132
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager, S3
>Reporter: Elek, Marton
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: newbie
>
> HDDS-748/HDDS-864 introduced an option to use strongly typed metadata tables 
> and moved the serialization/deserialization logic into separate codec 
> implementations.
> HDDS-937 introduced a new S3 secret table which is not codec based.
> I propose to use codecs for this table.
> In OzoneMetadataManager the return value of getS3SecretTable() should be 
> changed from the raw byte[]-based Table to a strongly typed Table. 
> The encoding/decoding logic of S3SecretValue should be registered in 
> ~OzoneMetadataManagerImpl:L204
> As the codecs are type based we may need a wrapper class to encode the String 
> kerberos id with md5: class S3SecretKey(String name = kerberosId). Long term 
> we can modify the S3SecretKey to support multiple keys for the same kerberos 
> id.
>  
>  
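
A rough sketch of the kind of codec this would need (following the Codec shape 
introduced by HDDS-748; the S3SecretValue accessors and constructor used below 
are assumptions for illustration, and the UTF-8 serialization is a placeholder, 
not the actual wire format):
{code:java}
public class S3SecretValueCodec implements Codec<S3SecretValue> {
  @Override
  public byte[] toPersistedFormat(S3SecretValue value) throws IOException {
    // placeholder serialization of the secret string
    return value.getAwsSecret().getBytes(StandardCharsets.UTF_8);
  }

  @Override
  public S3SecretValue fromPersistedFormat(byte[] rawData) throws IOException {
    return new S3SecretValue(null, new String(rawData, StandardCharsets.UTF_8));
  }
}
{code}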



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-1132) Ozone serialization codec for Ozone S3 secret table

2019-02-22 Thread Bharat Viswanadham (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-1132:


Assignee: (was: Bharat Viswanadham)

> Ozone serialization codec for Ozone S3 secret table
> ---
>
> Key: HDDS-1132
> URL: https://issues.apache.org/jira/browse/HDDS-1132
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager, S3
>Reporter: Elek, Marton
>Priority: Major
>  Labels: newbie
>
> HDDS-748/HDDS-864 introduced an option to use strongly typed metadata tables 
> and moved the serialization/deserialization logic into separate codec 
> implementations.
> HDDS-937 introduced a new S3 secret table which is not codec based.
> I propose to use codecs for this table.
> In OzoneMetadataManager the return value of getS3SecretTable() should be 
> changed from the raw byte[]-based Table to a strongly typed Table. 
> The encoding/decoding logic of S3SecretValue should be registered in 
> ~OzoneMetadataManagerImpl:L204
> As the codecs are type based we may need a wrapper class to encode the String 
> kerberos id with md5: class S3SecretKey(String name = kerberosId). Long term 
> we can modify the S3SecretKey to support multiple keys for the same kerberos 
> id.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14130) Make ZKFC ObserverNode aware

2019-02-22 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775709#comment-16775709
 ] 

Chao Sun commented on HDFS-14130:
-

+1 on patch v11.

> Make ZKFC ObserverNode aware
> 
>
> Key: HDFS-14130
> URL: https://issues.apache.org/jira/browse/HDFS-14130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha
>Affects Versions: HDFS-12943
>Reporter: Konstantin Shvachko
>Assignee: xiangheng
>Priority: Major
> Attachments: HDFS-14130-HDFS-12943.001.patch, 
> HDFS-14130-HDFS-12943.003.patch, HDFS-14130-HDFS-12943.004.patch, 
> HDFS-14130-HDFS-12943.005.patch, HDFS-14130-HDFS-12943.006.patch, 
> HDFS-14130-HDFS-12943.007.patch, HDFS-14130.008.patch, HDFS-14130.009.patch, 
> HDFS-14130.010.patch, HDFS-14130.011.patch
>
>
> Need to fix automatic failover with ZKFC. Currently it does not know about 
> ObserverNodes trying to convert them to SBNs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14118) Use DNS to resolve Namenodes and Routers

2019-02-22 Thread Fengnan Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775705#comment-16775705
 ] 

Fengnan Li commented on HDFS-14118:
---

[~yzhangal] Your suggestion makes a lot of sense. I have updated the patch 
with the new descriptions. Thanks!

> Use DNS to resolve Namenodes and Routers
> 
>
> Key: HDFS-14118
> URL: https://issues.apache.org/jira/browse/HDFS-14118
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: DNS testing log, HDFS design doc_ Single domain name for 
> clients - Google Docs-1.pdf, HDFS design doc_ Single domain name for clients 
> - Google Docs.pdf, HDFS-14118.001.patch, HDFS-14118.002.patch, 
> HDFS-14118.003.patch, HDFS-14118.004.patch, HDFS-14118.005.patch, 
> HDFS-14118.006.patch, HDFS-14118.007.patch, HDFS-14118.008.patch, 
> HDFS-14118.009.patch, HDFS-14118.010.patch, HDFS-14118.011.patch, 
> HDFS-14118.012.patch, HDFS-14118.013.patch, HDFS-14118.014.patch, 
> HDFS-14118.015.patch, HDFS-14118.016.patch, HDFS-14118.017.patch, 
> HDFS-14118.018.patch, HDFS-14118.019.patch, HDFS-14118.020.patch, 
> HDFS-14118.021.patch, HDFS-14118.022.patch, HDFS-14118.023.patch, 
> HDFS-14118.patch
>
>
> Clients will need to know about routers to talk to the HDFS cluster 
> (obviously), and having routers updating (adding/removing) will have to make 
> every client change, which is a painful process.
> DNS can be used here to resolve the single domain name clients knows to a 
> list of routers in the current config. However, DNS won't be able to consider 
> only resolving to the working router based on certain health thresholds.
> There are some ways about how this can be solved. One way is to have a 
> separate script to regularly check the status of the router and update the 
> DNS records if a router fails the health thresholds. In this way, security 
> might be carefully considered for this way. Another way is to have the client 
> do the normal connecting/failover after they get the list of routers, which 
> requires the change of current failover proxy provider.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14118) Use DNS to resolve Namenodes and Routers

2019-02-22 Thread Fengnan Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li updated HDFS-14118:
--
Attachment: HDFS-14118.023.patch

> Use DNS to resolve Namenodes and Routers
> 
>
> Key: HDFS-14118
> URL: https://issues.apache.org/jira/browse/HDFS-14118
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: DNS testing log, HDFS design doc_ Single domain name for 
> clients - Google Docs-1.pdf, HDFS design doc_ Single domain name for clients 
> - Google Docs.pdf, HDFS-14118.001.patch, HDFS-14118.002.patch, 
> HDFS-14118.003.patch, HDFS-14118.004.patch, HDFS-14118.005.patch, 
> HDFS-14118.006.patch, HDFS-14118.007.patch, HDFS-14118.008.patch, 
> HDFS-14118.009.patch, HDFS-14118.010.patch, HDFS-14118.011.patch, 
> HDFS-14118.012.patch, HDFS-14118.013.patch, HDFS-14118.014.patch, 
> HDFS-14118.015.patch, HDFS-14118.016.patch, HDFS-14118.017.patch, 
> HDFS-14118.018.patch, HDFS-14118.019.patch, HDFS-14118.020.patch, 
> HDFS-14118.021.patch, HDFS-14118.022.patch, HDFS-14118.023.patch, 
> HDFS-14118.patch
>
>
> Clients will need to know about routers to talk to the HDFS cluster 
> (obviously), and having routers updating (adding/removing) will have to make 
> every client change, which is a painful process.
> DNS can be used here to resolve the single domain name clients knows to a 
> list of routers in the current config. However, DNS won't be able to consider 
> only resolving to the working router based on certain health thresholds.
> There are some ways about how this can be solved. One way is to have a 
> separate script to regularly check the status of the router and update the 
> DNS records if a router fails the health thresholds. In this way, security 
> might be carefully considered for this way. Another way is to have the client 
> do the normal connecting/failover after they get the list of routers, which 
> requires the change of current failover proxy provider.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14130) Make ZKFC ObserverNode aware

2019-02-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775703#comment-16775703
 ] 

Hadoop QA commented on HDFS-14130:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 17m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
18m 54s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
9s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 22m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 22m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 16s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
38s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 10m 
19s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}110m 31s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
55s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}240m  8s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14130 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12959841/HDFS-14130.011.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ac180087c489 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 7d3b567 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| unit | 

[jira] [Commented] (HDFS-14052) RBF: Use Router keytab for WebHDFS

2019-02-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775698#comment-16775698
 ] 

Hadoop QA commented on HDFS-14052:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-13891 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
43s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
31s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
24s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
32s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 20s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
53s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} HDFS-13891 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  8s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 22m 
51s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
29s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 70m 39s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14052 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12959850/HDFS-14052-HDFS-13891.3.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux aca5caa0a0e7 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-13891 / 0477b0b |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26308/testReport/ |
| Max. process+thread count | 1374 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26308/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RBF: Use Router keytab for WebHDFS
> --
>
> Key: HDFS-14052
>

[jira] [Commented] (HDFS-14118) Use DNS to resolve Namenodes and Routers

2019-02-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775665#comment-16775665
 ] 

Hadoop QA commented on HDFS-14118:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
36s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 3 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
40s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 21m 
41s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m 
25s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  4m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  4m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
23m 25s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
37s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
39s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
22s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
3s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 57s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
38s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m 
49s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
55s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}113m 35s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}257m 28s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.server.namenode.ha.TestEditLogTailer |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14118 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12959831/HDFS-14118.022.patch |
| Optional Tests |  dupname  asflicense  compile  javac  

[jira] [Assigned] (HDDS-1165) Document generation in maven should be configured on execution level

2019-02-22 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton reassigned HDDS-1165:
--

Assignee: (was: Elek, Marton)

> Document generation in maven should be configured on execution level 
> -
>
> Key: HDDS-1165
> URL: https://issues.apache.org/jira/browse/HDDS-1165
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Priority: Major
>
> Documentation of Ozone/Hdds project is generated from maven with the help of 
> maven exec plugin.
> There are multiple ways to configure plugins in maven. Plugin can be 
> configured on plugin level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>     </execution>
>   </executions>
>   <configuration>
>     ...
>   </configuration>
> </plugin>
> {code}
> In this case not only the specific execution but all executions will be 
> configured (even if it's triggered by mvn exec:exec).
> Or it can be configured on the execution level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>       <configuration>
>         ...
>       </configuration>
>     </execution>
>   </executions>
> </plugin>
>   {code}
> In this case the configuration is valid only for this specific execution, 
> which is bound to a specific phase (compile in this case).
> Unfortunately it's configured in the wrong way in hadoop-hdds/docs/pom.xml: 
> the first approach should be replaced with the second by moving the 
> configuration inside the execution.
> Without this change Yetus can't detect the dependency order.
> How to test:
> The easiest way to reproduce the problem is to execute:
> {code:java}
> mvn  -fae exec:exec -Dexec.executable=pwd -Dexec.args='' -Phdds{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14272) [SBN read] ObserverReadProxyProvider should sync with active txnID on startup

2019-02-22 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775664#comment-16775664
 ] 

Chao Sun commented on HDFS-14272:
-

Thanks [~xkrogen]. The change on {{msync}} looks good to me. Some very minor 
comments:
 # Can we add one or two comments for the newly added {{msynced}} field?
 # Instead of:
{code:java}
  if (!msynced) {
// If this was reached, the request reached the active, so the
// state is up-to-date with active and no further msync is needed.
msynced = true;
  }
{code}
can we remove the if clause?
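For illustration, a minimal sketch of the simplification suggested in item 2, assuming {{msynced}} stays a volatile boolean so an unconditional write is safe:
{code:java}
// If this was reached, the request reached the active, so the
// state is up-to-date with active and no further msync is needed.
// Writing unconditionally avoids the extra read/branch on the volatile.
msynced = true;
{code}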

> [SBN read] ObserverReadProxyProvider should sync with active txnID on startup
> -
>
> Key: HDFS-14272
> URL: https://issues.apache.org/jira/browse/HDFS-14272
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
> Environment: CDH6.1 (Hadoop 3.0.x) + Consistency Reads from Standby + 
> SSL + Kerberos + RPC encryption
>Reporter: Wei-Chiu Chuang
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14272.000.patch, HDFS-14272.001.patch
>
>
> It is typical for integration tests to create some files and then check their 
> existence. For example, like the following simple bash script:
> {code:java}
> # hdfs dfs -touchz /tmp/abc
> # hdfs dfs -ls /tmp/abc
> {code}
> The test executes the HDFS bash commands sequentially, but it may fail with 
> Consistent Standby Read because the -ls does not find the file.
> Analysis: the second bash command, while launched sequentially after the 
> first one, is not aware of the state id returned from the first bash command. 
> So the ObserverNode wouldn't wait for the edits to get propagated, and thus 
> the -ls fails.
> I've got a cluster where the Observer has tens of seconds of RPC latency, and 
> this becomes very annoying. (I am still trying to figure out why this 
> Observer has such a long RPC latency. But that's another story.)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14052) RBF: Use Router keytab for WebHDFS

2019-02-22 Thread CR Hota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

CR Hota updated HDFS-14052:
---
Attachment: HDFS-14052-HDFS-13891.3.patch

> RBF: Use Router keytab for WebHDFS
> --
>
> Key: HDFS-14052
> URL: https://issues.apache.org/jira/browse/HDFS-14052
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14052-HDFS-13891.0.patch, 
> HDFS-14052-HDFS-13891.1.patch, HDFS-14052-HDFS-13891.2.patch, 
> HDFS-14052-HDFS-13891.3.patch
>
>
> When the RouterHttpServer starts it does:
> {code}
> NameNodeHttpServer.initWebHdfs(conf, httpAddress.getHostName(), 
> httpServer,
> RouterWebHdfsMethods.class.getPackage().getName());
> {code}
> This function is in the NN and is pretty generic.
> However, it then calls to NameNodeHttpServer#getAuthFilterParams, which does:
> {code}
> String httpKeytab = conf.get(DFSUtil.getSpnegoKeytabKey(conf,
> DFSConfigKeys.DFS_NAMENODE_KEYTAB_FILE_KEY));
> {code}
> In most cases, the regular web keytab will kick in, but we should make this a 
> parameter and load the Router one just in case.
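As a rough, hedged illustration of that suggestion (not the actual patch), the lookup could fall back from a Router-specific key, assuming a constant such as the DFS_ROUTER_KEYTAB_FILE_KEY mentioned elsewhere in this thread:
{code:java}
// Sketch only: prefer the Router keytab when configured, otherwise fall back to
// the NameNode SPNEGO/keytab lookup that getAuthFilterParams() uses today.
String routerKeytab = conf.get(RBFConfigKeys.DFS_ROUTER_KEYTAB_FILE_KEY);
String httpKeytab = (routerKeytab != null && !routerKeytab.isEmpty())
    ? routerKeytab
    : conf.get(DFSUtil.getSpnegoKeytabKey(conf,
        DFSConfigKeys.DFS_NAMENODE_KEYTAB_FILE_KEY));
{code}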



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1164) Add New blockade Tests to test Replica Manager

2019-02-22 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-1164:
---
Target Version/s: 0.4.0

> Add New blockade Tests to test Replica Manager
> --
>
> Key: HDDS-1164
> URL: https://issues.apache.org/jira/browse/HDDS-1164
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Nilotpal Nandi
>Assignee: Nilotpal Nandi
>Priority: Major
> Attachments: HDDS-1164.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1165) Document generation in maven should be configured on execution level

2019-02-22 Thread Elek, Marton (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elek, Marton updated HDDS-1165:
---
Summary: Document generation in maven should be configured on execution 
level   (was: Document generation in maven should be configured on execition 
level )

> Document generation in maven should be configured on execution level 
> -
>
> Key: HDDS-1165
> URL: https://issues.apache.org/jira/browse/HDDS-1165
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>
> Documentation of Ozone/Hdds project is generated from maven with the help of 
> maven exec plugin.
> There are multiple ways to configure plugins in maven. Plugin can be 
> configured on plugin level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>     </execution>
>   </executions>
>   <configuration>
>     ...
>   </configuration>
> </plugin>
> {code}
> In this case not only the specific execution but all executions will be 
> configured (even if it's triggered by mvn exec:exec).
> Or it can be configured on the execution level:
> {code:java}
> <plugin>
>   <groupId>org.codehaus.mojo</groupId>
>   <artifactId>exec-maven-plugin</artifactId>
>   <version>1.6.0</version>
>   <executions>
>     <execution>
>       <goals>
>         <goal>exec</goal>
>       </goals>
>       <phase>compile</phase>
>       <configuration>
>         ...
>       </configuration>
>     </execution>
>   </executions>
> </plugin>
>   {code}
> In this case the configuration is valid only for this specific execution, 
> which is bound to a specific phase (compile in this case).
> Unfortunately it's configured in the wrong way in hadoop-hdds/docs/pom.xml: 
> the first approach should be replaced with the second by moving the 
> configuration inside the execution.
> Without this change Yetus can't detect the dependency order.
> How to test:
> The easiest way to reproduce the problem is to execute:
> {code:java}
> mvn  -fae exec:exec -Dexec.executable=pwd -Dexec.args='' -Phdds{code}
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-1165) Document generation in maven should be configured on execition level

2019-02-22 Thread Elek, Marton (JIRA)
Elek, Marton created HDDS-1165:
--

 Summary: Document generation in maven should be configured on 
execition level 
 Key: HDDS-1165
 URL: https://issues.apache.org/jira/browse/HDDS-1165
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Elek, Marton
Assignee: Elek, Marton


Documentation of Ozone/Hdds project is generated from maven with the help of 
maven exec plugin.

There are multiple ways to configure plugins in maven. Plugin can be configured 
on plugin level:
{code:java}
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <version>1.6.0</version>
  <executions>
    <execution>
      <goals>
        <goal>exec</goal>
      </goals>
      <phase>compile</phase>
    </execution>
  </executions>
  <configuration>
    ...
  </configuration>
</plugin>
{code}
In this case not only the specific execution but all executions will be 
configured (even if it's triggered by mvn exec:exec).

Or it can be configured on the execution level:
{code:java}
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>exec-maven-plugin</artifactId>
  <version>1.6.0</version>
  <executions>
    <execution>
      <goals>
        <goal>exec</goal>
      </goals>
      <phase>compile</phase>
      <configuration>
        ...
      </configuration>
    </execution>
  </executions>
</plugin>
  {code}
In this case the configuration is valid only for this specific execution, 
which is bound to a specific phase (compile in this case).

Unfortunately it's configured in the wrong way in hadoop-hdds/docs/pom.xml: the 
first approach should be replaced with the second by moving the configuration 
inside the execution.

Without this change Yetus can't detect the dependency order.

How to test:

The easiest way to reproduce the problem is to execute:
{code:java}
mvn  -fae exec:exec -Dexec.executable=pwd -Dexec.args='' -Phdds{code}
 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1149) Change the default ozone.client.checksum.type

2019-02-22 Thread Anu Engineer (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDDS-1149:
---
Target Version/s: 0.4.0

> Change the default ozone.client.checksum.type
> -
>
> Key: HDDS-1149
> URL: https://issues.apache.org/jira/browse/HDDS-1149
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDDS-1149.00.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14279) [SBN Read] Race condition in ObserverReadProxyProvider

2019-02-22 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775639#comment-16775639
 ] 

Erik Krogen commented on HDFS-14279:


Thanks for the review, [~shv]. I just committed this to trunk.

> [SBN Read] Race condition in ObserverReadProxyProvider
> --
>
> Key: HDFS-14279
> URL: https://issues.apache.org/jira/browse/HDFS-14279
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14279.000.patch, HDFS-14279.001.patch
>
>
> There is a race condition in {{ObserverReadProxyProvider#getCurrentProxy()}}:
> {code}
>   private NNProxyInfo getCurrentProxy() {
> if (currentProxy == null) {
>   changeProxy(null);
> }
> return currentProxy;
>   }
> {code}
> {{currentProxy}} is a {{volatile}}. Another {{changeProxy()}} could occur 
> after the {{changeProxy()}} and before the {{return}}, thus making the return 
> value incorrect. I have seen this result in an NPE.
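For illustration only, a hedged sketch of one way the race can be avoided by reading the volatile once into a local (not necessarily the committed fix):
{code:java}
  private NNProxyInfo getCurrentProxy() {
    // Read the volatile once so the null check and the return use the same reference.
    NNProxyInfo proxy = currentProxy;
    if (proxy == null) {
      changeProxy(null);    // assumed to populate currentProxy
      proxy = currentProxy; // re-read after the change
    }
    return proxy;
  }
{code}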



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1126) Datanode is trying to quasi-close a container which is already closed

2019-02-22 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775645#comment-16775645
 ] 

Anu Engineer commented on HDDS-1126:


[~nandakumar131] it is not upgraded to 100 lines, I see the issue in our 
internal runs. This needs to be fixed.


> Datanode is trying to quasi-close a container which is already closed
> -
>
> Key: HDDS-1126
> URL: https://issues.apache.org/jira/browse/HDDS-1126
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Nilotpal Nandi
>Assignee: Nanda kumar
>Priority: Major
> Fix For: 0.4.0
>
> Attachments: HDDS-1126.000.patch
>
>
> steps taken :
> 
>  # created a 12-datanode cluster and ran workload on all the nodes
>  # ran failure injection/restart on 1 datanode at a time, periodically and 
> randomly.
>  
> Error seen in ozone.log :
> --
>  
> {noformat}
> 2019-02-18 06:06:32,780 [Datanode State Machine Thread - 0] DEBUG 
> (DatanodeStateMachine.java:176) - Executing cycle Number : 30
> 2019-02-18 06:06:32,784 [Command processor thread] DEBUG 
> (CloseContainerCommandHandler.java:71) - Processing Close Container command.
> 2019-02-18 06:06:32,785 [Datanode State Machine Thread - 0] DEBUG 
> (DatanodeStateMachine.java:176) - Executing cycle Number : 31
> 2019-02-18 06:06:32,785 [Command processor thread] ERROR 
> (CloseContainerCommandHandler.java:118) - Can't close container #37
> org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException:
>  Cannot quasi close container #37 while in CLOSED state.
>  at 
> org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.quasiCloseContainer(KeyValueHandler.java:903)
>  at 
> org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.quasiCloseContainer(ContainerController.java:93)
>  at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler.handle(CloseContainerCommandHandler.java:110)
>  at 
> org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93)
>  at 
> org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:413)
>  at java.lang.Thread.run(Thread.java:748)
> 2019-02-18 06:06:32,785 [Command processor thread] DEBUG 
> (CloseContainerCommandHandler.java:71) - Processing Close Container command.
> 2019-02-18 06:06:32,788 [Command processor thread] DEBUG 
> (CloseContainerCommandHandler.java:71) - Processing Close Container command.
> 2019-02-18 06:06:32,788 [Datanode State Machine Thread - 0] DEBUG 
> (DatanodeStateMachine.java:176) - Executing cycle Number : 32
> 2019-02-18 06:06:34,430 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:36,608 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:38,876 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:41,084 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:43,297 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:45,469 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:47,684 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:49,958 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:52,124 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:54,344 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:56,499 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:06:58,764 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:07:00,969 [main] DEBUG (OzoneClientFactory.java:287) - Using 
> org.apache.hadoop.ozone.client.rpc.RpcClient as client protocol.
> 2019-02-18 06:07:02,788 [Datanode State Machine Thread - 0] DEBUG 
> (DatanodeStateMachine.java:176) - Executing cycle Number : 33
> 

[jira] [Commented] (HDFS-14279) [SBN Read] Race condition in ObserverReadProxyProvider

2019-02-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775644#comment-16775644
 ] 

Hudson commented on HDFS-14279:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16033 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16033/])
HDFS-14279. [SBN read] Fix race condition in ObserverReadProxyProvider. 
(xkrogen: rev bad3ffd2907d75395907ff6b76c909ab50add4bc)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/ObserverReadProxyProvider.java


> [SBN Read] Race condition in ObserverReadProxyProvider
> --
>
> Key: HDFS-14279
> URL: https://issues.apache.org/jira/browse/HDFS-14279
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14279.000.patch, HDFS-14279.001.patch
>
>
> There is a race condition in {{ObserverReadProxyProvider#getCurrentProxy()}}:
> {code}
>   private NNProxyInfo getCurrentProxy() {
> if (currentProxy == null) {
>   changeProxy(null);
> }
> return currentProxy;
>   }
> {code}
> {{currentProxy}} is a {{volatile}}. Another {{changeProxy()}} could occur 
> after the {{changeProxy()}} and before the {{return}}, thus making the return 
> value incorrect. I have seen this result in an NPE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14279) [SBN Read] Race condition in ObserverReadProxyProvider

2019-02-22 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14279:
---
   Resolution: Fixed
Fix Version/s: 3.3.0
   Status: Resolved  (was: Patch Available)

> [SBN Read] Race condition in ObserverReadProxyProvider
> --
>
> Key: HDFS-14279
> URL: https://issues.apache.org/jira/browse/HDFS-14279
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14279.000.patch, HDFS-14279.001.patch
>
>
> There is a race condition in {{ObserverReadProxyProvider#getCurrentProxy()}}:
> {code}
>   private NNProxyInfo getCurrentProxy() {
> if (currentProxy == null) {
>   changeProxy(null);
> }
> return currentProxy;
>   }
> {code}
> {{currentProxy}} is a {{volatile}}. Another {{changeProxy()}} could occur 
> after the {{changeProxy()}} and before the {{return}}, thus making the return 
> value incorrect. I have seen this result in an NPE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1120) Add a config to disable checksum verification during read even though checksum data is present in the persisted data

2019-02-22 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775627#comment-16775627
 ] 

Bharat Viswanadham commented on HDDS-1120:
--

Thank You [~linyiqun] for the review. Addressed your review comment and updated 
the PR.

 

> Add a config to disable checksum verification during read even though 
> checksum data is present in the persisted data
> 
>
> Key: HDDS-1120
> URL: https://issues.apache.org/jira/browse/HDDS-1120
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Client
>Affects Versions: 0.4.0
>Reporter: Shashikant Banerjee
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.4.0
>
> Attachments: HDDS-1120.00.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, if the checksum is computed during a data write and persisted on 
> disk, we will always end up verifying it while reading. This Jira aims to 
> selectively disable checksum verification during reads even though checksum 
> info is present in the stored data.
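As a hedged illustration of the intent (the property name and helper below are assumptions for the sketch, not the final config key):
{code:java}
// Sketch only: honour a client-side switch before verifying persisted checksums on read.
boolean verifyChecksum =
    conf.getBoolean("ozone.client.verify.checksum", true);  // assumed key name
if (verifyChecksum && chunkInfo.hasChecksumData()) {
  // Only pay the verification cost when the client has not opted out.
  verifyChunkChecksum(chunkData, chunkInfo.getChecksumData());  // hypothetical helper
}
{code}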



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14279) [SBN Read] Race condition in ObserverReadProxyProvider

2019-02-22 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775601#comment-16775601
 ] 

Konstantin Shvachko commented on HDFS-14279:


Looks great. +1

> [SBN Read] Race condition in ObserverReadProxyProvider
> --
>
> Key: HDFS-14279
> URL: https://issues.apache.org/jira/browse/HDFS-14279
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs, namenode
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14279.000.patch, HDFS-14279.001.patch
>
>
> There is a race condition in {{ObserverReadProxyProvider#getCurrentProxy()}}:
> {code}
>   private NNProxyInfo getCurrentProxy() {
> if (currentProxy == null) {
>   changeProxy(null);
> }
> return currentProxy;
>   }
> {code}
> {{currentProxy}} is a {{volatile}}. Another {{changeProxy()}} could occur 
> after the {{changeProxy()}} and before the {{return}}, thus making the return 
> value incorrect. I have seen this result in an NPE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-1070) Adding Node and Pipeline related metrics in SCM

2019-02-22 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775586#comment-16775586
 ] 

Bharat Viswanadham edited comment on HDDS-1070 at 2/22/19 8:57 PM:
---

Can we implement MetricsSource for the MXBean? Some time back there was a 
discussion that, without implementing MetricsSource, Ambari will not be able 
to consume the metrics.

Refer to HDFS-8232; there are also other HDDS JIRAs open for the same issue.

HDDS-910. When we have the @Metrics annotation we don't need to implement it, but 
for MXBean metrics to be collected by external metrics systems we need to 
implement the MetricsSource interface.


was (Author: bharatviswa):
Can we implement the metricsSource for MXBean, as some time there is a 
discussion that without implementing the metricsSource, ambari will not be able 
to consume metrics?

 

Refer HDFS-8232. and there are also other HDDS jira's which are opened for the 
same issue.

> Adding Node and Pipeline related metrics in SCM
> ---
>
> Key: HDDS-1070
> URL: https://issues.apache.org/jira/browse/HDDS-1070
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Affects Versions: 0.3.0
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Attachments: HDDS-1070.000.patch, HDDS-1070.001.patch, 
> HDDS-1070.002.patch
>
>
> This jira aims to add more Node and Pipeline related metrics to SCM.
> Following metrics will be added as part of this jira:
>  * numberOfSuccessfulPipelineCreation
>  * numberOfFailedPipelineCreation
>  * numberOfSuccessfulPipelineDestroy
>  * numberOfFailedPipelineDestroy
>  * numberOfPipelineReportProcessed
>  * numberOfNodeReportProcessed
>  * numberOfHBProcessed
>  * number of pipelines in different PipelineState
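To make the MetricsSource suggestion in the comment above concrete, a minimal hedged sketch (class and record names are illustrative, not the actual patch):
{code:java}
import org.apache.hadoop.metrics2.MetricsCollector;
import org.apache.hadoop.metrics2.MetricsSource;
import org.apache.hadoop.metrics2.lib.Interns;

// Sketch: expose the counters through MetricsSource so external collectors
// (e.g. Ambari) can scrape them via the metrics2 framework, in addition to the MXBean.
public class SCMPipelineMetricsSource implements MetricsSource {
  // Counters named after the metrics listed in the description; how they are
  // incremented is out of scope for this sketch.
  private long numberOfSuccessfulPipelineCreation;
  private long numberOfFailedPipelineCreation;

  @Override
  public void getMetrics(MetricsCollector collector, boolean all) {
    collector.addRecord("SCMPipelineMetrics")
        .addCounter(Interns.info("NumSuccessfulPipelineCreation",
            "Number of successful pipeline creations"),
            numberOfSuccessfulPipelineCreation)
        .addCounter(Interns.info("NumFailedPipelineCreation",
            "Number of failed pipeline creations"),
            numberOfFailedPipelineCreation);
  }
}
{code}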



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1070) Adding Node and Pipeline related metrics in SCM

2019-02-22 Thread Bharat Viswanadham (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775586#comment-16775586
 ] 

Bharat Viswanadham commented on HDDS-1070:
--

Can we implement MetricsSource for the MXBean? Some time back there was a 
discussion that, without implementing MetricsSource, Ambari will not be able 
to consume the metrics.

Refer to HDFS-8232; there are also other HDDS JIRAs open for the same issue.

> Adding Node and Pipeline related metrics in SCM
> ---
>
> Key: HDDS-1070
> URL: https://issues.apache.org/jira/browse/HDDS-1070
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Affects Versions: 0.3.0
>Reporter: Nanda kumar
>Assignee: Nanda kumar
>Priority: Major
> Attachments: HDDS-1070.000.patch, HDDS-1070.001.patch, 
> HDDS-1070.002.patch
>
>
> This jira aims to add more Node and Pipeline related metrics to SCM.
> Following metrics will be added as part of this jira:
>  * numberOfSuccessfulPipelineCreation
>  * numberOfFailedPipelineCreation
>  * numberOfSuccessfulPipelineDestroy
>  * numberOfFailedPipelineDestroy
>  * numberOfPipelineReportProcessed
>  * numberOfNodeReportProcessed
>  * numberOfHBProcessed
>  * number of pipelines in different PipelineState



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1120) Add a config to disable checksum verification during read even though checksum data is present in the persisted data

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1120?focusedWorklogId=202858=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202858
 ]

ASF GitHub Bot logged work on HDDS-1120:


Author: ASF GitHub Bot
Created on: 22/Feb/19 20:47
Start Date: 22/Feb/19 20:47
Worklog Time Spent: 10m 
  Work Description: apache-yetus commented on issue #513: HDDS-1120. Add a 
config to disable checksum verification during read …
URL: https://github.com/apache/hadoop/pull/513#issuecomment-466541739
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 23 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1004 | trunk passed |
   | +1 | compile | 933 | trunk passed |
   | +1 | checkstyle | 190 | trunk passed |
   | -1 | mvnsite | 39 | client in trunk failed. |
   | -1 | mvnsite | 25 | client in trunk failed. |
   | -1 | mvnsite | 25 | integration-test in trunk failed. |
   | -1 | mvnsite | 21 | objectstore-service in trunk failed. |
   | -1 | mvnsite | 18 | ozone-manager in trunk failed. |
   | +1 | shadedclient | 1040 | branch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | -1 | findbugs | 20 | client in trunk failed. |
   | -1 | findbugs | 20 | client in trunk failed. |
   | -1 | findbugs | 17 | objectstore-service in trunk failed. |
   | -1 | findbugs | 16 | ozone-manager in trunk failed. |
   | -1 | javadoc | 19 | client in trunk failed. |
   | -1 | javadoc | 21 | client in trunk failed. |
   | -1 | javadoc | 22 | integration-test in trunk failed. |
   | -1 | javadoc | 18 | objectstore-service in trunk failed. |
   | -1 | javadoc | 20 | ozone-manager in trunk failed. |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 23 | Maven dependency ordering for patch |
   | -1 | mvninstall | 9 | client in the patch failed. |
   | -1 | mvninstall | 10 | client in the patch failed. |
   | -1 | mvninstall | 10 | integration-test in the patch failed. |
   | -1 | mvninstall | 10 | objectstore-service in the patch failed. |
   | -1 | mvninstall | 11 | ozone-manager in the patch failed. |
   | +1 | compile | 908 | the patch passed |
   | +1 | javac | 908 | the patch passed |
   | +1 | checkstyle | 185 | the patch passed |
   | -1 | mvnsite | 27 | client in the patch failed. |
   | -1 | mvnsite | 28 | client in the patch failed. |
   | -1 | mvnsite | 27 | integration-test in the patch failed. |
   | -1 | mvnsite | 27 | objectstore-service in the patch failed. |
   | -1 | mvnsite | 27 | ozone-manager in the patch failed. |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | xml | 2 | The patch has no ill-formed XML file. |
   | +1 | shadedclient | 670 | patch has no errors when building and testing 
our client artifacts. |
   | 0 | findbugs | 0 | Skipped patched modules with no Java source: 
hadoop-ozone/integration-test |
   | -1 | findbugs | 27 | client in the patch failed. |
   | -1 | findbugs | 29 | client in the patch failed. |
   | -1 | findbugs | 27 | objectstore-service in the patch failed. |
   | -1 | findbugs | 28 | ozone-manager in the patch failed. |
   | -1 | javadoc | 28 | client in the patch failed. |
   | -1 | javadoc | 28 | client in the patch failed. |
   | -1 | javadoc | 27 | integration-test in the patch failed. |
   | -1 | javadoc | 29 | objectstore-service in the patch failed. |
   | -1 | javadoc | 27 | ozone-manager in the patch failed. |
   ||| _ Other Tests _ |
   | -1 | unit | 28 | client in the patch failed. |
   | -1 | unit | 86 | common in the patch failed. |
   | -1 | unit | 27 | client in the patch failed. |
   | -1 | unit | 27 | integration-test in the patch failed. |
   | -1 | unit | 28 | objectstore-service in the patch failed. |
   | -1 | unit | 28 | ozone-manager in the patch failed. |
   | +1 | asflicense | 47 | The patch does not generate ASF License warnings. |
   | | | 6289 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | 
hadoop.hdds.security.x509.certificate.client.TestDefaultCertificateClient |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-513/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/513 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
   | uname | 

[jira] [Commented] (HDDS-726) Ozone Client should update SCM to move the container out of allocation path in case a write transaction fails

2019-02-22 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775574#comment-16775574
 ] 

Nanda kumar commented on HDDS-726:
--

[~shashikant], can you rebase the patch on latest changes on the trunk? The 
patch is not applying anymore.

> Ozone Client should update SCM to move the container out of allocation path 
> in case a write transaction fails
> -
>
> Key: HDDS-726
> URL: https://issues.apache.org/jira/browse/HDDS-726
> Project: Hadoop Distributed Data Store
>  Issue Type: Test
>Reporter: Shashikant Banerjee
>Assignee: Shashikant Banerjee
>Priority: Major
> Attachments: HDDS-726.000.patch, HDDS-726.001.patch, 
> HDDS-726.002.patch, HDDS-726.003.patch, HDDS-726.004.patch, HDDS-726.005.patch
>
>
> Once a container write transaction fails, the container will be marked corrupted. Once the 
> Ozone client gets an exception in such a case, it should tell SCM to move the 
> container out of the allocation path. SCM will eventually close the container.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1148) After allocating container, we are not adding to container DB.

2019-02-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775572#comment-16775572
 ] 

Hudson commented on HDDS-1148:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16032 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16032/])
HDDS-1148. After allocating container, we are not adding to container 
(nandakumar131: rev 70579805c97c0affb22b036ce8d2795007cdb6dc)
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerStateManager.java
* (edit) 
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/SCMContainerManager.java


> After allocating container, we are not adding to container DB.
> --
>
> Key: HDDS-1148
> URL: https://issues.apache.org/jira/browse/HDDS-1148
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.4.0
>
> Attachments: HDDS-1148.00.patch, HDDS-1148.01.patch, 
> HDDS-1148.02.patch, HDDS-1148.03.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> If we don't do that, we get an error when handling a container report for open 
> containers, as they don't exist in the container DB.
>  
> {code:java}
> scm_1           | at java.lang.Thread.run(Thread.java:748)
> scm_1           | 2019-02-21 00:00:32 ERROR ContainerReportHandler:173 - 
> Received container report for an unknown container 1 from datanode 
> e2733c00-162b-4993-a986-f6104f5008d8{ip: 172.18.0.2, host: 4f4e683d86c3} {}
> scm_1           | 
> org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: #1
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.checkIfContainerExist(ContainerStateMap.java:543)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.updateContainerReplica(ContainerStateMap.java:230)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerReplica(ContainerStateManager.java:565)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerReplica(SCMContainerManager.java:393)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ReportHandlerHelper.processContainerReplica(ReportHandlerHelper.java:74)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:159)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:110)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:51)
> scm_1           | at 
> org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
> scm_1           | at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> scm_1           | at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {code}
>  
>  
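For context, a hedged sketch of the kind of change implied; the method and store names below are illustrative assumptions, not the committed patch:
{code:java}
// Sketch only: after allocating a container, also persist it so that a later
// ContainerReport for it can be resolved instead of hitting ContainerNotFoundException.
ContainerInfo containerInfo = containerStateManager.allocateContainer(
    pipelineManager, replicationType, replicationFactor, owner);
// The step this JIRA adds (names are illustrative):
containerStore.put(containerInfo.containerID().getBytes(),
    containerInfo.getProtobuf().toByteArray());
{code}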



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1148) After allocating container, we are not adding to container DB.

2019-02-22 Thread Nanda kumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-1148:
--
  Resolution: Fixed
Target Version/s: 0.4.0
  Status: Resolved  (was: Patch Available)

Thanks [~bharatviswa] for the contribution and thanks to [~ljain] for the 
review. Committed this to trunk.

> After allocating container, we are not adding to container DB.
> --
>
> Key: HDDS-1148
> URL: https://issues.apache.org/jira/browse/HDDS-1148
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.4.0
>
> Attachments: HDDS-1148.00.patch, HDDS-1148.01.patch, 
> HDDS-1148.02.patch, HDDS-1148.03.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> If we don't do that, we get an error when handling a container report for open 
> containers, as they don't exist in the container DB.
>  
> {code:java}
> scm_1           | at java.lang.Thread.run(Thread.java:748)
> scm_1           | 2019-02-21 00:00:32 ERROR ContainerReportHandler:173 - 
> Received container report for an unknown container 1 from datanode 
> e2733c00-162b-4993-a986-f6104f5008d8{ip: 172.18.0.2, host: 4f4e683d86c3} {}
> scm_1           | 
> org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: #1
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.checkIfContainerExist(ContainerStateMap.java:543)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.updateContainerReplica(ContainerStateMap.java:230)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerReplica(ContainerStateManager.java:565)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerReplica(SCMContainerManager.java:393)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ReportHandlerHelper.processContainerReplica(ReportHandlerHelper.java:74)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:159)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:110)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:51)
> scm_1           | at 
> org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
> scm_1           | at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> scm_1           | at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14201) Ability to disallow safemode NN to become active

2019-02-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775564#comment-16775564
 ] 

Íñigo Goiri commented on HDFS-14201:


[~surmountian], let's fix the checkstyle warnings.

> Ability to disallow safemode NN to become active
> 
>
> Key: HDFS-14201
> URL: https://issues.apache.org/jira/browse/HDFS-14201
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: auto-failover
>Affects Versions: 3.1.1, 2.9.2
>Reporter: Xiao Liang
>Assignee: Xiao Liang
>Priority: Major
> Attachments: HDFS-14201.001.patch
>
>
> Currently with HA, a Namenode in safemode can be selected as active, even 
> though, for availability of both reads and writes, Namenodes not in safemode 
> are better choices to become active.
> It can take tens of minutes for a cold-started Namenode to get out of 
> safemode, especially when there are a large number of files and blocks in HDFS. 
> That means if a Namenode in safemode becomes active, the cluster will not be 
> fully functioning for quite a while, even though it could be if a Namenode not 
> in safemode became active instead.
> The proposal here is to add an option that allows a Namenode to report itself as 
> UNHEALTHY to ZKFC while it is in safemode, so that only a fully functioning 
> Namenode can become active, improving the general availability of the cluster.
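A minimal hedged sketch of what the proposal could look like in the NameNode health check; the config key name is an assumption for illustration, not necessarily what the patch uses:
{code:java}
// Sketch: fail monitorHealth() while still in safe mode so ZKFC marks the
// NameNode UNHEALTHY and does not elect it as active.
if (conf.getBoolean("dfs.ha.nn.not-become-active-in-safemode", false)  // assumed key
    && namesystem.isInSafeMode()) {
  throw new HealthCheckFailedException(
      "The NameNode is still in safe mode and cannot serve as active yet.");
}
{code}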



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1148) After allocating container, we are not adding to container DB.

2019-02-22 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1148?focusedWorklogId=202854=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-202854
 ]

ASF GitHub Bot logged work on HDDS-1148:


Author: ASF GitHub Bot
Created on: 22/Feb/19 20:20
Start Date: 22/Feb/19 20:20
Worklog Time Spent: 10m 
  Work Description: nandakumar131 commented on pull request #511: 
HDDS-1148. After allocating container, we are not adding to container DB.
URL: https://github.com/apache/hadoop/pull/511
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 202854)
Time Spent: 1h 20m  (was: 1h 10m)

> After allocating container, we are not adding to container DB.
> --
>
> Key: HDDS-1148
> URL: https://issues.apache.org/jira/browse/HDDS-1148
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.4.0
>
> Attachments: HDDS-1148.00.patch, HDDS-1148.01.patch, 
> HDDS-1148.02.patch, HDDS-1148.03.patch
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> If we don't do that, we get an error when handling a container report for open 
> containers, as they don't exist in the container DB.
>  
> {code:java}
> scm_1           | at java.lang.Thread.run(Thread.java:748)
> scm_1           | 2019-02-21 00:00:32 ERROR ContainerReportHandler:173 - 
> Received container report for an unknown container 1 from datanode 
> e2733c00-162b-4993-a986-f6104f5008d8{ip: 172.18.0.2, host: 4f4e683d86c3} {}
> scm_1           | 
> org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: #1
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.checkIfContainerExist(ContainerStateMap.java:543)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.updateContainerReplica(ContainerStateMap.java:230)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerReplica(ContainerStateManager.java:565)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerReplica(SCMContainerManager.java:393)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ReportHandlerHelper.processContainerReplica(ReportHandlerHelper.java:74)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:159)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:110)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:51)
> scm_1           | at 
> org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
> scm_1           | at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> scm_1           | at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-3246) pRead equivalent for direct read path

2019-02-22 Thread Sahil Takiar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775566#comment-16775566
 ] 

Sahil Takiar commented on HDFS-3246:


Fixed all issues reported by Hadoop QA. Open to any review comments.

> pRead equivalent for direct read path
> -
>
> Key: HDFS-3246
> URL: https://issues.apache.org/jira/browse/HDFS-3246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, performance
>Affects Versions: 3.0.0-alpha1
>Reporter: Henry Robinson
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-3246.001.patch, HDFS-3246.002.patch, 
> HDFS-3246.003.patch, HDFS-3246.004.patch
>
>
> There is no pread equivalent in ByteBufferReadable. We should consider adding 
> one. It would be relatively easy to implement for the distributed case 
> (certainly compared to HDFS-2834), since DFSInputStream does most of the 
> heavy lifting.
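For illustration, a hedged sketch of what such a positional-read counterpart could look like; the interface and method names are assumptions, not the committed API:
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;

// Sketch: a pread-style counterpart to ByteBufferReadable that reads from an
// absolute file position without moving the stream's current offset.
public interface ByteBufferPositionedReadable {
  /**
   * Reads up to buf.remaining() bytes into buf from the given position.
   * @return the number of bytes read, or -1 at end of stream.
   */
  int read(long position, ByteBuffer buf) throws IOException;
}
{code}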



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1148) After allocating container, we are not adding to container DB.

2019-02-22 Thread Nanda kumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775567#comment-16775567
 ] 

Nanda kumar commented on HDDS-1148:
---

+1, will commit this shortly. Test failures are not related to this patch.

> After allocating container, we are not adding to container DB.
> --
>
> Key: HDDS-1148
> URL: https://issues.apache.org/jira/browse/HDDS-1148
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.4.0
>
> Attachments: HDDS-1148.00.patch, HDDS-1148.01.patch, 
> HDDS-1148.02.patch, HDDS-1148.03.patch
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> If we don't do that, we get an error when handling a container report for open 
> containers, as they don't exist in the container DB.
>  
> {code:java}
> scm_1           | at java.lang.Thread.run(Thread.java:748)
> scm_1           | 2019-02-21 00:00:32 ERROR ContainerReportHandler:173 - 
> Received container report for an unknown container 1 from datanode 
> e2733c00-162b-4993-a986-f6104f5008d8{ip: 172.18.0.2, host: 4f4e683d86c3} {}
> scm_1           | 
> org.apache.hadoop.hdds.scm.container.ContainerNotFoundException: #1
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.checkIfContainerExist(ContainerStateMap.java:543)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.states.ContainerStateMap.updateContainerReplica(ContainerStateMap.java:230)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerReplica(ContainerStateManager.java:565)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerReplica(SCMContainerManager.java:393)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ReportHandlerHelper.processContainerReplica(ReportHandlerHelper.java:74)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:159)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:110)
> scm_1           | at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:51)
> scm_1           | at 
> org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:85)
> scm_1           | at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> scm_1           | at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14052) RBF: Use Router keytab for WebHDFS

2019-02-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775562#comment-16775562
 ] 

Íñigo Goiri edited comment on HDFS-14052 at 2/22/19 8:15 PM:
-

[^HDFS-14052-HDFS-13891.2.patch] tests only the failure case.
Even though there are other tests trying the success case, I think it makes 
sense to have it in this one too.
Checking DFS_ROUTER_KEYTAB_FILE_KEY and 
DFS_WEB_AUTHENTICATION_KERBEROS_PRINCIPAL_KEY separately would be nice.


was (Author: elgoiri):
[^HDFS-14052-HDFS-13891.2.patch] test the failure case.
Even though there are other tests trying the success case, I think it makes 
sense to have it in this one too.
Checking DFS_ROUTER_KEYTAB_FILE_KEY and 
DFS_WEB_AUTHENTICATION_KERBEROS_PRINCIPAL_KEY separately would be nice.

> RBF: Use Router keytab for WebHDFS
> --
>
> Key: HDFS-14052
> URL: https://issues.apache.org/jira/browse/HDFS-14052
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14052-HDFS-13891.0.patch, 
> HDFS-14052-HDFS-13891.1.patch, HDFS-14052-HDFS-13891.2.patch
>
>
> When the RouterHttpServer starts it does:
> {code}
> NameNodeHttpServer.initWebHdfs(conf, httpAddress.getHostName(), 
> httpServer,
> RouterWebHdfsMethods.class.getPackage().getName());
> {code}
> This function is in the NN and is pretty generic.
> However, it then calls to NameNodeHttpServer#getAuthFilterParams, which does:
> {code}
> String httpKeytab = conf.get(DFSUtil.getSpnegoKeytabKey(conf,
> DFSConfigKeys.DFS_NAMENODE_KEYTAB_FILE_KEY));
> {code}
> In most cases, the regular web keytab will kick in, but we should make this a 
> parameter and load the Router one just in case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14052) RBF: Use Router keytab for WebHDFS

2019-02-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775562#comment-16775562
 ] 

Íñigo Goiri commented on HDFS-14052:


[^HDFS-14052-HDFS-13891.2.patch] tests the failure case.
Even though there are other tests trying the success case, I think it makes 
sense to have it in this one too.
Checking DFS_ROUTER_KEYTAB_FILE_KEY and 
DFS_WEB_AUTHENTICATION_KERBEROS_PRINCIPAL_KEY separately would be nice.

> RBF: Use Router keytab for WebHDFS
> --
>
> Key: HDFS-14052
> URL: https://issues.apache.org/jira/browse/HDFS-14052
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14052-HDFS-13891.0.patch, 
> HDFS-14052-HDFS-13891.1.patch, HDFS-14052-HDFS-13891.2.patch
>
>
> When the RouterHttpServer starts it does:
> {code}
> NameNodeHttpServer.initWebHdfs(conf, httpAddress.getHostName(), 
> httpServer,
> RouterWebHdfsMethods.class.getPackage().getName());
> {code}
> This function is in the NN and is pretty generic.
> However, it then calls to NameNodeHttpServer#getAuthFilterParams, which does:
> {code}
> String httpKeytab = conf.get(DFSUtil.getSpnegoKeytabKey(conf,
> DFSConfigKeys.DFS_NAMENODE_KEYTAB_FILE_KEY));
> {code}
> In most cases, the regular web keytab will kick in, but we should make this a 
> parameter and load the Router one just in case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14130) Make ZKFC ObserverNode aware

2019-02-22 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775554#comment-16775554
 ] 

Konstantin Shvachko edited comment on HDFS-14130 at 2/22/19 8:07 PM:
-

v011 capitalized "Verify" in {{testVerifyObserverState}}.
I don't know why it failed with v008, while TestDFSHAAdmin didn't. I assume 
there is some nondeterminism in execution, since Jenkins runs test cases in 
parallel.
Anyway, if you guys favor the CLI approach over the direct RPC calls, let's 
go with it.

The other failures in the last run pass locally.


was (Author: shv):
v011 capitalized "Verify" in {{TestJournalNodeSync}}.
I don't know why it failed with v008, while TestDFSHAAdmin didn't. I assume 
there is some nondeterminism in execution, since Jenkins runs test cases in 
parallel.
Any ways, if you guys favour the CLI approach over the direct RPC calls, let's 
go with it.

Other failures in the last run are passing locally.

> Make ZKFC ObserverNode aware
> 
>
> Key: HDFS-14130
> URL: https://issues.apache.org/jira/browse/HDFS-14130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha
>Affects Versions: HDFS-12943
>Reporter: Konstantin Shvachko
>Assignee: xiangheng
>Priority: Major
> Attachments: HDFS-14130-HDFS-12943.001.patch, 
> HDFS-14130-HDFS-12943.003.patch, HDFS-14130-HDFS-12943.004.patch, 
> HDFS-14130-HDFS-12943.005.patch, HDFS-14130-HDFS-12943.006.patch, 
> HDFS-14130-HDFS-12943.007.patch, HDFS-14130.008.patch, HDFS-14130.009.patch, 
> HDFS-14130.010.patch, HDFS-14130.011.patch
>
>
> Need to fix automatic failover with ZKFC. Currently it does not know about 
> ObserverNodes trying to convert them to SBNs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-22 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775556#comment-16775556
 ] 

Hudson commented on HDFS-14298:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16031 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16031/])
HDFS-14298. Improve log messages of ECTopologyVerifier. Contributed by 
(surendralilhore: rev 7d3b567194f51b745dbc7eb7ee91c1ac160053f4)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/common/ECTopologyVerifier.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestNameNodeMXBean.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/tools/TestECAdmin.java


> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch, 
> HDFS-14298.003.patch, HDFS-14298.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14130) Make ZKFC ObserverNode aware

2019-02-22 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775554#comment-16775554
 ] 

Konstantin Shvachko commented on HDFS-14130:


v011 capitalized "Verify" in {{TestJournalNodeSync}}.
I don't know why it failed with v008, while TestDFSHAAdmin didn't. I assume 
there is some nondeterminism in execution, since Jenkins runs test cases in 
parallel.
Any ways, if you guys favour the CLI approach over the direct RPC calls, let's 
go with it.

Other failures in the last run are passing locally.

> Make ZKFC ObserverNode aware
> 
>
> Key: HDFS-14130
> URL: https://issues.apache.org/jira/browse/HDFS-14130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha
>Affects Versions: HDFS-12943
>Reporter: Konstantin Shvachko
>Assignee: xiangheng
>Priority: Major
> Attachments: HDFS-14130-HDFS-12943.001.patch, 
> HDFS-14130-HDFS-12943.003.patch, HDFS-14130-HDFS-12943.004.patch, 
> HDFS-14130-HDFS-12943.005.patch, HDFS-14130-HDFS-12943.006.patch, 
> HDFS-14130-HDFS-12943.007.patch, HDFS-14130.008.patch, HDFS-14130.009.patch, 
> HDFS-14130.010.patch, HDFS-14130.011.patch
>
>
> Need to fix automatic failover with ZKFC. Currently it does not know about 
> ObserverNodes trying to convert them to SBNs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14130) Make ZKFC ObserverNode aware

2019-02-22 Thread Konstantin Shvachko (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko updated HDFS-14130:
---
Attachment: HDFS-14130.011.patch

> Make ZKFC ObserverNode aware
> 
>
> Key: HDFS-14130
> URL: https://issues.apache.org/jira/browse/HDFS-14130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha
>Affects Versions: HDFS-12943
>Reporter: Konstantin Shvachko
>Assignee: xiangheng
>Priority: Major
> Attachments: HDFS-14130-HDFS-12943.001.patch, 
> HDFS-14130-HDFS-12943.003.patch, HDFS-14130-HDFS-12943.004.patch, 
> HDFS-14130-HDFS-12943.005.patch, HDFS-14130-HDFS-12943.006.patch, 
> HDFS-14130-HDFS-12943.007.patch, HDFS-14130.008.patch, HDFS-14130.009.patch, 
> HDFS-14130.010.patch, HDFS-14130.011.patch
>
>
> Need to fix automatic failover with ZKFC. Currently it does not know about 
> ObserverNodes trying to convert them to SBNs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-22 Thread Surendra Singh Lilhore (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Surendra Singh Lilhore updated HDFS-14298:
--
   Resolution: Fixed
Fix Version/s: 3.3.0
   Status: Resolved  (was: Patch Available)

Thanks [~knanasi] for the contribution.

Committed to trunk.

> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Fix For: 3.3.0
>
> Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch, 
> HDFS-14298.003.patch, HDFS-14298.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-3246) pRead equivalent for direct read path

2019-02-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775549#comment-16775549
 ] 

Hadoop QA commented on HDFS-3246:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
24s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
20m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-hdfs-project/hadoop-hdfs-native-client {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  3m 
42s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 
12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 16m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 16m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
33s{color} | {color:green} root: The patch generated 0 new + 48 unchanged - 1 
fixed = 48 total (was 49) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  3m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 40s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  
0s{color} | {color:blue} Skipped patched modules with no Java source: 
hadoop-hdfs-project/hadoop-hdfs-native-client {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
56s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  9m  
8s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
53s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}108m  9s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  7m 
12s{color} | {color:green} hadoop-hdfs-native-client in the patch passed. 
{color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
47s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}243m 20s{color} | 
{color:black} {color} |
\\
\\
|| 

[jira] [Commented] (HDFS-14298) Improve log messages of ECTopologyVerifier

2019-02-22 Thread Surendra Singh Lilhore (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775530#comment-16775530
 ] 

Surendra Singh Lilhore commented on HDFS-14298:
---

+1

> Improve log messages of ECTopologyVerifier
> --
>
> Key: HDFS-14298
> URL: https://issues.apache.org/jira/browse/HDFS-14298
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Kitti Nanasi
>Assignee: Kitti Nanasi
>Priority: Minor
> Attachments: HDFS-14298.001.patch, HDFS-14298.002.patch, 
> HDFS-14298.003.patch, HDFS-14298.004.patch
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1038) Support Service Level Authorization for OM, SCM and DN

2019-02-22 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDDS-1038:
-
Attachment: HDDS-1038.08.patch

> Support Service Level Authorization for OM, SCM and DN
> --
>
> Key: HDDS-1038
> URL: https://issues.apache.org/jira/browse/HDDS-1038
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: Security
> Fix For: 0.4.0
>
> Attachments: HDDS-1038.00.patch, HDDS-1038.01.patch, 
> HDDS-1038.02.patch, HDDS-1038.03.patch, HDDS-1038.04.patch, 
> HDDS-1038.05.patch, HDDS-1038.06.patch, HDDS-1038.07.patch, HDDS-1038.08.patch
>
>
> In a secure Ozone cluster. Datanodes fail to connect to SCM on 
> {{StorageContainerDatanodeProtocol}}. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-1093) Configuration tab in OM/SCM ui is not displaying the correct values

2019-02-22 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reassigned HDDS-1093:
---

Assignee: Vivek Ratnavel Subramanian

> Configuration tab in OM/SCM ui is not displaying the correct values
> ---
>
> Key: HDDS-1093
> URL: https://issues.apache.org/jira/browse/HDDS-1093
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: OM, SCM
>Reporter: Sandeep Nemuri
>Assignee: Vivek Ratnavel Subramanian
>Priority: Critical
> Attachments: image-2019-02-12-19-47-18-332.png
>
>
> Configuration tab in OM/SCM ui is not displaying the correct/configured 
> values, rather it is displaying the default values.
> !image-2019-02-12-19-47-18-332.png!
> {code:java}
> [hdfs@freonnode10 hadoop]$ curl -s http://freonnode10:9874/conf | grep 
> ozone.om.handler.count.key
> <property><name>ozone.om.handler.count.key</name><value>40</value><final>false</final><source>ozone-site.xml</source></property>
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14118) Use DNS to resolve Namenodes and Routers

2019-02-22 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775474#comment-16775474
 ] 

Yongjun Zhang edited comment on HDFS-14118 at 2/22/19 6:33 PM:
---

Hi [~fengnanli],

Thanks for following up. I would like to suggest some small changes to the config 
descriptions. Hope the suggested changes make sense to you:
  
{code:java}
<property>
  <name>dfs.client.failover.random.order</name>
  <value>false</value>
  <description>
    Determines if the failover proxies are picked in random order instead of the
    configured order. Random order may be enabled for better load balancing
    or to avoid always hitting failed ones first if the failed ones appear in the
    beginning of the configured or resolved list.
    For example, in the case of multiple RBF routers or ObserverNameNodes,
    it is recommended to be turned on for load balancing.
    The config name can be extended with an optional nameservice ID
    (of form dfs.client.failover.random.order[.nameservice]) in case multiple
    nameservices exist and random order should be enabled for specific
    nameservices.
  </description>
</property>

<property>
  <name>dfs.client.failover.resolve-needed</name>
  <value>false</value>
  <description>
    Determines if the given nameservice address is a domain name which needs to
    be resolved (using the resolver configured by dfs.client.failover.resolver.impl).
    This adds a transparency layer in the client so the physical server address
    can change without changing the client. The config name can be extended with
    an optional nameservice ID (of form dfs.client.failover.resolve-needed[.nameservice])
    to configure specific nameservices when multiple nameservices exist.
  </description>
</property>

<property>
  <name>dfs.client.failover.resolver.impl</name>
  <value>org.apache.hadoop.net.DNSDomainNameResolver</value>
  <description>
    Determines what class to use to resolve the nameservice domain name to a specific
    machine address. The config name can be extended with an optional
    nameservice ID (of form dfs.client.failover.resolver.impl[.nameservice]) to configure
    specific nameservices when multiple nameservices exist.
  </description>
</property>
{code}
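
For illustration only (this is not code from the patch), a minimal JDK-level sketch of what the DNS step amounts to: the single nameservice domain name resolves to the full set of router/NameNode addresses, and each resolved address becomes a failover proxy candidate. The domain name below is a placeholder.

{code:java}
import java.net.InetAddress;
import java.net.UnknownHostException;

public class ResolveNameserviceSketch {
  public static void main(String[] args) throws UnknownHostException {
    // Placeholder domain standing in for the configured nameservice address;
    // in the patch the lookup is delegated to the class configured via
    // dfs.client.failover.resolver.impl.
    InetAddress[] addresses = InetAddress.getAllByName("routers.example.com");
    for (InetAddress address : addresses) {
      // Each A record is one failover proxy target; with
      // dfs.client.failover.random.order=true the order would be shuffled.
      System.out.println(address.getHostAddress());
    }
  }
}
{code}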
 


was (Author: yzhangal):
Hi [~fengnanli],

Thanks for following-up. Would like to suggest some small changes in the config 
description. Hope the suggested changes make sense to you:
  
{code:java}
<property>
  <name>dfs.client.failover.random.order</name>
  <value>false</value>
  <description>
    Determines if the failover proxies are picked in random order instead of the
    configured order. Random order may be enabled for better load balancing
    or to avoid always hitting failed ones first if the failed ones appear in the
    beginning of the configured or resolved list.
    For example, In the case of multiple RBF routers or ObserverNameNodes,
    it is recommended to be turned on for load balancing.
    The config name can be extended with an optional nameservice ID
    (of form dfs.client.failover.random.order[.nameservice]) in case multiple
    nameservices exist and random order should be enabled for specific
    nameservices.
  </description>
</property>

<property>
  <name>dfs.client.failover.resolve-needed</name>
  <value>false</value>
  <description>
    Determines if the given namenode address is a domain name which needs to
    be resolved (using the resolver configured by dfs.client.failover.resolver-impl).
    This adds a transparency layer in the client so physical namenode address
    can change without changing the client. The config name can be extended with
    an optional nameservice ID (of form dfs.client.failover.resolve-needed[.nameservice])
    to configure specific nameservices when multiple nameservices exist.
  </description>
</property>

<property>
  <name>dfs.client.failover.resolver.impl</name>
  <value>org.apache.hadoop.net.DNSDomainNameResolver</value>
  <description>
    Determines what class to use to resolve name service domain name to specific
    machine address. The config name can be extended with an optional
    nameservice ID (of form dfs.client.failover.resolver.impl[.nameservice]) to configure
    specific nameservices when multiple nameservices exist.
  </description>
</property>
{code}
 

> Use DNS to resolve Namenodes and Routers
> 
>
> Key: HDFS-14118
> URL: https://issues.apache.org/jira/browse/HDFS-14118
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: DNS testing log, HDFS design doc_ Single domain name for 
> clients - Google Docs-1.pdf, HDFS design doc_ Single domain name for clients 
> - Google Docs.pdf, HDFS-14118.001.patch, HDFS-14118.002.patch, 
> HDFS-14118.003.patch, HDFS-14118.004.patch, HDFS-14118.005.patch, 
> HDFS-14118.006.patch, HDFS-14118.007.patch, HDFS-14118.008.patch, 
> HDFS-14118.009.patch, HDFS-14118.010.patch, HDFS-14118.011.patch, 
> HDFS-14118.012.patch, HDFS-14118.013.patch, HDFS-14118.014.patch, 
> HDFS-14118.015.patch, HDFS-14118.016.patch, HDFS-14118.017.patch, 
> HDFS-14118.018.patch, HDFS-14118.019.patch, HDFS-14118.020.patch, 
> HDFS-14118.021.patch, HDFS-14118.022.patch, HDFS-14118.patch
>
>
> Clients will need to know about routers to talk to the HDFS cluster 
> (obviously), and having routers 

[jira] [Commented] (HDFS-14118) Use DNS to resolve Namenodes and Routers

2019-02-22 Thread Yongjun Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775474#comment-16775474
 ] 

Yongjun Zhang commented on HDFS-14118:
--

Hi [~fengnanli],

Thanks for following-up. Would like to suggest some small changes in the config 
description. Hope the suggested changes make sense to you:
  
{code:java}
<property>
  <name>dfs.client.failover.random.order</name>
  <value>false</value>
  <description>
    Determines if the failover proxies are picked in random order instead of the
    configured order. Random order may be enabled for better load balancing
    or to avoid always hitting failed ones first if the failed ones appear in the
    beginning of the configured or resolved list.
    For example, In the case of multiple RBF routers or ObserverNameNodes,
    it is recommended to be turned on for load balancing.
    The config name can be extended with an optional nameservice ID
    (of form dfs.client.failover.random.order[.nameservice]) in case multiple
    nameservices exist and random order should be enabled for specific
    nameservices.
  </description>
</property>

<property>
  <name>dfs.client.failover.resolve-needed</name>
  <value>false</value>
  <description>
    Determines if the given namenode address is a domain name which needs to
    be resolved (using the resolver configured by dfs.client.failover.resolver-impl).
    This adds a transparency layer in the client so physical namenode address
    can change without changing the client. The config name can be extended with
    an optional nameservice ID (of form dfs.client.failover.resolve-needed[.nameservice])
    to configure specific nameservices when multiple nameservices exist.
  </description>
</property>

<property>
  <name>dfs.client.failover.resolver.impl</name>
  <value>org.apache.hadoop.net.DNSDomainNameResolver</value>
  <description>
    Determines what class to use to resolve name service domain name to specific
    machine address. The config name can be extended with an optional
    nameservice ID (of form dfs.client.failover.resolver.impl[.nameservice]) to configure
    specific nameservices when multiple nameservices exist.
  </description>
</property>
{code}
 

> Use DNS to resolve Namenodes and Routers
> 
>
> Key: HDFS-14118
> URL: https://issues.apache.org/jira/browse/HDFS-14118
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: DNS testing log, HDFS design doc_ Single domain name for 
> clients - Google Docs-1.pdf, HDFS design doc_ Single domain name for clients 
> - Google Docs.pdf, HDFS-14118.001.patch, HDFS-14118.002.patch, 
> HDFS-14118.003.patch, HDFS-14118.004.patch, HDFS-14118.005.patch, 
> HDFS-14118.006.patch, HDFS-14118.007.patch, HDFS-14118.008.patch, 
> HDFS-14118.009.patch, HDFS-14118.010.patch, HDFS-14118.011.patch, 
> HDFS-14118.012.patch, HDFS-14118.013.patch, HDFS-14118.014.patch, 
> HDFS-14118.015.patch, HDFS-14118.016.patch, HDFS-14118.017.patch, 
> HDFS-14118.018.patch, HDFS-14118.019.patch, HDFS-14118.020.patch, 
> HDFS-14118.021.patch, HDFS-14118.022.patch, HDFS-14118.patch
>
>
> Clients will need to know about routers to talk to the HDFS cluster 
> (obviously), and having routers updating (adding/removing) will have to make 
> every client change, which is a painful process.
> DNS can be used here to resolve the single domain name clients knows to a 
> list of routers in the current config. However, DNS won't be able to consider 
> only resolving to the working router based on certain health thresholds.
> There are some ways about how this can be solved. One way is to have a 
> separate script to regularly check the status of the router and update the 
> DNS records if a router fails the health thresholds. In this way, security 
> might be carefully considered for this way. Another way is to have the client 
> do the normal connecting/failover after they get the list of routers, which 
> requires the change of current failover proxy provider.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14130) Make ZKFC ObserverNode aware

2019-02-22 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775470#comment-16775470
 ] 

Chao Sun commented on HDFS-14130:
-

Thanks [~shv]. I'm slightly more in favor of the System.in approach because 
it's more realistic; I'm curious why it failed while the one in TestDFSHAAdmin 
didn't. The latest patch looks better since it resets {{System.in}} after 
finishing.

The fix itself looks good to me. One minor nit: change 
{{testverifyObserverState}} to {{testVerifyObserverState}}. 
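
As a rough illustration of that reset pattern (not the actual test in the patch; the helper name is made up), driving a CLI through {{System.in}} and restoring the original stream afterwards could look like:

{code:java}
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class SystemInResetSketch {
  // Hypothetical stand-in for whatever CLI invocation the real test drives.
  static void runCliThatReadsStdin() {
    System.out.println("read: " + new Scanner(System.in).nextLine());
  }

  public static void main(String[] args) {
    InputStream originalIn = System.in;
    try {
      // Feed the simulated console input the CLI expects.
      System.setIn(new ByteArrayInputStream("Y\n".getBytes(StandardCharsets.UTF_8)));
      runCliThatReadsStdin();
    } finally {
      // Restore the real stdin so tests that run afterwards are unaffected.
      System.setIn(originalIn);
    }
  }
}
{code}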

> Make ZKFC ObserverNode aware
> 
>
> Key: HDFS-14130
> URL: https://issues.apache.org/jira/browse/HDFS-14130
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha
>Affects Versions: HDFS-12943
>Reporter: Konstantin Shvachko
>Assignee: xiangheng
>Priority: Major
> Attachments: HDFS-14130-HDFS-12943.001.patch, 
> HDFS-14130-HDFS-12943.003.patch, HDFS-14130-HDFS-12943.004.patch, 
> HDFS-14130-HDFS-12943.005.patch, HDFS-14130-HDFS-12943.006.patch, 
> HDFS-14130-HDFS-12943.007.patch, HDFS-14130.008.patch, HDFS-14130.009.patch, 
> HDFS-14130.010.patch
>
>
> Need to fix automatic failover with ZKFC. Currently it does not know about 
> ObserverNodes trying to convert them to SBNs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14200) Add emptyTrash option to purge trash immediately

2019-02-22 Thread Steve Loughran (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775471#comment-16775471
 ] 

Steve Loughran commented on HDFS-14200:
---

Seems good, but the patch isn't ready yet.


* Why not just add an option to -expunge? It exists, has tests, documentation, 
etc.
* Add a way to take a filesystem so that I can do this for s3a://bucket1/ . See 
HADOOP-13656.
* TestTrash:L509. Don't downgrade an exception to a log, just rethrow it.

Finally: press the submit button so Yetus runs it. No Yetus, no review.
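
For context, a hedged sketch (not the attached patch, and independent of whether this ends up as -emptyTrash or an option on -expunge) of what "purge the current user's trash immediately" amounts to with the existing FileSystem API:

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class EmptyTrashSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // Trash root for the current user, e.g. /user/<name>/.Trash on HDFS.
    Path trashRoot = fs.getTrashRoot(new Path(Path.SEPARATOR));
    if (fs.exists(trashRoot)) {
      for (FileStatus entry : fs.listStatus(trashRoot)) {
        // Removes Current as well as every old checkpoint right away,
        // without waiting for fs.trash.interval to expire.
        fs.delete(entry.getPath(), true);
      }
    }
  }
}
{code}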

> Add emptyTrash option to purge trash immediately
> 
>
> Key: HDFS-14200
> URL: https://issues.apache.org/jira/browse/HDFS-14200
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-14200.001.patch
>
>
> I have always felt the HDFS trash is missing a simple way to empty the 
> current users trash immediately. We have "expunge" but in my experience 
> supporting clusters, end users find this confusing. When most end users run 
> expunge, they really want to empty their trash immediately and get confused 
> when expunge does not do this.
> This can result in users performing somewhat dangerous "skipTrash" operations 
> on the trash to free up space. The alternative, which most users will not 
> figure out on their own is:
> # Run the expunge command once - this will move the current folder to a 
> checkpoint and remove any old checkpoints older than the retention interval
> # Wait over 1 minute and then run expunge again, overriding fs.trash.interval 
> to 1 minute using the following command hadoop fs -Dfs.trash.interval=1 
> -expunge.
> With this Jira I am proposing to add a extra command, "hdfs dfs -emptyTrash" 
> that purges everything in the logged in users Trash directories immediately.
> How would the community feel about adding this new option? I will upload a 
> patch for comments.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14118) Use DNS to resolve Namenodes and Routers

2019-02-22 Thread Fengnan Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li updated HDFS-14118:
--
Attachment: HDFS-14118.022.patch

> Use DNS to resolve Namenodes and Routers
> 
>
> Key: HDFS-14118
> URL: https://issues.apache.org/jira/browse/HDFS-14118
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: DNS testing log, HDFS design doc_ Single domain name for 
> clients - Google Docs-1.pdf, HDFS design doc_ Single domain name for clients 
> - Google Docs.pdf, HDFS-14118.001.patch, HDFS-14118.002.patch, 
> HDFS-14118.003.patch, HDFS-14118.004.patch, HDFS-14118.005.patch, 
> HDFS-14118.006.patch, HDFS-14118.007.patch, HDFS-14118.008.patch, 
> HDFS-14118.009.patch, HDFS-14118.010.patch, HDFS-14118.011.patch, 
> HDFS-14118.012.patch, HDFS-14118.013.patch, HDFS-14118.014.patch, 
> HDFS-14118.015.patch, HDFS-14118.016.patch, HDFS-14118.017.patch, 
> HDFS-14118.018.patch, HDFS-14118.019.patch, HDFS-14118.020.patch, 
> HDFS-14118.021.patch, HDFS-14118.022.patch, HDFS-14118.patch
>
>
> Clients will need to know about routers to talk to the HDFS cluster 
> (obviously), and having routers updating (adding/removing) will have to make 
> every client change, which is a painful process.
> DNS can be used here to resolve the single domain name clients knows to a 
> list of routers in the current config. However, DNS won't be able to consider 
> only resolving to the working router based on certain health thresholds.
> There are some ways about how this can be solved. One way is to have a 
> separate script to regularly check the status of the router and update the 
> DNS records if a router fails the health thresholds. In this way, security 
> might be carefully considered for this way. Another way is to have the client 
> do the normal connecting/failover after they get the list of routers, which 
> requires the change of current failover proxy provider.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7133) Support clearing namespace quota on "/"

2019-02-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775463#comment-16775463
 ] 

Hadoop QA commented on HDFS-7133:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m  5s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
48s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 39 unchanged - 6 fixed = 39 total (was 45) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 37s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
18s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 78m 57s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
30s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}134m 51s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-7133 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12959811/HDFS-7133-01.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 3cac4c105e07 4.4.0-138-generic #164~14.04.1-Ubuntu SMP Fri Oct 
5 08:56:16 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / ed13cf8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26305/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26305/testReport/ |
| Max. process+thread count | 3636 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 

[jira] [Commented] (HDFS-14052) RBF: Use Router keytab for WebHDFS

2019-02-22 Thread CR Hota (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775452#comment-16775452
 ] 

CR Hota commented on HDFS-14052:


[~elgoiri]

The run went through fine now. Could you help take a look at this patch and 
merge it? I will then rework the WebHDFS part.

> RBF: Use Router keytab for WebHDFS
> --
>
> Key: HDFS-14052
> URL: https://issues.apache.org/jira/browse/HDFS-14052
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14052-HDFS-13891.0.patch, 
> HDFS-14052-HDFS-13891.1.patch, HDFS-14052-HDFS-13891.2.patch
>
>
> When the RouterHttpServer starts it does:
> {code}
> NameNodeHttpServer.initWebHdfs(conf, httpAddress.getHostName(), 
> httpServer,
> RouterWebHdfsMethods.class.getPackage().getName());
> {code}
> This function is in the NN and is pretty generic.
> However, it then calls to NameNodeHttpServer#getAuthFilterParams, which does:
> {code}
> String httpKeytab = conf.get(DFSUtil.getSpnegoKeytabKey(conf,
> DFSConfigKeys.DFS_NAMENODE_KEYTAB_FILE_KEY));
> {code}
> In most cases, the regular web keytab will kick in, but we should make this a 
> parameter and load the Router one just in case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-02-22 Thread Chao Sun (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775439#comment-16775439
 ] 

Chao Sun commented on HDFS-14305:
-

Thanks [~hexiaoqiao]. One potential issue with patch 001 is that when keys 
are updated (which calls {{setSerialNo}}), the serial number could move into a 
range that belongs to a different NameNode.

I'm thinking maybe we could follow how this is handled in the previous 
implementation (i.e., without HDFS-6440), which uses this approach:
{code}
int LOW_MASK  = ~(1 << 31);
this.serialNo = (serialNo & LOW_MASK) | (nnIndex << 31);
{code}

Instead of 1 bit, we can either pre-allocate a fixed number of bits (e.g., 5), 
or calculate the number of bits needed from the total number of configured 
namenodes.  Then we can use the same masking technique.
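
To make that concrete, a small sketch of the generalized masking (assumptions: 5 reserved bits and made-up names, not the posted patch):

{code:java}
public class SerialNoMaskSketch {
  // Reserve the top BITS bits of the 32-bit serial number for the NameNode
  // index and leave the remaining bits to the rolling counter.
  private static final int BITS = 5;               // supports up to 32 NameNodes
  private static final int SHIFT = 32 - BITS;
  private static final int LOW_MASK = (1 << SHIFT) - 1;

  static int withNnIndex(int serialNo, int nnIndex) {
    return (serialNo & LOW_MASK) | (nnIndex << SHIFT);
  }

  public static void main(String[] args) {
    // Even after the counter wraps or is re-randomized, the top bits still
    // identify the NameNode, so two NameNodes never collide.
    System.out.println(Integer.toHexString(withNnIndex(0x12345678, 1)));
    System.out.println(Integer.toHexString(withNnIndex(0x12345678, 2)));
  }
}
{code}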

The advantage of having a pre-defined number of bits is that when adding or 
removing namenodes (e.g., observers), we are free from collision as long as we 
keep the ordering. The disadvantage is that it puts a limit on the total number 
of namenodes allowed, but I can't think of a scenario where people would want more 
than 32 or 64 namenodes in a single cluster.


> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14305.001.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then use this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> while {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode specified in the 
> configuration {{dfs.ha.namenodes.}}.
> However, with this approach, different NameNode could have overlapping ranges 
> for serial number. For simplicity, let's assume {{Integer.MAX_VALUE}} is 100, 
> and we have 2 NameNodes {{nn1}} and {{nn2}} in configuration. Then the ranges 
> for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number could be any negative integer.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number could be updated to a range that belongs to 
> a different NameNode, thus increasing the chance of collision again.
> When the collision happens, DataNodes could overwrite an existing key which 
> will cause clients to fail because of {{InvalidToken}} error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14312) Scale test KMS using kms audit log

2019-02-22 Thread Wei-Chiu Chuang (JIRA)
Wei-Chiu Chuang created HDFS-14312:
--

 Summary: Scale test KMS using kms audit log
 Key: HDFS-14312
 URL: https://issues.apache.org/jira/browse/HDFS-14312
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: kms
Affects Versions: 3.3.0
Reporter: Wei-Chiu Chuang


It appears to me that Dynamometer's architecture allows KMS scale tests too.

I imagine there are two ways to scale test a KMS.
# Take KMS audit logs, and replay the logs against a KMS.
# Configure Dynamometer to start a KMS in addition to the NameNode. Assuming the 
fsimage comes from an encrypted cluster, replaying the HDFS audit log also tests 
the KMS.

It would be even more interesting to have a tool that converts an unencrypted 
cluster fsimage to an encrypted one.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14311) multi-threading conflict at layoutVersion when loading block pool storage

2019-02-22 Thread Yicong Cai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yicong Cai updated HDFS-14311:
--
Attachment: HDFS-14311.1.patch

> multi-threading conflict at layoutVersion when loading block pool storage
> -
>
> Key: HDFS-14311
> URL: https://issues.apache.org/jira/browse/HDFS-14311
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.9.2
>Reporter: Yicong Cai
>Priority: Major
> Fix For: 3.3.0, 2.9.3
>
> Attachments: HDFS-14311.1.patch
>
>
> When DataNode upgrade from 2.7.3 to 2.9.2, there is a conflict at 
> StorageInfo.layoutVersion in loading block pool storage process.
> It will cause this exception:
>  
> {panel:title=exceptions}
> 2019-02-15 10:18:01,357 [13783] - INFO [Thread-33:BlockPoolSliceStorage@395] 
> - Restored 36974 block files from trash before the layout upgrade. These 
> blocks will be moved to the previous directory during the upgrade
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:BlockPoolSliceStorage@226] 
> - Failed to analyze storage directories for block pool 
> BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748)
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:DataStorage@472] - Failed 
> to add storage directory [DISK]file:/mnt/dfs/2/hadoop/hdfs/data/ for block 
> pool BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748) 
> {panel}
>  
> root cause:
> BlockPoolSliceStorage instance is shared for all storage locations recover 
> transition. In BlockPoolSliceStorage.doTransition, it will read the old 
> layoutVersion from local storage, compare with current DataNode version, then 
> do upgrade. In doUpgrade, add the transition work as a sub-thread, the 
> transition work will set the BlockPoolSliceStorage's layoutVersion to current 
> DN version. The next storage dir transition check will concurrent with pre 
> 

[jira] [Updated] (HDFS-14311) multi-threading conflict at layoutVersion when loading block pool storage

2019-02-22 Thread Yicong Cai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yicong Cai updated HDFS-14311:
--
Status: Open  (was: Patch Available)

> multi-threading conflict at layoutVersion when loading block pool storage
> -
>
> Key: HDFS-14311
> URL: https://issues.apache.org/jira/browse/HDFS-14311
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.9.2
>Reporter: Yicong Cai
>Priority: Major
> Fix For: 3.3.0, 2.9.3
>
>
> When DataNode upgrade from 2.7.3 to 2.9.2, there is a conflict at 
> StorageInfo.layoutVersion in loading block pool storage process.
> It will cause this exception:
>  
> {panel:title=exceptions}
> 2019-02-15 10:18:01,357 [13783] - INFO [Thread-33:BlockPoolSliceStorage@395] 
> - Restored 36974 block files from trash before the layout upgrade. These 
> blocks will be moved to the previous directory during the upgrade
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:BlockPoolSliceStorage@226] 
> - Failed to analyze storage directories for block pool 
> BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748)
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:DataStorage@472] - Failed 
> to add storage directory [DISK]file:/mnt/dfs/2/hadoop/hdfs/data/ for block 
> pool BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748) 
> {panel}
>  
> root cause:
> BlockPoolSliceStorage instance is shared for all storage locations recover 
> transition. In BlockPoolSliceStorage.doTransition, it will read the old 
> layoutVersion from local storage, compare with current DataNode version, then 
> do upgrade. In doUpgrade, add the transition work as a sub-thread, the 
> transition work will set the BlockPoolSliceStorage's layoutVersion to current 
> DN version. The next storage dir transition check will concurrent with pre 
> storage dir real transition work, then the 

[jira] [Updated] (HDFS-14311) multi-threading conflict at layoutVersion when loading block pool storage

2019-02-22 Thread Yicong Cai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yicong Cai updated HDFS-14311:
--
Fix Version/s: 2.9.3
   3.3.0
   Status: Patch Available  (was: Open)

> multi-threading conflict at layoutVersion when loading block pool storage
> -
>
> Key: HDFS-14311
> URL: https://issues.apache.org/jira/browse/HDFS-14311
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.9.2
>Reporter: Yicong Cai
>Priority: Major
> Fix For: 3.3.0, 2.9.3
>
>
> When DataNode upgrade from 2.7.3 to 2.9.2, there is a conflict at 
> StorageInfo.layoutVersion in loading block pool storage process.
> It will cause this exception:
>  
> {panel:title=exceptions}
> 2019-02-15 10:18:01,357 [13783] - INFO [Thread-33:BlockPoolSliceStorage@395] 
> - Restored 36974 block files from trash before the layout upgrade. These 
> blocks will be moved to the previous directory during the upgrade
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:BlockPoolSliceStorage@226] 
> - Failed to analyze storage directories for block pool 
> BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748)
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:DataStorage@472] - Failed 
> to add storage directory [DISK]file:/mnt/dfs/2/hadoop/hdfs/data/ for block 
> pool BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748) 
> {panel}
>  
> root cause:
> A single BlockPoolSliceStorage instance is shared by all storage locations 
> during the recover-transition step. BlockPoolSliceStorage.doTransition reads 
> the old layoutVersion from local storage, compares it with the current 
> DataNode version, and then performs the upgrade. doUpgrade runs the transition 
> work in a sub-thread, and that work sets the shared BlockPoolSliceStorage 
> layoutVersion to the current DN version. The transition check for the next 

[jira] [Updated] (HDFS-14311) multi-threading conflict at layoutVersion when loading block pool storage

2019-02-22 Thread Yicong Cai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yicong Cai updated HDFS-14311:
--
Attachment: (was: HDFS-14311.1.patch)

> multi-threading conflict at layoutVersion when loading block pool storage
> -
>
> Key: HDFS-14311
> URL: https://issues.apache.org/jira/browse/HDFS-14311
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.9.2
>Reporter: Yicong Cai
>Priority: Major
>
> When a DataNode is upgraded from 2.7.3 to 2.9.2, there is a conflict on 
> StorageInfo.layoutVersion while loading the block pool storage.
> It causes this exception:
>  
> {panel:title=exceptions}
> 2019-02-15 10:18:01,357 [13783] - INFO [Thread-33:BlockPoolSliceStorage@395] 
> - Restored 36974 block files from trash before the layout upgrade. These 
> blocks will be moved to the previous directory during the upgrade
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:BlockPoolSliceStorage@226] 
> - Failed to analyze storage directories for block pool 
> BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748)
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:DataStorage@472] - Failed 
> to add storage directory [DISK]file:/mnt/dfs/2/hadoop/hdfs/data/ for block 
> pool BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748) 
> {panel}
>  
> root cause:
> A single BlockPoolSliceStorage instance is shared by all storage locations 
> during the recover-transition step. BlockPoolSliceStorage.doTransition reads 
> the old layoutVersion from local storage, compares it with the current 
> DataNode version, and then performs the upgrade. doUpgrade runs the transition 
> work in a sub-thread, and that work sets the shared BlockPoolSliceStorage 
> layoutVersion to the current DN version. The transition check for the next 
> storage directory then runs concurrently with the previous directory's real 
> transition work, so the BlockPoolSliceStorage instance 
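To make the timing issue described above concrete, here is a minimal, self-contained sketch (my own illustration, not code from the patch or from HDFS itself) of the pattern: a layoutVersion field shared the way the single BlockPoolSliceStorage instance is shared, overwritten by an upgrade sub-thread while the caller is already checking the next storage directory.

{code:java}
// Toy reproduction of the race described above (class and field names are
// illustrative, not the real HDFS code). A single layoutVersion field is
// shared across storage directories: the upgrade sub-thread for the first
// directory overwrites it while the main thread is already running the
// transition check for the next directory.
public class LayoutVersionRaceDemo {

    // Shared mutable state, standing in for BlockPoolSliceStorage.layoutVersion.
    static volatile int sharedLayoutVersion = -57;  // value loaded from the first dir

    public static void main(String[] args) throws InterruptedException {
        // Upgrade work for the first directory runs in a sub-thread and bumps
        // the shared field to the current DataNode layout version.
        Thread upgradeWorker = new Thread(() -> sharedLayoutVersion = -63);
        upgradeWorker.start();

        // Meanwhile the main thread starts the transition check for the second
        // directory. Depending on timing it may observe -57 or -63 here, which
        // is the kind of non-determinism the description above attributes the
        // "is newer than the namespace state" failure to.
        System.out.println("second dir check sees layoutVersion = " + sharedLayoutVersion);
        upgradeWorker.join();
    }
}
{code}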

[jira] [Commented] (HDFS-14292) Introduce Java ExecutorService to DataXceiverServer

2019-02-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775358#comment-16775358
 ] 

Hadoop QA commented on HDFS-14292:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
40s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
20s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
10s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
56s{color} | {color:green} hadoop-hdfs-project generated 0 new + 537 unchanged 
- 3 fixed = 537 total (was 540) {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  8s{color} | {color:orange} hadoop-hdfs-project: The patch generated 3 new + 
629 unchanged - 8 fixed = 632 total (was 637) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 39s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
53s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 83m 31s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}155m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.tools.TestECAdmin |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.qjournal.server.TestJournalNodeSync |
|   | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-495/1/artifact/out/Dockerfile
 |
| GITHUB PR | https://github.com/apache/hadoop/pull/495 |
| JIRA Issue | HDFS-14292 |
| Optional Tests |  dupname  asflicense  

[jira] [Updated] (HDFS-14311) multi-threading conflict at layoutVersion when loading block pool storage

2019-02-22 Thread Yicong Cai (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yicong Cai updated HDFS-14311:
--
Attachment: HDFS-14311.1.patch

> multi-threading conflict at layoutVersion when loading block pool storage
> -
>
> Key: HDFS-14311
> URL: https://issues.apache.org/jira/browse/HDFS-14311
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: rolling upgrades
>Affects Versions: 2.9.2
>Reporter: Yicong Cai
>Priority: Major
>
> When a DataNode is upgraded from 2.7.3 to 2.9.2, there is a conflict on 
> StorageInfo.layoutVersion while loading the block pool storage.
> It causes this exception:
>  
> {panel:title=exceptions}
> 2019-02-15 10:18:01,357 [13783] - INFO [Thread-33:BlockPoolSliceStorage@395] 
> - Restored 36974 block files from trash before the layout upgrade. These 
> blocks will be moved to the previous directory during the upgrade
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:BlockPoolSliceStorage@226] 
> - Failed to analyze storage directories for block pool 
> BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748)
> 2019-02-15 10:18:01,358 [13784] - WARN [Thread-33:DataStorage@472] - Failed 
> to add storage directory [DISK]file:/mnt/dfs/2/hadoop/hdfs/data/ for block 
> pool BP-1216718839-10.120.232.23-1548736842023
> java.io.IOException: Datanode state: LV = -57 CTime = 0 is newer than the 
> namespace state: LV = -63 CTime = 0
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.doTransition(BlockPoolSliceStorage.java:406)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadStorageDirectory(BlockPoolSliceStorage.java:177)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.loadBpStorageDirectories(BlockPoolSliceStorage.java:221)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BlockPoolSliceStorage.recoverTransitionRead(BlockPoolSliceStorage.java:250)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.loadBlockPoolSliceStorage(DataStorage.java:460)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.addStorageLocations(DataStorage.java:390)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:556)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:1649)
>  at 
> org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:1610)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:388)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:280)
>  at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:816)
>  at java.lang.Thread.run(Thread.java:748) 
> {panel}
>  
> root cause:
> A single BlockPoolSliceStorage instance is shared by all storage locations 
> during the recover-transition step. BlockPoolSliceStorage.doTransition reads 
> the old layoutVersion from local storage, compares it with the current 
> DataNode version, and then performs the upgrade. doUpgrade runs the transition 
> work in a sub-thread, and that work sets the shared BlockPoolSliceStorage 
> layoutVersion to the current DN version. The transition check for the next 
> storage directory then runs concurrently with the previous directory's real 
> transition work, so the BlockPoolSliceStorage instance 
> 

[jira] [Commented] (HDFS-14305) Serial number in BlockTokenSecretManager could overlap between different namenodes

2019-02-22 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775329#comment-16775329
 ] 

Erik Krogen commented on HDFS-14305:


Thanks for reporting this [~csun]. Given that the serial numbers are randomly 
distributed across a 32-bit space, the chance of collision should be low, but 
agreed that we need to fix this so collisions cannot happen at all. 
[~hexiaoqiao], I agree with your {{POSITIVE_MASK}} approach, but why not just 
do this within {{setSerialNo()}} itself instead of doing an additional check 
later?

> Serial number in BlockTokenSecretManager could overlap between different 
> namenodes
> --
>
> Key: HDFS-14305
> URL: https://issues.apache.org/jira/browse/HDFS-14305
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Chao Sun
>Assignee: Chao Sun
>Priority: Major
> Attachments: HDFS-14305.001.patch
>
>
> Currently, a {{BlockTokenSecretManager}} starts with a random integer as the 
> initial serial number, and then uses this formula to rotate it:
> {code:java}
> this.intRange = Integer.MAX_VALUE / numNNs;
> this.nnRangeStart = intRange * nnIndex;
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
>  {code}
> where {{numNNs}} is the total number of NameNodes in the cluster, and 
> {{nnIndex}} is the index of the current NameNode as specified in the 
> configuration {{dfs.ha.namenodes.}}.
> However, with this approach, different NameNodes can end up with overlapping 
> serial number ranges. For simplicity, assume {{Integer.MAX_VALUE}} is 100 and 
> we have 2 NameNodes, {{nn1}} and {{nn2}}, in the configuration. Then the 
> ranges for these two are:
> {code}
> nn1 -> [-49, 49]
> nn2 -> [1, 99]
> {code}
> This is because the initial serial number can be any integer, and a negative 
> value keeps its sign through {{serialNo % intRange}}, dragging the range below 
> {{nnRangeStart}}.
> Moreover, when the keys are updated, the serial number will again be updated 
> with the formula:
> {code}
> this.serialNo = (this.serialNo % intRange) + (nnRangeStart);
> {code}
> which means the new serial number can land in a range that belongs to 
> a different NameNode, increasing the chance of collision again.
> When a collision happens, DataNodes can overwrite an existing key, which 
> causes clients to fail with an {{InvalidToken}} error.
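For illustration, here is a small self-contained sketch (my own, not the attached patch) of how masking the sign bit inside {{setSerialNo()}} keeps every NameNode's serial numbers inside its own non-negative slice. It assumes {{POSITIVE_MASK}} is {{0x7FFFFFFF}}; the class and variable names only mirror the snippet above.

{code:java}
import java.security.SecureRandom;

// Standalone sketch of the range logic with sign-bit masking applied inside
// setSerialNo() itself. This is not the actual BlockTokenSecretManager source.
public class SerialNoRangeDemo {
    private static final int POSITIVE_MASK = 0x7FFFFFFF;  // clears the sign bit

    private final int intRange;      // width of each NameNode's slice
    private final int nnRangeStart;  // start of this NameNode's slice
    private int serialNo;

    SerialNoRangeDemo(int numNNs, int nnIndex) {
        this.intRange = Integer.MAX_VALUE / numNNs;
        this.nnRangeStart = intRange * nnIndex;
        setSerialNo(new SecureRandom().nextInt());
    }

    // Masking before the modulo keeps the intermediate value non-negative, so
    // the result always lands in [nnRangeStart, nnRangeStart + intRange).
    private void setSerialNo(int next) {
        this.serialNo = ((next & POSITIVE_MASK) % intRange) + nnRangeStart;
    }

    public static void main(String[] args) {
        SerialNoRangeDemo nn1 = new SerialNoRangeDemo(2, 0);
        SerialNoRangeDemo nn2 = new SerialNoRangeDemo(2, 1);
        System.out.println("nn1 serialNo = " + nn1.serialNo);  // in [0, MAX/2)
        System.out.println("nn2 serialNo = " + nn2.serialNo);  // in [MAX/2, 2*(MAX/2))
    }
}
{code}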



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-7133) Support clearing namespace quota on "/"

2019-02-22 Thread Ayush Saxena (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775305#comment-16775305
 ] 

Ayush Saxena commented on HDFS-7133:


Uploaded a patch to restore the original default on clrQuota on root.

Please review. :)

> Support clearing namespace quota on "/"
> ---
>
> Key: HDFS-7133
> URL: https://issues.apache.org/jira/browse/HDFS-7133
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Guo Ruijing
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-7133-01.patch
>
>
> existing implementation:
> 1. supports setting a namespace quota on "/"
> 2. does not support clearing the namespace quota on "/" due to HDFS-1258
> expected implementation:
> support clearing the namespace quota on "/"
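For context, a minimal sketch of what clearing the namespace quota on "/" looks like from client code is below. This is only an illustration of the call, not the patch: it assumes {{HdfsConstants.QUOTA_RESET}} clears the namespace quota and {{QUOTA_DONT_SET}} leaves the storage-space quota untouched, and whether the NameNode accepts this for "/" is exactly what this JIRA is about.

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.HdfsConstants;

// Hedged sketch: roughly the client-side equivalent of "hdfs dfsadmin -clrQuota /".
public class ClearRootQuotaExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        try (FileSystem fs = FileSystem.get(conf)) {
            DistributedFileSystem dfs = (DistributedFileSystem) fs;
            // QUOTA_RESET clears the namespace quota; QUOTA_DONT_SET leaves
            // the storage-space quota unchanged.
            dfs.setQuota(new Path("/"), HdfsConstants.QUOTA_RESET,
                    HdfsConstants.QUOTA_DONT_SET);
        }
    }
}
{code}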



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-3246) pRead equivalent for direct read path

2019-02-22 Thread Sahil Takiar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HDFS-3246:
---
Status: Open  (was: Patch Available)

> pRead equivalent for direct read path
> -
>
> Key: HDFS-3246
> URL: https://issues.apache.org/jira/browse/HDFS-3246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, performance
>Affects Versions: 3.0.0-alpha1
>Reporter: Henry Robinson
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-3246.001.patch, HDFS-3246.002.patch, 
> HDFS-3246.003.patch, HDFS-3246.004.patch
>
>
> There is no pread equivalent in ByteBufferReadable. We should consider adding 
> one. It would be relatively easy to implement for the distributed case 
> (certainly compared to HDFS-2834), since DFSInputStream does most of the 
> heavy lifting.
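As a rough illustration of the gap being described, a positional-read counterpart to {{ByteBufferReadable.read(ByteBuffer)}} could look like the sketch below; the interface name and javadoc are mine, not the API this JIRA ultimately adds.

{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;

// Illustrative only: a "pread"-style sibling of ByteBufferReadable, reading
// into a ByteBuffer at an explicit file offset without moving the stream's
// current position.
public interface ByteBufferPositionedReadableSketch {

    /**
     * Read up to buf.remaining() bytes from the given file position into buf.
     *
     * @return the number of bytes read, or -1 if the position is at or past
     *         the end of the stream
     */
    int read(long position, ByteBuffer buf) throws IOException;
}
{code}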



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7133) Support clearing namespace quota on "/"

2019-02-22 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-7133:
---
Attachment: HDFS-7133-01.patch

> Support clearing namespace quota on "/"
> ---
>
> Key: HDFS-7133
> URL: https://issues.apache.org/jira/browse/HDFS-7133
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Guo Ruijing
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-7133-01.patch
>
>
> existing implementation:
> 1. supports setting a namespace quota on "/"
> 2. does not support clearing the namespace quota on "/" due to HDFS-1258
> expected implementation:
> support clearing the namespace quota on "/"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7133) Support clearing namespace quota on "/"

2019-02-22 Thread Ayush Saxena (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ayush Saxena updated HDFS-7133:
---
Status: Patch Available  (was: Open)

> Support clearing namespace quota on "/"
> ---
>
> Key: HDFS-7133
> URL: https://issues.apache.org/jira/browse/HDFS-7133
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Reporter: Guo Ruijing
>Assignee: Ayush Saxena
>Priority: Major
> Attachments: HDFS-7133-01.patch
>
>
> existing implementation:
> 1. supports setting a namespace quota on "/"
> 2. does not support clearing the namespace quota on "/" due to HDFS-1258
> expected implementation:
> support clearing the namespace quota on "/"



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-3246) pRead equivalent for direct read path

2019-02-22 Thread Sahil Takiar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HDFS-3246:
---
Attachment: HDFS-3246.004.patch

> pRead equivalent for direct read path
> -
>
> Key: HDFS-3246
> URL: https://issues.apache.org/jira/browse/HDFS-3246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, performance
>Affects Versions: 3.0.0-alpha1
>Reporter: Henry Robinson
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-3246.001.patch, HDFS-3246.002.patch, 
> HDFS-3246.003.patch, HDFS-3246.004.patch
>
>
> There is no pread equivalent in ByteBufferReadable. We should consider adding 
> one. It would be relatively easy to implement for the distributed case 
> (certainly compared to HDFS-2834), since DFSInputStream does most of the 
> heavy lifting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-3246) pRead equivalent for direct read path

2019-02-22 Thread Sahil Takiar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sahil Takiar updated HDFS-3246:
---
Status: Patch Available  (was: Open)

> pRead equivalent for direct read path
> -
>
> Key: HDFS-3246
> URL: https://issues.apache.org/jira/browse/HDFS-3246
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client, performance
>Affects Versions: 3.0.0-alpha1
>Reporter: Henry Robinson
>Assignee: Sahil Takiar
>Priority: Major
> Attachments: HDFS-3246.001.patch, HDFS-3246.002.patch, 
> HDFS-3246.003.patch, HDFS-3246.004.patch
>
>
> There is no pread equivalent in ByteBufferReadable. We should consider adding 
> one. It would be relatively easy to implement for the distributed case 
> (certainly compared to HDFS-2834), since DFSInputStream does most of the 
> heavy lifting.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-14292) Introduce Java ExecutorService to DataXceiverServer

2019-02-22 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775165#comment-16775165
 ] 

BELUGA BEHR edited comment on HDFS-14292 at 2/22/19 2:23 PM:
-

Hello Watchers.

There are three checkstyle warnings.  Please ignore these.  To clear one of 
them, the fix would break unit tests.  The others are minor infractions and are 
artifacts of the existing code base.

The unit test "TestNamenodeCapacityReport" is failing because it is getting 
incorrect XCeiver numbers.  I'm not sure where this failure comes from, and it 
passes on my local machine, but it may be addressed by other changes such as 
HDFS-14295, because the thread group that this thread pool uses is also used in 
other areas of the DataNode and is therefore not controlled or limited by this 
pool.

 

Please consider the latest patch for inclusion into the project.

 

PR on GitHub is updated with the latest proposed changes.


was (Author: belugabehr):
Hello Watchers.

There are three checkstyle warnings.  Please ignore these.  To clear one of 
them, the fix would break unit tests.  The others are minor infractions and are 
artifacts of the existing code base.

The unit test "TestNamenodeCapacityReport" is failing because it is getting 
incorrect XCeiver numbers.  I'm not sure where this failure comes from, and it 
passes on my local machine, but it may be addressed by other changes such as 
HDFS-14295, because the thread group that this thread pool uses is also used in 
other areas of the DataNode and is therefore not controlled or limited by this 
pool.

 

Please consider the latest patch for inclusion into the project.

> Introduce Java ExecutorService to DataXceiverServer
> ---
>
> Key: HDFS-14292
> URL: https://issues.apache.org/jira/browse/HDFS-14292
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Assignee: BELUGA BEHR
>Priority: Major
> Attachments: HDFS-14292.1.patch, HDFS-14292.2.patch, 
> HDFS-14292.3.patch, HDFS-14292.4.patch, HDFS-14292.5.patch, 
> HDFS-14292.6.patch, HDFS-14292.6.patch, HDFS-14292.7.patch
>
>
> I wanted to investigate {{dfs.datanode.max.transfer.threads}} from 
> {{hdfs-site.xml}}.  It is described as "Specifies the maximum number of 
> threads to use for transferring data in and out of the DN."   The default 
> value is 4096.  I found it interesting because 4096 threads sounds like a lot 
> to me.  I'm not sure how a system with 8-16 cores would react to this large a 
> thread count.  Intuitively, I would say that the overhead of context 
> switching would be immense.
> During my investigation, I discovered the 
> [following|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataXceiverServer.java#L203-L216]
>  setup in the {{DataXceiverServer}} class:
> # A peer connects to a DataNode
> # A new thread is spun up to service this connection
> # The thread runs to completion
> # The thread dies
> It would perhaps be better if we used a thread pool to better manage the 
> lifecycle of the service threads and to allow the DataNode to re-use existing 
> threads, saving on the need to create and spin-up threads on demand.
> In this JIRA, I have added a couple of things:
> # Added a thread pool to the {{DataXceiverServer}} class that, on demand, will 
> create up to {{dfs.datanode.max.transfer.threads}} threads.  A thread that has 
> completed its prior duties will stay idle for up to 60 seconds 
> (configurable); it will be retired if no new work has arrived.
> # Added new methods to the {{Peer}} Interface to allow for better logging and 
> less code within each Thread ({{DataXceiver}}).
> # Updated the Thread code ({{DataXceiver}}) regarding its interactions with 
> {{blockReceiver}} instance variable
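As a rough sketch of the lifecycle described above (an illustration of the idea, not the patch itself), a JDK {{ThreadPoolExecutor}} capped at {{dfs.datanode.max.transfer.threads}} with a 60-second keep-alive behaves like this; the constants below are taken from the description above, not from the committed code.

{code:java}
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Minimal sketch of the pool semantics: core size 0 plus a SynchronousQueue
// means each submitted peer connection either reuses an idle worker or spins
// up a new one, up to the configured maximum; idle workers retire once the
// keep-alive expires.
public class XceiverPoolSketch {
    public static void main(String[] args) {
        int maxTransferThreads = 4096;  // dfs.datanode.max.transfer.threads default

        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, maxTransferThreads,
                60L, TimeUnit.SECONDS,              // idle workers retire after 60s
                new SynchronousQueue<Runnable>());

        pool.execute(() -> System.out.println("serving one peer connection"));
        pool.shutdown();
    }
}
{code}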



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1164) Add New blockade Tests to test Replica Manager

2019-02-22 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775175#comment-16775175
 ] 

Hadoop QA commented on HDDS-1164:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 24m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  8s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
18s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
16s{color} | {color:red} dist in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
17s{color} | {color:red} dist in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 17s{color} 
| {color:red} dist in the patch failed. {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
17s{color} | {color:red} dist in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} pylint {color} | {color:orange}  0m 
13s{color} | {color:orange} The patch generated 152 new + 390 unchanged - 2 
fixed = 542 total (was 392) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  4s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
12s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 17s{color} 
| {color:red} dist in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
26s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 50m 14s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDDS-1164 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12959762/HDDS-1164.001.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  pylint  |
| uname | Linux ad075cc398f1 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 632d5e8 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2340/artifact/out/patch-mvninstall-hadoop-ozone_dist.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2340/artifact/out/patch-compile-hadoop-ozone_dist.txt
 |
| javac | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2340/artifact/out/patch-compile-hadoop-ozone_dist.txt
 |
| mvnsite | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2340/artifact/out/patch-mvnsite-hadoop-ozone_dist.txt
 |
| pylint | v1.9.2 |
| pylint | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2340/artifact/out/diff-patch-pylint.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2340/artifact/out/patch-unit-hadoop-ozone_dist.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDDS-Build/2340/testReport/ |
| Max. process+thread count | 413 (vs. ulimit of 1) |
| 

[jira] [Commented] (HDFS-14293) Increase Default Size of dfs.stream-buffer-size

2019-02-22 Thread BELUGA BEHR (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775171#comment-16775171
 ] 

BELUGA BEHR commented on HDFS-14293:


[~shwetayakkali] I'm not sure of the best way to go about deprecating all of 
these various components.  However, if you can assist with getting HDFS-14294 
accepted, I can start tearing out that piece of the puzzle.

> Increase Default Size of dfs.stream-buffer-size
> ---
>
> Key: HDFS-14293
> URL: https://issues.apache.org/jira/browse/HDFS-14293
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 3.2.0
>Reporter: BELUGA BEHR
>Priority: Minor
>
> For many years (7+) now, the JDK has been using a default buffer size of 
> [8192 
> bytes|https://github.com/openjdk-mirror/jdk7u-jdk/blob/master/src/share/classes/java/io/BufferedInputStream.java#L53].
>   Hadoop still defaults to a size half of that: the default for 
> {{dfs.stream-buffer-size}} is 4096 bytes.
> https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
> Please increase default size to 8192.
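For reference, a minimal way to line the buffer up with the JDK default today is to override the property in configuration; this is only an illustration of the setting, not part of any patch, and whether a given read/write path honors it depends on the caller.

{code:java}
import org.apache.hadoop.conf.Configuration;

// Illustration only: overriding dfs.stream-buffer-size to the JDK's 8192-byte
// BufferedInputStream default.
public class StreamBufferSizeExample {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.setInt("dfs.stream-buffer-size", 8192);  // shipped default is 4096
        System.out.println("stream buffer size = "
                + conf.getInt("dfs.stream-buffer-size", 4096));
    }
}
{code}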



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


