[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2019-04-17 Thread He Xiaoqiao (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820742#comment-16820742
 ] 

He Xiaoqiao commented on HDFS-14437:


[~angerszhuuu], Thanks for the ping. It is an interesting dig. I think it would 
be OK to attach some documents here.
I am confused by the detailed explanation of #endCurrentLogSegment: when we 
invoke #endCurrentLogSegment, it first syncs all pending logs and then 
finalizes the log segment, all within synchronized sections. IIUC, that should 
guarantee the edits double buffer is empty before the finalize. Looking forward 
to the truth. Thanks again.
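
To make the discussion concrete, here is a minimal, self-contained sketch of 
the three-step double-buffer protocol described in the logSync() javadoc quoted 
below. This is hypothetical illustration code, not the real 
FSEditLog/EditsDoubleBuffer: the class and method shapes are invented, and only 
the bufCurrent/bufReady/isSyncRunning names come from the report.

{code:java}
import java.util.ArrayList;
import java.util.List;

// Toy model of the double-buffer protocol from logSync()'s javadoc.
public class DoubleBufferSketch {
  private List<String> bufCurrent = new ArrayList<>(); // edits appended here
  private List<String> bufReady = new ArrayList<>();   // being flushed to disk
  private volatile boolean isSyncRunning = false;

  // Writes are synchronized into the in-memory buffer (cheap).
  public synchronized void logEdit(String op) {
    bufCurrent.add(op);
  }

  // Step 1 (synchronized): swap the buffers and set the isSyncRunning flag.
  public synchronized void setReadyToFlush() {
    List<String> tmp = bufReady;
    bufReady = bufCurrent;
    bufCurrent = tmp;
    isSyncRunning = true;
  }

  // Step 2 (deliberately unsynchronized): the slow flush runs while other
  // threads keep writing into bufCurrent; step 3 (synchronized): reset the
  // flag and notify waiters.
  public void flush() throws InterruptedException {
    Thread.sleep(100); // stand-in for journal/disk latency
    bufReady.clear();
    synchronized (this) {
      isSyncRunning = false;
      notifyAll();
    }
  }

  // What waitForSyncToFinish() provides: block until no flush is in step 2.
  public synchronized void waitForSyncToFinish() throws InterruptedException {
    while (isSyncRunning) {
      wait(1000);
    }
  }

  // Holding the object lock alone does not stop a step-2 flush that another
  // thread already started, so a finalize path must wait first.
  public synchronized void close(boolean waitFirst) throws InterruptedException {
    if (waitFirst) {
      waitForSyncToFinish();
    }
    if (!bufCurrent.isEmpty() || isSyncRunning) {
      throw new IllegalStateException("bufCurrent not empty at finalize");
    }
  }
}
{code}
In this toy model, a close()/rollEditLog-style path that finalizes without 
first calling waitForSyncToFinish() can overlap a flush that is still in the 
unsynchronized step 2, which is the kind of window that would trip an 
"expects empty bufCurrent" check.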

> Exception happened when   rollEditLog expects empty 
> EditsDoubleBuffer.bufCurrent  but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode, qjm
>Reporter: angerszhu
>Priority: Major
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
> I have sorted out the process of writing and flushing the EditLog and some 
> important functions. I found that in the FSEditLog class, the close() function 
> performs the following sequence:
>  
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we hold the object lock in close(), when the waitForSyncToFinish() 
> method returns it means all logSync work has completed and all data in 
> bufReady has been flushed out; and since the current thread holds the lock on 
> this object while calling endCurrentLogSegment(), no other thread can acquire 
> the lock, so no new edits can be written into bufCurrent.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), an 
> auto-scheduled logSync()'s flush may still be in progress, because that flush 
> step does not need synchronization, as mentioned in the comment of the 
> logSync() method:
>  
> {code:java}
> /**
>  * Sync all modifications done by this thread.
>  *
>  * The internal concurrency design of this class is as follows:
>  *   - Log items are written synchronized into an in-memory buffer,
>  * and each assigned a transaction ID.
>  *   - When a thread (client) would like to sync all of its edits, logSync()
>  * uses a ThreadLocal transaction ID to determine what edit number must
>  * be synced to.
>  *   - The isSyncRunning volatile boolean tracks whether a sync is currently
>  * under progress.
>  *
>  * The data is double-buffered within each edit log implementation so that
>  * in-memory writing can occur in parallel with the on-disk writing.
>  *
>  * Each sync occurs in three steps:
>  *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>  *  flag.
>  *   2. unsynchronized, it flushes the data to storage
>  *   3. synchronized, it resets the flag and notifies anyone waiting on the
>  *  sync.
>  *
>  * The lack of synchronization on step 2 allows other threads to continue
>  * to write into the memory buffer while the sync is in progress.
>  * Because this step is unsynchronized, actions that need to avoid
>  * concurrency with sync() should be synchronized and also call
>  * waitForSyncToFinish() before assuming they are running alone.
>  */
> public void logSync() {
>   long syncStart = 0;
>   // Fetch the transactionId of this thread. 
>   long mytxid = myTransactionId.get().txid;
>   
>   boolean sync = false;
>   try {
> EditLogOutputStream logStream = null;
> synchronized (this) {
>   try {
> printStatistics(false);
> // if somebody is already syncing, then wait
> while (mytxid > synctxid && isSyncRunning) {
>   try {
> wait(1000);
>   } catch (InterruptedException ie) {
>   }
> }
> //
> // If this transaction was already flushed, then nothing to do
> //
> if (mytxid <= synctxid) {
>   numTransactionsBatchedInSync++;
>   if (metrics != null) {
> // Metrics is non-null only when used inside name node
> metrics.incrTransactionsBatchedInSync();
>   }
>   return;
> }
>
> // now, this thread will do the sync
> syncStart = txid;
> isSyncRunning = true;
> sync = true;
> // swap buffers
> try {
>   if (journalSet.isEmpty()) {
> throw new IOException("No journals available to flush");
>   }
>   editLogStream.setReadyToFlush();
> } catch (IOException e) {
>   final String msg =
>   "Could not sync enough journals to persistent storage " +
>   "due to " + e.getMessage() + ". " +
>   "Unsynced transactions: " + (txid - synctxid);
>   LOG.fatal(msg, new Exception());
>  

[jira] [Updated] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2019-04-17 Thread angerszhu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

angerszhu updated HDFS-14437:
-
Component/s: qjm
 namenode
 ha
 Issue Type: Bug  (was: Improvement)

> Exception happened when   rollEditLog expects empty 
> EditsDoubleBuffer.bufCurrent  but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, namenode, qjm
>Reporter: angerszhu
>Priority: Major
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
> I have sorted out the process of writing and flushing the EditLog and some 
> important functions. I found that in the FSEditLog class, the close() function 
> performs the following sequence:
>  
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we hold the object lock in close(), when the waitForSyncToFinish() 
> method returns it means all logSync work has completed and all data in 
> bufReady has been flushed out; and since the current thread holds the lock on 
> this object while calling endCurrentLogSegment(), no other thread can acquire 
> the lock, so no new edits can be written into bufCurrent.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), an 
> auto-scheduled logSync()'s flush may still be in progress, because that flush 
> step does not need synchronization, as mentioned in the comment of the 
> logSync() method:
>  
> {code:java}
> /**
>  * Sync all modifications done by this thread.
>  *
>  * The internal concurrency design of this class is as follows:
>  *   - Log items are written synchronized into an in-memory buffer,
>  * and each assigned a transaction ID.
>  *   - When a thread (client) would like to sync all of its edits, logSync()
>  * uses a ThreadLocal transaction ID to determine what edit number must
>  * be synced to.
>  *   - The isSyncRunning volatile boolean tracks whether a sync is currently
>  * under progress.
>  *
>  * The data is double-buffered within each edit log implementation so that
>  * in-memory writing can occur in parallel with the on-disk writing.
>  *
>  * Each sync occurs in three steps:
>  *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>  *  flag.
>  *   2. unsynchronized, it flushes the data to storage
>  *   3. synchronized, it resets the flag and notifies anyone waiting on the
>  *  sync.
>  *
>  * The lack of synchronization on step 2 allows other threads to continue
>  * to write into the memory buffer while the sync is in progress.
>  * Because this step is unsynchronized, actions that need to avoid
>  * concurrency with sync() should be synchronized and also call
>  * waitForSyncToFinish() before assuming they are running alone.
>  */
> public void logSync() {
>   long syncStart = 0;
>   // Fetch the transactionId of this thread. 
>   long mytxid = myTransactionId.get().txid;
>   
>   boolean sync = false;
>   try {
> EditLogOutputStream logStream = null;
> synchronized (this) {
>   try {
> printStatistics(false);
> // if somebody is already syncing, then wait
> while (mytxid > synctxid && isSyncRunning) {
>   try {
> wait(1000);
>   } catch (InterruptedException ie) {
>   }
> }
> //
> // If this transaction was already flushed, then nothing to do
> //
> if (mytxid <= synctxid) {
>   numTransactionsBatchedInSync++;
>   if (metrics != null) {
> // Metrics is non-null only when used inside name node
> metrics.incrTransactionsBatchedInSync();
>   }
>   return;
> }
>
> // now, this thread will do the sync
> syncStart = txid;
> isSyncRunning = true;
> sync = true;
> // swap buffers
> try {
>   if (journalSet.isEmpty()) {
> throw new IOException("No journals available to flush");
>   }
>   editLogStream.setReadyToFlush();
> } catch (IOException e) {
>   final String msg =
>   "Could not sync enough journals to persistent storage " +
>   "due to " + e.getMessage() + ". " +
>   "Unsynced transactions: " + (txid - synctxid);
>   LOG.fatal(msg, new Exception());
>   synchronized(journalSetLock) {
> IOUtils.cleanup(LOG, journalSet);
>   }
>   terminate(1, msg);
> }
>   } finally {
> // Prevent RuntimeException from blocking other log edit write 
> doneWithAutoSyncScheduling();
>   }
>   //editLogStream may become null,
>   //so store a 

[jira] [Commented] (HDDS-1447) Fix CheckStyle warnings

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820729#comment-16820729
 ] 

Hadoop QA commented on HDDS-1447:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
25s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
47s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
9s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
9s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  4m  
5s{color} | {color:blue} Used deprecated FindBugs config; considering switching 
to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m  
6s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
28s{color} | {color:green} The patch passed checkstyle in hadoop-hdds {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
27s{color} | {color:green} hadoop-ozone: The patch generated 0 new + 0 
unchanged - 3 fixed = 0 total (was 3) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 59s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m  
9s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  3m 35s{color} 
| {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 18m  4s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 91m 37s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.ozone.client.rpc.TestSecureOzoneRpcClient |
|   | hadoop.ozone.scm.node.TestQueryNode |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/PreCommit-HDDS-Build/2665/artifact/out/Dockerfile 
|
| JIRA Issue | HDDS-1447 |
| JIRA Patch URL | 

[jira] [Commented] (HDDS-1447) Fix CheckStyle warnings

2019-04-17 Thread Wanqiang Ji (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820705#comment-16820705
 ] 

Wanqiang Ji commented on HDDS-1447:
---

Thanks [~anu] for reviewing.

> Fix CheckStyle warnings 
> 
>
> Key: HDDS-1447
> URL: https://issues.apache.org/jira/browse/HDDS-1447
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Attachments: HDDS-1447.001.patch
>
>
> We had a full acceptance test + unit test build for 
> [HDDS-1433|https://issues.apache.org/jira/browse/HDDS-1433] : 
> [https://ci.anzix.net/job/ozone/16677/] gave 3 warnings belonging to Ozone.
> *Modules:*
>  * [Apache Hadoop Ozone 
> Client|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.1350159737/]
>  ** KeyOutputStream.java:319
>  ** KeyOutputStream.java:622
>  * [Apache Hadoop Ozone Integration 
> Tests|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.-1713756601/]
>  ** ContainerTestHelper.java:731






[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2019-04-17 Thread angerszhu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820703#comment-16820703
 ] 

angerszhu commented on HDFS-14437:
--

If someone understands Chinese, please contact me to chat about this, since I 
have written some documents in Chinese explaining it: 
[angers@gmail.com|mailto:angers@gmail.com] 

> Exception happened when   rollEditLog expects empty 
> EditsDoubleBuffer.bufCurrent  but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: angerszhu
>Priority: Major
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
> I have sorted out the process of writing and flushing the EditLog and some 
> important functions. I found that in the FSEditLog class, the close() function 
> performs the following sequence:
>  
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we hold the object lock in close(), when the waitForSyncToFinish() 
> method returns it means all logSync work has completed and all data in 
> bufReady has been flushed out; and since the current thread holds the lock on 
> this object while calling endCurrentLogSegment(), no other thread can acquire 
> the lock, so no new edits can be written into bufCurrent.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), an 
> auto-scheduled logSync()'s flush may still be in progress, because that flush 
> step does not need synchronization, as mentioned in the comment of the 
> logSync() method:
>  
> {code:java}
> /**
>  * Sync all modifications done by this thread.
>  *
>  * The internal concurrency design of this class is as follows:
>  *   - Log items are written synchronized into an in-memory buffer,
>  * and each assigned a transaction ID.
>  *   - When a thread (client) would like to sync all of its edits, logSync()
>  * uses a ThreadLocal transaction ID to determine what edit number must
>  * be synced to.
>  *   - The isSyncRunning volatile boolean tracks whether a sync is currently
>  * under progress.
>  *
>  * The data is double-buffered within each edit log implementation so that
>  * in-memory writing can occur in parallel with the on-disk writing.
>  *
>  * Each sync occurs in three steps:
>  *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>  *  flag.
>  *   2. unsynchronized, it flushes the data to storage
>  *   3. synchronized, it resets the flag and notifies anyone waiting on the
>  *  sync.
>  *
>  * The lack of synchronization on step 2 allows other threads to continue
>  * to write into the memory buffer while the sync is in progress.
>  * Because this step is unsynchronized, actions that need to avoid
>  * concurrency with sync() should be synchronized and also call
>  * waitForSyncToFinish() before assuming they are running alone.
>  */
> public void logSync() {
>   long syncStart = 0;
>   // Fetch the transactionId of this thread. 
>   long mytxid = myTransactionId.get().txid;
>   
>   boolean sync = false;
>   try {
> EditLogOutputStream logStream = null;
> synchronized (this) {
>   try {
> printStatistics(false);
> // if somebody is already syncing, then wait
> while (mytxid > synctxid && isSyncRunning) {
>   try {
> wait(1000);
>   } catch (InterruptedException ie) {
>   }
> }
> //
> // If this transaction was already flushed, then nothing to do
> //
> if (mytxid <= synctxid) {
>   numTransactionsBatchedInSync++;
>   if (metrics != null) {
> // Metrics is non-null only when used inside name node
> metrics.incrTransactionsBatchedInSync();
>   }
>   return;
> }
>
> // now, this thread will do the sync
> syncStart = txid;
> isSyncRunning = true;
> sync = true;
> // swap buffers
> try {
>   if (journalSet.isEmpty()) {
> throw new IOException("No journals available to flush");
>   }
>   editLogStream.setReadyToFlush();
> } catch (IOException e) {
>   final String msg =
>   "Could not sync enough journals to persistent storage " +
>   "due to " + e.getMessage() + ". " +
>   "Unsynced transactions: " + (txid - synctxid);
>   LOG.fatal(msg, new Exception());
>   synchronized(journalSetLock) {
> IOUtils.cleanup(LOG, journalSet);
>   }
>   terminate(1, msg);
> }
>   } finally {
> // Prevent RuntimeException from blocking other log edit write 
> 

[jira] [Updated] (HDFS-14438) Fix typo in HDFS for OfflineEditsVisitorFactory.java

2019-04-17 Thread bianqi (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

bianqi updated HDFS-14438:
--
Attachment: HDFS-14438.1.patch

> Fix typo in HDFS for OfflineEditsVisitorFactory.java
> 
>
> Key: HDFS-14438
> URL: https://issues.apache.org/jira/browse/HDFS-14438
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: bianqi
>Priority: Major
>  Labels: newbie
> Attachments: HDFS-14438.1.patch
>
>
> https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/OfflineEditsVisitorFactory.java#L68
> proccesor -> processor






[jira] [Created] (HDFS-14438) Fix typo in HDFS for OfflineEditsVisitorFactory.java

2019-04-17 Thread bianqi (JIRA)
bianqi created HDFS-14438:
-

 Summary: Fix typo in HDFS for OfflineEditsVisitorFactory.java
 Key: HDFS-14438
 URL: https://issues.apache.org/jira/browse/HDFS-14438
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs
Affects Versions: 3.1.2
Reporter: bianqi


https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/tools/offlineEditsViewer/OfflineEditsVisitorFactory.java#L68
proccesor -> processor






[jira] [Commented] (HDDS-1447) Fix CheckStyle warnings

2019-04-17 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820689#comment-16820689
 ] 

Anu Engineer commented on HDDS-1447:


Thank you. I really appreciate your attention to detail and help in making 
Ozone better. +1, I will commit this as soon as I have a Jenkins run. 

> Fix CheckStyle warnings 
> 
>
> Key: HDDS-1447
> URL: https://issues.apache.org/jira/browse/HDDS-1447
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Attachments: HDDS-1447.001.patch
>
>
> We had a full acceptance test + unit test build for 
> [HDDS-1433|https://issues.apache.org/jira/browse/HDDS-1433] : 
> [https://ci.anzix.net/job/ozone/16677/] gave 3 warnings belonging to Ozone.
> *Modules:*
>  * [Apache Hadoop Ozone 
> Client|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.1350159737/]
>  ** KeyOutputStream.java:319
>  ** KeyOutputStream.java:622
>  * [Apache Hadoop Ozone Integration 
> Tests|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.-1713756601/]
>  ** ContainerTestHelper.java:731






[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2019-04-17 Thread angerszhu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820687#comment-16820687
 ] 

angerszhu commented on HDFS-14437:
--

[~yzhangal] [~hexiaoqiao]

Hi, I have met this problem too. Could you take a look at my explanation?

> Exception happened when   rollEditLog expects empty 
> EditsDoubleBuffer.bufCurrent  but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: angerszhu
>Priority: Major
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
> I have sorted out the process of writing and flushing the EditLog and some 
> important functions. I found that in the FSEditLog class, the close() function 
> performs the following sequence:
>  
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we hold the object lock in close(), when the waitForSyncToFinish() 
> method returns it means all logSync work has completed and all data in 
> bufReady has been flushed out; and since the current thread holds the lock on 
> this object while calling endCurrentLogSegment(), no other thread can acquire 
> the lock, so no new edits can be written into bufCurrent.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), an 
> auto-scheduled logSync()'s flush may still be in progress, because that flush 
> step does not need synchronization, as mentioned in the comment of the 
> logSync() method:
>  
> {code:java}
> /**
>  * Sync all modifications done by this thread.
>  *
>  * The internal concurrency design of this class is as follows:
>  *   - Log items are written synchronized into an in-memory buffer,
>  * and each assigned a transaction ID.
>  *   - When a thread (client) would like to sync all of its edits, logSync()
>  * uses a ThreadLocal transaction ID to determine what edit number must
>  * be synced to.
>  *   - The isSyncRunning volatile boolean tracks whether a sync is currently
>  * under progress.
>  *
>  * The data is double-buffered within each edit log implementation so that
>  * in-memory writing can occur in parallel with the on-disk writing.
>  *
>  * Each sync occurs in three steps:
>  *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>  *  flag.
>  *   2. unsynchronized, it flushes the data to storage
>  *   3. synchronized, it resets the flag and notifies anyone waiting on the
>  *  sync.
>  *
>  * The lack of synchronization on step 2 allows other threads to continue
>  * to write into the memory buffer while the sync is in progress.
>  * Because this step is unsynchronized, actions that need to avoid
>  * concurrency with sync() should be synchronized and also call
>  * waitForSyncToFinish() before assuming they are running alone.
>  */
> public void logSync() {
>   long syncStart = 0;
>   // Fetch the transactionId of this thread. 
>   long mytxid = myTransactionId.get().txid;
>   
>   boolean sync = false;
>   try {
> EditLogOutputStream logStream = null;
> synchronized (this) {
>   try {
> printStatistics(false);
> // if somebody is already syncing, then wait
> while (mytxid > synctxid && isSyncRunning) {
>   try {
> wait(1000);
>   } catch (InterruptedException ie) {
>   }
> }
> //
> // If this transaction was already flushed, then nothing to do
> //
> if (mytxid <= synctxid) {
>   numTransactionsBatchedInSync++;
>   if (metrics != null) {
> // Metrics is non-null only when used inside name node
> metrics.incrTransactionsBatchedInSync();
>   }
>   return;
> }
>
> // now, this thread will do the sync
> syncStart = txid;
> isSyncRunning = true;
> sync = true;
> // swap buffers
> try {
>   if (journalSet.isEmpty()) {
> throw new IOException("No journals available to flush");
>   }
>   editLogStream.setReadyToFlush();
> } catch (IOException e) {
>   final String msg =
>   "Could not sync enough journals to persistent storage " +
>   "due to " + e.getMessage() + ". " +
>   "Unsynced transactions: " + (txid - synctxid);
>   LOG.fatal(msg, new Exception());
>   synchronized(journalSetLock) {
> IOUtils.cleanup(LOG, journalSet);
>   }
>   terminate(1, msg);
> }
>   } finally {
> // Prevent RuntimeException from blocking other log edit write 
> doneWithAutoSyncScheduling();
>   }
>   //editLogStream may become null,
>   //so store a local 

[jira] [Commented] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2019-04-17 Thread angerszhu (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820686#comment-16820686
 ] 

angerszhu commented on HDFS-14437:
--

[~kihwal] [~daryn] 

Hi both, could you take a look at my explanation of this issue? We have met 
this problem in our production environment too.

> Exception happened when   rollEditLog expects empty 
> EditsDoubleBuffer.bufCurrent  but not
> -
>
> Key: HDFS-14437
> URL: https://issues.apache.org/jira/browse/HDFS-14437
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: angerszhu
>Priority: Major
>
> For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
> I have sorted out the process of writing and flushing the EditLog and some 
> important functions. I found that in the FSEditLog class, the close() function 
> performs the following sequence:
>  
> {code:java}
> waitForSyncToFinish();
> endCurrentLogSegment(true);{code}
> Since we hold the object lock in close(), when the waitForSyncToFinish() 
> method returns it means all logSync work has completed and all data in 
> bufReady has been flushed out; and since the current thread holds the lock on 
> this object while calling endCurrentLogSegment(), no other thread can acquire 
> the lock, so no new edits can be written into bufCurrent.
> But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), an 
> auto-scheduled logSync()'s flush may still be in progress, because that flush 
> step does not need synchronization, as mentioned in the comment of the 
> logSync() method:
>  
> {code:java}
> /**
>  * Sync all modifications done by this thread.
>  *
>  * The internal concurrency design of this class is as follows:
>  *   - Log items are written synchronized into an in-memory buffer,
>  * and each assigned a transaction ID.
>  *   - When a thread (client) would like to sync all of its edits, logSync()
>  * uses a ThreadLocal transaction ID to determine what edit number must
>  * be synced to.
>  *   - The isSyncRunning volatile boolean tracks whether a sync is currently
>  * under progress.
>  *
>  * The data is double-buffered within each edit log implementation so that
>  * in-memory writing can occur in parallel with the on-disk writing.
>  *
>  * Each sync occurs in three steps:
>  *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
>  *  flag.
>  *   2. unsynchronized, it flushes the data to storage
>  *   3. synchronized, it resets the flag and notifies anyone waiting on the
>  *  sync.
>  *
>  * The lack of synchronization on step 2 allows other threads to continue
>  * to write into the memory buffer while the sync is in progress.
>  * Because this step is unsynchronized, actions that need to avoid
>  * concurrency with sync() should be synchronized and also call
>  * waitForSyncToFinish() before assuming they are running alone.
>  */
> public void logSync() {
>   long syncStart = 0;
>   // Fetch the transactionId of this thread. 
>   long mytxid = myTransactionId.get().txid;
>   
>   boolean sync = false;
>   try {
> EditLogOutputStream logStream = null;
> synchronized (this) {
>   try {
> printStatistics(false);
> // if somebody is already syncing, then wait
> while (mytxid > synctxid && isSyncRunning) {
>   try {
> wait(1000);
>   } catch (InterruptedException ie) {
>   }
> }
> //
> // If this transaction was already flushed, then nothing to do
> //
> if (mytxid <= synctxid) {
>   numTransactionsBatchedInSync++;
>   if (metrics != null) {
> // Metrics is non-null only when used inside name node
> metrics.incrTransactionsBatchedInSync();
>   }
>   return;
> }
>
> // now, this thread will do the sync
> syncStart = txid;
> isSyncRunning = true;
> sync = true;
> // swap buffers
> try {
>   if (journalSet.isEmpty()) {
> throw new IOException("No journals available to flush");
>   }
>   editLogStream.setReadyToFlush();
> } catch (IOException e) {
>   final String msg =
>   "Could not sync enough journals to persistent storage " +
>   "due to " + e.getMessage() + ". " +
>   "Unsynced transactions: " + (txid - synctxid);
>   LOG.fatal(msg, new Exception());
>   synchronized(journalSetLock) {
> IOUtils.cleanup(LOG, journalSet);
>   }
>   terminate(1, msg);
> }
>   } finally {
> // Prevent RuntimeException from blocking other log edit write 
> doneWithAutoSyncScheduling();
>   }
>   //editLogStream may 

[jira] [Created] (HDFS-14437) Exception happened when rollEditLog expects empty EditsDoubleBuffer.bufCurrent but not

2019-04-17 Thread angerszhu (JIRA)
angerszhu created HDFS-14437:


 Summary: Exception happened when   rollEditLog expects empty 
EditsDoubleBuffer.bufCurrent  but not
 Key: HDFS-14437
 URL: https://issues.apache.org/jira/browse/HDFS-14437
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: angerszhu


For the problem mentioned in https://issues.apache.org/jira/browse/HDFS-10943, 
I have sorted out the process of writing and flushing the EditLog and some 
important functions. I found that in the FSEditLog class, the close() function 
performs the following sequence:

 
{code:java}
waitForSyncToFinish();
endCurrentLogSegment(true);{code}
Since we hold the object lock in close(), when the waitForSyncToFinish() method 
returns it means all logSync work has completed and all data in bufReady has 
been flushed out; and since the current thread holds the lock on this object 
while calling endCurrentLogSegment(), no other thread can acquire the lock, so 
no new edits can be written into bufCurrent.

But if we don't call waitForSyncToFinish() before endCurrentLogSegment(), an 
auto-scheduled logSync()'s flush may still be in progress, because that flush 
step does not need synchronization, as mentioned in the comment of the 
logSync() method:

 
{code:java}
/**
 * Sync all modifications done by this thread.
 *
 * The internal concurrency design of this class is as follows:
 *   - Log items are written synchronized into an in-memory buffer,
 * and each assigned a transaction ID.
 *   - When a thread (client) would like to sync all of its edits, logSync()
 * uses a ThreadLocal transaction ID to determine what edit number must
 * be synced to.
 *   - The isSyncRunning volatile boolean tracks whether a sync is currently
 * under progress.
 *
 * The data is double-buffered within each edit log implementation so that
 * in-memory writing can occur in parallel with the on-disk writing.
 *
 * Each sync occurs in three steps:
 *   1. synchronized, it swaps the double buffer and sets the isSyncRunning
 *  flag.
 *   2. unsynchronized, it flushes the data to storage
 *   3. synchronized, it resets the flag and notifies anyone waiting on the
 *  sync.
 *
 * The lack of synchronization on step 2 allows other threads to continue
 * to write into the memory buffer while the sync is in progress.
 * Because this step is unsynchronized, actions that need to avoid
 * concurrency with sync() should be synchronized and also call
 * waitForSyncToFinish() before assuming they are running alone.
 */
public void logSync() {
  long syncStart = 0;

  // Fetch the transactionId of this thread. 
  long mytxid = myTransactionId.get().txid;
  
  boolean sync = false;
  try {
EditLogOutputStream logStream = null;
synchronized (this) {
  try {
printStatistics(false);

// if somebody is already syncing, then wait
while (mytxid > synctxid && isSyncRunning) {
  try {
wait(1000);
  } catch (InterruptedException ie) {
  }
}

//
// If this transaction was already flushed, then nothing to do
//
if (mytxid <= synctxid) {
  numTransactionsBatchedInSync++;
  if (metrics != null) {
// Metrics is non-null only when used inside name node
metrics.incrTransactionsBatchedInSync();
  }
  return;
}
   
// now, this thread will do the sync
syncStart = txid;
isSyncRunning = true;
sync = true;

// swap buffers
try {
  if (journalSet.isEmpty()) {
throw new IOException("No journals available to flush");
  }
  editLogStream.setReadyToFlush();
} catch (IOException e) {
  final String msg =
  "Could not sync enough journals to persistent storage " +
  "due to " + e.getMessage() + ". " +
  "Unsynced transactions: " + (txid - synctxid);
  LOG.fatal(msg, new Exception());
  synchronized(journalSetLock) {
IOUtils.cleanup(LOG, journalSet);
  }
  terminate(1, msg);
}
  } finally {
// Prevent RuntimeException from blocking other log edit write 
doneWithAutoSyncScheduling();
  }
  //editLogStream may become null,
  //so store a local variable for flush.
  logStream = editLogStream;
}

// do the sync
long start = now();
try {
  if (logStream != null) {
logStream.flush();
  }
} catch (IOException ex) {
  synchronized (this) {
final String msg =
"Could not sync enough journals to persistent storage. "
+ "Unsynced transactions: " + (txid - synctxid);
LOG.fatal(msg, new Exception());
synchronized(journalSetLock) {
  IOUtils.cleanup(LOG, journalSet);
}
terminate(1, msg);
  }
}

[jira] [Updated] (HDDS-1447) Fix CheckStyle warnings

2019-04-17 Thread Wanqiang Ji (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wanqiang Ji updated HDDS-1447:
--
Description: 
We had a full acceptance test + unit test build for 
[HDDS-1433|https://issues.apache.org/jira/browse/HDDS-1433] : 
[https://ci.anzix.net/job/ozone/16677/] gave 3 warnings belonging to Ozone.

*Modules:*
 * [Apache Hadoop Ozone 
Client|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.1350159737/]
 ** KeyOutputStream.java:319
 ** KeyOutputStream.java:622
 * [Apache Hadoop Ozone Integration 
Tests|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.-1713756601/]
 ** ContainerTestHelper.java:731

  was:
We had a full acceptance test + unit test build: 
[https://ci.anzix.net/job/ozone/16677/] gave 3 warnings belonging to Ozone.

*Modules:*
 * [Apache Hadoop Ozone 
Client|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.1350159737/]
 ** KeyOutputStream.java:319
 ** KeyOutputStream.java:622
 * [Apache Hadoop Ozone Integration 
Tests|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.-1713756601/]
 ** ContainerTestHelper.java:731


> Fix CheckStyle warnings 
> 
>
> Key: HDDS-1447
> URL: https://issues.apache.org/jira/browse/HDDS-1447
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Attachments: HDDS-1447.001.patch
>
>
> We had a full acceptance test + unit test build for 
> [HDDS-1433|https://issues.apache.org/jira/browse/HDDS-1433] : 
> [https://ci.anzix.net/job/ozone/16677/] gave 3 warnings belonging to Ozone.
> *Modules:*
>  * [Apache Hadoop Ozone 
> Client|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.1350159737/]
>  ** KeyOutputStream.java:319
>  ** KeyOutputStream.java:622
>  * [Apache Hadoop Ozone Integration 
> Tests|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.-1713756601/]
>  ** ContainerTestHelper.java:731






[jira] [Updated] (HDDS-1447) Fix CheckStyle warnings

2019-04-17 Thread Wanqiang Ji (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wanqiang Ji updated HDDS-1447:
--
Attachment: HDDS-1447.001.patch
Status: Patch Available  (was: Open)

> Fix CheckStyle warnings 
> 
>
> Key: HDDS-1447
> URL: https://issues.apache.org/jira/browse/HDDS-1447
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Wanqiang Ji
>Assignee: Wanqiang Ji
>Priority: Major
> Attachments: HDDS-1447.001.patch
>
>
> We had a full acceptance test + unit test build: 
> [https://ci.anzix.net/job/ozone/16677/] gave 3 warnings belonging to Ozone.
> *Modules:*
>  * [Apache Hadoop Ozone 
> Client|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.1350159737/]
>  ** KeyOutputStream.java:319
>  ** KeyOutputStream.java:622
>  * [Apache Hadoop Ozone Integration 
> Tests|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.-1713756601/]
>  ** ContainerTestHelper.java:731






[jira] [Comment Edited] (HDDS-1433) Rename GetScmInfoRespsonseProto to GetScmInfoResponseProto due to typos

2019-04-17 Thread Wanqiang Ji (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820650#comment-16820650
 ] 

Wanqiang Ji edited comment on HDDS-1433 at 4/18/19 2:47 AM:


Thanks [~elek] for reviewing. 
 * The 3 CheckStyle warnings are not related to this patch. I created 
[HDDS-1447|https://issues.apache.org/jira/browse/HDDS-1447] to fix them.
 * The UT failure is not related to this patch.


was (Author: jiwq):
Thanks [~elek] for reviewing. 
 * The 3 CheckStyle warnings are not related to this patch. I will create a new 
JIRA to fix them.
 * The UT failure is not related to this patch.

> Rename GetScmInfoRespsonseProto to GetScmInfoResponseProto due to typos
> ---
>
> Key: HDDS-1433
> URL: https://issues.apache.org/jira/browse/HDDS-1433
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: bianqi
>Assignee: Wanqiang Ji
>Priority: Major
>  Labels: newbie
> Attachments: HDDS-1433.001.patch
>
>
> We got a typo in hdds.proto file
> - {{GetScmInfoRespsonseProto}}






[jira] [Created] (HDDS-1447) Fix CheckStyle warnings

2019-04-17 Thread Wanqiang Ji (JIRA)
Wanqiang Ji created HDDS-1447:
-

 Summary: Fix CheckStyle warnings 
 Key: HDDS-1447
 URL: https://issues.apache.org/jira/browse/HDDS-1447
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Wanqiang Ji
Assignee: Wanqiang Ji


We had a full acceptance test + unit test build: 
[https://ci.anzix.net/job/ozone/16677/] gave 3 warnings belonging to Ozone.

*Modules:*
 * [Apache Hadoop Ozone 
Client|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.1350159737/]
 ** KeyOutputStream.java:319
 ** KeyOutputStream.java:622
 * [Apache Hadoop Ozone Integration 
Tests|https://ci.anzix.net/job/ozone/16677/checkstyle/new/moduleName.-1713756601/]
 ** ContainerTestHelper.java:731






[jira] [Commented] (HDDS-1433) Rename GetScmInfoRespsonseProto to GetScmInfoResponseProto due to typos

2019-04-17 Thread Wanqiang Ji (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820650#comment-16820650
 ] 

Wanqiang Ji commented on HDDS-1433:
---

Thanks [~elek] for reviewing. 
 * The 3 CheckStyle warnings are not related to this patch. I will create a new 
JIRA to fix them.
 * The UT failure is not related to this patch.

> Rename GetScmInfoRespsonseProto to GetScmInfoResponseProto due to typos
> ---
>
> Key: HDDS-1433
> URL: https://issues.apache.org/jira/browse/HDDS-1433
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.3.0
>Reporter: bianqi
>Assignee: Wanqiang Ji
>Priority: Major
>  Labels: newbie
> Attachments: HDDS-1433.001.patch
>
>
> We got a typo in hdds.proto file
> - {{GetScmInfoRespsonseProto}}






[jira] [Updated] (HDFS-14436) Configuration#getTimeDuration is not consistent between default value and manual settings.

2019-04-17 Thread star (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

star updated HDFS-14436:

Description: 
When calling getTimeDuration like this:
{quote}conf.getTimeDuration("nn.interval", 10, TimeUnit.SECONDS, 
TimeUnit.MILLISECONDS);
{quote}
 If "nn.interval" is set manually or configured in an xml file, 10000 will be 
returned.
 If not, 10 will be returned while 10000 is expected. 
 The logic is not consistent.

  was:
When calling getTimeDuration like this:
{quote}conf.getTimeDuration("nn.interval", 10, TimeUnit.SECONDS, 
TimeUnit.MILLISECONDS);
{quote}
 If "nn.interval" is set manually or configured in an xml file, 10000 will be 
returned.
 If not, 10 will be returned while 10000 is expected. 
 The logic is not consistent.


> Configuration#getTimeDuration is not consistent between default value and 
> manual settings.
> --
>
> Key: HDFS-14436
> URL: https://issues.apache.org/jira/browse/HDFS-14436
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: star
>Assignee: star
>Priority: Major
>
> When calling getTimeDuration like this:
> {quote}conf.getTimeDuration("nn.interval", 10, TimeUnit.SECONDS, 
> TimeUnit.MILLISECONDS);
> {quote}
>  If "nn.interval" is set manually or configured in an xml file, 10000 will be 
> returned.
>  If not, 10 will be returned while 10000 is expected. 
>  The logic is not consistent.
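
For illustration, here is a self-contained toy reproduction of the 
inconsistency described above. This is a hypothetical sketch: the class and its 
property map are invented stand-ins, not the real 
org.apache.hadoop.conf.Configuration implementation.

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

public class TimeDurationSketch {
  private final Map<String, String> props = new HashMap<>();

  public void set(String name, String value) { props.put(name, value); }

  // Buggy shape: a configured value is converted from defaultUnit to
  // returnUnit, but an absent property returns the default unconverted.
  public long getTimeDuration(String name, long defaultValue,
                              TimeUnit defaultUnit, TimeUnit returnUnit) {
    String raw = props.get(name);
    if (raw == null) {
      return defaultValue; // inconsistent: no unit conversion applied
    }
    return returnUnit.convert(Long.parseLong(raw), defaultUnit);
  }

  public static void main(String[] args) {
    TimeDurationSketch conf = new TimeDurationSketch();
    conf.set("nn.interval", "10"); // 10 seconds
    // Configured: converted to milliseconds as expected -> prints 10000
    System.out.println(conf.getTimeDuration("nn.interval", 10,
        TimeUnit.SECONDS, TimeUnit.MILLISECONDS));
    // Not configured: the default 10 (seconds) comes back raw -> prints 10
    System.out.println(conf.getTimeDuration("nn.missing", 10,
        TimeUnit.SECONDS, TimeUnit.MILLISECONDS));
  }
}
{code}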






[jira] [Updated] (HDFS-14436) Configuration#getTimeDuration is not consistent between default value and manual settings.

2019-04-17 Thread star (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

star updated HDFS-14436:

Description: 
When calling getTimeDuration like this:
{quote}conf.getTimeDuration("nn.interval", 10, TimeUnit.SECONDS, 
TimeUnit.MILLISECONDS);
{quote}
 If "nn.interval" is set manually or configured in an xml file, 10000 will be 
returned.
 If not, 10 will be returned while 10000 is expected. 
 The logic is not consistent.

  was:
When calling getTimeDuration like this:
{quote}conf.getTimeDuration("nn.interval", 10, TimeUnit.SECONDS, 
TimeUnit.MILLISECONDS);
{quote}
If "nn.interval" is set manually or configured in an xml file, 10000 will be 
returned.

If not, 10 will be returned while 10000 is expected.

The logic is not consistent.


> Configuration#getTimeDuration is not consistent between default value and 
> manual settings.
> --
>
> Key: HDFS-14436
> URL: https://issues.apache.org/jira/browse/HDFS-14436
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: star
>Assignee: star
>Priority: Major
>
> When calling getTimeDuration like this:
> {quote}conf.getTimeDuration("nn.interval", 10, TimeUnit.SECONDS, 
> TimeUnit.MILLISECONDS);
> {quote}
>  If "nn.interval" is set manually or configured in an xml file, 10000 will be 
> returned.
>  If not, 10 will be returned while 10000 is expected. 
>  The logic is not consistent.






[jira] [Updated] (HDFS-14436) Configuration#getTimeDuration is not consistent between default value and manual settings.

2019-04-17 Thread star (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

star updated HDFS-14436:

Description: 
When calling getTimeDuration like this:
{quote}conf.getTimeDuration("nn.interval", 10, TimeUnit.SECONDS, 
TimeUnit.MILLISECONDS);
{quote}
If "nn.interval" is set manually or configured in an xml file, 10000 will be 
returned.

If not, 10 will be returned while 10000 is expected.

The logic is not consistent.

  was:
When calling getTimeDuration like this:
{quote}conf.getTimeDuration("property", 10, TimeUnit.SECONDS, 
TimeUnit.MILLISECONDS);
{quote}


> Configuration#getTimeDuration is not consistent between default value and 
> manual settings.
> --
>
> Key: HDFS-14436
> URL: https://issues.apache.org/jira/browse/HDFS-14436
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: star
>Assignee: star
>Priority: Major
>
> When calling getTimeDuration like this:
> {quote}conf.getTimeDuration("nn.interval", 10, TimeUnit.SECONDS, 
> TimeUnit.MILLISECONDS);
> {quote}
> If "nn.interval" is set manually or configured in an xml file, 10000 will be 
> returned.
> If not, 10 will be returned while 10000 is expected.
> The logic is not consistent.






[jira] [Created] (HDFS-14436) Configuration#getTimeDuration is not consistent between default value and manual settings.

2019-04-17 Thread star (JIRA)
star created HDFS-14436:
---

 Summary: Configuration#getTimeDuration is not consistent between 
default value and manual settings.
 Key: HDFS-14436
 URL: https://issues.apache.org/jira/browse/HDFS-14436
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: star
Assignee: star


When calling getTimeDuration like this:
{quote}conf.getTimeDuration("property", 10, TimeUnit.SECONDS, 
TimeUnit.MILLISECONDS);
{quote}






[jira] [Commented] (HDFS-12510) RBF: Add security to UI

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820634#comment-16820634
 ] 

Hadoop QA commented on HDFS-12510:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
50s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-13891 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
15s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
43s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
30s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
40s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 
55s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
34s{color} | {color:green} HDFS-13891 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
28s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
36s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 26m 
37s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 85m  0s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-12510 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12966285/HDFS-12510-HDFS-13891.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 7f1cc9687af4 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-13891 / bd3161e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26665/testReport/ |
| Max. process+thread count | 992 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26665/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RBF: Add security to UI
> ---
>
> Key: HDFS-12510
> URL: 

[jira] [Commented] (HDFS-14434) webhdfs that connect secure hdfs should not use user.name parameter

2019-04-17 Thread KWON BYUNGCHANG (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820633#comment-16820633
 ] 

KWON BYUNGCHANG commented on HDFS-14434:


[~kihwal] Thank you for the reply. I agree with your opinion. I think both the 
client and the server need to be modified (the server should ignore the 
user.name parameter in secure mode, and the client should not send user.name 
in secure mode).

I will update the patch.
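
A minimal sketch of the client-side half of that idea follows. It is a 
hypothetical helper, not the actual WebHdfsFileSystem code; the only real 
Hadoop API it assumes is UserGroupInformation.isSecurityEnabled().

{code:java}
import org.apache.hadoop.security.UserGroupInformation;

final class UserNameParamSketch {
  // Append user.name only against insecure clusters; in secure mode the
  // request should carry no user.name and authenticate via SPNEGO instead.
  static String userNameQueryParam(String user) {
    if (UserGroupInformation.isSecurityEnabled()) {
      return "";
    }
    return "&user.name=" + user;
  }
}
{code}
The server-side half would be the mirror image: when security is enabled, 
ignore any incoming user.name parameter rather than matching it against the 
SPNEGO principal.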

> webhdfs that connect secure hdfs should not use user.name parameter
> ---
>
> Key: HDFS-14434
> URL: https://issues.apache.org/jira/browse/HDFS-14434
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Minor
> Attachments: HDFS-14434.001.patch
>
>
> I have two secure Hadoop clusters. Both clusters use cross-realm 
> authentication. 
> [use...@a.com|mailto:use...@a.com] can access the HDFS of the B.COM realm.
> By the way, the Hadoop username of use...@a.com in the B.COM realm is 
> cross_realm_a_com_user_a.
> The hdfs dfs command of use...@a.com against B.COM webhdfs failed.
> The root cause is that webhdfs connecting to a secure HDFS sends the 
> user.name parameter.
> According to the webhdfs spec, insecure webhdfs uses user.name, while secure 
> webhdfs uses SPNEGO for authentication.
> I think webhdfs connecting to a secure HDFS should not send the user.name 
> parameter.
> I will attach a patch.
> Below is the error log:
>  
> {noformat}
> $ hdfs dfs -ls  webhdfs://b.com:50070/
> ls: Usernames not matched: name=user_a != expected=cross_realm_a_com_user_a
>  
> # user.name in cross realm webhdfs
> $ curl -u : --negotiate 
> 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN&user.name=user_a' 
> {"RemoteException":{"exception":"SecurityException","javaClassName":"java.lang.SecurityException","message":"Failed
>  to obtain user group information: java.io.IOException: Usernames not 
> matched: name=user_a != expected=cross_realm_a_com_user_a"}}
> # USE SPNEGO
> $ curl -u : --negotiate 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN'
> {"Token"{"urlString":"XgA."}}
>  
> {noformat}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14406) Add per user RPC Processing time

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820632#comment-16820632
 ] 

Hadoop QA commented on HDFS-14406:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  0m 
30s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
47s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 47s{color} 
| {color:red} root in the patch failed. {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 35s{color} | {color:orange} hadoop-common-project/hadoop-common: The patch 
generated 15 new + 357 unchanged - 5 fixed = 372 total (was 362) {color} |
| {color:red}-1{color} | {color:red} mvnsite {color} | {color:red}  0m 
31s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:red}-1{color} | {color:red} shadedclient {color} | {color:red}  0m 
28s{color} | {color:red} patch has errors when building and testing our client 
artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
22s{color} | {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 29s{color} 
| {color:red} hadoop-common in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 59m 32s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14406 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12966299/HDFS-14406.005.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux af2708adc721 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 685cb83 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDFS-Build/2/artifact/out/patch-mvninstall-hadoop-common-project_hadoop-common.txt
 |
| compile | 
https://builds.apache.org/job/PreCommit-HDFS-Build/2/artifact/out/patch-compile-root.txt
 |
| javac | 

[jira] [Commented] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820631#comment-16820631
 ] 

Hadoop QA commented on HDFS-14426:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
2s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-13891 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
39s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
21s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m 
 3s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
48s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
49s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
44s{color} | {color:green} HDFS-13891 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
58s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 15s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
40s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 21m 
46s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}124m 49s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14426 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12966292/HDFS-14426-HDFS-13891.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 54998db57c41 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-13891 / bd3161e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26664/testReport/ |
| Max. process+thread 

[jira] [Commented] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820620#comment-16820620
 ] 

Hadoop QA commented on HDFS-14245:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
47s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
32s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
16s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
27s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
17m  4s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
45s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  4m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  4m 
16s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 51s{color} | {color:orange} hadoop-hdfs-project: The patch generated 6 new + 
1 unchanged - 0 fixed = 7 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} whitespace {color} | {color:red}  0m  
0s{color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git 
apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply 
{color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 13s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
23s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
4s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}105m  0s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}191m 53s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14245 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12966289/HDFS-14245.000.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux bcddce9e40d7 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Updated] (HDFS-14406) Add per user RPC Processing time

2019-04-17 Thread Xue Liu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xue Liu updated HDFS-14406:
---
Attachment: HDFS-14406.005.patch

> Add per user RPC Processing time
> 
>
> Key: HDFS-14406
> URL: https://issues.apache.org/jira/browse/HDFS-14406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Xue Liu
>Assignee: Xue Liu
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: HDFS-14406.001.patch, HDFS-14406.002.patch, 
> HDFS-14406.003.patch, HDFS-14406.004.patch, HDFS-14406.005.patch
>
>
> For a shared cluster we would want to separate users' resources, as well as 
> having our metrics reflecting on the usage, latency, etc, for each user. 
> This JIRA aims to add per user RPC processing time metrics and expose it via 
> JMX.
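
As a sketch of the accounting such a change could use (all names here are 
hypothetical; the actual patch may well build on Hadoop's metrics2 classes 
instead):

{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch: accumulate RPC processing time per user for JMX. */
public class PerUserRpcProcessingTime {

  private final Map<String, LongAdder> processingNanos = new ConcurrentHashMap<>();

  /** Call after each RPC completes, with the caller's user name. */
  public void record(String user, long elapsedNanos) {
    processingNanos.computeIfAbsent(user, u -> new LongAdder()).add(elapsedNanos);
  }

  /** Snapshot in milliseconds, suitable for an MXBean attribute. */
  public Map<String, Long> snapshotMillis() {
    Map<String, Long> out = new HashMap<>();
    processingNanos.forEach((user, nanos) -> out.put(user, nanos.sum() / 1_000_000));
    return out;
  }
}
{code}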



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12510) RBF: Add security to UI

2019-04-17 Thread CR Hota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

CR Hota updated HDFS-12510:
---
Status: Patch Available  (was: Open)

> RBF: Add security to UI
> ---
>
> Key: HDFS-12510
> URL: https://issues.apache.org/jira/browse/HDFS-12510
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12510-HDFS-13891.001.patch
>
>
> HDFS-12273 implemented the UI for Router Based Federation without security.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820600#comment-16820600
 ] 

Hadoop QA commented on HDFS-14435:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 19m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
23s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 13s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
31s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 
 8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
59s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 52s{color} | {color:orange} hadoop-hdfs-project: The patch generated 3 new + 
34 unchanged - 0 fixed = 37 total (was 34) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m  
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 48s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
3s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}121m 44s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
56s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}200m 48s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.balancer.TestBalancerWithMultipleNameNodes |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14435 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12966281/HDFS-14435.001.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 08399b23ca42 4.4.0-144-generic #170~14.04.1-Ubuntu SMP Mon Mar 
18 15:02:05 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| 

[jira] [Commented] (HDFS-14374) Expose total number of delegation tokens in AbstractDelegationTokenSecretManager

2019-04-17 Thread Fengnan Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820584#comment-16820584
 ] 

Fengnan Li commented on HDFS-14374:
---

[~elgoiri] I synced up with [~crh]. So basically this ticket will expose 
that value inside AbstractDelegationTokenSecretManager, and 
https://issues.apache.org/jira/browse/HDFS-14426 will do the exposing for the 
router; we will also create another ticket for exposing this metric for the 
Namenode.

> Expose total number of delegation tokens in 
> AbstractDelegationTokenSecretManager
> 
>
> Key: HDFS-14374
> URL: https://issues.apache.org/jira/browse/HDFS-14374
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14374.001.patch, HDFS-14374.002.patch
>
>
> AbstractDelegationTokenSecretManager should expose total number of active 
> delegation tokens for specific implementations to track for observability.
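
A minimal sketch of what such a hook could look like (an illustrative 
standalone class, not the actual AbstractDelegationTokenSecretManager code, 
which keeps its tokens in an internal currentTokens map):

{code:java}
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: a secret manager exposing its active token count. */
public class TokenCountingSecretManagerSketch {

  // tokenId -> expiry time; stands in for the real currentTokens map
  private final Map<String, Long> currentTokens = new HashMap<>();

  public synchronized void storeToken(String id, long expiry) {
    currentTokens.put(id, expiry);
  }

  public synchronized void removeToken(String id) {
    currentTokens.remove(id);
  }

  /** The observability hook this ticket asks for. */
  public synchronized long getCurrentTokensCount() {
    return currentTokens.size();
  }
}
{code}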



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-17 Thread Fengnan Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820579#comment-16820579
 ] 

Fengnan Li commented on HDFS-14426:
---

Thanks for the suggestion [~crh]; sorry, I didn't know about HDFS-14374. I 
guess I can make this ticket specifically about exposing the metric inside the 
router, and create another ticket about exposing it inside the Namenode.

> RBF: Add delegation token total count as one of the federation metrics
> --
>
> Key: HDFS-14426
> URL: https://issues.apache.org/jira/browse/HDFS-14426
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14426-HDFS-13891.001.patch, HDFS-14426.001.patch
>
>
> Currently router doesn't report the total number of current valid delegation 
> tokens it has, but this piece of information is useful for monitoring and 
> understanding the real time situation of tokens.
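
A sketch of how the router side could surface that value as a federation 
metric (names hypothetical; the real change would presumably hang off the 
router's existing metrics bean):

{code:java}
import java.util.function.LongSupplier;

/** Hypothetical sketch: a router metrics bean delegating to the secret manager. */
public class RouterTokenMetricsSketch {

  // e.g. secretManager::getCurrentTokensCount from the token-count hook
  private final LongSupplier tokenCount;

  public RouterTokenMetricsSketch(LongSupplier tokenCount) {
    this.tokenCount = tokenCount;
  }

  /** Total number of currently valid delegation tokens held by the router. */
  public long getCurrentTokensCount() {
    return tokenCount.getAsLong();
  }
}
{code}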



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-17 Thread Fengnan Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li updated HDFS-14426:
--
Attachment: HDFS-14426-HDFS-13891.001.patch

> RBF: Add delegation token total count as one of the federation metrics
> --
>
> Key: HDFS-14426
> URL: https://issues.apache.org/jira/browse/HDFS-14426
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14426-HDFS-13891.001.patch, HDFS-14426.001.patch
>
>
> Currently router doesn't report the total number of current valid delegation 
> tokens it has, but this piece of information is useful for monitoring and 
> understanding the real time situation of tokens.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14434) webhdfs that connect secure hdfs should not use user.name parameter

2019-04-17 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820572#comment-16820572
 ] 

Eric Yang commented on HDFS-14434:
--

[~kihwal] makes sense. Thanks.

> webhdfs that connect secure hdfs should not use user.name parameter
> ---
>
> Key: HDFS-14434
> URL: https://issues.apache.org/jira/browse/HDFS-14434
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Minor
> Attachments: HDFS-14434.001.patch
>
>
> I have two secure Hadoop clusters. Both clusters use cross-realm 
> authentication. 
> [use...@a.com|mailto:use...@a.com] can access the HDFS of the B.COM realm;
> however, the Hadoop username of use...@a.com in the B.COM realm is 
> cross_realm_a_com_user_a, and the hdfs dfs command of use...@a.com against
> B.COM's webhdfs failed.
> The root cause is that webhdfs connecting to a secure HDFS sends the
> user.name parameter. According to the webhdfs spec, insecure webhdfs uses
> user.name, while secure webhdfs uses SPNEGO for authentication.
> I think webhdfs connecting to a secure HDFS should not send the user.name
> parameter. I will attach a patch.
> Below is the error log:
>  
> {noformat}
> $ hdfs dfs -ls  webhdfs://b.com:50070/
> ls: Usernames not matched: name=user_a != expected=cross_realm_a_com_user_a
>  
> # user.name in cross realm webhdfs
> $ curl -u : --negotiate 
> 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN&user.name=user_a' 
> {"RemoteException":{"exception":"SecurityException","javaClassName":"java.lang.SecurityException","message":"Failed
>  to obtain user group information: java.io.IOException: Usernames not 
> matched: name=user_a != expected=cross_realm_a_com_user_a"}}
> # USE SPNEGO
> $ curl -u : --negotiate 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN'
> {"Token"{"urlString":"XgA."}}
>  
> {noformat}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-17 Thread Fengnan Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li updated HDFS-14426:
--
Attachment: (was: HDFS-14426-HDFS-13891.001.patch)

> RBF: Add delegation token total count as one of the federation metrics
> --
>
> Key: HDFS-14426
> URL: https://issues.apache.org/jira/browse/HDFS-14426
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14426.001.patch
>
>
> Currently router doesn't report the total number of current valid delegation 
> tokens it has, but this piece of information is useful for monitoring and 
> understanding the real time situation of tokens.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-17 Thread CR Hota (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820565#comment-16820565
 ] 

CR Hota commented on HDFS-14426:


[~fengnanli] 

Thanks for taking this up. It's better to restrict the changes to the router 
project for easier management of changes during the merge. I added the old 
ticket HDFS-14374 as a dependency.

> RBF: Add delegation token total count as one of the federation metrics
> --
>
> Key: HDFS-14426
> URL: https://issues.apache.org/jira/browse/HDFS-14426
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14426-HDFS-13891.001.patch, HDFS-14426.001.patch
>
>
> Currently router doesn't report the total number of current valid delegation 
> tokens it has, but this piece of information is useful for monitoring and 
> understanding the real time situation of tokens.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12510) RBF: Add security to UI

2019-04-17 Thread Íñigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820562#comment-16820562
 ] 

Íñigo Goiri commented on HDFS-12510:


Thanks [~crh], we may want to document that, but it's pretty generic.
[~raviprak], do you think there's anything else we need to cover?

> RBF: Add security to UI
> ---
>
> Key: HDFS-12510
> URL: https://issues.apache.org/jira/browse/HDFS-12510
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12510-HDFS-13891.001.patch
>
>
> HDFS-12273 implemented the UI for Router Based Federation without security.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1382) Create customized CSI server for Ozone

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1382?focusedWorklogId=229483=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229483
 ]

ASF GitHub Bot logged work on HDDS-1382:


Author: ASF GitHub Bot
Created on: 17/Apr/19 22:51
Start Date: 17/Apr/19 22:51
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on issue #693: HDDS-1382. Create 
customized CSI server for Ozone.
URL: https://github.com/apache/hadoop/pull/693#issuecomment-484290500
 
 
   Thanks @elek for working on this, the change LGTM overall, just a few minor 
comments. 
   Please also fix the nightly test run. +1 after that. 
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 229483)
Time Spent: 1h 10m  (was: 1h)

> Create customized CSI server for Ozone
> --
>
> Key: HDDS-1382
> URL: https://issues.apache.org/jira/browse/HDDS-1382
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> CSI (Container Storage Interface) is a vendor neutral storage interface 
> specification for container orchestrators. CSI support is implemented in 
> YARN, Kubernetes and Mesos. Implementing a CSI server makes it easy to mount 
> disk volumes for containers.
> See https://github.com/container-storage-interface/spec for more details 
> about the spec.
> Until now we used https://github.com/CTrox/csi-s3 server to support CSI 
> specification. Using an ozone specific CSI server would have the following 
> advantages:
>  * We can provide additional functionalities (as we have access to the 
> internal Ozone API not just the very generic s3 api).
>  * Security setup can be synchronized.
>  * Increased stability
>  * Simplified deployment (only the minimal set of the components are required 
> to be installed)
> The CSI specification itself is very simple 
> (https://github.com/container-storage-interface/spec/blob/master/csi.proto) 
> at least the part which is required for Ozone.
> We can use various fuse s3 driver to mount the ozone buckets via s3.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1382) Create customized CSI server for Ozone

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1382?focusedWorklogId=229480=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229480
 ]

ASF GitHub Bot logged work on HDDS-1382:


Author: ASF GitHub Bot
Created on: 17/Apr/19 22:49
Start Date: 17/Apr/19 22:49
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #693: HDDS-1382. 
Create customized CSI server for Ozone.
URL: https://github.com/apache/hadoop/pull/693#discussion_r276462239
 
 

 ##
 File path: 
hadoop-ozone/csi/src/main/java/org/apache/hadoop/ozone/csi/NodeService.java
 ##
 @@ -0,0 +1,142 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.csi;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+
+import csi.v1.Csi.NodeGetCapabilitiesRequest;
+import csi.v1.Csi.NodeGetCapabilitiesResponse;
+import csi.v1.Csi.NodeGetInfoRequest;
+import csi.v1.Csi.NodeGetInfoResponse;
+import csi.v1.Csi.NodePublishVolumeRequest;
+import csi.v1.Csi.NodePublishVolumeResponse;
+import csi.v1.Csi.NodeUnpublishVolumeRequest;
+import csi.v1.Csi.NodeUnpublishVolumeResponse;
+import csi.v1.NodeGrpc.NodeImplBase;
+import io.grpc.stub.StreamObserver;
+import org.apache.commons.io.IOUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of the CSI node service.
+ */
+public class NodeService extends NodeImplBase {
+
+  private static final Logger LOG = LoggerFactory.getLogger(NodeService.class);
+
+  private String s3Endpoint;
+
+  public NodeService(OzoneConfiguration configuration) {
+    this.s3Endpoint =
+        configuration.get(CsiConfigurationValues.OZONE_S3_ADDRESS);
+  }
+
+  @Override
+  public void nodePublishVolume(NodePublishVolumeRequest request,
+      StreamObserver<NodePublishVolumeResponse> responseObserver) {
+
+    try {
+      Files.createDirectories(Paths.get(request.getTargetPath()));
+      String mountCommand =
+          String.format("goofys --endpoint %s %s %s",
+              s3Endpoint,
+              request.getVolumeId(),
+              request.getTargetPath());
+      LOG.info("Executing " + mountCommand);
 
 Review comment:
   NIT: LOG.info("Executing {}", mountCommand);
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 229480)
Time Spent: 1h  (was: 50m)

> Create customized CSI server for Ozone
> --
>
> Key: HDDS-1382
> URL: https://issues.apache.org/jira/browse/HDDS-1382
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> CSI (Container Storage Interface) is a vendor neutral storage interface 
> specification for container orchestrators. CSI support is implemented in 
> YARN, Kubernetes and Mesos. Implementing a CSI server makes it easy to mount 
> disk volumes for containers.
> See https://github.com/container-storage-interface/spec for more details 
> about the spec.
> Until now we used https://github.com/CTrox/csi-s3 server to support CSI 
> specification. Using an ozone specific CSI server would have the following 
> advantages:
>  * We can provide additional functionalities (as we have access to the 
> internal Ozone API not just the very generic s3 api).
>  * Security setup can be synchronized.
>  * Increased stability
>  * Simplified deployment (only the minimal set of the components are required 
> to be installed)
> The CSI 

[jira] [Work logged] (HDDS-1382) Create customized CSI server for Ozone

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1382?focusedWorklogId=229479=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229479
 ]

ASF GitHub Bot logged work on HDDS-1382:


Author: ASF GitHub Bot
Created on: 17/Apr/19 22:48
Start Date: 17/Apr/19 22:48
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #693: HDDS-1382. 
Create customized CSI server for Ozone.
URL: https://github.com/apache/hadoop/pull/693#discussion_r276462482
 
 

 ##
 File path: 
hadoop-ozone/csi/src/main/java/org/apache/hadoop/ozone/csi/NodeService.java
 ##
 @@ -0,0 +1,142 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.csi;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+
+import csi.v1.Csi.NodeGetCapabilitiesRequest;
+import csi.v1.Csi.NodeGetCapabilitiesResponse;
+import csi.v1.Csi.NodeGetInfoRequest;
+import csi.v1.Csi.NodeGetInfoResponse;
+import csi.v1.Csi.NodePublishVolumeRequest;
+import csi.v1.Csi.NodePublishVolumeResponse;
+import csi.v1.Csi.NodeUnpublishVolumeRequest;
+import csi.v1.Csi.NodeUnpublishVolumeResponse;
+import csi.v1.NodeGrpc.NodeImplBase;
+import io.grpc.stub.StreamObserver;
+import org.apache.commons.io.IOUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of the CSI node service.
+ */
+public class NodeService extends NodeImplBase {
+
+  private static final Logger LOG = LoggerFactory.getLogger(NodeService.class);
+
+  private String s3Endpoint;
+
+  public NodeService(OzoneConfiguration configuration) {
+    this.s3Endpoint =
+        configuration.get(CsiConfigurationValues.OZONE_S3_ADDRESS);
+  }
+
+  @Override
+  public void nodePublishVolume(NodePublishVolumeRequest request,
+      StreamObserver<NodePublishVolumeResponse> responseObserver) {
+
+    try {
+      Files.createDirectories(Paths.get(request.getTargetPath()));
+      String mountCommand =
+          String.format("goofys --endpoint %s %s %s",
+              s3Endpoint,
+              request.getVolumeId(),
+              request.getTargetPath());
+      LOG.info("Executing " + mountCommand);
+
+      executeCommand(mountCommand);
+
+      responseObserver.onNext(NodePublishVolumeResponse.newBuilder()
+          .build());
+      responseObserver.onCompleted();
+
+    } catch (Exception e) {
+      responseObserver.onError(e);
+    }
+
+  }
+
+  private void executeCommand(String mountCommand)
+      throws IOException, InterruptedException {
+    Process exec = Runtime.getRuntime().exec(mountCommand);
+    exec.waitFor(10, TimeUnit.SECONDS);
+
+    LOG.info("Command is executed with stdout: {}, stderr: {}",
+        IOUtils.toString(exec.getInputStream(), "UTF-8"),
+        IOUtils.toString(exec.getErrorStream(), "UTF-8"));
+    if (exec.exitValue() != 0) {
+      throw new RuntimeException(String
+          .format("Return code of the command %s was %d", mountCommand,
+              exec.exitValue()));
+    }
+  }
+
+  @Override
+  public void nodeUnpublishVolume(NodeUnpublishVolumeRequest request,
+      StreamObserver<NodeUnpublishVolumeResponse> responseObserver) {
+    String umountCommand =
+        String.format("fusermount -u %s", request.getTargetPath());
+    LOG.info("Executing " + umountCommand);
 
 Review comment:
   NIT: LOG.info("Executing {}", umountCommand);
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 229479)
Time Spent: 50m  (was: 40m)

> Create customized CSI server for Ozone
> --
>
> Key: HDDS-1382
> URL: https://issues.apache.org/jira/browse/HDDS-1382
> 

[jira] [Work logged] (HDDS-1382) Create customized CSI server for Ozone

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1382?focusedWorklogId=229477=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229477
 ]

ASF GitHub Bot logged work on HDDS-1382:


Author: ASF GitHub Bot
Created on: 17/Apr/19 22:47
Start Date: 17/Apr/19 22:47
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #693: HDDS-1382. 
Create customized CSI server for Ozone.
URL: https://github.com/apache/hadoop/pull/693#discussion_r276462239
 
 

 ##
 File path: 
hadoop-ozone/csi/src/main/java/org/apache/hadoop/ozone/csi/NodeService.java
 ##
 @@ -0,0 +1,142 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.csi;
+
+import java.io.IOException;
+import java.net.InetAddress;
+import java.net.UnknownHostException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+import java.util.concurrent.TimeUnit;
+
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+
+import csi.v1.Csi.NodeGetCapabilitiesRequest;
+import csi.v1.Csi.NodeGetCapabilitiesResponse;
+import csi.v1.Csi.NodeGetInfoRequest;
+import csi.v1.Csi.NodeGetInfoResponse;
+import csi.v1.Csi.NodePublishVolumeRequest;
+import csi.v1.Csi.NodePublishVolumeResponse;
+import csi.v1.Csi.NodeUnpublishVolumeRequest;
+import csi.v1.Csi.NodeUnpublishVolumeResponse;
+import csi.v1.NodeGrpc.NodeImplBase;
+import io.grpc.stub.StreamObserver;
+import org.apache.commons.io.IOUtils;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Implementation of the CSI node service.
+ */
+public class NodeService extends NodeImplBase {
+
+  private static final Logger LOG = LoggerFactory.getLogger(NodeService.class);
+
+  private String s3Endpoint;
+
+  public NodeService(OzoneConfiguration configuration) {
+    this.s3Endpoint =
+        configuration.get(CsiConfigurationValues.OZONE_S3_ADDRESS);
+  }
+
+  @Override
+  public void nodePublishVolume(NodePublishVolumeRequest request,
+      StreamObserver<NodePublishVolumeResponse> responseObserver) {
+
+    try {
+      Files.createDirectories(Paths.get(request.getTargetPath()));
+      String mountCommand =
+          String.format("goofys --endpoint %s %s %s",
+              s3Endpoint,
+              request.getVolumeId(),
+              request.getTargetPath());
+      LOG.info("Executing " + mountCommand);
 
 Review comment:
   NIT: LOG.info("Executting {}", mountCommand);
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 229477)
Time Spent: 40m  (was: 0.5h)

> Create customized CSI server for Ozone
> --
>
> Key: HDDS-1382
> URL: https://issues.apache.org/jira/browse/HDDS-1382
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> CSI (Container Storage Interface) is a vendor neutral storage interface 
> specification for container orchestrators. CSI support is implemented in 
> YARN, Kubernetes and Mesos. Implementing a CSI server makes it easy to mount 
> disk volumes for containers.
> See https://github.com/container-storage-interface/spec for more details 
> about the spec.
> Until now we used https://github.com/CTrox/csi-s3 server to support CSI 
> specification. Using an ozone specific CSI server would have the following 
> advantages:
>  * We can provide additional functionalities (as we have access to the 
> internal Ozone API not just the very generic s3 api).
>  * Security setup can be synchronized.
>  * Increased stability
>  * Simplified deployment (only the minimal set of the components are required 
> to be installed)
> The CSI 

[jira] [Work logged] (HDDS-1382) Create customized CSI server for Ozone

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1382?focusedWorklogId=229475=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229475
 ]

ASF GitHub Bot logged work on HDDS-1382:


Author: ASF GitHub Bot
Created on: 17/Apr/19 22:46
Start Date: 17/Apr/19 22:46
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #693: HDDS-1382. 
Create customized CSI server for Ozone.
URL: https://github.com/apache/hadoop/pull/693#discussion_r276461856
 
 

 ##
 File path: 
hadoop-ozone/csi/src/main/java/org/apache/hadoop/ozone/csi/ControllerService.java
 ##
 @@ -0,0 +1,115 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.csi;
+
+import java.io.IOException;
+
+import org.apache.hadoop.ozone.client.OzoneClient;
+
+import csi.v1.ControllerGrpc.ControllerImplBase;
+import csi.v1.Csi.CapacityRange;
+import csi.v1.Csi.ControllerGetCapabilitiesRequest;
+import csi.v1.Csi.ControllerGetCapabilitiesResponse;
+import csi.v1.Csi.ControllerServiceCapability;
+import csi.v1.Csi.ControllerServiceCapability.RPC;
+import csi.v1.Csi.ControllerServiceCapability.RPC.Type;
+import csi.v1.Csi.CreateVolumeRequest;
+import csi.v1.Csi.CreateVolumeResponse;
+import csi.v1.Csi.DeleteVolumeRequest;
+import csi.v1.Csi.DeleteVolumeResponse;
+import csi.v1.Csi.Volume;
+import io.grpc.stub.StreamObserver;
+
+/**
+ * CSI controller service.
+ * 
+ * This service usually runs only once and responsible for the creation of
+ * the volume.
+ */
+public class ControllerService extends ControllerImplBase {
+
+  private OzoneClient ozoneClient;
+
+  public ControllerService(OzoneClient ozoneClient) {
+    this.ozoneClient = ozoneClient;
+  }
+
+  @Override
+  public void createVolume(CreateVolumeRequest request,
+      StreamObserver<CreateVolumeResponse> responseObserver) {
+    try {
+      ozoneClient.getObjectStore().createS3Bucket("hadoop", request.getName());
+
+      long size = findSize(request.getCapacityRange());
+
+      CreateVolumeResponse response = CreateVolumeResponse.newBuilder()
+          .setVolume(Volume.newBuilder()
+              .setVolumeId(request.getName())
+              .setCapacityBytes(size))
+          .build();
+
+      responseObserver.onNext(response);
+      responseObserver.onCompleted();
+    } catch (IOException e) {
+      responseObserver.onError(e);
+    }
+  }
+
+  private long findSize(CapacityRange capacityRange) {
+    if (capacityRange.getRequiredBytes() != 0) {
+      return capacityRange.getRequiredBytes();
+    } else {
+      if (capacityRange.getLimitBytes() != 0) {
+        return Math.min(1000_000_000, capacityRange.getLimitBytes());
 
 Review comment:
   Can we make 1000_000_000 configurable? 
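
One way the suggestion could look (the config key and default below are 
illustrative, not actual Ozone constants):

{code:java}
/** Hypothetical sketch: read the default volume size from configuration. */
public final class CsiVolumeSizeSketch {

  static final String DEFAULT_VOLUME_SIZE_KEY = "ozone.csi.default-volume-size";
  static final long DEFAULT_VOLUME_SIZE = 1_000_000_000L;

  /** findSize with the hard-coded constant replaced by a configured default. */
  static long findSize(long requiredBytes, long limitBytes, long configuredDefault) {
    if (requiredBytes != 0) {
      return requiredBytes;
    }
    if (limitBytes != 0) {
      return Math.min(configuredDefault, limitBytes);
    }
    return configuredDefault;
  }
}
{code}

The daemon would read DEFAULT_VOLUME_SIZE_KEY once from OzoneConfiguration and 
pass the value through, keeping 1_000_000_000 only as the fallback.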
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 229475)
Time Spent: 0.5h  (was: 20m)

> Create customized CSI server for Ozone
> --
>
> Key: HDDS-1382
> URL: https://issues.apache.org/jira/browse/HDDS-1382
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> CSI (Container Storage Interface) is a vendor neutral storage interface 
> specification for container orchestrators. CSI support is implemented in 
> YARN, Kubernetes and Mesos. Implementing a CSI server makes it easy to mount 
> disk volumes for containers.
> See https://github.com/container-storage-interface/spec for more details 
> about the spec.
> Until now we used https://github.com/CTrox/csi-s3 server to support CSI 
> specification. Using an ozone specific CSI server would have the 

[jira] [Work logged] (HDDS-1382) Create customized CSI server for Ozone

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1382?focusedWorklogId=229472=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229472
 ]

ASF GitHub Bot logged work on HDDS-1382:


Author: ASF GitHub Bot
Created on: 17/Apr/19 22:44
Start Date: 17/Apr/19 22:44
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #693: HDDS-1382. 
Create customized CSI server for Ozone.
URL: https://github.com/apache/hadoop/pull/693#discussion_r276461358
 
 

 ##
 File path: 
hadoop-ozone/csi/src/main/java/org/apache/hadoop/ozone/csi/CsiServer.java
 ##
 @@ -0,0 +1,71 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.ozone.csi;
+
+import java.util.concurrent.Callable;
+
+import org.apache.hadoop.hdds.cli.GenericCli;
+import org.apache.hadoop.hdds.cli.HddsVersionProvider;
+import org.apache.hadoop.hdds.conf.OzoneConfiguration;
+import org.apache.hadoop.ozone.client.OzoneClient;
+import org.apache.hadoop.ozone.client.OzoneClientFactory;
+
+import io.grpc.Server;
+import io.grpc.netty.NettyServerBuilder;
+import io.netty.channel.epoll.EpollEventLoopGroup;
+import io.netty.channel.epoll.EpollServerDomainSocketChannel;
+import io.netty.channel.unix.DomainSocketAddress;
+import picocli.CommandLine.Command;
+
+/**
+ * CLI entrypoint of the CSI service daemon.
+ */
+@Command(name = "ozone csi",
+hidden = true, description = "CSI service daemon.",
+versionProvider = HddsVersionProvider.class,
+mixinStandardHelpOptions = true)
+public class CsiServer extends GenericCli implements Callable<Void> {
+
+  @Override
+  public Void call() throws Exception {
+    OzoneConfiguration ozoneConfiguration = createOzoneConfiguration();
+    String unixSocket = ozoneConfiguration
+        .get(CsiConfigurationValues.OZONE_CSI_SOCKET,
+            CsiConfigurationValues.OZONE_CSI_SOCKET_DEFAULT);
+    OzoneClient rpcClient =
+        OzoneClientFactory.getRpcClient(ozoneConfiguration);
+
+    EpollEventLoopGroup group = new EpollEventLoopGroup();
+    Server server =
+        NettyServerBuilder.forAddress(new DomainSocketAddress(unixSocket))
+            .channelType(EpollServerDomainSocketChannel.class)
+            .workerEventLoopGroup(group)
+            .bossEventLoopGroup(group)
+            .addService(new IdentitiyService())
+            .addService(new ControllerService(rpcClient))
+            .addService(new NodeService(ozoneConfiguration))
+            .build();
+
+    server.start();
+    server.awaitTermination();
+    return null;
 
 Review comment:
   Should we call rpcClient.close() before exit?
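
 One way to address that (a minimal sketch reusing the names from the diff above; the {{shutdownGracefully}} cleanup is illustrative, not part of the PR):

{code:java}
// Sketch only: close the Ozone client and stop the Netty event loop once
// the gRPC server terminates, so the daemon releases RPC resources on exit.
try {
  server.start();
  server.awaitTermination();
} finally {
  rpcClient.close();           // OzoneClient is Closeable
  group.shutdownGracefully();  // stop the epoll worker threads
}
return null;
{code}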
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 229472)
Time Spent: 20m  (was: 10m)

> Create customized CSI server for Ozone
> --
>
> Key: HDDS-1382
> URL: https://issues.apache.org/jira/browse/HDDS-1382
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Elek, Marton
>Assignee: Elek, Marton
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> CSI (Container Storage Interface) is a vendor neutral storage interface 
> specification for container orchestrators. CSI support is implemented in 
> YARN, Kubernetes and Mesos. Implementing a CSI server makes it easy to mount 
> disk volumes for containers.
> See https://github.com/container-storage-interface/spec for more details 
> about the spec.
> Until now we used https://github.com/CTrox/csi-s3 server to support CSI 
> specification. Using an ozone specific CSI server would have the following 
> advantages:
>  * We can provide additional functionalities (as we have access to the 
> internal Ozone API 

[jira] [Commented] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820554#comment-16820554
 ] 

Hadoop QA commented on HDFS-14435:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
 1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 14s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
10s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
15s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 45s{color} | {color:orange} hadoop-hdfs-project: The patch generated 3 new + 
34 unchanged - 0 fixed = 37 total (was 34) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 49s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
24s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m  
0s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 93m 45s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
36s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}168m 56s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.server.datanode.TestBPOfferService |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.namenode.TestNamenodeCapacityReport |
|   | hadoop.hdfs.TestReconstructStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14435 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12966279/HDFS-14435.000.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 

[jira] [Commented] (HDFS-12510) RBF: Add security to UI

2019-04-17 Thread CR Hota (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820548#comment-16820548
 ] 

CR Hota commented on HDFS-12510:


[~elgoiri] Thanks for taking a look.

I'm not a UI expert :(

But from what I analyzed in the current code, users can specify whether CORS 
filters should be loaded during initialization of the NameNode HttpServer, 
which the Router also uses for its own HTTP server initialization. So if 
needed, hadoop.http.filter.initializers can be set in core-site.xml and the 
Router will pick it up automatically. Through this filter, users can also 
specify the cross-origin sub-properties.
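
For reference, a minimal core-site.xml sketch of that setup (these are the 
stock Hadoop cross-origin filter properties; the values are illustrative):
{code:xml}
<!-- Illustrative only: loads the built-in CORS filter, which the Router's
     HttpServer picks up through the shared filter-initializer mechanism. -->
<property>
  <name>hadoop.http.filter.initializers</name>
  <value>org.apache.hadoop.security.HttpCrossOriginFilterInitializer</value>
</property>
<property>
  <name>hadoop.http.cross-origin.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hadoop.http.cross-origin.allowed-origins</name>
  <value>*</value>
</property>
{code}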

 

> RBF: Add security to UI
> ---
>
> Key: HDFS-12510
> URL: https://issues.apache.org/jira/browse/HDFS-12510
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12510-HDFS-13891.001.patch
>
>
> HDFS-12273 implemented the UI for Router Based Federation without security.






[jira] [Comment Edited] (HDFS-14406) Add per user RPC Processing time

2019-04-17 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820537#comment-16820537
 ] 

Erik Krogen edited comment on HDFS-14406 at 4/17/19 10:03 PM:
--

Thanks for the ping, [~elgoiri]. The overall idea LGTM now, but I do have some 
comments about the implementation:
* Creating a new {{rpcuser}} metrics context seems potentially overkill. This 
is still providing details about RPC calls, so it seems to me that it can still 
live under the {{rpcdetailed}} context with the sub-name "rpcuser".
* {{RpcUserMetrics}} has an {{init()}} method that initializes it with the 
methods of a {{protocol}}. This is completely meaningless in this context, 
since {{RpcUserMetrics}} does not store metrics about method names. This should 
be removed.
* {{RpcUserMetrics}} and {{RpcDetailedMetrics}} are nearly identical, can we 
have a common base class which they both inherit from, or even just let 
{{RpcUserMetrics extends RpcDetailedMetrics}}? The only differences are the 
init step, and the name which is passed to the registry.
* In the Javadoc for {{IPC_SERVER_USER_METRICS_DEFAULT}}, can we use a 
{{@link}} tag to refer to the key?
* In the description within {{core-site.xml}}, the sentence has a typo:
{quote}
Set the whether we enable per user rpc processing time metrics on namenode rpc 
server.
{quote}
should be
{quote}
Set whether we enable per user rpc processing time metrics on NameNode rpc 
server.
{quote}
(An extra "the", and also IMO NameNode should be capitalized)
* Within {{Metrics.md}}, I think the descriptions could use updating:
{quote}
| *username*`NumOps` | Total number of the times RPC with username is called |
| *username*`AvgTime` | Average turn around time of the user RPC in 
milliseconds |
{quote}
What does "turn around time" even mean? I would prefer:
{quote}
| *username*`NumOps` | Total number of times RPCs were called by username |
| *username*`AvgTime` | Average processing time of the user's RPCs in 
milliseconds |
{quote}
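
A minimal sketch of that inheritance idea (constructor and registry details 
are assumed here, not taken from the patch):
{code:java}
// Hypothetical sketch: reuse RpcDetailedMetrics, changing only the registry
// sub-name and the init step, which is meaningless for per-user metrics.
public class RpcUserMetrics extends RpcDetailedMetrics {

  RpcUserMetrics(int port) {
    super(port); // assumed constructor; registry sub-name would be "rpcuser"
  }

  @Override
  public void init(Class<?> protocol) {
    // No-op: per-user metrics are keyed by username, not by method name.
  }
}
{code}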


was (Author: xkrogen):
Thanks for the ping, [~elgoiri]. I do have some comments:
* Creating a new {{rpcuser}} metrics context seems potentially overkill. This 
is still providing details about RPC calls, so it seems to me that it can still 
live under the {{rpcdetailed}} context with the sub-name "rpcuser".
* {{RpcUserMetrics}} has an {{init()}} method that initializes it with the 
methods of a {{protocol}}. This is completely meaningless in this context, 
since {{RpcUserMetrics}} does not store metrics about method names. This should 
be removed.
* {{RpcUserMetrics}} and {{RpcDetailedMetrics}} are nearly identical, can we 
have a common base class which they both inherit from, or even just let 
{{RpcUserMetrics extends RpcDetailedMetrics}}? The only differences are the 
init step, and the name which is passed to the registry.
* In the Javadoc for {{IPC_SERVER_USER_METRICS_DEFAULT}}, can we use a 
{{@link}} tag to refer to the key?
* In the description within {{core-site.xml}}, the sentence has a typo:
{quote}
Set the whether we enable per user rpc processing time metrics on namenode rpc 
server.
{quote}
should be
{quote}
Set whether we enable per user rpc processing time metrics on NameNode rpc 
server.
{quote}
(An extra "the", and also IMO NameNode should be capitalized)
* Within {{Metrics.md}}, I think the descriptions could use updating:
{quote}
| *username*`NumOps` | Total number of the times RPC with username is called |
| *username*`AvgTime` | Average turn around time of the user RPC in 
milliseconds |
{quote}
What does "turn around time" even mean? I would prefer:
{quote}
| *username*`NumOps` | Total number of times RPCs were called by username |
| *username*`AvgTime` | Average processing time of the user's RPCs in 
milliseconds |
{quote}

> Add per user RPC Processing time
> 
>
> Key: HDFS-14406
> URL: https://issues.apache.org/jira/browse/HDFS-14406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Xue Liu
>Assignee: Xue Liu
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: HDFS-14406.001.patch, HDFS-14406.002.patch, 
> HDFS-14406.003.patch, HDFS-14406.004.patch
>
>
> For a shared cluster we would want to separate users' resources, as well as 
> have our metrics reflect the usage, latency, etc., for each user. 
> This JIRA aims to add per-user RPC processing time metrics and expose them 
> via JMX.






[jira] [Commented] (HDFS-14406) Add per user RPC Processing time

2019-04-17 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820537#comment-16820537
 ] 

Erik Krogen commented on HDFS-14406:


Thanks for the ping, [~elgoiri]. I do have some comments:
* Creating a new {{rpcuser}} metrics context seems potentially overkill. This 
is still providing details about RPC calls, so it seems to me that it can still 
live under the {{rpcdetailed}} context with the sub-name "rpcuser".
* {{RpcUserMetrics}} has an {{init()}} method that initializes it with the 
methods of a {{protocol}}. This is completely meaningless in this context, 
since {{RpcUserMetrics}} does not store metrics about method names. This should 
be removed.
* {{RpcUserMetrics}} and {{RpcDetailedMetrics}} are nearly identical, can we 
have a common base class which they both inherit from, or even just let 
{{RpcUserMetrics extends RpcDetailedMetrics}}? The only differences are the 
init step, and the name which is passed to the registry.
* In the Javadoc for {{IPC_SERVER_USER_METRICS_DEFAULT}}, can we use a 
{{@link}} tag to refer to the key?
* In the description within {{core-site.xml}}, the sentence has a typo:
{quote}
Set the whether we enable per user rpc processing time metrics on namenode rpc 
server.
{quote}
should be
{quote}
Set whether we enable per user rpc processing time metrics on NameNode rpc 
server.
{quote}
(An extra "the", and also IMO NameNode should be capitalized)
* Within {{Metrics.md}}, I think the descriptions could use updating:
{quote}
| *username*`NumOps` | Total number of the times RPC with username is called |
| *username*`AvgTime` | Average turn around time of the user RPC in 
milliseconds |
{quote}
What does "turn around time" even mean? I would prefer:
{quote}
| *username*`NumOps` | Total number of times RPCs were called by username |
| *username*`AvgTime` | Average processing time of the user's RPCs in 
milliseconds |
{quote}

> Add per user RPC Processing time
> 
>
> Key: HDFS-14406
> URL: https://issues.apache.org/jira/browse/HDFS-14406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Xue Liu
>Assignee: Xue Liu
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: HDFS-14406.001.patch, HDFS-14406.002.patch, 
> HDFS-14406.003.patch, HDFS-14406.004.patch
>
>
> For a shared cluster we would want to separate users' resources, as well as 
> have our metrics reflect the usage, latency, etc., for each user. 
> This JIRA aims to add per-user RPC processing time metrics and expose them 
> via JMX.






[jira] [Commented] (HDFS-14374) Expose total number of delegation tokens in AbstractDelegationTokenSecretManager

2019-04-17 Thread Íñigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820536#comment-16820536
 ] 

Íñigo Goiri commented on HDFS-14374:


I think [~hexiaoqiao] refers to JMX.
As [~crh] pointed out, the Router will expose this through FederationMetrics.
I don't think the manager has a JMX itself, right?
We could expose this for the NN too.

> Expose total number of delegation tokens in 
> AbstractDelegationTokenSecretManager
> 
>
> Key: HDFS-14374
> URL: https://issues.apache.org/jira/browse/HDFS-14374
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14374.001.patch, HDFS-14374.002.patch
>
>
> AbstractDelegationTokenSecretManager should expose the total number of 
> active delegation tokens so that specific implementations can track it for 
> observability.






[jira] [Commented] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-17 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820526#comment-16820526
 ] 

Erik Krogen commented on HDFS-14245:


Just attached my v000 patch. Essentially, if the ORPP is not using a 
{{ClientProtocol}}, it turns {{observerReadEnabled}} to false, which will avoid 
all of the code that requires the use of a {{ClientProtocol}}. While making 
this change, I noticed one thing that I think may be a bug in the 
implementation of HDFS-14160, which is that I believe 
{{ObserverReadProxyProvider.ObserverReadInvocationHandler#getConnectionId()}} 
should probably always be returning the connection ID of the {{failoverProxy}}, 
as opposed to the observer proxy. I'm verifying and if so, will file a separate 
JIRA for this.
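
Roughly, the guard is (a sketch of the approach just described, not the exact 
patch):
{code:java}
// Sketch: if the proxied interface is not ClientProtocol, turn off observer
// reads so none of the ClientProtocol-only code paths are exercised.
if (!ClientProtocol.class.isAssignableFrom(xface)) {
  this.observerReadEnabled = false;
}
{code}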

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14245.000.patch, HDFS-14245.patch
>
>
> Run "hdfs groups" with ObserverReadProxyProvider, Exception throws as :
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> Similar to HDFS-14116, we did a simple fix.






[jira] [Updated] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-17 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14245:
---
Attachment: HDFS-14245.000.patch

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14245.000.patch, HDFS-14245.patch
>
>
> Run "hdfs groups" with ObserverReadProxyProvider, Exception throws as :
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> Similar to HDFS-14116, we did a simple fix.






[jira] [Updated] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-17 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14245:
---
Status: Patch Available  (was: In Progress)

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14245.000.patch, HDFS-14245.patch
>
>
> Run "hdfs groups" with ObserverReadProxyProvider, Exception throws as :
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> Similar to HDFS-14116, we did a simple fix.






[jira] [Commented] (HDFS-14406) Add per user RPC Processing time

2019-04-17 Thread Íñigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820522#comment-16820522
 ] 

Íñigo Goiri commented on HDFS-14406:


[^HDFS-14406.004.patch] LGTM.
+1
I'll give others ([~linyiqun], [~xkrogen], [~daryn]) a couple of days to add 
more comments if they have any.

> Add per user RPC Processing time
> 
>
> Key: HDFS-14406
> URL: https://issues.apache.org/jira/browse/HDFS-14406
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Affects Versions: 3.2.0
>Reporter: Xue Liu
>Assignee: Xue Liu
>Priority: Minor
> Fix For: 3.2.0
>
> Attachments: HDFS-14406.001.patch, HDFS-14406.002.patch, 
> HDFS-14406.003.patch, HDFS-14406.004.patch
>
>
> For a shared cluster we would want to separate users' resources, as well as 
> have our metrics reflect the usage, latency, etc., for each user. 
> This JIRA aims to add per-user RPC processing time metrics and expose them 
> via JMX.






[jira] [Commented] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820512#comment-16820512
 ] 

Hadoop QA commented on HDFS-14245:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
37s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 23m 
43s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 16s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 36s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 2 new + 0 unchanged - 1 fixed = 2 total (was 1) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 47s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}118m 40s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
46s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}182m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.sps.TestStoragePolicySatisfierWithStripedFile |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14245 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12956995/HDFS-14245.patch |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 21b3b91f5949 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 685cb83 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26660/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26660/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test 

[jira] [Commented] (HDFS-14374) Expose total number of delegation tokens in AbstractDelegationTokenSecretManager

2019-04-17 Thread CR Hota (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820499#comment-16820499
 ] 

CR Hota commented on HDFS-14374:


[~hexiaoqiao] Thanks for the review.

Not sure what you meant by one metric. It is the size exposed by the holder of 
the data structure, which is AbstractDelegationTokenSecretManager.

Once this is checked in, individual applications such as the ResourceManager, 
NameNode, Routers, etc. can get hold of it via this getter and populate their 
respective metrics.

For example: in the Router, this can be reported through FederationMetrics by 
querying the security manager.

[~elgoiri] [~brahmareddy] 

I think we should target this as a minimum requirement for router security. 
Otherwise, secured installations will find it difficult to monitor the overall 
delegation token count across an entire router cluster.
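
A rough sketch of the getter side (the exact method name in the patch may 
differ):
{code:java}
// The secret manager already owns the token map, so exposing its size is a
// one-line getter; NN, Router, RM, etc. can wire it into their own metrics.
public int getCurrentTokensSize() {
  return currentTokens.size();
}
{code}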

> Expose total number of delegation tokens in 
> AbstractDelegationTokenSecretManager
> 
>
> Key: HDFS-14374
> URL: https://issues.apache.org/jira/browse/HDFS-14374
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: CR Hota
>Assignee: CR Hota
>Priority: Major
> Attachments: HDFS-14374.001.patch, HDFS-14374.002.patch
>
>
> AbstractDelegationTokenSecretManager should expose the total number of 
> active delegation tokens so that specific implementations can track it for 
> observability.






[jira] [Commented] (HDFS-12510) RBF: Add security to UI

2019-04-17 Thread Íñigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820495#comment-16820495
 ] 

Íñigo Goiri commented on HDFS-12510:


Thanks [~crh] for the update.
What about ACCESS_CONTROL_ALLOW_ORIGIN and related settings?
Anything we should add for CORS?

> RBF: Add security to UI
> ---
>
> Key: HDFS-12510
> URL: https://issues.apache.org/jira/browse/HDFS-12510
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12510-HDFS-13891.001.patch
>
>
> HDFS-12273 implemented the UI for Router Based Federation without security.






[jira] [Commented] (HDFS-12510) RBF: Add security to UI

2019-04-17 Thread CR Hota (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820493#comment-16820493
 ] 

CR Hota commented on HDFS-12510:


[~brahmareddy]  [~elgoiri] 

Uploaded the simple patch for the UI. This takes care of displaying the 
security status on the router landing page. The overall UI needs a bunch of 
generic clean-ups and refactoring: for example, "routerstat" pointing to 
"namenodestatus" in federationhealth.js is quite misleading, and things like 
safemode will get incorrectly reported because of it.

Will create separate JIRAs once the security piece is completed here.
> RBF: Add security to UI
> ---
>
> Key: HDFS-12510
> URL: https://issues.apache.org/jira/browse/HDFS-12510
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12510-HDFS-13891.001.patch
>
>
> HDFS-12273 implemented the UI for Router Based Federation without security.






[jira] [Assigned] (HDFS-12510) RBF: Add security to UI

2019-04-17 Thread CR Hota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

CR Hota reassigned HDFS-12510:
--

Assignee: CR Hota

> RBF: Add security to UI
> ---
>
> Key: HDFS-12510
> URL: https://issues.apache.org/jira/browse/HDFS-12510
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: CR Hota
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12510-HDFS-13891.001.patch
>
>
> HDFS-12273 implemented the UI for Router Based Federation without security.






[jira] [Updated] (HDFS-12510) RBF: Add security to UI

2019-04-17 Thread CR Hota (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

CR Hota updated HDFS-12510:
---
Attachment: HDFS-12510-HDFS-13891.001.patch

> RBF: Add security to UI
> ---
>
> Key: HDFS-12510
> URL: https://issues.apache.org/jira/browse/HDFS-12510
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12510-HDFS-13891.001.patch
>
>
> HDFS-12273 implemented the UI for Router Based Federation without security.






[jira] [Commented] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs

2019-04-17 Thread Íñigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820487#comment-16820487
 ] 

Íñigo Goiri commented on HDFS-14435:


The code looks a little convoluted.
What about something like:
{code}
  private HAServiceState getHAServiceState(NNProxyInfo<T> proxyInfo) {
    Exception e;
    try {
      return proxyInfo.proxy.getHAServiceState();
    } catch (RemoteException re) {
      // Though a Standby will allow a getHAServiceState call, it won't allow
      // delegation token lookup, so if DT is used it throws StandbyException
      IOException ioe = re.unwrapRemoteException(StandbyException.class);
      if (ioe instanceof StandbyException) {
        LOG.debug("NameNode {} threw StandbyException when fetching HAState",
            proxyInfo.getAddress());
        return HAServiceState.STANDBY;
      }
      e = re;
    } catch (IOException ioe) {
      e = ioe;
    }
    LOG.info("Failed to connect to {}. Assuming Standby state",
        proxyInfo.getAddress(), e);
    return HAServiceState.STANDBY;
  }
{code}

> ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs
> --
>
> Key: HDFS-14435
> URL: https://issues.apache.org/jira/browse/HDFS-14435
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, nn
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14435.000.patch, HDFS-14435.001.patch
>
>
> We have been seeing issues during testing of the Consistent Read from Standby 
> feature that indicate that ORPP is unable to call {{getHAServiceState}} on 
> Standby NNs, as they are rejected with a {{StandbyException}}. Upon further 
> investigation, we realized that although the Standby allows the 
> {{getHAServiceState()}} call, reading a delegation token is not allowed in 
> Standby state, thus the call will fail when using DT-based authentication. 
> This hasn't caused issues in practice, since ORPP assumes that the state is 
> Standby if it is unable to fetch the state, but we should fix the logic to 
> properly handle this scenario.






[jira] [Commented] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-17 Thread Íñigo Goiri (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820482#comment-16820482
 ] 

Íñigo Goiri commented on HDFS-14426:


Thanks [~fengnanli] for the update.
Can you fix the checkstyle issue?

[~crh], do you mind taking a look?
Anything else that should be covered?

> RBF: Add delegation token total count as one of the federation metrics
> --
>
> Key: HDFS-14426
> URL: https://issues.apache.org/jira/browse/HDFS-14426
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14426-HDFS-13891.001.patch, HDFS-14426.001.patch
>
>
> Currently the Router doesn't report the total number of currently valid 
> delegation tokens it holds, but this information is useful for monitoring 
> and understanding the real-time situation of tokens.






[jira] [Assigned] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-17 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen reassigned HDFS-14245:
--

Assignee: Erik Krogen  (was: Shen Yinjie)

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14245.patch
>
>
> Run "hdfs groups" with ObserverReadProxyProvider, Exception throws as :
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> Similar to HDFS-14116, we did a simple fix.






[jira] [Work started] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-17 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-14245 started by Erik Krogen.
--
> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14245.patch
>
>
> Run "hdfs groups" with ObserverReadProxyProvider, Exception throws as :
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> Similar to HDFS-14116, we did a simple fix.






[jira] [Updated] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-17 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14245:
---
Status: Open  (was: Patch Available)

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14245.patch
>
>
> Run "hdfs groups" with ObserverReadProxyProvider, Exception throws as :
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> Similar to HDFS-14116, we did a simple fix.






[jira] [Updated] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs

2019-04-17 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14435:
---
Attachment: HDFS-14435.001.patch

> ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs
> --
>
> Key: HDFS-14435
> URL: https://issues.apache.org/jira/browse/HDFS-14435
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, nn
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14435.000.patch, HDFS-14435.001.patch
>
>
> We have been seeing issues during testing of the Consistent Read from Standby 
> feature that indicate that ORPP is unable to call {{getHAServiceState}} on 
> Standby NNs, as they are rejected with a {{StandbyException}}. Upon further 
> investigation, we realized that although the Standby allows the 
> {{getHAServiceState()}} call, reading a delegation token is not allowed in 
> Standby state, thus the call will fail when using DT-based authentication. 
> This hasn't caused issues in practice, since ORPP assumes that the state is 
> Standby if it is unable to fetch the state, but we should fix the logic to 
> properly handle this scenario.






[jira] [Commented] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs

2019-04-17 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820472#comment-16820472
 ] 

Erik Krogen commented on HDFS-14435:


Thanks [~vagarychen]... That was a bad rebase on my part. Fixed in v001.

> ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs
> --
>
> Key: HDFS-14435
> URL: https://issues.apache.org/jira/browse/HDFS-14435
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, nn
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14435.000.patch, HDFS-14435.001.patch
>
>
> We have been seeing issues during testing of the Consistent Read from Standby 
> feature that indicate that ORPP is unable to call {{getHAServiceState}} on 
> Standby NNs, as they are rejected with a {{StandbyException}}. Upon further 
> investigation, we realized that although the Standby allows the 
> {{getHAServiceState()}} call, reading a delegation token is not allowed in 
> Standby state, thus the call will fail when using DT-based authentication. 
> This hasn't caused issues in practice, since ORPP assumes that the state is 
> Standby if it is unable to fetch the state, but we should fix the logic to 
> properly handle this scenario.






[jira] [Commented] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs

2019-04-17 Thread Chen Liang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820459#comment-16820459
 ] 

Chen Liang commented on HDFS-14435:
---

Thanks for working on this [~xkrogen]! Just a minor comment: can we use DEBUG 
instead of INFO logs? The current code uses DEBUG here:
{{LOG.info("Changed current proxy from {} to {}"}}
Otherwise +1, pending Jenkins.

> ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs
> --
>
> Key: HDFS-14435
> URL: https://issues.apache.org/jira/browse/HDFS-14435
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, nn
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14435.000.patch
>
>
> We have been seeing issues during testing of the Consistent Read from Standby 
> feature that indicate that ORPP is unable to call {{getHAServiceState}} on 
> Standby NNs, as they are rejected with a {{StandbyException}}. Upon further 
> investigation, we realized that although the Standby allows the 
> {{getHAServiceState()}} call, reading a delegation token is not allowed in 
> Standby state, thus the call will fail when using DT-based authentication. 
> This hasn't caused issues in practice, since ORPP assumes that the state is 
> Standby if it is unable to fetch the state, but we should fix the logic to 
> properly handle this scenario.






[jira] [Commented] (HDFS-14434) webhdfs that connect secure hdfs should not use user.name parameter

2019-04-17 Thread Kihwal Lee (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820458#comment-16820458
 ] 

Kihwal Lee commented on HDFS-14434:
---

I should have clarified that the current code validates "user.name" and 
"doAs" only when the auth is token-based. I was suggesting that those checks 
be removed. The "doAs" param is still valid for Kerberos authentication, but 
it should be ignored in token-based auth, where whatever information is in 
the token is authoritative.
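
In pseudo-Java, the suggested behavior would be something like this (names 
are hypothetical, not the actual WebHDFS code):
{code:java}
// Hypothetical sketch: under token auth the token itself is authoritative,
// so the user.name/doAs params are ignored rather than validated against it.
if (delegationTokenParam != null) {
  ugi = verifyToken(delegationTokenParam); // identity comes from the token
  // do NOT validate or apply user.name / doAs here
} else {
  ugi = kerberosUgi; // from SPNEGO
  if (doAsParam != null) {
    ugi = UserGroupInformation.createProxyUser(doAsParam, ugi);
  }
}
{code}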

> webhdfs that connect secure hdfs should not use user.name parameter
> ---
>
> Key: HDFS-14434
> URL: https://issues.apache.org/jira/browse/HDFS-14434
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Minor
> Attachments: HDFS-14434.001.patch
>
>
> I have two secure Hadoop clusters. Both clusters use cross-realm 
> authentication. 
> [use...@a.com|mailto:use...@a.com] can access the HDFS of the B.COM realm.
> By the way, the Hadoop username of use...@a.com in the B.COM realm is 
> cross_realm_a_com_user_a.
> The hdfs dfs command of use...@a.com against the B.COM webhdfs failed.
> The root cause is that webhdfs connecting to a secure HDFS uses the 
> user.name parameter.
> According to the webhdfs spec, insecure webhdfs uses user.name, while secure 
> webhdfs uses SPNEGO for authentication.
> I think webhdfs that connects to a secure HDFS should not use the user.name 
> parameter.
> I will attach a patch.
> Below is the error log:
>  
> {noformat}
> $ hdfs dfs -ls  webhdfs://b.com:50070/
> ls: Usernames not matched: name=user_a != expected=cross_realm_a_com_user_a
>  
> # user.name in cross realm webhdfs
> $ curl -u : --negotiate 
> 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN=user_a' 
> {"RemoteException":{"exception":"SecurityException","javaClassName":"java.lang.SecurityException","message":"Failed
>  to obtain user group information: java.io.IOException: Usernames not 
> matched: name=user_a != expected=cross_realm_a_com_user_a"}}
> # USE SPNEGO
> $ curl -u : --negotiate 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN'
> {"Token"{"urlString":"XgA."}}
>  
> {noformat}
>  
>  
>  
>  
>  






[jira] [Updated] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs

2019-04-17 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14435:
---
Attachment: HDFS-14435.000.patch

> ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs
> --
>
> Key: HDFS-14435
> URL: https://issues.apache.org/jira/browse/HDFS-14435
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, nn
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
> Attachments: HDFS-14435.000.patch
>
>
> We have been seeing issues during testing of the Consistent Read from Standby 
> feature that indicate that ORPP is unable to call {{getHAServiceState}} on 
> Standby NNs, as they are rejected with a {{StandbyException}}. Upon further 
> investigation, we realized that although the Standby allows the 
> {{getHAServiceState()}} call, reading a delegation token is not allowed in 
> Standby state, thus the call will fail when using DT-based authentication. 
> This hasn't caused issues in practice, since ORPP assumes that the state is 
> Standby if it is unable to fetch the state, but we should fix the logic to 
> properly handle this scenario.






[jira] [Commented] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820446#comment-16820446
 ] 

Hadoop QA commented on HDFS-14426:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-13891 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
45s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 15m 
42s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
50s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
48s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 27s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
47s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
47s{color} | {color:green} HDFS-13891 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 14m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 14m 
31s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 54s{color} | {color:orange} root: The patch generated 1 new + 47 unchanged - 
0 fixed = 48 total (was 47) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
40s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m  
6s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 22m 
19s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
43s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}124m  7s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14426 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12966263/HDFS-14426-HDFS-13891.001.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux ff38843851f0 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 
10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-13891 / bd3161e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 

[jira] [Updated] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs

2019-04-17 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erik Krogen updated HDFS-14435:
---
Status: Patch Available  (was: In Progress)

> ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs
> --
>
> Key: HDFS-14435
> URL: https://issues.apache.org/jira/browse/HDFS-14435
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, nn
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> We have been seeing issues during testing of the Consistent Read from Standby 
> feature that indicate that ORPP is unable to call {{getHAServiceState}} on 
> Standby NNs, as they are rejected with a {{StandbyException}}. Upon further 
> investigation, we realized that although the Standby allows the 
> {{getHAServiceState()}} call, reading a delegation token is not allowed in 
> Standby state, thus the call will fail when using DT-based authentication. 
> This hasn't caused issues in practice, since ORPP assumes that the state is 
> Standby if it is unable to fetch the state, but we should fix the logic to 
> properly handle this scenario.






[jira] [Work logged] (HDDS-1442) add spark container to ozonesecure-mr compose files

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1442?focusedWorklogId=229299=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229299
 ]

ASF GitHub Bot logged work on HDDS-1442:


Author: ASF GitHub Bot
Created on: 17/Apr/19 18:57
Start Date: 17/Apr/19 18:57
Worklog Time Spent: 10m 
  Work Description: elek commented on pull request #746: HDDS-1442. add 
spark container to ozonesecure-mr compose files. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/746#discussion_r276387885
 
 

 ##
 File path: 
hadoop-ozone/dist/src/main/compose/ozonesecure-mr/docker-compose.yaml
 ##
 @@ -112,3 +112,13 @@ services:
   HADOOP_CLASSPATH: 
/opt/ozone/share/ozone/lib/hadoop-ozone-filesystem-lib-current-@project.version@.jar
   WAIT_FOR: rm:8088
 command: ["yarn","timelineserver"]
+  spark:
+image: ahadoop/spark:v1
 
 Review comment:
   It's fine. Just use automated builds: 
https://docs.docker.com/docker-hub/builds/ 
   
   With automated builds the source is linked by Docker Hub without any extra 
effort, and it is easy to check the source.
   
   (Or add a README with a link to the GitHub project, but automated builds 
are probably better.)
 



Issue Time Tracking
---

Worklog Id: (was: 229299)
Time Spent: 1h  (was: 50m)

> add spark container to ozonesecure-mr compose files
> ---
>
> Key: HDDS-1442
> URL: https://issues.apache.org/jira/browse/HDDS-1442
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> add spark container to ozonesecure-mr compose files






[jira] [Commented] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-17 Thread Erik Krogen (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820395#comment-16820395
 ] 

Erik Krogen commented on HDFS-14245:


Yes... Thanks for reminding me of this JIRA, [~shv]. I will take it up, unless 
[~shenyinjie] has any issues with that. 

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Shen Yinjie
>Priority: Major
> Attachments: HDFS-14245.patch
>
>
> Run "hdfs groups" with ObserverReadProxyProvider, Exception throws as :
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> Similar to HDFS-14116, we did a simple fix.






[jira] [Commented] (HDFS-14378) Simplify the design of multiple NN and both logic of edit log roll and checkpoint

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820380#comment-16820380
 ] 

Hadoop QA commented on HDFS-14378:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 31s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 53s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 85 new + 1025 unchanged - 47 fixed = 1110 total (was 1072) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 37s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}104m 53s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
32s{color} | {color:red} The patch generated 3 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}161m 54s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestCheckpoint |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.server.namenode.ha.TestHAConfiguration |
|   | hadoop.hdfs.tools.offlineEditsViewer.TestOfflineEditsViewer |
|   | hadoop.hdfs.TestFetchImage |
|   | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | hadoop.hdfs.server.namenode.TestEditLog |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
|   | hadoop.hdfs.server.namenode.TestSecurityTokenEditLog |
|   | hadoop.hdfs.server.namenode.TestEditLogRace |
|   | hadoop.hdfs.TestDFSClientRetries |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:bdbca0e |
| JIRA Issue | HDFS-14378 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12966249/HDFS-14378-trunk.004.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 15ba67745de8 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 

[jira] [Work started] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs

2019-04-17 Thread Erik Krogen (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-14435 started by Erik Krogen.
--
> ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs
> --
>
> Key: HDFS-14435
> URL: https://issues.apache.org/jira/browse/HDFS-14435
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, nn
>Affects Versions: 3.3.0
>Reporter: Erik Krogen
>Assignee: Erik Krogen
>Priority: Major
>
> We have been seeing issues during testing of the Consistent Read from Standby 
> feature that indicate that ORPP is unable to call {{getHAServiceState}} on 
> Standby NNs, as they are rejected with a {{StandbyException}}. Upon further 
> investigation, we realized that although the Standby allows the 
> {{getHAServiceState()}} call, reading a delegation token is not allowed in 
> Standby state, thus the call will fail when using DT-based authentication. 
> This hasn't caused issues in practice, since ORPP assumes that the state is 
> Standby if it is unable to fetch the state, but we should fix the logic to 
> properly handle this scenario.






[jira] [Commented] (HDFS-14245) Class cast error in GetGroups with ObserverReadProxyProvider

2019-04-17 Thread Konstantin Shvachko (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820373#comment-16820373
 ] 

Konstantin Shvachko commented on HDFS-14245:


[~xkrogen] your option 2 - _ORPP not using Observers for protocols that do not 
extend {{ClientProtocol}}_ - sounds reasonable.
Should we implement it?
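
A minimal sketch of what option 2 might look like (only ClientProtocol is a 
real Hadoop interface here; the rest is illustrative, not the eventual patch):

{code:java}
import org.apache.hadoop.hdfs.protocol.ClientProtocol;

// Hedged sketch of option 2: only route reads to Observers when the proxied
// interface is ClientProtocol (or a subtype); for anything else, such as the
// protocol behind "hdfs groups", fall back to plain failover behavior.
class ObserverReadEligibilitySketch {
  static boolean supportsObserverReads(Class<?> xface) {
    return ClientProtocol.class.isAssignableFrom(xface);
  }
}
{code}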

> Class cast error in GetGroups with ObserverReadProxyProvider
> 
>
> Key: HDFS-14245
> URL: https://issues.apache.org/jira/browse/HDFS-14245
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: HDFS-12943
>Reporter: Shen Yinjie
>Assignee: Shen Yinjie
>Priority: Major
> Attachments: HDFS-14245.patch
>
>
> Run "hdfs groups" with ObserverReadProxyProvider, Exception throws as :
> {code:java}
> Exception in thread "main" java.io.IOException: Couldn't create proxy 
> provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:261)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:119)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:95)
>  at org.apache.hadoop.hdfs.tools.GetGroups.getUgmProtocol(GetGroups.java:87)
>  at org.apache.hadoop.tools.GetGroupsBase.run(GetGroupsBase.java:71)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
>  at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
>  at org.apache.hadoop.hdfs.tools.GetGroups.main(GetGroups.java:96)
> Caused by: java.lang.reflect.InvocationTargetException
>  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>  at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
>  at 
> org.apache.hadoop.hdfs.NameNodeProxiesClient.createFailoverProxyProvider(NameNodeProxiesClient.java:245)
>  ... 7 more
> Caused by: java.lang.ClassCastException: 
> org.apache.hadoop.hdfs.server.namenode.ha.NameNodeHAProxyFactory cannot be 
> cast to org.apache.hadoop.hdfs.server.namenode.ha.ClientHAProxyFactory
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:123)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider.<init>(ObserverReadProxyProvider.java:112)
>  ... 12 more
> {code}
> Similar to HDFS-14116, we did a simple fix.






[jira] [Created] (HDFS-14435) ObserverReadProxyProvider is unable to properly fetch HAState from Standby NNs

2019-04-17 Thread Erik Krogen (JIRA)
Erik Krogen created HDFS-14435:
--

 Summary: ObserverReadProxyProvider is unable to properly fetch 
HAState from Standby NNs
 Key: HDFS-14435
 URL: https://issues.apache.org/jira/browse/HDFS-14435
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha, nn
Affects Versions: 3.3.0
Reporter: Erik Krogen
Assignee: Erik Krogen


We have been seeing issues during testing of the Consistent Read from Standby 
feature that indicate that ORPP is unable to call {{getHAServiceState}} on 
Standby NNs, as they are rejected with a {{StandbyException}}. Upon further 
investigation, we realized that although the Standby allows the 
{{getHAServiceState()}} call, reading a delegation token is not allowed in 
Standby state, thus the call will fail when using DT-based authentication. This 
hasn't caused issues in practice, since ORPP assumes that the state is Standby 
if it is unable to fetch the state, but we should fix the logic to properly 
handle this scenario.






[jira] [Work logged] (HDDS-1442) add spark container to ozonesecure-mr compose files

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1442?focusedWorklogId=229272=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229272
 ]

ASF GitHub Bot logged work on HDDS-1442:


Author: ASF GitHub Bot
Created on: 17/Apr/19 18:05
Start Date: 17/Apr/19 18:05
Worklog Time Spent: 10m 
  Work Description: ajayydv commented on issue #746: HDDS-1442. add spark 
container to ozonesecure-mr compose files. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/746#issuecomment-484200110
 
 
   > Looks good to me but not clear what is the ahadoop/spark:v1 and how is it 
maintained.
   > 
   > 1. Where it the source of this image and how can we create PRs? (If you 
have a simple repo and automated build can be better...)
   > 2. Which version of spark is included in the image (v1 tag is not very 
clear)
   > 3. The entrypoint of the dockerimage doesn't allow to start the container 
easily (docker run ahadoop/spark:v1 bash --> instead of bash the download is 
started)
   It's a modified version of our Hadoop docker runner. The source is currently 
in my private repo (https://github.com/ajayydv/docker/tree/spark), but we can 
push it to a public repo. I don't think the hadoop repo is the right place for 
this, as it is mostly Spark bits. It uses Spark 2.4.3, but that can be 
overridden via config (SPARK_DOWNLOAD_URL).
 



Issue Time Tracking
---

Worklog Id: (was: 229272)
Time Spent: 50m  (was: 40m)

> add spark container to ozonesecure-mr compose files
> ---
>
> Key: HDDS-1442
> URL: https://issues.apache.org/jira/browse/HDDS-1442
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> add spark container to ozonesecure-mr compose files






[jira] [Work logged] (HDDS-1442) add spark container to ozonesecure-mr compose files

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1442?focusedWorklogId=229265=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229265
 ]

ASF GitHub Bot logged work on HDDS-1442:


Author: ASF GitHub Bot
Created on: 17/Apr/19 18:02
Start Date: 17/Apr/19 18:02
Worklog Time Spent: 10m 
  Work Description: ajayydv commented on pull request #746: HDDS-1442. add 
spark container to ozonesecure-mr compose files. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/746#discussion_r276366214
 
 

 ##
 File path: 
hadoop-ozone/dist/src/main/compose/ozonesecure-mr/docker-compose.yaml
 ##
 @@ -112,3 +112,13 @@ services:
   HADOOP_CLASSPATH: 
/opt/ozone/share/ozone/lib/hadoop-ozone-filesystem-lib-current-@project.version@.jar
   WAIT_FOR: rm:8088
 command: ["yarn","timelineserver"]
+  spark:
+image: ahadoop/spark:v1
 
 Review comment:
   Its based on our hadoop docker runner image. You can look at source at 
https://github.com/ajayydv/docker/tree/spark . We can create a team account for 
ozone in githib and docker and push this there.
 



Issue Time Tracking
---

Worklog Id: (was: 229265)
Time Spent: 40m  (was: 0.5h)

> add spark container to ozonesecure-mr compose files
> ---
>
> Key: HDDS-1442
> URL: https://issues.apache.org/jira/browse/HDDS-1442
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> add spark container to ozonesecure-mr compose files






[jira] [Commented] (HDFS-14431) RBF: Rename with multiple subclusters should fail if no eligible locations

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820361#comment-16820361
 ] 

Hadoop QA commented on HDFS-14431:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 5 new or modified test 
files. {color} |
|| || || || {color:brown} HDFS-13891 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 22m 
33s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
36s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
25s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
38s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 44s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
5s{color} | {color:green} HDFS-13891 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} HDFS-13891 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 26m 
21s{color} | {color:green} hadoop-hdfs-rbf in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
31s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m 29s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:8f97d6f |
| JIRA Issue | HDFS-14431 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12966257/HDFS-14431-HDFS-13891.003.patch
 |
| Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall  
mvnsite  unit  shadedclient  findbugs  checkstyle  |
| uname | Linux c6d52b4f9db1 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 
08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | HDFS-13891 / bd3161e |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_191 |
| findbugs | v3.1.0-RC1 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26658/testReport/ |
| Max. process+thread count | 1027 (vs. ulimit of 1) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs-rbf U: 
hadoop-hdfs-project/hadoop-hdfs-rbf |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/26658/console |
| Powered by | Apache Yetus 0.8.0   http://yetus.apache.org |


This message was automatically generated.



> RBF: Rename with multiple subclusters should fail if no eligible locations
> 

[jira] [Commented] (HDFS-14433) Remove the extra empty space in the DataStreamer logging

2019-04-17 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820352#comment-16820352
 ] 

Hudson commented on HDFS-14433:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #16434 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/16434/])
HDFS-14433. Remove the extra empty space in the DataStreamer logging. (xyao: 
rev 685cb83e4c3f433c5147e35217ce79ea520a0da5)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DataStreamer.java


> Remove the extra empty space in the DataStreamer logging
> 
>
> Key: HDFS-14433
> URL: https://issues.apache.org/jira/browse/HDFS-14433
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Yishuang Lu
>Assignee: Yishuang Lu
>Priority: Trivial
> Fix For: 3.3.0
>
> Attachments: HDFS-14433.001.patch
>
>
> Remove the extra empty space in the DataStreamer logging






[jira] [Commented] (HDFS-14434) webhdfs that connect secure hdfs should not use user.name parameter

2019-04-17 Thread Eric Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820344#comment-16820344
 ] 

Eric Yang commented on HDFS-14434:
--

[~kihwal] How does impersonation take place for a service user to access HDFS 
on behalf of an end user without doAs?

> webhdfs that connect secure hdfs should not use user.name parameter
> ---
>
> Key: HDFS-14434
> URL: https://issues.apache.org/jira/browse/HDFS-14434
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: webhdfs
>Affects Versions: 3.1.2
>Reporter: KWON BYUNGCHANG
>Priority: Minor
> Attachments: HDFS-14434.001.patch
>
>
> I have two secure Hadoop clusters. Both clusters use cross-realm 
> authentication. 
> [use...@a.com|mailto:use...@a.com] can access the HDFS of the B.COM realm.
> However, the Hadoop username of use...@a.com in the B.COM realm is 
> cross_realm_a_com_user_a.
> The hdfs dfs command of use...@a.com against B.COM webhdfs failed.
> The root cause is that webhdfs connecting to a secure HDFS uses the 
> user.name parameter.
> According to the webhdfs spec, insecure webhdfs uses user.name, while secure 
> webhdfs uses SPNEGO for authentication.
> I think webhdfs connecting to a secure HDFS should not use the user.name 
> parameter.
> I will attach a patch.
> Below is the error log:
>  
> {noformat}
> $ hdfs dfs -ls  webhdfs://b.com:50070/
> ls: Usernames not matched: name=user_a != expected=cross_realm_a_com_user_a
>  
> # user.name in cross realm webhdfs
> $ curl -u : --negotiate 
> 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN=user_a' 
> {"RemoteException":{"exception":"SecurityException","javaClassName":"java.lang.SecurityException","message":"Failed
>  to obtain user group information: java.io.IOException: Usernames not 
> matched: name=user_a != expected=cross_realm_a_com_user_a"}}
> # USE SPNEGO
> $ curl -u : --negotiate 'http://b.com:50070/webhdfs/v1/?op=GETDELEGATIONTOKEN'
> {"Token"{"urlString":"XgA."}}
>  
> {noformat}
>  
>  
>  
>  
>  






[jira] [Comment Edited] (HDFS-14427) RBF: Optimize some testing set up logic in MiniRouterDFSCluster

2019-04-17 Thread Fengnan Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820338#comment-16820338
 ] 

Fengnan Li edited comment on HDFS-14427 at 4/17/19 5:41 PM:


[~elgoiri] Thanks for the comment. 

I also like the idea of having an extra option to specify the number of 
Routers. If we keep the current logic of # of Routers = # of Namenodes, we 
will need to provide functions whose signatures specify either the number of 
Routers or the number of Namenodes, but not both, since allowing both would 
mislead people into specifying mismatched counts.

Will upload a patch later for this.


was (Author: fengnanli):
[~elgoiri] Thanks for the comment. 

I also like the idea of having an extra option to specify the number of 
Routers. If we keep the current logic of # of Routers = # of Namenodes, we 
will need to provide functions whose signatures specify either the number of 
Routers or the number of Namenodes, but not both, since allowing both would 
mislead people into specifying mismatched counts.

> RBF: Optimize some testing set up logic in MiniRouterDFSCluster
> ---
>
> Key: HDFS-14427
> URL: https://issues.apache.org/jira/browse/HDFS-14427
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
>
> [https://github.com/apache/hadoop/blob/HDFS-13891/hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/MiniRouterDFSCluster.java#L808]
> the comment says one router is created per nameservice, while in the code 
> one router is created per namenode in each nameservice.
> There are a couple of things that might need optimization:
>  # make the code match the comment
>  # add a way to specify the number of routers
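
For item 2, a sketch of what a hypothetical builder knob could look like (none 
of these names come from the actual MiniRouterDFSCluster API):

{code:java}
// Hedged sketch: let tests choose the number of Routers explicitly, with the
// default of one Router per nameservice that the comment describes.
class RouterClusterConfigSketch {
  final int numNameservices;
  final int numRouters;

  private RouterClusterConfigSketch(int nameservices, int routers) {
    this.numNameservices = nameservices;
    this.numRouters = routers;
  }

  static RouterClusterConfigSketch of(int nameservices) {
    // Default: one Router per nameservice.
    return new RouterClusterConfigSketch(nameservices, nameservices);
  }

  RouterClusterConfigSketch withRouters(int routers) {
    return new RouterClusterConfigSketch(numNameservices, routers);
  }
}
{code}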






[jira] [Updated] (HDFS-14433) Remove the extra empty space in the DataStreamer logging

2019-04-17 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao updated HDFS-14433:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.3.0
   Status: Resolved  (was: Patch Available)

Thanks [~lys0716] for the contribution and all for the reviews. I've merged the 
change to trunk. 

> Remove the extra empty space in the DataStreamer logging
> 
>
> Key: HDFS-14433
> URL: https://issues.apache.org/jira/browse/HDFS-14433
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Yishuang Lu
>Assignee: Yishuang Lu
>Priority: Trivial
> Fix For: 3.3.0
>
> Attachments: HDFS-14433.001.patch
>
>
> Remove the extra empty space in the DataStreamer logging






[jira] [Assigned] (HDFS-14433) Remove the extra empty space in the DataStreamer logging

2019-04-17 Thread Xiaoyu Yao (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao reassigned HDFS-14433:
-

Assignee: Yishuang Lu

> Remove the extra empty space in the DataStreamer logging
> 
>
> Key: HDFS-14433
> URL: https://issues.apache.org/jira/browse/HDFS-14433
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: hdfs
>Affects Versions: 3.1.2
>Reporter: Yishuang Lu
>Assignee: Yishuang Lu
>Priority: Trivial
> Attachments: HDFS-14433.001.patch
>
>
> Remove the extra empty space in the DataStreamer logging






[jira] [Commented] (HDFS-14427) RBF: Optimize some testing set up logic in MiniRouterDFSCluster

2019-04-17 Thread Fengnan Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820338#comment-16820338
 ] 

Fengnan Li commented on HDFS-14427:
---

[~elgoiri] Thanks for the comment. 

I also like the idea of having an extra option to specify the number of 
Routers. If we keep the current logic of # of Routers = # of Namenodes, we 
will need to provide functions whose signatures specify either the number of 
Routers or the number of Namenodes, but not both, since allowing both would 
mislead people into specifying mismatched counts.

> RBF: Optimize some testing set up logic in MiniRouterDFSCluster
> ---
>
> Key: HDFS-14427
> URL: https://issues.apache.org/jira/browse/HDFS-14427
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
>
> [https://github.com/apache/hadoop/blob/HDFS-13891/hadoop-hdfs-project/hadoop-hdfs-rbf/src/test/java/org/apache/hadoop/hdfs/server/federation/MiniRouterDFSCluster.java#L808]
> the comment says one router is created per nameservice, while in the code 
> one router is created per namenode in each nameservice.
> There are a couple of things that might need optimization:
>  # make the code match the comment
>  # add a way to specify the number of routers






[jira] [Work logged] (HDDS-1442) add spark container to ozonesecure-mr compose files

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1442?focusedWorklogId=229240=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229240
 ]

ASF GitHub Bot logged work on HDDS-1442:


Author: ASF GitHub Bot
Created on: 17/Apr/19 17:32
Start Date: 17/Apr/19 17:32
Worklog Time Spent: 10m 
  Work Description: xiaoyuyao commented on pull request #746: HDDS-1442. 
add spark container to ozonesecure-mr compose files. Contributed by Ajay Kumar.
URL: https://github.com/apache/hadoop/pull/746#discussion_r276353848
 
 

 ##
 File path: 
hadoop-ozone/dist/src/main/compose/ozonesecure-mr/docker-compose.yaml
 ##
 @@ -112,3 +112,13 @@ services:
   HADOOP_CLASSPATH: 
/opt/ozone/share/ozone/lib/hadoop-ozone-filesystem-lib-current-@project.version@.jar
   WAIT_FOR: rm:8088
 command: ["yarn","timelineserver"]
+  spark:
+image: ahadoop/spark:v1
 
 Review comment:
   Can you share the image details of ahadoop/spark:v1? Is it published in 
docker repo?
 



Issue Time Tracking
---

Worklog Id: (was: 229240)
Time Spent: 0.5h  (was: 20m)

> add spark container to ozonesecure-mr compose files
> ---
>
> Key: HDDS-1442
> URL: https://issues.apache.org/jira/browse/HDDS-1442
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> add spark container to ozonesecure-mr compose files






[jira] [Commented] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-17 Thread Fengnan Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820334#comment-16820334
 ] 

Fengnan Li commented on HDFS-14426:
---

[~elgoiri] Thanks for the review and comment! I updated the patch with the test 
change as well as the name.

> RBF: Add delegation token total count as one of the federation metrics
> --
>
> Key: HDFS-14426
> URL: https://issues.apache.org/jira/browse/HDFS-14426
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14426-HDFS-13891.001.patch, HDFS-14426.001.patch
>
>
> Currently the Router doesn't report the total number of currently valid 
> delegation tokens it holds, but this piece of information is useful for 
> monitoring and understanding the real-time token situation.
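
For illustration, a minimal sketch of such a count (hypothetical names; the 
real Router metrics plumbing is more involved):

{code:java}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hedged sketch: track token expiration times and expose the number of
// currently valid tokens as a gauge-style getter for the metrics system.
class TokenCountMetricSketch {
  private final ConcurrentMap<String, Long> tokenExpirations =
      new ConcurrentHashMap<>();

  void addToken(String tokenId, long expiryMillis) {
    tokenExpirations.put(tokenId, expiryMillis);
  }

  long getCurrentTokensCount() {
    final long now = System.currentTimeMillis();
    return tokenExpirations.values().stream().filter(exp -> exp > now).count();
  }
}
{code}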






[jira] [Updated] (HDFS-14426) RBF: Add delegation token total count as one of the federation metrics

2019-04-17 Thread Fengnan Li (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-14426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Fengnan Li updated HDFS-14426:
--
Attachment: HDFS-14426-HDFS-13891.001.patch

> RBF: Add delegation token total count as one of the federation metrics
> --
>
> Key: HDFS-14426
> URL: https://issues.apache.org/jira/browse/HDFS-14426
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Fengnan Li
>Assignee: Fengnan Li
>Priority: Major
> Attachments: HDFS-14426-HDFS-13891.001.patch, HDFS-14426.001.patch
>
>
> Currently the Router doesn't report the total number of currently valid 
> delegation tokens it holds, but this piece of information is useful for 
> monitoring and understanding the real-time token situation.






[jira] [Commented] (HDDS-1297) Fix IllegalArgumentException thrown with MiniOzoneCluster Initialization

2019-04-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820330#comment-16820330
 ] 

Hadoop QA commented on HDDS-1297:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
33s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green}  0m  
0s{color} | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
28s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
14m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
58s{color} | {color:green} trunk passed {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue}  4m  
3s{color} | {color:blue} Used deprecated FindBugs config; considering switching 
to SpotBugs. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  6m 
58s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 
36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
6s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 30s{color} | {color:orange} hadoop-hdds: The patch generated 3 new + 0 
unchanged - 0 fixed = 3 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  5s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m  
2s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  3m 32s{color} 
| {color:red} hadoop-hdds in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 15m 29s{color} 
| {color:red} hadoop-ozone in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
37s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 86m 59s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdds.scm.node.TestSCMNodeManager |
|   | hadoop.hdds.scm.pipeline.TestPipelineClose |
|   | hadoop.ozone.om.TestOzoneManagerHA |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/PreCommit-HDDS-Build/2664/artifact/out/Dockerfile 
|
| JIRA Issue | HDDS-1297 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12966254/HDDS-1297.05.patch |
| Optional Tests | dupname asflicense 

[jira] [Comment Edited] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x

2019-04-17 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820302#comment-16820302
 ] 

Anu Engineer edited comment on HDFS-13596 at 4/17/19 5:03 PM:
--

Not to add more noise, but this might be a good opportunity for us to learn a 
trick or two from our brethren in the HBase land.  HBase has this nice ability 
to upgrade to 3.0, but will not enable 3.0 features unless another command or 
setting is applied. We actually have a very similar situation here, we might 
have changes in Edit logs, but let us not allow that feature to be used until 
after the main step is completely done and we have some way of verifying that 
nothing is broken. Then you can enable the full 3.0 features once the full 
upgrade is done. (Thanks to [~ccondit] for educating me on how good HBase in 
doing this, and letting me know that Ozone should probably learn from that 
experience).

 

if we do this, Rolling upgrade would be two steps, upgrade, then enable 3.0 
features like EC.  Till enable call is done, HDFS will not allow 3.0 features 
like EC.

 


was (Author: anu):
Not to add more noise, but this might be a good opportunity for us to learn a 
trick or two from our brethren in the HBase land.  HBase has this nice ability 
to upgrade to 3.0, but will not enable 3.0 features unless another command or 
setting is applied. We actually have a very similar situation here, we might 
have changes in Edit logs, but let us not allow that feature to be used until 
after the main step is completely done and we have some way of verifying that 
nothing is broken. Then you can enable the full 3.0 features once the full 
upgrade is done. (Thanks to [~ccondit] for educating me on how good HBase does 
this, and letting me know that Ozone should probably learn from that 
experience).

 

if we do this, Rolling upgrade would be two steps, upgrade, then enable 3.0 
features like EC.  Till enable call is done, HDFS will not allow 3.0 features 
like EC.

 

> NN restart fails after RollingUpgrade from 2.x to 3.x
> -
>
> Key: HDFS-13596
> URL: https://issues.apache.org/jira/browse/HDFS-13596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Fei Hui
>Priority: Critical
> Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, 
> HDFS-13596.003.patch, HDFS-13596.004.patch, HDFS-13596.005.patch, 
> HDFS-13596.006.patch, HDFS-13596.007.patch
>
>
> After rollingUpgrade of the NN from 2.x to 3.x, if the NN is restarted, it fails 
> while replaying edit logs.
>  * After NN is started with rollingUpgrade, the layoutVersion written to 
> editLogs (before finalizing the upgrade) is the pre-upgrade layout version 
> (so as to support downgrade).
>  * When writing transactions to log, NN writes as per the current layout 
> version. In 3.x, erasureCoding bits are added to the editLog transactions.
>  * So any edit log written after the upgrade and before finalizing the 
> upgrade will have the old layout version but the new format of transactions.
>  * When NN is restarted and the edit logs are replayed, the NN reads the old 
> layout version from the editLog file. When parsing the transactions, it 
> assumes that the transactions are also from the previous layout and hence 
> skips parsing the erasureCoding bits.
>  * This cascades into reading the wrong set of bits for other fields and 
> leads to NN shutting down.
> Sample error output:
> {code:java}
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
>  at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
>  at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> 
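
To make the failure mode concrete, here is a self-contained toy (invented 
layout values and fields, not the actual FSEditLogLoader code) showing how 
skipping one conditionally written byte misaligns every field that follows:

{code:java}
import java.io.*;

// Toy reproduction of the HDFS-13596 failure mode: the writer emits an
// extra byte for new layouts, but the reader decides whether to expect it
// from the layout version stamped on the file -- which, during a rolling
// upgrade, is still the old version. HDFS layout versions are negative and
// decrease as they get newer; the values below are invented.
public class LayoutSkewDemo {
  static final int OLD_LAYOUT = -63, NEW_LAYOUT = -64;

  static byte[] writeOp(int writerLayout) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(bos);
    out.writeLong(42L);                 // txid
    if (writerLayout <= NEW_LAYOUT) {
      out.writeByte(1);                 // EC-related byte, new layouts only
    }
    out.writeInt(16);                   // clientId length
    return bos.toByteArray();
  }

  static void readOp(byte[] data, int readerLayout) throws IOException {
    DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
    long txid = in.readLong();
    if (readerLayout <= NEW_LAYOUT) {
      in.readByte();                    // skipped when the reader assumes OLD_LAYOUT
    }
    // Misaligned by one byte: prints 16777216 instead of 16, which is how
    // "Invalid clientId - length is 0 expected length 16" style errors arise.
    System.out.println("txid=" + txid + " clientIdLen=" + in.readInt());
  }

  public static void main(String[] args) throws IOException {
    readOp(writeOp(NEW_LAYOUT), OLD_LAYOUT);  // written new, parsed as old
  }
}
{code}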

[jira] [Comment Edited] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x

2019-04-17 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820302#comment-16820302
 ] 

Anu Engineer edited comment on HDFS-13596 at 4/17/19 5:03 PM:
--

Not to add more noise, but this might be a good opportunity for us to learn a 
trick or two from our brethren in the HBase land. HBase has this nice ability 
to upgrade to 3.0, but it will not enable 3.0 features unless another command 
or setting is applied. We actually have a very similar situation here: we might 
have changes in the edit logs, but let us not allow that feature to be used 
until the main step is completely done and we have some way of verifying that 
nothing is broken. Then you can enable the full 3.0 features once the full 
upgrade is done. (Thanks to [~ccondit] for educating me on how well HBase does 
this, and for letting me know that Ozone should probably learn from that 
experience.)

 

If we do this, a rolling upgrade would be two steps: upgrade, then enable 3.0 
features like EC. Until the enable call is done, HDFS will not allow 3.0 
features like EC.

 



> NN restart fails after RollingUpgrade from 2.x to 3.x
> -
>
> Key: HDFS-13596
> URL: https://issues.apache.org/jira/browse/HDFS-13596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Fei Hui
>Priority: Critical
> Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, 
> HDFS-13596.003.patch, HDFS-13596.004.patch, HDFS-13596.005.patch, 
> HDFS-13596.006.patch, HDFS-13596.007.patch
>
>
> After rollingUpgrade NN from 2.x to 3.x, if the NN is restarted, it fails 
> while replaying edit logs.
>  * After NN is started with rollingUpgrade, the layoutVersion written to 
> editLogs (before finalizing the upgrade) is the pre-upgrade layout version 
> (so as to support downgrade).
>  * When writing transactions to log, NN writes as per the current layout 
> version. In 3.x, erasureCoding bits are added to the editLog transactions.
>  * So any edit log written after the upgrade and before finalizing the 
> upgrade will have the old layout version but the new format of transactions.
>  * When NN is restarted and the edit logs are replayed, the NN reads the old 
> layout version from the editLog file. When parsing the transactions, it 
> assumes that the transactions are also from the previous layout and hence 
> skips parsing the erasureCoding bits.
>  * This cascades into reading the wrong set of bits for other fields and 
> leads to NN shutting down.
> Sample error output:
> {code:java}
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
>  at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
>  at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> 

[jira] [Commented] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x

2019-04-17 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820302#comment-16820302
 ] 

Anu Engineer commented on HDFS-13596:
-

Not to add more noise, but this might be a good opportunity for us to learn a 
trick or two from our brethren in the HBase land. HBase has this nice ability 
to upgrade to 3.0, but it will not enable 3.0 features unless another command 
or setting is applied. We actually have a very similar situation here: we might 
have changes in the edit logs, but let us not allow that feature to be used 
until the main step is completely done and we have some way of verifying that 
nothing is broken. Then you can enable the full 3.0 features once the full 
upgrade is done. (Thanks to [~ccondit] for educating me on how well HBase does 
this, and for letting me know that Ozone should probably learn from that 
experience.)

 

If we do this, a rolling upgrade would be two steps: upgrade, then enable 3.0 
features like EC. Until the enable call is done, HDFS will not allow 3.0 
features like EC.

 

> NN restart fails after RollingUpgrade from 2.x to 3.x
> -
>
> Key: HDFS-13596
> URL: https://issues.apache.org/jira/browse/HDFS-13596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Fei Hui
>Priority: Critical
> Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, 
> HDFS-13596.003.patch, HDFS-13596.004.patch, HDFS-13596.005.patch, 
> HDFS-13596.006.patch, HDFS-13596.007.patch
>
>
> After rollingUpgrade NN from 2.x to 3.x, if the NN is restarted, it fails 
> while replaying edit logs.
>  * After NN is started with rollingUpgrade, the layoutVersion written to 
> editLogs (before finalizing the upgrade) is the pre-upgrade layout version 
> (so as to support downgrade).
>  * When writing transactions to log, NN writes as per the current layout 
> version. In 3.x, erasureCoding bits are added to the editLog transactions.
>  * So any edit log written after the upgrade and before finalizing the 
> upgrade will have the old layout version but the new format of transactions.
>  * When NN is restarted and the edit logs are replayed, the NN reads the old 
> layout version from the editLog file. When parsing the transactions, it 
> assumes that the transactions are also from the previous layout and hence 
> skips parsing the erasureCoding bits.
>  * This cascades into reading the wrong set of bits for other fields and 
> leads to NN shutting down.
> Sample error output:
> {code:java}
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
>  at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
>  at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:714)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:632)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:694)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:937)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:910)
>  at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1643)
>  at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1710)
> 2018-05-17 19:10:06,522 WARN 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception 
> loading fsimage
> java.io.IOException: java.lang.IllegalStateException: Cannot skip to less 
> than the current value (=16389), where newValue=16388
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1945)
>  at 
> 

[jira] [Comment Edited] (HDFS-13596) NN restart fails after RollingUpgrade from 2.x to 3.x

2019-04-17 Thread Anu Engineer (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-13596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820302#comment-16820302
 ] 

Anu Engineer edited comment on HDFS-13596 at 4/17/19 5:02 PM:
--

Not to add more noise, but this might be a good opportunity for us to learn a 
trick or two from our brethren in the HBase land. HBase has this nice ability 
to upgrade to 3.0, but it will not enable 3.0 features unless another command 
or setting is applied. We actually have a very similar situation here: we might 
have changes in the edit logs, but let us not allow that feature to be used 
until the main step is completely done and we have some way of verifying that 
nothing is broken. Then you can enable the full 3.0 features once the full 
upgrade is done. (Thanks to [~ccondit] for educating me on how well HBase does 
this, and for letting me know that Ozone should probably learn from that 
experience.)

 

If we do this, a rolling upgrade would be two steps: upgrade, then enable 3.0 
features like EC. Until the enable call is done, HDFS will not allow 3.0 
features like EC.

 



> NN restart fails after RollingUpgrade from 2.x to 3.x
> -
>
> Key: HDFS-13596
> URL: https://issues.apache.org/jira/browse/HDFS-13596
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs
>Reporter: Hanisha Koneru
>Assignee: Fei Hui
>Priority: Critical
> Attachments: HDFS-13596.001.patch, HDFS-13596.002.patch, 
> HDFS-13596.003.patch, HDFS-13596.004.patch, HDFS-13596.005.patch, 
> HDFS-13596.006.patch, HDFS-13596.007.patch
>
>
> After rollingUpgrade NN from 2.x to 3.x, if the NN is restarted, it fails 
> while replaying edit logs.
>  * After NN is started with rollingUpgrade, the layoutVersion written to 
> editLogs (before finalizing the upgrade) is the pre-upgrade layout version 
> (so as to support downgrade).
>  * When writing transactions to log, NN writes as per the current layout 
> version. In 3.x, erasureCoding bits are added to the editLog transactions.
>  * So any edit log written after the upgrade and before finalizing the 
> upgrade will have the old layout version but the new format of transactions.
>  * When NN is restarted and the edit logs are replayed, the NN reads the old 
> layout version from the editLog file. When parsing the transactions, it 
> assumes that the transactions are also from the previous layout and hence 
> skips parsing the erasureCoding bits.
>  * This cascades into reading the wrong set of bits for other fields and 
> leads to NN shutting down.
> Sample error output:
> {code:java}
> java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
> length 16
>  at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
>  at org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
>  at 
> org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
>  at 
> org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:960)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
>  at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:888)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:745)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:323)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1086)
>  at 
> 

[jira] [Assigned] (HDDS-1258) Fix error propagation for SCM protocol

2019-04-17 Thread Shweta (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shweta reassigned HDDS-1258:


Assignee: Shweta  (was: Kitti Nanasi)

> Fix error propagation for SCM protocol
> --
>
> Key: HDDS-1258
> URL: https://issues.apache.org/jira/browse/HDDS-1258
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Elek, Marton
>Assignee: Shweta
>Priority: Major
>
> HDDS-1068 fixed the error propagation between the OM client and OM server.
> By default, Server.java transforms all IOExceptions into one string 
> (message + stack trace), which is returned to the client.
> But for business exceptions (e.g. volume not found, chill mode is active, 
> etc.) this is not what we need.
> On the OM side we fixed this behaviour: in the ServerSideTranslator classes 
> we catch the business (OMException) exceptions on the server and serialize 
> them into the response object.
> The exception (and the status code) is stored in the message/status fields of 
> the OMResponse (hadoop-ozone/common/src/main/proto/OzoneManagerProtocol.proto).
> Here I propose to do the same for ScmBlockLocationProtocol.proto.
> Unfortunately there is no common parent object (like OMRequest) in this 
> protocol, but we can easily add one, as only the server-side/client-side 
> translators would need to change. 
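
As an illustration of the proposed pattern, here is a self-contained sketch 
(all type and field names are invented stand-ins for the proto-generated 
classes):

{code:java}
import java.io.IOException;

// Sketch of the proposed pattern for ScmBlockLocationProtocol: wrap every
// reply in a common response carrying a status code and message, so that
// business exceptions reach the client intact instead of being flattened
// into "message + stack trace" by Server.java. All names are invented.
public class ScmErrorPropagationSketch {
  enum Status { OK, CHILL_MODE_EXCEPTION, FAILED_TO_ALLOCATE, INTERNAL_ERROR }

  static class ScmBlockResponse {        // stand-in for a proto message
    Status status = Status.OK;
    String message = "";
    String block;                        // payload on success
  }

  static class ScmBusinessException extends IOException {
    final Status result;
    ScmBusinessException(Status result, String msg) {
      super(msg);
      this.result = result;
    }
  }

  // Server-side translator method: catch the business exception and
  // serialize it into the response object, as HDDS-1068 did for the OM.
  static ScmBlockResponse allocateBlock(long size) {
    ScmBlockResponse response = new ScmBlockResponse();
    try {
      if (size <= 0) {
        throw new ScmBusinessException(Status.FAILED_TO_ALLOCATE,
            "cannot allocate a block of size " + size);
      }
      response.block = "block-" + size;
    } catch (ScmBusinessException e) {
      response.status = e.result;        // status code, not a stack trace
      response.message = e.getMessage(); // human-readable detail
    }
    return response;
  }

  public static void main(String[] args) {
    ScmBlockResponse r = allocateBlock(-1);
    System.out.println(r.status + ": " + r.message);
  }
}
{code}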



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14431) RBF: Rename with multiple subclusters should fail if no eligible locations

2019-04-17 Thread JIRA


[ 
https://issues.apache.org/jira/browse/HDFS-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820271#comment-16820271
 ] 

Íñigo Goiri commented on HDFS-14431:


Following the discussion in HDFS-14117, this just tackles the rename operation.
It basically unifies the experience and throws an intuitive error when a rename 
is not possible.
The approach is the one described there: (1) check the source files and (2) 
rename only the ones that are possible.
[^HDFS-14431-HDFS-13891.003.patch] should be ready for review.
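
To make the approach concrete, here is a toy sketch of the eligibility check 
(invented types; the real change lives in the Router's rename path):

{code:java}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the two-step approach described above (invented types, not
// the actual Router code): (1) work out which destination subclusters also
// hold the source, (2) fail with an explicit message when none remain.
public class RenameEligibilitySketch {

  static List<String> eligibleDestinations(List<String> srcSubclusters,
                                           List<String> dstSubclusters) {
    List<String> eligible = new ArrayList<>(dstSubclusters);
    eligible.retainAll(srcSubclusters);  // a rename must stay within a subcluster
    return eligible;
  }

  static void rename(List<String> srcSubclusters, List<String> dstSubclusters)
      throws IOException {
    List<String> eligible = eligibleDestinations(srcSubclusters, dstSubclusters);
    if (eligible.isEmpty()) {
      // Clearer than the FileNotFoundException users currently get.
      throw new IOException("Rename is not possible: no eligible destination"
          + " among " + dstSubclusters + " for sources in " + srcSubclusters);
    }
    // ... perform the rename only on the eligible subclusters ...
  }
}
{code}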

> RBF: Rename with multiple subclusters should fail if no eligible locations
> --
>
> Key: HDFS-14431
> URL: https://issues.apache.org/jira/browse/HDFS-14431
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-14431-HDFS-13891.001.patch, 
> HDFS-14431-HDFS-13891.002.patch, HDFS-14431-HDFS-13891.003.patch
>
>
> Currently, the rename will fail with FileNotFoundException which is not clear 
> to the user.
> The operation should fail stating the reason is that there are no eligible 
> destinations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14431) RBF: Rename with multiple subclusters should fail if no eligible locations

2019-04-17 Thread JIRA


 [ 
https://issues.apache.org/jira/browse/HDFS-14431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-14431:
---
Attachment: HDFS-14431-HDFS-13891.003.patch

> RBF: Rename with multiple subclusters should fail if no eligible locations
> --
>
> Key: HDFS-14431
> URL: https://issues.apache.org/jira/browse/HDFS-14431
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-14431-HDFS-13891.001.patch, 
> HDFS-14431-HDFS-13891.002.patch, HDFS-14431-HDFS-13891.003.patch
>
>
> Currently, the rename will fail with FileNotFoundException which is not clear 
> to the user.
> The operation should fail stating the reason is that there are no eligible 
> destinations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1297) Fix IllegalArgumentException thrown with MiniOzoneCluster Initialization

2019-04-17 Thread Arpit Agarwal (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDDS-1297:

Attachment: HDDS-1297.05.patch

> Fix IllegalArgumentException thrown with MiniOzoneCluster Initialization
> 
>
> Key: HDDS-1297
> URL: https://issues.apache.org/jira/browse/HDDS-1297
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDDS-1297.001.patch, HDDS-1297.002.patch, 
> HDDS-1297.003.patch, HDDS-1297.004.patch, HDDS-1297.05.patch
>
>
> Fix IllegalArgumentException thrown with MiniOzoneCluster Initialization
> {code}
> java.lang.IllegalArgumentException: 30 is not within min = 500 or max = 
> 10
>   at 
> org.apache.hadoop.hdds.server.ServerUtils.sanitizeUserArgs(ServerUtils.java:66)
>   at 
> org.apache.hadoop.hdds.scm.HddsServerUtil.getStaleNodeInterval(HddsServerUtil.java:256)
>   at 
> org.apache.hadoop.hdds.scm.node.NodeStateManager.<init>(NodeStateManager.java:136)
>   at 
> org.apache.hadoop.hdds.scm.node.SCMNodeManager.<init>(SCMNodeManager.java:105)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.initalizeSystemManagers(StorageContainerManager.java:391)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:286)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:218)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:684)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:628)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.createSCM(MiniOzoneClusterImpl.java:458)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.build(MiniOzoneClusterImpl.java:392)
>   at 
> org.apache.hadoop.ozone.container.ozoneimpl.TestOzoneContainer.testBothGetandPutSmallFile(TestOzoneContainer.java:237)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1297) Fix IllegalArgumentException thrown with MiniOzoneCluster Initialization

2019-04-17 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820240#comment-16820240
 ] 

Arpit Agarwal commented on HDDS-1297:
-

[~linyiqun], sorry I missed reviewing your updated patch earlier. I attached 
patch v5, which resolves some trivial conflicts in the test case.

I am +1 on the latest patch.

> Fix IllegalArgumentException thrown with MiniOzoneCluster Initialization
> 
>
> Key: HDDS-1297
> URL: https://issues.apache.org/jira/browse/HDDS-1297
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDDS-1297.001.patch, HDDS-1297.002.patch, 
> HDDS-1297.003.patch, HDDS-1297.004.patch, HDDS-1297.05.patch
>
>
> Fix IllegalArgumentException thrown with MiniOzoneCluster Initialization
> {code}
> java.lang.IllegalArgumentException: 30 is not within min = 500 or max = 
> 10
>   at 
> org.apache.hadoop.hdds.server.ServerUtils.sanitizeUserArgs(ServerUtils.java:66)
>   at 
> org.apache.hadoop.hdds.scm.HddsServerUtil.getStaleNodeInterval(HddsServerUtil.java:256)
>   at 
> org.apache.hadoop.hdds.scm.node.NodeStateManager.<init>(NodeStateManager.java:136)
>   at 
> org.apache.hadoop.hdds.scm.node.SCMNodeManager.<init>(SCMNodeManager.java:105)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.initalizeSystemManagers(StorageContainerManager.java:391)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:286)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:218)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:684)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:628)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.createSCM(MiniOzoneClusterImpl.java:458)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.build(MiniOzoneClusterImpl.java:392)
>   at 
> org.apache.hadoop.ozone.container.ozoneimpl.TestOzoneContainer.testBothGetandPutSmallFile(TestOzoneContainer.java:237)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1297) Fix IllegalArgumentException thrown with MiniOzoneCluster Initialization

2019-04-17 Thread Arpit Agarwal (JIRA)


[ 
https://issues.apache.org/jira/browse/HDDS-1297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820219#comment-16820219
 ] 

Arpit Agarwal commented on HDDS-1297:
-

The idea looks good to me. Thanks for creating the pull request [~elek]!

The patch needs to be rebased onto trunk. Also, a nitpick: I'd update the 
javadoc to better explain what it is doing, since it is not obvious.

E.g. 
{code}
  /**
   * Checks that a config setting (key) that depends on another setting
   * (basekey) is within a reasonable multiple of the basekey. If the value
   * is outside the range, then limit it to the range and return the capped
   * value.
   *
{code}
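
For illustration, the clamping behaviour that javadoc describes could look 
roughly like this (a sketch under the stated assumption that out-of-range 
values are capped rather than rejected; not the actual ServerUtils code):

{code:java}
// Sketch of a clamping sanitizeUserArgs: keep the dependent setting within
// [base * minMultiple, base * maxMultiple] and return the capped value
// instead of throwing an IllegalArgumentException.
static long sanitizeUserArgs(long value, long base,
                             long minMultiple, long maxMultiple) {
  long min = base * minMultiple;
  long max = base * maxMultiple;
  return Math.max(min, Math.min(max, value));
}
{code}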



> Fix IllegalArgumentException thrown with MiniOzoneCluster Initialization
> 
>
> Key: HDDS-1297
> URL: https://issues.apache.org/jira/browse/HDDS-1297
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 0.3.0
>Reporter: Mukul Kumar Singh
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDDS-1297.001.patch, HDDS-1297.002.patch, 
> HDDS-1297.003.patch, HDDS-1297.004.patch
>
>
> Fix IllegalArgumentException thrown with MiniOzoneCluster Initialization
> {code}
> java.lang.IllegalArgumentException: 30 is not within min = 500 or max = 
> 10
>   at 
> org.apache.hadoop.hdds.server.ServerUtils.sanitizeUserArgs(ServerUtils.java:66)
>   at 
> org.apache.hadoop.hdds.scm.HddsServerUtil.getStaleNodeInterval(HddsServerUtil.java:256)
>   at 
> org.apache.hadoop.hdds.scm.node.NodeStateManager.<init>(NodeStateManager.java:136)
>   at 
> org.apache.hadoop.hdds.scm.node.SCMNodeManager.<init>(SCMNodeManager.java:105)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.initalizeSystemManagers(StorageContainerManager.java:391)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:286)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:218)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:684)
>   at 
> org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:628)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.createSCM(MiniOzoneClusterImpl.java:458)
>   at 
> org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.build(MiniOzoneClusterImpl.java:392)
>   at 
> org.apache.hadoop.ozone.container.ozoneimpl.TestOzoneContainer.testBothGetandPutSmallFile(TestOzoneContainer.java:237)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work logged] (HDDS-1301) Optimize recursive ozone filesystem apis

2019-04-17 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-1301?focusedWorklogId=229170=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-229170
 ]

ASF GitHub Bot logged work on HDDS-1301:


Author: ASF GitHub Bot
Created on: 17/Apr/19 15:41
Start Date: 17/Apr/19 15:41
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on issue #718: HDDS-1301. 
Optimize recursive ozone filesystem apis
URL: https://github.com/apache/hadoop/pull/718#issuecomment-484144792
 
 
   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime | Comment |
   |::|--:|:|:|
   | 0 | reexec | 57 | Docker mode activated. |
   ||| _ Prechecks _ |
   | +1 | @author | 0 | The patch does not contain any @author tags. |
   | +1 | test4tests | 0 | The patch appears to include 2 new or modified test 
files. |
   ||| _ trunk Compile Tests _ |
   | 0 | mvndep | 70 | Maven dependency ordering for branch |
   | +1 | mvninstall | 1077 | trunk passed |
   | +1 | compile | 1063 | trunk passed |
   | +1 | checkstyle | 138 | trunk passed |
   | -1 | mvnsite | 52 | client in trunk failed. |
   | +1 | shadedclient | 1127 | branch has no errors when building and testing 
our client artifacts. |
   | -1 | findbugs | 33 | client in trunk failed. |
   | +1 | javadoc | 186 | trunk passed |
   ||| _ Patch Compile Tests _ |
   | 0 | mvndep | 22 | Maven dependency ordering for patch |
   | -1 | mvninstall | 19 | client in the patch failed. |
   | -1 | mvninstall | 18 | ozonefs in the patch failed. |
   | +1 | compile | 1035 | the patch passed |
   | +1 | cc | 1035 | the patch passed |
   | +1 | javac | 1035 | the patch passed |
   | +1 | checkstyle | 151 | the patch passed |
   | -1 | mvnsite | 35 | ozonefs in the patch failed. |
   | +1 | whitespace | 0 | The patch has no whitespace issues. |
   | +1 | shadedclient | 684 | patch has no errors when building and testing 
our client artifacts. |
   | -1 | findbugs | 33 | ozonefs in the patch failed. |
   | +1 | javadoc | 181 | the patch passed |
   ||| _ Other Tests _ |
   | +1 | unit | 95 | common in the patch passed. |
   | +1 | unit | 36 | client in the patch passed. |
   | -1 | unit | 44 | common in the patch failed. |
   | -1 | unit | 75 | ozone-manager in the patch failed. |
   | -1 | unit | 32 | ozonefs in the patch failed. |
   | +1 | asflicense | 44 | The patch does not generate ASF License warnings. |
   | | | 7183 | |
   
   
   | Reason | Tests |
   |---:|:--|
   | Failed junit tests | hadoop.ozone.om.exceptions.TestResultCodes |
   |   | hadoop.ozone.om.ratis.TestOzoneManagerRatisServer |
   
   
   | Subsystem | Report/Notes |
   |--:|:-|
   | Docker | Client=17.05.0-ce Server=17.05.0-ce base: 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/artifact/out/Dockerfile
 |
   | GITHUB PR | https://github.com/apache/hadoop/pull/718 |
   | Optional Tests |  dupname  asflicense  compile  javac  javadoc  mvninstall 
 mvnsite  unit  shadedclient  findbugs  checkstyle  cc  |
   | uname | Linux 18bc982ec5df 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 
17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | personality/hadoop.sh |
   | git revision | trunk / 13907d8 |
   | maven | version: Apache Maven 3.3.9 |
   | Default Java | 1.8.0_191 |
   | mvnsite | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/artifact/out/branch-mvnsite-hadoop-ozone_client.txt
 |
   | findbugs | v3.1.0-RC1 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/artifact/out/branch-findbugs-hadoop-ozone_client.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/artifact/out/patch-mvninstall-hadoop-ozone_client.txt
 |
   | mvninstall | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/artifact/out/patch-mvninstall-hadoop-ozone_ozonefs.txt
 |
   | mvnsite | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/artifact/out/patch-mvnsite-hadoop-ozone_ozonefs.txt
 |
   | findbugs | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/artifact/out/patch-findbugs-hadoop-ozone_ozonefs.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/artifact/out/patch-unit-hadoop-ozone_common.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/artifact/out/patch-unit-hadoop-ozone_ozone-manager.txt
 |
   | unit | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/artifact/out/patch-unit-hadoop-ozone_ozonefs.txt
 |
   |  Test Results | 
https://builds.apache.org/job/hadoop-multibranch/job/PR-718/2/testReport/ |
   | Max. process+thread count | 414 (vs. ulimit of 5500) |
   | modules | C: hadoop-hdds/common hadoop-ozone/client hadoop-ozone/common 
hadoop-ozone/ozone-manager hadoop-ozone/ozonefs U: . |
   | Console output | 

[jira] [Commented] (HDFS-10659) Namenode crashes after Journalnode re-installation in an HA cluster due to missing paxos directory

2019-04-17 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-10659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820206#comment-16820206
 ] 

star commented on HDFS-10659:
-

[~jojochuang], would you like to review the patch?

> Namenode crashes after Journalnode re-installation in an HA cluster due to 
> missing paxos directory
> --
>
> Key: HDFS-10659
> URL: https://issues.apache.org/jira/browse/HDFS-10659
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ha, journal-node
>Affects Versions: 2.7.0
>Reporter: Amit Anand
>Assignee: star
>Priority: Major
> Attachments: HDFS-10659.000.patch, HDFS-10659.001.patch, 
> HDFS-10659.002.patch, HDFS-10659.003.patch, HDFS-10659.004.patch, 
> HDFS-10659.005.patch, HDFS-10659.006.patch
>
>
> In my environment I am seeing {{Namenodes}} crashing after a majority of 
> {{Journalnodes}} are re-installed. We manage multiple clusters and do rolling 
> upgrades followed by a rolling re-install of each node, including master (NN, 
> JN, RM, ZK) nodes. When a journal node is re-installed or moved to a new 
> disk/host, instead of running the {{"initializeSharedEdits"}} command, I copy 
> the {{VERSION}} file from one of the other {{Journalnodes}}, which allows my 
> {{NN}} to start writing data to the newly installed {{Journalnode}}.
> To achieve quorum for the JNs and recover unfinalized segments, the NN during 
> startup creates .tmp files under the {{"/jn/current/paxos"}} directory. In 
> the current implementation, the "paxos" directory is only created during the 
> {{"initializeSharedEdits"}} command, and if a JN is re-installed the "paxos" 
> directory is not created upon JN startup or by the NN while writing .tmp 
> files, which causes the NN to crash with the following error message:
> {code}
> 192.168.100.16:8485: /disk/1/dfs/jn/Test-Laptop/current/paxos/64044.tmp (No 
> such file or directory)
> at java.io.FileOutputStream.open(Native Method)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:221)
> at java.io.FileOutputStream.<init>(FileOutputStream.java:171)
> at 
> org.apache.hadoop.hdfs.util.AtomicFileOutputStream.<init>(AtomicFileOutputStream.java:58)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.persistPaxosData(Journal.java:971)
> at 
> org.apache.hadoop.hdfs.qjournal.server.Journal.acceptRecovery(Journal.java:846)
> at 
> org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.acceptRecovery(JournalNodeRpcServer.java:205)
> at 
> org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.acceptRecovery(QJournalProtocolServerSideTranslatorPB.java:249)
> at 
> org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25435)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)
> {code}
> The current 
> [getPaxosFile|https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JNStorage.java#L128-L130]
>  method simply returns a path to a file under the "paxos" directory without 
> verifying its existence. Since the "paxos" directory holds files that are 
> required for NN recovery and for achieving JN quorum, my proposed solution is 
> to add a check to the "getPaxosFile" method and create the {{"paxos"}} 
> directory if it is missing.
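
A sketch of the proposed fix (assuming hypothetical getCurrentDir() and LOG 
helpers; the real method lives in JNStorage):

{code:java}
// Sketch of the proposed change to JNStorage#getPaxosFile: make sure the
// "paxos" directory exists before handing back a path inside it, so a
// re-installed JN does not crash the NN during recovery.
File getPaxosFile(long segmentTxId) {
  File paxosDir = new File(getCurrentDir(), "paxos");
  if (!paxosDir.exists() && !paxosDir.mkdirs()) {   // the proposed addition
    LOG.warn("Could not create missing paxos directory " + paxosDir);
  }
  return new File(paxosDir, String.valueOf(segmentTxId));
}
{code}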



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-14378) Simplify the design of multiple NN and both logic of edit log roll and checkpoint

2019-04-17 Thread star (JIRA)


[ 
https://issues.apache.org/jira/browse/HDFS-14378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820196#comment-16820196
 ] 

star commented on HDFS-14378:
-

Would anyone like to review the patch? It would be appreciated.

> Simplify the design of multiple NN and both logic of edit log roll and 
> checkpoint
> -
>
> Key: HDFS-14378
> URL: https://issues.apache.org/jira/browse/HDFS-14378
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: namenode
>Affects Versions: 3.1.2
>Reporter: star
>Assignee: star
>Priority: Minor
>  Labels: patch
> Attachments: HDFS-14378-trunk.001.patch, HDFS-14378-trunk.002.patch, 
> HDFS-14378-trunk.003.patch, HDFS-14378-trunk.004.patch
>
>
>       HDFS-6440 introduced a mechanism to support more than 2 NNs. It 
> implements a first-writer-wins policy to avoid duplicated fsimage downloading. 
> The variable 'isPrimaryCheckPointer' holds the first-writer state, with which 
> the SNN will provide the fsimage for the ANN next time. So we have three roles 
> in the NN cluster: the ANN, one primary SNN, and one or more normal SNNs.
>       Since HDFS-12248, there may be more than two primary SNNs shortly after 
> an exception occurs. It handles a scenario in which the SNN will not upload 
> the fsimage on IOE and interrupted exceptions. Though this will not cause any 
> further functional issues, it is inconsistent. 
>       Furthermore, the edit log may be rolled more frequently than necessary 
> with multiple standby name nodes, HDFS-14349. (I'm not so sure about this; I 
> will verify it with unit tests, or anyone could point it out.)
>       Given all this, I'm wondering if we could make it simple with the 
> following changes:
>  * There are only two roles: ANN and SNN.
>  * The ANN will roll its edit log every DFS_HA_LOGROLL_PERIOD_KEY period.
>  * The ANN will select an SNN to download the checkpoint from.
> The SNN will just do log tailing and checkpointing, and will provide a servlet 
> for fsimage downloading as normal. The SNN will not try to roll the edit log 
> or send checkpoint requests to the ANN.
> In a word, the ANN will be more active. Suggestions are welcomed.
>  
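
A schematic of the proposed role split (entirely hypothetical helper names, 
just to make the idea concrete):

{code:java}
// Schematic of the proposal: the active NN owns both the periodic edit-log
// roll and the choice of which standby to pull a checkpoint from, so
// standbys never send roll or checkpoint requests. Helper names invented.
void activeNameNodeLoop() throws IOException, InterruptedException {
  long rollPeriodMs = getLogRollPeriodMs();    // DFS_HA_LOGROLL_PERIOD_KEY
  while (isActive()) {
    rollEditLog();                             // ANN rolls on its own schedule
    String standby = selectStandbyForCheckpoint();  // pick one SNN
    if (standby != null) {
      downloadCheckpointFrom(standby);         // via the SNN's fsimage servlet
    }
    Thread.sleep(rollPeriodMs);
  }
}
{code}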



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


