[jira] [Commented] (HDDS-4240) Ozone support append operation

2020-10-12 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212171#comment-17212171
 ] 

mingchao zhao commented on HDDS-4240:
-

Append will involve quite a lot of changes, and it is not easy to handle the work as sub-tasks. A separate plan, HDDS-4333, has been filed for this.

> Ozone support append operation
> --
>
> Key: HDDS-4240
> URL: https://issues.apache.org/jira/browse/HDDS-4240
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-4240) Ozone support append operation

2020-10-12 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212171#comment-17212171
 ] 

mingchao zhao edited comment on HDDS-4240 at 10/12/20, 7:09 AM:


Append is going to involve a lot of changes, and it is not easy to break the 
work out into subtasks. A separate plan, HDDS-4333, is listed here.


was (Author: micahzhao):
Append will involve quite a lot of changes, and it is not easy to handle the work as sub-tasks. A separate plan, HDDS-4333, has been filed for this.

> Ozone support append operation
> --
>
> Key: HDDS-4240
> URL: https://issues.apache.org/jira/browse/HDDS-4240
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: runzhiwang
>Assignee: runzhiwang
>Priority: Major
>







[jira] [Updated] (HDDS-4333) Ozone supports append operation

2020-10-12 Thread mingchao zhao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mingchao zhao updated HDDS-4333:

Summary: Ozone supports append operation  (was: Ozone supports append 
writes)

> Ozone supports append operation
> ---
>
> Key: HDDS-4333
> URL: https://issues.apache.org/jira/browse/HDDS-4333
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: mingchao zhao
>Priority: Major
>
> Currently HDDS does not support append operations on existing data. We have 
> this need in production, so we need HDDS to support this feature.






[jira] [Issue Comment Deleted] (HDDS-4308) Fix issue with quota update

2020-10-12 Thread mingchao zhao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mingchao zhao updated HDDS-4308:

Comment: was deleted

(was: Hi [~bharat] Can we keep updating the usedBytes of volumeArgs in memory 
and persist it during takeSnapshot (currently, this action is executed 
periodically)? 
Just want to make sure it's feasible. I am not familiar with snapshots and 
Ratis. If feasible, I am very willing to discuss here.)

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs is fetched with getCacheValue and the same object is put 
> into the doubleBuffer; this might cause an issue.
> Let's take the below scenario:
> Initial VolumeArgs quotaBytes -> 10000
> 1. T1 -> Update VolumeArgs, subtracting 1000, and put this updated 
> volumeArgs into the DoubleBuffer.
> 2. T2 -> Update VolumeArgs, subtracting 2000; this has not yet been flushed 
> to the double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked up by the double buffer and, when it commits, because the 
> cached object itself was put into the doubleBuffer, it flushes the value 
> already updated by T2 (as it is a cached object) and writes bytesUsed as 
> 7000 to the DB.
> Now the OM is restarted, and the DB only has transactions up to T1. (We get 
> this info from the TransactionInfo 
> Table: https://issues.apache.org/jira/browse/HDDS-3685.)
> Now T2 is replayed; as it was not committed to the DB, 2000 is subtracted 
> again, and the DB ends up with 5000.
> But after T2, the value should be 7000, so we have the DB in an incorrect state.
> Issue here:
> 1. Because we use a cached object and put the same cached object into the 
> double buffer, this kind of issue can occur.
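The aliasing described in the quoted scenario can be sketched in a few lines of Java. This is a minimal illustration only: VolumeArgs here is a simplified stand-in for the real cached object (not the actual Ozone Manager class), and the 10000-byte starting value is assumed so the 1000 and 2000 subtractions end at 7000 as in the scenario.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class CacheAliasingSketch {
    // Simplified stand-in for the cached volume-args object (hypothetical).
    static class VolumeArgs {
        long usedBytes;
        VolumeArgs(long usedBytes) { this.usedBytes = usedBytes; }
        VolumeArgs copy() { return new VolumeArgs(usedBytes); }
    }

    // Returns the value T1's flush would write to the DB.
    static long simulate() {
        VolumeArgs cached = new VolumeArgs(10000);      // assumed initial value
        Queue<VolumeArgs> doubleBuffer = new ArrayDeque<>();

        // T1: subtract 1000 and enqueue the cached object itself (no copy).
        cached.usedBytes -= 1000;
        doubleBuffer.add(cached);                       // the aliasing bug

        // T2: subtract 2000 before T1 has been flushed.
        cached.usedBytes -= 2000;

        // T1's flush now already contains T2's change: 7000 instead of 9000.
        return doubleBuffer.poll().usedBytes;
    }

    public static void main(String[] args) {
        System.out.println(simulate()); // prints 7000
    }
}
```

A defensive copy at enqueue time (doubleBuffer.add(cached.copy())) would keep T1's flush at 9000 and let T2's change reach the DB through its own flush.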






[jira] [Updated] (HDDS-4333) Ozone supports append writes

2020-10-12 Thread mingchao zhao (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mingchao zhao updated HDDS-4333:

Summary: Ozone supports append writes  (was: HDDS supports append writes)

> Ozone supports append writes
> 
>
> Key: HDDS-4333
> URL: https://issues.apache.org/jira/browse/HDDS-4333
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>Reporter: mingchao zhao
>Priority: Major
>
> Currently HDDS does not support append operations on existing data. We have 
> this need in production, so we need HDDS to support this feature.






[jira] [Comment Edited] (HDDS-4308) Fix issue with quota update

2020-10-12 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212162#comment-17212162
 ] 

mingchao zhao edited comment on HDDS-4308 at 10/12/20, 7:01 AM:


cc [~amaliujia], HDDS-4272 and HDDS-4277 will also have this problem; it needs 
attention here.


was (Author: micahzhao):
cc [~amaliujia], HDDS-4272 and HDDS-4277 will also have this problem, need to 
pay attention to the 
 here.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs is fetched with getCacheValue and the same object is put 
> into the doubleBuffer; this might cause an issue.
> Let's take the below scenario:
> Initial VolumeArgs quotaBytes -> 10000
> 1. T1 -> Update VolumeArgs, subtracting 1000, and put this updated 
> volumeArgs into the DoubleBuffer.
> 2. T2 -> Update VolumeArgs, subtracting 2000; this has not yet been flushed 
> to the double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked up by the double buffer and, when it commits, because the 
> cached object itself was put into the doubleBuffer, it flushes the value 
> already updated by T2 (as it is a cached object) and writes bytesUsed as 
> 7000 to the DB.
> Now the OM is restarted, and the DB only has transactions up to T1. (We get 
> this info from the TransactionInfo 
> Table: https://issues.apache.org/jira/browse/HDDS-3685.)
> Now T2 is replayed; as it was not committed to the DB, 2000 is subtracted 
> again, and the DB ends up with 5000.
> But after T2, the value should be 7000, so we have the DB in an incorrect state.
> Issue here:
> 1. Because we use a cached object and put the same cached object into the 
> double buffer, this kind of issue can occur.






[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-12 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212162#comment-17212162
 ] 

mingchao zhao commented on HDDS-4308:
-

cc [~amaliujia], HDDS-4272 and HDDS-4277 will also have this problem; it needs 
attention here.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs is fetched with getCacheValue and the same object is put 
> into the doubleBuffer; this might cause an issue.
> Let's take the below scenario:
> Initial VolumeArgs quotaBytes -> 10000
> 1. T1 -> Update VolumeArgs, subtracting 1000, and put this updated 
> volumeArgs into the DoubleBuffer.
> 2. T2 -> Update VolumeArgs, subtracting 2000; this has not yet been flushed 
> to the double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked up by the double buffer and, when it commits, because the 
> cached object itself was put into the doubleBuffer, it flushes the value 
> already updated by T2 (as it is a cached object) and writes bytesUsed as 
> 7000 to the DB.
> Now the OM is restarted, and the DB only has transactions up to T1. (We get 
> this info from the TransactionInfo 
> Table: https://issues.apache.org/jira/browse/HDDS-3685.)
> Now T2 is replayed; as it was not committed to the DB, 2000 is subtracted 
> again, and the DB ends up with 5000.
> But after T2, the value should be 7000, so we have the DB in an incorrect state.
> Issue here:
> 1. Because we use a cached object and put the same cached object into the 
> double buffer, this kind of issue can occur.






[jira] [Created] (HDDS-4336) ContainerInfo does not persist BCSID leading to failed replicas reports

2020-10-12 Thread Stephen O'Donnell (Jira)
Stephen O'Donnell created HDDS-4336:
---

 Summary: ContainerInfo does not persist BCSID leading to failed 
replicas reports
 Key: HDDS-4336
 URL: https://issues.apache.org/jira/browse/HDDS-4336
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: SCM
Affects Versions: 1.1.0
Reporter: Stephen O'Donnell
Assignee: Stephen O'Donnell


If you create a container, and then close it, the BCSID is synced on the 
datanodes and then the value is updated in SCM via setting the "sequenceID" 
field on the containerInfo object for the container.

If you later restart just SCM, the sequenceID becomes null, and then container 
reports for the replica fail with a stack trace like:

{code}
Exception in thread "EventQueue-ContainerReportForContainerReportHandler" 
java.lang.AssertionError
at 
org.apache.hadoop.hdds.scm.container.ContainerInfo.updateSequenceId(ContainerInfo.java:176)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.updateContainerStats(AbstractContainerReportHandler.java:108)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:83)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:162)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:130)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:50)
at 
org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}

The assertion here is what is failing, as it does not allow for the sequenceID 
to be changed on a CLOSED container:

{code}
  public void updateSequenceId(long sequenceID) {
assert (isOpen() || state == HddsProtos.LifeCycleState.QUASI_CLOSED);
sequenceId = max(sequenceID, sequenceId);
  }
{code}

The issue seems to be caused by the serialisation and deserialisation of the 
containerInfo object to protobuf, as the sequenceId is never persisted or restored.

However, I am also confused about how this ever worked, as this is a pretty 
significant problem.
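As an illustration of how a field missed during protobuf conversion produces exactly this symptom, here is a hedged sketch. The classes below are simplified stand-ins, not the real ContainerInfo or its generated protobuf message, and the field and method names are assumed for the example: a numeric proto field that is never set stays at its default of 0, so the BCSID is silently lost across the persist/reload round trip.

```java
public class SequenceIdRoundTrip {
    // Simplified stand-in for ContainerInfo (hypothetical fields).
    static class ContainerInfo {
        final long sequenceId;
        final String state;
        ContainerInfo(long sequenceId, String state) {
            this.sequenceId = sequenceId;
            this.state = state;
        }
    }

    // Stand-in for the generated protobuf message; numeric fields default to 0.
    static class ContainerProto {
        long sequenceId;
        String state;
    }

    static ContainerProto toProto(ContainerInfo info) {
        ContainerProto p = new ContainerProto();
        p.state = info.state;
        // The kind of bug described in the report: sequenceId is never
        // copied here, so it stays at the default value 0.
        return p;
    }

    static ContainerInfo fromProto(ContainerProto p) {
        return new ContainerInfo(p.sequenceId, p.state);
    }

    // Simulates SCM persisting a container and reloading it after restart.
    static long persistAndReload(long bcsid) {
        return fromProto(toProto(new ContainerInfo(bcsid, "CLOSED"))).sequenceId;
    }

    public static void main(String[] args) {
        System.out.println(persistAndReload(12345)); // prints 0: BCSID lost
    }
}
```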








[GitHub] [hadoop-ozone] sodonnel opened a new pull request #1488: HDDS-4336. ContainerInfo does not persist BCSID (sequenceId) leading to failed replicas reports

2020-10-12 Thread GitBox


sodonnel opened a new pull request #1488:
URL: https://github.com/apache/hadoop-ozone/pull/1488


   ## What changes were proposed in this pull request?
   
   If you create a container, and then close it, the BCSID is synced on the 
datanodes and then the value is updated in SCM via setting the "sequenceID" 
field on the containerInfo object for the container.
   
   If you later restart just SCM, the sequenceID becomes zero, and then 
container reports for the replica fail with a stack trace like:
   
   ```
   Exception in thread "EventQueue-ContainerReportForContainerReportHandler" 
java.lang.AssertionError
at 
org.apache.hadoop.hdds.scm.container.ContainerInfo.updateSequenceId(ContainerInfo.java:176)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.updateContainerStats(AbstractContainerReportHandler.java:108)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:83)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:162)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:130)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:50)
at 
org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
   ```
   
   The assertion here is failing, as it does not allow for the sequenceID to be 
changed on a CLOSED container:
   
   ```
 public void updateSequenceId(long sequenceID) {
   assert (isOpen() || state == HddsProtos.LifeCycleState.QUASI_CLOSED);
   sequenceId = max(sequenceID, sequenceId);
 }
   ```
   
   The issue seems to be caused by the serialisation and deserialisation of the 
containerInfo object to protobuf, as the sequenceId is never persisted or restored.
   
   However, I am also confused about how this ever worked, as this is a pretty 
significant problem.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4336
   
   ## How was this patch tested?
   
   New integration test to reproduce the issue before fixing it.
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[jira] [Updated] (HDDS-4336) ContainerInfo does not persist BCSID leading to failed replicas reports

2020-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4336:
-
Labels: pull-request-available  (was: )

> ContainerInfo does not persist BCSID leading to failed replicas reports
> ---
>
> Key: HDDS-4336
> URL: https://issues.apache.org/jira/browse/HDDS-4336
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 1.1.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>
> If you create a container, and then close it, the BCSID is synced on the 
> datanodes and then the value is updated in SCM via setting the "sequenceID" 
> field on the containerInfo object for the container.
> If you later restart just SCM, the sequenceID becomes zero, and then 
> container reports for the replica fail with a stack trace like:
> {code}
> Exception in thread "EventQueue-ContainerReportForContainerReportHandler" 
> java.lang.AssertionError
>   at 
> org.apache.hadoop.hdds.scm.container.ContainerInfo.updateSequenceId(ContainerInfo.java:176)
>   at 
> org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.updateContainerStats(AbstractContainerReportHandler.java:108)
>   at 
> org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:83)
>   at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:162)
>   at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:130)
>   at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:50)
>   at 
> org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> The assertion here is failing, as it does not allow for the sequenceID to be 
> changed on a CLOSED container:
> {code}
>   public void updateSequenceId(long sequenceID) {
> assert (isOpen() || state == HddsProtos.LifeCycleState.QUASI_CLOSED);
> sequenceId = max(sequenceID, sequenceId);
>   }
> {code}
> The issue seems to be caused by the serialisation and deserialisation of the 
> containerInfo object to protobuf, as the sequenceId is never persisted or restored.
> However, I am also confused about how this ever worked, as this is a pretty 
> significant problem.






[jira] [Created] (HDDS-4337) Implement RocksDB options cache for new datanode DB utilities

2020-10-12 Thread Ethan Rose (Jira)
Ethan Rose created HDDS-4337:


 Summary: Implement RocksDB options cache for new datanode DB 
utilities
 Key: HDDS-4337
 URL: https://issues.apache.org/jira/browse/HDDS-4337
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Reporter: Ethan Rose


HDDS-3869 switched datanodes from using the old database utilities found in the 
hdds.utils package to the new database utilities in the hdds.utils.db package. 
Since the datanode RocksDB options cache from HDDS-2283 was implemented in the 
old utilities package, it is no longer present on the datanodes after HDDS-3869 
was merged. This issue aims to add the options cache performance improvement to 
the new datanode code.






[GitHub] [hadoop-ozone] GlenGeng commented on pull request #1371: HDDS-2922. Balance ratis leader distribution in datanodes

2020-10-12 Thread GitBox


GlenGeng commented on pull request #1371:
URL: https://github.com/apache/hadoop-ozone/pull/1371#issuecomment-707094391


   +1
   Thanks for the work! LGTM






[jira] [Updated] (HDDS-4336) ContainerInfo does not persist BCSID leading to failed replicas reports

2020-10-12 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDDS-4336:

Description: 
If you create a container, and then close it, the BCSID is synced on the 
datanodes and then the value is updated in SCM via setting the "sequenceID" 
field on the containerInfo object for the container.

If you later restart just SCM, the sequenceID becomes zero, and then container 
reports for the replica fail with a stack trace like:

{code}
Exception in thread "EventQueue-ContainerReportForContainerReportHandler" 
java.lang.AssertionError
at 
org.apache.hadoop.hdds.scm.container.ContainerInfo.updateSequenceId(ContainerInfo.java:176)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.updateContainerStats(AbstractContainerReportHandler.java:108)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:83)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:162)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:130)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:50)
at 
org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}

The assertion here is what is failing, as it does not allow for the sequenceID 
to be changed on a CLOSED container:

{code}
  public void updateSequenceId(long sequenceID) {
assert (isOpen() || state == HddsProtos.LifeCycleState.QUASI_CLOSED);
sequenceId = max(sequenceID, sequenceId);
  }
{code}

The issue seems to be caused by the serialisation and deserialisation of the 
containerInfo object to protobuf, as the sequenceId is never persisted or restored.

However, I am also confused about how this ever worked, as this is a pretty 
significant problem.



  was:
If you create a container, and then close it, the BCSID is synced on the 
datanodes and then the value is updated in SCM via setting the "sequenceID" 
field on the containerInfo object for the container.

If you later restart just SCM, the sequenceID becomes null, and then container 
reports for the replica fail with a stack trace like:

{code}
Exception in thread "EventQueue-ContainerReportForContainerReportHandler" 
java.lang.AssertionError
at 
org.apache.hadoop.hdds.scm.container.ContainerInfo.updateSequenceId(ContainerInfo.java:176)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.updateContainerStats(AbstractContainerReportHandler.java:108)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:83)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:162)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:130)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:50)
at 
org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}

The assertion here is what is failing, as it does not allow for the sequenceID 
to be changed on a CLOSED container:

{code}
  public void updateSequenceId(long sequenceID) {
assert (isOpen() || state == HddsProtos.LifeCycleState.QUASI_CLOSED);
sequenceId = max(sequenceID, sequenceId);
  }
{code}

The issue seems to be caused by the serialisation and deserialisation of the 
containerInfo object to protobuf, as the sequenceId is never persisted or restored.

However, I am also confused about how this ever worked, as this is a pretty 
significant problem.




> ContainerInfo does not persist BCSID leading to failed replicas reports
> ---
>
> Key: HDDS-4336
> URL: https://issues.apache.org/jira/browse/HDDS-4336
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 1.1.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>
> If you create a 

[jira] [Updated] (HDDS-4336) ContainerInfo does not persist BCSID leading to failed replicas reports

2020-10-12 Thread Stephen O'Donnell (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen O'Donnell updated HDDS-4336:

Description: 
If you create a container, and then close it, the BCSID is synced on the 
datanodes and then the value is updated in SCM via setting the "sequenceID" 
field on the containerInfo object for the container.

If you later restart just SCM, the sequenceID becomes zero, and then container 
reports for the replica fail with a stack trace like:

{code}
Exception in thread "EventQueue-ContainerReportForContainerReportHandler" 
java.lang.AssertionError
at 
org.apache.hadoop.hdds.scm.container.ContainerInfo.updateSequenceId(ContainerInfo.java:176)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.updateContainerStats(AbstractContainerReportHandler.java:108)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:83)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:162)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:130)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:50)
at 
org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}

The assertion here is failing, as it does not allow for the sequenceID to be 
changed on a CLOSED container:

{code}
  public void updateSequenceId(long sequenceID) {
assert (isOpen() || state == HddsProtos.LifeCycleState.QUASI_CLOSED);
sequenceId = max(sequenceID, sequenceId);
  }
{code}

The issue seems to be caused by the serialisation and deserialisation of the 
containerInfo object to protobuf, as the sequenceId is never persisted or restored.

However, I am also confused about how this ever worked, as this is a pretty 
significant problem.



  was:
If you create a container, and then close it, the BCSID is synced on the 
datanodes and then the value is updated in SCM via setting the "sequenceID" 
field on the containerInfo object for the container.

If you later restart just SCM, the sequenceID becomes zero, and then container 
reports for the replica fail with a stack trace like:

{code}
Exception in thread "EventQueue-ContainerReportForContainerReportHandler" 
java.lang.AssertionError
at 
org.apache.hadoop.hdds.scm.container.ContainerInfo.updateSequenceId(ContainerInfo.java:176)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.updateContainerStats(AbstractContainerReportHandler.java:108)
at 
org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:83)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:162)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:130)
at 
org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:50)
at 
org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
{code}

The assertion here is what is failing, as it does not allow for the sequenceID 
to be changed on a CLOSED container:

{code}
  public void updateSequenceId(long sequenceID) {
assert (isOpen() || state == HddsProtos.LifeCycleState.QUASI_CLOSED);
sequenceId = max(sequenceID, sequenceId);
  }
{code}

The issue seems to be caused by the serialisation and deserialisation of the 
containerInfo object to protobuf, as the sequenceId is never persisted or restored.

However, I am also confused about how this ever worked, as this is a pretty 
significant problem.




> ContainerInfo does not persist BCSID leading to failed replicas reports
> ---
>
> Key: HDDS-4336
> URL: https://issues.apache.org/jira/browse/HDDS-4336
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 1.1.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>
> If you create a container, and 

[GitHub] [hadoop-ozone] ChenSammi commented on pull request #1445: HDDS-4272. Volume namespace: add usedNamespace and update it when create and delete bucket

2020-10-12 Thread GitBox


ChenSammi commented on pull request #1445:
URL: https://github.com/apache/hadoop-ozone/pull/1445#issuecomment-707100115


   @amaliujia, could you add a new UT for the bucket link case? Linked buckets 
should not be counted in the namespace quota.






[jira] [Updated] (HDDS-4336) ContainerInfo does not persist BCSID leading to failed replicas reports

2020-10-12 Thread Nanda kumar (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDDS-4336:
--
Status: Patch Available  (was: Open)

> ContainerInfo does not persist BCSID leading to failed replicas reports
> ---
>
> Key: HDDS-4336
> URL: https://issues.apache.org/jira/browse/HDDS-4336
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: SCM
>Affects Versions: 1.1.0
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
>  Labels: pull-request-available
>
> If you create a container, and then close it, the BCSID is synced on the 
> datanodes and then the value is updated in SCM via setting the "sequenceID" 
> field on the containerInfo object for the container.
> If you later restart just SCM, the sequenceID becomes zero, and then 
> container reports for the replica fail with a stack trace like:
> {code}
> Exception in thread "EventQueue-ContainerReportForContainerReportHandler" 
> java.lang.AssertionError
>   at 
> org.apache.hadoop.hdds.scm.container.ContainerInfo.updateSequenceId(ContainerInfo.java:176)
>   at 
> org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.updateContainerStats(AbstractContainerReportHandler.java:108)
>   at 
> org.apache.hadoop.hdds.scm.container.AbstractContainerReportHandler.processContainerReplica(AbstractContainerReportHandler.java:83)
>   at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.processContainerReplicas(ContainerReportHandler.java:162)
>   at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:130)
>   at 
> org.apache.hadoop.hdds.scm.container.ContainerReportHandler.onMessage(ContainerReportHandler.java:50)
>   at 
> org.apache.hadoop.hdds.server.events.SingleThreadExecutor.lambda$onMessage$1(SingleThreadExecutor.java:81)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> {code}
> The assertion here is failing, as it does not allow for the sequenceID to be 
> changed on a CLOSED container:
> {code}
>   public void updateSequenceId(long sequenceID) {
> assert (isOpen() || state == HddsProtos.LifeCycleState.QUASI_CLOSED);
> sequenceId = max(sequenceID, sequenceId);
>   }
> {code}
> The issue seems to be caused by the serialisation and deserialisation of the 
> containerInfo object to protobuf, as the sequenceId is never persisted or 
> restored.
> However, I am also confused about how this ever worked, as this is a pretty 
> significant problem.
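The persistence gap described above can be illustrated with a small, self-contained sketch (hypothetical class and field names, not the real ContainerInfo/protobuf types): if the serialized record omits the sequenceId, a reload after an SCM restart yields 0, and the quoted assertion then fires on the first replica report for the CLOSED container.

```java
import static java.lang.Math.max;

class ContainerInfoSketch {
    enum LifeCycleState { OPEN, QUASI_CLOSED, CLOSED }

    LifeCycleState state = LifeCycleState.OPEN;
    long sequenceId;

    // Mirrors the assertion quoted above: sequenceId may only change while
    // the container is OPEN or QUASI_CLOSED.
    void updateSequenceId(long newId) {
        assert state == LifeCycleState.OPEN
            || state == LifeCycleState.QUASI_CLOSED;
        sequenceId = max(newId, sequenceId);
    }

    // Simulated persistence: when the serialized form drops sequenceId
    // (the suspected bug), a reload yields 0 for it.
    long[] persist(boolean includeSequenceId) {
        return new long[] { state.ordinal(),
                            includeSequenceId ? sequenceId : 0L };
    }

    static ContainerInfoSketch load(long[] record) {
        ContainerInfoSketch c = new ContainerInfoSketch();
        c.state = LifeCycleState.values()[(int) record[0]];
        c.sequenceId = record[1];
        return c;
    }
}
```

Under this model, the fix amounts to always writing sequenceId into the persisted record, so a restarted SCM reloads the closed container with its final BCSID instead of 0.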



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] aryangupta1998 opened a new pull request #1487: HDDS-4318. Disable single node pipeline creation by default in Ozone.

2020-10-12 Thread GitBox


aryangupta1998 opened a new pull request #1487:
URL: https://github.com/apache/hadoop-ozone/pull/1487


   ## What changes were proposed in this pull request?
   
   Currently, single node pipeline creation is ON by default in Ozone, though 
it is not used by default in the Ozone write path. It would be good to disable 
this by turning off the config "ozone.scm.pipeline.creation.auto.factor.one" by 
default. This may lead to some unit test failures, and for those tests this 
config needs to be explicitly set to true.
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-4318
   
   ## How was this patch tested?
   
   Tested manually 
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4318) Disable single node pipeline creation by default in Ozone

2020-10-12 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-4318:
-
Labels: pull-request-available  (was: )

> Disable single node pipeline creation by default in Ozone
> -
>
> Key: HDDS-4318
> URL: https://issues.apache.org/jira/browse/HDDS-4318
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Shashikant Banerjee
>Assignee: Aryan Gupta
>Priority: Major
>  Labels: pull-request-available
>
> Currently, single node pipeline creation is ON by default in Ozone, though 
> it is not used by default in the Ozone write path. It would be good to disable 
> this by turning off the config "ozone.scm.pipeline.creation.auto.factor.one" 
> by default. This may lead to some unit test failures, and for those tests this 
> config needs to be explicitly set to true.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-4335) No user access checks in Ozone FS

2020-10-12 Thread Shashikant Banerjee (Jira)
Shashikant Banerjee created HDDS-4335:
-

 Summary: No user access checks in Ozone FS
 Key: HDDS-4335
 URL: https://issues.apache.org/jira/browse/HDDS-4335
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Shashikant Banerjee


Currently, a dir/file created with the hdfs user can be deleted by any user.
{code:java}
[sbanerjee@vd1308 MapReduce-Performance_Testing-master]$ sudo -u hdfs ozone fs 
-mkdir o3fs://bucket1.vol1.ozone1/data/sandbox/poc/teragen
[sbanerjee@vd1308 MapReduce-Performance_Testing-master]$ sudo -u hdfs ozone fs 
-ls o3fs://bucket1.vol1.ozone1/data/sandbox/poc/teragen
[sbanerjee@vd1308 MapReduce-Performance_Testing-master]$ sudo -u hdfs ozone fs 
-ls o3fs://bucket1.vol1.ozone1/data/sandbox/poc/
Found 1 items
drwxrwxrwx   - hdfs hdfs          0 2020-10-12 02:51 
o3fs://bucket1.vol1.ozone1/data/sandbox/poc/teragen
[sbanerjee@vd1308 MapReduce-Performance_Testing-master]$ 
[sbanerjee@vd1308 MapReduce-Performance_Testing-master]$ 
[sbanerjee@vd1308 MapReduce-Performance_Testing-master]$ 
[sbanerjee@vd1308 MapReduce-Performance_Testing-master]$ ozone fs -rm -r 
o3fs://bucket1.vol1.ozone1/data/sandbox/poc/
20/10/12 02:52:16 INFO Configuration.deprecation: io.bytes.per.checksum is 
deprecated. Instead, use dfs.bytes-per-checksum
20/10/12 02:52:16 INFO ozone.BasicOzoneFileSystem: Move to trash is disabled 
for o3fs, deleting instead: o3fs://bucket1.vol1.ozone1/data/sandbox/poc. Files 
or directories will NOT be retained in trash. Ignore the following 
TrashPolicyDefault message, if any.
20/10/12 02:52:16 INFO fs.TrashPolicyDefault: Moved: 
'o3fs://bucket1.vol1.ozone1/data/sandbox/poc' to trash at: 
/.Trash/sbanerjee/Current/data/sandbox/poc1602496336480
[sbanerjee@vd1308 MapReduce-Performance_Testing-master]$ sudo -u hdfs ozone fs 
-ls o3fs://bucket1.vol1.ozone1/data/sandbox/poc/
ls: `o3fs://bucket1.vol1.ozone1/data/sandbox/poc/': No such file or directory
{code}
Whereas the same sequence fails with a permission denied error in HDFS.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-4307) Start Trash Emptier in Ozone Manager

2020-10-12 Thread YiSheng Lien (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212542#comment-17212542
 ] 

YiSheng Lien edited comment on HDDS-4307 at 10/12/20, 6:17 PM:
---

Hi [~sadanand_shenoy], thanks for this issue. 

I think this issue should be separated into several sub-tasks, 
 and I'm working on the design doc (I would upload it to HDDS-2416 and here if 
you don't mind).

So I suppose we could start the work once the design doc is uploaded (by this 
week),
 feel free to share your thoughts, thanks.


was (Author: cxorm):
Hi [~sadanand_shenoy], thanks for this issue. 


I think this issue should be separated to several sub-tasks, 
and I'm working on the design-doc (I would upload to HDDS-2416 and here if you 
don't mind.)

So I supposed that we could start the work til the uploading of design-doc,
feel free to share your thoughts, thanks.

> Start Trash Emptier in Ozone Manager
> 
>
> Key: HDDS-4307
> URL: https://issues.apache.org/jira/browse/HDDS-4307
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Sadanand Shenoy
>Assignee: Sadanand Shenoy
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1473: HDDS-4266: CreateFile : store parent dir entries into DirTable and file entry into separate FileTable

2020-10-12 Thread GitBox


bharatviswa504 commented on a change in pull request #1473:
URL: https://github.com/apache/hadoop-ozone/pull/1473#discussion_r503496882



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileCreateRequestV1.java
##
@@ -0,0 +1,289 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.om.request.file;
+
+import com.google.common.base.Optional;
+import org.apache.hadoop.hdds.utils.db.cache.CacheKey;
+import org.apache.hadoop.hdds.utils.db.cache.CacheValue;
+import org.apache.hadoop.ozone.audit.OMAction;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OMMetrics;
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.helpers.*;
+import org.apache.hadoop.ozone.om.ratis.utils.OzoneManagerDoubleBufferHelper;
+import org.apache.hadoop.ozone.om.request.util.OmResponseUtil;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.om.response.file.OMFileCreateResponse;
+import org.apache.hadoop.ozone.om.response.file.OMFileCreateResponseV1;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.*;
+import org.apache.hadoop.ozone.security.acl.IAccessAuthorizer;
+import org.apache.hadoop.ozone.security.acl.OzoneObj;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import static 
org.apache.hadoop.ozone.om.lock.OzoneManagerLock.Resource.BUCKET_LOCK;
+import static 
org.apache.hadoop.ozone.om.request.file.OMFileRequest.OMDirectoryResult.*;
+
+/**
+ * Handles create file request layout version1.
+ */
+public class OMFileCreateRequestV1 extends OMFileCreateRequest {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(OMFileCreateRequestV1.class);
+  public OMFileCreateRequestV1(OMRequest omRequest) {
+super(omRequest);
+  }
+
+  @Override
+  @SuppressWarnings("methodlength")
+  public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
+  long trxnLogIndex, OzoneManagerDoubleBufferHelper omDoubleBufferHelper) {
+
+CreateFileRequest createFileRequest = 
getOmRequest().getCreateFileRequest();
+KeyArgs keyArgs = createFileRequest.getKeyArgs();
+Map auditMap = buildKeyArgsAuditMap(keyArgs);
+
+String volumeName = keyArgs.getVolumeName();
+String bucketName = keyArgs.getBucketName();
+String keyName = keyArgs.getKeyName();
+
+// if isRecursive is true, file would be created even if parent
+// directories does not exist.
+boolean isRecursive = createFileRequest.getIsRecursive();
+if (LOG.isDebugEnabled()) {
+  LOG.debug("File create for : " + volumeName + "/" + bucketName + "/"
+  + keyName + ":" + isRecursive);
+}
+
+// if isOverWrite is true, file would be over written.
+boolean isOverWrite = createFileRequest.getIsOverwrite();
+
+OMMetrics omMetrics = ozoneManager.getMetrics();
+omMetrics.incNumCreateFile();
+
+OMMetadataManager omMetadataManager = ozoneManager.getMetadataManager();
+
+boolean acquiredLock = false;
+
+OmVolumeArgs omVolumeArgs = null;
+OmBucketInfo omBucketInfo = null;
+final List locations = new ArrayList<>();
+List missingParentInfos;
+int numKeysCreated = 0;
+
+OMClientResponse omClientResponse = null;
+OMResponse.Builder omResponse = OmResponseUtil.getOMResponseBuilder(
+getOmRequest());
+IOException exception = null;
+Result result = null;
+try {
+  keyArgs = resolveBucketLink(ozoneManager, keyArgs, auditMap);
+  volumeName = keyArgs.getVolumeName();
+  bucketName = keyArgs.getBucketName();
+
+  if (keyName.length() == 0) {
+// Check if this is the root of the filesystem.
+throw new OMException("Can not write to directory: " + keyName,
+OMException.ResultCodes.NOT_A_FILE);
+  }
+
+  // check Acl
+  checkKeyAcls(ozoneManager, volumeName, bucketName, keyName,
+ 

[jira] [Assigned] (HDDS-4327) Potential resource leakage using BatchOperation

2020-10-12 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal reassigned HDDS-4327:
---

Assignee: Bharat Viswanadham

> Potential resource leakage using BatchOperation
> ---
>
> Key: HDDS-4327
> URL: https://issues.apache.org/jira/browse/HDDS-4327
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Bharat Viswanadham
>Priority: Major
>
> There are a number of places in the code where BatchOperation is used but not 
> closed. As a best practice, it is better to close them explicitly.
> I have stress test code that uses BatchOperation to insert into the OM 
> RocksDB. Without closing BatchOperation explicitly, the process crashes after 
> just a few minutes.
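A minimal sketch of the explicit-close pattern (illustrative classes, not the real org.apache.hadoop.hdds.utils.db API): making the batch AutoCloseable and wrapping it in try-with-resources guarantees the underlying handle is released on both success and failure paths.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a RocksDB-backed batch that holds a native resource.
class BatchSketch implements AutoCloseable {
    static int openBatches = 0;            // counts unreleased batches
    final List<String> ops = new ArrayList<>();

    BatchSketch() { openBatches++; }

    void put(String key, String value) { ops.add(key + "=" + value); }

    @Override
    public void close() { openBatches--; } // releases the native handle
}

class BatchDemo {
    // try-with-resources closes the batch even if a put or commit throws;
    // without it, any exception between creation and close leaks the handle.
    static int writeBatch() {
        try (BatchSketch batch = new BatchSketch()) {
            batch.put("/vol1/bucket1/key1", "blockData");
            batch.put("/vol1/bucket1/key2", "blockData");
            // store.commitBatchOperation(batch) would go here in real code.
            return batch.ops.size();
        }
    }
}
```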



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4327) Potential resource leakage using BatchOperation

2020-10-12 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDDS-4327:

Target Version/s: 1.1.0

> Potential resource leakage using BatchOperation
> ---
>
> Key: HDDS-4327
> URL: https://issues.apache.org/jira/browse/HDDS-4327
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Bharat Viswanadham
>Priority: Blocker
>
> There are a number of places in the code where BatchOperation is used but not 
> closed. As a best practice, it is better to close them explicitly.
> I have stress test code that uses BatchOperation to insert into the OM 
> RocksDB. Without closing BatchOperation explicitly, the process crashes after 
> just a few minutes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4327) Potential resource leakage using BatchOperation

2020-10-12 Thread Arpit Agarwal (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDDS-4327:

Priority: Blocker  (was: Major)

> Potential resource leakage using BatchOperation
> ---
>
> Key: HDDS-4327
> URL: https://issues.apache.org/jira/browse/HDDS-4327
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Bharat Viswanadham
>Priority: Blocker
>
> There are a number of places in the code where BatchOperation is used but not 
> closed. As a best practice, it is better to close them explicitly.
> I have stress test code that uses BatchOperation to insert into the OM 
> RocksDB. Without closing BatchOperation explicitly, the process crashes after 
> just a few minutes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-4220) BlockManagerImpl#getBlockByID does unnecessary serialization

2020-10-12 Thread Hanisha Koneru (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru updated HDDS-4220:
-
Fix Version/s: 1.1.0

> BlockManagerImpl#getBlockByID does unnecessary serialization
> 
>
> Key: HDDS-4220
> URL: https://issues.apache.org/jira/browse/HDDS-4220
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Ethan Rose
>Assignee: Ethan Rose
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> After HDDS-3869, tables in the datanode handle coding/decoding objects 
> to/from RocksDB, and the caller no longer has to do this manually. As a 
> result, the BlockManagerImpl#getBlockByID method should now return a 
> BlockData type, instead of a byte array. In the current implementation, this 
> method converts the block data into a byte array and returns it to the 
> caller, who then converts the byte array back to block data in order to use 
> it.
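The improvement can be sketched with simplified stand-ins (hypothetical names, not the real Ozone types): the typed table already decodes the stored value, so returning the decoded object directly avoids an extra encode in getBlockByID and a matching decode in the caller.

```java
import java.nio.charset.StandardCharsets;

// Simplified stand-in for the datanode's block metadata value type.
class BlockDataSketch {
    final long localId;
    BlockDataSketch(long localId) { this.localId = localId; }

    byte[] encode() {
        return Long.toString(localId).getBytes(StandardCharsets.UTF_8);
    }

    static BlockDataSketch decode(byte[] raw) {
        return new BlockDataSketch(
            Long.parseLong(new String(raw, StandardCharsets.UTF_8)));
    }
}

class BlockManagerSketch {
    // Old shape: re-encode the value the typed table already decoded,
    // forcing the caller to decode it a second time.
    static byte[] getBlockByIdOld(BlockDataSketch fromTable) {
        return fromTable.encode();
    }

    // New shape: hand back the decoded object; no wasted round trip.
    static BlockDataSketch getBlockByIdNew(BlockDataSketch fromTable) {
        return fromTable;
    }
}
```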



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-12 Thread Rui Wang (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212631#comment-17212631
 ] 

Rui Wang commented on HDDS-4308:


[~micahzhao] thanks for the reminder. I will follow up on the discussions above 
to understand the necessary steps.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs is fetched using getCacheValue and the same object is put 
> into the doubleBuffer, which might cause an issue.
> Let's take the below scenario:
> InitialVolumeArgs quotaBytes -> 1
> 1. T1 -> Updates VolumeArgs, subtracting 1000, and puts this updated 
> volumeArgs into the DoubleBuffer.
> 2. T2 -> Updates VolumeArgs, subtracting 2000, but has not yet been flushed to 
> the double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by the double buffer and, when it commits, since it uses the 
> cached object that was put into the doubleBuffer, it flushes to DB with the 
> updated value from T2 (as it is a cached object) and updates DB with bytesUsed 
> as 7000.
> Now the OM has restarted, and the DB only has transactions up to T1. (We get 
> this info from the TransactionInfo Table 
> (https://issues.apache.org/jira/browse/HDDS-3685))
> Now T2 is replayed; as it was not committed to DB, the DB will again be 
> subtracted by 2000, and the DB will then have 5000.
> But after T2, the value should be 7000, so we have the DB in an incorrect 
> state.
> Issue here:
> 1. As we use a cached object and put the same cached object into the double 
> buffer, this can cause this kind of issue.
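The race described above can be reproduced with a toy double buffer (hypothetical names and an illustrative starting quota of 10000, not the real OM classes): enqueueing the shared cached object lets T2's mutation leak into T1's flush, while copying the object before enqueueing preserves T1's snapshot.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class VolumeArgsSketch {
    long quotaBytes;
    VolumeArgsSketch(long quotaBytes) { this.quotaBytes = quotaBytes; }
    // Defensive copy, in the spirit of copying before enqueueing.
    VolumeArgsSketch(VolumeArgsSketch other) { this.quotaBytes = other.quotaBytes; }
}

class DoubleBufferDemo {
    // Returns the value the "DB" sees when the double buffer flushes T1.
    // With the shared cached object, T2's later subtraction is already
    // visible in T1's enqueued entry; with a copy, T1's snapshot survives.
    static long flushAfterT1(boolean copyBeforeEnqueue) {
        VolumeArgsSketch cached = new VolumeArgsSketch(10_000); // illustrative
        Deque<VolumeArgsSketch> doubleBuffer = new ArrayDeque<>();

        // T1: subtract 1000 and enqueue for flushing.
        cached.quotaBytes -= 1_000;
        doubleBuffer.add(copyBeforeEnqueue
            ? new VolumeArgsSketch(cached) : cached);

        // T2: subtract 2000; not yet enqueued when the flush happens.
        cached.quotaBytes -= 2_000;

        // Double buffer flushes T1 to the "DB".
        return doubleBuffer.poll().quotaBytes;
    }
}
```

With the shared object the flush of T1 already persists 7000 (T1 plus T2), so a replay of T2 after a restart double-subtracts, exactly as described in the scenario.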



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4307) Start Trash Emptier in Ozone Manager

2020-10-12 Thread YiSheng Lien (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212542#comment-17212542
 ] 

YiSheng Lien commented on HDDS-4307:


Hi [~sadanand_shenoy], thanks for this issue. 


I think this issue should be separated into several sub-tasks, 
and I'm working on the design doc (I would upload it to HDDS-2416 and here if 
you don't mind).

So I suppose we could start the work once the design doc is uploaded,
feel free to share your thoughts, thanks.

> Start Trash Emptier in Ozone Manager
> 
>
> Key: HDDS-4307
> URL: https://issues.apache.org/jira/browse/HDDS-4307
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Reporter: Sadanand Shenoy
>Assignee: Sadanand Shenoy
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-4220) BlockManagerImpl#getBlockByID does unnecessary serialization

2020-10-12 Thread Hanisha Koneru (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4220?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hanisha Koneru resolved HDDS-4220.
--
Resolution: Fixed

> BlockManagerImpl#getBlockByID does unnecessary serialization
> 
>
> Key: HDDS-4220
> URL: https://issues.apache.org/jira/browse/HDDS-4220
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Ethan Rose
>Assignee: Ethan Rose
>Priority: Minor
>  Labels: pull-request-available
>
> After HDDS-3869, tables in the datanode handle coding/decoding objects 
> to/from RocksDB, and the caller no longer has to do this manually. As a 
> result, the BlockManagerImpl#getBlockByID method should now return a 
> BlockData type, instead of a byte array. In the current implementation, this 
> method converts the block data into a byte array and returns it to the 
> caller, who then converts the byte array back to block data in order to use 
> it.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] hanishakoneru merged pull request #1470: HDDS-4220. BlockManagerImpl#getBlockByID does unnecessary serialization

2020-10-12 Thread GitBox


hanishakoneru merged pull request #1470:
URL: https://github.com/apache/hadoop-ozone/pull/1470


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] hanishakoneru commented on pull request #1470: HDDS-4220. BlockManagerImpl#getBlockByID does unnecessary serialization

2020-10-12 Thread GitBox


hanishakoneru commented on pull request #1470:
URL: https://github.com/apache/hadoop-ozone/pull/1470#issuecomment-707266148


   Thanks @errose28 for fixing this. 
   LGTM. +1.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] bharatviswa504 commented on a change in pull request #1473: HDDS-4266: CreateFile : store parent dir entries into DirTable and file entry into separate FileTable

2020-10-12 Thread GitBox


bharatviswa504 commented on a change in pull request #1473:
URL: https://github.com/apache/hadoop-ozone/pull/1473#discussion_r503496882



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileCreateRequestV1.java
##
@@ -0,0 +1,289 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.om.request.file;
+
+import com.google.common.base.Optional;
+import org.apache.hadoop.hdds.utils.db.cache.CacheKey;
+import org.apache.hadoop.hdds.utils.db.cache.CacheValue;
+import org.apache.hadoop.ozone.audit.OMAction;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OMMetrics;
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.helpers.*;
+import org.apache.hadoop.ozone.om.ratis.utils.OzoneManagerDoubleBufferHelper;
+import org.apache.hadoop.ozone.om.request.util.OmResponseUtil;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.om.response.file.OMFileCreateResponse;
+import org.apache.hadoop.ozone.om.response.file.OMFileCreateResponseV1;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.*;
+import org.apache.hadoop.ozone.security.acl.IAccessAuthorizer;
+import org.apache.hadoop.ozone.security.acl.OzoneObj;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import static 
org.apache.hadoop.ozone.om.lock.OzoneManagerLock.Resource.BUCKET_LOCK;
+import static 
org.apache.hadoop.ozone.om.request.file.OMFileRequest.OMDirectoryResult.*;
+
+/**
+ * Handles create file request layout version1.
+ */
+public class OMFileCreateRequestV1 extends OMFileCreateRequest {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(OMFileCreateRequestV1.class);
+  public OMFileCreateRequestV1(OMRequest omRequest) {
+super(omRequest);
+  }
+
+  @Override
+  @SuppressWarnings("methodlength")
+  public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
+  long trxnLogIndex, OzoneManagerDoubleBufferHelper omDoubleBufferHelper) {
+
+CreateFileRequest createFileRequest = 
getOmRequest().getCreateFileRequest();
+KeyArgs keyArgs = createFileRequest.getKeyArgs();
+Map auditMap = buildKeyArgsAuditMap(keyArgs);
+
+String volumeName = keyArgs.getVolumeName();
+String bucketName = keyArgs.getBucketName();
+String keyName = keyArgs.getKeyName();
+
+// if isRecursive is true, file would be created even if parent
+// directories does not exist.
+boolean isRecursive = createFileRequest.getIsRecursive();
+if (LOG.isDebugEnabled()) {
+  LOG.debug("File create for : " + volumeName + "/" + bucketName + "/"
+  + keyName + ":" + isRecursive);
+}
+
+// if isOverWrite is true, file would be over written.
+boolean isOverWrite = createFileRequest.getIsOverwrite();
+
+OMMetrics omMetrics = ozoneManager.getMetrics();
+omMetrics.incNumCreateFile();
+
+OMMetadataManager omMetadataManager = ozoneManager.getMetadataManager();
+
+boolean acquiredLock = false;
+
+OmVolumeArgs omVolumeArgs = null;
+OmBucketInfo omBucketInfo = null;
+final List locations = new ArrayList<>();
+List missingParentInfos;
+int numKeysCreated = 0;
+
+OMClientResponse omClientResponse = null;
+OMResponse.Builder omResponse = OmResponseUtil.getOMResponseBuilder(
+getOmRequest());
+IOException exception = null;
+Result result = null;
+try {
+  keyArgs = resolveBucketLink(ozoneManager, keyArgs, auditMap);
+  volumeName = keyArgs.getVolumeName();
+  bucketName = keyArgs.getBucketName();
+
+  if (keyName.length() == 0) {
+// Check if this is the root of the filesystem.
+throw new OMException("Can not write to directory: " + keyName,
+OMException.ResultCodes.NOT_A_FILE);
+  }
+
+  // check Acl
+  checkKeyAcls(ozoneManager, volumeName, bucketName, keyName,
+ 

[GitHub] [hadoop-ozone] fapifta commented on a change in pull request #1456: HDDS-4172. Implement Finalize command in Ozone Manager server.

2020-10-12 Thread GitBox


fapifta commented on a change in pull request #1456:
URL: https://github.com/apache/hadoop-ozone/pull/1456#discussion_r503556816



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/upgrade/OMUpgradeFinalizer.java
##
@@ -0,0 +1,328 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.om.upgrade;
+
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes;
+import org.apache.hadoop.ozone.upgrade.UpgradeFinalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Queue;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ConcurrentLinkedQueue;
+
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.*;
+import static 
org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.INVALID_REQUEST;
+import static 
org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.PERSIST_UPGRADE_TO_LAYOUT_VERSION_FAILED;
+import static 
org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.REMOVE_UPGRADE_TO_LAYOUT_VERSION_FAILED;
+import static 
org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.UPDATE_LAYOUT_VERSION_FAILED;
+import static 
org.apache.hadoop.ozone.upgrade.UpgradeFinalizer.Status.FINALIZATION_DONE;
+import static 
org.apache.hadoop.ozone.upgrade.UpgradeFinalizer.Status.FINALIZATION_IN_PROGRESS;
+
+/**
+ * UpgradeFinalizer implementation for the Ozone Manager service.
+ */
+public class OMUpgradeFinalizer implements UpgradeFinalizer {
+
+  private  static final OmUpgradeAction NOOP = a -> {};
+
+  private OMLayoutVersionManagerImpl versionManager;
+  private String clientID;
+
+  private Queue msgs = new ConcurrentLinkedQueue<>();
+  private boolean isDone = false;
+
+  public OMUpgradeFinalizer(OMLayoutVersionManagerImpl versionManager) {
+this.versionManager = versionManager;
+  }
+
+  @Override
+  public StatusAndMessages finalize(String upgradeClientID, OzoneManager om)
+  throws IOException {
+if (!versionManager.needsFinalization()) {
+  return FINALIZED_MSG;
+}
+clientID = upgradeClientID;
+
+// This requires some more investigation on how to do it properly while
+// requests are on the fly, and post finalize features one by one.
+// Until that is done, monitoring is not really doing anything meaningful
+// but this is a tradeoff we can take for the first iteration if needed,
+// as the finalization of the first few features should not take that long.
+// Follow up JIRA is in HDDS-4286
+//String threadName = "OzoneManager-Upgrade-Finalizer";
+//ExecutorService executor =
+//Executors.newSingleThreadExecutor(r -> new Thread(threadName));
+//executor.submit(new Worker(om));
+new Worker(om).call();
+return STARTING_MSG;
+  }
+
+  @Override
+  public synchronized StatusAndMessages reportStatus(
+  String upgradeClientID, boolean takeover
+  ) throws IOException {
+if (takeover) {
+  clientID = upgradeClientID;
+}
+assertClientId(upgradeClientID);
+List<String> returningMsgs = new ArrayList<>(msgs.size() + 10);
+Status status = isDone ? FINALIZATION_DONE : FINALIZATION_IN_PROGRESS;
+while (msgs.size() > 0) {
+  returningMsgs.add(msgs.poll());
+}
+return new StatusAndMessages(status, returningMsgs);
+  }
+
+  private void assertClientId(String id) throws OMException {
+if (!this.clientID.equals(id)) {
+  throw new OMException("Unknown client tries to get finalization status.\n"
+  + "The requestor is not the initiating client of the finalization,"
+  + " if you want to take over, and get unsent status messages, check"
+  + " -takeover option.", INVALID_REQUEST);
+}
+  }
+
+  /**
+   * This class implements the finalization logic applied to every
+   * LayoutFeature that needs to be finalized.
+   *
+   * For the first approach this happens synchronously within the state machine
+   * during the FinalizeUpgrade request, but ideally this has to be moved to
+   * individual calls that are going into the 

[jira] [Commented] (HDDS-4327) Potential resource leakage using BatchOperation

2020-10-12 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212740#comment-17212740
 ] 

Bharat Viswanadham commented on HDDS-4327:
--

Thank You [~weichiu] for the catch.

I will post a PR to fix the issue.

> Potential resource leakage using BatchOperation
> ---
>
> Key: HDDS-4327
> URL: https://issues.apache.org/jira/browse/HDDS-4327
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Wei-Chiu Chuang
>Assignee: Bharat Viswanadham
>Priority: Blocker
>
> There are a number of places in the code where BatchOperation is used but not 
> closed. As a best practice, it is better to close them explicitly.
> I have a stress test that uses BatchOperation to insert into the OM RocksDB. 
> Without closing BatchOperation explicitly, the process crashes after just a 
> few minutes.
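The explicit-close practice described above is easiest to guarantee with try-with-resources. A minimal, self-contained sketch — `DemoBatchOperation` here is a hypothetical stand-in for the real `org.apache.hadoop.hdds.utils.db.BatchOperation`, which wraps a RocksDB `WriteBatch` whose native memory is released only on `close()`:

```java
// Hypothetical stand-in for the RocksDB-backed BatchOperation: in the real
// class, native WriteBatch memory is freed only when close() is called.
final class DemoBatchOperation implements AutoCloseable {
    private boolean closed;

    void put(String key, String value) {
        if (closed) {
            throw new IllegalStateException("batch already closed");
        }
        // buffers the write (native memory in the real implementation)
    }

    @Override
    public void close() {
        closed = true; // releases native memory in the real implementation
    }

    boolean isClosed() {
        return closed;
    }
}

public final class BatchCloseDemo {
    // try-with-resources guarantees close() runs even if the commit throws,
    // which is exactly the "close explicitly" practice the issue asks for.
    public static boolean writeBatchAndReport() {
        DemoBatchOperation batch = new DemoBatchOperation();
        try (DemoBatchOperation b = batch) {
            b.put("/volume/bucket/key1", "value1");
            b.put("/volume/bucket/key2", "value2");
            // store.commitBatchOperation(b) would go here in real code
        }
        return batch.isClosed();
    }

    public static void main(String[] args) {
        System.out.println("batch closed: " + writeBatchAndReport());
    }
}
```

Without the try-with-resources block, every early return or exception between creating and closing the batch leaks the underlying native buffer, which matches the crash-after-minutes behavior described above.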



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[GitHub] [hadoop-ozone] fapifta commented on a change in pull request #1456: HDDS-4172. Implement Finalize command in Ozone Manager server.

2020-10-12 Thread GitBox


fapifta commented on a change in pull request #1456:
URL: https://github.com/apache/hadoop-ozone/pull/1456#discussion_r503552977



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/upgrade/OMUpgradeFinalizer.java
##
@@ -0,0 +1,328 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.om.upgrade;
+
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes;
+import org.apache.hadoop.ozone.upgrade.UpgradeFinalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Queue;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ConcurrentLinkedQueue;
+
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.*;
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.INVALID_REQUEST;
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.PERSIST_UPGRADE_TO_LAYOUT_VERSION_FAILED;
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.REMOVE_UPGRADE_TO_LAYOUT_VERSION_FAILED;
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.UPDATE_LAYOUT_VERSION_FAILED;
+import static org.apache.hadoop.ozone.upgrade.UpgradeFinalizer.Status.FINALIZATION_DONE;
+import static org.apache.hadoop.ozone.upgrade.UpgradeFinalizer.Status.FINALIZATION_IN_PROGRESS;
+
+/**
+ * UpgradeFinalizer implementation for the Ozone Manager service.
+ */
+public class OMUpgradeFinalizer implements UpgradeFinalizer {
+
+  private  static final OmUpgradeAction NOOP = a -> {};
+
+  private OMLayoutVersionManagerImpl versionManager;
+  private String clientID;
+
+  private Queue<String> msgs = new ConcurrentLinkedQueue<>();
+  private boolean isDone = false;
+
+  public OMUpgradeFinalizer(OMLayoutVersionManagerImpl versionManager) {
+this.versionManager = versionManager;
+  }
+
+  @Override
+  public StatusAndMessages finalize(String upgradeClientID, OzoneManager om)
+  throws IOException {
+if (!versionManager.needsFinalization()) {
+  return FINALIZED_MSG;
+}
+clientID = upgradeClientID;
+
+// This requires some more investigation on how to do it properly while
+// requests are on the fly, and post finalize features one by one.
+// Until that is done, monitoring is not really doing anything meaningful
+// but this is a tradeoff we can take for the first iteration if needed,
+// as the finalization of the first few features should not take that long.
+// Follow up JIRA is in HDDS-4286
+//String threadName = "OzoneManager-Upgrade-Finalizer";
+//ExecutorService executor =
+//Executors.newSingleThreadExecutor(r -> new Thread(threadName));
+//executor.submit(new Worker(om));
+new Worker(om).call();
+return STARTING_MSG;

Review comment:
   yes, I did not want to change the overall flow of messages and statuses 
that I came up with, but technically at this point we are already done in the 
current implementation.
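A side note on the commented-out executor in the quoted diff: the factory `r -> new Thread(threadName)` drops the Runnable `r`, so the pool's thread would never execute submitted tasks. A minimal corrected ThreadFactory (the thread name is taken from the quoted comment; everything else is an illustrative sketch, not the PR's code) might look like:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public final class NamedExecutorDemo {
    // A ThreadFactory must hand the Runnable r to the Thread it builds;
    // `r -> new Thread(threadName)` would create an idle thread and
    // submitted tasks would never run.
    public static String runOnNamedThread() {
        ExecutorService executor = Executors.newSingleThreadExecutor(
            r -> new Thread(r, "OzoneManager-Upgrade-Finalizer"));
        try {
            // The task reports which thread actually executed it.
            Callable<String> task = () -> Thread.currentThread().getName();
            return executor.submit(task).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(NamedExecutorDemo.runOnNamedThread());
    }
}
```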





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: ozone-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: ozone-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-12 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212736#comment-17212736
 ] 

Bharat Viswanadham commented on HDDS-4308:
--

{quote}The performance impact of the volume lock hasn't been tested before, 
and it may also be within our tolerance (in-memory operations can be really 
fast). This should be the easiest way to fix this bug by far.
{quote}
Agreed, using the volume lock and then doing the calculation will solve the 
correctness issue.

Until we have a smarter solution, we can fix this by taking the volume lock 
and updating the bytes used.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs is fetched using getCacheValue and the same object is 
> put into the doubleBuffer; this might cause an issue.
> Let's take the below scenario:
> Initial VolumeArgs quotaBytes -> 10000
> 1. T1 -> Updates VolumeArgs, subtracting 1000, and puts this updated 
> volumeArgs into the DoubleBuffer.
> 2. T2 -> Updates VolumeArgs, subtracting 2000, and has not yet been flushed 
> to the double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by the double buffer, and when it commits, since it uses 
> the cached object put into the doubleBuffer, it flushes to the DB with the 
> value already updated by T2 (as it is a cached object) and updates the DB 
> with bytesUsed as 7000.
> Now the OM restarts, and the DB only has transactions up to T1. (We get this 
> info from the TransactionInfo 
> Table: https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is replayed; as it was not committed to the DB, the DB is again 
> subtracted by 2000 and will now have 5000.
> But after T2 the value should be 7000, so the DB is in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into the double 
> buffer, this can cause this kind of issue. 
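The race quoted above can be reproduced in miniature. This is an illustrative model, not the real OM classes: the buggy path enqueues the shared cached object into the double buffer, while the fixed path enqueues a snapshot taken at transaction time (e.g. while holding the volume lock).

```java
import java.util.ArrayDeque;
import java.util.Queue;

public final class QuotaRaceDemo {
    // Toy stand-in for OmVolumeArgs: a single mutable counter.
    static final class VolumeArgs {
        long bytesUsed;
        VolumeArgs(long b) { bytesUsed = b; }
        VolumeArgs copy() { return new VolumeArgs(bytesUsed); }
    }

    // Buggy flow: the shared cache object itself is handed to the buffer.
    public static long buggyDbAfterRestartAndReplay() {
        VolumeArgs cached = new VolumeArgs(10_000);
        Queue<VolumeArgs> doubleBuffer = new ArrayDeque<>();

        cached.bytesUsed -= 1_000;       // T1
        doubleBuffer.add(cached);        // same mutable object enqueued
        cached.bytesUsed -= 2_000;       // T2 runs before T1 is flushed

        long db = doubleBuffer.poll().bytesUsed; // flush T1 -> writes 7000!
        // Crash before T2 flushes; on restart T2 is replayed against the DB:
        return db - 2_000;               // 5000, but 7000 is correct
    }

    // Fixed flow: snapshot the value at transaction time, so the flushed
    // object reflects exactly the state after T1.
    public static long fixedDbAfterRestartAndReplay() {
        VolumeArgs cached = new VolumeArgs(10_000);
        Queue<VolumeArgs> doubleBuffer = new ArrayDeque<>();

        cached.bytesUsed -= 1_000;       // T1 (under the volume lock)
        doubleBuffer.add(cached.copy()); // enqueue an immutable snapshot
        cached.bytesUsed -= 2_000;       // T2

        long db = doubleBuffer.poll().bytesUsed; // flush T1 -> writes 9000
        return db - 2_000;               // replaying T2 gives 7000, correct
    }
}
```

The buggy path ends at 5000 and the fixed path at 7000, matching the numbers in the scenario above.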






[jira] [Updated] (HDDS-4338) SCM web UI banner shows "HDFS SCM"

2020-10-12 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDDS-4338:
--
Affects Version/s: 1.0.0

> SCM web UI banner shows "HDFS SCM"
> --
>
> Key: HDDS-4338
> URL: https://issues.apache.org/jira/browse/HDDS-4338
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Wei-Chiu Chuang
>Priority: Trivial
> Attachments: Screen Shot 2020-10-12 at 6.42.31 PM.png
>
>
> !Screen Shot 2020-10-12 at 6.42.31 PM.png!  Let's call it Ozone SCM, shall we?






[jira] [Updated] (HDDS-4338) SCM web UI banner shows "HDFS SCM"

2020-10-12 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDDS-4338:
--
Target Version/s: 1.1.0

> SCM web UI banner shows "HDFS SCM"
> --
>
> Key: HDDS-4338
> URL: https://issues.apache.org/jira/browse/HDDS-4338
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Wei-Chiu Chuang
>Priority: Trivial
> Attachments: Screen Shot 2020-10-12 at 6.42.31 PM.png
>
>
> !Screen Shot 2020-10-12 at 6.42.31 PM.png!  Let's call it Ozone SCM, shall we?






[jira] [Created] (HDDS-4338) SCM web UI banner shows "HDFS SCM"

2020-10-12 Thread Wei-Chiu Chuang (Jira)
Wei-Chiu Chuang created HDDS-4338:
-

 Summary: SCM web UI banner shows "HDFS SCM"
 Key: HDDS-4338
 URL: https://issues.apache.org/jira/browse/HDDS-4338
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Wei-Chiu Chuang
 Attachments: Screen Shot 2020-10-12 at 6.42.31 PM.png

!Screen Shot 2020-10-12 at 6.42.31 PM.png!  Let's call it Ozone SCM, shall we?






[GitHub] [hadoop-ozone] avijayanhwx commented on a change in pull request #1456: HDDS-4172. Implement Finalize command in Ozone Manager server.

2020-10-12 Thread GitBox


avijayanhwx commented on a change in pull request #1456:
URL: https://github.com/apache/hadoop-ozone/pull/1456#discussion_r503538733



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/upgrade/OMUpgradeFinalizer.java
##
@@ -0,0 +1,328 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.om.upgrade;
+
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes;
+import org.apache.hadoop.ozone.upgrade.UpgradeFinalizer;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Queue;
+import java.util.concurrent.Callable;
+import java.util.concurrent.ConcurrentLinkedQueue;
+
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.*;
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.INVALID_REQUEST;
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.PERSIST_UPGRADE_TO_LAYOUT_VERSION_FAILED;
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.REMOVE_UPGRADE_TO_LAYOUT_VERSION_FAILED;
+import static org.apache.hadoop.ozone.om.exceptions.OMException.ResultCodes.UPDATE_LAYOUT_VERSION_FAILED;
+import static org.apache.hadoop.ozone.upgrade.UpgradeFinalizer.Status.FINALIZATION_DONE;
+import static org.apache.hadoop.ozone.upgrade.UpgradeFinalizer.Status.FINALIZATION_IN_PROGRESS;
+
+/**
+ * UpgradeFinalizer implementation for the Ozone Manager service.
+ */
+public class OMUpgradeFinalizer implements UpgradeFinalizer {
+
+  private  static final OmUpgradeAction NOOP = a -> {};
+
+  private OMLayoutVersionManagerImpl versionManager;
+  private String clientID;
+
+  private Queue<String> msgs = new ConcurrentLinkedQueue<>();
+  private boolean isDone = false;
+
+  public OMUpgradeFinalizer(OMLayoutVersionManagerImpl versionManager) {
+this.versionManager = versionManager;
+  }
+
+  @Override
+  public StatusAndMessages finalize(String upgradeClientID, OzoneManager om)
+  throws IOException {
+if (!versionManager.needsFinalization()) {
+  return FINALIZED_MSG;
+}
+clientID = upgradeClientID;
+
+// This requires some more investigation on how to do it properly while
+// requests are on the fly, and post finalize features one by one.
+// Until that is done, monitoring is not really doing anything meaningful
+// but this is a tradeoff we can take for the first iteration if needed,
+// as the finalization of the first few features should not take that long.
+// Follow up JIRA is in HDDS-4286
+//String threadName = "OzoneManager-Upgrade-Finalizer";
+//ExecutorService executor =
+//Executors.newSingleThreadExecutor(r -> new Thread(threadName));
+//executor.submit(new Worker(om));
+new Worker(om).call();
+return STARTING_MSG;
+  }
+
+  @Override
+  public synchronized StatusAndMessages reportStatus(
+  String upgradeClientID, boolean takeover
+  ) throws IOException {
+if (takeover) {
+  clientID = upgradeClientID;
+}
+assertClientId(upgradeClientID);
+List<String> returningMsgs = new ArrayList<>(msgs.size() + 10);
+Status status = isDone ? FINALIZATION_DONE : FINALIZATION_IN_PROGRESS;
+while (msgs.size() > 0) {
+  returningMsgs.add(msgs.poll());
+}
+return new StatusAndMessages(status, returningMsgs);
+  }
+
+  private void assertClientId(String id) throws OMException {
+if (!this.clientID.equals(id)) {
+  throw new OMException("Unknown client tries to get finalization status.\n"
+  + "The requestor is not the initiating client of the finalization,"
+  + " if you want to take over, and get unsent status messages, check"
+  + " -takeover option.", INVALID_REQUEST);
+}
+  }
+
+  /**
+   * This class implements the finalization logic applied to every
+   * LayoutFeature that needs to be finalized.
+   *
+   * For the first approach this happens synchronously within the state machine
+   * during the FinalizeUpgrade request, but ideally this has to be moved to
+   * individual calls that are going into the 

[GitHub] [hadoop-ozone] fapifta commented on a change in pull request #1456: HDDS-4172. Implement Finalize command in Ozone Manager server.

2020-10-12 Thread GitBox


fapifta commented on a change in pull request #1456:
URL: https://github.com/apache/hadoop-ozone/pull/1456#discussion_r503559123



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/upgrade/OMFinalizeUpgradeRequest.java
##
@@ -63,11 +64,20 @@ public OMClientResponse validateAndUpdateCache(
 
   String upgradeClientID = request.getUpgradeClientId();
 
-  UpgradeFinalizationStatus status =
+  StatusAndMessages omStatus =

Review comment:
   Nice catch, it seems I forgot to remove this one. I am pushing the 
deletion of this line.








[GitHub] [hadoop-ozone] bharatviswa504 commented on pull request #1473: HDDS-4266: CreateFile : store parent dir entries into DirTable and file entry into separate FileTable

2020-10-12 Thread GitBox


bharatviswa504 commented on pull request #1473:
URL: https://github.com/apache/hadoop-ozone/pull/1473#issuecomment-707398681


   Thank You for the updated patch.
   I have one comment (I resolved the fixed ones); the rest LGTM.
   






[jira] [Commented] (HDDS-4308) Fix issue with quota update

2020-10-12 Thread mingchao zhao (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-4308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17212808#comment-17212808
 ] 

mingchao zhao commented on HDDS-4308:
-

Thanks for [~bharat]'s advice. I will submit a PR this way as soon as possible 
to fix this bug.

> Fix issue with quota update
> ---
>
> Key: HDDS-4308
> URL: https://issues.apache.org/jira/browse/HDDS-4308
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Blocker
>
> Currently volumeArgs is fetched using getCacheValue and the same object is 
> put into the doubleBuffer; this might cause an issue.
> Let's take the below scenario:
> Initial VolumeArgs quotaBytes -> 10000
> 1. T1 -> Updates VolumeArgs, subtracting 1000, and puts this updated 
> volumeArgs into the DoubleBuffer.
> 2. T2 -> Updates VolumeArgs, subtracting 2000, and has not yet been flushed 
> to the double buffer.
> *Now at the end of flushing these transactions, our DB should have 7000 as 
> bytes used.*
> Now T1 is picked by the double buffer, and when it commits, since it uses 
> the cached object put into the doubleBuffer, it flushes to the DB with the 
> value already updated by T2 (as it is a cached object) and updates the DB 
> with bytesUsed as 7000.
> Now the OM restarts, and the DB only has transactions up to T1. (We get this 
> info from the TransactionInfo 
> Table: https://issues.apache.org/jira/browse/HDDS-3685)
> Now T2 is replayed; as it was not committed to the DB, the DB is again 
> subtracted by 2000 and will now have 5000.
> But after T2 the value should be 7000, so the DB is in an incorrect state.
> Issue here:
> 1. As we use a cached object and put the same cached object into the double 
> buffer, this can cause this kind of issue. 






[GitHub] [hadoop-ozone] GlenGeng commented on a change in pull request #1228: HDDS-3995. Fix s3g met NPE exception while write file by multiPartUpload

2020-10-12 Thread GitBox


GlenGeng commented on a change in pull request #1228:
URL: https://github.com/apache/hadoop-ozone/pull/1228#discussion_r503653865



##
File path: 
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java
##
@@ -562,13 +562,18 @@ private Response createMultipartKey(String bucket, String 
key, long length,
 
   OmMultipartCommitUploadPartInfo omMultipartCommitUploadPartInfo =
   ozoneOutputStream.getCommitUploadPartInfo();
-  String eTag = omMultipartCommitUploadPartInfo.getPartName();
+  if (omMultipartCommitUploadPartInfo != null) {

Review comment:
   Hey @bharatviswa504
   
   It seems this NPE in S3G has occurred again. I have some clues, but haven't 
found the root cause yet.
   
   OM side, `S3MultipartUploadCommitPartRequest` failed due to upload id not 
found.
   S3G side, 
   ```
 } finally {
   IOUtils.closeQuietly(ozoneOutputStream);
 }
   ```
   `OzoneOutputStream.close()` will call `OmMultipartCommitUploadPartInfo 
commitMultipartUploadPart(OmKeyArgs omKeyArgs, long clientID) throws 
IOException;` to commit the key, but `IOUtils.closeQuietly()` swallows the 
IOException related to this error, and thus 
   ```
 OmMultipartCommitUploadPartInfo omMultipartCommitUploadPartInfo =
 ozoneOutputStream.getCommitUploadPartInfo();
   ```
   will return null, triggering the NPE.
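The swallowed-exception path described above can be modeled in a few lines. These classes are illustrative stand-ins, not the real Ozone client API: `close()` performs the commit, and if it fails the commit info stays null, so an unconditional `getPartName()` dereference is the NPE in the trace.

```java
import java.io.Closeable;
import java.io.IOException;

public final class CloseQuietlyDemo {
    // Stand-in for OzoneOutputStream: commit happens inside close().
    static final class DemoOutputStream implements Closeable {
        private final boolean commitFails;
        private String commitUploadPartInfo; // null until commit succeeds

        DemoOutputStream(boolean commitFails) {
            this.commitFails = commitFails;
        }

        @Override
        public void close() throws IOException {
            if (commitFails) { // e.g. NO_SUCH_MULTIPART_UPLOAD_ERROR on OM
                throw new IOException("No such Multipart upload");
            }
            commitUploadPartInfo = "partName-0001";
        }

        String getCommitUploadPartInfo() {
            return commitUploadPartInfo;
        }
    }

    // Mirrors IOUtils.closeQuietly: IOException from close() is swallowed,
    // so the caller never learns that the commit failed.
    static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // silently dropped
        }
    }

    // Returns the commit info, or null when the swallowed failure left none;
    // the fix in this PR is to null-check instead of dereferencing blindly.
    public static String eTagAfterQuietClose(boolean commitFails) {
        DemoOutputStream out = new DemoOutputStream(commitFails);
        closeQuietly(out);
        return out.getCommitUploadPartInfo();
    }
}
```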
   
   OM Trace
   > 2020-10-10 21:29:26,644 ERROR 
org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
 MultipartUpload Commit is failed for Key:python_1_1G.dat in Volume/Bucket 
s325d55ad283aa400af464c76d713c07ad/pythonbucket
   NO_SUCH_MULTIPART_UPLOAD_ERROR 
org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload is 
with specified uploadId fc5051bf-97a7-454b-9761-18669f7d3b02-105010697301721222
   at 
org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:174)
   at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:227)
   at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:224)
   at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:145)
   at 
org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
   at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:113)
   at 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
   at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:422)
   at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)
   
   
   S3G Trace
   > 2020-10-10 21:29:26,544 WARN 
org.apache.ratis.metrics.impl.MetricRegistriesImpl: First MetricRegistry has 
been created without registering reporters. You may need to call 
MetricRegistries.global().addReportRegistration(...) before.
   2020-10-10 21:29:26,648 WARN org.eclipse.jetty.server.HttpChannel: 
/pythonbucket/python_1_1G.dat
   javax.servlet.ServletException: javax.servlet.ServletException: 
java.lang.NullPointerException
   at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:162)
   at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
   at org.eclipse.jetty.server.Server.handle(Server.java:500)
   at 
org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
   at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
   at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270)
   at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
   at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
   at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
   at 

[GitHub] [hadoop-ozone] rakeshadr commented on a change in pull request #1473: HDDS-4266: CreateFile : store parent dir entries into DirTable and file entry into separate FileTable

2020-10-12 Thread GitBox


rakeshadr commented on a change in pull request #1473:
URL: https://github.com/apache/hadoop-ozone/pull/1473#discussion_r503628506



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileCreateRequestV1.java
##
@@ -0,0 +1,289 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.ozone.om.request.file;
+
+import com.google.common.base.Optional;
+import org.apache.hadoop.hdds.utils.db.cache.CacheKey;
+import org.apache.hadoop.hdds.utils.db.cache.CacheValue;
+import org.apache.hadoop.ozone.audit.OMAction;
+import org.apache.hadoop.ozone.om.OMMetadataManager;
+import org.apache.hadoop.ozone.om.OMMetrics;
+import org.apache.hadoop.ozone.om.OzoneManager;
+import org.apache.hadoop.ozone.om.exceptions.OMException;
+import org.apache.hadoop.ozone.om.helpers.*;
+import org.apache.hadoop.ozone.om.ratis.utils.OzoneManagerDoubleBufferHelper;
+import org.apache.hadoop.ozone.om.request.util.OmResponseUtil;
+import org.apache.hadoop.ozone.om.response.OMClientResponse;
+import org.apache.hadoop.ozone.om.response.file.OMFileCreateResponse;
+import org.apache.hadoop.ozone.om.response.file.OMFileCreateResponseV1;
+import org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos.*;
+import org.apache.hadoop.ozone.security.acl.IAccessAuthorizer;
+import org.apache.hadoop.ozone.security.acl.OzoneObj;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+import static org.apache.hadoop.ozone.om.lock.OzoneManagerLock.Resource.BUCKET_LOCK;
+import static org.apache.hadoop.ozone.om.request.file.OMFileRequest.OMDirectoryResult.*;
+
+/**
+ * Handles create file request layout version1.
+ */
+public class OMFileCreateRequestV1 extends OMFileCreateRequest {
+
+  private static final Logger LOG =
+  LoggerFactory.getLogger(OMFileCreateRequestV1.class);
+  public OMFileCreateRequestV1(OMRequest omRequest) {
+super(omRequest);
+  }
+
+  @Override
+  @SuppressWarnings("methodlength")
+  public OMClientResponse validateAndUpdateCache(OzoneManager ozoneManager,
+  long trxnLogIndex, OzoneManagerDoubleBufferHelper omDoubleBufferHelper) {
+
+CreateFileRequest createFileRequest = getOmRequest().getCreateFileRequest();
+KeyArgs keyArgs = createFileRequest.getKeyArgs();
+Map auditMap = buildKeyArgsAuditMap(keyArgs);
+
+String volumeName = keyArgs.getVolumeName();
+String bucketName = keyArgs.getBucketName();
+String keyName = keyArgs.getKeyName();
+
+// if isRecursive is true, the file will be created even if the parent
+// directories do not exist.
+boolean isRecursive = createFileRequest.getIsRecursive();
+if (LOG.isDebugEnabled()) {
+  LOG.debug("File create for : " + volumeName + "/" + bucketName + "/"
+  + keyName + ":" + isRecursive);
+}
+
+// if isOverWrite is true, the file will be overwritten.
+boolean isOverWrite = createFileRequest.getIsOverwrite();
+
+OMMetrics omMetrics = ozoneManager.getMetrics();
+omMetrics.incNumCreateFile();
+
+OMMetadataManager omMetadataManager = ozoneManager.getMetadataManager();
+
+boolean acquiredLock = false;
+
+OmVolumeArgs omVolumeArgs = null;
+OmBucketInfo omBucketInfo = null;
+final List locations = new ArrayList<>();
+List missingParentInfos;
+int numKeysCreated = 0;
+
+OMClientResponse omClientResponse = null;
+OMResponse.Builder omResponse = OmResponseUtil.getOMResponseBuilder(
+getOmRequest());
+IOException exception = null;
+Result result = null;
+try {
+  keyArgs = resolveBucketLink(ozoneManager, keyArgs, auditMap);
+  volumeName = keyArgs.getVolumeName();
+  bucketName = keyArgs.getBucketName();
+
+  if (keyName.length() == 0) {
+// Check if this is the root of the filesystem.
+throw new OMException("Can not write to directory: " + keyName,
+OMException.ResultCodes.NOT_A_FILE);
+  }
+
+  // check Acl
+  checkKeyAcls(ozoneManager, volumeName, bucketName, keyName,
+  
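The V1 request quoted above (truncated by the archive) stores each missing parent directory as its own DirTable entry. A hypothetical sketch of just the parent-prefix enumeration — the real OMFileRequest logic also consults the table cache and DB before deciding which parents are actually missing:

```java
import java.util.ArrayList;
import java.util.List;

public final class MissingParentsDemo {
    // For a key like "a/b/c/file.txt", returns the directory prefixes that
    // may need DirTable entries: ["a", "a/b", "a/b/c"]. Purely illustrative
    // of the layout-V1 idea; existence checks are omitted.
    public static List<String> parentPrefixes(String keyName) {
        List<String> parents = new ArrayList<>();
        int idx = keyName.indexOf('/');
        while (idx != -1) {
            parents.add(keyName.substring(0, idx)); // prefix up to this '/'
            idx = keyName.indexOf('/', idx + 1);
        }
        return parents;
    }
}
```

A key with no separators has no parents, so `parentPrefixes("file.txt")` is empty; only the intermediate prefixes would be considered for DirTable entries.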

[jira] [Resolved] (HDDS-4172) Implement Finalize command in Ozone Manager server.

2020-10-12 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan resolved HDDS-4172.
-
Resolution: Fixed

PR Merged.

> Implement Finalize command in Ozone Manager server.
> ---
>
> Key: HDDS-4172
> URL: https://issues.apache.org/jira/browse/HDDS-4172
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Manager
>Affects Versions: 1.1.0
>Reporter: Aravindan Vijayan
>Assignee: István Fajth
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.1.0
>
>
> Using changes from HDDS-4141 and HDDS-3829, we can finish the OM finalization 
> logic by implementing the Ratis request to Finalize.
> On the server side, this finalize command should update the internal Upgrade 
> state to "Finalized". This operation can be a No-Op if there are no layout 
> changes across an upgrade.






[GitHub] [hadoop-ozone] avijayanhwx merged pull request #1456: HDDS-4172. Implement Finalize command in Ozone Manager server.

2020-10-12 Thread GitBox


avijayanhwx merged pull request #1456:
URL: https://github.com/apache/hadoop-ozone/pull/1456


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org






[GitHub] [hadoop-ozone] GlenGeng commented on a change in pull request #1228: HDDS-3995. Fix s3g met NPE exception while write file by multiPartUpload

2020-10-12 Thread GitBox


GlenGeng commented on a change in pull request #1228:
URL: https://github.com/apache/hadoop-ozone/pull/1228#discussion_r503653865



##
File path: 
hadoop-ozone/s3gateway/src/main/java/org/apache/hadoop/ozone/s3/endpoint/ObjectEndpoint.java
##
@@ -562,13 +562,18 @@ private Response createMultipartKey(String bucket, String 
key, long length,
 
   OmMultipartCommitUploadPartInfo omMultipartCommitUploadPartInfo =
   ozoneOutputStream.getCommitUploadPartInfo();
-  String eTag = omMultipartCommitUploadPartInfo.getPartName();
+  if (omMultipartCommitUploadPartInfo != null) {

Review comment:
   Hey @bharatviswa504
   
   It seems this NPE in S3G occurred again (on the latest master). I have some
clues, but haven't found the root cause yet.
   
   On the OM side, `S3MultipartUploadCommitPartRequest` failed because the
upload id was not found.
   On the S3G side,
   ```
 } finally {
   IOUtils.closeQuietly(ozoneOutputStream);
 }
   ```
   `OzoneOutputStream.close()` will call `OmMultipartCommitUploadPartInfo
commitMultipartUploadPart(OmKeyArgs omKeyArgs, long clientID) throws
IOException;` to commit the key, but `IOUtils.closeQuietly()` swallows the
IOException raised by that failure. As a result,
   ```
  OmMultipartCommitUploadPartInfo omMultipartCommitUploadPartInfo =
  ozoneOutputStream.getCommitUploadPartInfo();
   ```
   returns null and triggers the NPE.
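
The swallowed-exception path above can be reproduced with a tiny standalone sketch. The class names (`QuietCloseNpeDemo`, `FakeOutputStream`) are stand-ins, not Ozone's real API; the point is that a "quiet" close discards the commit failure, so the caller later sees a null commit info instead of the real NO_SUCH_MULTIPART_UPLOAD_ERROR.

```java
import java.io.Closeable;
import java.io.IOException;

// Minimal, self-contained sketch of the failure mode: FakeOutputStream and
// closeQuietly are stand-ins for OzoneOutputStream / IOUtils.closeQuietly,
// not the real classes.
public class QuietCloseNpeDemo {

  // Stand-in for OzoneOutputStream: the part commit happens inside close().
  static final class FakeOutputStream implements Closeable {
    private String commitUploadPartInfo;  // stays null unless commit succeeds

    @Override
    public void close() throws IOException {
      // Simulates the OM rejecting the commit (upload id not found).
      throw new IOException("NO_SUCH_MULTIPART_UPLOAD_ERROR");
    }

    String getCommitUploadPartInfo() {
      return commitUploadPartInfo;
    }
  }

  // Mimics a "quiet" close: any IOException is silently discarded.
  static void closeQuietly(Closeable c) {
    try {
      c.close();
    } catch (IOException swallowed) {
      // This is what hides the real failure from the caller.
    }
  }

  static String demo() {
    FakeOutputStream out = new FakeOutputStream();
    closeQuietly(out);  // the commit failure is lost here
    String info = out.getCommitUploadPartInfo();
    return info == null ? "commit info is null -> NPE risk" : "eTag=" + info;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

Closing the stream inside the try block (or rethrowing from the finally) would surface the real OM error to the S3G caller instead of the downstream NPE.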
   
   OM Trace
   > 2020-10-10 21:29:26,644 ERROR 
org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
 MultipartUpload Commit is failed for Key:python_1_1G.dat in Volume/Bucket 
s325d55ad283aa400af464c76d713c07ad/pythonbucket
   NO_SUCH_MULTIPART_UPLOAD_ERROR 
org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload is 
with specified uploadId fc5051bf-97a7-454b-9761-18669f7d3b02-105010697301721222
   at 
org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:174)
   at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handleWriteRequest(OzoneManagerRequestHandler.java:227)
   at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:224)
   at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:145)
   at 
org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:87)
   at 
org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:113)
   at 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
   at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:999)
   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:927)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:422)
   at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2915)
   
   
   S3G Trace
   > 2020-10-10 21:29:26,544 WARN 
org.apache.ratis.metrics.impl.MetricRegistriesImpl: First MetricRegistry has 
been created without registering reporters. You may need to call 
MetricRegistries.global().addReportRegistration(...) before.
   2020-10-10 21:29:26,648 WARN org.eclipse.jetty.server.HttpChannel: 
/pythonbucket/python_1_1G.dat
   javax.servlet.ServletException: javax.servlet.ServletException: 
java.lang.NullPointerException
   at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:162)
   at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
   at org.eclipse.jetty.server.Server.handle(Server.java:500)
   at 
org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:383)
   at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:547)
   at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:375)
   at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:270)
   at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
   at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:103)
   at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:117)
   at 
org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
   at 

[GitHub] [hadoop-ozone] amaliujia commented on pull request #1445: HDDS-4272. Volume namespace: add usedNamespace and update it when create and delete bucket

2020-10-12 Thread GitBox


amaliujia commented on pull request #1445:
URL: https://github.com/apache/hadoop-ozone/pull/1445#issuecomment-707456469


   @ChenSammi 
   
   Thanks for pointing out the linked bucket case. Is there an example showing
how linked buckets are tested (e.g. a unit test)? I am trying to find a way to
create a linked bucket and verify whether the quota is affected.






[GitHub] [hadoop-ozone] xiaoyuyao commented on a change in pull request #1301: HDDS-3882. Update modification time when updating volume/bucket/key ACLs

2020-10-12 Thread GitBox


xiaoyuyao commented on a change in pull request #1301:
URL: https://github.com/apache/hadoop-ozone/pull/1301#discussion_r503654340



##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/bucket/acl/OMBucketAddAclRequest.java
##
@@ -58,6 +60,19 @@
 };
   }
 
+  @Override
+  public OMRequest preExecute(OzoneManager ozoneManager) throws IOException {
+long modificationTime = Time.now();
+OzoneManagerProtocolProtos.AddAclRequest addAclRequest =
+getOmRequest().getAddAclRequest().toBuilder()
+.setModificationTime(modificationTime).build();

Review comment:
   If you don't call .build() here on line 68, we can avoid the unnecessary
toBuilder() conversion on line 71.
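
The suggestion amounts to deferring `build()` until the final message is needed. Here is a toy protobuf-style builder illustrating it; `AddAclRequest` below is a hand-rolled stand-in, not the generated protobuf class.

```java
// Toy illustration of the review comment: each build()/toBuilder() pair
// allocates an intermediate immutable message, so staying on the Builder
// until the end avoids that round trip. AddAclRequest is a hand-rolled
// stand-in for the generated protobuf class.
public class BuilderReuseDemo {

  static final class AddAclRequest {
    final long modificationTime;

    private AddAclRequest(long modificationTime) {
      this.modificationTime = modificationTime;
    }

    Builder toBuilder() {
      return new Builder(modificationTime);
    }

    static final class Builder {
      private long modificationTime;

      Builder(long modificationTime) {
        this.modificationTime = modificationTime;
      }

      Builder setModificationTime(long t) {
        this.modificationTime = t;
        return this;
      }

      AddAclRequest build() {
        return new AddAclRequest(modificationTime);
      }
    }
  }

  static long demo() {
    AddAclRequest original = new AddAclRequest(0L);
    // Deferred build(): stay on the Builder, allocate one message at the end.
    AddAclRequest updated = original.toBuilder()
        .setModificationTime(42L)
        .build();
    return updated.modificationTime;
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

With the real generated classes, the nested-message setter overload that accepts a `Builder` serves the same purpose: the sub-message is built exactly once, inside the final `build()`.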

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/bucket/acl/OMBucketRemoveAclRequest.java
##
@@ -55,6 +57,19 @@
 };
   }
 
+  @Override
+  public OMRequest preExecute(OzoneManager ozoneManager) throws IOException {
+long modificationTime = Time.now();
+OzoneManagerProtocolProtos.RemoveAclRequest removeAclRequest =
+getOmRequest().getRemoveAclRequest().toBuilder()

Review comment:
   same as above. 

##
File path: 
hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/bucket/acl/OMBucketSetAclRequest.java
##
@@ -55,6 +57,19 @@
 };
   }
 
+  @Override
+  public OMRequest preExecute(OzoneManager ozoneManager) throws IOException {
+long modificationTime = Time.now();
+OzoneManagerProtocolProtos.SetAclRequest setAclRequest =
+getOmRequest().getSetAclRequest().toBuilder()

Review comment:
   same as above. 




