[jira] [Created] (HDFS-15897) SCM HA should be disabled in secure cluster

2021-03-15 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDFS-15897:
-

 Summary: SCM HA should be disabled in secure cluster
 Key: HDFS-15897
 URL: https://issues.apache.org/jira/browse/HDFS-15897
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham


SCM HA security work is still in progress.

[~elek] brought up the point that, until the SCM HA branch is merged, we should 
add a safeguard check that fails cluster startup when SCM HA is enabled on a 
secure cluster.
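A minimal sketch of such a safeguard (the config key names and the init-time 
hook are assumptions here; the actual patch may wire this differently):
{code:java}
// Hypothetical guard in SCM startup: refuse to start when both SCM HA and
// security are enabled, since SCM HA security work is not finished yet.
// The config key names below are assumptions, not the final ones.
private static void checkScmHaSecurity(OzoneConfiguration conf) {
  boolean securityEnabled = conf.getBoolean("ozone.security.enabled", false);
  boolean scmHaEnabled = conf.getBoolean("ozone.scm.ratis.enable", false);
  if (securityEnabled && scmHaEnabled) {
    throw new IllegalStateException(
        "SCM HA is not yet supported on secure clusters; disable one of "
            + "ozone.security.enabled / ozone.scm.ratis.enable.");
  }
}
{code}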






[jira] [Updated] (HDFS-15165) In Du missed calling getAttributesProvider

2020-02-18 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-15165:
--
Attachment: HDFS-15165.01.patch




[jira] [Commented] (HDFS-15165) In Du missed calling getAttributesProvider

2020-02-18 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039643#comment-17039643
 ] 

Bharat Viswanadham commented on HDFS-15165:
---

Thank you [~sodonnell] for the review.
{quote}I believe you need to pass the iNodeAttr into the AccessControlException 
as the first parameter rather than inode at the bottom of the method. 
Otherwise, if there is an access exception, the log message will contain the 
HDFS permissions for the inode, rather than the provider permissions, which 
would be confusing, e.g.:
{quote}
Yes, I agree; done.

I used the test case you provided; thanks for the test.
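For reference, the shape of that change, shown as comments only 
(toAccessControlString and the argument order are assumed from the review 
comment above; the exact FSPermissionChecker signature may differ):
{code:java}
// Before: the exception is built from the raw inode, so the log shows the
// HDFS-native permissions.
//   throw new AccessControlException(
//       toAccessControlString(inode, path, access));
// After: build it from the provider-supplied attributes, so the log shows
// what was actually enforced.
//   throw new AccessControlException(
//       toAccessControlString(inodeAttr, path, access));
{code}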

 




[jira] [Updated] (HDFS-15165) In Du missed calling getAttributesProvider

2020-02-13 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-15165:
--
Status: Patch Available  (was: Open)




[jira] [Comment Edited] (HDFS-15165) In Du missed calling getAttributesProvider

2020-02-13 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036441#comment-17036441
 ] 

Bharat Viswanadham edited comment on HDFS-15165 at 2/13/20 7:12 PM:


I tried the suggested fix on the cluster.

The du command now works on an HDFS path managed by Sentry.
{code:java}
[root@gg-620-1 ~]# hdfs dfs -du 
/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging
 805306368 2147483648 
/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging/9f4f1d1b671d0714_1fbeb6b7
 [root@gg-620-1 ~]# hdfs dfs -du -s 
/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging
 1073741824 2684354560 
/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging
 [root@gg-620-1 ~]#{code}


was (Author: bharatviswa):
With the suggested fix, the du command now works on an HDFS path managed by 
Sentry.
{code:java}
[root@gg-620-1 ~]# hdfs dfs -du 
/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging
 805306368 2147483648 
/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging/9f4f1d1b671d0714_1fbeb6b7
 [root@gg-620-1 ~]# hdfs dfs -du -s 
/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging
 1073741824 2684354560 
/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging
 [root@gg-620-1 ~]#{code}




[jira] [Commented] (HDFS-15165) In Du missed calling getAttributesProvider

2020-02-13 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036441#comment-17036441
 ] 

Bharat Viswanadham commented on HDFS-15165:
---




[jira] [Updated] (HDFS-15165) In Du missed calling getAttributesProvider

2020-02-13 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-15165:
--
Attachment: HDFS-15165.00.patch




[jira] [Updated] (HDFS-15165) In Du missed calling getAttributesProvider

2020-02-13 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDFS-15165:
--
Description: 
HDFS-12130 changed the behavior of the du command.

It merged the permission check and the usage computation into a single step.

During this change, where inode attributes are needed, it simply used 
inode.getAttributes(). But when an attribute provider class is configured, we 
should ask the configured attribute provider object for the INodeAttributes and 
use the returned INodeAttributes during checkPermission.

After HDFS-12130, the code looks like this:

 
{code:java}
byte[][] localComponents = {inode.getLocalNameBytes()};
INodeAttributes[] iNodeAttr = {inode.getSnapshotINode(snapshotId)};
enforcer.checkPermission(
    fsOwner, supergroup, callerUgi,
    iNodeAttr, // single inode attr in the array
    new INode[]{inode}, // single inode in the array
    localComponents, snapshotId,
    null, -1, // this will skip checkTraverse() because
              // not checking ancestor here
    false, null, null,
    access, // the target access to be checked against the inode
    null, // passing null sub access avoids checking children
    false);
{code}
 

Looking at the second line, it is missing the check for whether an attribute 
provider class is configured; if one is, that provider should be used to obtain 
the INodeAttributes. Because of this, when an HDFS path is managed by Sentry 
and the attribute provider class is configured as SentryINodeAttributesProvider, 
the code never consults the Sentry provider and therefore ignores any AclFeature 
it supplies when ACLs are set. This causes an AccessControlException when the 
du command is run against an HDFS path managed by Sentry.

 
{code:java}
[root@gg-620-1 ~]# hdfs dfs -du /dev/edl/sc/consumer/lpfg/str/edf/abc/
du: Permission denied: user=systest, access=READ_EXECUTE, 
inode="/dev/edl/sc/consumer/lpfg/str/lpfg_wrk/PRISMA_TO_ICERTIS_OUTBOUND_RM_MASTER/_impala_insert_staging":impala:hive:drwxrwx--x{code}

  was:
HDFS-12130 changed the behavior of du.

During that change, getInodeAttributes missed calling 
getAttributesProvider().getAttributes() when a provider is configured.

Because of this, when Sentry is configured for an HDFS path and the attribute 
provider class is set, the AclFeature from Sentry is never picked up, and 
running the du command on a Sentry-managed HDFS path fails with an 
AccessControlException.

This Jira is to fix this issue. The provider is configured as follows:
{code:xml}
<property>
  <name>dfs.namenode.inode.attributes.provider.class</name>
  <value>org.apache.sentry.hdfs.SentryINodeAttributesProvider</value>
</property>
{code}
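For context, the fix implied above is roughly the following shape: consult the 
configured provider before building the attributes array. This is only a 
sketch; attributeProvider is a stand-in for however FSDirectory exposes the 
configured INodeAttributeProvider, and the actual patch may differ:
{code:java}
INodeAttributes nodeAttrs = inode.getSnapshotINode(snapshotId);
if (attributeProvider != null) {
  // Let the provider (e.g. SentryINodeAttributesProvider) overlay its own
  // owner/group/permission/AclFeature on top of the HDFS-native attributes.
  nodeAttrs = attributeProvider.getAttributes(
      new String[]{inode.getLocalName()}, nodeAttrs);
}
INodeAttributes[] iNodeAttr = {nodeAttrs};
{code}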



[jira] [Created] (HDFS-15165) In Du missed calling getAttributesProvider

2020-02-12 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDFS-15165:
-

 Summary: In Du missed calling getAttributesProvider
 Key: HDFS-15165
 URL: https://issues.apache.org/jira/browse/HDFS-15165
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham





[jira] [Resolved] (HDDS-2536) Add ozone.om.internal.service.id to OM HA configuration

2019-11-20 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2536.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Add ozone.om.internal.service.id to OM HA configuration
> ---
>
> Key: HDDS-2536
> URL: https://issues.apache.org/jira/browse/HDDS-2536
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This Jira is to add ozone.om.internal.service.id to let an OM know which 
> service it belongs to.
>  
> We already have ozone.om.service.ids, where all the service ids in a cluster 
> can be defined (this can happen if the same config is shared across the 
> cluster).
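A configuration sketch of how the two keys relate (the service-id values below 
are placeholders):
{code:xml}
<!-- All OM service ids known in the (possibly shared) cluster config. -->
<property>
  <name>ozone.om.service.ids</name>
  <value>omServiceId1,omServiceId2</value>
</property>
<!-- Which of those service ids this OM instance itself belongs to. -->
<property>
  <name>ozone.om.internal.service.id</name>
  <value>omServiceId1</value>
</property>
{code}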






[jira] [Comment Edited] (HDDS-2591) No tailMap needed for startIndex 0 in ContainerSet#listContainer

2019-11-20 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978910#comment-16978910
 ] 

Bharat Viswanadham edited comment on HDDS-2591 at 11/21/19 2:10 AM:


Hi [~adoroszlai]

This API was added so that it can be used by the scanner implementations. I 
think, for now, we can leave it and fix the reported issue.


was (Author: bharatviswa):
Hi [~adoroszlai]

This API is used so that it can be used by Scanners. I think, for now, we can 
leave it, and fix the issue reported.

> No tailMap needed for startIndex 0 in ContainerSet#listContainer
> 
>
> Key: HDDS-2591
> URL: https://issues.apache.org/jira/browse/HDDS-2591
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: Ozone Datanode
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Minor
>
> {{ContainerSet#listContainer}} has this code:
> {code:title=https://github.com/apache/hadoop-ozone/blob/3c334f6a7b344e0e5f52fec95071c369286cfdcb/hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/ContainerSet.java#L198}
> map = containerMap.tailMap(containerMap.firstKey(), true);
> {code}
> This is equivalent to:
> {code}
> map = containerMap;
> {code}
> since {{tailMap}} is a sub-map with all keys larger than or equal to 
> ({{inclusive=true}}) {{firstKey}}, which is the lowest key in the map. So it 
> is a sub-map with all keys, i.e. the whole map.
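A quick standalone demonstration of that equivalence with a plain TreeMap 
(outside of Ozone):
{code:java}
import java.util.TreeMap;

public class TailMapDemo {
  public static void main(String[] args) {
    TreeMap<Long, String> containerMap = new TreeMap<>();
    containerMap.put(1L, "container-1");
    containerMap.put(2L, "container-2");
    containerMap.put(3L, "container-3");
    // tailMap from the lowest key, inclusive, is the whole map again.
    System.out.println(containerMap.tailMap(containerMap.firstKey(), true)
        .equals(containerMap)); // prints: true
  }
}
{code}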






[jira] [Commented] (HDDS-2591) No tailMap needed for startIndex 0 in ContainerSet#listContainer

2019-11-20 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978910#comment-16978910
 ] 

Bharat Viswanadham commented on HDDS-2591:
--




[jira] [Updated] (HDDS-2594) S3 RangeReads failing with NumberFormatException

2019-11-20 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2594:
-
Status: Patch Available  (was: Open)

> S3 RangeReads failing with NumberFormatException
> 
>
> Key: HDDS-2594
> URL: https://issues.apache.org/jira/browse/HDDS-2594
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  
> {code:java}
> 2019-11-20 15:32:04,684 WARN org.eclipse.jetty.servlet.ServletHandler:
> javax.servlet.ServletException: java.lang.NumberFormatException: For input 
> string: "3977248768"
>         at 
> org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:432)
>         at 
> org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:370)
>         at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:389)
>         at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:342)
>         at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:229)
>         at 
> org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:840)
>         at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1780)
>         at 
> org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1609)
>         at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>         at 
> org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
>         at 
> org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1767)
>         at 
> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:583)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>         at 
> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>         at 
> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
>         at 
> org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
>         at 
> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:513)
>         at 
> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>         at 
> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
>         at 
> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>         at 
> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
>         at 
> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
>         at org.eclipse.jetty.server.Server.handle(Server.java:539)
>         at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:333)
>         at 
> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
>         at 
> org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
>         at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
>         at 
> org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
>         at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
>         at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
>         at 
> org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
>         at 
> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
>         at 
> org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
>         at java.lang.Thread.run(Thread.java:748)
> {code}
>  
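A note on the failing value: 3977248768 is larger than Integer.MAX_VALUE 
(2147483647), so it cannot be parsed as a 32-bit int. A plausible shape of the 
bug, not confirmed in this thread, is a Range-header byte offset being parsed 
with the int parser:
{code:java}
// 3977248768 does not fit in an int, so this throws NumberFormatException:
long broken = Integer.parseInt("3977248768");
// Byte offsets into multi-GB objects need a 64-bit parse:
long ok = Long.parseLong("3977248768");
{code}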






[jira] [Created] (HDDS-2594) S3 RangeReads failing with NumberFormatException

2019-11-20 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2594:


 Summary: S3 RangeReads failing with NumberFormatException
 Key: HDDS-2594
 URL: https://issues.apache.org/jira/browse/HDDS-2594
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


 



[jira] [Assigned] (HDDS-2594) S3 RangeReads failing with NumberFormatException

2019-11-20 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2594?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-2594:


Assignee: Bharat Viswanadham




[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-20 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978812#comment-16978812
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

With the HDDS-2477 PR, I was able to verify that multipart upload (MPU) works 
for larger files.

https://github.com/apache/hadoop-ozone/pull/159#issuecomment-556527944
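For anyone reproducing this, one way to exercise MPU against the Ozone S3 
gateway (the endpoint, bucket, and file names below are placeholders; aws s3 cp 
switches to multipart automatically for large files):
{code:bash}
# Upload a multi-GB file through the S3 gateway; the CLI splits it into parts.
aws s3 cp ./large-file.img s3://ozone-test/20191012/large-file.img \
    --endpoint-url http://localhost:9878
{code}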

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2018-11-15-OM-logs.txt, 2019-11-06_18_13_57_422_ERROR, 
> hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png, 
> om-audit-VM_50_210_centos.log, om_audit_log_plc_1570863541668_9278.txt
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path 
> on VM0, while reading data from VM0 local disk and write to mount path. The 
> dataset has various sizes of files from 0 byte to GB-level and it has a 
> number of ~50,000 files. 
> The writing is slow (1 GB in ~10 minutes) and it stops after around 4 GB. 
> Looking at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing 
> errors related to multipart upload. These errors eventually cause the writing 
> to terminate and the OM to shut down. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB
>  .java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at j

[jira] [Resolved] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key

2019-11-20 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2241.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipelines for a key
> 
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> information from SCM through an RPC for every block in the key. For large 
> files > 1GB, we may end up making a lot of RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, a simpler change would be to have a map (method local) of 
> ContainerID -> Pipeline that we get from SCM, so that we don't need to make 
> repeated calls to SCM for the same containerID for a key (see the sketch 
> after this list). _Here, Number of calls = Number of unique containerIDs_
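A sketch of that second option, a method-local memoization of ContainerID -> 
Pipeline (scmClient, locationList, and setPipeline are assumptions about the 
surrounding KeyManagerImpl code; getContainerWithPipeline is the existing SCM 
lookup):
{code:java}
// One SCM RPC per *unique* container instead of one per block.
Map<Long, Pipeline> pipelineCache = new HashMap<>();
for (OmKeyLocationInfo block : locationList) {
  long containerId = block.getContainerID();
  Pipeline pipeline = pipelineCache.get(containerId);
  if (pipeline == null) {
    pipeline = scmClient.getContainerWithPipeline(containerId).getPipeline();
    pipelineCache.put(containerId, pipeline);
  }
  block.setPipeline(pipeline);
}
{code}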






[jira] [Resolved] (HDDS-2247) Delete FileEncryptionInfo from KeyInfo when a Key is deleted

2019-11-19 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2247.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Delete FileEncryptionInfo from KeyInfo when a Key is deleted
> 
>
> Key: HDDS-2247
> URL: https://issues.apache.org/jira/browse/HDDS-2247
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Dinesh Chitlangia
>Assignee: Dinesh Chitlangia
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> As part of HDDS-2174 we are deleting GDPR Encryption Key on delete file 
> operation.
> However, if KMS is enabled, we are skipping GDPR Encryption Key approach when 
> writing file in a GDPR enforced Bucket.
> {code:java}
> final FileEncryptionInfo feInfo = keyOutputStream.getFileEncryptionInfo();
> if (feInfo != null) {
>   // KMS path: wrap the key output stream with a KMS-derived crypto stream.
>   KeyProvider.KeyVersion decrypted = getDEK(feInfo);
>   final CryptoOutputStream cryptoOut =
>       new CryptoOutputStream(keyOutputStream,
>           OzoneKMSUtil.getCryptoCodec(conf, feInfo),
>           decrypted.getMaterial(), feInfo.getIV());
>   return new OzoneOutputStream(cryptoOut);
> } else {
>   // GDPR path: encrypt with the symmetric key kept in the key's metadata.
>   try {
>     GDPRSymmetricKey gk;
>     Map<String, String> openKeyMetadata =
>         openKey.getKeyInfo().getMetadata();
>     if (Boolean.valueOf(openKeyMetadata.get(OzoneConsts.GDPR_FLAG))) {
>       gk = new GDPRSymmetricKey(
>           openKeyMetadata.get(OzoneConsts.GDPR_SECRET),
>           openKeyMetadata.get(OzoneConsts.GDPR_ALGORITHM)
>       );
>       gk.getCipher().init(Cipher.ENCRYPT_MODE, gk.getSecretKey());
>       return new OzoneOutputStream(
>           new CipherOutputStream(keyOutputStream, gk.getCipher()));
>     }
>   } catch (Exception ex) {
>     throw new IOException(ex);
>   }
> {code}
> In such scenario, when KMS is enabled & GDPR enforced on a bucket, if user 
> deletes a file, we should delete the {{FileEncryptionInfo}} from KeyInfo, 
> before moving it to deletedTable, else we cannot guarantee Right to Erasure.






[jira] [Created] (HDDS-2581) Make OM Ha config to use Java Configs

2019-11-19 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2581:


 Summary: Make OM Ha config to use Java Configs
 Key: HDDS-2581
 URL: https://issues.apache.org/jira/browse/HDDS-2581
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


This Jira is created based on comments from [~aengineer] during the HDDS-2536 
review: can we please use Java configs instead of the old-style config when 
adding a config?

This Jira is to move all OM HA config to the new style (the Java-config-based 
approach).
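For reference, a sketch of the annotation-driven style from 
org.apache.hadoop.hdds.conf (the group prefix, key, and tag below are 
illustrative, not the final names):
{code:java}
@ConfigGroup(prefix = "ozone.om.ha")
public class OzoneManagerHAConfig {
  @Config(key = "enable",
      defaultValue = "false",
      type = ConfigType.BOOLEAN,
      description = "Enable HA for Ozone Manager.",
      tags = {ConfigTag.OM})
  private boolean haEnabled;

  public boolean isHaEnabled() {
    return haEnabled;
  }
}
{code}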






[jira] [Updated] (HDDS-2581) Make OM Ha config to use Java Configs

2019-11-19 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2581:
-
Labels: newbie  (was: )




[jira] [Updated] (HDDS-2486) Sonar: Avoid empty test methods

2019-11-18 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2486:
-
Fix Version/s: 0.5.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Sonar: Avoid empty test methods
> ---
>
> Key: HDDS-2486
> URL: https://issues.apache.org/jira/browse/HDDS-2486
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Reporter: Attila Doroszlai
>Assignee: Dinesh Chitlangia
>Priority: Minor
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{TestRDBTableStore#toIOException}} is empty.
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-5kKcVY8lQ4ZsQH&open=AW5md-5kKcVY8lQ4ZsQH
> Also {{TestTypedRDBTableStore#toIOException}}:
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-5qKcVY8lQ4ZsQJ&open=AW5md-5qKcVY8lQ4ZsQJ






[jira] [Updated] (HDDS-2536) Add ozone.om.internal.service.id to OM HA configuration

2019-11-18 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2536:
-
Parent: HDDS-505
Issue Type: Sub-task  (was: Bug)




[jira] [Created] (HDDS-2536) Add ozone.om.internal.service.id to OM HA configuration

2019-11-18 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2536:


 Summary: Add ozone.om.internal.service.id to OM HA configuration
 Key: HDDS-2536
 URL: https://issues.apache.org/jira/browse/HDDS-2536
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham





[jira] [Comment Edited] (HDDS-2535) TestOzoneManagerDoubleBufferWithOMResponse is flaky

2019-11-18 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976850#comment-16976850
 ] 

Bharat Viswanadham edited comment on HDDS-2535 at 11/18/19 8:25 PM:


Hi [~elek]
{quote}Independent from the flakiness I think a test where the timeout is 8 
minutes and starts 1000 threads to insert 500 buckets (500_000 buckets all 
together) it's more like an integration test and would be better to move the 
slowest part to the integration-test project.
{quote}
I think it should run quickly now with the fix; it will not take that much 
time. On my local laptop it always completes in about 30 seconds.

 

And on the GitHub run I see it completes in 53 seconds. I want to keep this 
test as a unit test, as it will detect any failure in the DoubleBuffer, which 
is a critical component of OM. (The reason I want it as a UT: we are going to 
enforce soon that UTs should always be green.)
[INFO] Running 
org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 53.536 s 
- in org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
(line 1164 of 
https://github.com/bharatviswa504/hadoop-ozone/runs/308637202#step:3:1164)
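(A minimal sketch of the waitFor-style polling this test relies on, assuming 
GenericTestUtils from hadoop-common; the counter below stands in for the OM 
double buffer and is invented for illustration.)
{code:java}
import java.util.concurrent.atomic.AtomicLong;
import org.apache.hadoop.test.GenericTestUtils;

public class DoubleBufferWaitSketch {
  public static void main(String[] args) throws Exception {
    AtomicLong flushedTransactionCount = new AtomicLong();

    // Background work stands in for the double buffer flushing entries.
    new Thread(() -> {
      for (int i = 0; i < 500; i++) {
        flushedTransactionCount.incrementAndGet();
      }
    }).start();

    // Poll every 100 ms, give up after 500,000 ms (the timeout in the
    // reported stack trace). Under heavy CPU load a tight timeout expires
    // before the flush completes, which is the reported flakiness.
    GenericTestUtils.waitFor(
        () -> flushedTransactionCount.get() == 500, 100, 500_000);
  }
}
{code}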
 

 


was (Author: bharatviswa):
Hi [~elek]
{quote}Independent from the flakiness I think a test where the timeout is 8 
minutes and starts 1000 threads to insert 500 buckets (500_000 buckets all 
together) it's more like an integration test and would be better to move the 
slowest part to the integration-test project.
{quote}
I think it should run quickly now with the fix; it will not take that much 
time. On my local laptop it always completes in about 30 seconds.

 

 

> TestOzoneManagerDoubleBufferWithOMResponse is flaky
> ---
>
> Key: HDDS-2535
> URL: https://issues.apache.org/jira/browse/HDDS-2535
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Marton Elek
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Flakiness can be reproduced locally. Usually it passes, but when I started to 
> run it 100 times parallel with high cpu load it failed with the 3rd attempt 
> (timed out)
> {code:java}
> ---
> Test set: 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
> ---
> Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 503.297 s <<< 
> FAILURE! - in 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
> testDoubleBuffer(org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse)
>   Time elapsed: 500.122 s  <<< ERROR!
> java.lang.Exception: test timed out after 500000 milliseconds
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:382)
> at 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:385)
> at 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:129)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
>  {code}
> Independent from the flakiness I think a test where the timeout is 8 minutes 
> and starts 1000 threads to insert 500 buckets (500_000 buckets all together) 
> it's more like an integration test and would be better to move the slowest 
> part to the integration-test project.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Comment Edited] (HDDS-2535) TestOzoneManagerDoubleBufferWithOMResponse is flaky

2019-11-18 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976850#comment-16976850
 ] 

Bharat Viswanadham edited comment on HDDS-2535 at 11/18/19 8:25 PM:


Hi [~elek]
{quote}Independent from the flakiness I think a test where the timeout is 8 
minutes and starts 1000 threads to insert 500 buckets (500_000 buckets all 
together) it's more like an integration test and would be better to move the 
slowest part to the integration-test project.
{quote}
I think it should run quickly now with the fix; it will not take that much 
time. On my local laptop it always completes in about 30 seconds.

 

And on the GitHub run I see it completes in 53 seconds. I want to keep this 
test as a unit test, as it will detect any failure in the DoubleBuffer, which 
is a critical component of OM. (The reason I want it as a UT: we are going to 
enforce soon that UTs should always be green.)




{code:java}
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
53.536 s - in 
org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
{code}

  

 


was (Author: bharatviswa):
Hi [~elek]
{quote}Independent from the flakiness I think a test where the timeout is 8 
minutes and starts 1000 threads to insert 500 buckets (500_000 buckets all 
together) it's more like an integration test and would be better to move the 
slowest part to the integration-test project.
{quote}
I think it should run quickly now with the fix; it will not take that much 
time. On my local laptop it always completes in about 30 seconds.

 

And on the GitHub run I see it completes in 53 seconds. I want to keep this 
test as a unit test, as it will detect any failure in the DoubleBuffer, which 
is a critical component of OM. (The reason I want it as a UT: we are going to 
enforce soon that UTs should always be green.)
[INFO] Running 
org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 53.536 s 
- in org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
(line 1164 of 
https://github.com/bharatviswa504/hadoop-ozone/runs/308637202#step:3:1164)
 

 

> TestOzoneManagerDoubleBufferWithOMResponse is flaky
> ---
>
> Key: HDDS-2535
> URL: https://issues.apache.org/jira/browse/HDDS-2535
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Marton Elek
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Flakiness can be reproduced locally. Usually it passes, but when I started to 
> run it 100 times parallel with high cpu load it failed with the 3rd attempt 
> (timed out)
> {code:java}
> ---
> Test set: 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
> ---
> Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 503.297 s <<< 
> FAILURE! - in 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
> testDoubleBuffer(org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse)
>   Time elapsed: 500.122 s  <<< ERROR!
> java.lang.Exception: test timed out after 500000 milliseconds
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:382)
> at 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:385)
> at 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:129)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
>  {code}
> Independent from the flakiness I think a test where the timeout is 8 minutes 
> and starts 1000 threads to insert 500 buckets (500_000 buckets all together) 
> it's more like an integration test and would be better to move the slowest 
> part to the integration-test project.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org

[jira] [Commented] (HDDS-2535) TestOzoneManagerDoubleBufferWithOMResponse is flaky

2019-11-18 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976850#comment-16976850
 ] 

Bharat Viswanadham commented on HDDS-2535:
--

Hi [~elek]
{quote}Independent from the flakiness I think a test where the timeout is 8 
minutes and starts 1000 threads to insert 500 buckets (500_000 buckets all 
together) it's more like an integration test and would be better to move the 
slowest part to the integration-test project.
{quote}
I think it should run quickly now with the fix; it will not take that much 
time. On my local laptop it always completes in about 30 seconds.

 

 

> TestOzoneManagerDoubleBufferWithOMResponse is flaky
> ---
>
> Key: HDDS-2535
> URL: https://issues.apache.org/jira/browse/HDDS-2535
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Marton Elek
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Flakiness can be reproduced locally. Usually it passes, but when I started to 
> run it 100 times parallel with high cpu load it failed with the 3rd attempt 
> (timed out)
> {code:java}
> ---
> Test set: 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
> ---
> Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 503.297 s <<< 
> FAILURE! - in 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
> testDoubleBuffer(org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse)
>   Time elapsed: 500.122 s  <<< ERROR!
> java.lang.Exception: test timed out after 500000 milliseconds
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:382)
> at 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:385)
> at 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:129)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
>  {code}
> Independent from the flakiness I think a test where the timeout is 8 minutes 
> and starts 1000 threads to insert 500 buckets (500_000 buckets all together) 
> it's more like an integration test and would be better to move the slowest 
> part to the integration-test project.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2535) TestOzoneManagerDoubleBufferWithOMResponse is flaky

2019-11-18 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2535:
-
Status: Patch Available  (was: Open)

> TestOzoneManagerDoubleBufferWithOMResponse is flaky
> ---
>
> Key: HDDS-2535
> URL: https://issues.apache.org/jira/browse/HDDS-2535
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Marton Elek
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Flakiness can be reproduced locally. Usually it passes, but when I started to 
> run it 100 times parallel with high cpu load it failed with the 3rd attempt 
> (timed out)
> {code:java}
> ---
> Test set: 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
> ---
> Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 503.297 s <<< 
> FAILURE! - in 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
> testDoubleBuffer(org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse)
>   Time elapsed: 500.122 s  <<< ERROR!
> java.lang.Exception: test timed out after 500000 milliseconds
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:382)
> at 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:385)
> at 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:129)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
>  {code}
> Independent from the flakiness I think a test where the timeout is 8 minutes 
> and starts 1000 threads to insert 500 buckets (500_000 buckets all together) 
> it's more like an integration test and would be better to move the slowest 
> part to the integration-test project.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2535) TestOzoneManagerDoubleBufferWithOMResponse is flaky

2019-11-18 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-2535:


Assignee: Bharat Viswanadham

> TestOzoneManagerDoubleBufferWithOMResponse is flaky
> ---
>
> Key: HDDS-2535
> URL: https://issues.apache.org/jira/browse/HDDS-2535
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Marton Elek
>Assignee: Bharat Viswanadham
>Priority: Major
>
> Flakiness can be reproduced locally. Usually it passes, but when I started to 
> run it 100 times parallel with high cpu load it failed with the 3rd attempt 
> (timed out)
> {code:java}
> ---
> Test set: 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
> ---
> Tests run: 3, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 503.297 s <<< 
> FAILURE! - in 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse
> testDoubleBuffer(org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse)
>   Time elapsed: 500.122 s  <<< ERROR!
> java.lang.Exception: test timed out after 500000 milliseconds
> at java.lang.Thread.sleep(Native Method)
> at 
> org.apache.hadoop.test.GenericTestUtils.waitFor(GenericTestUtils.java:382)
> at 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:385)
> at 
> org.apache.hadoop.ozone.om.ratis.TestOzoneManagerDoubleBufferWithOMResponse.testDoubleBuffer(TestOzoneManagerDoubleBufferWithOMResponse.java:129)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
> at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
> at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)
>  {code}
> Independent from the flakiness I think a test where the timeout is 8 minutes 
> and starts 1000 threads to insert 500 buckets (500_000 buckets all together) 
> it's more like an integration test and would be better to move the slowest 
> part to the integration-test project.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-17 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976256#comment-16976256
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

[~timmylicheng] And for testing with the PR, did you use the branch and set 
up a new cluster, or did you replace the jars? Could you provide some 
information on this?

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2018-11-15-OM-logs.txt, 2019-11-06_18_13_57_422_ERROR, 
> hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png, 
> om-audit-VM_50_210_centos.log, om_audit_log_plc_1570863541668_9278.txt
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> on a path on VM0, reading data from VM0's local disk and writing to the 
> mount path. The dataset has files of various sizes, from 0 bytes to 
> GB-level, and contains ~50,000 files. 
> Writing is slow (1 GB in ~10 minutes) and it stops after around 4 GB. 
> Looking at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing 
> errors related to multipart upload. This error eventually causes the writing 
> to terminate and OM to shut down. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB
>  .java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)

[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-17 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976253#comment-16976253
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/18/19 3:01 AM:


Before my PR [https://github.com/apache/hadoop-ozone/pull/163], I had also 
seen this error. I think goofys has logic where, if the complete multipart 
upload fails, it aborts and re-uploads. (An upload after abort fails with 
NO_SUCH_MULTIPART_UPLOAD_ERROR; this is expected from the Ozone/S3 
perspective.)

 

With the above PR, I was able to upload 1 GB, 2 GB, ..., 6 GB files. Please 
have a look at the comment in PR #163. And for testing with the PR, did you 
use the branch and set up a new cluster, or did you replace the jars? Could 
you provide some information on this?

 

So, we need to check whether there is any COMPLETE_MULTIPART_UPLOAD_ERROR 
failure for the key. The cause is explained in HDDS-2477. Can you also upload 
the OM audit log if COMPLETE_MULTIPART_UPLOAD_ERROR still occurs?


was (Author: bharatviswa):
Before my PR [https://github.com/apache/hadoop-ozone/pull/163], I had also 
seen this error. I think goofys has logic where, if the complete multipart 
upload fails, it aborts and re-uploads. (An upload after abort fails with 
NO_SUCH_MULTIPART_UPLOAD_ERROR; this is expected from the Ozone/S3 
perspective.)

 

With the above PR, I was able to upload 1 GB, 2 GB, ..., 6 GB files. Please 
have a look at the comment in PR #163. And for testing with the PR, did you 
use the branch and set up a new cluster, or did you replace the jars? Could 
you provide some information on this?

 

So, we need to check whether there is any COMPLETE_MULTIPART_UPLOAD_ERROR 
failure for the key. The cause is explained in HDDS-2477. Can you also upload 
the OM audit log if COMPLETE_MULTIPART_UPLOAD_ERROR still occurs?

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2018-11-15-OM-logs.txt, 2019-11-06_18_13_57_422_ERROR, 
> hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png, 
> om-audit-VM_50_210_centos.log, om_audit_log_plc_1570863541668_9278.txt
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> on a path on VM0, reading data from VM0's local disk and writing to the 
> mount path. The dataset has files of various sizes, from 0 bytes to 
> GB-level, and contains ~50,000 files. 
> Writing is slow (1 GB in ~10 minutes) and it stops after around 4 GB. 
> Looking at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing 
> errors related to multipart upload. This error eventually causes the writing 
> to terminate and OM to shut down. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)

[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-17 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16976253#comment-16976253
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

Before my PR [https://github.com/apache/hadoop-ozone/pull/163], I had also 
seen this error. I think goofys has logic where, if the complete multipart 
upload fails, it aborts and re-uploads. (An upload after abort fails with 
NO_SUCH_MULTIPART_UPLOAD_ERROR; this is expected from the Ozone/S3 
perspective.)

 

With the above PR, I was able to upload 1 GB, 2 GB, ..., 6 GB files. Please 
have a look at the comment in PR #163. And for testing with the PR, did you 
use the branch and set up a new cluster, or did you replace the jars? Could 
you provide some information on this?

 

So, we need to check whether there is any COMPLETE_MULTIPART_UPLOAD_ERROR 
failure for the key. The cause is explained in HDDS-2477. Can you also upload 
the OM audit log if COMPLETE_MULTIPART_UPLOAD_ERROR still occurs?
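(For readers following along, a hedged sketch of the S3 multipart flow that 
goofys drives against the Ozone S3 gateway, using the AWS SDK v1; endpoint, 
bucket, and key are placeholders. A part list sent to complete-multipart-upload 
that OM cannot match is what surfaces as errors such as 
NO_SUCH_MULTIPART_UPLOAD_ERROR or MISMATCH_MULTIPART_LIST.)
{code:java}
import com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;

public class MultipartSketch {
  public static void main(String[] args) {
    // Placeholder endpoint for a local Ozone S3 gateway.
    AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withEndpointConfiguration(
            new EndpointConfiguration("http://localhost:9878", "us-east-1"))
        .withPathStyleAccessEnabled(true)
        .build();

    String bucket = "ozone-test", key = "20191012/example";
    String uploadId = s3.initiateMultipartUpload(
        new InitiateMultipartUploadRequest(bucket, key)).getUploadId();

    byte[] part = new byte[5 * 1024 * 1024];   // >= 5 MB minimum part size
    List<PartETag> etags = new ArrayList<>();
    for (int n = 1; n <= 2; n++) {
      UploadPartResult r = s3.uploadPart(new UploadPartRequest()
          .withBucketName(bucket).withKey(key).withUploadId(uploadId)
          .withPartNumber(n).withPartSize(part.length)
          .withInputStream(new ByteArrayInputStream(part)));
      etags.add(r.getPartETag());
    }

    // If any committed part is missing from this list (or vice versa),
    // OM rejects the request with an InvalidPart-style error.
    s3.completeMultipartUpload(new CompleteMultipartUploadRequest(
        bucket, key, uploadId, etags));
  }
}
{code}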

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2018-11-15-OM-logs.txt, 2019-11-06_18_13_57_422_ERROR, 
> hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png, 
> om-audit-VM_50_210_centos.log, om_audit_log_plc_1570863541668_9278.txt
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> on a path on VM0, reading data from VM0's local disk and writing to the 
> mount path. The dataset has files of various sizes, from 0 bytes to 
> GB-level, and contains ~50,000 files. 
> Writing is slow (1 GB in ~10 minutes) and it stops after around 4 GB. 
> Looking at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing 
> errors related to multipart upload. This error eventually causes the writing 
> to terminate and OM to shut down. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278

[jira] [Resolved] (HDDS-2461) Logging by ChunkUtils is misleading

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2461.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Logging by ChunkUtils is misleading
> ---
>
> Key: HDDS-2461
> URL: https://issues.apache.org/jira/browse/HDDS-2461
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Datanode
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> During a k8s-based test I found a lot of log messages like:
> {code:java}
> 2019-11-12 14:27:13 WARN  ChunkManagerImpl:209 - Duplicate write chunk 
> request. Chunk overwrite without explicit request. 
> ChunkInfo{chunkName='A9UrLxiEUN_testdata_chunk_4465025, offset=0, len=1024} 
> {code}
> I was very surprised, as there were no similar lines at ChunkManagerImpl:209.
> It turned out that it's logged by ChunkUtils, but it uses the logger of 
> ChunkManagerImpl.
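(A minimal sketch of the mixup; the class names match the Jira, but the method 
bodies are invented for illustration.)
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

final class ChunkManagerImpl {
  static final Logger LOG = LoggerFactory.getLogger(ChunkManagerImpl.class);
}

final class ChunkUtils {
  private static final Logger LOG = LoggerFactory.getLogger(ChunkUtils.class);

  // Buggy: the message shows up attributed to ChunkManagerImpl.
  static void warnDuplicateBuggy(String chunk) {
    ChunkManagerImpl.LOG.warn("Duplicate write chunk request: {}", chunk);
  }

  // Fixed: each class logs through its own logger.
  static void warnDuplicate(String chunk) {
    LOG.warn("Duplicate write chunk request: {}", chunk);
  }
}
{code}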



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2513) Remove this unused "COMPONENT" private field.

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2513:
-
Issue Type: Bug  (was: Improvement)

> Remove this unused "COMPONENT" private field.
> -
>
> Key: HDDS-2513
> URL: https://issues.apache.org/jira/browse/HDDS-2513
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Abhishek Purohit
>Assignee: Abhishek Purohit
>Priority: Minor
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Remove this unused "COMPONENT" private field in class 
> XceiverClientGrpc
> [https://sonarcloud.io/project/issues?id=hadoop-ozone&open=AW5md_AGKcVY8lQ4ZsWG&resolved=false]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2513) Remove this unused "COMPONENT" private field.

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2513.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Remove this unused "COMPONENT" private field.
> -
>
> Key: HDDS-2513
> URL: https://issues.apache.org/jira/browse/HDDS-2513
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Abhishek Purohit
>Assignee: Abhishek Purohit
>Priority: Minor
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Remove this unused "COMPONENT" private field in class 
> XceiverClientGrpc
> [https://sonarcloud.io/project/issues?id=hadoop-ozone&open=AW5md_AGKcVY8lQ4ZsWG&resolved=false]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2405) int2ByteString unnecessary byte array allocation

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2405:
-
Fix Version/s: 0.5.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> int2ByteString unnecessary byte array allocation
> 
>
> Key: HDDS-2405
> URL: https://issues.apache.org/jira/browse/HDDS-2405
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{int2ByteString}} implementations (currently duplicated in 
> [RatisHelper|https://github.com/apache/hadoop-ozone/blob/6b2cda125b3647870ef5b01cf64e3b3e4cdc55db/hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/ratis/RatisHelper.java#L280-L289]
>  and 
> [Checksum|https://github.com/apache/hadoop-ozone/blob/6b2cda125b3647870ef5b01cf64e3b3e4cdc55db/hadoop-hdds/common/src/main/java/org/apache/hadoop/ozone/common/Checksum.java#L64-L73],
>  but the first one is being removed in HDDS-2375) result in unnecessary byte 
> array allocations:
> # {{ByteString.Output}} creates 128-byte buffer by default, which is too 
> large for writing a single int
> # {{DataOutputStream}} allocates an [extra 8-byte 
> array|https://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/io/DataOutputStream.java#l204],
>  used only for writing longs
> # {{ByteString.Output}} also creates 10-element array for {{flushedBuffers}}
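(A hedged sketch of an allocation-lean alternative; the exact ByteString 
package varies by module (plain protobuf vs. the Ratis-relocated copy), and 
the helper name is reused from the Jira for illustration only.)
{code:java}
import com.google.protobuf.ByteString;

public final class Int2ByteStringSketch {
  // Plain big-endian encoding into a single 4-byte array: no 128-byte
  // ByteString.Output buffer, no DataOutputStream scratch array.
  static ByteString int2ByteString(int value) {
    byte[] b = new byte[4];
    b[0] = (byte) (value >>> 24);
    b[1] = (byte) (value >>> 16);
    b[2] = (byte) (value >>> 8);
    b[3] = (byte) value;
    return ByteString.copyFrom(b);
  }

  public static void main(String[] args) {
    System.out.println(int2ByteString(42).size());  // 4
  }
}
{code}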



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2502) Close ScmClient in RatisInsight

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2502.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Close ScmClient in RatisInsight
> ---
>
> Key: HDDS-2502
> URL: https://issues.apache.org/jira/browse/HDDS-2502
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Attila Doroszlai
>Assignee: Siddharth Wagle
>Priority: Major
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> {{ScmClient}} in {{RatisInsight}} should be closed after use.
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-mYKcVY8lQ4Zr_s&open=AW5md-mYKcVY8lQ4Zr_s
> Also two other minor issues reported in the same file:
> https://sonarcloud.io/project/issues?fileUuids=AW5md-HeKcVY8lQ4ZrXL&id=hadoop-ozone&resolved=false



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2507) Remove the hard-coded exclusion of TestMiniChaosOzoneCluster

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2507.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Remove the hard-coded exclusion of TestMiniChaosOzoneCluster
> 
>
> Key: HDDS-2507
> URL: https://issues.apache.org/jira/browse/HDDS-2507
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We excluded the execution of TestMiniChaosOzoneCluster from 
> hadoop-ozone/dev-support/checks/integration.sh because it was not stable 
> enough.
> Unfortunately this exclusion makes it impossible to use custom exclusion 
> lists (-Dsurefire.excludesFile=), as excludesFile can't be used if 
> -Dtest=!... is already used.
> I propose to remove this exclusion to make it possible to use different 
> exclusions for different runs (PR check, daily, etc.).
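(A hedged usage sketch; the exclusion-file names below are invented.)
{code}
# Once the hard-coded -Dtest=\!TestMiniChaosOzoneCluster is removed from
# integration.sh, each run can supply its own exclusion list instead:
mvn -B verify -Dsurefire.excludesFile=dev-support/exclusions-pr.txt     # PR check
mvn -B verify -Dsurefire.excludesFile=dev-support/exclusions-daily.txt  # daily
{code}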



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2511) Fix Sonar issues in OzoneManagerServiceProviderImpl

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2511.
--
Resolution: Fixed

> Fix Sonar issues in OzoneManagerServiceProviderImpl
> ---
>
> Key: HDDS-2511
> URL: https://issues.apache.org/jira/browse/HDDS-2511
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Link to the list of issues : 
> https://sonarcloud.io/project/issues?fileUuids=AW5md-HdKcVY8lQ4ZrUn&id=hadoop-ozone&resolved=false



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2511) Fix Sonar issues in OzoneManagerServiceProviderImpl

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2511:
-
Priority: Minor  (was: Major)

> Fix Sonar issues in OzoneManagerServiceProviderImpl
> ---
>
> Key: HDDS-2511
> URL: https://issues.apache.org/jira/browse/HDDS-2511
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Minor
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Link to the list of issues : 
> https://sonarcloud.io/project/issues?fileUuids=AW5md-HdKcVY8lQ4ZrUn&id=hadoop-ozone&resolved=false



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2515) No need to call "toString()" method as formatting and string conversion is done by the Formatter

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2515.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> No need to call "toString()" method as formatting and string conversion is 
> done by the Formatter
> 
>
> Key: HDDS-2515
> URL: https://issues.apache.org/jira/browse/HDDS-2515
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Abhishek Purohit
>Assignee: Abhishek Purohit
>Priority: Major
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [https://sonarcloud.io/project/issues?id=hadoop-ozone&open=AW5md_AGKcVY8lQ4ZsV4&resolved=false]
> Class:  XceiverClientGrpc
> {code:java}
>  if (LOG.isDebugEnabled()) {
>    LOG.debug("Nodes in pipeline : {}", pipeline.getNodes().toString());
>  }
> {code}
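(A hedged sketch of the usual fix: SLF4J's {} placeholder already stringifies 
the argument, so the explicit toString() is redundant, and it would NPE on a 
null argument where the formatter would simply print "null".)
{code:java}
// Same statement with the redundant toString() dropped:
if (LOG.isDebugEnabled()) {
  LOG.debug("Nodes in pipeline : {}", pipeline.getNodes());
}
{code}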



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2515) No need to call "toString()" method as formatting and string conversion is done by the Formatter

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2515:
-
Issue Type: Bug  (was: Improvement)

> No need to call "toString()" method as formatting and string conversion is 
> done by the Formatter
> 
>
> Key: HDDS-2515
> URL: https://issues.apache.org/jira/browse/HDDS-2515
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Abhishek Purohit
>Assignee: Abhishek Purohit
>Priority: Major
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [https://sonarcloud.io/project/issues?id=hadoop-ozone&open=AW5md_AGKcVY8lQ4ZsV4&resolved=false]
> Class:  XceiverClientGrpc
> {code:java}
>  if (LOG.isDebugEnabled()) {
>    LOG.debug("Nodes in pipeline : {}", pipeline.getNodes().toString());
>  }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2515) No need to call "toString()" method as formatting and string conversion is done by the Formatter

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2515:
-
Priority: Minor  (was: Major)

> No need to call "toString()" method as formatting and string conversion is 
> done by the Formatter
> 
>
> Key: HDDS-2515
> URL: https://issues.apache.org/jira/browse/HDDS-2515
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Abhishek Purohit
>Assignee: Abhishek Purohit
>Priority: Minor
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> [https://sonarcloud.io/project/issues?id=hadoop-ozone&open=AW5md_AGKcVY8lQ4ZsV4&resolved=false]
> Class:  XceiverClientGrpc
> {code:java}
>  if (LOG.isDebugEnabled()) {
>    LOG.debug("Nodes in pipeline : {}", pipeline.getNodes().toString());
>  }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.

2019-11-15 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2472.
--
Resolution: Fixed

> Use try-with-resources while creating FlushOptions in RDBStore.
> ---
>
> Key: HDDS-2472
> URL: https://issues.apache.org/jira/browse/HDDS-2472
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.5.0
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Link to the sonar issue flag - 
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-zwKcVY8lQ4ZsJ4&open=AW5md-zwKcVY8lQ4ZsJ4.
>  
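(A minimal sketch of the fix, assuming the RocksDB Java API: FlushOptions 
wraps a native handle, so it should be scoped with try-with-resources rather 
than left to the garbage collector.)
{code:java}
import org.rocksdb.FlushOptions;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

final class FlushSketch {
  static void flush(RocksDB db) throws RocksDBException {
    try (FlushOptions options = new FlushOptions().setWaitForFlush(true)) {
      db.flush(options);
    } // options.close() releases the native memory even if flush throws
  }
}
{code}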



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2487) Ensure streams are closed

2019-11-14 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2487:
-
Fix Version/s: 0.5.0
   Resolution: Fixed
   Status: Resolved  (was: Patch Available)

> Ensure streams are closed
> -
>
> Key: HDDS-2487
> URL: https://issues.apache.org/jira/browse/HDDS-2487
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Attila Doroszlai
>Assignee: Attila Doroszlai
>Priority: Major
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> * ContainerDataYaml: 
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-6IKcVY8lQ4ZsQU&open=AW5md-6IKcVY8lQ4ZsQU
> * OmUtils: 
> https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-hdKcVY8lQ4Zr76&open=AW5md-hdKcVY8lQ4Zr76
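(A generic, hedged illustration of the rule; the flagged spots are in 
ContainerDataYaml and OmUtils, but the pattern is the same: open the stream in 
a try-with-resources block so it is closed on every path, including 
exceptions. The helper below is invented.)
{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

final class StreamCloseSketch {
  static long countBytes(String path) throws IOException {
    try (InputStream in = Files.newInputStream(Paths.get(path))) {
      long total = 0;
      while (in.read() != -1) {
        total++;
      }
      return total;
    } // in.close() runs here automatically
  }
}
{code}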



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2494) Sonar - BigDecimal(double) should not be used

2019-11-14 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2494.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Sonar - BigDecimal(double) should not be used
> -
>
> Key: HDDS-2494
> URL: https://issues.apache.org/jira/browse/HDDS-2494
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Matthew Sharp
>Assignee: Matthew Sharp
>Priority: Minor
>  Labels: pull-request-available, sonar
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Sonar Issue:  
> [https://sonarcloud.io/project/issues?id=hadoop-ozone&issues=AW5md-0AKcVY8lQ4ZsKR&open=AW5md-0AKcVY8lQ4ZsKR]
>  
>  
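(A short, hedged illustration of the rule behind the fix.)
{code:java}
import java.math.BigDecimal;

public class BigDecimalSketch {
  public static void main(String[] args) {
    // new BigDecimal(double) captures the binary floating-point value
    // exactly, which is rarely what the caller meant:
    System.out.println(new BigDecimal(0.1));
    // 0.1000000000000000055511151231257827021181583404541015625

    // BigDecimal.valueOf goes through Double.toString and is the usual fix:
    System.out.println(BigDecimal.valueOf(0.1));
    // 0.1
  }
}
{code}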



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-13 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973836#comment-16973836
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

Hi [~timmylicheng]

Thanks for sharing the logs.

I see completeMultipartUpload is called with 286 parts, and OM is throwing an 
InvalidPart error, but from the audit log I was not able to tell which part 
is missing in OM. (And I see 286 successful commit-multipart-upload entries 
for the key.)

 

I think there is a chance we are hitting the scenario in HDDS-2477. (Not 
completely sure; this is my analysis after looking at the logs.) I have 
opened a couple of Jiras, HDDS-2477, HDDS-2471, and HDDS-2470, which will 
help in analyzing/debugging this issue. (Let's see whether HDDS-2477 fixes it.)

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2019-11-06_18_13_57_422_ERROR, hs_err_pid9340.log, 
> image-2019-10-31-18-56-56-177.png, om-audit-VM_50_210_centos.log, 
> om_audit_log_plc_1570863541668_9278.txt
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> on a path on VM0, reading data from VM0's local disk and writing to the 
> mount path. The dataset has files of various sizes, from 0 bytes to 
> GB-level, and contains ~50,000 files. 
> Writing is slow (1 GB in ~10 minutes) and it stops after around 4 GB. 
> Looking at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing 
> errors related to multipart upload. This error eventually causes the writing 
> to terminate and OM to shut down. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB.java:1104)

[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-13 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973836#comment-16973836
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/14/19 12:41 AM:
-

Hi [~timmylicheng]

Thanks for sharing the logs.

I see completeMultipartUpload is called with 286 parts, and OM is throwing an 
InvalidPart error, but from the audit log I was not able to tell which part 
is missing in OM (because we don't print any such info in the log/exception 
message). (And I see 286 successful commit-multipart-upload entries for the 
key.)

I think there is a chance we are hitting the scenario in HDDS-2477. (Not 
completely sure; this is my analysis after looking at the logs.) I have 
opened a couple of Jiras, HDDS-2477, HDDS-2471, and HDDS-2470, which will 
help in analyzing/debugging this issue. (Let's see whether HDDS-2477 fixes it.)


was (Author: bharatviswa):
Hi [~timmylicheng]

Thanks for sharing the logs.

I see completeMultipartUpload is called with 286 parts, and OM is throwing an 
InvalidPart error, but from the audit log I was not able to tell which part 
is missing in OM. (And I see 286 successful commit-multipart-upload entries 
for the key.)

 

I think there is a chance we are hitting the scenario in HDDS-2477. (Not 
completely sure; this is my analysis after looking at the logs.) I have 
opened a couple of Jiras, HDDS-2477, HDDS-2471, and HDDS-2470, which will 
help in analyzing/debugging this issue. (Let's see whether HDDS-2477 fixes it.)

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2019-11-06_18_13_57_422_ERROR, hs_err_pid9340.log, 
> image-2019-10-31-18-56-56-177.png, om-audit-VM_50_210_centos.log, 
> om_audit_log_plc_1570863541668_9278.txt
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, reading data from VM0's local disk and writing to the 
> mount path. The dataset has files of various sizes, from 0 bytes to 
> GB-level, and contains ~50,000 files. 
> The writing is slow (1 GB in ~10 minutes) and it stops after around 4 GB. 
> Looking at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing 
> errors related to multipart upload. This error eventually causes the writing 
> to terminate and OM to shut down. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroup

[jira] [Updated] (HDDS-2477) TableCache cleanup issue for OM non-HA

2019-11-13 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2477:
-
Component/s: Ozone Manager

> TableCache cleanup issue for OM non-HA
> --
>
> Key: HDDS-2477
> URL: https://issues.apache.org/jira/browse/HDDS-2477
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the OM non-HA case, the ratisTransactionLogIndex is generated by 
> OmProtocolServersideTranslatorPB.java, and validateAndUpdateCache is called 
> from multiple handler threads. Consider a case where one thread with index 
> 10 has added its entry to the doubleBuffer while entries for indexes 0-9 
> have not yet been added. The DoubleBuffer flush thread flushes and calls 
> cleanup, which then removes all cache entries with an epoch less than 10. 
> Cleanup should not remove entries that were put into the cache later and are 
> still in the process of being flushed to the DB; doing so causes 
> inconsistency for a few OM requests.
>  
> Example:
> 4 threads committing 4 parts.
> 1st thread - part 1 - ratis index - 3
> 2nd thread - part 2 - ratis index - 2
> 3rd thread - part 3 - ratis index - 1
>  
> The first thread gets the lock and puts OmMultipartInfo (with part 1) into 
> the doubleBuffer and cache, and cleanup is called to remove all cache 
> entries with an epoch less than 3. In the meantime the 2nd and 3rd threads 
> put parts 2 and 3 into the OmMultipartInfo in the cache and doubleBuffer, 
> but the first thread may clean up those entries, since cleanup is called 
> with index 3.
>  
> Now when the 4th part upload arrives and commits, it reads the multipart 
> info and gets only part 1 in OmMultipartInfo, because the OmMultipartInfo 
> with parts 1, 2, and 3 is still in the process of being committed to the DB. 
> So after the 4th part upload completes, the DB and cache will have parts 1 
> and 4 only; the part 2 and 3 information is lost.
>  
> So for the non-HA case, cleanup will be called with the list of epochs that 
> need to be cleaned up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2477) TableCache cleanup issue for OM non-HA

2019-11-13 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2477:
-
Status: Patch Available  (was: Open)

> TableCache cleanup issue for OM non-HA
> --
>
> Key: HDDS-2477
> URL: https://issues.apache.org/jira/browse/HDDS-2477
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> In the OM non-HA case, the ratisTransactionLogIndex is generated by 
> OmProtocolServersideTranslatorPB.java, and validateAndUpdateCache is called 
> from multiple handler threads. Consider a case where one thread with index 
> 10 has added its entry to the doubleBuffer while entries for indexes 0-9 
> have not yet been added. The DoubleBuffer flush thread flushes and calls 
> cleanup, which then removes all cache entries with an epoch less than 10. 
> Cleanup should not remove entries that were put into the cache later and are 
> still in the process of being flushed to the DB; doing so causes 
> inconsistency for a few OM requests.
>  
> Example:
> 4 threads committing 4 parts.
> 1st thread - part 1 - ratis index - 3
> 2nd thread - part 2 - ratis index - 2
> 3rd thread - part 3 - ratis index - 1
>  
> The first thread gets the lock and puts OmMultipartInfo (with part 1) into 
> the doubleBuffer and cache, and cleanup is called to remove all cache 
> entries with an epoch less than 3. In the meantime the 2nd and 3rd threads 
> put parts 2 and 3 into the OmMultipartInfo in the cache and doubleBuffer, 
> but the first thread may clean up those entries, since cleanup is called 
> with index 3.
>  
> Now when the 4th part upload arrives and commits, it reads the multipart 
> info and gets only part 1 in OmMultipartInfo, because the OmMultipartInfo 
> with parts 1, 2, and 3 is still in the process of being committed to the DB. 
> So after the 4th part upload completes, the DB and cache will have parts 1 
> and 4 only; the part 2 and 3 information is lost.
>  
> So for the non-HA case, cleanup will be called with the list of epochs that 
> need to be cleaned up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2477) TableCache cleanup issue for OM non-HA

2019-11-13 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2477:


 Summary: TableCache cleanup issue for OM non-HA
 Key: HDDS-2477
 URL: https://issues.apache.org/jira/browse/HDDS-2477
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham


In the OM non-HA case, the ratisTransactionLogIndex is generated by 
OmProtocolServersideTranslatorPB.java, and validateAndUpdateCache is called 
from multiple handler threads. Consider a case where one thread with index 10 
has added its entry to the doubleBuffer while entries for indexes 0-9 have not 
yet been added. The DoubleBuffer flush thread flushes and calls cleanup, which 
then removes all cache entries with an epoch less than 10. Cleanup should not 
remove entries that were put into the cache later and are still in the process 
of being flushed to the DB; doing so causes inconsistency for a few OM 
requests.

Example:

4 threads committing 4 parts.

1st thread - part 1 - ratis index - 3
2nd thread - part 2 - ratis index - 2
3rd thread - part 3 - ratis index - 1

The first thread gets the lock and puts OmMultipartInfo (with part 1) into the 
doubleBuffer and cache, and cleanup is called to remove all cache entries with 
an epoch less than 3. In the meantime the 2nd and 3rd threads put parts 2 and 
3 into the OmMultipartInfo in the cache and doubleBuffer, but the first thread 
may clean up those entries, since cleanup is called with index 3.

Now when the 4th part upload arrives and commits, it reads the multipart info 
and gets only part 1 in OmMultipartInfo, because the OmMultipartInfo with 
parts 1, 2, and 3 is still in the process of being committed to the DB. So 
after the 4th part upload completes, the DB and cache will have parts 1 and 4 
only; the part 2 and 3 information is lost.

So for the non-HA case, cleanup will be called with the list of epochs that 
need to be cleaned up.
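
A minimal sketch of the epoch-list cleanup idea, with hypothetical names (this 
is not the actual Ozone TableCache API), assuming each epoch maps to the cache 
key it created or updated:

{code:java}
import java.util.List;
import java.util.NavigableMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical sketch: evict only the epochs the double buffer actually
// flushed, instead of everything below the highest flushed epoch.
class EpochListTableCache<K, V> {

  // Which cache key each epoch (ratis index) created or updated.
  private final NavigableMap<Long, K> epochEntries =
      new ConcurrentSkipListMap<>();
  private final ConcurrentHashMap<K, V> cache = new ConcurrentHashMap<>();

  void put(long epoch, K key, V value) {
    cache.put(key, value);
    epochEntries.put(epoch, key);
  }

  V get(K key) {
    return cache.get(key);
  }

  // Called by the double-buffer flush thread with the exact epochs it
  // flushed. Epochs 0-9 that are still in flight stay untouched even if
  // epoch 10 happens to flush first.
  void evictFlushed(List<Long> flushedEpochs) {
    for (long epoch : flushedEpochs) {
      K key = epochEntries.remove(epoch);
      // Keep the cached value alive while a later, unflushed epoch still
      // references the same key (e.g. OmMultipartInfo accumulating parts).
      if (key != null && !epochEntries.containsValue(key)) {
        cache.remove(key);
      }
    }
  }
}
{code}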



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2471) Improve exception message for CompleteMultipartUpload

2019-11-13 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-2471:


Assignee: Bharat Viswanadham

> Improve exception message for CompleteMultipartUpload
> -
>
> Key: HDDS-2471
> URL: https://issues.apache.org/jira/browse/HDDS-2471
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When an InvalidPart error occurs, the exception message does not contain any 
> information about the partName and partNumber; it would be good to include 
> this information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2471) Improve exception message for CompleteMultipartUpload

2019-11-13 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2471:
-
Status: Patch Available  (was: Open)

> Improve exception message for CompleteMultipartUpload
> -
>
> Key: HDDS-2471
> URL: https://issues.apache.org/jira/browse/HDDS-2471
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When an InvalidPart error occurs, the exception message does not contain any 
> information about the partName and partNumber; it would be good to include 
> this information.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2471) Improve exception message for CompleteMultipartUpload

2019-11-13 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2471:


 Summary: Improve exception message for CompleteMultipartUpload
 Key: HDDS-2471
 URL: https://issues.apache.org/jira/browse/HDDS-2471
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


When an InvalidPart error occurs, the exception message does not contain any 
information about the partName and partNumber; it would be good to include 
this information.
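
A hedged sketch of the shape such an enriched message could take; the helper 
and variable names here are illustrative, not the committed change:

{code:java}
import org.apache.hadoop.ozone.om.exceptions.OMException;

// Illustrative only: carry the offending part's details in the message so
// the failing part can be identified from logs.
final class InvalidPartMessages {
  static OMException invalidPart(String volume, String bucket, String key,
      int partNumber, String partName) {
    return new OMException("Complete Multipart Upload Failed: volume: "
        + volume + " bucket: " + bucket + " key: " + key
        + ". Invalid part: partNumber=" + partNumber
        + ", partName=" + partName,
        OMException.ResultCodes.INVALID_PART);
  }
}
{code}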



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload

2019-11-13 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2470:
-
Description: 
Right now, complete Multipart Upload does not print the partName and 
partNumber in the audit log; adding them will help in analyzing audit logs for 
MPU.

 

 

2019-11-13 15:14:10,191 | INFO  | OMAudit | user=root | ip=xx.xx.xx.xx | 
op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, 
bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, 
replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID {

  containerBlockID

{     containerID: 2     localID: 103129366531867089   }

  blockCommitSequenceId: 4978

}

offset: 0

length: 5242880

createVersion: 0

pipeline {

  leaderID: ""

  members {

    uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551"

    ipAddress: "xx.xx.xx.xx"

    hostName: "xx.xx.xx.xx"

    ports

{       name: "RATIS"       value: 9858     }

    ports

{       name: "STANDALONE"       value: 9859     }

    networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551"

    networkLocation: "/default-rack"

  }

  members {

    uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d"

    ipAddress: "9.134.51.25"

    hostName: "9.134.51.25"

    ports

{       name: "RATIS"       value: 9858     }

    ports

{       name: "STANDALONE"       value: 9859     }

    networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d"

    networkLocation: "/default-rack"

  }

  members {

    uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"

    ipAddress: "9.134.51.215"

    hostName: "9.134.51.215"

    ports

{       name: "RATIS"       value: 9858     }

    ports

{       name: "STANDALONE"       value: 9859     }

    networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"

    networkLocation: "/default-rack"

  }

  state: PIPELINE_OPEN

  type: RATIS

  factor: THREE

  id

{     id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8"   }

}

]} | ret=SUCCESS |

  was:
Right now, complete Multipart Upload does not print the partName and 
partNumber in the audit log; adding them will help in analyzing audit logs for 
MPU.

 

 

2019-11-13 15:14:10,191 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, 
bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, 
replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID {

  containerBlockID

{     containerID: 2     localID: 103129366531867089   }

  blockCommitSequenceId: 4978

}

offset: 0

length: 5242880

createVersion: 0

pipeline {

  leaderID: ""

  members {

    uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551"

    ipAddress: "9.134.51.232"

    hostName: "9.134.51.232"

    ports

{       name: "RATIS"       value: 9858     }

    ports

{       name: "STANDALONE"       value: 9859     }

    networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551"

    networkLocation: "/default-rack"

  }

  members {

    uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d"

    ipAddress: "9.134.51.25"

    hostName: "9.134.51.25"

    ports

{       name: "RATIS"       value: 9858     }

    ports

{       name: "STANDALONE"       value: 9859     }

    networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d"

    networkLocation: "/default-rack"

  }

  members {

    uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"

    ipAddress: "9.134.51.215"

    hostName: "9.134.51.215"

    ports

{       name: "RATIS"       value: 9858     }

    ports

{       name: "STANDALONE"       value: 9859     }

    networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"

    networkLocation: "/default-rack"

  }

  state: PIPELINE_OPEN

  type: RATIS

  factor: THREE

  id

{     id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8"   }

}

]} | ret=SUCCESS |


> Add partName, partNumber for CommitMultipartUpload
> --
>
> Key: HDDS-2470
> URL: https://issues.apache.org/jira/browse/HDDS-2470
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now, complete Multipart Upload does not print the partName and 
> partNumber in the audit log; adding them will help in analyzing audit logs 
> for MPU.
>  
>  
> 2019-11-13 15:14:10,191 | INFO  | OMAudit | user=root | ip=xx.xx.xx.xx | 
> op=COMMIT_MULTIPART_UPLOAD_PARTKEY 
> {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
> key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, 
> replicationFactor=ONE, keyLocationInfo=[blockID {
>   containerBlockID
> {     containerID: 2     localID: 103129366531867089   }
>   blockCommitSequenceId: 4978
> }
> offset: 0
> length: 5242880
> createVersion: 0
> pipeline {

[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload

2019-11-13 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2470:
-
Status: Patch Available  (was: Open)

> Add partName, partNumber for CommitMultipartUpload
> --
>
> Key: HDDS-2470
> URL: https://issues.apache.org/jira/browse/HDDS-2470
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Right now, complete Multipart Upload does not print the partName and 
> partNumber in the audit log; adding them will help in analyzing audit logs 
> for MPU.
>  
>  
> 2019-11-13 15:14:10,191 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
> op=COMMIT_MULTIPART_UPLOAD_PARTKEY 
> {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
> key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, 
> replicationFactor=ONE, keyLocationInfo=[blockID {
>   containerBlockID
> {     containerID: 2     localID: 103129366531867089   }
>   blockCommitSequenceId: 4978
> }
> offset: 0
> length: 5242880
> createVersion: 0
> pipeline {
>   leaderID: ""
>   members {
>     uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551"
>     ipAddress: "9.134.51.232"
>     hostName: "9.134.51.232"
>     ports
> {       name: "RATIS"       value: 9858     }
>     ports
> {       name: "STANDALONE"       value: 9859     }
>     networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551"
>     networkLocation: "/default-rack"
>   }
>   members {
>     uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d"
>     ipAddress: "9.134.51.25"
>     hostName: "9.134.51.25"
>     ports
> {       name: "RATIS"       value: 9858     }
>     ports
> {       name: "STANDALONE"       value: 9859     }
>     networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d"
>     networkLocation: "/default-rack"
>   }
>   members {
>     uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"
>     ipAddress: "9.134.51.215"
>     hostName: "9.134.51.215"
>     ports
> {       name: "RATIS"       value: 9858     }
>     ports
> {       name: "STANDALONE"       value: 9859     }
>     networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"
>     networkLocation: "/default-rack"
>   }
>   state: PIPELINE_OPEN
>   type: RATIS
>   factor: THREE
>   id
> {     id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8"   }
> }
> ]} | ret=SUCCESS |



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload

2019-11-13 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2470?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2470:
-
Description: 
Right now, complete Multipart Upload does not print the partName and 
partNumber in the audit log; adding them will help in analyzing audit logs for 
MPU.

 

 

2019-11-13 15:14:10,191 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, 
bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, 
replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID {

  containerBlockID

{     containerID: 2     localID: 103129366531867089   }

  blockCommitSequenceId: 4978

}

offset: 0

length: 5242880

createVersion: 0

pipeline {

  leaderID: ""

  members {

    uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551"

    ipAddress: "9.134.51.232"

    hostName: "9.134.51.232"

    ports

{       name: "RATIS"       value: 9858     }

    ports

{       name: "STANDALONE"       value: 9859     }

    networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551"

    networkLocation: "/default-rack"

  }

  members {

    uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d"

    ipAddress: "9.134.51.25"

    hostName: "9.134.51.25"

    ports

{       name: "RATIS"       value: 9858     }

    ports

{       name: "STANDALONE"       value: 9859     }

    networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d"

    networkLocation: "/default-rack"

  }

  members {

    uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"

    ipAddress: "9.134.51.215"

    hostName: "9.134.51.215"

    ports

{       name: "RATIS"       value: 9858     }

    ports

{       name: "STANDALONE"       value: 9859     }

    networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"

    networkLocation: "/default-rack"

  }

  state: PIPELINE_OPEN

  type: RATIS

  factor: THREE

  id

{     id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8"   }

}

]} | ret=SUCCESS |

  was:
Right now, complete Multipart Upload does not print the partName and 
partNumber in the audit log.

 

 

2019-11-13 15:14:10,191 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, 
bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, 
replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID {

  containerBlockID {

    containerID: 2

    localID: 103129366531867089

  }

  blockCommitSequenceId: 4978

}

offset: 0

length: 5242880

createVersion: 0

pipeline {

  leaderID: ""

  members {

    uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551"

    ipAddress: "9.134.51.232"

    hostName: "9.134.51.232"

    ports {

      name: "RATIS"

      value: 9858

    }

    ports {

      name: "STANDALONE"

      value: 9859

    }

    networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551"

    networkLocation: "/default-rack"

  }

  members {

    uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d"

    ipAddress: "9.134.51.25"

    hostName: "9.134.51.25"

    ports {

      name: "RATIS"

      value: 9858

    }

    ports {

      name: "STANDALONE"

      value: 9859

    }

    networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d"

    networkLocation: "/default-rack"

  }

  members {

    uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"

    ipAddress: "9.134.51.215"

    hostName: "9.134.51.215"

    ports {

      name: "RATIS"

      value: 9858

    }

    ports {

      name: "STANDALONE"

      value: 9859

    }

    networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"

    networkLocation: "/default-rack"

  }

  state: PIPELINE_OPEN

  type: RATIS

  factor: THREE

  id {

    id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8"

  }

}

]} | ret=SUCCESS |


> Add partName, partNumber for CommitMultipartUpload
> --
>
> Key: HDDS-2470
> URL: https://issues.apache.org/jira/browse/HDDS-2470
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
> Right now, complete Multipart Upload does not print the partName and 
> partNumber in the audit log; adding them will help in analyzing audit logs 
> for MPU.
>  
>  
> 2019-11-13 15:14:10,191 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
> op=COMMIT_MULTIPART_UPLOAD_PARTKEY 
> {volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
> key=plc_1570850798896_2991, dataSize=5242880, replicationType=RATIS, 
> replicationFactor=ONE, keyLocationInfo=[blockID {
>   containerBlockID
> {     containerID: 2     localID: 103129366531867089   }
>   blockCommitSequenceId: 4978
> }
> offset: 0
> length: 5242880
> createVersion: 0
> pipeline {
>   leaderID: ""
>   members {
>     uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551"
>     ipAddress: "9.134.51.232"
>     hostNam

[jira] [Created] (HDDS-2470) Add partName, partNumber for CommitMultipartUpload

2019-11-13 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2470:


 Summary: Add partName, partNumber for CommitMultipartUpload
 Key: HDDS-2470
 URL: https://issues.apache.org/jira/browse/HDDS-2470
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham


Right now, complete Multipart Upload does not print the partName and 
partNumber in the audit log.

 

 

2019-11-13 15:14:10,191 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s325d55ad283aa400af464c76d713c07ad, 
bucket=ozone-test, key=plc_1570850798896_2991, dataSize=5242880, 
replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[blockID {

  containerBlockID {

    containerID: 2

    localID: 103129366531867089

  }

  blockCommitSequenceId: 4978

}

offset: 0

length: 5242880

createVersion: 0

pipeline {

  leaderID: ""

  members {

    uuid: "5d03aed5-cfb3-4689-b168-0c9a94316551"

    ipAddress: "9.134.51.232"

    hostName: "9.134.51.232"

    ports {

      name: "RATIS"

      value: 9858

    }

    ports {

      name: "STANDALONE"

      value: 9859

    }

    networkName: "5d03aed5-cfb3-4689-b168-0c9a94316551"

    networkLocation: "/default-rack"

  }

  members {

    uuid: "a71462ae-7865-4ed5-b84e-60616df60a0d"

    ipAddress: "9.134.51.25"

    hostName: "9.134.51.25"

    ports {

      name: "RATIS"

      value: 9858

    }

    ports {

      name: "STANDALONE"

      value: 9859

    }

    networkName: "a71462ae-7865-4ed5-b84e-60616df60a0d"

    networkLocation: "/default-rack"

  }

  members {

    uuid: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"

    ipAddress: "9.134.51.215"

    hostName: "9.134.51.215"

    ports {

      name: "RATIS"

      value: 9858

    }

    ports {

      name: "STANDALONE"

      value: 9859

    }

    networkName: "79bf7bdf-ed29-49d4-bf7c-e88fdbd2ce03"

    networkLocation: "/default-rack"

  }

  state: PIPELINE_OPEN

  type: RATIS

  factor: THREE

  id {

    id: "ec6b06c5-193f-4c30-879b-5a12284dc4f8"

  }

}

]} | ret=SUCCESS |
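
A small sketch of the intended audit improvement, using a hypothetical helper 
(the real change would live in the OM commit-part request handler):

{code:java}
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: extend the audit map logged for
// COMMIT_MULTIPART_UPLOAD_PARTKEY with the part details, so audit entries
// can be correlated with the part list seen at completeMultipartUpload.
final class CommitPartAudit {
  static Map<String, String> withPartInfo(Map<String, String> keyArgsAuditMap,
      int partNumber, String partName) {
    Map<String, String> auditMap = new LinkedHashMap<>(keyArgsAuditMap);
    auditMap.put("partNumber", String.valueOf(partNumber));
    auditMap.put("partName", partName);
    return auditMap;
  }
}
{code}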



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2465) S3 Multipart upload failing

2019-11-12 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972951#comment-16972951
 ] 

Bharat Viswanadham commented on HDDS-2465:
--

cc [~elek]

> S3 Multipart upload failing
> ---
>
> Key: HDDS-2465
> URL: https://issues.apache.org/jira/browse/HDDS-2465
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Major
> Attachments: MPU.java
>
>
> When I run the attached Java program, I get the below error during 
> completeMultipartUpload.
> {code:java}
> ERROR StatusLogger No Log4j 2 configuration file found. Using default 
> configuration (logging only errors to the console), or user programmatically 
> provided configurations. Set system property 'log4j2.debug' to show Log4j 2 
> internal initialization logging. See 
> https://logging.apache.org/log4j/2.x/manual/configuration.html for 
> instructions on how to configure Log4j 2ERROR StatusLogger No Log4j 2 
> configuration file found. Using default configuration (logging only errors to 
> the console), or user programmatically provided configurations. Set system 
> property 'log4j2.debug' to show Log4j 2 internal initialization logging. See 
> https://logging.apache.org/log4j/2.x/manual/configuration.html for 
> instructions on how to configure Log4j 2Exception in thread "main" 
> com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: 
> Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 
> c7b87393-955b-4c93-85f6-b02945e293ca; S3 Extended Request ID: 7tnVbqgc4bgb), 
> S3 Extended Request ID: 7tnVbqgc4bgb at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1712)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1367)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
>  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532) at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512) at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4921) at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4867) at 
> com.amazonaws.services.s3.AmazonS3Client.completeMultipartUpload(AmazonS3Client.java:3464)
>  at org.apache.hadoop.ozone.freon.MPU.main(MPU.java:96){code}
> When I debug, it looks like the request is not being received by the S3 
> Gateway, and I don't see any trace of it in the audit log.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2465) S3 Multipart upload failing

2019-11-12 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2465:
-
Attachment: MPU.java

> S3 Multipart upload failing
> ---
>
> Key: HDDS-2465
> URL: https://issues.apache.org/jira/browse/HDDS-2465
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Priority: Major
> Attachments: MPU.java
>
>
> When I run the attached Java program, I get the below error during 
> completeMultipartUpload.
> {code:java}
> ERROR StatusLogger No Log4j 2 configuration file found. Using default 
> configuration (logging only errors to the console), or user programmatically 
> provided configurations. Set system property 'log4j2.debug' to show Log4j 2 
> internal initialization logging. See 
> https://logging.apache.org/log4j/2.x/manual/configuration.html for 
> instructions on how to configure Log4j 2ERROR StatusLogger No Log4j 2 
> configuration file found. Using default configuration (logging only errors to 
> the console), or user programmatically provided configurations. Set system 
> property 'log4j2.debug' to show Log4j 2 internal initialization logging. See 
> https://logging.apache.org/log4j/2.x/manual/configuration.html for 
> instructions on how to configure Log4j 2Exception in thread "main" 
> com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: 
> Amazon S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 
> c7b87393-955b-4c93-85f6-b02945e293ca; S3 Extended Request ID: 7tnVbqgc4bgb), 
> S3 Extended Request ID: 7tnVbqgc4bgb at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1712)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1367)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
>  at 
> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
>  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532) at 
> com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512) at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4921) at 
> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4867) at 
> com.amazonaws.services.s3.AmazonS3Client.completeMultipartUpload(AmazonS3Client.java:3464)
>  at org.apache.hadoop.ozone.freon.MPU.main(MPU.java:96){code}
> When I debug, it looks like the request is not being received by the S3 
> Gateway, and I don't see any trace of it in the audit log.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2465) S3 Multipart upload failing

2019-11-12 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2465:


 Summary: S3 Multipart upload failing
 Key: HDDS-2465
 URL: https://issues.apache.org/jira/browse/HDDS-2465
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


When I run the attached Java program, I get the below error during 
completeMultipartUpload.
{code:java}
ERROR StatusLogger No Log4j 2 configuration file found. Using default 
configuration (logging only errors to the console), or user programmatically 
provided configurations. Set system property 'log4j2.debug' to show Log4j 2 
internal initialization logging. See 
https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions 
on how to configure Log4j 2ERROR StatusLogger No Log4j 2 configuration file 
found. Using default configuration (logging only errors to the console), or 
user programmatically provided configurations. Set system property 
'log4j2.debug' to show Log4j 2 internal initialization logging. See 
https://logging.apache.org/log4j/2.x/manual/configuration.html for instructions 
on how to configure Log4j 2Exception in thread "main" 
com.amazonaws.services.s3.model.AmazonS3Exception: Bad Request (Service: Amazon 
S3; Status Code: 400; Error Code: 400 Bad Request; Request ID: 
c7b87393-955b-4c93-85f6-b02945e293ca; S3 Extended Request ID: 7tnVbqgc4bgb), S3 
Extended Request ID: 7tnVbqgc4bgb at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1712)
 at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1367)
 at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1113)
 at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:770)
 at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:744)
 at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:726)
 at 
com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:686)
 at 
com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:668)
 at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:532) at 
com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:512) at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4921) at 
com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:4867) at 
com.amazonaws.services.s3.AmazonS3Client.completeMultipartUpload(AmazonS3Client.java:3464)
 at org.apache.hadoop.ozone.freon.MPU.main(MPU.java:96){code}
When I debug, it looks like the request is not being received by the S3 
Gateway, and I don't see any trace of it in the audit log.
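
For context, a hedged reconstruction of the kind of multipart upload the 
attached MPU.java presumably performs against the S3 gateway (the attachment 
itself is not inlined here; bucket, key, and part count are placeholders):

{code:java}
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.CompleteMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadRequest;
import com.amazonaws.services.s3.model.InitiateMultipartUploadResult;
import com.amazonaws.services.s3.model.PartETag;
import com.amazonaws.services.s3.model.UploadPartRequest;
import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;

final class MpuSketch {
  static void run(AmazonS3 s3, String bucket, String key) {
    // Start the multipart upload and remember its uploadId.
    InitiateMultipartUploadResult init = s3.initiateMultipartUpload(
        new InitiateMultipartUploadRequest(bucket, key));
    List<PartETag> etags = new ArrayList<>();
    byte[] data = new byte[5 * 1024 * 1024]; // 5 MB, the S3 minimum part size
    for (int partNumber = 1; partNumber <= 2; partNumber++) {
      UploadPartRequest req = new UploadPartRequest()
          .withBucketName(bucket)
          .withKey(key)
          .withUploadId(init.getUploadId())
          .withPartNumber(partNumber)
          .withInputStream(new ByteArrayInputStream(data))
          .withPartSize(data.length);
      etags.add(s3.uploadPart(req).getPartETag());
    }
    // This is the call that fails here with 400 Bad Request.
    s3.completeMultipartUpload(new CompleteMultipartUploadRequest(
        bucket, key, init.getUploadId(), etags));
  }
}
{code}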

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-12 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972874#comment-16972874
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/12/19 11:22 PM:
-

Hi [~timmylicheng]

Thanks for sharing the logs.

I see an abort multipart upload request for the key plc_1570863541668_9278 
once the complete multipart upload failed.

 

 
{code:java}
2019-11-08 20:08:24,830 | ERROR | OMAudit | user=root | ip=9.134.50.210 | 
op=COMPLETE_MULTIPART_UPLOAD
{volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
key=plc_1570863541668_9278, dataSize=0, replicationType=RATIS, 
replicationFactor=ONE, keyLocationIn       fo=[], multipartList=[partNumber: 1  
 5626 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102209374356085"
   5627 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102209374487158"
     . .   5911 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102211984655258"
  5912 ]} | ret=FAILURE | INVALID_PART 
org.apache.hadoop.ozone.om.exceptions.OMException: Complete Multipart Upload 
Failed: volume: s325d55ad283aa400af464c76d713c07adbucket: ozone-testkey: 
plc_1570863541668_9278
2019-11-08 20:08:24,963 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
op=ABORT_MULTIPART_UPLOAD
{volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
key=plc_1570863541668_9278, dataSize=0, replicationType=RATIS, 
replicationFactor=ONE, keyLocationInfo=       []} 
{code}
 

And after that, allocateBlock still continues for the key, because its entry 
in the openKeyTable is not removed by the abortMultipartUpload request. (Abort 
removes only the entry created during the initiateMPU request; that is why, 
after some time, you see the NO_SUCH_MULTIPART_UPLOAD error during 
commitMultipartUploadKey, as we removed the entry from the MultipartInfo 
table. The strange thing I have observed is that the clientID does not match 
any of the names in the part list, even though the last part of a partName is 
the clientID.)

 

And from the OM audit log, I see partNumber 1 and a list of part names; I am 
not sure if some of the log is truncated here, as it should show partName and 
partNumber pairs.
 # If you can confirm what parts OM has for this key, you can get this from 
listParts (but this has to be done before the abort request); a sketch follows 
this list.
 # Check in the OM audit log what part list we got for this key; I am not sure 
whether it is truncated in the uploaded log. 
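
A hedged sketch of suggestion 1 via the AWS SDK against the S3 gateway 
(bucket, key, and uploadId are the values from the failing upload; this must 
run before the abort):

{code:java}
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.ListPartsRequest;
import com.amazonaws.services.s3.model.PartListing;
import com.amazonaws.services.s3.model.PartSummary;

final class ListPartsSketch {
  // Prints every part OM currently has for the pending multipart upload.
  static void dumpParts(AmazonS3 s3, String bucket, String key,
      String uploadId) {
    PartListing listing =
        s3.listParts(new ListPartsRequest(bucket, key, uploadId));
    for (PartSummary part : listing.getParts()) {
      System.out.println("partNumber=" + part.getPartNumber()
          + " eTag=" + part.getETag() + " size=" + part.getSize());
    }
  }
}
{code}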

 

On my cluster, the audit logs look like the below: for completeMultipartUpload 
I can see the partNumber and partName (whereas in the uploaded log I don't see 
this).

 

 
{code:java}
2019-11-12 14:57:18,580 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=INITIATE_MULTIPART_UPLOAD {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, 
bucket=b1234, key=key123, dataSize=0, replicationType=RATIS, 
replicationFactor=THREE, keyLocationInfo=[]} | ret=SUCCESS | 
2019-11-12 14:57:53,967 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=ALLOCATE_KEY {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, bucket=b1234, 
key=key123, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, 
keyLocationInfo=[]} | ret=SUCCESS | 
2019-11-12 14:57:53,974 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=ALLOCATE_BLOCK {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, bucket=b1234, 
key=key123, dataSize=5242880, replicationType=RATIS, replicationFactor=THREE, 
keyLocationInfo=[], clientID=103127415125868581} | ret=SUCCESS | 
2019-11-12 14:57:54,154 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, 
bucket=b1234, key=key123, dataSize=5242880, replicationType=RATIS, 
replicationFactor=ONE, keyLocationInfo=[blockID {
  containerBlockID {
    containerID: 6
    localID: 103127415126327331
  }
  blockCommitSequenceId: 18
}
offset: 0
length: 5242880
createVersion: 0
pipeline {
  leaderID: ""
  members {
    uuid: "a5fba53c-3aa9-48a7-8272-34c606f93bc6"
    ipAddress: "10.65.49.251"
    hostName: "bh-ozone-3.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "a5fba53c-3aa9-48a7-8272-34c606f93bc6"
    networkLocation: "/default-rack"
  }
  members {
    uuid: "5e2625aa-637e-4e5a-a0a1-6683bd108b0d"
    ipAddress: "10.65.51.23"
    hostName: "bh-ozone-4.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "5e2625aa-637e-4e5a-a0a1-6683bd108b0d"
    networkLocation: "/default-rack"
  }
  members {
    uuid: "cf8aace1-92b8-496e-aed9-f2771c83a56b"
    ipAddress: "10.65.53.160"
    hostName: "bh-ozone-2.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDA

[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-12 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972874#comment-16972874
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/12/19 11:21 PM:
-

Hi [~timmylicheng]

Thanks for sharing the logs.

I see an abort multipart upload request for the key plc_1570863541668_9278 
once the complete multipart upload failed.

 

 
{code:java}
2019-11-08 20:08:24,830 | ERROR | OMAudit | user=root | ip=9.134.50.210 | 
op=COMPLETE_MULTIPART_UPLOAD
{volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
key=plc_1570863541668_9278, dataSize=0, replicationType=RATIS, 
replicationFactor=ONE, keyLocationIn       fo=[], multipartList=[partNumber: 1  
 5626 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102209374356085"
   5627 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102209374487158"
     . .   5911 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102211984655258"
  5912 ]} | ret=FAILURE | INVALID_PART 
org.apache.hadoop.ozone.om.exceptions.OMException: Complete Multipart Upload 
Failed: volume: s325d55ad283aa400af464c76d713c07adbucket: ozone-testkey: 
plc_1570863541668_9278
2019-11-08 20:08:24,963 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
op=ABORT_MULTIPART_UPLOAD
{volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
key=plc_1570863541668_9278, dataSize=0, replicationType=RATIS, 
replicationFactor=ONE, keyLocationInfo=       []} 
{code}
 

And after that, allocateBlock still continues for the key, because its entry 
in the openKeyTable is not removed by the abortMultipartUpload request. (Abort 
removes only the entry created during the initiateMPU request; that is why, 
after some time, you see the NO_SUCH_MULTIPART_UPLOAD error during 
commitMultipartUploadKey, as we removed the entry from the MultipartInfo 
table. The strange thing I have observed is that the clientID does not match 
any of the names in the part list, even though the last part of a partName is 
the clientID.)

 

And from the OM audit log, I see partNumber 1 and a list of part names; I am 
not sure if some of the log is truncated here, as it should show partName and 
partNumber pairs.
 # If you can confirm what parts OM has for this key, you can get this from 
listParts (but this has to be done before the abort request).
 # Check in the OM audit log what part list we got for this key; I am not sure 
whether it is truncated in the uploaded log. 

 

On my cluster, the audit logs look like the below: for completeMultipartUpload 
I can see the partNumber and partName (whereas in the uploaded log I don't see 
this).

 

 
{code:java}
2019-11-12 14:57:18,580 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=INITIATE_MULTIPART_UPLOAD {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, 
bucket=b1234, key=key123, dataSize=0, replicationType=RATIS, 
replicationFactor=THREE, keyLocationInfo=[]} | ret=SUCCESS | 
2019-11-12 14:57:53,967 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=ALLOCATE_KEY {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, bucket=b1234, 
key=key123, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, 
keyLocationInfo=[]} | ret=SUCCESS | 
2019-11-12 14:57:53,974 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=ALLOCATE_BLOCK {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, bucket=b1234, 
key=key123, dataSize=5242880, replicationType=RATIS, replicationFactor=THREE, 
keyLocationInfo=[], clientID=103127415125868581} | ret=SUCCESS | 
2019-11-12 14:57:54,154 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, 
bucket=b1234, key=key123, dataSize=5242880, replicationType=RATIS, 
replicationFactor=ONE, keyLocationInfo=[blockID {
  containerBlockID {
    containerID: 6
    localID: 103127415126327331
  }
  blockCommitSequenceId: 18
}
offset: 0
length: 5242880
createVersion: 0
pipeline {
  leaderID: ""
  members {
    uuid: "a5fba53c-3aa9-48a7-8272-34c606f93bc6"
    ipAddress: "10.65.49.251"
    hostName: "bh-ozone-3.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "a5fba53c-3aa9-48a7-8272-34c606f93bc6"
    networkLocation: "/default-rack"
  }
  members {
    uuid: "5e2625aa-637e-4e5a-a0a1-6683bd108b0d"
    ipAddress: "10.65.51.23"
    hostName: "bh-ozone-4.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "5e2625aa-637e-4e5a-a0a1-6683bd108b0d"
    networkLocation: "/default-rack"
  }
  members {
    uuid: "cf8aace1-92b8-496e-aed9-f2771c83a56b"
    ipAddress: "10.65.53.160"
    hostName: "bh-ozone-2.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDA

[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-12 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972874#comment-16972874
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/12/19 11:17 PM:
-

Hi [~timmylicheng]

Thanks for sharing the logs.

I see an abort multipart upload request for the key plc_1570863541668_9278 
once the complete multipart upload failed.

 

 
{code:java}
2019-11-08 20:08:24,830 | ERROR | OMAudit | user=root | ip=9.134.50.210 | 
op=COMPLETE_MULTIPART_UPLOAD
{volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
key=plc_1570863541668_9278, dataSize=0, replicationType=RATIS, 
replicationFactor=ONE, keyLocationIn       fo=[], multipartList=[partNumber: 1  
 5626 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102209374356085"
   5627 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102209374487158"
     . .   5911 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102211984655258"
  5912 ]} | ret=FAILURE | INVALID_PART 
org.apache.hadoop.ozone.om.exceptions.OMException: Complete Multipart Upload 
Failed: volume: s325d55ad283aa400af464c76d713c07adbucket: ozone-testkey: 
plc_1570863541668_9278
2019-11-08 20:08:24,963 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
op=ABORT_MULTIPART_UPLOAD
{volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
key=plc_1570863541668_9278, dataSize=0, replicationType=RATIS, 
replicationFactor=ONE, keyLocationInfo=       []} 
{code}
 

And after that, allocateBlock still continues for the key, because its entry 
in the openKeyTable is not removed by the abortMultipartUpload request. (Abort 
removes only the entry created during the initiateMPU request; that is why, 
after some time, you see the NO_SUCH_MULTIPART_UPLOAD error during 
commitMultipartUploadKey, as we removed the entry from the MultipartInfo 
table. The strange thing I have observed is that the clientID does not match 
any of the names in the part list, even though the last part of a partName is 
the clientID.)

 

And from the OM audit log, I see partNumber 1 and a list of part names; I am 
not sure if some of the log is truncated here, as it should show partName and 
partNumber pairs.
 # If you can confirm what parts OM has for this key, you can get this from 
listParts (but this has to be done before the abort request).
 # Check in the OM audit log what part list we got for this key; I am not sure 
whether it is truncated in the uploaded log. 

 

On my cluster, the audit logs look like the below: for completeMultipartUpload 
I can see the partNumber and partName (whereas in the uploaded log I don't see 
this).

 

 
{code:java}
2019-11-12 14:57:18,580 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=INITIATE_MULTIPART_UPLOAD {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, 
bucket=b1234, key=key123, dataSize=0, replicationType=RATIS, 
replicationFactor=THREE, keyLocationInfo=[]} | ret=SUCCESS | 
2019-11-12 14:57:53,967 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=ALLOCATE_KEY {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, bucket=b1234, 
key=key123, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, 
keyLocationInfo=[]} | ret=SUCCESS | 
2019-11-12 14:57:53,974 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=ALLOCATE_BLOCK {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, bucket=b1234, 
key=key123, dataSize=5242880, replicationType=RATIS, replicationFactor=THREE, 
keyLocationInfo=[], clientID=103127415125868581} | ret=SUCCESS | 
2019-11-12 14:57:54,154 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, 
bucket=b1234, key=key123, dataSize=5242880, replicationType=RATIS, 
replicationFactor=ONE, keyLocationInfo=[blockID {
  containerBlockID {
    containerID: 6
    localID: 103127415126327331
  }
  blockCommitSequenceId: 18
}
offset: 0
length: 5242880
createVersion: 0
pipeline {
  leaderID: ""
  members {
    uuid: "a5fba53c-3aa9-48a7-8272-34c606f93bc6"
    ipAddress: "10.65.49.251"
    hostName: "bh-ozone-3.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "a5fba53c-3aa9-48a7-8272-34c606f93bc6"
    networkLocation: "/default-rack"
  }
  members {
    uuid: "5e2625aa-637e-4e5a-a0a1-6683bd108b0d"
    ipAddress: "10.65.51.23"
    hostName: "bh-ozone-4.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "5e2625aa-637e-4e5a-a0a1-6683bd108b0d"
    networkLocation: "/default-rack"
  }
  members {
    uuid: "cf8aace1-92b8-496e-aed9-f2771c83a56b"
    ipAddress: "10.65.53.160"
    hostName: "bh-ozone-2.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDA

[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-12 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972874#comment-16972874
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/12/19 11:08 PM:
-

Hi [~timmylicheng]

Thanks for sharing the logs.

I see an abort multipart upload request for the key plc_1570863541668_9278 
once the complete multipart upload failed.

 

2019-11-08 20:08:24,830 | ERROR | OMAudit | user=root | ip=9.134.50.210 | 
op=COMPLETE_MULTIPART_UPLOAD

{volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
key=plc_1570863541668_9278, dataSize=0, replicationType=RATIS, 
replicationFactor=ONE, keyLocationIn       fo=[], multipartList=[partNumber: 1  
 5626 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102209374356085"
   5627 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102209374487158"
     . .   5911 partName: 
"/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102211984655258"
   5912 ]}

| ret=FAILURE | INVALID_PART org.apache.hadoop.ozone.om.exceptions.OMException: 
Complete Multipart Upload Failed: volume: 
s325d55ad283aa400af464c76d713c07adbucket: ozone-testkey: plc_1570863541668_9278

2019-11-08 20:08:24,963 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
op=ABORT_MULTIPART_UPLOAD

{volume=s325d55ad283aa400af464c76d713c07ad, bucket=ozone-test, 
key=plc_1570863541668_9278, dataSize=0, replicationType=RATIS, 
replicationFactor=ONE, keyLocationInfo=       []}

| ret=SUCCESS |

 

And after that, allocateBlock still continues for the key, because its entry 
in the openKeyTable is not removed by the abortMultipartUpload request. (Abort 
removes only the entry created during the initiateMPU request; that is why, 
after some time, you see the NO_SUCH_MULTIPART_UPLOAD error during 
commitMultipartUploadKey, as we removed the entry from the MultipartInfo 
table. The strange thing I have observed is that the clientID does not match 
any of the names in the part list, even though the last part of a partName is 
the clientID.)

 

And from the OM audit log, I see partNumber 1 and a list of part names; I am 
not sure if some of the log is truncated here, as it should show partName and 
partNumber pairs.
 # If you can confirm what parts OM has for this key, you can get this from 
listParts (but this has to be done before the abort request).
 # Check in the OM audit log what part list we got for this key; I am not sure 
whether it is truncated in the uploaded log. 

 

On my cluster, the audit logs look like the below: for completeMultipartUpload 
I can see the partNumber and partName (whereas in the uploaded log I don't see 
this).

 

 
{code:java}
2019-11-12 14:57:18,580 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=INITIATE_MULTIPART_UPLOAD {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, 
bucket=b1234, key=key123, dataSize=0, replicationType=RATIS, 
replicationFactor=THREE, keyLocationInfo=[]} | ret=SUCCESS | 
2019-11-12 14:57:53,967 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=ALLOCATE_KEY {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, bucket=b1234, 
key=key123, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, 
keyLocationInfo=[]} | ret=SUCCESS | 
2019-11-12 14:57:53,974 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=ALLOCATE_BLOCK {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, bucket=b1234, 
key=key123, dataSize=5242880, replicationType=RATIS, replicationFactor=THREE, 
keyLocationInfo=[], clientID=103127415125868581} | ret=SUCCESS | 
2019-11-12 14:57:54,154 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, 
bucket=b1234, key=key123, dataSize=5242880, replicationType=RATIS, 
replicationFactor=ONE, keyLocationInfo=[blockID {
  containerBlockID {
    containerID: 6
    localID: 103127415126327331
  }
  blockCommitSequenceId: 18
}
offset: 0
length: 5242880
createVersion: 0
pipeline {
  leaderID: ""
  members {
    uuid: "a5fba53c-3aa9-48a7-8272-34c606f93bc6"
    ipAddress: "10.65.49.251"
    hostName: "bh-ozone-3.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "a5fba53c-3aa9-48a7-8272-34c606f93bc6"
    networkLocation: "/default-rack"
  }
  members {
    uuid: "5e2625aa-637e-4e5a-a0a1-6683bd108b0d"
    ipAddress: "10.65.51.23"
    hostName: "bh-ozone-4.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "5e2625aa-637e-4e5a-a0a1-6683bd108b0d"
    networkLocation: "/default-rack"
  }
  members {
    uuid: "cf8aace1-92b8-496e-aed9-f2771c83a56b"
    ipAddress: "10.65.53.160"
    hostName: "bh-ozone-2.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDA

[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-12 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16972874#comment-16972874
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

Hi [~timmylicheng]

Thanks for sharing the logs.

I see an abort multipart upload request for the key plc_1570863541668_9278 once 
the complete multipart upload request failed.

 

2019-11-08 20:08:24,830 | ERROR | OMAudit | user=root | ip=9.134.50.210 | 
op=COMPLETE_MULTIPART_UPLOAD {volume=s325d55ad283aa400af464c76d713c07ad, 
bucket=ozone-test, key=plc_1570863541668_9278, dataSize=0, 
replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[], 
multipartList=[partNumber: 1
partName: "/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102209374356085"
partName: "/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102209374487158"
...
partName: "/s325d55ad283aa400af464c76d713c07ad/ozone-test/plc_1570863541668_9278103102211984655258"
]} | ret=FAILURE | INVALID_PART 
org.apache.hadoop.ozone.om.exceptions.OMException: Complete Multipart Upload 
Failed: volume: s325d55ad283aa400af464c76d713c07adbucket: ozone-testkey: 
plc_1570863541668_9278

2019-11-08 20:08:24,963 | INFO  | OMAudit | user=root | ip=9.134.50.210 | 
op=ABORT_MULTIPART_UPLOAD {volume=s325d55ad283aa400af464c76d713c07ad, 
bucket=ozone-test, key=plc_1570863541668_9278, dataSize=0, 
replicationType=RATIS, replicationFactor=ONE, keyLocationInfo=[]} | 
ret=SUCCESS |

 

And after that, allocateBlock still continues for the key, because the entry in 
openKeyTable is not removed by the abortMultipartUpload request. (Abort removes 
only the entry which was created during the initiateMPU request; that is the 
reason why, after some time, you see the NO_SUCH_MULTIPART_UPLOAD error during 
commitMultipartUploadKey, as we removed the entry from the MultipartInfo 
table.) But the strange thing I have observed is that the clientID does not 
match any of the names in the part list, even though the last part of a 
partName is the clientID.

 

And from the OM audit log I see partNumber 1 and then only a list of part 
names; I am not sure if some of the log is truncated here, as it should show 
partName and partNumber pairs.
 # If you can confirm what parts OM has for this key: you can get this from 
listParts (but this should be done before the abort request).
 # Check in the OM audit log what part list we get for this key; I am not sure 
whether it is truncated in the uploaded log.

 

On my cluster the audit logs look like below.

 

 
{code:java}
2019-11-12 14:57:18,580 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=INITIATE_MULTIPART_UPLOAD {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, 
bucket=b1234, key=key123, dataSize=0, replicationType=RATIS, 
replicationFactor=THREE, keyLocationInfo=[]} | ret=SUCCESS | 
2019-11-12 14:57:53,967 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=ALLOCATE_KEY {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, bucket=b1234, 
key=key123, dataSize=5242880, replicationType=RATIS, replicationFactor=ONE, 
keyLocationInfo=[]} | ret=SUCCESS | 
2019-11-12 14:57:53,974 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=ALLOCATE_BLOCK {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, bucket=b1234, 
key=key123, dataSize=5242880, replicationType=RATIS, replicationFactor=THREE, 
keyLocationInfo=[], clientID=103127415125868581} | ret=SUCCESS | 
2019-11-12 14:57:54,154 | INFO  | OMAudit | user=root | ip=10.65.53.160 | 
op=COMMIT_MULTIPART_UPLOAD_PARTKEY {volume=s3dfb57b2e5f36c1f893dbc12dd66897d4, 
bucket=b1234, key=key123, dataSize=5242880, replicationType=RATIS, 
replicationFactor=ONE, keyLocationInfo=[blockID {
  containerBlockID {
    containerID: 6
    localID: 103127415126327331
  }
  blockCommitSequenceId: 18
}
offset: 0
length: 5242880
createVersion: 0
pipeline {
  leaderID: ""
  members {
    uuid: "a5fba53c-3aa9-48a7-8272-34c606f93bc6"
    ipAddress: "10.65.49.251"
    hostName: "bh-ozone-3.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "a5fba53c-3aa9-48a7-8272-34c606f93bc6"
    networkLocation: "/default-rack"
  }
  members {
    uuid: "5e2625aa-637e-4e5a-a0a1-6683bd108b0d"
    ipAddress: "10.65.51.23"
    hostName: "bh-ozone-4.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "5e2625aa-637e-4e5a-a0a1-6683bd108b0d"
    networkLocation: "/default-rack"
  }
  members {
    uuid: "cf8aace1-92b8-496e-aed9-f2771c83a56b"
    ipAddress: "10.65.53.160"
    hostName: "bh-ozone-2.vpc.cloudera.com"
    ports {
      name: "RATIS"
      value: 9858
    }
    ports {
      name: "STANDALONE"
      value: 9859
    }
    networkName: "cf8aace1-92b8-496e-aed9-f2771c83a56b"
    networkLocation: "/default-rack"
  }
  state: PIPELINE_OPEN
  type: RATIS
  factor: T

[jira] [Created] (HDDS-2453) Add Freon tests for S3Bucket/MPU Keys

2019-11-08 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2453:


 Summary: Add Freon tests for S3Bucket/MPU Keys
 Key: HDDS-2453
 URL: https://issues.apache.org/jira/browse/HDDS-2453
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


This Jira is to create freon tests for 
 # S3Bucket creation.
 # S3 MPU Key uploads.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2453) Add Freon tests for S3Bucket/MPU Keys

2019-11-08 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2453?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-2453:


Assignee: Bharat Viswanadham

> Add Freon tests for S3Bucket/MPU Keys
> -
>
> Key: HDDS-2453
> URL: https://issues.apache.org/jira/browse/HDDS-2453
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
> This Jira is to create freon tests for 
>  # S3Bucket creation.
>  # S3 MPU Key uploads.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2410) Ozoneperf docker cluster should use privileged containers

2019-11-08 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2410.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Ozoneperf docker cluster should use privileged containers
> -
>
> Key: HDDS-2410
> URL: https://issues.apache.org/jira/browse/HDDS-2410
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Marton Elek
>Assignee: Marton Elek
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The profiler 
> [servlet|https://github.com/elek/hadoop-ozone/blob/master/hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/server/ProfileServlet.java]
>  (which helps to run a Java profiler in the background and publishes the 
> result on the web interface) requires privileged Docker containers.
>  
> This flag is missing from the ozoneperf docker-compose cluster (which is 
> designed to run performance tests).
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-08 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16970430#comment-16970430
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

Hi [~timmylicheng]

With every run we are seeing a new error and stack trace, and the log has not 
given us much information about the root cause.

I think to debug this we need to know why the multipart upload is not found for 
the multipart upload key, or why we sometimes see the InvalidMultipartUpload 
error. We can look at the audit logs to see what request is passed in for 
multipart upload requests, and for the same key we can use listParts to find 
out what parts OM has in its MultipartInfoTable (this will help with the 
InvalidPart error).

And also I think we should enable trace/debug logging to see the incoming 
requests and why we see these errors for multipart upload. (Not sure if it is 
some bug in the cache logic, or some handling we missed for MPU requests.)

 

To debug this we need the complete OM log, audit log, and S3 gateway log. And 
also enable trace to see what requests are incoming; I think we log them in 
OzoneManagerProtocolServerSideTranslatorPB.
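
As a sketch of turning that on programmatically (the logger name is simply the 
class above; on a running cluster the usual way is the log4j.properties line 
shown in the comment):
{code:java}
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class EnableOmRequestTrace {
  public static void main(String[] args) {
    // Equivalent log4j.properties entry:
    // log4j.logger.org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB=TRACE
    Logger.getLogger(
        "org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB")
        .setLevel(Level.TRACE);
  }
}
{code}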

 

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2019-11-06_18_13_57_422_ERROR, hs_err_pid9340.log, 
> image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path 
> on VM0, while reading data from VM0 local disk and write to mount path. The 
> dataset has various sizes of files from 0 byte to GB-level and it has a 
> number of ~50,000 files. 
> The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I 
> look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related with Multipart upload. This error eventually causes the  writing to 
> terminate and OM to be closed. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload 

[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-08 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16970430#comment-16970430
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/8/19 5:29 PM:
---

Hi [~timmylicheng]

With every run we are seeing a new error and stack trace, and the log has not 
given us much information about the root cause.

I think to debug this we need to know why the multipart upload is not found for 
the multipart upload key, or why we sometimes see the InvalidMultipartUpload 
error. We can look at the audit logs to see what request is passed in for 
multipart upload requests, and for the same key we can use listParts to find 
out what parts OM has in its MultipartInfoTable (this will help with the 
InvalidPart error).

And also I think we should enable trace/debug logging to see the incoming 
requests and why we see these errors for multipart upload. (Not sure if it is 
some bug in the cache logic, or some handling we missed for MPU requests.)

 

To debug this we need the complete OM log, audit log, and S3 gateway log. And 
also enable trace to see what requests are incoming; I think we log them in 
OzoneManagerProtocolServerSideTranslatorPB.

 

Let us know if you have any suggestions.

 


was (Author: bharatviswa):
Hi [~timmylicheng]

With every run we are seeing a new error and stack trace, and the log has not 
given us much information about the root cause.

I think to debug this we need to know why the multipart upload is not found for 
the multipart upload key, or why we sometimes see the InvalidMultipartUpload 
error. We can look at the audit logs to see what request is passed in for 
multipart upload requests, and for the same key we can use listParts to find 
out what parts OM has in its MultipartInfoTable (this will help with the 
InvalidPart error).

And also I think we should enable trace/debug logging to see the incoming 
requests and why we see these errors for multipart upload. (Not sure if it is 
some bug in the cache logic, or some handling we missed for MPU requests.)

 

To debug this we need the complete OM log, audit log, and S3 gateway log. And 
also enable trace to see what requests are incoming; I think we log them in 
OzoneManagerProtocolServerSideTranslatorPB.

 

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: 2019-11-06_18_13_57_422_ERROR, hs_err_pid9340.log, 
> image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path 
> on VM0, while reading data from VM0 local disk and write to mount path. The 
> dataset has various sizes of files from 0 byte to GB-level and it has a 
> number of ~50,000 files. 
> The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I 
> look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related with Multipart upload. This error eventually causes the  writing to 
> terminate and OM to be closed. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB

[jira] [Updated] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar

2019-11-08 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2427:
-
Fix Version/s: 0.5.0

> Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
> -
>
> Key: HDDS-2427
> URL: https://issues.apache.org/jira/browse/HDDS-2427
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This has caused an issue with DN UI loading.
> hadoop-ozone-filesystem-lib-current-xx.jar is on the classpath, which 
> accidentally loaded the Ozone datanode web application instead of the Hadoop 
> datanode application. This leads to the reported error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2399) Update mailing list information in CONTRIBUTION and README files

2019-11-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2399.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Update mailing list information in CONTRIBUTION and README files
> 
>
> Key: HDDS-2399
> URL: https://issues.apache.org/jira/browse/HDDS-2399
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Marton Elek
>Priority: Major
>  Labels: newbie, pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We have new mailing lists:
>  [ozone-...@hadoop.apache.org|mailto:ozone-...@hadoop.apache.org]
> [ozone-iss...@hadoop.apache.org|mailto:ozone-iss...@hadoop.apache.org]
> [ozone-comm...@hadoop.apache.org|mailto:ozone-comm...@hadoop.apache.org]
>  
> We need to update CONTRIBUTION.md and README.md to use ozone-dev instead of 
> hdfs-dev (optionally we can mention the issues/commits lists, but only in 
> CONTRIBUTION.md)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-07 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969847#comment-16969847
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/8/19 5:26 AM:
---

I think this error is not related to the NO_SUCH_MULTIPART_UPLOAD_ERROR.

I have fixed MISMATCH_MULTIPART_LIST in HDDS-2395.


was (Author: bharatviswa):
I think this error is not related to the NO_SUCH_MULTIPART_UPLOAD_ERROR.

I have fixed MISMATCH_ERROR in HDDS-2395.

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path 
> on VM0, while reading data from VM0 local disk and write to mount path. The 
> dataset has various sizes of files from 0 byte to GB-level and it has a 
> number of ~50,000 files. 
> The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I 
> look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related with Multipart upload. This error eventually causes the  writing to 
> terminate and OM to be closed. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB
>  .java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMetho

[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-07 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969847#comment-16969847
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

I think this error is not related to the NO_SUCH_MULTIPART_UPLOAD_ERROR.

I have fixed MISMATCH_ERROR in HDDS-2395.

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path 
> on VM0, while reading data from VM0 local disk and write to mount path. The 
> dataset has various sizes of files from 0 byte to GB-level and it has a 
> number of ~50,000 files. 
> The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I 
> look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related with Multipart upload. This error eventually causes the  writing to 
> terminate and OM to be closed. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB
>  .java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at 
> org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>  at com.sun.proxy.$Proxy82.completeMu

[jira] [Resolved] (HDDS-2395) Handle Ozone S3 completeMPU to match with aws s3 behavior.

2019-11-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2395.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Handle Ozone S3 completeMPU to match with aws s3 behavior.
> --
>
> Key: HDDS-2395
> URL: https://issues.apache.org/jira/browse/HDDS-2395
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> # When 2 parts are uploaded and the complete request specifies only 1 part, 
> there is no error
>  # During complete multipart upload, if a part name/part number does not 
> match an uploaded part and part number, then an InvalidPart error
>  # When parts are not specified in sorted order, an InvalidPartOrder error
>  # During complete multipart upload, when no parts were uploaded but we 
> specify some parts, also an InvalidPart error
>  # With parts 1, 2, 3 uploaded, complete can be done with parts 1 and 3 (no 
> error; sketched below)
>  # When only part 3 is uploaded, complete with part 3 can be done
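
A minimal sketch of item 5 against an S3-compatible endpoint (AWS SDK for Java 
v1; the endpoint, credentials, bucket, and key here are placeholders, and the 
Ozone S3 gateway port is assumed to be the default 9878):
{code:java}
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;

import java.io.ByteArrayInputStream;
import java.util.ArrayList;
import java.util.List;

public class CompleteMpuSubsetExample {
  public static void main(String[] args) {
    AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
            "http://localhost:9878", "us-east-1"))
        .withPathStyleAccessEnabled(true)
        .withCredentials(new AWSStaticCredentialsProvider(
            new BasicAWSCredentials("accessKey", "secret")))
        .build();

    String bucket = "ozone-test", key = "mpu-key";
    String uploadId = s3.initiateMultipartUpload(
        new InitiateMultipartUploadRequest(bucket, key)).getUploadId();

    List<PartETag> subset = new ArrayList<>();
    // Non-final parts must generally be at least 5 MB.
    byte[] data = new byte[5 * 1024 * 1024];
    for (int part = 1; part <= 3; part++) {
      UploadPartResult r = s3.uploadPart(new UploadPartRequest()
          .withBucketName(bucket).withKey(key).withUploadId(uploadId)
          .withPartNumber(part)
          .withInputStream(new ByteArrayInputStream(data))
          .withPartSize(data.length));
      if (part != 2) {        // complete with parts 1 and 3 only
        subset.add(r.getPartETag());
      }
    }
    // Per item 5 above this should succeed, matching AWS S3 behavior.
    s3.completeMultipartUpload(new CompleteMultipartUploadRequest(
        bucket, key, uploadId, subset));
  }
}
{code}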



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar

2019-11-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-2427:


Assignee: Bharat Viswanadham

> Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
> -
>
> Key: HDDS-2427
> URL: https://issues.apache.org/jira/browse/HDDS-2427
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This has caused an issue with DN UI loading.
> hadoop-ozone-filesystem-lib-current-xx.jar is on the classpath, which 
> accidentally loaded the Ozone datanode web application instead of the Hadoop 
> datanode application. This leads to the reported error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar

2019-11-07 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2427:
-
Status: Patch Available  (was: Open)

> Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar
> -
>
> Key: HDDS-2427
> URL: https://issues.apache.org/jira/browse/HDDS-2427
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> This has caused an issue with DN UI loading.
> hadoop-ozone-filesystem-lib-current-xx.jar is on the classpath, which 
> accidentally loaded the Ozone datanode web application instead of the Hadoop 
> datanode application. This leads to the reported error.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2427) Exclude webapps from hadoop-ozone-filesystem-lib-current uber jar

2019-11-07 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2427:


 Summary: Exclude webapps from hadoop-ozone-filesystem-lib-current 
uber jar
 Key: HDDS-2427
 URL: https://issues.apache.org/jira/browse/HDDS-2427
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


This has caused an issue with DN UI loading.

hadoop-ozone-filesystem-lib-current-xx.jar is on the classpath, which 
accidentally loaded the Ozone datanode web application instead of the Hadoop 
datanode application. This leads to the reported error.
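
A quick way to check which jar wins (a sketch; to the best of my knowledge the 
HTTP server layer resolves its web root through the "webapps/datanode" 
classpath resource):
{code:java}
public class WhichWebapp {
  public static void main(String[] args) {
    // Prints the jar/path the "webapps/datanode" resource is served from; if
    // the uber jar comes first on the classpath, its copy shadows Hadoop's.
    java.net.URL url = Thread.currentThread().getContextClassLoader()
        .getResource("webapps/datanode");
    System.out.println(url);
  }
}
{code}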



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-06 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968967#comment-16968967
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/7/19 6:38 AM:
---

Hi [~timmylicheng]

When uploading a part file, if no such multipart upload is found in 
multipartInfoTable, we throw an error.

In your tests, are you trying to abort an upload while some client is still 
trying to upload a part?

See the below snippet of the code.

 
{code:java}
if (multipartKeyInfo == null) {
  // This can occur when user started uploading part by the time commit
  // of that part happens, in between the user might have requested
  // abort multipart upload. If we just throw exception, then the data
  // will not be garbage collected, so move this part to delete table
  // and throw error
  // Move this part to delete table.
  throw new OMException("No such Multipart upload is with specified " +
      "uploadId " + uploadID,
      OMException.ResultCodes.NO_SUCH_MULTIPART_UPLOAD_ERROR);
}{code}
 

I don't see any log attached to the Jira BTW.
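
For reference, a minimal client-side sketch of that sequence (AWS SDK for Java 
v1; endpoint, credentials, and names are placeholders, and the S3 gateway port 
is assumed to be the default 9878). The part commit lands after the abort, so 
the server side should reject it:
{code:java}
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.model.*;

import java.io.ByteArrayInputStream;

public class AbortVsUploadPartRepro {
  public static void main(String[] args) {
    AmazonS3 s3 = AmazonS3ClientBuilder.standard()
        .withEndpointConfiguration(new AwsClientBuilder.EndpointConfiguration(
            "http://localhost:9878", "us-east-1"))
        .withPathStyleAccessEnabled(true)
        .withCredentials(new AWSStaticCredentialsProvider(
            new BasicAWSCredentials("accessKey", "secret")))
        .build();

    String bucket = "ozone-test", key = "aborted-key";
    String uploadId = s3.initiateMultipartUpload(
        new InitiateMultipartUploadRequest(bucket, key)).getUploadId();

    // Abort first, as another client might do...
    s3.abortMultipartUpload(
        new AbortMultipartUploadRequest(bucket, key, uploadId));

    // ...then commit a part against the aborted upload: this is where the
    // NO_SUCH_MULTIPART_UPLOAD error (in OM terms) should surface.
    byte[] data = new byte[1024];
    try {
      s3.uploadPart(new UploadPartRequest()
          .withBucketName(bucket).withKey(key).withUploadId(uploadId)
          .withPartNumber(1)
          .withInputStream(new ByteArrayInputStream(data))
          .withPartSize(data.length));
    } catch (AmazonS3Exception e) {
      System.out.println("Expected failure: " + e.getErrorCode());
    }
  }
}
{code}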


was (Author: bharatviswa):
Hi [~timmylicheng]

When uploading a part file, if no such multipart upload is found in 
multipartInfoTable, we throw an error.

In your tests, are you trying to abort any upload?

See the below snippet of the code.

 
{code:java}
if (multipartKeyInfo == null) {
  // This can occur when user started uploading part by the time commit
  // of that part happens, in between the user might have requested
  // abort multipart upload. If we just throw exception, then the data
  // will not be garbage collected, so move this part to delete table
  // and throw error
  // Move this part to delete table.
  throw new OMException("No such Multipart upload is with specified " +
      "uploadId " + uploadID,
      OMException.ResultCodes.NO_SUCH_MULTIPART_UPLOAD_ERROR);
}{code}
 

I don't see any log attached to the Jira BTW.

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path 
> on VM0, while reading data from VM0 local disk and write to mount path. The 
> dataset has various sizes of files from 0 byte to GB-level and it has a 
> number of ~50,000 files. 
> The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I 
> look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related with Multipart upload. This error eventually causes the  writing to 
> terminate and OM to be closed. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:10

[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-11-06 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968967#comment-16968967
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

Hi [~timmylicheng]

When uploading a part file, if no such multipart upload is found in 
multipartInfoTable, we throw an error.

In your tests, are you trying to abort any upload?

See the below snippet of the code.

 
{code:java}
if (multipartKeyInfo == null) {
  // This can occur when user started uploading part by the time commit
  // of that part happens, in between the user might have requested
  // abort multipart upload. If we just throw exception, then the data
  // will not be garbage collected, so move this part to delete table
  // and throw error
  // Move this part to delete table.
  throw new OMException("No such Multipart upload is with specified " +
      "uploadId " + uploadID,
      OMException.ResultCodes.NO_SUCH_MULTIPART_UPLOAD_ERROR);
}{code}
 

I don't see any log attached to the Jira BTW.

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say 
> it's VM0.
> I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path 
> on VM0, while reading data from VM0 local disk and write to mount path. The 
> dataset has various sizes of files from 0 byte to GB-level and it has a 
> number of ~50,000 files. 
> The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I 
> look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related with Multipart upload. This error eventually causes the  writing to 
> terminate and OM to be closed. 
>  
> Updated on 11/06/2019:
> See new multipart upload error NO_SUCH_MULTIPART_UPLOAD_ERROR and full logs 
> are in the attachment.
>  2019-11-05 18:12:37,766 ERROR 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest:
>  MultipartUpload Commit is failed for Key:./2
> 0191012/plc_1570863541668_9278 in Volume/Bucket 
> s325d55ad283aa400af464c76d713c07ad/ozone-test
> NO_SUCH_MULTIPART_UPLOAD_ERROR 
> org.apache.hadoop.ozone.om.exceptions.OMException: No such Multipart upload 
> is with specified uploadId fcda8608-b431-48b7-8386-
> 0a332f1a709a-103084683261641950
> at 
> org.apache.hadoop.ozone.om.request.s3.multipart.S3MultipartUploadCommitPartRequest.validateAndUpdateCache(S3MultipartUploadCommitPartRequest.java:1
> 56)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.
> java:217)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:132)
> at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
> at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:100)
> at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682)
>  
> Updated on 10/28/2019:
> See MISMATCH_MULTIPART_LIST error.
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolCl

[jira] [Resolved] (HDDS-2377) Speed up TestOzoneManagerHA#testOMRetryProxy and #testTwoOMNodesDown

2019-11-06 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2377.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Speed up TestOzoneManagerHA#testOMRetryProxy and #testTwoOMNodesDown
> 
>
> Key: HDDS-2377
> URL: https://issues.apache.org/jira/browse/HDDS-2377
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Marton's comment:
> https://github.com/apache/hadoop-ozone/pull/30#pullrequestreview-302465440
> Out of curiosity, I ran the entire TestOzoneManagerHA locally. The test 
> class finished in 10m 30s. I discovered {{testOMRetryProxy}} and 
> {{testTwoOMNodesDown}} take the most time (2m and 2m 30s respectively) 
> to finish. Most of that time is wasted on retries and waits. We could 
> reasonably reduce the amount of time spent waiting.
> As I tested, with the patch, {{testOMRetryProxy}} and {{testTwoOMNodesDown}} 
> finish in 20 sec each, saving almost 4 min runtime on those two tests alone. 
> The whole TestOzoneManagerHA test finishes in 5m 44s with the patch.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2404) Add support for Registered id as service identifier for CSR.

2019-11-05 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968096#comment-16968096
 ] 

Bharat Viswanadham commented on HDDS-2404:
--

Can we move this task under HDDS-505, as it is related to OM HA work?

> Add support for Registered id as service identifier for CSR.
> 
>
> Key: HDDS-2404
> URL: https://issues.apache.org/jira/browse/HDDS-2404
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: SCM
>Reporter: Anu Engineer
>Assignee: Abhishek Purohit
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The SCM HA needs the ability to represent a group as a single entity, so that 
> tokens for each OM that is part of an HA group can be honored by the 
> datanodes. 
> This patch adds the notion of a service group ID to the certificate 
> infrastructure. In the next JIRAs, we will use this capability when issuing 
> certificates to OM -- especially when they are in HA mode.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-1643) Send hostName also part of OMRequest

2019-11-05 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-1643.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Send hostName also part of OMRequest
> 
>
> Key: HDDS-1643
> URL: https://issues.apache.org/jira/browse/HDDS-1643
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Bharat Viswanadham
>Assignee: YiSheng Lien
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This Jira is created based on the comment from [~eyang] on the HDDS-1600 Jira.
> [~bharatviswa] can the hostname be used as part of the OM request? When 
> running in a Docker container, the virtual private network address may not be 
> routable or exposed to the outside world. Using the IP to identify the source 
> client location may not be enough. It would be nice to have the ability to 
> support hostname-based requests too.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2064) Add tests for incorrect OM HA config when node ID or RPC address is not configured

2019-11-05 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2064.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Add tests for incorrect OM HA config when node ID or RPC address is not 
> configured
> --
>
> Key: HDDS-2064
> URL: https://issues.apache.org/jira/browse/HDDS-2064
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Siyao Meng
>Assignee: Siyao Meng
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> -OM will NPE and crash when `ozone.om.service.ids=id1,id2` is configured but 
> `ozone.om.nodes.id1` doesn't exist; or `ozone.om.address.id1.omX` doesn't 
> exist.-
> -Root cause:-
> -`OzoneManager#loadOMHAConfigs()` didn't check the case where `found == 0`. 
> This happens when local OM doesn't match any `ozone.om.address.idX.omX` in 
> the config.-
> Due to the refactoring done in HDDS-2162, this fix has been included in that 
> commit. I will repurpose the Jira to add some tests for the HA config; a 
> sketch of a consistent configuration follows below.
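
For context, a minimal sketch of a consistent set of the keys named above 
(the service/node IDs, hostnames, and the 9862 RPC port are placeholders):
{code:java}
import org.apache.hadoop.hdds.conf.OzoneConfiguration;

public class OmHaConfigSketch {
  public static void main(String[] args) {
    OzoneConfiguration conf = new OzoneConfiguration();
    // Every service ID listed here needs matching node and address keys;
    // the missing-key cases are what the new tests should cover.
    conf.set("ozone.om.service.ids", "id1");
    conf.set("ozone.om.nodes.id1", "om1,om2,om3");
    conf.set("ozone.om.address.id1.om1", "om1.example.com:9862");
    conf.set("ozone.om.address.id1.om2", "om2.example.com:9862");
    conf.set("ozone.om.address.id1.om3", "om3.example.com:9862");
  }
}
{code}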



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2359) Seeking randomly in a key with more than 2 blocks of data leads to inconsistent reads

2019-11-05 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2359.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Seeking randomly in a key with more than 2 blocks of data leads to 
> inconsistent reads
> -
>
> Key: HDDS-2359
> URL: https://issues.apache.org/jira/browse/HDDS-2359
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Istvan Fajth
>Assignee: Shashikant Banerjee
>Priority: Critical
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> During Hive testing we found the following exception:
> {code}
> TaskAttempt 3 failed, info=[Error: Error while running task ( failure ) : 
> attempt_1569246922012_0214_1_03_00_3:java.lang.RuntimeException: 
> org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: 
> java.io.IOException: error iterating
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:296)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:250)
> at 
> org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:374)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:73)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable$1.run(TaskRunner2Callable.java:61)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:61)
> at 
> org.apache.tez.runtime.task.TaskRunner2Callable.callInternal(TaskRunner2Callable.java:37)
> at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
> at 
> com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:108)
> at 
> com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:41)
> at 
> com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:77)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
> java.io.IOException: java.io.IOException: error iterating
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:80)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:426)
> at 
> org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:267)
> ... 16 more
> Caused by: java.io.IOException: java.io.IOException: error iterating
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
> at 
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:366)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79)
> at 
> org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
> at 
> org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:151)
> at 
> org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:116)
> at 
> org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68)
> ... 18 more
> Caused by: java.io.IOException: error iterating
> at 
> org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.next(VectorizedOrcAcidRowBatchReader.java:835)
> at 
> org.apache.hadoop.hive.ql.io.orc.VectorizedOrcAcidRowBatchReader.next(VectorizedOrcAcidRowBatchReader.java:74)
> at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:361)
> ... 24 more
> Caused by: java.io.IOException: Error reading file: 
> o3fs://hive.warehouse.vc0136.halxg.cloudera.com:9862/data/inventory/delta_001_001_/bucket_0
> at 
> org.apache.orc.impl.RecordReaderImpl.nextBatch(Rec

[jira] [Resolved] (HDDS-2255) Improve Acl Handler Messages

2019-11-04 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2255.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Improve Acl Handler Messages
> 
>
> Key: HDDS-2255
> URL: https://issues.apache.org/jira/browse/HDDS-2255
> Project: Hadoop Distributed Data Store
>  Issue Type: Improvement
>  Components: om
>Reporter: Hanisha Koneru
>Assignee: YiSheng Lien
>Priority: Minor
>  Labels: newbie, pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> In the Add/Remove/Set ACL Key/Bucket/Volume handlers, we print a message about 
> whether the operation was successful. If we try to add an ACL that already 
> exists, we report that the operation failed. It would be better if the message 
> conveyed more clearly why the operation failed, i.e. that the ACL already 
> exists (see the sketch below). 
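
A minimal sketch of the clearer message, assuming ObjectStore#addAcl returns 
false when an identical ACL is already present (the exact client API and 
return-value semantics are assumptions here, not confirmed by this issue):

{code:java}
import java.io.IOException;

import org.apache.hadoop.ozone.OzoneAcl;
import org.apache.hadoop.ozone.client.ObjectStore;
import org.apache.hadoop.ozone.security.acl.OzoneObj;

public final class AclMessageSketch {
  // Prints a cause-specific message instead of a generic "operation failed".
  static void addAclWithClearMessage(ObjectStore store, OzoneObj obj, OzoneAcl acl)
      throws IOException {
    if (store.addAcl(obj, acl)) {
      System.out.println("ACL " + acl + " added successfully.");
    } else {
      // State the likely cause rather than a bare failure.
      System.out.println("ACL " + acl + " not added: it already exists on "
          + obj.getPath() + ".");
    }
  }
}
{code}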



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2398) Remove usage of LogUtils class from ratis-common

2019-11-02 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham resolved HDDS-2398.
--
Fix Version/s: 0.5.0
   Resolution: Fixed

> Remove usage of LogUtils class from ratis-common
> 
>
> Key: HDDS-2398
> URL: https://issues.apache.org/jira/browse/HDDS-2398
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> MiniOzoneChaosCluster.java uses LogUtils from ratis-common to set the log 
> level, but that method was removed from LogUtils as part of RATIS-508.
> We can avoid depending on Ratis for this and use GenericTestUtils from the 
> hadoop-common test jar instead.
> LogUtils.setLogLevel(GrpcClientProtocolClient.LOG, Level.WARN);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2398) Remove usage of LogUtils class from ratis-common

2019-11-01 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2398:


 Summary: Remove usage of LogUtils class from ratis-common
 Key: HDDS-2398
 URL: https://issues.apache.org/jira/browse/HDDS-2398
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


MiniOzoneChaosCluster.java uses LogUtils from ratis-common to set the log 
level, but that method was removed from LogUtils as part of RATIS-508.

We can avoid depending on Ratis for this and use GenericTestUtils from the 
hadoop-common test jar instead (see the sketch below).

LogUtils.setLogLevel(GrpcClientProtocolClient.LOG, Level.WARN);
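
A minimal sketch of the swap, assuming GenericTestUtils in hadoop-common's test 
jar has a setLogLevel overload that accepts the slf4j logger exposed by 
GrpcClientProtocolClient.LOG (the available overloads vary across Hadoop 
versions):

{code:java}
import org.apache.hadoop.test.GenericTestUtils;
import org.apache.ratis.grpc.client.GrpcClientProtocolClient;
import org.slf4j.event.Level;

public final class LogLevelSketch {
  public static void main(String[] args) {
    // Before (ratis-common LogUtils, removed by RATIS-508):
    //   LogUtils.setLogLevel(GrpcClientProtocolClient.LOG, Level.WARN);

    // After: same effect via the hadoop-common test utility.
    GenericTestUtils.setLogLevel(GrpcClientProtocolClient.LOG, Level.WARN);
  }
}
{code}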



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2398) Remove usage of LogUtils class from ratis-common

2019-11-01 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-2398:


Assignee: Bharat Viswanadham

> Remove usage of LogUtils class from ratis-common
> 
>
> Key: HDDS-2398
> URL: https://issues.apache.org/jira/browse/HDDS-2398
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>
> MiniOzoneChaosCluster.java uses LogUtils from ratis-common to set the log 
> level, but that method was removed from LogUtils as part of RATIS-508.
> We can avoid depending on Ratis for this and use GenericTestUtils from the 
> hadoop-common test jar instead.
> LogUtils.setLogLevel(GrpcClientProtocolClient.LOG, Level.WARN);



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2396) OM rocksdb core dump during writing

2019-11-01 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964947#comment-16964947
 ] 

Bharat Viswanadham edited comment on HDDS-2396 at 11/1/19 4:30 PM:
---

Hi [~aengineer]

The issue that was resolved with try-with-resources is HDDS-2379.

Below is the stack trace of the error.

[~timmylicheng] Does your cluster setup for this testing include the fix for 
HDDS-2379? (Not sure whether it will resolve this, just want to check if it is 
a new issue.)

{code:java}
2019-10-29 11:15:15,131 ERROR 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer: Terminating with 
exit status 1: During flush to DB encountered error in OMDoubleBuffer flush 
thread OMDoubleBufferFlushThread
 java.io.IOException: Unable to write the batch.
 at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
 at 
org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240)
 at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: org.rocksdb.RocksDBException: unknown WriteBatch tag
 at org.rocksdb.RocksDB.write0(Native Method)
 at org.rocksdb.RocksDB.write(RocksDB.java:1421)
 at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
 ... 3 more{code}


was (Author: bharatviswa):
Hi [~aengineer]

The issue that was resolved with try-with-resources is HDDS-2379.

Below is the stack trace of the error.

[~timmylicheng] Does your cluster setup for this testing include the fix for 
HDDS-2379? (Not sure whether it will resolve this, just want to check.)

 
{code:java}
2019-10-29 11:15:15,131 ERROR 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer: Terminating with 
exit status 1: During flush to DB encountered error in OMDoubleBuffer flush 
thread OMDoubleBufferFlushThread
 java.io.IOException: Unable to write the batch.
 at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
 at 
org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240)
 at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: org.rocksdb.RocksDBException: unknown WriteBatch tag
 at org.rocksdb.RocksDB.write0(Native Method)
 at org.rocksdb.RocksDB.write(RocksDB.java:1421)
 at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
 ... 3 more{code}

> OM rocksdb core dump during writing
> ---
>
> Key: HDDS-2396
> URL: https://issues.apache.org/jira/browse/HDDS-2396
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
>Reporter: Li Cheng
>Priority: Major
> Attachments: hs_err_pid9340.log
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM; call 
> it VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, reading data from the VM0 local disk and writing to the 
> mount path. The dataset has ~50,000 files of various sizes, from 0 bytes up 
> to the GB level. 
>  
> An occasional core dump happens in RocksDB. 
>  
> Stack: [0x7f5891a23000,0x7f5891b24000], sp=0x7f5891b21bb8, free 
> space=1018k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> C [libc.so.6+0x151d60] __memmove_ssse3_back+0x1ae0
> C [librocksdbjni3192271038586903156.so+0x358fec] 
> rocksdb::MemTableInserter::PutCFImpl(unsigned int, rocksdb::Slice const&, 
> rocksdb::Slice const&, rocksdb::ValueType)+0x51c
> C [librocksdbjni3192271038586903156.so+0x359d17] 
> rocksdb::MemTableInserter::PutCF(unsigned int, rocksdb::Slice const&, 
> rocksdb::Slice const&)+0x17
> C [librocksdbjni3192271038586903156.so+0x3513bc] 
> rocksdb::WriteBatch::Iterate(rocksdb::WriteBatch::Handler*) const+0x45c
> C [librocksdbjni3192271038586903156.so+0x354df9] 
> rocksdb::WriteBatchInternal::InsertInto(rocksdb::WriteThread::WriteGroup&, 
> unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, 
> bool, unsigned long, rocksdb::DB*, bool, bool, bool)+0x1f9
> C [librocksdbjni3192271038586903156.so+0x29fd79] 
> rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, 
> rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, 
> bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x24b9
> C [librocksdbjni3192271038586903156.so+0x2a0431] 
> rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, 
> rocksdb::WriteBatch*)+0x21
> C [librocksdbjni3192271038586903156.so+0x1a064c] 
> Java_org_rocksdb_RocksDB_write0+0xcc
> J 7899 org.rocksdb.RocksDB.write0(JJJ)V (0 bytes) @ 0x00

[jira] [Comment Edited] (HDDS-2396) OM rocksdb core dump during writing

2019-11-01 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964947#comment-16964947
 ] 

Bharat Viswanadham edited comment on HDDS-2396 at 11/1/19 4:30 PM:
---

Hi [~aengineer]

The issue that was resolved with try-with-resources is HDDS-2379.

Below is the stack trace of the error.

[~timmylicheng] Does your cluster setup for this testing include the fix for 
HDDS-2379? (Not sure whether it will resolve this, just want to check.)

 
{code:java}
2019-10-29 11:15:15,131 ERROR 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer: Terminating with 
exit status 1: During flush to DB encountered error in OMDoubleBuffer flush 
thread OMDoubleBufferFlushThread
 java.io.IOException: Unable to write the batch.
 at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
 at 
org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240)
 at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
 at java.lang.Thread.run(Thread.java:745)
 Caused by: org.rocksdb.RocksDBException: unknown WriteBatch tag
 at org.rocksdb.RocksDB.write0(Native Method)
 at org.rocksdb.RocksDB.write(RocksDB.java:1421)
 at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
 ... 3 more{code}


was (Author: bharatviswa):
Hi [~aengineer]

The issue that was resolved with try-with-resources is HDDS-2379.

Below is the stack trace of the error.

[~timmylicheng] Does your cluster setup for this testing include the fix for 
HDDS-2379? (Not sure whether it will resolve this, just want to check.)
2019-10-29 11:15:15,131 ERROR 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer: Terminating with 
exit status 1: During flush to DB encountered error in OMDoubleBuffer flush 
thread OMDoubleBufferFlushThread
java.io.IOException: Unable to write the batch.
at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
at 
org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.rocksdb.RocksDBException: unknown WriteBatch tag
at org.rocksdb.RocksDB.write0(Native Method)
at org.rocksdb.RocksDB.write(RocksDB.java:1421)
at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
... 3 more

> OM rocksdb core dump during writing
> ---
>
> Key: HDDS-2396
> URL: https://issues.apache.org/jira/browse/HDDS-2396
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
>Reporter: Li Cheng
>Priority: Major
> Attachments: hs_err_pid9340.log
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM; call 
> it VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, reading data from the VM0 local disk and writing to the 
> mount path. The dataset has ~50,000 files of various sizes, from 0 bytes up 
> to the GB level. 
>  
> An occasional core dump happens in RocksDB. 
>  
> Stack: [0x7f5891a23000,0x7f5891b24000], sp=0x7f5891b21bb8, free 
> space=1018k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> C [libc.so.6+0x151d60] __memmove_ssse3_back+0x1ae0
> C [librocksdbjni3192271038586903156.so+0x358fec] 
> rocksdb::MemTableInserter::PutCFImpl(unsigned int, rocksdb::Slice const&, 
> rocksdb::Slice const&, rocksdb::ValueType)+0x51c
> C [librocksdbjni3192271038586903156.so+0x359d17] 
> rocksdb::MemTableInserter::PutCF(unsigned int, rocksdb::Slice const&, 
> rocksdb::Slice const&)+0x17
> C [librocksdbjni3192271038586903156.so+0x3513bc] 
> rocksdb::WriteBatch::Iterate(rocksdb::WriteBatch::Handler*) const+0x45c
> C [librocksdbjni3192271038586903156.so+0x354df9] 
> rocksdb::WriteBatchInternal::InsertInto(rocksdb::WriteThread::WriteGroup&, 
> unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, 
> bool, unsigned long, rocksdb::DB*, bool, bool, bool)+0x1f9
> C [librocksdbjni3192271038586903156.so+0x29fd79] 
> rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, 
> rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, 
> bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x24b9
> C [librocksdbjni3192271038586903156.so+0x2a0431] 
> rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, 
> rocksdb::WriteBatch*)+0x21
> C [librocksdbjni3192271038586903156.so+0x1a064c] 
> Java_org_rocksdb_RocksDB_write0+0xcc
> J 7899 org.rocksdb.RocksDB.write0(JJJ)V (0 byt

[jira] [Commented] (HDDS-2396) OM rocksdb core dump during writing

2019-11-01 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964947#comment-16964947
 ] 

Bharat Viswanadham commented on HDDS-2396:
--

Hi [~aengineer]

The issue that was resolved with try-with-resources is HDDS-2379.

Below is the stack trace of the error.

[~timmylicheng] Does your cluster setup for this testing include the fix for 
HDDS-2379? (Not sure whether it will resolve this, just want to check.)

2019-10-29 11:15:15,131 ERROR 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer: Terminating with 
exit status 1: During flush to DB encountered error in OMDoubleBuffer flush 
thread OMDoubleBufferFlushThread
java.io.IOException: Unable to write the batch.
at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
at 
org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.rocksdb.RocksDBException: unknown WriteBatch tag
at org.rocksdb.RocksDB.write0(Native Method)
at org.rocksdb.RocksDB.write(RocksDB.java:1421)
at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
... 3 more
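
For reference, the try-with-resources fix mentioned above (HDDS-2379) ensures 
the batch behind the native RocksDB WriteBatch is always closed after a commit 
attempt. A minimal sketch of that pattern, treating the DBStore and 
BatchOperation names from the stack trace as assumptions rather than the exact 
fix:

{code:java}
import java.io.IOException;

import org.apache.hadoop.hdds.utils.db.BatchOperation;
import org.apache.hadoop.hdds.utils.db.DBStore;

public final class BatchCommitSketch {
  // Commits a staged batch and guarantees the underlying WriteBatch is
  // released, even when commit fails (e.g. with "unknown WriteBatch tag").
  static void flushBatch(DBStore store) throws IOException {
    try (BatchOperation batch = store.initBatchOperation()) {
      // ... stage puts/deletes on the batch here ...
      store.commitBatchOperation(batch);
    } // batch is closed here on both success and failure
  }
}
{code}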

> OM rocksdb core dump during writing
> ---
>
> Key: HDDS-2396
> URL: https://issues.apache.org/jira/browse/HDDS-2396
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
>Reporter: Li Cheng
>Priority: Major
> Attachments: hs_err_pid9340.log
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM; call 
> it VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, reading data from the VM0 local disk and writing to the 
> mount path. The dataset has ~50,000 files of various sizes, from 0 bytes up 
> to the GB level. 
>  
> An occasional core dump happens in RocksDB. 
>  
> Stack: [0x7f5891a23000,0x7f5891b24000], sp=0x7f5891b21bb8, free 
> space=1018k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native 
> code)
> C [libc.so.6+0x151d60] __memmove_ssse3_back+0x1ae0
> C [librocksdbjni3192271038586903156.so+0x358fec] 
> rocksdb::MemTableInserter::PutCFImpl(unsigned int, rocksdb::Slice const&, 
> rocksdb::Slice const&, rocksdb::ValueType)+0x51c
> C [librocksdbjni3192271038586903156.so+0x359d17] 
> rocksdb::MemTableInserter::PutCF(unsigned int, rocksdb::Slice const&, 
> rocksdb::Slice const&)+0x17
> C [librocksdbjni3192271038586903156.so+0x3513bc] 
> rocksdb::WriteBatch::Iterate(rocksdb::WriteBatch::Handler*) const+0x45c
> C [librocksdbjni3192271038586903156.so+0x354df9] 
> rocksdb::WriteBatchInternal::InsertInto(rocksdb::WriteThread::WriteGroup&, 
> unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, 
> bool, unsigned long, rocksdb::DB*, bool, bool, bool)+0x1f9
> C [librocksdbjni3192271038586903156.so+0x29fd79] 
> rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, 
> rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, 
> bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x24b9
> C [librocksdbjni3192271038586903156.so+0x2a0431] 
> rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, 
> rocksdb::WriteBatch*)+0x21
> C [librocksdbjni3192271038586903156.so+0x1a064c] 
> Java_org_rocksdb_RocksDB_write0+0xcc
> J 7899 org.rocksdb.RocksDB.write0(JJJ)V (0 bytes) @ 0x7f58f1872dbe 
> [0x7f58f1872d00+0xbe]
> J 10093% C1 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions()V
>  (400 bytes) @ 0x7f58f2308b0c [0x7f58f2307a40+0x10cc]
> j 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer$$Lambda$29.run()V+4
> j java.lang.Thread.run()V+11



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2397) Fix calling cleanup for few missing tables in OM

2019-10-31 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham reassigned HDDS-2397:


Assignee: Bharat Viswanadham

> Fix calling cleanup for few missing tables in OM
> 
>
> Key: HDDS-2397
> URL: https://issues.apache.org/jira/browse/HDDS-2397
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> After the DoubleBuffer flushes, we call cleanup on the table caches.
> For a few tables, the cache cleanup is missed:
>  # PrefixTable
>  # S3SecretTable
>  # DelegationTable



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2397) Fix calling cleanup for few missing tables in OM

2019-10-31 Thread Bharat Viswanadham (Jira)
Bharat Viswanadham created HDDS-2397:


 Summary: Fix calling cleanup for few missing tables in OM
 Key: HDDS-2397
 URL: https://issues.apache.org/jira/browse/HDDS-2397
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham


After the DoubleBuffer flushes, we call cleanup on the table caches.

For a few tables, the cache cleanup is missed (see the sketch below):
 # PrefixTable
 # S3SecretTable
 # DelegationTable
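
A minimal sketch of the intended fix; the cleanupCache signature and the 
CachedTable interface below are illustrative assumptions, not the exact OM API:

{code:java}
import java.util.List;

// Hypothetical stand-in for the OM table-cache contract.
interface CachedTable {
  void cleanupCache(List<Long> flushedEpochs);
}

final class DoubleBufferCleanupSketch {
  // After a DoubleBuffer flush, every cached table must be cleaned up,
  // including the three tables that were previously missed.
  static void cleanupAll(List<Long> flushedEpochs, CachedTable prefixTable,
      CachedTable s3SecretTable, CachedTable delegationTable) {
    prefixTable.cleanupCache(flushedEpochs);      // previously missed
    s3SecretTable.cleanupCache(flushedEpochs);    // previously missed
    delegationTable.cleanupCache(flushedEpochs);  // previously missed
  }
}
{code}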



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Issue Comment Deleted] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-10-31 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2356:
-
Comment: was deleted

(was: Has the above error caused a crash in OM? If so, can you share the stack trace?)

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM; call 
> it VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, reading data from the VM0 local disk and writing to the 
> mount path. The dataset has ~50,000 files of various sizes, from 0 bytes up 
> to the GB level. 
> Writing is slow (1 GB in ~10 minutes) and it stops after around 4 GB. Looking 
> at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related to multipart upload. This error eventually causes the writing to 
> terminate and OM to shut down. 
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB
>  .java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at 
> org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>  at com.sun.proxy.$Proxy82.completeMultipartUpload(Unknown Source)
>  at 
> org.apache.hadoop.ozone.client.rpc.RpcClient.completeMultipartUpload(RpcClient.java:883)
>  at 
> org.apache.hadoop.ozone.client.OzoneBucket.completeMultipartUpload(OzoneBucket.java:445)
>  at 
> org.apache.hadoop.ozone.s3.endpoint.ObjectEndpoint.completeMultipartUpload(ObjectEndpoint.java:498)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at 
> org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191)
>  at 
> org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:200)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103)
>  at 
> org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493)
>  
> The following errors have been resolved in 
> https://issues.apache.org/jira/browse/HDDS-2322. 
> 2019-10-24 16:01:59,527 [OMDoubleBufferFlushThread] ERROR - Terminating with 
> exit status 2: OMDoubleBuffer flush 
> thread OMDoubleBufferFlushThread encountered Throwable error
>  java.util.ConcurrentModificationException
>  at java.util.TreeMap.forEach(TreeMap.java:1004)
>  at 
> org.apache.hadoop.ozone.om.helpers.OmMultipartKeyInfo.getProto(OmMultipartKeyInfo.java:111)
>  at 
> org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:38)
>  at 
> org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:31)
>  at 

[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-10-31 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964586#comment-16964586
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

Has the above error caused a crash in OM? If so, can you share the stack trace?

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: hs_err_pid9340.log, image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM; call 
> it VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, reading data from the VM0 local disk and writing to the 
> mount path. The dataset has ~50,000 files of various sizes, from 0 bytes up 
> to the GB level. 
> Writing is slow (1 GB in ~10 minutes) and it stops after around 4 GB. Looking 
> at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related to multipart upload. This error eventually causes the writing to 
> terminate and OM to shut down. 
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB
>  .java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at 
> org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>  at com.sun.proxy.$Proxy82.completeMultipartUpload(Unknown Source)
>  at 
> org.apache.hadoop.ozone.client.rpc.RpcClient.completeMultipartUpload(RpcClient.java:883)
>  at 
> org.apache.hadoop.ozone.client.OzoneBucket.completeMultipartUpload(OzoneBucket.java:445)
>  at 
> org.apache.hadoop.ozone.s3.endpoint.ObjectEndpoint.completeMultipartUpload(ObjectEndpoint.java:498)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at 
> org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191)
>  at 
> org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:200)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103)
>  at 
> org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493)
>  
> The following errors have been resolved in 
> https://issues.apache.org/jira/browse/HDDS-2322. 
> 2019-10-24 16:01:59,527 [OMDoubleBufferFlushThread] ERROR - Terminating with 
> exit status 2: OMDoubleBuffer flush 
> thread OMDoubleBufferFlushThread encountered Throwable error
>  java.util.ConcurrentModificationException
>  at java.util.TreeMap.forEach(TreeMap.java:1004)
>  at 
> org.apache.hadoop.ozone.om.helpers.OmMultipartKeyInfo.getProto(OmMultipartKeyInfo.java:111)
>  at 
> org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:38)
>  at 
> org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKey

[jira] [Commented] (HDDS-2395) Handle Ozone S3 completeMPU to match with aws s3 behavior.

2019-10-31 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964585#comment-16964585
 ] 

Bharat Viswanadham commented on HDDS-2395:
--

Hi [~timmylicheng]

The Exclude List issue was fixed as part of HDDS-2381. Thanks.

> Handle Ozone S3 completeMPU to match with aws s3 behavior.
> --
>
> Key: HDDS-2395
> URL: https://issues.apache.org/jira/browse/HDDS-2395
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> # When 2 parts are uploaded and the complete request lists only 1 part, there 
> is no error.
>  # During complete multipart upload, if a part name/part number does not match 
> an uploaded part and its part number, the result is an InvalidPart error.
>  # When parts are not specified in sorted order, the result is InvalidPartOrder.
>  # During complete multipart upload, when no parts were uploaded but some parts 
> are specified, the result is also InvalidPart.
>  # When parts 1, 2, 3 are uploaded, the complete request can list just 1 and 3 
> (no error).
>  # When only part 3 is uploaded, complete with part 3 can be done (see the 
> validation sketch below).
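
A minimal sketch of the S3-style validation the rules above describe; the error 
names follow AWS S3, while the data shapes and class name are assumptions for 
illustration, not the Ozone implementation:

{code:java}
import java.util.List;
import java.util.Map;

final class CompleteMpuValidationSketch {
  // Validates the client-supplied part list against the uploaded parts.
  // requested: (partNumber, partName) entries from the complete-MPU request.
  // uploaded:  partNumber -> partName recorded at upload-part time.
  static String validate(List<Map.Entry<Integer, String>> requested,
      Map<Integer, String> uploaded) {
    int prev = Integer.MIN_VALUE;
    for (Map.Entry<Integer, String> part : requested) {
      if (part.getKey() <= prev) {
        return "InvalidPartOrder";  // parts must be listed in ascending order
      }
      prev = part.getKey();
      String name = uploaded.get(part.getKey());
      if (name == null || !name.equals(part.getValue())) {
        return "InvalidPart";       // unknown part number or name mismatch
      }
    }
    return null;  // OK: a subset of uploaded parts (e.g. 1,3 of 1,2,3) passes
  }
}
{code}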



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2395) Handle Ozone S3 completeMPU to match with aws s3 behavior.

2019-10-31 Thread Bharat Viswanadham (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bharat Viswanadham updated HDDS-2395:
-
Issue Type: Bug  (was: Task)

> Handle Ozone S3 completeMPU to match with aws s3 behavior.
> --
>
> Key: HDDS-2395
> URL: https://issues.apache.org/jira/browse/HDDS-2395
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> # When 2 parts are uploaded and the complete request lists only 1 part, there 
> is no error.
>  # During complete multipart upload, if a part name/part number does not match 
> an uploaded part and its part number, the result is an InvalidPart error.
>  # When parts are not specified in sorted order, the result is InvalidPartOrder.
>  # During complete multipart upload, when no parts were uploaded but some parts 
> are specified, the result is also InvalidPart.
>  # When parts 1, 2, 3 are uploaded, the complete request can list just 1 and 3 
> (no error).
>  # When only part 3 is uploaded, complete with part 3 can be done.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-10-31 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964482#comment-16964482
 ] 

Bharat Viswanadham commented on HDDS-2356:
--

Opened HDDS-2359 to handle CompleteMPU error cases.

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM; call 
> it VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, reading data from the VM0 local disk and writing to the 
> mount path. The dataset has ~50,000 files of various sizes, from 0 bytes up 
> to the GB level. 
> Writing is slow (1 GB in ~10 minutes) and it stops after around 4 GB. Looking 
> at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related to multipart upload. This error eventually causes the writing to 
> terminate and OM to shut down. 
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB
>  .java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at 
> org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>  at com.sun.proxy.$Proxy82.completeMultipartUpload(Unknown Source)
>  at 
> org.apache.hadoop.ozone.client.rpc.RpcClient.completeMultipartUpload(RpcClient.java:883)
>  at 
> org.apache.hadoop.ozone.client.OzoneBucket.completeMultipartUpload(OzoneBucket.java:445)
>  at 
> org.apache.hadoop.ozone.s3.endpoint.ObjectEndpoint.completeMultipartUpload(ObjectEndpoint.java:498)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at 
> org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191)
>  at 
> org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:200)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103)
>  at 
> org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493)
>  
> The following errors have been resolved in 
> https://issues.apache.org/jira/browse/HDDS-2322. 
> 2019-10-24 16:01:59,527 [OMDoubleBufferFlushThread] ERROR - Terminating with 
> exit status 2: OMDoubleBuffer flush 
> thread OMDoubleBufferFlushThread encountered Throwable error
>  java.util.ConcurrentModificationException
>  at java.util.TreeMap.forEach(TreeMap.java:1004)
>  at 
> org.apache.hadoop.ozone.om.helpers.OmMultipartKeyInfo.getProto(OmMultipartKeyInfo.java:111)
>  at 
> org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:38)
>  at 
> org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:31)
>  at 
> org.apache.had

[jira] [Comment Edited] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline

2019-10-31 Thread Bharat Viswanadham (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964482#comment-16964482
 ] 

Bharat Viswanadham edited comment on HDDS-2356 at 11/1/19 12:41 AM:


Opened HDDS-2395 to handle CompleteMPU error cases.


was (Author: bharatviswa):
Opened HDDS-2359 to handle CompleteMPU error cases.

> Multipart upload report errors while writing to ozone Ratis pipeline
> 
>
> Key: HDDS-2356
> URL: https://issues.apache.org/jira/browse/HDDS-2356
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.4.1
> Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM 
> on a separate VM
>Reporter: Li Cheng
>Assignee: Bharat Viswanadham
>Priority: Blocker
> Fix For: 0.5.0
>
> Attachments: image-2019-10-31-18-56-56-177.png
>
>
> Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM; call 
> it VM0.
> I use goofys as a FUSE client and enable the Ozone S3 gateway to mount Ozone 
> to a path on VM0, reading data from the VM0 local disk and writing to the 
> mount path. The dataset has ~50,000 files of various sizes, from 0 bytes up 
> to the GB level. 
> Writing is slow (1 GB in ~10 minutes) and it stops after around 4 GB. Looking 
> at the hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors 
> related to multipart upload. This error eventually causes the writing to 
> terminate and OM to shut down. 
>  
> 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete 
> Multipart Upload Request for bucket: ozone-test, key: 
> 20191012/plc_1570863541668_927
>  8
>  MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: 
> Complete Multipart Upload Failed: volume: 
> s3c89e813c80ffcea9543004d57b2a1239bucket:
>  ozone-testkey: 20191012/plc_1570863541668_9278
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732)
>  at 
> org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB
>  .java:1104)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at 
> org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66)
>  at com.sun.proxy.$Proxy82.completeMultipartUpload(Unknown Source)
>  at 
> org.apache.hadoop.ozone.client.rpc.RpcClient.completeMultipartUpload(RpcClient.java:883)
>  at 
> org.apache.hadoop.ozone.client.OzoneBucket.completeMultipartUpload(OzoneBucket.java:445)
>  at 
> org.apache.hadoop.ozone.s3.endpoint.ObjectEndpoint.completeMultipartUpload(ObjectEndpoint.java:498)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at 
> org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191)
>  at 
> org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:200)
>  at 
> org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103)
>  at 
> org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493)
>  
> The following errors have been resolved in 
> https://issues.apache.org/jira/browse/HDDS-2322. 
> 2019-10-24 16:01:59,527 [OMDoubleBufferFlushThread] ERROR - Terminating with 
> exit status 2: OMDoubleBuffer flush 
> thread OMDoubleBufferFlushThread encountered Throwable error
>  java.util.ConcurrentModificationException
>  at java.util.TreeMap.forEach(TreeMap.java:1004)
>  at 
> org.apache.hadoop.ozone.om.helpers.OmMultipartKeyInfo.getProto(OmMultipartKeyInfo.java:111)
>  at 
> org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:38)
>  at 
>
