[jira] [Created] (HDFS-16343) Add some debug logs when the dfsUsed value is not used during Datanode startup
Mukul Kumar Singh created HDFS-16343: Summary: Add some debug logs when the dfsUsed value is not used during Datanode startup Key: HDFS-16343 URL: https://issues.apache.org/jira/browse/HDFS-16343 Project: Hadoop HDFS Issue Type: Bug Components: datanode Reporter: Mukul Kumar Singh Assignee: Mukul Kumar Singh -- This message was sent by Atlassian Jira (v8.20.1#820001) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDFS-16145) CopyListing fails with FNF exception with snapshot diff
[ https://issues.apache.org/jira/browse/HDFS-16145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDFS-16145. -- Fix Version/s: 3.3.2 Resolution: Fixed > CopyListing fails with FNF exception with snapshot diff > --- > > Key: HDFS-16145 > URL: https://issues.apache.org/jira/browse/HDFS-16145 > Project: Hadoop HDFS > Issue Type: Bug > Components: distcp >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 3.3.2 > > Time Spent: 3h 20m > Remaining Estimate: 0h > > Distcp with snapshot diff and with filters marks a Rename as a delete > operation on the target if the rename target is a directory which is > excluded by the filter. But files/subdirs created or modified after the old > snapshot and prior to the Rename will still be present as modified/created > entries in the final copy list. Since the parent directory is marked for > deletion, these subsequent create/modify entries should be ignored while > building the final copy list. > In such cases, when the final copy list is built, distcp tries to do a > lookup for each created/modified file in the newer snapshot, which fails > because the parent dir has already been moved to a new location in the later snapshot.
> > {code:java} > sudo -u kms hadoop key create testkey > hadoop fs -mkdir -p /data/gcgdlknnasg/ > hdfs crypto -createZone -keyName testkey -path /data/gcgdlknnasg/ > hadoop fs -mkdir -p /dest/gcgdlknnasg > hdfs crypto -createZone -keyName testkey -path /dest/gcgdlknnasg > hdfs dfs -mkdir /data/gcgdlknnasg/dir1 > hdfs dfsadmin -allowSnapshot /data/gcgdlknnasg/ > hdfs dfsadmin -allowSnapshot /dest/gcgdlknnasg/ > [root@nightly62x-1 logs]# hdfs dfs -ls -R /data/gcgdlknnasg/ > drwxrwxrwt - hdfs supergroup 0 2021-07-16 14:05 > /data/gcgdlknnasg/.Trash > drwxr-xr-x - hdfs supergroup 0 2021-07-16 13:07 > /data/gcgdlknnasg/dir1 > [root@nightly62x-1 logs]# hdfs dfs -ls -R /dest/gcgdlknnasg/ > [root@nightly62x-1 logs]# > hdfs dfs -put /etc/hosts /data/gcgdlknnasg/dir1/ > hdfs dfs -rm -r /data/gcgdlknnasg/dir1/ > hdfs dfs -mkdir /data/gcgdlknnasg/dir1/ > ===> Run BDR with “Abort on Snapshot Diff Failures” CHECKED now in the > replication schedule. You get into below error and failure of the BDR job. > 21/07/16 15:02:30 INFO distcp.DistCp: Failed to use snapshot diff - > java.io.FileNotFoundException: File does not exist: > /data/gcgdlknnasg/.snapshot/distcp-5-46485360-new/dir1/hosts > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1494) > at > org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1487) > …….. > {code}
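The HDFS-16145 fix described above boils down to pruning create/modify diff entries whose ancestor directory is already marked for deletion before the final copy list is built, so no lookup is attempted for paths that no longer exist in the newer snapshot. A minimal sketch of that pruning idea (class and method names here are illustrative, not DistCp's actual CopyListing code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical simplification: drop diff entries that live under a
// directory already marked for deletion, so the later snapshot lookup
// never sees a path whose parent has moved away (the FNF case above).
public class CopyListFilter {
    static List<String> buildCopyList(List<String> diffEntries,
                                      List<String> deletedDirs) {
        List<String> copyList = new ArrayList<>();
        for (String path : diffEntries) {
            boolean underDeleted = deletedDirs.stream()
                .anyMatch(d -> path.equals(d) || path.startsWith(d + "/"));
            if (!underDeleted) {   // keep only entries whose parent survives
                copyList.add(path);
            }
        }
        return copyList;
    }

    public static void main(String[] args) {
        List<String> diff = Arrays.asList("/data/dir1/hosts", "/data/keep/f");
        List<String> deleted = Arrays.asList("/data/dir1");
        // /data/dir1/hosts is dropped because its parent is marked deleted.
        System.out.println(buildCopyList(diff, deleted)); // [/data/keep/f]
    }
}
```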
[jira] [Resolved] (HDFS-16121) Iterative snapshot diff report can generate duplicate records for creates, deletes and Renames
[ https://issues.apache.org/jira/browse/HDFS-16121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDFS-16121. -- Fix Version/s: 3.4.0 Resolution: Fixed > Iterative snapshot diff report can generate duplicate records for creates, > deletes and Renames > -- > > Key: HDFS-16121 > URL: https://issues.apache.org/jira/browse/HDFS-16121 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Reporter: Srinivasu Majeti >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 40m > Remaining Estimate: 0h > > Currently, the iterative snapshot diff report first traverses the created list > for a directory diff and then the deleted list. If the deleted list is > smaller than the created list, the offset calculation in the respective > list is wrong, so the next diff report generation call starts iterating > over entries already processed in the created list, leading to duplicate > entries in the report. > The fix is to correct the offset calculation during the traversal of the > deleted list.
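The duplicate-record bug above is about resuming a paged traversal across two lists with one offset. A toy sketch (not Hadoop's actual SnapshotDiffReportGenerator; names and paging scheme are illustrative) of the correct bookkeeping, where the resume offset spans the created list first and then maps into the deleted list:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative paging over a directory diff: positions [0, created.size())
// come from the created list, positions from created.size() onward map into
// the deleted list. The bug class described above arises when the deleted
// list's resume position is derived incorrectly, replaying created entries.
public class IterativeDiff {
    static List<String> page(List<String> created, List<String> deleted,
                             int offset, int limit) {
        List<String> out = new ArrayList<>();
        for (int i = offset; i < created.size() && out.size() < limit; i++) {
            out.add("+" + created.get(i));
        }
        // Correct offset translation into the deleted list.
        int delStart = Math.max(0, offset - created.size());
        for (int i = delStart; i < deleted.size() && out.size() < limit; i++) {
            out.add("-" + deleted.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> created = Arrays.asList("a", "b", "c");
        List<String> deleted = Arrays.asList("x", "y");
        List<String> all = new ArrayList<>();
        int offset = 0, total = created.size() + deleted.size();
        while (offset < total) {          // iterate in pages of 2
            List<String> p = page(created, deleted, offset, 2);
            all.addAll(p);
            offset += p.size();
        }
        // Each entry appears exactly once across all pages.
        System.out.println(all);          // [+a, +b, +c, -x, -y]
    }
}
```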
[jira] [Resolved] (HDFS-15865) Interrupt DataStreamer thread
[ https://issues.apache.org/jira/browse/HDFS-15865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDFS-15865. -- Resolution: Fixed > Interrupt DataStreamer thread > - > > Key: HDFS-15865 > URL: https://issues.apache.org/jira/browse/HDFS-15865 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Reporter: Karthik Palanisamy >Priority: Minor > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > We have noticed HiveServer2 halt due to DataStreamer#waitForAckedSeqno. > I think we have to interrupt the DataStreamer if no packet ack arrives from the > datanodes. It likely happens with an infra/network issue. > {code:java} > "HiveServer2-Background-Pool: Thread-35977576" #35977576 prio=5 os_prio=0 > cpu=797.65ms elapsed=3406.28s tid=0x7fc0c6c29800 nid=0x4198 in > Object.wait() [0x7fc1079f3000] > java.lang.Thread.State: TIMED_WAITING (on object monitor) > at java.lang.Object.wait(java.base(at)11.0.5/Native Method) > - waiting on > at > org.apache.hadoop.hdfs.DataStreamer.waitForAckedSeqno(DataStreamer.java:886) > - waiting to re-lock in wait() <0x7fe6eda86ca0> (a > java.util.LinkedList){code}
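The hang above comes from an unbounded monitor wait for acks. A minimal sketch of the idea proposed (bound the wait and let the caller abort the stream instead of blocking forever); the class and method names here are illustrative, not DataStreamer's real API:

```java
// Illustrative bounded ack wait: the writer waits on a monitor for the
// acked sequence number to advance, but gives up after a deadline instead
// of parking indefinitely when no ack arrives from the datanodes.
public class BoundedAckWait {
    private final Object ackMonitor = new Object();
    private long lastAckedSeqno = -1;

    // Called by the (hypothetical) response processor when an ack arrives.
    void ack(long seqno) {
        synchronized (ackMonitor) {
            lastAckedSeqno = seqno;
            ackMonitor.notifyAll();
        }
    }

    // Returns true if seqno was acked within timeoutMs, false on timeout
    // or interrupt; the caller can then abort/close the stream.
    boolean waitForAckedSeqno(long seqno, long timeoutMs) {
        long deadline = System.currentTimeMillis() + timeoutMs;
        synchronized (ackMonitor) {
            while (lastAckedSeqno < seqno) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false;                 // timed out, do not hang
                }
                try {
                    ackMonitor.wait(remaining);   // re-check on wakeup
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        BoundedAckWait w = new BoundedAckWait();
        w.ack(5);
        System.out.println(w.waitForAckedSeqno(5, 100));  // true
        System.out.println(w.waitForAckedSeqno(6, 100));  // false
    }
}
```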
[jira] [Created] (HDFS-15518) Wrong operation name in FsNamesystem for listSnapshots
Mukul Kumar Singh created HDFS-15518: Summary: Wrong operation name in FsNamesystem for listSnapshots Key: HDFS-15518 URL: https://issues.apache.org/jira/browse/HDFS-15518 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Mukul Kumar Singh List snapshots makes use of listSnapshotDirectory as the string in place of ListSnapshot. https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java#L7026
[jira] [Commented] (HDFS-15516) Add info for create flags in NameNode audit logs
[ https://issues.apache.org/jira/browse/HDFS-15516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17172225#comment-17172225 ] Mukul Kumar Singh commented on HDFS-15516: -- The audit logs should be enhanced to include the createFlag, especially when the flag is set. Also, if the file is actually overwritten, that should be recorded in the logs as well. > Add info for create flags in NameNode audit logs > > > Key: HDFS-15516 > URL: https://issues.apache.org/jira/browse/HDFS-15516 > Project: Hadoop HDFS > Issue Type: Bug > Components: namenode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > > Currently, if a file create happens with flags like overwrite, the audit logs > do not contain the info regarding those flags. It would be useful to add info > regarding the create options in the audit logs, similar to the Rename ops.
[jira] [Updated] (HDFS-15498) Show snapshots deletion status in snapList cmd
[ https://issues.apache.org/jira/browse/HDFS-15498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-15498: - Resolution: Fixed Status: Resolved (was: Patch Available) Thanks [~shashikant] for the contribution and [~hemanthboyina] for the reviews. I have merged this change. > Show snapshots deletion status in snapList cmd > -- > > Key: HDFS-15498 > URL: https://issues.apache.org/jira/browse/HDFS-15498 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: snapshots >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 3.4.0 > > Attachments: HDFS-15498.000.patch > > > HDFS-15488 adds a cmd to list all snapshots for a given snapshottable > directory. A snapshot can be just marked as deleted with the ordered deletion > config set. This Jira aims to add the deletion status to the cmd output. > > SAMPLE OUTPUT: > {noformat} > sbanerjee-MBP15:hadoop-3.4.0-SNAPSHOT sbanerjee$ bin/hdfs lsSnapshottableDir > drwxr-xr-x 0 sbanerjee supergroup 0 2020-07-27 11:52 2 65536 /user > sbanerjee-MBP15:hadoop-3.4.0-SNAPSHOT sbanerjee$ bin/hdfs lsSnapshot /user > drwxr-xr-x 0 sbanerjee supergroup 0 2020-07-27 11:52 1 ACTIVE > /user/.snapshot/s1 > drwxr-xr-x 0 sbanerjee supergroup 0 2020-07-27 11:51 0 DELETED > /user/.snapshot/s20200727-115156.407{noformat}
[jira] [Updated] (HDFS-15482) Ordered snapshot deletion: hide the deleted snapshots from users
[ https://issues.apache.org/jira/browse/HDFS-15482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-15482: - Parent: (was: HDFS-15477) Issue Type: Bug (was: Sub-task) > Ordered snapshot deletion: hide the deleted snapshots from users > > > Key: HDFS-15482 > URL: https://issues.apache.org/jira/browse/HDFS-15482 > Project: Hadoop HDFS > Issue Type: Bug > Components: snapshots >Reporter: Tsz-wo Sze >Assignee: Shashikant Banerjee >Priority: Major > > In HDFS-15480, the behavior of deleting the non-earliest snapshots is > changed to marking them as deleted in XAttr but not actually deleting them. > The users are still able to access these snapshots as usual. > In this JIRA, the marked-for-deletion snapshots are hidden so that they become > inaccessible to users.
[jira] [Updated] (HDFS-15497) Make snapshot limit on global as well per snapshot root directory configurable
[ https://issues.apache.org/jira/browse/HDFS-15497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-15497: - Status: Patch Available (was: Open) > Make snapshot limit on global as well per snapshot root directory configurable > -- > > Key: HDFS-15497 > URL: https://issues.apache.org/jira/browse/HDFS-15497 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: snapshots >Affects Versions: 3.4.0 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Attachments: HDFS-15497.000.patch > > > Currently, there is no configurable limit imposed on the number of snapshots > in the system, either at the filesystem level or per snapshottable > root directory. Too many snapshots in the system can potentially bloat up the > namespace, and with the ordered deletion feature on, too many snapshots per > snapshottable root directory will make the deletion of the oldest snapshot > more expensive. This Jira aims to impose these configurable limits.
[jira] [Assigned] (HDFS-15496) Add UI for deleted snapshots
[ https://issues.apache.org/jira/browse/HDFS-15496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDFS-15496: Assignee: Vivek Ratnavel Subramanian > Add UI for deleted snapshots > > > Key: HDFS-15496 > URL: https://issues.apache.org/jira/browse/HDFS-15496 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Mukul Kumar Singh >Assignee: Vivek Ratnavel Subramanian >Priority: Major > > Add UI for deleted snapshots > a) Show the list of snapshots per snapshottable directory > b) Add deleted status in the JMX output for the snapshot along with a snap ID > c) The NN UI should sort the snapshots by snap IDs.
[jira] [Created] (HDFS-15501) Update Apache documentation for new ordered snapshot deletion feature
Mukul Kumar Singh created HDFS-15501: Summary: Update Apache documentation for new ordered snapshot deletion feature Key: HDFS-15501 URL: https://issues.apache.org/jira/browse/HDFS-15501 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Mukul Kumar Singh Update Apache documentation for new ordered snapshot deletion feature.
[jira] [Created] (HDFS-15500) Add more assertions about ordered deletion of snapshot
Mukul Kumar Singh created HDFS-15500: Summary: Add more assertions about ordered deletion of snapshot Key: HDFS-15500 URL: https://issues.apache.org/jira/browse/HDFS-15500 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Mukul Kumar Singh Assignee: Tsz-wo Sze This jira proposes to add new assertions. One assertion to start with: a) assert that, with the ordered snapshot deletion flag true, the prior snapshot in cleanSubtree is null.
[jira] [Updated] (HDFS-15492) Make trash root inside each snapshottable directory
[ https://issues.apache.org/jira/browse/HDFS-15492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-15492: - Parent: HDFS-15477 Issue Type: Sub-task (was: Improvement) > Make trash root inside each snapshottable directory > --- > > Key: HDFS-15492 > URL: https://issues.apache.org/jira/browse/HDFS-15492 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: hdfs, hdfs-client >Affects Versions: 3.2.1 >Reporter: Siyao Meng >Assignee: Siyao Meng >Priority: Major > > We have seen FSImage corruption cases (e.g. HDFS-13101) where files inside > a snapshottable directory are moved outside of it. The most common case > of this is when trash is enabled and a user deletes some file via the command > line without skipTrash. > This jira aims to make a trash root for each snapshottable directory, the same as > how encryption zones behave at the moment. > This will make trash cleanup a little bit more expensive on the NameNode as > it will have to iterate all trash roots. But it should be fine as long as there > aren't many snapshottable directories. > I could make this improvement an option and disable it by default if > needed, such as {{dfs.namenode.snapshot.trashroot.enabled}}. > One small caveat, though, when disabling (disallowing) snapshot on a > snapshottable directory while this improvement is in place: the client should > merge the snapshottable directory's trash with that user's trash to ensure > proper trash cleanup.
[jira] [Updated] (HDFS-15483) Ordered snapshot deletion: Disallow rename between two snapshottable directories
[ https://issues.apache.org/jira/browse/HDFS-15483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-15483: - Status: Patch Available (was: Open) > Ordered snapshot deletion: Disallow rename between two snapshottable > directories > > > Key: HDFS-15483 > URL: https://issues.apache.org/jira/browse/HDFS-15483 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: snapshots >Reporter: Tsz-wo Sze >Assignee: Shashikant Banerjee >Priority: Major > > With the ordered snapshot deletion feature, only the *earliest* snapshot can > be actually deleted from the file system. If renaming between snapshottable > directories is allowed, only the earliest snapshot among all the > snapshottable directories can be actually deleted. In such case, individual > snapshottable directory may not be able to free up the resources by itself. > Therefore, we propose disallowing renaming between snapshottable directories > in this JIRA.
[jira] [Commented] (HDFS-15496) Add UI for deleted snapshots
[ https://issues.apache.org/jira/browse/HDFS-15496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17166141#comment-17166141 ] Mukul Kumar Singh commented on HDFS-15496: -- cc [~jnp] [~shashikant] [~szetszwo] > Add UI for deleted snapshots > > > Key: HDFS-15496 > URL: https://issues.apache.org/jira/browse/HDFS-15496 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Mukul Kumar Singh >Priority: Major > > Add UI for deleted snapshots > a) Show the list of snapshots per snapshottable directory > b) Add deleted status in the JMX output for the snapshot along with a snap ID > c) The NN UI should sort the snapshots by snap IDs.
[jira] [Updated] (HDFS-15496) Add UI for deleted snapshots
[ https://issues.apache.org/jira/browse/HDFS-15496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-15496: - Description: Add UI for deleted snapshots a) Show the list of snapshots per snapshottable directory b) Add deleted status in the JMX output for the snapshot along with a snap ID c) The NN UI should sort the snapshots by snap IDs. was: Add a a) Show the list of snapshots per snapshottable directory b) Add deleted status in the JMX output for the Snapshot along with a snap ID e) NN UI, should sort the snapshots for snapIds. > Add UI for deleted snapshots > > > Key: HDFS-15496 > URL: https://issues.apache.org/jira/browse/HDFS-15496 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Mukul Kumar Singh >Priority: Major > > Add UI for deleted snapshots > a) Show the list of snapshots per snapshottable directory > b) Add deleted status in the JMX output for the snapshot along with a snap ID > c) The NN UI should sort the snapshots by snap IDs.
[jira] [Created] (HDFS-15496) Add UI for deleted snapshots
Mukul Kumar Singh created HDFS-15496: Summary: Add UI for deleted snapshots Key: HDFS-15496 URL: https://issues.apache.org/jira/browse/HDFS-15496 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Mukul Kumar Singh Add UI for deleted snapshots: a) Show the list of snapshots per snapshottable directory b) Add deleted status in the JMX output for the snapshot along with a snap ID c) The NN UI should sort the snapshots by snap IDs.
[jira] [Updated] (HDFS-15488) Add a command to list all snapshots for a snapshottable root with snapshot Ids
[ https://issues.apache.org/jira/browse/HDFS-15488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-15488: - Status: Patch Available (was: Open) > Add a command to list all snapshots for a snapshottable root with snapshot Ids > - > > Key: HDFS-15488 > URL: https://issues.apache.org/jira/browse/HDFS-15488 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: snapshots >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > > Currently, the way to list snapshots is to do an ls on the > /.snapshot directory. Since creation time is not > recorded, there is no way to actually figure out the chronological order of > snapshots. The idea here is to add a command to list snapshots for a > snapshottable directory along with snapshot Ids, which grow monotonically as > snapshots are created in the system. With the snapID, it will be helpful to > figure out the chronology of snapshots in the system.
[jira] [Commented] (HDFS-15480) Ordered snapshot deletion: record snapshot deletion in XAttr
[ https://issues.apache.org/jira/browse/HDFS-15480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17161927#comment-17161927 ] Mukul Kumar Singh commented on HDFS-15480: -- Thanks for the patch [~shashikant]. Some comments as follows. a) FSDirSnapshotOp:280, let's use the FSDirXAttrOp.unprotectedSetXAttrs api and assert that the Snapshot XAttr is only set on the snapshot root. b) TestOrderedSnapshotDeletion:105, let's also assert that the value should be null/non-existent c) TestOrderedSnapshotDeletion:152, can use the same name creation api as in FSDirSnapshotOp. This will keep the name in the test consistent. > Ordered snapshot deletion: record snapshot deletion in XAttr > > > Key: HDFS-15480 > URL: https://issues.apache.org/jira/browse/HDFS-15480 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: snapshots >Reporter: Tsz-wo Sze >Assignee: Shashikant Banerjee >Priority: Major > Attachments: HDFS-15480.000.patch, HDFS-15480.001.patch > > > In this JIRA, the behavior of deleting the non-earliest snapshots will be > changed to marking them as deleted in XAttr but not actually deleting them. > Note that > # The marked-for-deletion snapshots will be garbage collected later on; see > HDFS-15481. > # The marked-for-deletion snapshots will be hidden from users; see HDFS-15482.
[jira] [Resolved] (HDFS-15301) statfs function in hdfs-fuse is not working
[ https://issues.apache.org/jira/browse/HDFS-15301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDFS-15301. -- Fix Version/s: 3.4.0 3.3.0 Resolution: Fixed > statfs function in hdfs-fuse is not working > --- > > Key: HDFS-15301 > URL: https://issues.apache.org/jira/browse/HDFS-15301 > Project: Hadoop HDFS > Issue Type: Bug > Components: fuse-dfs, libhdfs >Reporter: Aryan Gupta >Assignee: Aryan Gupta >Priority: Major > Labels: https://github.com/apache/hadoop/pull/1980 > Fix For: 3.3.0, 3.4.0 > > > *statfs function in hdfs-fuse is not working.* It gives an error like: > could not find method org/apache/hadoop/fs/FsStatus from class > org/apache/hadoop/fs/FsStatus with signature getUsed > hdfsGetUsed: FsStatus#getUsed error: > NoSuchMethodError: org/apache/hadoop/fs/FsStatusjava.lang.NoSuchMethodError: > org/apache/hadoop/fs/FsStatus > > Problem: incorrect passing of parameters to the invokeMethod function. > invokeMethod(env, , INSTANCE, fss, JC_FS_STATUS, > HADOOP_FSSTATUS,"getUsed", "()J"); >
[jira] [Commented] (HDFS-15301) statfs function in hdfs-fuse is not working
[ https://issues.apache.org/jira/browse/HDFS-15301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095105#comment-17095105 ] Mukul Kumar Singh commented on HDFS-15301: -- Merged this to trunk and backported to branch-3.3. > statfs function in hdfs-fuse is not working > --- > > Key: HDFS-15301 > URL: https://issues.apache.org/jira/browse/HDFS-15301 > Project: Hadoop HDFS > Issue Type: Bug > Components: fuse-dfs, libhdfs >Reporter: Aryan Gupta >Assignee: Aryan Gupta >Priority: Major > Labels: https://github.com/apache/hadoop/pull/1980 > > *statfs function in hdfs-fuse is not working.* It gives an error like: > could not find method org/apache/hadoop/fs/FsStatus from class > org/apache/hadoop/fs/FsStatus with signature getUsed > hdfsGetUsed: FsStatus#getUsed error: > NoSuchMethodError: org/apache/hadoop/fs/FsStatusjava.lang.NoSuchMethodError: > org/apache/hadoop/fs/FsStatus > > Problem: incorrect passing of parameters to the invokeMethod function. > invokeMethod(env, , INSTANCE, fss, JC_FS_STATUS, > HADOOP_FSSTATUS,"getUsed", "()J"); >
[jira] [Commented] (HDFS-15301) statfs function in hdfs-fuse is not working
[ https://issues.apache.org/jira/browse/HDFS-15301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095036#comment-17095036 ] Mukul Kumar Singh commented on HDFS-15301: -- Thanks for the review [~weichiu] and [~pifta]. I have merged the changes to trunk. Will also backport this to Hadoop-3.3. [~aryangupta1998], can we add the test as a follow-up task? > statfs function in hdfs-fuse is not working > --- > > Key: HDFS-15301 > URL: https://issues.apache.org/jira/browse/HDFS-15301 > Project: Hadoop HDFS > Issue Type: Bug > Components: fuse-dfs, libhdfs >Reporter: Aryan Gupta >Assignee: Aryan Gupta >Priority: Major > Labels: https://github.com/apache/hadoop/pull/1980 > > *statfs function in hdfs-fuse is not working.* It gives an error like: > could not find method org/apache/hadoop/fs/FsStatus from class > org/apache/hadoop/fs/FsStatus with signature getUsed > hdfsGetUsed: FsStatus#getUsed error: > NoSuchMethodError: org/apache/hadoop/fs/FsStatusjava.lang.NoSuchMethodError: > org/apache/hadoop/fs/FsStatus > > Problem: incorrect passing of parameters to the invokeMethod function. > invokeMethod(env, , INSTANCE, fss, JC_FS_STATUS, > HADOOP_FSSTATUS,"getUsed", "()J"); >
[jira] [Created] (HDDS-2600) Move chaos test to org.apache.hadoop.ozone.chaos package
Mukul Kumar Singh created HDDS-2600: --- Summary: Move chaos test to org.apache.hadoop.ozone.chaos package Key: HDDS-2600 URL: https://issues.apache.org/jira/browse/HDDS-2600 Project: Hadoop Distributed Data Store Issue Type: Bug Components: test Reporter: Mukul Kumar Singh This is a simple refactoring change where all the chaos tests are moved to the org.apache.hadoop.ozone.chaos package.
[jira] [Commented] (HDFS-14952) Skip safemode if blockTotal is 0 in new NN
[ https://issues.apache.org/jira/browse/HDFS-14952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16977486#comment-16977486 ] Mukul Kumar Singh commented on HDFS-14952: -- +1 from me as well. I will commit this shortly. > Skip safemode if blockTotal is 0 in new NN > -- > > Key: HDFS-14952 > URL: https://issues.apache.org/jira/browse/HDFS-14952 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode >Reporter: Rajesh Balamohan >Assignee: Xiaoqiao He >Priority: Trivial > Labels: performance > Attachments: HDFS-14952.001.patch, HDFS-14952.002.patch, > HDFS-14952.003.patch > > > When a new NN is installed, it spends 30-45 seconds in Safemode. When > {{blockTotal}} is 0, it should be possible to short-circuit the safemode check in > {{BlockManagerSafeMode::areThresholdsMet}}. > https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManagerSafeMode.java#L571
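The short-circuit proposed above is simple: with no blocks expected, the block-safety threshold is trivially satisfied. A hedged sketch of the idea (the field and method shapes below are illustrative, not BlockManagerSafeMode's exact internals):

```java
// Illustrative safemode threshold check: when blockTotal == 0 (a freshly
// formatted NameNode), the block threshold is met by definition and the
// NN need not sit in safemode waiting for block reports that cannot come.
public class SafeModeCheck {
    static boolean areThresholdsMet(long blockTotal, long blockSafe,
                                    double threshold,
                                    int liveNodes, int minNodes) {
        long blockThreshold = (long) (blockTotal * threshold);
        // Short circuit: no blocks expected means the block condition holds.
        boolean blocksMet = blockTotal == 0 || blockSafe >= blockThreshold;
        return blocksMet && liveNodes >= minNodes;
    }

    public static void main(String[] args) {
        // New NN, no blocks yet: threshold already satisfied.
        System.out.println(areThresholdsMet(0, 0, 0.999, 1, 1));    // true
        // Existing NN still collecting block reports: not satisfied.
        System.out.println(areThresholdsMet(100, 50, 0.999, 1, 1)); // false
    }
}
```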
[jira] [Commented] (HDDS-2364) Add an OM metric to find the false positive rate for the keyMayExist
[ https://issues.apache.org/jira/browse/HDDS-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964000#comment-16964000 ] Mukul Kumar Singh commented on HDDS-2364: - I had a cursory look at the patch and it looks good to me. [~avijayan], can we please also raise a follow-up to add more OM metrics, such as: a) flush buffer sizes b) batch put latencies c) get latencies d) number of instances of iterator scans > Add an OM metric to find the false positive rate for the keyMayExist > > > Key: HDDS-2364 > URL: https://issues.apache.org/jira/browse/HDDS-2364 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Add an OM metric to find the false positive rate for the keyMayExist.
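For context on the metric discussed above: RocksDB's bloom-filter-backed keyMayExist() can answer "may exist" for keys that are actually absent, and a false positive is detected when the follow-up lookup then misses. A minimal sketch of such a counter (class and method names are illustrative, not the actual OM metrics code):

```java
import java.util.concurrent.atomic.LongAdder;

// Illustrative false-positive-rate counter for a bloom-filter pre-check:
// count how often "may exist" answers turn out to be misses on the real get.
public class KeyMayExistMetrics {
    private final LongAdder mayExistHits = new LongAdder();   // keyMayExist said true
    private final LongAdder falsePositives = new LongAdder(); // ...but get() missed

    void recordCheck(boolean mayExist, boolean actuallyExists) {
        if (mayExist) {
            mayExistHits.increment();
            if (!actuallyExists) {
                falsePositives.increment();
            }
        }
        // "false" answers from keyMayExist are definitive and not counted.
    }

    double falsePositiveRate() {
        long hits = mayExistHits.sum();
        return hits == 0 ? 0.0 : (double) falsePositives.sum() / hits;
    }

    public static void main(String[] args) {
        KeyMayExistMetrics m = new KeyMayExistMetrics();
        m.recordCheck(true, true);    // true positive
        m.recordCheck(true, false);   // bloom-filter false positive
        m.recordCheck(false, false);  // true negative, not counted
        System.out.println(m.falsePositiveRate()); // 0.5
    }
}
```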
[jira] [Commented] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963991#comment-16963991 ] Mukul Kumar Singh commented on HDDS-2356: - Hi [~timmylicheng], the following error is a new issue. What was the length of the file that we were writing? {code} - Exception occurred in PutObject java.io.IOException: Failed on local exception: org.apache.hadoop.ipc.RpcException: RPC response exceeds maximum data length; Host Details : local host is: "VM_50_210_centos/127.0.0.1"; destination host is: "9.134.50.210":9862; {code} > Multipart upload report errors while writing to ozone Ratis pipeline > > > Key: HDDS-2356 > URL: https://issues.apache.org/jira/browse/HDDS-2356 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 > Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM > on a separate VM >Reporter: Li Cheng >Assignee: Bharat Viswanadham >Priority: Blocker > Fix For: 0.5.0 > > Attachments: image-2019-10-31-18-56-56-177.png > > > Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say > it's VM0. > I use goofys as a fuse and enable the ozone S3 gateway to mount ozone to a path > on VM0, reading data from VM0's local disk and writing to the mount path. The > dataset has files of various sizes from 0 bytes to GB-level and contains > around 50,000 files. > The writing is slow (1GB in ~10 mins) and it stops after around 4GB. As I > look at the hadoop-root-om-VM_50_210_centos.out log, I see the OM throwing errors > related to multipart upload. This error eventually causes the writing to > terminate and the OM to be closed. 
> > 2019-10-28 11:44:34,079 [qtp1383524016-70] ERROR - Error in Complete > Multipart Upload Request for bucket: ozone-test, key: > 20191012/plc_1570863541668_927 > 8 > MISMATCH_MULTIPART_LIST org.apache.hadoop.ozone.om.exceptions.OMException: > Complete Multipart Upload Failed: volume: > s3c89e813c80ffcea9543004d57b2a1239bucket: > ozone-testkey: 20191012/plc_1570863541668_9278 > at > org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.handleError(OzoneManagerProtocolClientSideTranslatorPB.java:732) > at > org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.completeMultipartUpload(OzoneManagerProtocolClientSideTranslatorPB > .java:1104) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.apache.hadoop.hdds.tracing.TraceAllMethod.invoke(TraceAllMethod.java:66) > at com.sun.proxy.$Proxy82.completeMultipartUpload(Unknown Source) > at > org.apache.hadoop.ozone.client.rpc.RpcClient.completeMultipartUpload(RpcClient.java:883) > at > org.apache.hadoop.ozone.client.OzoneBucket.completeMultipartUpload(OzoneBucket.java:445) > at > org.apache.hadoop.ozone.s3.endpoint.ObjectEndpoint.completeMultipartUpload(ObjectEndpoint.java:498) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:497) > at > org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory.lambda$static$0(ResourceMethodInvocationHandlerFactory.java:76) > at > 
org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:148) > at > org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:191) > at > org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$ResponseOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:200) > at > org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:103) > at > org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:493) > > The following errors have been resolved in > https://issues.apache.org/jira/browse/HDDS-2322. > 2019-10-24 16:01:59,527 [OMDoubleBufferFlushThread] ERROR - Terminating with > exit status 2: OMDoubleBuffer flush > threadOMDoubleBufferFlushThreadencountered Throwable error > java.util.ConcurrentModificationException > at java.util.TreeMap.forEach(TreeMap.java:1004) > at >
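The "RPC response exceeds maximum data length" failure quoted above is raised when the Hadoop IPC client sees a response larger than its configured `ipc.maximum.data.length` (64 MB by default), which usually points at an oversized RPC payload rather than a network fault. A sketch of that guard, assuming the 64 MB default (illustrative Python, not the actual IPC client code):

```python
DEFAULT_MAX_DATA_LENGTH = 64 * 1024 * 1024  # ipc.maximum.data.length default (64 MB)


def check_response_length(length: int,
                          max_length: int = DEFAULT_MAX_DATA_LENGTH) -> int:
    """Mimic the IPC client's sanity check on the declared response size."""
    if length > max_length:
        raise IOError("RPC response exceeds maximum data length: "
                      f"{length} > {max_length}")
    return length
```

Raising the configured limit works around the symptom, but a response approaching tens of megabytes is usually a sign that the request itself (here, a multipart-upload complete call with very many parts) should be bounded instead.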
[jira] [Created] (HDDS-2389) add toStateMachineLogEntryString provider in Ozone's ContainerStateMachine
Mukul Kumar Singh created HDDS-2389: --- Summary: add toStateMachineLogEntryString provider in Ozone's ContainerStateMachine Key: HDDS-2389 URL: https://issues.apache.org/jira/browse/HDDS-2389 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Affects Versions: 0.4.1 Reporter: Mukul Kumar Singh This jira proposes to add a new toStateMachineLogEntryString provider in Ozone's ContainerStateMachine to print additional debug log statements in Ratis. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
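Conceptually, the proposed provider is just a function that Ratis can call to render a state machine log entry as a human-readable string in its debug logs. A hypothetical sketch of such a formatter (the dict stands in for a StateMachineLogEntryProto; the field names below are illustrative assumptions, not the actual proto schema):

```python
def to_state_machine_log_entry_string(entry: dict) -> str:
    """Render a (stand-in) state machine log entry for debug logging.

    `entry` stands in for a StateMachineLogEntryProto; cmdType,
    containerID and logIndex are assumed field names used here only
    to illustrate the shape of the provider.
    """
    return "cmd={cmdType} container={containerID} index={logIndex}".format(**entry)
```

The value of plugging such a provider into Ratis is that Raft-log debug lines then show the Ozone command type and container, instead of an opaque byte payload.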
[jira] [Assigned] (HDDS-2389) add toStateMachineLogEntryString provider in Ozone's ContainerStateMachine
[ https://issues.apache.org/jira/browse/HDDS-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2389: --- Assignee: Mukul Kumar Singh > add toStateMachineLogEntryString provider in Ozone's ContainerStateMachine > -- > > Key: HDDS-2389 > URL: https://issues.apache.org/jira/browse/HDDS-2389 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.1 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > > This jira proposes to add a new toStateMachineLogEntryString provider in > Ozone's ContainerStateMachine to print extra log debug statements in Ratis. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2372) Datanode pipeline is failing with NoSuchFileException
[ https://issues.apache.org/jira/browse/HDDS-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16963796#comment-16963796 ] Mukul Kumar Singh edited comment on HDDS-2372 at 10/31/19 9:21 AM: --- cc: [~msingh] [~shashikant] was (Author: elek): cc: [~msingh] > Datanode pipeline is failing with NoSuchFileException > - > > Key: HDDS-2372 > URL: https://issues.apache.org/jira/browse/HDDS-2372 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Marton Elek >Priority: Critical > > Found it on a k8s based test cluster using a simple 3 node cluster and > HDDS-2327 freon test. After a while the StateMachine become unhealthy after > this error: > {code:java} > datanode-0 datanode java.util.concurrent.ExecutionException: > java.util.concurrent.ExecutionException: > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > java.nio.file.NoSuchFileException: > /data/storage/hdds/2a77fab9-9dc5-4f73-9501-b5347ac6145c/current/containerDir0/1/chunks/gGYYgiTTeg_testdata_chunk_13931.tmp.2.20830 > {code} > Can be reproduced. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2383) Closing open container via SCMCli throws exception
[ https://issues.apache.org/jira/browse/HDDS-2383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2383: --- Assignee: Nanda kumar > Closing open container via SCMCli throws exception > -- > > Key: HDDS-2383 > URL: https://issues.apache.org/jira/browse/HDDS-2383 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Reporter: Rajesh Balamohan >Assignee: Nanda kumar >Priority: Major > > This was observed in apache master branch. > Closing the container via {{SCMCli}} throws the following exception, though > the container ends up getting closed eventually. > {noformat} > 2019-10-30 02:44:41,794 INFO > org.apache.hadoop.hdds.scm.block.SCMBlockDeletingService: Block deletion > txnID mismatch in datanode 79626ba3-1957-46e5-a8b0-32d7f47fb801 for > containerID 6. Datanode delete txnID: 0, SCM txnID: 1004 > 2019-10-30 02:44:41,810 INFO > org.apache.hadoop.hdds.scm.container.IncrementalContainerReportHandler: > Moving container #4 to CLOSED state, datanode > 8885d4ba-228a-4fd2-bf5a-831f01594c6c{ip: 10.17.234.37, host: > vd1327.halxg.cloudera.com, networkLocation: /default-rack, certSerialId: > null} reported CLOSED replica. > 2019-10-30 02:44:41,826 INFO > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer: Object type > container id 4 op close new stage complete > 2019-10-30 02:44:41,826 ERROR > org.apache.hadoop.hdds.scm.container.ContainerStateManager: Failed to update > container state #4, reason: invalid state transition from state: CLOSED upon > event: CLOSE. > 2019-10-30 02:44:41,826 INFO org.apache.hadoop.ipc.Server: IPC Server handler > 6 on 9860, call Call#3 Retry#0 > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocol.submitRequest > from 10.17.234.32:45926 > org.apache.hadoop.hdds.scm.exceptions.SCMException: Failed to update > container state #4, reason: invalid state transition from state: CLOSED upon > event: CLOSE. 
> at > org.apache.hadoop.hdds.scm.container.ContainerStateManager.updateContainerState(ContainerStateManager.java:338) > at > org.apache.hadoop.hdds.scm.container.SCMContainerManager.updateContainerState(SCMContainerManager.java:326) > at > org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.notifyObjectStageChange(SCMClientProtocolServer.java:388) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.notifyObjectStageChange(StorageContainerLocationProtocolServerSideTranslatorPB.java:303) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.processRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:158) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB$$Lambda$152/2036820231.apply(Unknown > Source) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.hdds.scm.protocol.StorageContainerLocationProtocolServerSideTranslatorPB.submitRequest(StorageContainerLocationProtocolServerSideTranslatorPB.java:112) > at > org.apache.hadoop.hdds.protocol.proto.StorageContainerLocationProtocolProtos$StorageContainerLocationProtocolService$2.callBlockingMethod(StorageContainerLocationProtocolProtos.java:30454) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:876) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:822) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2682) > {noformat} -- This message was sent by Atlassian Jira 
(v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2376) Fail to read data through XceiverClientGrpc
[ https://issues.apache.org/jira/browse/HDDS-2376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962928#comment-16962928 ] Mukul Kumar Singh commented on HDDS-2376: - Hi [~Sammi], are there any errors on the datanode? That is, is the chunk file present on the datanode? Also, we can verify checksums locally on the datanode as well. > Fail to read data through XceiverClientGrpc > --- > > Key: HDDS-2376 > URL: https://issues.apache.org/jira/browse/HDDS-2376 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Sammi Chen >Assignee: Hanisha Koneru >Priority: Blocker > > Run teragen, application failed with following stack, > 19/10/29 14:35:42 INFO mapreduce.Job: Running job: job_1567133159094_0048 > 19/10/29 14:35:59 INFO mapreduce.Job: Job job_1567133159094_0048 running in > uber mode : false > 19/10/29 14:35:59 INFO mapreduce.Job: map 0% reduce 0% > 19/10/29 14:35:59 INFO mapreduce.Job: Job job_1567133159094_0048 failed with > state FAILED due to: Application application_1567133159094_0048 failed 2 > times due to AM Container for appattempt_1567133159094_0048_02 exited > with exitCode: -1000 > For more detailed output, check application tracking > page:http://host183:8088/cluster/app/application_1567133159094_0048Then, > click on links to logs of each attempt. 
> Diagnostics: Unexpected OzoneException: > org.apache.hadoop.ozone.common.OzoneChecksumException: Checksum mismatch at > index 0 > java.io.IOException: Unexpected OzoneException: > org.apache.hadoop.ozone.common.OzoneChecksumException: Checksum mismatch at > index 0 > at > org.apache.hadoop.hdds.scm.storage.ChunkInputStream.readChunk(ChunkInputStream.java:342) > at > org.apache.hadoop.hdds.scm.storage.ChunkInputStream.readChunkFromContainer(ChunkInputStream.java:307) > at > org.apache.hadoop.hdds.scm.storage.ChunkInputStream.prepareRead(ChunkInputStream.java:259) > at > org.apache.hadoop.hdds.scm.storage.ChunkInputStream.read(ChunkInputStream.java:144) > at > org.apache.hadoop.hdds.scm.storage.BlockInputStream.read(BlockInputStream.java:239) > at > org.apache.hadoop.ozone.client.io.KeyInputStream.read(KeyInputStream.java:171) > at > org.apache.hadoop.fs.ozone.OzoneFSInputStream.read(OzoneFSInputStream.java:52) > at java.io.DataInputStream.read(DataInputStream.java:100) > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:86) > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:60) > at org.apache.hadoop.io.IOUtils.copyBytes(IOUtils.java:120) > at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:366) > at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:267) > at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63) > at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361) > at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:359) > at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: org.apache.hadoop.ozone.common.OzoneChecksumException: Checksum > mismatch at index 0 > at > org.apache.hadoop.ozone.common.ChecksumData.verifyChecksumDataMatches(ChecksumData.java:148) > at > org.apache.hadoop.ozone.common.Checksum.verifyChecksum(Checksum.java:275) > at > org.apache.hadoop.ozone.common.Checksum.verifyChecksum(Checksum.java:238) > at > org.apache.hadoop.hdds.scm.storage.ChunkInputStream.lambda$new$0(ChunkInputStream.java:375) > at > org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:287) > at > org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithTraceIDAndRetry(XceiverClientGrpc.java:250) > at > org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommand(XceiverClientGrpc.java:233) > at > org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.readChunk(ContainerProtocolCalls.java:245) > at >
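Verifying the checksum locally, as suggested in the comment above, amounts to recomputing a checksum over each bytes-per-checksum window of the chunk file and comparing against the stored values; a mismatch at index 0 means the very first window differs. A simplified sketch using CRC32 (Ozone supports several checksum types; CRC32 and the window size here are just for illustration):

```python
import zlib


def compute_checksums(data: bytes, bytes_per_checksum: int = 1024 * 1024) -> list:
    """CRC32 per bytes_per_checksum window, mirroring per-chunk checksum lists."""
    return [zlib.crc32(data[i:i + bytes_per_checksum])
            for i in range(0, len(data), bytes_per_checksum)]


def verify_chunk(data: bytes, expected: list,
                 bytes_per_checksum: int = 1024 * 1024) -> None:
    """Raise on the first window whose recomputed checksum differs."""
    actual = compute_checksums(data, bytes_per_checksum)
    for i, (a, e) in enumerate(zip(actual, expected)):
        if a != e:
            raise ValueError(f"Checksum mismatch at index {i}")
```

Running such a check directly against the chunk file on the datanode separates on-disk corruption from a read-path (client/gRPC) bug.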
[jira] [Resolved] (HDDS-2056) Datanode unable to start command handler thread with security enabled
[ https://issues.apache.org/jira/browse/HDDS-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDDS-2056. - Resolution: Duplicate > Datanode unable to start command handler thread with security enabled > - > > Key: HDDS-2056 > URL: https://issues.apache.org/jira/browse/HDDS-2056 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Shashikant Banerjee >Assignee: Xiaoyu Yao >Priority: Major > Fix For: 0.5.0 > > > > {code:java} > 2019-08-29 02:50:23,536 ERROR > org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine: > Critical Error : Command processor thread encountered an error. Thread: > Thread[Command processor thread,5,main] > java.lang.IllegalArgumentException: Null user > at > org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1269) > at > org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1256) > at > org.apache.hadoop.hdds.security.token.BlockTokenVerifier.verify(BlockTokenVerifier.java:116) > at > org.apache.hadoop.ozone.container.common.transport.server.XceiverServer.submitRequest(XceiverServer.java:68) > at > org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.submitRequest(XceiverServerRatis.java:482) > at > org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler.handle(CloseContainerCommandHandler.java:109) > at > org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93) > at > org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:432) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2364) Add a OM metrics to find the false positive rate for the keyMayExist
Mukul Kumar Singh created HDDS-2364: --- Summary: Add a OM metrics to find the false positive rate for the keyMayExist Key: HDDS-2364 URL: https://issues.apache.org/jira/browse/HDDS-2364 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Affects Versions: 0.5.0 Reporter: Mukul Kumar Singh Add a OM metrics to find the false positive rate for the keyMayExist. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2356) Multipart upload report errors while writing to ozone Ratis pipeline
[ https://issues.apache.org/jira/browse/HDDS-2356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2356: --- Assignee: Bharat Viswanadham > Multipart upload report errors while writing to ozone Ratis pipeline > > > Key: HDDS-2356 > URL: https://issues.apache.org/jira/browse/HDDS-2356 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.4.1 > Environment: Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM > on a separate VM >Reporter: Li Cheng >Assignee: Bharat Viswanadham >Priority: Blocker > > Env: 4 VMs in total: 3 Datanodes on 3 VMs, 1 OM & 1 SCM on a separate VM, say > it's VM0. > I use goofys as a fuse and enable ozone S3 gateway to mount ozone to a path > on VM0, while reading data from VM0 local disk and write to mount path. The > dataset has various sizes of files from 0 byte to GB-level and it has a > number of ~50,000 files. > The writing is slow (1GB for ~10 mins) and it stops after around 4GB. As I > look at hadoop-root-om-VM_50_210_centos.out log, I see OM throwing errors > related with Multipart upload. This error eventually causes the writing to > terminate and OM to be closed. 
> > 2019-10-24 16:01:59,527 [OMDoubleBufferFlushThread] ERROR - Terminating with > exit status 2: OMDoubleBuffer flush > threadOMDoubleBufferFlushThreadencountered Throwable error > java.util.ConcurrentModificationException > at java.util.TreeMap.forEach(TreeMap.java:1004) > at > org.apache.hadoop.ozone.om.helpers.OmMultipartKeyInfo.getProto(OmMultipartKeyInfo.java:111) > at > org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:38) > at > org.apache.hadoop.ozone.om.codec.OmMultipartKeyInfoCodec.toPersistedFormat(OmMultipartKeyInfoCodec.java:31) > at > org.apache.hadoop.hdds.utils.db.CodecRegistry.asRawData(CodecRegistry.java:68) > at > org.apache.hadoop.hdds.utils.db.TypedTable.putWithBatch(TypedTable.java:125) > at > org.apache.hadoop.ozone.om.response.s3.multipart.S3MultipartUploadCommitPartResponse.addToDBBatch(S3MultipartUploadCommitPartResponse.java:112) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.lambda$flushTransactions$0(OzoneManagerDoubleBuffer.java:137) > at java.util.Iterator.forEachRemaining(Iterator.java:116) > at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:135) > at java.lang.Thread.run(Thread.java:745) > 2019-10-24 16:01:59,629 [shutdown-hook-0] INFO - SHUTDOWN_MSG: -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2334) Dummy chunk manager fails with length mismatch error
[ https://issues.apache.org/jira/browse/HDDS-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2334: Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the contribution, [~adoroszlai], and thanks to [~sdeka] and [~shashikant] for the reviews. I have committed this. > Dummy chunk manager fails with length mismatch error > > > Key: HDDS-2334 > URL: https://issues.apache.org/jira/browse/HDDS-2334 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > HDDS-1094 added a config option ({{hdds.container.chunk.persistdata=false}}) > to drop chunks instead of writing them to disk. Currently this option > triggers the following error with any key size: > {noformat} > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > data array does not match the length specified. 
DataLen: 16777216 Byte > Array: 16777478 > at > org.apache.hadoop.ozone.container.keyvalue.impl.ChunkManagerDummyImpl.writeChunk(ChunkManagerDummyImpl.java:87) > at > org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handleWriteChunk(KeyValueHandler.java:695) > at > org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.handle(KeyValueHandler.java:176) > at > org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatchRequest(HddsDispatcher.java:277) > at > org.apache.hadoop.ozone.container.common.impl.HddsDispatcher.dispatch(HddsDispatcher.java:150) > at > org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.dispatchCommand(ContainerStateMachine.java:413) > at > org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.runCommand(ContainerStateMachine.java:423) > at > org.apache.hadoop.ozone.container.common.transport.server.ratis.ContainerStateMachine.lambda$handleWriteChunk$1(ContainerStateMachine.java:458) > at > java.base/java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1700) > at > java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) > at > java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) > at java.base/java.lang.Thread.run(Thread.java:834) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16956324#comment-16956324 ] Mukul Kumar Singh commented on HDFS-14884: -- Thanks for the review [~weichiu]. I have addressed the review comments in the v3 patch. > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Attachments: HDFS-14884.001.patch, HDFS-14884.002.patch, > HDFS-14884.003.patch, hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-14884: - Attachment: HDFS-14884.003.patch > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Attachments: HDFS-14884.001.patch, HDFS-14884.002.patch, > HDFS-14884.003.patch, hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2301) Write path: Reduce read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2301: --- Assignee: Supratim Deka (was: Nanda kumar) > Write path: Reduce read contention in rocksDB > - > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Assignee: Supratim Deka >Priority: Major > Labels: performance > Attachments: om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in > OM. This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the write path. > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist.}} > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
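The optimisation suggested in HDDS-2301 relies on the Bloom-filter semantics of RocksDB's keyMayExist: a False answer means the key is definitely absent (so the expensive full get can be skipped, which is the common case for fresh directory creation), while True only means "maybe", and the full lookup is still required. A toy model of those semantics (illustrative only, not the RocksDB API):

```python
import hashlib


class TinyBloom:
    """Toy Bloom filter illustrating keyMayExist-style semantics:
    no false negatives, but occasional false positives."""

    def __init__(self, bits: int = 1024, hashes: int = 3):
        self.bits = bits
        self.hashes = hashes
        self.bitset = 0

    def _positions(self, key: str):
        # Derive `hashes` bit positions from independent salted hashes.
        for i in range(self.hashes):
            h = hashlib.sha256(b"%d:%s" % (i, key.encode())).digest()
            yield int.from_bytes(h[:8], "big") % self.bits

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bitset |= 1 << p

    def key_may_exist(self, key: str) -> bool:
        # False => definitely absent; True => "maybe", so the caller
        # must still perform the real lookup to confirm.
        return all((self.bitset >> p) & 1 for p in self._positions(key))
```

This is also why the keyMayExist false-positive-rate metric proposed in HDDS-2364 matters: a high false-positive rate would erase the saving, since every "maybe" still pays for a full read.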
[jira] [Assigned] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDFS-14884: Assignee: Mukul Kumar Singh > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Attachments: HDFS-14884.001.patch, HDFS-14884.002.patch, > hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2339) Add OzoneManager to MiniOzoneChaosCluster
Mukul Kumar Singh created HDDS-2339: --- Summary: Add OzoneManager to MiniOzoneChaosCluster Key: HDDS-2339 URL: https://issues.apache.org/jira/browse/HDDS-2339 Project: Hadoop Distributed Data Store Issue Type: Bug Components: om Reporter: Mukul Kumar Singh This jira proposes to add OzoneManager to MiniOzoneChaosCluster now that the Ozone HA implementation is done. This will help in discovering bugs in Ozone Manager HA. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2335) Params not included in AuditMessage
[ https://issues.apache.org/jira/browse/HDDS-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2335: Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the contribution, [~adoroszlai], and thanks to [~bharat] and [~dineshchitlangia] for the reviews. I have committed this. > Params not included in AuditMessage > --- > > Key: HDDS-2335 > URL: https://issues.apache.org/jira/browse/HDDS-2335 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > HDDS-2323 introduced the following Findbugs violation: > {noformat:title=https://github.com/elek/ozone-ci-q4/blob/master/trunk/trunk-nightly-20191020-r5wzl/findbugs/summary.txt} > M P UrF: Unread field: > org.apache.hadoop.ozone.audit.AuditMessage$Builder.params At > AuditMessage.java:[line 106] > {noformat} > Which reveals that {{params}} is now not logged in audit messages: > {noformat} > 2019-10-20 08:41:35,248 | INFO | OMAudit | user=hadoop | ip=192.168.128.2 | > op=CREATE_VOLUME | ret=SUCCESS | > 2019-10-20 08:41:35,312 | INFO | OMAudit | user=hadoop | ip=192.168.128.2 | > op=CREATE_BUCKET | ret=SUCCESS | > 2019-10-20 08:41:35,407 | INFO | OMAudit | user=hadoop | ip=192.168.128.2 | > op=ALLOCATE_KEY | ret=SUCCESS | > 2019-10-20 08:41:37,355 | INFO | OMAudit | user=hadoop | ip=192.168.128.2 | > op=COMMIT_KEY | ret=SUCCESS | > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2337) Fix checkstyle errors
[ https://issues.apache.org/jira/browse/HDDS-2337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2337: Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the contribution [~adoroszlai]. I have committed this. > Fix checkstyle errors > - > > Key: HDDS-2337 > URL: https://issues.apache.org/jira/browse/HDDS-2337 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Checkstyle errors introduced in HDDS-2281: > {noformat:title=https://github.com/elek/ozone-ci-q4/blob/master/pr/pr-hdds-2281-wfpgn/checkstyle/summary.txt} > hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/transport/server/ratis/ContainerStateMachine.java > 465: Line is longer than 80 characters (found 81). > hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/container/ContainerTestHelper.java > 244: Line is longer than 80 characters (found 84). > hadoop-ozone/integration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestContainerStateMachineFailures.java > 30: Unused import - > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException. > 506: ; is preceded with whitespace. > 517: ; is preceded with whitespace. > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2336) Fix TestKeyValueContainer#testRocksDBCreateUsesCachedOptions
[ https://issues.apache.org/jira/browse/HDDS-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955674#comment-16955674 ] Mukul Kumar Singh commented on HDDS-2336: - Thanks for the contribution [~adoroszlai]. I have committed this. > Fix TestKeyValueContainer#testRocksDBCreateUsesCachedOptions > > > Key: HDDS-2336 > URL: https://issues.apache.org/jira/browse/HDDS-2336 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Affects Versions: 0.5.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > TestKeyValueContainer#testRocksDBCreateUsesCachedOptions, introduced in > HDDS-2283, is failing: > {noformat:title=https://github.com/elek/ozone-ci-q4/blob/master/pr/pr-hdds-2283-cnrrq/unit/hadoop-hdds/container-service/org.apache.hadoop.ozone.container.keyvalue.TestKeyValueContainer.txt} > testRocksDBCreateUsesCachedOptions(org.apache.hadoop.ozone.container.keyvalue.TestKeyValueContainer) > Time elapsed: 0.135 s <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<11> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.ozone.container.keyvalue.TestKeyValueContainer.testRocksDBCreateUsesCachedOptions(TestKeyValueContainer.java:406) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2336) Fix TestKeyValueContainer#testRocksDBCreateUsesCachedOptions
[ https://issues.apache.org/jira/browse/HDDS-2336?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2336: Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) > Fix TestKeyValueContainer#testRocksDBCreateUsesCachedOptions > > > Key: HDDS-2336 > URL: https://issues.apache.org/jira/browse/HDDS-2336 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Affects Versions: 0.5.0 >Reporter: Attila Doroszlai >Assignee: Attila Doroszlai >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 20m > Remaining Estimate: 0h > > TestKeyValueContainer#testRocksDBCreateUsesCachedOptions, introduced in > HDDS-2283, is failing: > {noformat:title=https://github.com/elek/ozone-ci-q4/blob/master/pr/pr-hdds-2283-cnrrq/unit/hadoop-hdds/container-service/org.apache.hadoop.ozone.container.keyvalue.TestKeyValueContainer.txt} > testRocksDBCreateUsesCachedOptions(org.apache.hadoop.ozone.container.keyvalue.TestKeyValueContainer) > Time elapsed: 0.135 s <<< FAILURE! > java.lang.AssertionError: expected:<1> but was:<11> > at org.junit.Assert.fail(Assert.java:88) > at org.junit.Assert.failNotEquals(Assert.java:743) > at org.junit.Assert.assertEquals(Assert.java:118) > at org.junit.Assert.assertEquals(Assert.java:555) > at org.junit.Assert.assertEquals(Assert.java:542) > at > org.apache.hadoop.ozone.container.keyvalue.TestKeyValueContainer.testRocksDBCreateUsesCachedOptions(TestKeyValueContainer.java:406) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2280) HddsUtils#CheckForException should not return null in case the ratis exception cause is not set
[ https://issues.apache.org/jira/browse/HDDS-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDDS-2280. - Fix Version/s: 0.5.0 Resolution: Fixed Thanks for the contribution [~shashikant] and [~bharat] for the review. I have committed this. > HddsUtils#CheckForException should not return null in case the ratis > exception cause is not set > --- > > Key: HDDS-2280 > URL: https://issues.apache.org/jira/browse/HDDS-2280 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Affects Versions: 0.5.0 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 40m > Remaining Estimate: 0h > > HddsUtils#CheckForException checks that the cause is set properly to one of > the defined/expected exceptions. In case Ratis throws any runtime > exception, HddsUtils#CheckForException can return null and lead to a > NullPointerException during write. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
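A minimal sketch of the fix direction this issue describes: fall back to a wrapped generic exception instead of returning null when the cause chain contains none of the expected types. The class name and the simplified type check below are illustrative stand-ins, not the actual HddsUtils code.

```java
import java.io.IOException;

public final class CheckForExceptionSketch {
    // Walks the cause chain looking for an expected exception type
    // (simplified here to IOException); never returns null.
    static IOException checkForException(Throwable t) {
        for (Throwable cause = t; cause != null; cause = cause.getCause()) {
            if (cause instanceof IOException) {
                return (IOException) cause;   // one of the expected types
            }
        }
        // Previously this path effectively returned null; wrap instead so
        // callers on the write path cannot hit a NullPointerException.
        return new IOException(t);
    }
}
```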
[jira] [Resolved] (HDDS-2281) ContainerStateMachine#handleWriteChunk should ignore close container exception
[ https://issues.apache.org/jira/browse/HDDS-2281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDDS-2281. - Resolution: Fixed Thanks for the contribution [~shashikant]. I have committed this. > ContainerStateMachine#handleWriteChunk should ignore close container > exception > --- > > Key: HDDS-2281 > URL: https://issues.apache.org/jira/browse/HDDS-2281 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > Currently, ContainerStateMachine#applyTransaction ignores the close container > exception. Similarly, the ContainerStateMachine#handleWriteChunk call should also > ignore the close container exception. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2283) Container creation on datanodes take time because of Rocksdb option creation.
[ https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDDS-2283. - Resolution: Fixed > Container creation on datanodes take time because of Rocksdb option creation. > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Attachments: HDDS-2283.00.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be a table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2311) Fix logic of RetryPolicy in OzoneClientSideTranslatorPB
[ https://issues.apache.org/jira/browse/HDDS-2311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2311: --- Assignee: Hanisha Koneru > Fix logic of RetryPolicy in OzoneClientSideTranslatorPB > --- > > Key: HDDS-2311 > URL: https://issues.apache.org/jira/browse/HDDS-2311 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Bharat Viswanadham >Assignee: Hanisha Koneru >Priority: Blocker > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > OzoneManagerProtocolClientSideTranslatorPB.java > L251: if (cause instanceof NotLeaderException) { > NotLeaderException notLeaderException = (NotLeaderException) cause; > omFailoverProxyProvider.performFailoverIfRequired( > notLeaderException.getSuggestedLeaderNodeId()); > return getRetryAction(RetryAction.RETRY, retries, failovers); > } > > The suggested leader returned from Server is not used during failOver, as the > cause is a type of RemoteException. So with current code, it does not use > suggested leader for failOver at all and by default with each OM, it tries > max retries. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
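The failure mode described above can be sketched as follows. The nested classes are hypothetical stand-ins for the real Hadoop RemoteException and Ratis NotLeaderException; the point is that the remote exception must be unwrapped before the instanceof check, otherwise the suggested leader is never seen and every OM gets the full retry count.

```java
import java.io.IOException;

public class RetrySketch {
    // Stand-ins for the real Hadoop/Ozone types (assumptions, not the real API).
    static class RemoteException extends IOException {
        private final IOException wrapped;
        RemoteException(IOException wrapped) { this.wrapped = wrapped; }
        IOException unwrapRemoteException() { return wrapped; }
    }
    static class NotLeaderException extends IOException {
        private final String suggestedLeader;
        NotLeaderException(String suggestedLeader) { this.suggestedLeader = suggestedLeader; }
        String getSuggestedLeaderNodeId() { return suggestedLeader; }
    }

    // Returns the suggested leader node id if one can be extracted, else null.
    static String suggestedLeaderOf(Exception cause) {
        if (cause instanceof RemoteException) {
            // Without this unwrap step the instanceof below never matches,
            // which is the bug described in the issue.
            cause = ((RemoteException) cause).unwrapRemoteException();
        }
        if (cause instanceof NotLeaderException) {
            return ((NotLeaderException) cause).getSuggestedLeaderNodeId();
        }
        return null;
    }
}
```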
[jira] [Commented] (HDDS-2283) Container creation on datanodes take time because of Rocksdb option creation.
[ https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955410#comment-16955410 ] Mukul Kumar Singh commented on HDDS-2283: - Thanks for the contribution [~swagle] and [~avijayan] and [~aengineer] for the review. I have committed this. > Container creation on datanodes take time because of Rocksdb option creation. > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Attachments: HDDS-2283.00.patch > > Time Spent: 20m > Remaining Estimate: 0h > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be a table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2283) Container creation on datanodes take time because of Rocksdb option creation.
[ https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2283: Summary: Container creation on datanodes take time because of Rocksdb option creation. (was: Container Creation on datanodes take around 300ms due to rocksdb creation) > Container creation on datanodes take time because of Rocksdb option creation. > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Attachments: HDDS-2283.00.patch > > Time Spent: 10m > Remaining Estimate: 0h > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be a table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2286) Add a log info in ozone client and scm to print the exclusion list during allocate block
[ https://issues.apache.org/jira/browse/HDDS-2286?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDDS-2286. - Resolution: Fixed Thanks for the contribution [~swagle] and [~adoroszlai] for the review. I have committed this. > Add a log info in ozone client and scm to print the exclusion list during > allocate block > > > Key: HDDS-2286 > URL: https://issues.apache.org/jira/browse/HDDS-2286 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Shashikant Banerjee >Assignee: Siddharth Wagle >Priority: Major > Labels: pull-request-available > Time Spent: 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-14884: - Attachment: HDFS-14884.002.patch > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Priority: Major > Attachments: HDFS-14884.001.patch, HDFS-14884.002.patch, > hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955068#comment-16955068 ] Mukul Kumar Singh commented on HDFS-14884: -- [~xyao] [~daryn] [~weichiu] I have uploaded a patch based on the discussion. Please have a look. > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Priority: Major > Attachments: HDFS-14884.001.patch, hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-14884: - Attachment: HDFS-14884.001.patch > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Priority: Major > Attachments: HDFS-14884.001.patch, hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-14884: - Status: Patch Available (was: Open) > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Priority: Major > Attachments: HDFS-14884.001.patch, hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2324) Enhance locking mechanism in OzoneManager
[ https://issues.apache.org/jira/browse/HDDS-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954449#comment-16954449 ] Mukul Kumar Singh commented on HDDS-2324: - cc [~bharat] [~arp] [~hanishakoneru] [~nanda] > Enhance locking mechanism in OzoneManager > - > > Key: HDDS-2324 > URL: https://issues.apache.org/jira/browse/HDDS-2324 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Rajesh Balamohan >Priority: Major > Labels: performance > Attachments: om_lock_100_percent_read_benchmark.svg, > om_lock_reader_and_writer_workload.svg > > > OM has a reentrant RW lock. With 100% read or 100% write benchmarks, it works > out reasonably fine. There is already a ticket to optimize the write codepath > (as it incurs reading from DB for key checks). > However, when a small amount of write workload (e.g. 3-5 threads) is added to > the running read benchmark, throughput suffers significantly. This is due to > the fact that the reader threads would get blocked often. I have observed > around 10x slower throughput (i.e. the 100% read benchmark was running at 12,000 > TPS and with a couple of writer threads added to it, it goes down to 1200-1800 > TPS). > 1. Instead of a single write lock, one option could be to scale out the > write lock depending on the number of cores available in the system and > acquire the relevant lock by hashing the key. > 2. Another option is to explore if we can make use of StampedLocks of JDK > 8.x, which scales well when multiple readers and writers are there. But it is > not a reentrant lock. So need to explore whether it can be an option or not. > > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
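Option 2 above can be illustrated with a small self-contained example of JDK 8 StampedLock's optimistic-read pattern. The counter here is a hypothetical stand-in for OM metadata, not the actual OM lock; note the non-reentrancy caveat raised in the issue still applies.

```java
import java.util.concurrent.locks.StampedLock;

public class StampedCounter {
    private final StampedLock lock = new StampedLock();
    private long value;

    public void increment() {
        long stamp = lock.writeLock();
        try {
            value++;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    public long read() {
        long stamp = lock.tryOptimisticRead();   // no blocking on the hot read path
        long snapshot = value;
        if (!lock.validate(stamp)) {             // a writer slipped in; retry under a real read lock
            stamp = lock.readLock();
            try {
                snapshot = value;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return snapshot;
    }
}
```

Because most reads validate successfully, readers rarely block even while writers are active, which is why StampedLock tends to scale better than a fair reentrant RW lock under mixed workloads.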
[jira] [Assigned] (HDDS-2328) Support large-scale listing
[ https://issues.apache.org/jira/browse/HDDS-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2328: --- Assignee: Hanisha Koneru > Support large-scale listing > > > Key: HDDS-2328 > URL: https://issues.apache.org/jira/browse/HDDS-2328 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Rajesh Balamohan >Assignee: Hanisha Koneru >Priority: Major > Labels: performance > > Large-scale listing of directory contents takes a long time and also > has the potential to run into OOM. I have > 1 million entries in the same > level and it took a long time (it didn't complete as it was stuck in > RDB::seek). > S3A batches it with 5K listing per fetch IIRC. It would be good to have this > feature in ozone as well. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
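The batched-listing approach mentioned above (S3A-style pages of ~5K entries, resuming after the last key seen) can be sketched as follows. Source.listPage is a hypothetical stand-in for an OM listing RPC, not an actual Ozone interface.

```java
import java.util.ArrayList;
import java.util.List;

public final class PagedLister {
    static final int PAGE_SIZE = 5000;   // illustrative, mirroring the ~5K S3A page size

    interface Source {
        // Returns up to 'limit' keys strictly after 'startAfter' (null = start).
        List<String> listPage(String startAfter, int limit);
    }

    static List<String> listAll(Source src) {
        List<String> all = new ArrayList<>();
        String cursor = null;
        while (true) {
            List<String> page = src.listPage(cursor, PAGE_SIZE);
            if (page.isEmpty()) {
                break;
            }
            all.addAll(page);
            cursor = page.get(page.size() - 1);   // resume after the last key seen
            if (page.size() < PAGE_SIZE) {
                break;                            // short page means end of listing
            }
        }
        return all;
    }
}
```

Each RPC then touches a bounded slice of the key space, so neither side has to hold a million entries in memory at once.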
[jira] [Commented] (HDDS-2323) Mem allocation: Optimise AuditMessage::build()
[ https://issues.apache.org/jira/browse/HDDS-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954221#comment-16954221 ] Mukul Kumar Singh commented on HDDS-2323: - cc: [~dchitlangia] > Mem allocation: Optimise AuditMessage::build() > -- > > Key: HDDS-2323 > URL: https://issues.apache.org/jira/browse/HDDS-2323 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Rajesh Balamohan >Priority: Major > Labels: performance > Attachments: Screenshot 2019-10-18 at 8.24.52 AM.png > > > String format allocates/processes more than > {color:#00}OzoneAclUtil.fromProtobuf in write benchmark.{color} > {color:#00}Would be good to use + instead of format.{color} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
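A minimal illustration of the suggested change: plain concatenation (which javac compiles to a StringBuilder chain) versus String.format, which re-parses its pattern string on every call. The field layout loosely mimics the audit log lines quoted in HDDS-2335 but is otherwise illustrative.

```java
public final class AuditLine {
    // Format-based variant: the pattern is parsed on every invocation.
    static String withFormat(String user, String ip, String op, String ret) {
        return String.format("user=%s | ip=%s | op=%s | ret=%s", user, ip, op, ret);
    }

    // Concatenation variant: compiled to a single StringBuilder chain,
    // avoiding the per-call pattern parsing and boxing overhead.
    static String withConcat(String user, String ip, String op, String ret) {
        return "user=" + user + " | ip=" + ip + " | op=" + op + " | ret=" + ret;
    }
}
```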
[jira] [Assigned] (HDDS-2145) Optimize client read path by reading multiple chunks along with block info in a single rpc call.
[ https://issues.apache.org/jira/browse/HDDS-2145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2145: --- Assignee: Hanisha Koneru (was: Shashikant Banerjee) > Optimize client read path by reading multiple chunks along with block info in > a single rpc call. > > > Key: HDDS-2145 > URL: https://issues.apache.org/jira/browse/HDDS-2145 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client, Ozone Datanode >Reporter: Shashikant Banerjee >Assignee: Hanisha Koneru >Priority: Major > Fix For: 0.5.0 > > > Currently, ozone client issues a getBlock call to read the metadata info from > rocks Db on dn to get the chunkInfo and then chunk info is read one by one > inn separate rpc calls in the read path. This can be optimized by > piggybacking readChunk calls along with getBlock in a single rpc call to dn. > This Jira aims to address this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2318) Avoid proto::tostring in preconditions to save CPU cycles
[ https://issues.apache.org/jira/browse/HDDS-2318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953743#comment-16953743 ] Mukul Kumar Singh commented on HDDS-2318: - cc [~arp], [~nanda], [~bharat], [~hanishakoneru] > Avoid proto::tostring in preconditions to save CPU cycles > - > > Key: HDDS-2318 > URL: https://issues.apache.org/jira/browse/HDDS-2318 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Rajesh Balamohan >Assignee: Bharat Viswanadham >Priority: Major > Labels: performance > Attachments: Screenshot 2019-10-17 at 6.10.22 PM.png > > > [https://github.com/apache/hadoop-ozone/blob/61f4aa30f502b34fd778d9b37b1168721abafb2f/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/protocolPB/OzoneManagerProtocolServerSideTranslatorPB.java#L117] > > This ends up converting proto toString in precondition checks and burns CPU > cycles. {{request.toString()}} can be added in debug log on need basis. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
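The cost described above comes from the message argument being evaluated eagerly, whether or not the precondition fails. A sketch of the difference follows, with a Supplier deferring the expensive rendering; render() is a stand-in for proto toString(), and the counter just makes the evaluation visible.

```java
import java.util.function.Supplier;

public final class LazyCheck {
    static int renders = 0;

    static String render() {          // stand-in for an expensive proto toString()
        renders++;
        return "request{...}";
    }

    // Eager variant: the message is built even when the check passes,
    // mirroring checkNotNull(request, request.toString()).
    static void checkEager(Object o) {
        String msg = render();        // always pays the rendering cost
        if (o == null) throw new NullPointerException(msg);
    }

    // Lazy variant: the message is only rendered on failure.
    static void checkLazy(Object o, Supplier<String> msg) {
        if (o == null) throw new NullPointerException(msg.get());
    }
}
```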
[jira] [Assigned] (HDDS-2318) Avoid proto::tostring in preconditions to save CPU cycles
[ https://issues.apache.org/jira/browse/HDDS-2318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2318: --- Assignee: Bharat Viswanadham > Avoid proto::tostring in preconditions to save CPU cycles > - > > Key: HDDS-2318 > URL: https://issues.apache.org/jira/browse/HDDS-2318 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Rajesh Balamohan >Assignee: Bharat Viswanadham >Priority: Major > Labels: performance > Attachments: Screenshot 2019-10-17 at 6.10.22 PM.png > > > [https://github.com/apache/hadoop-ozone/blob/61f4aa30f502b34fd778d9b37b1168721abafb2f/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/protocolPB/OzoneManagerProtocolServerSideTranslatorPB.java#L117] > > This ends up converting proto toString in precondition checks and burns CPU > cycles. {{request.toString()}} can be added in debug log on need basis. > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2283) Container Creation on datanodes take around 300ms due to rocksdb creation
[ https://issues.apache.org/jira/browse/HDDS-2283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2283: Description: Container Creation on datanodes take around 300ms due to rocksdb creation. Rocksdb creation is taking a considerable time and this needs to be optimized. Creating a rocksdb per disk should be enough and each container can be table inside the rocksdb. was:Container Creation on datanodes take around 300ms due to rocksdb creation. Rocksdb creation is taking a considerable time and this needs to be optimized. > Container Creation on datanodes take around 300ms due to rocksdb creation > - > > Key: HDDS-2283 > URL: https://issues.apache.org/jira/browse/HDDS-2283 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Reporter: Mukul Kumar Singh >Assignee: Siddharth Wagle >Priority: Major > Attachments: HDDS-2283.00.patch > > > Container Creation on datanodes take around 300ms due to rocksdb creation. > Rocksdb creation is taking a considerable time and this needs to be optimized. > Creating a rocksdb per disk should be enough and each container can be table > inside the rocksdb. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2309) Optimise OzoneManagerDoubleBuffer::flushTransactions to flush in batches
[ https://issues.apache.org/jira/browse/HDDS-2309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16952037#comment-16952037 ] Mukul Kumar Singh commented on HDDS-2309: - cc: [~arp][~bharat][~hanishakoneru] > Optimise OzoneManagerDoubleBuffer::flushTransactions to flush in batches > > > Key: HDDS-2309 > URL: https://issues.apache.org/jira/browse/HDDS-2309 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Rajesh Balamohan >Priority: Major > Attachments: Screenshot 2019-10-15 at 4.19.13 PM.png > > > When running a write heavy benchmark, > {{{color:#00}org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.flushTransactions{color}}} > was invoked for pretty much every write. > This forces {{cleanupCache}} to be invoked which ends up choking in single > thread executor. Attaching the profiler information which gives more details. > Ideally, {{flushTransactions}} should batch up the work to reduce load on > rocksDB. > > [https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java#L130] > > [https://github.com/apache/hadoop-ozone/blob/master/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java#L322] > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
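The batching idea can be sketched as follows: drain all currently queued transactions into one batch so the flush and cache cleanup run once per batch instead of once per entry. The queue and list below are simplified stand-ins for the double buffer and RocksDB, not the actual OzoneManagerDoubleBuffer API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public final class BatchFlusher {
    private final BlockingQueue<String> pending = new ArrayBlockingQueue<>(1024);
    private final List<List<String>> flushedBatches = new ArrayList<>();

    void submit(String txn) {
        pending.offer(txn);              // in the real code, the request-handler side
    }

    // One iteration of the flush loop: drain everything queued right now and
    // commit it as a single batch, so cleanupCache runs once per batch rather
    // than once per transaction.
    void flushOnce() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (!batch.isEmpty()) {
            flushedBatches.add(batch);   // stands in for one RocksDB write-batch commit
        }
    }

    List<List<String>> batches() {
        return flushedBatches;
    }
}
```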
[jira] [Commented] (HDDS-2308) Switch to centos with the apache/ozone-build docker image
[ https://issues.apache.org/jira/browse/HDDS-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16951714#comment-16951714 ] Mukul Kumar Singh commented on HDDS-2308: - >From the stacktrace not able to make out whether this is coming from >Datanode,SCM or OM. Also this seems to be happening as part of DumpLogs. > Switch to centos with the apache/ozone-build docker image > - > > Key: HDDS-2308 > URL: https://issues.apache.org/jira/browse/HDDS-2308 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Marton Elek >Assignee: Marton Elek >Priority: Major > Attachments: hs_err_pid16346.log > > > I realized multiple JVM crashes in the daily builds: > > {code:java} > ERROR] ExecutionException The forked VM terminated without properly saying > goodbye. VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractRename > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter5429192218879128313.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7227403571189445391tmp > surefire_1011197392458143645283tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] Crashed tests: > > > [ERROR] org.apache.hadoop.fs.ozone.contract.ITestOzoneContractDistCp > > > [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter1355604543311368443.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire3938612864214747736tmp > surefire_933162535733309260236tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 > > > [ERROR] ExecutionException The forked VM terminated without properly > saying goodbye. VM crash or System.exit called? 
> > > [ERROR] Command was /bin/sh -c cd /workdir/hadoop-ozone/ozonefs && > /usr/lib/jvm/java-1.8-openjdk/jre/bin/java -Xmx2048m > -XX:+HeapDumpOnOutOfMemoryError -jar > /workdir/hadoop-ozone/ozonefs/target/surefire/surefirebooter9018689154779946208.jar > /workdir/hadoop-ozone/ozonefs/target/surefire > 2019-10-06T14-52-40_697-jvmRun1 surefire7569723928289175829tmp > surefire_947955725320624341206tmp > > > [ERROR] Error occurred in starting fork, check output in log > > > [ERROR] Process Exit Code: 139 {code} > > Based on the crash log (uploaded) it's related to the rocksdb JNI interface. > In the current ozone-build docker image (which provides the environment for > build) we use alpine where musl libc is used instead of the main glibc. I > think it would be more safe to use the same glibc what is used in production. > I tested with centos based docker image and it seems to be more stable. > Didn't see any more JVM crashes. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2204) Avoid buffer copying in checksum verification
[ https://issues.apache.org/jira/browse/HDDS-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDDS-2204. - Fix Version/s: 0.5.0 Resolution: Fixed I have committed this to master. Thanks for the contribution [~szetszwo] and [~shashikant] for the review. > Avoid buffer copying in checksum verification > > > Key: HDDS-2204 > URL: https://issues.apache.org/jira/browse/HDDS-2204 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Client >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: o2204_20190930.patch, o2204_20190930b.patch, > o2204_20191001.patch > > Time Spent: 1h > Remaining Estimate: 0h > > In Checksum.verifyChecksum(ByteString, ..), it first converts the ByteString > to a byte array. This leads to an unnecessary buffer copy. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
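The optimization can be illustrated with plain NIO buffers (the actual patch operates on protobuf ByteString, so the types below are stand-ins): verifying against a read-only view of the same memory avoids materialising a fresh byte[] copy per verification.

```java
import java.nio.ByteBuffer;

public final class ChecksumView {
    // Copying variant: allocates and fills a fresh array per verification,
    // analogous to ByteString.toByteArray().
    static long checksumByCopy(ByteBuffer data) {
        byte[] copy = new byte[data.remaining()];
        data.duplicate().get(copy);      // duplicate() so the caller's position is untouched
        long sum = 0;
        for (byte b : copy) {
            sum += b & 0xff;
        }
        return sum;
    }

    // Zero-copy variant: walks a read-only duplicate of the same memory,
    // analogous to ByteString.asReadOnlyByteBuffer().
    static long checksumByView(ByteBuffer data) {
        ByteBuffer view = data.asReadOnlyBuffer();
        long sum = 0;
        while (view.hasRemaining()) {
            sum += view.get() & 0xff;
        }
        return sum;
    }
}
```

The simple byte-sum stands in for the real checksum function; both variants return the same value, but only the first pays an allocation and a copy.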
[jira] [Created] (HDDS-2306) Fix TestWatchForCommit failure
Mukul Kumar Singh created HDDS-2306: --- Summary: Fix TestWatchForCommit failure Key: HDDS-2306 URL: https://issues.apache.org/jira/browse/HDDS-2306 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Affects Versions: 0.4.1 Reporter: Mukul Kumar Singh {code} [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 203.385 s <<< FAILURE! - in org.apache.hadoop.ozone.client.rpc.TestWatchForCommit [ERROR] test2WayCommitForTimeoutException(org.apache.hadoop.ozone.client.rpc.TestWatchForCommit) Time elapsed: 27.093 s <<< ERROR! java.util.concurrent.TimeoutException at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915) at org.apache.hadoop.hdds.scm.XceiverClientRatis.watchForCommit(XceiverClientRatis.java:283) at org.apache.hadoop.ozone.client.rpc.TestWatchForCommit.test2WayCommitForTimeoutException(TestWatchForCommit.java:391) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) at 
org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) at org.junit.runners.ParentRunner.run(ParentRunner.java:309) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2305) Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HDDS-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2305: Summary: Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT) (was: Update Ozone to later ratis snapshot.) > Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT) > - > > Key: HDDS-2305 > URL: https://issues.apache.org/jira/browse/HDDS-2305 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Mukul Kumar Singh >Priority: Major > > This jira will update Ozone to the latest Ratis snapshot, corresponding to > the commit: > {code} > commit 3f446aaf27704b0bf929bd39887637a6a71b4418 (HEAD -> master, > origin/master, origin/HEAD) > Author: Tsz Wo Nicholas Sze > Date: Fri Oct 11 16:35:38 2019 +0800 > RATIS-705. GrpcClientProtocolClient#close Interrupts itself. Contributed > by Lokesh Jain > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2305) Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT)
[ https://issues.apache.org/jira/browse/HDDS-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2305: --- Assignee: Mukul Kumar Singh > Update Ozone to latest ratis snapshot(0.5.0-3f446aa-SNAPSHOT) > - > > Key: HDDS-2305 > URL: https://issues.apache.org/jira/browse/HDDS-2305 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > > This jira will update Ozone to the latest Ratis snapshot, corresponding to > the commit: > {code} > commit 3f446aaf27704b0bf929bd39887637a6a71b4418 (HEAD -> master, > origin/master, origin/HEAD) > Author: Tsz Wo Nicholas Sze > Date: Fri Oct 11 16:35:38 2019 +0800 > RATIS-705. GrpcClientProtocolClient#close Interrupts itself. Contributed > by Lokesh Jain > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2305) Update Ozone to later ratis snapshot.
Mukul Kumar Singh created HDDS-2305: --- Summary: Update Ozone to later ratis snapshot. Key: HDDS-2305 URL: https://issues.apache.org/jira/browse/HDDS-2305 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Mukul Kumar Singh This jira will update Ozone to the latest Ratis snapshot, corresponding to the commit: {code} commit 3f446aaf27704b0bf929bd39887637a6a71b4418 (HEAD -> master, origin/master, origin/HEAD) Author: Tsz Wo Nicholas Sze Date: Fri Oct 11 16:35:38 2019 +0800 RATIS-705. GrpcClientProtocolClient#close Interrupts itself. Contributed by Lokesh Jain {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2285) GetBlock and ReadChunk command from the client should be sent to the same datanode to re-use the same connection
[ https://issues.apache.org/jira/browse/HDDS-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2285: --- Assignee: Hanisha Koneru > GetBlock and ReadChunk command from the client should be sent to the same > datanode to re-use the same connection > > > Key: HDDS-2285 > URL: https://issues.apache.org/jira/browse/HDDS-2285 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Mukul Kumar Singh >Assignee: Hanisha Koneru >Priority: Major > > It can be observed that the GetBlock and ReadChunk commands are sent to two > different datanodes. They should be sent to the same datanode to re-use the > connection. > {code} > 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command GetBlock to > datanode 172.26.32.224 > 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command ReadChunk to > datanode 172.26.32.231 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2285) GetBlock and ReadChunk command from the client should be sent to the same datanode to re-use the same connection
Mukul Kumar Singh created HDDS-2285: --- Summary: GetBlock and ReadChunk command from the client should be sent to the same datanode to re-use the same connection Key: HDDS-2285 URL: https://issues.apache.org/jira/browse/HDDS-2285 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Reporter: Mukul Kumar Singh It can be observed that the GetBlock and ReadChunk commands are sent to two different datanodes. They should be sent to the same datanode to re-use the connection. {code} 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command GetBlock to datanode 172.26.32.224 19/10/10 00:43:42 INFO scm.XceiverClientGrpc: Send command ReadChunk to datanode 172.26.32.231 {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2284) XceiverClientMetrics should be initialised as part of XceiverClientManager constructor
Mukul Kumar Singh created HDDS-2284: --- Summary: XceiverClientMetrics should be initialised as part of XceiverClientManager constructor Key: HDDS-2284 URL: https://issues.apache.org/jira/browse/HDDS-2284 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Affects Versions: 0.4.0 Reporter: Mukul Kumar Singh XceiverClientMetrics is currently initialized in the read/write path; the metric should instead be initialized while creating the XceiverClientManager. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
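The pattern HDDS-2284 asks for, eager initialization in the constructor instead of lazy initialization in the I/O path, can be sketched in a few lines. This is a simplified, hypothetical stand-in: `ClientManager` and `Metrics` model `XceiverClientManager` and `XceiverClientMetrics`, not their real APIs.

```java
import java.util.concurrent.atomic.AtomicLong;

public class ClientManager {
  // Hypothetical simplified stand-in for XceiverClientMetrics.
  static final class Metrics {
    final AtomicLong opCount = new AtomicLong();
  }

  private final Metrics metrics;

  // Eager initialization: the metrics object exists before any I/O starts,
  // so the hot read/write path never pays a null check, a lock, or an
  // init-race hazard.
  ClientManager() {
    this.metrics = new Metrics();
  }

  long recordOp() {
    // Hot path: a plain final-field access plus an atomic increment.
    return metrics.opCount.incrementAndGet();
  }
}
```

The `final` field also gives safe publication for free: every thread that obtains the manager sees fully constructed metrics, which a lazily initialized mutable field would not guarantee without extra synchronization.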
[jira] [Created] (HDDS-2283) Container Creation on datanodes takes around 300ms due to rocksdb creation
Mukul Kumar Singh created HDDS-2283: --- Summary: Container Creation on datanodes takes around 300ms due to rocksdb creation Key: HDDS-2283 URL: https://issues.apache.org/jira/browse/HDDS-2283 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Reporter: Mukul Kumar Singh Container creation on datanodes takes around 300 ms due to rocksdb creation. Rocksdb creation takes a considerable time and needs to be optimized. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2260) Avoid evaluation of LOG.trace and LOG.debug statement in the read/write path
Mukul Kumar Singh created HDDS-2260: --- Summary: Avoid evaluation of LOG.trace and LOG.debug statement in the read/write path Key: HDDS-2260 URL: https://issues.apache.org/jira/browse/HDDS-2260 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client, Ozone Datanode Affects Versions: 0.4.0 Reporter: Mukul Kumar Singh Arguments to LOG.trace and LOG.debug statements are evaluated even when debug/trace logging is disabled. This jira proposes to wrap all trace/debug logging with LOG.isTraceEnabled and LOG.isDebugEnabled checks to avoid that evaluation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
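The cost HDDS-2260 targets is easy to demonstrate: the argument expression of a log call runs even when the level is disabled, unless an `isDebugEnabled()` guard skips it. The sketch below uses a tiny hypothetical `Logger` stand-in rather than the real slf4j API, purely so the evaluation count is observable.

```java
public class GuardedLogging {
  // Minimal hypothetical stand-in for an slf4j-style logger.
  static final class Logger {
    boolean isDebugEnabled() { return false; }  // debug disabled
    void debug(String msg) { /* message discarded */ }
  }

  static final Logger LOG = new Logger();
  static int expensiveCalls = 0;

  // Simulates an expensive log argument (string building, serialization, ...).
  static String expensiveDescription() {
    expensiveCalls++;
    return "big-request-dump";
  }

  static void unguarded() {
    // The argument is evaluated even though debug logging is off.
    LOG.debug("request: " + expensiveDescription());
  }

  static void guarded() {
    // The guard skips argument evaluation entirely when debug is disabled.
    if (LOG.isDebugEnabled()) {
      LOG.debug("request: " + expensiveDescription());
    }
  }
}
```

Note that slf4j's `{}` placeholders avoid the string concatenation but still evaluate each argument expression, so a guard (or a lambda-based API) is the way to skip genuinely expensive arguments on the read/write path.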
[jira] [Created] (HDDS-2235) Ozone Datanode web page doesn't exist
Mukul Kumar Singh created HDDS-2235: --- Summary: Ozone Datanode web page doesn't exist Key: HDDS-2235 URL: https://issues.apache.org/jira/browse/HDDS-2235 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Affects Versions: 0.4.0 Reporter: Mukul Kumar Singh On trying to access the dn UI, the following error is seen. http://dn_ip:9882/ {code} HTTP ERROR 403 Problem accessing /. Reason: Forbidden {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2215) SCM should exclude datanode if the pipeline initialisation fails
Mukul Kumar Singh created HDDS-2215: --- Summary: SCM should exclude datanode if the pipeline initialisation fails Key: HDDS-2215 URL: https://issues.apache.org/jira/browse/HDDS-2215 Project: Hadoop Distributed Data Store Issue Type: Bug Components: SCM Reporter: Mukul Kumar Singh One of the nodes, y131, is not accessible; however, the RatisPipelineProvider keeps choosing the same node for the pipeline initialization. {code} 2019-10-01 06:03:46,023 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager: Registered Data node : b647db83-836d-41d5-bf8d-a04cb816025e{ip: 172.26.32.233, host: y133, networkLocation: /default-rack, certSerialId: null} 2019-10-01 06:03:46,044 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager: Registered Data node : cdf3c007-cf76-4997-85fc-a3385d826053{ip: 172.26.32.231, host: y131, networkLocation: /default-rack, certSerialId: null} 2019-10-01 06:03:46,099 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager: Registered Data node : 1c699d4f-28a1-41ae-aa9c-8358f52b5d8d{ip: 172.26.32.230, host: y130, networkLocation: /default-rack, certSerialId: null} 2019-10-01 06:03:46,106 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager: Registered Data node : feba726b-2fcc-4b37-b112-8ed2e9fc8f94{ip: 172.26.32.224, host: y124, networkLocation: /default-rack, certSerialId: null} 2019-10-01 06:03:46,146 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager: Registered Data node : 9c78c807-be23-415b-b1a2-5eaf6e8925b8{ip: 172.26.32.226, host: y126, networkLocation: /default-rack, certSerialId: null} 2019-10-01 06:03:46,235 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager: Registered Data node : 0f6c93d2-c63f-4d1a-b57a-6012dd097bd1{ip: 172.26.32.225, host: y125, networkLocation: /default-rack, certSerialId: null} 2019-10-01 06:03:46,395 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager: Registered Data node : 3e4db9bd-20ee-4e2a-8512-fddd37bf5cc2{ip: 172.26.32.228, host: y128.l42scl.hortonworks.com, networkLocation: /default-rack, certSerialId: null} 2019-10-01 
06:03:46,395 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager: Registered Data node : ba096716-6942-4358-bb21-84623fd06d2c{ip: 172.26.32.232, host: y132, networkLocation: /default-rack, certSerialId: null} 2019-10-01 06:03:46,440 INFO org.apache.hadoop.hdds.scm.node.SCMNodeManager: Registered Data node : 935dd070-8497-4b7d-a0be-ecb115586ed3{ip: 172.26.32.227, host: y127.l42scl.hortonworks.com, networkLocation: /default-rack, certSerialId: null} 2019-10-01 06:03:47,370 INFO org.apache.hadoop.hdds.scm.pipeline.PipelineStateManager: Created pipeline Pipeline[ Id: 8ba7dda5-fcf6-45e3-a333-f4811311d34a, Nodes: b647db83-836d-41d5-bf8d-a04cb816025e{ip: 172.26.32.233, host: y133, networkLocation: /default-rack, certSerialId: null}, Type:RATIS, Factor:ONE, State:OPEN] {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
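One way to realize the behaviour HDDS-2215 asks for is an exclusion list consulted during placement: once pipeline creation fails on a node, skip it in subsequent attempts. The sketch below is a hypothetical simplification, not the real `RatisPipelineProvider`; a production version would also need expiry or health-based re-inclusion so a transiently unreachable node is not excluded forever.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PipelinePlacement {
  // Datanodes whose last pipeline initialization failed (hypothetical
  // policy: a real implementation would age these entries out).
  private final Set<String> excluded = new HashSet<>();

  void markFailed(String datanode) {
    excluded.add(datanode);
  }

  // Choose pipeline members from the healthy-node list, skipping any
  // node that recently failed pipeline initialization.
  List<String> choose(List<String> healthyNodes, int factor) {
    List<String> candidates = new ArrayList<>();
    for (String dn : healthyNodes) {
      if (!excluded.contains(dn)) {
        candidates.add(dn);
      }
    }
    if (candidates.size() < factor) {
      throw new IllegalStateException("Not enough eligible datanodes");
    }
    return candidates.subList(0, factor);
  }
}
```

With y131 marked failed, repeated `choose` calls would stop handing it out, which is exactly the loop the log above shows the SCM failing to break.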
[jira] [Updated] (HDDS-1615) ManagedChannel references are being leaked in ReplicationSupervisor.java
[ https://issues.apache.org/jira/browse/HDDS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-1615: Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the review [~aengineer]. I have committed this. > ManagedChannel references are being leaked in ReplicationSupervisor.java > > > Key: HDDS-1615 > URL: https://issues.apache.org/jira/browse/HDDS-1615 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: MiniOzoneChaosCluster, pull-request-available > Fix For: 0.5.0 > > Time Spent: 40m > Remaining Estimate: 0h > > ManagedChannel references are being leaked in ReplicationSupervisor.java > {code} > May 30, 2019 8:10:56 AM > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference > cleanQueue > SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=1495, > target=192.168.0.3:49868} was not shutdown properly!!! ~*~*~* > Make sure to call shutdown()/shutdownNow() and wait until > awaitTermination() returns true. 
> java.lang.RuntimeException: ManagedChannel allocation site > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.(ManagedChannelOrphanWrapper.java:103) > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:53) > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:44) > at > org.apache.ratis.thirdparty.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:411) > at > org.apache.hadoop.ozone.container.replication.GrpcReplicationClient.(GrpcReplicationClient.java:65) > at > org.apache.hadoop.ozone.container.replication.SimpleContainerDownloader.getContainerDataFromReplicas(SimpleContainerDownloader.java:87) > at > org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.replicate(DownloadAndImportReplicator.java:118) > at > org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:115) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941254#comment-16941254 ] Mukul Kumar Singh commented on HDFS-14884: -- [^hdfs_distcp.patch] reproduces the failure > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Priority: Major > Attachments: hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as the feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
[ https://issues.apache.org/jira/browse/HDFS-14884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDFS-14884: - Attachment: hdfs_distcp.patch > Add sanity check that zone key equals feinfo key while setting Xattrs > - > > Key: HDFS-14884 > URL: https://issues.apache.org/jira/browse/HDFS-14884 > Project: Hadoop HDFS > Issue Type: Bug > Components: encryption, hdfs >Reporter: Mukul Kumar Singh >Priority: Major > Attachments: hdfs_distcp.patch > > > Currently, it is possible to set an external attribute where the zone key is > not the same as feinfo key. This jira will add a precondition before setting > this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14884) Add sanity check that zone key equals feinfo key while setting Xattrs
Mukul Kumar Singh created HDFS-14884: Summary: Add sanity check that zone key equals feinfo key while setting Xattrs Key: HDFS-14884 URL: https://issues.apache.org/jira/browse/HDFS-14884 Project: Hadoop HDFS Issue Type: Bug Components: encryption, hdfs Reporter: Mukul Kumar Singh Currently, it is possible to set an external attribute where the zone key is not the same as feinfo key. This jira will add a precondition before setting this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2207) Update Ratis to latest snapshot
[ https://issues.apache.org/jira/browse/HDDS-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh resolved HDDS-2207. - Resolution: Fixed Thanks for working on this [~shashikant]. I have committed this to trunk. > Update Ratis to latest snapshot > --- > > Key: HDDS-2207 > URL: https://issues.apache.org/jira/browse/HDDS-2207 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 40m > Remaining Estimate: 0h > > This Jira aims to update Ozone with the latest Ratis snapshot, which has a > critical fix for the retry behaviour on getting a not-leader exception in the > client. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1615) ManagedChannel references are being leaked in ReplicationSupervisor.java
[ https://issues.apache.org/jira/browse/HDDS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-1615: Status: Patch Available (was: Open) > ManagedChannel references are being leaked in ReplicationSupervisor.java > > > Key: HDDS-1615 > URL: https://issues.apache.org/jira/browse/HDDS-1615 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: MiniOzoneChaosCluster, pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > ManagedChannel references are being leaked in ReplicationSupervisor.java > {code} > May 30, 2019 8:10:56 AM > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference > cleanQueue > SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=1495, > target=192.168.0.3:49868} was not shutdown properly!!! ~*~*~* > Make sure to call shutdown()/shutdownNow() and wait until > awaitTermination() returns true. 
> java.lang.RuntimeException: ManagedChannel allocation site > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.(ManagedChannelOrphanWrapper.java:103) > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:53) > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:44) > at > org.apache.ratis.thirdparty.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:411) > at > org.apache.hadoop.ozone.container.replication.GrpcReplicationClient.(GrpcReplicationClient.java:65) > at > org.apache.hadoop.ozone.container.replication.SimpleContainerDownloader.getContainerDataFromReplicas(SimpleContainerDownloader.java:87) > at > org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.replicate(DownloadAndImportReplicator.java:118) > at > org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:115) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2056) Datanode unable to start command handler thread with security enabled
[ https://issues.apache.org/jira/browse/HDDS-2056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2056: --- Assignee: Xiaoyu Yao > Datanode unable to start command handler thread with security enabled > - > > Key: HDDS-2056 > URL: https://issues.apache.org/jira/browse/HDDS-2056 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Shashikant Banerjee >Assignee: Xiaoyu Yao >Priority: Major > Fix For: 0.5.0 > > > > {code:java} > 2019-08-29 02:50:23,536 ERROR > org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine: > Critical Error : Command processor thread encountered an error. Thread: > Thread[Command processor thread,5,main] > java.lang.IllegalArgumentException: Null user > at > org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1269) > at > org.apache.hadoop.security.UserGroupInformation.createRemoteUser(UserGroupInformation.java:1256) > at > org.apache.hadoop.hdds.security.token.BlockTokenVerifier.verify(BlockTokenVerifier.java:116) > at > org.apache.hadoop.ozone.container.common.transport.server.XceiverServer.submitRequest(XceiverServer.java:68) > at > org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.submitRequest(XceiverServerRatis.java:482) > at > org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CloseContainerCommandHandler.handle(CloseContainerCommandHandler.java:109) > at > org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93) > at > org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:432) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: 
hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1615) ManagedChannel references are being leaked in ReplicationSupervisor.java
[ https://issues.apache.org/jira/browse/HDDS-1615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-1615: --- Assignee: Mukul Kumar Singh (was: Hrishikesh Gadre) > ManagedChannel references are being leaked in ReplicationSupervisor.java > > > Key: HDDS-1615 > URL: https://issues.apache.org/jira/browse/HDDS-1615 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: MiniOzoneChaosCluster > > ManagedChannel references are being leaked in ReplicationSupervisor.java > {code} > May 30, 2019 8:10:56 AM > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference > cleanQueue > SEVERE: *~*~*~ Channel ManagedChannelImpl{logId=1495, > target=192.168.0.3:49868} was not shutdown properly!!! ~*~*~* > Make sure to call shutdown()/shutdownNow() and wait until > awaitTermination() returns true. 
> java.lang.RuntimeException: ManagedChannel allocation site > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper$ManagedChannelReference.(ManagedChannelOrphanWrapper.java:103) > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:53) > at > org.apache.ratis.thirdparty.io.grpc.internal.ManagedChannelOrphanWrapper.(ManagedChannelOrphanWrapper.java:44) > at > org.apache.ratis.thirdparty.io.grpc.internal.AbstractManagedChannelImplBuilder.build(AbstractManagedChannelImplBuilder.java:411) > at > org.apache.hadoop.ozone.container.replication.GrpcReplicationClient.(GrpcReplicationClient.java:65) > at > org.apache.hadoop.ozone.container.replication.SimpleContainerDownloader.getContainerDataFromReplicas(SimpleContainerDownloader.java:87) > at > org.apache.hadoop.ozone.container.replication.DownloadAndImportReplicator.replicate(DownloadAndImportReplicator.java:118) > at > org.apache.hadoop.ozone.container.replication.ReplicationSupervisor$TaskRunner.run(ReplicationSupervisor.java:115) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2194) Replication of Container fails with "Only closed containers could be exported"
Mukul Kumar Singh created HDDS-2194: --- Summary: Replication of Container fails with "Only closed containers could be exported" Key: HDDS-2194 URL: https://issues.apache.org/jira/browse/HDDS-2194 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Affects Versions: 0.5.0 Reporter: Mukul Kumar Singh Replication of Container fails with "Only closed containers could be exported" cc: [~nanda] {code}
2019-09-26 15:00:17,640 [grpc-default-executor-13] INFO replication.GrpcReplicationService (GrpcReplicationService.java:download(57)) - Streaming container data (37) to other datanode
Sep 26, 2019 3:00:17 PM org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor run
SEVERE: Exception while executing runnable org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed@70e641f2
java.lang.IllegalStateException: Only closed containers could be exported: ContainerId=37
	at org.apache.hadoop.ozone.container.keyvalue.KeyValueContainer.exportContainerData(KeyValueContainer.java:527)
	at org.apache.hadoop.ozone.container.keyvalue.KeyValueHandler.exportContainer(KeyValueHandler.java:875)
	at org.apache.hadoop.ozone.container.ozoneimpl.ContainerController.exportContainer(ContainerController.java:134)
	at org.apache.hadoop.ozone.container.replication.OnDemandContainerReplicationSource.copyData(OnDemandContainerReplicationSource.java:64)
	at org.apache.hadoop.ozone.container.replication.GrpcReplicationService.download(GrpcReplicationService.java:63)
	at org.apache.hadoop.hdds.protocol.datanode.proto.IntraDatanodeProtocolServiceGrpc$MethodHandlers.invoke(IntraDatanodeProtocolServiceGrpc.java:217)
	at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
	at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
	at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:710)
	at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

2019-09-26 15:00:17,644 [grpc-default-executor-17] ERROR replication.GrpcReplicationClient (GrpcReplicationClient.java:onError(142)) - Container download was unsuccessfull
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNKNOWN
	at org.apache.ratis.thirdparty.io.grpc.Status.asRuntimeException(Status.java:526)
	at org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:434)
	at org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
	at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
	at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
	at org.apache.ratis.thirdparty.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:678)
	at org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
	at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
	at org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
	at org.apache.ratis.thirdparty.io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:397)
	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459)
	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63)
	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.close(ClientCallImpl.java:546)
	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl.access$600(ClientCallImpl.java:467)
	at org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1StreamClosed.runInContext(ClientCallImpl.java:584)
{code}
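The `IllegalStateException` above comes from a state guard in the export path. A minimal, self-contained sketch of that guard (the enum and method below are simplified stand-ins, not the actual Ozone classes):

```java
// Sketch of the state check that produces the error above. ContainerState and
// checkExportable are stand-ins mirroring the guard in
// KeyValueContainer#exportContainerData: only a CLOSED replica may be streamed
// to another datanode.
public class ContainerExportCheck {
    enum ContainerState { OPEN, CLOSING, CLOSED }

    static void checkExportable(long containerId, ContainerState state) {
        if (state != ContainerState.CLOSED) {
            throw new IllegalStateException(
                "Only closed containers could be exported: ContainerId=" + containerId);
        }
    }

    public static void main(String[] args) {
        checkExportable(42, ContainerState.CLOSED); // passes silently
        try {
            checkExportable(37, ContainerState.OPEN); // an open replica cannot be exported
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The fix space is therefore either closing the container before scheduling replication, or having the replication source skip non-closed replicas.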
[jira] [Created] (HDDS-2188) Implement LocatedFileStatus & getFileBlockLocations to provide node/localization information to Yarn/Mapreduce
Mukul Kumar Singh created HDDS-2188: --- Summary: Implement LocatedFileStatus & getFileBlockLocations to provide node/localization information to Yarn/Mapreduce Key: HDDS-2188 URL: https://issues.apache.org/jira/browse/HDDS-2188 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Filesystem Affects Versions: 0.5.0 Reporter: Mukul Kumar Singh Assignee: Mukul Kumar Singh For applications like Hive and MapReduce to take advantage of data locality in Ozone, Ozone should return the locations of its blocks. This is needed for better read performance in Hadoop applications. {code}
if (file instanceof LocatedFileStatus) {
  blkLocations = ((LocatedFileStatus) file).getBlockLocations();
} else {
  blkLocations = fs.getFileBlockLocations(file, 0, length);
}
{code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
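The snippet in the issue shows the consumer side of this contract. To illustrate why it matters, here is a self-contained sketch with minimal stand-in classes (not the real Hadoop `FileStatus`/`LocatedFileStatus`): when the status already carries block locations, callers such as MapReduce split computation skip the extra `getFileBlockLocations` round trip.

```java
import java.util.Arrays;

// Stand-in classes illustrating the LocatedFileStatus fast path; the real
// Hadoop types live in org.apache.hadoop.fs and carry richer BlockLocation
// objects rather than plain host names.
public class LocatedStatusSketch {
    static class FileStatus { }

    static class LocatedFileStatus extends FileStatus {
        private final String[] hosts;
        LocatedFileStatus(String... hosts) { this.hosts = hosts; }
        String[] getBlockLocations() { return hosts; }
    }

    // Simulates the extra filesystem round trip that a plain FileStatus forces.
    static String[] getFileBlockLocations(FileStatus f) {
        return new String[] { "dn-unknown" };
    }

    static String[] locationsFor(FileStatus file) {
        if (file instanceof LocatedFileStatus) {
            return ((LocatedFileStatus) file).getBlockLocations(); // no extra call
        }
        return getFileBlockLocations(file); // extra round trip per file
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(locationsFor(new LocatedFileStatus("dn1", "dn2")))); // [dn1, dn2]
    }
}
```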
[jira] [Updated] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2169: Target Version/s: 0.5.0 > Avoid buffer copies while submitting client requests in Ratis > - > > Key: HDDS-2169 > URL: https://issues.apache.org/jira/browse/HDDS-2169 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Shashikant Banerjee >Assignee: Tsz Wo Nicholas Sze >Priority: Major > Attachments: o2169_20190923.patch > > > Currently, while sending write requests to Ratis from Ozone, the data is > encoded into a protobuf object, and the resultant protobuf is again converted > to a ByteString, which internally copies the buffer embedded inside the > protobuf so that it can be submitted to the Ratis client. Similarly, while > building up the appendRequestProto for the appendRequest, the data might be > copied again. The idea here is to let the client pass the raw data > (stateMachine data) separately to the Ratis client without the copying > overhead. > > {code:java} > private CompletableFuture sendRequestAsync( > ContainerCommandRequestProto request) { > try (Scope scope = GlobalTracer.get() > .buildSpan("XceiverClientRatis." + request.getCmdType().name()) > .startActive(true)) { > ContainerCommandRequestProto finalPayload = > ContainerCommandRequestProto.newBuilder(request) > .setTraceID(TracingUtil.exportCurrentSpan()) > .build(); > boolean isReadOnlyRequest = HddsUtils.isReadOnly(finalPayload); > // finalPayload already has the byteString data embedded. > ByteString byteString = finalPayload.toByteString(); -> It involves a > copy again. > if (LOG.isDebugEnabled()) { > LOG.debug("sendCommandAsync {} {}", isReadOnlyRequest, > sanitizeForDebug(finalPayload)); > } > return isReadOnlyRequest ? 
> getClient().sendReadOnlyAsync(() -> byteString) : > getClient().sendAsync(() -> byteString); > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
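The copy the description calls out can be illustrated with plain NIO buffers, without the Ratis or protobuf classes: wrapping a byte array shares its memory, while the copy path (analogous to `toByteString()` serializing the message) duplicates it. A minimal sketch of that wrap-vs-copy distinction:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Illustrates the zero-copy wrap vs. defensive copy trade-off the issue is
// about, using only java.nio; the actual patch operates on protobuf
// ByteString and Ratis Message objects, not ByteBuffer.
public class WrapVsCopy {
    static ByteBuffer wrap(byte[] data) {
        return ByteBuffer.wrap(data); // zero-copy: shares the backing array
    }

    static ByteBuffer copy(byte[] data) {
        return ByteBuffer.wrap(Arrays.copyOf(data, data.length)); // duplicates the bytes
    }

    public static void main(String[] args) {
        byte[] payload = { 1, 2, 3 };
        ByteBuffer shared = wrap(payload);
        ByteBuffer copied = copy(payload);
        payload[0] = 42;                   // mutate the source after handing it off
        System.out.println(shared.get(0)); // 42: the wrapped view sees the change
        System.out.println(copied.get(0)); // 1: the copy does not
    }
}
```

The flip side, visible in the mutation above, is why copies exist at all: a zero-copy path requires the caller to stop touching the buffer once it is submitted.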
[jira] [Updated] (HDDS-2081) Fix TestRatisPipelineProvider#testCreatePipelinesDnExclude
[ https://issues.apache.org/jira/browse/HDDS-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2081: Status: Patch Available (was: Open) > Fix TestRatisPipelineProvider#testCreatePipelinesDnExclude > -- > > Key: HDDS-2081 > URL: https://issues.apache.org/jira/browse/HDDS-2081 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Dinesh Chitlangia >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > {code:java} > --- > Test set: org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineProvider > --- > Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.374 s <<< > FAILURE! - in org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineProvider > testCreatePipelinesDnExclude(org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineProvider) > Time elapsed: 0.044 s <<< ERROR! > org.apache.hadoop.hdds.scm.pipeline.InsufficientDatanodesException: Cannot > create pipeline of factor 3 using 2 nodes. 
> at > org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider.create(RatisPipelineProvider.java:151) > at > org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineProvider.testCreatePipelinesDnExclude(TestRatisPipelineProvider.java:182) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2081) Fix TestRatisPipelineProvider#testCreatePipelinesDnExclude
[ https://issues.apache.org/jira/browse/HDDS-2081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2081: Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the contribution [~avijayan]. I have committed this to trunk. > Fix TestRatisPipelineProvider#testCreatePipelinesDnExclude > -- > > Key: HDDS-2081 > URL: https://issues.apache.org/jira/browse/HDDS-2081 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Reporter: Dinesh Chitlangia >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 50m > Remaining Estimate: 0h > > {code:java} > --- > Test set: org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineProvider > --- > Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.374 s <<< > FAILURE! - in org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineProvider > testCreatePipelinesDnExclude(org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineProvider) > Time elapsed: 0.044 s <<< ERROR! > org.apache.hadoop.hdds.scm.pipeline.InsufficientDatanodesException: Cannot > create pipeline of factor 3 using 2 nodes. 
> at > org.apache.hadoop.hdds.scm.pipeline.RatisPipelineProvider.create(RatisPipelineProvider.java:151) > at > org.apache.hadoop.hdds.scm.pipeline.TestRatisPipelineProvider.testCreatePipelinesDnExclude(TestRatisPipelineProvider.java:182) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) > at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) > at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) > at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) > at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) > at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) > at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) > at org.junit.runners.ParentRunner.run(ParentRunner.java:309) > at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > at > 
org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2169) Avoid buffer copies while submitting client requests in Ratis
[ https://issues.apache.org/jira/browse/HDDS-2169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16936465#comment-16936465 ] Mukul Kumar Singh commented on HDDS-2169: - Thanks for working on this [~szetszwo]. The patch does not seem to apply. Also, this problem needs to be fixed for appendEntries from leader to follower as well. > Avoid buffer copies while submitting client requests in Ratis > - > > Key: HDDS-2169 > URL: https://issues.apache.org/jira/browse/HDDS-2169 > Project: Hadoop Distributed Data Store > Issue Type: Improvement >Reporter: Shashikant Banerjee >Assignee: Tsz Wo Nicholas Sze >Priority: Major > Attachments: o2169_20190923.patch > > > Currently, while sending write requests to Ratis from Ozone, the data is > encoded into a protobuf object, and the resultant protobuf is again converted > to a ByteString, which internally copies the buffer embedded inside the > protobuf so that it can be submitted to the Ratis client. Similarly, while > building up the appendRequestProto for the appendRequest, the data might be > copied again. The idea here is to let the client pass the raw data > (stateMachine data) separately to the Ratis client without the copying > overhead. > > {code:java} > private CompletableFuture sendRequestAsync( > ContainerCommandRequestProto request) { > try (Scope scope = GlobalTracer.get() > .buildSpan("XceiverClientRatis." + request.getCmdType().name()) > .startActive(true)) { > ContainerCommandRequestProto finalPayload = > ContainerCommandRequestProto.newBuilder(request) > .setTraceID(TracingUtil.exportCurrentSpan()) > .build(); > boolean isReadOnlyRequest = HddsUtils.isReadOnly(finalPayload); > // finalPayload already has the byteString data embedded. > ByteString byteString = finalPayload.toByteString(); -> It involves a > copy again. 
> if (LOG.isDebugEnabled()) { > LOG.debug("sendCommandAsync {} {}", isReadOnlyRequest, > sanitizeForDebug(finalPayload)); > } > return isReadOnlyRequest ? > getClient().sendReadOnlyAsync(() -> byteString) : > getClient().sendAsync(() -> byteString); > } > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2114) Rename does not preserve non-explicitly created interim directories
[ https://issues.apache.org/jira/browse/HDDS-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16931105#comment-16931105 ] Mukul Kumar Singh commented on HDDS-2114: - The build works locally, hence I committed this to trunk > Rename does not preserve non-explicitly created interim directories > --- > > Key: HDDS-2114 > URL: https://issues.apache.org/jira/browse/HDDS-2114 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Istvan Fajth >Assignee: Lokesh Jain >Priority: Critical > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: demonstrative_test.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > I am attaching a patch that adds a test that demonstrates the problem. > The scenario comes from the way Hive implements ACID transactions > with the ORC table format, but the test is reduced to the simplest possible > code that reproduces the issue. > The scenario: > * Given a 3-level directory structure, where the top-level directory was > explicitly created, and the interim directory is implicitly created (for > example either by creating a file with create("/top/interim/file") or by > creating a directory with mkdirs("top/interim/dir")) > * When the leaf is moved out from the implicitly created directory, making > this directory an empty directory > * Then a FileNotFoundException is thrown when getFileStatus or listStatus is > called on the interim directory. > The expected behaviour: > after the directory becomes empty, the directory should still be part of > the file system; moreover, an empty FileStatus array should be returned when > listStatus is called on it, and also a valid FileStatus object should be > returned when getFileStatus is called on it. > > > As this issue is present with Hive, and as this is how a FileSystem is > expected to work, this seems to be at least a critical issue as I see it; please > feel free to change the priority if needed. 
> Also please note that, if the interim directory is explicitly created with > mkdirs("top/interim") before creating the leaf, then the issue does not > appear. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
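The contract the issue expects can be reproduced with `java.nio.file` on a local filesystem. Note the hedge: a local filesystem requires the parent to exist, so the "implicit" creation is approximated with an explicit mkdir here; this only illustrates the expected behavior, it is not Ozone code.

```java
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Expected behavior: after the only child of an interim directory is renamed
// out, the directory must still exist and list as empty, rather than raising
// FileNotFoundException as the Jira reports for o3fs.
public class InterimDirSketch {
    public static void main(String[] args) throws Exception {
        Path top = Files.createTempDirectory("top");
        Path interim = top.resolve("interim");
        Files.createDirectories(interim);                // parent of the leaf
        Path leaf = Files.createFile(interim.resolve("file"));
        Files.move(leaf, top.resolve("file"));           // rename the leaf out

        System.out.println(Files.isDirectory(interim));  // true: interim survives
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(interim)) {
            System.out.println(ds.iterator().hasNext()); // false: empty listing, not FNF
        }
    }
}
```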
[jira] [Updated] (HDDS-2114) Rename does not preserve non-explicitly created interim directories
[ https://issues.apache.org/jira/browse/HDDS-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2114: Fix Version/s: 0.5.0 Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the contribution [~ljain] and [~pifta]. I have committed this to trunk. > Rename does not preserve non-explicitly created interim directories > --- > > Key: HDDS-2114 > URL: https://issues.apache.org/jira/browse/HDDS-2114 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Istvan Fajth >Assignee: Lokesh Jain >Priority: Critical > Labels: pull-request-available > Fix For: 0.5.0 > > Attachments: demonstrative_test.patch > > Time Spent: 1h 10m > Remaining Estimate: 0h > > I am attaching a patch that adds a test that demonstrates the problem. > The scenario comes from the way Hive implements ACID transactions > with the ORC table format, but the test is reduced to the simplest possible > code that reproduces the issue. > The scenario: > * Given a 3-level directory structure, where the top-level directory was > explicitly created, and the interim directory is implicitly created (for > example either by creating a file with create("/top/interim/file") or by > creating a directory with mkdirs("top/interim/dir")) > * When the leaf is moved out from the implicitly created directory, making > this directory an empty directory > * Then a FileNotFoundException is thrown when getFileStatus or listStatus is > called on the interim directory. > The expected behaviour: > after the directory becomes empty, the directory should still be part of > the file system; moreover, an empty FileStatus array should be returned when > listStatus is called on it, and also a valid FileStatus object should be > returned when getFileStatus is called on it. 
> > > As this issue is present with Hive, and as this is how a FileSystem is > expected to work, this seems to be at least a critical issue as I see it; please > feel free to change the priority if needed. > Also please note that, if the interim directory is explicitly created with > mkdirs("top/interim") before creating the leaf, then the issue does not > appear. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-14845) Request is a replay (34) error in httpfs
[ https://issues.apache.org/jira/browse/HDFS-14845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDFS-14845: Assignee: Prabhu Joseph > Request is a replay (34) error in httpfs > > > Key: HDFS-14845 > URL: https://issues.apache.org/jira/browse/HDFS-14845 > Project: Hadoop HDFS > Issue Type: Bug > Components: httpfs >Affects Versions: 3.3.0 > Environment: Kerberos and ZKDelgationTokenSecretManager enabled in > HttpFS >Reporter: Akira Ajisaka >Assignee: Prabhu Joseph >Priority: Critical > > We are facing a "Request is a replay (34)" error when accessing HDFS via > httpfs on trunk. > {noformat} > % curl -i --negotiate -u : "https://:4443/webhdfs/v1/?op=liststatus" > HTTP/1.1 401 Authentication required > Date: Mon, 09 Sep 2019 06:00:04 GMT > Date: Mon, 09 Sep 2019 06:00:04 GMT > Pragma: no-cache > X-Content-Type-Options: nosniff > X-XSS-Protection: 1; mode=block > WWW-Authenticate: Negotiate > Set-Cookie: hadoop.auth=; Path=/; Secure; HttpOnly > Cache-Control: must-revalidate,no-cache,no-store > Content-Type: text/html;charset=iso-8859-1 > Content-Length: 271 > HTTP/1.1 403 GSSException: Failure unspecified at GSS-API level (Mechanism > level: Request is a replay (34)) > Date: Mon, 09 Sep 2019 06:00:04 GMT > Date: Mon, 09 Sep 2019 06:00:04 GMT > Pragma: no-cache > X-Content-Type-Options: nosniff > X-XSS-Protection: 1; mode=block > (snip) > Set-Cookie: hadoop.auth=; Path=/; Secure; HttpOnly > Cache-Control: must-revalidate,no-cache,no-store > Content-Type: text/html;charset=iso-8859-1 > Content-Length: 413 > > > > Error 403 GSSException: Failure unspecified at GSS-API level > (Mechanism level: Request is a replay (34)) > > HTTP ERROR 403 > Problem accessing /webhdfs/v1/. 
Reason: > GSSException: Failure unspecified at GSS-API level (Mechanism level: > Request is a replay (34)) > > > {noformat} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2114) Rename does not preserve non-explicitly created interim directories
[ https://issues.apache.org/jira/browse/HDDS-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2114: --- Assignee: Lokesh Jain > Rename does not preserve non-explicitly created interim directories > --- > > Key: HDDS-2114 > URL: https://issues.apache.org/jira/browse/HDDS-2114 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Istvan Fajth >Assignee: Lokesh Jain >Priority: Critical > Attachments: demonstrative_test.patch > > > I am attaching a patch that adds a test that demonstrates the problem. > The scenario comes from the way Hive implements ACID transactions > with the ORC table format, but the test is reduced to the simplest possible > code that reproduces the issue. > The scenario: > * Given a 3-level directory structure, where the top-level directory was > explicitly created, and the interim directory is implicitly created (for > example either by creating a file with create("/top/interim/file") or by > creating a directory with mkdirs("top/interim/dir")) > * When the leaf is moved out from the implicitly created directory, making > this directory an empty directory > * Then a FileNotFoundException is thrown when getFileStatus or listStatus is > called on the interim directory. > The expected behaviour: > after the directory becomes empty, the directory should still be part of > the file system; moreover, an empty FileStatus array should be returned when > listStatus is called on it, and also a valid FileStatus object should be > returned when getFileStatus is called on it. > > > As this issue is present with Hive, and as this is how a FileSystem is > expected to work, this seems to be at least a critical issue as I see it; please > feel free to change the priority if needed. > Also please note that, if the interim directory is explicitly created with > mkdirs("top/interim") before creating the leaf, then the issue does not > appear. 
-- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2102) HddsVolumeChecker should use java optional in place of Guava optional
[ https://issues.apache.org/jira/browse/HDDS-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-2102: --- Assignee: Mukul Kumar Singh > HddsVolumeChecker should use java optional in place of Guava optional > - > > Key: HDDS-2102 > URL: https://issues.apache.org/jira/browse/HDDS-2102 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Mukul Kumar Singh >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HddsVolumeChecker should use java optional in place of Guava optional, as the > Guava dependency is marked unstable. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
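The mechanical part of this change is the API mapping from Guava's `Optional` to `java.util.Optional` (`Optional.absent()` becomes `Optional.empty()`, `fromNullable` becomes `ofNullable`). A hedged sketch; the volume-checker method name below is hypothetical, only the Optional usage is the point:

```java
import java.util.Optional;

// Before: com.google.common.base.Optional.of / absent() / fromNullable()
// After:  java.util.Optional.of / empty() / ofNullable()
// findFailedVolume is a made-up stand-in for whichever HddsVolumeChecker
// method returns an optional result.
public class OptionalMigration {
    static Optional<String> findFailedVolume(boolean healthy, String volume) {
        return healthy ? Optional.empty() : Optional.ofNullable(volume);
    }

    public static void main(String[] args) {
        System.out.println(findFailedVolume(true, "/data/disk1").isPresent()); // false
        System.out.println(findFailedVolume(false, "/data/disk1").get());      // /data/disk1
    }
}
```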
[jira] [Updated] (HDDS-2102) HddsVolumeChecker should use java optional in place of Guava optional
[ https://issues.apache.org/jira/browse/HDDS-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh updated HDDS-2102: Status: Patch Available (was: Open) > HddsVolumeChecker should use java optional in place of Guava optional > - > > Key: HDDS-2102 > URL: https://issues.apache.org/jira/browse/HDDS-2102 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Datanode >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > HddsVolumeChecker should use java optional in place of Guava optional, as the > Guava dependency is marked unstable. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2102) HddsVolumeChecker should use java optional in place of Guava optional
Mukul Kumar Singh created HDDS-2102: --- Summary: HddsVolumeChecker should use java optional in place of Guava optional Key: HDDS-2102 URL: https://issues.apache.org/jira/browse/HDDS-2102 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Datanode Affects Versions: 0.4.0 Reporter: Mukul Kumar Singh HddsVolumeChecker should use java optional in place of Guava optional, as the Guava dependency is marked unstable. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1899) DeleteBlocksCommandHandler is unable to find the container in SCM
[ https://issues.apache.org/jira/browse/HDDS-1899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukul Kumar Singh reassigned HDDS-1899: --- Assignee: (was: Nanda kumar) > DeleteBlocksCommandHandler is unable to find the container in SCM > - > > Key: HDDS-1899 > URL: https://issues.apache.org/jira/browse/HDDS-1899 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Priority: Major > Labels: MiniOzoneChaosCluster > > DeleteBlocksCommandHandler is unable to find a container in SCM. > {code} > 2019-08-02 14:04:56,735 WARN commandhandler.DeleteBlocksCommandHandler > (DeleteBlocksCommandHandler.java:lambda$handle$0(140)) - Failed to delete > blocks for container=33, TXID=184 > org.apache.hadoop.hdds.scm.container.common.helpers.StorageContainerException: > Unable to find the container 33 > at > org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteBlocksCommandHandler.lambda$handle$0(DeleteBlocksCommandHandler.java:122) > at java.util.ArrayList.forEach(ArrayList.java:1257) > at > java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080) > at > org.apache.hadoop.ozone.container.common.statemachine.commandhandler.DeleteBlocksCommandHandler.handle(DeleteBlocksCommandHandler.java:114) > at > org.apache.hadoop.ozone.container.common.statemachine.commandhandler.CommandDispatcher.handle(CommandDispatcher.java:93) > at > org.apache.hadoop.ozone.container.common.statemachine.DatanodeStateMachine.lambda$initCommandHandlerThread$1(DatanodeStateMachine.java:432) > at java.lang.Thread.run(Thread.java:748) > {code} -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-1504) Watch Request should use retry policy with higher timeouts for RaftClient
[ https://issues.apache.org/jira/browse/HDDS-1504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16923074#comment-16923074 ] Mukul Kumar Singh commented on HDDS-1504: - cc [~Sammi] > Watch Request should use retry policy with higher timeouts for RaftClient > - > > Key: HDDS-1504 > URL: https://issues.apache.org/jira/browse/HDDS-1504 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Affects Versions: 0.5.0 >Reporter: Shashikant Banerjee >Priority: Major > > Currently, a Raft client request times out with a default of 3s, but watch > requests can need longer timeouts, as some followers can be really slow. It > would be good to enforce a retry policy with higher timeouts while submitting > watch requests over the Raft client in Ozone. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
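The idea can be sketched without the Ratis classes: choose the timeout budget per request type, so watch requests get a much larger per-attempt timeout than ordinary client requests, and wrap the call in a bounded retry loop. Ratis' `RetryPolicies` provides similar building blocks; the numbers below are illustrative, not the project's defaults.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

// Stand-alone sketch of a request-type-specific retry/timeout policy; the
// real change would configure the Ratis RaftClient rather than hand-roll this.
public class WatchRetryPolicy {
    // Watch requests wait on the slowest follower, so give them a far larger
    // per-attempt timeout than the ~3s default used for ordinary requests.
    static Duration timeoutFor(boolean isWatchRequest) {
        return isWatchRequest ? Duration.ofSeconds(180) : Duration.ofSeconds(3);
    }

    // Retry an operation up to maxAttempts times, rethrowing the last failure.
    static <T> T retry(Callable<T> op, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (Exception e) {
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(timeoutFor(true).getSeconds()); // 180
        System.out.println(retry(() -> "ok", 3));          // ok
    }
}
```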
[jira] [Created] (HDDS-2088) Different components in MiniOzoneChaosCluster should log to different files
Mukul Kumar Singh created HDDS-2088: --- Summary: Different components in MiniOzoneChaosCluster should log to different files Key: HDDS-2088 URL: https://issues.apache.org/jira/browse/HDDS-2088 Project: Hadoop Distributed Data Store Issue Type: Bug Components: test Affects Versions: 0.4.0 Reporter: Shashikant Banerjee Different components/nodes in MiniOzoneChaosCluster should log to different log files. Thanks [~shashikant] for suggesting this. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
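One way to give each cluster component its own log file, sketched with `java.util.logging`. MiniOzoneChaosCluster actually uses log4j, so this is only an analogy for the handler wiring, and the component names are examples, not the real ones.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.FileHandler;
import java.util.logging.Logger;

// Route each component's logger to its own file so interleaved chaos-test
// output from OM, SCM, and datanodes can be read separately.
public class PerComponentLogs {
    static Logger componentLogger(String component, Path dir) throws Exception {
        Logger log = Logger.getLogger("chaos." + component);
        log.setUseParentHandlers(false); // keep output out of the shared console log
        log.addHandler(new FileHandler(dir.resolve(component + ".log").toString()));
        return log;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("chaos-logs");
        componentLogger("om", dir).info("om started");
        componentLogger("scm", dir).info("scm started");
        System.out.println(Files.exists(dir.resolve("om.log"))); // true
    }
}
```

With log4j the equivalent is one `FileAppender` per component logger, configured either programmatically or in the test's log4j.properties.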