[jira] [Commented] (HDFS-16252) Correct docs for dfs.http.client.retry.policy.spec
[ https://issues.apache.org/jira/browse/HDFS-16252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424032#comment-17424032 ] Aravindan Vijayan commented on HDFS-16252: -- +1, thanks for the fix. > Correct docs for dfs.http.client.retry.policy.spec > --- > > Key: HDFS-16252 > URL: https://issues.apache.org/jira/browse/HDFS-16252 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: Stephen O'Donnell >Assignee: Stephen O'Donnell >Priority: Major > Attachments: HDFS-16252.001.patch > > > The hdfs-default doc for dfs.http.client.retry.policy.spec is incorrect, as > it has the wait time and retries switched around in the description. Also, the > doc for dfs.client.retry.policy.spec is not present and should be the same as > for dfs.http.client.retry.policy.spec. > The code shows the timeout is first and then the number of retries: > {code} > String POLICY_SPEC_KEY = PREFIX + "policy.spec"; > String POLICY_SPEC_DEFAULT = "1,6,6,10"; //t1,n1,t2,n2,... > // In RetryPolicies.java, we can see it gets the timeout as the first in > the pair >/** > * Parse the given string as a MultipleLinearRandomRetry object. > * The format of the string is "t_1, n_1, t_2, n_2, ...", > * where t_i and n_i are the i-th pair of sleep time and number of > retries. > * Note that the white spaces in the string are ignored. > * > * @return the parsed object, or null if the parsing fails. 
> */ > public static MultipleLinearRandomRetry parseCommaSeparatedString(String > s) { > final String[] elements = s.split(","); > if (elements.length == 0) { > LOG.warn("Illegal value: there is no element in \"" + s + "\"."); > return null; > } > if (elements.length % 2 != 0) { > LOG.warn("Illegal value: the number of elements in \"" + s + "\" is " > + elements.length + " but an even number of elements is > expected."); > return null; > } > final List<RetryPolicies.MultipleLinearRandomRetry.Pair> pairs > = new ArrayList<>(); > > for(int i = 0; i < elements.length; ) { > //parse the i-th sleep-time > final int sleep = parsePositiveInt(elements, i++, s); > if (sleep == -1) { > return null; //parse fails > } > //parse the i-th number-of-retries > final int retries = parsePositiveInt(elements, i++, s); > if (retries == -1) { > return null; //parse fails > } > pairs.add(new RetryPolicies.MultipleLinearRandomRetry.Pair(retries, > sleep)); > } > return new RetryPolicies.MultipleLinearRandomRetry(pairs); > } > {code} > This change simply updates the docs. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
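The sleep-time-first ordering that this fix documents can be sketched with a minimal stand-alone parser. This is only an illustrative sketch: `SleepRetryPair`, its field names, and the `parse` helper are hypothetical stand-ins for `RetryPolicies.MultipleLinearRandomRetry.Pair` and the real `parseCommaSeparatedString` quoted above.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of parsing a "t1,n1,t2,n2,..." retry policy spec,
// where t_i (sleep time) comes first in each pair and n_i (retries) second.
public class RetrySpecDemo {
    // Hypothetical stand-in for RetryPolicies.MultipleLinearRandomRetry.Pair.
    static final class SleepRetryPair {
        final int sleepTime;  // t_i: how long to wait before each retry
        final int retries;    // n_i: how many retries at that sleep time
        SleepRetryPair(int sleepTime, int retries) {
            this.sleepTime = sleepTime;
            this.retries = retries;
        }
    }

    // Returns null for an empty or odd-length spec, mirroring the real parser.
    static List<SleepRetryPair> parse(String spec) {
        String[] elements = spec.split(",");
        if (elements.length == 0 || elements.length % 2 != 0) {
            return null;
        }
        List<SleepRetryPair> pairs = new ArrayList<>();
        for (int i = 0; i < elements.length; ) {
            int sleep = Integer.parseInt(elements[i++].trim());   // sleep time first
            int retries = Integer.parseInt(elements[i++].trim()); // then retry count
            pairs.add(new SleepRetryPair(sleep, retries));
        }
        return pairs;
    }
}
```

With the default spec "1,6,6,10" this yields two pairs: sleep 1 and retry up to 6 times, then sleep 6 for up to 10 more retries, which is the ordering the corrected documentation should state.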
[jira] [Assigned] (HDFS-15257) Fix spelling mistake in DataXceiverServer
[ https://issues.apache.org/jira/browse/HDFS-15257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDFS-15257: Assignee: Haibin Huang (was: Aravindan Vijayan) > Fix spelling mistake in DataXceiverServer > - > > Key: HDFS-15257 > URL: https://issues.apache.org/jira/browse/HDFS-15257 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haibin Huang >Assignee: Haibin Huang >Priority: Minor > Attachments: HDFS-15257-001.patch > > > There is a spelling mistake in DataXceiverServer, try to fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15257) Fix spelling mistake in DataXceiverServer
[ https://issues.apache.org/jira/browse/HDFS-15257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDFS-15257: Assignee: Aravindan Vijayan (was: Haibin Huang) > Fix spelling mistake in DataXceiverServer > - > > Key: HDFS-15257 > URL: https://issues.apache.org/jira/browse/HDFS-15257 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Haibin Huang >Assignee: Aravindan Vijayan >Priority: Minor > Attachments: HDFS-15257-001.patch > > > There is a spelling mistake in DataXceiverServer, try to fix it. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-14561) dfs.datanode.data.dir does not honor file:/// scheme
[ https://issues.apache.org/jira/browse/HDFS-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDFS-14561: Assignee: (was: Aravindan Vijayan) > dfs.datanode.data.dir does not honor file:/// scheme > > > Key: HDFS-14561 > URL: https://issues.apache.org/jira/browse/HDFS-14561 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation, hdfs >Affects Versions: 2.9.2, 2.8.5 >Reporter: Artem Ervits >Priority: Major > > _dfs.datanode.data.dir_ with {{file:///}} scheme is not being honored in > Hadoop 2.8.5 and 2.9.2. I didn't test the 3.x branches, but it would be good to > validate those as well. > In my environment, only file:// seemed to work. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15156) In-Place Erasure Coding should factor in storage quota validations.
[ https://issues.apache.org/jira/browse/HDFS-15156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDFS-15156: - Parent: HDFS-14978 Issue Type: Sub-task (was: Bug) > In-Place Erasure Coding should factor in storage quota validations. > > > Key: HDFS-15156 > URL: https://issues.apache.org/jira/browse/HDFS-15156 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Priority: Major > Labels: ec > > HDFS-14978 is a new feature that brings in In-Place Erasure coding in HDFS. > While converting a replicated file to an EC one, we may end up violating > storage quota validations. Thanks to [~ayushsaxena] for bringing this up. > This JIRA tracks the effort to tackle that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15156) In-Place Erasure Coding should factor in storage quota validations.
[ https://issues.apache.org/jira/browse/HDFS-15156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDFS-15156: - Labels: ec (was: ) > In-Place Erasure Coding should factor in storage quota validations. > > > Key: HDFS-15156 > URL: https://issues.apache.org/jira/browse/HDFS-15156 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Aravindan Vijayan >Priority: Major > Labels: ec > > HDFS-14978 is a new feature that brings in In-Place Erasure coding in HDFS. > While converting a replicated file to an EC one, we may end up violating > storage quota validations. Thanks to [~ayushsaxena] for bringing this up. > This JIRA tracks the effort to tackle that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15156) In-Place Erasure Coding should factor in storage quota validations.
Aravindan Vijayan created HDFS-15156: Summary: In-Place Erasure Coding should factor in storage quota validations. Key: HDFS-15156 URL: https://issues.apache.org/jira/browse/HDFS-15156 Project: Hadoop HDFS Issue Type: Bug Reporter: Aravindan Vijayan HDFS-14978 is a new feature that brings in In-Place Erasure coding in HDFS. While converting a replicated file to an EC one, we may end up violating storage quota validations. Thanks to [~ayushsaxena] for bringing this up. This JIRA tracks the effort to tackle that. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.
[ https://issues.apache.org/jira/browse/HDFS-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDFS-14989: - Resolution: Fixed Status: Resolved (was: Patch Available) Thanks for the reviews [~arp], [~weichiu], [~ayushtkn]. PR https://github.com/apache/hadoop/pull/1819 merged to branch HDFS-14978_ec_conversion. > Add a 'swapBlockList' operation to Namenode. > > > Key: HDFS-14989 > URL: https://issues.apache.org/jira/browse/HDFS-14989 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Borrowing from the design doc. > bq. The swapBlockList takes two parameters, a source file and a destination > file. This operation swaps the blocks belonging to the source and the > destination atomically. > bq. The namespace metadata of interest is the INodeFile class. A file > (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, > BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile > contains a list of blocks (BlockInfo[]). The operation will swap > BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not > touch other fields. To avoid complication, this operation will abort if > either file is open (isUnderConstruction() == true) > bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST > to record the change persistently. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
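The swapBlockList semantics quoted in the issue description above can be sketched as follows. This is a simplified, hypothetical model: `FileMeta` stands in for `INodeFile` and carries only the fields the operation touches; the real implementation operates on the Namenode's namespace under its locking and edit-log machinery.

```java
// Hypothetical sketch of swapBlockList: swap the layout/redundancy header
// bits and the block lists of two files, touch nothing else, and abort if
// either file is open (under construction).
public class SwapBlockListDemo {
    // Simplified stand-in for INodeFile.
    static final class FileMeta {
        long blockLayoutAndRedundancy; // BLOCK_LAYOUT_AND_REDUNDANCY header bits
        String[] blocks;               // stand-in for BlockInfo[]
        boolean underConstruction;     // isUnderConstruction()
    }

    static boolean swapBlockList(FileMeta src, FileMeta dst) {
        if (src.underConstruction || dst.underConstruction) {
            return false; // abort: cannot swap while a file is open
        }
        // Swap the header bits that encode layout (replicated vs EC) and redundancy.
        long bits = src.blockLayoutAndRedundancy;
        src.blockLayoutAndRedundancy = dst.blockLayoutAndRedundancy;
        dst.blockLayoutAndRedundancy = bits;
        // Swap the block lists atomically with the header bits.
        String[] tmp = src.blocks;
        src.blocks = dst.blocks;
        dst.blocks = tmp;
        return true;
    }
}
```

Note that PREFERRED_BLOCK_SIZE, STORAGE_POLICY_ID, and all other inode fields are deliberately left untouched, matching the design doc.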
[jira] [Comment Edited] (HDFS-14978) In-place Erasure Coding Conversion
[ https://issues.apache.org/jira/browse/HDFS-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001041#comment-17001041 ] Aravindan Vijayan edited comment on HDFS-14978 at 12/20/19 5:53 PM: [~ayushtkn] Thanks for reviewing the design doc. Answers inline. bq. Does the swapBlockList() Takes care of storage quota validations? Not in this version. We plan to support it in the next version of this feature. bq. How is storage Policies handled here? do you retain it, There are some policies not supported for EC, like ONE_SSD. The temporary file that is created will be created with a specific policy for EC. And then the swap block list will also make sure that the replicated file's policy is updated to the newly created EC one. bq. What would be response, say you want to switch to 6-3 and you are able fetch only 7 blocks, unable to write two parities, do we allow moving further, despite moving into chances of making data more vulnerable? or do we fail? We will fail fast. bq. The Overhead of an extra tmp file still says? The tmp file will be deleted at the end of the operation. bq. I need to check this but, If a client is reading a file, the swapBlocks() happens the block locations get changed, he shall go to refetch the locations from namenode, that would be now EC Blocks, The client would be on DFSInputStream not on DFSStripedInputStream, I might have messed up, but is there a cover to this? Yes this is a valid concern. We will have to try and repro this possible race condition. The plan is to not fail the client's read operation while the file is undergoing in place EC conversion. was (Author: avijayan): [~ayushtkn] Thanks for reviewing the design doc. Answers inline. bq. Does the swapBlockList() Takes care of storage quota validations? Not in this version. We plan to support it in the next version of this feature. How is storage Policies handled here? do you retain it, There are some policies not supported for EC, like ONE_SSD. bq. 
The temporary file that is created will be created with a specific policy for EC. And then the swap block list will also make sure that the replicated file's policy is updated to the newly created EC one. bq. What would be response, say you want to switch to 6-3 and you are able fetch only 7 blocks, unable to write two parities, do we allow moving further, despite moving into chances of making data more vulnerable? or do we fail? We will fail fast. bq. The Overhead of an extra tmp file still says? The tmp file will be deleted at the end of the operation. bq. I need to check this but, If a client is reading a file, the swapBlocks() happens the block locations get changed, he shall go to refetch the locations from namenode, that would be now EC Blocks, The client would be on DFSInputStream not on DFSStripedInputStream, I might have messed up, but is there a cover to this? Yes this is a valid concern. We will have to try and repro this possible race condition. The plan is to not fail the client's read operation while the file is undergoing in place EC conversion. > In-place Erasure Coding Conversion > -- > > Key: HDFS-14978 > URL: https://issues.apache.org/jira/browse/HDFS-14978 > Project: Hadoop HDFS > Issue Type: New Feature > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Wei-Chiu Chuang >Assignee: Aravindan Vijayan >Priority: Major > Attachments: In-place Erasure Coding Conversion.pdf > > > HDFS Erasure Coding is a new feature added in Apache Hadoop 3.0. It uses > encoding algorithms to reduce disk space usage while retaining redundancy > necessary for data recovery. It was a huge amount of work but it is just > getting adopted after almost 2 years. > One usability problem that’s blocking users from adopting HDFS Erasure Coding > is that existing replicated files have to be copied to an EC-enabled > directory explicitly. Renaming a file/directory to an EC-enabled directory > does not automatically convert the blocks. 
Therefore users typically perform > the following steps to erasure-code existing files: > {noformat} > Create $tmp directory, set EC policy at it > Distcp $src to $tmp > Delete $src (rm -rf $src) > mv $tmp $src > {noformat} > There are several reasons why this is not popular: > * Complex. The process involves several steps: distcp data to a temporary > destination; delete source file; move destination to the source path. > * Availability: there is a short period where nothing exists at the source > path, and jobs may fail unexpectedly. > * Overhead. During the copy phase, there is a point in time where all of > source and destination files exist at the same time, exhausting disk space. > * Not snapshot-friendly. If a snapshot is taken prior to performing the >
[jira] [Commented] (HDFS-14978) In-place Erasure Coding Conversion
[ https://issues.apache.org/jira/browse/HDFS-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001041#comment-17001041 ] Aravindan Vijayan commented on HDFS-14978: -- [~ayushtkn] Thanks for reviewing the design doc. Answers inline. bq. Does the swapBlockList() Takes care of storage quota validations? Not in this version. We plan to support it in the next version of this feature. bq. How is storage Policies handled here? do you retain it, There are some policies not supported for EC, like ONE_SSD. The temporary file that is created will be created with a specific policy for EC. And then the swap block list will also make sure that the replicated file's policy is updated to the newly created EC one. bq. What would be response, say you want to switch to 6-3 and you are able fetch only 7 blocks, unable to write two parities, do we allow moving further, despite moving into chances of making data more vulnerable? or do we fail? We will fail fast. bq. The Overhead of an extra tmp file still says? The tmp file will be deleted at the end of the operation. bq. I need to check this but, If a client is reading a file, the swapBlocks() happens the block locations get changed, he shall go to refetch the locations from namenode, that would be now EC Blocks, The client would be on DFSInputStream not on DFSStripedInputStream, I might have messed up, but is there a cover to this? Yes this is a valid concern. We will have to try and repro this possible race condition. The plan is to not fail the client's read operation while the file is undergoing in place EC conversion. 
> In-place Erasure Coding Conversion > -- > > Key: HDFS-14978 > URL: https://issues.apache.org/jira/browse/HDFS-14978 > Project: Hadoop HDFS > Issue Type: New Feature > Components: erasure-coding >Affects Versions: 3.0.0 >Reporter: Wei-Chiu Chuang >Assignee: Aravindan Vijayan >Priority: Major > Attachments: In-place Erasure Coding Conversion.pdf > > > HDFS Erasure Coding is a new feature added in Apache Hadoop 3.0. It uses > encoding algorithms to reduce disk space usage while retaining redundancy > necessary for data recovery. It was a huge amount of work but it is just > getting adopted after almost 2 years. > One usability problem that’s blocking users from adopting HDFS Erasure Coding > is that existing replicated files have to be copied to an EC-enabled > directory explicitly. Renaming a file/directory to an EC-enabled directory > does not automatically convert the blocks. Therefore users typically perform > the following steps to erasure-code existing files: > {noformat} > Create $tmp directory, set EC policy at it > Distcp $src to $tmp > Delete $src (rm -rf $src) > mv $tmp $src > {noformat} > There are several reasons why this is not popular: > * Complex. The process involves several steps: distcp data to a temporary > destination; delete source file; move destination to the source path. > * Availability: there is a short period where nothing exists at the source > path, and jobs may fail unexpectedly. > * Overhead. During the copy phase, there is a point in time where all of > source and destination files exist at the same time, exhausting disk space. > * Not snapshot-friendly. If a snapshot is taken prior to performing the > conversion, the source (replicated) files will be preserved in the cluster > too. Therefore, the conversion actually increase storage space usage. > * Not management-friendly. This approach changes file inode number, > modification time and access time. 
Erasure coded files are supposed to store > cold data, but this conversion makes data “hot” again. > * Bulky. It’s either all or nothing. The directory may be partially erasure > coded, but this approach simply erasure code everything again. > To ease data management, we should offer a utility tool to convert replicated > files to erasure coded files in-place. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15006) Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes.
[ https://issues.apache.org/jira/browse/HDFS-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDFS-15006: - Description: In HDFS-14989, we add a new Namenode operation "swapBlockList" to replace the set of blocks in an INode File with a new set of blocks. This JIRA will track the effort to add an FS Edit log op to persist this operation. We also need to increase the NamenodeLayoutVersion for this change. was:In HDFS-14989, we add a new Namenode operation "swapBlockList" to replace the set of blocks in an INode File with a new set of blocks. This JIRA will track the effort to add an FS Edit log op to persist this operation. > Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes. > > > Key: HDFS-15006 > URL: https://issues.apache.org/jira/browse/HDFS-15006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > In HDFS-14989, we add a new Namenode operation "swapBlockList" to replace the > set of blocks in an INode File with a new set of blocks. This JIRA will track > the effort to add an FS Edit log op to persist this operation. > We also need to increase the NamenodeLayoutVersion for this change. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.
[ https://issues.apache.org/jira/browse/HDFS-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDFS-14989: - Status: Patch Available (was: In Progress) > Add a 'swapBlockList' operation to Namenode. > > > Key: HDFS-14989 > URL: https://issues.apache.org/jira/browse/HDFS-14989 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Borrowing from the design doc. > bq. The swapBlockList takes two parameters, a source file and a destination > file. This operation swaps the blocks belonging to the source and the > destination atomically. > bq. The namespace metadata of interest is the INodeFile class. A file > (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, > BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile > contains a list of blocks (BlockInfo[]). The operation will swap > BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not > touch other fields. To avoid complication, this operation will abort if > either file is open (isUnderConstruction() == true) > bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST > to record the change persistently. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDFS-15008) Expose client API to carry out In-Place EC of a file.
[ https://issues.apache.org/jira/browse/HDFS-15008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDFS-15008: - Description: Converting a file will be a 3 step task: * Make a temporary, erasure-coded copy. * swapBlockList($src, $tmp) (HDFS-14989) * delete($tmp) The API should allow the EC blocks to use the storage policy of the original file through a config parameter. was: Converting a file will be a 3 step task: * Make a temporary, erasure-coded copy. * swapBlockList($src, $tmp) (HDFS-14989) * delete($tmp) > Expose client API to carry out In-Place EC of a file. > - > > Key: HDFS-15008 > URL: https://issues.apache.org/jira/browse/HDFS-15008 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Converting a file will be a 3 step task: > * Make a temporary, erasure-coded copy. > * swapBlockList($src, $tmp) (HDFS-14989) > * delete($tmp) > The API should allow the EC blocks to use the storage policy of the original > file through a config parameter. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
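The three-step conversion flow in the description above can be simulated with a small self-contained sketch. Everything here is hypothetical scaffolding: a plain map stands in for the Namenode namespace, the string values stand in for block lists, and the temp-file naming is an invented placeholder, not the real client API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical simulation of in-place EC conversion:
//   1. make a temporary erasure-coded copy,
//   2. swapBlockList(src, tmp) (HDFS-14989),
//   3. delete(tmp).
public class InPlaceEcDemo {
    final Map<String, String> namespace = new HashMap<>();

    // Step 1 (simulated): write a temporary erasure-coded copy of src.
    void makeEcCopy(String src, String tmp) {
        namespace.put(tmp, "EC:" + namespace.get(src));
    }

    // Step 2 (simulated): swap the block lists of src and tmp.
    void swapBlockList(String src, String tmp) {
        String srcBlocks = namespace.get(src);
        namespace.put(src, namespace.get(tmp));
        namespace.put(tmp, srcBlocks);
    }

    // Step 3: delete the tmp file, which after the swap holds the old
    // replicated blocks, leaving src pointing at the EC blocks.
    void convert(String src) {
        String tmp = src + ".__ec_tmp__"; // hypothetical temp-file name
        makeEcCopy(src, tmp);
        swapBlockList(src, tmp);
        namespace.remove(tmp);
    }
}
```

The config parameter mentioned in the description would additionally let step 1 inherit the original file's storage policy instead of an EC-specific one.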
[jira] [Work started] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.
[ https://issues.apache.org/jira/browse/HDFS-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDFS-14989 started by Aravindan Vijayan. > Add a 'swapBlockList' operation to Namenode. > > > Key: HDFS-14989 > URL: https://issues.apache.org/jira/browse/HDFS-14989 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Borrowing from the design doc. > bq. The swapBlockList takes two parameters, a source file and a destination > file. This operation swaps the blocks belonging to the source and the > destination atomically. > bq. The namespace metadata of interest is the INodeFile class. A file > (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, > BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile > contains a list of blocks (BlockInfo[]). The operation will swap > BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not > touch other fields. To avoid complication, this operation will abort if > either file is open (isUnderConstruction() == true) > bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST > to record the change persistently. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-14980) diskbalancer query command always tries to contact to port 9867
[ https://issues.apache.org/jira/browse/HDFS-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDFS-14980: Assignee: Aravindan Vijayan (was: Siddharth Wagle) > diskbalancer query command always tries to contact to port 9867 > --- > > Key: HDFS-14980 > URL: https://issues.apache.org/jira/browse/HDFS-14980 > Project: Hadoop HDFS > Issue Type: Bug > Components: diskbalancer >Reporter: Nilotpal Nandi >Assignee: Aravindan Vijayan >Priority: Major > > diskbalancer query command always tries to connect to port 9867 even when > datanode IPC port is different. > In this setup, datanode IPC port is set to 20001. > > diskbalancer report command works fine and connects to IPC port 20001 > > {noformat} > hdfs diskbalancer -report -node 172.27.131.193 > 19/11/12 08:58:55 INFO command.Command: Processing report command > 19/11/12 08:58:57 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 19/11/12 08:58:57 INFO block.BlockTokenSecretManager: Setting block keys > 19/11/12 08:58:57 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > 19/11/12 08:58:58 INFO command.Command: Reporting volume information for > DataNode(s). These DataNode(s) are parsed from '172.27.131.193'. > Processing report command > Reporting volume information for DataNode(s). These DataNode(s) are parsed > from '172.27.131.193'. > [172.27.131.193:20001] - : 3 > volumes with node data density 0.05. > [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK1/] - 0.15 used: > 39343871181/259692498944, 0.85 free: 220348627763/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False. > [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK2/] - 0.15 used: > 39371179986/259692498944, 0.85 free: 220321318958/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False. 
> [DISK: volume-/dataroot/ycloud/dfs/dn/] - 0.19 used: > 49934903670/259692498944, 0.81 free: 209757595274/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False. > > {noformat} > > But diskbalancer query command fails and tries to connect to port 9867 > (default port). > > {noformat} > hdfs diskbalancer -query 172.27.131.193 > 19/11/12 06:37:15 INFO command.Command: Executing "query plan" command. > 19/11/12 06:37:16 INFO ipc.Client: Retrying connect to server: > /172.27.131.193:9867. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 19/11/12 06:37:17 INFO ipc.Client: Retrying connect to server: > /172.27.131.193:9867. Already tried 1 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > .. > .. > .. > 19/11/12 06:37:25 ERROR tools.DiskBalancerCLI: Exception thrown while running > DiskBalancerCLI. > {noformat} > > > Expectation : > diskbalancer query command should work fine without explicitly mentioning > datanode IPC port address -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-14980) diskbalancer query command always tries to contact to port 9867
[ https://issues.apache.org/jira/browse/HDFS-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDFS-14980: Assignee: Siddharth Wagle (was: Aravindan Vijayan) > diskbalancer query command always tries to contact to port 9867 > --- > > Key: HDFS-14980 > URL: https://issues.apache.org/jira/browse/HDFS-14980 > Project: Hadoop HDFS > Issue Type: Bug > Components: diskbalancer >Reporter: Nilotpal Nandi >Assignee: Siddharth Wagle >Priority: Major > > diskbalancer query command always tries to connect to port 9867 even when > datanode IPC port is different. > In this setup, datanode IPC port is set to 20001. > > diskbalancer report command works fine and connects to IPC port 20001 > > {noformat} > hdfs diskbalancer -report -node 172.27.131.193 > 19/11/12 08:58:55 INFO command.Command: Processing report command > 19/11/12 08:58:57 INFO balancer.KeyManager: Block token params received from > NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec > 19/11/12 08:58:57 INFO block.BlockTokenSecretManager: Setting block keys > 19/11/12 08:58:57 INFO balancer.KeyManager: Update block keys every 2hrs, > 30mins, 0sec > 19/11/12 08:58:58 INFO command.Command: Reporting volume information for > DataNode(s). These DataNode(s) are parsed from '172.27.131.193'. > Processing report command > Reporting volume information for DataNode(s). These DataNode(s) are parsed > from '172.27.131.193'. > [172.27.131.193:20001] - : 3 > volumes with node data density 0.05. > [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK1/] - 0.15 used: > 39343871181/259692498944, 0.85 free: 220348627763/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False. > [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK2/] - 0.15 used: > 39371179986/259692498944, 0.85 free: 220321318958/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False. 
> [DISK: volume-/dataroot/ycloud/dfs/dn/] - 0.19 used: > 49934903670/259692498944, 0.81 free: 209757595274/259692498944, isFailed: > False, isReadOnly: False, isSkip: False, isTransient: False. > > {noformat} > > But diskbalancer query command fails and tries to connect to port 9867 > (default port). > > {noformat} > hdfs diskbalancer -query 172.27.131.193 > 19/11/12 06:37:15 INFO command.Command: Executing "query plan" command. > 19/11/12 06:37:16 INFO ipc.Client: Retrying connect to server: > /172.27.131.193:9867. Already tried 0 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > 19/11/12 06:37:17 INFO ipc.Client: Retrying connect to server: > /172.27.131.193:9867. Already tried 1 time(s); retry policy is > RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 > MILLISECONDS) > .. > .. > .. > 19/11/12 06:37:25 ERROR tools.DiskBalancerCLI: Exception thrown while running > DiskBalancerCLI. > {noformat} > > > Expectation : > diskbalancer query command should work fine without explicitly mentioning > datanode IPC port address -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15007) Make sure upgrades & downgrades work as expected with In-Place EC.
[ https://issues.apache.org/jira/browse/HDFS-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDFS-15007: Assignee: Aravindan Vijayan > Make sure upgrades & downgrades work as expected with In-Place EC. > -- > > Key: HDFS-15007 > URL: https://issues.apache.org/jira/browse/HDFS-15007 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Since we are adding a new FS Edit log op, we have to do some work to make > sure upgrades and downgrade work as expected. A request to change a file to > EC before finalize should not be allowed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDFS-15006) Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes.
[ https://issues.apache.org/jira/browse/HDFS-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDFS-15006: Assignee: Aravindan Vijayan > Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes. > > > Key: HDFS-15006 > URL: https://issues.apache.org/jira/browse/HDFS-15006 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > In HDFS-14989, we add a new Namenode operation "swapBlockList" to replace the > set of blocks in an INode File with a new set of blocks. This JIRA will track > the effort to add an FS Edit log op to persist this operation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15008) Expose client API to carry out In-Place EC of a file.
Aravindan Vijayan created HDFS-15008: Summary: Expose client API to carry out In-Place EC of a file. Key: HDFS-15008 URL: https://issues.apache.org/jira/browse/HDFS-15008 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Converting a file will be a 3 step task: * Make a temporary, erasure-coded copy. * swapBlockList($src, $tmp) (HDFS-14989) * delete($tmp) -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
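The three-step conversion described above can be sketched in plain Java. This is a hedged illustration only: the `FileSystemOps` interface, the method names other than `swapBlockList` (HDFS-14989), and the temp-path naming are hypothetical stand-ins, not the actual HDFS client API.

```java
import java.util.ArrayList;
import java.util.List;

public class InPlaceEcSketch {
    // Hypothetical stand-in for the client-facing operations; only
    // swapBlockList corresponds to a (proposed) real op, from HDFS-14989.
    interface FileSystemOps {
        void copyAsErasureCoded(String src, String tmp); // step 1: temporary EC copy
        void swapBlockList(String src, String tmp);      // step 2: swap the block lists
        void delete(String tmp);                         // step 3: drop the old blocks
    }

    static void convertToEc(FileSystemOps fs, String src) {
        String tmp = src + "._ec_tmp";       // hypothetical temp-file naming
        fs.copyAsErasureCoded(src, tmp);
        fs.swapBlockList(src, tmp);
        fs.delete(tmp);
    }

    public static void main(String[] args) {
        List<String> calls = new ArrayList<>();
        // Recording fake, just to show the order of operations.
        FileSystemOps fake = new FileSystemOps() {
            public void copyAsErasureCoded(String s, String t) { calls.add("copy"); }
            public void swapBlockList(String s, String t)      { calls.add("swap"); }
            public void delete(String t)                       { calls.add("delete"); }
        };
        convertToEc(fake, "/data/file1");
        System.out.println(calls); // [copy, swap, delete]
    }
}
```

The ordering matters: the swap must happen before the delete so the file's old replicated blocks are only released once the INode already points at the EC copy.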
[jira] [Created] (HDFS-15007) Make sure upgrades & downgrades work as expected with In-Place EC.
Aravindan Vijayan created HDFS-15007: Summary: Make sure upgrades & downgrades work as expected with In-Place EC. Key: HDFS-15007 URL: https://issues.apache.org/jira/browse/HDFS-15007 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Aravindan Vijayan Since we are adding a new FS Edit log op, we have to do some work to make sure upgrades and downgrades work as expected. A request to change a file to EC before finalize should not be allowed. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-15006) Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes.
Aravindan Vijayan created HDFS-15006: Summary: Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes. Key: HDFS-15006 URL: https://issues.apache.org/jira/browse/HDFS-15006 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Aravindan Vijayan In HDFS-14989, we add a new Namenode operation "swapBlockList" to replace the set of blocks in an INode File with a new set of blocks. This JIRA will track the effort to add an FS Edit log op to persist this operation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
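For context, entries in `FSEditLogOpCodes` pair each edit-log record type with a stable on-disk byte. A minimal sketch of that shape is below; the enum name and the byte values are placeholders for illustration, not the code the actual patch will reserve.

```java
// Sketch of an FSEditLogOpCodes-style enum. The byte assigned to an op is
// persisted in the edit log, so once released it must never be reused or
// renumbered. OP_SWAP_BLOCK_LIST would persist the swapBlockList operation
// from HDFS-14989; 54 here is an assumed placeholder value.
public enum EditOpCodeSketch {
    OP_ADD((byte) 0),
    OP_DELETE((byte) 2),
    OP_SWAP_BLOCK_LIST((byte) 54);   // hypothetical code for the new op

    private final byte opCode;

    EditOpCodeSketch(byte opCode) { this.opCode = opCode; }

    public byte getOpCode() { return opCode; }

    public static void main(String[] args) {
        System.out.println(OP_SWAP_BLOCK_LIST.getOpCode()); // 54
    }
}
```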
[jira] [Created] (HDDS-2590) Integration tests for Recon with Ozone Manager.
Aravindan Vijayan created HDDS-2590: --- Summary: Integration tests for Recon with Ozone Manager. Key: HDDS-2590 URL: https://issues.apache.org/jira/browse/HDDS-2590 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Recon Reporter: Aravindan Vijayan Fix For: 0.5.0 Currently, Recon has only unit tests. We need to add the following integration tests to make sure there are no regressions or contract breakage with Ozone Manager. The first step would be to add Recon as a new component to Mini Ozone cluster. * *Test 1* - Verify Recon can get full snapshot and subsequent delta updates from Ozone Manager on startup. > Start up a Mini Ozone cluster (with Recon) with a few keys in OM. > Verify Recon gets full DB snapshot from OM. > Add 100 keys to OM > Verify Recon picks up the new keys using the delta updates mechanism. > Verify OM DB seq number == Recon's OM DB snapshot's seq number * *Test 2* - Verify Recon restart does not cause issues with the OM DB syncing. > Startup Mini Ozone cluster (with Recon). > Add 100 keys to OM > Verify Recon picks up the new keys. > Stop Recon Server > Add 5 keys to OM. > Start Recon Server > Verify that Recon Server does not request full snapshot from OM (since only a small number of keys have been added, and hence Recon should be able to get the updates alone) > Verify OM DB seq number == Recon's OM DB snapshot's seq number *Note* : This exercise might expose a few bugs in Recon-OM integration which is perfectly normal and is the exact reason why we want these tests to be written. Please file JIRAs for any major issues encountered and link them here. Minor issues can hopefully be fixed as part of this effort. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2590) Integration tests for Recon with Ozone Manager.
[ https://issues.apache.org/jira/browse/HDDS-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2590: Description: Currently, Recon has only unit tests. We need to add the following integration tests to make sure there are no regressions or contract breakage with Ozone Manager. The first step would be to add Recon as a new component to Mini Ozone cluster. * *Test 1* - *Verify Recon can get full snapshot and subsequent delta updates from Ozone Manager on startup.* > Start up a Mini Ozone cluster (with Recon) with a few keys in OM. > Verify Recon gets full DB snapshot from OM. > Add 100 keys to OM > Verify Recon picks up the new keys using the delta updates mechanism. > Verify OM DB seq number == Recon's OM DB snapshot's seq number * *Test 2* - *Verify Recon restart does not cause issues with the OM DB syncing.* > Startup Mini Ozone cluster (with Recon). > Add 100 keys to OM > Verify Recon picks up the new keys. > Stop Recon Server > Add 5 keys to OM. > Start Recon Server > Verify that Recon Server does not request full snapshot from OM (since only a small number of keys have been added, and hence Recon should be able to get the updates alone) > Verify OM DB seq number == Recon's OM DB snapshot's seq number *Note* : This exercise might expose a few bugs in Recon-OM integration which is perfectly normal and is the exact reason why we want these tests to be written. Please file JIRAs for any major issues encountered and link them here. Minor issues can hopefully be fixed as part of this effort. was: Currently, Recon has only unit tests. We need to add the following integration tests to make sure there are no regressions or contract breakage with Ozone Manager. The first step would be to add Recon as a new component to Mini Ozone cluster. * *Test 1* - Verify Recon can get full snapshot and subsequent delta updates from Ozone Manager on startup. > Start up a Mini Ozone cluster (with Recon) with a few keys in OM. 
> Verify Recon gets full DB snapshot from OM. > Add 100 keys to OM > Verify Recon picks up the new keys using the delta updates mechanism. > Verify OM DB seq number == Recon's OM DB snapshot's seq number * *Test 2* - Verify Recon restart does not cause issues with the OM DB syncing. > Startup Mini Ozone cluster (with Recon). > Add 100 keys to OM > Verify Recon picks up the new keys. > Stop Recon Server > Add 5 keys to OM. > Start Recon Server > Verify that Recon Server does not request full snapshot from OM (since only a small number of keys have been added, and hence Recon should be able to get the updates alone) > Verify OM DB seq number == Recon's OM DB snapshot's seq number *Note* : This exercise might expose a few bugs in Recon-OM integration which is perfectly normal and is the exact reason why we want these tests to be written. Please file JIRAs for any major issues encountered and link them here. Minor issues can hopefully be fixed as part of this effort. > Integration tests for Recon with Ozone Manager. > --- > > Key: HDDS-2590 > URL: https://issues.apache.org/jira/browse/HDDS-2590 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Recon >Reporter: Aravindan Vijayan >Priority: Major > Fix For: 0.5.0 > > > Currently, Recon has only unit tests. We need to add the following > integration tests to make sure there are no regressions or contract breakage > with Ozone Manager. > The first step would be to add Recon as a new component to Mini Ozone cluster. > * *Test 1* - *Verify Recon can get full snapshot and subsequent delta updates > from Ozone Manager on startup.* > > Start up a Mini Ozone cluster (with Recon) with a few keys in OM. > > Verify Recon gets full DB snapshot from OM. > > Add 100 keys to OM > > Verify Recon picks up the new keys using the delta updates mechanism. 
> > Verify OM DB seq number == Recon's OM DB snapshot's seq number > * *Test 2* - *Verify Recon restart does not cause issues with the OM DB syncing.* > > Startup Mini Ozone cluster (with Recon). > > Add 100 keys to OM > > Verify Recon picks up the new keys. > > Stop Recon Server > > Add 5 keys to OM. > > Start Recon Server > > Verify that Recon Server does not request full snapshot from OM (since only a small number of keys have been added, and hence Recon should be able to get the updates alone) > > Verify OM DB seq number == Recon's OM DB snapshot's seq number > *Note* : This exercise might expose a few bugs in Recon-OM integration which is perfectly normal and is the exact reason why we want these tests to be written. Please file JIRAs for any major issues encountered and link them here. Minor issues can hopefully be fixed as part of this effort. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
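Both tests above end with the same invariant: the sequence number of Recon's OM DB snapshot must equal the live OM DB's. A hedged sketch of that check, using a stand-in `FakeDb` (the real check would compare RocksDB's `getLatestSequenceNumber()` on the two DBs; the class and method below are illustrative, not Recon code):

```java
public class ReconSyncCheck {
    // Hypothetical stand-in for a RocksDB-backed store.
    static class FakeDb {
        private long seq;
        void put(String key) { seq++; }                // every write bumps the sequence number
        long getLatestSequenceNumber() { return seq; }
    }

    // The invariant both integration tests assert after syncing.
    static boolean inSync(FakeDb omDb, FakeDb reconSnapshot) {
        return omDb.getLatestSequenceNumber() == reconSnapshot.getLatestSequenceNumber();
    }

    public static void main(String[] args) {
        FakeDb om = new FakeDb();
        FakeDb recon = new FakeDb();
        for (int i = 0; i < 100; i++) om.put("key" + i);    // 100 keys added to OM
        for (int i = 0; i < 100; i++) recon.put("key" + i); // Recon applies the delta updates
        System.out.println(inSync(om, recon)); // true: snapshot caught up
        om.put("key100");                      // one more OM write, not yet synced
        System.out.println(inSync(om, recon)); // false: Recon is behind
    }
}
```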
[jira] [Updated] (HDDS-1722) Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on startup
[ https://issues.apache.org/jira/browse/HDDS-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1722: Labels: newbie (was: ) > Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on > startup > - > > Key: HDDS-1722 > URL: https://issues.apache.org/jira/browse/HDDS-1722 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Recon >Reporter: Aravindan Vijayan >Assignee: Shweta >Priority: Major > Labels: newbie > > Currently the table creation is done for each schema definition one by one. > Setup sqlite DB and create Recon SQL tables. > cc [~vivekratnavel], [~swagle] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-1722) Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on startup
[ https://issues.apache.org/jira/browse/HDDS-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-1722: --- Assignee: (was: Shweta) > Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on > startup > - > > Key: HDDS-1722 > URL: https://issues.apache.org/jira/browse/HDDS-1722 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Recon >Reporter: Aravindan Vijayan >Priority: Major > Labels: newbie > > Currently the table creation is done for each schema definition one by one. > Setup sqlite DB and create Recon SQL tables. > cc [~vivekratnavel], [~swagle] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2579) Ozone client should refresh pipeline info if reads from all Datanodes fail.
[ https://issues.apache.org/jira/browse/HDDS-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2579: Description: Currently, if reads from all Datanodes in the pipeline fail, the read fails altogether. There may be a case when the container is moved to a new pipeline by the time the client reads. In this case, the client should request a pipeline refresh from OM, and retry the read if the new pipeline returned from OM is different. This behavior is consistent with that of HDFS. cc [~msingh] / [~shashikant] / [~hanishakoneru] was:Currently, if reads from all Datanodes in the pipeline fail, the read fails altogether. There may be a case when the container is moved to a new pipeline by the time the client reads. In this case, the client should request a pipeline refresh from OM, and retry the read if the new pipeline returned from OM is different. > Ozone client should refresh pipeline info if reads from all Datanodes fail. > --- > > Key: HDDS-2579 > URL: https://issues.apache.org/jira/browse/HDDS-2579 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Fix For: 0.5.0 > > > Currently, if reads from all Datanodes in the pipeline fail, the > read fails altogether. There may be a case when the container is moved to a > new pipeline by the time the client reads. In this case, the client should > request a pipeline refresh from OM, and retry the read if the new pipeline > returned from OM is different. > This behavior is consistent with that of HDFS. > cc [~msingh] / [~shashikant] / [~hanishakoneru] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
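The proposed retry flow can be sketched as follows. All names here (`Om`, `BlockReader`, `readWithRefresh`) are hypothetical stand-ins for illustration; the real Ozone client types are not shown in this issue.

```java
public class RefreshOnReadFailure {
    interface Om { String refreshPipeline(String containerId); }
    interface BlockReader { byte[] read(String pipeline) throws Exception; }

    // If every datanode in the cached pipeline fails, ask OM for fresh
    // pipeline info and retry once only if it actually changed — the
    // HDFS-like behaviour this issue proposes.
    static byte[] readWithRefresh(String containerId, String cachedPipeline,
                                  Om om, BlockReader reader) throws Exception {
        try {
            return reader.read(cachedPipeline);
        } catch (Exception allDatanodesFailed) {
            String fresh = om.refreshPipeline(containerId);
            if (!fresh.equals(cachedPipeline)) {
                return reader.read(fresh);  // container moved: retry on the new pipeline
            }
            throw allDatanodesFailed;       // nothing changed: surface the failure
        }
    }

    public static void main(String[] args) throws Exception {
        Om om = id -> "pipeline-2";         // OM now maps the container elsewhere
        BlockReader reader = p -> {
            if (p.equals("pipeline-1")) throw new Exception("all datanodes failed");
            return new byte[]{42};
        };
        byte[] data = readWithRefresh("c1", "pipeline-1", om, reader);
        System.out.println(data[0]); // 42: read succeeded on the refreshed pipeline
    }
}
```

Retrying only when the refreshed pipeline differs avoids a pointless second pass over the same failed datanodes.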
[jira] [Commented] (HDDS-2504) Handle InterruptedException properly
[ https://issues.apache.org/jira/browse/HDDS-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16978061#comment-16978061 ] Aravindan Vijayan commented on HDDS-2504: - Thank you [~adoroszlai] for filing this umbrella task. > Handle InterruptedException properly > > > Key: HDDS-2504 > URL: https://issues.apache.org/jira/browse/HDDS-2504 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Priority: Major > Labels: newbie, sonar > > {quote}Either re-interrupt or rethrow the {{InterruptedException}} > {quote} > in several files (42 issues) > [https://sonarcloud.io/project/issues?id=hadoop-ozone=false=squid%3AS2142=OPEN=BUG] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
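The fix the Sonar rule (squid:S2142) asks for is to either rethrow the `InterruptedException` or restore the thread's interrupt flag instead of swallowing it. A minimal sketch of the restore pattern (the `sleepQuietly` helper is hypothetical, not Ozone code):

```java
public class InterruptExample {
    // Catch blocks that swallow InterruptedException hide the interrupt from
    // callers. Restoring the flag keeps the interruption observable.
    static boolean sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
            return true;
        } catch (InterruptedException e) {
            // Re-interrupt instead of swallowing: sleep() cleared the flag.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();            // simulate a pending interrupt
        boolean completed = sleepQuietly(10);
        System.out.println(completed);                 // false: sleep was interrupted
        System.out.println(Thread.currentThread().isInterrupted()); // true: flag preserved
    }
}
```

Without the `interrupt()` call in the catch block, the second line would print `false` and the caller would never learn the thread had been interrupted.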
[jira] [Assigned] (HDDS-2573) Handle InterruptedException in KeyOutputStream
[ https://issues.apache.org/jira/browse/HDDS-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-2573: --- Assignee: Aravindan Vijayan > Handle InterruptedException in KeyOutputStream > -- > > Key: HDDS-2573 > URL: https://issues.apache.org/jira/browse/HDDS-2573 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Dinesh Chitlangia >Assignee: Aravindan Vijayan >Priority: Major > Labels: newbie, sonar > > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-m5KcVY8lQ4ZsAc=AW5md-m5KcVY8lQ4ZsAc -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2577) Handle InterruptedException in OzoneManagerProtocolServerSideTranslatorPB
[ https://issues.apache.org/jira/browse/HDDS-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-2577: --- Assignee: Aravindan Vijayan > Handle InterruptedException in OzoneManagerProtocolServerSideTranslatorPB > - > > Key: HDDS-2577 > URL: https://issues.apache.org/jira/browse/HDDS-2577 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Dinesh Chitlangia >Assignee: Aravindan Vijayan >Priority: Major > Labels: newbie, sonar > > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-Z7KcVY8lQ4Zr1l=AW5md-Z7KcVY8lQ4Zr1l -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2571) Handle InterruptedException in SCMPipelineManager
[ https://issues.apache.org/jira/browse/HDDS-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-2571: --- Assignee: Aravindan Vijayan > Handle InterruptedException in SCMPipelineManager > - > > Key: HDDS-2571 > URL: https://issues.apache.org/jira/browse/HDDS-2571 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task >Reporter: Dinesh Chitlangia >Assignee: Aravindan Vijayan >Priority: Major > Labels: newbie, sonar > > [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW6BMuREm2E_7tGaNiTh=AW6BMuREm2E_7tGaNiTh] > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2579) Ozone client should refresh pipeline info if reads from all Datanodes fail.
Aravindan Vijayan created HDDS-2579: --- Summary: Ozone client should refresh pipeline info if reads from all Datanodes fail. Key: HDDS-2579 URL: https://issues.apache.org/jira/browse/HDDS-2579 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Client Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Fix For: 0.5.0 Currently, if reads from all Datanodes in the pipeline fail, the read fails altogether. There may be a case when the container is moved to a new pipeline by the time the client reads. In this case, the client should request a pipeline refresh from OM, and retry the read if the new pipeline returned from OM is different. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1084) Ozone Recon Service v0
[ https://issues.apache.org/jira/browse/HDDS-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1084: Summary: Ozone Recon Service v0 (was: Ozone Recon Service v1) > Ozone Recon Service v0 > -- > > Key: HDDS-1084 > URL: https://issues.apache.org/jira/browse/HDDS-1084 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: Ozone Recon >Affects Versions: 0.4.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Fix For: 0.5.0 > > Attachments: Ozone_Recon_Design_V1_Draft.pdf > > > Recon Server at a high level will maintain a global view of Ozone that is not > available from SCM or OM. Things like how many volumes exist; and how many > buckets exist per volume; which volume has maximum buckets; which are buckets > that have not been accessed for a year, which are the corrupt blocks, which > are blocks on data nodes which are not used; and answer similar queries. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1084) Ozone Recon Service V0.1
[ https://issues.apache.org/jira/browse/HDDS-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1084: Summary: Ozone Recon Service V0.1 (was: Ozone Recon Service v0) > Ozone Recon Service V0.1 > > > Key: HDDS-1084 > URL: https://issues.apache.org/jira/browse/HDDS-1084 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: Ozone Recon >Affects Versions: 0.4.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Fix For: 0.5.0 > > Attachments: Ozone_Recon_Design_V1_Draft.pdf > > > Recon Server at a high level will maintain a global view of Ozone that is not > available from SCM or OM. Things like how many volumes exist; and how many > buckets exist per volume; which volume has maximum buckets; which are buckets > that have not been accessed for a year, which are the corrupt blocks, which > are blocks on data nodes which are not used; and answer similar queries. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1996) Ozone Recon Service v0.2
[ https://issues.apache.org/jira/browse/HDDS-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1996: Summary: Ozone Recon Service v0.2 (was: Ozone Recon Service v1.1) > Ozone Recon Service v0.2 > > > Key: HDDS-1996 > URL: https://issues.apache.org/jira/browse/HDDS-1996 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: Ozone Recon >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Aravindan Vijayan >Priority: Major > > Placeholder for all the tasks covered in the design doc for HDDS-1084 but did > not make it into 0.5.0 release and targeted towards the next release of Ozone > after 0.5.0. > An updated design document will be uploaded to this Jira. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-1084) Ozone Recon Service v1
[ https://issues.apache.org/jira/browse/HDDS-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan resolved HDDS-1084. - Resolution: Fixed Moved out all open JIRAs. They will be tracked under HDDS-1996. Resolving this JIRA. > Ozone Recon Service v1 > -- > > Key: HDDS-1084 > URL: https://issues.apache.org/jira/browse/HDDS-1084 > Project: Hadoop Distributed Data Store > Issue Type: New Feature > Components: Ozone Recon >Affects Versions: 0.4.0 >Reporter: Siddharth Wagle >Assignee: Siddharth Wagle >Priority: Major > Fix For: 0.5.0 > > Attachments: Ozone_Recon_Design_V1_Draft.pdf > > > Recon Server at a high level will maintain a global view of Ozone that is not > available from SCM or OM. Things like how many volumes exist; and how many > buckets exist per volume; which volume has maximum buckets; which are buckets > that have not been accessed for a year, which are the corrupt blocks, which > are blocks on data nodes which are not used; and answer similar queries. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1335) Basic Recon UI for serving up container key mapping.
[ https://issues.apache.org/jira/browse/HDDS-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1335: Issue Type: Task (was: Bug) > Basic Recon UI for serving up container key mapping. > > > Key: HDDS-1335 > URL: https://issues.apache.org/jira/browse/HDDS-1335 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Recon >Reporter: Aravindan Vijayan >Assignee: Vivek Ratnavel Subramanian >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1873) Add API to get last completed times for every Recon task.
[ https://issues.apache.org/jira/browse/HDDS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1873: Parent: (was: HDDS-1084) Issue Type: Task (was: Sub-task) > Add API to get last completed times for every Recon task. > - > > Key: HDDS-1873 > URL: https://issues.apache.org/jira/browse/HDDS-1873 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Recon >Affects Versions: 0.4.1 >Reporter: Vivek Ratnavel Subramanian >Assignee: Shweta >Priority: Major > > Recon stores the timestamp of the last received Ozone Manager snapshot along with > timestamps of the last successful run for each task. > We can add a REST endpoint that returns the contents of this stored data. > This is important to give users a sense of how up-to-date the data they > are looking at is. And we need this per task because some tasks might > fail to run or might take much longer to run than other tasks, and this > needs to be reflected in the UI for a better and consistent user experience. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1722) Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on startup
[ https://issues.apache.org/jira/browse/HDDS-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1722: Parent: (was: HDDS-1084) Issue Type: Task (was: Sub-task) > Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on > startup > - > > Key: HDDS-1722 > URL: https://issues.apache.org/jira/browse/HDDS-1722 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Recon >Reporter: Aravindan Vijayan >Assignee: Shweta >Priority: Major > > Currently the table creation is done for each schema definition one by one. > Setup sqlite DB and create Recon SQL tables. > cc [~vivekratnavel], [~swagle] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1367) Add ability in Recon to track the growth rate of the cluster.
[ https://issues.apache.org/jira/browse/HDDS-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1367: Parent: (was: HDDS-1084) Issue Type: Task (was: Sub-task) > Add ability in Recon to track the growth rate of the cluster. > -- > > Key: HDDS-1367 > URL: https://issues.apache.org/jira/browse/HDDS-1367 > Project: Hadoop Distributed Data Store > Issue Type: Task > Components: Ozone Recon >Reporter: Aravindan Vijayan >Assignee: Vivek Ratnavel Subramanian >Priority: Major > > Recon should be able to answer the question "How fast is the cluster growing, > by week, by month, by day?", which gives the user an idea of the usage stats > of the cluster. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1335) Basic Recon UI for serving up container key mapping.
[ https://issues.apache.org/jira/browse/HDDS-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1335: Parent: (was: HDDS-1084) Issue Type: Bug (was: Sub-task) > Basic Recon UI for serving up container key mapping. > > > Key: HDDS-1335 > URL: https://issues.apache.org/jira/browse/HDDS-1335 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Reporter: Aravindan Vijayan >Assignee: Vivek Ratnavel Subramanian >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2510) Use isEmpty() to check whether the collection is empty or not in Ozone Manager module
[ https://issues.apache.org/jira/browse/HDDS-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2510: Summary: Use isEmpty() to check whether the collection is empty or not in Ozone Manager module (was: Sonar : Use isEmpty() to check whether the collection is empty or not in Ozone Manager module) > Use isEmpty() to check whether the collection is empty or not in Ozone > Manager module > - > > Key: HDDS-2510 > URL: https://issues.apache.org/jira/browse/HDDS-2510 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: sonar > > Ozone Manager has a number of instances where the following sonar rule is > flagged. > bq. Using Collection.size() to test for emptiness works, but using > Collection.isEmpty() makes the code more readable and can be more performant. > The time complexity of any isEmpty() method implementation should be O(1) > whereas some implementations of size() can be O(n). > An example of a flagged instance - > https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
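The rewrite the rule asks for is mechanical. A small sketch (the `hasAcls` helper is hypothetical; the flagged Ozone Manager call sites are not shown here):

```java
import java.util.Collections;
import java.util.List;

public class EmptyCheckExample {
    // Flagged form would be: acls.size() == 0 — it works, but isEmpty()
    // states intent directly and is O(1) for every collection implementation,
    // whereas size() can be O(n) for some.
    static boolean hasAcls(List<String> acls) {
        return !acls.isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(hasAcls(Collections.emptyList())); // false
        System.out.println(hasAcls(List.of("user:rw")));      // true
    }
}
```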
[jira] [Updated] (HDDS-2511) Fix Sonar issues in OzoneManagerServiceProviderImpl
[ https://issues.apache.org/jira/browse/HDDS-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2511: Summary: Fix Sonar issues in OzoneManagerServiceProviderImpl (was: Sonar : Fix Sonar issues in OzoneManagerServiceProviderImpl) > Fix Sonar issues in OzoneManagerServiceProviderImpl > --- > > Key: HDDS-2511 > URL: https://issues.apache.org/jira/browse/HDDS-2511 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: sonar > Fix For: 0.5.0 > > > Link to the list of issues : > https://sonarcloud.io/project/issues?fileUuids=AW5md-HdKcVY8lQ4ZrUn=hadoop-ozone=false -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2511) Sonar : Fix Sonar issues in OzoneManagerServiceProviderImpl
[ https://issues.apache.org/jira/browse/HDDS-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2511: Labels: sonar (was: ) > Sonar : Fix Sonar issues in OzoneManagerServiceProviderImpl > --- > > Key: HDDS-2511 > URL: https://issues.apache.org/jira/browse/HDDS-2511 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: sonar > Fix For: 0.5.0 > > > Link to the list of issues : > https://sonarcloud.io/project/issues?fileUuids=AW5md-HdKcVY8lQ4ZrUn=hadoop-ozone=false -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2511) Sonar : Fix Sonar issues in OzoneManagerServiceProviderImpl
Aravindan Vijayan created HDDS-2511: --- Summary: Sonar : Fix Sonar issues in OzoneManagerServiceProviderImpl Key: HDDS-2511 URL: https://issues.apache.org/jira/browse/HDDS-2511 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Recon Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Fix For: 0.5.0 Link to the list of issues : https://sonarcloud.io/project/issues?fileUuids=AW5md-HdKcVY8lQ4ZrUn=hadoop-ozone=false -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2510) Sonar : Use isEmpty() to check whether the collection is empty or not in Ozone Manager module
[ https://issues.apache.org/jira/browse/HDDS-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2510: Description: Ozone Manager has a number of instances where the following sonar rule is flagged. \bq. Using Collection.size() to test for emptiness works, but using Collection.isEmpty() makes the code more readable and can be more performant. The time complexity of any isEmpty() method implementation should be O(1) whereas some implementations of size() can be O(n). An example of a flagged instance - https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false > Sonar : Use isEmpty() to check whether the collection is empty or not in > Ozone Manager module > - > > Key: HDDS-2510 > URL: https://issues.apache.org/jira/browse/HDDS-2510 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: sonar > > Ozone Manager has a number of instances where the following sonar rule is > flagged. > \bq. Using Collection.size() to test for emptiness works, but using > Collection.isEmpty() makes the code more readable and can be more performant. > The time complexity of any isEmpty() method implementation should be O(1) > whereas some implementations of size() can be O(n). > An example of a flagged instance - > https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2510) Sonar : Use isEmpty() to check whether the collection is empty or not in Ozone Manager module
[ https://issues.apache.org/jira/browse/HDDS-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2510: Description: Ozone Manager has a number of instances where the following sonar rule is flagged. bq. Using Collection.size() to test for emptiness works, but using Collection.isEmpty() makes the code more readable and can be more performant. The time complexity of any isEmpty() method implementation should be O(1) whereas some implementations of size() can be O(n). An example of a flagged instance - https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false was: Ozone Manager has a number of instances where the following sonar rule is flagged. \bq. Using Collection.size() to test for emptiness works, but using Collection.isEmpty() makes the code more readable and can be more performant. The time complexity of any isEmpty() method implementation should be O(1) whereas some implementations of size() can be O(n). An example of a flagged instance - https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false > Sonar : Use isEmpty() to check whether the collection is empty or not in > Ozone Manager module > - > > Key: HDDS-2510 > URL: https://issues.apache.org/jira/browse/HDDS-2510 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: sonar > > Ozone Manager has a number of instances where the following sonar rule is > flagged. > bq. Using Collection.size() to test for emptiness works, but using > Collection.isEmpty() makes the code more readable and can be more performant. > The time complexity of any isEmpty() method implementation should be O(1) > whereas some implementations of size() can be O(n). 
> An example of a flagged instance - > https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false
[jira] [Created] (HDDS-2510) Sonar : Use isEmpty() to check whether the collection is empty or not in Ozone Manager module
Aravindan Vijayan created HDDS-2510: --- Summary: Sonar : Use isEmpty() to check whether the collection is empty or not in Ozone Manager module Key: HDDS-2510 URL: https://issues.apache.org/jira/browse/HDDS-2510 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan
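The Sonar rule quoted in HDDS-2510 can be illustrated with a minimal, self-contained sketch (the class and method names below are hypothetical, not actual Ozone Manager code):

```java
import java.util.ArrayList;
import java.util.List;

public class EmptyCheck {
    // Pattern flagged by Sonar: size() may be O(n) for some Collection
    // implementations, and the intent is less obvious.
    static boolean isEmptyViaSize(List<String> keys) {
        return keys.size() == 0;
    }

    // Preferred pattern: isEmpty() should be O(1) and reads more clearly.
    static boolean isEmptyViaIsEmpty(List<String> keys) {
        return keys.isEmpty();
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        // Both forms agree on the result; only readability and complexity differ.
        System.out.println(isEmptyViaSize(keys) == isEmptyViaIsEmpty(keys));
    }
}
```

The fix is purely mechanical, which is why a whole module can be swept in one change.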
[jira] [Assigned] (HDDS-2488) Not enough arguments for log messages in GrpcXceiverService
[ https://issues.apache.org/jira/browse/HDDS-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-2488: --- Assignee: Aravindan Vijayan > Not enough arguments for log messages in GrpcXceiverService > --- > > Key: HDDS-2488 > URL: https://issues.apache.org/jira/browse/HDDS-2488 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Attila Doroszlai >Assignee: Aravindan Vijayan >Priority: Minor > Labels: sonar > > GrpcXceiverService has log messages with too few arguments for placeholders. > Only one of them is flagged by Sonar, but all seem to have the same problem. > https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-69KcVY8lQ4ZsRZ=AW5md-69KcVY8lQ4ZsRZ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
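The placeholder/argument mismatch described in HDDS-2488 can be shown without depending on SLF4J itself. The helper below is a simplified stand-in, not the real logging framework: it just counts `{}` placeholders so the mismatch is visible:

```java
public class PlaceholderCheck {
    // Count SLF4J-style "{}" placeholders in a log format string.
    static int placeholderCount(String fmt) {
        int count = 0;
        for (int i = 0; i + 1 < fmt.length(); i++) {
            if (fmt.charAt(i) == '{' && fmt.charAt(i + 1) == '}') {
                count++;
                i++; // skip the '}' we just matched
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Buggy pattern: two placeholders but only one argument supplied --
        // at runtime the second "{}" is emitted literally in the log line.
        String fmt = "{} command failed on pipeline {}";
        Object[] suppliedArgs = { "WriteChunk" };
        System.out.println(placeholderCount(fmt) == suppliedArgs.length); // false: mismatch
    }
}
```

The fix in each flagged call is either to add the missing argument or to drop the extra `{}`.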
[jira] [Updated] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.
[ https://issues.apache.org/jira/browse/HDFS-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDFS-14989: - Description: Borrowing from the design doc. bq. The swapBlockList takes two parameters, a source file and a destination file. This operation swaps the blocks belonging to the source and the destination atomically. bq. The namespace metadata of interest is the INodeFile class. A file (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile contains a list of blocks (BlockInfo[]). The operation will swap BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not touch other fields. To avoid complication, this operation will abort if either file is open (isUnderConstruction() == true) bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST to record the change persistently. was: Borrowing from the design doc. bq. The swapBlockList takes two parameters, a source file and a destination file. This operation swaps the blocks belonging to the source and the destination atomically. The namespace metadata of interest is the INodeFile class. A file (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile contains a list of blocks (BlockInfo[]). The operation will swap BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not touch other fields. To avoid complication, this operation will abort if either file is open (isUnderConstruction() == true) Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST to record the change persistently. > Add a 'swapBlockList' operation to Namenode. 
> > > Key: HDFS-14989 > URL: https://issues.apache.org/jira/browse/HDFS-14989 > Project: Hadoop HDFS > Issue Type: Sub-task >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Borrowing from the design doc. > bq. The swapBlockList takes two parameters, a source file and a destination > file. This operation swaps the blocks belonging to the source and the > destination atomically. > bq. The namespace metadata of interest is the INodeFile class. A file > (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, > BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile > contains a list of blocks (BlockInfo[]). The operation will swap > BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not > touch other fields. To avoid complication, this operation will abort if > either file is open (isUnderConstruction() == true) > bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST > to record the change persistently. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.
Aravindan Vijayan created HDFS-14989: Summary: Add a 'swapBlockList' operation to Namenode. Key: HDFS-14989 URL: https://issues.apache.org/jira/browse/HDFS-14989 Project: Hadoop HDFS Issue Type: Sub-task Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Borrowing from the design doc. bq. The swapBlockList takes two parameters, a source file and a destination file. This operation swaps the blocks belonging to the source and the destination atomically. The namespace metadata of interest is the INodeFile class. A file (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile contains a list of blocks (BlockInfo[]). The operation will swap BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not touch other fields. To avoid complication, this operation will abort if either file is open (isUnderConstruction() == true) Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST to record the change persistently. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
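The design-doc excerpt above can be sketched in miniature. Everything below is a hypothetical simplification — `FileMeta` stands in for HDFS's `INodeFile`, and the real implementation must also hold the namespace lock and write the new `OP_SWAP_BLOCK_LIST` edit-log record:

```java
import java.util.Arrays;

public class SwapBlockListSketch {
    // Hypothetical stand-in for INodeFile; the real class differs substantially.
    static class FileMeta {
        long headerBits;         // stands in for BLOCK_LAYOUT_AND_REDUNDANCY bits
        String[] blocks;         // stands in for the BlockInfo[] list
        boolean underConstruction;
    }

    // Swap the layout/redundancy header bits and the block lists of two files;
    // abort if either file is open, as the design doc requires.
    static void swapBlockList(FileMeta src, FileMeta dst) {
        if (src.underConstruction || dst.underConstruction) {
            throw new IllegalStateException("cannot swap: a file is under construction");
        }
        long h = src.headerBits; src.headerBits = dst.headerBits; dst.headerBits = h;
        String[] b = src.blocks; src.blocks = dst.blocks; dst.blocks = b;
        // A real implementation would log OP_SWAP_BLOCK_LIST here for persistence.
    }

    public static void main(String[] args) {
        FileMeta a = new FileMeta(); a.blocks = new String[]{"blk_1"};
        FileMeta b = new FileMeta(); b.blocks = new String[]{"blk_2", "blk_3"};
        swapBlockList(a, b);
        System.out.println(Arrays.toString(a.blocks)); // [blk_2, blk_3]
    }
}
```

Note that other INodeFile fields (PREFERRED_BLOCK_SIZE, STORAGE_POLICY_ID) are deliberately left untouched, matching the design doc.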
[jira] [Updated] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
[ https://issues.apache.org/jira/browse/HDDS-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2473: Description: sonarcloud.io has flagged a number of code reliability issues in Ozone recon (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). Following issues will be triaged / fixed. * Double Brace Initialization should not be used * Resources should be closed * InterruptedException should not be ignored > Fix code reliability issues found by Sonar in Ozone Recon module. > - > > Key: HDDS-2473 > URL: https://issues.apache.org/jira/browse/HDDS-2473 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Recon >Affects Versions: 0.5.0 >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Fix For: 0.5.0 > > > sonarcloud.io has flagged a number of code reliability issues in Ozone recon > (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon). > Following issues will be triaged / fixed. > * Double Brace Initialization should not be used > * Resources should be closed > * InterruptedException should not be ignored -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.
Aravindan Vijayan created HDDS-2473: --- Summary: Fix code reliability issues found by Sonar in Ozone Recon module. Key: HDDS-2473 URL: https://issues.apache.org/jira/browse/HDDS-2473 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Recon Affects Versions: 0.5.0 Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Fix For: 0.5.0
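The first rule listed in HDDS-2473 — "Double Brace Initialization should not be used" — is worth a short illustration; the map contents below are hypothetical, not taken from the Recon code:

```java
import java.util.HashMap;
import java.util.Map;

public class DoubleBraceExample {
    // Flagged pattern: "double brace" creates an anonymous HashMap subclass
    // per use site, adding a class file and (in instance context) pinning a
    // reference to the enclosing object.
    static Map<String, String> flagged() {
        return new HashMap<String, String>() {{
            put("ozone.recon.db.dir", "/tmp/recon");
        }};
    }

    // Preferred: plain initialization (or Map.of(...) on Java 9+).
    static Map<String, String> preferred() {
        Map<String, String> conf = new HashMap<>();
        conf.put("ozone.recon.db.dir", "/tmp/recon");
        return conf;
    }

    public static void main(String[] args) {
        // Both produce equal contents; only the construction mechanism differs.
        System.out.println(flagged().equals(preferred())); // true
    }
}
```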
[jira] [Created] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.
Aravindan Vijayan created HDDS-2472: --- Summary: Use try-with-resources while creating FlushOptions in RDBStore. Key: HDDS-2472 URL: https://issues.apache.org/jira/browse/HDDS-2472 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Affects Versions: 0.5.0 Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Fix For: 0.5.0 Link to the sonar issue flag - https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4.
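The shape of the HDDS-2472 fix can be sketched without a RocksDB dependency. The `FlushOptions` class below is a hypothetical stand-in for `org.rocksdb.FlushOptions` (which is `AutoCloseable` and wraps native memory); the point is only the try-with-resources pattern:

```java
public class TryWithResourcesSketch {
    // Hypothetical stand-in for org.rocksdb.FlushOptions.
    static class FlushOptions implements AutoCloseable {
        boolean closed = false;
        FlushOptions setWaitForFlush(boolean wait) { return this; }
        @Override public void close() { closed = true; }
    }

    static FlushOptions last; // kept only so main() can observe close()

    static void flush() {
        // try-with-resources guarantees close() runs even if the flush throws,
        // which is what the Sonar flag on RDBStore asks for.
        try (FlushOptions opts = new FlushOptions()) {
            last = opts;
            opts.setWaitForFlush(true);
            // db.flush(opts) would go here in the real RDBStore code.
        }
    }

    public static void main(String[] args) {
        flush();
        System.out.println(last.closed); // true
    }
}
```

Without the try-with-resources block, an exception between construction and a manual `close()` would leak the native handle.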
[jira] [Commented] (HDDS-1367) Add ability in Recon to track the growth rate of the cluster.
[ https://issues.apache.org/jira/browse/HDDS-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971870#comment-16971870 ] Aravindan Vijayan commented on HDDS-1367: - [~dineshchitlangia] Apologies for the delayed response. In this JIRA, we will be covering only the growth of the cluster from the usage point of view. We will have to track the read/write ops through metrics from OM, SCM and Datanode. > Add ability in Recon to track the growth rate of the cluster. > -- > > Key: HDDS-1367 > URL: https://issues.apache.org/jira/browse/HDDS-1367 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Recon >Reporter: Aravindan Vijayan >Assignee: Vivek Ratnavel Subramanian >Priority: Major > > Recon should be able to answer the question "How fast is the cluster growing, > by week, by month, by day?", which gives the user an idea of the usage stats > of the cluster. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2392) Fix TestScmSafeMode#testSCMSafeModeRestrictedOp
[ https://issues.apache.org/jira/browse/HDDS-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969503#comment-16969503 ] Aravindan Vijayan commented on HDDS-2392: - Thank you for reporting this [~hanishakoneru]. Created RATIS-747 to track the fix. > Fix TestScmSafeMode#testSCMSafeModeRestrictedOp > --- > > Key: HDDS-2392 > URL: https://issues.apache.org/jira/browse/HDDS-2392 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Hanisha Koneru >Assignee: Hanisha Koneru >Priority: Blocker > > After ratis upgrade (HDDS-2340), TestScmSafeMode#testSCMSafeModeRestrictedOp > fails as the DNs fail to restart XceiverServerRatis. > RaftServer#start() fails with following exception: > {code:java} > java.io.IOException: java.lang.IllegalStateException: Not started > at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54) > at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61) > at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70) > at > org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:284) > at > org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:296) > at > org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.start(XceiverServerRatis.java:421) > at > org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:215) > at > org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:110) > at > org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > Caused by: java.lang.IllegalStateException: Not started > at > org.apache.ratis.thirdparty.com.google.common.base.Preconditions.checkState(Preconditions.java:504) > at > org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.getPort(ServerImpl.java:176) > at > org.apache.ratis.grpc.server.GrpcService.lambda$new$2(GrpcService.java:143) > at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62) > at > org.apache.ratis.grpc.server.GrpcService.getInetSocketAddress(GrpcService.java:182) > at > org.apache.ratis.server.impl.RaftServerImpl.lambda$new$0(RaftServerImpl.java:84) > at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62) > at > org.apache.ratis.server.impl.RaftServerImpl.getPeer(RaftServerImpl.java:136) > at > org.apache.ratis.server.impl.RaftServerMetrics.(RaftServerMetrics.java:70) > at > org.apache.ratis.server.impl.RaftServerMetrics.getRaftServerMetrics(RaftServerMetrics.java:62) > at > org.apache.ratis.server.impl.RaftServerImpl.(RaftServerImpl.java:119) > at > org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208) > at > java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1873) Add API to get last completed times for every Recon task.
[ https://issues.apache.org/jira/browse/HDDS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1873: Description: Recon stores the last ozone manager snapshot received timestamp along with timestamps of last successful run for each task. We can add a REST endpoint that returns the contents of this stored data. This is important to give users a sense of how latest the current data that they are looking at is. And, we need this per task because some tasks might fail to run or might take much longer time to run than other tasks and this needs to be reflected in the UI for better and consistent user experience. was: Recon stores the last ozone manager snapshot received timestamp along with timestamps of last successful run for each task. This is important to give users a sense of how latest the current data that they are looking at is. And, we need this per task because some tasks might fail to run or might take much longer time to run than other tasks and this needs to be reflected in the UI for better and consistent user experience. > Add API to get last completed times for every Recon task. > - > > Key: HDDS-1873 > URL: https://issues.apache.org/jira/browse/HDDS-1873 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Recon >Affects Versions: 0.4.1 >Reporter: Vivek Ratnavel Subramanian >Assignee: Shweta >Priority: Major > > Recon stores the last ozone manager snapshot received timestamp along with > timestamps of last successful run for each task. > We can add a REST endpoint that returns the contents of this stored data. > This is important to give users a sense of how latest the current data that > they are looking at is. And, we need this per task because some tasks might > fail to run or might take much longer time to run than other tasks and this > needs to be reflected in the UI for better and consistent user experience. 
[jira] [Updated] (HDDS-1873) Add API to get last completed times for every Recon task.
[ https://issues.apache.org/jira/browse/HDDS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1873: Description: Recon stores the last ozone manager snapshot received timestamp along with timestamps of last successful run for each task. This is important to give users a sense of how latest the current data that they are looking at is. And, we need this per task because some tasks might fail to run or might take much longer time to run than other tasks and this needs to be reflected in the UI for better and consistent user experience. was: Recon stores the last ozone manager snapshot received timestamp along with timestamps of last successful run for each task. The This is important to give users a sense of how latest the current data that they are looking at is. And, we need this per task because some tasks might fail to run or might take much longer time to run than other tasks and this needs to be reflected in the UI for better and consistent user experience. > Add API to get last completed times for every Recon task. > - > > Key: HDDS-1873 > URL: https://issues.apache.org/jira/browse/HDDS-1873 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Recon >Affects Versions: 0.4.1 >Reporter: Vivek Ratnavel Subramanian >Assignee: Shweta >Priority: Major > > Recon stores the last ozone manager snapshot received timestamp along with > timestamps of last successful run for each task. > This is important to give users a sense of how latest the current data that > they are looking at is. And, we need this per task because some tasks might > fail to run or might take much longer time to run than other tasks and this > needs to be reflected in the UI for better and consistent user experience. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1873) Add API to get last completed times for every Recon task.
[ https://issues.apache.org/jira/browse/HDDS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1873: Description: Recon stores the last ozone manager snapshot received timestamp along with timestamps of last successful run for each task. The This is important to give users a sense of how latest the current data that they are looking at is. And, we need this per task because some tasks might fail to run or might take much longer time to run than other tasks and this needs to be reflected in the UI for better and consistent user experience. was: Recon should store last ozone manager snapshot received timestamp along with timestamps of last successful run for each task. This is important to give users a sense of how latest the current data that they are looking at is. And, we need this per task because some tasks might fail to run or might take much longer time to run than other tasks and this needs to be reflected in the UI for better and consistent user experience. > Add API to get last completed times for every Recon task. > - > > Key: HDDS-1873 > URL: https://issues.apache.org/jira/browse/HDDS-1873 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Recon >Affects Versions: 0.4.1 >Reporter: Vivek Ratnavel Subramanian >Assignee: Shweta >Priority: Major > > Recon stores the last ozone manager snapshot received timestamp along with > timestamps of last successful run for each task. > The > This is important to give users a sense of how latest the current data that > they are looking at is. And, we need this per task because some tasks might > fail to run or might take much longer time to run than other tasks and this > needs to be reflected in the UI for better and consistent user experience. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-1873) Add API to get last completed times for every Recon task.
[ https://issues.apache.org/jira/browse/HDDS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-1873: Summary: Add API to get last completed times for every Recon task. (was: Recon should store last successful run timestamp for each task) > Add API to get last completed times for every Recon task. > - > > Key: HDDS-1873 > URL: https://issues.apache.org/jira/browse/HDDS-1873 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: Ozone Recon >Affects Versions: 0.4.1 >Reporter: Vivek Ratnavel Subramanian >Assignee: Shweta >Priority: Major > > Recon should store last ozone manager snapshot received timestamp along with > timestamps of last successful run for each task. > This is important to give users a sense of how latest the current data that > they are looking at is. And, we need this per task because some tasks might > fail to run or might take much longer time to run than other tasks and this > needs to be reflected in the UI for better and consistent user experience. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2388) Teragen test failure due to OM exception
[ https://issues.apache.org/jira/browse/HDDS-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964434#comment-16964434 ] Aravindan Vijayan commented on HDDS-2388: - [~shashikant] Is the OM crashing due to this error? This is on a different thread than the OM read/write path. > Teragen test failure due to OM exception > > > Key: HDDS-2388 > URL: https://issues.apache.org/jira/browse/HDDS-2388 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Shashikant Banerjee >Priority: Major > Fix For: 0.5.0 > > > Ran into below exception while running teragen: > {code:java} > Unable to get delta updates since sequenceNumber 79932 > org.rocksdb.RocksDBException: Requested sequence not yet written in the db > at org.rocksdb.RocksDB.getUpdatesSince(Native Method) > at org.rocksdb.RocksDB.getUpdatesSince(RocksDB.java:3587) > at > org.apache.hadoop.hdds.utils.db.RDBStore.getUpdatesSince(RDBStore.java:338) > at > org.apache.hadoop.ozone.om.OzoneManager.getDBUpdates(OzoneManager.java:3283) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.getOMDBUpdates(OzoneManagerRequestHandler.java:404) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handle(OzoneManagerRequestHandler.java:314) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:219) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134) > at > org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72) > at > org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:102) > at > 
org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:984) > at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:912) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:422) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2364) Add a OM metrics to find the false positive rate for the keyMayExist
[ https://issues.apache.org/jira/browse/HDDS-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964433#comment-16964433 ] Aravindan Vijayan commented on HDDS-2364: - [~msingh] Thanks for the review. I will raise follow up JIRAs for metrics that are not already exposed through RocksDB. > Add a OM metrics to find the false positive rate for the keyMayExist > > > Key: HDDS-2364 > URL: https://issues.apache.org/jira/browse/HDDS-2364 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Add a OM metrics to find the false positive rate for the keyMayExist. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Resolved] (HDDS-2355) Om double buffer flush termination with rocksdb error
[ https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan resolved HDDS-2355. - Resolution: Cannot Reproduce > Om double buffer flush termination with rocksdb error > - > > Key: HDDS-2355 > URL: https://issues.apache.org/jira/browse/HDDS-2355 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Aravindan Vijayan >Priority: Blocker > Fix For: 0.5.0 > > > om_1 |java.io.IOException: Unable to write the batch. > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/]) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/]) > om_1 |at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146) > om_1 |at java.base/java.lang.Thread.run(Thread.java:834) > om_1 |Caused by: org.rocksdb.RocksDBException: > WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in > default WriteCommitted mode). If it is not due to corruption, the WAL must be > emptied before changing the WritePolicy. > om_1 |at org.rocksdb.RocksDB.write0(Native Method) > om_1 |at org.rocksdb.RocksDB.write(RocksDB.java:1421) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/]) > > In few of my test run's i see this error and OM is terminated. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2355) Om double buffer flush termination with rocksdb error
[ https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2355: Status: Reopened (was: Reopened) > Om double buffer flush termination with rocksdb error > - > > Key: HDDS-2355 > URL: https://issues.apache.org/jira/browse/HDDS-2355 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Aravindan Vijayan >Priority: Blocker > Fix For: 0.5.0 > > > om_1 |java.io.IOException: Unable to write the batch. > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/]) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/]) > om_1 |at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146) > om_1 |at java.base/java.lang.Thread.run(Thread.java:834) > om_1 |Caused by: org.rocksdb.RocksDBException: > WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in > default WriteCommitted mode). If it is not due to corruption, the WAL must be > emptied before changing the WritePolicy. > om_1 |at org.rocksdb.RocksDB.write0(Native Method) > om_1 |at org.rocksdb.RocksDB.write(RocksDB.java:1421) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/]) > > In few of my test run's i see this error and OM is terminated. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Reopened] (HDDS-2355) Om double buffer flush termination with rocksdb error
[ https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reopened HDDS-2355: - Reopening for changing Resolution. > Om double buffer flush termination with rocksdb error > - > > Key: HDDS-2355 > URL: https://issues.apache.org/jira/browse/HDDS-2355 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Aravindan Vijayan >Priority: Blocker > Fix For: 0.5.0 > > > om_1 |java.io.IOException: Unable to write the batch. > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/]) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/]) > om_1 |at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146) > om_1 |at java.base/java.lang.Thread.run(Thread.java:834) > om_1 |Caused by: org.rocksdb.RocksDBException: > WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in > default WriteCommitted mode). If it is not due to corruption, the WAL must be > emptied before changing the WritePolicy. > om_1 |at org.rocksdb.RocksDB.write0(Native Method) > om_1 |at org.rocksdb.RocksDB.write(RocksDB.java:1421) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/]) > > In few of my test run's i see this error and OM is terminated. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2380) Use the Table.isExist API instead of get() call while checking for presence of key.
[ https://issues.apache.org/jira/browse/HDDS-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2380: Summary: Use the Table.isExist API instead of get() call while checking for presence of key. (was: OMFileRequest should use the isExist API while checking for pre-existing files in the directory path. ) > Use the Table.isExist API instead of get() call while checking for presence > of key. > --- > > Key: HDDS-2380 > URL: https://issues.apache.org/jira/browse/HDDS-2380 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Labels: pull-request-available > Fix For: 0.5.0 > > Time Spent: 10m > Remaining Estimate: 0h > > Currently, when OM creates a file/directory, it checks the absence of all > prefix paths of the key in its RocksDB. Since we don't care about the > deserialization of the actual value, we should use the isExist API added in > org.apache.hadoop.hdds.utils.db.Table which internally uses the more > performant keyMayExist API of RocksDB. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
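[Editorial note] The get() vs. isExist() trade-off described above can be sketched with a simplified stand-in for the Table interface. The class and method names below are illustrative, not the real org.apache.hadoop.hdds.utils.db.Table API; the point is that a presence check skips value deserialization, which is what the real isExist achieves via RocksDB's keyMayExist.

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch (not the real Ozone API): why an isExist-style presence
// check is cheaper than get(). The real Table.isExist maps down to RocksDB's
// keyMayExist, which can answer "definitely absent" from the bloom filter
// and memtable without reading or deserializing the stored value.
public class IsExistSketch {

  // Simplified stand-in for org.apache.hadoop.hdds.utils.db.Table.
  static class SimpleTable {
    private final Map<String, byte[]> store = new HashMap<>();

    void put(String key, byte[] value) {
      store.put(key, value);
    }

    // get(): fetches the raw bytes, which the caller would then
    // deserialize into a value object (the expensive part).
    byte[] get(String key) {
      return store.get(key);
    }

    // isExist(): presence check only, no value read or deserialization.
    boolean isExist(String key) {
      return store.containsKey(key);
    }
  }

  public static void main(String[] args) {
    SimpleTable keyTable = new SimpleTable();
    keyTable.put("/vol/bucket/dir1", new byte[]{1});

    // Checking prefix paths before creating /vol/bucket/dir1/dir2/file:
    System.out.println(keyTable.isExist("/vol/bucket/dir1"));      // true
    System.out.println(keyTable.isExist("/vol/bucket/dir1/dir2")); // false
  }
}
```

Note that keyMayExist is probabilistic: it can return false positives, so a "may exist" answer still needs a confirming read, but a "does not exist" answer is definitive and avoids the read entirely.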
[jira] [Commented] (HDDS-2355) Om double buffer flush termination with rocksdb error
[ https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962447#comment-16962447 ] Aravindan Vijayan commented on HDDS-2355: - No longer able to reproduce this after applying the fix in HDDS-2379. > Om double buffer flush termination with rocksdb error > - > > Key: HDDS-2355 > URL: https://issues.apache.org/jira/browse/HDDS-2355 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Aravindan Vijayan >Priority: Blocker > Fix For: 0.5.0 > > > om_1 |java.io.IOException: Unable to write the batch. > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/]) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/]) > om_1 |at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146) > om_1 |at java.base/java.lang.Thread.run(Thread.java:834) > om_1 |Caused by: org.rocksdb.RocksDBException: > WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in > default WriteCommitted mode). If it is not due to corruption, the WAL must be > emptied before changing the WritePolicy. > om_1 |at org.rocksdb.RocksDB.write0(Native Method) > om_1 |at org.rocksdb.RocksDB.write(RocksDB.java:1421) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/]) > > In few of my test run's i see this error and OM is terminated. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2380) OMFileRequest should use the isExist API while checking for pre-existing files in the directory path.
Aravindan Vijayan created HDDS-2380: --- Summary: OMFileRequest should use the isExist API while checking for pre-existing files in the directory path. Key: HDDS-2380 URL: https://issues.apache.org/jira/browse/HDDS-2380 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Reporter: Aravindan Vijayan Fix For: 0.5.0 Currently, when OM creates a file/directory, it checks the absence of all prefix paths of the key in its RocksDB. Since we don't care about the deserialization of the actual value, we should use the isExist API added in org.apache.hadoop.hdds.utils.db.Table which internally uses the more performant keyMayExist API of RocksDB. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2380) OMFileRequest should use the isExist API while checking for pre-existing files in the directory path.
[ https://issues.apache.org/jira/browse/HDDS-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-2380: --- Assignee: Aravindan Vijayan > OMFileRequest should use the isExist API while checking for pre-existing > files in the directory path. > -- > > Key: HDDS-2380 > URL: https://issues.apache.org/jira/browse/HDDS-2380 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Fix For: 0.5.0 > > > Currently, when OM creates a file/directory, it checks the absence of all > prefix paths of the key in its RocksDB. Since we don't care about the > deserialization of the actual value, we should use the isExist API added in > org.apache.hadoop.hdds.utils.db.Table which internally uses the more > performant keyMayExist API of RocksDB. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2379) OM terminates with RocksDB error while continuously writing keys.
Aravindan Vijayan created HDDS-2379: --- Summary: OM terminates with RocksDB error while continuously writing keys. Key: HDDS-2379 URL: https://issues.apache.org/jira/browse/HDDS-2379 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Reporter: Aravindan Vijayan Assignee: Bharat Viswanadham Fix For: 0.5.0 Exception trace after writing around 800,000 keys. {code} 2019-10-29 11:15:15,131 ERROR org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer: Terminating with exit status 1: During flush to DB encountered error in OMDoubleBuffer flush thread OMDoubleBufferFlushThread java.io.IOException: Unable to write the batch. at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48) at org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240) at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146) at java.lang.Thread.run(Thread.java:745) Caused by: org.rocksdb.RocksDBException: unknown WriteBatch tag at org.rocksdb.RocksDB.write0(Native Method) at org.rocksdb.RocksDB.write(RocksDB.java:1421) at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46) ... 3 more {code} Assigning to [~bharat] since he has already started work on this. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2364) Add a OM metrics to find the false positive rate for the keyMayExist
[ https://issues.apache.org/jira/browse/HDDS-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-2364: --- Assignee: Aravindan Vijayan > Add a OM metrics to find the false positive rate for the keyMayExist > > > Key: HDDS-2364 > URL: https://issues.apache.org/jira/browse/HDDS-2364 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Assignee: Aravindan Vijayan >Priority: Major > > Add a OM metrics to find the false positive rate for the keyMayExist. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2355) Om double buffer flush termination with rocksdb error
[ https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2355: Issue Type: Bug (was: Task) > Om double buffer flush termination with rocksdb error > - > > Key: HDDS-2355 > URL: https://issues.apache.org/jira/browse/HDDS-2355 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Bharat Viswanadham >Assignee: Aravindan Vijayan >Priority: Blocker > Fix For: 0.5.0 > > > om_1 |java.io.IOException: Unable to write the batch. > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/]) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/]) > om_1 |at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146) > om_1 |at java.base/java.lang.Thread.run(Thread.java:834) > om_1 |Caused by: org.rocksdb.RocksDBException: > WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in > default WriteCommitted mode). If it is not due to corruption, the WAL must be > emptied before changing the WritePolicy. > om_1 |at org.rocksdb.RocksDB.write0(Native Method) > om_1 |at org.rocksdb.RocksDB.write(RocksDB.java:1421) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/]) > > In few of my test run's i see this error and OM is terminated. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Assigned] (HDDS-2355) Om double buffer flush termination with rocksdb error
[ https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-2355: --- Assignee: Aravindan Vijayan > Om double buffer flush termination with rocksdb error > - > > Key: HDDS-2355 > URL: https://issues.apache.org/jira/browse/HDDS-2355 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Bharat Viswanadham >Assignee: Aravindan Vijayan >Priority: Critical > > om_1 |java.io.IOException: Unable to write the batch. > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/]) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/]) > om_1 |at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146) > om_1 |at java.base/java.lang.Thread.run(Thread.java:834) > om_1 |Caused by: org.rocksdb.RocksDBException: > WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in > default WriteCommitted mode). If it is not due to corruption, the WAL must be > emptied before changing the WritePolicy. > om_1 |at org.rocksdb.RocksDB.write0(Native Method) > om_1 |at org.rocksdb.RocksDB.write(RocksDB.java:1421) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/]) > > In few of my test run's i see this error and OM is terminated. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2355) Om double buffer flush termination with rocksdb error
[ https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2355: Priority: Critical (was: Major) > Om double buffer flush termination with rocksdb error > - > > Key: HDDS-2355 > URL: https://issues.apache.org/jira/browse/HDDS-2355 > Project: Hadoop Distributed Data Store > Issue Type: Task >Reporter: Bharat Viswanadham >Priority: Critical > > om_1 |java.io.IOException: Unable to write the batch. > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/]) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/]) > om_1 |at > org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146) > om_1 |at java.base/java.lang.Thread.run(Thread.java:834) > om_1 |Caused by: org.rocksdb.RocksDBException: > WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in > default WriteCommitted mode). If it is not due to corruption, the WAL must be > emptied before changing the WritePolicy. > om_1 |at org.rocksdb.RocksDB.write0(Native Method) > om_1 |at org.rocksdb.RocksDB.write(RocksDB.java:1421) > om_1 | at > [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/]) > > In few of my test run's i see this error and OM is terminated. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2361) Ozone Manager init & start command prints out unnecessary line in the beginning.
Aravindan Vijayan created HDDS-2361: --- Summary: Ozone Manager init & start command prints out unnecessary line in the beginning. Key: HDDS-2361 URL: https://issues.apache.org/jira/browse/HDDS-2361 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Aravindan Vijayan {code} [root@avijayan-om-1 ozone-0.5.0-SNAPSHOT]# bin/ozone --daemon start om Ozone Manager classpath extended by {code} We could probably print this line only when extra elements are added to OM classpath, or skip printing this line altogether. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2301) Write path: Reduce read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959100#comment-16959100 ] Aravindan Vijayan commented on HDDS-2301: - [~sdeka] Here are some useful RocksDB configs in OM * Enable RocksDB metrics - *ozone.metastore.rocksdb.statistics=ALL* * Enable RocksDB logging - *rocksdb.logging.enabled=true* * Enable RocksDB DEBUG logging - *rocksdb.logging.level=DEBUG* > Write path: Reduce read contention in rocksDB > - > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Assignee: Supratim Deka >Priority: Major > Labels: performance > Attachments: om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in > OM. This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the write path. > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist.}} > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2301) Write path: Reduce read contention in rocksDB
[ https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959100#comment-16959100 ] Aravindan Vijayan edited comment on HDDS-2301 at 10/24/19 5:51 PM: --- [~sdeka] Here are some useful RocksDB configs in OM * Enable RocksDB metrics - *ozone.metastore.rocksdb.statistics=ALL* * Enable RocksDB logging - *hadoop.hdds.db.rocksdb.logging.enabled=true* * Enable RocksDB DEBUG logging - *hadoop.hdds.db.rocksdb.logging.level=DEBUG* was (Author: avijayan): [~sdeka] Here are some useful RocksDB configs in OM * Enable RocksDB metrics - *ozone.metastore.rocksdb.statistics=ALL* * Enable RocksDB logging - *rocksdb.logging.enabled=true* * Enable RocksDB DEBUG logging - *rocksdb.logging.level=DEBUG* > Write path: Reduce read contention in rocksDB > - > > Key: HDDS-2301 > URL: https://issues.apache.org/jira/browse/HDDS-2301 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Affects Versions: 0.5.0 >Reporter: Rajesh Balamohan >Assignee: Supratim Deka >Priority: Major > Labels: performance > Attachments: om_write_profile.png > > > Benchmark: > > Simple benchmark which creates 100 and 1000s of keys (empty directory) in > OM. This is done in a tight loop and multiple threads from client side to add > enough load on CPU. Note that intention is to understand the bottlenecks in > OM (intentionally avoiding interactions with SCM & DN). > Observation: > - > During write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This > internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every > write operation. This turns out to be expensive and chokes the write path. > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155] > [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63] > In most of the cases, directory creation would be fresh entry. In such cases, > it would be good to try with {{RocksDB::keyMayExist.}} > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
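[Editorial note] For reference, the corrected property names from the edited comment above would go into ozone-site.xml roughly as follows. This is a sketch only; the exact property names and their availability should be verified against the Ozone release in use.

```xml
<!-- ozone-site.xml: RocksDB diagnostics for OM (sketch; verify names) -->
<property>
  <name>ozone.metastore.rocksdb.statistics</name>
  <value>ALL</value>
</property>
<property>
  <name>hadoop.hdds.db.rocksdb.logging.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hadoop.hdds.db.rocksdb.logging.level</name>
  <value>DEBUG</value>
</property>
```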
[jira] [Assigned] (HDDS-2320) Negative value seen for OM NumKeys Metric in JMX.
[ https://issues.apache.org/jira/browse/HDDS-2320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan reassigned HDDS-2320: --- Assignee: Aravindan Vijayan > Negative value seen for OM NumKeys Metric in JMX. > - > > Key: HDDS-2320 > URL: https://issues.apache.org/jira/browse/HDDS-2320 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > Attachments: Screen Shot 2019-10-17 at 11.31.08 AM.png > > > While running teragen/terasort on a cluster and verifying number of keys > created on Ozone Manager, I noticed that the value of NumKeys counter metric > to be a negative value !Screen Shot 2019-10-17 at 11.31.08 AM.png! . -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2320) Negative value seen for OM NumKeys Metric in JMX.
[ https://issues.apache.org/jira/browse/HDDS-2320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954076#comment-16954076 ] Aravindan Vijayan commented on HDDS-2320: - cc [~hanishakoneru] / [~bharat] > Negative value seen for OM NumKeys Metric in JMX. > - > > Key: HDDS-2320 > URL: https://issues.apache.org/jira/browse/HDDS-2320 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Priority: Major > Attachments: Screen Shot 2019-10-17 at 11.31.08 AM.png > > > While running teragen/terasort on a cluster and verifying number of keys > created on Ozone Manager, I noticed that the value of NumKeys counter metric > to be a negative value !Screen Shot 2019-10-17 at 11.31.08 AM.png! . -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2320) Negative value seen for OM NumKeys Metric in JMX.
Aravindan Vijayan created HDDS-2320: --- Summary: Negative value seen for OM NumKeys Metric in JMX. Key: HDDS-2320 URL: https://issues.apache.org/jira/browse/HDDS-2320 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Reporter: Aravindan Vijayan Attachments: Screen Shot 2019-10-17 at 11.31.08 AM.png While running teragen/terasort on a cluster and verifying number of keys created on Ozone Manager, I noticed that the value of NumKeys counter metric to be a negative value !Screen Shot 2019-10-17 at 11.31.08 AM.png! . -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944968#comment-16944968 ] Aravindan Vijayan commented on HDDS-2241: - Thank you [~aengineer]. I will post a patch. > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipelines for a key > > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > information from SCM through an RPC for every block in the key. For large > files > 1GB, we may end up making a lot of RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, a simpler change would be to have a map (method local) of > ContainerID -> Pipeline that we get from SCM so that we don't need to make > repeated calls to SCM for the same containerID for a key. _Here, Number of > calls = Number of unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2164) om.db.checkpoints is getting filling up fast
[ https://issues.apache.org/jira/browse/HDDS-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2164: Resolution: Fixed Status: Resolved (was: Patch Available) > om.db.checkpoints is getting filling up fast > > > Key: HDDS-2164 > URL: https://issues.apache.org/jira/browse/HDDS-2164 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Nanda kumar >Assignee: Aravindan Vijayan >Priority: Critical > Labels: pull-request-available > Time Spent: 2h 20m > Remaining Estimate: 0h > > {{om.db.checkpoints}} is filling up fast, we should also clean this up. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2254) Fix flaky unit testTestContainerStateMachine#testRatisSnapshotRetention
[ https://issues.apache.org/jira/browse/HDDS-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944704#comment-16944704 ] Aravindan Vijayan commented on HDDS-2254: - Thanks for filing this [~swagle]. I will take a look. > Fix flaky unit testTestContainerStateMachine#testRatisSnapshotRetention > --- > > Key: HDDS-2254 > URL: https://issues.apache.org/jira/browse/HDDS-2254 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: test >Affects Versions: 0.5.0 >Reporter: Siddharth Wagle >Assignee: Aravindan Vijayan >Priority: Major > > Test always fails with assertion error: > {code} > java.lang.AssertionError > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.ozone.client.rpc.TestContainerStateMachine.testRatisSnapshotRetention(TestContainerStateMachine.java:188) > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Created] (HDDS-2253) Add container state to snapshot to avoid spurious missing container being reported.
Aravindan Vijayan created HDDS-2253: --- Summary: Add container state to snapshot to avoid spurious missing container being reported. Key: HDDS-2253 URL: https://issues.apache.org/jira/browse/HDDS-2253 Project: Hadoop Distributed Data Store Issue Type: Bug Affects Versions: 0.4.0 Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944053#comment-16944053 ] Aravindan Vijayan edited comment on HDDS-2241 at 10/3/19 10:00 PM: --- [~aengineer] This was not based on data we have seen in read performance tests. This is just an observation based on looking through the code while working on HDDS-2188 along with [~msingh]. Even if this does not show up as a bottleneck at this point of time, IMHO the 2nd option looks like an obvious performance improvement that we should do. was (Author: avijayan): [~aengineer] This was not based on data we have seen in read performance tests. This is just an observation based on looking through the code while working on HDDS-2188 along with [~msingh]. > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipelines for a key > > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > information from SCM through an RPC for every block in the key. For large > files > 1GB, we may end up making a lot of RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, a simpler change would be to have a map (method local) of > ContainerID -> Pipeline that we get from SCM so that we don't need to make > repeated calls to SCM for the same containerID for a key. _Here, Number of > calls = Number of unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944053#comment-16944053 ] Aravindan Vijayan commented on HDDS-2241: - [~aengineer] This was not based on data we have seen in read performance tests. This is just an observation based on looking through the code while working on HDDS-2188 along with [~msingh]. > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipelines for a key > > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > information from SCM through an RPC for every block in the key. For large > files > 1GB, we may end up making a lot of RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, a simpler change would be to have a map (method local) of > ContainerID -> Pipeline that we get from SCM so that we don't need to make > repeated calls to SCM for the same containerID for a key. _Here, Number of > calls = Number of unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2241: Summary: Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key (was: Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key) > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipelines for a key > > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > information from SCM through an RPC for every block in the key. For large > files > 1GB, we may end up making a lot of RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, a simpler change would be to have a map (method local) of > ContainerID -> Pipeline that we get from SCM so that we don't need to make > repeated calls to SCM for the same containerID for a key. _Here, Number of > calls = Number of unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2241: Description: Currently, while looking up a key, the Ozone Manager gets the pipeline information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making a lot of RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. _Here, Number of calls = 1 (or k depending on batch size)_ * Instead, a simpler change would be to have a map (method local) of ContainerID -> Pipeline that we get from SCM so that we don't need to make repeated calls to SCM for the same containerID for a key. _Here, Number of calls = Number of unique containerIDs_ was: Currently, while looking up a key, the Ozone Manager gets the pipeline information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making a lot of RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. _Here, Number of calls = 1 (or k depending on batch size)_ * Instead, we can have a simple map (method local) for ContainerID -> Pipeline that we get from SCM so that we don't need to make repeated calls to SCM for the same containerID for a key. 
_Here, Number of calls = Number of unique containerIDs_ > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipeline for a key > --- > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > information from SCM through an RPC for every block in the key. For large > files > 1GB, we may end up making a lot of RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, a simpler change would be to have a map (method local) of > ContainerID -> Pipeline that we get from SCM so that we don't need to make > repeated calls to SCM for the same containerID for a key. _Here, Number of > calls = Number of unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944026#comment-16944026 ] Aravindan Vijayan edited comment on HDDS-2241 at 10/3/19 9:06 PM: -- cc [~msingh] / [~aengineer] / [~bharat] / [~nanda] was (Author: avijayan): cc [~msingh] / [~aengineer] / [~bharat] / [~nanda619] > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipeline for a key > --- > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > location information from SCM through an RPC for every block in the key. For > large files > 1GB, we may end up making ~4 RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, we can have a simple map (method local) for ContainerID -> > Pipeline that we get from SCM so that we don't need to make repeated calls to > SCM for the same containerID for a key. _Here, Number of calls = Number of > unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2241: Description: Currently, while looking up a key, the Ozone Manager gets the pipeline information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making a lot of RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. _Here, Number of calls = 1 (or k depending on batch size)_ * Instead, we can have a simple map (method local) for ContainerID -> Pipeline that we get from SCM so that we don't need to make repeated calls to SCM for the same containerID for a key. _Here, Number of calls = Number of unique containerIDs_ was: Currently, while looking up a key, the Ozone Manager gets the pipeline information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. _Here, Number of calls = 1 (or k depending on batch size)_ * Instead, we can have a simple map (method local) for ContainerID -> Pipeline that we get from SCM so that we don't need to make repeated calls to SCM for the same containerID for a key. 
_Here, Number of calls = Number of unique containerIDs_ > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipeline for a key > --- > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > information from SCM through an RPC for every block in the key. For large > files > 1GB, we may end up making a lot of RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, we can have a simple map (method local) for ContainerID -> > Pipeline that we get from SCM so that we don't need to make repeated calls to > SCM for the same containerID for a key. _Here, Number of calls = Number of > unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2241: Description: Currently, while looking up a key, the Ozone Manager gets the pipeline information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. _Here, Number of calls = 1 (or k depending on batch size)_ * Instead, we can have a simple map (method local) for ContainerID -> Pipeline that we get from SCM so that we don't need to make repeated calls to SCM for the same containerID for a key. _Here, Number of calls = Number of unique containerIDs_ was: Currently, while looking up a key, the Ozone Manager gets the pipeline location information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. _Here, Number of calls = 1 (or k depending on batch size)_ * Instead, we can have a simple map (method local) for ContainerID -> Pipeline that we get from SCM so that we don't need to make repeated calls to SCM for the same containerID for a key. 
_Here, Number of calls = Number of unique containerIDs_ > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipeline for a key > --- > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > information from SCM through an RPC for every block in the key. For large > files > 1GB, we may end up making ~4 RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, we can have a simple map (method local) for ContainerID -> > Pipeline that we get from SCM so that we don't need to make repeated calls to > SCM for the same containerID for a key. _Here, Number of calls = Number of > unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Commented] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944026#comment-16944026 ] Aravindan Vijayan commented on HDDS-2241: - cc [~msingh] / [~aengineer] / [~bharat] / [~nanda619] > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipeline for a key > --- > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > location information from SCM through an RPC for every block in the key. For > large files > 1GB, we may end up making ~4 RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, we can have a simple map (method local) for ContainerID -> > Pipeline that we get from SCM so that we don't need to make repeated calls to > SCM for the same containerID for a key. _Here, Number of calls = Number of > unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2241: Description: Currently, while looking up a key, the Ozone Manager gets the pipeline location information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. _Here, Number of calls = 1 (or k depending on batch size)_ * Instead, we can have a simple map (method local) for ContainerID -> Pipeline that we get from SCM so that we don't need to make repeated calls to SCM for the same containerID for a key. _Here, Number of calls = Number of unique containerIDs_ was: Currently, while looking up a key, the Ozone Manager gets the pipeline location information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. _Here, Number of calls = 1 (or k depending on batch size)_ * Instead, we can have a simple map (method local) for ContainerID -> Pipeline that we get from SCM so that we don't need to make calls to SCM again for the same pipeline for that key. 
_Here, Number of calls = Number of unique containerIDs_ > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipeline for a key > --- > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > location information from SCM through an RPC for every block in the key. For > large files > 1GB, we may end up making ~4 RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, we can have a simple map (method local) for ContainerID -> > Pipeline that we get from SCM so that we don't need to make repeated calls to > SCM for the same containerID for a key. _Here, Number of calls = Number of > unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2241: Description: Currently, while looking up a key, the Ozone Manager gets the pipeline location information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. _Here, Number of calls = 1 (or k depending on batch size)_ * Instead, we can have a simple map (method local) for ContainerID -> Pipeline that we get from SCM so that we don't need to make calls to SCM again for the same pipeline for that key. _Here, Number of calls = Number of unique containerIDs_ was: Currently, while looking up a key, the Ozone Manager gets the pipeline location information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. * Instead, we can have a simple map (inside the method) for ContainerID -> Pipeline that we got from SCM so that we don't need to make calls to SCM again for the same pipeline. Here number of calls = number of unique containerIDs. 
> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipeline for a key > --- > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > location information from SCM through an RPC for every block in the key. For > large files > 1GB, we may end up making ~4 RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. _Here, Number of calls = 1 (or k > depending on batch size)_ > * Instead, we can have a simple map (method local) for ContainerID -> > Pipeline that we get from SCM so that we don't need to make calls to SCM > again for the same pipeline for that key. _Here, Number of calls = Number of > unique containerIDs_ -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2241: Description: Currently, while looking up a key, the Ozone Manager gets the pipeline location information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. To keep the number of containers passed in to SCM in a single call, we can have a fixed container batch size on the OM side. * Instead, we can have a simple map (inside the method) for ContainerID -> Pipeline that we got from SCM so that we don't need to make calls to SCM again for the same pipeline. Here number of calls = number of unique containerIDs. was: Currently, while looking up a key, the Ozone Manager gets the pipeline location information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. * Instead, we can have a method local cache for ContainerID -> Pipeline that we got from SCM so that we don't need to make calls to SCM again for the same pipeline. > Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the > pipeline for a key > --- > > Key: HDDS-2241 > URL: https://issues.apache.org/jira/browse/HDDS-2241 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Manager >Reporter: Aravindan Vijayan >Assignee: Aravindan Vijayan >Priority: Major > > Currently, while looking up a key, the Ozone Manager gets the pipeline > location information from SCM through an RPC for every block in the key. 
For > large files > 1GB, we may end up making ~4 RPC calls for this. This can be > optimized in a couple of ways > * We can implement a batch getContainerWithPipeline API in SCM using which we > can get the pipeline info locations for all the blocks for a file. To keep > the number of containers passed in to SCM in a single call, we can have a > fixed container batch size on the OM side. > * Instead, we can have a simple map (inside the method) for ContainerID -> > Pipeline that we got from SCM so that we don't need to make calls to SCM > again for the same pipeline. Here number of calls = number of unique > containerIDs.
[jira] [Comment Edited] (HDDS-2188) Implement LocatedFileStatus & getFileBlockLocations to provide node/localization information to Yarn/Mapreduce
[ https://issues.apache.org/jira/browse/HDDS-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943983#comment-16943983 ] Aravindan Vijayan edited comment on HDDS-2188 at 10/3/19 8:39 PM: -- On discussing with [~msingh], we found the following to be implemented to make this work correctly. I will be creating separate JIRAs for handling each work item. *This JIRA will cover the following* * The getFileStatus call on the Ozone file system will compute and return an instance of LocatedFileStatus. This means that it will have the block locations for the file (if present) as part of the status. This will be used by the Map-Reduce applications automatically. * For the Ozone Manager to supplement the getFileStatus with Block info locations, we need to use a flag like the "refreshPipeline" flag to obtain the information from the SCM. Whenever the flag is set, OM will get the Block locations from SCM and include it in the returned File Status. *New JIRAs will be created for the following.* * Currently, we get block location info from SCM for every block. This will lead to multiple SCM RPC calls to get the blocks for 1 file. We can implement a batch GET API for SCM using which we can get the Block info locations for all the blocks for a file. (HDDS-2241) * As an optimization, we can cache the block info in FileSystem client layer so that we can reuse them instead of making a call to RPC. An expiry based Guava cache is one candidate. was (Author: avijayan): On discussing with [~msingh], we found the following to be implemented to make this work correctly. I will be creating separate JIRAs for handling each work item. *This JIRA will cover the following* * The getFileStatus call on the Ozone file system will compute and return an instance of LocatedFileStatus. This means that it will have the block locations for the file (if present) as part of the status. This will be used by the Map-Reduce applications automatically. 
* For the Ozone Manager to supplement the getFileStatus with Block info locations, we need to use a flag like the "refreshPipeline" flag to obtain the information from the SCM. Whenever the flag is set, OM will get the Block locations from SCM and include it in the returned File Status. *New JIRAs will be created for the following.* * Currently, we get block location info from SCM for every block. This will lead to multiple SCM RPC calls to get the blocks for 1 file. We can implement a batch GET API for SCM using which we can get the Block info locations for all the blocks for a file. * As an optimization, we can cache the block info in FileSystem client layer so that we can reuse them instead of making a call to RPC. An expiry based Guava cache is one candidate. > Implement LocatedFileStatus & getFileBlockLocations to provide > node/localization information to Yarn/Mapreduce > -- > > Key: HDDS-2188 > URL: https://issues.apache.org/jira/browse/HDDS-2188 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Assignee: Aravindan Vijayan >Priority: Major > > For applications like Hive/MapReduce to take advantage of the data locality > in Ozone, Ozone should return the location of the Ozone blocks. This is > needed for better read performance for Hadoop Applications. > {code} > if (file instanceof LocatedFileStatus) { > blkLocations = ((LocatedFileStatus) file).getBlockLocations(); > } else { > blkLocations = fs.getFileBlockLocations(file, 0, length); > } > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
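The follow-up item above (caching block info in the FileSystem client layer with an expiry-based Guava cache) could look roughly like the sketch below. In a real implementation this would likely be Guava's `CacheBuilder` with `expireAfterWrite`; the minimal hand-rolled map here only illustrates the time-to-live behaviour, and all names (`ExpiringLocationCache`, the `String` location value) are illustrative, not Ozone APIs.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of an expiry-based client-side cache for block location info:
// entries age out after a TTL so stale locations are refetched from OM/SCM.
class ExpiringLocationCache {

  static class Entry {
    final String location;
    final long insertedAtMillis;
    Entry(String location, long insertedAtMillis) {
      this.location = location;
      this.insertedAtMillis = insertedAtMillis;
    }
  }

  private final Map<Long, Entry> cache = new HashMap<>();
  private final long ttlMillis;

  ExpiringLocationCache(long ttlMillis) { this.ttlMillis = ttlMillis; }

  // Returns the cached location, or null if absent or expired;
  // on null the caller falls back to the RPC and re-populates.
  String get(long containerId, long nowMillis) {
    Entry e = cache.get(containerId);
    if (e == null || nowMillis - e.insertedAtMillis >= ttlMillis) {
      cache.remove(containerId);
      return null;
    }
    return e.location;
  }

  void put(long containerId, String location, long nowMillis) {
    cache.put(containerId, new Entry(location, nowMillis));
  }

  public static void main(String[] args) {
    ExpiringLocationCache cache = new ExpiringLocationCache(1000);
    cache.put(7L, "datanode-1", 0L);
    System.out.println(cache.get(7L, 500L));   // within TTL: cache hit
    System.out.println(cache.get(7L, 2000L));  // past TTL: null, refetch needed
  }
}
```

Passing the clock in explicitly keeps the sketch testable; Guava's cache does the equivalent internally via a `Ticker`.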
[jira] [Created] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key
Aravindan Vijayan created HDDS-2241: --- Summary: Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key Key: HDDS-2241 URL: https://issues.apache.org/jira/browse/HDDS-2241 Project: Hadoop Distributed Data Store Issue Type: Bug Components: Ozone Manager Reporter: Aravindan Vijayan Assignee: Aravindan Vijayan Currently, while looking up a key, the Ozone Manager gets the pipeline location information from SCM through an RPC for every block in the key. For large files > 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a couple of ways * We can implement a batch getContainerWithPipeline API in SCM using which we can get the pipeline info locations for all the blocks for a file. * Instead, we can have a method local cache for ContainerID -> Pipeline that we got from SCM so that we don't need to make calls to SCM again for the same pipeline.
[jira] [Commented] (HDDS-2188) Implement LocatedFileStatus & getFileBlockLocations to provide node/localization information to Yarn/Mapreduce
[ https://issues.apache.org/jira/browse/HDDS-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943983#comment-16943983 ] Aravindan Vijayan commented on HDDS-2188: - On discussing with [~msingh], we found the following to be implemented to make this work correctly. I will be creating separate JIRAs for handling each work item. *This JIRA will cover the following* * The getFileStatus call on the Ozone file system will compute and return an instance of LocatedFileStatus. This means that it will have the block locations for the file (if present) as part of the status. This will be used by the Map-Reduce applications automatically. * For the Ozone Manager to supplement the getFileStatus with Block info locations, we need to use a flag like the "refreshPipeline" flag to obtain the information from the SCM. Whenever the flag is set, OM will get the Block locations from SCM and include it in the returned File Status. *New JIRAs will be created for the following.* * Currently, we get block location info from SCM for every block. This will lead to multiple SCM RPC calls to get the blocks for 1 file. We can implement a batch GET API for SCM using which we can get the Block info locations for all the blocks for a file. * As an optimization, we can cache the block info in FileSystem client layer so that we can reuse them instead of making a call to RPC. An expiry based Guava cache is one candidate. > Implement LocatedFileStatus & getFileBlockLocations to provide > node/localization information to Yarn/Mapreduce > -- > > Key: HDDS-2188 > URL: https://issues.apache.org/jira/browse/HDDS-2188 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Filesystem >Affects Versions: 0.5.0 >Reporter: Mukul Kumar Singh >Assignee: Aravindan Vijayan >Priority: Major > > For applications like Hive/MapReduce to take advantage of the data locality > in Ozone, Ozone should return the location of the Ozone blocks. 
> This is needed for better read performance for Hadoop Applications.
> {code}
> if (file instanceof LocatedFileStatus) {
>   blkLocations = ((LocatedFileStatus) file).getBlockLocations();
> } else {
>   blkLocations = fs.getFileBlockLocations(file, 0, length);
> }
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
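The expiry-based client cache proposed in the comment above could look roughly like the sketch below. This is an illustration only, not Ozone code: the String-valued location payload, class name, and TTL handling are hypothetical stand-ins (a real client would more likely use Guava's CacheBuilder with expireAfterWrite, as the comment suggests).

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Minimal sketch of an expiry-based cache for client-side block locations. */
public class ExpiringBlockLocationCache {
    private static final class Entry {
        final String locations;      // placeholder for real block-location info
        final long expiresAtMillis;  // absolute expiry time
        Entry(String locations, long expiresAtMillis) {
            this.locations = locations;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    private final long ttlMillis;

    public ExpiringBlockLocationCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    /** Cache the block locations for a file path. */
    public void put(String path, String locations) {
        cache.put(path, new Entry(locations, System.currentTimeMillis() + ttlMillis));
    }

    /** Return cached locations, or null if absent or expired (caller falls back to the SCM RPC). */
    public String get(String path) {
        Entry e = cache.get(path);
        if (e == null) {
            return null;
        }
        if (System.currentTimeMillis() > e.expiresAtMillis) {
            cache.remove(path);
            return null;  // expired: force a fresh lookup
        }
        return e.locations;
    }
}
```

On a miss (null return), the FileSystem client would issue the SCM call and repopulate the cache.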
[jira] [Work started] (HDDS-2200) Recon does not handle the NULL snapshot from OM DB cleanly.
[ https://issues.apache.org/jira/browse/HDDS-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDDS-2200 started by Aravindan Vijayan.
---
> Recon does not handle the NULL snapshot from OM DB cleanly.
> ---
>
> Key: HDDS-2200
> URL: https://issues.apache.org/jira/browse/HDDS-2200
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Recon
> Reporter: Aravindan Vijayan
> Assignee: Aravindan Vijayan
> Priority: Major
>
> {code}
> 2019-09-27 11:35:19,835 [pool-9-thread-1] ERROR - Null snapshot location got from OM.
> 2019-09-27 11:35:19,839 [pool-9-thread-1] INFO - Calling reprocess on Recon tasks.
> 2019-09-27 11:35:19,840 [pool-7-thread-1] INFO - Starting a 'reprocess' run of ContainerKeyMapperTask.
> 2019-09-27 11:35:20,069 [pool-7-thread-1] INFO - Creating new Recon Container DB at /tmp/recon/db/recon-container.db_1569609319840
> 2019-09-27 11:35:20,069 [pool-7-thread-1] INFO - Cleaning up old Recon Container DB at /tmp/recon/db/recon-container.db_1569609258721.
> 2019-09-27 11:35:20,144 [pool-9-thread-1] ERROR - Unexpected error :
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
>         at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>         at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>         at org.apache.hadoop.ozone.recon.tasks.ReconTaskControllerImpl.reInitializeTasks(ReconTaskControllerImpl.java:181)
>         at org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.syncDataFromOM(OzoneManagerServiceProviderImpl.java:333)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.ozone.recon.tasks.ContainerKeyMapperTask.reprocess(ContainerKeyMapperTask.java:81)
>         at org.apache.hadoop.ozone.recon.tasks.ReconTaskControllerImpl.lambda$reInitializeTasks$3(ReconTaskControllerImpl.java:176)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}
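The trace shows the reprocess path dereferencing a snapshot that can legitimately be null when OM returns no checkpoint. A minimal sketch of the guard this calls for is below; the OmSnapshot type and method names are hypothetical illustrations, not the actual Recon API.

```java
/** Sketch of null-tolerant snapshot handling; type and method names are illustrative only. */
public class SnapshotGuard {
    /** Hypothetical placeholder for the OM DB snapshot handle. */
    public static class OmSnapshot { }

    /**
     * Returns true if a reprocess run should proceed, false if the snapshot
     * is absent and the sync should be skipped and retried on the next cycle.
     */
    public static boolean shouldReprocess(OmSnapshot snapshot) {
        if (snapshot == null) {
            // Log and skip instead of passing null down into the tasks,
            // which is what produced the NullPointerException above.
            System.err.println("Null snapshot from OM; skipping reprocess until next sync.");
            return false;
        }
        return true;
    }
}
```

With this check in syncDataFromOM, reInitializeTasks would never see a null snapshot.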
[jira] [Updated] (HDDS-2164) om.db.checkpoints is filling up fast
[ https://issues.apache.org/jira/browse/HDDS-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravindan Vijayan updated HDDS-2164:
Status: Patch Available (was: In Progress)
> om.db.checkpoints is filling up fast
> --
>
> Key: HDDS-2164
> URL: https://issues.apache.org/jira/browse/HDDS-2164
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Nanda kumar
> Assignee: Aravindan Vijayan
> Priority: Critical
> Labels: pull-request-available
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> {{om.db.checkpoints}} is filling up fast; we should clean it up as well.
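A retention-based cleanup is one plausible shape for such a fix: delete checkpoint directories older than a cutoff after each snapshot is served. The sketch below uses java.nio.file; the directory layout and retention policy are assumptions for illustration, not the actual patch.

```java
import java.io.IOException;
import java.nio.file.*;
import java.nio.file.attribute.FileTime;
import java.time.Duration;
import java.time.Instant;
import java.util.Comparator;
import java.util.stream.Stream;

/** Illustrative cleaner for an om.db.checkpoints-style directory. */
public class CheckpointCleaner {
    /** Delete checkpoint dirs under root whose last-modified time is older than maxAge. */
    public static void cleanOldCheckpoints(Path root, Duration maxAge) throws IOException {
        Instant cutoff = Instant.now().minus(maxAge);
        try (Stream<Path> entries = Files.list(root)) {
            for (Path dir : (Iterable<Path>) entries::iterator) {
                FileTime mtime = Files.getLastModifiedTime(dir);
                if (mtime.toInstant().isBefore(cutoff)) {
                    deleteRecursively(dir);
                }
            }
        }
    }

    /** Delete a directory tree, children before parents. */
    private static void deleteRecursively(Path p) throws IOException {
        try (Stream<Path> walk = Files.walk(p)) {
            for (Path q : (Iterable<Path>) walk.sorted(Comparator.reverseOrder())::iterator) {
                Files.delete(q);
            }
        }
    }
}
```

In practice the retention window would be a config key, and the cleanup would run after each checkpoint is streamed to the client.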
[jira] [Work started] (HDDS-2188) Implement LocatedFileStatus & getFileBlockLocations to provide node/localization information to Yarn/Mapreduce
[ https://issues.apache.org/jira/browse/HDDS-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HDDS-2188 started by Aravindan Vijayan.
---
> Implement LocatedFileStatus & getFileBlockLocations to provide
> node/localization information to Yarn/Mapreduce
> --
>
> Key: HDDS-2188
> URL: https://issues.apache.org/jira/browse/HDDS-2188
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Filesystem
> Affects Versions: 0.5.0
> Reporter: Mukul Kumar Singh
> Assignee: Aravindan Vijayan
> Priority: Major
>
> For applications like Hive/MapReduce to take advantage of the data locality
> in Ozone, Ozone should return the location of the Ozone blocks. This is
> needed for better read performance for Hadoop Applications.
> {code}
> if (file instanceof LocatedFileStatus) {
>   blkLocations = ((LocatedFileStatus) file).getBlockLocations();
> } else {
>   blkLocations = fs.getFileBlockLocations(file, 0, length);
> }
> {code}