[jira] [Commented] (HDFS-16252) Correct docs for dfs.http.client.retry.policy.spec

2021-10-04 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16252?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424032#comment-17424032
 ] 

Aravindan Vijayan commented on HDFS-16252:
--

+1, thanks for the fix. 

> Correct docs for dfs.http.client.retry.policy.spec 
> ---
>
> Key: HDFS-16252
> URL: https://issues.apache.org/jira/browse/HDFS-16252
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Stephen O'Donnell
>Assignee: Stephen O'Donnell
>Priority: Major
> Attachments: HDFS-16252.001.patch
>
>
> The hdfs-default doc for dfs.http.client.retry.policy.spec is incorrect, as 
> it has the wait time and retries switched around in the description. Also, the 
> doc for dfs.client.retry.policy.spec is not present and should be the same as 
> for dfs.http.client.retry.policy.spec.
> The code shows the timeout is first and then the number of retries:
> {code}
> String  POLICY_SPEC_KEY = PREFIX + "policy.spec";
> String  POLICY_SPEC_DEFAULT = "10000,6,60000,10"; //t1,n1,t2,n2,...
>
> // In RetryPolicies.java, we can see it gets the timeout as the first in the pair
>
>   /**
>    * Parse the given string as a MultipleLinearRandomRetry object.
>    * The format of the string is "t_1, n_1, t_2, n_2, ...",
>    * where t_i and n_i are the i-th pair of sleep time and number of retries.
>    * Note that the white spaces in the string are ignored.
>    *
>    * @return the parsed object, or null if the parsing fails.
>    */
>   public static MultipleLinearRandomRetry parseCommaSeparatedString(String s) {
>     final String[] elements = s.split(",");
>     if (elements.length == 0) {
>       LOG.warn("Illegal value: there is no element in \"" + s + "\".");
>       return null;
>     }
>     if (elements.length % 2 != 0) {
>       LOG.warn("Illegal value: the number of elements in \"" + s + "\" is "
>           + elements.length + " but an even number of elements is expected.");
>       return null;
>     }
>
>     final List<RetryPolicies.MultipleLinearRandomRetry.Pair> pairs
>         = new ArrayList<RetryPolicies.MultipleLinearRandomRetry.Pair>();
>
>     for(int i = 0; i < elements.length; ) {
>       //parse the i-th sleep-time
>       final int sleep = parsePositiveInt(elements, i++, s);
>       if (sleep == -1) {
>         return null; //parse fails
>       }
>
>       //parse the i-th number-of-retries
>       final int retries = parsePositiveInt(elements, i++, s);
>       if (retries == -1) {
>         return null; //parse fails
>       }
>
>       pairs.add(new RetryPolicies.MultipleLinearRandomRetry.Pair(retries, sleep));
>     }
>     return new RetryPolicies.MultipleLinearRandomRetry(pairs);
>   }
> {code}
> This change simply updates the docs.
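To make the corrected ordering concrete, here is a minimal, standalone sketch (it is 
not the Hadoop class itself) that pairs up the default spec value in the same order 
as the parser above, i.e. wait time first, then number of retries:

{code}
import java.util.ArrayList;
import java.util.List;

public class RetrySpecDemo {
  public static void main(String[] args) {
    // Spec value in the documented format "t1,n1,t2,n2,...".
    String spec = "10000,6,60000,10";
    String[] elements = spec.split(",");
    List<String> pairs = new ArrayList<>();
    for (int i = 0; i < elements.length; ) {
      String sleepMs = elements[i++].trim();  // first element of the pair: wait time in ms
      String retries = elements[i++].trim();  // second element of the pair: number of retries
      pairs.add("wait " + sleepMs + " ms, retry " + retries + " times");
    }
    System.out.println(pairs);
    // [wait 10000 ms, retry 6 times, wait 60000 ms, retry 10 times]
  }
}
{code}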






[jira] [Assigned] (HDFS-15257) Fix spelling mistake in DataXceiverServer

2020-04-02 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDFS-15257:


Assignee: Haibin Huang  (was: Aravindan Vijayan)

> Fix spelling mistake in DataXceiverServer
> -
>
> Key: HDFS-15257
> URL: https://issues.apache.org/jira/browse/HDFS-15257
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Haibin Huang
>Priority: Minor
> Attachments: HDFS-15257-001.patch
>
>
> There is a spelling mistake in DataXceiverServer, try to fix it.






[jira] [Assigned] (HDFS-15257) Fix spelling mistake in DataXceiverServer

2020-04-02 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDFS-15257:


Assignee: Aravindan Vijayan  (was: Haibin Huang)

> Fix spelling mistake in DataXceiverServer
> -
>
> Key: HDFS-15257
> URL: https://issues.apache.org/jira/browse/HDFS-15257
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Haibin Huang
>Assignee: Aravindan Vijayan
>Priority: Minor
> Attachments: HDFS-15257-001.patch
>
>
> There is a spelling mistake in DataXceiverServer, try to fix it.






[jira] [Assigned] (HDFS-14561) dfs.datanode.data.dir does not honor file:/// scheme

2020-02-28 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDFS-14561:


Assignee: (was: Aravindan Vijayan)

> dfs.datanode.data.dir does not honor file:/// scheme
> 
>
> Key: HDFS-14561
> URL: https://issues.apache.org/jira/browse/HDFS-14561
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation, hdfs
>Affects Versions: 2.9.2, 2.8.5
>Reporter: Artem Ervits
>Priority: Major
>
> _dfs.datanode.data.dir_ with the {{file:///}} scheme is not being honored in 
> Hadoop 2.8.5 and 2.9.2. I didn't test the 3.x branches, but it would be good 
> to validate those as well. In my environment, only file:// seemed to work. 






[jira] [Updated] (HDFS-15156) In-Place Erasure Coding should factor in storage quota validations.

2020-02-06 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDFS-15156:
-
Parent: HDFS-14978
Issue Type: Sub-task  (was: Bug)

> In-Place Erasure Coding should factor in storage quota validations. 
> 
>
> Key: HDFS-15156
> URL: https://issues.apache.org/jira/browse/HDFS-15156
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Priority: Major
>  Labels: ec
>
> HDFS-14978 is a new feature that brings in In-Place Erasure coding in HDFS. 
> While converting a replicated file to an EC one, we may end up violating 
> storage quota validations. Thanks to [~ayushsaxena] for bringing this up. 
> This JIRA tracks the effort to tackle that. 
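To make the concern concrete, here is a rough, illustrative calculation (not code from 
any patch) of how raw space usage changes when a replicated file is converted, assuming 
replication factor 3 and an RS-6-3 EC policy; note that the replicated and EC copies 
coexist briefly during an in-place conversion:

{code}
public class QuotaIllustration {
  public static void main(String[] args) {
    long fileSize = 6L * 1024 * 1024 * 1024;        // 6 GB of logical data
    long replicatedRaw = fileSize * 3;              // 18 GB raw with replication factor 3
    long ecRaw = fileSize * (6 + 3) / 6;            // 9 GB raw with RS-6-3 (1.5x overhead)
    long peakDuringConvert = replicatedRaw + ecRaw; // 27 GB raw while both copies exist
    System.out.println("replicated raw bytes:   " + replicatedRaw);
    System.out.println("EC raw bytes:           " + ecRaw);
    System.out.println("peak during conversion: " + peakDuringConvert);
  }
}
{code}

A space quota checked only against the final EC size would miss the temporary peak, 
which is one way the conversion can run into quota limits.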






[jira] [Updated] (HDFS-15156) In-Place Erasure Coding should factor in storage quota validations.

2020-02-06 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDFS-15156:
-
Labels: ec  (was: )

> In-Place Erasure Coding should factor in storage quota validations. 
> 
>
> Key: HDFS-15156
> URL: https://issues.apache.org/jira/browse/HDFS-15156
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Aravindan Vijayan
>Priority: Major
>  Labels: ec
>
> HDFS-14978 is a new feature that brings in In-Place Erasure coding in HDFS. 
> While converting a replicated file to an EC one, we may end up violating 
> storage quota validations. Thanks to [~ayushsaxena] for bringing this up. 
> This JIRA tracks the effort to tackle that. 






[jira] [Created] (HDFS-15156) In-Place Erasure Coding should factor in storage quota validations.

2020-02-06 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDFS-15156:


 Summary: In-Place Erasure Coding should factor in storage quota 
validations. 
 Key: HDFS-15156
 URL: https://issues.apache.org/jira/browse/HDFS-15156
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Aravindan Vijayan


HDFS-14978 is a new feature that brings in In-Place Erasure coding in HDFS. 
While converting a replicated file to an EC one, we may end up violating 
storage quota validations. Thanks to [~ayushsaxena] for bringing this up. This 
JIRA tracks the effort to tackle that. 






[jira] [Updated] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.

2020-02-04 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDFS-14989:
-
Resolution: Fixed
Status: Resolved  (was: Patch Available)

Thanks for the reviews [~arp], [~weichiu], [~ayushtkn]. PR 
https://github.com/apache/hadoop/pull/1819 merged to branch 
HDFS-14978_ec_conversion. 

> Add a 'swapBlockList' operation to Namenode.
> 
>
> Key: HDFS-14989
> URL: https://issues.apache.org/jira/browse/HDFS-14989
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Borrowing from the design doc.
> bq. The swapBlockList takes two parameters, a source file and a destination 
> file. This operation swaps the blocks belonging to the source and the 
> destination atomically.
> bq. The namespace metadata of interest is the INodeFile class. A file 
> (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, 
> BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile 
> contains a list of blocks (BlockInfo[]). The operation will swap 
> BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not 
> touch other fields. To avoid complication, this operation will abort if 
> either file is open (isUnderConstruction() == true)
> bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST 
> to record the change persistently.






[jira] [Comment Edited] (HDFS-14978) In-place Erasure Coding Conversion

2019-12-20 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001041#comment-17001041
 ] 

Aravindan Vijayan edited comment on HDFS-14978 at 12/20/19 5:53 PM:


[~ayushtkn] Thanks for reviewing the design doc. Answers inline.

bq. Does the swapBlockList() Takes care of storage quota validations?
Not in this version. We plan to support it in the next version of this feature. 

bq. How is storage Policies handled here? do you retain it, There are some 
policies not supported for EC, like ONE_SSD.
The temporary file will be created with a storage policy suitable for EC. The 
swap-block-list step will then also make sure that the replicated file's 
policy is updated to that of the newly created EC file.

bq. What would be response, say you want to switch to 6-3 and you are able 
fetch only 7 blocks, unable to write two parities, do we allow moving further, 
despite moving into chances of making data more vulnerable? or do we fail?
We will fail fast.

bq. The Overhead of an extra tmp file still says?
The tmp file will be deleted at the end of the operation.

bq. I need to check this but, If a client is reading a file, the swapBlocks() 
happens the block locations get changed, he shall go to refetch the locations 
from namenode, that would be now EC Blocks, The client would be on 
DFSInputStream not on DFSStripedInputStream, I might have messed up, but is 
there a cover to this?
Yes, this is a valid concern. We will have to try to reproduce this possible race 
condition. The plan is to not fail the client's read operation while the file 
is undergoing in-place EC conversion.


was (Author: avijayan):
[~ayushtkn] Thanks for reviewing the design doc. Answers inline.

bq. Does the swapBlockList() Takes care of storage quota validations?
Not in this version. We plan to support it in the next version of this feature. 

How is storage Policies handled here? do you retain it, There are some policies 
not supported for EC, like ONE_SSD.
bq. The temporary file that is created will be created with a specific policy 
for EC. And then the swap block list will also make sure that the replicated 
file's policy is updated to the newly created EC one.

bq. What would be response, say you want to switch to 6-3 and you are able 
fetch only 7 blocks, unable to write two parities, do we allow moving further, 
despite moving into chances of making data more vulnerable? or do we fail?
We will fail fast.

bq. The Overhead of an extra tmp file still says?
The tmp file will be deleted at the end of the operation.

bq. I need to check this but, If a client is reading a file, the swapBlocks() 
happens the block locations get changed, he shall go to refetch the locations 
from namenode, that would be now EC Blocks, The client would be on 
DFSInputStream not on DFSStripedInputStream, I might have messed up, but is 
there a cover to this?
Yes this is a valid concern. We will have to try and repro this possible race 
condition. The plan is to not fail the client's read operation while the file 
is undergoing in place EC conversion.

> In-place Erasure Coding Conversion
> --
>
> Key: HDFS-14978
> URL: https://issues.apache.org/jira/browse/HDFS-14978
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Aravindan Vijayan
>Priority: Major
> Attachments: In-place Erasure Coding Conversion.pdf
>
>
> HDFS Erasure Coding is a new feature added in Apache Hadoop 3.0. It uses 
> encoding algorithms to reduce disk space usage while retaining redundancy 
> necessary for data recovery. It was a huge amount of work but it is just 
> getting adopted after almost 2 years.
> One usability problem that’s blocking users from adopting HDFS Erasure Coding 
> is that existing replicated files have to be copied to an EC-enabled 
> directory explicitly. Renaming a file/directory to an EC-enabled directory 
> does not automatically convert the blocks. Therefore users typically perform 
> the following steps to erasure-code existing files:
> {noformat}
> Create $tmp directory, set EC policy at it
> Distcp $src to $tmp
> Delete $src (rm -rf $src)
> mv $tmp $src
> {noformat}
> There are several reasons why this is not popular:
> * Complex. The process involves several steps: distcp data to a temporary 
> destination; delete source file; move destination to the source path.
> * Availability: there is a short period where nothing exists at the source 
> path, and jobs may fail unexpectedly.
> * Overhead. During the copy phase, there is a point in time where all of 
> source and destination files exist at the same time, exhausting disk space.
> * Not snapshot-friendly. If a snapshot is taken prior to performing the 
> conversion, the source (replicated) files will be preserved in the cluster 
> too. Therefore, the conversion actually increases storage space usage.

[jira] [Commented] (HDFS-14978) In-place Erasure Coding Conversion

2019-12-20 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-14978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17001041#comment-17001041
 ] 

Aravindan Vijayan commented on HDFS-14978:
--

[~ayushtkn] Thanks for reviewing the design doc. Answers inline.

bq. Does the swapBlockList() Takes care of storage quota validations?
Not in this version. We plan to support it in the next version of this feature. 

How is storage Policies handled here? do you retain it, There are some policies 
not supported for EC, like ONE_SSD.
bq. The temporary file that is created will be created with a specific policy 
for EC. And then the swap block list will also make sure that the replicated 
file's policy is updated to the newly created EC one.

bq. What would be response, say you want to switch to 6-3 and you are able 
fetch only 7 blocks, unable to write two parities, do we allow moving further, 
despite moving into chances of making data more vulnerable? or do we fail?
We will fail fast.

bq. The Overhead of an extra tmp file still says?
The tmp file will be deleted at the end of the operation.

bq. I need to check this but, If a client is reading a file, the swapBlocks() 
happens the block locations get changed, he shall go to refetch the locations 
from namenode, that would be now EC Blocks, The client would be on 
DFSInputStream not on DFSStripedInputStream, I might have messed up, but is 
there a cover to this?
Yes this is a valid concern. We will have to try and repro this possible race 
condition. The plan is to not fail the client's read operation while the file 
is undergoing in place EC conversion.

> In-place Erasure Coding Conversion
> --
>
> Key: HDFS-14978
> URL: https://issues.apache.org/jira/browse/HDFS-14978
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: erasure-coding
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Assignee: Aravindan Vijayan
>Priority: Major
> Attachments: In-place Erasure Coding Conversion.pdf
>
>
> HDFS Erasure Coding is a new feature added in Apache Hadoop 3.0. It uses 
> encoding algorithms to reduce disk space usage while retaining redundancy 
> necessary for data recovery. It was a huge amount of work but it is just 
> getting adopted after almost 2 years.
> One usability problem that’s blocking users from adopting HDFS Erasure Coding 
> is that existing replicated files have to be copied to an EC-enabled 
> directory explicitly. Renaming a file/directory to an EC-enabled directory 
> does not automatically convert the blocks. Therefore users typically perform 
> the following steps to erasure-code existing files:
> {noformat}
> Create $tmp directory, set EC policy at it
> Distcp $src to $tmp
> Delete $src (rm -rf $src)
> mv $tmp $src
> {noformat}
> There are several reasons why this is not popular:
> * Complex. The process involves several steps: distcp data to a temporary 
> destination; delete source file; move destination to the source path.
> * Availability: there is a short period where nothing exists at the source 
> path, and jobs may fail unexpectedly.
> * Overhead. During the copy phase, there is a point in time where all of 
> source and destination files exist at the same time, exhausting disk space.
> * Not snapshot-friendly. If a snapshot is taken prior to performing the 
> conversion, the source (replicated) files will be preserved in the cluster 
> too. Therefore, the conversion actually increases storage space usage.
> * Not management-friendly. This approach changes file inode number, 
> modification time and access time. Erasure coded files are supposed to store 
> cold data, but this conversion makes data “hot” again.
> * Bulky. It’s either all or nothing. The directory may be partially erasure 
> coded, but this approach simply erasure code everything again.
> To ease data management, we should offer a utility tool to convert replicated 
> files to erasure coded files in-place.






[jira] [Updated] (HDFS-15006) Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes.

2019-12-20 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDFS-15006:
-
Description: 
In HDFS-14989, we add a new Namenode operation "swapBlockList" to replace the 
set of blocks in an INode File with a new set of blocks. This JIRA will track 
the effort to add an FS Edit log op to persist this operation. 

We also need to increase the NamenodeLayoutVersion for this change.

  was:In HDFS-14989, we add a new Namenode operation "swapBlockList" to replace 
the set of blocks in an INode File with a new set of blocks. This JIRA will 
track the effort to add an FS Edit log op to persist this operation. 


> Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes.
> 
>
> Key: HDFS-15006
> URL: https://issues.apache.org/jira/browse/HDFS-15006
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> In HDFS-14989, we add a new Namenode operation "swapBlockList" to replace the 
> set of blocks in an INode File with a new set of blocks. This JIRA will track 
> the effort to add an FS Edit log op to persist this operation. 
> We also need to increase the NamenodeLayoutVersion for this change.
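As a rough illustration of the first half of this task, reserving a new opcode amounts 
to adding one enum constant; the snippet below is a self-contained mock, and the byte 
value and neighbouring entries are placeholders rather than the real FSEditLogOpCodes 
definitions:

{code}
public class EditLogOpCodeSketch {
  enum OpCode {
    OP_ADD((byte) 0),
    OP_DELETE((byte) 2),
    // ... existing opcodes elided ...
    OP_SWAP_BLOCK_LIST((byte) 99);  // placeholder value; the real code must be unique and never reused

    private final byte code;
    OpCode(byte code) { this.code = code; }
    byte getOpCode() { return code; }
  }

  public static void main(String[] args) {
    System.out.println(OpCode.OP_SWAP_BLOCK_LIST + " = " + OpCode.OP_SWAP_BLOCK_LIST.getOpCode());
  }
}
{code}

The second half, bumping NamenodeLayoutVersion, is what lets older software refuse to 
read an edit log that may contain the new opcode.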






[jira] [Updated] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.

2019-12-12 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDFS-14989:
-
Status: Patch Available  (was: In Progress)

> Add a 'swapBlockList' operation to Namenode.
> 
>
> Key: HDFS-14989
> URL: https://issues.apache.org/jira/browse/HDFS-14989
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Borrowing from the design doc.
> bq. The swapBlockList takes two parameters, a source file and a destination 
> file. This operation swaps the blocks belonging to the source and the 
> destination atomically.
> bq. The namespace metadata of interest is the INodeFile class. A file 
> (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, 
> BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile 
> contains a list of blocks (BlockInfo[]). The operation will swap 
> BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not 
> touch other fields. To avoid complication, this operation will abort if 
> either file is open (isUnderConstruction() == true)
> bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST 
> to record the change persistently.






[jira] [Updated] (HDFS-15008) Expose client API to carry out In-Place EC of a file.

2019-12-10 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDFS-15008:
-
Description: 
Converting a file will be a 3 step task:
* Make a temporary, erasure-coded copy.
* swapBlockList($src, $tmp) (HDFS-14989)
* delete($tmp)

The API should allow the EC blocks to use the storage policy of the original 
file through a config parameter.

  was:
Converting a file will be a 3 step task:
* Make a temporary, erasure-coded copy.
* swapBlockList($src, $tmp) (HDFS-14989)
* delete($tmp)


> Expose client API to carry out In-Place EC of a file.
> -
>
> Key: HDFS-15008
> URL: https://issues.apache.org/jira/browse/HDFS-15008
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Converting a file will be a 3 step task:
> * Make a temporary, erasure-coded copy.
> * swapBlockList($src, $tmp) (HDFS-14989)
> * delete($tmp)
> The API should allow the EC blocks to use the storage policy of the original 
> file through a config parameter.
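A rough client-side sketch of those three steps is below, under the assumption that the 
new swap call from HDFS-14989 is exposed to the client (modeled here as the 
SwapBlockList interface). The temp path, policy name, and interface are illustrative only:

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileUtil;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class InPlaceEcSketch {
  @FunctionalInterface
  interface SwapBlockList {                      // hypothetical stand-in for the new call
    void swap(Path src, Path tmp) throws IOException;
  }

  static void convert(DistributedFileSystem dfs, Path src, SwapBlockList swapper)
      throws IOException {
    Path tmpDir = new Path("/.tmp-ec-convert");  // illustrative temp location
    Path tmp = new Path(tmpDir, src.getName());

    // 1. Make a temporary, erasure-coded copy of the file.
    dfs.mkdirs(tmpDir);
    dfs.setErasureCodingPolicy(tmpDir, "RS-6-3-1024k");         // illustrative policy choice
    FileUtil.copy(dfs, src, dfs, tmp, false /* keep source */, dfs.getConf());

    // 2. Atomically swap the block lists of the original and the EC copy.
    swapper.swap(src, tmp);

    // 3. Delete the temporary file, which now holds the old replicated blocks.
    dfs.delete(tmp, false);
  }
}
{code}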






[jira] [Work started] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.

2019-12-10 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDFS-14989 started by Aravindan Vijayan.

> Add a 'swapBlockList' operation to Namenode.
> 
>
> Key: HDFS-14989
> URL: https://issues.apache.org/jira/browse/HDFS-14989
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Borrowing from the design doc.
> bq. The swapBlockList takes two parameters, a source file and a destination 
> file. This operation swaps the blocks belonging to the source and the 
> destination atomically.
> bq. The namespace metadata of interest is the INodeFile class. A file 
> (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, 
> BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile 
> contains a list of blocks (BlockInfo[]). The operation will swap 
> BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not 
> touch other fields. To avoid complication, this operation will abort if 
> either file is open (isUnderConstruction() == true)
> bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST 
> to record the change persistently.






[jira] [Assigned] (HDFS-14980) diskbalancer query command always tries to contact to port 9867

2019-12-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDFS-14980:


Assignee: Aravindan Vijayan  (was: Siddharth Wagle)

> diskbalancer query command always tries to contact to port 9867
> ---
>
> Key: HDFS-14980
> URL: https://issues.apache.org/jira/browse/HDFS-14980
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: diskbalancer
>Reporter: Nilotpal Nandi
>Assignee: Aravindan Vijayan
>Priority: Major
>
> diskbalancer query command always tries to connect to port 9867 even when 
> the datanode IPC port is different.
> In this setup, the datanode IPC port is set to 20001.
>  
> diskbalancer report command works fine and connects to IPC port 20001
>  
> {noformat}
> hdfs diskbalancer -report -node 172.27.131.193
> 19/11/12 08:58:55 INFO command.Command: Processing report command
> 19/11/12 08:58:57 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 19/11/12 08:58:57 INFO block.BlockTokenSecretManager: Setting block keys
> 19/11/12 08:58:57 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 19/11/12 08:58:58 INFO command.Command: Reporting volume information for 
> DataNode(s). These DataNode(s) are parsed from '172.27.131.193'.
> Processing report command
> Reporting volume information for DataNode(s). These DataNode(s) are parsed 
> from '172.27.131.193'.
> [172.27.131.193:20001] - : 3 
> volumes with node data density 0.05.
> [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK1/] - 0.15 used: 
> 39343871181/259692498944, 0.85 free: 220348627763/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
> [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK2/] - 0.15 used: 
> 39371179986/259692498944, 0.85 free: 220321318958/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
> [DISK: volume-/dataroot/ycloud/dfs/dn/] - 0.19 used: 
> 49934903670/259692498944, 0.81 free: 209757595274/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
>  
> {noformat}
>  
> But the diskbalancer query command fails and tries to connect to port 9867 
> (the default port).
>  
> {noformat}
> hdfs diskbalancer -query 172.27.131.193
> 19/11/12 06:37:15 INFO command.Command: Executing "query plan" command.
> 19/11/12 06:37:16 INFO ipc.Client: Retrying connect to server: 
> /172.27.131.193:9867. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19/11/12 06:37:17 INFO ipc.Client: Retrying connect to server: 
> /172.27.131.193:9867. Already tried 1 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> ..
> ..
> ..
> 19/11/12 06:37:25 ERROR tools.DiskBalancerCLI: Exception thrown while running 
> DiskBalancerCLI.
> {noformat}
>  
>  
> Expectation:
> The diskbalancer query command should work without explicitly specifying the 
> datanode IPC port.






[jira] [Assigned] (HDFS-14980) diskbalancer query command always tries to contact to port 9867

2019-12-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDFS-14980:


Assignee: Siddharth Wagle  (was: Aravindan Vijayan)

> diskbalancer query command always tries to contact to port 9867
> ---
>
> Key: HDFS-14980
> URL: https://issues.apache.org/jira/browse/HDFS-14980
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: diskbalancer
>Reporter: Nilotpal Nandi
>Assignee: Siddharth Wagle
>Priority: Major
>
> diskbalancer query command always tries to connect to port 9867 even when 
> the datanode IPC port is different.
> In this setup, the datanode IPC port is set to 20001.
>  
> diskbalancer report command works fine and connects to IPC port 20001
>  
> {noformat}
> hdfs diskbalancer -report -node 172.27.131.193
> 19/11/12 08:58:55 INFO command.Command: Processing report command
> 19/11/12 08:58:57 INFO balancer.KeyManager: Block token params received from 
> NN: update interval=10hrs, 0sec, token lifetime=10hrs, 0sec
> 19/11/12 08:58:57 INFO block.BlockTokenSecretManager: Setting block keys
> 19/11/12 08:58:57 INFO balancer.KeyManager: Update block keys every 2hrs, 
> 30mins, 0sec
> 19/11/12 08:58:58 INFO command.Command: Reporting volume information for 
> DataNode(s). These DataNode(s) are parsed from '172.27.131.193'.
> Processing report command
> Reporting volume information for DataNode(s). These DataNode(s) are parsed 
> from '172.27.131.193'.
> [172.27.131.193:20001] - : 3 
> volumes with node data density 0.05.
> [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK1/] - 0.15 used: 
> 39343871181/259692498944, 0.85 free: 220348627763/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
> [DISK: volume-/dataroot/ycloud/dfs/NEW_DISK2/] - 0.15 used: 
> 39371179986/259692498944, 0.85 free: 220321318958/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
> [DISK: volume-/dataroot/ycloud/dfs/dn/] - 0.19 used: 
> 49934903670/259692498944, 0.81 free: 209757595274/259692498944, isFailed: 
> False, isReadOnly: False, isSkip: False, isTransient: False.
>  
> {noformat}
>  
> But the diskbalancer query command fails and tries to connect to port 9867 
> (the default port).
>  
> {noformat}
> hdfs diskbalancer -query 172.27.131.193
> 19/11/12 06:37:15 INFO command.Command: Executing "query plan" command.
> 19/11/12 06:37:16 INFO ipc.Client: Retrying connect to server: 
> /172.27.131.193:9867. Already tried 0 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> 19/11/12 06:37:17 INFO ipc.Client: Retrying connect to server: 
> /172.27.131.193:9867. Already tried 1 time(s); retry policy is 
> RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 
> MILLISECONDS)
> ..
> ..
> ..
> 19/11/12 06:37:25 ERROR tools.DiskBalancerCLI: Exception thrown while running 
> DiskBalancerCLI.
> {noformat}
>  
>  
> Expectation:
> The diskbalancer query command should work without explicitly specifying the 
> datanode IPC port.






[jira] [Assigned] (HDFS-15007) Make sure upgrades & downgrades work as expected with In-Place EC.

2019-11-21 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDFS-15007:


Assignee: Aravindan Vijayan

> Make sure upgrades & downgrades work as expected with In-Place EC.
> --
>
> Key: HDFS-15007
> URL: https://issues.apache.org/jira/browse/HDFS-15007
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Since we are adding a new FS Edit log op, we have to do some work to make 
> sure upgrades and downgrades work as expected. A request to change a file to 
> EC before finalize should not be allowed. 
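A minimal sketch of that restriction, with made-up names rather than the actual 
Namenode code:

{code}
public class FinalizeGuardSketch {
  static boolean upgradeFinalized = false;  // in reality this comes from the NN storage state

  static void checkSwapBlockListAllowed() {
    if (!upgradeFinalized) {
      throw new UnsupportedOperationException(
          "in-place EC conversion (swapBlockList) is not allowed until the upgrade is finalized");
    }
  }

  public static void main(String[] args) {
    checkSwapBlockListAllowed();  // throws while an upgrade is still in progress
  }
}
{code}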






[jira] [Assigned] (HDFS-15006) Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes.

2019-11-21 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDFS-15006:


Assignee: Aravindan Vijayan

> Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes.
> 
>
> Key: HDFS-15006
> URL: https://issues.apache.org/jira/browse/HDFS-15006
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> In HDFS-14989, we add a new Namenode operation "swapBlockList" to replace the 
> set of blocks in an INode File with a new set of blocks. This JIRA will track 
> the effort to add an FS Edit log op to persist this operation. 






[jira] [Created] (HDFS-15008) Expose client API to carry out In-Place EC of a file.

2019-11-21 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDFS-15008:


 Summary: Expose client API to carry out In-Place EC of a file.
 Key: HDFS-15008
 URL: https://issues.apache.org/jira/browse/HDFS-15008
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan


Converting a file will be a 3 step task:
* Make a temporary, erasure-coded copy.
* swapBlockList($src, $tmp) (HDFS-14989)
* delete($tmp)






[jira] [Created] (HDFS-15007) Make sure upgrades & downgrades work as expected with In-Place EC.

2019-11-21 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDFS-15007:


 Summary: Make sure upgrades & downgrades work as expected with 
In-Place EC.
 Key: HDFS-15007
 URL: https://issues.apache.org/jira/browse/HDFS-15007
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Aravindan Vijayan


Since we are adding a new FS Edit log op, we have to do some work to make sure 
upgrades and downgrades work as expected. A request to change a file to EC 
before finalize should not be allowed. 






[jira] [Created] (HDFS-15006) Add OP_SWAP_BLOCK_LIST as an operation code in FSEditLogOpCodes.

2019-11-21 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDFS-15006:


 Summary: Add OP_SWAP_BLOCK_LIST as an operation code in 
FSEditLogOpCodes.
 Key: HDFS-15006
 URL: https://issues.apache.org/jira/browse/HDFS-15006
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Aravindan Vijayan


In HDFS-14989, we add a new Namenode operation "swapBlockList" to replace the 
set of blocks in an INode File with a new set of blocks. This JIRA will track 
the effort to add an FS Edit log op to persist this operation. 






[jira] [Created] (HDDS-2590) Integration tests for Recon with Ozone Manager.

2019-11-20 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2590:
---

 Summary: Integration tests for Recon with Ozone Manager.
 Key: HDDS-2590
 URL: https://issues.apache.org/jira/browse/HDDS-2590
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: Ozone Recon
Reporter: Aravindan Vijayan
 Fix For: 0.5.0


Currently, Recon has only unit tests. We need to add the following integration 
tests to make sure there are no regressions or contract breakage with Ozone 
Manager. 

The first step would be to add Recon as a new component to Mini Ozone cluster.

* *Test 1* - Verify Recon can get full snapshot and subsequent delta updates 
from Ozone Manager on startup.
  > Start up a Mini Ozone cluster (with Recon) with a few keys in OM.
  > Verify Recon gets full DB snapshot from OM.
  > Add 100 keys to OM
  > Verify Recon picks up the new keys using the delta updates mechanism.
  > Verify OM DB seq number == Recon's OM DB snapshot's seq number

* *Test 2* - Verify Recon restart does not cause issues with the OM DB syncing.
   > Startup Mini Ozone cluster (with Recon).
   > Add 100 keys to OM
   > Verify Recon picks up the new keys.
   > Stop Recon Server
   > Add 5 keys to OM.
   > Start Recon Server
   > Verify that Recon Server does not request full snapshot from OM (since 
   only a small number of keys have been added, and hence Recon should be 
   able to get the updates alone)
   > Verify OM DB seq number == Recon's OM DB snapshot's seq number

*Note* : This exercise might expose a few bugs in Recon-OM integration which is 
perfectly normal and is the exact reason why we want these tests to be written. 
Please file JIRAs for any major issues encountered and link them here. Minor 
issues can hopefully be fixed as part of this effort. 
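A very rough JUnit-style sketch of Test 1 follows; every helper and type below is 
hypothetical scaffolding that this task would need to implement on top of the Mini 
Ozone cluster, not an existing API:

{code}
import static org.junit.Assert.assertEquals;
import org.junit.Test;

public class TestReconWithOzoneManagerSketch {

  @Test
  public void testSnapshotAndDeltaUpdates() throws Exception {
    // Hypothetical helper: a Mini Ozone cluster that includes Recon, pre-populated with a few keys.
    ClusterHandle cluster = startMiniOzoneClusterWithRecon(5);

    // Recon should bootstrap from a full OM DB snapshot.
    assertEquals(cluster.omDbSequenceNumber(), cluster.reconOmSnapshotSequenceNumber());

    // Add 100 keys and verify Recon catches up via the delta updates mechanism.
    cluster.writeKeysToOm(100);
    cluster.waitForReconSync();
    assertEquals(cluster.omDbSequenceNumber(), cluster.reconOmSnapshotSequenceNumber());
  }

  // Hypothetical scaffolding, named only for illustration.
  interface ClusterHandle {
    long omDbSequenceNumber();
    long reconOmSnapshotSequenceNumber();
    void writeKeysToOm(int count);
    void waitForReconSync();
  }

  private ClusterHandle startMiniOzoneClusterWithRecon(int initialKeys) {
    throw new UnsupportedOperationException("to be implemented as part of this task");
  }
}
{code}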






[jira] [Updated] (HDDS-2590) Integration tests for Recon with Ozone Manager.

2019-11-20 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2590:

Description: 
Currently, Recon has only unit tests. We need to add the following integration 
tests to make sure there are no regressions or contract breakage with Ozone 
Manager. 

The first step would be to add Recon as a new component to Mini Ozone cluster.

* *Test 1* - *Verify Recon can get full snapshot and subsequent delta updates 
from Ozone Manager on startup.*
  > Start up a Mini Ozone cluster (with Recon) with a few keys in OM.
  > Verify Recon gets full DB snapshot from OM.
  > Add 100 keys to OM
  > Verify Recon picks up the new keys using the delta updates mechanism.
  > Verify OM DB seq number == Recon's OM DB snapshot's seq number

* *Test 2* - *Verify Recon restart does not cause issues with the OM DB 
syncing.*
   > Startup Mini Ozone cluster (with Recon).
   > Add 100 keys to OM
   > Verify Recon picks up the new keys.
   > Stop Recon Server
   > Add 5 keys to OM.
   > Start Recon Server
   > Verify that Recon Server does not request full snapshot from OM (since 
   only a small number of keys have been added, and hence Recon should be 
   able to get the updates alone)
   > Verify OM DB seq number == Recon's OM DB snapshot's seq number

*Note* : This exercise might expose a few bugs in Recon-OM integration which is 
perfectly normal and is the exact reason why we want these tests to be written. 
Please file JIRAs for any major issues encountered and link them here. Minor 
issues can hopefully be fixed as part of this effort. 

  was:
Currently, Recon has only unit tests. We need to add the following integration 
tests to make sure there are no regressions or contract breakage with Ozone 
Manager. 

The first step would be to add Recon as a new component to Mini Ozone cluster.

* *Test 1* - Verify Recon can get full snapshot and subsequent delta updates 
from Ozone Manager on startup.
  > Start up a Mini Ozone cluster (with Recon) with a few keys in OM.
  > Verify Recon gets full DB snapshot from OM.
  > Add 100 keys to OM
  > Verify Recon picks up the new keys using the delta updates mechanism.
  > Verify OM DB seq number == Recon's OM DB snapshot's seq number

* *Test 2* - Verify Recon restart does not cause issues with the OM DB syncing.
   > Startup Mini Ozone cluster (with Recon).
   > Add 100 keys to OM
   > Verify Recon picks up the new keys.
   > Stop Recon Server
   > Add 5 keys to OM.
   > Start Recon Server
   > Verify that Recon Server does not request full snapshot from OM (since 
   only a small number of keys have been added, and hence Recon should be 
   able to get the updates alone)
   > Verify OM DB seq number == Recon's OM DB snapshot's seq number

*Note* : This exercise might expose a few bugs in Recon-OM integration which is 
perfectly normal and is the exact reason why we want these tests to be written. 
Please file JIRAs for any major issues encountered and link them here. Minor 
issues can hopefully be fixed as part of this effort. 


> Integration tests for Recon with Ozone Manager.
> ---
>
> Key: HDDS-2590
> URL: https://issues.apache.org/jira/browse/HDDS-2590
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Priority: Major
> Fix For: 0.5.0
>
>
> Currently, Recon has only unit tests. We need to add the following 
> integration tests to make sure there are no regressions or contract breakage 
> with Ozone Manager. 
> The first step would be to add Recon as a new component to Mini Ozone cluster.
> * *Test 1* - *Verify Recon can get full snapshot and subsequent delta updates 
> from Ozone Manager on startup.*
>   > Start up a Mini Ozone cluster (with Recon) with a few keys in OM.
>   > Verify Recon gets full DB snapshot from OM.
>   > Add 100 keys to OM
>   > Verify Recon picks up the new keys using the delta updates mechanism.
>   > Verify OM DB seq number == Recon's OM DB snapshot's seq number
> * *Test 2* - *Verify Recon restart does not cause issues with the OM DB 
> syncing.*
>   > Startup Mini Ozone cluster (with Recon).
>   > Add 100 keys to OM
>   > Verify Recon picks up the new keys.
>   > Stop Recon Server
>   > Add 5 keys to OM.
>   > Start Recon Server
>   > Verify that Recon Server does not request full snapshot from OM (since 
>   only a small number of keys have been added, and hence Recon should be 
>   able to get the updates alone)
>   > Verify OM DB seq number == Recon's OM DB snapshot's seq number
> *Note* : This exercise might expose a few bugs in Recon-OM integration which 
> is perfectly normal and is the exact reason why we want these tests to be 
> written. Please file JIRAs for any major issues encountered and link them 
> here. Minor issues can hopefully be fixed as part of this effort.

[jira] [Updated] (HDDS-1722) Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on startup

2019-11-20 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1722:

Labels: newbie  (was: )

> Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on 
> startup
> -
>
> Key: HDDS-1722
> URL: https://issues.apache.org/jira/browse/HDDS-1722
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Shweta
>Priority: Major
>  Labels: newbie
>
> Currently the table creation is done for each schema definition one by one. 
> Setup sqlite DB and create Recon SQL tables.
> cc [~vivekratnavel], [~swagle]






[jira] [Assigned] (HDDS-1722) Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on startup

2019-11-20 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-1722:
---

Assignee: (was: Shweta)

> Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on 
> startup
> -
>
> Key: HDDS-1722
> URL: https://issues.apache.org/jira/browse/HDDS-1722
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Priority: Major
>  Labels: newbie
>
> Currently the table creation is done for each schema definition one by one. 
> Setup sqlite DB and create Recon SQL tables.
> cc [~vivekratnavel], [~swagle]






[jira] [Updated] (HDDS-2579) Ozone client should refresh pipeline info if reads from all Datanodes fail.

2019-11-19 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2579:

Description: 
Currently, if reads from all Datanodes in the pipeline fail, the read fails 
altogether. There may be a case where the container has been moved to a new 
pipeline by the time the client reads. In this case, the client should request 
a pipeline refresh from OM, and read again if the new pipeline returned from 
OM is different. 

This behavior is consistent with that of HDFS.
cc [~msingh] / [~shashikant] / [~hanishakoneru]

  was:Currently, if reads from all Datanodes in the pipeline fail, the read 
fails altogether. There may be a case where the container has been moved to a 
new pipeline by the time the client reads. In this case, the client should 
request a pipeline refresh from OM, and read again if the new pipeline 
returned from OM is different. 


> Ozone client should refresh pipeline info if reads from all Datanodes fail.
> ---
>
> Key: HDDS-2579
> URL: https://issues.apache.org/jira/browse/HDDS-2579
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Client
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
> Fix For: 0.5.0
>
>
> Currently, if reads from all Datanodes in the pipeline fail, the read fails 
> altogether. There may be a case where the container has been moved to a 
> new pipeline by the time the client reads. In this case, the client should 
> request a pipeline refresh from OM, and read again if the new pipeline 
> returned from OM is different. 
> This behavior is consistent with that of HDFS.
> cc [~msingh] / [~shashikant] / [~hanishakoneru]
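A simplified sketch of the intended behaviour; the Pipeline and OmClient types below 
are stand-ins for illustration, not the Ozone client classes:

{code}
import java.io.IOException;

public class RefreshPipelineSketch {
  interface Pipeline { byte[] readChunk() throws IOException; }
  interface OmClient { Pipeline refreshPipeline() throws IOException; }  // stand-in for the OM lookup

  static byte[] readWithRefresh(Pipeline current, OmClient om) throws IOException {
    try {
      // First attempt: read from the datanodes of the pipeline we already know.
      return current.readChunk();
    } catch (IOException allDatanodesFailed) {
      // All datanodes failed; ask OM where the container lives now and retry once
      // if the pipeline actually changed.
      Pipeline refreshed = om.refreshPipeline();
      if (refreshed == null || refreshed.equals(current)) {
        throw allDatanodesFailed;  // nothing changed, surface the original failure
      }
      return refreshed.readChunk();
    }
  }
}
{code}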






[jira] [Commented] (HDDS-2504) Handle InterruptedException properly

2019-11-19 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16978061#comment-16978061
 ] 

Aravindan Vijayan commented on HDDS-2504:
-

Thank you [~adoroszlai] for filing this umbrella task. 

> Handle InterruptedException properly
> 
>
> Key: HDDS-2504
> URL: https://issues.apache.org/jira/browse/HDDS-2504
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Attila Doroszlai
>Priority: Major
>  Labels: newbie, sonar
>
> {quote}Either re-interrupt or rethrow the {{InterruptedException}}
> {quote}
> in several files (42 issues)
> [https://sonarcloud.io/project/issues?id=hadoop-ozone=false=squid%3AS2142=OPEN=BUG]
>  
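For reference, the two patterns the rule accepts look like this (a small 
self-contained example, not taken from the Ozone code base):

{code}
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class InterruptHandlingDemo {
  private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(16);

  // Pattern 1: catch, restore the interrupt status, and recover locally.
  String pollOrNull() {
    try {
      return queue.take();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();  // re-interrupt instead of swallowing
      return null;
    }
  }

  // Pattern 2: declare and rethrow, letting the caller decide.
  String poll() throws InterruptedException {
    return queue.take();
  }
}
{code}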






[jira] [Assigned] (HDDS-2573) Handle InterruptedException in KeyOutputStream

2019-11-19 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-2573:
---

Assignee: Aravindan Vijayan

> Handle InterruptedException in KeyOutputStream
> --
>
> Key: HDDS-2573
> URL: https://issues.apache.org/jira/browse/HDDS-2573
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Dinesh Chitlangia
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: newbie, sonar
>
> https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-m5KcVY8lQ4ZsAc=AW5md-m5KcVY8lQ4ZsAc






[jira] [Assigned] (HDDS-2577) Handle InterruptedException in OzoneManagerProtocolServerSideTranslatorPB

2019-11-19 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-2577:
---

Assignee: Aravindan Vijayan

> Handle InterruptedException in OzoneManagerProtocolServerSideTranslatorPB
> -
>
> Key: HDDS-2577
> URL: https://issues.apache.org/jira/browse/HDDS-2577
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Dinesh Chitlangia
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: newbie, sonar
>
> https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-Z7KcVY8lQ4Zr1l=AW5md-Z7KcVY8lQ4Zr1l






[jira] [Assigned] (HDDS-2571) Handle InterruptedException in SCMPipelineManager

2019-11-19 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-2571:
---

Assignee: Aravindan Vijayan

> Handle InterruptedException in SCMPipelineManager
> -
>
> Key: HDDS-2571
> URL: https://issues.apache.org/jira/browse/HDDS-2571
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>Reporter: Dinesh Chitlangia
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: newbie, sonar
>
> [https://sonarcloud.io/project/issues?id=hadoop-ozone=AW6BMuREm2E_7tGaNiTh=AW6BMuREm2E_7tGaNiTh]
>  






[jira] [Created] (HDDS-2579) Ozone client should refresh pipeline info if reads from all Datanodes fail.

2019-11-19 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2579:
---

 Summary: Ozone client should refresh pipeline info if reads from 
all Datanodes fail.
 Key: HDDS-2579
 URL: https://issues.apache.org/jira/browse/HDDS-2579
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Client
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan
 Fix For: 0.5.0


Currently, if reads from all Datanodes in the pipeline fail, the read fails 
altogether. There may be a case where the container has been moved to a new 
pipeline by the time the client reads. In this case, the client should request 
a pipeline refresh from OM, and read again if the new pipeline returned from 
OM is different. 






[jira] [Updated] (HDDS-1084) Ozone Recon Service v0

2019-11-19 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1084:

Summary: Ozone Recon Service v0  (was: Ozone Recon Service v1)

> Ozone Recon Service v0
> --
>
> Key: HDDS-1084
> URL: https://issues.apache.org/jira/browse/HDDS-1084
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone Recon
>Affects Versions: 0.4.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: Ozone_Recon_Design_V1_Draft.pdf
>
>
> Recon Server at a high level will maintain a global view of Ozone that is not 
> available from SCM or OM. Things like how many volumes exist; and how many 
> buckets exist per volume; which volume has maximum buckets; which are buckets 
> that have not been accessed for a year, which are the corrupt blocks, which 
> are blocks on data nodes which are not used; and answer similar queries.






[jira] [Updated] (HDDS-1084) Ozone Recon Service V0.1

2019-11-19 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1084:

Summary: Ozone Recon Service V0.1  (was: Ozone Recon Service v0)

> Ozone Recon Service V0.1
> 
>
> Key: HDDS-1084
> URL: https://issues.apache.org/jira/browse/HDDS-1084
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone Recon
>Affects Versions: 0.4.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: Ozone_Recon_Design_V1_Draft.pdf
>
>
> Recon Server at a high level will maintain a global view of Ozone that is not 
> available from SCM or OM. Things like how many volumes exist; and how many 
> buckets exist per volume; which volume has maximum buckets; which are buckets 
> that have not been accessed for a year, which are the corrupt blocks, which 
> are blocks on data nodes which are not used; and answer similar queries.






[jira] [Updated] (HDDS-1996) Ozone Recon Service v0.2

2019-11-19 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1996:

Summary: Ozone Recon Service v0.2  (was: Ozone Recon Service v1.1)

> Ozone Recon Service v0.2
> 
>
> Key: HDDS-1996
> URL: https://issues.apache.org/jira/browse/HDDS-1996
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone Recon
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Placeholder for all the tasks that were covered in the design doc for HDDS-1084 
> but did not make it into the 0.5.0 release; they are targeted at the next 
> release of Ozone after 0.5.0.
> An updated design document will be uploaded to this Jira.






[jira] [Resolved] (HDDS-1084) Ozone Recon Service v1

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan resolved HDDS-1084.
-
Resolution: Fixed

Moved out all open JIRAs. They will be tracked under HDDS-1996. Resolving this 
JIRA. 

> Ozone Recon Service v1
> --
>
> Key: HDDS-1084
> URL: https://issues.apache.org/jira/browse/HDDS-1084
> Project: Hadoop Distributed Data Store
>  Issue Type: New Feature
>  Components: Ozone Recon
>Affects Versions: 0.4.0
>Reporter: Siddharth Wagle
>Assignee: Siddharth Wagle
>Priority: Major
> Fix For: 0.5.0
>
> Attachments: Ozone_Recon_Design_V1_Draft.pdf
>
>
> Recon Server, at a high level, will maintain a global view of Ozone that is not 
> available from SCM or OM. It will answer questions such as: how many volumes 
> exist; how many buckets exist per volume; which volume has the most buckets; 
> which buckets have not been accessed for a year; which blocks are corrupt; 
> which blocks on data nodes are unused; and similar queries.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1335) Basic Recon UI for serving up container key mapping.

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1335:

Issue Type: Task  (was: Bug)

> Basic Recon UI for serving up container key mapping.
> 
>
> Key: HDDS-1335
> URL: https://issues.apache.org/jira/browse/HDDS-1335
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1873) Add API to get last completed times for every Recon task.

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1873:

Parent: (was: HDDS-1084)
Issue Type: Task  (was: Sub-task)

> Add API to get last completed times for every Recon task.
> -
>
> Key: HDDS-1873
> URL: https://issues.apache.org/jira/browse/HDDS-1873
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: Ozone Recon
>Affects Versions: 0.4.1
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Shweta
>Priority: Major
>
> Recon stores the last ozone manager snapshot received timestamp along with 
> timestamps of last successful run for each task.
> We can add a REST endpoint that returns the contents of this stored data. 
> This is important to give users a sense of how fresh the data they are 
> looking at is. And, we need this per task because some tasks might fail to 
> run, or might take much longer to run than others, and this needs to be 
> reflected in the UI for a better and more consistent user experience.
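
As a rough illustration of the proposed endpoint, here is a minimal JAX-RS sketch 
that returns the stored per-task timestamps. The service interface and names are 
assumptions for the sketch, not the actual Recon API.

{code:java}
import java.util.Map;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

// Hypothetical endpoint returning last-completed timestamps per Recon task.
@Path("/taskstatus")
@Produces(MediaType.APPLICATION_JSON)
public class TaskStatusEndpoint {

  // Assumed abstraction over Recon's stored task metadata (illustrative only).
  public interface ReconTaskStatusService {
    Map<String, Long> getLastCompletedTimes();
  }

  private final ReconTaskStatusService taskStatusService;

  public TaskStatusEndpoint(ReconTaskStatusService taskStatusService) {
    this.taskStatusService = taskStatusService;
  }

  // Returns a map of task name -> last successful run timestamp (epoch millis).
  @GET
  public Response getLastCompletedTimes() {
    Map<String, Long> lastRunTimes = taskStatusService.getLastCompletedTimes();
    return Response.ok(lastRunTimes).build();
  }
}
{code}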



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1722) Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on startup

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1722:

Parent: (was: HDDS-1084)
Issue Type: Task  (was: Sub-task)

> Use the bindings in ReconSchemaGenerationModule to create Recon SQL tables on 
> startup
> -
>
> Key: HDDS-1722
> URL: https://issues.apache.org/jira/browse/HDDS-1722
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Shweta
>Priority: Major
>
> Currently, table creation is done for each schema definition one by one. 
> Set up the SQLite DB and create the Recon SQL tables on startup.
> cc [~vivekratnavel], [~swagle]
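
One possible shape for this, sketched with Guice multibindings; the 
ReconSchemaDefinition interface and method names here follow the description 
above and may differ from the actual Recon code.

{code:java}
import java.sql.SQLException;
import java.util.Set;

import com.google.inject.AbstractModule;
import com.google.inject.Inject;
import com.google.inject.multibindings.Multibinder;

public class ReconSchemaSketch {

  // Assumed shape of a schema definition, per the description.
  public interface ReconSchemaDefinition {
    void initializeSchema() throws SQLException;
  }

  // Module that collects every schema definition into an injectable Set.
  public static class SchemaGenerationModule extends AbstractModule {
    @Override
    protected void configure() {
      Multibinder<ReconSchemaDefinition> schemaBinder =
          Multibinder.newSetBinder(binder(), ReconSchemaDefinition.class);
      // schemaBinder.addBinding().to(SomeSchemaDefinition.class); // one per schema
    }
  }

  // On startup, iterate the bound set instead of creating tables one by one.
  public static class SchemaManager {
    private final Set<ReconSchemaDefinition> schemaDefinitions;

    @Inject
    SchemaManager(Set<ReconSchemaDefinition> schemaDefinitions) {
      this.schemaDefinitions = schemaDefinitions;
    }

    public void createReconSchema() throws SQLException {
      for (ReconSchemaDefinition definition : schemaDefinitions) {
        definition.initializeSchema();
      }
    }
  }
}
{code}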



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1367) Add ability in Recon to track the growth rate of the cluster.

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1367:

Parent: (was: HDDS-1084)
Issue Type: Task  (was: Sub-task)

> Add ability in Recon to track the growth rate of the cluster. 
> --
>
> Key: HDDS-1367
> URL: https://issues.apache.org/jira/browse/HDDS-1367
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>
> Recon should be able to answer the question "How fast is the cluster growing, 
> by week, by month, by day?", which gives the user an idea of the usage stats 
> of the cluster. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1335) Basic Recon UI for serving up container key mapping.

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1335:

Parent: (was: HDDS-1084)
Issue Type: Bug  (was: Sub-task)

> Basic Recon UI for serving up container key mapping.
> 
>
> Key: HDDS-1335
> URL: https://issues.apache.org/jira/browse/HDDS-1335
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2510) Use isEmpty() to check whether the collection is empty or not in Ozone Manager module

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2510:

Summary: Use isEmpty() to check whether the collection is empty or not in 
Ozone Manager module  (was: Sonar : Use isEmpty() to check whether the 
collection is empty or not in Ozone Manager module)

> Use isEmpty() to check whether the collection is empty or not in Ozone 
> Manager module
> -
>
> Key: HDDS-2510
> URL: https://issues.apache.org/jira/browse/HDDS-2510
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: sonar
>
> Ozone Manager has a number of instances where the following sonar rule is 
> flagged.
> bq. Using Collection.size() to test for emptiness works, but using 
> Collection.isEmpty() makes the code more readable and can be more performant. 
> The time complexity of any isEmpty() method implementation should be O(1) 
> whereas some implementations of size() can be O(n).
> An example of a flagged instance - 
> https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false
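
For reference, a generic before/after of the pattern this rule asks for; this is 
illustrative code, not a specific flagged call site in Ozone Manager.

{code:java}
import java.util.List;

public class EmptinessCheck {

  // Flagged pattern: size() forces a count, which can be O(n) for some collections.
  static boolean hasPendingKeysOld(List<String> pendingKeys) {
    return pendingKeys.size() > 0;
  }

  // Preferred pattern: isEmpty() is O(1) and reads more clearly.
  static boolean hasPendingKeys(List<String> pendingKeys) {
    return !pendingKeys.isEmpty();
  }
}
{code}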



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2511) Fix Sonar issues in OzoneManagerServiceProviderImpl

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2511:

Summary: Fix Sonar issues in OzoneManagerServiceProviderImpl  (was: Sonar : 
Fix Sonar issues in OzoneManagerServiceProviderImpl)

> Fix Sonar issues in OzoneManagerServiceProviderImpl
> ---
>
> Key: HDDS-2511
> URL: https://issues.apache.org/jira/browse/HDDS-2511
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: sonar
> Fix For: 0.5.0
>
>
> Link to the list of issues : 
> https://sonarcloud.io/project/issues?fileUuids=AW5md-HdKcVY8lQ4ZrUn=hadoop-ozone=false



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2511) Sonar : Fix Sonar issues in OzoneManagerServiceProviderImpl

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2511:

Labels: sonar  (was: )

> Sonar : Fix Sonar issues in OzoneManagerServiceProviderImpl
> ---
>
> Key: HDDS-2511
> URL: https://issues.apache.org/jira/browse/HDDS-2511
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: sonar
> Fix For: 0.5.0
>
>
> Link to the list of issues : 
> https://sonarcloud.io/project/issues?fileUuids=AW5md-HdKcVY8lQ4ZrUn=hadoop-ozone=false



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2511) Sonar : Fix Sonar issues in OzoneManagerServiceProviderImpl

2019-11-15 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2511:
---

 Summary: Sonar : Fix Sonar issues in 
OzoneManagerServiceProviderImpl
 Key: HDDS-2511
 URL: https://issues.apache.org/jira/browse/HDDS-2511
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Recon
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan
 Fix For: 0.5.0


Link to the list of issues : 
https://sonarcloud.io/project/issues?fileUuids=AW5md-HdKcVY8lQ4ZrUn=hadoop-ozone=false



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2510) Sonar : Use isEmpty() to check whether the collection is empty or not in Ozone Manager module

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2510:

Description: 
Ozone Manager has a number of instances where the following sonar rule is 
flagged.

\bq. Using Collection.size() to test for emptiness works, but using 
Collection.isEmpty() makes the code more readable and can be more performant. 
The time complexity of any isEmpty() method implementation should be O(1) 
whereas some implementations of size() can be O(n).

An example of a flagged instance - 
https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false


> Sonar : Use isEmpty() to check whether the collection is empty or not in 
> Ozone Manager module
> -
>
> Key: HDDS-2510
> URL: https://issues.apache.org/jira/browse/HDDS-2510
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: sonar
>
> Ozone Manager has a number of instances where the following sonar rule is 
> flagged.
> \bq. Using Collection.size() to test for emptiness works, but using 
> Collection.isEmpty() makes the code more readable and can be more performant. 
> The time complexity of any isEmpty() method implementation should be O(1) 
> whereas some implementations of size() can be O(n).
> An example of a flagged instance - 
> https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2510) Sonar : Use isEmpty() to check whether the collection is empty or not in Ozone Manager module

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2510:

Description: 
Ozone Manager has a number of instances where the following sonar rule is 
flagged.

bq. Using Collection.size() to test for emptiness works, but using 
Collection.isEmpty() makes the code more readable and can be more performant. 
The time complexity of any isEmpty() method implementation should be O(1) 
whereas some implementations of size() can be O(n).

An example of a flagged instance - 
https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false


  was:
Ozone Manager has a number of instances where the following sonar rule is 
flagged.

\bq. Using Collection.size() to test for emptiness works, but using 
Collection.isEmpty() makes the code more readable and can be more performant. 
The time complexity of any isEmpty() method implementation should be O(1) 
whereas some implementations of size() can be O(n).

An example of a flagged instance - 
https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false



> Sonar : Use isEmpty() to check whether the collection is empty or not in 
> Ozone Manager module
> -
>
> Key: HDDS-2510
> URL: https://issues.apache.org/jira/browse/HDDS-2510
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: sonar
>
> Ozone Manager has a number of instances where the following sonar rule is 
> flagged.
> bq. Using Collection.size() to test for emptiness works, but using 
> Collection.isEmpty() makes the code more readable and can be more performant. 
> The time complexity of any isEmpty() method implementation should be O(1) 
> whereas some implementations of size() can be O(n).
> An example of a flagged instance - 
> https://sonarcloud.io/issues?myIssues=true=AW5md-W4KcVY8lQ4Zrv_=false



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2510) Sonar : Use isEmpty() to check whether the collection is empty or not in Ozone Manager module

2019-11-15 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2510:
---

 Summary: Sonar : Use isEmpty() to check whether the collection is 
empty or not in Ozone Manager module
 Key: HDDS-2510
 URL: https://issues.apache.org/jira/browse/HDDS-2510
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2488) Not enough arguments for log messages in GrpcXceiverService

2019-11-15 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-2488:
---

Assignee: Aravindan Vijayan

> Not enough arguments for log messages in GrpcXceiverService
> ---
>
> Key: HDDS-2488
> URL: https://issues.apache.org/jira/browse/HDDS-2488
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Attila Doroszlai
>Assignee: Aravindan Vijayan
>Priority: Minor
>  Labels: sonar
>
> GrpcXceiverService has log messages with too few arguments for placeholders.  
> Only one of them is flagged by Sonar, but all seem to have the same problem.
> https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-69KcVY8lQ4ZsRZ=AW5md-69KcVY8lQ4ZsRZ
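
For context, a generic example of the mismatch Sonar flags with SLF4J-style 
loggers; this is illustrative code, not the actual GrpcXceiverService messages.

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PlaceholderExample {
  private static final Logger LOG =
      LoggerFactory.getLogger(PlaceholderExample.class);

  void onError(String containerId, Throwable t) {
    // Flagged: two placeholders but only one argument; the exception is dropped.
    LOG.error("Error processing container {} : {}", containerId);

    // Fixed: the argument matches the placeholder, and the throwable is passed
    // last so the stack trace is logged.
    LOG.error("Error processing container {}", containerId, t);
  }
}
{code}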



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.

2019-11-14 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDFS-14989:
-
Description: 
Borrowing from the design doc.

bq. The swapBlockList takes two parameters, a source file and a destination 
file. This operation swaps the blocks belonging to the source and the 
destination atomically.

bq. The namespace metadata of interest is the INodeFile class. A file 
(INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, 
BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile 
contains a list of blocks (BlockInfo[]). The operation will swap 
BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not 
touch other fields. To avoid complication, this operation will abort if either 
file is open (isUnderConstruction() == true)

bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST to 
record the change persistently.


  was:
Borrowing from the design doc.

bq. 
The swapBlockList takes two parameters, a source file and a destination file. 
This operation swaps the blocks belonging to the source and the destination 
atomically.

The namespace metadata of interest is the INodeFile class. A file (INodeFile) 
contains a header composed of PREFERRED_BLOCK_SIZE, BLOCK_LAYOUT_AND_REDUNDANCY 
and STORAGE_POLICY_ID. In addition, an INodeFile contains a list of blocks 
(BlockInfo[]). The operation will swap BLOCK_LAYOUT_AND_REDUNDANCY header bits 
and the block lists. But it will not touch other fields. To avoid complication, 
this operation will abort if either file is open (isUnderConstruction() == true)

Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST to 
record the change persistently.



> Add a 'swapBlockList' operation to Namenode.
> 
>
> Key: HDFS-14989
> URL: https://issues.apache.org/jira/browse/HDFS-14989
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Borrowing from the design doc.
> bq. The swapBlockList takes two parameters, a source file and a destination 
> file. This operation swaps the blocks belonging to the source and the 
> destination atomically.
> bq. The namespace metadata of interest is the INodeFile class. A file 
> (INodeFile) contains a header composed of PREFERRED_BLOCK_SIZE, 
> BLOCK_LAYOUT_AND_REDUNDANCY and STORAGE_POLICY_ID. In addition, an INodeFile 
> contains a list of blocks (BlockInfo[]). The operation will swap 
> BLOCK_LAYOUT_AND_REDUNDANCY header bits and the block lists. But it will not 
> touch other fields. To avoid complication, this operation will abort if 
> either file is open (isUnderConstruction() == true)
> bq. Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST 
> to record the change persistently.
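
A self-contained sketch of the swap semantics described above, using simplified 
stand-in types instead of the real INodeFile/BlockInfo classes; the actual 
NameNode change would also need edit-log handling for OP_SWAP_BLOCK_LIST.

{code:java}
import java.util.Arrays;

public class SwapBlockListSketch {

  // Simplified stand-in for INodeFile: only the fields the operation touches.
  static class FileNode {
    long blockLayoutAndRedundancy; // header bits that travel with the blocks
    long[] blocks;                 // stand-in for BlockInfo[]
    boolean underConstruction;

    FileNode(long layout, long[] blocks, boolean underConstruction) {
      this.blockLayoutAndRedundancy = layout;
      this.blocks = blocks;
      this.underConstruction = underConstruction;
    }
  }

  // Swaps the block lists and the BLOCK_LAYOUT_AND_REDUNDANCY bits, aborting
  // if either file is open; other header fields are left untouched.
  static void swapBlockList(FileNode src, FileNode dst) {
    if (src.underConstruction || dst.underConstruction) {
      throw new IllegalStateException("Cannot swap blocks of an open file");
    }
    long layout = src.blockLayoutAndRedundancy;
    src.blockLayoutAndRedundancy = dst.blockLayoutAndRedundancy;
    dst.blockLayoutAndRedundancy = layout;

    long[] blocks = src.blocks;
    src.blocks = dst.blocks;
    dst.blocks = blocks;
  }

  public static void main(String[] args) {
    FileNode src = new FileNode(1L, new long[]{101, 102}, false);
    FileNode dst = new FileNode(2L, new long[]{201}, false);
    swapBlockList(src, dst);
    System.out.println(Arrays.toString(src.blocks)); // [201]
    System.out.println(Arrays.toString(dst.blocks)); // [101, 102]
  }
}
{code}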



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-14989) Add a 'swapBlockList' operation to Namenode.

2019-11-14 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDFS-14989:


 Summary: Add a 'swapBlockList' operation to Namenode.
 Key: HDFS-14989
 URL: https://issues.apache.org/jira/browse/HDFS-14989
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan


Borrowing from the design doc.

bq. 
The swapBlockList takes two parameters, a source file and a destination file. 
This operation swaps the blocks belonging to the source and the destination 
atomically.

The namespace metadata of interest is the INodeFile class. A file (INodeFile) 
contains a header composed of PREFERRED_BLOCK_SIZE, BLOCK_LAYOUT_AND_REDUNDANCY 
and STORAGE_POLICY_ID. In addition, an INodeFile contains a list of blocks 
(BlockInfo[]). The operation will swap BLOCK_LAYOUT_AND_REDUNDANCY header bits 
and the block lists. But it will not touch other fields. To avoid complication, 
this operation will abort if either file is open (isUnderConstruction() == true)

Additionally, this operation introduces a new opcode OP_SWAP_BLOCK_LIST to 
record the change persistently.




--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.

2019-11-13 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2473:

Description: 
sonarcloud.io has flagged a number of code reliability issues in Ozone Recon 
(https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon).

The following issues will be triaged / fixed:
* Double Brace Initialization should not be used
* Resources should be closed
* InterruptedException should not be ignored
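
A generic sketch of these three patterns and their fixes (illustrative only, not 
the actual Recon call sites):

{code:java}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

public class ReliabilityPatterns {

  // 1. Avoid double-brace initialization (it creates an anonymous subclass that
  //    holds a reference to the enclosing instance); build the map normally.
  static Map<String, String> defaults() {
    Map<String, String> map = new HashMap<>();
    map.put("ozone.recon.db.dir", "/tmp/recon");
    return map;
  }

  // 2. Close resources with try-with-resources so they are released on all paths.
  static String readFirstLine(String path) throws IOException {
    try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
      return reader.readLine();
    }
  }

  // 3. Do not swallow InterruptedException; restore the interrupt status.
  static void waitBriefly() {
    try {
      Thread.sleep(1000);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
{code}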


> Fix code reliability issues found by Sonar in Ozone Recon module.
> -
>
> Key: HDDS-2473
> URL: https://issues.apache.org/jira/browse/HDDS-2473
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Recon
>Affects Versions: 0.5.0
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
> Fix For: 0.5.0
>
>
> sonarcloud.io has flagged a number of code reliability issues in Ozone Recon 
> (https://sonarcloud.io/code?id=hadoop-ozone=hadoop-ozone%3Ahadoop-ozone%2Frecon%2Fsrc%2Fmain%2Fjava%2Forg%2Fapache%2Fhadoop%2Fozone%2Frecon).
> The following issues will be triaged / fixed:
> * Double Brace Initialization should not be used
> * Resources should be closed
> * InterruptedException should not be ignored



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2473) Fix code reliability issues found by Sonar in Ozone Recon module.

2019-11-13 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2473:
---

 Summary: Fix code reliability issues found by Sonar in Ozone Recon 
module.
 Key: HDDS-2473
 URL: https://issues.apache.org/jira/browse/HDDS-2473
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Recon
Affects Versions: 0.5.0
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan
 Fix For: 0.5.0






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2472) Use try-with-resources while creating FlushOptions in RDBStore.

2019-11-13 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2472:
---

 Summary: Use try-with-resources while creating FlushOptions in 
RDBStore.
 Key: HDDS-2472
 URL: https://issues.apache.org/jira/browse/HDDS-2472
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Affects Versions: 0.5.0
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan
 Fix For: 0.5.0


Link to the sonar issue flag - 
https://sonarcloud.io/project/issues?id=hadoop-ozone=AW5md-zwKcVY8lQ4ZsJ4=AW5md-zwKcVY8lQ4ZsJ4.
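
A minimal sketch of the intended fix, assuming the standard RocksDB Java API, 
where FlushOptions is an AutoCloseable native handle:

{code:java}
import org.rocksdb.FlushOptions;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class FlushExample {
  // Before: FlushOptions was allocated and only closed on the happy path.
  // After: try-with-resources guarantees the native handle is released.
  static void flush(RocksDB db) throws RocksDBException {
    try (FlushOptions flushOptions = new FlushOptions().setWaitForFlush(true)) {
      db.flush(flushOptions);
    }
  }
}
{code}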
 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-1367) Add ability in Recon to track the growth rate of the cluster.

2019-11-11 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-1367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971870#comment-16971870
 ] 

Aravindan Vijayan commented on HDDS-1367:
-

[~dineshchitlangia] Apologies for the delayed response. In this JIRA, we will 
be covering only the growth of the cluster from the usage point of view. We 
will have to track the read/write ops through metrics from OM, SCM and 
Datanode. 

> Add ability in Recon to track the growth rate of the cluster. 
> --
>
> Key: HDDS-1367
> URL: https://issues.apache.org/jira/browse/HDDS-1367
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Vivek Ratnavel Subramanian
>Priority: Major
>
> Recon should be able to answer the question "How fast is the cluster growing, 
> by week, by month, by day?", which gives the user an idea of the usage stats 
> of the cluster. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2392) Fix TestScmSafeMode#testSCMSafeModeRestrictedOp

2019-11-07 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16969503#comment-16969503
 ] 

Aravindan Vijayan commented on HDDS-2392:
-

Thank you for reporting this [~hanishakoneru]. Created RATIS-747 to track the 
fix. 

> Fix TestScmSafeMode#testSCMSafeModeRestrictedOp
> ---
>
> Key: HDDS-2392
> URL: https://issues.apache.org/jira/browse/HDDS-2392
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Hanisha Koneru
>Assignee: Hanisha Koneru
>Priority: Blocker
>
> After ratis upgrade (HDDS-2340), TestScmSafeMode#testSCMSafeModeRestrictedOp 
> fails as the DNs fail to restart XceiverServerRatis. 
> RaftServer#start() fails with following exception:
> {code:java}
> java.io.IOException: java.lang.IllegalStateException: Not started
>   at org.apache.ratis.util.IOUtils.asIOException(IOUtils.java:54)
>   at org.apache.ratis.util.IOUtils.toIOException(IOUtils.java:61)
>   at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:70)
>   at 
> org.apache.ratis.server.impl.RaftServerProxy.getImpls(RaftServerProxy.java:284)
>   at 
> org.apache.ratis.server.impl.RaftServerProxy.start(RaftServerProxy.java:296)
>   at 
> org.apache.hadoop.ozone.container.common.transport.server.ratis.XceiverServerRatis.start(XceiverServerRatis.java:421)
>   at 
> org.apache.hadoop.ozone.container.ozoneimpl.OzoneContainer.start(OzoneContainer.java:215)
>   at 
> org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:110)
>   at 
> org.apache.hadoop.ozone.container.common.states.endpoint.VersionEndpointTask.call(VersionEndpointTask.java:42)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>   at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.IllegalStateException: Not started
>   at 
> org.apache.ratis.thirdparty.com.google.common.base.Preconditions.checkState(Preconditions.java:504)
>   at 
> org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl.getPort(ServerImpl.java:176)
>   at 
> org.apache.ratis.grpc.server.GrpcService.lambda$new$2(GrpcService.java:143)
>   at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62)
>   at 
> org.apache.ratis.grpc.server.GrpcService.getInetSocketAddress(GrpcService.java:182)
>   at 
> org.apache.ratis.server.impl.RaftServerImpl.lambda$new$0(RaftServerImpl.java:84)
>   at org.apache.ratis.util.MemoizedSupplier.get(MemoizedSupplier.java:62)
>   at 
> org.apache.ratis.server.impl.RaftServerImpl.getPeer(RaftServerImpl.java:136)
>   at 
> org.apache.ratis.server.impl.RaftServerMetrics.(RaftServerMetrics.java:70)
>   at 
> org.apache.ratis.server.impl.RaftServerMetrics.getRaftServerMetrics(RaftServerMetrics.java:62)
>   at 
> org.apache.ratis.server.impl.RaftServerImpl.(RaftServerImpl.java:119)
>   at 
> org.apache.ratis.server.impl.RaftServerProxy.lambda$newRaftServerImpl$2(RaftServerProxy.java:208)
>   at 
> java.util.concurrent.CompletableFuture$AsyncSupply.run(CompletableFuture.java:1590)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1873) Add API to get last completed times for every Recon task.

2019-11-06 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1873:

Description: 
Recon stores the last ozone manager snapshot received timestamp along with 
timestamps of last successful run for each task.

We can add a REST endpoint that returns the contents of this stored data. 

This is important to give users a sense of how latest the current data that 
they are looking at is. And, we need this per task because some tasks might 
fail to run or might take much longer time to run than other tasks and this 
needs to be reflected in the UI for better and consistent user experience.

  was:
Recon stores the last ozone manager snapshot received timestamp along with 
timestamps of last successful run for each task.

This is important to give users a sense of how latest the current data that 
they are looking at is. And, we need this per task because some tasks might 
fail to run or might take much longer time to run than other tasks and this 
needs to be reflected in the UI for better and consistent user experience.


> Add API to get last completed times for every Recon task.
> -
>
> Key: HDDS-1873
> URL: https://issues.apache.org/jira/browse/HDDS-1873
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Recon
>Affects Versions: 0.4.1
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Shweta
>Priority: Major
>
> Recon stores the last ozone manager snapshot received timestamp along with 
> timestamps of last successful run for each task.
> We can add a REST endpoint that returns the contents of this stored data. 
> This is important to give users a sense of how latest the current data that 
> they are looking at is. And, we need this per task because some tasks might 
> fail to run or might take much longer time to run than other tasks and this 
> needs to be reflected in the UI for better and consistent user experience.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1873) Add API to get last completed times for every Recon task.

2019-11-06 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1873:

Description: 
Recon stores the last ozone manager snapshot received timestamp along with 
timestamps of last successful run for each task.

This is important to give users a sense of how latest the current data that 
they are looking at is. And, we need this per task because some tasks might 
fail to run or might take much longer time to run than other tasks and this 
needs to be reflected in the UI for better and consistent user experience.

  was:
Recon stores the last ozone manager snapshot received timestamp along with 
timestamps of last successful run for each task.

The  
This is important to give users a sense of how latest the current data that 
they are looking at is. And, we need this per task because some tasks might 
fail to run or might take much longer time to run than other tasks and this 
needs to be reflected in the UI for better and consistent user experience.


> Add API to get last completed times for every Recon task.
> -
>
> Key: HDDS-1873
> URL: https://issues.apache.org/jira/browse/HDDS-1873
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Recon
>Affects Versions: 0.4.1
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Shweta
>Priority: Major
>
> Recon stores the last ozone manager snapshot received timestamp along with 
> timestamps of last successful run for each task.
> This is important to give users a sense of how latest the current data that 
> they are looking at is. And, we need this per task because some tasks might 
> fail to run or might take much longer time to run than other tasks and this 
> needs to be reflected in the UI for better and consistent user experience.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1873) Add API to get last completed times for every Recon task.

2019-11-06 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1873:

Description: 
Recon stores the last ozone manager snapshot received timestamp along with 
timestamps of last successful run for each task.

The  
This is important to give users a sense of how latest the current data that 
they are looking at is. And, we need this per task because some tasks might 
fail to run or might take much longer time to run than other tasks and this 
needs to be reflected in the UI for better and consistent user experience.

  was:
Recon should store last ozone manager snapshot received timestamp along with 
timestamps of last successful run for each task.

This is important to give users a sense of how latest the current data that 
they are looking at is. And, we need this per task because some tasks might 
fail to run or might take much longer time to run than other tasks and this 
needs to be reflected in the UI for better and consistent user experience.


> Add API to get last completed times for every Recon task.
> -
>
> Key: HDDS-1873
> URL: https://issues.apache.org/jira/browse/HDDS-1873
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Recon
>Affects Versions: 0.4.1
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Shweta
>Priority: Major
>
> Recon stores the last ozone manager snapshot received timestamp along with 
> timestamps of last successful run for each task.
> The  
> This is important to give users a sense of how latest the current data that 
> they are looking at is. And, we need this per task because some tasks might 
> fail to run or might take much longer time to run than other tasks and this 
> needs to be reflected in the UI for better and consistent user experience.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-1873) Add API to get last completed times for every Recon task.

2019-11-06 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-1873?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-1873:

Summary: Add API to get last completed times for every Recon task.  (was: 
Recon should store last successful run timestamp for each task)

> Add API to get last completed times for every Recon task.
> -
>
> Key: HDDS-1873
> URL: https://issues.apache.org/jira/browse/HDDS-1873
> Project: Hadoop Distributed Data Store
>  Issue Type: Sub-task
>  Components: Ozone Recon
>Affects Versions: 0.4.1
>Reporter: Vivek Ratnavel Subramanian
>Assignee: Shweta
>Priority: Major
>
> Recon should store last ozone manager snapshot received timestamp along with 
> timestamps of last successful run for each task.
> This is important to give users a sense of how fresh the data they are 
> looking at is. And, we need this per task because some tasks might fail to 
> run, or might take much longer to run than others, and this needs to be 
> reflected in the UI for a better and more consistent user experience.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2388) Teragen test failure due to OM exception

2019-10-31 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964434#comment-16964434
 ] 

Aravindan Vijayan commented on HDDS-2388:
-

[~shashikant] Is the OM crashing due to this error? This is on a different 
thread than the OM read/write path. 

> Teragen test failure due to OM exception
> 
>
> Key: HDDS-2388
> URL: https://issues.apache.org/jira/browse/HDDS-2388
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Shashikant Banerjee
>Priority: Major
> Fix For: 0.5.0
>
>
> Ran into below exception while running teragen:
> {code:java}
> Unable to get delta updates since sequenceNumber 79932 
> org.rocksdb.RocksDBException: Requested sequence not yet written in the db
>   at org.rocksdb.RocksDB.getUpdatesSince(Native Method)
>   at org.rocksdb.RocksDB.getUpdatesSince(RocksDB.java:3587)
>   at 
> org.apache.hadoop.hdds.utils.db.RDBStore.getUpdatesSince(RDBStore.java:338)
>   at 
> org.apache.hadoop.ozone.om.OzoneManager.getDBUpdates(OzoneManager.java:3283)
>   at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.getOMDBUpdates(OzoneManagerRequestHandler.java:404)
>   at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerRequestHandler.handle(OzoneManagerRequestHandler.java:314)
>   at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequestDirectlyToOM(OzoneManagerProtocolServerSideTranslatorPB.java:219)
>   at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.processRequest(OzoneManagerProtocolServerSideTranslatorPB.java:134)
>   at 
> org.apache.hadoop.hdds.server.OzoneProtocolMessageDispatcher.processRequest(OzoneProtocolMessageDispatcher.java:72)
>   at 
> org.apache.hadoop.ozone.protocolPB.OzoneManagerProtocolServerSideTranslatorPB.submitRequest(OzoneManagerProtocolServerSideTranslatorPB.java:102)
>   at 
> org.apache.hadoop.ozone.protocol.proto.OzoneManagerProtocolProtos$OzoneManagerService$2.callBlockingMethod(OzoneManagerProtocolProtos.java)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:528)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:984)
>   at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:912)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:422)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2882)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2364) Add a OM metrics to find the false positive rate for the keyMayExist

2019-10-31 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16964433#comment-16964433
 ] 

Aravindan Vijayan commented on HDDS-2364:
-

[~msingh] Thanks for the review. I will raise follow-up JIRAs for metrics that 
are not already exposed through RocksDB. 

> Add a OM metrics to find the false positive rate for the keyMayExist
> 
>
> Key: HDDS-2364
> URL: https://issues.apache.org/jira/browse/HDDS-2364
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.5.0
>Reporter: Mukul Kumar Singh
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Add a OM metrics to find the false positive rate for the keyMayExist.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Resolved] (HDDS-2355) Om double buffer flush termination with rocksdb error

2019-10-30 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan resolved HDDS-2355.
-
Resolution: Cannot Reproduce

> Om double buffer flush termination with rocksdb error
> -
>
> Key: HDDS-2355
> URL: https://issues.apache.org/jira/browse/HDDS-2355
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Aravindan Vijayan
>Priority: Blocker
> Fix For: 0.5.0
>
>
> om_1    |java.io.IOException: Unable to write the batch.
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/])
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/])
> om_1    |at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
> om_1    |at java.base/java.lang.Thread.run(Thread.java:834)
> om_1    |Caused by: org.rocksdb.RocksDBException: 
> WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in 
> default WriteCommitted mode). If it is not due to corruption, the WAL must be 
> emptied before changing the WritePolicy.
> om_1    |at org.rocksdb.RocksDB.write0(Native Method)
> om_1    |at org.rocksdb.RocksDB.write(RocksDB.java:1421)
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/])
>  
> In a few of my test runs I see this error and OM is terminated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2355) Om double buffer flush termination with rocksdb error

2019-10-30 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2355:

Status: Reopened  (was: Reopened)

> Om double buffer flush termination with rocksdb error
> -
>
> Key: HDDS-2355
> URL: https://issues.apache.org/jira/browse/HDDS-2355
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Aravindan Vijayan
>Priority: Blocker
> Fix For: 0.5.0
>
>
> om_1    |java.io.IOException: Unable to write the batch.
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/])
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/])
> om_1    |at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
> om_1    |at java.base/java.lang.Thread.run(Thread.java:834)
> om_1    |Caused by: org.rocksdb.RocksDBException: 
> WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in 
> default WriteCommitted mode). If it is not due to corruption, the WAL must be 
> emptied before changing the WritePolicy.
> om_1    |at org.rocksdb.RocksDB.write0(Native Method)
> om_1    |at org.rocksdb.RocksDB.write(RocksDB.java:1421)
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/])
>  
> In a few of my test runs I see this error and OM is terminated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Reopened] (HDDS-2355) Om double buffer flush termination with rocksdb error

2019-10-30 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reopened HDDS-2355:
-

Reopening for changing Resolution.

> Om double buffer flush termination with rocksdb error
> -
>
> Key: HDDS-2355
> URL: https://issues.apache.org/jira/browse/HDDS-2355
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Aravindan Vijayan
>Priority: Blocker
> Fix For: 0.5.0
>
>
> om_1    |java.io.IOException: Unable to write the batch.
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/])
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/])
> om_1    |at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
> om_1    |at java.base/java.lang.Thread.run(Thread.java:834)
> om_1    |Caused by: org.rocksdb.RocksDBException: 
> WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in 
> default WriteCommitted mode). If it is not due to corruption, the WAL must be 
> emptied before changing the WritePolicy.
> om_1    |at org.rocksdb.RocksDB.write0(Native Method)
> om_1    |at org.rocksdb.RocksDB.write(RocksDB.java:1421)
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/])
>  
> In a few of my test runs I see this error and OM is terminated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2380) Use the Table.isExist API instead of get() call while checking for presence of key.

2019-10-29 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2380:

Summary: Use the Table.isExist API instead of get() call while checking for 
presence of key.  (was: OMFileRequest should use the isExist API while checking 
for pre-existing files in the directory path. )

> Use the Table.isExist API instead of get() call while checking for presence 
> of key.
> ---
>
> Key: HDDS-2380
> URL: https://issues.apache.org/jira/browse/HDDS-2380
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.5.0
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, when OM creates a file/directory, it checks the absence of all 
> prefix paths of the key in its RocksDB. Since we don't care about the 
> deserialization of the actual value, we should use the isExist API added in 
> org.apache.hadoop.hdds.utils.db.Table which internally uses the more 
> performant keyMayExist API of RocksDB.
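
A minimal before/after sketch of the intended usage; the Table method shapes are 
assumed from the description above, and the table/key types are illustrative.

{code:java}
import java.io.IOException;

import org.apache.hadoop.hdds.utils.db.Table;
import org.apache.hadoop.ozone.om.helpers.OmKeyInfo;

public class PrefixCheckSketch {

  // Before: get() deserializes the whole value just to test for presence.
  static boolean existsViaGet(Table<String, OmKeyInfo> keyTable, String path)
      throws IOException {
    return keyTable.get(path) != null;
  }

  // After: isExist() can answer through RocksDB's keyMayExist without
  // deserializing the value.
  static boolean existsViaIsExist(Table<String, OmKeyInfo> keyTable, String path)
      throws IOException {
    return keyTable.isExist(path);
  }
}
{code}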



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2355) Om double buffer flush termination with rocksdb error

2019-10-29 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16962447#comment-16962447
 ] 

Aravindan Vijayan commented on HDDS-2355:
-

No longer able to reproduce this after applying the fix in HDDS-2379.

> Om double buffer flush termination with rocksdb error
> -
>
> Key: HDDS-2355
> URL: https://issues.apache.org/jira/browse/HDDS-2355
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Aravindan Vijayan
>Priority: Blocker
> Fix For: 0.5.0
>
>
> om_1    |java.io.IOException: Unable to write the batch.
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:48/])
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240|http://org.apache.hadoop.hdds.utils.db.rdbstore.commitbatchoperation%28rdbstore.java:240/])
> om_1    |at 
> org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
> om_1    |at java.base/java.lang.Thread.run(Thread.java:834)
> om_1    |Caused by: org.rocksdb.RocksDBException: 
> WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in 
> default WriteCommitted mode). If it is not due to corruption, the WAL must be 
> emptied before changing the WritePolicy.
> om_1    |at org.rocksdb.RocksDB.write0(Native Method)
> om_1    |at org.rocksdb.RocksDB.write(RocksDB.java:1421)
> om_1    | at 
> [org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46|http://org.apache.hadoop.hdds.utils.db.rdbbatchoperation.commit%28rdbbatchoperation.java:46/])
>  
> In a few of my test runs I see this error and OM is terminated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2380) OMFileRequest should use the isExist API while checking for pre-existing files in the directory path.

2019-10-29 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2380:
---

 Summary: OMFileRequest should use the isExist API while checking 
for pre-existing files in the directory path. 
 Key: HDDS-2380
 URL: https://issues.apache.org/jira/browse/HDDS-2380
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Reporter: Aravindan Vijayan
 Fix For: 0.5.0


Currently, when OM creates a file/directory, it checks the absence of all 
prefix paths of the key in its RocksDB. Since we don't care about the 
deserialization of the actual value, we should use the isExist API added in 
org.apache.hadoop.hdds.utils.db.Table which internally uses the more performant 
keyMayExist API of RocksDB.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2380) OMFileRequest should use the isExist API while checking for pre-existing files in the directory path.

2019-10-29 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-2380:
---

Assignee: Aravindan Vijayan

> OMFileRequest should use the isExist API while checking for pre-existing 
> files in the directory path. 
> --
>
> Key: HDDS-2380
> URL: https://issues.apache.org/jira/browse/HDDS-2380
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
> Fix For: 0.5.0
>
>
> Currently, when OM creates a file/directory, it checks the absence of all 
> prefix paths of the key in its RocksDB. Since we don't care about the 
> deserialization of the actual value, we should use the isExist API added in 
> org.apache.hadoop.hdds.utils.db.Table which internally uses the more 
> performant keyMayExist API of RocksDB.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2379) OM terminates with RocksDB error while continuously writing keys.

2019-10-29 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2379:
---

 Summary: OM terminates with RocksDB error while continuously 
writing keys.
 Key: HDDS-2379
 URL: https://issues.apache.org/jira/browse/HDDS-2379
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Reporter: Aravindan Vijayan
Assignee: Bharat Viswanadham
 Fix For: 0.5.0


Exception trace after writing around 800,000 keys.


{code}
2019-10-29 11:15:15,131 ERROR 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer: Terminating with 
exit status 1: During flush to DB encountered err
or in OMDoubleBuffer flush thread OMDoubleBufferFlushThread
java.io.IOException: Unable to write the batch.
at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
at 
org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240)
at 
org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.rocksdb.RocksDBException: unknown WriteBatch tag
at org.rocksdb.RocksDB.write0(Native Method)
at org.rocksdb.RocksDB.write(RocksDB.java:1421)
at 
org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
... 3 more
{code}

Assigning to [~bharat] since he has already started work on this. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2364) Add a OM metrics to find the false positive rate for the keyMayExist

2019-10-29 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-2364:
---

Assignee: Aravindan Vijayan

> Add a OM metrics to find the false positive rate for the keyMayExist
> 
>
> Key: HDDS-2364
> URL: https://issues.apache.org/jira/browse/HDDS-2364
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Affects Versions: 0.5.0
>Reporter: Mukul Kumar Singh
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Add a OM metrics to find the false positive rate for the keyMayExist.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2355) Om double buffer flush termination with rocksdb error

2019-10-28 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2355:

Issue Type: Bug  (was: Task)

> Om double buffer flush termination with rocksdb error
> -
>
> Key: HDDS-2355
> URL: https://issues.apache.org/jira/browse/HDDS-2355
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Bharat Viswanadham
>Assignee: Aravindan Vijayan
>Priority: Blocker
> Fix For: 0.5.0
>
>
> om_1    |java.io.IOException: Unable to write the batch.
> om_1    | at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
> om_1    | at org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240)
> om_1    |at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
> om_1    |at java.base/java.lang.Thread.run(Thread.java:834)
> om_1    |Caused by: org.rocksdb.RocksDBException: 
> WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in 
> default WriteCommitted mode). If it is not due to corruption, the WAL must be 
> emptied before changing the WritePolicy.
> om_1    |at org.rocksdb.RocksDB.write0(Native Method)
> om_1    |at org.rocksdb.RocksDB.write(RocksDB.java:1421)
> om_1    | at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
>  
> In a few of my test runs I see this error and OM is terminated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2355) Om double buffer flush termination with rocksdb error

2019-10-24 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-2355:
---

Assignee: Aravindan Vijayan

> Om double buffer flush termination with rocksdb error
> -
>
> Key: HDDS-2355
> URL: https://issues.apache.org/jira/browse/HDDS-2355
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Bharat Viswanadham
>Assignee: Aravindan Vijayan
>Priority: Critical
>
> om_1    |java.io.IOException: Unable to write the batch.
> om_1    | at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
> om_1    | at org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240)
> om_1    |at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
> om_1    |at java.base/java.lang.Thread.run(Thread.java:834)
> om_1    |Caused by: org.rocksdb.RocksDBException: 
> WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in 
> default WriteCommitted mode). If it is not due to corruption, the WAL must be 
> emptied before changing the WritePolicy.
> om_1    |at org.rocksdb.RocksDB.write0(Native Method)
> om_1    |at org.rocksdb.RocksDB.write(RocksDB.java:1421)
> om_1    | at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
>  
> In a few of my test runs I see this error and OM is terminated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2355) Om double buffer flush termination with rocksdb error

2019-10-24 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2355:

Priority: Critical  (was: Major)

> Om double buffer flush termination with rocksdb error
> -
>
> Key: HDDS-2355
> URL: https://issues.apache.org/jira/browse/HDDS-2355
> Project: Hadoop Distributed Data Store
>  Issue Type: Task
>Reporter: Bharat Viswanadham
>Priority: Critical
>
> om_1    |java.io.IOException: Unable to write the batch.
> om_1    | at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:48)
> om_1    | at org.apache.hadoop.hdds.utils.db.RDBStore.commitBatchOperation(RDBStore.java:240)
> om_1    |at org.apache.hadoop.ozone.om.ratis.OzoneManagerDoubleBuffer.flushTransactions(OzoneManagerDoubleBuffer.java:146)
> om_1    |at java.base/java.lang.Thread.run(Thread.java:834)
> om_1    |Caused by: org.rocksdb.RocksDBException: 
> WritePrepared/WriteUnprepared txn tag when write_after_commit_ is enabled (in 
> default WriteCommitted mode). If it is not due to corruption, the WAL must be 
> emptied before changing the WritePolicy.
> om_1    |at org.rocksdb.RocksDB.write0(Native Method)
> om_1    |at org.rocksdb.RocksDB.write(RocksDB.java:1421)
> om_1    | at org.apache.hadoop.hdds.utils.db.RDBBatchOperation.commit(RDBBatchOperation.java:46)
>  
> In a few of my test runs I see this error and OM is terminated.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2361) Ozone Manager init & start command prints out unnecessary line in the beginning.

2019-10-24 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2361:
---

 Summary: Ozone Manager init & start command prints out unnecessary 
line in the beginning.
 Key: HDDS-2361
 URL: https://issues.apache.org/jira/browse/HDDS-2361
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Aravindan Vijayan


{code}
[root@avijayan-om-1 ozone-0.5.0-SNAPSHOT]# bin/ozone --daemon start om
Ozone Manager classpath extended by
{code}

We could probably print this line only when extra elements are added to the OM 
classpath, or skip printing this line altogether.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2301) Write path: Reduce read contention in rocksDB

2019-10-24 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959100#comment-16959100
 ] 

Aravindan Vijayan commented on HDDS-2301:
-

[~sdeka] Here are some useful RocksDB configs in OM

* Enable RocksDB metrics - *ozone.metastore.rocksdb.statistics=ALL*
* Enable RocksDB logging - *rocksdb.logging.enabled=true*
* Enable RocksDB DEBUG logging - *rocksdb.logging.level=DEBUG*

> Write path: Reduce read contention in rocksDB
> -
>
> Key: HDDS-2301
> URL: https://issues.apache.org/jira/browse/HDDS-2301
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Rajesh Balamohan
>Assignee: Supratim Deka
>Priority: Major
>  Labels: performance
> Attachments: om_write_profile.png
>
>
> Benchmark: 
>  
>  A simple benchmark that creates 100s and 1000s of keys (empty directories) in 
> OM. This is done in a tight loop with multiple threads on the client side to put 
> enough load on the CPU. Note that the intention is to understand the bottlenecks in 
> OM (intentionally avoiding interactions with SCM & DN).
> Observation:
>  -
>  In the write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This 
> internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every 
> write operation. This turns out to be expensive and chokes the write path.
> [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155]
> [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63]
> In most cases, the directory being created would be a fresh entry. In such cases, 
> it would be good to try {{RocksDB::keyMayExist}}.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2301) Write path: Reduce read contention in rocksDB

2019-10-24 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16959100#comment-16959100
 ] 

Aravindan Vijayan edited comment on HDDS-2301 at 10/24/19 5:51 PM:
---

[~sdeka] Here are some useful RocksDB configs in OM

* Enable RocksDB metrics - *ozone.metastore.rocksdb.statistics=ALL*
* Enable RocksDB logging - *hadoop.hdds.db.rocksdb.logging.enabled=true*
* Enable RocksDB DEBUG logging - *hadoop.hdds.db.rocksdb.logging.level=DEBUG*
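
For reference, a minimal sketch of setting these keys programmatically (the same 
values can equally be placed in ozone-site.xml; this is illustrative only):
{code}
// Illustrative: enable RocksDB statistics and logging for OM via configuration.
OzoneConfiguration conf = new OzoneConfiguration();
conf.set("ozone.metastore.rocksdb.statistics", "ALL");
conf.set("hadoop.hdds.db.rocksdb.logging.enabled", "true");
conf.set("hadoop.hdds.db.rocksdb.logging.level", "DEBUG");
{code}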


was (Author: avijayan):
[~sdeka] Here are some useful RocksDB configs in OM

* Enable RocksDB metrics - *ozone.metastore.rocksdb.statistics=ALL*
* Enable RocksDB logging - *rocksdb.logging.enabled=true*
* Enable RocksDB DEBUG logging - *rocksdb.logging.level=DEBUG*

> Write path: Reduce read contention in rocksDB
> -
>
> Key: HDDS-2301
> URL: https://issues.apache.org/jira/browse/HDDS-2301
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Affects Versions: 0.5.0
>Reporter: Rajesh Balamohan
>Assignee: Supratim Deka
>Priority: Major
>  Labels: performance
> Attachments: om_write_profile.png
>
>
> Benchmark: 
>  
>  A simple benchmark that creates 100s and 1000s of keys (empty directories) in 
> OM. This is done in a tight loop with multiple threads on the client side to put 
> enough load on the CPU. Note that the intention is to understand the bottlenecks in 
> OM (intentionally avoiding interactions with SCM & DN).
> Observation:
>  -
>  In the write path, Ozone checks {{OMFileRequest.verifyFilesInPath}}. This 
> internally calls {{omMetadataManager.getKeyTable().get(dbKeyName)}} for every 
> write operation. This turns out to be expensive and chokes the write path.
> [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMDirectoryCreateRequest.java#L155]
> [https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/file/OMFileRequest.java#L63]
> In most cases, the directory being created would be a fresh entry. In such cases, 
> it would be good to try {{RocksDB::keyMayExist}}.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDDS-2320) Negative value seen for OM NumKeys Metric in JMX.

2019-10-17 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan reassigned HDDS-2320:
---

Assignee: Aravindan Vijayan

> Negative value seen for OM NumKeys Metric in JMX.
> -
>
> Key: HDDS-2320
> URL: https://issues.apache.org/jira/browse/HDDS-2320
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
> Attachments: Screen Shot 2019-10-17 at 11.31.08 AM.png
>
>
> While running teragen/terasort on a cluster and verifying the number of keys 
> created on Ozone Manager, I noticed that the value of the NumKeys counter metric 
> was negative !Screen Shot 2019-10-17 at 11.31.08 AM.png!.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2320) Negative value seen for OM NumKeys Metric in JMX.

2019-10-17 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954076#comment-16954076
 ] 

Aravindan Vijayan commented on HDDS-2320:
-

cc [~hanishakoneru] / [~bharat]

> Negative value seen for OM NumKeys Metric in JMX.
> -
>
> Key: HDDS-2320
> URL: https://issues.apache.org/jira/browse/HDDS-2320
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Priority: Major
> Attachments: Screen Shot 2019-10-17 at 11.31.08 AM.png
>
>
> While running teragen/terasort on a cluster and verifying the number of keys 
> created on Ozone Manager, I noticed that the value of the NumKeys counter metric 
> was negative !Screen Shot 2019-10-17 at 11.31.08 AM.png!.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2320) Negative value seen for OM NumKeys Metric in JMX.

2019-10-17 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2320:
---

 Summary: Negative value seen for OM NumKeys Metric in JMX.
 Key: HDDS-2320
 URL: https://issues.apache.org/jira/browse/HDDS-2320
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Reporter: Aravindan Vijayan
 Attachments: Screen Shot 2019-10-17 at 11.31.08 AM.png

While running teragen/terasort on a cluster and verifying the number of keys 
created on Ozone Manager, I noticed that the value of the NumKeys counter metric 
was negative !Screen Shot 2019-10-17 at 11.31.08 AM.png!.





--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key

2019-10-04 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944968#comment-16944968
 ] 

Aravindan Vijayan commented on HDDS-2241:
-

Thank you [~aengineer]. I will post a patch. 

> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipelines for a key
> 
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> information from SCM through an RPC for every block in the key. For large 
> files > 1GB, we may end up making a lot of RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, a simpler change would be to have a map (method local) of 
> ContainerID -> Pipeline that we get from SCM so that we don't need to make 
> repeated calls to SCM for the same containerID for a key. _Here, Number of 
> calls = Number of unique containerIDs_
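>
> A minimal sketch of the second option (the SCM client call and the location 
> accessors shown are illustrative stand-ins, not the final patch):
> {code}
> // Resolve pipelines for all blocks of one key while calling SCM at most
> // once per unique container ID.
> Map<Long, Pipeline> pipelineCache = new HashMap<>();
> for (OmKeyLocationInfo loc : keyLocations) {
>   long containerId = loc.getContainerID();
>   Pipeline pipeline = pipelineCache.get(containerId);
>   if (pipeline == null) {
>     // Only unseen container IDs trigger an SCM round trip.
>     pipeline = scmClient.getContainerWithPipeline(containerId).getPipeline();
>     pipelineCache.put(containerId, pipeline);
>   }
>   loc.setPipeline(pipeline);
> }
> {code}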



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2164) om.db.checkpoints is getting filling up fast

2019-10-04 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2164:

Resolution: Fixed
Status: Resolved  (was: Patch Available)

> om.db.checkpoints is getting filling up fast
> 
>
> Key: HDDS-2164
> URL: https://issues.apache.org/jira/browse/HDDS-2164
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Nanda kumar
>Assignee: Aravindan Vijayan
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> {{om.db.checkpoints}} is filling up fast, we should also clean this up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2254) Fix flaky unit testTestContainerStateMachine#testRatisSnapshotRetention

2019-10-04 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2254?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944704#comment-16944704
 ] 

Aravindan Vijayan commented on HDDS-2254:
-

Thanks for filing this [~swagle]. I will take a look. 

> Fix flaky unit testTestContainerStateMachine#testRatisSnapshotRetention
> ---
>
> Key: HDDS-2254
> URL: https://issues.apache.org/jira/browse/HDDS-2254
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.5.0
>Reporter: Siddharth Wagle
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Test always fails with assertion error:
> {code}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:86)
>   at org.junit.Assert.assertTrue(Assert.java:41)
>   at org.junit.Assert.assertTrue(Assert.java:52)
>   at 
> org.apache.hadoop.ozone.client.rpc.TestContainerStateMachine.testRatisSnapshotRetention(TestContainerStateMachine.java:188)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2253) Add container state to snapshot to avoid spurious missing container being reported.

2019-10-04 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2253:
---

 Summary: Add container state to snapshot to avoid spurious missing 
container being reported.
 Key: HDDS-2253
 URL: https://issues.apache.org/jira/browse/HDDS-2253
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Affects Versions: 0.4.0
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan






--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944053#comment-16944053
 ] 

Aravindan Vijayan edited comment on HDDS-2241 at 10/3/19 10:00 PM:
---

[~aengineer] This was not based on data we have seen in read performance tests. 
This is just an observation based on looking through the code while working on 
HDDS-2188 along with [~msingh].

Even if this does not show up as a bottleneck at this point in time, IMHO the 
2nd option looks like an obvious performance improvement that we should do. 


was (Author: avijayan):
[~aengineer] This was not based on data we have seen in read performance tests. 
This is just an observation based on looking through the code while working on 
HDDS-2188 along with [~msingh].  

> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipelines for a key
> 
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> information from SCM through an RPC for every block in the key. For large 
> files > 1GB, we may end up making a lot of RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, a simpler change would be to have a map (method local) of 
> ContainerID -> Pipeline that we get from SCM so that we don't need to make 
> repeated calls to SCM for the same containerID for a key. _Here, Number of 
> calls = Number of unique containerIDs_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944053#comment-16944053
 ] 

Aravindan Vijayan commented on HDDS-2241:
-

[~aengineer] This was not based on data we have seen in read performance tests. 
This is just an observation based on looking through the code while working on 
HDDS-2188 along with [~msingh].  

> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipelines for a key
> 
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> information from SCM through an RPC for every block in the key. For large 
> files > 1GB, we may end up making a lot of RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, a simpler change would be to have a map (method local) of 
> ContainerID -> Pipeline that we get from SCM so that we don't need to make 
> repeated calls to SCM for the same containerID for a key. _Here, Number of 
> calls = Number of unique containerIDs_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2241:

Summary: Optimize the refresh pipeline logic used by KeyManagerImpl to 
obtain the pipelines for a key  (was: Optimize the refresh pipeline logic used 
by KeyManagerImpl to obtain the pipeline for a key)

> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipelines for a key
> 
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> information from SCM through an RPC for every block in the key. For large 
> files > 1GB, we may end up making a lot of RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, a simpler change would be to have a map (method local) of 
> ContainerID -> Pipeline that we get from SCM so that we don't need to make 
> repeated calls to SCM for the same containerID for a key. _Here, Number of 
> calls = Number of unique containerIDs_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2241:

Description: 
Currently, while looking up a key, the Ozone Manager gets the pipeline 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making a lot of RPC calls for this. This can be optimized 
in a couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. _Here, Number of calls = 1 (or k depending 
on batch size)_
* Instead, a simpler change would be to have a map (method local) of 
ContainerID -> Pipeline that we get from SCM so that we don't need to make 
repeated calls to SCM for the same containerID for a key. _Here, Number of 
calls = Number of unique containerIDs_

  was:
Currently, while looking up a key, the Ozone Manager gets the pipeline 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making a lot of RPC calls for this. This can be optimized 
in a couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. _Here, Number of calls = 1 (or k depending 
on batch size)_
* Instead, we can have a simple map (method local) for ContainerID -> Pipeline 
that we get from SCM so that we don't need to make repeated calls to SCM for 
the same containerID for a key. _Here, Number of calls = Number of unique 
containerIDs_


> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipeline for a key
> ---
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> information from SCM through an RPC for every block in the key. For large 
> files > 1GB, we may end up making a lot of RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, a simpler change would be to have a map (method local) of 
> ContainerID -> Pipeline that we get from SCM so that we don't need to make 
> repeated calls to SCM for the same containerID for a key. _Here, Number of 
> calls = Number of unique containerIDs_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944026#comment-16944026
 ] 

Aravindan Vijayan edited comment on HDDS-2241 at 10/3/19 9:06 PM:
--

cc [~msingh] / [~aengineer] / [~bharat] / [~nanda]


was (Author: avijayan):
cc [~msingh] / [~aengineer] / [~bharat] / [~nanda619]

> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipeline for a key
> ---
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> location information from SCM through an RPC for every block in the key. For 
> large files > 1GB, we may end up making ~4 RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, we can have a simple map (method local) for ContainerID -> 
> Pipeline that we get from SCM so that we don't need to make repeated calls to 
> SCM for the same containerID for a key. _Here, Number of calls = Number of 
> unique containerIDs_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2241:

Description: 
Currently, while looking up a key, the Ozone Manager gets the pipeline 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making a lot of RPC calls for this. This can be optimized 
in a couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. _Here, Number of calls = 1 (or k depending 
on batch size)_
* Instead, we can have a simple map (method local) for ContainerID -> Pipeline 
that we get from SCM so that we don't need to make repeated calls to SCM for 
the same containerID for a key. _Here, Number of calls = Number of unique 
containerIDs_

  was:
Currently, while looking up a key, the Ozone Manager gets the pipeline 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a 
couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. _Here, Number of calls = 1 (or k depending 
on batch size)_
* Instead, we can have a simple map (method local) for ContainerID -> Pipeline 
that we get from SCM so that we don't need to make repeated calls to SCM for 
the same containerID for a key. _Here, Number of calls = Number of unique 
containerIDs_


> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipeline for a key
> ---
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> information from SCM through an RPC for every block in the key. For large 
> files > 1GB, we may end up making a lot of RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, we can have a simple map (method local) for ContainerID -> 
> Pipeline that we get from SCM so that we don't need to make repeated calls to 
> SCM for the same containerID for a key. _Here, Number of calls = Number of 
> unique containerIDs_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2241:

Description: 
Currently, while looking up a key, the Ozone Manager gets the pipeline 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a 
couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. _Here, Number of calls = 1 (or k depending 
on batch size)_
* Instead, we can have a simple map (method local) for ContainerID -> Pipeline 
that we get from SCM so that we don't need to make repeated calls to SCM for 
the same containerID for a key. _Here, Number of calls = Number of unique 
containerIDs_

  was:
Currently, while looking up a key, the Ozone Manager gets the pipeline location 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a 
couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. _Here, Number of calls = 1 (or k depending 
on batch size)_
* Instead, we can have a simple map (method local) for ContainerID -> Pipeline 
that we get from SCM so that we don't need to make repeated calls to SCM for 
the same containerID for a key. _Here, Number of calls = Number of unique 
containerIDs_


> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipeline for a key
> ---
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> information from SCM through an RPC for every block in the key. For large 
> files > 1GB, we may end up making ~4 RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, we can have a simple map (method local) for ContainerID -> 
> Pipeline that we get from SCM so that we don't need to make repeated calls to 
> SCM for the same containerID for a key. _Here, Number of calls = Number of 
> unique containerIDs_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16944026#comment-16944026
 ] 

Aravindan Vijayan commented on HDDS-2241:
-

cc [~msingh] / [~aengineer] / [~bharat] / [~nanda619]

> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipeline for a key
> ---
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> location information from SCM through an RPC for every block in the key. For 
> large files > 1GB, we may end up making ~4 RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, we can have a simple map (method local) for ContainerID -> 
> Pipeline that we get from SCM so that we don't need to make repeated calls to 
> SCM for the same containerID for a key. _Here, Number of calls = Number of 
> unique containerIDs_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2241:

Description: 
Currently, while looking up a key, the Ozone Manager gets the pipeline location 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a 
couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. _Here, Number of calls = 1 (or k depending 
on batch size)_
* Instead, we can have a simple map (method local) for ContainerID -> Pipeline 
that we get from SCM so that we don't need to make repeated calls to SCM for 
the same containerID for a key. _Here, Number of calls = Number of unique 
containerIDs_

  was:
Currently, while looking up a key, the Ozone Manager gets the pipeline location 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a 
couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. _Here, Number of calls = 1 (or k depending 
on batch size)_
* Instead, we can have a simple map (method local) for ContainerID -> Pipeline 
that we get from SCM so that we don't need to make calls to SCM again for the 
same pipeline for that key. _Here, Number of calls = Number of unique 
containerIDs_


> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipeline for a key
> ---
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> location information from SCM through an RPC for every block in the key. For 
> large files > 1GB, we may end up making ~4 RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, we can have a simple map (method local) for ContainerID -> 
> Pipeline that we get from SCM so that we don't need to make repeated calls to 
> SCM for the same containerID for a key. _Here, Number of calls = Number of 
> unique containerIDs_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2241:

Description: 
Currently, while looking up a key, the Ozone Manager gets the pipeline location 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a 
couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. _Here, Number of calls = 1 (or k depending 
on batch size)_
* Instead, we can have a simple map (method local) for ContainerID -> Pipeline 
that we get from SCM so that we don't need to make calls to SCM again for the 
same pipeline for that key. _Here, Number of calls = Number of unique 
containerIDs_

  was:
Currently, while looking up a key, the Ozone Manager gets the pipeline location 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a 
couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. 
* Instead, we can have a simple map (inside the method) for ContainerID -> 
Pipeline that we got from SCM so that we don't need to make calls to SCM again 
for the same pipeline. Here number of calls = number of unique containerIDs.


> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipeline for a key
> ---
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> location information from SCM through an RPC for every block in the key. For 
> large files > 1GB, we may end up making ~4 RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. _Here, Number of calls = 1 (or k 
> depending on batch size)_
> * Instead, we can have a simple map (method local) for ContainerID -> 
> Pipeline that we get from SCM so that we don't need to make calls to SCM 
> again for the same pipeline for that key. _Here, Number of calls = Number of 
> unique containerIDs_



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key

2019-10-03 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2241:

Description: 
Currently, while looking up a key, the Ozone Manager gets the pipeline location 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a 
couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file. To keep the 
number of containers passed in to SCM in a single call, we can have a fixed 
container batch size on the OM side. 
* Instead, we can have a simple map (inside the method) for ContainerID -> 
Pipeline that we got from SCM so that we don't need to make calls to SCM again 
for the same pipeline. Here number of calls = number of unique containerIDs.

  was:
Currently, while looking up a key, the Ozone Manager gets the pipeline location 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a 
couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file.
* Instead, we can have a method local cache for ContainerID -> Pipeline that we 
got from SCM so that we don't need to make calls to SCM again for the same 
pipeline.


> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the 
> pipeline for a key
> ---
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline 
> location information from SCM through an RPC for every block in the key. For 
> large files > 1GB, we may end up making ~4 RPC calls for this. This can be 
> optimized in a couple of ways
> * We can implement a batch getContainerWithPipeline API in SCM using which we 
> can get the pipeline info locations for all the blocks for a file. To keep 
> the number of containers passed in to SCM in a single call, we can have a 
> fixed container batch size on the OM side. 
> * Instead, we can have a simple map (inside the method) for ContainerID -> 
> Pipeline that we got from SCM so that we don't need to make calls to SCM 
> again for the same pipeline. Here number of calls = number of unique 
> containerIDs.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDDS-2188) Implement LocatedFileStatus & getFileBlockLocations to provide node/localization information to Yarn/Mapreduce

2019-10-03 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943983#comment-16943983
 ] 

Aravindan Vijayan edited comment on HDDS-2188 at 10/3/19 8:39 PM:
--

After discussing with [~msingh], we identified the following items to be 
implemented to make this work correctly. I will be creating separate JIRAs for 
handling each work item.

*This JIRA will cover the following*
* The getFileStatus call on the Ozone file system will compute and return an 
instance of LocatedFileStatus. This means that it will have the block locations 
for the file (if present) as part of the status. This will be used by the 
Map-Reduce applications automatically. 
* For the Ozone Manager to supplement the getFileStatus with Block info 
locations, we need to use a flag like the "refreshPipeline" flag to obtain the 
information from the SCM. Whenever the flag is set, OM will get the Block 
locations from SCM and include it in the returned File Status.

*New JIRAs will be created for the following.* 
* Currently, we get block location info from SCM for every block. This will 
lead to multiple SCM RPC calls to get the blocks for 1 file. We can implement a 
batch GET API for SCM using which we can get the Block info locations for all 
the blocks for a file. (HDDS-2241)
* As an optimization, we can cache the block info in the FileSystem client layer 
so that we can reuse it instead of making repeated RPC calls. An expiry-based 
Guava cache is one candidate. 
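
A minimal sketch of the expiry-based Guava cache idea (the key/value types, TTL, 
and size bound are placeholders, not a settled design):
{code}
import java.util.concurrent.TimeUnit;
import org.apache.hadoop.fs.BlockLocation;
import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

// Illustrative client-side cache so repeated lookups of the same blocks can
// skip the OM round trip; expiry keeps stale locations from living forever.
class BlockLocationCache {
  private final Cache<Long, BlockLocation[]> cache = CacheBuilder.newBuilder()
      .expireAfterWrite(5, TimeUnit.MINUTES)   // placeholder expiry
      .maximumSize(10_000)                     // placeholder bound
      .build();

  BlockLocation[] getIfPresent(long containerId) {
    return cache.getIfPresent(containerId);
  }

  void put(long containerId, BlockLocation[] locations) {
    cache.put(containerId, locations);
  }
}
{code}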


was (Author: avijayan):
After discussing with [~msingh], we identified the following items to be 
implemented to make this work correctly. I will be creating separate JIRAs for 
handling each work item.

*This JIRA will cover the following*
* The getFileStatus call on the Ozone file system will compute and return an 
instance of LocatedFileStatus. This means that it will have the block locations 
for the file (if present) as part of the status. This will be used by the 
Map-Reduce applications automatically. 
* For the Ozone Manager to supplement the getFileStatus with Block info 
locations, we need to use a flag like the "refreshPipeline" flag to obtain the 
information from the SCM. Whenever the flag is set, OM will get the Block 
locations from SCM and include it in the returned File Status.

*New JIRAs will be created for the following.* 
* Currently, we get block location info from SCM for every block. This will 
lead to multiple SCM RPC calls to get the blocks for 1 file. We can implement a 
batch GET API for SCM using which we can get the Block info locations for all 
the blocks for a file. 
* As an optimization, we can cache the block info in the FileSystem client layer 
so that we can reuse it instead of making repeated RPC calls. An expiry-based 
Guava cache is one candidate. 

> Implement LocatedFileStatus & getFileBlockLocations to provide 
> node/localization information to Yarn/Mapreduce
> --
>
> Key: HDDS-2188
> URL: https://issues.apache.org/jira/browse/HDDS-2188
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Affects Versions: 0.5.0
>Reporter: Mukul Kumar Singh
>Assignee: Aravindan Vijayan
>Priority: Major
>
> For applications like Hive/MapReduce to take advantage of the data locality 
> in Ozone, Ozone should return the location of the Ozone blocks. This is 
> needed for better read performance for Hadoop Applications.
> {code}
> if (file instanceof LocatedFileStatus) {
>   blkLocations = ((LocatedFileStatus) file).getBlockLocations();
> } else {
>   blkLocations = fs.getFileBlockLocations(file, 0, length);
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipeline for a key

2019-10-03 Thread Aravindan Vijayan (Jira)
Aravindan Vijayan created HDDS-2241:
---

 Summary: Optimize the refresh pipeline logic used by 
KeyManagerImpl to obtain the pipeline for a key
 Key: HDDS-2241
 URL: https://issues.apache.org/jira/browse/HDDS-2241
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Reporter: Aravindan Vijayan
Assignee: Aravindan Vijayan


Currently, while looking up a key, the Ozone Manager gets the pipeline location 
information from SCM through an RPC for every block in the key. For large files 
> 1GB, we may end up making ~4 RPC calls for this. This can be optimized in a 
couple of ways

* We can implement a batch getContainerWithPipeline API in SCM using which we 
can get the pipeline info locations for all the blocks for a file.
* Instead, we can have a method local cache for ContainerID -> Pipeline that we 
got from SCM so that we don't need to make calls to SCM again for the same 
pipeline.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDDS-2188) Implement LocatedFileStatus & getFileBlockLocations to provide node/localization information to Yarn/Mapreduce

2019-10-03 Thread Aravindan Vijayan (Jira)


[ 
https://issues.apache.org/jira/browse/HDDS-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16943983#comment-16943983
 ] 

Aravindan Vijayan commented on HDDS-2188:
-

After discussing with [~msingh], we identified the following items to be 
implemented to make this work correctly. I will be creating separate JIRAs for 
handling each work item.

*This JIRA will cover the following*
* The getFileStatus call on the Ozone file system will compute and return an 
instance of LocatedFileStatus. This means that it will have the block locations 
for the file (if present) as part of the status. This will be used by the 
Map-Reduce applications automatically. 
* For the Ozone Manager to supplement the getFileStatus with Block info 
locations, we need to use a flag like the "refreshPipeline" flag to obtain the 
information from the SCM. Whenever the flag is set, OM will get the Block 
locations from SCM and include it in the returned File Status.

*New JIRAs will be created for the following.* 
* Currently, we get block location info from SCM for every block. This will 
lead to multiple SCM RPC calls to get the blocks for 1 file. We can implement a 
batch GET API for SCM using which we can get the Block info locations for all 
the blocks for a file. 
* As an optimization, we can cache the block info in the FileSystem client layer 
so that we can reuse it instead of making repeated RPC calls. An expiry-based 
Guava cache is one candidate. 

> Implement LocatedFileStatus & getFileBlockLocations to provide 
> node/localization information to Yarn/Mapreduce
> --
>
> Key: HDDS-2188
> URL: https://issues.apache.org/jira/browse/HDDS-2188
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Affects Versions: 0.5.0
>Reporter: Mukul Kumar Singh
>Assignee: Aravindan Vijayan
>Priority: Major
>
> For applications like Hive/MapReduce to take advantage of the data locality 
> in Ozone, Ozone should return the location of the Ozone blocks. This is 
> needed for better read performance for Hadoop Applications.
> {code}
> if (file instanceof LocatedFileStatus) {
>   blkLocations = ((LocatedFileStatus) file).getBlockLocations();
> } else {
>   blkLocations = fs.getFileBlockLocations(file, 0, length);
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDDS-2200) Recon does not handle the NULL snapshot from OM DB cleanly.

2019-10-02 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDDS-2200 started by Aravindan Vijayan.
---
> Recon does not handle the NULL snapshot from OM DB cleanly.
> ---
>
> Key: HDDS-2200
> URL: https://issues.apache.org/jira/browse/HDDS-2200
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Recon
>Reporter: Aravindan Vijayan
>Assignee: Aravindan Vijayan
>Priority: Major
>
> {code}
> 2019-09-27 11:35:19,835 [pool-9-thread-1] ERROR  - Null snapshot location 
> got from OM.
> 2019-09-27 11:35:19,839 [pool-9-thread-1] INFO   - Calling reprocess on 
> Recon tasks.
> 2019-09-27 11:35:19,840 [pool-7-thread-1] INFO   - Starting a 'reprocess' 
> run of ContainerKeyMapperTask.
> 2019-09-27 11:35:20,069 [pool-7-thread-1] INFO   - Creating new Recon 
> Container DB at /tmp/recon/db/recon-container.db_1569609319840
> 2019-09-27 11:35:20,069 [pool-7-thread-1] INFO   - Cleaning up old Recon 
> Container DB at /tmp/recon/db/recon-container.db_1569609258721.
> 2019-09-27 11:35:20,144 [pool-9-thread-1] ERROR  - Unexpected error :
> java.util.concurrent.ExecutionException: java.lang.NullPointerException
> at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> at 
> org.apache.hadoop.ozone.recon.tasks.ReconTaskControllerImpl.reInitializeTasks(ReconTaskControllerImpl.java:181)
> at 
> org.apache.hadoop.ozone.recon.spi.impl.OzoneManagerServiceProviderImpl.syncDataFromOM(OzoneManagerServiceProviderImpl.java:333)
> at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.NullPointerException
> at 
> org.apache.hadoop.ozone.recon.tasks.ContainerKeyMapperTask.reprocess(ContainerKeyMapperTask.java:81)
> at 
> org.apache.hadoop.ozone.recon.tasks.ReconTaskControllerImpl.lambda$reInitializeTasks$3(ReconTaskControllerImpl.java:176)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> {code}
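
The trace above shows the null OM snapshot propagating into 
ContainerKeyMapperTask.reprocess. A minimal sketch of a defensive guard in the 
sync path (the class and method names are modelled on the stack trace, but the 
surrounding structure is an assumption for illustration, not the actual Recon 
code):

{code}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class NullSnapshotGuardSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(NullSnapshotGuardSketch.class);

  /** Hypothetical stand-ins for the OM snapshot handle and the task controller. */
  interface OmSnapshot { }
  interface TaskController {
    void reInitializeTasks(OmSnapshot snapshot);
  }

  private final TaskController taskController;

  NullSnapshotGuardSketch(TaskController taskController) {
    this.taskController = taskController;
  }

  /** Skips the reprocess cycle when OM returned no snapshot instead of passing null down. */
  void syncDataFromOM(OmSnapshot snapshotFromOm) {
    if (snapshotFromOm == null) {
      LOG.warn("Null snapshot location got from OM; skipping Recon task reprocess.");
      return;
    }
    taskController.reInitializeTasks(snapshotFromOm);
  }
}
{code}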



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDDS-2164) om.db.checkpoints is getting filling up fast

2019-10-02 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aravindan Vijayan updated HDDS-2164:

Status: Patch Available  (was: In Progress)

> om.db.checkpoints is getting filling up fast
> 
>
> Key: HDDS-2164
> URL: https://issues.apache.org/jira/browse/HDDS-2164
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Manager
>Reporter: Nanda kumar
>Assignee: Aravindan Vijayan
>Priority: Critical
>  Labels: pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> {{om.db.checkpoints}} is filling up fast; we should also clean this up.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Work started] (HDDS-2188) Implement LocatedFileStatus & getFileBlockLocations to provide node/localization information to Yarn/Mapreduce

2019-10-01 Thread Aravindan Vijayan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDDS-2188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HDDS-2188 started by Aravindan Vijayan.
---
> Implement LocatedFileStatus & getFileBlockLocations to provide 
> node/localization information to Yarn/Mapreduce
> --
>
> Key: HDDS-2188
> URL: https://issues.apache.org/jira/browse/HDDS-2188
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>  Components: Ozone Filesystem
>Affects Versions: 0.5.0
>Reporter: Mukul Kumar Singh
>Assignee: Aravindan Vijayan
>Priority: Major
>
> For applications like Hive/MapReduce to take advantage of the data locality 
> in Ozone, Ozone should return the location of the Ozone blocks. This is 
> needed for better read performance for Hadoop Applications.
> {code}
> if (file instanceof LocatedFileStatus) {
>   blkLocations = ((LocatedFileStatus) file).getBlockLocations();
> } else {
>   blkLocations = fs.getFileBlockLocations(file, 0, length);
> }
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org


