[jira] [Created] (HDFS-13091) Remove tomcat from the Hadoop-auth test bundle

2018-01-30 Thread Xiao Chen (JIRA)
Xiao Chen created HDFS-13091:


 Summary: Remove tomcat from the Hadoop-auth test bundle
 Key: HDFS-13091
 URL: https://issues.apache.org/jira/browse/HDFS-13091
 Project: Hadoop HDFS
  Issue Type: Task
Reporter: Xiao Chen
Assignee: Xiao Chen


We have switched KMS and HttpFS from Tomcat to Jetty in 3.0. There appear to be 
some leftover tests in Hadoop-auth that were used for KMS / HttpFS coverage.

We should clean up these tests accordingly.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13044) RBF: Add a safe mode for the Router

2018-01-30 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346291#comment-16346291
 ] 

Yiqun Lin commented on HDFS-13044:
--

+1 on the patch for branch-3.0. Please go ahead.

> RBF: Add a safe mode for the Router
> ---
>
> Key: HDFS-13044
> URL: https://issues.apache.org/jira/browse/HDFS-13044
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13004.000.patch, HDFS-13044-branch-3.0.000.patch, 
> HDFS-13044.001.patch, HDFS-13044.002.patch, HDFS-13044.003.patch, 
> HDFS-13044.004.patch, HDFS-13044.005.patch
>
>
> When a Router cannot communicate with the State Store, it should enter into a 
> safe mode that disallows certain operations.
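
For readers following along, the gating being proposed amounts to a check at the 
top of each disallowed operation. A minimal sketch under assumed names (none of 
these are the actual HDFS-13044 classes):
{code:java}
import java.io.IOException;

// Minimal sketch of a Router safe-mode gate; all names here are
// illustrative stand-ins, not the HDFS-13044 patch.
public class RouterSafeModeGuard {
  private volatile boolean safeMode = false;

  // Entered when heartbeats to the State Store fail.
  public void enterSafeMode() { safeMode = true; }

  // Left once State Store communication is restored.
  public void leaveSafeMode() { safeMode = false; }

  // Called at the top of operations that must be disallowed in safe mode.
  public void checkOperation(String op) throws IOException {
    if (safeMode) {
      throw new IOException("Router is in safe mode; rejecting " + op);
    }
  }
}
{code}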






[jira] [Updated] (HDFS-12528) Add an option to not disable short-circuit reads on failures

2018-01-30 Thread Xiao Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiao Chen updated HDFS-12528:
-
Fix Version/s: 2.10.0

> Add an option to not disable short-circuit reads on failures
> 
>
> Key: HDFS-12528
> URL: https://issues.apache.org/jira/browse/HDFS-12528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, performance
>Affects Versions: 2.6.0
>Reporter: Andre Araujo
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 3.0.1
>
> Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, 
> HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, 
> HDFS-12528.05.patch
>
>
> We have scenarios where data ingestion makes use of the -appendToFile 
> operation to add new data to existing HDFS files. In these situations, we're 
> frequently running into the problem described below.
> We're using Impala to query the HDFS data with short-circuit reads (SCR) 
> enabled. After each file read, Impala "unbuffers" the HDFS file to reduce 
> the memory footprint. In some cases, though, Impala still keeps the HDFS file 
> handle open for reuse.
> The "unbuffer" call, however, causes the file's current block reader to be 
> closed, which makes the associated ShortCircuitReplica evictable from the 
> ShortCircuitCache. When the cluster is under load, this means that the 
> ShortCircuitReplica can be purged off the cache pretty fast, which closes the 
> file descriptor to the underlying storage file.
> That means that when Impala re-reads the file it has to re-open the storage 
> files associated with the ShortCircuitReplica's that were evicted from the 
> cache. If there were no appends to those blocks, the re-open will succeed 
> without problems. If one block was appended since the ShortCircuitReplica was 
> created, the re-open will fail with the following error:
> {code}
> Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 
> not found
> {code}
> This error is handled as an "unknown response" by the BlockReaderFactory [1], 
> which disables short-circuit reads for 10 minutes [2] for the client.
> These 10 minutes without SCR can have a big performance impact for the client 
> operations. In this particular case ("Meta file not found") it would suffice 
> to return null without disabling SCR. This particular block read would fall 
> back to the normal, non-short-circuited, path and other SCR requests would 
> continue to work as expected.
> It might also be interesting to be able to control how long SCR is disabled 
> for in the "unknown response" case. 10 minutes seems a bit too long, and not 
> being able to change that is a problem.
> [1] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646
> [2] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97
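
As a rough illustration of the tunables discussed above, a client that wants to 
keep SCR usable across such failures might set something like the following. The 
interval key name is an assumption for illustration only; check the committed 
patch for the actual key ({{dfs.client.read.shortcircuit}} is the standard SCR 
switch):
{code:java}
import org.apache.hadoop.conf.Configuration;

public class ScrClientConf {
  public static Configuration scrFriendlyConf() {
    Configuration conf = new Configuration();
    // Standard switch to enable short-circuit reads on the client.
    conf.setBoolean("dfs.client.read.shortcircuit", true);
    // ASSUMED key name: shorten the window during which SCR stays
    // disabled after an "unknown response" (default described as 10 min).
    conf.setLong("dfs.domain.socket.disable.interval.seconds", 60);
    return conf;
  }
}
{code}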






[jira] [Commented] (HDFS-12528) Add an option to not disable short-circuit reads on failures

2018-01-30 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346285#comment-16346285
 ] 

Xiao Chen commented on HDFS-12528:
--

Thanks for the commit [~cheersyang].

Cherry-picked to branch-2 as well.

> Add an option to not disable short-circuit reads on failures
> 
>
> Key: HDFS-12528
> URL: https://issues.apache.org/jira/browse/HDFS-12528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, performance
>Affects Versions: 2.6.0
>Reporter: Andre Araujo
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 3.1.0, 2.10.0, 3.0.1
>
> Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, 
> HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, 
> HDFS-12528.05.patch
>
>
> We have scenarios where data ingestion makes use of the -appendToFile 
> operation to add new data to existing HDFS files. In these situations, we're 
> frequently running into the problem described below.
> We're using Impala to query the HDFS data with short-circuit reads (SCR) 
> enabled. After each file read, Impala "unbuffers" the HDFS file to reduce 
> the memory footprint. In some cases, though, Impala still keeps the HDFS file 
> handle open for reuse.
> The "unbuffer" call, however, causes the file's current block reader to be 
> closed, which makes the associated ShortCircuitReplica evictable from the 
> ShortCircuitCache. When the cluster is under load, this means that the 
> ShortCircuitReplica can be purged off the cache pretty fast, which closes the 
> file descriptor to the underlying storage file.
> That means that when Impala re-reads the file it has to re-open the storage 
> files associated with the ShortCircuitReplica's that were evicted from the 
> cache. If there were no appends to those blocks, the re-open will succeed 
> without problems. If one block was appended since the ShortCircuitReplica was 
> created, the re-open will fail with the following error:
> {code}
> Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 
> not found
> {code}
> This error is handled as an "unknown response" by the BlockReaderFactory [1], 
> which disables short-circuit reads for 10 minutes [2] for the client.
> These 10 minutes without SCR can have a big performance impact for the client 
> operations. In this particular case ("Meta file not found") it would suffice 
> to return null without disabling SCR. This particular block read would fall 
> back to the normal, non-short-circuited, path and other SCR requests would 
> continue to work as expected.
> It might also be interesting to be able to control how long SCR is disabled 
> for in the "unknown response" case. 10 minutes seems a bit too long, and not 
> being able to change that is a problem.
> [1] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646
> [2] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97






[jira] [Commented] (HDFS-13085) getStoragePolicy varies b/w REST and CLI for unspecified path

2018-01-30 Thread Surendra Singh Lilhore (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346259#comment-16346259
 ] 

Surendra Singh Lilhore commented on HDFS-13085:
---

Thanks [~brahmareddy] for reporting the issue.

One suggestion: currently the {{-getStoragePolicy}} command gives the current 
inode's policy; it does not inherit from the parent. We could add an optional 
parameter "-i" to the command to return the inherited policy when the current 
inode's policy is undefined.

For example: 
{noformat}
./hdfs storagepolicies -getStoragePolicy [-i] -path <path>
{noformat}
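
Conceptually, the "-i" lookup would walk up the parent chain until an explicit 
policy is found. A minimal sketch with hypothetical stand-in types (not the 
actual HDFS inode classes):
{code:java}
// Hypothetical stand-in for an inode; illustrates the proposed "-i" semantics.
class Node {
  Node parent;
  String storagePolicy; // null when unspecified

  String effectivePolicy(boolean inherit) {
    if (storagePolicy != null || !inherit) {
      return storagePolicy; // may be null, i.e. "unspecified"
    }
    return parent == null ? null : parent.effectivePolicy(true);
  }
}
{code}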

> getStoragePolicy varies b/w REST and CLI for unspecified path
> -
>
> Key: HDFS-13085
> URL: https://issues.apache.org/jira/browse/HDFS-13085
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Mukul Kumar Singh
>Priority: Major
>
> *CLI*
> {noformat}
> hdfs storagepolicies -getStoragePolicy -path /node-label-root
> The storage policy of /node-label-root is unspecified
> {noformat}
> *REST* 
> {noformat}
> curl -i "http://1*.*.*.*:50070/webhdfs/v1/node-label-root?op=GETSTORAGEPOLICY"
> {"BlockStoragePolicy":\{"copyOnCreateFile":false,"creationFallbacks":[],"id":7,"name":"HOT","replicationFallbacks":["ARCHIVE"],"storageTypes":["DISK"]}}
> {noformat}
> -HDFS-11968-  introduced this new behaviour for CLI.






[jira] [Commented] (HDFS-13083) RBF: Fix doc error setting up client

2018-01-30 Thread tartarus (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346233#comment-16346233
 ] 

tartarus commented on HDFS-13083:
-

[~elgoiri]  [~hudson]  It's my pleasure to contribute to the project.

> RBF: Fix doc error setting up client
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has an error in the client configuration 
> instructions:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> Executing *hdfs dfs -ls hdfs://ns-fed/tmp/* then fails with:
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct configuration is:
> {quote}
> <property>
>   <name>dfs.{color:#FF0000}ha{color}.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
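
For clients that set the configuration programmatically, the corrected settings 
correspond to something like the sketch below. Host names and ports are 
placeholders; the key names follow the HDFSRouterFederation docs:
{code:java}
import org.apache.hadoop.conf.Configuration;

public class RbfClientConf {
  public static Configuration routerClientConf() {
    Configuration conf = new Configuration();
    conf.set("fs.defaultFS", "hdfs://ns-fed");
    conf.set("dfs.nameservices", "ns-fed");
    // The doc error fixed here: the key must include "ha".
    conf.set("dfs.ha.namenodes.ns-fed", "r1,r2");
    // Placeholder router addresses.
    conf.set("dfs.namenode.rpc-address.ns-fed.r1", "router1.example.com:8888");
    conf.set("dfs.namenode.rpc-address.ns-fed.r2", "router2.example.com:8888");
    conf.set("dfs.client.failover.proxy.provider.ns-fed",
        "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
    return conf;
  }
}
{code}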






[jira] [Commented] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346231#comment-16346231
 ] 

genericqa commented on HDFS-12051:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
13m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
49s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
9s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
29s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
21s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
1m  5s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 1235 unchanged - 18 fixed = 1236 total (was 1253) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
1s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  9s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  2m  
3s{color} | {color:red} hadoop-hdfs-project/hadoop-hdfs generated 1 new + 0 
unchanged - 0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}136m 58s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}193m 32s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs-project/hadoop-hdfs |
|  |  Increment of volatile field 
org.apache.hadoop.hdfs.server.namenode.NameCache.size in 
org.apache.hadoop.hdfs.server.namenode.NameCache.put(byte[])  At 
NameCache.java:in org.apache.hadoop.hdfs.server.namenode.NameCache.put(byte[])  
At NameCache.java:[line 119] |
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.namenode.TestMetaSave |
|   | hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12051 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908462/HDFS-12051.09.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  xml  |
| uname | Linux 3849f6cfea35 3.13.0-135-generic 

[jira] [Commented] (HDFS-13043) RBF: Expose the state of the Routers in the federation

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346223#comment-16346223
 ] 

genericqa commented on HDFS-13043:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
57s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 35s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 47s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}152m 47s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 2s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}204m 30s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestPread |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.TestErasureCodeBenchmarkThroughput |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.TestLocalDFS |
|   | hadoop.hdfs.server.federation.metrics.TestFederationMetrics |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13043 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908459/HDFS-13043.007.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 0b729fdb416b 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 

[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346195#comment-16346195
 ] 

genericqa commented on HDFS-12997:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 18m 
46s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
7s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 36s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
58s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
56s{color} | {color:green} hadoop-hdfs-project_hadoop-hdfs generated 0 new + 
392 unchanged - 2 fixed = 392 total (was 394) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
38s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 152 unchanged - 7 fixed = 152 total (was 159) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}114m  1s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
25s{color} | {color:red} The patch generated 4 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}187m 24s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure080 |
|   | hadoop.hdfs.TestRollingUpgrade |
|   | hadoop.hdfs.TestAbandonBlock |
|   | hadoop.hdfs.TestDecommission |
|   | hadoop.hdfs.TestPread |
|   | hadoop.hdfs.server.datanode.TestRefreshNamenodes |
|   | hadoop.hdfs.TestReplaceDatanodeFailureReplication |
|   | hadoop.hdfs.TestSafeModeWithStripedFileWithRandomECPolicy |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | hadoop.hdfs.server.namenode.ha.TestStandbyIsHot |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12997 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908456/HDFS-12997.007.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux a4a183eca85c 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | 

[jira] [Commented] (HDFS-13043) RBF: Expose the state of the Routers in the federation

2018-01-30 Thread Yiqun Lin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346188#comment-16346188
 ] 

Yiqun Lin commented on HDFS-13043:
--

The patch almost looks good to me. Would you fix the related test failure? 
It seems to fail on these lines:
{code}
+  assertEquals("", json.get("lastMembershipUpdate"));
+  assertEquals("", json.get("lastMountTableUpdate"));
{code}

 

> RBF: Expose the state of the Routers in the federation
> --
>
> Key: HDFS-13043
> URL: https://issues.apache.org/jira/browse/HDFS-13043
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13043.000.patch, HDFS-13043.001.patch, 
> HDFS-13043.002.patch, HDFS-13043.003.patch, HDFS-13043.004.patch, 
> HDFS-13043.005.patch, HDFS-13043.006.patch, HDFS-13043.007.patch, 
> router-info.png
>
>
> The Router should expose the state of the other Routers in the federation 
> through a user UI.






[jira] [Commented] (HDFS-13083) RBF: Fix doc error setting up client

2018-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346183#comment-16346183
 ] 

Hudson commented on HDFS-13083:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13585 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13585/])
HDFS-13083. RBF: Fix doc error setting up client. Contributed by (inigoiri: rev 
5206b2c7ca479dd53f614d75bab594043a9866e1)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/site/markdown/HDFSRouterFederation.md


> RBF: Fix doc error setting up client
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has an error in the client configuration 
> instructions:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> Executing *hdfs dfs -ls hdfs://ns-fed/tmp/* then fails with:
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct configuration is:
> {quote}
> <property>
>   <name>dfs.{color:#FF0000}ha{color}.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}






[jira] [Commented] (HDFS-13089) Add test to validate dfs used and no of blocks when blocks are moved across volumes

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346172#comment-16346172
 ] 

genericqa commented on HDFS-13089:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 15m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 57s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
50s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  1m 
10s{color} | {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 1s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 54s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
47s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
46s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}147m 22s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}193m 49s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.tools.TestHdfsConfigFields |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13089 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908454/HDFS-13089.000.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 5568ef2dedcb 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f9dd5b6 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| mvninstall | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22896/artifact/out/patch-mvninstall-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 

[jira] [Updated] (HDFS-13083) RBF: Fix doc error setting up client

2018-01-30 Thread Íñigo Goiri (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13083:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: (was: 3.0.0)
   3.0.1
   2.9.1
   2.10.0
   3.1.0
   Status: Resolved  (was: Patch Available)

Thanks [~tartarus] for the contribution. Committed to {{trunk}}, 
{{branch-3.0}}, {{branch-2}}, and {{branch-2.9}}.

> RBF: Fix doc error setting up client
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.1.0, 2.10.0, 2.9.1, 3.0.1
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has an error in the client configuration 
> instructions:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> Executing *hdfs dfs -ls hdfs://ns-fed/tmp/* then fails with:
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct configuration is:
> {quote}
> <property>
>   <name>dfs.{color:#FF0000}ha{color}.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}






[jira] [Commented] (HDFS-13083) RBF: Fix doc error setting up client

2018-01-30 Thread tartarus (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346125#comment-16346125
 ] 

tartarus commented on HDFS-13083:
-

[~elgoiri]  Thank you for your optimization. It's okay, no problem.

> RBF: Fix doc error setting up client
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.0.0
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has an error in the client configuration 
> instructions:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> Executing *hdfs dfs -ls hdfs://ns-fed/tmp/* then fails with:
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct configuration is:
> {quote}
> <property>
>   <name>dfs.{color:#FF0000}ha{color}.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}






[jira] [Commented] (HDFS-12942) Synchronization issue in FSDataSetImpl#moveBlock

2018-01-30 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346120#comment-16346120
 ] 

Ajay Kumar commented on HDFS-12942:
---

The test failures are unrelated; all of them passed locally.

> Synchronization issue in FSDataSetImpl#moveBlock
> 
>
> Key: HDFS-12942
> URL: https://issues.apache.org/jira/browse/HDFS-12942
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-12942.001.patch, HDFS-12942.002.patch, 
> HDFS-12942.003.patch, HDFS-12942.004.patch, HDFS-12942.005.patch, 
> HDFS-12942.006.patch
>
>
> FSDataSetImpl#moveBlock works in the following 3 steps:
> # First creates a new replicaInfo object.
> # Calls finalizeReplica to finalize it.
> # Calls removeOldReplica to remove the old replica.
> A client can potentially append to the old replica between steps 1 and 2.
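
One way to picture the race and a possible fix: perform the finalize/remove 
steps together with a staleness re-check under the same dataset lock, so an 
append between steps 1 and 2 is detected. This is a sketch with stand-in names, 
not the actual FsDatasetImpl code:
{code:java}
import java.io.IOException;

// Illustrative sketch only; types and helpers are stand-ins.
abstract class MoveBlockSketch {
  static class Replica { long genStamp; }

  private final Object datasetLock = new Object();

  abstract Replica copyToNewVolume(Replica old) throws IOException; // step 1
  abstract void finalizeReplica(Replica r);                         // step 2
  abstract void removeOldReplica(Replica r);                        // step 3

  void moveBlock(Replica old) throws IOException {
    Replica copy = copyToNewVolume(old); // slow I/O, done outside the lock
    synchronized (datasetLock) {
      // An append bumps the generation stamp, so a copy taken before the
      // append is detected here instead of silently replacing live data.
      if (old.genStamp != copy.genStamp) {
        throw new IOException("Replica changed during move; aborting");
      }
      finalizeReplica(copy);
      removeOldReplica(old);
    }
  }
}
{code}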






[jira] [Commented] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory

2018-01-30 Thread Misha Dmitriev (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346080#comment-16346080
 ] 

Misha Dmitriev commented on HDFS-12051:
---

I ran another test in a cluster with ~30M HDFS files, where the NameNode uses 
~20GB of memory. The measurements before my change:

*Types of duplicate objects:*
||  *Overhead* ||  *# objects* ||  *Unique objects* || *Class name* ||
| 745,343K (4.0%)| 28,746,601| 9,566,545|byte[]|
| 48,014K (0.3%)| 1,533,733| 1,438|int[]|

 

The same table after my change:
||  *Overhead* ||  *# objects* ||  *Unique objects* || *Class name* ||
| 48,014K (0.3%)| 1,533,702| 1,407|int[]|
| 4,906K (< 0.1%)| 9,611,557| 9,553,953|byte[]|

 

In other words, we have about the same number of unique byte[] arrays, but 
almost none of them have duplicates now. As a result, we saved ~0.6GB of 
memory: the removed overhead of duplicate byte[] arrays (approx. 740MB) minus 
the size of the cache (approx. 160MB).
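
For readers unfamiliar with the approach, the interning scheme being measured 
boils down to a small, lossy canonicalizing cache. A minimal sketch (not the 
actual NameCache code):
{code:java}
import java.util.Arrays;

// Minimal sketch of byte[] interning: a fixed-size, lossy cache that
// returns one canonical copy for duplicate arrays.
public class ByteArrayInterner {
  private final byte[][] slots;

  public ByteArrayInterner(int capacity) {
    slots = new byte[capacity][];
  }

  public byte[] intern(byte[] name) {
    int i = (Arrays.hashCode(name) & 0x7fffffff) % slots.length;
    byte[] cached = slots[i];
    if (cached != null && Arrays.equals(cached, name)) {
      return cached; // duplicate: reuse the canonical copy
    }
    slots[i] = name; // lossy: a collision simply evicts the old entry
    return name;
  }
}
{code}
Duplicates that hash to the same slot collapse onto one array, while a miss 
costs only one array comparison, which is why the cache can stay small relative 
to the heap.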

 

> Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly 
> those denoting file/directory names) to save memory
> -
>
> Key: HDFS-12051
> URL: https://issues.apache.org/jira/browse/HDFS-12051
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch, 
> HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, 
> HDFS-12051.06.patch, HDFS-12051.07.patch, HDFS-12051.08.patch, 
> HDFS-12051.09.patch
>
>
> When snapshot diff operation is performed in a NameNode that manages several 
> million HDFS files/directories, NN needs a lot of memory. Analyzing one heap 
> dump with jxray (www.jxray.com), we observed that duplicate byte[] arrays 
> result in 6.5% memory overhead, and most of these arrays are referenced by 
> {{org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name}}
>  and {{org.apache.hadoop.hdfs.server.namenode.INodeFile.name}}:
> {code:java}
> 19. DUPLICATE PRIMITIVE ARRAYS
> Types of duplicate objects:
>  Ovhd Num objs  Num unique objs   Class name
> 3,220,272K (6.5%)   104749528  25760871 byte[]
> 
>   1,841,485K (3.7%), 53194037 dup arrays (13158094 unique)
> 3510556 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 2228255 
> of byte[8](48, 48, 48, 48, 48, 48, 95, 48), 357439 of byte[17](112, 97, 114, 
> 116, 45, 109, 45, 48, 48, 48, ...), 237395 of byte[8](48, 48, 48, 48, 48, 49, 
> 95, 48), 227853 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 
> 179193 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 169487 
> of byte[8](48, 48, 48, 48, 48, 50, 95, 48), 145055 of byte[17](112, 97, 114, 
> 116, 45, 109, 45, 48, 48, 48, ...), 128134 of byte[8](48, 48, 48, 48, 48, 51, 
> 95, 48), 108265 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...)
> ... and 45902395 more arrays, of which 13158084 are unique
>  <-- 
> org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name 
> <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff.snapshotINode 
> <--  {j.u.ArrayList} <-- 
> org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiffList.diffs <-- 
> org.apache.hadoop.hdfs.server.namenode.snapshot.FileWithSnapshotFeature.diffs 
> <-- org.apache.hadoop.hdfs.server.namenode.INode$Feature[] <-- 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.features <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- ... (1 
> elements) ... <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0
>  <-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java 
> Static: org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER
>   409,830K (0.8%), 13482787 dup arrays (13260241 unique)
> 430 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 353 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 352 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 350 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 342 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 340 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 337 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 334 of 
> byte[32](116, 97, 115, 107, 95, 49, 

[jira] [Commented] (HDFS-13048) LowRedundancyReplicatedBlocks metric can be negative

2018-01-30 Thread Chen Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16346064#comment-16346064
 ] 

Chen Liang commented on HDFS-13048:
---

Thanks for the catch [~ajisakaa]! +1 on v002 patch. 

> LowRedundancyReplicatedBlocks metric can be negative
> 
>
> Key: HDFS-13048
> URL: https://issues.apache.org/jira/browse/HDFS-13048
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: metrics
>Affects Versions: 3.0.0
>Reporter: Akira Ajisaka
>Assignee: Akira Ajisaka
>Priority: Major
> Attachments: HDFS-13048-sample.patch, HDFS-13048.001.patch, 
> HDFS-13048.002.patch
>
>
> I'm seeing {{LowRedundancyReplicatedBlocks}} become negative. This should be 
> 0 or positive.






[jira] [Updated] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory

2018-01-30 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HDFS-12051:
--
Status: Patch Available  (was: In Progress)

I've just updated this patch to make the NameNode automatically set the cache 
size as a percentage of the total heap (by default 1/512th, i.e. just under 
0.2%). I found in my testing that this works better than the previous 
fixed-size cache.
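
A sketch of that sizing rule (the bytes-per-slot estimate is an assumption for 
illustration, not the patch's exact arithmetic):
{code:java}
// Derive the cache capacity from the max heap: budget 1/512th of the
// heap, then divide by an assumed ~8 bytes per reference slot.
static int cacheCapacityFromHeap() {
  long budgetBytes = Runtime.getRuntime().maxMemory() / 512;
  return (int) Math.min(Integer.MAX_VALUE, budgetBytes / 8);
}
{code}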

> Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly 
> those denoting file/directory names) to save memory
> -
>
> Key: HDFS-12051
> URL: https://issues.apache.org/jira/browse/HDFS-12051
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch, 
> HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, 
> HDFS-12051.06.patch, HDFS-12051.07.patch, HDFS-12051.08.patch, 
> HDFS-12051.09.patch
>
>
> When snapshot diff operation is performed in a NameNode that manages several 
> million HDFS files/directories, NN needs a lot of memory. Analyzing one heap 
> dump with jxray (www.jxray.com), we observed that duplicate byte[] arrays 
> result in 6.5% memory overhead, and most of these arrays are referenced by 
> {{org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name}}
>  and {{org.apache.hadoop.hdfs.server.namenode.INodeFile.name}}:
> {code:java}
> 19. DUPLICATE PRIMITIVE ARRAYS
> Types of duplicate objects:
>  Ovhd Num objs  Num unique objs   Class name
> 3,220,272K (6.5%)   104749528  25760871 byte[]
> 
>   1,841,485K (3.7%), 53194037 dup arrays (13158094 unique)
> 3510556 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 2228255 
> of byte[8](48, 48, 48, 48, 48, 48, 95, 48), 357439 of byte[17](112, 97, 114, 
> 116, 45, 109, 45, 48, 48, 48, ...), 237395 of byte[8](48, 48, 48, 48, 48, 49, 
> 95, 48), 227853 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 
> 179193 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 169487 
> of byte[8](48, 48, 48, 48, 48, 50, 95, 48), 145055 of byte[17](112, 97, 114, 
> 116, 45, 109, 45, 48, 48, 48, ...), 128134 of byte[8](48, 48, 48, 48, 48, 51, 
> 95, 48), 108265 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...)
> ... and 45902395 more arrays, of which 13158084 are unique
>  <-- 
> org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name 
> <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff.snapshotINode 
> <--  {j.u.ArrayList} <-- 
> org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiffList.diffs <-- 
> org.apache.hadoop.hdfs.server.namenode.snapshot.FileWithSnapshotFeature.diffs 
> <-- org.apache.hadoop.hdfs.server.namenode.INode$Feature[] <-- 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.features <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- ... (1 
> elements) ... <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0
>  <-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java 
> Static: org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER
>   409,830K (0.8%), 13482787 dup arrays (13260241 unique)
> 430 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 353 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 352 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 350 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 342 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 340 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 337 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 334 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...)
> ... and 13479257 more arrays, of which 13260231 are unique
>  <-- org.apache.hadoop.hdfs.server.namenode.INodeFile.name <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- 
> org.apache.hadoop.util.LightWeightGSet$LinkedElement[] <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- 
> 

[jira] [Updated] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory

2018-01-30 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HDFS-12051:
--
Attachment: HDFS-12051.09.patch

> Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly 
> those denoting file/directory names) to save memory
> -
>
> Key: HDFS-12051
> URL: https://issues.apache.org/jira/browse/HDFS-12051
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch, 
> HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, 
> HDFS-12051.06.patch, HDFS-12051.07.patch, HDFS-12051.08.patch, 
> HDFS-12051.09.patch
>
>
> When snapshot diff operation is performed in a NameNode that manages several 
> million HDFS files/directories, NN needs a lot of memory. Analyzing one heap 
> dump with jxray (www.jxray.com), we observed that duplicate byte[] arrays 
> result in 6.5% memory overhead, and most of these arrays are referenced by 
> {{org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name}}
>  and {{org.apache.hadoop.hdfs.server.namenode.INodeFile.name}}:
> {code:java}
> 19. DUPLICATE PRIMITIVE ARRAYS
> Types of duplicate objects:
>  Ovhd Num objs  Num unique objs   Class name
> 3,220,272K (6.5%)   104749528  25760871 byte[]
> 
>   1,841,485K (3.7%), 53194037 dup arrays (13158094 unique)
> 3510556 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 2228255 
> of byte[8](48, 48, 48, 48, 48, 48, 95, 48), 357439 of byte[17](112, 97, 114, 
> 116, 45, 109, 45, 48, 48, 48, ...), 237395 of byte[8](48, 48, 48, 48, 48, 49, 
> 95, 48), 227853 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 
> 179193 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 169487 
> of byte[8](48, 48, 48, 48, 48, 50, 95, 48), 145055 of byte[17](112, 97, 114, 
> 116, 45, 109, 45, 48, 48, 48, ...), 128134 of byte[8](48, 48, 48, 48, 48, 51, 
> 95, 48), 108265 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...)
> ... and 45902395 more arrays, of which 13158084 are unique
>  <-- 
> org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name 
> <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff.snapshotINode 
> <--  {j.u.ArrayList} <-- 
> org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiffList.diffs <-- 
> org.apache.hadoop.hdfs.server.namenode.snapshot.FileWithSnapshotFeature.diffs 
> <-- org.apache.hadoop.hdfs.server.namenode.INode$Feature[] <-- 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.features <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- ... (1 
> elements) ... <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0
>  <-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java 
> Static: org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER
>   409,830K (0.8%), 13482787 dup arrays (13260241 unique)
> 430 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 353 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 352 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 350 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 342 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 340 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 337 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 334 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...)
> ... and 13479257 more arrays, of which 13260231 are unique
>  <-- org.apache.hadoop.hdfs.server.namenode.INodeFile.name <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- 
> org.apache.hadoop.util.LightWeightGSet$LinkedElement[] <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0
>  <-- j.l.Thread[] <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- 
> 

[jira] [Updated] (HDFS-12051) Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly those denoting file/directory names) to save memory

2018-01-30 Thread Misha Dmitriev (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Misha Dmitriev updated HDFS-12051:
--
Status: In Progress  (was: Patch Available)

> Reimplement NameCache in NameNode: Intern duplicate byte[] arrays (mainly 
> those denoting file/directory names) to save memory
> -
>
> Key: HDFS-12051
> URL: https://issues.apache.org/jira/browse/HDFS-12051
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Misha Dmitriev
>Assignee: Misha Dmitriev
>Priority: Major
> Attachments: HDFS-12051.01.patch, HDFS-12051.02.patch, 
> HDFS-12051.03.patch, HDFS-12051.04.patch, HDFS-12051.05.patch, 
> HDFS-12051.06.patch, HDFS-12051.07.patch, HDFS-12051.08.patch
>
>
> When snapshot diff operation is performed in a NameNode that manages several 
> million HDFS files/directories, NN needs a lot of memory. Analyzing one heap 
> dump with jxray (www.jxray.com), we observed that duplicate byte[] arrays 
> result in 6.5% memory overhead, and most of these arrays are referenced by 
> {{org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name}}
>  and {{org.apache.hadoop.hdfs.server.namenode.INodeFile.name}}:
> {code:java}
> 19. DUPLICATE PRIMITIVE ARRAYS
> Types of duplicate objects:
>  Ovhd Num objs  Num unique objs   Class name
> 3,220,272K (6.5%)   104749528  25760871 byte[]
> 
>   1,841,485K (3.7%), 53194037 dup arrays (13158094 unique)
> 3510556 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 2228255 
> of byte[8](48, 48, 48, 48, 48, 48, 95, 48), 357439 of byte[17](112, 97, 114, 
> 116, 45, 109, 45, 48, 48, 48, ...), 237395 of byte[8](48, 48, 48, 48, 48, 49, 
> 95, 48), 227853 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 
> 179193 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...), 169487 
> of byte[8](48, 48, 48, 48, 48, 50, 95, 48), 145055 of byte[17](112, 97, 114, 
> 116, 45, 109, 45, 48, 48, 48, ...), 128134 of byte[8](48, 48, 48, 48, 48, 51, 
> 95, 48), 108265 of byte[17](112, 97, 114, 116, 45, 109, 45, 48, 48, 48, ...)
> ... and 45902395 more arrays, of which 13158084 are unique
>  <-- 
> org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes$SnapshotCopy.name 
> <-- org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiff.snapshotINode 
> <--  {j.u.ArrayList} <-- 
> org.apache.hadoop.hdfs.server.namenode.snapshot.FileDiffList.diffs <-- 
> org.apache.hadoop.hdfs.server.namenode.snapshot.FileWithSnapshotFeature.diffs 
> <-- org.apache.hadoop.hdfs.server.namenode.INode$Feature[] <-- 
> org.apache.hadoop.hdfs.server.namenode.INodeFile.features <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- ... (1 
> elements) ... <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0
>  <-- j.l.Thread[] <-- j.l.ThreadGroup.threads <-- j.l.Thread.group <-- Java 
> Static: org.apache.hadoop.fs.FileSystem$Statistics.STATS_DATA_CLEANER
>   409,830K (0.8%), 13482787 dup arrays (13260241 unique)
> 430 of byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 353 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 352 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 350 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 342 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 341 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 340 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 337 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...), 334 of 
> byte[32](116, 97, 115, 107, 95, 49, 52, 57, 55, 48, ...)
> ... and 13479257 more arrays, of which 13260231 are unique
>  <-- org.apache.hadoop.hdfs.server.namenode.INodeFile.name <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.bc <-- 
> org.apache.hadoop.util.LightWeightGSet$LinkedElement[] <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.blocksMap <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$BlockReportProcessingThread.this$0
>  <-- j.l.Thread[] <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap$1.entries <-- 
> org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.blocks <-- 
> 
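
For illustration, a minimal sketch of the interning idea behind this jira, 
using only standard JDK classes. The class and method names below are 
hypothetical, not from the attached patches, and a production NameCache would 
also need to bound or age out the map:

{code:java}
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

public final class ByteArrayInterner {

  /** Wrapper giving byte[] content-based equals/hashCode so it can key a map. */
  private static final class Key {
    private final byte[] bytes;
    Key(byte[] bytes) { this.bytes = bytes; }
    @Override
    public boolean equals(Object o) {
      return o instanceof Key && Arrays.equals(bytes, ((Key) o).bytes);
    }
    @Override
    public int hashCode() { return Arrays.hashCode(bytes); }
  }

  private final ConcurrentHashMap<Key, byte[]> cache = new ConcurrentHashMap<>();

  /** Returns a canonical array equal to {@code name}; duplicates share one copy. */
  public byte[] intern(byte[] name) {
    if (name == null) {
      return null;
    }
    byte[] canonical = cache.putIfAbsent(new Key(name), name);
    return canonical != null ? canonical : name;
  }
}
{code}

With something like this, all {{INodeFile.name}} references to equal content 
would point at a single array, which is exactly the duplication the jxray 
report above quantifies.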

[jira] [Updated] (HDFS-13043) RBF: Expose the state of the Routers in the federation

2018-01-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13043:
---
Attachment: HDFS-13043.007.patch

> RBF: Expose the state of the Routers in the federation
> --
>
> Key: HDFS-13043
> URL: https://issues.apache.org/jira/browse/HDFS-13043
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13043.000.patch, HDFS-13043.001.patch, 
> HDFS-13043.002.patch, HDFS-13043.003.patch, HDFS-13043.004.patch, 
> HDFS-13043.005.patch, HDFS-13043.006.patch, HDFS-13043.007.patch, 
> router-info.png
>
>
> The Router should expose the state of the other Routers in the federation 
> through a user-facing UI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13044) RBF: Add a safe mode for the Router

2018-01-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346038#comment-16346038
 ] 

Íñigo Goiri commented on HDFS-13044:


The patch for branch-3.0 [^HDFS-13044-branch-3.0.000.patch] seems to run out 
of memory when running {{TestRouterSafemode}}; not sure why.

> RBF: Add a safe mode for the Router
> ---
>
> Key: HDFS-13044
> URL: https://issues.apache.org/jira/browse/HDFS-13044
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13004.000.patch, HDFS-13044-branch-3.0.000.patch, 
> HDFS-13044.001.patch, HDFS-13044.002.patch, HDFS-13044.003.patch, 
> HDFS-13044.004.patch, HDFS-13044.005.patch
>
>
> When a Router cannot communicate with the State Store, it should enter into a 
> safe mode that disallows certain operations.
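
For illustration, a minimal sketch of such a guard, built around a simple 
volatile flag. All names below are hypothetical; the actual Router classes in 
the patches may differ:

{code:java}
import java.io.IOException;

/** Illustrative sketch only; not the Router implementation from the patches. */
public class RouterSafemodeGuard {
  private volatile boolean safeMode = false;

  /** Called by a periodic monitor when the State Store becomes unreachable. */
  public void enterSafeMode() { safeMode = true; }

  /** Called once State Store connectivity is verified again. */
  public void leaveSafeMode() { safeMode = false; }

  /** Invoked at the start of each guarded RPC; rejected during safe mode. */
  public void checkOperation(String op) throws IOException {
    if (safeMode) {
      throw new IOException("Router is in safe mode and cannot handle " + op);
    }
  }
}
{code}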



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12512) RBF: Add WebHDFS

2018-01-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346035#comment-16346035
 ] 

Íñigo Goiri commented on HDFS-12512:


Tested and working fine; we had most of it already there and the refactoring 
doesn't break it :)
Only the comments I left on 19/Jan/18 at 18:11 remain.

> RBF: Add WebHDFS
> 
>
> Key: HDFS-12512
> URL: https://issues.apache.org/jira/browse/HDFS-12512
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Wei Yan
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12512.000.patch, HDFS-12512.001.patch, 
> HDFS-12512.002.patch, HDFS-12512.003.patch
>
>
> The Router currently does not support WebHDFS. It needs to implement 
> something similar to {{NamenodeWebHdfsMethods}}.
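
For illustration, a hedged sketch of the JAX-RS shape such an endpoint could 
take. The class name, path, and forwarding logic below are hypothetical; only 
the javax.ws.rs annotation style mirrors what {{NamenodeWebHdfsMethods}} uses:

{code:java}
import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.MediaType;
import javax.ws.rs.core.Response;

/** Hypothetical shape of a Router-side WebHDFS endpoint; sketch only. */
@Path("/webhdfs/v1")
public class RouterWebHdfsMethodsSketch {

  @GET
  @Produces(MediaType.APPLICATION_JSON)
  public Response get(@QueryParam("op") final String op) {
    // A real implementation would route the operation to the correct
    // subcluster via the Router's RPC client, the way
    // NamenodeWebHdfsMethods dispatches against the local NameNode.
    return Response.ok("{\"op\":\"" + op + "\"}").build();
  }
}
{code}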



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13090) Support composite trusted channel resolver that supports both whitelist and blacklist

2018-01-30 Thread Ajay Kumar (JIRA)
Ajay Kumar created HDFS-13090:
-

 Summary: Support composite trusted channel resolver that supports 
both whitelist and blacklist
 Key: HDFS-13090
 URL: https://issues.apache.org/jira/browse/HDFS-13090
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Ajay Kumar
Assignee: Ajay Kumar


Support a composite trusted channel resolver that handles both a whitelist and 
a blacklist.
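
For illustration, a minimal sketch of one way to compose the two resolvers. It 
assumes the existing {{TrustedChannelResolver}} API, and it treats a blacklist 
resolver's {{isTrusted}} as "matches the blacklist"; both are assumptions, not 
taken from a patch:

{code:java}
import java.net.InetAddress;
import org.apache.hadoop.hdfs.protocol.datatransfer.TrustedChannelResolver;

/** Sketch only; a channel is trusted iff whitelisted and not blacklisted. */
public class CompositeTrustedChannelResolver extends TrustedChannelResolver {
  private final TrustedChannelResolver whitelist; // trusts known-good peers
  private final TrustedChannelResolver blacklist; // "trusts" = matches blacklist

  public CompositeTrustedChannelResolver(TrustedChannelResolver whitelist,
      TrustedChannelResolver blacklist) {
    this.whitelist = whitelist;
    this.blacklist = blacklist;
  }

  @Override
  public boolean isTrusted() {
    return whitelist.isTrusted() && !blacklist.isTrusted();
  }

  @Override
  public boolean isTrusted(InetAddress peer) {
    return whitelist.isTrusted(peer) && !blacklist.isTrusted(peer);
  }
}
{code}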



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage

2018-01-30 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346028#comment-16346028
 ] 

Ajay Kumar commented on HDFS-12997:
---

[~xyao], Updated patch v7 to address suggestions. 

> Move logging to slf4j in BlockPoolSliceStorage and Storage 
> ---
>
> Key: HDFS-12997
> URL: https://issues.apache.org/jira/browse/HDFS-12997
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch, 
> HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch, 
> HDFS-12997.006.patch, HDFS-12997.007.patch
>
>
> Move logging to slf4j in BlockPoolSliceStorage and Storage classes.
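
For illustration, the usual commons-logging to slf4j migration pattern this 
sub-task applies. The class below is a hypothetical stand-in, not the patched 
file:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class StorageLoggingExample {
  // Before: private static final Log LOG = LogFactory.getLog(Storage.class);
  private static final Logger LOG =
      LoggerFactory.getLogger(StorageLoggingExample.class);

  void analyze(String bpid) {
    // Parameterized logging defers string building until the level is enabled.
    LOG.info("Analyzing storage directories for bpid {}", bpid);
  }
}
{code}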



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage

2018-01-30 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-12997:
--
Attachment: HDFS-12997.007.patch

> Move logging to slf4j in BlockPoolSliceStorage and Storage 
> ---
>
> Key: HDFS-12997
> URL: https://issues.apache.org/jira/browse/HDFS-12997
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch, 
> HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch, 
> HDFS-12997.006.patch, HDFS-12997.007.patch
>
>
> Move logging to slf4j in BlockPoolSliceStorage and Storage classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12528) Add an option to not disable short-circuit reads on failures

2018-01-30 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-12528:
---
Fix Version/s: 3.0.1

> Add an option to not disable short-circuit reads on failures
> 
>
> Key: HDFS-12528
> URL: https://issues.apache.org/jira/browse/HDFS-12528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, performance
>Affects Versions: 2.6.0
>Reporter: Andre Araujo
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, 
> HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, 
> HDFS-12528.05.patch
>
>
> We have scenarios where data ingestion makes use of the -appendToFile 
> operation to add new data to existing HDFS files. In these situations, we're 
> frequently running into the problem described below.
> We're using Impala to query the HDFS data with short-circuit reads (SCR) 
> enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce 
> the memory footprint. In some cases, though, Impala still keeps the HDFS file 
> handle open for reuse.
> The "unbuffer" call, however, causes the file's current block reader to be 
> closed, which makes the associated ShortCircuitReplica evictable from the 
> ShortCircuitCache. When the cluster is under load, this means that the 
> ShortCircuitReplica can be purged off the cache pretty fast, which closes the 
> file descriptor to the underlying storage file.
> That means that when Impala re-reads the file it has to re-open the storage 
> files associated with the ShortCircuitReplicas that were evicted from the 
> cache. If there were no appends to those blocks, the re-open will succeed 
> without problems. If one block was appended since the ShortCircuitReplica was 
> created, the re-open will fail with the following error:
> {code}
> Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 
> not found
> {code}
> This error is handled as an "unknown response" by the BlockReaderFactory [1], 
> which disables short-circuit reads for 10 minutes [2] for the client.
> These 10 minutes without SCR can have a big performance impact for the client 
> operations. In this particular case ("Meta file not found") it would suffice 
> to return null without disabling SCR. This particular block read would fall 
> back to the normal, non-short-circuited, path and other SCR requests would 
> continue to work as expected.
> It might also be interesting to be able to control how long SCR is disabled 
> for in the "unknown response" case. 10 minutes seems a bit too long, and not 
> being able to change that is a problem.
> [1] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646
> [2] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12528) Add an option to not disable short-circuit reads on failures

2018-01-30 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16346009#comment-16346009
 ] 

Hudson commented on HDFS-12528:
---

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #13584 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/13584/])
HDFS-12528. Add an option to not disable short-circuit reads on (wwei: rev 
2e7331ca264dd366b975f3c8e610cf84eb8cc155)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/HdfsClientConfigKeys.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/resources/hdfs-default.xml
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/client/impl/TestBlockReaderFactory.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/DfsClientConf.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/ShortCircuitCache.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java


> Add an option to not disable short-circuit reads on failures
> 
>
> Key: HDFS-12528
> URL: https://issues.apache.org/jira/browse/HDFS-12528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, performance
>Affects Versions: 2.6.0
>Reporter: Andre Araujo
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, 
> HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, 
> HDFS-12528.05.patch
>
>
> We have scenarios where data ingestion makes use of the -appendToFile 
> operation to add new data to existing HDFS files. In these situations, we're 
> frequently running into the problem described below.
> We're using Impala to query the HDFS data with short-circuit reads (SCR) 
> enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce 
> the memory footprint. In some cases, though, Impala still keeps the HDFS file 
> handle open for reuse.
> The "unbuffer" call, however, causes the file's current block reader to be 
> closed, which makes the associated ShortCircuitReplica evictable from the 
> ShortCircuitCache. When the cluster is under load, this means that the 
> ShortCircuitReplica can be purged off the cache pretty fast, which closes the 
> file descriptor to the underlying storage file.
> That means that when Impala re-reads the file it has to re-open the storage 
> files associated with the ShortCircuitReplicas that were evicted from the 
> cache. If there were no appends to those blocks, the re-open will succeed 
> without problems. If one block was appended since the ShortCircuitReplica was 
> created, the re-open will fail with the following error:
> {code}
> Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 
> not found
> {code}
> This error is handled as an "unknown response" by the BlockReaderFactory [1], 
> which disables short-circuit reads for 10 minutes [2] for the client.
> These 10 minutes without SCR can have a big performance impact for the client 
> operations. In this particular case ("Meta file not found") it would suffice 
> to return null without disabling SCR. This particular block read would fall 
> back to the normal, non-short-circuited, path and other SCR requests would 
> continue to work as expected.
> It might also be interesting to be able to control how long SCR is disabled 
> for in the "unknown response" case. 10 minutes seems a bit too long, and not 
> being able to change that is a problem.
> [1] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646
> [2] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13062) Provide support for JN to use separate journal disk per namespace

2018-01-30 Thread Hanisha Koneru (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345996#comment-16345996
 ] 

Hanisha Koneru commented on HDFS-13062:
---

Thanks for updating the patch, [~bharatviswa].
 * In {{setConf()}}, why are we getting the nameserviceIds from the config key 
{{DFS_INTERNAL_NAMESERVICES_KEY}} before {{DFS_NAMESERVICES}}? IIUC from 
HDFS-6376, which introduced the internal nameservices key, it is meant for 
datanodes to determine which nameservices to connect to. JournalNodes 
should not be using this configuration to deduce the nameservice IDs. Please 
correct me if I am wrong.
 
 * In {{getLogDir}}:
** The if condition below should check that the {{dir}} path does not exist 
in {{localDir}}. The {{dir.exists}} check is not sufficient to ensure that the 
{{dir}} path has been validated by {{validateAndCreateJournalDir}}.
{code:java}
else if (!new File(dir.trim()).exists()) {
{code}
** The above check should be done irrespective of which config setting 
{{dir}} is configured from. 
** In {{getLogDir}}, we can reuse and optimize the File objects. {{new 
File(dir.trim())}} is created multiple times with the same {{dir}} path.
{code:java}
// Fall back to the default JournalNode edits dir when none is configured.
if (dir == null) {
  dir = conf.get(DFSConfigKeys.DFS_JOURNALNODE_EDITS_DIR_KEY,
      DFSConfigKeys.DFS_JOURNALNODE_EDITS_DIR_DEFAULT);
}

// Reuse a single File instance and validate it only once.
File journalDir = new File(dir.trim());
if (!localDir.contains(journalDir)) {
  validateAndCreateJournalDir(journalDir);
}
{code}


> Provide support for JN to use separate journal disk per namespace
> -
>
> Key: HDFS-13062
> URL: https://issues.apache.org/jira/browse/HDFS-13062
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Bharat Viswanadham
>Assignee: Bharat Viswanadham
>Priority: Major
> Attachments: HDFS-13062.00.patch, HDFS-13062.01.patch, 
> HDFS-13062.02.patch, HDFS-13062.03.patch, HDFS-13062.04.patch
>
>
> In a federated HA setup, provide support for a separate journal disk for each 
> namespace.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12528) Add an option to not disable short-circuit reads on failures

2018-01-30 Thread Weiwei Yang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345993#comment-16345993
 ] 

Weiwei Yang commented on HDFS-12528:


Thanks [~xiaochen] for providing the fix, thanks to [~jzhuge] for the UT and 
review, and thanks also for the reviews from [~GeLiXin]. I just committed this 
to trunk and branch-3.0. 

> Add an option to not disable short-circuit reads on failures
> 
>
> Key: HDFS-12528
> URL: https://issues.apache.org/jira/browse/HDFS-12528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, performance
>Affects Versions: 2.6.0
>Reporter: Andre Araujo
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 3.1.0, 3.0.1
>
> Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, 
> HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, 
> HDFS-12528.05.patch
>
>
> We have scenarios where data ingestion makes use of the -appendToFile 
> operation to add new data to existing HDFS files. In these situations, we're 
> frequently running into the problem described below.
> We're using Impala to query the HDFS data with short-circuit reads (SCR) 
> enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce 
> the memory footprint. In some cases, though, Impala still keeps the HDFS file 
> handle open for reuse.
> The "unbuffer" call, however, causes the file's current block reader to be 
> closed, which makes the associated ShortCircuitReplica evictable from the 
> ShortCircuitCache. When the cluster is under load, this means that the 
> ShortCircuitReplica can be purged off the cache pretty fast, which closes the 
> file descriptor to the underlying storage file.
> That means that when Impala re-reads the file it has to re-open the storage 
> files associated with the ShortCircuitReplicas that were evicted from the 
> cache. If there were no appends to those blocks, the re-open will succeed 
> without problems. If one block was appended since the ShortCircuitReplica was 
> created, the re-open will fail with the following error:
> {code}
> Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 
> not found
> {code}
> This error is handled as an "unknown response" by the BlockReaderFactory [1], 
> which disables short-circuit reads for 10 minutes [2] for the client.
> These 10 minutes without SCR can have a big performance impact for the client 
> operations. In this particular case ("Meta file not found") it would suffice 
> to return null without disabling SCR. This particular block read would fall 
> back to the normal, non-short-circuited, path and other SCR requests would 
> continue to work as expected.
> It might also be interesting to be able to control how long SCR is disabled 
> for in the "unknown response" case. 10 minutes seems a bit too long, and not 
> being able to change that is a problem.
> [1] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646
> [2] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12528) Add an option to not disable short-circuit reads on failures

2018-01-30 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-12528:
---
Release Note: Add an option to not disable short-circuit reads on 
failures, by setting dfs.domain.socket.disable.interval.seconds to 0.
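
For illustration, a minimal sketch of applying that setting from client code, 
using only the property name given in the release note above (the class name 
is a hypothetical stand-in):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class ScrKeepEnabledExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // 0 means: never temporarily disable short-circuit reads on failures.
    conf.setLong("dfs.domain.socket.disable.interval.seconds", 0L);
    System.out.println(
        conf.getLong("dfs.domain.socket.disable.interval.seconds", -1L));
  }
}
{code}

The same value can equally be set in hdfs-site.xml on the client side.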

> Add an option to not disable short-circuit reads on failures
> 
>
> Key: HDFS-12528
> URL: https://issues.apache.org/jira/browse/HDFS-12528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, performance
>Affects Versions: 2.6.0
>Reporter: Andre Araujo
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, 
> HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, 
> HDFS-12528.05.patch
>
>
> We have scenarios where data ingestion makes use of the -appendToFile 
> operation to add new data to existing HDFS files. In these situations, we're 
> frequently running into the problem described below.
> We're using Impala to query the HDFS data with short-circuit reads (SCR) 
> enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce 
> the memory footprint. In some cases, though, Impala still keeps the HDFS file 
> handle open for reuse.
> The "unbuffer" call, however, causes the file's current block reader to be 
> closed, which makes the associated ShortCircuitReplica evictable from the 
> ShortCircuitCache. When the cluster is under load, this means that the 
> ShortCircuitReplica can be purged off the cache pretty fast, which closes the 
> file descriptor to the underlying storage file.
> That means that when Impala re-reads the file it has to re-open the storage 
> files associated with the ShortCircuitReplicas that were evicted from the 
> cache. If there were no appends to those blocks, the re-open will succeed 
> without problems. If one block was appended since the ShortCircuitReplica was 
> created, the re-open will fail with the following error:
> {code}
> Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 
> not found
> {code}
> This error is handled as an "unknown response" by the BlockReaderFactory [1], 
> which disables short-circuit reads for 10 minutes [2] for the client.
> These 10 minutes without SCR can have a big performance impact for the client 
> operations. In this particular case ("Meta file not found") it would suffice 
> to return null without disabling SCR. This particular block read would fall 
> back to the normal, non-short-circuited, path and other SCR requests would 
> continue to work as expected.
> It might also be interesting to be able to control how long SCR is disabled 
> for in the "unknown response" case. 10 minutes seems a bit too long, and not 
> being able to change that is a problem.
> [1] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646
> [2] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12528) Add an option to not disable short-circuit reads on failures

2018-01-30 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-12528:
---
Release Note: Added an option to not disable short-circuit reads on 
failures, by setting dfs.domain.socket.disable.interval.seconds to 0.  (was: 
Add an option to not disable short-circuit reads on failures, by setting 
dfs.domain.socket.disable.interval.seconds to 0.)

> Add an option to not disable short-circuit reads on failures
> 
>
> Key: HDFS-12528
> URL: https://issues.apache.org/jira/browse/HDFS-12528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, performance
>Affects Versions: 2.6.0
>Reporter: Andre Araujo
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, 
> HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, 
> HDFS-12528.05.patch
>
>
> We have scenarios where data ingestion makes use of the -appendToFile 
> operation to add new data to existing HDFS files. In these situations, we're 
> frequently running into the problem described below.
> We're using Impala to query the HDFS data with short-circuit reads (SCR) 
> enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce 
> the memory footprint. In some cases, though, Impala still keeps the HDFS file 
> handle open for reuse.
> The "unbuffer" call, however, causes the file's current block reader to be 
> closed, which makes the associated ShortCircuitReplica evictable from the 
> ShortCircuitCache. When the cluster is under load, this means that the 
> ShortCircuitReplica can be purged off the cache pretty fast, which closes the 
> file descriptor to the underlying storage file.
> That means that when Impala re-reads the file it has to re-open the storage 
> files associated with the ShortCircuitReplicas that were evicted from the 
> cache. If there were no appends to those blocks, the re-open will succeed 
> without problems. If one block was appended since the ShortCircuitReplica was 
> created, the re-open will fail with the following error:
> {code}
> Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 
> not found
> {code}
> This error is handled as an "unknown response" by the BlockReaderFactory [1], 
> which disables short-circuit reads for 10 minutes [2] for the client.
> These 10 minutes without SCR can have a big performance impact for the client 
> operations. In this particular case ("Meta file not found") it would suffice 
> to return null without disabling SCR. This particular block read would fall 
> back to the normal, non-short-circuited, path and other SCR requests would 
> continue to work as expected.
> It might also be interesting to be able to control how long SCR is disabled 
> for in the "unknown response" case. 10 minutes seems a bit too long, and not 
> being able to change that is a problem.
> [1] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646
> [2] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2018-01-30 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345983#comment-16345983
 ] 

Daryn Sharp commented on HDFS-10285:


*StoragePolicySatisfier*
 Should {{handleException}} use a double-checked lock to avoid synchronization? 
 Unexpected exceptions should be a rarity, right?
 Speaking of which, it’s not safe to ignore all {{Throwable}} in the run loop!  
You have no idea if data structures are in a sane or consistent state.
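
For illustration, a minimal sketch of the double-checked pattern being 
suggested, with hypothetical names; only the first failure takes the lock, and 
subsequent calls stay on the unsynchronized fast path:

{code:java}
/** Sketch only; names are hypothetical, not from the patch. */
class ExceptionGate {
  private volatile boolean faulted = false;

  void handleException(Throwable t) {
    if (!faulted) {                // fast path: no lock taken
      synchronized (this) {
        if (!faulted) {            // re-check under the lock
          faulted = true;          // volatile write publishes the transition
          // log / react to the first unexpected failure here
        }
      }
    }
  }
}
{code}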


 *FSTreeTraverser*
 I need to study this more, but I have grave concerns about whether this will 
work correctly in a mutating namesystem, e.g. renames and deletes, especially 
in combination with snapshots. It looks like there's a chance it will go off 
in the weeds when backtracking out of a renamed directory.


 {{traverseDir}} may NPE if it's traversing a tree in a snapshot and one of the 
ancestors is deleted.


 Not sure why it's bothering to re-check permissions during the crawl. The 
storage policy is inherited by the entire tree, regardless of whether the 
sub-contents are accessible. The effect of this patch is that the storage 
policy is enforced for all readable files, existing non-readable files violate 
the new storage policy, and new non-readable files will conform to it. Very 
convoluted. Since new files will conform, it should just process the entire tree.
  

> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-10285-consolidated-merge-patch-04.patch, 
> HDFS-10285-consolidated-merge-patch-05.patch, 
> HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage policies. These 
> policies can be set on a directory/file to specify the user's preference for 
> where to store the physical blocks. When the user sets the storage policy 
> before writing data, the blocks can take advantage of the storage policy 
> preferences and the physical blocks are stored accordingly. 
> If the user sets the storage policy after writing and completing the file, the 
> blocks would have been written with the default storage policy (nothing but 
> DISK). The user has to run the 'Mover tool' explicitly, specifying all such 
> file names as a list. In some distributed system scenarios (e.g. HBase) it 
> would be difficult to collect all the files and run the tool, as different 
> nodes can write files separately and files can have different paths.
> Another scenario is when the user renames a file from a directory with an 
> effective storage policy (inherited from the parent directory) to a directory 
> with another storage policy: the inherited storage policy is not copied from 
> the source, so the policy of the destination file/dir's parent takes effect. 
> This rename operation is just a metadata change in the Namenode; the 
> physical blocks still remain with the source storage policy.
> So, tracking all such business-logic-based file names from distributed nodes 
> (e.g. region servers) and running the Mover tool could be difficult for admins. 
> Here the proposal is to provide an API in the Namenode itself to trigger 
> storage policy satisfaction. A daemon thread inside the Namenode should track 
> such calls and send them to DNs as movement commands. 
> Will post the detailed design thoughts document soon. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12528) Add an option to not disable short-circuit reads on failures

2018-01-30 Thread Weiwei Yang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weiwei Yang updated HDFS-12528:
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.1.0
   Status: Resolved  (was: Patch Available)

> Add an option to not disable short-circuit reads on failures
> 
>
> Key: HDFS-12528
> URL: https://issues.apache.org/jira/browse/HDFS-12528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, performance
>Affects Versions: 2.6.0
>Reporter: Andre Araujo
>Assignee: Xiao Chen
>Priority: Major
> Fix For: 3.1.0
>
> Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, 
> HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, 
> HDFS-12528.05.patch
>
>
> We have scenarios where data ingestion makes use of the -appendToFile 
> operation to add new data to existing HDFS files. In these situations, we're 
> frequently running into the problem described below.
> We're using Impala to query the HDFS data with short-circuit reads (SCR) 
> enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce 
> the memory footprint. In some cases, though, Impala still keeps the HDFS file 
> handle open for reuse.
> The "unbuffer" call, however, causes the file's current block reader to be 
> closed, which makes the associated ShortCircuitReplica evictable from the 
> ShortCircuitCache. When the cluster is under load, this means that the 
> ShortCircuitReplica can be purged off the cache pretty fast, which closes the 
> file descriptor to the underlying storage file.
> That means that when Impala re-reads the file it has to re-open the storage 
> files associated with the ShortCircuitReplicas that were evicted from the 
> cache. If there were no appends to those blocks, the re-open will succeed 
> without problems. If one block was appended since the ShortCircuitReplica was 
> created, the re-open will fail with the following error:
> {code}
> Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 
> not found
> {code}
> This error is handled as an "unknown response" by the BlockReaderFactory [1], 
> which disables short-circuit reads for 10 minutes [2] for the client.
> These 10 minutes without SCR can have a big performance impact for the client 
> operations. In this particular case ("Meta file not found") it would suffice 
> to return null without disabling SCR. This particular block read would fall 
> back to the normal, non-short-circuited, path and other SCR requests would 
> continue to work as expected.
> It might also be interesting to be able to control how long SCR is disabled 
> for in the "unknown response" case. 10 minutes seems a bit too long, and not 
> being able to change that is a problem.
> [1] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646
> [2] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13089) Add test to validate dfs used and no of blocks when blocks are moved across volumes

2018-01-30 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-13089:
--
Status: Patch Available  (was: Open)

> Add test to validate dfs used and no of blocks when blocks are moved across 
> volumes
> ---
>
> Key: HDFS-13089
> URL: https://issues.apache.org/jira/browse/HDFS-13089
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13089.000.patch
>
>
> Add test to validate dfs used and no of blocks when blocks are moved across 
> volumes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13089) Add test to validate dfs used and no of blocks when blocks are moved across volumes

2018-01-30 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13089?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-13089:
--
Attachment: HDFS-13089.000.patch

> Add test to validate dfs used and no of blocks when blocks are moved across 
> volumes
> ---
>
> Key: HDFS-13089
> URL: https://issues.apache.org/jira/browse/HDFS-13089
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13089.000.patch
>
>
> Add test to validate dfs used and no of blocks when blocks are moved across 
> volumes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13089) Add test to validate dfs used and no of blocks when blocks are moved across volumes

2018-01-30 Thread Ajay Kumar (JIRA)
Ajay Kumar created HDFS-13089:
-

 Summary: Add test to validate dfs used and no of blocks when 
blocks are moved across volumes
 Key: HDFS-13089
 URL: https://issues.apache.org/jira/browse/HDFS-13089
 Project: Hadoop HDFS
  Issue Type: Test
Reporter: Ajay Kumar
Assignee: Ajay Kumar


Add test to validate dfs used and no of blocks when blocks are moved across 
volumes



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345954#comment-16345954
 ] 

genericqa commented on HDFS-13060:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 14m  
4s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
39s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 32m 
22s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 13m  
4s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
15m 26s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 
14s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
19s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 
46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 12m 
46s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
2m 21s{color} | {color:orange} root: The patch generated 1 new + 0 unchanged - 
0 fixed = 1 total (was 0) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  2m 
20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 30s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
58s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  8m 
15s{color} | {color:green} hadoop-common in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}142m 13s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
51s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}273m 18s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.server.datanode.TestDataNodeUUID |
|   | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.datanode.TestDataNodeMultipleRegistrations |
|   | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration |
|   | hadoop.hdfs.server.datanode.TestDirectoryScanner |
|   | hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.server.namenode.TestMetaSave |
|   | hadoop.hdfs.server.namenode.ha.TestHAAppend |
|   | hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure |
|   | hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
|
\\
\\
|| 

[jira] [Commented] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345866#comment-16345866
 ] 

genericqa commented on HDFS-13061:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 18m 
46s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
34s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
42s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
38s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 11s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
17s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m  
9s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
32s{color} | {color:green} hadoop-hdfs-project generated 0 new + 433 unchanged 
- 1 fixed = 433 total (was 434) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 46s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  1m 
21s{color} | {color:green} hadoop-hdfs-client in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}115m  4s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}176m 58s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
|   | hadoop.hdfs.TestHDFSFileSystemContract |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13061 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908397/HDFS-13061.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 72ce7cf6f7de 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f9dd5b6 |
| maven | version: Apache Maven 3.3.9 |
| Default 

[jira] [Updated] (HDFS-7240) Scaling HDFS

2018-01-30 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HDFS-7240:
---
Summary: Scaling HDFS  (was: Object store in HDFS)

> Scaling HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
>Priority: Major
> Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, 
> HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, 
> HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, 
> HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, 
> Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
>
> [^HDFS Scalability-v2.pdf] describes areas where HDFS does well and its 
> scaling challenges and how to address those challenges. Scaling HDFS requires 
> scaling the namespace layer and also the block layer. _This jira provides a 
> new block layer,  Hadoop Distributed Storage Layer (HDSL), that scales the 
> block layer by grouping blocks into containers thereby reducing the 
> block-to-location map and also reducing the number of block reports and their 
> processing_
> _A scalable namespace can be put on top of this scalable block layer:_
>  * _HDFS-10419 describes how the existing NN can be modified to use the new 
> block layer._
>  * _HDFS-13074 also provides, as an_ *_interim_* _step, a scalable flat 
> Key-Value namespace on top of the new block layer; while it does not provide 
> the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, 
> FileContext)._
>  
> Old Description
> This jira proposes to add object store capabilities into HDFS. 
>  As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
>  In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-7240) Object store in HDFS

2018-01-30 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HDFS-7240:
---
Description: 
[^HDFS Scalability-v2.pdf] describes areas where HDFS does well and its scaling 
challenges and how to address those challenges. Scaling HDFS requires scaling 
the namespace layer and also the block layer. _This jira provides a new block 
layer,  Hadoop Distributed Storage Layer (HDSL), that scales the block layer by 
grouping blocks into containers thereby reducing the block-to-location map and 
also reducing the number of block reports and their processing_

_A scalable namespace can be put on top of this scalable block layer:_
 * _HDFS-10419 describes how the existing NN can be modified to use the new 
block layer._
 * _HDFS-13074 also provides, as an_ *_interim_* _step, a scalable flat 
Key-Value namespace on top of the new block layer; while it does not provide 
the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, 
FileContext)._

 

Old Description

This jira proposes to add object store capabilities into HDFS. 
 As part of the federation work (HDFS-1052) we separated block storage as a 
generic storage layer. Using the Block Pool abstraction, new kinds of 
namespaces can be built on top of the storage layer i.e. datanodes.
 In this jira I will explore building an object store using the datanode 
storage, but independent of namespace metadata.

I will soon update with a detailed design document.

  was:
[Scaling HDFS| x] describes areas where HDFS does well and its scaling 
challenges and how to address those challenges. Scaling HDFS requires scaling 
the namespace layer and also the block layer. _This jira provides a new block 
layer,  Hadoop Distributed Storage Layer (HDSL), that scales the block layer by 
grouping blocks into containers thereby reducing the block-to-location map and 
also reducing the number of block reports and their processing_

_A scalable namespace can be put on top of this scalable block layer:_
 * _HDFS-10419 describes how the existing NN can be modified to use the new 
block layer._
 * _HDFS-13074 also provides, as an_ *_interim_* _step, a scalable flat KV 
namespace on top of the new block layer; while it does not provide the HDFS 
API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._ 

 

 

Old Description

This jira proposes to add object store capabilities into HDFS. 
 As part of the federation work (HDFS-1052) we separated block storage as a 
generic storage layer. Using the Block Pool abstraction, new kinds of 
namespaces can be built on top of the storage layer i.e. datanodes.
 In this jira I will explore building an object store using the datanode 
storage, but independent of namespace metadata.

I will soon update with a detailed design document.


> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
>Priority: Major
> Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, 
> HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, 
> HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, 
> HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, 
> Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
>
> [^HDFS Scalability-v2.pdf] describes areas where HDFS does well, its scaling 
> challenges, and how to address those challenges. Scaling HDFS requires 
> scaling the namespace layer and also the block layer. _This jira provides a 
> new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the 
> block layer by grouping blocks into containers, thereby reducing the 
> block-to-location map and also reducing the number of block reports and 
> their processing._
> _A scalable namespace can be put on top of this scalable block layer:_
>  * _HDFS-10419 describes how the existing NN can be modified to use the new 
> block layer._
>  * _HDFS-13074 also provides, as an_ *_interim_* _step, a scalable flat 
> Key-Value namespace on top of the new block layer; while it does not provide 
> the HDFS API, it does support the Hadoop FS APIs (Hadoop FileSystem, 
> FileContext)._
>  
> Old Description
> This jira proposes to add object store capabilities into HDFS. 
>  As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
>  In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

[jira] [Commented] (HDFS-13043) RBF: Expose the state of the Routers in the federation

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345843#comment-16345843
 ] 

genericqa commented on HDFS-13043:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
19s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 4 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 14m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 51s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
31s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 24s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}132m  9s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}177m 33s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA |
|   | hadoop.hdfs.server.federation.metrics.TestFederationMetrics |
|   | hadoop.hdfs.web.TestWebHdfsTimeouts |
|   | hadoop.hdfs.server.namenode.TestReencryptionWithKMS |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13043 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908394/HDFS-13043.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  cc  |
| uname | Linux 52f04c6127fb 4.4.0-64-generic #85-Ubuntu SMP Mon Feb 20 
11:50:30 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f9dd5b6 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 

[jira] [Updated] (HDFS-7240) Object store in HDFS

2018-01-30 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HDFS-7240:
---
Description: 
[Scaling HDFS| x] describes areas where HDFS does well, its scaling challenges, 
and how to address those challenges. Scaling HDFS requires scaling the 
namespace layer and also the block layer. _This jira provides a new block 
layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer by 
grouping blocks into containers, thereby reducing the block-to-location map and 
also reducing the number of block reports and their processing._

_A scalable namespace can be put on top of this scalable block layer:_
 * _HDFS-10419 describes how the existing NN can be modified to use the new 
block layer._
 * _HDFS-13074 also provides, as an_ *_interim_* _step, a scalable flat KV 
namespace on top of the new block layer; while it does not provide the HDFS 
API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._

 

 

Old Description

This jira proposes to add object store capabilities into HDFS. 
 As part of the federation work (HDFS-1052) we separated block storage as a 
generic storage layer. Using the Block Pool abstraction, new kinds of 
namespaces can be built on top of the storage layer i.e. datanodes.
 In this jira I will explore building an object store using the datanode 
storage, but independent of namespace metadata.

I will soon update with a detailed design document.

  was:
[^HDFS Scalability-v2.pdf] describes areas where HDFS does well, its scaling 
challenges, and how to address those challenges. Scaling HDFS requires scaling 
the namespace layer and also the block layer. _This jira provides a new block 
layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer by 
grouping blocks into containers, thereby reducing the block-to-location map and 
also reducing the number of block reports and their processing._

_A scalable namespace can be put on top of this scalable block layer:_
 * _HDFS-10419 describes how the existing NN can be modified to use the new 
block layer._
 * _HDFS-13074 also provides, as an_ *_interim_* _step, a scalable flat KV 
namespace on top of the new block layer; while it does not provide the HDFS 
API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._

 

 

Old Description

This jira proposes to add object store capabilities into HDFS. 
 As part of the federation work (HDFS-1052) we separated block storage as a 
generic storage layer. Using the Block Pool abstraction, new kinds of 
namespaces can be built on top of the storage layer i.e. datanodes.
 In this jira I will explore building an object store using the datanode 
storage, but independent of namespace metadata.

I will soon update with a detailed design document.


> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
>Priority: Major
> Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, 
> HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, 
> HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, 
> HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, 
> Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
>
> [Scaling HDFS| x] describes areas where HDFS does well, its scaling 
> challenges, and how to address those challenges. Scaling HDFS requires 
> scaling the namespace layer and also the block layer. _This jira provides a 
> new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the 
> block layer by grouping blocks into containers, thereby reducing the 
> block-to-location map and also reducing the number of block reports and 
> their processing._
> _A scalable namespace can be put on top of this scalable block layer:_
>  * _HDFS-10419 describes how the existing NN can be modified to use the new 
> block layer._
>  * _HDFS-13074 also provides, as an_ *_interim_* _step, a scalable flat KV 
> namespace on top of the new block layer; while it does not provide the HDFS 
> API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._
>  
>  
> Old Description
> This jira proposes to add object store capabilities into HDFS. 
>  As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
>  In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA

[jira] [Updated] (HDFS-7240) Object store in HDFS

2018-01-30 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HDFS-7240:
---
Description: 
[^HDFS Scalability-v2.pdf] describes areas where HDFS does well, its scaling 
challenges, and how to address those challenges. Scaling HDFS requires scaling 
the namespace layer and also the block layer. _This jira provides a new block 
layer, Hadoop Distributed Storage Layer (HDSL), that scales the block layer by 
grouping blocks into containers, thereby reducing the block-to-location map and 
also reducing the number of block reports and their processing._

_A scalable namespace can be put on top of this scalable block layer:_
 * _HDFS-10419 describes how the existing NN can be modified to use the new 
block layer._
 * _HDFS-13074 also provides, as an_ *_interim_* _step, a scalable flat KV 
namespace on top of the new block layer; while it does not provide the HDFS 
API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._

 

 

Old Description

This jira proposes to add object store capabilities into HDFS. 
 As part of the federation work (HDFS-1052) we separated block storage as a 
generic storage layer. Using the Block Pool abstraction, new kinds of 
namespaces can be built on top of the storage layer i.e. datanodes.
 In this jira I will explore building an object store using the datanode 
storage, but independent of namespace metadata.

I will soon update with a detailed design document.

  was:
This jira proposes to add object store capabilities into HDFS. 
As part of the federation work (HDFS-1052) we separated block storage as a 
generic storage layer. Using the Block Pool abstraction, new kinds of 
namespaces can be built on top of the storage layer i.e. datanodes.
In this jira I will explore building an object store using the datanode 
storage, but independent of namespace metadata.

I will soon update with a detailed design document.






> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
>Priority: Major
> Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, 
> HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, 
> HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, 
> HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, 
> Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
>
> [^HDFS Scalability-v2.pdf] describes areas where HDFS does well, its scaling 
> challenges, and how to address those challenges. Scaling HDFS requires 
> scaling the namespace layer and also the block layer. _This jira provides a 
> new block layer, Hadoop Distributed Storage Layer (HDSL), that scales the 
> block layer by grouping blocks into containers, thereby reducing the 
> block-to-location map and also reducing the number of block reports and 
> their processing._
> _A scalable namespace can be put on top of this scalable block layer:_
>  * _HDFS-10419 describes how the existing NN can be modified to use the new 
> block layer._
>  * _HDFS-13074 also provides, as an_ *_interim_* _step, a scalable flat KV 
> namespace on top of the new block layer; while it does not provide the HDFS 
> API, it does support the Hadoop FS APIs (Hadoop FileSystem, FileContext)._
>  
>  
> Old Description
> This jira proposes to add object store capabilities into HDFS. 
>  As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
>  In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345828#comment-16345828
 ] 

genericqa commented on HDFS-12997:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
49s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
13s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
56s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
39s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
1s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 32s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
 0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
53s{color} | {color:green} hadoop-hdfs-project_hadoop-hdfs generated 0 new + 
392 unchanged - 2 fixed = 392 total (was 394) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} hadoop-hdfs-project/hadoop-hdfs: The patch generated 
0 new + 153 unchanged - 7 fixed = 153 total (was 160) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 47s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 36s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
25s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}149m  0s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | hadoop.hdfs.TestReconstructStripedFile |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure120 |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12997 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908395/HDFS-12997.006.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 6378e0117699 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / f9dd5b6 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22894/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22894/testReport/ |
| Max. 

[jira] [Updated] (HDFS-7240) Object store in HDFS

2018-01-30 Thread Sanjay Radia (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sanjay Radia updated HDFS-7240:
---
Attachment: HDFS Scalability-v2.pdf

> Object store in HDFS
> 
>
> Key: HDFS-7240
> URL: https://issues.apache.org/jira/browse/HDFS-7240
> Project: Hadoop HDFS
>  Issue Type: New Feature
>Reporter: Jitendra Nath Pandey
>Assignee: Jitendra Nath Pandey
>Priority: Major
> Attachments: HDFS Scalability and Ozone.pdf, HDFS Scalability-v2.pdf, 
> HDFS-7240.001.patch, HDFS-7240.002.patch, HDFS-7240.003.patch, 
> HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, 
> HDFS-7240.006.patch, HadoopStorageLayerSecurity.pdf, MeetingMinutes.pdf, 
> Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a 
> generic storage layer. Using the Block Pool abstraction, new kinds of 
> namespaces can be built on top of the storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode 
> storage, but independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345722#comment-16345722
 ] 

genericqa commented on HDFS-11187:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 10m 
24s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
44s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m  0s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
5s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
52s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  7s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 92m 25s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}156m 16s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-11187 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908379/HDFS-11187.005.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 164716be0a39 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 901d15a |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22888/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22888/testReport/ |
| Max. process+thread count | 4147 (vs. ulimit of 5000) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22888/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Optimize disk access for last partial chunk checksum of Finalized replica

[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage

2018-01-30 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345716#comment-16345716
 ] 

Xiaoyu Yao commented on HDFS-12997:
---

Thanks [~ajayydv] for the update. The patch looks good to me. Two more minor 
issues; +1 once they are fixed.

 
BlockPoolSliceStorage.java

LINE 169-170: "formatted for {}" is redundant. We can remove it and the second 
"nsInfo.getBlockPoolID()" argument.


NNStorage.java

Line 866-867: can we wrap these two calls inside {{if (LOG.isDebugEnabled())}} 
to minimize the performance impact of logging?

Line 882-883: Same as above
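
For reference, a minimal sketch of the suggested guard ({{expensiveDump()}} is 
a hypothetical stand-in for a costly log argument, not code from the patch):

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class GuardedLoggingSketch {
  private static final Logger LOG =
      LoggerFactory.getLogger(GuardedLoggingSketch.class);

  // Hypothetical stand-in for an expensive-to-compute log argument.
  static String expensiveDump() {
    return "storage state";
  }

  public static void main(String[] args) {
    // Parameterized logging avoids string concatenation when DEBUG is off,
    // but the argument expression is still evaluated eagerly.
    LOG.debug("value: {}", expensiveDump());

    // The guard skips the expensive call entirely when DEBUG is disabled.
    if (LOG.isDebugEnabled()) {
      LOG.debug("value: {}", expensiveDump());
    }
  }
}
{code}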

> Move logging to slf4j in BlockPoolSliceStorage and Storage 
> ---
>
> Key: HDFS-12997
> URL: https://issues.apache.org/jira/browse/HDFS-12997
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch, 
> HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch, 
> HDFS-12997.006.patch
>
>
> Move logging to slf4j in BlockPoolSliceStorage and Storage classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13044) RBF: Add a safe mode for the Router

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345705#comment-16345705
 ] 

genericqa commented on HDFS-13044:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
10s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} branch-3.0 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 17m 
49s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
47s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
53s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m 41s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
44s{color} | {color:green} branch-3.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} branch-3.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green}  0m  
2s{color} | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green}  
9m  6s{color} | {color:green} patch has no errors when building and testing our 
client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 63m 41s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red}  0m 
21s{color} | {color:red} The patch generated 1 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}111m  2s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Unreaped Processes | hadoop-hdfs:7 |
| Failed junit tests | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure070 |
|   | hadoop.hdfs.server.blockmanagement.TestNodeCount |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
|   | hadoop.hdfs.TestDistributedFileSystem |
|   | hadoop.hdfs.TestFileChecksum |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
|   | hadoop.hdfs.server.federation.router.TestRouterSafemode |
|   | hadoop.hdfs.server.balancer.TestBalancerWithEncryptedTransfer |
|   | hadoop.hdfs.TestReadStripedFileWithDecodingCorruptData |
|   | hadoop.hdfs.server.datanode.TestDataNodeECN |
|   | hadoop.hdfs.server.namenode.TestFSImageWithAcl |
|   | hadoop.hdfs.TestDFSClientRetries |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy |
|   | hadoop.hdfs.TestDFSPermission |
|   | 
hadoop.hdfs.tools.offlineImageViewer.TestOfflineImageViewerWithStripedBlocks |
|   | hadoop.hdfs.TestGetBlocks |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithRandomECPolicy |
|   | hadoop.hdfs.TestErasureCodingPolicies |
|   | hadoop.hdfs.TestDFSStripedOutputStreamWithFailure200 |
|   | 

[jira] [Comment Edited] (HDFS-13069) Enable HDFS to cache data read from external storage systems

2018-01-30 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345673#comment-16345673
 ] 

Virajith Jalaparti edited comment on HDFS-13069 at 1/30/18 7:55 PM:


The design for caching PROVIDED files is as follows:
 1) The onus of caching any file will be on the client. If a particular file 
has to be cached, a {{readThrough}} flag will be set to {{true}} as part of the 
{{CachingStrategy}}.
 2) When a Datanode receives a {{readBlock}} request for a PROVIDED block with 
the {{readThrough}} flag set to {{true}}, it creates a {{TEMPORARY}} local 
replica. As data is streamed back to the client, it will be written to this 
temporary replica (this can be synchronous or asynchronous).
 3) When the stream to the client closes, the Datanode attempts to finalize the 
replica. If the block is not fully read by the client, the Datanode pages in 
the remaining data from the PROVIDED store. The local replica is then finalized 
and the NN is notified ({{FsDatasetSpi#notifyNamenodeForNewReplica}} is called).
 4) In the above, if any failure occurs between the creation of the temporary 
replica and the replica being finalized, it will be cleaned up. If the Datanode 
goes down while this happens, the temporary replica will be handled as any 
other temporary replica.

Note that if the Namenode gets notified of the new replica in (4) above, it 
would be deleted as an excess replica. To prevent this, we will introduce a new 
kind of per-file replication factor called "over-replication" with the 
following semantics: the NN will allow up to (replication + over-replication) 
replicas to exist for any block. The current semantics of replication will 
continue to be enforced (i.e., if the number of replicas of a block is less 
than the replication factor, the NN will schedule replications), but no 
replications will be scheduled to meet the over-replication.

Thus, to enable read-through caching, the client has to first set the 
appropriate over-replication for the PROVIDED files and then read the data. If 
needed, the over-replication can also be set as part of the image generation 
process for the PROVIDED storage (HDFS-9806/HDFS-10706).

As long as files in the PROVIDED store are read-only (and thus do not get 
modified), there will be no need to invalidate the cached replicas. If data in 
the PROVIDED store changes, the cached replicas will require invalidation.
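
To make steps 2-4 concrete, a compilable sketch of the datanode-side flow 
follows; every type and method name in it is an illustrative stand-in, not the 
actual Datanode/FsDatasetSpi API:

{code:java}
import java.io.IOException;

// Illustrative only: all names below are stand-ins for the real
// Datanode/FsDatasetSpi machinery.
public class ReadThroughSketch {

  interface TemporaryReplica {
    void finalizeAndNotifyNamenode() throws IOException; // step 3
    void delete();                                       // step 4 cleanup
  }

  void readBlock(String blockId, boolean provided, boolean readThrough)
      throws IOException {
    // Step 2: only PROVIDED blocks read with readThrough=true are cached.
    TemporaryReplica tmp =
        (provided && readThrough) ? createTemporaryReplica(blockId) : null;
    try {
      streamToClient(blockId, tmp);      // tee bytes into tmp while streaming
      if (tmp != null) {
        pageInRemainder(blockId, tmp);   // fetch any unread tail from PROVIDED
        tmp.finalizeAndNotifyNamenode(); // finalize, then tell the NN
      }
    } catch (IOException e) {
      if (tmp != null) {
        tmp.delete();                    // step 4: clean up on any failure
      }
      throw e;
    }
  }

  TemporaryReplica createTemporaryReplica(String blockId) {
    return new TemporaryReplica() {
      public void finalizeAndNotifyNamenode() {}
      public void delete() {}
    };
  }

  void streamToClient(String blockId, TemporaryReplica tmp)
      throws IOException {}

  void pageInRemainder(String blockId, TemporaryReplica tmp)
      throws IOException {}
}
{code}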


was (Author: virajith):
The design for caching PROVIDED files is as follows:
 1) The onus of caching any file will be on the client. If a particular file 
has to be cached, a {{readThrough}} flag will be set to {{true}} as part of the 
{{CachingStrategy}}.
 2) When a Datanode receives a {{readBlock}} request for a PROVIDED block with 
the {{readThrough}} flag set to {{true}}, it creates a {{TEMPORARY}} local 
replica. As data is streamed back to the client, it will be written to this 
temporary replica (this can be synchronous or asynchronous).
 3) When the stream to the client closes, the Datanode attempts to finalize the 
replica. If the block is not fully read by the client, the Datanode pages in 
the remaining data from the PROVIDED store. The local replica is then finalized 
and the NN is notified ({{FsDatasetSpi#notifyNamenodeForNewReplica}} is called).
 4) In the above, if any failure occurs between the creation of the temporary 
replica and the replica being finalized, it will be cleaned up. If the Datanode 
goes down while this happens, the temporary replica will be handled as any 
other temporary replica.

Note that if the Namenode gets notified of the new replica in (4) above, it 
would be deleted as an excess replica. To prevent this, we will introduce a new 
kind of per-file replication factor called "over-replication" with the 
following semantics: the NN will allow up to (replication + over-replication) 
replicas to exist for any block. The current semantics of replication will 
continue to be enforced (i.e., if the number of replicas of a block is less 
than the replication factor, the NN will schedule replications), but no 
replications will be scheduled to meet the over-replication.

So to enable read-through caching, the client has to first set the appropriate 
over-replication for the PROVIDED files and then read the data. If needed, the 
over-replication can also be set as part of the image generation process for 
the PROVIDED storage (HDFS-9806/HDFS-10706).

> Enable HDFS to cache data read from external storage systems
> 
>
> Key: HDFS-13069
> URL: https://issues.apache.org/jira/browse/HDFS-13069
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Virajith Jalaparti
>Priority: Major
>
> With {{PROVIDED}} storage (HDFS-9806), HDFS can address data stored in 
> external 

[jira] [Created] (HDFS-13088) Allow HDFS files/blocks to be over-replicated.

2018-01-30 Thread Virajith Jalaparti (JIRA)
Virajith Jalaparti created HDFS-13088:
-

 Summary: Allow HDFS files/blocks to be over-replicated.
 Key: HDFS-13088
 URL: https://issues.apache.org/jira/browse/HDFS-13088
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Virajith Jalaparti


This JIRA is to add a per-file "over-replication" factor to HDFS. As mentioned 
in HDFS-13069, the over-replication factor is the number of excess replicas 
allowed to exist for a file or block. This is beneficial when an application 
decides that additional replicas of a file are needed. In the case of 
HDFS-13069, it would allow copies of data in PROVIDED storage to be cached 
locally in HDFS in a read-through manner.

The Namenode will not proactively meet the over-replication; i.e., it will not 
schedule replications when the number of replicas of a block is below 
(replication factor + over-replication factor), as long as it is at least the 
replication factor of the file.
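
A minimal sketch of the intended decision rule (the method names are 
illustrative, not the actual BlockManager API):

{code:java}
public class OverReplicationSketch {

  /** The NN schedules new replicas only below the replication factor. */
  static boolean needsReplication(int liveReplicas, int replication) {
    return liveReplicas < replication;
  }

  /** A replica is excess only above replication + over-replication. */
  static boolean hasExcess(int liveReplicas, int replication,
      int overReplication) {
    return liveReplicas > replication + overReplication;
  }

  public static void main(String[] args) {
    int replication = 3;
    int overReplication = 2;
    // 4 live replicas: above the replication factor, so nothing new is
    // scheduled, yet within replication + over-replication, so nothing
    // is deleted either.
    System.out.println(needsReplication(4, replication));           // false
    System.out.println(hasExcess(4, replication, overReplication)); // false
    System.out.println(hasExcess(6, replication, overReplication)); // true
  }
}
{code}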



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12512) RBF: Add WebHDFS

2018-01-30 Thread Wei Yan (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12512?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345680#comment-16345680
 ] 

Wei Yan commented on HDFS-12512:


[~elgoiri] any comments from your test cluster? :)

> RBF: Add WebHDFS
> 
>
> Key: HDFS-12512
> URL: https://issues.apache.org/jira/browse/HDFS-12512
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: fs
>Reporter: Íñigo Goiri
>Assignee: Wei Yan
>Priority: Major
>  Labels: RBF
> Attachments: HDFS-12512.000.patch, HDFS-12512.001.patch, 
> HDFS-12512.002.patch, HDFS-12512.003.patch
>
>
> The Router currently does not support WebHDFS. It needs to implement 
> something similar to {{NamenodeWebHdfsMethods}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13069) Enable HDFS to cache data read from external storage systems

2018-01-30 Thread Virajith Jalaparti (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345673#comment-16345673
 ] 

Virajith Jalaparti commented on HDFS-13069:
---

The design for caching PROVIDED files is as follows:
 1) The onus of caching any file will be on the client. If a particular file 
has to be cached, a {{readThrough}} flag will be set to {{true}} as part of the 
{{CachingStrategy}}.
 2) When a Datanode receives a {{readBlock}} request for a PROVIDED block with 
the {{readThrough}} flag set to {{true}}, it creates a {{TEMPORARY}} local 
replica. As data is streamed back to the client, it will be written to this 
temporary replica (this can be synchronous or asynchronous).
 3) When the stream to the client closes, the Datanode attempts to finalize the 
replica. If the block is not fully read by the client, the Datanode pages in 
the remaining data from the PROVIDED store. The local replica is then finalized 
and the NN is notified ({{FsDatasetSpi#notifyNamenodeForNewReplica}} is called).
 4) In the above, if any failure occurs between the creation of the temporary 
replica and the replica being finalized, it will be cleaned up. If the Datanode 
goes down while this happens, the temporary replica will be handled as any 
other temporary replica.

Note that if the Namenode gets notified of the new replica in (4) above, it 
would be deleted as an excess replica. To prevent this, we will introduce a new 
kind of per-file replication factor called "over-replication" with the 
following semantics: the NN will allow up to (replication + over-replication) 
replicas to exist for any block. The current semantics of replication will 
continue to be enforced (i.e., if the number of replicas of a block is less 
than the replication factor, the NN will schedule replications), but no 
replications will be scheduled to meet the over-replication.

So to enable read-through caching, the client has to first set the appropriate 
over-replication for the PROVIDED files and then read the data. If needed, the 
over-replication can also be set as part of the image generation process for 
the PROVIDED storage (HDFS-9806/HDFS-10706).

> Enable HDFS to cache data read from external storage systems
> 
>
> Key: HDFS-13069
> URL: https://issues.apache.org/jira/browse/HDFS-13069
> Project: Hadoop HDFS
>  Issue Type: Improvement
>Reporter: Virajith Jalaparti
>Priority: Major
>
> With {{PROVIDED}} storage (HDFS-9806), HDFS can address data stored in 
> external storage systems. Caching this data, locally, in HDFS can speed up 
> subsequent accesses to the data -- this feature is especially useful when 
> accesses to the external store have limited bandwidth/higher latency. This 
> JIRA is to add this feature in HDFS.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12522) Ozone: Remove the Priority Queues used in the Container State Manager

2018-01-30 Thread Anu Engineer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anu Engineer updated HDFS-12522:

   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: HDFS-7240
   Status: Resolved  (was: Patch Available)

[~xyao] and [~nandakumar131], thanks for the reviews; I have committed this to 
the feature branch.

 

> Ozone: Remove the Priority Queues used in the Container State Manager
> -
>
> Key: HDFS-12522
> URL: https://issues.apache.org/jira/browse/HDFS-12522
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-12522-HDFS-7240.001.patch, 
> HDFS-12522-HDFS-7240.002.patch, HDFS-12522-HDFS-7240.003.patch, 
> HDFS-12522-HDFS-7240.004.patch
>
>
> During code review of HDFS-12387, it was suggested that we remove the 
> priority queues that were used in ContainerStateManager. This JIRA tracks 
> that issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11701) NPE from Unresolved Host causes permanent DFSInputStream failures

2018-01-30 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345156#comment-16345156
 ] 

Jitendra Nath Pandey edited comment on HDFS-11701 at 1/30/18 6:51 PM:
--

+1. The patch looks good to me.

Could you please evaluate whether it is possible to add a unit test using 
Mockito?
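
Not the actual patch test, but a self-contained sketch of the shape such a 
Mockito test could take; {{DatanodeLookup}} is a hypothetical seam, not an 
HDFS interface:

{code:java}
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import java.net.InetSocketAddress;

public class UnresolvedHostTestSketch {

  // Hypothetical seam standing in for wherever the client resolves a
  // datanode address.
  interface DatanodeLookup {
    InetSocketAddress addressOf(String datanode);
  }

  public static void main(String[] args) {
    DatanodeLookup lookup = mock(DatanodeLookup.class);
    // An unresolved InetSocketAddress returns null from getAddress(),
    // which is the condition that triggered the NPE in
    // DFSClient.isLocalAddress.
    when(lookup.addressOf("gone-dn")).thenReturn(
        InetSocketAddress.createUnresolved("gone-dn", 50010));

    InetSocketAddress addr = lookup.addressOf("gone-dn");
    assert addr.isUnresolved();
    assert addr.getAddress() == null;
  }
}
{code}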


was (Author: jnp):
+1. The patch looks good to me.

> NPE from Unresolved Host causes permanent DFSInputStream failures
> -
>
> Key: HDFS-11701
> URL: https://issues.apache.org/jira/browse/HDFS-11701
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
> Environment: AWS Centos linux running HBase CDH 5.9.0 and HDFS CDH 
> 5.9.0
>Reporter: James Moore
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: HDFS-11701.001.patch, HDFS-11701.002.patch
>
>
> We recently encountered the following NPE due to the DFSInputStream storing 
> old cached block locations from hosts which could no longer resolve.
> {quote}
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122)
> at 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613)
> at 
> org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127)
> ~HBase related stack frames trimmed~
> {quote}
> After investigating, the DFSInputStream appears to have been open for upwards 
> of 3-4 weeks and had cached block locations from decommissioned nodes that no 
> longer resolve in DNS and had been shutdown and removed from the cluster 2 
> weeks prior.  If the DFSInputStream had refreshed its block locations from 
> the name node, it would have received alternative block locations which would 
> not contain the decommissioned data nodes.  As the above NPE leaves the 
> non-resolving data node in the list of block locations the DFSInputStream 
> never refreshes the block locations and all attempts to open a BlockReader 
> for the given blocks will fail.
> In our case, we resolved the NPE by closing and re-opening every 
> DFSInputStream in the cluster to force a purge of the block locations cache.  
> Ideally, the DFSInputStream would re-fetch all block locations for a host 
> which can't be resolved in DNS or at least the blocks requested.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel

2018-01-30 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345595#comment-16345595
 ] 

Ajay Kumar commented on HDFS-13061:
---

[~xyao], updated the patch to remove the {{socket.close()}} call.

> SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted 
> channel
> -
>
> Key: HDFS-13061
> URL: https://issues.apache.org/jira/browse/HDFS-13061
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch, 
> HDFS-13061.002.patch, HDFS-13061.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether 
> the client and server addresses are trusted, respectively. It decides the 
> channel is untrusted, and hence enforces encryption, only if both the client 
> and the server are untrusted. *This ticket is opened to change it to not 
> trust (and to encrypt) if either the client or the server address is not 
> trusted.*
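
In essence, the change turns an OR into an AND in the trust decision. A 
minimal sketch (the method names are illustrative, not the actual 
SaslDataTransferClient code):

{code:java}
public class ChannelTrustSketch {

  /** Current behavior: the channel is trusted if EITHER end is trusted. */
  static boolean trustedBefore(boolean clientTrusted, boolean serverTrusted) {
    return clientTrusted || serverTrusted;
  }

  /** Proposed behavior: the channel is trusted only if BOTH ends are. */
  static boolean trustedAfter(boolean clientTrusted, boolean serverTrusted) {
    return clientTrusted && serverTrusted;
  }

  public static void main(String[] args) {
    // A half-trusted channel used to skip encryption; it now encrypts.
    System.out.println(trustedBefore(true, false)); // true  -> no encryption
    System.out.println(trustedAfter(true, false));  // false -> encrypt
  }
}
{code}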



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel

2018-01-30 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-13061:
--
Attachment: HDFS-13061.003.patch

> SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted 
> channel
> -
>
> Key: HDFS-13061
> URL: https://issues.apache.org/jira/browse/HDFS-13061
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch, 
> HDFS-13061.002.patch, HDFS-13061.003.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether 
> the client and server addresses are trusted, respectively. It decides the 
> channel is untrusted, and hence enforces encryption, only if both the client 
> and the server are untrusted. *This ticket is opened to change it to not 
> trust (and to encrypt) if either the client or the server address is not 
> trusted.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage

2018-01-30 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-12997:
--
Attachment: HDFS-12997.006.patch

> Move logging to slf4j in BlockPoolSliceStorage and Storage 
> ---
>
> Key: HDFS-12997
> URL: https://issues.apache.org/jira/browse/HDFS-12997
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch, 
> HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch, 
> HDFS-12997.006.patch
>
>
> Move logging to slf4j in BlockPoolSliceStorage and Storage classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12997) Move logging to slf4j in BlockPoolSliceStorage and Storage

2018-01-30 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345592#comment-16345592
 ] 

Ajay Kumar commented on HDFS-12997:
---

Test failures are unrelated. Updated the patch to fix one checkstyle issue.

> Move logging to slf4j in BlockPoolSliceStorage and Storage 
> ---
>
> Key: HDFS-12997
> URL: https://issues.apache.org/jira/browse/HDFS-12997
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Ajay Kumar
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-12997.001.patch, HDFS-12997.002.patch, 
> HDFS-12997.003.patch, HDFS-12997.004.patch, HDFS-12997.005.patch, 
> HDFS-12997.006.patch
>
>
> Move logging to slf4j in BlockPoolSliceStorage and Storage classes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12522) Ozone: Remove the Priority Queues used in the Container State Manager

2018-01-30 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345588#comment-16345588
 ] 

Xiaoyu Yao commented on HDFS-12522:
---

Thanks [~anu] for the update. Patch v4 LGTM, +1.

> Ozone: Remove the Priority Queues used in the Container State Manager
> -
>
> Key: HDFS-12522
> URL: https://issues.apache.org/jira/browse/HDFS-12522
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Major
> Attachments: HDFS-12522-HDFS-7240.001.patch, 
> HDFS-12522-HDFS-7240.002.patch, HDFS-12522-HDFS-7240.003.patch, 
> HDFS-12522-HDFS-7240.004.patch
>
>
> During code review of HDFS-12387, it was suggested that we remove the 
> priority queues that were used in ContainerStateManager. This JIRA tracks 
> that issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13068) RBF: Add router admin option to manage safe mode

2018-01-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345583#comment-16345583
 ] 

Íñigo Goiri commented on HDFS-13068:


Thanks [~linyiqun] for [^HDFS-13068.001.patch].

Most of the patch is just the interface part, so it is big, but the changes 
are pretty contained.

For the regular {{dfsadmin}}, the option is exposed as {{[-safemode enter | 
leave | get | wait | forceExit]}}; we should follow a similar pattern here.

We may want to add an option to get the state.

> RBF: Add router admin option to manage safe mode
> 
>
> Key: HDFS-13068
> URL: https://issues.apache.org/jira/browse/HDFS-13068
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Yiqun Lin
>Priority: Major
> Attachments: HDFS-13068.001.patch
>
>
> HDFS-13044 adds a safe mode to reject requests. We should have an option to 
> manually set the Router into safe mode.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-12636) Ozone: OzoneFileSystem: Implement seek functionality for rpc client

2018-01-30 Thread Jitendra Nath Pandey (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jitendra Nath Pandey updated HDFS-12636:

Parent Issue: HDFS-13074  (was: HDFS-7240)

> Ozone: OzoneFileSystem: Implement seek functionality for rpc client
> ---
>
> Key: HDFS-12636
> URL: https://issues.apache.org/jira/browse/HDFS-12636
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Affects Versions: HDFS-7240
>Reporter: Mukul Kumar Singh
>Assignee: Lokesh Jain
>Priority: Major
> Fix For: HDFS-7240
>
> Attachments: HDFS-12636-HDFS-7240.001.patch, 
> HDFS-12636-HDFS-7240.002.patch, HDFS-12636-HDFS-7240.003.patch, 
> HDFS-12636-HDFS-7240.004.patch
>
>
> The OzoneClient library provides a method to invoke both RPC- and REST-based 
> methods against Ozone. This API will help improve both the performance and 
> the interface management in OzoneFileSystem.
> This jira will be used to convert the REST-based calls to use this new 
> unified client.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13083) RBF: Fix doc error setting up client

2018-01-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345574#comment-16345574
 ] 

Íñigo Goiri commented on HDFS-13083:


I took the liberty of making this part of HDFS-12615 and renamed the title to be 
a little more specific.

If you don't have any concerns, I'll commit this labeled as "Contributed by 
tartarus".

[~tartarus], let me know if this is good with you or if you want some other change.

> RBF: Fix doc error setting up client
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.0.0
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has a configuration error. The doc instructs:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> With that setting, executing *hdfs dfs -ls hdfs://ns-fed/tmp/* fails with
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct property name is dfs.*ha*.namenodes.ns-fed:
> {quote}
> <property>
>   <name>dfs.ha.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13083) RBF: Fix doc error setting up client

2018-01-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13083:
---
Summary: RBF: Fix doc error setting up client  (was: Fix HDFS Router 
Federation doc error)

> RBF: Fix doc error setting up client
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.0.0
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has a configuration error. The doc instructs:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> With that setting, executing *hdfs dfs -ls hdfs://ns-fed/tmp/* fails with
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct property name is dfs.*ha*.namenodes.ns-fed:
> {quote}
> <property>
>   <name>dfs.ha.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13043) RBF: Expose the state of the Routers in the federation

2018-01-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13043:
---
Attachment: HDFS-13043.006.patch

> RBF: Expose the state of the Routers in the federation
> --
>
> Key: HDFS-13043
> URL: https://issues.apache.org/jira/browse/HDFS-13043
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13043.000.patch, HDFS-13043.001.patch, 
> HDFS-13043.002.patch, HDFS-13043.003.patch, HDFS-13043.004.patch, 
> HDFS-13043.005.patch, HDFS-13043.006.patch, router-info.png
>
>
> The Router should expose the state of the other Routers in the federation 
> through a user UI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-30 Thread Ajay Kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ajay Kumar updated HDFS-13060:
--
Attachment: HDFS-13060.001.patch

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP address/network mask of trusted clients/servers in whitelist files to 
> skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without adding each 
> trusted IP individually.
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13061) SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted channel

2018-01-30 Thread Xiaoyu Yao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345557#comment-16345557
 ] 

Xiaoyu Yao commented on HDFS-13061:
---

Thanks [~ajayydv] for the update.  Just one more NIT; +1 once that is fixed, 
pending Jenkins.

TestSaslDataTransfer.java

Line 303/353/398: {{socket.close();}} is not needed, as you have the following 
to handle that:
{code:java}
IOUtils.cleanupWithLogger(null, socket, serverSocket);{code}
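
For context, a minimal sketch of the resulting pattern (the try/finally layout 
and socket setup here are assumed for illustration, not copied from the test):
{code:java}
import java.net.ServerSocket;
import java.net.Socket;
import org.apache.hadoop.io.IOUtils;

// cleanupWithLogger closes every Closeable it is given and logs, rather than
// rethrows, any IOException, so an explicit socket.close() just before it is
// redundant.
Socket socket = null;
ServerSocket serverSocket = null;
try {
  serverSocket = new ServerSocket(0);  // any free port
  socket = new Socket();
  // ... exercise the SASL data transfer under test ...
} finally {
  IOUtils.cleanupWithLogger(null, socket, serverSocket);
}
{code}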
 

> SaslDataTransferClient#checkTrustAndSend should not trust a partially trusted 
> channel
> -
>
> Key: HDFS-13061
> URL: https://issues.apache.org/jira/browse/HDFS-13061
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13061.000.patch, HDFS-13061.001.patch, 
> HDFS-13061.002.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> SaslDataTransferClient#checkTrustAndSend asks the channel resolver whether the 
> client and server addresses are trusted, respectively. It currently decides the 
> channel is untrusted, and therefore enforces encryption, only if both the client 
> and the server are untrusted. *This ticket is opened to change it to not trust 
> (and to encrypt) if either the client or the server address is untrusted.*
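
As a rough illustration of the proposed semantics (a sketch against the public 
{{TrustedChannelResolver}} API; {{resolver}} and {{remoteAddress}} are 
illustrative names):
{code:java}
import java.net.InetAddress;
import org.apache.hadoop.hdfs.protocol.datatransfer.TrustedChannelResolver;

// Proposed behavior: trust the channel only when BOTH endpoints are trusted;
// if either side is untrusted, fall through to SASL/encryption negotiation.
boolean trusted = resolver.isTrusted()        // local (client) side
    && resolver.isTrusted(remoteAddress);     // remote (server) side
if (!trusted) {
  // negotiate encryption for this transfer
}
{code}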



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12522) Ozone: Remove the Priority Queues used in the Container State Manager

2018-01-30 Thread Anu Engineer (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12522?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345548#comment-16345548
 ] 

Anu Engineer commented on HDFS-12522:
-

[~xyao] and [~nandakumar131], can you please take a look at the latest patch? 
The test failures are fixed; there is one checkstyle warning about a line longer 
than 80 chars that I will fix while committing. Thanks

 

> Ozone: Remove the Priority Queues used in the Container State Manager
> -
>
> Key: HDFS-12522
> URL: https://issues.apache.org/jira/browse/HDFS-12522
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Anu Engineer
>Assignee: Anu Engineer
>Priority: Major
> Attachments: HDFS-12522-HDFS-7240.001.patch, 
> HDFS-12522-HDFS-7240.002.patch, HDFS-12522-HDFS-7240.003.patch, 
> HDFS-12522-HDFS-7240.004.patch
>
>
> During code review of HDFS-12387, it was suggested that we remove the 
> priority queues that were used in ContainerStateManager. This JIRA tracks that 
> issue.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13044) RBF: Add a safe mode for the Router

2018-01-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345543#comment-16345543
 ] 

Íñigo Goiri commented on HDFS-13044:


I added [^HDFS-13044.005.patch] with the actual commit to trunk and 
[^HDFS-13044-branch-3.0.000.patch] with the one that would go to branch-3.0 and 
others.

> RBF: Add a safe mode for the Router
> ---
>
> Key: HDFS-13044
> URL: https://issues.apache.org/jira/browse/HDFS-13044
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13004.000.patch, HDFS-13044-branch-3.0.000.patch, 
> HDFS-13044.001.patch, HDFS-13044.002.patch, HDFS-13044.003.patch, 
> HDFS-13044.004.patch, HDFS-13044.005.patch
>
>
> When a Router cannot communicate with the State Store, it should enter into a 
> safe mode that disallows certain operations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13060) Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver

2018-01-30 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345546#comment-16345546
 ] 

Ajay Kumar commented on HDFS-13060:
---

Patch v2 to address checkstyle issues.

> Adding a BlacklistBasedTrustedChannelResolver for TrustedChannelResolver
> 
>
> Key: HDFS-13060
> URL: https://issues.apache.org/jira/browse/HDFS-13060
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Xiaoyu Yao
>Assignee: Ajay Kumar
>Priority: Major
> Attachments: HDFS-13060.000.patch, HDFS-13060.001.patch
>
>
> HDFS-5910 introduces encryption negotiation between client and server based 
> on a customizable TrustedChannelResolver class. The TrustedChannelResolver is 
> invoked on both client and server side. If the resolver indicates that the 
> channel is trusted, then the data transfer will not be encrypted even if 
> dfs.encrypt.data.transfer is set to true. 
> The default trust channel resolver implementation returns false indicating 
> that the channel is not trusted, which always enables encryption. HDFS-5910 
> also added a built-in whitelist-based trust channel resolver. It allows you 
> to put the IP address/network mask of trusted clients/servers in whitelist files to 
> skip encryption for certain traffic. 
> This ticket is opened to add a blacklist-based trust channel resolver for 
> cases where only certain machines (IPs) are untrusted, without adding each 
> trusted IP individually.
>   



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13044) RBF: Add a safe mode for the Router

2018-01-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13044:
---
Attachment: HDFS-13044-branch-3.0.000.patch

> RBF: Add a safe mode for the Router
> ---
>
> Key: HDFS-13044
> URL: https://issues.apache.org/jira/browse/HDFS-13044
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13004.000.patch, HDFS-13044-branch-3.0.000.patch, 
> HDFS-13044.001.patch, HDFS-13044.002.patch, HDFS-13044.003.patch, 
> HDFS-13044.004.patch, HDFS-13044.005.patch
>
>
> When a Router cannot communicate with the State Store, it should enter into a 
> safe mode that disallows certain operations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13044) RBF: Add a safe mode for the Router

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345528#comment-16345528
 ] 

genericqa commented on HDFS-13044:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HDFS-13044 does not apply to trunk. Rebase required? Wrong 
Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-13044 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908390/HDFS-13044.005.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22889/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> RBF: Add a safe mode for the Router
> ---
>
> Key: HDFS-13044
> URL: https://issues.apache.org/jira/browse/HDFS-13044
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13004.000.patch, HDFS-13044.001.patch, 
> HDFS-13044.002.patch, HDFS-13044.003.patch, HDFS-13044.004.patch, 
> HDFS-13044.005.patch
>
>
> When a Router cannot communicate with the State Store, it should enter into a 
> safe mode that disallows certain operations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13044) RBF: Add a safe mode for the Router

2018-01-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13044?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13044:
---
Attachment: HDFS-13044.005.patch

> RBF: Add a safe mode for the Router
> ---
>
> Key: HDFS-13044
> URL: https://issues.apache.org/jira/browse/HDFS-13044
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Íñigo Goiri
>Assignee: Íñigo Goiri
>Priority: Major
> Attachments: HDFS-13004.000.patch, HDFS-13044.001.patch, 
> HDFS-13044.002.patch, HDFS-13044.003.patch, HDFS-13044.004.patch, 
> HDFS-13044.005.patch
>
>
> When a Router cannot communicate with the State Store, it should enter into a 
> safe mode that disallows certain operations.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13083) Fix HDFS Router Federation doc error

2018-01-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13083:
---
Issue Type: Sub-task  (was: Bug)
Parent: HDFS-12615

> Fix HDFS Router Federation doc error
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: federation
>Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.0.0
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has a configuration error. The doc instructs:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> With that setting, executing *hdfs dfs -ls hdfs://ns-fed/tmp/* fails with
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct property name is dfs.*ha*.namenodes.ns-fed:
> {quote}
> <property>
>   <name>dfs.ha.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13083) Fix HDFS Router Federation doc error

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345513#comment-16345513
 ] 

genericqa commented on HDFS-13083:
--

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
36s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
0s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
27m 30s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m  6s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
21s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 40m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-13083 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12908292/HDFS-13083.patch |
| Optional Tests |  asflicense  mvnsite  |
| uname | Linux 624eb2ecc69a 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 901d15a |
| maven | version: Apache Maven 3.3.9 |
| Max. process+thread count | 301 (vs. ulimit of 5000) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22887/console |
| Powered by | Apache Yetus 0.8.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.



> Fix HDFS Router Federation doc error
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.0.0
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has a configuration error. The doc instructs:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> With that setting, executing *hdfs dfs -ls hdfs://ns-fed/tmp/* fails with
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct property name is dfs.*ha*.namenodes.ns-fed:
> {quote}
> <property>
>   <name>dfs.ha.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-01-30 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345478#comment-16345478
 ] 

Wei-Chiu Chuang edited comment on HDFS-11187 at 1/30/18 5:43 PM:
-

v005 to address checkstyle warning.

Test errors are unrelated; they do not reproduce in my local tree.


was (Author: jojochuang):
v005 to address checkstyle warning.

> Optimize disk access for last partial chunk checksum of Finalized replica
> -
>
> Key: HDFS-11187
> URL: https://issues.apache.org/jira/browse/HDFS-11187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-11187.001.patch, HDFS-11187.002.patch, 
> HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch
>
>
> The patch at HDFS-11160 ensures BlockSender reads the correct version of 
> metafile when there are concurrent writers.
> However, the implementation is not optimal, because it must always read the 
> last partial chunk checksum from disk while holding FsDatasetImpl lock for 
> every reader. It is possible to optimize this by keeping an up-to-date 
> version of the last partial checksum in memory, reducing disk access.
> I am separating the optimization into a new jira, because maintaining the 
> state of in-memory checksum requires a lot more work.
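
For illustration, a minimal sketch of the direction described above; the class 
shape and accessors are assumptions, not the attached patch:
{code:java}
// Cache the last partial chunk checksum on the finalized replica so that
// BlockSender can serve it without re-reading the metafile on disk while
// holding the FsDatasetImpl lock.
class FinalizedReplica /* extends the replica base class */ {
  // null when the block length is an exact multiple of the chunk size
  private byte[] lastPartialChunkChecksum;

  synchronized byte[] getLastPartialChunkChecksum() {
    return lastPartialChunkChecksum;
  }

  synchronized void setLastPartialChunkChecksum(byte[] checksum) {
    // kept up to date by the writer path when the replica is finalized or
    // appended to, so readers never need the on-disk metafile for this
    this.lastPartialChunkChecksum = checksum;
  }
}
{code}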



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-10285) Storage Policy Satisfier in Namenode

2018-01-30 Thread Daryn Sharp (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-10285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345489#comment-16345489
 ] 

Daryn Sharp commented on HDFS-10285:


More comments, mostly focusing on the DN side.  There appear to be more serious 
issues here.

*FSDirSatisfyStoragePolicyOp*
{{satisfyStoragePolicy}} errors if the xattr is already present.  Should this 
be a no-op?  A client re-requesting a storage policy correction probably 
shouldn't fail.

{{unprotectedSatisfyStoragePolicy}} is called prior to xattr updates, which 
calls {{addSPSPathId}}. To avoid race conditions or inconsistent state if the 
xattr fails, should call {{addSPSPathId}} after xattrs are successfully updated.

{{inodeHasSatisfyXAttr}} calls {{getXAttrFeature}} then immediately shorts out 
if the inode isn't a file.  Should do file check first to avoid unnecessary 
computation.

In general, I'm not fond of unnecessary guava.  Instead of 
{{newArrayListWithCapacity}} + {{add}}, the standard {{Arrays.asList(item)}} is 
more succinct, as in the sketch below.
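
For instance ({{satisfyXAttr}} is a hypothetical variable name):
{code:java}
import java.util.Arrays;
import java.util.List;
import org.apache.hadoop.fs.XAttr;
import com.google.common.collect.Lists;

// Guava form:
List<XAttr> xAttrs = Lists.newArrayListWithCapacity(1);
xAttrs.add(satisfyXAttr);
// Plain JDK, more succinct (fixed-size, which is fine for a single element):
List<XAttr> xAttrs2 = Arrays.asList(satisfyXAttr);
{code}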

*FSDirStatAndListOp*
Not sure why javadoc was changed to add needLocation.  It's already present and 
now doubled up.

{{BlockStorageMovementCommand/BlocksStorageMoveAttemptFinished}}
Again, not sure that a new DN command is necessary, and why does it 
specifically report back successful moves instead of relying on IBRs?  I would 
actually expect the DN to be completely ignorant of an SPS move vs any other 
move.

*StoragePolicySatisfyWorker*
bq. // TODO: Needs to manage the number of concurrent moves per DataNode.
Shouldn't this be fixed now?

{{DFS_MOVER_MOVERTHREADS_DEFAULT}} is *1000 per DN*?  If the DN is concurrently 
doing 1000 moves, it's not in a good state: disk I/O is probably saturated, and 
this will only make it much worse.  10 is probably more than sufficient.

The executor uses a {{CallerRunsPolicy}}?  If the thread pool is exhausted, WHY 
should the heartbeat thread process the operation?  That's a terrible idea.

I was under the impression the DN did an optimized movement between its 
storages?  Does it actually make a connection to itself?  If yes, when the DN 
xceiver pool is very busy, the mover thread pool will exhaust and cause the 
heartbeat thread to block.
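
To make the objection concrete, a hedged sketch of a pool configuration that 
keeps the heartbeat thread out of the data path (the sizes are illustrative):
{code:java}
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// With CallerRunsPolicy, a saturated mover pool makes the submitting
// (heartbeat) thread execute the move inline and block. A bounded queue plus
// AbortPolicy rejects the task instead, leaving the caller free to requeue
// or drop it.
ThreadPoolExecutor moverExecutor = new ThreadPoolExecutor(
    1, 10,                                   // ~10 concurrent moves, not 1000
    60L, TimeUnit.SECONDS,
    new LinkedBlockingQueue<Runnable>(128),  // bounded backlog
    new ThreadPoolExecutor.AbortPolicy());   // never run the move on the caller
{code}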

*BlockStorageMovementTracker*
Many data structures are riddled with non-threadsafe race conditions and risk 
of CMEs.

Ex. The {{moverTaskFutures}} map.  Adding new blocks and/or adding to a block's 
list of futures is synchronized.  However, the run loop does an unsynchronized 
block get, unsynchronized future remove, unsynchronized isEmpty, possibly 
another unsynchronized get, only then does it do a synchronized remove of the 
block.  The whole chunk of code should be synchronized.
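
Schematically, the fix is to hold the lock across the whole check-then-act 
sequence (everything except {{moverTaskFutures}} below is an illustrative name):
{code:java}
import java.util.List;
import java.util.concurrent.Future;

// Individually synchronized map operations still race as a sequence: another
// thread can mutate the entry between the get, the remove and the isEmpty
// check, so the whole sequence must be one critical section.
synchronized (moverTaskFutures) {
  List<Future<?>> futures = moverTaskFutures.get(blockId);
  if (futures != null) {
    futures.remove(future);
    if (futures.isEmpty()) {
      moverTaskFutures.remove(blockId);
    }
  }
}
{code}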

Is the problematic {{moverTaskFutures}} even needed?  It's aggregating futures 
per-block for seemingly no reason.  Why track all the futures at all instead of 
just relying on the completion service?  As best I can tell:
* It's only used to determine if a future from the completion service should be 
ignored during shutdown.  Shutdown sets the running boolean to false and clears 
the entire data structure, so why not use the {{running}} boolean, like the check 
just a little further down?
* As synchronization to sleep up to 2 seconds before performing a blocking 
{{moverCompletionService.take}}, but only when it thinks there are no active 
futures.  I'll ignore the missed notify race that the bounded wait masks, but 
the real question is why not just do the blocking take?

Why all the complexity?  Am I missing something?

*BlocksMovementsStatusHandler*
Suffers the same type of thread safety issues as {{StoragePolicySatisfyWorker}}.  
Ex. {{blockIdVsMovementStatus}} is inconsistently synchronized.  It does synchronize 
to return an unmodifiable list, which sadly does nothing to protect the caller 
from CME.

{{handle}} is iterating over a non-thread-safe list.





> Storage Policy Satisfier in Namenode
> 
>
> Key: HDFS-10285
> URL: https://issues.apache.org/jira/browse/HDFS-10285
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: datanode, namenode
>Affects Versions: HDFS-10285
>Reporter: Uma Maheswara Rao G
>Assignee: Uma Maheswara Rao G
>Priority: Major
> Attachments: HDFS-10285-consolidated-merge-patch-00.patch, 
> HDFS-10285-consolidated-merge-patch-01.patch, 
> HDFS-10285-consolidated-merge-patch-02.patch, 
> HDFS-10285-consolidated-merge-patch-03.patch, 
> HDFS-10285-consolidated-merge-patch-04.patch, 
> HDFS-10285-consolidated-merge-patch-05.patch, 
> HDFS-SPS-TestReport-20170708.pdf, SPS Modularization.pdf, 
> Storage-Policy-Satisfier-in-HDFS-June-20-2017.pdf, 
> Storage-Policy-Satisfier-in-HDFS-May10.pdf, 
> Storage-Policy-Satisfier-in-HDFS-Oct-26-2017.pdf
>
>
> Heterogeneous storage in HDFS introduced the concept of storage 

[jira] [Commented] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-01-30 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345478#comment-16345478
 ] 

Wei-Chiu Chuang commented on HDFS-11187:


v005 to address checkstyle warning.

> Optimize disk access for last partial chunk checksum of Finalized replica
> -
>
> Key: HDFS-11187
> URL: https://issues.apache.org/jira/browse/HDFS-11187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-11187.001.patch, HDFS-11187.002.patch, 
> HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch
>
>
> The patch at HDFS-11160 ensures BlockSender reads the correct version of 
> metafile when there are concurrent writers.
> However, the implementation is not optimal, because it must always read the 
> last partial chunk checksum from disk while holding FsDatasetImpl lock for 
> every reader. It is possible to optimize this by keeping an up-to-date 
> version of the last partial checksum in memory, reducing disk access.
> I am separating the optimization into a new jira, because maintaining the 
> state of in-memory checksum requires a lot more work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-11187) Optimize disk access for last partial chunk checksum of Finalized replica

2018-01-30 Thread Wei-Chiu Chuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-11187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang updated HDFS-11187:
---
Attachment: HDFS-11187.005.patch

> Optimize disk access for last partial chunk checksum of Finalized replica
> -
>
> Key: HDFS-11187
> URL: https://issues.apache.org/jira/browse/HDFS-11187
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Wei-Chiu Chuang
>Assignee: Wei-Chiu Chuang
>Priority: Major
> Attachments: HDFS-11187.001.patch, HDFS-11187.002.patch, 
> HDFS-11187.003.patch, HDFS-11187.004.patch, HDFS-11187.005.patch
>
>
> The patch at HDFS-11160 ensures BlockSender reads the correct version of 
> metafile when there are concurrent writers.
> However, the implementation is not optimal, because it must always read the 
> last partial chunk checksum from disk while holding FsDatasetImpl lock for 
> every reader. It is possible to optimize this by keeping an up-to-date 
> version of the last partial checksum in memory, reducing disk access.
> I am separating the optimization into a new jira, because maintaining the 
> state of in-memory checksum requires a lot more work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12909) SSLConnectionConfigurator creation error should be printed only if security is enabled

2018-01-30 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345401#comment-16345401
 ] 

Ajay Kumar commented on HDFS-12909:
---

[~ljain], thanks for working on this. Do you mind adding a test case specific 
to this change (i.e., if we don't set any HTTP policy, {{URLConnectionFactory}} 
doesn't attempt to create an SSL connection)?
I have one question:
* Why move the HTTP policy related keys to {{HdfsClientConfigKeys}}? I can imagine 
a case where the client may have a different policy than the server; in that case those 
keys may exist in both classes (i.e. {{HdfsClientConfigKeys}} and 
{{DFSConfigKeys}}), but having them only in the client config gives the impression that 
this config is not relevant on the server side.



> SSLConnectionConfigurator creation error should be printed only if security 
> is enabled
> --
>
> Key: HDFS-12909
> URL: https://issues.apache.org/jira/browse/HDFS-12909
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Namit Maheshwari
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: HDFS-12909.002.patch, HDFS-12909.003.patch, 
> HDFS-12909.patch
>
>
> Currently URLConnectionFactory#getSSLConnectionConfiguration attempts to 
> create an SSL connection configurator even if security is not enabled. This 
> raises the below false warning in the logs.
> {code:java}
> 17/12/08 10:12:03 WARN web.URLConnectionFactory: Cannot load customized ssl 
> related configuration. Fallback to system-generic settings.
> java.io.FileNotFoundException: /etc/security/clientKeys/all.jks (No such file 
> or directory)
>   at java.io.FileInputStream.open0(Native Method)
>   at java.io.FileInputStream.open(FileInputStream.java:195)
>   at java.io.FileInputStream.<init>(FileInputStream.java:138)
>   at 
> org.apache.hadoop.security.ssl.ReloadingX509TrustManager.loadTrustManager(ReloadingX509TrustManager.java:169)
>   at 
> org.apache.hadoop.security.ssl.ReloadingX509TrustManager.<init>(ReloadingX509TrustManager.java:87)
>   at 
> org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(FileBasedKeyStoresFactory.java:219)
>   at org.apache.hadoop.security.ssl.SSLFactory.init(SSLFactory.java:176)
>   at 
> org.apache.hadoop.hdfs.web.URLConnectionFactory.newSslConnConfigurator(URLConnectionFactory.java:164)
>   at 
> org.apache.hadoop.hdfs.web.URLConnectionFactory.getSSLConnectionConfiguration(URLConnectionFactory.java:106)
>   at 
> org.apache.hadoop.hdfs.web.URLConnectionFactory.newDefaultURLConnectionFactory(URLConnectionFactory.java:85)
>   at org.apache.hadoop.hdfs.tools.DFSck.<init>(DFSck.java:136)
>   at org.apache.hadoop.hdfs.tools.DFSck.<init>(DFSck.java:128)
>   at org.apache.hadoop.hdfs.tools.DFSck.main(DFSck.java:396)
> {code}
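
A hedged sketch of the kind of guard being discussed; the {{isHttpsEnabled}} 
check is an assumption for illustration, while the member names mirror the 
stack trace above rather than any committed change:
{code:java}
// Only try to build the SSL connection configurator when the client is
// actually configured for HTTPS; otherwise skip it and avoid the spurious
// "Cannot load customized ssl related configuration" warning.
static ConnectionConfigurator getSSLConnectionConfiguration(
    final Configuration conf) {
  if (!isHttpsEnabled(conf)) {  // hypothetical policy check
    return DEFAULT_TIMEOUT_CONN_CONFIGURATOR;
  }
  try {
    return newSslConnConfigurator(DEFAULT_SOCKET_TIMEOUT, conf);
  } catch (Exception e) {
    LOG.warn("Cannot load customized ssl related configuration. "
        + "Fallback to system-generic settings.", e);
    return DEFAULT_TIMEOUT_CONN_CONFIGURATOR;
  }
}
{code}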



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13084) [SPS]: Fix the branch review comments

2018-01-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344876#comment-16344876
 ] 

Rakesh R edited comment on HDFS-13084 at 1/30/18 4:57 PM:
--

Thanks a lot [~daryn] for your time and review comments. My reply follows, 
please take a look at it.

*Comment-1)*
{quote}BlockManager
 Shouldn’t spsMode be volatile? Although I question why it’s here.
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-2)*
{quote}Adding SPS methods to this class implies an unexpected coupling of the 
SPS service to the block manager. Please move them out to prove it’s not 
tightly coupled.
{quote}
[Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}} 
and keep all the related apis over there.

*Comment-3)*
{quote}BPServiceActor
 Is it actually sending back the moved blocks? Aren’t IBRs sufficient?
{quote}
[Rakesh's reply] Currently, SPS has a coupling with the heartbeat for sending 
the block move tasks to the DN and receiving the {{blksMovementsFinished}} list 
from the DN. Since these are grouped separately, each finished block movement 
can be easily and quickly recognized by the Satisfier on the NN side, which can 
then update the tracking details. I think if we used the IBR approach, then for 
every newly reported block the Satisfier on the NN side would have to do an 
*additional* check on whether the new block appeared due to an SPSBlockMove 
operation before updating the tracking details, and this *extra* check would 
fire quite often, since other ops are far more common than SPSBlockMove. I mean, 
a new block can be reported due to an SPSBlockMove op or as part of some other 
operation like re-replication, a newly written file block, the Balancer tool, 
the Mover tool, etc. With the current approach, it can simply get the 
{{blksMovementsFinished}} list from each heartbeat and do the logic.

*Comment-4)*
{quote}DataNode
 Why isn’t this just a block transfer? How is transferring between DNs any 
different than across storages?
{quote}
[Rakesh's reply] I could see that Mover is also using the {{REPLACE_BLOCK}} call, 
and we just followed the same approach in SPS. Am I missing anything here?

*Comment-5)*
{quote}DatanodeDescriptor
 Why use a synchronized linked list to offer/poll instead of BlockingQueue?
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-6)*
{quote}DatanodeManager
 I know it’s configurable, but realistically, when would you ever want to give 
storage movement tasks equal footing with under-replication? Is there really a 
use case for not valuing durability?
{quote}
[Rakesh's reply] We don't have any particular use case, though. One scenario we 
considered is a user who configured SSDs that filled up quickly. In that case, 
there could be situations where cleaning up is considered a high priority. If 
you feel this is not a real case, then I'm OK with removing this config, and SPS 
will use only the remaining slots. If needed, we can bring such configs back later.

*Comment-7)*
{quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport 
is already a very bad method that should be avoided for anything but jmx – even 
then it’s a concern. I eliminated calls to it years ago. All it takes is a 
nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of 
time. Beyond that, the response is going to be pretty large and tagging all the 
storage reports is not going to be cheap.

verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem 
lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its 
storageMap?

Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned 
earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached 
state of the world. Then it gets another datanode report to determine the 
number of live nodes to decide if it should sleep before processing the next 
path. The number of nodes from the prior cached view of the world should 
suffice.
{quote}
[Rakesh's reply] Good point. Some time back Uma and I thought about the cache 
part. Actually, we depend on this API for the datanode storage types and 
remaining space details. I think it requires two different mechanisms for 
internal and external SPS. For internal SPS, how about directly referring to 
{{DatanodeManager#datanodeMap}} for every file? For external SPS, IIUC you are 
suggesting a cache mechanism. How about getting the storage report once and 
caching it in ExternalContext? This local cache can be refreshed periodically. 
Say, after every 5 mins (just an arbitrary number I put here; if you have some 
period in mind, please suggest), when getDatanodeStorageReport is called, the 
cache is treated as expired and fetched freshly; within 5 mins it is served 
from the cache, as in the sketch below. Does this make sense to you?
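
A small sketch of the refresh scheme proposed above (the 5-minute window is 
from the discussion; {{ExternalContext}} and the class shape are illustrative):
{code:java}
import java.io.IOException;
import org.apache.hadoop.hdfs.server.protocol.DatanodeStorageReport;
import org.apache.hadoop.util.Time;

class CachedStorageReports {
  private static final long EXPIRY_MS = 5 * 60 * 1000L;  // the 5 mins above

  private DatanodeStorageReport[] reports;
  private long lastFetchTime;

  // Serve from the local copy; go back to the NN only once the window elapses.
  synchronized DatanodeStorageReport[] get(ExternalContext ctx)
      throws IOException {
    long now = Time.monotonicNow();
    if (reports == null || now - lastFetchTime > EXPIRY_MS) {
      reports = ctx.getLiveDatanodeStorageReport();  // the expensive call
      lastFetchTime = now;
    }
    return reports;
  }
}
{code}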

Another point we thought of: right now, checking whether a node has enough 
space goes to the NN. With the above caching, we could just check the space 
constraints 

[jira] [Commented] (HDFS-13083) Fix HDFS Router Federation doc error

2018-01-30 Thread JIRA

[ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345367#comment-16345367
 ] 

Íñigo Goiri commented on HDFS-13083:


I missed this. Thanks [~tartarus] for the patch, I assigned it to you.

Once Yetus comes back I'll commit.

+1

> Fix HDFS Router Federation doc error
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.0.0
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has a configuration error. The doc instructs:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> With that setting, executing *hdfs dfs -ls hdfs://ns-fed/tmp/* fails with
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct property name is dfs.*ha*.namenodes.ns-fed:
> {quote}
> <property>
>   <name>dfs.ha.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-13084) [SPS]: Fix the branch review comments

2018-01-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344876#comment-16344876
 ] 

Rakesh R edited comment on HDFS-13084 at 1/30/18 4:51 PM:
--

Thanks a lot [~daryn] for your time and review comments. My reply follows, 
please take a look at it.

*Comment-1)*
{quote}BlockManager
 Shouldn’t spsMode be volatile? Although I question why it’s here.
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-2)*
{quote}Adding SPS methods to this class implies an unexpected coupling of the 
SPS service to the block manager. Please move them out to prove it’s not 
tightly coupled.
{quote}
[Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}} 
and keep all the related apis over there.

*Comment-3)*
{quote}BPServiceActor
 Is it actually sending back the moved blocks? Aren’t IBRs sufficient?
{quote}
[Rakesh's reply] Currently, SPS has a coupling with the heartbeat for sending 
the block move tasks to the DN and receiving the {{blksMovementsFinished}} list 
from the DN. Since these are grouped separately, each finished block movement 
can be easily and quickly recognized by the Satisfier on the NN side, which can 
then update the tracking details. I think if we used the IBR approach, then for 
every newly reported block the Satisfier on the NN side would have to do an 
*additional* check on whether the new block appeared due to an SPSBlockMove 
operation before updating the tracking details, and this *extra* check would 
fire quite often, since other ops are far more common than SPSBlockMove. I mean, 
a new block can be reported due to an SPSBlockMove op or as part of some other 
operation like re-replication, a newly written file block, the Balancer tool, 
the Mover tool, etc. With the current approach, it can simply get the 
{{blksMovementsFinished}} list from each heartbeat and do the logic.

*Comment-4)*
{quote}DataNode
 Why isn’t this just a block transfer? How is transferring between DNs any 
different than across storages?
{quote}
[Rakesh's reply] I could see that Mover is also using the {{REPLACE_BLOCK}} call, 
and we just followed the same approach in SPS. Am I missing anything here?

*Comment-5)*
{quote}DatanodeDescriptor
 Why use a synchronized linked list to offer/poll instead of BlockingQueue?
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-6)*
{quote}DatanodeManager
 I know it’s configurable, but realistically, when would you ever want to give 
storage movement tasks equal footing with under-replication? Is there really a 
use case for not valuing durability?
{quote}
[Rakesh's reply] We don't have any particular use case, though. One scenario we 
considered is a user who configured SSDs that filled up quickly. In that case, 
there could be situations where cleaning up is considered a high priority. If 
you feel this is not a real case, then I'm OK with removing this config, and SPS 
will use only the remaining slots. If needed, we can bring such configs back later.

*Comment-7)*
{quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport 
is already a very bad method that should be avoided for anything but jmx – even 
then it’s a concern. I eliminated calls to it years ago. All it takes is a 
nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of 
time. Beyond that, the response is going to be pretty large and tagging all the 
storage reports is not going to be cheap.

verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem 
lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its 
storageMap?

Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned 
earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached 
state of the world. Then it gets another datanode report to determine the 
number of live nodes to decide if it should sleep before processing the next 
path. The number of nodes from the prior cached view of the world should 
suffice.
{quote}
[Rakesh's reply] Good point. Some time back Uma and I thought about the cache 
part. Actually, we depend on this API for the datanode storage types and 
remaining space details. I think it requires two different mechanisms for 
internal and external SPS. For internal SPS, how about directly referring to 
{{DatanodeManager#datanodeMap}} for every file? For external SPS, IIUC you are 
suggesting a cache mechanism. How about getting the storage report once and 
caching it in ExternalContext? This local cache can be refreshed periodically. 
Say, after every 5 mins (just an arbitrary number I put here; if you have some 
period in mind, please suggest), when getDatanodeStorageReport is called, the 
cache is treated as expired and fetched freshly; within 5 mins it is served 
from the cache. Does this make sense to you?
*Comment-8)*
{quote}DFSUtil
 DFSUtil.removeOverlapBetweenStorageTypes and {{DFSUtil.getSPSWorkMultiplier
 }}. These aren’t generally useful methods so why are they 

[jira] [Assigned] (HDFS-13083) Fix HDFS Router Federation doc error

2018-01-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri reassigned HDFS-13083:
--

Assignee: tartarus

> Fix HDFS Router Federation doc error
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Affects Versions: 2.9.0, 3.0.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Assignee: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.0.0
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has a configuration error. The doc instructs:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> With that setting, executing *hdfs dfs -ls hdfs://ns-fed/tmp/* fails with
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct property name is dfs.*ha*.namenodes.ns-fed:
> {quote}
> <property>
>   <name>dfs.ha.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13083) Fix HDFS Router Federation doc error

2018-01-30 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/HDFS-13083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Íñigo Goiri updated HDFS-13083:
---
Status: Patch Available  (was: Reopened)

> Fix HDFS Router Federation doc error
> 
>
> Key: HDFS-13083
> URL: https://issues.apache.org/jira/browse/HDFS-13083
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation
>Affects Versions: 3.0.0, 2.9.0
> Environment: CentOS6.5
> Hadoop 3.0.0
>Reporter: tartarus
>Priority: Major
>  Labels: Router
> Fix For: 3.0.0
>
> Attachments: HDFS-13083.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> The HDFS Router Federation doc has a configuration error. The doc instructs:
> {quote}
> <property>
>   <name>dfs.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}
> With that setting, executing *hdfs dfs -ls hdfs://ns-fed/tmp/* fails with
> *{{Couldn't create proxy provider class 
> org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider}}*
> The correct property name is dfs.*ha*.namenodes.ns-fed:
> {quote}
> <property>
>   <name>dfs.ha.namenodes.ns-fed</name>
>   <value>r1,r2</value>
> </property>
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13085) getStoragePolicy varies b/w REST and CLI for unspecified path

2018-01-30 Thread Arpit Agarwal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345350#comment-16345350
 ] 

Arpit Agarwal commented on HDFS-13085:
--

Thanks for the analysis [~msingh]. 

[~brahmareddy], any change here will likely be incompatible and hence out of 
scope for Hadoop 3.x, so probably we should leave things as they are. What do 
you think?

> getStoragePolicy varies b/w REST and CLI for unspecified path
> -
>
> Key: HDFS-13085
> URL: https://issues.apache.org/jira/browse/HDFS-13085
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Mukul Kumar Singh
>Priority: Major
>
> *CLI*
> {noformat}
> hdfs storagepolicies -getStoragePolicy -path /node-label-root
> The storage policy of /node-label-root is unspecified
> {noformat}
> *REST* 
> {noformat}
> curl -i "http://1*.*.*.*:50070/webhdfs/v1/node-label-root?op=GETSTORAGEPOLICY"
> {"BlockStoragePolicy":\{"copyOnCreateFile":false,"creationFallbacks":[],"id":7,"name":"HOT","replicationFallbacks":["ARCHIVE"],"storageTypes":["DISK"]}}
> {noformat}
> -HDFS-11968- introduced this new behaviour for the CLI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13085) getStoragePolicy varies b/w REST and CLI for unspecified path

2018-01-30 Thread Arpit Agarwal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arpit Agarwal updated HDFS-13085:
-
Hadoop Flags: Incompatible change

> getStoragePolicy varies b/w REST and CLI for unspecified path
> -
>
> Key: HDFS-13085
> URL: https://issues.apache.org/jira/browse/HDFS-13085
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Brahma Reddy Battula
>Assignee: Mukul Kumar Singh
>Priority: Major
>
> *CLI*
> {noformat}
> hdfs storagepolicies -getStoragePolicy -path /node-label-root
> The storage policy of /node-label-root is unspecified
> {noformat}
> *REST* 
> {noformat}
> curl -i "http://1*.*.*.*:50070/webhdfs/v1/node-label-root?op=GETSTORAGEPOLICY"
> {"BlockStoragePolicy":\{"copyOnCreateFile":false,"creationFallbacks":[],"id":7,"name":"HOT","replicationFallbacks":["ARCHIVE"],"storageTypes":["DISK"]}}
> {noformat}
> -HDFS-11968- introduced this new behaviour for the CLI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12528) Add an option to not disable short-circuit reads on failures

2018-01-30 Thread Xiao Chen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345343#comment-16345343
 ] 

Xiao Chen commented on HDFS-12528:
--

Thanks all, and thanks John for writing the initial unit test!

> Add an option to not disable short-circuit reads on failures
> 
>
> Key: HDFS-12528
> URL: https://issues.apache.org/jira/browse/HDFS-12528
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client, performance
>Affects Versions: 2.6.0
>Reporter: Andre Araujo
>Assignee: Xiao Chen
>Priority: Major
> Attachments: HDFS-12528.000.patch, HDFS-12528.01.patch, 
> HDFS-12528.02.patch, HDFS-12528.03.patch, HDFS-12528.04.patch, 
> HDFS-12528.05.patch
>
>
> We have scenarios where data ingestion makes use of the -appendToFile 
> operation to add new data to existing HDFS files. In these situations, we're 
> frequently running into the problem described below.
> We're using Impala to query the HDFS data with short-circuit reads (SCR) 
> enabled. After each file read, Impala "unbuffer"'s the HDFS file to reduce 
> the memory footprint. In some cases, though, Impala still keeps the HDFS file 
> handle open for reuse.
> The "unbuffer" call, however, causes the file's current block reader to be 
> closed, which makes the associated ShortCircuitReplica evictable from the 
> ShortCircuitCache. When the cluster is under load, this means that the 
> ShortCircuitReplica can be purged off the cache pretty fast, which closes the 
> file descriptor to the underlying storage file.
> That means that when Impala re-reads the file it has to re-open the storage 
> files associated with the ShortCircuitReplica's that were evicted from the 
> cache. If there were no appends to those blocks, the re-open will succeed 
> without problems. If one block was appended since the ShortCircuitReplica was 
> created, the re-open will fail with the following error:
> {code}
> Meta file for BP-810388474-172.31.113.69-1499543341726:blk_1074012183_273087 
> not found
> {code}
> This error is handled as an "unknown response" by the BlockReaderFactory [1], 
> which disables short-circuit reads for 10 minutes [2] for the client.
> These 10 minutes without SCR can have a big performance impact for the client 
> operations. In this particular case ("Meta file not found") it would suffice 
> to return null without disabling SCR. This particular block read would fall 
> back to the normal, non-short-circuited, path and other SCR requests would 
> continue to work as expected.
> It might also be interesting to be able to control how long SCR is disabled 
> for in the "unknown response" case. 10 minutes seems a bit too long, and not 
> being able to change that is a problem.
> [1] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/client/impl/BlockReaderFactory.java#L646
> [2] 
> https://github.com/apache/hadoop/blob/f67237cbe7bc48a1b9088e990800b37529f1db2a/hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/shortcircuit/DomainSocketFactory.java#L97



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-11701) NPE from Unresolved Host causes permanent DFSInputStream failures

2018-01-30 Thread Jitendra Nath Pandey (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345156#comment-16345156
 ] 

Jitendra Nath Pandey commented on HDFS-11701:
-

+1. The patch looks good to me.

> NPE from Unresolved Host causes permanent DFSInputStream failures
> -
>
> Key: HDFS-11701
> URL: https://issues.apache.org/jira/browse/HDFS-11701
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.0
> Environment: AWS Centos linux running HBase CDH 5.9.0 and HDFS CDH 
> 5.9.0
>Reporter: James Moore
>Assignee: Lokesh Jain
>Priority: Major
> Attachments: HDFS-11701.001.patch, HDFS-11701.002.patch
>
>
> We recently encountered the following NPE due to the DFSInputStream storing 
> old cached block locations from hosts which could no longer resolve.
> {quote}
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122)
> at 
> org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474)
> at 
> org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
> at 
> org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613)
> at 
> org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127)
> ~HBase related stack frames trimmed~
> {quote}
> After investigating, the DFSInputStream appears to have been open for upwards
> of 3-4 weeks and had cached block locations from decommissioned nodes that no
> longer resolve in DNS and had been shut down and removed from the cluster 2
> weeks prior. If the DFSInputStream had refreshed its block locations from
> the name node, it would have received alternative block locations which would
> not contain the decommissioned data nodes. Because the above NPE leaves the
> non-resolving data node in the list of block locations, the DFSInputStream
> never refreshes the block locations, and all attempts to open a BlockReader
> for the given blocks fail.
> In our case, we resolved the NPE by closing and re-opening every
> DFSInputStream in the cluster to force a purge of the block locations cache.
> Ideally, the DFSInputStream would re-fetch all block locations for a host
> that can't be resolved in DNS, or at least those of the requested blocks.
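
A self-contained illustration of the failure mode (not the attached patch):
an address that no longer resolves returns null from {{getAddress()}}, which
is exactly what an unguarded {{isLocalAddress}}-style check dereferences:
{code}
import java.net.InetSocketAddress;

public class UnresolvedHostDemo {
  public static void main(String[] args) {
    // A datanode removed from DNS shows up as an unresolved address.
    InetSocketAddress addr = InetSocketAddress
        .createUnresolved("decommissioned-dn.example.com", 50010);
    System.out.println(addr.isUnresolved()); // true
    System.out.println(addr.getAddress());   // null -> NPE if dereferenced
    if (addr.isUnresolved()) {
      // A fix along the lines discussed here would treat this location as
      // dead and re-fetch block locations from the NameNode instead.
    }
  }
}
{code}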



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-13080) Ozone: Make finalhash in ContainerInfo of StorageContainerDatanodeProtocol.proto optional

2018-01-30 Thread Nanda kumar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nanda kumar updated HDFS-13080:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Ozone: Make finalhash in ContainerInfo of 
> StorageContainerDatanodeProtocol.proto optional
> -
>
> Key: HDFS-13080
> URL: https://issues.apache.org/jira/browse/HDFS-13080
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nanda kumar
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HDFS-13080-HDFS-7240.000.patch, 
> HDFS-13080-HDFS-7240.001.patch
>
>
> ContainerInfo in StorageContainerDatanodeProtocol.proto has a required field,
> {{finalhash}}, which will be null for an open container; this has to be made
> an optional field.
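
For illustration only, a sketch of what the change means to protobuf-generated
Java callers, given a builder for the message (class and accessor names follow
protobuf-java conventions and are assumed, not taken from the patch):
{code}
// While finalhash is `required`, building a report for an open container
// (which has no hash yet) throws UninitializedMessageException. Once the
// field is `optional`, the build succeeds and readers test for presence:
ContainerInfo info = containerInfoBuilder.build();
if (info.hasFinalhash()) {
  // Closed container: the final hash is available.
  String hash = info.getFinalhash();
}
{code}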



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13080) Ozone: Make finalhash in ContainerInfo of StorageContainerDatanodeProtocol.proto optional

2018-01-30 Thread Nanda kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345132#comment-16345132
 ] 

Nanda kumar commented on HDFS-13080:


Thanks [~elek] for the contribution. I have committed this to the feature 
branch.

> Ozone: Make finalhash in ContainerInfo of 
> StorageContainerDatanodeProtocol.proto optional
> -
>
> Key: HDFS-13080
> URL: https://issues.apache.org/jira/browse/HDFS-13080
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nanda kumar
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HDFS-13080-HDFS-7240.000.patch, 
> HDFS-13080-HDFS-7240.001.patch
>
>
> ContainerInfo in StorageContainerDatanodeProtocol.proto has a required field,
> {{finalhash}}, which will be null for an open container; this has to be made
> an optional field.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13080) Ozone: Make finalhash in ContainerInfo of StorageContainerDatanodeProtocol.proto optional

2018-01-30 Thread Nanda kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345110#comment-16345110
 ] 

Nanda kumar commented on HDFS-13080:


+1, I will commit this shortly.

> Ozone: Make finalhash in ContainerInfo of 
> StorageContainerDatanodeProtocol.proto optional
> -
>
> Key: HDFS-13080
> URL: https://issues.apache.org/jira/browse/HDFS-13080
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ozone
>Reporter: Nanda kumar
>Assignee: Elek, Marton
>Priority: Major
> Attachments: HDFS-13080-HDFS-7240.000.patch, 
> HDFS-13080-HDFS-7240.001.patch
>
>
> ContainerInfo in StorageContainerDatanodeProtocol.proto has a required field,
> {{finalhash}}, which will be null for an open container; this has to be made
> an optional field.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail

2018-01-30 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16345089#comment-16345089
 ] 

genericqa commented on HDFS-12116:
--

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 2 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 16m 
29s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
40s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
59s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
11m 34s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
53s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  0m 
59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  0m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
10m 55s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}116m 22s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
22s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 16s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.tools.TestDFSZKFailoverController |
|   | hadoop.hdfs.TestReadStripedFileWithMissingBlocks |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:5b98639 |
| JIRA Issue | HDFS-12116 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12877421/HDFS-12116.03.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux 6ec659f0f99e 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6463e10 |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22886/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22886/testReport/ |
| Max. process+thread count | 3204 (vs. ulimit of 5000) |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 
hadoop-hdfs-project/hadoop-hdfs |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/22886/console |
| Powered by | Apache 

[jira] [Comment Edited] (HDFS-13084) [SPS]: Fix the branch review comments

2018-01-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344876#comment-16344876
 ] 

Rakesh R edited comment on HDFS-13084 at 1/30/18 12:25 PM:
---

Thanks a lot [~daryn] for your time and review comments. My reply follows;
please take a look.

*Comment-1)*
{quote}BlockManager
 Shouldn’t spsMode be volatile? Although I question why it’s here.
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-2)*
{quote}Adding SPS methods to this class implies an unexpected coupling of the 
SPS service to the block manager. Please move them out to prove it’s not 
tightly coupled.
{quote}
[Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}}
and keep all the related APIs there.

*Comment-3)*
{quote}BPServiceActor
 Is it actually sending back the moved blocks? Aren’t IBRs sufficient?
{quote}
[Rakesh's reply] Currently, SPS has a coupling with the heartbeat for sending
the block move tasks to the DN and receiving the {{blksMovementsFinished}}
list from the DN. Since we group them separately, each finished block movement
can be easily and quickly recognized by the Satisfier on the NN side, which
can then update the tracking details. I think that with the IBR approach, for
every newly reported block, the Satisfier on the NN side would have to do an
*additional* check on whether the new block occurred due to an SPSBlockMove
operation before updating the tracking details, and this *extra* check would
run quite often, since other ops far outnumber SPSBlockMove ops. I mean, a new
block can be reported due to an SPSBlockMove op or as part of some other
operation like re-replication, a newly written file block, the Balancer tool,
the Mover tool, etc. With the current approach, it can simply get the
{{blksMovementsFinished}} list from each heartbeat and do the logic.

*Comment-4)*
{quote}DataNode
 Why isn’t this just a block transfer? How is transferring between DNs any 
different than across storages?
{quote}
[Rakesh's reply] I can see that Mover also uses the {{REPLACE_BLOCK}} call,
and we just followed the same approach in SPS. Am I missing anything here?

*Comment-5)*
{quote}DatanodeDescriptor
 Why use a synchronized linked list to offer/poll instead of BlockingQueue?
{quote}
[Rakesh's reply] Agreed, will do the changes.
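
(Editorial sketch of the swap suggested above, not the actual patch; the
element type {{BlockMovingInfo}} and the method names are assumed.) A
BlockingQueue gives the same offer/poll semantics without external locking:
{code}
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Class fragment: replaces a synchronized LinkedList that was used only
// for offer/poll with a queue designed for exactly that pattern.
private final BlockingQueue<BlockMovingInfo> storageMovementBlocks =
    new LinkedBlockingQueue<>();

void addBlockToMoveStorage(BlockMovingInfo info) {
  storageMovementBlocks.offer(info);   // thread-safe, non-blocking
}

BlockMovingInfo pollBlockToMoveStorage() {
  return storageMovementBlocks.poll(); // returns null when empty
}
{code}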

*Comment-6)*
{quote}DatanodeManager
 I know it’s configurable, but realistically, when would you ever want to give 
storage movement tasks equal footing with under-replication? Is there really a 
use case for not valuing durability?
{quote}
[Rakesh's reply] We don't have any particular use case, though. One scenario
we considered is a user who configured SSDs that filled up quickly; in that
case, there could be situations where cleanup is considered a high priority.
If you feel this is not a real case, then I'm OK with removing this config,
and SPS will use only the remaining slots. If needed, we can bring such
configs back later.

*Comment-7)*
{quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport 
is already a very bad method that should be avoided for anything but jmx – even 
then it’s a concern. I eliminated calls to it years ago. All it takes is a 
nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of 
time. Beyond that, the response is going to be pretty large and tagging all the 
storage reports is not going to be cheap.

verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem 
lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its 
storageMap?

Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned 
earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached 
state of the world. Then it gets another datanode report to determine the 
number of live nodes to decide if it should sleep before processing the next 
path. The number of nodes from the prior cached view of the world should 
suffice.
{quote}
[Rakesh's reply] Good point. Some time back, Uma and I thought about the
caching part. Actually, we depend on this API for the datanode storage types
and remaining-space details. I think it requires two different mechanisms for
internal and external SPS. For the internal one, how about SPS directly
referring to {{DatanodeManager#datanodeMap}} for every file? For the external
one, IIUC you are suggesting a cache mechanism. How about making a data
structure like {{StorageGroup}}, similar to the one in the Mover class, and
building a cache from it? This local cache could be refreshed per file or
periodically.

*Comment-8)*
{quote}DFSUtil
 {{DFSUtil.removeOverlapBetweenStorageTypes}} and
 {{DFSUtil.getSPSWorkMultiplier}}. These aren’t generally useful methods, so
 why are they in DFSUtil? Why aren’t they in the only calling class,
 StoragePolicySatisfier?
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-9)*
{quote}FSDirXAttrOp
 Not fond of the SPS queue updates being edge triggered by xattr 

[jira] [Created] (HDFS-13087) Fix: Snapshots On encryption zones get incorrect EZ settings when encryption zone changes

2018-01-30 Thread LiXin Ge (JIRA)
LiXin Ge created HDFS-13087:
---

 Summary: Fix: Snapshots On encryption zones get incorrect EZ 
settings when encryption zone changes
 Key: HDFS-13087
 URL: https://issues.apache.org/jira/browse/HDFS-13087
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: encryption
Affects Versions: 3.1.0
Reporter: LiXin Ge


Snapshots are supposed to be immutable and read-only, so the EZ settings
within a snapshot path shouldn't change when the original encryption zone
changes.
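
A minimal editorial sketch of the behavior this issue argues for, assuming a
pre-existing encryption zone at {{/zone}}; the APIs used ({{HdfsAdmin}},
{{DistributedFileSystem}}) are standard, but the scenario itself is only an
illustration:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.client.HdfsAdmin;
import org.apache.hadoop.hdfs.protocol.EncryptionZone;

public class SnapshotEzCheck {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);
    HdfsAdmin admin = new HdfsAdmin(dfs.getUri(), conf);

    Path zone = new Path("/zone");    // assumed pre-existing encryption zone
    dfs.allowSnapshot(zone);
    Path snap = dfs.createSnapshot(zone, "s1");
    EncryptionZone before = admin.getEncryptionZoneForPath(snap);
    // ... the zone's settings change afterwards (e.g. re-encryption) ...
    EncryptionZone after = admin.getEncryptionZoneForPath(snap);
    // Expected per this issue: both calls report the original, snapshotted
    // settings; the reported bug is that the snapshot reflects the change.
    System.out.println(before.getKeyName() + " vs " + after.getKeyName());
  }
}
{code}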



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-13087) Fix: Snapshots On encryption zones get incorrect EZ settings when encryption zone changes

2018-01-30 Thread LiXin Ge (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-13087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LiXin Ge reassigned HDFS-13087:
---

Assignee: LiXin Ge

> Fix: Snapshots On encryption zones get incorrect EZ settings when encryption 
> zone changes
> -
>
> Key: HDFS-13087
> URL: https://issues.apache.org/jira/browse/HDFS-13087
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: encryption
>Affects Versions: 3.1.0
>Reporter: LiXin Ge
>Assignee: LiXin Ge
>Priority: Major
>
> Snapshots are supposed to be immutable and read-only, so the EZ settings
> within a snapshot path shouldn't change when the original encryption zone
> changes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Created] (HDFS-13086) Add a field about exception type to BlockOpResponseProto

2018-01-30 Thread LiXin Ge (JIRA)
LiXin Ge created HDFS-13086:
---

 Summary: Add a field about exception type to BlockOpResponseProto
 Key: HDFS-13086
 URL: https://issues.apache.org/jira/browse/HDFS-13086
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 3.1.0
Reporter: LiXin Ge
Assignee: LiXin Ge


When a user re-reads a file via short-circuit reads, the read may hit unknown
errors because the file has been appended since the first read (which changes
its meta file) or has been moved away by the Balancer.

Such unknown errors unnecessarily disable short-circuit reads for 10 minutes.
HDFS-12528 made the {{expireAfterWrite}} of {{DomainSocketFactory$pathMap}}
configurable, giving the user the option to never disable the domain socket.

We can go a step further and add a field for the exception type to
BlockOpResponseProto, so that we can ignore the acceptable FNFE and set an
appropriate disable time for the unacceptable exceptions when different types
of exceptions happen in the same cluster.
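
A rough sketch of what the proposal could look like, given a parsed response
{{resp}}; the field name and the client-side handling below are invented for
illustration (the issue does not specify them):
{code}
// In the response message (datatransfer.proto), alongside the status field:
//   optional string exceptionClassName = <next free tag>;
// The DN would set it when an op fails; the client could then special-case
// benign errors instead of disabling short-circuit reads wholesale:
if (resp.getStatus() != Status.SUCCESS
    && resp.hasExceptionClassName()
    && "java.io.FileNotFoundException".equals(resp.getExceptionClassName())) {
  return null;   // fall back to a remote read and keep SCR enabled
}
{code}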



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Assigned] (HDFS-12116) BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail

2018-01-30 Thread Gabor Bota (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gabor Bota reassigned HDFS-12116:
-

Assignee: (was: Gabor Bota)

> BlockReportTestBase#blockReport_08 and #blockReport_09 intermittently fail
> --
>
> Key: HDFS-12116
> URL: https://issues.apache.org/jira/browse/HDFS-12116
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.22.0
>Reporter: Xiao Chen
>Priority: Major
> Attachments: HDFS-12116.01.patch, HDFS-12116.02.patch, 
> HDFS-12116.03.patch, 
> TEST-org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage.xml
>
>
> This seems to be long-standing, but the failure rate (~10%) is slightly
> higher in dist-test runs using CDH.
> In both _08 and _09 tests:
> # an attempt is made to make a replica in {{TEMPORARY}}
>  state, by {{waitForTempReplica}}.
> # Once that's returned, the test goes on to verify that block reports show
> the correct pending replication blocks.
> But there's a race condition. If the replica is replicated between steps #1 
> and #2, {{getPendingReplicationBlocks}} could return 0 or 1, depending on how 
> many replicas are replicated, hence failing the test.
> Failures are seen on both {{TestNNHandlesBlockReportPerStorage}} and 
> {{TestNNHandlesCombinedBlockReport}}
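
(Editorial sketch, not the attached patch: one generic way to tolerate the
race between steps #1 and #2 using Hadoop's {{GenericTestUtils}}; the two
helper calls are hypothetical.)
{code}
import com.google.common.base.Supplier;
import org.apache.hadoop.test.GenericTestUtils;

// Poll instead of asserting once: pass if the block is still pending, or if
// it was already replicated before the check could run (the race at hand).
GenericTestUtils.waitFor(new Supplier<Boolean>() {
  @Override
  public Boolean get() {
    long pending = getPendingReplicationBlocks();  // hypothetical helper
    return pending == expectedPending || isReplicationComplete();
  }
}, 100, 10000);
{code}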



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-13084) [SPS]: Fix the branch review comments

2018-01-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344876#comment-16344876
 ] 

Rakesh R commented on HDFS-13084:
-

Thanks a lot [~daryn] for your time and review comments. My reply follows;
please take a look.

*Comment-1)*
{quote}BlockManager
 Shouldn’t spsMode be volatile? Although I question why it’s here.
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-2)*
{quote}Adding SPS methods to this class implies an unexpected coupling of the 
SPS service to the block manager. Please move them out to prove it’s not 
tightly coupled.
{quote}
[Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}}
and keep all the related APIs there.

*Comment-3)*
{quote}BPServiceActor
 Is it actually sending back the moved blocks? Aren’t IBRs sufficient?
{quote}
[Rakesh's reply] Currently, SPS has a coupling with the heartbeat for sending
the block move tasks to the DN and receiving the {{blksMovementsFinished}}
list from the DN. Since we group them separately, each finished block movement
can be easily and quickly recognized by the Satisfier on the NN side, which
can then update the tracking details. I think that with the IBR approach, for
every newly reported block, the Satisfier on the NN side would have to check
whether the new block occurred due to an SPSBlockMove operation before
updating the tracking details. I mean, a new block can be reported due to an
SPSBlockMove op or as part of some other operation like re-replication, a
newly written file block, the Balancer tool, the Mover tool, etc. With the
current approach, it can simply get the {{blksMovementsFinished}} list from
each heartbeat and do the logic.

*Comment-4)*
{quote}DataNode
 Why isn’t this just a block transfer? How is transferring between DNs any 
different than across storages?
{quote}
[Rakesh's reply] I can see that Mover also uses the {{REPLACE_BLOCK}} call,
and we just followed the same approach in SPS. Am I missing anything here?

*Comment-5)*
{quote}DatanodeDescriptor
 Why use a synchronized linked list to offer/poll instead of BlockingQueue?
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-6)*
{quote}DatanodeManager
 I know it’s configurable, but realistically, when would you ever want to give 
storage movement tasks equal footing with under-replication? Is there really a 
use case for not valuing durability?
{quote}
[Rakesh's reply] We don't have any particular use case, though. One scenario
we considered is a user who configured SSDs that filled up quickly; in that
case, there could be situations where cleanup is considered a high priority.
If you feel this is not a real case, then I'm OK with removing this config,
and SPS will use only the remaining slots. If needed, we can bring such
configs back later.

*Comment-7)*
{quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport 
is already a very bad method that should be avoided for anything but jmx – even 
then it’s a concern. I eliminated calls to it years ago. All it takes is a 
nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of 
time. Beyond that, the response is going to be pretty large and tagging all the 
storage reports is not going to be cheap.

verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem 
lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its 
storageMap?

Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned 
earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached 
state of the world. Then it gets another datanode report to determine the 
number of live nodes to decide if it should sleep before processing the next 
path. The number of nodes from the prior cached view of the world should 
suffice.
{quote}
[Rakesh's reply] Good point. Some time back, Uma and I thought about the
caching part. Actually, we depend on this API for the datanode storage types
and remaining-space details. I think it requires two different mechanisms for
internal and external SPS. For the internal one, how about SPS directly
referring to {{DatanodeManager#datanodeMap}} for every file? For the external
one, IIUC you are suggesting a cache mechanism. How about making a data
structure like {{StorageGroup}}, similar to the one in the Mover class, and
building a cache from it? This local cache could be refreshed per file or
periodically.

*Comment-8)*
{quote}DFSUtil
 {{DFSUtil.removeOverlapBetweenStorageTypes}} and
 {{DFSUtil.getSPSWorkMultiplier}}. These aren’t generally useful methods, so
 why are they in DFSUtil? Why aren’t they in the only calling class,
 StoragePolicySatisfier?
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-9)*
{quote}FSDirXAttrOp
 Not fond of the SPS queue updates being edge triggered by xattr changes. Food 
for thought: why add more rpc methods if the client can just twiddle the xattr?
{quote}
[Rakesh's reply] SPS depends on the xattr. On each user satisfy

[jira] [Comment Edited] (HDFS-13084) [SPS]: Fix the branch review comments

2018-01-30 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-13084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16344876#comment-16344876
 ] 

Rakesh R edited comment on HDFS-13084 at 1/30/18 10:55 AM:
---

Thanks a lot [~daryn] for your time and review comments. My reply follows;
please take a look.

*Comment-1)*
{quote}BlockManager
 Shouldn’t spsMode be volatile? Although I question why it’s here.
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-2)*
{quote}Adding SPS methods to this class implies an unexpected coupling of the 
SPS service to the block manager. Please move them out to prove it’s not 
tightly coupled.
{quote}
[Rakesh's reply] Agreed. I'm planning to create {{StoragePolicySatisfyManager}}
and keep all the related APIs there.

*Comment-3)*
{quote}BPServiceActor
 Is it actually sending back the moved blocks? Aren’t IBRs sufficient?
{quote}
[Rakesh's reply] Currently, SPS has a coupling with the heartbeat for sending
the block move tasks to the DN and receiving the {{blksMovementsFinished}}
list from the DN. Since we group them separately, each finished block movement
can be easily and quickly recognized by the Satisfier on the NN side, which
can then update the tracking details. I think that with the IBR approach, for
every newly reported block, the Satisfier on the NN side would have to check
whether the new block occurred due to an SPSBlockMove operation before
updating the tracking details. I mean, a new block can be reported due to an
SPSBlockMove op or as part of some other operation like re-replication, a
newly written file block, the Balancer tool, the Mover tool, etc. With the
current approach, it can simply get the {{blksMovementsFinished}} list from
each heartbeat and do the logic.

*Comment-4)*
{quote}DataNode
 Why isn’t this just a block transfer? How is transferring between DNs any 
different than across storages?
{quote}
[Rakesh's reply] I can see that Mover also uses the {{REPLACE_BLOCK}} call,
and we just followed the same approach in SPS. Am I missing anything here?

*Comment-5)*
{quote}DatanodeDescriptor
 Why use a synchronized linked list to offer/poll instead of BlockingQueue?
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-6)*
{quote}DatanodeManager
 I know it’s configurable, but realistically, when would you ever want to give 
storage movement tasks equal footing with under-replication? Is there really a 
use case for not valuing durability?
{quote}
[Rakesh's reply] We don't have any particular use case, though. One scenario
we considered is a user who configured SSDs that filled up quickly; in that
case, there could be situations where cleanup is considered a high priority.
If you feel this is not a real case, then I'm OK with removing this config,
and SPS will use only the remaining slots. If needed, we can bring such
configs back later.

*Comment-7)*
{quote}Adding getDatanodeStorageReport is concerning. getDatanodeListForReport 
is already a very bad method that should be avoided for anything but jmx – even 
then it’s a concern. I eliminated calls to it years ago. All it takes is a 
nscd/dns hiccup and you’re left holding the fsn lock for an excessive length of 
time. Beyond that, the response is going to be pretty large and tagging all the 
storage reports is not going to be cheap.

verifyTargetDatanodeHasSpaceForScheduling does it really need the namesystem 
lock? Can’t DatanodeDescriptor#chooseStorage4Block synchronize on its 
storageMap?

Appears to be calling getLiveDatanodeStorageReport for every file. As mentioned 
earlier, this is NOT cheap. The SPS should be able to operate on a fuzzy/cached 
state of the world. Then it gets another datanode report to determine the 
number of live nodes to decide if it should sleep before processing the next 
path. The number of nodes from the prior cached view of the world should 
suffice.
{quote}
[Rakesh's reply] Good point. Some time back, Uma and I thought about the
caching part. Actually, we depend on this API for the datanode storage types
and remaining-space details. I think it requires two different mechanisms for
internal and external SPS. For the internal one, how about SPS directly
referring to {{DatanodeManager#datanodeMap}} for every file? For the external
one, IIUC you are suggesting a cache mechanism. How about making a data
structure like {{StorageGroup}}, similar to the one in the Mover class, and
building a cache from it? This local cache could be refreshed per file or
periodically.

*Comment-8)*
{quote}DFSUtil
 {{DFSUtil.removeOverlapBetweenStorageTypes}} and
 {{DFSUtil.getSPSWorkMultiplier}}. These aren’t generally useful methods, so
 why are they in DFSUtil? Why aren’t they in the only calling class,
 StoragePolicySatisfier?
{quote}
[Rakesh's reply] Agreed, will do the changes.

*Comment-9)*
{quote}FSDirXAttrOp
 Not fond of the SPS queue updates being edge triggered by xattr changes. Food 
for thought: why add more rpc methods if the client can just twiddle the xattr?
{quote}
[Rakesh's reply] 
