Re: Permissions to edit Confluence Wiki

2017-09-08 Thread Arun Suresh
Thanks Akira

Cheers
-Arun

On Fri, Sep 8, 2017 at 6:53 PM, Akira Ajisaka  wrote:

> You can get the privilege by sending an e-mail to the dev ML.
> I added it for you.
>
> Thanks,
> Akira
>
>
> On 2017/09/09 4:50, Arun Suresh wrote:
>
>> Hi folks
>>
>> How do we get access to edit the confluence wiki;
>> https://cwiki.apache.org/confluence/display/HADOOP ?
>>
>> We were hoping to update it with hadoop 2.9 release details.
>>
>> Cheers
>> -Arun
>>
>>


Re: Permissions to edit Confluence Wiki

2017-09-08 Thread Akira Ajisaka

You can get the privilege by sending an e-mail to the dev ML.
I added it for you.

Thanks,
Akira

On 2017/09/09 4:50, Arun Suresh wrote:

Hi folks

How do we get access to edit the confluence wiki;
https://cwiki.apache.org/confluence/display/HADOOP ?

We were hoping to update it with hadoop 2.9 release details.

Cheers
-Arun






[jira] [Created] (HDFS-12412) Remove ErasureCodingWorker.stripedReadPool

2017-09-08 Thread Lei (Eddy) Xu (JIRA)
Lei (Eddy) Xu created HDFS-12412:


 Summary: Remove ErasureCodingWorker.stripedReadPool
 Key: HDFS-12412
 URL: https://issues.apache.org/jira/browse/HDFS-12412
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha3
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu


In {{ErasureCodingWorker}}, {{stripedReconstructionPool}} is used to schedule 
the EC recovery tasks, while {{stripedReadPool}} is used for the reader threads 
in each recovery task.  We only need one of them to throttle the speed of the 
recovery process, because each EC recovery task has a fixed number of source 
readers (i.e., 3 for RS(3,2)). And because of the findings in HDFS-12044, the 
speed of EC recovery can be throttled by {{stripedReconstructionPool}} with 
{{xmitsInProgress}}. 

Moreover, keeping {{stripedReadPool}} makes it difficult for users to understand 
and calculate the right balance between 
{{dfs.datanode.ec.reconstruction.stripedread.threads}}, 
{{dfs.datanode.ec.reconstruction.stripedblock.threads.size}} and 
{{maxReplicationStreams}}.  For example, a {{stripedread.threads}} value that is 
small compared to what {{reconstruction.threads.size}} implies will 
unnecessarily limit the speed of recovery, which leads to a larger MTTR. 
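
A minimal sketch of the idea (all class and method names below are illustrative, 
not the actual {{ErasureCodingWorker}} code): a single reconstruction pool gates 
concurrency, and each task brings its own bounded set of source readers, so no 
separate read pool is needed.

{code:java}
// Hypothetical sketch: throttle EC recovery with a single reconstruction pool.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SingleGateReconstruction {
  // One pool gates how many recovery tasks run concurrently; this is the
  // only knob needed to throttle recovery speed.
  private final ExecutorService reconstructionPool;

  public SingleGateReconstruction(int maxConcurrentTasks) {
    this.reconstructionPool = Executors.newFixedThreadPool(maxConcurrentTasks);
  }

  public void submit(Runnable recoveryTask) {
    // Each task internally reads from a fixed number of sources
    // (e.g. 3 for RS(3,2)), so total read parallelism is already bounded.
    reconstructionPool.submit(recoveryTask);
  }
}
{code}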






[jira] [Created] (HDFS-12411) Ozone: Add container usage information to DN container report

2017-09-08 Thread Xiaoyu Yao (JIRA)
Xiaoyu Yao created HDFS-12411:
-

 Summary: Ozone: Add container usage information to DN container 
report
 Key: HDFS-12411
 URL: https://issues.apache.org/jira/browse/HDFS-12411
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ozone, scm
Reporter: Xiaoyu Yao
Assignee: Xiaoyu Yao


The current DN ReportState for containers only has a counter; we need to 
include per-container usage information (a rough sketch of such an entry is 
given after this list) so that SCM can:
* close containers when they are full
* assign containers for the block service with different policies
* etc.
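
Below is a rough sketch of the kind of per-container usage entry the report 
could carry; all class and field names are assumptions for illustration, not 
the actual Ozone/SCM protocol.

{code:java}
// Illustrative only: a per-container usage entry a DN report could carry so
// SCM can decide when to close or stop allocating to a container.
public class ContainerUsageInfo {
  private final long containerId;
  private final long bytesUsed;      // space currently consumed
  private final long capacityBytes;  // configured maximum container size
  private final long keyCount;       // number of keys/blocks stored

  public ContainerUsageInfo(long containerId, long bytesUsed,
                            long capacityBytes, long keyCount) {
    this.containerId = containerId;
    this.bytesUsed = bytesUsed;
    this.capacityBytes = capacityBytes;
    this.keyCount = keyCount;
  }

  /** SCM could close the container once usage crosses a threshold. */
  public boolean isNearlyFull(double threshold) {
    return bytesUsed >= capacityBytes * threshold;
  }
}
{code}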







[jira] [Created] (HDFS-12410) Ignore unknown StorageTypes

2017-09-08 Thread Chris Douglas (JIRA)
Chris Douglas created HDFS-12410:


 Summary: Ignore unknown StorageTypes
 Key: HDFS-12410
 URL: https://issues.apache.org/jira/browse/HDFS-12410
 Project: Hadoop HDFS
  Issue Type: Task
  Components: datanode, fs
Reporter: Chris Douglas
Priority: Minor


A storage configured with an unknown type currently causes runtime exceptions. 
Instead, such storages can be ignored/skipped.
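
A minimal sketch of the skipping behavior, assuming the types arrive as strings 
from configuration or reports; the parsing loop and the local enum below are 
illustrative, not the actual HDFS StorageType handling.

{code:java}
// Hypothetical parsing loop: unknown storage type names are logged and
// skipped instead of letting valueOf() throw at runtime.
import java.util.ArrayList;
import java.util.List;

public class StorageTypeParser {
  enum StorageType { DISK, SSD, ARCHIVE, RAM_DISK }

  static List<StorageType> parse(List<String> names) {
    List<StorageType> result = new ArrayList<>();
    for (String name : names) {
      try {
        result.add(StorageType.valueOf(name.toUpperCase()));
      } catch (IllegalArgumentException e) {
        // Unknown type (e.g. from a newer release): ignore rather than fail.
        System.err.println("Skipping unknown storage type: " + name);
      }
    }
    return result;
  }
}
{code}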






[jira] [Created] (HDFS-12409) Add metrics of execution time of EC recovery tasks

2017-09-08 Thread Lei (Eddy) Xu (JIRA)
Lei (Eddy) Xu created HDFS-12409:


 Summary: Add metrics of execution time of EC recovery tasks
 Key: HDFS-12409
 URL: https://issues.apache.org/jira/browse/HDFS-12409
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Affects Versions: 3.0.0-alpha3
Reporter: Lei (Eddy) Xu
Assignee: Lei (Eddy) Xu
Priority: Minor


Admins could use additional metrics to monitor EC recovery tasks and gain 
insight for tuning recovery performance.
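
As a hedged illustration of the kind of metric that could be added (the metrics 
recorder interface below is a stand-in, not the real DataNode metrics API):

{code:java}
// Illustrative timing wrapper around an EC recovery task.
public class TimedRecoveryTask implements Runnable {
  interface RecoveryMetrics {
    void addReconstructionTime(long millis);
  }

  private final Runnable task;
  private final RecoveryMetrics metrics;

  TimedRecoveryTask(Runnable task, RecoveryMetrics metrics) {
    this.task = task;
    this.metrics = metrics;
  }

  @Override
  public void run() {
    long start = System.nanoTime();
    try {
      task.run();
    } finally {
      // Record execution time so admins can spot slow recoveries.
      metrics.addReconstructionTime((System.nanoTime() - start) / 1_000_000L);
    }
  }
}
{code}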






[jira] [Created] (HDFS-12408) Many EC tests fail in trunk

2017-09-08 Thread Arpit Agarwal (JIRA)
Arpit Agarwal created HDFS-12408:


 Summary: Many EC tests fail in trunk
 Key: HDFS-12408
 URL: https://issues.apache.org/jira/browse/HDFS-12408
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: erasure-coding
Affects Versions: 3.0.0-alpha4
Reporter: Arpit Agarwal
Priority: Blocker


Many EC tests seem to be failing in pre-commit runs. e.g.
https://builds.apache.org/job/PreCommit-HDFS-Build/21055/testReport/
https://builds.apache.org/job/PreCommit-HDFS-Build/21052/testReport/
https://builds.apache.org/job/PreCommit-HDFS-Build/21048/testReport/

This is creating a lot of noise in Jenkins run outputs. We should either fix 
or disable these tests.






Permissions to edit Confluence Wiki

2017-09-08 Thread Arun Suresh
Hi folks

How do we get access to edit the confluence wiki;
https://cwiki.apache.org/confluence/display/HADOOP ?

We were hoping to update it with hadoop 2.9 release details.

Cheers
-Arun


Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2017-09-08 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/517/

[Sep 7, 2017 7:46:20 AM] (sunilg) YARN-6992. Kill application button is visible 
even if the application is
[Sep 7, 2017 12:38:23 PM] (kai.zheng) HDFS-12402. Refactor 
ErasureCodingPolicyManager and related codes.
[Sep 7, 2017 3:18:28 PM] (arp) HDFS-12376. Enable JournalNode Sync by default. 
Contributed by Hanisha
[Sep 7, 2017 4:50:36 PM] (yzhang) HDFS-12357. Let NameNode to bypass external 
attribute provider for
[Sep 7, 2017 5:35:03 PM] (stevel) HADOOP-14520. WASB: Block compaction for 
Azure Block Blobs. Contributed
[Sep 7, 2017 5:23:12 PM] (Arun Suresh) YARN-6978. Add updateContainer API to 
NMClient. (Kartheek Muthyala via
[Sep 7, 2017 6:55:56 PM] (stevel) HADOOP-14774. S3A case 
"testRandomReadOverBuffer" failed due to improper
[Sep 7, 2017 7:40:09 PM] (aengineer) HDFS-12350. Support meta tags in configs. 
Contributed by Ajay Kumar.
[Sep 7, 2017 9:13:37 PM] (wangda) YARN-7033. Add support for NM Recovery of 
assigned resources (e.g.
[Sep 7, 2017 9:17:03 PM] (jlowe) YARN-6930. Admins should be able to explicitly 
enable specific
[Sep 7, 2017 11:30:12 PM] (xiao) HDFS-12369. Edit log corruption due to hard 
lease recovery of not-closed
[Sep 7, 2017 11:56:35 PM] (wang) HDFS-12218. Rename split EC / replicated block 
metrics in BlockManager.
[Sep 7, 2017 11:57:19 PM] (wang) HDFS-12218. Addendum. Rename split EC / 
replicated block metrics in
[Sep 8, 2017 12:20:42 AM] (manojpec) HDFS-12404. Rename hdfs config 
authorization.provider.bypass.users to
[Sep 8, 2017 1:01:37 AM] (lei) HDFS-12349. Improve log message when it could 
not alloc enough blocks
[Sep 8, 2017 1:45:17 AM] (sunilg) YARN-6600. Introduce default and max lifetime 
of application at
[Sep 8, 2017 2:07:17 AM] (subru) YARN-5330. SharingPolicy enhancements required 
to support recurring
[Sep 8, 2017 3:51:02 AM] (xiao) HDFS-12400. Provide a way for NN to drain the 
local key cache before




-1 overall


The following subsystems voted -1:
findbugs unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

FindBugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Format-string method String.format(String, Object[]) called with format 
string "File %s could only be written to %d of the %d %s. There are %d 
datanode(s) running and %s node(s) are excluded in this operation." wants 6 
arguments but is given 7 in 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(String,
 int, Node, Set, long, List, byte, BlockType, ErasureCodingPolicy, EnumSet) At 
BlockManager.java:with format string "File %s could only be written to %d of 
the %d %s. There are %d datanode(s) running and %s node(s) are excluded in this 
operation." wants 6 arguments but is given 7 in 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(String,
 int, Node, Set, long, List, byte, BlockType, ErasureCodingPolicy, EnumSet) At 
BlockManager.java:[line 2076] 
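
To illustrate the warning: the format string above contains six placeholders but 
seven arguments are passed, so the last argument is silently ignored. A 
simplified, hypothetical reproduction (not the actual BlockManager code):

{code:java}
// Six placeholders (%s %d %d %s %d %s), seven arguments; the extra one is dropped.
String msg = String.format(
    "File %s could only be written to %d of the %d %s. There are %d datanode(s) "
        + "running and %s node(s) are excluded in this operation.",
    "/tmp/f", 1, 3, "required nodes", 5, "2", "extraArg" /* ignored */);
{code}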

FindBugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager
 
   Hard coded reference to an absolute pathname in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
 At DockerLinuxContainerRuntime.java:absolute pathname in 
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.DockerLinuxContainerRuntime.launchContainer(ContainerRuntimeContext)
 At DockerLinuxContainerRuntime.java:[line 490] 

Failed junit tests :

   hadoop.hdfs.server.namenode.TestReencryption 
   hadoop.hdfs.server.namenode.TestReencryptionWithKMS 
   hadoop.hdfs.TestLeaseRecoveryStriped 
   hadoop.hdfs.TestClientProtocolForPipelineRecovery 
   hadoop.hdfs.TestReconstructStripedFile 
   hadoop.hdfs.server.blockmanagement.TestBlockManager 
   hadoop.hdfs.server.blockmanagement.TestBlockStatsMXBean 
   hadoop.hdfs.protocol.datatransfer.sasl.TestSaslDataTransfer 
   hadoop.hdfs.TestDFSStripedOutputStreamWithFailureWithRandomECPolicy 
   
hadoop.yarn.server.resourcemanager.scheduler.capacity.TestContainerAllocation 
   hadoop.yarn.server.TestDiskFailures 
   hadoop.mapreduce.v2.hs.webapp.TestHSWebApp 
   hadoop.yarn.sls.TestReservationSystemInvariants 
   hadoop.yarn.sls.TestSLSRunner 

Timed out junit tests :

   org.apache.hadoop.hdfs.TestWriteReadStripedFile 
   
org.apache.hadoop.yarn.server.resourcemanager.TestSubmitApplicationWithRMHA 
   
org.apache.hadoop.yarn.server.resourcemanager.TestKillApplicationWithRMHA 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/517/artifact/out/diff-compile-cc-root.txt
  [4.0K]

 

[jira] [Created] (HDFS-12407) JournalNode fails to shut down cleanly if JournalNodeHttpServer or JournalNodeRpcServer fails to start

2017-09-08 Thread Ajay Kumar (JIRA)
Ajay Kumar created HDFS-12407:
-

 Summary: JournalNode fails to shut down cleanly if 
JournalNodeHttpServer or JournalNodeRpcServer fails to start
 Key: HDFS-12407
 URL: https://issues.apache.org/jira/browse/HDFS-12407
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ajay Kumar


The JournalNode fails to shut down cleanly if the JournalNodeHttpServer or 
JournalNodeRpcServer fails to start.

Steps to recreate the issue:
# Change the HTTP port for the JournalNodeHttpServer to a port which is already 
in use
{code}dfs.journalnode.http-address{code}
# Start the JournalNode. The JournalNodeHttpServer start will fail with a bind 
exception while the JournalNode process continues to run.
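
As a sketch of the expected behavior (all class and method names below are 
hypothetical, not the actual JournalNode code): if either server fails to bind, 
the whole process should be torn down rather than left half-running.

{code:java}
// Hypothetical startup sequence: if either server fails to start, stop the
// JournalNode and rethrow so the process exits instead of lingering.
public class JournalNodeStartupSketch {
  interface Server { void start() throws java.io.IOException; void stop(); }

  private final Server httpServer;
  private final Server rpcServer;

  JournalNodeStartupSketch(Server httpServer, Server rpcServer) {
    this.httpServer = httpServer;
    this.rpcServer = rpcServer;
  }

  void start() throws java.io.IOException {
    try {
      httpServer.start();   // may throw a bind exception if the port is taken
      rpcServer.start();
    } catch (java.io.IOException e) {
      stop();               // clean shutdown instead of a half-started process
      throw e;
    }
  }

  void stop() {
    httpServer.stop();
    rpcServer.stop();
  }
}
{code}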






[jira] [Resolved] (HDFS-12296) Add a field to FsServerDefaults to tell if external attribute provider is enabled

2017-09-08 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang resolved HDFS-12296.
--
Resolution: Won't Fix

> Add a field to FsServerDefaults to tell if external attribute provider is 
> enabled
> -
>
> Key: HDFS-12296
> URL: https://issues.apache.org/jira/browse/HDFS-12296
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>







[jira] [Resolved] (HDFS-12294) Let distcp to bypass external attribute provider when calling getFileStatus etc at source cluster

2017-09-08 Thread Yongjun Zhang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yongjun Zhang resolved HDFS-12294.
--
Resolution: Won't Fix

> Let distcp to bypass external attribute provider when calling getFileStatus 
> etc at source cluster
> -
>
> Key: HDFS-12294
> URL: https://issues.apache.org/jira/browse/HDFS-12294
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs
>Reporter: Yongjun Zhang
>Assignee: Yongjun Zhang
>
> This is an alternative solution for HDFS-12202, which proposed introducing a 
> new set of APIs with an additional boolean parameter bypassExtAttrProvider, 
> so as to let the NN bypass the external attribute provider in getFileStatus. 
> The goal is to prevent distcp from copying attributes from one cluster's 
> external attribute provider and saving them to another cluster's fsimage.
> The solution here is, instead of adding a parameter, to encode it in the path 
> itself: when getFileStatus (and some other calls) is invoked, the NN parses 
> the path and figures out whether the external attribute provider needs to be 
> bypassed. The suggested encoding is to add a prefix to the path before calling 
> getFileStatus, e.g. /a/b/c becomes /.reserved/bypassExtAttr/a/b/c. The NN 
> parses the path at the very beginning.
> Thanks much to [~andrew.wang] for this suggestion. The scope of the change is 
> smaller and we don't have to change the FileSystem APIs.
>  
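
As an illustration of the path-prefix idea described above, a minimal, 
hypothetical sketch of the parsing (not the actual NameNode implementation):

{code:java}
// Illustration only: detect and strip the bypass prefix from a path.
public class BypassPrefixSketch {
  static final String BYPASS_PREFIX = "/.reserved/bypassExtAttr";

  /** True if the caller asked to bypass the external attribute provider. */
  static boolean shouldBypass(String path) {
    return path.startsWith(BYPASS_PREFIX + "/");
  }

  /** Strip the prefix to recover the real path,
   *  e.g. "/.reserved/bypassExtAttr/a/b/c" -> "/a/b/c". */
  static String stripPrefix(String path) {
    return shouldBypass(path) ? path.substring(BYPASS_PREFIX.length()) : path;
  }
}
{code}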






[jira] [Created] (HDFS-12406) dfsadmin command prints "Exception encountered" even if there is no exception, when debug is enabled

2017-09-08 Thread Nandakumar (JIRA)
Nandakumar created HDFS-12406:
-

 Summary: dfsadmin command prints "Exception encountered" even if 
there is no exception, when debug is enabled 
 Key: HDFS-12406
 URL: https://issues.apache.org/jira/browse/HDFS-12406
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Reporter: Nandakumar
Assignee: Nandakumar
Priority: Minor


In DFSAdmin we are printing {{"Exception encountered"}} at debug level for all 
the calls even if there is no exception.

{code:title=DFSAdmin#run}
if (LOG.isDebugEnabled()) {
  LOG.debug("Exception encountered:", debugException);
}
{code}
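
A minimal sketch of one possible fix, guarding the log call on an actual 
exception being present (a suggestion only, not the committed change):

{code:java}
// Only log when there really was an exception; otherwise the debug line
// "Exception encountered: null" is printed for every successful call.
if (LOG.isDebugEnabled() && debugException != null) {
  LOG.debug("Exception encountered:", debugException);
}
{code}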






[DISCUSS] official docker image(s) for hadoop

2017-09-08 Thread Marton, Elek


TL;DR: I propose to create official hadoop images and upload them to Docker Hub.


GOAL/SCOPE: I would like to improve the existing documentation with 
easy-to-use, Docker-based recipes to start hadoop clusters with various 
configurations.


The images could also be used to test experimental features. For example, 
Ozone could be tested easily with this compose file and configuration:


https://gist.github.com/elek/1676a97b98f4ba561c9f51fce2ab2ea6

The configuration could even be included in the compose file:

https://github.com/elek/hadoop/blob/docker-2.8.0/example/docker-compose.yaml

I would like to create separate example compose files for federation, HA, 
metrics usage, etc., to make it easier to try out and understand these 
features.


CONTEXT: There is an existing Jira, 
https://issues.apache.org/jira/browse/HADOOP-13397, 
but it is about a tool to generate production-quality Docker images (multiple 
types, in a flexible way). If there are no objections, I will create a separate 
issue for simplified Docker images for rapid prototyping and investigating new 
features, and register the branch with Docker Hub so the images are built 
automatically.


MY BACKGROUND: I have been working with Docker-based hadoop/spark clusters for 
quite a while and have run them successfully in different environments 
(Kubernetes, Docker Swarm, Nomad-based scheduling, etc.). My work is available 
at https://github.com/flokkr, but those images handle more complex use cases 
(e.g. instrumenting Java processes with btrace, or reading/reloading 
configuration from Consul). And IMHO, in the official hadoop documentation it 
is better to suggest using official Apache Docker images rather than external 
ones (which could change).


Please let me know if you have any comments.

Marton




hadoop roadmaps

2017-09-08 Thread Marton, Elek

Hi,

I tried to summarize all of the information from the different mail threads 
about the upcoming releases:


https://cwiki.apache.org/confluence/display/HADOOP/Roadmap

Please fix it / let me know if you see any invalid data. I will try to 
follow the conversations and update accordingly.


Two administrative questions:

 * Is there any information about which wiki should be used, or about the 
migration process? As far as I can see, new pages have recently been created 
on the cwiki.


 * Could you please give me permission (user: elek) on the old wiki? I would 
like to update the old Roadmap page 
(https://wiki.apache.org/hadoop/Roadmap).


Thanks
Marton




Re: 2017-09-07 Hadoop 3 release status update

2017-09-08 Thread Steve Loughran

On 8 Sep 2017, at 00:50, Andrew Wang wrote:

  - HADOOP-14738  (Remove
  S3N and obsolete bits of S3A; rework docs): Steve has been actively revving
  this with our new committer Aaron Fabbri ready to review. The scope has
  expanded from HADOOP-14826, so it's not just a doc update.

For people not tracking this, it's merged with other cleanup code, so it pulls 
the entirety of the s3n:// connector and the original S3AOutputStream: 
essentially the unmaintained and obsolete bits of code, the ones where any bug 
report would be dealt with by "have you switched to..."


[jira] [Resolved] (HDFS-12326) What is the correct way of retrying when failure occurs during writing

2017-09-08 Thread Andras Bokor (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-12326?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andras Bokor resolved HDFS-12326.
-
Resolution: Not A Problem

It seems like a question, not a bug.

> What is the correct way of retrying when failure occurs during writing
> --
>
> Key: HDFS-12326
> URL: https://issues.apache.org/jira/browse/HDFS-12326
> Project: Hadoop HDFS
>  Issue Type: Test
>  Components: hdfs-client
>Reporter: ZhangBiao
>
> I'm using the Go HDFS client https://github.com/colinmarc/hdfs to write to 
> HDFS, and I'm using hadoop 2.7.3.
> When the number of concurrently open files is larger, for example 200, I 
> always get a 'broken pipe' error.
> So I want to retry and continue writing. What is the correct way of retrying? 
> Because https://github.com/colinmarc/hdfs cannot recover the stream state when 
> an error occurs during writing, I have to reopen the file and get a new 
> stream. So I tried the following steps:
> 1. Close the current stream
> 2. Append to the file to get a new stream
> But when I close the stream, I got the error "updateBlockForPipeline call 
> failed with ERROR_APPLICATION (java.io.IOException"
> and it seems the namenode complains:
> {code:java}
> 2017-08-20 03:22:55,598 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 2 on 9000, call 
> org.apache.hadoop.hdfs.protocol.ClientProtocol.updateBlockForPipeline from 
> 192.168.0.39:46827 Call#50183 Retry#-1
> java.io.IOException: 
> BP-1152809458-192.168.0.39-1502261411064:blk_1073825071_111401 does not exist 
> or is not under Constructionblk_1073825071_111401{UCState=COMMITTED, 
> truncateBlock=null, primaryNodeIndex=-1, 
> replicas=[ReplicaUC[[DISK]DS-d61914ba-df64-467b-bb75-272875e5e865:NORMAL:192.168.0.39:50010|RBW],
>  
> ReplicaUC[[DISK]DS-1314debe-ab08-4001-ab9a-8e234f28f87c:NORMAL:192.168.0.38:50010|RBW]]}
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkUCBlock(FSNamesystem.java:6241)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.updateBlockForPipeline(FSNamesystem.java:6309)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.updateBlockForPipeline(NameNodeRpcServer.java:806)
> at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.updateBlockForPipeline(ClientNamenodeProtocolServerSideTranslatorPB.java:955)
> at 
> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
> at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
> 2017-08-20 03:22:56,333 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: BLOCK* 
> blk_1073825071_111401{UCState=COMMITTED, truncateBlock=null, 
> primaryNodeIndex=-1, 
> replicas=[ReplicaUC[[DISK]DS-d61914ba-df64-467b-bb75-272875e5e865:NORMAL:192.168.0.39:50010|RBW],
>  
> ReplicaUC[[DISK]DS-1314debe-ab08-4001-ab9a-8e234f28f87c:NORMAL:192.168.0.38:50010|RBW]]}
>  is not COMPLETE (ucState = COMMITTED, replication# = 0 <  minimum = 1) in 
> file 
> /user/am/scan_task/2017-08-20/192.168.0.38_audience_f/user-bak010-20170820030804.log
> {code}
> When I appended to get a new stream, I got the error 'append call failed with 
> ERROR_APPLICATION 
> (org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException)', and the 
> corresponding error in the namenode is:
> {code:java}
> 2017-08-20 03:22:56,335 WARN org.apache.hadoop.hdfs.StateChange: DIR* 
> NameSystem.append: Failed to APPEND_FILE 
> /user/am/scan_task/2017-08-20/192.168.0.38_audience_f/user-bak010-20170820030804.log
>  for go-hdfs-OAfvZiSUM2Eu894p on 192.168.0.39 because 
> go-hdfs-OAfvZiSUM2Eu894p is already the current lease holder.
> 2017-08-20 03:22:56,335 INFO org.apache.hadoop.ipc.Server: IPC Server handler 
> 0 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.append from 
> 192.168.0.39:46827 Call#50186 Retry#-1: 
> org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException: Failed to 
> APPEND_FILE 
> /user/am/scan_task/2017-08-20/192.168.0.38_audience_f/user-bak010-20170820030804.log
>  for go-hdfs-OAfvZiSUM2Eu894p on 192.168.0.39 because 
> go-hdfs-OAfvZiSUM2Eu894p is already the current lease holder.
> {code}
> Could you please suggest the correct 

[jira] [Created] (HDFS-12405) Completely remove "removed" state erasure coding policies during NameNode restart

2017-09-08 Thread SammiChen (JIRA)
SammiChen created HDFS-12405:


 Summary: Completely remove "removed" state erasure coding policies 
during NameNode restart
 Key: HDFS-12405
 URL: https://issues.apache.org/jira/browse/HDFS-12405
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: erasure-coding
Reporter: SammiChen


Currently, when an erasure coding policy is removed, it is transitioned to the 
"removed" state. Users can no longer apply a policy in the "removed" state to 
files or directories.  The policy cannot be safely removed from the system 
unless we know that no existing files or directories use this "removed" policy. 
Finding out at runtime whether any files or directories are using the policy is 
time consuming and might impact NameNode performance.  So a better choice is to 
do the work when the NameNode restarts and loads inodes one by one. Collecting 
the information at that time will not introduce much extra workload. 
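
A hedged sketch of collecting that usage information while inodes are loaded at 
startup; all names below are illustrative, not the actual FSImage loading code.

{code:java}
// Illustrative counter: while the NameNode replays inodes at startup, count
// how many still reference each erasure coding policy; a "removed" policy
// whose count stays at zero can then be purged for good.
import java.util.HashMap;
import java.util.Map;

public class RemovedPolicyUsageCounter {
  private final Map<Byte, Long> usagePerPolicyId = new HashMap<>();

  /** Called once per loaded inode that carries an EC policy id. */
  void onInodeLoaded(byte ecPolicyId) {
    usagePerPolicyId.merge(ecPolicyId, 1L, Long::sum);
  }

  /** After loading completes, a "removed" policy with no users is safe to drop. */
  boolean isSafeToPurge(byte removedPolicyId) {
    return usagePerPolicyId.getOrDefault(removedPolicyId, 0L) == 0L;
  }
}
{code}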


