[jira] [Created] (HDDS-941) Rename ChunkGroupInputStream to KeyInputStream and ChunkInputStream to BlockInputStream

2018-12-18 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-941:


 Summary: Rename ChunkGroupInputStream to KeyInputStream and ChunkInputStream to BlockInputStream
 Key: HDDS-941
 URL: https://issues.apache.org/jira/browse/HDDS-941
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Client
Affects Versions: 0.4.0
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee
 Fix For: 0.4.0


ChunkGroupInputStream reads all the blocks for a key, and ChunkInputStream reads all the chunks of a block. It would be clearer to rename ChunkGroupInputStream to KeyInputStream and ChunkInputStream to BlockInputStream.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDDS-940) Remove dead store to local variable in OmMetadataManagerImpl

2018-12-18 Thread Dinesh Chitlangia (JIRA)
Dinesh Chitlangia created HDDS-940:
--

 Summary: Remove dead store to local variable in 
OmMetadataManagerImpl
 Key: HDDS-940
 URL: https://issues.apache.org/jira/browse/HDDS-940
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Manager
Reporter: Dinesh Chitlangia
Assignee: Dinesh Chitlangia


OmMetadataManagerImpl#getExpiredOpenKeys creates a dead store to a local variable in the following line:
{{long now = Time.now();}}

This jira aims to remove the dead store.
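For readers unfamiliar with the term, the toy sketch below (not the actual OmMetadataManagerImpl code) shows what a dead store looks like and why deleting it is behavior-preserving.

```java
public class Main {

    // Before: 'now' is computed and stored but never read afterwards -- a
    // dead store, which static analyzers such as findbugs flag.
    static int getExpiredOpenKeysBefore() {
        long now = System.currentTimeMillis(); // dead store: never used below
        return scanOpenKeyTable();
    }

    // After: the unused local variable is simply deleted; behavior is identical.
    static int getExpiredOpenKeysAfter() {
        return scanOpenKeyTable();
    }

    // Stand-in for the real open-key table scan; returns a fixed count here.
    static int scanOpenKeyTable() { return 3; }

    public static void main(String[] args) {
        System.out.println(getExpiredOpenKeysBefore() == getExpiredOpenKeysAfter()); // prints "true"
    }
}
```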





[jira] [Created] (HDFS-14161) RBF: throw RetriableException instead of IOException so that the client can retry when it cannot get a connection

2018-12-18 Thread Fei Hui (JIRA)
Fei Hui created HDFS-14161:
--

 Summary: RBF: throw RetriableException instead of IOException so that the client can retry when it cannot get a connection
 Key: HDFS-14161
 URL: https://issues.apache.org/jira/browse/HDFS-14161
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Fei Hui








Re: [VOTE - 2] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby

2018-12-18 Thread Chao Sun
+1 (no-binding)

At Uber we've deployed a subset of the features covered in this JIRA in
production at multiple data centers, and it's been running great for a
while now. We are seeing huge benefits in scaling our NameNode
throughput and in providing better SLA guarantees for applications such as
Presto. We are very much looking forward to trying out the full feature set
once this is merged into trunk.

Chao

On Tue, Dec 18, 2018 at 5:29 PM Jonathan Hung  wrote:

> +1!
>
> Jonathan Hung
>
>
> On Sat, Dec 15, 2018 at 8:26 AM Zhe Zhang  wrote:
>
> > +1
> >
> > Thanks for addressing concerns from the previous vote.
> >
> > > On Fri, Dec 14, 2018 at 6:24 PM Konstantin Shvachko <shv.had...@gmail.com>
> > > wrote:
> > >
> > > Hi Hadoop developers,
> > >
> > > I would like to propose to merge to trunk the feature branch HDFS-12943
> > > for Consistent Reads from Standby Node. The feature is intended to scale
> > > read RPC workloads. On large clusters reads comprise 95% of all RPCs to
> > > the NameNode. We should be able to accommodate higher overall RPC
> > > workloads (up to 4x by some estimates) by adding multiple ObserverNodes.
> > >
> > > The main functionality has been implemented; see sub-tasks of HDFS-12943.
> > > We followed up with the test plan. Testing was done on two independent
> > > clusters (see HDFS-14058 and HDFS-14059) with security enabled.
> > > We ran standard HDFS commands, MR jobs, and admin commands including
> > > manual failover.
> > > We know of one cluster running this feature in production.
> > >
> > > Since the previous vote we addressed Daryn's concern (see HDFS-13873),
> > > added documentation for the new feature, and fixed a few other jiras.
> > >
> > > I attached a unified patch to the umbrella jira for the review.
> > > Please vote on this thread. The vote will run for 7 days until Wed Dec 21.
> > >
> > > Thanks,
> > > --Konstantin
> > >
> > --
> > Zhe Zhang
> > Apache Hadoop Committer
> > http://zhe-thoughts.github.io/about/ | @oldcap
> >
>


Re: [VOTE - 2] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby

2018-12-18 Thread Jonathan Hung
+1!

Jonathan Hung




Re: [VOTE - 2] Merge HDFS-12943 branch to trunk - Consistent Reads from Standby

2018-12-18 Thread Chen Liang
+1

Thanks Konstantin for driving the merge vote!

I have been working on the development and testing of the feature. It has
been running for several weeks on our ~100-node cluster, which has HA
and Kerberos enabled. I have been able to run several different MapReduce
jobs and HDFS benchmarks (see HDFS-14058 for more detail). I feel confident
that the feature is now functionally complete and ready to merge into trunk.

Chen



[jira] [Created] (HDFS-14160) ObserverReadInvocationHandler should implement RpcInvocationHandler

2018-12-18 Thread Konstantin Shvachko (JIRA)
Konstantin Shvachko created HDFS-14160:
--

 Summary: ObserverReadInvocationHandler should implement 
RpcInvocationHandler
 Key: HDFS-14160
 URL: https://issues.apache.org/jira/browse/HDFS-14160
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko


Currently ObserverReadInvocationHandler implements InvocationHandler.
[As mentioned|https://issues.apache.org/jira/browse/HDFS-14116?focusedCommentId=16710596&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16710596] in HDFS-14116, this is the cause of Fsck failing with Observer.





[DISCUSS] Merging YARN-8200 to branch-2

2018-12-18 Thread Jonathan Hung
Hi folks,

Starting a thread to discuss merging YARN-8200 (resource profiles/GPU
support) to branch-2.

For resource types, we have ported YARN-4081~YARN-7137 (as part of
YARN-3926 umbrella).
For GPU support, we have ported the native non-docker GPU support related
items in YARN-6223.
For both of these, we have also ported miscellaneous fixes for issues we
encountered internally.

Some potential issues I see: some of the resource types commits did not
make it to branch-3.0, and most of the GPU-specific commits did not make
it to branch-3.0 either.

We have deployed these two features internally on top of a branch-2.9 fork
on a 100 node GPU cluster which is running deep learning workloads, and it
is working well.

Before the holidays/after new years we will work on cleaning up the feature
branch (YARN-8200), e.g. filing tickets on branch-2 specific bug fixes,
rebasing on latest branch-2, syncing any bug fixes in our internal fork
which did not make it to the feature branch, etc. Assuming no objections,
once it's ready we will start a vote to merge.

Thanks,
Jonathan Hung


[jira] [Created] (HDDS-939) Add S3 access check to Ozone manager

2018-12-18 Thread Anu Engineer (JIRA)
Anu Engineer created HDDS-939:
-

 Summary: Add S3 access check to Ozone manager
 Key: HDDS-939
 URL: https://issues.apache.org/jira/browse/HDDS-939
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager, S3
Reporter: Anu Engineer
Assignee: Dinesh Chitlangia


Add the mapping from S3 user identity to UGI inside Ozone Manager. Also add
the access permission check, that is, a call into checkAccess, which can be
intercepted by Ranger or the Ozone access check.
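A hedged sketch of the flow described above: resolve the S3 access key to a user identity, then delegate to a pluggable checkAccess hook that an authorizer such as Ranger could intercept. All names and signatures here are illustrative assumptions, not the actual Ozone Manager API.

```java
import java.util.HashMap;
import java.util.Map;

public class Main {

    // Pluggable access-check hook; an external authorizer could implement it.
    interface Authorizer {
        boolean checkAccess(String user, String resource, String action);
    }

    // Hypothetical mapping from S3 access key to UGI short name.
    static final Map<String, String> S3_KEY_TO_USER = new HashMap<>();
    static {
        S3_KEY_TO_USER.put("AKIDEXAMPLE", "hadoop");
    }

    static boolean allow(Authorizer auth, String s3Key, String bucket, String action) {
        String user = S3_KEY_TO_USER.get(s3Key);
        if (user == null) return false;              // unknown S3 identity
        return auth.checkAccess(user, bucket, action); // interception point
    }

    public static void main(String[] args) {
        // Default authorizer for the sketch: allow only the mapped user.
        Authorizer defaultAuth = (user, resource, action) -> user.equals("hadoop");
        System.out.println(allow(defaultAuth, "AKIDEXAMPLE", "bucket-1", "READ")); // true
        System.out.println(allow(defaultAuth, "UNKNOWNKEY", "bucket-1", "READ"));  // false
    }
}
```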





[jira] [Created] (HDDS-938) Add Client APIs for using S3 Auth interface

2018-12-18 Thread Anu Engineer (JIRA)
Anu Engineer created HDDS-938:
-

 Summary: Add Client APIs for using S3 Auth interface
 Key: HDDS-938
 URL: https://issues.apache.org/jira/browse/HDDS-938
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: S3
Reporter: Anu Engineer
Assignee: Dinesh Chitlangia


Add Client and Server APIs to access the S3 Auth interface supported by Ozone 
Manager.





[jira] [Created] (HDDS-937) Create an S3 Auth Table

2018-12-18 Thread Anu Engineer (JIRA)
Anu Engineer created HDDS-937:
-

 Summary: Create an S3 Auth Table
 Key: HDDS-937
 URL: https://issues.apache.org/jira/browse/HDDS-937
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: S3
Reporter: Anu Engineer
Assignee: Dinesh Chitlangia


Add an S3 auth table to Ozone Manager. This table allows us to create mappings
from S3 keys to UGIs.





[jira] [Created] (HDDS-936) Need a tool to map containers to ozone objects

2018-12-18 Thread Jitendra Nath Pandey (JIRA)
Jitendra Nath Pandey created HDDS-936:
-

 Summary: Need a tool to map containers to ozone objects
 Key: HDDS-936
 URL: https://issues.apache.org/jira/browse/HDDS-936
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Reporter: Jitendra Nath Pandey
Assignee: sarun singla


Ozone should have a tool to get the list of objects that a container contains.





Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2018-12-18 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/

[Dec 17, 2018 5:54:57 AM] (surendralilhore) HDFS-14096. [SPS] : Add Support for 
Storage Policy Satisfier in ViewFs.
[Dec 17, 2018 11:04:40 AM] (stevel) HADOOP-15969. ABFS: getNamespaceEnabled can 
fail blocking user access
[Dec 17, 2018 11:10:22 AM] (stevel) HADOOP-15972 ABFS: reduce list page size to 
to 500.
[Dec 17, 2018 11:15:20 AM] (stevel) HADOOP-16004. ABFS: Convert 404 error 
response in AbfsInputStream and
[Dec 17, 2018 5:04:25 PM] (eyang) YARN-9040.  Fixed memory leak in 
LevelDBCacheTimelineStore and
[Dec 17, 2018 8:18:47 PM] (aengineer) HDDS-908: NPE in TestOzoneRpcClient. 
Contributed by Ajay Kumar.
[Dec 17, 2018 11:40:22 PM] (xyao) HDDS-99. Adding SCM Audit log. Contributed by 
Dinesh Chitlangia.




-1 overall


The following subsystems voted -1:
asflicense findbugs hadolint pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.registry.secure.TestSecureLogins 
   
hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyPersistReplicaRecovery 
   hadoop.hdfs.web.TestWebHdfsTimeouts 
   hadoop.yarn.sls.TestSLSRunner 
  

   cc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/diff-compile-javac-root.txt
  [336K]

   checkstyle:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/diff-checkstyle-root.txt
  [17M]

   hadolint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   pathlen:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/diff-patch-pylint.txt
  [60K]

   shellcheck:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/diff-patch-shellcheck.txt
  [20K]

   shelldocs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/diff-patch-shelldocs.txt
  [12K]

   whitespace:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/whitespace-eol.txt
  [9.3M]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/whitespace-tabs.txt
  [1.1M]

   findbugs:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-hdds_client.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-hdds_container-service.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-hdds_framework.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-hdds_server-scm.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-hdds_tools.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-ozone_client.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-ozone_common.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-ozone_objectstore-service.txt
  [8.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-ozone_ozone-manager.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-ozone_ozonefs.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-ozone_s3gateway.txt
  [4.0K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/branch-findbugs-hadoop-ozone_tools.txt
  [8.0K]

   javadoc:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/diff-javadoc-javadoc-root.txt
  [752K]

   unit:

   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/patch-unit-hadoop-common-project_hadoop-registry.txt
  [12K]
   
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/991/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [328K]
   

[jira] [Created] (HDFS-14159) Backporting HDFS-12882 to branch-3.0: Support full open(PathHandle) contract in HDFS

2018-12-18 Thread Adam Antal (JIRA)
Adam Antal created HDFS-14159:
-

 Summary: Backporting HDFS-12882 to branch-3.0: Support full 
open(PathHandle) contract in HDFS
 Key: HDFS-14159
 URL: https://issues.apache.org/jira/browse/HDFS-14159
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Affects Versions: 3.0.3
Reporter: Adam Antal
Assignee: Adam Antal


This task aims to backport HDFS-12882 and some related commits to branch-3.0 
without introducing API incompatibilities.

In order to backport cleanly, HDFS-7878 and then HDFS-12877 should be 
backported to that branch first (both apply cleanly and build successfully).

Also, this patch would introduce backward-incompatible API changes in 
hadoop-hdfs-client, and we should modify it into a compatible change 
(HDFS-13830 dealt with a similar problem).

 





[jira] [Created] (HDFS-14158) Checkpointer always triggers after 5 minutes

2018-12-18 Thread Timo Walter (JIRA)
Timo Walter created HDFS-14158:
--

 Summary: Checkpointer always triggers after 5 minutes
 Key: HDFS-14158
 URL: https://issues.apache.org/jira/browse/HDFS-14158
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.8.1
Reporter: Timo Walter


The Checkpointer always triggers a checkpoint every 5 minutes and ignores the 
flag "*dfs.namenode.checkpoint.period*".

See the code below (in Checkpointer.java):
{code:java}
//Main work loop of the Checkpointer
public void run() {
  // Check the size of the edit log once every 5 minutes.
  long periodMSec = 5 * 60;   // 5 minutes
  if(checkpointConf.getPeriod() < periodMSec) {
periodMSec = checkpointConf.getPeriod();
  }
{code}
If the configured period ("*dfs.namenode.checkpoint.period*") is lower than 5 
minutes, the configured value is used. But the configuration is always ignored 
if it is greater than 5 minutes.

 

In my opinion, the if-expression should be:
{code:java}
if (checkpointConf.getPeriod() > periodMSec) {
  periodMSec = checkpointConf.getPeriod();
}
{code}
 

Then "*dfs.namenode.checkpoint.period*" won't get ignored.
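The effect of the two comparisons can be checked with a small standalone program. Values are in seconds for readability, and only the comparison logic mirrors the snippet from Checkpointer.java; the rest is scaffolding for the sketch.

```java
public class Main {

    // Current behavior: the effective period is capped at 5 minutes,
    // so a larger configured dfs.namenode.checkpoint.period is ignored.
    static long effectivePeriodCurrent(long configured) {
        long period = 5 * 60;                      // 5 minutes, as in the snippet
        if (configured < period) period = configured;
        return period;
    }

    // Proposed behavior: a larger configured period wins.
    static long effectivePeriodProposed(long configured) {
        long period = 5 * 60;
        if (configured > period) period = configured;
        return period;
    }

    public static void main(String[] args) {
        long oneHour = 60 * 60;                    // configured period: 1 hour
        System.out.println(effectivePeriodCurrent(oneHour));  // 300 -- config ignored
        System.out.println(effectivePeriodProposed(oneHour)); // 3600 -- config honored
    }
}
```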





[jira] [Created] (HDDS-935) Avoid creating an already created container on a datanode in case of disk removal followed by datanode restart

2018-12-18 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-935:


 Summary: Avoid creating an already created container on a datanode 
in case of disk removal followed by datanode restart
 Key: HDDS-935
 URL: https://issues.apache.org/jira/browse/HDDS-935
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Affects Versions: 0.4.0
Reporter: Rakesh R
Assignee: Shashikant Banerjee


Currently, a container gets created when a writeChunk request reaches 
HddsDispatcher and the container does not already exist. If the disk on which a 
container lives gets removed and the datanode restarts, a subsequent writeChunk 
request might end up creating the same container again with an updated BCSID, 
because the datanode won't detect that the disk was removed. SCM won't detect 
it either, since the container will report the latest BCSID. This Jira aims to 
address this issue.

The proposed fix is to persist all the containerIds existing in the 
containerSet in the snapshot file whenever a Ratis snapshot is taken. If a disk 
is removed and the datanode restarts, the container set is rebuilt by scanning 
all the available disks, while the container list stored in the snapshot file 
gives all the containers ever created on the datanode. The diff between the two 
gives the exact list of containers which were created but were not found after 
the restart. Any writeChunk request should now be validated against this list 
of missing containers. Also, we need to ensure container creation does not 
happen as part of applyTransaction of a writeChunk request in Ratis.
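The proposed diff-based detection can be sketched as follows. The method names are hypothetical; only the set arithmetic reflects the proposal.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class Main {

    // Container IDs persisted at snapshot time, minus the set rebuilt from the
    // surviving disks after restart, yields the containers that existed but
    // can no longer be found.
    static Set<Long> missingContainers(Set<Long> persistedInSnapshot,
                                       Set<Long> rebuiltFromDisks) {
        Set<Long> missing = new HashSet<>(persistedInSnapshot);
        missing.removeAll(rebuiltFromDisks);   // in snapshot but on no disk
        return missing;
    }

    // A writeChunk request must not silently recreate a lost container.
    static boolean mayCreateOnWriteChunk(long containerId, Set<Long> missing) {
        return !missing.contains(containerId);
    }

    public static void main(String[] args) {
        Set<Long> snapshot = new HashSet<>(Arrays.asList(1L, 2L, 3L)); // at snapshot time
        Set<Long> onDisk   = new HashSet<>(Arrays.asList(1L, 3L));     // after disk loss
        Set<Long> missing  = missingContainers(snapshot, onDisk);
        System.out.println(missing);                            // prints [2]
        System.out.println(mayCreateOnWriteChunk(2L, missing)); // false: lost container
        System.out.println(mayCreateOnWriteChunk(4L, missing)); // true: genuinely new
    }
}
```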





[jira] [Created] (HDDS-934) freon run hung and did not terminate when run on non-functional pipeline

2018-12-18 Thread Nilotpal Nandi (JIRA)
Nilotpal Nandi created HDDS-934:
---

 Summary: freon run hung and did not terminate when run on 
non-functional pipeline
 Key: HDDS-934
 URL: https://issues.apache.org/jira/browse/HDDS-934
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Client
Reporter: Nilotpal Nandi
 Attachments: jstack.txt

Steps taken:
 # created a docker cluster with 3 datanodes where the datanodes are not able to 
communicate with each other
 # ran freon to write 1 key

The freon run threw the following exception but did not terminate.

exception:

--

 
{noformat}
2018-12-18 10:38:06 INFO RandomKeyGenerator:227 - Number of Threads: 10
2018-12-18 10:38:06 INFO RandomKeyGenerator:233 - Number of Volumes: 1.
2018-12-18 10:38:06 INFO RandomKeyGenerator:234 - Number of Buckets per Volume: 
1.
2018-12-18 10:38:06 INFO RandomKeyGenerator:235 - Number of Keys per Bucket: 1.
2018-12-18 10:38:06 INFO RandomKeyGenerator:236 - Key size: 10240 bytes
2018-12-18 10:38:06 INFO RandomKeyGenerator:266 - Starting progress bar Thread.
0.00% |█ | 0/1 Time: 0:00:002018-12-18 10:38:06 INFO RpcClient:250 - Creating 
Volume: vol-0-74492, with hadoop as owner and quota set to 1152921504606846976 
bytes.
2018-12-18 10:38:06 INFO RpcClient:379 - Creating Bucket: 
vol-0-74492/bucket-0-16002, with Versioning false and Storage Type set to DISK
 0.00% |█ | 0/1 Time: 0:02:402018-12-18 10:40:46 ERROR 
ChunkGroupOutputStream:275 - Try to allocate more blocks for write failed, 
already allocated 0 blocks for this write.
2018-12-18 10:40:46 ERROR RandomKeyGenerator:624 - Exception while adding key: 
key-0-28925 in bucket: org.apache.hadoop.ozone.client.OzoneBucket@1675c402 of 
volume: org.apache.hadoop.ozone.client.OzoneVolume@5b6bfafd.
java.io.IOException: Allocate block failed, error:INTERNAL_ERROR
 at 
org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.allocateBlock(OzoneManagerProtocolClientSideTranslatorPB.java:620)
 at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.allocateNewBlock(ChunkGroupOutputStream.java:437)
 at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleWrite(ChunkGroupOutputStream.java:272)
 at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleException(ChunkGroupOutputStream.java:377)
 at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleFlushOrClose(ChunkGroupOutputStream.java:473)
 at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleFlushOrClose(ChunkGroupOutputStream.java:474)
 at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleFlushOrClose(ChunkGroupOutputStream.java:474)
 at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleFlushOrClose(ChunkGroupOutputStream.java:474)
 at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.handleWrite(ChunkGroupOutputStream.java:309)
 at 
org.apache.hadoop.ozone.client.io.ChunkGroupOutputStream.write(ChunkGroupOutputStream.java:255)
 at 
org.apache.hadoop.ozone.client.io.OzoneOutputStream.write(OzoneOutputStream.java:49)
 at java.io.OutputStream.write(OutputStream.java:75)
 at 
org.apache.hadoop.ozone.freon.RandomKeyGenerator$OfflineProcessor.run(RandomKeyGenerator.java:606)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
 0.00% |█ | 0/1 Time: 0:11:01
 0.00% |█ | 0/1 Time: 0:11:19
{noformat}
Here is the jstack for the freon process :

[^jstack.txt]

 


