[jira] [Created] (HDDS-1190) Fix jdk 11 issue for ozonesecure base image and docker-compose
Xiaoyu Yao created HDDS-1190: Summary: Fix jdk 11 issue for ozonesecure base image and docker-compose Key: HDDS-1190 URL: https://issues.apache.org/jira/browse/HDDS-1190 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Xiaoyu Yao Assignee: Xiaoyu Yao

HDDS-1019 changed the ozonesecure docker-compose to use hadoop-runner as the base image. There are a few issues that need to be fixed:
1. The hadoop-runner image uses jdk11, but ozonesecure/docker-config assumes openjdk8 for JAVA_HOME.
2. The KEYTAB_DIR value needs to be quoted with '.
3. Keytab-based login fails with "Message stream modified (41)". [~elek] mentioned in HDDS-1019 that we need to add max_renewable_life to "docker-image/docker-krb5/krb5.conf", as follows:

[realms]
EXAMPLE.COM = {
  kdc = localhost
  admin_server = localhost
  max_renewable_life = 7d
}

Failures:
{code}
org.apache.hadoop.security.KerberosAuthException: failure to login: for principal: scm/s...@example.com from keytab /etc/security/keytabs/scm.keytab javax.security.auth.login.LoginException: Message stream modified (41)
scm_1 | at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1847)
{code}

-- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
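Taken together, the three fixes in HDDS-1190 amount to small config edits. A sketch of what they might look like (the krb5.conf realm block is from [~elek]'s comment; the JAVA_HOME path is an assumption, not taken from the final patch):

```ini
# docker-image/docker-krb5/krb5.conf: allow renewable tickets so jdk11
# keytab login does not fail with "Message stream modified (41)"
[realms]
  EXAMPLE.COM = {
    kdc = localhost
    admin_server = localhost
    max_renewable_life = 7d
  }
```

```ini
# ozonesecure/docker-config (sketch): point JAVA_HOME at the jdk11 shipped
# in hadoop-runner (path assumed for illustration) and quote KEYTAB_DIR
JAVA_HOME=/usr/lib/jvm/jre-11
KEYTAB_DIR='/etc/security/keytabs'
```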
[jira] [Created] (HDDS-1189) Recon DB schema and ORM
Siddharth Wagle created HDDS-1189: - Summary: Recon DB schema and ORM Key: HDDS-1189 URL: https://issues.apache.org/jira/browse/HDDS-1189 Project: Hadoop Distributed Data Store Issue Type: Sub-task Affects Versions: 0.5.0 Reporter: Siddharth Wagle Assignee: Aravindan Vijayan Fix For: 0.5.0 _Objectives_ - Define V1 of the db schema for the Recon service. - The current proposal is to use jOOQ as the ORM for SQL interaction, for two main reasons: a) a powerful query DSL that abstracts out SQL dialects, and b) seamless code-to-schema and schema-to-code transitions, which is critical for creating DDL through code and for unit testing across versions of the application. - Add an e2e unit test suite for Recon entities, created based on the design doc.
[jira] [Created] (HDDS-1188) Implement a skeleton patch for Recon server with initial set of interfaces
Siddharth Wagle created HDDS-1188: - Summary: Implement a skeleton patch for Recon server with initial set of interfaces Key: HDDS-1188 URL: https://issues.apache.org/jira/browse/HDDS-1188 Project: Hadoop Distributed Data Store Issue Type: Sub-task Affects Versions: 0.5.0 Reporter: Siddharth Wagle Assignee: Siddharth Wagle Fix For: 0.5.0 Jira to define package structure, maven module, recon server application and initial db schema.
[jira] [Resolved] (HDDS-451) PutKey failed due to error "Rejecting write chunk request. Chunk overwrite without explicit request"
[ https://issues.apache.org/jira/browse/HDDS-451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo Nicholas Sze resolved HDDS-451. -- Resolution: Cannot Reproduce Resolving as "Cannot Reproduce". > PutKey failed due to error "Rejecting write chunk request. Chunk overwrite > without explicit request" > > > Key: HDDS-451 > URL: https://issues.apache.org/jira/browse/HDDS-451 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: Ozone Client >Affects Versions: 0.2.1 >Reporter: Nilotpal Nandi >Assignee: Shashikant Banerjee >Priority: Blocker > Labels: alpha2 > Attachments: all-node-ozone-logs-1536841590.tar.gz > > > steps taken : > -- > # Ran Put Key command to write 50GB data. Put Key client operation failed > after 17 mins. > error seen ozone.log : > > > {code} > 2018-09-13 12:11:53,734 [ForkJoinPool.commonPool-worker-20] DEBUG > (ChunkManagerImpl.java:85) - writing > chunk:bd80b58a5eba888200a4832a0f2aafb3_stream_5f3b2505-6964-45c9-a7ad-827388a1e6a0_chunk_1 > chunk stage:COMMIT_DATA chunk > file:/tmp/hadoop-root/dfs/data/hdds/de0a9e01-4a12-40e3-b567-51b9bd83248e/current/containerDir0/16/chunks/bd80b58a5eba888200a4832a0f2aafb3_stream_5f3b2505-6964-45c9-a7ad-827388a1e6a0_chunk_1 > tmp chunk file > 2018-09-13 12:11:56,576 [pool-3-thread-60] DEBUG (ChunkManagerImpl.java:85) - > writing > chunk:bd80b58a5eba888200a4832a0f2aafb3_stream_5f3b2505-6964-45c9-a7ad-827388a1e6a0_chunk_2 > chunk stage:WRITE_DATA chunk > file:/tmp/hadoop-root/dfs/data/hdds/de0a9e01-4a12-40e3-b567-51b9bd83248e/current/containerDir0/16/chunks/bd80b58a5eba888200a4832a0f2aafb3_stream_5f3b2505-6964-45c9-a7ad-827388a1e6a0_chunk_2 > tmp chunk file > 2018-09-13 12:11:56,739 [ForkJoinPool.commonPool-worker-20] DEBUG > (ChunkManagerImpl.java:85) - writing > chunk:bd80b58a5eba888200a4832a0f2aafb3_stream_5f3b2505-6964-45c9-a7ad-827388a1e6a0_chunk_2 > chunk stage:COMMIT_DATA chunk > 
file:/tmp/hadoop-root/dfs/data/hdds/de0a9e01-4a12-40e3-b567-51b9bd83248e/current/containerDir0/16/chunks/bd80b58a5eba888200a4832a0f2aafb3_stream_5f3b2505-6964-45c9-a7ad-827388a1e6a0_chunk_2 > tmp chunk file > 2018-09-13 12:12:21,410 [Datanode State Machine Thread - 0] DEBUG > (DatanodeStateMachine.java:148) - Executing cycle Number : 206 > 2018-09-13 12:12:51,411 [Datanode State Machine Thread - 0] DEBUG > (DatanodeStateMachine.java:148) - Executing cycle Number : 207 > 2018-09-13 12:12:53,525 [BlockDeletingService#1] DEBUG > (TopNOrderedContainerDeletionChoosingPolicy.java:79) - Stop looking for next > container, there is no pending deletion block contained in remaining > containers. > 2018-09-13 12:12:55,048 [Datanode ReportManager Thread - 1] DEBUG > (ContainerSet.java:191) - Starting container report iteration. > 2018-09-13 12:13:02,626 [pool-3-thread-1] ERROR (ChunkUtils.java:244) - > Rejecting write chunk request. Chunk overwrite without explicit request. > ChunkInfo{chunkName='bd80b58a5eba888200a4832a0f2aafb3_stream_5f3b2505-6964-45c9-a7ad-827388a1e6a0_chunk_2, > offset=0, len=16777216} > 2018-09-13 12:13:03,035 [pool-3-thread-1] INFO (ContainerUtils.java:149) - > Operation: WriteChunk : Trace ID: 54834b29-603d-4ba9-9d68-0885215759d8 : > Message: Rejecting write chunk request. OverWrite flag > required.ChunkInfo{chunkName='bd80b58a5eba888200a4832a0f2aafb3_stream_5f3b2505-6964-45c9-a7ad-827388a1e6a0_chunk_2, > offset=0, len=16777216} : Result: OVERWRITE_FLAG_REQUIRED > 2018-09-13 12:13:03,037 [ForkJoinPool.commonPool-worker-11] ERROR > (ChunkUtils.java:244) - Rejecting write chunk request. Chunk overwrite > without explicit request. 
> ChunkInfo{chunkName='bd80b58a5eba888200a4832a0f2aafb3_stream_5f3b2505-6964-45c9-a7ad-827388a1e6a0_chunk_2, > offset=0, len=16777216} > 2018-09-13 12:13:03,037 [ForkJoinPool.commonPool-worker-11] INFO > (ContainerUtils.java:149) - Operation: WriteChunk : Trace ID: > 54834b29-603d-4ba9-9d68-0885215759d8 : Message: Rejecting write chunk > request. OverWrite flag > required.ChunkInfo{chunkName='bd80b58a5eba888200a4832a0f2aafb3_stream_5f3b2505-6964-45c9-a7ad-827388a1e6a0_chunk_2, > offset=0, len=16777216} : Result: OVERWRITE_FLAG_REQUIRED > > {code}
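The OVERWRITE_FLAG_REQUIRED rejection in the log above fires when a chunk write targets a chunk file that already exists on disk. A minimal sketch of that guard (a hypothetical simplification for illustration, not the actual ChunkUtils logic):

```java
public class ChunkOverwriteGuard {
    // Hypothetical simplification of the check behind
    // OVERWRITE_FLAG_REQUIRED: a WriteChunk targeting a chunk file that
    // already exists is rejected unless the request explicitly sets the
    // overwrite flag.
    static boolean shouldReject(boolean chunkFileExists, boolean overwriteRequested) {
        return chunkFileExists && !overwriteRequested;
    }

    public static void main(String[] args) {
        // First write: file does not exist yet, accepted.
        System.out.println(shouldReject(false, false)); // false
        // Repeated write of the same chunk (e.g. a retry): rejected.
        System.out.println(shouldReject(true, false));  // true
        // Explicit overwrite request: accepted.
        System.out.println(shouldReject(true, true));   // false
    }
}
```

The log is consistent with this shape: chunk_2 goes through WRITE_DATA and COMMIT_DATA, and a later WriteChunk for the same chunk_2 then hits the guard.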
[jira] [Created] (HDDS-1187) Healthy pipeline Chill Mode rule to consider only pipelines with replication factor three
Bharat Viswanadham created HDDS-1187: Summary: Healthy pipeline Chill Mode rule to consider only pipelines with replication factor three Key: HDDS-1187 URL: https://issues.apache.org/jira/browse/HDDS-1187 Project: Hadoop Distributed Data Store Issue Type: Bug Reporter: Bharat Viswanadham Assignee: Bharat Viswanadham A few offline comments from [~nandakumar131]: # We should not process a pipeline report from a datanode again during the calculations. # We should consider only replication factor 3 Ratis pipelines.
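The second comment above boils down to a filter over the reported pipelines. A sketch of the proposed rule (class and field names are hypothetical, not the SCM code):

```java
import java.util.Arrays;
import java.util.List;

public class HealthyPipelineRule {
    enum ReplicationType { RATIS, STAND_ALONE }
    enum ReplicationFactor { ONE, THREE }

    static class Pipeline {
        final ReplicationType type;
        final ReplicationFactor factor;
        final boolean healthy;
        Pipeline(ReplicationType type, ReplicationFactor factor, boolean healthy) {
            this.type = type;
            this.factor = factor;
            this.healthy = healthy;
        }
    }

    // Sketch of the proposed rule: only healthy Ratis pipelines with
    // replication factor THREE count toward the chill mode exit check.
    static long countQualifying(List<Pipeline> pipelines) {
        return pipelines.stream()
                .filter(p -> p.type == ReplicationType.RATIS)
                .filter(p -> p.factor == ReplicationFactor.THREE)
                .filter(p -> p.healthy)
                .count();
    }

    public static void main(String[] args) {
        List<Pipeline> pipelines = Arrays.asList(
                new Pipeline(ReplicationType.RATIS, ReplicationFactor.THREE, true),
                new Pipeline(ReplicationType.RATIS, ReplicationFactor.ONE, true),
                new Pipeline(ReplicationType.STAND_ALONE, ReplicationFactor.THREE, true));
        System.out.println(countQualifying(pipelines)); // 1
    }
}
```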
[jira] [Resolved] (HDDS-1180) TestRandomKeyGenerator fails with NPE
[ https://issues.apache.org/jira/browse/HDDS-1180?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bharat Viswanadham resolved HDDS-1180. -- Resolution: Duplicate > TestRandomKeyGenerator fails with NPE > - > > Key: HDDS-1180 > URL: https://issues.apache.org/jira/browse/HDDS-1180 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: test >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h
[jira] [Resolved] (HDDS-966) Rename ChunkGroupInputStream to KeyInputStream and ChunkInputStream to BlockInputStream
[ https://issues.apache.org/jira/browse/HDDS-966?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-966. -- Resolution: Fixed > Rename ChunkGroupInputStream to KeyInputStream and ChunkInputStream to > BlockInputStream > --- > > Key: HDDS-966 > URL: https://issues.apache.org/jira/browse/HDDS-966 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: Ozone Client >Affects Versions: 0.4.0 >Reporter: Shashikant Banerjee >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.4.0 > > > ChunkGroupInputStream reads the data for an ozone key and ChunkInputStream > reads data for a block. It would be more appropriate to rename > ChunkGroupInputStream to KeyInputStream and ChunkInputStream to > BlockInputStream.
[jira] [Resolved] (HDDS-1140) TestSCMChillModeManager is failing with NullPointerException
[ https://issues.apache.org/jira/browse/HDDS-1140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lokesh Jain resolved HDDS-1140. --- Resolution: Duplicate > TestSCMChillModeManager is failing with NullPointerException > > > Key: HDDS-1140 > URL: https://issues.apache.org/jira/browse/HDDS-1140 > Project: Hadoop Distributed Data Store > Issue Type: Sub-task > Components: SCM >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Assignee: Lokesh Jain >Priority: Major > > TestSCMChillModeManager is failing with the following exception > {code} > [ERROR] > testDisableChillMode(org.apache.hadoop.hdds.scm.chillmode.TestSCMChillModeManager) > Time elapsed: 0.012 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.utils.Scheduler.scheduleWithFixedDelay(Scheduler.java:78) > at > org.apache.hadoop.hdds.scm.pipeline.RatisPipelineUtils.scheduleFixedIntervalPipelineCreator(RatisPipelineUtils.java:211) > at > org.apache.hadoop.hdds.scm.chillmode.SCMChillModeManager.exitChillMode(SCMChillModeManager.java:137) > at > org.apache.hadoop.hdds.scm.chillmode.SCMChillModeManager.(SCMChillModeManager.java:93) > at > org.apache.hadoop.hdds.scm.chillmode.TestSCMChillModeManager.testDisableChillMode(TestSCMChillModeManager.java:134) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > 
org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74) > {code}
[jira] [Resolved] (HDDS-274) Handle overreplication in ReplicationManager
[ https://issues.apache.org/jira/browse/HDDS-274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton resolved HDDS-274. --- Resolution: Duplicate Thanks to [~nandakumar131] it's implemented in HDDS-896. > Handle overreplication in ReplicationManager > > > Key: HDDS-274 > URL: https://issues.apache.org/jira/browse/HDDS-274 > Project: Hadoop Distributed Data Store > Issue Type: Improvement > Components: SCM >Reporter: Elek, Marton >Assignee: Elek, Marton >Priority: Major > > HDDS-199 provides the framework to handle over/under replicated containers, > but it contains implementation only for the under replicated containers. > The over replicated containers should be handled and should be deleted from > the datanodes.
Apache Hadoop qbt Report: branch2+JDK7 on Linux/x86
For more details, see https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/ No changes -1 overall The following subsystems voted -1: asflicense findbugs hadolint pathlen unit xml The following subsystems voted -1 but were configured to be filtered/ignored: cc checkstyle javac javadoc pylint shellcheck shelldocs whitespace The following subsystems are considered long running: (runtime bigger than 1h 0m 0s) unit Specific tests: XML : Parsing Error(s): hadoop-build-tools/src/main/resources/checkstyle/checkstyle.xml hadoop-build-tools/src/main/resources/checkstyle/suppressions.xml hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/conf/empty-configuration.xml hadoop-tools/hadoop-azure/src/config/checkstyle.xml hadoop-tools/hadoop-resourceestimator/src/config/checkstyle.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/public/crossdomain.xml hadoop-yarn-project/hadoop-yarn/hadoop-yarn-ui/src/main/webapp/public/crossdomain.xml FindBugs : module:hadoop-common-project/hadoop-common Class org.apache.hadoop.fs.GlobalStorageStatistics defines non-transient non-serializable instance field map In GlobalStorageStatistics.java:instance field map In GlobalStorageStatistics.java FindBugs : module:hadoop-hdfs-project/hadoop-hdfs Dead store to state in org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Saver.save(OutputStream, INodeSymlink) At FSImageFormatPBINode.java:org.apache.hadoop.hdfs.server.namenode.FSImageFormatPBINode$Saver.save(OutputStream, INodeSymlink) At FSImageFormatPBINode.java:[line 623] FindBugs : module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-timelineservice-hbase/hadoop-yarn-server-timelineservice-hbase-client Boxed value is unboxed and then immediately reboxed in org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:then immediately reboxed in 
org.apache.hadoop.yarn.server.timelineservice.storage.common.ColumnRWHelper.readResultsWithTimestamps(Result, byte[], byte[], KeyConverter, ValueConverter, boolean) At ColumnRWHelper.java:[line 335] Failed junit tests : hadoop.util.TestBasicDiskValidator hadoop.util.TestDiskCheckerWithDiskIo hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys hadoop.registry.secure.TestSecureLogins hadoop.yarn.server.TestContainerManagerSecurity hadoop.yarn.server.timelineservice.security.TestTimelineAuthFilterForV2 cc: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/diff-compile-cc-root-jdk1.7.0_95.txt [4.0K] javac: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/diff-compile-javac-root-jdk1.7.0_95.txt [328K] cc: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/diff-compile-cc-root-jdk1.8.0_191.txt [4.0K] javac: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/diff-compile-javac-root-jdk1.8.0_191.txt [308K] checkstyle: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/diff-checkstyle-root.txt [16M] hadolint: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/diff-patch-hadolint.txt [4.0K] pathlen: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/pathlen.txt [12K] pylint: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/diff-patch-pylint.txt [24K] shellcheck: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/diff-patch-shellcheck.txt [72K] shelldocs: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/diff-patch-shelldocs.txt [8.0K] whitespace: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/whitespace-eol.txt [12M] https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/whitespace-tabs.txt 
[1.2M] xml: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/xml.txt [20K] findbugs: https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/branch-findbugs-hadoop-common-project_hadoop-common-warnings.html [8.0K] https://builds.apache.org/job/hadoop-qbt-branch2-java7-linux-x86/245/artifact/out/branch-findbugs-hadoop-hdfs-project_hadoop-hdfs-warnings.html [8.0K]
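The FindBugs warning above, "boxed value is unboxed and then immediately reboxed", flags code with this shape (an illustrative sketch, not the actual ColumnRWHelper code):

```java
import java.util.HashMap;
import java.util.Map;

public class ReboxingExample {
    // Pattern FindBugs flags: the cast to long unboxes the stored Long,
    // and the boxed return type immediately reboxes it, allocating a new
    // wrapper for no benefit.
    static Long flagged(Map<String, Long> values, String key) {
        return (long) values.get(key); // unbox + immediate rebox
    }

    // Equivalent code without the redundant conversion.
    static Long fixed(Map<String, Long> values, String key) {
        return values.get(key);
    }

    public static void main(String[] args) {
        Map<String, Long> m = new HashMap<>();
        m.put("ts", 42L);
        System.out.println(flagged(m, "ts")); // 42
        System.out.println(fixed(m, "ts"));   // 42
    }
}
```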
[jira] [Created] (HDDS-1186) Ozone S3 gateways
Elek, Marton created HDDS-1186: -- Summary: Ozone S3 gateways Key: HDDS-1186 URL: https://issues.apache.org/jira/browse/HDDS-1186 Project: Hadoop Distributed Data Store Issue Type: Task Components: S3 Reporter: Elek, Marton An S3-compatible REST gateway was implemented in HDDS-434 for 0.3.0. With the second phase (HDDS-763), multipart upload and other improvements were added (for release 0.4.0). I am opening this jira to collect all the open tasks to improve the S3 gateway for the 0.5.0 release.
[jira] [Resolved] (HDDS-1019) Use apache/hadoop-runner image to test ozone secure cluster
[ https://issues.apache.org/jira/browse/HDDS-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton resolved HDDS-1019. Resolution: Fixed Fix Version/s: 0.4.0 Thanks a lot for fixing this, [~xyao]. I also committed the trunk part. I will create a separate jira to improve the kdc image (the current one issues tickets with zero renewal time, which is not accepted by jdk 11). > Use apache/hadoop-runner image to test ozone secure cluster > --- > > Key: HDDS-1019 > URL: https://issues.apache.org/jira/browse/HDDS-1019 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Elek, Marton >Assignee: Xiaoyu Yao >Priority: Critical > Fix For: 0.4.0 > > Attachments: HDDS-1019-docker-hadoop-runner.01.patch, > HDDS-1019-docker-hadoop-runner.02.patch, > HDDS-1019-docker-hadoop-runner.03.patch, HDDS-1019-trunk.01.patch, > HDDS-1019-trunk.02.patch, HDDS-1019-trunk.03.patch > > > As of now the secure ozone cluster uses a custom image which is not based on > the apache/hadoop-runner image. There are multiple problems with that: > 1. Multiple script files which are maintained in the docker-hadoop-runner > branch are copied and duplicated in > hadoop-ozone/dist/src/main/compose/ozonesecure/docker-image/runner/scripts > 2. The user of the image is root. It creates > core-site.xml/hdfs-site.xml/ozone-site.xml as the root user, which prevents > running all the default smoke tests > 3. Building the base image with each build takes more time > I propose to check what is missing from the apache/hadoop-runner base image, > add it and use that one. > I marked it critical because of 2): it breaks the run of the acceptance test > suite.
[jira] [Resolved] (HDDS-1083) Improve error code when SCM fails to allocate block
[ https://issues.apache.org/jira/browse/HDDS-1083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elek, Marton resolved HDDS-1083. Resolution: Duplicate Agree. I think it's fixed in HDDS-1068. [~msingh]: Please reopen it if you still see the issue. > Improve error code when SCM fails to allocate block > --- > > Key: HDDS-1083 > URL: https://issues.apache.org/jira/browse/HDDS-1083 > Project: Hadoop Distributed Data Store > Issue Type: Bug > Components: SCM >Affects Versions: 0.4.0 >Reporter: Mukul Kumar Singh >Priority: Major > Fix For: 0.4.0 > > > The following KEY_ALLOCATION_ERROR doesn't display information about the > number of replicas. The error also doesn't detail whether no pipelines were > found or whether there wasn't enough space on the datanodes to create the > containers. > {code} > 2019-02-11 14:56:12 ERROR RandomKeyGenerator:621 - Exception while adding key: > key-0-91322 in bucket: org.apache.hadoop.ozone.client.OzoneBucket@24ef95df of > volume: org.apache.hadoop.ozone.client.OzoneVolume@6473c338.
> java.io.IOException: Create key failed, error:KEY_ALLOCATION_ERROR > at > org.apache.hadoop.ozone.om.protocolPB.OzoneManagerProtocolClientSideTranslatorPB.openKey(OzoneManagerProtocolClientSideTranslatorPB.java:692) > at > org.apache.hadoop.ozone.client.rpc.RpcClient.createKey(RpcClient.java:571) > at > org.apache.hadoop.ozone.client.OzoneBucket.createKey(OzoneBucket.java:274) > at > org.apache.hadoop.ozone.freon.RandomKeyGenerator$OfflineProcessor.run(RandomKeyGenerator.java:596) > at > java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > {code}
[Ozone][status] 2019.02.27
Hi all, I didn't write any summary about the recent community calls, mainly because on the last few calls developers usually just explained the current status / progress of ozone related developments (Security, HA, Tracing, etc.). There was no big news; all the information is available from the related Jiras, but the calls helped to share the current status and latest findings. The current summary from the last calls:

* The 0.4.0 (badlands) release is closer and closer. Security work is almost done (HDDS-4) and stability is also highly improved. The release is expected during March. Ajay Kumar is working as the release manager.
* HA work continues with adopting an Apache Ratis based state machine in OM (target release is 0.5.0).
* Work on the Recon service (~fsck server) has started. The design doc was just uploaded to HDDS-1084.
* Serious reliability testing is ongoing with Jepsen-like tests (using blockade) and downstream applications (e.g. Hive). Huge work has been done on the reliability side.
* Thanks to our Hive friends we adopted the async profiler endpoint from Hive (HIVE-20202).
* The first version of distributed tracing is working (some tricky context propagation inside Apache Ratis is missing). First results showed some client side problems with buffer allocations.
* A wiki page about the profiler/tracing will be created soon, and they may be demonstrated at the next call.
* We have serious build problems. A new Jenkinsfile based job is enabled for Hadoop without proper Ozone support, and all the Ozone PRs are commented with false positive findings. A fix is in progress (see comments in HADOOP-16035).

Closing with the usual lines: Ozone community calls are 100% open weekly calls [1]. Feel free to join if you have any questions/suggestions/concerns. Thanks, Marton [1]: https://cwiki.apache.org/confluence/display/HADOOP/Ozone+Community+Calls
[jira] [Created] (HDDS-1185) Optimize GetFileStatus in OzoneFileSystem by reducing the number of rpc call to OM.
Mukul Kumar Singh created HDDS-1185: --- Summary: Optimize GetFileStatus in OzoneFileSystem by reducing the number of rpc call to OM. Key: HDDS-1185 URL: https://issues.apache.org/jira/browse/HDDS-1185 Project: Hadoop Distributed Data Store Issue Type: Sub-task Components: Ozone Filesystem Affects Versions: 0.4.0 Reporter: Mukul Kumar Singh Assignee: Mukul Kumar Singh Fix For: 0.4.0 Attachments: HDDS-1185.001.patch GetFileStatus sends multiple rpc calls to the Ozone Manager to fetch the file status for a given file. This can be optimized by performing all the processing on the OzoneManager.
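The optimization in HDDS-1185 can be pictured with a toy model: today the client probes the OM for the file form and the directory form of a path separately, while a single server-side getFileStatus call can do both lookups in one round trip. Everything below (FakeOzoneManager, the trailing-slash directory convention) is a hypothetical sketch, not the actual OM code:

```java
import java.util.HashMap;
import java.util.Map;

public class FileStatusSketch {
    // Hypothetical, heavily simplified OM: the key table maps key names to
    // sizes, and directories are modeled as keys with a trailing slash.
    static class FakeOzoneManager {
        final Map<String, Long> keyTable = new HashMap<>();
        int rpcCount = 0; // counts simulated round trips

        Long lookupKey(String key) { // one RPC per call
            rpcCount++;
            return keyTable.get(key);
        }

        // Proposed shape of the optimization: one server-side call resolves
        // both the file and the directory form of the path.
        String getFileStatus(String path) {
            rpcCount++;
            if (keyTable.containsKey(path)) return "FILE";
            if (keyTable.containsKey(path + "/")) return "DIRECTORY";
            return "NOT_FOUND";
        }
    }

    // Pre-optimization client-side resolution: up to two round trips.
    static String clientSideStatus(FakeOzoneManager om, String path) {
        if (om.lookupKey(path) != null) return "FILE";
        if (om.lookupKey(path + "/") != null) return "DIRECTORY";
        return "NOT_FOUND";
    }

    public static void main(String[] args) {
        FakeOzoneManager om = new FakeOzoneManager();
        om.keyTable.put("dir/", 0L);
        System.out.println(clientSideStatus(om, "dir") + " in " + om.rpcCount + " RPC(s)");
        om.rpcCount = 0;
        System.out.println(om.getFileStatus("dir") + " in " + om.rpcCount + " RPC(s)");
    }
}
```

The same answer is produced either way; the server-side variant just halves the round trips for the directory case.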
[jira] [Created] (HDDS-1184) Parallelization of write chunks in datanodes is broken
Shashikant Banerjee created HDDS-1184: - Summary: Parallelization of write chunks in datanodes is broken Key: HDDS-1184 URL: https://issues.apache.org/jira/browse/HDDS-1184 Project: Hadoop Distributed Data Store Issue Type: Improvement Components: Ozone Datanode Affects Versions: 0.4.0 Reporter: Shashikant Banerjee Assignee: Shashikant Banerjee Fix For: 0.4.0 After the HDDS-4 branch merge, parallelization in write chunks and applyTransaction is broken.
[jira] [Resolved] (HDDS-1124) java.lang.IllegalStateException exception in datanode log
[ https://issues.apache.org/jira/browse/HDDS-1124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shashikant Banerjee resolved HDDS-1124. --- Resolution: Not A Problem Fix Version/s: 0.5.0 > java.lang.IllegalStateException exception in datanode log > - > > Key: HDDS-1124 > URL: https://issues.apache.org/jira/browse/HDDS-1124 > Project: Hadoop Distributed Data Store > Issue Type: Bug >Reporter: Nilotpal Nandi >Assignee: Shashikant Banerjee >Priority: Major > Fix For: 0.5.0 > > > steps taken : > > # created 12 datanodes cluster and running workload on all the nodes > exception seen : > --- > > {noformat} > 2019-02-15 10:15:53,355 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: Rolled log segment from > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3036 > to > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_3036-3047 > 2019-02-15 10:15:53,367 INFO org.apache.ratis.server.impl.RaftServerImpl: > 943007c8-4fdd-4926-89e2-2c8c52c05073: set configuration 3048: > [a40a7b01-a30b-469c-b373-9fcb20a126ed:172.27.54.212:9858, > 8c77b16b-8054-49e3-b669-1ff759cfd271:172.27.23.196:9858, > 943007c8-4fdd-4926-89e2-2c8c52c05073:172.27.76.72:9858], old=null at 3048 > 2019-02-15 10:15:53,523 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: created new log segment > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3048 > 2019-02-15 10:15:53,580 ERROR org.apache.ratis.grpc.server.GrpcLogAppender: > Failed onNext serverReply { > requestorId: "943007c8-4fdd-4926-89e2-2c8c52c05073" > replyId: "a40a7b01-a30b-469c-b373-9fcb20a126ed" > raftGroupId { > id: "\001\323\357*\221,O\300\200\266\001#C\327j\333" > } > success: true > } > term: 3 > nextIndex: 3049 > followerCommit: 3047 > java.lang.IllegalStateException: reply's next index is 3049, request's > 
previous is term: 1 > index: 3047 > at org.apache.ratis.util.Preconditions.assertTrue(Preconditions.java:60) > at > org.apache.ratis.grpc.server.GrpcLogAppender.onSuccess(GrpcLogAppender.java:285) > at > org.apache.ratis.grpc.server.GrpcLogAppender$AppendLogResponseHandler.onNextImpl(GrpcLogAppender.java:230) > at > org.apache.ratis.grpc.server.GrpcLogAppender$AppendLogResponseHandler.onNext(GrpcLogAppender.java:215) > at > org.apache.ratis.grpc.server.GrpcLogAppender$AppendLogResponseHandler.onNext(GrpcLogAppender.java:197) > at > org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onMessage(ClientCalls.java:421) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33) > at > org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onMessage(ForwardingClientCallListener.java:33) > at > org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:519) > at > org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37) > at > org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > at java.lang.Thread.run(Thread.java:748) > 2019-02-15 10:15:56,442 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: Rolling segment > log-3048_3066 to index:3066 > 2019-02-15 10:15:56,442 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: Rolled log segment from > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3048 > to > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_3048-3066 > 
2019-02-15 10:15:56,564 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: created new log segment > /data/disk1/ozone/meta/ratis/01d3ef2a-912c-4fc0-80b6-012343d76adb/current/log_inprogress_3067 > 2019-02-15 10:16:45,420 INFO org.apache.ratis.server.storage.RaftLogWorker: > 943007c8-4fdd-4926-89e2-2c8c52c05073-RaftLogWorker: Rolling segment > log-3067_3077 to index:3077 > {noformat} >
[jira] [Resolved] (HDDS-1125) java.lang.InterruptedException seen in datanode logs
[ https://issues.apache.org/jira/browse/HDDS-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shashikant Banerjee resolved HDDS-1125.
---------------------------------------
    Resolution: Not A Problem
    Fix Version/s: 0.4.0

> java.lang.InterruptedException seen in datanode logs
> ----------------------------------------------------
>
>           Key: HDDS-1125
>           URL: https://issues.apache.org/jira/browse/HDDS-1125
>       Project: Hadoop Distributed Data Store
>    Issue Type: Bug
>      Reporter: Nilotpal Nandi
>      Assignee: Shashikant Banerjee
>      Priority: Major
>       Fix For: 0.4.0
>
> Steps taken:
> # Created a 12-datanode cluster and ran workload on all the nodes.
>
> Exception seen:
>
> {noformat}
> 2019-02-15 10:16:48,713 ERROR org.apache.ratis.server.impl.LogAppender: 943007c8-4fdd-4926-89e2-2c8c52c05073: Failed readStateMachineData for (t:3, i:3084), STATEMACHINELOGENTRY, client-632E77ADA885, cid=6232
> java.lang.InterruptedException
>     at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:347)
>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>     at org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:433)
>     at org.apache.ratis.util.DataQueue.pollList(DataQueue.java:133)
>     at org.apache.ratis.server.impl.LogAppender.createRequest(LogAppender.java:171)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:152)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.runAppenderImpl(GrpcLogAppender.java:96)
>     at org.apache.ratis.server.impl.LogAppender.runAppender(LogAppender.java:101)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-02-15 10:16:48,714 ERROR org.apache.ratis.server.impl.LogAppender: GrpcLogAppender(943007c8-4fdd-4926-89e2-2c8c52c05073 -> 8c77b16b-8054-49e3-b669-1ff759cfd271) hit IOException while loading raft log
> org.apache.ratis.server.storage.RaftLogIOException: 943007c8-4fdd-4926-89e2-2c8c52c05073: Failed readStateMachineData for (t:3, i:3084), STATEMACHINELOGENTRY, client-632E77ADA885, cid=6232
>     at org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:440)
>     at org.apache.ratis.util.DataQueue.pollList(DataQueue.java:133)
>     at org.apache.ratis.server.impl.LogAppender.createRequest(LogAppender.java:171)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:152)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.runAppenderImpl(GrpcLogAppender.java:96)
>     at org.apache.ratis.server.impl.LogAppender.runAppender(LogAppender.java:101)
>     at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.InterruptedException
>     at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:347)
>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>     at org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:433)
>     ... 6 more
> 2019-02-15 10:16:48,715 ERROR org.apache.ratis.server.impl.LogAppender: 943007c8-4fdd-4926-89e2-2c8c52c05073: Failed readStateMachineData for (t:3, i:3084), STATEMACHINELOGENTRY, client-632E77ADA885, cid=6232
> java.lang.InterruptedException
>     at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:347)
>     at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
>     at org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:433)
>     at org.apache.ratis.util.DataQueue.pollList(DataQueue.java:133)
>     at org.apache.ratis.server.impl.LogAppender.createRequest(LogAppender.java:171)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:152)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.runAppenderImpl(GrpcLogAppender.java:96)
>     at org.apache.ratis.server.impl.LogAppender.runAppender(LogAppender.java:101)
>     at java.lang.Thread.run(Thread.java:748)
> 2019-02-15 10:16:48,715 ERROR org.apache.ratis.server.impl.LogAppender: GrpcLogAppender(943007c8-4fdd-4926-89e2-2c8c52c05073 -> a40a7b01-a30b-469c-b373-9fcb20a126ed) hit IOException while loading raft log
> org.apache.ratis.server.storage.RaftLogIOException: 943007c8-4fdd-4926-89e2-2c8c52c05073: Failed readStateMachineData for (t:3, i:3084), STATEMACHINELOGENTRY, client-632E77ADA885, cid=6232
>     at org.apache.ratis.server.storage.RaftLog$EntryWithData.getEntry(RaftLog.java:440)
>     at org.apache.ratis.util.DataQueue.pollList(DataQueue.java:133)
>     at org.apache.ratis.server.impl.LogAppender.createRequest(LogAppender.java:171)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.appendLog(GrpcLogAppender.java:152)
>     at org.apache.ratis.grpc.server.GrpcLogAppender.runAppenderImpl(GrpcLogAppender.java:96)
>     at
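The InterruptedException in the trace above comes from CompletableFuture.get(): when the appender thread is interrupted while blocked waiting on state-machine data, get() throws, and Ratis wraps the failure as a RaftLogIOException. A minimal sketch of that pattern follows; the class and method names here are illustrative stand-ins, not Ratis code.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class InterruptDemo {
    // Mirrors the pattern in the trace: block on a future, and surface an
    // interrupt as a checked IOException (Ratis uses RaftLogIOException).
    static String getEntry(CompletableFuture<String> data) throws IOException {
        try {
            return data.get(); // throws InterruptedException if the thread is interrupted
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IOException("Failed readStateMachineData", e);
        } catch (ExecutionException e) {
            throw new IOException(e.getCause());
        }
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<String> pending = new CompletableFuture<>();
        Thread appender = new Thread(() -> {
            try {
                getEntry(pending);
            } catch (IOException e) {
                // the wrapped cause is what shows up as "Caused by:" in the log
                System.out.println("wrapped: " + e.getCause().getClass().getSimpleName());
            }
        });
        appender.start();
        appender.interrupt(); // e.g. a shutdown interrupting the blocked appender
        appender.join();      // prints "wrapped: InterruptedException"
    }
}
```

Because the interrupt arrives during a normal shutdown of the appender rather than indicating data corruption, the issue was resolved as Not A Problem.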
[jira] [Resolved] (HDDS-582) Remove ChunkOutputStreamEntry class from ChunkGroupOutputStream
[ https://issues.apache.org/jira/browse/HDDS-582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shashikant Banerjee resolved HDDS-582.
--------------------------------------
    Resolution: Won't Do
    Fix Version/s: 0.5.0

> Remove ChunkOutputStreamEntry class from ChunkGroupOutputStream
> ---------------------------------------------------------------
>
>           Key: HDDS-582
>           URL: https://issues.apache.org/jira/browse/HDDS-582
>       Project: Hadoop Distributed Data Store
>    Issue Type: Bug
>    Components: Ozone Client
>      Reporter: Shashikant Banerjee
>      Assignee: Shashikant Banerjee
>      Priority: Major
>       Fix For: 0.5.0
>
> ChunkOutputStreamEntry holds the info for the blocks that need to be written down to the datanodes. This info can also be held in the KeyLocationInfo list, which is used to update the OM once the stream closes. ChunkOutputStreamEntry does not serve much purpose here and can therefore be removed.
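The consolidation proposed in HDDS-582 can be sketched as follows. This is a hypothetical, heavily simplified illustration of the idea, not the actual Ozone client API: the class and field names below are stand-ins for the real ChunkGroupOutputStream and KeyLocationInfo types.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the per-block location metadata reported to the OM.
class KeyLocationInfo {
    final String blockId;
    long length; // bytes written to this block

    KeyLocationInfo(String blockId) {
        this.blockId = blockId;
    }
}

// Stand-in for ChunkGroupOutputStream: instead of keeping a separate
// ChunkOutputStreamEntry per block, it records block progress directly
// in the KeyLocationInfo list and hands that list to the OM on close.
class GroupOutputStream {
    private final List<KeyLocationInfo> locations = new ArrayList<>();

    void write(String blockId, long bytes) {
        KeyLocationInfo info = new KeyLocationInfo(blockId);
        info.length = bytes;
        locations.add(info); // one entry per block, no wrapper class needed
    }

    // On close, the accumulated list is what gets reported to the OM.
    List<KeyLocationInfo> close() {
        return locations;
    }
}
```

The issue was ultimately resolved as Won't Do, so the wrapper class was kept in the real code.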