Re: Apache Hadoop 2.9.2 release plan

2018-10-31 Thread Akira Ajisaka
branch-2.9.2 has been cut, and I have moved the 2.9.2-targeted jiras to 2.9.3.
Hi committers, please set the fix version to 2.9.3 when you commit a
fix in branch-2.9.

Thanks,
Akira
Nov 1, 2018 (Thu) 13:52 Akira Ajisaka:
>
> I have been checking the blocker/critical issues fixed in branch-2 and
> not in branch-2.9 [1].
> Now all the blockers in the list have been backported to branch-2.9. If
> someone wants to backport some bugs to the 2.9.2 release, please let me
> know.
>
> Next I'll cut branch-2.9.2 and move the 2.9.2-targeted-jiras [2] to 2.9.3.
>
> -Akira
>
> [1] https://s.apache.org/2.9-candidate-jiras
> [2] https://s.apache.org/2.9.2-targeted-jiras
> Oct 29, 2018 (Mon) 17:28 Akira Ajisaka:
> >
> > Hi all,
> >
> > We have released Apache Hadoop 2.9.1 on May 5, 2018. To further
> > improve the quality of the release, I'm planning to release 2.9.2. The
> > focus of 2.9.2 will be fixing blocker/critical bugs and other
> > enhancements. So far, 189 JIRAs [1] have been fixed in branch-2.9.
> >
> > There are no blocker/critical bugs targeted for 2.9.2 now [2], so I
> > plan to cut branch-2.9.2 and create RC by the end of this week.
> >
> > If someone wants to be the release manager for 2.9.2, you can do it
> > instead of me, and I'll help you.
> >
> > Please feel free to share your thoughts.
> >
> > Thanks,
> > Akira Ajisaka
> >
> > [1] https://s.apache.org/2.9.2-fixed-jiras
> > [2] https://s.apache.org/2.9.2-targeted-jiras

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Re: Apache Hadoop 2.9.2 release plan

2018-10-31 Thread Akira Ajisaka
I have been checking the blocker/critical issues fixed in branch-2 and
not in branch-2.9 [1].
Now all the blockers in the list have been backported to branch-2.9. If
someone wants to backport some bugs to the 2.9.2 release, please let me
know.

Next I'll cut branch-2.9.2 and move the 2.9.2-targeted-jiras [2] to 2.9.3.

-Akira

[1] https://s.apache.org/2.9-candidate-jiras
[2] https://s.apache.org/2.9.2-targeted-jiras
Oct 29, 2018 (Mon) 17:28 Akira Ajisaka:
>
> Hi all,
>
> We have released Apache Hadoop 2.9.1 on May 5, 2018. To further
> improve the quality of the release, I'm planning to release 2.9.2. The
> focus of 2.9.2 will be fixing blocker/critical bugs and other
> enhancements. So far, 189 JIRAs [1] have been fixed in branch-2.9.
>
> There are no blocker/critical bugs targeted for 2.9.2 now [2], so I
> plan to cut branch-2.9.2 and create RC by the end of this week.
>
> If someone wants to be the release manager for 2.9.2, you can do it
> instead of me, and I'll help you.
>
> Please feel free to share your thoughts.
>
> Thanks,
> Akira Ajisaka
>
> [1] https://s.apache.org/2.9.2-fixed-jiras
> [2] https://s.apache.org/2.9.2-targeted-jiras




[jira] [Created] (HDDS-786) Fix the findbugs for SCMClientProtocolServer#getContainerWithPipeline

2018-10-31 Thread Yiqun Lin (JIRA)
Yiqun Lin created HDDS-786:
--

 Summary: Fix the findbugs for 
SCMClientProtocolServer#getContainerWithPipeline
 Key: HDDS-786
 URL: https://issues.apache.org/jira/browse/HDDS-786
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Yiqun Lin
Assignee: Yiqun Lin


A findbugs warning has appeared recently:
{noformat}
Dead store to remoteUser in 
org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.getContainerWithPipeline(long)
Bug type DLS_DEAD_LOCAL_STORE (click for details) 
In class org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer
In method 
org.apache.hadoop.hdds.scm.server.SCMClientProtocolServer.getContainerWithPipeline(long)
Local variable named remoteUser
At SCMClientProtocolServer.java:[line 192]
{noformat}
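For context, DLS_DEAD_LOCAL_STORE flags a local variable that is assigned but never read. A minimal illustration of the pattern and the usual fixes (illustrative code only, not the actual SCMClientProtocolServer method):

```java
public class DeadStoreDemo {
    static String getRemoteUser() {
        return "scm-client"; // stand-in for the real remote-user lookup
    }

    // The pattern findbugs flags: 'remoteUser' is written but never read,
    // so the store is dead and the lookup may be wasted work.
    static String withDeadStore(long containerID) {
        String remoteUser = getRemoteUser(); // DLS_DEAD_LOCAL_STORE
        return "container-" + containerID;
    }

    // Typical fixes: either actually use the value, as here, or drop the
    // assignment entirely if the lookup has no side effects worth keeping.
    static String fixed(long containerID) {
        String remoteUser = getRemoteUser();
        return remoteUser + " -> container-" + containerID;
    }

    public static void main(String[] args) {
        System.out.println(withDeadStore(192));
        System.out.println(fixed(192));
    }
}
```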



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Resolved] (HDFS-12026) libhdfs++: Fix compilation errors and warnings when compiling with Clang

2018-10-31 Thread Sunil Govindan (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDFS-12026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sunil Govindan resolved HDFS-12026.
---
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 3.3.0
   3.2.0

> libhdfs++: Fix compilation errors and warnings when compiling with Clang 
> -
>
> Key: HDFS-12026
> URL: https://issues.apache.org/jira/browse/HDFS-12026
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: hdfs-client
>Reporter: Anatoli Shein
>Assignee: Anatoli Shein
>Priority: Blocker
> Fix For: 3.2.0, 3.3.0
>
> Attachments: HDFS-12026.HDFS-8707.000.patch, 
> HDFS-12026.HDFS-8707.001.patch, HDFS-12026.HDFS-8707.002.patch, 
> HDFS-12026.HDFS-8707.003.patch, HDFS-12026.HDFS-8707.004.patch, 
> HDFS-12026.HDFS-8707.005.patch, HDFS-12026.HDFS-8707.006.patch, 
> HDFS-12026.HDFS-8707.007.patch, HDFS-12026.HDFS-8707.008.patch, 
> HDFS-12026.HDFS-8707.009.patch, HDFS-12026.HDFS-8707.010.patch
>
>
> Currently multiple errors and warnings prevent libhdfspp from being compiled 
> with clang. It should compile cleanly using flag:
> -std=c++11
> and also warning flags:
> -Weverything -Wno-c++98-compat -Wno-missing-prototypes 
> -Wno-c++98-compat-pedantic -Wno-padded -Wno-covered-switch-default 
> -Wno-missing-noreturn -Wno-unknown-pragmas -Wconversion -Werror






[jira] [Created] (HDDS-785) Ozone shell put key does not create parent directories

2018-10-31 Thread Hanisha Koneru (JIRA)
Hanisha Koneru created HDDS-785:
---

 Summary: Ozone shell put key does not create parent directories
 Key: HDDS-785
 URL: https://issues.apache.org/jira/browse/HDDS-785
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Hanisha Koneru
Assignee: Hanisha Koneru


When we create a key in ozone through Ozone Shell, the parent directory 
structure is not created. 
{code:java}
$ ./ozone sh key put /volume1/bucket1/o3sh/t1/dir1/file1 /etc/hosts -r=ONE 
$ ./ozone sh key list /volume1/bucket1 
[ { 
   ….
   "size" : 5898, 
   "keyName" : "o3sh/t1/dir1/file1"
} ] 

$ ./ozone fs -ls o3fs://bucket1.volume1/o3sh/t1/dir1/ 
ls: `o3fs://bucket1.volume1/o3sh/t1/dir1/': No such file or directory 

$ ./ozone fs -ls o3fs://bucket1.volume1/o3sh/t1/dir1/file1 
-rw-rw-rw- 1 hk hk       5898 2018-10-23 18:02 
o3fs://bucket1.volume1/o3sh/t1/dir1/file1{code}
OzoneFileSystem and S3AFileSystem, when creating files, create the parent 
directories if they do not exist. We should match this behavior in Ozone shell 
as well.
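Matching that behavior would mean deriving the intermediate directory keys from the key name so they can be created alongside the file key. A sketch of that derivation; the helper is hypothetical, not the actual OzoneFileSystem code:

```java
import java.util.ArrayList;
import java.util.List;

public class ParentDirs {
    // Given a key like "o3sh/t1/dir1/file1", list the directory keys that a
    // shell "key put" would need to create so that "fs -ls" on the
    // intermediate paths succeeds. Directory keys end with '/'.
    static List<String> parentDirKeys(String keyName) {
        List<String> dirs = new ArrayList<>();
        int idx = keyName.indexOf('/');
        while (idx >= 0) {
            dirs.add(keyName.substring(0, idx + 1));
            idx = keyName.indexOf('/', idx + 1);
        }
        return dirs;
    }

    public static void main(String[] args) {
        // Prints [o3sh/, o3sh/t1/, o3sh/t1/dir1/]
        System.out.println(parentDirKeys("o3sh/t1/dir1/file1"));
    }
}
```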






[jira] [Resolved] (HDDS-688) Hive Query hangs, if DN's are restarted before the query is submitted

2018-10-31 Thread Namit Maheshwari (JIRA)


 [ 
https://issues.apache.org/jira/browse/HDDS-688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Maheshwari resolved HDDS-688.
---
Resolution: Fixed

This is fixed with the recent changes. Resolving it.

> Hive Query hangs, if DN's are restarted before the query is submitted
> -
>
> Key: HDDS-688
> URL: https://issues.apache.org/jira/browse/HDDS-688
> Project: Hadoop Distributed Data Store
>  Issue Type: Bug
>Reporter: Namit Maheshwari
>Assignee: Mukul Kumar Singh
>Priority: Major
>
> Run a Hive Insert Query. It runs fine as below:
> {code:java}
> 0: jdbc:hive2://ctr-e138-1518143905142-510793> insert into testo3 values(1, 
> "aa", 3.0);
> INFO : Compiling 
> command(queryId=hive_20181018005729_fe644ab2-f8cc-41c3-b2d8-ffe1022de607): 
> insert into testo3 values(1, "aa", 3.0)
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_col0, 
> type:int, comment:null), FieldSchema(name:_col1, type:string, comment:null), 
> FieldSchema(name:_col2, type:float, comment:null)], properties:null)
> INFO : Completed compiling 
> command(queryId=hive_20181018005729_fe644ab2-f8cc-41c3-b2d8-ffe1022de607); 
> Time taken: 0.52 seconds
> INFO : Executing 
> command(queryId=hive_20181018005729_fe644ab2-f8cc-41c3-b2d8-ffe1022de607): 
> insert into testo3 values(1, "aa", 3.0)
> INFO : Query ID = hive_20181018005729_fe644ab2-f8cc-41c3-b2d8-ffe1022de607
> INFO : Total jobs = 1
> INFO : Launching Job 1 out of 1
> INFO : Starting task [Stage-1:MAPRED] in serial mode
> INFO : Subscribed to counters: [] for queryId: 
> hive_20181018005729_fe644ab2-f8cc-41c3-b2d8-ffe1022de607
> INFO : Session is already open
> INFO : Dag name: insert into testo3 values(1, "aa", 3.0) (Stage-1)
> INFO : Status: Running (Executing on YARN cluster with App id 
> application_1539383731490_0073)
> --
> VERTICES MODE STATUS TOTAL COMPLETED RUNNING PENDING FAILED KILLED
> --
> Map 1 .. container SUCCEEDED 1 1 0 0 0 0
> Reducer 2 .. container SUCCEEDED 1 1 0 0 0 0
> --
> VERTICES: 02/02 [==>>] 100% ELAPSED TIME: 11.95 s
> --
> INFO : Status: DAG finished successfully in 10.68 seconds
> INFO :
> INFO : Query Execution Summary
> INFO : 
> --
> INFO : OPERATION DURATION
> INFO : 
> --
> INFO : Compile Query 0.52s
> INFO : Prepare Plan 0.23s
> INFO : Get Query Coordinator (AM) 0.00s
> INFO : Submit Plan 0.11s
> INFO : Start DAG 0.57s
> INFO : Run DAG 10.68s
> INFO : 
> --
> INFO :
> INFO : Task Execution Summary
> INFO : 
> --
> INFO : VERTICES DURATION(ms) CPU_TIME(ms) GC_TIME(ms) INPUT_RECORDS 
> OUTPUT_RECORDS
> INFO : 
> --
> INFO : Map 1 7074.00 11,280 276 3 1
> INFO : Reducer 2 1074.00 2,040 0 1 0
> INFO : 
> --
> INFO :
> INFO : org.apache.tez.common.counters.DAGCounter:
> INFO : NUM_SUCCEEDED_TASKS: 2
> INFO : TOTAL_LAUNCHED_TASKS: 2
> INFO : AM_CPU_MILLISECONDS: 1390
> INFO : AM_GC_TIME_MILLIS: 0
> INFO : File System Counters:
> INFO : FILE_BYTES_READ: 135
> INFO : FILE_BYTES_WRITTEN: 135
> INFO : HDFS_BYTES_WRITTEN: 199
> INFO : HDFS_READ_OPS: 3
> INFO : HDFS_WRITE_OPS: 2
> INFO : HDFS_OP_CREATE: 1
> INFO : HDFS_OP_GET_FILE_STATUS: 3
> INFO : HDFS_OP_RENAME: 1
> INFO : org.apache.tez.common.counters.TaskCounter:
> INFO : SPILLED_RECORDS: 0
> INFO : NUM_SHUFFLED_INPUTS: 1
> INFO : NUM_FAILED_SHUFFLE_INPUTS: 0
> INFO : GC_TIME_MILLIS: 276
> INFO : TASK_DURATION_MILLIS: 8474
> INFO : CPU_MILLISECONDS: 13320
> INFO : PHYSICAL_MEMORY_BYTES: 4294967296
> INFO : VIRTUAL_MEMORY_BYTES: 11205029888
> INFO : COMMITTED_HEAP_BYTES: 4294967296
> INFO : INPUT_RECORDS_PROCESSED: 5
> INFO : INPUT_SPLIT_LENGTH_BYTES: 1
> INFO : OUTPUT_RECORDS: 1
> INFO : OUTPUT_LARGE_RECORDS: 0
> INFO : OUTPUT_BYTES: 94
> INFO : OUTPUT_BYTES_WITH_OVERHEAD: 102
> INFO : OUTPUT_BYTES_PHYSICAL: 127
> INFO : ADDITIONAL_SPILLS_BYTES_WRITTEN: 0
> INFO : 

[jira] [Created] (HDFS-14042) NPE when PROVIDED storage is missing

2018-10-31 Thread Íñigo Goiri (JIRA)
Íñigo Goiri created HDFS-14042:
--

 Summary: NPE when PROVIDED storage is missing
 Key: HDFS-14042
 URL: https://issues.apache.org/jira/browse/HDFS-14042
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 3.3.0
Reporter: Íñigo Goiri
Assignee: Virajith Jalaparti


java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.updateStorageStats(DatanodeDescriptor.java:460)
at 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeDescriptor.updateHeartbeatState(DatanodeDescriptor.java:390)
at 
org.apache.hadoop.hdfs.server.blockmanagement.HeartbeatManager.updateLifeline(HeartbeatManager.java:254)
at 
org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.handleLifeline(DatanodeManager.java:1789)
at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.handleLifeline(FSNamesystem.java:3997)
at 
org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.sendLifeline(NameNodeRpcServer.java:1666)
at 
org.apache.hadoop.hdfs.protocolPB.DatanodeLifelineProtocolServerSideTranslatorPB.sendLifeline(DatanodeLifelineProtocolServerSideTranslatorPB.java:62)
at 
org.apache.hadoop.hdfs.protocol.proto.DatanodeLifelineProtocolProtos$DatanodeLifelineProtocolService$2.callBlockingMethod(DatanodeLifelineProtocolProtos.java:409)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:524)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1025)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:898)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:844)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1876)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2727)







[jira] [Created] (HDDS-784) ozone fs volume created with non-existing unix user

2018-10-31 Thread Soumitra Sulav (JIRA)
Soumitra Sulav created HDDS-784:
---

 Summary: ozone fs volume created with non-existing unix user
 Key: HDDS-784
 URL: https://issues.apache.org/jira/browse/HDDS-784
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Filesystem
Affects Versions: 0.3.0
Reporter: Soumitra Sulav


The ozone command to create a volume succeeds with any username as the owner,
even if that user does not exist as a unix user.

The command throws a security warning _(security.ShellBasedUnixGroupsMapping)_
but still creates the volume.

As a result, we can't list the volume, and listing volumes as root returns an
empty list.

Ozone CLI command run:
{code:java}
ozone sh volume create testvolume -u=hdfs{code}
Warning thrown:
{code:java}
2018-10-30 10:19:38,268 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
2018-10-30 10:19:39,061 WARN security.ShellBasedUnixGroupsMapping: unable to 
return groups for user hdfs
PartialGroupNameException The user name 'hdfs' is not found. id: hdfs: no such 
user
id: hdfs: no such user
at 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping.resolvePartialGroupNames(ShellBasedUnixGroupsMapping.java:294)
 at 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:207)
 at 
org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:97)
 at 
org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:51)
 at 
org.apache.hadoop.security.Groups$GroupCacheLoader.fetchGroupList(Groups.java:387)
 at org.apache.hadoop.security.Groups$GroupCacheLoader.load(Groups.java:321)
 at org.apache.hadoop.security.Groups$GroupCacheLoader.load(Groups.java:270)
 at 
com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
 at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
 at 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
 at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
 at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
 at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
 at 
com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
 at org.apache.hadoop.security.Groups.getGroups(Groups.java:228)
 at 
org.apache.hadoop.security.UserGroupInformation.getGroups(UserGroupInformation.java:1588)
 at 
org.apache.hadoop.security.UserGroupInformation.getGroupNames(UserGroupInformation.java:1576)
 at 
org.apache.hadoop.ozone.client.rpc.RpcClient.createVolume(RpcClient.java:187)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at 
org.apache.hadoop.ozone.client.OzoneClientInvocationHandler.invoke(OzoneClientInvocationHandler.java:54)
 at com.sun.proxy.$Proxy15.createVolume(Unknown Source)
 at org.apache.hadoop.ozone.client.ObjectStore.createVolume(ObjectStore.java:82)
 at 
org.apache.hadoop.ozone.web.ozShell.volume.CreateVolumeHandler.call(CreateVolumeHandler.java:103)
 at 
org.apache.hadoop.ozone.web.ozShell.volume.CreateVolumeHandler.call(CreateVolumeHandler.java:41)
 at picocli.CommandLine.execute(CommandLine.java:919)
 at picocli.CommandLine.access$700(CommandLine.java:104)
 at picocli.CommandLine$RunLast.handle(CommandLine.java:1083)
 at picocli.CommandLine$RunLast.handle(CommandLine.java:1051)
 at 
picocli.CommandLine$AbstractParseResultHandler.handleParseResult(CommandLine.java:959)
 at picocli.CommandLine.parseWithHandlers(CommandLine.java:1242)
 at picocli.CommandLine.parseWithHandler(CommandLine.java:1181)
 at org.apache.hadoop.hdds.cli.GenericCli.execute(GenericCli.java:61)
 at org.apache.hadoop.hdds.cli.GenericCli.run(GenericCli.java:52)
 at org.apache.hadoop.ozone.web.ozShell.Shell.main(Shell.java:80)
2018-10-30 10:19:39,073 INFO rpc.RpcClient: Creating Volume: testvolume, with 
hdfs as owner and quota set to 1152921504606846976 bytes.
{code}
Empty volume list returned:
{code:java}
[root@ctr-e138-1518143905142-552728-01-02 ~]# ozone sh volume list
2018-10-30 10:20:03,275 WARN util.NativeCodeLoader: Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
[ ]{code}
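One possible fix is to validate the owner before the volume is created, failing fast instead of emitting a warning and creating an unlistable volume. A hedged sketch of such a pre-check; `KNOWN_USERS` is a hypothetical stand-in for a lookup through the configured groups mapping:

```java
import java.util.Set;

public class OwnerCheck {
    // Hypothetical stand-in for a real user lookup; in Hadoop this would go
    // through the configured groups-mapping provider instead of a fixed set.
    static final Set<String> KNOWN_USERS = Set.of("root", "ozone");

    static void createVolume(String volume, String owner) {
        if (!KNOWN_USERS.contains(owner)) {
            // Reject up front rather than creating a volume nobody can list.
            throw new IllegalArgumentException(
                "owner '" + owner + "' is not a known user");
        }
        System.out.println("created volume " + volume + " owned by " + owner);
    }

    public static void main(String[] args) {
        createVolume("testvolume", "ozone");     // succeeds
        try {
            createVolume("testvolume2", "hdfs"); // rejected: unknown user
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```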
 






[jira] [Created] (HDDS-782) MR example pi job runs 5 min for 1 Map/1 Sample

2018-10-31 Thread Soumitra Sulav (JIRA)
Soumitra Sulav created HDDS-782:
---

 Summary: MR example pi job runs 5 min for 1 Map/1 Sample
 Key: HDDS-782
 URL: https://issues.apache.org/jira/browse/HDDS-782
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Filesystem
Affects Versions: 0.3.0
Reporter: Soumitra Sulav


Running the hadoop-examples pi job takes 250+ seconds, whereas it generally
runs in a few seconds on an HDFS cluster.

The service/job logs show a few _SocketTimeoutException_ occurrences in
between, and YARN keeps _Waiting for AsyncDispatcher to drain_; the thread
stays in the _WAITING_ state for a very long interval.

Refer to the attached log for further details.






[jira] [Created] (HDDS-781) Ambari HDP NoClassDefFoundError for MR jobs

2018-10-31 Thread Soumitra Sulav (JIRA)
Soumitra Sulav created HDDS-781:
---

 Summary: Ambari HDP NoClassDefFoundError for MR jobs
 Key: HDDS-781
 URL: https://issues.apache.org/jira/browse/HDDS-781
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Filesystem
Affects Versions: 0.3.0
Reporter: Soumitra Sulav


HDP integrated with Ambari has a
_/usr/hdp//hadoop/mapreduce.tar.gz_ file containing all the libraries needed
for an MR job to run; it is copied into the YARN containers at execution time.

With the introduction of the Ozone filesystem, the relevant jars need to be
packaged as part of the tar; the tar is also placed by the _yum install
hadoop_ step that Ambari performs during cluster setup.

During an MR job run, I hit the java.lang.NoClassDefFoundError exceptions
below:

org/apache/hadoop/fs/ozone/OzoneFileSystem

org/apache/ratis/proto/RaftProtos$ReplicationLevel

org/apache/ratis/thirdparty/com/google/protobuf/ProtocolMessageEnum

Adding the relevant jars to the mentioned tar file resolves the exceptions.

 






[jira] [Created] (HDDS-780) ozone client daemon restart fails if triggered from different host

2018-10-31 Thread Soumitra Sulav (JIRA)
Soumitra Sulav created HDDS-780:
---

 Summary: ozone client daemon restart fails if triggered from 
different host
 Key: HDDS-780
 URL: https://issues.apache.org/jira/browse/HDDS-780
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Client
Affects Versions: 0.3.0
Reporter: Soumitra Sulav


An Ozone client operation throws a *java.net.BindException: Cannot assign
requested address* exception if the OM and SCM are not located on the same
node from which the CLI command is run.

Command triggered from a node which has the SCM but not the OM:
{code:java}
ozone --daemon start om{code}
Complete stacktrace of Exception :
{code:java}
2018-10-30 10:17:22,675 INFO org.apache.hadoop.ipc.CallQueueManager: Using 
callQueue: class java.util.concurrent.LinkedBlockingQueue, queueCapacity: 2000, 
scheduler: class org.apache.hadoop.ipc.DefaultRpcScheduler, ipcBackoff: false.
2018-10-30 10:17:22,683 ERROR org.apache.hadoop.ozone.om.OzoneManager: Failed 
to start the OzoneManager.
java.net.BindException: Problem binding to 
[ctr-e138-1518143905142-552728-01-03.hwx.site:9889] java.net.BindException: 
Cannot assign requested address; For more details see: 
http://wiki.apache.org/hadoop/BindException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:736)
at org.apache.hadoop.ipc.Server.bind(Server.java:566)
at org.apache.hadoop.ipc.Server$Listener.(Server.java:1042)
at org.apache.hadoop.ipc.Server.(Server.java:2815)
at org.apache.hadoop.ipc.RPC$Server.(RPC.java:994)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server.(ProtobufRpcEngine.java:421)
at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:342)
at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:804)
at org.apache.hadoop.ozone.om.OzoneManager.startRpcServer(OzoneManager.java:241)
at org.apache.hadoop.ozone.om.OzoneManager.(OzoneManager.java:156)
at org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:339)
at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:265)
Caused by: java.net.BindException: Cannot assign requested address
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.apache.hadoop.ipc.Server.bind(Server.java:549)
... 10 more
2018-10-30 10:17:22,687 INFO org.apache.hadoop.util.ExitUtil: Exiting with 
status 1: java.net.BindException: Problem binding to 
[ctr-e138-1518143905142-552728-01-03.hwx.site:9889] java.net.BindException: 
Cannot assign requested address; For more details see: 
http://wiki.apache.org/hadoop/BindException
{code}
The same applies to daemon stop operations: no exception is thrown, but the
daemon is not stopped either.
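The bind failure happens because the daemon is launched on a host that does not own the configured OM address. A pre-flight check using only the JDK could fail fast with a clearer message; this is a sketch, not part of the actual OzoneManager code:

```java
import java.net.InetAddress;
import java.net.NetworkInterface;

public class BindCheck {
    // Returns true when 'host' resolves to an address owned by this machine.
    // A start script could refuse to launch the OM when this is false,
    // instead of failing later with "Cannot assign requested address".
    static boolean isLocalAddress(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            return addr.isAnyLocalAddress()
                || addr.isLoopbackAddress()
                || NetworkInterface.getByInetAddress(addr) != null;
        } catch (Exception e) {
            return false; // unresolvable or no matching interface
        }
    }

    public static void main(String[] args) {
        System.out.println(isLocalAddress("localhost")); // local
        System.out.println(isLocalAddress("0.0.0.0"));   // wildcard, local
    }
}
```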






[jira] [Created] (HDDS-779) Fix ASF License violation in S3Consts and S3Utils

2018-10-31 Thread Dinesh Chitlangia (JIRA)
Dinesh Chitlangia created HDDS-779:
--

 Summary: Fix ASF License violation in S3Consts and S3Utils
 Key: HDDS-779
 URL: https://issues.apache.org/jira/browse/HDDS-779
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Dinesh Chitlangia
Assignee: Dinesh Chitlangia


Spotted this issue during one of the Jenkins runs for HDDS-120.

[https://builds.apache.org/job/PreCommit-HDDS-Build/1569/artifact/out/patch-asflicense-problems.txt]

 






[jira] [Created] (HDDS-778) Add an interface for CA and Clients for Certificate operations

2018-10-31 Thread Anu Engineer (JIRA)
Anu Engineer created HDDS-778:
-

 Summary: Add an interface for CA and Clients for Certificate 
operations
 Key: HDDS-778
 URL: https://issues.apache.org/jira/browse/HDDS-778
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: SCM, SCM Client
Reporter: Anu Engineer
Assignee: Anu Engineer


This JIRA proposes to add an interface specification that can be programmed 
against by Datanodes and Ozone Manager and other clients that want to use the 
certificate-based security features of HDDS.

We will also add a Certificate Server interface; it can be used with a
non-SCM-based CA, or if we need to use HSM-based secret storage services.

At this point, it is simply an interface and nothing more. Thanks to [~xyao] 
for suggesting this idea.






[jira] [Created] (HDDS-777) Fix missing jenkins issue in s3gateway module

2018-10-31 Thread Bharat Viswanadham (JIRA)
Bharat Viswanadham created HDDS-777:
---

 Summary: Fix missing jenkins issue in s3gateway module
 Key: HDDS-777
 URL: https://issues.apache.org/jira/browse/HDDS-777
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Bharat Viswanadham
Assignee: Bharat Viswanadham









[jira] [Created] (HDDS-776) Make OM initialization resilient to dns failures

2018-10-31 Thread Elek, Marton (JIRA)
Elek, Marton created HDDS-776:
-

 Summary: Make OM initialization resilient to dns failures
 Key: HDDS-776
 URL: https://issues.apache.org/jira/browse/HDDS-776
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: OM
Reporter: Elek, Marton


The Ozone Manager can be initialized with the 'ozone om --init' command,
which connects to a running SCM.

If the SCM is unavailable because of a DNS issue, the initialization fails
without any retry:

{code}
 2018-10-31 15:36:26 ERROR OzoneManager:376 - Could not initialize OM version 
file
java.net.UnknownHostException: Invalid host name: local host is: (unknown); 
destination host is: "releastest2-ozone-scm-0.releastest2-ozone-scm":9863; 
java.net.UnknownHostException; For more details see:  
http://wiki.apache.org/hadoop/UnknownHost
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at 
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at 
sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:768)
at org.apache.hadoop.ipc.Client$Connection.(Client.java:449)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1552)
at org.apache.hadoop.ipc.Client.call(Client.java:1403)
at org.apache.hadoop.ipc.Client.call(Client.java:1367)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy9.getScmInfo(Unknown Source)
at 
org.apache.hadoop.hdds.scm.protocolPB.ScmBlockLocationProtocolClientSideTranslatorPB.getScmInfo(ScmBlockLocationProtocolClientSideTranslatorPB.java:154)
at org.apache.hadoop.ozone.om.OzoneManager.omInit(OzoneManager.java:358)
at 
org.apache.hadoop.ozone.om.OzoneManager.createOm(OzoneManager.java:326)
at org.apache.hadoop.ozone.om.OzoneManager.main(OzoneManager.java:265)
Caused by: java.net.UnknownHostException
at org.apache.hadoop.ipc.Client$Connection.(Client.java:450)
... 10 more 
{code}

This is a problem for all containerized environments: in Kubernetes the OM
sometimes can't start, and in the docker-compose environments we use a
15-second sleep just to avoid this issue.

It would be great to retry in case of a DNS problem.
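A minimal sketch of such a retry, assuming the transient DNS failure surfaces as UnknownHostException (the names below are illustrative, not the real OzoneManager API):

```java
import java.net.UnknownHostException;
import java.util.concurrent.Callable;

public class RetryInit {
    // Retry a flaky initialization step (e.g. the getScmInfo RPC) a bounded
    // number of times before giving up, sleeping between attempts.
    static <T> T withRetry(Callable<T> step, int maxAttempts, long sleepMillis)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return step.call();
            } catch (UnknownHostException e) {
                last = e; // transient DNS failure: wait and retry
                Thread.sleep(sleepMillis);
            }
        }
        throw last; // exhausted all attempts
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated SCM lookup: fails twice with a DNS error, then succeeds.
        String scmId = withRetry(() -> {
            if (++calls[0] < 3) throw new UnknownHostException("scm");
            return "scm-id";
        }, 5, 10);
        System.out.println(scmId + " after " + calls[0] + " attempts");
    }
}
```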






Re: [DISCUSS] Hadoop RPC encryption performance improvements

2018-10-31 Thread Daryn Sharp
Various KMS tasks have been delaying my RPC encryption work, which is 2nd on
my TODO list. It's becoming a top priority for us, so I'll try my best to get
a preliminary netty server patch (sans TLS) up this week if that helps.

The two cited jiras had some critical flaws. Skimming my comments, both use
blocking IO (an obvious nonstarter). HADOOP-10768 is a hand-rolled TLS-like
encryption scheme, which I don't feel is something the community can or
should maintain from a security standpoint.
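For reference, netty's TLS support is a non-blocking wrapper around the JDK's SSLEngine state machine, which is what lets a server avoid the blocking-IO problem noted above. A minimal JDK-only sketch of creating the server side of that machinery (not the actual patch):

```java
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;

public class TlsEngineSketch {
    public static void main(String[] args) throws Exception {
        // SSLEngine is the JDK's non-blocking TLS state machine; frameworks
        // like netty drive it from an event loop, so the handshake and
        // record encryption never block an RPC handler thread.
        SSLContext ctx = SSLContext.getDefault();
        SSLEngine engine = ctx.createSSLEngine();
        engine.setUseClientMode(false); // server side of the handshake
        System.out.println("enabled protocols: "
            + String.join(", ", engine.getEnabledProtocols()));
    }
}
```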

Daryn

On Wed, Oct 31, 2018 at 8:43 AM Wei-Chiu Chuang  wrote:

> Ping. Anyone? Cloudera is interested in moving forward with the RPC
> encryption improvements, but I'd just like to get a consensus on which
> approach to go with.
>
> Otherwise I'll pick HADOOP-10768 since it's ready for commit, and I've
> spent time on testing it.
>
> On Thu, Oct 25, 2018 at 11:04 AM Wei-Chiu Chuang 
> wrote:
>
> > Folks,
> >
> > I would like to invite all to discuss the various Hadoop RPC encryption
> > performance improvements. As you probably know, Hadoop RPC encryption
> > currently relies on Java SASL, and have _really_ bad performance (in
> terms
> > of number of RPCs per second, around 15~20% of the one without SASL)
> >
> > There have been some attempts to address this, most notably, HADOOP-10768
> >  (Optimize Hadoop
> RPC
> > encryption performance) and HADOOP-13836
> >  (Securing Hadoop
> RPC
> > using SSL). But it looks like neither attempt has been progressing.
> >
> > During the recent Hadoop contributor meetup, Daryn Sharp mentioned he's
> > working on another approach that leverages Netty for its SSL encryption
> > and then integrates Netty with Hadoop RPC, so that Hadoop RPC
> > automatically benefits from netty's SSL encryption performance.
> >
> > So there are at least 3 attempts to address this issue as I see it. Do we
> > have a consensus that:
> > 1. this is an important problem
> > 2. which approach we want to move forward with
> >
> > --
> > A very happy Hadoop contributor
> >
>
>
> --
> A very happy Hadoop contributor
>


-- 

Daryn


[jira] [Created] (HDDS-775) Batch updates to container db to minimize number of updates.

2018-10-31 Thread Mukul Kumar Singh (JIRA)
Mukul Kumar Singh created HDDS-775:
--

 Summary: Batch updates to container db to minimize number of 
updates.
 Key: HDDS-775
 URL: https://issues.apache.org/jira/browse/HDDS-775
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: SCM
Affects Versions: 0.3.0
Reporter: Mukul Kumar Singh
Assignee: Mukul Kumar Singh


Currently while processing container reports, each report results in a put 
operation to the db. This can be optimized by replacing put with a batch 
operation.
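A sketch of the intended change: collect the per-entry updates from a container report and commit them with one batch write instead of one put per entry. `commitBatch` is a stand-in for the store's real batch-write API:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BatchedPuts {
    static int commits = 0; // counts db writes, for illustration

    // Stand-in for the store's batch-write API: one write commits
    // every update from the report atomically.
    static void commitBatch(Map<Long, String> updates) {
        commits++;
    }

    // Instead of one put per report entry, accumulate the updates and
    // commit them in a single batch at the end.
    static void processReport(Iterable<long[]> report) {
        Map<Long, String> batch = new LinkedHashMap<>();
        for (long[] entry : report) {
            batch.put(entry[0], "usedBytes=" + entry[1]);
        }
        commitBatch(batch);
    }

    public static void main(String[] args) {
        processReport(java.util.List.of(new long[]{1, 100}, new long[]{2, 200}));
        System.out.println("commits=" + commits); // 1 write instead of 2
    }
}
```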






Re: [DISCUSS] Hadoop RPC encryption performance improvements

2018-10-31 Thread Wei-Chiu Chuang
Ping. Anyone? Cloudera is interested in moving forward with the RPC
encryption improvements, but I'd just like to get consensus on which
approach to go with.

Otherwise I'll pick HADOOP-10768 since it's ready for commit, and I've
spent time on testing it.

On Thu, Oct 25, 2018 at 11:04 AM Wei-Chiu Chuang  wrote:

> Folks,
>
> I would like to invite everyone to discuss the various Hadoop RPC encryption
> performance improvements. As you probably know, Hadoop RPC encryption
> currently relies on Java SASL, and has _really_ bad performance (in terms
> of number of RPCs per second, around 15~20% of the throughput without SASL).
>
> There have been some attempts to address this, most notably HADOOP-10768
> (Optimize Hadoop RPC encryption performance) and HADOOP-13836
> (Securing Hadoop RPC using SSL), but neither attempt has been progressing.
>
> During the recent Hadoop contributor meetup, Daryn Sharp mentioned he's
> working on another approach that leverages Netty for its SSL encryption
> and then integrates Netty with Hadoop RPC so that Hadoop RPC automatically
> benefits from Netty's SSL performance.
>
> So there are at least three attempts to address this issue as I see it. Do we
> have consensus on:
> 1. whether this is an important problem
> 2. which approach we want to move forward with
>
> --
> A very happy Hadoop contributor
>


-- 
A very happy Hadoop contributor
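For context on the message above: the Java SASL layer it refers to negotiates a "quality of protection" (QOP), and per-message encryption corresponds to the "auth-conf" setting, whose wrap()/unwrap() calls encrypt every RPC payload. The sketch below shows only the underlying javax.security.sasl negotiation property; it is illustrative, not Hadoop code (Hadoop exposes this via its own configuration, e.g. hadoop.rpc.protection).

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.sasl.Sasl;

// Illustrative sketch, not Hadoop code: the SASL quality-of-protection
// property that turns on per-message encryption. "auth-conf"
// (authentication + confidentiality) is the QOP whose wrap()/unwrap()
// calls encrypt every RPC payload -- the cost this thread is discussing.
class SaslQopDemo {
  static Map<String, String> privacyProps() {
    Map<String, String> props = new HashMap<>();
    // Preference order: privacy, then integrity, then authentication only.
    props.put(Sasl.QOP, "auth-conf,auth-int,auth");
    return props;
  }
}
```

These properties would be passed to Sasl.createSaslClient/createSaslServer during negotiation; the Netty proposal in the thread bypasses this wrap/unwrap path entirely by encrypting at the transport layer.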


[jira] [Created] (HDDS-774) Remove OpenContainerBlockMap from datanode

2018-10-31 Thread Shashikant Banerjee (JIRA)
Shashikant Banerjee created HDDS-774:


 Summary: Remove OpenContainerBlockMap from datanode
 Key: HDDS-774
 URL: https://issues.apache.org/jira/browse/HDDS-774
 Project: Hadoop Distributed Data Store
  Issue Type: Improvement
  Components: Ozone Datanode
Affects Versions: 0.4.0
Reporter: Shashikant Banerjee
Assignee: Shashikant Banerjee
 Fix For: 0.4.0


With HDDS-675, a partial flush of uncommitted keys on datanodes is no longer 
required, so OpenContainerBlockMap serves no purpose anymore.






[jira] [Created] (HDDS-773) Loading ozone s3 bucket browser could be failed

2018-10-31 Thread Elek, Marton (JIRA)
Elek, Marton created HDDS-773:
-

 Summary: Loading ozone s3 bucket browser could be failed
 Key: HDDS-773
 URL: https://issues.apache.org/jira/browse/HDDS-773
 Project: Hadoop Distributed Data Store
  Issue Type: Sub-task
  Components: S3
Reporter: Elek, Marton
Assignee: Elek, Marton


The Ozone S3 gateway supports an internal bucket browser to display the 
contents of Ozone S3 buckets in the browser.

You can check the contents of any bucket using the URL 
http://localhost:9878/bucket?browser=true

This endpoint sometimes fails with the following error:

{code}
2018-10-31 11:26:55 WARN  HttpChannel:486 - //localhost:9878/blist?browser=x
javax.servlet.ServletException: javax.servlet.ServletException: 
org.glassfish.jersey.server.ContainerException: java.io.IOException: Stream 
closed
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:139)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
at org.eclipse.jetty.server.Server.handle(Server.java:534)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)
at 
org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)
at 
org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)
at 
org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)
at 
org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)
at java.lang.Thread.run(Thread.java:748)
Caused by: javax.servlet.ServletException: 
org.glassfish.jersey.server.ContainerException: java.io.IOException: Stream 
closed
at 
org.glassfish.jersey.servlet.WebComponent.serviceImpl(WebComponent.java:432)
at 
org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:370)
at 
org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:389)
at 
org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:342)
at 
org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:229)
at 
org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:840)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1772)
at 
org.apache.hadoop.http.HttpServer2$QuotingInputFilter.doFilter(HttpServer2.java:1610)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at org.apache.hadoop.http.NoCacheFilter.doFilter(NoCacheFilter.java:45)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1759)
at 
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at 
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)
at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)
at 
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
... 13 more
Caused by: org.glassfish.jersey.server.ContainerException: java.io.IOException: 
Stream closed
at 
org.glassfish.jersey.servlet.internal.ResponseWriter.rethrow(ResponseWriter.java:278)
at 
org.glassfish.jersey.servlet.internal.ResponseWriter.failure(ResponseWriter.java:260)
at 
org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:460)
at 
org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:285)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:272)
at org.glassfish.jersey.internal.Errors$1.call(Errors.java:268)
at org.glassfish.jersey.internal.Errors.process(Errors.java:316)
at 

[jira] [Created] (HDDS-772) ratis retries infinitely and does not timeout when datanode goes down

2018-10-31 Thread Nilotpal Nandi (JIRA)
Nilotpal Nandi created HDDS-772:
---

 Summary: ratis retries infinitely and does not timeout when 
datanode goes down
 Key: HDDS-772
 URL: https://issues.apache.org/jira/browse/HDDS-772
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Client
Affects Versions: 0.3.0
Reporter: Nilotpal Nandi


steps taken:

-
 # Ran ozonefs client operations.
 # Some of the datanodes were down.
 # Client operations did not fail but were left in a waiting/hung state.

Reason: Ratis retries infinitely and never times out.
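The behavior this report asks for, bounding the retries so the client eventually fails instead of hanging, can be sketched as below. This is illustrative only: Ratis has its own RetryPolicy abstraction, and the names and fixed backoff here are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeoutException;

// Sketch: bound retries by a deadline instead of retrying forever.
// Illustrative only -- not the Ratis RetryPolicy API.
class BoundedRetry {
  static <T> T callWithDeadline(Callable<T> op, long maxMillis, long backoffMillis)
      throws Exception {
    long deadline = System.currentTimeMillis() + maxMillis;
    Exception last = null;
    while (System.currentTimeMillis() < deadline) {
      try {
        return op.call();
      } catch (Exception e) {
        last = e;                     // remember the most recent failure
        Thread.sleep(backoffMillis);  // back off before the next attempt
      }
    }
    // Surface a timeout instead of hanging the client forever.
    throw new TimeoutException("gave up after " + maxMillis + " ms"
        + (last != null ? "; last error: " + last : ""));
  }
}
```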

datanode.log



 
{noformat}
2018-10-31 11:13:28,423 WARN 
org.apache.ratis.grpc.server.GrpcServerProtocolService: 
046351fe-bb76-4f86-b296-c682746981c4: Failed requestVote 
54026017-a738-45f5-92f9-c50a0fc24a9f->046351fe-bb76-4f86-b296-c682746981c4#0
org.apache.ratis.protocol.GroupMismatchException: 
046351fe-bb76-4f86-b296-c682746981c4: group-FF58136AA1BA not found.
 at 
org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
 at 
org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:257)
 at 
org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:266)
 at 
org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
 at 
org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:428)
 at 
org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
 at 
org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
 at 
org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
2018-10-31 11:13:29,574 WARN 
org.apache.ratis.grpc.server.GrpcServerProtocolService: 
046351fe-bb76-4f86-b296-c682746981c4: Failed requestVote 
54026017-a738-45f5-92f9-c50a0fc24a9f->046351fe-bb76-4f86-b296-c682746981c4#0
org.apache.ratis.protocol.GroupMismatchException: 
046351fe-bb76-4f86-b296-c682746981c4: group-FF58136AA1BA not found.
 at 
org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
 at 
org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:257)
 at 
org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:266)
 at 
org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:261)
 at 
org.apache.ratis.server.impl.RaftServerProxy.requestVote(RaftServerProxy.java:428)
 at 
org.apache.ratis.grpc.server.GrpcServerProtocolService.requestVote(GrpcServerProtocolService.java:54)
 at 
org.apache.ratis.proto.grpc.RaftServerProtocolServiceGrpc$MethodHandlers.invoke(RaftServerProtocolServiceGrpc.java:319)
 at 
org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
 at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
 at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
 at java.lang.Thread.run(Thread.java:745)
2018-10-31 11:13:30,772 WARN 
org.apache.ratis.grpc.server.GrpcServerProtocolService: 
046351fe-bb76-4f86-b296-c682746981c4: Failed requestVote 
54026017-a738-45f5-92f9-c50a0fc24a9f->046351fe-bb76-4f86-b296-c682746981c4#0
org.apache.ratis.protocol.GroupMismatchException: 
046351fe-bb76-4f86-b296-c682746981c4: group-FF58136AA1BA not found.
 at 
org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:114)
 at 
org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:257)
 at 
org.apache.ratis.server.impl.RaftServerProxy.getImpl(RaftServerProxy.java:266)
 at 

[jira] [Created] (HDDS-771) ChunkGroupOutputStream stream entries need to be properly updated on closed container exception

2018-10-31 Thread Lokesh Jain (JIRA)
Lokesh Jain created HDDS-771:


 Summary: ChunkGroupOutputStream stream entries need to be properly 
updated on closed container exception
 Key: HDDS-771
 URL: https://issues.apache.org/jira/browse/HDDS-771
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Reporter: Lokesh Jain
Assignee: Lokesh Jain


Currently ChunkGroupOutputStream does not increment the currentStreamIndex when 
a chunk write completes but there is no data in the buffer. This leads to 
overwriting of the stream entry.

We also need to update the bcsid in case of a closed container exception. The 
stream entry's bcsid needs to be updated with the bcsid of the committed block.
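The indexing half of the bug can be shown with a toy model: if the index does not advance after a chunk write completes with an empty buffer, the next allocation lands on the finished entry and overwrites it. The classes below are simplified stand-ins, not the real Ozone client code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the indexing bug: a stale currentStreamIndex makes the
// next allocation overwrite a finished entry. Simplified stand-ins only.
class StreamIndexDemo {
  final List<String> entries = new ArrayList<>();
  int currentStreamIndex = 0;

  void allocateEntry(String blockId) {
    if (currentStreamIndex < entries.size()) {
      entries.set(currentStreamIndex, blockId);  // overwrites a live entry!
    } else {
      entries.add(blockId);
    }
  }

  // The fix: advance past the finished entry even when no data remained
  // in the buffer when the chunk write completed.
  void onChunkWriteComplete() {
    currentStreamIndex++;
  }
}
```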






[jira] [Created] (HDDS-770) ozonefs client warning exception logs should not be displayed on console

2018-10-31 Thread Nilotpal Nandi (JIRA)
Nilotpal Nandi created HDDS-770:
---

 Summary: ozonefs client warning exception logs should not be 
displayed on console
 Key: HDDS-770
 URL: https://issues.apache.org/jira/browse/HDDS-770
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Client
Affects Versions: 0.3.0
Reporter: Nilotpal Nandi


steps taken:

-
 # Ran the ozonefs cp command: "ozone fs -cp /testdir2/2GB /testdir2/2GB_111"
 # Command execution was successful and the file was copied.

But the warning logs/exceptions are displayed on the console:

 
{noformat}
[root@ctr-e138-1518143905142-53-01-03 ~]# ozone fs -cp /testdir2/2GB 
/testdir2/2GB_111
2018-10-31 09:12:35,052 WARN scm.XceiverClientGrpc: Failed to execute command 
cmdType: GetBlock
traceID: "b73d7d2d-232a-40d7-b0b6-478e3d40ed6a"
containerID: 17
datanodeUuid: "ce0084c2-97cd-4c97-9378-e5175daad18b"
getBlock {
 blockID {
 containerID: 17
 localID: 100989077200109583
 }
 blockCommitSequenceId: 60
}
 on datanode 9fab9937-fbcd-4196-8014-cb165045724b
java.util.concurrent.ExecutionException: 
org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: UNAVAILABLE: io 
exception
 at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357)
 at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895)
 at 
org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommandWithRetry(XceiverClientGrpc.java:167)
 at 
org.apache.hadoop.hdds.scm.XceiverClientGrpc.sendCommand(XceiverClientGrpc.java:146)
 at 
org.apache.hadoop.hdds.scm.storage.ContainerProtocolCalls.getBlock(ContainerProtocolCalls.java:105)
 at 
org.apache.hadoop.ozone.client.io.ChunkGroupInputStream.getFromOmKeyInfo(ChunkGroupInputStream.java:301)
 at org.apache.hadoop.ozone.client.rpc.RpcClient.getKey(RpcClient.java:493)
 at org.apache.hadoop.ozone.client.OzoneBucket.readKey(OzoneBucket.java:272)
 at org.apache.hadoop.fs.ozone.OzoneFileSystem.open(OzoneFileSystem.java:178)
 at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:950)
 at 
org.apache.hadoop.fs.shell.CommandWithDestination.copyFileToTarget(CommandWithDestination.java:341)
 at 
org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:277)
 at 
org.apache.hadoop.fs.shell.CommandWithDestination.processPath(CommandWithDestination.java:262)
 at org.apache.hadoop.fs.shell.Command.processPathInternal(Command.java:367)
 at org.apache.hadoop.fs.shell.Command.processPaths(Command.java:331)
 at org.apache.hadoop.fs.shell.Command.processPathArgument(Command.java:304)
 at 
org.apache.hadoop.fs.shell.CommandWithDestination.processPathArgument(CommandWithDestination.java:257)
 at org.apache.hadoop.fs.shell.Command.processArgument(Command.java:286)
 at org.apache.hadoop.fs.shell.Command.processArguments(Command.java:270)
 at 
org.apache.hadoop.fs.shell.CommandWithDestination.processArguments(CommandWithDestination.java:228)
 at org.apache.hadoop.fs.shell.FsCommand.processRawArguments(FsCommand.java:120)
 at org.apache.hadoop.fs.shell.Command.run(Command.java:177)
 at org.apache.hadoop.fs.FsShell.run(FsShell.java:327)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
 at org.apache.hadoop.fs.FsShell.main(FsShell.java:390)
Caused by: org.apache.ratis.thirdparty.io.grpc.StatusRuntimeException: 
UNAVAILABLE: io exception
 at 
org.apache.ratis.thirdparty.io.grpc.Status.asRuntimeException(Status.java:526)
 at 
org.apache.ratis.thirdparty.io.grpc.stub.ClientCalls$StreamObserverToCallListenerAdapter.onClose(ClientCalls.java:420)
 at 
org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
 at 
org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
 at 
org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.CensusStatsModule$StatsClientInterceptor$1$1.onClose(CensusStatsModule.java:684)
 at 
org.apache.ratis.thirdparty.io.grpc.PartialForwardingClientCallListener.onClose(PartialForwardingClientCallListener.java:39)
 at 
org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener.onClose(ForwardingClientCallListener.java:23)
 at 
org.apache.ratis.thirdparty.io.grpc.ForwardingClientCallListener$SimpleForwardingClientCallListener.onClose(ForwardingClientCallListener.java:40)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.CensusTracingModule$TracingClientInterceptor$1$1.onClose(CensusTracingModule.java:403)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl.closeObserver(ClientCallImpl.java:459)
 at 
org.apache.ratis.thirdparty.io.grpc.internal.ClientCallImpl.access$300(ClientCallImpl.java:63)
 at 

[jira] [Created] (HDDS-769) temporary file ._COPYING_ is not deleted after put command failure

2018-10-31 Thread Nilotpal Nandi (JIRA)
Nilotpal Nandi created HDDS-769:
---

 Summary: temporary file ._COPYING_ is not deleted after put 
command failure
 Key: HDDS-769
 URL: https://issues.apache.org/jira/browse/HDDS-769
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
  Components: Ozone Manager
Affects Versions: 0.3.0
Reporter: Nilotpal Nandi


steps taken:

-
 # Stopped all datanodes.
 # Ran an ozonefs put command; command execution failed.

{noformat}
[root@ctr-e138-1518143905142-53-01-08 ~]# ozone fs -put /etc/passwd 
/testdir5/
2018-10-31 08:42:12,711 [main] ERROR - Try to allocate more blocks for write 
failed, already allocated 0 blocks for this write.
put: Allocate block failed, error:INTERNAL_ERROR{noformat}
But the temporary file was not deleted from OM.
{noformat}
[root@ctr-e138-1518143905142-53-01-03 logs]# ozone fs -ls 
/testdir5/passwd._COPYING_
-rw-rw-rw- 1 root root 0 2018-10-31 08:42 /testdir5/passwd._COPYING_{noformat}
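The ._COPYING_ suffix comes from the shell's copy-to-temporary-then-rename pattern. A sketch of that pattern with the cleanup step this report says is missing, using plain java.nio.file in place of the Hadoop FileSystem API:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Copy-to-temp-then-rename, with cleanup on failure: if the copy fails,
// the "._COPYING_" temporary must be deleted rather than left behind.
// Plain java.nio.file stands in for the Hadoop FileSystem API.
class SafeCopy {
  static void copy(Path src, Path dst) throws IOException {
    Path tmp = dst.resolveSibling(dst.getFileName() + "._COPYING_");
    try {
      Files.copy(src, tmp, StandardCopyOption.REPLACE_EXISTING);
      Files.move(tmp, dst, StandardCopyOption.ATOMIC_MOVE);
    } catch (IOException e) {
      Files.deleteIfExists(tmp);  // don't leave the temporary behind
      throw e;
    }
  }
}
```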
 






[jira] [Created] (HDDS-768) writeStateMachineData times out

2018-10-31 Thread Nilotpal Nandi (JIRA)
Nilotpal Nandi created HDDS-768:
---

 Summary: writeStateMachineData times out
 Key: HDDS-768
 URL: https://issues.apache.org/jira/browse/HDDS-768
 Project: Hadoop Distributed Data Store
  Issue Type: Bug
Affects Versions: 0.3.0
Reporter: Nilotpal Nandi


A datanode stopped due to the following error:

datanode.log
{noformat}
2018-10-31 09:12:04,517 INFO org.apache.ratis.server.impl.RaftServerImpl: 
9fab9937-fbcd-4196-8014-cb165045724b: set configuration 169: 
[9fab9937-fbcd-4196-8014-cb165045724b:172.27.15.131:9858, 
ce0084c2-97cd-4c97-9378-e5175daad18b:172.27.15.139:9858, 
f0291cb4-7a48-456a-847f-9f91a12aa850:172.27.38.9:9858], old=null at 169
2018-10-31 09:12:22,187 ERROR org.apache.ratis.server.storage.RaftLogWorker: 
Terminating with exit status 1: 
9fab9937-fbcd-4196-8014-cb165045724b-RaftLogWorker failed.
org.apache.ratis.protocol.TimeoutIOException: Timeout: WriteLog:182: (t:10, 
i:182), STATEMACHINELOGENTRY, client-611073BBFA46, cid=127-writeStateMachineData
 at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:87)
 at 
org.apache.ratis.server.storage.RaftLogWorker$WriteLog.execute(RaftLogWorker.java:310)
 at org.apache.ratis.server.storage.RaftLogWorker.run(RaftLogWorker.java:182)
 at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.TimeoutException
 at java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1771)
 at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1915)
 at org.apache.ratis.util.IOUtils.getFromFuture(IOUtils.java:79)
 ... 3 more{noformat}
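The TimeoutException in the trace comes from a bounded wait on the writeStateMachineData future (CompletableFuture.timedGet via IOUtils.getFromFuture). A minimal sketch of that failure mode, with an arbitrary timeout rather than the actual Ratis default:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch of the failure mode in the trace: the log worker waits on the
// writeStateMachineData future with a bounded get(), and a timeout
// becomes a fatal error (the worker terminates with exit status 1).
class WriteTimeoutDemo {
  static String awaitWrite(CompletableFuture<String> write, long timeoutMs)
      throws Exception {
    try {
      // Bounded wait, matching CompletableFuture.timedGet in the trace.
      return write.get(timeoutMs, TimeUnit.MILLISECONDS);
    } catch (TimeoutException e) {
      throw new TimeoutException(
          "writeStateMachineData timed out after " + timeoutMs + " ms");
    }
  }
}
```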


