Hi Ilya,

Can you please share the log files for this container?
Is the log level set to 'DEBUG'?

Thanks,
Chandni

On Fri, Mar 11, 2016 at 8:57 AM, Chaitanya Chebolu <[email protected]> wrote:

I think rolling is not happening, and this depends on the "rollingFile" property. By default, rollingFile = false. It is set to true only if one of the following conditions is satisfied:

- maxLength < Long.MAX_VALUE
- rotationWindows > 0

Please check by setting one of the above properties.

On Fri, Mar 11, 2016 at 9:48 PM, Ganelin, Ilya <[email protected]> wrote:

This is happening after some time, but file roll-over appears to be working well with this approach in other instances.

On 3/11/16, 8:02 AM, "Sandeep Deshmukh" <[email protected]> wrote:

Is this happening for the first file itself, or only after some time?

Maybe the file is getting rolled over to the next file, but because you are overriding the default file naming policy, the rollover is also trying to write to the same file.

Regards,
Sandeep

On Fri, Mar 11, 2016 at 9:21 PM, Ganelin, Ilya <[email protected]> wrote:

I explicitly assign a different name for each partition of the operator as well, based on the context ID.

On 3/11/16, 7:34 AM, "Sandeep Deshmukh" <[email protected]> wrote:

The AbstractFileOutputOperator creates files with a timestamp in the file name, so a conflict in the name suggests that the same operator could be trying to write to the same file. Does this happen after operator recovery, or before any other failure occurs?

Is it possible that multiple partitions write to the same directory?

On Fri, Mar 11, 2016 at 7:12 AM, Ganelin, Ilya <[email protected]> wrote:

This is 3.0.0.

Sent with Good (www.good.com)
________________________________
From: Thomas Weise <[email protected]>
Sent: Friday, March 11, 2016 2:02:13 AM
To: [email protected]
Subject: Re: Long-running HDFS Write errors

Which version of Malhar is this?

On Thu, Mar 10, 2016 at 10:56 PM, Ganelin, Ilya <[email protected]> wrote:

Hello, I have a long-running job which simultaneously writes to multiple files on HDFS. I am seeing the following error come up, and I would appreciate any insight into what's going on here.

Stopped running due to an exception.
com.google.common.util.concurrent.UncheckedExecutionException: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/user/vault8/citadel_out/2016_03_20_17_34_339/records/[email protected]] for [DFSClient_NONMAPREDUCE_232430238_1207] for client [10.24.28.64], because this file is already being created by [DFSClient_NONMAPREDUCE_-1482819983_1172] on [10.24.28.64]
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3122)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3186)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3149)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:611)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:124)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2234)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3965)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3969)
    at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4829)
    at com.datatorrent.lib.io.fs.AbstractFileOutputOperator.processTuple(AbstractFileOutputOperator.java:667)
    at com.datatorrent.lib.io.fs.AbstractFileOutputOperator$1.process(AbstractFileOutputOperator.java:236)
    at com.datatorrent.api.DefaultInputPort.put(DefaultInputPort.java:67)
    at com.datatorrent.stram.stream.BufferServerSubscriber$BufferReservoir.sweep(BufferServerSubscriber.java:244)
    at com.datatorrent.stram.engine.GenericNode.run(GenericNode.java:226)
    at com.datatorrent.stram.engine.StreamingContainer$2.run(StreamingContainer.java:1365)
Caused by: java.lang.RuntimeException: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/user/vault8/citadel_out/2016_03_20_17_34_339/records/[email protected]] for [DFSClient_NONMAPREDUCE_232430238_1207] for client [10.24.28.64], because this file is already being created by [DFSClient_NONMAPREDUCE_-1482819983_1172] on [10.24.28.64]
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3122)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3186)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3149)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:611)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:124)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

    at com.datatorrent.lib.io.fs.AbstractFileOutputOperator$3.load(AbstractFileOutputOperator.java:414)
    at com.datatorrent.lib.io.fs.AbstractFileOutputOperator$3.load(AbstractFileOutputOperator.java:334)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3568)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2350)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2313)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2228)
    ... 9 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException): Failed to create file [/user/vault8/citadel_out/2016_03_20_17_34_339/records/[email protected]] for [DFSClient_NONMAPREDUCE_232430238_1207] for client [10.24.28.64], because this file is already being created by [DFSClient_NONMAPREDUCE_-1482819983_1172] on [10.24.28.64]
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3122)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInternal(FSNamesystem.java:2905)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFileInt(FSNamesystem.java:3186)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.appendFile(FSNamesystem.java:3149)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.append(NameNodeRpcServer.java:611)
    at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.append(AuthorizationProviderProxyClientProtocol.java:124)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.append(ClientNamenodeProtocolServerSideTranslatorPB.java:416)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:617)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1060)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2086)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2082)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2080)

    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1403)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
    at com.sun.proxy.$Proxy14.append(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.append(ClientNamenodeProtocolTranslatorPB.java:313)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:252)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy15.append(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.callAppend(DFSClient.java:1842)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1878)
    at org.apache.hadoop.hdfs.DFSClient.append(DFSClient.java:1871)
    at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:329)
    at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:325)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.append(DistributedFileSystem.java:325)
    at org.apache.hadoop.fs.FileSystem.append(FileSystem.java:1172)
    at com.datatorrent.lib.io.fs.AbstractFileOutputOperator$3.load(AbstractFileOutputOperator.java:371)
    ... 14 more
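For reference, the AlreadyBeingCreatedException above is HDFS enforcing its single-writer lease: a second DFSClient (DFSClient_NONMAPREDUCE_232430238_1207) asked to append to a file whose write lease is still held by another client (DFSClient_NONMAPREDUCE_-1482819983_1172) on the same host. That is what you would see if two operator instances, or a pre-failure and a post-recovery instance, resolve to the same output path. The following is a minimal standalone sketch, not taken from this job, that should reproduce the same NameNode-side error when run against an HDFS default filesystem; the path is a placeholder.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LeaseConflictDemo {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path("/tmp/lease-conflict-demo.txt"); // placeholder path

    // Client 1 creates the file and keeps its output stream (and HDFS lease) open.
    FileSystem fs1 = FileSystem.newInstance(conf);
    FSDataOutputStream out1 = fs1.create(path, true);
    out1.writeBytes("writer one\n");
    out1.hflush(); // data is visible, but the lease is still held by fs1's DFSClient

    // Client 2 is a separate FileSystem instance, hence a separate DFSClient.
    // Its append() fails on the NameNode with AlreadyBeingCreatedException because
    // the file is still open for write by client 1; this matches the trace above.
    FileSystem fs2 = FileSystem.newInstance(conf);
    try {
      fs2.append(path);
    } catch (Exception e) {
      System.out.println("append failed as expected: " + e);
    } finally {
      out1.close();
      fs1.close();
      fs2.close();
    }
  }
}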
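On the per-partition naming that Ilya describes (a distinct name per physical partition derived from the context ID), the usual shape of such an operator is sketched below. This is an illustration, not Ilya's actual code: it assumes a Malhar 3.x AbstractFileOutputOperator subclass and folds the physical operator id from OperatorContext into the name returned by getFileName(), so that no two partitions can resolve to the same part file.

import com.datatorrent.api.Context.OperatorContext;
import com.datatorrent.lib.io.fs.AbstractFileOutputOperator;

public class PartitionAwareWriter extends AbstractFileOutputOperator<byte[]>
{
  private transient int operatorId;

  @Override
  public void setup(OperatorContext context)
  {
    // Remember the physical operator (partition) id; it is stable for a given
    // partition across container recovery, so it is safe to use in file names.
    operatorId = context.getId();
    super.setup(context);
  }

  @Override
  protected String getFileName(byte[] tuple)
  {
    // Each partition writes to its own base file; the operator appends part
    // numbers itself when rolling is enabled.
    return "records_" + operatorId;
  }

  @Override
  protected byte[] getBytesForTuple(byte[] tuple)
  {
    return tuple;
  }
}

If the error only shows up after a failure, it is worth confirming that the recovered partition really does produce the same name it used before, i.e. that the name is a pure function of the context id and not of anything regenerated at setup time.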

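And on Chaitanya's point about rolling, here is a minimal sketch of wiring the writer into a DAG with one of the two properties he lists set, so that the internal rollingFile flag actually becomes true. It reuses the PartitionAwareWriter from the previous sketch and assumes the usual bean-style setters for the maxLength and rotationWindows properties named above; adjust to whatever subclass you actually use.

import com.datatorrent.api.DAG;
import com.datatorrent.api.StreamingApplication;
import org.apache.hadoop.conf.Configuration;

public class RollingWriterApp implements StreamingApplication
{
  @Override
  public void populateDAG(DAG dag, Configuration conf)
  {
    // PartitionAwareWriter is the AbstractFileOutputOperator subclass sketched above.
    PartitionAwareWriter writer = dag.addOperator("writer", new PartitionAwareWriter());
    writer.setFilePath("/tmp/rolling-writer-demo"); // placeholder base directory

    // Either property enables rolling (rollingFile becomes true):
    writer.setMaxLength(128L * 1024 * 1024); // roll a part file once it reaches ~128 MB
    // writer.setRotationWindows(120);       // or roll every 120 streaming windows

    // ... connect the upstream operator's output port to writer.input here ...
  }
}

The same two properties can normally also be set from the application configuration (for example dt.operator.writer.prop.maxLength) rather than in code, which makes it easier to experiment with roll-over sizes without redeploying.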