Hrm. If you never call Pipeline.done, the temporary files for the job should never get cleaned up...
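Spelling that out: Crunch's working directory for a run (the /tmp/crunch-XXXXXXXX paths in the stack traces quoted below) is only removed when Pipeline.done() is called. A minimal sketch of that shape, with a made-up class name and paths:

    import org.apache.crunch.PCollection;
    import org.apache.crunch.Pipeline;
    import org.apache.crunch.PipelineResult;
    import org.apache.crunch.impl.mr.MRPipeline;
    import org.apache.hadoop.conf.Configuration;

    public class CleanupSketch {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Pipeline pipeline = new MRPipeline(CleanupSketch.class, conf);

        // Made-up input/output paths, just to give the pipeline something to do.
        PCollection<String> lines = pipeline.readTextFile("hdfs:///tmp/example/input");
        pipeline.writeTextFile(lines, "hdfs:///tmp/example/output");

        // run() launches the planned MapReduce jobs but leaves the job's working
        // files under /tmp/crunch-XXXXXXXX in place.
        PipelineResult result = pipeline.run();

        // done() is what marks the pipeline as finished and deletes the
        // /tmp/crunch-XXXXXXXX working directory; skip it and the temp files stay.
        pipeline.done();

        System.exit(result.succeeded() ? 0 : 1);
      }
    }

If a driver only ever calls run(), or exits before reaching done(), the temp directory is simply left behind rather than being removed out from under a running job.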
On Thu, Sep 24, 2015 at 5:44 PM, Everett Anderson <[email protected]> wrote: > While we tried to take comfort in the fact that we'd only seen this on > HD-based cc2.8xlarges, I'm afraid we're now seeing it when processing > larger amounts of data on SSD-based c3.4xlarges. > > My two hypotheses are > > 1) Somehow these temp files are getting cleaned up before they're accessed > for the last time. Perhaps either something in HDFS or Hadoop cleans up > these temp directories, or perhaps there's a bug in Crunch's planner. > > 2) HDFS has chosen 3 machines to replicate data to, but it is performing a > very lopsided replication. While the cluster overall looks like it has HDFS > capacity, perhaps a small subset of the machines is actually at capacity. > Things seem to fail in obscure ways when running out of disk. > > > 2015-09-24 23:28:58,850 WARN [main] org.apache.hadoop.mapred.YarnChild: > Exception running child : org.apache.crunch.CrunchRuntimeException: Could not > read runtime node information > at > org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:48) > at > org.apache.crunch.impl.mr.run.CrunchReducer.setup(CrunchReducer.java:40) > at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:172) > at > org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:656) > at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:394) > at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170) > Caused by: java.io.FileNotFoundException: File does not exist: > /tmp/crunch-2031291770/p567/REDUCE > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1726) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1669) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1649) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1621) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:497) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:599) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) > > at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > at >
sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) > at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > at java.lang.reflect.Constructor.newInstance(Constructor.java:526) > at > org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) > at > org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) > at > org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1147) > at > org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1135) > at > org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1125) > at > org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:273) > at > org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:240) > at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:233) > at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1298) > at > org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300) > at > org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296) > at > org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) > at > org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296) > at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:768) > at org.apache.crunch.util.DistCache.read(DistCache.java:72) > at > org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:46) > ... 9 more > Caused by: > org.apache.hadoop.ipc.RemoteException(java.io.FileNotFoundException): File > does not exist: /tmp/crunch-2031291770/p567/REDUCE > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) > at > org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1726) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1669) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1649) > at > org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1621) > at > org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:497) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322) > at > org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:599) > at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) > at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) > at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) > > at org.apache.hadoop.ipc.Client.call(Client.java:1410) > at org.apache.hadoop.ipc.Client.call(Client.java:1363) > at > org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:215) > at 
com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) > at > org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) > at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source) > at > org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:219) > at > org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1145) > ... 22 more > > > On Fri, Aug 21, 2015 at 3:52 PM, Jeff Quinn <[email protected]> wrote: > >> Also worth noting, we inspected the hadoop configuration defaults that >> the AWS EMR service populates for the two different instance types, for >> mapred-site.xml, core-site.xml, and hdfs-site.xml all settings were >> identical, with the exception of slight differences in JVM memory allotted. >> Further investigated the max number of file descriptors for each instance >> type via ulimit, and saw no differences there either. >> >> So not sure what the main difference is between these two clusters that >> would cause these very different outcomes, other than cc2.8xlarge having >> SSDs and c3.8xlarge having spinning disks. >> >> On Fri, Aug 21, 2015 at 1:03 PM, Everett Anderson <[email protected]> >> wrote: >> >>> Hey, >>> >>> Jeff graciously agreed to try it out. >>> >>> I'm afraid we're still getting failures on that instance type, though >>> with 0.11 with the patches, the cluster ended up in a state that no new >>> applications could be submitted afterwards. >>> >>> The errors when running the pipeline seem to be similarly HDFS related. >>> It's quite odd. >>> >>> Examples when using 0.11 + the patches: >>> >>> >>> 2015-08-20 23:17:50,455 WARN [Thread-38] >>> org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source >>> file >>> "/tmp/crunch-274499863/p504/output/_temporary/1/_temporary/attempt_1440102643297_out0_0107_r_000001_0/out0-r-00001" >>> - Aborting... >>> >>> >>> 2015-08-20 22:39:42,184 WARN [Thread-51] >>> org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception >>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): >>> No lease on >>> /tmp/crunch-274499863/p510/output/_temporary/1/_temporary/attempt_1440102643297_out12_0103_r_000167_2/out12-r-00167 >>> (inode 83784): File does not exist. [Lease. 
Holder: >>> DFSClient_attempt_1440102643297_0103_r_000167_2_964529009_1, >>> pendingcreates: 24] >>> at >>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:3516) >>> at >>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.abandonBlock(FSNamesystem.java:3486) >>> at >>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.abandonBlock(NameNodeRpcServer.java:687) >>> at >>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.abandonBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:467) >>> at >>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) >>> at >>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:635) >>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962) >>> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039) >>> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035) >>> at java.security.AccessController.doPrivileged(Native Method) >>> at javax.security.auth.Subject.doAs(Subject.java:415) >>> at >>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) >>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2033) >>> >>> at org.apache.hadoop.ipc.Client.call(Client.java:1468) >>> at org.apache.hadoop.ipc.Client.call(Client.java:1399) >>> at >>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:241) >>> at com.sun.proxy.$Proxy13.abandonBlock(Unknown Source) >>> at >>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.abandonBlock(ClientNamenodeProtocolTranslatorPB.java:376) >>> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source) >>> at >>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>> at java.lang.reflect.Method.invoke(Method.java:606) >>> at >>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) >>> at >>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) >>> at com.sun.proxy.$Proxy14.abandonBlock(Unknown Source) >>> at >>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1377) >>> at >>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:594) >>> 2015-08-20 22:39:42,184 WARN [Thread-51] >>> org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source >>> file >>> "/tmp/crunch-274499863/p510/output/_temporary/1/_temporary/attempt_1440102643297_out12_0103_r_000167_2/out12-r-00167" >>> - Aborting... 
>>> >>> >>> >>> 2015-08-20 23:34:59,276 INFO [Thread-37] >>> org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream >>> java.io.IOException: Bad connect ack with firstBadLink as >>> 10.55.1.103:50010 >>> at >>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1472) >>> at >>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373) >>> at >>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:594) >>> 2015-08-20 23:34:59,276 INFO [Thread-37] >>> org.apache.hadoop.hdfs.DFSClient: Abandoning >>> BP-835517662-10.55.1.32-1440102626965:blk_1073828261_95268 >>> 2015-08-20 23:34:59,278 INFO [Thread-37] >>> org.apache.hadoop.hdfs.DFSClient: Excluding datanode 10.55.1.103:50010 >>> 2015-08-20 23:34:59,278 WARN [Thread-37] >>> org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception >>> java.io.IOException: Unable to create new block. >>> at >>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1386) >>> at >>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:594) >>> 2015-08-20 23:34:59,278 WARN [Thread-37] >>> org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source >>> file >>> "/tmp/crunch-274499863/p504/output/_temporary/1/_temporary/attempt_1440102643297_out0_0107_r_000001_2/out0-r-00001" >>> - Aborting... >>> 2015-08-20 23:34:59,279 WARN [main] org.apache.hadoop.mapred.YarnChild: >>> Exception running child : org.apache.crunch.CrunchRuntimeException: >>> java.io.IOException: Bad connect ack with firstBadLink as >>> 10.55.1.103:50010 >>> at >>> org.apache.crunch.impl.mr.run.CrunchTaskContext.cleanup(CrunchTaskContext.java:74) >>> at >>> org.apache.crunch.impl.mr.run.CrunchReducer.cleanup(CrunchReducer.java:64) >>> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:195) >>> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:656) >>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:394) >>> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:171) >>> at java.security.AccessController.doPrivileged(Native Method) >>> at javax.security.auth.Subject.doAs(Subject.java:415) >>> at >>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) >>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:166) >>> Caused by: java.io.IOException: Bad connect ack with firstBadLink as >>> 10.55.1.103:50010 >>> at >>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1472) >>> at >>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1373) >>> at >>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:594) >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> On Fri, Aug 21, 2015 at 11:59 AM, Josh Wills <[email protected]> >>> wrote: >>> >>>> Curious how this went. :) >>>> >>>> On Tue, Aug 18, 2015 at 4:26 PM, Everett Anderson <[email protected]> >>>> wrote: >>>> >>>>> Sure, let me give it a try. I'm going to take 0.11 and patch it with >>>>> >>>>> https://issues.apache.org/jira/browse/CRUNCH-553 >>>>> https://issues.apache.org/jira/browse/CRUNCH-517 >>>>> >>>>> as we also rely on 517. >>>>> >>>>> >>>>> >>>>> On Tue, Aug 18, 2015 at 4:09 PM, Josh Wills <[email protected]> >>>>> wrote: >>>>> >>>>>> (In particular, I'm wondering if something in CRUNCH-481 is related >>>>>> to this problem.) 
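One way to check the second hypothesis from the Sep 24 message at the top of the thread (the cluster looks like it has HDFS capacity overall, but a small subset of datanodes may actually be full, which would also fit the "Excluding datanode" lines just above) is to look at per-datanode headroom rather than the cluster-wide total. A rough diagnostic sketch, not something from this thread; note that getDataNodeStats() is an admin-level call and typically has to run as the HDFS superuser:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hdfs.DistributedFileSystem;
    import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

    public class DatanodeHeadroom {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        if (!(fs instanceof DistributedFileSystem)) {
          System.err.println("Default filesystem is not HDFS: " + fs.getUri());
          return;
        }
        // Per-datanode report of capacity and remaining space.
        DatanodeInfo[] nodes = ((DistributedFileSystem) fs).getDataNodeStats();
        for (DatanodeInfo node : nodes) {
          long capacity = node.getCapacity();
          long remaining = node.getRemaining();
          double pctFree = capacity == 0 ? 0.0 : (100.0 * remaining) / capacity;
          System.out.printf("%-40s %10.1f GB free (%5.1f%%)%n",
              node.getHostName(), remaining / 1e9, pctFree);
        }
      }
    }

A cluster can report plenty of aggregate space while one or two nodes sit near zero, which is exactly the lopsided situation the hypothesis describes.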
>>>>>> >>>>>> On Tue, Aug 18, 2015 at 4:07 PM, Josh Wills <[email protected]> >>>>>> wrote: >>>>>> >>>>>>> Hey Everett, >>>>>>> >>>>>>> Shot in the dark-- would you mind trying it w/0.11.0-hadoop2 w/the >>>>>>> 553 patch? Is that easy to do? >>>>>>> >>>>>>> J >>>>>>> >>>>>>> On Tue, Aug 18, 2015 at 3:18 PM, Everett Anderson <[email protected]> >>>>>>> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> I verified that the pipeline succeeds on the same cc2.8xlarge >>>>>>>> hardware when setting crunch.max.running.jobs to 1. I generally >>>>>>>> feel like the pipeline application itself logic is sound, at this >>>>>>>> point. It >>>>>>>> could be that this is just taxing these machines too hard and we need >>>>>>>> to >>>>>>>> increase the number of retries? >>>>>>>> >>>>>>>> It reliably fails on this hardware when crunch.max.running.jobs >>>>>>>> set to its default. >>>>>>>> >>>>>>>> Can you explain a little what the /tmp/crunch-XXXXXXX files are as >>>>>>>> well as how Crunch uses side effect files? Do you know if HDFS would >>>>>>>> clean >>>>>>>> up those directories from underneath Crunch? >>>>>>>> >>>>>>>> There are usually 4 failed applications, failing due to reduces. >>>>>>>> The failures seem to be one of the following three kinds -- (1) No >>>>>>>> lease on >>>>>>>> <side effect file>, (2) File not found </tmp/crunch-XXXXXXX> file, (3) >>>>>>>> SocketTimeoutException. >>>>>>>> >>>>>>>> Examples: >>>>>>>> >>>>>>>> [1] No lease exception >>>>>>>> >>>>>>>> Error: org.apache.crunch.CrunchRuntimeException: >>>>>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): >>>>>>>> No lease on >>>>>>>> /tmp/crunch-4694113/p662/output/_temporary/1/_temporary/attempt_1439917295505_out7_0018_r_000003_1/out7-r-00003: >>>>>>>> File does not exist. Holder >>>>>>>> DFSClient_attempt_1439917295505_0018_r_000003_1_824053899_1 does not >>>>>>>> have >>>>>>>> any open files. 
at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2944) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3008) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2988) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:641) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:484) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) >>>>>>>> at >>>>>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:599) >>>>>>>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at >>>>>>>> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at >>>>>>>> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at >>>>>>>> java.security.AccessController.doPrivileged(Native Method) at >>>>>>>> javax.security.auth.Subject.doAs(Subject.java:415) at >>>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) >>>>>>>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) at >>>>>>>> org.apache.crunch.impl.mr.run.CrunchTaskContext.cleanup(CrunchTaskContext.java:74) >>>>>>>> at >>>>>>>> org.apache.crunch.impl.mr.run.CrunchReducer.cleanup(CrunchReducer.java:64) >>>>>>>> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:195) at >>>>>>>> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:656) >>>>>>>> at >>>>>>>> org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:394) at >>>>>>>> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175) at >>>>>>>> java.security.AccessController.doPrivileged(Native Method) at >>>>>>>> javax.security.auth.Subject.doAs(Subject.java:415) at >>>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) >>>>>>>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170) Caused >>>>>>>> by: >>>>>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): >>>>>>>> No lease on >>>>>>>> /tmp/crunch-4694113/p662/output/_temporary/1/_temporary/attempt_1439917295505_out7_0018_r_000003_1/out7-r-00003: >>>>>>>> File does not exist. Holder >>>>>>>> DFSClient_attempt_1439917295505_0018_r_000003_1_824053899_1 does not >>>>>>>> have >>>>>>>> any open files. 
at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkLease(FSNamesystem.java:2944) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFileInternal(FSNamesystem.java:3008) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.completeFile(FSNamesystem.java:2988) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.complete(NameNodeRpcServer.java:641) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.complete(ClientNamenodeProtocolServerSideTranslatorPB.java:484) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) >>>>>>>> at >>>>>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:599) >>>>>>>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) at >>>>>>>> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) at >>>>>>>> org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) at >>>>>>>> java.security.AccessController.doPrivileged(Native Method) at >>>>>>>> javax.security.auth.Subject.doAs(Subject.java:415) at >>>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) >>>>>>>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) at >>>>>>>> org.apache.hadoop.ipc.Client.call(Client.java:1410) at >>>>>>>> org.apache.hadoop.ipc.Client.call(Client.java:1363) at >>>>>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:215) >>>>>>>> at com.sun.proxy.$Proxy13.complete(Unknown Source) at >>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at >>>>>>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) >>>>>>>> at >>>>>>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >>>>>>>> at java.lang.reflect.Method.invoke(Method.java:606) at >>>>>>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) >>>>>>>> at >>>>>>>> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) >>>>>>>> at com.sun.proxy.$Proxy13.complete(Unknown Source) at >>>>>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.complete(ClientNamenodeProtocolTranslatorPB.java:404) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSOutputStream.completeFile(DFSOutputStream.java:2130) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSOutputStream.close(DFSOutputStream.java:2114) >>>>>>>> at >>>>>>>> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.close(FSDataOutputStream.java:72) >>>>>>>> at >>>>>>>> org.apache.hadoop.fs.FSDataOutputStream.close(FSDataOutputStream.java:105) >>>>>>>> at >>>>>>>> org.apache.hadoop.io.SequenceFile$Writer.close(SequenceFile.java:1289) >>>>>>>> at >>>>>>>> org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat$1.close(SequenceFileOutputFormat.java:87) >>>>>>>> at >>>>>>>> org.apache.crunch.io.CrunchOutputs$OutputState.close(CrunchOutputs.java:300) >>>>>>>> at org.apache.crunch.io.CrunchOutputs.close(CrunchOutputs.java:180) at >>>>>>>> org.apache.crunch.impl.mr.run.CrunchTaskContext.cleanup(CrunchTaskContext.java:72) >>>>>>>> ... 
9 more >>>>>>>> >>>>>>>> >>>>>>>> [2] File does not exist >>>>>>>> >>>>>>>> 2015-08-18 17:36:10,195 INFO [AsyncDispatcher event handler] >>>>>>>> org.apache.hadoop.mapreduce.v2.app.job.impl.TaskAttemptImpl: >>>>>>>> Diagnostics report from attempt_1439917295505_0034_r_000004_1: Error: >>>>>>>> org.apache.crunch.CrunchRuntimeException: Could not read runtime node >>>>>>>> information >>>>>>>> at >>>>>>>> org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:48) >>>>>>>> at >>>>>>>> org.apache.crunch.impl.mr.run.CrunchReducer.setup(CrunchReducer.java:40) >>>>>>>> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:172) >>>>>>>> at >>>>>>>> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:656) >>>>>>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:394) >>>>>>>> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175) >>>>>>>> at java.security.AccessController.doPrivileged(Native Method) >>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415) >>>>>>>> at >>>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) >>>>>>>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170) >>>>>>>> Caused by: java.io.FileNotFoundException: File does not exist: >>>>>>>> /tmp/crunch-4694113/p470/REDUCE >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:65) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:55) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsUpdateTimes(FSNamesystem.java:1726) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocationsInt(FSNamesystem.java:1669) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1649) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:1621) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:497) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:322) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) >>>>>>>> at >>>>>>>> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:599) >>>>>>>> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928) >>>>>>>> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013) >>>>>>>> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009) >>>>>>>> at java.security.AccessController.doPrivileged(Native Method) >>>>>>>> at javax.security.auth.Subject.doAs(Subject.java:415) >>>>>>>> at >>>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) >>>>>>>> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007) >>>>>>>> >>>>>>>> at >>>>>>>> sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) >>>>>>>> at >>>>>>>> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) >>>>>>>> at >>>>>>>> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) >>>>>>>> at >>>>>>>> java.lang.reflect.Constructor.newInstance(Constructor.java:526) >>>>>>>> at >>>>>>>> 
org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) >>>>>>>> at >>>>>>>> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1147) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1135) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1125) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:273) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:240) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:233) >>>>>>>> at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1298) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:300) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:296) >>>>>>>> at >>>>>>>> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:296) >>>>>>>> at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:768) >>>>>>>> at org.apache.crunch.util.DistCache.read(DistCache.java:72) >>>>>>>> at >>>>>>>> org.apache.crunch.impl.mr.run.CrunchTaskContext.<init>(CrunchTaskContext.java:46) >>>>>>>> ... 9 more >>>>>>>> >>>>>>>> [3] SocketTimeoutException >>>>>>>> >>>>>>>> Error: org.apache.crunch.CrunchRuntimeException: >>>>>>>> java.net.SocketTimeoutException: 70000 millis timeout while waiting >>>>>>>> for channel to be ready for read. ch : >>>>>>>> java.nio.channels.SocketChannel[connected local=/10.55.1.229:35720 >>>>>>>> remote=/10.55.1.230:9200] at >>>>>>>> org.apache.crunch.impl.mr.run.CrunchTaskContext.cleanup(CrunchTaskContext.java:74) >>>>>>>> at >>>>>>>> org.apache.crunch.impl.mr.run.CrunchReducer.cleanup(CrunchReducer.java:64) >>>>>>>> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:195) at >>>>>>>> org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:656) >>>>>>>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:394) at >>>>>>>> org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175) at >>>>>>>> java.security.AccessController.doPrivileged(Native Method) at >>>>>>>> javax.security.auth.Subject.doAs(Subject.java:415) at >>>>>>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548) >>>>>>>> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:170) Caused >>>>>>>> by: java.net.SocketTimeoutException: 70000 millis timeout while >>>>>>>> waiting for channel to be ready for read. 
ch : >>>>>>>> java.nio.channels.SocketChannel[connected local=/10.55.1.229:35720 >>>>>>>> remote=/10.55.1.230:9200] at >>>>>>>> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164) >>>>>>>> at >>>>>>>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161) >>>>>>>> at >>>>>>>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131) >>>>>>>> at >>>>>>>> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118) >>>>>>>> at java.io.FilterInputStream.read(FilterInputStream.java:83) at >>>>>>>> java.io.FilterInputStream.read(FilterInputStream.java:83) at >>>>>>>> org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1985) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.transfer(DFSOutputStream.java:1075) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:1042) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1186) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.processDatanodeError(DFSOutputStream.java:935) >>>>>>>> at >>>>>>>> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:491) >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Fri, Aug 14, 2015 at 3:54 PM, Everett Anderson <[email protected] >>>>>>>> > wrote: >>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Fri, Aug 14, 2015 at 3:26 PM, Josh Wills <[email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hey Everett, >>>>>>>>>> >>>>>>>>>> Initial thought-- there are lots of reasons for lease expired >>>>>>>>>> exceptions, and their usually more symptomatic of other problems in >>>>>>>>>> the >>>>>>>>>> pipeline. Are you sure none of the jobs in the Crunch pipeline on the >>>>>>>>>> non-SSD instances are failing for some other reason? I'd be >>>>>>>>>> surprised if no >>>>>>>>>> other errors showed up in the app master, although there are reports >>>>>>>>>> of >>>>>>>>>> some weirdness around LeaseExpireds when writing to S3-- but you're >>>>>>>>>> not >>>>>>>>>> doing that here, right? >>>>>>>>>> >>>>>>>>> >>>>>>>>> We're reading from and writing to HDFS, here. (We've copied in >>>>>>>>> input from S3 to HDFS in another step.) >>>>>>>>> >>>>>>>>> There are a few exceptions in the logs. Most seem related to >>>>>>>>> missing temp files. >>>>>>>>> >>>>>>>>> Let me see if I can reproduce it with crunch.max.running.jobs set >>>>>>>>> to 1 to try to narrow down the originating failure. >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>> J >>>>>>>>>> >>>>>>>>>> On Fri, Aug 14, 2015 at 2:10 PM, Everett Anderson < >>>>>>>>>> [email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, >>>>>>>>>>> >>>>>>>>>>> I recently started trying to run our Crunch pipeline on more >>>>>>>>>>> data and have been trying out different AWS instance types in >>>>>>>>>>> anticipation >>>>>>>>>>> of our storage and compute needs. >>>>>>>>>>> >>>>>>>>>>> I was using EMR 3.8 (so Hadoop 2.4.0) with Crunch 0.12 (patched >>>>>>>>>>> with the CRUNCH-553 >>>>>>>>>>> <https://issues.apache.org/jira/browse/CRUNCH-553> fix). 
>>>>>>>>>>> >>>>>>>>>>> Our pipeline finishes fine in these cluster configurations: >>>>>>>>>>> >>>>>>>>>>> - 50 c3.4xlarge Core, 0 Task >>>>>>>>>>> - 10 c3.8xlarge Core, 0 Task >>>>>>>>>>> - 25 c3.8xlarge Core, 0 Task >>>>>>>>>>> >>>>>>>>>>> However, it always fails on the same data when using 10 >>>>>>>>>>> cc2.8xlarge Core instances. >>>>>>>>>>> >>>>>>>>>>> The biggest obvious hardware difference is that the cc2.8xlarges >>>>>>>>>>> use hard disks instead of SSDs. >>>>>>>>>>> >>>>>>>>>>> While it's a little hard to track down the exact originating >>>>>>>>>>> failure, I think it's from errors like: >>>>>>>>>>> >>>>>>>>>>> 2015-08-13 21:34:38,379 ERROR [IPC Server handler 24 on 45711] >>>>>>>>>>> org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: >>>>>>>>>>> attempt_1439499407003_0028_r_000153_1 - exited : >>>>>>>>>>> org.apache.crunch.CrunchRuntimeException: >>>>>>>>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): >>>>>>>>>>> No lease on >>>>>>>>>>> /tmp/crunch-970849245/p662/output/_temporary/1/_temporary/attempt_1439499407003_out7_0028_r_000153_1/out7-r-00153: >>>>>>>>>>> File does not exist. Holder >>>>>>>>>>> DFSClient_attempt_1439499407003_0028_r_000153_1_609888542_1 does >>>>>>>>>>> not have >>>>>>>>>>> any open files. >>>>>>>>>>> >>>>>>>>>>> Those paths look like these side effect files >>>>>>>>>>> <https://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/mapred/FileOutputFormat.html#getWorkOutputPath(org.apache.hadoop.mapred.JobConf)> >>>>>>>>>>> . >>>>>>>>>>> >>>>>>>>>>> Would Crunch have generated applications that depend on side >>>>>>>>>>> effect paths as input across MapReduce applications and something >>>>>>>>>>> in HDFS >>>>>>>>>>> is cleaning up those paths, unaware of the higher level >>>>>>>>>>> dependencies? AWS >>>>>>>>>>> configures Hadoop differently for each instance type, and might >>>>>>>>>>> have more >>>>>>>>>>> aggressive cleanup settings on HDs, though this is very uninformed >>>>>>>>>>> hypothesis. >>>>>>>>>>> >>>>>>>>>>> A sample full log is attached. >>>>>>>>>>> >>>>>>>>>>> Thanks for any guidance! >>>>>>>>>>> >>>>>>>>>>> - Everett >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> *DISCLAIMER:* The contents of this email, including any >>>>>>>>>>> attachments, may contain information that is confidential, >>>>>>>>>>> proprietary in >>>>>>>>>>> nature, protected health information (PHI), or otherwise protected >>>>>>>>>>> by law >>>>>>>>>>> from disclosure, and is solely for the use of the intended >>>>>>>>>>> recipient(s). If >>>>>>>>>>> you are not the intended recipient, you are hereby notified that >>>>>>>>>>> any use, >>>>>>>>>>> disclosure or copying of this email, including any attachments, is >>>>>>>>>>> unauthorized and strictly prohibited. If you have received this >>>>>>>>>>> email in >>>>>>>>>>> error, please notify the sender of this email. Please delete this >>>>>>>>>>> and all >>>>>>>>>>> copies of this email from your system. Any opinions either >>>>>>>>>>> expressed or >>>>>>>>>>> implied in this email and all attachments, are those of its author >>>>>>>>>>> only, >>>>>>>>>>> and do not necessarily reflect those of Nuna Health, Inc. 
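To make the side-effect-file question concrete: the paths in these errors (<output>/_temporary/1/_temporary/attempt_..._r_NNNNNN_N/outN-r-NNNNN) are task-attempt work directories managed by FileOutputCommitter, and Crunch's named outputs (out0, out7, out12, ...) are written there and only promoted to the real output directory when the attempt commits. The sketch below is not Crunch's code, just a minimal illustration of a reducer creating a "side effect" file in that work directory; the reducer and file name are hypothetical:

    import java.io.IOException;

    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    // Illustration only: a reducer that writes an extra file into its
    // task-attempt work directory. The resulting path has the same shape as
    // the ones in the errors above, e.g.
    //   <output>/_temporary/1/_temporary/attempt_..._r_000153_1/extra-side-output
    public class SideEffectReducer
        extends Reducer<Text, LongWritable, Text, LongWritable> {

      private FSDataOutputStream sideFile;

      @Override
      protected void setup(Context context) throws IOException, InterruptedException {
        // The attempt-scoped work directory managed by FileOutputCommitter.
        Path workDir = FileOutputFormat.getWorkOutputPath(context);
        FileSystem fs = workDir.getFileSystem(context.getConfiguration());
        sideFile = fs.create(new Path(workDir, "extra-side-output"));
      }

      @Override
      protected void cleanup(Context context) throws IOException, InterruptedException {
        // If the attempt directory (or the whole /tmp/crunch-* tree) has been
        // deleted out from under the task, this close() is where the
        // LeaseExpired / "File does not exist" errors surface.
        sideFile.close();
      }
    }

So a LeaseExpired or "File does not exist" on such a path at close time usually means the attempt's work directory was removed while a writer still had the file open, which is why the question of what is deleting /tmp/crunch-* (and when) matters here.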
-- Director of Data Science Cloudera <http://www.cloudera.com> Twitter: @josh_wills <http://twitter.com/josh_wills>
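For completeness, crunch.max.running.jobs (the setting that made the pipeline pass when forced to 1 earlier in the thread) is an ordinary property on the pipeline's Configuration. A minimal sketch of where it would be set, assuming the pipeline is built with MRPipeline in the usual way; the class name is made up:

    import org.apache.crunch.Pipeline;
    import org.apache.crunch.impl.mr.MRPipeline;
    import org.apache.hadoop.conf.Configuration;

    public class SerializedJobsSketch {
      public static void main(String[] args) {
        Configuration conf = new Configuration();
        // Run the planned MapReduce jobs one at a time instead of letting
        // independent jobs run concurrently.
        conf.setInt("crunch.max.running.jobs", 1);

        Pipeline pipeline = new MRPipeline(SerializedJobsSketch.class, conf);
        // ... build PCollections and outputs as usual ...
        pipeline.done();
      }
    }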
