It looks like the issue is in MapChainVertex's InputInitializer.  It times
out after 3 minutes due to the HA issue.  Do they implement a custom
InputInitializer or use a custom InputFormat ? If they use a
custom InputInitializer then it might be a user-code issue.
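One thing worth checking on the HA side: with HA properly configured, the
client should fail over from the standby NameNode instead of surfacing a
StandbyException. The usual client-side settings in hdfs-site.xml look
roughly like this (the nameservice "mycluster" and NameNode ids nn1/nn2
below are placeholders, not values from this cluster):

  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>namenode1.example.com:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>namenode2.example.com:8020</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>

plus fs.defaultFS pointing at hdfs://mycluster. If the failover proxy
provider is missing from the configuration that the InputInitializer ends
up using, the client keeps hitting whichever NameNode it was pointed at,
active or not. Running "hdfs haadmin -getServiceState nn1" shows which
node is currently active.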


2016-02-16 12:05:21,435 INFO [InputInitializer [MapChainVertex:Export job
(349): ClaimLineExportToDB2#export(Disconnected record stream)
(2b86fca4-215f-47ba-9303-cbd2ad9d676e)] #0] graphv2.ClusterTaskRuntime:
Initializing cluster task configuration...
2016-02-16 12:05:21,437 INFO [InputInitializer [MapChainVertex:Export job
(349): ClaimLineExportToDB2#export(Disconnected record stream)
(2b86fca4-215f-47ba-9303-cbd2ad9d676e)] #0] util.LoggingUtil: Setting root
logger level for hadoop task to WARN:
2016-02-16 12:05:21,722 WARN [AsyncDispatcher event handler]
hdfs.ClientContext: Existing client context 'default' does not match
requested configuration.  Existing: shortCircuitStreamsCacheSize = 256,
shortCircuitStreamsCacheExpiryMs = 300000, shortCircuitMmapCacheSize = 256,
shortCircuitMmapCacheExpiryMs = 3600000, shortCircuitMmapCacheRetryTimeout
= 300000, shortCircuitCacheStaleThresholdMs = 1800000, socketCacheCapacity
= 16, socketCacheExpiry = 3000, shortCircuitLocalReads = false,
useLegacyBlockReaderLocal = false, domainSocketDataTraffic = false,
shortCircuitSharedMemoryWatcherInterruptCheckMs = 60000, Requested:
shortCircuitStreamsCacheSize = 4096, shortCircuitStreamsCacheExpiryMs =
300000, shortCircuitMmapCacheSize = 256, shortCircuitMmapCacheExpiryMs =
3600000, shortCircuitMmapCacheRetryTimeout = 300000,
shortCircuitCacheStaleThresholdMs = 1800000, socketCacheCapacity = 16,
socketCacheExpiry = 3000, shortCircuitLocalReads = true,
useLegacyBlockReaderLocal = false, domainSocketDataTraffic = false,
shortCircuitSharedMemoryWatcherInterruptCheckMs = 60000
*2016-02-16 12:05:21,732* WARN [AsyncDispatcher event handler] ipc.Client:
Exception encountered while connecting to the server :
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
Operation category READ is not supported in state standby
*2016-02-16 12:08:12,515* WARN [AsyncDispatcher event handler]
filesystem.FileSystemNirvana: No file-system provider configuration setup!
2016-02-16 12:08:12,554 WARN [AsyncDispatcher event handler] ipc.Client:
Exception encountered while connecting to the server :
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
Operation category READ is not supported in state standby

On Mon, Feb 29, 2016 at 8:14 PM, Johannes Zillmann <[email protected]
> wrote:

> Here you go:
>
>
> Johannes
>
> On 26 Feb 2016, at 02:56, Jeff Zhang <[email protected]> wrote:
>
> Is it possible for you to get the whole app log from your customer ?
>
> On Thu, Feb 25, 2016 at 6:26 PM, Johannes Zillmann <
> [email protected]> wrote:
>
>> Looks like there is the same warning in MR as well:
>>
>> 2016-02-23 11:59:24,568 WARN [main] org.apache.hadoop.ipc.Client:
>> Exception encountered while connecting to the server :
>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>> Operation category READ is not supported in state standby
>>
>> But in this case it doesn’t seem to have a bad effect…
>> Johannes
>>
>> On 25 Feb 2016, at 10:46, Jeff Zhang <[email protected]> wrote:
>>
>> Might be, does only tez job affected ? What about other jobs like MR ?
>>
>> On Thu, Feb 25, 2016 at 5:32 PM, Johannes Zillmann <
>> [email protected]> wrote:
>>
>>> So yes, I think HA is enabled.
>>> But this whole behaviour is not a one-shot thing; it happens repeatedly
>>> for this job. So my guess is it's rather an unhappy configuration than the
>>> name node switching again and again !?
>>>
>>> Johannes
>>>
>>>
>>> On 25 Feb 2016, at 10:29, Jeff Zhang <[email protected]> wrote:
>>>
>>> The standby warning is because it tries to connect to the standby node.
>>> (HA should be enabled, I think.)  Maybe this causes the missing output; could
>>> you attach the whole app log ?
>>>
>>> On Thu, Feb 25, 2016 at 5:03 PM, Johannes Zillmann <
>>> [email protected]> wrote:
>>>
>>>> Hey guys,
>>>>
>>>> I have a Tez job on a customer deployment which shows very strange
>>>> symptoms. Maybe you have some pointers for me on what to look into!
>>>>
>>>> App master log:
>>>> —————————————
>>>>
>>>> 2016-02-16 12:05:16,666 INFO [main] app.DAGAppMaster: Created DAGAppMaster 
>>>> for application appattempt_1455209830330_35852_000001, versionInfo=[ 
>>>> component=tez-dag, version=0.6.0, 
>>>> revision=25e41bc0f29f5e25adbfc17258b00ad702a17888, 
>>>> SCM-URL=scm:git:https://git-wip-us.apache.org/repos/asf/tez.git, 
>>>> buildTime=2015-03-03T18:01:25Z ]
>>>> 2016-02-16 12:05:16,807 INFO [main] Configuration.deprecation: 
>>>> fs.default.name is deprecated. Instead, use fs.defaultFS
>>>> 2016-02-16 12:05:16,816 INFO [main] Configuration.deprecation: 
>>>> dfs.umaskmode is deprecated. Instead, use fs.permissions.umask-mode
>>>> 2016-02-16 12:05:16,987 WARN [main] util.NativeCodeLoader: Unable to load 
>>>> native-hadoop library for your platform... using builtin-java classes 
>>>> where applicable
>>>> 2016-02-16 12:05:17,134 INFO [main] app.DAGAppMaster: Comparing client 
>>>> version with AM version, clientVersion=0.6.0, AMVersion=0.6.0
>>>> 2016-02-16 12:05:18,158 WARN [main] ipc.Client: Exception encountered 
>>>> while connecting to the server : 
>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>>>  Operation category READ is not supported in state standby
>>>> 2016-02-16 12:05:18,298 INFO [main] app.DAGAppMaster: Adding session token 
>>>> to jobTokenSecretManager for application
>>>> 2016-02-16 12:05:18,305 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.rm.container.AMContainerEventType for class 
>>>> org.apache.tez.dag.app.rm.container.AMContainerMap
>>>> 2016-02-16 12:05:18,306 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.rm.node.AMNodeEventType for class 
>>>> org.apache.tez.dag.app.rm.node.AMNodeTracker
>>>> 2016-02-16 12:05:18,308 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.dag.event.DAGAppMasterEventType for class 
>>>> org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterEventHandler
>>>> 2016-02-16 12:05:18,309 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.dag.event.DAGEventType for class 
>>>> org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher
>>>> 2016-02-16 12:05:18,310 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.dag.event.VertexEventType for class 
>>>> org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher
>>>> 2016-02-16 12:05:18,311 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.dag.event.TaskEventType for class 
>>>> org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher
>>>> 2016-02-16 12:05:18,312 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.dag.event.TaskAttemptEventType for class 
>>>> org.apache.tez.dag.app.DAGAppMaster$TaskAttemptEventDispatcher
>>>> 2016-02-16 12:05:18,312 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.dag.event.SpeculatorEventType for class 
>>>> org.apache.tez.dag.app.DAGAppMaster$4
>>>> 2016-02-16 12:05:18,312 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.dag.event.SpeculatorEventType for class 
>>>> org.apache.hadoop.yarn.event.AsyncDispatcher$GenericEventHandler
>>>> 2016-02-16 12:05:18,357 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.rm.AMSchedulerEventType for class 
>>>> org.apache.tez.dag.app.rm.TaskSchedulerEventHandler
>>>> 2016-02-16 12:05:18,359 INFO [main] event.AsyncDispatcher: Registering 
>>>> class org.apache.tez.dag.app.rm.NMCommunicatorEventType for class 
>>>> org.apache.tez.dag.app.launcher.ContainerLauncherImpl
>>>> 2016-02-16 12:05:18,483 INFO [main] node.AMNodeTracker: 
>>>> blacklistDisablePercent is 33, blacklistingEnabled: true, 
>>>> maxTaskFailuresPerNode: 10
>>>> 2016-02-16 12:05:18,485 INFO [main] launcher.ContainerLauncherImpl: Upper 
>>>> limit on the thread pool size is 500
>>>> 2016-02-16 12:05:18,485 INFO [main] history.HistoryEventHandler: 
>>>> Initializing HistoryEventHandler
>>>> 2016-02-16 12:05:18,489 INFO [main] impl.SimpleHistoryLoggingService: Log 
>>>> file location for SimpleHistoryLoggingService not specified, defaulting to 
>>>> containerLogDir=/u10/hadoop/yarn/log/application_1455209830330_35852/container_1455209830330_35852_01_000001
>>>> 2016-02-16 12:05:18,490 INFO [main] impl.SimpleHistoryLoggingService: 
>>>> Initializing SimpleHistoryLoggingService, 
>>>> logFileLocation=/u10/hadoop/yarn/log/application_1455209830330_35852/container_1455209830330_35852_01_000001/history.txt.appattempt_1455209830330_35852_000001,
>>>>  maxErrors=10
>>>> 2016-02-16 12:05:18,490 INFO [main] recovery.RecoveryService: Initializing 
>>>> RecoveryService
>>>> 2016-02-16 12:05:18,491 INFO [main] history.HistoryEventHandler: 
>>>> [HISTORY][DAG:N/A][Event:APP_LAUNCHED]: 
>>>> applicationId=application_1455209830330_35852, 
>>>> appSubmitTime=1455642293317, launchTime=1455642316658
>>>> 2016-02-16 12:05:18,492 INFO [main] history.HistoryEventHandler: 
>>>> [HISTORY][DAG:N/A][Event:AM_LAUNCHED]: 
>>>> appAttemptId=appattempt_1455209830330_35852_000001, 
>>>> appSubmitTime=1455642293317,
>>>>
>>>>
>>>> … user specific code
>>>>
>>>> 2016-02-16 12:05:19,343 INFO [IPC Server handler 0 on 13239] 
>>>> app.DAGAppMaster: Starting DAG submitted via RPC: Export job (349): 
>>>> ClaimLineExportToDB2#export(Disconnected record 
>>>> stream)#export(Disconnected record (d4e9fda0-0f6c-4e63-8d5b-e846334affdf)
>>>> 2016-02-16 12:05:19,347 INFO [IPC Server handler 0 on 13239] 
>>>> Configuration.deprecation: mapred.output.compress is deprecated. Instead, 
>>>> use mapreduce.output.fileoutputformat.compress
>>>> 2016-02-16 12:05:19,347 INFO [IPC Server handler 0 on 13239] 
>>>> Configuration.deprecation: fs.default.name is deprecated. Instead, use 
>>>> fs.defaultFS
>>>> 2016-02-16 12:05:19,348 INFO [IPC Server handler 0 on 13239] 
>>>> Configuration.deprecation: mapred.map.tasks.speculative.execution is 
>>>> deprecated. Instead, use mapreduce.map.speculative
>>>> 2016-02-16 12:05:19,348 INFO [IPC Server handler 0 on 13239] 
>>>> Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use 
>>>> mapreduce.job.reduces
>>>> 2016-02-16 12:05:19,349 INFO [IPC Server handler 0 on 13239] 
>>>> Configuration.deprecation: mapred.min.split.size is deprecated. Instead, 
>>>> use mapreduce.input.fileinputformat.split.minsize
>>>> 2016-02-16 12:05:19,349 INFO [IPC Server handler 0 on 13239] 
>>>> Configuration.deprecation: mapred.reduce.tasks.speculative.execution is 
>>>> deprecated. Instead, use mapreduce.reduce.speculative
>>>> 2016-02-16 12:05:19,351 INFO [IPC Server handler 0 on 13239] 
>>>> Configuration.deprecation: mapred.output.compression.type is deprecated. 
>>>> Instead, use mapreduce.output.fileoutputformat.compress.type
>>>> 2016-02-16 12:05:19,352 INFO [IPC Server handler 0 on 13239] 
>>>> Configuration.deprecation: mapred.max.split.size is deprecated. Instead, 
>>>> use mapreduce.input.fileinputformat.split.maxsize
>>>> 2016-02-16 12:05:19,354 INFO [IPC Server handler 0 on 13239] 
>>>> Configuration.deprecation: mapred.compress.map.output is deprecated. 
>>>> Instead, use mapreduce.map.output.compress
>>>> 2016-02-16 12:05:19,480 INFO [IPC Server handler 0 on 13239] 
>>>> app.DAGAppMaster: Generating DAG graphviz file, 
>>>> dagId=dag_1455209830330_35852_1, 
>>>> filePath=/u01/hadoop/yarn/log/application_1455209830330_35852/container_1455209830330_35852_01_000001/dag_1455209830330_35852_1.dot
>>>> 2016-02-16 12:05:19,482 INFO [IPC Server handler 0 on 13239] 
>>>> common.TezUtilsInternal: Redirecting log file based on addend: 
>>>> dag_1455209830330_35852_1
>>>>
>>>> —————————————
>>>>
>>>> DAG-log:
>>>> --------------------------
>>>>
>>>> 2016-02-16 12:05:19,482 INFO [IPC Server handler 0 on 13239] 
>>>> app.DAGAppMaster: Running DAG: Export job (349): 
>>>> ClaimLineExportToDB2#export(Disconnected record 
>>>> stream)#export(Disconnected record (d4e9fda0-0f6c-4e63-8d5b-e846334affdf)
>>>> 2016-02-16 12:05:19,966 INFO [HistoryEventHandlingThread] 
>>>> impl.SimpleHistoryLoggingService: Writing event DAG_SUBMITTED to history 
>>>> file
>>>>
>>>> … user specific code
>>>>
>>>> 2016-02-16 12:05:21,437 INFO [InputInitializer [MapChainVertex:Export job 
>>>> (349): ClaimLineExportToDB2#export(Disconnected record stream) 
>>>> (2b86fca4-215f-47ba-9303-cbd2ad9d676e)] #0] util.LoggingUtil: Setting root 
>>>> logger level for hadoop task to WARN:
>>>> 2016-02-16 12:05:21,722 WARN [AsyncDispatcher event handler] 
>>>> hdfs.ClientContext: Existing client context 'default' does not match 
>>>> requested configuration.  Existing: shortCircuitStreamsCacheSize = 256, 
>>>> shortCircuitStreamsCacheExpiryMs = 300000, shortCircuitMmapCacheSize = 
>>>> 256, shortCircuitMmapCacheExpiryMs = 3600000, 
>>>> shortCircuitMmapCacheRetryTimeout = 300000, 
>>>> shortCircuitCacheStaleThresholdMs = 1800000, socketCacheCapacity = 16, 
>>>> socketCacheExpiry = 3000, shortCircuitLocalReads = false, 
>>>> useLegacyBlockReaderLocal = false, domainSocketDataTraffic = false, 
>>>> shortCircuitSharedMemoryWatcherInterruptCheckMs = 60000, Requested: 
>>>> shortCircuitStreamsCacheSize = 4096, shortCircuitStreamsCacheExpiryMs = 
>>>> 300000, shortCircuitMmapCacheSize = 256, shortCircuitMmapCacheExpiryMs = 
>>>> 3600000, shortCircuitMmapCacheRetryTimeout = 300000, 
>>>> shortCircuitCacheStaleThresholdMs = 1800000, socketCacheCapacity = 16, 
>>>> socketCacheExpiry = 3000, shortCircuitLocalReads = true, 
>>>> useLegacyBlockReaderLocal = false, domainSocketDataTraffic = false, 
>>>> shortCircuitSharedMemoryWatcherInterruptCheckMs = 60000
>>>> 2016-02-16 12:05:21,732 WARN [AsyncDispatcher event handler] ipc.Client: 
>>>> Exception encountered while connecting to the server : 
>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>>>  Operation category READ is not supported in state standby
>>>> 2016-02-16 12:08:12,554 WARN [AsyncDispatcher event handler] ipc.Client: 
>>>> Exception encountered while connecting to the server : 
>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>>>  Operation category READ is not supported in state standby
>>>>
>>>> --------------------------
>>>>
>>>>
>>>> Latest line of the (single) task:
>>>> --------------------------
>>>>
>>>> ...
>>>>
>>>> 2016-02-16 12:07:00,166 INFO [Initializer 1] 
>>>> runtime.LogicalIOProcessorRuntimeTask: Initializing Output using 
>>>> OutputSpec: { destinationVertexName=output, physicalEdgeCount=0, 
>>>> outputClassName=datameer.plugin.tez.output.DoNothingOutput }
>>>> 2016-02-16 12:07:00,168 INFO [Initializer 0] 
>>>> runtime.LogicalIOProcessorRuntimeTask: Creating Input
>>>> 2016-02-16 12:07:00,168 INFO [Initializer 1] 
>>>> runtime.LogicalIOProcessorRuntimeTask: Creating Output
>>>> 2016-02-16 12:07:00,175 INFO [Initializer 1] 
>>>> runtime.LogicalIOProcessorRuntimeTask: Initializing Output with dest edge: 
>>>> output
>>>> 2016-02-16 12:07:00,176 INFO [Initializer 0] 
>>>> runtime.LogicalIOProcessorRuntimeTask: Initializing Input with src edge: 
>>>> Input:Export job (349): ClaimLineExportToDB2#export(Disconnected record 
>>>> stream)
>>>> 2016-02-16 12:07:00,176 INFO [Initializer 0] 
>>>> resources.MemoryDistributor$RequestorInfo: Received request: 0, type: 
>>>> INPUT, componentVertexName: Input:Export job (349): 
>>>> ClaimLineExportToDB2#export(Disconnected record stream)
>>>> 2016-02-16 12:07:00,176 INFO [Initializer 1] 
>>>> resources.MemoryDistributor$RequestorInfo: Received request: 0, type: 
>>>> OUTPUT, componentVertexName: output
>>>> 2016-02-16 12:07:00,177 INFO [Initializer 1] 
>>>> runtime.LogicalIOProcessorRuntimeTask: Initialized Output with dest edge: 
>>>> output
>>>>
>>>>
>>>> … user specific code
>>>>
>>>> 2016-02-16 12:07:01,871 INFO [TezChild] util.LoggingUtil: Setting root
>>>> logger level for hadoop task to WARN:
>>>>
>>>> 2016-02-16 12:07:03,990 WARN [TezChild] ipc.Client: Exception encountered 
>>>> while connecting to the server : 
>>>> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException):
>>>>  Operation category READ is not supported in state standby
>>>>
>>>> —————————————
>>>>
>>>> The client log contains a shutdown exception:
>>>> —————————————
>>>>
>>>> INFO [2016-02-16 12:13:30.412] [MrPlanRunnerV2] (TezClient.java:470) - 
>>>> Shutting down Tez Session, sessionName=Export job (349): 
>>>> ClaimLineExportToDB2#export(Disconnected record 
>>>> stream)#export(Disconnected record (d4e9fda0-0f6c-4e63-8d5b-e846334affdf), 
>>>> applicationId=application_1455209830330_35852
>>>>  INFO [2016-02-16 12:13:30.581] [MrPlanRunnerV2] (TezClientUtils.java:838) 
>>>> - Application not running, applicationId=application_1455209830330_35852, 
>>>> yarnApplicationState=FINISHED, finalApplicationStatus=SUCCEEDED, 
>>>> trackingUrl=http://xxx:8088/proxy/application_1455209830330_35852/, 
>>>> diagnostics=Session stats:submittedDAGs=1, successfulDAGs=1, failedDAGs=0, 
>>>> killedDAGs=0
>>>>
>>>>  INFO [2016-02-16 12:13:30.581] [MrPlanRunnerV2] (TezClient.java:484) - 
>>>> Failed to shutdown Tez Session via proxy
>>>> org.apache.tez.dag.api.SessionNotRunning: Application not running, 
>>>> applicationId=application_1455209830330_35852, 
>>>> yarnApplicationState=FINISHED, finalApplicationStatus=SUCCEEDED, 
>>>> trackingUrl=http://xxx:8088/proxy/application_1455209830330_35852/, 
>>>> diagnostics=Session stats:submittedDAGs=1, successfulDAGs=1, failedDAGs=0, 
>>>> killedDAGs=0
>>>>
>>>>    at 
>>>> org.apache.tez.client.TezClientUtils.getSessionAMProxy(TezClientUtils.java:839)
>>>>    at org.apache.tez.client.TezClient.getSessionAMProxy(TezClient.java:669)
>>>>    at org.apache.tez.client.TezClient.stop(TezClient.java:476)
>>>>    at 
>>>> datameer.plugin.tez.session.TezSessionImpl.close(TezSessionImpl.java:69)
>>>>    at 
>>>> datameer.plugin.tez.session.PoolingTezSessionFactory.shutdownSession(PoolingTezSessionFactory.java:173)
>>>>    at 
>>>> datameer.plugin.tez.session.PoolingTezSessionFactory.shutdownSessions(PoolingTezSessionFactory.java:167)
>>>>    at 
>>>> datameer.plugin.tez.session.PoolingTezSessionFactory.close(PoolingTezSessionFactory.java:162)
>>>>    at datameer.com.google.common.io.Closer.close(Closer.java:214)
>>>>    at 
>>>> datameer.plugin.tez.smart.SmartClusterJobFlowResolver.close(SmartClusterJobFlowResolver.java:147)
>>>>    at datameer.com.google.common.io.Closer.close(Closer.java:214)
>>>>    at datameer.dap.sdk.util.IoUtil.close(IoUtil.java:330)
>>>>    at 
>>>> datameer.dap.common.graphv2.mixedframework.MixedClusterSession.commitJob(MixedClusterSession.java:65)
>>>>    at 
>>>> datameer.dap.common.graphv2.ClusterSession.commitJob(ClusterSession.java:81)
>>>>    at 
>>>> datameer.dap.common.graphv2.MrPlanRunnerV2.run(MrPlanRunnerV2.java:97)
>>>>    at 
>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>    at 
>>>> datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:96)
>>>>    at 
>>>> datameer.dap.common.security.DatameerSecurityService.runAsUser(DatameerSecurityService.java:173)
>>>>    at datameer.dap.common.security.RunAsThread.run(RunAsThread.java:22)
>>>>  INFO [2016-02-16 12:13:30.582] [MrPlanRunnerV2] (TezClient.java:489) - 
>>>> Could not connect to AM, killing session via YARN, sessionName=Export job 
>>>> (349): ClaimLineExportToDB2#export(Disconnected record 
>>>> stream)#export(Disconnected record (d4e9fda0-0f6c-4e63-8d5b-e846334affdf), 
>>>> applicationId=application_1455209830330_35852
>>>>
>>>>
>>>> —————————————
>>>>
>>>>
>>>> The counters and the events show that:
>>>> - the correct number of input records has been ingested
>>>> - the expected output records are missing (zero records were
>>>> produced, which is reflected in both the counters and the output
>>>> itself)
>>>> - for Tez, the task and the DAG are both treated as successful
>>>>
>>>> The customer says that the symptoms of the job are always the same:
>>>> - there is that standby warning
>>>> - there is the problem with the shutdown
>>>> - (and the data is missing)
>>>>
>>>> To me it's still unclear whether the strange behaviour comes from Tez
>>>> or from our application logic.
>>>> The same job succeeds when executed with pure MapReduce.
>>>> Is the standby thing something to worry about ?
>>>>
>>>> best
>>>> Johannes
>>>>
>>>
>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>>
>>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
>>
>>
>
>
> --
> Best Regards
>
> Jeff Zhang
>
>
>
>


-- 
Best Regards

Jeff Zhang
