Thanks! We were able to access the logs the way you pointed out.

2016-06-15 12:33 GMT-03:00 Hitesh Shah <[email protected]>:
> If log aggregation is not enabled, the next best thing would be to
> download the application master logs from the RM UI for the apps in
> question. Those would provide a good starting point for figuring out what
> is going on.
>
> thanks
> -- Hitesh
>
> > On Jun 15, 2016, at 8:29 AM, Jose Rozanec <[email protected]> wrote:
> >
> > Hello,
> >
> > Here is an update. It seems we misunderstood something: Hive returned
> > an error for the query, while the Tez job kept running without reporting
> > progress. We did not cancel it, since it seemed to have hung. After two
> > hours it was reported as finished on the UI, while it still held the
> > running state when listed from YARN for some time more, and then finally
> > finished.
> > We have log aggregation enabled, but after the job finished, we still
> > get the same message as reported in the previous email.
> >
> > Now we will research why Hive detached from Tez while the job was still
> > running, and whether we can improve query accept times, since it is
> > taking a while to start executing complex queries.
> >
> > Thanks,
> >
> > 2016-06-15 12:09 GMT-03:00 Jose Rozanec <[email protected]>:
> > Hello,
> >
> > I ran the command, and got the following message:
> > 16/06/15 15:07:35 INFO impl.TimelineClientImpl: Timeline service address: http://ip-10-64-23-215.ec2.internal:8188/ws/v1/timeline/
> > 16/06/15 15:07:35 INFO client.RMProxy: Connecting to ResourceManager at ip-10-64-23-215.ec2.internal/10.64.23.215:8032
> > /var/log/hadoop-yarn/apps/hadoop/logs/application_1465996511770_0001 does not exist.
> > Log aggregation has not completed or is not enabled.
> >
> > I think we are missing some configuration that would help us get more
> > insight?
> >
> > Thanks!
> >
> > Joze.
> >
> > 2016-06-15 12:03 GMT-03:00 Hitesh Shah <[email protected]>:
> > Hello Joze,
> >
> > Would it be possible for you to provide the YARN application logs
> > obtained via "bin/yarn logs -applicationId <appId>" for both of the cases
> > you have seen? Feel free to file JIRAs and attach logs to each of them.
> >
> > thanks
> > -- Hitesh
> >
> > > On Jun 15, 2016, at 7:38 AM, Jose Rozanec <[email protected]> wrote:
> > >
> > > Hello,
> > >
> > > We are experiencing some issues with Tez 0.8.3 when we issue heavy
> > > queries from Hive. It seems some jobs hang on Tez and never return.
> > > Those jobs show up in the DAG web UI, but no progress is reported on
> > > the UI or in the Hive logs. Any ideas why this could happen? We see
> > > this happen with certain memory configurations; without them, the job
> > > dies soon (we guess due to OOM).
> > >
> > > Most probably not related to this, at some point we also got the
> > > following error: "org.apache.tez.dag.api.SessionNotRunning: TezSession
> > > has already shutdown. Application xxxxx failed 2 times due to AM
> > > Container". We are not sure whether this can be related to TEZ-2663,
> > > which should be fixed from version 0.7.1 onwards.
> > >
> > > Thanks in advance,
> > >
> > > Joze.
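For anyone who wants to script the check discussed in this thread, here is a minimal Python sketch. It wraps the `yarn logs -applicationId <appId>` command quoted above and looks for the "Log aggregation has not completed or is not enabled." message from the earlier output. The helper names (`logs_command`, `aggregation_missing`, `fetch_logs`) are hypothetical, and the sketch assumes the `yarn` binary is on the PATH of the machine where it runs.

```python
import subprocess

# Message that `yarn logs` printed in this thread when it found no aggregated logs.
NOT_AGGREGATED = "Log aggregation has not completed or is not enabled."


def logs_command(app_id, yarn_bin="yarn"):
    """Build the `yarn logs -applicationId <appId>` command line."""
    return [yarn_bin, "logs", "-applicationId", app_id]


def aggregation_missing(output):
    """Return True if the command output contains the 'not aggregated' message."""
    return NOT_AGGREGATED in output


def fetch_logs(app_id, yarn_bin="yarn"):
    """Run `yarn logs` and return its output, or None if no aggregated logs exist.

    Assumes `yarn_bin` is resolvable on this machine (hypothetical default).
    """
    result = subprocess.run(
        logs_command(app_id, yarn_bin),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # capture both streams in one text blob
        universal_newlines=True,
    )
    if aggregation_missing(result.stdout):
        return None
    return result.stdout
```

If `fetch_logs` returns None, the fallback suggested above still applies: download the application master logs from the RM UI for the application in question.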
