Yes, log files and stacktraces are different things. A stacktrace shows the call hierarchy of all threads in a JVM at the moment it is taken, so you can see which method each thread is currently executing (and from where it was called). In case of a deadlock, you can see where the program is waiting.
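A thread dump can also be taken programmatically from inside the JVM with the standard Thread.getAllStackTraces() API. A minimal sketch (the class name ThreadDump is made up for illustration; the output format loosely mimics jstack):

```java
import java.util.Map;

public class ThreadDump {

    /** Prints the name, state, and stack of every live thread in this JVM. */
    public static void dump() {
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            System.out.println("\"" + thread.getName() + "\" state=" + thread.getState());
            for (StackTraceElement frame : entry.getValue()) {
                System.out.println("    at " + frame);
            }
            System.out.println();
        }
    }

    public static void main(String[] args) {
        dump();
    }
}
```

Calling dump() from a watchdog thread while the job hangs would show where each task thread is blocked, similar to the trace below.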
The stack you sent is only a part of the complete stacktrace. Most IDEs have a feature to take a stacktrace while they are executing a program.

2016-09-23 11:43 GMT+02:00 Yassine MARZOUGUI <y.marzou...@mindlytix.com>:

> Hi Fabian,
>
> Not sure if this answers your question; here is the stack I got when
> debugging the combine and datasource operators when the job got stuck:
>
> "DataSource (at main(BatchTest.java:28) (org.apache.flink.api.java.io.TupleCsvInputFormat)) (1/8)"
>    at java.lang.Object.wait(Object.java)
>    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:163)
>    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
>    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:93)
>    at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
>    at org.apache.flink.runtime.operators.util.metrics.CountingCollector.collect(CountingCollector.java:35)
>    at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:163)
>    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
>    at java.lang.Thread.run(Thread.java:745)
>
> "Combine (GroupReduce at first(DataSet.java:573)) (1/8)"
>    at java.lang.Object.wait(Object.java)
>    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBuffer(LocalBufferPool.java:163)
>    at org.apache.flink.runtime.io.network.buffer.LocalBufferPool.requestBufferBlocking(LocalBufferPool.java:133)
>    at org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:93)
>    at org.apache.flink.runtime.operators.shipping.OutputCollector.collect(OutputCollector.java:65)
>    at org.apache.flink.api.java.functions.FirstReducer.reduce(FirstReducer.java:41)
>    at org.apache.flink.api.java.functions.FirstReducer.combine(FirstReducer.java:52)
>    at org.apache.flink.runtime.operators.AllGroupReduceDriver.run(AllGroupReduceDriver.java:152)
>    at org.apache.flink.runtime.operators.BatchTask.run(BatchTask.java:486)
>    at org.apache.flink.runtime.operators.BatchTask.invoke(BatchTask.java:351)
>    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584)
>    at java.lang.Thread.run(Thread.java:745)
>
> Best,
> Yassine
>
>
> 2016-09-23 11:28 GMT+02:00 Yassine MARZOUGUI <y.marzou...@mindlytix.com>:
>
>> Hi Fabian,
>>
>> Is it different from the output I already sent? (see attached file). If
>> yes, how can I obtain the stacktrace of the job programmatically? Thanks.
>>
>> Best,
>> Yassine
>>
>> 2016-09-23 10:55 GMT+02:00 Fabian Hueske <fhue...@gmail.com>:
>>
>>> Hi Yassine, can you share a stacktrace of the job when it got stuck?
>>>
>>> Thanks, Fabian
>>>
>>> 2016-09-22 14:03 GMT+02:00 Yassine MARZOUGUI <y.marzou...@mindlytix.com>:
>>>
>>>> The input splits are correctly assigned. I noticed that whenever the
>>>> job is stuck, it is because the task *Combine (GroupReduce at
>>>> first(DataSet.java:573))* keeps RUNNING and never switches to FINISHED.
>>>> I tried to debug the program at the *first(100)*, but I couldn't do
>>>> much. I attached the full DEBUG output.
>>>>
>>>> 2016-09-22 12:10 GMT+02:00 Robert Metzger <rmetz...@apache.org>:
>>>>
>>>>> Can you try running with DEBUG logging level?
>>>>> Then you should see if input splits are assigned.
>>>>> Also, you could try to use a debugger to see what's going on.
>>>>>
>>>>> On Mon, Sep 19, 2016 at 2:04 PM, Yassine MARZOUGUI <
>>>>> y.marzou...@mindlytix.com> wrote:
>>>>>
>>>>>> Hi Chesnay,
>>>>>>
>>>>>> I am running Flink 1.1.2, and using NetBeans 8.1.
>>>>>> I made a screencast reproducing the problem here:
>>>>>> http://recordit.co/P53OnFokN4 <http://recordit.co/VRBpBlb51A>.
>>>>>>
>>>>>> Best,
>>>>>> Yassine
>>>>>>
>>>>>>
>>>>>> 2016-09-19 10:04 GMT+02:00 Chesnay Schepler <ches...@apache.org>:
>>>>>>
>>>>>>> No, I can't recall that I had this happen to me.
>>>>>>>
>>>>>>> I would enable logging and try again, as well as check whether
>>>>>>> the second job is actually running through the WebInterface.
>>>>>>>
>>>>>>> If you tell me your NetBeans version I can try to reproduce it.
>>>>>>>
>>>>>>> Also, which version of Flink are you using?
>>>>>>>
>>>>>>>
>>>>>>> On 19.09.2016 07:45, Aljoscha Krettek wrote:
>>>>>>>
>>>>>>> Hmm, this sounds like it could be IDE/Windows specific; unfortunately
>>>>>>> I don't have access to a Windows machine. I'll loop in Chesnay, who is
>>>>>>> using Windows.
>>>>>>>
>>>>>>> Chesnay, do you maybe have an idea what could be the problem? Have
>>>>>>> you ever encountered this?
>>>>>>>
>>>>>>> On Sat, 17 Sep 2016 at 15:30 Yassine MARZOUGUI <
>>>>>>> y.marzou...@mindlytix.com> wrote:
>>>>>>>
>>>>>>>> Hi Aljoscha,
>>>>>>>>
>>>>>>>> Thanks for your response. By "the first time" I mean the first time
>>>>>>>> I hit Run from the IDE (I am using NetBeans on Windows) after building
>>>>>>>> the program. If I then stop it and run it again (without rebuilding),
>>>>>>>> it is stuck in the state RUNNING. Sometimes I have to rebuild it, or
>>>>>>>> close the IDE, to be able to get an output. The behaviour is random;
>>>>>>>> maybe it's related to the IDE or the OS and not necessarily Flink
>>>>>>>> itself.
>>>>>>>>
>>>>>>>> On Sep 17, 2016 15:16, "Aljoscha Krettek" <aljos...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>> when is the "first time"? It seems you have tried this repeatedly,
>>>>>>>>> so what differentiates a "first time" from the other times? Are you
>>>>>>>>> closing your IDE in between, or do you mean running the job a second
>>>>>>>>> time within the same program?
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>> Aljoscha
>>>>>>>>>
>>>>>>>>> On Fri, 9 Sep 2016 at 16:40 Yassine MARZOUGUI <
>>>>>>>>> y.marzou...@mindlytix.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> When I run the following batch job inside the IDE for the first
>>>>>>>>>> time, it outputs results and switches to FINISHED, but when I run
>>>>>>>>>> it again it is stuck in the state RUNNING. The csv file size is
>>>>>>>>>> 160 MB. What could be the reason for this behaviour?
>>>>>>>>>>
>>>>>>>>>> public class BatchJob {
>>>>>>>>>>
>>>>>>>>>>     public static void main(String[] args) throws Exception {
>>>>>>>>>>         final ExecutionEnvironment env =
>>>>>>>>>>                 ExecutionEnvironment.getExecutionEnvironment();
>>>>>>>>>>
>>>>>>>>>>         env.readCsvFile("dump.csv")
>>>>>>>>>>                 .ignoreFirstLine()
>>>>>>>>>>                 .fieldDelimiter(";")
>>>>>>>>>>                 .includeFields("111000")
>>>>>>>>>>                 .types(String.class, String.class, String.class)
>>>>>>>>>>                 .first(100)
>>>>>>>>>>                 .print();
>>>>>>>>>>     }
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Yassine
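As a follow-up to Robert's suggestion earlier in the thread: when running inside the IDE, the log level is controlled by the log4j.properties file on the classpath (Flink 1.1 ships with log4j 1.x). A minimal sketch that switches everything to DEBUG; the appender name "console" and the layout pattern here are illustrative choices, not Flink defaults:

```properties
# Send all log output at DEBUG level and above to the console
log4j.rootLogger=DEBUG, console

log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p %-60c %x - %m%n
```

With this in src/main/resources, the DEBUG output should show whether input splits are being assigned when the job appears stuck.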