I came across the following thread about dumping stack traces with Gradle
[1] and thought it might be worth sharing in case someone decides to
implement something along these lines for Calcite.
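
For reference, the "thread printing stack traces of other threads at some
intervals" approach mentioned further down in this thread could look
roughly like the sketch below. It is only an illustration: the class name
StackDumpWatchdog, its method names, and the output format are made up
here and are not part of Calcite or of the Gradle thread in [1]. It uses
only standard JDK calls (Thread.getAllStackTraces and a scheduled
executor).

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper (not part of Calcite): periodically prints all thread stacks. */
final class StackDumpWatchdog {
  private StackDumpWatchdog() {}

  /** Formats the current stack traces of all live threads in this JVM. */
  static String dumpAllThreads() {
    StringBuilder sb = new StringBuilder("=== thread dump ===\n");
    for (Map.Entry<Thread, StackTraceElement[]> e
        : Thread.getAllStackTraces().entrySet()) {
      Thread t = e.getKey();
      sb.append('"').append(t.getName()).append("\" state=")
          .append(t.getState()).append('\n');
      for (StackTraceElement frame : e.getValue()) {
        sb.append("    at ").append(frame).append('\n');
      }
    }
    return sb.toString();
  }

  /**
   * Starts a daemon thread that prints a dump to stderr after initialDelay
   * and then once every period.
   */
  static ScheduledExecutorService start(long initialDelay, long period, TimeUnit unit) {
    ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
          Thread t = new Thread(r, "stack-dump-watchdog");
          t.setDaemon(true); // must not keep the JVM alive once the tests finish
          return t;
        });
    scheduler.scheduleAtFixedRate(
        () -> System.err.println(dumpAllThreads()), initialDelay, period, unit);
    return scheduler;
  }
}
```

Calling StackDumpWatchdog.start(1, 1, TimeUnit.HOURS) once from the test
setup would do nothing for jobs that finish within the hour and would
print a dump for the ones that hang, which is the same idea as the hourly
jstack, just from inside the test JVM.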

Best,
Stamatis

[1] https://discuss.gradle.org/t/dump-stack-trace-for-tests/33524

On Mon, Dec 13, 2021 at 6:26 PM Jacques Nadeau <[email protected]> wrote:

> I wonder if we can create a simple shell script that runs jstack once an
> hour (starting after one hour) and then run it using
> https://github.com/psxpaul/gradle-execfork-plugin? Since none of our jobs
> run for an hour, most of the time it wouldn't do anything. In the cases where
> the job hung, we'd hopefully get a jstack.
>
>
> On Mon, Dec 13, 2021 at 12:17 AM Stamatis Zampetakis <[email protected]>
> wrote:
>
> > If there is a systematic way to do it I would be interested to know.
> >
> > In the past, when I encountered similar hangs in CI, what I ended up doing
> > was adding debugging commits to the PR with a thread that printed stack
> > traces of the other threads at regular intervals.
> >
> > Best,
> > Stamatis
> >
> > On Sun, Dec 12, 2021 at 7:00 PM Jacques Nadeau <[email protected]>
> > wrote:
> >
> > > It could be infra, but I'm wondering if it is some kind of concurrency
> > > bug.
> > >
> > > Does anyone know of a straightforward way to add a secondary process in
> > > a GitHub workflow that takes a jstack after an hour or so (if the
> > > tests run that long)? Trying to jump on an instance when this happens and
> > > do this manually sounds like an exercise in frustration.
> > >
> > > I guess another option would be to modify the Druid job to provide info on
> > > the tests that are running so that we can see if it always locks up on the
> > > same test.
> > >
> > > On Sat, Dec 11, 2021 at 11:39 PM Alessandro Solimando <
> > > [email protected]> wrote:
> > >
> > > > I started noticing that intermittently around a month ago. I had a quick
> > > > look back then, but I could not pinpoint the root cause.
> > > >
> > > > I don't think it is expected, and I guess it comes from the test infra
> > > > setup rather than the Calcite code itself.
> > > >
> > > > On Sun, Dec 12, 2021, 05:43 Jacques Nadeau <[email protected]> wrote:
> > > >
> > > > > I see a couple of recent builds with Druid tests hanging. Is that a
> > > > > normal thing or something that has started recently?
> > > > >
> > > > > Examples:
> > > > >
> > >
> https://github.com/apache/calcite/runs/4487013505?check_suite_focus=true
> > > > >
> > > >
> > >
> >
> https://github.com/jacques-n/calcite/runs/4494836558?check_suite_focus=true
> > > > >
> > > >
> > >
> >
>
