Thanks for the additional info. No insight yet.
Please try the following:
1. clean system
2. start jconsole
3. 2!:6'' NB. pid for future reference
4. qrun until hung
5. windows command window:
tasklist /FI "imagename eq jconsole.exe"
6. above should list the pid from earlier - and perhaps other pids
7. windows command window:
taskkill /PID nnnnn - where nnnnn is NOT the pid reported by 2!:6''
8. this should let the main task run again and perhaps give more info
I would expect this to
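
Steps 5 to 7 above could also be scripted. What follows is only a rough sketch, not part of jcs or qrun: it assumes tasklist is run with its /FO CSV and /NH options so the output is easy to parse, the helper names (parse_pids, kill_commands, list_jconsole) are made up for illustration, and the PIDs in SAMPLE are invented. It only builds the taskkill commands; running them is left to you.

```python
import csv
import io
import subprocess


def parse_pids(csv_text):
    """Return the PIDs found in `tasklist /FO CSV /NH` output."""
    pids = []
    for row in csv.reader(io.StringIO(csv_text)):
        # Rows look like: "jconsole.exe","1234","Console","1","12,345 K"
        # Anything else (e.g. the "INFO: No tasks..." message) is skipped.
        if len(row) >= 2 and row[1].strip().isdigit():
            pids.append(int(row[1]))
    return pids


def kill_commands(csv_text, main_pid):
    """Build one taskkill command per listed PID other than main_pid."""
    return [["taskkill", "/PID", str(pid)]
            for pid in parse_pids(csv_text) if pid != main_pid]


def list_jconsole():
    """Run tasklist (Windows only) and return its CSV output as text."""
    result = subprocess.run(
        ["tasklist", "/FI", "imagename eq jconsole.exe", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True)
    return result.stdout


# Invented sample output; 4120 stands in for the pid reported by 2!:6''
SAMPLE = (
    '"jconsole.exe","4120","Console","1","18,204 K"\n'
    '"jconsole.exe","7788","Console","1","9,112 K"\n'
)

if __name__ == "__main__":
    for cmd in kill_commands(SAMPLE, main_pid=4120):
        print(" ".join(cmd))
```

On the hung machine, replace SAMPLE with list_jconsole() and pass the pid noted in step 3 as main_pid.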
On Wed, Oct 4, 2017 at 1:35 PM, 'Pascal Jasmin' via Programming <
[email protected]> wrote:
> sometimes disabling smt in bios can increase performance or avoid such
> problems (I didn't do this, but ran with 5 threads, i.e. fewer than cores)
>
> following sequence,
>
> jconsole
> 99 5 2 fine
> 99 5 3 fine
> 99 5 4 hangs at "end task 12"
> ctrl c no immediate result
>
> jqt,
> 99 5 2 finishes, and jconsole unhangs to produce ctrl c output:
>
> |break: cdx
> | r[check _1~:>{.r=.x cdx y
>
>
> in jqt, rerunning 99 5 3 several times: on the 4th try, with debug (ctrl-k)
> on, it hangs
>
> in jconsole (attempt to unblock jqt)
>
> qrun 99 5 2
> |port already in use in this task: assert
> | 'port already in use in this task' assert-.port e.>1{"1 jcs''
>
> this error never occurred before (when debug in jqt wasn't on).
>
> ________________________________
> From: Eric Iverson <[email protected]>
> To: Programming forum <[email protected]>
> Sent: Wednesday, October 4, 2017 12:58 PM
> Subject: Re: [Jprogramming] jcs/zmq addons updated
>
>
>
> Thanks for clarifying things.
>
> On your system, in a clean state, jconsole qrun 99 99 2 hangs.
>
> When you have the clean state hang in jconsole, please try ctrl+c (if you
> have not already done so) as this should break out of some socket hangs. If
> this breaks, it would provide important info.
>
> It would be useful if you could get the hang with smaller args. For
> example, can you get the hang with: qrun each 10#40 4 2
>
> Unfortunately I can not reproduce this on my windows system. I can loop
> through 100s of this test without problem. Also on Linux and OSX.
>
> On Wed, Oct 4, 2017 at 12:45 PM, 'Pascal Jasmin' via Programming <
> [email protected]> wrote:
>
> > running qrun in a single session hangs. One semi-solution that sometimes
> > works is to then launch another session (jqt or jconsole) and run qrun,
> > which will unhang the original session. If both sessions are hung,
> > launching a 3rd session may unfreeze them.
> >
> > A single run of 99 99 x does not always work. My initial claim that first
> > runs always worked was based on using a number of tasks lower than the
> > hardware SMT capabilities.
> >
> > after clean start in jconsole
> >
> > qrun 99 99 2
> >
> > hangs at
> >
> > "end task: 98"
> >
> > since this fails, I'm not trying the 5# or 10# version.
> >
> > with the above hung, doing the same run in jqt, in this case, failed to
> > unhang jconsole
> >
> > hangs at "end task: 13"
> > ________________________________
> >
> > From: Eric Iverson <[email protected]>
> > To: Programming forum <[email protected]>
> > Sent: Wednesday, October 4, 2017 12:28 PM
> > Subject: Re: [Jprogramming] jcs/zmq addons updated
> >
> >
> >
> > I am confused by your message.
> >
> > Are you trying to run qrun at the same time in different J sessions? This
> > will definitely not work and is not the intended use for qrun.
> >
> > We need to narrow down to a simple case that fails.
> >
> > You indicate you get failures in jconsole, so let's focus on that.
> >
> > I thought you had indicated that a single run always worked and that the
> > problem only occurred in repeated runs. If that is correct, then your test
> > must be something like the example I gave: qrun each 10#<99 99 2.
> >
> > Please give me the exact steps that fail and how it fails.
> >
> > For example:
> > 1. clean system start
> > 2. start jconsole
> > 3. load'~addons/net/jcs/jcs.ijs'
> > 4. load'~addons/net/jcs/qrun.ijs'
> > 5. qrun each 10#<99 99 2
> > 6. what happens?
> >
> >
> > On Wed, Oct 4, 2017 at 12:01 PM, 'Pascal Jasmin' via Programming <
> > [email protected]> wrote:
> >
> > > I also had the avast virus chest issue, reran tests with shields disabled,
> > > after restart.
> > >
> > >
> > > qrun 99 99 2 is the main test I've used. Though 99 11 has more success (I
> > > have a 6 core, 12 hyperthread AMD Ryzen processor), it still fails.
> > >
> > > the tests also fail in jconsole. There is "forward momentum" interaction
> > > between jqt and jconsole sessions running the same qrun parameters.
> > >
> > > I've tried the following modifications to kill__
> > >
> > >
> > > kill=: 3 : 0
> > > access=: su
> > > runa'exit 0'
> > > destroy''
> > > killp PORT
> > > if. IFQT do. wd 'msgs' end.
> > > i.0 0
> > > )
> > >
> > > though these modifications give no improvement, or potentially slightly
> > > worse, "getting through" performance.
> > >
> > >
> > > Engine: j806/j64avx/windows
> > > Beta-6: commercial/2017-09-26T14:05:48
> > > Library: 8.06.07
> > > Qt IDE: 1.6.1/5.6.3
> > > Platform: Win 64
> > > Installer: J806 install
> > > InstallPath: d:/j64-806
> > >
> > > ________________________________
> > > From: Eric Iverson <[email protected]>
> > > To: Programming forum <[email protected]>
> > > Sent: Wednesday, October 4, 2017 10:39 AM
> > > Subject: Re: [Jprogramming] jcs/zmq addons updated
> > >
> > >
> > >
> > > Pascal (qrun),
> > >
> > > I have run many tests on windows. The tests always run clean with jconsole
> > > and JHS. There have been a few hiccups with Jqt. A few hangs as you
> > > describe and one crash where avast put jqt.exe in its virus chest.
> > >
> > > Jqt is probably fine vs qrun but that is the only place I have seen
> > > problems with the latest code changes. A possible suspicion is wd'msgs'. I
> > > can't imagine why running a new Jqt session with qrun would have the effect
> > > you describe.
> > >
> > > Remember that the linger bug was fixed and so things run more reliably than
> > > in your tests with the first release.
> > >
> > > Please do the following:
> > > 1. let us know exactly what test you run (I use: qrun each 5#<99 99 2)
> > > 2. ensure you have the latest base, net, and qtide
> > > 3. run your tests in jconsole or JHS until you have a failure or are
> > > satisfied
> > > 4. run your tests in Jqt
> > > 5. let us know your findings
> > >
> > >
> > > On Wed, Oct 4, 2017 at 8:58 AM, 'Pascal Jasmin' via Programming <
> > > [email protected]> wrote:
> > >
> > > > was running with 1e2.
> > > >
> > > > The reason the different sessions were unblocking each other is that they
> > > > were using the same ports (as best as I can guess).
> > > >
> > > > qrun hard codes the start addresses.
> > > >
> > > >
> > > >
> > > > ________________________________
> > > > From: bill lam <[email protected]>
> > > > To: Programming forum <[email protected]>
> > > > Sent: Tuesday, October 3, 2017 10:55 PM
> > > > Subject: Re: [Jprogramming] jcs/zmq addons updated
> > > >
> > > >
> > > >
> > > > Let's take out the memory constraint factor first, say qrun with sentence
> > > > 1e3. I am not sure running in different jqt instances is a good idea since
> > > > the range of 100 ports used by jcs is hardcoded and is the same for each
> > > > jqt.
> > > >
> > > > On Oct 4, 2017 10:41 AM, "'Pascal Jasmin' via Programming" <
> > > > [email protected]> wrote:
> > > >
> > > > in a 4th jqt session, yes it hung on first run, though pretty far in.
> > > >
> > > > I started getting memory errors (without hanging), at 80 80, and 22 22. I
> > > > have 4 hung jqt sessions now, but any new one lets the others progress.
> > > > Task manager reports very low memory use.
> > > >
> > > > 99 11 finishes just fine. It seems that in order to unblock another
> > > > session, the tasks attempted have to number the same as in the blocked
> > > > session, and it has to make it up to (near) the blocked task number.
> > > >
> > > > ________________________________
> > > > From: bill lam <[email protected]>
> > > > To: Programming forum <[email protected]>
> > > > Sent: Tuesday, October 3, 2017 10:06 PM
> > > > Subject: Re: [Jprogramming] jcs/zmq addons updated
> > > >
> > > >
> > > >
> > > > Did qrun 99 99 hang in the first run?
> > > >
> > > >
> > > > On Oct 4, 2017 9:16 AM, "'Pascal Jasmin' via Programming" <
> > > > [email protected]> wrote:
> > > >
> > > > > qrun still hangs for me. Never on the first run though. In 5 of 6 tries,
> > > > > it hangs on the 3rd run. On the other it hung on the 2nd run. 3rd parameter
> > > > > always 6.
> > > > >
> > > > > I don't think I ever breached memory/swap limits in these or previous
> > > > > tests.
> > > > >
> > > > > I found a way to unhang it though.
> > > > >
> > > > > start 2nd jqt session, and run qrun in it. It may hang, but other session
> > > > > will unfreeze. If it did hang, then repeat in other session until both
> > > > > unfrozen. Though, doing this enough can result in both sessions frozen
> > > > > (especially if using uneven task balances)... A 3rd jqt session to the
> > > > > rescue of both frozen ones.
> > > > >
> > > > >
> > > > >
> > > > > the show command and immediate jqt console output is a nice change.
> > > > >
> > > > >
> > > > >
> > > > > ________________________________
> > > > > From: Eric Iverson <[email protected]>
> > > > > To: Programming forum <[email protected]>
> > > > > Sent: Tuesday, October 3, 2017 5:41 PM
> > > > > Subject: [Jprogramming] jcs/zmq addons updated
> > > > >
> > > > >
> > > > >
> > > > > A few cosmetic changes and perhaps fixes for qrun and related task
> > > > > problems.
> > > > >
> > > > >
> > > > > Note: qrun now defined in jcs/qrun.ijs
> > > > >
> > > > >
> > > > > The main problem was that a task ending could have a delayed close of the
> > > > > associated socket port and this could, depending on timing, prevent the
> > > > > proper start of the next task trying to use the same port.
> > > > >
> > > > > The jcs sockets now set LINGER 0. This should avoid that class of problem.
> > > > >
> > > > > Initial stress tests all run clean on Linux and Windows.
> > > > >
> > > > > The other problem was that a server error in qrun caused a hang. This
> > > > > wouldn't happen normally if the jobs were well defined and ran to
> > > > > completion. A way to trigger the qrun server error in Windows was to run a
> > > > > large number of tasks with large (memory consumption) jobs. This could
> > > > > exhaust windows swap memory and get an out-of-memory error.
> > > > >
> > > > > qrun now catches the server error, reports the lse error, and continues.
> > > > >
----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm