On Sep 1, 2018, at 2:00 PM, Dave Nasralla wrote:

> One of our systems is crashing about every 3 days and I can't seem to
> isolate the cause. Lately these are crashes with a Mac crash report
> appearing on the screen.
> Some system details are:
> - 4D Built Server app with v17.0 HF1 (64 bit Server with 64 bit Mac and
> 32 bit Windows Clients)
> - Mac and Windows Clients
> - Mac OS 10.13.5
> 
> What I know so far:
> - I have the Server Debug file. It ends with a "." and so the last
> command appears to have executed.
> - I'm using the Report Info component, logging every 5 minutes. There
> don't seem to be memory problems or runaway cache issues.
> - I also know who was on each time it crashes and sent out an email
> to those users to find patterns (so far I've found none).
> - The crashes typically happen around 10am to 11am.
> - The client and server builds match.
> 
> I'm debating turning on the client debugger files and then harvesting
> them afterwards when the user logs back in. I'm open to other
> debugging techniques.
> 
> There are other v17 systems running on the same machine with zero issue.
> 
> Below is a snippet of the crash report. It seems to be different each
> time, but here is the latest. Thread 73 crashed, so I only included
> that one.
> 
> Thanks,
> 
> dave nasralla
> ------------------------------------
> Process:               Corporate [93958]
> Path:                  /Users/USER/*/Corporate Server.app/Contents/MacOS/Corporate
> Identifier:            4d.com.Corporate Server.app
> Version:               17.0 build 17.226566 (???)
> Code Type:             X86-64 (Native)
> Parent Process:        ??? [1]
> Responsible:           Corporate [93958]
> User ID:               501
> 
> Date/Time:             2018-08-31 11:00:05.952 -0500
> OS Version:            Mac OS X 10.13.5 (17F77)
> Report Version:        12
> Anonymous UUID:        723511FD-4CA0-6E8B-0642-883209248DFC
> 
> 
> Time Awake Since Boot: 3700000 seconds
> 
> System Integrity Protection: enabled
> 
> Crashed Thread:        73  LabProjects List (id = -114)
> 
> Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
> Exception Codes:       EXC_I386_GPFLT
> Exception Note:        EXC_CORPSE_NOTIFY
> 
> Termination Signal:    Segmentation fault: 11
> Termination Reason:    Namespace SIGNAL, Code 0xb
> Terminating Process:   exc handler [0]
> ----------------------------------------------------------
> 
> 
> Thread 73 Crashed:: LabProjects List (id = -114)
> 0   4d.com.Corporate Server.app       0x000000010694fdbe V4DConnection::OnPostpone(bool) + 40
> 1   4d.com.Corporate Server.app       0x0000000106b095f7 V4DServerUser::PostponeServiceConnection() + 35
> 2   4d.com.Corporate Server.app       0x0000000106b20567 V4DServer::exec_ConnectionPostpone(V4DRequestReply&, V4DTaskConcrete*, short) + 395
> 3   4d.com.Corporate Server.app       0x0000000106b211ca V4DServer::exec_streamreq(V4DRequestReply&, V4DTaskConcrete*) + 100

Hi Dave,

Crashing every 3 days is a real problem and totally unacceptable. So what can 
be done to try and make this situation better? We need to make changes to make 
this crashing stop. But what changes? 

Here is my thinking as I read this crash report. Keep in mind I’m not an expert 
on this, so I may be wrong in some areas. If I am wrong hopefully those that 
know more can correct me — and in turn help me and others understand more about 
how to read these macOS crash reports. (Thinking about Miyako, JPR, Christian 
Sakowski and Rob Laveaux — they are real experts in this area. Real macOS 
programmers that know how to read these things properly.)

The crash report is supposed to provide a programmer with information on 
exactly where the program crashed and the cause of the crash. If you have the 
special 4D “debug” version it will contain more “symbols” and thus when 4D 
crashes you get better names for functions instead of just memory address 
offsets. I think you even get 4D command names that were involved in the crash. 
But the basic crash dump info that we have here can help point to the general 
area of concern. Here is a website that helps explain crash dumps and how to 
read them: 

https://www.maketecheasier.com/read-macos-crash-reports-troubleshoot-mac/

This is 4D v17.0 build 226566 that is running compiled in 64bit mode (Code 
Type: x86-64). So first thought is that this could be a 4D 64bit issue. That’s 
important because some of the code is completely different between 32bit 4D and 
64bit 4D. The 64bit code could be newly written code, the 32bit code could be 
legacy code that has been around for years. 

Thread 73 “LabProjects List” is what crashed. Do you have a table named 
“LabProjects” or maybe a MODIFY SELECTION or a listbox window that shows 
records in this table? Or a process that has that name? Makes me think that you 
do. That’s another pointer to where in your application the crashing problem 
occurred.
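
Since you say the report is different each time, it might be worth pulling the 
crashed process name out of every report you have and counting them. Here is a 
rough Python sketch of that idea. The function names are my own invention, and 
the pattern only assumes the "Crashed Thread" line looks like the one in your 
report (thread number, process name, then "(id = N)").

```python
import re
from collections import Counter

# Matches lines shaped like the one in the report above:
#   Crashed Thread:        73  LabProjects List (id = -114)
# Assumption: the 4D process name sits between the thread number and "(id = N)".
CRASHED_THREAD_RE = re.compile(
    r"Crashed Thread:\s+(?P<thread>\d+)\s+(?P<name>.*?)\s+\(id = (?P<id>-?\d+)\)"
)


def crashed_process(report_text):
    """Return the name of the crashed 4D process, or None if the line is absent
    or does not follow the assumed shape."""
    match = CRASHED_THREAD_RE.search(report_text)
    return match["name"] if match else None


def tally_crashed_processes(reports):
    """Count how often each process name shows up as the crashed thread."""
    names = (crashed_process(text) for text in reports)
    return Counter(name for name in names if name is not None)


# Only one report was posted, so that is the only real sample here.
reports = ["Crashed Thread:        73  LabProjects List (id = -114)"]
tally = tally_crashed_processes(reports)
```

If one process name dominates the tally, that is where I would start looking. 
If the names are all over the place, the problem is more likely in shared code 
such as the network layer.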

Exception Type is “EXC_BAD_ACCESS (SIGSEGV)” and that means “the program 
attempts to access memory incorrectly or with an invalid address”. Could be a C 
pointer that went bad or something to do with virtual memory or even how 4D 
allocates its own memory internally. Could be 4D data cache related. Basically 
4D tried to access memory it was not allowed to access and macOS killed 4D so 
that it could not damage other parts of the system and cause them to crash. 
Thank you macOS for watching out and protecting us from complete system 
corruption and crashing. Windows does this too.

The last area is where we can see exactly where in 4D, and even which 4D C++ 
function, was running when macOS said “enough, this application has gone 
crazy, I need to kill it before it does damage to other applications.” The 
functions are listed in reverse chronological order, so the one at the bottom 
is where the “call chain” started. The one at the top is where it died.

The function name is “V4DConnection::OnPostpone(bool)” and the code 40 bytes 
from the start of that function is where the offending memory access 
occurred. The name “V4DConnection” makes me think this is related to 
networking, 4D Server handling network actions with 4D Client. The “OnPostpone” 
makes me think this is somehow related to sleeping or a 4D Client connection 
that has been asleep and needs to now wake up. And lastly it makes me think 
“this is related to the new network layer code”. Again, this is just my 
thinking. I could be completely wrong about all of this. 
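
To make the “function name + offset” idea concrete, here is a small Python 
sketch that splits one stack frame into its parts. The regular expression and 
the parse_frame name are my own assumptions about the layout (frame index, 
image name, address, symbol, “+”, byte offset), based only on the frames in 
this report.

```python
import re

# Assumed frame layout:
#   <index> <image name> <hex address> <symbol> + <byte offset>
# The image name may contain spaces ("4d.com.Corporate Server.app"),
# so it is matched lazily up to the first hex address.
FRAME_RE = re.compile(
    r"^(?P<index>\d+)\s+"
    r"(?P<image>.+?)\s+"
    r"(?P<address>0x[0-9a-fA-F]+)\s+"
    r"(?P<symbol>.+?)\s\+\s(?P<offset>\d+)$"
)


def parse_frame(line):
    """Split one frame line into a dict, or return None if it does not match."""
    match = FRAME_RE.match(line.strip())
    if match is None:
        return None
    return {
        "index": int(match["index"]),
        "image": match["image"],
        "address": int(match["address"], 16),
        "symbol": match["symbol"],
        "offset": int(match["offset"]),
    }


# Frame 0 from the report, written on a single line.
frame = parse_frame(
    "0   4d.com.Corporate Server.app       0x000000010694fdbe "
    "V4DConnection::OnPostpone(bool) + 40"
)

# The address minus the offset is where the function itself begins in memory.
function_start = frame["address"] - frame["offset"]
```

Frame 0 is the top of the list, so it is the function that was executing when 
the bad access happened. The higher-numbered frames are its callers.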

So now my brain tries to build a scenario that could most likely happen that 
could be connected to this situation. Happens during the day between 10am and 
11am. It’s a work day with users connected. People came in to work, got 
connected to 4D Server, then wandered off to a meeting or something and their 
computer went to sleep. You are using 4D Server compiled 64bit so you MUST be 
using the new network layer. Legacy is only available in 32bit compiled 4D 
Server macOS. 

There is this new network layer feature where if a 4D Client machine goes into 
sleep mode you don’t lose your 4D Server connection. So that when the user 
wakes up the 4D Client machine it notifies 4D Server and the old network 
connection is reenergized and brought back to life. That “OnPostpone” mention 
above makes me think this also. Maybe something went wrong in that area of 4D. 
It is a tricky area because sleep could last for hours or days and memory could 
be moved around and pointers can easily go bad in those types of situations. 

So there is my analysis. Now what changes could you make to stop these damn 
crashing situations? Here are some ideas:

- You say it happens about every 3 days, so just restart 4D Server every single 
day. Giant PITA I know. But just an idea for what to do now to eliminate the 
crashing. 

- Stop all 4D Client machines from sleeping. You’d have to physically go to 
every machine and turn off system sleep while still allowing the display to go 
to sleep. You can’t rely on users to do this, and do it right. This is what I 
would do if I had physical access to all the machines, or at least RDP access, 
so that I could make sure every machine had system sleep turned off. (Of 
course you already have App Nap turned off on the 4D Server machine so 
that’s not part of this issue, right?)

- Crash dump lists Build Number 226566. v17.0 has build 225365. v17.0 HF1 has 
build 226237. A quick check of 4D forums “Nightly Builds 4D v17” shows this 
build is from 8/22/18. So you are running a nightly build. I’m guessing you 
used v17.0 and had problems, went to v17.0 HF1 and still had problems, so you 
went to nightly builds to try and find a fix. Maybe you keep doing that. 
Current nightly build is 226837. You may find they’ve fixed the bug that is 
biting you. 

- Stop using the new network layer. You would have to stop using 64bit 4D 
Server, so that may not be a viable option. You would be limited to a 2GB data 
cache. But maybe if you can stop the crashing now it’s worth that limitation. 
That means compiling a 32bit version of 4D Server and 4D Client, and replacing 
all the 64bit 4D Client applications with the 32bit version. I think you could 
use the auto client update feature to automate this. 
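
On the build number point above, here is a quick Python sketch of the check I 
did by hand. The build numbers are the ones quoted in this message. The helper 
names are made up, and I am assuming the build number is the part after the 
dot in the report's Version line (“build 17.226566”).

```python
import re

# Build numbers quoted in this message.
KNOWN_BUILDS = {
    "v17.0": 225365,
    "v17.0 HF1": 226237,
    "nightly build current at the time of writing": 226837,
}


def build_from_version_line(line):
    """Extract the build number from a line like
    'Version: 17.0 build 17.226566 (???)'.
    Assumption: the number after the dot is the build number."""
    match = re.search(r"build\s+\d+\.(\d+)", line)
    return int(match.group(1)) if match else None


def newer_known_builds(running):
    """Names of the known builds that are newer than the running one."""
    return [name for name, build in KNOWN_BUILDS.items() if build > running]


running = build_from_version_line(
    "Version:               17.0 build 17.226566 (???)"
)
is_named_release = running in KNOWN_BUILDS.values()
candidates = newer_known_builds(running)
```

A running build that matches no named release and sits between HF1 and the 
current nightly is what tells me this is a nightly build with newer ones 
available to try.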

That’s all I can contribute. If you find a solution to your crashing every 3 
days problem please be sure to post here so we know what fixed it for you.

Tim

*****************************************
Tim Nevels
Innovative Solutions
785-749-3444
[email protected]
*****************************************

