Re: Isolating the Cause of a Server Crash
Hi JPR,

Thanks for your comments.

- On each crash report, it is a different thread. Twice it was similar to what I'll post at the end of this message (ServerNet select I/O handler). Once it was the LabProjects List, but there is nothing unique about that list of records.
- Range checking is on (the application always runs compiled).
- It could be related to network stress. It typically happens at busier hours (never after hours).
- I do generate debug files. It's not a specific method that is running; it varies. The last command in the debug file always has a "." after it. My understanding is that this means the command executed completely.
- I do use interprocess variables to cache employee data for fast access to names, email addresses, etc. It is relatively small, with 7 parallel arrays containing fewer than 150 elements each. There are also some system settings, also under 100 elements.
- The cache is set to 1GB. The datafile is 3GB in size.
- I use the 4D Info Reporter. Tim has walked me through looking at the results. At first it looked like the Server was running low after a backup, but I wrote in a purge command that clears it up. At the time of each crash there is nothing remarkable in the report.

I think you are correct that it probably is not a client issue, though I do use routines that have the "execute on server" box checked. Either way, I uploaded a modification last night that turns on client debugging and creates a session record in a table at the start of a client session. If the client record is not closed out via the "On Exit" method, the system will upload the user's debug files (a maximum of two are created) when they log in again. On the next crash I'll take a closer look to see what the clients were doing.

One thing that bothers me is that on occasion the Administration interface stops displaying information. For example, when I went to quit the application last night for an update, the window appeared asking how to quit.
I told the system to shut down in 1 minute. The next dialog contained only a server icon and the countdown clock stuck at "00 00". No text or message was displayed. The server did shut down as requested in 1 minute.

Thanks for your questions. Another sample crash report is below.

dave

Thread 29 Crashed:: ServerNet select I/O handler (id = 90423)
0 com.4d.ServerNet 0x000110d5837e xbox::VTCPSelectWatchAction::HandleError(fd_set*) + 38
1 com.4d.ServerNet 0x000110d589ea xbox::VTCPSelectIOHandler::DoRun() + 712
2 com.4d.ServerNet 0x000110d58afd non-virtual thunk to xbox::VTCPSelectIOHandler::DoRun() + 13
3 com.4d.kernel 0x000110bbadaa xbox::VTask::_Run() + 234
4 com.4d.kernel 0x000110bbfb01 xbox::XMacTask_preemptive::_ThreadProc(void*) + 145
5 libsystem_pthread.dylib 0x7fff6e307661 _pthread_body + 340
6 libsystem_pthread.dylib 0x7fff6e30750d _pthread_start + 377
7 libsystem_pthread.dylib 0x7fff6e306bf9 thread_start + 13
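Since the crashed thread differs from one report to the next, it may help to tally the crashed thread name and the top stack frame across all saved reports and see whether anything repeats. Below is a minimal Python sketch, assuming each report is available as plain text in the layout posted in this thread. The names `summarize` and `tally` are hypothetical helpers for illustration, not part of 4D or of any macOS tool.

```python
import re
from collections import Counter

# Header of the crashed thread, e.g.
# "Thread 29 Crashed:: ServerNet select I/O handler (id = 90423)"
CRASHED_RE = re.compile(
    r"^Thread (\d+) Crashed:+\s*(.*?)\s*\(id = (-?\d+)\)", re.MULTILINE
)

# One stack frame: index, module (may contain spaces), address, symbol
FRAME_RE = re.compile(r"^(\d+)\s+(\S.*?)\s+(0x[0-9a-fA-F]+)\s+(.*)$")


def summarize(report):
    """Return (thread name, top module, top symbol), or None if not found."""
    header = CRASHED_RE.search(report)
    if header is None:
        return None
    # Frame 0 is the first frame line after the crashed-thread header.
    for line in report[header.end():].splitlines():
        frame = FRAME_RE.match(line.strip())
        if frame:
            # Drop the trailing " + offset" so identical functions compare equal.
            symbol = re.sub(r"\s*\+\s*\d+$", "", frame.group(4))
            return (header.group(2), frame.group(2), symbol)
    return None


def tally(reports):
    """Count how often each crashed thread name appears across reports."""
    summaries = (summarize(text) for text in reports)
    return Counter(s[0] for s in summaries if s is not None)


if __name__ == "__main__":
    sample = (
        "Thread 29 Crashed:: ServerNet select I/O handler (id = 90423)\n"
        "0   com.4d.ServerNet  0x000110d5837e "
        "xbox::VTCPSelectWatchAction::HandleError(fd_set*) + 38\n"
    )
    print(summarize(sample))
```

If the same top frame shows up under different thread names, that points at a shared code path rather than at one particular process.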
Re: Isolating the Cause of a Server Crash
[JPR]

Hi Dave, Tim,

This kind of crash is always difficult to track down, for it is not easily reproducible. From what I see (and as Tim pointed out), it seems there is a memory problem that is revealed in the process LabProjects List. But a memory problem can occur a while before the actual crash, because the application may have corrupted memory and not be aware of it until the crash.

- Is your application compiled? If yes, be sure that the Range checking option is set.
- Is the LabProjects List process a client process on the server, or a worker or process running on the server?
- The time of the crash seems irrelevant, but maybe it's linked to a peak in activity and to server or network stress?
- A client problem causing a server crash is unlikely, but it may help to know if there is a correlation between the crash and a particular client doing a particular operation.
- Do you know which method is executing when it crashes?
- Do you use interprocess variables, like arrays for instance?
- How much memory has been given to the server and to the cache?

This is just a short list of points to check, but it may help to reduce the problem to a small part of the application.

My very best,

JPR
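The questions about peak activity and about a correlation with a particular client can be checked mechanically once crash times and connected users are written down. Here is a minimal Python sketch under stated assumptions: the function names are hypothetical, and in the demo all user names and all timestamps except the 2018-08-31 11:00 crash reported in this thread are made-up placeholders, not real data.

```python
from collections import Counter
from datetime import datetime


def users_in_every_crash(users_per_crash):
    """Return the users who were connected at every recorded crash."""
    sets = [set(users) for users in users_per_crash]
    return set.intersection(*sets) if sets else set()


def crashes_by_hour(timestamps):
    """Count crashes per hour of day from ISO 8601 timestamp strings."""
    return Counter(datetime.fromisoformat(ts).hour for ts in timestamps)


if __name__ == "__main__":
    # Placeholder data: only the first timestamp comes from the posted report.
    crash_times = [
        "2018-08-31T11:00:05",
        "2018-08-28T10:20:00",
        "2018-08-25T10:45:00",
    ]
    # Placeholder user lists, one list per crash.
    connected = [
        ["user_a", "user_b", "user_c"],
        ["user_b", "user_d"],
        ["user_b", "user_c"],
    ]
    print(users_in_every_crash(connected))
    print(crashes_by_hour(crash_times).most_common())
```

An empty intersection does not rule out a client cause, but a user or an hour that shows up in every crash is a cheap lead to follow first.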
Re: Isolating the Cause of a Server Crash
On Sep 1, 2018, at 2:00 PM, Dave Nasralla wrote:

> One of our systems is crashing about every 3 days and I can't seem to
> isolate the cause. Lately these are crashes with a Mac crash report
> appearing on the screen.
> Some system details are:
> - 4D Built Server app with v17.0 HF1 (64 bit Server with 64 bit Mac and
>   32 bit Windows Clients)
> - Mac and Windows Clients
> - Mac OS 10.13.5
>
> What I know so far:
> - I have the Server Debug file. It ends with a "." and so the last
>   command appears to have executed.
> - I'm using the Report Info component, logging every 5 minutes. There
>   don't seem to be memory problems or runaway cache issues.
> - I also know who was on each time it crashed and sent out an email
>   to those users to find patterns (so far I've found none).
> - The crashes typically happen around 10am to 11am.
> - The client and server builds match.
>
> I'm debating turning on the client debugger files and then harvesting
> them afterwards when the user logs back in. I'm open to other
> debugging techniques.
>
> There are other v17 systems running on the same machine with zero issues.
>
> Below is a snippet of the crash report. It seems to be different each
> time, but here is the latest. Thread 73 crashed, so I only included
> that one.
>
> Thanks,
>
> dave nasralla
>
> Process: Corporate [93958]
> Path: /Users/USER/*/Corporate Server.app/Contents/MacOS/Corporate
> Identifier: 4d.com.Corporate Server.app
> Version: 17.0 build 17.226566 (???)
> Code Type: X86-64 (Native)
> Parent Process: ??? [1]
> Responsible: Corporate [93958]
> User ID: 501
>
> Date/Time: 2018-08-31 11:00:05.952 -0500
> OS Version: Mac OS X 10.13.5 (17F77)
> Report Version: 12
> Anonymous UUID: 723511FD-4CA0-6E8B-0642-883209248DFC
>
> Time Awake Since Boot: 370 seconds
>
> System Integrity Protection: enabled
>
> Crashed Thread: 73 LabProjects List (id = -114)
>
> Exception Type: EXC_BAD_ACCESS (SIGSEGV)
> Exception Codes: EXC_I386_GPFLT
> Exception Note: EXC_CORPSE_NOTIFY
>
> Termination Signal: Segmentation fault: 11
> Termination Reason: Namespace SIGNAL, Code 0xb
> Terminating Process: exc handler [0]
> --
>
> Thread 73 Crashed:: LabProjects List (id = -114)
> 0 4d.com.Corporate Server.app 0x00010694fdbe V4DConnection::OnPostpone(bool) + 40
> 1 4d.com.Corporate Server.app 0x000106b095f7 V4DServerUser::PostponeServiceConnection() + 35
> 2 4d.com.Corporate Server.app 0x000106b20567 V4DServer::exec_ConnectionPostpone(V4DRequestReply&, V4DTaskConcrete*, short) + 395
> 3 4d.com.Corporate Server.app 0x000106b211ca V4DServer::exec_streamreq(V4DRequestReply&, V4DTaskConcrete*) + 100

Hi Dave,

Crashing every 3 days is a real problem and totally unacceptable. So what can be done to try and make this situation better? We need to make changes to make this crashing stop. But what changes?

Here is my thinking as I read this crash report. Keep in mind I'm not an expert on this, so I may be wrong in some areas. If I am wrong, hopefully those who know more can correct me, and in turn help me and others understand more about how to read these macOS crash reports. (Thinking about Miyako, JPR, Christian Sakowski and Rob Laveaux; they are real experts in this area. Real macOS programmers who know how to read these things properly.)

The crash report is supposed to provide a programmer with information on exactly where the program crashed and the cause of the crash.
If you have the special 4D "debug" version, it will contain more "symbols", and thus when 4D crashes you get better names for functions instead of just memory address offsets. I think you even get the 4D command names that were involved in the crash. But the basic crash dump info that we have here can help point to the general area of concern.

Here is a website that helps explain crash dumps and how to read them:

https://www.maketecheasier.com/read-macos-crash-reports-troubleshoot-mac/

This is 4D v17.0 build 226566 running compiled in 64bit mode (Code Type: X86-64). So my first thought is that this could be a 4D 64bit issue. That's important because some of the code is completely different between 32bit 4D and 64bit 4D. The 64bit code could be newly written code, while the 32bit code could be legacy code that has been around for years.

Thread 73 "LabProjects List" is what crashed. Do you have a table named "LabProjects", or maybe a MODIFY SELECTION or a listbox window that shows records in this table? Or a process that has that name? It makes me think that you do. That's another pointer to where in your application the crashing problem occurred.

Exception Type is "EXC_BAD_ACCESS (SIGSEGV)" and that means "the
Re: Isolating the Cause of a Server Crash
Thanks to all who have responded.

- I rebooted the machine this evening. (In the past it has run as long as a year without a reboot, and that reboot was only done for a system update.)
- No virus scans are running on it.
- Backblaze runs, but the .4DD files are skipped.
- MSC scans came back clean.
- Indexes have been rebuilt.

One thing I have noticed is that, although the client machines are running along fine and users can log in or out and do their tasks, the 4D Administration Interface on the built application gets wonky. For example, after running for a day, the "Monitor" tab will no longer show a graph, and the Details area (with the pie charts) is blank with a message something like "(only visible to database administrators)". Or I'll go to the Users tab and nothing shows up, yet users are connected.

Other 4D applications are fine.

dave

--
David Nasralla
Clean Air Engineering

**
4D Internet Users Group (4D iNUG)
Archive: http://lists.4d.com/archives.html
Options: https://lists.4d.com/mailman/options/4d_tech
Unsub: mailto:4d_tech-unsubscr...@lists.4d.com
**
RE: Isolating the Cause of a Server Crash
I strongly recommend what Chuck is saying. We tell our customers to exempt our folders from any scanning, virus, auto-bots, etc...

We have seen database damage caused by this, which in turn results in crashing.

Steve
Re: Isolating the Cause of a Server Crash
Are you running any virus detection on that machine? If so, you should skip the 4D folders.

Regards

Chuck

Chuck Miller Voice: (617) 739-0306
Informed Solutions, Inc. Fax: (617) 232-1064
mailto:cjmillerinformed-solutions.com
Brookline, MA 02446 USA Registered 4D Developer
Providers of 4D and Sybase connectivity
http://www.informed-solutions.com

This message and any attached documents contain information which may be confidential, subject to privilege or exempt from disclosure under applicable law. These materials are intended only for the use of the intended recipient. If you are not the intended recipient of this transmission, you are hereby notified that any distribution, disclosure, printing, copying, storage, modification or the taking of any action in reliance upon this transmission is strictly prohibited. Delivery of this message to any person other than the intended recipient shall not compromise or waive such confidentiality, privilege or exemption from disclosure as to this communication.
Re: Isolating the Cause of a Server Crash
reboot the computer. it has been running for 40 days?

> On Aug 31, 2018, at 1:54 PM, Dave Nasralla via 4D_Tech <4d_tech@lists.4d.com> wrote:
>
> One of our systems is crashing about every 3 days