Re: Isolating the Cause of a Server Crash

2018-09-06 Thread Dave Nasralla via 4D_Tech
Hi JPR,

Thanks for your comments.

 - On each crash report, it is a different thread. Twice it was similar
to what I'll post at the end of this message (ServerNet select I/O
handler). Once it was the LabProjects List, but there is nothing
unique about that list of records.
 - Range checking is on (the application always runs compiled).
 - It could be related to network stress: it typically happens at
busier hours (never after hours).
 - I do generate debug files. It's not a specific method that is
running; it varies. The last command in the debug file always has a
"." after it. My understanding is that this means the command executed
completely.
 - I do use interprocess variables to cache employee data for fast
access to names, email addresses, etc. The data is relatively small,
with 7 parallel arrays containing fewer than 150 elements each. There
are also some system settings, also under 100 elements.
 - The cache is set to 1GB. The datafile is 3GB in size.
 - I use the 4D Info Reporter. Tim has walked me through looking at the
results. At first it looked like the Server was running low after a
backup, but I wrote in a purge command that clears it up. At the time
of each crash there is nothing remarkable in the report.
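
The trailing "." check on the debug files mentioned in the list above could be automated. What follows is only a minimal sketch in Python, not 4D code. It assumes the debug file is plain text with one logged command per line and that a trailing "." marks a completed command, as described in this message; the real debug log layout may differ, and the function name is hypothetical.

```python
def last_command_completed(log_text: str) -> bool:
    """Return True if the last non-empty line of the log ends with '.'.

    Assumption (from this thread, not from 4D documentation): one
    command per line, and a trailing '.' means the command finished.
    """
    # Ignore blank lines and trailing whitespace at the end of the file.
    lines = [ln.rstrip() for ln in log_text.splitlines() if ln.strip()]
    if not lines:
        return False  # an empty log tells us nothing
    return lines[-1].endswith(".")
```

If this returns False for a log harvested after a crash, the last logged command would be the first place to look.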

I think you are correct, that it probably is not a client issue -
though I do use routines that have the "execute on server" box
checked. Either way, I uploaded a modification last night that turns
on client debugging and creates a session record in a table at the
start of a client session. If the client record is not closed out via
the "On Exit" method, when the user logs in again the system will
upload their debug files (max of two are created). On the next crash
I'll take a closer look to see what clients were doing.
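
The session-record idea above can be sketched as follows. This is an illustration in Python rather than the actual 4D implementation: the class, its method names, and the in-memory dictionary standing in for the session table are all hypothetical, and the "upload" is just a list append.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SessionTracker:
    # user -> True while a session record is still open (not closed out)
    open_sessions: Dict[str, bool] = field(default_factory=dict)
    # users whose debug files would be uploaded (stand-in for the real upload)
    uploads: List[str] = field(default_factory=list)

    def on_startup(self, user: str) -> bool:
        """Run at client start. Returns True if the previous session
        was never closed, i.e. the client did not exit cleanly."""
        unclean = self.open_sessions.get(user, False)
        if unclean:
            self.uploads.append(user)  # here the debug files would be sent
        self.open_sessions[user] = True  # open a new session record
        return unclean

    def on_exit(self, user: str) -> None:
        """Run from the exit handler: close out the session record."""
        self.open_sessions[user] = False
```

A client that logs in, crashes, and logs in again would get True from the second on_startup call, which is the trigger for harvesting its debug files.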

One thing that bothers me is that on occasion the Administration
interface stops displaying information. For example, when I went to
quit the application last night for an update, the window appeared
asking how to quit. I told the system to shut down in 1 minute. The
next dialog contained only a server icon, and the countdown clock
stuck at "00 00". No text or message was displayed. The server did
shut down as requested in 1 minute.

Thanks for your questions. Another sample crash report is below.

dave


Thread 29 Crashed:: ServerNet select I/O handler (id = 90423)
0   com.4d.ServerNet          0x000110d5837e xbox::VTCPSelectWatchAction::HandleError(fd_set*) + 38
1   com.4d.ServerNet          0x000110d589ea xbox::VTCPSelectIOHandler::DoRun() + 712
2   com.4d.ServerNet          0x000110d58afd non-virtual thunk to xbox::VTCPSelectIOHandler::DoRun() + 13
3   com.4d.kernel             0x000110bbadaa xbox::VTask::_Run() + 234
4   com.4d.kernel             0x000110bbfb01 xbox::XMacTask_preemptive::_ThreadProc(void*) + 145
5   libsystem_pthread.dylib   0x7fff6e307661 _pthread_body + 340
6   libsystem_pthread.dylib   0x7fff6e30750d _pthread_start + 377
7   libsystem_pthread.dylib   0x7fff6e306bf9 thread_start + 13
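
Since the crashed thread differs from report to report, it may help to tally the thread names across all saved reports. The sketch below is in Python and relies only on the "Thread N Crashed:: name (id = X)" line format visible in the reports posted in this thread; the function names are hypothetical, and other report versions may format this line differently.

```python
import re
from collections import Counter
from typing import Iterable, Optional, Tuple

# Matches e.g. "Thread 29 Crashed:: ServerNet select I/O handler (id = 90423)"
CRASHED_RE = re.compile(
    r"^Thread (\d+) Crashed:: (.+?) \(id = (-?\d+)\)", re.MULTILINE
)


def crashed_thread(report: str) -> Optional[Tuple[int, str]]:
    """Return (thread number, thread name) of the crashed thread, or None."""
    m = CRASHED_RE.search(report)
    if m is None:
        return None
    return int(m.group(1)), m.group(2)


def tally(reports: Iterable[str]) -> Counter:
    """Count how often each thread name appears as the crashed thread."""
    names: Counter = Counter()
    for report in reports:
        hit = crashed_thread(report)
        if hit is not None:
            names[hit[1]] += 1
    return names
```

Feeding it the text of each saved crash report gives a quick count per thread name, which makes it easier to see whether one thread, such as the ServerNet select I/O handler, dominates.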

On Tue, Sep 4, 2018 at 10:40 AM JPR via 4D_Tech <4d_tech@lists.4d.com> wrote:
>
> [JPR]
>
> Hi Dave, Tim,
>
> This kind of crash is always difficult to track down, for it is not easily 
> reproducible. From what I see (and as Tim pointed out) it seems there is a 
> memory problem that is revealed in the process LabProjects List. But a 
> memory problem can occur a while before the actual crash, because the 
> application may have corrupted memory and not be aware of it until the 
> crash.
>
> - Is your application compiled? If yes, be sure that the Range checking 
> option is set.
> - Is the LabProjects List process a client process on server, or a worker or 
> process running on the server?
> - The time of crash seems irrelevant, but maybe it's linked to a peak in 
> activity and a server or network stress?
> - A client problem causing a server crash is unlikely, but it may help to 
> know if there is a correlation between the crash and a particular client 
> doing a particular operation.
> - Do you know which method is executed when it crashes?
> - Do you use interprocess variables like arrays for instance?
> - How much memory has been given to the server and to the cache?
>
> This is just a short list of points to check, but it may help to reduce the 
> problem to a small part of the application.
>
> My very best,
>
> JPR
>
>
> > On 2 Sep 2018, at 21:00, 4d_tech-requ...@lists.4d.com wrote:
> >
> > From: Tim Nevels 
> > To: 4d_tech@lists.4d.com
> > Subject: Re: Isolating the Cause of a Server Crash
> > Message-ID: 
> > Content-Type: text/plain; charset=utf-8
> >
> > On Sep 1, 2018, at 2:00 PM, Dave Nasralla wrote:
> >
> >> One of our systems is crashing about every 3 days and I c

Re: Isolating the Cause of a Server Crash

2018-09-04 Thread JPR via 4D_Tech
[JPR]

Hi Dave, Tim,

This kind of crash is always difficult to track down, for it is not easily 
reproducible. From what I see (and as Tim pointed out) it seems there is a memory 
problem that is revealed in the process LabProjects List. But a memory problem 
can occur a while before the actual crash, because the application may have 
corrupted memory and not be aware of it until the crash.

- Is your application compiled? If yes, be sure that the Range checking option 
is set.
- Is the LabProjects List process a client process on server, or a worker or 
process running on the server?
- The time of crash seems irrelevant, but maybe it's linked to a peak in 
activity and a server or network stress?
- A client problem causing a server crash is unlikely, but it may help to know 
if there is a correlation between the crash and a particular client doing a 
particular operation.
- Do you know which method is executed when it crashes?
- Do you use interprocess variables like arrays for instance?
- How much memory has been given to the server and to the cache?

This is just a short list of points to check, but it may help to reduce the 
problem to a small part of the application.

My very best,

JPR


> On 2 Sep 2018, at 21:00, 4d_tech-requ...@lists.4d.com wrote:
> 
> From: Tim Nevels 
> To: 4d_tech@lists.4d.com
> Subject: Re: Isolating the Cause of a Server Crash
> Message-ID: 
> Content-Type: text/plain; charset=utf-8
> 
> On Sep 1, 2018, at 2:00 PM, Dave Nasralla wrote:
> 
>> One of our systems is crashing about every 3 days and I can't seem to
>> isolate the cause. Lately these are crashes with a Mac crash report
>> appearing on the screen.
>> Some system details are:
>> - 4D Built Server app with v17.0 HF1 (64 bit Server with 64 bit Mac and
>> 32 bit Windows Clients)
>> - Mac and Windows Clients
>> - Mac OS 10.13.5
>> 
>> What I know so far:
>> - I have the Server Debug file. It ends with a "." and so the last
>> command appears to have executed.
>> - I'm using the Report Info component, logging every 5 minutes. There
>> don't seem to be memory problems or runaway cache issues.
>> - I also know who was on each time it crashed and sent out an email
>> to those users to find patterns (so far I've found none).
>> - The crashes typically happen around 10am to 11am.
>> - The client and server builds match.
>> 
>> I'm debating turning on the client debugger files and then harvesting
>> them afterwards when the user logs back in. I'm open to other
>> debugging techniques.
>> 
>> There are other v17 systems running on the same machine with zero issue.
>> 
>> Below is a snippet of the crash report. It seems to be different each
>> time, but here is the latest. Thread 73 crashed, so I only included
>> that one.
>> 
>> Thanks,
>> 
>> dave nasralla
>> 
>> Process:               Corporate [93958]
>> Path:                  /Users/USER/*/Corporate Server.app/Contents/MacOS/Corporate
>> Identifier:            4d.com.Corporate Server.app
>> Version:               17.0 build 17.226566 (???)
>> Code Type:             X86-64 (Native)
>> Parent Process:        ??? [1]
>> Responsible:           Corporate [93958]
>> User ID:               501
>> 
>> Date/Time:             2018-08-31 11:00:05.952 -0500
>> OS Version:            Mac OS X 10.13.5 (17F77)
>> Report Version:        12
>> Anonymous UUID:        723511FD-4CA0-6E8B-0642-883209248DFC
>> 
>> 
>> Time Awake Since Boot: 370 seconds
>> 
>> System Integrity Protection: enabled
>> 
>> Crashed Thread:        73  LabProjects List (id = -114)
>> 
>> Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
>> Exception Codes:       EXC_I386_GPFLT
>> Exception Note:        EXC_CORPSE_NOTIFY
>> 
>> Termination Signal:    Segmentation fault: 11
>> Termination Reason:    Namespace SIGNAL, Code 0xb
>> Terminating Process:   exc handler [0]
>> --
>> 
>> 
>> Thread 73 Crashed:: LabProjects List (id = -114)
>> 0   4d.com.Corporate Server.app   0x00010694fdbe V4DConnection::OnPostpone(bool) + 40
>> 1   4d.com.Corporate Server.app   0x000106b095f7 V4DServerUser::PostponeServiceConnection() + 35
>> 2   4d.com.Corporate Server.app   0x000106b20567 V4DServer::exec_ConnectionPostpone(V4DRequestReply&, V4DTaskConcrete*, short) + 395
>> 3   4d.com.Corporate Server.app   0x000106b211ca V4DServer::exec_streamreq(V4DRequestReply&, V4DTaskConcrete*) + 100
> 
> Hi Dave,

Re: Isolating the Cause of a Server Crash

2018-09-01 Thread Tim Nevels via 4D_Tech
On Sep 1, 2018, at 2:00 PM, Dave Nasralla wrote:

> One of our systems is crashing about every 3 days and I can't seem to
> isolate the cause. Lately these are crashes with a Mac crash report
> appearing on the screen.
> Some system details are:
> - 4D Built Server app with v17.0 HF1 (64 bit Server with 64 bit Mac and
> 32 bit Windows Clients)
> - Mac and Windows Clients
> - Mac OS 10.13.5
> 
> What I know so far:
> - I have the Server Debug file. It ends with a "." and so the last
> command appears to have executed.
> - I'm using the Report Info component, logging every 5 minutes. There
> don't seem to be memory problems or runaway cache issues.
> - I also know who was on each time it crashed and sent out an email
> to those users to find patterns (so far I've found none).
> - The crashes typically happen around 10am to 11am.
> - The client and server builds match.
> 
> I'm debating turning on the client debugger files and then harvesting
> them afterwards when the user logs back in. I'm open to other
> debugging techniques.
> 
> There are other v17 systems running on the same machine with zero issue.
> 
> Below is a snippet of the crash report. It seems to be different each
> time, but here is the latest. Thread 73 crashed, so I only included
> that one.
> 
> Thanks,
> 
> dave nasralla
> 
> Process:               Corporate [93958]
> Path:                  /Users/USER/*/Corporate Server.app/Contents/MacOS/Corporate
> Identifier:            4d.com.Corporate Server.app
> Version:               17.0 build 17.226566 (???)
> Code Type:             X86-64 (Native)
> Parent Process:        ??? [1]
> Responsible:           Corporate [93958]
> User ID:               501
> 
> Date/Time:             2018-08-31 11:00:05.952 -0500
> OS Version:            Mac OS X 10.13.5 (17F77)
> Report Version:        12
> Anonymous UUID:        723511FD-4CA0-6E8B-0642-883209248DFC
> 
> 
> Time Awake Since Boot: 370 seconds
> 
> System Integrity Protection: enabled
> 
> Crashed Thread:        73  LabProjects List (id = -114)
> 
> Exception Type:        EXC_BAD_ACCESS (SIGSEGV)
> Exception Codes:       EXC_I386_GPFLT
> Exception Note:        EXC_CORPSE_NOTIFY
> 
> Termination Signal:    Segmentation fault: 11
> Termination Reason:    Namespace SIGNAL, Code 0xb
> Terminating Process:   exc handler [0]
> --
> 
> 
> Thread 73 Crashed:: LabProjects List (id = -114)
> 0   4d.com.Corporate Server.app   0x00010694fdbe V4DConnection::OnPostpone(bool) + 40
> 1   4d.com.Corporate Server.app   0x000106b095f7 V4DServerUser::PostponeServiceConnection() + 35
> 2   4d.com.Corporate Server.app   0x000106b20567 V4DServer::exec_ConnectionPostpone(V4DRequestReply&, V4DTaskConcrete*, short) + 395
> 3   4d.com.Corporate Server.app   0x000106b211ca V4DServer::exec_streamreq(V4DRequestReply&, V4DTaskConcrete*) + 100

Hi Dave,

Crashing every 3 days is a real problem and totally unacceptable. So what can 
be done to try and make this situation better? We need to make changes to make 
this crashing stop. But what changes? 

Here is my thinking as I read this crash report. Keep in mind I’m not an expert 
on this, so I may be wrong in some areas. If I am wrong hopefully those that 
know more can correct me — and in turn help me and others understand more about 
how to read these macOS crash reports. (Thinking about Miyako, JPR, Christian 
Sakowski and Rob Laveaux — they are real experts in this area. Real macOS 
programmers that know how to read these things properly.)

The crash report is supposed to provide a programmer with information on 
exactly where the program crashed and the cause of the crash. If you have the 
special 4D “debug” version it will contain more “symbols” and thus when 4D 
crashes you get better names for functions instead of just memory address 
offsets. I think you even get 4D command names that were involved in the crash. 
But the basic crash dump info that we have here can help point to the general 
area of concern. Here is a website that helps explain crash dumps and how to 
read them: 

https://www.maketecheasier.com/read-macos-crash-reports-troubleshoot-mac/

This is 4D v17.0 build 226566 that is running compiled in 64bit mode (Code 
Type: x86-64). So first thought is that this could be a 4D 64bit issue. That’s 
important because some of the code is completely different between 32bit 4D and 
64bit 4D. The 64bit code could be newly written code, the 32bit code could be 
legacy code that has been around for years. 

Thread 73 “LabProjects List” is what crashed. Do you have a table named 
“LabProjects” or maybe a MODIFY SELECTION or a listbox window that shows 
records in this table? Or a process that has that name? Makes me think that you 
do. That’s another pointer to where in your application the crashing problem 
occurred.

Exception Type is "EXC_BAD_ACCESS (SIGSEGV)” and that means "the 

Re: Isolating the Cause of a Server Crash

2018-08-31 Thread Dave Nasralla via 4D_Tech
Thanks to all that have responded.
 - I rebooted the machine this evening. (In the past it has run as
long as a year without a reboot - which was only done for a system
update.)
 - No virus scans running on it
 - Backblaze runs, but the .4DD files are skipped.
 - MCS Scans came back clean
 - Indexes have been rebuilt

One thing I have noticed is that, although the client machines are
running along fine and users can log in or out and do their tasks, the
4D Administration Interface on the built application gets wonky. For
example, after running for a day, the "Monitor" tab will no longer
show a graph and the Details area (with the pie charts) is blank with
a message something like (only visible to database administrators). Or
I'll go to the Users tab and nothing shows up, yet users are
connected.

Other 4D applications are fine.

dave

On Fri, Aug 31, 2018 at 3:45 PM Stephen J. Orth via 4D_Tech
<4d_tech@lists.4d.com> wrote:
>
> I strongly recommend what Chuck is saying.  We tell our customers to exempt 
> our folders from any scanning, virus, auto-bots, etc...
>
> We have seen database damage caused by this, which in turn results in 
> crashing.
>
> Steve
>
>
> -Original Message-
> From: 4D_Tech [mailto:4d_tech-boun...@lists.4d.com] On Behalf Of Chuck Miller 
> via 4D_Tech
> Sent: Friday, August 31, 2018 4:32 PM
> To: 4DTechList Tech <4d_tech@lists.4d.com>
> Cc: Chuck Miller 
> Subject: Re: Isolating the Cause of a Server Crash
>
> Are you running any virus detection on that machine? If so, you should skip 
> the 4D folders.
>
> Regards
>
> Chuck
> 
>  Chuck Miller Voice: (617) 739-0306
>  Informed Solutions, Inc. Fax: (617) 232-1064
>  mailto:cjmillerinformed-solutions.com
>  Brookline, MA 02446 USA Registered 4D Developer
>Providers of 4D and Sybase connectivity
>   http://www.informed-solutions.com
> 
> This message and any attached documents contain information which may be 
> confidential, subject to privilege or exempt from disclosure under applicable 
> law.  These materials are intended only for the use of the intended 
> recipient. If you are not the intended recipient of this transmission, you 
> are hereby notified that any distribution, disclosure, printing, copying, 
> storage, modification or the taking of any action in reliance upon this 
> transmission is strictly prohibited.  Delivery of this message to any person 
> other than the intended recipient shall not compromise or waive such 
> confidentiality, privilege or exemption from disclosure as to this 
> communication.
>
> > On Aug 31, 2018, at 5:17 PM, Spencer Hinsdale via 4D_Tech 
> > <4d_tech@lists.4d.com> wrote:
> >
> > reboot the computer. it has been running for 40 days?
>
> **
> 4D Internet Users Group (4D iNUG)
> Archive:  http://lists.4d.com/archives.html
> Options: https://lists.4d.com/mailman/options/4d_tech
> Unsub:  mailto:4d_tech-unsubscr...@lists.4d.com
> **



-- 
David Nasralla
Clean Air Engineering

RE: Isolating the Cause of a Server Crash

2018-08-31 Thread Stephen J. Orth via 4D_Tech
I strongly recommend what Chuck is saying.  We tell our customers to exempt our 
folders from any scanning, virus, auto-bots, etc...

We have seen database damage caused by this, which in turn results in crashing.

Steve


-Original Message-
From: 4D_Tech [mailto:4d_tech-boun...@lists.4d.com] On Behalf Of Chuck Miller 
via 4D_Tech
Sent: Friday, August 31, 2018 4:32 PM
To: 4DTechList Tech <4d_tech@lists.4d.com>
Cc: Chuck Miller 
Subject: Re: Isolating the Cause of a Server Crash

Are you running any virus detection on that machine? If so, you should skip 
the 4D folders.

Regards

Chuck

 Chuck Miller Voice: (617) 739-0306
 Informed Solutions, Inc. Fax: (617) 232-1064   
 mailto:cjmillerinformed-solutions.com 
 Brookline, MA 02446 USA Registered 4D Developer
   Providers of 4D and Sybase connectivity
  http://www.informed-solutions.com  

> On Aug 31, 2018, at 5:17 PM, Spencer Hinsdale via 4D_Tech 
> <4d_tech@lists.4d.com> wrote:
> 
> reboot the computer. it has been running for 40 days?


Re: Isolating the Cause of a Server Crash

2018-08-31 Thread Chuck Miller via 4D_Tech
Are you running any virus detection on that machine? If so, you should skip 
the 4D folders.

Regards

Chuck

 Chuck Miller Voice: (617) 739-0306
 Informed Solutions, Inc. Fax: (617) 232-1064   
 mailto:cjmillerinformed-solutions.com 
 Brookline, MA 02446 USA Registered 4D Developer
   Providers of 4D and Sybase connectivity
  http://www.informed-solutions.com  

> On Aug 31, 2018, at 5:17 PM, Spencer Hinsdale via 4D_Tech 
> <4d_tech@lists.4d.com> wrote:
> 
> reboot the computer. it has been running for 40 days?


Re: Isolating the Cause of a Server Crash

2018-08-31 Thread Spencer Hinsdale via 4D_Tech
reboot the computer. it has been running for 40 days?

> On Aug 31, 2018, at 1:54 PM, Dave Nasralla via 4D_Tech <4d_tech@lists.4d.com> 
> wrote:
> 
> One of our systems is crashing about every 3 days