(Hello, everybody, I'm trying to crawl back out from my cave and see what's going on here.)

I have an external problem.  Well, that's where the symptoms show up.

I created an external that is a thin layer over some Win32 calls (targeted to XP). It does extensive error checking. The code looks clean and the external has tested out.

However, in the customer's environment, which includes another external of mine that runs a customer-supplied ActiveX module, something strange happens after a while. Based on the data from the customer, it looks like the results from system calls are shifted by one. That is, if the calls are f1(), g(), g(), f2(), g(), then at some point each call starts returning the data that should have gone to the previous call: empty, f1(), g(), g(), f2(). I have no queues in my code, but it looks as if data is queued and an extra value is left in or inserted at some point. (The customer also reported memory usage growth.)

I suspect that some other module in my customer's environment is breaking things, one of these: the other external, the ActiveX module, Rev 2.6.1, XP or maybe even the Transcript code. I think either the external's memory is getting smashed (or the heap, or Rev's memory) or something is going wrong with malloc/free. I'm pretty sure it is not this external (famous last words).

The module links statically to the C run-time, and as best I can tell, nothing replaces malloc. In all cases *retString is set. (A quick check shows gibberish is returned if it is not.) Strong exception catching is used. The results of all CRT and Win32 calls are checked. I checked the calls to malloc and free, and in my external they balance. I make no calls to CRT functions that use malloc (according to the MSDN documentation). I haven't yet looked into where malloc gets its memory, maybe the process heap--anybody know?
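
For readers who have not written an external, here is a minimal sketch of the conventions described above: every path sets *retString, exceptions are caught at the boundary, and malloc/free calls are counted so the balance can be checked. The handler signature, the use of plain bool, and the names countedMalloc, countedFree, dupResult and exampleHandler are assumptions made for illustration. They are not taken from the actual external or from the Rev externals SDK.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical counters for checking that malloc and free balance.
static long g_allocs = 0;
static long g_frees = 0;

static void *countedMalloc(std::size_t n) {
    void *p = std::malloc(n);
    if (p != nullptr) ++g_allocs;   // count only successful allocations
    return p;
}

static void countedFree(void *p) {
    if (p != nullptr) {
        ++g_frees;
        std::free(p);
    }
}

// Copy a C string into a malloc'd buffer; returns nullptr if malloc fails.
static char *dupResult(const char *text) {
    const std::size_t n = std::strlen(text) + 1;
    char *buf = static_cast<char *>(countedMalloc(n));
    if (buf != nullptr) std::memcpy(buf, text, n);
    return buf;
}

// Assumed handler shape (illustrative only). Every path assigns *retString,
// so the caller never reads an uninitialized pointer.
void exampleHandler(char *args[], int nargs, char **retString,
                    bool *pass, bool *error) {
    *pass = false;
    *error = false;
    *retString = nullptr;
    try {
        if (nargs < 1 || args[0] == nullptr) {
            *error = true;
            *retString = dupResult("missing argument");
        } else {
            const std::string result = std::string("ok:") + args[0];
            *retString = dupResult(result.c_str());
        }
    } catch (...) {
        // Nothing may propagate out of the external into the engine.
        *error = true;
        *retString = dupResult("unexpected exception");
    }
}
```

In this sketch the caller is assumed to release the returned buffer. If the engine frees it with a different C run-time than the one the external is statically linked to, the counters inside the external would not see that free.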

The externals use my C++ libraries for externals, but those have worked for a long time and in lots of environments. (More famous last words.)

The test stack does not seem to be blowing the Transcript call stack, but does have some interesting uses of wait with messages.

I'm not able to duplicate this in my environment on 3 machines. I've made an effort to make sure the environments are the same as that of my customer, but was in the middle of that when the troubleshooting effort was stopped. The customer test stack makes lots of different kinds of calls and uses send a lot. In any case, the test is not small and it takes a while to fail in the customer's environment.

Since I couldn't replicate the bug (I know how RunRev feels with some of the Rev bugs), I sent some variations that might shift the symptoms or even report what went wrong. Unfortunately, one of them (one that uses malloc less) did not display the problem, and testing of the batch of variations stopped right there, most untried.
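
One kind of variation that could report what went wrong is to stamp every result with the call name and a running sequence number, so a result delivered to the wrong call identifies itself. The following is only a sketch under that assumption. CallTagger and matchesCall are hypothetical names, and in practice the check would live in the Transcript code on the calling side rather than in C++.

```cpp
#include <string>

// Hypothetical helper: prefixes each result with "<sequence>:<call name>:".
class CallTagger {
public:
    CallTagger() : next_(1) {}

    // Called inside the external just before the result is returned.
    std::string tag(const std::string &callName, const std::string &payload) {
        return std::to_string(next_++) + ":" + callName + ":" + payload;
    }

private:
    unsigned long next_;  // sequence number for the next call
};

// True if the tagged result carries the sequence number and call name
// that the caller expected for the call it just made.
bool matchesCall(const std::string &tagged, unsigned long expectedSeq,
                 const std::string &expectedName) {
    const std::string prefix =
        std::to_string(expectedSeq) + ":" + expectedName + ":";
    return tagged.compare(0, prefix.size(), prefix) == 0;
}
```

With tags like these, a shifted result (say, the f1() result arriving as the answer to the following g() call) fails the check at the first shifted call instead of showing up later as puzzling data.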

I realize this is very weird and folks on this list, even external builders, may not have seen this, but I thought I'd give it a try.

I hope my customer can get his product to run reliably and I want to vindicate this external.

I can come up with a model for almost anything, but this baffles me. What can cause this?

OK, here is a model, but it is pretty wild: I know external calls are slow, but I would be surprised if Rev is pushing & pulling data through queues to another thread that runs external calls.

Dar Scott
Rev guy on the northern Rio Grande


