______________
INTRODUCTION

I have recently been corresponding with Ben Barringer about the performance of SCA. Some of us have also been talking about this amongst ourselves. I wanted to move the conversation here so that anyone can contribute and so that we capture the history.

_____________________
WHAT'S THE PROBLEM

The nub of it is that we spend a lot of time processing XML schema files in order to build an SDO model. Sometimes it is schema that SCA wants for itself (the schema for SOAP or the schema for WSDL, for example), but it is just as likely to be schema for the application's complex types, and these can be enormous: the schema for eBay is one example.

Ben is not the only one who has pointed out that our performance makes us unusable for some applications. Adam Trachtenberg and Rob Richards have both commented adversely on it:

http://www.trachtenberg.com/blog/2006/10/12/php-soap-vs-sdo/
("Where SDO really falls down for me is performance.")

http://groups.google.co.uk/group/phpsoa/browse_thread/thread/c827044441e74456/80b655115a3bca23?#80b655115a3bca23
(BTW this thread has some suggestions in it for things that we might do to help the performance)

It's not just one problem. We know that SCA for PHP sometimes loads the same schema more than once in a single request. We know that SDO runs any schema it does load through a SAX parser twice, when ideally it would do so once. We know there are places in the parsing where we can get some improvement. I don't expect any truly dramatic improvements if we just chip away at those, though.

____________________
DISCUSSION TO DATE

Given how infrequently a given WSDL or schema file changes, it makes no sense to pound away at it, rebuilding the SDO model from it on every request. We ought to cache the result: either the SDO model or the data factory that contains that model.

There are two approaches we could take to the interface:

1. We could try to keep the interface unchanged, so all PHP code continues to use just SDO_DAS_XML::create() and addTypes() ...

2. We could put in some explicit caching that is visible at the PHP level and is controlled by the SCA for PHP code, or even by the application code somehow.

Independently, there are a couple of possibilities for where and what we cache. Two options seem to be:

A. We could serialise the SDO model out to a file and read it back in when needed ...

B. We could hold on to the data factory in memory, within the sdo_php extension.

We examined option A, writing the model to a file. What we found is that there is already logic in the XML DAS to cache the model to a file, but it caches it as schema, so reading it back in just puts us back to loading schema again. We would therefore need to come up with a format, binary or human-readable, that is quicker to re-read. We imagine, by the way, that anything cached in this way does not have to last very long. We would not want to get into the situation of trying to keep file formats compatible across different releases of SDO, or between different platforms, or anything fancy.

So we have concluded that the simplest thing to do is probably to cache in memory: option B.

Now look at options 1 and 2, i.e. the interface. The ideal is probably to keep the interface unchanged, but in the meantime we might want to do something quicker to implement as a stop-gap, even if it puts a bit of responsibility onto the SCA code.

The thing that worries me about option 1 comes about because we have addTypes(). If you call create() followed by a string of addTypes() calls, at what point do you consider the data factory/model finished? And when a later request issues the same string of create() and addTypes() calls (hence wanting the exact same model), how do you spot that and use the cached one? It seems to me that this needs a solution. Perhaps allow create() to take an array of types, and make that array the key to the cached DAS?

You also need to consider how to detect when the files have changed, of course.
Would you inspect the file modification times to check they had not changed? Would you want to do some quick hash of the contents as a backup check?

I now want to finish this posting and leave it up to others to comment. I intend to close with an extract from Ben's most recent note. We had got to the point where I was suggesting the combination of caching in memory, under PHP control, and I had suggested explicit caching calls like

$xmldas->saveModelunderKey

and

$xmldas->reloadModelFromKey

Ben's reply:

I have a couple of suggestions. If you have a saveModuleUnderKey function it seems like you would need a clearModuleUnderKey function as well. Not sure if it would be good or not, but to reduce the number of function calls we could add another parameter to Create such as $cache=true. Then if it wasn't cached it would cache it with the file name, and if it was cached it would reload it from cache. In most real world scenarios that I can think of, most everyone will want to use the caching, so making it the default might be a good idea. So the code could be

$xmldas = SDO_DAS_XML::create('cmssys.xsd', true);

With this approach we will still need a SDO_DAS_XML::ClearCache() function. I realize this gets more complicated, because if we updated our xsd file we would need to clear the cache, so that leaves me checking the modified date on the xsd file every page load in dev and QA. Not trying to ask for too much here, but it would be great if the extension could check for date modified on the xsd file and reload automatically if it has been updated. So what I am proposing is:

1. Add a new cache parameter to Create() that defaults to true.

2. Use the file name as the key for caching. This would also avoid accidentally caching the same file with different keys.

3. Check the xsd file modified time and refresh the cache if changed. This could be controlled via a PHP.ini setting so it could be turned off for peak performance. In our Dev/QA region we would have it automatically refreshed; in production we would turn this off for better performance.

__________
EPILOGUE

I hope I have captured enough of the conversation to date that we can continue from here. Any comments, anyone?
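
As something concrete to comment on, here is a rough sketch in plain PHP of the behaviour that Ben's three points describe, extended to an ordered list of schema files so that it also covers the create() plus addTypes() case. Everything in it is hypothetical: ModelCache, get(), clear() and the $checkModified flag (standing in for the proposed PHP.ini setting) are names made up for illustration, and $loader stands in for the real SDO_DAS_XML::create() and addTypes() calls. Note also that a static array in PHP code only lives for one request; the real cache would have to be held inside the sdo_php extension to survive between requests.

```php
<?php
// Illustrative sketch only: none of these names exist in the SDO extension.
final class ModelCache
{
    /** Cache entries, keyed by the ordered list of schema file paths. */
    private static $entries = [];

    /**
     * Return the model for this ordered list of schema files, building it
     * with $loader when it is not cached or when a file has changed.
     */
    public static function get(array $schemaFiles, callable $loader, bool $checkModified = true)
    {
        // Resolve each path so the same file is never cached under two keys.
        $paths = array_map(function ($f) { return realpath($f) ?: $f; }, $schemaFiles);
        $key = implode("\n", $paths);

        $hit = self::$entries[$key] ?? null;
        if ($hit !== null && !$checkModified) {
            return $hit['model'];          // "production" mode: trust the cache
        }

        $fingerprint = self::fingerprint($schemaFiles);
        if ($hit !== null && $hit['fingerprint'] === $fingerprint) {
            return $hit['model'];          // files unchanged since last build
        }

        $model = $loader($schemaFiles);    // the expensive schema parsing
        self::$entries[$key] = ['fingerprint' => $fingerprint, 'model' => $model];
        return $model;
    }

    /** Drop everything, as a ClearCache() call would. */
    public static function clear(): void
    {
        self::$entries = [];
    }

    /** Modification time and size of every file, in order. */
    private static function fingerprint(array $schemaFiles): string
    {
        clearstatcache();                  // PHP caches stat() results
        $parts = [];
        foreach ($schemaFiles as $file) {
            $parts[] = filemtime($file) . ':' . filesize($file);
        }
        return implode('|', $parts);
    }
}

// Usage, with a counter in place of real schema loading.
$xsd = tempnam(sys_get_temp_dir(), 'xsd');
file_put_contents($xsd, '<schema/>');

$builds = 0;
$loader = function (array $files) use (&$builds) {
    $builds++;
    return 'model #' . $builds;
};

ModelCache::get([$xsd], $loader);   // builds the model
ModelCache::get([$xsd], $loader);   // served from the cache
touch($xsd, time() + 60);           // simulate an edited schema file
ModelCache::get([$xsd], $loader);   // rebuilt because the timestamp changed

echo "builds: $builds\n";           // prints "builds: 2"
```

Using the ordered file list as the key is one possible answer to the "when is the model finished?" question, but it only works if the caller can supply the whole list up front. A content hash could be added to the fingerprint as the backup check, at the cost of reading every file on each request.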