Re: [Harbour] 2009-01-27 10:14 UTC+0100 Francesco Saverio Giudice (info/at/fsgiudice.com)
Hi Przemek,

On 02/02/2009 11:26, Przemyslaw Czerpak wrote:
> This problem should be eliminated by:
>    2009-02-02 11:02 UTC+0100 Przemyslaw Czerpak (druzus/at/priv.onet.pl)

Thank you very much for your efforts, I will try it.

Best regards,
Francesco
___
Harbour mailing list
Harbour@harbour-project.org
http://lists.harbour-project.org/mailman/listinfo/harbour
Re: [Harbour] 2009-01-27 10:14 UTC+0100 Francesco Saverio Giudice (info/at/fsgiudice.com)
On Wed, 28 Jan 2009, Przemyslaw Czerpak wrote:

Hi,

> I suggest ignoring this problem until I change the structures used to
> hold static variables. The chance of a GPF is rather small. A side
> effect of the array preallocation added by Mindaugas is a noticeably
> reduced number of memory reallocations when new statics are added, so
> the risk that the problem will be exploited is really small. Of course
> it _MUST_ be fixed, but it should not be critical during uHTTPd
> development.

This problem should be eliminated by:
   2009-02-02 11:02 UTC+0100 Przemyslaw Czerpak (druzus/at/priv.onet.pl)

Here are the last points on my MT TODO list:

1. Tracelog is not MT safe. It's not a big problem for normal
   applications because this code is disabled in standard builds and
   enabled only on explicit user request during Harbour compilation to
   debug some Harbour internals, so it's rather for developers only.
2. PROFILER cannot collect information for each thread separately.
   It's optional code, disabled by default. Before we touch it, please
   think about how it should work for MT programs.
3. DEBUGGER: only the main thread debugger can see the names of file
   wide STATICs and has the line number information needed for good
   break points. We should add code to share this information between
   threads.
4. hb_threadQuitRequest() can be ignored by a thread in some cases due
   to a race condition. It may happen if the thread simultaneously
   overwrites the request sent by the caller, f.e. by its own BREAK.
   I can resolve it, but we can also leave it as is and document such
   behavior as expected, or even remove this function. Killing other
   threads in such a way is dangerous and can be used only in some
   simple situations. It's much safer when the user uses his own
   mechanism to terminate threads at places which are safe for his code.
5. We should check contrib code and eliminate non MT safe code or
   document non MT safe functions.
6. Possible TODO:
   - add CRITICAL {FUNCTION|PROCEDURE}

As you can see, there is nothing critical in the above list that can
block MT programming.

best regards,
Przemek
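Point 4 above recommends a user-level termination mechanism over hb_threadQuitRequest(). A minimal sketch of that idea, assuming an MT build (e.g. hbmk2 -mt); Worker(), RequestStop(), StopRequested(), s_mtxStop and s_lStop are hypothetical names used only for illustration, not anything from Harbour core or uhttpd:

```harbour
// Sketch only: cooperative thread termination with a shared stop flag.
// All function and variable names below are made up for this example.

STATIC s_mtxStop            // mutex protecting s_lStop
STATIC s_lStop := .F.       // file wide STATICs are shared by all threads

PROCEDURE Main()

   LOCAL pThread

   s_mtxStop := hb_mutexCreate()
   pThread := hb_threadStart( @Worker() )

   hb_idleSleep( 1 )           // let the worker run for a while
   RequestStop()               // ask it to finish
   hb_threadJoin( pThread )    // wait until it exits at a safe point
   ? "worker finished"

   RETURN

PROCEDURE RequestStop()

   hb_mutexLock( s_mtxStop )
   s_lStop := .T.
   hb_mutexUnlock( s_mtxStop )

   RETURN

FUNCTION StopRequested()

   LOCAL lStop

   hb_mutexLock( s_mtxStop )
   lStop := s_lStop
   hb_mutexUnlock( s_mtxStop )

   RETURN lStop

PROCEDURE Worker()

   DO WHILE ! StopRequested()
      // One unit of work goes here. The flag is checked only between
      // units, so the thread never stops in the middle of one.
      hb_idleSleep( 0.1 )
   ENDDO

   RETURN
```

The worker decides for itself where stopping is safe, so the race described for hb_threadQuitRequest() does not arise in this scheme.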
Re: [Harbour] 2009-01-27 10:14 UTC+0100 Francesco Saverio Giudice (info/at/fsgiudice.com)
Hi Przemek,

On 28/01/2009 3:52, Przemyslaw Czerpak wrote:

First of all, thank you for your detailed explanation.

> [snip]
> HB_REINIT_STATIC
> HB_KEEP_LOCAL_REFERENCES

good

> HB_OVERLOAD_FUNCTIONS

interesting but, as you pointed out, dangerous in code like an HTTP
server, so my vote is no.

> So what should you do now?
> I suggest ignoring this problem until I change the structures used to
> hold static variables. The chance of a GPF is rather small. A side
> effect of the array preallocation added by Mindaugas is a noticeably
> reduced number of memory reallocations when new statics are added, so
> the risk that the problem will be exploited is really small. Of course
> it _MUST_ be fixed, but it should not be critical during uHTTPd
> development.
> Now concentrate on the above information. I hope it helps to
> understand how .hrb modules work.
> BTW all of the above is also valid, with exactly the same conditions,
> for compiled .prg code loaded/unloaded from dynamic libraries (.dll,
> .so, .sl, .dyn, ...).

Ok, I will go on. Thank you again.

Just 3 other questions:

1 - I would like to restrict some functions that may be dangerous when
called by a user, like zapping a database or FErase() (these are only
two examples, and maybe not pertinent in real development), or other
functions like dbUseArea(), which I do not want to be used because I
would like to add an HTTP_dbUseArea() that can open databases and
control them.
Is there a way to add an HRB_INHIBIT_FUNCTIONS that can inhibit some
function calls from hrb at HVM level?

2 - In uhttpd.prg the piece of code that runs HRB modules is guarded by
a mutex:
--
   .
   // Lock HRB to avoid MT race conditions
   IF hb_mutexLock( s_hmtxHRB )
      IF HRB_ACTIVATE_CACHE
         // caching modules
         IF !hb_HHasKey( s_hHRBModules, cFileName )
            hb_HSet( s_hHRBModules, cFileName, HRB_LoadFromFile( uOSFileName( cFileName ) ) )
         ENDIF
         cHRBBody := s_hHRBModules[ cFileName ]
      ELSE
         cHRBBody := HRB_LoadFromFile( uOSFileName( cFileName ) )
      ENDIF
      IF !EMPTY( pHRB := HB_HRBLOAD( cHRBBody ) )
         xResult := HRBMAIN()
         HB_HRBUNLOAD( pHRB )
      ELSE
         uSetStatusCode( 404 )
         t_cErrorMsg := "File does not exist: " + cFileName
      ENDIF
      hb_mutexUnlock( s_hmtxHRB )
   ENDIF
   .

STATIC FUNCTION HRB_LoadFromFile( cFile )
   RETURN hb_memoread( cFile )
--
Is it correct, or is it unnecessary because this is already done by
HVM?

3 - In the same piece of code (not uploaded yet) I have thought of a
memory cache of the HRB body using a hash (currently HRB_ACTIVATE_CACHE
can be .T. or .F.), but I am not able to evaluate the cost of loading
from file every time (where I suppose the OS helps by caching the file
in memory) against scanning a hash and retrieving the string. There is
also a side effect: loading an hrb module from file lets me update hrb
code on the fly, while the other method loads the hrb only the first
time, then I need to purge the list to have it loaded again.

TIA

Best regards,
Francesco
Re: [Harbour] 2009-01-27 10:14 UTC+0100 Francesco Saverio Giudice (info/at/fsgiudice.com)
On Wed, 28 Jan 2009, Francesco Saverio Giudice wrote:

Hi,

> Just 3 other questions:
> 1 - I would like to restrict some functions that may be dangerous when
> called by a user, like zapping a database or FErase() (these are only
> two examples, and maybe not pertinent in real development), or other
> functions like dbUseArea(), which I do not want to be used because I
> would like to add an HTTP_dbUseArea() that can open databases and
> control them.
> Is there a way to add an HRB_INHIBIT_FUNCTIONS that can inhibit some
> function calls from hrb at HVM level?

As long as you do not block the macro compiler it will not be possible.
Even if you block it in this module, a programmer can find another way
to execute any function through the macro compiler inside other
modules, f.e. he can try to use some filter or even index key
expressions passed as strings, adding his own forbidden code.
To make it work we would have to locate and document all functions
which perform macro compilation. We would have to declare a forbidden
namespace for a given thread and respect this namespace in all macro
compilations.
It's a complicated problem. To create a really working solution that
will be acceptable for programmers using a restrictive environment,
it's necessary to first carefully document all possibly dangerous
places in core code and what will happen if some things are blocked.

> 2 - In uhttpd.prg the piece of code that runs HRB modules is guarded
> by a mutex:
> --
>    .
>    // Lock HRB to avoid MT race conditions
>    IF hb_mutexLock( s_hmtxHRB )
>       IF HRB_ACTIVATE_CACHE
>          // caching modules
>          IF !hb_HHasKey( s_hHRBModules, cFileName )
>             hb_HSet( s_hHRBModules, cFileName, ;
>                      HRB_LoadFromFile( uOSFileName( cFileName ) ) )
>          ENDIF
>          cHRBBody := s_hHRBModules[ cFileName ]
>       ELSE
>          cHRBBody := HRB_LoadFromFile( uOSFileName( cFileName ) )
>       ENDIF
>       IF !EMPTY( pHRB := HB_HRBLOAD( cHRBBody ) )
>          xResult := HRBMAIN()
>          HB_HRBUNLOAD( pHRB )
>       ELSE
>          uSetStatusCode( 404 )
>          t_cErrorMsg := "File does not exist: " + cFileName
>       ENDIF
>       hb_mutexUnlock( s_hmtxHRB )
>    ENDIF
>    .
>
> STATIC FUNCTION HRB_LoadFromFile( cFile )
>    RETURN hb_memoread( cFile )
> --
> Is it correct, or is it unnecessary because this is already done by
> HVM?

For HVM it's not necessary, but you need it for your own code to
protect the s_hHRBModules variable and the public function namespace:
you are using a fixed starting function, HRBMAIN(), and only one public
function with a given name can be registered in HVM, so this code can
be executed by only one thread at a time and you have to block other
threads.

> 3 - In the same piece of code (not uploaded yet) I have thought of a
> memory cache of the HRB body using a hash (currently
> HRB_ACTIVATE_CACHE can be .T. or .F.), but I am not able to evaluate
> the cost of loading from file every time (where I suppose the OS
> helps by caching the file in memory) against scanning a hash and
> retrieving the string. There is also a side effect: loading an hrb
> module from file lets me update hrb code on the fly, while the other
> method loads the hrb only the first time, then I need to purge the
> list to have it loaded again.

The performance will be strictly platform dependent. Probably in Linux
and other *nixes you will not find a big speed difference. But in
Windows, or when files are not stored on local drives, it will increase
performance. Anyhow, in such case I suggest adding an option to discard
buffers without reloading the server.
When the cache is disabled I suggest loading the file outside s_hmtxHRB
to reduce the time it's locked.

best regards,
Przemek
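The two suggestions at the end of this reply (read the file outside the lock when the cache is off, and allow cached bodies to be discarded without restarting the server) could look roughly like the sketch below. It reuses the names s_hmtxHRB and s_hHRBModules from the thread. HRB_CacheInit(), RunHRB() and HRB_PurgeCache() are hypothetical helpers, and hb_hrbDo(), which runs the module's startup function, stands in for the fixed HRBMAIN() call of the original. This is an assumption-laden illustration, not the uhttpd code.

```harbour
// Sketch only: HRB_CacheInit(), RunHRB() and HRB_PurgeCache() are
// hypothetical names; error handling is reduced to the bare minimum.

STATIC s_hmtxHRB            // serializes module load/run/unload
STATIC s_hHRBModules        // file name => .hrb body kept in memory

PROCEDURE Main( cFile )

   HRB_CacheInit()
   IF ! Empty( cFile )
      ? RunHRB( cFile, .T. )     // cached run
      HRB_PurgeCache( cFile )    // next run reads the file again
   ENDIF

   RETURN

PROCEDURE HRB_CacheInit()

   s_hmtxHRB := hb_mutexCreate()
   s_hHRBModules := { => }

   RETURN

FUNCTION RunHRB( cFileName, lUseCache )

   LOCAL cHRBBody, pHRB, xResult

   IF ! lUseCache
      // Cache disabled: do the disk read before taking the lock so
      // other threads are not blocked while the file is loaded.
      cHRBBody := hb_MemoRead( cFileName )
   ENDIF

   IF hb_mutexLock( s_hmtxHRB )
      IF lUseCache
         IF ! hb_HHasKey( s_hHRBModules, cFileName )
            cHRBBody := hb_MemoRead( cFileName )
            IF ! Empty( cHRBBody )
               s_hHRBModules[ cFileName ] := cHRBBody
            ENDIF
         ELSE
            cHRBBody := s_hHRBModules[ cFileName ]
         ENDIF
      ENDIF
      IF ! Empty( cHRBBody ) .AND. ! Empty( pHRB := hb_hrbLoad( cHRBBody ) )
         xResult := hb_hrbDo( pHRB )    // assumption: replaces HRBMAIN()
         hb_hrbUnload( pHRB )
      ENDIF
      hb_mutexUnlock( s_hmtxHRB )
   ENDIF

   RETURN xResult

PROCEDURE HRB_PurgeCache( cFileName )

   // Without an argument the whole cache is dropped.
   IF hb_mutexLock( s_hmtxHRB )
      IF cFileName == NIL
         s_hHRBModules := { => }
      ELSEIF hb_HHasKey( s_hHRBModules, cFileName )
         hb_HDel( s_hHRBModules, cFileName )
      ENDIF
      hb_mutexUnlock( s_hmtxHRB )
   ENDIF

   RETURN
```

The purge takes the same mutex as the runner, so a cached body is never removed while another thread is reading it from the hash.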
Re: [Harbour] 2009-01-27 10:14 UTC+0100 Francesco Saverio Giudice (info/at/fsgiudice.com)
Hi All

After examining the uhttpd implementation I am wondering whether .hrb
code can be loaded once and reused for every subsequent request. That
would bring it on par with Xbase++'s WAA (Web Application Adaptor)
protocol, where the program DLL (called a package) is loaded only once.

Regards
Pritpal Bedi
Re: [Harbour] 2009-01-27 10:14 UTC+0100 Francesco Saverio Giudice (info/at/fsgiudice.com)
Hi Pritpal,

On 27/01/2009 16:58, Pritpal Bedi wrote:
> Hi All
>
> After examining the uhttpd implementation I am wondering whether .hrb
> code can be loaded once and reused for every subsequent request. That
> would bring it on par with Xbase++'s WAA (Web Application Adaptor)
> protocol, where the program DLL (called a package) is loaded only
> once.

This is one point on my todo list, but I don't know if there are any
problems with storing hrb modules in memory and reusing them.
What happens to the loaded symbol table? Every module will have
HRBMAIN() and could have other function names in common.
And another question: what is faster? Loading from file using
__HRBLOAD() / __HRBUNLOAD(), or scanning an array / hash table every
time? I suppose the second, but I have doubts about the symbol table.

Przemek, Viktor?

Best regards
Francesco
Re: [Harbour] 2009-01-27 10:14 UTC+0100 Francesco Saverio Giudice (info/at/fsgiudice.com)
I think the .hrb modules aren't loaded on a per thread basis, so it's
not like several of them could be loaded in parallel.

As far as I can see from the sources, loading an .hrb from memory is
already supported if you pass the .hrb as a string to the HB_HRBLOAD()
function. Maybe a separate HB_HRBLOADFROMMEMORY() would be clearer.

[ BTW I'd recommend using HB_HRB*() names instead of the obsolete
__HRB*() ones. ]

Brgds,
Viktor

On Tue, Jan 27, 2009 at 6:45 PM, Francesco Saverio Giudice
i...@fsgiudice.com wrote:
> this is one point on my todo list, but I don't know if there are any
> problems with storing hrb modules in memory and reusing them.
> What happens to the loaded symbol table? Every module will have
> HRBMAIN() and could have other function names in common.
> And another question: what is faster? Loading from file using
> __HRBLOAD() / __HRBUNLOAD(), or scanning an array / hash table every
> time? I suppose the second, but I have doubts about the symbol table.
>
> Przemek, Viktor?
Re: [Harbour] 2009-01-27 10:14 UTC+0100 Francesco Saverio Giudice (info/at/fsgiudice.com)
> > I think the .hrb modules aren't loaded on a per thread basis, so
> > it's not like several of them could be loaded in parallel.
>
> This is for Przemek to clarify. What happens in this case? Are they
> MT safe? In uhttpd they *can be loaded* in parallel in different
> threads.
> One thing I may do is to serialize them onto a single dedicated
> thread that runs hrb modules one after another and stores them to
> avoid the next file load. Is this the way?

Yes, let's wait for Przemek. It was pretty much a guess on my side. I
haven't started to use .hrbs yet, but they seem very useful.

> > [ BTW I'd recommend using HB_HRB*() names instead of the obsolete
> > __HRB*() ones. ]
>
> Done, I will upload it in the next commit.

Thank you.

Brgds,
Viktor
Re: [Harbour] 2009-01-27 10:14 UTC+0100 Francesco Saverio Giudice (info/at/fsgiudice.com)
On Wed, 28 Jan 2009, Francesco Saverio Giudice wrote:

> > I think the .hrb modules aren't loaded on a per thread basis, so
> > it's not like several of them could be loaded in parallel.
> This is for Przemek to clarify. What happens in this case? Are they
> MT safe? In uhttpd they *can be loaded* in parallel in different
> threads.
> One thing I may do is to serialize them onto a single dedicated
> thread that runs hrb modules one after another and stores them to
> avoid the next file load. Is this the way?

We have a common public function list, so when one thread loads an
.hrb module all public functions in this module become visible and
accessible to all other threads. When the module is unloaded the
functions are removed. It's not safe to unload an .hrb module while
other threads, or even the same thread, are executing code from this
module, so the programmer should avoid such situations. This potential
problem can be exploited even in a single thread program, so it's not
directly related to MT mode.

When an .hrb module is loaded HVM allocates a symbol table for this
module. This symbol table is never released, even if the module is
unloaded. In such case the symbol table is only marked as unused and
will be reused, instead of allocating a new one, if the same module is
loaded again or another module using exactly the same symbols is
loaded.

It's possible to implement full module unloading, but now it would be
hard to introduce due to backward compatibility. Simply, at the
beginning Harbour was not designed for such a situation, and addresses
of function symbols were used as constant values by different (mostly
3-rd party) code. If we implement module unloading then such code will
have to be updated. In fact only dynamic symbol table addresses
(PHB_DYNS) are really constant, and it's guaranteed that they will
never change during HVM execution. So I decided to implement only
symbol table recycling, but I'm open to changing it in the future.

Anyhow, the current behavior also has a few interesting features.

When an .hrb module that uses static variables is loaded, an area for
the new statics is allocated during module loading and attached to the
symbol table. This area is also never freed and is reused together
with the symbol table when the module is loaded again. Static
variables are initialized only once, when the module is loaded for the
1-st time. Sometimes this can be nice and really helpful behavior,
f.e. a module can store some data in static variables and then access
it when it's loaded again, or it can pass a reference to a static
variable to other code which will update it while the module is
unloaded, and then retrieve the result from it after the next load.
But sometimes people may expect full reinitialization. Now it's not
possible. If we want to have both functionalities then we will have to
introduce some flags to HB_HRBLOAD() to control it, similar to the
flags passed to hb_threadStart() which control memvar inheritance,
f.e.:

   HB_REINIT_STATIC

But please remember that reusing the module symbol table and static
variables without reinitialization has yet another very important
function. If an .hrb module defines new classes, then when the module
is unloaded and loaded again the same class definition is also reused.
Otherwise a new class would be allocated, because we keep class
references in static variables inside the class function.

The next thing which can be controlled by such flags is the action for
duplicated public function symbols. Now, when an .hrb module is loaded
and it contains a public function which already exists in HVM, this
function is not registered and all references to this function in the
.hrb module are replaced by the function which already exists in HVM,
so such a function is simply not visible or accessible at all. Here we
can introduce the following actions:

   HB_KEEP_LOCAL_REFERENCES
      It means that local references to a public function will not be
      overloaded by the function with the same name already registered
      in HVM. So other code and the macro compiler will still access
      the public HVM function, but code executed from the .hrb module
      will access the function defined in this module.

   HB_OVERLOAD_FUNCTIONS
      When a public function name conflict appears, the function
      defined in the .hrb module will overload the one which existed
      before. This is a somewhat dangerous functionality which should
      not be enabled in code like an HTTP server. In practice it will
      be usable for users who want to upgrade/patch their programs
      using .hrb modules. It's an important and interesting
      functionality, but rather limited to local usage. It also
      produces other problems, f.e. how to restore functions
      overloaded by a few modules when they are unloaded in a
      different order. I do not know if I want to fight deeply with
      such a problem. If I implement such functionality then probably
      it will be a very basic version, and after unloading the
      previous functions will not be restored. If a module is unloaded
      and some other code tries to execute a function or method which
      was in this module, f.e. using a function reference, then it
      simply receives RT error