Re: [OpenSER-Devel] RFC: memory management in database modules
On Friday 08 February 2008, Bogdan-Andrei Iancu wrote: I'm really interested in focusing on things that really drawback openser (as throughput) - and I do not refer at the non-sense lab tests (TM performances :P) which more or less have 0 relevance in real life scenarios. So, as a first plan, I wan to do some work on the DB part : I want to add support for prepared statements which will give some speed up in standard DB queries - 90% of the queries are standard and there is no need to prepare, build query, pass it to the driver, to parse it (by driver), etc I want first to add this support in usrloc module, only for mysql backend. Hi Bogdan, great that you plan to work on this problem. Prepared statements are also on my wish list.. :-) Do you plan also to change the DB API to only export a loading function (like in usrloc), as Daniel suggested? If the result are promising, we can extend this to all modules using DB. Then, we can investigate which other drivers support prepared statements and to enhance them also - if not native supported, we can do it the openser module. Right, i also see no real need to change all modules to use prepared statements. We should concentrate on performance critical parts like usrloc and perhaps auth_db like you said. Henning ___ Devel mailing list Devel@lists.openser.org http://lists.openser.org/cgi-bin/mailman/listinfo/devel
Re: [OpenSER-Devel] RFC: memory management in database modules
Hi Victor, Yes, I know your patch and I plan to use it as start point (patch id 1717848).I will port it to 1.4 and then add the DB API support and extend usrloc/auth_db to make use of it. Thanks and regards, Bogdan Victor Gamov wrote: Hi Bogdan! Sometimes ago I make patch for 1.2.0 which introduce prepared statements into mysql module. This patch was removed later because it incompatible with current versions of OpenSER but I still have in mind to reproduce this patch for CURRENT when DB API will be stable. So if you plan to add support for prepared statements you can see into this patch (It was worked for our 1.2.0 installation for some time but now we use 1.2.2). This patch changes driver only not other modules like usrloc or other. But introducing new DB API function allowing prepare statement and then use it in module may be more effectively. Bogdan-Andrei Iancu wrote: Hi Henning, I'm really interested in focusing on things that really drawback openser (as throughput) - and I do not refer at the non-sense lab tests (TM performances :P) which more or less have 0 relevance in real life scenarios. So, as a first plan, I wan to do some work on the DB part : I want to add support for prepared statements which will give some speed up in standard DB queries - 90% of the queries are standard and there is no need to prepare, build query, pass it to the driver, to parse it (by driver), etc I want first to add this support in usrloc module, only for mysql backend. If the result are promising, we can extend this to all modules using DB. Then, we can investigate which other drivers support prepared statements and to enhance them also - if not native supported, we can do it the openser module. I know this is a long shot, but first I want to experiment with usrloc and mysql to see if really makes a difference. ___ Devel mailing list Devel@lists.openser.org http://lists.openser.org/cgi-bin/mailman/listinfo/devel
Re: [OpenSER-Devel] RFC: memory management in database modules
Hi Bogdan! Sometimes ago I make patch for 1.2.0 which introduce prepared statements into mysql module. This patch was removed later because it incompatible with current versions of OpenSER but I still have in mind to reproduce this patch for CURRENT when DB API will be stable. So if you plan to add support for prepared statements you can see into this patch (It was worked for our 1.2.0 installation for some time but now we use 1.2.2). This patch changes driver only not other modules like usrloc or other. But introducing new DB API function allowing prepare statement and then use it in module may be more effectively. Bogdan-Andrei Iancu wrote: Hi Henning, I'm really interested in focusing on things that really drawback openser (as throughput) - and I do not refer at the non-sense lab tests (TM performances :P) which more or less have 0 relevance in real life scenarios. So, as a first plan, I wan to do some work on the DB part : I want to add support for prepared statements which will give some speed up in standard DB queries - 90% of the queries are standard and there is no need to prepare, build query, pass it to the driver, to parse it (by driver), etc I want first to add this support in usrloc module, only for mysql backend. If the result are promising, we can extend this to all modules using DB. Then, we can investigate which other drivers support prepared statements and to enhance them also - if not native supported, we can do it the openser module. I know this is a long shot, but first I want to experiment with usrloc and mysql to see if really makes a difference. -- CU, Victor Gamov ___ Devel mailing list Devel@lists.openser.org http://lists.openser.org/cgi-bin/mailman/listinfo/devel
Re: [OpenSER-Devel] RFC: memory management in database modules
Hi Henning,

Indeed, it will be interesting to run such tests, but the current mem manager gives us some major advantages:

1) in private memory, we have no locking. malloc from glibc locks all the time in order to be thread safe
2) debugging possibilities - I know there are hundreds of tools for doing this, but we have it built in and better tuned for openser
3) runtime inspection - mem usage, fragments

Anyhow, we will never be able to drop the internal openser mem manager, as it is mandatory for shm memory ;)

Regards,
Bogdan

Henning Westerholt wrote:
> On Thursday 07 February 2008, Dan Pascu wrote:
>>> Well, do not take me for a performance maniac :D... As I said, it is not about performance but about functionality - memory fragmentation is something serious and we should try to avoid it as much as possible.
>>
>> As a side note to the discussed issue, I don't think it is realistic to assume that memory fragmentation will not occur just by avoiding some DB memory allocations that vary in size too much. Given the varying size of SIP requests, if the proxy is online for a long time and servicing many requests per second, memory fragmentation will occur sooner or later, unless the memory allocator is smart and works around it, or it has a defragmenter that is run when memory gets too fragmented.
>
> It would be interesting to run OpenSER on a recent kernel with a recent glibc and see how the performance figures of the standard malloc now compare against the internal f_malloc. The memory allocator in the kernel was reworked quite a few times, and some work has been done in the active defragmentation area (but I'm not sure if this has been merged yet).
>
> Cheers,
> Henning
Re: [OpenSER-Devel] RFC: memory management in database modules
Hi Henning,

I'm really interested in focusing on things that really hold back openser (in terms of throughput) - and I am not referring to the nonsense lab tests (TM performances :P), which have more or less zero relevance in real-life scenarios.

So, as a first plan, I want to do some work on the DB part: I want to add support for prepared statements, which will give some speed-up in standard DB queries - 90% of the queries are standard and there is no need to prepare, build the query, pass it to the driver, have it parsed (by the driver), etc. every time.

I want to first add this support in the usrloc module, only for the mysql backend. If the results are promising, we can extend this to all modules using DB. Then we can investigate which other drivers support prepared statements and enhance them as well - if not natively supported, we can do it in the openser module.

I know this is a long shot, but first I want to experiment with usrloc and mysql to see if it really makes a difference.

Regards,
Bogdan

Henning Westerholt wrote:
> On Wednesday 06 February 2008, Bogdan-Andrei Iancu wrote:
>> On Tuesday 05 February 2008, Bogdan-Andrei Iancu wrote:
>>> BTW, do you have any estimate (in time) for the start and end of these changes? I also plan some major changes in the DB interaction, and preferably they should not overlap ;)
>
> Hi Bogdan, sounds interesting, what type of changes do you plan? Just curious.. ;-)
>
> Cheers,
> Henning
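The "prepare once, execute many times" idea described in this message can be sketched as a toy model. This is only an illustration under stated assumptions: no real database driver is involved, and `sk_stmt_t`, `sk_prepare` and `sk_lookup` are hypothetical names, not part of the OpenSER DB API or of any patch mentioned in the thread. The table and column names are illustrative as well. The counters simply show that the query text is built and "parsed" once, while every later call only binds a key and executes.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical cached statement; a real driver would hold a handle here. */
typedef struct {
    char text[128];   /* query text with a '?' placeholder, built once */
    int prepared;     /* non-zero once the text has been built */
    int parse_count;  /* how often the (simulated) driver parsed the text */
    int exec_count;   /* how often the statement was executed */
} sk_stmt_t;

/* Build the query text once; this stands in for build + send + parse. */
static int sk_prepare(sk_stmt_t *st, const char *table, const char *key_col)
{
    int n = snprintf(st->text, sizeof(st->text),
                     "select * from %s where %s=?", table, key_col);
    if (n < 0 || (size_t)n >= sizeof(st->text))
        return -1;
    st->prepared = 1;
    st->parse_count++;
    return 0;
}

/* Look up one key; prepares lazily on first use, then only executes. */
static int sk_lookup(sk_stmt_t *st, const char *key)
{
    if (!st->prepared && sk_prepare(st, "location", "username") != 0)
        return -1;
    (void)key; /* a real driver would bind the key to the placeholder here */
    st->exec_count++;
    return 0;
}
```

With a zero-initialised `sk_stmt_t`, three calls to `sk_lookup` leave `parse_count` at 1 and `exec_count` at 3. In a real mysql backend the corresponding work would presumably go through the client library's statement calls such as `mysql_stmt_prepare` and `mysql_stmt_execute`; how that maps onto the DB API is exactly what the experiment proposed above would have to determine.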
Re: [OpenSER-Devel] RFC: memory management in database modules
Hi Henning, Henning Westerholt wrote: On Friday 08 February 2008, Bogdan-Andrei Iancu wrote: I'm really interested in focusing on things that really drawback openser (as throughput) - and I do not refer at the non-sense lab tests (TM performances :P) which more or less have 0 relevance in real life scenarios. So, as a first plan, I wan to do some work on the DB part : I want to add support for prepared statements which will give some speed up in standard DB queries - 90% of the queries are standard and there is no need to prepare, build query, pass it to the driver, to parse it (by driver), etc I want first to add this support in usrloc module, only for mysql backend. Hi Bogdan, great that you plan to work on this problem. Prepared statements are also on my wish list.. :-) Do you plan also to change the DB API to only export a loading function (like in usrloc), as Daniel suggested? haven;t took this into consideration as it is more like cosmetics ;) - I want first to explore the technical aspect of the prepared statements and after that, why not, we can look for this. Regards, Bogdan If the result are promising, we can extend this to all modules using DB. Then, we can investigate which other drivers support prepared statements and to enhance them also - if not native supported, we can do it the openser module. Right, i also see no real need to change all modules to use prepared statements. We should concentrate on performance critical parts like usrloc and perhaps auth_db like you said. ___ Devel mailing list Devel@lists.openser.org http://lists.openser.org/cgi-bin/mailman/listinfo/devel
Re: [OpenSER-Devel] RFC: memory management in database modules
On Tuesday 05 February 2008, Daniel-Constantin Mierla wrote:
[..]
>> postgres, db_berkeley allocate new memory and copy all string values from the database to the internal representation. Modules that use these drivers don't need to copy the values, even after the result set is freed.
>
> Quite strange: if the copy is not freed with free_result(), when does that happen? At first thought, it seems exposed to a memory leak.
>
> db_berkeley seems to be built on the dbtext structure. In dbtext, the result is duplicated in private memory, as the tables are stored in shared memory and making direct references would be exposed to races.

Hi Daniel,

could you point me to the position in the db_text module where the result is duplicated? Are only the STR, STRING and BLOB values duplicated, and thus must be freed later, or all values that are defined in db_type_t?

Thank you,
Henning
Re: [OpenSER-Devel] RFC: memory management in database modules
Hi Henning,

It is not a rush at all - I just need an estimate to make my own plans :). I agree that such changes take time, and to be honest I wasn't expecting such a short timeframe from you ;). I thought it might take some time (weeks).

Thanks and regards,
Bogdan

Henning Westerholt wrote:
> On Tuesday 05 February 2008, Bogdan-Andrei Iancu wrote:
>> BTW, do you have any estimate (in time) for the start and end of these changes? I also plan some major changes in the DB interaction, and preferably they should not overlap ;)
>
> I don't want to rush this type of change; better to test a little bit more. But I think I'll be able to finish the first part of this work (for unixodbc, postgres, mysql) on Monday or Tuesday next week. Is this OK for you?
>
> In a second part the postgres and db_berkeley drivers could be changed to not do this value copying anymore. This way we don't break too much functionality at once, and the changes can be reviewed better.
>
> Cheers,
> Henning
Re: [OpenSER-Devel] RFC: memory management in database modules
On Tuesday 05 February 2008, Bogdan-Andrei Iancu wrote: For the VAL_NAMES-s exists the same issue, i'll evaluate if it make sense to introduce a flag for them too. well...is it needed any altering of the column names? so far I think all are static (as returned by driver) No, i don't think that altering of the colum names is necessary. But Postgres and db_berkeley does this VAL_NAMES-s copying, i've not investigate dbtext yet. If this is not necessary even in dbtext, then we could perhaps remove this completly. BTW, do you have any estimation (as time) about the start and end of these changes? I plan also some major changes in the DB interaction and preferable will be not to overlap ;) I don't want to rush this type of changes, better to test a little bit more.. But i think i'll be able to finish the first part of this work (for unixodbc, postgres, mysql) on monday or tuesday next week. Is this ok for you? In a second part the postgres and db_berkeley driver could be changed to not do this value coping anymore. This way we don't break to much functionality at once, the changes could be better reviewed. Cheers, Henning ___ Devel mailing list Devel@lists.openser.org http://lists.openser.org/cgi-bin/mailman/listinfo/devel
[OpenSER-Devel] RFC: memory management in database modules
Hi,

I have a question about the 'correct' way to do the memory management for results in the database modules. At the moment there are two different styles:

mysql, unixodbc don't allocate new memory and just assign the string pointer of the result set to the internal OpenSER representation. Modules that need to use these results have to copy them, because they are not available after a call to the DB-specific result free function.

postgres, db_berkeley allocate new memory and copy all string values from the database to the internal representation. Modules that use these drivers don't need to copy the values, even after the result set is freed.

As mysql is the most used database, I assume that every module copies the values from the result set after the query execution. This is unnecessary for the postgres DB, and it further prevents me from using only one memory management function for the internal representation of the DB structures.

As we have many more modules than database connectors in the code, I think it would make more sense to change the postgres and db_berkeley modules to match the behaviour of the mysql module.

Any comments?

Henning
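The difference between the two styles, and why a calling module copies what it wants to keep, can be shown with a minimal sketch. It assumes simplified, hypothetical types (`sketch_result_t`, `sketch_val_t`) and helper names that are not the real OpenSER DB API or any database client library. It only models the mysql/unixodbc style, where the value points into the result set.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a driver result set (not a real OpenSER type). */
typedef struct {
    char *buf;        /* memory owned by the driver result set */
} sketch_result_t;

/* Hypothetical stand-in for the internal value representation. */
typedef struct {
    const char *s;    /* string value as seen by the calling module */
} sketch_val_t;

/* Simulate a query: the driver allocates and owns the returned data. */
static sketch_result_t *sketch_query(const char *text)
{
    sketch_result_t *r = malloc(sizeof(*r));
    if (r == NULL)
        return NULL;
    size_t len = strlen(text) + 1;
    r->buf = malloc(len);
    if (r->buf == NULL) {
        free(r);
        return NULL;
    }
    memcpy(r->buf, text, len);
    return r;
}

/* mysql/unixodbc style: no allocation, the value points into the result. */
static void convert_by_reference(sketch_val_t *v, const sketch_result_t *r)
{
    v->s = r->buf;
}

/* What a calling module does if it needs the value after the free call. */
static char *module_keep_copy(const sketch_val_t *v)
{
    size_t len = strlen(v->s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, v->s, len);
    return copy;
}

/* After this call, any value converted by reference is dangling. */
static void sketch_free_result(sketch_result_t *r)
{
    free(r->buf);
    free(r);
}
```

A module would call `module_keep_copy` before `sketch_free_result` and use only its own copy afterwards. If the driver had already copied the value as well, the same string would be duplicated twice, which is the redundancy the proposal wants to remove.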
Re: [OpenSER-Devel] RFC: memory management in database modules
On Tuesday 05 February 2008, Daniel-Constantin Mierla wrote: [..] At the moment there existing two different styles: mysql, unixodbc don't allocate new memory and just assign the string pointer of the result set to the internal OpenSER representation. The modules that needs to use this results needs to copy them, because there are not available after a call to the DB specific result free function. postgres, db_berkeley allocate new memory and copy all string values from the database to the internal representation. Modules that uses this driver don't need to copy there values, even after the freeing of the result set. quite strange, if the copy is not freed with free_result(), when it happens? At first thought, seems exposed to memory leak. The copy is eventual freed with free_result(). The difference is only the copy operation that takes place. db_berkely seems constructed on dbtext structure. In dbtext, the result is duplicated in private memory, as the tables are stored in shared memory, and making direct references will be exposed to race. Ok, i don't thought that much about this issues with dbtext yet.. I think that the driver modules should not duplicate the result unless there is some race in reading/writing values in the result. Each module that uses the driver decides whether it needs values after free_result() and does a copy -- this is how should be now, at least did so. Well, ok. Perhaps we can introduce a flag in the db_val or db_row type, that indicates that STR, STRING and BLOB values are copied, and thus needs to be freed? This would make it possible to use the same free function for the different databases, without changing that much of the internal logic of the driver. Henning ___ Devel mailing list Devel@lists.openser.org http://lists.openser.org/cgi-bin/mailman/listinfo/devel
Re: [OpenSER-Devel] RFC: memory management in database modules
Hi Henning,

I agree with you. I see two arguments for this:

1) if mysql is not duplicating the data returned from the driver, it means all the modules using DB are already safe from this point of view - they make their own copy and do not count on the persistence of data returned from the DB.

2) copying data too many times may have a performance impact, not because of the alloc/copy/free ops, but mainly because of memory fragmentation - the size of the data exchanged with the DB varies a lot (i.e. the size of the chunks), so the impact may be huge.

But I have a note here: it may not be possible in all cases to pass the pointer returned by the driver to the upper layer (module), as the data returned by the driver may need some pre-processing, like the escaping and unescaping the postgres module does for strings and blobs (if I'm not wrong). So the DB module may hide (totally transparently) that certain fields are re-allocated due to some pre-processing. This extra mem must also be freed (also transparently) by the DB module when the result is freed.

This will not break the overall behaviour; I just mention it because no assumptions should be made about the data returned by the DB module - it may be allocated by the underlying driver, or it may be in openser pkg mem or on the heap. If a module needs the data, it must make a copy!

Regards,
Bogdan

Henning Westerholt wrote:
> Hi,
>
> I have a question about the 'correct' way to do the memory management for results in the database modules. At the moment there are two different styles:
>
> mysql, unixodbc don't allocate new memory and just assign the string pointer of the result set to the internal OpenSER representation. Modules that need to use these results have to copy them, because they are not available after a call to the DB-specific result free function.
>
> postgres, db_berkeley allocate new memory and copy all string values from the database to the internal representation. Modules that use these drivers don't need to copy the values, even after the result set is freed.
>
> As mysql is the most used database, I assume that every module copies the values from the result set after the query execution. This is unnecessary for the postgres DB, and it further prevents me from using only one memory management function for the internal representation of the DB structures.
>
> As we have many more modules than database connectors in the code, I think it would make more sense to change the postgres and db_berkeley modules to match the behaviour of the mysql module.
>
> Any comments?
>
> Henning
Re: [OpenSER-Devel] RFC: memory management in database modules
On 02/05/08 18:38, Henning Westerholt wrote:
> On Tuesday 05 February 2008, Daniel-Constantin Mierla wrote:
> [..]
>>> At the moment there are two different styles:
>>>
>>> mysql, unixodbc don't allocate new memory and just assign the string pointer of the result set to the internal OpenSER representation. Modules that need to use these results have to copy them, because they are not available after a call to the DB-specific result free function.
>>>
>>> postgres, db_berkeley allocate new memory and copy all string values from the database to the internal representation. Modules that use these drivers don't need to copy the values, even after the result set is freed.
>>
>> Quite strange: if the copy is not freed with free_result(), when does that happen? At first thought, it seems exposed to a memory leak.
>
> The copy is eventually freed with free_result(). The difference is only the copy operation that takes place.

OK, maybe I misunderstood what you meant.

>> db_berkeley seems to be built on the dbtext structure. In dbtext, the result is duplicated in private memory, as the tables are stored in shared memory and making direct references would be exposed to races.
>
> OK, I haven't thought that much about these issues with dbtext yet..
>
>> I think that the driver modules should not duplicate the result unless there is some race in reading/writing values in the result. Each module that uses the driver decides whether it needs values after free_result() and makes a copy; this is how it should be now, or at least how it was done.
>
> Well, OK. Perhaps we can introduce a flag in the db_val or db_row type that indicates that STR, STRING and BLOB values are copied, and thus need to be freed? This would make it possible to use the same free function for the different databases, without changing that much of the internal logic of the driver.

This seems the right approach: mark whether the value should be freed by the openser db driver module, or whether that is done when freeing the result from the db library used beneath. I think it should be at value granularity, as a row has different value types and in some cases, even for the same type, there is no need to free if the openser db driver module didn't change the value returned by the db library.

Cheers,
Daniel

> Henning
Re: [OpenSER-Devel] RFC: memory management in database modules
On Tuesday 05 February 2008, Bogdan-Andrei Iancu wrote:
> Hi Henning,
>
> I agree with you. I see two arguments for this:
>
> 1) if mysql is not duplicating the data returned from the driver, it means all the modules using DB are already safe from this point of view - they make their own copy and do not count on the persistence of data returned from the DB.

Hi Bogdan,

yes, this was the point I was trying to convey.

> 2) copying data too many times may have a performance impact, not because of the alloc/copy/free ops, but mainly because of memory fragmentation - the size of the data exchanged with the DB varies a lot (i.e. the size of the chunks), so the impact may be huge.

I know that you're receptive to performance arguments.. ;-)

> But I have a note here: it may not be possible in all cases to pass the pointer returned by the driver to the upper layer (module), as the data returned by the driver may need some pre-processing, like the escaping and unescaping the postgres module does for strings and blobs (if I'm not wrong).

In the postgres module it's not a problem for STRINGs and STRs, but it is for BLOBs. The escaping function allocates new memory that doesn't belong to the request that is built. It's also not valid to assign the escaped string to the old string pointer (to free it on free_result), because it could be bigger, and according to the documentation this is also not allowed. Thus it's necessary to free this memory in the free_row function.

> So the DB module may hide (totally transparently) that certain fields are re-allocated due to some pre-processing. This extra mem must also be freed (also transparently) by the DB module when the result is freed. This will not break the overall behaviour; I just mention it because no assumptions should be made about the data returned by the DB module - it may be allocated by the underlying driver, or it may be in openser pkg mem or on the heap. If a module needs the data, it must make a copy!

I don't want to change this assumption. So I'll introduce a flag in the db_val_t structure to track the memory status of a value. Then we have the flexibility to support different types of modules, and we can also keep or improve the performance.

The same issue exists for the VAL_NAMES; I'll evaluate whether it makes sense to introduce a flag for them too.

Cheers,
Henning
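A per-value flag of the kind proposed here could look roughly like the following sketch. It assumes a simplified value type: `sk_val_t`, its `owned` field and the helper functions are hypothetical names chosen for illustration, and the real `db_val_t` layout and the eventual flag name may differ. The point is only that one shared free routine can serve drivers that borrow pointers and drivers that allocate (for example for unescaped BLOBs), because it releases exactly the values marked as allocated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { SK_INT, SK_STRING, SK_BLOB } sk_type_t;

/* Hypothetical value type with a per-value memory status flag. */
typedef struct {
    sk_type_t type;
    int owned;                /* 1 if the driver module allocated the data */
    union {
        int int_val;
        char *string_val;
    } val;
} sk_val_t;

typedef struct {
    sk_val_t *values;
    int n;
} sk_row_t;

/* Borrowed pointer: the memory belongs to the database library result. */
static void sk_set_borrowed(sk_val_t *v, char *p)
{
    v->type = SK_STRING;
    v->owned = 0;
    v->val.string_val = p;
}

/* Copied value: the driver module allocated it (e.g. after unescaping). */
static int sk_set_copied(sk_val_t *v, sk_type_t type, const char *src)
{
    size_t len = strlen(src) + 1;
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, src, len);
    v->type = type;
    v->owned = 1;
    v->val.string_val = copy;
    return 0;
}

/* One free routine for all drivers; returns how many values it released. */
static int sk_free_row(sk_row_t *row)
{
    int freed = 0;
    for (int i = 0; i < row->n; i++) {
        sk_val_t *v = &row->values[i];
        if (v->owned && (v->type == SK_STRING || v->type == SK_BLOB)) {
            free(v->val.string_val);
            v->val.string_val = NULL;
            v->owned = 0;
            freed++;
        }
    }
    return freed;
}
```

For a row holding an integer, a borrowed string and a copied BLOB, `sk_free_row` releases only the BLOB and leaves the borrowed buffer to the database library. Calling modules are unaffected either way, since they still copy whatever they need to keep.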
Re: [OpenSER-Devel] RFC: memory management in database modules
Hi Henning, Henning Westerholt wrote: On Tuesday 05 February 2008, Bogdan-Andrei Iancu wrote: Hi Henning, I agree with you. I see here two arguments for this: 1) if mysql is not duplicating the date returned from driver, it means all the modules using DB are already safe from this point of view - they do their one copy and they do not count on the persistence of data returned from DB. Hi Bogdan, yes, this was the point i was trying to convey. It sounds natural as it is already present and working ;) 2) copying data too many times may have a performance impact, but not because of alloc/copy/free ops, but mainly because of memory fragmentation - the size of data operated with DB vary a lot (like size of chucks), so the impact may be huge. I know that you're receptive for performance arguments.. ;-) Well do not take me as a performance maniac :D...As I said, it is not about performance but about functionality - memory fragmentation is something serious and we should try to avoid it as much as possible. But I have here a note: it may not be possible in all case to pass the pointer returned by driver to the upper layer (module) as the data returned by driver may need some pre-processing. Like the postgres module does for string and blobs (if I'm not wrong) to do escape and unescape. In the postgres module its not a problem for STRINGs and STRs, but for BLOBs. The escaping function allocates new memory that don't belong to the request that is build. Its also not valid to assign the escaped string to the old string pointer (to free it on free_result), because it could be bigger, and according to the documentation this is also not allowed. Thus its necessary to free this memory in the free_row function. yes, I know this is a tricky thing - I got into this code after Norman did some changes there, and the logic for handling the BLOB case was quite complex and tricky. 
Most probably because the DB API did not provide any support (as extension) to keep additional information about the memory status of the field (if statics, allocated, etc) So, the DB module may hide (totally transparent) that certain fields are re-allocated due some pre-processing. This extra mem must be also freed (also transparent) by the DB module when the result is freed. This will not break the the overall behaviour, but I just mentioned because the no suppositions should made on the data returned by the DB module - it may be allocated by underlaying driver, may be in openser pkg mem or in heap. If a module needs the data, it must make a copy! I don't want to change this assumption. So i'll introduce a flag in the db_val_t structure to track the memory status of a value. Then we've the flexibility to support different types of modules, and we can also keep or improve the performance. Perfect! For the VAL_NAMES-s exists the same issue, i'll evaluate if it make sense to introduce a flag for them too. well...is it needed any altering of the column names? so far I think all are static (as returned by driver) BTW, do you have any estimation (as time) about the start and end of these changes? I plan also some major changes in the DB interaction and preferable will be not to overlap ;) Regards, Bogdan ___ Devel mailing list Devel@lists.openser.org http://lists.openser.org/cgi-bin/mailman/listinfo/devel
[OpenSER-Devel] RFC: memory management in database modules
Henning Westerholt writes:

> As we have many more modules than database connectors in the code, I think it would make more sense to change the postgres and db_berkeley modules to match the behaviour of the mysql module. Any comments?

I'm in favor of your suggestion. If a module does not need to store a value, the copying work can be avoided.

-- juha