Re: [Linux-ha-dev] MAXMSG too small
On Jun 1, 2006, at 3:56 PM, Lars Marowsky-Bree wrote:

> On 2006-05-31T07:37:34, Alan Robertson <[EMAIL PROTECTED]> wrote:
>
>>> Are you saying that there should be a higher limit, or no limit, in
>>> IPC-only messages? I think the message layer can provide another API
>>> for that.
>>
>> I don't remember how much burden such a change would be on the IPC
>> layer. But it seems to me that unless all local messages are
>> uncompressed, we need a higher limit at the very least...
>
> Well, we sort of need a fix for this soon, as the Transition Graph
> keeps growing and growing, and XML is pretty noisy; if I got Andrew
> right, a 6-10 node cluster with one or two clones will already bite us
> in the heel here.
>
> Short-term, I think Andrew should really consider writing the graph to
> a file and having the PE/TE exchange that token only.

I'll be getting started on this shortly.

> (Implementation detail/tangent: I think it'd be nice if the PE passed a
> regular XML graph, but if that then had an include statement referring
> to the external file, the PE could always decide how it wanted to pass
> this; might be useful for debugging to be able to always write the TE
> out or not...)

I'm not sure I see what you're saying. Wouldn't it just be easier to
always write everything to disk?

Side note: security implications?

> Mid-term, we need the IPC limit increased for local messaging.
>
> Long-term, we need a more efficient way of dealing with clones, which
> alas will incur changes down to the RA level. (I.e., not having to
> query for each clone child separately, but sending a single query and
> getting them all, and such stuff for other ops too.)

-- 
Andrew Beekhof

"Would the last person to leave please turn out the enlightenment?" - TISM

___
Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
Re: [Linux-ha-dev] MAXMSG too small
On 2006-05-31T07:37:34, Alan Robertson <[EMAIL PROTECTED]> wrote:

>> Are you saying that there should be a higher limit, or no limit, in
>> IPC-only messages? I think the message layer can provide another API
>> for that.
>
> I don't remember how much burden such a change would be on the IPC
> layer. But it seems to me that unless all local messages are
> uncompressed, we need a higher limit at the very least...

Well, we sort of need a fix for this soon, as the Transition Graph keeps
growing and growing, and XML is pretty noisy; if I got Andrew right, a
6-10 node cluster with one or two clones will already bite us in the
heel here.

Short-term, I think Andrew should really consider writing the graph to a
file and having the PE/TE exchange that token only.

(Implementation detail/tangent: I think it'd be nice if the PE passed a
regular XML graph, but if that then had an include statement referring
to the external file, the PE could always decide how it wanted to pass
this; might be useful for debugging to be able to always write the TE
out or not...)

Mid-term, we need the IPC limit increased for local messaging.

Long-term, we need a more efficient way of dealing with clones, which
alas will incur changes down to the RA level. (I.e., not having to query
for each clone child separately, but sending a single query and getting
them all, and such stuff for other ops too.)

Sincerely,
    Lars Marowsky-Brée

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business

"Ignorance more frequently begets confidence than does knowledge"
 -- Charles Darwin
Re: [Linux-ha-dev] MAXMSG too small
Guochun Shi wrote:
> Alan Robertson wrote:
>> Guochun Shi wrote:
>>> [...]
>>> Another way that might be interesting is to provide an API that has
>>> a much higher bound, which is suited for local usage only.
>> [...]
>> I think that MAXMSG is inappropriately used for the size of IPC
>> messages - which would prevent messages from being sent in some
>> cases.
> Are you saying that there should be a higher limit, or no limit, in
> IPC-only messages? I think the message layer can provide another API
> for that.

I don't remember how much burden such a change would be on the IPC
layer.

But it seems to me that unless all local messages are uncompressed, we
need a higher limit at the very least...

-- 
    Alan Robertson <[EMAIL PROTECTED]>

"Openness is the foundation and preservative of friendship... Let me
claim from you at all times your undisguised opinions." - William
Wilberforce
Re: [Linux-ha-dev] MAXMSG too small
Alan Robertson wrote:
> Guochun Shi wrote:
>> [...]
>> Another way that might be interesting is to provide an API that has a
>> much higher bound, which is suited for local usage only.
>> [...]
>> Andrew, if you can send a log/debug file to me, I may (or may not)
>> find some clue.
> I think that MAXMSG is inappropriately used for the size of IPC
> messages - which would prevent messages from being sent in some cases.

Are you saying that there should be a higher limit, or no limit, in
IPC-only messages? I think the message layer can provide another API for
that.

-Guochun
Re: [Linux-ha-dev] MAXMSG too small
Guochun Shi wrote:
> Andrew Beekhof wrote:
>> [...]
>> That's basically the tradeoff... either we increase MAXMSG and take a
>> hit on the process size, or we do more dynamically and take a runtime
>> hit. Not being a guru in the IPC layer, I don't know which is worse.
>> However, my suspicion was that get_(net)stringlen was not too bad for
>> flat messages and would therefore be preferred.
> It should be compressed if you have specified a compression method in
> ha.cf. However, it would be good to have some proof that it is
> compressed. Having a message > 256K after compression means the
> uncompressed one probably has 1M ~ 2M.
>
> Another way that might be interesting is to provide an API that has a
> much higher bound, which is suited for local usage only.
>
> Andrew, if you can send a log/debug file to me, I may (or may not)
> find some clue.

I think that MAXMSG is inappropriately used for the size of IPC messages
- which would prevent messages from being sent in some cases.
Re: [Linux-ha-dev] MAXMSG too small
Andrew Beekhof wrote:
> On 5/29/06, Alan Robertson <[EMAIL PROTECTED]> wrote:
>> Andrew Beekhof wrote:
>>> Running CTS on 6 nodes has shown MAXMSG to be too small - the PE
>>> cannot send its transition graph and the cluster stalls
>>> indefinitely.
>> So, that means the CIB is > 256K compressed? Or is it > 256K
>> uncompressed?
> It's being added with ha_msg_addstruct_compress(msg, field, xml); and
> sent via IPC to the crmd (from the pengine). Whether it's actually
> been compressed or not I don't know.

It should be compressed if you have specified a compression method in
ha.cf. However, it would be good to have some proof that it is
compressed. Having a message > 256K after compression means the
uncompressed one probably has 1M ~ 2M.

Another way that might be interesting is to provide an API that has a
much higher bound, which is suited for local usage only.

> [...]
>> Unfortunately, this means various buffers get locked into memory at
>> this size. Our processes are already pretty huge. get_netstringlen()
>> is an expensive call.
> That's basically the tradeoff... either we increase MAXMSG and take a
> hit on the process size, or we do more dynamically and take a runtime
> hit. Not being a guru in the IPC layer, I don't know which is worse.
> However, my suspicion was that get_(net)stringlen was not too bad for
> flat messages and would therefore be preferred.
> [...]
> As above. I'm doing my part and indicating that it can/should be
> compressed, but I don't know the internals well enough to say for
> sure.

Andrew, if you can send a log/debug file to me, I may (or may not) find
some clue.

-Guochun
Re: [Linux-ha-dev] MAXMSG too small
On 5/29/06, Andrew Beekhof <[EMAIL PROTECTED]> wrote:
> On 5/29/06, Alan Robertson <[EMAIL PROTECTED]> wrote:
>> [...]
>> Why do you think that predicting that child buffers will be too large
>> is a bad idea? How do you understand that removing it will help?
> For low values of MAXMSG I think it's fine to do that. But we keep
> upping the value, and allocating 256k for regular heartbeat packets
> seems like a real waste.

Though it may be that the HBcomm plugins can't work any other way, in
which case we just have to keep upping MAXMSG.

>> Is your concern related to compressed/uncompressed sizes?
> As above. I'm doing my part and indicating that it can/should be
> compressed, but I don't know the internals well enough to say for
> sure.
Re: [Linux-ha-dev] MAXMSG too small
On 5/29/06, Lars Marowsky-Bree <[EMAIL PROTECTED]> wrote:
> On 2006-05-29T06:44:05, Alan Robertson <[EMAIL PROTECTED]> wrote:
>> Andrew Beekhof wrote:
>>> Running CTS on 6 nodes has shown MAXMSG to be too small - the PE
>>> cannot send its transition graph and the cluster stalls
>>> indefinitely.
>> So, that means the CIB is > 256K compressed? Or is it > 256K
>> uncompressed?
> This is about the transition graph, not the CIB, and it's sent via
> local IPC only (uncompressed). Compressing local data transfers seems
> silly.
>
> The problem, as Andrew tells me, is the tons of probes generated for
> every resource and every clone right now. (But I anticipate that a
> cluster with a large number of resources might hit the same problem,
> even w/o probes.) Sigh.
>
> As this data is only passed locally, we _could_ pass around a
> reference to a file instead.

Until we run into the same problem for the CIB... I'd be more inclined
to fix it so that both will work.
Re: [Linux-ha-dev] MAXMSG too small
On 5/29/06, Alan Robertson <[EMAIL PROTECTED]> wrote:
> Andrew Beekhof wrote:
>> Running CTS on 6 nodes has shown MAXMSG to be too small - the PE
>> cannot send its transition graph and the cluster stalls indefinitely.
> So, that means the CIB is > 256K compressed? Or is it > 256K
> uncompressed?

It's being added with ha_msg_addstruct_compress(msg, field, xml); and
sent via IPC to the crmd (from the pengine). Whether it's actually been
compressed or not I don't know.

>> We could increase the value, but looking through the code this seems
>> to be an artificial limitation to various degrees...
>>
>> * In some cases it's used as a substitute for get_netstringlen(msg) -
>>   I believe these should be fixed.
>>
>> * In some cases it's used to pre-empt checks by "child" functions - I
>>   believe these should be removed.
>>
>> The two cases that seem to legitimately use MAXMSG are the HBcomm
>> plugins and the decompression code (though even that could retry a
>> "couple" of times with larger buffers).
>>
>> Alan, can you please take a look at the use of MAXMSG in the IPC
>> layer, which is really not my area of expertise (especially the
>> HBcomm plugins), and verify that my assessment is correct (and
>> possibly get someone to look at fixing it).
> Unfortunately, this means various buffers get locked into memory at
> this size. Our processes are already pretty huge. get_netstringlen()
> is an expensive call.

That's basically the tradeoff... either we increase MAXMSG and take a
hit on the process size, or we do more dynamically and take a runtime
hit. Not being a guru in the IPC layer, I don't know which is worse.

However, my suspicion was that get_(net)stringlen was not too bad for
flat messages and would therefore be preferred.

> Why do you think that predicting that child buffers will be too large
> is a bad idea? How do you understand that removing it will help?

For low values of MAXMSG I think it's fine to do that. But we keep
upping the value, and allocating 256k for regular heartbeat packets
seems like a real waste.

> Is your concern related to compressed/uncompressed sizes?

As above. I'm doing my part and indicating that it can/should be
compressed, but I don't know the internals well enough to say for sure.
Re: [Linux-ha-dev] MAXMSG too small
On 2006-05-29T06:44:05, Alan Robertson <[EMAIL PROTECTED]> wrote:

> Andrew Beekhof wrote:
>> Running CTS on 6 nodes has shown MAXMSG to be too small - the PE
>> cannot send its transition graph and the cluster stalls indefinitely.
>
> So, that means the CIB is > 256K compressed? Or is it > 256K
> uncompressed?

This is about the transition graph, not the CIB, and it's sent via local
IPC only (uncompressed). Compressing local data transfers seems silly.

The problem, as Andrew tells me, is the tons of probes generated for
every resource and every clone right now. (But I anticipate that a
cluster with a large number of resources might hit the same problem,
even w/o probes.) Sigh.

As this data is only passed locally, we _could_ pass around a reference
to a file instead.

Sincerely,
    Lars Marowsky-Brée
Re: [Linux-ha-dev] MAXMSG too small
Andrew Beekhof wrote:
> Running CTS on 6 nodes has shown MAXMSG to be too small - the PE
> cannot send its transition graph and the cluster stalls indefinitely.

So, that means the CIB is > 256K compressed? Or is it > 256K
uncompressed?

> We could increase the value, but looking through the code this seems
> to be an artificial limitation to various degrees...
>
> * In some cases it's used as a substitute for get_netstringlen(msg) -
>   I believe these should be fixed.
>
> * In some cases it's used to pre-empt checks by "child" functions - I
>   believe these should be removed.
>
> The two cases that seem to legitimately use MAXMSG are the HBcomm
> plugins and the decompression code (though even that could retry a
> "couple" of times with larger buffers).
>
> Alan, can you please take a look at the use of MAXMSG in the IPC
> layer, which is really not my area of expertise (especially the HBcomm
> plugins), and verify that my assessment is correct (and possibly get
> someone to look at fixing it).

Unfortunately, this means various buffers get locked into memory at this
size. Our processes are already pretty huge. get_netstringlen() is an
expensive call.

Why do you think that predicting that child buffers will be too large is
a bad idea? How do you understand that removing it will help?

Is your concern related to compressed/uncompressed sizes?