Hi,

> I am thinking on how to enhance the engine so that fastest-possible
> database writes (actually, any output) are possible. However, I come
> across a couple of points. I would like to do so in the most generic
> way. Let me quote those message parts that I have specific questions
> on (out of sequence, thus I preserve the full message below - if you
> need more context).
>
> > I made a small Python prototype to do something similar to what you
> > propose, with no batches, but committing each 1000 entries. The
> > speedup I got by introducing batches was about a factor 50. And the
> > statement was already prepared.
>
> Could you check what actually brings most of the speedup - the batches
> or the prepared statement. I am thinking along the lines of using
> batches but not prepared statements, as in this sample
>
> begin
> insert ...
> insert ...
> insert ...
> insert ...
> end

I will, but please note that

    begin
    execute(unprepared_insert_statement)
    execute(unprepared_insert_statement)
    execute(unprepared_insert_statement)
    execute(unprepared_insert_statement)
    commit

needs 4 message exchanges with the server. OTOH:

    <client>
    push (@batch, $item);
    push (@batch, $item);
    push (@batch, $item);
    push (@batch, $item);
    <send to server>
    begin
    execute_many (insert_statement, @batch)
    commit

requires only one, so the network overhead is *way* smaller. This is
true not only of Oracle, but also of PostgreSQL, and I suppose MySQL
provides a similar API. I'll try to verify where the hottest spot is,
anyway.

> And second question. Let's envision that the rsyslog core could
> provide you with multiple data records at once.

That would be *great*.

> For the case given above, I could still simply pass in a single - now
> longer - string (that makes it that attractive for the other db
> plugins). However, that does not work for the omoracle interface.

For omoracle it's not good, indeed. Also, I don't think you want to
maintain yet another way of passing messages to modules. IMHO, we have
two orthogonal use cases:

a) the module wants all messages one by one and is happy with it (all
   modules but omoracle).
b) the module wants to handle the properties in big batches (omoracle).

IMHO, this is flexible enough for new developers to choose between easy
and fast.

> Let's say the new interface we created is a "vector interface" as it
> provides each data item as part of a one-dimensional vector (or
> tuple). Then, it would look most natural to me if we extend this to
> "matrix interface", where you receive a tuple of tuples (or a
> two-dimensional structure that "feels" much like a SQL result set).

Indeed, that's what I have to maintain in omoracle. If I could offload
it to rsyslog's core it would be even better.

> Would that be useful for you?
> Or, the other way around, what
> would you consider an optimal interface to your plugin if the rsyslog
> core would provide batching support?

The matrix-like structure is the one I need, indeed. :)

Cheers.
-- 
Luis Fernando Muñoz Mejías
[email protected]
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com
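
For illustration only, here is a minimal Python sketch of the two
approaches compared above: one execute() call per row versus collecting
rows into a batch (a list of tuples, i.e. the "matrix-like" structure)
and handing the whole batch to executemany(). It is not the omoracle
code. It assumes Python's standard sqlite3 module and a hypothetical
three-column "log" table, chosen only so the example runs anywhere.
SQLite runs in-process, so this shows the shape of the two code paths
and that they store the same data; it does not reproduce the network
round-trip savings you would see against Oracle or PostgreSQL.

```python
import sqlite3

# Hypothetical table and statement, used only for this illustration.
INSERT = "INSERT INTO log (host, severity, msg) VALUES (?, ?, ?)"


def make_db():
    """Create an in-memory database with the illustrative 'log' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (host TEXT, severity INTEGER, msg TEXT)")
    return conn


def insert_one_by_one(conn, rows):
    """One execute() call per row, all inside a single transaction."""
    with conn:  # commits on success, rolls back on exception
        for row in rows:
            conn.execute(INSERT, row)


def insert_batched(conn, rows, batch_size=1000):
    """Collect rows into a batch and pass each batch to executemany()."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            with conn:
                conn.executemany(INSERT, batch)
            batch.clear()
    if batch:  # flush the final, partially filled batch
        with conn:
            conn.executemany(INSERT, batch)


# Each row is a tuple of properties; the list of rows is the "matrix".
rows = [("host%d" % (i % 4), i % 8, "message %d" % i) for i in range(2500)]

db_single = make_db()
insert_one_by_one(db_single, rows)

db_batch = make_db()
insert_batched(db_batch, rows, batch_size=1000)

count_single = db_single.execute("SELECT COUNT(*) FROM log").fetchone()[0]
count_batch = db_batch.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(count_single, count_batch)
```

With a networked database driver the batched variant is where the
savings would come from, since each executemany() call can ship a whole
batch in one exchange; how much it helps depends on the driver and
server.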

