Hello Chris and others,

Some prior thoughts and then some comments.

First of all, we are working with Paolo on releasing a threaded version of pmacct. This new release won't help much on single processors, but in this era of dual cores and HT we expect to gain some performance. We agree with Paolo that tasks should be threaded unless you want to greatly penalize the performance of traffic capturing. This version will be made public very soon so you guys can test it and give more ideas.

> Absolutely, I agree that there is an upper limit to the rate that you can
> insert into any database.

A direct flow-to-SQL translation is a dead end, threads or no threads: the database won't be able to support it. IMHO, the SQL plugin should summarize data in some way and do this periodically. In our case, with our flow-tools based solution, we do this every 5 min. We aggregate as much as we can and we lose detail in a controlled way. The same (or similar) ideas should be used in pmacct, and this is something we are working with Paolo to include.

If you start to reduce the time between each computation, you reduce the power of the summaries, and, well, it's not easy. So for real-time stuff we might need to think in other terms; I don't think SQL can cope with it (many times even RAM can't cope with it :) Maybe the solution could be some "in-memory" database, but either way, it won't be easy.

At the same time, Paolo has implemented some great features from AT&T papers. I believe they are a MUST in any production environment. We especially like the concept of holding a table at a bounded size, increasing it based on server resources, but always keeping it under control. If data arrives faster, either you summarize it (lose detail) or you lose it (lose all data). For example, in our case we just enter directly 2100 entries in each 5 min period, for a total size of 300m entries in a day. If you receive more, then the smaller ones are grouped to reduce the number. Of course, if traffic is very hour-dependent you can increase this limit.
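
To make the "group the smaller ones" idea a bit more concrete, here is a rough Python sketch of a per-period table with a capped number of output rows. This is only an illustration under assumptions: the class name `BoundedAggregator`, the `"other"` catch-all key and the byte counters are all made up for the example, it is not pmacct code, and it is a plain top-N fold rather than the sampling scheme from the AT&T papers.

```python
from collections import defaultdict

OTHER = "other"  # hypothetical catch-all key for the folded small entries


class BoundedAggregator:
    """Accumulate byte counters per key and emit at most max_entries rows per flush."""

    def __init__(self, max_entries):
        if max_entries < 1:
            raise ValueError("max_entries must be at least 1")
        self.max_entries = max_entries
        self.table = defaultdict(int)

    def add(self, key, nbytes):
        # Aggregate in memory; nothing is written to the database here.
        self.table[key] += nbytes

    def flush(self):
        # Sort by volume, largest first, and start a fresh table for the next period.
        rows = sorted(self.table.items(), key=lambda kv: kv[1], reverse=True)
        self.table = defaultdict(int)
        if len(rows) <= self.max_entries:
            return rows
        # Keep the largest entries and fold the rest into a single row,
        # so detail is lost but the total volume is preserved.
        keep = rows[: self.max_entries - 1]
        folded = sum(nbytes for _, nbytes in rows[self.max_entries - 1:])
        return keep + [(OTHER, folded)]


# Small demo with a limit of 3 rows per period.
agg = BoundedAggregator(max_entries=3)
for key, nbytes in [("10.0.0.1", 500), ("10.0.0.2", 300), ("10.0.0.3", 20),
                    ("10.0.0.4", 10), ("10.0.0.1", 100)]:
    agg.add(key, nbytes)
rows = agg.flush()
```

In a real setup `flush()` would be called once per aggregation period (every 5 min in our case) and the returned rows written to the database in one batch. Note that this only bounds the rows written per period, not the memory used while the period is open.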

I think Paolo called this approach "Sampling under constraints".

> What about the OS queues for packets? Were they not effective? Was pmacct
> doing a database write for each packet? (I can see how that would kill it
> very quickly).

You mean single-threaded or multithreaded?

> I would be much happier if there was a single thread that would flush data
> to the database at (up to) the maximum rate that the database could
> support. I don't think there is any benefit to multiple threads with
> MySQL; you will not actually be able to insert any more rows into the same
> table that way, at least with MyISAM tables.
>
> A single thread doesn't seem like it would be too hard to implement,
> perhaps as a configurable option. The thread would just start when pmacctd
> starts, sleep until the refresh time expires, and then flush all dirty
> records from memory to the database, then sleep again if necessary, i.e.
> unless the inserts took more than the sql_refresh_time.

This is not a bad idea; actually we do it like this (well, we have 4 threads concurrently working on computing each probe). Another alternative would be one thread to compute and summarize the data and another one to store it in the database. Of course, data capturing and classification should be considered a different thread; I'm just referring to the SQL plugin.

> This way, we could get the maximum performance from the database (at least
> MySQL) without interfering with packet capture at all. If the database
> can't keep up with the sql_refresh_time, you simply get fewer
> updates/inserts, no data loss except temporal resolution.

Mmmmm, I don't think you can avoid losing data. If you can only hold a given amount of it in memory and the SQL plugin can't cope with the incoming rate, you will fill the space and start to lose information. Our approach is to reduce the volume of entries going from raw data to SQL entries.

> What do you think about that idea?

As stated, we are working with Paolo on this and other stuff, but surely your ideas are much appreciated.

--------------------------------------------
Jaime Nebrera - [EMAIL PROTECTED]
Consultor TI - ENEO Tecnologia SL
Pol. PISA - C/ Manufactura 6, P1, 3B
Mairena del Aljarafe - 41927 - Sevilla
Telf.- (+34) 955 60 11 60 / 619 04 55 18
_______________________________________________
pmacct-discussion mailing list
http://www.pmacct.net/#mailinglists
