On 11/30/12 4:52 PM, Howard Chu wrote:
> Emmanuel Lécharny wrote:
>> Hi guys,
>>
>> We can do many different things when processing a search, and each of
>> them can collide:
>>
>> o simple search
>> o search with a time limit
>> o search with a size limit
>> o search with both limits
>> o abandon request received
>> o session closed
>> o search with the ManageDsaIT control
>> o paged search
>> o cancelled paged search
>> o new paged search
>> o persistent search
>> o replication
>>
>> Of course, we can have many searches running at the same time for one
>> session...
>>
>> The problem is that the way it currently works, besides its inherent
>> complexity, makes it possible for the server to end with an OOM: we first
>> compute the set of candidates (a set of UUIDs), and we then fetch
>> all the entries one by one, writing them to the client. If the client
>> does not read them fast enough, those entries pile up in a queue, in
>> memory, leading to a potential OOM very quickly.
>>
>> This is a real problem with the way we use MINA: we don't wait for the
>> entries to actually be *written* to the socket, we just push the entries
>> into the session.
>>
>> In fact, there are three ways to write entries in MINA 2:
>> 1) do what we currently do, and risk an OOM
>> 2) for each write, get back a WriteFuture, and wait on this
>> WriteFuture for the message to reach the socket. The problem with this
>> approach is that we block the thread until the entry is fully sent. It
>> works because we use an ExecutorFilter, but this executor filter does
>> not have an infinite number of threads in its pool. At some point, we
>> will block completely
>> 3) there is a smarter way, but it is far more complex: we write only the
>> first entry, and that's it. When that message is physically sent, we
>> get a MessageSent event, and we can then process the next entry.
>> And so on.
>> Of course, we will have as many MINA <-> ADS round-trips as we
>> have entries to send, an overhead we have to consider (it's around 5
>> to 10%).
>> However, doing so guarantees that we don't push anything into the queue:
>> all the entries are flushed one by one, and we don't block any thread.
>>
>> Atm, I have started to implement the third solution, and it works pretty
>> well for the simple search. It gets very complex for the pagedSearch,
>> though, as it's a multi-SearchRequest operation, and we have to deal
>> with many different possibilities.
>
> This definitely sounds like a lot of work, for an uncertain payoff.

In terms of performance, it will definitely have a cost. I benched the
server yesterday, and found a penalty of around 8%.
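To make the third solution concrete, here is a minimal, self-contained sketch of the "write one entry, continue on MessageSent" pattern. EntrySender and the List standing in for the socket are illustrative names, not actual ApacheDS or MINA classes; in the real server the callback would be MINA 2's IoHandler.messageSent(IoSession, Object) and the write would go through IoSession.write().

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of solution 3: write a single entry, and only fetch the next one
// when the previous write has physically left the socket, so at most one
// entry is ever queued in memory. Hypothetical names, not ApacheDS code.
public class EntrySender {
    private final Iterator<String> cursor;                  // candidates, fetched lazily
    private final List<String> socket = new ArrayList<>();  // stands in for the wire
    private boolean done = false;

    public EntrySender(Iterator<String> cursor) {
        this.cursor = cursor;
    }

    // Start the response: push only the first message, queue nothing else.
    public void start() {
        writeNext();
    }

    // Invoked on MINA's MessageSent event: the previous message is out,
    // so it is now safe to fetch and write the next entry.
    public void messageSent() {
        if (!done) {
            writeNext();
        }
    }

    private void writeNext() {
        if (cursor.hasNext()) {
            socket.add(cursor.next());
        } else {
            socket.add("searchResultDone"); // terminate the search
            done = true;
        }
    }

    public List<String> socket() {
        return socket;
    }

    public static void main(String[] args) {
        // Simulated driver: every write triggers exactly one MessageSent event.
        EntrySender sender = new EntrySender(List.of("entry1", "entry2").iterator());
        sender.start();
        for (int sent = 0; sent < sender.socket().size(); ) {
            sent++;
            sender.messageSent();
        }
        System.out.println(sender.socket()); // [entry1, entry2, searchResultDone]
    }
}
```

Note how no thread ever blocks: each MessageSent event drives exactly one more fetch-and-write, which is where the per-entry round-trip overhead mentioned above comes from.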
> I would consider a couple of tweaks:
> - write two messages before waiting; if there are multiple entries in
>   the result set, you can save a little overhead.

This is definitely something I have in mind. I want to make the number of
entries to push configurable, to mimic what any RDBMS does: sending a block
of rows instead of sending every row one by one. However, in a first
version, I'm focusing on having the mechanism work with all the different
use cases, which is far from simple :)

> - if there's only one entry, and then the searchResult message,
>   there's no reason to wait at all.

True. I wait for the sake of consistency atm, but really, I can avoid
doing so in real life.

-- 
Regards,
Cordialement,
Emmanuel Lécharny
www.iktek.com
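A rough sketch of the batching tweak discussed above, under the same illustrative assumptions as before (BatchedSender, blockSize and the List standing in for the socket are hypothetical names, not ApacheDS configuration): push a configurable block of entries before waiting for MessageSent, the way an RDBMS ships rows in blocks.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of the batching tweak: send `blockSize` entries per MessageSent
// event instead of one. Hypothetical names, not ApacheDS code.
public class BatchedSender {
    private final Iterator<String> cursor;
    private final int blockSize;
    private final List<String> socket = new ArrayList<>(); // stands in for the wire
    private boolean done = false;

    public BatchedSender(Iterator<String> cursor, int blockSize) {
        this.cursor = cursor;
        this.blockSize = blockSize;
    }

    public void start() {
        writeBlock();
    }

    public void messageSent() {
        if (!done) {
            writeBlock();
        }
    }

    // A block of N entries costs a single MINA <-> ADS round-trip instead
    // of N. If the whole result set fits in the first block (e.g. a single
    // entry), the searchResultDone goes out with it and we never wait.
    private void writeBlock() {
        for (int n = 0; n < blockSize && cursor.hasNext(); n++) {
            socket.add(cursor.next());
        }
        if (!cursor.hasNext()) {
            socket.add("searchResultDone");
            done = true;
        }
    }

    public List<String> socket() {
        return socket;
    }
}
```

With blockSize = 1 this degrades to the entry-by-entry scheme above, so the block size becomes a knob trading per-entry memory bounds against round-trip overhead.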
