Does your calling function really need to know if the insert was successful? If you can do without this extra overhead, then I think the best way to do this is to place the insert request on a queue and free up the RPC request immediately rather than wait for the inserts to batch up and complete.

So some skeleton code would be:

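import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

static final LinkedBlockingQueue<dataType> dataToInsert = new LinkedBlockingQueue<dataType>();
static final long INSERT_TIME_THRESHOLD_MS = 3600; // compared against currentTimeMillis(), so milliseconds
static final int INSERT_SIZE_THRESHOLD = 1000;     // records per batch

// thrift-accessible function
public void insert(dataType data) {
    dataToInsert.add(data); // thread-safe; returns immediately
}

// worker thread
private static class InsertWorker extends Thread {

    public void run() {
        List<dataType> batch = new ArrayList<dataType>();
        long lastInsertTime = System.currentTimeMillis();
        try {
            while (true) {
                // wait for the next record, but wake up in time to honor the time threshold
                long elapsed = System.currentTimeMillis() - lastInsertTime;
                long waitMs = Math.max(1, INSERT_TIME_THRESHOLD_MS - elapsed);
                dataType next = dataToInsert.poll(waitMs, TimeUnit.MILLISECONDS);
                if (next != null) {
                    batch.add(next);
                }

                boolean full = batch.size() >= INSERT_SIZE_THRESHOLD;
                boolean stale = (System.currentTimeMillis() - lastInsertTime) >= INSERT_TIME_THRESHOLD_MS;
                if (full || stale) {
                    if (!batch.isEmpty()) {
                        // insert into DB here
                        batch.clear(); // start the next batch empty
                    }
                    lastInsertTime = System.currentTimeMillis();
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop the worker when interrupted
        }
    }
}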
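
If the caller does need to know whether the insert succeeded, one option is to pair each queued record with a future that the worker completes after the batch commit. What follows is only a minimal sketch, assuming Java 8's CompletableFuture. AckingBatcher, Pending, and flush are hypothetical names, String stands in for the record type, and the DB commit is a placeholder. Note that a synchronous handler waiting on the future would still hold its thread until the batch finishes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical names throughout; String stands in for the record type.
class AckingBatcher {
    // A queued record paired with the future that reports its outcome.
    static final class Pending {
        final String record;
        final CompletableFuture<Boolean> done = new CompletableFuture<>();
        Pending(String record) { this.record = record; }
    }

    private final LinkedBlockingQueue<Pending> queue = new LinkedBlockingQueue<>();

    // Called by the RPC handler: enqueue the record and hand back a future.
    CompletableFuture<Boolean> insert(String record) {
        Pending p = new Pending(record);
        queue.add(p);
        return p.done;
    }

    // Called by the worker: drain up to maxBatch records, "commit" them,
    // then complete every future in the batch. Returns the batch size.
    int flush(int maxBatch) {
        List<Pending> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        if (batch.isEmpty()) {
            return 0;
        }
        boolean ok = true; // placeholder for the real DB insert and commit
        for (Pending p : batch) {
            p.done.complete(ok);
        }
        return batch.size();
    }
}
```

Every request in a batch gets the same outcome here, which matches a single commit per batch; reporting per-record failures would need more bookkeeping.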




--------------------------
Philip Fung
Engineering
Facebook, Inc.
[EMAIL PROTECTED]


On May 28, 2008, at 11:46 AM, Ben Maurer wrote:

Hey,

Usually, when writing stuff for thrift, I've found it's best to wrap the
parameters in a single object:

FooReturn doFoo(1: FooArgs args);

For the return value, this is pretty critical because thrift doesn't allow
you to return multiple values. For the arguments, I've found that even
though thrift can support multiple arguments, doing this makes it easier
(eg, you can serialize args and log it).

So for this kind of API I'd just take the args value and insert it into a
queue. It does require a bit of work for each function, but you can also
do things like validate the request and raise an exception if you know
the insert will fail.
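
A rough sketch of that validate-then-enqueue idea, under some assumptions: the FooArgs fields, the non-empty-key rule, and the pending() helper are made up for illustration, the handler returns void rather than a FooReturn, and IllegalArgumentException stands in for whatever exception the service would actually declare.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical argument struct; in practice thrift would generate this class.
class FooArgs {
    final String key;
    final String value;
    FooArgs(String key, String value) {
        this.key = key;
        this.value = value;
    }
}

class FooHandler {
    private final BlockingQueue<FooArgs> queue = new LinkedBlockingQueue<>();

    // Reject requests that are known to fail before they reach the queue.
    void doFoo(FooArgs args) {
        if (args == null || args.key == null || args.key.isEmpty()) {
            throw new IllegalArgumentException("key must be non-empty");
        }
        queue.add(args); // a worker thread would batch these up later
    }

    // Number of requests waiting to be batched (illustrative helper).
    int pending() {
        return queue.size();
    }
}
```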

-b

On Wed, 28 May 2008, Benjamin Reed wrote:

Could I get a pointer to how to deal with the following scenario:

I have a Java server using thrift. There are potentially hundreds of clients sending hundreds of requests at a time. The server receives a request, batches it up with other pending requests, processes a batch at a time, and
then generates the responses when the batch finishes.

For example, clients A, B, and C are each sending up records to be inserted into a database. The clients are sending up 1000 requests per second. The server will grab some number, let's say 100 requests at a time, insert them into the database, issue a commit, and send back successful responses. Doing batch commits of 100 requests at a time allows the server to keep up with the clients. Committing each request individually would be too slow.
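
The "grab up to 100 at a time" step might look something like the sketch below, assuming the pending requests sit in a java.util.concurrent BlockingQueue. BatchTaker and takeBatch are hypothetical names, and the database insert and commit are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

class BatchTaker {
    // Block until at least one request is available, then grab up to
    // maxBatch requests in total without waiting for more to arrive.
    static <T> List<T> takeBatch(BlockingQueue<T> queue, int maxBatch)
            throws InterruptedException {
        List<T> batch = new ArrayList<>(maxBatch);
        batch.add(queue.take());            // blocks while the queue is empty
        queue.drainTo(batch, maxBatch - 1); // non-blocking; takes what is there
        return batch;
    }
}
```

At 1000 requests per second and full batches of 100, that works out to about 10 commits per second instead of 1000.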

So, in my Java server, how do I get an RPC request and then put it on a completion list so that I can free up the thread for the next RPC call and
complete the RPC when I do the batch processing?

thanx
ben
_______________________________________________
thrift mailing list
[EMAIL PROTECTED]
http://publists.facebook.com/mailman/listinfo/thrift

