Hi Oliver,

Thanks for your answer. Indeed, passing by reference is the key to my performance issue.
What I am doing now is making a *deep copy* of the object when the service receives it, and then working only on the copy. The service then returns the copy to the client, and the client makes a deep copy again. The performance cost of the deep copying is very small compared to the cost of the synchronization overhead. For my needs, the problem is solved.

I did not know about redis. I'll keep that in mind.

Boris

On Tuesday, January 22, 2013 8:19:58 PM UTC+1, Oliver Drake wrote:
>
> Hi Boris,
> I guess you're making the assumption that the entire dictionary gets
> pickled and sent to the server. Tomer can give a deeper explanation of how
> this works, but essentially you're passing a 'reference' to the dictionary
> that d references across to the server, so I imagine the server will create
> some sort of proxy object. This would mean any operation on that dictionary
> on the server side will need to go across the rpc link (d still lives in
> memory on the client side only). This behaviour is crucial to keep rpyc
> transparent as this is how your local code behaves as well (you're passing
> a reference to the dictionary when you invoke dummy.dummy - it's faster as
> there's no rpc overhead). The question is do both your client and server
> side processes need fast access to the same dictionary? Or could you define
> the dict on the server side? If you really wanted to use pass by value you
> would then need to send the dictionary back from the server to the client
> and make d reference the new emptied dictionary - this could get messy. If
> the answer to my first question is yes, you could consider using something
> like redis <http://redis.io/> which is a fast key/value store that can be
> accessed by multiple processes. Otherwise decide which process should own
> the actual dictionary and build up your application from there.
> Hope this helps,
> Oliver
>
> On 23 January 2013 05:16, Boris wrote:
>
>> I should add that:
>> - the error in dummy.print2 is solved if I set allow_public_attrs = True
>> for my server
>>
>> But still:
>> How is it that dummy.dummy(d) takes 10^5 times longer when run by the service?
>> (no network issue here, as everything is on my localhost)
>>
>> On Tuesday, January 22, 2013 4:24:50 PM UTC+1, Boris wrote:
>>>
>>> Hi rpyc users and developers!
>>>
>>> (cf. my code below, as I cannot attach files...)
>>>
>>> I am building a big dummy dictionary, and call a dummy function on it
>>> that loops through all keys and sets the values to 0.
>>> This function takes 0.00099 seconds.
>>>
>>> If run by a rpyc.service (that runs on the same host), it takes 9.346
>>> seconds.
>>> Also, it looks like the execution of the body of the function exposed by
>>> the service will start even if the input object "sent" by the client has
>>> not been "received" entirely on the server side. Cf. in my code:
>>> dummy.print1 will fail (trying to perform a for k,v in d.items).
>>>
>>> Thus my questions:
>>> - Is there a connection param I could use to speed up the transmission
>>> of a big object between my client and my service? (typically a 150kB object
>>> when pickled)
>>> - How is it that the function run by the service starts its execution
>>> while the function input is not available yet?
>>>
>>> Thanks a lot for your comments/suggestions, and the great work on rpyc,
>>> Boris
>>>
>>> **************** mini_service.py ****************
>>> import rpyc
>>> from rpyc.utils.server import ThreadedServer
>>> import dummy
>>>
>>> class miniService(rpyc.Service):
>>>     def exposed_myfunc(self,d):
>>>         #dummy.print1(d) #success
>>>         #dummy.print2(d) #Netref.py / protocol.py / AttributeError: cannot access 'items'
>>>         dummy.dummy(d)
>>>
>>> if __name__=='__main__':
>>>     t = ThreadedServer(miniService, protocol_config = {"allow_pickle" : True}, port = 19865)
>>>     t.start()
>>>
>>> **************** mini_client.py ****************
>>> import rpyc
>>> import sys
>>> #import socket
>>> import pickle
>>> import dummy
>>> def makedict(n):
>>>     d={x:x for x in range(n)}
>>>     return d
>>>
>>> if __name__ == "__main__":
>>>     d=makedict(20000)
>>>     print(sys.getsizeof(d)) #result = 393356
>>>     # dummy.print2(d)
>>>
>>>     # output = open("C:\\rd\\non_mc_test_files\\mini.pkl",'wb') #117kB object for n=20k
>>>     # pickle.dump(d,output)
>>>     # output.close()
>>>
>>>     #RUN1 : dummy.dummy(d) out of rpyc takes 0.00099 seconds
>>>     # dummy.dummy(d)
>>>
>>>     #RUN2 : dummy.dummy(d) via RPYC on localhost takes 9.346 seconds
>>>     # conn=rpyc.connect('localhost',19865,config={"allow_pickle":True})
>>>     # conn.root.myfunc(d)
>>>
>>>     print('Done.')
>>>
>>> **************** dummy.py ****************
>>> import time
>>>
>>> def print1(d):
>>>     for k,v in d.items():
>>>         print(k,v)
>>>
>>> def print2(d):
>>>     for key in d:
>>>         print(key)
>>> def dummy(d):
>>>     start_ = time.time()
>>>     for key in d:
>>>         d[key]=0
>>>
>>>     print('Time spent in dummy in seconds: ' + str(time.time()-start_))
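
For anyone finding this thread later, here is a minimal sketch of why the by-reference call is slow and why working on a deep copy avoids it. It does not use rpyc at all: CountingProxy is a hypothetical stand-in for the proxy object the server receives (it is not rpyc's real netref class), and it only counts each dictionary operation as one simulated round trip. zero_values mirrors dummy.dummy from the code above, and by_value is an assumed name for the copy-then-work approach.

```python
import copy


class CountingProxy:
    """Hypothetical stand-in for a remote reference to a dict.

    Every operation is counted as one simulated round trip over the
    RPC link. This is NOT rpyc's netref implementation.
    """

    def __init__(self, target):
        self._target = target
        self.round_trips = 0

    def __iter__(self):
        self.round_trips += 1          # request for the iterator itself
        for key in list(self._target):
            self.round_trips += 1      # one request per key fetched
            yield key

    def __setitem__(self, key, value):
        self.round_trips += 1          # one request per assignment
        self._target[key] = value


def zero_values(d):
    """Same loop as dummy.dummy: set every value to 0."""
    for key in d:
        d[key] = 0


def by_value(d):
    """Work on a local deep copy, then hand the copy back to the caller."""
    local = copy.deepcopy(d)           # one bulk copy instead of many calls
    zero_values(local)
    return local


if __name__ == "__main__":
    d = {x: x for x in range(20000)}

    proxy = CountingProxy(d)
    zero_values(proxy)
    print("simulated round trips by reference:", proxy.round_trips)

    fresh = {x: x for x in range(20000)}
    result = by_value(fresh)
    print("original untouched:", fresh[5] == 5, "- copy zeroed:", result[5] == 0)
```

With the proxy, the number of simulated round trips grows with the size of the dictionary (one per key fetched plus one per assignment), whereas the deep-copy version pays for a single bulk copy in each direction. The caller then has to rebind its own name to the returned copy, which is the "messy" part mentioned in the quoted reply.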
