Re: [rpyc] Re: rpyc.service needs 10 seconds to "receive" a 150kB object - Can I make it quicker?

Boris Wed, 23 Jan 2013 01:26:17 -0800

Hi Oliver,
Thanks for your answer. Indeed, passing by reference is the key to my 
performance issue.
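
To make the by-reference cost concrete, here is a minimal sketch in plain Python. It is not RPyC's actual implementation: CountingProxy and zero_all are hypothetical names, and the proxy simply counts every operation on the wrapped dictionary as one simulated round trip, which is roughly the behaviour Oliver's quoted explanation below describes for a dictionary that stays on the client side.

```python
class CountingProxy:
    """Hypothetical stand-in for a remote reference (not RPyC code).

    Every operation on the wrapped dict is counted as one simulated round trip.
    """

    def __init__(self, target):
        self._target = target
        self.round_trips = 0

    def __iter__(self):
        # One request to start iterating, then one request per key fetched.
        self.round_trips += 1
        for key in self._target:
            self.round_trips += 1
            yield key

    def __setitem__(self, key, value):
        # Each assignment has to travel back to the side that owns the dict.
        self.round_trips += 1
        self._target[key] = value


def zero_all(d):
    """Same loop as dummy.dummy in the original post, without the timing."""
    for key in d:
        d[key] = 0


local = {x: x for x in range(20000)}
proxy = CountingProxy(local)
zero_all(proxy)
print(proxy.round_trips)
```

For the 20000-key dictionary from the original post this counts 40001 simulated round trips. If each real round trip cost on the order of 0.2 ms, that would add up to several seconds, the same order of magnitude as the 9.346 seconds reported below. The number of requests RPyC really makes may differ.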


What I am doing now is to make a *deep copy* of my object when I receive 
it, and then work only on the copy. The service then returns the copy to 
the client, and the client makes a deep copy again. The performance 
impact of the deep copying is very small compared to the impact of 
the synchronization overhead.
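
The copy-then-work pattern can be sketched with the standard library alone. This is an illustration under assumptions, not the code used in this thread: a pickle round trip stands in for moving the object across the connection by value, and zero_all is a hypothetical stand-in for dummy.dummy. The scripts quoted below enable allow_pickle, which is presumably what lets the object cross the connection by value in the real setup.

```python
import copy
import pickle
import time


def zero_all(d):
    """Stand-in for dummy.dummy: set every value to 0."""
    for key in d:
        d[key] = 0


original = {x: x for x in range(20000)}

start = time.perf_counter()
payload = pickle.dumps(original)     # one serialized message instead of per-key requests
received = pickle.loads(payload)     # independent local copy on the "service" side
zero_all(received)                   # all work happens on the local copy
returned = copy.deepcopy(received)   # the "client" takes its own copy of the result
elapsed = time.perf_counter() - start

print(len(payload), elapsed)
```

The caller's original dictionary is left untouched here, so the client has to use the returned copy rather than the object it passed in.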

For my needs, the problem is solved.
I did not know about redis. I'll keep that in mind.
Boris


On Tuesday, January 22, 2013 8:19:58 PM UTC+1, Oliver Drake wrote:
>
> Hi Boris,
> I guess you're making the assumption that the entire dictionary gets 
> pickled and sent to the server. Tomer can give a deeper explanation of how 
> this works, but essentially you're passing a 'reference' to the dictionary 
> that d references across to the server, so I imagine the server will create 
> some sort of proxy object. This would mean any operation on that dictionary 
> on the server side will need to go across the rpc link (d still lives in 
> memory on the client side only). This behaviour is crucial to keep rpyc 
> transparent as this is how your local code behaves as well (you're passing 
> a reference to the dictionary when you invoke dummy.dummy - it's faster as 
> there's no rpc overhead). The question is do both your client and server 
> side processes need fast access to the same dictionary? Or could you define 
> the dict on the server side? If you really wanted to use pass by value you 
> would then need to send the dictionary back from the server to the client 
> and make d reference the new emptied dictionary - this could get messy. If 
> the answer to my first question is yes, you could consider using something 
> like redis <http://redis.io/> which is a fast key/value store that can be 
> accessed by multiple processes. Otherwise decide which process should own 
> the actual dictionary and build up your application from there.
> Hope this helps,
> Oliver
>
> On 23 January 2013 05:16, Boris <[email protected]> wrote:
>
>> I should add that:
>> - the error in dummy.print2 is solved if I set allow_public_attrs = True 
>> for my server
>>
>> But still:
>> How is it that dummy.dummy(d) takes about 10^4 times longer when run by the service?
>> (no network issue here, as everything is on my localhost)
>>
>>
>>
>>
>> On Tuesday, January 22, 2013 4:24:50 PM UTC+1, Boris wrote:
>>>
>>> Hi rpyc users and developers!
>>>
>>> (see my code below, as I cannot attach files...)
>>>
>>> I am building a big dummy dictionary, and call a dummy function on it 
>>> that loops through all keys and sets the values to 0.
>>> This function takes 0.00099 seconds.
>>>
>>> If run by a rpyc.service (that runs on the same host), it takes 9.346 
>>> seconds.
>>> Also, it looks like the body of the function exposed by the service 
>>> starts executing even if the input object "sent" by the client has 
>>> not been "received" entirely on the server side. See my code: 
>>> dummy.print1 will fail (trying to perform a for k,v in d.items()).
>>>
>>> Thus my questions:
>>> - Is there a connection param I could use to speed up the transmission 
>>> of a big object between my client and my service? (typically a 150kB object 
>>> when pickled)
>>> - How is it that the function run by the service starts its execution 
>>> while the function input is not available yet?
>>>
>>> Thanks a lot for your comments/suggestions, and the great work on rpyc,
>>> Boris 
>>>
>>> **************** mini_service.py ****************
>>> import rpyc
>>> from rpyc.utils.server import ThreadedServer
>>> import dummy
>>>
>>> class miniService(rpyc.Service):
>>>     def exposed_myfunc(self, d):
>>>         #dummy.print1(d) #success
>>>         #dummy.print2(d) #Netref.py / protocol.py / AttributeError: cannot access 'items'
>>>         dummy.dummy(d)
>>>
>>> if __name__ == '__main__':
>>>     t = ThreadedServer(miniService, protocol_config={"allow_pickle": True}, port=19865)
>>>     t.start()
>>>
>>> **************** mini_client.py ****************
>>> import rpyc
>>> import sys
>>> #import socket
>>> import pickle
>>> import dummy
>>>
>>> def makedict(n):
>>>     d = {x: x for x in range(n)}
>>>     return d
>>>
>>> if __name__ == "__main__":
>>>     d = makedict(20000)
>>>     print(sys.getsizeof(d)) #result = 393356
>>>     # dummy.print2(d)
>>>
>>>     # output = open("C:\\rd\\non_mc_test_files\\mini.pkl",'wb') #117kB object for n=20k
>>>     # pickle.dump(d,output)
>>>     # output.close()
>>>
>>>     #RUN1 : dummy.dummy(d) out of rpyc takes 0.00099 seconds
>>>     # dummy.dummy(d)
>>>
>>>     #RUN2 : dummy.dummy(d) via RPYC on localhost takes 9.346 seconds
>>>     # conn=rpyc.connect('localhost',19865,config={"allow_pickle":True})
>>>     # conn.root.myfunc(d)
>>>
>>>     print('Done.')
>>>
>>> **************** dummy.py ****************
>>> import time
>>>
>>> def print1(d):
>>>     for k, v in d.items():
>>>         print(k, v)
>>>
>>> def print2(d):
>>>     for key in d:
>>>         print(key)
>>>
>>> def dummy(d):
>>>     start_ = time.time()
>>>     for key in d:
>>>         d[key] = 0
>>>     print('Time spent in dummy in seconds: ' + str(time.time()-start_))
>>>
>>>
>>>
>>>
>
