Hi Tomer,
Our code base is significant, so I will not switch to classic mode and will 
stick to services instead.
My change is to perform deep copies. I guess this change corresponds to the 
"few lines" you refer to.
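Roughly, the change looks like this (a simplified sketch; the real method and
names in my service differ, and `exposed_myfunc_sketch` is written as a plain
function here):

```python
import copy

def exposed_myfunc_sketch(d):
    # `d` arrives as an rpyc netref; dict(d.items()) pulls every
    # key/value pair across the connection once, and deepcopy makes
    # sure nothing in the result still points back at the client.
    local = copy.deepcopy(dict(d.items()))
    for key in local:        # from here on, no traffic on the connection
        local[key] = 0
    return local             # the client deep-copies the result on its side
```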
Thanks again,
Boris


On Wednesday, January 23, 2013 11:08:25 AM UTC+1, Tomer Filiba wrote:
>
> Hi Boris,
>
> Olivier provided a very good explanation (thanks Olivier :))
>
> If you wish to *copy* objects from/to client/server, and you use the 
> classic mode, you can use
> obtain() and deliver(). see 
> http://rpyc.sourceforge.net/api/utils_classic.html#rpyc.utils.classic.obtain
> These two functions pass a copy (pickled version) of the object to the 
> other party, reconstructing it there.
> If you're not using classic mode, you can implement them yourself in your 
> service (just a few lines of code)
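The "few lines" are essentially a pickle round-trip. A minimal sketch of what
classic obtain()/deliver() do under the hood (per the linked docs), written
with stdlib pickle only; the function names here are hypothetical:

```python
import pickle

# Copy-by-value mechanics: the object is pickled on one side of the
# connection and reconstructed on the other, so the receiver works
# on an independent local copy rather than a netref proxy.
def deliver_sketch(obj):
    return pickle.dumps(obj)       # runs on the sending side

def obtain_sketch(blob):
    return pickle.loads(blob)      # runs on the receiving side

d = {x: x for x in range(5)}
copy_of_d = obtain_sketch(deliver_sketch(d))
print(copy_of_d == d)   # True: same contents
print(copy_of_d is d)   # False: a distinct object
```

In a custom service, deliver_sketch would run on one side of the connection
and obtain_sketch inside the exposed method on the other.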
>
> Hope this helps,
> -Tomer
>
>
> -----------------------------------------------------------------
>     
> *Tomer Filiba* 
> tomerfiliba.com
>
>
> On Wed, Jan 23, 2013 at 11:26 AM, Boris <[email protected]> wrote:
>
>> Hi Olivier,
>> Thanks for your answer. Indeed, the passing by reference is key to my 
>> performance issue.
>>
>> What I am doing now is to perform a *deep copy* of my object when I 
>> receive it, and then work only on the copy. The service then returns the 
>> copy to the client, and the client again performs a deep copy. The 
>> performance impact of the deep copying is very small compared to the impact 
>> caused by the synchronization overhead.
>>
>> Regarding my needs, the problem is solved.
>> I did not know about redis. I'll keep that in mind.
>> Boris
>>
>>
>> On Tuesday, January 22, 2013 8:19:58 PM UTC+1, Oliver Drake wrote:
>>
>>> Hi Boris,
>>> I guess you're making the assumption that the entire dictionary gets 
>>> pickled and sent to the server. Tomer can give a deeper explanation of how 
>>> this works, but essentially you're passing a 'reference' to the dictionary 
>>> that d references across to the server, so I imagine the server will create 
>>> some sort of proxy object. This would mean any operation on that dictionary 
>>> on the server side will need to go across the rpc link (d still lives in 
>>> memory on the client side only). This behaviour is crucial to keep rpyc 
>>> transparent as this is how your local code behaves as well (you're passing 
>>> a reference to the dictionary when you invoke dummy.dummy - it's faster as 
>>> there's no rpc overhead). The question is do both your client and server 
>>> side processes need fast access to the same dictionary? Or could you define 
>>> the dict on the server side? If you really wanted to use pass by value you 
>>> would then need to send the dictionary back from the server to the client 
>>> and make d reference the new emptied dictionary - this could get messy. If 
>>> the answer to my first question is yes, you could consider using something 
>>> like redis <http://redis.io/> which is a fast key/value store that can 
>>> be accessed by multiple processes. Otherwise decide which process should 
>>> own the actual dictionary and build up your application from there.
>>> Hope this helps,
>>> Oliver
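Olivier's point about every operation crossing the link can be made concrete
with a toy proxy that simply counts "remote" operations (this stand-in is
hypothetical; a real rpyc netref does far more, but the cost model is the
same: one round-trip per dict operation):

```python
class CountingProxy:
    """Toy stand-in for an rpyc netref to a dict: every dict operation
    is counted, mimicking one network round-trip per operation."""
    def __init__(self, data):
        self._data = data
        self.roundtrips = 0

    def __iter__(self):
        self.roundtrips += 1
        return iter(list(self._data))

    def __setitem__(self, key, value):
        self.roundtrips += 1
        self._data[key] = value

d = CountingProxy({x: x for x in range(20000)})
for key in d:          # the same loop shape as dummy.dummy()
    d[key] = 0
print(d.roundtrips)    # 20001: one per assignment, plus the iteration
```

With 20,000 entries, the inner loop issues over 20,000 round-trips, which is
consistent with the several-orders-of-magnitude slowdown Boris measured.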
>>>
>>>
>>> On 23 January 2013 05:16, Boris <[email protected]> wrote:
>>>
>>>> I should add that:
>>>> - the error in dummy.print2 is solved if I set allow_public_attrs = 
>>>> True for my server
>>>>
>>>> But still:
>>>> How is it that dummy.dummy(d) takes 10^5 times longer when run by the service?
>>>> (no network issue here, as everything is on my localhost)
>>>>
>>>>
>>>>
>>>>
>>>> On Tuesday, January 22, 2013 4:24:50 PM UTC+1, Boris wrote:
>>>>>
>>>>> Hi rpyc users and developers!
>>>>>
>>>>> (cf. my code below, as I cannot attach files...)
>>>>>
>>>>> I am building a big dummy dictionary, and call a dummy function on it 
>>>>> that loops through all keys and sets the values to 0.
>>>>> This function takes 0.00099 seconds.
>>>>>
>>>>> If run by a rpyc service (that runs on the same host), it takes 
>>>>> 9.346 seconds.
>>>>> Also, it looks like the execution of the body of the function exposed 
>>>>> by the service will start even if the input object "sent" by the client 
>>>>> has not been "received" entirely on the server side. Cf. in my code: 
>>>>> dummy.print1 will fail (when trying to perform a for k, v in d.items()).
>>>>>
>>>>> Thus my questions:
>>>>> - Is there a connection param I could use to speed up the transmission 
>>>>> of a big object between my client and my service? (typically a 150kB 
>>>>> object 
>>>>> when pickled)
>>>>> - How is it that the function run by the service starts its execution 
>>>>> while the function input is not available yet?
>>>>>
>>>>> Thanks a lot for your comments/suggestions, and the great work on rpyc,
>>>>> Boris 
>>>>>
>>>>> **************** mini_service.py ****************
>>>>> import rpyc
>>>>> from rpyc.utils.server import ThreadedServer
>>>>> import dummy
>>>>>
>>>>> class miniService(rpyc.Service):
>>>>>     def exposed_myfunc(self, d):
>>>>>         #dummy.print1(d)  # success
>>>>>         #dummy.print2(d)  # netref.py / protocol.py / AttributeError: cannot access 'items'
>>>>>         dummy.dummy(d)
>>>>>
>>>>> if __name__ == '__main__':
>>>>>     t = ThreadedServer(miniService, protocol_config={"allow_pickle": True}, port=19865)
>>>>>     t.start()
>>>>>
>>>>> **************** mini_client.py ****************
>>>>> import rpyc
>>>>> import sys
>>>>> #import socket
>>>>> import pickle
>>>>> import dummy
>>>>>
>>>>> def makedict(n):
>>>>>     d = {x: x for x in range(n)}
>>>>>     return d
>>>>>
>>>>> if __name__ == "__main__":
>>>>>     d = makedict(20000)
>>>>>     print(sys.getsizeof(d))  # result = 393356
>>>>>     # dummy.print2(d)
>>>>>
>>>>>     # output = open("C:\\rd\\non_mc_test_files\\mini.pkl", 'wb')  # 117kB object for n=20k
>>>>>     # pickle.dump(d, output)
>>>>>     # output.close()
>>>>>
>>>>>     # RUN1: dummy.dummy(d) out of rpyc takes 0.00099 seconds
>>>>>     # dummy.dummy(d)
>>>>>
>>>>>     # RUN2: dummy.dummy(d) via RPYC on localhost takes 9.346 seconds
>>>>>     # conn = rpyc.connect('localhost', 19865, config={"allow_pickle": True})
>>>>>     # conn.root.myfunc(d)
>>>>>
>>>>>     print('Done.')
>>>>>
>>>>> **************** dummy.py ****************
>>>>> import time
>>>>>
>>>>> def print1(d):
>>>>>     for k, v in d.items():
>>>>>         print(k, v)
>>>>>
>>>>> def print2(d):
>>>>>     for key in d:
>>>>>         print(key)
>>>>>
>>>>> def dummy(d):
>>>>>     start_ = time.time()
>>>>>     for key in d:
>>>>>         d[key] = 0
>>>>>     print('Time spent in dummy in seconds: ' + str(time.time() - start_))
>>>>>