Re: [rpyc] Re: rpyc.service needs 10 seconds to "receive" a 150kB object - Can I make it quicker?

Tomer Filiba Wed, 23 Jan 2013 02:08:28 -0800

Hi Boris,

Oliver provided a very good explanation (thanks Oliver :))


If you want to *copy* objects between client and server and you are using
classic mode, you can use obtain() and deliver(). See
http://rpyc.sourceforge.net/api/utils_classic.html#rpyc.utils.classic.obtain
These two functions pass a pickled copy of the object to the other party,
which reconstructs it there.
If you're not using classic mode, you can implement them yourself in your
service (just a few lines of code).
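
A minimal sketch of the "implement them yourself" idea, using only the
standard library. The names pack, unpack and zero_values are hypothetical
and not part of rpyc. The assumption is that zero_values would be the body
of an exposed_ method in your service and that the client would call it
with pack(d), so that only a bytes object crosses the connection (by value)
rather than a reference to the dict. Unpickling should only be done
between parties that trust each other.

```python
import pickle


def pack(obj):
    """Serialize obj to bytes so it can be sent by value."""
    return pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)


def unpack(data):
    """Rebuild an independent local copy from pickled bytes."""
    return pickle.loads(data)


def zero_values(payload):
    """Hypothetical body of an exposed service method.

    Receives a pickled dict, works on the local copy only,
    and returns the result pickled again.
    """
    d = unpack(payload)
    for key in d:
        d[key] = 0  # purely local: no remote call per item
    return pack(d)


if __name__ == "__main__":
    original = {x: x for x in range(20000)}
    result = unpack(zero_values(pack(original)))
    # The caller's dict is untouched; the returned copy is zeroed.
    print(len(result), original[5], result[5])
```

Because the work happens on a local copy, the caller has to replace its
own dict with the returned one if it wants to see the changes.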

Hope this helps,
-Tomer



On Wed, Jan 23, 2013 at 11:26 AM, Boris <[email protected]> wrote:

> Hi Oliver,
> Thanks for your answer. Indeed, passing by reference is the key to my
> performance issue.
>
> What I am doing now is to make a *deep copy* of my object when I
> receive it, and then work only on the copy. The service then returns the
> copy to the client, and the client again makes a deep copy. The
> performance impact of the deep copying is very small compared to the
> impact of the synchronization overhead.
>
> For my needs, the problem is solved.
> I did not know about redis. I'll keep that in mind.
> Boris
>
>
> On Tuesday, January 22, 2013 8:19:58 PM UTC+1, Oliver Drake wrote:
>
>> Hi Boris,
>> I guess you're making the assumption that the entire dictionary gets
>> pickled and sent to the server. Tomer can give a deeper explanation of how
>> this works, but essentially you're passing a 'reference' to the dictionary
>> that d references across to the server, so I imagine the server will create
>> some sort of proxy object. This would mean any operation on that dictionary
>> on the server side will need to go across the rpc link (d still lives in
>> memory on the client side only). This behaviour is crucial to keep rpyc
>> transparent as this is how your local code behaves as well (you're passing
>> a reference to the dictionary when you invoke dummy.dummy - it's faster as
>> there's no rpc overhead). The question is do both your client and server
>> side processes need fast access to the same dictionary? Or could you define
>> the dict on the server side? If you really wanted to use pass by value you
>> would then need to send the dictionary back from the server to the client
>> and make d reference the new emptied dictionary - this could get messy. If
>> the answer to my first question is yes, you could consider using something
>> like redis <http://redis.io/> which is a fast key/value store that can
>> be accessed by multiple processes. Otherwise decide which process should
>> own the actual dictionary and build up your application from there.
>> Hope this helps,
>> Oliver
>>
>>
>> On 23 January 2013 05:16, Boris <[email protected]> wrote:
>>
>>> I should add that:
>>> - the error in dummy.print2 is solved if I set allow_public_attrs = True
>>> for my server
>>>
>>> But still:
>>> How is it that dummy.dummy(d) takes roughly 10^4 times longer when run by the service?
>>> (no network issue here, as everything is on my localhost)
>>>
>>>
>>>
>>>
>>> On Tuesday, January 22, 2013 4:24:50 PM UTC+1, Boris wrote:
>>>>
>>>> Hi rpyc users and developers!
>>>>
>>>> (see my code below, as I cannot attach files...)
>>>>
>>>> I am building a big dummy dictionary, and call a dummy function on it
>>>> that loops through all keys and sets the values to 0.
>>>> This function takes 0.00099 seconds.
>>>>
>>>> If run by an rpyc.service (running on the same host), it takes
>>>> 9.346 seconds.
>>>> Also, it looks like the body of the function exposed by the service
>>>> starts executing even if the input object "sent" by the client has
>>>> not been "received" entirely on the server side. See in my code:
>>>> dummy.print1 will fail (trying to perform a for k,v in d.items()).
>>>>
>>>> Thus my questions:
>>>> - Is there a connection param I could use to speed up the transmission
>>>> of a big object between my client and my service? (typically a 150kB object
>>>> when pickled)
>>>> - How is it that the function run by the service starts its execution
>>>> while the function input is not available yet?
>>>>
>>>> Thanks a lot for your comments/suggestions, and the great work on rpyc,
>>>> Boris
>>>>
>>>> **************** mini_service.py ****************
>>>> import rpyc
>>>> from rpyc.utils.server import ThreadedServer
>>>> import dummy
>>>>
>>>> class miniService(rpyc.Service):
>>>>     def exposed_myfunc(self, d):
>>>>         #dummy.print1(d) #success
>>>>         #dummy.print2(d) #Netref.py / protocol.py / AttributeError: cannot access 'items'
>>>>         dummy.dummy(d)
>>>>
>>>> if __name__=='__main__':
>>>>     t = ThreadedServer(miniService, protocol_config = {"allow_pickle" : True}, port = 19865)
>>>>     t.start()
>>>>
>>>> **************** mini_client.py ****************
>>>> import rpyc
>>>> import sys
>>>> #import socket
>>>> import pickle
>>>> import dummy
>>>>
>>>> def makedict(n):
>>>>     d={x:x for x in range(n)}
>>>>     return d
>>>>
>>>> if __name__ == "__main__":
>>>>     d=makedict(20000)
>>>>     print(sys.getsizeof(d)) #result = 393356
>>>>     # dummy.print2(d)
>>>>
>>>>     # output = open("C:\\rd\\non_mc_test_files\\mini.pkl",'wb') #117kB object for n=20k
>>>>     # pickle.dump(d,output)
>>>>     # output.close()
>>>>
>>>>     #RUN1 : dummy.dummy(d) out of rpyc takes 0.00099 seconds
>>>>     # dummy.dummy(d)
>>>>
>>>>     #RUN2 : dummy.dummy(d) via RPYC on localhost takes 9.346 seconds
>>>>     # conn=rpyc.connect('localhost',19865,config={"allow_pickle":True})
>>>>     # conn.root.myfunc(d)
>>>>
>>>>     print('Done.')
>>>>
>>>> **************** dummy.py ****************
>>>> import time
>>>>
>>>> def print1(d):
>>>>     for k,v in d.items():
>>>>         print(k,v)
>>>>
>>>> def print2(d):
>>>>     for key in d:
>>>>         print(key)
>>>>
>>>> def dummy(d):
>>>>     start_ = time.time()
>>>>     for key in d:
>>>>         d[key]=0
>>>>     print('Time spent in dummy in seconds: ' + str(time.time()-start_))
>>>>
>>>>
>>>>
>>>>
>>
