Wow, I changed to TBinaryProtocolAccelerated and it went down to 1.5 seconds.
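
For reference, a minimal sketch of that kind of change, assuming the stock Apache Thrift Python library. The helper name make_protocol_factory is hypothetical and only for illustration; the two factory classes are the ones shipped in thrift.protocol.TBinaryProtocol. The accelerated variant relies on Thrift's compiled C extension and, as far as I know, uses the same wire format as the plain binary protocol.

```python
def make_protocol_factory(use_accelerated=True):
    """Return a binary protocol factory (hypothetical helper, not part of Thrift)."""
    # Imported lazily so this sketch can be read without thrift installed.
    from thrift.protocol import TBinaryProtocol

    if use_accelerated:
        # C-backed encoder/decoder; needs Thrift's compiled extension to be present.
        return TBinaryProtocol.TBinaryProtocolAcceleratedFactory()
    # Pure-Python encoder/decoder (the default used in the server quoted below).
    return TBinaryProtocol.TBinaryProtocolFactory()
```

In the server quoted below, this would stand in for the pfactory line, e.g. pfactory = make_protocol_factory(). If the C extension is not built, the accelerated class may quietly take the pure-Python path, so it is worth timing the call again after the change.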

Something is very wrong with the default Python implementation.
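
A note on the profile quoted below: it shows only accept(), most likely because cProfile on Python 2.7 instruments just the thread that started it, while TThreadedServer handles each connection in a separate thread. One way to see where the time really goes is to profile inside the worker thread. The sketch below is stdlib-only and written for Python 3; build_rows and profiled are hypothetical names, and build_rows is just a stand-in for the real handler method.

```python
import cProfile
import io
import pstats
import threading


def build_rows(n=60000):
    # Stand-in for the per-request work done by a handler method.
    return [("2016-05-20T14:01:01+0000", [float(22222)]) for _ in range(n)]


def profiled(func):
    """Run every call to func under its own profiler, in the calling thread."""
    def wrapper(*args, **kwargs):
        profiler = cProfile.Profile()
        try:
            return profiler.runcall(func, *args, **kwargs)
        finally:
            # Keep the top entries, sorted by cumulative time, as text.
            stream = io.StringIO()
            stats = pstats.Stats(profiler, stream=stream)
            stats.sort_stats("cumulative").print_stats(5)
            wrapper.last_report = stream.getvalue()
    wrapper.last_report = ""
    return wrapper


profiled_build = profiled(build_rows)
results = {}


def worker():
    # The profiler is started here, inside the worker thread.
    results["rows"] = profiled_build(60000)


thread = threading.Thread(target=worker)
thread.start()
thread.join()
print(profiled_build.last_report)
```

Wrapping the handler method this way (rather than the whole server script) should make the serialization cost visible if it is incurred in the same thread, although time spent after the handler returns would need the wrapper placed further up the call chain.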

On 16 June 2016 at 20:34, Glenn Pierce <[email protected]>
wrote:

> The important profile lines are
>
>    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>         1    0.006    0.006  963.335  963.335 ./ThriftTestServer.py:3(<module>)
>         1    0.000    0.000  963.304  963.304 /usr/lib64/python2.7/site-packages/thrift/server/TServer.py:102(serve)
>         2    0.000    0.000  963.303  481.652 /usr/lib64/python2.7/site-packages/thrift/transport/TSocket.py:188(accept)
>         2    0.000    0.000  963.303  481.652 /usr/lib64/python2.7/socket.py:201(accept)
>         2  963.303  481.652  963.303  481.652 {method 'accept' of '_socket.socket' objects}
>
>
>
> On 16 June 2016 at 20:01, Glenn Pierce <[email protected]>
> wrote:
>
>> That code is only called once.
>> All the time is spent after execution leaves my server function. So it is
>> somewhere in the generated thrift code.
>>
>> Just to be certain I generated the test data at the start of the program
>> like
>>
>> results = [('2016-05-20T14:01:01+0000', [22222])] * 60000
>>
>> But I get the same results: 2 minutes 20 seconds.
>>
>> My python server process runs at over 100% cpu.
>>
>>
>> Thanks
>>
>>
>>
>>
>>
>> On 16 June 2016 at 19:23, Randy Abernethy <[email protected]> wrote:
>>
>>> Hi Glenn,
>>>
>>> While two minutes is a little surprising, I suspect opening/closing the
>>> file every time the getData() method is called will be problematic. File
>>> open operations are usually an order of magnitude more expensive than
>>> file read or write operations. Try opening the file in the handler
>>> constructor, buffering it, closing the file and then responding to
>>> getData requests using the buffer.
>>>
>>> -Randy
>>>
>>>
>>> On Thu, Jun 16, 2016 at 9:08 AM, Glenn Pierce <
>>> [email protected]> wrote:
>>>
>>> > Hi, I wonder if someone could give advice on a performance issue I have.
>>> >
>>> > I have written a test case in Python
>>> >
>>> > The server is
>>> >
>>> > class TestServerThriftHandler(object):
>>> >
>>> >     def getData(self):
>>> >         with open('data.json', 'r') as f:
>>> >                 results = ujson.loads(f.read())
>>> >                 results = [(r['timestamp'], r['values']) for r in results]
>>> >                 print "size of list", len(results)
>>> >                 data = [ThriftTestGroup(timestamp=row[0],
>>> >                                         values=[float(x) if x is not None
>>> >                                                 else float('NaN') for x in row[1]])
>>> >                         for row in results]
>>> >
>>> >                 return data
>>> >
>>> >
>>> > if __name__ == "__main__":
>>> >     handler = TestServerThriftHandler()
>>> >     processor = ThriftTestService.Processor(handler)
>>> >     transport = TSocket.TServerSocket("127.0.0.1", 7777)
>>> >     tfactory = TTransport.TBufferedTransportFactory()
>>> >     pfactory = TBinaryProtocol.TBinaryProtocolFactory()
>>> >     server = TServer.TThreadedServer(processor, transport, tfactory,
>>> >                                      pfactory)
>>> >     print 'Starting server'
>>> >     server.serve()
>>> >
>>> >
>>> > The interface is
>>> >
>>> > struct ThriftTestGroup {
>>> >
>>> >     1:  required Timestamp timestamp;
>>> >     2:  required list<double> values;
>>> > }
>>> >
>>> > service ThriftTestService {
>>> >
>>> >   list<ThriftTestGroup> getData() throws ( 1:TestException ex);
>>> > }
>>> >
>>> >
>>> > Basically getData() loads 58773 example entries into a list from a
>>> > file.
>>> > Each entry is a string representing the time, like
>>> > 2016-05-20T13:58:59+0000, and one or more double values.
>>> > In this test there is only one double value.
>>> >
>>> > My client on the same host calls this method and it takes 2 minutes 14
>>> > seconds.
>>> > My code loading the data in the server is instant.
>>> >
>>> > Does anyone have an idea why this is so slow?
>>> >
>>> > Any advice would be great.
>>> >
>>> > Thanks
>>> >
>>>
>>
>>
>
