On Tue, Dec 7, 2010 at 11:06 PM, Ned Rockson <[email protected]> wrote:
> Hi,
>
> I'm currently investigating a major performance issue at the 98th
> percentile of deserialization of a Thrift object in Python using the
> fastbinary protocol. The objects aren't complicated, but they do contain a
> fair number of children - 500 or more in these cases. Performance is
> important, and in the average case it is okay, but 2% of the time the
> speed is too slow.
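>
> Concretely, the deserialization step is roughly the following - a minimal
> sketch using the standard Thrift Python API, with MyObject standing in for
> the real generated class:
>
> from thrift.transport import TTransport
> from thrift.protocol import TBinaryProtocol
>
> def deserialize(payload):
>     transport = TTransport.TMemoryBuffer(payload)
>     # TBinaryProtocolAccelerated uses the fastbinary C extension when available
>     protocol = TBinaryProtocol.TBinaryProtocolAccelerated(transport)
>     obj = MyObject()
>     obj.read(protocol)
>     return obj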
>
> I ran a test that loops: fetch an object from memcache, then deserialize it
> 500 times in a row. Statistics are tracked for a number of different objects
> through this test (the harness is sketched near the end of this mail). The
> output is consistently similar to this:
>
> TOTAL TIME: [9.2217922210693359, 9.3901157379150391, 10.351181030273438,
> 10.828018188476562, 11.240959167480469, 32.318115234375]
> DESERIALIZATION TIME: [6.6769123077392578, 6.7799091339111328,
> 7.8821182250976562, 8.3730220794677734, 8.6162090301513672,
> 29.79588508605957]
>
> Each element in the array corresponds to a different percentile - 25, 50,
> 75, 90, 95, 99. What's clear is that the 99th percentile is about 4x the
> median. When running cProfile, all of the time spent in deserialization is
> spent in the read() method of ttypes for the different objects being
> deserialized. I pasted an excerpt below.
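>
> (The profile was gathered roughly like this - a sketch using the stdlib
> profiler, where run_trials stands in for the test harness:
>
> import cProfile
> cProfile.run('run_trials()', sort='cumulative')
> )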
>
>    18000    0.023    0.000    0.023    0.000 ttypes.py:1057(__init__)
>    18000    0.052    0.000    0.052    0.000 ttypes.py:1162(__init__)
>     4500    0.009    0.000    0.009    0.000 ttypes.py:1735(__init__)
>     4500    0.006    0.000    0.006    0.000 ttypes.py:1929(__init__)
>     9000    0.014    0.000    0.014    0.000 ttypes.py:2115(__init__)
>      500    0.001    0.000    0.001    0.000 ttypes.py:2224(__init__)
>      500    0.001    0.000    0.001    0.000 ttypes.py:2338(__init__)
>   252000    0.356    0.000    0.356    0.000 ttypes.py:245(__init__)
>    18000    0.030    0.000    0.030    0.000 ttypes.py:2495(__init__)
>      500    0.001    0.000    0.001    0.000 ttypes.py:2600(__init__)
>   252000    0.917    0.000    0.917    0.000 ttypes.py:371(__init__)
>      500    0.002    0.000    0.002    0.000 ttypes.py:55(__init__)
>    18000    0.020    0.000    0.020    0.000 ttypes.py:570(__init__)
> ==>      500    0.004    0.000    3.733    0.007 ttypes.py:64(read)
>    18000    0.030    0.000    0.030    0.000 ttypes.py:648(__init__)
>    18000    0.040    0.000    0.040    0.000 ttypes.py:767(__init__)
>
>
> I'm having trouble making sense of this, so I'm hoping that someone else
> has seen an issue like it. Please respond with any thoughts on potential
> workarounds or other tests that could be helpful.
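>
> For reference, the core of the test is roughly the following - a minimal
> sketch where fetch_from_memcache and the key list are placeholders, and
> deserialize is the helper sketched above:
>
> import time
>
> total_times, deser_times = [], []
> for key in keys:                          # one trial per object
>     t0 = time.time()
>     payload = fetch_from_memcache(key)    # placeholder memcache fetch
>     t1 = time.time()
>     for _ in range(500):
>         deserialize(payload)
>     t2 = time.time()
>     deser_times.append(t2 - t1)
>     total_times.append(t2 - t0)
>
> def percentiles(xs, ps=(25, 50, 75, 90, 95, 99)):
>     xs = sorted(xs)
>     return [xs[min(len(xs) - 1, int(len(xs) * p / 100.0))] for p in ps]
>
> print 'TOTAL TIME:', percentiles(total_times)
> print 'DESERIALIZATION TIME:', percentiles(deser_times)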
>
> Thank you,
> Ned
>
> PS I wasn't sure whether to send this to user or dev, so if this is the
> wrong mailing list please let me know.
>

Is your test fetching the *same* object each time?

If so, is it possible that gc is getting run during that big slowdown?
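
One rough way to check - a sketch, with deserialize standing in for the
fastbinary read path: turn on gc debug stats, or disable gc around the timed
region, and see whether the 99th percentile outliers disappear.

import gc
import time

gc.set_debug(gc.DEBUG_STATS)   # prints collection stats to stderr whenever gc runs

# Or rule gc out entirely during the timed region:
gc.disable()
try:
    start = time.time()
    deserialize(payload)
    elapsed = time.time() - start
finally:
    gc.enable()

If the outliers vanish with gc disabled, raising the thresholds via
gc.set_threshold() for the allocation-heavy deserialization loop would be
worth trying.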

-- 
Kevin Clark
http://glu.ttono.us
