QPID python client API, bad performance about RangedSet.add
------------------------------------------------------------
Key: QPID-3378
URL: https://issues.apache.org/jira/browse/QPID-3378
Project: Qpid
Issue Type: Improvement
Components: Python Client
Affects Versions: 0.10, 0.8, 0.6
Environment: winxp; intel E7500; 2G
Reporter: Yang XU
Priority: Minor
Fix For: Future
To feedback to server the mesages that client had been received, client should
use:
>>> session.message_accept(Range)
while the Range is a qpid.datatypes.RangedSet object. My test shows that it has
a bad performance when we want to feedback a batch at one time, test code as
follow:
----------------------------------------------------- code
---------------------------
from qpid.datatypes import RangedSet
from random import choice, shuffle
from timeit import Timer
def normaladd(seq):
rs = RangedSet()
map(rs.add, seq)
print rs.ranges
def enhanceadd(seq):
def get_span(sset):
sset = set(sset)
templist, length, index = list(sset), len(sset), 1
result = []
if length: tmp_head = templist[0]
while index <= length:
if index == length or not templist[index] ==
templist[index-1]+1:
result.append((tmp_head, templist[index-1]))
if index < length: tmp_head = templist[index]
index += 1
return result
rs = RangedSet()
map(lambda items: rs.add(*items), get_span(seq))
print rs.ranges
t_old = Timer('normaladd(seq)', 'from __main__ import normaladd, seq')
t_new = Timer('enhanceadd(seq)', 'from __main__ import enhanceadd, seq')
def test():
global seq
total_num = 1000
def case1ctx():
l = range(total_num)
return l
def case2ctx():
l = range(total_num)
shuffle(l)
return l
def case3ctx():
l = range(total_num)
ll = []
for count in range(total_num):
ll.append(choice(l))
return ll
seq = case1ctx()
print 'ordered, unique'
print 'old time used: ', t_old.repeat(1,1)
print 'new time used: ', t_new.repeat(1,1)
print
seq = case2ctx()
print 'disordered, unique'
print 'old time used: ', t_old.repeat(1,1)
print 'new time used: ', t_new.repeat(1,1)
print
seq = case3ctx()
print 'disordered, duplicated'
print 'old time used: ', t_old.repeat(1,1)
print 'new time used: ', t_new.repeat(1,1)
print
test()
-----------------------------------------------------------------------
Compring to the original RangedSet.add, the function enhanceadd could have a
much better performance. Also, in enhanceadd situation, some code in original
RangedSet are useless as following:
def add_range(self, range):
idx = 0
while idx < len(self.ranges):
r = self.ranges[idx]
if range.touches(r):
del self.ranges[idx]
range = range.span(r)
elif range.upper < r.lower:
self.ranges.insert(idx, range)
return
else:
idx += 1
self.ranges.append(range)
def add(self, lower, upper = None):
self.add_range(Range(lower, upper))
it can be changed like with a further improvment:
def add(self, lower, upper = None):
self.ranges.append((Range(lower, upper))
The comparing result on my computer:
--withou code changing
ordered, unique
old time used: [0.030543205640406971]
new time used: [0.000312459202400521]
disordered, unique
old time used: [6.1625172448676198]
new time used: [0.00031574488299845882]
disordered, duplicated
old time used: [6.8560797353410585]
new time used: [1.6444935113447237]
--with code changing(original way will not work, so the result is not listed)
ordered, unique
new time used: [0.00029485411974586751]
disordered, unique
new time used: [0.00029566614149547331]
disordered, duplicated
new time used: [0.0015168716657040435]
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project: http://qpid.apache.org
Use/Interact: mailto:[email protected]