The two runs being compared had everything the same (including logging) except that one was doing insert (with the key check) and the other was doing upsert (without it). I had corrected the logging before doing the comparison. Even cooler, I realized that the 143 million was actually the speed at which my Python script was sending data over the socket, meaning that Asterix wasn't hitting any bottlenecks. So I tried using two feeds to the same dataset on two different ports, and I was able to double this speed again without hitting a bottleneck. I still haven't run any experiments longer than an hour, but this is really promising.
Steven
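A minimal sketch of what feeding one dataset through two socket feeds on different ports might look like, as described above. This is not the actual experiment script; the host, ports, record format, and function names are all assumptions for illustration.

```python
# Hypothetical sketch: stream newline-delimited records to two AsterixDB
# socket feeds in parallel, one TCP connection per port. The host, ports,
# and record contents are placeholders, not the experiment's real values.
import socket
import threading


def send_records(host, port, records):
    """Open a TCP connection and stream one record per line, then close."""
    with socket.create_connection((host, port)) as sock:
        for rec in records:
            sock.sendall((rec + "\n").encode("utf-8"))


def feed_in_parallel(host, ports, record_batches):
    """Start one sender thread per feed port; each streams its own batch."""
    threads = [
        threading.Thread(target=send_records, args=(host, port, batch))
        for port, batch in zip(ports, record_batches)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

With two ports and the record stream split in half, both connections send concurrently, which matches the doubled ingestion rate observed when Asterix itself is not the bottleneck.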
On Tue, May 16, 2017 at 11:14 PM Mike Carey <[email protected]> wrote:
> Wow! @Steven, did you change the logging level simultaneously?
>
> On May 16, 2017 8:03 PM, "abdullah alamoudi" <[email protected]> wrote:
>
> > That is really cool.
> >
> > > On May 16, 2017, at 1:29 PM, Steven Jacobs <[email protected]> wrote:
> > >
> > > Hi all,
> > > I just wanted to leave a comment continuing the feed speed capabilities
> > > discussions that we've had recently. Using Abdullah's change to allow an
> > > upsert feed with no secondary index to skip the duplicate key check, I saw
> > > a huge increase in speed capabilities. In the first hour of the experiment,
> > > I jumped from inserting 17 million records without the change to 143
> > > million records with the change (that's around 40 thousand records per
> > > second). I haven't done more than hour-long experiments, but I recommend
> > > others with speed problems try using this change as well (it's in master
> > > already).
> > > Steven
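The figures quoted in the thread are self-consistent: 143 million records in the first hour works out to roughly 40 thousand records per second. A quick sanity check:

```python
# Sanity check of the throughput arithmetic quoted in the thread.
records = 143_000_000      # records ingested in the first hour
seconds = 3600             # one hour
rate = records / seconds   # records per second
assert 39_000 < rate < 41_000  # roughly the "40 thousand per second" quoted
```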
