Here is the output from the ES query bolt. "Total execution time for this batch: 179655" (in milliseconds) is the time measured around the .emit call. As you can see, emitting a batch of 14000 entries can take anywhere from 145231 to 180000 ms.

INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=14000 hits=14000 took=26172
40813 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-13_00-00-00
40889 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 782
40890 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 4000 records
59335 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
59335 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=28000 hits=14000 took=18033
238920 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-14_00-00-00
238990 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 179655
238990 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 8000 records
257633 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
257633 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=42000 hits=14000 took=17926
260932 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-15_00-00-00
402852 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-16_00-00-00
402865 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 145231
402865 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 2000 records
417427 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
417427 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=56000 hits=14000 took=13962
417459 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-17_00-00-00
417493 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 66
417493 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 6000 records
429629 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
429629 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=70000 hits=14000 took=12009
441208 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-18_00-00-00
744276 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-19_00-00-00
744277 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 314647
744277 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 0 records
779030 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
779030 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=84000 hits=14000 took=34631
785315 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-20_00-00-00
785332 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 6302
785332 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 4000 records
811859 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
811859 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=98000 hits=14000 took=25806
945938 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-21_00-00-00
960308 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 148449
960308 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 8000 records
983611 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
983611 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=112000 hits=14000 took=22698
983627 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-22_00-00-00
1002262 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-23_00-00-00
1002272 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 18661
1002272 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 2000 records
1021226 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
1021227 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=126000 hits=14000 took=18854
1110480 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-24_00-00-00
1188188 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 166961
1188188 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 6000 records
1204474 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
1204474 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=140000 hits=14000 took=15422
1204495 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-25_00-00-00
1270240 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the new key(hdfs folder) is 2014-07-26_00-00-00
1270240 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 65766
1270240 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 0 records
1284391 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
1284391 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=145861 hits=5861 took=14084
1284414 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 23
1284414 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 5861 records
1284417 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the total hits are 145861
1284417 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=145861 hits=0 took=0
1284417 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 0
1284418 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - the current batch has 5861 records
Total execution time: 1276946

On Fri, Jul 11, 2014 at 2:14 PM, Chen Wang <[email protected]> wrote:

> here you go:
> https://gist.github.com/cynosureabu/b317646d5c475d0d2e42
> It's actually pretty straightforward. The only thing worth mentioning is
> that I use another thread in the ES bolt to do the actual query and tuple
> emit.
> Thanks for looking.
> Chen
>
>
> On Fri, Jul 11, 2014 at 1:18 PM, Sam Goodwin <[email protected]>
> wrote:
>
>> Can you show some code? 200 seconds for 15K puts sounds like you're not
>> batching.
>>
>>
>> On Fri, Jul 11, 2014 at 12:47 PM, Chen Wang <[email protected]>
>> wrote:
>>
>>> typo in previous email
>>> The emit method in the query bolt takes about 200 (instead of 20)
>>> seconds.
>>>
>>>
>>> On Fri, Jul 11, 2014 at 11:58 AM, Chen Wang <[email protected]>
>>> wrote:
>>>
>>>> Hi, Guys,
>>>> I have a Storm topology with a single-threaded bolt querying a large
>>>> amount of data (from Elasticsearch) and emitting to an HBase bolt (10 threads),
>>>> which does some filtering and then emits to an Avro bolt (10 threads). The Avro bolt
>>>> simply emits the tuple to an Avro client, which will be received by two Flume
>>>> nodes and then sunk into HDFS. I am testing in local mode.
>>>>
>>>> In the query bolt, I am getting around 15000 entries in a batch. The
>>>> query itself takes about 4 seconds; however, the emit method in the query bolt
>>>> takes about 20 seconds. Does it mean that
>>>> the downstream bolts (HBase bolt and Avro bolt) cannot keep up with the
>>>> query bolt?
>>>>
>>>> How can I tune my topology to make this process as fast as possible? I
>>>> tried to increase the HBase threads to 20 but it does not seem to help.
>>>>
>>>> I use shuffleGrouping from the query bolt to the HBase bolt, and from the HBase
>>>> bolt to the Avro bolt.
>>>>
>>>> Thanks for any advice.
>>>> Chen
>>>>
>>>
>>
>
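
The per-batch figures in the log at the top of this message are easier to compare once they are pulled out of the surrounding text. Below is a minimal sketch in Python that does this, assuming the two message formats seen in that log ("took=N" and "Total execution time for this batch: N", both taken to be milliseconds). The function name `summarize`, the key names `took_ms` and `batch_ms`, and the `SAMPLE` constant are hypothetical names chosen for illustration; they are not part of the topology code.

```python
import re

# Patterns for the two per-batch figures as they appear in the log.
# "took=" is the value logged with each query result; the "batch" figure
# is the time the bolt logs around the emit call.
TOOK_RE = re.compile(r"took=(\d+)")
BATCH_RE = re.compile(r"Total execution time for this batch: (\d+)")


def _stats(values):
    """Return count/min/max/sum for a list of ints, or None if it is empty."""
    if not values:
        return None
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "sum": sum(values),
    }


def summarize(log_text):
    """Summarize the took= and per-batch times found in log_text."""
    took = [int(v) for v in TOOK_RE.findall(log_text)]
    batch = [int(v) for v in BATCH_RE.findall(log_text)]
    return {"took_ms": _stats(took), "batch_ms": _stats(batch)}


# Three entries copied from the log, used only as sample input.
SAMPLE = """\
59335 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - total=28000 hits=14000 took=18033
238990 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 179655
417493 [pool-1-thread-1] INFO com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner - Total execution time for this batch: 66
"""

if __name__ == "__main__":
    for name, stats in summarize(SAMPLE).items():
        print(name, stats)
```

The final "Total execution time: 1276946" line is deliberately not matched by `BATCH_RE`, so the overall run time is not counted as a batch. Running the same function over the full log would show the spread between the fastest and slowest emit for every batch, not only the ones quoted in the text.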
