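Here is the output from the ES query bolt. The value in "Total execution time
for this batch: 179655" is the time, in milliseconds, spent in the call to
.emit. As you can see, emitting a batch of 14000 entries often takes anywhere
from 145231 to 179655 ms, although some batches finish in under a second and
one takes 314647 ms.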


 INFO  com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=14000 hits=14000 took=26172
40813 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-13_00-00-00
40889 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 782
40890 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 4000 records
59335 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
59335 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=28000 hits=14000 took=18033
238920 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-14_00-00-00
238990 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 179655
238990 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 8000 records
257633 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
257633 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=42000 hits=14000 took=17926
260932 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-15_00-00-00
402852 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-16_00-00-00
402865 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 145231
402865 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 2000 records
417427 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
417427 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=56000 hits=14000 took=13962
417459 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-17_00-00-00
417493 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 66
417493 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 6000 records
429629 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
429629 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=70000 hits=14000 took=12009
441208 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-18_00-00-00
744276 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-19_00-00-00
744277 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 314647
744277 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 0 records
779030 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
779030 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=84000 hits=14000 took=34631
785315 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-20_00-00-00
785332 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 6302
785332 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 4000 records
811859 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
811859 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=98000 hits=14000 took=25806
945938 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-21_00-00-00
960308 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 148449
960308 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 8000 records
983611 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
983611 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=112000 hits=14000 took=22698
983627 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-22_00-00-00
1002262 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-23_00-00-00
1002272 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 18661
1002272 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 2000 records
1021226 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
1021227 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=126000 hits=14000 took=18854
1110480 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-24_00-00-00
1188188 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 166961
1188188 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 6000 records
1204474 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
1204474 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=140000 hits=14000 took=15422
1204495 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-25_00-00-00
1270240 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the new
key(hdfs folder) is 2014-07-26_00-00-00
1270240 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 65766
1270240 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 0 records
1284391 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
1284391 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=145861 hits=5861 took=14084
1284414 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 23
1284414 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 5861 records
1284417 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the total
hits are 145861
1284417 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=145861 hits=0 took=0
1284417 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 0
1284418 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - the
current batch has 5861 records
Total execution time: 1276946
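
To put numbers on this, a small script along the following lines can total the
emit and query times from the log. This is only a sketch: it assumes that
"took=" is the Elasticsearch query time in milliseconds, and the names
summarize, EMIT_RE, QUERY_RE and SAMPLE_LOG are made up for illustration. The
sample text is an excerpt of the log above, with the mail client's line
wrapping left in.

```python
import re

# Excerpt copied from the log above, including the wrapped lines.
SAMPLE_LOG = """\
 INFO  com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=14000 hits=14000 took=26172
40889 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 782
59335 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  -
total=28000 hits=14000 took=18033
238990 [pool-1-thread-1] INFO
 com.walmartlabs.targeting.storm.bolt.ElasticSearchQueryRunner  - Total
execution time for this batch: 179655
"""

# \s+ between words tolerates the line break inserted after "Total".
# The final "Total execution time: ..." line is not matched because it
# lacks "for this batch".
EMIT_RE = re.compile(r"Total\s+execution\s+time\s+for\s+this\s+batch:\s*(\d+)")
# Assumption: "took=" is the query time in milliseconds.
QUERY_RE = re.compile(r"took=(\d+)")


def summarize(log_text):
    """Return per-batch emit and query times (ms) plus their totals."""
    emit_ms = [int(v) for v in EMIT_RE.findall(log_text)]
    query_ms = [int(v) for v in QUERY_RE.findall(log_text)]
    total = sum(emit_ms) + sum(query_ms)
    return {
        "emit_ms": emit_ms,
        "query_ms": query_ms,
        "emit_total_ms": sum(emit_ms),
        "query_total_ms": sum(query_ms),
        # Fraction of the measured time spent in emit; 0.0 for an empty log.
        "emit_share": sum(emit_ms) / total if total else 0.0,
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_LOG))
```

Pasting the full log into SAMPLE_LOG gives the same per-batch lists and totals
for the whole run, which makes it easier to compare emit time against query
time batch by batch.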


On Fri, Jul 11, 2014 at 2:14 PM, Chen Wang <[email protected]>
wrote:

> Here you go:
> https://gist.github.com/cynosureabu/b317646d5c475d0d2e42
> It's actually pretty straightforward. The only thing worth mentioning is
> that I use another thread in the ES bolt to do the actual query and tuple
> emit.
> Thanks for looking.
> Chen
>
>
>
> On Fri, Jul 11, 2014 at 1:18 PM, Sam Goodwin <[email protected]>
> wrote:
>
>> Can you show some code? 200 seconds for 15K puts sounds like you're not
>> batching.
>>
>>
>> On Fri, Jul 11, 2014 at 12:47 PM, Chen Wang <[email protected]>
>> wrote:
>>
>>> Typo in my previous email:
>>> the emit method in the query bolt takes about 200 (instead of 20)
>>> seconds.
>>>
>>>
>>> On Fri, Jul 11, 2014 at 11:58 AM, Chen Wang <[email protected]>
>>> wrote:
>>>
>>>> Hi, Guys,
>>>> I have a Storm topology with a single-threaded bolt that queries a large
>>>> amount of data from Elasticsearch and emits to an HBase bolt (10 threads),
>>>> which does some filtering and then emits to an Avro bolt (10 threads). The
>>>> Avro bolt simply emits each tuple to an Avro client; the tuples are
>>>> received by two Flume nodes and then sunk into HDFS. I am testing in
>>>> local mode.
>>>>
>>>> In the query bolt, I am getting around 15000 entries in a batch. The
>>>> query itself takes about 4 seconds; however, the emit method in the query
>>>> bolt takes about 20 seconds. Does that mean that the downstream bolts
>>>> (HBase bolt and Avro bolt) cannot keep up with the query bolt?
>>>>
>>>> How can I tune my topology to make this process as fast as possible? I
>>>> tried increasing the HBase bolt to 20 threads, but that does not seem to
>>>> help.
>>>>
>>>> I use shuffleGrouping from the query bolt to the HBase bolt, and from the
>>>> HBase bolt to the Avro bolt.
>>>>
>>>> Thanks for any advice.
>>>> Chen
>>>>
>>>
>>>
>>
>
