On Jun 4, 2013, at 16:00, baojian Zhou wrote:

> thank you Mr Matthieu,
> if I want to configure load shedding, do I just need to add parameters
> such as s4.sender.workQueueSize and s4.sender.maxRate in the
> default.s4.core.properties file?
These parameters are for a throttling remote sender (which might also do
load shedding). To use them, you'd need to define an overriding module
that replaces the binding for RemoteSendersExecutorServiceFactory with a
factory for a throttling executor. See here for customization:
http://incubator.apache.org/s4/doc/0.6.0/configuration/

But also note that you can perform load shedding in the consumer. I'd
suggest having a look here:
http://incubator.apache.org/s4/doc/0.6.0/event_dispatch/
and also at the code; that's the best way to clearly understand the
framework.

Matthieu

> 2013/6/4 Matthieu Morel <mmo...@apache.org>
>
>> On Jun 4, 2013, at 13:09, baojian Zhou wrote:
>>
>>> hi all,
>>> I want to run an experiment with the benchmark package and test the
>>> injection rate and received events, but I am confused by the following:
>>>
>>> 1. I ran two tests with the same parameters, but each time the number
>>> of input events in the injector was different:
>>> the first: 3324831
>>> the second: 9808773
>>>
>>> The two experiments used the same parameters:
>>> -p=s4.adapter.output.stream=inputStream,s4.benchmark.keysCount=10000,s4.benchmark.testIterations=1000000,s4.benchmark.injector.iterationsBeforePause=1000,s4.benchmark.pauseTimeMs=20,s4.benchmark.injector.parallelism=4
>>> -c=testCluster2 -appName=adapter -zk=node4:2181
>>>
>>> I would expect the injectors to have the same rate every time and to
>>> finish after a definite number of events. Can I achieve that, and how?
>>
>> Note that the rate is not deterministic.
>>
>> Are you sure you are correctly reading the number of injected events?
>> If you lose events, that could be due to load shedding (if you
>> configured the platform to use it) or to injecting before all the
>> consumers are ready.
>>
>>> 2. I am also not very clear about the dequeued csv files: do they
>>> stand for the number of dequeued events in a given time interval?
>>> If so, how can I get the number of events in the streams in every
>>> given time interval?
>>
>> The output provides the total dequeued events, the average over the
>> whole run, and 1-, 5-, and 15-minute averages. Please have a look at
>> the metrics documentation for more info.
>>
>> Matthieu
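[Editor's note] The throttling/shedding behavior discussed above can be sketched independently of S4 with a bounded-queue executor whose rejection handler simply drops overflow work. This is a minimal illustration, not S4's actual implementation: the queue capacity of 2 is a hypothetical stand-in for a setting like s4.sender.workQueueSize, and in S4 the real wiring would go through a Guice overriding module for RemoteSendersExecutorServiceFactory, as the configuration docs describe.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SheddingExecutorSketch {

    /** Runs the demo and returns how many tasks were shed (dropped). */
    public static int runDemo() throws InterruptedException {
        AtomicInteger dropped = new AtomicInteger();

        // One worker thread, bounded work queue of 2 (hypothetical
        // stand-in for s4.sender.workQueueSize). On overflow, the
        // rejection handler sheds the task instead of blocking.
        ThreadPoolExecutor exec = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(2),
                (r, e) -> dropped.incrementAndGet());

        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch release = new CountDownLatch(1);

        // Occupy the single worker so later tasks must queue.
        exec.execute(() -> {
            started.countDown();
            try {
                release.await();
            } catch (InterruptedException ignored) {
            }
        });
        started.await(); // worker is now busy

        // Submit 5 tasks: 2 fit in the queue, 3 are shed.
        for (int i = 0; i < 5; i++) {
            exec.execute(() -> { });
        }

        release.countDown();
        exec.shutdown();
        exec.awaitTermination(5, TimeUnit.SECONDS);
        return dropped.get();
    }
}
```

With the worker blocked and a queue of 2, submitting 5 tasks sheds exactly 3 of them; an S4-side throttler would additionally cap the submission rate (cf. s4.sender.maxRate).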
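[Editor's note] Regarding the last question: if the dequeued csv reports a cumulative total sampled at a fixed period (as the "total dequeued events" metric suggests), the number of events in each interval can be recovered as successive differences of that counter. A minimal sketch with hypothetical sample values:

```java
public class IntervalCounts {

    /**
     * Given cumulative dequeued counts sampled at a fixed period,
     * returns the number of events dequeued within each interval,
     * i.e. the successive differences of the cumulative counter.
     */
    public static long[] perInterval(long[] cumulative) {
        long[] diffs = new long[cumulative.length - 1];
        for (int i = 0; i < diffs.length; i++) {
            diffs[i] = cumulative[i + 1] - cumulative[i];
        }
        return diffs;
    }
}
```

For example, cumulative samples {0, 1200, 2600, 2600, 4100} yield per-interval counts {1200, 1400, 0, 1500}.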