I'm using the same transactional Kafka Trident spout in a production Trident topology and haven't experienced any problems with ZooKeeper. Could you post your ZooKeeper configuration and any custom Storm configuration here?
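For comparison, the autopurge and snapCount settings suggested further down this thread would look roughly like this in zoo.cfg. This is only a sketch; the values are illustrative and need to be tuned to your actual update rate:

    # zoo.cfg -- illustrative values only
    tickTime=2000
    # your current data directory; the ZooKeeper admin guide recommends a
    # dedicated device rather than /tmp in production
    dataDir=/tmp/zookeeper
    clientPort=2181

    # keep only the 3 most recent snapshots (and their transaction logs)...
    autopurge.snapRetainCount=3
    # ...and run the purge task every hour (0 = disabled, which is the default)
    autopurge.purgeInterval=1

    # roll to a new snapshot/transaction log only after this many transactions
    # (default 100000); raising it means fewer, larger files
    snapCount=100000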
Bear in mind that continuous updating of ZooKeeper nodes is normal for the transactional spout. The spout must continuously update the status of the emitted offsets and the related Trident transactional metadata in ZooKeeper. It's a perfectly valid use of ZooKeeper. However, I find it hard to believe that ZooKeeper could saturate your disk I/O all by itself. Are you sure you haven't enabled debug- or trace-level logging, or some other related feature?

On Wednesday, May 28, 2014, Andres Gomez <[email protected]> wrote:

> Thanks for your help, but my problem is that my disk is saturated; I do have free disk space, fortunately :) … ZooKeeper writes to /tmp/zookeeper/version-2/ continuously:
>
> [root@rbc01 ~]# ls /tmp/zookeeper/version-2/
> log.1           log.171c90      log.1fd4c1      log.27e5b7      log.7d36b
> snapshot.108581 snapshot.184640 snapshot.210601 snapshot.29014a snapshot.93af1
> log.108585      log.184647      log.210609      log.29014f      log.93af3
> snapshot.11f8d2 snapshot.191e98 snapshot.220e16 snapshot.2a6dc5 snapshot.aaf12
> log.11f8d4      log.191e9c      log.220e1c      log.2a6dca      log.aaf17
> snapshot.130176 snapshot.1a1aac snapshot.2355c3 snapshot.2b3e71 snapshot.b766c
> log.130178      log.1a1ab0      log.2355ca      log.2b3e73      log.b7670
> snapshot.14292  snapshot.1b5702 snapshot.246063 snapshot.370f4  snapshot.c4a29
> log.14294       log.1b570b      log.246065      log.370f8       log.c4a33
> snapshot.14628d snapshot.1ca89c snapshot.25cc31 snapshot.43aed  snapshot.db9ce
> log.146295      log.1ca89e      log.25cc35      log.43af1       log.db9d3
> snapshot.157264 snapshot.1dcb1b snapshot.269138 snapshot.5a93f  snapshot.e9c85
> log.157268      log.1dcb23      log.26913b      log.5a943       log.e9c8d
> snapshot.164f70 snapshot.1ec404 snapshot.27016  snapshot.6b8f7  snapshot.fb9f5
> log.164f74      log.1ec406      log.27018       log.6b8fa       log.fb9f8
> snapshot.171c8e snapshot.1fd4be snapshot.27e5b3 snapshot.7d363
>
> With ftop I can see how the zookeeper process writes the log files:
>
> Wed May 28 16:19:32 2014                                        ftop 1.0
> Processes: 106 total, 0 unreadable       Press h for help, o for options
> Open Files: 1661 regular, 13 dir, 203 chr, 0 blk, 473 pipe, 741 sock, 227 misc
>
> _  PID    #FD  USER       COMMAND
> -- 29016  127  zookeeper  java -Dzookeeper.log.dir=. -Dzookeeper.root.logger=INFO,CONSOLE -cp /opt/rb/var/zookeeper/bin/../build/classes:/opt/rb/var/zookeeper/bin/../build/lib...
> --        127  --w >> 13.2M/64.0M  /tmp/zookeeper/version-2/log.2c6048  [ 92890/s, 9:32 remain ]
>
> And with atop I can see that my disk is over 91% busy, and that it is the zookeeper process producing that load:
>
> DSK | vdc | busy 91% | read 0 | write 927 | KiB/r 0 | KiB/w 4 | MBr/s 0.00 | MBw/s 0.75 | avq 1.00 | avio 4.88 ms
>
> Regards,
>
> Andres
>
> On 28/05/2014, at 04:01, Srinath C <[email protected]> wrote:
>
> Apart from the autopurge options, also set the number of transactions after which a snapshot is taken (snapCount). This number should be set depending on the rate of updates to ZooKeeper.
>
> On Wed, May 28, 2014 at 12:32 AM, Danijel Schiavuzzi <[email protected]> wrote:
>
>> You have to configure ZooKeeper to automatically purge old logs. ZooKeeper's logs tend to grow very quickly in size, so you should enable the autopurge option in zoo.cfg or they will eat your available disk space. I suggest you read ZooKeeper's Installation and Maintenance Guide.
>>
>> On Tuesday, May 27, 2014, Andres Gomez <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I use Storm with Kafka! Actually, I use a topology with Trident Storm, in which I use:
>>>
>>> 1- a transactional Trident spout
>>> 2- some functions
>>>
>>> I have a problem in my ZooKeeper cluster, because Storm continually writes at the zkPath --> /trasactional/ and this generates a lot of logs and snapshots, which fill up the disk quickly! I looked at the logs and I can see that Storm writes the state of the transactional spout there (the commit of the Kafka offset).
>>>
>>> Do you have a similar problem? Do you know how I could fix it? Is there any way to write less frequently to ZooKeeper?
>>>
>>> Regards,
>>>
>>> Andres
>>
>>
>> --
>> Danijel Schiavuzzi
>>
>> E: [email protected]
>> W: www.schiavuzzi.com
>> T: +385989035562
>> Skype: danijels7


--
Danijel Schiavuzzi

E: [email protected]
W: www.schiavuzzi.com
T: +385989035562
Skype: danijels7
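P.S. Regarding the original question about writing to ZooKeeper less frequently: purely as an illustration (this is a sketch, not something from this thread; it assumes the Storm 0.9.x Java API, and the hostnames and values are made up), these are the Storm-side settings that control how often, and where, Trident stores its transactional metadata:

    import backtype.storm.Config;
    import java.util.Arrays;

    public class TridentZkSettingsSketch {
        public static Config sketch() {
            Config conf = new Config();

            // Trident emits one batch per interval, and each batch results in a
            // transactional-metadata update in ZooKeeper. Raising the interval
            // (default 500 ms) lowers the ZooKeeper write rate at the cost of
            // higher end-to-end latency.
            conf.put(Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS, 2000);

            // Optionally keep the transactional state on a dedicated ZooKeeper
            // ensemble (hostnames here are made up), so the spout's constant
            // updates don't load the cluster ZooKeeper.
            conf.put(Config.TRANSACTIONAL_ZOOKEEPER_SERVERS, Arrays.asList("zk-txn-1", "zk-txn-2"));
            conf.put(Config.TRANSACTIONAL_ZOOKEEPER_PORT, 2181);
            conf.put(Config.TRANSACTIONAL_ZOOKEEPER_ROOT, "/transactional");

            return conf;
        }
    }

Whether raising the batch emit interval is acceptable depends entirely on your latency requirements, so treat it as a knob to experiment with rather than a fix.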
