Hi Irek,

What you have done is exactly what I want. I was running my topology in LocalCluster, but now I have submitted it to the Storm cluster:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/etc/apache-storm-0.9.3/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/stuser/backup/pof.analytics.messaging/kafka-storm-ingress/target/kafka-storm-ingress-0.0.1-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/Static
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
DB connected .....
531 [main] INFO backtype.storm.StormSubmitter - Jar not uploaded to master yet. Submitting jar...
542 [main] INFO backtype.storm.StormSubmitter - Uploading topology jar target/kafka-storm-ingress-0.0.1-SNAPSHOT-jar-with-dependencies.jar to assigned location: /app/storm/nimbus/inbox/s r
739 [main] INFO backtype.storm.StormSubmitter - Successfully uploaded topology jar to assigned location: /app/storm/nimbus/inbox/stormjar-f3b2a8bd-0d16-4ba5-9d94-51b3ecf53e5b.jar
740 [main] INFO backtype.storm.StormSubmitter - Submitting topology 2 in distributed mode with conf {"topology.max.task.parallelism":5,"nimbus.host":"10.100.70.128","topology.workers":2, ":6627,"storm.zookeeper.servers":["10.100.70.128"],"topology.trident.batch.emit.interval.millis":2000}
842 [main] INFO backtype.storm.StormSubmitter - Finished submitting topology: 2

but nothing shows up in the UI. That is one issue.

Back to batch mode: when you do the batch copy, my assumption is that you accumulate tuples in a byte array, copy/multi-insert them into the DB, clear the array, and reload. Is that the way, or is there an existing API I can use?

Thanks,
Alec

On Tue, Dec 9, 2014 at 2:04 PM, Irek Khasyanov <[email protected]> wrote:

> > Do I need to make bulk copy?
>
> It depends. If your topology fails, the Kafka spout will start reading from
> the last known offset. If you then have too much data to write, inserting
> one row at a time can be a bottleneck.
>
> You can test it, actually: stop the topology, write around 10000 messages
> to Kafka, and start the topology. In the Storm UI you will see the capacity
> of the writer bolt. If it is colored red and over 1.0, that is your
> bottleneck.
>
> We have a Kafka to HP Vertica stream. Vertica doesn't like 1-row inserts,
> so we added batches of 10K rows. With 4 workers everything looks great.
>
> On 10 December 2014 at 00:34, Sa Li <[email protected]> wrote:
>
>> Hello, all
>>
>> I have a question. As I posted in several threads before, I am using
>> storm-rdbms to write into a PostgreSQL DB; the data is collected from
>> kafkaSpout, and it works. Since it inserts into the DB as soon as it
>> gets a tuple, one row per insert operation, I am concerned about whether
>> this type of consuming is fast enough and whether it adds overhead.
>>
>> Do I need to make bulk copy?
>>
>> thanks
>>
>> Alec
>
> --
> With best regards, Irek Khasyanov.
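
For the batching question in this thread, the accumulate, flush, and clear cycle can be sketched in plain Java roughly as follows. This is only an illustration under assumptions: `BatchBuffer` and its `sink` callback are hypothetical names and are not part of Storm or storm-rdbms. In a real writer bolt the sink would probably wrap JDBC `PreparedStatement.addBatch()` and `executeBatch()` (or a PostgreSQL COPY), tuples would be acked only after a successful flush, and a periodic flush (for example on Storm tick tuples) would be needed so a partly filled batch does not sit in memory indefinitely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: buffers rows and hands them to a sink in batches. */
class BatchBuffer<T> {
    private final int maxSize;                 // flush threshold, e.g. 10000 rows
    private final Consumer<List<T>> sink;      // assumed to perform the multi-row insert
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int maxSize, Consumer<List<T>> sink) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
        this.sink = sink;
    }

    /** Add one row; flush automatically once the batch is full. */
    void add(T row) {
        pending.add(row);
        if (pending.size() >= maxSize) {
            flush();
        }
    }

    /** Send whatever is buffered to the sink, then clear the buffer. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        // Copy so the sink owns its list and is unaffected by clear().
        sink.accept(new ArrayList<>(pending));
        pending.clear();
    }

    /** Number of rows currently waiting for the next flush. */
    int size() {
        return pending.size();
    }
}
```

A bolt would call `add(row)` from `execute()` and `flush()` on a timer and on shutdown. Buffering typed rows rather than a raw byte array keeps the sink free to choose between a multi-row INSERT and COPY.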
