I can't help with the Storm UI issue; it could be caused by many things.

> Again back to batch mode, when you doing the batch copy, my assumption is, accumulate tuples in a byte array[], and copy/multi-insert into DB, clear array and reload ......, is that the way or an existing API I can use?

You can use a LinkedBlockingQueue<Tuple> and store the tuples themselves, not bytes.
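A minimal sketch of that idea, in plain Java so it runs without Storm on the classpath. TupleBatcher, its batch size and the writer callback are hypothetical names made up for illustration, not an existing Storm API. Inside a bolt, T would be the Storm Tuple type and the writer would do the multi-row insert or COPY.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of Storm): buffers items in a
 * LinkedBlockingQueue and hands them to a writer in batches.
 */
public class TupleBatcher<T> {
    private final LinkedBlockingQueue<T> queue = new LinkedBlockingQueue<>();
    private final int batchSize;
    private final Consumer<List<T>> writer; // assumption: e.g. a JDBC multi-row insert

    public TupleBatcher(int batchSize, Consumer<List<T>> writer) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.writer = writer;
    }

    /** Queue one item; flush as soon as a full batch is waiting. */
    public void add(T item) {
        queue.add(item);
        if (queue.size() >= batchSize) {
            flush();
        }
    }

    /** Drain up to batchSize items and pass them to the writer in one call. */
    public void flush() {
        List<T> batch = new ArrayList<>(batchSize);
        queue.drainTo(batch, batchSize);
        if (!batch.isEmpty()) {
            writer.accept(batch);
        }
    }

    /** Number of items still waiting to be written. */
    public int pending() {
        return queue.size();
    }
}
```

In a bolt you would call add(tuple) from execute(), ack the tuples only after the writer has succeeded, and also call flush() on a timer so a partial batch does not wait forever.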
Good examples are here: https://github.com/hmsonline/storm-cassandra and http://hortonworks.com/blog/apache-storm-design-pattern-micro-batching/

On 10 December 2014 at 03:13, Sa Li <[email protected]> wrote:

> Hi, Irek
>
> What you have done is exactly what I want. I was running my topology in
> LocalCluster, but when I submit it to the Storm cluster:
>
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in
> [jar:file:/etc/apache-storm-0.9.3/lib/logback-classic-1.0.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in
> [jar:file:/home/stuser/backup/pof.analytics.messaging/kafka-storm-ingress/target/kafka-storm-ingress-0.0.1-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/Static
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type
> [ch.qos.logback.classic.util.ContextSelectorStaticBinder]
> DB connected .....
> 531 [main] INFO backtype.storm.StormSubmitter - Jar not uploaded to
> master yet. Submitting jar...
> 542 [main] INFO backtype.storm.StormSubmitter - Uploading topology jar
> target/kafka-storm-ingress-0.0.1-SNAPSHOT-jar-with-dependencies.jar to
> assigned location: /app/storm/nimbus/inbox/s
> r
> 739 [main] INFO backtype.storm.StormSubmitter - Successfully uploaded
> topology jar to assigned location:
> /app/storm/nimbus/inbox/stormjar-f3b2a8bd-0d16-4ba5-9d94-51b3ecf53e5b.jar
> 740 [main] INFO backtype.storm.StormSubmitter - Submitting topology 2 in
> distributed mode with conf
> {"topology.max.task.parallelism":5,"nimbus.host":"10.100.70.128","topology.workers":2,
>
> ":6627,"storm.zookeeper.servers":["10.100.70.128"],"topology.trident.batch.emit.interval.millis":2000}
> 842 [main] INFO backtype.storm.StormSubmitter - Finished submitting
> topology: 2
>
> but I find nothing shown in the UI; this is one issue.
> Again, back to batch mode: when you do the batch copy, my assumption is
> that you accumulate tuples in a byte array[], copy/multi-insert into the
> DB, clear the array and reload ...... Is that the way, or is there an
> existing API I can use?
>
> thanks
>
> Alec
>
> On Tue, Dec 9, 2014 at 2:04 PM, Irek Khasyanov <[email protected]> wrote:
>
>> > Do I need to make bulk copy?
>>
>> It depends. If your topology fails, the Kafka spout will start reading
>> from the last known offset, so you may have a lot of data to write, and
>> inserting one row at a time can become a bottleneck.
>>
>> You can actually test it: stop the topology, write around 10000
>> messages to Kafka and start the topology again. In the Storm UI you
>> will see the capacity of the writer bolt. If it is colored red and over
>> 1.0, that bolt is your bottleneck.
>>
>> We have a Kafka to HP Vertica stream. Vertica doesn't like 1-row
>> inserts, so we added batches of 10K rows. With 4 workers everything
>> looks great.
>>
>> On 10 December 2014 at 00:34, Sa Li <[email protected]> wrote:
>>
>>> Hello, all
>>>
>>> I have a question here. As I posted in several threads before, I am
>>> using storm-rdbms to write into a PostgreSQL DB; the data is collected
>>> from a KafkaSpout, and it works. It inserts into the DB as soon as it
>>> gets a tuple, one row per insert operation. My concern is whether this
>>> type of consuming is fast enough, and whether it will add overhead.
>>>
>>> Do I need to make bulk copy?
>>>
>>> thanks
>>>
>>> Alec
>>
>> --
>> With best regards, Irek Khasyanov.

--
With best regards, Irek Khasyanov.
