Re: Storm performing very slow

2014-09-22 Thread Michael Rose
Storm is not your bottleneck. Check your Storm code to 1) ensure you're parallelizing your writes and 2) you're batching writes to your external resources if possible. Some quick napkin math shows you only doing 110 writes/s, which seems awfully low. Michael Rose (@Xorlev <https:

Re: Question on failing ack

2014-08-25 Thread Michael Rose
Hi Kushan, Depending on the Kafka spout you're using, it could be doing different things when it failed. However, if it's running reliably, the Cassandra insertion failures would have forced a replay from the spout until they had completed. Michael Rose (@Xorlev <https://twitt

Re: Implementing a barrier mechanism in storm

2014-08-01 Thread Michael Rose
g > one message Map? > Can you send the name of the bolts that have been processed the message to > the final bolt, and in the final bolt to check if the the list of all > processing bolts is the same? > > What is the best approach here? > > Thanks . > Regards, > Flor

Re: Implementing a barrier mechanism in storm

2014-07-31 Thread Michael Rose
It's another case of a streaming join. I've done this before, there aren't too many gotchas, other than you need a datastructure which purges stale unresolved joins beyond the tuple timeout time (I used a Guava cache for this). Michael Rose (@Xorlev <https://twitter.com/xorlev

Re: Storm useable for video processing??

2014-07-31 Thread Michael Rose
There's no reason you couldn't. If you look in the archives there was someone else who'd managed to do some video processing with Storm. If you make things work, consider sharing a blog post -- that'd be really great stuff! :) Michael Rose (@Xorlev <https://twitter.com/x

Re: Fetching data from a REST api into storm.

2014-07-24 Thread Michael Rose
the results downstream when the thread returns. Just make sure that you're synchronizing your OutputCollector when emitting from a multithreaded context. Hope that helps. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcont

Re: Acking is delayed by 5 seconds (in disruptor queue ?)

2014-07-18 Thread Michael Rose
Worth clarifying for anyone else in this thread that a LBQ separating production from consumption is not a default thing in Storm, it's something we cooked up to prefetch elements from slow/batching resources. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platfo

Re: Acking is delayed by 5 seconds (in disruptor queue ?)

2014-07-18 Thread Michael Rose
essage ready for processing. An alternative to this is the https://github.com/apache/incubator-storm/blob/master/storm-core/src/jvm/backtype/storm/spout/NothingEmptyEmitStrategy.java if you do see an impact on your throughput--but I've never needed this. Michael Rose (@Xorlev <https://twitter.com/x

Re: Acking is delayed by 5 seconds (in disruptor queue ?)

2014-07-18 Thread Michael Rose
her thread dealing with IO and asynchronously feeding a concurrent data structure the spout can utilize. For example, in our internal Amazon SQS client our IO thread continuously fetches up to 10 messages per get and shoves them into a LinkedBlockingQueue (until full, then it blocks the IO thread o

Re: Acking is delayed by 5 seconds (in disruptor queue ?)

2014-07-18 Thread Michael Rose
Run your producer code in another thread to fill a LBQ, poll that with nextTuple instead. You should never be blocking yourself inside a spout. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.

Re: Distribute Spout output among all bolts

2014-07-16 Thread Michael Rose
rkers will receive an executor but the others will not. It sounds like for your case, shuffle+parallelism is more than sufficient. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Wed, Jul

Re: Distribute Spout output among all bolts

2014-07-16 Thread Michael Rose
ot;b0" a 'RouterBolt', then having bolt b0 round-robin the received tuples between two streams, then have b1 and b2 shuffle over those streams instead. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/>

Re: Does Storm support JDK7 ??

2014-07-14 Thread Michael Rose
You need only run the existing releases on JDK 7 or 8. On Jul 14, 2014 7:15 AM, "Haralds Ulmanis" wrote: > Actually now I've customized a bit storm and recompiled as I needed some > changes in it. > But initially I just downloaded and run. > > > > > On 14 July 2014 14:02, Adrianos Dadis wrote:

Re: Choosing where your tasks run in Storm

2014-07-04 Thread Michael Rose
efore, the existing schedulers are in Clojure. It's not impossible to do for sure, but like Andrew said it might well just be easier to have separate clusters that share ZK clusters. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://ww

Re: Storm's performance limits to 1000 tuples/sec

2014-06-25 Thread Michael Rose
On serialization, make sure your custom classes are registered with Kryo otherwise it may use Java serialization (slow) On Jun 25, 2014 10:30 AM, "Robert Turner" wrote: > Serialisation across workers might be your problem, if you can use the > "localOrShuffle" grouping and arrange that the number

Re: Storm trident, multiple workers nothing happens

2014-06-17 Thread Michael Rose
In a single worker, you don't incur serialization or network overhead. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Tue, Jun 17, 2014 at 11:09 PM, Romain Leroux wrot

Re: Topology Memory leaks

2014-06-17 Thread Michael Rose
50 }, "mapped":{ "count":0, "memoryUsed":0, "totalCapacity":0 } }, Do you have a similar bump in direct buffer counts? Michael Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform

RE: Extracting Performance Metrics

2014-06-16 Thread Michael Rose
What kind of issues does Metrics have that leads you to recommend HdrHistogram? On Jun 16, 2014 6:57 PM, "Dan" wrote: > Be careful when using Coda Hale's Metrics package when measuring latency. > Consider using Gil Tene's > High Dynamic Range Histogram instead: > > http://hdrhistogram.github.io/H

Re: hot swap of topology

2014-06-16 Thread Michael Rose
unity to halt and reactivate) Kill storm-topology-1 This works for us as we don't have any critical in-memory state that isn't checkpointed to a persistent store, and the vast majority of our work can be replayed safely. Michael Michael Rose (@Xorlev <https://twitter.com/xorlev>)

Re: Non-DRPC topologies and number of assigned workers

2014-06-11 Thread Michael Rose
A topology can run in as many workers as you assign at launch time, DRPC or not. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Wed, Jun 11, 2014 at 1:08 PM, Nima Movafaghrad <

Re: Storm initialization on startup

2014-06-11 Thread Michael Rose
bolt's prepare methods. The first executor to start per JVM wins. Storm has some gotchas (bolts are serialized, so do your init in prepare()), but in general things that work in a normal Java application will end up working in Storm. Michael Michael Rose (@Xorlev <https://twitter.co

Re: resource aware task scheduling in Storm cluster

2014-06-10 Thread Michael Rose
Out of curiosity, what kind of changes have you been making? Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Tue, Jun 10, 2014 at 9:36 PM, Jason Jackson wrote: > Hi Alex, &g

Re: Storm/Hbase Bolt Performance

2014-06-09 Thread Michael Rose
Just as a side note, I've seen capacity numbers >6. The calculation for capacity is somewhat flawed, and does not represent a true percentage capacity, merely a relative measure next to your other bolts. Maybe that's something we can improve, I'll log a JIRA if there isn'

Re: [VOTE] Storm Logo Contest - Final Round

2014-06-09 Thread Michael Rose
#9 - 5pts Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Mon, Jun 9, 2014 at 2:49 PM, Bryan Stone < bryan.st...@synapse-wireless.com> wrote: > #9 – 5 pts > > *B

Re: Are Real-Time Game Servers a good use case for Storm

2014-06-08 Thread Michael Rose
If your game server is just running business logic, a totally stateless set of servers is really the way to go. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Sun, Jun 8, 2014 at 9:07 P

Re: Order of Bolt definition, catching ""that subscribes from non-existent component [ ...]"

2014-06-06 Thread Michael Rose
You can have a loop on a different stream. It's not always the best thing to do (deadlock possibilities from buffers) but we have a production topology that has that kind of pattern. In our case, one bolt acts as a coordinator for recursive search. Michael Rose (@Xorlev <https://twi

Re: Is there any way for my application code to get notified after it gets deserialized on a worker node and before spouts/bolts are opened/prepared ?

2014-06-02 Thread Michael Rose
lized) { // do stuff initialized = true; } } } Until there's a set of lifecycle hooks, that's about as good as I've cared to make it. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http

Re: Is there any way for my application code to get notified after it gets deserialized on a worker node and before spouts/bolts are opened/prepared ?

2014-06-02 Thread Michael Rose
ing a task hook for init code (e.g. properties / Guice). Check out BaseTaskHook, it's easily extendible and can be included pretty easily too: stormConfig.put(Config.TOPOLOGY_AUTO_TASK_HOOKS, Lists.newArrayList(MyTaskHook.class.getName())); Michael Rose (@Xorlev <https://twitter.com/x

Re: New Committer/PPMC Member: Michael G. Noll

2014-05-29 Thread Michael Rose
Congrats! Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Thu, May 29, 2014 at 3:01 PM, Derek Dagit wrote: > Welcome Michael! > > -- > Derek > > > On 5/29

Re: Workers constantly restarted due to session timeout

2014-05-29 Thread Michael Rose
Do you have GC logging turned on? With a 60GB heap I could pretty easily see stop-the-world GCs taking longer than the session timeout. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On

Re: Which daemon tool for Storm nodes?

2014-05-21 Thread Michael Rose
We use upstart. Supervisord would also work. Just anything to keep an eye on it and restart it if it dies (a very rare occurrence). Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On

Re: Storm with video/audio streams

2014-05-19 Thread Michael Rose
No reason why you couldn't do it, but as far as I know it hasn't been done before. You can send any kind of serializable data into a topology. You'd probably need to emit frames from the spout. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engine

Re: [VOTE] Storm Logo Contest - Round 1

2014-05-16 Thread Michael Rose
#9 - 3 #5 - 2 Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Fri, May 16, 2014 at 10:51 AM, Benjamin Black wrote: > #11 - 5 pts > > > On Fri, May 16, 2014 at 7:43

Re: How resilient is Storm w/o supervision

2014-05-02 Thread Michael Rose
ios). For development, there shouldn't be an issues foregoing supervision. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Fri, May 2, 2014 at 12:41 PM, P. Taylor Goetz wrote: &g

Re: Machine specs

2014-04-30 Thread Michael Rose
In AWS, we're fans of c1.xlarges, m3.xlarges, and c3.2xlarges, but have seen Storm successfully run on cheaper hardware. Our Nimbus server is usually bored on a m1.large. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcont

Re: Is it a must to have /etc/hosts mapping or a DNS in a multinode setup?

2014-04-29 Thread Michael Rose
We don't use /etc/hosts mapping, we only use hostnames / ips. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Tue, Apr 29, 2014 at 8:29 AM, Derek Dagit wrote: > I have not tr

Re: PDF processing use case in storm!!

2014-04-28 Thread Michael Rose
delegation of tuples. You could really abuse Storm if you wanted and use it as a distributed application container with threadpools, I've done it. But you're really going to see a better experience out of a webservice if it's live-mode requests. Michael Rose (@Xorlev <https:/

Re: sending the same stream to two different bolts, possible?

2014-04-25 Thread Michael Rose
That's correct, the stream will be duplicated in the above case. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Fri, Apr 25, 2014 at 12:44 PM, David Novogrodsky < david.n

Re: Passing command line arguments to storm

2014-04-17 Thread Michael Rose
You could set the args in the StormConfig, then they'll show up in the UI per-topology. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Thu, Apr 17, 2014 at 7:16 PM, Cody A. Ra

Re: Passing command line arguments to storm

2014-04-17 Thread Michael Rose
Yes. Ultimately, that runs the main method of MyTopology, so just like any other main method you get String[] args. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Thu, Apr 17, 201

Re: Working with JSON

2014-03-31 Thread Michael Rose
It's much more efficient to deserialize it once then pass around POJOs. JSON serialization is slow compared to Kryo. Our topologies tend to take in JSON, then emit JSON to external systems at later phases, but all intermediate stages are POJOs. Michael Rose (@Xorlev <https://twitter.co

Re: Server load - Topology optimization

2014-03-19 Thread Michael Rose
messageId is any unique identifier of the message, such that when ack is called on your spout you're returned the identifier to then mark the work as complete in the source in the case it supports replay. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer,

Re: Server load - Topology optimization

2014-03-18 Thread Michael Rose
if an event is available (and meets your criteria). Storm will handle rate limiting the spouts with sleeps. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Tue, Mar 18, 2014 at 5:14 PM

Re: Best way to get JMX access to storm metrics for spout tuples emitted / acked | bolt tuples emitted, latency etc.

2014-03-17 Thread Michael Rose
ar to what Ooyala did, customized for our specific needs, it's an excellent pattern. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Mon, Mar 17, 2014 at 6:21 PM, Chris Bedford wrote

Re: Processing ack'd Tuples?

2014-03-05 Thread Michael Rose
(Config.TOPOLOGY_AUTO_TASK_HOOKS, listOfHooks); OR In prepare(), topologyContext.addTaskTook() Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Wed, Mar 5, 2014 at 1:12 PM, Brian O'Neill wrot

Re: Storm stream grouping examples

2014-03-05 Thread Michael Rose
+1, localOrShuffle will be a winner, as long as it's evenly distributing work. If 1 tuple could say produce a variable 1-100 resultant tuples (and these results were expensive enough to process, e.g. IO), it might well be worth shuffling vs. localShuffling. Michael Rose (@Xorlev &

Re: Storm stream grouping examples

2014-03-05 Thread Michael Rose
What kind of comparisons are you looking for? How they functionally work? Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Wed, Mar 5, 2014 at 9:52 AM, Roberto Coluccio wrote: &

Re: Tuning and nimbus at 99%

2014-03-03 Thread Michael Rose
t yourself (ssh as storm, cd storm/daemon, supervise .) and seeing what kind of errors you see. Are your disks perhaps filled? Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Mon, M

Re: Tuning and nimbus at 99%

2014-03-02 Thread Michael Rose
The fact that the process is being killed constantly is a red flag. Also, why are you running it as a client VM? Check your nimbus.log to see why it's restarting. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcont

Re: Tuning and nimbus at 99%

2014-03-02 Thread Michael Rose
I'm not seeing too much to substantiate that. What size heap are you running, and is it near filled? Perhaps attach VisualVM and check for GC activity. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich..

Re: Tuning and nimbus at 99%

2014-03-02 Thread Michael Rose
Can you do a thread dump and pastebin it? It's a nice first step to figure this out. I just checked on our Nimbus and while it's on a larger machine, it's using <1% CPU. Also look in your logs for any clues. Michael Rose (@Xorlev <https://twitter.com/xorlev>)

Re: Zookeepr on different ports

2014-03-02 Thread Michael Rose
I'd recommend just using one Zookeeper instance if they're on the same physical host. There's no reason why a development ZK ensemble needs 3 nodes. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> m

Re: Netty Errors, chain reaction, topology breaks down

2014-03-02 Thread Michael Rose
Right now we're having slow, off-heap memory leaks, unknown if these are linked to Netty (yet). When the workers inevitably get OOMed, the topology will rarely recover gracefully with similar Netty timeouts. Sounds like we'll be heading back to 0mq. Michael Rose (@Xorlev <https:

Re: Tuning and nimbus at 99%

2014-03-02 Thread Michael Rose
Are you running Zookeeper on the same machine as the Nimbus box? Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Sun, Mar 2, 2014 at 6:16 PM, Sean Solbak wrote: > This is the

Re: JDBC Connections

2014-02-20 Thread Michael Rose
d. In either case, t's pushing the management, verification, and reestablishment of broken connections into the pool (which is also why we have 1 extra conn -- for when a conn is tied up running a validation query or is being reestablished). Michael Rose (@Xorlev <https://twitter.com/xorlev>)

Re: Bolts with long instantiation times -- A.K.A. Zookeeper shenanigans

2014-02-18 Thread Michael Rose
bolt instances. supervisor.worker.start.timeout.secs 120 supervisor.worker.timeout.secs 60 I'd try tuning your worker start timeout here. Try setting it up to 300s and (again) ensuring your prepare method only initializes expensive resources once, then shares them between instances in the JVM. Mi

Re: http-client version conflict

2014-02-06 Thread Michael Rose
We've done this with SLF4j and Guava as well without issues. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Thu, Feb 6, 2014 at 3:03 PM, Mark Greene wrote: > We had this

Re: ZeroMQ Exception causes Topology to get killed

2014-01-10 Thread Michael Rose
What version of ZeroMQ are you running? You should be running 2.1.7 with nathan's provided fork of JZMQ. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Fri, Jan 10, 2014 a

Re: Storm Performance

2014-01-10 Thread Michael Rose
Post your code. Even Dev mode is far faster for us. On Jan 10, 2014 8:44 AM, "Klausen Schaefersinho" wrote: > > I've benched storm at 1.8 million tuples per second on a big (24 core) > box using local or shuffle grouping between a spout and bolt. > Production or development mode? > > > you're on

Re: Workers elasticity

2014-01-06 Thread Michael Rose
ure what you mean 'round-robin fashion to distribute load' -- the ShuffleGrouping will partition work across tasks on an even basis. 3) Yes, in storm.yaml, supervisor.slots.ports. By default it'll run with 4 slots per machine. See https://github.com/nathanmarz/storm/blob/master

Re: Guaranteeing message processing on strom fails

2013-12-29 Thread Michael Rose
goes missing somewhere along the line, fail() will be called after a timeout. If you kill the acker that tuple was tracked with, it's then up to the message queue or other impl to be able to replay that message. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engin

Re: storm over WAN and NAT

2013-12-29 Thread Michael Rose
Generally speaking, I don't know of many services that work exceedingly well over a WAN. Can you not do processing at each location and forward it on with a queue that isn't adverse to WAN links? On Dec 29, 2013 10:03 AM, "Derrick Karimi" wrote: > Hello, > > I have a requirement for real time da

Re: Guaranteeing message processing on strom fails

2013-12-29 Thread Michael Rose
You are not guaranteed that tuples make it, only that if they go missing or processing fails it will replayed from the spout "at least once execution" On Dec 29, 2013 4:47 AM, "Michal Singer" wrote: > Hi, I am trying to test the guaranteeing of messages on storm: > > > > I have two nodes: > > 1.

Re: How to verify the statue of running topology / Storm+Graphite not working / Logs attached

2013-12-28 Thread Michael Rose
Check firewall settings and hosts. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Fri, Dec 27, 2013 at 6:51 AM, 程训焘 wrote: > Hi, all, > > I am having a problem about

RE: Spring bolts

2013-12-25 Thread Michael Rose
ns and in the > prepare method I will initialize the spring context. > > This way, the bolts will call other spring beans which are not bolts and > initialized in spring. But of course this is a very limited solution. > > > > > > *From:* Michael Rose [mailto:mich..

Re: Spring bolts

2013-12-25 Thread Michael Rose
Make a base spring bolt, in your prepare method inject the members. That's the best I've come up with, as prepare happens server side whereas topology config and static initializers happen at deploy time client side. On Dec 25, 2013 7:51 AM, "Michal Singer" wrote: > Hi, I am trying to understand

Re: Emitted ≠ Acked + Failed

2013-12-23 Thread Michael Rose
The statistics are sampled, but in general should be +/- 20 tuples of where they should be. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Mon, Dec 23, 2013 at 9:53 PM, churly lin

RE: Tick Tuple

2013-12-23 Thread Michael Rose
The issue is you've multiplied ticktuplems by 1000 vs dividing. So you're probably not waiting long enough ;) If you need an alternate spout, see storm contrib. There's a halfway decent one. On Dec 23, 2013 7:58 AM, "Adrian Mocanu" wrote: > If anyone is interested, I’ve decided to make my own T

Re: Example of bolt emitting more than one stream

2013-12-20 Thread Michael Rose
https://gist.github.com/Xorlev/8058947 This is the...gist...of it. :) Hope this helps! Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Fri, Dec 20, 2013 at 10:36 AM, Pete Carlson w

Re: Bigger or Smaller Workers?

2013-12-18 Thread Michael Rose
with moderately sized heaps. I can't say I'd think the overhead is too much more to have extra workers if you're doing shuffles or fields grouping most of the time anyways. Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http

Re: Proper way to ACK in a chain of bolts

2013-12-16 Thread Michael Rose
bolt 1 completed, bolt 2 outstanding } Bolt2 (IRichBolt / BaseRichBolt): def execute(tuple2: Tuple) { _collector.emit(tuple2, new Values("foo")) ---> If this is the end, you don't need this. Otherwise, acker knows that boltN must ack to complete _collector.ack(tuple2) --> ac

Re: Metric_Storm

2013-12-13 Thread Michael Rose
You add it as a task hook, e.x. Scala: config.put(Config.TOPOLOGY_AUTO_TASK_HOOKS, List(classOf[MetricsStormHooks].getName).asJava) Java: List> taskHooks = new ArrayList<>(); taskHooks.add(MetricsStormHooks.class.getName()); config.put(Config.TOPOLOGY_AUTO_TASK_HOOKS, taskHooks); Mic

Re: Question on storm topology, bolts with multiple threads

2013-11-26 Thread Michael Rose
using more bolt instances. If your IOs can be a little more variable, we've found it better to have a pool per bolt and run less bolts. This way, an IO that's 3x longer won't slow down other tuples. Of course, this is predicated on not saturating your downstream service and a desire t

Re: Future of Logback?

2013-11-11 Thread Michael Rose
I believe when Jon says "log4j" he refers to log4j2. Log4j2 is yet another successor to log4j, which claims to solve issues in logback. I wasn't able to discern a difference without log4j's usage of Disruptor (3.x). Michael Rose (@Xorlev <https://twitter.com/xorlev>

Re: Implications of no ackers -- and queue explosion

2013-11-10 Thread Michael Rose
ng to send 300k tuple/s/node, so we find the overhead of ackers to be a more than acceptable cost. Michael Michael Rose (@Xorlev <https://twitter.com/xorlev>) Senior Platform Engineer, FullContact <http://www.fullcontact.com/> mich...@fullcontact.com On Sun, Nov 10, 2013 at 12:1