flux with hdfsbolt

2015-05-30 Thread clay teahouse
Hi All, Two questions: 1) What version of Hadoop does Flux work with? My assumption was that it should not matter, but I am getting the following error when I use an HdfsBolt with Flux, which I assume implies some version mismatch. I don't have this issue if I build a topology straightforward that

flux

2015-05-26 Thread clay teahouse
Hi All, I am trying to test the flux module for writing template-driven topologies. I am setting up the topologies in LocalCluster mode: 1) using template simple_wordcount.yaml 2) using kafka_spout.yaml (using TestBolt). With (1), I don't get any output from TestBolt and the topology exits. With (2), I get

Zookeeper ConnectionLossException

2015-05-05 Thread clay teahouse
Hi All, What would be the reason for getting this exception while running a topology? Everything works fine for a while, and then I get this error and the topology dies. org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /storm/partition_6 ... ...

storm logback freezing

2015-05-04 Thread clay teahouse
Hi all, Has anyone experienced a case where storm logback freezes? The topology seems to be functioning without an issue (I can see the results in the destination consumers), but the storm log shows no progress. This usually happens a couple of hours after the topology starts, and not right away.

streams in storm and memory usage

2015-04-11 Thread clay teahouse
Hi all, I have a simple question which has probably been asked before, but I cannot find a concrete answer. I'd appreciate your feedback. Assume I have a spout that emits tuples which are consumed by 4 different bolts. Spout A --- Bolt B --- Bolt C --- Bolt D

template driven topology

2015-03-27 Thread clay teahouse
Hi All, Is there anything out there for building topologies based on templates; that is, specifying the components in a template and having a simple framework that builds the topology based on that template? Sorry if this topic has been discussed before. I couldn't find anything related.

reasons for topology hanging

2015-03-18 Thread clay teahouse
Hi All, What could be the reasons for a topology hanging under a somewhat heavy load (a few hundred MB per minute)? There is no error in the logs. I am using KafkaSpout to pull data from Kafka and a simple bolt to stream the data. My spout max pending is set to 1024. My topology is running in

Re: cleaning up storm generated files

2015-03-06 Thread clay teahouse
of the supervisor got off. The only way to know what is not needed is to look at zookeeper and do what the supervisor does to determine what should be running. - Bobby On Wednesday, March 4, 2015 11:46 AM, clay teahouse clayteaho...@gmail.com wrote: Is this also the case if you kill

cleaning up storm generated files

2015-03-03 Thread clay teahouse
Hi all, How do I remove storm generated system files programmatically without stepping on the files that I shouldn't be deleting? I need this to clean up the files left behind from the aborted topologies. I tried fuser and lsof, but with no luck. For example, fuser shows stormconf.ser and

Re: HdfsBolt and hdfs in HA mode

2015-02-19 Thread clay teahouse
or any of the hadoop compute nodes. This will work for the HdfsBolt that loads default configurations from the classpath before overriding them with any custom configurations you set for that bolt. - Bobby On Thursday, February 19, 2015 10:42 AM, clay teahouse clayteaho

HdfsBolt and hdfs in HA mode

2015-02-19 Thread clay teahouse
Hi All, Has anyone used HdfsBolt with hdfs in HA mode? How would you determine which hdfs node is the active node? thanks Clay

Re: HdfsBolt and hdfs in HA mode

2015-02-19 Thread clay teahouse
On Thursday, February 19, 2015 6:47 AM, clay teahouse clayteaho...@gmail.com wrote: Hi All, Has anyone used HdfsBolt with hdfs in HA mode? How would you determine which hdfs node is the active node? thanks Clay

SSL_OPTS

2015-02-18 Thread clay teahouse
Hi All, I have set and exported SSL_OPTS but it is not being picked up while running storm. I need to use a client certificate with the HTTP client connections originating from a bolt. I have set the options on the storm command as well, but still the same issue. thanks Clay

Re: [jira] [Commented] (STORM-660) IndexOutOfBoundsException with shuffle grouping and a large number of paralleliziation hint

2015-02-09 Thread clay teahouse
Affects Versions: 0.10.0 Environment: Linux Reporter: clay teahouse Storm throws an IndexOutOfBoundsException error once in a while when a bolt with shuffle grouping is linked to its predecessor bolt, especially with a high parallelization hint. This was reported

Do I need to synchronize emits?

2015-02-08 Thread clay teahouse
Hi All, I emit my tuples in batches. Do I need to put the emit in a synchronized block? The reason I am asking is that I am getting the IndexOutOfBoundsException error once in a while, especially with a high parallelization hint. According to this link, it is a bug in storm, but I am using the latest

the external Zookeeper in Local mode

2015-02-03 Thread clay teahouse
Hi, I have a topology running in local mode. The topology uses KafkaSpout and is configured to use the external ZooKeeper. But when I start the topology, I see the following: org.apache.storm.zookeeper.ZooKeeper - Initiating client connection, connectString=localhost:2000 I also see messages like

kafkaspout is very slow

2015-02-03 Thread clay teahouse
Hi all, In my topology, kafka spout is responsible for over 85% of the latency. I have tried different spout max pending and played with the buffer size and fetch size, still no luck. Any hint on how to optimize the spout? The issue doesn't seem to be with the kafka side, as I see high

questions on task, threads and workers

2015-02-01 Thread clay teahouse
Hi, I have a few simple questions. 1) In Storm 0.9.x, what is the default value for the bolt num tasks? According to the docs, the parallelism hint no longer sets the number of tasks, but the number of executor threads. 2) What happens if the number of tasks is less than the number of threads? Should

Re: passing objects to bolts through Config

2015-01-24 Thread clay teahouse
at 12:01 AM, clay teahouse clayteaho...@gmail.com wrote: Hi, I am trying to pass some objects to the bolts through config, but I am not having much success. These objects are HashMaps and ArrayLists. I am assuming these are serializable. Any idea what could be wrong? thanks, Clay

share singleton between topology and multiple bolts

2015-01-23 Thread clay teahouse
Hi All, My topology initializes a singleton containing a set of static objects. These objects can be complex but are static. I want all the bolts to have access to this singleton. It seems that the bolts can access the primitives such as strings in this singleton, but cannot access more complex

loading jars

2015-01-22 Thread clay teahouse
Hi, I am trying to use storm in cluster mode. I've started nimbus and supervisor, but when I try to run the topology, I get an error that it cannot find or load some jar in $STORM_HOME/lib. All the jars it is complaining about do exist there. When I do storm classpath, I see the jars that it is

null pointer exception

2015-01-04 Thread clay teahouse
Hi All, I have the following topology: spout - Bolt1 -- Bolt2. Neither bolt is async or multi-threaded. Bolt2 uses an HTTP client to make POST/PUT requests to a web server. Both bolts ack the tuples before exiting execute. The topology runs fine for a while under a load of about 50MB/minute.

hdfsbolt

2014-12-23 Thread clay teahouse
Hi All, Why doesn't HdfsBolt retry when the hadoop node is down or not accessible, instead of dying and bringing down the topology with it? I can catch the runtime exception and keep the topology going, but was wondering why the retry is not built into HdfsBolt. thank you Clay

[jira] [Created] (STORM-602) HdfsBolt dies when the hadoop node is not available

2014-12-23 Thread clay teahouse (JIRA)
clay teahouse created STORM-602: --- Summary: HdfsBolt dies when the hadoop node is not available Key: STORM-602 URL: https://issues.apache.org/jira/browse/STORM-602 Project: Apache Storm Issue

Re: bolt stop receiving tuples

2014-12-07 Thread clay teahouse
a thread dump of the java process when it is in hung state. That will clearly tell you what the problem is. For easy diagnosis set the number of workers to one and possibly set the number of tasks of spout/bolt to 1. Thanks and Regards, Devang On 4 Dec 2014 21:14, clay teahouse clayteaho...@gmail.com

Re: bolt stop receiving tuples

2014-12-04 Thread clay teahouse
the code in execute method of bolt B with a log statement and check if it's still an issue. Thanks and Regards, Devang On 4 Dec 2014 19:28, clay teahouse clayteaho...@gmail.com wrote: This is a local cluster. I don't see anything interesting in the logs that would tell me anything. I even

bolt stop receiving tuples

2014-12-03 Thread clay teahouse
Hello All, I have this configuration: spout - Bolt A (emits tuples) - Bolt B Bolt A emits tuples successfully but bolt B stops receiving tuples after the first time (it never enters the execute after the first time). The first time execution seems to be successful. Any idea what the issue could

acking and offsets

2014-11-17 Thread clay teahouse
Hello All, I am using the kafka spout that comes with storm 0.9.3 (https://github.com/apache/storm). I have several different bolts consuming the same tuples from the spout (in the same topology). These bolts process the tuples and send the output to different destinations. I have a couple of

Re: join and in-bolt caching

2014-11-12 Thread clay teahouse
Hello All, I'd appreciate your input/pointers regarding my post on join and in-bolt caching. thanks, Clay On Tue, Nov 11, 2014 at 5:09 AM, clay teahouse clayteaho...@gmail.com wrote: Hello All, I need to 1) Look up some values from a source and cache them in a bolt 2) Filter a second stream

emitting batches of tuples

2014-11-03 Thread clay teahouse
Hello All, Is it possible to emit batches of tuples, as opposed to one tuple at a time? In other words, is it possible to batch the tuples before emitting them? One application of batching is, for example, writing the tuples to a TCP socket without wanting to do a flush after each tuple

multi-stream example

2014-10-30 Thread clay teahouse
Hello All, Can someone share an example of a bolt with multi stream output, with each particular output stream going to a particular bolt? Bolt A =stream 1 = Bolt B Bolt A = stream 2 = Bolt C Bolt A = stream 3 = Bolt D thanks, Clay

Re: TOPOLOGY_ACKER_EXECUTORS

2014-10-11 Thread clay teahouse
acking enabled, as it's used to coordinate batches. On Thu, Oct 9, 2014 at 4:09 AM, clay teahouse clayteaho...@gmail.com wrote: Hello, I am trying to turn off acking by setting TOPOLOGY_ACKER_EXECUTORS to 0. But when I do that my trident topology fails with the following error

TOPOLOGY_ACKER_EXECUTORS

2014-10-09 Thread clay teahouse
Hello, I am trying to turn off acking by setting TOPOLOGY_ACKER_EXECUTORS to 0. But when I do that my trident topology fails with the following error and subsequently the worker dies. java.lang.RuntimeException: backtype.storm.topology.FailedException: Received commit for different transaction

tcp socket spout

2014-10-05 Thread clay teahouse
Hi, Is there a TCP socket spout out there? thanks, Clay