It's hard to fully understand your question or recommend a specific solution, but here are some general guidelines.

If you put too much activity (business logic / processing) into a single
task, it will be hard to scale the topology up and your hardware
utilization will be very high. Make tasks small and atomic, and use batched
inserts to the DB where possible. Check whether Cassandra is becoming a
bottleneck. Cache data in the task's memory to avoid lookup queries to the
DB.
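
To illustrate the batching and caching points, here is a rough sketch in
plain Java. It is not Storm or Cassandra API code: BatchBuffer and
LookupCache are hypothetical helper names, the sink stands in for whatever
batched write your driver offers, and the loader stands in for your lookup
query. It also assumes each instance is used from a single thread, so there
is no locking.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

/** Collects rows and hands them to a sink in groups (hypothetical helper). */
final class BatchBuffer<T> {
    private final int maxSize;
    private final Consumer<List<T>> sink; // assumed to perform one batched DB write
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int maxSize, Consumer<List<T>> sink) {
        if (maxSize <= 0) {
            throw new IllegalArgumentException("maxSize must be positive");
        }
        this.maxSize = maxSize;
        this.sink = sink;
    }

    /** Buffers one row and flushes as soon as the batch is full. */
    void add(T row) {
        pending.add(row);
        if (pending.size() >= maxSize) {
            flush();
        }
    }

    /** Writes whatever is buffered; call this periodically so rows do not sit forever. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending)); // pass a copy, then reuse the buffer
        pending.clear();
    }
}

/** Bounded in-memory cache that evicts the least recently used entry (hypothetical helper). */
final class LookupCache<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader; // assumed to run the real lookup query

    LookupCache(final int capacity, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder = true makes iteration order least-recently-used first.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the cached value, calling the loader only on a miss. */
    V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            if (value != null) {
                entries.put(key, value);
            }
        }
        return value;
    }
}
```

In a bolt you would typically call add() for each incoming tuple and flush()
on a timer as well, so a quiet stream does not leave rows unwritten. If you
ack tuples, ack them only after the flush that contains them has succeeded.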

On Tue, May 2, 2017 at 7:44 AM, I PVP <[email protected]> wrote:

> What is the high-level best practice for Apache Storm?
>
> a) To create an OrderTopology that would receive and process data from
> all Order-related topics/Spouts like OrderCreated, OrderUpdated,
> OrderCancelled, and so on
>
> OR
>
> b) To create individual Topologies like OrderCreatedTopology,
> OrderUpdatedTopology, OrderCancelledTopology
>
> The reason I am asking is that processing power is 100% consumed
> on all supervisor machines/instances, no matter how big the
> machines/instances are or how many topologies are running.
> The overhead required to run a topology seems to be the issue,
> as CPUs on the supervisors are at 100% even when there is no data coming
> into the Spouts or going out to the Bolts.
>
> Our application has Topologies in which KafkaSpouts receive data and
> Bolts write it to Cassandra. So far there are 32 Topologies.
>
> Should I focus on consolidating all "business domain" (like Order,
> Payment) activities within the same Topology (like OrderTopology,
> PaymentTopology)?
>
> How do Storm-based solutions “design” their topologies?
> Aside from individual logging, what are the pros and cons from an Apache
> Storm perspective?
>
>
> thanks
>
> IPVP
>
