[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-16 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2829 Even if we change the spout to emit all the fetched records in one `nextTuple` call, the number and size of records returned in a fetch is limited by the `max.poll.records` KafkaConsumer setting (500

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-16 Thread roshannaik
Github user roshannaik commented on the issue: https://github.com/apache/storm/pull/2829 @arunmahadevan .. need to be careful when that we are not doing too many emits in a single nextTuple()... it can cause a OOM situation by flooding the pendingQ. Worth checking if

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-16 Thread roshannaik
Github user roshannaik commented on the issue: https://github.com/apache/storm/pull/2829 @HeartSaVioR 1) pendingEmitsQ prevents nextTuple() from blocking when downstream queue is full. It holds the overflow emits (one or more) that occurred within a **single**

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-15 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2829 Sorry my bad, there is an overflow.

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-14 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2829 If my understanding is right, there's pendingEmits (unbounded) which comes into play when it can't push tuple immediately, so emit should not block, nextTuple should not block as well. If emit

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-14 Thread arunmahadevan
Github user arunmahadevan commented on the issue: https://github.com/apache/storm/pull/2829 @revans2 thanks for merging. Raised https://github.com/apache/storm/pull/2835 for 1.x-branch. >If you want to send more it can be more efficient, but you risk going over the

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-14 Thread revans2
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2829 The reason we want to send a single tuple is because of how we do flow control in the spouts. If you want to send more it can be more efficient, but you risk going over the max.spout.pending amount

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-13 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue: https://github.com/apache/storm/pull/2829 @srdo is right. The reason is to enforce "max.spout.pending.limit", but the concept is brought when backpressure is not in place. IMHO, the concept is the thing we should drop whenever

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-13 Thread srdo
Github user srdo commented on the issue: https://github.com/apache/storm/pull/2829 @arunmahadevan I think the reason for only emitting one tuple at a time was to let Storm enforce the max.spout.pending limit. Maybe it would be better to accept some overflow in order to get the spout

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-12 Thread hmcl
Github user hmcl commented on the issue: https://github.com/apache/storm/pull/2829 +1 ---

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-12 Thread hmcl
Github user hmcl commented on the issue: https://github.com/apache/storm/pull/2829 @arunmahadevan yes, it make sense to use a LinkedList here. Thanks for the explanation. ---

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-12 Thread hmcl
Github user hmcl commented on the issue: https://github.com/apache/storm/pull/2829 @revans2 if I am not mistaken I recall that when you reviewed the initial PR you suggested that a call to nextTuple should send only one tuple/record. Can you please refresh my mind on the reasons why?

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-12 Thread arunmahadevan
Github user arunmahadevan commented on the issue: https://github.com/apache/storm/pull/2829 >what's the motivation to change this to LinkedList? Its mentioned in the description. Heres the relevant code for some more details -

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-12 Thread hmcl
Github user hmcl commented on the issue: https://github.com/apache/storm/pull/2829 @arunmahadevan what's the motivation to change this to LinkedList? nextTuple emits only a single tuple because that's the contract of the method nextTuple, which must be honored. This was

[GitHub] storm issue #2829: STORM-3222: Fix KafkaSpout internals to use LinkedList in...

2018-09-12 Thread arunmahadevan
Github user arunmahadevan commented on the issue: https://github.com/apache/storm/pull/2829 Also I am not sure why the nextTuple only emits a single tuple wheres ideally it should emit whatever it can emit in a single nextTuple call which is more efficient. However the logic appears