Re: Basic questions about Strom

Navin Ipe Sun, 18 Sep 2016 04:59:43 -0700

Specifically about fields grouping, I've added a tutorial. Other tutorials on
the internet are confusing because they don't explain the strings used as
the field names.
http://nrecursions.blogspot.in/2016/09/understanding-fields-grouping-in-apache.html
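
As a rough illustration of why those strings matter, the sketch below models
fields grouping in plain Java without the Storm libraries. The class
FieldsGroupingSketch and its helpers valueByField and chooseTask are
hypothetical names, not Storm APIs, and the hash-modulo rule is a simplifying
assumption; Storm's actual task selection may differ in detail. The point is
only that the string given to a fields grouping names a declared position in
the tuple, and tuples with equal values in that position go to the same task.

```java
import java.util.Arrays;
import java.util.List;

// Simplified model only: this is NOT Storm code and uses no Storm classes.
class FieldsGroupingSketch {
    // Field names as an upstream component might declare them, in positional order.
    static final List<String> DECLARED = Arrays.asList("word", "count");

    // A field name is just a label for a position in the tuple.
    static Object valueByField(List<Object> tuple, String field) {
        int idx = DECLARED.indexOf(field);
        if (idx < 0) {
            throw new IllegalArgumentException("Unknown field: " + field);
        }
        return tuple.get(idx);
    }

    // Assumed rule: hash the grouping field's value and take it modulo the task count.
    static int chooseTask(List<Object> tuple, String groupingField, int numTasks) {
        Object key = valueByField(tuple, groupingField);
        return Math.floorMod(key.hashCode(), numTasks);
    }

    public static void main(String[] args) {
        int numTasks = 4;
        List<List<Object>> tuples = Arrays.asList(
            Arrays.<Object>asList("storm", 1),
            Arrays.<Object>asList("kafka", 1),
            Arrays.<Object>asList("storm", 2));
        for (List<Object> t : tuples) {
            // Grouping on "word": the count value plays no part in the choice.
            System.out.println(t + " -> task " + chooseTask(t, "word", numTasks));
        }
    }
}
```

Both "storm" tuples are routed to the same task number regardless of their
count, because only the value in the "word" position is hashed.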


On Thu, Jun 16, 2016 at 11:57 PM, Junguk Cho <[email protected]> wrote:

> Hi,
>
> Both replies are really helpful.
>
> Thanks,
> Junguk
>
> 2016-06-15 2:06 GMT-04:00 Navin Ipe <[email protected]>:
>
>> @Junguk: In any normal function you create in Java, you say something
>> like this:
>> public void someFunction(Integer firstValue, Float secondValue) {}
>>
>> This way, Java understands that the first parameter is an Integer named
>> firstValue and the second parameter is a Float named secondValue.
>>
>> In the same way, when you say declarer.declare(new Fields("word", "count"));
>> you are just telling Storm that when a tuple is received, the first field
>> of the tuple will be some object named "word" and the second field of the
>> tuple will be some object named "count". Instead of "word" and "count" you
>> could have named them like this and it would make no difference:
>> declarer.declare(new Fields("firstValue", "secondValue"));
>>
>> Now in your code when you extract the values from the tuple, you have to
>> know the datatypes of the "firstValue" and "secondValue".
>>
>> String w = (String) tuple.getValue(0);//firstValue
>> MyCountingClass mcc = (MyCountingClass) tuple.getValue(1);//secondValue
>>
>> I agree the Storm tutorials are a bit confusing that way. Please see if
>> the tutorial I wrote is clearer:
>> http://nrecursions.blogspot.in/2016/04/a-simple-apache-storm-tutorial.html
>>
>>
>>
>>
>>
>>
>> On Sat, Jun 11, 2016 at 7:27 AM, Junguk Cho <[email protected]> wrote:
>>
>>> Hi, Jungtaek.
>>>
>>> Thank you for the reply.
>>>
>>> I have the following questions.
>>>
>>> 1. If we look at the example (WordCountTopology), the WordCount class
>>> uses String word = tuple.getString(0); to get the string (word).
>>> So I don't understand the exact roles of "word" and "count". Are they
>>> used internally for a Map-like structure?
>>> To be clear, do bolts exchange data in the format "word" : <data>?
>>>
>>> About default and non-default streams, do all tuples include a stream id
>>> whenever they are sent?
>>>
>>>
>>> 3. To be clear, if we set it to "false", does Storm not use serialization
>>> for inter-process and inter-node communication?
>>>
>>> Thanks in advance.
>>> - Junguk
>>>
>>>
>>>
>>>
>>> 2016-06-10 18:00 GMT-04:00 Jungtaek Lim <[email protected]>:
>>>
>>>> Hi Junguk,
>>>>
>>>> 1. In declareOutputFields, you're declaring the schema of this component's
>>>> output stream. The first value of the tuple will be matched to "word", and
>>>> the second value will be matched to "count". You can access a value by
>>>> field name or by index.
>>>>
>>>> Btw, declare() declares the default stream, and there are other methods
>>>> which declare named (non-default) streams.
>>>>
>>>> 2. When you're rebalancing a topology, you're encouraged to specify a
>>>> wait-time, too.
>>>> The topology is deactivated immediately, so the Spout's nextTuple() is no
>>>> longer called; only Bolts keep running to handle in-flight tuples during
>>>> the wait-time.
>>>> If there are still in-flight tuples left, they will not be acked. So if
>>>> the Spout's datasource is RabbitMQ in ack mode, Kafka, or the like, the
>>>> Spout will read them from the datasource again.
>>>>
>>>> 3. Right. To catch serialization issues earlier, there's the option
>>>> "topology.testing.always.try.serialize" for debugging purposes. Note that
>>>> it affects performance, so it should be disabled ("false" by default) in
>>>> production environments.
>>>>
>>>> Hope this helps.
>>>>
>>>> Thanks,
>>>> Jungtaek Lim (HeartSaVioR)
>>>>
>>>>
>>>> On Sat, Jun 11, 2016 at 3:27 AM, Junguk Cho <[email protected]> wrote:
>>>>
>>>>> Hi, I have some basic questions.
>>>>>
>>>>> 1. About Tuple.
>>>>> We declare the tuple's fields in declareOutputFields.
>>>>> For example, declarer.declare(new Fields("word", "count"));
>>>>>
>>>>> Are "word" and "count" forwarded to the next node along with the actual data?
>>>>> What are the roles of "word" and "count" here internally?
>>>>>
>>>>>
>>>>> 2. About rebalancing
>>>>> (http://storm.apache.org/releases/1.0.1/Understanding-the-parallelism-of-a-Storm-topology.html)
>>>>>
>>>>> Storm has a rebalancing capability.
>>>>> What happens to in-flight tuples while Storm rebalances a topology?
>>>>> Are they dropped and replayed?
>>>>>
>>>>> 3. Serialization.
>>>>> In Storm, as far as I know, serialization does not happen for
>>>>> inter-thread communication. For inter-process and inter-node
>>>>> communication, serialization is required.
>>>>> Is that right?
>>>>>
>>>>> Thanks,
>>>>> Junguk
>>>>>
>>>>>
>>>
>>
>>
>> --
>> Regards,
>> Navin
>>
>
>


-- 
Regards,
Navin
