Hi,

Both replies are really helpful.

Thanks,
Junguk

2016-06-15 2:06 GMT-04:00 Navin Ipe <[email protected]>:

> @Junguk: In any normal function you create in Java, you say something like
> this:
> public void someFunction(Integer firstValue, Float secondValue) {}
>
> This way, Java understands that the first parameter is an Integer named
> firstValue and the second parameter is a Float named secondValue.
>
> In the same way, when you say declarer.declare(new Fields("word", "count"));
> you are just telling Storm that in each tuple this component emits, the
> first value will be some object named "word" and the second value will be
> some object named "count". Instead of "word" and "count" you could just as
> well have named them like this, and it would make no difference:
> declarer.declare(new Fields("firstValue", "secondValue"));
>
> Now in your code when you extract the values from the tuple, you have to
> know the datatypes of the "firstValue" and "secondValue".
>
> String w = (String) tuple.getValue(0);//firstValue
> MyCountingClass mcc = (MyCountingClass) tuple.getValue(1);//secondValue
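
As a rough illustration of the point above (field names are labels for positions, and the values themselves are what you read out), here is a minimal sketch in plain Java. Schema and SimpleTuple are hypothetical stand-ins written for this example; they are not Storm classes and this is not how Storm is implemented internally, only a model of the name-to-index idea.

```java
import java.util.Arrays;
import java.util.List;

public class FieldsSketch {
    // Hypothetical stand-in for a declared schema: an ordered list of field names.
    static final class Schema {
        private final List<String> names;

        Schema(String... names) {
            this.names = Arrays.asList(names);
        }

        // Resolve a field name to its position in the tuple.
        int indexOf(String name) {
            int i = names.indexOf(name);
            if (i < 0) {
                throw new IllegalArgumentException("unknown field: " + name);
            }
            return i;
        }
    }

    // Hypothetical stand-in for a tuple: it holds only values, in declared order.
    static final class SimpleTuple {
        private final Schema schema;
        private final List<Object> values;

        SimpleTuple(Schema schema, Object... values) {
            this.schema = schema;
            this.values = Arrays.asList(values);
        }

        // Positional access, like tuple.getValue(0) in the mail above.
        Object getValue(int index) {
            return values.get(index);
        }

        // Named access is just positional access after a name lookup.
        Object getValueByField(String name) {
            return values.get(schema.indexOf(name));
        }
    }

    public static void main(String[] args) {
        Schema schema = new Schema("word", "count");
        SimpleTuple tuple = new SimpleTuple(schema, "storm", 3);

        String byIndex = (String) tuple.getValue(0);
        String byName = (String) tuple.getValueByField("word");
        int count = (Integer) tuple.getValueByField("count");

        System.out.println(byIndex + " " + byName + " " + count); // prints "storm storm 3"
    }
}
```

Renaming the fields to "firstValue" and "secondValue" in this sketch would change only the lookup keys, not the stored values or their order, which matches the description above.
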
>
> I agree the Storm tutorials are a bit confusing that way. Please see if
> the tutorial I wrote is clearer:
> http://nrecursions.blogspot.in/2016/04/a-simple-apache-storm-tutorial.html
>
>
>
>
>
>
> On Sat, Jun 11, 2016 at 7:27 AM, Junguk Cho <[email protected]> wrote:
>
>> Hi, Jungtaek.
>>
>> Thank you for the reply.
>>
>> I have the following questions.
>>
>> 1. In the example (WordCountTopology), the WordCount class uses
>> String word = tuple.getString(0); to get the string (word).
>> So I don't understand the exact roles of "word" and "count". Are they used
>> internally for a Map-like structure?
>> To be clear, does each bolt exchange data in the format "word" : <data>?
>>
>> About the default and non-default streams: do all tuples include a stream
>> id whenever they are sent?
>>
>>
>> 3. To be clear, if we set it to "false", does Storm not use serialization
>> for inter-process and inter-node communication?
>>
>> Thanks in advance.
>> - Junguk
>>
>>
>>
>>
>> 2016-06-10 18:00 GMT-04:00 Jungtaek Lim <[email protected]>:
>>
>>> Hi Junguk,
>>>
>>> 1. In declareOutputFields, you're declaring the schema of this component's
>>> output stream. The first value of the tuple is matched to "word", and the
>>> second value is matched to "count". You can access a value by field name
>>> or by index.
>>>
>>> Btw, declare() declares the default stream, and there are other methods
>>> that declare named (non-default) streams.
>>>
>>> 2. When you're rebalancing a topology, you're encouraged to specify a
>>> wait time, too.
>>> The topology is deactivated immediately, so the Spout will not call
>>> nextTuple(); only the Bolts keep running to handle in-flight tuples during
>>> the wait time.
>>> If in-flight tuples are still left after that, they will not be acked. So
>>> if the Spout's data source is RabbitMQ in ack mode, Kafka, or similar, the
>>> Spout will read them from the data source again.
>>>
>>> 3. Right. To catch serialization issues early, there is an option,
>>> "topology.testing.always.try.serialize", intended for debugging. Note that
>>> it affects performance, so it should be disabled ("false" is the default)
>>> in production environments.
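
To make the idea behind that option concrete: a value that is never serialized inside a single process can hide a problem that only shows up once tuples cross process or node boundaries. The sketch below forces a serialization attempt up front. It uses plain java.io serialization purely as a stand-in (Storm's own serializer is different), and TrySerializeSketch and isSerializable are hypothetical names made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;

public class TrySerializeSketch {
    // Round-trips a value through java.io serialization (a stand-in, not Storm's serializer).
    static Object roundTrip(Object value) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    // Returns true if the value survives the round trip, false if serialization fails.
    static boolean isSerializable(Object value) {
        try {
            roundTrip(value);
            return true;
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSerializable("word"));       // prints "true"
        System.out.println(isSerializable(new Object())); // prints "false": Object is not Serializable
    }
}
```

In a real topology the option itself would be set on the topology configuration, for example conf.put("topology.testing.always.try.serialize", true), and only for testing, given the performance cost mentioned above.
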
>>>
>>> Hope this helps.
>>>
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>>
>>> On Sat, Jun 11, 2016 at 3:27 AM, Junguk Cho <[email protected]> wrote:
>>>
>>>> Hi, I have some basic questions.
>>>>
>>>> 1. About Tuple.
>>>> We declare the tuple's fields in declareOutputFields.
>>>> For example, declarer.declare(new Fields("word", "count"));
>>>>
>>>> Are "word" and "count" forwarded to the next node along with the actual data?
>>>> What are the roles of "word" and "count" here internally?
>>>>
>>>>
>>>> 2. About rebalancing (
>>>> http://storm.apache.org/releases/1.0.1/Understanding-the-parallelism-of-a-Storm-topology.html
>>>> )
>>>>
>>>> Storm has a rebalancing capability.
>>>> What happens to in-flight tuples while Storm rebalances a topology?
>>>> Are they dropped and replayed?
>>>>
>>>> 3. Serialization.
>>>> As far as I know, Storm does not serialize tuples for inter-thread
>>>> communication; serialization is required only for inter-process and
>>>> inter-node communication.
>>>> Is that right?
>>>>
>>>> Thanks,
>>>> Junguk
>>>>
>>>>
>>
>
>
> --
> Regards,
> Navin
>
