Want to add a bit more: I am posting the JSON data using
kafka-console-producer.sh, copying the JSON data and pasting it on the
console.
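One thing worth knowing about kafka-console-producer.sh: it sends each line of input as a separate Kafka message, so multi-line (pretty-printed) JSON pasted on the console arrives as several messages. Below is a minimal sketch, assuming the payload differs only by whitespace formatting (a JSON library would be safer); the class name `CompactJson` is made up for illustration:

```java
// Sketch: kafka-console-producer.sh treats each input line as one message,
// so pretty-printed JSON must be collapsed to a single line before pasting.
// Naive whitespace-based compaction; a JSON library would handle edge cases
// such as newlines embedded inside string values.
public class CompactJson {
    public static String compact(String prettyJson) {
        // Replace every newline (and the indentation around it) with one space.
        return prettyJson.trim().replaceAll("\\s*\\n\\s*", " ");
    }

    public static void main(String[] args) {
        System.out.println(compact("{\n  \"id\": 1,\n  \"name\": \"foo\"\n}"));
    }
}
```

With the payload on a single line, each paste becomes exactly one Kafka message, and the spout sees the whole JSON document in one tuple.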

On Mon, Jul 4, 2016 at 11:44 AM, praveen reddy <[email protected]>
wrote:

> Thanks Navin for the response; I was on mobile so I couldn't see the typos.
> Here is my requirement. This is my first POC on Kafka/Storm, so please help
> me design it in a better way.
>
> I need to read JSON data from Kafka, then convert the JSON data to a CSV
> file and save it on HDFS.
>
> This is my initial design, and I am having a lot of issues with it.
>
>         builder.setSpout("kafka-spout", new KafkaSpout(kafkaSpoutConfig));
>         builder.setBolt("TransformBolt", new TransformationBolt()).shuffleGrouping("kafka-spout");
>         builder.setBolt("Savebolt", new SaveBolt()).shuffleGrouping("TransformBolt");
>
> KafkaSpout reads the data from the Kafka topic, TransformationBolt converts
> the JSON to a CSV file, and SaveBolt saves the CSV file.
>
> KafkaSpout was able to read data from the Kafka topic. What I was expecting
> from the spout was the complete JSON data, but I am getting one line at a
> time from the JSON data I sent to the topic.
>
> Here is my transformation bolt:
>     @Override
>     public void execute(Tuple input) {
>         String sentence = input.getString(0);
>         collector.emit(new Values(sentence));
>         System.out.println("emitted " + sentence);
>     }
>
> I was expecting getString(0) to return the complete JSON data, but I am
> getting only one line at a time.
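For the conversion step inside TransformationBolt, here is a naive sketch that handles only a flat JSON object with no nesting and no commas embedded in values (in practice a JSON library such as Jackson would be the safer choice); the class name `JsonToCsv` is made up for illustration:

```java
// Naive sketch: convert one flat JSON object (no nesting, no embedded
// commas in values) into a CSV line of its values. A real topology
// would use a JSON library such as Jackson instead of string slicing.
public class JsonToCsv {
    public static String toCsvLine(String json) {
        // Strip the surrounding braces, then take the value after each ':'.
        String body = json.trim().replaceAll("^\\{|\\}$", "");
        StringBuilder sb = new StringBuilder();
        for (String pair : body.split(",")) {
            String value = pair.substring(pair.indexOf(':') + 1)
                               .trim()
                               .replaceAll("^\"|\"$", "");  // drop quotes
            if (sb.length() > 0) sb.append(',');
            sb.append(value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toCsvLine("{\"name\":\"bob\",\"age\":30}"));
    }
}
```

The bolt could call such a method on the incoming string and emit the resulting CSV line as a tuple, rather than writing a file itself.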
>
> And I am not sure how to emit a CSV file so that SaveBolt can save it.
>
> Can you please let me know how to get the complete JSON data in a single
> tuple rather than line by line, and how to emit a CSV file from a bolt? And
> if you can help me design this better, it would be really helpful.
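One common way to design this, sketched under the assumption that the storm-hdfs module is on the classpath: instead of emitting a whole CSV file, emit one CSV line per tuple from TransformationBolt and replace the custom SaveBolt with storm-hdfs's HdfsBolt, which appends tuples to files on HDFS and rotates them. The filesystem URL, output path, and size/count thresholds below are placeholders, not values from the thread:

```java
// Wiring sketch (requires the storm-hdfs dependency); values are placeholders.
import org.apache.storm.hdfs.bolt.HdfsBolt;
import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;

HdfsBolt hdfsBolt = new HdfsBolt()
        .withFsUrl("hdfs://namenode:8020")                        // placeholder
        .withFileNameFormat(new DefaultFileNameFormat().withPath("/storm/csv/"))
        .withRecordFormat(new DelimitedRecordFormat().withFieldDelimiter(","))
        .withRotationPolicy(new FileSizeRotationPolicy(5.0f, Units.MB))
        .withSyncPolicy(new CountSyncPolicy(1000));

builder.setBolt("Savebolt", hdfsBolt).shuffleGrouping("TransformBolt");
```

This removes the need to pass files between bolts entirely: tuples flow from the spout through the transform bolt straight into HDFS.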
>
>
> On Mon, Jul 4, 2016 at 5:59 AM, Navin Ipe <[email protected]
> > wrote:
>
>> Dear Praveen,
>>
>> The questions aren't silly, but it is rather tough to understand what you
>> are trying to convey. When you say "omit", do you mean "emit"?
>> Bolts can emit data without having to write to disk (I think there's a 2MB
>> limit on the size of the data that can be emitted, because Thrift can't
>> handle more than that).
>> If you want one bolt to write to disk and then want another bolt to read
>> from disk, then that's also possible.
>> The first bolt can just send to the second bolt, whatever information is
>> necessary to read from file.
>> As far as I know, basic datatypes will automatically get serialized. If
>> you have a more complex class, then make it implement Serializable.
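A minimal illustration of that last point (the class name `CsvRecord` is hypothetical, not from the thread): a simple custom payload only needs to implement `java.io.Serializable` to travel between Storm workers, though registering the class with Storm's Kryo configuration is the faster option for hot paths:

```java
import java.io.Serializable;

// Hypothetical payload class: implementing Serializable is enough for a
// simple custom type to be shipped between Storm workers; Kryo
// registration via the topology config is the higher-performance route.
public class CsvRecord implements Serializable {
    private static final long serialVersionUID = 1L;

    public final String line;   // one CSV line, e.g. "bob,30"

    public CsvRecord(String line) {
        this.line = line;
    }
}
```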
>>
>> If you could re-phrase your question and make it clearer, people here
>> would be able to help you better.
>>
>>
>>
>> On Sat, Jul 2, 2016 at 7:16 AM, praveen reddy <
>> [email protected]> wrote:
>>
>>> Hi All,
>>>
>>> I am new to Storm and Kafka and working on a POC.
>>>
>>> My requirement is to get a message from Kafka in JSON format, have the
>>> spout read that message, have the first bolt convert the JSON message to a
>>> different format like CSV, and have the second bolt save it to Hadoop.
>>>
>>> Now I came up with an initial design where I use KafkaSpout to read from
>>> the Kafka topic, one bolt converts it to a CSV file, and the next bolt
>>> saves it to Hadoop.
>>>
>>> I have the following questions:
>>> Can the first bolt, which converts the message to a CSV file, omit it?
>>> The file would be saved on disk. Can a file which is saved on disk be
>>> omitted?
>>> How does the second bolt read the file which is saved on disk by the
>>> first bolt?
>>> Do we need to serialize the message omitted by the spout and/or bolt?
>>>
>>> Sorry if the questions sound silly; this is my first topology, with
>>> minimal knowledge of Storm.
>>>
>>> If you can think of a proper design for implementing my requirement,
>>> can you please let me know.
>>>
>>> thanks in advance
>>>
>>> -Praveen
>>>
>>
>>
>>
>> --
>> Regards,
>> Navin
>>
>
>
