Hi everyone,

I want to store all the raw data coming into my Storm topology in an HDFS cluster.
It is JSON or binary data, arriving at a rate of 2k per second.

I tried to use the HDFS bolt, but the regular HDFS bolt does not support
compression.
Compression is only possible with the Sequence File bolt.
I don't want to use sequence files, as I don't have a real key.

Plus, I already have Cassandra for storing my key/value data and serving my
requests.
Storing my raw data in Cassandra just takes too much disk space (overhead);
debating that is not the purpose of this post.

Can anyone help me with that?
Can I use the Hadoop Java client to achieve this?
Does anyone have a code snippet for that?

Thanks!
Regards
Bastien
