Just curious, why would you want to store sstables in HDFS?

On 3/23/13 12:43 PM, "Amit Kumar" <[email protected]> wrote:
>I am starting some work on an input format that would let us read
>sstables stored in HDFS, and I wonder if anyone has worked on something
>similar before. I did come across
>
>http://techblog.netflix.com/2012/02/aegisthus-bulk-data-pipeline-out-of.html
>
>However, it's not open sourced/available yet.
>
>I am writing for a sanity check before I go too deep into this.
>
>I have a few questions, hoping someone here would be able to help.
>
>So far, I have been able to read sstables stored on the local file
>system using the SSTableScanner and the SSTableReader. I am wondering
>what would be a good way to proceed: would it be a custom implementation
>of RandomAccessFile (like RandomAccessReader and
>CompressedRandomAccessReader) that uses Hadoop's FileSystem API?
>
>I searched but could have missed it: is there some documentation
>on the binary format of the data, index, and stats files? That might
>make it simpler for me to prototype without having to go through the
>Cassandra internals. I am currently working off our production
>deployment, which is 1.1.0.
>
>Any guidance would be appreciated (I am new to Cassandra internals).
>
>Many thanks
>Amit
