Object stores are some of our largest and oldest use cases. Cassandra has been 
a good choice for us. We do chunk the objects into 64k chunks (I think), so 
that partitions are not too large and it scales predictably. For us, the choice 
was more about high availability and scalability, which Cassandra provides well.
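
The chunking mentioned above can be sketched as follows. This is only an illustration, not the code used in production here: the 64 KiB size comes from the "64k (I think)" above, and `split_into_chunks` and `reassemble` are hypothetical helper names.

```python
from typing import Iterator, List

# Assumed chunk size: 64 KiB, taken from the "64k (I think)" remark.
CHUNK_SIZE = 64 * 1024


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield consecutive slices of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


def reassemble(chunks: List[bytes]) -> bytes:
    """Concatenate chunks, in order, to recover the original object."""
    return b"".join(chunks)
```

Each chunk would then be written as its own row, so that no single cell or partition grows with the size of the object.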

Sean Durity




From: Paul Chandler <p...@redshots.com>
Sent: Friday, April 19, 2019 5:24 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Using Cassandra as an object store

Gene,

I have found that clusters used as object stores have caused me more problems 
than normal in the past, so I recommend using a separate object store if 
possible.

However, it certainly can be done; there are just a few things to consider:

1) Deletion policy: How are these objects going to be deleted? We have had 
problems in the past where deleted objects didn’t get removed from disk. This 
was because, by the time they were deleted, they had been compacted into very 
large sstables that were rarely compacted again. So think about compaction 
strategy and any tombstone issues you may come across.

2) Compression: Are the objects already compressed before they are stored, 
e.g. JPEGs? If so, turn compression off on the table; this reduces the amount 
of data read into memory when reading the data, reducing pressure on the heap. 
We did some trials with one system and found much better performance when the 
compression was performed on the client side. So try some tests with that.

3) How often is the data read? There will be completely different hardware 
requirements depending on whether this is an image store for an e-commerce 
site or a PDF store holding client invoices. With a small number of reads per 
object, you can specify machines with smaller CPUs and less memory but a large 
amount of storage. If there are a large number of reads, then you need to 
think much more carefully about memory and CPU, as per the Walmart article you 
referenced.
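
For point 2, one rough way to test whether stored objects are already compressed is to measure how much a sample shrinks. The sketch below is an assumption-laden illustration: `zlib` stands in for whatever compressor the table or client would actually use, and the 0.9 threshold and the function names are arbitrary choices, not values from the trials mentioned above.

```python
import zlib


def compression_ratio(sample: bytes, level: int = 6) -> float:
    """Return compressed size / original size (lower = more compressible)."""
    if not sample:
        return 1.0
    return len(zlib.compress(sample, level)) / len(sample)


def worth_compressing(sample: bytes, threshold: float = 0.9) -> bool:
    """True if the sample shrinks below the (assumed) threshold ratio."""
    return compression_ratio(sample) < threshold
```

Running `worth_compressing` over a representative set of real objects gives a first indication; if most samples come back `False`, the data is probably already compressed and table compression is unlikely to help.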

Thanks

Paul Chandler
www.redshots.com




On 19 Apr 2019, at 09:04, DuyHai Doan <doanduy...@gmail.com> wrote:

Idea:

To guarantee data integrity, you can store an MD5 of all the chunk data as a 
static column in the partition that contains the chunks.
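
A minimal sketch of that idea, assuming the digest is computed over the chunks in order and stored as a hex string; `object_md5` and `verify` are hypothetical helper names, and the read/write of the static column itself is left out.

```python
import hashlib
from typing import Iterable


def object_md5(chunks: Iterable[bytes]) -> str:
    """Hex MD5 over all chunk data, fed in chunk order."""
    digest = hashlib.md5()
    for chunk in chunks:
        digest.update(chunk)
    return digest.hexdigest()


def verify(chunks: Iterable[bytes], expected_md5: str) -> bool:
    """Compare the recomputed digest with the stored one."""
    return object_md5(chunks) == expected_md5
```

The writer would compute `object_md5` once and store it alongside the chunks; a reader recomputes it after fetching all chunks and rejects the object on mismatch. MD5 is adequate for catching accidental corruption or a missing chunk, but not for protecting against deliberate tampering.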

On Fri, Apr 19, 2019 at 9:18 AM cclive1601你 <cclive1...@gmail.com> wrote:
We have used Cassandra as an object store for some years. You can just split 
the object into small pieces. The object gets a pk, and each small piece gets 
its own pk. The object's pk and its pieces' pks can be stored in a meta table 
in Cassandra, and each piece's pk and the piece itself are stored in a data 
table. We store videos, pictures, and other unstructured data.
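
The meta table / data table layout described above can be modelled roughly like this. Two in-memory dicts stand in for the Cassandra tables; the piece size, the UUID piece keys, and the names `put_object` and `get_object` are all assumptions for illustration, not details from the setup described.

```python
import uuid
from typing import Dict, List

PIECE_SIZE = 64 * 1024  # assumed piece size; not specified above

# Stand-ins for the two tables.
meta_table: Dict[str, List[str]] = {}  # object pk -> ordered piece pks
data_table: Dict[str, bytes] = {}      # piece pk  -> piece bytes


def put_object(object_pk: str, payload: bytes) -> None:
    """Split payload into pieces, store each piece, then record the pk list."""
    piece_pks: List[str] = []
    for offset in range(0, len(payload), PIECE_SIZE):
        piece_pk = str(uuid.uuid4())
        data_table[piece_pk] = payload[offset:offset + PIECE_SIZE]
        piece_pks.append(piece_pk)
    meta_table[object_pk] = piece_pks


def get_object(object_pk: str) -> bytes:
    """Look up the piece pks in the meta table and join the pieces in order."""
    return b"".join(data_table[pk] for pk in meta_table[object_pk])
```

Writing the meta row last means a reader never sees an object whose pieces are only partly stored, at the cost of possibly leaving orphaned pieces behind if a write fails midway.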

Gene <gh5...@gmail.com> wrote on Fri, Apr 19, 2019 at 1:25 PM:
Howdy

I'm looking at the possibility of using cassandra as an object store to offload 
image/blob data from an Oracle database.  I've seen mentions of it being used 
as an object store in a large scale fashion, like with Walmart:

https://medium.com/walmartlabs/building-object-store-storing-images-in-cassandra-walmart-scale-a6b9c02af593

However I have found little on small scale setups and if it's even worth using 
Cassandra in place of something else that's meant to be used for object 
storage, like Ceph.

Additionally, I've read that cassandra struggles with storing objects 10MB or 
larger and it's recommended to break objects up into smaller chunks, which 
either requires some kind of middleware between our application and cassandra, 
or it would require our application to split objects into smaller chunks and 
recombine them as needed.

I've looked into pithos and astyanax, but those are both no longer developed 
and I'm not seeing anything that might replace them in the long term.

https://github.com/exoscale/pithos
https://github.com/Netflix/astyanax

Any helpful information or advice would be greatly appreciated.

Thanks in advance.

-Gene


--
you are the apple of my eye !

