To be as objective as possible:
Product vendors:
- DataStax
- Stratio
Infrastructure / Database as a Service:
- Instaclustr
- CosmosDB on Azure
Container Orchestration:
- Mesosphere (DC/OS creator) has limited support for “certified” Cassandra and DSE containers on Mesos
Disclosure: our firm is a DataStax
Team,
I need to make a decision between MongoDB and Cassandra for loading CSV file
data, and for storing the CSV files as well. If any of you has done such a
study in the last couple of months, please share your analysis or observations.
Regards,
Sudhakar
Hi Sudhakar!
Each one has different goals, which means that they are complementary.
Could you share more details about the use case so we can give you better advice?
On Thu, May 31, 2018 at 5:50 AM, Sudhakar Ganesan
wrote:
> Team,
>
>
>
> I need to make a decision on Mongo DB vs Cas
Sudhakar,
MongoDB will accommodate loading CSV without regard to schema while
still creating identifiable "columns" in the database, but you'll have
to predict or back-impose some schema later if you're going to create
indices for fast searching of the data. You can perform searching of
data
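The schema-free loading described above can be sketched in a few lines: each CSV row becomes a plain dict ("document") with no schema declared up front. This is a minimal illustration; the field names and sample data are invented here, and the actual MongoDB insert (via pymongo) is only indicated in a comment since it needs a live server.

```python
import csv
import io

# Illustrative sample standing in for a machine-generated CSV file;
# field names are assumptions, not from the thread.
sample = io.StringIO("machine_id,temp,ts\nm-01,72.5,2018-05-31T09:00:00\n")

# DictReader imposes no schema: every row is just a dict keyed by the header.
docs = list(csv.DictReader(sample))
print(docs[0]["machine_id"])  # -> m-01

# Against a real MongoDB instance, the load step would look roughly like
# (pymongo assumed, not executed here):
#   from pymongo import MongoClient
#   MongoClient().mydb.readings.insert_many(docs)
```

Note that every value arrives as a string; any typing (floats, timestamps) is exactly the "back-imposed schema" the message above warns you will need for indexing.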
Hi,
I've deleted 50% of my data row by row; now the disk usage of the Cassandra
data directory is more than 80%.
The gc_grace of the table was the default (10 days); I have now set it to 0.
Many compactions have finished since, but no space has been reclaimed so far.
How can I force deletion of tombstones in sstables and reclaim the disk space?
Hi,
You need to manually force a compaction if you do not mind ending up with one
big sstable (nodetool compact)
On 31 May 2018 at 11:07, onmstester onmstester wrote:
> Hi,
> I've deleted 50% of my data row by row now disk usage of cassandra data is
> more than 80%.
> The gc_grace of table was de
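The manual compaction suggested above is a single nodetool invocation. As a sketch, here is a tiny Python wrapper that builds and (optionally) runs that command; the keyspace and table names "ks"/"events" are placeholders, and the dry-run default lets it be shown without a live cluster.

```python
import subprocess

def major_compaction_cmd(keyspace, table):
    # nodetool compact performs a major compaction, merging all sstables of
    # the table into one big sstable, which lets tombstones past
    # gc_grace_seconds be purged and disk space reclaimed.
    return ["nodetool", "compact", keyspace, table]

def run(cmd, dry_run=True):
    # dry_run=True just returns the command line instead of executing it,
    # since running nodetool requires a live Cassandra node.
    if dry_run:
        return " ".join(cmd)
    return subprocess.run(cmd, check=True)

print(run(major_compaction_cmd("ks", "events")))  # -> nodetool compact ks events
```

The trade-off named in the reply applies: after a major compaction you are left with one large sstable, which normal size-tiered compaction will be slow to touch again.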
At a high level: on the production line, machines provide data in the form of
CSV files every 1 second to 1 minute to 1 day (depending on the machine type
used in the line operations). I need to parse those files, load them into the
DB, and build an API layer to expose the data to downstream systems.
Number of
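The parse-and-load step of the pipeline described above can be sketched as a small generator that walks a drop directory and yields CSV rows for the DB loader. The directory layout, file naming, and field names are all assumptions made for illustration.

```python
import csv
import pathlib
import tempfile

def parse_csv_files(directory):
    # Walk the drop folder the machines write into and emit one dict per
    # CSV row; the DB load / API layer would consume these rows downstream.
    for path in sorted(pathlib.Path(directory).glob("*.csv")):
        with path.open(newline="") as f:
            yield from csv.DictReader(f)

# Demo with a temporary directory standing in for the machines' drop folder.
with tempfile.TemporaryDirectory() as d:
    (pathlib.Path(d) / "m01.csv").write_text("machine,value\nm01,42\n")
    rows = list(parse_csv_files(d))
    print(rows)  # -> [{'machine': 'm01', 'value': '42'}]
```

At the volumes discussed later in the thread, a single-process loop like this would of course be replaced by something distributed (e.g. Spark), but the shape of the work per file is the same.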
If you are starting with a modest amount of data (e.g. under 0.25 PB) and do
not have extremely high availability requirements, then it is easier to
start with MongoDB, avoiding HA clusters. I would suggest you start with
MongoDB. Both are great, but C* scales far beyond MongoDB FOR A GIVEN LEVEL
OF
277 TB/day seems like the type of task I'd not trust to random mailing list
advice.
Cassandra can do that, but it's nontrivial. MongoDB may be able to do it,
too (not sure). A lot of it will depend on how you're trying to query the
data.
On Thu, May 31, 2018 at 9:00 AM, Sudhakar Ganesan <
sudha
I haven’t seen any query requirements, which is going to be the thing that
makes Cassandra difficult.
If you can’t define your queries beforehand, Cassandra is a no-go. If you
just want to store data somewhere, and it’s just CSV, I’d go with a simple
blob store like S3 and pick a DB later when you
Hello,
It's a very common but somewhat complex topic. We wrote about it 2 years
ago and I really think this post might have answers you are looking for:
http://thelastpickle.com/blog/2016/07/27/about-deletes-and-tombstones.html
Something that you could try (if you do care about ending up with one big s
Based on the metrics you describe, I think the big data architecture could be
Cassandra with Spark. You mention high availability; the APIs could use
Node.js. This combination is powerful; the challenge is in the data model.
On the other hand, if you are willing to sacrifice high availability and
slow r
Hello Apache Supporters and Enthusiasts
This is a reminder that our Apache EU Roadshow in Berlin is less than
two weeks away and we need your help to spread the word. Please let your
work colleagues, friends and anyone interested in attending know
about our Apache EU Roadshow event.
We h