[ 
https://issues.apache.org/jira/browse/CASSANDRA-11542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefania updated CASSANDRA-11542:
---------------------------------
    Description: 
I propose creating a benchmark to compare the bulk read performance of 
Cassandra and HDFS. Simple Spark queries will be run against data stored in 
HDFS or Cassandra, and the end-to-end duration of each query will be measured. 
An example query would be the max or min of a column, or a count(*).

The benchmark should make it possible to determine the impact of:
* partition size
* number of clustering columns
* number of value columns (cells)
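
As a rough illustration of the measurement described above, the following is a minimal sketch of a timing harness, not the benchmark itself. All names (time_query, parameter_grid, run_benchmark) are hypothetical, and the stand-in lambdas in the usage note take the place of real Spark jobs, which are assumed to be wrapped in zero-argument callables.

```python
import itertools
import time
from typing import Callable, Dict, List, Sequence

# Hypothetical type aliases: a query is any zero-argument callable, and a
# query factory builds one from a parameter combination.
Query = Callable[[], object]
QueryFactory = Callable[[Dict[str, int]], Query]


def time_query(run_query: Query) -> float:
    """Return the wall-clock seconds taken by one complete run of run_query."""
    start = time.perf_counter()
    run_query()
    return time.perf_counter() - start


def parameter_grid(
    partition_sizes: Sequence[int],
    clustering_columns: Sequence[int],
    value_columns: Sequence[int],
) -> List[Dict[str, int]]:
    """Build every combination of the three factors the benchmark varies."""
    return [
        {"partition_size": p, "clustering_columns": c, "value_columns": v}
        for p, c, v in itertools.product(
            partition_sizes, clustering_columns, value_columns
        )
    ]


def run_benchmark(
    sources: Dict[str, QueryFactory], grid: List[Dict[str, int]]
) -> List[Dict[str, object]]:
    """Time one query per data source for each parameter combination."""
    results: List[Dict[str, object]] = []
    for params in grid:
        for name, make_query in sources.items():
            seconds = time_query(make_query(params))
            results.append({"source": name, "seconds": seconds, **params})
    return results
```

A possible usage, with placeholder workloads where the Spark queries on HDFS and Cassandra would go:

```python
sources = {
    "hdfs": lambda p: (lambda: sum(range(p["partition_size"]))),
    "cassandra": lambda p: (lambda: max(range(p["partition_size"]))),
}
for row in run_benchmark(sources, parameter_grid([1000, 100000], [1, 3], [1, 10])):
    print(row)
```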


  was:
I propose creating a benchmark for comparing Cassandra and HDFS bulk reading 
performance. Data will be imported into Spark to perform very simple queries 
and the entire duration will be measured. An example query would be the max or 
min of a column or a count(*).

This benchmark should allow determining the impact of:
* partition size
* number of clustering columns
* number of value columns (cells)



> Create a benchmark to compare HDFS and Cassandra bulk read times
> ----------------------------------------------------------------
>
>                 Key: CASSANDRA-11542
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11542
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: Testing
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 3.x
>
>
> I propose creating a benchmark to compare the bulk read performance of 
> Cassandra and HDFS. Simple Spark queries will be run against data stored in 
> HDFS or Cassandra, and the end-to-end duration of each query will be 
> measured. An example query would be the max or min of a column, or a count(*).
> The benchmark should make it possible to determine the impact of:
> * partition size
> * number of clustering columns
> * number of value columns (cells)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
