Aaron,

    Thank you very much for the answers! They helped me a lot!
    I would just like a bit more clarification on the points below, if
you allow me:


   - You can query your data using Hadoop easily enough. You may want to take
   a look at DSE from  http://datastax.com/ it makes using Hadoop and Solr
   with Cassandra easier.

So if I use the community edition for now, I wouldn't be able to run
Hadoop against data stored in CFS? We are considering the enterprise
edition here, but the best scenario would be to use it only when really
needed. Would writes to HDFS be as quick as writes to Cassandra?


   - It depends on how many moving parts you are comfortable with. Same for
   the questions about HDFS etc. Start with the smallest amount of
   infrastructure.

Sorry, I didn't really understand this part. I am not sure what you meant,
but the question was about using NoSQL instead of a relational
database in this case. If learning NoSQL is not a problem, would I have
any advantage in using Cassandra instead of HBase? If everything in my model
fits into a relational database and my data is structured, would it still
be a good idea to use Cassandra? Why?


Thanks,
Marcelo.

2012/9/18 aaron morton <aa...@thelastpickle.com>

> Also, I saw a presentation which said that if I don't have rows with more
> than a hundred rows in Cassandra, either I am doing something wrong or I
> shouldn't be using Cassandra.
>
> I do not agree with that statement. (I read that as rows with more than a
> hundred _columns_)
>
>
>    - I need to support a high volume of writes per second. I might have a
>    billion writes per hour
>
> That's about 280K/sec. Netflix did a benchmark that shows 1.1M/sec
> http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
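
The conversion above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the two input figures come from this thread, the variable names are made up for illustration, and the ratio says nothing about what any particular cluster will sustain.

```python
# Back-of-the-envelope conversion of the write volume quoted above.
WRITES_PER_HOUR = 1_000_000_000  # "a billion writes per hour" from the thread
SECONDS_PER_HOUR = 60 * 60

writes_per_second = WRITES_PER_HOUR / SECONDS_PER_HOUR
print(f"{writes_per_second:,.0f} writes/sec")  # just under 280K/sec

# Benchmark figure cited in the thread (1.1M writes/sec).
BENCHMARK_WRITES_PER_SECOND = 1_100_000

# How many times larger the cited benchmark rate is than the required rate.
headroom = BENCHMARK_WRITES_PER_SECOND / writes_per_second
print(f"benchmark / required = {headroom:.2f}x")
```
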
>
>
>    - I need to write non-structured data that will be processed later by
>    hadoop processes to generate structured data from it. Later, I index the
>    structured data using SOLR or SOLANDRA, so the data can be consulted by my
>    end user application. Is Cassandra recommended for that, or should I be
>    thinking of writing directly to HDFS files, for instance? What's the main
>    advantage I get from storing data in a nosql service like Cassandra, when
>    compared to storing files into HDFS?
>
> You can query your data using Hadoop easily enough. You may want to take a
> look at DSE from  http://datastax.com/ it makes using Hadoop and Solr
> with Cassandra easier.
>
>
>    - If I don't need to perform complicated queries in Cassandra, should
>    I store the json-like data just as a column value? I am afraid of doing
>    something wrong here, as I would just need to store the json file and
>    5 or 6 more fields to query the files later.
>
> Store the data in the way that best supports the read queries you want to
> make. If you always read all the fields, or it's a canonical record of
> events, storing as JSON may be best. If you often get a few fields, and
> maybe they are updated, storing each field as a column value may be best.
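
To make the trade-off above concrete, here is a minimal sketch of the two layouts. It assumes nothing about any Cassandra client: plain Python dicts stand in for a row, and the record, its field names, and the helper functions are all hypothetical.

```python
import json

# Hypothetical record; the field names are invented for illustration.
event = {"id": "evt-1", "source": "web", "status": "new", "payload": {"a": 1}}

# Layout A: a single column holding the whole document as a JSON string.
row_as_blob = {"body": json.dumps(event)}

# Layout B: one column per top-level field, each value serialized on its own.
row_as_columns = {name: json.dumps(value) for name, value in event.items()}


def read_field_from_blob(row, field):
    # The whole document must be fetched and parsed to get one field.
    return json.loads(row["body"])[field]


def read_field_from_columns(row, field):
    # Only the requested column is touched.
    return json.loads(row[field])


def update_field_in_blob(row, field, value):
    # Read-modify-write of the entire document.
    doc = json.loads(row["body"])
    doc[field] = value
    row["body"] = json.dumps(doc)


def update_field_in_columns(row, field, value):
    # Overwrites a single column; the other columns are left alone.
    row[field] = json.dumps(value)


update_field_in_blob(row_as_blob, "status", "processed")
update_field_in_columns(row_as_columns, "status", "processed")
```

Layout A suits records that are always read whole. Layout B suits records where a few fields are read or updated on their own.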
>
>
>    - Does it make sense to you to use hadoop to process data from
>    Cassandra and store the results in a database, like HBase? Once I have
>    structured data, is there any reason I should use Cassandra instead of
>    HBase?
>
> It depends on how many moving parts you are comfortable with. Same for the
> questions about HDFS etc. Start with the smallest amount of infrastructure.
>
> Hope that helps.
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/09/2012, at 10:28 AM, Marcelo Elias Del Valle <mvall...@gmail.com>
> wrote:
>
> Hello,
>
>      I am new to Cassandra and I am unsure whether Cassandra is the right
> technology to use in the architecture I am defining. Also, I saw a
> presentation which said that if I don't have rows with more than a hundred
> rows in Cassandra, either I am doing something wrong or I shouldn't be
> using Cassandra. Therefore, it might be the case that I am doing something
> wrong. If you could help me find the answers to these questions by
> giving any feedback, it would be highly appreciated.
>      Here is my need and what I am thinking of using Cassandra for:
>
>    - I need to support a high volume of writes per second. I might have a
>    billion writes per hour
>    - I need to write non-structured data that will be processed later by
>    hadoop processes to generate structured data from it. Later, I index the
>    structured data using SOLR or SOLANDRA, so the data can be consulted by my
>    end user application. Is Cassandra recommended for that, or should I be
>    thinking of writing directly to HDFS files, for instance? What's the main
>    advantage I get from storing data in a nosql service like Cassandra, when
>    compared to storing files into HDFS?
>    - Usually I will write json data associated with an ID and my hadoop
>    processes will process this data to write data to a database. I have two
>    questions here:
>       - If I don't need to perform complicated queries in Cassandra,
>       should I store the json-like data just as a column value? I am afraid of
>       doing something wrong here, as I would just need to store the json file
>       and 5 or 6 more fields to query the files later.
>       - Does it make sense to you to use hadoop to process data from
>       Cassandra and store the results in a database, like HBase? Once I have
>       structured data, is there any reason I should use Cassandra instead of
>       HBase?
>
>      I am sorry if the questions are too basic. I have been watching a lot
> of videos and reading a lot of documentation about Cassandra, but honestly,
> the more I read, the more questions I have.
>
> Thanks in advance.
>
> Best regards,
> --
> Marcelo Elias Del Valle
> http://mvalle.com - @mvallebr
>
>
>


-- 
Marcelo Elias Del Valle
http://mvalle.com - @mvallebr
