---
Jeremiah Peschka
Managing Director, Brent Ozar PLF, LLC

On Fri, Feb 24, 2012 at 5:36 AM, gibraltar <[email protected]> wrote:

> Hello,
>
> I'm the founder of a very small startup (a.k.a. a one-man show). Although
> I have rolled out many critical pieces of software to production before, I
> was never responsible for the database or the systems they depended on.
> Now I'm responsible for those layers as well. Unlike many on this list, I
> do not have any scalability problems or huge data to deal with, at least
> not yet, though hopefully one day very soon. (It will have a social aspect
> as well, but later.) However, easy horizontal replication allows me to
> deploy a couple of Riak servers on a hosting site (an environment I have
> no control over at all) so that I will have (relative) peace of mind if I
> lose one of the servers somehow. I am trying to relieve myself of as much
> admin work as I can by setting up the environment with the correct choice
> of data storage. This is the reason I have been considering Riak
> seriously: since my admin experience with any traditional database is
> limited, I am at the same distance from SQL and NoSQL. I have already
> developed my solution almost to completion with Riak, but I am now having
> doubts.
>
> 1) What is the bare minimum number of Riak servers needed to have that
> security? (The sky is the limit for redundancy, but I will limit myself to
> one site with multiple servers for now.)
>

You should have a minimum of 4 servers. When you increase the number of
servers, it's best practice to increase by powers of 2. You'll need to be
careful with your default ring size, though - see
https://wiki.basho.com/Cluster-Capacity-Planning.html#Ring-Size-Number-of-Partitions
for more info. Planning this initial ring size is critical for long-term
success - otherwise you'll end up migrating to a new ring (although this
isn't the worst thing in the world).
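
To make the ring size arithmetic concrete, here is a rough sketch in Python.
It is not taken from the Basho docs. It assumes the ring size must be a power
of 2 and that 64 is the default; the floor of 10 partitions per node is a
planning threshold I picked for illustration, and the function names are
hypothetical.

```python
def is_power_of_two(n: int) -> bool:
    """Return True if n is a positive power of 2."""
    return n > 0 and (n & (n - 1)) == 0


def partitions_per_node(ring_size: int, node_count: int) -> float:
    """Average number of ring partitions each node would own."""
    if not is_power_of_two(ring_size):
        raise ValueError("ring size is assumed to be a power of 2")
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return ring_size / node_count


def max_nodes_for_ring(ring_size: int, min_partitions_per_node: int = 10) -> int:
    """Largest cluster that keeps every node at or above the assumed floor.

    The default floor of 10 is an illustrative assumption, not a Riak setting.
    """
    if not is_power_of_two(ring_size):
        raise ValueError("ring size is assumed to be a power of 2")
    return ring_size // min_partitions_per_node


if __name__ == "__main__":
    for ring_size in (64, 128, 256):
        print(
            ring_size,
            partitions_per_node(ring_size, 4),
            max_nodes_for_ring(ring_size),
        )
```

The point is only that a ring that looks roomy at 4 nodes can run out of
partitions per node once the cluster grows, which is why the ring size is
worth settling before you load data.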



>
> 2) Would anyone on the list recommend setting up MySQL or PostgreSQL
> instead of a NoSQL store, given the current small size of my venture? Am I
> optimizing prematurely? Again, I am looking for security, ease of
> administration, and a future-proof architecture. I am losing a lot (SQL),
> but that's OK. I do need transactions here and there, but I mitigated that
> by storing everything in one object for atomicity.
>

This depends entirely on your querying structure. If you run a lot of ad
hoc queries or reports, require referential integrity or ACID properties,
or build complex aggregations, then MySQL or PostgreSQL is going to offer
you a lot. If you're mainly doing primary key look-ups, then Riak is a
better choice. An RDBMS is a tried and true solution (especially
PostgreSQL).

If your application doesn't need any RDBMS style features, then Riak is a
good fit.

I can tell you that a knowledgeable expert can make an RDBMS scale to
stupefying levels - both in terms of utilization and raw data stored. It's
a question of how and where you want to scale. Riak makes this easy. Need
more performance? Add a node. Need more storage? Add a node.


>
> 3) I read that people on this list mostly store secondary/derivative data
> or data that they can load again. Is Riak mature enough to store
> critical/primary data? A chain is only as strong as its weakest link. So
> in Riak's case, what is that weakest link? The backend storage choice?
> Bitcask? LevelDB? Or riak-js? (I need 2i, so my only option is LevelDB.)
> Is Riak (and NoSQL in general) meant for storing (lots of) secondary data
> and processing it? I will be storing data that the customers of other
> institutions will generate. Although it does not involve money, it does
> represent monetary transactions.
>

I think some folks on this list store primary data in Riak, too. In Riak's
case, the weakest link is going to be your storage subsystem. Storage is
the bottleneck in most data-intensive applications, and Riak is no
exception.

Your ability to manipulate, process, and aggregate data is limited only by
your imagination. Riak is a solid data store. I don't think I'd go running
multi-hour MapReduce queries across a Riak cluster, but there aren't really
a lot of situations I can think of where you'd be running gigantic MR jobs.


>
> 4) I tried writing 500K objects to both PostgreSQL and Riak. I used
> Node.js with riak-js (HTTP). My experience is that a) PostgreSQL stores
> faster than Riak, and b) the Riak interface sometimes reports EADDRINUSE
> (see below) and some other errors at random. The PostgreSQL test did not
> generate even one error. This blemished my sense of security about Riak a
> bit, but it could be the library stack I used. I wonder whether this
> experience is unique to me, or whether anyone else on the list has had a
> similar experience. (This could also be due to Node.js, so I am not sure
> exactly where to put the blame.)
>
> { [Error: connect EADDRINUSE] code: 'EADDRINUSE', errno: 'EADDRINUSE',
> syscall: 'connect' }
>
> In brief, I am trying to understand where Riak will likely fail me in
> production, if it does.
>

If you're running all of your Riak tests on one node, then Riak may very
well be slower. Riak's performance and stability benefits come from having
multiple nodes running at the same time.

I have no idea about your specific error message, though.


>
> Sorry for the long post and questions. I would appreciate being able to
> pick your brain.
>
> Thanks,
> Gibraltar
>
>
> _______________________________________________
> riak-users mailing list
> [email protected]
> http://lists.basho.com/mailman/listinfo/riak-users_lists.basho.com
>
>