[arangodb-google] Re: Timeline for a better memory model.

Iyobo Eki Sun, 16 Oct 2016 20:13:47 -0700

Hi Jan,

The argument of the price of memory falling is an old one, but it doesn't 
seem to translate all that much to the cloud.
Amazon's X1 instance (just the 1 TB one) costs about $2 per hour, which comes 
down to a little under $1450 per month (double this for 2 TB).


That's way over JPD's intended data-tier bootstrap budget of $80. :)
I think memory being cheap is really only a boon for those who run their 
own servers, which hardly anyone does anymore.

Anyway good job with VelocyPack and the various Arango 3+ improvements.
I can't wait for the pluggable storage engine so we can tune Arango's 
memory consumption to our needs and make the necessary sacrifices elsewhere.

I truly believe in ArangoDB. For once, a NoSQL DB and community that "gets 
it" and doesn't say dumb things to people in an attempt to sound smart, like 
"If you need transactions, you're doing it wrong!" Such a 
DB/community would really only be good for making toys and 
side projects, not real, legally binding, business-intensive software.

And even more so, I hate it when I hear people attribute these deficiencies to 
"NoSQL" as a whole, instead of assigning said shortcomings to the ACTUAL 
database that has the problem (No need to name names :D ).

It is my hope that ArangoDB becomes more popular as a whole, and I'll be 
sure to do my best to herald it among the masses within my reach.

Thanks again for the awesome work!

On Tuesday, May 10, 2016 at 7:05:18 AM UTC-4, Jan Stücke wrote:
>
> Hi JPD,
>
> First of all, from the whole ArangoDB team: thank you so much for your 
> awesome contribution to the ArangoDB community. Your support helped us grow, 
> period.
>
> We know how important the topic of memory usage is and that especially 
> bootstrapped startups suffer from tight budgets. I myself worked twice in 
> such environments so I know what you are talking about.
>
> We see three different developments which will tackle the problem of 
> memory usage of ArangoDB. (I have to say that we always compare ourselves 
> with the best solutions out there, so ArangoDB's memory usage is relatively 
> high compared to the best specialized solutions! Compared to all others we are 
> quite good.)
>
> A) *Prices for memory are falling:* This trend has held for decades and 
> we haven't seen a stopping sign yet. AWS just announced a 2TB machine which 
> will be available starting summer 2016 and we are confident that the low 
> cost standard machines will improve memory-wise as well. The performance 
> and capabilities you'll get for $80 will increase significantly 
> <https://pcpartpicker.com/trends/memory/#ram.204sodimm.ddr3_1600.2x8192>. 
> In addition there are cool new developments in the memory sphere, like 
> nv-memory, which is quite promising 
> <https://queue.acm.org/detail.cfm?id=2874238>. 
> BUT the pace of this decline/development is not fast enough, so we know that 
> we have to provide technical solutions within ArangoDB as well. 
>
> B) *Memory usage of ArangoDB improves:* We just implemented VelocyPack (
> VPack <https://github.com/arangodb/velocypack>) into our new release 
> ArangoDB 3.0. This binary storage format is even more compact than e.g. 
> MessagePack and will reduce memory usage for query results, storing 
> documents and temporarily computed values.
>
> C) *Persistent indexes and pluggable storage engine:* The problem you just 
> described is mainly caused by the memory dedicated to indices. With our 
> upcoming 3.0 release we will provide a solution for persistent indices 
> which will partly reduce memory needs. This is the first step toward our 
> pluggable storage engine, which will come with 3.x. With a pluggable storage 
> engine it's up to you whether you want to optimize for performance at the cost 
> of higher memory usage (keep everything in memory) or whether you are willing to 
> sacrifice a bit of performance and store those indices on disk.
>
> This is the versatility and flexibility we want to achieve to get closer 
> to our vision of simplifying data work.
>
> Summed up, we are on our way. The notion that we will "price ourselves out 
> of the market" is a bit far-fetched from my personal perspective. Our 
> current customer funnel tells another story.
> But either way, we know about the problem, are already tackling it, and will 
> provide performant solutions throughout 2016, starting with 3.0.
>
> Hope I could help,
>
> Jan
>
>
> On Tuesday, May 10, 2016 at 04:37:50 UTC+2, JPatrick Davenport wrote:
>>
>> Hello,
>> First, for those who don't know me, I'm a big fan of Arango. I've written 
>> a Clojure driver, travesedo <https://github.com/deusdat/travesedo>, and 
>> the only, as far as I know, Hadoop/Cascading taps in existence 
>> <https://github.com/deusdat/guacaphant>, which makes Arango a first 
>> class citizen in the "big data" world. So what follows comes from love.
>>
>> Is there a plan to move Arango to a more efficient memory model than 
>> essentially mapped files? I've seen multi-GB databases work fine in MySQL 
>> with just 1 GB of RAM. I don't think that Arango would do as well under 
>> these tight, bootstrapped requirements. I've personally watched a mere 700 
>> GB collection bring Arango to its knees if the whole server only has 1 GB 
>> of RAM.
>>
>> The reason I ask is because I think there's a large market of small 
>> projects/bootstrapped startups out there that could really use an ArangoDB 
>> type store. Money is drying up in Silicon Valley. This means that single 
>> node, cloud-based systems of 96 GB are going away. We're going to return to 
>> micro systems on various cloud providers. ArangoDB is pricing itself out of 
>> the market.
>>
>> I appreciate the work on the shapes storage for the data. I 
>> understand how it can condense documents much better than MongoDB. At the 
>> same time, having to cache collections entirely in memory 
>> makes the system difficult to run. The non-binary storage makes 
>> it terribly inefficient, as shown by ArangoDB's own benchmarks.
>>
>> Is there an effort to support partial collection loads (or other 
>> optimizations)? 
>>
>> Postgres does really well with the inefficient JSON format used by Arango 
>> in the comparison tests. It could do better in a Relational v Document 
>> showdown too provided the proper indexes are in Postgres.
>>
>> How does ArangoDB's dev team view these issues? 
>>
>> I ask because I'm at a crossroads now. I can only afford about $80/month 
>> for my data tier. I can get a MySQL or Postgres system that would hum along 
>> for years in single-node mode. ArangoDB concerns me because I could quickly 
>> need multiple systems just to happily process a few GB.
>>
>> Thanks,
>> JPD
>>
>

