Re: [HACKERS] Opportunity for a Radical Changes in Database Software

2007-10-31 Thread J. Andrew Rogers


On Oct 28, 2007, at 2:54 PM, Josh Berkus wrote:
 I'd actually be curious what incremental changes you could see making to
 PostgreSQL for better in-memory operation.  Ideas?



It would be difficult to make PostgreSQL really competitive for
in-memory operation, primarily because a contrary assumption pervades
the entire design; you would need to rip out a lot of its guts.  I was
not even suggesting that adapting PostgreSQL to in-memory operation
would be a good idea or trivial, but since I am at least somewhat
familiar with the research I thought I'd offer a useful link detailing
the kinds of considerations involved.  That said, I have seriously
considered the idea, since I have a major project that requires that
kind of capability and there is some utility in reusing parts of
PostgreSQL if possible, particularly since it was used to prototype the
project.  In my specific case I also need to shoehorn in a new type of
access method for which there is no conceptual support, so it will
probably be easier to build a (mostly) new database engine altogether.


Personally, if I were designing a distributed in-memory database, I
would use a somewhat more conservative set of assumptions than
Stonebraker does, so that it would have more general applicability.
For example, his assumption of extremely short CPU time per transaction
(1 millisecond) is not even valid for some types of OLTP loads, never
mind the numerous uses that are not strictly OLTP-like but are
nonetheless built on relatively short transactions; in the Stonebraker
design that much latency would be a pathology.  Unfortunately, if you
remove that assumption, the design starts to unravel noticeably.
Nonetheless, there are other viable design paths that, while not
over-fitted to OLTP, could still offer large gains.
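
As a rough illustration of why that single assumption carries so much
weight (the numbers below are made up, not taken from the paper): in a
design that executes transactions serially on a single-threaded
partition, per-transaction CPU time is the entire throughput budget,
and one long transaction stalls everything queued behind it.

    # Back-of-envelope sketch with hypothetical numbers, not measurements.
    def partition_tps(cpu_seconds_per_txn):
        # A single-threaded partition running one transaction at a time
        # can do at most this many transactions per second.
        return 1.0 / cpu_seconds_per_txn

    print(partition_tps(0.001))   # 1 ms/txn  -> ~1,000 tps per partition
    print(partition_tps(0.050))   # 50 ms/txn ->    ~20 tps per partition
    # Worse, under strictly serial execution that one 50 ms transaction
    # also adds 50 ms of latency to everything queued behind it.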


I think the market is ripe for a well-designed distributed, in-memory
database, but making incremental changes to a solid disk-based engine
would mean starting with an architecture that is inferior for the
purpose and hard to get away from.  It seems short-term expedient but
long-term bad engineering -- think MySQL.



Cheers,

J. Andrew Rogers




Re: [HACKERS] Opportunity for a Radical Changes in Database Software

2007-10-28 Thread Josh Berkus
J.,

I'd actually be curious what incremental changes you could see making to 
PostgreSQL for better in-memory operation.  Ideas?

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco



Re: [HACKERS] Opportunity for a Radical Changes in Database Software

2007-10-27 Thread Florian Weimer
* J. Andrew Rogers:

 Everything you are looking for is here:

 http://web.mit.edu/dna/www/vldb07hstore.pdf

 It is the latest Stonebraker et al. paper on massively distributed
 in-memory OLTP architectures.

"Ruby-on-Rails compiles into standard JDBC, but hides all the complexity
of that interface. Hence, H-Store plans to move from C++ to
Ruby-on-Rails as our stored procedure language."  This reads a bit
strange.



Re: [HACKERS] Opportunity for a Radical Changes in Database Software

2007-10-27 Thread J. Andrew Rogers


On Oct 27, 2007, at 2:20 PM, Florian Weimer wrote:

 * J. Andrew Rogers:

  Everything you are looking for is here:

  http://web.mit.edu/dna/www/vldb07hstore.pdf

  It is the latest Stonebraker et al. paper on massively distributed
  in-memory OLTP architectures.

 "Ruby-on-Rails compiles into standard JDBC, but hides all the complexity
 of that interface. Hence, H-Store plans to move from C++ to
 Ruby-on-Rails as our stored procedure language."  This reads a bit
 strange.



Yeah, that's a bit of a "WTF?".  Okay, a giant "WTF?".  I could see
using Ruby as a stored procedure language, but Ruby-on-Rails seems like
an exercise in buzzword compliance.  And Ruby is just about the slowest
language in its class, which, given the rest of the paper (serializing
all transactions, doing all transactions strictly in memory), means
that you would be bottlenecking your database node on the procedural
language rather than on the usual I/O considerations.
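
To put a rough number on that (purely illustrative; nothing here is
measured against Ruby or H-Store): with no disk or network waits to
hide behind, runtime overhead in the procedural language converts
directly into lost transactions per second.

    # Illustrative only: hypothetical per-transaction CPU costs.
    def node_tps(base_cpu_seconds_per_txn, language_overhead_factor):
        # One transaction at a time, entirely in memory: the procedural
        # language is essentially the only cost there is.
        return 1.0 / (base_cpu_seconds_per_txn * language_overhead_factor)

    print(node_tps(0.0002, 1))    # compiled procedure, 0.2 ms/txn -> ~5,000 tps
    print(node_tps(0.0002, 20))   # hypothetical 20x slower runtime ->  ~250 tps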


Most of the architectural stuff made a considerable amount of sense,
though I had quibbles with bits of it (I think the long history of the
design makes some decisions look silly in a world that is now
multi-core by default).  The Ruby-on-Rails part is obviously fungible.
Nonetheless, it is a good starting point for massively distributed
in-memory OLTP architectures and a good analysis of many aspects of
database design from that perspective; at least, I have not really seen
anything better.  Personally, I prefer a slightly more conservative
approach that generalizes better in that space than what is suggested.


Cheers,

J. Andrew Rogers




Re: [HACKERS] Opportunity for a Radical Changes in Database Software

2007-10-25 Thread Jonah H. Harris
I'd suggest looking at the source code of several of the in-memory
databases that already exist.

On 10/25/07, Dan [EMAIL PROTECTED] wrote:
 Hi

 In looking at current developments in computers, it seems we're nearing
 a point where a fundamental change may be possible in databases...
 Namely, in-memory databases, which could lead to huge performance
 improvements.

 A good starting point is to look at memcached, since it provides proof
 that it's possible to interconnect hundreds of machines into a huge
 memory cluster, albeit with some issues around reliability.

 For more info on memcached, try:
 http://www.socialtext.net/memcached/index.cgi?faq

 The sites that use it see incredible performance increases, but often at
 the cost of not being able to provide versioned results that are
 guaranteed to be accurate.
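
 As a rough sketch of the usual pattern (hypothetical client code below,
 not memcached's actual API -- a plain dict stands in for both the cache
 and the database): reads hit the cache first and fall back to the
 database, and because invalidation is best-effort rather than
 transactional, a reader can be handed a value the database has already
 replaced.

     cache = {}       # stand-in for the memcached cluster
     database = {}    # stand-in for the authoritative store

     def read_user(user_id):
         row = cache.get(user_id)
         if row is None:                 # miss: fall back to the database
             row = database.get(user_id)
             cache[user_id] = row
         return row                      # may be stale if the database moved on

     def write_user(user_id, row):
         database[user_id] = row
         cache.pop(user_id, None)        # best-effort invalidation; a racing
                                         # reader can still re-cache the value
                                         # it read just before this write landed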

 The big question, then, is: how would you create a distributed
 in-memory database?


 Another idea that may be workable

 Everyone knows the main problem with a standard cluster is that every
 machine has to perform every write, which leads to diminishing returns
 as the writes consume more and more of every machine's resources.
 Would it be possible to create a clustered environment where the master
 is the only machine that writes the data to disk, while the others just
 use cached data?  Or perhaps it would work better if the master (or the
 master log entry) moved from machine to machine, with a commit
 coinciding with a disk write on each machine?
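
 A rough way to see the diminishing returns (hypothetical numbers
 below): every replica applies the full write stream, so writes take
 the same fixed bite out of every machine, and only the leftover
 capacity scales with the size of the cluster.

     def cluster_read_capacity(nodes, node_ops_per_sec, write_ops_per_sec):
         # Each node spends write_ops_per_sec of its budget replaying writes;
         # only what is left over can serve reads.
         leftover = max(node_ops_per_sec - write_ops_per_sec, 0)
         return nodes * leftover

     for n in (1, 2, 4, 8):
         print(n, cluster_read_capacity(n, 10000, 8000))
     # With writes at 8,000 ops/sec against 10,000 ops/sec nodes, each new
     # machine adds only 2,000 ops/sec of read headroom -- and if the write
     # stream ever reaches 10,000 ops/sec, adding machines adds nothing.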

 Any other ideas?  It seems to be a problem worth pondering since
 in-memory databases are possible.

 Thanks

 Dan






-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/



Re: [HACKERS] Opportunity for a Radical Changes in Database Software

2007-10-25 Thread Martijn van Oosterhout
On Thu, Oct 25, 2007 at 08:05:24AM -0700, Dan wrote:
 In looking at current developments in computers, it seems we're nearing
 a point where a fundamental change may be possible in databases...
 Namely in-memory databases which could lead to huge performance
 improvements.

I think there are a number of challenges in this area. Higher-end
machines are tending towards NUMA architectures, where PostgreSQL's
single buffer pool becomes a liability. In some situations you might
want a smaller per-processor pool and an explicit copy to grab buffers
from processes on other CPUs.
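
Something like the sketch below, which is hypothetical structure only
and not PostgreSQL's actual shared-buffers code: each NUMA node gets
its own small pool, the node-local lookup is the fast path, and pulling
a buffer cached on another node is an explicit copy rather than a
shared access to far-away memory.

    # Hypothetical sketch of per-NUMA-node buffer pools; real code would pin
    # each pool's memory to its node and add locking and eviction.
    class NodeLocalPools:
        def __init__(self, numa_nodes):
            self.pools = {n: {} for n in range(numa_nodes)}  # node -> {page_id: bytearray}

        def get_page(self, local_node, page_id):
            page = self.pools[local_node].get(page_id)
            if page is not None:
                return page                              # fast path: node-local memory
            for node, pool in self.pools.items():        # slow path: check other nodes
                if node != local_node and page_id in pool:
                    copied = bytearray(pool[page_id])    # explicit cross-node copy
                    self.pools[local_node][page_id] = copied
                    return copied
            return None                                  # cached nowhere: caller reads from disk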

I think reliability becomes the real issue, though: you can always
produce the wrong answer instantly; the trick is to get the right
one...

Have a nice day,
-- 
Martijn van Oosterhout   [EMAIL PROTECTED]   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.




Re: [HACKERS] Opportunity for a Radical Changes in Database Software

2007-10-25 Thread J. Andrew Rogers


On Oct 25, 2007, at 8:05 AM, Dan wrote:
 In looking at current developments in computers, it seems we're nearing
 a point where a fundamental change may be possible in databases...
 Namely, in-memory databases, which could lead to huge performance
 improvements.
 ...
 The sites that use it see incredible performance increases, but often
 at the cost of not being able to provide versioned results that are
 guaranteed to be accurate.

 The big question, then, is: how would you create a distributed
 in-memory database?



Everything you are looking for is here:

http://web.mit.edu/dna/www/vldb07hstore.pdf

It is the latest Stonebraker et al. paper on massively distributed
in-memory OLTP architectures.



J. Andrew Rogers

