Hi Craig,

On Sun, Jan 26, 2014 at 5:47 AM, Craig Ringer <cr...@2ndquadrant.com> wrote:
> On 01/21/2014 07:43 PM, Christian Convey wrote:
> > Hi all,
> >
> > I'm playing around with Postgres, and I thought it might be fun to
> > experiment with alternative formats for relation blocks, to see if I can
> > get smaller files and/or faster server performance.
>
> It's not clear how you'd do this without massively rewriting the guts of
> Pg.
>
> Per the docs on internal structure, Pg has a block header, then tuples
> within the blocks, each with a tuple header and list of Datum values for
> the tuple. Each Datum has a generic Datum header (handling varlena vs
> fixed length values etc) then a type-specific on-disk representation
> controlled by the type output function for that type.

I'm still getting familiar with the pg backend code, so I don't have a
concrete plan yet. However, I'm working on the assumption that some set of
macros and functions encapsulates the page layout.

If/when I tackle this, I expect to add a layer of indirection somewhere
around that boundary, so that some non-catalog tables, whose schemas meet
certain simplifying assumptions, are read and modified using specialized
code.

I'd rather not get into the specific optimizations I'd like to try yet. I
haven't fully studied the code, and I don't want to put my foot in my
mouth.

> What concrete problem do you mean to tackle? What idea do you want to
> explore or implement?

My real motivation is that I'd like to get more familiar with the pg
backend codebase, and tilting at this windmill seemed like an interesting
way to accomplish that.

If I were focused on solving a real-world problem, I'd say that this lays
the groundwork for table-schema-specific storage optimizations and
optimized record-filtering code. But I'd only make that argument if I
planned to (a) perform a careful study with statistically significant
benchmarks, and/or (b) produce a merge-worthy patch. At this point I have
no intention of doing either. My main goal really is just to have fun with
the code.

> > Does anyone know if this has been done before with Postgres? I would
> > have assumed yes, but I'm not finding anything in Google about people
> > having done this.
>
> AFAIK (and I don't know much in this area) the storage manager isn't
> very pluggable compared to the rest of Pg.

Thanks for the warning. Duly noted.

Kind regards,
Christian