Re: Postgres Architecture

2023-10-16 Thread Jonah H. Harris
On Mon, Oct 16, 2023 at 6:42 AM Timothy Nelson 
wrote:

> I'm expecting that people will pick the idea apart, and wanted to know
> what people think of it.
>

Thanks for the proposal. This is actually a model that's been around for a
very long time. And, in fact, variations of it (e.g. parsing done in one
place and generated plan fragments shipped to remote execution nodes where
the data resides) are already used by things like Postgres-XL. There have
also been a number of academic implementations where parsing is done
locally and raw parse trees are sent to the server. While these
things do reduce CPU, there are a number of negative aspects to deal with
that make such an architecture more difficult to manage.

-- 
Jonah H. Harris


Re: Implement a column store for pg?

2023-09-15 Thread Jonah H. Harris
On Fri, Sep 15, 2023 at 10:21 AM jacktby jacktby  wrote:

> > I’m trying to implement a new column store for pg, is there a good
> > example to reference?
> That’s too complex, I just need to know the interface for designing a
> column store. In fact, I just need a simple example, and I will implement
> it by myself. What I’m confused about is that I don’t know how to
> implement MVCC, because the old version is a tuple; will this make a big
> difference to the transaction?


If you're looking for the simplest version of a columnar implementation for
Postgres, I'd check out Citus' original cstore implemented via FDW. It
hasn't been updated in years, but it's still one of the faster simple
columnar implementations out there: https://github.com/citusdata/cstore_fdw

--
Jonah H. Harris


Re: libpq compression (part 2)

2023-08-10 Thread Jonah H. Harris
Pinging to see if anyone has continued to work on this behind-the-scenes or
whether this is the latest patch set there is.

-- 
Jonah H. Harris


Re: How to build a new grammar for pg?

2023-08-01 Thread Jonah H. Harris
On Tue, Aug 1, 2023 at 3:45 PM Andrew Dunstan  wrote:

> Or to enable some language other than SQL (QUEL anyone?)
>

A few years ago, I got a minimal POSTQUEL working again to release as a
patch for April Fools' Day, which I never did. I should dig that up
somewhere :)

Anyway, as for the OP's original question regarding replacing the grammar,
there are a couple of implementations floating around that have done just
that. But I actually think the pluggable parser patches were good examples
of how to integrate a replacement parser that generates the expected parse
tree nodes, for anyone who wants to do their own custom parser. See Julien
Rouhaud's SQLOL in the "Hook for extensible parsing" thread and Jim
Mlodgenski's "Parser Hook" thread.

-- 
Jonah H. Harris


Re: Let's make PostgreSQL multi-threaded

2023-06-05 Thread Jonah H. Harris
On Mon, Jun 5, 2023 at 8:18 AM Tom Lane  wrote:

> For the record, I think this will be a disaster.  There is far too much
> code that will get broken, largely silently, and much of it is not
> under our control.
>

While I've long been in favor of a multi-threaded implementation, now in my
old age, I tend to agree with Tom. I'd be interested in Konstantin's
thoughts (and PostgresPro's experience) on multi-threading vs. internal
pooling with the current process-based model. I recall looking at and
playing with Konstantin's implementations of both, which were impressive.
Yes, the latter doesn't solve the same issues, but it does address many of
the real-world ones for which multi-threading is argued. Personally, I
think not only would a significant amount of time be spent dealing with
in-the-field stability regressions before a multi-threaded implementation
matured, but it would also increase the learning curve for anyone trying to
start with internals development.

-- 
Jonah H. Harris


Re: How does pg index page optimize dead tuples?

2023-04-19 Thread Jonah H. Harris
On Wed, Apr 19, 2023 at 9:40 PM jack...@gmail.com  wrote:

> As far as I know, when an index page is full, if you insert a new tuple
> here, you will split it into two pages.
> But pg won't actually delete the half of the tuples in the old page. So if
> there is another tuple inserted into this old page, will pg split it
> again? I think that's not true, so how does it solve this one? Please give
> me a code example, thanks
>

This is not how the hackers list works; you need to do your own research.
The Postgres code is pretty straightforward and giving you examples in
isolation makes no sense. If you want to understand how things actually
work, you need to read the code in context and understand how the system
works, minimally, at a component level.



-- 
Jonah H. Harris


Re: Reducing System Allocator Thrashing of ExecutorState to Alleviate FDW-related Performance Degradations

2023-02-17 Thread Jonah H. Harris
On Fri, Feb 17, 2023 at 12:03 AM David Rowley  wrote:

> On Fri, 17 Feb 2023 at 17:40, Jonah H. Harris 
> wrote:
> > Yeah. There’s definitely a smarter and more reusable approach than I was
> > proposing. A lot of that code is fairly mature and I figured most people
> > wouldn’t want to alter it in such ways - but I’m up for it if an approach
> > like this is the direction we’d want to go in.
>
> Having something agnostic to if it's allocating a new context or
> adding a block to an existing one seems like a good idea to me.
>

I like this idea.


> I think the tricky part will be the discussion around which size
> classes to keep around and in which cases can we use a larger
> allocation without worrying too much that it'll be wasted. We also
> don't really want to make the minimum memory that a backend can keep
> around too bad. Patches such as [1] are trying to reduce that.  Maybe
> we can just keep a handful of blocks of 1KB, 8KB and 16KB around, or
> more accurately put, ALLOCSET_SMALL_INITSIZE,
> ALLOCSET_DEFAULT_INITSIZE and ALLOCSET_DEFAULT_INITSIZE * 2, so that
> it works correctly if someone adjusts those definitions.
>

Per that patch and the general idea, what do you think of either:

1. A single GUC, something like backend_keep_mem, that represents the
cached memory we'd retain rather than send directly to free()?
2. Multiple GUCs, one per block size?

While #2 would give more granularity, I'm not sure it would necessarily be
needed. The main issue I'd see in that case would be how to select which
block sizes to keep given a fixed amount of keep memory. We'd generally
want the majority of subsequent queries to make use of it as well as
possible, so we'd either need each size to be equally represented or some
heuristic.

I don't really like #2, but threw it out there :)

> I think you'll want to look at what the maximum memory a backend can
> keep around in context_freelists[] and not make the worst-case memory
> consumption worse than it is today.
>

Agreed.


> I imagine this would be some new .c file in src/backend/utils/mmgr
> which aset.c, generation.c and slab.c each call a function from to see
> if we have any cached blocks of that size.  You'd want to call that in
> all places we call malloc() from those files apart from when aset.c
> and generation.c malloc() for a dedicated block.  You can probably get
> away with replacing all of the free() calls with a call to another
> function where you pass the pointer and the size of the block to have
> it decide if it's going to free() it or cache it.


Agreed. I would see this as practically just a generic allocator free-list;
is that how you view it also?
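
Concretely, I'm picturing something like the following minimal standalone
sketch - illustrative names, size classes, and limits only, not actual
Postgres code:

/*
 * Toy size-classed block cache: block_alloc() checks a freelist before
 * calling malloc(), and block_release() caches blocks of interesting
 * sizes instead of free()ing them.
 */
#include <stdio.h>
#include <stdlib.h>

#define NUM_CLASSES 3
static const size_t class_size[NUM_CLASSES] = {1024, 8192, 16384};
#define MAX_CACHED_PER_CLASS 4

typedef struct cached_block { struct cached_block *next; } cached_block;

static cached_block *freelist[NUM_CLASSES];
static int ncached[NUM_CLASSES];

static int size_class(size_t size)
{
    for (int i = 0; i < NUM_CLASSES; i++)
        if (size == class_size[i])
            return i;
    return -1;                      /* not a size we cache */
}

static void *block_alloc(size_t size)
{
    int c = size_class(size);

    if (c >= 0 && freelist[c] != NULL)
    {
        cached_block *b = freelist[c];

        freelist[c] = b->next;
        ncached[c]--;
        return b;                   /* reuse a cached block */
    }
    return malloc(size);
}

static void block_release(void *ptr, size_t size)
{
    int c = size_class(size);

    if (c >= 0 && ncached[c] < MAX_CACHED_PER_CLASS)
    {
        cached_block *b = ptr;

        b->next = freelist[c];
        freelist[c] = b;
        ncached[c]++;               /* cache it instead of free()ing */
        return;
    }
    free(ptr);
}

int main(void)
{
    void *a = block_alloc(8192);

    block_release(a, 8192);         /* cached, not free()d */
    printf("reused: %s\n", block_alloc(8192) == a ? "yes" : "no");
    return 0;
}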


> I doubt you need to care too much if the block is from a dedicated
> allocation or a normal
> block.  We'd just always free() if it's not in the size classes that
> we care about.
>

Agreed.

--
Jonah H. Harris


Re: Reducing System Allocator Thrashing of ExecutorState to Alleviate FDW-related Performance Degradations

2023-02-16 Thread Jonah H. Harris
On Thu, Feb 16, 2023 at 11:26 PM David Rowley  wrote:

> I didn't hear it mentioned explicitly here, but I suspect it's faster
> when increasing the initial size due to the memory context caching
> code that reuses aset MemoryContexts (see context_freelists[] in
> aset.c). Since we reset the context before caching it, then it'll
> remain fast when we can reuse a context, provided we don't need to do
> a malloc for an additional block beyond the initial block that's kept
> in the cache.


This is what we were seeing. The larger initial size reduces/eliminates the
multiple smaller blocks that are malloced and freed in each per-query
execution.

Maybe we should think of a more general-purpose way of doing this
> caching which just keeps a global-to-the-process dclist of blocks
> laying around.  We could see if we have any free blocks both when
> creating the context and also when we need to allocate another block.
> I see no reason why this couldn't be shared among the other context
> types rather than keeping this cache stuff specific to aset.c.  slab.c
> might need to be pickier if the size isn't exactly what it needs, but
> generation.c should be able to make use of it the same as aset.c
> could.  I'm unsure what'd we'd need in the way of size classing for
> this, but I suspect we'd need to pay attention to that rather than do
> things like hand over 16MBs of memory to some context that only wants
> a 1KB initial block.


Yeah. There’s definitely a smarter and more reusable approach than I was
proposing. A lot of that code is fairly mature and I figured most people
wouldn’t want to alter it in such ways - but I’m up for it if an approach
like this is the direction we’d want to go in.



-- 
Jonah H. Harris


Re: Reducing System Allocator Thrashing of ExecutorState to Alleviate FDW-related Performance Degradations

2023-02-16 Thread Jonah H. Harris
On Thu, Feb 16, 2023 at 7:32 PM Andres Freund  wrote:

> What PG version?
>

Hey, Andres. Thanks for the reply.

Given that not much changed regarding that allocation context IIRC, I’d
think all recent versions. It was observed in 13, 14, and 15.

> Do you have a way to reproduce this with core code,
> e.g. postgres_fdw/file_fdw?


I’ll have to create one - it was most evident on a TPC-C or sysbench test
using the Postgres, MySQL, SQLite, and Oracle FDWs. It may be reproducible
with pgbench as well.

> What is all that memory used for? Is it possible that the real issue is
> too many tiny allocations, due to some allocation growing slowly?


The FDW state management allocations and whatever each FDW needs to
accomplish its goals. Different FDWs do different things.

> This seems way too magic to expose to users. How would they ever know how
> to set it? And it will depend heavily on the specific queries, so a global
> config won't work well.


Agreed on the nastiness of exposing it directly. Not that we don’t give
users control of memory anyway, but this one is easier to mess up without
at least putting some custom-set bounds on it.


If the issue is a specific FDW needing to make a lot of allocations, I can
> see
> adding an API to tell a memory context that it ought to be ready to
> allocate a
> certain amount of memory efficiently (e.g. by increasing the block size of
> the
> next allocation by more than 2x).


While I’m happy to be wrong, it seems to be an inherent problem - not
really specific to the FDW implementations themselves, but to the fact that
all FDWs use more of that context than non-FDW cases do for similar types
of operations, which wasn’t really a consideration in how that allocation
was sized over time.

If we come up with some sort of alternate allocation strategy, I don’t know
how it would be very clean API-wise, but it’s definitely an idea.





-- 
Jonah H. Harris


Reducing System Allocator Thrashing of ExecutorState to Alleviate FDW-related Performance Degradations

2023-02-16 Thread Jonah H. Harris
Hi everyone,

I've been working on a federated database project that heavily relies on
foreign data wrappers. During benchmarking, we noticed high system CPU
usage in OLTP-related cases, which we traced back to multiple brk calls
resulting from block frees in AllocSetReset upon ExecutorEnd's
FreeExecutorState. This is because FDWs allocate their own derived
execution states and required data structures within this context; these
exceed the initial 8K allocation and then have to be cleaned up at
executor shutdown.

Increasing the default query context allocation from ALLOCSET_DEFAULT_SIZES
to a larger initial "appropriate size" solved the issue and almost doubled
the throughput. However, the "appropriate size" is workload and
implementation dependent, so making it configurable may be better than
increasing the defaults, which would negatively impact users (memory-wise)
who aren't encountering this scenario.
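
To make the pattern concrete, here's a toy standalone model of what we
observed - not Postgres code, and all sizes/counts are made up - showing
how a larger initial block size eliminates the per-query malloc/free churn:

/*
 * Model an aset-style context: a "keeper" block survives each reset,
 * while any additional blocks needed beyond it are malloc()ed during
 * the query and free()d again at reset.
 */
#include <stdio.h>
#include <stdlib.h>

#define QUERIES 100000

/* Run one "query" needing 'need' bytes; return the extra malloc count. */
static int run_query(size_t need, size_t keeper_size)
{
    void *extras[16];
    int nextra = 0;
    size_t used = keeper_size;      /* the keeper block already exists */
    size_t blksz = keeper_size;

    while (used < need && nextra < 16)
    {
        blksz *= 2;                 /* aset-style block-size doubling */
        extras[nextra++] = malloc(blksz);
        used += blksz;
    }

    /* "Context reset": all extra blocks go back to the allocator. */
    for (int i = 0; i < nextra; i++)
        free(extras[i]);
    return nextra;
}

int main(void)
{
    long small = 0, large = 0;

    for (int q = 0; q < QUERIES; q++)
        small += run_query(64 * 1024, 8 * 1024);    /* 8K initial size  */
    for (int q = 0; q < QUERIES; q++)
        large += run_query(64 * 1024, 64 * 1024);   /* sized-up initial */

    printf("extra malloc/free pairs per %d queries: 8K=%ld, 64K=%ld\n",
           QUERIES, small, large);
    return 0;
}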

I have a patch to make it configurable, but before submitting it, I wanted
to hear your thoughts and feedback on this and any other alternative ideas
you may have.

-- 
Jonah H. Harris


Re: Hook for extensible parsing.

2021-09-15 Thread Jonah H. Harris
On Wed, Sep 15, 2021 at 3:55 PM Andres Freund  wrote:

> On 2021-09-15 12:57:00 -0400, Tom Lane wrote:
> > That's not what the patch actually does, though.  It only replaces
> > the grammar, not semantic analysis.  So you couldn't associate the
> > (+)-decorated WHERE clause with the appropriate join.  (And no,
> > I will not accept that it's okay to perform catalog lookups in
> > the grammar to get around that.  See comment at the head of gram.y.)
>
> > In general, I'm having a hard time believing that anything very
> > interesting can be done at only the grammar level without changing
> > the parse analysis phase.  That's not unrelated to the restriction
> > that the grammar can't do catalog accesses.  Maybe with some fundamental
> > restructuring, we could get around that issue ... but this patch isn't
> > doing any fundamental restructuring, it's just putting a hook where it's
> > easy to do so.  We've often found that such hooks aren't as useful as
> > they initially seem.
>
> Agreed - it doesn't make sense to me to have a hook that only replaces raw
> parsing, without also hooking into parse-analysis. ISTM that the least a
> patchset going for a parser hook would have to do is to do sufficient
> restructuring so that one could hook together into both raw parsing and
> analysis. It could still be two callbacks, but perhaps we'd ensure that
> they're both set.
>

This is a bad example as it doesn't require semantic analysis from
Postgres. While most of the tools out there tend to do simple replacement,
this can be done within a custom parser by simply walking its own AST,
evaluating join conditions against the expression, and rewriting the join
accordingly. Or, do you have an example that couldn't be done this way
within a custom parser?
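
As a trivial illustration of what I mean - entirely hypothetical
structures, not any real parser's API - the custom parser's own AST
carries the (+) marker, so choosing the join type never needs a catalog
lookup:

/* Toy join-condition node in a hypothetical custom parser's AST. */
#include <stdbool.h>
#include <stdio.h>

typedef struct Cond
{
    const char *lhs;
    const char *rhs;
    bool        rhs_outer_marked;   /* the rhs column carried (+) */
} Cond;

/* Rewrite decision made purely from the parser's own AST. */
static const char *join_type(const Cond *c)
{
    return c->rhs_outer_marked ? "LEFT JOIN" : "INNER JOIN";
}

int main(void)
{
    Cond c = {"a.id", "b.id", true};    /* i.e., WHERE a.id = b.id(+) */

    printf("FROM a %s b ON %s = %s\n", join_type(&c), c.lhs, c.rhs);
    return 0;
}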

-- 
Jonah H. Harris


Re: ML-based indexing ("The Case for Learned Index Structures", a paper from Google)

2021-04-20 Thread Jonah H. Harris
On Tue, Apr 20, 2021 at 3:45 PM Peter Geoghegan  wrote:

> On Tue, Apr 20, 2021 at 12:35 PM Chapman Flack 
> wrote:
> > How would showing that to be true for data structure X be different from
> > making a case for data structure X?
>
> You don't have to understand the theoretical basis of B-Tree indexes
> to see that they work well. In fact, it took at least a decade for
> somebody to formalize how the space utilization works with B-Trees
> containing random data. Of course theory matters, but the fact is that
> B-Trees had been widely used for commercial and scientific
> applications that whole time.
>
> Maybe I'll be wrong about learned indexes - who knows? But the burden
> of proof is not mine. I prefer to spend my time on things that I am
> reasonably confident will work out well ahead of time.
>

Agreed on all of your takes, Peter. In time, they will probably be more
realistic. But, at present, I tend to see the research papers make
comparisons between learned vs. traditional pitching the benefits of the
former without any of the well-known optimizations of the latter - as if
time stood still since the original B-Tree. Similarly, where most academic
research starts to fall apart in practicality is the lack of addressing
realistic write volumes and related concurrency issues. I'm happy to be
disproven on this, though.

-- 
Jonah H. Harris


Re: Extensibility of the PostgreSQL wire protocol

2021-02-22 Thread Jonah H. Harris
On Mon, Feb 22, 2021 at 1:01 PM Tom Lane  wrote:

> Jan Wieck  writes:
> > As Jim Mlodgenski just posted in [0], having the ability to also extend
> > and/or replace the parser will give them the ability to do just that.
>
> Yeah, and as I pointed out somewhere upthread, trying to replace the
> whole parser will just end in a maintenance nightmare.  The constructs
> that the parser has to emit are complex, Postgres-specific, and
> constantly evolving.  We are NOT going to promise any sort of cross
> version compatibility for parse trees.
>

Wholeheartedly agreed. Core should only ever maintain the hooks, never
their usage. It's the responsibility of the extension author to maintain
their code just as it is to manage their use of all other hook usages. Yes,
it's sometimes a maintenance nightmare - but with great power comes great
responsibility... as is anything loaded directly into the process.

-- 
Jonah H. Harris


Re: Extensibility of the PostgreSQL wire protocol

2021-02-19 Thread Jonah H. Harris
On Fri, Feb 19, 2021 at 8:48 AM Heikki Linnakangas  wrote:

> With the hooks that exist today, would it possible to write a background
> worker that listens on a port, instead of postmaster? Can you launch
> backends from a background worker? And communicate the backend processes
> using a shared memory message queue (see pqmq.c).
>

Yes. That's similar to how mine work: A background worker that acts as a
listener for the new protocol which then sets up a new dynamic background
worker on accept(), waits for its creation, passes the fd to the new
background worker, and sits in a while (!got_sigterm) loop checking the
socket for activity and running the protocol similar to postmaster. I
haven't looked at the latest connection pooling patches but, in general,
connection pooling is an abstract issue and should be usable for any type
of connection as, realistically, it's just an event loop and state problem
- it shouldn't be protocol specific.
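
For anyone curious about the fd hand-off itself, it's the standard
SCM_RIGHTS dance over a Unix socket; a minimal standalone sketch (not the
actual listener/worker code):

/* Pass a file descriptor between processes via sendmsg()/SCM_RIGHTS. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static int send_fd(int chan, int fd)
{
    struct msghdr msg = {0};
    char byte = 'F';
    struct iovec iov = { &byte, 1 };
    char cbuf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr *cmsg;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf;
    msg.msg_controllen = sizeof(cbuf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));  /* the fd rides along */

    return sendmsg(chan, &msg, 0) < 0 ? -1 : 0;
}

static int recv_fd(int chan)
{
    struct msghdr msg = {0};
    char byte;
    struct iovec iov = { &byte, 1 };
    char cbuf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr *cmsg;
    int fd = -1;

    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = cbuf;
    msg.msg_controllen = sizeof(cbuf);

    if (recvmsg(chan, &msg, 0) <= 0)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg && cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS)
        memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;                      /* a usable duplicate descriptor */
}

int main(void)
{
    int sv[2];
    int fd;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return 1;

    send_fd(sv[0], STDOUT_FILENO);  /* "listener" ships a descriptor... */
    fd = recv_fd(sv[1]);            /* ...and the "worker" picks it up  */
    dprintf(fd, "worker now owns fd %d\n", fd);
    close(fd);
    return 0;
}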

I would recommend this approach: write a separate program that sits
> between the client and PostgreSQL, speaking custom protocol to the
> client, and libpq to the backend. And then move that program into a
> background worker process.
>

Doing protocol conversion between libpq and a different protocol works, but
is slow. My implementations were originally all proxies that worked outside
the database, then I moved them inside, then I replaced all the libpq code
with SPI-related calls.


> > In a recent case, letting the browser talk directly to the database
> > allowed me to get rid of a ~100k-sloc .net backend and all the
> > complexity and infrastructure that goes with
> > coding/testing/deploying/maintaining it, while keeping all the
> > positives: per-query compression/data conversion, querying multiple
> > databases over a single connection, session cookies, etc. Deployment
> > is trivial compared to what was before. Latency is down 2x-4x across
> > the board.
>
> Querying multiple databases over a single connection is not possible
> with the approach taken here. Not sure about the others things you listed.
>

Accessing multiple databases from the same backend is problematic overall -
I didn't solve that in my implementations either. IIRC, once a bgworker is
attached to a specific database, it's basically stuck with that database.

-- 
Jonah H. Harris


Re: Extensibility of the PostgreSQL wire protocol

2021-02-11 Thread Jonah H. Harris
On Thu, Feb 11, 2021 at 9:28 AM Robert Haas  wrote:

> That being said, I'm not in favor of transferring maintenance work to
> the community for this set of hooks any more than I am for something
> on the parsing side. In general, I'm in favor of as much extensibility
> as we can reasonably create, but with a complicated proposal like this
> one, the community should expect to be able to get something out of
> it. And so far what I hear Jan saying is that these hooks could in
> theory be used for things other than Amazon's proprietary efforts and
> those things could in theory bring benefits to the community, but
> there are no actual plans to do anything with this that would benefit
> anyone other than Amazon. Which seems to bring us right back to
> expecting the community to maintain things for the benefit of
> third-party forks.
>

I'm quite sure I said I'd open source my MySQL implementation, which allows
Postgres to appear to MySQL clients as a MySQL/MariaDB server. This is
neither proprietary nor Amazon-related and makes Postgres substantially
more useful for a large number of applications.

As Jan said in his last email, they're not proposing all the different
aspects needed. In fact, nothing has actually been proposed yet. This is an
entirely philosophical debate. I don't even know what's being proposed at
this point - I just know it *could* be useful. Let's just wait and see what
is actually proposed before shooting it down, yes?

-- 
Jonah H. Harris


Re: Extensibility of the PostgreSQL wire protocol

2021-02-10 Thread Jonah H. Harris
On Wed, Feb 10, 2021 at 2:04 PM Tom Lane  wrote:

> That is a spot-on definition of where I do NOT want to end up.  Hooks
> everywhere and enormous extensions that break anytime we change anything
> in the core.  It's not really clear that anybody is going to find that
> more maintainable than a straight fork, except to the extent that it
> enables the erstwhile forkers to shove some of their work onto the PG
> community.
>

Given the work over the last few major releases to make several other
aspects of Postgres pluggable, how is implementing a pluggable protocol API
any different?

To me, this sounds more like a philosophical disagreement with how people
could potentially use Postgres than a technical one. My point is only that,
using current PG functionality, I could equally write a pluggable storage
interface for my Oracle and InnoDB data file readers/writers, which would
similarly allow for the creation of a Postgres franken-Oracle by extension
only.

I don't think anyone is asking for hooks for all the things I mentioned - a
pluggable transaction manager, for example, doesn't make much sense. But,
when it comes to having actually done this vs. merely positing about its
usefulness, I'd say it has some merit and doesn't really introduce that
much complexity or maintenance overhead to core - whether the extensions
still work properly is up to the extension authors... isn't that the whole
point of extensions?

-- 
Jonah H. Harris


Re: Extensibility of the PostgreSQL wire protocol

2021-02-10 Thread Jonah H. Harris
On Wed, Feb 10, 2021 at 1:10 PM Tom Lane  wrote:

> What I'm actually more concerned about, in this whole line of development,
> is the follow-on requests that will surely occur to kluge up Postgres
> to make its behavior more like $whatever.  As in "well, now that we
> can serve MySQL clients protocol-wise, can't we pretty please have a
> mode that makes the parser act more like MySQL".  If we start having
> modes for MySQL identifier quoting, Oracle outer join syntax, yadda
> yadda, it's going to be way more of a maintenance nightmare than some
> hook functions.  So if we accept any patch along this line, I want to
> drive a hard stake in the ground that the answer to that sort of thing
> will be NO.
>

Actually, a substantial amount can be done with hooks. For Oracle, which is
substantially harder than MySQL, I have a completely separate parser that
generates a PG-compatible parse tree packaged up as an extension. To handle
autonomous transactions, database links, hierarchical query conversion,
hints, and some execution-related items requires core changes. But, the
protocol and parsing can definitely be done with hooks. And, as was
mentioned previously, this isn't tied directly to emulating another
database - it would enable us to support an HTTP-ish interface directly in
the server as an extension as well. A lot of this can be done with
background worker extensions now, which is how my stuff was primarily
architected, but it's hacky when it comes to areas where the items Jan
discussed could clean things up and make them more pluggable.

Assuming we're going to keep to that, though, it seems like people
> doing this sort of thing will inevitably end up with a fork anyway.
> So maybe we should just not bother with the first step either.
>

Perhaps I'm misunderstanding you, but I wouldn't throw this entire idea out
(which enables a substantial addition of extensible functionality with a
limited set of touchpoints) on the premise of future objections.

-- 
Jonah H. Harris


Re: Extensibility of the PostgreSQL wire protocol

2021-02-10 Thread Jonah H. Harris
On Wed, Feb 10, 2021 at 11:43 AM Robert Haas  wrote:

> On Mon, Jan 25, 2021 at 10:07 AM Jan Wieck  wrote:
> > Our current plan is to create a new set of API calls and hooks that
> allow to register additional wire protocols. The existing backend libpq
> implementation will be modified to register itself using the new API. This
> will serve as a proof of concept as well as ensure that the API definition
> is not slanted towards a specific protocol. It is also similar to the way
> table access methods and compression methods are added.
>
> If we're going to end up with an open source implementation of
> something useful in contrib or whatever, then I think this is fine.
> But, if not, then we're just making it easier for Amazon to do
> proprietary stuff without getting any benefit for the open-source
project. In fact, in that case PostgreSQL would have to somehow
> ensure that the hooks don't get broken without having any code that
> actually uses them, so not only would the project get no benefit, but
> it would actually incur a small tax. I wouldn't say that's an
> absolutely show-stopper, but it definitely isn't my first choice.
>

Agreed that we shouldn't add substantial hooks that aren't likely to be
used. While
I haven't yet seen AWS' implementation or concrete proposal, given the
people involved, I assume it's fairly similar to how I implemented it.
Assuming that's correct and it doesn't require substantial redevelopment,
I'd certainly open-source my MySQL-compatible protocol and parser
implementation. From my perspective, it would be awesome if these could be
done as extensions.

While I'm not planning to open source it as of yet, for my
Oracle-compatible stuff, I don't think I'd be able to do anything other
than the protocol as an extension, given core-related changes similar to
those EDB has to make. I don't think there's any easy way to get around
that.
But, for the protocol and any type of simple translation to Postgres'
dialect, I think that could easily be hook-based.

-- 
Jonah H. Harris


Re: Extensibility of the PostgreSQL wire protocol

2021-01-25 Thread Jonah H. Harris
On Mon, Jan 25, 2021 at 10:07 AM Jan Wieck  wrote:

> The following is a request for discussion and comments, not a refined
> proposal accompanied by a working patch.
>

After implementing this three different ways inside the backend over the
years, I landed on almost this identical approach for handling the MySQL,
TDS, MongoDB, and Oracle protocols for NEXTGRES.

Initially, each was implemented as a background worker extension which had
to handle its own networking, passing the fd off to new protocol-specific
connections, etc. This worked, but duplicated a good amount of logic. It
would be great to have a standard, loadable way to add support for a new
protocol.

-- 
Jonah H. Harris


Re: pro pametniky - historie vzniku PL/SQL [for old-timers: the history of
PL/SQL's origins]

2020-07-03 Thread Jonah H. Harris
On Fri, Jul 3, 2020 at 3:20 PM Pavel Stehule 
wrote:

> Hi
>
> I am sorry, wrong mailing list.
>

Thanks for reading/sharing my blog post, regardless of the mailing list :)

-- 
Jonah H. Harris


Re: Persist MVCC forever - retain history

2020-07-02 Thread Jonah H. Harris
On Thu, Jul 2, 2020 at 2:56 PM Mitar  wrote:

> Hi!
>
> (Sorry if this was already discussed, it looks pretty obvious, but I
> could not find anything.)


There have been a couple of timetravel extensions done, each with their own
limitations. I don’t believe a perfect implementation could be done without
adding the functionality to core (which would be new functionality, given
all the changes). I’d say start with the extensions and go from there.

-- 
Jonah H. Harris


Re: Fwd: PostgreSQL: WolfSSL support

2020-06-27 Thread Jonah H. Harris
On Sat, Jun 27, 2020 at 3:37 PM Christoph Berg  wrote:

> Re: Jonah H. Harris
> > Somewhere, I recall seeing an open-source OpenSSL compatibility wrapper
> for
> > WolfSSL. Assuming that still exists, this patch seems entirely
> unnecessary.
>
> Unless you actually tried.


Did you? It worked for me in the past on a similarly large system...

-- 
Jonah H. Harris


Re: Fwd: PostgreSQL: WolfSSL support

2020-06-27 Thread Jonah H. Harris
On Sat, Jun 27, 2020 at 3:25 PM Ranier Vilela  wrote:

> On Sat, Jun 27, 2020 at 9:50 AM, Christoph Berg 
> wrote:
>
>> Re: Peter Eisentraut
>> > What would be the advantage of using wolfSSL over OpenSSL?
>>
>> Avoiding the OpenSSL-vs-GPL linkage problem with readline.
>>
> I'm curious, how do you intend to solve a linking problem with
> OpenSSL-vs-GPL-readline, with another GPL product?
> WolfSSL, will provide a commercial license for PostgreSQL?
> Isn't LIbreSSL a better alternative?
>

Somewhere, I recall seeing an open-source OpenSSL compatibility wrapper for
WolfSSL. Assuming that still exists, this patch seems entirely unnecessary.

-- 
Jonah H. Harris


Re: Our naming of wait events is a disaster.

2020-05-12 Thread Jonah H. Harris
On Tue, May 12, 2020 at 11:16 AM Tom Lane  wrote:

> My inclination is to propose that we settle on the first style
> shown above, which is the majority case now, and rename the
> other events to fit that.  As long as we're breaking compatibility
> anyway, I'd also like to shorten one or two of the very longest
> names, because they're just giving me fits in fixing the PDF
> rendering.  (They would make a mess of the display of
> pg_stat_activity, too, anytime they come up in the field.)
>
> Thoughts?
>

+1

-- 
Jonah H. Harris


Re: Raw device on PostgreSQL

2020-05-01 Thread Jonah H. Harris
On Fri, May 1, 2020 at 4:59 PM Thomas Munro  wrote:

> On Fri, May 1, 2020 at 12:28 PM Jonah H. Harris 
> wrote:
> > Also, this will likely have an issue with O_DIRECT as additional buffer
> manager alignment is needed and I haven't tracked it down in 13 yet. As my
> default development is on a Mac, I have POSIX AIO only. As such, I can't
> natively play with the O_DIRECT or libaio paths to see if they work without
> going into Docker or VirtualBox - and I don't care that much right now :)
>
> Andres is prototyping with io_uring, which supersedes libaio and can
> do much more stuff, notably buffered and unbuffered I/O; there's no
> point in looking at libaio.  I agree that we should definitely support
> POSIX AIO, because that gets you macOS, FreeBSD, NetBSD, AIX, HPUX
> with one effort (those are the systems that use either kernel threads
> or true async I/O down to the driver; Solaris and Linux also provide
> POSIX AIO, but it's emulated with user space threads, which probably
> wouldn't work well for our multi process design).  The third API that
> we'd want to support is Windows overlapped I/O with completion ports.
> With those three APIs you can hit all systems in our build farm except
> Solaris and OpenBSD, so they'd still use synchronous I/O (though we
> could do our own emulation with worker processes pretty easily).
>

Is it public? I saw the presentations, but couldn't find that patch
anywhere.

-- 
Jonah H. Harris


Re: Raw device on PostgreSQL

2020-04-30 Thread Jonah H. Harris
On Wed, Apr 29, 2020 at 8:34 PM Jonah H. Harris 
wrote:

> On Tue, Apr 28, 2020 at 8:10 AM Andreas Karlsson 
> wrote:
>
>> To get the performance benefits from using raw devices I think you would
>> want to add support for asynchronous IO to PostgreSQL rather than
>> implementing your own layer to emulate the kernel's buffered IO.
>>
>> Andres Freund did a talk on async IO in PostgreSQL earlier this year. It
>> was not recorded but the slides are available.
>>
>>
>> https://www.postgresql.eu/events/fosdem2020/schedule/session/2959-asynchronous-io-for-postgresql/
>
>
> FWIW, in 2007/2008, when I was at EnterpriseDB, Inaam Rana and I
> implemented a benchmarkable proof-of-concept patch for direct I/O and
> asynchronous I/O (for libaio and POSIX). We made that patch public, so it
> should be on the list somewhere. But, we began to run into performance
> issues related to buffer manager scaling in terms of locking and,
> specifically, replacement. We began prototyping alternate buffer managers
> (going back to the old MRU/LRU model with midpoint insertion and testing a
> 2Q variant) but that wasn't public. I had also prototyped raw device
> support, which is a good amount of work and required implementing a custom
> filesystem (similar to Oracle's ASM) within the storage manager. It's
> probably a bit harder now than it was then, given the number of different
> types of file access.
>

Here's a hack job merge of that preliminary PoC AIO/DIO patch against
13devel. This was designed to keep the buffer manager clean using AIO and
is write-only. I'll have to dig through some of my other old Postgres 8.x
patches to find the AIO-based prefetching version with aio_req_t modified
to handle read vs. write in FileAIO. Also, this will likely have an issue
with O_DIRECT as additional buffer manager alignment is needed and I
haven't tracked it down in 13 yet. As my default development is on a Mac, I
have POSIX AIO only. As such, I can't natively play with the O_DIRECT or
libaio paths to see if they work without going into Docker or VirtualBox -
and I don't care that much right now :)

The code is nasty, but maybe it will give someone ideas. If I get some time
to work on it, I'll rewrite it properly.
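
For reference, the POSIX AIO interface the patch builds on is tiny; a
self-contained write example (unrelated to the patch internals; on Linux,
compile with -lrt):

/* Submit one asynchronous write and poll for its completion. */
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char buf[8192];
    struct aiocb cb;
    ssize_t n;
    int fd = open("aio_demo.dat", O_CREAT | O_WRONLY | O_TRUNC, 0644);

    if (fd < 0)
    {
        perror("open");
        return 1;
    }

    memset(buf, 'x', sizeof(buf));
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;

    if (aio_write(&cb) != 0)
    {
        perror("aio_write");
        return 1;
    }

    /* Busy-wait for demo purposes; real code would overlap other work. */
    while (aio_error(&cb) == EINPROGRESS)
        ;

    n = aio_return(&cb);
    printf("wrote %zd bytes asynchronously\n", n);
    close(fd);
    return 0;
}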

-- 
Jonah H. Harris


13dev_aiodio_latest.patch
Description: Binary data


Re: Raw device on PostgreSQL

2020-04-29 Thread Jonah H. Harris
On Tue, Apr 28, 2020 at 8:10 AM Andreas Karlsson  wrote:

> It would require quite a bit of work since 1) PostgreSQL stores its data
> in multiple files and 2) PostgreSQL currently supports only synchronous
> buffered IO.
>
> To get the performance benefits from using raw devices I think you would
> want to add support for asynchronous IO to PostgreSQL rather than
> implementing your own layer to emulate the kernel's buffered IO.
>
> Andres Freund did a talk on async IO in PostgreSQL earlier this year. It
> was not recorded but the slides are available.
>
>
> https://www.postgresql.eu/events/fosdem2020/schedule/session/2959-asynchronous-io-for-postgresql/


FWIW, in 2007/2008, when I was at EnterpriseDB, Inaam Rana and I
implemented a benchmarkable proof-of-concept patch for direct I/O and
asynchronous I/O (for libaio and POSIX). We made that patch public, so it
should be on the list somewhere. But, we began to run into performance
issues related to buffer manager scaling in terms of locking and,
specifically, replacement. We began prototyping alternate buffer managers
(going back to the old MRU/LRU model with midpoint insertion and testing a
2Q variant) but that wasn't public. I had also prototyped raw device
support, which is a good amount of work and required implementing a custom
filesystem (similar to Oracle's ASM) within the storage manager. It's
probably a bit harder now than it was then, given the number of different
types of file access.

-- 
Jonah H. Harris


Re: Proposing WITH ITERATIVE

2020-04-29 Thread Jonah H. Harris
On Wed, Apr 29, 2020 at 5:54 PM Jonah H. Harris 
wrote:

> On Wed, Apr 29, 2020 at 4:50 PM Corey Huinker 
> wrote:
>
>> Having both WHERE and WHILE might look awkward.
>>>
>>
>> Maybe an UNTIL instead of WHILE?
>>
>
> While I'm not a huge fan of it, one of the other databases implementing
> this functionality does so using the syntax:
>
> WITH ITERATIVE R AS '(' R0 ITERATE Ri UNTIL N (ITERATIONS | UPDATES) ')' Qf
>
> Where N in ITERATIONS represents termination at an explicit count and, in
> UPDATES, represents termination after Ri updates more than n rows on table
> R.
>

Sent too soon :) One of the main reasons I dislike the above is that it
assumes N is known. In some cases, however, you really need termination
upon a condition.

-- 
Jonah H. Harris


Re: Proposing WITH ITERATIVE

2020-04-29 Thread Jonah H. Harris
On Wed, Apr 29, 2020 at 4:50 PM Corey Huinker 
wrote:

> Having both WHERE and WHILE might look awkward.
>>
>
> Maybe an UNTIL instead of WHILE?
>

While I'm not a huge fan of it, one of the other databases implementing
this functionality does so using the syntax:

WITH ITERATIVE R AS '(' R0 ITERATE Ri UNTIL N (ITERATIONS | UPDATES) ')' Qf

Where N in ITERATIONS represents termination at an explicit count and, in
UPDATES, represents termination after Ri updates more than n rows on table
R.

-- 
Jonah H. Harris


Re: Proposing WITH ITERATIVE

2020-04-29 Thread Jonah H. Harris
On Wed, Apr 29, 2020 at 7:22 AM Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> Yeah the RECURSIVE vs ITERATIVE is a bit of a red herring here.  As you
> say, the RECURSIVE keyword doesn't specify the processing but marks the
> fact that the specification of the query is recursive.
>

Agreed. I started thinking through Fabien's response last night.

I think a syntax that would fit better within the existing framework
> would be something like
>
> WITH RECURSIVE t AS (
>  SELECT base case
>REPLACE ALL  -- instead of UNION ALL
>  SELECT recursive case
> )
>

I was originally thinking more along the lines of Fabien's approach, but
this is similarly interesting.

-- 
Jonah H. Harris


Re: Proposing WITH ITERATIVE

2020-04-28 Thread Jonah H. Harris
On Mon, Apr 27, 2020 at 8:50 PM Jeff Davis  wrote:

> Can you illustrate with some examples? I get that one is appending and
> the other is modifying in-place, but how does this end up looking in
> the query language?
>

To ensure credit is given to the original authors, the initial patch I'm
working with (against Postgres 11 and 12) came from Denis Hirn, Torsten
Grust, and Christian Duta. Attached is a quick-and-dirty merge of that
patch that applies cleanly against 13-devel. It is not solid, at all, but
demonstrates the functionality. At present, my updates can be found at
https://github.com/jonahharris/postgres/tree/with-iterative

As the patch makes use of additional booleans for iteration, if there's
interest in incorporating this functionality, I'd like to discuss with Tom,
Robert, et al. the current use of booleans for CTE recursion
differentiation in parsing and planning. In terms of making it
production-ready, I think the cleanest method would be to use a bitfield
(rather than booleans) to differentiate recursive and iterative CTEs.
Though, as that would touch more code, it's obviously up for discussion.
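
For illustration, the kind of flag-based differentiation I have in mind
would look roughly like this - hypothetical names, not the actual patch:

/* One flags word instead of parallel booleans per CTE variant. */
#include <stdio.h>

#define CTE_RECURSIVE  (1 << 0)     /* WITH RECURSIVE: append tuples  */
#define CTE_ITERATIVE  (1 << 1)     /* WITH ITERATIVE: replace tuples */

typedef struct cte_info
{
    const char  *ctename;
    unsigned int flags;
} cte_info;

int main(void)
{
    cte_info cte = {"fib_sum", CTE_ITERATIVE};

    if (cte.flags & (CTE_RECURSIVE | CTE_ITERATIVE))
        printf("%s is self-referential; %s semantics\n", cte.ctename,
               (cte.flags & CTE_ITERATIVE) ? "replace" : "append");
    return 0;
}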

I'm working on a few good standalone examples of PageRank, shortest path,
etc. But, the simplest Fibonacci example can be found here:

EXPLAIN ANALYZE
WITH RECURSIVE fib_sum (iteration, previous_number, new_number)
  AS (SELECT 1, 0::numeric, 1::numeric
       UNION ALL
      SELECT (iteration + 1), new_number, (previous_number + new_number)
        FROM fib_sum
       WHERE iteration <= 10000)
SELECT r.iteration, r.new_number
  FROM fib_sum r
 ORDER BY 1 DESC
 LIMIT 1;

                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3.81..3.81 rows=1 width=36) (actual time=44.418..44.418 rows=1 loops=1)
   CTE fib_sum
     ->  Recursive Union  (cost=0.00..3.03 rows=31 width=68) (actual time=0.005..14.002 rows=10001 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=68) (actual time=0.004..0.004 rows=1 loops=1)
           ->  WorkTable Scan on fib_sum  (cost=0.00..0.24 rows=3 width=68) (actual time=0.001..0.001 rows=1 loops=10001)
                 Filter: (iteration <= 10000)
                 Rows Removed by Filter: 0
   ->  Sort  (cost=0.78..0.85 rows=31 width=36) (actual time=44.417..44.417 rows=1 loops=1)
         Sort Key: r.iteration DESC
         Sort Method: top-N heapsort  Memory: 27kB
         ->  CTE Scan on fib_sum r  (cost=0.00..0.62 rows=31 width=36) (actual time=0.009..42.660 rows=10001 loops=1)
 Planning Time: 0.331 ms
 Execution Time: 45.949 ms
(13 rows)

-- No ORDER/LIMIT is required with ITERATIVE, as only a single tuple is present
EXPLAIN ANALYZE
WITH ITERATIVE fib_sum (iteration, previous_number, new_number)
  AS (SELECT 1, 0::numeric, 1::numeric
       UNION ALL
      SELECT (iteration + 1), new_number, (previous_number + new_number)
        FROM fib_sum
       WHERE iteration <= 10000)
SELECT r.iteration, r.new_number
  FROM fib_sum r;

                                                         QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
 CTE Scan on fib_sum r  (cost=3.03..3.65 rows=31 width=36) (actual time=11.626..11.627 rows=1 loops=1)
   CTE fib_sum
     ->  Recursive Union  (cost=0.00..3.03 rows=31 width=68) (actual time=11.621..11.621 rows=1 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=68) (actual time=0.001..0.001 rows=1 loops=1)
           ->  WorkTable Scan on fib_sum  (cost=0.00..0.24 rows=3 width=68) (actual time=0.001..0.001 rows=1 loops=10001)
                 Filter: (iteration <= 10000)
                 Rows Removed by Filter: 0
 Planning Time: 0.068 ms
 Execution Time: 11.651 ms
(9 rows)


-- 
Jonah H. Harris


with_iterative_v1.patch
Description: Binary data


Re: Proposing WITH ITERATIVE

2020-04-28 Thread Jonah H. Harris
On Tue, Apr 28, 2020 at 11:51 AM Stephen Frost  wrote:

> Greetings Jonah!
>

Hey, Stephen. Hope you're doing well, man!


> One of the first question that we need to ask though, imv anyway, is- do
> the other databases use the WITH ITERATIVE syntax?  How many of them?
> Are there other approaches?  Has this been brought up to the SQL
> committee?
>

There are four that I'm aware of, but I'll put together a full list. I
don't believe it's been proposed to the standards committee, but I'll see
if I can find transcripts/proposals. I imagine those are all still public.

In general, we really prefer to avoid extending SQL beyond the standard,
> especially in ways that the standard is likely to be expanded.  In
> other words, we'd really prefer to avoid the risk that the SQL committee
> declares WITH ITERATIVE to mean something else in the future, causing us
> to have to have a breaking change to become compliant.


Agreed. That's my main concern as well.


> Now, if all the other major DB vendors have WITH ITERATIVE and they all
> work the same
> way, then we can have at least some confidence that the SQL committee
> will end up defining it in that way and there won't be any issue.
>

This is where it sucks, as each vendor has done it differently. One uses
WITH ITERATE, one uses WITH ITERATIVE, and others use what I consider to be
a nasty operator-esque style... I think ITERATIVE makes the most sense, but
it does create a future point of contention, as that definitely seems like
the syntax the committee would use as well.

Oracle ran into this issue as well, which is why they added an additional
clause to WITH that permits selection of depth/breadth-first search et al.
While it's kinda nasty, we could always similarly amend WITH RECURSIVE to
add an additional ITERATE or something to the tail of the with_clause rule.
But, that's why I'm looking for feedback :)

We do have someone on the committee who watches these lists, hopefully
> they'll have a chance to comment on this.  Perhaps it's already in
> discussion in the committee.
>

That would be awesome.

-- 
Jonah H. Harris


Re: Proposing WITH ITERATIVE

2020-04-28 Thread Jonah H. Harris
On Mon, Apr 27, 2020 at 8:50 PM Jeff Davis  wrote:

> Hi,
>

Hey, Jeff. Long time no talk. Good to see you're still on here.

You might get better feedback in a month or so; right now we just got
> into feature freeze.
>

Yep. No hurry. I've just been playing with this and wanted to start getting
feedback. It's a side-project for me anyway, so time is limited.


> Can you illustrate with some examples? I get that one is appending and
> the other is modifying in-place, but how does this end up looking in
> the query language?
>

I'm putting together a few concrete real-world examples.

> Rather than stopping when no new tuples are generated, WITH ITERATIVE
> > stops when a user-defined predicate evaluates to true.
>
> Why stop when it evaluates to true, and not false?
>

It's how they implemented it. A few other databases have implemented
similar functionality but, as it's not standard, it's kinda just up to each
implementor. I'm not married to that idea, but it has worked well for me so
far.

It seems like the benefit comes from carrying information along within
> tuples (by adding to scores or counters) rather than appending tuples.
> Is it possible to achieve this in other ways? The recursive CTE
> implementation is a very direct implementation of the standard, perhaps
> there are smarter approaches?
>

Yeah, in that specific case, one of the other implementations seems to
carry the counters along in the executor itself. But, as not all uses of
this functionality are iteration-count-based, I think that's a little
limiting. Using a terminator expression (of some kind) seems most
adaptable, I think. I'll give some examples of both types of cases.

-- 
Jonah H. Harris


Re: Proposing WITH ITERATIVE

2020-04-28 Thread Jonah H. Harris
On Tue, Apr 28, 2020 at 6:19 AM Andreas Karlsson  wrote:

> Do you have any examples of queries where it would help? It is pretty
> hard to say how much value some new syntax adds without seeing how it
> improves an intended use case.
>

Hey, Andreas.

Thanks for the response. I'm currently working on a few examples per Jeff's
response along with some benchmark information including improvements in
response time and memory usage of the current implementation. In the
meantime, as this functionality has been added to a couple of other
databases and there's academic research on it, if you're interested, here's
a few papers with examples:

http://faculty.neu.edu.cn/cc/zhangyf/papers/2018-ICDCS2018-sqloop.pdf
http://db.in.tum.de/~passing/publications/dm_in_hyper.pdf

-- 
Jonah H. Harris


Re: Proposing WITH ITERATIVE

2020-04-28 Thread Jonah H. Harris
On Tue, Apr 28, 2020 at 6:16 AM Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> will help the community to focus on the actual details of your proposal.
>

I'd like it if the community would focus on the details of the proposal as
well :)

-- 
Jonah H. Harris


Re: Proposing WITH ITERATIVE

2020-04-27 Thread Jonah H. Harris
On Mon, Apr 27, 2020 at 11:33 PM David Fetter  wrote:

>
> Have the authors agreed to make it available to the project under a
> compatible license?


If there’s interest, obviously. Otherwise I wouldn’t be asking.

I said from the start why I wasn’t attaching a patch and that I was seeking
feedback. Honestly, David, stop wasting my time, and the list’s, asking
pointless off-topic questions.

-- 
Jonah H. Harris


Re: Proposing WITH ITERATIVE

2020-04-27 Thread Jonah H. Harris
On Mon, Apr 27, 2020 at 10:32 PM David Fetter  wrote:

> On Mon, Apr 27, 2020 at 12:52:48PM -0400, Jonah H. Harris wrote:
> > Hey, everyone.
>
> > If there's any interest, I'll clean up their patch and submit. Thoughts?
>
> Where's the current patch?


It’s private. Hence, why I’m inquiring as to interest.

-- 
Jonah H. Harris


Proposing WITH ITERATIVE

2020-04-27 Thread Jonah H. Harris
Hey, everyone.

It's been quite a while since I last contributed a patch but, as this is a
new feature, I wanted to gather feedback before doing so. I've found this
functionality, already in use at Universität Tübingen, to be both highly
beneficial in many of my queries and a small change to Postgres - a good
trade-off IMO.

As you know, Postgres currently supports SQL:1999 recursive common table
expressions, constructed using WITH RECURSIVE, which permit the computation
of growing relations (e.g., transitive closures.) Although it is possible
to use this recursion for general-purpose iterations, doing so is a
deviation from its intended use case. Accordingly, an iterative-only use of
WITH RECURSIVE often results in sizable overhead and poor optimizer
decisions. In this email, I'd like to propose a similar but
iterative-optimized form of CTE - WITH ITERATIVE.

In that it can reference a relation within its definition, this iterative
variant has similar capabilities as recursive CTEs. In contrast to
recursive CTEs, however, rather than appending new tuples, this variant
performs a replacement of the intermediate relation. As such, the final
result consists of a relation containing tuples computed during the last
iteration only. Why would we want this?

This type of computation pattern is often found in data and graph mining
algorithms. In PageRank, for example, the initial ranks are updated in each
iteration. In clustering algorithms, assignments of data tuples to clusters
are updated in every iteration. Something these types of algorithms have in
common is that they operate on fixed-size datasets, where only specific
values (ranks, assigned clusters, etc.) are updated.

Rather than stopping when no new tuples are generated, WITH ITERATIVE stops
when a user-defined predicate evaluates to true. By providing a
non-appending iteration concept with a while-loop-style stop criterion, we
can massively speed up query execution due to better optimizer decisions.
Although it is possible to implement the types of algorithms mentioned
above using recursive CTEs, this feature has two significant advantages:

First, iterative CTEs consume significantly less memory. Consider a CTE of
N tuples and I iterations. With recursive CTEs, such a relation would grow
to (N * I) tuples. With iterative CTEs, however, all prior iterations are
discarded early. As such, the size of the relation would remain N,
requiring only a maximum of (2 * N) tuples for comparisons of the current
and the prior iteration. Furthermore, in queries where the number of
executed iterations represents the stop criterion, recursive CTEs require
even more additional per-tuple overhead - to carry along the iteration
counter.

Second, iterative CTEs have lower query response times. Because of its
smaller size, the time to scan and process the intermediate relation, to
re-compute ranks, clusters, etc., is significantly reduced.

In short, iterative CTEs retain the flexibility of recursive CTEs, but
offer a significant advantage for algorithms which don't require results
computed throughout all iterations. As this feature deviates only slightly
from WITH RECURSIVE, there is very little code required to support it (~250
loc.)

If there's any interest, I'll clean up their patch and submit. Thoughts?

-- 
Jonah H. Harris


Re: color by default

2020-03-20 Thread Jonah H. Harris
On Tue, Dec 31, 2019 at 8:35 AM Tom Lane  wrote:

> Peter Eisentraut  writes:
> > With the attached patch, I propose to enable the colored output by
> > default in PG13.
>
> FWIW, I shall be setting NO_COLOR permanently if this gets committed.
> I wonder how many people there are who actually *like* colored output?
> I find it to be invariably less readable than plain B&W text.
>

Same.


> I may well be in the minority, but I think some kind of straw poll
> might be advisable, rather than doing this just because.
>

+1 on no color by default.

-- 
Jonah H. Harris


Re: Proposition to use '==' as synonym for 'IS NOT DISTINCT FROM'

2019-10-26 Thread Jonah H. Harris
On Sat, Oct 26, 2019 at 12:49 PM Tom Lane  wrote:

> David Fetter  writes:
> > On Sat, Oct 26, 2019 at 06:41:10PM +0300, Eugen Konkov wrote:
> >> I have noticed that it would be cool to use '==' in place of 'IS NOT
> >> DISTINCT FROM'
> >> What do you think about this crazy idea?
>
> > Turning "IS NOT DISTINCT FROM" into an operator sounds like a great
> > idea.
>
> No it isn't.


+1


-- 
Jonah H. Harris


Re: pgbench - implement strict TPC-B benchmark

2019-07-31 Thread Jonah H. Harris
On Wed, Jul 31, 2019 at 10:10 AM Fabien COELHO  wrote:

>
> Hello Tom,
>
> >>> I'm also highly dubious about labeling this script "standard TPC-B",
> >>> when it resolves only some of the reasons why our traditional script
> >>> is not really TPC-B.  That's treading on being false advertising.
> >
> >> IANAL, but it may not even be permissible to claim that we have
> >> implemented "standard TPC-B".
> >
> > Yeah, very likely you can't legally say that unless the TPC
> > has certified your test.  (Our existing code and docs are quite
> > careful to call pgbench's version "TPC-like" or similar weasel
> > wording, and never claim that it is really TPC-B or even a close
> > approximation.)
>
> Hmmm.
>
> I agree that nobody really cares about TPC-B per se. The point of this
> patch is to provide a built-in example of recent and useful pgbench
> features that match a real specification.
>

I agree with this. When I was at EnterpriseDB, while it wasn't audited, we
had to develop an actual TPC-B implementation because pgbench was too
different. pgbench itself isn't that useful as a benchmark tool, imo, but
if we have the ability to make it better (i.e. closer to an actual
benchmark kit), I think we should.

-- 
Jonah H. Harris


Re: Indexing - comparison of tree structures

2019-05-24 Thread Jonah H. Harris
T-tree (and variants) are index types commonly associated with in-memory
database management systems and rarely, if ever, used with on-disk
databases. There has been a lot of research in regard to more modern cache
conscious/oblivious b-trees that perform equally or better than t-tree.
What’s the use-case?

On Fri, May 24, 2019 at 5:38 AM Sascha Kuhl  wrote:

> Hi,
>
> I compared two data structures realistically by time, after estimating big
> O. T-tree outperforms b-tree, which is commonly used, for a medium size
> table. Lehmann and Carey showed the same, earlier.
>
> Can you improve indexing by this?
>
> Understandably
>
> Sascha Kuhl
>
-- 
Jonah H. Harris


Re: Row Visibility and Table Access Methods

2018-12-14 Thread Jonah H. Harris
On Fri, Dec 14, 2018 at 12:28 AM Pavel Stehule 
wrote:

>
>
> čt 13. 12. 2018 v 10:23 odesílatel Simon Riggs 
> napsal:
>


>> Thoughts?
>>
>
> looks great
>

Agreed. This sounds well-thought-out and rather simple to implement.


-- 
Jonah H. Harris


Re: proposal: plpgsql pragma statement

2018-12-06 Thread Jonah H. Harris
Ahh. Gotcha. Makes sense.

On Thu, Dec 6, 2018 at 11:57 AM Robert Haas  wrote:

> On Thu, Dec 6, 2018 at 11:39 AM Jonah H. Harris 
> wrote:
> > IIRC, PRAGMA in Ada was compile-time only. How would you foresee it
> > affecting runtime?
>
> Well, I don't know what Ada does with PRAGMA exactly, but look at
> these examples from Oracle:
>
> http://psoug.org/definition/pragma.htm
>
> You wouldn't *execute* those at runtime, but at least for some of
> them, the runtime behavior would depend on whether or not they were
> specified.  It certainly seems possible that we might want to have
> similar things.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
-- 
Jonah H. Harris


Re: proposal: plpgsql pragma statement

2018-12-06 Thread Jonah H. Harris
IIRC, PRAGMA in Ada was compile-time only. How would you foresee it
affecting runtime?

On Thu, Dec 6, 2018 at 11:27 AM Robert Haas  wrote:

> On Tue, Dec 4, 2018 at 12:13 PM Pavel Stehule 
> wrote:
> > I wrote plpgsql_check https://github.com/okbob/plpgsql_check.
> >
> > It is working well, but because it does static analysis only, it can
> > sometimes produce false alarms or has to stop an analysis because the
> > necessary data is not available.
> >
> > https://github.com/okbob/plpgsql_check/issues/36
> >
> > I see one possible solution in introduction of pragma statement with
> syntax:
> >
> >   PRAGMA keyword [content to semicolon];
> >
> > The pragma has a relation to following statement. So the issue 36 can be
> solved by pragma
> >
> > PRAGMA cmdtype CREATE;
> > EXECUTE format('CREATE TABLE xxx ...
> >
> > The PRAGMA statement does nothing in runtime. It works only in compile
> time, and add a pair of key, value to next non pragma statement. This
> information can be used by some plpgsql extensions.
> >
> > What do you think about this proposal?
>
> I think it's commandeering PRAGMA for a fairly narrow purpose.  It's
> hardly unimaginable that someone in future might want a PRAGMA that
> does change runtime behavior, or that affects something other than the
> statement which immediately follows.
>
> I don't see a big problem allowing some kind of annotation that
> plpgsql_check can easily access, and I don't even mind it being called
> PRAGMA.  But I don't think it should foreclose unrelated uses of
> PRAGMA which somebody might want to introduce in the future.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
> --
Jonah H. Harris


Re: proposal: plpgsql pragma statement

2018-12-05 Thread Jonah H. Harris
You can alter the lexer and create a comment node, right? That’s how we did
hints in EnterpriseDB.

On Wed, Dec 5, 2018 at 11:41 AM Pavel Stehule 
wrote:

>
>
> st 5. 12. 2018 v 15:03 odesílatel Pavel Luzanov 
> napsal:
>
>>
>> But maybe your extension could read the PERFORM statement preceding it
>> and treat it as an annotation hint for the following statement.
>>
>>
>> In this case, comment line in some format will be better than real
>> PERFORM statement. Like this:
>>
>> /*+PRAGMA cmdtype CREATE; */
>> EXECUTE format('CREATE TABLE xxx ...
>>
>
> I looked there and it is not possible to implement it - plpgsql uses the
> SQL lexer, and the content of comments is just ignored. So I cannot read
> the comments; there is no simple way to read them from plpgsql.
> Unfortunately, though expected, there is nothing like a query string for a
> plpgsql statement.
>
> Regards
>
> Pavel
>
>> -
>> Pavel Luzanov
>> Postgres Professional: http://www.postgrespro.com
>> The Russian Postgres Company
>>
>>
>> --
Jonah H. Harris


Re: Disallow setting client_min_messages > ERROR?

2018-11-08 Thread Jonah H. Harris
On Thu, Nov 8, 2018 at 5:37 PM Tom Lane  wrote:

> > My patch used the check hook, but works either way.
>
> I was deliberately not getting into the detail of which hook to use ;-).
>
> Anyway, pushed with some adjustments and work on the documentation.
> Notably, I thought the warning message was inappropriate and
> overcomplicated, so I just dropped it.  I don't think we really need
> anything there.
>

+1

-- 
Jonah H. Harris


Re: Disallow setting client_min_messages > ERROR?

2018-11-06 Thread Jonah H. Harris
On Tue, Nov 6, 2018 at 2:46 PM Isaac Morland 
wrote:

> On Tue, 6 Nov 2018 at 14:07, Jonah H. Harris 
> wrote:
>
>> Two options presented:
>>
>> - Hard patch removes FATAL/PANIC from client_message_level_options in
>> guc.c, which also seems to make sense in regard to its double usage
>> with trace_recovery_messages.
>>
>> - Soft patch keeps FATAL/PANIC in client_message_level_options but
>> coerces client_min_messages to ERROR when set to FATAL/PANIC and issues a
>> warning. This also exports error_severity from elog.c to retrieve severity
>> -> text mappings for the warning message.
>>
>>
> What about no-op (soft patch) for 11.1 and backpatches, error (hard patch)
> for 12?
>

I'm usually a fan of the hard fix... but I do see the point they've made
about what can happen during an upgrade.

Also, fixed wording in the soft patch (frontend protocol requires %s or
above -> frontend protocol requires %s or below) attached.

-- 
Jonah H. Harris


client_min_messages_config_soft-v2.patch
Description: Binary data


Re: Disallow setting client_min_messages > ERROR?

2018-11-06 Thread Jonah H. Harris
Two options presented:

- Hard patch removes FATAL/PANIC from client_message_level_options in
guc.c, which also seems to make sense in regard to its double usage
with trace_recovery_messages.

- Soft patch keeps FATAL/PANIC in client_message_level_options but coerces
client_min_messages to ERROR when set to FATAL/PANIC and issues a warning
(sketched below). This also exports error_severity from elog.c to retrieve
severity -> text mappings for the warning message.
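
For clarity, the soft patch's coercion behaves roughly like this standalone
toy (placeholder severity values for ordering only; the real patch does
this inside the GUC machinery):

/* Clamp any requested level above ERROR back down, with a warning. */
#include <stdio.h>

enum elevel { NOTICE, WARNING, ERROR, FATAL, PANIC };   /* ordering only */

static enum elevel clamp_client_min_messages(enum elevel requested)
{
    if (requested > ERROR)
    {
        fprintf(stderr,
                "WARNING: frontend protocol requires ERROR or below; using ERROR\n");
        return ERROR;
    }
    return requested;
}

int main(void)
{
    enum elevel v = clamp_client_min_messages(PANIC);

    printf("effective level is %s\n", v == ERROR ? "ERROR" : "unchanged");
    return 0;
}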

-- 
Jonah H. Harris


client_min_messages_config_hard.patch
Description: Binary data


client_min_messages_config_soft.patch
Description: Binary data