Re: The "char" type versus non-ASCII characters

2021-12-03 Thread Kenneth Marshall
On Fri, Dec 03, 2021 at 03:13:24PM -0500, Tom Lane wrote:
> Andrew Dunstan  writes:
> > On 12/3/21 14:42, Tom Lane wrote:
> >> Right, I envisioned that ASCII behaves the same but we'd use
> >> a numeric representation for high-bit-set values.  These
> >> cases could be told apart fairly easily by charin(), since
> >> the numeric representation would always be three digits.
> 
> > OK, this seems the most attractive. Can we also allow 2 hex digits?
> 
> I think we should pick one base and stick to it.  I don't mind
> hex if you have a preference for that.
> 
>   regards, tom lane

+1 for hex
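
For concreteness, here is a small illustrative sketch (not PostgreSQL source; the helper names are hypothetical) of how a fixed-width hex escape lets the input function distinguish the two forms, which is the property Tom described for the three-digit numeric representation:

```python
# Illustrative sketch only: how a hex-based text representation for the
# 1-byte "char" type could behave.  ASCII round-trips unchanged;
# high-bit-set bytes use a fixed-width \xNN escape, so the input
# function can tell the two forms apart by prefix and length.

def char_out(b: int) -> str:
    """Encode a single byte for display."""
    if b < 0x80:
        return chr(b)              # plain ASCII, unchanged
    return f"\\x{b:02x}"           # always exactly two hex digits

def char_in(s: str) -> int:
    """Decode the textual form back to a byte."""
    if s.startswith("\\x") and len(s) == 4:
        return int(s[2:], 16)      # fixed-width escape -> high byte
    return ord(s)                  # single ASCII character

# Both ranges round-trip.
assert char_in(char_out(ord("A"))) == ord("A")
assert char_out(0xE9) == "\\xe9"
assert char_in("\\xe9") == 0xE9
```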

Regards,
Ken




Re: disfavoring unparameterized nested loops

2021-06-21 Thread Kenneth Marshall
> >
> > Most of the time when I see that happen it's down to either the
> > selectivity of some correlated base-quals being multiplied down to a
> > number low enough that we clamp the estimate to be 1 row.   The other
> > case is similar, but with join quals.
> 
> If an estimate is lower than 1, that should be a red flag that Something Is
> Wrong. This is kind of a crazy idea, but what if we threw it back the other
> way by computing 1 / est , and clamping that result to 2 <= res < 10 (or
> 100 or something)? The theory is, the more impossibly low it is, the more
> wrong it is. I'm attracted to the idea of dealing with it as an estimation
> problem and not needing to know about join types. Might have unintended
> consequences, though.
> 
> Long term, it would be great to calculate something about the distribution
> of cardinality estimates, so we can model risk in the estimates.
> 

Hi,

Laurenz suggested clamping to 2 in this thread in 2017:

https://www.postgresql.org/message-id/1509611428.3268.5.camel%40cybertec.at

Having been the victim of this problem in the past, I like the
risk-based approach. If the cost of being wrong about the estimate is
high, use a merge join instead. In every case that I have encountered,
that heuristic would have produced the correct, performant plan.
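
The heuristic floated upthread can be sketched in a few lines (illustrative only; the function name is made up): when a selectivity product drives a row estimate below 1, treat how far below 1 it is as a measure of how wrong it likely is, and bounce the estimate back up as 1/est, clamped to a small range.

```python
# Sketch of the "throw it back the other way" idea from upthread:
# an estimate below 1 row is impossible, so the further below 1 it
# falls, the more we inflate it, bounded to a modest range.

def rescue_rowcount(est: float, lo: float = 2.0, hi: float = 10.0) -> float:
    """Clamp an impossibly low row estimate into [lo, hi]."""
    if est >= 1.0:
        return est                 # sane estimate, leave it alone
    bounced = 1.0 / est            # lower estimate -> bigger bounce
    return max(lo, min(bounced, hi))

assert rescue_rowcount(50.0) == 50.0    # normal estimates untouched
assert rescue_rowcount(0.5) == 2.0      # 1/0.5 = 2, lands at the floor
assert rescue_rowcount(0.001) == 10.0   # absurdly low -> capped at hi
```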

Regards,
Ken




Re: Add SQL function for SHA1

2021-01-25 Thread Kenneth Marshall
On Tue, Jan 26, 2021 at 01:06:29PM +0900, Michael Paquier wrote:
> On Mon, Jan 25, 2021 at 10:42:25PM -0500, Sehrope Sarkuni wrote:
> > +1 to adding a SHA1 SQL function. Even if it's deprecated, there's plenty
> > of historical usage that I can see it being useful.
> 
> Let's wait for more opinions to see if we agree that this addition is
> helpful or not.  Even if this is not added, I think that there is
> still value in refactoring the code anyway for the SHA-2 functions.
> 

+1. I know that it has been deprecated, but it can be very useful when
working with pre-deprecation data. :) It is annoying to have to
resort to plperl or plpython because it is not available. The lack of
orthogonality is painful.

Regards,
Ken




Re: libpq compression

2020-12-22 Thread Kenneth Marshall
On Tue, Dec 22, 2020 at 07:15:23PM +0100, Tomas Vondra wrote:
> 
> 
> On 12/22/20 6:56 PM, Robert Haas wrote:
> >On Tue, Dec 22, 2020 at 6:24 AM Daniil Zakhlystov
> > wrote:
> >>When using bidirectional compression, Postgres resource usage correlates 
> >>with the selected compression level. For example, here is the Postgresql 
> >>application memory usage:
> >>
> >>No compression - 1.2 GiB
> >>
> >>ZSTD
> >>zstd:1 - 1.4 GiB
> >>zstd:7 - 4.0 GiB
> >>zstd:13 - 17.7 GiB
> >>zstd:19 - 56.3 GiB
> >>zstd:20 - 109.8 GiB - did not succeed
> >>zstd:21, zstd:22  > 140 GiB
> >>Postgres process crashes (out of memory)
> >
> >Good grief. So, suppose we add compression and support zstd. Then, can
> >unprivileged user capable of connecting to the database can negotiate
> >for zstd level 1 and then choose to actually send data compressed at
> >zstd level 22, crashing the server if it doesn't have a crapton of
> >memory? Honestly, I wouldn't blame somebody for filing a CVE if we
> >allowed that sort of thing to happen. I'm not sure what the solution
> >is, but we can't leave a way for a malicious client to consume 140GB
> >of memory on the server *per connection*. I assumed decompression
> >memory was going to measured in kB or MB, not GB. Honestly, even at
> >say L7, if you've got max_connections=100 and a user who wants to make
> >trouble, you have a really big problem.
> >
> >Perhaps I'm being too pessimistic here, but man that's a lot of memory.
> >
> 
> Maybe I'm just confused, but my assumption was this means there's a
> memory leak somewhere - that we're not resetting/freeing some piece
> of memory, or so. Why would zstd need so much memory? It seems like
> a pretty serious disadvantage, so how could it become so popular?
> 
> 
> regards
> 

Hi,

It looks like the space needed for decompression is between 1 KB and
3.75 TB:

https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#window_descriptor

Sheesh! Looks like it would definitely need to be bounded to control
resource use.
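
The arithmetic comes straight from the Window_Descriptor byte in the linked spec: a 5-bit exponent and a 3-bit mantissa, so the declared window ranges from 1 KB to roughly 3.75 TB. A quick sketch of the computation:

```python
# Window size computation per the zstd format spec's Window_Descriptor:
# one byte packing a 5-bit Exponent (high bits) and 3-bit Mantissa
# (low bits).  windowLog runs 10..41, so sizes span 1 KiB .. ~3.75 TiB.

def zstd_window_size(descriptor: int) -> int:
    exponent = descriptor >> 3         # high 5 bits
    mantissa = descriptor & 0x07       # low 3 bits
    window_log = 10 + exponent         # 10 .. 41
    window_base = 1 << window_log
    window_add = (window_base // 8) * mantissa
    return window_base + window_add

assert zstd_window_size(0x00) == 1 << 10             # 1 KiB minimum
assert zstd_window_size(0xFF) == 4_123_168_604_160   # ~3.75 TiB maximum
```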

Regards,
Ken




Re: Bump default wal_level to logical

2020-06-08 Thread Kenneth Marshall
On Mon, Jun 08, 2020 at 02:58:03PM -0400, Robert Haas wrote:
> On Mon, Jun 8, 2020 at 1:16 PM Alvaro Herrera  
> wrote:
> > I think it's reasonable to push our default limits for slots,
> > walsenders, max_bgworkers etc a lot higher than current value (say 10 ->
> > 100).  An unused slot wastes essentially no resources; an unused
> > walsender is just one PGPROC entry.  If we did that, and also allowed
> > wal_level to be changed on the fly, we wouldn't need to restart in order
> > to enable logical replication, so there would be little or no pressure
> > to change the wal_level default.
> 
> Wouldn't having a whole bunch of extra PGPROC entries have negative
> implications for the performance of GetSnapshotData() and other things
> that don't scale well at high connection counts?
> 

+1

I think raising the defaults just enough to allow even a couple of DB
replication slots would be advantageous, and would let them be used to
address spur-of-the-moment needs on systems that must stay up.
Supporting large numbers of slots by default does seem wasteful, and
seems contrary to the project's stance on conservative initial limits.
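
For concreteness, the kind of modest bump being discussed might look like this in postgresql.conf (illustrative values only, not a recommendation from the thread; the shipped defaults at the time were wal_level = replica and 10 for both slot settings):

```
# postgresql.conf -- illustrative modest bump, not a committed default
wal_level = logical          # instead of replica
max_replication_slots = 20   # instead of 10
max_wal_senders = 20         # instead of 10
```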

Regards,
Ken




Re: factorial function/phase out postfix operators?

2020-05-19 Thread Kenneth Marshall
> 
> I won't lose a lot of sleep if we decide to rip out '!' as well, but I
> don't think that continuing to support it would cost us much.
> 
+1 for keeping ! and nuking the rest, if possible.

Regards,
Ken




Re: Protect syscache from bloating with negative cache entries

2019-01-15 Thread Kenneth Marshall
On Tue, Jan 15, 2019 at 01:32:36PM -0500, Tom Lane wrote:
> ... 
> > FYI, Oracle provides one parameter, shared_pool_size, that determine the
> > size of a memory area that contains SQL plans and various dictionary
> > objects.  Oracle decides how to divide the area among constituents.  So
> > it could be possible that one component (e.g. table/index metadata) is
> > short of space, and another (e.g. SQL plans) has free space.  Oracle
> > provides a system view to see the free space and hit/miss of each
> > component.  If one component suffers from memory shortage, the user
> > increases shared_pool_size.  This is similar to what Horiguchi-san is
> > proposing.
> 
> Oracle seldom impresses me as having designs we ought to follow.
> They have a well-earned reputation for requiring a lot of expertise to
> operate, which is not the direction this project should be going in.
> In particular, I don't want to "solve" cache size issues by exposing
> a bunch of knobs that most users won't know how to twiddle.
> 
>   regards, tom lane

+1

Regards,
Ken