Re: \dt shows table but \d says the table doesn't exist ?

2024-05-03 Thread jian he
On Sat, May 4, 2024 at 5:15 AM Tom Lane  wrote:
>
> Adrian Klaver  writes:
> > On 5/3/24 14:06, Magnus Hagander wrote:
> >> Looks like you might need a \d "some_idIds" (include the quotes) since
> >> it has uppercase characters?
>
> > This:
> > "Did not find any relation named "public.some_idIds"."
> > to me indicates it did look for the properly cased name.
>
> No, that message just regurgitates what you typed.  Magnus is
> correct that the pattern will be case-folded if not quoted.
> You can check with --echo-hidden (-E):
>
> postgres=# \d public.some_idIds
> /******** QUERY *********/
> SELECT c.oid,
>   n.nspname,
>   c.relname
> FROM pg_catalog.pg_class c
>  LEFT JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
> WHERE c.relname OPERATOR(pg_catalog.~) '^(some_idids)$' COLLATE pg_catalog.default
>   AND n.nspname OPERATOR(pg_catalog.~) '^(public)$' COLLATE pg_catalog.default
> ORDER BY 2, 3;
> /**************************/
>
> Did not find any relation named "public.some_idIds".
>
> So it is in fact looking for public.some_idids.
>

Could we make it case-insensitive?
like
> WHERE c.relname OPERATOR(pg_catalog.~*) '^(some_idids)$' COLLATE pg_catalog.default

We already transform `\d SOME_IDIDS` to `\d some_idids`, so in a way it
already looks case-insensitive.
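For anyone reproducing this, a minimal sketch (the table name is made up):

create table "some_idIds" (id int);
-- \d some_idIds     fails: the unquoted name is folded to some_idids
-- \d "some_idIds"   works: the quotes preserve the case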




Re: Documentation diff

2024-02-25 Thread jian he
On Mon, Feb 26, 2024 at 1:49 AM Daniel Gustafsson  wrote:
>
> > On 25 Feb 2024, at 17:38, Marcos Pegoraro  wrote:
>
> > So I was thinking of a way to get the differences between this and that
> > version, for all doc pages.
> > Something like we already have on [2]: it makes explicit that this feature
> > was introduced, this was changed, this does not exist anymore.
>
> The documentation is written in heavily structured form using XML, so you
> can most likely write a diffing tool fairly easily which gives enough hints
> to know what to read up on (or find an existing tool where plugging in an
> XPath expression suffices).  By the sounds of it you are mostly interested
> in things found in tables, which makes it even easier.
>
> --
> Daniel Gustafsson
>
>
>

I just found out this:
https://demo.deltaxml.com/compare-demonstration-results-sample/

but I guess the OP wants a side-by-side rendered HTML comparison,
like the comparison page you get when buying an iPhone:
https://www.apple.com/iphone/compare/




Re: postgresql custom variable in pg_settings table

2024-01-09 Thread jian he
On Tue, Jan 9, 2024 at 3:38 PM Yi Sun  wrote:
>
> Hello,
>
>
>
> But can not select it from pg_settings, as we use pgwatch2 to monitor, to 
> avoid pgwatch2 script change prefer to use pg_setting not current_setting() 
> function, is it possible to get the custom variable from pg_setting please? 
> Thanks

The following minimal C module does that.
You can use LOAD (https://www.postgresql.org/docs/16/sql-load.html) to
load it, which makes it persist for the session.
To make it persist across sessions, add it to
shared_preload_libraries.
Of course, you can define more custom variables.

#include "postgres.h"
#include "funcapi.h"
#include "fmgr.h"
#include "utils/guc.h"
PG_MODULE_MAGIC;
static int test_guc = 2;
void
_PG_init(void)
{
/* Define custom GUC variables. */
DefineCustomIntVariable("patroni.count",
"test .",
"Valid range is 1 .. 11.",
_guc,
2,
1,
11,
PGC_USERSET,
0,
NULL,
NULL,
NULL);
MarkGUCPrefixReserved("patroni");
}
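A usage sketch once the library is built (the module file name my_guc is an
assumption; use whatever you named the shared library):

LOAD 'my_guc';
SHOW patroni.count;   -- 2
select name, setting, short_desc
from pg_settings
where name = 'patroni.count';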




procedure string constant is parsed at procedure create time.

2023-11-07 Thread jian he
hi.
table "test" does not exist.

-- 1. this works:
CREATE or replace PROCEDURE template_create() LANGUAGE SQL
AS $proc$ DROP TABLE if exists test cascade; CREATE TABLE test(id
int4range,valid_at tsrange,name text);$proc$;

-- 2. this raises an error:
CREATE or replace PROCEDURE template_create() LANGUAGE SQL AS $proc$
DROP TABLE if exists test cascade;
CREATE TABLE test(id int4range,valid_at tsrange,name text);
INSERT INTO test VALUES (NULL, tsrange('2018-01-01',
'2019-01-01'), 'null key');
$proc$;

-- 3. language plpgsql works fine:
CREATE or replace PROCEDURE template_create()
LANGUAGE plpgsql AS $proc$
begin
DROP TABLE if exists test101 cascade;
CREATE TABLE test101(id int4range,valid_at tsrange,name text);
INSERT INTO test101 VALUES (NULL, tsrange('2018-01-01',
'2019-01-01'), 'null key');
end
$proc$;

per manual: https://www.postgresql.org/docs/current/sql-createprocedure.html
"This form only works for LANGUAGE SQL, the string constant form
works for all languages. This form is parsed at procedure definition
time, the string constant form is parsed at execution time"

Why does the 2nd query fail? What am I missing?
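One way to see the parse-at-definition-time behavior (a sketch;
check_function_bodies controls body validation at CREATE):

SET check_function_bodies TO off;
-- now the same CREATE as in 2. succeeds; "test" is only looked up
-- when the procedure is actually CALLed
CREATE or replace PROCEDURE template_create() LANGUAGE SQL AS $proc$
DROP TABLE if exists test cascade;
CREATE TABLE test(id int4range, valid_at tsrange, name text);
INSERT INTO test VALUES (NULL, tsrange('2018-01-01', '2019-01-01'), 'null key');
$proc$;
RESET check_function_bodies;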




Re: Local postgres manual

2023-11-03 Thread jian he
I think Ben Hancock wants something like
https://en.cppreference.com/w/Cppreference:Archives




Re: Read only user permission

2023-08-23 Thread jian he
On Thu, Aug 24, 2023 at 2:49 AM Marcos Pegoraro  wrote:
>
> Have you tried grant pg_read_all_data to readonly ?
>

I tried. It seems pg_read_all_data really means "read all data no matter what",
so you cannot grant pg_read_all_data and then revoke certain schema privileges.

begin;
create user my_user login;
GRANT pg_read_all_data TO my_user;
create schema a;
create schema b;
create table a.a as select 1;
create table b.b as select 2;
revoke all PRIVILEGES on schema a, b from my_user CASCADE;
-- revoke all PRIVILEGES on schema a from my_user CASCADE;
revoke all PRIVILEGES on all tables in schema a,b from my_user CASCADE;
set role my_user;
table a.a;
table b.b;
rollback;
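A quick check (a sketch, run just before the rollback) shows the REVOKEs do
not undo what the role membership provides:

select has_table_privilege('my_user', 'a.a', 'select');  -- true
select has_schema_privilege('my_user', 'a', 'usage');    -- true, via pg_read_all_data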




EXPLAIN statement can also apply to SELECT INTO.

2023-08-21 Thread jian he
hi.

https://www.postgresql.org/docs/current/sql-explain.html
>>>
Any SELECT, INSERT, UPDATE, DELETE, MERGE, VALUES, EXECUTE, DECLARE,
CREATE TABLE AS, or CREATE MATERIALIZED VIEW AS statement, whose
execution plan you wish to see.
>>>

It seems you can also use it with SELECT INTO:

explain (ANALYZE, BUFFERS, TIMING, VERBOSE ON, WAL ON, SETTINGS ON)
select count(*) as cnt INTO s from tenk1;




Re: Loops and Case Statements Involving Dates

2023-08-21 Thread jian he
On Mon, Aug 21, 2023 at 3:07 PM Anthony Apollis
 wrote:
>
> Please review my code and make recommendations where needed. I have this code:
> ```
> NUMBER OF LOOPS FOR POSTGRESQL ETL:
> SELECT
> CASE
> WHEN (((EXTRACT(DAY FROM
> ((CASE
> WHEN
> (SELECT DISTINCT "CALDAY" FROM “system”.”IMETA_ZACTPRIC1_Cust_Pricing_TA_BW” 
> AS WEEK_NUMBER
> WHERE
> WEEK_NUMBER."CALDAY" = (SELECT MAX(DISTINCT "CALDAY") AS "Calendar day" FROM 
> “system”.”IMETA_ZACTPRIC1_Cust_Pricing_TA_BW” AS WEEK_NUMBER)) IS NULL THEN 
> '2020-07-01'
> ELSE
> (SELECT DISTINCT "CALDAY" FROM “system”.”IMETA_ZACTPRIC1_Cust_Pricing_TA_BW” 
> AS WEEK_NUMBER
> WHERE
> WEEK_NUMBER."CALDAY" = (SELECT MAX(DISTINCT "CALDAY") AS "Calendar day" FROM 
> “system”.”IMETA_ZACTPRIC1_Cust_Pricing_TA_BW” AS WEEK_NUMBER)) + interval '1 
> day'
> END))::timestamp - --start date
> (CAST(NOW() - INTERVAL '1 day' AS DATE))::timestamp)-1) --end date
> * -1)) <= 30 THEN 1
> ELSE
> CEIL(((EXTRACT(DAY FROM
> ((CASE
> WHEN
> (SELECT DISTINCT "CALDAY" FROM “system”.”IMETA_ZACTPRIC1_Cust_Pricing_TA_BW” 
> AS WEEK_NUMBER
> WHERE
> WEEK_NUMBER."CALDAY" = (SELECT MAX(DISTINCT "CALDAY") AS "Calendar day" FROM 
> “system”.”IMETA_ZACTPRIC1_Cust_Pricing_TA_BW” AS WEEK_NUMBER)) IS NULL THEN 
> '2020-07-01'
> ELSE
> (SELECT DISTINCT "CALDAY" FROM “system”.”IMETA_ZACTPRIC1_Cust_Pricing_TA_BW” 
> AS WEEK_NUMBER
> WHERE
> WEEK_NUMBER."CALDAY" = (SELECT MAX(DISTINCT "CALDAY") AS "Calendar day" FROM 
> system."IMETA_ZINV_AP1_Invoice_data_TA_BW"" AS WEEK_NUMBER)) + interval '1 
> day'
> END))::timestamp - --start date
> (CAST(NOW() - INTERVAL '1 day' AS DATE))::timestamp)-1) --end date
> * -1)/30) --30 DAY INTERVALS UNLESS LESS
> END
> AS "Number of days"
>

“system”.”IMETA_ZACTPRIC1_Cust_Pricing_TA_BW”
Some of these characters look like double quotes, but they are not the
ASCII double quote the parser expects.
wiki: https://en.wikipedia.org/wiki/Quotation_mark
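A quick way to see the difference (a sketch):

select 1 as "ok";    -- ASCII double quotes: a valid quoted identifier
select 1 as ”bad”;   -- U+201D curly quotes: syntax error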




Re: Dynamically accessing columns from a row type in a trigger

2023-08-12 Thread jian he
On Sun, Aug 13, 2023 at 11:27 AM Rhys A.D. Stewart
 wrote:
>
> Hey Adrian,
>
> Thanks for your response. I don't think I explained myself clearly.
> pk_col is not the column name. pk_col is a variable that holds the
> name of a column. This is one trigger for several tables with varying
> structures. So pk_col would be a column specific to the current
> TG_TABLE_NAME when the trigger is invoked. This is why in my example I
> had to use EXECUTE to get the value of the pk_col from OLD.
>
> Actually, now that I'm thinking about it, I don't really want to store
> the value into a variable because the pk_col might be of any given
> type. So ideally, I'd love a way to just get the value from OLD and
> use it directly in another query. Something along the lines of:
>
> `EXECUTE format('SELECT * FROM %1$I.sometable WHERE pk = $1', myschma)
> USING OLD['pk_col']`.
>
> I reckon I may have to look at just generating a trigger function per
> table, or maybe look into using TG_ARGS.
>

Google led me to this post:
https://stackoverflow.com/questions/55245353/access-dynamic-column-name-of-row-type-in-trigger-function

> table, or maybe look into using TG_ARGS.
maybe you are referring to TG_ARGV.

an example of TG_ARGV =>
https://git.postgresql.org/cgit/postgresql.git/tree/src/test/regress/expected/triggers.out
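One common approach can be sketched like this (names are illustrative, not
from the thread): convert the row to jsonb, then pick the column by name.

CREATE OR REPLACE FUNCTION generic_trg() RETURNS trigger
LANGUAGE plpgsql AS $$
DECLARE
    pk_col text := TG_ARGV[0];            -- column name passed as trigger arg
    pk_val text;
BEGIN
    pk_val := to_jsonb(OLD) ->> pk_col;   -- value as text, whatever its type
    RAISE NOTICE 'table %: % = %', TG_TABLE_NAME, pk_col, pk_val;
    RETURN OLD;
END $$;

It would be attached per table with something like
CREATE TRIGGER ... EXECUTE FUNCTION generic_trg('id').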




jsonb on-disk size calculation

2023-07-31 Thread jian he
hi.

drop table x;
create table x(js jsonb);
insert into x select '{"Hello world":1}'::jsonb;
select pg_column_size(js) from x; -- returns 33.

based on src/include/utils/jsonb.h,
the key and value part is 20 bytes (is that correct?), Jsonb->vl_len_ is 4
bytes, and JsonbContainer->header is 4 bytes. That's 28 bytes.

but the on-disk size is 33 bytes,
so I am not sure where the remaining bytes are.


Re: [Beginner Question]Is there way to test the postgres's kernel function?

2023-07-25 Thread jian he
On Mon, Jul 24, 2023 at 5:43 PM Wen Yi  wrote:
>
> Hi community,
> I am learning the kernel of the postgres, and I want to test the 
> make_relative_path function to know what it's operational principle.(I can 
> read it's code, and I want to actually run it)
> But I really don't know how to do it, wrtite a extension?
>
> Can someone give me some advice?
> Thanks in advance!
>
> Yours,
> Wen Yi.

Create a C file and copy the #include lines from c.h into it. Add an empty
main function, then run gcc -Wextra your_c_absolute_path.
All kinds of errors will then show up in the terminal;
find out why, and either delete the offending line or fix the error.

Finally, when no errors remain, experiment inside the empty main function.
The following example only ensures that gcc -Wextra reports no errors.

/*-

gcc -Wextra 
/home/jian/.config/Code/User/globalStorage/buenon.scratchpads/scratchpads/47a6066b9dfaca1c2b13ea2e99f4e929/scratch1.c

*/

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stddef.h>
#include <stdarg.h>
#ifdef HAVE_STRINGS_H
#include <strings.h>
#endif
#include <stdint.h>
#include <sys/types.h>
#include <errno.h>
#if defined(WIN32) || defined(__CYGWIN__)
#include <fcntl.h>				/* ensure O_BINARY is available */
#endif
#include <locale.h>
#ifdef ENABLE_NLS
#include <libintl.h>
#endif
#include <stdbool.h>			/* for bool / true */

#include <ctype.h>
#include <sys/stat.h>
#ifdef WIN32
#ifdef _WIN32_IE
#undef _WIN32_IE
#endif
#define _WIN32_IE 0x0500
#ifdef near
#undef near
#endif
#define near
#include <shlobj.h>
#else
#include <unistd.h>
#endif

// #include "pg_config_paths.h"
#define MAXPGPATH 1024
#define IS_DIR_SEP(ch) IS_NONWINDOWS_DIR_SEP(ch)

#define IS_NONWINDOWS_DIR_SEP(ch) ((ch) == '/')

#ifndef WIN32
#define IS_PATH_VAR_SEP(ch) ((ch) == ':')
#else
#define IS_PATH_VAR_SEP(ch) ((ch) == ';')
#endif

#define StaticAssertDecl(condition, errmessage) \
	_Static_assert(condition, errmessage)
#define StaticAssertStmt(condition, errmessage) \
	do { _Static_assert(condition, errmessage); } while(0)
#define StaticAssertExpr(condition, errmessage) \
	((void) ({ StaticAssertStmt(condition, errmessage); true; }))
#define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1

#if defined(HAVE__BUILTIN_TYPES_COMPATIBLE_P)
#define unconstify(underlying_type, expr) \
	(StaticAssertExpr(__builtin_types_compatible_p(__typeof(expr), const underlying_type), \
					  "wrong cast"), \
	 (underlying_type) (expr))
#endif

static void make_relative_path(char *ret_path, const char *target_path,
   const char *bin_path, const char *my_exec_path);
static char *trim_directory(char *path);
static void trim_trailing_separator(char *path);
static char *append_subdir_to_path(char *path, char *subdir);

char	   *first_dir_separator(const char *filename);

/*
 * Copy src to string dst of size siz.  At most siz-1 characters
 * will be copied.  Always NUL terminates (unless siz == 0).
 * Returns strlen(src); if retval >= siz, truncation occurred.
 * Function creation history:  http://www.gratisoft.us/todd/papers/strlcpy.html
 */
size_t
strlcpy(char *dst, const char *src, size_t siz)
{
	char	   *d = dst;
	const char *s = src;
	size_t		n = siz;

	/* Copy as many bytes as will fit */
	if (n != 0)
	{
		while (--n != 0)
		{
			if ((*d++ = *s++) == '\0')
				break;
		}
	}

	/* Not enough room in dst, add NUL and traverse rest of src */
	if (n == 0)
	{
		if (siz != 0)
			*d = '\0';			/* NUL-terminate dst */
		while (*s++)
			;
	}

	return (s - src - 1);		/* count does not include NUL */
}



/*
 * skip_drive
 *
 * On Windows, a path may begin with "C:" or "//network/".  Advance over
 * this and point to the effective start of the path.
 */
#ifdef WIN32

static char *
skip_drive(const char *path)
{
	if (IS_DIR_SEP(path[0]) && IS_DIR_SEP(path[1]))
	{
		path += 2;
		while (*path && !IS_DIR_SEP(*path))
			path++;
	}
	else if (isalpha((unsigned char) path[0]) && path[1] == ':')
	{
		path += 2;
	}
	return (char *) path;
}
#else

#define skip_drive(path) (path)
#endif

/*
 * has_drive_prefix
 *
 * Return true if the given pathname has a drive prefix.
 */
bool
has_drive_prefix(const char *path)
{
#ifdef WIN32
	return skip_drive(path) != path;
#else
	return false;
#endif
}

/*
 * first_dir_separator
 *
 * Find the location of the first directory separator, return
 * NULL if not found.
 */
char *
first_dir_separator(const char *filename)
{
	const char *p;

	for (p = skip_drive(filename); *p; p++)
		if (IS_DIR_SEP(*p))
			return unconstify(char *, p);
	return NULL;
}

/*
 * first_path_var_separator
 *
 * Find the location of the first path separator (i.e. ':' on
 * Unix, ';' on Windows), return NULL if not found.
 */
char *
first_path_var_separator(const char *pathlist)
{
	const char *p;

	/* skip_drive is not needed */
	for (p = pathlist; *p; p++)
		if (IS_PATH_VAR_SEP(*p))
			return unconstify(char *, p);
	return NULL;
}

/*
 * last_dir_separator
 *
 * Find the location of the last directory separator, return
 * NULL if not found.
 */
char *
last_dir_separator(const char *filename)
{
	const char *p,
			   *ret = NULL;

	for (p = skip_drive(filename); *p; p++)
		if (IS_DIR_SEP(*p))
			ret = p;
	return unconstify(char *, ret);
}


/*
 * make_native_path - on WIN32, change / to \ in the path

Re: How to improve the performance of my SQL query?

2023-07-24 Thread jian he
On Mon, Jul 24, 2023 at 5:54 PM gzh  wrote:
>
> >Did you change any parameters that have an impact on query planning?
>
> >You can see that in the output of EXPLAIN (ANALYZE, BUFFERS, SETTINGS).
>
> I added some parameters and re-executed the Execution Plan.
>
> Except for the index not taking effect, I still don't know the reason why the 
> index is not working.
>
> Is it because there is too much data that meets the conditions?
>
>
> EXPLAIN (ANALYZE ON, VERBOSE ON, COSTS ON, BUFFERS ON)
>
> select COUNT(ET_CD)
>
> from TBL_SHA
>
> WHERE TBL_SHA.MS_CD = 'MLD009'
>
> and TBL_SHA.ETRYS in
>
>(select TBL_INF.RY_CD
>
> from TBL_INF
>
> WHERE TBL_INF.MS_CD = 'MLD009'
>
>AND TBL_INF.RY_CD = '0001'
>
>)
>
>
> - Execution Plan -
>
> Limit  (cost=2728633.22..2728633.23 rows=1 width=8) (actual 
> time=128691.521..128717.677 rows=1 loops=1)
>
>   Output: (count(tbl_sha.et_cd))
>
>   Buffers: shared hit=58948 read=2112758
>
>   I/O Timings: read=357249.120
>
>   ->  Aggregate  (cost=2728633.22..2728633.23 rows=1 width=8) (actual 
> time=128691.519..128717.674 rows=1 loops=1)
>
> Output: count(tbl_sha.et_cd)
>
> Buffers: shared hit=58948 read=2112758
>
> I/O Timings: read=357249.120
>
> ->  Nested Loop  (cost=1000.29..2722556.76 rows=2430587 width=9) 
> (actual time=2.364..128350.279 rows=2613500 loops=1)
>
>   Output: tbl_sha.et_cd
>
>   Buffers: shared hit=58948 read=2112758
>
>   I/O Timings: read=357249.120
>
>   ->  Index Only Scan using tbl_inf_pkc on mtpdb.tbl_inf  
> (cost=0.29..8.31 rows=1 width=9) (actual time=0.046..0.051 rows=1 loops=1)
>
> Output: tbl_inf.ms_cd, tbl_inf.ry_cd
>
> Index Cond: ((tbl_inf.ms_cd = 'MLD009'::bpchar) AND 
> (tbl_inf.ry_cd = '0001'::bpchar))
>
> Heap Fetches: 1
>
> Buffers: shared hit=4
>
>   ->  Gather  (cost=1000.00..2698242.58 rows=2430587 width=18) 
> (actual time=2.315..127773.087 rows=2613500 loops=1)
>
> Output: tbl_sha.et_cd, tbl_sha.etrys
>
> Workers Planned: 2
>
> Workers Launched: 2
>
> Buffers: shared hit=58944 read=2112758
>
> I/O Timings: read=357249.120
>
> ->  Parallel Seq Scan on mtpdb.tbl_sha  
> (cost=0.00..2454183.88 rows=1012745 width=18) (actual 
> time=952.728..127583.089 rows=871167 loops=3)
>
>   Output: tbl_sha.et_cd, tbl_sha.etrys
>
>   Filter: ((tbl_sha.ms_cd = 'MLD009'::bpchar) AND 
> (tbl_sha.etrys = '0001'::bpchar))
>
>   Rows Removed by Filter: 14062278
>
>   Buffers: shared hit=58944 read=2112758
>
>   I/O Timings: read=357249.120
>
>   Worker 0:  actual time=1432.292..127762.181 
> rows=988036 loops=1
>
> Buffers: shared hit=17875 read=706862
>
> I/O Timings: read=119193.744
>
>   Worker 1:  actual time=1425.878..127786.777 
> rows=992381 loops=1
>
> Buffers: shared hit=19813 read=706359
>
> I/O Timings: read=119386.899
>
> Planning:
>
>   Buffers: shared hit=42
>
> Planning Time: 1.024 ms
>
> Execution Time: 128717.731 ms
>
>

I think the whole query can just be:
select COUNT(ET_CD)
from TBL_SHA
WHERE TBL_SHA.MS_CD = 'MLD009' AND TBL_SHA.ETRYS = '0001';

> and TBL_SHA.ETRYS in
>(select TBL_INF.RY_CD
> from TBL_INF
> WHERE TBL_INF.MS_CD = 'MLD009'
>AND TBL_INF.RY_CD = '0001'
>)

If the subquery after the IN clause returns no rows, then no outer rows can
satisfy the WHERE clause.
If it returns many duplicate rows, there is no point in evaluating
something like {1 in (1,1,1,1,1,1)}.




pageinspect bt_page_items doc

2023-07-23 Thread jian he
hi.

https://www.postgresql.org/docs/current/pageinspect.html#id-1.11.7.34.6

> This is a B-tree leaf page. All tuples that point to the table happen to be 
> posting list tuples (all of which store a total of 100 6 byte TIDs).
> There is also a “high key” tuple at itemoffset number 1. ctid is used to 
> store encoded information about each tuple in this example, though
> leaf page tuples often store a heap TID directly in the ctid field instead. 
> tids is the list of TIDs stored as a posting list.

(all of which store a total of 100 6 byte TIDs)
I think the meaning is something like:
(all of which store a total of 100 TIDs, each TID being 6 bytes long.)

What is the meaning of (16, 8292)? After looking around, I'm still confused.
Would it be better to add an explanation of (16, 8292) to the docs?




Re: Re: How to improve the performance of my SQL query?

2023-07-20 Thread jian he
On Thu, Jul 20, 2023 at 7:36 PM gzh  wrote:
>
>
> Thank you very much for taking the time to reply to my question.
>
>
> Sorry, I provided incorrect information.
>
> The index also does not work in the following query statement.
>
>
> > select COUNT(ET_CD)
>
> > from TBL_SHA
>
> > WHERE MS_CD = '009'
>
> > AND ETRYS = '01'
>
>
> QUERY PLAN
>
> Limit  (cost=2419643.47..2419643.48 rows=1 width=8) (actual 
> time=128667.439..128668.250 rows=1 loops=1)
>
>   ->  Finalize Aggregate  (cost=2419643.47..2419643.48 rows=1 width=8) 
> (actual time=128667.437..128668.246 rows=1 loops=1)
>
> ->  Gather  (cost=2419643.25..2419643.46 rows=2 width=8) (actual 
> time=128664.108..128668.233 rows=3 loops=1)
>
>   Workers Planned: 2
>
>   Workers Launched: 2
>
>   ->  Partial Aggregate  (cost=2418643.25..2418643.26 rows=1 
> width=8) (actual time=128655.256..128655.258 rows=1 loops=3)
>
> ->  Parallel Seq Scan on TBL_SHA  (cost=0.00..2415548.85 
> rows=1237762 width=9) (actual time=75357.455..128531.615 rows=107 loops=3)
>
>   Filter: ((MS_CD = '009'::bpchar) AND (ETRYS = 
> '01'::bpchar))
>
>   Rows Removed by Filter: 11833442
>
> Planning Time: 0.118 ms
>
> Execution Time: 128668.290 ms
>
>
> The TBL_SHA table has another index, as shown below.
>
>
> CREATE INDEX index_search_02 ON mdb.TBL_SHA USING btree (ET_CD, ETRYS)
>
> CREATE INDEX index_search_03 ON mdb.TBL_SHA USING btree (MS_CD, ET_DAY, 
> BK_CD, FR_CD, RM_CD)
>

>   Rows Removed by Filter: 11833442
select (38700325 - 11833442) / 38700325.0;
is approx 0.69,
so I think that says around 69% of the rows satisfy the query condition.

But I am not sure whether, in the following 2 cases, the actual row counts
are noisy or not. I cannot find a doc explaining it.
> Partial Aggregate  (actual time=128655.256..128655.258 rows=1 loops=3)
> Finalize Aggregate (actual time=128667.437..128668.246 rows=1 loops=1)




Re: error in the example given for numeric data types

2023-07-15 Thread jian he
> Page: https://www.postgresql.org/docs/15/datatype-numeric.html

"docs/15" means this url pointer to pg version 15.

> practice=# create table t1(height numeric(3,5));
> ERROR:  NUMERIC scale 5 must be between 0 and precision 3
> LINE 1: create table t1(height numeric(3,5));
>  Please look into the matter and kindly revert back to me whatever you find
> out about this so that i can correct myself incase i misunderstood what the
> document says...

It works in pg15 but not in pg14.
See my test: https://dbfiddle.uk/wgfjCx7j
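For reference, a quick sketch of the PG 15 behavior (the declared scale may
now exceed the precision):

create table t1(height numeric(3,5));   -- ok on 15+, error on 14
insert into t1 values (0.0012345);
select height from t1;                  -- 0.00123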




Re: psql -af > out, is possible to also have raise notice, raise info part.

2023-07-11 Thread jian he
On Wed, Jul 12, 2023 at 12:06 PM David G. Johnston
 wrote:
>
> On Tue, Jul 11, 2023 at 8:52 PM jian he  wrote:
>>
>> hi.
>>
>> test.sql content:
>> --
>> do $$
>> begin
>>   raise info 'information message %', now() ;
>>   raise debug 'debug message %', now();
>>   raise notice 'notice message %', now();
>> end $$;
>> --
>> psql -af test.sql > test.out
>>
>
> You've only redirected stdout (file # 1 - implied), the "raise" stuff goes to 
> stderr (file # 2)
>
> IIRC you can do:
>
> psql -af test.sql > test.out 2>&1
>
> (order matters, left-to-right)
>
> But you can search online for "output redirection in Linux" or some such if 
> you want to learn the Linux command line better.
>
> David J.
>

Thanks.
I didn't know that the "raise" output goes to stderr.
To also get rid of the line numbers, I use "psql -a < test.sql > test.out
2>&1" to get the expected result.




psql -af > out, is possible to also have raise notice, raise info part.

2023-07-11 Thread jian he
hi.

test.sql content:
--
do $$
begin
  raise info 'information message %', now() ;
  raise debug 'debug message %', now();
  raise notice 'notice message %', now();
end $$;
--
psql -af test.sql > test.out

current result:
--
do $$
begin
  raise info 'information message %', now() ;
  raise notice 'notice message %', now();
end $$;
DO
--
expected result in test.out

do $$
begin
  raise info 'information message %', now() ;
  raise notice 'notice message %', now();
end $$;
INFO:  information message 2023-07-12 11:49:27.894126+08
NOTICE:  notice message 2023-07-12 11:49:27.894126+08
DO




now() literal in pg_input_is_valid

2023-07-09 Thread jian he
Hi
Should I expect
select pg_input_is_valid($$[now(), infinity)$$, 'tstzrange');
to return true?




psql -c command parse "select $$hello$$" failed

2023-07-04 Thread jian he
Hi.
I am not sure whether this is the expected result.

/home/jian/postgres/pg16_test/bin/psql -d test_dev -p 5455 -c "select $$hello$$"
2023-07-04 20:15:51.066 CST [1562050] ERROR:  trailing junk after
numeric literal at or near "884070h" at character 8
2023-07-04 20:15:51.066 CST [1562050] STATEMENT:  select 884070hello884070
ERROR:  trailing junk after numeric literal at or near "884070h"
LINE 1: select 884070hello884070
   ^
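For what it's worth: inside double quotes, the shell expands $$ to its own
PID (which is what 884070 looks like); single quotes avoid that:

/home/jian/postgres/pg16_test/bin/psql -d test_dev -p 5455 -c 'select $$hello$$'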




Re: analyze partition

2023-06-29 Thread jian he
On Thu, Jun 29, 2023 at 7:55 PM Marc Millas  wrote:
>
> Hi,
>
> the documentation, on chapter 5.11.3.3 caveat  says that a manual vacuum or 
> analyze on the root table does it only for that said root table. To my 
> understanding, the root table when used with declarative partitioning, does 
> not contain data, so vacuuming or analyzing should be 'fast'.
> If I run vacuum analyze ma_table on my big partitioned table (130+ 
> partitions) it does work for quite a long time.
>
> Can someone clarify ?
>
> thanks
>
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>

per manual:
> 5.11.3.3. Caveats

>  The following caveats apply to partitioning implemented using inheritance:
.
> If you are using manual VACUUM or ANALYZE commands, don't forget that you 
> need to run them on each child table individually. A command like:
>
> ANALYZE measurement;
>
> will only process the root table.

Declarative partitioning is not the same thing. Here the caveat refers to
partitioning implemented using inheritance.
These two are different things.




timetz need more bytes than timestamptz

2023-06-18 Thread jian he
Hi,
https://www.postgresql.org/docs/current/datatype-datetime.html
timetz, timestamptz:
same resolution.
fractional digits in the seconds field are also the same.
>
> All timezone-aware dates and times are stored internally in UTC. They are
converted to local time in the zone specified by the TimeZone configuration
parameter before being displayed to the client.

Why does timetz need more bytes (12 vs 8) than timestamptz?


Re: Where I can find the achieve of the 'generate_series' ?

2023-05-29 Thread jian he
On Tue, May 30, 2023 at 8:35 AM Wen Yi <896634...@qq.com> wrote:

> Hi team,
> I want to figure out the principle of the function 'generate_series', but
> I don't know it contains in which source file.(I download the pgsql 15.3
> source code)
> Can someone provide some advice?
> Thanks in advance!
>
> Yours,
> Wen Yi
>


select proname, prosrc, prosupport
from pg_proc
where proname ~* 'generate_series';

then grep for the prosrc values in the source tree.


explicit-locking.html "key values" reference

2023-05-26 Thread jian he
hi.
https://www.postgresql.org/docs/current/explicit-locking.html

FOR KEY SHARE
> Behaves similarly to FOR SHARE, except that the lock is weaker: SELECT
> FOR UPDATE is blocked, but not SELECT FOR NO KEY UPDATE. A key-shared
> lock blocks other transactions from performing DELETE or any UPDATE that
> changes the key values, but not other UPDATE, and neither does it prevent 
> SELECT
> FOR NO KEY UPDATE, SELECT FOR SHARE, or SELECT FOR KEY SHARE.


A key-shared lock blocks other transactions from performing DELETE or any
> UPDATE that changes the key values


querya: select col1, col2, col3 from a for key share of a
Does "the key values" refer to the rows returned by querya?
I still feel confused about what "the key values" are.


confused by int2vector typdelim

2023-05-24 Thread jian he
hi.
select
    pt.typtype
    , pt.typcategory
    , pt.typdelim
    , pt.typelem
    , pt1.typname as elem_type
    , pt.typsubscript
    , pt.typname
from    pg_type pt join pg_type pt1 on pt.typelem = pt1.oid
where   pt.typname = 'int2vector';

returns

 typtype | typcategory | typdelim | typelem | elem_type |      typsubscript       |  typname
---------+-------------+----------+---------+-----------+-------------------------+------------
 b       | A           | ,        |      21 | int2      | array_subscript_handler | int2vector
(1 row)

from manual:

> typdelim char
> Character that separates two values of this type when parsing array input.
> Note that the delimiter is associated with the array element data type, not
> the array data type.


Should I expect the typdelim to be white space, since '1 2'::int2vector
works but '1,2'::int2vector does not?
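For comparison, a quick sketch of both input syntaxes:

select '1 2'::int2vector;   -- ok: int2vector input is whitespace-separated
select '{1,2}'::int2[];     -- ok: a regular int2 array uses the comma typdelim
select '1,2'::int2vector;   -- fails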


Re: function signature allow "default" keyword

2023-05-15 Thread jian he
On Mon, May 15, 2023 at 5:00 PM Thomas Kellerer  wrote:

> jian he schrieb am 15.05.2023 um 10:33:
> >
> > function idea.
> > allow function calling using the default keyword for any of the input
> arguments.
> >
> > example: https://dbfiddle.uk/FQwnfdmm
> > So something like this "SELECT * FROM customer_orders(2579927,
> 'order_placed_on DESC', default, 2);"
> > should work.
>
>
> You can use named parameters when calling the function, that will use the
> default value for those not mentioned:
>
> SELECT *
> FROM customer_orders(_customer_id => 2579927,
>  _sort_field_and_direction => 'order_placed_on DESC',
>  _offset => 2);
>
>
>
>

select  proname
        , proargtypes
        , pg_get_expr(pp.proargdefaults, 0, true)
        , pronargdefaults
from    pg_proc pp
where   pp.proname = 'customer_orders';
-- returns --
proname         | customer_orders
proargtypes     | 23 25 23 23 1082 1082
pg_get_expr     | 10, 0, CURRENT_DATE - 30, CURRENT_DATE
pronargdefaults | 4

Table INSERT works, I guess, because pg_attribute stores each column's
ordinal number (left to right),
but pg_proc only records the number of arguments that have a default value.


function signature allow "default" keyword

2023-05-15 Thread jian he
function idea.
allow function calling using the default keyword for any of the input
arguments.

example: https://dbfiddle.uk/FQwnfdmm
So something like this "SELECT * FROM customer_orders(2579927,
'order_placed_on DESC', default, 2);"
should work.


customized function to Reconstructs the creating command for a role.

2023-05-05 Thread jian he
Hi,
similar to pg_get_functiondef ( *func* oid ) → text
is there any function to get the role definition?


pg_class char type should be "char"?

2023-04-27 Thread jian he
Hi,
https://www.postgresql.org/docs/current/catalog-pg-class.html

relkind char
> relpersistence char
> relreplident char


Should the type of these 3 columns be "char" (with the quotes)?
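A quick way to check what the catalog actually uses (a sketch):

select attname, atttypid::regtype
from pg_attribute
where attrelid = 'pg_class'::regclass
  and attname in ('relkind', 'relpersistence', 'relreplident');
-- all three come back as "char"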


wiki.postgres ​ Tighten trigger permission checks already resolved

2023-04-24 Thread jian he
Hi.
The following Todo item seems already resolved in PG 15.

https://wiki.postgresql.org/wiki/Todo#Triggers

>  Tighten trigger permission checks
>
>- Security leak with trigger functions?
>
>
>
But it does not seem to appear in the PG 15 release notes. (I searched for
the keywords "trigger" and "function".)


Re: alter table rename column can event trigger capture new column name

2023-04-23 Thread jian he
On Mon, Apr 24, 2023 at 10:46 AM jian he 
wrote:

>
> hi.
> alter table t1 rename col1 to id.
>
> Is it possible to use an event trigger to capture the new column name
> "id"?
>
>
>
sorry for the noise.
I figured it out:
substring(_object_identity FROM (LENGTH(_object_identity) + 2 -
STRPOS(REVERSE(_object_identity), '.')))
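For context, a minimal sketch of the surrounding event trigger (names are
illustrative; it reads object_identity from pg_event_trigger_ddl_commands()):

CREATE OR REPLACE FUNCTION capture_rename() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT * FROM pg_event_trigger_ddl_commands()
    LOOP
        RAISE NOTICE 'tag: %, identity: %', r.command_tag, r.object_identity;
    END LOOP;
END $$;

CREATE EVENT TRIGGER trg_capture_rename
    ON ddl_command_end
    WHEN TAG IN ('ALTER TABLE')
    EXECUTE FUNCTION capture_rename();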


alter table rename column can event trigger capture new column name

2023-04-23 Thread jian he
hi.
alter table t1 rename col1 to id.

Is it possible to use an event trigger to capture the new column name "id"?


doc sql-grant.html Synopsis error?

2023-04-08 Thread jian he
Hi.
-- works as intended.
grant ALL PRIVILEGES on FUNCTION pg_catalog.pg_reload_conf() to test;
grant ALL PRIVILEGES on FUNCTION pg_reload_conf() to test;

-- errors. Should these work, or am I interpreting the doc the wrong
way?
GRANT ALL PRIVILEGES ON FUNCTION pg_reload_conf() IN SCHEMA pg_catalog TO
test;
GRANT ALL PRIVILEGES ON FUNCTION pg_catalog.pg_reload_conf() IN SCHEMA
pg_catalog TO test;

doc: https://www.postgresql.org/docs/current/sql-grant.html

GRANT { EXECUTE | ALL [ PRIVILEGES ] }
    ON { { FUNCTION | PROCEDURE | ROUTINE } routine_name [ ( [ [ argmode ] [ arg_name ] arg_type [, ...] ] ) ] [, ...]
         | ALL { FUNCTIONS | PROCEDURES | ROUTINES } IN SCHEMA schema_name [, ...] }
    TO role_specification [, ...] [ WITH GRANT OPTION ]
    [ GRANTED BY role_specification ]

I am using postgres 16, but the grant-on-function part has not changed.
What did I miss?


Re: ​jsonb @@ jsonpath operator doc: ​Only the first item of the result is taken into account

2023-04-03 Thread jian he
On Tue, Apr 4, 2023 at 12:22 AM Erik Wienhold  wrote:

> > On 03/04/2023 17:36 CEST Adrian Klaver 
> wrote:
> >
> > On 4/3/23 08:11, Erik Wienhold wrote:
> > >> On 02/04/2023 17:40 CEST Adrian Klaver 
> wrote:
> > >>
> > >> That is a long way from:
> > >>
> > >> jsonb @@ jsonpath → boolean
> > >>
> > >> Returns the result of a JSON path predicate check for the specified
> JSON
> > >> value. Only the first item of the result is taken into account. If the
> > >> result is not Boolean, then NULL is returned.
> > >
> > > What do you mean?  I responded to the OP's question.  It's not a
> suggestion
> > > to update the docs.  Obviously it's quite a mouthful and needs to be
> boiled
> > > down for the docs.  Any suggestions?
> >
> > For me I don't see how:
> >
> > Predicates have existence semantics, because their operands are item
> > sequences.  Pairs of items from the left and right operand's sequences
> > are checked.  TRUE returned only if any pair satisfying the condition is
> > found. In strict mode, even if the desired pair has already been found,
> > all pairs still need to be examined to check the absence of errors.  If
> > any error occurs, UNKNOWN (analogous to SQL NULL) is returned.
> >
> > resolves to :
> >
> > Only the first item of the result is taken into account.
> >
> > In other words reconciling "TRUE returned only if any pair satisfying
> > the condition is found."  and "...first item of the result..."
>
> I see.
>
> Thinking about it now, I believe that "first item of the result" is
> redundant
> (and causing the OP's confusion) because the path predicate produces only a
> single item: true, false, or null.  That's what I wanted to show with the
> first
> two jsonb_path_query examples in my initial response, where the second
> example
> returns multiple items.
>
> I think the gist of @@ and json_path_match is:
>
> "Returns true if any JSON value at the given path matches the predicate.
>  Returns NULL when not a path predicate or comparing different types."
>
> --
> Erik
>
>
>
"Returns true if any JSON value at the given path matches the predicate.
>  Returns NULL when not a path predicate or comparing different types."
>
For the first sentence, should we add something like "otherwise returns false"?
Also, should it be "Return true" (since only one value is returned)?


​jsonb @@ jsonpath operator doc: ​Only the first item of the result is taken into account

2023-04-01 Thread jian he
Hi,
https://www.postgresql.org/docs/current/functions-json.html

> jsonb @@ jsonpath → boolean
>
> Returns the result of a JSON path predicate check for the specified JSON
> value. Only the first item of the result is taken into account. If the
> result is not Boolean, then NULL is returned.
>
> '{"a":[1,2,3,4,5]}'::jsonb @@ '$.a[*] > 2' → t
>

select jsonb_path_query('{"a":[1,2,3,4,5]}', '$.a[*]');
returns

 jsonb_path_query
> --
>  1
>  2
>  3
>  4
>  5
> (5 rows)
>

I don't understand "Only the first item of the result is taken into
account."

Here, some of the JSON path predicate checks for the specified JSON value
return true, and some return false (1 > 2 is false, 2 > 2 is false).


Re: src/test/examples/testlibpq2.c where the HAVE_SYS_SELECT_H is defined.

2023-03-17 Thread jian he
Yes, you are right. My mistake. I should have chosen the dev version of the manual.

On Fri, Mar 17, 2023 at 4:12 PM Thomas Munro  wrote:

> On Fri, Mar 17, 2023 at 7:48 PM jian he 
> wrote:
> > Hi,
> > playing around with the testlibpq2.c example [0]. I wondered where
> HAVE_SYS_SELECT_H is defined?
> >
> > I searched on the internet and found that people have asked the same
> question in [1].
> >
> > On my machine, I do have <sys/select.h>.
> > system version: Ubuntu 22.04.1 LTS
> > gcc version: gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
> > gcc compile command: gcc pg_testlibpq2.c
> -I/home/jian/postgres/pg16/include \
> > -L/home/jian/postgres/pg16/lib -lpq
> >
> > [0]https://www.postgresql.org/docs/current/libpq-example.html
> > [1]
> https://stackoverflow.com/questions/37876850/in-compilation-time-how-to-find-the-macro-is-defined-in-which-header-file
>
> In 15 and earlier, it is defined in pg_config.h, which is created by
> configure.  But in 16, that particular macro was removed by commit
> 7e50b4e3c.  It looks like you are using PostgreSQL 16 sources, but
> looking at PostgreSQL 15 examples?
>




src/test/examples/testlibpq2.c where the HAVE_SYS_SELECT_H is defined.

2023-03-17 Thread jian he
Hi,
playing around with the testlibpq2.c example [0]. I wondered where
HAVE_SYS_SELECT_H is defined?

I searched on the internet and found that people have asked the same
question in [1].

On my machine, I do have <sys/select.h>.
system version: Ubuntu 22.04.1 LTS
gcc version: gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0
gcc compile command: gcc pg_testlibpq2.c -I/home/jian/postgres/pg16/include
\
-L/home/jian/postgres/pg16/lib -lpq

[0]https://www.postgresql.org/docs/current/libpq-example.html
[1]
https://stackoverflow.com/questions/37876850/in-compilation-time-how-to-find-the-macro-is-defined-in-which-header-file


Re: How to pass table column values to function

2023-02-11 Thread jian he
On Sat, Feb 11, 2023 at 3:55 PM Andrus  wrote:

> Hi!
>
> Table source contains integer column. Its values should be passed to
> function for selecting data from other table.
>
> I tried
>
> CREATE OR REPLACE FUNCTION
> public.TestAddAssetTransactions(dokumnrs int[])
>  RETURNS int AS
> $BODY$
>
> with i1 as (
> INSERT INTO bilkaib (dokumnr)
> select dokumnr  from dok where dokumnr in (select * from
> unnest(dokumnrs))
> returning *
> )
>
> select count(*) from i1;
> $BODY$ language sql;
>
> create temp table bilkaib (dokumnr int ) on commit drop;
> create temp table dok (dokumnr serial primary key ) on commit drop;
> create temp table source (dokumnr int ) on commit drop;
> insert into source values (1),(2);
>
> select TestAddAssetTransactions( (select ARRAY[dokumnr] from
> source)::int[] )
>
>
> but got error
>
> > ERROR: more than one row returned by a subquery used as an expression
>
> How to pass set of integers to function? Should temp table with fixed name
> used or is there better solution?
>
> Using Postgresql 12+
>
> Andrus.
>
try this:
select TestAddAssetTransactions((select array(select * from source)));
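An equivalent spelling with array_agg (a sketch):

select TestAddAssetTransactions((select array_agg(dokumnr) from source));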


38.10.6. Composite-Type Arguments C-language function code demo works for int, but not for numeric.

2023-02-01 Thread jian he
Hi.

source:
https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-C-BASETYPE
(38.10.6. Composite-Type Arguments)
column "salary" int data type works fine. But it does not work if the
"salary" column data type is numeric.

/*
rm funcs.so && rm funcs.o
gcc -I/home/jian/postgres/pg16/include/postgresql/server -fPIC -c funcs.c
gcc -shared -o funcs.so funcs.o
*/

#include "postgres.h"
#include "executor/executor.h"
#include  "utils/numeric.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(c_overaid);

Datum
c_overaid(PG_FUNCTION_ARGS)
{
HeapTupleHeader t = PG_GETARG_HEAPTUPLEHEADER(0);
Numeric   limit = PG_GETARG_NUMERIC_COPY(1);
bool isnull;
Datum   salary;

salary = GetAttributeByName(t, "salary", &isnull);
if(isnull)
PG_RETURN_BOOL(false);

/* alternatively, we might prefer to do PG_RETURN_NULL() for null
salary */
PG_RETURN_BOOL(DatumGetNumericCopy(salary) > limit);
}
---
then quit the session and reconnect.

drop table emp cascade;
create table emp(empid bigint, salary numeric, name text);
insert into emp values(1,1000,'a1');
insert into emp values(2,1100,'a2');
insert into emp values(3,1200,'a3');
CREATE OR REPLACE FUNCTION c_overaid(emp, numeric) returns boolean AS
'/home/jian/pg_ext/misc/funcs','c_overaid' LANGUAGE C STRICT;
select name, c_overaid(emp,numeric '1110') as over_paid from emp;
it returns

>  name | over_paid
> --+---
>  a1   | t
>  a2   | t
>  a3   | t
> (3 rows)
>

I also tried PG_GETARG_NUMERIC and DatumGetNumeric(salary).


Re: does refreshing materialized view make the database bloat?

2023-01-15 Thread jian he
On Mon, Jan 16, 2023 at 10:28 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Sunday, January 15, 2023, jian he  wrote:
>
>>
>>
>> Hi,
>> why the materialized view itself bloats. If no refresh then no bloat
>> right? If fresh then set based delete operation will make materialized view
>> bloat?
>> I also found the same question online.
>> https://dba.stackexchange.com/questions/219079/bloat-on-materialized-views
>> Unfortunately nobody answered...
>>
>>
> The definition of bloat is a deleted row.  Bloat can be reduced by
> subsequent row insertions.
>
> David J.
>
>
Hi.
In the following example, I cannot see bloat (via the pgstattuple extension,
dead_tuple_count > 0). Wondering where the bloat is.

BEGIN;create table tbt( a int) with(fillfactor=40, autovacuum_enabled=off);
insert into tbt  select g from generate_series(1,2000) g;
create materialized view tbtmv as select * from tbt;
commit;

do the update:
update tbt set a = 10 + a where a < 20;
REFRESH MATERIALIZED VIEW tbtmv;
SELECT * FROM pgstattuple('tbtmv'); -- no dead tuple count

-- try delete
delete from tbt where a < 50;
REFRESH MATERIALIZED VIEW tbtmv;
SELECT * FROM pgstattuple('tbtmv'); -- still no dead tuples


Re: does refreshing materialized view make the database bloat?

2023-01-15 Thread jian he
On Sat, Jan 14, 2023 at 11:49 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Friday, January 13, 2023, jian he  wrote:
>
>>
>> Once we have the diff table, we perform set-based DELETE and INSERT
>>> operations against the materialized view, and discard both temporary
>>> tables.
>>>
>>
>> Here the temporary tables are "discard" meaning the temporary tables are
>> deleted and the temporary tables' spaces are reclaimed immediately?
>> Or the temporary tables are deleted and the spaces will be reclaimed by
>> another mechanism?
>>
>> simplify:does refreshing materialized view make the database bloat.
>>
>>
> The materialized view itself bloats.  The temp tables are removed
> immediately (the “drop table” docs don’t say this explicitly though it the
> most logical behavior and implied by the fact it takes an access exclusive
> lock).
>
> David J.
>
>
Hi,
Why does the materialized view itself bloat? If there is no refresh, then
there is no bloat, right? If refreshed, will the set-based DELETE operation
make the materialized view bloat?
I also found the same question online:
https://dba.stackexchange.com/questions/219079/bloat-on-materialized-views
Unfortunately nobody answered...


does refreshing materialized view make the database bloat?

2023-01-13 Thread jian he
Hi,

src_backend_commands_matview.c
547: /*
548:  * refresh_by_match_merge
549:  *
550:  * Refresh a materialized view with transactional semantics, while
allowing
551:  * concurrent reads.
552:  *
553:  * This is called after a new version of the data has been created in a
554:  * temporary table.  It performs a full outer join against the old
version of
555:  * the data, producing "diff" results.  This join cannot work if there
are any
556:  * duplicated rows in either the old or new versions, in the sense
that every
557:  * column would compare as equal between the two rows.  It does work
correctly
558:  * in the face of rows which have at least one NULL value, with all
non-NULL
559:  * columns equal.  The behavior of NULLs on equality tests and on
UNIQUE
560:  * indexes turns out to be quite convenient here; the tests we need to
make
561:  * are consistent with default behavior.  If there is at least one
UNIQUE
562:  * index on the materialized view, we have exactly the guarantee we
need.
563:  *
564:  * The temporary table used to hold the diff results contains just the
TID of
565:  * the old record (if matched) and the ROW from the new table as a
single
566:  * column of complex record type (if matched).
567:  *
568:  * Once we have the diff table, we perform set-based DELETE and INSERT
569:  * operations against the materialized view, and discard both temporary
570:  * tables.
571:  *
572:  * Everything from the generation of the new data to applying the
differences
573:  * takes place under cover of an ExclusiveLock, since it seems as
though we
574:  * would want to prohibit not only concurrent REFRESH operations, but
also
575:  * incremental maintenance.  It also doesn't seem reasonable or safe
to allow
576:  * SELECT FOR UPDATE or SELECT FOR SHARE on rows being updated or
deleted by
577:  * this command.
578:  */
579:

> Once we have the diff table, we perform set-based DELETE and INSERT
> operations against the materialized view, and discard both temporary
> tables.
>

Here the temporary tables are "discarded", meaning the temporary tables are
dropped and their space is reclaimed immediately?
Or are the temporary tables dropped and the space reclaimed by another
mechanism?

To simplify: does refreshing a materialized view make the database bloat?


https://wiki.postgresql.org/wiki/Working_with_Git link one link cannot open, another link is unrelated.

2022-12-28 Thread jian he
Hi. This is more of a git question than a postgres question.
I executed the following commands:

rm /home/jian/Downloads/001_psql_parse_only_test.patch
git checkout master
git branch --delete --force feature_to_review
git  checkout -b feature_to_review
patch -p1 --batch < /home/jian/Downloads/001_psql_parse_only.v1.patch
git commit -a -m "test"
git diff --patience master feature_to_review >
/home/jian/Downloads/001_psql_parse_only_test.patch
git diff --color-words --no-index
/home/jian/Downloads/001_psql_parse_only.v1.patch
 /home/jian/Downloads/001_psql_parse_only_test.patch

Should the last command return nothing? The idea is to create a local
branch, apply the downloaded patch file, and commit it.
Then use git diff to validate that the local branch = master + patch file,
by comparing the downloaded patch file with the output of (git diff branchA
branchB).
Generally, what are the correct commands (the workflow for applying a patch
file), and how can I validate that the local branch has only the patch
file's changes and nothing else?


also in this link: https://wiki.postgresql.org/wiki/Working_with_Git

> See the documentation and tutorials at http://git.or.cz/ for a more
> detailed Git introduction. For a more detailed lesson, check out
> http://progit.org and maybe get a hardcopy to help support the site.
>
the first link does not work, and the second link is not that related to git.


Re: ON CONFLICT and WHERE

2022-11-13 Thread jian he
On Mon, Nov 14, 2022 at 2:55 AM Adrian Klaver 
wrote:

> On 11/13/22 13:07, Tom Lane wrote:
> > Adrian Klaver  writes:
> >> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
> >> ON CONFLICT (id)
> >> WHERE updated IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
> >> DO UPDATE
> >> SET version = books.version + 1, updated = CURRENT_TIMESTAMP;
> >
> >> I have not used WHERE with ON CONFLICT myself so it took longer then I
> >> care to admit to correct the above to:
> >
> >> INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
> >> ON CONFLICT (id)
> >> DO UPDATE
> >> SET version = books.version + 1, updated = CURRENT_TIMESTAMP
> >> WHERE books.version IS NULL OR books.updated + INTERVAL '2min' <
> >> CURRENT_TIMESTAMP;
> >
> >> The question is why did the first case just ignore the WHERE instead of
> >> throwing a syntax error?
> >
> > A WHERE placed there is an index_predicate attachment to the ON CONFLICT
> > clause.  It doesn't have any run-time effect other than to allow partial
> > indexes to be chosen as arbiter indexes.  TFM explains
> >
> >  index_predicate
> >
> >  Used to allow inference of partial unique indexes. Any indexes
> >  that satisfy the predicate (which need not actually be partial
> >  indexes) can be inferred.
> >
> > This strikes me as a bit of a foot-gun.  I wonder if we should make
> > it safer by insisting that the resolved index be partial when there's
> > a WHERE clause here.  (This documentation text is about as clear as
> > mud, too.  What does "inferred" mean here?  I think it means "chosen as
> > arbiter index", but maybe I misunderstand.)
>
> Alright I see how another use of WHERE comes into play.
>
> I do agree with the clarity of the description, especially after looking
> at the example:
>
> "
> Insert new distributor if possible; otherwise DO NOTHING. Example
> assumes a unique index has been defined that constrains values appearing
> in the did column on a subset of rows where the is_active Boolean column
> evaluates to true:
>
> -- This statement could infer a partial unique index on "did"
> -- with a predicate of "WHERE is_active", but it could also
> -- just use a regular unique constraint on "did"
> INSERT INTO distributors (did, dname) VALUES (10, 'Conrad International')
>  ON CONFLICT (did) WHERE is_active DO NOTHING;
> "
>
> I honestly cannot figure out what that is saying.
>
> >
> >   regards, tom lane
>
> --
> Adrian Klaver
> adrian.kla...@aklaver.com
>
>
>
>
INSERT INTO books VALUES (12, 0, CURRENT_TIMESTAMP)
ON CONFLICT (id)
WHERE version IS NULL OR updated + INTERVAL '2min' < CURRENT_TIMESTAMP
DO UPDATE
SET version = books.version + 1, updated = CURRENT_TIMESTAMP;

Since id is already the primary key, the WHERE part is skipped; it resolves
to the DO UPDATE part.

From the regression test code:

> create table insertconflicttest(key int4, fruit text);
> create unique index partial_key_index on insertconflicttest(key) where
> fruit like '%berry';
>

In this case, the ON CONFLICT clause must be exactly "on conflict (key)
where fruit like '%berry'".

-- fails
insert into insertconflicttest values (23, 'Blackberry') on conflict (key)
do update set fruit = excluded.fruit;
ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT
specification
insert into insertconflicttest values (23, 'Blackberry') on conflict (key)
where fruit like '%berry' or fruit = 'consequential' do nothing;
ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT
specification
insert into insertconflicttest values (23, 'Blackberry') on conflict
(fruit) where fruit like '%berry' do update set fruit = excluded.fruit;
ERROR:  there is no unique or exclusion constraint matching the ON CONFLICT
specification




for integer/bigint type while format to scientific notation, automatically get the correct number of precision

2022-11-08 Thread jian he
Hi,

looking for the reverse of select '1.2923904e+07'::numeric::int;
every time, I have to select to_char(12923904,'9.999');
if I do select to_char(12923904,'9.99');
then the result is different.

For integer or bigint can I automatically have something like:
select to_char(int,'9.(precision-1)');
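A sketch of building the format string from the value's digit count
(assuming non-negative integers):

select to_char(12923904,
               '9.' || repeat('9', length(12923904::text) - 1) || 'EEEE');
-- => 1.2923904e+07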




Re: Delete a table automatic?

2022-11-01 Thread jian he
On Tue, Nov 1, 2022 at 2:33 PM 黄宁  wrote:

> I now have two tables named A and B. Table B is calculated based on the
> data of table A. I wonder if table B can be automatically deleted when
> table A is deleted?
>

Your question is not that specific.
You can use https://dbfiddle.uk/btGcOH30 to showcase your specific
problem/question.

You can use DROP TABLE ... CASCADE, which also drops objects that depend on
table A (such as views).
DROP TABLE manual:
https://www.postgresql.org/docs/current/sql-droptable.html
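A minimal sketch, assuming table B is derived from A as a view:

create table a (x int);
create view b as select x * 2 as y from a;
drop table a cascade;   -- drops view b as well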



Re: How to know how much CPU, RAM is used by existing 1 database

2022-10-26 Thread jian he
On Wed, Oct 26, 2022 at 11:07 AM Yi Sun  wrote:

> Hi Guys,
>
> Who can help me with this please? I researched but still no result yet,
> thank you
>
> On Tue, 25 Oct 2022 at 16:30, Yi Sun  wrote:
>
>> Hi,
>>
>> There are many databases in our production patroni cluster and it seems
>> it is overloaded, so we decide to migrate the busiest database to a new
>> patroni cluster.
>>
>> pgwatch2 is implemented, how to know how much CPU, RAM is used by the
>> database please? Then we can use it to prepare the new patroni cluster
>> hardware. Thank you
>>
>> Best regards
>> Dennis
>>
>
manual: https://www.postgresql.org/docs/current/runtime-config-resource.html
Except for min_dynamic_shared_memory (integer),
all the other parameters are used to cap memory. Almost all of them say
"database server", which means they apply at the cluster level.




understand pg_ndistinct type && Why with(autovacuum_enabled=off) some query estimate 100, some is 200.

2022-09-14 Thread jian he
source:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/test/regress/expected/stats_ext.out;h=431b3fa3de1f4f87205e7e27a99ef1cf337f1590;hb=676887a3b0b8e3c0348ac3f82ab0d16e9a24bd43

-- n-distinct tests
CREATE TABLE ndistinct (
filler1 TEXT,
filler2 NUMERIC,
a INT,
b INT,
filler3 DATE,
c INT,
d INT
)
WITH (autovacuum_enabled = off);
-- over-estimates when using only per-column statistics
INSERT INTO ndistinct (a, b, c, filler1)
 SELECT i/100, i/100, i/100, cash_words((i/100)::money)
   FROM generate_series(1,1000) s(i);
ANALYZE ndistinct;
-- Group Aggregate, due to over-estimate of the number of groups
SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct
GROUP BY a, b');
 estimated | actual
-----------+--------
       100 |     11
(1 row)

SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct
GROUP BY b, c');
 estimated | actual
-----------+--------
       100 |     11
(1 row)

SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct
GROUP BY a, b, c');
 estimated | actual
-----------+--------
       100 |     11
(1 row)

SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct
GROUP BY a, b, c, d');
 estimated | actual
-----------+--------
       200 |     11
(1 row)

SELECT * FROM check_estimated_rows('SELECT COUNT(*) FROM ndistinct
GROUP BY b, c, d');
 estimated | actual
-----------+--------
       200 |     11
(1 row)

-- correct command
CREATE STATISTICS s10 ON a, b, c FROM ndistinct;
ANALYZE ndistinct;
SELECT s.stxkind, d.stxdndistinct
  FROM pg_statistic_ext s, pg_statistic_ext_data d
 WHERE s.stxrelid = 'ndistinct'::regclass
   AND d.stxoid = s.oid;
 stxkind |                    stxdndistinct
---------+-----------------------------------------------------
 {d,f,m} | {"3, 4": 11, "3, 6": 11, "4, 6": 11, "3, 4, 6": 11}
(1 row)

---

I don't understand why the GROUP BY b, c, d estimate is 200, while GROUP
BY a, b, c is 100.
I also don't understand the last query's stxdndistinct result.
I know what d, f, m refer to.
I may have found the pg_ndistinct type source:
https://doxygen.postgresql.org/mvdistinct_8c.html#a03c06f5f0db3fc22cd5323ea04906a7c
But my C knowledge is limited.

Is there any way at the SQL level to query more info (like the base type)
about pg_ndistinct?



In a partition why 1st time encounter NULL then call minvfunc

2022-08-25 Thread jian he
same question:
https://stackoverflow.com/questions/73476732/postgresql-in-a-partition-1st-time-null-then-call-minvfunc

-- logging_sfunc_strict as defined in PostgreSQL's regression test
-- aggregates.sql (the aggregate below requires it)
create or replace function logging_sfunc_strict(text, anyelement) returns text as
$$ select $1 || '*' || quote_nullable($2) $$
language sql strict immutable;

create or replace function logging_msfunc_strict(text, anyelement) returns text as
$$ select $1 || '+' || quote_nullable($2) $$
language sql strict immutable;

create or replace function logging_minvfunc_strict(text, anyelement) returns text as
$$ select $1 || '-' || quote_nullable($2) $$
language sql strict immutable;

create aggregate logging_agg_strict(text)
(
    stype = text,
    sfunc = logging_sfunc_strict,
    mstype = text,
    msfunc = logging_msfunc_strict,
    minvfunc = logging_minvfunc_strict
);

create aggregate logging_agg_strict_initcond(anyelement)
(
stype = text,
sfunc = logging_sfunc_strict,
mstype = text,
msfunc = logging_msfunc_strict,
minvfunc = logging_minvfunc_strict,
initcond = 'I',
minitcond = 'MI'
);

execute the following query:

SELECT
    p::text || ',' || i::text || ':' || COALESCE(v::text, 'NULL') AS _row,
    logging_agg_strict (v) OVER w AS nstrict,
    logging_agg_strict_initcond (v) OVER w AS nstrict
FROM (
    VALUES (1, 1, NULL),
        (1, 2, 'a'),
        (1, 3, 'b'),
        (1, 4, NULL),
        (1, 5, NULL),
        (1, 6, 'c'),
        (2, 1, NULL),
        (2, 2, 'x'),
        (3, 1, 'z')) AS t (p, i, v)
WINDOW w AS (PARTITION BY p ORDER BY i ROWS BETWEEN 1 PRECEDING AND CURRENT ROW);

returns the following result:

   _row   |  nstrict  |    nstrict
----------+-----------+-----------------
 1,1:NULL | [[null]]  | MI
 1,2:a    | a         | MI+'a'
 1,3:b    | a+'b'     | MI+'a'+'b'
 1,4:NULL | a+'b'-'a' | MI+'a'+'b'-'a'
 1,5:NULL | [[null]]  | MI
 1,6:c    | c         | MI+'c'
 2,1:NULL | [[null]]  | MI
 2,2:x    | x         | MI+'x'
 3,1:z    | z         | MI+'z'
(9 rows)


For now I don't understand the row (1,4:NULL | a+'b'-'a' | MI+'a'+'b'-'a').
I am not sure why, the first time a NULL is encountered, the inverse
transition function is called. Overall, I am not sure about the idea of
the inverse transition function.


Re: Creating constraint dynamically

2022-08-22 Thread jian he
On Mon, Aug 22, 2022 at 12:59 PM sivapostg...@yahoo.com <
sivapostg...@yahoo.com> wrote:

> Hello,
> Using PG 11.4
>
> We populate constraint string dynamically and add it to the table with
> alter table command.  It gets added, but without the required brackets.
> What we build is
> ALTER TABLE public.tx_barcode_stock
> ADD CONSTRAINT "tx_barcode_stock_CK1" CHECK
> ( (branchcode = '1'::bpchar  and barcodeitem = 'Y'::bpchar and
> closingstock >= 0::numeric)  Or (branchcode = '1' and barcodeitem =
> 'N'::bpchar and closingstock >= 0::numeric )  Or (branchcode = '2'::bpchar
> and barcodeitem = 'Y'::bpchar and closingstock >= 0::numeric)  Or
> (branchcode = '2' and  barcodeitem = 'N'::bpchar and closingstock >=
> 0::numeric ) ) NOT VALID;
>
> After creation, when we check what we find is  [ in PgAdmin ]
> ALTER TABLE public.tx_barcode_stock
> ADD CONSTRAINT "tx_barcode_stock_CK1" CHECK (branchcode::bpchar =
> '1'::bpchar AND barcodeitem = 'Y'::bpchar AND closingstock >= 0::numeric OR
> branchcode::text = '1'::text AND barcodeitem = 'N'::bpchar AND closingstock
> >= 0::numeric OR branchcode::bpchar = '2'::bpchar AND barcodeitem =
> 'Y'::bpchar AND closingstock >= 0::numeric OR branchcode::text = '2'::text
> AND barcodeitem = 'N'::bpchar AND closingstock >= 0::numeric)
> NOT VALID;
>
> We have only one bracket, in the final updated one.
>
> Since there are AND and OR conditions, without brackets the whole
> conditions becomes useless.
>
> How to create a constraint like the above one, with braces in tact ?   Or
> any other way that we can implement a check constraint as above?
>
> Happiness Always
> BKR Sivaprakash
>
>
I guess it's because pgAdmin uses pg_get_constraintdef ( constraint oid [,
pretty boolean ] ) with pretty set to true.
If you use pg_get_constraintdef ( constraint oid, false ) you will see
more parentheses.
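
For example, a minimal sketch of fetching the stored, non-pretty
definition (the constraint name is the one from the thread):

select pg_get_constraintdef(oid, false)
from pg_constraint
where conname = 'tx_barcode_stock_CK1';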


-- 
 I recommend David Deutsch's <>

  Jian


lateral join with union all

2022-08-15 Thread jian he
select * from
(
(select 2 as v )
union all
(select 3 as v)
) as q1
cross join lateral
(   (select * from
((select 4 as v) union all
(select 5 as v)) as q3
)
union all
(select q1.v)
) as  q2;

I thought q1 would be materialized as a constant set, equivalent to
select 2 union all select 3;
then the result would have 8 (2 * 4) rows total, like {2,3} cross
joined with {2,3,4,5}.

But here the actual result (6 rows) feels like two separate
queries (A, B) unioned together:
QueryA: (select 2 as v) cross join lateral (...)
QueryB: (select 3 as v) cross join lateral (...)
Query A has 3 rows + Query B has 3 rows, so the total is 6 rows.

That leaves me a little confused.

-- 
 I recommend David Deutsch's <>

  Jian


Re: ICU is not supported in this build. install from source code.

2022-08-03 Thread jian he
 undefined reference to `ucol_getSortKey_70'
> /usr/bin/ld: utils/adt/varlena.o: in function `varstr_abbrev_convert':
> /home/jian/Desktop/pg16_source/postgresql/src/backend/utils/adt/varlena.c:2589:
> undefined reference to `ucol_getSortKey_70'
> /usr/bin/ld:
> /home/jian/Desktop/pg16_source/postgresql/src/backend/utils/adt/varlena.c:2574:
> undefined reference to `uiter_setUTF8_70'
> /usr/bin/ld:
> /home/jian/Desktop/pg16_source/postgresql/src/backend/utils/adt/varlena.c:2577:
> undefined reference to `ucol_nextSortKeyPart_70'
> /usr/bin/ld: utils/adt/varlena.o: in function `varstrfastcmp_locale':
> /home/jian/Desktop/pg16_source/postgresql/src/backend/utils/adt/varlena.c:2411:
> undefined reference to `ucol_strcoll_70'
> /usr/bin/ld:
> /home/jian/Desktop/pg16_source/postgresql/src/backend/utils/adt/varlena.c:2392:
> undefined reference to `ucol_strcollUTF8_70'
> /usr/bin/ld: utils/adt/varlena.o: in function `varstr_cmp':
> /home/jian/Desktop/pg16_source/postgresql/src/backend/utils/adt/varlena.c:1688:
> undefined reference to `ucol_strcoll_70'
> /usr/bin/ld:
> /home/jian/Desktop/pg16_source/postgresql/src/backend/utils/adt/varlena.c:1669:
> undefined reference to `ucol_strcollUTF8_70'
> /usr/bin/ld: utils/adt/varlena.o: in function `varstr_abbrev_convert':
> /home/jian/Desktop/pg16_source/postgresql/src/backend/utils/adt/varlena.c:2584:
> undefined reference to `u_errorName_70'
> /usr/bin/ld: utils/adt/varlena.o: in function `varstrfastcmp_locale':
> /home/jian/Desktop/pg16_source/postgresql/src/backend/utils/adt/varlena.c:2397:
> undefined reference to `u_errorName_70'
> /usr/bin/ld: utils/adt/varlena.o: in function `varstr_cmp':
> /home/jian/Desktop/pg16_source/postgresql/src/backend/utils/adt/varlena.c:1674:
> undefined reference to `u_errorName_70'
> collect2: error: ld returned 1 exit status
> make[2]: *** [Makefile:66: postgres] Error 1
> make[2]: Leaving directory
> '/home/jian/Desktop/pg16_source/postgresql/src/backend'
> make[1]: *** [Makefile:42: all-backend-recurse] Error 2
> make[1]: Leaving directory '/home/jian/Desktop/pg16_source/postgresql/src'
> make: *** [GNUmakefile:16: world-src-recurse] Error 2
>

So does PostgreSQL not support ICU version 71 yet?
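
For what it's worth, a hedged variant of the configure step that lets
pkg-config supply the ICU flags instead of hand-written paths (assuming
the libicu-dev and pkg-config packages are installed, plus a clean rebuild
afterwards, since the log shows objects still referring to ICU 70 symbols
while ICU 71 is installed):

./configure --with-perl --with-python --with-icu \
  ICU_CFLAGS="$(pkg-config --cflags icu-uc icu-i18n)" \
  ICU_LIBS="$(pkg-config --libs icu-uc icu-i18n)" \
  --enable-debug --with-pgport=5439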

On Tue, Aug 2, 2022 at 8:08 PM jian he  wrote:

>
> Install from source code(follow the manual)
> system version:  Ubuntu 22.04 LTS
> pg version:  PostgreSQL 15beta2 on x86_64-pc-linux-gnu, compiled by gcc
> (Ubuntu 11.2.0-19ubuntu1) 11.2.0, 64-bit
>
> install step:
>
>> ./configure --with-perl --with-python  --with-icu
>> ICU_CFLAGS='-I/usr/include/unicode'  ICU_LIBS='-L/usr/lib/icu'
>> --enable-debug --with-pgport=5439
>>
> make world
>> sudo su
>> make install world
>> mkdir -p /usr/local/pgsql/data
>> chown jian  /usr/local/pgsql/data
>> su - jian
>> /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
>> /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
>> /usr/local/pgsql/bin/createdb test
>> /usr/local/pgsql/bin/psql test
>>
>
> I installed ICU4C 71.1package.
> use /usr/bin/icuinfo  return
>
>> [icuinfo output; the XML markup was stripped by the archive. It reports:
>> ICU 71.1, Unicode 14.0, platform Linux, default locale en_US, charset
>> UTF-8, ICU data file icudt71l, CLDR 41.0, tz data 2022a, default
>> timezone Asia/Kolkata, built for x86_64-pc-linux-gnu with gcc/g++.]
>>
>>
>> ICU Initialization returned: U_ZERO_ERROR
>
> Plugins are disabled.
>>
>
> But now I cannot use ICU. When I create a collation that uses ICU I get:
>
>> ERROR:  ICU is not supported in this build
>>
>
>
> --
>  I recommend David Deutsch's <>
>
>   Jian
>
>
>

-- 
 I recommend David Deutsch's <>

  Jian


ICU is not supported in this build. install from source code.

2022-08-02 Thread jian he
Install from source code(follow the manual)
system version:  Ubuntu 22.04 LTS
pg version:  PostgreSQL 15beta2 on x86_64-pc-linux-gnu, compiled by gcc
(Ubuntu 11.2.0-19ubuntu1) 11.2.0, 64-bit

install step:

> ./configure --with-perl --with-python  --with-icu
> ICU_CFLAGS='-I/usr/include/unicode'  ICU_LIBS='-L/usr/lib/icu'
> --enable-debug --with-pgport=5439
>
make world
> sudo su
> make install world
> mkdir -p /usr/local/pgsql/data
> chown jian  /usr/local/pgsql/data
> su - jian
> /usr/local/pgsql/bin/initdb -D /usr/local/pgsql/data
> /usr/local/pgsql/bin/pg_ctl -D /usr/local/pgsql/data -l logfile start
> /usr/local/pgsql/bin/createdb test
> /usr/local/pgsql/bin/psql test
>

I installed ICU4C 71.1package.
use /usr/bin/icuinfo  return

> [icuinfo output; the XML markup was stripped by the archive. It reports:
> ICU 71.1, Unicode 14.0, platform Linux, default locale en_US, charset
> UTF-8, ICU data file icudt71l, CLDR 41.0, tz data 2022a, default
> timezone Asia/Kolkata, built for x86_64-pc-linux-gnu with gcc/g++.]
>
>
> ICU Initialization returned: U_ZERO_ERROR

Plugins are disabled.
>

But now I cannot use ICU. When I create a collation that uses ICU I get:

> ERROR:  ICU is not supported in this build
>


-- 
 I recommend David Deutsch's <>

  Jian


Re: postgis

2022-07-20 Thread jian he
Can you try compiling from source: https://postgis.net/source/

PostGIS 3.2.1 is OK. PostgreSQL & PostGIS version compatibility info:
https://trac.osgeo.org/postgis/wiki/UsersWikiPostgreSQLPostGIS

If you already installed it, check the PostGIS version:
https://postgis.net/docs/PostGIS_Version.html
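
For example, once the extension is installed, a minimal check using the
function from the page linked above:

select postgis_version();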

On Wed, Jul 20, 2022 at 7:37 PM Marc Millas  wrote:

> ???
> > I didn't get any error, as I don't know the name of the package to be
> > installed!!!
>
> my question was, and still is:
> "Hi,
> I would like to install postgis 3.04 on a debian 11.
>
> > digging into various web sites, I didn't find the name of that package.
>
> can someone help ?"
>
> so..
> the only info on the debian postgis page I was able to find is the name of
> a package:
> postgres-12-postgis-3 which do install a postgis 3.2.1
> not a postgis 3.0.x
>
>
>
>
>
>
> Marc MILLAS
> Senior Architect
> +33607850334
> www.mokadb.com
>
>
>
> On Wed, Jul 20, 2022 at 3:52 PM Ron  wrote:
>
>> This long drama is about *POSTGIS*, not Postgresql.  What error do you
>> get when trying to install *POSTGIS*?
>>
>> On 7/20/22 08:26, Marc Millas wrote:
>>
>> ???
>>
>> I did describe precisely what I did:
>>
>> On 7/19/22 2:09 PM, Marc Millas wrote:
>>> > I did run each step of the script and did install a postgres 12.11.
>>> > then destroyed the instance created by the script, and, then
>>> > pg_createcluster a new one, which is running fine.
>>
>> no error messages.. Postgres is fine.
>>
>> My question (from the beginning) is about the availability of a postgis
>> 3.0.x distro for postgres 12 on debian 11,
>> and, if yes, the name of the package, and the address of the repo.
>>
>> To my understanding, such a postgis distro does not exist in the
>> postgresql.org repos,
>> so through this mailing list I was trying to find out whether it exists
>> somewhere.
>>
>> I know that I am supposed to post such a question on the postgis mailing
>> list.
>> But.. I registered through the postgis web site and didn't get any answer.
>> ...
>>
>> Marc MILLAS
>> Senior Architect
>> +33607850334
>> www.mokadb.com
>>
>>
>>
>> On Wed, Jul 20, 2022 at 10:25 AM Ron  wrote:
>>
>>> You've never shown us *exactly what you did*, along with any *error
>>> messages*.
>>>
>>> On 7/19/22 22:07, Marc Millas wrote:
>>>
>>> Postgres installed, but not postgis.. which is why I need some help...
>>>
>>> Marc MILLAS
>>> Senior Architect
>>> +33607850334
>>> www.mokadb.com
>>>
>>>
>>>
>>> On Tue, Jul 19, 2022 at 11:43 PM Adrian Klaver <
>>> adrian.kla...@aklaver.com> wrote:
>>>
 On 7/19/22 2:09 PM, Marc Millas wrote:
 > I did run each step of the script and did install a postgres 12.11.
 > then destroyed the instance created by the script, and, then
 > pg_createcluster a new one, which is running fine.

 Does this mean you have PostGIS installed now?

 > Marc MILLAS
 > Senior Architect
 > +33607850334
 > www.mokadb.com 


 --
 Adrian Klaver
 adrian.kla...@aklaver.com

>>>
>>> --
>>> Angular momentum makes the world go 'round.
>>>
>>
>> --
>> Angular momentum makes the world go 'round.
>>
>

-- 
 I recommend David Deutsch's <>

  Jian


Re: About limitation of using postgresql in china

2022-07-19 Thread jian he
On Tue, Jul 19, 2022 at 2:07 PM pig lee  wrote:

> Dear postgresql member
>
>
>
> I will use postgresql in china for a project.
>
> But I’m not sure Whether there are some limitations when using
> postgresql in china(limitation for China only).
>
> For example,license limitation in china or other limitation when used.
>
>
> Can you tell me the postgresql limitations when used for china only.
>
>
>
> Look forward to your Email.
>
> Thank you very much.
>
>
>
> Best Regards
>
> Pengkun
>

Hi,

postgresql license: https://www.postgresql.org/about/licence/

Technical limitations: so far I tested
converting Chinese text to a date, for example:
https://stackoverflow.com/questions/67597719/how-to-convert-02-3%E6%9C%88-21-to-date-21-03-02-00-psql
and a date to Chinese text:
https://www.postgresql.org/docs/current/functions-formatting.html
Ordering by Chinese pinyin and by stroke count both work.
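
For example, a minimal sketch of pinyin-style ordering (assuming an
ICU-enabled build, which provides the predefined "zh-x-icu" collation;
the table and column names are illustrative):

select name from people order by name collate "zh-x-icu";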

Obviously there are some tricky cases you probably need to take care of,
like: is 壹 (U+58F9) the same as 一 (U+4E00)?
Also, there are simplified Chinese vs. traditional Chinese issues.

I didn't test it, but I guess it would take more effort if you name your
SQL identifiers in Chinese.



-- 
 I recommend David Deutsch's <>

  Jian


first order by then partition by x < a fixed value.

2022-07-14 Thread jian he
This question (https://stackoverflow.com/q/72975669/15603477) is fun.
DB fiddle link:
https://dbfiddle.uk/?rdbms=postgres_14=36d685ad463831877ae70361be2cfa3b

account
 size | id | name
------+----+--------
  100 |  1 | John
  200 |  2 | Mary
  300 |  3 | Jane
  400 |  4 | Anne
  100 |  5 | Mike
  600 |  6 | Joanne

Then expected output:
account
 group | size | id | name
-------+------+----+--------
     1 |  100 |  1 | John
     1 |  200 |  2 | Mary
     1 |  300 |  3 | Jane
     2 |  400 |  4 | Anne
     2 |  100 |  5 | Mike
     3 |  600 |  6 | Joanne

The idea: fix the order by id, then take a cumulative sum; rows whose
running total stays <= 600 are grouped together under the same row_number.

But I imagine this kind of question has been solved many times.
The currently posted solution uses a recursive query, which is really
hard for me.
Is there any simpler or more intuitive way to solve this kind of problem?
I can write a customized aggregate function whose accumulation stops at 600.

I can get the following result; so what's the next step?

+--+++---+-+
| size | id |  name  | capped_at_600 | large_or_eq_600 |
+--+++---+-+
|  100 |  1 | John   |   100 | f   |
|  200 |  2 | Mary   |   300 | f   |
|  300 |  3 | Jane   |   600 | t   |
|  400 |  4 | Anne   |   400 | f   |
|  100 |  5 | Mike   |   500 | f   |
|  600 |  6 | Joanne |  1100 | t   |
+--+++---+-+
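
For reference, a hedged sketch of the "stops at 600" custom aggregate
mentioned above (the function and aggregate names are illustrative, not
from the thread):

create function capped_add(numeric, numeric) returns numeric
language sql immutable as
$$ select case when $1 >= 600 then $2 else $1 + $2 end $$;

create aggregate capped_sum(numeric)
(
    sfunc = capped_add,
    stype = numeric,
    initcond = 0
);

Then capped_sum(size) over (order by id) reproduces the capped_at_600
column, and the group number is 1 + the count of earlier rows whose
running value reached 600.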



-- 
 I recommend David Deutsch's <>

  Jian


range with infinity bound data type disallow extra empty white space.

2022-07-09 Thread jian he
Hi, there.

Not sure this is a bug, or intentional.
select '[1, )'::int8range; --fail.
select '[1,)'::int8range; --ok.

tested on pg14.4, pg15.1


-- 
 I recommend David Deutsch's <>

  Jian


GIN index operator ?(jsonb,text) not working?

2022-06-28 Thread jian he
Hi,

dbfiddle:
https://dbfiddle.uk/?rdbms=postgres_14=060af497bbb75ecddad9fd2744f8022b
---
create  table test101 (  doc_id bigserial, document jsonb);
insert into test101(document) values ('{"user_removed" :false}') returning
*;
insert into  test101(document)  select '{"user_removed" :false}'::jsonb
from generate_series(1,1);
insert into  test101(document)  select ('{"user_remove"  :false, "test":'
|| i ||'}'::text)::jsonb
from generate_series(1,400) i;

CREATE INDEX test101_gin_user_removed_na ON test101 USING GIN (document
jsonb_ops)
where (document ? 'user_removed') is false;
CREATE INDEX test101_gin_user_removed_na_b ON test101(document)
where (document ? 'user_removed') is false;

Since 400 is only a very small percentage of the total,
the following query should be selective, yet it will not use any of the
indexes:
explain (analyze)  select * from test101 where  document ?
'user_removed' is false ;



-- 
 I recommend David Deutsch's <>

  Jian


Source code test data folder don't have CSV files. How to get the CSV file.

2022-06-22 Thread jian he
trying to understand the source code test part.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/test/regress/sql/copy.sql;h=d72d226f341f42c69ffcb773c4faf53d9e586894;hb=072132f04e55c1c3b0f1a582318da78de7334379

First, I can just ignore all the abs_srcdir, abs_builddir variables, since
I can type the absolute file path; so up to line 125 I know how to do it.
Then there is \set filename :abs_builddir '/results/copytest.csv'.
There are only data, expected, and sql folders, so where is the results
folder? How can I easily get the CSV file? (Because I don't know that much
about make and C.)
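
(A hedged note: the results directory is created by the regression test
driver when the suite runs, for example via

make -C src/test/regress check

so the generated copytest.csv only appears after a test run.)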

-- 
 I recommend David Deutsch's <>

  Jian


Re: '{"x": 42, "y": null}'::jsonb != '{"x": 42}'::jsonb ... Really?

2022-06-19 Thread jian he
On Sat, Jun 18, 2022 at 5:44 AM Bryn Llewellyn  wrote:

>
> /* — START OF SPEC —— */
>
>
> The document's top-level object may use only these keys:
>
> "isbn" — string
> values must be unique across the entire set of documents (in other words,
> it defines the unique business key); values must have this pattern:
>
>   « ^[0-9]{3}-[0-9]{1}-[0-9]{2}-[0-9]{6}-[0-9]{1}$ »
>
> "title" — string
>
> "year" — number
> must be a positive integral value
>
> "authors" — array of objects;
> must be at least one object
>
> "genre" — string
>
> Each object in the "authors" array object may use only these keys:
>
> "family name" — string
>
> "given name" — string
>
> String values other than for "isbn" are unconstrained.
>
> Any key other than the seven listed here is illegal. The "genre" and
> "given name" keys are not required. All the other keys are required.
>
> The meaning of *required* is that no extracted value must bring a SQL
> null (so a required key must not have a JSON null value).
>
> And the meaning of *not required* is simply "no information is available
> for this key" (with no nuances). The spec author goes further by adding a
> rule: this meaning must be expressed by the absence of such a key.
>
>
> /* — END OF SPEC  */
>

create temp table source(
    isbn text primary key,
    book_info_text text,
    book_info jsonb generated always as ( book_info_text::jsonb ) stored
        CONSTRAINT test_jsonb_constraints1 check (book_info_text is json)
        CONSTRAINT test_jsonb_constraints2 check
            (JSON_EXISTS(book_info_text::jsonb, '$.title'))
        CONSTRAINT test_jsonb_constraints3 check
            (JSON_VALUE(book_info_text::jsonb, '$.year' returning int) > 0)
        CONSTRAINT test_jsonb_constraints4 check
            (JSON_EXISTS(book_info_text::jsonb, '$.genre'))
        CONSTRAINT test_jsonb_constraints5 check
            (not JSON_EXISTS(book_info_text::jsonb, '$.not_as_toplevel_key'))
        CONSTRAINT test_jsonb_constraints6 check (
            (JSON_VALUE(book_info_text::jsonb, '$.authors[*]."family name"')
             is not null)
            OR
            (JSON_VALUE(book_info_text::jsonb, '$.authors[*]."given name"')
             is not null)
        )
);

There are some problems I don't know how to solve. My intuition says that
keeping the isbn attribute inside the json document column and then
enforcing the unique constraint on it would be an anti-pattern, so I put
the isbn outside as a separate column.
Another constraint is that only certain keys are allowed in the jsonb; I
don't know how to implement it, but I feel like it's doable (see the
sketch after this paragraph).
With two columns, one text and one generated stored jsonb column, there is
a duplication issue.
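
A hedged sketch of the "only certain keys" check mentioned above; the
helper function name and the key list are illustrative, not from the
thread:

create function only_allowed_keys(j jsonb, allowed text[]) returns boolean
language sql immutable as
$$ select not exists
       (select 1 from jsonb_object_keys(j) k where k <> all (allowed)) $$;

-- which could then back an extra check constraint, e.g.:
-- check (only_allowed_keys(book_info_text::jsonb,
--                          '{title,year,authors,genre}'))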

So there is an alternative way to do it:
normal relational tables, where insert is done via the json_table
construct (converting json to a table) and output can easily be done with
row_to_json.

For example:

> select * from json_table('{
>   "title"   : "Joy Luck Club",
>   "year": 2006,
>   "authors" : [{"given name": "Amy", "family name" : "Tan"}],
>   "genre"   : "Novel"
>   }'::jsonb,
> '$'
> COLUMNS(
> id for ordinality,
> title text path '$.title',
> year int path '$.year',
> genre text path '$.genre',
> nested path '$.authors[*]'
> columns(
> "given name" text path '$."given name"'
> ,"family name" text path '$."family name"'
> )
> )
> );
>







-- 
 I recommend David Deutsch's <>

  Jian


Re: GIN theory

2022-06-03 Thread jian he
theory/explanation about GIN index:
https://github.com/postgres/postgres/blob/master/src/backend/access/gin/README
https://postgrespro.com/blog/pgsql/4261647
https://pgpedia.info/g/gin.html


On Fri, Jun 3, 2022 at 2:34 PM huangning...@yahoo.com <
huangning...@yahoo.com> wrote:

> Hi:
>
> I want to know the time that create a gin index for a array, or some
> theory about gin index?
>
>
> Thanks
>


-- 
 I recommend David Deutsch's <>

  Jian


How to display complicated Chinese character: Biang.

2022-06-02 Thread jian he
Inspired by this thread:
https://www.postgresql.org/message-id/011f01d8757e%24f5d69700%24e183c500%24%40ndensan.co.jp
Trying to display some special Chinese characters in Postgresql. For now I
am using postgresql 15 beta1. The OS is Ubuntu 20.

localhost:5433 admin@test=# show LC_COLLATE;
+------------+
| lc_collate |
+------------+
| C.UTF-8    |
+------------+

localhost:5433 admin@test=# select icu_unicode_version();

+---------------------+
| icu_unicode_version |
+---------------------+
| 13.0                |
+---------------------+

icu_unicode_version is the extension function.

Wiki about character Biang: https://en.wikipedia.org/wiki/Biangbiang_noodles

quote:

> The character's traditional and simplified forms were added to Unicode
>  version 13.0 in March 2020 in the CJK
> Unified Ideographs Extension G
>  block
> of the newly allocated Tertiary Ideographic Plane
> .[19]
>  The
> corresponding Unicode characters are:
>

Unicode character info: https://www.compart.com/en/unicode/U+30EDD

query

with strings(s) as (
>  values (U&'\+0030EDD')
> )
> select s,
>   octet_length(s),
>   char_length(s),
>   (select count(*) from icu_character_boundaries(s,'en')) as graphemes
> from strings;
>

return

+-----+--------------+-------------+-----------+
|  s  | octet_length | char_length | graphemes |
+-----+--------------+-------------+-----------+
| ロD |            4 |           2 |         2 |
+-----+--------------+-------------+-----------+

That seems not right; shouldn't graphemes be 1?
And I am not sure values (U&'\+0030EDD') is the same as ໝ.
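
A hedged observation: the \+ Unicode escape takes exactly six hex digits,
so '\+0030EDD' parses as U+30ED (ロ) followed by a literal D, which
matches the two-character result above. A minimal sketch of the six-digit
form for U+30EDD:

select U&'\+030EDD' as s, char_length(U&'\+030EDD');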





-- 
 I recommend David Deutsch's <>

  Jian


Re: link that explain ICU_LOCALE locale concept about local variant code and keywords.

2022-05-25 Thread jian he
I found the answer at https://www.unicode.org/reports/tr35/tr35-collation.html
and https://cldr.unicode.org/index/bcp47-extension
(the "u" introduces the BCP 47 Unicode locale extension, and "kf" is the
colCaseFirst collation keyword, here set to sort upper case first).

On Wed, May 25, 2022 at 1:52 PM jian he  wrote:

>
> code from
> https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com
>
>> DROP DATABASE IF EXISTS dbicu;
>> CREATE DATABASE dbicu LOCALE_PROVIDER icu LOCALE 'en_US' ICU_LOCALE
>> 'en-u-kf-upper' template 'template0';
>> \c dbicu
>> CREATE COLLATION upperfirst (provider = icu, locale = 'en-u-kf-upper');
>> CREATE TABLE icu(def text, en text COLLATE "en_US", upfirst text COLLATE
>> upperfirst);
>> INSERT INTO icu VALUES ('a', 'a', 'a'), ('b','b','b'), ('A','A','A'),
>> ('B','B','B');
>> SELECT def AS def FROM icu ORDER BY def;
>> SELECT def AS en FROM icu ORDER BY en;
>> SELECT def AS upfirst FROM icu ORDER BY upfirst;
>> SELECT def AS upfirst_explicit FROM icu ORDER BY en COLLATE upperfirst;
>> SELECT def AS en_x_explicit FROM icu ORDER BY def COLLATE "en-x-icu";
>>
>
>    - trying to replicate the above quoted command.
>    - So I don't know what the "u" and the "kf" in ICU_LOCALE
>      'en-u-kf-upper' refer to.
>
> Even I followed
> https://unicode-org.github.io/icu/userguide/locale/#the-locale-concept
>
>1. Language code
>2. Script code
>3. Country code
>4. Variant code
>5. Keywords
>
> So on which website can I get the info about the "kf" and "u"?
>
> --
>  I recommend David Deutsch's <>
>
>   Jian
>
>
>

-- 
 I recommend David Deutsch's <>

  Jian


link that explain ICU_LOCALE locale concept about local variant code and keywords.

2022-05-25 Thread jian he
code from
https://www.postgresql.org/message-id/flat/5e756dd6-0e91-d778-96fd-b1bcb06c161a%402ndquadrant.com

> DROP DATABASE IF EXISTS dbicu;
> CREATE DATABASE dbicu LOCALE_PROVIDER icu LOCALE 'en_US' ICU_LOCALE
> 'en-u-kf-upper' template 'template0';
> \c dbicu
> CREATE COLLATION upperfirst (provider = icu, locale = 'en-u-kf-upper');
> CREATE TABLE icu(def text, en text COLLATE "en_US", upfirst text COLLATE
> upperfirst);
> INSERT INTO icu VALUES ('a', 'a', 'a'), ('b','b','b'), ('A','A','A'),
> ('B','B','B');
> SELECT def AS def FROM icu ORDER BY def;
> SELECT def AS en FROM icu ORDER BY en;
> SELECT def AS upfirst FROM icu ORDER BY upfirst;
> SELECT def AS upfirst_explicit FROM icu ORDER BY en COLLATE upperfirst;
> SELECT def AS en_x_explicit FROM icu ORDER BY def COLLATE "en-x-icu";
>
- trying to replicate the above quoted command.
- So I don't know what the "u" and the "kf" in ICU_LOCALE 'en-u-kf-upper'
refer to.
Even I followed
https://unicode-org.github.io/icu/userguide/locale/#the-locale-concept

   1. Language code
   2. Script code
   3. Country code
   4. Variant code
   5. Keywords

So on which website can I get the info about the "kf" and "u"?

-- 
 I recommend David Deutsch's <>

  Jian


Re: cast to domain with default collation issue.

2022-05-25 Thread jian he
Excerpts from the PostgreSQL 15 manual:

Otherwise, all input expressions must have the same implicit collation
> derivation or the default collation. If any non-default collation is
> present, that is the result of the collation combination. Otherwise, the
> result is the default collation.
>

For example, consider this table definition:
>
> CREATE TABLE test1 (
> a text COLLATE "de_DE",
> b text COLLATE "es_ES",
> ...
> );
>
> Then in
>
> SELECT a < 'foo' FROM test1;
>
> the < comparison is performed according to de_DE rules, because the
> expression combines an implicitly derived collation with the default
> collation.
>
The query SELECT a < 'foo' FROM test1;
is an example of "If any non-default collation is present, that is the
result of the collation combination."

So perhaps it should read something like: "if either side of the
expression has a non-default implicitly derived collation, that collation
is the result of the collation combination".

I personally feel the wording "non-default" may not be quite correct,
because if the column is text then it automatically has at least the
default collation.

see manual quote about default collation:

> The collation of an expression can be the “default” collation, which
> means the locale settings defined for the database. It is also possible for
> an expression's collation to be indeterminate. In such cases, ordering
> operations and other operations that need to know the collation will fail.
>





On Wed, May 25, 2022 at 12:08 PM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> Please don’t top-post.
>
> On Tuesday, May 24, 2022, jian he  wrote:
>
>>
>> Otherwise, all input expressions must have the same implicit collation
>>> derivation or the default collation. If any non-default collation is
>>> present, that is the result of the collation combination. Otherwise, the
>>> result is the default collation.
>>>
>>
>> I think the above quote part can be used to explain the  following
>> examples.
>>
>>> CREATE TABLE test1 ( a text COLLATE "de_DE", b text COLLATE "es_ES", c
>>> text );
>>> SELECT a < 'foo' FROM test1;
>>
>> SELECT c < 'foo' FROM test1;
>>
>> But the "non-default" seems not quite correct to me. For a column that
>> does not mention anything, the default value is null. So with
>> create table test111( a text ), the default collation for column a is the
>> same as the output of show lc_collate.
>>
>> so there is no "non-default"?
>>
>>
> I’m not following the point you are trying to make.  table111.a
> contributes the default collation for any expression needing a collation
> implicitly resolved.
>
> David J.
>
>


-- 
 I recommend David Deutsch's <>

  Jian


Re: cast to domain with default collation issue.

2022-05-24 Thread jian he
Otherwise, all input expressions must have the same implicit collation
> derivation or the default collation. If any non-default collation is
> present, that is the result of the collation combination. Otherwise, the
> result is the default collation.
>

I think the above quote part can be used to explain the  following
examples.

> CREATE TABLE test1 ( a text COLLATE "de_DE", b text COLLATE "es_ES", c
> text );
> SELECT a < 'foo' FROM test1;

SELECT c < 'foo' FROM test1;

But the "non-default" seems not quite correct to me. For a column that
does not mention anything, the default value is null. So with
create table test111( a text ), the default collation for column a is
the same as the output of show lc_collate.

so there is no "non-default"?




On Tue, May 24, 2022 at 10:43 AM David G. Johnston <
david.g.johns...@gmail.com> wrote:

> On Monday, May 23, 2022, jian he  wrote:
>
>> CREATE DOMAIN testdomain AS text;
>>
>>  --asume the default collation is as per show LC_COLLATE;
>>
>> – on my pc, it is C.UTF-8.
>>
>> --So the testdomain will be collation "C.UTF-8"
>> 
>>
>> => \d collate_test1
>>
>> Table "test.collate_test1"
>>
>>  Column |  Type   | Collation | Nullable | Default
>>
>> +-+---+--+-
>>
>>  a  | integer |   |  |
>>
>>  b  | text| en-x-icu  | not null |
>>
>> ---
>>
>> My guess is that the following should be the same.
>>
>>
>>
> My reading of the docs say this is consistent with outcome #2.
>
> https://www.postgresql.org/docs/current/collation.html
>
>  David J.
>
>

-- 
 I recommend David Deutsch's <>

  Jian


cast to domain with default collation issue.

2022-05-23 Thread jian he
CREATE DOMAIN testdomain AS text;

 --asume the default collation is as per show LC_COLLATE;

– on my pc, it is C.UTF-8.

--So the testdomain will be collation "C.UTF-8"


=> \d collate_test1

Table "test.collate_test1"

 Column |  Type   | Collation | Nullable | Default

+-+---+--+-

 a  | integer |   |  |

 b  | text| en-x-icu  | not null |

=> \d collate_test2

Table "test.collate_test2"

 Column |  Type   | Collation | Nullable | Default

+-+---+--+-

 a  | integer |   |  |

 b  | text| sv-x-icu  |  |

=> \d collate_test3

Table "test.collate_test3"

 Column |  Type   | Collation | Nullable | Default

+-+---+--+-

 a  | integer |   |  |
 b  | text| C |  |
---

My guess is that the following should all be the same, since the same
content is in the end cast to the same collation. However, the following
output contradicts my understanding.


SELECT a, b::testdomain FROM collate_test1 ORDER BY 2;

/*

 +---+-+

| a |  b  |

+---+-+

| 1 | abc |

| 4 | ABC |

| 2 | äbc |

| 3 | bbc |

+---+-+

 */

SELECT a, b::testdomain FROM collate_test2 ORDER BY 2;

/*

 +---+-+

| a |  b  |

+---+-+

| 1 | abc |

| 4 | ABC |

| 3 | bbc |

| 2 | äbc |

+---+-+

 */

SELECT a, b::testdomain FROM collate_test3 ORDER BY 2;

/*

 +---+-+

| a |  b  |

+---+-+

| 4 | ABC |

| 1 | abc |

| 3 | bbc |

| 2 | äbc |

+---+-+

 */


-- 
 I recommend David Deutsch's <>

  Jian


range of composite types!

2022-04-26 Thread Jian He
Range of composite types: I think this would be a great idea!!!
Question on stackoverflow

DB Fiddle

source code regress test

Ranges of composite types, code part:

> --
> -- Ranges of composites
> --
>
> create type two_ints as (a int, b int);
> create type two_ints_range as range (subtype = two_ints);
>
> -- with force_parallel_mode on, this exercises tqueue.c's range remapping
> select *, row_to_json(upper(t)) as u from
>   (values (two_ints_range(row(1,2), row(3,4))),
>           (two_ints_range(row(5,6), row(7,8)))) v(t);
>

-- composite type range.
> create type mytype as (t1 int, t2 date);
> -- create type my_interval as (t1 int, t2 interval);
> select (2,'2022-01-02')::mytype ;
> create type mytyperange as range(subtype = mytype);
>

I am thinking of constructing a composite type range that would be
equivalent to:

> select a, b::date from generate_series(1,8) a,
> generate_series('2022-01-01'::timestamp,
> '2022-01-31'::timestamp, interval '1 day') b;
>
> That means the following SQL queries should return false:

select mytyperange (
> (1,'2022-01-01')::mytype,
> (8, '2022-01-31')::mytype, '[]') @> (2, '2020-01-19')::mytype;
>


>  select
> (2, '2020-01-19')::mytype <@
> mytyperange(
> (1,'2022-01-01')::mytype,
> (8, '2022-01-31')::mytype, '[]') ;
>


> --does the range overlaps, that is, have any common element.
> select
> mytyperange ((2,'2020-12-30')::mytype,
> (2, '2020-12-31')::mytype)
> &&
> mytyperange(
> (1,'2022-01-01')::mytype,
> (8, '2022-01-31')::mytype) ;
>

From the DB Fiddle link, so far I have failed.
If this is possible then we may need a subtype_diff function and a
canonical function.
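
A hedged sketch of what such a subtype_diff could look like for this
range; the function name and the distance formula are illustrative only:

create function mytype_diff(a mytype, b mytype) returns float8
language sql immutable as
$$ select ((a.t1 - b.t1) * 31 + (a.t2 - b.t2))::float8 $$;

create type mytyperange2 as range (subtype = mytype,
                                   subtype_diff = mytype_diff);

A canonical function would only be needed for a discrete range, and is
trickier, since it must be created before the range type itself.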


Re: Regular Expression For Duplicate Words

2022-02-02 Thread Jian He
It's an interesting question, but I also don't know how to do it in
PostgreSQL.
I figured out alternative solutions, though:

GNU grep: grep -E '(hello)[[:blank:]]+\1' <<<'one hello hello world'
ripgrep:  rg '(hello)[[:blank:]]+\1' --pcre2 <<<'one hello hello world'
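
For what it's worth, a hedged sketch in PostgreSQL itself: its advanced
regular expressions also accept back references, so for a known target
word something like

select 'one hello hello world' ~ '(hello)[[:blank:]]+\1';

should return true.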

On Wed, Feb 2, 2022 at 8:53 PM David G. Johnston 
wrote:

> On Wed, Feb 2, 2022 at 1:00 AM Shaozhong SHI 
> wrote:
>
>> This link is interesting.
>>
>> regex - Regular Expression For Duplicate Words - Stack Overflow
>> 
>>
>> Is there any example in Postgres?
>>
>>
> Not that I'm immediately aware of, and I'm not going to search the
> internet for you.
>
> The regex capabilities in PostgreSQL are pretty full-featured so a
> solution should be possible.  You should try translating the SO post
> concepts into PostgreSQL yourself and ask specific questions if you get
> stuck.
>
> David J.
>
>


Re: Counting the number of repeated phrases in a column

2022-01-25 Thread Jian He
You may be interested in https://dba.stackexchange.com/q/166762/238839

On Wed, Jan 26, 2022 at 3:03 AM Ivan Panchenko 
wrote:

>
> On 26.01.2022 00:21, benj@laposte.net wrote:
> > Le 25/01/2022 à 18:10, Shaozhong SHI a écrit :
> >> There is a short of a function in the standard Postgres to do the
> >> following:
> >>
> >> It is easy to count the number of occurrence of words, but it is
> >> rather difficult to count the number of occurrence of phrases.
> >>
> >> For instance:
> >>
> >> A cell of value:  'Hello World' means 1 occurrence a phrase.
> >>
> >> A cell of value: 'Hello World World Hello' means no occurrence of any
> >> repeated phrase.
> >>
> >> But, A cell of value: 'Hello World World Hello Hello World' means 2
> >> occurrences of 'Hello World'.
> >>
> >> 'The City of London, London' also has no occurrences of any repeated
> >> phrase.
> >>
> >> Anyone has got such a function to check out the number of occurrence
> >> of any repeated phrases?
> >>
> >> Regards,
> >>
> >> David
> >
> > Don't know if it's exactly what you want, but you can replace all
> > occurrences of the phrase in the text by an empty string, compute the
> > diff between the initial text and the result, and then divide by the
> > length of your phrase.
> >
> > Example :
> > WITH x AS (SELECT 'toto like tata and toto like titi and toto like
> > tutu' texte, 'toto like' phrase)
> > SELECT (char_length(texte) - char_length(replace(texte, phrase, '')))
> > / char_length(phrase) AS nb_occurence
> > FROM x
> >
> This works if the user knows the phrase. As far as I understood, the
> phrase is not known, and user wants to count number of repeats of any
> phrases.
> Of course this can be done with recursive CTE. Split into words,
> generate all phrases (AFAIK requires recursion), then group and count.
>
> But probably in PL/Perl this could be done  more effectively.
>
>
>
>
>
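
A hedged sketch of the split-then-count approach described above, limited
to two-word phrases (the sample string is from the thread; the rest is
illustrative):

with words as (
  select w, i
  from regexp_split_to_table('Hello World World Hello Hello World', '\s+')
       with ordinality as t(w, i)
)
select a.w || ' ' || b.w as phrase, count(*) as occurrences
from words a
join words b on b.i = a.i + 1
group by 1
having count(*) > 1;

This reports 'Hello World' with 2 occurrences, matching the example in
the thread; longer phrases would need the recursive generation mentioned
above.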


Chain Hashing

2021-05-06 Thread Jian He
I've been following YouTube to study database hash joins:
https://www.youtube.com/watch?v=J0nbgXIarhc

HASH TABLE
Design Decision #1: Hash Function → How to map a large key space into a
smaller domain. → Trade-off between being fast vs. collision rate.
Design Decision #2: Hashing Scheme → How to handle key collisions after
hashing. → Trade-off between allocating a large hash table vs. additional
instructions to find/insert keys.

Now I have some idea about the hash function, though the math part I have
no idea about.
The hashing scheme is for handling collisions. Collision means that
different input keys can produce the same output hash bits.

The following part is about Chain Hashing:
The following part is about the *Chain Hashing. *

>   Maintain a linked list of buckets for each slot in the hash table.
> Resolve collisions by placing all elements with the same hash key into the
> same bucket.
> → To determine whether an element is present, hash to its bucket and scan
> for it.
>  → Insertions and deletions are generalizations of lookups.


I still don't get it. Stack Overflow doesn't seem to have good answers
yet, so I come here asking.


Re: database sorting algorithms.

2021-05-01 Thread Jian He
Thanks a lot.
I found this YouTube video (https://youtu.be/alJswNJ4P3U?t=1852),
in case you guys are interested.
It really clarifies the time complexity of MergeSort.
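
(A worked check of the arithmetic: with N = 8 elements, merge sort needs
log2(8) = 3 merge passes, and each pass touches all 8 elements, so about
8 * 3 = 24 element-level steps, which is N * log(N).)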

On Sat, May 1, 2021 at 3:19 PM Gavan Schneider 
wrote:

> On 1 May 2021, at 17:06, Jian He wrote:
>
> Been self study Database, from database I deep dived into sorting
> algorithms.
>
> Databases can do in-memory QuickSort. It also has an on-disk MergeSort.
>
> For MergeSort: I follow this tutorial https://youtu.be/6pV2IF0fgKY?t=1108
> (around 1 minutes only)
>
> Also check https://en.wikipedia.org/wiki/Merge_sort
>
> But I am still not fully understanding n*log(n). I understand how many
> passes it will take, that is log(n).
> Yes each pass will sort N elements.
> But I still don't get what the N stands for in n*log(n).
>
> So, answering the question…
> The ’n’ refers to the need to do something to each element at least once,
> so the sort time grows in simple proportion to the size of the list that
> needs to be sorted. Unfortunately that is not enough work to get the list
> sorted so extra steps are needed. The log(n) indicates how many extra steps
> are needed. So the overall performance is proportional to the number of
> elements (N) multiplied by the log of the number of elements, viz., N *
> log(N)
>
> Regards
> Gavan Schneider
> ——
> Gavan Schneider, Sodwalls, NSW, Australia
> Explanations exist; they have existed for all time; there is always a
> well-known solution to every human problem — neat, plausible, and wrong.
> — H. L. Mencken, 1920
>


database sorting algorithms.

2021-05-01 Thread Jian He
Been self study Database, from database I deep dived into sorting
algorithms.

Databases can do in-memory QuickSort. It also has an on-disk MergeSort.

For MergeSort: I follow this tutorial https://youtu.be/6pV2IF0fgKY?t=1108
(around 1 minutes only)

But I am still not fully understanding n*log(n). I understand how many
passes it will take, that is log(n).
Yes each pass will sort N elements.
But I still don't get what the N stands for in n*log(n).
Why does each pass take n time to sort?