Re: Question about where to deploy the business logics for data processing

2023-06-12 Thread Merlin Moncure
On Thu, Jun 8, 2023 at 10:22 PM Nim Li  wrote:

> I wonder if anyone in the community has gone through changes like this?  I
> mean ... moving the business logic from PL/SQL within the database to the
> code in the NestJS framework, and rely on only TypeORM to manage the
> update of the database without any SQL code?  Any thoughts about such a
> change?
>
>

Heads up, this is something of a religious database debate in the industry,
and you are asking a bunch of database guys what they think about this, and
their biases will show in their answers.

Having said that, your developers are utterly, completely wrong. This is
classic "my technology good, your technology bad", and most of the reasons
given to migrate the stack boil down to "I don't know SQL and will do
absolutely anything to avoid learning it", to the point of rewriting the
entire freaking project into (wait for it) javascript, which might very well
be the worst possible language for data management.

The arguments supplied are tautological: "SQL is bad because you have to
write SQL, which is bad", except for the laughably incorrect "sql can't be
checked into git".  Guess what, it can (try git add my_func.sql), and there
are many techniques to deal with this.
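As an illustration, a minimal sketch (file and repository names are hypothetical, and git must be installed) of tracking a SQL function in git like any other source file:

```shell
# hypothetical layout: one function per .sql file, tracked like any source file
mkdir -p sql_demo/db/functions
cat > sql_demo/db/functions/my_func.sql <<'EOF'
CREATE OR REPLACE FUNCTION my_func() RETURNS integer
LANGUAGE sql AS $$ SELECT 1 $$;
EOF
git init -q sql_demo
git -C sql_demo add db/functions/my_func.sql
git -C sql_demo -c user.name=demo -c user.email=demo@example.com commit -qm "add my_func"
git -C sql_demo ls-files
# deploying is then just replaying the file against the database, e.g.:
#   psql -d mydb -f db/functions/my_func.sql
```

Diffing, blaming, and code review then work on the function source exactly as they do for any other language.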

Now, database deployment is a very complex topic which doesn't go away when
using an ORM.  In fact, it often gets worse.  There are tools which can
generate change scripts from database A to A'; are there tools to do that
for NestJS object models? Is there automatic dependency tracking for them?
Next thing you know, they will be moving all your primary keys to
GUIDs ("scaling problem, solved!") and whining about database performance
when you actually get some users.

WHY is writing SQL so bad?  Is it slower? Faster? Better supported?
plpgsql is very well supported and draws from a larger talent pool than
"NestJS".  Suppose you want to mix some python or enterprise java into your
application stack. What then?

ORMs are famously brittle and will often break if any data interaction to
the database does not itself go through the ORM, meaning you will be
writing and deploying programs to do simple tasks.  They are slow,
discourage strong data modelling, interact with the database inefficiently,
and do not manage concurrent access to data well.

merlin


Re: Dynamic binding issue

2023-06-12 Thread Adrian Klaver

On 6/12/23 15:13, Lorusso Domenico wrote:

Hello guys,
I'm having a problem with dynamic SQL.
I am trying to write a generic function that is able to read and update
a table based on some data coming from a previous record.

Here is the example:

_sqlStr = format('select *
    from %1$s.%2$s
    where (' || array_to_string(_activeRec.pk_columns_list, ',') || ') in
    (select row($1[''' || array_to_string(_activeRec.pk_columns_list, '''],$1[''') || ''']))'
    , _activeRec.name_of_schema, _activeRec.main_table);

execute _sqlStr using oldRec into _rec;

My problem is that oldRec is of type RECORD, so the substitution performed by
EXECUTE fails, because it can't resolve the fields when the variable is a plain
RECORD and not a specific composite type.


Why not use a row type?:

https://www.postgresql.org/docs/current/plpgsql-declarations.html#PLPGSQL-DECLARATION-ROWTYPES
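For illustration, a minimal sketch of that approach (the table and column names are hypothetical): a variable declared with %ROWTYPE has a known composite type, so field access on the EXECUTE parameter resolves cleanly:

```sql
-- hypothetical table; _old declared t_demo%ROWTYPE has known fields,
-- unlike a variable of the anonymous RECORD type
CREATE TABLE t_demo (id int PRIMARY KEY, val text);

DO $$
DECLARE
  _old t_demo%ROWTYPE;
  _rec t_demo%ROWTYPE;
BEGIN
  INSERT INTO t_demo VALUES (1, 'a') RETURNING * INTO _old;
  -- ($1).id works because $1 is bound to a composite of a known type
  EXECUTE format('SELECT * FROM %I WHERE id = ($1).id', 't_demo')
    USING _old
    INTO _rec;
  RAISE NOTICE 'val = %', _rec.val;
END
$$;
```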



I suppose this is a recurrent question, but I can't find a solution...

--
Domenico L.


--
Adrian Klaver
adrian.kla...@aklaver.com





Dynamic binding issue

2023-06-12 Thread Lorusso Domenico
Hello guys,
I'm having a problem with dynamic SQL.
I am trying to write a generic function that is able to read and update a
table based on some data coming from a previous record.

Here is the example:

_sqlStr = format('select *
    from %1$s.%2$s
    where (' || array_to_string(_activeRec.pk_columns_list, ',') || ') in
    (select row($1[''' || array_to_string(_activeRec.pk_columns_list, '''],$1[''') || ''']))'
    , _activeRec.name_of_schema, _activeRec.main_table);

execute _sqlStr using oldRec into _rec;

My problem is that oldRec is of type RECORD, so the substitution performed by
EXECUTE fails, because it can't resolve the fields when the variable is a plain
RECORD and not a specific composite type.

I suppose this is a recurrent question, but I can't find a solution...

-- 
Domenico L.


Re: Question about where to deploy the business logics for data processing

2023-06-12 Thread Lorusso Domenico
Hi Nim,
well, this is a very particular scenario.
In a few words: these projects will never go live for production purposes,
but only to verify some hypotheses.

In this case it could be acceptable to generate the schema on the fly, but it
isn't easy to automate every aspect related to optimization (partitioning,
indexes and so on).

Coming to your last question, where to put the data-manipulation logic:
again, in this case minimizing LAN traffic could be your main goal, and this
means putting the logic inside the DB.


Il giorno ven 9 giu 2023 alle ore 18:34 Lorusso Domenico <
domenico@gmail.com> ha scritto:

> Uhm, we need to start from 2 concepts:
>
>1. competence
>2. network lag
>
> Competence: usually programmers aren't skilled enough in the
> architectures and the actual needs of each layer.
> This is a problem, because programmers often try to do something with what
> they already know (e.g. performing joins in Java).
>
> A correct design requires identifying at least the data logic, the process
> logic, the business logic and the presentation logic.
>
> One of the most important goals of the data logic is to ensure the
> correctness of data from many points of view (all of them is impossible).
>
> That involves:
>
>- audit information
>- bitemporal management
>- strict definition and verification of data (foreign keys, checks,
>management of compatibility)
>- consistently replicating data for different usages
>- isolating access to actual needs
>- design
>
> So an application that requires changing the data model does not seem to
> be well designed...
>
> Network lag
> The first problem is latency: I must minimize the passage of data over the
> network.
> This means, for example, creating a service that allows the caller to
> choose only the information it needs.
> But it also means getting all the information needed in a single call,
> designing asynchronous services, and using cached data physically near the
> frontend or the middle layer.
>
> Based on these 2 concepts I suggest:
>
>- develop the data logic near or inside the database;
>- design powerful and addictive APIs;
>- don't allow model changes by the business logic;
>- organize/copy data in jsonb with a powerful JSON schema to provide
>coherence through every layer;
>- ensure a system to grant ACID features to your process.
>
>
>
> Il giorno ven 9 giu 2023 alle ore 05:22 Nim Li  ha
> scritto:
>
>> Hello.
>>
>> We have a PostgreSQL database with many tables, as well as foreign tables,
>> dblink, triggers, functions, indexes, etc, for managing the business logic
>> of the data within the database.  We also have a custom table for the
>> purpose of tracking slowly changing dimensions (type 2).
>>
>> Currently we are looking into using TypeORM (from the NestJS framework) to
>> connect to the database for creating a BE that provides a web service.  Some
>> reasons for using TypeORM are that it can update the database schema without
>> any SQL code, works very well with Git, etc.  And from what I am reading,
>> Git seems to work better with TypeORM, rather than handling individual
>> batch files with SQL code (I still need to find out more about this).  Yet
>> I do not think the ORM concept deals with database-specific functions, such
>> as dblink and/or trigger functions, etc, which handle the business logic
>> or any ETL automation within the database itself (I should read more about
>> this as well.)
>>
>> Anyway, in our team discussion, I was told that in modern programming
>> practice, the world is moving away from deploying programming logic within
>> the database (e.g., by using PL/SQL).  Instead, the proper way should be to
>> deploy all the programming logic to the framework which is used to connect
>> to the database, such as NestJS in our case.  So, all we need in a database
>> should be only the schema (managed by the ORM), and we should move all the
>> existing business logic (currently managed by things like the database
>> triggers, functions, dblink, etc.) to the TypeScript code within the
>> NestJS framework.
>>
>> I wonder if anyone in the community has gone through changes like this?
>> I mean ... moving the business logic from PL/SQL within the database to
>> the code in the NestJS framework, and rely on only TypeORM to manage the
>> update of the database without any SQL code?  Any thoughts about such a
>> change?
>>
>> Thank you!!
>>
>>
>
> --
> Domenico L.
>
> per stupire mezz'ora basta un libro di storia,
> io cercai di imparare la Treccani a memoria... [F.d.A.]
>


-- 
Domenico L.

per stupire mezz'ora basta un libro di storia,
io cercai di imparare la Treccani a memoria... [F.d.A.]


How to trace the postgres?

2023-06-12 Thread Wen Yi
Hi team,
Now I am starting to analyse the processes of postgres, using gdb.
As you can see, at the beginning of my work, I use ps to find the
postgres processes.


[postgres@fedora postgresql]$ ps -ef | grep postgres
postgres 67468  1599 0 08:11 ?     00:00:01 /home/postgres/pgsql/lib/bin/postgres -D /home/postgres/pgsql/data
postgres 67471 67468 0 08:11 ?     00:00:00 postgres: checkpointer
postgres 67472 67468 0 08:11 ?     00:00:00 postgres: background writer
postgres 67474 67468 0 08:11 ?     00:00:00 postgres: walwriter
postgres 67475 67468 0 08:11 ?     00:00:00 postgres: autovacuum launcher
postgres 67476 67468 0 08:11 ?     00:00:00 postgres: logical replication launcher
root     90486 90459 0 14:32 pts/0 00:00:00 su postgres
postgres 90494 90486 0 14:32 pts/0 00:00:00 bash
postgres 90526 90494 0 14:32 pts/0 00:00:00 psql
postgres 90527 67468 0 14:32 ?     00:00:00 postgres: postgres postgres [local] idle
root     90680 90652 0 14:33 pts/2 00:00:00 su postgres
postgres 90683 90680 0 14:33 pts/2 00:00:00 bash
postgres 90766 90683 0 14:33 pts/2 00:00:00 ps -ef
postgres 90767 90683 0 14:33 pts/2 00:00:00 grep --color=auto postgres


After that, I use gdb to debug one of the postgres processes:



[postgres@fedora postgresql]$ gdb -p 67468



WaitEventSetWait (set=0x25221d8, timeout=6, occurred_events=0x7ffe2ea9df70, 
nevents=64, wait_event_info=0) at latch.c:1478
1478 in latch.c
(gdb) next
1480 in latch.c
(gdb) next
1481 in latch.c
(gdb) next
1484 in latch.c
(gdb) next
1487 in latch.c
(gdb) next
1490 in latch.c
(gdb) next
1492 in latch.c
(gdb) next
1493 in latch.c
(gdb) next
1494 in latch.c
(gdb) next
1495 in latch.c
(gdb) next



While gdb is attached, I use psql to send queries to postgres.


One thing that confuses me is that I really don't know how to trace a
postgres process. I tried to trace the background writer and the walwriter,
but I still don't understand their working logic.


I think they seem to work in a loop (I read the related code, that's true), but
what I want to verify is to identify all the functions and objects involved in
the entire process of PGSQL (from syntax parsing to rewriting, from rewriting
to optimization, and from optimization to execution).


Can someone give me some advice?
Thanks in advance!
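One observation that may help: pid 67468 in the listing above is the postmaster, which only waits for events; the parse/rewrite/plan/execute path runs in the session backend (90527, the "postgres: postgres postgres [local] idle" process). A rough session sketch, assuming a debug build of PostgreSQL (the function names below come from the backend source, src/backend/tcop/postgres.c):

```
# attach to the backend serving the psql session, not the postmaster
[postgres@fedora postgresql]$ gdb -p 90527
(gdb) break exec_simple_query      # entry point for a simple query
(gdb) continue
# now run a query in the attached psql session, e.g.  SELECT 1;
(gdb) backtrace                    # see how the backend reached this point
(gdb) next                         # step through the pipeline:
                                   # pg_parse_query, pg_analyze_and_rewrite_*,
                                   # pg_plan_queries, PortalRun
```

Breaking on each of those functions in turn is one way to watch a statement move from parsing through rewriting and planning to execution.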


Yours,
Wen Yi