Re: support for CREATE MODULE

2022-03-17 Thread Swaha Miller
On Thu, Mar 17, 2022 at 4:16 PM Nathan Bossart 
wrote:

> It seems unlikely that this will be committed for v15.  Swaha, should the
> commitfest entry be adjusted to v16 and moved to the next commitfest?
>
>
Yes please, thank you Nathan


Re: support for CREATE MODULE

2022-02-15 Thread Swaha Miller
On Mon, Feb 14, 2022 at 4:58 PM Bruce Momjian  wrote:

> On Mon, Feb 14, 2022 at 07:42:21PM -0500, Bruce Momjian wrote:
> > On Mon, Feb 14, 2022 at 03:23:07PM -0800, Swaha Miller wrote:
> > > A prominent use case for grouping functions into modules would
> > > be access control on the group as a whole, with one command
> > > for an entire module instead of many individual functions. One reason
> > > for such a grouping is to set ACLs. Users migrating their database from
> > > commercial databases to PostgreSQL wanting to set ACLs on
> > > thousands or hundreds of thousands of functions would benefit from
> > > a grouping concept like modules.
> > >
> > > If users use schemas to group functions, then they have to break up
> > > functions that may have all been in one namespace into a bunch of new
> > > schemas. So we'd like to have a different grouping mechanism that can
> > > group functions within a schema. So we're looking at a new construct
> like
> > > modules that can serve that purpose without introducing sub-schemas.
> >
> > I was working on a talk about microservices today and decided to create
> > two schemas --- a public one that has USAGE permission for other services
> > with views and SECURITY DEFINER functions that referenced a private
> > schema that can only be accessed by the owning service.  Is that a usage
> > pattern that requires modules since it already works with schemas and
> > just uses schema permissions to designate public/private schema
> > interfaces.
>

I think the reason for modules would be to make it a better experience for
PostgreSQL users to do what they need. Yours is a great example
for a talk. A practical scenario would require many schemas following this
pattern of X and X_private. In a more expressive programming language,
the user would express modular design with classes X1, X2...Xn that
have both public interface calls and private functions. The modularity
would
reflect what they are trying to do and not be restricted by how they are
allowed to do so. In this context, I am going to link to a scenario Jim had
described before

https://www.postgresql.org/message-id/CAB_5SRebSCjO12%3DnLsaLCBw2vnkiNH7jcNchirPc0yQ2KmiknQ%40mail.gmail.com

In your example, the only way a developer would know that X and
X_private are related is that they are named consistently. With another
grouping construct like Modules, they would both have been in the same
schema, but now they are not. From a long term maintainability of the code
perspective, that becomes an issue because people make simple mistakes.
What if in the future a new developer added a 3rd grouping, x_utils. Is it
related to the other groupings? This has a lower case x instead of the X.
How is one to know? What if in the future a new developer added a 3rd
grouping X_utils but without having paid attention to the already existing
X
and X_private? Now naming becomes the means of expressing
classification and modular design. It is a relatively weak construct
compared to a language construct.

Yes, anything a user may want to do with modules is likely possible to
do already with schemas. But just because it is possible doesn't mean
it is not tedious and awkward because of the mechanisms available to do
them now. And I would advocate for more expressive constructs to enable
users of PostgreSQL to focus and reason about more of the "what" than
the "how" of the systems they are trying to build or migrate.

Swaha


Re: support for CREATE MODULE

2022-02-14 Thread Swaha Miller
On Thu, Feb 10, 2022 at 1:06 PM Bruce Momjian  wrote:

> On Thu, Feb 10, 2022 at 08:53:15AM -0800, Swaha Miller wrote:
> > On Fri, Feb 4, 2022 at 3:51 PM Tom Lane  wrote:
> >
> > Hm. If the functional requirement is "group objects without needing
> > any out-in-the-filesystem infrastructure", then I could see defining
> > a module as being exactly like an extension except there's no such
> > infrastructure --- and hence no concept of versions, plus pg_dump
> > needs to act differently.  That's probably enough semantic difference
> > to justify using a separate word, even if we can share a lot of
> > code infrastructure.
> >
> > Then as a first cut for modules, could we add CREATE MODULE
> > syntax which adds an entry to pg_extension like CREATE EXTENSION
> > does? And also add a new column to pg_extension to distinguish
> > modules from extensions.
> >
> > The three-part path name resolution for functions would remain the
> > same, nothing would need to change there because of modules.
> >
> > Would that be an acceptable direction to go?
>
> Well, that would allow us to have CREATE EXTENSION syntax, but what
> would it do that CREATE SCHEMA does not?
>

A prominent use case for grouping functions into modules would
be access control on the group as a whole, with one command
for an entire module instead of many individual functions. One reason
for such a grouping is to set ACLs. Users migrating their database from
commercial databases to PostgreSQL wanting to set ACLs on
thousands or hundreds of thousands of functions would benefit from
a grouping concept like modules.

If users use schemas to group functions, then they have to break up
functions that may have all been in one namespace into a bunch of new
schemas. So we'd like to have a different grouping mechanism that can
group functions within a schema. So we're looking at a new construct like
modules that can serve that purpose without introducing sub-schemas.

Also a module would be a grouping of primarily functions. Schemas can
have many other object types. Setting ACLs on groups of functions in a
schema, which is what users could do with a module, would require
breaking up a schema into other schemas based on ACLs. But schemas
allow setting ACLs on other objects in it, not only functions. If we
create groupings based on functions, what happens to the other types
of objects, what schema do they get grouped into? If modules are
supposed to solve setting ACLs on groups of functions, does using
schemas conflate setting ACLs for those other types of objects with
setting ACLs for functions?

Modules could also, in the future, serve as a way of allowing for private
functions within the grouping. Doing this in schemas would require
those kinds of changes in the schema construct.

Swaha


Re: support for CREATE MODULE

2022-02-10 Thread Swaha Miller
On Fri, Feb 4, 2022 at 3:51 PM Tom Lane  wrote:

> Hm. If the functional requirement is "group objects without needing
> any out-in-the-filesystem infrastructure", then I could see defining
> a module as being exactly like an extension except there's no such
> infrastructure --- and hence no concept of versions, plus pg_dump
> needs to act differently.  That's probably enough semantic difference
> to justify using a separate word, even if we can share a lot of
> code infrastructure.
>

Then as a first cut for modules, could we add CREATE MODULE
syntax which adds an entry to pg_extension like CREATE EXTENSION
does? And also add a new column to pg_extension to distinguish
modules from extensions.

The three-part path name resolution for functions would remain the
same, nothing would need to change there because of modules.

Would that be an acceptable direction to go?

Swaha


Re: support for CREATE MODULE

2022-02-04 Thread Swaha Miller
On Fri, Feb 4, 2022 at 10:46 AM Bruce Momjian  wrote:

> On Wed, Feb  2, 2022 at 06:28:30PM -0800, Swaha Miller wrote:
> > Hi,
> >
> > I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]
> >
> > My proposal implements modules as schema objects to be stored in a new
> > system catalog pg_module with new syntax for CREATE [OR REPLACE] MODULE,
>
> You might want to consider the steps that are most successful at getting
> Postgres patches accepted:
>
> https://wiki.postgresql.org/wiki/Todo
> Desirability -> Design -> Implement -> Test -> Review -> Commit
>
> In this case, you have jumped right to Implement.  Asking about
> Desirability first can avoid a lot of effort.
>

Thanks Bruce, that's really helpful. I was building on the discussion in
Jim's
original thread, which is why I went ahead with another POC implementation,
but I do consider this implementation as part of the desirability/design
aspect
and am hoping to get input from the community to shape this proposal/patch.

Swaha Miller


Re: support for CREATE MODULE

2022-02-04 Thread Swaha Miller
On Thu, Feb 3, 2022 at 5:42 PM Alvaro Herrera 
wrote:

> On 2022-Feb-03, Pavel Stehule wrote:
>
> > The biggest problem is coexistence of Postgres's SEARCH_PATH  object
> > identification, and local and public scopes used in MODULEs or in
> Oracle's
> > packages.
> >
> > I can imagine MODULES as third level of database unit object grouping
> with
> > following functionality
> >
> > 1. It should support all database objects like schemas
>
> I proposed a way for modules to coexist with schemas that got no reply,
> https://postgr.es/m/202106021908.ddmebx7qfdld@alvherre.pgsql
>

Yes, I arrived a little after that thread, and used that as my starting
point.

The POC patch Jim Mlodgenski had on that thread was similar to your
proposed
way - modules were rows in pg_namespace, with the addition of a new column
in pg_namespace for the nspkind (module or schema.)

Jim had asked about two options  -- the above approach and an alternative
one
of having a pg_module system catalog and got some input
https://www.postgresql.org/message-id/2897116.1622642280%40sss.pgh.pa.us

Picking up from there, I am exploring the alternative approach. And what
qualified
names would look like and how they get interpreted unambiguously, when
schemas and modules co-exist. (Also, being new to PostgreSQL, it has been a
great learning exercise for me on some of the internals of PostgreSQL.)

With modules being their own type of object stored in a pg_module system
catalog, deconstructing a qualified path could always give precedence to
schema over module. So when there is function f() in schema s and another
function f() in module s in schema public, then s.f() would invoke the
former.

What are some other directions we might want to take this patch?

I still think that that idea is valuable; it would let us create
> "private" routines, for example, which are good for encapsulation.
> But the way it interacts with schemas means we don't end up with a total
> mess in the namespace resolution rules.  I argued that modules would
> only have functions, and maybe a few other useful object types, but not
> *all* object types, because we don't need all object types to become
> private.  For example, I don't think I would like to have data types or
> casts to be private, so they can only be in a schema and they cannot be
> in a module.
>
> Of course, that idea of modules would also ease porting large DB-based
> applications from other database systems.
>

Indeed. Looking at commercial databases Oracle and Microsoft SQLServer,
they both have implemented module-like structures.

Swaha Miller


Re: support for CREATE MODULE

2022-02-03 Thread Swaha Miller
Thank you for the feedback Pavel and Julien. I'll try to explain some of
the issues and points you raise to the best of my understanding.

The reason for modules is that it would serve as an organizational unit
that can allow setting permissions on those units. So, for example, all
functions in a module can be subject to setting access permissions on for
some user(s) or group(s). I didn't explain it well in the sgml docs, but
along with module syntax, I'm proposing introducing privileges to
grant/revoke on modules and routines in modules. And why modules for this
purpose? Because its in the SQL spec so seems like a way to do it.

I'm adding comments inline for the list of functionality you mentioned. I
look forward to discussing this more and figuring out how to make a useful
contribution to the community.

On Wed, Feb 2, 2022 at 11:22 PM Pavel Stehule 
wrote:

>
>
> čt 3. 2. 2022 v 5:46 odesílatel Julien Rouhaud 
> napsal:
>
>> Hi,
>>
>> On Thu, Feb 03, 2022 at 05:25:27AM +0100, Pavel Stehule wrote:
>> >
>> > čt 3. 2. 2022 v 3:28 odesílatel Swaha Miller 
>> > napsal:
>> >
>> > > Hi,
>> > >
>> > > I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]
>> > >
>> > > My proposal implements modules as schema objects to be stored in a new
>> > > system catalog pg_module with new syntax for CREATE [OR REPLACE]
>> MODULE,
>> > > ALTER MODULE, DROP MODULE and for GRANT and REVOKE for privileges on
>> > > modules and module routines. I am attempting to follow the SQL spec.
>> > > However, for right now, I'm proposing to support only routines as
>> module
>> > > contents, with local temporary tables and path specifications as
>> defined
>> > > in the SQL spec, to be supported in a future submission. We could also
>> > > include support for variables depending on its status. [2]
>> >
>> > I dislike this feature. The modules are partially redundant to schemas
>> and
>> > to extensions in Postgres, and I am sure, so there is no reason to
>> > introduce this.
>> >
>> > What is the benefit against schemas and extensions?
>>
>> I agree with Pavel.  It seems that it's mainly adding another namespacing
>> layer
>> between schemas and objects, and it's likely going to create a mess.
>> That's also going to be problematic if you want to add support for module
>> variables, as you won't be able to use e.g.
>> dbname.schemaname.modulename.variablename.fieldname.
>>
>>
I haven't yet added support for variables so will need to look into the
problems with this if we're going to do that.


> Also, my understanding was that the major interest of modules (at least
>> for the
>> routines part) was the ability to make some of them private to the
>> module, but
>> it doesn't look like it's doing that, so I also don't see any real benefit
>> compared to schemas and extensions.
>>
>
>
Yes, that is indeed the goal/use-case with setting permissions with grant
and revoke. Right now, I have proposed create and reference as the kinds of
access that can be controlled on modules, and reference as the kind of
access that can be controlled on routines inside modules.


> The biggest problem is coexistence of Postgres's SEARCH_PATH  object
> identification, and local and public scopes used in MODULEs or in Oracle's
> packages.
>
>
I am not extremely familiar with Oracle's packages, but do know of them.
I'm wondering if local and public scopes for MODULE is in the SQL spec? (I
will check for that...) My thinking was to implement functionality that
conforms to the SQL spec, not try to match Oracle's package which differs
from the spec in some ways.


> I can imagine MODULES as third level of database unit object grouping with
> following functionality
>
> 1. It should support all database objects like schemas
>

Do you mean that schemas should be groupable under modules? My thinking was
to follow what the SQL spec says about what objects should be in modules,
and I started with routines as one of the objects that there are use cases
for. Such a controlling access permissions on routines at some granularity
that is not an entire schema and not individual functions/procedures.


> 2. all public objects should be accessed directly when outer schema is in
> SEARCH_PATH
>

Yes, an object inside a module is in a schema and can be accessed with
schemaname.func() as well as modulename.func() as well as
schemaname.modulename.func(). I think you are saying it should be
accessible with func() without a qualifying schemana