Re: nested tags in glossary entries in html docs

2024-04-25 Thread Jürgen Purtz



On 25.04.24 13:24, Jürgen Purtz wrote:


On 25.04.24 11:24, Alvaro Herrera wrote:

On 2024-Apr-12, Erik Wienhold wrote:


There's this bug[1] in the DocBook XSLT stylesheets.  Looks like the
fix[2] landed in 1.79.2 (latest version on Arch,

Maybe one of these days we should get going with the migration to
Docbook 5.x that Jürgen Purtz proposed.

https://postgr.es/m/21ed3fd9-9020-4b53-b04f-a08a831b6...@purtz.de

In the meantime, if anyone wants to suggest a XSLT patch to carry in our
local definition, we could try that.


Great recommendation. I may have time in the second half of June to
pursue the migration further. There is  a performance problem, which
possibly results from our XSLT script that optimizes the
transformation-speed and works well in 4.x.

Jürgen Purtz

@Anton: AFAIK the script was developed by your (former?) college
Alexander Lakhin.


... or do we have a problem with the fact that our xml files are not
well-formed? Some of them contain more than one root-element:

xmllint --noout *.sgml ref/*.sgml 2> >(grep Extra)

J. Purtz


Re: nested tags in glossary entries in html docs

2024-04-25 Thread Jürgen Purtz


On 25.04.24 11:24, Alvaro Herrera wrote:

On 2024-Apr-12, Erik Wienhold wrote:


There's this bug[1] in the DocBook XSLT stylesheets.  Looks like the
fix[2] landed in 1.79.2 (latest version on Arch,

Maybe one of these days we should get going with the migration to
Docbook 5.x that Jürgen Purtz proposed.

https://postgr.es/m/21ed3fd9-9020-4b53-b04f-a08a831b6...@purtz.de

In the meantime, if anyone wants to suggest a XSLT patch to carry in our
local definition, we could try that.


Great recommendation. I may have time in the second half of June to
pursue the migration further. There is  a performance problem, which
possibly results from our XSLT script that optimizes the
transformation-speed and works well in 4.x.

Jürgen Purtz

@Anton: AFAIK the script was developed by your (former?) college
Alexander Lakhin.


Re: DocBook 5.2

2022-09-27 Thread Jürgen Purtz

Yeah, I think we'd have to convert all the supported versions to
make this palatable.  If the conversion is sufficiently automated,
that might not be a big lift.  (If it's*not*  automated, I think
the change would never get off the ground even for HEAD, because
the docs are too much of a moving target.)

+1


The process is totally automated. There are general steps which work for 
every DocBook 4.x book. And there are Postgres-specific steps 
(doRealModifictions.sh) which looks for individual patterns per file. 
This is the critical part. Currently it's tested only with HEAD. My 
expectation is, that it should work also for translations to other 
languages because the patterns contain only elements and attributes, no 
text. But if they run against older versions it's likely that we need 
some changes or additional patterns. I'm willing to work on this if a) 
there is a consensus in the community that the work should go on and b) 
someone helps me to resolve the reported unacceptable runtime problem 
during PDF generation.


--

J. Purtz


Re: DocBook 5.2

2022-09-05 Thread Jürgen Purtz

On 05.09.22 14:15, Guillaume Lelarge wrote:
Le lun. 5 sept. 2022 à 13:14, Alvaro Herrera  
a écrit :


On 2022-Sep-05, Jürgen Purtz wrote:

> - 
> + 
>    User-Defined Types

OK, these seem quite significant changes that are likely to cause
great
pain.  So I repeat my question, what are the benefits of making this
change?  They better be very very substantial.


I totally agree with Alvaro.

They will also cause massive pain for translators. There are already 
some changes that were pretty bad for me. For example, when all the 
tables in func.sgml were modified. In v15, I also remember massive 
changes on protocol.sgml. I won't complain if there is a significant 
benefit for readers, which is why I didn't complain for func.sgml even 
if it meant I had to translate it all over again. But if there's a 
massive change over the whole manual for a strictly limited benefit, I 
guess there won't be enough motivation for me to translate it all over 
again.



--
Guillaume.


The goal of the migration is an approximation to today's technology, 
especially programming interfaces and standards, to be able to use and 
interact with nowadays tools. Of course, this leads to internal 
technical changes. It is not intended to change anything at the readers 
surface. In that respect, it is comparable with the sgml to xml conversion.


 * The introduction of RELAX NG instead of DTDs leads to a much richer
   controlling of the sgml files.
 * The introduction of namespaces instead of a DOCTYPE definition
   offers the possibility to integrate tags of other namespaces into
   our documentation, eg.: MathML, XInclude, XLink, ... .
 * The changes during the migration consist mainly in a renaming of
   tag-names. The most important for us is 'ulink'.

 * After the migration the validation is much stricter than before.
   Because we have used tags in a more or less 'individual' style,
   especially when describing commands, there are a lot of violations
   against the RELAX NG schema. Modifications caused by such problems
   are those, which will create the most pain - for back-patching as
   well as for translators.
 * Possibly the pain for translators decreases significantly by using
   the same migration scripts on their already translated sgml files.
 * I don't understand where the pain for back-patching is when the
   attribute 'id' changes to 'xml:id'. It is very unlikely that the id
   of a section or another tag will change, or something else in such
   lines. In nearly all cases such lines will keep as they are,
   back-patching will not be necessary at such places.


What is the alternative to a migration? DocBook 4.5 forever?


--
J. Purtz


Re: DocBook 5.2

2022-09-05 Thread Jürgen Purtz

On 05.09.22 11:59, Daniel Gustafsson wrote:

Will the markup be similar enough to not carry a significant risk of
introducing pain for backpatching doc patches?


There are many changes. Most of them are systematically and others are 
individual, which is more painful. To give you an impression what 
typically changes, here is the diff of an arbitrary file. The 
HTML-output looks quite good - as far as I have seen.



diff --git a/doc/src/sgml/xtypes.sgml b/doc/src/sgml/xtypes.sgml
index e67e5bdf4c..6b6e6eb059 100644
--- a/doc/src/sgml/xtypes.sgml
+++ b/doc/src/sgml/xtypes.sgml
@@ -1,6 +1,6 @@
 

- 
+ 
   User-Defined Types

   
@@ -72,7 +72,7 @@ typedef struct Complex {
   write a complete and robust parser for that representation as your
   input function.  For instance:

-
+
 

   The output function can simply be:

-
+
 
  

@@ -132,7 +132,7 @@ complex_out(PG_FUNCTION_ARGS)
   complex, we will piggy-back on the binary I/O converters
   for type float8:

-
+
 
  

@@ -237,7 +237,7 @@ CREATE TYPE complex (
   If the internal representation of the data type is variable-length, the
   internal representation must follow the standard layout for 
variable-length
   data: the first four bytes must be a char[4] field 
which is
-  never accessed directly (customarily named 
vl_len_). You
+  never accessed directly (customarily named remap="structfield">vl_len_). You
   must use the SET_VARSIZE() macro to store the 
total

   size of the datum (including the length field itself) in this field
   and VARSIZE() to retrieve it. (These macros exist
@@ -249,7 +249,7 @@ CREATE TYPE complex (
    command.
  

- 
+ 
   TOAST Considerations
    
 TOAST
@@ -258,8 +258,7 @@ CREATE TYPE complex (

  
   If the values of your data type vary in size (in internal form), it's
-  usually desirable to make the data type TOAST-able 
(see -  linkend="storage-toast"/>). You should do this even if the values are 
always
+  usually desirable to make the data type TOAST-able 
(see ). You should do this even if the 
values are always

   too small to be compressed or stored externally, because
   TOAST can save space on small data too, by 
reducing header

   overhead.
@@ -290,7 +289,7 @@ CREATE TYPE complex (

  
   
-   Older code frequently declares vl_len_ as an
+   Older code frequently declares remap="structfield">vl_len_ as an
    int32 field instead of char[4].  This is 
OK as long as
    the struct definition has other fields that have at least 
int32

    alignment.  But it is dangerous to use such a struct definition when






Re: DocBook 5.2

2022-09-05 Thread Jürgen Purtz

On 04.09.22 17:39, Alvaro Herrera wrote:

What changes?
I doubt we'll want to adopt a new version immediately after release,
since we want to stay compatible with older systems.


The migration isn't a matter of days. It's a huge step because nearly 
all files are touched and we have to act carefully to deliver (nearly) 
identical HTML, PDF, ... files as before. As a preview of the ongoing 
the actual README.md file is attached.


Jürgen Purtz

Migration of PG's documentation from DocBook 4.5 to DocBook 5.2
===

The migration from DocBook 4.x to 5.x is a huge step which changes most
of PG's sgml-files. DocBook supports the migration with some scripts, but
PG's documentation doesn't meet all prerequisites. Therefore we use - in
addition to DocBook's scripts - some bash and sed commands to overcome the
problems. In addition, we need a lot of small detailed changes to many
sgml-files. This is more or less individual manual work. To be able
to perform such changes at any point in time, the changes are done with
Perl scripts per file.

Major DocBook changes
-

- Discontinuation of a DOCTYPE declaration. Instead, there is an XML conforming
namespace which uniquely identifies DocBook tags.
- Discontinuation of DTDs (and XSD schema). Instead, the validation is done against
a RELAX NG schema.
- Some tag names change, especially to adopt the XML conventions and standards.
The content model of some tags is narrowed down and defined more precise.


Migration steps
---

The migration is steered by conv.sh. The script uses 3 directories: All scripts and
other necessary information is located in **$ToolDir**, the existing sgml-files
are located in **$FromSgmlDir**, the migrated ones are in **$ToSgmlDir**. 
1. Preparation: The git tree of the complete PG source is copied to a different place.
   So we can use git after any intermediate step to check the changes. 
2. Migration: Some standard modifications to every single file to make them XML conform.
   Few individual changes per file. Perform the standard DocBook migration.
3. Changes: Perform some standard changes on few files. Perform many individual changes
   on many files.
4. Validation: Perform validation against the RELAX NG schema.


Introduction of a new tool
--

In the past, we used the tool **xmllint** to validate the sgml files against the DocBook
DTD. This worked well. Also, its validation against a RELAX NG schema works well as far
as no schema-validation occurs. But if the RELAX NG schema is violated by an sgml file,
the resulting error messages are more confusing than helpful.

Therefore, we should consider to introduce another validator. During the migration phase,
we have used **jing**. It's Java, it's fast, the error messages are very precise. But there
are many others: https://relaxng.org/#validators. Should we possibly provide multiple
validators in doc/src/sgml/Makefile?



ToDo


- Adoption of doc/src/sgml/Makefile
- Additional CSS definitions ???
- Adoption of Appendix J: Documentation


Forecast


- Entities: We use **character entities** (e.g.: \) as well as **parameter entities**
  (e.g.: %filelist;). The use of character entities instead of hex-values or direct
  Unicode-values is helpful because it supports the readability of the source for authors.
  The use of parameter entities can be replaced by the more XML-conform XInclude mechanism.
  But this isn't possible without major changes in most files:
   - Every xml/sgml-file must be XML conform, especially it needs a single root element.
   - In every xml/sgml-file we must re-declare namespace(s). The reason is that parameter
 entities perform a plain text substitution whereas xi:include creates trees and combines
 them.
- Description of user- and program-interfaces: We use a confusing mixture of ,
  , ,  (but not ), , and '[]'.
  We shall work on the unification of this diversity by creating some rules (like README.links)
  and then - more or less manually - change the source code over time.





DocBook 5.2

2022-09-04 Thread Jürgen Purtz
DocBook 5.2 is around the corner [1], we use DocBook 4.5 which is 
'feature frozen' since 2006, and there are even ideas for DocBook 6.x [2].


I want to inform you that I'm working on an upgrade of our documentation 
to DocBook 5.2. Major steps have been done, but I need some more time 
before I can publish a first working draft. Please keep me informed, if 
someone else is working on the same issue.


J. Purtz

[1]: https://github.com/docbook/docbook/releases/tag/5.2CR3
[2]: 
https://github.com/docbook/docbook/issues?q=is%3Aissue+is%3Aopen+label%3Av6




Re: Be more clear what names can be used for tables with "CREATE TABLE"?

2021-10-30 Thread Jürgen Purtz

a) Don't forget: constraint, function, procedure, trigger.

b) The hint applies to each of such objects: CREATE 
[TABLE|VIEW|FUNCTION|...]. To avoid such redundancy we could extend and 
clarify the definition of 'SQL object / local object / global object' 
within the glossary and refer to it.


c) In general we have 3 levels (namespaces) where object names are 
unique across the same or different types.


 * cluster-level: database name, tablespace, replication origin,
   subscription for logical replication, role, ...?
 * database-level: schema, extension, collation, data type cast, ...?
 * schema-level: table, index, view, materialized view, foreign table,
   sequence, constraint, function, procedure, trigger, operator, ...?

--

Jürgen Purtz



Re: [PATCH] add link to domain data types section from locale documentation

2021-07-23 Thread Jürgen Purtz

On 27.05.21 09:45, Laurenz Albe wrote:

I like the new link from "src/sgml/charset.sgml", but I think that
there are a few other places in the documentation that are likely to be
read by people who don't know about SQL domains and could benefit from such
a link:

- src/sgml/earthdistance.sgml

- src/sgml/lo.sgml

- src/sgml/information_schema.sgml (the entries about "check_constraints",
   "column_domain_usage", "data_type_privileges", "domain_constraints",
   "domain_catalog", "domain_udt_usage", "domains" and the description below
   "columns")

This selection is a bit arbitrary, I admit.

Yours,
Laurenz Albe


Done at the proposed pages - and a few more.

--

Jürgen Purtz


diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 98df74d0e1..8434bf6ba4 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -346,7 +346,8 @@ initdb --locale=sv_SE
 collation.  (The built-in collatable data types are
 text, varchar, and char.
 User-defined base types can also be marked collatable, and of course
-a domain over a collatable data type is collatable.)  If the
+a domain over a
+collatable data type is collatable.)  If the
 expression is a column reference, the collation of the expression is the
 defined collation of the column.  If the expression is a constant, the
 collation is the default collation of the data type of the
diff --git a/doc/src/sgml/earthdistance.sgml b/doc/src/sgml/earthdistance.sgml
index 641e69c5e9..f5b60ea612 100644
--- a/doc/src/sgml/earthdistance.sgml
+++ b/doc/src/sgml/earthdistance.sgml
@@ -50,7 +50,8 @@
   
Data is stored in cubes that are points (both corners are the same) using 3
coordinates representing the x, y, and z distance from the center of the
-   Earth.  A domain earth over cube is provided, which
+   Earth.  A domain earth
+   over cube is provided, which
includes constraint checks that the value meets these restrictions and
is reasonably close to the actual surface of the Earth.
   
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index c8d0440e80..0c1176fa07 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -389,9 +389,7 @@
 
  A restriction on the values of data allowed within a
  table,
- or in attributes of a
- 
- domain.
+ or in attributes of a domain.
 
 
  For more information, see
@@ -508,6 +506,27 @@

   
 
+  
+   Domain (Internet)
+   
+
+ A realm to administer certain Internet resources.
+
+   
+  
+
+  
+   Domain (SQL)
+   
+
+ A user-defined data type that is based on another underlying data type.
+
+
+ For more information, see .
+
+   
+  
+
   
Durability

diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml
index 4100198252..13de037b8e 100644
--- a/doc/src/sgml/information_schema.sgml
+++ b/doc/src/sgml/information_schema.sgml
@@ -912,8 +912,8 @@
 
   
The view check_constraints contains all check
-   constraints, either defined on a table or on a domain, that are
-   owned by a currently enabled role.  (The owner of the table or
+   constraints, either defined on a table or on a domain,
+   that are owned by a currently enabled role.  (The owner of the table or
domain is the owner of the constraint.)
   
 
@@ -1199,7 +1199,8 @@
 
   
The view column_domain_usage identifies all
-   columns (of a table or a view) that make use of some domain defined
+   columns (of a table or a view) that make use of some
+   domain defined
in the current database and owned by a currently enabled role.
   
 
@@ -2307,7 +2308,8 @@
data type descriptors that the current user has access to, by way
of being the owner of the described object or having some privilege
for it.  A data type descriptor is generated whenever a data type
-   is used in the definition of a table column, a domain, or a
+   is used in the definition of a table column, a
+   domain, or a
function (as parameter or return type) and stores some information
about how the data type is used in that instance (for example, the
declared maximum length, if applicable).  Each data type
@@ -2393,7 +2395,8 @@
 
   
The view domain_constraints contains all constraints
-   belonging to domains defined in the current database.  Only those domains
+   belonging to domains
+   defined in the current database.  Only those domains
are shown that the current user has access to (by way of being the owner or
having some privilege).
   
@@ -2445,7 +2448,8 @@
domain_catalog sql_identifier
   
   
-   Name of the database that contains the domain (always the current database)
+   Name of the database that contains the domain
+   (always the current database)
   
  
 
@@ -2493,7 +2497,8 @@
 

Re: Headers for each index

2021-07-20 Thread Jürgen Purtz

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/13/indexes-types.html
Description:

For easier visual navigation could you please add headers or color the
different types of indexes? At the moment, one needs to read the whole
document or do search in the page to find the available indexes.

Thanks a ton!


The look has changed in version 14 in a way that hopefully matches 
your suggestion: https://www.postgresql.org/docs/14/indexes-types.html


--

Jürgen Purtz






Re: Additional Chapter for Tutorial

2021-05-21 Thread Jürgen Purtz

Hi Jürgen,

What's going to happen with this work?

If you intend to have it eventually committed, I think it will be 
necessary to make the patches smaller, and bring them into the 
commitfest app, so that others can follow progress.


I for one, cannot see/remember/understand what has been done, or even 
whether you intend to continue with it.


Thanks,

Erik


Peter changed the status to 'Returned with feedback' at the end of the 
last commit fest. I'm not absolutely sure, but my understanding is that 
the patch is rejected.


--

Jürgen Purtz






Re: [PATCH] add link to domain data types section from locale documentation

2021-04-01 Thread Jürgen Purtz

On 30.03.21 21:46, Anton Voloshin wrote:

Hello.

While browsing character set collation documentation [1]
I've found myself a bit puzzled by unfamiliar terminology:
> of course a domain over a collatable data type is collatable

Of course, it only takes a search or two to find out the exact
meaning of "domain" in this context, but I think it would be useful
to make this word a link to the relevant section as a help for less 
experienced users like myself.


I've attached a patch that does that. Please let me know whether this 
fix is desirable and correct (I'm new to Postgres) and if there is a 
better way of suggesting an improvement for the documentation.


[1] https://www.postgresql.org/docs/current/collation.html

It's likely that the term "domain" within an SQL context confuse some 
people. An additional remark in the glossary concerning the SQL and IP 
meaning may be helpful.


--

Jürgen Purtz


diff --git a/doc/src/sgml/charset.sgml b/doc/src/sgml/charset.sgml
index 1b00e543a6..300a0d83d4 100644
--- a/doc/src/sgml/charset.sgml
+++ b/doc/src/sgml/charset.sgml
@@ -346,7 +346,8 @@ initdb --locale=sv_SE
 collation.  (The built-in collatable data types are
 text, varchar, and char.
 User-defined base types can also be marked collatable, and of course
-a domain over a collatable data type is collatable.)  If the
+a domain over a collatable data type
+is collatable.)  If the
 expression is a column reference, the collation of the expression is the
 defined collation of the column.  If the expression is a constant, the
 collation is the default collation of the data type of the
diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 41f3e5ee86..a7a0e35ac5 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -508,6 +508,27 @@

   
 
+  
+   Domain (Internet)
+   
+
+ A realm to administer certain Internet resources.
+
+   
+  
+
+  
+   Domain (SQL)
+   
+
+ A user-defined data type that is based on another underlying data type.
+
+
+ For more information, see .
+
+   
+  
+
   
Durability



Re: Additional Chapter for Tutorial

2020-10-23 Thread Jürgen Purtz

On 21.10.20 22:33, David G. Johnston wrote:
I've begun looking at this and have included quite a few html comments 
within the patch.  However, the two main items that I have found so 
far are:


One, I agree with Peter that this seems misplaced in Tutorial.  I 
would create a new Internals Chapter and place this material there, or 
maybe consider a sub-chapter under "Overview of PostgreSQL 
Internals".  If this is deemed to be of a more primary importance than 
the content in the Internals section I would recommend placing it in 
Reference.  I feel it does fit there and given the general importance 
of that section readers will be inclined to click into it and skim 
over its content.


I like the idea of dividing the material into two different chapters. 
The existing part "I. Tutorial" contains the first concrete steps: 
installation, creating database and database objects, using SQL basic 
and advanced features. Its typical audience consists of persons doing 
their first steps with PG. The new material is aimed at persons 
interested in implementation aspects of PG. Therefore, the part "VII. 
Internals" seems to be the natural place to integrate it, something like 
"Architecture and Implementation Aspects" or "Architecture and 
Implementation Cornerstones".


Creating such a chapter in "VII. Internals" will increase the existing 
chapter numbers 50 - 71, which may lead to some confusion. On the other 
hand the content can possibly be applied to all supported PG versions at 
the same time, which will lead to a consistent behavior. Extending one 
of the existing chapters won't work because all of them handle their own 
topic, eg.: "50. Overview of PostgreSQL Internals" (misleading title?) 
focuses on the handling of SQL statements from parsing to execution.


What are your thoughts?

--

J. Purtz



Re: Change JOIN tutorial to focus more on explicit joins

2020-10-23 Thread Jürgen Purtz

On 22.10.20 17:14, Pavel Stehule wrote:


Why do you use parenthesis for ON clause?  It is useless. SQL is not C 
or JAVA.



Two more general answers:
- Why do people use tabs, spaces, and newlines to format their code even 
though it's not necessary? SQL is a language to develop applications. 
And what are the main costs of an application? It's not the time which 
it takes to develop them. It's the time for their maintenance. During 
the course of one or more decades, different persons will have to read 
the code, add additional features, and fix bugs. They need some time to 
read and understand the existing code. This task can be accelerated if 
the code is easy to read. Therefore, it's a good habit of developers to 
sometimes spend some extra characters to the code than is required -  
not only comments. An example: there are clear precedence rules for 
Boolean operators NOT/AND/OR. In an extensive statement it may be 
helpful - for the developer himself as well as for anybody else -to use 
newlines and parentheses at places where they are not necessary to keep 
an overview of the intention of the statement. In such cases, 
code-optimization is the duty of the compiler, not of the developer.
- In my professional life as a software developer, I have seen about 15 
different languages. But only in rare cases, they have offered new 
features or concepts. To overcome this Babylonian linguistic diversity I 
tend to use such syntactical constructs which are common to many of them 
even, even if they are not necessary for the concrete language.


And the concrete answer: Omitting the parentheses for the join condition 
raises the danger that its Boolean operators are mixed with the Boolean 
operators of the WHERE condition. The result at runtime is the same, but 
a reader will understand the intention of the statement faster if the 
parentheses exists.


--

J. Purtz






Re: Change JOIN tutorial to focus more on explicit joins

2020-10-22 Thread Jürgen Purtz

On 22.10.20 01:40, David G. Johnston wrote:
On Fri, Sep 4, 2020 at 2:36 AM Jürgen Purtz <mailto:juer...@purtz.de>> wrote:


On 04.09.20 08:52, Peter Eisentraut wrote:
>
> For the remaining patch I have a couple of concerns:


This patch should not be changing the formatting choices for these 
queries, just the addition of a JOIN clause and modification of the 
WHERE clause.  Specifically, SELECT is left-aligned while all 
subsequent clauses indent under it.  Forced alignment by adding extra 
spaces isn't done here either.  I have not altered those in the attached.


Did some word-smithing on the first paragraph.  The part about the 
cross-join was hurt by "in some way" and "may be" is not needed.


Pointing out that values from both tables doesn't seem like an 
improvement when the second item covers that and it is more specific 
in noting that the city name that is joined on appears twice - once 
from each table.


ON expression is more precise and the reader should be ok with the term.

Removal of the exercise is good.  Not the time to discuss cross join 
anyway.  Given that "ON true" works the cross join form isn't even 
required.


In the FROM clause form I would not add table prefixes to the column 
names.  They are not part of the form changing.  If discussion about 
table prefixing is desired it should be done explicitly and by 
itself.  They are used later on, I didn't check to see whether that 
was covered or might be confusing.


I suggested a wording for why to use join syntax that doesn't involve 
legacy and points out its merit compared to sticking a join expression 
into the where clause.


The original patch missed having the syntax for the first left outer 
join conform to the multi-line query writing standard you introduced.  
I did not change.


The "AND" ON clause should just go with (not changed):

ON (w1.temp_lo < w2.temp_lo
    AND w1.temp_hi > w2.temp_high);

Attaching my suggestions made on top of the attached original 
0002-query.patch


David J.

(Hopefully) I have integrated all of David's suggestions as well as the 
following rules:


- Syntax formatting with the previously used 4 spaces plus newline for JOIN

- Table aliases only when necessary or explicitly discussed

The discussion about the explicit vs. implicit syntax is added to the 
"As join expressions serve a specific purpose ... " sentence and creates 
a paragraph of its own.


The patch is build on top of master.

--

J. Purtz


diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml
index e73e805ec4..532427ab4e 100644
--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@@ -440,13 +440,12 @@ SELECT DISTINCT city
 

 Thus far, our queries have only accessed one table at a time.
-Queries can access multiple tables at once, or access the same
-table in such a way that multiple rows of the table are being
-processed at the same time.  A query that accesses multiple rows
-of the same or different tables at one time is called a
-join query.  As an example, say you wish to
-list all the weather records together with the location of the
-associated city.  To do that, we need to compare the city
+Queries which access multiple tables (including repeats) at once are called
+join queries.  They internally combine
+each row from one table with each row of a second table.  An expression is
+specified to limit which pairs of rows are returned.
+For example, to return all the weather records together with the location of the
+associated city, the database compare the city
 column of each row of the weather table with the
 name column of all rows in the cities
 table, and select the pairs of rows where these values match.
@@ -461,10 +460,17 @@ SELECT DISTINCT city
 
 
 SELECT *
-FROM weather, cities
-WHERE city = name;
+FROM weather
+JOIN cities ON (city = name);
 
 
+After the keyword ON follows the
+expression comparing their rows. In this case, the
+definition how to compare their rows. In this case, the
+column city of table weather
+must be equal to the column name
+of table cities.
+
 
  city  | temp_lo | temp_hi | prcp |date| name  | location
 ---+-+-+--++---+---
@@ -499,23 +505,14 @@ SELECT *
*:
 
 SELECT city, temp_lo, temp_hi, prcp, date, location
-FROM weather, cities
-WHERE city = name;
+FROM weather
+JOIN cities ON (city = name);
 
   
  
 

 
-   
-Exercise:
-
-
- Attempt to determine the semantics of this query when the
- WHERE clause is omitted.
-
-   
-

 Since the columns all had different names, the parser
 automatically found which table they belong to.  If there
@@ -526,8 +523,8 @@ SELECT city, temp_lo, temp_hi, prcp, date, location
 
 SELE

Re: exceptions

2020-10-22 Thread Jürgen Purtz

On 21.10.20 18:17, Tom Lane wrote:

PG Doc comments form  writes:

Description:
02000   no_data
should read no_data_found

Looks correct as stated to me.  plpgsql's ERRCODE_NO_DATA_FOUND
(P0002) is a different animal.

regards, tom lane



Yes, our documentation correlates to the SQL standard (as of 2011):

--

J. Purtz



Re: Additional Chapter for Tutorial

2020-09-11 Thread Jürgen Purtz

On 10.09.20 18:26, Peter Eisentraut wrote:

On 2020-09-02 09:04, Jürgen Purtz wrote:

On 01.09.20 23:30, Peter Eisentraut wrote:

It is mostly advanced low-level information that is irrelevant for
someone starting up,

That applies only to the VACUUM chapter. VACUUM and AUTOVACUUM are
controlled by a lot of parameters. Therefor the current documentation
concerning the two mechanism spreads the description across different
pages (20.4, 25.1, VACUUM command). Because of the structure of our
documentation that's ok. But we should have a summary page somewhere -
not necessarily in the tutorial.


There is probably room for improvement, but the section numbers you 
mention are not about VACUUM, AFAICT, so I can't really comment on 
what you have in mind.


Because of the additional chapter for the 'tutorial' on my local 
computer, the numbers increased for me. The regular chapter numbers are 
19.4 and 24.1. Sorry for the confusion. In detail:


19.4: parameters to configure the server, especially five parameters 
'vacuum_cost_xxx'.


19.10: parameters to configure autovacuum.

19.11: parameters to configure client connections, especially five 
parameters 'vacuum_xxx' concerning their freeze-behavior.


24.1: explains the general necessity of (auto)vacuum and their strategies.

The page about the SQL command VACUUM explains the different options 
(FULL, FREEZE, ..) and their meaning.


--

Jürgen Purtz






Re: Change JOIN tutorial to focus more on explicit joins

2020-09-04 Thread Jürgen Purtz

On 04.09.20 08:52, Peter Eisentraut wrote:


For the remaining patch I have a couple of concerns:

>  
>   Attempt to determine the semantics of this query when the
> - WHERE clause is omitted.
> + ON clause is omitted.
>  
> 

This no longer works.


Ok, but I don't have any better suggestion than to delete this para.
In general, I agree that some more emphasis on the JOIN syntax is 
okay. But I think the order in which the tutorial has taught it so far 
is okay: First you do it the manual way, then you learn the more 
abstract way.


In this context, I wouldn't use the terms 'manual' and 'abstract', it's 
more about 'implicit' and 'explicit' syntax. The 'explicit' syntax does 
not only emphasis the aspect of 'joining' tables, it also differentiates 
between the usage of following AND/OR/NOT key words as join conditions 
or as additional restrictions (the results are identical but not the 
semantic). Because the purpose of this patch is the preference of the 
explicit syntax, we shall show this syntax first.




> +   
> +    
> + The examples shown here combine rows via city names.
> + This should help to understand the concept. Professional
> + solutions prefer to use numerical IDs and foreign keys
> + to join tables.
> +    
> +   

While there are interesting debates to be had about natural vs. 
surrogate keys, I don't think we should imply that one of them is 
unprofessional and then leave it at that and give no further guidance. 
I think we should leave this out.



Ok, deleted.

--

Jürgen Purtz

diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml
index e73e805ec4..413763691e 100644
--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@@ -441,11 +441,11 @@ SELECT DISTINCT city

 Thus far, our queries have only accessed one table at a time.
 Queries can access multiple tables at once, or access the same
-table in such a way that multiple rows of the table are being
-processed at the same time.  A query that accesses multiple rows
-of the same or different tables at one time is called a
-join query.  As an example, say you wish to
-list all the weather records together with the location of the
+table several times. Such queries  they are called
+join queries  combine
+rows of one table in some way with rows of the other table
+and return a single row per combination. An example may be a
+list of all the weather records together with the location of the
 associated city.  To do that, we need to compare the city
 column of each row of the weather table with the
 name column of all rows in the cities
@@ -461,10 +461,17 @@ SELECT DISTINCT city
 
 
 SELECT *
-FROM weather, cities
-WHERE city = name;
+FROM   weather
+JOIN   cities ON (city = name);
 
 
+The keyword JOIN connects the two tables.
+Behind the keyword ON follows the
+definition how to compare their rows. In this case, the
+column city of table weather
+must be equal to the column name
+of table cities.
+
 
  city  | temp_lo | temp_hi | prcp |date| name  | location
 ---+-+-+--++---+---
@@ -476,8 +483,14 @@ SELECT *

 

-Observe two things about the result set:
+Observe some things about the result set:
 
+ 
+  
+   The resulting rows contain values from both tables.
+  
+ 
+
  
   
There is no result row for the city of Hayward.  This is
@@ -499,23 +512,14 @@ SELECT *
*:
 
 SELECT city, temp_lo, temp_hi, prcp, date, location
-FROM weather, cities
-WHERE city = name;
+FROM   weather
+JOIN   cities ON (city = name);
 
   
  
 

 
-   
-Exercise:
-
-
- Attempt to determine the semantics of this query when the
- WHERE clause is omitted.
-
-   
-

 Since the columns all had different names, the parser
 automatically found which table they belong to.  If there
@@ -526,8 +530,8 @@ SELECT city, temp_lo, temp_hi, prcp, date, location
 
 SELECT weather.city, weather.temp_lo, weather.temp_hi,
weather.prcp, weather.date, cities.location
-FROM weather, cities
-WHERE cities.name = weather.city;
+FROM   weather
+JOIN   cities ON (cities.name = weather.city);
 
 
 It is widely considered good style to qualify all column names
@@ -537,15 +541,18 @@ SELECT weather.city, weather.temp_lo, weather.temp_hi,
 

 Join queries of the kind seen thus far can also be written in this
-alternative form:
+form:
 
 
 SELECT *
-FROM weather INNER JOIN cities ON (weather.city = cities.name);
+FROM   weather, cities
+WHERE  weather.city = cities.name;
 
 
-This syntax is not as commonly used as the one above, but we show
-it here to help you understand the following topics.
+This syntax is mainly used in legacy applications. I

Re: Additional Chapter for Tutorial

2020-09-02 Thread Jürgen Purtz

On 01.09.20 23:30, Peter Eisentraut wrote:
It is mostly advanced low-level information that is irrelevant for 
someone starting up,


That applies only to the VACUUM chapter. VACUUM and AUTOVACUUM are 
controlled by a lot of parameters. Therefor the current documentation 
concerning the two mechanism spreads the description across different 
pages (20.4, 25.1, VACUUM command). Because of the structure of our 
documentation that's ok. But we should have a summary page somewhere - 
not necessarily in the tutorial.


the most part the information just duplicates what is already 
explained elsewhere.


That is the nature of a tutorial respectively a summary.

--

J. Purtz






Re: initdb - creating clusters

2020-08-31 Thread Jürgen Purtz

On 30.08.20 17:21, Tom Lane wrote:

Do you have a feeling one way or the other about whether to repeat
some of this text in each of the relevant sub-sections?  I initially
didn't want to do that, but thinking about how people consume the
HTML docs, I'm afraid that anything not appearing on the same page
won't get seen.


If we do so but avoid redundant text parts, we can use the entity 
mechanism or the more modern XInclude mechanism. The attached patch uses 
both techniques in an example file: brin.sgml includes lorem.sgml two times.


(In both cases we should avoid files with multiple root elements, eg. 
multiple  or  without a parent element, because this would 
violate the well-formed-ness of the included XML document.)


--

J. Purtz

diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 0401a515df..562f607190 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -132,8 +132,8 @@ endif
 html: html-stamp
 
 html-stamp: stylesheet.xsl postgres.sgml $(ALLSGML) $(ALL_IMAGES)
-	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
-	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) $(wordlist 1,2,$^)
+	$(XMLLINT) $(XMLINCLUDE) --noout --xinclude --postvalid --noent $(word 2,$^)
+	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --xinclude $(XSLTPROC_HTML_FLAGS) $(wordlist 1,2,$^)
 	cp $(ALL_IMAGES) html/
 	cp $(srcdir)/stylesheet.css html/
 	touch $@
@@ -167,12 +167,12 @@ postgres.pdf:
 XSLTPROC_FO_FLAGS += --stringparam img.src.path '$(srcdir)/'
 
 %-A4.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
-	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
-	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_FO_FLAGS) --stringparam paper.type A4 -o $@ $(wordlist 1,2,$^)
+	$(XMLLINT) $(XMLINCLUDE) --noout --xinclude --postvalid --noent $(word 2,$^)
+	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --xinclude $(XSLTPROC_FO_FLAGS) --stringparam paper.type A4 -o $@ $(wordlist 1,2,$^)
 
 %-US.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
-	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
-	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_FO_FLAGS) --stringparam paper.type USletter -o $@ $(wordlist 1,2,$^)
+	$(XMLLINT) $(XMLINCLUDE) --noout --xinclude --postvalid --noent $(word 2,$^)
+	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --xinclude $(XSLTPROC_FO_FLAGS) --stringparam paper.type USletter -o $@ $(wordlist 1,2,$^)
 
 %.pdf: %.fo $(ALL_IMAGES)
 	$(FOP) -fo $< -pdf $@
diff --git a/doc/src/sgml/brin.sgml b/doc/src/sgml/brin.sgml
index 4420794e5b..ac5ef96173 100644
--- a/doc/src/sgml/brin.sgml
+++ b/doc/src/sgml/brin.sgml
@@ -11,6 +11,13 @@
 
  Introduction
 
+ 
+ via entity mechanism:
+ 
+ 
+ via xinclude mechanism:
+ http://www.w3.org/2001/XInclude; href="lorem.sgml" />
+
  
   BRIN stands for Block Range Index.
   BRIN is designed for handling very large tables
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index 64b5da0070..2e42197844 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -182,3 +182,5 @@
 
 
 
+
+
diff --git a/doc/src/sgml/lorem.sgml b/doc/src/sgml/lorem.sgml
new file mode 100644
index 00..ffb557bf22
--- /dev/null
+++ b/doc/src/sgml/lorem.sgml
@@ -0,0 +1,5 @@
+  
+   Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed
+   do eiusmod tempor incididunt ut labore et dolore magna aliqua.
+   An arbitrary xref-link: 
+  


Re: Procedures

2020-08-22 Thread Jürgen Purtz

On 22.08.20 13:05, Robin Abbi wrote:
On Fri, 21 Aug 2020 at 23:52, Bruce Momjian > wrote:


I developed the attached patach for this.  Is this sufficient?


Would it be appropriate to consider including some language with a 
similar information content to this


" ... prior to PostgreSQL 11, these functions were unable to
manage their own transactions. PostgreSQL 11 adds SQL procedures
that can perform full transaction management within the body of a
function, enabling developers to create more advanced server-side
applications, such as ones involving incremental bulk data loading."

from here https://www.postgresql.org/about/news/1894/ .

Robin Abbi


Can we more clearly distinguish between "function" and "procedure"? eg:

"Developers have been able to create user-defined functions in 
PostgreSQL since decades, but functions are unable to manage their own 
transactions. PostgreSQL 11 adds SQL procedures that can perform full 
transaction management within their body, enabling developers to create 
more advanced server-side applications, such as ones involving 
incremental bulk data loading."


as a modification of the original release notes:

"Developers have been able to create user-defined functions in 
PostgreSQL for over 20 years, but prior to PostgreSQL 11, these 
functions were unable to manage their own transactions. PostgreSQL 11 
adds SQL procedures that can perform full transaction management within 
the body of a function, enabling developers to create more advanced 
server-side applications, such as ones involving incremental bulk data 
loading."


--

J. Purtz



Re: obsolete indexing method "rtree"

2020-08-07 Thread Jürgen Purtz


On 06.08.20 10:12, David G. Johnston wrote:
On Thu, Aug 6, 2020 at 12:18 AM Jürgen Purtz <mailto:juer...@purtz.de>> wrote:



> Why?

Because it can hinder the learning process.


FWIW I'd at least fixup comment.sgml to reference a current type.  And 
while in there might as well add missing examples for the following 
since we seem to have a goal of providing one example for every syntax 
(operators get two in order to show NONE, rightarg).


  EVENT TRIGGER object_name |
  PUBLICATION object_name |
  SUBSCRIPTION object_name |
  ROUTINE routine_name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] 
) ] |


Such a "make current" patch that covers this complaint seems 
reasonable; IOW, why not if you are in there anyway - rtree has some 
baggage and no particular merit such that a different label would be 
worse.


I don't see the value of removing the backward compatibility hack if 
it's not bothering the developers.  Whether the documentation should 
be changed to basically only cover this and nothing more I cannot say 
without studying said documentation.  I agree with the general motive 
though.


David J.


Patch with four additional comments and one modified comment attached.

--

Jürgen Purtz


diff --git a/doc/src/sgml/ref/comment.sgml b/doc/src/sgml/ref/comment.sgml
index 965c5a40ad..fd7492a255 100644
--- a/doc/src/sgml/ref/comment.sgml
+++ b/doc/src/sgml/ref/comment.sgml
@@ -306,7 +306,7 @@ COMMENT ON TABLE mytable IS NULL;
Some more examples:
 
 
-COMMENT ON ACCESS METHOD rtree IS 'R-Tree access method';
+COMMENT ON ACCESS METHOD gin IS 'GIN index access method';
 COMMENT ON AGGREGATE my_aggregate (double precision) IS 'Computes sample variance';
 COMMENT ON CAST (text AS int4) IS 'Allow casts from text to int4';
 COMMENT ON COLLATION "fr_CA" IS 'Canadian French';
@@ -316,6 +316,7 @@ COMMENT ON CONSTRAINT bar_col_cons ON bar IS 'Constrains column col';
 COMMENT ON CONSTRAINT dom_col_constr ON DOMAIN dom IS 'Constrains col of domain';
 COMMENT ON DATABASE my_database IS 'Development Database';
 COMMENT ON DOMAIN my_domain IS 'Email Address Domain';
+COMMENT ON EVENT TRIGGER abort_ddl IS 'Aborts all DDL commands';
 COMMENT ON EXTENSION hstore IS 'implements the hstore data type';
 COMMENT ON FOREIGN DATA WRAPPER mywrapper IS 'my foreign data wrapper';
 COMMENT ON FOREIGN TABLE my_foreign_table IS 'Employee Information in other database';
@@ -330,12 +331,15 @@ COMMENT ON OPERATOR CLASS int4ops USING btree IS '4 byte integer operators for b
 COMMENT ON OPERATOR FAMILY integer_ops USING btree IS 'all integer operators for btrees';
 COMMENT ON POLICY my_policy ON mytable IS 'Filter rows by users';
 COMMENT ON PROCEDURE my_proc (integer, integer) IS 'Runs a report';
+COMMENT ON PUBLICATION alltables IS 'Publishes all operations on all tables';
 COMMENT ON ROLE my_role IS 'Administration group for finance tables';
+COMMENT ON ROUTINE my_routine (integer, integer) IS 'Runs a routine (which is a function or procedure)';
 COMMENT ON RULE my_rule ON my_table IS 'Logs updates of employee records';
 COMMENT ON SCHEMA my_schema IS 'Departmental data';
 COMMENT ON SEQUENCE my_sequence IS 'Used to generate primary keys';
 COMMENT ON SERVER myserver IS 'my foreign server';
 COMMENT ON STATISTICS my_statistics IS 'Improves planner row estimations';
+COMMENT ON SUBSCRIPTION alltables IS 'Subscription for all operations on all tables';
 COMMENT ON TABLE my_schema.my_table IS 'Employee Information';
 COMMENT ON TABLESPACE my_tablespace IS 'Tablespace for indexes';
 COMMENT ON TEXT SEARCH CONFIGURATION my_config IS 'Special word filtering';



Re: obsolete indexing method "rtree"

2020-08-06 Thread Jürgen Purtz




Why?


Because it can hinder the learning process. And it's possible that 
people ignore or forget about the warning message, but are confused 
somewhat later that commands like \d shows GiST instead of RTREE.



It's not hurting anything to continue the substitution in CREATE
INDEX.
Yes, it's not hurting. But it makes the life of newcomers a little 
harder. Why not stripping away thinks which aren't used since 12 years?

Possibly the comment in seg.sgml should be adjusted to refer to GiST
indexes --- but then again, it's not entirely clear whether it's even
still relevant for GiST indexes.
This comment is not specific to the old rtree implementation. In 
general, the input to R-Trees and the algorithm which creates the 
'rectangles' in non-leaf nodes influences the building of the tree and 
hereinafter the performance of access methods.

Perhaps it's talking about a deficiency
that was specific to the old rtree code.


And sometimes even experts have to spend time thinking about the situation.

Nevertheless, this is a tiny issue. Unfortunately I don't have the 
knowledge to modify the C-source and create a patch.


--

Jürgen Purtz





obsolete indexing method "rtree"

2020-08-05 Thread Jürgen Purtz
We removed the indexing method "rtree" in version 8.3 and replaced it 
with "gist". Until today there are some "rtree" heritages:


 * "CREATE INDEX ... USING rtree ..." is valid syntax and leads to the
   message "NOTICE:  substituting access method gist for obsolete
   method rtree"
 * The documentation refers to the method in ref/comments.sgml and
   ref/create_index.sgml.
 * The documentation uses the term "R-Tree" in gist.sgml, seg.sgml, and
   xindex.sgml. That's ok because in this cases it refers to the
   abstract concept of R-trees and not to the eliminated implementation.

Isn't it time to disallow this syntax and remove all references and 
messages? The situation concerning index types, access methods, operator 
classes, and index related extensions is complicate enough and hard to 
explain to newcomers. Every simplification should be welcomed.


--

Jürgen Purtz




Import an XML file

2020-07-22 Thread Jürgen Purtz

On 22.07.20 01:49, PASCAL CROZET wrote:


Hi all,

I want to import XML file into PG database table.

I've find functions to get the XML content of a cell after imported an 
XML file with the pg_get_file function.


But, I want to explode the XML content to colums. How can I do this ?

PG 10 under Ubuntu 18

*_*

Cordialement, *Pascal CROZET**
*

*DBA *- Qualis Consulting

•300 Route Nationale 6 – 69760 LIMONEST

*_*



Pascal, your situation is unclear - at least for me. Please give us more 
information concerning


 * your pg_get_file function
 * a short excerpt of your file
 * the DDL of your table
 * what is your understanding of a 'cell'?

Is this helpful: 
https://www.postgresql.org/docs/current/functions-xml.html ?


Kind regards, Jürgen Purtz




Re: Add A Glossary

2020-07-21 Thread Jürgen Purtz

On 19.06.20 19:10, Alvaro Herrera wrote:

Thanks for these fixes!  I included all of these.

On 2020-Jun-19, Erik Rijkers wrote:


And one thing that I am not sure of (but strikes me as a bit odd):
there are several cases of
'are enforced unique'. Should that not be
'are enforced to be unique'  ?

I included this change too; I am not too sure of it myself.  If some
English language neatnik wants to argue one way or the other, be my
guest.


- Added '(process)' to the two terms 'Autovacuum' and 'Stats Collector'

- Removed link to himself in 'Logger (process)'

- new term: Base Backup


--

Jürgen Purtz


diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index 76525c6302..58e5071642 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -108,7 +108,7 @@
   
 
   
-   Autovacuum
+   Autovacuum (process)

 
  A set of background processes that routinely perform
@@ -178,6 +178,19 @@

   
 
+  
+   Base Backup
+   
+
+ A binary copy of all
+ database cluster
+ files. It is generated by the tool .  
+ In combination with WAL files it can be used as the starting point
+ for recovery, log shipping, or streaming replication.
+
+   
+  
+
   
Bloat

@@ -855,8 +868,7 @@
Logger (process)

 
- If activated, the
- Logger process
+ If activated, the process
  writes information about database events into the current
  log file.
  When reaching certain time- or
@@ -1486,7 +1498,7 @@
   
 
   
-   Stats collector
+   Stats collector (process)

 
  This process collects statistical information about the


Re: Additional Chapter for Tutorial

2020-07-13 Thread Jürgen Purtz



On 12.07.20 22:45, Daniel Gustafsson wrote:

This patch no longer applies, due to conflicts in start.sgml, can you please
submit a rebased version?


ok. but I need some days.  juergen






Re: Add A Glossary

2020-06-17 Thread Jürgen Purtz


On 17.06.20 02:09, Alvaro Herrera wrote:

On 2020-Jun-09, Jürgen Purtz wrote:


Can you agree to the following definitions? If no, we can alternatively
formulate for each of them: "Under discussion - currently not defined". My
proposals are inspired by chapter 2.2 Concepts: "Tables are grouped into
databases, and a collection of databases managed by a single PostgreSQL
server instance constitutes a database cluster."

After sleeping on it a few more times, I don't oppose the idea of making
"instance" be the running state and "database cluster" the on-disk stuff
that supports the instance.  Here's a patch that does things pretty much
along the lines you suggested.

I made small adjustments to "SQL objects":

* SQL objects in schemas were said to have their names unique in the
schema, but we failed to say anything about names of objects not in
schemas and global objects.  Added that.

* Had example object types for global objects and objects not in
schemas, but no examples for objects in schemas.  Added that.


Some programs whose output we could tweak per this:
pg_ctl

pg_ctl is a utility to initialize, start, stop, or control a PostgreSQL server.
  -D, --pgdata=DATADIR   location of the database storage area

to:

pg_ctl is a utility to initialize or control a PostgreSQL database cluster.
  -D, --pgdata=DATADIR   location of the database directory

pg_basebackup:

pg_basebackup takes a base backup of a running PostgreSQL server.

to:

pg_basebackup takes a base backup of a PostgreSQL instance.


+1, with two formal changes:

-  Rearrangement of term "Data page" to meet alphabetical order.

-  Add  in one case to meet xml-well-formedness.


One last question: The definition of "Data directory" reads "... A 
cluster's storage space comprises the data directory plus ..." and 
'cluster' links to '"glossary-instance". Shouldn't it link to 
"glossary-db-cluster"?


--

Jürgen Purtz


diff --git a/doc/src/sgml/glossary.sgml b/doc/src/sgml/glossary.sgml
index e29b55e5ac..0499f9044f 100644
--- a/doc/src/sgml/glossary.sgml
+++ b/doc/src/sgml/glossary.sgml
@@ -413,6 +413,22 @@

   
 
+  
+   Data page
+   
+
+ The basic structure used to store relation data.
+ All pages are of the same size.
+ Data pages are typically stored on disk, each in a specific file,
+ and can be read to shared buffers
+ where they can be modified, becoming
+ dirty.  They become clean when written
+ to disk.  New pages, which initially exist in memory only, are also
+ dirty until written.
+
+   
+  
+
   
Database

@@ -441,6 +457,7 @@
  cluster is also sometimes used to refer to an instance.
  (Don't confuse this term with the SQL command CLUSTER.)
 
+   
   
 
   
@@ -448,22 +465,6 @@

   
 
-  
-   Data page
-   
-
- The basic structure used to store relation data.
- All pages are of the same size.
- Data pages are typically stored on disk, each in a specific file,
- and can be read to shared buffers
- where they can be modified, becoming
- dirty.  They become clean when written
- to disk.  New pages, which initially exist in memory only, are also
- dirty until written.
-
-   
-  
-
   
Datum



Re: some charts or graphs of possible permissions would be nice

2020-06-11 Thread Jürgen Purtz

On 10.06.20 17:53, Alvaro Herrera wrote:

On 2020-Jun-10, Laurenz Albe wrote:


On Tue, 2020-06-09 at 21:34 +, PG Doc comments form wrote:

Having read through the documentation on roles/granting I think I more or
less understand how it works, but what isn't really clarified is what the
overall universe of permissions that can be granted looks like. For example
I still didn't realize that to create a schema, you need to "GRANT CREATE"
to the role on the database before the role is allowed to do that. It's hard
to make a mental map of everything that a new role might need when I am
creating it.

That would be material for a tutorial rather than a documentation.

... but our documentation *does* have a tutorial, which could perhaps
gain a section about privileges.

What permissions issues do users typically struggle with? I personally 
have seen no problems in this area. Stephen sends one example; can you 
send more examples - or even a short text or a sketch of what you expect 
to be in the documentation?


More general: Is it a real problem? My experience is that in most cases 
permissions are handled at the application level, not at the database 
level. Is it worth to give more details? I don't think so. But it may be 
a good idea to follow Stephen's suggestion and put an introductory 
summary to the tutorial chapter.


--

Jürgen Purtz






Re: Add A Glossary

2020-06-09 Thread Jürgen Purtz

On 17.05.20 17:28, Alvaro Herrera wrote:

I think the terms under discussion are just

* cluster
* instance
* server



Despite the short period of its existence the glossary achieved some 
importance, see: 
https://www.postgresql.org/message-id/b8e12875ebec9e6d3107df5fa1129e1e%40postgrespro.ru 
. We have to be careful with publications. It's not acceptable that we 
change definitions from release to release. Therefore IMO we should mark 
or even ignore such terms for which we cannot reach consensus.


Can you agree to the following definitions? If no, we can alternatively 
formulate for each of them: "Under discussion - currently not defined". 
My proposals are inspired by chapter 2.2 Concepts: "Tables are grouped 
into databases, and a collection of databases managed by a single 
PostgreSQL server instance constitutes a database cluster."



- "Database" (No change to existing definition): "A named collection of 
SQL objects."



- "Database Cluster", "Cluster" (New definition and rearrangements of 
some sentences): "A collection of related databases, and their common 
static and dynamic meta-data.


This term is sometimes used to refer to an instance.

(Don't confuse the term CLUSTER with the SQL command CLUSTER.)"


- "Data Directory" (Replaced 'instance' by 'cluster'): "The base 
directory on the filesystem of a server that contains all data files and 
subdirectories associated with a cluster (with the exception of 
tablespaces). The environment variable PGDATA is commonly used to refer 
to the data directory.


A cluster's storage space comprises the data directory plus any 
additional tablespaces.


For more information, see Section 68.1."


- "Database Server", "Instance" (Major changes): "A group of backend and 
auxiliary processes that communicate using a common shared memory area. 
One postmaster process manages the instance; one instance manages 
exactly one cluster with all its databases. Many instances can run on 
the same server as long as their TCP ports do not conflict.


The instance handles all key features of a DBMS: read and write access 
to files and shared memory, assurance of the ACID properties, 
connections to client processes, privilege verification, crash recovery, 
replication, etc."



- "Server" (No change to existing definition): "A computer on which 
PostgreSQL instances run. The term server denotes real hardware, a 
container, or a virtual machine.


This term is sometimes used to refer to an instance or to a host."


- "Host" (No change to existing definition): "A computer that 
communicates with other computers over a network. This is sometimes used 
as a synonym for server. It is also used to refer to a computer where 
client processes run."



--

Jürgen Purtz






Re: Change JOIN tutorial to focus more on explicit joins

2020-05-27 Thread Jürgen Purtz

On 20.05.20 23:56, Thomas Munro wrote:

On Thu, May 21, 2020 at 1:37 AM PG Doc comments form
 wrote:

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/12/tutorial-join.html
Description:

The tutorial about joins makes the following statement about the explicit
JOIN operator:


This syntax is not as commonly used as the one above

I think in 2020 this claim is no longer true, and I would love to see the
manual prefer the "modern" explicit JOIN operator rather than sticking to
the ancient implicit joins in the WHERE clause.

+1

The "new" syntax is 28 years old, from SQL 92.  I don't see too many
SQL 86 joins.  Would you like to write a documentation patch?



The attached patch

- prefers the explicit join-syntax over the implicit one and explains 
the keywords of the explicit syntax


- uses a more accurate definition of 'join'

- separates  and  tags

- shifts  definitions outside of  to get a better 
rendering in PDF


- adds a note concerning IDs and foreign keys


--

J. Purtz


diff --git a/doc/src/sgml/query.sgml b/doc/src/sgml/query.sgml
index c0889743c4..6f8ea373b5 100644
--- a/doc/src/sgml/query.sgml
+++ b/doc/src/sgml/query.sgml
@@ -441,11 +441,11 @@ SELECT DISTINCT city

 Thus far, our queries have only accessed one table at a time.
 Queries can access multiple tables at once, or access the same
-table in such a way that multiple rows of the table are being
-processed at the same time.  A query that accesses multiple rows
-of the same or different tables at one time is called a
-join query.  As an example, say you wish to
-list all the weather records together with the location of the
+table several times. Such queries  they are called
+join queries  combine
+rows of one table in some way with rows of the other table
+and return a single row per combination. An example may be a
+list of all the weather records together with the location of the
 associated city.  To do that, we need to compare the city
 column of each row of the weather table with the
 name column of all rows in the cities
@@ -461,10 +461,17 @@ SELECT DISTINCT city
 
 
 SELECT *
-FROM weather, cities
-WHERE city = name;
+FROM   weather
+JOIN   cities ON (city = name);
 
 
+The keyword JOIN connects the two tables.
+Behind the keyword ON follows the
+definition how to compare their rows. In this case, the
+column city of table weather
+must be equal to the column name
+of table cities.
+
 
  city  | temp_lo | temp_hi | prcp |date| name  | location
 ---+-+-+--++---+---
@@ -476,8 +483,14 @@ SELECT *

 

-Observe two things about the result set:
+Observe some things about the result set:
 
+ 
+  
+   The resulting rows contain values from both tables.
+  
+ 
+
  
   
There is no result row for the city of Hayward.  This is
@@ -499,8 +512,8 @@ SELECT *
*:
 
 SELECT city, temp_lo, temp_hi, prcp, date, location
-FROM weather, cities
-WHERE city = name;
+FROM   weather
+JOIN   cities ON (city = name);
 
   
  
@@ -509,10 +522,9 @@ SELECT city, temp_lo, temp_hi, prcp, date, location
 

 Exercise:
-
 
  Attempt to determine the semantics of this query when the
- WHERE clause is omitted.
+ ON clause is omitted.
 

 
@@ -526,8 +538,8 @@ SELECT city, temp_lo, temp_hi, prcp, date, location
 
 SELECT weather.city, weather.temp_lo, weather.temp_hi,
weather.prcp, weather.date, cities.location
-FROM weather, cities
-WHERE cities.name = weather.city;
+FROM   weather
+JOIN   cities ON (cities.name = weather.city);
 
 
 It is widely considered good style to qualify all column names
@@ -537,20 +549,22 @@ SELECT weather.city, weather.temp_lo, weather.temp_hi,
 

 Join queries of the kind seen thus far can also be written in this
-alternative form:
+form:
 
 
 SELECT *
-FROM weather INNER JOIN cities ON (weather.city = cities.name);
+FROM   weather, cities
+WHERE  weather.city = cities.name;
 
 
-This syntax is not as commonly used as the one above, but we show
-it here to help you understand the following topics.
+This syntax is not as commonly used as the one above. It dates back
+to the very first times of SQL, avoids the JOIN
+keyword and uses the WHERE clause instead of the
+ON clause.

 
+   joinouter

-joinouter
-
 Now we will figure out how we can get the Hayward records back in.
 What we want the query to do is to scan the
 weather table and for each row to find the
@@ -563,15 +577,19 @@ SELECT *
 
 
 SELECT *
-FROM weather LEFT OUTER JOIN cities ON (weather.city = cities.name);
+FROM   weather
+LEFT OUTER JOIN cities ON (weather.city = cities.name);
 
+
+
+
  city  | temp_lo | temp_hi | prcp |

Re: Add A Glossary

2020-05-20 Thread Jürgen Purtz

On 19.05.20 08:17, Laurenz Albe wrote:

On Mon, 2020-05-18 at 18:08 +0200, Jürgen Purtz wrote:

cluster/instance: PG (mainly) consists of a group of processes that commonly
act on shared buffers. The processes are very closely related to each other
and with the buffers. They exist altogether or not at all. They use a common
initialization file and are incarnated by one command. Everything exists
solely in RAM and therefor has a fluctuating nature. In summary: they build
a unit and this unit needs to have a name of itself. In some pages we used
to use the term *instance* - sometimes in extended forms: *database instance*,
*PG instance*, *standby instance*, *standby server instance*, *server instance*,
or *remote instance*.  For me, the term *instance* makes sense, the extensions
*standby instance* and *remote instance* in their context too.

FWIW, I feel somewhat like Alvaro on that point; I use those terms synonymously,
perhaps distinguishing between a "started cluster" and a "stopped cluster".
After all, "cluster" refers to "a cluster of databases", which are there, 
regardless
if you start the server or not.

The term "cluster" is unfortunate, because to most people it suggests a group of
machines, so the term "instance" is better, but that ship has sailed long ago.

The static part of a cluster to me is the "data directory".


cluster/instance: The different nature (static/dynamic) of what I call 
"cluster" and "instance" as well as the existence of the two commands 
"initdb — create a new PostgreSQL database cluster" and "pg_ctl — 
initialize, start, stop, or control a PostgreSQL server" confirms me in 
my opinion that we need two different terms for them. Those two terms 
shall not be synonym to each other, they label distinct things. If 
people prefer "data directory" instead of "cluster", this is ok for me.


There are situations where we need a single term for both of them. 
"Instance and its data directory" or "Instance and its cluster" are too 
wordy. In many cases we use "database server" or "server" in this sense. 
Imo "Server" is too short and ambiguous. "database server", the plural 
form "databases server", or the new term "cluster server", which is more 
accurate, would be ok for me. (Similar to "server", the term "cluster" 
is also used in many different contexts - but only outside of the PG 
world; within our context "cluster" is not ambiguous.)



server/host: We need a term to describe the underlying hardware respectively
the virtual machine or container, where PG is running. I suggest to use both
*server* and *host*. In computer science, both have their eligibility and are
widely used. Everybody understands *client/server architecture* or *host* in
TCP/IP configuration. We cannot change such matter of course. I suggest to
use both depending on the context, but with the same meaning: "real hardware,
a container, or a virtual machine".

On this I have a strong opinion because of my Unix mindset.
"machine" and "host" are synonyms, and it doesn't matter to the database if they
are virtualized or not.  You can always disambiguate by adding "virtual" or 
"physical".

A "server" is a piece of software that responds to client requests, never a 
machine.
In my book, this is purely Windows jargon.  The term "client-server 
architecture"
that you quote emphasized that.

Perhaps "machine" would be the preferable term, because "host" is more prone to
misunderstandings (except in a networking context).

server/host: I agree that we are not interested in the question whether 
there is real hardware or any virtualization container. We are even not 
interested in the operating system. Our primary concern is the existence 
of a port of the Internet Protocol. But is the term "server" appropriate 
to name an IP-port? Additionally, "server" is used for other meanings: 
a) the previously mentioned "database server" b) a (virtual) machine: 
"server-side", "... the file ... loaded by the server ..." c) binaries 
"... the server must be built with SSL support ..." d) whenever it seems 
to be appropriate: "standby server", "... the server parses query ...", 
"server configuration", "server process".


Because of its ambiguous usage, the definition of "server" must clarify 
the allowed meanings. What's about:


--

server: Depending on the context, the term *server* denotes:

 * An IP-port which is offered by any OS.   ?
 * A - possibly virtualized - machine
 * An abbreviation for the slightly longer term "database(s)/cluster
   server"  ??? this will support the readability, but not the clarity ???
 * More ?

--

The term "host" is used mainly for IP configuration "host name", "host 
address" and in the context of compiling "host language", "host 
variable". These are clear situations and can be defined easily.





Re: Add A Glossary

2020-05-18 Thread Jürgen Purtz

On 17.05.20 17:28, Alvaro Herrera wrote:

On 2020-May-17, Erik Rijkers wrote:


On 2020-05-17 08:51, Alvaro Herrera wrote:

I don't think that's the general understanding of those terms.  For all
I know, they*are*  synonyms, and there's no specific term for "the
fluctuating objects" as you call them.  The instance is either running
(in which case there are processes and RAM) or it isn't.

For what it's worth, I've also always understood 'instance' as 'a running
database'.  I admit it might be a left-over from my oracle years:

https://docs.oracle.com/cd/E11882_01/server.112/e40540/startup.htm#CNCPT601

There, 'instance' clearly refers to a running database.  When that database
is stopped, it ceases to be an instance.

I've never understood it that way, but I'm open to having my opinion on
it changed.  So let's discuss it and maybe gather opinions from others.

I think the terms under discussion are just

* cluster
* instance
* server

We don't have "host" (I just made it a synonym for server), but perhaps
we can add that too, if it's useful.  It would be good to be consistent
with historical Postgres usage, such as the initdb usage of "cluster"
etc.

Perhaps we should not only define what our use of each term is, but also
explain how each term is used outside PostgreSQL and highlight the
differences.  (This would be particularly useful for "cluster" ISTM.)


In fact, we have reached a point where we don't have a common 
understanding of a group of terms. I'm sure that we will meet some more 
situations like this in the future. Such discussions, subsequent 
decisions, and implementations in the docs are necessary to gain a solid 
foundation - primarily for newcomers (what is my first motivation) as 
well as for more complex discussions among experts. Obviously, each of 
us will include his previous understanding of terms. But we also should 
be open to sometimes revise old terms.


Here are my two cents.

cluster/instance: PG (mainly) consists of a group of processes that 
commonly act on shared buffers. The processes are very closely related 
to each other and with the buffers. They exist altogether or not at all. 
They use a common initialization file and are incarnated by one command. 
Everything exists solely in RAM and therefor has a fluctuating nature. 
In summary: they build a unit and this unit needs to have a name of 
itself. In some pages we used to use the term *instance* - sometimes in 
extended forms: *database instance*, *PG instance*, *standby instance*, 
*standby server instance*, *server instance*, or *remote instance*.  For 
me, the term *instance* makes sense, the extensions *standby instance* 
and *remote instance* in their context too.


The next essential component is the data itself. It is organized as a 
group of databases plus some common management information (global, 
pg_wal, pg_xact, pg_tblspc, ...). The complete data must be treated as a 
whole because the management information concerns all databases. Its 
nature is different from the processes and shared buffers. Of course, 
its content changes, but it has a steady nature. It even survives a 
'power down'. There is one command to instantiate a new incarnation of 
the directory structure and all files. In summary, it's something of its 
own and should have its own name. 'database' is not possible because it 
consists of databases and other things. My favorite is *cluster*; 
*database cluster* is also possible.


server/host: We need a term to describe the underlying hardware 
respectively the virtual machine or container, where PG is running. I 
suggest to use both *server* and *host*. In computer science, both have 
their eligibility and are widely used. Everybody understands 
*client/server architecture* or *host* in TCP/IP configuration. We 
cannot change such matter of course. I suggest to use both depending on 
the context, but with the same meaning: "real hardware, a container, or 
a virtual machine".


--

Jürgen Purtz

(PS: I added the docs mailing list)




Re: dot dot dot dot

2020-05-13 Thread Jürgen Purtz




The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/12/tutorial-sql-intro.html
Description:

 is wrong.


The four dots are not really a part of the path. Don't use them 
literally. They indicate that you should use your local path instead. 
This may differ from installation to installation.


Kind regards, Jürgen






Re: Documentation - chapter 52, system catalogs

2020-05-06 Thread Jürgen Purtz

On 06.05.20 00:01, Tom Lane wrote:

I don't deny that we have a problem here: in the website rendering,
that text tends to be pushed down out of sight by the chapter's
sub-table-of-contents.  But that issue exists for every chapter
that's got more than a couple of sections.  We shouldn't hack
around it for just these two chapters.  Chapter 9 and Appendix F
are additional examples where this is a fairly urgent issue.

Generic solutions are always better than individual ones.

I wonder if we should just drop the sub-table-of-contents material.
(I'm assuming DocBook can be coerced to do that; but since the PDF
output has no such material, it seems like it ought to be possible.)
If we drop TOCs, we loose the automatically created links. As a 
substitute we would need tables like in example 51.1. So it's again an 
individual solution.

Or ... is there a way to postpone it to the bottom of the page,
ie just before the first , instead of having it in front
of the chapter preface?

The same issue exists for the sub-sub-tables-of-contents for s,
though it's less bad because few of those have grown enormous lists
of 's.


Swapping TOC and content may work in such cases, but for me it seems to 
be a hard work with xslt.


A real generic solution would be an adaption of the HTML output to the 
PDF output: two columns with a collapsible menu containing all TOC 
information in the left one ('outline' in PDF) and nothing than content 
in the right one. But this is a huge change of the look-and-feel as well 
as for all technical stuff: HTML, CSS, bootstrap, Javascript, Ajax(?), ... .


--

Jürgen Purtz






Re: Documentation - chapter 52, system catalogs

2020-05-05 Thread Jürgen Purtz

On 04.05.20 17:23, PG Doc comments form wrote:

The following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/11/catalogs-overview.html
Description:

The documentation for chapter 52 does not clearly identify the schema
associated with the system catalogs in either the chapter header or overview
sections; had I not stumbled across a reference in the documentation for the
postgres_fdw to the pg_catalog search path, I would have been unable to
reference the content in those catalogs via foreign data wrappers, and
although the majority of the content has been exposed via the
information_schema views and tables, there remain a few elements of interest
that appear to only exist in the pg_catalog qualified content.


I think that the topic "fdw" is only a symptom, not the real reason for 
the confusion. We can improve the chapters "System Catalog" and 
"Information Schema" in general by centralizing some already existing 
paragraphs in the "Overview" chapter and adding some more explanations. 
The attached patch contains:


- for "System Catalog": moving paragraphs from bottom of 51. to top of 
51.1. (in PG 11 it is chapter 52); explanation that "System Catalog" is 
a synonym for a concrete schema and its tables.


- for "Information Schema": moving paragraphs from bottom of 36. to 
middle of 36.1. ; an explanation that it relies on the system catalog; 
change the title of 36.1. to "Overview" in correlation with "System 
Catalog".


--

Jürgen Purtz

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index ce33df9e58..8854bda05f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -6,6 +6,23 @@
 
  System Catalogs
 
+ 
+  Overview
+
+  
+   The term system catalog is a widely used
+   synonym for the schema pg_catalog and
+   its tables. They are listed in 
+   with links to more detailed documentation.
+  
+
+  
+   Most system catalogs are copied from the template database during
+   database creation and are thereafter database-specific. A few
+   catalogs are physically shared across all databases in a cluster;
+   these are noted in the descriptions of the individual catalogs.
+  
+
   
The system catalogs are the place where a relational database
management system stores schema metadata, such as information about
@@ -23,21 +40,6 @@
of the system catalogs is ever decreasing.
   
 
- 
-  Overview
-
-  
-lists the system catalogs.
-   More detailed documentation of each catalog follows below.
-  
-
-  
-   Most system catalogs are copied from the template database during
-   database creation and are thereafter database-specific. A few
-   catalogs are physically shared across all databases in a cluster;
-   these are noted in the descriptions of the individual catalogs.
-  
-
   
System Catalogs
 
diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml
index 7a995a1b64..ea8561d9e5 100644
--- a/doc/src/sgml/information_schema.sgml
+++ b/doc/src/sgml/information_schema.sgml
@@ -7,46 +7,8 @@
   information schema
  
 
- 
-  The information schema consists of a set of views that contain
-  information about the objects defined in the current database.  The
-  information schema is defined in the SQL standard and can therefore
-  be expected to be portable and remain stable  unlike the system
-  catalogs, which are specific to
-  PostgreSQL and are modeled after
-  implementation concerns.  The information schema views do not,
-  however, contain information about
-  PostgreSQL-specific features; to inquire
-  about those you need to query the system catalogs or other
-  PostgreSQL-specific views.
- 
-
- 
-  
-   When querying the database for constraint information, it is possible
-   for a standard-compliant query that expects to return one row to
-   return several.  This is because the SQL standard requires constraint
-   names to be unique within a schema, but
-   PostgreSQL does not enforce this
-   restriction.  PostgreSQL
-   automatically-generated constraint names avoid duplicates in the
-   same schema, but users can specify such duplicate names.
-  
-
-  
-   This problem can appear when querying information schema views such
-   as check_constraint_routine_usage,
-   check_constraints, domain_constraints, and
-   referential_constraints.  Some other views have similar
-   issues but contain the table name to help distinguish duplicate
-   rows, e.g., constraint_column_usage,
-   constraint_table_usage, table_constraints.
-  
- 
-
-
  
-  The Schema
+  Overview
 
   
The information schema itself is a schema named
@@ -57,6 +19,44 @@
the space savings achieved by that are minuscule).
   
 
+  
+   The information schema consists of a set of views to tables
+   in the schema pg_catalog, so it contains
+   information about the objects defined in the current database. T

Re: Additional Chapter for Tutorial

2020-04-30 Thread Jürgen Purtz

On 29.04.20 21:12, Peter Eisentraut wrote:


I don't see this really as belonging into the tutorial.  The tutorial 
should be hands-on, how do you get started, how do you get some results.


Yes, the tutorial should be a short overview and give instructions how 
to start. IMO the first 4 sub-chapters fulfill this expectation. Indeed, 
the fifth (VACUUM) is extensive and offers many details.


During the inspection of the existing documentation I recognized that 
there are many details about VACUUM, AUTOVACUUM, all of their parameters 
as well as their behavior. But the information is spread across many 
pages: Automatic Vacuuming, Client Connection Defaults, Routine 
Vacuuming, Resource Consumption, VACUUM. Even for a person with some 
pre-knowledge it is hard to get an overview how this fits together and 
why things are solved in exactly this way. In the end we have very good 
descriptions of all details but I miss the 'big picture'. Therefore I 
summarized central aspects and tried to give an answer to the question 
'why is it done in this way?'. I do not dispute that the current version 
of the page is not adequate for beginners. But at some place we should 
have such a summary about vacuuming and freezing.


How to proceed?

- Remove the page and add a short paragraph to the MVCC page instead.

- Cut down the page to a tiny portion.

- Divide it into two parts: a) a short introduction and b) the rest 
after a statement like 'The following offers more details and parameters 
that are more interesting for an experienced user than for a beginner. 
You can easily skip it.'



Your material is more of an overview of the whole system.  What's a 
new user supposed to do with that?


When I dive into a new subject, I'm more interested in its architecture 
than in its details. We shall offer an overview about the major PG 
components and strategies to beginners.



--

Jürgen Purtz






Re: An XSLT example script

2020-04-21 Thread Jürgen Purtz

On 21.04.20 20:10, Peter Eisentraut wrote:

On 2020-04-14 10:03, Jürgen Purtz wrote:

The example "XSLT Stylesheet for Converting SQL/XML Output to HTML" is
tagged as , but it isn't a figure, it's an example script.


It's not an example, it's an actual script that you are supposed to use.


The
PDF output contains lists for examples, figures and tables and shows it
in the wrong list. We should change the tagging.


Why is it wrong to make this a figure?

Sorry, I don't understand. Do we speak about the same position in the 
documentation?


https://www.postgresql.org/docs/12/functions-xml.html#XSLT-XML-HTML

Scripts usually are tagged as "screen", "programmlisting", "example", 
"synopsis/function" (for functions) - but never as a figure. The 
introductory text explicitly says "As an example ...". Therefor 
"example" seems to be appropriate. IMO "programmlisting" is also possible.


And: there is no single graphical element like a line, a circle, a 
color, or an UML-symbol.



--

Jürgen Purtz






Re: Additional Chapter for Tutorial

2020-04-20 Thread Jürgen Purtz

On 17.04.20 20:40, Erik Rijkers wrote:

Very good stuff, and useful. I think.

I mean that but nevertheless here is a lot of comment :)

(I didn't fully compile as docs, just read the 'text' from the patch 
file)


Thanks. Added nearly all of the suggestions.


--

Jürgen Purtz

diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml
index ae5f3fac75..965eb751c0 100644
--- a/doc/src/sgml/advanced.sgml
+++ b/doc/src/sgml/advanced.sgml
@@ -1,7 +1,7 @@
 
 
  
-  Advanced Features
+  Advanced SQL Features
 
   
Introduction
diff --git a/doc/src/sgml/architecture.sgml b/doc/src/sgml/architecture.sgml
new file mode 100644
index 00..81dedc90b4
--- /dev/null
+++ b/doc/src/sgml/architecture.sgml
@@ -0,0 +1,449 @@
+
+
+ 
+  The Architecture
+
+  
+   Every DBMS implements basic strategies to achieve a fast and
+   robust system. This chapter provides an overview of what
+   techniques PostgreSQL uses to
+   reach this aim.
+  
+
+  
+   Collaboration of Processes, RAM, and Files
+   
+As is a matter of course, in a client/server architecture
+clients do not have direct access to the database. Instead,
+they merely send requests to the server side and receives
+according information from there. In the case of
+PostgreSQL, at the server
+side there is one process per client, the so-called
+Backend process.
+It acts in close cooperation with the
+Instance which
+is a group of tightly coupled other server side processes plus a
+Shared Memory
+area.
+   
+
+   
+At start time, an instance is initiated by the
+Postmaster process.
+It loads the configuration files, allocates the
+Shared Memory
+and starts the comprehensive network of processes:
+Background Writer,
+Checkpointer,
+WAL Writer,
+WAL Archiver,
+Autovacuum processes,
+Statistics Collector,
+Logger, and more.
+ visualizes
+main aspects of their collaboration.
+   
+
+   
+Architecture
+
+ 
+  
+  
+ 
+ 
+  
+  
+ 
+
+   
+
+   
+Whenever a client application tries to connect to a
+database, this request is handled
+in a first step by the Postmaster
+process. It checks the authorization, starts a
+new Backend process,
+and instructs the client application to connect to it. All further
+client requests go to this process and are handled
+by it.
+   
+
+   
+Client requests (SELECT, UPDATE, ...) usually leads to the
+necessity to read or write some data. In a first attempt
+the client's Backend process tries
+to get the information out of Shared
+Memory. This Shared
+Memory is a mirror of parts of the
+heap and
+index files. Because files are
+much larger than memory, it's likely that
+the desired information is not (completely) available
+in the RAM. In this case the Backend process
+ must transfer additional file pages to
+Shared Memory. Files are physically
+organized in pages. Every transfer between files and
+RAM is performed in units of complete pages, retaining
+their size and layout.
+   
+
+   
+Reading file pages is notedly slower than reading
+RAM. This is the main motivation for the existence of
+Shared Memory. As soon as one
+of the Backend processes has done
+the job those pages are available for all other
+Backend processes for direct
+access in RAM.
+   
+
+   
+Shared Memory is limited in size.
+Sooner or later it becomes necessary to overwrite old RAM
+pages. As long as the content of such pages hasn't
+changed this is not a problem. But in
+Shared Memory also write
+actions take place
+- performed by any of the Backend
+processes (or an
+autovacuum process,
+or other processes). Such modified pages are called
+dirty pages.
+Before dirty pages can be overwritten,
+they must be transferred back to disk. This is a two-step process.
+   
+
+   
+First, whenever the content of a page changes, a
+WAL record is created out
+of the delta-information (difference between old and
+new content) and stored in another area of the
+Shared Memory. These
+WAL records are read by the
+WAL Writer process,
+which runs in parallel to the Backend
+processes and all other processes of
+the Instance. It writes
+the continuously arising WAL records to
+the end of the current
+WAL file.
+Because of the sequential nature of this writing, it is much
+faster than the more or less random access
+to data files with heap
+and index information.
+As mentioned, this WAL-writing happens
+in an independent process. Nevertheless all
+WAL records created out of one
+dirty page must be transferred
+to disk before the dirty page
+itself can be transferred to disk (???).
+   
+
+   
+Second, the transfer of dirty pages
+from Shared Memory

Additional Chapter for Tutorial

2020-04-17 Thread Jürgen Purtz
Our documentation explains many details about commands, tools, 
parameters in detail and with high accuracy. Nevertheless my impression 
is that we neglect the 'big picture': why certain processes exist and 
what their relation to each other is, summary of strategies, 
visualization of key situations, ... . People with mature knowledge 
don't miss this information because they know all about it. But for 
beginners such explanations would be a great help. In the time before 
GSoD 2019 we had similar discussions.


I plan to extend over time the part 'Tutorial' by an additional chapter 
with an overview about key design decisions and basic features. The 
typical audience should consist of persons with limited pre-knowledge in 
database systems and some interest in PostgreSQL. In the attachment you 
find a patch for the first sub-chapter. Subsequent sub-chapters should 
be: MVCC, transactions, VACUUM, backup, replication, ... - mostly with 
the focus on the PostgreSQL implementation and not on generic topics 
like b-trees.


There is a predecessor of this patch: 
https://www.postgresql.org/message-id/974e09b8-edf5-f38f-2fb5-a5875782ffc9%40purtz.de 
. In the meanwhile its glossary-part is separated and commited. The new 
patch contains two elements: textual descriptions and 4 figures. My 
opinion concerning figures is set out in detail in the previous patch.


Kind regards, Jürgen Purtz


diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml
index ae5f3fac75..965eb751c0 100644
--- a/doc/src/sgml/advanced.sgml
+++ b/doc/src/sgml/advanced.sgml
@@ -1,7 +1,7 @@
 
 
  
-  Advanced Features
+  Advanced SQL Features
 
   
Introduction
diff --git a/doc/src/sgml/architecture.sgml b/doc/src/sgml/architecture.sgml
new file mode 100644
index 00..bbed0c3466
--- /dev/null
+++ b/doc/src/sgml/architecture.sgml
@@ -0,0 +1,449 @@
+
+
+ 
+  The Architecture
+
+  
+   Every DBMS implements basic strategies to achieve a fast and
+   robust system. This chapter provides an overview of what
+   techniques PostgreSQL uses to
+   reach this aim.
+  
+
+  
+   Collabortion of Processes, RAM, and Files
+   
+As is a matter of course, in a client/server architecture
+clients do not have direct access to the database. Instead,
+they merely send requests to the server side and receives
+resulting information from there. In the case of
+PostgreSQL, at the server
+side there is one process per client, the so-called
+Backend process.
+He acts in close cooperation with the
+Instance which
+is a group of tightly coupled other server side processes plus a
+Shared Memory
+area.
+   
+
+   
+At start time, an instance is initiated by the
+Postmaster process.
+He loads the configuration files, allocates the
+Shared Memory
+and starts the comprehensive network of processes:
+Background Writer,
+Checkpointer,
+WAL Writer,
+WAL Archiver,
+Autovacuum processes,
+Statistics Collector,
+Logger, and more.
+ visualizes
+main aspects of their collaboration.
+   
+
+   
+Architecture
+
+ 
+  
+  
+ 
+ 
+  
+  
+ 
+
+   
+
+   
+Whenever a client application tries to connect to a
+database, this request is handled
+in a first step by the Postmaster
+process. He checks the authorization, starts a
+new Backend process,
+and instructs the client application to connect to him. All further
+client requests go to this process and are handled
+by him.
+   
+
+   
+Client requests (SELECT, UPDATE, ...) usually leads to the
+necessity to read or write some data. In an first attempt
+the client's Backend process tries
+to get the information out of Shared
+Memory. This Shared
+Memory is a mirror of parts of the
+heap and
+index files. Because files are
+much huger than memory, it's likely that
+the desired information is not (completely) available
+in the RAM. In this case the Backend process
+ must transfer additional file pages to
+Shared Memory. Files are physically
+organized in pages. Every transfer between files and
+RAM is performed in units of complete pages while retaining
+their size and layout.
+   
+
+   
+Reading file pages is notedly slower than reading
+RAM. This is the main motivation for the existence of
+Shared Memory. As soon as one
+of the Backend processes has done
+the job those pages are available for all other
+Backend processes for direct
+access in RAM.
+   
+
+   
+Shared Memory is limited in size.
+Sooner or later it is necessary to overwrite old RAM
+pages. As long as the content of such pages hasn't
+changed this is not a problem. But in
+Shared Memory also write
+actions take place
+- performed by any of the Backend
+processes (or an
+autovacuum process,
+or other processes). Such modified pages

Re: Getting our tables to render better in PDF output

2020-04-12 Thread Jürgen Purtz


On 11.04.20 22:51, Tom Lane wrote:

Yet another possibility is to use the docbook tags:
func()
int.
Then we can define the desired formatting for such markup (similar to
..).

I looked into this.  It appears that  is fairly tightly tied
to C function declaration syntax, plus it sounds like it might get
deprecated in future docbook versions.


funcsynopsis, funcdef, function, ... keeps valid in Docbook 5, see: 
https://tdg.docbook.org/tdg/5.1/funcsynopsis.html . There is even an 
option to distinguish between K and ANSI style during rendering: 



Kind regards, Jürgen Purtz



Re: documentation pdf build fail (HEAD)

2020-03-24 Thread Jürgen Purtz

Ubuntu 18.04: no crash, but possibly a side effect:

[INFO] FOUserAgent - Rendered page #2685.
[INFO] FOUserAgent - Rendered page #2686.
[INFO] FOUserAgent - Rendered page #2687.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"function-encode" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"function-decode" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-altercollation-notes-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-altertable-notes-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-createaggregate-notes-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-createindex-storage-parameters-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-createindex-concurrently-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-createtable-storage-parameters-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-createtable-compatibility-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-declare-notes-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-inserting-params-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-on-conflict-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-prepare-examples-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-reindex-concurrently-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-with-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-from-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-where-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-groupby-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-having-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-window-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-select-list-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-distinct-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-union-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-intersect-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-except-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-orderby-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-limit-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"sql-for-update-share-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"pg-dump-examples-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"app-psql-patterns-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"app-psql-variables-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"app-psql-interpolation-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"app-psql-prompting-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"app-psql-environment-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"app-psql-examples-title" found.
[WARN] FOUserAgent - Destination: Unresolved ID reference 
"app-postgres-single-user-title" found.

[INFO] FOUserAgent - Rendered page #2688.
[WARN] FOUserAgent - Page 226: Unresolved ID reference "function-decode" 
found.
[WARN] FOUserAgent - Page 226: Unresolved ID reference "function-encode" 
found.


Kind regards, J. Purtz





Re: Users/Roles do not align.

2020-02-06 Thread Jürgen Purtz




There's a few things wrong about this part anyway- namely that we've got
FDWs now, and there's certainly other cluster-wide things that exist
beyond the specific items listed, so I wonder if perhaps we should just
stop trying to list everything here.


Inspiring answer! After some inquiry I became aware, that we do not have 
only 2 levels of 'belong-to' but 3: tables, views, operators, and much 
more objects belong to a schema; schemata, extensions (e.g. FDW), and 
more(?) belong to a database; databases, roles, tablespaces, and more 
belong to a cluster. Two aspects of 'belong-to' are: object names are 
unique within their level, and objects are automatically known 
everywhere within their level.


Information about such dependencies and their consequences is spread 
across different chapters of the documentation and the System Catalog. 
Of course the chapter about roles/users is not suitable to explain the 
details. But it's important to know the hierarchy, it shut be summarized 
somewhere.


Kind regards,  J. Purtz






Re: Users/Roles do not align.

2020-02-03 Thread Jürgen Purtz



Based on this, I believe Section 5.9 should read:
`A PostgreSQL database cluster contains one or more named databases. Roles
are shared across the entire cluster, but no other data is shared across
databases. Any given client connection to the server can access only the
data in a single database, the one specified in the connection request.`


imo the following is a more precise wording:

'A cluster contains three or more named databases ('template0', 
'template1', 'postgres', ...). Roles, which are users or groups, see 
Chapter 21, - as well as database names and tablespace definitions - are 
shared across the entire cluster. No other data is shared across 
databases or schemas. Any given client connection to the server can 
access only the data in a single database, the one specified in the 
connection request. If it has the necessary privileges, the connection 
can access all schemas within this database.'


And the last sentence of the paragraph behind 'Note' shall be extended, 
because schemas are rigidly separated from each other - only the access 
to different schemas is easily done from a single connection.


'Unlike _access to_ databases, _access to_ schemas is not rigidly 
separated: a _connection_ can access objects in any of the schemas in 
the database they are connected to, if they have privileges to do so.'



J. Purtz



Re: doc: Add anchors in create_table.sgml

2020-01-03 Thread Jürgen Purtz

We also converted from SGML to XML in the meantime, so you can
probably make do with a standard XML parser without having to write a
custom SGML one.


Use XML tools with care! Some of our XML files are not well formed 
because they contain more than one root element.


Kind regards, Jürgen Purtz.






Shrinking SVG (Again)

2019-11-17 Thread Jürgen Purtz
Our current policy says that we use the two tools Graphviz (drawing 
Graphs) and Ditaa (drawing ascii art) to generate SVG. The two tools are 
highly specialized and bound to their original purpose. I strongly 
believe that we will not be able to visualize more complex situations 
like this 
http://1.bp.blogspot.com/--CG_kXBFWzw/UgW5JROpbDI/AHc/9V8iwO1qluQ/s1600/arch.bmp 
or that: 
https://severalnines.com/sites/default/files/blog/node_5266/image5.jpg 
without using other tools, which are more flexible.


In the past, we had many discussions on this topic and didn't come to a 
final decision in favor on one single or a few number of such tools. 
However, we came to the important and very helpful conclusion to store 
two files per graphic: the original tool-specific (simple to read, 
diff-able) and the generated SVG version.


Out in the wild there is the open source tool SVGO 
https://github.com/svg/svgo. Its purpose is the generation of a small, 
easy to read SVG file out of any other SVG file by removing all the 
clutter, tools typically generate. In my opinion it's not perfect, but 
it has some strong advantages: the generated files are really small and 
nearly free of unnecessary information; it is flexible because a lot of 
parameters control the degree of optimization; it is extensible: missing 
features may be added by ourselves.



The SVGO tool is a node.js application. At Ubuntu you can install it:
  sudo apt install npm nodejs
  sudo npm install -g svgo
run it, e.g.:
  svgo test1_ink.svg --pretty --indent=2 --precision=2 --multipass 
--disable=removeComments  \

 --output test1_svgo.svg
Please test it with your own examples.


In contrast to our previous policy the generated file is the diff-able, 
whereas the original one is hard to read.



Jürgen Purtz



Re: Make 'Genetic Algorithm' a real figure

2019-07-19 Thread Jürgen Purtz
This is an adoption of the genetic-algorithm.gv file to the style of the 
gin.gv file: same font, similar header, use of TAB instead of SPACE.


Kind regards

Jürgen Purtz


diff --git a/doc/src/sgml/geqo.sgml b/doc/src/sgml/geqo.sgml
index 5120dfbb42..5a52fb46db 100644
--- a/doc/src/sgml/geqo.sgml
+++ b/doc/src/sgml/geqo.sgml
@@ -84,7 +84,8 @@
 Through simulation of the evolutionary operations recombination,
 mutation, and
 selection new generations of search points are found
-that show a higher average fitness than their ancestors.
+that show a higher average fitness than their ancestors. 
+illustrates in which order and how long the three steps are processed.

 

@@ -94,48 +95,13 @@
 non-random (better than random).

 
-   
-Structured Diagram of a Genetic Algorithm
-
-
- 
-  
-   
-P(t)
-generation of ancestors at a time t
-   
-
-   
-P''(t)
-generation of descendants at a time t
-   
-  
- 
-
-
-
-+=+
-|  Algorithm GA  |
-+=+
-| INITIALIZE t := 0   |
-+=+
-| INITIALIZE P(t) |
-+=+
-| evaluate FITNESS of P(t)|
-+=+
-| while not STOPPING CRITERION do |
-|   +-+
-|   | P'(t)  := RECOMBINATION{P(t)}   |
-|   +-+
-|   | P''(t) := MUTATION{P'(t)}   |
-|   +-+
-|   | P(t+1) := SELECTION{P''(t) + P(t)}  |
-|   +-+
-|   | evaluate FITNESS of P''(t)  |
-|   +-+
-|   | t := t + 1  |
-+===+=+
-
+   
+Structure of a Genetic Algorithm
+
+ 
+  
+ 
+

   
 
diff --git a/doc/src/sgml/images/Makefile b/doc/src/sgml/images/Makefile
index 1e7993020b..f9e356348b 100644
--- a/doc/src/sgml/images/Makefile
+++ b/doc/src/sgml/images/Makefile
@@ -3,6 +3,7 @@
 # see README in this directory about image handling
 
 ALL_IMAGES = \
+	genetic-algorithm.svg \
 	gin.svg \
 	pagelayout.svg
 
diff --git a/doc/src/sgml/images/genetic-algorithm.gv b/doc/src/sgml/images/genetic-algorithm.gv
index e69de29bb2..520c44bc65 100644
--- a/doc/src/sgml/images/genetic-algorithm.gv
+++ b/doc/src/sgml/images/genetic-algorithm.gv
@@ -0,0 +1,51 @@
+digraph {
+
+	layout=dot;
+
+	// default values
+	node  [shape=box, label="", fontname="sans-serif", style=filled, fillcolor=white, width=2.2, fontsize=8];
+	graph [fontname="sans-serif"]; // must be specified separately
+	edge  [fontname="sans-serif"]; // must be specified separately
+
+	// an unobtrusive background color
+	pad="1.0, 0.5";
+	bgcolor=whitesmoke;
+
+	// layout of edges and nodes
+	splines=ortho;
+	nodesep=0.3;
+	ranksep=0.3;
+
+	// label="Structure of a Genetic Algorithm" fontsize=26;
+
+	// nodes 
+	a1[label="INITIALIZE t := 0"];
+	a2[label="INITIALIZE P(t)"];
+	a3[label="evaluate FITNESS of P(t)"];
+	a4[shape="diamond", label="STOPPING CRITERION"; width=4];
+
+	// connect 'end' node with 'a9' node (bottom of figure)
+	{
+	  rank=same;
+	  a9[label="t := t + 1"];
+	  // end-symbol similar to UML notation
+	  end[shape=doublecircle, label="end", width=0.5];
+	}
+
+	a5[label="P'(t) := RECOMBINATION{P(t)}"];
+	a6[label="P''(t) := MUTATION{P'(t)}"];
+	a7[label="P(t+1) := SELECTION{P''(t) + P(t)}"];
+	a8[label="evaluate FITNESS of P''(t)"];
+
+	// edges
+	a1 -> a2 -> a3 -> a4;
+	a4 -> a5[xlabel="false   ", fontsize=10];
+	a4 -> end[xlabel="true  ",  fontsize=10];
+	a5 -> a6 -> a7 -> a8 -> a9;
+	a4 -> a9 [dir=back]; // forces 'true'-edge to right side of figure
+
+	// explain the notation
+	expl [shape=plaintext, fontsize=10, width=3.2, fillcolor=whitesmoke,
+	  label="P(t) generation of ancestors at a time t\nP''(t) generation of descendants at a time t"];
+
+}


Re: Improvement of GIN figure

2019-07-13 Thread Jürgen Purtz



On Tue, Jul 9, 2019 at 3:22 PM Tatsuo Ishii  wrote:

I agree that the existing colors look awful, and that muted pastel
colors would work better. Doesn't seem like something that should
happen at the cost of making the diagram less informative, though.

I am not an expert in the area but I think we should cosider people
with color disability.

Good point. I think that that shouldn't be too hard to mostly get right.

It's good that the diagrams will already work with a screen reader.

Due to the discussions of recent days as well as some improvements of 
the graphiz know-how, the graphic is subject to many changes: different 
colors (variations of 'PG blue', and possibly helpful for people with 
color vision deficiency), a different font (adaption to the font in PG's 
documentation), changes in the meaning and explanation of nodes (as a 
result of discussion with Oleg Bartunov), introduction of a - hopefully 
unobtrusive - background color (to circumvent graphic from text), use of 
DOT syntax.








diff --git a/doc/src/sgml/images/README b/doc/src/sgml/images/README
index 07c4580255..c451d4bca4 100644
--- a/doc/src/sgml/images/README
+++ b/doc/src/sgml/images/README
@@ -13,7 +13,8 @@ involve diffable source files.
 These tools are acceptable:
 
 - Graphviz (https://graphviz.org/)
-- Ditaa (http://ditaa.sourceforge.net/)
+- Ditaa (https://github.com/stathissideris/ditaa), version 0.11 or
+  higher. Previous versions does not support SVG.
 
 We use SVG as the format for integrating the image into the ultimate
 output formats of the documentation, that is, HTML, PDF, and others.
diff --git a/doc/src/sgml/images/gin.gv b/doc/src/sgml/images/gin.gv
index 097e91029a..35520736ec 100644
--- a/doc/src/sgml/images/gin.gv
+++ b/doc/src/sgml/images/gin.gv
@@ -1,85 +1,101 @@
 digraph "gin" {
+
 layout=dot;
-node [label="", shape=box, style=filled, fillcolor=gray, width=1.4];
 
-m1 [label="meta page"];
+// default values
+node  [shape=box, label="", fontname="sans-serif", style=filled, fillcolor=lightgray, width=1.4];
+graph [fontname="sans-serif"]; // must be specified separately
+edge  [fontname="sans-serif"]; // must be specified separately
+
+// an unobtrusive background color
+pad="1.0, 0.5";
+bgcolor=whitesmoke;
 
+// The top node
+m1 [label="meta page", style=filled, fillcolor=beige];
+
+// other nodes
 subgraph cluster01 {
-label="entry tree";
-subgraph egroup1 {
-rank=same;
-e1;
-}
-subgraph egroup2 {
-rank=same;
-e2 -> e3 -> e4;
-}
-subgraph egroup3 {
-rank=same;
-e5 -> e6 -> e7 -> e8 -> e9;
-}
-e1 -> e4;
-e1 -> e3;
-e1 -> e2;
-e2 -> e5;
-e2 -> e6;
-e3 -> e7;
-e4 -> e8;
-e4 -> e9;
-
-e6 [fillcolor=green, label="posting list"];
-e8 [fillcolor=green, label="posting list"];
-e9 [fillcolor=green, label="posting list"];
+ bgcolor=white;
+ label="entry tree";
+ subgraph egroup1 {
+  rank=same;
+  e1;
+  }
+  subgraph egroup2 {
+   rank=same;
+   e2 -> e3 -> e4;
+  }
+  subgraph egroup3 {
+   rank=same;
+   e5 -> e6 -> e7 -> e8 -> e9;
+  }
+  e1 -> e4;
+  e1 -> e3;
+  e1 -> e2;
+  e2 -> e5;
+  e2 -> e6;
+  e3 -> e7;
+  e4 -> e8;
+  e4 -> e9;
+
+  e5 [fillcolor=cyan];
+  e6 [fillcolor=cyan];
+  e7 [fillcolor=cyan];
+  e8 [fillcolor=cyan];
+  e9 [fillcolor=cyan];
+
 }
 
 subgraph cluster02 {
-label="posting tree";
-subgraph pgroup1 {
-rank=same;
-p1;
-}
-subgraph pgroup2 {
-rank=same;
-p2 -> p3;
-}
-p1 -> p2;
-p1 -> p3;
-
-p2 [fillcolor=green, label="heap ptr"];
-p3 [fillcolor=green, label="heap ptr"];
+ bgcolor=white;
+ label="posting tree";
+ subgraph pgroup1 {
+  p1;
+ }
+ subgraph pgroup2 {
+  rank=same;
+

Make 'Genetic Algorithm' a real figure

2019-07-08 Thread Jürgen Purtz
Currently the 'figure' of a genetic algorithm is created with some kind 
of 'ascii art'. The patch makes it a real figure.


Kind regards

Jürgen Purtz





diff --git a/doc/src/sgml/geqo.sgml b/doc/src/sgml/geqo.sgml
index 5120dfbb42..5a52fb46db 100644
--- a/doc/src/sgml/geqo.sgml
+++ b/doc/src/sgml/geqo.sgml
@@ -84,7 +84,8 @@
 Through simulation of the evolutionary operations recombination,
 mutation, and
 selection new generations of search points are found
-that show a higher average fitness than their ancestors.
+that show a higher average fitness than their ancestors. 
+illustrates in which order and how long the three steps are processed.

 

@@ -94,48 +95,13 @@
 non-random (better than random).

 
-   
-Structured Diagram of a Genetic Algorithm
-
-
- 
-  
-   
-P(t)
-generation of ancestors at a time t
-   
-
-   
-P''(t)
-generation of descendants at a time t
-   
-  
- 
-
-
-
-+=+
-|  Algorithm GA  |
-+=+
-| INITIALIZE t := 0   |
-+=+
-| INITIALIZE P(t) |
-+=+
-| evaluate FITNESS of P(t)|
-+=+
-| while not STOPPING CRITERION do |
-|   +-+
-|   | P'(t)  := RECOMBINATION{P(t)}   |
-|   +-+
-|   | P''(t) := MUTATION{P'(t)}   |
-|   +-+
-|   | P(t+1) := SELECTION{P''(t) + P(t)}  |
-|   +-+
-|   | evaluate FITNESS of P''(t)  |
-|   +-+
-|   | t := t + 1  |
-+===+=+
-
+   
+Structure of a Genetic Algorithm
+
+ 
+  
+ 
+

   
 
diff --git a/doc/src/sgml/images/Makefile b/doc/src/sgml/images/Makefile
index 1e7993020b..f9e356348b 100644
--- a/doc/src/sgml/images/Makefile
+++ b/doc/src/sgml/images/Makefile
@@ -3,6 +3,7 @@
 # see README in this directory about image handling
 
 ALL_IMAGES = \
+	genetic-algorithm.svg \
 	gin.svg \
 	pagelayout.svg
 
diff --git a/doc/src/sgml/images/genetic-algorithm.gv b/doc/src/sgml/images/genetic-algorithm.gv
index e69de29bb2..170cc9ee28 100644
--- a/doc/src/sgml/images/genetic-algorithm.gv
+++ b/doc/src/sgml/images/genetic-algorithm.gv
@@ -0,0 +1,44 @@
+digraph {
+layout=dot;
+splines=ortho;
+nodesep=0.3;
+ranksep=0.3;
+node [shape=box; width=2.2; fontsize=8];
+pad="1.0, 0.5";
+bgcolor=grey95;
+
+// title
+// label="Structure of a Genetic Algorithm" fontsize=26;
+
+
+// nodes 
+a1[label="INITIALIZE t := 0"];
+a2[label="INITIALIZE P(t)"];
+a3[label="evaluate FITNESS of P(t)"];
+a4[shape="diamond", label="STOPPING CRITERION"; width=4];
+
+// connect 'end' node with 'a9' node (bottom of figure)
+{
+  rank=same;
+  a9[label="t := t + 1"];
+  // end-symbol similar to UML notation
+  end[shape=doublecircle; label="end"; width=0.5];
+}
+
+a5[label="P'(t) := RECOMBINATION{P(t)}"];
+a6[label="P''(t) := MUTATION{P'(t)}"];
+a7[label="P(t+1) := SELECTION{P''(t) + P(t)}"];
+a8[label="evaluate FITNESS of P''(t)"];
+
+// edges
+a1 -> a2 -> a3 -> a4;
+a4 -> a5[xlabel="false   "; fontsize=10];
+a4 -> end[xlabel="true  ";  fontsize=10];
+a5 -> a6 -> a7 -> a8 -> a9;
+a4 -> a9 [dir=back]; // forces 'true'-edge to right side of figure
+
+// explain the notation
+expl [shape=plaintext; fontsize=10; width=3.2;
+  label="P(t) generation of ancestors at a time t\nP''(t) generation of descendants at a time t"];
+
+}
diff --git a/doc/src/sgml/images/genetic-algorithm.svg b/doc/src/sgml/images/genetic-algorithm.svg
index e69de29bb2..2db8bef230 100644
--- a/doc/src/sgml/images/genetic-algorithm.svg
+++ b/doc/src/sgml/images/genetic-algorithm.svg
@@ -0,0 +1,139 @@
+
+
+
+http://www.w3.org/2000/svg; xmlns:xlink="http://www.w3.org/1999/xlink; width="619pt" height="580pt" viewBox="0.00 0.00 619.00 580.00">
+
+%3
+
+
+
+a1
+
+INITIALIZE t := 0
+
+
+
+a2
+
+INITIALIZE P(t)
+
+
+
+a1-a2
+
+
+
+
+
+a3
+
+evaluate FITNESS of P(t)
+
+
+
+a2-a3
+
+
+
+
+
+a4
+
+STOPPING CRITERION
+
+
+
+a3-a4
+
+
+
+
+
+a9
+
+t := t + 1
+
+
+
+a4-a9
+
+
+
+
+
+end
+
+
+end
+
+
+
+a4-end
+
+
+true  
+
+
+
+a5
+
+P'(t) := RECOMBINATION{P(t)}
+
+
+
+a4-a5
+
+
+false 

Re: TOC: List of Figures

2019-07-07 Thread Jürgen Purtz

Peter Eisentraut  writes:

On 2019-07-02 10:13, Jürgen Purtz wrote:

After the integration of figures into the documentation it may be
helpful to extent the TOC with a 'List of Figures'. Any opinion?

If yes: The same for 'List of Tables' and 'List of Examples'?

I have never found these useful.  What would you use them for?  Who goes
to the documentation (or any other book-like material) thinking, I'm in
the mood to look at some tables today, let's see what's in them.

+1.  I have a vague recollection that we once had such lists in the
TOC (because that was the default behavior for whatever toolchain we
were then using) and later took them out precisely because they were
useless.  Not sure how a list of figures would be any more useful.

regards, tom lane

The two main reasons for using the documentation are 'reference for 
concepts and syntax' and 'learning'. For the first issue figures are 
less important - at least as BNF is not presented graphically. The 
second issue is important for all persons which are not very familiar 
with PG. Pedagogues tell us, that learning is done best with different 
media types, where typical methods are direct speech, videos (see the 
huge number of learning videos at YouTube), graphics, exercises, reading 
manuals. In essence, they recommend some kind of 'media changes' within 
each lesson. Currently our documentation is not much more than a manual, 
the rising element 'graphics' is and will be tiny small (unless we 
integrate BNF). To support the didactic method of media changes we 
should promote graphics with a summarizing list at a prominent place in 
front of the manual.


Kind regards, Jürgen Purtz





Improvement of GIN figure

2019-07-07 Thread Jürgen Purtz
To increase the consistency between the surrounding, explaining text and 
the figure, there are some changes and additions to the texts within the 
figure.


Also: Added a hint in README for ditaa developers.


Jürgen Purtz


diff --git a/doc/src/sgml/images/README b/doc/src/sgml/images/README
index 07c4580255..c451d4bca4 100644
--- a/doc/src/sgml/images/README
+++ b/doc/src/sgml/images/README
@@ -13,7 +13,8 @@ involve diffable source files.
 These tools are acceptable:
 
 - Graphviz (https://graphviz.org/)
-- Ditaa (http://ditaa.sourceforge.net/)
+- Ditaa (https://github.com/stathissideris/ditaa), version 0.11 or
+  higher. Previous versions does not support SVG.
 
 We use SVG as the format for integrating the image into the ultimate
 output formats of the documentation, that is, HTML, PDF, and others.
diff --git a/doc/src/sgml/images/gin.gv b/doc/src/sgml/images/gin.gv
index 097e91029a..d0442abf15 100644
--- a/doc/src/sgml/images/gin.gv
+++ b/doc/src/sgml/images/gin.gv
@@ -27,9 +27,12 @@ digraph "gin" {
 e4 -> e8;
 e4 -> e9;
 
-e6 [fillcolor=green, label="posting list"];
-e8 [fillcolor=green, label="posting list"];
-e9 [fillcolor=green, label="posting list"];
+e5 [fillcolor=green4, label="pointer to\nposting tree"];
+e6 [fillcolor=green,  label="posting list"];
+e7 [fillcolor=green4, label="pointer to\nposting tree"];
+e8 [fillcolor=green,  label="posting list"];
+e9 [fillcolor=green,  label="posting list"];
+
 }
 
 subgraph cluster02 {
@@ -45,8 +48,8 @@ digraph "gin" {
 p1 -> p2;
 p1 -> p3;
 
-p2 [fillcolor=green, label="heap ptr"];
-p3 [fillcolor=green, label="heap ptr"];
+p2 [fillcolor=green, label="posting list"];
+p3 [fillcolor=green, label="posting list"];
 }
 
 subgraph cluster03 {
@@ -56,7 +59,7 @@ digraph "gin" {
 p4;
 }
 
-p4 [fillcolor=green, label="heap ptr"];
+p4 [fillcolor=green, label="posting list"];
 }
 
 subgraph cluster04 {
@@ -72,14 +75,14 @@ digraph "gin" {
 p5 -> p6;
 p5 -> p7;
 
-p6 [fillcolor=green, label="heap ptr"];
-p7 [fillcolor=green, label="heap ptr"];
+p6 [fillcolor=green, label="posting list"];
+p7 [fillcolor=green, label="posting list"];
 }
 
 subgraph cluster05 {
 label="pending list";
-node [style=filled, fillcolor=red];
-n1 -> n2 -> n3 -> n4;
+node [fillcolor=red, label="pend. entries"];
+n1 -> n2 -> n3 -> n4 -> n5;
 }
 
 m1 -> e1;
@@ -88,6 +91,8 @@ digraph "gin" {
 e7 -> p5;
 m1 -> n1;
 
-e5 [style=filled, fillcolor=green4];
-e7 [style=filled, fillcolor=green4];
+// An explanation for readers
+expl [shape=plaintext, fillcolor=green, label="  A  is a list of heap pointers (row IDs)  "];
+p6 -> expl [style=invis];
+
 }


Re: TOC: List of Figures

2019-07-03 Thread Jürgen Purtz
After the integration of figures into the documentation it may be 
helpful to extent the TOC with a 'List of Figures'. Any opinion?


If yes: The same for 'List of Tables' and 'List of Examples'?

There is a simple way to enable this feature: change line 56 of 
stylesheet-html-common.xsl to: "book toc,title,figure,table,example". 
As shown in a previous thread this leads to an ugly swelling of the 
TOC similar to the formerly handling of release notes - especially for 
tables and examples -, see attachment 1.


The alternative is a downshift of the postings by one level, see 
attachment 2. How to realize this behavior is shown in attachment 3.


I understood that the majority of voters recommend the 'list of figures' 
but refuses lists for 'tables' and 'examples' (please consider that PDF 
generates all of them by default). The attached patch realizes the 
desired behavior. In 'func.sgml' there is also a modification because 
the given example isn't a figure, it's an example/programlisting.


Jürgen Purtz


diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 3a8581d205..538a9c45fd 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11444,7 +11444,7 @@ table2-mapping
 converted into other XML-based formats.

 
-   
+   
 XSLT Stylesheet for Converting SQL/XML Output to HTML
 
-   
+   
   
  
 
diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml
index 3e115f1c76..626b19bfe8 100644
--- a/doc/src/sgml/postgres.sgml
+++ b/doc/src/sgml/postgres.sgml
@@ -279,6 +279,11 @@
   
   
 
+  
+   List of Figures
+   
+  
+
  
 
  
diff --git a/doc/src/sgml/stylesheet-html-common.xsl b/doc/src/sgml/stylesheet-html-common.xsl
index 9edce52a10..d730f469a4 100644
--- a/doc/src/sgml/stylesheet-html-common.xsl
+++ b/doc/src/sgml/stylesheet-html-common.xsl
@@ -81,6 +81,13 @@ set   toc,title
   
 
 
+
+
+  
+
+
+  
+
 
 
 


TOC: List of Figures

2019-07-02 Thread Jürgen Purtz
After the integration of figures into the documentation it may be 
helpful to extent the TOC with a 'List of Figures'. Any opinion?


If yes: The same for 'List of Tables' and 'List of Examples'?

There is a simple way to enable this feature: change line 56 of 
stylesheet-html-common.xsl to: "book toc,title,figure,table,example". As 
shown in a previous thread this leads to an ugly swelling of the TOC 
similar to the formerly handling of release notes - especially for 
tables and examples -, see attachment 1.


The alternative is a downshift of the postings by one level, see 
attachment 2. How to realize this behavior is shown in attachment 3.



diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml
index 3e115f1c76..6a1b5228d0 100644
--- a/doc/src/sgml/postgres.sgml
+++ b/doc/src/sgml/postgres.sgml
@@ -281,6 +281,22 @@
 
  
 
+ 
+  Lists of Figures, Tables and Examples
+  
+   List of Figures
+   
+  
+  
+   List of Tables
+   
+  
+  
+   List of Examples
+   
+  
+ 
+
  
  
 
diff --git a/doc/src/sgml/stylesheet-html-common.xsl b/doc/src/sgml/stylesheet-html-common.xsl
index 9edce52a10..eb04549fb9 100644
--- a/doc/src/sgml/stylesheet-html-common.xsl
+++ b/doc/src/sgml/stylesheet-html-common.xsl
@@ -81,6 +81,27 @@ set   toc,title
   
 
 
+
+
+  
+
+
+  
+
+
+
+  
+
+
+  
+
+
+
+  
+
+
+  
+
 
 
 



Re: First SVG graphic

2019-03-28 Thread Jürgen Purtz

On 28.03.19 00:41, Tatsuo Ishii wrote:

Is it possible to generate figure indexes?


This is an additional topic and there was a brief discussion on 7 
February (and in older conversations) regarding this. It concerns list 
of figures, tables, and examples. I intend to make a proposal around May 
2019.


Kind regards

Jürgen Purtz






Re: First SVG graphic

2019-03-11 Thread Jürgen Purtz

On 11.03.19 12:13, Peter Eisentraut wrote:

I played with this further.  My conclusion is that SVG as a source
format is not workable.  Aside from the tooling issues that are being
discussed, which might be solvable, I think it's not the right level of
abstraction.  The problem is that it is a *vector* format, but not a
*graph* or *chart* format.  You can draw boxes and lines and text, but
nothing in the format indicates how they are connected.


Yes, SVG knows nothing about higher level concepts. That's one of the 
reasons why there are tools to create SVG, eg. Inkscape with the 
connectors feature (it was not used in the examples but obviously it's 
possible.)



I looked at some alternatives.  I rebuilt the GIN image using Graphviz
and the page layout image using Ditaa.


That was the heart of the proposal: Since  we couldn't find consensus 
about a single tool in the long lasting discussion, everybody shall use 
his favourite tool and deliver the original source plus a SVG version.


Kind regards,

Jürgen Purtz




Re: First SVG graphic

2019-03-09 Thread Jürgen Purtz

On 08.03.19 18:55, Peter Eisentraut wrote:

How do you get from the Inkscape SVG files to the what you call
"optimized SVG" files?

I loaded the gin_inkscape.svg file into Inkscape, saved it back out as
"Plain SVG", but the resultant file did not look at all similar to the
existing gin.svg.

Inkscape supports a lot of different formats. "Plain SVG" and "Optimized 
SVG" are two of them - and they differ in some aspects. "Plain SVG" is 
in some respect garrulous and generates unnecessary elements: 
namespaces, metadata, IDs, XML-entities. "Optimized SVG" avoids them, as 
long as you follow the hints described in 
https://wiki.postgresql.org/wiki/SVG_using_Inkscape. Conclusion: People 
working with Inkscape shall avoid "Plain SVG" and use "Optimized SVG".


Nevertheless Peter points out an important aspect. If you generate 
"Optimized SVG" out of the uploaded files, they look different in 
comparison to their original version. As mentioned in 
https://wiki.postgresql.org/wiki/SVG_using_Inkscape#Manual_corrections, 
it is possible - and sometimes advised - to manually 'optimize' the 
generated "Optimized SVG" file. I used this technique and tried to make 
the files good readable for everyone, especially because we are in an 
early phase of SVG integration. But apparently I have overstressed this 
possibility. You will find the following differences between the 
uploaded 'additionally manually optimized' files and the generated 
"Optimized SVG" files:


   - newlines

  -  element. This is an Inkscape optimization, which 
generates a group out of those elements, which are in a sequence and use 
- by chance - the same stroke-attribute. This optimization step is 
purely formal and in my opinion misuses the meaning of the  element 
as a cramp for logically related elements, e.g. to resize or transform 
them simultaneously.


As an example of such differences I append two files: gin.svg 
("Optimized SVG" plus my manually optimizations; the originally uploaded 
file) and gin_pure_opt.svg (pure "Optimized SVG").


What is your opinion? Should we renounce the additional manual step and 
use only the pure "Optimized SVG" format? This will increase the 
'diff-ablility', which may be valuable in the long term. But direct 
readability of the files suffers more or less.


Kind regards, Jürgen




Re: First SVG graphic

2019-02-23 Thread Jürgen Purtz

On 20.02.19 17:28, Peter Eisentraut wrote:

On 2019-02-15 11:58, Jürgen Purtz wrote:

The graphic for dump/restore is transferred from 'pg_dump utility'
chapter to the 'backup and restore' chapter in sgml/backup.sgml. In this
chapter I made a lot of textual changes to explain the relation between
pg_dump, pg_restore and psql in more detail. Especially I tried to
introduce a more stringent use of terms, eg: avoiding 'archive' because
this term is used with a different meaning in the PITR chapter. (It
would be a good idea to do the same for the description of the pg_dump
utility.) Because I'm not a native English speaker, feel free to correct
my wording.

I think we should have some in-tree documentation about how to edit
images, probably at doc/src/sgml/svg/README.  You had published some of
that documentation earlier in this thread, and whatever is relevant to
developers should be included in the tree.  I'm specifically wondering
about the relationship between the *.svg and the inkscape/*.svg files.


Good idea. README is created. Also chapter "J.4. Documentation 
Authoring" has some enhancements to differentiate between text and 
graphic. It's also possible that we will see more sub-chapters in J.4 in 
the future, e.g.: "Creating Mathematical Formulas".


Kind regards, Jürgen


diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 8326c7c673..12997e903c 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -57,6 +57,8 @@ GENERATED_SGML = version.sgml \
 
 ALLSGML := $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) $(GENERATED_SGML)
 
+ALLSVG := $(wildcard $(srcdir)/svg/*.svg)
+
 
 ##
 ## Man pages
@@ -125,10 +127,12 @@ endif
 
 html: html-stamp
 
-html-stamp: stylesheet.xsl postgres.sgml $(ALLSGML)
+html-stamp: stylesheet.xsl postgres.sgml $(ALLSGML) $(ALLSVG)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) $(wordlist 1,2,$^)
 	cp $(srcdir)/stylesheet.css html/
+	$(MKDIR_P) html/svg
+	cp $(ALLSVG) html/svg
 	touch $@
 
 htmlhelp: stylesheet-hh.xsl postgres.sgml $(ALLSGML)
@@ -136,7 +140,7 @@ htmlhelp: stylesheet-hh.xsl postgres.sgml $(ALLSGML)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(wordlist 1,2,$^)
 
 # single-page HTML
-postgres.html: stylesheet-html-nochunk.xsl postgres.sgml $(ALLSGML)
+postgres.html: stylesheet-html-nochunk.xsl postgres.sgml $(ALLSGML) $(ALLSVG)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) -o $@ $(wordlist 1,2,$^)
 
@@ -160,7 +164,7 @@ postgres.pdf:
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --stringparam paper.type USletter -o $@ $(wordlist 1,2,$^)
 
-%.pdf: %.fo
+%.pdf: %.fo $(ALLSVG)
 	$(FOP) -fo $< -pdf $@
 
 
@@ -169,7 +173,7 @@ postgres.pdf:
 ##
 
 epub: postgres.epub
-postgres.epub: postgres.sgml $(ALLSGML)
+postgres.epub: postgres.sgml $(ALLSGML) $(ALLSVG)
 	$(XMLLINT) --noout --valid $<
 	$(DBTOEPUB) $<
 
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index a73fd4d044..f192ea3b52 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -16,7 +16,7 @@
   There are three fundamentally different approaches to backing up
   PostgreSQL data:
   
-   SQL dump
+   Dump into SQL INSERT syntax or a binary format
File system level backup
Continuous archiving
   
@@ -25,23 +25,28 @@
  
 
  
-  SQL Dump
+  Dump
 
   
-   The idea behind this dump method is to generate a file with SQL
-   commands that, when fed back to the server, will recreate the
-   database in the same state as it was at the time of the dump.
+   The idea behind this dump technique is to generate a file
+   that, when fed back to the server, will recreate the
+   database in the same state as it was at the time of the dump generation.
+  
+
+  
+   Creating the Dump
+  
PostgreSQL provides the utility program
 for this purpose. The basic usage of this
command is:
 
 pg_dump dbname  dumpfile
 
-   As you see, pg_dump writes its result to the
+   As you see, pg_dump in this basic form writes its result to the
standard output. We will see below how this can be useful.
-   While the above command creates a text file, pg_dump
-   can create files in other formats that allow for parallelism and more
-   fine-grained control of object restoration.
+   While the above command creates a plain-text SQL script, pg_dump
+   can create files also in other, dense binary formats that allow for parallelism and more
+   fine-grained control of object restoration, see: 
   
 
   
@@ -100,12 +105,13 @@ pg_dump dbname  ALTER TABLE.)
   
+  
 
   
Restoring the Dump
 

-Text files created by pg_dump are intended to
+Such plain-text SQL scripts created by pg_dump are intended to
 be read in by the psql program. The
 general command form to restore a dump is
 
@@ -121,8 +127,6 @@ psql d

Re: First SVG graphic

2019-02-15 Thread Jürgen Purtz
The graphic for dump/restore is transferred from 'pg_dump utility' 
chapter to the 'backup and restore' chapter in sgml/backup.sgml. In this 
chapter I made a lot of textual changes to explain the relation between 
pg_dump, pg_restore and psql in more detail. Especially I tried to 
introduce a more stringent use of terms, eg: avoiding 'archive' because 
this term is used with a different meaning in the PITR chapter. (It 
would be a good idea to do the same for the description of the pg_dump 
utility.) Because I'm not a native English speaker, feel free to correct 
my wording.


The list of figures as well as lists of other prominent elements are 
removed. 'lists in TOC' seems to be a topic of its own and will be 
resumed later.


Kind regards, Jürgen


diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 8326c7c673..12997e903c 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -57,6 +57,8 @@ GENERATED_SGML = version.sgml \
 
 ALLSGML := $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) $(GENERATED_SGML)
 
+ALLSVG := $(wildcard $(srcdir)/svg/*.svg)
+
 
 ##
 ## Man pages
@@ -125,10 +127,12 @@ endif
 
 html: html-stamp
 
-html-stamp: stylesheet.xsl postgres.sgml $(ALLSGML)
+html-stamp: stylesheet.xsl postgres.sgml $(ALLSGML) $(ALLSVG)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) $(wordlist 1,2,$^)
 	cp $(srcdir)/stylesheet.css html/
+	$(MKDIR_P) html/svg
+	cp $(ALLSVG) html/svg
 	touch $@
 
 htmlhelp: stylesheet-hh.xsl postgres.sgml $(ALLSGML)
@@ -136,7 +140,7 @@ htmlhelp: stylesheet-hh.xsl postgres.sgml $(ALLSGML)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(wordlist 1,2,$^)
 
 # single-page HTML
-postgres.html: stylesheet-html-nochunk.xsl postgres.sgml $(ALLSGML)
+postgres.html: stylesheet-html-nochunk.xsl postgres.sgml $(ALLSGML) $(ALLSVG)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) -o $@ $(wordlist 1,2,$^)
 
@@ -160,7 +164,7 @@ postgres.pdf:
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --stringparam paper.type USletter -o $@ $(wordlist 1,2,$^)
 
-%.pdf: %.fo
+%.pdf: %.fo $(ALLSVG)
 	$(FOP) -fo $< -pdf $@
 
 
@@ -169,7 +173,7 @@ postgres.pdf:
 ##
 
 epub: postgres.epub
-postgres.epub: postgres.sgml $(ALLSGML)
+postgres.epub: postgres.sgml $(ALLSGML) $(ALLSVG)
 	$(XMLLINT) --noout --valid $<
 	$(DBTOEPUB) $<
 
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index a73fd4d044..f192ea3b52 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -16,7 +16,7 @@
   There are three fundamentally different approaches to backing up
   PostgreSQL data:
   
-   SQL dump
+   Dump into SQL INSERT syntax or a binary format
File system level backup
Continuous archiving
   
@@ -25,23 +25,28 @@
  
 
  
-  SQL Dump
+  Dump
 
   
-   The idea behind this dump method is to generate a file with SQL
-   commands that, when fed back to the server, will recreate the
-   database in the same state as it was at the time of the dump.
+   The idea behind this dump technique is to generate a file
+   that, when fed back to the server, will recreate the
+   database in the same state as it was at the time of the dump generation.
+  
+
+  
+   Creating the Dump
+  
PostgreSQL provides the utility program
 for this purpose. The basic usage of this
command is:
 
 pg_dump dbname  dumpfile
 
-   As you see, pg_dump writes its result to the
+   As you see, pg_dump in this basic form writes its result to the
standard output. We will see below how this can be useful.
-   While the above command creates a text file, pg_dump
-   can create files in other formats that allow for parallelism and more
-   fine-grained control of object restoration.
+   While the above command creates a plain-text SQL script, pg_dump
+   can create files also in other, dense binary formats that allow for parallelism and more
+   fine-grained control of object restoration, see: 
   
 
   
@@ -100,12 +105,13 @@ pg_dump dbname  ALTER TABLE.)
   
+  
 
   
Restoring the Dump
 

-Text files created by pg_dump are intended to
+Such plain-text SQL scripts created by pg_dump are intended to
 be read in by the psql program. The
 general command form to restore a dump is
 
@@ -121,8 +127,6 @@ psql dbname  pg_dump for specifying
 the database server to connect to and the user name to use. See
 the  reference page for more information.
-Non-text file dumps are restored using the  utility.

 

@@ -239,12 +243,12 @@ psql -f dumpfile postgres
 Some operating systems have maximum file size limits that cause
 problems when creating large pg_dump output files.
 Fortunately, pg_dump can write to the standard
-output, so you can use standard Unix tools to work around this
+output, so you can use standard tools to work around this

Re: Tutorial section of documentation: enhancements needed

2019-02-12 Thread Jürgen Purtz

On 12.02.19 09:06, Lætitia Avrot wrote:
I think we really need to take care of beginners who would like to 
install postgres on their laptop to "give it a try".



+1





Re: First SVG graphic

2019-02-07 Thread Jürgen Purtz

On 07.02.19 12:06, Peter Eisentraut wrote:


First, let's fix some of these whitespace errors:

firstSvg_2.patch:677: trailing whitespace.
   

Done.


Let's not use mixed-case file names:

Inkscape/
PageLayout.svg
gin.svg
pgDump.svg

Done.

@@ -152,15 +156,15 @@ postgres.txt: postgres.html
  postgres.pdf:
   $(error Invalid target;  use postgres-A4.pdf or postgres-US.pdf as targets)

-%-A4.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
+%-A4.fo: stylesheet-fo.xsl %.sgml
   $(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
   $(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --stringparam paper.type A4 -o $@ 
$(wordlist 1,2,$^)

-%-US.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
+%-US.fo: stylesheet-fo.xsl %.sgml
   $(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
   $(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --stringparam paper.type USletter 
-o $@ $(wordlist 1,2,$^)

-%.pdf: %.fo
+%.pdf: %.fo $(ALLSGML) $(SVGSRC)
   $(FOP) -fo $< -pdf $@


This seems a bit wrong.  The .fo target does depend on ALLSGML.  The
.pdf target does not, but it presumably does depend on SVGSRC.
It's a transitive dependency: the pdf target is triggered after changes 
in svg (or sgml), this triggers the fo targets. Therefore it's not 
necessary to have svg (or sgml) dependencies at the fo level.

The variable name SVGSRC is a bit confusing.  What is it the source of?
The variable SVGSRC points to all svg-files, similar to ALLSGML which 
points to the sgml files. Whenever any of them changes, certain targets 
will fire.

@@ -209,7 +213,7 @@ check: postgres.sgml $(ALLSGML) check-tabs
  install: install-html install-man

  installdirs:
- $(MKDIR_P) '$(DESTDIR)$(htmldir)'/html $(addprefix '$(DESTDIR)$(mandir)'/man, 
1 3 $(sqlmansectnum))
+ $(MKDIR_P) '$(DESTDIR)$(htmldir)'/html/svg html/svg $(addprefix 
'$(DESTDIR)$(mandir)'/man, 1 3 $(sqlmansectnum))

  # If the install used a man directory shared with other applications, this 
will remove all files.
  uninstall:

html/svg is not an installation directory.  You need to create it
somewhere else.
Please help. I haven't understood the distinction between installation 
directory and DESTDIR. On the other hand, in the Makefile there is a - 
redundant - command within the html-stamp target: $(MKDIR_P) html/svg. 
But this will run frequently, which is not necessary.

+ 
+  Lists of Figures, Tables and Examples
+  
+   List of Figures
+   
+  
+  
+   List of Tables
+   
+  
+  
+   List of Examples
+   
+  
+ 

These ought to be created by the stylesheet.  We have probably turned
them off somewhere, so you should see where you can turn them on.


There is a simple mechanism to create those list of figures: change line 
55 of stylesheet-html-common.xsl to "book toc,title,*figure*". But the 
result is ugly - see attached screenshot. The list is out-of-line. 
Additionally, in the future we will have many figures (and examples and 
tables). This will lead to similar problems we actually faced with the 
release notes. The proposed solution moves this inflation of lists to 
deeper levels of the TOC. We can and have defined theirs layout within 
stylesheet-common.xsl.



diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..d31ee2d210 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -73,6 +73,21 @@
 architectures.


+  
+   pg_dump: Formats and Restore Proceedings

This doesn't work for man page output.

I think we should avoid putting images into reference pages.  This one
could perhaps go into the Backup chapter.


What is the problem here? Actually I don't have enough time to evaluate 
it in deep. If it is an urgent problem (I have seen that the 
commitfest-entry is tagged as "release 12") we shall shift the pg_dump 
figure to a later release.


I think that the Backup chapter isn't a good place for this because it 
does not explain the complete interaction between pg_dump, psql and restore.



Also, it should be linked to from somewhere.  An image that's just
floating around and not referred to in the text seems odd.

Also we tend to use hyphens instead of underscores for IDs.  (At some
point, underscores where not allowed.  I'm surprised that that's no
longer the case.)
Done. (But I hate the use of the minus-signs within any identifier of 
any language. For me it's an operator.)

   I also wouldn't put "_svg" into the ID.  The format
is irrelevant to the ID.

I changed _svg to _figure. I agree that the format is irrelevant. But it 
may be of interest, whether it is an id to a text or a figure.



Kind regards, Jürgen


diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 8326c7c673..1179d29096 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -57,6 +57,8 @@ GENERATED_SGML = version.sgml \
 
 ALLSGML := $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) $(GENERATED_SGML)
 
+SVGSRC := $(wildcard $(srcdir)/svg/*.svg)
+
 
 ##
 ## Man pages
@@ -125,10 +127,12 @@ endif
 
 html: html-stamp
 

Re: First SVG graphic

2019-02-05 Thread Jürgen Purtz

On 25.01.19 10:39, Peter Eisentraut wrote:


On 24/01/2019 00:53, Bruce Momjian wrote:

This is a pretty complicated issue with a lot of back-story.  I am
thinking Tatsuo or me will probably commit it before March.

Isn't that all the more reason to add it to the commitfest?


I added it to commitfest 2019-03.

In the attachment you find a new patch which extends the first one in 
regards of three topics:


 * Index terms for each graphic
 * List of figures in the TOC for HTML and PDF output
 * More or less a side-effect: List of tables and examples for HTML.
   This already exists for PDF.
 * (In a future version there may also be a 'List of Program-Listings'.
   But this requires some more actions.)

Kind regards, Jürgen Purtz


diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 8326c7c673..27d1e674f4 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -57,6 +57,8 @@ GENERATED_SGML = version.sgml \
 
 ALLSGML := $(wildcard $(srcdir)/*.sgml $(srcdir)/ref/*.sgml) $(GENERATED_SGML)
 
+SVGSRC := $(wildcard $(srcdir)/svg/*.svg)
+
 
 ##
 ## Man pages
@@ -125,10 +127,12 @@ endif
 
 html: html-stamp
 
-html-stamp: stylesheet.xsl postgres.sgml $(ALLSGML)
+html-stamp: stylesheet.xsl postgres.sgml $(ALLSGML) $(SVGSRC)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) $(wordlist 1,2,$^)
 	cp $(srcdir)/stylesheet.css html/
+	$(MKDIR_P) html/svg
+	cp $(SVGSRC) html/svg
 	touch $@
 
 htmlhelp: stylesheet-hh.xsl postgres.sgml $(ALLSGML)
@@ -136,7 +140,7 @@ htmlhelp: stylesheet-hh.xsl postgres.sgml $(ALLSGML)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(wordlist 1,2,$^)
 
 # single-page HTML
-postgres.html: stylesheet-html-nochunk.xsl postgres.sgml $(ALLSGML)
+postgres.html: stylesheet-html-nochunk.xsl postgres.sgml $(ALLSGML) $(SVGSRC)
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) -o $@ $(wordlist 1,2,$^)
 
@@ -152,15 +156,15 @@ postgres.txt: postgres.html
 postgres.pdf:
 	$(error Invalid target;  use postgres-A4.pdf or postgres-US.pdf as targets)
 
-%-A4.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
+%-A4.fo: stylesheet-fo.xsl %.sgml
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --stringparam paper.type A4 -o $@ $(wordlist 1,2,$^)
 
-%-US.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
+%-US.fo: stylesheet-fo.xsl %.sgml
 	$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
 	$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --stringparam paper.type USletter -o $@ $(wordlist 1,2,$^)
 
-%.pdf: %.fo
+%.pdf: %.fo $(ALLSGML) $(SVGSRC)
 	$(FOP) -fo $< -pdf $@
 
 
@@ -169,7 +173,7 @@ postgres.pdf:
 ##
 
 epub: postgres.epub
-postgres.epub: postgres.sgml $(ALLSGML)
+postgres.epub: postgres.sgml $(ALLSGML) $(SVGSRC)
 	$(XMLLINT) --noout --valid $<
 	$(DBTOEPUB) $<
 
@@ -209,7 +213,7 @@ check: postgres.sgml $(ALLSGML) check-tabs
 install: install-html install-man
 
 installdirs:
-	$(MKDIR_P) '$(DESTDIR)$(htmldir)'/html $(addprefix '$(DESTDIR)$(mandir)'/man, 1 3 $(sqlmansectnum))
+	$(MKDIR_P) '$(DESTDIR)$(htmldir)'/html/svg html/svg $(addprefix '$(DESTDIR)$(mandir)'/man, 1 3 $(sqlmansectnum))
 
 # If the install used a man directory shared with other applications, this will remove all files.
 uninstall:
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4930ec17f6..fe83c46d2c 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11228,17 +11228,18 @@ table2-mapping
 

 As an example of using the output produced by these functions,
- shows an XSLT stylesheet that
-converts the output of
+the following  shows
+an XSLT stylesheet that converts the output of
 table_to_xml_and_xmlschema to an HTML
 document containing a tabular rendition of the table data.  In a
 similar manner, the results from these functions can be
 converted into other XML-based formats.

 
-   
+   
 XSLT Stylesheet for Converting SQL/XML Output to HTML
-
-   
+   
+   
   
  
 
diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
index cc7cd1ed2c..75dbbf36df 100644
--- a/doc/src/sgml/gin.sgml
+++ b/doc/src/sgml/gin.sgml
@@ -453,6 +453,23 @@
   key values for different columns can be of different types.
  
 
+ 
+  GIN Overview
+  
+   index
+   GIN
+   GIN (Figure)
+  
+  
+   
+
+   
+   
+
+   
+  
+ 
+
  
   GIN Fast Update Technique
 
diff --git a/doc/src/sgml/postgres.sgml b/doc/src/sgml/postgres.sgml
index 96d196d229..06d270aed6 100644
--- a/doc/src/sgml/postgres.sgml
+++ b/doc/src/sgml/postgres.sgml
@@ -280,6 +280,22 @@
 
  
 
+ 
+  Lists of Figures, Tables and Examples
+  
+   List of Figures
+   
+  
+  
+   List of Tables
+   
+  
+  
+   List of Examples
+   
+  
+ 
+
  
  
 
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..d31ee2d210 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src

Re: First SVG graphic

2019-01-07 Thread Jürgen Purtz

My questions to the community are:

  * Does anyone has an idea how to generate single HTML file in the
actual situation?



Thanks to an additional template created by Alexander Lakhin, which 
extends the 'nochunk' stylesheet for SVG and MathML processing, it is 
now possible to create the "single HTML file" of our documentation 
including SVG. For me this is a working solution as long as we use 
Docbook 4. After the migration to Docbook 5, both languages as well as 
full namespace support will be natively included in Docbook.


Does anyone faced some more problems? Or can we start to include the 
three first SVG graphics into PG's documentation?


Kind regards

Jürgen Purtz



diff --git a/doc/src/sgml/stylesheet-html-nochunk.xsl b/doc/src/sgml/stylesheet-html-nochunk.xsl
index ffd2012e91..3a20a47bce 100644
--- a/doc/src/sgml/stylesheet-html-nochunk.xsl
+++ b/doc/src/sgml/stylesheet-html-nochunk.xsl
@@ -9,4 +9,23 @@
 
 
 
+
+  
+
+  
+
+  
+
+  
+
+http://www.w3.org/1998/Math/MathML; test="mml:*">
+  
+
+http://www.w3.org/2000/svg; test="svg:*">
+  
+
+  
+  
+
+
 


Re: First SVG graphic

2018-12-28 Thread Jürgen Purtz

I failed to generate the "single HTML file".

The default Makefile task, which creates multiple HTML files, works 
properly, because it confines itself to create links to SVG files. The 
SVG structure keeps hidden to the Docbook validation and processing - 
Docbook recognises only some additional links. What I have tried to 
generate a single HTML file is the use of xi:XInclude before validating 
and further processing. In this case the SVG structure is visible to 
Docbook.


In general it is possible to use SVG within Docbook 4.x, if you switch 
the doctype to


 "http://www.oasis-open.org/docbook/xml/svg/1.1CR1/dbsvg.dtd;  

This is an Docbook 4.x extension 
(https://docbook.org/specs/wd-docbook-svg-1.1cr1) for SVG-integration. 
But it's only a working draft from 2004, it never reached the status of 
an official OASIS standard. As far as I have seen, it works (with some 
limitations), if the SVG data is an integral part of the xml/sgml source 
file. I failed to combine it with xi:XInclude. In opposite to this the 
combination of xi:XInclude, SVG, and Docbook 5.1 works well. In my 
opinion this results from the fact, that the structure of Docbook 4.x is 
based on a DTD, whereas Docbook 5.x uses Relax-NG (and generates xsd 
files out of rng). DTDs natively are not namespace-aware, you must do 
some trickery to handle namespaces. Docbook 5.x is not only namespace 
aware, it natively includes definitions for SVG and other important 
standards like MathML.


My questions to the community are:

 * Does anyone has an idea how to generate single HTML file in the
   actual situation?
 * Shall we delay the SVG integration until we have switched to Docbook
   5.x? This task is a great step, but it must be done in any case,
   because Docbook 4.x is outdated since many years. Btw: Because of
   other problems (https://github.com/docbook/docbook/issues/74) it is
   likely that we cannot use 5.1 but have to wait for the upcoming
   release 5.2.

Kind regards

Jürgen Purtz




Re: First SVG graphic

2018-12-23 Thread Jürgen Purtz
a) The "Entry tree" and the "Posting trees" of the graphic "gin.svg" 
shows links not only from one tree-level to the next but also within 
each level from node to node. Is that correct?


b) Is it worth to visualize PG's tree-implementation in a separate 
graphic - or is it the same as in every other tree-implementation that 
you have learned in your academic studies? If yes: in which chapter?


Kind regards, Jürgen



Re: First SVG graphic

2018-12-18 Thread Jürgen Purtz
There are three wiki pages describing the procedure: general 
description, Inkscape specifics, colors. You can find them in the 
category SVG.


This mail has 3 SVG graphics attached, each in pure SVG and in Inksape 
SVG format. Furthermore there is a patch for the Makefile and the 
modifications to three sgml files, which are necessary to incorporate 
the graphics.


Kind regards

Jürgen Purtz


diff --git a/doc/src/sgml/Makefile b/doc/src/sgml/Makefile
index 74aac01c39..2433571358 100644
--- a/doc/src/sgml/Makefile
+++ b/doc/src/sgml/Makefile
@@ -131,10 +131,12 @@ endif
 
 html: html-stamp
 
-html-stamp: stylesheet.xsl postgres.sgml $(ALLSGML)
+html-stamp: stylesheet.xsl postgres.sgml $(ALLSGML) svg/*.svg
$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) 
$(wordlist 1,2,$^)
cp $(srcdir)/stylesheet.css html/
+   mkdir -p $(srcdir)/html/svg/
+   cp $(srcdir)/svg/*.svg html/svg/
touch $@
 
 htmlhelp: stylesheet-hh.xsl postgres.sgml $(ALLSGML)
@@ -142,7 +144,7 @@ htmlhelp: stylesheet-hh.xsl postgres.sgml $(ALLSGML)
$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(wordlist 1,2,$^)
 
 # single-page HTML
-postgres.html: stylesheet-html-nochunk.xsl postgres.sgml $(ALLSGML)
+postgres.html: stylesheet-html-nochunk.xsl postgres.sgml $(ALLSGML) svg/*.svg
$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) $(XSLTPROC_HTML_FLAGS) -o $@ 
$(wordlist 1,2,$^)
 
@@ -158,11 +160,11 @@ postgres.txt: postgres.html
 postgres.pdf:
$(error Invalid target;  use postgres-A4.pdf or postgres-US.pdf as 
targets)
 
-%-A4.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
+%-A4.fo: stylesheet-fo.xsl %.sgml $(ALLSGML) svg/*.svg
$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --stringparam paper.type A4 
-o $@ $(wordlist 1,2,$^)
 
-%-US.fo: stylesheet-fo.xsl %.sgml $(ALLSGML)
+%-US.fo: stylesheet-fo.xsl %.sgml $(ALLSGML) svg/*.svg
$(XMLLINT) $(XMLINCLUDE) --noout --valid $(word 2,$^)
$(XSLTPROC) $(XMLINCLUDE) $(XSLTPROCFLAGS) --stringparam paper.type 
USletter -o $@ $(wordlist 1,2,$^)
 
@@ -175,7 +177,7 @@ postgres.pdf:
 ##
 
 epub: postgres.epub
-postgres.epub: postgres.sgml $(ALLSGML)
+postgres.epub: postgres.sgml $(ALLSGML) svg/*.svg
$(XMLLINT) --noout --valid $<
$(DBTOEPUB) $<
 
diff --git a/doc/src/sgml/gin.sgml b/doc/src/sgml/gin.sgml
index cc7cd1ed2c..34e4eabaad 100644
--- a/doc/src/sgml/gin.sgml
+++ b/doc/src/sgml/gin.sgml
@@ -453,6 +453,18 @@
   key values for different columns can be of different types.
  
 
+ 
+  
+   
+
+
+   
+   
+
+   
+  
+ 
+
  
   GIN Fast Update Technique
 
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2015410a42..ea2681d707 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -73,6 +73,17 @@ PostgreSQL documentation
architectures.
   
 
+  
+   
+
+ 
+
+
+ 
+
+   
+  
+
   
When used with one of the archive file formats and combined with
pg_restore,
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 8ef2ac8010..144c1deeef 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -775,6 +775,18 @@ data. Empty in ordinary tables.
 
 
 
+ 
+  
+   
+
+
+   
+   
+
+   
+  
+ 
+
  
 
   The first 24 bytes of each page consists of a page header


Re: First SVG graphic

2018-11-30 Thread Jürgen Purtz
I take the reactions of the last days as a strong consent to go on with 
the effort to integrate graphics into the documentation and use SVG as 
the language which creates such graphics. Also the proposed parallel 
handling of two SVG files - a rich but tool-specific version (optional 
and not normative) and a poor tool-independent version (mandatory and 
normative) - for the same graphic seems to be accepted. The community 
agrees that this way is not optimal because the use of *different 
SVG-tools* will lead to unnecessary problems - but there is no consensus 
about tools.


What shall we do next:

 * I will create one or more wiki pages where the procedure is
   described. Everybody can extend this pages or contribute to their
   discussion sites. The pages will be found in the category
   'Documentation' and its subcategory 'SVG' (to be created).
 * Actually, we have the very simple example 'PageLayout.svg' and an
   example of medium complexity 'gin.svg'. For testing purposes we
   shall have a third one of high complexity and with many different
   graphical elements. Can someone send such an example - as a
   screenshot or in any other format?
 * I want to engage everybody to identify important issues of PG and
   visualize them (similar to Oleg's proposal). We will have a -
   possibly long lasting - period of experiments with different
   examples. I think, it's necessary that we make our experiences with
   different tools and proceedings (one person creates a graphic,
   another one contributes changes, using the same or a different tool,
   ...). Those examples shall not be pure academic use cases. They
   shall reflect real situations with the expectation to be included
   into the documentation - one day or another.
 * In the initial phase, it may be helpful to do some centralized
   clearings on the first SVG source files. 'Copy' is widely used
   and the first examples will have the meaning of a lighthouse.
 * I will contact our web-team to discuss style-guide related issues.


Kind regards, Jürgen




Re: First SVG graphic

2018-11-28 Thread Jürgen Purtz
After one week no response at all? Neither positive nor negative. It 
seems that the community has little interest in the SVG issue. Or in my 
suggestion?


Jürgen Purtz




Re: "System roles" mentioned in psql documentation

2018-08-23 Thread Jürgen Purtz
This inconsistency is part of the more general problem that we miss a 
chapter, where our basic terms like 'database', 'cluster', 'segment', 
'catalog', 'schema', ... are explicitly defined.


Kind regards

Jürgen Purtz





Re: Images in the official documentation

2018-07-22 Thread Jürgen Purtz
Because I strongly support the involvement of SVG into our 
documentation, I welcome every activity to establish this goal. The mail 
to which Pavel had replied 
(https://www.postgresql.org/message-id/4ea1bacb-02ca-e967-31d7-d2a6db30abff%40purtz.de) 
contained the two graphics /PageLayout.svg/ "Page Layout" and 
/pgDump.svg/ "pg_dump, psql, pg_restore". Actually they are *not* part 
of our documentation. They are only part of a proof-of-concept to 
generate SVG and include the result into HTML and PDF output. The two 
examples can be used as references for comparisons with any tool.


Kind regards, Jürgen


On 20.07.2018 18:14, Pavel Golub wrote:

Hello, Jürgen.

You wrote:


JP> On 19.07.2018 14:06, Pavel Golub wrote:

I disagree. From what I heard, GraphViz is the winner for now. I can
give you my two cents: plantuml is another good choice

JP> Ok, please give us an example - possibly the two previous graphics.

Fair enough. Let's try different formats. Sorry, what exactly previous
graphics we're talking about?

JP> Kind regards, Jürgen










Re: Images in the official documentation

2018-07-19 Thread Jürgen Purtz



On 19.07.2018 14:06, Pavel Golub wrote:

I disagree. From what I heard, GraphViz is the winner for now. I can
give you my two cents: plantuml is another good choice


Ok, please give us an example - possibly the two previous graphics.

Kind regards, Jürgen





Re: Appendix A. PostgreSQL Error Codes

2018-04-28 Thread Jürgen Purtz
The SQL standard sorts the SQLSTATE table in a first step according to 
the *textual representation* of class (first two byte) and in a second 
step to the textual representation of subclass (next 3 byte) - keeping 
the text '(no subclass)' for subclass '000' on top. Obviously for some 
people this 'textual' sort order is relevant.


For my personal use I strongly prefer a pure 'numeric' sort order of 
SQLSTATE. Therefore I created a Perl program, which generates the SGML 
file out of errcode.txt and published it in 
https://www.postgresql.org/message-id/d69ce4f1-dd4e-61c1-070e-25f662e9c622%40purtz.de 
. This was a follow-up action after a discussion about illegal SQLSTATE 
values in PG, see: 
https://www.postgresql.org/message-id/11558.1513819695%40sss.pgh.pa.us.


PG documentation uses a numeric sort order for class and a semantic sort 
order for subclass.


Which one of the three is best? Here is my judgement: For technical 
oriented people the first approach is horrible because it sorts not only 
the subclass-text but also the class-text. This scrambles important and 
less important values to arbitrary positions. For me, sorting of the 
numeric value of class in a first step is essential. The third approach 
may be good for persons which work on the improvement of the PG source 
code. But they can refer to errcodes.txt. The primary readers of our 
documentation are users which work in roles as DBA or application 
developer. I doubt that they are interested in the semantic order of 
subclass. But I do not know whether they prefer numeric to text or text 
to numeric translation.


Jürgen Purtz


On 27.04.2018 17:16, Tom Lane wrote:


Bruce Momjian <br...@momjian.us> writes:

On Fri, Apr 27, 2018 at 10:08:12AM +, PG Doc comments form wrote:

Would it be possible to present the codes in ascending sequence?

Uh, I am guessing this is the order listed in the SQL standard.  How
would you order them differently?  Have the letters only at the end?

There was some discussion of this a couple months ago, which I'm too lazy
to search the archives for right now, but it trailed off without any
agreement on what to do.

There are at least three different ordering principles that might be
applied here: numeric by code (well, perhaps more like "ASCII sort
order"); alphabetical by exception name (but probably still keeping the
000 category codes at the top); or semantic grouping (ie try to keep
related errors together).  It looks to me like all three of these have
been applied in different places ;-).

One idea that might help is to present two tables with different sort
orders, say strict numeric and strict alphabetical.  (If we did that,
I'd be inclined to leave errcodes.txt alone and put the sorting
responsibility on the script that converts it to SGML.)

One question that has to be asked is what we think the use-case for this
table is at all.  Different use-cases result in different ideas about the
best ordering.

regards, tom lane






Re: Images in the official documentation

2018-02-25 Thread Jürgen Purtz
As an addition to my mail from January 2016 concerning graphics 
(https://www.postgresql.org/message-id/568A9148.30303%40purtz.de) I 
propose to use SVG (after switching to XML) - but not an SVG which is 
generated by Inkscape or similar tools. Those editors generate very ugly 
and chatty commands. This form is not easy to read or understand. 
Therefore we shall use nothing but a simple text editor and write every 
line by our self. The process is divided into two parts:


As a basis we shall develop an SVG library containing a bunch of 
"atomic" symbols of simple graphical elements (rectangle, arrow, ...) up 
to complex elements (magnetic disc, laptop, cloud, UML-elements, ...). 
The SVG routines creating those symbols shall accept parameters for 
position, size, rotation, colour, ... . This library shortens the 
individual SVG files, it ensures a consistent rendering of common 
graphical elements, it is diff-able, and it will reach a stable state - 
some day.


The real graphics shall use the elements of the library and add 
individual SVG elements. The rules for this part are the same as above: 
create SVG commands with vi (or similar), store it in git.


If such an approach works (we must distribute the docs across a wide 
range of different systems, a proof-of-system is necessary) and the 
community accepts my proposal, I would like to work on the library-part 
- starting after finishing my actual project in about 6 weeks from now. 
The attached file contains a very first draft as of Jan. 2016.


Kind regards
Jürgen Purtz



On 23.02.2018 22:14, Daniel Westermann wrote:

Hi %,

I am working with PostgreSQL documentation quite a few years now and I 
am almost happy. What I think is completely missing (especially if you 
compare to commercial product documentation) are pictures that 
illustrate a topic, e.g. the relation of 
instance->database->user/role->schema->objects.


Is there an agreement not to include that pictures for any reason? I 
can not promise that I find time for that in the near future but if 
that will be appreciated I am willing to spend time on that to make 
the documentation even better.


Regards
Daniel