Re: Does 'instead of delete' trigger support modification of OLD

2019-11-11 Thread Liudmila Mantrova


> On Nov 8, 2019, at 0:26, Bruce Momjian wrote:
> 
> First, notice "only", which was missing from the later sentence:
> 
>For INSERT and UPDATE
>operations [only], the trigger may modify the
>NEW row before returning it.
> 
> which I have now added with my applied patch to all supported releases. 
> 

Hi Bruce, 

I happened to browse recent documentation-related commits and I didn’t see this 
patch in REL_12_STABLE. Judging by the commit message, it should be applied 
there too.
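
For reference, a minimal sketch of the behavior this sentence documents
(all names are illustrative):

CREATE TABLE t (id int, label text);
CREATE VIEW v AS SELECT id, label FROM t;

-- An INSTEAD OF trigger may modify NEW for INSERT and UPDATE
-- operations before returning it.
CREATE FUNCTION v_ins() RETURNS trigger LANGUAGE plpgsql AS $$
BEGIN
    NEW.label := upper(NEW.label);   -- modifying NEW is allowed here
    INSERT INTO t VALUES (NEW.id, NEW.label);
    RETURN NEW;                      -- the returned row reflects the change
END;
$$;

CREATE TRIGGER v_instead INSTEAD OF INSERT ON v
    FOR EACH ROW EXECUTE PROCEDURE v_ins();

INSERT INTO v VALUES (1, 'foo') RETURNING label;   -- returns 'FOO'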



Re: JSONPATH documentation

2019-09-25 Thread Liudmila Mantrova

On 9/25/19 12:08 AM, Peter Eisentraut wrote:

On 2019-09-23 00:03, Tom Lane wrote:

While we're whining about this, I find it very off-putting that
the jsonpath stuff was inserted in the JSON functions section
ahead of the actual JSON functions.  I think it should have
gone after them, because it feels like a barely-related interjection
as it stands.  Maybe there's even a case that it should be
its own , after the "functions-json" section.

I'd just switch the sect2's around.


As more SQL/JSON functionality gets added, I believe a separate sect1 is 
likely to be more justified. However, for v12 I'd vote for moving sect2 
down. The patch is attached; it also fixes the ambiguous sentence that 
raised questions in this thread.


--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 432dbad868..6cb5d6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11646,14 +11646,14 @@ table2-mapping
   

 
- SQL/JSON path expressions
- (see ).
+ PostgreSQL-specific functions and operators for JSON
+ data types (see ).
 


 
- PostgreSQL-specific functions and operators for JSON
- data types (see ).
+ SQL/JSON path expressions
+ (see ).
 

   
@@ -11665,1748 +11665,1748 @@ table2-mapping
 see .
   
 
- 
-  SQL/JSON Path Expressions
+  
+  JSON Functions and Operators
   
-SQL/JSON
-path expressions
+JSON
+functions and operators
   
 
   
-   SQL/JSON path expressions specify the items to be retrieved
-   from the JSON data, similar to XPath expressions used
-   for SQL access to XML. In PostgreSQL,
-   path expressions are implemented as the jsonpath
-   data type and can use any elements described in
-   .
+shows the operators that
+   are available for use with JSON data types (see ).
   
 
-  JSON query functions and operators
-   pass the provided path expression to the path engine
-   for evaluation. If the expression matches the queried JSON data,
-   the corresponding SQL/JSON item is returned.
-   Path expressions are written in the SQL/JSON path language
-   and can also include arithmetic expressions and functions.
-   Query functions treat the provided expression as a
-   text string, so it must be enclosed in single quotes.
-  
+  
+ json and jsonb Operators
+ 
+  
+   
+Operator
+Right Operand Type
+Return type
+Description
+Example
+Example Result
+   
+  
+  
+   
+->
+int
+json or jsonb
+Get JSON array element (indexed from zero, negative
+integers count from the end)
+'[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json->2
+{"c":"baz"}
+   
+   
+->
+text
+json or jsonb
+Get JSON object field by key
+'{"a": {"b":"foo"}}'::json->'a'
+{"b":"foo"}
+   
+
+->>
+int
+text
+Get JSON array element as text
+'[1,2,3]'::json->>2
+3
+   
+   
+->>
+text
+text
+Get JSON object field as text
+'{"a":1,"b":2}'::json->>'b'
+2
+   
+   
+#>
+text[]
+json or jsonb
+Get JSON object at the specified path
+'{"a": {"b":{"c": "foo"}}}'::json#>'{a,b}'
+{"c": "foo"}
+   
+   
+#>>
+text[]
+text
+Get JSON object at the specified path as text
+'{"a":[1,2,3],"b":[4,5,6]}'::json#>>'{a,2}'
+3
+   
+  
+ 
+   
 
+  
+   
+There are parallel variants of these operators for both the
+json and jsonb types.
+The field/element/path extraction operators
+return the same type as their left-hand input (either json
+or jsonb), except for those specified as
+returning text, which coerce the value to text.
+The field/element/path extraction operators return NULL, rather than
+failing, if the JSON input does not have the right structure to match
+the request; for example if no such element exists.  The
+field/element/path extraction operators that accept integer JSON
+array subscripts all support negative subscripting from the end of
+arrays.
+   
+  
   
-   A path expression consists of a sequence of elements allowed
-   by the jsonpath data type.
-   The path expression is evaluated from left to right, but
-   you can use parentheses to change the order of operations.
-   If the evaluation is successful, a sequence of SQL/JSON items
-   (SQL/JSON sequence) is produced,
-   and the evaluation result is ret
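
For reference, the operators in the table above can be exercised directly
in psql; results are shown in comments:

SELECT '[{"a":"foo"},{"b":"bar"},{"c":"baz"}]'::json -> 2;  -- {"c":"baz"}
SELECT '{"a":1,"b":2}'::json ->> 'b';                       -- 2 (as text)
SELECT '{"a": {"b":{"c": "foo"}}}'::json #> '{a,b}';        -- {"c": "foo"}
SELECT '{"a":[1,2,3],"b":[4,5,6]}'::jsonb #>> '{a,2}';      -- 3 (as text)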

Re: s/rewinded/rewound/?

2019-08-07 Thread Liudmila Mantrova

On 8/7/19 12:00 PM, Michael Paquier wrote:

On Wed, Aug 07, 2019 at 10:53:45AM +0200, Magnus Hagander wrote:

To me this sounds like a classic non-English-native-speaker-mistake.  But
it seems at least the one in the docs comes from Bruce, who definitely is...
So perhaps it's intentional to refer to "what pg_rewind does", and not
necessarily to the regular word for it?

I am not sure :)
"rewound" sounds much more natural.
--
Michael


+1 for rewound from a non-English-native-speaker. The use of "rewound" 
in the same file also supports Michael's view.


If we decide to fix this, we should probably revise and back-patch the 
whole paragraph where it appears, as it seems to mix up scanning target 
cluster WALs and applying source cluster WALs. A small patch is attached for 
your consideration (originally proposed on pgsql-docs [1]).


[1] 
https://www.postgresql.org/message-id/ad6ac5bb-6689-ddb0-dc60-c5fc197d728e%40postgrespro.ru 



--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml
index 52a1caa..a7e1705 100644
--- a/doc/src/sgml/ref/pg_rewind.sgml
+++ b/doc/src/sgml/ref/pg_rewind.sgml
@@ -66,14 +66,12 @@ PostgreSQL documentation
can be found either on the target timeline, the source timeline, or their common
ancestor. In the typical failover scenario where the target cluster was
shut down soon after the divergence, this is not a problem, but if the
-   target cluster ran for a long time after the divergence, the old WAL
+   target cluster ran for a long time after the divergence, its old WAL
files might no longer be present. In that case, they can be manually
-   copied from the WAL archive to the pg_wal directory, or
-   fetched on startup by configuring  or
-   .  The use of
-   pg_rewind is not limited to failover, e.g.  a standby
-   server can be promoted, run some write transactions, and then rewinded
-   to become a standby again.
+   copied from the WAL archive to the pg_wal directory.
+   The use of pg_rewind is not limited to failover,
+   e.g. a standby server can be promoted, run some write transactions, and then
+   get rewound to become a standby again.
   
 
   


Re: [Patch] pg_rewind: options to use restore_command from recovery.conf or command line

2019-07-26 Thread Liudmila Mantrova

On 7/1/19 5:20 PM, Alexey Kondratov wrote:

Hi Thomas,

On 01.07.2019 15:02, Thomas Munro wrote:


Hi Alexey,

This no longer applies.  Since the Commitfest is starting now, could
you please rebase it?


Thank you for the reminder. A rebased version of the patch is attached. 
I've also modified my logging code to comply with the new unified 
logging system for command-line programs committed by Peter (cc8d415117).



Regards


Hi Alexey,

I would like to suggest a couple of changes to docs and comments, please 
see the attachment.
The "...or fetched on startup" part also seems wrong here, but it's not 
a part of your patch, so I'm going to ask about it on psql-docs separately.


It might also be useful to reword the following error messages:
- "using restored from archive version of file \"%s\""
- "could not open restored from archive file \"%s\"
We could probably say something like "could not open file \"%s\" 
restored from WAL archive" instead.


On a more general note, I wonder if everyone is happy with the 
--use-postgresql-conf option name, or whether we should keep searching for 
a narrower term. Unfortunately, I don't have any better suggestions 
right now, but I believe it should be clear that its purpose is to fetch 
missing WAL files for the target. What do you think?


--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/pg_rewind.sgml b/doc/src/sgml/ref/pg_rewind.sgml
index 52a1caa..7e76fcc 100644
--- a/doc/src/sgml/ref/pg_rewind.sgml
+++ b/doc/src/sgml/ref/pg_rewind.sgml
@@ -66,9 +66,12 @@ PostgreSQL documentation
can be found either on the target timeline, the source timeline, or their common
ancestor. In the typical failover scenario where the target cluster was
shut down soon after the divergence, this is not a problem, but if the
-   target cluster ran for a long time after the divergence, the old WAL
-   files might no longer be present. In that case, they can be manually
-   copied from the WAL archive to the pg_wal directory, or
+   target cluster ran for a long time after the divergence, its old WAL
+   files might no longer be present. In this case, you can manually copy them
+   from the WAL archive to the pg_wal directory, or run
+   pg_rewind with the -r or
+   -R option to automatically retrieve them from the WAL
+   archive, or
fetched on startup by configuring  or
.  The use of
pg_rewind is not limited to failover, e.g.  a standby
@@ -203,6 +206,39 @@ PostgreSQL documentation
  
 
  
+  -r
+  --use-postgresql-conf
+  
+   
+Use the restore_command defined in
+postgresql.conf to retrieve WAL files from
+the WAL archive if these files are no longer available in the
+pg_wal directory of the target cluster.
+   
+   
+This option cannot be used together with --restore-command.
+   
+  
+ 
+
+ 
+  -R restore_command
+  --restore-command=restore_command
+  
+   
+Specifies the restore_command to use for retrieving
+WAL files from the WAL archive if these files are no longer available
+in the pg_wal directory of the target cluster.
+   
+   
+If restore_command is already set in
+postgresql.conf, you can provide the
+--use-postgresql-conf option instead.
+   
+  
+ 
+
+ 
   --debug
   

@@ -288,7 +324,10 @@ GRANT EXECUTE ON function pg_catalog.pg_read_binary_file(text, bigint, bigint, b
   history forked off from the target cluster. For each WAL record,
   record each data block that was touched. This yields a list of all
   the data blocks that were changed in the target cluster, after the
-  source cluster forked off.
+  source cluster forked off. If some of the WAL files are no longer
+  available, try re-running pg_rewind with
+  the -r or -R option to search
+  for the missing files in the WAL archive.
  
 
 
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index 287af60..5a7f759 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -12,6 +12,7 @@
 #include "postgres_fe.h"
 
 #include 
+#include 
 
 #include "pg_rewind.h"
 #include "filemap.h"
@@ -44,6 +45,7 @@ static char xlogfpath[MAXPGPATH];
 typedef struct XLogPageReadPrivate
 {
 	const char *datadir;
+	const char *restoreCommand;
 	int			tliIndex;
 } XLogPageReadPrivate;
 
@@ -52,6 +54,9 @@ static int	SimpleXLogPageRead(XLogReaderState *xlogreader,
 			   int reqLen, XLogRecPtr targetRecPtr, char *readBuf,
 			   TimeLineID *pageTLI);
 
+static int RestoreArchivedWAL(const char *path, const char *xlogfname,
+   off_t expectedSize, const char *restoreCommand);
+
 /*
  * Read WAL from the datadir/pg_wal, starting from 'start

Re: SQL/JSON path issues/questions

2019-07-04 Thread Liudmila Mantrova


On 7/3/19 11:59 PM, Alexander Korotkov wrote:

Hi!

On Wed, Jul 3, 2019 at 5:27 PM Liudmila Mantrova
 wrote:

I have rechecked the standard and I agree that we should use "filter
expression" whenever possible.
"A filter expression must be enclosed in parentheses..." looks like an
oversight, so I fixed it. As for what's actually enclosed, I believe we
can still use the word "condition" here as it's easy to understand and
is already used in our docs, e.g. in description of the WHERE clause
that serves a similar purpose.
The new version of the patch fixes the terminology, tweaks the examples,
and provides some grammar and style fixes in the jsonpath-related chapters.


It looks good to me.  But this sentence looks a bit too complicated.

"It can be followed by one or more accessor operators to define the
JSON element on a lower nesting level by which to filter the result."

Could we phrase this as follows?

"In order to filter the result by values lying on lower nesting level,
@ operator can be followed by one or more accessor operators."

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Thank you!

I think we can make this sentence even shorter; the fix is attached:

"To refer to a JSON element stored at a lower nesting level, add one or 
more accessor operators after @."



--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 3a8581d..6d2aefb 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11538,7 +11538,8 @@ table2-mapping
from the JSON data, similar to XPath expressions used
for SQL access to XML. In PostgreSQL,
path expressions are implemented as the jsonpath
-   data type, described in .
+   data type and can use any elements described in
+   .
   
 
   JSON query functions and operators
@@ -11585,7 +11586,7 @@ table2-mapping
   },
   { "location":   [ 47.706, 13.2635 ],
 "start time": "2018-10-14 10:39:21",
-"HR": 130
+"HR": 135
   } ]
   }
 }
@@ -11637,23 +11638,33 @@ table2-mapping
 
   
When defining the path, you can also use one or more
-   filter expressions, which work similar to
-   the WHERE clause in SQL. Each filter expression
-   can provide one or more filtering conditions that are applied
-   to the result of the path evaluation. Each filter expression must
-   be enclosed in parentheses and preceded by a question mark.
-   Filter expressions are evaluated from left to right and can be nested.
-   The @ variable denotes the current path evaluation
-   result to be filtered, and can be followed by one or more accessor
-   operators to define the JSON element by which to filter the result.
-   Functions and operators that can be used in the filtering condition
-   are listed in .
-   SQL/JSON defines three-valued logic, so the result of the filter
-   expression may be true, false,
+   filter expressions that work similar to the
+   WHERE clause in SQL. A filter expression begins with
+   a question mark and provides a condition in parentheses:
+
+
+? (condition)
+
+  
+
+  
+   Filter expressions must be specified right after the path evaluation step
+   to which they are applied. The result of this step is filtered to include
+   only those items that satisfy the provided condition. SQL/JSON defines
+   three-valued logic, so the condition can be true, false,
or unknown. The unknown value
-   plays the same role as SQL NULL. Further path
+   plays the same role as SQL NULL and can be tested
+   for with the is unknown predicate. Further path
evaluation steps use only those items for which filter expressions
-   return true.
+   return true.
+  
+
+  
+   Functions and operators that can be used in filter expressions are listed
+   in . The path
+   evaluation result to be filtered is denoted by the @
+   variable. To refer to a JSON element stored at a lower nesting level,
+   add one or more accessor operators after @.
   
 
   
@@ -11667,8 +11678,8 @@ table2-mapping
   
To get the start time of segments with such values instead, you have to
filter out irrelevant segments before returning the start time, so the
-   filter is applied to the previous step and the path in the filtering
-   condition is different:
+   filter expression is applied to the previous step, and the path used
+   in the condition is different:
 
'$.track.segments[*] ? (@.HR > 130)."start time"'
 
@@ -11693,9 +11704,9 @@ table2-mapping
   
 
   
-   You can also nest filters within each other:
+   You can also nest filter expressions within each other:
 
-'$.track ? (@.segments[*] ? (@.HR > 130)).segments.size()'
+'$.track ? (exists(@.segments[*] ? (@.HR > 130))).segments.size()'
 
This expression r
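
A minimal runnable sketch of the filter expression discussed above
(requires PostgreSQL 12's jsonpath support), using a trimmed version of
the GPS-tracker document from the docs; values are illustrative:

SELECT jsonb_path_query(
    '{"track": {"segments": [{"HR": 73}, {"HR": 135}]}}',
    '$.track.segments[*] ? (@.HR > 130)');
-- returns {"HR": 135}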

Re: SQL/JSON path issues/questions

2019-07-03 Thread Liudmila Mantrova

On 6/28/19 6:47 AM, Alexander Korotkov wrote:

On Tue, Jun 25, 2019 at 6:38 PM Liudmila Mantrova
 wrote:

Thank you for the catch! Please see the modified version of patch 0004
attached.

I tried to review and revise the part related to filters, but I failed
because I don't understand the notions used in the documentation.

What is the difference between filter expression and filter condition?
  I can guess that filter expression contains question mark,
parentheses and filter condition inside.  But this sentence is in
contradiction with my guess: "A filter expression must be enclosed in
parentheses and preceded by a question mark".  So, filter expression
is inside the parentheses.  Then what is filter condition?  The same?


Each filter expression can provide one or more filters
that are applied to the result of the path evaluation.


So additionally to filter condition and filter expression we introduce
the notion of just filter.  What is it?  Could we make it without
introduction of new notion?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Hi,

I have rechecked the standard and I agree that we should use "filter 
expression" whenever possible.
"A filter expression must be enclosed in parentheses..." looks like an 
oversight, so I fixed it. As for what's actually enclosed, I believe we 
can still use the word "condition" here as it's easy to understand and 
is already used in our docs, e.g. in description of the WHERE clause 
that serves a similar purpose.
The new version of the patch fixes the terminology, tweaks the examples, 
and provides some grammar and style fixes in the jsonpath-related chapters.



--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 3a8581d..b0de624 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11538,7 +11538,8 @@ table2-mapping
from the JSON data, similar to XPath expressions used
for SQL access to XML. In PostgreSQL,
path expressions are implemented as the jsonpath
-   data type, described in .
+   data type and can use any elements described in
+   .
   
 
   JSON query functions and operators
@@ -11585,7 +11586,7 @@ table2-mapping
   },
   { "location":   [ 47.706, 13.2635 ],
 "start time": "2018-10-14 10:39:21",
-"HR": 130
+"HR": 135
   } ]
   }
 }
@@ -11637,23 +11638,33 @@ table2-mapping
 
   
When defining the path, you can also use one or more
-   filter expressions, which work similar to
-   the WHERE clause in SQL. Each filter expression
-   can provide one or more filtering conditions that are applied
-   to the result of the path evaluation. Each filter expression must
-   be enclosed in parentheses and preceded by a question mark.
-   Filter expressions are evaluated from left to right and can be nested.
-   The @ variable denotes the current path evaluation
-   result to be filtered, and can be followed by one or more accessor
-   operators to define the JSON element by which to filter the result.
-   Functions and operators that can be used in the filtering condition
-   are listed in .
-   SQL/JSON defines three-valued logic, so the result of the filter
-   expression may be true, false,
+   filter expressions that work similar to the
+   WHERE clause in SQL. A filter expression begins with
+   a question mark and provides a condition in parentheses:
+
+
+? (condition)
+
+  
+
+  
+   Filter expressions must be specified right after the path evaluation step
+   to which they are applied. The result of this step is filtered to include
+   only those items that satisfy the provided condition. SQL/JSON defines
+   three-valued logic, so the condition can be true, false,
or unknown. The unknown value
-   plays the same role as SQL NULL. Further path
+   plays the same role as SQL NULL and can be tested
+   for with the is unknown predicate. Further path
evaluation steps use only those items for which filter expressions
-   return true.
+   return true.
+  
+
+  
+   Functions and operators that can be used in filter expressions are listed
+   in . The path
+   evaluation result to be filtered is denoted by the @
+   variable. It can be followed by one or more accessor operators to define
+   the JSON element on a lower nesting level by which to filter the result.
   
 
   
@@ -11667,8 +11678,8 @@ table2-mapping
   
To get the start time of segments with such values instead, you have to
filter out irrelevant segments before returning the start time, so the
-   filter is applied to the previous step and the path in the filtering
-   condition is different:
+   filter expression is applied to the previous step, and the path used
+   in the condition is different:
 
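As a sketch of the nested form discussed above (same trimmed document;
values illustrative):

SELECT jsonb_path_query(
    '{"track": {"segments": [{"HR": 73}, {"HR": 135}]}}',
    '$.track ? (exists(@.segments[*] ? (@.HR > 130))).segments.size()');
-- returns 2: the number of segments in tracks having a segment with HR > 130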

Re: SQL/JSON path issues/questions

2019-06-25 Thread Liudmila Mantrova

On 6/21/19 8:04 PM, Alexander Korotkov wrote:

On Wed, Jun 19, 2019 at 10:14 PM Alexander Korotkov
 wrote:

While I have no objections to the proposed fixes, I think we can further
improve patch 0003 and the text it refers to.
In an attempt to clarify the jsonpath docs and address the concern that ? is
hard to trace in the current text, I'd also like to propose patch 0004.
Please see both of them attached.

Thank you for your editing.  I'm going to commit them as well.

But I'm going to commit your changes separately from 0003 I've posted
before.  Because 0003 fixes factual error, while you're proposing set
of grammar/style fixes.

I made some review of these patches.  My notes are following:


-See also  for the aggregate
-function json_agg which aggregates record
-values as JSON, and the aggregate function
-json_object_agg which aggregates pairs of values
-into a JSON object, and their jsonb equivalents,
+See also  for details on
+json_agg function that aggregates record
+values as JSON, json_object_agg function
+that aggregates pairs of values into a JSON object, and their
jsonb equivalents,
  jsonb_agg and jsonb_object_agg.


This part is not directly related to jsonpath, and it has been there
for a long time.  I'd like some native english speaker to review this
change before committing this.

 
-Expression inside subscript may consititue an integer,
-numeric expression or any other jsonpath expression
-returning single numeric value.  The last keyword
-can be used in the expression denoting the last subscript in an array.
-That's helpful for handling arrays of unknown length.
+The specified index can be an integer,
+as well as a numeric or jsonpath expression that
+returns a single integer value. Zero index corresponds to the first
+array element. To access the last element in an array, you can use
+the last keyword, which is useful for handling
+arrays of unknown length.
 

I think this part requires more work.  Let's see what cases we have
with examples:

1) Integer: '$.ar[1]'
2) Numeric: '$.ar[1.5]' (converted to integer)
3) Some numeric expression: '$.ar[last - 1]'
4) Arbitrary jsonpath expression: '$.ar[$.ar2.size() + $.num - 2]'

In principle, it is not necessary to divide 3 and 4, or to divide 1 and 2.
Or we may not describe the cases at all, but just say it's a jsonpath
expression returning a numeric value, which is converted to an integer.

Also, note that we do not necessarily *access* the last array element with
the "last" keyword.  The "last" keyword denotes the index of the last
element in the expression, but a completely different element might
actually be accessed.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Hi Alexander,

Thank you for the catch! Please see the modified version of patch 0004 
attached.


As for your comment on patch 0003, since I'm not a native speaker, I can 
only refer to a recent discussion in pgsql-docs mailing list that seems 
to support my view on a similar issue:


https://www.postgresql.org/message-id/9484.1558050957%40sss.pgh.pa.us


--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 39ba18d..fa5afc1 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11638,10 +11638,17 @@ table2-mapping
   
When defining the path, you can also use one or more
filter expressions, which work similar to
-   the WHERE clause in SQL. Each filter expression
-   can provide one or more filtering conditions that are applied
-   to the result of the path evaluation. Each filter expression must
-   be enclosed in parentheses and preceded by a question mark.
+   the WHERE clause in SQL. A filter expression must
+   be enclosed in parentheses and preceded by a question mark:
+
+
+? (@ filter ...)
+
+  
+
+  
+   Each filter expression can provide one or more filters
+   that are applied to the result of the path evaluation.
Filter expressions are evaluated from left to right and can be nested.
The @ variable denotes the current path evaluation
result to be filtered, and can be followed by one or more accessor
diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index daebb4f..0d8e2c6 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -815,21 +815,18 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu
 .**{level}


-.**{lower_level to
-upper_level}
-   
-   
-.**{lower_level to
-last}
+.**{start_level to
+end_level}

   
   

-Same as .**, but with filter over nesting
-level of JSON hierarchy.  Levels are specified as integers.
-Zero 
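
For reference, a quick psql illustration of the subscript cases discussed
above:

SELECT jsonb_path_query('[10, 20, 30]', '$[0]');         -- 10 (zero-based)
SELECT jsonb_path_query('[10, 20, 30]', '$[last]');      -- 30
SELECT jsonb_path_query('[10, 20, 30]', '$[last - 1]');  -- 20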

Re: SQL/JSON path issues/questions

2019-06-17 Thread Liudmila Mantrova


On 6/17/19 11:36 AM, Alexander Korotkov wrote:

I'm going to push attached 3 patches if no objections.

Regarding 0003-Separate-two-distinctive-json-errors.patch, I think it
requires more thoughts.

 RETURN_ERROR(ereport(ERROR,
  (errcode(ERRCODE_SINGLETON_JSON_ITEM_REQUIRED),
   errmsg("left operand of jsonpath
operator %s is not a single numeric value",
-jspOperationName(jsp->type);
+jspOperationName(jsp->type)),
+ (llen != 1 ?
+  errdetail("It was an array with %d
elements.", llen):
+  errdetail("The only element was not a
numeric.");

When we have more than 1 value, it's not exactly an array.  Jsonpath can
extract values from various parts of a json document, which never
constitute an array.  Should we say something like "There are %d
values"?  Also, we should probably display the type of the single element
if it's not numeric.  jsonb_path_match() also throws
ERRCODE_SINGLETON_JSON_ITEM_REQUIRED; should we add a similar
errdetail() there?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Hi Alexander,

While I have no objections to the proposed fixes, I think we can further 
improve patch 0003 and the text it refers to.
In an attempt to clarify the jsonpath docs and address the concern that ? is 
hard to trace in the current text, I'd also like to propose patch 0004.

Please see both of them attached.

--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index e918133..39ba18d 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12281,7 +12281,7 @@ table2-mapping

 @?
 jsonpath
-Does JSON path returns any item for the specified JSON value?
+Does JSON path return any item for the specified JSON value?
 '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)'


@@ -12309,8 +12309,8 @@ table2-mapping
   

 The @? and @@ operators suppress
-errors including: lacking object field or array element, unexpected JSON
-item type and numeric errors.
+the following errors: lacking object field or array element, unexpected
+JSON item type, and numeric errors.
 This behavior might be helpful while searching over JSON document
 collections of varying structure.

@@ -13166,26 +13166,25 @@ table2-mapping
 jsonb_path_query, jsonb_path_query_array and
 jsonb_path_query_first
 functions have optional vars and silent
-argument.
+arguments.


-When vars argument is specified, it constitutes an object
-contained variables to be substituted into jsonpath
-expression.
+If the vars argument is specified, it provides an
+object containing named variables to be substituted into a
+jsonpath expression.


-When silent argument is specified and has
-true value, the same errors are suppressed as it is in
-the @? and @@ operators.
+If the silent argument is specified and has the
+true value, these functions suppress the same errors
+as the @? and @@ operators.

   
 
   
-See also  for the aggregate
-function json_agg which aggregates record
-values as JSON, and the aggregate function
-json_object_agg which aggregates pairs of values
-into a JSON object, and their jsonb equivalents,
+See also  for details on
+json_agg function that aggregates record
+values as JSON, json_object_agg function
+that aggregates pairs of values into a JSON object, and their jsonb equivalents,
 jsonb_agg and jsonb_object_agg.
   
 
diff --git a/doc/src/sgml/json.sgml b/doc/src/sgml/json.sgml
index b8246ba..daebb4f 100644
--- a/doc/src/sgml/json.sgml
+++ b/doc/src/sgml/json.sgml
@@ -733,10 +733,12 @@ SELECT jdoc->'guid', jdoc->'name' FROM api WHERE jdoc @> '{"tags": ["qu
  
  
   $varname
-  A named variable. Its value must be set in the
-  PASSING clause of an SQL/JSON query function.
- 
-  for details.
+  
+A named variable. Its value can be set by the parameter
+vars of several JSON processing functions.
+See  and
+its notes for details.
+
   
  
  
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 39ba18d..fa5afc1 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11638,10 +11638,17 @@ table2-mapping
   
When defining the path, you can also use one or more
filter expressions, which work similar to
-   the WHERE clause in SQL. Each filter expression
-   can provide one or more filtering conditions that are applied
-   to the result of the p
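
A short psql sketch of the @? operator and the vars argument described in
these patches:

SELECT '{"a":[1,2,3,4,5]}'::jsonb @? '$.a[*] ? (@ > 2)';   -- true
SELECT jsonb_path_query('{"a":[1,2,3]}', '$.a[*] ? (@ > $min)',
                        vars => '{"min": 1}');             -- 2 and 3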

Re: SQL/JSON: documentation

2018-12-03 Thread Liudmila Mantrova

On 11/29/18 7:34 PM, Dmitry Dolgov wrote:

Hi,

Any progress on that? It would be nice to have a new version of the
documentation, and I would even advocate to put it into the json path patch [1]
(especially, since there were already requests for that, and I personally don't
see any reason to keep them separately). For now I'll move the item to
the next CF.

[1]:https://www.postgresql.org/message-id/flat/fcc6fc6a-b497-f39a-923d-aa34d0c588e8%402ndQuadrant.com


Hi Dmitry,

Unfortunately, I couldn't find much time for this activity, but as far 
as I understand, thread [1] only requires jsonpath documentation right 
now. So I extracted the relevant parts from this patch, reworked the path 
expression description, and moved it to func.sgml as Peter suggested 
(attached). Nikita is going to add this patch to the jsonpath thread 
together with the updated code once it's ready.


Next, I'm going to address Peter's feedback on the rest of this 
documentation patch (which probably also needs to be split for threads 
[2] and [3]).


[2] 
https://www.postgresql.org/message-id/flat/cd0bb935-0158-78a7-08b5-904886dea...@postgrespro.ru
[3] 
https://www.postgresql.org/message-id/flat/132f26c4-dfc6-f8fd-4764-2cbf455a3...@postgrespro.ru


--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/biblio.sgml b/doc/src/sgml/biblio.sgml
index 4953024..f06305d 100644
--- a/doc/src/sgml/biblio.sgml
+++ b/doc/src/sgml/biblio.sgml
@@ -136,6 +136,17 @@
 1988

 
+   
+SQL Technical Report
+Part 6: SQL support for JavaScript Object
+  Notation (JSON)
+First Edition.
+
+http://standards.iso.org/ittf/PubliclyAvailableStandards/c067367_ISO_IEC_TR_19075-6_2017.zip
+
+2017.
+   
+
   
 
   
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 112d962..20ef7df 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11285,26 +11285,661 @@ table2-mapping
  
 
  
-  JSON Functions and Operators
+  JSON Functions, Operators, and Expressions
 
-  
+  
+   The functions, operators, and expressions described in this section
+   operate on JSON data:
+  
+
+  
+   
+
+ SQL/JSON path expressions
+ (see ).
+
+   
+   
+
+ PostgreSQL-specific functions and operators for JSON
+ data types (see ).
+
+   
+  
+
+  
+To learn more about the SQL/JSON standard, see
+. For details on JSON types
+supported in PostgreSQL,
+see .
+  
+
+ 
+  SQL/JSON Path Expressions
+
+  
+   SQL/JSON path expressions specify the items to be retrieved
+   from the JSON data, similar to XPath expressions used
+   for SQL access to XML. In PostgreSQL,
+   path expressions are implemented as the jsonpath
+   data type, described in .
+  
+
+  JSON query functions and operators
+   pass the provided path expression to the path engine
+   for evaluation. If the expression matches the JSON data to be queried,
+   the corresponding SQL/JSON item is returned.
+   Path expressions are written in the SQL/JSON path language
+   and can also include arithmetic expressions and functions.
+   Query functions treat the provided expression as a
+   text string, so it must be enclosed in single quotes.
+  
+
+  
+   A path expression consists of a sequence of elements allowed
+   by the jsonpath data type.
+   The path expression is evaluated from left to right, but
+   you can use parentheses to change the order of operations.
+   If the evaluation is successful, an SQL/JSON sequence is produced,
+   and the evaluation result is returned to the JSON query function
+   that completes the specified computation.
+  
+
+  
+   To refer to the JSON data to be queried (the
+   context item), use the $ sign
+   in the path expression. It can be followed by one or more
+   accessor operators,
+   which go down the JSON structure level by level to retrieve the
+   content of context item. Each operator that follows deals with the
+   result of the previous evaluation step.
+  
+
+  
+   For example, suppose you have some JSON data from a GPS tracker that you
+   would like to parse, such as:
+
+{ "track" :
+  {
+"segments" : [ 
+  { "location":   [ 47.763, 13.4034 ],
+"start time": "2018-10-14 10:05:14",
+"HR": 73
+  },
+  { "location":   [ 47.706, 13.2635 ],
+"start time": "2018-10-14 10:39:21",
+"HR": 130
+  } ]
+  }
+}
+
+  
+
+  
+   To retrieve the available track segments, you need to use the
+   .key accessor
+   operator for all the preceding JSON objects:
+
+'$.track.segments'
+
+  
+
+  
+   If the item to retrieve is an element of an array, you have
+   to unnest this array using the [*] operator. For example,
+   the following path will return location coordinates for all
+   the available track segments:
+
+'$.track.segment
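
For reference, a runnable sketch of the accessor chain described above,
against a trimmed version of the sample document:

SELECT jsonb_path_query(
    '{"track": {"segments": [{"location": [47.763, 13.4034]},
                             {"location": [47.706, 13.2635]}]}}',
    '$.track.segments[*].location');
-- returns [47.763, 13.4034] and [47.706, 13.2635], one row per segment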

Re: SQL/JSON: documentation

2018-10-01 Thread Liudmila Mantrova



On 09/28/2018 08:29 PM, Peter Eisentraut wrote:

On 28/06/2018 01:36, Nikita Glukhov wrote:

Attached patch with draft of SQL/JSON documentation written by
Liudmila Mantrova, Oleg Bartunov and me.

Also it can be found in our sqljson repository on sqljson_doc branch:
https://github.com/postgrespro/sqljson/tree/sqljson_doc

We continue to work on it.

Some structural comments:

- I don't think this should be moved to a separate file.  Yes, func.sgml
is pretty big, but if we're going to split it up, we should do it in a
systematic way, not just one section.

- The refentries are not a bad idea, but again, if we just used them for
this one section, the navigation will behave weirdly.  So I'd do it
without them, just using normal subsections.

- Stick to one-space indentation in XML.


Hi Peter,

Thanks for your comments! I'm OK with keeping all reference information 
in func.sgml and will rework it as you suggest. While refentries are 
dear to my heart, let's use subsections for now for the sake of 
consistency. We'll continue working with Nikita and Oleg to improve the 
content before we resend an updated patch; I believe we might still need 
a separate source file if we end up having a separate chapter with usage 
examples and implementation details.


--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] Bug in to_timestamp().

2018-09-26 Thread Liudmila Mantrova


On 09/22/2018 10:00 PM, Alexander Korotkov wrote:

On Thu, Sep 20, 2018 at 3:52 PM Alexander Korotkov
 wrote:

On Thu, Sep 20, 2018 at 6:09 AM amul sul  wrote:

Agreed, thanks for working on this.

Pushed, thanks.

Please, find attached patch improving documentation about
letters/digits in to_date()/to_timestamp() template string.  I think
it needs review from native English speaker.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Hi Alexander,

I'm not a native speaker, but let me try to help. A new doc version is 
attached.



--

Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9a7f683..1532bcc 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6286,13 +6286,46 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
exceeds the number of separators in the template.
   
   
-   If FX is specified, separator in template string
-   matches to exactly one character in input string.  Notice we don't insist
-   input string character to be the same as template string separator.
+   If FX is specified, a separator in the template
+   string matches exactly one character in the input string.  The
+   input string character does not need to be the same as the template
+   string separator.
For example, to_timestamp('2000/JUN', 'FXYYYY MON')
works, but to_timestamp('2000/JUN', 'FXYYYY  MON')
-   returns an error because a space second template string space consumed
-   letter J from the input string.
+   returns an error because the second space in the template string
+   consumes letter J of the input string.
+  
+ 
+
+ 
+  
+Template strings of to_timestamp and
+to_date functions can also contain arbitrary
+letters/digits between patterns.  Such letters/digits can match
+any characters in the input string.  For example,
+to_timestamp('2000yJUN', 'YYYYxMON') works.
+  
+  
+Letters/digits consume an input string character only if the number
+of extra characters at the beginning of the input string or between
+the identified date/time values is less than or equal to the number
+of the corresponding characters in the template string. This ensures
+that the template string does not consume any characters of date/time
+values when used without the FX option, even if
+a letter/digit separator in the input string appears after a space.
+For example, to_timestamp('2000y JUN', 'YYYYxMON')
+works, but to_timestamp('2000 yJUN', 'YYYYxMON')
+returns an error.
+  
+  
+Note that if the template string contains an arbitrary letter,
+the pattern that precedes this letter becomes greedy and tries
+to match as many characters as possible. For example,
+to_timestamp('2000906901', 'YYYYxMMxDD')
+fails because the YYYY pattern matches
+the whole input string instead of the first four characters.
+Patterns separated by digits are non-greedy, so
+to_timestamp('2000906901', 'YYYY0MM0DD') works fine.
   
  
 
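For reference, the documented separator behavior can be checked in psql
(outputs assume the PostgreSQL 12 rules described in this patch):

SELECT to_date('2000    JUN', 'YYYY MON');  -- multiple blanks skipped: 2000-06-01
SELECT to_date('2000JUN', 'YYYY///MON');    -- template separators skipped: 2000-06-01
SELECT to_date('2000/JUN', 'YYYY MON');     -- a space matches '/': 2000-06-01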


Re: [HACKERS] Bug in to_timestamp().

2018-08-16 Thread Liudmila Mantrova


On 08/14/2018 06:38 PM, Alexander Korotkov wrote:

On Thu, Aug 2, 2018 at 9:06 PM Alexander Korotkov
 wrote:

On Thu, Aug 2, 2018 at 6:17 PM Alexander Korotkov
 wrote:

After some experiments I found that when you mix spaces and separators
between two fields, then Oracle takes into account only length of last
group of spaces/separators.

# SELECT to_timestamp('2018- -01 02', 'YYYY   --- --MM-DD') FROM dual
2018-01-01 00:00:00 -10:00
(length of last spaces/separators group is 2)

# SELECT to_timestamp('2018- -01 02', 'YYYY   --- --MM-DD') FROM dual
2018-01-01 00:00:00 -10:00
(length of last spaces/separators group is 3)

# SELECT to_timestamp('2018- -01 02', 'YYYY   -- ---MM-DD') FROM dual
02.01.2018 00:00:00
(length of last spaces/separators group is 2)

Ooops... I'm sorry, but I've posted wrong results here.  Correct
version is here.

# SELECT to_timestamp('2018- -01 02', 'YYYY   --- --MM-DD') FROM dual
ORA-01843: not a valid month
(length of last spaces/separators group is 2)

# SELECT to_timestamp('2018- -01 02', 'YYYY   -- ---MM-DD') FROM dual
02.01.2018 00:00:00
(length of last spaces/separators group is 3)

So length of last group of spaces/separators in the pattern should be
greater or equal to length of spaces/separators in the input string.
Other previous groups are ignored in Oracle.  And that seems
ridiculous for me.

BTW, I've also revised documentation and regression tests.  Patch is attached.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Hi,
Please consider some further documentation improvements in the attached 
patch.


--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index edc9be9..a8bbafc 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6262,11 +6262,12 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
  
   
to_timestamp and to_date
-   skip multiple blank spaces in the input string unless the
-   FX option is used. For example,
-   to_timestamp('2000JUN', ' MON') works, but
+   skip multiple blank spaces at the beginning of the input string and
+   around date and time values unless the FX option is used.  For example,
-   to_timestamp('2000    JUN', 'YYYY MON') works, but
+   to_timestamp('2000    JUN', 'YYYY MON') and
+   to_timestamp('2000 - JUN', 'YYYY-MON') work, but
to_timestamp('2000    JUN', 'FXYYYY MON') returns an error
+   because to_timestamp expects a single space only.
FX must be specified as the first item in
the template.
   
@@ -6274,6 +6275,43 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
 
  
   
+   A separator (a space or a non-letter/non-digit character) in the template string of
+   to_timestamp and to_date
+   matches any single separator in the input string or is skipped,
+   unless the FX option is used.
+   For example, to_timestamp('2000JUN', 'YYYY///MON') and
+   to_timestamp('2000/JUN', 'YYYY MON') work, but
+   to_timestamp('2000//JUN', 'YYYY/MON')
+   returns an error because the number of separators in the input string
+   exceeds the number of separators in the template.
+  
+  
+   If FX is specified, separators in the
+   input and template strings must match exactly. For example,
+   to_timestamp('2000/JUN', 'FXYYYY MON')
+   returns an error because a space is expected in the input string.
+  
+ 
+
+ 
+  
+   TZH template pattern can match a signed number.
+   Without the FX option, it may lead to ambiguity in
+   interpretation of the minus sign, which can also be interpreted as a separator.
+   This ambiguity is resolved as follows.  If the number of separators before
+   TZH in the template string is less than the number of
+   separators before the minus sign in the input string, the minus sign
+   is interpreted as part of TZH.
+   Otherwise, the minus sign is considered to be a separator between values.
+   For example, to_timestamp('2000 -10', 'YYYY TZH') matches
+   -10 to TZH, but
+   to_timestamp('2000 -10', 'YYYY  TZH')
+   matches 10 to TZH.
+  
+ 
+
+ 
+  
Ordinary text is allowed in to_char
templates and will be output literally.  You can put a substring
in double quotes to force it to be interpreted as literal text
@@ -6287,6 +6325,19 @@ SELECT regexp_match('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
string; for example "XX" skips two input characters
(whether or not they are XX).
   
+  
+
+  Prior to PostgreSQL 12, it was possible to
+  skip arbitrary text in the input string using non-letter or non-digit
+  characters. For example,
+  to_timestamp('2000y6m
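
And a psql sketch of the TZH disambiguation described above:

SELECT to_timestamp('2000 -10', 'YYYY TZH');   -- minus sign is part of TZH: zone -10
SELECT to_timestamp('2000 -10', 'YYYY  TZH');  -- minus sign is a separator: zone +10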

Re: Fix for documentation of Covering Indexes

2018-07-30 Thread Liudmila Mantrova

On 04/18/2018 12:52 PM, Heikki Linnakangas wrote:

On 11/04/18 04:20, Michael Paquier wrote:

Hi all,

The documentation of covering indexes is incorrect for CREATE and ALTER
TABLE:
- ALTER TABLE's page is missing the call.
- Exclusion constraints can use INCLUDE clauses.

In order to simplify the documentation, please let me suggest the
attached which moves the definition of the INCLUDE clause into the
section index_parameters, which is compatible with what I read from the
parser.


Committed, thanks!

- Heikki

Following this change, I believe we need to modify UNIQUE and PRIMARY 
KEY descriptions in CREATE TABLE as they still mention INCLUDE but not 
the other index_parameters. The attached patch fixes this inconsistency, 
as well as adds a separate paragraph for INCLUDE in CREATE TABLE to 
clarify its purpose and avoid repetition in constraint descriptions. It 
also reorders the paragraphs in CREATE INDEX to follow the syntax.


--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/ref/create_index.sgml b/doc/src/sgml/ref/create_index.sgml
index 3c1223b..c67f187 100644
--- a/doc/src/sgml/ref/create_index.sgml
+++ b/doc/src/sgml/ref/create_index.sgml
@@ -145,52 +145,6 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 
  
-  INCLUDE
-  
-   
-The optional INCLUDE clause specifies a
-list of columns which will be included in the index
-as non-key columns.  A non-key column cannot
-be used in an index scan search qualification, and it is disregarded
-for purposes of any uniqueness or exclusion constraint enforced by
-the index.  However, an index-only scan can return the contents of
-non-key columns without having to visit the index's table, since
-they are available directly from the index entry.  Thus, addition of
-non-key columns allows index-only scans to be used for queries that
-otherwise could not use them.
-   
-
-   
-It's wise to be conservative about adding non-key columns to an
-index, especially wide columns.  If an index tuple exceeds the
-maximum size allowed for the index type, data insertion will fail.
-In any case, non-key columns duplicate data from the index's table
-and bloat the size of the index, thus potentially slowing searches.
-   
-
-   
-Columns listed in the INCLUDE clause don't need
-appropriate operator classes; the clause can include
-columns whose data types don't have operator classes defined for
-a given access method.
-   
-
-   
-Expressions are not supported as included columns since they cannot be
-used in index-only scans.
-   
-
-   
-Currently, only the B-tree index access method supports this feature.
-In B-tree indexes, the values of columns listed in the
-INCLUDE clause are included in leaf tuples which
-correspond to heap tuples, but are not included in upper-level
-index entries used for tree navigation.
-   
-  
- 
-
- 
   name
   

@@ -317,6 +271,52 @@ CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] 
 
  
+  INCLUDE
+  
+   
+The optional INCLUDE clause specifies a
+list of columns which will be included in the index
+as non-key columns.  A non-key column cannot
+be used in an index scan search qualification, and it is disregarded
+for purposes of any uniqueness or exclusion constraint enforced by
+the index.  However, an index-only scan can return the contents of
+non-key columns without having to visit the index's table, since
+they are available directly from the index entry.  Thus, addition of
+non-key columns allows index-only scans to be used for queries that
+otherwise could not use them.
+   
+
+   
+It's wise to be conservative about adding non-key columns to an
+index, especially wide columns.  If an index tuple exceeds the
+maximum size allowed for the index type, data insertion will fail.
+In any case, non-key columns duplicate data from the index's table
+and bloat the size of the index, thus potentially slowing searches.
+   
+
+   
+Columns listed in the INCLUDE clause don't need
+appropriate operator classes; the clause can include
+columns whose data types don't have operator classes defined for
+a given access method.
+   
+
+   
+Expressions are not supported as included columns since they cannot be
+used in index-only scans.
+   
+
+   
+Currently, only the B-tree index access method supports this feature.
+In B-tree indexes, the values of columns listed in the
+INCLUDE clause are included in leaf
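
For reference, a minimal sketch of the feature being documented (names are
illustrative):

CREATE TABLE items (id int NOT NULL, payload text);
CREATE UNIQUE INDEX items_id_idx ON items (id) INCLUDE (payload);
-- 'payload' is a non-key column: it is ignored by the uniqueness check,
-- but an index-only scan can return it without visiting the heap.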

Re: documentation is now XML

2018-06-20 Thread Liudmila Mantrova

On 04/27/2018 06:36 PM, Tom Lane wrote:

Bruce Momjian  writes:

On Fri, Apr 27, 2018 at 11:00:36AM -0400, Peter Eisentraut wrote:

That proposal seemed to indicate not only converting the source code to
XML but also the build system to XSL.  The latter is out of the
question, I think.

Why is that?  I thought people building current Postgres would already
have the XML toolchain built to build older release documentation.

Yeah, in principle, anyone who's able to build the v10 branch's docs
should be ready for such a change.  We had discussed polling the
pgsql-packagers list to see whether anyone would have a problem with
changing the docs build toolset for the pre-v10 branches, but I don't
think we actually asked them.

regards, tom lane


To complete the picture of possible issues with older branches in XML, 
we posted a question in packager lists some time ago and didn't receive 
any objections. Just to keep record of all related questions in one 
place, here's the link:


https://www.postgresql.org/message-id/flat/06efc906-16f8-0cde-5bee-e3d5abfc00ba%40postgrespro.ru#06efc906-16f8-0cde-5bee-e3d5abfc0...@postgrespro.ru 



We totally understand the reluctance to volunteer personal time for 
manual testing of legacy branches - sad as it is that we may miss some 
benefits of the previous efforts. E.g. all styles and transforms have 
been stabilized for version 11, so they could be reused for older 
branches. As for testing the content conversion, our scripts can handle 
it in a fully automated way by comparing the original and the final 
outputs via the .txt format, so all possible differences will be caught.


It's unfortunate that we'll have to deal with different formats in the 
supported branches for several more years, but we at Postgres 
Professional are ready to accept whatever decision you make on this matter for now.


--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: doc fixes: vacuum_cleanup_index_scale_factor

2018-05-10 Thread Liudmila Mantrova

On 05/08/2018 02:05 PM, Justin Pryzby wrote:

3rd iteration ; thanks for bearing with me.

On Tue, May 08, 2018 at 12:35:00PM +0300, Alexander Korotkov wrote:

Hi, Justin!

Thank you for revising documentation patch.

On Mon, May 7, 2018 at 7:55 PM, Justin Pryzby <pry...@telsasoft.com> wrote:


+In order to detect stale index statistics, the number of total heap
+tuples during previous statistics collection is stored in the index
+meta-page.

Consider removing: "during previous statistics collection"
Or: "during THE previous statistics collection"

+ Once the number of inserted tuples since previous

since THE previous

+statistics collection is more than
+vacuum_cleanup_index_scale_factor fraction of

Since the multiplier can be greater than 1, should we say "multiple" instead of
fraction?

+during VACUUM cycle.  Thus, skipping of the B-tree
+index scan during cleanup stage is only possible when second and
+subsequent VACUUM cycles detecting no dead tuples.

Change "detecting" to "detect".  Or maybe just "find"

Justin


Hi Justin,

Thank you for helping with the docs. Attached is another doc patch that 
should address most of the issues you've brought up.
I've also reshuffled the text a bit to make it more like an option 
description. Hope you'll find it useful.


--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index ffea744..c4afd14 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1893,15 +1893,34 @@ include_dir 'conf.d'
   
   

-When no tuples were deleted from the heap, B-tree indexes might still
-be scanned during VACUUM cleanup stage by two
-reasons.  The first reason is that B-tree index contains deleted pages
-which can be recycled during cleanup.  The second reason is that B-tree
-index statistics is stalled.  The criterion of stalled index statistics
-is number of inserted tuples since previous statistics collection
-is greater than vacuum_cleanup_index_scale_factor
-fraction of total number of heap tuples.
+Specifies the fraction of the total number of heap tuples counted in
+the previous statistics collection that can be inserted without
+incurring an index scan at the VACUUM cleanup stage.
+This setting currently applies to B-tree indexes only.

+
+   
+If no tuples were deleted from the heap, B-tree indexes are still
+scanned at the VACUUM cleanup stage when at least one
+of the following conditions is met: the index statistics are stale, or
+the index contains deleted pages that can be recycled during cleanup.
+Index statistics are considered to be stale if the number of newly
+inserted tuples exceeds the vacuum_cleanup_index_scale_factor
+fraction of the total number of heap tuples detected by the previous
+statistics collection. The total number of heap tuples is stored in
+the index meta-page. Note that the meta-page does not include this data
+until VACUUM finds no dead tuples, so B-tree index
+scan at the cleanup stage can only be skipped if the second and
+subsequent VACUUM cycles detect no dead tuples.
+   
+
+   
+The value can range from 0 to 100.
+When vacuum_cleanup_index_scale_factor is set to
+0, index scans are never skipped during
+VACUUM cleanup. The default value is 0.1.
+   
+
   
  
  
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 3bcc56e..22b4a75 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -189,7 +189,7 @@ _bt_update_meta_cleanup_info(Relation rel, TransactionId oldestBtpoXact,
 	if (metad->btm_version < BTREE_VERSION)
 		_bt_upgrademetapage(metapg);
 
-	/* update cleanup-related infromation */
+	/* update cleanup-related information */
 	metad->btm_oldest_btpo_xact = oldestBtpoXact;
 	metad->btm_last_cleanup_num_heap_tuples = numHeapTuples;
 	MarkBufferDirty(metabuf);
diff --git a/src/backend/access/nbtree/nbtree.c b/src/backend/access/nbtree/nbtree.c
index d894ba0..27a3032 100644
--- a/src/backend/access/nbtree/nbtree.c
+++ b/src/backend/access/nbtree/nbtree.c
@@ -818,10 +818,11 @@ _bt_vacuum_needs_cleanup(IndexVacuumInfo *info)
 		float8		cleanup_scale_factor;
 
 		/*
-		 * If table receives large enough amount of insertions and no cleanup
-		 * was performed, then index might appear to have stalled statistics.
-		 * In order to evade that, we perform cleanup when table receives
-		 * vacuum_cleanup_index_scale_factor fractions of insertions.
+		 * If table receives enough insertions and no cleanup

Re: documentation is now XML

2018-05-03 Thread Liudmila Mantrova

Hi Peter,

Thank you for taking your time to comment on this long-discussed topic.
I was not around when all the great work was done to implement XSLT 
transforms in branch 10 and convert branch 11 to XML, but judging by 
this thread 
(https://www.postgresql.org/message-id/flat/4ffd72d6-8ab6-37c6-d7d5-dfed8967c0fc%40gmail.com), 
much of the testing was automated and could be re-applied to older 
branches.


Given that both the XSLT transforms and the testing scripts can largely be 
reused, do you think the benefits of having consistent sources in all the 
supported branches (easy back-porting of doc patches and one-time 
translation to all languages) could outweigh the inevitable conversion 
overhead? Are there any specific features that make branch 10 considerably 
different from its predecessors that we are missing?



On 04/30/2018 10:20 PM, Peter Eisentraut wrote:

On 4/27/18 11:03, Bruce Momjian wrote:

On Fri, Apr 27, 2018 at 11:00:36AM -0400, Peter Eisentraut wrote:

On 4/23/18 05:54, Liudmila Mantrova wrote:

Reading this thread, I got the impression that everyone would benefit
from converting back branches to XML, but the main concern is the lack of
resources to complete this task. Are there any other issues that affect
this decision? It looks like Aleksander Lakhin's offer to prepare patches
was somehow missed as the discussion got sidetracked onto other issues

That proposal seemed to indicate not only converting the source code to
XML but also the build system to XSL.  The latter is out of the
question, I think.

Why is that?

Because there would be a thousand lines of tooling changes to be
backported and thousands of pages of documentation to be checked
manually that it doesn't create a mess (times branches times platforms).



--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: documentation is now XML

2018-04-23 Thread Liudmila Mantrova

Hi everyone,

Reading this thread, I got the impression that everyone would benefit 
from converting back branches to XML, but the main concern is the lack of 
resources to complete this task. Are there any other issues that affect 
this decision? It looks like Aleksander Lakhin's offer to prepare patches 
was somehow missed as the discussion got sidetracked onto other issues:


I can prepare such patches (scripts to generate them). In fact, we 
(Postgres Pro) perform such a conversion (SGML->XML) on the fly when 
building docs, starting from 9.6. So it's no problem to convert *.sgml 
and replace the Makefile and *.xsl. But I would prefer to perform the 
conversion once we finish the move to 11devel (renaming sgml to xml, 
maybe optimizing the xsl's...).


Do you think it's worth considering after the May releases? I could also 
help with manual testing of the prepared patches if required. Supporting 
documentation and translation in several branches will be much easier if 
the sources are consistent.


--
Liudmila Mantrova
Technical writer at Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Fwd: [BUGS] pg_trgm word_similarity inconsistencies or bug

2018-04-16 Thread Liudmila Mantrova

Hi everyone,

When translating doc updates, Alexander Lakhin noticed that the trigram 
examples were not quite accurate.

A small patch fixing this issue is attached.


On 03/21/2018 03:35 PM, Teodor Sigaev wrote:

Thank you, pushed

David Steele wrote:

On 3/6/18 7:04 AM, Teodor Sigaev wrote:

I agree with Teodor (upthread, not quoted here) that the documentation
could use some editing.

I started to do it myself, but quickly realized I have no knowledge of
the content.  I'm afraid I would destroy the meaning while updating the
grammar.

Anyone understand the subject matter well enough to review the
documentation?


Liudmila tried to improve docs in Alexander's patchset.

https://www.postgresql.org/message-id/f43b242d-000c-f4c8-cb8b-d37e9752c...@postgrespro.ru 



This looks good to me with a few minor exceptions:

+   word_similarity(text, text) requires further
+   explanation. Consider the following example:

Maybe too verbose?  I think "word_similarity(text,
text) requires further explanation." can be removed entirely.

+   string.  However, this function does not add paddings to the

"add padding"


BTW, adding Liudmila's message to commitfest task
(https://commitfest.postgresql.org/17/1403/) doesn't work


Doesn't work for me either.

Alexander, can you post the final patches to the thread so they show up
in the CF app?

Thanks,





--
Liudmila Mantrova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/doc/src/sgml/pgtrgm.sgml b/doc/src/sgml/pgtrgm.sgml
index 8f39529..be43cdf 100644
--- a/doc/src/sgml/pgtrgm.sgml
+++ b/doc/src/sgml/pgtrgm.sgml
@@ -152,9 +152,9 @@
 
 
In the first string, the set of trigrams is
-   {"  w"," wo","ord","wor","rd "}.
+   {"  w"," wo","wor","ord","rd "}.
In the second string, the ordered set of trigrams is
-   {"  t"," tw",two,"wo ","  w"," wo","wor","ord","rds", ds "}.
+   {"  t"," tw","two","wo ","  w"," wo","wor","ord","rds","ds "}.
The most similar extent of an ordered set of trigrams in the second string
is {"  w"," wo","wor","ord"}, and the similarity is
0.8.
@@ -172,7 +172,7 @@
At the same time, strict_word_similarity(text, text)
has to select an extent that matches word boundaries.  In the example above,
strict_word_similarity(text, text) would select the
-   extent {"  w"," wo","wor","ord","rds", ds "}, which
+   extent {"  w"," wo","wor","ord","rds","ds "}, which
corresponds to the whole word 'words'.
 
 

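A quick way to verify corrections like these is to ask pg_trgm itself: the show_trgm function returns the trigram set extracted from a string. A minimal sketch (the exact ordering and quoting of array elements in psql output may differ):

-- Trigrams of the first string
SELECT show_trgm('word');
-- expected elements: "  w", " wo", "wor", "ord", "rd "

-- Trigrams of the second string
SELECT show_trgm('two words');
-- expected elements: "  t", " tw", "two", "wo ", "  w",
--                    " wo", "wor", "ord", "rds", "ds "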

Re: Fwd: [BUGS] pg_trgm word_similarity inconsistencies or bug

2018-01-19 Thread Liudmila Mantrova

Hello everyone,

I would like to contribute to documentation review of the patches 
discussed in thread 
https://www.postgresql.org/message-id/flat/cy4pr17mb13207ed8310f847cf117eed0d8...@cy4pr17mb1320.namprd17.prod.outlook.com 
(https://commitfest.postgresql.org/16/1403/). Unfortunately, I was not 
subscribed to pgsql-hackers before, so I cannot respond to this thread 
directly.


Please find attached new revisions of the original patches. I hope 
you'll find the changes useful!


--
Liudmila Mantrova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/pg_trgm/trgm_op.c b/contrib/pg_trgm/trgm_op.c
index f7e96ac..306d60b 100644
--- a/contrib/pg_trgm/trgm_op.c
+++ b/contrib/pg_trgm/trgm_op.c
@@ -456,7 +456,7 @@ iterate_word_similarity(int *trg2indexes,
 			lastpos[trgindex] = i;
 		}
 
-		/* Adjust lower bound if this trigram is present in required substring */
+		/* Adjust upper bound if this trigram is present in required substring */
 		if (found[trgindex])
 		{
 			int			prev_lower,
@@ -473,7 +473,7 @@ iterate_word_similarity(int *trg2indexes,
 
 			smlr_cur = CALCSML(count, ulen1, ulen2);
 
-			/* Also try to adjust upper bound for greater similarity */
+			/* Also try to adjust lower bound for greater similarity */
 			tmp_count = count;
 			tmp_ulen2 = ulen2;
 			prev_lower = lower;
diff --git a/doc/src/sgml/pgtrgm.sgml b/doc/src/sgml/pgtrgm.sgml
index 338ef30..005961c 100644
--- a/doc/src/sgml/pgtrgm.sgml
+++ b/doc/src/sgml/pgtrgm.sgml
@@ -99,12 +99,10 @@
   
   real
   
-   Returns a number that indicates how similar the first string
-   to the most similar word of the second string. The function searches in
-   the second string a most similar word not a most similar substring.  The
-   range of the result is zero (indicating that the two strings are
-   completely dissimilar) to one (indicating that the first string is
-   identical to one of the words of the second string).
+   Returns a number that indicates the greatest similarity between
+   the set of trigrams in the first string and any continuous extent
+   of an ordered set of trigrams in the second string.  For details, see
+   the explanation below.
   
  
  
@@ -131,6 +129,35 @@

   
 
+  
+   word_similarity(text, text) requires further
+   explanation. Consider the following example:
+
+
+# SELECT word_similarity('word', 'two words');
+ word_similarity
+-
+ 0.8
+(1 row)
+
+
+   In the first string, the set of trigrams is
+   {"  w"," wo","ord","wor","rd "}.
+   In the second string, the ordered set of trigrams is
+   {"  t"," tw",two,"wo ","  w"," wo","wor","ord","rds", ds "}.
+   The most similar extent of an ordered set of trigrams in the second string
+   is {"  w"," wo","wor","ord"}, and the similarity is
+   0.8.
+  
+
+  
+   This function returns a value that can be approximately understood as the
+   greatest similarity between the first string and any substring of the second
+   string.  However, this function does not add paddings to the boundaries of
+   the extent.  Thus, a whole word match gets a higher score than a match with
+   a part of the word.
+  
+
   
pg_trgm Operators

@@ -156,10 +183,11 @@
   text % text
   boolean
   
-   Returns true if its first argument has the similar word in
-   the second argument and they have a similarity that is greater than the
-   current word similarity threshold set by
-   pg_trgm.word_similarity_threshold parameter.
+   Returns true if the similarity between the trigram
+   set in the first argument and a continuous extent of an ordered trigram
+   set in the second argument is greater than the current word similarity
+   threshold set by the pg_trgm.word_similarity_threshold
+   parameter.
   
  
  
@@ -302,10 +330,11 @@ SELECT t, word_similarity('word', t) AS sml
   WHERE 'word' % t
   ORDER BY sml DESC, t;
 
-   This will return all values in the text column that have a word
-   which sufficiently similar to word, sorted from best
-   match to worst.  The index will be used to make this a fast operation
-   even over very large data sets.
+   This will return all values in the text column for which there is a
+   continuous extent in the corresponding ordered trigram set that is
+   sufficiently similar to the trigram set of word,
+   sorted from best match to worst.  The index will be used to make this
+   a fast operation even over very large data sets.
   
 
   
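For completeness, the fast index scan mentioned above requires a trigram index on the column, built with one of the operator classes shipped in pg_trgm. A minimal setup sketch, with hypothetical table and index names:

CREATE EXTENSION IF NOT EXISTS pg_trgm;
CREATE TABLE test_trgm (t text);

-- Either opclass supports the similarity operators; GIN is usually
-- faster to search, GiST faster to update.
CREATE INDEX trgm_idx ON test_trgm USING gin (t gin_trgm_ops);

-- Make word-similarity matching more permissive than the default.
SET pg_trgm.word_similarity_threshold = 0.5;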
diff --git a/contrib/pg_trgm/Makefile b/contrib/pg_trgm/Makefile
index 212a890..dfecc2a 100644
--- a/contrib/pg_trgm/Makefile
+++ b/contrib/pg_trgm/Makefile
@@ -4,11 +4,12 @@ MODULE_big = pg_trgm
 OBJS = trgm_op.o trgm_gist.o trgm_g