Re: [GENERAL] jsonb

2017-11-14 Thread Johannes Graën
On 11/14/2017 11:30 PM, hmidi slim wrote:
> I'm trying to use the function to_jsonb and create the name:
> to_jsonb ('{"key1":" ' || 'text1' || '","key2":" ' || 'text2' || '"}');

This query converts the whole string into a single JSON string value, not
into an object with keys. I guess what you intend to accomplish is rather:

select jsonb_build_object('key1', 'text1', 'key2', 'text2');
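
For comparison, to_jsonb on a text value only yields a JSON string, which is
why the original attempt does not produce any keys (a minimal illustration;
the cast to text is needed for a bare literal):

select to_jsonb('{"key1": "text1"}'::text);   -- "{\"key1\": \"text1\"}"  (a jsonb string)
select jsonb_build_object('key1', 'text1');   -- {"key1": "text1"}        (a jsonb object)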






Re: [GENERAL] Combine multiple text search configuration

2017-11-12 Thread Johannes Graën
Hi,


On 2017-11-06 09:17, hmidi slim wrote:
> Hi,
> I want to know if I can combine multiple text search configurations when
> using FTS. Is there an option like this:
>
> to_tsvector(['english', 'french'], document)
>
> Trying to create a new text search configuration:
>
> Create text search configuration test (copy=simple)
> Alter text search configuration test
> add mapping for asciiword with english_stem, french_stem
>
> This query doesn't work. How can I combine multiple text search
> configurations if I need more than one in my query to search for a word?

What about using two indexes, one for each language? If your documents
can be either English OR French, the English OR the French vector should
match an English OR French tsquery.

It is not clear to me how combining two stemmers should practically work
since each word can only have one stem. If you have multilingual
documents or texts with code switching, you could also try combining the
two vectors both for the documents and the query:

(to_tsvector('english', document) || to_tsvector('french', document)) @@
(to_tsquery('english', query) || to_tsquery('french', query))
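
If that combined expression is what gets queried, it can also be indexed
directly (a minimal sketch; the table and column names docs/document are
only illustrative):

create index docs_fts_en_fr_idx on docs using gin
  ((to_tsvector('english', document) || to_tsvector('french', document)));

The query then has to use exactly the same expression on the left-hand side
of @@ for the planner to consider this index.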





Re: [GENERAL] Combine multiple text search configuration

2017-11-12 Thread Johannes Graën
On 2017-11-07 08:27, hmidi slim wrote:
> Hi, 
> Thanks for your suggestion, but when using this query:
> (to_tsvector('english', document) || to_tsvector('french', document)) @@
> (to_tsquery('english', query) || to_tsquery('french', query))
> I think the performance decreases and it is not a good solution for a big
> amount of data. Is it?

You have more lexemes when you combine two languages, but not twice as
many, as there will be some overlap. That means your index will also be
bigger than a single-language index. Anyhow, I would expect this
variant to perform better than querying two single-language columns
simultaneously. Maybe one of the FTS developers could comment on this?






[GENERAL] create roles as normal user

2016-08-28 Thread Johannes
Hello,

I want a "normal" user to be able to create roles inside the database.
Is that only possible through the CREATEROLE privilege, or is there
something like sudo possible?

I thought it would be possible with the SECURITY DEFINER option of CREATE
FUNCTION, but that does not work for me (in this test the function was
created as a superuser):

```sql
CREATE OR REPLACE FUNCTION add_db1_user() returns void language sql
STRICT security definer AS $$
CREATE ROLE db1_testuser login password 'secret';
$$;
```

ERROR:  permission denied to create role
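
For reference, the straightforward route mentioned above (the CREATEROLE
privilege) would look roughly like this; the role name app_admin is only
illustrative:

```sql
-- let an existing non-superuser role create further (non-superuser) roles
ALTER ROLE app_admin CREATEROLE;
-- app_admin can then run, e.g.:
-- CREATE ROLE db1_testuser LOGIN PASSWORD 'secret';
```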


Best regards
Johannes





[GENERAL] recordings of pgconf us 2016

2016-05-28 Thread Johannes
I guess I have seen all video recordings from PGConf US 2015 on YouTube.
Are there any recordings from this year available?

Best regards Johannes





Re: [GENERAL] execute same query only one time?

2016-02-09 Thread Johannes


On 09.02.2016 at 00:56, Vitaly Burovoy wrote:
> On 2/8/16, Johannes <jo...@posteo.de> wrote:
>> On 08.02.2016 at 21:50, Vitaly Burovoy wrote:
>>> On 2/8/16, Johannes <jo...@posteo.de> wrote:
>>>> On 08.02.2016 at 21:17, Vitaly Burovoy wrote:
>>>>> Hmm. Could you clarify why you don't want to pass id from the first
>>>>> query to the second one:
>>>>>
>>>>> select col1 from t1 where t0_id = value_id_from_the_first_query
>>>>
>>>> Of course I could do that, but in that case I would not ask.
>>>>
>>>> I thought there could be a better solution to execute all statements at
>>>> once.
>>>
>>> What is the reason to execute all statements which return different
>>> columns at once?
>>>
>>>> Saving roundtrips,
>>>
>>> In most cases they are not so big. Getting a bunch of duplicated data
>>> is wasting your network bandwidth and doesn't increase speed.
>>
>> In my and your example no duplicated data (result sets) is sent over the
>> network. The server does not need to wait until the client snips out the
>> id and sends the id in the next query again. So the server can compute
>> the result set without external dependencies as fast as possible.
> 
> We are talking about executing all statements at once to save RTT. Are we?

Yes, we do.

> And a parallel thread has advice to join tables (queries). It is a way
> to run both queries at once, but it is not a solution.

Right.

>>>> increase speed,
>>>
>>> Speed will be at least the same. In your case either you have to use
>>> more DDL (like CREATE TEMP TABLE) or get copied columns, which takes
>>> more time to encode/decode and send over the network.
>>
>> The time difference is small, yes.
>> My old variant of executing the first select, remembering the returned id
>> value, pasting it into the second query and executing that takes 32 ms.
>>
>> Your temp table variant needs 29 ms. Nice to see. That is a 10% speed
>> improvement.
> 
> I guess you measure it by your app. It is just a measurement error.
> +-3ms can be a sum of TCP packet loss, system interrupts, system timer
> inaccuracy, multiple cache missing, different layers (you are using
> Java, it has a VM and a lot of intermediate abstraction layers).

I know all these facts. I ran it only a few times, single threaded,
with high resolution. The rounded result of 3 ms was reliable, and that is
exact enough for my rule of thumb.
Your temp table variant looks more elegant and is slightly faster. I'm
fine with that.

> Remember, my version has 6 statements, each of which requires some work
> on PG's side, plus my version has two joins, which are usually slower than
> a direct search by a value. Your version has only 4 statements and
> only one slow place -- the "where" clause in the second select, which can
> be replaced by a value found in the first select (your version sends
> more data: value1, value2, ...).

Anyway, it is faster ;)

> You also can avoid "begin" and "commit" since default transaction
> isolation is "READ COMMITTED"[1]:
>> Also note that two successive SELECT commands can see different data,
>> even though they are within a single transaction, if other transactions 
>> commit
>> changes after the first SELECT starts and before the second SELECT starts.

I know. I did not mention that I run my queries with repeatable read
isolation.
And I read it is wise to bundle multiple queries into one transaction,
because the overhead of multiple transactions can be reduced to that of one.

> If you want to measure time, run both versions 10k times in 8
> connections simultaneously and compare results. ;-)
> 
> 32ms * 10k requests / 8 threads = 40000ms = 40sec

That's more complicated; I think I learned enough about it. But thanks.

Ciao Johannes





Re: [GENERAL] execute same query only one time?

2016-02-08 Thread Johannes


On 08.02.2016 at 21:17, Vitaly Burovoy wrote:
> On 2/8/16, Johannes <jo...@posteo.de> wrote:
>> On 08.02.2016 at 20:15, David G. Johnston wrote:
>>> On Mon, Feb 8, 2016 at 12:05 PM, Johannes <jo...@posteo.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> is there a best practice to share data between two select statements?
>>>>
>>>> Imagine the following situation: I want to receive two result sets from two
>>>> tables, referring to a specific id from table t0 AND I try not to query
>>>> for that specific id a second time.
>>>>
>>>> Table t0 returns 1 row and table t1 returns multiple rows.
>>>>
>>>> begin;
>>>> select id, col1, col2, ... from t0 where id = (select max(id) from t0
>>>> where col1 = value1 and col2 = value2 and ...);
>>>> select col1 from t1 where t0_id = (select max(id) from t0 where col1 =
>>>> value1 and col2 = value2 and ...);
>>>> commit;
>>>
>>>
>>> Please confirm:​
>>>
>>> ​You want the​ result of "SELECT max(id) FROM t0" to be used in the
>>> second
>>> query without having to recompute it?
>>
>> Yes.
>>
>>> What client are you using to execute these statements?
>>
>> JDBC. I execute both statements at once and iterate through the resultsets.
>>
>> Johannes
> 
> Hmm. Could you clarify why you don't want to pass id from the first
> query to the second one:
> 
> select col1 from t1 where t0_id = value_id_from_the_first_query
> 

Of course I could do that, but in that case I would not ask.

I thought there could be a better solution to execute all statements at
once: saving round trips, increasing speed, a more sophisticated solution,
learning something new...

Johannes





Re: [GENERAL] execute same query only one time?

2016-02-08 Thread Johannes


On 08.02.2016 at 21:33, Vitaly Burovoy wrote:
> On 2/8/16, Johannes <jo...@posteo.de> wrote:
>>
>> On 08.02.2016 at 20:32, Vitaly Burovoy wrote:
>>> On 2/8/16, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>>>> Based on rough guess of the above, without seeing actual table schemas:
>>>>
>>>> select id, t0.col1, t1.col1, col2, ... from t0 JOIN t1 ON t0.id =
>>>> t1.t0_id where id = (select max(id) from t0 where col1 = value1 and col2
>>>> = value2 and ...);
>>>
>>> I don't think it is a good solution because it leads to copying
>>> columns from the t0 which is wasting net traffic and increasing
>>> complexity at the client side. Moreover it works iff t0 returns only
>>> one row.
>>
>> I had the same doubts.
>> A CTE would be first class if it were reusable for other statements.
>>
>> Johannes
> 
> CTEs are temporary tables for a _statement_, for using a single
> statement instead of several ones (create temp table, insert into,
> select from it, select from it, drop temp table).
> 
> But it is not your case because CTEs are for queries which return a
> single set of rows. Your case is returning two sets (one row with
> several columns from t0 and several rows with a single column from
> t1).

Sure.
Thanks for the temporary table example!

Johannes
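
For reference, the temp-table pattern described above would look roughly
like this (table and column names follow the earlier example; value1 and
value2 are placeholders as in the original question):

BEGIN;
CREATE TEMP TABLE tmp_t0 ON COMMIT DROP AS
  SELECT id, col1, col2
  FROM t0
  WHERE id = (SELECT max(id) FROM t0
              WHERE col1 = value1 AND col2 = value2);
SELECT id, col1, col2 FROM tmp_t0;                           -- first result set
SELECT t1.col1 FROM t1 JOIN tmp_t0 ON t1.t0_id = tmp_t0.id;  -- second result set
COMMIT;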






Re: [GENERAL] execute same query only one time?

2016-02-08 Thread Johannes


On 08.02.2016 at 20:32, Vitaly Burovoy wrote:
> On 2/8/16, Adrian Klaver <adrian.kla...@aklaver.com> wrote:
>> On 02/08/2016 11:05 AM, Johannes wrote:
>>> Imagine the following situation: I want to receive two result sets from two
>>> tables, referring to a specific id from table t0 AND I try not to query
>>> for that specific id a second time.
>>
>>> Table t0 returns 1 row and table t1 returns multiple rows.
>>>
>>> begin;
>>> select id, col1, col2, ... from t0 where id = (select max(id) from t0
>>> where col1 = value1 and col2 = value2 and ...);
>>> select col1 from t1 where t0_id = (select max(id) from t0 where col1 =
>>> value1 and col2 = value2 and ...);
>>> commit;
>>>
>>> Best regards Johannes
>>
>> Based on rough guess of the above, without seeing actual table schemas:
>>
>> select id, t0.col1, t1.col1, col2, ... from t0 JOIN t1 ON t0.id =
>> t1.t0_id where id = (select max(id) from t0 where col1 = value1 and col2
>> = value2 and ...);
> 
> I don't think it is a good solution because it leads to copying
> columns from the t0 which is wasting net traffic and increasing
> complexity at the client side. Moreover it works iff t0 returns only
> one row.

I had the same doubts.
A CTE would be first class if it were reusable for other statements.

Johannes





[GENERAL] execute same query only one time?

2016-02-08 Thread Johannes
Hi,

is there a best practice to share data between two select statements?

Imagine the following situation: I want to receive two result sets from two
tables, referring to a specific id from table t0 AND I try not to query
for that specific id a second time.

Table t0 returns 1 row and table t1 returns multiple rows.

begin;
select id, col1, col2, ... from t0 where id = (select max(id) from t0
where col1 = value1 and col2 = value2 and ...);
select col1 from t1 where t0_id = (select max(id) from t0 where col1 =
value1 and col2 = value2 and ...);
commit;

Best regards Johannes





Re: [GENERAL] execute same query only one time?

2016-02-08 Thread Johannes
On 08.02.2016 at 20:15, David G. Johnston wrote:
> On Mon, Feb 8, 2016 at 12:05 PM, Johannes <jo...@posteo.de> wrote:
> 
>> Hi,
>>
>> is there a best practice to share data between two select statements?
>>
>> Imagine the following situation: I want to receive two result sets from two
>> tables, referring to a specific id from table t0 AND I try not to query
>> for that specific id a second time.
>>
>> Table t0 returns 1 row and table t1 returns multiple rows.
>>
>> begin;
>> select id, col1, col2, ... from t0 where id = (select max(id) from t0
>> where col1 = value1 and col2 = value2 and ...);
>> select col1 from t1 where t0_id = (select max(id) from t0 where col1 =
>> value1 and col2 = value2 and ...);
>> commit;
> 
> 
> Please confirm:​
> 
> ​You want the​ result of "SELECT max(id) FROM t0" to be used in the second
> query without having to recompute it?

Yes.

> What client are you using to execute these statements?

JDBC. I execute both statements at once and iterate through the resultsets.

Johannes





Re: [GENERAL] execute same query only one time?

2016-02-08 Thread Johannes


On 08.02.2016 at 21:50, Vitaly Burovoy wrote:
> On 2/8/16, Johannes <jo...@posteo.de> wrote:
>> On 08.02.2016 at 21:17, Vitaly Burovoy wrote:
>>> On 2/8/16, Johannes <jo...@posteo.de> wrote:
>>>> On 08.02.2016 at 20:15, David G. Johnston wrote:
>>>>> On Mon, Feb 8, 2016 at 12:05 PM, Johannes <jo...@posteo.de> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> is there a best practice to share data between two select statements?
>>>>>>
>>>>>> Imagine the following situation: I want to receive two result sets from
>>>>>> two
>>>>>> tables, referring to a specific id from table t0 AND I try not to
>>>>>> query
>>>>>> for that specific id a second time.
>>>>>>
>>>>>> Table t0 returns 1 row and table t1 returns multiple rows.
>>>>>>
>>>>>> begin;
>>>>>> select id, col1, col2, ... from t0 where id = (select max(id) from t0
>>>>>> where col1 = value1 and col2 = value2 and ...);
>>>>>> select col1 from t1 where t0_id = (select max(id) from t0 where col1 =
>>>>>> value1 and col2 = value2 and ...);
>>>>>> commit;
>>>
>>> Hmm. Could you clarify why you don't want to pass id from the first
>>> query to the second one:
>>>
>>> select col1 from t1 where t0_id = value_id_from_the_first_query
>>>
>>
>> Of course I could do that, but in that case I would not ask.
>>
>> I thought there could be a better solution to execute all statements at
>> once.
> 
> What is the reason to execute all statements which return different
> columns at once?
> 
>> Saving roundtrips,
> 
> In most cases they are not so big. Getting a bunch of duplicated data
> is wasting your network bandwidth and doesn't increase speed.

In my and your example no duplicated data (result sets) is sent over the
network. The server does not need to wait until the client snips out the
id and sends the id in the next query again. So the server can compute
the result set without external dependencies as fast as possible.

>> increase speed,
> 
> Speed will be at least the same. In your case either you have to use
> more DDL (like CREATE TEMP TABLE) or get copied columns, which takes
> more time to encode/decode and send over the network.

The time difference is small, yes.
My old variant of executing the first select, remembering the returned id
value, pasting it into the second query and executing that takes 32 ms.

Your temp table variant needs 29 ms. Nice to see. That is a 10% speed
improvement.

>> a more sophisticated solution,
> 
> It usually depends on the task. Your case is simple enough and can't
> lead to any sophisticated solution. =(
> 

No problem.

>> learn something new...
> 
> It makes sense. =)
> 
>> Johannes

Good night.





Re: [GENERAL] long transfer time for binary data

2016-01-25 Thread Johannes
On 25.01.2016 at 19:59, Daniel Verite wrote:
>   Johannes wrote:
> 
>> \lo_export 12345 /dev/null is completed in 0.86 seconds.
> 
> If it's an 11MB file through a 100Mbits/s network, that's
> pretty much the best that can be expected.
> 
> I would think the above is the baseline against which
> the other methods should be compared.
> 
>> I save my images as large objects, which afaik are in practice not
>> readable with a binary cursor (we should use the lo_* functions). And of
>> course I already use the LargeObjectManager of the postgresql jdbc library.
> 
> Sounds good.
> 
>> You said, the server has to convert the bytes to hex string before
>> sending it over the wire.
> 
> Only in certain contexts. SELECT lo_get(oid) is a query that returns
> bytea, so if the clients requests results in text format, they will
> be transferred as text, and it's the responsibility of the client
> to convert them back to bytes (or not, who knows, maybe the
> client wants hexadecimal).
> 
> But lo_get is an exception, and besides a late addition (9.4 I believe).
> Generally, client-side access to large objects functions doesn't
> use a client-side SQL query, it's done through the 
> "Function Call sub-protocol" described here:
> http://www.postgresql.org/docs/current/static/protocol-flow.html#AEN108750
> and the result comes back as binary.
> 
> Presumably the JDBC LargeObjectManager uses that method.
> 
> Best regards,

I thought \lo_export could run on the server side only (like \copy vs. copy).
The 0.8 seconds was the runtime on the server, to the server's disk.

Running from client (transfers only 12M):

real    0m3.386s
user    0m0.308s
sys 0m0.176s

Best regards Johannes





Re: [GENERAL] long transfer time for binary data

2016-01-23 Thread Johannes
On 23.01.2016 at 23:38, John R Pierce wrote:
> On 1/23/2016 2:19 PM, Johannes wrote:
>> I save my images as large object, which afaik is in practise not
>> readable with a binary cursor (we should use the lo_* functions). And of
>> course I already use the LargeObjectManager of the postgresql jdbc
>> library.
> 
> 
> afaik, Large Objects are completely independent of the other mode
> stuff. They are stored and transmitted in binary.


Depends on the client. It can be transferred as text or binary. And the
data is sliced into bytea segments [1], and afaik it is stored as a binary
string.


> I haven't read this whole ongoing thread, just glanced at messages as
> they passed by over the past week or whatever, but I have to say, I
> would NOT be storing 11MB images directly in SQL, rather, I would store
> it on a file server, and access it with nfs or https or whatever is most
> appropriate for the nature of the application.   I would store the
> location and metadata in SQL.


The 11MB file is the biggest image; the rest are normal. I know about
the arguments, but there are pros I want to use in production
(transactions, integrity). But if it fails (amount of space? slow
import?) I may exclude the image data.

[1] http://www.postgresql.org/docs/9.5/static/catalog-pg-largeobject.html





Re: [GENERAL] long transfer time for binary data

2016-01-23 Thread Johannes
On 23.01.2016 at 01:25, Daniel Verite wrote:
>   Johannes wrote:
> 
>> psql
>> select lo_get(12345);
>> +ssl -compression 6.0 sec
>> -ssl  4.4 sec
> 
> psql requests results in text format so that SELECT does not
> really test the transfer of binary data.
> With bytea_output to 'hex', contents are inflated by 2x.
> 
> Can you tell how fast this goes for you, as a comparison point:
>\lo_export 12345 /dev/null
> ?
> 
> Many client interfaces use the text format, but you want to
> avoid that if possible with large bytea contents.
> In addition to putting twice the data on the wire, the server has to
> convert the bytes to hex and the client has to do the reverse operation,
> a complete waste of CPU time on both ends.
> 
> At the SQL level, the DECLARE name BINARY CURSOR FOR query
> can help to force results in binary, but as the doc says:
> 
>  http://www.postgresql.org/docs/current/static/sql-declare.html
> 
>   "Binary cursors should be used carefully. Many applications, including
>   psql, are not prepared to handle binary cursors and expect data to
>   come back in the text format."
> 
> Personally I don't have experience with JDBC, but looking at the doc:
> https://jdbc.postgresql.org/documentation/94/binary-data.html
> 
> I see this:
> 
> "To use the Large Object functionality you can use either the
> LargeObject class provided by the PostgreSQL™ JDBC driver, or by using
> the getBLOB() and setBLOB() methods."
> 
> If the data lives on the server as large objects, I would think that
> this LargeObject class has the best potential for retrieving them
> efficiently, as opposed to "SELECT lo_get(oid)" which looks like
> it could trigger the undesirable round-trip to the text format.
> You may want to test that or bring it up as a question to JDBC folks.
> 
> 
> Best regards,


\lo_export 12345 /dev/null is completed in 0.86 seconds.

I save my images as large objects, which afaik are in practice not
readable with a binary cursor (we should use the lo_* functions). And of
course I already use the LargeObjectManager of the PostgreSQL JDBC library.

You said the server has to convert the bytes to a hex string before
sending them over the wire. In my understanding bytea values are stored as
strings and may be compressed in TOAST storage.
> The bytea data type allows storage of binary strings [1]
Which is correct?

Your post gave me the hint. I found that a binary transfer parameter is
available in the PostgreSQL JDBC library [2], [3].

But turning it on doesn't speed anything up. It seems the binary
transfer mode is active by default. The byte counter (iptables -v) is
nearly as big as the image itself. It is already optimal.

  packets  byte counter
psql  +ssl   8514  23M
psql  -ssl   8179  23M
pgadmin   -ssl  11716  33M
pgadmin   +ssl -compress12196  34M
pgadmin   +ssl +compress12193  34M
java jdbc +ssl  14037  24M
java jdbc -ssl   5622  12M (3.1 seconds)
java jdbc -ssl binarytransfer=true   5615  12M (3.1 seconds)

In fact I do not understand what the bottleneck is. OK, my server runs
on a Raspberry Pi 2B+, which is maybe not the best hardware. But the scp
command could be finished from there in 1.3 seconds. So the bottleneck
is not the network speed, and also not the USB disk drive. Maybe it is
the slow Java program? I pointed my Java program at my local PostgreSQL
instance (with the same image as a large object, same MTU, no loopback
device, no unix socket, but a better system) and it finished in 400 ms.
So the Java program is ruled out too. Did I forget anything?

I'm afraid I have to live with it and may use thumbnail images.

Best regards

[1] http://www.postgresql.org/docs/current/static/datatype-binary.html
[2] https://wiki.postgresql.org/wiki/JDBC-BinaryTransfer
[3]
https://jdbc.postgresql.org/documentation/94/connect.html#connection-parameters





Re: [GENERAL] long transfer time for binary data

2016-01-22 Thread Johannes
On 21.01.2016 at 08:44, George Neuner wrote:
> On Wed, 20 Jan 2016 22:29:07 +0100, Johannes <jo...@posteo.de> wrote:
> 
>> I noticed transferring a large object or bytea data between client and
>> server takes a long time.
>> For example: An image with a real size of 11 MB could be read on server
>> side (explain analyze) in 81ms. Fine.
>>
>> But on client side the result was completed after 6.7 seconds without
>> ssl compression and 4.5 seconds with ssl compression (both via 100MBit
>> ethernet).
> 
> I think at ~4 seconds you're actually running pretty close to the
> limit of what is possible.
> 
> Remember that, even assuming the execution plan is accurate and also
> is representative of an average request, your 81ms image fetch may be
> arbitrarily delayed due to server load.
> 
> Even a quiet network has overhead: layers of packet headers, TCP
> checksums (CRC) and ack packets, etc. ... it's quite hard to sustain
> more than 95% of the theoretical bandwidth even on a full duplex
> private subnet.  So figure 11MB of data will take ~1.2 seconds under
> _optimal_ conditions.  Any competing traffic will just slow it down.
> 
> Also note that if the image data was stored already compressed,
> additionally trying to use connection level compression may expand the
> data and increase the transmission time, as well as adding processing
> overhead at both ends.
> 
> And then the client has to convert the image from the storage format
> into a display compatible bitmap and get it onto the screen.
> 
> 
>> Are there any other solutions available to display my images in my
>> client application more quickly? Or are there planned improvements to
>> postgresql (transferring the real binary data)?
> 
> You don't say what is the client platform/software or what format are
> the images.  11MB is (equivalent to) 1500+ pixels square depending on
> pixel/color depth.  That's a relatively large image - even from a
> local file, rendering that would take a couple of seconds.  Add a
> couple more seconds for request turn-around and there is your time
> gone.
> 
> BMP and GIF repectively are the formats that are quickest to render.
> If your stored images are in different format, it might be worth
> converting them to one of these.
> 
> GIF and _some_ BMP formats support direct compression of the pixel
> data.  If you find you must store the pixel data uncompressed, you can
> always gzip the resulting image file and store that.
> 
> Then don't use connection level compression.  With images stored
> already compressed the transmitted size is minimized, and you will
> only ever decompress (on the client) data in the critical path to the
> display.
> 
> 
> Hope this helps,
> George


Thanks for the explanation. I'm writing client software in Java/JDBC. Most
images are in JPEG format. Some have high quality, most medium.

Rendering this 11MB image in eog (Eye of GNOME) takes 0.5 sec; in GIMP it
is very fast. In Java the object creation takes nearly all the time, the
drawing is done very quickly.

The size of the binary string representation of this image is 22MB. I
guess there is no other special transfer mechanism for binary data
than plain text via SQL, or is there?

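As a quick way to see that 2x factor directly in SQL (the oid 12345 is only
illustrative, as elsewhere in this thread):

select octet_length(lo_get(12345))          as raw_bytes,
       length(encode(lo_get(12345), 'hex')) as hex_chars;
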
Best regards
Johannes





Re: [GENERAL] long transfer time for binary data

2016-01-21 Thread Johannes
Here are some transfer measurements (from server to client) with the
same file.

scp
+ssl -compression 1.3 sec
+ssl +compression 4.6 sec

pgadmin
select lo_get(12345);
-ssl  3.4 sec
+ssl +compression 5.5 sec
+ssl -compression 4.5 sec

psql
select lo_get(12345);
+ssl -compression 6.0 sec
-ssl  4.4 sec

java/jdbc
only while(in.read(buf,0,len))
+ssl -compression 6.0 sec
-ssl  3.0 sec (+ 1.8 sec for new Image())

Here is a link about SSL compression being insecure:
https://en.wikipedia.org/wiki/Comparison_of_TLS_implementations#Compression

Best regards
Johannes

On 21.01.2016 at 03:33, Andy Colson wrote:
> On 01/20/2016 03:29 PM, Johannes wrote:
>> I noticed transferring a large object or bytea data between client and
>> server takes a long time.
>> For example: An image with a real size of 11 MB could be read on server
>> side (explain analyze) in 81ms. Fine.
>>
>> But on client side the result was completed after 6.7 seconds without
>> ssl compression and 4.5 seconds with ssl compression (both via 100MBit
>> ethernet).
>>
>> SSL compression seems to be not a good idea anymore, since this had
>> become a security risk. Its still possible with pgadmin, but afaik not
>> with java/jdbc .
>>
>> Are there any other solutions available to display my images in my
>> client application more quickly? Or are there planned improvements to
>> postgresql (transferring the real binary data)?
>>
>> Best regards
>> Johannes
>>
> 
> Yep, that's slow.  The ssl compression is very odd if the image is
> jpeg'ish and already compressed.  If its a bitmap or uncompressed tif
> then its not so surprising.
> 
> A few tests you could try:
> 
> 1) copy the same 11 meg file from server to client via regular file copy
> and time it.  At 100 Mbit/s it should take about a second.  If it takes
> 6 you have network problems, not PG problems.
> 
> 2) try it via psql command line (or at least something other than java),
> to see if its java thats the problem.
> 
> 3) watch wireshark/tcpdump, maybe you'll see something glaring that'll
> point you in the right direction.
> 
> -Andy
> 
> PS: I've never heard that ssl compression was a security risk, got
> links/proof?
> 
> 





[GENERAL] long transfer time for binary data

2016-01-20 Thread Johannes
I noticed transferring a large object or bytea data between client and
server takes a long time.
For example: An image with a real size of 11 MB could be read on server
side (explain analyze) in 81ms. Fine.

But on client side the result was completed after 6.7 seconds without
ssl compression and 4.5 seconds with ssl compression (both via 100MBit
ethernet).

SSL compression seems to be not a good idea anymore, since it has
become a security risk. It's still possible with pgAdmin, but afaik not
with Java/JDBC.

Are there any other solutions available to display my images in my
client application more quickly? Or are there planned improvements to
postgresql (transferring the real binary data)?

Best regards
Johannes





[GENERAL] update inside function does not use the index

2015-11-16 Thread Johannes
Dear List,

I have a problem with a self-written function which does not use the
index and therefore takes very long (500 ms per update).


The pl/pgsql function iterates over a select resultset with a cursor.
In every loop I execute an update with a where LIKE condition, which
relates to my current cursor position:

FOR i IN SELECT id, level_ids, path_names||'%' as path_names from x LOOP
update x set path_ids[i.level] = id where path_names like i.path_names;
RAISE NOTICE 'path_names : %', i.path_names;
END LOOP;



Calling the updates outside the function, they are very fast because
like 'a.b%' uses the index of the path field ( ~ 15 ms ).

Does anyone know how to fix that?
Thanks, Johannes





Re: [GENERAL] update inside function does not use the index

2015-11-16 Thread Johannes
That solves my problem. Thanks!!

Best regards Johannes

On 16.11.2015 at 18:19, Tom Lane wrote:
> Adrian Klaver <adrian.kla...@aklaver.com> writes:
>> On 11/16/2015 08:03 AM, Johannes wrote:
>>>> In every loop I execute an update with a where LIKE condition, which
>>>> relates to my current cursor position:
>>>> FOR i IN SELECT id, level_ids, path_names||'%' as path_names from x LOOP
>>>> update x set path_ids[i.level] = id where path_names like i.path_names;
> 
> Probably the problem is that the planner is unable to fold i.path_names
> to a constant, so it can't derive an indexscan condition from the LIKE
> clause.
> 
> A little bit of experimentation says that that will work if "i" is
> declared with a named rowtype, but not if it's declared RECORD.  This
> might or might not be something we could fix, but in the meantime I'd
> try
> 
> DECLARE i x%rowtype;
> 
> FOR i IN SELECT * FROM x LOOP
> update x set path_ids[i.level] = id where path_names like (i.path_names || 
> '%');
> 
> which while it might look less "constant" is actually more so from the
> planner's perspective, because there is no question of whether "i" has
> got a field of that name.
> 
>   regards, tom lane
> 
> 





Re: [GENERAL] update inside function does not use the index

2015-11-16 Thread Johannes
This helps me to understand these common table expressions better.
Thanks. This looks more elegant than the cursor variant.

Limiting the CTE to 10 records, the update query needs 1.8 seconds, but
the cursor variant ( 10 records ) finished in 0.7 seconds. I guess
it is faster because behind the scenes no join is needed.

Best regards Johannes

On 16.11.2015 at 15:22, Thomas Kellerer wrote:
> Johannes wrote on 16.11.2015 at 14:56:
>> I have problems with a self written function, which does not use the
>> index, which takes very long (500 ms per update).
>>
>> The pl/pgsql function iterates over a select resultset with a cursor.
>> In every loop I execute an update with a where LIKE condition, which
>> relates to my current cursor position:
>>
>> FOR i IN SELECT id, level_ids, path_names||'%' as path_names from x LOOP
>> update x set path_ids[i.level] = id where path_names like i.path_names;
>> RAISE NOTICE 'path_names : %', i.path_names;
>> END LOOP;
>>
>> Calling the updates outside the function, they are very fast because
>> like 'a.b%' uses the index of the path field ( ~ 15 ms ).
> 
> 
> Doing row-by-row processing (also referred to as "slow-by-slow")  is usually 
> not a good idea. 
> 
> I think your statement can be re-written to avoid the loop completely: 
> 
> with path_levels as (
>   SELECT id, 
>  level_ids, 
>  path_names||'%' as path_names 
>   from x 
> )
> update x 
>   set path_ids[i.level] = id
> from path_levels i
> where x.path_names like i.path_names
> 
> 
> 
> 
> 





Re: [GENERAL] update inside function does not use the index

2015-11-16 Thread Johannes
No, I made a mistake while simplifying it.

It should be

FOR i IN SELECT id, level, path_names||'%' as path_names from x LOOP
update x set path_ids[i.level] = i.id where path_names like i.path_names;

Sorry.
Best regards Johannes

On 16.11.2015 at 15:10, Adrian Klaver wrote:
> On 11/16/2015 05:56 AM, Johannes wrote:
>> Dear List,
>>
>> I have problems with a self written function, which does not use the
>> index, which takes very long (500 ms per update).
>>
>>
>> The pl/pgsql function iterates over a select resultset with a cursor.
>> In every loop I execute an update with a where LIKE condition, which
>> relates to my current cursor position:
>>
>> FOR i IN SELECT id, level_ids, path_names||'%' as path_names from x LOOP
>> update x set path_ids[i.level] = id where path_names like i.path_names;
> 
> Is this the actual UPDATE in the function?
> 
> If so, where are i.level and id  coming from?
> 
> Or is that supposed to be?:
> 
> update x set path_ids[i.level_ids] = i.id where path_names like
> i.path_names;
> 
>> RAISE NOTICE 'path_names : %', i.path_names;
>> END LOOP;
>>
>>
>>
>> Calling the updates outside the function, they are very fast because
>> like 'a.b%' uses the index of the path field ( ~ 15 ms ).
>>
>> Does anyone know how to fix that?
>> Thanks, Johannes
>>
> 
> 





Re: [GENERAL] update inside function does not use the index

2015-11-16 Thread Johannes
A function seems to be atomic for the analyze command (or is it?)

EXPLAIN ANALYZE select my_function();

returns no inner query plan, just the costs, rows and width.

On 16.11.2015 at 17:57, Adrian Klaver wrote:
> EXPLAIN ANALYZE select ... your_function(...);
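
As a side note, the auto_explain module can log the plans of statements
executed inside a function; a minimal sketch (loaded per session, which
requires superuser, and the settings shown are only an example):

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;
SET auto_explain.log_nested_statements = on;
SELECT my_function();  -- the inner statements' plans appear in the server log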





[GENERAL] UNIQUE contraint that is initially deferred in Postgres 8.4.13

2014-03-13 Thread Johannes Bauer
Hi list,

I'm having trouble with a UNIQUE constraint that I need to have
DEFERRABLE INITIALLY DEFERRED. On my Dev machine (Postgres 9.1) it works
fine:

alter table foo drop constraint bar;
ALTER TABLE

alter table foo add constraint bar UNIQUE (col1, col2) deferrable
initially deferred;
ALTER TABLE


In Production (Postgres 8.4.13) this blows up in my face although it
should be supported according to the docs:


alter table foo drop constraint bar;
ALTER TABLE

alter table foo add constraint bar UNIQUE (col1, col2) deferrable
initially deferred;
FEHLER:  Syntaxfehler bei »DEFERRABLE«
LINE 1: bar UNIQUE (col1, col2) DEFERRABLE...
^

(i.e. syntax error at DEFERRABLE, psql seems to ignore my locale
setting).

Does somebody know what the reason for this could be? I'm kind of puzzled.

Thanks in advance,
Best regards,
Joe




Re: [GENERAL] UNIQUE contraint that is initially deferred in Postgres 8.4.13

2014-03-13 Thread Johannes Bauer
On 13.03.2014 13:02, Magnus Hagander wrote:

 My guess is you're looking in the wrong place in the docs - can you specify
 where you are looking? If you are in the right place then the docs are
 wrong.

Here: http://www.postgresql.org/docs/8.4/static/sql-createtable.html

 Deferrable unique constraints are a new feature in PostgreSQL 9.0, so it
 shouldn't be working in 8.4.

Ah! Okay, thanks for the info. That would explain it :-)

 (And as a sidenote, a reminder that 8.4 will go end of life in just a
 couple of months, so if you aren't already planning an upgrade of your
 production environment, it's probably a good idea to start doing that, see
 http://www.postgresql.org/support/versioning/)

I'm going to roll up my sleeves and tackle that then...

Thanks for the help,
Joe




[GENERAL] Some questions about postgresql's default text search parser

2013-11-06 Thread johannes graën
Hi eveyone,

I've been trying to understand the text search parser's behaviour.
Looking at the source code [1] it seems as if there is a
sophisticated FSM mapping the input string to a list of tuples of
category (as defined in [1], lines 32-56, or [2]) and substring from
the original string, ordered by the appearance of the substring in the
original.

* Is there any documentation to be found on this parser?


As the parser is not aware of the underlying language, I would like to
create my own one.

* Is adding one to pg_ts_parsers the right way or should this rather
be done outside of the PG internals?
* For the first case, is there any manual or documentation how to do so?


If you want to comprehend my aims, try these commands:

select (ts_parse(3722,s)).*, (ts_debug(s)).*, (ts_debug('french',s)).*
from (select 'aujourd''hui ils m''ont dit qu''il y aura peut-être plus
de 10 000 personnes'::text s) x;

select (ts_parse(3722,s)).*, (ts_debug(s)).* from (select 'heu
d''anar-hi'::text s) x;

Best
  Johannes



[1] http://doxygen.postgresql.org/wparser__def_8c_source.html
[2] http://www.postgresql.org/docs/9.3/static/textsearch-parsers.html




Re: [GENERAL] large database

2012-12-11 Thread Johannes Lochmann
Hello Jan, hello List

On 12/11/2012 09:10 AM, Jan Kesten wrote:
 There are some slides from Sun/Oracle about ZFS, ZIL, SSD and
 PostgreSQL performance (I can look if I find them if needed).

I would very much appreciate a copy or a link to these slides!

Johannes




Re: [GENERAL] large database

2012-12-11 Thread Johannes Lochmann
Hi all,

On 12/11/2012 11:02 AM, Jan Kesten wrote:
 I would very much appreciate a copy or a link to these slides!
 here they are:

 http://www.scribd.com/mobile/doc/61186429

thank you very much!

Johannes




[GENERAL] SEPostgreSQL alive?

2012-09-04 Thread Johannes Segitz
Hello,

I'm currently working on a project to secure a web application with
SELinux, and I'm interested in SEPostgreSQL. Is this project still
alive? I couldn't find any information besides
http://wiki.postgresql.org/wiki/SEPostgreSQL_Introduction

Is anyone actively using/developing SEPostgreSQL?

Regards,
Johannes




Re: [GENERAL] Force ARE in regexp string

2010-09-22 Thread Johannes Öberg

 On 2010-09-15 15:33, Tom Lane wrote:

Johannes Öberg <johannes.ob...@proactivegaming.com> writes:

I've set regex_flavor to ARE, and I've tried prefixing my strings, i.e.
~* E'***:abc' but for some reason postgres treats all my regexps as BRE's.

Well, the symptom as described seems pretty improbable. You didn't show
an exact example, but I'm suspecting the real problem is that you're not
allowing for backslashes in a string literal getting eaten by string
parsing.  Do the cases that don't work for you involve backslashes in
the regex?

regards, tom lane

This was indeed what was happening, problem solved, thanks a lot! Now
I'm having new problems with Postgres seemingly thinking I'm regexping
too much for a single query, but that will be another thread.

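A minimal illustration of the backslash-eating Tom describes (escape string
literals shown, as on 8.4):

select 'abc123' ~ E'\\d+';  -- true: the regex engine sees \d+
select 'abc123' ~ E'\d+';   -- false: the literal eats the backslash, the regex is d+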



[GENERAL] Force ARE in regexp string

2010-09-15 Thread Johannes Öberg
 Hi! I'm trying to do an advanced regexp match but postgres doesn't 
seem to let me.


I've set regex_flavor to ARE, and I've tried prefixing my strings, i.e. 
~* E'***:abc' but for some reason postgres treats all my regexps as BRE's.


Common newbie gotchas? I'm trying it directly from psql.exe btw, running 
postgres 8.4 on a professionally set up Linux machine, and I've also 
tried it locally on a Windows Bitnami machine with the same results.


Thanks
/J



[GENERAL] pg_xlog - files are guaranteed to be sequentialy named?

2007-06-13 Thread Johannes Konert

Hi pgsql-list-members,
I am currently writing a small script that deletes outdated xlog files from
my backup location.
Because I do not want to rely on the creation date, I found it useful to use
the result of

ls | sort -g -r

Thus the newest WAL xlog file is on top and I can delete all files that are
no longer needed at the bottom of the list.


My question: Is it guaranteed in ALL cases that the naming of the
WAL files in $PGDATA/pg_xlog always produces a higher number for a
newer file?

What happens if the 24 hex digits reach the upper bound?

Thank you for your replies on this issue of PostgreSQL's inner workings.
Regards Johannes



Re: [GENERAL] pg_xlog - files are guaranteed to be sequentialy named?

2007-06-13 Thread Johannes Konert

Greg Smith wrote:
He's talking about wiping out the ones on the backup server, so I 
think Johannes means erasing the old archived logs on the secondary 
here.  That can screw up your backup if you do it wrong, but it's not 
an all-caps worthy mistake.
yes, that's what I am talking about related to 
http://www.postgresql.org/docs/8.2/interactive/continuous-archiving.html.

Sorry if that did not come out clearly enough.


On Wed, 13 Jun 2007, Johannes Konert wrote:

Because I do not want to rely on creation-date,


No, you want to rely on creation date, because then this problem goes 
away. 
Truly right... if I can guarantee that the file dates of my archived
WAL files have proper timestamps. If the timestamps ever get messed
up and all have the same timestamp (due to a Windows copy or something
else foolish), then the delete script might delete the wrong files...
The idea you should be working toward is that you identify when your 
last base backup was started after it's copied to the secondary, and 
then you can safely delete any archived logs file on the secondary 
from before that time.  Instead of doing ls | sort -g -r you should 
be doing something like looping over the files in a bash shell script 
and using

[ -ot first xlog in base backup ] to determine which files to delete.

Right; but as I said, then I rely on file dates.
But during the day I came up with a solution: I store the WAL files
with the timestamp of archiving in their file name. Thus I can order
and delete them safely.
Your hint was the one that helped me find that solution - so thanks
for that, Greg, and the others.


Regards,
Johannes




Re: [GENERAL] pg_xlog - files are guaranteed to be sequentialy named?

2007-06-13 Thread Johannes Konert

Johannes Konert wrote:
But during the day I came out with an solution: I store the WAL-files 
with the time-stamp of archiving in their file-name. Thus I can order 
and delete them safely.
Your hint was the one, that helped me to find that solution - so 
thanks for that, Greg.and the others. 
That solution still has a problem: it works fine in the case that the
WAL naming restarts with 0001, because the attached
timestamp in the name would still make it possible to identify the file as
being a newer one than , but there is still the
problem of shifts in time itself.
If someone corrects the server's computer time/date to a date before the
current time (e.g. sets the clock two hours back), then the newer WAL
files will have an older timestamp and will be deleted by accident.


Thus I now increase the number of characters of the filename without bound,
and the last 24 characters are the WAL file name. Thus the archived
filenames ~always~ increase in naming, and all backup files before the
last base backup can be safely identified without relying on computer
timestamps or risking a restart in naming by PostgreSQL.
I hope this solution's only limit is the 255-character restriction on
file-name length, but if that is ever reached I am sure longer names
will be possible :)


Regards Johannes




Re: [GENERAL] pg_xlog - files are guaranteed to be sequentialy named?

2007-06-13 Thread Johannes Konert

Greg Smith wrote:

On Wed, 13 Jun 2007, Johannes Konert wrote:

If someone corrects the servers computer-time/date to a date before 
current time (e.g. set the clock two hours back), then the newer WAL 
files will have an older timestamp and will be deleted by accident.


This should never happen; no one should ever touch the clock by hand 
on a production system.  The primary and backup server should both be 
syncronized via NTP.  If you're thinking about clock changes for 
daylight savings time, those shouldn't have any effect on timestamps, 
which should be stored in UTC.  If you're on Windows,

It's not Windows; it will be Debian Linux.
I completely agree with you that of course our servers synchronize
themselves via NTP with global time, but we already had the case that -
for some reason - NTP did not work and times drifted away from each
other. If you have to manage several servers you might not notice that an
NTP daemon does not work anymore or that a new firewall blocks those
TCP packets now, and time goes by, because everything seems to work
just fine.
Then one nice day you realize that one, two or many of your servers
just have their own time and you need to bring them back to synchronized
time while they are online. If you made your applications aware of
such effects and use system nanotime or global counters where possible,
then even these time corrections can be handled.
But I agree with you: of course normally this will never happen... but it
happened once.


You're working hard to worry about problems that should be eliminated 
by the overall design of your system.  If you can't trust your system 
clocks and that files are being copied with their attributes intact, 
you should consider thinking about how to resolve those problems 
rather than working around them.

yes, but still there is a remaining risk in my opinion.
It's not just PostgreSQL that will suffer from weird, unpredictable 
behavior in a broken environment like that.  Giving a Windows example, 
if you're running in a Windows Domain configuration, if the client 
time drifts too far from the server you can get The system cannot log 
you on due to the following error:  There is a time difference between 
the Client and Server. when trying to login.
If we add a new server to the cluster, the application will check times
as in your Windows example, but if it is already in and working,
then it cannot simply shut down in case of time differences.


Greg, thanks for your sophisticated hints.
But the thread is going a little off-topic now, I guess :)
The issue with the time dependency of WAL archiving and deletion
is solved for me by now, by using a global infinite counter to rely on.
I am sure the next questions will come before long, and I look forward to
reading any hints then, if you and others have time to read them.

Regards Johannes



Re: [GENERAL] pg_xlog - files are guaranteed to be sequentialy named?

2007-06-13 Thread Johannes Konert

Frank Wittig wrote:

24 Hex digits means 24^16 unique file names. Assuming your server saves
a WAL file each second (you should review your config it it does) it
takes (24^16)/(60*60*24*365)=3.84214066×10^14 years to reach the upper
bound.



(..) It has to be 16^24.
But pg does forge filenames other that that. It uses 2 hex digits to
count segments. After 256 segments counting starts over and the serial
is increased by one. The first 8 positions are the time line which I
will ignore for my new calculation.

So there is an eight hex digits serial for each time line which takes
256 segments. So there are 16^8*256 unique file names. If I assume one
WAL file a second this would reach upper bound (for a single time line)
after slightly more than 136 years.

Please correct me if my assumptions are wrong. But I would say one can
rely on serial file names to increase steadily.
  
Thanks for that answer. That was exactly what I could not immediately
find mentioned in the documentation.
If it is guaranteed - and I understood your comments this way - that the
naming follows a sequential order, then I agree with you that this is
enough for a long time.
I was not sure whether or not the naming follows this rule. Of course I
calculated the number of possible filenames before, but as I said, I was
not sure that PostgreSQL follows a guaranteed naming convention of
always-increasing WAL filenames.

Anyway, this is now certain and I will rely on it.
Regards Johannes



Re: [GENERAL] Suppress checking of chmod 700 on data-dir?

2007-06-12 Thread Johannes Konert

Oliver Elphick wrote:

You could run the backup as postgres and pipe the output to another
program owned by the other user and with suid set in its permissions.
The suid means that the receiving program would have access where you
don't want postgres to go.

Thanks Oliver,
that was a good hint. Suid does not work on bash scripts, but with a
restricted entry in /etc/sudoers the backup user can now execute a
copy-and-access script to get the files from within the PGDATA dir.

Regards Johannes



Re: [GENERAL] Suppress checking of chmod 700 on data-dir?

2007-06-11 Thread Johannes Konert

Joshua D. Drake wrote:

Johannes Konert wrote:
But that is not my point. The question is where I can change the 
enforced chmod 700 postgresql always wants me to set.

You can't.

You can however change the postgresql.conf to look for files
somewhere besides $PGDATA and thus you would be able to back them up.
With postgresql.conf I can change the path to the data dir, but
postgresql checks the chmod 700 on that directory as well. So only the
log files can be written and accessed somewhere else. The data itself is
still only accessible by the postgres user.

Anything else in there you should be grabbing via pg_dump anyway.
So you suggest not backing up the filesystem files, but doing a dump
instead? Does this work together with PITR as described in section 23.3
(http://www.postgresql.org/docs/8.2/interactive/continuous-archiving.html#BACKUP-BASE-BACKUP)?
I want to make a full backup of a heavy DB every night while it is
running, combined with short-term WAL archiving. Getting the WAL files
is easy by using the config parameter archive_command, but copying and
backing up the base backup once a day is impossible if I cannot access the
data files.
Will the described backup/restore work as well with a dump + WAL files?

Then I could forget about copying the file-system-files




[GENERAL] Suppress checking of chmod 700 on data-dir?

2007-06-07 Thread Johannes Konert

Hi postgresql-listmembers,
for a backup scenario I need to have access to the pgdata directory as a 
different shell user, but postgresql refuses to start if the directory is 
not chmod 700.


Is there a way to prevent postgres from checking for chmod 700 on the 
data dir at startup (and while running)?


Thanks in advance for your replies; I could not figure this out from the 
documentation.


Regards Johannes
postgresql 8.2.4 on ubuntu dapper
(if this question has come up 100 times before, I apologize for not being 
able to find it)



Re: [GENERAL] Suppress checking of chmod 700 on data-dir?

2007-06-07 Thread Johannes Konert



use sudo in your backup scenario, or run your backup as postgres
  

Thanks for your quick reply.
Unfortunately, running the backup via sudo is not an option due to security 
issues, and using the postgres user is not feasible because other data is 
backed up as well that the postgres user should not have access to.
So your answer means that there is definitely NO way to circumvent the 
chmod 700 check? That is hard to believe. Each and every thing is 
configurable in postgres, but I cannot disable or relax the directory 
permission check? Not even with a compile option or something like that?


Anyway, thanks for your help. I'll keep searching for a solution.
Regards Johannes



Re: [GENERAL] Suppress checking of chmod 700 on data-dir?

2007-06-07 Thread Johannes Konert

Ragnar wrote:
are you planning a filesystem-level backup? 


are you aware that you cannot just back up the postgres data directories
from under a running server and expect the
backup to be usable?

gnari

As far as I understood the documentation of PostgreSQL 8.2.4 
(http://www.postgresql.org/docs/8.2/interactive/continuous-archiving.html, 
section 23.3.2), you can copy the files while postgres is running 
(respecting pg_start_backup and pg_stop_backup).
But that is not my point. The question is where I can change the 
enforced chmod 700 that postgresql always wants me to set.

Regards Johannes :)



Re: [GENERAL] Postgres 8.1.4 sanity_check failed on SuSE 8.2

2006-09-18 Thread Johannes Weberhofer, Weberhofer GmbH
Tom, 


thank you! From CVS I copied the latest version of the sanity check and 
removed those entries that were added after version 8.1.4. Using that patch, I can 
compile postgres without any problems. This version runs nicely on several servers.

Best regards,
Johannes


Johannes Weberhofer, Weberhofer GmbH wrote:
I have tried it several times, but with no success. I didn't have this issue 
with older postgres versions on the same machine/system.


Best regards,
Johannes Weberhofer

Tom Lane wrote:

Johannes Weberhofer, Weberhofer GmbH [EMAIL PROTECTED] writes:

While running the regression tests, there is one that fails:


Repeatably, or did you only see this once?  There's a known timing issue
that explains this, but it's only been seen once or twice that I know of.

2006-08-06 00:35  tgl

* src/test/regress/: expected/sanity_check.out,
sql/sanity_check.sql: Tweak sanity_check regression test to display
more tables (viz, those without indexes) but not to display temp
tables.  It's a bit hard to credit that sanity_check could get
through a database-wide VACUUM while the preceding create_index
test is still trying to clean up its temp tables ... but I see no
other explanation for the current failure report from buildfarm
member sponge.

The report alluded to is
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=sponge&dt=2006-08-05%2021:30:02



regards, tom lane




--


|-
|  weberhofer GmbH   | Johannes Weberhofer
|  information technologies, Austria
|
|  phone : +43 (0)1 5454421 0| email: [EMAIL PROTECTED]
|  fax   : +43 (0)1 5454421 19   | web  : http://weberhofer.at
|  mobile: +43 (0)699 11998315
|---

--- src/test/regress/sql/sanity_check.sql   2003-05-28 18:04:02.0 +0200
+++ src/test/regress/sql/sanity_check.sql   2006-09-18 10:11:01.0 +0200
@@ -4,10 +4,13 @@
 -- sanity check, if we don't have indices the test will take years to
 -- complete.  But skip TOAST relations since they will have varying
 -- names depending on the current OID counter.
+-- complete.  But skip TOAST relations (since they will have varying
+-- names depending on the current OID counter) as well as temp tables
+-- of other backends (to avoid timing-dependent behavior).
 --
 SELECT relname, relhasindex
-   FROM pg_class
-   WHERE relhasindex AND relkind != 't'
+   FROM pg_class c LEFT JOIN pg_namespace n ON n.oid = relnamespace
+   WHERE relkind = 'r' AND (nspname ~ '^pg_temp_') IS NOT TRUE
ORDER BY relname;
 
 --
--- src/test/regress/expected/sanity_check.out  2005-09-08 22:07:42.0 +0200
+++ src/test/regress/expected/sanity_check.out  2006-09-18 11:09:47.0 +0200
@@ -3,72 +3,142 @@
 -- sanity check, if we don't have indices the test will take years to
 -- complete.  But skip TOAST relations since they will have varying
 -- names depending on the current OID counter.
+-- complete.  But skip TOAST relations (since they will have varying
+-- names depending on the current OID counter) as well as temp tables
+-- of other backends (to avoid timing-dependent behavior).
 --
 SELECT relname, relhasindex
-   FROM pg_class
-   WHERE relhasindex AND relkind != 't'
+   FROM pg_class c LEFT JOIN pg_namespace n ON n.oid = relnamespace
+   WHERE relkind = 'r' AND (nspname ~ '^pg_temp_') IS NOT TRUE
ORDER BY relname;
-   relname   | relhasindex 
--+-
- bt_f8_heap  | t
- bt_i4_heap  | t
- bt_name_heap| t
- bt_txt_heap | t
- circle_tbl  | t
- fast_emp4000| t
- func_index_heap | t
- hash_f8_heap| t
- hash_i4_heap| t
- hash_name_heap  | t
- hash_txt_heap   | t
- ihighway| t
- num_exp_add | t
- num_exp_div | t
- num_exp_ln  | t
- num_exp_log10   | t
- num_exp_mul | t
- num_exp_power_10_ln | t
- num_exp_sqrt| t
- num_exp_sub | t
- onek| t
- onek2   | t
- pg_aggregate| t
- pg_am   | t
- pg_amop | t
- pg_amproc   | t
- pg_attrdef  | t
- pg_attribute| t
- pg_auth_members | t
- pg_authid   | t
- pg_autovacuum   | t
- pg_cast | t
- pg_class| t
- pg_constraint   | t
- pg_conversion   | t
- pg_database | t
- pg_depend   | t
- pg_description  | t
- pg_index| t
- pg_inherits | t
- pg_language | t
- pg_largeobject  | t
- pg_namespace| t
- pg_opclass  | t
- pg_operator | t
- pg_pltemplate   | t
- pg_proc | t
- pg_rewrite  | t
- pg_shdepend | t
- pg_statistic| t
- pg_tablespace   | t
- pg_trigger  | t
- pg_type | t
- polygon_tbl | t
- road| t
- shighway| t
- tenk1   | t

[GENERAL] Postgres 8.1.4 sanity_check failed on SuSE 8.2

2006-09-06 Thread Johannes Weberhofer, Weberhofer GmbH

Hello,

While running the regression tests, there is one that fails:

*** ./expected/sanity_check.out Thu Sep  8 20:07:42 2005
--- ./results/sanity_check.out  Tue Sep  5 10:27:53 2006
***
*** 17,22 
--- 17,24 
  circle_tbl  | t
  fast_emp4000| t
  func_index_heap | t
+  gcircle_tbl | t
+  gpolygon_tbl| t
  hash_f8_heap| t
  hash_i4_heap| t
  hash_name_heap  | t

I have found out that someone else had this problem, but I have found no 
solution: http://archives.postgresql.org/pgsql-general/2006-05/msg01003.php

On the build machine I have a clean SuSE 8.2 running; the filesystem is 
an XFS file system. The previous version, Postgres 8.1.3, compiled nicely 
some time ago.

Any ideas how I can fix this?

Best regards,
Johannes Weberhofer

--


|-
|  weberhofer GmbH   | Johannes Weberhofer
|  information technologies, Austria
|
|  phone : +43 (0)1 5454421 0| email: [EMAIL PROTECTED]
|  fax   : +43 (0)1 5454421 19   | web  : http://weberhofer.at
|  mobile: +43 (0)699 11998315
|---

