Re: [HACKERS] Releasing in September

2016-01-21 Thread Marcin Mańk
On Wed, Jan 20, 2016 at 4:40 PM, Bruce Momjian  wrote:

> Many people were happy with our consistently releasing major releases in
> September, e.g. 9.0 to 9.3:

Not sure why the commitfest process should be synchronized with the
release process. What if, when the release date comes, the currently
running commitfest (if there is one) gets put on hold, cleanup work is
done, release gets stamped, then commitfest gets resumed for the next
release?


Re: [HACKERS] [patch] Proposal for \rotate in psql

2015-09-21 Thread Marcin Mańk
On Friday, 18 September 2015, Daniel Verite wrote:

> Pavel Stehule wrote:
>
> > in the help inside your last patch, you are using "crosstab". Can't
> > crosstab be the name for this feature?
>
> If it wasn't taken already by contrib/tablefunc, that would be a first
> choice. But now, when searching for crosstab+postgresql, pages of
> results come out concerning the crosstab() function.
>
How about transpose (or flip)?


Re: [HACKERS] QSoC proposal: Rewrite pg_dump and pg_restore

2014-03-21 Thread Marcin Mańk
On Fri, Mar 21, 2014 at 4:09 AM, Tom Lane t...@sss.pgh.pa.us wrote:

> Craig Ringer cr...@2ndquadrant.com writes:
> > Here's how I think it needs to look:
> > [ move all the functionality to the backend ]
>
> Of course, after you've done all that work, you've got something that is
> of exactly zero use to its supposed principal use-case, pg_dump.  pg_dump
> will still have to support server versions that predate all these fancy
> new dump functions, and that pretty much ensures that most of pg_dump's
> core functionality will still be on the client side.  Or, if you try to
> finesse that problem by making sure the new server APIs correspond to
> easily-identified pieces of pg_dump code, you'll probably end up with APIs
> that nobody else wants to use :-(.


Or you should mandate that new server versions be able to consume
_old_ pg_dump version output. This would change the recommendation from
"when upgrading, dump using the new pg_dump" to "when upgrading, dump
using the old pg_dump".

This would be necessary policy going forward anyway, if most of the pg_dump
functionality was server-side, because it would be generating dumps in the
server-version dump format, not the client-version format.

Regards
Marcin Mańk
(goes back to lurker cave...)


Re: [HACKERS] high-dimensional knn-GIST tests (was Re: Cube extension kNN support)

2013-10-27 Thread Marcin Mańk
On Thu, Oct 24, 2013 at 3:50 AM, Gordon Mohr gojomo-pg...@xavvy.com wrote:

> On 9/22/13 4:38 PM, Stas Kelvich wrote:
>
> > Hello, hackers.
> >
> > Here is the patch that introduces kNN search for cubes with
> > euclidean, taxicab and chebyshev distances.
>
> Thanks for this! I decided to give the patch a try at the bleeding edge
> with some high-dimensional vectors, specifically the 1.4 million
> 1000-dimensional Freebase entity vectors from the Google 'word2vec' project:


I believe the curse of dimensionality is affecting you here. I think it is
impossible to get an improvement over sequential scan for 1000 dimensional
vectors. Read here:

http://en.wikipedia.org/wiki/Curse_of_dimensionality#k-nearest_neighbor_classification
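
The effect can be illustrated with a minimal sketch, assuming uniformly random points in a unit hypercube (the real word2vec vectors are not uniform, so the numbers will differ). It measures how much farther the farthest point is than the nearest one, relative to the nearest. As that ratio shrinks, a spatial index has less and less room to prune, which is why a sequential scan becomes hard to beat. The function name and parameters are made up for the illustration.

```python
import math
import random


def distance_contrast(dim, n_points=200, seed=0):
    """Return (farthest - nearest) / nearest for one random query point.

    Points are drawn uniformly from the unit hypercube; this is an
    assumption for illustration, not a model of the Freebase vectors.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))  # Euclidean distance
    nearest, farthest = min(dists), max(dists)
    return (farthest - nearest) / nearest


# The contrast should drop sharply as the dimension grows.
for dim in (2, 10, 100, 1000):
    print(dim, round(distance_contrast(dim), 3))
```

With a small contrast, almost every bounding box in the index overlaps the search radius, so nearly all pages have to be visited anyway.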

Regards
Marcin Mańk


[HACKERS] faster ts_headline

2012-11-20 Thread Marcin Mańk
Hello,
I've started implementing a system for faster headline generation. WIP
patch is attached.

The idea is to make a new type currently called hltext (different
names welcome), that stores the text along with the lexization result.
It conceptually stores an array of tuples like
(word text, type int, lexemes text[] )
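
As a rough illustration of that layout (not the patch's actual storage format), here is a minimal Python sketch. The toy tokenizer, the stop-word list, and the token type codes are assumptions made up for the example; the point is only that the headline step can match on stored lexemes and reuse the stored original words without parsing the text again.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Illustrative token type codes and stop words; not taken from the patch.
WORD, BLANK = 1, 12
STOP_WORDS = {"the", "a", "and", "of"}


@dataclass
class HlElement:
    word: str                  # original token text, kept verbatim
    type: int                  # token type code
    lexemes: List[str] = field(default_factory=list)  # empty for stop words/blanks


def toy_to_hltext(text: str) -> List[HlElement]:
    """Tokenize and 'lexize' once, up front (a toy stand-in for to_hltext)."""
    elements = []
    for token in re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]+", text):
        if re.match(r"[A-Za-z0-9]", token):
            lexeme = token.lower()
            lexemes = [] if lexeme in STOP_WORDS else [lexeme]
            elements.append(HlElement(token, WORD, lexemes))
        else:
            elements.append(HlElement(token, BLANK))
    return elements


def toy_headline(elements: List[HlElement], query_lexemes: set) -> str:
    """Mark matching words using only the stored lexemes (no re-parsing)."""
    out = []
    for el in elements:
        hit = any(lx in query_lexemes for lx in el.lexemes)
        out.append(f"<b>{el.word}</b>" if hit else el.word)
    return "".join(out)


hl = toy_to_hltext("Michael and Janet Jackson")
print(toy_headline(hl, {"janet", "jackson"}))
```

Storing the blanks as elements too means the original text can be rebuilt exactly by concatenating the `word` fields.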

A console log is also attached - it shows a 5x performance increase. The
problem is not academic: I have such long texts in an app, and making 20
headlines takes 3s+.

The patch lacks documentation, regression tests, and most auxiliary
functions (especially I/O functions).


I have a question about the I/O functions of the new type. What format
to choose?

I could make the input function read something like 'english: the
text', where english is the name of the text search configuration. The
input function would do the lexizing.

I could make it read some custom format, which would contain the
tokens, token types and lexemes. Can I use flex/bison, or is there a
good reason not to, and I should make it a hand-made parser?

Finally, I could make the type actually create type
hltext_element(word text, type int, lexemes text[] ), by manually
filling in the applicable catalogs, and make the user declare columns as
hltext_element[]. Is there a nice way to manipulate objects of such a
type from within the backend? Is there an example? I suppose that in
this case storage would not be as efficient as I made it.

Which one to choose? Other ideas?

Regards
Marcin Mańk
$ psql -p 5454 postgres -c 'create table tmp(t text, hlt hltext);';
CREATE TABLE
$ bash -c 'echo insert into tmp(t) values(\$CUTCUT\$ ; curl -d  http://en.wikipedia.org/wiki/Michael_Jackson; echo \$CUTCUT\$)' | ./bin/psql -p 5454 postgres
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  636k  100  636k    0     0   224k      0  0:00:02  0:00:02 --:--:--  258k
INSERT 0 1
$ psql -p 5454 postgres
psql (9.0.5, server 9.3devel)
WARNING: psql version 9.0, server version 9.3.
 Some psql features might not work.
Type help for help.

postgres=# update tmp set hlt = to_hltext('english', t);
UPDATE 1
postgres=# \timing
Timing is on.
postgres=# select ts_headline('english', t, to_tsquery('janet & jackson'), 'MaxFragments=2 MinWords=5 MaxWords=15') from tmp;
 ts_headline
-------------
 <b>Jackson</b>-Style. (video production; Michael and <b>Janet</b> <b>Jackson</b> video) .  29 . Theatre Crafts International ... Green Day Look Forward To <b>Janet</b> <b>Jackson</b>'s VMA Tribute To Michael .  MTV . September
(1 row)

Time: 414,588 ms
postgres=# select ts_headline('english', t, to_tsquery('janet & jackson'), 'MaxFragments=2 MinWords=5 MaxWords=15') from tmp;
 ts_headline
-------------
 <b>Jackson</b>-Style. (video production; Michael and <b>Janet</b> <b>Jackson</b> video) .  29 . Theatre Crafts International ... Green Day Look Forward To <b>Janet</b> <b>Jackson</b>'s VMA Tribute To Michael .  MTV . September
(1 row)

Time: 75,912 ms
postgres=# select ts_headline('english', hlt, to_tsquery('janet & jackson'), 'MaxFragments=2 MinWords=5 MaxWords=15') from tmp;
 ts_headline
-------------
 <b>Jackson</b>-Style. (video production; Michael and <b>Janet</b> <b>Jackson</b> video) .  29 . Theatre Crafts International ... Green Day Look Forward To <b>Janet</b> <b>Jackson</b>'s VMA Tribute To Michael .  MTV . September
(1 row)

Time: 17,539 ms
postgres=# select ts_headline('english', hlt, to_tsquery('janet & jackson'), 'MaxFragments=2 MinWords=5 MaxWords=15') from tmp;

 ts_headline

Re: [HACKERS] Checksums, state of play

2012-03-06 Thread Marcin Mańk
On Tue, Mar 06, 2012 at 07:09:23PM +, Simon Riggs wrote:
> The problem is actually on/off/crash/on in quick succession which is
> much less likely.

I must be missing something, but how about:
if (!has_checksums && page_loses_checksum_due_to_hint_bit_write)
    wal_log_the_hint_bit_write();

Greetings
Marcin Mańk

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Term positions in GIN fulltext index

2011-11-03 Thread Marcin Mańk
On Thu, Nov 3, 2011 at 4:52 PM, Yoann Moreau
yoann.mor...@univ-avignon.fr wrote:
> I'd need a function like this :
> select term_positions(text, 'get') from docs;
>  id_doc | positions
> --------+-----------
>       1 |     {2,6}
>       2 |       {3}


Check this out:
http://www.postgresql.org/docs/current/static/textsearch-debugging.html
ts_debug does what you want, and more. Look at its source: it's a
plain SQL function, so you can make something based on it.
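
The logic such a function needs can be sketched outside the database. This is only an illustration in Python, assuming naive word splitting and lowercase matching in place of a real text search configuration; it is not how ts_debug is implemented, and the function name is simply borrowed from the request above.

```python
import re


def term_positions(text, term):
    """Return the 1-based word positions at which `term` occurs in `text`.

    Assumes words are runs of word characters and that matching is
    case-insensitive; a real configuration would also stem and drop
    stop words.
    """
    words = re.findall(r"\w+", text.lower())
    wanted = term.lower()
    return [pos for pos, word in enumerate(words, start=1) if word == wanted]


print(term_positions("I get what you get, OK", "get"))
```

A version built on ts_debug would number the tokens the parser emits and compare lexemes rather than raw words.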

Greetings
Marcin Mańk



Re: [HACKERS] Thoughts on SELECT * EXCLUDING (...) FROM ...?

2011-10-31 Thread Marcin Mańk
On Sun, Oct 30, 2011 at 8:50 PM, Eric Ridge eeb...@gmail.com wrote:
> Well, it's a display thing as much as any SELECT statement
> (especially via psql) is a display thing.  It's more like I want
> all 127 columns, except the giant ::xml column, and I'm too lazy to
> type each column name out by hand.


How about an option for psql to truncate overly long columns to X characters?

Greetings
Marcin Mańk



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-23 Thread Marcin Mańk
One idea:
col like 'foo%' could be translated to col >= 'foo' and col <= 'foo' || 'zzz',
where 'z' is the largest possible character. This should be good enough for
calculating stats.

How to find such a character, I do not know.
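
A minimal sketch of the idea, assuming plain code-point ordering (as in Python string comparison or a C-locale collation) where the largest character is simply the highest Unicode code point. Under other collations that assumption does not hold, which is exactly the open question. The function name and the padding length are made up for the illustration.

```python
def like_prefix_to_range(prefix, max_char="\U0010FFFF", pad=3):
    """Translate the pattern prefix% into an inclusive (lower, upper) range.

    The upper bound is the prefix padded with `pad` copies of the
    largest character. It is approximate: a value that continues the
    prefix with more than `pad` largest characters would fall outside,
    hence "good enough for stats" rather than exact.
    """
    lower = prefix
    upper = prefix + max_char * pad
    return lower, upper


lower, upper = like_prefix_to_range("foo")
for value in ("fo", "foo", "foobar", "fop"):
    print(value, lower <= value <= upper)
```

Values that share the prefix sort between the two bounds, while "fo" sorts below the lower bound and "fop" above the upper one.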


[HACKERS] FATAL: ReleaseSavepoint: unexpected state STARTED

2011-08-17 Thread Marcin Mańk
Hello
I tried reporting the following bug via the web form, but it somehow got lost
(it is not in pgsql-bugs archives, it was #6157 I believe). Anyway,
here it is:


 psql -c 'release q; prepare q(int) as select 1'
FATAL:  ReleaseSavepoint: unexpected state STARTED
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
connection to server was lost

The message is from 8.4.2, but the bug is in 9.0.4 too.

Greetings
Marcin Mańk



Re: [HACKERS] synchronized snapshots

2010-01-09 Thread Marcin Mańk



On 2010-01-09 at 20:37, Markus Wanner mar...@bluegap.ch wrote:

> Hi
>
> Joachim Wieland wrote:
> > Since nobody objected to the idea in general, I have implemented it.

How cool would it be if we could synchronize snapshots between the
master and the (SR) standby?

Connection poolers could use that to send read-only queries to the
standby, and when the first DML/DDL statement in a transaction comes
up, they could switch to the master.

If it is hard to tell from the statement whether it writes anything, the
pooler could catch the error and retry on the master.
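
The routing rule can be sketched as follows. This is a toy model in Python, not any real pooler's API: the two backends are plain callables, `ReadOnlyError` is a hypothetical stand-in for the error a standby raises on a write, and the keyword check is a deliberately naive guess at what counts as a write. Switching mid-transaction is only safe if the master can adopt the same snapshot, which is the feature being wished for here.

```python
WRITE_KEYWORDS = ("insert", "update", "delete", "create", "alter", "drop", "truncate")


class ReadOnlyError(Exception):
    """Hypothetical error raised by a standby when asked to write."""


def looks_like_write(sql):
    # Naive check on the leading keyword only.
    return sql.lstrip().lower().startswith(WRITE_KEYWORDS)


class ToyPooler:
    def __init__(self, standby, master):
        self.standby = standby
        self.master = master
        self.on_master = False

    def begin(self):
        # Every transaction starts out on the standby.
        self.on_master = False

    def execute(self, sql):
        if not self.on_master and looks_like_write(sql):
            self.on_master = True          # first DML/DDL: switch to master
        if not self.on_master:
            try:
                return self.standby(sql)
            except ReadOnlyError:
                self.on_master = True      # hidden write: retry on master
        return self.master(sql)


# Fake backends for the demonstration.
def fake_standby(sql):
    if looks_like_write(sql) or "nextval" in sql.lower():
        raise ReadOnlyError(sql)
    return ("standby", sql)


def fake_master(sql):
    return ("master", sql)


pooler = ToyPooler(fake_standby, fake_master)
pooler.begin()
for stmt in ("select 1", "select nextval('s')", "select 2"):
    print(pooler.execute(stmt)[0], "<-", stmt)
```

Once a transaction has moved to the master it stays there until the next `begin()`, so later reads see the transaction's own writes.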


Regards
Marcin Mańk 