Re: [HACKERS] Load Distributed Checkpoints test results

2007-06-14 Thread ITAGAKI Takahiro

Heikki Linnakangas [EMAIL PROTECTED] wrote:

 Here's results from a batch of test runs with LDC. This patch only 
 spreads out the writes, fsyncs work as before.

I saw similar results in my tests. Spreading out only the writes is enough
for OLTP, at least on Linux with a middle- or high-grade storage system.
It also works well on a desktop-grade Windows machine.

However, I don't know how it works on other OSes, such as Solaris
and FreeBSD, that have different I/O policies. Would anyone test it
in those environments?

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Sorted writes in checkpoint

2007-06-14 Thread ITAGAKI Takahiro
Greg Smith [EMAIL PROTECTED] wrote:

 On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote:
  If the kernel can treat sequential writes better than random writes, is 
  it worth sorting dirty buffers in block order per file at the start of 
  checkpoints?

I wrote and tested the attached sorted-writes patch, based on Heikki's
ldc-justwrites-1.patch. There was an obvious performance win on an OLTP workload.

          tests            | pgbench | DBT-2 response time (avg/90%/max)
---------------------------+---------+-----------------------------------
 LDC only                  | 181 tps | 1.12 / 4.38 / 12.13 s
 + BM_CHECKPOINT_NEEDED(*) | 187 tps | 0.83 / 2.68 /  9.26 s
 + Sorted writes           | 224 tps | 0.36 / 0.80 /  8.11 s

(*) Don't write buffers that were dirtied after starting the checkpoint.

machine : 2GB-ram, SCSI*4 RAID-5
pgbench : -s400 -t4 -c10  (about 5GB of database)
DBT-2   : 60WH (about 6GB of database)
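The sorting step itself is simple. A minimal sketch of the idea follows; the two-field tag and the function names are illustrative stand-ins, not the patch's actual data structures:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified identity of a dirty buffer: which file it
 * belongs to and which block within that file. */
typedef struct CkptBufferTag
{
    unsigned    relfile;        /* stand-in for the relation file id */
    unsigned    blocknum;       /* block number within that file */
} CkptBufferTag;

/* qsort comparator: order by file first, then by block number, so the
 * checkpoint writes each file's dirty blocks in sequential order. */
static int
ckpt_tag_cmp(const void *a, const void *b)
{
    const CkptBufferTag *ta = (const CkptBufferTag *) a;
    const CkptBufferTag *tb = (const CkptBufferTag *) b;

    if (ta->relfile != tb->relfile)
        return (ta->relfile < tb->relfile) ? -1 : 1;
    if (ta->blocknum != tb->blocknum)
        return (ta->blocknum < tb->blocknum) ? -1 : 1;
    return 0;
}

/* Sort the list of dirty buffers gathered at checkpoint start, before
 * any of them are written out. */
static void
sort_checkpoint_buffers(CkptBufferTag *tags, size_t n)
{
    qsort(tags, n, sizeof(CkptBufferTag), ckpt_tag_cmp);
}
```

The kernel's elevator then sees per-file, mostly-ascending block writes instead of a random pattern.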


 I think it has the potential to improve things.  There are three obvious 
 and one subtle argument against it I can think of:
 
 1) Extra complexity for something that may not help.  This would need some 
 good, robust benchmarking improvements to justify its use.

Exactly. I think we need a discussion board for I/O performance issues.
Can I use the Developers Wiki for this purpose? Performance graphs and
result tables are important for the discussion, so it might work better
than the text-based mailing lists.


 2) Block number ordering may not reflect actual order on disk.  While 
 true, it's got to be better correlated with it than writing at random.
 3) The OS disk elevator should be dealing with this issue, particularly 
 because it may really know the actual disk ordering.

Yes, both are true. However, I think there is a pretty high correlation
between those orderings. In addition, we could use the filesystem to ensure
that those orderings correspond to each other; for example, pre-allocation
of files might help us, as has often been discussed.


 Here's the subtle thing:  by writing in the same order the LRU scan occurs 
 in, you are writing dirty buffers in the optimal fashion to eliminate 
 client backend writes during BufferAlloc.  This makes the checkpoint a 
 really effective LRU clearing mechanism.  Writing in block order will 
 change that.

The issue will probably go away after we have LDC, because it writes LRU
buffers during checkpoints.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



sorted-ckpt.patch
Description: Binary data

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] DROP TABLE and autovacuum

2007-06-14 Thread ITAGAKI Takahiro
Alvaro Herrera [EMAIL PROTECTED] wrote:

 ITAGAKI Takahiro wrote:
  autovacuum killer triggered in CREATE/DROP/RENAME DATABASE commands.
  Can we extend the feature to several TABLE commands?
 
 Well, one problem with this is that currently SIGINT cancels the whole
 autovacuum worker, not just the table currently being processed.  I
 think this can be fixed easily by improving the signal handling.

There is presently no difference between SIGINT and SIGTERM for autovacuum
workers. I'm thinking of splitting their effects: SIGINT to
'skip the current table' and SIGTERM to 'cancel all tables'.

BTW, if autovacuum workers are signaled by an internal server activity,
we will see 'ERROR: canceling statement due to user request' in server log.
Is it surprising to users? I prefer quiet shutdown to ERROR logs.


 Aside from that, I don't see any problem in handling DROP TABLE like you
 suggest.  But I don't feel comfortable with doing it with just any
 strong locker, because that would easily starve tables from being
 vacuumed at all.

Hmm, how about canceling only in the cases of DROP TABLE, TRUNCATE and CLUSTER?
We will obviously not need the table after those commands. Other commands,
namely VACUUM (FULL), ANALYZE, CREATE INDEX (CONCURRENTLY), REINDEX and LOCK
TABLE, still conflict with autovacuum, but I'll leave them as-is in the meantime.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] EXPLAIN omits schema?

2007-06-14 Thread Gregory Stark
Tom Lane [EMAIL PROTECTED] writes:

 Attached is a small patch which adds this conditionally on a guc that
 pg_admin or other GUI tools could set, leaving it unchanged for users.

 That makes things *worse* not better, since now tools would have to
 deal with both possibilities.

I was thinking tools would set the GUC before issuing an EXPLAIN they planned
to parse.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Gregory Stark

Tom Lane [EMAIL PROTECTED] writes:

 You could remove the immediate source of this objection if you could
 redesign the APIs for the underlying support functions to be more
 type-safe.  I'm not sure how feasible or useful that would be though.
 The bottom-line question here is whether developing a new parser or
 dictionary implementation is really something that ordinary users might
 do.  If not, then having all this SQL-level support for setting up
 catalog entries seems like wasted effort.

Well, assuming we have any SQL-level support at all, I think we should strive
to avoid having these functions take INTERNAL arguments.

I feel like having them in the GIST interface has been a major impediment to
more people defining GIST indexes for more datatypes. Because you need to
write C code dealing with internal data structures to handle page splits, the
bar for implementing GIST index operator classes is too high for most users. So
instead of a simple SQL command, we end up with contrib modules implementing
each type of GIST index.

A while back I proposed that we provide, as a default implementation, the same
page-split algorithm that most (or all?) of those contrib modules copy-paste
between them. That would allow defining a GIST index in terms of a handful
of operators, like distance, which could be defined with a type-safe API. This
would be less flexible than the existing generic solution, but it would allow
defining new GIST indexes without writing C code.

 But they still need some more thought about permissions, because AFAICS
 mucking with a configuration can invalidate some other user's data.

Ouch. Could mucking with a configuration create a corrupt index?

This sounds somewhat analogous to the issues collation brings up.

 It seems to me that the single easiest and most useful part of a
 configuration to change is the stop word list; but this setup guarantees
 that no one but a DBA can do that, and what's more that pg_dump won't record
 your changes.

I would second that; in the past I was expected to provide an administrative
web interface for adjusting the list of stop words.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Controlling Load Distributed Checkpoints

2007-06-14 Thread Gregory Stark
PFC [EMAIL PROTECTED] writes:

 Anyway, seq-scan on InnoDB is very slow because, as the btree grows (just
 like postgres indexes) pages are split and scanning the pages in btree order
 becomes a mess of seeks. So, seq scan in InnoDB is very very slow unless
 periodic OPTIMIZE TABLE is applied. (caveat to the postgres TODO item
 implement automatic table clustering...)

Heikki already posted a patch that goes a long way towards implementing what
I think this refers to: trying to maintain the cluster ordering on
updates and inserts.

It does this without changing the basic table structure at all. On updates and
inserts it consults the indexam of the clustered index to ask it for a
suggested block. If the index's suggested block has enough free space, the
tuple is put there.
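The placement rule described above is a simple conditional; a hedged sketch with illustrative names (the real patch works with buffer pages and the free space map, not bare integers):

```c
#include <assert.h>

/* Hypothetical sketch of the placement decision: prefer the block the
 * clustered index suggests, but only if the tuple actually fits there;
 * otherwise fall back to normal free-space-map placement. */
static unsigned
choose_insert_block(unsigned suggested_block,
                    unsigned free_space_on_suggested,
                    unsigned tuple_size,
                    unsigned fallback_block)
{
    /* Use the index's suggestion when the tuple fits... */
    if (free_space_on_suggested >= tuple_size)
        return suggested_block;
    /* ...otherwise place the tuple wherever the FSM would have. */
    return fallback_block;
}
```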

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Sorted writes in checkpoint

2007-06-14 Thread Gregory Stark

ITAGAKI Takahiro [EMAIL PROTECTED] writes:

 Exactly. I think we need a discussion board for I/O performance issues.
 Can I use Developers Wiki for this purpose?  Since performance graphs and
 result tables are important for the discussion, so it might be better
 than mailing lists, that are text-based.

I would suggest keeping the discussion on mail and including links to refer to
charts and tables in the wiki.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] ecpg regression broken on mingw

2007-06-14 Thread Andrew Dunstan


The ECPG regression tests appear to be broken on MinGW (see buildfarm 
trout and vaquita). Please tell me that this isn't a line ending problem 
like it appears on cursory inspection.


cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] ecpg regression broken on mingw

2007-06-14 Thread Magnus Hagander
On Thu, Jun 14, 2007 at 07:47:58AM -0400, Andrew Dunstan wrote:
 
 The ECPG regression tests appear to be broken on MinGW (see buildfarm 
 trout and vaquita). Please tell me that this isn't a line ending problem 
 like it appears on cursory inspection.

It certainly looks like that, but yak is also a mingw machine, no? And it
shows green (and yes, it has pulled the changes in the regression tests
already)

//Magnus

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] ecpg regression broken on mingw

2007-06-14 Thread Andrew Dunstan



Magnus Hagander wrote:

On Thu, Jun 14, 2007 at 07:47:58AM -0400, Andrew Dunstan wrote:
  
The ECPG regression tests appear to be broken on MinGW (see buildfarm 
trout and vaquita). Please tell me that this isn't a line ending problem 
like it appears on cursory inspection.



It certainly looks like that, but yak is also a mingw machine, no? And it
shows green (and yes, it has pulled the changes in the regression tests
already)


  


Maybe it uses a different CVS or something. Who knows? Aren't our diffs 
supposed to be done ignoring whitespace?


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] Fractions in GUC variables

2007-06-14 Thread Heikki Linnakangas

We have these GUC variables that define a fraction of something:

#autovacuum_vacuum_scale_factor = 0.2   # fraction of rel size before
# vacuum
#autovacuum_analyze_scale_factor = 0.1  # fraction of rel size before
# analyze

#bgwriter_lru_percent = 1.0 # 0-100% of LRU buffers scanned/round
#bgwriter_all_percent = 0.333   # 0-100% of all buffers scanned/round

Autovacuum settings use fractions, and bgwriter settings use a 
percentage. Fortunately these settings are not related so there's not 
too much potential for confusion, but it seems we should have a common 
way to define settings like that.


A nice way would be for the base unit to be a fraction, like in the 
autovacuum settings, but you could add a %-sign to give it as a percentage, 
just like you can use KB/MB etc. I'm not sure if we can do anything for 
those without breaking backwards-compatibility, though.
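A sketch of what such a parser might look like; the function is hypothetical and not part of the actual GUC machinery, it only illustrates normalizing an optional %-suffix to a fraction:

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical parser: accept either a bare fraction ("0.2") or a
 * percentage ("20%"), and return the value normalized to a fraction.
 * Returns -1.0 on a parse failure. */
static double
parse_fraction_guc(const char *value)
{
    char       *end;
    double      v = strtod(value, &end);

    if (end == value)
        return -1.0;            /* no number at all */
    while (isspace((unsigned char) *end))
        end++;
    if (*end == '%')
    {
        v /= 100.0;             /* "20%" and "0.2" mean the same thing */
        end++;
    }
    if (*end != '\0')
        return -1.0;            /* trailing junk */
    return v;
}
```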


Any ideas? The load distributed checkpoints patch adds one more GUC 
variable like this. I'm inclined to follow the example of the bgwriter 
settings because it's more closely related to them, though I like the 
autovacuum style more.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] ecpg regression broken on mingw

2007-06-14 Thread Joachim Wieland


On Thu, Jun 14, 2007 at 08:17:05AM -0400, Andrew Dunstan wrote:
 Maybe it uses a different CVS or something. Who knows? Aren't our diffs 
 supposed to be done ignoring whitespace?

pg_init() in pg_regress_ecpg.c has:

/* no reason to set -w for ecpg checks, except for when on windows */
if (strstr(host_platform, "-win32"))
    basic_diff_opts = "-w";
else
    basic_diff_opts = "";

What value does host_platform have on MinGW?


Joachim


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] ecpg regression broken on mingw

2007-06-14 Thread Magnus Hagander
On Thu, Jun 14, 2007 at 02:28:26PM +0200, Joachim Wieland wrote:
 
 
 On Thu, Jun 14, 2007 at 08:17:05AM -0400, Andrew Dunstan wrote:
  Maybe it uses a different CVS or something. Who knows? Aren't our diffs 
  supposed to be done ignoring whitespace?
 
 pg_init() in pg_regress_ecpg.c has:
 
 /* no reason to set -w for ecpg checks, except for when on windows */
 if (strstr(host_platform, "-win32"))
     basic_diff_opts = "-w";
 else
     basic_diff_opts = "";
 
 What value does host_platform have on MinGW?

That could certainly be it - on the failing machine, it's:
i686-pc-mingw32

Dave, any chance you can change that check to look for both -win32 and
-mingw32 and give it a test run on vaquita?

(I'll see if I have a vmware around that can run the build meanwhile, but
if it's easy for you to do..)
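For illustration, the suggested change might look like this; the helper function is hypothetical (the real code assigns basic_diff_opts inline in pg_init()), but the two substring checks are the ones being proposed:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: treat both "-win32" and "-mingw32" host_platform
 * strings as Windows, so ecpg diffs ignore whitespace there. */
static const char *
choose_diff_opts(const char *host_platform)
{
    if (strstr(host_platform, "-win32") != NULL ||
        strstr(host_platform, "-mingw32") != NULL)
        return "-w";            /* ignore whitespace (line endings) */
    return "";                  /* strict diff everywhere else */
}
```

With this, the failing machine's "i686-pc-mingw32" platform string is matched by the second check.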

//Magnus


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] ecpg regression broken on mingw

2007-06-14 Thread Magnus Hagander
On Thu, Jun 14, 2007 at 02:35:29PM +0200, Magnus Hagander wrote:
 On Thu, Jun 14, 2007 at 02:28:26PM +0200, Joachim Wieland wrote:
  
  
  On Thu, Jun 14, 2007 at 08:17:05AM -0400, Andrew Dunstan wrote:
   Maybe it uses a different CVS or something. Who knows? Aren't our diffs 
   supposed to be done ignoring whitespace?
  
  pg_init() in pg_regress_ecpg.c has:
  
  /* no reason to set -w for ecpg checks, except for when on windows */
  if (strstr(host_platform, "-win32"))
      basic_diff_opts = "-w";
  else
      basic_diff_opts = "";
  
  What value does host_platform have on MinGW?
 
 That could certainly be it - on the failing machine, it's:
 i686-pc-mingw32
 
 Dave, any chance you can change that check to look for both -win32 and
 -mingw32 and give it a test run on vaquita?
 
 (I'll see if I have a vmware around that can run the build meanwhile, but
 if it's easy for you to do..)

This seems to fix most of the problems - I still see 3 tests failing. I'll
commit this change for now.

The interesting thing is that the failed ones seem to *still* be because of
line endings...

//Magnus


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] ecpg regression broken on mingw

2007-06-14 Thread Magnus Hagander
On Thu, Jun 14, 2007 at 03:08:00PM +0200, Magnus Hagander wrote:
 On Thu, Jun 14, 2007 at 02:35:29PM +0200, Magnus Hagander wrote:
  On Thu, Jun 14, 2007 at 02:28:26PM +0200, Joachim Wieland wrote:
   
   
   On Thu, Jun 14, 2007 at 08:17:05AM -0400, Andrew Dunstan wrote:
Maybe it uses a different CVS or something. Who knows? Aren't our diffs 
supposed to be done ignoring whitespace?
   
   pg_init() in pg_regress_ecpg.c has:
   
   /* no reason to set -w for ecpg checks, except for when on windows */
   if (strstr(host_platform, "-win32"))
       basic_diff_opts = "-w";
   else
       basic_diff_opts = "";
   
   What value does host_platform have on MinGW?
  
  That could certainly be it - on the failing machine, it's:
  i686-pc-mingw32
  
  Dave, any chance you can change that check to look for both -win32 and
  -mingw32 and give it a test run on vaquita?
  
  (I'll see if I have a vmware around that can run the build meanwhile, but
  if it's easy for you to do..)
 
 This seems to fix most of the problems - I still see 3 tests failing. I'll
 commit this change for now.
 
 The interesting thing is that the failed ones seem to *still* be because of
 line endings...

Yeah yeah, spammer here, 'eh ;-)

It's not a line-ending problem anymore; it's the different float formats on
win32. I looked at the wrong diff file.

I think what's left now is just that mingw needs a different output file
with a different float format in it, just like for the other regression
tests. I'll try to make that happen.

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Sorted writes in checkpoint

2007-06-14 Thread Heikki Linnakangas

ITAGAKI Takahiro wrote:

Greg Smith [EMAIL PROTECTED] wrote:

On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote:
If the kernel can treat sequential writes better than random writes, is 
it worth sorting dirty buffers in block order per file at the start of 
checkpoints?


I wrote and tested the attached sorted-writes patch, based on Heikki's
ldc-justwrites-1.patch. There was an obvious performance win on an OLTP workload.

          tests            | pgbench | DBT-2 response time (avg/90%/max)
---------------------------+---------+-----------------------------------
 LDC only                  | 181 tps | 1.12 / 4.38 / 12.13 s
 + BM_CHECKPOINT_NEEDED(*) | 187 tps | 0.83 / 2.68 /  9.26 s
 + Sorted writes           | 224 tps | 0.36 / 0.80 /  8.11 s

(*) Don't write buffers that were dirtied after starting the checkpoint.

machine : 2GB-ram, SCSI*4 RAID-5
pgbench : -s400 -t4 -c10  (about 5GB of database)
DBT-2   : 60WH (about 6GB of database)


Wow, I didn't expect that much gain from the sorted writes. How was LDC 
configured?


3) The OS disk elevator should be dealing with this issue, particularly 
because it may really know the actual disk ordering.


Yeah, but we don't give the OS that much chance to coalesce writes when 
we spread them out.


Here's the subtle thing:  by writing in the same order the LRU scan occurs 
in, you are writing dirty buffers in the optimal fashion to eliminate 
client backend writes during BufferAlloc.  This makes the checkpoint a 
really effective LRU clearing mechanism.  Writing in block order will 
change that.


The issue will probably go away after we have LDC, because it writes LRU
buffers during checkpoints.


I think so too.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] DROP TABLE and autovacuum

2007-06-14 Thread Alvaro Herrera
ITAGAKI Takahiro wrote:
 Alvaro Herrera [EMAIL PROTECTED] wrote:
 
  ITAGAKI Takahiro wrote:
   autovacuum killer triggered in CREATE/DROP/RENAME DATABASE commands.
   Can we extend the feature to several TABLE commands?
  
  Well, one problem with this is that currently SIGINT cancels the whole
  autovacuum worker, not just the table currently being processed.  I
  think this can be fixed easily by improving the signal handling.
 
 There is no difference between SIGINT and SIGTERM against autovacuum
 workers presently. I'm thinking to split their effects -- SIGINT to 
 'skip the current table' and SIGTERM to 'cancel all tables'.

Sure, we can do that (it's even mentioned in the comments in autovacuum.c).

 BTW, if autovacuum workers are signaled by an internal server activity,
 we will see 'ERROR: canceling statement due to user request' in server log.
 Is it surprising to users? I prefer quiet shutdown to ERROR logs.

Maybe cancelling the current table's processing should be just a WARNING,
not an ERROR.  We would abort the transaction quietly.


  Aside from that, I don't see any problem in handling DROP TABLE like you
  suggest.  But I don't feel comfortable with doing it with just any
  strong locker, because that would easily starve tables from being
  vacuumed at all.
 
 Hmm, how about canceling only in the cases of DROP TABLE, TRUNCATE and CLUSTER?
 We will obviously not need the table after those commands. Other commands,
 namely VACUUM (FULL), ANALYZE, CREATE INDEX (CONCURRENTLY), REINDEX and LOCK
 TABLE, still conflict with autovacuum, but I'll leave them as-is in the meantime.

Well, all of DROP TABLE, TRUNCATE and CLUSTER seem safe -- and also,
they will advance the table's relfrozenxid.  No objection there.

I think all the others you mention should be waiting on autovacuum, not
cancel it.  Maybe what we could do with VACUUM and ANALYZE is let the
user know that the table is being processed by autovacuum and return
quickly.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Can autovac try to lock multiple tables at once?

2007-06-14 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Sure, it can do that.  I think it's easy enough to correct this problem;
  see attached patch.  Should this be backpatched?  Earlier releases also
  fall foul of this problem AFAICT.
 
 Yeah, because what made me think about it was a gripe from an 8.2
 user ... maybe this wasn't his problem, but it could be.

Backpatched all the way back to 8.1.

-- 
Alvaro Herrera  Developer, http://www.PostgreSQL.org/
And God said: "Let there be Satan, so people don't blame everything on me.
And let there be lawyers, so people don't blame everything on Satan."

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] DROP TABLE and autovacuum

2007-06-14 Thread Alvaro Herrera
Alvaro Herrera wrote:
 ITAGAKI Takahiro wrote:

  Hmm, how about canceling only in the cases of DROP TABLE, TRUNCATE and CLUSTER?
  We will obviously not need the table after those commands. Other commands,
  namely VACUUM (FULL), ANALYZE, CREATE INDEX (CONCURRENTLY), REINDEX and LOCK
  TABLE, still conflict with autovacuum, but I'll leave them as-is in the meantime.
 
 Well, all of DROP TABLE, TRUNCATE and CLUSTER seem safe -- and also,
 they will advance the table's relfrozenxid.  No objection there.

Something worth considering, though unrelated to the topic at hand: what
happens to the table stats after CLUSTER?  Should we trigger an ANALYZE
afterwards?  We could end up running with outdated statistics.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Sorted writes in checkpoint

2007-06-14 Thread Greg Smith

On Thu, 14 Jun 2007, ITAGAKI Takahiro wrote:

I think we need a discussion board for I/O performance issues. Can I use 
the Developers Wiki for this purpose?  Performance graphs and result 
tables are important for the discussion, so it might work better than 
the text-based mailing lists.


I started pushing some of my stuff over there recently to make it 
easier to edit, and so other people can expand on it with their expertise.
http://developer.postgresql.org/index.php/Buffer_Cache%2C_Checkpoints%2C_and_the_BGW 
is what I've done so far on this particular topic.


What I would like to see on the Wiki first are pages devoted to how to run 
the common benchmarks people use for useful performance testing.  A recent 
thread on one of the lists reminded me how easy it is to get worthless 
results out of DBT2 if you don't have any guidance on that.  I've already 
got a stack of documentation about how to wrestle with pgbench and am 
generating more.


The problem with using the Wiki as the main focus is that when you get to 
the point that you want to upload detailed test results, that interface 
really isn't appropriate for it.  For example, in the last day I've 
collected up data from about 400 short tests runs that generated 800 
graphs.  It's all organized as HTML so you can drill down into the 
specific tests that executed oddly.  Heikki's DBT2 results are similar; not 
as many files, because he's running longer tests, but the navigation is 
even more complicated.


There is no way to easily put that type and level of information into the 
Wiki page.  You really just need a web server to copy the results onto. 
Then the main problem you have to be concerned about is a repeat of the 
OSDL situation, where all the results just disappear if their hosting 
sponsor goes away.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Tom Lane
Gregory Stark [EMAIL PROTECTED] writes:
 Well assuming we have any SQL-level support at all I think we should
 strive to avoid these functions taking INTERNAL arguments.

I don't think I want to get into redesigning the patch at that level of
detail, at least not for 8.3.  It seems like something possibly worth
thinking about for 8.4 though.  The idea that we might want to change
the API for parser and dictionary support routines seems like another
good argument for not exposing user-level facilities for creating them
right now.

What I'm realizing as I look at it is that this is an enormous patch,
and it's not as close to being ready to apply as I had supposed.  If we
don't scale it back, then either it doesn't get into 8.3 or 8.3 gets
delayed a whole lot longer.  So we need to look at what we can trim or
postpone for a later release.

So all these factors seem to me to point in the same direction: at least
for the time being, we should treat TS parsers and dictionaries the way
we treat index access methods.  There'll be a catalog, which the
adventurous can insert new entries into, but no SQL-level support for
doing it, hence no pg_dump support.  And we reserve the right to whack
around the API for the functions referenced by the catalog entries.

That still leaves us with the question of SQL-level support for TS
configurations, which are built on top of parsers and dictionaries.
We definitely need some level of capability for that.  For the
permissions and dependencies issues, the minimalistic approach is to
say only superusers can create or alter TS configurations, and if you
alter one it's your responsibility to fix up any dependent tsvector
columns or indexes.  We currently handle index operator classes that
way, so it's not completely ridiculous.  Sure it would be nice to do
better, but maybe that's a post-8.3 project.

That gets us down to just needing to worry about whether we like the
SQL representation of configurations.  Which is still a nontrivial
issue, but at least it seems manageable on a timescale that's
reasonable for 8.3.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Joshua D. Drake

Tom Lane wrote:

Gregory Stark [EMAIL PROTECTED] writes:

Well assuming we have any SQL-level support at all I think we should
strive to avoid these functions taking INTERNAL arguments.



That gets us down to just needing to worry about whether we like the
SQL representation of configurations.  Which is still a nontrivial
issue, but at least it seems manageable on a timescale that's
reasonable for 8.3.


O.k. I am not trying to throw any cold water on this, but with the 
limitations we are suggesting, does the patch gain us anything over just 
leaving tsearch in contrib?


Sincerely,

Joshua D. Drake




regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate




--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 O.k. I am not trying to throw any cold water on this, but with the 
 limitations we are suggesting, does the patch gain us anything over just 
 leaving tsearch in contrib?

Well, if you want to take a hard-nosed approach, no form of the patch
would gain us anything over leaving it in contrib, at least not from a
functionality standpoint.  The argument in favor has always been about
perception, really: if it's a core feature not an add-on, then
people will take it more seriously.  And there's a rather weak
ease-of-use argument that you don't have to install a contrib module.
(The idea that it's targeted at people who can't or won't install a
contrib module is another reason why I think we can skip user-defined
parsers and dictionaries ...)

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Bruce Momjian
Joshua D. Drake wrote:
 Tom Lane wrote:
  Gregory Stark [EMAIL PROTECTED] writes:
  Well assuming we have any SQL-level support at all I think we should
  strive to avoid these functions taking INTERNAL arguments.
 
  That gets us down to just needing to worry about whether we like the
  SQL representation of configurations.  Which is still a nontrivial
  issue, but at least it seems manageable on a timescale that's
  reasonable for 8.3.
 
 O.k. I am not trying to throw any cold water on this, but with the 
 limitations we are suggesting, does the patch gain us anything over just 
 leaving tsearch in contrib?

The idea is that common operations like searching and mapping dictionaries
will be easier to do, while the more complex stuff will require catalog
manipulations.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Joshua D. Drake

Tom Lane wrote:

Joshua D. Drake [EMAIL PROTECTED] writes:
O.k. I am not trying to throw any cold water on this, but with the 
limitations we are suggesting, does the patch gain us anything over just 
leaving tsearch in contrib?


Well, if you want to take a hard-nosed approach, no form of the patch
would gain us anything over leaving it in contrib, at least not from a
functionality standpoint.  The argument in favor has always been about
perception, really: if it's a core feature not an add-on, then
people will take it more seriously.  And there's a rather weak
ease-of-use argument that you don't have to install a contrib module.
(The idea that it's targeted at people who can't or won't install a
contrib module is another reason why I think we can skip user-defined
parsers and dictionaries ...)


Well my argument has always been the core feature argument. Perhaps I 
am missing some info here, but when I read what you wrote, I read that 
Tsearch will now be harder to work with. Not easier. :(


Removal of pg_dump support kind of hurts us, as we already have problems 
with pg_dump support and tsearch2. Adding work to have to re-assign 
permissions to vector columns because we make changes...


I would grant that having the SQL extensions would certainly be nice.

Anyway, I am not trying to stop the progress. I would like to see 
Tsearch2 in core but I also don't want to add complexity. You did say here:


And we reserve the right to whack around the API for the functions 
referenced by the catalog entries.


Which kind of gets us back to upgrade problems doesn't it?


Sincerely,

Joshua D. Drake




regards, tom lane




--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Tom Lane
Joshua D. Drake [EMAIL PROTECTED] writes:
 Well my argument has always been the core feature argument. Perhaps I 
 am missing some info here, but when I read what you wrote, I read that 
 Tsearch will now be harder to work with. Not easier. :(

Then you misread it.  What I was proposing was essentially that there
won't be any need for pg_dump support because everything's built-in
(at least as far as parsers/dictionaries go).

As for the permissions issues, that's just formalizing something that's
true today with the contrib module: if you change a configuration, it's
*your* problem whether that invalidates any table entries, the system
won't take care of it for you.

regards, tom lane



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Joshua D. Drake

Tom Lane wrote:

Joshua D. Drake [EMAIL PROTECTED] writes:
Well my argument has always been the core feature argument. Perhaps I 
am missing some info here, but when I read what you wrote, I read that 
Tsearch will now be harder to work with. Not easier. :(


Then you misread it.  What I was proposing was essentially that there
won't be any need for pg_dump support because everything's built-in
(at least as far as parsers/dictionaries go).

As for the permissions issues, that's just formalizing something that's
true today with the contrib module: if you change a configuration, it's
*your* problem whether that invalidates any table entries, the system
won't take care of it for you.


O.k. :)

Joshua D. Drake




regards, tom lane




--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] Sorted writes in checkpoint

2007-06-14 Thread Simon Riggs
On Thu, 2007-06-14 at 16:39 +0900, ITAGAKI Takahiro wrote:
 Greg Smith [EMAIL PROTECTED] wrote:
 
  On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote:
   If the kernel can treat sequential writes better than random writes, is 
   it worth sorting dirty buffers in block order per file at the start of 
   checkpoints?
 
 I wrote and tested the attached sorted-writes patch based on Heikki's
 ldc-justwrites-1.patch. There was an obvious performance win on the OLTP workload.
 
   tests| pgbench | DBT-2 response time (avg/90%/max)
 ---+-+---
  LDC only  | 181 tps | 1.12 / 4.38 / 12.13 s
  + BM_CHECKPOINT_NEEDED(*) | 187 tps | 0.83 / 2.68 /  9.26 s
  + Sorted writes   | 224 tps | 0.36 / 0.80 /  8.11 s
 
 (*) Don't write buffers that were dirtied after starting the checkpoint.
 
 machine : 2GB-ram, SCSI*4 RAID-5
 pgbench : -s400 -t4 -c10  (about 5GB of database)
 DBT-2   : 60WH (about 6GB of database)

I'm very surprised by the BM_CHECKPOINT_NEEDED results. What percentage
of writes has been saved by doing that? We would expect a small
percentage of blocks only and so that shouldn't make a significant
difference. I thought we discussed this before, about a year ago. It
would be easy to get that wrong and to avoid writing a block that had
been re-dirtied after the start of checkpoint, but was already dirty
beforehand. How long was the write phase of the checkpoint, how long
between checkpoints?

I can see the sorted writes having an effect because the OS may not
receive blocks within a sufficient time window to fully optimise them.
That effect would grow with increasing sizes of shared_buffers and
decrease with size of controller cache. How big was the shared buffers
setting? What OS scheduler are you using? The effect would be greatest
when using Deadline.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Bruce Momjian

In an attempt to communicate what full text search does, and what
features we are thinking of adding/removing, I have put up the
introduction in HTML:

http://momjian.us/expire/fulltext/HTML/fulltext-intro.html

The links to the other sections don't work yet.

---

Tom Lane wrote:
 Joshua D. Drake [EMAIL PROTECTED] writes:
  Well my argument has always been the core feature argument. Perhaps I 
  am missing some info here, but when I read what you wrote, I read that 
  Tsearch will now be harder to work with. Not easier. :(
 
 Then you misread it.  What I was proposing was essentially that there
 won't be any need for pg_dump support because everything's built-in
 (at least as far as parsers/dictionaries go).
 
 As for the permissions issues, that's just formalizing something that's
 true today with the contrib module: if you change a configuration, it's
 *your* problem whether that invalidates any table entries, the system
 won't take care of it for you.
 
   regards, tom lane
 

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev



Well assuming we have any SQL-level support at all I think we should
strive to avoid these functions taking INTERNAL arguments.

That gets us down to just needing to worry about whether we like the
SQL representation of configurations.  Which is still a nontrivial
issue, but at least it seems manageable on a timescale that's
reasonable for 8.3.


A possible solution is to split pg_ts_dict into two tables, like pg_am and 
pg_opclass. (I'll talk about dictionaries; the same approach is possible for 
parsers, but right now that looks like overdesign.)
The first table, pg_ts_dict_template (I don't know the exact name yet), 
contains the columns: oid, template_name, dict_init, dict_lexize; the second,

pg_ts_dict, has the columns: oid, template_oid, owner, schema, dict_initoption.

CREATE/ALTER/DROP DICTIONARY affects only the second table, and access to the 
first one is only via select/update/insert/delete, similar to pg_am.


IMHO, this interface solves problems with security and dumping.
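For concreteness, a hypothetical sketch of the two catalogs might look like 
this (table and column names are illustrative only, not a final design):

```sql
-- Illustrative only: a template catalog holding the C-level entry points,
-- analogous to pg_am; rows here would change only via direct catalog access.
CREATE TABLE pg_ts_dict_template (
    oid           oid,
    template_name name,
    dict_init     regproc,
    dict_lexize   regproc
);

-- The user-facing catalog of dictionary instances, analogous to pg_opclass;
-- CREATE/ALTER/DROP DICTIONARY would operate only on this table.
CREATE TABLE pg_ts_dict (
    oid             oid,
    template_oid    oid,   -- references pg_ts_dict_template
    dict_owner      oid,
    dict_namespace  oid,
    dict_initoption text
);
```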

The reason to keep an SQLish interface to dictionaries is simplicity of 
configuration. Snowball's stemmers are useful as-is, but the ispell dictionary 
requires some configuration before use.

Next, INTERNAL arguments are used in the parser and dictionary APIs for 
performance reasons. During creation of a tsvector from text there are a lot 
of calls to parsers and dictionaries, and the internal structures of their 
states may be rather complex and cannot be mapped onto any pgsql type, even a 
flat memory structure.



 Next, it took me a while to understand how Mapping objects fit into
 the scheme at all, and now that (I think) I understand, I'm wondering
 why treat them as an independent concept.
ALTER FULLTEXT CONFIGURATION cfgname ADD MAPPING FOR tokentypename[, ...] WITH 
dictname1[, ...];
ALTER FULLTEXT CONFIGURATION cfgname ALTER MAPPING FOR tokentypename[, ...] WITH 
dictname1[, ...];

ALTER FULLTEXT CONFIGURATION cfgname ALTER MAPPING [FOR tokentypename[, ...]]
 REPLACE olddictname TO newdictname;
ALTER FULLTEXT CONFIGURATION cfgname DROP MAPPING [IF EXISTS]  FOR 
tokentypename;
Is it looking reasonable?
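For illustration, usage of the proposed forms might look like this 
(configuration, token type, and dictionary names are made up here, not part of 
the patch):

```sql
-- Map the lword token type to an english stemmer in a custom configuration.
ALTER FULLTEXT CONFIGURATION my_cfg ADD MAPPING FOR lword WITH en_stem;

-- Swap one dictionary for another in the existing mapping.
ALTER FULLTEXT CONFIGURATION my_cfg ALTER MAPPING FOR lword
    REPLACE en_stem TO en_ispell;

-- Stop indexing a token type entirely.
ALTER FULLTEXT CONFIGURATION my_cfg DROP MAPPING FOR email;
```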

 TSConfiguration objects are a different story, since they have only
 type-safe dependencies on parsers, locales, and dictionaries.  But they
 still need some more thought about permissions, because AFAICS mucking
 with a configuration can invalidate some other user's data.Do we want
 to allow runtime changes in a configuration that existing tsvector
 columns already depend on?  How can we even recognize whether there is
 stored data that will be affected by a configuration change?  (AFAICS

A very complex task: experienced users can use several configurations 
simultaneously. For example: indexing uses a configuration that doesn't reject 
stop-words, while default searching uses a configuration that does. BTW, the 
same effects may be produced by a dictionary change.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev

can we hard-code into the backend, and just update for every major
release like we do for encodings?


Sorry, none of them :(. We know of projects that introduce a new parser or a 
new dictionary. Configurations and mappings change very often.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev

But they still need some more thought about permissions, because AFAICS
mucking with a configuration can invalidate some other user's data.


ouch. could mucking with a configuration create a corrupt index?


It depends on what you mean by 'corrupted'. It will not be corrupted in the 
sense of being unreadable or causing a backend crash. But usage of such a 
tsvector column could be limited - not all words will be searchable.



This sounds sort of analogous to the issues collation bring up.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 The reason to save SQLish interface to dictionaries is a simplicity of 
 configuration. Snowball's stemmers are useful as is, but ispell dictionary 
 requires some configuration action before using.

Yeah.  I had been wondering about moving the dict_initoption over to the
configuration entry --- is that sane at all?  It would mean that
dict_init functions would have to guard themselves against invalid
options, but they probably ought to do that anyway.  If we did that,
I think we could have a fixed set of dictionaries without too much
problem, and focus on just configurations as being user-alterable.

 Next, it took me a while to understand how Mapping objects fit into
 the scheme at all, and now that (I think) I understand, I'm wondering
 why treat them as an independent concept.

 ALTER FULLTEXT CONFIGURATION cfgname ADD MAPPING FOR tokentypename[, ...] 
 WITH 
 dictname1[, ...];
 ALTER FULLTEXT CONFIGURATION cfgname ALTER MAPPING FOR tokentypename[, ...] 
 WITH 
 dictname1[, ...];
 ALTER FULLTEXT CONFIGURATION cfgname ALTER MAPPING [FOR tokentypename[, ...]]
   REPLACE olddictname TO newdictname;
 ALTER FULLTEXT CONFIGURATION cfgname DROP MAPPING [IF EXISTS]  FOR 
 tokentypename;
 Is it looking reasonable?

Er ... what's the difference between the second and third forms?

regards, tom lane



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Gregory Stark
Teodor Sigaev [EMAIL PROTECTED] writes:

 But they still need some more thought about permissions, because AFAICS
 mucking with a configuration can invalidate some other user's data.

 ouch. could mucking with a configuration create a corrupt index?

 Depending on what you mean 'corrupted'. It will not corrupted as non-readable
 or cause backend crash. But usage of such tsvector column could be limited -
 not all words will be searchable.

Am I correct to think of this like changing collations leaving your btree
index corrupt? In that case it probably won't cause any backend crash either
but you will get incorrect results. For example, returning different results
depending on whether the index or a full table scan is used.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev



Tom Lane wrote:

Teodor Sigaev [EMAIL PROTECTED] writes:
The reason to save SQLish interface to dictionaries is a simplicity of 
configuration. Snowball's stemmers are useful as is, but ispell dictionary 
requires some configuration action before using.


Yeah.  I had been wondering about moving the dict_initoption over to the
configuration entry --- is that sane at all?  It would mean that
It should be. Instances of the ispell (and synonym, thesaurus) dictionaries 
differ only in the dict_initoption part, so there would be only one entry in 
pg_ts_dict_template and several in pg_ts_dict.


ALTER FULLTEXT CONFIGURATION cfgname ADD MAPPING FOR tokentypename[, ...] WITH 
dictname1[, ...];
ALTER FULLTEXT CONFIGURATION cfgname ALTER MAPPING FOR tokentypename[, ...] WITH 
dictname1[, ...];

sets the dictionary list for the given token type(s)


ALTER FULLTEXT CONFIGURATION cfgname ALTER MAPPING [FOR tokentypename[, ...]]
  REPLACE olddictname TO newdictname;


Replaces one dictionary with another in the dictionary list for the given 
token type(s). This command is very useful for tweaking a configuration, and 
for creating a new configuration that differs from an existing one in only a 
single dictionary.



ALTER FULLTEXT CONFIGURATION cfgname DROP MAPPING [IF EXISTS]  FOR 
tokentypename;
Is it looking reasonable?


Er ... what's the difference between the second and third forms?


Those changes are doable within several days. I'd like to make them together 
with the replacement of the FULLTEXT keyword by TEXT SEARCH, as you suggested.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Oleg Bartunov

On Thu, 14 Jun 2007, Tom Lane wrote:


Teodor Sigaev [EMAIL PROTECTED] writes:

The reason to save SQLish interface to dictionaries is a simplicity of
configuration. Snowball's stemmers are useful as is, but ispell dictionary
requires some configuration action before using.


Yeah.  I had been wondering about moving the dict_initoption over to the
configuration entry --- is that sane at all?  It would mean that
dict_init functions would have to guard themselves against invalid
options, but they probably ought to do that anyway.  If we did that,
I think we could have a fixed set of dictionaries without too much
problem, and focus on just configurations as being user-alterable.


Currently, all dictionaries we provide are template dictionaries, so users can 
change only their parameters.

But there are reasons to allow users to register new templates, and in fact we 
know of people/projects with application-dependent dictionaries.
How could they dump/reload their dictionaries?


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Oleg Bartunov

On Thu, 14 Jun 2007, Gregory Stark wrote:


Teodor Sigaev [EMAIL PROTECTED] writes:


But they still need some more thought about permissions, because AFAICS
mucking with a configuration can invalidate some other user's data.


ouch. could mucking with a configuration create a corrupt index?


Depending on what you mean 'corrupted'. It will not corrupted as non-readable
or cause backend crash. But usage of such tsvector column could be limited -
not all words will be searchable.


Am I correct to think of this like changing collations leaving your btree
index corrupt? In that case it probably won't cause any backend crash either
but you will get incorrect results. For example, returning different results
depending on whether the index or a full table scan is used.


You're correct. But we can't defend users from all possible errors.
On the other side, we need to somehow help users identify which fts 
configuration was used to produce a tsvector. For example, a comment on the
tsvector column would be useful, but we don't know how to do this
automatically.


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev
But, there are reasons to allow users register new templates and in fact 
we know people/projects with application-dependent dictionaries. How 
they could dump/reload their dictionaries ?

The same way as pg_am does.

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev

Am I correct to think of this like changing collations leaving your btree
index corrupt? In that case it probably won't cause any backend crash either
but you will get incorrect results. For example, returning different results
depending on whether the index or a full table scan is used.


Excluding exotic cases, the maximum disaster is that queries with some words 
will return more or fewer results than they should, because of wrong stemming, 
a wrong stop-word match, or a wrong mapping.

By default, the configuration is useful for most users and works for the 
danish, dutch, finnish, french, german, hungarian, italian, norwegian, 
portuguese, spanish, swedish, russian and english languages.




--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Tom Lane
Oleg Bartunov [EMAIL PROTECTED] writes:
 You're correct. But we can't defend users from all possible errors. 
 Other side, that we need somehow to help user to identify what fts 
 configuration was used to produce tsvector. For example, comment on
 tsvector column would be useful, but we don't know how to do this
 automatically.

Yeah, I was wondering about that too.  The only way we could relax the
superuser, you-better-know-what-you're-doing restriction on changing
configurations would be if we had a way to identify which tsvector
columns needed to be updated.  Right now that's pretty hard to find out
because the references to configurations are buried in the bodies of
trigger functions.  That whole trigger-function business is not the
nicest part of tsearch2, either ... it'd be better if we could automate
tsvector maintenance more.

One thing I was thinking about is that rather than storing a physical
tsvector column, people might index a virtual column using functional
indexes:

create index ... (to_tsvector('english', big_text_col))

which could be queried

select ... where to_tsvector('english', big_text_col) @@ tsquery

Assuming that the index is lossy, the index condition would have to be
rechecked, so to_tsvector() would have to be recomputed, but only at the
rows identified as candidate matches by the index.  The I/O savings from
eliminating the heap's tsvector column might counterbalance the extra
CPU for recomputing tsvectors.  Or not, but in any case this is
attractive because it doesn't need any handmade maintenance support like
a trigger --- the regular index maintenance code does it all.
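Spelled out, the trigger-free pattern described above might look roughly like 
this (table and column names invented for illustration; assumes a GiST opclass 
for tsvector is available, as in tsearch2):

```sql
CREATE TABLE docs (id serial PRIMARY KEY, big_text_col text);

-- Functional index on the "virtual" tsvector column: no stored tsvector
-- column, no maintenance trigger.
CREATE INDEX docs_fts_idx ON docs
    USING gist (to_tsvector('english', big_text_col));

-- Lossy index matches are rechecked by recomputing to_tsvector, but only
-- at the candidate rows the index identifies.
SELECT id FROM docs
 WHERE to_tsvector('english', big_text_col) @@ to_tsquery('example');
```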

It strikes me that we could play the same kind of game we played to make
nextval() references to sequences be recognized as dependencies on
sequences.  Invent a regconfig OID type that's just like regclass
except it handles OIDs of ts_config entries instead of pg_class entries,
and make the first argument of to_tsvector be one of those:

create index ... (to_tsvector('english'::regconfig, big_text_col))

Now dependency.c can be taught to recognize the regconfig Const as
depending on the referenced ts_config entry, and voila we have a
pg_depend entry showing that the index depends on the configuration.
What we actually do about it is another question, but this at least
gets the knowledge into the system.

[ thinks some more... ]  If we revived the GENERATED AS patch,
you could imagine computing tsvector columns via GENERATED AS
to_tsvector('english'::regconfig, big_text_col) instead of a
trigger, and then again you've got the dependency exposed where
the system can see it.  I don't wanna try to do that for 8.3,
but it might be a good path to pursue in future, instead of assuming
that triggers will be the way forevermore.

Thoughts?

regards, tom lane



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Michael Paesold

Bruce Momjian wrote:

In an attempt to communicate what full text search does, and what
features we are thinking of adding/removing, I have put up the
introduction in HTML:

http://momjian.us/expire/fulltext/HTML/fulltext-intro.html



Very good idea, Bruce!

After reading the discussion and the introduction, here is what I think 
tsearch in core should at least accomplish in 8.3. Please bear in mind, 
that (a) I am talking from a user perspective (there might be technical 
arguments against my thoughts) and (b) I have no hands-on experience 
with Tsearch2 yet, so more experienced users might have different needs.


 - Basic full text search usable for non-superusers
 - Out-of-the-box working configuration for as many languages as
   reasonable (Teodor named quite a number of languages working as-is,
   so this is really an improvement over contrib, great!)
 - No foot-guns accessible to non-superuser
 - Agreement on function names, perhaps some should be changed. For
   instance to_tsquery() and plainto_tsquery() seem rather unintuitive
   because they don't have a common prefix, and they are not consistent
   about using underscores. Perhaps to_tsquery() and to_tsquery_plain()?
 - Future compatibility for all features available to non-superusers
 - Stop words in tables, not in external files.
 - At least for superusers, all features available in contrib now,
   should be available, too (don't know about pg_dump).

What I don't really like is the number of commands introduced without 
any strong reference to full text search. E.g. CREATE CONFIGURATION 
gives no hint at all that this is about full text search. IMHO there are 
more configurations than just full text ones. :-) So perhaps better 
spell this CREATE FULLTEXT CONFIGURATION etc.? (Think about tab 
completion in psql, for instance.)


I guess this is in line with what Tom said about mapping objects and 
CREATE ATTRIBUTE vs. CREATE/ALTER CONFIGURATION.

(http://archives.postgresql.org/pgsql-hackers/2007-06/msg00522.php)

After all, I would really welcome having full text search capabilities 
in core.


Best Regards
Michael Paesold




Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Teodor Sigaev [EMAIL PROTECTED] writes:
 The reason to save SQLish interface to dictionaries is a simplicity of 
 configuration. Snowball's stemmers are useful as is, but ispell dictionary 
 requires some configuration action before using.
 
 Yeah.  I had been wondering about moving the dict_initoption over to the
 configuration entry --- is that sane at all?  It would mean that

 It should be. Instances of ispell (and synonym, thesaurus) dictionaries are 
 different only in dict_initoption part, so it will be only one entry in 
 pg_ts_dict_template and several ones in pg_ts_dict.

No, I was thinking of still having just one pg_ts_dict catalog (no template)
but removing its dictinit field.  Instead, the init strings would be
stored with configuration mapping entries.

This would mean having to remember to provide the right option along
with the dictionary name when doing ALTER CONFIGURATION ADD MAPPING.
Not sure if that would be harder or easier to use than what you're
thinking of.

regards, tom lane



[HACKERS] Change sort order on UUIDs?

2007-06-14 Thread Robert Wojciechowski
I've been testing the new UUID functionality in 8.3dev and noticed that
UUIDs are sorted using memcmp in their default in-memory layout, which
is:

 

 struct uuid {
     uint32_t    time_low;
     uint16_t    time_mid;
     uint16_t    time_hi_and_version;
     uint8_t     clock_seq_hi_and_reserved;
     uint8_t     clock_seq_low;
     uint8_t     node[_UUID_NODE_LEN];
 };

 

When done that way, you're going to see a lot of index B-tree fragmentation 
even with DCE 1.1 (ISO/IEC 11578:1996) time-based UUIDs, as described above. 
With random (version 4) or hash-based (version 3 or 5) UUIDs there's nothing 
that can be done to improve the situation, obviously.

 

So I went down the path of changing the pgsql sorting order to instead
sort by, from most significant to least:

 

1)  Node (MAC address),
2)  clock sequence, then
3)  time.

 

The implementation is as follows:

 

/* internal uuid compare function */
static int
uuid_internal_cmp(const pg_uuid_t *arg1, const pg_uuid_t *arg2)
{
	int		result;

	/* node */
	if ((result = memcmp(&arg1->data[10], &arg2->data[10], 6)) != 0)
		return result;

	/* clock_seq_hi_and_reserved, clock_seq_low */
	if ((result = memcmp(&arg1->data[8], &arg2->data[8], 2)) != 0)
		return result;

	/* time_hi_and_version */
	if ((result = memcmp(&arg1->data[6], &arg2->data[6], 2)) != 0)
		return result;

	/* time_mid */
	if ((result = memcmp(&arg1->data[4], &arg2->data[4], 2)) != 0)
		return result;

	/* time_low */
	return memcmp(&arg1->data[0], &arg2->data[0], 4);
}

 

This results in much less fragmentation and reduced page hits when
indexing a UUID column. When multiple UUID generators with different
node values contribute to a single table concurrently, it should also
result in better performance than if they sorted the way they do now or
by time first.

 

Sorting random/hashed UUIDs with memcmp seems pretty darn useless in all 
scenarios and performs poorly on indexes. The proposed method is equally poor 
with random/hashed UUIDs, but much better with version 1 time-based UUIDs.

 

What do you guys think about changing the default behavior of pgsql to
compare UUIDs this way?

 

-- Robert



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Gregory Stark
Oleg Bartunov [EMAIL PROTECTED] writes:

 On Thu, 14 Jun 2007, Gregory Stark wrote:

 Am I correct to think of this like changing collations leaving your btree
 index corrupt? In that case it probably won't cause any backend crash 
 either
 but you will get incorrect results. For example, returning different results
 depending on whether the index or a full table scan is used.

 You're correct. But we can't defend users from all possible errors. 

Sure, but it seems like the line, at least in existing cases, is that if you
fiddle with catalogs directly then you should know what consequences you need
to be careful of.

But if you make changes through a supported, documented interface, then
the system will protect you from breaking things. 

Hm, I went to construct an example and accidentally found a precedent for not
necessarily protecting users from themselves in every case:


postgres=# create table x (i integer);
CREATE TABLE
postgres=# create function f(integer) returns integer as 'select $1' immutable 
strict language sql;
CREATE FUNCTION
postgres=# select f(1);
 f 
---
 1
(1 row)

postgres=# create index xi on x (f(i));
CREATE INDEX
postgres=# insert into x values (1);
INSERT 0 1
postgres=# insert into x values (2);
INSERT 0 1
postgres=# create or replace function f(integer) returns integer as 'select 
-$1' immutable strict language sql;
CREATE FUNCTION


Uhm. Oops! And yes, the resulting index is, of course, corrupt:



postgres=# insert into x (select random() from generate_series(1,2000));
INSERT 0 2000
postgres=# select count(*) from x where f(i) = -1;
 count 
---
 0
(1 row)
postgres=# set enable_bitmapscan = off;
SET
postgres=# set enable_indexscan = off;
SET
postgres=#  select count(*) from x where f(i) = -1;
 count 
---
  1003
(1 row)

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Tom Lane
Michael Paesold [EMAIL PROTECTED] writes:
 After reading the discussion and the introduction, here is what I think 
 tsearch in core should at least accomplish in 8.3.
 ...
   - Stop words in tables, not in external files.

I realized that there's a pretty serious problem with doing that, which
is encoding.  We don't have any way to deal with preloaded catalog data
that exceeds 7-bit-ASCII, because when you do CREATE DATABASE ... ENCODING
it's going to be copied over exactly as-is.  And there's plenty of
not-ASCII stuff in the non-English stopword files.  This is something we
need to solve eventually, but I think it ties into the whole multiple
locale can-of-worms; there's no way we're getting it done for 8.3.

So I'm afraid we have to settle for stop words in external files for the
moment.  I do have two suggestions though:

* Let's have just one stopword file for each language, with the
convention that the file is stored in UTF8 no matter what language
you're talking about.  We can have the stopword reading code convert
to the database encoding on-the-fly when it reads the file.  Without
this there's just a whole bunch of foot-guns there.  We'd at least need
to have encoding verification checks when reading the files, which seems
hardly cheaper than just translating the data.
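The convert-on-read idea can be sketched quickly. This is a minimal Python illustration, not PostgreSQL code; the `read_stopwords` helper and the use of LATIN1 as a stand-in database encoding are assumptions for the example. The point is simply: decode the file as UTF8 by convention, then verify each word is representable in the database encoding, failing loudly instead of silently storing garbage.

```python
def read_stopwords(raw_utf8: bytes, db_encoding: str) -> list:
    # Stopword files are UTF-8 on disk by convention; decode them, then
    # check every word is representable in the database encoding.  A real
    # implementation would convert and store; raising here is the
    # "encoding verification" the text mentions.
    words = raw_utf8.decode("utf-8").split()
    for word in words:
        word.encode(db_encoding)  # raises UnicodeEncodeError if impossible
    return words

# French stopwords are representable in LATIN1, so this succeeds:
raw = "le la les déjà".encode("utf-8")
print(read_stopwords(raw, "latin-1"))
```

A Russian stopword list run through the same check with a LATIN1 database encoding would raise immediately, rather than producing a corrupted dictionary later.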

* Let's fix it so the reference to the stoplist in the user-visible
options is just a name, with no path or anything like that.  (Similar
to the handling of timezone_abbreviations.)  Then it will be feasible
to re-interpret the option as a reference to a named list in a catalog
someday, when we solve the encoding problem.  Right now the patch has
things like

+ DATA(insert OID = 5140 (  ru_stem_koi8 PGNSP PGUID 5135 5137 
dicts_data/russian.stop.koi8));

which is really binding the option pretty tightly to being a filename;
not to mention the large security risks involved in letting anyone but
a superuser have control of such an option.

 What I don't really like is the number of commands introduced without 
 any strong reference to full text search. E.g. CREATE CONFIGURATION 
 gives no hint at all that this is about full text search.

Yeah.  We had some off-list discussion about this and concluded that
TEXT SEARCH seemed to be the right phrase to use in the command names.
That hasn't gotten reflected into the patch yet.

regards, tom lane



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Teodor Sigaev
It should be. Instances of ispell (and synonym, thesaurus) dictionaries
differ only in the dict_initoption part, so there will be only one entry in
pg_ts_dict_template and several in pg_ts_dict.


No, I was thinking of still having just one pg_ts_dict catalog (no template)
but removing its dictinit field.  Instead, the init strings would be
stored with configuration mapping entries.

This would mean having to remember to provide the right option along
with the dictionary name when doing ALTER CONFIGURATION ADD MAPPING.
Not sure if that would be harder or easier to use than what you're
thinking of.


Hmm. A dictionary may be present in several lists of dictionaries in one
configuration. I suppose it isn't practical to store dictinitoption
several times. On the other hand, the same dictionary (template) with
different init options may be present in a configuration too. A typical
example is a configuration for the Russian language:

lword, lpword tokens have the dictionary list {ispell_en, stem_en}
nlword, nlpword tokens have the dictionary list {ispell_ru, stem_ru}

stem_(ru|en) is a Snowball stemmer, but ispell_(ru|en) is an ispell
dictionary (template) with a different dictinitoption. Also,
configurations may share dictionaries.


And an init option may be rather big.




Re: [HACKERS] Change sort order on UUIDs?

2007-06-14 Thread Tom Lane
Robert Wojciechowski [EMAIL PROTECTED] writes:
 I've been testing the new UUID functionality in 8.3dev and noticed that
 UUIDs are sorted using memcmp in their default in-memory layout,
 ...
 When done that way, you're going to see a lot of index B-tree
 fragmentation with even DCE 1.1 (ISO/IEC 11578:1996) time based UUIDs,

This claim seems like nonsense.  Btrees don't care about the ordering
details of what they index.

regards, tom lane



Re: [HACKERS] Change sort order on UUIDs?

2007-06-14 Thread Heikki Linnakangas

Tom Lane wrote:

Robert Wojciechowski [EMAIL PROTECTED] writes:

I've been testing the new UUID functionality in 8.3dev and noticed that
UUIDs are sorted using memcmp in their default in-memory layout,
...
When done that way, you're going to see a lot of index B-tree
fragmentation with even DCE 1.1 (ISO/IEC 11578:1996) time based UUIDs,


This claim seems like nonsense.  Btrees don't care about the ordering
details of what they index.


I believe he means that with his modified comparison function, when 
inserting a series of UUIDs with increasing time-fields, the index keys 
are always inserted to the rightmost page, which gives a more tightly 
packed index than scattered inserts all-around the index.
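For illustration, here is a minimal Python sketch of that comparison change (the `make_v1` helper with fixed node and clock-sequence values is an assumption for the example, not part of any patch). Comparing version-1 UUIDs bytewise puts time_low first, so a later timestamp can sort earlier; a key that compares time_hi/time_mid/time_low in significance order recovers time order.

```python
import uuid

def time_key(u: uuid.UUID) -> tuple:
    # Compare the DCE 1.1 time fields most-significant-first,
    # instead of the memcmp (wire-layout) order time_low first.
    return (u.time_hi_version, u.time_mid, u.time_low)

def make_v1(timestamp_100ns: int) -> uuid.UUID:
    # Build a version-1 UUID from a 60-bit timestamp (illustrative helper;
    # clock_seq and node are fixed).
    time_low = timestamp_100ns & 0xFFFFFFFF
    time_mid = (timestamp_100ns >> 32) & 0xFFFF
    time_hi = ((timestamp_100ns >> 48) & 0x0FFF) | 0x1000  # version 1
    return uuid.UUID(fields=(time_low, time_mid, time_hi,
                             0x80, 0x00, 0x123456789ABC))

# Two timestamps where the later one has a *smaller* time_low:
a = make_v1(0x0000_0001_FFFF_FFFF)   # earlier
b = make_v1(0x0000_0002_0000_0000)   # later (time_low wrapped to 0)

memcmp_order = sorted([a, b], key=lambda u: u.bytes)  # bytewise, as in 8.3dev
time_order = sorted([a, b], key=time_key)             # field-reordered key

print(memcmp_order == [b, a])  # bytewise puts the later UUID first
print(time_order == [a, b])    # reordered key recovers time order
```

With the reordered key, a stream of version-1 UUIDs generated in time order always compares greater than everything before it, which is what sends the inserts to the rightmost index page.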


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Change sort order on UUIDs?

2007-06-14 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 I believe he means that with his modified comparison function, when 
 inserting a series of UUIDs with increasing time-fields, the index keys 
 are always inserted to the rightmost page, which gives a more tightly 
 packed index than scattered inserts all-around the index.

Hm.  Still, given that that benefit would only accrue for one version of
uuid generation, it's a pretty weak argument.

The concrete reason for not changing it is that the sort ordering of
uuids would then look quite unnatural compared to the display format.
Which would provoke confusion and bug reports...

regards, tom lane



Re: [HACKERS] Change sort order on UUIDs?

2007-06-14 Thread Robert Wojciechowski
  I've been testing the new UUID functionality in 8.3dev and noticed that
  UUIDs are sorted using memcmp in their default in-memory layout,
  ...
  When done that way, you're going to see a lot of index B-tree
  fragmentation with even DCE 1.1 (ISO/IEC 11578:1996) time based UUIDs,

  This claim seems like nonsense.  Btrees don't care about the ordering
  details of what they index.

 I believe he means that with his modified comparison function, when
 inserting a series of UUIDs with increasing time-fields, the index keys
 are always inserted to the rightmost page, which gives a more tightly
 packed index than scattered inserts all-around the index.


That was my thinking; it would speed up (bulk) inserts by causing
fewer page splits.

I'm also using my own contrib module that uses FreeBSD's uuid_create
generating DCE 1.1 UUIDs, which keeps the state of the UUID generator in
the kernel. The 8.3 contrib module based on uuid-ossp seems to 1) not
compile on FreeBSD (conflicts with uuid.h from the OS) and 2) randomizes
the clock sequence as there is no state stored between invocations.

The other thing this modification does is allow ORDER BY to order by
time when possible, which is a nice default behavior as well, yes?

-- Robert



Re: [HACKERS] Change sort order on UUIDs?

2007-06-14 Thread Robert Wojciechowski
 Heikki Linnakangas [EMAIL PROTECTED] writes:
  I believe he means that with his modified comparison function, when
  inserting a series of UUIDs with increasing time-fields, the index keys
  are always inserted to the rightmost page, which gives a more tightly
  packed index than scattered inserts all-around the index.

 Hm.  Still, given that that benefit would only accrue for one version of
 uuid generation, it's a pretty weak argument.

 The concrete reason for not changing it is that the sort ordering of
 uuids would then look quite unnatural compared to the display format.
 Which would provoke confusion and bug reports...

   regards, tom lane

If it improves non-user controllable indexing behavior, doesn't
negatively affect the indexing of random/hash based UUIDs, and only
seems to affect ordering for the display format, it seems worth it to
me.

A paragraph in the documentation stating how UUIDs are sorted seems to
satisfy the visual ordering concern, which is more than what Microsoft
is doing (I had to dig for a blog post to find this out.)

In addition, it would be very odd to sort random/hashed GUIDs and expect
anything meaningful anyway. If the user wants to see a UUID
lexicographically sorted, they could also cast the column to text like so:

  select uuid_column from uuid_test order by uuid_column::text;

... which produces the desired output for visual analysis if that was
desired while still retaining all the other benefits.

I'll continue thinking about any other downsides to this tonight, too.

-- Robert



Re: [HACKERS] Change sort order on UUIDs?

2007-06-14 Thread Gregory Stark

Robert Wojciechowski [EMAIL PROTECTED] writes:

 When done that way, you're going to see a lot of index B-tree
 fragmentation with even DCE 1.1 (ISO/IEC 11578:1996) time based UUIDs,
 as described above. With random (version 4) or hashed based (version 3
 or 5) UUIDs there's nothing that can be done to improve the situation,
 obviously.

Is this based on empirical results or just a theory? I'm asking because it's
actually a common technique to reverse the natural index key to construct
basically exactly this situation -- for performance reasons. The idea is that
low order bits have higher cardinality and that that can *improve* btree
performance by avoiding contention.
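A toy sketch of that key-reversal technique (pure illustration, not from the thread): sequential keys share a long common prefix and all land on the same rightmost leaf, while reversing the key makes consecutive values differ immediately and scatters them across the keyspace.

```python
# Sequential 8-digit keys, zero-padded, as they might appear in an index.
keys = [f"{n:08d}" for n in range(100)]

# The common "reverse the natural key" trick: reverse the digit string.
reversed_keys = [k[::-1] for k in keys]

# All sequential keys start with the same character, so inserts pile up
# at one end of the tree; the reversed keys fan out over 10 subtrees.
print(len({k[0] for k in keys}))           # 1
print(len({k[0] for k in reversed_keys}))  # 10
```

Whether the fan-out actually helps depends on the workload, as the rest of the message says: it trades rightmost-page contention for touching many pages.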

I'm not sure how much I believe in the effectiveness of that strategy myself
or for that matter whether it's universally applicable or only useful in
certain types of loads.

I'm not saying you're wrong, but I'm not sure it's a simple open and shut case
either. 

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Change sort order on UUIDs?

2007-06-14 Thread Gregory Stark
Robert Wojciechowski [EMAIL PROTECTED] writes:

 That was my thinking; that it would speed up (bulk) inserts causing
 fewer page splits.

Ah, I understand better now. hm. high data density would be good for reading.
But I think the case for inserting is actually quite mixed. If you have lots
of processes trying to insert you'll actually get poorer performance because
they'll all have to get access to the same page. Worse, you'll probably have a
unique index.

 The other thing this modification does is allow ORDER BY to order by
 time when possible, which is a nice default behavior as well, yes?

I think that actually is quite a nice effect. Certainly the loss of it is one
of the big practical disadvantages of using UUIDs over a sequence.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Change sort order on UUIDs?

2007-06-14 Thread mark
On Thu, Jun 14, 2007 at 03:38:44PM -0400, Robert Wojciechowski wrote:
 I've been testing the new UUID functionality in 8.3dev and noticed that
 UUIDs are sorted using memcmp in their default in-memory layout, which
 is:
  struct uuid {
      uint32_t time_low;
      uint16_t time_mid;
      uint16_t time_hi_and_version;
      uint8_t  clock_seq_hi_and_reserved;
      uint8_t  clock_seq_low;
      uint8_t  node[_UUID_NODE_LEN];
  };
 When done that way, you're going to see a lot of index B-tree
 fragmentation with even DCE 1.1 (ISO/IEC 11578:1996) time based UUIDs,
 as described above. With random (version 4) or hashed based (version 3
 or 5) UUIDs there's nothing that can be done to improve the situation,
 obviously.

I suggest that treating the UUID as anything other than a unique
random value is a mistake. There should be no assumptions by users
with regard to how the order is displayed. Also, as UUID generation
based on time is always in sequence, it seems to me that sorting by
UUID time would have the effect of inserts always being to the end of
the index. While this might pack tightly, wouldn't this hurt
concurrency? Random access vs sequential performance. For UUID, I
would value random access before sequential performance. Why would
anybody scan UUID through the index in sequential order?

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [HACKERS] Sorted writes in checkpoint

2007-06-14 Thread Gregory Maxwell

On 6/14/07, Simon Riggs [EMAIL PROTECTED] wrote:

On Thu, 2007-06-14 at 16:39 +0900, ITAGAKI Takahiro wrote:
 Greg Smith [EMAIL PROTECTED] wrote:

  On Mon, 11 Jun 2007, ITAGAKI Takahiro wrote:
   If the kernel can treat sequential writes better than random writes, is
   it worth sorting dirty buffers in block order per file at the start of
   checkpoints?

 I wrote and tested the attached sorted-writes patch base on Heikki's
 ldc-justwrites-1.patch. There was obvious performance win on OLTP workload.

   tests| pgbench | DBT-2 response time (avg/90%/max)
 ---+-+---
  LDC only  | 181 tps | 1.12 / 4.38 / 12.13 s
  + BM_CHECKPOINT_NEEDED(*) | 187 tps | 0.83 / 2.68 /  9.26 s
  + Sorted writes   | 224 tps | 0.36 / 0.80 /  8.11 s

 (*) Don't write buffers that were dirtied after starting the checkpoint.

 machine : 2GB-ram, SCSI*4 RAID-5
 pgbench : -s400 -t4 -c10  (about 5GB of database)
 DBT-2   : 60WH (about 6GB of database)

I'm very surprised by the BM_CHECKPOINT_NEEDED results. What percentage
of writes has been saved by doing that? We would expect a small
percentage of blocks only and so that shouldn't make a significant
difference. I thought we discussed this before, about a year ago. It
would be easy to get that wrong and to avoid writing a block that had
been re-dirtied after the start of checkpoint, but was already dirty
beforehand. How long was the write phase of the checkpoint, how long
between checkpoints?

I can see the sorted writes having an effect because the OS may not
receive blocks within a sufficient time window to fully optimise them.
That effect would grow with increasing sizes of shared_buffers and
decrease with size of controller cache. How big was the shared buffers
setting? What OS scheduler are you using? The effect would be greatest
when using Deadline.


Linux has some instrumentation that might be useful for this testing,

echo 1 > /proc/sys/vm/block_dump
Will have the kernel log all physical IO (disable syslog writing to
disk before turning it on if you don't want the system to blow up).

Certainly the OS elevator should be working well enough to not see
that much of an improvement. Perhaps frequent fsync behavior is having
unintended interaction with the elevator?  ... It might be worthwhile
to contact some Linux kernel developers and see if there is some
misunderstanding.
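As a rough illustration of the sorted-writes idea being tested here (a Python sketch, not the actual patch; the `Buffer` tuple and its `dirty` flag standing in for BM_CHECKPOINT_NEEDED are assumptions): collect the buffers flagged at checkpoint start and emit them ordered by file and block, so the writes reach the OS roughly sequentially per file.

```python
from collections import namedtuple

# `dirty` stands in for BM_CHECKPOINT_NEEDED: buffers dirtied *after* the
# checkpoint started are skipped, as in ITAGAKI's second test row.
Buffer = namedtuple("Buffer", "relfile block dirty")

def checkpoint_write_order(buffers):
    # Sort the to-be-written buffers by (file, block) so the kernel
    # receives them in sequential order per file.
    todo = [b for b in buffers if b.dirty]
    return sorted(todo, key=lambda b: (b.relfile, b.block))

pool = [Buffer(1, 42, True), Buffer(2, 7, True),
        Buffer(1, 3, True), Buffer(1, 99, False)]
print(checkpoint_write_order(pool))
```

The real buffer manager works over shared buffer headers rather than a Python list, but the write ordering is the whole trick: random dirty-buffer order becomes per-file sequential order before the writes are issued.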



Re: [HACKERS] Tsearch vs Snowball, or what's a source file?

2007-06-14 Thread Tom Lane
I wrote:
 Teodor Sigaev [EMAIL PROTECTED] writes:
 2 Snowball's compiling infrastructure doesn't support Windows target.

 Yeah.  Another problem with using their original source code is that
 running the Snowball compiler during build would not work for
 cross-compiled builds of Postgres, at least not without solving the
 problem of building some code for the host platform instead of the
 target.

 So what I'm thinking now is we should import libstemmer instead of the
 snowball_code representation.  I haven't gotten as far as thinking about
 exactly how to lay out the files though.

I've done some more work on this point.  After looking at the Snowball
code in more detail, I'm thinking it'd be a good idea to keep it at
arm's length in a loadable shared library, instead of incorporating it
directly into the backend.  This is because they don't see anything
wrong with exporting random global function names like eq_v and
skip_utf8; so the probability of name collisions is a bit too high for
my taste.  The current tsearch_core patch envisions having a couple of
the snowball stemmers in the core backend and the rest in a loadable
library, but I suggest we just put them all in a loadable library, with
the only entry points being snowball_init() and snowball_lexize()
tsearch dictionary support functions.  (I am thinking of having just one
such function pair, with the init function taking an init option to
select which stemmer to use, instead of a separate Postgres function
pair per stemmer.)
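The single-entry-point design can be sketched like this (a Python toy with stand-in stemmers; the real snowball_init()/snowball_lexize() would be C functions returning a dictionary state): the init option selects the stemmer, and lexize just applies the selected state.

```python
# Toy stand-ins for the compiled Snowball stemmers; the real ones are
# generated C code, and "strip a trailing s" is of course not real stemming.
STEMMERS = {
    "english": lambda w: w[:-1] if w.endswith("s") else w,
    "russian": lambda w: w,
}

def snowball_init(option: str):
    # One init entry point: the option string picks the stemmer,
    # instead of exporting a separate function pair per language.
    try:
        return STEMMERS[option]
    except KeyError:
        raise ValueError("unknown Snowball stemmer: %s" % option)

def snowball_lexize(dict_state, word: str) -> str:
    # One lexize entry point: apply whatever state init handed back.
    return dict_state(word)

d = snowball_init("english")
print(snowball_lexize(d, "cats"))
```

The payoff is exactly what the message describes: only two names need to be exported from the loadable library, so the Snowball code's habit of exporting globals like eq_v never touches the backend's namespace.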

Attached is a rough proof-of-concept patch for this.  It doesn't do
anything useful, but it does prove that we can compile and link the
Snowball stemmers into a Postgres loadable module with only trivial
changes to their source code.  The code compiles cleanly (zero warnings
in gcc).  The file layout is

src/backend/snowball/Makefile   our files
src/backend/snowball/README
src/backend/snowball/dict_snowball.c
src/backend/snowball/libstemmer/*.c their .c files

src/include/snowball/header.h   intercepting .h file
src/include/snowball/libstemmer/*.h their .h files

If there're no objections, I'll push forward with completing the
dictionary support functions to go with this infrastructure.

regards, tom lane



[Attachment: snowball-add.tar.gz]



[HACKERS] How does the tsearch configuration get selected?

2007-06-14 Thread Bruce Momjian
I am confused by the CREATE FULLTEXT CONFIGURATION command:

http://momjian.us/expire/fulltext/SGML/ref/create-fulltext-config.sgml

First, why are we specifying the server locale here since it never
changes:

   <varlistentry>
    <term><literal>LOCALE</literal></term>
    <listitem>
     <para>
      <replaceable class="PARAMETER">localename</replaceable>
      is the name of the locale. It should match server's locale
      (<varname>lc_ctype</varname>)
      to identify full-text configuration used by default.
     </para>
    </listitem>
   </varlistentry>

Second, I can't figure out how to reference a non-default
configuration.  The description says:

   <varlistentry>
    <term><literal>AS DEFAULT</literal></term>
    <listitem>
     <para>
      Set <literal>default</literal> flag for the configuration, which
      used to identify if this configuration is selectable on default
      (see <literal>LOCALE</literal> description above).
      It is possible to have <emphasis>maximum one</emphasis> configuration
      with the same locale and in the same schema with this flag enabled.
     </para>
    </listitem>
   </varlistentry>

The documentation says that the first fulltext configuration found in
the search path is the one used, so how does a secondary configuration
in the same schema actually get accessed by @@ or ::tsquery?  Do you
have to use to_tsquery() with the optional configuration name?

Is this really the direction we want to go, having a default that gets
picked up from the search_path, perhaps based on some encoding/locale
match I can't figure out, or do we want to require the configuration to
be specified always, and if we do that, how do we handle the @@
operator?

I am thinking we should just use the first fulltext configuration
from the first schema in the search path and eliminate naming the
configurations (same as schema name?).  Allowing configuration names to
be specified only sometimes is confusing.  Or we can use a GUC to name
the configuration we want to use specifically, rather than have a
read-only tsearch_conf_name (is it read-only?) that is controlled by the
search_path.

And why are we talking locale here instead of encoding?  And since we
only have one encoding per database, how can there be more than one?  Is
this _client_ encoding?

FYI, while the configuration selection needs work, the rest of the areas
seem logical.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-14 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 First, why are we specifying the server locale here since it never
 changes:

It's poorly described.  What it should really say is the language
that the text-to-be-searched is in.  We can actually support multiple
languages here today, the restriction being that there have to be
stemmer instances for the languages with the database encoding you're
using.  With UTF8 encoding this isn't much of a restriction.  We do need
to put code into the dictionary stuff to enforce that you can't use a
stemmer when the database encoding isn't compatible with it.

I would prefer that we not drive any of this stuff off the server's
LC_xxx settings, since as you say that restricts things to just one
locale.

 Second, I can't figure out how to reference a non-default
 configuration.

See the multi-argument versions of to_tsvector etc.

I do see a problem with having to_tsvector(config, text) plus
to_tsvector(text) where the latter implicitly references a config
selected by a GUC variable: how can you tell whether a query using the
latter matches a particular index using the former?  There isn't
anything in the current planner mechanisms that would make that work.

regards, tom lane



Re: [HACKERS] How does the tsearch configuration get selected?

2007-06-14 Thread Oleg Bartunov

On Thu, 14 Jun 2007, Tom Lane wrote:


Bruce Momjian [EMAIL PROTECTED] writes:

First, why are we specifying the server locale here since it never
changes:


server's locale is used just for one purpose - to select what text search
configuration to use by default. Any text search function can accept
a text search configuration as an optional parameter.



It's poorly described.  What it should really say is the language
that the text-to-be-searched is in.  We can actually support multiple
languages here today, the restriction being that there have to be
stemmer instances for the languages with the database encoding you're
using.  With UTF8 encoding this isn't much of a restriction.  We do need
to put code into the dictionary stuff to enforce that you can't use a
stemmer when the database encoding isn't compatible with it.

I would prefer that we not drive any of this stuff off the server's
LC_xxx settings, since as you say that restricts things to just one
locale.


something like
CREATE TEXT SEARCH DICTIONARY dictname [LOCALE=ru_RU.UTF-8]
and raise a warning/error if the database encoding doesn't match the
dictionary encoding, if specified (not all dictionaries depend on
encoding, so it should be an optional parameter).




Second, I can't figure out how to reference a non-default
configuration.


See the multi-argument versions of to_tsvector etc.

I do see a problem with having to_tsvector(config, text) plus
to_tsvector(text) where the latter implicitly references a config
selected by a GUC variable: how can you tell whether a query using the
latter matches a particular index using the former?  There isn't
anything in the current planner mechanisms that would make that work.


Probably, having a default text search configuration is not a good idea,
and we could just require it as a mandatory parameter, which would
eliminate much confusion in selecting a text search configuration.


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] tsearch_core patch: permissions and security issues

2007-06-14 Thread Oleg Bartunov

On Thu, 14 Jun 2007, Tom Lane wrote:


Oleg Bartunov [EMAIL PROTECTED] writes:

You're correct. But we can't defend users from all possible errors.
On the other hand, we need to somehow help the user identify which fts
configuration was used to produce a tsvector. For example, a comment on
the tsvector column would be useful, but we don't know how to do this
automatically.


Yeah, I was wondering about that too.  The only way we could relax the
superuser, you-better-know-what-you're-doing restriction on changing
configurations would be if we had a way to identify which tsvector
columns needed to be updated.  Right now that's pretty hard to find out
because the references to configurations are buried in the bodies of
trigger functions.  That whole trigger-function business is not the
nicest part of tsearch2, either ... it'd be better if we could automate
tsvector maintenance more.


yes, a trigger function is complex stuff; our tsearch() trigger is an
example of automated stuff. It could be written very easily in plpgsql,
for example.

=# create function my_update() returns trigger as
$$
BEGIN
    NEW.fts =
        setweight( to_tsvector('english', NEW.t1), 'A') ||
        setweight( to_tsvector('english', NEW.t2), 'B');
    RETURN NEW;
END;
$$
language plpgsql;




One thing I was thinking about is that rather than storing a physical
tsvector column, people might index a virtual column using functional
indexes:

create index ... (to_tsvector('english', big_text_col))

which could be queried

select ... where to_tsvector('english', big_text_col) @@ tsquery



this is already possible for a gin index:

create index gin_text_idx on test using gin (
    ( coalesce(to_tsvector(title),'') || coalesce(to_tsvector(body),'') )
);

apod=# select title from test where
    (coalesce(to_tsvector(title),'') || coalesce(to_tsvector(body),'')) @@
    to_tsquery('supernovae') order by sdate desc limit 10;





Assuming that the index is lossy, the index condition would have to be
rechecked, so to_tsvector() would have to be recomputed, but only at the
rows identified as candidate matches by the index.  The I/O savings from
eliminating the heap's tsvector column might counterbalance the extra
CPU for recomputing tsvectors.  Or not, but in any case this is
attractive because it doesn't need any handmade maintenance support like
a trigger --- the regular index maintenance code does it all.


I'm afraid it wouldn't work for all cases. We already have the headline()
function, which has to reparse the document to produce a text snippet; it's
very slow and eats most of the select time.
Also, trigger stuff is normal machinery for databases.




It strikes me that we could play the same kind of game we played to make
nextval() references to sequences be recognized as dependencies on
sequences.  Invent a regconfig OID type that's just like regclass
except it handles OIDs of ts_config entries instead of pg_class entries,
and make the first argument of to_tsvector be one of those:

create index ... (to_tsvector('english'::regconfig, big_text_col))

Now dependency.c can be taught to recognize the regconfig Const as
depending on the referenced ts_config entry, and voila we have a
pg_depend entry showing that the index depends on the configuration.
What we actually do about it is another question, but this at least
gets the knowledge into the system.



interesting. And \di could display all configuration stuff for text search
indexes ?



[ thinks some more... ]  If we revived the GENERATED AS patch,
you could imagine computing tsvector columns via GENERATED AS
to_tsvector('english'::regconfig, big_text_col) instead of a
trigger, and then again you've got the dependency exposed where
the system can see it.  I don't wanna try to do that for 8.3,
but it might be a good path to pursue in future, instead of assuming
that triggers will be the way forevermore.

Thoughts?


There's no way to do that within the standard. GENERATED AS says that all
column references in an expression associated with a generated column must
be to columns of the base table containing that generated column.

A tsvector could be the result of a rather complex select involving
several tables.



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] Sorted writes in checkpoint

2007-06-14 Thread Greg Smith

On Thu, 14 Jun 2007, Gregory Maxwell wrote:


Linux has some instrumentation that might be useful for this testing,
echo 1 > /proc/sys/vm/block_dump


That bit was developed for tracking down what was spinning the hard drive
up out of power-saving mode, and I was under the impression that such a
rough feature isn't useful at all here.  I just tried to track down again
where I got that impression from, and I think it was this thread:


http://linux.slashdot.org/comments.pl?sid=231817&cid=18832379

This mentions general issues figuring out who was responsible for a write 
and specifically mentions how you'll have to reconcile two different paths 
if fsync is mixed in.  Not saying it won't work, it's just obvious using 
the block_dump output isn't a simple job.


(For anyone who would like an intro to this feature, try 
http://www.linuxjournal.com/node/7539/print and 
http://toadstool.se/journal/2006/05/27/monitoring-filesystem-activity-under-linux-with-block_dump 
)


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
