Re: [HACKERS] can we publish a aset interface?

2010-09-16 Thread Peter Eisentraut
On tis, 2010-09-07 at 20:35 +0200, Pavel Stehule wrote:
 I don't plan to try to move this module to core. And it's useless -
 other languages has not our problems.

I don't know the details of what you're struggling with, but it's a bit
hard to believe that there is a problem that is absolutely unique to the
Czech language.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] can we publish a aset interface?

2010-09-16 Thread Pavel Stehule
2010/9/16 Peter Eisentraut pete...@gmx.net:
 On tis, 2010-09-07 at 20:35 +0200, Pavel Stehule wrote:
 I don't plan to try to move this module to core. And it's useless -
 other languages has not our problems.

 I don't know the details of what you're struggling with, but it's a bit
 hard to believe that there is a problem that is absolutely unique to the
 Czech language.


I think people use a stemmer dictionary, because an ispell
dictionary is probably slow for any language. But there is no
stemmer available for the Czech language. People who need fast processing
just use a simple dictionary, and there are probably no pg
hackers from Poland or Slovakia.

Regards

Pavel







Re: [HACKERS] can we publish a aset interface?

2010-09-16 Thread David Fetter
On Thu, Sep 16, 2010 at 08:43:37PM +0200, Pavel Stehule wrote:
 2010/9/16 Peter Eisentraut pete...@gmx.net:
  On tis, 2010-09-07 at 20:35 +0200, Pavel Stehule wrote:
  I don't plan to try to move this module to core. And it's useless
  - other languages has not our problems.
 
  I don't know the details of what you're struggling with, but it's
  a bit hard to believe that there is a problem that is absolutely
  unique to the Czech language.
 
 I think so people uses a steamer dictionary - because ispell
 dictionary should be slow for any language. But there are not
 available steamer for Czech language. People who need fast
 processing just use a simple dictionary - and probably there are not
 any pg hacker from Poland or Slovakia.

I know of at least one in Poland, and I'd be amazed if there were none
from Slovakia.

Cheers,
David.



[HACKERS] can we publish a aset interface?

2010-09-07 Thread Pavel Stehule
Hello

I would like to use a special memory context for shared data (based on
mmap), and I like the implementation of aset. There is only one difference:
aset is based on malloc, and I would like to use mmap.

malloc() is used in AllocSetContextCreate and AllocSetAlloc. These
procedures would have to be replaced, but the other code and data structures
could be reused. This step could also be useful for the previous discussion
about more comfortable maintenance of shared memory.

What do you think?

Regards

Pavel Stehule



Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Robert Haas
On Tue, Sep 7, 2010 at 4:53 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I would to use a special memory context for shared data (based on
 mmap) and I like impementation of aset. There is only one difference -
 aset is based on malloc and I would to use a mmap.

 malloc() is used in AllocSetContextCreate and AllocSetAlloc. These
 procedures should be overwritten, but other code and data structures
 can be used. This step can be useful for previous discuss about some
 more comfortable maintaining of shared memory.

 What do you think about?

What would this be good for?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Pavel Stehule
2010/9/7 Robert Haas robertmh...@gmail.com:
 On Tue, Sep 7, 2010 at 4:53 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I would to use a special memory context for shared data (based on
 mmap) and I like impementation of aset. There is only one difference -
 aset is based on malloc and I would to use a mmap.

 malloc() is used in AllocSetContextCreate and AllocSetAlloc. These
 procedures should be overwritten, but other code and data structures
 can be used. This step can be useful for previous discuss about some
 more comfortable maintaining of shared memory.

 What do you think about?

 What would this be good for?


I am trying to solve performance problems with Czech tsearch. I tried
serialization and deserialization, but this decreases the load time only to
100 ms (from 500 ms), which is still too much for us. After some experimenting
with mmap, I think there is some chance to preallocate mmap memory and then
use a special memory context based on mmap instead of malloc.
Theoretically I can copy the aset interface - this module will probably never
be in core (this problem is probably local - only Czech) - but that isn't
nice. So I am asking.

Regards

Pavel Stehule






Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Robert Haas
On Tue, Sep 7, 2010 at 9:27 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2010/9/7 Robert Haas robertmh...@gmail.com:
 On Tue, Sep 7, 2010 at 4:53 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 I would to use a special memory context for shared data (based on
 mmap) and I like impementation of aset. There is only one difference -
 aset is based on malloc and I would to use a mmap.

 malloc() is used in AllocSetContextCreate and AllocSetAlloc. These
 procedures should be overwritten, but other code and data structures
 can be used. This step can be useful for previous discuss about some
 more comfortable maintaining of shared memory.

 What do you think about?

 What would this be good for?


 I try to solve performance problems with czech tsearch. I checked
 serialization and deserialization, but this decrease load time only to
 100ms (from 500) that is too much for us. After some gaming with mmap
 I thinking so there some chance to preallocate mmap memory, and then
 use a special memory context based on mmap instead of malloc.
 Teoretically I can copy aset interface - this module probably never be
 in core (this problem is probably local - only Czech), but it isn't
 nice. So I asking.

I don't see how you could do anything with this that you can't do with
the existing implementation.  It's not as if you can store pointers
into an mmap'd block and then count on them being valid the next time
you map the file...  it might not end up at the same offset.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 I would to use a special memory context for shared data (based on
 mmap) and I like impementation of aset. There is only one difference -
 aset is based on malloc and I would to use a mmap.

 malloc() is used in AllocSetContextCreate and AllocSetAlloc. These
 procedures should be overwritten, but other code and data structures
 can be used. This step can be useful for previous discuss about some
 more comfortable maintaining of shared memory.

 What do you think about?

If you're proposing factoring aset.c into two levels, I don't think so.
That code is already a tremendous performance hot-spot and introducing
any more inefficiency into it doesn't seem like a good idea.  Especially
not for shared memory allocation, which is a feature that still has
no buy-in.  Also, you'd need to do more than just replace malloc: you'd
need to add locking capability.  That would make the code even uglier,
and slower, if it has to support locking or no locking dynamically.

Use the mcxt.c switch.  That's what it's there for.

regards, tom lane
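
For readers unfamiliar with the switch mentioned above: mcxt.c dispatches
every allocation through a per-context table of function pointers, so a new
context type can live beside aset.c without modifying it. The following is a
minimal sketch of that pattern only, not PostgreSQL's actual code. All names
(ToyContext, ToyMethods, toy_alloc and so on) are hypothetical, the table is
reduced to two entries, and the mmap-backed bump allocator is just a stand-in
for a custom context. It uses a private anonymous mapping and does no locking,
so it is not a shared-memory allocator.

```c
#define _DEFAULT_SOURCE         /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

typedef struct ToyContext ToyContext;

/* Hypothetical method table: the "switch" that callers dispatch through. */
typedef struct ToyMethods
{
    void   *(*alloc) (ToyContext *cxt, size_t size);
    void    (*destroy) (ToyContext *cxt);
} ToyMethods;

struct ToyContext
{
    const ToyMethods *methods;  /* chosen once, when the context is created */
    char       *base;           /* start of the mmap'd region */
    size_t      used;           /* bytes handed out so far */
    size_t      capacity;       /* total size of the region */
};

/* Bump allocation: round the request up to 8 bytes and carve it off. */
static void *
mmap_bump_alloc(ToyContext *cxt, size_t size)
{
    size_t      aligned = (size + 7) & ~(size_t) 7;

    assert(cxt->used <= cxt->capacity);
    if (aligned < size || aligned > cxt->capacity - cxt->used)
        return NULL;            /* size overflow or region exhausted */
    cxt->used += aligned;
    return cxt->base + cxt->used - aligned;
}

/* Release the whole region at once; individual chunks are never freed. */
static void
mmap_bump_destroy(ToyContext *cxt)
{
    munmap(cxt->base, cxt->capacity);
    cxt->base = NULL;
    cxt->used = 0;
    cxt->capacity = 0;
}

static const ToyMethods mmap_bump_methods = {
    mmap_bump_alloc,
    mmap_bump_destroy
};

/* Preallocate the region and install the method table. Returns 0 on success. */
static int
toy_mmap_context_init(ToyContext *cxt, size_t capacity)
{
    void       *base = mmap(NULL, capacity, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (base == MAP_FAILED)
        return -1;
    cxt->methods = &mmap_bump_methods;
    cxt->base = base;
    cxt->used = 0;
    cxt->capacity = capacity;
    return 0;
}

/* Generic entry points: callers never name the implementation directly. */
static void *
toy_alloc(ToyContext *cxt, size_t size)
{
    return cxt->methods->alloc(cxt, size);
}

static void
toy_destroy(ToyContext *cxt)
{
    cxt->methods->destroy(cxt);
}
```

A caller would run toy_mmap_context_init() once, allocate through toy_alloc(),
and drop everything with toy_destroy(). A second context type would only need
its own ToyMethods table.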



Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Sep 7, 2010 at 9:27 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I try to solve performance problems with czech tsearch. I checked
 serialization and deserialization, but this decrease load time only to
 100ms (from 500) that is too much for us. After some gaming with mmap
 I thinking so there some chance to preallocate mmap memory, and then
 use a special memory context based on mmap instead of malloc.
 Teoretically I can copy aset interface - this module probably never be
 in core (this problem is probably local - only Czech), but it isn't
 nice. So I asking.

 I don't see how you could do anything with this that you can't do with
 the existing implementation.  It's not as if you can store pointers
 into an mmap'd block and then count on them being valid the next time
 you map the file...  it might not end up at the same offset.

More to the point, this entire approach to speeding up dictionary loading
has already been proposed and rejected, and it'll get rejected again if
it's submitted.

The conclusion of the previous discussion was that we should build
precompiled dictionaries, using some pointer-free representation,
which would be stored in files that could be either mmap'd in or just
read in if running on a platform lacking mmap.  There is no need for
any shmem allocator in that implementation.

regards, tom lane
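
A rough sketch of the loading side of such a design, assuming a POSIX system:
map the precompiled file read-only when mmap works, otherwise read it into
ordinary memory. DictBlob, dict_blob_load and dict_blob_free are hypothetical
names, not PostgreSQL functions, and the file contents are treated as opaque
bytes. The sketch says nothing about the dictionary format itself, and the
try_mmap flag exists only so the fallback path can be exercised.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical handle for a loaded, read-only dictionary image. */
typedef struct DictBlob
{
    void       *data;           /* start of the image */
    size_t      len;            /* size of the image in bytes */
    int         is_mapped;      /* 1 if data came from mmap, 0 if from malloc */
} DictBlob;

/* Load the file at path; returns 0 on success, -1 on any failure. */
static int
dict_blob_load(const char *path, int try_mmap, DictBlob *out)
{
    struct stat st;
    int         fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0 || st.st_size <= 0)
    {
        close(fd);
        return -1;
    }
    out->len = (size_t) st.st_size;
    out->is_mapped = 0;
    out->data = MAP_FAILED;

    if (try_mmap)
        out->data = mmap(NULL, out->len, PROT_READ, MAP_PRIVATE, fd, 0);

    if (out->data != MAP_FAILED)
        out->is_mapped = 1;
    else
    {
        /* Fallback: plain read() into malloc'd memory (EINTR not handled). */
        size_t      done = 0;

        out->data = malloc(out->len);
        while (out->data != NULL && done < out->len)
        {
            ssize_t     n = read(fd, (char *) out->data + done,
                                 out->len - done);

            if (n <= 0)
            {
                free(out->data);
                out->data = NULL;   /* ends the loop and signals failure */
            }
            else
                done += (size_t) n;
        }
    }
    close(fd);
    return out->data != NULL ? 0 : -1;
}

/* Release the image with the call that matches how it was obtained. */
static void
dict_blob_free(DictBlob *blob)
{
    if (blob->is_mapped)
        munmap(blob->data, blob->len);
    else
        free(blob->data);
    blob->data = NULL;
    blob->len = 0;
}
```

Because the image holds no absolute pointers, it does not matter which of the
two paths produced it or at which address it ended up.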



Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Alvaro Herrera
Excerpts from Robert Haas's message of mar sep 07 10:13:12 -0400 2010:

  I try to solve performance problems with czech tsearch. I checked
  serialization and deserialization, but this decrease load time only to
  100ms (from 500) that is too much for us. After some gaming with mmap
  I thinking so there some chance to preallocate mmap memory, and then
  use a special memory context based on mmap instead of malloc.
  Teoretically I can copy aset interface - this module probably never be
  in core (this problem is probably local - only Czech), but it isn't
  nice. So I asking.
 
 I don't see how you could do anything with this that you can't do with
 the existing implementation.  It's not as if you can store pointers
 into an mmap'd block and then count on them being valid the next time
 you map the file...  it might not end up at the same offset.

Hmm, surely you could store offsets instead of absolute pointers.
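
A small sketch of what offsets instead of pointers can look like, assuming a
linked list kept inside one contiguous block. OffNode, build_list and sum_list
are made-up names for illustration. Each link is a byte offset from the start
of the block, with 0 reserved to mean "no next node", so the structure stays
valid wherever the block is mapped or copied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A list node that refers to its successor by offset, not by address. */
typedef struct OffNode
{
    uint32_t    next;           /* offset of the next node from block start; 0 = end */
    int32_t     value;
} OffNode;

/*
 * Lay out n nodes back to back inside blk. Returns the offset of the first
 * node, or 0 if n is zero or the block is too small.
 */
static uint32_t
build_list(void *blk, size_t blk_size, const int32_t *values, size_t n)
{
    const uint32_t first = 8;   /* skip 8 bytes so that offset 0 can mean "none" */

    if (n == 0 || blk_size < first + n * sizeof(OffNode))
        return 0;
    for (size_t i = 0; i < n; i++)
    {
        OffNode     node;
        uint32_t    off = first + (uint32_t) (i * sizeof(OffNode));

        node.next = (i + 1 < n) ? off + (uint32_t) sizeof(OffNode) : 0;
        node.value = values[i];
        /* memcpy avoids any alignment assumptions about blk */
        memcpy((char *) blk + off, &node, sizeof node);
    }
    return first;
}

/* Walk the list starting at offset head, resolving each link against base. */
static long
sum_list(const void *base, uint32_t head)
{
    long        total = 0;
    uint32_t    off = head;

    while (off != 0)
    {
        OffNode     node;

        memcpy(&node, (const char *) base + off, sizeof node);
        total += node.value;
        off = node.next;
    }
    return total;
}
```

Copying the block to a different address and walking it from there with the
same head offset gives the same result, which is the property that absolute
pointers lack.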

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Robert Haas
On Tue, Sep 7, 2010 at 11:18 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 Excerpts from Robert Haas's message of mar sep 07 10:13:12 -0400 2010:

  I try to solve performance problems with czech tsearch. I checked
  serialization and deserialization, but this decrease load time only to
  100ms (from 500) that is too much for us. After some gaming with mmap
  I thinking so there some chance to preallocate mmap memory, and then
  use a special memory context based on mmap instead of malloc.
  Teoretically I can copy aset interface - this module probably never be
  in core (this problem is probably local - only Czech), but it isn't
  nice. So I asking.

 I don't see how you could do anything with this that you can't do with
 the existing implementation.  It's not as if you can store pointers
 into an mmap'd block and then count on them being valid the next time
 you map the file...  it might not end up at the same offset.

 Hmm, surely you could store offsets instead of absolute pointers.

Surely you could.  But then where does palloc come in?  As Tom said
upthread, the right thing to do here is to create a pre-compiler that
outputs a pointer-free representation which you can then mmap().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Pavel Stehule
2010/9/7 Robert Haas robertmh...@gmail.com:
 On Tue, Sep 7, 2010 at 9:27 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2010/9/7 Robert Haas robertmh...@gmail.com:
 On Tue, Sep 7, 2010 at 4:53 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 I would to use a special memory context for shared data (based on
 mmap) and I like impementation of aset. There is only one difference -
 aset is based on malloc and I would to use a mmap.

 malloc() is used in AllocSetContextCreate and AllocSetAlloc. These
 procedures should be overwritten, but other code and data structures
 can be used. This step can be useful for previous discuss about some
 more comfortable maintaining of shared memory.

 What do you think about?

 What would this be good for?


 I try to solve performance problems with czech tsearch. I checked
 serialization and deserialization, but this decrease load time only to
 100ms (from 500) that is too much for us. After some gaming with mmap
 I thinking so there some chance to preallocate mmap memory, and then
 use a special memory context based on mmap instead of malloc.
 Teoretically I can copy aset interface - this module probably never be
 in core (this problem is probably local - only Czech), but it isn't
 nice. So I asking.

 I don't see how you could do anything with this that you can't do with
 the existing implementation.  It's not as if you can store pointers
 into an mmap'd block and then count on them being valid the next time
 you map the file...  it might not end up at the same offset.

you can, but you have to do preallocation and you have to use a FIXED flag.

Pavel







Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Robert Haas
On Tue, Sep 7, 2010 at 12:44 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I don't see how you could do anything with this that you can't do with
 the existing implementation.  It's not as if you can store pointers
 into an mmap'd block and then count on them being valid the next time
 you map the file...  it might not end up at the same offset.

 you can, but you have to do preallocation and you have to use a FIXED flag.

MAP_FIXED?  As TFM says: "Because requiring a fixed address for a
mapping is less portable, the use of this option is discouraged."

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [HACKERS] can we publish a aset interface?

2010-09-07 Thread Pavel Stehule
2010/9/7 Robert Haas robertmh...@gmail.com:
 On Tue, Sep 7, 2010 at 12:44 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 I don't see how you could do anything with this that you can't do with
 the existing implementation.  It's not as if you can store pointers
 into an mmap'd block and then count on them being valid the next time
 you map the file...  it might not end up at the same offset.

 you can, but you have to do preallocation and you have to use a FIXED flag.

 MAP_FIXED?  As TFM says: Because requiring a fixed address for a
 mapping is less portable, the use of this option  is  discouraged.

Yes, I know. This will be used for proprietary Czech-language use - 95% of
PostgreSQL installations are on Linux, 10% on MS Windows (in the Czech
Republic).

I don't plan to try to move this module to core. And it would be useless -
other languages do not have our problems.

Regards

Pavel



