Re: [HACKERS] [PATCHES] regexp_replace

2005-06-06 Thread a_ogawa

Bruce Momjian wrote:
> David Fetter wrote:
> > On Mon, Jun 06, 2005 at 12:02:18PM -0400, Bruce Momjian wrote:
> > >
> > > Patch removed because we already have this functionality.
> >
> > We don't yet have this functionality, as the patch allows for using
> > second and later regex matches () in the replacement pattern.
> >
> > The function is misnamed.  It should be called regex_replace_all() or
> > some such, as it violates the principle of least astonishment by
> > replacing all instances by default.  Every other regex replacement
> > defaults to replace first, not replace all.  Or maybe it should
> > take a bool for replace all, or...?  Anyhow, it's worth a discussion
> > :)
>
> Does anyone want to argue that this additional functionality is
> significant and deserves its own function or an additional argument to
> the existing function?

Oracle 10g has similar functionality, where the function is named
regexp_replace. Typical uses of it include:
- Formatting ZIP codes, telephone numbers, and the like.
  Example: select regexp_replace('111222', '(\\d{3})(\\d{3})(\\d{4})',
           '(\\1) \\2-\\3');
  Result:  (111) 222-
- Collapsing unnecessary whitespace.
  Example: select regexp_replace('A B C', '\\s+', ' ');
  Result:  A B C

I think the function would see more use if the caller could specify
whether to replace all matches or only the first one.
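
To make the "replace first or replace all" idea concrete, here is a
minimal sketch in C on top of POSIX regex.h. It is not the code from
the patch: the function name sketch_regexp_replace, its replace_all
flag, and the fixed-size output buffer are all assumptions made for
illustration. POSIX extended regexes have no \d or \s, so bracket
expressions such as [0-9] and [[:space:]] stand in for them.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUPS 10           /* whole match plus \1 .. \9 */

/* Append n bytes to out, keeping it NUL-terminated; -1 if it will not fit. */
static int append(char *out, size_t outsz, size_t *len, const char *s, size_t n)
{
    if (*len + n >= outsz)
        return -1;
    memcpy(out + *len, s, n);
    *len += n;
    out[*len] = '\0';
    return 0;
}

/*
 * Hypothetical helper (not the patch's code): replace the first match of
 * pattern in src, or every match when replace_all is nonzero.  \1 .. \9 in
 * repl expand to the captured groups.  Returns 0 on success, -1 on a bad
 * pattern or when out is too small.
 */
int sketch_regexp_replace(const char *src, const char *pattern,
                          const char *repl, int replace_all,
                          char *out, size_t outsz)
{
    regex_t     re;
    regmatch_t  m[MAX_GROUPS];
    size_t      len = 0;
    int         rc = 0;
    int         eflags = 0;

    assert(outsz > 0);
    out[0] = '\0';
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    while (rc == 0 && regexec(&re, src, MAX_GROUPS, m, eflags) == 0)
    {
        /* Copy the text that precedes the match. */
        rc = append(out, outsz, &len, src, (size_t) m[0].rm_so);

        /* Expand the replacement, substituting back-references. */
        for (const char *r = repl; rc == 0 && *r; r++)
        {
            if (r[0] == '\\' && r[1] >= '1' && r[1] <= '9')
            {
                const regmatch_t *g = &m[r[1] - '0'];

                if (g->rm_so >= 0)      /* group took part in the match */
                    rc = append(out, outsz, &len, src + g->rm_so,
                                (size_t) (g->rm_eo - g->rm_so));
                r++;                    /* skip the digit */
            }
            else
                rc = append(out, outsz, &len, r, 1);
        }

        src += m[0].rm_eo;
        if (!replace_all)
            break;
        eflags = REG_NOTBOL;            /* later searches do not start a line */
        if (m[0].rm_eo == m[0].rm_so)   /* empty match: step past one byte */
        {
            if (*src == '\0')
                break;
            rc = append(out, outsz, &len, src, 1);
            src++;
        }
    }

    if (rc == 0)                        /* copy whatever follows the last match */
        rc = append(out, outsz, &len, src, strlen(src));
    regfree(&re);
    return rc;
}
```

With a ten-digit input chosen for this sketch, calling it with pattern
"([0-9]{3})([0-9]{3})([0-9]{4})", replacement "(\\1) \\2-\\3" (a C
string literal) and replace_all = 0 rewrites "1112223333" as
"(111) 222-3333". With pattern "[[:space:]]+" and replacement " ",
replace_all = 1 collapses every run of whitespace, while
replace_all = 0 collapses only the first run.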

regards,

---
Atsushi Ogawa


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Cost of XLogInsert CRC calculations

2005-05-17 Thread a_ogawa

I tested crctest on two machines with two versions of gcc.

UltraSPARC III, gcc 2.95.3:
gcc -O1 crctest.c   1.321517 s
gcc -O2 crctest.c   1.099186 s
gcc -O3 crctest.c   1.099330 s
gcc -O1 crctest64.c 1.651599 s
gcc -O2 crctest64.c 1.429089 s
gcc -O3 crctest64.c 1.434296 s

UltraSPARC III, gcc 3.4.3:
gcc -O1 crctest.c   1.209168 s
gcc -O2 crctest.c   1.206253 s
gcc -O3 crctest.c   1.209762 s
gcc -O1 crctest64.c 1.545899 s
gcc -O2 crctest64.c 1.545290 s
gcc -O3 crctest64.c 1.540993 s

Pentium III, gcc 2.95.3:
gcc -O1 crctest.c   1.548432 s
gcc -O2 crctest.c   1.226873 s
gcc -O3 crctest.c   1.227699 s
gcc -O1 crctest64.c 1.362152 s
gcc -O2 crctest64.c 1.259324 s
gcc -O3 crctest64.c 1.259608 s

Pentium III, gcc 3.4.3:
gcc -O1 crctest.c   1.084822 s
gcc -O2 crctest.c   0.921594 s
gcc -O3 crctest.c   0.921910 s
gcc -O1 crctest64.c 1.188287 s
gcc -O2 crctest64.c 1.242013 s
gcc -O3 crctest64.c 1.638812 s

I think the performance can be improved by loop unrolling.
I measured the performance with the loop unrolled either by the
-funroll-loops option or by hand. (The hand-tuned version is attached.)

UltraSPARC III, gcc 2.95.3:
gcc -O2 crctest.c                 1.098880 s
gcc -O2 -funroll-loops crctest.c  0.874165 s
gcc -O2 crctest_unroll.c          0.808208 s

UltraSPARC III, gcc 3.4.3:
gcc -O2 crctest.c                 1.209168 s
gcc -O2 -funroll-loops crctest.c  1.127973 s
gcc -O2 crctest_unroll.c          1.017485 s

Pentium III, gcc 2.95.3:
gcc -O2 crctest.c                 1.226873 s
gcc -O2 -funroll-loops crctest.c  1.077475 s
gcc -O2 crctest_unroll.c          1.051375 s

Pentium III, gcc 3.4.3:
gcc -O2 crctest.c                 0.921594 s
gcc -O2 -funroll-loops crctest.c  0.873614 s
gcc -O2 crctest_unroll.c          0.839384 s
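
For readers without the attachment, hand-unrolling a table-driven CRC
loop might look roughly like the sketch below. This is not the
attached crctest_unroll.c: it assumes the standard reflected CRC-32
(polynomial 0xEDB88320) rather than the CRC macros under test, and the
names crc32_simple, crc32_unrolled, and CRC_STEP are made up for
illustration. The unroll factor of four is likewise an arbitrary
choice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];
static int      crc_table_ready = 0;

/* Build the lookup table for the reflected CRC-32 polynomial. */
static void crc32_init(void)
{
    for (uint32_t i = 0; i < 256; i++)
    {
        uint32_t c = i;

        for (int k = 0; k < 8; k++)
            c = (c & 1) ? (c >> 1) ^ 0xEDB88320u : c >> 1;
        crc_table[i] = c;
    }
    crc_table_ready = 1;
}

/* Fold one input byte into crc and advance the pointer p. */
#define CRC_STEP(crc, p) \
    ((crc) = crc_table[((crc) ^ *(p)++) & 0xFF] ^ ((crc) >> 8))

/* Baseline: one table lookup per loop iteration. */
uint32_t crc32_simple(const unsigned char *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    if (!crc_table_ready)
        crc32_init();
    while (len-- > 0)
        CRC_STEP(crc, p);
    return crc ^ 0xFFFFFFFFu;
}

/* Hand-unrolled: four lookups per iteration, then a tail loop. */
uint32_t crc32_unrolled(const unsigned char *p, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    if (!crc_table_ready)
        crc32_init();
    while (len >= 4)
    {
        CRC_STEP(crc, p);
        CRC_STEP(crc, p);
        CRC_STEP(crc, p);
        CRC_STEP(crc, p);
        len -= 4;
    }
    while (len-- > 0)           /* remaining 0..3 bytes */
        CRC_STEP(crc, p);
    return crc ^ 0xFFFFFFFFu;
}
```

Both functions must return the same value for any input; unrolling
only reduces the number of loop-condition checks and branches per
byte. Whether that pays off depends on the compiler and CPU, as the
timings above show.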

regards,

---

Atsushi Ogawa

crctest.tar.gz
Description: Binary data



Re: [HACKERS] FunctionCallN improvement.

2005-02-02 Thread a_ogawa

Tom Lane wrote:
> Based on this I think we ought to go with the unrolled approach, ie,
> we'll create a macro to initialize the fixed fields of fcinfo but fill
> in the arg and argisnull arrays with code like what's already in
> FunctionCall2:

I agree. The unrolled approach gives good results in most environments.

I think the new macro would look like this:

#define InitFunctionCallInfoData(Fcinfo, Flinfo, Nargs) \
do { \
    (Fcinfo)->flinfo = Flinfo; \
    (Fcinfo)->context = NULL; \
    (Fcinfo)->resultinfo = NULL; \
    (Fcinfo)->isnull = false; \
    (Fcinfo)->nargs = Nargs; \
} while(0)

I think this macro would also be effective in other functions, such as
ExecMakeFunctionResultNoSets. However, we should apply it there only
after actually measuring the effect.

For now the macro will be used only in fmgr.c, but I think we had
better define it in fmgr.h.
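
As a rough illustration of the unrolled approach, the sketch below
applies such a macro in a FunctionCall2-style wrapper. The struct
definitions are cut-down stand-ins written for this example, not the
real declarations from fmgr.h; FUNC_MAX_ARGS = 32 and the add_two
callee are assumptions, and the error check that a real wrapper makes
on fcinfo.isnull is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FUNC_MAX_ARGS 32            /* assumed value for this sketch */

typedef uintptr_t Datum;            /* stand-in for the real Datum */

struct FunctionCallInfoData;
typedef Datum (*PGFunction) (struct FunctionCallInfoData *fcinfo);

typedef struct FmgrInfo             /* simplified: only the call address */
{
    PGFunction  fn_addr;
} FmgrInfo;

typedef struct FunctionCallInfoData /* simplified stand-in */
{
    FmgrInfo   *flinfo;
    void       *context;
    void       *resultinfo;
    bool        isnull;
    short       nargs;
    Datum       arg[FUNC_MAX_ARGS];
    bool        argnull[FUNC_MAX_ARGS];
} FunctionCallInfoData;

/* Initialize only the fixed fields; the caller fills arg[] and argnull[]. */
#define InitFunctionCallInfoData(Fcinfo, Flinfo, Nargs) \
    do { \
        (Fcinfo)->flinfo = (Flinfo); \
        (Fcinfo)->context = NULL; \
        (Fcinfo)->resultinfo = NULL; \
        (Fcinfo)->isnull = false; \
        (Fcinfo)->nargs = (Nargs); \
    } while (0)

/* Unrolled wrapper: no memset, just one store per argument slot in use. */
Datum FunctionCall2(FmgrInfo *flinfo, Datum arg1, Datum arg2)
{
    FunctionCallInfoData fcinfo;

    InitFunctionCallInfoData(&fcinfo, flinfo, 2);
    fcinfo.arg[0] = arg1;
    fcinfo.arg[1] = arg2;
    fcinfo.argnull[0] = false;
    fcinfo.argnull[1] = false;

    return flinfo->fn_addr(&fcinfo);
}

/* Toy callee for the sketch: adds its two arguments. */
static Datum add_two(FunctionCallInfoData *fcinfo)
{
    assert(fcinfo->nargs == 2);
    assert(!fcinfo->argnull[0] && !fcinfo->argnull[1]);
    return fcinfo->arg[0] + fcinfo->arg[1];
}
```

The slots of arg[] and argnull[] beyond nargs stay uninitialized,
which is fine as long as callees never read past nargs.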

regards,

---
A.Ogawa ( [EMAIL PROTECTED] )




[HACKERS] FunctionCallN improvement.

2005-01-31 Thread a_ogawa

When a query that returns many tuples is executed with character code
conversion enabled, FunctionCall3 and FunctionCall5 become a bottleneck.
These functions use MemSet to initialize the whole FunctionCallInfoData
structure, and that costs a lot of cycles.

Test query:
set client_encoding to 'SJIS';
select * from pg_class, pg_amop;
(This query is used only to produce a lot of tuples; it has no
logical meaning.)

Profile result:
Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time   seconds   seconds     calls  s/call   s/call  name
 22.91      1.29     1.29   1562351    0.00     0.00  FunctionCall5
 18.29      2.32     1.03   1602006    0.00     0.00  FunctionCall3
  5.06      2.60     0.28   4892127    0.00     0.00  AllocSetAlloc
  4.88      2.88     0.28   9781322    0.00     0.00  AllocSetFreeIndex
  4.35      3.12     0.24   1587600    0.00     0.00  ExecEvalVar

Most calls to these functions come from printtup.
FunctionCall3 is used to generate the text output.
FunctionCall5 is used for character code conversion.
(printtup -> pq_sendcountedtext -> pg_server_to_client ->
 perform_default_encoding_conversion -> FunctionCall5)

I think we should initialize only those fields of
FunctionCallInfoData that must be initialized, as FunctionCall1
does.

I have two plans for modifying the code.
(a) Change FunctionCall3/FunctionCall5 to work like FunctionCall1.
 This is the simplest, minimal change.

(b) Define a macro that initializes FunctionCallInfoData, and use it
instead of MemSet in all of FunctionCallN, DirectFunctionCallN, and
OidFunctionCallN.
 The macro would be the following:

#define InitFunctionCallInfoData(Fcinfo, Flinfo, Nargs) \
do { \
    (Fcinfo)->flinfo = Flinfo; \
    (Fcinfo)->context = NULL; \
    (Fcinfo)->resultinfo = NULL; \
    (Fcinfo)->isnull = false; \
    (Fcinfo)->nargs = Nargs; \
    MemSet((Fcinfo)->argnull, 0, Nargs * sizeof(bool)); \
} while(0)

I think plan (b) is better, because it improves both source code
consistency and efficiency.
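
To show the difference between the two initialization styles, here is
a small self-contained sketch. The SketchFcinfo struct is a simplified
stand-in for FunctionCallInfoData, FUNC_MAX_ARGS = 32 is an assumed
value, standard memset replaces PostgreSQL's MemSet, and the two
init_* helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define FUNC_MAX_ARGS 32            /* assumed value for this sketch */

typedef unsigned long Datum;        /* stand-in for the real Datum */

typedef struct SketchFcinfo         /* simplified FunctionCallInfoData */
{
    void   *flinfo;
    void   *context;
    void   *resultinfo;
    bool    isnull;
    short   nargs;
    Datum   arg[FUNC_MAX_ARGS];
    bool    argnull[FUNC_MAX_ARGS];
} SketchFcinfo;

/* Plan (b): set the fixed fields and clear only the null flags in use. */
#define InitFunctionCallInfoData(Fcinfo, Flinfo, Nargs) \
    do { \
        (Fcinfo)->flinfo = (Flinfo); \
        (Fcinfo)->context = NULL; \
        (Fcinfo)->resultinfo = NULL; \
        (Fcinfo)->isnull = false; \
        (Fcinfo)->nargs = (Nargs); \
        memset((Fcinfo)->argnull, 0, (Nargs) * sizeof(bool)); \
    } while (0)

/* Current style: clear every byte of the struct, whatever nargs is. */
void init_whole_struct(SketchFcinfo *f, void *flinfo, short nargs)
{
    memset(f, 0, sizeof(*f));
    f->flinfo = flinfo;
    f->nargs = nargs;
}

/* Proposed style: touch only what a callee with nargs arguments reads. */
void init_needed_fields(SketchFcinfo *f, void *flinfo, short nargs)
{
    assert(nargs >= 0 && nargs <= FUNC_MAX_ARGS);
    InitFunctionCallInfoData(f, flinfo, nargs);
}
```

After either call, the fixed fields and argnull[0 .. nargs-1] hold the
same values; the second version simply avoids writing the unused part
of arg[] and argnull[], which makes up most of the struct when nargs
is small.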

Any comments?

regards, 

---
A.Ogawa ( [EMAIL PROTECTED] )

