> The character-code conversion from EUC_JP to SJIS is performed in
> two stages. The first stage converts EUC_JP to MIC. The second stage
> converts MIC to SJIS. (Conversion from SJIS to EUC_JP works the same
> way.)
>
> This is not very efficient, because it requires allocating a buffer
> for MIC and running the conversion calculation twice.
>
> The attached patch enables direct conversion between EUC_JP and
> SJIS. It also reduces the number of calls to pg_mic_mblen.
>
> The effect of the patch, as I measured it, is as follows:
>
> o The test data was created by 'pgbench -i'.
>
> o Test SQL:
>     set client_encoding to 'SJIS';
>     select * from accounts;
>
> o Test results: Linux (CPU: Pentium III, compiler option: -O2)
>     - original: 2.920s
>     - patched : 2.278s
>
> regards,
>
> ---
> Atsushi Ogawa
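
For readers unfamiliar with the idea, here is a minimal sketch of what a direct, single-pass EUC_JP to SJIS conversion can look like. It is not the code from the attached patch: the function name euc_jp_to_sjis_sketch and its interface are made up for illustration, and it only covers ASCII, half-width katakana (SS2) and two-byte JIS X 0208 characters. JIS X 0212 (SS3) and malformed input are simply rejected.

```c
#include <assert.h>
#include <stddef.h>

#define SS2 0x8e    /* EUC_JP prefix for half-width katakana (JIS X 0201) */
#define SS3 0x8f    /* EUC_JP prefix for JIS X 0212; rejected in this sketch */

/*
 * Hypothetical helper, not the function from the patch.
 *
 * Converts len bytes of EUC_JP in src directly to SJIS in dst, without an
 * intermediate MIC buffer. dst must have room for at least len bytes; for
 * the cases handled here the output is never longer than the input.
 * Returns the number of bytes written, or -1 on unsupported or invalid input.
 */
int
euc_jp_to_sjis_sketch(const unsigned char *src, size_t len, unsigned char *dst)
{
    size_t      i = 0;
    size_t      o = 0;

    assert(src != NULL && dst != NULL);

    while (i < len)
    {
        unsigned char c1 = src[i];

        if (c1 < 0x80)
        {
            /* ASCII is identical in both encodings */
            dst[o++] = c1;
            i += 1;
        }
        else if (c1 == SS2)
        {
            /* half-width katakana: drop the SS2 prefix, keep the kana byte */
            if (i + 1 >= len || src[i + 1] < 0xa1 || src[i + 1] > 0xdf)
                return -1;
            dst[o++] = src[i + 1];
            i += 2;
        }
        else if (c1 >= 0xa1 && c1 <= 0xfe)
        {
            /* JIS X 0208: strip the high bits, then apply the SJIS shift */
            unsigned char c2;
            unsigned char j1;
            unsigned char j2;

            if (i + 1 >= len)
                return -1;
            c2 = src[i + 1];
            if (c2 < 0xa1 || c2 > 0xfe)
                return -1;

            j1 = (unsigned char) (c1 - 0x80);
            j2 = (unsigned char) (c2 - 0x80);

            dst[o++] = (unsigned char) (((j1 + 1) >> 1) +
                                        (j1 <= 0x5e ? 0x70 : 0xb0));
            if (j1 & 1)
                /* odd row: trail byte skips over 0x7f */
                dst[o++] = (unsigned char) (j2 + 0x1f + (j2 >= 0x60 ? 1 : 0));
            else
                /* even row: trail byte lands in 0x9f..0xfc */
                dst[o++] = (unsigned char) (j2 + 0x7e);
            i += 2;
        }
        else
        {
            /* SS3 (JIS X 0212) and stray bytes are out of scope here */
            return -1;
        }
    }

    return (int) o;
}
```

The point of the sketch is that each input character is examined once and written straight to the output, so no MIC buffer and no second pass are needed. For example, the EUC_JP bytes A4 A2 map to the SJIS bytes 82 A0.
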

I have tested Atsushi's patches with PostgreSQL 8.0.3 on my notebook PC
running Linux 2.4 and got the following results (database encoding is
EUC_JP):

1) without patches

$ time psql -c 'set client_encoding to 'SJIS';select * from accounts;' test >/dev/null

real    0m4.926s
user    0m1.680s
sys     0m0.090s

2) with patches

$ time psql -c 'set client_encoding to 'SJIS';select * from accounts;' test >/dev/null

real    0m3.816s
user    0m1.560s
sys     0m0.070s

3) no encoding conversions

$ time psql -c 'set client_encoding to 'EUC_JP';select * from accounts;' test >/dev/null

real    0m3.220s
user    0m1.760s
sys     0m0.070s

Measured by real time against case 3, the encoding conversion overhead
drops from 52% to 18% with the patches. This is a huge improvement!

I will commit to current if there are no objections.
--
Tatsuo Ishii