Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments

2012-04-23 Thread Heikki Linnakangas

On 23.04.2012 02:59, Fujii Masao wrote:

On Thu, Apr 19, 2012 at 2:20 PM, Kyotaro HORIGUCHI
  wrote:

Hello, this is new version of standby checkpoint_segments patch.


Thanks for the patch!


This still makes catching up in standby mode slower, as you get many 
more restartpoints. The reason for ignoring checkpoint_segments during 
recovery was to avoid that. Maybe it's still better than what we have 
currently, I'm not sure, but at least it needs to be discussed. Would be 
good to do some performance testing of recovery with various 
checkpoint_segments and _timeout settings, with and without this patch.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Gsoc2012 idea, tablesample

2012-04-23 Thread Sandro Santilli
On Mon, Apr 23, 2012 at 08:34:44PM +0300, Ants Aasma wrote:
> On Mon, Apr 23, 2012 at 4:37 PM, Sandro Santilli  wrote:
> > I'd love to see enhanced CTID operators, to fetch all visible tuples in a 
> > page
> > using a tidscan.  Something like: WHERE ctid =~ '(501,*)' or a ctidrange.
> 
> Among other things, this would enable user-space implementation of
> tablesample. Given the operator =~(tid, int) that matches the page
> number and planner/executor integration so that it results in a TID
> scan, you would need the following functions:
> 
> random_pages(tbl regclass, samples int) returns int[]
> aggregate function:
> reservoir_sample(item anyelement, samples int) returns anyarray
> 
> Implementations for both of the functions could be adapted from analyze.c.
> 
> Then tablesample could be implemented with the following query:
> SELECT (SELECT reservoir_sample(some_table, 50) AS samples
>FROM some_table WHERE ctid =~ ANY (rnd_pgtids))
> FROM random_pages('some_table', 50) AS rnd_pgtids;
> 
> Actually, now that I think about it, it could actually be implemented
> without any modifications to core at some cost to efficiency.
> random_pages would have to return tid[] that contains for each
> generated pagenumber all possible tids on that page.

This is exactly what I'm after.
I've actually started crafting such a TableSample function, and I'm in the
process of refining the signature, so your suggested interface above is
very useful, thanks!

But I don't understand the reservoir_sample call; what is it supposed to do?
And how flexible would the "anyarray" return value be? Could you return
arbitrarily typed rowtypes from it?

> By making the building blocks available users get more flexibility.
> The downside would be that we can't automatically make better sampling
> methods available.

One approach doesn't preclude the other. TABLESAMPLE will still be useful,
not least for SQL compliance.

--strk; 

  ,--o-. 
  |   __/  |Delivering high quality PostGIS 2.0 !
  |  / 2.0 |http://strk.keybit.net - http://vizzuality.com
  `-o--'


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUG] Checkpointer on hot standby runs without looking checkpoint_segments

2012-04-23 Thread Kyotaro HORIGUCHI
Hello,

> >  - xlog.c: Make StandbyMode shared.
> >
> >  - checkpointer.c: Use IsStandbyMode() to check if postmaster is
> >   under standby mode.
> 
> IsStandbyMode() looks like overkill to me. The standby mode flag is forcibly
> turned off at the end of recovery, but that change doesn't need to be
> visible to the checkpointer process; IOW, the shared flag doesn't need to
> change after startup, like XLogCtl->archiveCleanupCommand, I think. So we
> can simplify the code that shares the flag with the checkpointer. See the
> attached patch (though not tested yet).

Hmm. My understanding is that the spinlock and the volatile qualification
of the pointer used to read shared memory serve, respectively, to ensure
memory consistency on SMP machines with weak memory ordering, and to keep
the compiler from over-optimizing accesses made through a non-volatile
pointer.

You removed both of them in the patch.

If we can tolerate a temporary lack of coherence in shared memory there,
the spinlock could be removed. But the possibility of reading garbage when
accessing standbyMode through XLogCtl itself does not seem tolerable. What
do you think about that?


> The comments in checkpointer.c seem to need further revision. For
> example,
> 
> +  * XLogInsert that actually triggers a checkpoint when
> 
> Currently a checkpoint is triggered by XLogWrite (not XLogInsert), so the
> above needs to be corrected.

I will be careful about such outdated descriptions.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

== My e-mail address has been changed since Apr. 1, 2012.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Usage of planner_ctx

2012-04-23 Thread Amit Kapila
PlannerInfo->planner_ctx is assigned either MessageContext or
PortalHeapMemory, depending on the path from which the planner is invoked.

My understanding is that during the whole planning process, until the plan
is stored in PlannedStmt, all memory is allocated from
CurrentMemoryContext, which is the same as PlannerInfo->planner_ctx.

So what is the use of having PlannerInfo->planner_ctx when it only records
CurrentMemoryContext?



Re: [HACKERS] Aggressive memory consumption in {ts,array}_typanalyze

2012-04-23 Thread Robert Haas
On Wed, Apr 18, 2012 at 11:09 AM, Robert Haas  wrote:
> On Mon, Apr 16, 2012 at 4:58 PM, Noah Misch  wrote:
>> The size hint I chose is fairly arbitrary.  Any suggestions for principled
>> alternatives?
>
> Based on your test results, it doesn't seem like it matters very much
> what you put in there, so I'm inclined to think that num_mcelem is
> fine.  I thought about maybe allowing for a little slop, like
> num_mcelem * 10, but maybe the way you did it is better.  It's
> possible that people will set ridiculously overblown stats targets on
> some columns, and that's certainly going to cost something no matter
> what we do, but there's no point in making that worse than it has to
> be without some clear reason for doing so.

Hearing no further comments, I have committed your patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: add timing of buffer I/O requests

2012-04-23 Thread Robert Haas
On Sat, Apr 14, 2012 at 10:33 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> The internal representation doesn't have to be (and certainly
>> shouldn't be) numeric.  But if you translate to numeric before
>> returning the data to the user, then you have the freedom, in the
>> future, to whack around the internal representation however you like,
>> without breaking backward compatibility.  Choosing float8 for the
>> external representation is fine as long as we're sure we're not ever
>> going to want more than 16 significant digits, but I see no particular
>> value in baking in that assumption.  But perhaps, as the saying goes,
>> 16 digits ought to be enough for anyone.
>
> There's no particular reason to think that Moore's Law is going to
> result in an increase in the fractional precision of timing data.
> It hasn't done so in the past, for sure.

Perhaps, but nobody's explained what we gain out of NOT using numeric.
 "It's slow" doesn't impress me; selecting from a system view doesn't
need to be lightning-fast.
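
To illustrate the precision point (a trivial sketch):

SELECT 123456789.123456789::numeric AS n,  -- every digit preserved
       123456789.123456789::float8  AS f;  -- rounded to ~15-16 significant digits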

However, the main thing is that we need to do *something* here...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Gsoc2012 idea, tablesample

2012-04-23 Thread Ants Aasma
On Mon, Apr 23, 2012 at 4:37 PM, Sandro Santilli  wrote:
> I'd love to see enhanced CTID operators, to fetch all visible tuples in a page
> using a tidscan.  Something like: WHERE ctid =~ '(501,*)' or a ctidrange.

Among other things, this would enable user-space implementation of
tablesample. Given the operator =~(tid, int) that matches the page
number and planner/executor integration so that it results in a TID
scan, you would need the following functions:

random_pages(tbl regclass, samples int) returns int[]
aggregate function:
reservoir_sample(item anyelement, samples int) returns anyarray

Implementations for both of the functions could be adapted from analyze.c.
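
At the SQL level the aggregate declaration might look something like this
(a sketch only; the transition and final function names are hypothetical,
and the internal-state pattern follows array_agg):

CREATE AGGREGATE reservoir_sample(anyelement, int) (
    SFUNC     = reservoir_sample_transfn,  -- C transition function keeping the reservoir
    STYPE     = internal,
    FINALFUNC = reservoir_sample_finalfn   -- returns anyarray, as array_agg's final function does
);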

Then tablesample could be implemented with the following query:
SELECT (SELECT reservoir_sample(some_table, 50) AS samples
   FROM some_table WHERE ctid =~ ANY (rnd_pgtids))
FROM random_pages('some_table', 50) AS rnd_pgtids;

Actually, now that I think about it, it could actually be implemented
without any modifications to core at some cost to efficiency.
random_pages would have to return tid[] that contains for each
generated pagenumber all possible tids on that page.
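
In that case the query might look something like this (an untested sketch;
random_pages here is the hypothetical function above, changed to return
tid[], and TIDs pointing past the last tuple of a page are simply skipped
by the TID scan):

SELECT (SELECT reservoir_sample(some_table, 50) AS samples
        FROM some_table
        WHERE ctid = ANY (rnd_tids))  -- plain equality, works unmodified
FROM random_pages('some_table', 50) AS rnd_tids;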

By making the building blocks available users get more flexibility.
The downside would be that we can't automatically make better sampling
methods available.

Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] psql omits row count under "\x auto"

2012-04-23 Thread Noah Misch
I've been enjoying "\x auto" in .psqlrc, but I noticed the row count
footer is missing when it chooses ordinary output:

[local] test=# \x off
Expanded display is off.
[local] test=# select 1;
 ?column?
----------
        1
(1 row)

[local] test=# \x auto
Expanded display is used automatically.
[local] test=# select 1;
 ?column?
----------
        1

[local] test=#

Looks like the logic in printQuery() needs further treatment.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] B-tree page deletion boundary cases

2012-04-23 Thread Noah Misch
On Sun, Apr 22, 2012 at 12:13:34AM +0530, Nikhil Sontakke wrote:
> Was wondering if there's a similar bug which gets triggered while using
> VACUUM FULL. See for instance this thread:
> 
> http://postgresql.1045698.n5.nabble.com/index-corruption-in-PG-8-3-13-td4257589.html
> 
> This issue has been reported on and off from time to time, and in most
> cases VACUUM or VACUUM FULL appears to be involved. We have usually
> attributed it to hardware issues, and REINDEX has been recommended by
> default as a solution/workaround.

I do not perceive much similarity.  The bug I've raised can produce wrong
query results transiently.  It might permit injecting a tuple into the wrong
spot in the tree, yielding persistent wrong results.  It would not introduce
tree-structural anomalies like sibling pointers directed at zeroed pages or
internal pages in a 1-level tree.  Given the symptoms you reported, I share
Robert's suspicion of WAL replay in your scenario.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ECPG FETCH readahead

2012-04-23 Thread Boszormenyi Zoltan

Hi,

On 2012-04-17 06:48, Michael Meskes wrote:

On Tue, Apr 17, 2012 at 06:02:34AM +0200, Boszormenyi Zoltan wrote:

I listed two scenarios.
1. occasional bump of the readahead window for large requests,
for smaller requests it uses the originally set size
2. permanent bump of the readahead window for large requests
(larger than previously seen), all subsequent requests use
the new size

Both can be implemented easily, which one do you prefer?
If you always use very large requests, 1) behaves like 2)

I'd say let's go for #2. #1 is probably more efficient, but not what the
programmer asked us to do. After all, as a programmer it's easy to
increase the window size accordingly if that's what you want.

Michael


OK, I will implement #2. Another question popped up: what to do with
FETCH ALL? We could use the current readahead window size, or temporarily
bump it to, say, some tens of thousands; we may not know in advance how
many rows "all records" means. Keeping the current window size lowers
performance but saves memory.

Please, don't apply this patch yet. I discovered a rather big hole
that can confuse the cursor position tracking if you do this:

DECLARE mycur CURSOR FOR ...;
MOVE ABSOLUTE n IN mycur;
MOVE BACKWARD m IN mycur;

If (n+m) is greater than the number of rows in the cursor but (n-m) is
smaller, the backend's and the caching code's ideas of where the cursor
is will differ. I need to fix this before the patch can be applied.

That will also need a new round of review. Sorry for that.

Best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig&  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Namespace of array of user defined types is confused by the parser in insert?

2012-04-23 Thread Krzysztof Nienartowicz
Hello,
Sorry for re-posting - I initially posted this in pgsql.sql - probably
this group is more appropriate.

I have a bizarre problem that started to manifest itself after I added a
field that is an array of a composite UDT to a table declared in multiple
schemas.
It is clearly related to how the type's namespace is resolved, and it
shows up for the JDBC client (probably related to the parameterized
query, as the static query works without problems).

Given the types:

drop type if exists periodSearchResults cascade;
create type periodSearchResults as
(
 period float8[], -- [1] - 1st value, kth... value for the same ts
 periodError float8[],
 probability float8[],
 amplitude float8[]
 );

drop type if exists periodTSResult cascade;
create type periodTSResult as
(
   method text,
   periods periodSearchResults[] -- 1st raw ts, 2...kth - residual values
)
;

and a table defined as:

CREATE TABLE timeseriesresult (
   id bigint NOT NULL,
--- ...
   fperiodsearchresults periodTSResult[]
 --- 
);

When the type periodTSResult is defined in more than one schema, I get an
error on INSERT from the JDBC client (or any prepared statement; it works
with a non-parameterized query) in any schema other than the first one
the type was defined in. In this case the first schema it was defined in
is cu7user_test, and the schema of the execution is cu7user_test_2:

ERROR:  column "fperiodsearchresults" is of type periodtsresult[] but
expression is of type cu7user_test.periodtsresult[] at character 1416
HINT:  You will need to rewrite or cast the expression.
STATEMENT:  INSERT INTO cu7user_test_2.timeseriesresult (runid,
catalogid, sourceid, ftimeseriestype, fstate,
fminimizedfunctionminval, freducedchi2, fstopcriteria,
fweightedmodeling, fhomoscedasticitytest, fkurtosis,
fnumpointsobstime, fljungboxrandomnesstest, fmedianabsolutedeviation,
fmax, fmeanobstime, fmean, fmeanerror, fmedian, fmedianerror, fmin,
frange, frobustweightedstddev, fskewness, fstddev, fsymmetrytest,
ftrimmedweightedmean, ftrimmedweightedrange, fvariabilityflag,
fstatvariabilityflag, fweightedkurtosis, fweightedmean,
fweightedmeanconfidenceinterval, fweightedmeanobstime,
fweightednormalizedp2pscatter, fweightedskewness, fweightedstddevdf,
fweightedstddevwdf, fabbe, fchi2, fiqr, foutliermedian, fpstetson,
fpabbe, fpchi2, fpiqr, fpkurtosis, fpoutliermedian, fpskew,
fpweightedkurtosis, fpweightedskew, fstetson, referencetime,
cadencevalue, cadenceerror, cadencename, fperiodsearchmethod,
fweightedpercentileranks, fweightedpercentiles, fotherparameters,
ffundamentalfrequencies, mapfperiodsearchresults,
fperiodsearchresults, fpolynomialcoefficients, ffouriercoefficients,
derivedtschanges) VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10,
$11, $12, $13, $14, $15, $16, $17, $18, $19, $20, $21, $22, $23, $24,
$25, $26, $27, $28, $29, $30, $31, $32, $33, $34, $35, $36, $37, $38,
$39, $40, $41, $42, $43, $44, $45, $46, $47, $48, $49, $50, $51, $52,
$53, $54, $55, $56, $57, $58, $59, $60, $61, $62, $63, $64, $65, $66)
DEBUG:  parse S_10: ROLLBACK


I debugged a little to find out that indeed there are two types
defined in separate schemas:

SELECT t.oid, t.typname,(select nspname from pg_namespace n where
n.oid=typnamespace) nspname ,typarray  FROM pg_catalog.pg_type t where
typname like 'periodts%';

gives:

  oid  |    typname     |    nspname     | typarray
-------+----------------+----------------+----------
 16646 | periodtsresult | cu7user_test   |    16645
 17123 | periodtsresult | cu7user_test_2 |    17122

But for user/schema cu7user_test_2, at parse_coerce.c:86:

if (!can_coerce_type(1, &exprtype, &targettype, ccontext))
    return NULL;

the function returns false, and subsequently coerce_to_target_type
returns NULL, resulting in the above error.
This is caused by the expr pointer at parse_coerce.c:421 returning the
type_id of the type defined in the '1st' schema:

type_id = exprType((Node *) expr); /* <-- type_id returned is 16645,
                                          should be 17122 */

The expr pointer value is already wrong in the parsed list, in the loop
starting at analyze.c:799:

foreach(lc, exprlist)  /* <-- type_id already wrong for the array UDT column */


Is there any way to avoid this error other than having a single type
defined for all schemas? Any hints appreciated.
Postgres 9.1.3, latest JDBC driver.
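
The HINT suggests an explicit cast; I suppose the statement could be
rewritten along these lines (an untested sketch, trimmed down to two
columns), though that means patching the generated SQL:

INSERT INTO cu7user_test_2.timeseriesresult (id, fperiodsearchresults)
VALUES ($1, $2::cu7user_test_2.periodtsresult[]);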

Best regards,
Krzysztof

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Gsoc2012 idea, tablesample

2012-04-23 Thread Sandro Santilli
On Sat, Apr 21, 2012 at 02:28:52PM +0800, Qi Huang wrote:
> 
> Hi, Heikki
...
> > Another idea that Robert Haas suggested was to add support for doing a
> > TID scan for a query like "WHERE ctid < '(501,1)'". That's not enough
> > work for a GSoC project on its own, but could certainly be a part of it.
> 
> the first one and the last one are still not clear. 

The last one was the TID scan on filters like ctid < '(501,1)'.
TID "scans" are the fastest access method, as they directly access
explicitly referenced addresses. Starting from this observation, a
sampling function may select random pages, and tuples within pages, and
access them directly, optimizing accesses by grouping tuples within the
same page so as to fetch them all together.

This is what the ANALYZE command already does when providing samples
for the type analyzers.

Unfortunately, it looks like at the SQL level only the equality operator
triggers a TID scan, so things like "WHERE ctid < '(501,1)'" won't be as
fast as fetching all visible tuples in the first 501 pages.
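
You can see it with EXPLAIN (a sketch against a hypothetical some_table;
the exact plans may vary):

EXPLAIN SELECT * FROM some_table WHERE ctid = '(501,1)';
--  Tid Scan on some_table
--    TID Cond: (ctid = '(501,1)'::tid)

EXPLAIN SELECT * FROM some_table WHERE ctid < '(501,1)';
--  Seq Scan on some_table
--    Filter: (ctid < '(501,1)'::tid)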

I think that's what Heikki was referring to.

I'd love to see enhanced CTID operators, to fetch all visible tuples in a page
using a tidscan.  Something like: WHERE ctid =~ '(501,*)' or a ctidrange.

--strk; 

  ,--o-. 
  |   __/  |Delivering high quality PostGIS 2.0 !
  |  / 2.0 |http://strk.keybit.net - http://vizzuality.com
  `-o--'


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] lock_timeout and common SIGALRM framework

2012-04-23 Thread Marc Cousin
On Mon, 2012-04-23 at 10:53 +0200, Boszormenyi Zoltan wrote:
> On 2012-04-10 09:02, Boszormenyi Zoltan wrote:
> > On 2012-04-06 14:47, Cousin Marc wrote:
> >> On 05/04/12 08:02, Boszormenyi Zoltan wrote:
> >>> On 2012-04-04 21:30, Alvaro Herrera wrote:
>  I think this patch is doing two things: first touching infrastructure
>  stuff and then adding lock_timeout on top of that.  Would it work to
>  split the patch in two pieces?
> 
> >>> Sure. Attached is the split version.
> >>>
> >>> Best regards,
> >>> Zoltán Böszörményi
> >>>
> >> Hi,
> >>
> >> I've started looking at and testing both patches.
> >>
> >> Technically speaking, I think the source looks much better than the
> >> first version of lock timeout, and may help adding other timeouts in the
> >> future. I haven't tested it in depth though, because I encountered the
> >> following problem:
> >>
> >> While testing the patch, I found a way to crash PG. But what's weird is
> >> that it crashes also with an unpatched git version.
> >>
> >> Here is the way to reproduce it (I have done it with a pgbench schema):
> >>
> >> - Set a small statement_timeout (just to save time during the tests)
> >>
> >> Session1:
> >> =#BEGIN;
> >> =#lock TABLE pgbench_accounts ;
> >>
> >> Session 2:
> >> =#BEGIN;
> >> =#lock TABLE pgbench_accounts ;
> >> ERROR:  canceling statement due to statement timeout
> >> =# lock TABLE pgbench_accounts ;
> >>
> >> I'm using \set ON_ERROR_ROLLBACK INTERACTIVE by the way. It can also be
> >> done with a rollback to savepoint of course.
> >>
> >> Session 2 crashes with this: TRAP: FailedAssertion
> >> ("!(locallock->holdsStrongLockCount == 0)", File: "lock.c",
> >> Line: 749).
> >>
> >> It can also be done without a statement_timeout, and a control-C on the
> >> second lock table.
> >>
> >> I didn't touch anything but this. It occurs every time, when asserts
> >> are enabled.
> >>
> >> I tried it on 9.1.3, and I couldn't make it crash with the same
> >> sequence of events. So maybe it's something introduced since then? Or
> >> is the assert still valid?
> >>
> >> Cheers
> >>
> >
> > Attached are the new patches. I rebased them to current GIT and
> > they are expected to be applied after Robert Haas' patch in the
> > "bug in fast-path locking" thread.
> >
> > Now it survives the above scenario.
> >
> > Best regards,
> > Zoltán Böszörményi
> 
> New patch attached, rebased to today's GIT.
> 
> Best regards,
> Zoltán Böszörményi
> 

Ok, I've done what was missing from the review (from when I found the
locking bug the other day), so here is the full review. By the way, this
patch doesn't belong to the current commitfest, but to the next one.



Is the patch in context diff format?
Yes

Does it apply cleanly to the current git master?
Yes

Does it include reasonable tests, necessary doc patches, etc?
The new lock_timeout GUC is documented. There are regression tests.

Read what the patch is supposed to do, and consider:
Does the patch actually implement that?
Yes

Do we want that?
I do. Mostly for administrative jobs which could lock the whole
application. It would be much easier to run reindexes, vacuum full, etc…
without worrying about bringing the application down because of lock
contention.
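
For example (a sketch of the intended usage; I'm assuming the GUC accepts
the same duration syntax as statement_timeout):

SET lock_timeout = '2s';      -- wait at most 2 seconds for any lock
LOCK TABLE pgbench_accounts;  -- now errors out instead of blocking forever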

Do we already have it?
No.

Does it follow SQL spec, or the community-agreed behavior?
I don't know if there is a consensus on this new GUC. statement_timeout
is obviously not in the SQL spec.

Does it include pg_dump support (if applicable)?
Not applicable

Are there dangers?
Yes, as it rewrites all the timeout code. I feel it is much cleaner this
way, as there is a generic set of function managing all sigalarm code,
but it heavily touches this part.

Have all the bases been covered?
I tried all the SQL statements I could think of (select, insert, update,
delete, truncate, drop, create index, adding a constraint, lock).

I tried having statement_timeout, lock_timeout and deadlock_timeout at
very short and close or equal values. It worked too.

Rolling back to a savepoint while holding locks doesn't crash PostgreSQL anymore.

Other timeouts such as archive_timeout and checkpoint_timeout still
work.

Does the feature work as advertised?
Yes

Are there corner cases the author has failed to consider?
I didn't find any.

Are there any assertion failures or crashes?
No.

Does the patch slow down simple tests?
No

If it claims to improve performance, does it?
Not applicable

Does it slow down other things?
No

Does it follow the project coding guidelines?
I think so

Are there portability issues?
No, all the portable code (acquiring locks and manipulating sigalarm) is
the same as before.

Will it work on Windows/BSD etc?
It should. I couldn't test it though.

Are the comments sufficient and accurate?
Yes

Does it do what it says, correctly?
Yes

Does it produce compiler warnings?
No

Can you make it crash?
Not anymore

Is everything done in a way that fits together coherently with other
features/modules?
Yes, I think so. The new way o

[HACKERS] Patch: add conversion from pg_wchar to multibyte

2012-04-23 Thread Alexander Korotkov
Hackers,

The attached patch adds a conversion from a pg_wchar string to a
multibyte string. This functionality is needed for my patch on index
support for regular expression search:
http://archives.postgresql.org/pgsql-hackers/2011-11/msg01297.php .
Analyzing the conversions from multibyte to pg_wchar, I found the
following types of conversion:
1) A trivial conversion for single-byte encodings. It just adds leading
zeros to each byte.
2) A conversion from UTF-8 to Unicode code points.
3) Conversions from the euc* encodings. They write the bytes of a
character into the pg_wchar in inverse order, starting from the low byte
(this explanation assumes a little-endian system).
4) A conversion from the mule encoding. This conversion is unclear to me
and also seems to be lossy.

It was easy to write the inverse conversions for 1-3. I've changed
conversion 4 to behave like 3. I'm not sure my change is OK, because I
didn't understand the original conversion.

--
With best regards,
Alexander Korotkov.


wchar2mb-0.1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers