Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-21 Thread Kyotaro HORIGUCHI
Hello, At Tue, 13 Sep 2016 11:44:01 +0300, Heikki Linnakangas wrote in <7ff67a45-a53e-4d38-e25d-3a121afea...@iki.fi> > On 09/08/2016 09:35 AM, Kyotaro HORIGUCHI wrote: > > Returning in UTF-8 bloats the result string by about 1.5 times so > > it doesn't seem to make sense
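The "about 1.5 times" figure quoted above is consistent with typical Japanese text taking 2 bytes per character in SJIS and 3 bytes in UTF-8. A minimal sketch, assuming a sample string chosen here for illustration (it is not from the thread):

```python
# Illustrative sample only: kanji and kana that all exist in Shift_JIS.
sample = "日本語のテキスト"

sjis_len = len(sample.encode("shift_jis"))  # 2 bytes per character here
utf8_len = len(sample.encode("utf-8"))      # 3 bytes per character here
ratio = utf8_len / sjis_len

print(sjis_len, utf8_len, ratio)
```

Text that mixes in ASCII would show a smaller ratio, since ASCII is one byte in both encodings.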

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-13 Thread Heikki Linnakangas
On 09/08/2016 09:35 AM, Kyotaro HORIGUCHI wrote:
> Returning in UTF-8 bloats the result string by about 1.5 times so it doesn't seem to make sense comparing with it. But it takes real = 47.35s.
Nice! I was hoping that this would also make the binaries smaller. A few dozen kB of storage is

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-12 Thread Kyotaro HORIGUCHI
At Thu, 8 Sep 2016 07:09:51 +, "Tsunakawa, Takayuki" wrote in <0A3221C70F24FB45833433255569204D1F5E7D4A@G01JPEXMBYT05> > From: pgsql-hackers-ow...@postgresql.org > > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro > > HORIGUCHI > > > > $

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-08 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro HORIGUCHI
> $ time psql postgres -c 'select t.a from t, generate_series(0, )' > /dev/null
>
> real 0m22.696s
> user 0m16.991s
> sys 0m0.182s
>
> Using binsearch the result

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-08 Thread Kyotaro HORIGUCHI
Hello, At Wed, 07 Sep 2016 16:13:04 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI wrote in <20160907.161304.112519789.horiguchi.kyot...@lab.ntt.co.jp> > > Implementing radix tree code, then redefining the format of mapping table > > > to support radix tree,
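As a rough illustration of the radix-tree idea mentioned in the quote, the sketch below walks a tree keyed by successive UTF-8 bytes instead of searching a sorted array. It is not the format proposed in the thread: the nested-dict layout, the function names `build_radix` and `lookup`, and the five-character table are all assumptions made for this example.

```python
def build_radix(chars):
    """Build a byte-indexed tree: one level per UTF-8 byte, leaf = SJIS bytes."""
    root = {}
    for ch in chars:
        node = root
        *prefix, last = ch.encode("utf-8")
        for b in prefix:
            node = node.setdefault(b, {})
        node[last] = ch.encode("shift_jis")
    return root


def lookup(root, ch):
    """Follow one branch per input byte; return SJIS bytes or None if unmapped."""
    node = root
    for b in ch.encode("utf-8"):
        if not isinstance(node, dict) or b not in node:
            return None
        node = node[b]
    return node if isinstance(node, bytes) else None


# Hypothetical miniature mapping table, for illustration only.
tree = build_radix("あいう表ソ")
print(lookup(tree, "表"))  # mapped character
print(lookup(tree, "漢"))  # not in the miniature table
```

The lookup cost depends on the number of bytes in the character rather than on the size of the table, which is the usual motivation for this kind of structure.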

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-07 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro > Thanks, by the way, there's another issue related to SJIS conversion. MS932 > has several characters that have multiple code points. By converting texts > in this encoding to and from

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-07 Thread Kyotaro HORIGUCHI
Hello, At Tue, 6 Sep 2016 03:43:46 +, "Tsunakawa, Takayuki" wrote in <0A3221C70F24FB45833433255569204D1F5E66CE@G01JPEXMBYT05> > > From: pgsql-hackers-ow...@postgresql.org > > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro > > HORIGUCHI >

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
> From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kyotaro > HORIGUCHI Implementing radix tree code, then redefining the format of mapping table > to support radix tree, then modifying mapping generator script are needed. > > If no one opposes to

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Kyotaro HORIGUCHI
Hello, At Mon, 5 Sep 2016 19:38:33 +0300, Heikki Linnakangas wrote in <529db688-72fc-1ca2-f898-b0b99e300...@iki.fi> > On 09/05/2016 05:47 PM, Tom Lane wrote: > > "Tsunakawa, Takayuki" writes: > >> Before digging into the problem, could you share

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tom Lane
"Tsunakawa, Takayuki" writes: > Using multibyte-functions like mb... to process characters would solve > the problem? Well, sure. The problem is (1) finding all the places that need that (I'd estimate dozens to hundreds of places in the core code, and then

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Heikki > But one thing that would help a little, would be to optimize the UTF-8 > -> SJIS conversion. It uses a very generic routine, with a binary search > over a large array of mappings. I bet you
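The "binary search over a large array of mappings" described in the quote can be pictured with the sketch below. This is only an illustration of the general technique, not PostgreSQL's actual routine; the table contents, the integer key layout, and the name `utf8_to_sjis` are assumptions for this example.

```python
import bisect


def _code(ch, encoding):
    """Pack a character's encoded bytes into one integer (big-endian)."""
    return int.from_bytes(ch.encode(encoding), "big")


# Hypothetical miniature table: (UTF-8 code, SJIS code) pairs sorted by UTF-8 code.
pairs = sorted((_code(ch, "utf-8"), _code(ch, "shift_jis")) for ch in "あいう表ソ")
keys = [utf8 for utf8, _ in pairs]


def utf8_to_sjis(ch):
    """Binary-search the sorted table; return the SJIS code or None if unmapped."""
    key = _code(ch, "utf-8")
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return pairs[i][1]
    return None


print(hex(utf8_to_sjis("表")))
print(utf8_to_sjis("漢"))
```

Each lookup takes a number of comparisons that grows with the logarithm of the table size, so a real table with thousands of entries costs noticeably more per character than this toy one.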

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
From: Tom Lane [mailto:t...@sss.pgh.pa.us] > "Tsunakawa, Takayuki" writes: > > Before digging into the problem, could you share your impression on > > whether PostgreSQL can support SJIS? Would it be hopeless? > > I think it's pretty much hopeless. Even if we

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Heikki Linnakangas
On 09/05/2016 05:47 PM, Tom Lane wrote:
> "Tsunakawa, Takayuki" writes:
> > Before digging into the problem, could you share your impression on whether PostgreSQL can support SJIS? Would it be hopeless?
> I think it's pretty much hopeless.
Agreed. But one thing

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tom Lane
"Tsunakawa, Takayuki" writes: > Before digging into the problem, could you share your impression on > whether PostgreSQL can support SJIS? Would it be hopeless? I think it's pretty much hopeless. Even if we were willing to make every bit of code that looks for

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tatsuo Ishii
> Before digging into the problem, could you share your impression on whether > PostgreSQL can support SJIS? Would it be hopeless? Can't we find any > direction to go? Can I find relevant source code by searching specific words > like "ASCII", "HIGH_BIT", "\\" etc? For starters, you could

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
> From: pgsql-hackers-ow...@postgresql.org > [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Tatsuo Ishii > > But what I'm wondering is why PostgreSQL doesn't support SJIS. Was there > any technical difficulty? Is there anything you are worried about if adding > SJIS? > > Yes, there's

Re: [HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tatsuo Ishii
> But what I'm wondering is why PostgreSQL doesn't support SJIS. Was there any > technical difficulty? Is there anything you are worried about if adding SJIS? Yes, there's a technical difficulty with backend code. In many places it is assumed that any string is "ASCII compatible", which means
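One common way to see why SJIS breaks an "ASCII compatible" assumption is that the second byte of a two-byte SJIS character can fall in the ASCII range, for example 0x5C (backslash). The sketch below checks this for two well-known cases; the helper name `ascii_bytes_in_multibyte` and the sample characters are chosen here for illustration and do not come from the thread.

```python
def ascii_bytes_in_multibyte(text, encoding):
    """List (char, byte) pairs where a non-ASCII char's encoding contains a byte < 0x80."""
    hits = []
    for ch in text:
        if ord(ch) < 0x80:
            continue  # plain ASCII characters are not of interest here
        for b in ch.encode(encoding):
            if b < 0x80:
                hits.append((ch, b))
    return hits


sample = "表ソ"  # both end in byte 0x5C when encoded in Shift_JIS
sjis_hits = ascii_bytes_in_multibyte(sample, "shift_jis")
utf8_hits = ascii_bytes_in_multibyte(sample, "utf-8")

print([(ch, hex(b)) for ch, b in sjis_hits])
print(utf8_hits)
```

Code that scans a byte string for a backslash or a quote character would misread such trail bytes in SJIS, whereas in UTF-8 every byte of a multibyte character has its high bit set.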

[HACKERS] Supporting SJIS as a database encoding

2016-09-05 Thread Tsunakawa, Takayuki
Hello, I'd like to propose adding SJIS as a database encoding. You may wonder why SJIS is still necessary in the world of Unicode. The purpose is to achieve comparable performance when migrating legacy database systems from other DBMSs with little modification of applications. Recently,