Bug Tracker item #3023888, was opened at 2010-07-01 18:19
Message generated for change (Comment added) made by sbajic
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1126467&aid=3023888&group_id=250683

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: daemon
Group: v3.9.0
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Stéphane Berthelot (sberthelot)
Assigned to: Stevan Bajic (sbajic)
Summary: Wrong escaping of bytea (signature data) with postgresql 8.4

Initial Comment:
Escaping of insertion has been modified in 3.9.0 in two ways :
- using PQescapeByteaConn (instead of PQescapeBytea)
- adding E'' syntax in INSERT

But this encodes twice the data and then doesn't work on fresh PostgreSQL 8.4.2 
with or without standard_conforming_string in either UTF8 or SQL_ASCII database 
encoding.

Removing the E'' syntax (since the string is already correctly encoded now) 
makes signature insertion work again

In the git code, it may be necessary to add the if construct in this patch to 
check about postgresql version (but <=7.4 seem not supported anymore officially 
by the postgres team now ...)

----------------------------------------------------------------------

>Comment By: Stevan Bajic (sbajic)
Date: 2010-08-26 09:43

Message:
Hallo Stéphane,

thanks for the patch. I am aware of PQexecParams. The problem is that
PQexecParams requires PostgreSQL protocol version 3.0 or up. However...
DSPAM still supports older PostgreSQL versions. So using PQexecParams is a
now go. But you know what? I think it's time to remove the support for
those ancient PostgreSQL version from DSPAM. If I find time in the next
days then I am going to do exactly that and remove support for anything
older then 7.4.

----------------------------------------------------------------------

Comment By: Stéphane Berthelot (sberthelot)
Date: 2010-08-25 12:38

Message:
Ok I just found it, sorry I thought I've tested that before but it seems I
didn't restart my postgresql server correctly.

The problem comes from : standard_conforming_strings = on

Of course, I could set it to off but there are 2 problems with that :
pgsql docs clearly state the "on" value will become standard "soon"
http://www.postgresql.org/docs/9.0/static/runtime-config-compatible.html
(but still not in 9.0 it seems ...) and I need it to on on this server for
other apps ; this setting seems to be server-wide.

I think the E'...' syntax should be completely indifferent to this setting
according to pgsql manual but it doesn't seem to be as of 8.4.2 (maybe
fixed later I didn't check) ...

I will attach the patch I will be using on production servers that is
rather better than removing E when standard_conforming_strings is on...
My patch uses the PQexecParams way of inserting data, thus removing the
need for "manual" encoding before passing data to postgresql. I personnally
think it is a cleaner, easier, and safer approach for this binary insertion
problem (and certainly future proof...)

I left old code in comments for you to see what I removed easily.

Please let me know what you think of this approach. Of course I fully
tested insertions here with standard_conforming_strings to off and on and
it works flawlessly each time.
I treated the SIG->data as an opaque char* , I checked that set and get
signature functions get data with the same length (which also equals what
in the length field of the db table), yet I don't know how to check if the
data integrity is really preserved ... (it should but I prefer telling you
that I'm not 100% sure of that)

----------------------------------------------------------------------

Comment By: Stéphane Berthelot (sberthelot)
Date: 2010-08-25 10:15

Message:
Yes I got it with DSPAM GIT as of yesterday

Here is the result of the SELECT command :

"number"        "name"  "attnum"        "notnull"       "type"  "primarykey"    
"uniquekey"     "foreignkey"    "foreignkey_fieldnum"   "foreignkey"    
"foreignkey_connnum"    "default"
"1"     "uid"   "1"     "f"     "integer"       "f"     "t"     "\N"    "\N"    
"\N"    "\N"    "\N"
"2"     "signature"     "2"     "f"     "character
varying(128)"   "f"     "t"     "\N"    "\N"    "\N"    "\N"    "\N"
"3"     "data"  "3"     "f"     "bytea" "f"     "f"     "\N"    "\N"    "\N"    
"\N"    "\N"
"4"     "length"        "4"     "f"     "integer"       "f"     "f"     "\N"    
"\N"    "\N"    "\N"    "\N"
"5"     "created_on"    "5"     "f"     "date"  "f"     "f"     "\N"    "\N"    
"\N"    "\N"    "\N"

(I don't know if the tabbed output mode will work with the bug tracker,
otherwise I can output it in XML mode which should be more readable)

I will attach the showall.csv file (in CSV output mode since there are
much more lines)

By the way, two confirmations :
- the problem seem to appear only when inserting '\000' in my SQL_ASCII
database, I have the same problem trying to insert this char "manually"
- I have patched my dspam version to use Prepared Statements then
everything inserts without any problem (and the code is much more simple
...) but then there seem to be a problem in _ds_get_signature getting
eveything correctly , I'm compiling now a debug build to see what happens

* Is is possible (wanted) that the data field was encoded twice "before" ?
(inserting text '\000' (4 chars) instead of E'\000' (1 char)). If you try
SELECT length(...) with both, you'll see different size, but I wonder
whether the display (in psql or pgadmin) is different ...
* Can I reach you by IRC or other mean to solve this issue more quickly
and do tests in a more "interactive" way ?

----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2010-08-24 18:47

Message:
Hello Stéphane,

do you get that error with DSPAM GIT?

Could you please run this and post the result:
SELECT f.attnum AS number, f.attname AS name, f.attnum, f.attnotnull AS
notnull, pg_catalog.format_type(f.atttypid,f.atttypmod) AS type, CASE WHEN
p.contype = 'p' THEN 't' ELSE 'f' END AS primarykey, CASE WHEN p.contype =
'u' THEN 't' ELSE 'f' END AS uniquekey, CASE WHEN p.contype = 'f' THEN
g.relname END AS foreignkey, CASE WHEN p.contype = 'f' THEN p.confkey END
AS foreignkey_fieldnum, CASE WHEN p.contype = 'f' THEN g.relname END AS
foreignkey, CASE WHEN p.contype = 'f' THEN p.conkey END AS
foreignkey_connnum, CASE WHEN f.atthasdef = 't' THEN d.adsrc END AS default
FROM pg_attribute f JOIN pg_class c ON c.oid = f.attrelid JOIN pg_type t ON
t.oid = f.atttypid LEFT JOIN pg_attrdef d ON d.adrelid = c.oid AND d.adnum
= f.attnum LEFT JOIN pg_namespace n ON n.oid = c.relnamespace LEFT JOIN
pg_constraint p ON p.conrelid = c.oid AND f.attnum = ANY ( p.conkey ) LEFT
JOIN pg_class AS g ON p.confrelid = g.oid WHERE c.relkind = 'r'::char AND
n.nspname = 'public' AND c.relname = 'dspam_signature_data' AND f.attnum >
0 ORDER BY number;


And please post the output of the following command:
SHOW ALL;

----------------------------------------------------------------------

Comment By: Stéphane Berthelot (sberthelot)
Date: 2010-08-24 18:05

Message:
Hello Stevan,

I still have the issue, I'm investigating to see what kind of postgresql
setting I may have that cause trouble ...

Excerpt from my sql.errors :
[08/24/2010 17:58:59] 8347: ERROR:  invalid byte sequence for encoding
"SQL_ASCII": 0x00
HINT:  This error can also happen if the byte sequence does not match the
encoding expected by the server, which is controlled by "client_encoding".
: INSERT INTO dspam_signature_data (uid,signature,length,created_on,data)
VALUES
(8,E'8,4c73ec4383477704710225',864,CURRENT_DATE,E'\331\235\237\210]\237\351x\002\000\000\000\303\351\...@\234\230+\302\001\000\000\000\237\3133\326\334\035\021d\002\000\000\000\245\271t\271\254\025\\\351\003\000\000\000+\203u\232\352\337\230\300\001\000\000\000%\031lzs&\377\004\001\000\000\000:}G\235\177\271L\267\001\000\000\000\332\2337\275\210\035V\352\001\000\000\000\215\227\032\247\204\242\332\217\001\000\000\000\206\227K\266B\014\035\015\001\000\000\000\332\2337\275\370\0016\351\001\000\000\000\220\231\356]\333A\372\255\001\000\000\000\2337\2758M.y\214\001\000\000\000\277\344\013\375\245c1a\002\000\000\000\220:\026\273#nF9\002\000\000\000H\252pm\274\362\364\322\001\000\000\000\257\204\364\256\006\316\202\224\001\000\000\000\3664\235&n\235\370\022\002\000\000\000\275\370\0016Y\2502\002\001\000\000\000\345d-&S
\201J\001\000\000\000\035VZ\250\362DS\370\001\000\000\000m\3664\235&\256;v\002\000\000\000&\016S\375\375\0055\342\002\000\000\000,s\337<\006\305(l\002\000\000\000\236\320\375\275\016p\333\237\001\000\000\000\275\270\353\036\357D\235\023\001\000\000\0004\235&nm\031S\314\002\000\000\000m\3664\235&\256\274\020\002\000\000\0007[\222\322\226\021\263\012\002\000\000\000E\244\021\344\221>R\241\001\000\000\000\034\341\253\240a\023\325\252\001\000\000\000aY\2564\015\251,\344\002\000\000\000\305(|\304\246pAR\002\000\000\0000\271\311\255\034H\365\227\002\000\000\000\264\332\375\206}\240\031\211\001\000\000\000O\235\376\005\365\364?d\002\000\000\000\300\332\2337\275\210<)\001\000\000\000\300\332\2337\275X\356+\001\000\000\0004\235&\256|V\334\261\002\000\000\000_%\212\325\022\000\356f\001\000\000\000\300\332\2337\275H\252
\002\000\000\000\256|V\334!RU\032\002\000\000\000\257\204\364\256\006\316\205\344\001\000\000\000\031\255\362)\031*v6\001\000\000\000\245c1\241\305\325\311\316\002\000\000\000m\3664\235&~n\022\002\000\000\000X.\235x\214U\345\326\001\000\000\000hVuu\012\210I\221\001\000\000\000\351NV-\374`\205\241\002\000\000\000m\3664\235&\376\304g\002\000\000\000\275X\356{\335\354\212w\001\000\000\0004\235&\336'']7\313\002\000\000\000s\264\300\270\324\327YI\001\000\000\000\203\207\203\350\016iI\213\001\000\000\000\2337\275X.\235x\214\001\000\000\0007\275\210<y\035[\354\001\000\000\000H\273\216n5v\023\015\001\000\000\000\307\2356\371\001\273j7\001\000\000\000\000\000\000\300\001\273\265q\001\000\000\000\3664\235&~O\235\016\002\000\000\000C\361u\034\222$\326\337\001\000\000\000\347h\276\314\000\262`\300\001\000\000\000\000\000\000\320u\307\235f\001\000\000\000&\256\274\260yA\215\227\002\000\000\000Q5\023\232\230\313}b\001\000\000\0007\275H\252p\035[\354\001\000\000\000\322\243\260\350C\361u\\\001\000\000\000\214e2ak\363Ow\001\000\000\000\235&\276\351NV-\\\002\000\000\000\3664\235&\016S\375\015\002\000\000\000\370y\212\307.''^''\001\000\000\000\245Ff\321,0H\317\001\000\000\000')

and in my pgsql server log (about the same message)
Aug 24 17:58:59 darkstar postgres[8349]: [2-1] 127.0.0.1 dspam dspam
4c73ebdc.209d 0 ERROR:  invalid byte sequence for encoding "SQL_ASCII":
0x00
Aug 24 17:58:59 darkstar postgres[8349]: [2-2] 127.0.0.1 dspam dspam
4c73ebdc.209d 0 HINT:  This error can also happen if the byte sequence does
not match the encoding expected by the server, which is controlled by
"client_encoding".
Aug 24 17:58:59 darkstar postgres[8349]: [2-3] 127.0.0.1 dspam dspam
4c73ebdc.209d 0 STATEMENT:  INSERT INTO dspam_signature_data
(uid,signature,length,created_on,data) VALUES
(8,E'8,4c73ec4383477704710225',864,CURRENT_DATE,E'\331\235\237\210]\237\351x\002\000\000\000\303\351\...@\234\230+\302\001\000\000\000\237\3133\326\334\035\021d\002\000\000\000\245\271t\271\254\025\\\351\003\000\000\000+\203u\232\352\337\230\300\001\000\000\000%\031lzs&\377\004\001\000\000\000:}G\235\177\271L\267\001\000\000\000\332\2337\275\210\035V\352\001\000\000\000\215\227\032\247\204\242\332\217\001\000\000\000\206\227K\266B\014\035\015\001\000\000\000\332\2337\275\370\0016\351\001\000\000\000\220\231\356]\333A\372\255\001\000\000\000\2337\2758M.y\214\001\000\000\000\277\344\013\375\245c1a\002\000\000\000\220:\026\273#nF9\002\000\000\000H\252pm\274\362\364\322\001\000\000\000\257\204\364\256\006\316\202\224\001\000\000\000\3664\235&n\235\370\022\002\000\000\000\275\370\0016Y\2502\002\001\000\000\000\345d-&S
(...)

I am trying to figure out how db encoding or initialization may cause this
error. I'll post a comment again as soon as I have something useful


----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2010-08-22 20:09

Message:
Hallo Stéphane, had you the chance to test latest GIT and verify if the
issue still exists there? Since you are the only one reporting that issue
and PostgreSQL is widely used I am going to set the status to pending.
Please respond in the next two weeks or the Sourceforge bug tracking system
will automatically close this bug report.

----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2010-08-05 21:08

Message:
> The "correct" way to fix is certainly to replace queries containing
bytea
> (that use string copy functions) by stored procedures instead. What do
> you think about it ?
>
Hmm... so you would take a stream of data and transform it with a
procedure on the server to fit into a normal string? Something like the
current commands in 3.9.1 of DSPAM are doing it? That could work if you
query inside DSPAM data from PostgreSQL but how do you want to transmit the
other way around? I mean you have a stream of binary data and you want to
push it to PostgreSQL. How can you assure that the procedure will see the
data the way it should see it? And why do you think that a own crafted
procedure should be better/faster than using
PQescapeByteaConn/PQescapeBytea?


> If ok I'll try to prepare a patch in this direction and
> test it.
>
Please take the time and test DSPAM Git and tell me if it is working there
or not?

----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2010-08-05 12:24

Message:
Can you try again but use this time DSPAM Git version?

btw: You are the first one to report to have an issue with the PostgreSQL
driver inserting signature data. I just tried yesterday with a stock
PostgreSQL 9.0Beta2 and Tom Hendrikx tried this morning with 8.4.4. We both
had no issues inserting signatures.

Anyway... you have an error and this is all what counts. So please post me
that error. Telling me that you have an error is fine and dandy but I need
to error code/text (copy it from your console and paste back here). This
more helpful to me then just the statement that you can not insert
signatures.

----------------------------------------------------------------------

Comment By: Stéphane Berthelot (sberthelot)
Date: 2010-08-05 12:00

Message:
I know it's certainly not the failsafe way to fix the bug ...
I was referring to
http://www.postgresql.org/docs/8.4/static/datatype-binary.html (between
table 8-7 and table 8-8)
The pgsql doc says that the bytea are to be encoded twice since they are
decoded first by the query analyzer then by the bytea processor.

Removing the E'' syntax was the quickiest way for me to "fix it" (since
with stock dspam 3.9.0 and stock postgresql 8.4 the signature data doesn't
get inserted and return an encoding error)

I agree with you on the syntax but it doesn't explain why it doesn't work
with E'' and start working when I remove it ...

The "correct" way to fix is certainly to replace queries containing bytea
(that use string copy functions) by stored procedures instead. What do you
think about it ? If ok I'll try to prepare a patch in this direction and
test it.

----------------------------------------------------------------------

Comment By: Stevan Bajic (sbajic)
Date: 2010-08-04 23:38

Message:
> But this encodes twice the data
>
This is simply not true. The 'E' syntax is telling PostgreSQL that the
data is not a string literal but a escaped string. That's all.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=1126467&aid=3023888&group_id=250683

------------------------------------------------------------------------------
Sell apps to millions through the Intel(R) Atom(Tm) Developer Program
Be part of this innovative community and reach millions of netbook users 
worldwide. Take advantage of special opportunities to increase revenue and 
speed time-to-market. Join now, and jumpstart your future.
http://p.sf.net/sfu/intel-atom-d2d
_______________________________________________
Dspam-devel mailing list
Dspam-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspam-devel

Reply via email to