Bug Tracker item #3023888, was opened at 2010-07-01 18:19 Message generated for change (Comment added) made by sbajic You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1126467&aid=3023888&group_id=250683
Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: daemon Group: v3.9.0 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Stéphane Berthelot (sberthelot) Assigned to: Stevan Bajic (sbajic) Summary: Wrong escaping of bytea (signature data) with postgresql 8.4 Initial Comment: Escaping of insertion has been modified in 3.9.0 in two ways : - using PQescapeByteaConn (instead of PQescapeBytea) - adding E'' syntax in INSERT But this encodes twice the data and then doesn't work on fresh PostgreSQL 8.4.2 with or without standard_conforming_string in either UTF8 or SQL_ASCII database encoding. Removing the E'' syntax (since the string is already correctly encoded now) makes signature insertion work again In the git code, it may be necessary to add the if construct in this patch to check about postgresql version (but <=7.4 seem not supported anymore officially by the postgres team now ...) ---------------------------------------------------------------------- >Comment By: Stevan Bajic (sbajic) Date: 2010-08-26 09:43 Message: Hallo Stéphane, thanks for the patch. I am aware of PQexecParams. The problem is that PQexecParams requires PostgreSQL protocol version 3.0 or up. However... DSPAM still supports older PostgreSQL versions. So using PQexecParams is a now go. But you know what? I think it's time to remove the support for those ancient PostgreSQL version from DSPAM. If I find time in the next days then I am going to do exactly that and remove support for anything older then 7.4. ---------------------------------------------------------------------- Comment By: Stéphane Berthelot (sberthelot) Date: 2010-08-25 12:38 Message: Ok I just found it, sorry I thought I've tested that before but it seems I didn't restart my postgresql server correctly. The problem comes from : standard_conforming_strings = on Of course, I could set it to off but there are 2 problems with that : pgsql docs clearly state the "on" value will become standard "soon" http://www.postgresql.org/docs/9.0/static/runtime-config-compatible.html (but still not in 9.0 it seems ...) and I need it to on on this server for other apps ; this setting seems to be server-wide. I think the E'...' syntax should be completely indifferent to this setting according to pgsql manual but it doesn't seem to be as of 8.4.2 (maybe fixed later I didn't check) ... I will attach the patch I will be using on production servers that is rather better than removing E when standard_conforming_strings is on... My patch uses the PQexecParams way of inserting data, thus removing the need for "manual" encoding before passing data to postgresql. I personnally think it is a cleaner, easier, and safer approach for this binary insertion problem (and certainly future proof...) I left old code in comments for you to see what I removed easily. Please let me know what you think of this approach. Of course I fully tested insertions here with standard_conforming_strings to off and on and it works flawlessly each time. I treated the SIG->data as an opaque char* , I checked that set and get signature functions get data with the same length (which also equals what in the length field of the db table), yet I don't know how to check if the data integrity is really preserved ... (it should but I prefer telling you that I'm not 100% sure of that) ---------------------------------------------------------------------- Comment By: Stéphane Berthelot (sberthelot) Date: 2010-08-25 10:15 Message: Yes I got it with DSPAM GIT as of yesterday Here is the result of the SELECT command : "number" "name" "attnum" "notnull" "type" "primarykey" "uniquekey" "foreignkey" "foreignkey_fieldnum" "foreignkey" "foreignkey_connnum" "default" "1" "uid" "1" "f" "integer" "f" "t" "\N" "\N" "\N" "\N" "\N" "2" "signature" "2" "f" "character varying(128)" "f" "t" "\N" "\N" "\N" "\N" "\N" "3" "data" "3" "f" "bytea" "f" "f" "\N" "\N" "\N" "\N" "\N" "4" "length" "4" "f" "integer" "f" "f" "\N" "\N" "\N" "\N" "\N" "5" "created_on" "5" "f" "date" "f" "f" "\N" "\N" "\N" "\N" "\N" (I don't know if the tabbed output mode will work with the bug tracker, otherwise I can output it in XML mode which should be more readable) I will attach the showall.csv file (in CSV output mode since there are much more lines) By the way, two confirmations : - the problem seem to appear only when inserting '\000' in my SQL_ASCII database, I have the same problem trying to insert this char "manually" - I have patched my dspam version to use Prepared Statements then everything inserts without any problem (and the code is much more simple ...) but then there seem to be a problem in _ds_get_signature getting eveything correctly , I'm compiling now a debug build to see what happens * Is is possible (wanted) that the data field was encoded twice "before" ? (inserting text '\000' (4 chars) instead of E'\000' (1 char)). If you try SELECT length(...) with both, you'll see different size, but I wonder whether the display (in psql or pgadmin) is different ... * Can I reach you by IRC or other mean to solve this issue more quickly and do tests in a more "interactive" way ? ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2010-08-24 18:47 Message: Hello Stéphane, do you get that error with DSPAM GIT? Could you please run this and post the result: SELECT f.attnum AS number, f.attname AS name, f.attnum, f.attnotnull AS notnull, pg_catalog.format_type(f.atttypid,f.atttypmod) AS type, CASE WHEN p.contype = 'p' THEN 't' ELSE 'f' END AS primarykey, CASE WHEN p.contype = 'u' THEN 't' ELSE 'f' END AS uniquekey, CASE WHEN p.contype = 'f' THEN g.relname END AS foreignkey, CASE WHEN p.contype = 'f' THEN p.confkey END AS foreignkey_fieldnum, CASE WHEN p.contype = 'f' THEN g.relname END AS foreignkey, CASE WHEN p.contype = 'f' THEN p.conkey END AS foreignkey_connnum, CASE WHEN f.atthasdef = 't' THEN d.adsrc END AS default FROM pg_attribute f JOIN pg_class c ON c.oid = f.attrelid JOIN pg_type t ON t.oid = f.atttypid LEFT JOIN pg_attrdef d ON d.adrelid = c.oid AND d.adnum = f.attnum LEFT JOIN pg_namespace n ON n.oid = c.relnamespace LEFT JOIN pg_constraint p ON p.conrelid = c.oid AND f.attnum = ANY ( p.conkey ) LEFT JOIN pg_class AS g ON p.confrelid = g.oid WHERE c.relkind = 'r'::char AND n.nspname = 'public' AND c.relname = 'dspam_signature_data' AND f.attnum > 0 ORDER BY number; And please post the output of the following command: SHOW ALL; ---------------------------------------------------------------------- Comment By: Stéphane Berthelot (sberthelot) Date: 2010-08-24 18:05 Message: Hello Stevan, I still have the issue, I'm investigating to see what kind of postgresql setting I may have that cause trouble ... Excerpt from my sql.errors : [08/24/2010 17:58:59] 8347: ERROR: invalid byte sequence for encoding "SQL_ASCII": 0x00 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". : INSERT INTO dspam_signature_data (uid,signature,length,created_on,data) VALUES (8,E'8,4c73ec4383477704710225',864,CURRENT_DATE,E'\331\235\237\210]\237\351x\002\000\000\000\303\351\...@\234\230+\302\001\000\000\000\237\3133\326\334\035\021d\002\000\000\000\245\271t\271\254\025\\\351\003\000\000\000+\203u\232\352\337\230\300\001\000\000\000%\031lzs&\377\004\001\000\000\000:}G\235\177\271L\267\001\000\000\000\332\2337\275\210\035V\352\001\000\000\000\215\227\032\247\204\242\332\217\001\000\000\000\206\227K\266B\014\035\015\001\000\000\000\332\2337\275\370\0016\351\001\000\000\000\220\231\356]\333A\372\255\001\000\000\000\2337\2758M.y\214\001\000\000\000\277\344\013\375\245c1a\002\000\000\000\220:\026\273#nF9\002\000\000\000H\252pm\274\362\364\322\001\000\000\000\257\204\364\256\006\316\202\224\001\000\000\000\3664\235&n\235\370\022\002\000\000\000\275\370\0016Y\2502\002\001\000\000\000\345d-&S \201J\001\000\000\000\035VZ\250\362DS\370\001\000\000\000m\3664\235&\256;v\002\000\000\000&\016S\375\375\0055\342\002\000\000\000,s\337<\006\305(l\002\000\000\000\236\320\375\275\016p\333\237\001\000\000\000\275\270\353\036\357D\235\023\001\000\000\0004\235&nm\031S\314\002\000\000\000m\3664\235&\256\274\020\002\000\000\0007[\222\322\226\021\263\012\002\000\000\000E\244\021\344\221>R\241\001\000\000\000\034\341\253\240a\023\325\252\001\000\000\000aY\2564\015\251,\344\002\000\000\000\305(|\304\246pAR\002\000\000\0000\271\311\255\034H\365\227\002\000\000\000\264\332\375\206}\240\031\211\001\000\000\000O\235\376\005\365\364?d\002\000\000\000\300\332\2337\275\210<)\001\000\000\000\300\332\2337\275X\356+\001\000\000\0004\235&\256|V\334\261\002\000\000\000_%\212\325\022\000\356f\001\000\000\000\300\332\2337\275H\252 \002\000\000\000\256|V\334!RU\032\002\000\000\000\257\204\364\256\006\316\205\344\001\000\000\000\031\255\362)\031*v6\001\000\000\000\245c1\241\305\325\311\316\002\000\000\000m\3664\235&~n\022\002\000\000\000X.\235x\214U\345\326\001\000\000\000hVuu\012\210I\221\001\000\000\000\351NV-\374`\205\241\002\000\000\000m\3664\235&\376\304g\002\000\000\000\275X\356{\335\354\212w\001\000\000\0004\235&\336'']7\313\002\000\000\000s\264\300\270\324\327YI\001\000\000\000\203\207\203\350\016iI\213\001\000\000\000\2337\275X.\235x\214\001\000\000\0007\275\210<y\035[\354\001\000\000\000H\273\216n5v\023\015\001\000\000\000\307\2356\371\001\273j7\001\000\000\000\000\000\000\300\001\273\265q\001\000\000\000\3664\235&~O\235\016\002\000\000\000C\361u\034\222$\326\337\001\000\000\000\347h\276\314\000\262`\300\001\000\000\000\000\000\000\320u\307\235f\001\000\000\000&\256\274\260yA\215\227\002\000\000\000Q5\023\232\230\313}b\001\000\000\0007\275H\252p\035[\354\001\000\000\000\322\243\260\350C\361u\\\001\000\000\000\214e2ak\363Ow\001\000\000\000\235&\276\351NV-\\\002\000\000\000\3664\235&\016S\375\015\002\000\000\000\370y\212\307.''^''\001\000\000\000\245Ff\321,0H\317\001\000\000\000') and in my pgsql server log (about the same message) Aug 24 17:58:59 darkstar postgres[8349]: [2-1] 127.0.0.1 dspam dspam 4c73ebdc.209d 0 ERROR: invalid byte sequence for encoding "SQL_ASCII": 0x00 Aug 24 17:58:59 darkstar postgres[8349]: [2-2] 127.0.0.1 dspam dspam 4c73ebdc.209d 0 HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding". Aug 24 17:58:59 darkstar postgres[8349]: [2-3] 127.0.0.1 dspam dspam 4c73ebdc.209d 0 STATEMENT: INSERT INTO dspam_signature_data (uid,signature,length,created_on,data) VALUES (8,E'8,4c73ec4383477704710225',864,CURRENT_DATE,E'\331\235\237\210]\237\351x\002\000\000\000\303\351\...@\234\230+\302\001\000\000\000\237\3133\326\334\035\021d\002\000\000\000\245\271t\271\254\025\\\351\003\000\000\000+\203u\232\352\337\230\300\001\000\000\000%\031lzs&\377\004\001\000\000\000:}G\235\177\271L\267\001\000\000\000\332\2337\275\210\035V\352\001\000\000\000\215\227\032\247\204\242\332\217\001\000\000\000\206\227K\266B\014\035\015\001\000\000\000\332\2337\275\370\0016\351\001\000\000\000\220\231\356]\333A\372\255\001\000\000\000\2337\2758M.y\214\001\000\000\000\277\344\013\375\245c1a\002\000\000\000\220:\026\273#nF9\002\000\000\000H\252pm\274\362\364\322\001\000\000\000\257\204\364\256\006\316\202\224\001\000\000\000\3664\235&n\235\370\022\002\000\000\000\275\370\0016Y\2502\002\001\000\000\000\345d-&S (...) I am trying to figure out how db encoding or initialization may cause this error. I'll post a comment again as soon as I have something useful ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2010-08-22 20:09 Message: Hallo Stéphane, had you the chance to test latest GIT and verify if the issue still exists there? Since you are the only one reporting that issue and PostgreSQL is widely used I am going to set the status to pending. Please respond in the next two weeks or the Sourceforge bug tracking system will automatically close this bug report. ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2010-08-05 21:08 Message: > The "correct" way to fix is certainly to replace queries containing bytea > (that use string copy functions) by stored procedures instead. What do > you think about it ? > Hmm... so you would take a stream of data and transform it with a procedure on the server to fit into a normal string? Something like the current commands in 3.9.1 of DSPAM are doing it? That could work if you query inside DSPAM data from PostgreSQL but how do you want to transmit the other way around? I mean you have a stream of binary data and you want to push it to PostgreSQL. How can you assure that the procedure will see the data the way it should see it? And why do you think that a own crafted procedure should be better/faster than using PQescapeByteaConn/PQescapeBytea? > If ok I'll try to prepare a patch in this direction and > test it. > Please take the time and test DSPAM Git and tell me if it is working there or not? ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2010-08-05 12:24 Message: Can you try again but use this time DSPAM Git version? btw: You are the first one to report to have an issue with the PostgreSQL driver inserting signature data. I just tried yesterday with a stock PostgreSQL 9.0Beta2 and Tom Hendrikx tried this morning with 8.4.4. We both had no issues inserting signatures. Anyway... you have an error and this is all what counts. So please post me that error. Telling me that you have an error is fine and dandy but I need to error code/text (copy it from your console and paste back here). This more helpful to me then just the statement that you can not insert signatures. ---------------------------------------------------------------------- Comment By: Stéphane Berthelot (sberthelot) Date: 2010-08-05 12:00 Message: I know it's certainly not the failsafe way to fix the bug ... I was referring to http://www.postgresql.org/docs/8.4/static/datatype-binary.html (between table 8-7 and table 8-8) The pgsql doc says that the bytea are to be encoded twice since they are decoded first by the query analyzer then by the bytea processor. Removing the E'' syntax was the quickiest way for me to "fix it" (since with stock dspam 3.9.0 and stock postgresql 8.4 the signature data doesn't get inserted and return an encoding error) I agree with you on the syntax but it doesn't explain why it doesn't work with E'' and start working when I remove it ... The "correct" way to fix is certainly to replace queries containing bytea (that use string copy functions) by stored procedures instead. What do you think about it ? If ok I'll try to prepare a patch in this direction and test it. ---------------------------------------------------------------------- Comment By: Stevan Bajic (sbajic) Date: 2010-08-04 23:38 Message: > But this encodes twice the data > This is simply not true. The 'E' syntax is telling PostgreSQL that the data is not a string literal but a escaped string. That's all. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=1126467&aid=3023888&group_id=250683 ------------------------------------------------------------------------------ Sell apps to millions through the Intel(R) Atom(Tm) Developer Program Be part of this innovative community and reach millions of netbook users worldwide. Take advantage of special opportunities to increase revenue and speed time-to-market. Join now, and jumpstart your future. http://p.sf.net/sfu/intel-atom-d2d _______________________________________________ Dspam-devel mailing list Dspam-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspam-devel