Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?
In message mailman.182.1278126257.1673.python-l...@python.org, Rami Chowdhury wrote:

> I'm sorry, perhaps you've misunderstood what I was refuting. You posted:
>
>> macro: #define Descr(v) &v, sizeof v As written, this works whatever the type of v: array, struct, whatever.
>
> With my code example I found that, as others have pointed out, unfortunately it doesn't work if v is a pointer to a heap-allocated area.

It still correctly passes the address and size of that pointer variable. If that’s not what you intended, you shouldn’t use it.

-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message mailman.2121.1277522302.32709.python-l...@python.org, Robert Kern wrote: On 2010-06-25 19:47 , Lawrence D'Oliveiro wrote: In messagemailman.2046.1277445301.32709.python-l...@python.org, Cameron Simpson wrote: On 25Jun2010 15:38, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote: | In message2010062422432660794-angrybald...@gmailcom, Owen Jacobson | wrote: | Why would I write this when SQLAlchemy, even without using its ORM | features, can do it for me? | | SQLAlchemy doesn’t seem very flexible. Looking at the code examples |http://www.sqlalchemy.org/docs/examples.html, they’re very |procedural: build object, then do a string of separate method calls to |add data to it. I prefer the functional approach, as in my table-update |example. He said without using its ORM. I noticed that. So were those examples I referenced above “using its ORM”? Can you offer better examples “without using its ORM”? http://www.sqlalchemy.org/docs/sqlexpression.html Still full of very repetitive boilerplate. Doesn’t look like it can create a simpler alternative to my example at all. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message mailman.2128.1277537954.32709.python-l...@python.org, Robert Kern wrote:

> On 2010-06-25 19:49 , Lawrence D'Oliveiro wrote:
>
>> Why do people consider input sanitization so hard?
>
> It's not hard per se; it's just repetitive, prone to the occasional mistake, and, frankly, really boring.

But as a programmer, I’m not in the habit of doing “repetitive” and “boring”. Look at the example I posted, and you’ll see. It’s the ones trying to come up with alternatives to my code who produce things that look “repetitive” and “boring”.
Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?
In message mailman.136.1278040489.1673.python-l...@python.org, Rami Chowdhury wrote:

> On Thursday 01 July 2010 16:50:59 Lawrence D'Oliveiro wrote:
>
>> Nevertheless, it is at least self-consistent. To return to my original macro:
>>
>>     #define Descr(v) &v, sizeof v
>>
>> As written, this works whatever the type of v: array, struct, whatever.
>
> Doesn't seem to, sorry. Using Michael Torrie's code example, slightly modified...
>
>     char *buf = malloc(512 * sizeof(char));

Again, you misunderstand the difference between a C array and a pointer. Study the following example, which does work, and you might grasp the point:

    l...@theon:hack> cat test.c
    #include <stdio.h>

    int main(int argc, char ** argv)
      {
        char buf[512];
        const int a = 2, b = 3;
        snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
        fprintf(stdout, buf);
        return 0;
      } /*main*/
    l...@theon:hack> ./test
    2 + 3 = 5
Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?
On Friday 02 July 2010 19:20:26 Lawrence D'Oliveiro wrote: In message mailman.136.1278040489.1673.python-l...@python.org, Rami Chowdhury wrote: On Thursday 01 July 2010 16:50:59 Lawrence D'Oliveiro wrote: Nevertheless, it it at least self-consistent. To return to my original macro: #define Descr(v) v, sizeof v As written, this works whatever the type of v: array, struct, whatever. Doesn't seem to, sorry. Using Michael Torrie's code example, slightly modified... char *buf = malloc(512 * sizeof(char)); Again, you misunderstand the difference between a C array and a pointer. Study the following example, which does work, and you might grasp the point: l...@theon:hack cat test.c #include stdio.h int main(int argc, char ** argv) { char buf[512]; const int a = 2, b = 3; snprintf(buf, sizeof buf, %d + %d = %d\n, a, b, a + b); fprintf(stdout, buf); return 0; } /*main*/ l...@theon:hack ./test 2 + 3 = 5 I'm sorry, perhaps you've misunderstood what I was refuting. You posted: macro: #define Descr(v) v, sizeof v As written, this works whatever the type of v: array, struct, whatever. With my code example I found that, as others have pointed out, unfortunately it doesn't work if v is a pointer to a heap-allocated area. Rami Chowdhury A man with a watch knows what time it is. A man with two watches is never sure. -- Segal's Law +1-408-597-7068 / +44-7875-841-046 / +88-01819-245544 -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Mon, Jun 28, 2010 at 6:44 PM, Gregory Ewing greg.ew...@canterbury.ac.nz wrote: Carl Banks wrote: Indeed, strncpy does not copy that final NUL if it's at or beyond the nth element. Probably the most mind-bogglingly stupid thing about the standard C library, which has lots of mind-boggling stupidity. I don't think it was as stupid as that back when C was designed Actually, strncpy had a very specific use case when it was introduced (dealing with limited-size entries in very old unix filesystem). It should never be used for C string handling, and I don't think it is fair to say it is stupid: it does exactly what it was designed for. It just happens that most people don't know what it was designed for. David -- http://mail.python.org/mailman/listinfo/python-list
Re: [OT] Re: Why Is Escaping Data Considered So Magical?
On Wed, 2010-06-30, Michael Torrie wrote: On 06/30/2010 03:00 AM, Jorgen Grahn wrote: On Wed, 2010-06-30, Michael Torrie wrote: On 06/29/2010 10:17 PM, Michael Torrie wrote: On 06/29/2010 10:05 PM, Michael Torrie wrote: #include stdio.h int main(int argc, char ** argv) { char *buf = malloc(512 * sizeof(char)); const int a = 2, b = 3; snprintf(buf, sizeof buf, %d + %d = %d\n, a, b, a + b); ^^ Make that 512*sizeof(buf) Sigh. Try again. How about 512 * sizeof(char) ? Still doesn't make a different. The code still crashes because the buf is incorrect. I haven't tried to understand the rest ... but never write 'sizeof(char)' unless you might change the type later. 'sizeof(char)' is by definition 1 -- even on odd-ball architectures where a char is e.g. 16 bits. You're right. I normally don't use sizeof(char). This is obviously a contrived example; I just wanted to make the example such that there's no way the original poster could argue that the crash is caused by something other than buf. Then again, it's always a bad idea in C to make assumptions about anything. There are some things you cannot assume, others which few fellow programmers can care to memorize, and others which you often can get away with (like assuming an int is more than 16 bits, when your code is tied to a modern Unix anyway). But sizeof(char) is always 1. If you're on Windows and want to use the unicode versions of everything, you'd need to do sizeof(). So using it here would remind you that when you move to the 16-bit Microsoft unicode versions of snprintf need to change the sizeof(char) lines as well to sizeof(wchar_t). Yes -- see unless you might change the type later above. /Jorgen -- // Jorgen Grahn grahn@ Oo o. . . \X/ snipabacken.se O o . -- http://mail.python.org/mailman/listinfo/python-list
Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?
On Wed, 30 Jun 2010 23:40:06 -0600, Michael Torrie wrote: Given char buf[512], buf's type is char * according to the compiler and every C textbook I know of. No, the type of buf is char [512], i.e. array of 512 chars. If you use buf as an rvalue (rather than an lvalue), it will be implicitly converted to char*. If you take its address, you'll get a pointer to array of 512 chars, i.e. a pointer to the array rather than to the first element. Converting this to a char* will yield a pointer to the first element. If buf was declared char *buf, then taking its address will yield a char**, and converting this to a char* will produce a pointer to the first byte of the pointer, which is unlikely to be useful. -- http://mail.python.org/mailman/listinfo/python-list
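The distinction drawn in this message can be checked mechanically. The following is a minimal sketch, not anything posted in the thread; the function names are hypothetical and exist only for illustration. It compares the address of an array with the address of its first element, then does the same for a pointer variable, and reports what sizeof yields in each case.

```c
#include <stddef.h>

/* Returns 1 if the address of the array object equals the address of
   its first element. For a true array this is expected to hold. */
int array_address_matches_first_element(void)
{
    char buf[512];
    /* &buf has type char (*)[512]; buf decays to char * (i.e. &buf[0]).
       The types differ, but both designate the same location. */
    return (void *)&buf == (void *)buf;
}

/* Returns 1 if the address of a pointer variable equals the address it
   points to. Expected to be 0: the variable is a separate object. */
int pointer_address_matches_target(void)
{
    char storage[512];
    char *p = storage;
    /* &p has type char **; it is where p itself is stored. */
    return (void *)&p == (void *)p;
}

/* sizeof on an array yields the size of the whole array. */
size_t array_size(void)
{
    char buf[512];
    return sizeof buf;
}

/* sizeof on a pointer yields the size of the pointer, not the target. */
size_t pointer_size(void)
{
    char buf[512];
    char *p = buf;
    return sizeof p;
}
```

Called from a small main, the first function should return 1 and the second 0, which is the difference the posters are arguing about: taking the address of an array happens to land on the data, while taking the address of a pointer does not.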
Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?
Nobody wrote:

> On Wed, 30 Jun 2010 23:40:06 -0600, Michael Torrie wrote:
>
>> Given char buf[512], buf's type is char * according to the compiler and every C textbook I know of.

References from Kernighan & Ritchie, _The C Programming Language_, second edition:

> No, the type of buf is char [512], i.e. array of 512 chars. If you use buf as an rvalue (rather than an lvalue), it will be implicitly converted to char*.

K&R2 A7.1

> If you take its address, you'll get a pointer to array of 512 chars, i.e. a pointer to the array rather than to the first element. Converting this to a char* will yield a pointer to the first element.

K&R2 A7.4.2

Mel.
Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?
On 07/01/2010 01:24 AM, Nobody wrote:

> No, the type of buf is char [512], i.e. array of 512 chars. If you use buf as an rvalue (rather than an lvalue), it will be implicitly converted to char*.

Yes, this is true. I misstated. I meant that most textbooks I've seen say to just use the variable in an *rvalue* as a pointer (can't think of any lvalue use of an array). K&R states that arrays (in C anyway) are always *passed* by pointer, hence when you pass an array to a function it automatically decays into a pointer. Which is what you said. So no need for & and the compiler warning you get with it. That's all. If the OP was striving for pedantic correctness, he would use &buf[0].
Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?
On 7/1/2010 8:36 AM, Mel wrote: Nobody wrote: On Wed, 30 Jun 2010 23:40:06 -0600, Michael Torrie wrote: Given char buf[512], buf's type is char * according to the compiler and every C textbook I know of. References from Kernighan Ritchie _The C Programming Language_ second edition: No, the type of buf is char [512], i.e. array of 512 chars. If you use buf as an rvalue (rather than an lvalue), it will be implicitly converted to char*. Yes, unfortunately. The approach to arrays in C is just broken, for historical reasons. To understand C, you have to realize that in the early versions, function declarations weren't visible when function calls were compiled. That came later, in ANSI C. So parameter passing in C is very dumb. Billions of crashes due to buffer overflows later, we're still suffering from that mistake. But this isn't a Python issue. John Nagle -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message mailman.2370.1277871088.32709.python-l...@python.org, Michael Torrie wrote: On 06/29/2010 06:26 PM, Lawrence D'Oliveiro wrote: I'm not sure you understood me correctly, because I advocate *not* doing input sanitization. Hard or not -- I don't want to know, because I don't want to do it. But no-one has yet managed to come up with an alternative that involves less work. Your case is still not persuasive. So persuade me. I have given an example of code written the way I do it. Now let’s see you rewrite it using your preferred technique, just to prove that your way is simpler and easier to understand. Enough hand-waving, let’s see some code! -- http://mail.python.org/mailman/listinfo/python-list
Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?
In message 4c2ccd9c$0$1643$742ec...@news.sonic.net, John Nagle wrote:

> The approach to arrays in C is just broken, for historical reasons.

Nevertheless, it is at least self-consistent. To return to my original macro:

    #define Descr(v) &v, sizeof v

As written, this works whatever the type of v: array, struct, whatever.

> So parameter passing in C is very dumb.

Nothing to do with the above issue.
Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?
On Thursday 01 July 2010 16:50:59 Lawrence D'Oliveiro wrote:

> Nevertheless, it is at least self-consistent. To return to my original macro:
>
>     #define Descr(v) &v, sizeof v
>
> As written, this works whatever the type of v: array, struct, whatever.

Doesn't seem to, sorry. Using Michael Torrie's code example, slightly modified...

    [r...@tigris ~]$ cat example.c
    #include <stdio.h>
    #define Descr(v) &v, sizeof v

    int main(int argc, char ** argv)
    {
        char *buf = malloc(512 * sizeof(char));
        const int a = 2, b = 3;
        snprintf(Descr(buf), "%d + %d = %d\n", a, b, a + b);
        fprintf(stdout, buf);
        free(buf);
        return 0;
    } /*main*/

    [r...@tigris ~]$ clang example.c
    example.c:11:18: warning: incompatible pointer types passing 'char **', expected 'char *' [-pedantic]
        snprintf(Descr(buf), "%d + %d = %d\n", a, b, a + b);
                 ^~
    example.c:4:18: note: instantiated from:
    #define Descr(v) &v, sizeof v
                     ^~~~
    <snip>

    [r...@tigris ~]$ ./a.out
    Segmentation fault

Rami Chowdhury
Passion is inversely proportional to the amount of real information available. -- Benford's Law of Controversy
+1-408-597-7068 / +44-7875-841-046 / +88-01819-245544
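The size half of the disagreement is that sizeof only reports a usable buffer capacity when its operand is a real array. One common way around that, sketched here under the assumption that the capacity is simply tracked next to the pointer, is to pass the destination and its capacity explicitly. The helper names format_sum, sum_into_array and sum_into_heap are hypothetical and do not come from the thread.

```c
#include <stdio.h>
#include <stdlib.h>

/* Formats "a + b = sum" into a caller-supplied buffer of known capacity. */
int format_sum(char *dst, size_t cap, int a, int b)
{
    return snprintf(dst, cap, "%d + %d = %d\n", a, b, a + b);
}

/* Array case: sizeof buf is the whole array, so it is a valid capacity. */
int sum_into_array(char *out, size_t out_cap)
{
    char buf[512];
    int n = format_sum(buf, sizeof buf, 2, 3);
    snprintf(out, out_cap, "%s", buf);   /* hand the result back */
    return n;
}

/* Heap case: sizeof buf would be sizeof(char *), so keep the capacity
   in a separate variable and pass that instead. */
int sum_into_heap(char *out, size_t out_cap)
{
    const size_t cap = 512;
    char *buf = malloc(cap);
    if (buf == NULL)
        return -1;
    int n = format_sum(buf, cap, 2, 3);
    snprintf(out, out_cap, "%s", buf);
    free(buf);
    return n;
}
```

Both variants pass a plain char * and a capacity the programmer knows to be correct, so the same call works whether the storage is an array or a heap block.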
Re: [OT] Re: Why Is Escaping Data Considered So Magical?
On Wed, 2010-06-30, Michael Torrie wrote: On 06/29/2010 10:17 PM, Michael Torrie wrote: On 06/29/2010 10:05 PM, Michael Torrie wrote: #include stdio.h int main(int argc, char ** argv) { char *buf = malloc(512 * sizeof(char)); const int a = 2, b = 3; snprintf(buf, sizeof buf, %d + %d = %d\n, a, b, a + b); ^^ Make that 512*sizeof(buf) Sigh. Try again. How about 512 * sizeof(char) ? Still doesn't make a different. The code still crashes because the buf is incorrect. I haven't tried to understand the rest ... but never write 'sizeof(char)' unless you might change the type later. 'sizeof(char)' is by definition 1 -- even on odd-ball architectures where a char is e.g. 16 bits. /Jorgen -- // Jorgen Grahn grahn@ Oo o. . . \X/ snipabacken.se O o . -- http://mail.python.org/mailman/listinfo/python-list
Ancient C string conventions (was Re: Why Is Escaping Data Considered So Magical?)
On Wed, 2010-06-30, Carl Banks wrote: On Jun 28, 2:44 am, Gregory Ewing greg.ew...@canterbury.ac.nz wrote: Carl Banks wrote: Indeed, strncpy does not copy that final NUL if it's at or beyond the nth element. Probably the most mind-bogglingly stupid thing about the standard C library, which has lots of mind-boggling stupidity. I don't think it was as stupid as that back when C was designed. Every byte of memory was precious in those days, and if you had, say, 10 bytes allocated for a string, you wanted to be able to use all 10 of them for useful data. So the convention was that a NUL byte was used to mark the end of the string *if it didn't fill all the available space*. I can't think of any function in the standard library that observes that convention, Me neither, except strncpy(), according to above. which inclines me to disbelieve this convention ever really existed. If it did, there would be functions to support it. Maybe others existed, but got killed off early. That would make strncpy() a living fossil, like the Coelacanth ... For that matter, I'm not really inclined to believe bytes were *that* precious in those days. It's somewhat believable. If I handled thousands of student names in a big C array char[30][], I would resent the fact that 1/30 of the memory was wasted on NUL bytes. I'm sure plenty of people have done what Gregory suggests ... but it's not clear that strncpy() was designed to support those people. I suppose it's all lost in history. /Jorgen -- // Jorgen Grahn grahn@ Oo o. . . \X/ snipabacken.se O o . -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On 29Jun2010 21:49, Carl Banks pavlovevide...@gmail.com wrote: | On Jun 28, 2:44 am, Gregory Ewing greg.ew...@canterbury.ac.nz wrote: | Carl Banks wrote: | Indeed, strncpy does not copy that final NUL if it's at or beyond the | nth element. Probably the most mind-bogglingly stupid thing about the | standard C library, which has lots of mind-boggling stupidity. | | I don't think it was as stupid as that back when C was | designed. Every byte of memory was precious in those days, | and if you had, say, 10 bytes allocated for a string, you | wanted to be able to use all 10 of them for useful data. | | So the convention was that a NUL byte was used to mark | the end of the string *if it didn't fill all the available | space*. | | I can't think of any function in the standard library that observes | that convention, which inclines me to disbelieve this convention ever | really existed. If it did, there would be functions to support it. | | For that matter, I'm not really inclined to believe bytes were *that* | precious in those days. Jeez. PDP-11s, 16 bit addressing, tiny tiny disc drives! The original V7 (and probably earlier) UNIX filesystem has 16 byte directory entries: 2 bytes for an inode and 14 bytes for the name. You could use 14 bytes of that name, and strncpy makes it effective to work with that data structure. Shortening something already only 14 bytes (the name) _is_ a big ask, and it is well work the unusual convention in play. | The obvious rationale behind strncpy's stupid behavior is that it's | not a string function at all, but a memory block function, that stops | at a NUL in case you don't care what's after the NUL in a block. But | it leads you to believe it's a string function by it's name. Bah. It's for copying a _string_ into a _buffer_! Strangely, since it starts with a string (NUL-terminated byte sequence) it begins with str. And it _is_ copying, but not into another string. It is special purpose but perfectly reasonable for the problem at hand. 
-- Cameron Simpson c...@zip.com.au DoD#743 http://www.cskk.ezoshosting.com/cs/ If it ain't broken, keep playing with it. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In article mailman.14.1277891765.1673.python-l...@python.org, Cameron Simpson c...@zip.com.au wrote: Jeez. PDP-11s, 16 bit addressing, tiny tiny disc drives! What you talking about, tiny? An RK-05 was huge! Why would anybody ever need more than that? The original V7 (and probably earlier) UNIX filesystem has 16 byte directory entries Certainly earlier. I used v6, and it was like that there. I'm reasonably sure it pre-dated v6, however. -- http://mail.python.org/mailman/listinfo/python-list
Re: [OT] Re: Why Is Escaping Data Considered So Magical?
On 06/30/2010 03:00 AM, Jorgen Grahn wrote: On Wed, 2010-06-30, Michael Torrie wrote: On 06/29/2010 10:17 PM, Michael Torrie wrote: On 06/29/2010 10:05 PM, Michael Torrie wrote: #include stdio.h int main(int argc, char ** argv) { char *buf = malloc(512 * sizeof(char)); const int a = 2, b = 3; snprintf(buf, sizeof buf, %d + %d = %d\n, a, b, a + b); ^^ Make that 512*sizeof(buf) Sigh. Try again. How about 512 * sizeof(char) ? Still doesn't make a different. The code still crashes because the buf is incorrect. I haven't tried to understand the rest ... but never write 'sizeof(char)' unless you might change the type later. 'sizeof(char)' is by definition 1 -- even on odd-ball architectures where a char is e.g. 16 bits. You're right. I normally don't use sizeof(char). This is obviously a contrived example; I just wanted to make the example such that there's no way the original poster could argue that the crash is caused by something other than buf. Then again, it's always a bad idea in C to make assumptions about anything. If you're on Windows and want to use the unicode versions of everything, you'd need to do sizeof(). So using it here would remind you that when you move to the 16-bit Microsoft unicode versions of snprintf need to change the sizeof(char) lines as well to sizeof(wchar_t). -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Jun 30, 2:55 am, Cameron Simpson c...@zip.com.au wrote: On 29Jun2010 21:49, Carl Banks pavlovevide...@gmail.com wrote: | On Jun 28, 2:44 am, Gregory Ewing greg.ew...@canterbury.ac.nz wrote: | Carl Banks wrote: | Indeed, strncpy does not copy that final NUL if it's at or beyond the | nth element. Probably the most mind-bogglingly stupid thing about the | standard C library, which has lots of mind-boggling stupidity. | | I don't think it was as stupid as that back when C was | designed. Every byte of memory was precious in those days, | and if you had, say, 10 bytes allocated for a string, you | wanted to be able to use all 10 of them for useful data. | | So the convention was that a NUL byte was used to mark | the end of the string *if it didn't fill all the available | space*. | | I can't think of any function in the standard library that observes | that convention, which inclines me to disbelieve this convention ever | really existed. If it did, there would be functions to support it. | | For that matter, I'm not really inclined to believe bytes were *that* | precious in those days. Jeez. PDP-11s, 16 bit addressing, tiny tiny disc drives! The original V7 (and probably earlier) UNIX filesystem has 16 byte directory entries: 2 bytes for an inode and 14 bytes for the name. You could use 14 bytes of that name, and strncpy makes it effective to work with that data structure. Shortening something already only 14 bytes (the name) _is_ a big ask, and it is well work the unusual convention in play. You are talking about fixed-length memory records, not strings. I'm saying that bytes were not so precious that, when you operate on *actual strings*, that you need to desperately cut off nul terminators to save space. | The obvious rationale behind strncpy's stupid behavior is that it's | not a string function at all, but a memory block function, that stops | at a NUL in case you don't care what's after the NUL in a block. 
| But it leads you to believe it's a string function by its name.

> Bah. It's for copying a _string_ into a _buffer_! Strangely, since it starts with a string (NUL-terminated byte sequence) it begins with str. And it _is_ copying, but not into another string.

I'm going to disagree. The input of strncpy can be either a string or a memory block, and the output can only be a memory block. In other words, neither the source nor destination has to be a string. This is a memory block function, not a string function. The correct name for this function should have been memcpytonul. Even if you disagree, then you must admit it should have been called strcpytobuf.

Nothing about the name strncpy gives the slightest suggestion that the destination is not a string. Based on analogy from other str functions, none of which have any sources or destinations that are memory blocks, one would logically expect that strncpy's destination was a string. It defies common sense. And there should have been an actual, correctly working strncpy in the standard library that copies and truncates actual strings.

> It is special purpose but perfectly reasonable for the problem at hand.

The usefulness of strncpy's behavior for writing fixed-length memory blocks is not in question here. The thing that's mind-bogglingly stupid is that the function that does this is called strncpy.

Carl Banks
Re: Why Is Escaping Data Considered So Magical?
Cameron Simpson c...@zip.com.au writes: The original V7 (and probably earlier) UNIX filesystem has 16 byte directory entries: 2 bytes for an inode and 14 bytes for the name. You could use 14 bytes of that name, and strncpy makes it effective to work with that data structure. Why not use memcpy for that? -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On 30Jun2010 12:19, Paul Rubin no.em...@nospam.invalid wrote: | Cameron Simpson c...@zip.com.au writes: | The original V7 (and probably earlier) UNIX filesystem has 16 byte directory | entries: 2 bytes for an inode and 14 bytes for the name. You could use 14 | bytes of that name, and strncpy makes it effective to work with that data | structure. | | Why not use memcpy for that? Because when you've pulled names _out_ of the directory structure they're conventional C strings, ready for conventional C string mucking about: NUL terminated, with no expectation that any memory is allocated beyond the NUL. Think of strncpy as a conversion function. Your source is a conventional C string of unknown size, your destination is a NUL padded buffer of known size. Copy at most n bytes of this string into the buffer, pad with NULs. Cheers, -- Cameron Simpson c...@zip.com.au DoD#743 http://www.cskk.ezoshosting.com/cs/ We had the experience, but missed the meaning. - T.S. Eliot -- http://mail.python.org/mailman/listinfo/python-list
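Read as a conversion function, strncpy turns a C string into a fixed-width, NUL-padded field, and something else has to turn the field back into a C string. The sketch below illustrates that round trip; the struct and helper names are hypothetical, and the 14-byte width is taken from the directory entry described in the message, not from any actual filesystem header.

```c
#include <string.h>

#define NAME_LEN 14  /* width of the name field described above */

/* Hypothetical record loosely mirroring the old directory entry. */
struct dir_entry {
    unsigned short inode;
    char name[NAME_LEN];   /* NUL-padded, not necessarily NUL-terminated */
};

/* C string -> fixed field: copies at most NAME_LEN bytes and pads the
   remainder with NULs. A name of NAME_LEN bytes or more leaves no NUL. */
void entry_set_name(struct dir_entry *e, const char *name)
{
    strncpy(e->name, name, NAME_LEN);
}

/* Fixed field -> C string: dst must have room for NAME_LEN + 1 bytes.
   The terminator is added here because the field may not contain one. */
void entry_get_name(const struct dir_entry *e, char *dst)
{
    memcpy(dst, e->name, NAME_LEN);
    dst[NAME_LEN] = '\0';
}
```

The field can use all 14 bytes for data, which is the space saving being argued about, at the cost that e->name must never be handed directly to a function expecting a NUL-terminated string.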
Re: [OT] Re: Why Is Escaping Data Considered So Magical?
In message mailman.2369.1277870727.32709.python-l...@python.org, Michael Torrie wrote: Okay, I will. Your code passes a char** when a char* is expected. No it doesn’t. Consider this variation where I use a dynamically allocated buffer instead of static: And so you misunderstand the difference between a C array and a pointer. -- http://mail.python.org/mailman/listinfo/python-list
Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?
On 06/30/2010 06:36 PM, Lawrence D'Oliveiro wrote:

> In message mailman.2369.1277870727.32709.python-l...@python.org, Michael Torrie wrote:
>
>> Okay, I will. Your code passes a char** when a char* is expected.
>
> No it doesn’t.

You're right; it doesn't. Your code passes char (*)[512].

    warning: passing argument 1 of ‘snprintf’ from incompatible pointer type
    /usr/include/stdio.h:385: note: expected ‘char * __restrict__’ but argument is of type ‘char (*)[512]’

> And so you misunderstand the difference between a C array and a pointer.

You make a pretty big assumption. Given char buf[512], buf's type is char * according to the compiler and every C textbook I know of. With a static char array, there's no need to take its address since it *is* the address of the first element. Taking the address can lead to problems if you ever substitute a dynamically-allocated buffer for the statically-allocated one. For one-dimensional arrays at least, static arrays and pointers are interchangeable when calling snprintf. You do not agree?

Anyway, this is far enough away from Python.
Re: Why Is Escaping Data Considered So Magical?
On 29/06/2010 01:55, Roy Smith wrote: [snips] The nice thing about null-terminated strings is how portable they have been over various word lengths. The bad thing about null-terminated strings is the number of off-by-one errors they've helped to create. I obviously have never created an off-by-one error myself. :) Kindest regards. Mark Lawrence. -- http://mail.python.org/mailman/listinfo/python-list
Re: [OT] Re: Why Is Escaping Data Considered So Magical?
In message mailman.2332.1277785175.32709.python-l...@python.org, Kushal Kumaran wrote:

> On Tue, Jun 29, 2010 at 5:56 AM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
>
>> Why does this work, then:
>>
>>     l...@theon:hack> cat test.c
>>     #include <stdio.h>
>>
>>     int main(int argc, char ** argv)
>>       {
>>         char buf[512];
>>         const int a = 2, b = 3;
>>         snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
>>         fprintf(stdout, buf);
>>         return 0;
>>       } /*main*/
>>     l...@theon:hack> ./test
>>     2 + 3 = 5
>
> By accident.

I have yet to find an architecture or C compiler where it DOESN’T work. Feel free to try and prove me wrong.
Re: Why Is Escaping Data Considered So Magical?
In message slrni2f8v2.j19.grahn+n...@frailea.sa.invalid, Jorgen Grahn wrote: On Sat, 2010-06-26, Lawrence D'Oliveiro wrote: In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn wrote: I thought it was well-known that the solution is *not* to try to sanitize the input -- it's to switch to an interface which doesn't involve generating an intermediate executable. In the Python example, that would be something like os.popen2(['zcat', '-f', '--', untrusted]). That’s what I mean. Why do people consider input sanitization so hard? I'm not sure you understood me correctly, because I advocate *not* doing input sanitization. Hard or not -- I don't want to know, because I don't want to do it. But no-one has yet managed to come up with an alternative that involves less work. -- http://mail.python.org/mailman/listinfo/python-list
Re: [OT] Re: Why Is Escaping Data Considered So Magical?
On 06/29/2010 06:25 PM, Lawrence D'Oliveiro wrote:

> I have yet to find an architecture or C compiler where it DOESN’T work. Feel free to try and prove me wrong.

Okay, I will. Your code passes a char** when a char* is expected. Every compiler I know of will give you a *warning*. Mistaking char*, char**, and char[] is a common mistake that almost every C programmer makes in the beginning.

Now for the proof: Consider this variation where I use a dynamically allocated buffer instead of static:

    #include <stdio.h>

    int main(int argc, char ** argv)
    {
        char *buf = malloc(512 * sizeof(char));
        const int a = 2, b = 3;
        snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
        fprintf(stdout, buf);
        free(buf);
        return 0;
    } /*main*/

On my machine, an immediate segfault (stack overrun). Your code only works because your buf is statically allocated, which means &buf==buf. But this equivalence does not hold for any other situation. If your buffer was dynamically allocated on the heap, instead of passing a pointer to the buffer (which *is* what buf itself is), you are passing a pointer to the pointer, which is where buf is stored on the stack, but not the buffer itself. Instant stack corruption.
Re: Why Is Escaping Data Considered So Magical?
On 06/29/2010 06:26 PM, Lawrence D'Oliveiro wrote:

>> I'm not sure you understood me correctly, because I advocate *not* doing input sanitization. Hard or not -- I don't want to know, because I don't want to do it.
>
> But no-one has yet managed to come up with an alternative that involves less work.

Your case is still not persuasive. How is using the DB API's placeholders and parameterization more work? It's the same amount of keystrokes, perhaps even less. You would just be substituting the API's parameter placeholders for Python's. In fact with Psycopg2 and the mysql python db apis, it's almost a matter of simply removing the % and putting in a comma, turning python's string substitution into a method call. And you can leave out the quotes around where the variables go.

If I have to sanitize every input, I have to do it on each and every field on each and every form action. With the DB API doing the work I just do it once, in one place. Is this not easier than manually escaping everything and then embedding it in the query string? I've not used sqlalchemy, but it looks similarly easy.
Re: [OT] Re: Why Is Escaping Data Considered So Magical?
On 06/29/2010 10:05 PM, Michael Torrie wrote:

>     #include <stdio.h>
>
>     int main(int argc, char ** argv)
>     {
>         char *buf = malloc(512 * sizeof(char));
>         const int a = 2, b = 3;
>         snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
                         ^^ Make that 512*sizeof(buf)

Still segfaults though.

>         fprintf(stdout, buf);
>         free(buf);
>         return 0;
>     } /*main*/
Re: [OT] Re: Why Is Escaping Data Considered So Magical?
On 06/29/2010 10:17 PM, Michael Torrie wrote:

> On 06/29/2010 10:05 PM, Michael Torrie wrote:
>
>>     #include <stdio.h>
>>
>>     int main(int argc, char ** argv)
>>     {
>>         char *buf = malloc(512 * sizeof(char));
>>         const int a = 2, b = 3;
>>         snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
>                          ^^ Make that 512*sizeof(buf)

Sigh. Try again. How about 512 * sizeof(char)? Still doesn't make a difference. The code still crashes because the &buf is incorrect.

Another reason python programming is just so much funner and easier! This little diversion is fun though. C is pretty powerful and I enjoy it, but it sure keeps one on one's toes. I made a similar mistake to the &buf thing years ago when I thought I could return strings (char *) from functions on the stack the way Pascal and BASIC could. It was only by pure luck that my code worked as the part of the stack being accessed was invalid and could have been overwritten.

>>         fprintf(stdout, buf);
>>         free(buf);
>>         return 0;
>>     } /*main*/
Re: Why Is Escaping Data Considered So Magical?
On Jun 28, 2:44 am, Gregory Ewing greg.ew...@canterbury.ac.nz wrote:
> Carl Banks wrote:
>> Indeed, strncpy does not copy that final NUL if it's at or beyond the nth element. Probably the most mind-bogglingly stupid thing about the standard C library, which has lots of mind-boggling stupidity.
> I don't think it was as stupid as that back when C was designed. Every byte of memory was precious in those days, and if you had, say, 10 bytes allocated for a string, you wanted to be able to use all 10 of them for useful data. So the convention was that a NUL byte was used to mark the end of the string *if it didn't fill all the available space*.

I can't think of any function in the standard library that observes that convention, which inclines me to disbelieve that this convention ever really existed. If it did, there would be functions to support it. For that matter, I'm not really inclined to believe bytes were *that* precious in those days.

> Functions such as strncpy and snprintf are designed for use with strings that follow this convention. Proper usage requires being cognizant of the maximum length and using appropriate length-limited functions for all operations on such strings.

Well, no. Being cognizant of the string's maximum length doesn't make you able to pass it to printf, or system, or any other C function.

The obvious rationale behind strncpy's stupid behavior is that it's not a string function at all, but a memory block function that stops at a NUL in case you don't care what's after the NUL in a block. But its name leads you to believe it's a string function.

Carl Banks
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On 6/26/10 7:21 PM, Lawrence D'Oliveiro wrote: In messagemailman.2123.1277522976.32709.python-l...@python.org, Tim Chase wrote: On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote: ... I see that you published my unobfuscated e-mail address on USENET for all to see. I obfuscated it for a reason, to keep the spammers away. I'm assuming this was a momentary lapse of judgement, for which I expect an apology. Otherwise, it becomes grounds for an abuse complaint to your ISP. Wow. Way to be a douchebag. I was going to say something about the realities of this forum and its dual-nature and conflicting netiquette and on. But I decided it really just had no point. So, I'm left with: wow. You kinda suck*, man. -- ... Stephen Hansen ... Also: Ixokai ... Mail: me+list/python (AT) ixokai (DOT) io ... Blog: http://meh.ixokai.io/ P.S. *Then again, I'm fairly sure anytime someone has a form letter which contains the words, I expect an apology, there's some personal suck going on. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Mon, 2010-06-28, Kushal Kumaran wrote:
> On Mon, Jun 28, 2010 at 2:00 AM, Jorgen Grahn grahn+n...@snipabacken.se wrote:
>> On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
>>> In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:
>>>> I recently fixed a bug in some production code. The programmer was careful to use snprintf() to avoid buffer overflows. The only problem is, he wrote something along the lines of:
>>>>     snprintf(buf, strlen(foo), foo);
>>> A long while ago I came up with this macro:
>>>     #define Descr(v) &v, sizeof v
>>> making the correct version of the above become
>>>     snprintf(Descr(buf), foo);
>> This is off-topic, but I believe snprintf() in C can *never* safely be the only thing you do to the buffer: you also have to NUL-terminate it manually in some corner cases. See the documentation.
> snprintf goes to great lengths to be safe, in fact. You might be thinking of strncpy.

Yes, it was indeed strncpy I was thinking of. Thanks.

But actually, the snprintf(3) man page I have is not 100% clear on this issue, so the last time I used it, I added a manual NUL-termination plus a comment saying I wasn't sure it was needed. I normally use C++ or Python, so I am a bit rusty on these things.

/Jorgen
-- // Jorgen Grahn grahn@ Oo o. . . \X/ snipabacken.se O o . -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
Carl Banks wrote: Indeed, strncpy does not copy that final NUL if it's at or beyond the nth element. Probably the most mind-bogglingly stupid thing about the standard C library, which has lots of mind-boggling stupidity. I don't think it was as stupid as that back when C was designed. Every byte of memory was precious in those days, and if you had, say, 10 bytes allocated for a string, you wanted to be able to use all 10 of them for useful data. So the convention was that a NUL byte was used to mark the end of the string *if it didn't fill all the available space*. Functions such as strncpy and snprintf are designed for use with strings that follow this convention. Proper usage requires being cognizant of the maximum length and using appropriate length-limited functions for all operations on such strings. -- Greg -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
Gregory Ewing greg.ew...@canterbury.ac.nz writes: I don't think it was as stupid as that back when C was designed. Every byte of memory was precious in those days, and if you had, say, 10 bytes allocated for a string, you wanted to be able to use all 10 of them for useful data. No I don't think so. Traditional C strings simply didn't carry length info except for the nul byte at the end. Most string functions expected the nul to be there. The nul byte convention (instead of having a header word with a length) arguably saved some space both by eliminating a multi-byte header and by allowing trailing substrings to be represented as pointers into a larger string. In retrospect it seems like a big error. -- http://mail.python.org/mailman/listinfo/python-list
Re: [OT] Re: Why Is Escaping Data Considered So Magical?
In message mailman.2231.1277700501.32709.python-l...@python.org, Kushal Kumaran wrote:
> On Sun, Jun 27, 2010 at 5:16 PM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
>> In message mailman.2184.1277626565.32709.python-l...@python.org, Kushal Kumaran wrote:
>>> On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
>>>> A long while ago I came up with this macro:
>>>>     #define Descr(v) &v, sizeof v
>>>> making the correct version of the above become
>>>>     snprintf(Descr(buf), foo);
>>> Not quite right. If buf is a char array, as suggested by the use of sizeof, then you're not passing a char* to snprintf.
>> What am I passing, then?
> Here's what gcc tells me (I declared buf as char buf[512]):
>     sprintf.c:8: warning: passing argument 1 of ‘snprintf’ from incompatible pointer type
>     /usr/include/stdio.h:363: note: expected ‘char * __restrict__’ but argument is of type ‘char (*)[512]’
> You just need to lose the & from the macro.

Why does this work, then:

    l...@theon:hack cat test.c
    #include <stdio.h>

    int main(int argc, char ** argv)
    {
        char buf[512];
        const int a = 2, b = 3;
        snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
        fprintf(stdout, buf);
        return 0;
    } /*main*/
    l...@theon:hack ./test
    2 + 3 = 5
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message pan.2010.06.27.13.55.04.500...@nowhere.com, Nobody wrote: On Sun, 27 Jun 2010 14:36:10 +1200, Lawrence D'Oliveiro wrote: Except nobody has yet shown an alternative which is easier to get right. For SQL, use stored procedures or prepared statements. So feel free to rewrite my example using either stored procedures or prepared statements, to prove how much easier it is. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In article 7xmxuffpxp@ruckus.brouhaha.com, Paul Rubin no.em...@nospam.invalid wrote:
> Gregory Ewing greg.ew...@canterbury.ac.nz writes:
>> I don't think it was as stupid as that back when C was designed. Every byte of memory was precious in those days, and if you had, say, 10 bytes allocated for a string, you wanted to be able to use all 10 of them for useful data.
> No I don't think so. Traditional C strings simply didn't carry length info except for the nul byte at the end. Most string functions expected the nul to be there. The nul byte convention (instead of having a header word with a length) arguably saved some space both by eliminating a multi-byte header and by allowing trailing substrings to be represented as pointers into a larger string. In retrospect it seems like a big error.

Null-terminated strings predate C. Various assembler languages had ASCIIZ (or similar) directives long before that.

The nice thing about null-terminated strings is how portable they have been over various word lengths. Life would have been truly inconvenient if K&R had picked, say, a 16-bit length field, and then we needed to bump that up to 32 bits in the 80's, and again to 64 bits in the 90's.
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Mon, 28 Jun 2010 20:55:53 -0400, Roy Smith wrote:
> The nice thing about null-terminated strings is how portable they have been over various word lengths. Life would have been truly inconvenient if K&R had picked, say, a 16-bit length field, and then we needed to bump that up to 32 bits in the 80's, and again to 64 bits in the 90's.

Or a Pascal 8-bit length field.

However, the cost of null-terminated strings is that they can't store binary data, and worse, they're slow. In fact, according to some, null-terminated strings are the *worst* way to implement a string type.

http://www.joelonsoftware.com/articles/fog000319.html

-- Steven
-- http://mail.python.org/mailman/listinfo/python-list
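The binary-data limitation can be seen from Python with ctypes, which exposes a C-style buffer both ways. This is only an illustrative sketch; the one-byte length prefix is a stand-in for the Pascal-style layout mentioned above, not a claim about any particular Pascal implementation.

```python
import ctypes
import struct

payload = b"ab\x00cd"  # binary data with an embedded NUL byte

# ctypes allocates len(payload) + 1 bytes and appends a terminating NUL.
c_buf = ctypes.create_string_buffer(payload)

# .value reads the buffer as a C string: everything up to the first NUL.
as_c_string = c_buf.value
# .raw returns every byte in the buffer, terminator included.
all_bytes = c_buf.raw

# Length-prefixed alternative: a single unsigned byte holds the length,
# which caps the string at 255 bytes but lets NUL appear in the data.
prefixed = struct.pack("B", len(payload)) + payload
length = prefixed[0]
recovered = prefixed[1:1 + length]

print(as_c_string, all_bytes, recovered)
```

Anything that treats the buffer as a NUL-terminated string sees only `b"ab"`; the length-prefixed form round-trips the full payload.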
Re: [OT] Re: Why Is Escaping Data Considered So Magical?
On Tue, Jun 29, 2010 at 5:56 AM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
> In message mailman.2231.1277700501.32709.python-l...@python.org, Kushal Kumaran wrote:
>> On Sun, Jun 27, 2010 at 5:16 PM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
>>> In message mailman.2184.1277626565.32709.python-l...@python.org, Kushal Kumaran wrote:
>>>> On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
>>>>> A long while ago I came up with this macro:
>>>>>     #define Descr(v) &v, sizeof v
>>>>> making the correct version of the above become
>>>>>     snprintf(Descr(buf), foo);
>>>> Not quite right. If buf is a char array, as suggested by the use of sizeof, then you're not passing a char* to snprintf.
>>> What am I passing, then?
>> Here's what gcc tells me (I declared buf as char buf[512]):
>>     sprintf.c:8: warning: passing argument 1 of ‘snprintf’ from incompatible pointer type
>>     /usr/include/stdio.h:363: note: expected ‘char * __restrict__’ but argument is of type ‘char (*)[512]’
>> You just need to lose the & from the macro.
> Why does this work, then:
>     l...@theon:hack cat test.c
>     #include <stdio.h>
>
>     int main(int argc, char ** argv)
>     {
>         char buf[512];
>         const int a = 2, b = 3;
>         snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
>         fprintf(stdout, buf);
>         return 0;
>     } /*main*/
>     l...@theon:hack ./test
>     2 + 3 = 5

By accident. I hope your compiler warned you about your snprintf call. Reading these threads might help you understand how char* and char (*)[512] are different:

http://groups.google.com/group/comp.lang.c++/browse_thread/thread/24708a9204061ce/848ceaf5ec774d81
http://groups.google.com/group/comp.lang.c.moderated/browse_thread/thread/fe264c550947a2e5/32b330cdf8aba3d6

-- regards, kushal
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Sat, Jun 26, 2010 at 8:31 PM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote: Except I only needed two calls to SQLString, while you need two dozen instances of that repetitive items.c boilerplate. As a human, being repetitive is not my job. That’s what the computer is for. Then why do you have every parameter prefixed with modify_? 8-) But seriously, if that bothers you, then fold the items.c. portion into the generator expression with a getattr call. Or just change them back to the same strings you had originally, and sqlalchemy will be just as happy to accept them as-is. Cheers, Ian -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
> In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:
>> I recently fixed a bug in some production code. The programmer was careful to use snprintf() to avoid buffer overflows. The only problem is, he wrote something along the lines of:
>>     snprintf(buf, strlen(foo), foo);
> A long while ago I came up with this macro:
>     #define Descr(v) &v, sizeof v
> making the correct version of the above become
>     snprintf(Descr(buf), foo);

Not quite right. If buf is a char array, as suggested by the use of sizeof, then you're not passing a char* to snprintf. You need to lose the & in your macro.

-- regards, kushal
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message mailman.2184.1277626565.32709.python-l...@python.org, Kushal Kumaran wrote:
> On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
>> In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:
>>> I recently fixed a bug in some production code. The programmer was careful to use snprintf() to avoid buffer overflows. The only problem is, he wrote something along the lines of:
>>>     snprintf(buf, strlen(foo), foo);
>> A long while ago I came up with this macro:
>>     #define Descr(v) &v, sizeof v
>> making the correct version of the above become
>>     snprintf(Descr(buf), foo);
> Not quite right. If buf is a char array, as suggested by the use of sizeof, then you're not passing a char* to snprintf.

What am I passing, then?
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message mailman.2183.1277623909.32709.python-l...@python.org, Ian Kelly wrote: On Sat, Jun 26, 2010 at 8:31 PM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote: Except I only needed two calls to SQLString, while you need two dozen instances of that repetitive items.c boilerplate. As a human, being repetitive is not my job. That’s what the computer is for. Then why do you have every parameter prefixed with modify_? 8-) Touché :). Actually it’s because the same form can be used to add a new record to the table, so there’s a separate set of input fields for that. But seriously, if that bothers you, then fold the items.c. portion into the generator expression with a getattr call. Or just change them back to the same strings you had originally, and sqlalchemy will be just as happy to accept them as-is. All this trouble, and it only gets rid of 2 of the 3 instances of data- escaping in the example. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Sun, 27 Jun 2010 14:36:10 +1200, Lawrence D'Oliveiro wrote: In any case, you're still trying to make arguments about whether it's easy or hard to get it right, which completely misses the point. Eliminating the escaping entirely makes it impossible to get it wrong. Except nobody has yet shown an alternative which is easier to get right. For SQL, use stored procedures or prepared statements. For HTML/XML, use a DOM (or similar) interface. -- http://mail.python.org/mailman/listinfo/python-list
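For the HTML/XML half of that advice, a minimal sketch using the standard-library ElementTree API: the tree is built from objects and the serializer does whatever escaping the output format needs. The element name, attribute name, and sample string are arbitrary choices for illustration.

```python
import xml.etree.ElementTree as ET

untrusted = 'Tom & "Jerry" <script>alert(1)</script>'

# Build the node through the API; no markup is assembled by hand.
para = ET.Element("p", attrib={"title": untrusted})
para.text = untrusted

# Serialization is the only place where escaping happens.
markup = ET.tostring(para, encoding="unicode")
print(markup)

# Parsing the result gives back exactly the original data.
round_trip = ET.fromstring(markup)
print(round_trip.text == untrusted)
```

The application code never calls an escaping function, so there is nothing to forget and nothing to apply twice.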
Re: Why Is Escaping Data Considered So Magical?
On Sat, 2010-06-26, Lawrence D'Oliveiro wrote:
> In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn wrote:
>> I thought it was well-known that the solution is *not* to try to sanitize the input -- it's to switch to an interface which doesn't involve generating an intermediate executable. In the Python example, that would be something like os.popen2(['zcat', '-f', '--', untrusted]).
> That's what I mean. Why do people consider input sanitization so hard?

I'm not sure you understood me correctly, because I advocate *not* doing input sanitization. Hard or not -- I don't want to know, because I don't want to do it.

/Jorgen
-- // Jorgen Grahn grahn@ Oo o. . . \X/ snipabacken.se O o . -- http://mail.python.org/mailman/listinfo/python-list
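A sketch of the same idea with the subprocess module, which superseded os.popen2 (removed in Python 3). To stay runnable without zcat installed, the child process here is the Python interpreter itself, echoing its first argument; `echo_via_child` is a made-up helper name.

```python
import subprocess
import sys

def echo_via_child(arg):
    # The command is a list, so no shell ever parses the untrusted string;
    # it reaches the child as a single argv entry, whatever it contains.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", arg],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.rstrip("\n")

untrusted = "file name; rm -rf ~ #'\" $(whoami)"
print(echo_via_child(untrusted) == untrusted)
```

No quoting or sanitizing step exists in this code, because no intermediate command line is ever generated.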
Re: Why Is Escaping Data Considered So Magical?
On Fri, 2010-06-25, Nobody wrote:
> On Fri, 25 Jun 2010 12:15:08 +, Jorgen Grahn wrote:
>> I don't do SQL and I don't even understand the terminology properly ... but the discussion around it bothers me. Do those people really do this?
> Yes. And then some. Among web developers, the median level of programming knowledge amounts to the first 3 chapters of "Learn PHP in 7 Days". It doesn't help that the guy who wrote PHP itself wasn't much better.
>> - accept untrusted user data
>> - try to sanitize the data (escaping certain characters etc)
>> - turn this data into executable code (SQL)
>> - executing it
>> Like the example in the article
>>     SELECT * FROM hotels WHERE city = 'untrusted';
> Yep. Search the BugTraq archives for "SQL injection". And most of those are for widely-deployed middleware; the zillions of bespoke site-specific scripts are likely to be worse.
> Also: http://xkcd.com/327/

Priceless! As is often the case with xkcd, I learned something, too: there's a widely used web application/portal/database thingy which silently strips some characters from my input. I thought it had to do with HTML, but it's in fact exactly the sequences ', ')', ';' and '--' from the comic, and a few more like '' and undoubtedly some I haven't noticed yet.

That is surely input sanitization gone horribly wrong: I enter "6--8 slices of bread", but the system stores "68 slices of bread".

>> I thought it was well-known that the solution is *not* to try to sanitize the input
> Well known by anyone with a reasonable understanding of the principles of programming, but somewhat less well known by the other 98% of web developers.
>> Am I missing something?
> There's a world of difference between a skilled chef and the people flipping burgers for a minimum wage. And between a chartered civil engineer and the people laying the asphalt. And between what you probably consider a programmer and the people doing most web development.

I don't know them, so I wouldn't know ...

What I would *expect* is that safe tools are provided for them, not just workarounds so they can keep using the unsafe tools. That's what Python did, with its multitude of alternatives to os.system and os.popen.

Anyway, thanks. It's always nice to be able to map foreign terminology like "SQL injection" to something you already know.

/Jorgen
-- // Jorgen Grahn grahn@ Oo o. . . \X/ snipabacken.se O o . -- http://mail.python.org/mailman/listinfo/python-list
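The bread example can be reproduced in a few lines. The `strip_sanitize` function below is a guess at what such a system might do, not the code of any real application; the point is only that deleting "dangerous" sequences changes the data, while a bound parameter stores it untouched.

```python
import sqlite3

def strip_sanitize(text):
    """Hypothetical sanitizer that deletes sequences it considers risky."""
    for seq in ("--", ";", "'", ")"):
        text = text.replace(seq, "")
    return text

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")

user_input = "6--8 slices of bread"

# Row 1: the input after "sanitization".
conn.execute("INSERT INTO notes VALUES (?)", (strip_sanitize(user_input),))
# Row 2: the input passed as a parameter, with no sanitization at all.
conn.execute("INSERT INTO notes VALUES (?)", (user_input,))

stored = [row[0] for row in conn.execute("SELECT body FROM notes ORDER BY rowid")]
print(stored)
```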
Re: Why Is Escaping Data Considered So Magical?
On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
> In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:
>> I recently fixed a bug in some production code. The programmer was careful to use snprintf() to avoid buffer overflows. The only problem is, he wrote something along the lines of:
>>     snprintf(buf, strlen(foo), foo);
> A long while ago I came up with this macro:
>     #define Descr(v) &v, sizeof v
> making the correct version of the above become
>     snprintf(Descr(buf), foo);

This is off-topic, but I believe snprintf() in C can *never* safely be the only thing you do to the buffer: you also have to NUL-terminate it manually in some corner cases. See the documentation.

/Jorgen
-- // Jorgen Grahn grahn@ Oo o. . . \X/ snipabacken.se O o . -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On 2010-06-26 22:33:57 -0400, Lawrence D'Oliveiro said:
> In message 2010062522560231540-angrybald...@gmailcom, Owen Jacobson wrote:
>> It's not hard. It's just begging for a visit from the fuckup fairy.
> That’s the same fallacious argument I pointed out earlier.

In the sense that using correct manual escaping leads to SQL injection vulnerabilities, yes, that's a fallacious argument on its own. However, as sites like BUGTRAQ amply demonstrate, generating SQL through string manipulation is a risky development practice[0]. You can continue to justify your choice to do so however you want, and you may even be the One True Developer capable of getting it absolutely right under all circumstances, but I'd still reject patches that introduced a SQLString-like function and ask that you resubmit them using the database API's parameterization tools instead.

Assume for the sake of discussion that your SQLString function perfectly captures the transformation required to turn an arbitrary str into a MySQL string literal. How do you address the following issues?

1. Other (possibly inexperienced) developers reading your source, who may not have the skills to implement the same transform correctly, learn from your programs that writing your own query munger is okay.

1a. Other (possibly inexperienced) developers decide to copy and paste your function without fully understanding how it works, in tandem with any of the other issues below. (If you think this is rare, I invite you to visit stackoverflow or roseindia some time.)

2. MySQL changes the quoting and escaping rules to address a bug/feature request/developer whim, introducing a new set of corner cases into your function and forcing you to re-learn the escaping and quoting rules. (For people using DB API parameters, this is a matter of upgrading the DB adapter module to a version that supports the modified rules.)

3. You decide to switch from MySQL to a more fully-featured RDBMS, which may have different quoting and escaping rules around string literals.

3a. *Someone else* decides to port your program to a different RDBMS, and may not understand that SQLString implements MySQL's quoting and escaping rules only.

4. MySQL AB finally gets off its collective duff and adds real parameter separation to the MySQL wire protocol, and implements real prepared statements to massive speed gains in scenarios that are relevant to your interests; string-based query construction gets left out in the cold.

4a. As with case 3, except that instead of the rules changing when you move to a new RDBMS, it's the relative performance of submitting new queries versus reusing a parameterized query that changes.

On top of the obvious issue of completely avoiding quoting bugs, using query parameters rather than escaping and string manipulation neatly saves you from having to address any of these problems (and a multitude of others) -- the DB API implementation will handle things for you, and you are propagating good practice in an easy-to-understand form.

I am honestly at a loss trying to understand your position. There is a huge body of documentation out there about the weaknesses of string-manipulation-based approaches to query construction, and the use of query parameters is so compellingly the Right Thing that I have a very hard time comprehending why anyone would opt not to use them except out of pure ignorance of their existence. Generating executable code -- including SQL -- from untrusted user input introduces a large vulnerability surface for very little benefit.

You don't handle function parameters by building up python-language strs containing the values as literals and eval'ing them, do you?

-o

[0] If you want to be *really* pedantic, string-manipulation-based query construction is strongly correlated with the occurrence of SQL injection vulnerabilities and bugs, which is in turn not strongly correlated with very many other practices. Happy?
-- http://mail.python.org/mailman/listinfo/python-list
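The closing analogy can be made concrete. The sketch below contrasts calling a function with a value against pasting that value into source text and eval'ing it; `greet` and the sample string are invented for illustration, and the "injected" code is deliberately harmless arithmetic.

```python
def greet(name):
    return "Hello, " + name

# A value chosen so that it closes the string literal it gets pasted into.
untrusted = "x') + str(6 * 7) + ('"

# Passing the value as a parameter: it stays data, whatever it contains.
as_parameter = greet(untrusted)

# Building source code around the value: part of it now runs as code.
as_source = eval("greet('%s')" % untrusted)

print(as_parameter)
print(as_source)
```

The eval'ed version computes `6 * 7` on the caller's behalf, which is the same failure mode as a query assembled by string formatting.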
[OT] Re: Why Is Escaping Data Considered So Magical?
On Sun, Jun 27, 2010 at 5:16 PM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
> In message mailman.2184.1277626565.32709.python-l...@python.org, Kushal Kumaran wrote:
>> On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
>>> In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:
>>>> I recently fixed a bug in some production code. The programmer was careful to use snprintf() to avoid buffer overflows. The only problem is, he wrote something along the lines of:
>>>>     snprintf(buf, strlen(foo), foo);
>>> A long while ago I came up with this macro:
>>>     #define Descr(v) &v, sizeof v
>>> making the correct version of the above become
>>>     snprintf(Descr(buf), foo);
>> Not quite right. If buf is a char array, as suggested by the use of sizeof, then you're not passing a char* to snprintf.
> What am I passing, then?

Here's what gcc tells me (I declared buf as char buf[512]):

    sprintf.c:8: warning: passing argument 1 of ‘snprintf’ from incompatible pointer type
    /usr/include/stdio.h:363: note: expected ‘char * __restrict__’ but argument is of type ‘char (*)[512]’

You just need to lose the & from the macro.

-- regards, kushal
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Mon, Jun 28, 2010 at 2:00 AM, Jorgen Grahn grahn+n...@snipabacken.se wrote:
> On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
>> In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:
>>> I recently fixed a bug in some production code. The programmer was careful to use snprintf() to avoid buffer overflows. The only problem is, he wrote something along the lines of:
>>>     snprintf(buf, strlen(foo), foo);
>> A long while ago I came up with this macro:
>>     #define Descr(v) &v, sizeof v
>> making the correct version of the above become
>>     snprintf(Descr(buf), foo);
> This is off-topic, but I believe snprintf() in C can *never* safely be the only thing you do to the buffer: you also have to NUL-terminate it manually in some corner cases. See the documentation.

snprintf goes to great lengths to be safe, in fact. You might be thinking of strncpy.

-- regards, kushal
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Jun 27, 9:54 pm, Kushal Kumaran kushal.kumaran+pyt...@gmail.com wrote:
> On Mon, Jun 28, 2010 at 2:00 AM, Jorgen Grahn grahn+n...@snipabacken.se wrote:
>> On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
>>> In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:
>>>> I recently fixed a bug in some production code. The programmer was careful to use snprintf() to avoid buffer overflows. The only problem is, he wrote something along the lines of:
>>>>     snprintf(buf, strlen(foo), foo);
>>> A long while ago I came up with this macro:
>>>     #define Descr(v) &v, sizeof v
>>> making the correct version of the above become
>>>     snprintf(Descr(buf), foo);
>> This is off-topic, but I believe snprintf() in C can *never* safely be the only thing you do to the buffer: you also have to NUL-terminate it manually in some corner cases. See the documentation.
> snprintf goes to great lengths to be safe, in fact. You might be thinking of strncpy.

Indeed, strncpy does not copy that final NUL if it's at or beyond the nth element. Probably the most mind-bogglingly stupid thing about the standard C library, which has lots of mind-boggling stupidity.

Whenever I do an audit of someone's C code, the first thing I do is search for strncpy and see if they set the nth character to 0. (They usually didn't.)

Carl Banks
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Thu, Jun 24, 2010 at 9:38 PM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
> In message 2010062422432660794-angrybald...@gmailcom, Owen Jacobson wrote:
>> Why would I write this when SQLAlchemy, even without using its ORM features, can do it for me?
> SQLAlchemy doesn’t seem very flexible. Looking at the code examples http://www.sqlalchemy.org/docs/examples.html, they’re very procedural: build object, then do a string of separate method calls to add data to it. I prefer the functional approach, as in my table-update example.

Your example from the first post of the thread rewritten using sqlalchemy:

    conn.execute(
        items.update()
        .where(items.c.inventory_nr == modify_id)
        .values(
            dict(
                (field[0], Params.getvalue("%s[%s]" % (field[1], urllib.quote(modify_id))))
                for field in [
                    (items.c.class_name, "modify_class"),
                    (items.c.make, "modify_make"),
                    (items.c.model, "modify_model"),
                    (items.c.details, "modify_details"),
                    (items.c.serial_nr, "modify_serial"),
                    (items.c.inventory_nr, "modify_invent"),
                    (items.c.when_purchased, "modify_when_purchased"),
                    ... you get the idea ...
                    (items.c.location_name, "modify_location"),
                    (items.c.comment, "modify_comment"),
                ]
            )
        )
        .values(last_modified = time.time())
    )

Doesn't seem any less flexible to me, plus you don't have to worry about calling your SQLString function at all.

Cheers,
Ian
-- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On 2010-06-25 19:49 , Lawrence D'Oliveiro wrote: In messageslrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn wrote: I thought it was well-known that the solution is *not* to try to sanitize the input -- it's to switch to an interface which doesn't involve generating an intermediate executable. In the Python example, that would be something like os.popen2(['zcat', '-f', '--', untrusted]). That’s what I mean. Why do people consider input sanitization so hard? It's not hard per se; it's just repetitive, prone to the occasional mistake, and, frankly, really boring. When faced with things like that, we do what we do everywhere else in programming: wrap up the repetitive bits into a simpler library API and use that everywhere. Wrapping up the escaping code into SQLString is a step in that direction. However, the standard SQL parameterization in most of the DB protocols or SQLAlchemy's query construction removes even more repetition and unnecessary typing. There's just no point in not using it. -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
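One way to read "wrap up the repetitive bits" for a table update is to drive the statement from a dict, with column names taken from a fixed list and values bound as parameters. This is a sketch against sqlite3 with an invented `items` schema and helper name, not the code from the first post of the thread.

```python
import sqlite3

# Column names come only from this fixed tuple, never from user input.
ALLOWED_FIELDS = ("make", "model", "comment")

def update_item(conn, inventory_nr, changes):
    """Update the allowed columns present in `changes`; return rows changed."""
    fields = [f for f in ALLOWED_FIELDS if f in changes]
    if not fields:
        return 0
    assignments = ", ".join("%s = ?" % f for f in fields)
    sql = "UPDATE items SET %s WHERE inventory_nr = ?" % assignments
    params = [changes[f] for f in fields] + [inventory_nr]
    return conn.execute(sql, params).rowcount

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (inventory_nr TEXT, make TEXT, model TEXT, comment TEXT)"
)
conn.execute("INSERT INTO items VALUES ('A1', 'Acme', 'X', '')")

changed = update_item(
    conn, "A1", {"model": "Y'; DROP TABLE items; --", "comment": "it's fine"}
)
row = conn.execute("SELECT make, model, comment FROM items").fetchone()
print(changed, row)
```

Adding a column to the form means adding one name to the tuple; no value is ever quoted or escaped by the application.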
Re: Why Is Escaping Data Considered So Magical?
On Sat, 26 Jun 2010 12:40:41 +1200, Lawrence D'Oliveiro wrote:
>>> I construct ad-hoc queries all the time. It really isn’t that hard to do safely.
>> Wrong. Even if you get the quoting absolutely correct (which is a very big "if"), you have to remember to perform it every time, without exception. More generally, as a program gets more complex, "this will work so long as we do X every time without fail" approaches "this won't work".
> That’s a content-free claim. Why? Because it applies equally to everything. Replace “quoting” with something like “arithmetic”, and you’ll see what I mean:

If you omit the arithmetic, the program is likely to fail in very obvious ways. Escaping is almost an identity function, which makes it far more likely that omission or repetition will go unnoticed.

>> And you need to perform it exactly once. As the program gets more complex, ensuring that it's done in the correct place, and only there, gets harder.
> Nonsense. It only needs to be done at the boundary to the appropriate component (MySQL, HTML, JavaScript, whatever).

That assumes that you have a well-defined boundary, which isn't necessarily the case.

In any case, you're still trying to make arguments about whether it's easy or hard to get it right, which completely misses the point. Eliminating the escaping entirely makes it impossible to get it wrong.
-- http://mail.python.org/mailman/listinfo/python-list
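The "almost an identity function" and "exactly once" points can be illustrated with HTML escaping from the standard library (used here only as a convenient stand-in for any escaping function; the sample strings are arbitrary).

```python
import html

plain = "hello world"
name = "O'Neil & Sons"

# On most input, escaping changes nothing, so forgetting it goes unnoticed.
unchanged = html.escape(plain)

once = html.escape(name)    # escaped correctly, exactly once
twice = html.escape(once)   # escaped again somewhere else in the program

print(unchanged)
print(once)
print(twice)

# Undoing one level of escaping shows what a browser would display.
shown_once = html.unescape(once)
shown_twice = html.unescape(twice)
```

Escaping once round-trips to the original text; escaping twice makes the entity references themselves visible, and neither mistake raises an error.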
Re: Why Is Escaping Data Considered So Magical?
On Fri, 25 Jun 2010 20:43:51 -0400, Roy Smith wrote: To bring this back to something remotely Python related, the point of all this is that security is hard. Oh, this isn't solely a security issue. Ask anyone with a surname like O'Neil, O'Connor, O'Leary, etc; they've probably broken a lot of web apps *without even trying*. -- http://mail.python.org/mailman/listinfo/python-list
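The surname case takes only a few lines to reproduce. This sketch uses an in-memory sqlite3 database with an invented `people` table: the string-formatted INSERT fails on a perfectly honest name, while the parameterized one stores it as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (surname TEXT)")

surname = "O'Neil"

# Pasting the value into the SQL text: the apostrophe ends the string
# literal early and the statement no longer parses.
try:
    conn.execute("INSERT INTO people VALUES ('%s')" % surname)
    naive_ok = True
except sqlite3.OperationalError as exc:
    naive_ok = False
    print("naive insert failed:", exc)

# Binding the value as a parameter: no quoting involved at all.
conn.execute("INSERT INTO people VALUES (?)", (surname,))
rows = conn.execute("SELECT surname FROM people").fetchall()
print(rows)
```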
Re: Why Is Escaping Data Considered So Magical?
In article 2010062522560231540-angrybald...@gmailcom, Owen Jacobson angrybald...@gmail.com wrote: It's not hard. It's just begging for a visit from the fuckup fairy. QOTD? -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Sat, 26 Jun 2010 12:04:38 +0100 Nobody nob...@nowhere.com wrote: Ask anyone with a surname like O'Neil, O'Connor, O'Leary, etc; they've probably broken a lot of web apps *without even trying*. At least it isn't a problem with the first name field. Oh, wait... -- D'Arcy J.M. Cain da...@druid.net | Democracy is three wolves http://www.druid.net/darcy/| and a sheep voting on +1 416 425 1212 (DoD#0082)(eNTP) | what's for dinner. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message mailman.2123.1277522976.32709.python-l...@python.org, Tim Chase wrote: On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote: ... I see that you published my unobfuscated e-mail address on USENET for all to see. I obfuscated it for a reason, to keep the spammers away. I'm assuming this was a momentary lapse of judgement, for which I expect an apology. Otherwise, it becomes grounds for an abuse complaint to your ISP. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message mailman.2126.1277534032.32709.python-l...@python.org, Ian Kelly wrote:

> Your example from the first post of the thread rewritten using sqlalchemy:
>
>     conn.execute(
>         items.update()
>         .where(items.c.inventory_nr == modify_id)
>         .values(
>             dict(
>                 (field[0], Params.getvalue("%s[%s]" % (field[1], urllib.quote(modify_id))))
>                 for field in [
>                     (items.c.class_name, "modify_class"),
>                     (items.c.make, "modify_make"),
>                     (items.c.model, "modify_model"),
>                     (items.c.details, "modify_details"),
>                     (items.c.serial_nr, "modify_serial"),
>                     (items.c.inventory_nr, "modify_invent"),
>                     (items.c.when_purchased, "modify_when_purchased"),
>                     ... you get the idea ...
>                     (items.c.location_name, "modify_location"),
>                     (items.c.comment, "modify_comment"),
>                 ]
>             )
>         )
>         .values(last_modified = time.time())
>     )
>
> Doesn't seem any less flexible to me, plus you don't have to worry about calling your SQLString function at all.

Except I only needed two calls to SQLString, while you need two dozen instances of that repetitive items.c boilerplate.

As a human, being repetitive is not my job. That’s what the computer is for.
Re: Why Is Escaping Data Considered So Magical?
In message 2010062522560231540-angrybald...@gmailcom, Owen Jacobson wrote: It's not hard. It's just begging for a visit from the fuckup fairy. That’s the same fallacious argument I pointed out earlier. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message pan.2010.06.26.10.49.02.156...@nowhere.com, Nobody wrote:

> On Sat, 26 Jun 2010 12:40:41 +1200, Lawrence D'Oliveiro wrote:
>
>>>> I construct ad-hoc queries all the time. It really isn’t that hard to do safely.
>>>
>>> Wrong. Even if you get the quoting absolutely correct (which is a very big if), you have to remember to perform it every time, without exception. More generally, as a program gets more complex, "this will work so long as we do X every time without fail" approaches "this won't work".
>>
>> That’s a content-free claim. Why? Because it applies equally to everything. Replace “quoting” with something like “arithmetic”, and you’ll see what I mean:
>
> If you omit the arithmetic, the program is likely to fail in very obvious ways. Escaping is almost an identity function, which makes it far more likely that omission or repetition will go unnoticed.

Maybe you need to go back and reread my original posting. The SQLString routine doesn’t just escape special characters, it generates a full MySQL string literal, complete with quotation marks. That makes it rather more likely for a syntax error to occur if I forget to use it, don’t you think?

>>> And you need to perform it exactly once. As the program gets more complex, ensuring that it's done in the correct place, and only there, gets harder.
>>
>> Nonsense. It only needs to be done at the boundary to the appropriate component (MySQL, HTML, JavaScript, whatever).
>
> That assumes that you have a well-defined boundary, which isn't necessarily the case.

It’s ALWAYS the case.

> In any case, you're still trying to make arguments about whether it's easy or hard to get it right, which completely misses the point. Eliminating the escaping entirely makes it impossible to get it wrong.

Except nobody has yet shown an alternative which is easier to get right.
Re: Why Is Escaping Data Considered So Magical?
In article i06cju$qq...@lust.ihug.co.nz, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote: In message mailman.2123.1277522976.32709.python-l...@python.org, Tim Chase wrote: On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote: ... I see that you published my unobfuscated e-mail address on USENET for all to see. I obfuscated it for a reason, to keep the spammers away. I'm assuming this was a momentary lapse of judgement, for which I expect an apology. Otherwise, it becomes grounds for an abuse complaint to your ISP. You are double daft. First, I completely disagree with you about it being abuse; from my POV anyone posting to Usenet should do so with an unobfuscated address. Secondly, you are wrong about Tim publishing your address unless you intended to follow up to a completely different post, and you owe *him* an apology for a false accusation. -- Aahz (a...@pythoncraft.com) * http://www.pythoncraft.com/ If you don't know what your program is supposed to do, you'd better not start writing it. --Dijkstra -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Sat, Jun 26, 2010 at 7:21 PM, Lawrence D'Oliveiro wrote: In message mailman.2123.1277522976.32709.python-l...@python.org, Tim Chase wrote: On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote: ... I see that you published my unobfuscated e-mail address on USENET for all to see. I obfuscated it for a reason, to keep the spammers away. I'm assuming this was a momentary lapse of judgement, for which I expect an apology. Otherwise, it becomes grounds for an abuse complaint to your ISP. Will you give it a rest already with these threatening messages? Why are you still using this only-partially-obfuscated address with USENET anyway? This has happened twice before, it will doubtless happen yet again. Just use an /entirely invalid/ From address like some other posters do. I can't believe you have a form letter for this... Regards, Chris -- Public addresses eventually going bad is a *fact of life*; plan ahead accordingly. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On 06/26/2010 09:21 PM, Lawrence D'Oliveiro wrote: In messagemailman.2123.1277522976.32709.python-l...@python.org, Tim Chase wrote: On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote: ... I see that you published my unobfuscated e-mail address on USENET for all to see. I obfuscated it for a reason, to keep the spammers away. I'm assuming this was a momentary lapse of judgement, for which I expect an apology. Otherwise, it becomes grounds for an abuse complaint to your ISP. I'm sorry...you've got your knickers in a knot? That your spam filters seem to be insufficient? That you don't have a custom throwaway address for such public dialogs? For preventing an undeliverable bounce message that your bogus address would have caused (if your mail provider is RFC-compliant; though your mail provider may kindly be breaking RFC by disabling undeliverable responses to prevent back-scatter spam)? Is the abuse charge waah, he replied to my actual email rather than the false one I spoofed? I'm not sure an abuse complaint to my ISP would net you anything since the exact out-bound headers show nothing abusive, only the correcting of an invalid TLD to prevent a bounce (and a distinct lack of USENET references in the original message that went to you and CC'ed python-list@python.org). Having regularly used python.l...@tim.thechases.com unobfuscated for easily over 5 years, the spam to this address has been almost negligible (or so effectively dealt with by Thunderbird's spam filters that I've never noticed it). -tkc -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message pan.2010.06.26.11.04.22.328...@nowhere.com, Nobody wrote: Ask anyone with a surname like O'Neil, O'Connor, O'Leary, etc; they've probably broken a lot of web apps *without even trying*. Last I checked, I couldn’t post comments on freedom-to-tinker.com. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

> I recently fixed a bug in some production code. The programmer was careful to use snprintf() to avoid buffer overflows. The only problem is, he wrote something along the lines of:
>
>     snprintf(buf, strlen(foo), foo);

A long while ago I came up with this macro:

    #define Descr(v) v, sizeof v

making the correct version of the above become

    snprintf(Descr(buf), foo);
Re: Why Is Escaping Data Considered So Magical?
On 25Jun2010 15:54, I wrote: | The number of times I've had to | fix/remove insert-values-into-SQL-text code ... My point here is that with insert-escaped-values-into-sql-text, you only need to forget to do it once (or do it wrong). By using a parameterised form like that required by SQLalchemy the library does it and never forgets. I would also point out that if you use a library to _construct_ the SQL statements themselves eg via SQLA's .select() methods etc then you will never introduce a syntax error into the SQL either. I expect I could construct SQL syntax errors that cause havoc when inserted with correctly escaped parameter values if I tried, probably using quotes in the SQL typo part. Cheers, -- Cameron Simpson c...@zip.com.au DoD#743 http://www.cskk.ezoshosting.com/cs/ George, discussing a patent and prior art: Look, this publication has a date, the patent has a priority date, can't you just compare them? Paul Sutcliffe: Not unless you're a lawyer. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Fri, 25 Jun 2010 12:25:56 +1200, Lawrence D'Oliveiro wrote:

> Just been reading this article ... which says that a lot of security holes are arising these days because everybody is concentrating on unit testing of their own particular components, with less attention being devoted to overall integration testing.
>
> Fair enough. But it’s disconcerting to see some of the advice being offered in the reader comments, like “force everyone to use stored procedures”, or “force everyone to use prepared/parametrized statements”, “never construct ad-hoc SQL queries” and the like.
>
> I construct ad-hoc queries all the time. It really isn’t that hard to do safely.

Wrong.

Even if you get the quoting absolutely correct (which is a very big if), you have to remember to perform it every time, without exception. And you need to perform it exactly once. As the program gets more complex, ensuring that it's done in the correct place, and only there, gets harder.

More generally, as a program gets more complex, "this will work so long as we do X every time without fail" approaches "this won't work".

> All you have to do is read the documentation -- for example, http://dev.mysql.com/doc/refman/5.0/en/string-syntax.html -- and then write a routine that takes arbitrary data and turns it into a valid string literal, like this http://www.codecodex.com/wiki/Useful_MySQL_Routines#Quoting.

That's okay. Provided the documentation is accurate. And provided that you update the escaping algorithm whenever the SQL dialect gets extended, or you switch to a different back-end, or modify the program.

IOW, it's not even remotely okay.

Unparsing data so that you get the correct answer out of a subsequent parsing step is objectively and obviously the wrong approach. The correct approach is to skip both the unparsing and parsing steps entirely.

Formal grammars are a useful way to represent graph-like data structures in a human-readable and human-editable form. But for creation, modification and use by a computer, it is invariably preferable to operate upon the graph directly. Textual formats inherit all of the issues which apply to the underlying data structure, then add a few of their own for good measure.

> I've done this sort of thing for MySQL, for HTML and JavaScript (in both Python and JavaScript itself), and for Bash.

And, of course, you're convinced that you got it right every time. That attitude alone should set alarm bells ringing for anyone who's worked in this industry for more than five minutes.
Re: Why Is Escaping Data Considered So Magical?
Nobody nob...@nowhere.com writes:

> More generally, as a program gets more complex, "this will work so long as we do X every time without fail" approaches "this won't work".

QOTW
Re: Why Is Escaping Data Considered So Magical?
On Fri, 2010-06-25, Lawrence D'Oliveiro wrote:

> Just been reading this article http://www.theregister.co.uk/2010/06/23/xxs_sql_injection_attacks_testing_remedy/ which says that a lot of security holes are arising these days because everybody is concentrating on unit testing of their own particular components, with less attention being devoted to overall integration testing.

I don't do SQL and I don't even understand the terminology properly ... but the discussion around it bothers me.

Do those people really do this?

- accept untrusted user data
- try to sanitize the data (escaping certain characters etc)
- turn this data into executable code (SQL)
- execute it

Like the example in the article

    SELECT * FROM hotels WHERE city = 'untrusted';

If so, it's isomorphic with doing os.popen('zcat -f %s' % untrusted) in Python (at least on Unix, where 'zcat ...' is executed as a shell script).

I thought it was well-known that the solution is *not* to try to sanitize the input -- it's to switch to an interface which doesn't involve generating an intermediate executable. In the Python example, that would be something like os.popen2(['zcat', '-f', '--', untrusted]).

Am I missing something? If not, I can go back to sleep -- and keep avoiding SQL and web programming like the plague until that community has entered the 21st century.

/Jorgen

-- // Jorgen Grahn grahn@ Oo o. . . \X/ snipabacken.se O o .
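The list-of-arguments interface described above can be sketched with the subprocess module, which is the current replacement for os.popen2. To keep the example runnable anywhere, the child process here is the Python interpreter itself echoing its argument rather than zcat; that substitution, and the sample value of untrusted, are assumptions made for illustration.

```python
import subprocess
import sys

# A value that would do damage if it ever reached a shell.
untrusted = "notes.txt; echo INJECTED"

# Passing an argv list means no shell parses the string: the whole value
# arrives in the child process as exactly one argument.
result = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout.strip()
print(received == untrusted)
```

There is nothing to sanitize because no intermediate command string is ever generated.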
Re: Why Is Escaping Data Considered So Magical?
On Fri, Jun 25, 2010 at 5:15 AM, Jorgen Grahn grahn+n...@snipabacken.se wrote:

> Am I missing something? If not, I can go back to sleep -- and keep avoiding SQL and web programming like the plague until that community has entered the 21st century.

You're not missing anything. It's been the accepted industry practice for years and years (and /years/), the taught industry practice, the advised industry practice, the constantly repeated practice on every even vaguely database related forum forever now.

However:

a) Some people are convinced of their own infallibility, and prefer a clever construct generating a string that has to be parsed due to the cleverness of said construct.
b) Some people don't listen / understand.
c) Some people don't care.

And so, SQL injection attacks continue to persist.

Then again, it's not like anyone in the C-ish world doesn't know about bounds checking on arrays, do they? But buffer overflows persist. Probably for similar reasons as above (with slightly different 'and prefer' clause)

--Stephen
Re: Why Is Escaping Data Considered So Magical?
On 6/25/2010 12:09 AM, Paul Rubin wrote:

> Nobody nob...@nowhere.com writes:
>
>> More generally, as a program gets more complex, "this will work so long as we do X every time without fail" approaches "this won't work".

Yes. I was just looking at some of my own code. Out of about 100 SQL statements, I'd used manual escaping once, in code where the WHERE clause is built up depending on what information is available for the search. It's done properly, using MySQLdb.escape_string(s), which is what's used inside cursor.execute.

Looking at the code, I now realize that it would have been better to add sections to the SQL string with standard escapes, and at the same time, append the key items to a list. Then the list can be converted to a tuple for submission to cursor.execute.

John Nagle
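The approach described in that last paragraph might look roughly like the sketch below: append a placeholder fragment to the SQL and the matching value to a list in the same step. It assumes the standard-library sqlite3 module (placeholder "?") in place of MySQLdb (placeholder "%s"); the build_search function, the items table, and its columns are hypothetical names for the example.

```python
import sqlite3

def build_search(make=None, model=None, location=None):
    """Build a SELECT whose WHERE clause depends on which criteria were supplied."""
    clauses = []
    params = []
    # Column names are fixed in the code; only the values come from outside.
    for column, value in (("make", make), ("model", model), ("location_name", location)):
        if value is not None:
            clauses.append(column + " = ?")
            params.append(value)
    sql = "SELECT inventory_nr FROM items"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, tuple(params)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (inventory_nr TEXT, make TEXT, model TEXT, location_name TEXT)"
)
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?, ?)",
    [("A1", "Acme", "X100", "O'Neil's office"), ("B2", "Acme", "X200", "Lab")],
)

sql, params = build_search(make="Acme", location="O'Neil's office")
rows = conn.execute(sql, params).fetchall()
print(sql, params, rows)
```

The SQL text stays dynamic, but no data value is ever spliced into it.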
Re: Why Is Escaping Data Considered So Magical?
On Fri, 25 Jun 2010 12:15:08 +, Jorgen Grahn wrote:

> I don't do SQL and I don't even understand the terminology properly ... but the discussion around it bothers me.
>
> Do those people really do this?

Yes. And then some. Among web developers, the median level of programming knowledge amounts to the first 3 chapters of "Learn PHP in 7 Days".

It doesn't help that the guy who wrote PHP itself wasn't much better.

> - accept untrusted user data
> - try to sanitize the data (escaping certain characters etc)
> - turn this data into executable code (SQL)
> - executing it
>
> Like the example in the article
>
>     SELECT * FROM hotels WHERE city = 'untrusted';

Yep. Search the BugTraq archives for "SQL injection". And most of those are for widely-deployed middleware; the zillions of bespoke site-specific scripts are likely to be worse.

Also: http://xkcd.com/327/

> I thought it was well-known that the solution is *not* to try to sanitize the input

Well known by anyone with a reasonable understanding of the principles of programming, but somewhat less well known by the other 98% of web developers.

> Am I missing something?

There's a world of difference between a skilled chef and the people flipping burgers for a minimum wage. And between a chartered civil engineer and the people laying the asphalt. And between what you probably consider a programmer and the people doing most web development.

> If not, I can go back to sleep -- and keep avoiding SQL and web programming like the plague until that community has entered the 21st century.

Don't hold your breath. Of course, there's no fundamental reason why you can't apply sound practices to web development. Well, other than the fact that you're competing against an infinite number of (code-) monkeys for lowest-bidder contracts.

To be fair, it isn't actually limited to web developers. I've seen the following in scientific code written in C (or, more likely, ported to C from Fortran) for Unix:

    sprintf(buff, "rm -f %s", filename);
    system(buff);

Why bother learning the Unix API when you already know system()?
Re: Why Is Escaping Data Considered So Magical?
On Fri, Jun 25, 2010 at 5:17 PM, Nobody nob...@nowhere.com wrote:

> To be fair, it isn't actually limited to web developers. I've seen the following in scientific code written in C (or, more likely, ported to C from Fortran) for Unix:
>
>     sprintf(buff, "rm -f %s", filename);
>     system(buff);

Tsk, tsk. And it's so easy to fix, too:

    #define BUFSIZE 100
    char buff[BUFSIZE];
    if (snprintf(buff, BUFSIZE, "rm -f %s", filename) >= BUFSIZE) {
        printf("No buffer overflow for you!\n");
    } else {
        system(buff);
    }

There, that's much more secure.
Re: Why Is Escaping Data Considered So Magical?
In message pan.2010.06.25.06.47.34.297...@nowhere.com, Nobody wrote:

> On Fri, 25 Jun 2010 12:25:56 +1200, Lawrence D'Oliveiro wrote:
>
>> I construct ad-hoc queries all the time. It really isn’t that hard to do safely.
>
> Wrong. Even if you get the quoting absolutely correct (which is a very big if), you have to remember to perform it every time, without exception. More generally, as a program gets more complex, "this will work so long as we do X every time without fail" approaches "this won't work".

That’s a content-free claim. Why? Because it applies equally to everything. Replace “quoting” with something like “arithmetic”, and you’ll see what I mean:

    Even if you get the arithmetic absolutely correct (which is a very big if), you have to remember to perform it every time, without exception. More generally, as a program gets more complex, "this will work so long as we do X every time without fail" approaches "this won't work".

From which we can conclude, according to your logic, that one shouldn’t be doing arithmetic. Next time, try to avoid fallacious arguments.

> And you need to perform it exactly once. As the program gets more complex, ensuring that it's done in the correct place, and only there, gets harder.

Nonsense. It only needs to be done at the boundary to the appropriate component (MySQL, HTML, JavaScript, whatever). That’s the only place which needs to have knowledge of what’s on the other side. Everything else can work with arbitrary data without having to worry about such things.

Go back to my example, and you’ll see this: the original updates two dozen different fields in a database table, yet it only needs two calls to SQLString: one deals with all the fields requiring updating, while the other one deals with the key-matching. That’s it. Instead of two dozen different places needing checking, you only have two.

That’s what “maintainability” is all about.
Re: Why Is Escaping Data Considered So Magical?
In article mailman.2117.1277511935.32709.python-l...@python.org, Ian Kelly ian.g.ke...@gmail.com wrote:

> On Fri, Jun 25, 2010 at 5:17 PM, Nobody nob...@nowhere.com wrote:
>
>> To be fair, it isn't actually limited to web developers. I've seen the following in scientific code written in C (or, more likely, ported to C from Fortran) for Unix:
>>
>>     sprintf(buff, "rm -f %s", filename);
>>     system(buff);
>
> Tsk, tsk. And it's so easy to fix, too:
>
>     #define BUFSIZE 100
>     char buff[BUFSIZE];
>     if (snprintf(buff, BUFSIZE, "rm -f %s", filename) >= BUFSIZE) {
>         printf("No buffer overflow for you!\n");
>     } else {
>         system(buff);
>     }
>
> There, that's much more secure.

I recently fixed a bug in some production code. The programmer was careful to use snprintf() to avoid buffer overflows. The only problem is, he wrote something along the lines of:

    snprintf(buf, strlen(foo), foo);

I'm sure the code got reviewed originally, and probably looked at dozens of times over the years. Nobody caught the problem until we ran a static code analysis tool (Coverity) over it.

To bring this back to something remotely Python related, the point of all this is that security is hard. A lot of the security best practices (such as "don't compose SQL queries on the fly with externally tainted strings") exist because they address ways that people have gotten burned in the past. It is foolish to think that you're smarter than everybody else and have thought of every possibility to avoid getting burned by doing the things that have gotten other people in trouble.
Re: Why Is Escaping Data Considered So Magical?
In message mailman.2046.1277445301.32709.python-l...@python.org, Cameron Simpson wrote: On 25Jun2010 15:38, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote: | In message 2010062422432660794-angrybald...@gmailcom, Owen Jacobson | wrote: | Why would I write this when SQLAlchemy, even without using its ORM | features, can do it for me? | | SQLAlchemy doesn’t seem very flexible. Looking at the code examples | http://www.sqlalchemy.org/docs/examples.html, they’re very procedural: | build object, then do a string of separate method calls to add data to | it. I prefer the functional approach, as in my table-update example. He said without using its ORM. I noticed that. So were those examples I referenced above “using its ORM”? Can you offer better examples “without using its ORM”? -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn wrote: I thought it was well-known that the solution is *not* to try to sanitize the input -- it's to switch to an interface which doesn't involve generating an intermediate executable. In the Python example, that would be something like os.popen2(['zcat', '-f', '--', untrusted]). That’s what I mean. Why do people consider input sanitization so hard? -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On 2010-06-25 20:49:09 -0400, Lawrence D'Oliveiro said: In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn wrote: I thought it was well-known that the solution is *not* to try to sanitize the input -- it's to switch to an interface which doesn't involve generating an intermediate executable. In the Python example, that would be something like os.popen2(['zcat', '-f', '--', untrusted]). That’s what I mean. Why do people consider input sanitization so hard? It's not hard. It's just begging for a visit from the fuckup fairy. -o -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On 2010-06-25 19:47 , Lawrence D'Oliveiro wrote: In messagemailman.2046.1277445301.32709.python-l...@python.org, Cameron Simpson wrote: On 25Jun2010 15:38, Lawrence D'Oliveirol...@geek-central.gen.new_zealand wrote: | In message2010062422432660794-angrybald...@gmailcom, Owen Jacobson | wrote: | Why would I write this when SQLAlchemy, even without using its ORM | features, can do it for me? | | SQLAlchemy doesn’t seem very flexible. Looking at the code examples |http://www.sqlalchemy.org/docs/examples.html, they’re very procedural: | build object, then do a string of separate method calls to add data to | it. I prefer the functional approach, as in my table-update example. He said without using its ORM. I noticed that. So were those examples I referenced above “using its ORM”? Can you offer better examples “without using its ORM”? http://www.sqlalchemy.org/docs/sqlexpression.html -- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
On Fri, Jun 25, 2010 at 5:49 PM, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:

> In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn wrote:
>
>> I thought it was well-known that the solution is *not* to try to sanitize the input -- it's to switch to an interface which doesn't involve generating an intermediate executable. In the Python example, that would be something like os.popen2(['zcat', '-f', '--', untrusted]).
>
> That’s what I mean. Why do people consider input sanitization so hard?

It's not that it is hard, it's that it has to be done with care: and when an interface provides you two methods to pass it data, one that requires it to parse a string to get at your data (thus requiring careful sanitization), and one that is a direct channel where no parsing is required and the data is directly passed through memory and bypasses the need for any sanitization ... preference for the latter seems pretty darn obvious to me.

Use a method that does not add an extra security concern to the application or system = best practice.

When that method *also* provides positive performance characteristics on top of alleviating a security concern, and even gets rid of a lot of data type conversion details you shouldn't really need to worry about, well. Using that method seems pretty much an obvious choice to me.

If the only reason not to use it is so you can produce ghoulish spaghetti code like in the first post, I think that's a count in PQ's favor :)

--S
Re: Why Is Escaping Data Considered So Magical?
On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote: In the Python example, that would be something like os.popen2(['zcat', '-f', '--', untrusted]). That’s what I mean. Why do people consider input sanitization so hard? It's hard because it requires thinking. Sadly, many of the people I know who call themselves programmers couldn't code their way out of a paper bag, let alone think logically about the security implications of their code.[1] -tkc [1] much of which ends up being cargo-cult programming, cut-n-paste'd from Google search-results. -- http://mail.python.org/mailman/listinfo/python-list
Re: Why Is Escaping Data Considered So Magical?
In article i00t2k$l0...@lust.ihug.co.nz, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:

> I construct ad-hoc queries all the time. It really isn’t that hard to do safely. All you have to do is read the documentation

I get worried when people talk about how easy it is to do something safely. Let me suggest a couple of things you might not have considered:

1) Somebody is running your application (or the database server) with the locale set to something unexpected. This might change how numbers, dates, currency, etc, get formatted, which could change the meaning of your constructed SQL statement.

2) Somebody runs your application with a different PYTHONPATH, which causes a different (i.e. malicious) urllib module to get loaded, which makes urllib.quote() do something you didn't expect.

> I’ve done this sort of thing for MySQL, for HTML and JavaScript (in both Python and JavaScript itself), and for Bash. It’s not hard to verify you’ve done it correctly. It lets you easily create table-updating code like the following, which makes it so easy to update the code to track changes in the database structure:
>
>     sql.cursor.execute \
>       (
>             "update items set "
>         +
>             ", ".join
>               (
>                     tuple
>                       (
>                             "%(name)s = %(value)s"
>                         %
>                             {
>                                 "name" : field[0],
>                                 "value" : SQLString(Params.getvalue
>                                   (
>                                     "%s[%s]" % (field[1], urllib.quote(modify_id))
>                                   ))
>                             }
>                         for field in
>                             (
>                                 ("class_name", "modify_class"),
>                                 ("make", "modify_make"),
>                                 ("model", "modify_model"),
>                                 ("details", "modify_details"),
>                                 ("serial_nr", "modify_serial"),
>                                 ("inventory_nr", "modify_invent"),
>                                 ("when_purchased", "modify_when_purchased"),
>                                 ... you get the idea ...
>                                 ("location_name", "modify_location"),
>                                 ("comment", "modify_comment"),
>                             )
>                       )
>                 +
>                     ("last_modified = %d" % int(time.time()),)
>               )
>         +
>             " where inventory_nr = %s" % SQLString(modify_id)
>       )
Re: Why Is Escaping Data Considered So Magical?
On 2010-06-24 21:02:48 -0400, Roy Smith said:

> In article i00t2k$l0...@lust.ihug.co.nz, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
>
>> I construct ad-hoc queries all the time. It really isn’t that hard to do safely. All you have to do is read the documentation
>
> I get worried when people talk about how easy it is to do something safely.

First: I agree with this. While it's definitely possible to correctly escape a given SQL dialect under controlled conditions, it's not at all easy to get it right, and the real world is even more unfriendly than most people expect. Furthermore there's no reason to do it that way: Python's DB API spec effectively requires that placeholder parameters of *some* kind exist. Even if you feel the need to construct SQL, you can construct it with parameters almost as easily as you can construct it with the values baked in.

With that said...

> 2) Somebody runs your application with a different PYTHONPATH, which causes a different (i.e. malicious) urllib module to get loaded, which makes urllib.quote() do something you didn't expect.

Someone who can manipulate PYTHONPATH or otherwise add code to the runtime environment is already in a position to hose your database, independently of escaping-related issues. It's up to the sysadmin or user to ensure that their environment is sane, and it's on their head if they add broken code to a program's runtime environment.

Lawrence D'Oliveiro wrote:

> I've done this sort of thing for MySQL, for HTML and JavaScript (in both Python and JavaScript itself), and for Bash. It’s not hard to verify you’ve done it correctly. It lets you easily create table-updating code like the following, which makes it so easy to update the code to track changes in the database structure:
>
>     sql.cursor.execute \
>       (
>             "update items set "
>         +
>             ", ".join
>               (
>                     tuple
>                       (
>                             "%(name)s = %(value)s"
>                         %
>                             {
>                                 "name" : field[0],
>                                 "value" : SQLString(Params.getvalue
>                                   (
>                                     "%s[%s]" % (field[1], urllib.quote(modify_id))
>                                   ))
>                             }
>                         for field in
>                             (
>                                 ("class_name", "modify_class"),
>                                 ("make", "modify_make"),
>                                 ("model", "modify_model"),
>                                 ("details", "modify_details"),
>                                 ("serial_nr", "modify_serial"),
>                                 ("inventory_nr", "modify_invent"),
>                                 ("when_purchased", "modify_when_purchased"),
>                                 ... you get the idea ...
>                                 ("location_name", "modify_location"),
>                                 ("comment", "modify_comment"),
>                             )
>                       )
>                 +
>                     ("last_modified = %d" % int(time.time()),)
>               )
>         +
>             " where inventory_nr = %s" % SQLString(modify_id)
>       )

Why would I write this when SQLAlchemy, even without using its ORM features, can do it for me? It even uses the placeholder-generating strategy I mentioned above, where possible.

Finally, it's worth noting that MySQL is (almost) the only mainstream database that uses escaping for parameterization. PostgreSQL, SQL Server, Oracle, DB2, and most other databases support parameters natively in their communication protocols: parameters aren't injected into the query string, but are sent separately and processed separately within the DBMS. This neatly avoids encoding-related and quoting-related problems entirely, and it means the type of the parameter can be preserved if it's useful.

-o
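As a rough sketch of "construct it with parameters almost as easily", the table-update example can be written so that the generated SQL contains only placeholders and the values travel in a separate list. This uses the standard-library sqlite3 module (placeholder "?") rather than MySQL, a shortened field list, and a plain dict named form standing in for the Params.getvalue() lookups; all of those are assumptions for the illustration, not the original poster's code.

```python
import sqlite3
import time

# Hypothetical stand-in for the submitted form values.
form = {"make": "Acme", "model": "X'100", "details": 'said "ok"; DROP TABLE items'}
modify_id = "INV-7"

# Column names are fixed in the code and never taken from input.
fields = ("make", "model", "details")

sql = (
    "UPDATE items SET "
    + ", ".join("%s = ?" % name for name in fields)
    + ", last_modified = ? WHERE inventory_nr = ?"
)
params = [form[name] for name in fields] + [int(time.time()), modify_id]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (inventory_nr TEXT, make TEXT, model TEXT, "
    "details TEXT, last_modified INTEGER)"
)
conn.execute("INSERT INTO items (inventory_nr) VALUES (?)", (modify_id,))
conn.execute(sql, params)

row = conn.execute(
    "SELECT make, model, details FROM items WHERE inventory_nr = ?", (modify_id,)
).fetchone()
print(sql)
print(row)
```

The field list is still written once and the statement is still generated from it, but there is no quoting routine left to call or forget.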
Re: Why Is Escaping Data Considered So Magical?
In message roy-30b881.21024824062...@news.panix.com, Roy Smith wrote:

1) Somebody is running your application (or the database server) with the locale set to something unexpected.

Locales are under program control, so that won’t happen. This is why I use UTF-8 encoding for everything.
Re: Why Is Escaping Data Considered So Magical?
In message 2010062422432660794-angrybald...@gmailcom, Owen Jacobson wrote:

Why would I write this when SQLAlchemy, even without using its ORM features, can do it for me?

SQLAlchemy doesn’t seem very flexible. Looking at the code examples http://www.sqlalchemy.org/docs/examples.html, they’re very procedural: build object, then do a string of separate method calls to add data to it. I prefer the functional approach, as in my table-update example.
Re: Why Is Escaping Data Considered So Magical?
On 25Jun2010 15:38, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:
| In message 2010062422432660794-angrybald...@gmailcom, Owen Jacobson wrote:
| Why would I write this when SQLAlchemy, even without using its ORM
| features, can do it for me?
|
| SQLAlchemy doesn’t seem very flexible. Looking at the code examples
| http://www.sqlalchemy.org/docs/examples.html, they’re very procedural:
| build object, then do a string of separate method calls to add data to it. I
| prefer the functional approach, as in my table-update example.

He said without using its ORM. I do what you suggest (make SQL statements at need) using SQLAlchemy all the time. It is simple and easy and _robust_ against odd data. The number of times I've had to fix/remove insert-values-into-SQL-text code ...

--
Cameron Simpson c...@zip.com.au DoD#743
http://www.cskk.ezoshosting.com/cs/

Plague, Famine, Pestilence, and C++ stalk the land. We're doomed! Doomed!
- Simon E Spero
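A small sketch of what "robust against odd data" can mean in practice. It assumes the standard-library sqlite3 module and a made-up two-column table; SQLAlchemy is not needed to see the effect, and none of the names below come from the thread. The same value is inserted twice: once pasted into the SQL text with no escaping at all, and once bound as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table items (inventory_nr text, comment text)")

# Hypothetical "odd" value: apostrophe, percent sign, double quotes,
# and something that looks like an SQL comment.
odd_comment = "it's 100% \"odd\"; -- data"

# Insert-values-into-SQL-text: the apostrophe ends the string literal
# early, so the statement no longer parses.
naive_sql = "insert into items values ('A-1', '%s')" % odd_comment
try:
    conn.execute(naive_sql)
    naive_ok = True
except sqlite3.Error:
    naive_ok = False

# Bound parameter: the value is handed to the driver separately from
# the statement text, so its contents never get parsed as SQL.
conn.execute("insert into items values (?, ?)", ("A-1", odd_comment))
stored = conn.execute("select comment from items").fetchone()[0]
```

A correct escaping function would also get the first insert right; the sketch only shows that the parameterised form has no such function to get wrong.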