Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-03 Thread Lawrence D'Oliveiro
In message mailman.182.1278126257.1673.python-l...@python.org, Rami 
Chowdhury wrote:

 I'm sorry, perhaps you've misunderstood what I was refuting. You posted:
  macro:
  #define Descr(v) &v, sizeof v
  
  As written, this works whatever the type of v: array, struct,
  whatever.
 
 With my code example I found that, as others have pointed out,
 unfortunately it doesn't work if v is a pointer to a heap-allocated area.

It still correctly passes the address and size of that pointer variable. If 
that’s not what you intended, you shouldn’t use it.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Why Is Escaping Data Considered So Magical?

2010-07-03 Thread Lawrence D'Oliveiro
In message mailman.2121.1277522302.32709.python-l...@python.org, Robert 
Kern wrote:

 On 2010-06-25 19:47 , Lawrence D'Oliveiro wrote:

 In message mailman.2046.1277445301.32709.python-l...@python.org, Cameron
 Simpson wrote:

 On 25Jun2010 15:38, Lawrence
 D'Oliveiro l...@geek-central.gen.new_zealand wrote:

 | In message 2010062422432660794-angrybald...@gmailcom, Owen Jacobson
 | wrote:

 |  Why would I write this when SQLAlchemy, even without using its ORM
 |  features, can do it for me?
 |
 | SQLAlchemy doesn’t seem very flexible. Looking at the code examples
 | http://www.sqlalchemy.org/docs/examples.html, they’re very
 | procedural: build object, then do a string of separate method calls to
 | add data to it. I prefer the functional approach, as in my table-update
 | example.

 He said without using its ORM.

 I noticed that. So were those examples I referenced above “using its
 ORM”? Can you offer better examples “without using its ORM”?
 
 http://www.sqlalchemy.org/docs/sqlexpression.html

Still full of very repetitive boilerplate. Doesn’t look like it can create a 
simpler alternative to my example at all.


Re: Why Is Escaping Data Considered So Magical?

2010-07-03 Thread Lawrence D'Oliveiro
In message mailman.2128.1277537954.32709.python-l...@python.org, Robert 
Kern wrote:

 On 2010-06-25 19:49 , Lawrence D'Oliveiro wrote:

 Why do people consider input sanitization so hard?
 
 It's not hard per se; it's just repetitive, prone to the occasional
 mistake, and, frankly, really boring.

But as a programmer, I’m not in the habit of doing “repetitive” and 
“boring”. Look at the example I posted, and you’ll see. It’s the ones trying 
to come up with alternatives to my code who produce things that look 
“repetitive” and “boring”.


Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-02 Thread Lawrence D'Oliveiro
In message mailman.136.1278040489.1673.python-l...@python.org, Rami 
Chowdhury wrote:

 On Thursday 01 July 2010 16:50:59 Lawrence D'Oliveiro wrote:

 Nevertheless, it is at least self-consistent. To return to my original
 macro:
 
 #define Descr(v) &v, sizeof v
 
 As written, this works whatever the type of v: array, struct, whatever.
 
 Doesn't seem to, sorry. Using Michael Torrie's code example, slightly
 modified...
 
 char *buf = malloc(512 * sizeof(char));

Again, you misunderstand the difference between a C array and a pointer. 
Study the following example, which does work, and you might grasp the point:

l...@theon:hack cat test.c
#include <stdio.h>

int main(int argc, char ** argv)
  {
    char buf[512];
    const int a = 2, b = 3;
    snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
    fprintf(stdout, buf);
    return
        0;
  } /*main*/
l...@theon:hack ./test
2 + 3 = 5



Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-02 Thread Rami Chowdhury
On Friday 02 July 2010 19:20:26 Lawrence D'Oliveiro wrote:
 In message mailman.136.1278040489.1673.python-l...@python.org, Rami
 Chowdhury wrote:
  On Thursday 01 July 2010 16:50:59 Lawrence D'Oliveiro wrote:
  Nevertheless, it is at least self-consistent. To return to my original
  macro:
  
  #define Descr(v) &v, sizeof v
  
  As written, this works whatever the type of v: array, struct, whatever.
  
  Doesn't seem to, sorry. Using Michael Torrie's code example, slightly
  modified...
  
  char *buf = malloc(512 * sizeof(char));
 
 Again, you misunderstand the difference between a C array and a pointer.
 Study the following example, which does work, and you might grasp the
 point:
 
 l...@theon:hack cat test.c
 #include <stdio.h>
 
 int main(int argc, char ** argv)
   {
     char buf[512];
     const int a = 2, b = 3;
     snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
     fprintf(stdout, buf);
     return
         0;
   } /*main*/
 l...@theon:hack ./test
 2 + 3 = 5

I'm sorry, perhaps you've misunderstood what I was refuting. You posted:
  macro:
  #define Descr(v) &v, sizeof v
  
  As written, this works whatever the type of v: array, struct, whatever.

With my code example I found that, as others have pointed out, unfortunately it 
doesn't work if v is a pointer to a heap-allocated area. 


Rami Chowdhury
A man with a watch knows what time it is. A man with two watches is never
sure. -- Segal's Law
+1-408-597-7068 / +44-7875-841-046 / +88-01819-245544


Re: Why Is Escaping Data Considered So Magical?

2010-07-02 Thread David Cournapeau
On Mon, Jun 28, 2010 at 6:44 PM, Gregory Ewing
greg.ew...@canterbury.ac.nz wrote:
 Carl Banks wrote:

 Indeed, strncpy does not copy that final NUL if it's at or beyond the
 nth element.  Probably the most mind-bogglingly stupid thing about the
 standard C library, which has lots of mind-boggling stupidity.

 I don't think it was as stupid as that back when C was
 designed

Actually, strncpy had a very specific use case when it was introduced
(dealing with limited-size entries in very old unix filesystem). It
should never be used for C string handling, and I don't think it is
fair to say it is stupid: it does exactly what it was designed for. It
just happens that most people don't know what it was designed for.

David


Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-01 Thread Jorgen Grahn
On Wed, 2010-06-30, Michael Torrie wrote:
 On 06/30/2010 03:00 AM, Jorgen Grahn wrote:
 On Wed, 2010-06-30, Michael Torrie wrote:
 On 06/29/2010 10:17 PM, Michael Torrie wrote:
 On 06/29/2010 10:05 PM, Michael Torrie wrote:
 #include <stdio.h>

 int main(int argc, char ** argv)
 {
    char *buf = malloc(512 * sizeof(char));
    const int a = 2, b = 3;
    snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
                   ^^^^^^^^^^
 Make that 512*sizeof(buf)

 Sigh.  Try again.  How about 512 * sizeof(char) ?  Still doesn't make
 a difference.  The code still crashes because the buf is incorrect.
 
 I haven't tried to understand the rest ... but never write
 'sizeof(char)' unless you might change the type later. 'sizeof(char)'
 is by definition 1 -- even on odd-ball architectures where a char is
 e.g. 16 bits.

 You're right.  I normally don't use sizeof(char).  This is obviously a
 contrived example; I just wanted to make the example such that there's
 no way the original poster could argue that the crash is caused by
 something other than buf.

 Then again, it's always a bad idea in C to make assumptions about
 anything.

There are some things you cannot assume, others which few fellow
programmers can care to memorize, and others which you often can get
away with (like assuming an int is more than 16 bits, when your code
is tied to a modern Unix anyway).

But sizeof(char) is always 1.

 If you're on Windows and want to use the unicode versions of
 everything, you'd need to do sizeof().  So using it here would remind
 you that when you move to the 16-bit Microsoft unicode versions of
 snprintf, you need to change the sizeof(char) lines as well to sizeof(wchar_t).

Yes -- see "unless you might change the type later" above.
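
For readers who would rather check this from Python than from a C compiler, the sketch below uses the standard ctypes module to report the sizes the local C ABI uses. The helper name buffer_bytes is made up for illustration, and the wchar_t size it reports depends on the platform.

```python
import ctypes

# sizeof(char) is 1 by definition in C, so multiplying by it is a no-op.
CHAR_SIZE = ctypes.sizeof(ctypes.c_char)

# sizeof(wchar_t) is platform-dependent (commonly 2 on Windows, 4 on Linux),
# which is the case where spelling out the element size actually matters.
WCHAR_SIZE = ctypes.sizeof(ctypes.c_wchar)


def buffer_bytes(element_type, count):
    """Bytes needed for `count` elements of a ctypes type (hypothetical helper)."""
    return count * ctypes.sizeof(element_type)


if __name__ == "__main__":
    print("sizeof(char)    =", CHAR_SIZE)
    print("sizeof(wchar_t) =", WCHAR_SIZE)
    print("512 chars   need", buffer_bytes(ctypes.c_char, 512), "bytes")
    print("512 wchar_t need", buffer_bytes(ctypes.c_wchar, 512), "bytes")
```

The char figure is the same everywhere; only the wchar_t line changes between platforms, which is the one situation where writing the element size out earns its keep.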

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-01 Thread Nobody
On Wed, 30 Jun 2010 23:40:06 -0600, Michael Torrie wrote:

 Given char buf[512], buf's type is char * according to the compiler
 and every C textbook I know of.

No, the type of buf is char [512], i.e. array of 512 chars. If you
use buf as an rvalue (rather than an lvalue), it will be implicitly
converted to char*.

If you take its address, you'll get a pointer to array of 512 chars,
i.e. a pointer to the array rather than to the first element. Converting
this to a char* will yield a pointer to the first element.

If buf was declared char *buf, then taking its address will yield a
char**, and converting this to a char* will produce a pointer to the first
byte of the pointer, which is unlikely to be useful.



Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-01 Thread Mel
Nobody wrote:
 On Wed, 30 Jun 2010 23:40:06 -0600, Michael Torrie wrote:
 Given char buf[512], buf's type is char * according to the compiler
 and every C textbook I know of.

References from Kernighan & Ritchie _The C Programming Language_ second 
edition:

 No, the type of buf is char [512], i.e. array of 512 chars. If you
 use buf as an rvalue (rather than an lvalue), it will be implicitly
 converted to char*.

K&R2 A7.1

 If you take its address, you'll get a pointer to array of 512 chars,
 i.e. a pointer to the array rather than to the first element. Converting
 this to a char* will yield a pointer to the first element.

K&R2 A7.4.2


Mel.



Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-01 Thread Michael Torrie
On 07/01/2010 01:24 AM, Nobody wrote:
 No, the type of buf is char [512], i.e. array of 512 chars. If you
 use buf as an rvalue (rather than an lvalue), it will be implicitly
 converted to char*.

Yes this is true.  I misstated.  I meant that most text books I've seen
say to just use the variable in an *rvalue* as a pointer (can't think of
any lvalue use of an array).

K&R states that arrays (in C anyway) are always *passed* by pointer,
hence when you pass an array to a function it automatically decays into
a pointer.  Which is what you said.  So no need for & and the compiler
warning you get with it.  That's all.

If the OP was striving for pedantic correctness, he would use &buf[0].


Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-01 Thread John Nagle

On 7/1/2010 8:36 AM, Mel wrote:

Nobody wrote:

On Wed, 30 Jun 2010 23:40:06 -0600, Michael Torrie wrote:

Given char buf[512], buf's type is char * according to the compiler
and every C textbook I know of.


References from Kernighan & Ritchie _The C Programming Language_ second
edition:


No, the type of buf is char [512], i.e. array of 512 chars. If you
use buf as an rvalue (rather than an lvalue), it will be implicitly
converted to char*.


   Yes, unfortunately.  The approach to arrays in C is just broken,
for historical reasons.  To understand C, you have to realize that
in the early versions, function declarations weren't visible when 
function calls were compiled.  That came later, in ANSI C. So
parameter passing in C is very dumb.  Billions of crashes due
to buffer overflows later, we're still suffering from that mistake.

   But this isn't a Python issue.

John Nagle


Re: Why Is Escaping Data Considered So Magical?

2010-07-01 Thread Lawrence D'Oliveiro
In message mailman.2370.1277871088.32709.python-l...@python.org, Michael 
Torrie wrote:

 On 06/29/2010 06:26 PM, Lawrence D'Oliveiro wrote:
 I'm not sure you understood me correctly, because I advocate
 *not* doing input sanitization. Hard or not -- I don't want to know,
 because I don't want to do it.
 
 But no-one has yet managed to come up with an alternative that involves
 less work.
 
 Your case is still not persuasive.

So persuade me. I have given an example of code written the way I do it. Now 
let’s see you rewrite it using your preferred technique, just to prove that 
your way is simpler and easier to understand.

Enough hand-waving, let’s see some code!


Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-01 Thread Lawrence D'Oliveiro
In message 4c2ccd9c$0$1643$742ec...@news.sonic.net, John Nagle wrote:

 The approach to arrays in C is just broken, for historical reasons.

Nevertheless, it is at least self-consistent. To return to my original 
macro:

#define Descr(v) &v, sizeof v

As written, this works whatever the type of v: array, struct, whatever.

 So parameter passing in C is very dumb.

Nothing to do with the above issue.


Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?

2010-07-01 Thread Rami Chowdhury
On Thursday 01 July 2010 16:50:59 Lawrence D'Oliveiro wrote:
 Nevertheless, it is at least self-consistent. To return to my original
 macro:
 
 #define Descr(v) &v, sizeof v
 
 As written, this works whatever the type of v: array, struct, whatever.
 

Doesn't seem to, sorry. Using Michael Torrie's code example, slightly 
modified...

[r...@tigris ~]$ cat example.c 
#include <stdio.h>

#define Descr(v) &v, sizeof v

int main(int argc, char ** argv)
{
    char *buf = malloc(512 * sizeof(char));
    const int a = 2, b = 3;
    snprintf(Descr(buf), "%d + %d = %d\n", a, b, a + b);
    fprintf(stdout, buf);
    free(buf);
    return 0;
} /*main*/

[r...@tigris ~]$ clang example.c 
example.c:11:18: warning: incompatible pointer types passing 'char **',
expected 'char *' [-pedantic]
snprintf(Descr(buf), "%d + %d = %d\n", a, b, a + b);
 ^~
example.c:4:18: note: instantiated from:
   
#define Descr(v) &v, sizeof v
 ^~~~
snip
[r...@tigris ~]$ ./a.out 
Segmentation fault



Rami Chowdhury
Passion is inversely proportional to the amount of real information available.
-- Benford's Law of Controversy
+1-408-597-7068 / +44-7875-841-046 / +88-01819-245544


Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-30 Thread Jorgen Grahn
On Wed, 2010-06-30, Michael Torrie wrote:
 On 06/29/2010 10:17 PM, Michael Torrie wrote:
 On 06/29/2010 10:05 PM, Michael Torrie wrote:
 #include <stdio.h>

 int main(int argc, char ** argv)
 {
    char *buf = malloc(512 * sizeof(char));
    const int a = 2, b = 3;
    snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
                   ^^^^^^^^^^
 Make that 512*sizeof(buf)

 Sigh.  Try again.  How about 512 * sizeof(char) ?  Still doesn't make
 a difference.  The code still crashes because the buf is incorrect.

I haven't tried to understand the rest ... but never write
'sizeof(char)' unless you might change the type later. 'sizeof(char)'
is by definition 1 -- even on odd-ball architectures where a char is
e.g. 16 bits.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Ancient C string conventions (was Re: Why Is Escaping Data Considered So Magical?)

2010-06-30 Thread Jorgen Grahn
On Wed, 2010-06-30, Carl Banks wrote:
 On Jun 28, 2:44 am, Gregory Ewing greg.ew...@canterbury.ac.nz wrote:
 Carl Banks wrote:
  Indeed, strncpy does not copy that final NUL if it's at or beyond the
  nth element.  Probably the most mind-bogglingly stupid thing about the
  standard C library, which has lots of mind-boggling stupidity.

 I don't think it was as stupid as that back when C was
 designed. Every byte of memory was precious in those days,
 and if you had, say, 10 bytes allocated for a string, you
 wanted to be able to use all 10 of them for useful data.

 So the convention was that a NUL byte was used to mark
 the end of the string *if it didn't fill all the available
 space*.

 I can't think of any function in the standard library that observes
 that convention,

Me neither, except strncpy(), according to above.

 which inclines me to disbelieve this convention ever
 really existed.  If it did, there would be functions to support it.

Maybe others existed, but got killed off early. That would make
strncpy() a living fossil, like the Coelacanth ...

 For that matter, I'm not really inclined to believe bytes were *that*
 precious in those days.

It's somewhat believable. If I handled thousands of student names in a
big C array char[30][], I would resent the fact that 1/30 of the
memory was wasted on NUL bytes.  I'm sure plenty of people have done what
Gregory suggests ... but it's not clear that strncpy() was designed to
support those people.

I suppose it's all lost in history.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-30 Thread Cameron Simpson
On 29Jun2010 21:49, Carl Banks pavlovevide...@gmail.com wrote:
| On Jun 28, 2:44 am, Gregory Ewing greg.ew...@canterbury.ac.nz wrote:
|  Carl Banks wrote:
|   Indeed, strncpy does not copy that final NUL if it's at or beyond the
|   nth element.  Probably the most mind-bogglingly stupid thing about the
|   standard C library, which has lots of mind-boggling stupidity.
| 
|  I don't think it was as stupid as that back when C was
|  designed. Every byte of memory was precious in those days,
|  and if you had, say, 10 bytes allocated for a string, you
|  wanted to be able to use all 10 of them for useful data.
| 
|  So the convention was that a NUL byte was used to mark
|  the end of the string *if it didn't fill all the available
|  space*.
| 
| I can't think of any function in the standard library that observes
| that convention, which inclines me to disbelieve this convention ever
| really existed.  If it did, there would be functions to support it.
| 
| For that matter, I'm not really inclined to believe bytes were *that*
| precious in those days.

Jeez. PDP-11s, 16 bit addressing, tiny tiny disc drives!

The original V7 (and probably earlier) UNIX filesystem has 16 byte directory
entries: 2 bytes for an inode and 14 bytes for the name. You could use 14
bytes of that name, and strncpy makes it effective to work with that data
structure.  
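
The layout described above can be sketched with Python's struct module. The little-endian byte order and the helper names pack_entry and unpack_entry are assumptions made for illustration; only the split into a 2-byte inode and a 14-byte name comes from the description.

```python
import struct

# Assumed layout, taken from the description above: a 2-byte inode number
# followed by a 14-byte name field, 16 bytes in total. Little-endian byte
# order is an assumption made for this sketch.
DIRENT = struct.Struct("<H14s")


def pack_entry(inode, name):
    """Pack one entry; struct's 's' format truncates or NUL-pads the name."""
    return DIRENT.pack(inode, name)


def unpack_entry(raw):
    """Return (inode, name); the name ends at the first NUL, if there is one."""
    inode, field = DIRENT.unpack(raw)
    return inode, field.split(b"\0", 1)[0]


if __name__ == "__main__":
    short = pack_entry(7, b"passwd")
    full = pack_entry(8, b"fourteen_chars")  # exactly 14 bytes, no NUL stored
    print(len(short), unpack_entry(short))
    print(len(full), unpack_entry(full))
```

A name that fills the field is stored with no terminator at all, which is the convention under discussion: the NUL appears only when there is room left over.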

Shortening something already only 14 bytes (the name) _is_ a big ask,
and it is well worth the unusual convention in play.

| The obvious rationale behind strncpy's stupid behavior is that it's
| not a string function at all, but a memory block function, that stops
| at a NUL in case you don't care what's after the NUL in a block.  But
| it leads you to believe it's a string function by its name.

Bah. It's for copying a _string_ into a _buffer_! Strangely, since it
starts with a string (NUL-terminated byte sequence) it begins with
str. And it _is_ copying, but not into another string.

It is special purpose but perfectly reasonable for the problem at hand.
-- 
Cameron Simpson c...@zip.com.au DoD#743
http://www.cskk.ezoshosting.com/cs/

If it ain't broken, keep playing with it.


Re: Why Is Escaping Data Considered So Magical?

2010-06-30 Thread Roy Smith
In article mailman.14.1277891765.1673.python-l...@python.org,
 Cameron Simpson c...@zip.com.au wrote:

 Jeez. PDP-11s, 16 bit addressing, tiny tiny disc drives!

What you talking about, tiny?  An RK-05 was huge!  Why would anybody 
ever need more than that?

 The original V7 (and probably earlier) UNIX filesystem has 16 byte directory
 entries

Certainly earlier.  I used v6, and it was like that there.  I'm 
reasonably sure it pre-dated v6, however.


Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-30 Thread Michael Torrie
On 06/30/2010 03:00 AM, Jorgen Grahn wrote:
 On Wed, 2010-06-30, Michael Torrie wrote:
 On 06/29/2010 10:17 PM, Michael Torrie wrote:
 On 06/29/2010 10:05 PM, Michael Torrie wrote:
 #include <stdio.h>

 int main(int argc, char ** argv)
 {
    char *buf = malloc(512 * sizeof(char));
    const int a = 2, b = 3;
    snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
                   ^^^^^^^^^^
 Make that 512*sizeof(buf)

 Sigh.  Try again.  How about 512 * sizeof(char) ?  Still doesn't make
 a difference.  The code still crashes because the buf is incorrect.
 
 I haven't tried to understand the rest ... but never write
 'sizeof(char)' unless you might change the type later. 'sizeof(char)'
 is by definition 1 -- even on odd-ball architectures where a char is
 e.g. 16 bits.

You're right.  I normally don't use sizeof(char).  This is obviously a
contrived example; I just wanted to make the example such that there's
no way the original poster could argue that the crash is caused by
something other than buf.

Then again, it's always a bad idea in C to make assumptions about
anything.  If you're on Windows and want to use the unicode versions of
everything, you'd need to do sizeof().  So using it here would remind
you that when you move to the 16-bit Microsoft unicode versions of
snprintf, you need to change the sizeof(char) lines as well to sizeof(wchar_t).


Re: Why Is Escaping Data Considered So Magical?

2010-06-30 Thread Carl Banks
On Jun 30, 2:55 am, Cameron Simpson c...@zip.com.au wrote:
 On 29Jun2010 21:49, Carl Banks pavlovevide...@gmail.com wrote:
 | On Jun 28, 2:44 am, Gregory Ewing greg.ew...@canterbury.ac.nz wrote:
 |  Carl Banks wrote:
 |   Indeed, strncpy does not copy that final NUL if it's at or beyond the
 |   nth element.  Probably the most mind-bogglingly stupid thing about the
 |   standard C library, which has lots of mind-boggling stupidity.
 | 
 |  I don't think it was as stupid as that back when C was
 |  designed. Every byte of memory was precious in those days,
 |  and if you had, say, 10 bytes allocated for a string, you
 |  wanted to be able to use all 10 of them for useful data.
 | 
 |  So the convention was that a NUL byte was used to mark
 |  the end of the string *if it didn't fill all the available
 |  space*.
 |
 | I can't think of any function in the standard library that observes
 | that convention, which inclines me to disbelieve this convention ever
 | really existed.  If it did, there would be functions to support it.
 |
 | For that matter, I'm not really inclined to believe bytes were *that*
 | precious in those days.

 Jeez. PDP-11s, 16 bit addressing, tiny tiny disc drives!

 The original V7 (and probably earlier) UNIX filesystem has 16 byte directory
 entries: 2 bytes for an inode and 14 bytes for the name. You could use 14
 bytes of that name, and strncpy makes it effective to work with that data
 structure.  

 Shortening something already only 14 bytes (the name) _is_ a big ask,
 and it is well worth the unusual convention in play.

You are talking about fixed-length memory records, not strings.

I'm saying that bytes were not so precious that, when you operate on
*actual strings*, that you need to desperately cut off nul terminators
to save space.


 | The obvious rationale behind strncpy's stupid behavior is that it's
 | not a string function at all, but a memory block function, that stops
 | at a NUL in case you don't care what's after the NUL in a block.  But
 | it leads you to believe it's a string function by its name.

 Bah. It's for copying a _string_ into a _buffer_! Strangely, since it
 starts with a string (NUL-terminated byte sequence) it begins with
 str. And it _is_ copying, but not into another string.

I'm going to disagree.  The input of strncpy can be either a string or
a memory block, and the output can only be a memory block.  In other
words, neither the source nor destination has to be a string.  This is
a memory block function, not a string function.  The correct name for
this function should have been memcpytonul.

Even if you disagree, then you must admit it should have been called
strcpytobuf.  Nothing about the name strncpy gives the slightest
suggestion that the destination is not a string.  Based on analogy
from other str functions, none of which have any sources or
destinations that are memory blocks, one would logically expect that
strncpy's destination was a string.  It defies common sense.

And there should have been an actual, correctly working strncpy in the
standard library that copies and truncates actual strings.


 It is special purpose but perfectly reasonable for the problem at hand.

The usefulness of strncpy's behavior for writing fixed-length memory
blocks is not in question here.  The thing that's mind-bogglingly
stupid is that the function that does this is called strncpy.


Carl Banks


Re: Why Is Escaping Data Considered So Magical?

2010-06-30 Thread Paul Rubin
Cameron Simpson c...@zip.com.au writes:
 The original V7 (and probably earlier) UNIX filesystem has 16 byte directory
 entries: 2 bytes for an inode and 14 bytes for the name. You could use 14
 bytes of that name, and strncpy makes it effective to work with that data
 structure.  

Why not use memcpy for that?


Re: Why Is Escaping Data Considered So Magical?

2010-06-30 Thread Cameron Simpson
On 30Jun2010 12:19, Paul Rubin no.em...@nospam.invalid wrote:
| Cameron Simpson c...@zip.com.au writes:
|  The original V7 (and probably earlier) UNIX filesystem has 16 byte directory
|  entries: 2 bytes for an inode and 14 bytes for the name. You could use 14
|  bytes of that name, and strncpy makes it effective to work with that data
|  structure.  
| 
| Why not use memcpy for that?

Because when you've pulled names _out_ of the directory structure they're
conventional C strings, ready for conventional C string mucking about:
NUL terminated, with no expectation that any memory is allocated beyond
the NUL.

Think of strncpy as a conversion function. Your source is a conventional
C string of unknown size, your destination is a NUL padded buffer of
known size. Copy at most n bytes of this string into the buffer, pad
with NULs.
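
The conversion described above can be modelled in a few lines of Python. This is a sketch of the behaviour only, not the C library's code; strncpy_model and is_c_string are hypothetical names.

```python
def strncpy_model(src, n):
    """Model of what C strncpy leaves in a destination buffer of n bytes.

    `src` stands for a C string: only the bytes before its first NUL count.
    """
    text = src.split(b"\0", 1)[0]    # a C string stops at the first NUL
    return text[:n].ljust(n, b"\0")  # copy at most n bytes, pad with NULs


def is_c_string(buf):
    """True if the buffer holds a terminating NUL somewhere."""
    return b"\0" in buf


if __name__ == "__main__":
    for source in (b"abc", b"abcde", b"abcdefgh"):
        result = strncpy_model(source, 5)
        print(source, "->", result, "terminated:", is_c_string(result))
```

The result is always exactly n bytes long. It is a valid C string only when the source was shorter than n, which is the behaviour both sides of this thread are arguing about.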

Cheers,
-- 
Cameron Simpson c...@zip.com.au DoD#743
http://www.cskk.ezoshosting.com/cs/

We had the experience, but missed the meaning.  - T.S. Eliot


Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-30 Thread Lawrence D'Oliveiro
In message mailman.2369.1277870727.32709.python-l...@python.org, Michael 
Torrie wrote:

 Okay, I will. Your code passes a char** when a char* is expected.

No it doesn’t.

 Consider this variation where I use a dynamically allocated buffer
 instead of static:

And so you misunderstand the difference between a C array and a pointer.


Re: [farther OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-30 Thread Michael Torrie
On 06/30/2010 06:36 PM, Lawrence D'Oliveiro wrote:
 In message mailman.2369.1277870727.32709.python-l...@python.org,
 Michael Torrie wrote:
 
 Okay, I will. Your code passes a char** when a char* is expected.
 
 No it doesn’t.

You're right; it doesn't.  Your code passes char (*)[512].

warning: passing argument 1 of ‘snprintf’ from incompatible pointer type
/usr/include/stdio.h:385: note: expected ‘char * __restrict__’ but
argument is of type ‘char (*)[512]’

 And so you misunderstand the difference between a C array and a
 pointer.

You make a pretty big assumption.

Given char buf[512], buf's type is char * according to the compiler
and every C textbook I know of.  With a static char array, there's no
need to take its address since it *is* the address of the first
element.  Taking the address can lead to problems if you ever substitute
a dynamically-allocated buffer for the statically-allocated one.  For
one-dimensional arrays at least, static arrays and pointers are
interchangeable when calling snprintf.  You do not agree?

Anyway, this is far enough away from Python.


Re: Why Is Escaping Data Considered So Magical?

2010-06-29 Thread Mark Lawrence

On 29/06/2010 01:55, Roy Smith wrote:

[snips]


The nice thing about null-terminated strings is how portable they have
been over various word lengths.


The bad thing about null-terminated strings is the number of off-by-one 
errors they've helped to create.  I obviously have never created an 
off-by-one error myself. :)


Kindest regards.

Mark Lawrence.




Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-29 Thread Lawrence D'Oliveiro
In message mailman.2332.1277785175.32709.python-l...@python.org, Kushal 
Kumaran wrote:

 On Tue, Jun 29, 2010 at 5:56 AM, Lawrence D'Oliveiro
 l...@geek-central.gen.new_zealand wrote:

 Why does this work, then:

 l...@theon:hack cat test.c
 #include <stdio.h>

 int main(int argc, char ** argv)
   {
     char buf[512];
     const int a = 2, b = 3;
     snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
     fprintf(stdout, buf);
     return
         0;
   } /*main*/
 l...@theon:hack ./test
 2 + 3 = 5
 
 By accident.

I have yet to find an architecture or C compiler where it DOESN’T work.

Feel free to try and prove me wrong.


Re: Why Is Escaping Data Considered So Magical?

2010-06-29 Thread Lawrence D'Oliveiro
In message slrni2f8v2.j19.grahn+n...@frailea.sa.invalid, Jorgen Grahn 
wrote:

 On Sat, 2010-06-26, Lawrence D'Oliveiro wrote:

 In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn
 wrote:

 I thought it was well-known that the solution is *not* to try to
 sanitize the input -- it's to switch to an interface which doesn't
 involve generating an intermediate executable.  In the Python example,
 that would be something like os.popen2(['zcat', '-f', '--', untrusted]).

 That’s what I mean. Why do people consider input sanitization so hard?
 
 I'm not sure you understood me correctly, because I advocate
 *not* doing input sanitization. Hard or not -- I don't want to know,
 because I don't want to do it.

But no-one has yet managed to come up with an alternative that involves less 
work.
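
The interface Jorgen describes passes the arguments as a list, so no shell ever parses them. os.popen2 belongs to Python 2; the sketch below assumes Python 3.7 or later and uses subprocess.run instead. sys.executable stands in for zcat so that the example runs anywhere, and echo_argument is a made-up helper.

```python
import subprocess
import sys


def echo_argument(untrusted):
    """Run a child process with `untrusted` as one argv entry, no shell."""
    # sys.executable stands in for a real tool such as zcat so the sketch
    # runs anywhere; the child just prints the argument it received.
    child = [sys.executable, "-c", "import sys; print(sys.argv[1])", untrusted]
    result = subprocess.run(child, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    print(echo_argument("name with spaces; echo pwned"), end="")
```

The semicolon and the spaces reach the child as ordinary characters of a single argument, with no quoting or escaping step anywhere in the caller.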


Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-29 Thread Michael Torrie
On 06/29/2010 06:25 PM, Lawrence D'Oliveiro wrote:
 I have yet to find an architecture or C compiler where it DOESN’T work.
 
 Feel free to try and prove me wrong.

Okay, I will. Your code passes a char** when a char* is expected.  Every
compiler I know of will give you a *warning*.  Mistaking char*, char**,
and char[] is a common mistake that almost every C programmer makes in the
beginning.  Now for the proof:

Consider this variation where I use a dynamically allocated buffer
instead of static:

#include <stdio.h>

int main(int argc, char ** argv)
{
    char *buf = malloc(512 * sizeof(char));
    const int a = 2, b = 3;
    snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
    fprintf(stdout, buf);
    free(buf);
    return 0;
} /*main*/

On my machine, an immediate segfault (stack overrun).  Your code only
works because your buf is statically allocated, which means &buf==buf.
But this equivalence does not hold for any other situation.  If your
buffer was dynamically allocated on the heap, instead of passing a
pointer to the buffer (which *is* what buf itself is), you are passing a
pointer to the pointer, which is where buf is stored on the stack, but
not the buffer itself.  Instant stack corruption.
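
The difference can be observed from Python with ctypes. This is only an analogy for the C code above: descr is a hypothetical stand-in for passing the address and size of a variable, and create_string_buffer stands in for malloc.

```python
import ctypes


def descr(v):
    """Rough analogue of passing the address and size of a C variable."""
    return ctypes.addressof(v), ctypes.sizeof(v)


# char buf[512];  -- the variable *is* the storage.
array_buf = (ctypes.c_char * 512)()
array_data_addr = ctypes.cast(array_buf, ctypes.c_void_p).value

# char *buf = malloc(512);  -- the variable only holds the storage's address.
heap_block = ctypes.create_string_buffer(512)            # stands in for malloc
pointer_buf = ctypes.cast(heap_block, ctypes.c_void_p)   # the pointer variable
pointer_data_addr = pointer_buf.value                    # where the bytes live

if __name__ == "__main__":
    print("array:  ", descr(array_buf), "data at", array_data_addr)
    print("pointer:", descr(pointer_buf), "data at", pointer_data_addr)
```

For the array, the address and size describe the 512 bytes of data. For the pointer, they describe the pointer variable itself: a different address and only a pointer's worth of bytes.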


Re: Why Is Escaping Data Considered So Magical?

2010-06-29 Thread Michael Torrie
On 06/29/2010 06:26 PM, Lawrence D'Oliveiro wrote:
 I'm not sure you understood me correctly, because I advocate
 *not* doing input sanitization. Hard or not -- I don't want to know,
 because I don't want to do it.
 
 But no-one has yet managed to come up with an alternative that involves less 
 work.

Your case is still not persuasive.

How is using the DB API's placeholders and parameterization more work?
It's the same amount of keystrokes, perhaps even less.  You would just
be substituting the API's parameter placeholders for Python's.  In fact
with Psycopg2 and the mysql python db apis, it's almost a matter of
simply removing the % and putting in a comma, turning python's string
substitution into a method call.  And you can leave out the quotes
around where the variables go.  If I have to sanitize every input, I
have to do it on each and every field on each and every form action.
With the DB API doing the work I just do it once, in one place.  Is this
not easier than manually escaping everything and then embedding it in
the query string?
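
A minimal sketch of the change described above, using the standard-library sqlite3 module as a stand-in for Psycopg2 or a MySQL adapter (those adapters use %s rather than ? as the placeholder). The table, column names and sample value are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (inventory_nr TEXT, comment TEXT)")
conn.execute("INSERT INTO items VALUES ('A1', 'old')")

untrusted = "it's a trap'; DROP TABLE items; --"

# String substitution would need correct quoting and escaping by hand:
#   conn.execute("UPDATE items SET comment = '%s' WHERE ..." % untrusted)
# With placeholders the values travel separately from the SQL text:
conn.execute(
    "UPDATE items SET comment = ? WHERE inventory_nr = ?",
    (untrusted, "A1"),
)

stored = conn.execute(
    "SELECT comment FROM items WHERE inventory_nr = ?", ("A1",)
).fetchone()[0]
print(stored == untrusted)
```

The only difference from string formatting at the call site is that the values are passed as a second argument instead of being interpolated with the % operator.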

I've not used sqlalchemy, but it looks similarly easy.



Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-29 Thread Michael Torrie
On 06/29/2010 10:05 PM, Michael Torrie wrote:
 #include <stdio.h>
 
 int main(int argc, char ** argv)
 {
   char *buf = malloc(512 * sizeof(char));
   const int a = 2, b = 3;
   snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
                  ^^^^^^^^^^
Make that 512*sizeof(buf)

Still segfaults though.

   fprintf(stdout, buf);
   free(buf);
   return 0;
 } /*main*/


Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-29 Thread Michael Torrie
On 06/29/2010 10:17 PM, Michael Torrie wrote:
 On 06/29/2010 10:05 PM, Michael Torrie wrote:
 #include <stdio.h>

 int main(int argc, char ** argv)
 {
  char *buf = malloc(512 * sizeof(char));
  const int a = 2, b = 3;
  snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
                 ^^^^^^^^^^
 Make that 512*sizeof(buf)

Sigh.  Try again.  How about 512 * sizeof(char) ?  Still doesn't make
a difference.  The code still crashes because the &buf is incorrect.

Another reason python programming is just so much funner and easier!

This little diversion is fun though.  C is pretty powerful and I enjoy
it, but it sure keeps one on one's toes.  I made a similar mistake to
the &buf thing years ago when I thought I could return strings (char *)
from functions on the stack the way Pascal and BASIC could.  It was only
by pure luck that my code worked as the part of the stack being accessed
was invalid and could have been overwritten.

  fprintf(stdout, buf);
  free(buf);
  return 0;
 } /*main*/



Re: Why Is Escaping Data Considered So Magical?

2010-06-29 Thread Carl Banks
On Jun 28, 2:44 am, Gregory Ewing greg.ew...@canterbury.ac.nz wrote:
 Carl Banks wrote:
  Indeed, strncpy does not copy that final NUL if it's at or beyond the
  nth element.  Probably the most mind-bogglingly stupid thing about the
  standard C library, which has lots of mind-boggling stupidity.

 I don't think it was as stupid as that back when C was
 designed. Every byte of memory was precious in those days,
 and if you had, say, 10 bytes allocated for a string, you
 wanted to be able to use all 10 of them for useful data.

 So the convention was that a NUL byte was used to mark
 the end of the string *if it didn't fill all the available
 space*.

I can't think of any function in the standard library that observes
that convention, which inclines me to disbelieve this convention ever
really existed.  If it did, there would be functions to support it.

For that matter, I'm not really inclined to believe bytes were *that*
precious in those days.

 Functions such as strncpy and snprintf are designed
 for use with strings that follow this convention. Proper
 usage requires being cognizant of the maximum length and
 using appropriate length-limited functions for all operations
 on such strings.

Well, no.  Being cognizant of the string's maximum length doesn't make
you able to pass it to printf, or system, or any other C function.

The obvious rationale behind strncpy's stupid behavior is that it's
not a string function at all, but a memory block function, that stops
at a NUL in case you don't care what's after the NUL in a block.  But
it leads you to believe it's a string function by its name.


Carl Banks


Re: Why Is Escaping Data Considered So Magical?

2010-06-28 Thread Stephen Hansen

On 6/26/10 7:21 PM, Lawrence D'Oliveiro wrote:

In message mailman.2123.1277522976.32709.python-l...@python.org, Tim Chase
wrote:


On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote:
...


I see that you published my unobfuscated e-mail address on USENET for all to
see. I obfuscated it for a reason, to keep the spammers away. I'm assuming
this was a momentary lapse of judgement, for which I expect an apology.
Otherwise, it becomes grounds for an abuse complaint to your ISP.


Wow.

Way to be a douchebag.

I was going to say something about the realities of this forum and its 
dual-nature and conflicting netiquette and on. But I decided it really 
just had no point.


So, I'm left with: wow. You kinda suck*, man.

--

   ... Stephen Hansen
   ... Also: Ixokai
   ... Mail: me+list/python (AT) ixokai (DOT) io
   ... Blog: http://meh.ixokai.io/

P.S. *Then again, I'm fairly sure anytime someone has a form letter 
which contains the words "I expect an apology", there's some personal 
suck going on.




Re: Why Is Escaping Data Considered So Magical?

2010-06-28 Thread Jorgen Grahn
On Mon, 2010-06-28, Kushal Kumaran wrote:
 On Mon, Jun 28, 2010 at 2:00 AM, Jorgen Grahn grahn+n...@snipabacken.se 
 wrote:
 On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
 In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

 I recently fixed a bug in some production code.  The programmer was
 careful to use snprintf() to avoid buffer overflows.  The only problem
 is, he wrote something along the lines of:

 snprintf(buf, strlen(foo), foo);

 A long while ago I came up with this macro:

     #define Descr(v) &v, sizeof v

 making the correct version of the above become

     snprintf(Descr(buf), foo);

 This is off-topic, but I believe snprintf() in C can *never* safely be
 the only thing you do to the buffer: you also have to NUL-terminate it
 manually in some corner cases. See the documentation.

 snprintf goes to great lengths to be safe, in fact.  You might be
 thinking of strncpy.

Yes, it was indeed strncpy I was thinking of. Thanks.

But actually, the snprintf(3) man page I have is not 100% clear on
this issue, so last time I used it, I added a manual NUL-termination
plus a comment saying I wasn't sure it was needed.  I normally use C++
or Python, so I am a bit rusty on these things.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-28 Thread Gregory Ewing

Carl Banks wrote:


Indeed, strncpy does not copy that final NUL if it's at or beyond the
nth element.  Probably the most mind-bogglingly stupid thing about the
standard C library, which has lots of mind-boggling stupidity.


I don't think it was as stupid as that back when C was
designed. Every byte of memory was precious in those days,
and if you had, say, 10 bytes allocated for a string, you
wanted to be able to use all 10 of them for useful data.

So the convention was that a NUL byte was used to mark
the end of the string *if it didn't fill all the available
space*. Functions such as strncpy and snprintf are designed
for use with strings that follow this convention. Proper
usage requires being cognizant of the maximum length and
using appropriate length-limited functions for all operations
on such strings.

--
Greg


Re: Why Is Escaping Data Considered So Magical?

2010-06-28 Thread Paul Rubin
Gregory Ewing greg.ew...@canterbury.ac.nz writes:
 I don't think it was as stupid as that back when C was
 designed. Every byte of memory was precious in those days,
 and if you had, say, 10 bytes allocated for a string, you
 wanted to be able to use all 10 of them for useful data.

No I don't think so.  Traditional C strings simply didn't carry length
info except for the nul byte at the end.  Most string functions expected
the nul to be there.  The nul byte convention (instead of having a
header word with a length) arguably saved some space both by eliminating
a multi-byte header and by allowing trailing substrings to be
represented as pointers into a larger string.  In retrospect it seems
like a big error.


Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-28 Thread Lawrence D'Oliveiro
In message mailman.2231.1277700501.32709.python-l...@python.org, Kushal 
Kumaran wrote:

 On Sun, Jun 27, 2010 at 5:16 PM, Lawrence D'Oliveiro
 l...@geek-central.gen.new_zealand wrote:

In message mailman.2184.1277626565.32709.python-l...@python.org, Kushal
 Kumaran wrote:

 On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro
 l...@geek-central.gen.new_zealand wrote:

 A long while ago I came up with this macro:

 #define Descr(v) &v, sizeof v

 making the correct version of the above become

 snprintf(Descr(buf), foo);

 Not quite right.  If buf is a char array, as suggested by the use of
 sizeof, then you're not passing a char* to snprintf.

 What am I passing, then?
 
 Here's what gcc tells me (I declared buf as char buf[512]):
 sprintf.c:8: warning: passing argument 1 of ‘snprintf’ from
 incompatible pointer type
 /usr/include/stdio.h:363: note: expected ‘char * __restrict__’ but
 argument is of type ‘char (*)[512]’
 
 You just need to lose the & from the macro.

Why does this work, then:

l...@theon:hack cat test.c
#include <stdio.h>

int main(int argc, char ** argv)
  {
    char buf[512];
    const int a = 2, b = 3;
    snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
    fprintf(stdout, buf);
    return
        0;
  } /*main*/
l...@theon:hack ./test
2 + 3 = 5



Re: Why Is Escaping Data Considered So Magical?

2010-06-28 Thread Lawrence D'Oliveiro
In message pan.2010.06.27.13.55.04.500...@nowhere.com, Nobody wrote:

 On Sun, 27 Jun 2010 14:36:10 +1200, Lawrence D'Oliveiro wrote:
 
 Except nobody has yet shown an alternative which is easier to get right.
 
 For SQL, use stored procedures or prepared statements.

So feel free to rewrite my example using either stored procedures or 
prepared statements, to prove how much easier it is.


Re: Why Is Escaping Data Considered So Magical?

2010-06-28 Thread Roy Smith
In article 7xmxuffpxp@ruckus.brouhaha.com,
 Paul Rubin no.em...@nospam.invalid wrote:

 Gregory Ewing greg.ew...@canterbury.ac.nz writes:
  I don't think it was as stupid as that back when C was
  designed. Every byte of memory was precious in those days,
  and if you had, say, 10 bytes allocated for a string, you
  wanted to be able to use all 10 of them for useful data.
 
 No I don't think so.  Traditional C strings simply didn't carry length
 info except for the nul byte at the end.  Most string functions expected
 the nul to be there.  The nul byte convention (instead of having a
 header word with a length) arguably saved some space both by eliminating
 a multi-byte header and by allowing trailing substrings to be
 represented as pointers into a larger string.  In retrospect it seems
 like a big error.

Null-terminated strings predate C.  Various assembler languages had 
ASCIIZ (or similar) directives long before that.

The nice thing about null-terminated strings is how portable they have 
been over various word lengths.  Life would have been truly inconvenient 
if K&R had picked, say, a 16-bit length field, and then we needed to 
bump that up to 32 bits in the 80's, and again to 64 bits in the 90's.


Re: Why Is Escaping Data Considered So Magical?

2010-06-28 Thread Steven D'Aprano
On Mon, 28 Jun 2010 20:55:53 -0400, Roy Smith wrote:

 The nice thing about null-terminated strings is how portable they have
 been over various word lengths.  Life would have been truly inconvenient
 if K&R had picked, say, a 16-bit length field, and then we needed to
 bump that up to 32 bits in the 80's, and again to 64 bits in the 90's.

Or a Pascal 8 bit length field.

However the cost of null-terminated strings is that they can't store 
binary data, and worse, they're slow. In fact, according to some, null-
terminated strings are the *worst* way to implement a string type.

http://www.joelonsoftware.com/articles/fog000319.html
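
The binary-data limitation can be seen from Python with ctypes. This is a small sketch rather than a benchmark, and the one-byte length prefix at the end is only a hand-rolled imitation of a Pascal-style string: reading a buffer as a C string stops at the first NUL, while a length-prefixed encoding keeps every byte but caps the length.

```python
import ctypes

data = b"ab\x00cd"

# ctypes.create_string_buffer makes a C char array holding the bytes
# plus a trailing NUL.
buf = ctypes.create_string_buffer(data)

# .value reads the buffer as a C string: it stops at the first NUL,
# so the bytes after it are invisible to string-style access.
as_c_string = buf.value

# .raw returns every byte in the buffer, including the terminator.
whole_buffer = buf.raw

# A length-prefixed encoding (one length byte, Pascal style) keeps the
# embedded NUL, at the cost of capping the length at 255.
prefixed = bytes([len(data)]) + data
recovered = prefixed[1:1 + prefixed[0]]

print(as_c_string, whole_buffer, recovered)
```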



-- 
Steven


Re: [OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-28 Thread Kushal Kumaran
On Tue, Jun 29, 2010 at 5:56 AM, Lawrence D'Oliveiro
l...@geek-central.gen.new_zealand wrote:
 In message mailman.2231.1277700501.32709.python-l...@python.org, Kushal
 Kumaran wrote:

 On Sun, Jun 27, 2010 at 5:16 PM, Lawrence D'Oliveiro
 l...@geek-central.gen.new_zealand wrote:

In message mailman.2184.1277626565.32709.python-l...@python.org, Kushal
 Kumaran wrote:

 On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro
 l...@geek-central.gen.new_zealand wrote:

 A long while ago I came up with this macro:

 #define Descr(v) &v, sizeof v

 making the correct version of the above become

 snprintf(Descr(buf), foo);

 Not quite right.  If buf is a char array, as suggested by the use of
 sizeof, then you're not passing a char* to snprintf.

 What am I passing, then?

 Here's what gcc tells me (I declared buf as char buf[512]):
 sprintf.c:8: warning: passing argument 1 of ‘snprintf’ from
 incompatible pointer type
 /usr/include/stdio.h:363: note: expected ‘char * __restrict__’ but
 argument is of type ‘char (*)[512]’

 You just need to lose the & from the macro.

 Why does this work, then:

 l...@theon:hack cat test.c
 #include <stdio.h>

 int main(int argc, char ** argv)
  {
    char buf[512];
    const int a = 2, b = 3;
     snprintf(&buf, sizeof buf, "%d + %d = %d\n", a, b, a + b);
    fprintf(stdout, buf);
    return
        0;
  } /*main*/
 l...@theon:hack ./test
 2 + 3 = 5


By accident.  I hope your compiler warned you about your snprintf call.

Reading these threads might help you understand how char* and char
(*)[512] are different:

http://groups.google.com/group/comp.lang.c++/browse_thread/thread/24708a9204061ce/848ceaf5ec774d81

http://groups.google.com/group/comp.lang.c.moderated/browse_thread/thread/fe264c550947a2e5/32b330cdf8aba3d6

-- 
regards,
kushal


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Ian Kelly
On Sat, Jun 26, 2010 at 8:31 PM, Lawrence D'Oliveiro
l...@geek-central.gen.new_zealand wrote:
 Except I only needed two calls to SQLString, while you need two dozen
 instances of that repetitive items.c boilerplate.

 As a human, being repetitive is not my job. That’s what the computer is for.

Then why do you have every parameter prefixed with modify_? 8-)

But seriously, if that bothers you, then fold the items.c. portion
into the generator expression with a getattr call.  Or just change
them back to the same strings you had originally, and sqlalchemy will
be just as happy to accept them as-is.

Cheers,
Ian


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Kushal Kumaran
On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro
l...@geek-central.gen.new_zealand wrote:
 In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

 I recently fixed a bug in some production code.  The programmer was
 careful to use snprintf() to avoid buffer overflows.  The only problem
 is, he wrote something along the lines of:

 snprintf(buf, strlen(foo), foo);

 A long while ago I came up with this macro:

     #define Descr(v) &v, sizeof v

 making the correct version of the above become

    snprintf(Descr(buf), foo);


Not quite right.  If buf is a char array, as suggested by the use of
sizeof, then you're not passing a char* to snprintf.  You need to lose
the & in your macro.

-- 
regards,
kushal


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Lawrence D'Oliveiro
In message mailman.2184.1277626565.32709.python-l...@python.org, Kushal 
Kumaran wrote:

 On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro
 l...@geek-central.gen.new_zealand wrote:

 In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

 I recently fixed a bug in some production code.  The programmer was
 careful to use snprintf() to avoid buffer overflows.  The only problem
 is, he wrote something along the lines of:

 snprintf(buf, strlen(foo), foo);

 A long while ago I came up with this macro:

 #define Descr(v) &v, sizeof v

 making the correct version of the above become

 snprintf(Descr(buf), foo);
 
 Not quite right.  If buf is a char array, as suggested by the use of
 sizeof, then you're not passing a char* to snprintf.

What am I passing, then?


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Lawrence D'Oliveiro
In message mailman.2183.1277623909.32709.python-l...@python.org, Ian Kelly 
wrote:

 On Sat, Jun 26, 2010 at 8:31 PM, Lawrence D'Oliveiro
 l...@geek-central.gen.new_zealand wrote:

 Except I only needed two calls to SQLString, while you need two dozen
 instances of that repetitive items.c boilerplate.

 As a human, being repetitive is not my job. That’s what the computer is
 for.
 
 Then why do you have every parameter prefixed with modify_? 8-)

Touché :). Actually it’s because the same form can be used to add a new 
record to the table, so there’s a separate set of input fields for that.

 But seriously, if that bothers you, then fold the items.c. portion
 into the generator expression with a getattr call.  Or just change
 them back to the same strings you had originally, and sqlalchemy will
 be just as happy to accept them as-is.

All this trouble, and it only gets rid of 2 of the 3 instances of data-
escaping in the example.


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Nobody
On Sun, 27 Jun 2010 14:36:10 +1200, Lawrence D'Oliveiro wrote:

 In any case, you're still trying to make arguments about whether it's easy
 or hard to get it right, which completely misses the point. Eliminating
 the escaping entirely makes it impossible to get it wrong.
 
 Except nobody has yet shown an alternative which is easier to get right.

For SQL, use stored procedures or prepared statements. For HTML/XML, use a
DOM (or similar) interface.
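
For the XML side, a minimal sketch with the standard-library xml.etree.ElementTree module, chosen here only as one example of a "DOM (or similar)" interface; the element name and sample string are made up. The data is attached to the tree as a value, and the serializer applies whatever escaping the output format needs.

```python
import xml.etree.ElementTree as ET

untrusted = '<script>alert("x")</script> & more'

# Build the element tree from data; no markup is assembled by hand.
para = ET.Element("p")
para.set("title", untrusted)
para.text = untrusted

# Serialization applies whatever escaping the format requires.
markup = ET.tostring(para, encoding="unicode")
print(markup)

# Parsing the result gives back the original string unchanged.
round_trip = ET.fromstring(markup)
```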


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Jorgen Grahn
On Sat, 2010-06-26, Lawrence D'Oliveiro wrote:
 In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn 
 wrote:

 I thought it was well-known that the solution is *not* to try to
 sanitize the input -- it's to switch to an interface which doesn't
 involve generating an intermediate executable.  In the Python example,
 that would be something like os.popen2(['zcat', '-f', '--', untrusted]).

 That’s what I mean. Why do people consider input sanitization so hard?

I'm not sure you understood me correctly, because I advocate
*not* doing input sanitization. Hard or not -- I don't want to know,
because I don't want to do it.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Jorgen Grahn
On Fri, 2010-06-25, Nobody wrote:
 On Fri, 25 Jun 2010 12:15:08 +, Jorgen Grahn wrote:

 I don't do SQL and I don't even understand the terminology properly
 ... but the discussion around it bothers me.
 
 Do those people really do this?

 Yes. And then some.

 Among web developers, the median level of programming knowledge amounts to
 the first 3 chapters of Learn PHP in 7 Days.

 It doesn't help that the guy who wrote PHP itself wasn't much better.

 - accept untrusted user data
 - try to sanitize the data (escaping certain characters etc)
 - turn this data into executable code (SQL)
 - executing it
 
 Like the example in the article
 
   SELECT * FROM hotels WHERE city = 'untrusted';

 Yep. Search the BugTraq archives for SQL injection. And most of those
 are for widely-deployed middleware; the zillions of bespoke site-specific
 scripts are likely to be worse.

 Also: http://xkcd.com/327/

Priceless!

As is often the case with xkcd, I learned something, too: there's a
widely used web application/portal/database thingy which silently
strips some characters from my input.  I thought it had to do with
HTML, but it's in fact exactly the sequences "'", ')', ';' and '--'
from the comic, and a few more like '' and undoubtedly some I haven't
noticed yet.

That is surely input sanitization gone horribly wrong: I enter 6--8
slices of bread, but the system stores 68 slices of bread.
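
A rough sketch of that failure mode next to the alternative. The strip_suspicious helper is hypothetical, written only to mimic the behavior described above; sqlite3 and the notes table are stand-ins for whatever the real system uses.

```python
import sqlite3

def strip_suspicious(text):
    """Hypothetical 'sanitizer' that deletes scary-looking sequences."""
    for seq in ("'", ")", ";", "--"):
        text = text.replace(seq, "")
    return text

entered = "6--8 slices of bread"

# Stripping silently changes what the user typed.
mangled = strip_suspicious(entered)

# Binding the value as a parameter stores it untouched.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.execute("INSERT INTO notes (body) VALUES (?)", (entered,))
stored = conn.execute("SELECT body FROM notes").fetchone()[0]

print(mangled)
print(stored)
```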

 I thought it was well-known that the solution is *not* to try to
 sanitize the input

 Well known by anyone with a reasonable understanding of the principles of
 programming, but somewhat less well known by the other 98% of web
 developers.

 Am I missing something?

 There's a world of difference between a skilled chef and the people
 flipping burgers for a minimum wage. And between a chartered civil
 engineer and the people laying the asphalt. And between what you
 probably consider a programmer and the people doing most web development.

I don't know them, so I wouldn't know ... What I would *expect* is
that safe tools are provided for them, not just workarounds so they
can keep using the unsafe tools. That's what Python did, with its
multitude of alternatives to os.system and os.popen.
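
One of those alternatives, sketched with the subprocess module. This is an illustration under assumptions: sys.executable stands in for a real program such as zcat so that the example runs anywhere, and the sample string is made up. Because the command is a list, each element becomes exactly one argument and no shell ever parses it.

```python
import subprocess
import sys

untrusted = "name; echo injected $(id) 'quote"

# Each list element becomes exactly one argv entry; no shell parses it.
# sys.executable stands in here for a real program such as zcat.
result = subprocess.run(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])",
     untrusted],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout
print(received == untrusted)
```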

Anyway, thanks. It's always nice to be able to map foreign terminology
like SQL injection to something you already know.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Jorgen Grahn
On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
 In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

 I recently fixed a bug in some production code.  The programmer was
 careful to use snprintf() to avoid buffer overflows.  The only problem
 is, he wrote something along the lines of:
 
 snprintf(buf, strlen(foo), foo);

 A long while ago I came up with this macro:

 #define Descr(v) &v, sizeof v

 making the correct version of the above become

 snprintf(Descr(buf), foo);

This is off-topic, but I believe snprintf() in C can *never* safely be
the only thing you do to the buffer: you also have to NUL-terminate it
manually in some corner cases. See the documentation.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Owen Jacobson

On 2010-06-26 22:33:57 -0400, Lawrence D'Oliveiro said:


In message 2010062522560231540-angrybald...@gmailcom, Owen Jacobson wrote:


It's not hard. It's just begging for a visit from the fuckup fairy.


That’s the same fallacious argument I pointed out earlier.


In the sense that using correct manual escaping leads to SQL injection 
vulnerabilities, yes, that's a fallacious argument on its own. 
However, as sites like BUGTRAQ amply demonstrate, generating SQL 
through string manipulation is a risky development practice[0]. You can 
continue to justify your choice to do so however you want, and you may 
even be the One True Developer capable of getting it absolutely right 
under all circumstances, but I'd still reject patches that introduced a 
SQLString-like function and ask that you resubmit them using the 
database API's parameterization tools instead.


Assuming for the sake of discussion that your SQLString function 
perfectly captures the transformation required to turn an arbitrary str 
into a MySQL string literal. How do you address the following issues?


1. Other (possibly inexperienced) developers reading your source, who 
may not have the skills to implement the same transform correctly, 
learn from your programs that writing your own query munger is okay.
1a. Other (possibly inexperienced) developers decide to copy and paste 
your function without fully understanding how it works, in tandem with 
any of the other issues below. (If you think this is rare, I invite you 
to visit stackoverflow or roseindia some time.)


2. MySQL changes the quoting and escaping rules to address a 
bug/feature request/developer whim, introducing a new set of corner 
cases into your function and forcing you to re-learn the escaping and 
quoting rules. (For people using DB API parameters, this is a matter of 
upgrading the DB adapter module to a version that supports the modified 
rules.)


3. You decide to switch from MySQL to a more fully-featured RDBMS, 
which may have different quoting and escaping rules around string 
literals.
3a. *Someone else* decides to port your program to a different RDBMS, 
and may not understand that SQLString implements MySQL's quoting and 
escaping rules only.


4. MySQL AB finally gets off their collective duffs and adds real 
parameter separation to the MySQL wire protocol, and implements real 
prepared statements to massive speed gains in scenarios that are 
relevant to your interests; string-based query construction gets left 
out in the cold.
4a. As with case 3, except that instead of the rules changing when you 
move to a new RDBMS, it's the relative performance of submitting new 
queries versus reusing a parameterized query that changes.


On top of the obvious issue of completely avoiding quoting bugs, using 
query parameters rather than escaping and string manipulation neatly 
saves you from having to address any of these problems (and a multitude 
of others) -- the DB API implementation will handle things for you, and 
you are propagating good practice in an easy-to-understand form.


I am honestly at a loss trying to understand your position. There is a 
huge body of documentation out there about the weaknesses of 
string-manipulation-based approaches to query construction, and the use 
of query parameters is so compellingly the Right Thing that I have a 
very hard time comprehending why anyone would opt not to use them except 
out of pure ignorance of their existence. Generating executable code -- 
including SQL -- from untrusted user input introduces a large 
vulnerability surface for very little benefit.


You don't handle function parameters by building up python-language 
strs containing the values as literals and eval'ing them, do you?
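
The analogy can be made concrete. In this small sketch greet and the sample name are made up for illustration: passing a value as a parameter needs no quoting, while pasting it into source text as a literal breaks as soon as the value contains a quote character.

```python
def greet(name):
    return "Hello, " + name

name = "O'Brien"

# Passing the value as a parameter: nothing to quote, nothing to escape.
direct = greet(name)

# The string-building equivalent: paste the value into source code as a
# literal and evaluate it. The apostrophe ends the literal early.
source = "greet('%s')" % name
try:
    eval(source)
    pasted_ok = True
except SyntaxError:
    pasted_ok = False

print(direct, pasted_ok)
```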


-o

[0] If you want to be *really* pedantic, string-manipulation-based 
query construction is strongly correlated with the occurrence of SQL 
injection vulnerabilities and bugs, which is in turn not strongly 
correlated with very many other practices. Happy?




[OT] Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Kushal Kumaran
On Sun, Jun 27, 2010 at 5:16 PM, Lawrence D'Oliveiro
l...@geek-central.gen.new_zealand wrote:
 In message mailman.2184.1277626565.32709.python-l...@python.org, Kushal
 Kumaran wrote:

 On Sun, Jun 27, 2010 at 9:47 AM, Lawrence D'Oliveiro
 l...@geek-central.gen.new_zealand wrote:

 In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

 I recently fixed a bug in some production code.  The programmer was
 careful to use snprintf() to avoid buffer overflows.  The only problem
 is, he wrote something along the lines of:

 snprintf(buf, strlen(foo), foo);

 A long while ago I came up with this macro:

 #define Descr(v) &v, sizeof v

 making the correct version of the above become

 snprintf(Descr(buf), foo);

 Not quite right.  If buf is a char array, as suggested by the use of
 sizeof, then you're not passing a char* to snprintf.

 What am I passing, then?

Here's what gcc tells me (I declared buf as char buf[512]):
sprintf.c:8: warning: passing argument 1 of ‘snprintf’ from
incompatible pointer type
/usr/include/stdio.h:363: note: expected ‘char * __restrict__’ but
argument is of type ‘char (*)[512]’

You just need to lose the & from the macro.

-- 
regards,
kushal


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Kushal Kumaran
On Mon, Jun 28, 2010 at 2:00 AM, Jorgen Grahn grahn+n...@snipabacken.se wrote:
 On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
 In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

 I recently fixed a bug in some production code.  The programmer was
 careful to use snprintf() to avoid buffer overflows.  The only problem
 is, he wrote something along the lines of:

 snprintf(buf, strlen(foo), foo);

 A long while ago I came up with this macro:

     #define Descr(v) &v, sizeof v

 making the correct version of the above become

     snprintf(Descr(buf), foo);

 This is off-topic, but I believe snprintf() in C can *never* safely be
 the only thing you do to the buffer: you also have to NUL-terminate it
 manually in some corner cases. See the documentation.


snprintf goes to great lengths to be safe, in fact.  You might be
thinking of strncpy.

-- 
regards,
kushal


Re: Why Is Escaping Data Considered So Magical?

2010-06-27 Thread Carl Banks
On Jun 27, 9:54 pm, Kushal Kumaran kushal.kumaran+pyt...@gmail.com
wrote:
 On Mon, Jun 28, 2010 at 2:00 AM, Jorgen Grahn grahn+n...@snipabacken.se 
 wrote:
  On Sun, 2010-06-27, Lawrence D'Oliveiro wrote:
  In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

  I recently fixed a bug in some production code.  The programmer was
  careful to use snprintf() to avoid buffer overflows.  The only problem
  is, he wrote something along the lines of:

  snprintf(buf, strlen(foo), foo);

  A long while ago I came up with this macro:

      #define Descr(v) &v, sizeof v

  making the correct version of the above become

      snprintf(Descr(buf), foo);

  This is off-topic, but I believe snprintf() in C can *never* safely be
  the only thing you do to the buffer: you also have to NUL-terminate it
  manually in some corner cases. See the documentation.

 snprintf goes to great lengths to be safe, in fact.  You might be
 thinking of strncpy.

Indeed, strncpy does not copy that final NUL if it's at or beyond the
nth element.  Probably the most mind-bogglingly stupid thing about the
standard C library, which has lots of mind-boggling stupidity.

Whenever I do an audit of someone's C code the first thing I do is
search for strncpy and see if they set the nth character to 0.  (They
usually didn't.)

Carl Banks


Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Ian Kelly
On Thu, Jun 24, 2010 at 9:38 PM, Lawrence D'Oliveiro
l...@geek-central.gen.new_zealand wrote:
 In message 2010062422432660794-angrybald...@gmailcom, Owen Jacobson wrote:

 Why would I write this when SQLAlchemy, even without using its ORM
 features, can do it for me?

 SQLAlchemy doesn’t seem very flexible. Looking at the code examples
 http://www.sqlalchemy.org/docs/examples.html, they’re very procedural:
 build object, then do a string of separate method calls to add data to it. I
 prefer the functional approach, as in my table-update example.

Your example from the first post of the thread rewritten using sqlalchemy:

conn.execute(
    items.update()
         .where(items.c.inventory_nr == modify_id)
         .values(
             dict(
                 (field[0], Params.getvalue("%s[%s]" % (field[1],
                                            urllib.quote(modify_id))))
                 for field in [
                     (items.c.class_name, "modify_class"),
                     (items.c.make, "modify_make"),
                     (items.c.model, "modify_model"),
                     (items.c.details, "modify_details"),
                     (items.c.serial_nr, "modify_serial"),
                     (items.c.inventory_nr, "modify_invent"),
                     (items.c.when_purchased, "modify_when_purchased"),
                     ... you get the idea ...
                     (items.c.location_name, "modify_location"),
                     (items.c.comment, "modify_comment"),
                 ]
             )
         )
         .values(last_modified = time.time())
)

Doesn't seem any less flexible to me, plus you don't have to worry
about calling your SQLString function at all.

Cheers,
Ian


Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Robert Kern

On 2010-06-25 19:49 , Lawrence D'Oliveiro wrote:

In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn
wrote:


I thought it was well-known that the solution is *not* to try to
sanitize the input -- it's to switch to an interface which doesn't
involve generating an intermediate executable.  In the Python example,
that would be something like os.popen2(['zcat', '-f', '--', untrusted]).


That’s what I mean. Why do people consider input sanitization so hard?


It's not hard per se; it's just repetitive, prone to the occasional mistake, 
and, frankly, really boring. When faced with things like that, we do what we do 
everywhere else in programming: wrap up the repetitive bits into a simpler 
library API and use that everywhere. Wrapping up the escaping code into 
SQLString is a step in that direction. However, the standard SQL 
parameterization in most of the DB protocols or SQLAlchemy's query construction 
removes even more repetition and unnecessary typing. There's just no point in 
not using it.
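
As a rough illustration of the parameterization mentioned above, here is a minimal sketch using Python's DB-API, with the standard-library sqlite3 module standing in for MySQL. The table, the FIELDS list and the update_item helper are assumptions invented for this example, not code from the thread; sqlite3 uses ? placeholders where MySQLdb uses %s.

```python
import sqlite3

# Hypothetical, hard-coded (trusted) column names, loosely modelled on the
# table-update example discussed in the thread.
FIELDS = ["make", "model", "serial_nr", "comment"]


def update_item(conn, inventory_nr, new_values):
    """Update the given columns of one row; every value travels as a parameter."""
    columns = [name for name in FIELDS if name in new_values]
    if not columns:
        return  # nothing to update
    # Only trusted column names are interpolated into the SQL text.
    assignments = ", ".join("%s = ?" % name for name in columns)
    params = [new_values[name] for name in columns] + [inventory_nr]
    conn.execute("UPDATE items SET %s WHERE inventory_nr = ?" % assignments, params)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (inventory_nr TEXT PRIMARY KEY, make TEXT, "
    "model TEXT, serial_nr TEXT, comment TEXT)"
)
conn.execute("INSERT INTO items (inventory_nr) VALUES (?)", ("A-1",))

nasty = "it's \"fine\"; DROP TABLE items"
update_item(conn, "A-1", {"make": "O'Neil & Sons", "comment": nasty})
row = conn.execute(
    "SELECT make, comment FROM items WHERE inventory_nr = ?", ("A-1",)
).fetchone()
```

The field list is written once, and no value is ever turned into SQL text, so there is no escaping call to remember or to forget.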


--
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco



Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Nobody
On Sat, 26 Jun 2010 12:40:41 +1200, Lawrence D'Oliveiro wrote:

 I construct ad-hoc queries all the time. It really isn’t that hard to
 do safely.
 
 Wrong.
 
 Even if you get the quoting absolutely correct (which is a very big if),
 you have to remember to perform it every time, without exception.
 
 More generally, as a program gets more complex, "this will work so long as
 we do X every time without fail" approaches "this won't work".
 
 That’s a content-free claim. Why? Because it applies equally to everything. 
 Replace “quoting” with something like “arithmetic”, and you’ll
 see what I mean:

If you omit the arithmetic, the program is likely to fail in very
obvious ways. Escaping is almost an identity function, which makes it
far more likely that omission or repetition will go unnoticed.

 And you need to perform it exactly once. As the program gets more complex,
 ensuring that it's done in the correct place, and only there, gets harder.
 
 Nonsense. It only needs to be done at the boundary to the appropriate 
 component (MySQL, HTML, JavaScript, whatever).

That assumes that you have a well-defined boundary, which isn't
necessarily the case.

In any case, you're still trying to make arguments about whether it's easy
or hard to get it right, which completely misses the point. Eliminating
the escaping entirely makes it impossible to get it wrong.
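
The "almost an identity function" point can be shown with a small sketch. The escape function below is a toy written for this illustration only (it is not a complete or safe SQL escaper, and it is not the SQLString routine from the thread); the sample values are likewise assumptions.

```python
def escape(text):
    """Toy escaper: put a backslash before backslashes and single quotes."""
    return text.replace("\\", "\\\\").replace("'", "\\'")


plain = "Smith"
tricky = "O'Neil"

# For ordinary input, escaping zero, one or two times gives the same text,
# so tests that only use ordinary data cannot tell the cases apart.
same_for_plain = plain == escape(plain) == escape(escape(plain))

# Only the unusual input exposes omission or repetition.
once = escape(tricky)   # O\'Neil
twice = escape(once)    # O\\\'Neil
```

A forgotten or doubled call is invisible for "Smith" and only shows up once someone named "O'Neil" uses the program.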



Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Nobody
On Fri, 25 Jun 2010 20:43:51 -0400, Roy Smith wrote:

 To bring this back to something remotely Python related, the point of 
 all this is that security is hard.

Oh, this isn't solely a security issue.

Ask anyone with a surname like O'Neil, O'Connor, O'Leary, etc; they've
probably broken a lot of web apps *without even trying*.
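
A minimal sketch of that failure, assuming a made-up guests table and using the standard-library sqlite3 module in place of whatever database a given web app uses:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE guests (surname TEXT)")

surname = "O'Neil"

# Naive interpolation: the apostrophe ends the string literal early and the
# statement no longer parses.
try:
    conn.execute("INSERT INTO guests (surname) VALUES ('%s')" % surname)
    naive_failed = False
except sqlite3.OperationalError:
    naive_failed = True

# Parameterized form: the value is never parsed as SQL text.
conn.execute("INSERT INTO guests (surname) VALUES (?)", (surname,))
stored = [row[0] for row in conn.execute("SELECT surname FROM guests")]
```

No attacker is involved here; an ordinary surname is enough to break the interpolated statement.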



Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Roy Smith
In article 2010062522560231540-angrybald...@gmailcom,
 Owen Jacobson angrybald...@gmail.com wrote:

 It's not hard. It's just begging for a visit from the fuckup fairy.

QOTD?


Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread D'Arcy J.M. Cain
On Sat, 26 Jun 2010 12:04:38 +0100
Nobody nob...@nowhere.com wrote:
 Ask anyone with a surname like O'Neil, O'Connor, O'Leary, etc; they've
 probably broken a lot of web apps *without even trying*.

At least it isn't a problem with the first name field.  Oh, wait...

-- 
D'Arcy J.M. Cain da...@druid.net |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.


Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Lawrence D'Oliveiro
In message mailman.2123.1277522976.32709.python-l...@python.org, Tim Chase 
wrote:

 On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote:
 ...

I see that you published my unobfuscated e-mail address on USENET for all to
see. I obfuscated it for a reason, to keep the spammers away. I'm assuming
this was a momentary lapse of judgement, for which I expect an apology.
Otherwise, it becomes grounds for an abuse complaint to your ISP.



Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Lawrence D'Oliveiro
In message mailman.2126.1277534032.32709.python-l...@python.org, Ian Kelly 
wrote:

 Your example from the first post of the thread rewritten using sqlalchemy:
 
 conn.execute(
     items.update()
          .where(items.c.inventory_nr == modify_id)
          .values(
              dict(
                  (field[0], Params.getvalue("%s[%s]" % (field[1],
                                             urllib.quote(modify_id))))
                  for field in [
                      (items.c.class_name, "modify_class"),
                      (items.c.make, "modify_make"),
                      (items.c.model, "modify_model"),
                      (items.c.details, "modify_details"),
                      (items.c.serial_nr, "modify_serial"),
                      (items.c.inventory_nr, "modify_invent"),
                      (items.c.when_purchased, "modify_when_purchased"),
                      ... you get the idea ...
                      (items.c.location_name, "modify_location"),
                      (items.c.comment, "modify_comment"),
                  ]
              )
          )
          .values(last_modified = time.time())
 )
 
 Doesn't seem any less flexible to me, plus you don't have to worry
 about calling your SQLString function at all.

Except I only needed two calls to SQLString, while you need two dozen 
instances of that repetitive items.c boilerplate.

As a human, being repetitive is not my job. That’s what the computer is for.


Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Lawrence D'Oliveiro
In message 2010062522560231540-angrybald...@gmailcom, Owen Jacobson wrote:

 It's not hard. It's just begging for a visit from the fuckup fairy.

That’s the same fallacious argument I pointed out earlier.


Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Lawrence D'Oliveiro
In message pan.2010.06.26.10.49.02.156...@nowhere.com, Nobody wrote:

 On Sat, 26 Jun 2010 12:40:41 +1200, Lawrence D'Oliveiro wrote:
 
 I construct ad-hoc queries all the time. It really isn’t that hard to
 do safely.
 
 Wrong.
 
 Even if you get the quoting absolutely correct (which is a very big
 if), you have to remember to perform it every time, without exception.
 
 More generally, as a program gets more complex, "this will work so long
 as we do X every time without fail" approaches "this won't work".
 
 That’s a content-free claim. Why? Because it applies equally to
 everything. Replace “quoting” with something like “arithmetic”, and
 you’ll see what I mean:
 
 If you omit the arithmetic, the program is likely to fail in very
 obvious ways. Escaping is almost an identity function, which makes it
 far more likely that omission or repetition will go unnoticed.

Maybe you need to go back and reread my original posting. The SQLString 
routine doesn’t just escape special characters, it generates a full MySQL 
string literal, complete with quotation marks. That makes it rather more 
likely for a syntax error to occur if I forget to use it, don’t you think?

 And you need to perform it exactly once. As the program gets more
 complex, ensuring that it's done in the correct place, and only there,
 gets harder.
 
 Nonsense. It only needs to be done at the boundary to the appropriate
 component (MySQL, HTML, JavaScript, whatever).
 
 That assumes that you have a well-defined boundary, which isn't
 necessarily the case.

It’s ALWAYS the case.

 In any case, you're still trying to make arguments about whether it's easy
 or hard to get it right, which completely misses the point. Eliminating
 the escaping entirely makes it impossible to get it wrong.

Except nobody has yet shown an alternative which is easier to get right.


Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Aahz
In article i06cju$qq...@lust.ihug.co.nz,
Lawrence D'Oliveiro  l...@geek-central.gen.new_zealand wrote:
In message mailman.2123.1277522976.32709.python-l...@python.org, Tim Chase 
wrote:

 On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote:
 ...

I see that you published my unobfuscated e-mail address on USENET for all to
see. I obfuscated it for a reason, to keep the spammers away. I'm assuming
this was a momentary lapse of judgement, for which I expect an apology.
Otherwise, it becomes grounds for an abuse complaint to your ISP.

You are double daft.  First, I completely disagree with you about it
being abuse; from my POV anyone posting to Usenet should do so with an
unobfuscated address.  Secondly, you are wrong about Tim publishing your
address unless you intended to follow up to a completely different post,
and you owe *him* an apology for a false accusation.
-- 
Aahz (a...@pythoncraft.com)   * http://www.pythoncraft.com/

If you don't know what your program is supposed to do, you'd better not
start writing it.  --Dijkstra


Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Chris Rebert
On Sat, Jun 26, 2010 at 7:21 PM, Lawrence D'Oliveiro  wrote:
 In message mailman.2123.1277522976.32709.python-l...@python.org, Tim Chase
 wrote:

 On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote:
 ...

 I see that you published my unobfuscated e-mail address on USENET for all to
 see. I obfuscated it for a reason, to keep the spammers away. I'm assuming
 this was a momentary lapse of judgement, for which I expect an apology.
 Otherwise, it becomes grounds for an abuse complaint to your ISP.

Will you give it a rest already with these threatening messages? Why
are you still using this only-partially-obfuscated address with USENET
anyway? This has happened twice before, it will doubtless happen yet
again. Just use an /entirely invalid/ From address like some other
posters do.

I can't believe you have a form letter for this...

Regards,
Chris
--
Public addresses eventually going bad is a *fact of life*; plan ahead
accordingly.


Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Tim Chase

On 06/26/2010 09:21 PM, Lawrence D'Oliveiro wrote:

In message mailman.2123.1277522976.32709.python-l...@python.org, Tim Chase
wrote:


On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote:
...


I see that you published my unobfuscated e-mail address on USENET for all to
see. I obfuscated it for a reason, to keep the spammers away. I'm assuming
this was a momentary lapse of judgement, for which I expect an apology.
Otherwise, it becomes grounds for an abuse complaint to your ISP.


I'm sorry...you've got your knickers in a knot?  That your spam 
filters seem to be insufficient?  That you don't have a custom 
throwaway address for such public dialogs?  For preventing an 
undeliverable bounce message that your bogus address would have 
caused (if your mail provider is RFC-compliant; though your mail 
provider may kindly be breaking RFC by disabling undeliverable 
responses to prevent back-scatter spam)?


Is the abuse charge "waah, he replied to my actual email rather 
than the false one I spoofed"?


I'm not sure an abuse complaint to my ISP would net you anything 
since the exact out-bound headers show nothing abusive, only the 
correcting of an invalid TLD to prevent a bounce (and a distinct 
lack of USENET references in the original message that went to 
you and CC'ed python-list@python.org).


Having regularly used python.l...@tim.thechases.com unobfuscated 
for easily over 5 years, the spam to this address has been almost 
negligible (or so effectively dealt with by Thunderbird's spam 
filters that I've never noticed it).


-tkc





Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Lawrence D'Oliveiro
In message pan.2010.06.26.11.04.22.328...@nowhere.com, Nobody wrote:

 Ask anyone with a surname like O'Neil, O'Connor, O'Leary, etc; they've
 probably broken a lot of web apps *without even trying*.

Last I checked, I couldn’t post comments on freedom-to-tinker.com.


Re: Why Is Escaping Data Considered So Magical?

2010-06-26 Thread Lawrence D'Oliveiro
In message roy-854954.20435125062...@news.panix.com, Roy Smith wrote:

 I recently fixed a bug in some production code.  The programmer was
 careful to use snprintf() to avoid buffer overflows.  The only problem
 is, he wrote something along the lines of:
 
 snprintf(buf, strlen(foo), foo);

A long while ago I came up with this macro:

#define Descr(v) v, sizeof v

making the correct version of the above become

snprintf(Descr(buf), foo);



Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Cameron Simpson
On 25Jun2010 15:54, I wrote:
| The number of times I've had to
| fix/remove insert-values-into-SQL-text code ...

My point here is that with insert-escaped-values-into-sql-text,
you only need to forget to do it once (or do it wrong).
By using a parameterised form like that required by SQLAlchemy,
the library does it and never forgets.

I would also point out that if you use a library to _construct_ the SQL
statements themselves, e.g. via SQLA's .select() methods, then you will never
introduce a syntax error into the SQL either. I expect I could construct SQL
syntax errors that cause havoc when inserted with correctly escaped parameter
values if I tried, probably using quotes in the SQL typo part.

Cheers,
-- 
Cameron Simpson c...@zip.com.au DoD#743
http://www.cskk.ezoshosting.com/cs/

George, discussing a patent and prior art:
Look, this  publication has a date, the patent has a priority date,
can't you just compare them?
Paul Sutcliffe:
Not unless you're a lawyer.


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Nobody
On Fri, 25 Jun 2010 12:25:56 +1200, Lawrence D'Oliveiro wrote:

 Just been reading this article
 ...
 which says that a lot of security holes are arising these days because
 everybody is concentrating on unit testing of their own particular
 components, with less attention being devoted to overall integration
 testing.
 
 Fair enough. But it’s disconcerting to see some of the advice being
 offered in the reader comments, like “force everyone to use stored
 procedures”, or “force everyone to use prepared/parametrized
 statements”, “never construct ad-hoc SQL queries” and the like.
 
 I construct ad-hoc queries all the time. It really isn’t that hard to
 do safely.

Wrong.

Even if you get the quoting absolutely correct (which is a very big if),
you have to remember to perform it every time, without exception. And you
need to perform it exactly once. As the program gets more complex,
ensuring that it's done in the correct place, and only there, gets harder.

More generally, as a program gets more complex, "this will work so long as
we do X every time without fail" approaches "this won't work".

 All you have to do is read the documentation—for example,
 http://dev.mysql.com/doc/refman/5.0/en/string-syntax.html—and then
 write a routine that takes arbitrary data and turns it into a valid
 string literal, like this
 http://www.codecodex.com/wiki/Useful_MySQL_Routines#Quoting.

That's okay. Provided the documentation is accurate. And provided that you
update the escaping algorithm whenever the SQL dialect gets extended, or
you switch to a different back-end, or modify the program. IOW, it's not
even remotely okay.

Unparsing data so that you get the correct answer out of a subsequent
parsing step is objectively and obviously the wrong approach. The
correct approach is to skip both the unparsing and parsing steps
entirely.

Formal grammars are a useful way to represent graph-like data structures
in a human-readable and human-editable form. But for creation,
modification and use by a computer, it is invariably preferable to operate
upon the graph directly. Textual formats inherit all of the issues which
apply to the underlying data structure, then add a few of their own for
good measure.

 I've done this sort of thing for MySQL, for HTML and JavaScript (in both
 Python and JavaScript itself), and for Bash.

And, of course, you're convinced that you got it right every time. That
attitude alone should set alarm bells ringing for anyone who's worked in
this industry for more than five minutes.



Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Paul Rubin
Nobody nob...@nowhere.com writes:
 More generally, as a program gets more complex, "this will work so long as
 we do X every time without fail" approaches "this won't work".

QOTW


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Jorgen Grahn
On Fri, 2010-06-25, Lawrence D'Oliveiro wrote:
 Just been reading this article
 http://www.theregister.co.uk/2010/06/23/xxs_sql_injection_attacks_testing_remedy/
 which says that a lot of security holes are arising these days because
 everybody is concentrating on unit testing of their own particular
 components, with less attention being devoted to overall integration
 testing.

I don't do SQL and I don't even understand the terminology properly
... but the discussion around it bothers me.

Do those people really do this?
- accept untrusted user data
- try to sanitize the data (escaping certain characters etc)
- turn this data into executable code (SQL)
- executing it

Like the example in the article

  SELECT * FROM hotels WHERE city = 'untrusted';

If so, it's isomorphic with doing os.popen('zcat -f %s' % untrusted)
in Python (at least on Unix, where 'zcat ...' is executed as a shell
script).

I thought it was well-known that the solution is *not* to try to
sanitize the input -- it's to switch to an interface which doesn't
involve generating an intermediate executable.  In the Python example,
that would be something like os.popen2(['zcat', '-f', '--', untrusted]).
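
The argument-list interface described above can be sketched in current Python with subprocess.run, which is the present-day counterpart of the os.popen2 call in the post. In this sketch a tiny Python child process stands in for zcat so that it runs anywhere; the untrusted value is an invented example.

```python
import subprocess
import sys

# A value that would inject a second command if pasted into a shell line.
untrusted = "file.gz; echo INJECTED"

# With an argument list and no shell, the child receives the value as a
# single argv entry; nothing is ever parsed as shell syntax.
result = subprocess.run(
    [sys.executable, "-c", "import sys; sys.stdout.write(sys.argv[1])", untrusted],
    capture_output=True,
    text=True,
    check=True,
)
received = result.stdout
```

For the real case the list would be something like ['zcat', '-f', '--', untrusted], where '--' also stops a value beginning with '-' from being read as an option.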

Am I missing something?  If not, I can go back to sleep -- and keep
avoiding SQL and web programming like the plague until that community
has entered the 21st century.

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Stephen Hansen
On Fri, Jun 25, 2010 at 5:15 AM, Jorgen Grahn
grahn+n...@snipabacken.se
 wrote:

 Am I missing something?  If not, I can go back to sleep -- and keep
 avoiding SQL and web programming like the plague until that community
 has entered the 21st century.


You're not missing anything. It's been the accepted industry practice for
years and years (and /years/), the taught industry practice, the advised
industry practice, the constantly repeated practice on every even vaguely
database related forum forever now.

However:

  a) Some people are convinced of their own infallibility, and prefer a
clever construct generating a string that has to be parsed due to the
cleverness of said construct.
  b) Some people don't listen / understand.
  c) Some people don't care.

And so, SQL injection attacks continue to persist. Then again, it's not like
anyone in the C-ish world doesn't know about bounds checking on arrays, do
they? But buffer overflows persist. Probably for similar reasons as above
(with slightly different 'and prefer' clause).

--Stephen


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread John Nagle

On 6/25/2010 12:09 AM, Paul Rubin wrote:

Nobody nob...@nowhere.com writes:

More generally, as a program gets more complex, "this will work so long as
we do X every time without fail" approaches "this won't work".


   Yes.  I was just looking at some of my own code.  Out of about 100
SQL statements, I'd used manual escaping once, in code where the WHERE
clause is built up depending on what information is available for the
search.  It's done properly, using MySQLdb.escape_string(s), which
is what's used inside cursor.execute.  Looking at the code, I
now realize that it would have been better to
add sections to the SQL string with standard escapes, and at the same
time, append the key items to a list.  Then the list can be
converted to a tuple for submission to cursor.execute.
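
The approach described in that last sentence might look roughly like the following sketch. The hotels table, the find_hotels helper and its search criteria are assumptions made for illustration, and the standard-library sqlite3 module (with ? placeholders) stands in for MySQLdb (which uses %s).

```python
import sqlite3


def find_hotels(conn, city=None, min_stars=None):
    """Build the WHERE clause piece by piece, collecting values separately."""
    clauses, params = [], []
    if city is not None:
        clauses.append("city = ?")
        params.append(city)
    if min_stars is not None:
        clauses.append("stars >= ?")
        params.append(min_stars)
    sql = "SELECT name FROM hotels"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " ORDER BY name"
    # The collected values are handed over as a tuple, never spliced into sql.
    return [row[0] for row in conn.execute(sql, tuple(params))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hotels (name TEXT, city TEXT, stars INTEGER)")
conn.executemany(
    "INSERT INTO hotels VALUES (?, ?, ?)",
    [("Ritz", "Paris", 5), ("Ibis", "Paris", 2), ("O'Hara Inn", "Coeur d'Alene", 3)],
)
```

Only fixed fragments written by the programmer ever become SQL text, so the query can vary with the available search information without any manual escaping.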

John Nagle



Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Nobody
On Fri, 25 Jun 2010 12:15:08 +, Jorgen Grahn wrote:

 I don't do SQL and I don't even understand the terminology properly
 ... but the discussion around it bothers me.
 
 Do those people really do this?

Yes. And then some.

Among web developers, the median level of programming knowledge amounts to
the first 3 chapters of "Learn PHP in 7 Days".

It doesn't help that the guy who wrote PHP itself wasn't much better.

 - accept untrusted user data
 - try to sanitize the data (escaping certain characters etc)
 - turn this data into executable code (SQL)
 - executing it
 
 Like the example in the article
 
   SELECT * FROM hotels WHERE city = 'untrusted';

Yep. Search the BugTraq archives for SQL injection. And most of those
are for widely-deployed middleware; the zillions of bespoke site-specific
scripts are likely to be worse.

Also: http://xkcd.com/327/

 I thought it was well-known that the solution is *not* to try to
 sanitize the input

Well known by anyone with a reasonable understanding of the principles of
programming, but somewhat less well known by the other 98% of web
developers.

 Am I missing something?

There's a world of difference between a skilled chef and the people
flipping burgers for a minimum wage. And between a chartered civil
engineer and the people laying the asphalt. And between what you
probably consider a programmer and the people doing most web development.

 If not, I can go back to sleep -- and keep
 avoiding SQL and web programming like the plague until that community
 has entered the 21st century.

Don't hold your breath.

Of course, there's no fundamental reason why you can't apply sound
practices to web development. Well, other than the fact that you're
competing against an infinite number of (code-) monkeys for lowest-bidder
contracts.

To be fair, it isn't actually limited to web developers. I've seen the
following in scientific code written in C (or, more likely, ported to C
from Fortran) for Unix:

sprintf(buff, "rm -f %s", filename);
system(buff);

Why bother learning the Unix API when you already know system()?
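
For comparison, a minimal Python sketch of the same job done through the file API instead of a shell command line. The file name is an invented example chosen to be dangerous if handed to a shell; note that, unlike rm -f, os.remove raises FileNotFoundError when the file is missing.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # A name that would run a second command if pasted into "rm -f %s".
    filename = os.path.join(tmp, "data; echo INJECTED.txt")
    with open(filename, "w") as handle:
        handle.write("scratch")

    existed_before = os.path.exists(filename)
    os.remove(filename)  # direct call: no shell, no quoting, no buffer
    exists_after = os.path.exists(filename)
```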



Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Ian Kelly
On Fri, Jun 25, 2010 at 5:17 PM, Nobody nob...@nowhere.com wrote:
 To be fair, it isn't actually limited to web developers. I've seen the
 following in scientific code written in C (or, more likely, ported to C
 from Fortran) for Unix:

        sprintf(buff, "rm -f %s", filename);
        system(buff);

Tsk, tsk.  And it's so easy to fix, too:

#define BUFSIZE 100
char buff[BUFSIZE];
if (snprintf(buff, BUFSIZE, "rm -f %s", filename) >= BUFSIZE) {
    printf("No buffer overflow for you!\n");
} else {
    system(buff);
}

There, that's much more secure.


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Lawrence D'Oliveiro
In message pan.2010.06.25.06.47.34.297...@nowhere.com, Nobody wrote:

 On Fri, 25 Jun 2010 12:25:56 +1200, Lawrence D'Oliveiro wrote:
 
 I construct ad-hoc queries all the time. It really isn’t that hard to
 do safely.
 
 Wrong.
 
 Even if you get the quoting absolutely correct (which is a very big if),
 you have to remember to perform it every time, without exception.
 
 More generally, as a program gets more complex, "this will work so long as
 we do X every time without fail" approaches "this won't work".

That’s a content-free claim. Why? Because it applies equally to everything. 
Replace “quoting” with something like “arithmetic”, and you’ll see what I 
mean:

Even if you get the arithmetic absolutely correct (which is a very big
if), you have to remember to perform it every time, without exception.

More generally, as a program gets more complex, "this will work so long
as we do X every time without fail" approaches "this won't work".

From which we can conclude, according to your logic, that one shouldn’t be 
doing arithmetic.

Next time, try to avoid fallacious arguments.

 And you need to perform it exactly once. As the program gets more complex,
 ensuring that it's done in the correct place, and only there, gets harder.

Nonsense. It only needs to be done at the boundary to the appropriate 
component (MySQL, HTML, JavaScript, whatever). That’s the only place which 
needs to have knowledge of what’s on the other side. Everything else can 
work with arbitrary data without having to worry about such things.

Go back to my example, and you’ll see this: the original updates two dozen 
different fields in a database table, yet it only needs two calls to 
SQLString: one deals with all the fields requiring updating, while the other 
one deals with the key-matching. That’s it. Instead of two dozen different 
places needing checking, you only have two.

That’s what “maintainability” is all about.


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Roy Smith
In article mailman.2117.1277511935.32709.python-l...@python.org,
 Ian Kelly ian.g.ke...@gmail.com wrote:

 On Fri, Jun 25, 2010 at 5:17 PM, Nobody nob...@nowhere.com wrote:
  To be fair, it isn't actually limited to web developers. I've seen the
  following in scientific code written in C (or, more likely, ported to C
  from Fortran) for Unix:
 
         sprintf(buff, "rm -f %s", filename);
         system(buff);
 
 Tsk, tsk.  And it's so easy to fix, too:
 
 #define BUFSIZE 100
 char buff[BUFSIZE];
 if (snprintf(buff, BUFSIZE, "rm -f %s", filename) >= BUFSIZE) {
     printf("No buffer overflow for you!\n");
 } else {
     system(buff);
 }
 
 There, that's much more secure.

I recently fixed a bug in some production code.  The programmer was 
careful to use snprintf() to avoid buffer overflows.  The only problem 
is, he wrote something along the lines of:

snprintf(buf, strlen(foo), foo);

I'm sure the code got reviewed originally, and probably looked at dozens 
of times over the years.  Nobody caught the problem until we ran a 
static code analysis tool (Coverity) over it.

To bring this back to something remotely Python related, the point of 
all this is that security is hard.  A lot of the security best practices 
(such as "don't compose SQL queries on the fly with externally tainted 
strings") exist because they address ways that people have gotten burned 
in the past.  It is foolish to think that you're smarter than everybody 
else and have thought of every possibility to avoid getting burned by 
doing the things that have gotten other people in trouble.


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Lawrence D'Oliveiro
In message mailman.2046.1277445301.32709.python-l...@python.org, Cameron 
Simpson wrote:

 On 25Jun2010 15:38, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand
 wrote:

 | In message 2010062422432660794-angrybald...@gmailcom, Owen Jacobson
 | wrote:

 |  Why would I write this when SQLAlchemy, even without using its ORM
 |  features, can do it for me?
 | 
 | SQLAlchemy doesn’t seem very flexible. Looking at the code examples
 | http://www.sqlalchemy.org/docs/examples.html, they’re very procedural:
 | build object, then do a string of separate method calls to add data to
 | it. I prefer the functional approach, as in my table-update example.
 
 He said "without using its ORM".

I noticed that. So were those examples I referenced above “using its ORM”? 
Can you offer better examples “without using its ORM”?


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Lawrence D'Oliveiro
In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn 
wrote:

 I thought it was well-known that the solution is *not* to try to
 sanitize the input -- it's to switch to an interface which doesn't
 involve generating an intermediate executable.  In the Python example,
 that would be something like os.popen2(['zcat', '-f', '--', untrusted]).

That’s what I mean. Why do people consider input sanitization so hard?


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Owen Jacobson

On 2010-06-25 20:49:09 -0400, Lawrence D'Oliveiro said:


In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn
wrote:


I thought it was well-known that the solution is *not* to try to
sanitize the input -- it's to switch to an interface which doesn't
involve generating an intermediate executable.  In the Python example,
that would be something like os.popen2(['zcat', '-f', '--', untrusted]).


That’s what I mean. Why do people consider input sanitization so hard?


It's not hard. It's just begging for a visit from the fuckup fairy.

-o



Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Robert Kern

On 2010-06-25 19:47 , Lawrence D'Oliveiro wrote:

In message mailman.2046.1277445301.32709.python-l...@python.org, Cameron
Simpson wrote:


On 25Jun2010 15:38, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand
wrote:

| In message 2010062422432660794-angrybald...@gmailcom, Owen Jacobson
| wrote:

|  Why would I write this when SQLAlchemy, even without using its ORM
|  features, can do it for me?
|
| SQLAlchemy doesn’t seem very flexible. Looking at the code examples
| http://www.sqlalchemy.org/docs/examples.html, they’re very procedural:
| build object, then do a string of separate method calls to add data to
| it. I prefer the functional approach, as in my table-update example.

He said "without using its ORM".


I noticed that. So were those examples I referenced above “using its ORM”?
Can you offer better examples “without using its ORM”?


http://www.sqlalchemy.org/docs/sqlexpression.html

--
Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth.
  -- Umberto Eco



Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Stephen Hansen
On Fri, Jun 25, 2010 at 5:49 PM, Lawrence D'Oliveiro
l...@geek-central.gen.new_zealand wrote:

 In message slrni297ec.1m5.grahn+n...@frailea.sa.invalid, Jorgen Grahn
 wrote:

  I thought it was well-known that the solution is *not* to try to
  sanitize the input -- it's to switch to an interface which doesn't
  involve generating an intermediate executable.  In the Python example,
  that would be something like os.popen2(['zcat', '-f', '--', untrusted]).

 That’s what I mean. Why do people consider input sanitization so hard?


It's not that it is hard, it's that it has to be done with care: and when an
interface provides you two methods to pass it data, one that requires it to
parse a string to get at your data (thus requiring careful sanitization),
and one that is a direct channel where no parsing is required and the data
is directly passed through memory and bypasses the need for any sanitization
... preference for the latter seems pretty darn obvious to me.
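
As a rough illustration of the "direct channel" idea (a sketch, not code from this thread): os.popen2 was removed in Python 3, so this uses subprocess, its modern counterpart. The child command is just the Python interpreter echoing its first argument, chosen so the example runs anywhere, and the value of `untrusted` is made up.

```python
import subprocess
import sys

# Hypothetical hostile input: shell metacharacters, quotes, spaces.
untrusted = "report.gz; rm -rf ~ $(echo pwned) 'x' \"y\""

# The argument list is handed to the child process as-is. No shell parses
# it, so there is nothing to escape and nothing to get wrong.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.argv[1])",
    untrusted,
]
result = subprocess.run(child, stdout=subprocess.PIPE, check=True)
echoed = result.stdout.decode()
```

The child sees the string exactly as given. The same list form works for something like ['zcat', '-f', '--', untrusted], where the '--' additionally stops the value from being read as an option.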

Use a method that does not add an extra security concern to the application
or system = best practice.

When that method *also* provides positive performance characteristics on top
of alleviating a security concern, and even gets rid of a lot of data type
conversion details you shouldn't really need to worry about, well. Using
that method seems pretty much an obvious choice to me.

If the only reason not to use it is so you can produce ghoulish spaghetti
code like in the first post, I think that's a count in PQ's favor :)

--S


Re: Why Is Escaping Data Considered So Magical?

2010-06-25 Thread Tim Chase

On 06/25/2010 07:49 PM, Lawrence D'Oliveiro wrote:

In the Python example, that would be something like
os.popen2(['zcat', '-f', '--', untrusted]).


That’s what I mean. Why do people consider input sanitization
so hard?


It's hard because it requires thinking.  Sadly, many of the 
people I know who call themselves programmers couldn't code their 
way out of a paper bag, let alone think logically about the 
security implications of their code.[1]


-tkc


[1] much of which ends up being cargo-cult programming, 
cut-n-paste'd from Google search-results.








Re: Why Is Escaping Data Considered So Magical?

2010-06-24 Thread Roy Smith
In article i00t2k$l0...@lust.ihug.co.nz,
 Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:

 I construct ad-hoc queries all the time. It really isn’t that hard to do
 safely. All you have to do is read the documentation

I get worried when people talk about how easy it is to do something 
safely.  Let me suggest a couple of things you might not have considered:

1) Somebody is running your application (or the database server) with 
the locale set to something unexpected.  This might change how numbers, 
dates, currency, etc, get formatted, which could change the meaning of 
your constructed SQL statement.

2) Somebody runs your application with a different PYTHONPATH, which 
causes a different (i.e. malicious) urllib module to get loaded, which 
makes urllib.quote() do something you didn't expect.
 
 I’ve done this sort of thing for MySQL, for HTML and JavaScript (in both
 Python and JavaScript itself), and for Bash. It’s not hard to verify you’ve
 done it correctly. It lets you easily create table-updating code like the
 following, which makes it so easy to update the code to track changes in the
 database structure:
 
     sql.cursor.execute \
       (
             "update items set "
         +
             ", ".join
               (
                     tuple
                       (
                             "%(name)s = %(value)s"
                         %
                             {
                                 "name" : field[0],
                                 "value" : SQLString(Params.getvalue
                                   (
                                     "%s[%s]" % (field[1], urllib.quote(modify_id))
                                   ))
                             }
                         for field in
                             (
                                 ("class_name", "modify_class"),
                                 ("make", "modify_make"),
                                 ("model", "modify_model"),
                                 ("details", "modify_details"),
                                 ("serial_nr", "modify_serial"),
                                 ("inventory_nr", "modify_invent"),
                                 ("when_purchased", "modify_when_purchased"),
                                 ... you get the idea ...
                                 ("location_name", "modify_location"),
                                 ("comment", "modify_comment"),
                             )
                       )
                 +
                     (
                         "last_modified = %d" % int(time.time()),
                     )
               )
         +
             " where inventory_nr = %s" % SQLString(modify_id)
       )


Re: Why Is Escaping Data Considered So Magical?

2010-06-24 Thread Owen Jacobson

On 2010-06-24 21:02:48 -0400, Roy Smith said:


In article i00t2k$l0...@lust.ihug.co.nz,
 Lawrence D'Oliveiro l...@geek-central.gen.new_zealand wrote:


I construct ad-hoc queries all the time. It really isn’t that hard to do
safely. All you have to do is read the documentation


I get worried when people talk about how easy it is to do something
safely.


First: I agree with this. While it's definitely possible to correctly 
escape a given SQL dialect under controlled conditions, it's not at all 
easy to get it right, and the real world is even more unfriendly than 
most people expect. Furthermore there's no reason to do it that way: 
Python's DB API spec effectively requires that placeholder parameters 
of *some* kind exist. Even if you feel the need to construct SQL, you 
can construct it with parameters almost as easily as you can construct 
it with the values baked in.
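
A minimal sketch of constructing SQL with parameters instead of baked-in values. It uses the standard-library sqlite3 module purely because it runs without a server; the table layout and the sample values are invented for illustration, and other DB API drivers may use a different placeholder marker (check the module's paramstyle).

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table items (inventory_nr text primary key, make text,"
    " model text, comment text, last_modified integer)"
)
conn.execute(
    "insert into items (inventory_nr, make, model, comment)"
    " values (?, ?, ?, ?)",
    ("A-1", "old", "old", ""),
)

# Column names come from a fixed list in the program, never from user input.
# The values are deliberately awkward: quotes, percent signs, SQL fragments.
updates = {
    "make": "O'Brien & Sons",
    "model": "x'; drop table items; --",
    "comment": "100% \"quoted\"",
    "last_modified": int(time.time()),
}
modify_id = "A-1"

# Build the SQL text from trusted column names and '?' placeholders only...
sql = (
    "update items set "
    + ", ".join("%s = ?" % name for name in updates)
    + " where inventory_nr = ?"
)
# ...and hand the values to the driver separately.
conn.execute(sql, tuple(updates.values()) + (modify_id,))

row = conn.execute(
    "select make, model, comment from items where inventory_nr = ?",
    (modify_id,),
).fetchone()
```

The statement text is still assembled dynamically, so adding a column means adding one dictionary entry, but no value ever passes through string formatting.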


With that said...


2) Somebody runs your application with a different PYTHONPATH, which
causes a different (i.e. malicious) urllib module to get loaded, which
makes urllib.quote() do something you didn't expect.


Someone who can manipulate PYTHONPATH or otherwise add code to the 
runtime environment is already in a position to hose your database, 
independently of escaping-related issues. It's up to the sysadmin or 
user to ensure that their environment is sane, and it's on their head 
if they add broken code to a program's runtime environment.


Lawrence D'Oliveiro wrote:


I’ve done this sort of thing for MySQL, for HTML and JavaScript (in both
Python and JavaScript itself), and for Bash. It’s not hard to verify you’ve
done it correctly. It lets you easily create table-updating code like the
following, which makes it so easy to update the code to track changes in the
database structure:

    sql.cursor.execute \
      (
            "update items set "
        +
            ", ".join
              (
                    tuple
                      (
                            "%(name)s = %(value)s"
                        %
                            {
                                "name" : field[0],
                                "value" : SQLString(Params.getvalue
                                  (
                                    "%s[%s]" % (field[1], urllib.quote(modify_id))
                                  ))
                            }
                        for field in
                            (
                                ("class_name", "modify_class"),
                                ("make", "modify_make"),
                                ("model", "modify_model"),
                                ("details", "modify_details"),
                                ("serial_nr", "modify_serial"),
                                ("inventory_nr", "modify_invent"),
                                ("when_purchased", "modify_when_purchased"),
                                ... you get the idea ...
                                ("location_name", "modify_location"),
                                ("comment", "modify_comment"),
                            )
                      )
                +
                    (
                        "last_modified = %d" % int(time.time()),
                    )
              )
        +
            " where inventory_nr = %s" % SQLString(modify_id)
      )


Why would I write this when SQLAlchemy, even without using its ORM 
features, can do it for me? It even uses the placeholder-generating 
strategy I mentioned above, where possible.


Finally, it's worth noting that MySQL is (almost) the only mainstream 
database that uses escaping for parameterization. PostgreSQL, SQL 
Server, Oracle, DB2, and most other databases support parameters 
natively in their communication protocols: parameters aren't injected 
into the query string, but are sent separately and processed separately 
within the DBMS. This neatly avoids encoding-related and 
quoting-related problems entirely, and it means the type of the 
parameter can be preserved if it's useful.
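
To make the type-preservation point concrete, here is a small sketch. It uses sqlite3 as a stand-in (it is not one of the servers named above, but it ships with Python and binds values rather than splicing them into the query text); the sample values are arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One value of each basic kind, including text with a quote and raw bytes.
values = [42, 3.5, "it's text", b"\x00\xff", None]

# Ask the database engine what type it received for each bound parameter.
types = [
    conn.execute("select typeof(?)", (v,)).fetchone()[0] for v in values
]

# Send each value through the engine and back again.
roundtrip = [conn.execute("select ?", (v,)).fetchone()[0] for v in values]
```

Each value arrives with its own type and comes back unchanged, with no quoting or encoding step in the application code.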


-o



Re: Why Is Escaping Data Considered So Magical?

2010-06-24 Thread Lawrence D'Oliveiro
In message roy-30b881.21024824062...@news.panix.com, Roy Smith wrote:

 1) Somebody is running your application (or the database server) with
 the locale set to something unexpected.

Locales are under program control, so that won’t happen.

This is why I use UTF-8 encoding for everything.


Re: Why Is Escaping Data Considered So Magical?

2010-06-24 Thread Lawrence D'Oliveiro
In message 2010062422432660794-angrybald...@gmailcom, Owen Jacobson wrote:

 Why would I write this when SQLAlchemy, even without using its ORM
 features, can do it for me?

SQLAlchemy doesn’t seem very flexible. Looking at the code examples 
http://www.sqlalchemy.org/docs/examples.html, they’re very procedural: 
build object, then do a string of separate method calls to add data to it. I 
prefer the functional approach, as in my table-update example.


Re: Why Is Escaping Data Considered So Magical?

2010-06-24 Thread Cameron Simpson
On 25Jun2010 15:38, Lawrence D'Oliveiro l...@geek-central.gen.new_zealand 
wrote:
| In message 2010062422432660794-angrybald...@gmailcom, Owen Jacobson wrote:
|  Why would I write this when SQLAlchemy, even without using its ORM
|  features, can do it for me?
| 
| SQLAlchemy doesn’t seem very flexible. Looking at the code examples 
| http://www.sqlalchemy.org/docs/examples.html, they’re very procedural: 
| build object, then do a string of separate method calls to add data to it. I 
| prefer the functional approach, as in my table-update example.

He said without using its ORM. I do what you suggest (make SQL
statements at need) using SQLAlchemy all the time. It is simple and easy
and _robust_ against odd data. The number of times I've had to
fix/remove insert-values-into-SQL-text code ...
-- 
Cameron Simpson c...@zip.com.au DoD#743
http://www.cskk.ezoshosting.com/cs/

Plague, Famine, Pestilence, and C++ stalk the land. We're doomed! Doomed!
- Simon E Spero