When is a string a number?
Hi, I'm hoping someone here can give me some insight in to what I can do as I feel sure someone must have hit this problem before. The software I'm working is a good deal more complex than what is described here but I've tried to resolve it down to make it more simple (perhaps unsuccessfully). The application reads values from a databases (but we'll stick to Oracle for now), converts the data to JSON (JavaScript Object Notation), writes the json to a file where it is picked up by some javascript code. JSON was chosen because JSON is native to javascript and the javascript can simply eval the read data to turn it into a javascript object. With DBI/DBD::Oracle all values read from the database are scalars. As everyone will know, whether something read from the database is a string or a number in Perl purely depends on the context it is used in so: $a = 1; print $a +1 results in 2 and print $a . 2 results in 12; The problem is javascript is more tightly typed than Perl and it matters whether something is a number or a string - it affects what operations you can perform on a variable. In JSON, a string is represented by quotes around it and a number is missing the quotes. My problem is that values read from the database which are actually numbers (in the database) look like strings to the JSON parser so when I do a selectall_arrayref(select number_field from table) and convert the result to JSON I get: [12] in the serialised JSON data instead of: [12] I have to admit I don't know how the JSON module knows what is a number and what is a string in Perl but I see the same issue with Data::Dumper so I presume there must be some way to find out if a perl scalar is a number or a string. The problem gets a lot worse for me since I do some arithmetic on values pulled from the database before converting them to JSON and this is where Perl seems to change them into numbers e.g. $r = selectall_arrayref(select number1, number2 from table); $r-[0] += 2; results in JSON output of: [3, 3] and this really annoys javascript since the first value is typed as a number and the second as a string. You can imagine this can get really hairy if you have a field in the DB called house_number_or_name. Whilst investigating this I tried to reproduce with Data::Dumper (another data serialiser) but the problem gets even more unfathomable to me: perl -MData::Dumper -le '$a = [1,2];print Dumper($a);' $VAR1 = [1,'2']; so Data::Dumper knows 2 is a string (XS version) and yet: perl -MData::Dumper -le ' $Data::Dumper::Useperl =1;$a = [1,2];print Dumper($a);' $VAR1 = [1,2]; just because I've switched to a Perl version of Data::Dumper 2 is a number. I don't want to have to do ($var +0) on all the number fields I pull from the database (to turn them into numbers) and neither do I want to do a '$var .= ' (to turn all the fields into strings). As an aside (and probably a perl question rather than a DBI one) does anyone know why the type of a scalar changes when you use it on the right side of an assignment: perl -MData::Dumper -le '$a=1; print Dumper($a); $b += $a; print Dumper($a);' $VAR1 = '1'; $VAR1 = 1; How does JSON and Data::Dumper know whether Perl thinks something is a number or a string? Sorry for the long posting. As I said, someone must have hit this before and I hope you can offer me some insight as I'm rather stuck with this now. Martin -- Martin J. Evans Easysoft Limited http://www.easysoft.com
Re: When is a string a number?
On 10/4/06, Martin J. Evans [EMAIL PROTECTED] wrote: With DBI/DBD::Oracle all values read from the database are scalars. As everyone will know, whether something read from the database is a string or a number in Perl purely depends on the context it is used in so: internally, there are flags in the SV structure that hint as to what conversions have been done on the SV already, for efficiency's sake. I have to admit I don't know how the JSON module knows what is a number and what is a string in Perl but I see the same issue with Data::Dumper so I presume there must be some way to find out if a perl scalar is a number or a string. It seems like the pure perl Dumper tests values with a regex and optimizes to numbers when the stringification matches a number template, while the XS version checks the flags in the SV structure. The problem gets a lot worse for me since I do some arithmetic on values pulled from the database before converting them to JSON and this is where Perl seems to change them into numbers e.g. I don't want to have to do ($var +0) on all the number fields I pull from the database (to turn them into numbers) and neither do I want to do a '$var .= ' (to turn all the fields into strings). you might have to do exactly that. $var will produce a string version. Sorry. Apparently you can tell the JSON module to make everything strings: http://search.cpan.org/~makamaka/JSON-1.07/lib/JSON.pm#AUTOCONVERT As an aside (and probably a perl question rather than a DBI one) does anyone know why the type of a scalar changes when you use it on the right side of an assignment: perl -MData::Dumper -le '$a=1; print Dumper($a); $b += $a; print Dumper($a);' $VAR1 = '1'; $VAR1 = 1; The has-been-evaluated-as-a-number flag got set on $a when it was evaluated as a number; then Dumper, with both available, chose the number format. How does JSON and Data::Dumper know whether Perl thinks something is a number or a string? inspecting the flags; except pure-perl Dumper apparently uses a regular expression to identify numbers. Those are guesses. The source is available for your inspection. -- The Country Of The Blind, by H.G. Wells http://cronos.advenge.com/pc/Wells/p528.html
Re: When is a string a number?
On Wed, 2006-10-04 at 14:00 -0500, David Nicol wrote: On 10/4/06, Martin J. Evans [EMAIL PROTECTED] wrote: With DBI/DBD::Oracle all values read from the database are scalars. As everyone will know, whether something read from the database is a string or a number in Perl purely depends on the context it is used in so: internally, there are flags in the SV structure that hint as to what conversions have been done on the SV already, for efficiency's sake. As I suspected. I have to admit I don't know how the JSON module knows what is a number and what is a string in Perl but I see the same issue with Data::Dumper so I presume there must be some way to find out if a perl scalar is a number or a string. It seems like the pure perl Dumper tests values with a regex and optimizes to numbers when the stringification matches a number template, while the XS version checks the flags in the SV structure. OK, I only introduced the Data::Dumper example as it seemed very similar but read on - I over simplified a little. However, it seems strange to me that the pure perl Data::Dumper does something different. If I was using Data::Dumper for data interchange and switched from the pure perl to the XS version (or vice versa) and then called another XS based module with the results - they could be different. The problem gets a lot worse for me since I do some arithmetic on values pulled from the database before converting them to JSON and this is where Perl seems to change them into numbers e.g. I don't want to have to do ($var +0) on all the number fields I pull from the database (to turn them into numbers) and neither do I want to do a '$var .= ' (to turn all the fields into strings). you might have to do exactly that. $var will produce a string version. Sorry. This seems amazing - I really did not want to do this. I guess what I'm saying is that it would be really useful for a DBD to mark a number as a number so it does not start as a string and mutate to a number when I do arithmetic on it. Most DBDs must know if the column is a number one or not (certainly DBD::ODBC does). Apparently you can tell the JSON module to make everything strings: http://search.cpan.org/~makamaka/JSON-1.07/lib/JSON.pm#AUTOCONVERT Here was an over simplification. I moved from JSON to JSON Syck. The JSON module is a purl perl module and does the right thing for me for all the cases I tried but I changed to JSON Syck which uses the libsyck library for speed. This explains the difference since JSON Syck is XS and will know the flags in the SV structure and JSON won't. As an aside (and probably a perl question rather than a DBI one) does anyone know why the type of a scalar changes when you use it on the right side of an assignment: perl -MData::Dumper -le '$a=1; print Dumper($a); $b += $a; print Dumper($a);' $VAR1 = '1'; $VAR1 = 1; The has-been-evaluated-as-a-number flag got set on $a when it was evaluated as a number; then Dumper, with both available, chose the number format. That's interesting to know, I'll look into that. I was rather surprised the right hand side of an evaluation changed the type of a variable on the right hand side. How does JSON and Data::Dumper know whether Perl thinks something is a number or a string? inspecting the flags; except pure-perl Dumper apparently uses a regular expression to identify numbers. Those are guesses. The source is available for your inspection. I did look at it briefly but not extensively since it was not my primary problem. Thanks for your insights David, they have increased my understanding although I still feel stuck. I could move back to using the JSON module but I could run into problems with fields like house_name_or_number which I'm now guessing JSON will serialise as a number when it looks like a number and a string when not. If I stick with JSON Syck (which is a lot faster and I need the speed), I'm forced to do +0 on all database fields I know are a number and I don't like this at all. I guess I'm rather surprised to hit this issue without seeing anyone else with it after all this time using DBI. Martin -- Martin J. Evans Easysoft Limited http://www.easysoft.com
RE: When is a string a number?
This is a very interesting problem, Martin, because most of us ordinary Perl users don't experience these issues on a regular basis if ever. Just one thought, maybe a bit naiive, but wouldn't it be possible to copy all the values you've got from the database and do the extra computations on the copies separately? It might avoid polluting the originals before passing them off to JSON. -Will -Original Message- From: Martin J. Evans [mailto:[EMAIL PROTECTED] Sent: Wednesday 04 October 2006 15:36 To: dbi-users Subject: Re: When is a string a number? On Wed, 2006-10-04 at 14:00 -0500, David Nicol wrote: On 10/4/06, Martin J. Evans [EMAIL PROTECTED] wrote: With DBI/DBD::Oracle all values read from the database are scalars. As everyone will know, whether something read from the database is a string or a number in Perl purely depends on the context it is used in so: internally, there are flags in the SV structure that hint as to what conversions have been done on the SV already, for efficiency's sake. As I suspected. I have to admit I don't know how the JSON module knows what is a number and what is a string in Perl but I see the same issue with Data::Dumper so I presume there must be some way to find out if a perl scalar is a number or a string. It seems like the pure perl Dumper tests values with a regex and optimizes to numbers when the stringification matches a number template, while the XS version checks the flags in the SV structure. OK, I only introduced the Data::Dumper example as it seemed very similar but read on - I over simplified a little. However, it seems strange to me that the pure perl Data::Dumper does something different. If I was using Data::Dumper for data interchange and switched from the pure perl to the XS version (or vice versa) and then called another XS based module with the results - they could be different. The problem gets a lot worse for me since I do some arithmetic on values pulled from the database before converting them to JSON and this is where Perl seems to change them into numbers e.g. I don't want to have to do ($var +0) on all the number fields I pull from the database (to turn them into numbers) and neither do I want to do a '$var .= ' (to turn all the fields into strings). you might have to do exactly that. $var will produce a string version. Sorry. This seems amazing - I really did not want to do this. I guess what I'm saying is that it would be really useful for a DBD to mark a number as a number so it does not start as a string and mutate to a number when I do arithmetic on it. Most DBDs must know if the column is a number one or not (certainly DBD::ODBC does). Apparently you can tell the JSON module to make everything strings: http://search.cpan.org/~makamaka/JSON-1.07/lib/JSON.pm#AUTOCONVERT Here was an over simplification. I moved from JSON to JSON Syck. The JSON module is a purl perl module and does the right thing for me for all the cases I tried but I changed to JSON Syck which uses the libsyck library for speed. This explains the difference since JSON Syck is XS and will know the flags in the SV structure and JSON won't. As an aside (and probably a perl question rather than a DBI one) does anyone know why the type of a scalar changes when you use it on the right side of an assignment: perl -MData::Dumper -le '$a=1; print Dumper($a); $b += $a; print Dumper($a);' $VAR1 = '1'; $VAR1 = 1; The has-been-evaluated-as-a-number flag got set on $a when it was evaluated as a number; then Dumper, with both available, chose the number format. That's interesting to know, I'll look into that. I was rather surprised the right hand side of an evaluation changed the type of a variable on the right hand side. How does JSON and Data::Dumper know whether Perl thinks something is a number or a string? inspecting the flags; except pure-perl Dumper apparently uses a regular expression to identify numbers. Those are guesses. The source is available for your inspection. I did look at it briefly but not extensively since it was not my primary problem. Thanks for your insights David, they have increased my understanding although I still feel stuck. I could move back to using the JSON module but I could run into problems with fields like house_name_or_number which I'm now guessing JSON will serialise as a number when it looks like a number and a string when not. If I stick with JSON Syck (which is a lot faster and I need the speed), I'm forced to do +0 on all database fields I know are a number and I don't like this at all. I guess I'm rather surprised to hit this issue without seeing anyone else with it after all this time using DBI. Martin -- Martin J. Evans Easysoft Limited http://www.easysoft.com - - - - - Appended
RE: When is a string a number?
On Wed, 2006-10-04 at 15:50 -0400, Rutherdale, Will wrote: This is a very interesting problem, Martin, because most of us ordinary Perl users don't experience these issues on a regular basis if ever. Just one thought, maybe a bit naiive, but wouldn't it be possible to copy all the values you've got from the database and do the extra computations on the copies separately? It might avoid polluting the originals before passing them off to JSON. -Will It would be possible if it were not for the facts that: a) the values computed from the DB values are required in the JSON output. i.e. the perl code does some calculations on DB values that are logic based and difficult to perform in SQL before being transformed into JSON. b) the computations on the DB retrieved values are dependent on what values are returned so sometimes I do a $a += 1 before transforming to JSON and sometimes I don't. This itself means that I can end up with strings in the JSON or numbers depending on what logic is applied to the DB returned values. Martin -- Martin J. Evans Easysoft Limited http://www.easysoft.com -Original Message- From: Martin J. Evans [mailto:[EMAIL PROTECTED] Sent: Wednesday 04 October 2006 15:36 To: dbi-users Subject: Re: When is a string a number? On Wed, 2006-10-04 at 14:00 -0500, David Nicol wrote: On 10/4/06, Martin J. Evans [EMAIL PROTECTED] wrote: With DBI/DBD::Oracle all values read from the database are scalars. As everyone will know, whether something read from the database is a string or a number in Perl purely depends on the context it is used in so: internally, there are flags in the SV structure that hint as to what conversions have been done on the SV already, for efficiency's sake. As I suspected. I have to admit I don't know how the JSON module knows what is a number and what is a string in Perl but I see the same issue with Data::Dumper so I presume there must be some way to find out if a perl scalar is a number or a string. It seems like the pure perl Dumper tests values with a regex and optimizes to numbers when the stringification matches a number template, while the XS version checks the flags in the SV structure. OK, I only introduced the Data::Dumper example as it seemed very similar but read on - I over simplified a little. However, it seems strange to me that the pure perl Data::Dumper does something different. If I was using Data::Dumper for data interchange and switched from the pure perl to the XS version (or vice versa) and then called another XS based module with the results - they could be different. The problem gets a lot worse for me since I do some arithmetic on values pulled from the database before converting them to JSON and this is where Perl seems to change them into numbers e.g. I don't want to have to do ($var +0) on all the number fields I pull from the database (to turn them into numbers) and neither do I want to do a '$var .= ' (to turn all the fields into strings). you might have to do exactly that. $var will produce a string version. Sorry. This seems amazing - I really did not want to do this. I guess what I'm saying is that it would be really useful for a DBD to mark a number as a number so it does not start as a string and mutate to a number when I do arithmetic on it. Most DBDs must know if the column is a number one or not (certainly DBD::ODBC does). Apparently you can tell the JSON module to make everything strings: http://search.cpan.org/~makamaka/JSON-1.07/lib/JSON.pm#AUTOCONVERT Here was an over simplification. I moved from JSON to JSON Syck. The JSON module is a purl perl module and does the right thing for me for all the cases I tried but I changed to JSON Syck which uses the libsyck library for speed. This explains the difference since JSON Syck is XS and will know the flags in the SV structure and JSON won't. As an aside (and probably a perl question rather than a DBI one) does anyone know why the type of a scalar changes when you use it on the right side of an assignment: perl -MData::Dumper -le '$a=1; print Dumper($a); $b += $a; print Dumper($a);' $VAR1 = '1'; $VAR1 = 1; The has-been-evaluated-as-a-number flag got set on $a when it was evaluated as a number; then Dumper, with both available, chose the number format. That's interesting to know, I'll look into that. I was rather surprised the right hand side of an evaluation changed the type of a variable on the right hand side. How does JSON and Data::Dumper know whether Perl thinks something is a number or a string? inspecting the flags; except pure-perl Dumper apparently uses a regular expression to identify numbers. Those are guesses. The source is available