Re: [sqlite] Problem getting utf-8 text out of a text column

2013-03-22 Thread Simon Slavin

On 22 Mar 2013, at 3:04pm, Tim Streater  wrote:

> Having checked the db carefully, both using hex(col) (via the sqlite3 CLI 
> program), and using the unix hexdump utility on the db, I'm now reasonably 
> certain that the C2 bytes are simply not in the db. That they show up when I 
> examine the db with Navicat for SQLite Lite is another matter altogether.

The moral of the story (for anyone else finding this thread when googling to 
solve their own problem) is not to trust third-party software.  Write your own 
code or use the sqlite3 executable provided by the sqlite development team.  
Oh, and I should have thought of hex() myself.  Bad Simon.  No cookie.

> Sorry for the distraction.

I assume that means you have no more sqlite3-related problem.  Glad we could 
help.  Good luck with the rest of your project.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem getting utf-8 text out of a text column

2013-03-22 Thread Tim Streater
On 21 Mar 2013 at 23:30, Richard Hipp  wrote: 

> On Thu, Mar 21, 2013 at 6:52 PM, Tim Streater  wrote:
>
>> I have a column defined as TEXT and I insert utf-8 text into it. In
>> particular, there are byte sequences that might look as follows:
>> ... 74 6F 20 C2 A3 32 35 30 ... (in hex)
>> which represents:
>> ... to £250 ...
>> I used Navicat for SQLite Lite to peer at the data as hex and text in the
>> column, and it looks as I've presented it.
>> The problem comes when I try to retrieve the data using a PHP script. I
>> want the data back as bytes because I'm going to encode it as quoted
>> printable, with decimal byte values greater than 127 being separately
>> encoded. This would give the above as:
>> ... to =C2=A3250 ...
>> But: the byte stream I get back on doing a SELECT is always *missing* the
>> C2 byte. What am I missing here? I'm using the PHP PDO interface.

> What do you get back when you do select "hex(column)" instead of just
> "column" using PHP.  If the C2 byte is there, then I'm going to put the
> blame on PHP in this case.

Having checked the db carefully, both using hex(col) (via the sqlite3 CLI 
program), and using the unix hexdump utility on the db, I'm now reasonably 
certain that the C2 bytes are simply not in the db. That they show up when I 
examine the db with Navicat for SQLite Lite is another matter altogether.

Sorry for the distraction.

--
Cheers  --  Tim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem getting utf-8 text out of a text column

2013-03-21 Thread Richard Hipp
On Thu, Mar 21, 2013 at 6:52 PM, Tim Streater  wrote:

> I have a column defined as TEXT and I insert utf-8 text into it. In
> particular, there are byte sequences that might look as follows:
> ... 74 6F 20 C2 A3 32 35 30 ... (in hex)
> which represents:
> ... to £250 ...
> I used Navicat for SQLite Lite to peer at the data as hex and text in the
> column, and it looks as I've presented it.
> The problem comes when I try to retrieve the data using a PHP script. I
> want the data back as bytes because I'm going to encode it as quoted
> printable, with decimal byte values greater than 127 being separately
> encoded. This would give the above as:
> ... to =C2=A3250 ...
> But: the byte stream I get back on doing a SELECT is always *missing* the
> C2 byte. What am I missing here? I'm using the PHP PDO interface.
> Thanks,
>

What do you get back when you do select "hex(column)" instead of just
"column" using PHP.  If the C2 byte is there, then I'm going to put the
blame on PHP in this case.


> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem getting utf-8 text out of a text column

2013-03-21 Thread Tim Streater
Simon,

Thanks for the suggestions. I should have mentioned that I'm running under OS X 
so I have the sqlite3 application and can easily do the tests you suggest. But 
that'll have to be tomorrow - time for shuteye now :-)

Cheers -- tim
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Problem getting utf-8 text out of a text column

2013-03-21 Thread Simon Slavin

On 21 Mar 2013, at 10:52pm, Tim Streater  wrote:

> I have a column defined as TEXT and I insert utf-8 text into it. In 
> particular, there are byte sequences that might look as follows:
> ... 74 6F 20 C2 A3 32 35 30 ... (in hex)
> which represents:
> ... to £250 ...
> I used Navicat for SQLite Lite to peer at the data as hex and text in the 
> column, and it looks as I've presented it.

Thanks for the detailed report which saves us a lot of time asking questions.  
Can you use the sqlite shell tool, rather than Navicat, to look at the database 
file and see the contents ?  You can download compiled versions for most 
platforms from



In particular please use something like

SELECT myField,length(myField) FROM myTable

to check how long it thinks the field is.

> The problem comes when I try to retrieve the data using a PHP script. I want 
> the data back as bytes because I'm going to encode it as quoted printable, 
> with decimal byte values greater than 127 being separately encoded. This 
> would give the above as:
> ... to =C2=A3250 ...
> But: the byte stream I get back on doing a SELECT is always *missing* the C2 
> byte. What am I missing here? I'm using the PHP PDO interface.

Again, can you try retreiving the length of the field as part of the same 
SELECT and see whether the field length agrees with the C2 being missing ?  And 
also compare that with the result of doing strlen(theString) as a PHP command 
and find out whether all three agree ?

You could try doing same thing using the sqlte3 interface which is closer to 
'pure sqlite3' if you think that'll help.  Using that library will involve a 
cleaner path with fewer conversions and processes.  It's less likely to give 
you such problems.

$dbConnection = new SQLite3('testdb.sqlite');
$dbConnection->exec("INSERT INTO myTable VALUES ('to £250'");  // should really 
check return value
$retrieved = $dbConnection->querySingle('SELECT theValue,length(theValue) FROM 
myTable', true);
echo $retrieved[0], $retrieved[1], strlen($retrieved[0]);

If that works, you could write with one and read with the other and use that to 
figure out whether the fault is in the bit that stores or retrieves the value.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Problem getting utf-8 text out of a text column

2013-03-21 Thread Tim Streater
I have a column defined as TEXT and I insert utf-8 text into it. In particular, 
there are byte sequences that might look as follows:
... 74 6F 20 C2 A3 32 35 30 ... (in hex)
which represents:
... to £250 ...
I used Navicat for SQLite Lite to peer at the data as hex and text in the 
column, and it looks as I've presented it.
The problem comes when I try to retrieve the data using a PHP script. I want 
the data back as bytes because I'm going to encode it as quoted printable, with 
decimal byte values greater than 127 being separately encoded. This would give 
the above as:
... to =C2=A3250 ...
But: the byte stream I get back on doing a SELECT is always *missing* the C2 
byte. What am I missing here? I'm using the PHP PDO interface.
Thanks,
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users