Thanks Anthony,
I will try your suggestions when I'm back in the office tomorrow; I'm on a study day today for my OU course.

It would be good to track this down and fix it without having to refactor to Win32::ODBC. I eventually want to look at replacing my own DBI wrapper with the DBIC ORM, but I am concerned this won't be possible if I can't get DBI to play ball with MS.

>> Your data is being stored in Unicode data typed columns right?

Yes, it's NVARCHAR(max), which I understand is MS's data type for uNicode VARiable CHARacters. Looking at some sample column data via the Windows SQL Management GUI, it appears to display OK.

I know that the data being pasted into it comes from an MS Access front-end application that is linked to the same backend SQL Server. I also know that the input is a memo / rich text box control on the form (view), bound directly to the table column via a linked table definition on the backend SQL Server, and that some of what users enter is copied and pasted from emails and MS Word documents (and possibly PDFs).

I can't see any odd characters looking at a small amount of sample data on the SQL Server, and the data comes out of Win32::ODBC looking OK too.

From what I can tell the data is in Unicode during capture and storage; it's just the retrieval with DBI where it seems to be breaking down.

I already have to include a longread setting when using DBI with DBD::ODBC against SQL Server, otherwise it falls over because the data is too long, so perhaps there is another parameter I need?

I really appreciate all the help you guys have given so far, thank you.

Regards,
Craig

________________________________
From: Anthony Lucas [anthonyjlu...@gmail.com]
Sent: 04 July 2013 01:09
To: The elegant MVC web framework
Subject: Re: [Catalyst] CSV / UTF-8 / Unicode

On 3 July 2013 11:18, Craig Chant <cr...@homeloanpartnership.com> wrote:

>> Maybe write a standalone test and take Catalyst and browser quirks out of the picture.
I have already done this. I have two SQL wrapper modules, one that uses DBD::ODBC (via DBI) and one that uses Win32::ODBC. I ran the same standalone script that produces CSV output against each; the only difference between the tests was that one accessed SQL with the DBI wrapper and the other with the Win32::ODBC wrapper. DBI output junk characters, Win32::ODBC didn't.

What else should I be doing to test for the culprit of the corruption?

You need to see how they are using the ODBC API underneath for handling the data and encoding. Setting the trace flag on DBI (i.e. DBI->trace(n)) will expose the DBD::ODBC activity. I'm not sure what debugging is available for Win32::ODBC.

One thing I would check first is what they are treating the column data as. If DBD::ODBC is treating the columns as WCHAR but Win32::ODBC is treating them as CHAR and then doing extra "magic" decoding (or not), then you've found a big clue. There has to be different handling or differing levels of ODBC support somewhere.

I would assume that DBD::ODBC is doing "the right thing" and that something else is amiss upstream (but never assume with Unicode handling, so make sure with the trace).

>> Also, you are aware that your data will probably be coming back as UCS2 if you're using SQL Server right?

No, what is UCS2, and is it handled differently in DBD::ODBC vs Win32::ODBC?

From what I understand, is this ultimately what you've got happening?

Original Input Data -> SQL Client -> Database Driver -> Database (UCS2) -> Windows ODBC Driver -> DBD::ODBC -> Catalyst(?)

If so, since you're storing the data as Unicode and the database driver knows this (because your column type is NVARCHAR etc.), conversion to UCS2 happens at the driver stage on Windows. This is lossless between the different Unicode encodings, so just make sure your input is actual good Unicode up to that point and your data is being stored correctly.

Your data is being stored in Unicode data typed columns right?
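To make the "lossless" point concrete, here is a small standalone sketch. It is in Python only because it is quick to run without a database; the same checks could be written in Perl with the Encode module. It rests on two assumptions that are not established anywhere in this thread: that the "UCS2" storage behaves like UTF-16LE for these characters, and that the junk comes from UTF-8 bytes being read as a single-byte charset. That is one common cause of this kind of output, not a diagnosis of what DBD::ODBC is doing here. The function names are made up for the example.

```python
# Illustration only: plain encoding round-trips, not what DBD::ODBC or
# Win32::ODBC actually do internally.

def round_trips(text: str) -> bool:
    """True if text survives UTF-8 -> UTF-16LE -> UTF-8 unchanged."""
    utf8_in = text.encode("utf-8")
    # Assumed stand-in for the "UCS2" form the driver converts to.
    wide = utf8_in.decode("utf-8").encode("utf-16-le")
    utf8_out = wide.decode("utf-16-le").encode("utf-8")
    return utf8_in == utf8_out


def misread_utf8_as_latin1(text: str) -> str:
    """Simulate a consumer that treats UTF-8 bytes as Latin-1 characters."""
    return text.encode("utf-8").decode("latin-1")


def narrow_to_latin1(text: str) -> bytes:
    """Simulate forcing wide data into a single-byte CHAR-style buffer."""
    return text.encode("latin-1", errors="replace")


if __name__ == "__main__":
    sample = "Café £100"  # the sort of text pasted from Word or email
    print(round_trips(sample))             # conversion itself loses nothing
    print(misread_utf8_as_latin1(sample))  # junk chars: CafÃ© Â£100
    print(narrow_to_latin1("“quoted”"))    # curly quotes become '?'
```

If the junk in the CSV looks like the second line of output (two odd characters where one accented or currency character should be), that would suggest the bytes are fine and are being decoded with the wrong charset somewhere after the driver.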
_______________________________________________
List: Catalyst@lists.scsys.co.uk
Listinfo: http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst
Searchable archive: http://www.mail-archive.com/catalyst@lists.scsys.co.uk/
Dev site: http://dev.catalyst.perl.org/