Issue #857 has been reported by Clément OUDOT.

----------------------------------------
Bug #857: UTF-8 csv detection too restrictive
http://tools.lsc-project.org/issues/857

Author: Clément OUDOT
Status: New
Priority: Normal
Assigned to: 
Category: Sample
Target version: 2.2
Problem in version: 


Reported by François Tamone:

In lsc-sample, the UTF-8 test on the .csv file to be imported into hsqldb 
fails because it is too restrictive: it takes the second field of the output 
of the
<pre>
  file input.csv
</pre>

command. That command runs three sets of tests and may give more than one 
answer.

On my CSV file, on Debian 7 with file version 5.11, it returns:
<pre>
  ../input.csv: C source, UTF-8 Unicode text
</pre>

The lsc-sample script runs the "file" command and takes the second field, 
using a space as the separator. Here that field is "C" rather than "UTF-8", 
so the test fails.
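
The failure can be reproduced without running file at all, by feeding the 
output line quoted above through the same cut call. This is only an 
illustrative sketch: the second_field helper is a hypothetical name and is 
not part of lsc-sample.

```shell
#!/bin/sh
# Output line reported above for file 5.11 on Debian 7.
sample='../input.csv: C source, UTF-8 Unicode text'

# Hypothetical helper mirroring the cut call used in lsc-sample:
# split on single spaces and keep the second field.
second_field() {
    printf '%s\n' "$1" | cut -d' ' -f2
}

encoding=$(second_field "$sample")
if [ "$encoding" = "UTF-8" ]; then
    echo "detected as UTF-8"
else
    echo "not detected as UTF-8 (second field is '$encoding')"
fi
```

The first field is the file name with its trailing colon, so the second 
field is whatever word happens to start the description.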


Searching for the substring "UTF-8" anywhere in the output might work a 
little better, as in:

<pre>
diff lsc-sample lsc-sample-new
125,126c125,126
<         encoding=`file $1  2> /dev/null | cut -d' ' -f2`
<         if [ "x$encoding" == "xUTF-8" ] ; then
---
>         encoding=`file $1`
>         if [ "$encoding" != "${encoding%UTF-8*}" ] ; then 
</pre>
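
The same substring idea could also be written with a case pattern, which 
some may find easier to read than the parameter expansion. The sketch below 
is an assumption about how this might look, not the actual lsc-sample code: 
mentions_utf8 and is_utf8_file are hypothetical names. It uses the -b (brief) 
option of file so that the file name is left out of the output and a name 
containing "UTF-8" cannot match by accident.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the description mentions UTF-8 anywhere.
mentions_utf8() {
    case "$1" in
        *UTF-8*) return 0 ;;
        *)       return 1 ;;
    esac
}

# Hypothetical wrapper around file; -b prints the description only,
# without the leading "name:" field.
is_utf8_file() {
    mentions_utf8 "$(file -b "$1" 2> /dev/null)"
}

# Check against the output line reported in this issue.
if mentions_utf8 '../input.csv: C source, UTF-8 Unicode text'; then
    echo "UTF-8 detected"
fi
```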


_______________________________________________________________
Ldap Synchronization Connector (LSC) - http://lsc-project.org

lsc-dev mailing list
[email protected]
http://lists.lsc-project.org/listinfo/lsc-dev
