On 12.05.2012 03:00, OSGeo wrote:
> #896: sphinx doc build is broken because of BOM
> ---------------------+------------------------------------------------------
>  Reporter:  fgdrf    |       Owner:  live-demo@…
>      Type:  defect   |      Status:  new
>  Priority:  major    |   Milestone:
> Component:  LiveDVD  |    Keywords:
> ---------------------+------------------------------------------------------
>
> Comment(by hamish):
>
>  the Byte Order Mark has been added and removed from the .csv lists of
>  contributers for a while now.
>
>  I haven't really been sure if they should be there or not so only did a
>  quick edit just before the last release to stop the table creation from
>  breaking.
>
>  It's easy enough to open with vi and delete the first two chars in the
>  file if needed.. Converting UTF back to ISO-8859-1 isn't too bad either:
>   `iconv -f UTF-8 -t ISO_8859-1 utf_file > iso_file`
>
>
>  Qs:
>   * Should the BOM be there or not?

according to http://en.wikipedia.org/wiki/Byte_order_mark it is
meaningless for UTF-8 but allowed.

>   * What files (if any) should be saved in UTF-8, and why? (ISO will not
>     handle non-Western multibytes, but that doesn't necessitate that the
>     English/Western pages also be in UTF)
>
>  this is out of my area of expertise, but the constant "last committer
>  wins" back and forth of text file variants is as we see here causing
>  problems.
>

WHICH:
i'd suggest keeping realms where everything is in *one* character
encoding, which can be announced so people can set up their editors
properly, e.g. UTF-8 for the docs.

WHY:
users of languages with characters outside latin-1 (aka ISO_8859-1) can
then write names and texts natively without having to escape or convert
them. as UTF-8 is backwards compatible with ASCII, it also keeps at least
this (currently most important, user-base-wise) area intact even on
misconversion.

editor software usually warns when trying to open or save unsupported
characters into a different character set.

we could actually use svn properties to assign a MIME type and character
set to specific files, which is respected by most svn clients.

for the BOM issue:
i don't know the sphinx internals, but would it be too difficult to
conditionally strip the BOM on each file read? just for safety?

..ede
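
to illustrate the "strip the BOM conditionally on read" idea, here is a minimal sketch in Python (the language sphinx is written in). the helper name `read_text_without_bom` is made up for this example, and this is not how sphinx itself reads its sources; where such a step would hook into the doc build is left open.

```python
import codecs


def read_text_without_bom(path, encoding="utf-8"):
    """Hypothetical helper: decode the file at `path`, dropping a leading
    UTF-8 BOM if (and only if) one is present."""
    with open(path, "rb") as f:
        raw = f.read()
    # The UTF-8 BOM is the three bytes EF BB BF at the very start of the file.
    if raw.startswith(codecs.BOM_UTF8):
        raw = raw[len(codecs.BOM_UTF8):]
    return raw.decode(encoding)
```

files without a BOM pass through unchanged, so it would not matter which variant the last committer saved. Python's built-in `utf-8-sig` codec behaves the same way when decoding, so `open(path, encoding="utf-8-sig")` would be a shorter alternative.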
