Hi Koen,

Thank you for this information. I did try out some of the suggestions on Google for using Excel to create UTF-8 files, because I like using Excel and know it well, but the ones I tried are overcomplicated and produce a CSV file in UTF-8 with BOM format, which I believe will not work in Arches. It looks like I will need to download OpenOffice as you suggest. Must all files loaded into Arches be UTF-8 only?
Lucy

On Friday, January 22, 2016 at 4:24:42 PM UTC+2, Koen Van Daele wrote:
>
> Hi Lucy,
>
> As far as I know, Excel (all versions) is notoriously bad at handling
> things like character encodings. This rather old Stack Overflow question
> seems to confirm that:
>
> http://stackoverflow.com/questions/4221176/excel-to-csv-with-utf8-encoding
>
> It does offer some workarounds, but none of them are very nice.
>
> I would suggest writing your CSV files with LibreOffice/OpenOffice. You
> should be able to install it and it's free. While it's not always an exact
> replacement for Excel, when it comes to character encodings, it just works.
> By default it will save things as UTF-8 (at least under Linux it does) and
> it will ask you if you want to save in a different encoding.
>
> Cheers,
> Koen
>
> On Friday, 22 January 2016 at 15:05:52 UTC+1, Lucy FJ wrote:
>>
>> Hi Adam and Alexei,
>>
>> I forgot to add that the diacriticals are in the altnames at rows 132 to
>> 136 when editing in Excel.
>>
>> Lucy
>>
>> ----- Original Message -----
>> *From:* Adam Cox
>> *To:* Lucy Fletcher-Jones
>> *Cc:* Alexei Peters ; Arches Project
>> *Sent:* Thursday, January 21, 2016 5:36 PM
>> *Subject:* Re: [Arches] Diacriticals in authority and .Arches files
>> problems
>>
>> Hi Lucy, you can check the encoding in Notepad++. Open your authority
>> document with that program, and click the Encoding menu. Your file should
>> be in "UTF-8" or "UTF-8 without BOM" (depends on the version of Notepad++
>> you have). The î character should work as far as I know...
>>
>> On Thu, Jan 21, 2016 at 7:18 AM, 'Lucy Fletcher-Jones' via Arches Project
>> <[email protected]> wrote:
>>
>>> Hi Alexei,
>>>
>>> Thank you for looking into this. I am glad to hear that Arches should
>>> support diacriticals.
>>>
>>> Here is the error message on loading the 'Ruler' Authority document:
>>>
>>> RULER_AUTHORITY_DOCUMENT.csv
>>>
>>> ERRORS IN FILE: RULER_AUTHORITY_DOCUMENT.values.csv
>>>
>>> ERRORS IN FILE: RULER_AUTHORITY_DOCUMENT.csv
>>>
>>> ERROR: Make sure the file is saved with UTF-8 encoding
>>> 'utf8' codec can't decode byte 0xea in position 30: invalid continuation
>>> byte
>>> Traceback (most recent call last):
>>>   File "/opt/projects/ENV/lib/python2.7/site-packages/arches/management/commands/package_utils/authority_files.py", line 112, in load_authority_file
>>>     for row in rows:
>>>   File "/opt/projects/ENV/lib/python2.7/site-packages/unicodecsv/py2.py", line 217, in next
>>>     row = csv.DictReader.next(self)
>>>   File "/usr/local/lib/python2.7/csv.py", line 104, in next
>>>     row = self.reader.next()
>>>   File "/opt/projects/ENV/lib/python2.7/site-packages/unicodecsv/py2.py", line 128, in next
>>>     for value in row]
>>>   File "/opt/projects/ENV/lib/python2.7/encodings/utf_8_sig.py", line 22, in decode
>>>     (output, consumed) = codecs.utf_8_decode(input, errors, True)
>>> UnicodeDecodeError: 'utf8' codec can't decode byte 0xea in position 30:
>>> invalid continuation byte
>>>
>>> ERROR in row 31 (Legacyoid (RULER_UID:30) not found. Make sure your
>>> ParentConceptid in the
>>>
>>> This caused further errors in the Ruler Values files as can be seen from
>>> above.
>>> I do not have a copy of the authority file that caused the error as I
>>> have since corrected it and changed it in a few places. But the alternative
>>> name was
>>>
>>> Ptolemaîos Philadelphos
>>>
>>> and I believe it was the circumflex above the 'i' that caused the
>>> problem. Certainly when I removed the circumflex, the file loaded OK.
>>>
>>> Thank you,
>>> Lucy
>>>
>>> ----- Original Message -----
>>> *From:* Alexei Peters
>>> *To:* Lucy FJ
>>> *Cc:* Arches Project
>>> *Sent:* Wednesday, January 20, 2016 8:24 PM
>>> *Subject:* Re: [Arches] Diacriticals in authority and .Arches files
>>> problems
>>>
>>> Hi Lucy,
>>> The .arches file should support diacritics. I'm actually surprised that
>>> the authority files don't. I just tested a local file and I was able to
>>> add these records:
>>>
>>> conceptid,PrefLabel,AltLabels,ParentConceptid,ConceptType,Provider
>>> 20000001-0000-0000-0000-000000000000,Portland,,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI
>>> 20000002-0000-0000-0000-000000000000,San Francisco,The Bay Area,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI
>>> 20000003-0000-0000-0000-000000000000,San Jose,San José,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI
>>>
>>> Notice that the alt label for San Jose is San José.
>>>
>>> Can you share the authority file that you're having trouble with?
>>> Cheers,
>>> Alexei
>>>
>>> Director of Web Development - Farallon Geographics, Inc. - 971.227.3173
>>>
>>> On Wed, Jan 20, 2016 at 12:32 AM, Lucy FJ <[email protected]> wrote:
>>>
>>>> Hi all,
>>>> We have been loading customised authority files and have noticed that
>>>> Arches rejects words with diacriticals (accents etc). This is not a
>>>> problem for us, as we were happy to remove them, and if we really want
>>>> them we can enter them through the RDM. But will this problem occur
>>>> when loading resource data through .arches? We need to input place
>>>> names as alternative names using diacriticals, and it would be much
>>>> easier if we can do this via .arches files. We know we can input them
>>>> using the resource data manager, but obviously when dealing with about
>>>> 3000 entries, this is time consuming.
>>>> Any ideas?
>>>> Lucy
>>>>
>>>> --
>>>> -- To post, send email to [email protected]. To unsubscribe,
>>>> send email to [email protected].
>>>> For more information,
>>>> visit https://groups.google.com/d/forum/archesproject?hl=en
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Arches Project" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
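For anyone who would rather script the encoding check discussed in this thread than use Notepad++, here is a minimal sketch. It assumes Python 3 (the traceback in the thread is from Python 2.7) and that the Excel export is in Windows-1252, which is only a guess about the original file. The function names are hypothetical and not part of Arches. One observation from the traceback: it passes through encodings/utf_8_sig.py, which suggests, though nothing in the thread confirms it, that the loader decodes with Python's utf-8-sig codec, and that codec skips a leading BOM.

```python
from pathlib import Path
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, byte value) of the first byte that breaks UTF-8
    decoding, or None if the data is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


def to_utf8_without_bom(data: bytes, fallback: str = "cp1252") -> bytes:
    """Re-encode data as UTF-8 with no BOM.

    Valid UTF-8 input only loses its leading BOM, if it has one. Anything
    else is decoded with `fallback`, an ASSUMED source encoding; pick the
    one your spreadsheet program actually used.
    """
    try:
        text = data.decode("utf-8-sig")  # drops a leading BOM if present
    except UnicodeDecodeError:
        text = data.decode(fallback)
    return text.encode("utf-8")


def convert_file(src: str, dst: str, fallback: str = "cp1252") -> None:
    """Read src, write a UTF-8 (no BOM) copy to dst. Paths are up to you."""
    Path(dst).write_bytes(to_utf8_without_bom(Path(src).read_bytes(), fallback))


# Example with the alternative name from the thread, encoded the way a
# Windows-1252 export would store it:
legacy = "Ptolemaîos Philadelphos".encode("cp1252")
print(first_invalid_utf8(legacy))      # offset and value of the offending byte
print(to_utf8_without_bom(legacy).decode("utf-8"))
```

In Windows-1252 and Latin-1, î is byte 0xee and 0xea is ê, so the byte reported in the traceback may belong to a different character than the one in Ptolemaîos. Running first_invalid_utf8 on the raw file bytes is one way to find out which one it was.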
