Hi Lucy,
As far as I know, Excel (all versions) is notoriously bad at handling character encodings. This rather old Stack Overflow question seems to confirm that:
http://stackoverflow.com/questions/4221176/excel-to-csv-with-utf8-encoding
It does offer some workarounds, but none of them are very nice.

I would suggest writing your CSV files with LibreOffice/OpenOffice instead. It is free and you should be able to install it. While it is not always an exact replacement for Excel, when it comes to character encodings it just works. By default it saves files as UTF-8 (at least under Linux), and it will ask whether you want to save in a different encoding.

Cheers,
Koen

On Friday, 22 January 2016 at 15:05:52 UTC+1, Lucy FJ wrote:
>
> Hi Adam and Alexei,
>
> I forgot to add that the diacriticals are in the altnames at rows 132 to 136 when editing in Excel.
>
> Lucy
>
> ----- Original Message -----
> *From:* Adam Cox
> *To:* Lucy Fletcher-Jones
> *Cc:* Alexei Peters; Arches Project
> *Sent:* Thursday, January 21, 2016 5:36 PM
> *Subject:* Re: [Arches] Diacriticals in authority and .Arches files problems
>
> Hi Lucy, you can check the encoding in Notepad++. Open your authority document with that program and click the Encoding menu. Your file should be in "UTF-8" or "UTF-8 without BOM" (depending on the version of Notepad++ you have). The î character should work as far as I know...
>
> On Thu, Jan 21, 2016 at 7:18 AM, 'Lucy Fletcher-Jones' via Arches Project <[email protected]> wrote:
>
>> Hi Alexei,
>>
>> Thank you for looking into this. I am glad to hear that Arches should support diacriticals.
>>
>> Here is the error message on loading the 'Ruler' Authority document:
>>
>> RULER_AUTHORITY_DOCUMENT.csv
>>
>> ERRORS IN FILE: RULER_AUTHORITY_DOCUMENT.values.csv
>>
>> ERRORS IN FILE: RULER_AUTHORITY_DOCUMENT.csv
>>
>> ERROR: Make sure the file is saved with UTF-8 encoding
>> 'utf8' codec can't decode byte 0xea in position 30: invalid continuation byte
>> Traceback (most recent call last):
>>   File "/opt/projects/ENV/lib/python2.7/site-packages/arches/management/commands/package_utils/authority_files.py", line 112, in load_authority_file
>>     for row in rows:
>>   File "/opt/projects/ENV/lib/python2.7/site-packages/unicodecsv/py2.py", line 217, in next
>>     row = csv.DictReader.next(self)
>>   File "/usr/local/lib/python2.7/csv.py", line 104, in next
>>     row = self.reader.next()
>>   File "/opt/projects/ENV/lib/python2.7/site-packages/unicodecsv/py2.py", line 128, in next
>>     for value in row]
>>   File "/opt/projects/ENV/lib/python2.7/encodings/utf_8_sig.py", line 22, in decode
>>     (output, consumed) = codecs.utf_8_decode(input, errors, True)
>> UnicodeDecodeError: 'utf8' codec can't decode byte 0xea in position 30: invalid continuation byte
>>
>> ERROR in row 31 (Legacyoid (RULER_UID:30) not found. Make sure your ParentConceptid in the
>>
>> This caused further errors in the Ruler Values files, as can be seen above.
>> I do not have a copy of the authority file that caused the error, as I have since corrected it and changed it in a few places. But the alternative name was
>>
>> Ptolemaîos Philadelphos
>>
>> and I believe it was the circumflex above the 'i' that caused the problem. Certainly when I removed the circumflex, the file loaded OK.
>>
>> Thank you,
>> Lucy
>>
>> ----- Original Message -----
>> *From:* Alexei Peters
>> *To:* Lucy FJ
>> *Cc:* Arches Project
>> *Sent:* Wednesday, January 20, 2016 8:24 PM
>> *Subject:* Re: [Arches] Diacriticals in authority and .Arches files problems
>>
>> Hi Lucy,
>> The .arches file should support diacritics. I'm actually surprised that the authority files don't. I just tested a local file and I was able to add these records:
>>
>> conceptid,PrefLabel,AltLabels,ParentConceptid,ConceptType,Provider
>> 20000001-0000-0000-0000-000000000000,Portland,,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI
>> 20000002-0000-0000-0000-000000000000,San Francisco,The Bay Area,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI
>> 20000003-0000-0000-0000-000000000000,San Jose,San José,CITY_AUTHORITY_DOCUMENT.csv,Index,GCI
>>
>> Notice that the alt label for San Jose is San José.
>>
>> Can you share the authority file that you're having trouble with?
>> Cheers,
>> Alexei
>>
>> Director of Web Development - Farallon Geographics, Inc. - 971.227.3173
>>
>> On Wed, Jan 20, 2016 at 12:32 AM, Lucy FJ <[email protected]> wrote:
>>
>>> Hi all,
>>> We have been loading customised authority files and have noticed that Arches rejects words with diacriticals (accents etc.). This is not a problem for us, as we were happy to remove them, and if we really want them we can enter them through the RDM. But will this problem occur when loading resource data through .arches? We need to input place names as alternative names using diacriticals, and it would be much easier if we could do this via .arches files. We know we can input them using the resource data manager, but obviously when dealing with about 3000 entries this is time-consuming.
>>> Any ideas?
>>> Lucy
>>>
>>> --
>>> -- To post, send email to [email protected]. To unsubscribe, send email to [email protected].
>>> For more information, visit https://groups.google.com/d/forum/archesproject?hl=en
>>> ---
>>> You received this message because you are subscribed to the Google Groups "Arches Project" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
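
For anyone who hits the same "can't decode byte ... invalid continuation byte" error, here is a minimal Python 3 sketch of how one might locate the offending byte and re-save the file as UTF-8 without going through Excel. It is not part of Arches. The function names `find_invalid_utf8` and `to_utf8` and the sample row are made up for illustration, and the fallback encoding `cp1252` is only an assumption about what a Western-European Windows export is likely to produce. For reference, in Windows-1252 and Latin-1 the byte 0xea is ê and 0xee is î, so it may be worth checking which character actually sits at the reported position.

```python
def find_invalid_utf8(data):
    """Return (offset, byte value) of the first byte that breaks UTF-8
    decoding, or None if the bytes are valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, data[exc.start]
    return None


def to_utf8(data, fallback="cp1252"):
    """Return `data` re-encoded as UTF-8 without a BOM.

    Bytes that are already valid UTF-8 are kept (a leading BOM is dropped).
    Otherwise they are reinterpreted as `fallback`, which is a guess: check
    the result by eye before loading it anywhere.
    """
    try:
        text = data.decode("utf-8-sig")
    except UnicodeDecodeError:
        # Raises again if the bytes are not valid in the fallback either.
        text = data.decode(fallback)
    return text.encode("utf-8")


# Hypothetical row, encoded the way a legacy Windows CSV export might write
# it ("\u00ee" is the character î).
legacy_row = "RULER_UID:30,Ptolema\u00eeos Philadelphos\r\n".encode("cp1252")

print(find_invalid_utf8(legacy_row))   # offset and value of the bad byte
fixed_row = to_utf8(legacy_row)
print(find_invalid_utf8(fixed_row))    # None once the row is valid UTF-8
```

To apply this to a real file, read it in binary mode (`open(path, "rb").read()`), pass the bytes through `to_utf8`, and write the result to a new file in binary mode, keeping the original until the converted copy loads cleanly.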
