Hi Darcy, this may not be the most elegant answer, but with the basic
testing I can do now I believe it will work.

Add the following function to the top of the file in question
<https://github.com/archesproject/arches/blob/stable/3.x/arches/management/commands/packages.py#L352>,
right under the import statements:

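def encode_utf8(val):
    # unicode values (and ascii-only str values) encode directly
    try:
        utf8 = val.encode('utf-8')
    # ints, floats, None, etc. have no .encode(), so stringify them first
    except AttributeError:
        utf8 = str(val).encode('utf-8')
    return utf8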

Then change this line

csvwriter.writerow({k: str(v).encode('utf8') for k, v in
csv_record.items()})

to

csvwriter.writerow({k: encode_utf8(v) for k, v in csv_record.items()})
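
As a quick sanity check, here is a minimal sketch of what that change does to a mixed record. The record contents are made up for illustration (they are not from the Arches export), and the helper is repeated so the snippet stands alone. The explicit .encode('ascii') call is only a stand-in for what str() does implicitly to a unicode object in Python 2.7.

```python
# -*- coding: utf-8 -*-

def encode_utf8(val):
    try:
        return val.encode('utf-8')
    except AttributeError:
        # ints, floats, None have no .encode(): stringify first
        return str(val).encode('utf-8')

# Hypothetical record: one text value containing an en dash (U+2013)
# and one integer. Keys and values are invented for this example.
csv_record = {u'notes': u'1990\u20131995', u'count': 3}

# Stand-in for the implicit ASCII step that str() performs on a
# unicode object in Python 2.7: this is where the export fails.
try:
    csv_record[u'notes'].encode('ascii')
    ascii_step_failed = False
except UnicodeEncodeError:
    ascii_step_failed = True

# The replacement expression: text goes straight to UTF-8,
# non-text values are stringified and then encoded.
encoded = {k: encode_utf8(v) for k, v in csv_record.items()}
```

The point of the design is simply to skip the ASCII step for values that are already text, while still stringifying integers and the like.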

Let me know if you have any trouble with that. It will probably be easiest to
just modify the file in your virtual environment.

Adam


On Mon, Sep 18, 2017 at 8:04 PM, Darcy Christ <[email protected]> wrote:

> Hi Adam,
>
> Still not sure how to get past this. I see it is an n-dash (I should have
> looked this up, rather than assumed it was related to Chinese).
>
> The question for me is why this code assumes everything is ASCII when an
> en dash was allowed into the database.
>
> Is there a way to fix this code, rather than update the content?
>
>
> Darcy
>
>
> On Saturday, September 16, 2017 at 12:59:44 AM UTC+10, Adam Cox wrote:
>>
>> Darcy, it looks like this is not a Chinese character, but a long dash.
>>
>> This issue seems to be well summed up in this Stack Overflow answer:
>> https://stackoverflow.com/a/5387966/3873885
>>
>> Essentially, in this line
>>
>> csvwriter.writerow({k: str(v).encode('utf8') for k, v in
>> csv_record.items()})
>>
>> the str() call first encodes v (which in this case is a unicode object)
>> to ASCII, the default encoding in Python 2.7. Only then is that
>> ASCII-encoded string encoded again as UTF-8. I assume the initial str()
>> call is meant to handle integers and other non-text objects, but you've
>> found a case where u'\u2013' (Unicode character 2013
>> <http://www.fileformat.info/info/unicode/char/2013/index.htm>) cannot be
>> encoded in ASCII, so it hits an error before it ever gets the chance to
>> encode to UTF-8.
>> So, I think that line in the code could be improved.
>>
>> It looks like that line comes from the related resource export
>> <https://github.com/archesproject/arches/blob/stable/3.x/arches/management/commands/packages.py#L352>
>> process. Maybe you have a long dash in one of the notes that you have about
>> a resource to resource relationship? Otherwise, I'm really not sure where
>> that problem character would come from...
>>
>> Hope that's at least a little helpful.
>>
>> Adam
>>
>> On Thu, Sep 14, 2017 at 8:01 PM, Darcy Christ <[email protected]>
>> wrote:
>>
>>> I am having trouble exporting data from v3
>>>
>>> I have added this to my config:
>>>
>>> EXPORT_CONFIG = os.path.normpath(os.path.join(PACKAGE_ROOT,
>>> 'source_data', 'business_data', 'resource_export_mappings.json'))
>>>
>>>
>>> And then I get an error while trying to export. Given that it is related
>>> to encoding, could it be caused by Chinese characters in my data?
>>>
>>>
>>> (hkarches) [hkarches@heritage hongkong]$ python manage.py packages -o
>>> export_resources -d '../hongkong_data'
>>> operation: export_resources
>>> package: hongkong
>>> Writing 3 ACTIVITY.E7 resources
>>> Writing 370 INFORMATION_RESOURCE.E73 resources
>>> Writing 1205 HERITAGE_RESOURCE.E18 resources
>>> Writing 545 ACTOR.E39 resources
>>> Writing 6 HISTORICAL_EVENT.E5 resources
>>> Writing 0 HERITAGE_RESOURCE_GROUP.E27 resources
>>> Traceback (most recent call last):
>>>   File "manage.py", line 28, in <module>
>>>     execute_from_command_line(sys.argv)
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/django/core/management/__init__.py",
>>> line 399, in execute_from_command_line
>>>     utility.execute()
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/django/core/management/__init__.py",
>>> line 392, in execute
>>>     self.fetch_command(subcommand).run_from_argv(self.argv)
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/django/core/management/base.py",
>>> line 242, in run_from_argv
>>>     self.execute(*args, **options.__dict__)
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/django/core/management/base.py",
>>> line 285, in execute
>>>     output = self.handle(*args, **options)
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/arches/management/commands/packages.py",
>>> line 106, in handle
>>>     self.export_resources(package_name, options['dest_dir'])
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/arches/management/commands/packages.py",
>>> line 351, in export_resources
>>>     csvwriter.writerow({k: str(v).encode('utf8') for k, v in
>>> csv_record.items()})
>>>   File 
>>> "/home/hkarches/lib/python2.7/site-packages/arches/management/commands/packages.py",
>>> line 351, in <dictcomp>
>>>     csvwriter.writerow({k: str(v).encode('utf8') for k, v in
>>> csv_record.items()})
>>> UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in
>>> position 220: ordinal not in range(128)
>>>
>>> --
>>> -- To post, send email to [email protected]. To unsubscribe,
>>> send email to [email protected]. For more information,
>>> visit https://groups.google.com/d/forum/archesproject?hl=en
>>> ---
>>> You received this message because you are subscribed to the Google
>>> Groups "Arches Project" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
