The problem seems to be that the file is opened a second time, in non-binary
(text) mode, here:

https://bitbucket.org/nagylzs/generateds/src/03fdb6673d8d606ad9eead6a27629021ad006ddc/process_includes.py#lines-351
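
A minimal sketch of the difference, assuming Python 3 and using the standard
library's xml.etree.ElementTree as a stand-in for lxml. The schema content,
the temporary file and the variable names are made up for illustration; cp1250
is passed explicitly only to mimic what a default text-mode open() picks up on
my Windows machine:

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical schema fragment. U+00C1 is encoded as bytes 0xC3 0x81 in UTF-8,
# and 0x81 is undefined in cp1250, as in the traceback quoted below.
XSD = (
    u'<?xml version="1.0" encoding="UTF-8"?>\n'
    u'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">\n'
    u'  <xs:annotation><xs:documentation>\u00c1r</xs:documentation>'
    u'</xs:annotation>\n'
    u'</xs:schema>\n'
)

fd, path = tempfile.mkstemp(suffix=".xsd")
with os.fdopen(fd, "wb") as f:
    f.write(XSD.encode("utf-8"))

# Text mode: the bytes are decoded with the code page before any XML parser
# sees the encoding declaration, so the read itself fails.
try:
    with open(path, "r", encoding="cp1250") as infile:
        infile.read()
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

# Binary mode: the parser receives raw bytes and honours the XML declaration.
with open(path, "rb") as infile:
    raw = infile.read()
root = ET.fromstring(raw)
doc_text = root.find(
    ".//{http://www.w3.org/2001/XMLSchema}documentation").text

os.remove(path)
```

So every place that reads the schema has to use "rb", not only the first one.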

I have submitted a PR that solves this, and generateDS.py finally generates
a Python module for the XSD.

But something went wrong with line endings. :-(

Best,

   Laszlo


2018-06-22 13:37 GMT+02:00 Les <nagy...@gmail.com>:

>
>   Hi Dave!
>
> Here is what I did:
>
>
>    - hg clone https://nagy...@bitbucket.org/dkuhlman/generateds
>    - Then I overwrote process_includes.py and generateDS.py with
>    the files you provided.
>
> Finally, I tried this (without installing generateDS into the site-wide
> site-packages or a virtual env):
>
> pipenv --three
> pipenv install six
> pipenv install lxml
> pipenv shell
>
> Then I got this error:
>
> c:\Python\Lib\generateds>python generateDS.py -o c:\temp\invoiceApi.py c:\temp\invoiceApi.xsd
> Traceback (most recent call last):
>   File "generateDS.py", line 7480, in <module>
>     main()
>   File "generateDS.py", line 7462, in main
>     superModule=superModule)
>   File "generateDS.py", line 6932, in parseAndGenerate
>     no_redefine_groups=noRedefineGroups,
>   File "c:\Python\Lib\generateds\process_includes.py", line 88, in process_include_files
>     doc, ns_dict = prep_schema_doc(infile, outfile, inpath, options)
>   File "c:\Python\Lib\generateds\process_includes.py", line 332, in prep_schema_doc
>     collect_inserts(root1, params, inserts, ns_dict, options)
>   File "c:\Python\Lib\generateds\process_includes.py", line 240, in collect_inserts
>     child, params, inserts, ns_dict, options)
>   File "c:\Python\Lib\generateds\process_includes.py", line 248, in collect_inserts_aux
>     string_content = resolve_ref(child, params, options)
>   File "c:\Python\Lib\generateds\process_includes.py", line 214, in resolve_ref
>     unencoded_content = infile.read()
>   File "C:\Users\User\.virtualenvs\generateds-zuqM3JFo\lib\encodings\cp1250.py", line 23, in decode
>     return codecs.charmap_decode(input,self.errors,decoding_table)[0]
> UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 4483: character maps to <undefined>
>
> Would you allow me to fork the project on bitbucket, fix the problem and
> send a pull request?
>
> Thanks,
>
>    Laszlo
>
>
>
>
>
> 2018-06-20 23:34 GMT+02:00 Dave Kuhlman <dkuhl...@davekuhlman.org>:
>
>> Laszlo,
>>
>> I'm trying to use your suggestions *and* make minimal changes so
>> that I have the best chance of not making things worse.
>>
>> I've made the following changes:
>>
>> 1. Now we always open the XML schema files (*.xsd) in binary mode.
>>    This should fix the exception that you reported.
>>
>> 2. Encoding output XML instance document -- Use of
>>    --external-encoding is optional:
>>
>>    (a) for Python 3, we do not use the value of --external-encoding
>>        and do not encode.
>>
>>    (b) for Python 2, we encode with the value of --external-encoding
>>        only if it was used (entered on the generateDS.py command line),
>>        else we use "utf-8".
>>
>> Do the above changes fix your issues?  Please let me know if we are
>> getting closer to the fixes that you need.
>>
>> I have attached a diff file and the two files that I modified.  The
>> diff file is not intended to be used to patch your code; I only send
>> it so that you can see the exact changes, if you wish.
>>
>> Thank you for your suggestion to open the input XML schema files in
>> binary mode.  I hope that's the right thing to do.  I don't believe
>> that I know how to reproduce your exception or to test that change.
>> The tests that I have run show that this change causes no harm.
>>
>> I could not use your suggestion to open the output XML instance
>> document in binary-write mode because we are writing to sys.stdout
>> (which I cannot open).
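
On the sys.stdout point: a minimal sketch, assuming Python 3, where the
standard text streams expose their underlying byte stream as the .buffer
attribute. The helper name write_encoded is made up for illustration and is
not part of generateDS. If sys.stdout has been replaced by an object without
.buffer (for example an io.StringIO), this will not work as written:

```python
import io
import sys


def write_encoded(text, encoding="utf-8", stream=None):
    """Encode text and write the bytes to stream (default: sys.stdout)."""
    stream = sys.stdout if stream is None else stream
    data = text.encode(encoding)
    # Python 3 text streams wrap a byte stream available as .buffer;
    # fall back to the stream itself when there is no such attribute.
    target = getattr(stream, "buffer", stream)
    target.write(data)
    return len(data)


# Demonstration against an in-memory stream instead of the real stdout.
raw = io.BytesIO()
wrapper = io.TextIOWrapper(raw, encoding="ascii")
written = write_encoded(u"<name>\u00c1r</name>\n", stream=wrapper)
```

The wrapper above is declared as ASCII on purpose: the bytes go around it, so
its own encoding does not matter.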
>>
>> And, thank you for your help with these issues.
>>
>> Dave
>>
>> On Wed, Jun 20, 2018 at 07:37:29AM +0200, Les wrote:
>> > > *#2 output encoding*
>> > > >
>> > > > On Python 3, we should not have an option to specify the output
>> > > > encoding.
>> > >
>> > > The generated file, when run under Python 3, ignores the
>> > > ExternalEncoding global variable.  Here is an example of the code
>> > > that I copied from a module generated by generateDS.py:
>> > >
>> > >         def gds_encode(instring):
>> > >                 if sys.version_info.major == 2:
>> > >                         return instring.encode(ExternalEncoding)
>> > >                 else:
>> > >                         return instring
>> > >
>> > > And, that is the only place in the generated module that the value
>> > > given to the --external-encoding command line option has any effect.
>> > >
>> > > Or is there something else I'm missing?
>> > >
>> >
>> > I did not dive into that code, but here is what I think. Under Python 3,
>> > you have two options:
>> >
>> > #1. Open your output file as a binary file (e.g. open(filepath, "wb+"))
>> > and write a binary string into it. In this case, you have to encode the
>> > output source code into a binary string manually.
>> > #2. Open your output file with a specific encoding, e.g. open(filepath,
>> > "w", encoding="UTF-8"), and then you can write normal (non-binary)
>> > strings into it. If you omit the encoding parameter of the open()
>> > function, it will default to the system default encoding, which is bad
>> > because Python 3 reads source files as UTF-8 by default.
>> >
>> > It seems that you are using a mixture of these. Your gds_encode function
>> > returns an encoded string under Python 2, but it returns a normal
>> > (non-binary) string under Python 3. Consequently, the output file
>> > should be opened as a binary file under Python 2 (with "wb" or "wb+"
>> > mode), and it should be opened as a non-binary file, open(filepath, "w+",
>> > encoding=ExternalEncoding), under Python 3, given that ExternalEncoding
>> > is always "UTF-8". I did not check the source code, but I think this is
>> > how it should be.
>> >
>> > It might be better to always open the file in binary mode, and let your
>> > gds_encode function return a (UTF-8 encoded) binary string under
>> > Python 3 as well. Alternatively, open the file in text mode with an
>> > explicit encoding; then you would not need the gds_encode function at
>> > all, just this:
>> >
>> > # ExternalEncoding must be UTF-8 for Python 3
>> > fout = open(filepath, "w+", encoding=ExternalEncoding)
>> > # instring is not binary under Python 3 and not encoded under Python 2
>> > fout.write(instring)
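
The two options can be sketched side by side. This is only an illustration
under some assumptions: it uses io.open, which accepts the encoding parameter
on both Python 2 and Python 3 (the built-in open() does so only on Python 3),
and the file names and the sample text are made up. newline="\n" is passed in
text mode so that "\n" is not translated to "\r\n" on Windows:

```python
import io
import os
import shutil
import tempfile

# Hypothetical piece of generated source code containing non-ASCII text.
text = u"name = u'\u00c1rv\u00edzt\u0171r\u0151'\n"
tmpdir = tempfile.mkdtemp()

# Option 1: binary mode, encode manually before writing.
path1 = os.path.join(tmpdir, "option1.py")
with io.open(path1, "wb") as fout:
    fout.write(text.encode("utf-8"))

# Option 2: text mode with an explicit encoding, write the text as it is.
path2 = os.path.join(tmpdir, "option2.py")
with io.open(path2, "w", encoding="utf-8", newline="\n") as fout:
    fout.write(text)

# Both files should now contain the same bytes.
with io.open(path1, "rb") as f1:
    bytes1 = f1.read()
with io.open(path2, "rb") as f2:
    bytes2 = f2.read()

shutil.rmtree(tmpdir)
```

Either way works as long as it is used consistently; mixing them (encoded
bytes into a text-mode file, or text into a binary-mode file) is what fails.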
>> >
>> > Best,
>> >
>> >   Laszlo
>> --
>>
>> Dave Kuhlman
>> http://www.davekuhlman.org
>>
>
>
_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users
