Laszlo, I've pushed my changes to the Bitbucket repo, and I've merged your pull request. So, the Bitbucket repo should have the process_includes.py change at line 351 and all our previous changes.
Thanks again for your help with this. Please let me know if there are
still problems.

About "something went wrong with line endings" -- Is this still an
existing problem? Were you able to fix it?

Dave

On Fri, Jun 22, 2018 at 02:05:09PM +0200, Les wrote:
> Line endings fixed. The PR should be okay now.
>
> 2018-06-22 13:57 GMT+02:00 Les <nagy...@gmail.com>:
>
> The problem seems to be that the file is opened another time, in
> non-binary mode, here:
>
> https://bitbucket.org/nagylzs/generateds/src/03fdb6673d8d606ad9eead6a27629021ad006ddc/process_includes.py#lines-351
>
> I have submitted a PR that solves this and finally generates a Python
> module for the XSD.
>
> But something went wrong with line endings. :-(
>
> Best,
>
>   Laszlo
>
> 2018-06-22 13:37 GMT+02:00 Les <nagy...@gmail.com>:
>
> Hi Dave!
>
> Here is what I did:
>
> * hg clone https://nagy...@bitbucket.org/dkuhlman/generateds
> * Then I overwrote process_includes.py and generateDS.py with
>   the files you provided.
>
> Finally, I tried this (without installing it into Python's site-wide
> site-packages or a virtual env):
>
>   pipenv --three
>   pipenv install six
>   pipenv install lxml
>   pipenv shell
>
> Then I got this error:
>
> c:\Python\Lib\generateds>python generateDS.py -o c:\temp\invoiceApi.py c:\temp\invoiceApi.xsd
> Traceback (most recent call last):
>   File "generateDS.py", line 7480, in <module>
>     main()
>   File "generateDS.py", line 7462, in main
>     superModule=superModule)
>   File "generateDS.py", line 6932, in parseAndGenerate
>     no_redefine_groups=noRedefineGroups,
>   File "c:\Python\Lib\generateds\process_includes.py", line 88, in process_include_files
>     doc, ns_dict = prep_schema_doc(infile, outfile, inpath, options)
>   File "c:\Python\Lib\generateds\process_includes.py", line 332, in prep_schema_doc
>     collect_inserts(root1, params, inserts, ns_dict, options)
>   File "c:\Python\Lib\generateds\process_includes.py", line 240, in collect_inserts
>     child, params, inserts, ns_dict,
>     options)
>   File "c:\Python\Lib\generateds\process_includes.py", line 248, in collect_inserts_aux
>     string_content = resolve_ref(child, params, options)
>   File "c:\Python\Lib\generateds\process_includes.py", line 214, in resolve_ref
>     unencoded_content = infile.read()
>   File "C:\Users\User\.virtualenvs\generateds-zuqM3JFo\lib\encodings\cp1250.py", line 23, in decode
>     return codecs.charmap_decode(input,self.errors,decoding_table)[0]
> UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 4483: character maps to <undefined>
>
> Would you allow me to fork the project on Bitbucket, fix the problem,
> and send a pull request?
>
> Thanks,
>
>   Laszlo
>
> 2018-06-20 23:34 GMT+02:00 Dave Kuhlman <dkuhl...@davekuhlman.org>:
>
> Laszlo,
>
> I'm trying to use your suggestions *and* make minimal changes so
> that I have the best chance of not making things worse.
>
> I've made the following changes:
>
> 1. Now we always open the XML schema files (*.xsd) in binary mode.
>    This should fix the exception that you reported.
>
> 2. Encoding output XML instance document -- Use of
>    --external-encoding is optional:
>
>    (a) for Python 3, we do not use the value of --external-encoding
>        and do not encode.
>
>    (b) for Python 2, we encode with the value of --external-encoding
>        only if it was used (entered on the generateDS.py command
>        line), else we use "utf-8".
>
> Do the above changes fix your issues? Please let me know if we are
> getting closer to the fixes that you need.
>
> I have attached a diff file and the two files that I modified. The
> diff file is not intended to be used to patch your code; I only send
> it so that you can see the exact changes, if you wish.
>
> Thank you for your suggestion to open the input XML schema files in
> binary mode. I hope that's the right thing to do. I don't believe
> that I know how to reproduce your exception or to test that change.
> The tests that I have run show that this change causes no harm.
>
> I could not use your suggestion to open the output XML instance
> document in binary-write mode because we are writing to sys.stdout
> (which I cannot open).
>
> And, thank you for your help with these issues.
>
> Dave
>
> On Wed, Jun 20, 2018 at 07:37:29AM +0200, Les wrote:
> > > *#2 output encoding*
> > >
> > > > On Python 3, we should not have an option to specify the
> > > > output encoding.
> > >
> > > The generated file, when run under Python 3, ignores the
> > > ExternalEncoding global variable. Here is an example of the code
> > > that I copied from a module generated by generateDS.py:
> > >
> > >     def gds_encode(instring):
> > >         if sys.version_info.major == 2:
> > >             return instring.encode(ExternalEncoding)
> > >         else:
> > >             return instring
> > >
> > > And, that is the only place in the generated module that the value
> > > given to the --external-encoding command line option has any effect.
> > >
> > > Or is there something else I'm missing?
> >
> > I did not dive into that code, but here is what I think. Under Python 3,
> > you have two options:
> >
> > #1. Open your output file as a binary file (e.g. open(filepath, "wb+")) and
> > write a binary string into it. In this case, you have to encode the output
> > source code into a binary string manually.
> > #2. Open your output file with a specific encoding, e.g. open("filepath",
> > "w", encoding="UTF-8"), and then you can write normal (non-binary) strings
> > into it. If you omit the encoding parameter of the open() function, then it
> > will default to the system default encoding - which is bad, because Python
> > 3 source files should always be encoded in UTF-8.
> >
> > It seems that you are using a mixture of these. Your gds_encode function
> > returns an encoded string under Python 2, but it returns a normal
> > (non-binary) string under Python 3.
> > Consequently, the output file should be
> > opened as a binary file under Python 2 (with "wb" or "wb+" mode), and it
> > should be opened as a non-binary file, open(filepath, "w+",
> > encoding=ExternalEncoding), under Python 3, given that ExternalEncoding is
> > always "UTF-8". I did not check the source code, but I think this is how it
> > should be.
> >
> > It might be better to always open the file in binary mode, and let your
> > gds_encode function return a (UTF-8 encoded) binary string under Python 3.
> > Then actually you would not need the gds_encode method at all, just this:
> >
> > fout = open(filepath, "w+", encoding=ExternalEncoding)  # Where ExternalEncoding must be UTF-8 for Python 3
> > fout.write(instring)  # Where instring is not binary under Python 3 and it is not encoded in Python 2
> >
> > Best,
> >
> >   Laszlo
>
> --
> Dave Kuhlman
> http://www.davekuhlman.org

--

Dave Kuhlman
http://www.davekuhlman.org

_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users
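
For readers following this thread, here is a minimal sketch of the two ideas discussed above: reading a schema file in binary mode so that no locale codec (cp1250 in the traceback) is applied, and writing text with an explicit encoding rather than the platform default. This is not the code in process_includes.py or generateDS.py. The helper names read_schema_bytes, write_text_utf8, and fails_under_cp1250, and the sample file sample.xsd, are hypothetical and exist only for illustration.

```python
import io
import os
import tempfile


def read_schema_bytes(path):
    """Return the raw bytes of a file; binary mode applies no locale codec."""
    with open(path, "rb") as infile:
        return infile.read()


def write_text_utf8(path, text):
    """Write text with an explicit UTF-8 encoding.

    io.open accepts the encoding argument on both Python 2 and Python 3.
    """
    with io.open(path, "w", encoding="utf-8") as outfile:
        outfile.write(text)


def fails_under_cp1250(data):
    """Report whether decoding the bytes as cp1250 raises UnicodeDecodeError."""
    try:
        data.decode("cp1250")
    except UnicodeDecodeError:
        return True
    return False


if __name__ == "__main__":
    # Hypothetical sample: U+0141 encodes to the UTF-8 bytes C5 81, and
    # byte 0x81 has no mapping in cp1250, as in the reported exception.
    sample = u"<xs:documentation>\u0141</xs:documentation>"
    path = os.path.join(tempfile.mkdtemp(), "sample.xsd")
    write_text_utf8(path, sample)
    raw = read_schema_bytes(path)
    print(fails_under_cp1250(raw))
    print(raw.decode("utf-8") == sample)
```

Reading bytes defers the choice of codec to the XML parser or to an explicit decode call, so the result does not depend on the machine's locale. Note that text-mode writes may still translate newlines on Windows unless the newline argument is set.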