Eugene,
I apologize for taking so long. And, I do not have fixes for all
the issues that you report.
But, I think I've made some progress.
A few notes are below.
> One more issue I've found is that in documentation it is written that
> default parameter for export is --export="write literal" but in reality it
> is --export="write".
I've updated the doc. Thanks for reporting it.
> 2) When I try to use --one-file-per-xsd argument, I get the following error:
> *** maxLoops exceeded. Something is wrong with --one-file-per-xsd.
I believe that the problem occurs when we try to generate modules
from an incomplete schema. Suppose gDS attempts to generate a class
for element type A, which extends element type B, and the definition
of element type B is in a part of the schema that is not included
(with xs:include or xs:import). Then it cannot generate the class
for A, because it must first generate A's super-class, the class
for B, which is missing.
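
To illustrate that failure mode, here is a minimal sketch of "generate the base class first" ordering. It is not the code in generateDS.py; the function name order_classes, the max_loops parameter, and the dictionary format are assumptions made for this example only.

```python
def order_classes(bases, max_loops=100):
    """Return type names ordered so each base precedes its extensions.

    `bases` maps a type name to its base type name, or to None if the
    type extends nothing. Raises RuntimeError when no further progress
    is possible, for example because a base type is not defined at all.
    """
    generated = []
    pending = dict(bases)
    loops = 0
    while pending:
        loops += 1
        # A type is ready once its base (if any) has been generated.
        ready = [
            name for name, base in pending.items()
            if base is None or base in generated
        ]
        if not ready or loops > max_loops:
            raise RuntimeError(
                'maxLoops exceeded; unresolved types: {}'.format(
                    sorted(pending)))
        for name in ready:
            generated.append(name)
            del pending[name]
    return generated


# Complete schema: B is defined, so B is generated before A.
print(order_classes({'A': 'B', 'B': None}))

# Incomplete schema: B is never defined, so A can never become ready.
try:
    order_classes({'A': 'B'})
except RuntimeError as exc:
    print(exc)
```

With the definition of B missing, no pass through the loop can make progress, which is the situation described above.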
As an alternative strategy, I'm working on a replacement for the
--one-file-per-xsd capability. That option seems too inflexible to
me. So, what I've done is to implement two scripts to replace that
capability:
1. utils/collect_schema_locations.py -- Scans an XML Schema and
collects the top level xs:include and xs:import references. It
writes them out in a (JSON) file that can be used by
utils/batch_generate.py.
2. utils/batch_generate.py -- Reads the output file produced by
collect_schema_locations.py. For each reference in that file, it
runs generateDS.py to produce a Python module.
So, I believe that the function and intent of these two scripts is
pretty much the same as the capability provided by
--one-file-per-xsd, *but* these scripts are small and relatively
easy to understand and their function is not hard-wired into
generateDS.py. And, therefore, I'm hoping that they will be more
usable and will give us more flexibility. When they do *not* do
what we want, we will be more easily able to modify them.
I've now got these two scripts working. But I need to do more work on
them. In particular, I need to write some documentation. And, I need
to make them easier to use. Right now they are a bit hard to
work with even for me, and I'm the one who implemented them.
I've attached these two scripts. If you decide to try them, I'd
welcome your comments.
Here is how you might run them:
    $ ./collect_schema_locations.py -f \
        energistics/prodml/v2.0/xsd_schemas/DasAcquisition.xsd directives04.json
    $ mkdir OnePer3
    $ ./batch_generate.py --config=gds02.config directives04.json
And, if gds02.config contains the following:
    [generateds-batch]
    verbose = true
    command = ./generateDS.py
    flags = -f --member-specs=dict
    in-path = energistics/prodml/v2.0/xsd_schemas
    out-path = OnePer3
You would end up with the following files in subdirectory OnePer3/:
OnePer3/DtsInstrumentBox.py
OnePer3/FiberOpticalPath.py
OnePer3/ProdmlCommon.py
OnePer3/SubProdmlCommon.py
Here is what the directives file that produced the above modules
looks like:
    {
        "directives": [
            {
                "schema": "DtsInstrumentBox.xsd",
                "outfile": "DtsInstrumentBox.py",
                "outsubfile": "",
                "flags": ""
            },
            {
                "schema": "ProdmlCommon.xsd",
                "outfile": "ProdmlCommon.py",
                "outsubfile": "SubProdmlCommon.py",
                "flags": ""
            },
            {
                "schema": "FiberOpticalPath.xsd",
                "outfile": "FiberOpticalPath.py",
                "outsubfile": "",
                "flags": ""
            }
        ]
    }
Note that I manually added the line:
"outsubfile": "SubProdmlCommon.py",
OK. I admit. That process seems a bit complex. I'll work on that.
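
As a possible alternative to editing the directives file by hand, the same change could be scripted. The sketch below is only an illustration: the helper set_outsubfile is hypothetical and is not part of the attached scripts. It assumes the directives structure shown above and that the file contains no "//" comment lines, which json.loads() would reject.

```python
import json


def set_outsubfile(spec, schema, outsubfile):
    """Set "outsubfile" on the directive whose "schema" matches.

    Returns True if a matching directive was found and updated.
    """
    for directive in spec.get('directives', []):
        if directive.get('schema') == schema:
            directive['outsubfile'] = outsubfile
            return True
    return False


# Example with an in-memory directives structure. To work on a file,
# load it with json.load() and write it back with json.dump().
spec = json.loads('''
{"directives": [
  {"schema": "ProdmlCommon.xsd", "outfile": "ProdmlCommon.py",
   "outsubfile": "", "flags": ""}
]}
''')
set_outsubfile(spec, 'ProdmlCommon.xsd', 'SubProdmlCommon.py')
print(json.dumps(spec, indent=4))
```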
Note that if, while generating one of the modules using the above
procedure, a needed element type definition is missing (for
example, element type A extends element type B and the
definition of element type B is missing), then we'll still get the
error that you reported. This procedure only lets us narrow and
control the generation of these multiple modules, for example by
editing the directives file that is input to batch_generate.py.
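
One way to spot such a missing definition before running the generator is to scan a single schema file for extension bases that it does not define itself. The following is a rough sketch using only the standard library. The function missing_bases is hypothetical, it compares local names only, it treats prefix bindings as global to the file, and it does not follow xs:include or xs:import, so a reported name may simply live in another file.

```python
import io
import xml.etree.ElementTree as ET

XS_NS = 'http://www.w3.org/2001/XMLSchema'


def missing_bases(schema_file):
    """Return sorted base type names that are extended but not defined."""
    prefixes = {}
    defined = set()
    extended = []
    type_tags = ('{%s}complexType' % XS_NS, '{%s}simpleType' % XS_NS)
    events = ('start-ns', 'start')
    for event, item in ET.iterparse(schema_file, events=events):
        if event == 'start-ns':
            prefix, uri = item
            prefixes[prefix] = uri
        elif item.tag in type_tags:
            if item.get('name'):
                defined.add(item.get('name'))
        elif item.tag == '{%s}extension' % XS_NS:
            extended.append(item.get('base', ''))
    missing = set()
    for base in extended:
        prefix, _, local = base.rpartition(':')
        if prefixes.get(prefix) == XS_NS:
            continue  # built-in type such as xs:string
        if local and local not in defined:
            missing.add(local)
    return sorted(missing)


SCHEMA = b'''<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="A">
    <xs:complexContent>
      <xs:extension base="B"/>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="C">
    <xs:simpleContent>
      <xs:extension base="xs:string"/>
    </xs:simpleContent>
  </xs:complexType>
</xs:schema>'''

print(missing_bases(io.BytesIO(SCHEMA)))
```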
> 3) Namespace definition behavior -- by default, generateDS puts namespace
> definition in every export method of generated classes. That is, every
> element in an exported xml that has children would have namespace
> definition. But what if I only want namespace definition in top-level
> element? For example, I want this:
This one is my next task.
I'm thinking that perhaps we could add a command line option,
--no-namespace-defs. If you use that option, we never export the
namespace definitions by default. So, then at the top level you would
pass namespacedef_="xmlns:abc=xxx" and it would not be passed down to
child elements. I'll see if I can come up with an example for you
to review.
But, before we pursue that approach (a --no-namespace-defs command
line flag), we should ask what our (actually, the user's) needs and
goals are. I'm thinking that perhaps the user needs more fine-grained
control over which elements are generated with which
namespace definitions (xmlns:xx="yyy") and when. Consider the
following range of possible controls:
1. Enable the user to specify namespace definitions to be generated on
the export of *all* elements. This is the current capability.
gDS attempts to detect the needed namespace definition
automatically, and the user can also supply one with the
--namespacedef command line option.
2. Enable the user to request that no namespace definitions are
generated on the export of any elements. This might be done with
a new --no-namespace-defs command line option.
3. Enable the user to specify the namespace definitions to be
generated on each element type. This might be done by enabling
the user to provide a (JSON? XML?) table/dictionary that maps
element complexType names to namespace definitions (strings of
the form 'xmlns:xx="yyy" xmlns:zz="www" ...').
Perhaps we need both #2 and #3. No. 2 is quick and easy. No. 3
will take me a little longer, but should not be too complex or
difficult, even for me.
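
To make option 3 a little more concrete, here is a small sketch of what such a mapping table and its lookup might look like if JSON were chosen. The function names load_namespace_defs and namespacedef_for, the JSON layout, and the example.com URIs are all assumptions for illustration and do not describe existing generateDS.py behavior. The empty-string default corresponds to option 2.

```python
import json


def load_namespace_defs(text):
    """Parse a JSON object mapping complexType names to xmlns strings."""
    table = json.loads(text)
    if not isinstance(table, dict):
        raise ValueError('expected a JSON object at the top level')
    return table


def namespacedef_for(type_name, table, default=''):
    """Return the namespace definitions for a type, or the default."""
    return table.get(type_name, default)


TABLE_TEXT = '''
{
    "personType": "xmlns:xx=\\"http://example.com/yyy\\"",
    "addressType": "xmlns:xx=\\"http://example.com/yyy\\" xmlns:zz=\\"http://example.com/www\\""
}
'''

table = load_namespace_defs(TABLE_TEXT)
print(namespacedef_for('personType', table))
print(repr(namespacedef_for('unlistedType', table)))
```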
So, I'll do some more exploration and will report back later. If
you have comments or suggestions, I'll welcome them.
[later ...]
OK, here is what I've done:
1. There is now a new command line option for generateDS.py. When
--no-namespace-defs is used, the default value for the
namespacedef_ parameter for each `export` method will be "".
This means that namespace prefix definitions will be generated
only for the top level (outermost) element and only when
explicitly passed in to the call to ``export()``. Also note that
the `parse()` function generated near the bottom of each module
may already do this.
2. Implemented the capability to use a manually edited dictionary
that enables you to specify the namespace prefix definitions to
be exported with specific element types. OK, I realize that the
same element type can occur at different levels and that you
might want the namespace prefix definitions on upper ones but not
lower (enclosed) ones. Still, this capability gives you more
control than you have now.
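
The effect of item 1 can be pictured with a toy model. The class below is not generated code and does not reproduce the real export() signature; Node and its parameters are hypothetical. It only shows why an empty default for namespacedef_ keeps the definition on the outermost element.

```python
import io


class Node(object):
    """Toy stand-in for a generated class; not generateDS.py output."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

    def export(self, outfile, level=0, namespacedef_=''):
        indent = '    ' * level
        attrs = ' ' + namespacedef_ if namespacedef_ else ''
        if self.children:
            outfile.write('{}<{}{}>\n'.format(indent, self.name, attrs))
            for child in self.children:
                # namespacedef_ is not passed down, so each child falls
                # back to the empty default.
                child.export(outfile, level + 1)
            outfile.write('{}</{}>\n'.format(indent, self.name))
        else:
            outfile.write('{}<{}{}/>\n'.format(indent, self.name, attrs))


root = Node('people', [Node('person', [Node('name')])])
buf = io.StringIO()
root.export(buf, namespacedef_='xmlns:abc="http://example.com/abc"')
print(buf.getvalue())
```

Only the `people` start tag carries the xmlns:abc definition; the nested elements are written without it.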
Attached are:
- collect_schema_locations.py -- Collect xs:include and xs:import
references for batch generation.
- batch_generate.py -- Batch generation of modules.
- directives06.json -- Sample directives file for batch generation
of modules.
- gds02.config -- Sample configuration file for use with --config
option to batch_generate.py.
- generatedsnamespaces.py -- Sample module containing a dictionary
that specifies namespace prefix definitions to be attached to
specific element types during export.
Yet to be done:
- Add some documentation for the collect_schema_locations.py and
batch_generate.py scripts.
- Add documentation for the added namespace prefix definition
command line option and the prefix mapping dictionary module.
- Additional testing -- In particular, I suspect that
batch_generate.py does not do error reporting in a reasonable way.
Any comments or guidance that you might want to give are welcome.
Dave
On Wed, May 03, 2017 at 02:05:31PM +0300, Eugene Petkevich wrote:
> Hello Dave,
>
> Thank you for the quick answer.
>
> Regarding (2), here are the xsd files:
> https://www.dropbox.com/s/x5kljbv3gjsem1h/energistics.zip?dl=0 , and the
> file that didn't work is 'prodml/v2.0/xsd_schemas/DasAcquisition.xsd' in the
> archive.
>
> One more issue I've found is that in documentation it is written that
> default parameter for export is --export="write literal" but in reality it
> is --export="write".
>
> Regards,
> Eugene
>
> On 02.05.2017 01:45, Dave Kuhlman wrote:
> > Eugene,
> >
> > Hello. I'm glad generateDS.py has been helpful. Thanks for letting
> > me know.
> >
> > Here are a few comments:
> >
> > 1. The file gends_user_methods.py is in the source distribution.
> > You can find that here:
> > https://dkuhl...@bitbucket.org/dkuhlman/generateds
> >
> > The documentation is wrong on that. I'll fix it.
> >
> > 2. With respect to one-file-per-xsd -- In the test directory
> > (generateds/tests/ again in the source distribution) there is a
> > test that uses that option. Perhaps you can look at that for
> > clues. The files of interest are:
> >
> > generateds/tests/oneper00.xsd
> > generateds/tests/oneper02.xsd
> > generateds/tests/oneper01.xsd
> > generateds/tests/oneper03.xsd
> >
> > The unit test when run, generates output modules in subdirectory
> > tests/OnePer.
> >
> > The command used to run that test is in tests/test.py in method
> > test_022_one_per. Here is that command:
> >
> > def test_022_one_per(self):
> > cmdTempl = (
> > 'python generateDS.py --no-dates --no-versions '
> > '--silence --member-specs=list -f '
> > '--one-file-per-xsd --output-directory="tests/OnePer" '
> > '--module-suffix="One" '
> > '--super=%s2_sup '
> > 'tests/%s00.xsd'
> > )
> > t_ = 'oneper'
> > cmd = cmdTempl % (t_, t_, )
> > o
> > o
> > o
> >
> > More specifically, about the maxLoops message, that error means
> > that you have an element definition that extends another element
> > definition, but generateDS.py thinks it should not generate the
> > class for the extension because it has not yet generated the
> > class for the base/parent. I've had to work on this once before.
> > But, I don't know why that error is happening in your case.
> >
> > Do you have a schema that produces this error and that you could
> > send me? If you do, I'll take a look.
> >
> > 3. With respect to the namespace definition behavior and the
> > repeated namespace definitions -- I'll take a look to see how
> > this can be done.
> >
> > 6. About parsing from a file-like object -- Actually, if I
> > understand you correctly, this already works. You can pass a
> > file object that is open for reading to the generated parse
> > functions. The parameter name is misleading, I suppose. But,
> > lxml.etree.parse does accept either a string file name or a file
> > object.
> >
> > More tomorrow when I have a bit more time.
> >
> > Thanks for the detailed report.
> >
> > Dave
> >
> >
> > On Mon, May 01, 2017 at 11:18:00AM +0300, Eugene Petkevich wrote:
> > > Hello,
> > >
> > > Thank you for the GenerateDS library. I find it very useful. I have a
> > > couple of things to ask:
[snip]
--
Dave Kuhlman
http://www.davekuhlman.org
#!/usr/bin/env python
"""
usage: batch_generate.py [-h] [--config CONFIG] [-c COMMAND] [--flags FLAGS]
                         [--in-path IN_PATH] [--out-path OUT_PATH] [-v]
                         infilename

synopsis:
  read input directives from JSON file (produced by
  collect_schema_locations.py) and generate python modules.

positional arguments:
  infilename            input JSON file containing directives

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       configuration file
  -c COMMAND, --command COMMAND
                        command. Default is "generateDS.py"
  --flags FLAGS         command line options for generateDS.py
  --in-path IN_PATH     path to the directory containing the input schemas
  --out-path OUT_PATH   path to a directory into which modules should be
                        generated
  -v, --verbose         Print messages during actions.

examples:
  python batch_generate.py input_directives.json
  python batch_generate.py --config=name.config input_directives.json

notes:
  The configuration file (see --config) has the following form:

    [generateds-batch]
    verbose = true
    command = ./generateDS.py
    flags = -f --member-specs=dict
    in-path = energistics/prodml/v2.0/xsd_schemas
    out-path = OnePer

  The option names in this configuration file are the long command line
  option names.  Options entered on the command line over-ride options
  in this config file.

  Flags/options for generateDS.py -- These flags are passed to
  generateDS.py as command line options.  Precedence: (1) Flags in
  the directives file override the --flags command line option to
  batch_generate.py.  (2) Flags in the --flags command line option
  to batch_generate.py override flags in the configuration file (the
  argument to --config=).
"""
#
# imports
from __future__ import print_function
import sys
import os
import argparse
import configparser
import subprocess
import json
#
# Global variables
#
# Private functions
def dbg_msg(options, msg):
    """Print a message if verbose is on."""
    if options.verbose:
        print(msg)
def generate_one(directive, options):
    """Generate modules for one XML schema."""
    schema_name = directive.get('schema')
    outfilename = directive.get('outfile')
    outsubfilename = directive.get('outsubfile')
    if options.in_path:
        schema_name = os.path.join(options.in_path, schema_name)
    modulename = outfilename.split('.')[0]
    if options.out_path:
        outfilename = os.path.join(options.out_path, outfilename)
    if outsubfilename:
        if options.out_path:
            outsubfilename = os.path.join(options.out_path, outsubfilename)
        outsubfilestem = outsubfilename
        outsubfilename = '--super={} -s {}'.format(modulename, outsubfilename)
    else:
        outsubfilename = ""
        outsubfilestem = outsubfilename
    flags = directive.get('flags')
    if not flags:
        flags = options.flags
    #flags = '{} {}'.format(flags, options.flags)
    cmd = '{} {} -o {} {} {}'.format(
        options.command,
        flags,
        outfilename,
        outsubfilename,
        schema_name,
    )
    dbg_msg(options, '\ncmd: {}'.format(cmd))
    result = subprocess.run(
        cmd,
        stderr=subprocess.PIPE,
        stdout=subprocess.PIPE,
        shell=True,
    )
    dbg_msg(options, 'generated: {} {}'.format(outfilename, outsubfilestem))
    if result.stderr:
        print('errors: {}'.format(result.stderr))
    if result.stdout:
        print('output: {}'.format(result.stdout))
def merge_options(options):
    """Merge config file options and command line options.

    Command line options over-ride config file options.
    """
    config = configparser.ConfigParser()
    config.read(options.config)
    if not config.has_section('generateds-batch'):
        raise RuntimeError(
            'config file missing required section "generateds-batch"')
    section = config['generateds-batch']
    for key in section:
        key1 = key.replace('-', '_')
        if not getattr(options, key1):
            if key1 == 'verbose':
                # Convert "true"/"false" text to a real boolean, so that
                # "verbose = false" is not treated as a truthy string.
                value = section.getboolean(key)
            else:
                value = section[key]
            setattr(options, key1, value)
def load_json_file(infile):
    """Read file.  Strip out lines that begin with '//'."""
    lines = []
    for line in infile:
        if not line.lstrip().startswith('//'):
            lines.append(line)
    content = ''.join(lines)
    return content
#
# Exported functions
def batch_generate(infile, options):
    """Generate module(s) for each line in directives file."""
    content = load_json_file(infile)
    specification = json.loads(content)
    directives = specification['directives']
    for directive in directives:
        generate_one(directive, options)
def main():
    description = """\
synopsis:
  read input directives from JSON file (produced by
  collect_schema_locations.py) and generate python modules.
"""
    epilog = """\
examples:
  python batch_generate.py input_directives.json
  python batch_generate.py --config=name.config input_directives.json

notes:
  The input directives file is a JSON file.  batch_generate.py will
  run generateDS.py once for each directive in this JSON file.

  The configuration file (see --config) has the following form:

    [generateds-batch]
    verbose = true
    command = ./generateDS.py
    flags = -f --member-specs=dict
    in-path = energistics/prodml/v2.0/xsd_schemas
    out-path = OnePer

  The option names in this configuration file are the long command line
  option names.  Options entered on the command line over-ride options
  in this config file.

  Flags/options for generateDS.py -- These flags are passed to
  generateDS.py as command line options.  Precedence: (1) Flags in
  the directives file override the --flags command line option to
  batch_generate.py.  (2) Flags in the --flags command line option
  to batch_generate.py override flags in the configuration file (the
  argument to --config=).

  The input directives file can contain comments.  Any line whose first
  non-whitespace characters are "//" is considered a comment and is
  discarded before the remaining contents are parsed by the JSON parser.
"""
    parser = argparse.ArgumentParser(
        description=description,
        epilog=epilog,
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "infilename",
        help="input JSON file containing directives",
    )
    parser.add_argument(
        "--config",
        help="configuration file",
    )
    parser.add_argument(
        "-c", "--command",
        #default="generateDS.py",
        help="command.  Default is \"generateDS.py\"",
    )
    parser.add_argument(
        "--flags",
        help="command line options for generateDS.py",
    )
    parser.add_argument(
        "--in-path",
        help="path to the directory containing the input schemas",
    )
    parser.add_argument(
        "--out-path",
        help="path to a directory into which modules should be generated",
    )
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="Print messages during actions.",
    )
    options = parser.parse_args()
    if options.config:
        merge_options(options)
    if not options.command:
        options.command = 'generateDS.py'
    if not options.flags:
        options.flags = ''
    dbg_msg(options, '\noptions: {}'.format(options))
    # Use a context manager so the directives file is always closed.
    with open(options.infilename, 'r') as infile:
        batch_generate(infile, options)
if __name__ == '__main__':
    #import pdb; pdb.set_trace()
    #import ipdb; ipdb.set_trace()
    main()
#!/usr/bin/env python
"""
usage: collect_schema_locations.py [-h] [-f] [-v] infilename [outfilename]

synopsis:
  collect schema locations from xs:include/xs:import elements in schema.

positional arguments:
  infilename     name/location of the XML schema file to be searched
  outfilename    output file name; if omitted stdout

optional arguments:
  -h, --help     show this help message and exit
  -f, --force    force overwrite existing output file
  -v, --verbose  print messages during actions.

examples:
  python collect_schema_locations.py myschema.xsd
  python collect_schema_locations.py myschema.xsd outfile.txt
"""
#
# imports
from __future__ import print_function
import sys
import os
import argparse
import json
from lxml import etree
#
# Global variables
#
# Private functions
def dbg_msg(options, msg):
    """Print a message if verbose is on."""
    if options.verbose:
        print(msg)
def extract_locations(infile, options):
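    """Collect schemaLocation values from xs:include and xs:import."""
    doc = etree.parse(infile)
    root = doc.getroot()
    # Bind the "xs" prefix explicitly so the query does not depend on the
    # prefixes (or a default namespace) declared in the schema itself.
    elements = root.xpath(
        './/xs:include | .//xs:import',
        namespaces={'xs': 'http://www.w3.org/2001/XMLSchema'},
    )
    locations = []
    for element in elements:
        schema_name = element.get('schemaLocation')
        # xs:import may omit schemaLocation; skip those references.
        if schema_name:
            locations.append(schema_name)
    return locations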
def generate(locations, outfile, options):
    directives = []
    for location in locations:
        schema_name = location
        outfilename = os.path.split(schema_name)[1]
        outfilename = os.path.splitext(outfilename)[0]
        outfilename = '{}.py'.format(outfilename)
        directive = {
            'schema': schema_name,
            'outfile': outfilename,
            'outsubfile': '',
            'flags': '',
        }
        directives.append(directive)
    return directives
def make_output_file(outfilename, options):
    if os.path.exists(outfilename) and not options.force:
        sys.exit("\noutput file exists.  Use -f/--force to over-write.\n")
    outfile = open(outfilename, 'w')
    return outfile
#
# Exported functions
def extract_and_generate(infile, outfile, options):
    locations = extract_locations(infile, options)
    directives = generate(locations, outfile, options)
    specification = {
        'directives': directives,
    }
    json.dump(specification, outfile, indent=' ')
def main():
    description = """\
synopsis:
  collect schema locations from xs:include/xs:import elements in schema.
"""
    epilog = """\
examples:
  python collect_schema_locations.py myschema.xsd
  python collect_schema_locations.py myschema.xsd outfile.txt

notes:
  The output directives file is a JSON file suitable for input to
  batch_generate.py.  This directives file contains one directive
  for each module to be generated by generateDS.py.

  You can edit this resulting JSON file, for example to add a sub-class
  module file or flags/options for generateDS.py that are specific
  to each directive.
"""
    parser = argparse.ArgumentParser(
        description=description,
        epilog=epilog,
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "infilename",
        help="name/location of the XML schema file to be searched"
    )
    parser.add_argument(
        "outfilename",
        nargs="?",
        default=None,
        help="output file name; if omitted stdout"
    )
    parser.add_argument(
        "-f", "--force",
        action="store_true",
        help="force overwrite existing output file",
    )
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="print messages during actions.",
    )
    options = parser.parse_args()
    infile = open(options.infilename, 'r')
    if options.outfilename:
        outfile = make_output_file(options.outfilename, options)
    else:
        outfile = sys.stdout
    extract_and_generate(infile, outfile, options)
    infile.close()
    if options.outfilename:
        outfile.close()
if __name__ == '__main__':
    #import pdb; pdb.set_trace()
    #import ipdb; ipdb.set_trace()
    main()
[generateds-batch]
verbose = true
command = ./generateDS.py
flags = -f --member-specs=dict
in-path = energistics/prodml/v2.0/xsd_schemas
out-path = OnePer3
{
    "directives": [
        {
            "schema": "DtsInstrumentBox.xsd",
            "outfile": "DtsInstrumentBox.py",
            "outsubfile": "",
            "flags": ""
        },
        {
            "schema": "ProdmlCommon.xsd",
            "outfile": "ProdmlCommon.py",
            "outsubfile": "",
            "flags": "-f --member-specs=list --no-namespace-defs"
        },
        {
            "schema": "FiberOpticalPath.xsd",
            "outfile": "FiberOpticalPath.py",
            "outsubfile": "",
            "flags": ""
        }
    ]
}
GenerateDSNamespaceDefs = {
    'specialperson': 'xmlns:aa="http://www.xxx.com/namespacespecial"',
    'python-programmerType':
        'xmlns:aa="http://www.xxx.com/namespacepyprog"',
}
_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users