Eugene,

I apologize for taking so long.  And, I do not have fixes for all
the issues that you report.

But, I think I've made some progress.

A few notes are below.

> One more issue I've found is that in documentation it is written that
> default parameter for export is --export="write literal" but in reality it
> is --export="write".

I've updated the doc.  Thanks for reporting it.

> 2) When I try to use --one-file-per-xsd argument, I get the following error:
> *** maxLoops exceeded.  Something is wrong with --one-file-per-xsd.

I believe that the problem occurs when we try to generate modules
from an incomplete schema.  Possibly, the cause is this: element
type A extends element type B, and the definition of element type B
is in a part of the schema that is not included (with xs:include or
xs:import).  In that case gDS cannot generate the class for A,
because it must first generate A's super-class, which is the class
for B, and B is missing.
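
To illustrate the failure mode described above, here is a minimal
sketch.  It is not the code in generateDS.py; the function name
order_classes, the max_loops parameter, and the shape of the input
dictionary are all made up for the illustration.  It only shows why a
"generate the base class first" loop can never finish when a base
type is missing.

```python
def order_classes(bases, max_loops=10):
    """Return type names ordered so each base precedes its extensions.

    `bases` maps each element type name to the name of the type it
    extends, or to None if it extends nothing.  (Hypothetical input
    shape, chosen only for this illustration.)
    """
    generated = []
    pending = dict(bases)
    loops = 0
    while pending:
        loops += 1
        if loops > max_loops:
            # Nothing more can be resolved: some base is missing.
            raise RuntimeError(
                'maxLoops exceeded; unresolved: {}'.format(sorted(pending)))
        for name in sorted(pending):
            base = pending[name]
            # A type can be generated once its base has been generated.
            if base is None or base in generated:
                generated.append(name)
                del pending[name]
    return generated


# Complete schema: B is defined, so A can follow it.
print(order_classes({'A': 'B', 'B': None}))

# Incomplete schema: B is never defined, so A stays pending forever.
try:
    order_classes({'A': 'B'})
except RuntimeError as exc:
    print(exc)
```

With the complete input the loop finishes in two passes.  With the
incomplete input, A is still pending when the loop limit is reached.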

As an alternative strategy, I'm working on a replacement for the
--one-file-per-xsd capability.  That option seems too inflexible to
me.  So, what I've done is to implement two scripts to replace that
capability:

1. utils/collect_schema_locations.py -- Scans an XML Schema and
   collects the top level xs:include and xs:import references.  It
   writes them out in a (JSON) file that can be used by
   utils/batch_generate.py.

2. utils/batch_generate.py -- Reads the output file produced by
   collect_schema_locations.py.  For each reference in that file, it
   runs generateDS.py to produce a Python module.

So, I believe that the function and intent of these two scripts is
pretty much the same as the capability provided by
--one-file-per-xsd, *but* these scripts are small and relatively
easy to understand and their function is not hard-wired into
generateDS.py.  And, therefore, I'm hoping that they will be more
usable and will give us more flexibility.  When they do *not* do
what we want, we will be more easily able to modify them.

I've now got these two scripts working.  But I need to do more work on
them.  In particular I need to write some documentation.  And, I need
to make them easier to use.  Right now they are a bit hard to
work with even for me, and I'm the one who implemented them.

I've attached these two scripts.  If you decide to try them, I'd
welcome your comments.

Here is how you might run them:

$ ./collect_schema_locations.py -f \
    energistics/prodml/v2.0/xsd_schemas/DasAcquisition.xsd directives04.json
$ mkdir OnePer3
$ ./batch_generate.py --config=gds02.config directives04.json

And, if gds02.config contains the following:

    [generateds-batch]
    verbose = true
    command = ./generateDS.py
    flags = -f --member-specs=dict
    in-path = energistics/prodml/v2.0/xsd_schemas
    out-path = OnePer3

You would end up with the following files in subdirectory OnePer3/:

    OnePer3/DtsInstrumentBox.py
    OnePer3/FiberOpticalPath.py
    OnePer3/ProdmlCommon.py
    OnePer3/SubProdmlCommon.py

Here is what the directives file that produced the above modules
looks like:

    {
        "directives": [
            {
                "schema": "DtsInstrumentBox.xsd",
                "outfile": "DtsInstrumentBox.py",
                "outsubfile": "",
                "flags": ""
            },
            {
                "schema": "ProdmlCommon.xsd",
                "outfile": "ProdmlCommon.py",
                "outsubfile": "SubProdmlCommon.py",
                "flags": ""
            },
            {
                "schema": "FiberOpticalPath.xsd",
                "outfile": "FiberOpticalPath.py",
                "outsubfile": "",
                "flags": ""
            }
        ]
    }

Note that I manually added the line:

    "outsubfile": "SubProdmlCommon.py",

OK.  I admit.  That process seems a bit complex.  I'll work on that.
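
One way to avoid the manual edit might be a few lines of Python that
patch the directives after collect_schema_locations.py has written
them.  This is only a sketch: the helper name set_outsubfile is
hypothetical and is not part of either attached script, and the
directives are abbreviated and embedded as a string so that the
example runs on its own.

```python
import json

# Directives as written by collect_schema_locations.py (abbreviated).
DIRECTIVES_JSON = """
{
    "directives": [
        {"schema": "DtsInstrumentBox.xsd", "outfile": "DtsInstrumentBox.py",
         "outsubfile": "", "flags": ""},
        {"schema": "ProdmlCommon.xsd", "outfile": "ProdmlCommon.py",
         "outsubfile": "", "flags": ""}
    ]
}
"""


def set_outsubfile(spec, schema, outsubfile):
    """Set "outsubfile" on the directive for `schema`.

    Return True if a matching directive was found, else False.
    (Hypothetical helper, not part of the attached scripts.)
    """
    for directive in spec['directives']:
        if directive['schema'] == schema:
            directive['outsubfile'] = outsubfile
            return True
    return False


spec = json.loads(DIRECTIVES_JSON)
found = set_outsubfile(spec, 'ProdmlCommon.xsd', 'SubProdmlCommon.py')
# The patched specification can be written back out for batch_generate.py.
print(json.dumps(spec, indent=4))
```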

Note that if, while generating one of the modules with the above
procedure, a needed element type definition is missing (for
example, element type A extends element type B and the definition
of element type B is missing), then we'll still get the error that
you reported.  This procedure only lets us narrow and control the
generation of these multiple modules, for example by editing the
directives file that is input to batch_generate.py.

> 3) Namespace definition behavior -- by default, generateDS puts namespace
> definition in every export method of generated classes. That is, every
> element in an exported xml that has children would have namespace
> definition.  But what if I only want namespace definition in top-level
> element?  For example, I want this:

This one is my next task.

I'm thinking that perhaps we could add a command line option,
--no-namespace-defs.  If you use that option, we never export the
namespace definitions.  So, then at the top level you would add the
namespacedef_="xmlns:abc=xxx" and it would not be passed down to
child elements.  I'll see if I can come up with an example for you
to review.

But, before we pursue that approach (a --no-namespace-defs command
line flag), we should ask what our (actually, the user's) needs and
goals are.  I'm thinking that perhaps the user needs more
fine-grained control over which elements are generated with which
namespace definitions (xmlns:xx="yyy") and when.  Consider the
following range of possible controls:

1. Enable the user to specify namespace definitions to be generated
   on the export of *all* elements.  This is the current capability.
   gDS attempts to automatically detect the needed namespace
   definition; it can also be supplied with the --namespacedef
   command line option.

2. Enable the user to request that no namespace definitions are
   generated on the export of any elements.  This might be done with
   a new --no-namespace-defs command line option.

3. Enable the user to specify the namespace definitions to be
   generated on each element type.  This might be done by enabling
   the user to provide a (JSON? XML?) table/dictionary that maps
   element complexType names to namespace definitions (strings of
   the form 'xmlns:xx="yyy" xmlns:zz="www" ...').

Perhaps we need both #2 and #3.  No. 2 is quick and easy.  No. 3
will take me a little longer, but should not be too complex or
difficult, even for me.
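
The three controls listed above could be combined roughly as in the
following sketch.  It is an illustration only, not code from
generateDS.py: the function choose_namespacedef and its parameters
are hypothetical, and the precedence (a per-type entry wins over the
"no definitions" switch, which wins over the default) is my
assumption.

```python
def choose_namespacedef(type_name, default_def, no_namespace_defs=False,
                        per_type_defs=None):
    """Pick the namespace definition string to export for one element.

    type_name         -- complexType name of the element being exported
    default_def       -- definition used today for all elements (control 1)
    no_namespace_defs -- True to suppress definitions (control 2)
    per_type_defs     -- optional dict mapping complexType names to
                         definition strings (control 3)
    """
    # Control 3: an explicit per-type entry takes precedence (assumed).
    if per_type_defs and type_name in per_type_defs:
        return per_type_defs[type_name]
    # Control 2: otherwise emit nothing when definitions are switched off.
    if no_namespace_defs:
        return ''
    # Control 1: fall back to the definition used for all elements.
    return default_def


table = {'specialperson': 'xmlns:aa="http://www.xxx.com/namespacespecial"'}
default = 'xmlns:abc="http://www.xxx.com/abc"'

print(choose_namespacedef('person', default))
print(repr(choose_namespacedef('person', default, no_namespace_defs=True)))
print(choose_namespacedef('specialperson', default,
                          no_namespace_defs=True, per_type_defs=table))
```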

So, I'll do some more exploration and will report back later.  If
you have comments or suggestions, I'll welcome them.

[later ...]

OK, here is what I've done:

1. There is now a new command line option for generateDS.py.  When
   --no-namespace-defs is used, the default value for the
   namespacedef_ parameter for each `export` method will be "".
   This means that namespace prefix definitions will be generated
   only for the top level (outermost) element and only when
   explicitly passed in to the call to ``export()``.  Also note that
   the `parse()` function generated near the bottom of each module
   may already do this.

2. Implemented the capability to use a manually edited dictionary
   that enables you to specify the namespace prefix definitions to
   be exported with specific element types.  OK, I realize that the
   same element type can occur at different levels and that you
   might want the namespace prefix definitions on upper ones but not
   lower (enclosed) ones.  Still, this capability gives you more
   control than you have now.
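
To make the intended effect of item 1 concrete, here is a toy
stand-in for a generated class.  It is not generated code: the class
Node is hypothetical, and its export() returns a list of lines rather
than writing to a file.  The only thing it borrows from the generated
modules is the namespacedef_ parameter name.  It shows a definition
passed to the outermost export() call and not handed down to the
children.

```python
class Node(object):
    """Toy element with a tag and child elements (illustration only)."""

    def __init__(self, tag, children=()):
        self.tag = tag
        self.children = list(children)

    def export(self, namespacedef_='', level=0):
        """Return the element as a list of indented XML lines.

        The default for namespacedef_ is "", so an element emits a
        namespace definition only when the caller passes one in.
        """
        indent = '    ' * level
        attrs = ' ' + namespacedef_ if namespacedef_ else ''
        lines = ['{}<{}{}>'.format(indent, self.tag, attrs)]
        for child in self.children:
            # namespacedef_ is deliberately not passed down.
            lines.extend(child.export(level=level + 1))
        lines.append('{}</{}>'.format(indent, self.tag))
        return lines


root = Node('abc:people', [Node('abc:person'), Node('abc:person')])
lines = root.export(namespacedef_='xmlns:abc="http://www.xxx.com/abc"')
print('\n'.join(lines))
```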

Attached are:

- collect_schema_locations.py -- Collect xs:include and xs:import
  references for batch generation.

- batch_generate.py -- Batch generation of modules.

- directives06.json -- Sample directives file for batch generation
  of modules.

- gds02.config -- Sample configuration file for use with --config
  option to batch_generate.py.

- generatedsnamespaces.py -- Sample module containing a dictionary
  that specifies namespace prefix definitions to be attached to
  specific element types during export.

Yet to be done:

- Add some documentation for the collect_schema_locations.py and
  batch_generate.py scripts.

- Add documentation for the added namespace prefix definition
  command line option and the prefix mapping dictionary module.

- Additional testing -- In particular, I suspect that
  batch_generate.py does not do error reporting in a reasonable way.

Any comments or guidance that you might want to give are welcome.

Dave

On Wed, May 03, 2017 at 02:05:31PM +0300, Eugene Petkevich wrote:
> Hello Dave,
> 
> Thank you for the quick answer.
> 
> Regarding (2), here are the xsd files:
> https://www.dropbox.com/s/x5kljbv3gjsem1h/energistics.zip?dl=0 , and the
> file that didn't work is 'prodml/v2.0/xsd_schemas/DasAcquisition.xsd' in the
> archive.
> 
> One more issue I've found is that in documentation it is written that
> default parameter for export is --export="write literal" but in reality it
> is --export="write".
> 
> Regards,
> Eugene
> 
> On 02.05.2017 01:45, Dave Kuhlman wrote:
> > Eugene,
> > 
> > Hello.  I'm glad generateDS.py has been helpful.  Thanks for letting
> > me know.
> > 
> > Here are a few comments:
> > 
> > 1. The file gends_user_methods.py is in the source distribution.
> >     You can find that here:
> >     https://dkuhl...@bitbucket.org/dkuhlman/generateds
> > 
> >     The documentation is wrong on that.  I'll fix it.
> > 
> > 2. With respect to --one-file-per-xsd -- In the test directory
> >     (generateds/tests/ again in the source distribution) there is a
> >     test that uses that option.  Perhaps you can look at that for
> >     clues.  The files of interest are:
> > 
> >         generateds/tests/oneper00.xsd
> >         generateds/tests/oneper02.xsd
> >         generateds/tests/oneper01.xsd
> >         generateds/tests/oneper03.xsd
> > 
> >     The unit test when run, generates output modules in subdirectory
> >     tests/OnePer.
> > 
> >     The command used to run that test is in tests/test.py in method
> >     test_022_one_per.  Here is that command:
> > 
> >          def test_022_one_per(self):
> >              cmdTempl = (
> >                  'python generateDS.py --no-dates --no-versions '
> >                  '--silence --member-specs=list -f '
> >                  '--one-file-per-xsd '
> >                  '--output-directory="tests/OnePer" '
> >                  '--module-suffix="One" '
> >                  '--super=%s2_sup '
> >                  'tests/%s00.xsd'
> >              )
> >              t_ = 'oneper'
> >              cmd = cmdTempl % (t_, t_, )
> >              o
> >              o
> >              o
> > 
> >     More specifically, about the maxLoops message, that error means
> >     that you have an element definition that extends another element
> >     definition, but generateDS.py thinks it should not generate the
> >     class for the extension because it has not yet generated the
> >     class for the base/parent.  I've had to work on this once before.
> >     But, I don't know why that error is happening in your case.
> > 
> >     Do you have a schema that produces this error and that you could
> >     send me?  If you do, I'll take a look.
> > 
> > 3. With respect to the namespace definition behavior and the
> >     repeated namespace definitions -- I'll take a look to see how
> >     this can be done.
> > 
> > 6. About parsing from a file-like object -- Actually, if I
> >     understand you correctly, this already works.  You can pass a
> >     file object that is open for reading to the generated parse
> >     functions.  The parameter name is misleading, I suppose.  But,
> >     lxml.etree.parse does accept either a string file name or a file
> >     object.
> > 
> > More tomorrow when I have a bit more time.
> > 
> > Thanks for the detailed report.
> > 
> > Dave
> > 
> > 
> > On Mon, May 01, 2017 at 11:18:00AM +0300, Eugene Petkevich wrote:
> > > Hello,
> > > 
> > > Thank you for the GenerateDS library.  I find it very useful.  I have a
> > > couple of things to ask:

[snip]


-- 

Dave Kuhlman
http://www.davekuhlman.org
#!/usr/bin/env python

"""
usage: batch_generate.py [-h] [--config CONFIG] [-c COMMAND] [--flags FLAGS]
                         [--in-path IN_PATH] [--out-path OUT_PATH] [-v]
                         infilename

synopsis:
  read input directives from JSON file (produced by
  collect_schema_locations.py) and generate python modules.

positional arguments:
  infilename            input JSON file containing directives

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       configuration file
  -c COMMAND, --command COMMAND
                        command. Default is "generateDS.py"
  --flags FLAGS         command line options for generateDS.py
  --in-path IN_PATH     path to the directory containing the input schemas
  --out-path OUT_PATH   path to a directory into which modules should be
                        generated
  -v, --verbose         Print messages during actions.

examples:
  python batch_generate.py input_directives.json
  python batch_generate.py --config=name.config input_directives.json

notes:
  The configuration file (see --config) has the following form:

        [generateds-batch]
        verbose = true
        command = ./generateDS.py
        flags = -f --member-specs=dict
        in-path = energistics/prodml/v2.0/xsd_schemas
        out-path = OnePer

    The option names in this configuration file are the long command line
    option names.  Options entered on the command line over-ride options
    in this config file.
  Flags/options for generateDS.py -- These flags are passed to
    generateDS.py as command line options.  Precedence: (1) Flags in
    the directives file override the --flags command line option to
    batch_generate.py.  (2) Flags in the --flags command line option
    to batch_generate.py override flags in the configuration file (the
    argument to --config=).
"""


#
# imports
from __future__ import print_function
import sys
import os
import argparse
import configparser
import subprocess
import json


#
# Global variables


#
# Private functions

def dbg_msg(options, msg):
    """Print a message if verbose is on."""
    if options.verbose:
        print(msg)


def generate_one(directive, options):
    """Generate modules for one XML schema."""
    schema_name = directive.get('schema')
    outfilename = directive.get('outfile')
    outsubfilename = directive.get('outsubfile')
    if options.in_path:
        schema_name = os.path.join(options.in_path, schema_name)
    modulename = outfilename.split('.')[0]
    if options.out_path:
        outfilename = os.path.join(options.out_path, outfilename)
    if outsubfilename:
        if options.out_path:
            outsubfilename = os.path.join(options.out_path, outsubfilename)
        outsubfilestem = outsubfilename
        outsubfilename = '--super={} -s {}'.format(modulename, outsubfilename)
    else:
        outsubfilename = ""
        outsubfilestem = outsubfilename
    flags = directive.get('flags')
    if not flags:
        flags = options.flags
    #flags = '{} {}'.format(flags, options.flags)
    cmd = '{} {} -o {} {} {}'.format(
        options.command,
        flags,
        outfilename,
        outsubfilename,
        schema_name,
    )
    dbg_msg(options, '\ncmd: {}'.format(cmd))
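    # universal_newlines=True returns stdout/stderr as text, not bytes,
    # so the messages printed below are readable.
    result = subprocess.run(
        cmd,
        stderr=subprocess.PIPE,
        stdout=subprocess.PIPE,
        shell=True,
        universal_newlines=True,
    )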
    dbg_msg(options, 'generated: {} {}'.format(outfilename, outsubfilestem))
    if result.stderr:
        print('errors: {}'.format(result.stderr))
    if result.stdout:
        print('output: {}'.format(result.stdout))


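def merge_options(options):
    """Merge config file options and command line options.
    Command line options over-ride config file options.
    """
    config = configparser.ConfigParser()
    config.read(options.config)
    if not config.has_section('generateds-batch'):
        raise RuntimeError(
            'config file missing required section "generateds-batch"')
    section = config['generateds-batch']
    for key in section:
        key1 = key.replace('-', '_')
        # Only fill in options that were not given on the command line.
        if not getattr(options, key1, None):
            if key1 == 'verbose':
                # Convert "true"/"false" to a real boolean; the raw
                # string "false" would otherwise be truthy.
                setattr(options, key1, section.getboolean(key))
            else:
                setattr(options, key1, section[key])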


def load_json_file(infile):
    """Read file.  Strip out lines that begin with '//'."""
    lines = []
    for line in infile:
        if not line.lstrip().startswith('//'):
            lines.append(line)
    content = ''.join(lines)
    return content


#
# Exported functions
def batch_generate(infile, options):
    """Generate module(s) for each line in directives file."""
    content = load_json_file(infile)
    specification = json.loads(content)
    directives = specification['directives']
    for directive in directives:
        generate_one(directive, options)


def main():
    description = """\
synopsis:
  read input directives from JSON file (produced by
  collect_schema_locations.py) and generate python modules.
"""
    epilog = """\
examples:
  python batch_generate.py input_directives.json
  python batch_generate.py --config=name.config input_directives.json

notes:
  The input directives file is a JSON file.  batch_generate.py will
    run generateDS.py once for each directive in this JSON file.
  The configuration file (see --config) has the following form:

        [generateds-batch]
        verbose = true
        command = ./generateDS.py
        flags = -f --member-specs=dict
        in-path = energistics/prodml/v2.0/xsd_schemas
        out-path = OnePer

    The option names in this configuration file are the long command line
    option names.  Options entered on the command line over-ride options
    in this config file.
  Flags/options for generateDS.py -- These flags are passed to
    generateDS.py as command line options.  Precedence: (1) Flags in
    the directives file override the --flags command line option to
    batch_generate.py.  (2) Flags in the --flags command line option
    to batch_generate.py override flags in the configuration file (the
    argument to --config=).
  The input directives file can contain comments.  Any line whose first
    non-whitespace characters are "//" is considered a comment and is
    discarded before the remaining contents are parsed by the JSON parser.
"""
    parser = argparse.ArgumentParser(
        description=description,
        epilog=epilog,
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "infilename",
        help="input JSON file containing directives",
    )
    parser.add_argument(
        "--config",
        help="configuration file",
    )
    parser.add_argument(
        "-c", "--command",
        #default="generateDS.py",
        help="command.  Default is \"generateDS.py\"",
    )
    parser.add_argument(
        "--flags",
        help="command line options for generateDS.py",
    )
    parser.add_argument(
        "--in-path",
        help="path to the directory containing the input schemas",
    )
    parser.add_argument(
        "--out-path",
        help="path to a directory into which modules should be generated",
    )
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="Print messages during actions.",
    )
    options = parser.parse_args()
    if options.config:
        merge_options(options)
    if not options.command:
        options.command = 'generateDS.py'
    if not options.flags:
        options.flags = ''
    dbg_msg(options, '\noptions: {}'.format(options))
    # Close the directives file once it has been processed.
    with open(options.infilename, 'r') as infile:
        batch_generate(infile, options)


if __name__ == '__main__':
    #import pdb; pdb.set_trace()
    #import ipdb; ipdb.set_trace()
    main()
#!/usr/bin/env python

"""
usage: collect_schema_locations.py [-h] [-f] [-v] infilename [outfilename]

synopsis:
  collect schema locations from xs:include/xs:import elements in schema.

positional arguments:
  infilename     name/location of the XML schema file to be searched
  outfilename    output file name; if omitted, stdout

optional arguments:
  -h, --help     show this help message and exit
  -f, --force    force overwrite existing output file
  -v, --verbose  print messages during actions.

examples:
  python collect_schema_locations.py myschema.xsd
  python collect_schema_locations.py myschema.xsd outfile.txt
"""


#
# imports
from __future__ import print_function
import sys
import os
import argparse
import json
from lxml import etree


#
# Global variables


#
# Private functions

def dbg_msg(options, msg):
    """Print a message if verbose is on."""
    if options.verbose:
        print(msg)


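def extract_locations(infile, options):
    """Collect schemaLocation values from xs:include and xs:import."""
    doc = etree.parse(infile)
    root = doc.getroot()
    # Bind the xs prefix explicitly so that the search does not depend
    # on which prefix (xs, xsd, ...) the schema itself uses.
    elements = root.xpath(
        './/xs:include | .//xs:import',
        namespaces={'xs': 'http://www.w3.org/2001/XMLSchema'},
    )
    locations = []
    for element in elements:
        schema_name = element.get('schemaLocation')
        # schemaLocation is optional on xs:import; skip it if absent.
        if schema_name:
            locations.append(schema_name)
    return locations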


def generate(locations, outfile, options):
    directives = []
    for location in locations:
        schema_name = location
        outfilename = os.path.split(schema_name)[1]
        outfilename = os.path.splitext(outfilename)[0]
        outfilename = '{}.py'.format(outfilename)
        directive = {
            'schema': schema_name,
            'outfile': outfilename,
            'outsubfile': '',
            'flags': '',
        }
        directives.append(directive)
    return directives


def make_output_file(outfilename, options):
    if os.path.exists(outfilename) and not options.force:
        sys.exit("\noutput file exists.  Use -f/--force to over-write.\n")
    outfile = open(outfilename, 'w')
    return outfile


#
# Exported functions

def extract_and_generate(infile, outfile, options):
    locations = extract_locations(infile, options)
    directives = generate(locations, outfile, options)
    specification = {
        'directives': directives,
    }
    json.dump(specification, outfile, indent='    ')


def main():
    description = """\
synopsis:
  collect schema locations from xs:include/xs:import elements in schema.
"""
    epilog = """\
examples:
  python collect_schema_locations.py myschema.xsd
  python collect_schema_locations.py myschema.xsd outfile.txt

notes:
  The output directives file is a JSON file suitable for input to
    batch_generate.py.  This directives file contains one directive
    for each module to be generated by generateDS.py.
  You can edit this resulting JSON file, for example to add a sub-class
    module file or flags/options for generateDS.py that are specific
    to each directive.
"""
    parser = argparse.ArgumentParser(
        description=description,
        epilog=epilog,
        formatter_class=argparse.RawDescriptionHelpFormatter,
    )
    parser.add_argument(
        "infilename",
        help="name/location of the XML schema file to be searched"
    )
    parser.add_argument(
        "outfilename",
        nargs="?",
        default=None,
        help="output file name; if omitted, stdout"
    )
    parser.add_argument(
        "-f", "--force",
        action="store_true",
        help="force overwrite existing output file",
    )
    parser.add_argument(
        "-v", "--verbose",
        action="store_true",
        help="print messages during actions.",
    )
    options = parser.parse_args()
    infile = open(options.infilename, 'r')
    if options.outfilename:
        outfile = make_output_file(options.outfilename, options)
    else:
        outfile = sys.stdout
    extract_and_generate(infile, outfile, options)
    infile.close()
    if options.outfilename:
        outfile.close()


if __name__ == '__main__':
    #import pdb; pdb.set_trace()
    #import ipdb; ipdb.set_trace()
    main()
[generateds-batch]
verbose = true
command = ./generateDS.py
flags = -f --member-specs=dict
in-path = energistics/prodml/v2.0/xsd_schemas
out-path = OnePer3
{
    "directives": [
        {
            "schema": "DtsInstrumentBox.xsd",
            "outfile": "DtsInstrumentBox.py",
            "outsubfile": "",
            "flags": ""
        },
        {
            "schema": "ProdmlCommon.xsd",
            "outfile": "ProdmlCommon.py",
            "outsubfile": "",
            "flags": "-f --member-specs=list --no-namespace-defs"
        },
        {
            "schema": "FiberOpticalPath.xsd",
            "outfile": "FiberOpticalPath.py",
            "outsubfile": "",
            "flags": ""
        }
    ]
}
GenerateDSNamespaceDefs = {
        'specialperson': 'xmlns:aa="http://www.xxx.com/namespacespecial"',
        'python-programmerType':
            'xmlns:aa="http://www.xxx.com/namespacepyprog"',
}
_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users
