Dave,
Thank you for the response and excuse me for the late answer.
1) Regarding the utils/collect_schema_locations.py and
utils/batch_generate.py.
When I ran the commands you've provided on the xsd file:
$ ./collect_schema_locations.py -f energistics/prodml/v2.0/xsd_schemas/DasAcquisition.xsd directives04.json
$ mkdir OnePer3
$ ./batch_generate.py --config=gds02.config directives04.json
I got an error that probably explains what went wrong with the
--one-file-per-xsd argument in my case. There is a circular
import in the schema files (gmd.xsd imports gts.xsd, which imports gml.xsd,
which in turn imports gmd.xsd, and so on). So utils/batch_generate.py says
it cannot find a file with a very long path that comes from that looped
reference, like
'../../../../gml/3.2.1/../../iso/19139/20070417/gmd/../../../../iso/19139/20070417/gts/../../../../gml/3.2.1/../../iso/19139/20070417/gmd
... etc'.
So here are my comments:
* I noticed that utils/collect_schema_locations.py only collects
includes, not imports.
* We somehow need to deal with nested imports/includes. The
utils/collect_schema_locations.py only collects imports from the input
file, but not from the schemas it imports/includes. Probably a solution
would be to recursively scan files and put all required schemas in the
directives.json file.
* Then, circular imports/includes should be supported. I guess that may
be complicated. For collecting/loading schema files, a solution
would probably be to keep a set of already discovered schemas. But
I'm not sure how complicated it would be to generate the classes; we
probably need to generate them in the correct order. How complicated do
you think that would be?
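One way to handle nested and circular references, as a minimal sketch rather than a change to utils/collect_schema_locations.py: walk the xs:include and xs:import elements recursively and keep a set of already discovered files, keyed by normalized absolute path. The function name collect_schema_files is hypothetical, and only the standard library is assumed.

```python
import os
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"


def collect_schema_files(path, seen=None):
    """Return the set of schema files reachable from `path`.

    Hypothetical helper: follows xs:include and xs:import recursively
    and uses `seen` to stop on circular references.
    """
    if seen is None:
        seen = set()
    # Normalize so that 'a/../a/x.xsd' and 'a/x.xsd' count as one file.
    path = os.path.normpath(os.path.abspath(path))
    if path in seen:
        return seen
    seen.add(path)
    root = ET.parse(path).getroot()
    base = os.path.dirname(path)
    for child in root:
        if child.tag in (XS + "include", XS + "import"):
            location = child.get("schemaLocation")
            # An xs:import may omit schemaLocation; remote URLs are skipped.
            if location and "://" not in location:
                collect_schema_files(os.path.join(base, location), seen)
    return seen
```

Because each path is normalized before the membership test, a looped reference such as gmd -> gts -> gml -> gmd is visited once per file instead of producing an ever longer relative path.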
I can try to help you with this. Just tell me if you have any comments
regarding what I've written.
**Update**:
Actually, I've tried running generateDS.py on the xsd again and it gives
the same error (file not found, with a circular import). I'm not
sure why it worked before and does not work now on the same file. I've
tried both the newest version of generateDS and the old one I used
before; both lead to the same error.
OK, after trying some more I figured out that this only happens when I
use an absolute path to the xsd file. Do you have any idea why an absolute
path may be a problem? Now I'm not sure whether my comments are still
applicable. As I understand it now, circular imports/includes are
supported in generateDS.py itself, but somehow not in the case of
--one-file-per-xsd. Still, I see a reason to search for xsd files
recursively and make one file for each of those xsd files.
2) Namespace definition behavior.
I see that your new --no-namespace-defs argument works fine. I
currently don't see a use case for manually choosing namespace
definitions with a dictionary, but that could be useful. The reason to
have only a top-level namespace definition is to make XML files more
readable and smaller, by using namespace definitions only where needed.
Though yes, probably in some cases it would be needed somewhere other
than at the top, maybe when using the same namespace name for different
namespaces in different parts of the XML...
3) Other issues I've mentioned before.
Do you have a plan for which issue to look at next? Maybe I can try to
investigate another one in the meantime.
4) One more thing I want to mention -- in the generated code, positional
arguments are used for export, __init__ and other functions. When
subclassing, it is more convenient when keyword arguments are used,
since we can get the value of a particular element by its name. I think it
could make sense to change positional arguments to keyword arguments, at
least in the autogenerated code. Though I'm not sure that it would help
if a user uses positional arguments.
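To illustrate the difference, here is a minimal sketch with hand-written stand-ins for generated classes. The class and member names (Point, LabeledPoint, x, y, label) are hypothetical and are not produced by generateDS.py.

```python
class Point(object):
    """Stand-in for a generated class with two members."""

    def __init__(self, x=None, y=None):
        self.x = x
        self.y = y


class LabeledPoint(Point):
    """Subclass that adds a member and passes the rest through by name."""

    def __init__(self, label=None, **kwargs):
        # Keyword arguments let the subclass inspect or default one
        # member by name without knowing its position in the signature.
        kwargs.setdefault("y", 0)
        super(LabeledPoint, self).__init__(**kwargs)
        self.label = label


p = LabeledPoint(label="origin", x=1)
print(p.label, p.x, p.y)
```

With positional arguments, the subclass would have to know that y is the second parameter of the parent, and would break if the generated signature changed order.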
Best regards,
Eugene
On 27.05.2017 00:20, Dave Kuhlman wrote:
Eugene,
I apologize for taking so long. And, I do not have fixes for all
the issues that you report.
But, I think I've made some progress.
A few notes are below.
One more issue I've found is that in documentation it is written that
default parameter for export is --export="write literal" but in reality it
is --export="write".
I've updated the doc. Thanks for reporting it.
2) When I try to use --one-file-per-xsd argument, I get the following error:
*** maxLoops exceeded. Something is wrong with --one-file-per-xsd.
I believe that the problem occurs when we try to generate modules
from an incomplete schema. Possibly, it's because when gDS attempts
to generate a class from element type A which extends element type
B, and the definition of element type B is in a part of the schema
that is not included (with xs:include or xs:import), then it cannot
generate the class for A without first generating its super-class,
which is the class for B, which is missing.
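The ordering constraint described above can be sketched as follows. This illustrates the idea only and is not the code in generateDS.py; the function name generation_order and the mapping from a type name to its base name are hypothetical.

```python
def generation_order(bases):
    """Order type names so each base comes before its extensions.

    `bases` maps a type name to the name of the type it extends, or to
    None. Raises ValueError when a needed base is never defined (or the
    extensions form a cycle), which is the situation where a bounded
    retry loop would give up.
    """
    ordered = []
    done = set()
    pending = dict(bases)
    while pending:
        ready = sorted(
            name for name, base in pending.items()
            if base is None or base in done
        )
        if not ready:
            raise ValueError(
                "cannot order types, missing or circular base for: %s"
                % ", ".join(sorted(pending))
            )
        for name in ready:
            ordered.append(name)
            done.add(name)
            del pending[name]
    return ordered


print(generation_order({"A": "B", "B": None}))
```

Calling generation_order({"A": "B"}), where B is not defined anywhere in the input, raises ValueError instead of looping.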
As an alternative strategy, I'm working on a replacement for the
--one-file-per-xsd capability. That option seems too inflexible to
me. So, what I've done is to implement two scripts to replace that
capability:
1. utils/collect_schema_locations.py -- Scans an XML Schema and
collects the top level xs:include and xs:import references. It
writes them out in a (JSON) file that can be used by
utils/batch_generate.py.
2. utils/batch_generate.py -- Reads the output file produced by
collect_schema_locations.py. For each reference in that file, it
runs generateDS.py to produce a Python module.
So, I believe that the function and intent of these two scripts is
pretty much the same as the capability provided by
--one-file-per-xsd, *but* these scripts are small and relatively
easy to understand and their function is not hard-wired into
generateDS.py. And, therefore, I'm hoping that they will be more
usable and will give us more flexibility. When they do *not* do
what we want, we will be more easily able to modify them.
I've now got these two scripts working. But I need to do more work on
them. In particular I need to write some documentation. And, I need
to make them easier to use. Right now they are a bit hard to
work with even for me, and I'm the one who implemented them.
I've attached these two scripts. If you decide to try them, I'd
welcome your comments.
Here is how you might run them:
$ ./collect_schema_locations.py -f energistics/prodml/v2.0/xsd_schemas/DasAcquisition.xsd directives04.json
$ mkdir OnePer3
$ ./batch_generate.py --config=gds02.config directives04.json
And, if gds02.config contains the following:
[generateds]
verbose = true
command = ./generateDS.py
flags = -f --member-specs=dict
in-path = energistics/prodml/v2.0/xsd_schemas
out-path = OnePer3
You would end up with the following files in subdirectory OnePer3/:
OnePer3/DtsInstrumentBox.py
OnePer3/FiberOpticalPath.py
OnePer3/ProdmlCommon.py
OnePer3/SubProdmlCommon.py
Here is what the directives file that produced the above modules
looks like:
{
    "directives": [
        {
            "schema": "DtsInstrumentBox.xsd",
            "outfile": "DtsInstrumentBox.py",
            "outsubfile": "",
            "flags": ""
        },
        {
            "schema": "ProdmlCommon.xsd",
            "outfile": "ProdmlCommon.py",
            "outsubfile": "SubProdmlCommon.py",
            "flags": ""
        },
        {
            "schema": "FiberOpticalPath.xsd",
            "outfile": "FiberOpticalPath.py",
            "outsubfile": "",
            "flags": ""
        }
    ]
}
Note that I manually added the line:
"outsubfile": "SubProdmlCommon.py",
OK. I admit. That process seems a bit complex. I'll work on that.
Note that if while generating one of the modules using the above
procedure, there is a missing and needed element type definition
(for example, element type A extends element type B and the
definition of element type B is missing), then we'll still get the
error that you reported. This procedure only lets us narrow and
control the generation of these multiple modules, for example by
editing the directives file that is input to batch_generate.py.
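As a rough sketch of what a batch step over such a directives file could look like: this is not the attached batch_generate.py, build_commands is a hypothetical name, and mapping "outfile" and "outsubfile" to the generateDS.py options -o and -s is an assumption about how the directive fields are used.

```python
import json
import os
import shlex


def build_commands(directives_text, command, flags, in_path, out_path):
    """Return one argument list per directive in the JSON text."""
    commands = []
    for item in json.loads(directives_text)["directives"]:
        args = [command] + shlex.split(flags) + shlex.split(item["flags"])
        args += ["-o", os.path.join(out_path, item["outfile"])]
        if item["outsubfile"]:
            # Only request a subclass module when one is named.
            args += ["-s", os.path.join(out_path, item["outsubfile"])]
        args.append(os.path.join(in_path, item["schema"]))
        commands.append(args)
    return commands


example = """
{"directives": [
    {"schema": "ProdmlCommon.xsd", "outfile": "ProdmlCommon.py",
     "outsubfile": "SubProdmlCommon.py", "flags": ""}
]}
"""
cmds = build_commands(
    example, "./generateDS.py", "-f --member-specs=dict",
    "energistics/prodml/v2.0/xsd_schemas", "OnePer3")
for args in cmds:
    print(" ".join(args))
```

Each argument list could then be handed to subprocess.call, and a non-zero return code reported per schema, so that one failing schema does not hide the results for the others.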
3) Namespace definition behavior -- by default, generateDS puts namespace
definition in every export method of generated classes. That is, every
element in an exported xml that has children would have namespace
definition. But what if I only want namespace definition in top-level
element? For example, I want this:
This one is my next task.
I'm thinking perhaps if we had an additional command line option
--no-namespace-defs. If you use that option, we never export the
namespace definitions. So, then at the top level you would add the
namespacedef_="xmlns:abc=xxx" and it would not be passed down to
child elements. I'll see if I can come up with an example for you
to review.
But, before we pursue that approach (a --no-namespace-defs command
line flag), we should ask what our (actually, the user's) needs and
goals are. I'm thinking that perhaps the user needs a more fine
grained control over which elements are generated with which
namespace definitions (xmlns:xx="yyy") and when. Consider the
following range of possible controls:
1. Enable the user to specify namespace definitions to be generated on
the export of *all* elements. This is the current capability:
gDS attempts to automatically detect the needed namespace
definition, and there is also the --namespacedef command line option.
2. Enable the user to request that no namespace definitions are
generated on the export of any elements. This might be done with
a new --no-namespace-defs command line option.
3. Enable the user to specify the namespace definitions to be
generated on each element type. This might be done by enabling
the user to provide a (JSON? XML?) table/dictionary that maps
element complexType names to namespace definitions (strings of
the form 'xmlns:xx="yyy" xmlns:zz="www" ...').
Perhaps we need both #2 and #3. No. 2 is quick and easy. No. 3
will take me a little longer, but should not be too complex or
difficult, even for me.
So, I'll do some more exploration and will report back later. If
you have comments or suggestions, I'll welcome them.
[later ...]
OK, here is what I've done:
1. There is now a new command line option for generateDS.py. When
--no-namespace-defs is used, the default value for the
namespacedef_ parameter for each `export` method will be "".
This means that namespace prefix definitions will be generated
only for the top level (outermost) element and only when
explicitly passed in to the call to ``export()``. Also note that
the `parse()` function generated near the bottom of each module
may already do this.
2. Implemented the capability to use a manually edited dictionary
that enables you to specify the namespace prefix definitions to
be exported with specific element types. OK, I realize that the
same element type can occur at different levels and that you
might want the namespace prefix definitions on upper ones but not
lower (enclosed) ones. Still, this capability gives you more
control than you have now.
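The effect of the two changes can be illustrated with a toy model. The Node class below is a hand-written stand-in, not generated code; NAMESPACE_DEFS is a hypothetical name for the per-type dictionary, and only the namespacedef_ parameter name is taken from the description above.

```python
import io

# Hypothetical per-type table: element type name -> definitions.
NAMESPACE_DEFS = {"item": 'xmlns:abc="http://www.abc.com/ns"'}


class Node(object):
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)

    def export(self, outfile, namespacedef_=""):
        # An explicit argument wins; otherwise fall back to the table.
        nsdef = namespacedef_ or NAMESPACE_DEFS.get(self.name, "")
        attrs = " " + nsdef if nsdef else ""
        outfile.write("<%s%s>" % (self.name, attrs))
        for child in self.children:
            # The definition is not passed down to child elements.
            child.export(outfile)
        outfile.write("</%s>" % self.name)


out = io.StringIO()
root = Node("root", [Node("plain"), Node("item")])
root.export(out, namespacedef_='xmlns:top="http://www.example.com/top"')
print(out.getvalue())
```

In this model the definition passed to the top-level call appears only on the outermost element, and a child carries a definition only when its type has an entry in the table.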
Attached are:
- collect_schema_locations.py -- Collect xs:include and xs:import
references for batch generation.
- batch_generate.py -- Batch generation of modules.
- directives06.json -- Sample directives file for batch generation
of modules.
- gds02.config -- Sample configuration file for use with --config
option to batch_generate.py.
- generatedsnamespaces.py -- Sample module containing a dictionary
that specifies namespace prefix definitions to be attached to
specific element types during export.
Yet to be done:
- Add some documentation for the collect_schema_locations.py and
batch_generate.py scripts.
- Add documentation for the added namespace prefix definition
command line option and the prefix mapping dictionary module.
- Additional testing -- In particular, I suspect that
batch_generate.py does not do error reporting in a reasonable way.
Any comments or guidance that you might want to give is welcome.
Dave
On Wed, May 03, 2017 at 02:05:31PM +0300, Eugene Petkevich wrote:
Hello Dave,
Thank you for the quick answer.
Regarding (2), here are the xsd files:
https://www.dropbox.com/s/x5kljbv3gjsem1h/energistics.zip?dl=0 , and the
file that didn't work is 'prodml/v2.0/xsd_schemas/DasAcquisition.xsd' in the
archive.
One more issue I've found is that in documentation it is written that
default parameter for export is --export="write literal" but in reality it
is --export="write".
Regards,
Eugene
On 02.05.2017 01:45, Dave Kuhlman wrote:
Eugene,
Hello. I'm glad generateDS.py has been helpful. Thanks for letting
me know.
Here are a few comments:
1. The file gends_user_methods.py is in the source distribution.
You can find that here:
https://dkuhl...@bitbucket.org/dkuhlman/generateds
The documentation is wrong on that. I'll fix it.
2. With respect to --one-file-per-xsd -- In the test directory
(generateds/tests/ again in the source distribution) there is a
test that uses that option. Perhaps you can look at that for
clues. The files of interest are:
generateds/tests/oneper00.xsd
generateds/tests/oneper02.xsd
generateds/tests/oneper01.xsd
generateds/tests/oneper03.xsd
The unit test, when run, generates output modules in subdirectory
tests/OnePer.
The command used to run that test is in tests/test.py in method
test_022_one_per. Here is that command:
def test_022_one_per(self):
    cmdTempl = (
        'python generateDS.py --no-dates --no-versions '
        '--silence --member-specs=list -f '
        '--one-file-per-xsd --output-directory="tests/OnePer" '
        '--module-suffix="One" '
        '--super=%s2_sup '
        'tests/%s00.xsd'
    )
    t_ = 'oneper'
    cmd = cmdTempl % (t_, t_, )
o
o
o
More specifically, about the maxLoops message, that error means
that you have an element definition that extends another element
definition, but generateDS.py thinks it should not generate the
class for the extension because it has not yet generated the
class for the base/parent. I've had to work on this once before.
But, I don't know why that error is happening in your case.
Do you have a schema that produces this error and that you could
send me? If you do, I'll take a look.
3. With respect to the namespace definition behavior and the
repeated namespace definitions -- I'll take a look to see how
this can be done.
6. About parsing from a file-like object -- Actually, if I
understand you correctly, this already works. You can pass a
file object that is open for reading to the generated parse
functions. The parameter name is misleading, I suppose. But,
lxml.etree.parse does accept either a string file name or a file
object.
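A small sketch of that point, calling the parser directly rather than a generated parse function. The fallback to the standard library parser is only there so the example runs without lxml installed; the sample XML is made up.

```python
import io

try:
    from lxml import etree
except ImportError:
    # Fallback so the sketch runs without lxml installed; the standard
    # library parser also accepts a file name or a file object.
    import xml.etree.ElementTree as etree

# Any object open for reading works here, for example an in-memory buffer.
source = io.BytesIO(b"<root><child/></root>")
tree = etree.parse(source)
print(tree.getroot().tag)
```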
More tomorrow when I have a bit more time.
Thanks for the detailed report.
Dave
On Mon, May 01, 2017 at 11:18:00AM +0300, Eugene Petkevich wrote:
Hello,
Thank you for the GenerateDS library. I find it very useful. I have a
couple of things to ask:
[snip]
_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users