Eugene,

If you decide to work with this, I've pushed my latest changes to
Bitbucket: https://bitbucket.org/dkuhlman/generateds

> So here are my comments:
>
> * I noticed that utils/collect_schema_locations.py only collects includes,
> not imports.

Fixed.  Thanks for catching this.

> * We somehow need to deal with nested imports/includes. The
> utils/collect_schema_locations.py only collects imports from the input file,
> but not from the schemas it imports/includes.

A couple of points:

1. One goal of the collect/batch approach is to make it easier to
   modify.  The function in collect_schema_locations.py that creates
   the list of schemas and their locations could be replaced (with
   reasonable ease) with one that descends recursively into nested
   schemas.  Or, you could re-write that function, make it
   recursive, and then, after running collect_schema_locations.py,
   edit the generated file of directives and delete any unwanted
   directives.  Perhaps we could make the function that collects
   the schemas into a plug-in, although the whole module
   (collect_schema_locations.py) is so small that it hardly seems
   worth it; it would likely be easier just to copy and replace the
   whole module.  You might try to work on this.  See below for
   more.

2. And, actually, I intended to *not* recurse into nested
   include/imports.  Some of the schemas I've looked at (usually
   when a generateDS.py user sent me a schema that generateDS.py
   had a problem with) have lots of nested schemas.  There are two
   problems with that: (1) We'd generate an unmanageable and
   unwanted number of modules.  (2) Often in one schema an element
   type definition extends an element type definition that is in
   another schema; but we need the class generated for that other
   element type to be in the *same* module so that we can generate
   a sub-class of it.

   I worry about a huge, bushy hierarchy that would result in lots
   of modules.  But, you are likely right that there would be some
   (simpler) schemas and some use cases where we would want that
   recursive search.

One thing we might want to ask is: What are we trying to accomplish?
What is the user trying to accomplish with --one-file-per-xsd or
this new collect/batch approach?  One response to that question is
to say "Let the user decide", and then to give the user some tools
to help create a solution.  Another approach is to list a few
specific needs and tasks, and to give the user tools that solve
those specific needs or perform those specific tasks.

What I'd expect a user to do (and we should be wary about this,
since many users want and do the unexpected) is to create a special
schema containing almost nothing but a few xs:include/xs:import
elements that would be used to direct the generation of a few
specific modules.

> Probably a solution would be
> to recursively scan files and put all required schemas in the
> directives.json file.
> * Then, circular imports/includes should be supported. I guess that may be
> a complicated thing. For collecting/loading schema files a solution would
> probably be to manage a set of already discovered schemas. But not sure how
> complicated would be to generate classes, we probably need to generate them
> in a correct order. How complicated do you think that would be?
>
> I can try to help you with this. Just tell me if you have any comments
> regarding what I've written.

You could consider modifying collect_schema_locations.py and adding
the recursive search that pulls in nested xs:include/xs:import.
There are functions in process_includes.py and
show_schema_hierarchy.py that do this.  The one in
show_schema_hierarchy.py is simpler, but I'm not sure it is correct,
even after making a fix recently.

I tried to modify collect_schema_locations.py so that it would be
easier to call the exported function extract_and_generate and pass
in a function that does that recursive walk.  But I'm not sure that
I actually made it easier.  You may just want to consider copying
the entire module and editing your copy.  Let me know if I can
answer questions or be of help.
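
For what it's worth, the recursive walk with a set of already
discovered schemas could be sketched roughly as below.  This is only
an illustration, not the code in process_includes.py or
show_schema_hierarchy.py.  The function name collect_schema_paths is
made up for the example, and it assumes local files with relative
schemaLocation values (remote URLs are simply skipped).

```python
import os
import xml.etree.ElementTree as ET

# Namespace prefix that ElementTree attaches to XML Schema element tags.
XS = "{http://www.w3.org/2001/XMLSchema}"


def collect_schema_paths(path, seen=None):
    """Return the set of absolute paths of `path` and every schema it
    reaches through xs:include/xs:import (hypothetical helper)."""
    if seen is None:
        seen = set()
    # abspath also normalizes the path, collapsing 'a/../b' chains, so a
    # schema reached again through a cycle maps to the same key.
    path = os.path.abspath(path)
    if path in seen:
        return seen  # already discovered; this is what stops the loop
    seen.add(path)
    root = ET.parse(path).getroot()
    base = os.path.dirname(path)
    # xs:include and xs:import are direct children of xs:schema.
    for child in root:
        if child.tag in (XS + "include", XS + "import"):
            location = child.get("schemaLocation")
            # xs:import may omit schemaLocation; URLs are ignored here.
            if location and "://" not in location:
                collect_schema_paths(os.path.join(base, location), seen)
    return seen
```

Calling collect_schema_paths on the top-level schema would give the
set of files to write into the directives file.  Resolving each
schemaLocation against the directory of the schema that contains it,
and normalizing before the lookup, is the design choice that keeps a
circular import from producing ever longer '../..' paths.  Ordering
the generated classes is a separate problem that this sketch does not
touch.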

By the way, is it possible that this task (the recursive search for
all xs:include/xs:import elements) has been done before and could be
reused?  I've searched several times, but have not found anything
helpful.  Still, it does seem like a task that is common enough
that there'd already be a solution.  Perhaps in PyXB
(http://pyxb.sourceforge.net/).  I'll take a look.

I've pushed my latest changes to Bitbucket:
https://bitbucket.org/dkuhlman/generateds.  You will want to start
with that.  A fork and a pull request would likely be easiest for
both of us to deal with, if you decide to work on it.

Dave

On Thu, Jun 08, 2017 at 03:07:34AM +0300, Eugene Petkevich wrote:
> Dave,
>
> Thank you for the response and excuse me for the late answer.
>
> 1) Regarding the utils/collect_schema_locations.py and
> utils/batch_generate.py.
>
> When I ran the commands you've provided on the xsd file:
>
> > $ ./collect_schema_locations.py -f
> > energistics/prodml/v2.0/xsd_schemas/DasAcquisition.xsd directives04.json
> > $ mkdir OnePer3
> > $ ./batch_generate.py --config=gds02.config directives04.json
>
> I got an error that probably shows the reason for the problem with the
> --one-file-per-xsd argument in my case. I see that there is a circular
> import in schema files (gmd.xsd imports gts.xsd which imports gml.xsd which
> again imports gmd.xsd and so on). So utils/batch_generate.py says it cannot
> find a file with a very long path that comes from that looped reference,
> like
> '../../../../gml/3.2.1/../../iso/19139/20070417/gmd/../../../../iso/19139/20070417/gts/../../../../gml/3.2.1/../../iso/19139/20070417/gmd
> ... etc'.
>
> So here are my comments:
>
> * I noticed that utils/collect_schema_locations.py only collects includes,
> not imports.
> * We somehow need to deal with nested imports/includes. The
> utils/collect_schema_locations.py only collects imports from the input file,
> but not from the schemas it imports/includes.
> Probably a solution would be
> to recursively scan files and put all required schemas in the
> directives.json file.
> * Then, circular imports/includes should be supported. I guess that may be
> a complicated thing. For collecting/loading schema files a solution would
> probably be to manage a set of already discovered schemas. But not sure how
> complicated would be to generate classes, we probably need to generate them
> in a correct order. How complicated do you think that would be?
>
> I can try to help you with this. Just tell me if you have any comments
> regarding what I've written.
>
> **Update**:
>
> Actually, I've tried running generateDS.py on the xsd again and it gives the
> same error (about a file not found with a circular import). I'm not sure why
> it worked before and does not work now on the same file. I've tried both
> the newest version of generateDS and the old one I used before; both lead
> to the same error.
>
> Ok, after trying out more I figured out that it only happens when I use an
> absolute path to the xsd file. Do you have any idea why an absolute path may
> be a problem? Then I'm not sure whether my comments are still applicable.
> As I understand now, circular imports/includes are supported in
> generateDS.py itself, but somehow not in case of --one-file-per-xsd. Still
> I see a reason to search for xsd files recursively and make one file for
> each of those xsd files.
>
> 2) Namespace definition behavior.
>
> I see that your new --no-namespace-defs argument works fine. I currently
> don't see a use case for manually choosing namespace definitions with a
> dictionary, but that could be useful. The reason to have only a top-level
> namespace is to make XML files more readable and smaller, by using a
> namespace definition only where needed. Though yes, probably in some cases
> it would be needed to have it not at the top, maybe when using the same
> namespace name for different namespaces in different parts of the xml...
>
> 3) Other issues I've mentioned before.
>
> Do you have a plan for which issue to look at next? Maybe I can try to
> investigate another one meanwhile.
>
> 4) One more thing I want to mention -- in generated code positional
> arguments are used for export, __init__ and other functions. When
> subclassing it is more convenient when keyword arguments are used, since we
> can get the value of a particular element by its name. I think it could make
> sense to change positional arguments to keyword arguments at least in the
> autogenerated code. Though I'm not sure that it would help if a user uses
> positional arguments.
>
> Best regards,
> Eugene
>
> On 27.05.2017 00:20, Dave Kuhlman wrote:
> > Eugene,
> > [snip]

--
Dave Kuhlman
http://www.davekuhlman.org

_______________________________________________
generateds-users mailing list
generateds-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/generateds-users