Re: [Generateds-users] A few questions/issues

Dave Kuhlman Mon, 12 Jun 2017 15:33:36 -0700

Eugene,

If you decide to work with this, I've pushed my latest changes to
Bitbucket: https://bitbucket.org/dkuhlman/generateds

> So here are my comments:
> 
> * I noticed that utils/collect_schema_locations.py only collects includes,
> not imports.

Fixed.  Thanks for catching this.

> * We somehow need to deal with nested imports/includes.  The
> utils/collect_schema_locations.py only collects imports from the input file,
> but not from the schemas it imports/includes.

A couple of points:

1. One goal of the collect/batch approach is to make it easier to
   modify.  The function in collect_schema_locations.py that creates
   that list of schemas and their locations could be replaced (with
   reasonable ease) with one that descends recursively into nested
   schemas.  Or, you could re-write that function, make it
   recursive, and then after running collect_schema_locations.py,
   you could edit the generated file of directives and delete any
   unwanted directives.  Perhaps we could make that function that
   collects the schemas into a plug-in, although the whole module
   (collect_schema_locations.py) is so small that it hardly seems
   worth it; it'd be likely easier just to copy and replace the
   whole module.  You might try to work on this.  See below for
   more.

2. And, actually, I intended to *not* recurse into nested
   include/imports.  Some of the schemas I've looked at (usually
   when a generateDS.py user sent me a schema that generateDS.py had
   a problem with) have lots of nested schemas.  There are two
   problems with that: (1) We'd generate an unmanageable and unwanted
   number of modules.  (2) Often in one schema an element type
   definition extends an element type definition that is in another
   schema; but we need the class generated for that other element
   type to be in the *same* module so that we can generate a
   sub-class of it.

   I worry about a huge, bushy hierarchy that would result in lots
   of modules.  But, you are likely right that there would be some
   (simpler) schemas and some use cases where we would want that
   recursive search.

One thing we might want to ask is: What are we trying to accomplish?
What is the user trying to accomplish with --one-file-per-xsd or
this new collect/batch approach?  One response to that question is
to say "Let the user decide", and then to give the user some tools
to help the user create a solution.  Another approach is to list a
few specific needs and tasks, and to give the user tools that solve
those specific needs or perform those specific tasks.

What I'd expect a user to do (and we should be wary about this,
since many users want and do the unexpected) is to create a special
schema containing almost nothing but a few xs:include/xs:import
elements that will be used to direct the generation of a few specific
modules.

> Probably a solution would be
> to recursively scan files and put all required schemas in the
> directives.json file.
> * Then, circular imports/includes should be supported.  I guess that may be
> a complicated thing.  For collecting/loading schema files a solution would
> probably be to manage a set of already discovered schemas.  But not sure how
> complicated would be to generate classes, we probably need to generate them
> in a correct order.  How complicated do you think that would be?
> 
> I can try to help you with this.  Just tell me if you have any comments
> regarding what I've written.

You could consider modifying collect_schema_locations.py and adding
the recursive search that pulls in nested xs:include/xs:import.
There are functions in process_includes.py and
show_schema_hierarchy.py that do this.  The one in
show_schema_hierarchy.py is simpler, but I'm not sure it is correct,
even after making a fix recently.

I tried to modify collect_schema_locations.py so that it would be
easier to call the exported function extract_and_generate and pass
in a function that does that recursive walk.  But I'm not sure that
I actually made it easier.  You may just want to consider copying
the entire module and editing your copy.
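
For reference, here is a minimal sketch of that kind of recursive walk.
It is not the code from process_includes.py or show_schema_hierarchy.py,
and it does not use the extract_and_generate interface.  The function
name collect_nested_schemas is made up for illustration.  It assumes
that every schemaLocation is a local file path (URLs are skipped) and
that every referenced file exists.  Paths are normalized before they are
compared, so a circular import is visited only once and the "../.."
segments do not pile up.

```python
import os
import xml.etree.ElementTree as ET

# Namespace prefix that ElementTree puts on XML Schema element tags.
XS = "{http://www.w3.org/2001/XMLSchema}"


def collect_nested_schemas(top_path):
    """Return normalized paths of top_path and every schema reachable
    from it through xs:include/xs:import, in discovery order.

    Hypothetical helper, not part of generateDS.
    """
    seen = set()          # schemas already discovered (breaks cycles)
    ordered = []          # same schemas, in the order they were found
    pending = [top_path]  # schemas still to be scanned
    while pending:
        # Normalize so that "a/../a/x.xsd" and "a/x.xsd" compare equal.
        path = os.path.normpath(os.path.abspath(pending.pop()))
        if path in seen:
            continue
        seen.add(path)
        ordered.append(path)
        root = ET.parse(path).getroot()
        base = os.path.dirname(path)
        # include/import are direct children of xs:schema.
        for child in root:
            if child.tag in (XS + "include", XS + "import"):
                location = child.get("schemaLocation")
                # Assumption: only local files are followed.
                if location and "://" not in location:
                    pending.append(os.path.join(base, location))
    return ordered
```

The returned list could then be trimmed by hand before generating
modules, which keeps the "delete any unwanted directives" step from
point 1 above.  This only addresses collecting the files; it says
nothing about the order in which classes would need to be generated.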

Let me know if I can answer questions or be of help.

By the way, is it possible that this task (the recursive search for
all xs:include/xs:import elements) has been done before and could be
reused?  I've searched several times, but have not found anything
helpful.  Still, it does seem like a task that is common enough so
that there'd already be a solution.  Perhaps in PyXB
(http://pyxb.sourceforge.net/).  I'll take a look.

I've pushed my latest changes to Bitbucket:
https://bitbucket.org/dkuhlman/generateds.  You will want to start
with that.  A fork and a pull request would likely be easiest for
both of us to deal with, if you decide to work on it.

Dave

On Thu, Jun 08, 2017 at 03:07:34AM +0300, Eugene Petkevich wrote:
> Dave,
> 
> Thank you for the response and excuse me for the late answer.
> 
> 1) Regarding the utils/collect_schema_locations.py and
> utils/batch_generate.py.
> 
> When I ran the commands you've provided on the xsd file:
> 
> > $ ./collect_schema_locations.py -f energistics/prodml/v2.0/xsd_schemas/DasAcquisition.xsd directives04.json
> > $ mkdir OnePer3
> > $ ./batch_generate.py --config=gds02.config directives04.json
> 
> I got an error that probably explains the problem I had with the
> --one-file-per-xsd argument in my case.  I see that there is a circular
> import in schema files (gmd.xsd imports gts.xsd which imports gml.xsd which
> again imports gbd.xsd and so on).  So utils/batch_generate.py says it cannot
> find a file with a very long path that comes from that looped reference,
> like 
> '../../../../gml/3.2.1/../../iso/19139/20070417/gmd/../../../../iso/19139/20070417/gts/../../../../gml/3.2.1/../../iso/19139/20070417/gmd
> ... etc'.
> 
> So here are my comments:
> 
> * I noticed that utils/collect_schema_locations.py only collects includes,
> not imports.
> * We somehow need to deal with nested imports/includes.  The
> utils/collect_schema_locations.py only collects imports from the input file,
> but not from the schemas it imports/includes. Probably a solution would be
> to recursively scan files and put all required schemas in the
> directives.json file.
> * Then, circular imports/includes should be supported.  I guess that may be
> a complicated thing.  For collecting/loading schema files a solution would
> probably be to manage a set of already discovered schemas.  But not sure how
> complicated would be to generate classes, we probably need to generate them
> in a correct order.  How complicated do you think that would be?
> 
> I can try to help you with this.  Just tell me if you have any comments
> regarding what I've written.
> 
> **Update**:
> 
> Actually, I've tried running generateDS.py again on the xsd and it gives the
> same error (about file not found with circular import).  I'm not sure why
> it worked before and does not work now on the same file.  I've tried both
> the newest version of generateDS and the old one I've used before, both lead
> to the same error.
> 
> Ok, after trying out more I figured out that only happens when I used
> absolute path to the xsd file.  Do you have any idea why absolute path may
> be a problem?  Then I'm not sure whether my comments are still applicable.
> As I understand now, circular imports/includes are supported in
> generateDS.py itself, but somehow not in case of --one-file-per-xsd.  Still
> I see a reason to search for xsd files recursively and make one file for
> each of those xsd files.
> 
> 2) Namespace definition behavior.
> 
> I see that your new --no-namespace-defs argument works fine.  I currently
> don't see a use case for manually choosing namespace definitions with
> dictionary, but that could be useful.  The reason to have only top-level
> namespace is to make XML files more readable and smaller, by using
> namespace definition only where needed.  Though yes, probably in some cases
> it would be needed to have it not in the top, maybe when using same
> namespace name for different namespaces in different parts of the xml...
> 
> 3) Other issues I've mentioned before.
> 
> Do you have a plan at which issue to look next?  Maybe I can try to
> investigate another one meanwhile.
> 
> 4) One more thing I want to mention -- in generated code positional
> arguments are used for export, __init__ and other functions.  When
> subclassing, it is more convenient when keyword arguments are used, since we
> can get value of a particular element by its name.  I think it could make
> sense to change positional arguments to keyword arguments at least in the
> autogenerated code. Though I'm not sure that it would help if a user uses
> positional arguments.
> 
> Best regards,
> Eugene
> 
> On 27.05.2017 00:20, Dave Kuhlman wrote:
> > Eugene,
> >

[snip]

-- 

Dave Kuhlman
http://www.davekuhlman.org

