Hi Dave,

I think you included the wrong patch file.

George
On Sun, Mar 8, 2015 at 5:15 PM Dave Kuhlman <dkuhl...@davekuhlman.org>
wrote:

> On Sat, Mar 07, 2015 at 04:53:44AM +0000, George David wrote:
> > I tested my proposal and it seems to work fine for me.
> >
> > I thought the code snippet in question was one I added, but now I'm
> fairly
> > sure I didn't. I hope I didn't offend anyone with my "critique".
> >
> > George
> >
> > On Fri, Mar 6, 2015 at 9:39 PM George David <geo...@miamisalsa.com>
> wrote:
> >
> > > Hi guys,
> > >
> > > It's funny how when you look at your own code after some time, it's
> much
> > > easier to critique.
> > >
> > > For instance, why did I do this:
> > >
> > >     while 1:
> > >         if len(PostponedExtensions) <= 0:
> > >             break
> > >
> > > instead of this:
> > >
> > >     while len(PostponedExtensions) > 0:
> > >
> > > To be honest, I really don't remember this specific code segment, but I
> > > think perhaps another solution would be better. I worry about capping
> the
> > > number of loops because we could potentially prematurely end the
> > > processing. I believe the expectation was that as we generate more
> classes,
> > > more names would be added to AlreadyGenerated and/or to
> SimpleTypeDict. In
> > > order to guard against the infinite loop, we need to detect that
> > > PostponedExtensions is in a state we already encountered. For example:
> > >
> > > Assume this is the starting state:
> > > PostponedExtensions: [ A, B, C, D ]
> > >
> > > After loop 1, D was not found and inserted at the beginning
> > > PostponedExtensions: [ D, A, B, C ]
> > >
> > > After loop 2, C was found and processed
> > > PostponedExtensions: [ D, A, B ]
> > >
> > > After loop 3, B was not found and added at the beginning:
> > > PostponedExtensions: [ B, D, A ]
> > >
> > > After loop 4, A was found and processed:
> > > PostponedExtensions: [ B, D ]
> > >
> > > After loop 5, D was not found and added at the beginning
> > > PostponedExtensions: [ D, B ]
> > >
> > > After loop 6, B was not found and added at the beginning
> > > PostponedExtensions: [ B, D ]
> > >
> > > Now we see that PostponedExtensions after loop 6 is in the same state
> as it
> > > was after loop 4 and as a result, we are in an infinite loop. At this
> point
> > > we should break out of the loop.
> > >
> > > To detect the state, we can create a checksum of the state of
> > > PostponedExtensions and keep it in a set. Once we detect a duplicate
> > > checksum, we break out of the loop.
> > >
> > > Here is my proposal:
> > >
> > >      import hashlib
> > >      import operator
> > >
> > >      #
> > >      # Generate the elements that were postponed because we had not
> > >      #   yet generated their base class.
> > >      checksums = set()
> > >
> > >      def isNewState():
> > >         state = reduce(operator.concat, PostponedExtensions)
> > >         sum = hashlib.sha1(state).hexdigest()
> > >         if sum in checksums:
> > >            return False
> > >         checksums.add(sum)
> > >         return True
> > >
> > >      while len(PostponedExtensions) > 0:
> > >          if not isNewState():
> > >             break
> > >
> > >          element = PostponedExtensions.pop()
> > >          parentName, parent = getParentName(element)
> > >          if parentName:
> > >              if (parentName in AlreadyGenerated or
> > >                      parentName in SimpleTypeDict):
> > >                  generateClasses(wrt, prefix, element, 1)
> > >              else:
> > >                  PostponedExtensions.insert(0, element)
> > >
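The cycle detection in the proposal above can be sketched standalone. This toy version (hypothetical names; plain strings stand in for the real generateDS elements, and a set of tuple snapshots stands in for the SHA-1 checksums) replays the A/B/C/D walkthrough:

```python
def drain(postponed, resolvable):
    """Pop elements, processing those whose parent is available and
    requeuing the rest; stop when a queue state repeats (a cycle)."""
    seen_states = set()
    processed = []
    while postponed:
        state = tuple(postponed)      # hashable snapshot of the queue
        if state in seen_states:
            break                     # same state as before: infinite loop
        seen_states.add(state)
        element = postponed.pop()
        if element in resolvable:
            processed.append(element)
        else:
            postponed.insert(0, element)
    return processed, postponed

# A and C have generated parents; B and D never will.
done, stuck = drain(["A", "B", "C", "D"], {"A", "C"})
print(done, stuck)    # ['C', 'A'] ['B', 'D']
```

A tuple snapshot in a set detects the repeat exactly, at the cost of keeping each state alive; the checksum variant trades that memory for hashing work.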
>
> George, and Michael too,
>
> Excellent.  Yes. I like that better.  And now, I can brag about how
> I know how to use reduce().
>
> So, I've backed out my change and added yours.
>
> > > Perhaps a warning would be nice.
>
> Added.  Writes to sys.stderr when it exits that loop because of no
> change.
>
> Unfortunately (or fortunately, depending on your point of view), I'm
> leaving for 3 days of vacation tomorrow.  So, I won't be able to do
> the testing I need to do for a few days.  When I get back, I'll do
> more testing and push this change to BitBucket.
>
> By the way, I had to change this:
>
>     sum = hashlib.sha1(state).hexdigest()
>
> to this:
>
>     sum = hashlib.sha1(str(state)).hexdigest()
>
> Not sure why.  So, I'd best look at that more closely to make sure I
> understand your code.  I did very light testing, and it seems to work.
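The `str()` is likely needed because `hashlib.sha1()` only accepts a byte string, and `reduce(operator.concat, PostponedExtensions)` yields whatever type the queued elements concatenate to, not necessarily a string. A minimal illustration (Python 3 syntax, where the stringified state must additionally be encoded):

```python
import hashlib

# hashlib refuses anything that isn't bytes-like:
try:
    hashlib.sha1(["A", "B", "C"])
except TypeError as exc:
    print("TypeError:", exc)

# Stringify (and, on Python 3, encode) the state before hashing:
state = str(["A", "B", "C"])
digest = hashlib.sha1(state.encode("utf-8")).hexdigest()
print(digest)    # a 40-character hex digest
```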
>
> Dave
>
> > >
> > > The --one-file-per-xsd will assume the schema supplied as the root
> schema.
> > > In my case I don't have a schema that is a root schema, and so what I
> do is
> > > run generateds for each XSD that I have. Unfortunately this results in
> > > recreating a lot of schemas.
> > >
> > > However, your statement that you should use a root xsd that imports all
> > > the xsds was just the suggestion I needed. I changed my bash script
> from
> > > running generateds on each xsd to creating a root xsd and only running
> > > generateds on the root xsd. I went from taking over a minute to
> generate
> > > the python classes to 10 seconds! Thanks. Such an obvious solution,
> yet it
> > > never came to me.
> > >
> > > George
> > >
> > > On Fri, Mar 6, 2015 at 6:12 PM Dave Kuhlman <dkuhl...@davekuhlman.org>
> > > wrote:
> > >
> > >> On Fri, Mar 06, 2015 at 04:23:36PM +0000, Michael L. Vezie wrote:
> > >> > (Sorry for sending a message this way -- I couldn't find an issue
> > >> > tracker on bitbucket and sf won't let me add a bug)
> > >> >
> > >> > I think I've found a bug in generateDS.py. If I run it as:
> > >> >
> > >> > generateDS.py --one-file-per-xsd --output-dir=py manifest.xsd
> > >> >
> > >> > it works fine. But a different schema file, manifest-ack just hangs
> > >> forever.
> > >> >
> > >> > I think I know where and maybe why.
> > >> > Starting at line 6094 in the current version:
> > >> >
> > >> >
> > >> >
> > >> >     #
> > >> >     # Generate the elements that were postponed because we had not
> > >> >     #   yet generated their base class.
> > >> >     while 1:
> > >> >         if len(PostponedExtensions) <= 0:
> > >> >             break
> > >> >         element = PostponedExtensions.pop()
> > >> >         parentName, parent = getParentName(element)
> > >> >         if parentName:
> > >> >             if (parentName in AlreadyGenerated or
> > >> >                     parentName in SimpleTypeDict):
> > >> >                 generateClasses(wrt, prefix, element, 1)
> > >> >             else:
> > >> >                 PostponedExtensions.insert(0, element)
> > >> >
> > >> >
> > >> > This loop is (in some cases) an infinite loop. For some reason,
> > >> > parentName is not in AlreadyGenerated so element is popped out of
> > >> > PostponedExtensions and inserted back in it forever. If I add it to
> > >> > a different list, then copy that list back to PostponedExtensions
> > >> > after the loop has finished, it seems to work fine.
> > >> >
> > >> > It happens when I run it with --one-file-per-xsd on certain schemas,
> > >> > but not others.
> > >> >
> > >> > If I change the loop to:
> > >> >
> > >> >     #
> > >> >     # Generate the elements that were postponed because we had not
> > >> >     #   yet generated their base class.
> > >> >     nPostponedExtensions = []
> > >> >     while 1:
> > >> >         if len(PostponedExtensions) <= 0:
> > >> >             break
> > >> >         element = PostponedExtensions.pop()
> > >> >         parentName, parent = getParentName(element)
> > >> >         if parentName:
> > >> >             if (parentName in AlreadyGenerated or
> > >> >                     parentName in SimpleTypeDict):
> > >> >                 generateClasses(wrt, prefix, element, 1)
> > >> >             else:
> > >> >                 nPostponedExtensions.insert(0, element)
> > >> >     for e in nPostponedExtensions:
> > >> >         PostponedExtensions.append(e)
> > >> >
> > >> >
> > >> > (inserting the element in a different list, then copying that list
> > >> > back to the first after the loop has finished), it seems to work
> > >> > fine.
> > >> >
> > >> > Attached are the two schemas, along with a common one they both
> include.
> > >>
> > >> Michael,
> > >>
> > >> Thanks for catching this error and for alerting me about it.  Your
> > >> fix was very helpful, because it focused me on the area where the
> > >> problem is and what is causing it.
> > >>
> > >> I've made a fix that is perhaps simpler and certainly dumber than
> > >> yours.  The reason that I'd rather use this simpler fix is that the
> > >> code that implements the --one-file-per-xsd feature was added
> > >> by someone else.  Because of that I'm not too clear on what his
> > >> intentions were and I don't want to take a chance on making a change
> > >> that will affect some corner case that he wants to be able to
> > >> handle.
> > >>
> > >> So, I've made a fix that causes an exit from that loop after a
> > >> maximum of 10 iterations, so it no longer goes into an infinite
> > >> loop.  I admit that it's a kludge to handle what I believe is an
> > >> abnormal situation.
> > >>
> > >> I've attached a patch file.  There are several changes.  The change
> > >> we are concerned with is the one that has "maxLoops" and "loops" in
> > >> it.
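The patch itself isn't quoted in this thread, but from the description the guard presumably looks something like the following sketch (the variable names `maxLoops`/`loops` and the limit of 10 come from the email; everything else is a stand-in):

```python
maxLoops = 10                  # arbitrary cap; a kludge, as acknowledged above
loops = 0
postponed = ["B", "D"]         # stand-in for a stuck PostponedExtensions
resolvable = set()             # parents that will never be generated

while postponed:
    if loops >= maxLoops:
        break                  # give up rather than spin forever
    loops += 1
    element = postponed.pop()
    if element in resolvable:
        pass                   # generateClasses(...) would run here
    else:
        postponed.insert(0, element)

print(loops)    # 10: the cap ended the otherwise-infinite loop
```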
> > >>
> > >> This patch will protect us from that infinite loop.  But, an
> > >> additional issue is whether you should be using this one-per feature
> > >> for your schema at all.
> > >>
> > >> My belief is that when you use --one-file-per-xsd, you should give
> > >> it a root XML schema file that has almost nothing in it except
> > >> xs:import statements.  That root schema file is sort of like a table of
> > >> contents that tells generateDS.py, when run with --one-file-per-xsd,
> > >> what files to generate and what schema files to use in order to
> > >> generate each one.  I've also attached a Zip file
> > >> (revised_schema.zip) containing schemas that roughly approximate
> > >> your schemas with test02.xsd serving as the root schema.  Maybe that
> > >> will give you hints about what I *believe* is the way to use this
> > >> feature.
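For illustration, such a root schema might be as small as this (file names hypothetical; it contains only a list of imports and no definitions of its own):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <!-- A "table of contents" schema: nothing but imports, telling
         generateDS.py --one-file-per-xsd which files to generate.
         Add namespace attributes if the imported schemas declare one. -->
    <xs:import schemaLocation="common.xsd"/>
    <xs:import schemaLocation="manifest.xsd"/>
    <xs:import schemaLocation="manifest-ack.xsd"/>
</xs:schema>
```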
> > >>
> > >> So, I suppose my advice is to not use the --one-file-per-xsd feature
> > >> unless you need it and, if you do need it, then you will want to
> > >> design your schema to fit the way it works.
> > >>
> > >> Hope this helps.  Thanks again for helping me on this and also for
> > >> your patience.
> > >>
> > >> Dave
> > >>
> > >> --
> > >>
> > >> Dave Kuhlman
> > >> http://www.davekuhlman.org
> > >> ------------------------------------------------------------
> > >> ------------------
> > >> Dive into the World of Parallel Programming The Go Parallel Website,
> > >> sponsored
> > >> by Intel and developed in partnership with Slashdot Media, is your hub
> > >> for all
> > >> things parallel software development, from weekly thought leadership
> > >> blogs to
> > >> news, videos, case studies, tutorials and more. Take a look and join
> the
> > >> conversation now. http://goparallel.sourceforge.net/
> > >> _______________________________________________
> > >> generateds-users mailing list
> > >> generateds-users@lists.sourceforge.net
> > >> https://lists.sourceforge.net/lists/listinfo/generateds-users
> > >>
> > >
>
>
> --
>
> Dave Kuhlman
> http://www.davekuhlman.org
>