Re: The Gump3 branch
Leo Simons wrote:
> Pfew. We really should start writing some unit tests. If I had the time I would start from scratch one more time using a test-first approach, but I haven't figured out how to comfortably do test-first python development yet.

That paragraph sounds extremely important to this novice.

--David

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: The Gump3 branch
On 08-01-2005 15:21, Adam R. B. Jack [EMAIL PROTECTED] wrote:
>> Phew, have I been busy :-D.
> You certainly have.

Ooh, long e-mail! I'm gonna try and split this up... :-D

Inter-component-communication
---
> I'm sure your IOC/container experiences have required you to answer this before, but how do you allow components to communicate/collaborate?

I firmly believe there is very little need for different components to communicate. If you architect things the IOC way, components will use just one or two other components, and their parent can just set up the references between all those components.

What will happen is that a component needs a certain kind of result available. For example, something that pushes information into the dynagump database needs that information, which might be put there by an ant builder or something like that. This kind of stuff is trivial in python; you just set the property on the relevant part of the model and then retrieve it later.

Note that such communication is pretty indirect. For example the start of the CvsUpdater plugin I did just pushes information into the model (the log of the cvs command, exit status, etc) without worrying who uses that information (at the moment, it is just ignored).

> There were times when building logic wanted to know something historically (had this built before, etc.) in order to determine how much effort (or what switches) to use. Is inter-component communications like this a real no-no, or is this something that might be coincidentally allowed via steps in pre-processing, etc.

We don't need steps. Think unix command line utilities. You can make them communicate:

  find . -type f | xargs -v .svn

Without steps. That | there in gump is achieved by setting a property on a piece of the model.

Threading
---
> Do you think we have a chance to re-instate threading in this model? [It is a minor nit, not a show stopper, but I liked the large run-time reduction of concurrent checkouts.]

Yes. We can probably reuse the worker code from gump2. I left it out on purpose because it was clouding the gump2 code (several of the gump2 bits all worry about multi-threading) and making it difficult to read.

What you can do for example is multithread each of the three stages, then join the threads in between. And each plugin might do multithreading on its own.

What I want to see first is where we need it. Instrument the different bits of the build and find out where we need the speedups. Keep most of the code simple! :-D

CLI
---
>> I've gotten the Gump3 branch into a state where everything works (for me), as far as stuff is implemented. The main core thing that is missing is cyclic dependency detection. I've got the right algorithm written down on paper, just need to make it happen. The hooks for it are there already though (the gump.engine.modeller.Verifier class).
> Mind pumping a few command lines up to a wiki or somewhere? I'd like to run the engine, and unit tests, and such. Gump2 was a pain to run (we never cured its confusion) and I'd like to start comfortably with Gump3 from the get go.

Uhm, yeah, I do :-D. The interface should be so easy to use you don't need the docs. Try ./gump help for starters. There's work TODO here, but I really prefer to update the code rather than the wiki!

> One thought in that regard is partial runs. I think Gump2 was believed (although not actually true) to be less incremental build friendly since it wouldn't allow one to do build X, update X. [It was there in Gump3, just the command line was so crude folks never got to use it.]. I feel we need Gump3 to be easy to run in pieces, and in parts.

I disagree, actually! The reason we needed to do stuff like that was because gump is so complex and difficult to use that one resorts to a model of "let's try this and see if it works". We need to fix gump so that you don't need to do that. IE, make it easy to write correct metadata.

I would like to make the hacky bits like this not part of the core. If you need an adjusted profile with just a few projects, then change the profile!

> Easily asking for things that include/exclude components on the fly. Nicola's (and Sam's) wxPython GUI was a nice user this way. Any thoughts on re-instating that?

I'm not against GUIs, but I feel CLI is way more important to get right first.

Plugins
---
> I think that generating plug-ins (perhaps even for loading, and such) is key. I'm not sure (yet) if the new model is any better than the old in allowing the core steps (loading, modelling) to be plugged in, but I think it needs to be investigated.

Yes, it's easy. Change the get_verifier() in config.py to provide a different implementation, and that's it!

> I see you have a Maven parser, but could/should that be a plug-in?

I doubt we should be talking about this kind of stuff as a plugin. There are very specific bits of functionality that *need* to be performed (right contracts) for gump to
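[Editor's sketch] The "multithread each stage, then join the threads in between" idea from the message above could look roughly like the following. This is only an illustration under assumptions: the thread does not name the three stages, so update/build/report are placeholders, and Module and run_stage are hypothetical names rather than Gump3 code. A real build stage would also have to respect dependency order, which this sketch ignores.

```python
import threading


class Module:
    """Hypothetical stand-in for a piece of the model."""

    def __init__(self, name):
        self.name = name
        self.log = []


def run_stage(task, items):
    """Run task(item) for every item on its own thread, then wait for all."""
    threads = [threading.Thread(target=task, args=(item,)) for item in items]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()  # nothing from the next stage starts before this returns


# Placeholder stages; each one only touches the module it was handed.
def update(module):
    module.log.append("updated")


def build(module):
    module.log.append("built")


def report(module):
    module.log.append("reported")


modules = [Module("ant"), Module("junit"), Module("xerces")]
for stage in (update, build, report):
    run_stage(stage, modules)

for module in modules:
    print(module.name, module.log)
```

Because every stage joins before the next one starts, the per-module log order is fixed even though the modules within a stage run concurrently.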
Re: The Gump3 branch
On 08-01-2005 20:58, Stefano Mazzocchi [EMAIL PROTECTED] wrote:

<big snip of lots of stuff/>

>> I see you have a Maven parser, but could/should that be a plug-in?
> This is *EXACTLY* the kind of question we should *NOT* be answering. It does *NOT* matter if it's a plugin or not, as long as it does the job. Early refactoring is the root of all evil, even worse than early optimization.

Well, I think I disagree that's the case here. What I did was a pretty late refactoring of gump2. What Adam is basically asking is "is that refactoring now done?" and the answer is "probably not completely". It makes sense to figure out at this point if there are some big architectural flaws to catch now and change.

Please do be critical and ask those questions! The answer could just be "no", but that just makes us all more confident that we're on the right path...I really don't mind taking a week or two to be sure of that.

Thinking more about that Q re the to-be-built maven parser...basically you could have 0..n Normalizers that chain up to all change small bits of the xml model. Sounds like a pipeline of cocoon transformers :-D. Fortunately that change is kind-of isolated since you could just refactor the current Normalizer internally to consist of multiple smaller components. An alternative might be replacing the normalizer with a wrapper around an XSLT script to handle the transformation. Or ... Or ... :-D

The same could probably be said of all the other xml handling bits. For example the Objectifier could probably be split into one small class for each different kind of tag that needs to be turned into a python object. I suggest we don't worry about that for now (no need to build another cocoon in python! :-D) but keep in mind that it's possible.

> A lot of what Leo has done is to reduce the (bloated, perceived or real I don't know) complexity of Gump2, if you start moving stuff over from Gump2 to Gump3 before *others* had a look at the new (and much simpler) code, we go back to a one man show and the entire effort is useless.

If that would really be the case then the refactoring effort would have failed. I would hope that adding an RDF generator plugin would be adding a single sourcefile somewhere where it is easily ignored by someone just learning the system internals. Nevertheless, I do agree with:

> so, please, let's work as a team in identifying what needs to be done, outline the priorities and then allowing code to get in.
>
> priority #1: avoid one man shows.
> priority #2: keep it simple, stupid (to help people understanding the code, then helping on #1)
> priority #3: achieve separation of build from presentation
> priority #4: implement a very usable command line interface
>
> everything else (including sending email!) will come later.

Concretely, doing a grep for TODO should show lots of places where I think the existing code needs work :-D

> Avalon had the notion of a component lifecycle and this is what Leo is doing here.

Ssssh! :-D

> no, and we don't need one: you need mysql and if you don't have it Gump3 won't run. That's as simple as that.

+1 to that. Some decisions about that which are implicit in the new CLI as I've been building it:

* python 2.3 with all its libs
* unix environment (cygwin will do, probably)
* mysql
* bunch of python libraries we choose to use (like MySQLdb)
* bash, java, ...

The bash script simply checks for all this and complains if something is not there. Grepping that script for the word "check" hopefully results in a full list of required software :-D

Next week is scheduled to be rather busy so it might take a while before I have time to reply!

Cheers,

- Leo
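[Editor's sketch] The "0..n Normalizers that chain up to all change small bits of the xml model" idea from the message above can be illustrated as a list of small functions applied in order. Everything here is an assumption made for illustration: the two normalizers, the element and attribute names, and the 'cvs' default are invented, and this is not how the Gump3 Normalizer is actually written.

```python
import xml.etree.ElementTree as ET


def strip_whitespace_names(root):
    """Trim stray whitespace around every 'name' attribute."""
    for element in root.iter():
        if "name" in element.attrib:
            element.set("name", element.get("name").strip())
    return root


def default_module_type(root):
    """Give every <module> lacking a 'type' attribute a made-up default."""
    for module in root.iter("module"):
        module.attrib.setdefault("type", "cvs")
    return root


def normalize(root, normalizers):
    """Run 0..n normalizers in order; each sees the previous one's output."""
    for normalizer in normalizers:
        root = normalizer(root)
    return root


root = ET.fromstring('<workspace><module name=" ant "/></workspace>')
root = normalize(root, [strip_whitespace_names, default_module_type])
module = root.find("module")
print(module.get("name"), module.get("type"))
```

Adding another small normalizer (a Maven-specific one, say) would then mean appending one more function to the list, which is the isolation the message is hoping for.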
DynaGump (was Re: The Gump3 branch)
> The goal now is to allow Gump3 to perform builds and put its data into the database so that dynagump can start publishing it. Everything else is secondary.

I agree, but I think Gump3 is a good idea and I'd like to see it for the long run. The *right*/focused plan for now is to accept that Gump3 is months (and a lot of work) off (I know from experience) and that the shortest path to DynaGump is not Gump3. Work with me to finish the DynaGump actor for Gump2 that I wrote for you, and let's get it up and running. Let's start exercising/integrating DynaGump now, not wait for a core re-write.

The best thing that happened to Gump2 is that folks were running Gump1 in parallel. Countless bugs were detected/resolved by being able to run side by side and compare. The best thing we can do for Gump3 is allow Gump2 to talk to DynaGump in parallel.

If we create a workspace on Brutus called DynaGump and configure it to a DB with both old and new DB schemas in it, we can have DynaGump up and running in no time. Nothing (IMHO) better than running DynaGump against DBs formed by old and new Gump (2 & 3) and also comparing it to the HTML results generated by Gump2.

Let's allow Gump3 to be team formed by giving it time, whilst we make one incremental improvement and allow DynaGump to be born. Can we agree on this as a step in the plan?

> But I also hope that we'll work as a team this time.

Stefano, you make me smile. :-) You are so strong in your opinions (at least how they read to others) that you come perilously close to stymieing the community you love.

I gave up on Depot, leaving behind parts I love/long to see, mainly 'cos it was becoming a one man band. Gump, however, is a thriving community, and even when I was the only Python coder we had vast community efforts in metadata/management/communications (Wikis/Documentation/Blogs)/problem resolution/and so on. Gump's code is only a small component of its whole.

I welcome more coders into Gump code, in fact I've longed for it & tried to encourage entry many times. Gump2 was a one man band 'cos nobody else wished to invest time and effort in a possibly dying venture, and yet out of it (in part by you helping it becoming a TLP) Gump was re-born and is once again thriving. Gump thrives based off its contributions to the community, and hence their contributions to it (via metadata/effort) not due to the code.

I welcome Gump3 as a great opportunity for discussion and solving some mistakes of Gump2. Leo has addressed some, but not all (as I'll write) that need solving. I see no point in doing a re-write if after months and much effort we are no better off, and we've just shifted the one man team to a new man who we'll near burn out w/ all the 'implementation nits' that pure theory doesn't prepare you for. I'm no Leo, but I know this, I've been there.

Stefano, we are a team, and as a team we will have different world views/skill sets/insights -- and yes, have different weaknesses/make different mistakes. I'll keep raising concerns/issues based off my one-man-band wealth of experience, and hope we'll all keep an open mind to what is re-instating a past mistake, and what is a practical insight. Part of being a team is, perhaps, you educating me into your views/insights and me pressure testing them on me. Let's not let our desire for progress weaken our team.

regards,

Adam
Re: The Gump3 branch
> Ooh, long e-mail! I'm gonna try and split this up... :-D

Sorry Dude, I got excited. :-) I'll try to keep them shorter or split them. [I'll reply a few times to this one.]

Having slept on what I saw, I do have some serious questions, and (to keep it short, I'll come right to the point, knowing you know I mean it respectfully) I wonder if there is as much significant difference between Gump2 and Gump3 as I first thought. They are much the same.

I have a slight deja-vu feeling here. You've built a nice (clean) start, like Sam did, but to get from this to a live running system will take much the same work that I added last time, and I'm not sure the key problems of Gump2 have been understood/corrected. I'm going to try (over time) to list every place in Gump2 that I feel would be as bad in Gump3 so we can address them. This isn't me being petty, but me trying to pressure test this new approach against my understanding of reality (for all its/my warts).

> I firmly believe there is very little need for different components to communicate. If you architect things the IOC way, components will use just one or two other components, and their parent can just set up the references between all those components.

[ BTW: I still could use help with IOC. I have a crude understanding of it, but please don't forget to enlighten me if you see I'm missing a point.]

Sure, I see that components ought not need to communicate directly. In Gump2 we have a model tree (workspace/modules/projects) and a (theoretically separate, but not) tree of results. That tree is for a few projects, or all, based off the filter of work to do. As components do work on that tree they store data at the right level (run/workspace/module/project), perhaps even setting state (failed, etc.). This is Gump2, and (as I hear it) Gump3, no differences.

I feel it is that tree that is the weakness people consider bloat. Not its memory size, but its complexity, all the data stored in there -- and the fact it is a batch. That is a key similarity between Gump2/Gump3 and (IMHO) a key issue to address. The closer I look the more I realize the similarities between Gump2 and Gump3.

> What will happen is that a component needs a certain kind of result available. For example, something that pushes information in the dynagump database needs that information, which might be put there by an ant builder or something like that. This kind of stuff is trivial in python; you just set the property on the relevant part of the model and then retrieve it later.
> [...]
> Note that such communication is pretty indirect. For example the start of the CvsUpdater plugin I did just pushes information into the model (the log of the cvs command, exit status, etc) without worrying who uses that information (at the moment, it is just ignored).

Part of the problem is ordering/sequencing. The CVS updating would not halt all efforts on a module (builds would occur) 'cos the CVS failed if it had a semi-fresh copy. (This was due to SF.net CVS being so flakey for so long even for Gump-wise stable things like JUnit.) As such, prior to CVS updating we needed to bring some stats/history information into memory, which enforces an implicit dependency. [Note: Stats Actor today stores Stats on the Tree, so users (CVS Actor) just ask for it from there, they don't talk directly.]

I know you can do inter component communications w/ Python properties, Gump2 does, but it has no contract (as Stefano would say), it is not clean, it is intricate internals knowledge from one component to another. It is stuff like this (and order dependencies like this) that ties components together, and keeps things fat. [Gump2 at least used typed member data/methods on the tree in order to allow some contracts.]

What you are suggesting is almost exactly how Gump2 works, and is (I fear) where the thoughts of bloat come from.

>> There were times when building logic wanted to know something historically (had this built before, etc.) in order to determine how much effort (or what switches) to use. Is inter-component communications like this a real no-no, or is this something that might be coincidentally allowed via steps in pre-processing, etc.
> We don't need steps. Think unix command line utilities. You can make them communicate:
>
>   find . -type f | xargs -v .svn

I'm a PIPE lover as much as the next guy, but simple flat stream pipes are not what we are building. Our components use complex results. Do we need contracts for those, or things (like DOM tree/XML structures) that we can persist/stream/validate. [How does Cocoon address this?]

> Without steps. That | there in gump is achieved by setting a property on a piece of the model.

As with Gump2, but the properties grow and need management. They (and implicit dependencies) are the bloat.

> Plugins
> ---
>> I think that generating plug-ins (perhaps even for loading, and such) is key. I'm not sure (yet) if the new model is any better than
Re: DynaGump (was Re: The Gump3 branch)
Boy, this really came across wrong.

First of all (and not for the first time, but probably not for the last either.. unfortunately) allow me to apologize: I *really* would love to just have time to spend on this, showing how gump could potentially be the killer app of the semantic web... but no, I'm supposed to deliver other things... :-(

Anyway, this is not an excuse to be rude and disrespectful and I'm sorry for that.

Adam R. B. Jack wrote:
>> The goal now is to allow Gump3 to perform builds and put its data into the database so that dynagump can start publishing it. Everything else is secondary.
> I agree, but I think Gump3 is a good idea and I'd like to see it for the long run. The *right*/focused plan for now is to accept that Gump3 is months (and a lot of work) off (I know from experience) and that the shortest path to DynaGump is not Gump3. Work with me to finish the DynaGump actor for Gump2 that I wrote for you, and let's get it up and running. Let's start exercising/integrating DynaGump now, not wait for a core re-write.

Good point: SoC also enforces polymorphism.

> The best thing that happened to Gump2 is that folks were running Gump1 in parallel. Countless bugs were detected/resolved by being able to run side by side and compare. The best thing we can do for Gump3 is allow Gump2 to talk to DynaGump in parallel.

Very good point as well.

> If we create a workspace on Brutus called DynaGump and configure it to a DB with both old and new DB schemas in it, we can have DynaGump up and running in no time. Nothing (IMHO) better than running DynaGump against DBs formed by old and new Gump (2 & 3) and also comparing it to the HTML results generated by Gump2. Let's allow Gump3 to be team formed by giving it time, whilst we make one incremental improvement and allow DynaGump to be born. Can we agree on this as a step in the plan?

+1

>> But I also hope that we'll work as a team this time.
> Stefano, you make me smile. :-) You are so strong in your opinions (at least how they read to others) that you come perilously close to stymieing the community you love.

Yeah, well, (looking down) I know.

> I gave up on Depot, leaving behind parts I love/long to see, mainly 'cos it was becoming a one man band. Gump, however, is a thriving community, and even when I was the only Python coder we had vast community efforts in metadata/management/communications (Wikis/Documentation/Blogs)/problem resolution/and so on. Gump's code is only a small component of its whole.

Agreed. Yet, we *must* have more people touching the code and, IMO, we should do so by thinking that every line of python code puts us a little bit farther away from that goal.

> I welcome more coders into Gump code, in fact I've longed for it & tried to encourage entry many times.

Yes, I know. I'm *not* blaming it on you. I'm blaming it more on me.

> Gump2 was a one man band 'cos nobody else wished to invest time and effort in a possibly dying venture, and yet out of it (in part by you helping it becoming a TLP) Gump was re-born and is once again thriving.

Oh, here I really came across wrong: if it wasn't for your effort, I wouldn't have been involved in the first place since I thought that Sam's try just had failed to attract attention and momentum. Your energy and vitality gave me new hope and I think that's why we have a lot more gumpers today (even if they still don't touch the code!). Hopefully, the next wave will be the final one: when the community behaves just like any other and it's diversified enough to sustain any single individual leaving.

> Gump thrives based off its contributions to the community, and hence their contributions to it (via metadata/effort) not due to the code. I welcome Gump3 as a great opportunity for discussion and solving some mistakes of Gump2. Leo has addressed some, but not all (as I'll write) that need solving. I see no point in doing a re-write if after months and much effort we are no better off, and we've just shifted the one man team to a new man who we'll near burn out w/ all the 'implementation nits' that pure theory doesn't prepare you for. I'm no Leo, but I know this, I've been there.

Completely agree.

> Stefano, we are a team, and as a team we will have different world views/skill sets/insights -- and yes, have different weaknesses/make different mistakes. I'll keep raising concerns/issues based off my one-man-band wealth of experience, and hope we'll all keep an open mind to what is re-instating a past mistake, and what is a practical insight. Part of being a team is, perhaps, you educating me into your views/insights and me pressure testing them on me. Let's not let our desire for progress weaken our team.

Very wise. Again, I apologize. I came across wrong, rude and disrespectful. It was not my intention.

I very much welcome the idea of using gump2 and gump3 *both* to drive dynagump. I say let's do it :-)

-- Stefano.
Re: DynaGump (was Re: The Gump3 branch)
On 09-01-2005 17:40, Adam R. B. Jack [EMAIL PROTECTED] wrote:
>> The goal now is to allow Gump3 to perform builds and put its data into the database so that dynagump can start publishing it. Everything else is secondary.
> I agree, but I think Gump3 is a good idea and I'd like to see it for the long run.

I think we're all in violent agreement :-D

> The *right*/focused plan for now is to accept that Gump3 is months (and a lot of work) off

Yep! Oh man, so much work...

> (I know from experience) and that the shortest path to DynaGump is not Gump3. Work with me to finish the DynaGump actor for Gump2 that I wrote for you, and let's get it up and running. Let's start exercising/integrating DynaGump now, not wait for a core re-write.

The power I'm hoping to maximize on (and I think that was something Stefano was getting at) is the power of the clean sheet. Sam did this as well, only the clean sheet he provided was so amazingly smart that we all had the hardest time understanding it!

I'm guessing one of the key bits that makes a gump2/dynagump integration difficult is just how many cool ideas Stefano has in his head wrt dynagump and how immensely difficult it is to figure out how to weld those ideas into gump2.

Since gump3 has no outputs, no builders, no generators, we can start with the kind of output we need and go from there. A reversed approach if you will. By starting with an empty shell like gump3, it becomes easier to visualize how to do those things (it's a hard hard problem to get right). From there, we should be able to figure out together how to make gump2 deliver the outputs that dynagump needs to fly.

It's not an or/or decision, it's and/and. Let's just go with the flow. That's always worked for the gump community so far. We're that good ;)

> The best thing that happened to Gump2 is that folks were running Gump1 in parallel. Countless bugs were detected/resolved by being able to run side by side and compare. The best thing we can do for Gump3 is allow Gump2 to talk to DynaGump in parallel.

That makes a lot of sense.

> If we create a workspace on Brutus called DynaGump and configure it to a DB with both old and new DB schemas in it, we can have DynaGump up and running in no time.

Hehehe. Don't be too optimistic. The dynagump database schema as Stefano built it seems to be completely different in some ways from how gump2 is set up. Thinking about it hurts :-D

> Nothing (IMHO) better than running DynaGump against DBs formed by old and new Gump (2 & 3) and also comparing it to the HTML results generated by Gump2. Let's allow Gump3 to be team formed by giving it time, whilst we make one incremental improvement and allow DynaGump to be born. Can we agree on this as a step in the plan?

Like I said, it makes sense. What I'd really love to see would be for you to fully digest all the fledgling concepts in gump3 (after we figure out what they are :-D) so you can figure out what kind of migration/integration/reorganisation strategy makes sense.

And also, as I mentioned in my previous e-mail, I think we really don't need a grand plan to all agree on. Baby steps. Python makes it so easy to glue things up, there's a myriad of possibilities to make different versions of gump all interoperate. We can figure all that out!

>> But I also hope that we'll work as a team this time.
> Stefano, you make me smile. :-) You are so strong in your opinions (at least how they read to others) that you come perilously close to stymieing the community you love.

Hehehe. Take that, you passionate Italian! :-D

> Gump, however, is a thriving community, and even when I was the only Python coder we had vast community efforts in metadata/management/communications (Wikis/Documentation/Blogs)/problem resolution/and so on. Gump's code is only a small component of its whole. I welcome more coders into Gump code, in fact I've longed for it & tried to encourage entry many times. Gump2 was a one man band 'cos nobody else wished to invest time and effort in a possibly dying venture, and yet out of it (in part by you helping it becoming a TLP) Gump was re-born and is once again thriving. Gump thrives based off its contributions to the community, and hence their contributions to it (via metadata/effort) not due to the code. I welcome Gump3 as a great opportunity for discussion and solving some mistakes of Gump2. Leo has addressed some, but not all (as I'll write) that need solving. I see no point in doing a re-write if after months and much effort we are no better off, and we've just shifted the one man team to a new man who we'll near burn out w/ all the 'implementation nits' that pure theory doesn't prepare you for. I'm no Leo, but I know this, I've been there. Stefano, we are a team, and as a team we will have different world views/skill sets/insights -- and yes, have different weaknesses/make different mistakes. I'll keep raising concerns/issues based off my one-man-band wealth of experience, and hope
Re: The Gump3 branch
Adam R. B. Jack wrote:
> I'm a PIPE lover the much as the next guy, but simple flat stream pipes are not what we are building. Our components use complex results. Do we need contracts for those, or things (like DOM tree/XML structures) that we can persist/stream/validate. [How does Cocoon address this?]

Cocoon pipelines are not streams of characters but streams of structured events (using the SAX API). So, for example, if you have

  <a><b/></a>

to pass along, the events are:

- startElement(a)
- startElement(b)
- endElement(b)
- endElement(a)

-- Stefano.
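[Editor's sketch] The event sequence described above can be reproduced with the SAX support in the Python standard library. This is a small illustration written for a current Python 3, not Cocoon code; EventRecorder and record_events are names made up for the example.

```python
import xml.sax


class EventRecorder(xml.sax.ContentHandler):
    """Collects the element events a SAX parser emits, in order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append("startElement(%s)" % name)

    def endElement(self, name):
        self.events.append("endElement(%s)" % name)


def record_events(document):
    """Parse an XML document (bytes) and return the recorded event list."""
    recorder = EventRecorder()
    xml.sax.parseString(document, recorder)
    return recorder.events


for event in record_events(b"<a><b/></a>"):
    print(event)
```

A downstream component written against such a handler never sees the character stream, only the structured events, which is the difference from a flat unix pipe that the message points out.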
Re: The Gump3 branch
On 09-01-2005 18:28, Adam R. B. Jack [EMAIL PROTECTED] wrote:
>> Ooh, long e-mail! I'm gonna try and split this up... :-D
> Sorry Dude, I got excited. :-)

Excitement is good!

> I wonder if there is as much significant difference between Gump2 and Gump3 as I first thought.

Probably not. Then again, I don't know what you thought...

> I have a slight deja-vu feeling here. You've built a nice (clean) start, like Sam did, but to get from this to a live running system will take much the same work that I added last time, and I'm not sure the key problems of Gump2 have been understood/corrected. I'm going to try (over time) to list every place in Gump2 that I feel would be as bad in Gump3 so we can address them. This isn't me being petty, but me trying to pressure test this new approach against my understanding of reality (for all its/my warts).

It's a good idea. By all means. Software architecture is *hard*.

> [ BTW: I still could use help with IOC. I have a crude understanding of it, but please don't forget to enlighten me if you see I'm missing a point.]

That takes years! :-D. Think military and about who is in command. Maybe you're familiar with patterns like chain of responsibility. IOC is a pattern in the same sense. It's a tree of commands. General at the top.

> Sure, I see that components ought not need to communicate directly. In Gump2 we have a model tree (workspace/modules/projects) and a (theoretically separate, but not) tree of results. That tree is for a few projects, or all, based off the filter of work to do. As components do work on that tree they store data at the right level (run/workspace/module/project), perhaps even setting state (failed, etc.). This is Gump2, and (as I hear it) Gump3, no differences.

There's a few I think. I had the hardest time fully reading through the gump2 model code. I decided I needed to start with the XML, retrieve the fundamental abstractions, and rewrite the tree. It was so much fun I just kept going.

The gump3 tree is totally passive, and much closer to the way a mathematician would build a tree. You can let loose algorithms on it that were figured out in the 30s (ie the topological sort in the walker code is one of those).

The gump3 tree does not do any kind of validation. It does only the most minimal of defaults.

The gump3 tree is more fully normalized. All references are fully two-way, like with DOM. The difference between <depend/> and <option/> sucks conceptually, so now the option-ness is just a property of the edge that connects two vertices. I think it's much simpler.

> I feel it is that tree that is the weakness people consider bloat. Not its memory size, but its complexity, all the data stored in there -- and the fact it is a batch. That is a key similarity between Gump2/Gump3 and (IMHO) a key issue to address.

Right.

> Part of the problem is ordering/sequencing. The CVS updating would not halt all efforts on a module (builds would occur) 'cos the CVS failed if it had a semi-fresh copy. (This was due to SF.net CVS being so flakey for so long even for Gump-wise stable things like JUnit.) As such, prior to CVS updating we needed to bring some stats/history information into memory, which enforces an implicit dependency. [Note: Stats Actor today stores Stats on the Tree, so users (CVS Actor) just ask for it from there, they don't talk directly.]

That's a big part of the problem. The solution is in the back of my head, nearly constantly as I look at gump. Basically these kinds of decisions are all encapsulated into the graphical algebra formulae Stefano and I found in September. It would be real nice to meet face-to-face so we could talk about that one!

> I know you can do inter component communications w/ Python properties, Gump2 does, but it has no contract (as Stefano would say), it is not clean, it is intricate internals knowledge from one component to another. It is stuff like this (and order dependencies like this) that ties components together, and keeps things fat. [Gump2 at least used typed member data/methods on the tree in order to allow some contracts.]

That's a fundamental difference right there! Strong typing is the way we write contracts in java, but that really doesn't work as well in python. We miss the interface keyword. Python OO needs to be built for dynamism. Take a look at how hard the Zope people tried and failed to add that in and how immensely hard that has hit them in the face and how bloated their design is now!

The way to specify contracts in python is to document them:

  "The CvsUpdater plugin will set a string property cvs_update_log on each module that is of type 'cvs'. The property contains the log output from the cvs update command of course."

That's a contract right there. Solidify the contract in a unit test for the updater. Model stays clean, and blissfully unaware.

> What you are suggesting is almost exactly how Gump2 works, and is (I fear) where the thoughts of bloat come from.

The
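[Editor's sketch] The "document the contract, then solidify it in a unit test" approach from the message above might look like this. Only the plugin name CvsUpdater, the property name cvs_update_log and the 'cvs' module type come from the thread; the Module class, the visit_module method, the repository_type attribute and the injected run_update callable are assumptions made so the example is self-contained, and none of it is the real Gump3 code.

```python
import unittest


class Module:
    """Passive model object; plugins attach properties to it."""

    def __init__(self, name, repository_type):
        self.name = name
        self.repository_type = repository_type


class CvsUpdater:
    """Hypothetical plugin.

    Contract: after visit_module(), every module of type 'cvs' has a string
    property cvs_update_log containing the output of the cvs update command.
    Modules of any other type are left untouched.
    """

    def __init__(self, run_update):
        # run_update is injected so the test needs no real cvs binary.
        self.run_update = run_update

    def visit_module(self, module):
        if module.repository_type == "cvs":
            module.cvs_update_log = self.run_update(module)


class CvsUpdaterContractTest(unittest.TestCase):
    def setUp(self):
        fake_cvs = lambda module: "U %s/build.xml\n" % module.name
        self.updater = CvsUpdater(fake_cvs)

    def test_sets_log_on_cvs_modules(self):
        module = Module("ant", "cvs")
        self.updater.visit_module(module)
        self.assertIsInstance(module.cvs_update_log, str)
        self.assertEqual(module.cvs_update_log, "U ant/build.xml\n")

    def test_leaves_other_modules_alone(self):
        module = Module("forrest", "svn")
        self.updater.visit_module(module)
        self.assertFalse(hasattr(module, "cvs_update_log"))


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CvsUpdaterContractTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The model class knows nothing about cvs_update_log; the docstring states the contract and the test pins it down, so a later consumer of the property has something concrete to rely on.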
Re: The Gump3 branch
Phew, have I been busy :-D.

You certainly have. I got up real early (before I go cut up cars w/ the jaws of life) so I could take a read of this. I'm impressed, inspired and (frankly) a little awed. I love how you've been far bolder than I ever was with putting your stamp on this thing, and enforcing clean practices. I was trying to replicate what existed, and make incremental deviations, but you've stood your ground from the start, enforcing your will/beliefs on this thing. I'm sure your previous container/component works have given you a lot of experience to inject here, and I think Gump lucked out that you gave it the time/framework.

Ok, so love fest over; a few questions (and there will be more, 'cos I don't have enough time now). I guess my questions are concerns about where the pure theory meets the many practicalities that Gump bumps into.

I'm sure your IOC/container experiences have required you to answer this before, but how do you allow components to communicate/collaborate? There were times when building logic wanted to know something historically (had this built before, etc.) in order to determine how much effort (or what switches) to use. Is inter-component communication like this a real no-no, or is this something that might be coincidentally allowed via steps in pre-processing, etc.?

Do you think we have a chance to re-instate threading in this model? [It is a minor nit, not a show stopper, but I liked the large run-time reduction of concurrent checkouts.]

I've gotten the Gump3 branch into a state where everything works (for me), as far as stuff is implemented. The main core thing that is missing is cyclic dependency detection. I've got the right algorithm written down on paper, just need to make it happen. The hooks for it are there already though (the gump.engine.modeller.Verifier class).

Mind pumping a few command lines up to a wiki or somewhere? I'd like to run the engine, and unit tests, and such.
Gump2 was a pain to run (we never cured its confusion) and I'd like to start comfortably with Gump3 from the get go.

One thought in that regard is partial runs. I think Gump2 was believed (although not actually true) to be less incremental build friendly since it wouldn't allow one to do build X, update X. [It was there in Gump3, just the command line was so crude folks never got to use it.] I feel we need Gump3 to be easy to run in pieces, and in parts. Easily asking for things that include/exclude components on the fly. Nicola's (and Sam's) wxPython GUI was a nice user this way. Any thoughts on re-instating that?

I think that generating plug-ins (perhaps even for loading, and such) is key. I'm not sure (yet) if the new model is any better than the old in allowing the core steps (loading, modelling) to be plugged in, but I think it needs to be investigated. I see you have a Maven parser, but could/should that be a plug-in? If you can leverage the framework here, perhaps in multi-stage runs (e.g. pre/run/post for loading metadata, pre/run/post for building, etc.) that might be nice. [Not sure if it is overkill, but I think it was a big weakness in Gump2 that needs to be addressed.]

Another weakness of Gump2 was the (eventually huge) in-memory trees combining model and results. Hmm, I'm not sure if this goes away here (or not), and I fear not. How are we going to allow (say) a results plug-in to inject the build log (and/or command line or whatever) into the results DB? I suspect it needs to reach out and touch the memory structures. Maybe little has changed here. [I half wondered about using XML files between components so we could completely run the build and later run results generation. I never got to it 'cos I felt it was a lot of work and maybe overkill. Thoughts?]

[Hmm, do we need a Wiki page w/ re-design goals/objectives to measure this framework against?]

I think we need to treat internal plug-ins the same as community-added ones, i.e. eat our own dog food.
Do you know Python patterns for discovering and loading such plug-ins? I'd like to start by writing plug-ins that this framework can run. Is (say) an RDF generating plug-in missing the point of DynaGump, or something allowable? I'm game to start work on the DB interface for generating history, or others.

The other stuff that's missing is a lot of plugins. The new architecture as I set things up identifies three stages:
- preprocessing
- build/run
- postprocessing

Is this a tried and tested model you've used a lot in containers? Just curious about its origins. I wonder if (eventually) we'd like to be able to break Gump3 completely from the sequential run, perhaps into an event-based engine.

And each of those can have plugins (basically what are now called actors). Preprocessing plugins that need to be built include source repository updaters. Build tools that need to be built include all the handlers for the different Commands. Postprocessing that needs to be built includes the dynagump adapter. Basically everything
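
The three stages listed above (preprocessing, build/run, postprocessing), each with its own plugins, could be sketched roughly as follows. This is an illustration only: the Engine class, its register/run methods and the three toy plugins are hypothetical names, and finishing a whole stage for all modules before starting the next one is an assumption, not something the message specifies.

```python
class Engine:
    """Hypothetical engine: runs each stage's plugins over every module, stage by stage."""

    STAGES = ('preprocessing', 'build', 'postprocessing')

    def __init__(self):
        self.plugins = {stage: [] for stage in self.STAGES}

    def register(self, stage, plugin):
        if stage not in self.plugins:
            raise ValueError('unknown stage: %s' % stage)
        self.plugins[stage].append(plugin)

    def run(self, modules):
        # Assumption: a stage completes for all modules before the next stage begins.
        for stage in self.STAGES:
            for plugin in self.plugins[stage]:
                for module in modules:
                    plugin(module)


def updater(module):
    # Preprocessing: pretend to update the source and record a log on the model.
    module['update_log'] = 'updated ' + module['name']


def builder(module):
    # Build: only "succeeds" if the preprocessing result is available.
    module['build_ok'] = 'update_log' in module


def reporter(module):
    # Postprocessing: summarise whatever earlier stages left on the model.
    status = 'ok' if module['build_ok'] else 'failed'
    module['report'] = '%s: %s' % (module['name'], status)


engine = Engine()
# The order of registration across stages does not matter; the stage order does.
engine.register('postprocessing', reporter)
engine.register('build', builder)
engine.register('preprocessing', updater)

modules = [{'name': 'junit'}, {'name': 'ant'}]
engine.run(modules)
```

The plugins never call each other; each one only reads and writes properties on the module it is handed, which is the indirect style of communication discussed in this thread.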
Re: The Gump3 branch
Adam R. B. Jack wrote:

Phew, have I been busy :-D.

You certainly have. I got up real early (before I go cut up cars w/ the jaws of life) so I could take a read of this. I'm impressed, inspired and (frankly) a little awed. I love how you've been far bolder than I ever was with putting your stamp on this thing, and enforcing clean practices. I was trying to replicate what existed, and make incremental deviations, but you've stood your ground from the start, enforcing your will/beliefs on this thing. I'm sure your previous container/component works have given you a lot of experience to inject here, and I think Gump lucked out that you gave it the time/framework.

yes, I agree with Adam (and expressed my sentiments to Leo privately so far), Gump3 is very avalonish, in the pure IoC way, which is *very* refreshing for me :-) [and also refreshing to see that all the years spent in trying to get Avalon working were not that useless]

Ok, so love fest over; a few questions (and there will be more, 'cos I don't have enough time now). I guess my questions are concerns about where the pure theory meets the many practicalities that Gump bumps into. I'm sure your IOC/container experiences have required you to answer this before, but how do you allow components to communicate/collaborate?

well, you don't :-) No, seriously, IoC drives your ability to interact. Leo is not using a proper component manager (and I agree that would be overkill), but he's using the idea that if a component needs something, it calls a factory (or a method factory) that will return a component, or a proxy/facade for the component to talk to. This allows isolation and polymorphism.

There were times when building logic wanted to know something historically (had this built before, etc.) in order to determine how much effort (or what switches) to use. Is inter-component communication like this a real no-no, or is this something that might be coincidentally allowed via steps in pre-processing, etc.?
if the building component needs the historical component, it will ask its parent for it and the parent will either know how to obtain one or will delegate to its parent until somebody knows how.

Do you think we have a chance to re-instate threading in this model? [It is a minor nit, not a show stopper, but I liked the large run-time reduction of concurrent checkouts.]

I don't see any architectural impediment for this.

I've gotten the Gump3 branch into a state where everything works (for me), as far as stuff is implemented. The main core thing that is missing is cyclic dependency detection. I've got the right algorithm written down on paper, just need to make it happen. The hooks for it are there already though (the gump.engine.modeller.Verifier class).

Mind pumping a few command lines up to a wiki or somewhere? I'd like to run the engine, and unit tests, and such. Gump2 was a pain to run (we never cured its confusion) and I'd like to start comfortably with Gump3 from the get go.

No, I think wikis and documentation here are more harmful than useful at this stage, because they get out of sync. I would much rather spend time in making the code self-documented and the error messages and CLI interface very very very explicit. Leo already made a great start on this.

One thought in that regard is partial runs. I think Gump2 was believed (although not actually true) to be less incremental build friendly since it wouldn't allow one to do build X, update X. [It was there in Gump3, just the command line was so crude folks never got to use it.]

I think you mean Gump2 and yes, the command line was awful.

I feel we need Gump3 to be easy to run in pieces, and in parts.

Absolutely.

Easily asking for things that include/exclude components on the fly. Nicola's (and Sam's) wxPython GUI was a nice user this way. Any thoughts on re-instating that?

-1 as well. it's pretty much useless to have a GUI when you are running gump on a server and accessing it over an SSH text shell.
We don't need anything that allows including/excluding components on the fly (other than a configuration file) and we don't need a gui to edit the metadata. what we need is:
1) the ability to run gump on a single project or on a group of them
2) the ability to validate metadata on a single project or on a group of them
3) keep it simple stupid
4) reduce the complexity to a minimum
5) prevent YAGNI

I think that generating plug-ins (perhaps even for loading, and such) is key. I'm not sure (yet) if the new model is any better than the old in allowing the core steps (loading, modelling) to be plugged in, but I think it needs to be investigated.

Adam, please, let's not commit the same mistakes again: we should fix things only if they are broken. The goal now is to allow Gump3 to perform builds and put its data into the database so that dynagump can start publishing it. Everything else is secondary.

I see you have a Maven parser, but could/should that be a plug-in?

This is
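
Going back to the earlier point in this message about a building component asking its parent for the historical component: a minimal sketch of that parent-delegation lookup might look like the following. All names here (Container, provide, lookup, History, Builder, the 'history' role) are hypothetical and chosen for illustration; this is not how Gump3 itself is wired.

```python
class Container:
    """Hypothetical container: resolves a component by role, delegating to its parent on a miss."""

    def __init__(self, parent=None):
        self.parent = parent
        self.components = {}

    def provide(self, role, component):
        self.components[role] = component

    def lookup(self, role):
        if role in self.components:
            return self.components[role]
        if self.parent is not None:
            # This level does not know how; ask the parent, and so on upwards.
            return self.parent.lookup(role)
        raise LookupError('no component for role: %s' % role)


class History:
    """Hypothetical historical component: remembers which projects built before."""

    def __init__(self, built_before):
        self.built_before = set(built_before)

    def has_built(self, project):
        return project in self.built_before


class Builder:
    """Hypothetical building component that obtains history through its container."""

    def __init__(self, container):
        # The builder only knows the role name, not who provides it.
        self.history = container.lookup('history')

    def effort_for(self, project):
        return 'incremental' if self.history.has_built(project) else 'full'


root = Container()
root.provide('history', History(['junit']))
child = Container(parent=root)  # knows nothing about history itself
builder = Builder(child)
```

Because the builder depends only on the role, the root could hand back a proxy or a stub in place of History without the builder changing, which is the isolation and polymorphism argument made above.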
Re: The Gump3 branch
What I would like now is a beer and some feedback :-D

First feedback ... my ailing 802.11b WISP network practically puked on all those Cocoon JARs. Yikes! I had to give up on Eclipse SVN and use command line SVN so as not to time out. 4+ hours (on third try) and counting. And no, I won't move to the flatlands, the mountains are worth this pain. Perhaps I ought to re-install the modem...

More (and hopefully useful) feedback once I've been able to play with this. Thanks for all your efforts on this. I'm excited to see what you've injected...

regards
Adam