Re: [DISCUSS] Full-dev role in PR testign
> > My impression is that this is already the status quo. But, if we think we > need to be more clear on this, let's put up a vote to change the coding > guidelines and PR checklist. I've done this many times in the past, the > most obvious instances are when I've made doc changes or unit test > modifications because those will not impact full dev. I will own this item. > I think it can probably get rolled in with my dev guideline changes for > architecture diagrams. For completeness in our PR checklist: "- [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?" In practice, you're right, but any newer contributors aren't necessarily going to know this. 1. I think we should create Jiras with the end deliverable being that our > private vs public API endpoints are clearly delineated. From there, we > create another round of javadoc - for the public APIs let the javadocs rain > from the heavens to your heart's content. It's for public consumption and > should assist end users. See Mockito, for example - > > https://static.javadoc.io/org.mockito/mockito-core/2.27.0/org/mockito/Mockito.html > . > For developer docs, I'm of the *extremely strong* opinion that this should > be limited. Emphasize module, package, class, and method naming conventions > over all else. If it doesn't make sense just reading the code, take a > minute to summarize what you're doing and consider refactoring. For > legitimately more complex and necessary code passages, add a note. For > multi-class interactions that provide a larger story arc, add dev docs to > the relevant READMEs. Our use of Zookeeper Curator and its interaction with > our topology config loading is a perfect example of a feature that would > fit this need. > 2. I'm an immediate -1 on any documentation that looks like " /** Open the > car door **/ public void openCarDoor() {...}" :-). The code speaks for > itself. > 3. Publish javadocs for public APIs, not our internal dev APIs. Let your > fav IDE fill in the gaps for devs. I'm +1 on delineating public vs private APIs like you've outlined there. I think our dev stuff is *better* than our general usage guides, but there's room for improvement. I'm fairly agnostic on the dev docs because to be honest, a ton of our older code is not at all self explanatory, and to be blunt refactoring a lot of it is a substantial lift (as we've all seen multiple times trying to refactor it). If this were greenfield, I'd be in much stronger agreement with you, but I suspect in practice there's a lot of stuff nobody's going to refactor for awhile. > Full dev until we vote to replace the existing setup and can be confident > that the new approach 1. is stable, 2. takes <= the amount of time to > complete as full dev. I am +1 for migrating towards this approach and think > we should do so when it's dialed in. Great, I look forward to that getting in. Justin, what are your thoughts on leveraging this approach along with > long-lived Docker containers? > Apparently, there's actually an extension for running Docker containers, see https://faustxvi.github.io/junit5-docker/. My main hesitation there is more around how much effort to migrate it is. I think that's almost certainly a cleaner long term solution, but I suspect the 80% solution of migrating what we have *might* be easier. There might also be ways of just leveraging this by moving stuff over to failsafe from surefire, but I really don't know enough about it. Clean/reset component state after test method and/or test class >completion Probably doable regardless of approach, but things can take nontrivial amounts of time to clean up (e.g. topic deletion or clearing). I believe we do this right now, so I'd expect to continue with that. On Wed, May 1, 2019 at 8:48 PM Michael Miklavcic < michael.miklav...@gmail.com> wrote: > Justin, what are your thoughts on leveraging this approach along with > long-lived Docker containers? I think the lifecycle would look like: > >1. I need components A, B, C >2. If not started, start A, B, C >3. If started, clean/reset it >4. Setup pre-test state >5. Run test(s) >6. Clean/reset component state after test method and/or test class >completion (we may consider nuking this step - I assisted in designing > an >acceptance testing framework in the past where we handled cleanup on the >front end. This meant that remediation for failed tests became simpler >because I still had the state from the failed test run) > > Stopping/removing the containers could be a manual process or just a simple > script in the project root. We could also add a special module that runs > last that handles shutting down containers, if desired. > > I know this has been a perennial thread for us. I can dig up the original > discuss threads on this as well if folks think they're still pertinent, but > I think we're pretty far removed
Re: [DISCUSS] Full-dev role in PR testign
Justin, what are your thoughts on leveraging this approach along with long-lived Docker containers? I think the lifecycle would look like: 1. I need components A, B, C 2. If not started, start A, B, C 3. If started, clean/reset it 4. Setup pre-test state 5. Run test(s) 6. Clean/reset component state after test method and/or test class completion (we may consider nuking this step - I assisted in designing an acceptance testing framework in the past where we handled cleanup on the front end. This meant that remediation for failed tests became simpler because I still had the state from the failed test run) Stopping/removing the containers could be a manual process or just a simple script in the project root. We could also add a special module that runs last that handles shutting down containers, if desired. I know this has been a perennial thread for us. I can dig up the original discuss threads on this as well if folks think they're still pertinent, but I think we're pretty far removed now from that original point in time and should just move forward with fresh perspectives. On Wed, May 1, 2019 at 9:21 AM Justin Leet wrote: > Re: the integration testing point above, a strategy I used recently to > alleviate a similar problem was to exploit JUnit 5's extensions. I haven't > played with this at all in Metron's code, so 1) Take this with a several > grains of salt and 2) Feel encouraged to point out anything > wrong/problematic/ludicrous. My use case was pretty tame and easy to > implement. > > Essentially, you can set up an extension backed by a singleton cluster (in > the case I was doing, I had two extensions, one for a Kafka cluster and one > for MiniDfs). The backing clusters expose some methods (i.e. create topic, > get brokers, get file system, etc.), which lets the tests classes setup > whatever they need. Classes that need them just use an annotation and can > be setup under the hood to init only when tests in the current run suite > actually use them and spin down after all are done. This saved ~1 min on a > 4 minute build. It ends up being a pretty clean way to use them, although > we might need to be a bit more sophisticated, since my case was less > complicated. The main messiness is that this necessarily invites tests to > interfere with each other (i.e. if multiple tests use the kafka topic > "test", problems will be involved). We might be able to improve on this. > > We could likely do something similar with our InMemoryComponents. Right now > these are spun up on a per-test basis, rather than the overall suite. This > involves some setup in any class that wants them, just to be able to get a > cluster. There's also some definition of spin up order and so on, which is > already taken care of with the extensions (declaration order). The other > catch in our case is that we have way more code covered by JUnit 4, which > doesn't have the same capabilities. That migration doesn't necessarily > have to be done all at once, but it is a potential substantial pain point. > > Basically, I'd hope to push management of our abstraction further back into > actual JUnit so they're get more reuse across runs and it's easily > understood what a test needs and uses right up front. In our case, the > InMemoryComponents likely map to extensions. > > If we did something like this, it potentially solves a few problems > * Makes it easy to build an integration test that says "Give me these > components". > * Keeps it alive across any test that needs them > * Only spins it up if there are tests running that do need them. Only spins > up the necessary components. > * Lower our build times. > * Interaction with components remains primarily through the same methods > you'd normally interact (you can build producers/consumers, or whatever). > * IMO, easier to explain and have people use. I've had a couple people > pretty easily pick up on it. > > I can't share the code, but essentially it looks like (And I'm sure email > will butcher this, so I'm sorry in advance): > @ExtendWith({ ... > @BeforeAll > public static void beforeAll() { >// Test specific setup, similar to before. But without the cluster > creation work. > } > > Everything else gets handled under the covers, with the caveat that what I > needed to do was less involved than some of our tests, so there may be some > experimenting. > > On Wed, May 1, 2019 at 10:59 AM Justin Leet wrote: > > > Hi all, > > > > I wanted to start a discussion on something near and dear to all of our > > hearts: The role of full-dev in our testing cycle. > > > > Right now, we require all PRs to have spun up the full-dev environment > and > > make sure that things flow through properly. In some cases, this is a > > necessity, and in other cases, it's a fairly large burden on current and > > potential contributors. > > > > So what do we need to do to reduce our dependence on full dev and > increase > > our confidence in our CI process? >
[DISCUSS] Dev guideline changes for architecture diagrams
Picking up where things left off in the VOTE thread on the subject, I'm presenting a revision to my original proposal below. I'd like to get this signed off on before submitting it for another vote. Otto, picking up where we left off, let me know if this looks good to you. I'd like to propose a vote to change our dev guidelines which will clarify the tooling we use to produce diagrams and share the source files for those diagrams. The original discuss thread is noted at the end of this email. I propose the dev guidelines https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines and PR checklist https://github.com/apache/metron/blob/master/.github/PULL_REQUEST_TEMPLATE.md#for-documentation-related-changes be changed in the following ways: 1. Under "1.1 Contributing A Code Change" 1. Change <<"New features and significant bug fixes should be documented in the JIRA and appropriate architecture diagrams should be attached. Major features may require a vote.">> to <<"New features and significant bug fixes should be documented in the JIRA. Appropriate architecture diagrams should be created in https://www.draw.io/ and committed to source control as per section 2.4. Diagrams may be requested of PR submitters during review either as documentation or as an aid to the reviewer. Major features may also require a vote.">> 2. Under "2.4 Documentation" 1. New line item <<"Diagrams - We save architecture diagram source files in an xml format rendered by draw.io (instructions below). This is the free tool of choice that we've agreed to use for exchanging diagrams and their source files in Metron.">> 2. New line item >. This section would provide basic instructions for downloading source files from draw.io. 3. Add a new checkbox item under PR checklist heading "For documentation related changes" with the following text 1. Have you ensured that any documentation diagrams have been updated, along with their source files, using draw.io? See https://cwiki.apache.org/confluence/display/METRON/Development+Guidelines for instructions. 4. Here is the Jira for migrating/redoing existing diagrams 1. https://issues.apache.org/jira/browse/METRON-2099 - Original DISCUSS thread - https://lists.apache.org/thread.html/3ae02f1e32044b1a7648899700d44611aefdab6caa09fb3196292425@%3Cdev.metron.apache.org%3E - Original VOTE thread - https://lists.apache.org/thread.html/c41dca65a46354c161d58caa7c79cfac6758f437b6638d12bd5c5622@%3Cdev.metron.apache.org%3E
Re: [DISCUSS] Full-dev role in PR testign
Re: the integration testing point above, a strategy I used recently to alleviate a similar problem was to exploit JUnit 5's extensions. I haven't played with this at all in Metron's code, so 1) Take this with a several grains of salt and 2) Feel encouraged to point out anything wrong/problematic/ludicrous. My use case was pretty tame and easy to implement. Essentially, you can set up an extension backed by a singleton cluster (in the case I was doing, I had two extensions, one for a Kafka cluster and one for MiniDfs). The backing clusters expose some methods (i.e. create topic, get brokers, get file system, etc.), which lets the tests classes setup whatever they need. Classes that need them just use an annotation and can be setup under the hood to init only when tests in the current run suite actually use them and spin down after all are done. This saved ~1 min on a 4 minute build. It ends up being a pretty clean way to use them, although we might need to be a bit more sophisticated, since my case was less complicated. The main messiness is that this necessarily invites tests to interfere with each other (i.e. if multiple tests use the kafka topic "test", problems will be involved). We might be able to improve on this. We could likely do something similar with our InMemoryComponents. Right now these are spun up on a per-test basis, rather than the overall suite. This involves some setup in any class that wants them, just to be able to get a cluster. There's also some definition of spin up order and so on, which is already taken care of with the extensions (declaration order). The other catch in our case is that we have way more code covered by JUnit 4, which doesn't have the same capabilities. That migration doesn't necessarily have to be done all at once, but it is a potential substantial pain point. Basically, I'd hope to push management of our abstraction further back into actual JUnit so they're get more reuse across runs and it's easily understood what a test needs and uses right up front. In our case, the InMemoryComponents likely map to extensions. If we did something like this, it potentially solves a few problems * Makes it easy to build an integration test that says "Give me these components". * Keeps it alive across any test that needs them * Only spins it up if there are tests running that do need them. Only spins up the necessary components. * Lower our build times. * Interaction with components remains primarily through the same methods you'd normally interact (you can build producers/consumers, or whatever). * IMO, easier to explain and have people use. I've had a couple people pretty easily pick up on it. I can't share the code, but essentially it looks like (And I'm sure email will butcher this, so I'm sorry in advance): @ExtendWith({ wrote: > Hi all, > > I wanted to start a discussion on something near and dear to all of our > hearts: The role of full-dev in our testing cycle. > > Right now, we require all PRs to have spun up the full-dev environment and > make sure that things flow through properly. In some cases, this is a > necessity, and in other cases, it's a fairly large burden on current and > potential contributors. > > So what do we need to do to reduce our dependence on full dev and increase > our confidence in our CI process? > > My initial thoughts on this encompass a few things > * Increasing our ability to easily write automated tests. In particular, I > think our integration testing is fairly hard and has some structural issues > (e.g. our tests spin up components per Test class, not for the overall test > suite being run, which also increases build times, etc.). How do we make > this easier and have less boilerplate? I have one potential idea I'll > reply to this thread with. > * Unit tests in general. I'd argue that any area we thing need full-dev > spun up to be confident in needs more testing. Does anyone have any > opinions on which areas we have the least confidence in? Which areas of > code are people afraid to touch? > * Our general procedures. Should we not be requiring full-dev on all PRs, > but maybe just asking for a justification why not (e.g. "I don't need > full-dev for this, because I added unit tests here and here and integration > tests for the these components?"). And then a reviewer can request a > full-dev spin up if needed? Could we stage this rollout (e.g. > metron-platform modules require it, but others don't, etc.?) This'll add > more pressure onto the release phase, so maybe that involves fleshing out a > checklist of critical path items to check. > * Do we need to improve our docs? Publish Javadocs? After all, if docs are > better, hopefully we'd see less issues introduced and be more confident in > changes coming in. > * What environment do we even want testing to occur in? > https://github.com/apache/metron/pull/1261 has seen a lot of work, and > getting it across the finish line may help make the dev environment less >
[DISCUSS] Full-dev role in PR testign
Hi all, I wanted to start a discussion on something near and dear to all of our hearts: The role of full-dev in our testing cycle. Right now, we require all PRs to have spun up the full-dev environment and make sure that things flow through properly. In some cases, this is a necessity, and in other cases, it's a fairly large burden on current and potential contributors. So what do we need to do to reduce our dependence on full dev and increase our confidence in our CI process? My initial thoughts on this encompass a few things * Increasing our ability to easily write automated tests. In particular, I think our integration testing is fairly hard and has some structural issues (e.g. our tests spin up components per Test class, not for the overall test suite being run, which also increases build times, etc.). How do we make this easier and have less boilerplate? I have one potential idea I'll reply to this thread with. * Unit tests in general. I'd argue that any area we thing need full-dev spun up to be confident in needs more testing. Does anyone have any opinions on which areas we have the least confidence in? Which areas of code are people afraid to touch? * Our general procedures. Should we not be requiring full-dev on all PRs, but maybe just asking for a justification why not (e.g. "I don't need full-dev for this, because I added unit tests here and here and integration tests for the these components?"). And then a reviewer can request a full-dev spin up if needed? Could we stage this rollout (e.g. metron-platform modules require it, but others don't, etc.?) This'll add more pressure onto the release phase, so maybe that involves fleshing out a checklist of critical path items to check. * Do we need to improve our docs? Publish Javadocs? After all, if docs are better, hopefully we'd see less issues introduced and be more confident in changes coming in. * What environment do we even want testing to occur in? https://github.com/apache/metron/pull/1261 has seen a lot of work, and getting it across the finish line may help make the dev environment less onerous, even if still needed.
Re: [DISCUSS] Metron Release - 0.7.1 next steps
Short version: I'm in favor of #2 of 0.7.1 and #1 as a blocker for 0.8.0. #3 seems like a total waste of time and effort. The wall of text version: I agree this isn't "just the wrong thing shown", but for completely different reasons. To be extremely clear about what the problem is: Our "dev" environment (whose very name implies the audience is develops) uses a performance-based advanced feature to ensure that all our of sample flows are regularly run and produce data. This feature has a bare minimal implementation to be enabled via Ambari, which it currently is by default. This is because of the limited resources available that previously resulted in us turning off Yaf, and therefore testing it during regular full dev runs. Right now however, this feature is not exposed through the management UI, and therefore it isn't obvious what the implications are. Am I missing anything here? For users actually choosing to use the parser aggregation feature in a non-full-dev environment, I'd expect substantially more care to be involved given the lack of easy configuration for it (after all, why would you bother running the aggregated parser alongside the regular parser? This could be more explicitly stated, but again that feels like a doc problem. Right now I could essentially provide two of the same parser and create the same problem, so right now aggregation is only special because it runs on dev by default). This is, in my opinion, primarily a first impression problem and likely one of many areas that could use improved documentation. Quite frankly, I think the issue pointed out here could mostly be resolved by documenting how the current aggregation is done in dev, and telling how to change it. Especially for a 0.x.1 release, which is primarily bug fixes. As can be inferred from my vote, I don't think this problem is a problem that needs solving in a point release. I would support improving the documentation, both full-dev and for aggregation in general for the 0.7.1 point release, while making a 0.8.0 release contingent upon the outstanding PRs to enable it in the management UI. There are a couple deeper issues, imo, that I care substantially more about than this in particular * The dev environment is being used as our intro for users, because it's convenient for us to not maintain more environments (which has been a major pain point in the past). Worse, the dev environment strongly implies it's for Metron developers, rather than people looking to build on top of Metron. We need an actual strategy for providing end users a clean impression of Metron (this could be clarifying what the expectations of full dev are, renaming it to something like "full-demo", something more involved, etc.). This is something that we've needed for awhile in general, and includes larger topics like improving our website, potentially improving the site book, actually publishing our Javadocs somewhere so people can develop things easier, publishing out info about Stellar functions in a better manner, etc. * The fact that parsers are handled in Ambari at all. It's awful and leads to situations like this. To the best of my knowledge, once we can do chaining and aggregation in the Management UI, we should be able to entirely divorce these two overlapping domains. I'd love to see parsers ripped out of Ambari, then full-dev manages all the setup via REST. At that point, we can easily tell everyone to just use the management UI. On Wed, May 1, 2019 at 7:23 AM Otto Fowler wrote: > I think it would help if the full consequences of having the UI show the > wrong status where listed. > > Someone trying metron, will, by default , see the wrong thing in the UI for > the ONLY sensors they have that are running and doing data. > > What happens when they try to start them to make them work? One, two or > all? > What happens when he edits them or try to add transformations? One, two or > all? > What other things can you do with the sensors in the ui? What happens? > > Are we recommending aggregation on the list and elsewhere for users? Are > we recommending something that is going to ensure they get into this > situation? > > I think this is more than ‘just the wrong thing shown’ in the ui. > > > > > On April 30, 2019 at 20:48:10, Michael Miklavcic ( > michael.miklav...@gmail.com) wrote: > > The vote for RC1 did not pass and I'd like to kickstart some discussion > about what we should do. > > I started taking a look at PR#1360 and it looks like this isn't quite as > close to being able go in as I had originally expected. I want to talk > about options here. It seems to me that we can: > > 1. Wait for PR#1360 to go in, but this is likely going to take more time > than originally anticipated > 2. Accept the issue in full dev, but add some notes in the developer > docs about the current feature gap and why sensors aren't showing status in > the management UI when aggregation is enabled. > 3. Find some other workable UI solution. > 4.
Re: [DISCUSS] Metron Release - 0.7.1 next steps
I think it would help if the full consequences of having the UI show the wrong status where listed. Someone trying metron, will, by default , see the wrong thing in the UI for the ONLY sensors they have that are running and doing data. What happens when they try to start them to make them work? One, two or all? What happens when he edits them or try to add transformations? One, two or all? What other things can you do with the sensors in the ui? What happens? Are we recommending aggregation on the list and elsewhere for users? Are we recommending something that is going to ensure they get into this situation? I think this is more than ‘just the wrong thing shown’ in the ui. On April 30, 2019 at 20:48:10, Michael Miklavcic ( michael.miklav...@gmail.com) wrote: The vote for RC1 did not pass and I'd like to kickstart some discussion about what we should do. I started taking a look at PR#1360 and it looks like this isn't quite as close to being able go in as I had originally expected. I want to talk about options here. It seems to me that we can: 1. Wait for PR#1360 to go in, but this is likely going to take more time than originally anticipated 2. Accept the issue in full dev, but add some notes in the developer docs about the current feature gap and why sensors aren't showing status in the management UI when aggregation is enabled. 3. Find some other workable UI solution. 4. Other option? All things considered, I'm personally leaning towards #2 in the short-term, but I think we should probably talk about this a bit before deciding what RC2 should be. Best, Mike