Thank you Ahmet, Brian, Robert and everyone else who spent time working on this. The pull requests are now merged and the website seems to be working as expected [1].
One minor issue that I have noticed is that the code blocks have a grey background, which makes it less accessible than before. I created a Jira issue for this [2], and will follow up to get it fixed. If you notice any other issues, please file Jira issues and let me know. Hope you are all safe, Aizhamal [1] https://beam.apache.org/ [2] https://issues.apache.org/jira/browse/BEAM-10001 On Thu, May 14, 2020 at 11:25 AM Pablo Estrada <pabl...@google.com> wrote: > Here's a zipped-up tree from a staged sample of the website: > https://drive.google.com/file/d/1LKL936tBJ79jpjvlL5vC5uYYwTHsWXiJ/view?usp=sharing > > I'd also suggest tagging the commit, so we can find the fist commit later > on for reference. I can push the tag after the PR is merged. > > On Thu, May 14, 2020 at 10:43 AM Ahmet Altay <al...@google.com> wrote: > >> >> >> On Thu, May 14, 2020 at 9:16 AM Aizhamal Nurmamat kyzy < >> aizha...@apache.org> wrote: >> >>> Thank you all for reviewing and validating this pull request. I see that >>> all tests are passing now, should we merge it? >>> >> >> +1 to merging now. >> >> Before the merge, please share a link to an archive copy of the old >> website. After the merge, please try out the live website see if it is >> working as expected. >> >> >>> >>> On Wed, May 13, 2020, 5:41 PM Ahmet Altay <al...@google.com> wrote: >>> >>>> Thank you! Let's merge it once tests are done. >>>> >>>> On Wed, May 13, 2020 at 5:23 PM Robert Bradshaw <rober...@google.com> >>>> wrote: >>>> >>>>> I took a (non-comprehensive) look at these as well, and didn't see any >>>>> issues, so am happy to sign off on this. Thanks Nam, Brian, Ahmet, and >>>>> everyone else. >>>>> >>>>> On Wed, May 13, 2020 at 7:58 AM Nam Bui <nam....@polidea.com> wrote: >>>>> >>>>>> Hi Ahmet, >>>>>> "Does this mean the internal links (e.g. contribute/team) will >>>>>> disappear?" >>>>>> Yes, I'd like to get rid of them. And to make sure it won't appear to >>>>>> confuse people, I replaced all of the spots using "contribute/team" with >>>>>> the external one. Currently, we only have 2 "redirect_to" links which are >>>>>> "contribute/team" & "contribute/project/team", so this act won't have any >>>>>> affects. >>>>>> Also, based on your question, I just added a section in the >>>>>> documentation (CONTRIBUTE.md), which mentions the replaced/removed >>>>>> features >>>>>> of Jekyll in terms of writing a new blog post or documentation in Hugo. >>>>>> >>>>> >>>> Got it. The main effect will be any one has a bookmark/link to these >>>> pages, those links will no longer work. It is fine if it is only limited to >>>> these 2 urls. >>>> >>>> >>>>> >>>>>> >>>>>> On Wed, May 13, 2020 at 4:17 AM Ahmet Altay <al...@google.com> wrote: >>>>>> >>>>>>> - I reviewed the diff output with Nam's explanations. The change >>>>>>> looks minimal. Large diffs are primarily coming from index and redirect >>>>>>> files. codeblocks have differences but the content is seemingly >>>>>>> preserved. >>>>>>> IIUC, the source of truth is snippet files anyway. (It would be good to >>>>>>> get >>>>>>> one more set of eyes on this.) >>>>>>> - Brian and I reviewed the infrastructure changes. They look >>>>>>> reasonable. >>>>>>> >>>>>>> I think PR is very close to a mergeable state. Especially if we can >>>>>>> get an archive copy of the current website, I will be comfortable with >>>>>>> the >>>>>>> merge. >>>>>>> >>>>>>> And, thank you Nam for your work so far. >>>>>>> >>>>>>> On Tue, May 12, 2020 at 4:13 PM Nam Bui <nam....@polidea.com> wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> A new commit covers Robert's script is pushed [1], and also the >>>>>>>> script output is attached in this email. >>>>>>>> >>>>>>>> Based on the diff output of the script, my strategy is looking at >>>>>>>> the sections which contain the large/massive removed texts, to make >>>>>>>> sure >>>>>>>> that there are no lost content or files. And below are all of the links >>>>>>>> which have large of the removed content. >>>>>>>> >>>>>>>> - Detection: >>>>>>>> These links lost some of the contents. Fixed! >>>>>>>> + documentation/runners/jstorm/index.html >>>>>>>> + documentation/dsls/sql/calcite/lexical-structure/index.html >>>>>>>> + documentation/dsls/sql/zetasql/data-types/index.html >>>>>>>> + documentation/dsls/sql/zetasql/query-syntax/index.html >>>>>>>> >>>>>>>> - Aliases: >>>>>>>> These links are redirected links. So in Hugo, these HTML files only >>>>>>>> include redirected URLs. I also took a look at them to ensure the >>>>>>>> content >>>>>>>> was there. >>>>>>>> + documentation/dsls/sql/calcite/lexical/index.html >>>>>>>> + old URLs of blog posts >>>>>>>> >>>>>>>> - Ignore: >>>>>>>> Hugo and Jekyll have different structures of code highlighters >>>>>>>> rendering in HTML. Ahmed & Pablo agree with me that its fair to ignore >>>>>>>> them >>>>>>>> for now. >>>>>>>> + codeblocks >>>>>>>> >>>>>>>> - Missing files: >>>>>>>> The script returns some of “missing files” status >>>>>>>> + coming-soon.html (this file was used nowhere in Jekyll, so I >>>>>>>> didn’t migrate to Hugo) >>>>>>>> + documentation/dsls/sql/statements/select/index.html (aliases) >>>>>>>> + blog/2019/04/25/beam-2.12.0.html (fixed!) >>>>>>>> + blog/2020/05/08/beam-summit-digital-2020.html (new blog post, >>>>>>>> added!) >>>>>>>> + v2/index.html (this file was used nowhere in Jekyll, so I didn’t >>>>>>>> migrate to Hugo) >>>>>>>> + contribute/team/index.html (mentioned in “redirect_to” below) >>>>>>>> + contribute/project/team/index.html (mentioned in “redirect_to” >>>>>>>> below) >>>>>>>> >>>>>>>> - “redirect_to”: >>>>>>>> In Jekyll, there is a feature called “redirect_to”. For instance, >>>>>>>> you click on an internal link “contribute/team/” to reach the markdown >>>>>>>> “team.md”, then from the markdown file, it redirects you to the >>>>>>>> external >>>>>>>> URL “https://example.com”. >>>>>>>> However, there is no such feature in Hugo. My solution is to >>>>>>>> directly replace “contribute/team/” with “https://example.com”. >>>>>>>> >>>>>>> >>>>>>> Does this mean the internal links (e.g. contribute/team) will >>>>>>> disappear? >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> [1] https://github.com/apache/beam/pull/11554 >>>>>>>> >>>>>>>> On Mon, May 11, 2020 at 7:34 PM Nam Bui <nam....@polidea.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Updates for today: >>>>>>>>> - Thanks Brian & Ahmet for your reviews. I left my comments for >>>>>>>>> some of the questions and also adapted new changes to the reviews [1]. >>>>>>>>> - I see that the new blog post was merged yesterday, so I added it >>>>>>>>> to the PR as well. >>>>>>>>> >>>>>>>>> I briefly tried the script from Robert with the input of build >>>>>>>>> files from old and new websites. It seemed to work well in terms of >>>>>>>>> detecting missing files (or probably wrong links leading to missing >>>>>>>>> files). >>>>>>>>> I will push another commit to fix all that up, hope can be tomorrow. >>>>>>>>> >>>>>>>>> [1] >>>>>>>>> https://github.com/apache/beam/pull/11554#issuecomment-626792031 >>>>>>>>> >>>>>>>>> Best regards, >>>>>>>>> Nam >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, May 11, 2020 at 9:01 AM Nam Bui <nam....@polidea.com> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Hi, >>>>>>>>>> >>>>>>>>>> @Ahmet: Yeah, it's all clear to me. :) >>>>>>>>>> @Robert: Thanks for your ideas and also the script. It really >>>>>>>>>> helps me to serve my works. >>>>>>>>>> >>>>>>>>>> Best regard! >>>>>>>>>> >>>>>>>>>> On Sat, May 9, 2020 at 2:10 AM Ahmet Altay <al...@google.com> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> This sounds reasonable to me. Thank you. Nam, does it make sense >>>>>>>>>>> to you? >>>>>>>>>>> >>>>>>>>>>> On Fri, May 8, 2020 at 11:53 AM Robert Bradshaw < >>>>>>>>>>> rober...@google.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> I'd really like to not see this work go to waste, both the >>>>>>>>>>>> original revision, the further efforts Nam has done in making it >>>>>>>>>>>> more >>>>>>>>>>>> manageable to review, and the work put into reviewing this so far, >>>>>>>>>>>> so we >>>>>>>>>>>> can get the benefits of being on Hugo. How about this for a >>>>>>>>>>>> concrete proposal: >>>>>>>>>>>> >>>>>>>>>>>> (1) We get "standard" approval from one or more committers for >>>>>>>>>>>> the infrastructure changes, just as with any other PR. Brian has >>>>>>>>>>>> already started this, but if others could step up as well that'd >>>>>>>>>>>> be great. >>>>>>>>>>>> >>>>>>>>>>>> (2) Reviewers (and authors) typically count on (or request) >>>>>>>>>>>> sufficient automated test coverage to augment the fact that their >>>>>>>>>>>> eyeballs >>>>>>>>>>>> are fallible, which is something that is missing here (and given >>>>>>>>>>>> the size >>>>>>>>>>>> of the change not easily compensated for by a more detailed manual >>>>>>>>>>>> review). >>>>>>>>>>>> How about we use the script above (or similar) as an automated >>>>>>>>>>>> test to >>>>>>>>>>>> validate the website's contents haven't (materially) changed. I >>>>>>>>>>>> feel we've >>>>>>>>>>>> validated enough that the style looks good via spot checking >>>>>>>>>>>> (which is >>>>>>>>>>>> something that should work on all pages if it works on one). The >>>>>>>>>>>> diff >>>>>>>>>>>> between the current site and the newly generated site should be >>>>>>>>>>>> empty (it >>>>>>>>>>>> might already be [1]), or at least we should get a stamp of >>>>>>>>>>>> approval on the >>>>>>>>>>>> plain-text diff (which should be small), before merging. >>>>>>>>>>>> >>>>>>>>>>>> (3) To make things easier, everyone holds off on making any >>>>>>>>>>>> changes to the old site until a fixed future date (say, next >>>>>>>>>>>> Wednesday). >>>>>>>>>>>> Hopefully we can get it merged by then. If not, a condition for >>>>>>>>>>>> merging >>>>>>>>>>>> would be a commitment incorporating new changes after this date. >>>>>>>>>>>> >>>>>>>>>>>> Does this sound reasonable? >>>>>>>>>>>> >>>>>>>>>>>> - Robert >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> [1] I'd be curious as to how small the diff already is, but my >>>>>>>>>>>> script relies on local directories with the generated HTML, which >>>>>>>>>>>> I don't >>>>>>>>>>>> have handy at the moment. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Fri, May 8, 2020 at 10:45 AM Robert Bradshaw < >>>>>>>>>>>> rober...@google.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Here's a script that we could run on the old and new sites >>>>>>>>>>>>> that should quickly catch any major issues but not get caught up >>>>>>>>>>>>> in >>>>>>>>>>>>> formatting minutia. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Fri, May 8, 2020 at 10:23 AM Robert Bradshaw < >>>>>>>>>>>>> rober...@google.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, May 8, 2020 at 9:58 AM Aizhamal Nurmamat kyzy < >>>>>>>>>>>>>> aizha...@apache.org> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I understand the difficulty, and this certainly comes with >>>>>>>>>>>>>>> lessons learned for future similar projects. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> To your questions Robert: >>>>>>>>>>>>>>> (1 and 2) I will commit to review the text in the resulting >>>>>>>>>>>>>>> pages. I will try and use some automation to extract visible >>>>>>>>>>>>>>> text from each >>>>>>>>>>>>>>> page and diff it with the current state of the website. I can >>>>>>>>>>>>>>> do this >>>>>>>>>>>>>>> starting next week. From some quick research, there seem to be >>>>>>>>>>>>>>> tools that >>>>>>>>>>>>>>> help with this analysis ( >>>>>>>>>>>>>>> https://stackoverflow.com/questions/3286955/compare-two-websites-and-see-if-they-are-equal >>>>>>>>>>>>>>> ) >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> At first glance it looks like these tools would give diffs >>>>>>>>>>>>>> that are *larger* than the 47K one we're struggling to review >>>>>>>>>>>>>> here. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> By remaining in this state, we hold others up from making >>>>>>>>>>>>>>> changes, or we increase the amount of work needed after merging >>>>>>>>>>>>>>> to port >>>>>>>>>>>>>>> over changes that may be missed. If we move forward, new >>>>>>>>>>>>>>> changes can be >>>>>>>>>>>>>>> done on top of the new website. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I agree we don't want to hold others up from making changes. >>>>>>>>>>>>>> However, the amount of work to port changes over seems small in >>>>>>>>>>>>>> comparison >>>>>>>>>>>>>> to everything else that is being discussed here. (It also >>>>>>>>>>>>>> provides good >>>>>>>>>>>>>> incentives to reach the bar quickly and has the advantage of >>>>>>>>>>>>>> falling on the >>>>>>>>>>>>>> right people.) (3) will still take some time. >>>>>>>>>>>>>> >>>>>>>>>>>>>> If we go this route, we're lowering the bar for doc changes, >>>>>>>>>>>>>> but not removing it. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> (3) This makes sense. Brian, would you be able to spend some >>>>>>>>>>>>>>> time to look at the automation changes (build files and >>>>>>>>>>>>>>> scripts) to ensure >>>>>>>>>>>>>>> they look fine? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I would also like to write a post mortem to extract lessons >>>>>>>>>>>>>>> learned and avoid this situation in the future. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Fri, May 8, 2020 at 9:44 AM Brian Hulette < >>>>>>>>>>>>>>> bhule...@google.com> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I'm -0 on merging as-is. I have the same concerns as Robert >>>>>>>>>>>>>>>> and he's voiced them very well so I won't waste time re-airing >>>>>>>>>>>>>>>> them. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> (2) I spot checked the content, pulled out some common >>>>>>>>>>>>>>>>> patterns, and >>>>>>>>>>>>>>>>> it mostly looks good, but there were also some issues >>>>>>>>>>>>>>>>> (e.g. several >>>>>>>>>>>>>>>>> pages were replaced with the contents from entirely >>>>>>>>>>>>>>>>> different pages). >>>>>>>>>>>>>>>>> I would be more comfortable if, say, a smoke test of >>>>>>>>>>>>>>>>> comparing the old >>>>>>>>>>>>>>>>> and new sites, with html tags stripped and ignoring >>>>>>>>>>>>>>>>> whitespace, >>>>>>>>>>>>>>>>> yielded what should be empty diffs. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Can you share any details about this analysis? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> It was basically paging through the diff, adding things to >>>>>>>>>>>>>> the sed script, and then looking at more diffs. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>>> +1 for verifying the old and new are the same by diffing the >>>>>>>>>>>>>>>> output. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> (3) It'd be good to have someone give a stamp of approval >>>>>>>>>>>>>>>>> on the >>>>>>>>>>>>>>>>> infrastructure changes, at least to validate that we're >>>>>>>>>>>>>>>>> not going to >>>>>>>>>>>>>>>>> be taking on extra tech debt with regard to jenkins >>>>>>>>>>>>>>>>> stability and >>>>>>>>>>>>>>>>> developer workflow. I see that Brian has at least looked >>>>>>>>>>>>>>>>> at this some. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> My involvement so far was just recognizing a problem >>>>>>>>>>>>>>>> (creating root-owned files on jenkins workers) and helping to >>>>>>>>>>>>>>>> fix it. If >>>>>>>>>>>>>>>> there's anyone available who's familiar with the website >>>>>>>>>>>>>>>> infrastructure it >>>>>>>>>>>>>>>> would be great if they could take a look instead (if not I >>>>>>>>>>>>>>>> could probably >>>>>>>>>>>>>>>> acquaint myself enough to review). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Thu, May 7, 2020 at 11:57 PM Robert Bradshaw < >>>>>>>>>>>>>>>> rober...@google.com> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> This is a tough situation. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> It would have been much better if this transition was >>>>>>>>>>>>>>>>> structured in >>>>>>>>>>>>>>>>> such a way that the review was more manageable (e.g. the >>>>>>>>>>>>>>>>> suggestion of >>>>>>>>>>>>>>>>> scripts, not mixing in voluminous unnecessary changes like >>>>>>>>>>>>>>>>> whitespace, >>>>>>>>>>>>>>>>> and not updating content), and possibly even incrementally >>>>>>>>>>>>>>>>> (e.g. the >>>>>>>>>>>>>>>>> new site would have been developed over multiple PRs in a >>>>>>>>>>>>>>>>> subdomain or >>>>>>>>>>>>>>>>> subdirectory while being worked on). But hindsight is >>>>>>>>>>>>>>>>> 20/20 and no >>>>>>>>>>>>>>>>> one, myself included, thought to bring this up when the >>>>>>>>>>>>>>>>> original >>>>>>>>>>>>>>>>> migration was proposed, so this is more something to keep >>>>>>>>>>>>>>>>> in mind for >>>>>>>>>>>>>>>>> the future. I also appreciate the efforts that have been >>>>>>>>>>>>>>>>> made to clean >>>>>>>>>>>>>>>>> things up (e.g. preserving history) and address feedback. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> So, where do we go from here? My first thought is that I >>>>>>>>>>>>>>>>> really don't >>>>>>>>>>>>>>>>> want to set a precedent that just because a PR "will >>>>>>>>>>>>>>>>> require a large >>>>>>>>>>>>>>>>> effort" and in a state that if we don't "move forward and >>>>>>>>>>>>>>>>> merge what >>>>>>>>>>>>>>>>> we have now" then "work done so far will be lost" means >>>>>>>>>>>>>>>>> that we think >>>>>>>>>>>>>>>>> it's OK to forgo doing a proper review. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On the other hand, there are some mitigating factors with >>>>>>>>>>>>>>>>> this being >>>>>>>>>>>>>>>>> the website and not the code in that "bugs," though >>>>>>>>>>>>>>>>> possibly >>>>>>>>>>>>>>>>> embarrassing, won't break production pipelines or data >>>>>>>>>>>>>>>>> loss, and >>>>>>>>>>>>>>>>> though the source is technically part of the release, when >>>>>>>>>>>>>>>>> we find >>>>>>>>>>>>>>>>> something to fix we can fix the live website much more >>>>>>>>>>>>>>>>> quickly than go >>>>>>>>>>>>>>>>> through the whole release process and convince people to >>>>>>>>>>>>>>>>> upgrade. (I >>>>>>>>>>>>>>>>> recognize accepting this argument is, to some degree at >>>>>>>>>>>>>>>>> least, saying >>>>>>>>>>>>>>>>> that we don't care about the correctness of docs as much >>>>>>>>>>>>>>>>> as so-called >>>>>>>>>>>>>>>>> "real" code, if we go there.) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> If we decide to go ahead and merge (and I would not >>>>>>>>>>>>>>>>> object), there are >>>>>>>>>>>>>>>>> some things I would like to see. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> (1) I would like to understand what we would do afterwards >>>>>>>>>>>>>>>>> to "review >>>>>>>>>>>>>>>>> the outcome, and ensure that all the content is there," >>>>>>>>>>>>>>>>> and why it >>>>>>>>>>>>>>>>> can't be done before merging instead. (Is it because it'd >>>>>>>>>>>>>>>>> take time >>>>>>>>>>>>>>>>> and we don't want to incorporate changes that are made to >>>>>>>>>>>>>>>>> the website >>>>>>>>>>>>>>>>> in the meantime? I think that boat has sailed, but maybe >>>>>>>>>>>>>>>>> we can avoid >>>>>>>>>>>>>>>>> making it worse...) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> (2) I spot checked the content, pulled out some common >>>>>>>>>>>>>>>>> patterns, and >>>>>>>>>>>>>>>>> it mostly looks good, but there were also some issues >>>>>>>>>>>>>>>>> (e.g. several >>>>>>>>>>>>>>>>> pages were replaced with the contents from entirely >>>>>>>>>>>>>>>>> different pages). >>>>>>>>>>>>>>>>> I would be more comfortable if, say, a smoke test of >>>>>>>>>>>>>>>>> comparing the old >>>>>>>>>>>>>>>>> and new sites, with html tags stripped and ignoring >>>>>>>>>>>>>>>>> whitespace, >>>>>>>>>>>>>>>>> yielded what should be empty diffs. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> (3) It'd be good to have someone give a stamp of approval >>>>>>>>>>>>>>>>> on the >>>>>>>>>>>>>>>>> infrastructure changes, at least to validate that we're >>>>>>>>>>>>>>>>> not going to >>>>>>>>>>>>>>>>> be taking on extra tech debt with regard to jenkins >>>>>>>>>>>>>>>>> stability and >>>>>>>>>>>>>>>>> developer workflow. I see that Brian has at least looked >>>>>>>>>>>>>>>>> at this some. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> - Robert >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Thu, May 7, 2020 at 12:40 PM Aizhamal Nurmamat kyzy >>>>>>>>>>>>>>>>> <aizha...@apache.org> wrote: >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > Thank you Ahmet. >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > Robert/Brian, what do you think? >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > The website staging and pre commit tests have passed >>>>>>>>>>>>>>>>> [1]. If nobody has objections, we could merge it soon. >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > [1] https://github.com/apache/beam/pull/11554 >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > On Thu, May 7, 2020 at 11:38 AM Ahmet Altay < >>>>>>>>>>>>>>>>> al...@google.com> wrote: >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> On Thu, May 7, 2020 at 10:50 AM Aizhamal Nurmamat kyzy < >>>>>>>>>>>>>>>>> aizha...@apache.org> wrote: >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> Thanks for the writeup Ahmet. >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> My bias is to move forward and merge the PR. After >>>>>>>>>>>>>>>>> this, we'll review the outcome, and ensure that all the >>>>>>>>>>>>>>>>> content is there. >>>>>>>>>>>>>>>>> Nam will help us with that. >>>>>>>>>>>>>>>>> >>> The reason that I'd like to move forward and merge >>>>>>>>>>>>>>>>> what we have now - is that if we don't do that, the work done >>>>>>>>>>>>>>>>> so far will >>>>>>>>>>>>>>>>> be lost. >>>>>>>>>>>>>>>>> >>> We'll make sure to stage the website in its current >>>>>>>>>>>>>>>>> state, and use that as reference/archive to ensure all the >>>>>>>>>>>>>>>>> content have >>>>>>>>>>>>>>>>> been moved. >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> Is this reasonable to everyone? >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> This is reasonable to me. I agree with your reasons. >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >> What do others think? >>>>>>>>>>>>>>>>> >> >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> >>>>>>>>>>>>>>>>> >>> On Wed, May 6, 2020 at 7:07 PM Ahmet Altay < >>>>>>>>>>>>>>>>> al...@google.com> wrote: >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> On Wed, May 6, 2020 at 2:33 PM Aizhamal Nurmamat kyzy >>>>>>>>>>>>>>>>> <aizha...@apache.org> wrote: >>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>> >>>>>> > 1) Currently, the main blocker for merging is >>>>>>>>>>>>>>>>> Staging Test Failures. >>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>> >>>>>> That and finishing the review. (Is someone >>>>>>>>>>>>>>>>> tracking/coordinating this?) >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> I am coordinating the work on the failed tests, but >>>>>>>>>>>>>>>>> I would need other committer's help to perform the review. >>>>>>>>>>>>>>>>> @Ahmet, could >>>>>>>>>>>>>>>>> you help us prioritize the review for this PR? >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> The problem is there are too many manual changes. >>>>>>>>>>>>>>>>> Reviewing this change in this form will require a large >>>>>>>>>>>>>>>>> effort. I do not >>>>>>>>>>>>>>>>> think I can interrupt other projects to prioritize reviews on >>>>>>>>>>>>>>>>> this PR. IMO, >>>>>>>>>>>>>>>>> we have a few options: >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> - PR to be restructured in the format suggested in >>>>>>>>>>>>>>>>> this thread. A commit for infrastructure changes from Jekyll >>>>>>>>>>>>>>>>> to hugo. A >>>>>>>>>>>>>>>>> second commit for a script that will convert the majority of >>>>>>>>>>>>>>>>> the content. A >>>>>>>>>>>>>>>>> third commit for the execution of the script. And a fourth >>>>>>>>>>>>>>>>> commit for the >>>>>>>>>>>>>>>>> additional manual content changes. If Nam can get to this >>>>>>>>>>>>>>>>> form, people on >>>>>>>>>>>>>>>>> this thread myself/Robert/Pablo/Brian can review the changes. >>>>>>>>>>>>>>>>> >>>> - Another option is, we can accept that we already >>>>>>>>>>>>>>>>> invested in this transition and overall this is a good >>>>>>>>>>>>>>>>> change, and merge >>>>>>>>>>>>>>>>> the PR more or less in its current form (with tests fixed and >>>>>>>>>>>>>>>>> open comments >>>>>>>>>>>>>>>>> addressed) even though it has issues. And then overtime fix >>>>>>>>>>>>>>>>> the issues we >>>>>>>>>>>>>>>>> encounter. There was already some amount of review and visual >>>>>>>>>>>>>>>>> comparisons, >>>>>>>>>>>>>>>>> we risk losing some recent content changes but I am assuming >>>>>>>>>>>>>>>>> this will not >>>>>>>>>>>>>>>>> be much. If Nam can commit to compare two sites after a >>>>>>>>>>>>>>>>> merge, fixing the >>>>>>>>>>>>>>>>> majority of the delta, this might be a viable option. >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>> Another thing we can do, we can archive/store a >>>>>>>>>>>>>>>>> read-only copy of the current website in an "archive" url >>>>>>>>>>>>>>>>> temporarily >>>>>>>>>>>>>>>>> instead of completely deleting it. It will give us a baseline >>>>>>>>>>>>>>>>> for a while >>>>>>>>>>>>>>>>> to go back to the old content and move any missing data. (And >>>>>>>>>>>>>>>>> maybe, >>>>>>>>>>>>>>>>> someone can come up with an innovative way to compare the >>>>>>>>>>>>>>>>> textual content >>>>>>>>>>>>>>>>> of both sites.) A note on the stop world approach, I believe >>>>>>>>>>>>>>>>> we are already >>>>>>>>>>>>>>>>> failing on that with merge conflicts showing up on the PR. It >>>>>>>>>>>>>>>>> will be >>>>>>>>>>>>>>>>> better for us to complete the transition as soon as possible. >>>>>>>>>>>>>>>>> Fixing after >>>>>>>>>>>>>>>>> the initial merge might be a simpler task, especially if we >>>>>>>>>>>>>>>>> can archive the >>>>>>>>>>>>>>>>> old site. >>>>>>>>>>>>>>>>> >>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>> >>>>>>>>>>>>>>>>> >>>>>> >>>>>>>>>>>>>>>>> >>>>>> > Michal showed Nam how to handle the 1st test >>>>>>>>>>>>>>>>> which was about Apache License missing. >>>>>>>>>>>>>>>>> >>>>>> > >>>>>>>>>>>>>>>>> >>>>>> > However, the 2nd and 3rd tests looked like some >>>>>>>>>>>>>>>>> kind of permissions error on the Jenkins worker, not to be >>>>>>>>>>>>>>>>> configured by >>>>>>>>>>>>>>>>> code. For more details based on Jenkin logs, the 2nd test >>>>>>>>>>>>>>>>> failed because of >>>>>>>>>>>>>>>>> website/www/site/themes and the 3rd test failed because of >>>>>>>>>>>>>>>>> website/www/node_modules, they are both auto-generated files >>>>>>>>>>>>>>>>> on build. Can >>>>>>>>>>>>>>>>> someone help Nam to look into this? >>>>>>>>>>>>>>>>> >>>>>> > >>>>>>>>>>>>>>>>> >>>>>> > RAT ("Run RAT PreCommit") — FAILURE >>>>>>>>>>>>>>>>> >>>>>> > Website_Stage_GCS ("Run Website_Stage_GCS >>>>>>>>>>>>>>>>> PreCommit") — FAILURE >>>>>>>>>>>>>>>>> >>>>>> > Website_Stage_GCS ("Run Website_Stage_GCS >>>>>>>>>>>>>>>>> PreCommit") — FAILURE >>>>>>>>>>>>>>>>> >>>>>> > >>>>>>>>>>>>>>>>> >>>>>> > 2) Are there any other blockers for merging? >>>>>>>>>>>>>>>>> @Ahmet/Robert/others please share if there are any other >>>>>>>>>>>>>>>>> blockers. >>>>>>>>>>>>>>>>> >>>>>> > >>>>>>>>>>>>>>>>> >>>>>> > >>>>>>>>>>>>>>>>> >>>>>> > [1] https://github.com/gohugoio/hugo/pull/4494 >>>>>>>>>>>>>>>>> >>>>>> > >>>>>>>>>>>>>>>>> >>>>>> > >>>>>>>>>>>>>>>>> >>>>>> > On Wed, May 6, 2020 at 10:19 AM Robert Bradshaw < >>>>>>>>>>>>>>>>> rober...@google.com> wrote: >>>>>>>>>>>>>>>>> >>>>>> >> >>>>>>>>>>>>>>>>> >>>>>> >> On Mon, May 4, 2020 at 7:07 PM Ahmet Altay < >>>>>>>>>>>>>>>>> al...@google.com> wrote: >>>>>>>>>>>>>>>>> >>>>>> >> > >>>>>>>>>>>>>>>>> >>>>>> >> >> On Mon, May 4, 2020 at 6:30 PM Robert >>>>>>>>>>>>>>>>> Bradshaw <rober...@google.com> wrote: >>>>>>>>>>>>>>>>> >>>>>> >> >>> >>>>>>>>>>>>>>>>> >>>>>> >> >>> I took the massive commit and split it up >>>>>>>>>>>>>>>>> into: >>>>>>>>>>>>>>>>> >>>>>> >> >>> >>>>>>>>>>>>>>>>> >>>>>> >> >>> (1) Infrastructure changes (basically >>>>>>>>>>>>>>>>> everything outside of >>>>>>>>>>>>>>>>> >>>>>> >> >>> (website/www/site/content) >>>>>>>>>>>>>>>>> >>>>>> >> >>> (2) Sed script changes, and >>>>>>>>>>>>>>>>> >>>>>> >> >>> (3) Manual changes (everything not in (1) >>>>>>>>>>>>>>>>> and (2)). >>>>>>>>>>>>>>>>> >>>>>> >> > >>>>>>>>>>>>>>>>> >>>>>> >> > >>>>>>>>>>>>>>>>> >>>>>> >> > Thank you Robert. This makes it much easier. >>>>>>>>>>>>>>>>> What is the source of the sed script? I am not sure why some >>>>>>>>>>>>>>>>> of those lines >>>>>>>>>>>>>>>>> are there. It would be much easier for us to comment on the >>>>>>>>>>>>>>>>> script source >>>>>>>>>>>>>>>>> if it is reviewable somewhere. >>>>>>>>>>>>>>>>> >>>>>> >> >>>>>>>>>>>>>>>>> >>>>>> >> I just gathered up common patterns as I was >>>>>>>>>>>>>>>>> trying to go through and >>>>>>>>>>>>>>>>> >>>>>> >> review the files... Mostly it was an exercise in >>>>>>>>>>>>>>>>> finding a compact >>>>>>>>>>>>>>>>> >>>>>> >> representation for the delta, not trying to be a >>>>>>>>>>>>>>>>> perfect conversion. >>>>>>>>>>>>>>>>> >>>>>> >> (I do think in retrospect, if we do something >>>>>>>>>>>>>>>>> like this again, it >>>>>>>>>>>>>>>>> >>>>>> >> would be preferable to commit a script that does >>>>>>>>>>>>>>>>> the auto-conversion >>>>>>>>>>>>>>>>> >>>>>> >> (maybe even with some patch files for manual >>>>>>>>>>>>>>>>> changes) both for ease of >>>>>>>>>>>>>>>>> >>>>>> >> reviewing and to avoid the stop-the-world >>>>>>>>>>>>>>>>> situation we're in now. (I'm >>>>>>>>>>>>>>>>> >>>>>> >> still worried that some changes will get lost in >>>>>>>>>>>>>>>>> the shuffle.) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>