Here's a zipped-up tree from a staged sample of the website: https://drive.google.com/file/d/1LKL936tBJ79jpjvlL5vC5uYYwTHsWXiJ/view?usp=sharing
I'd also suggest tagging the commit, so we can find the first commit later on for reference. I can push the tag after the PR is merged.

On Thu, May 14, 2020 at 10:43 AM Ahmet Altay <al...@google.com> wrote:

> On Thu, May 14, 2020 at 9:16 AM Aizhamal Nurmamat kyzy <aizha...@apache.org> wrote:
>
>> Thank you all for reviewing and validating this pull request. I see that all tests are passing now, should we merge it?
>
> +1 to merging now.
>
> Before the merge, please share a link to an archive copy of the old website. After the merge, please try out the live website to see if it is working as expected.
>
>> On Wed, May 13, 2020, 5:41 PM Ahmet Altay <al...@google.com> wrote:
>>
>>> Thank you! Let's merge it once tests are done.
>>>
>>> On Wed, May 13, 2020 at 5:23 PM Robert Bradshaw <rober...@google.com> wrote:
>>>
>>>> I took a (non-comprehensive) look at these as well, and didn't see any issues, so am happy to sign off on this. Thanks Nam, Brian, Ahmet, and everyone else.
>>>>
>>>> On Wed, May 13, 2020 at 7:58 AM Nam Bui <nam....@polidea.com> wrote:
>>>>
>>>>> Hi Ahmet,
>>>>> "Does this mean the internal links (e.g. contribute/team) will disappear?"
>>>>> Yes, I'd like to get rid of them. To make sure they don't confuse people, I replaced every use of "contribute/team" with the external link. Currently, we only have 2 "redirect_to" links, "contribute/team" and "contribute/project/team", so this won't have any other effects.
>>>>> Also, based on your question, I just added a section to the documentation (CONTRIBUTE.md) that lists the Jekyll features replaced or removed in Hugo for writing a new blog post or documentation.
>>>
>>> Got it. The main effect will be that if anyone has a bookmark/link to these pages, those links will no longer work. It is fine if it is limited to these 2 URLs.
>>>>>
>>>>> On Wed, May 13, 2020 at 4:17 AM Ahmet Altay <al...@google.com> wrote:
>>>>>
>>>>>> - I reviewed the diff output with Nam's explanations. The change looks minimal. Large diffs are primarily coming from index and redirect files. Code blocks have differences but the content is seemingly preserved. IIUC, the source of truth is the snippet files anyway. (It would be good to get one more set of eyes on this.)
>>>>>> - Brian and I reviewed the infrastructure changes. They look reasonable.
>>>>>>
>>>>>> I think the PR is very close to a mergeable state. Especially if we can get an archive copy of the current website, I will be comfortable with the merge.
>>>>>>
>>>>>> And, thank you Nam for your work so far.
>>>>>>
>>>>>> On Tue, May 12, 2020 at 4:13 PM Nam Bui <nam....@polidea.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> A new commit covering Robert's script has been pushed [1], and the script output is attached to this email.
>>>>>>>
>>>>>>> Based on the diff output of the script, my strategy is to look at the sections with large amounts of removed text, to make sure that no content or files were lost. Below are all of the links with a large amount of removed content.
>>>>>>>
>>>>>>> - Detection:
>>>>>>> These links lost some of their content. Fixed!
>>>>>>> + documentation/runners/jstorm/index.html
>>>>>>> + documentation/dsls/sql/calcite/lexical-structure/index.html
>>>>>>> + documentation/dsls/sql/zetasql/data-types/index.html
>>>>>>> + documentation/dsls/sql/zetasql/query-syntax/index.html
>>>>>>>
>>>>>>> - Aliases:
>>>>>>> These links are redirects, so in Hugo these HTML files only contain the redirect URLs. I also took a look at them to ensure the content was there.
>>>>>>> + documentation/dsls/sql/calcite/lexical/index.html
>>>>>>> + old URLs of blog posts
>>>>>>>
>>>>>>> - Ignore:
>>>>>>> Hugo and Jekyll render code highlighting with different HTML structures. Ahmet & Pablo agree with me that it's fair to ignore them for now.
>>>>>>> + codeblocks
>>>>>>>
>>>>>>> - Missing files:
>>>>>>> The script reports a “missing file” status for some files:
>>>>>>> + coming-soon.html (this file was used nowhere in Jekyll, so I didn’t migrate it to Hugo)
>>>>>>> + documentation/dsls/sql/statements/select/index.html (aliases)
>>>>>>> + blog/2019/04/25/beam-2.12.0.html (fixed!)
>>>>>>> + blog/2020/05/08/beam-summit-digital-2020.html (new blog post, added!)
>>>>>>> + v2/index.html (this file was used nowhere in Jekyll, so I didn’t migrate it to Hugo)
>>>>>>> + contribute/team/index.html (mentioned in “redirect_to” below)
>>>>>>> + contribute/project/team/index.html (mentioned in “redirect_to” below)
>>>>>>>
>>>>>>> - “redirect_to”:
>>>>>>> In Jekyll, there is a feature called “redirect_to”. For instance, you click on an internal link “contribute/team/” to reach the markdown file “team.md”, and the markdown file then redirects you to the external URL “https://example.com”.
>>>>>>> However, there is no such feature in Hugo. My solution is to directly replace “contribute/team/” with “https://example.com”.
>>>>>>
>>>>>> Does this mean the internal links (e.g. contribute/team) will disappear?
>>>>>>
>>>>>>> [1] https://github.com/apache/beam/pull/11554
>>>>>>>
>>>>>>> On Mon, May 11, 2020 at 7:34 PM Nam Bui <nam....@polidea.com> wrote:
>>>>>>>
>>>>>>>> Updates for today:
>>>>>>>> - Thanks Brian & Ahmet for your reviews. I replied to some of the questions and updated the PR to address the reviews [1].
>>>>>>>> - I see that the new blog post was merged yesterday, so I added it to the PR as well.
>>>>>>>>
>>>>>>>> I briefly tried Robert's script on the build output of the old and new websites. It seemed to work well at detecting missing files (or possibly wrong links leading to missing files). I will push another commit to fix all that up, hopefully tomorrow.
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/beam/pull/11554#issuecomment-626792031
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Nam
>>>>>>>>
>>>>>>>> On Mon, May 11, 2020 at 9:01 AM Nam Bui <nam....@polidea.com> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> @Ahmet: Yeah, it's all clear to me. :)
>>>>>>>>> @Robert: Thanks for your ideas and also the script. It really helps me with my work.
>>>>>>>>>
>>>>>>>>> Best regards!
>>>>>>>>>
>>>>>>>>> On Sat, May 9, 2020 at 2:10 AM Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> This sounds reasonable to me. Thank you. Nam, does it make sense to you?
>>>>>>>>>>
>>>>>>>>>> On Fri, May 8, 2020 at 11:53 AM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I'd really like to not see this work go to waste: the original revision, the further efforts Nam has made to make it more manageable to review, and the work put into reviewing this so far, so we can get the benefits of being on Hugo. How about this for a concrete proposal:
>>>>>>>>>>>
>>>>>>>>>>> (1) We get "standard" approval from one or more committers for the infrastructure changes, just as with any other PR. Brian has already started this, but if others could step up as well that'd be great.
>>>>>>>>>>>
>>>>>>>>>>> (2) Reviewers (and authors) typically count on (or request) sufficient automated test coverage to offset the fact that their eyeballs are fallible, which is something that is missing here (and, given the size of the change, not easily compensated for by a more detailed manual review). How about we use the script above (or similar) as an automated test to validate that the website's contents haven't (materially) changed? I feel we've validated enough that the style looks good via spot checking (which is something that should work on all pages if it works on one). The diff between the current site and the newly generated site should be empty (it might already be [1]), or at least we should get a stamp of approval on the plain-text diff (which should be small), before merging.
>>>>>>>>>>>
>>>>>>>>>>> (3) To make things easier, everyone holds off on making any changes to the old site until a fixed future date (say, next Wednesday). Hopefully we can get it merged by then. If not, a condition for merging would be a commitment to incorporating new changes made after this date.
>>>>>>>>>>>
>>>>>>>>>>> Does this sound reasonable?
>>>>>>>>>>>
>>>>>>>>>>> - Robert
>>>>>>>>>>>
>>>>>>>>>>> [1] I'd be curious as to how small the diff already is, but my script relies on local directories with the generated HTML, which I don't have handy at the moment.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 8, 2020 at 10:45 AM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Here's a script that we could run on the old and new sites that should quickly catch any major issues but not get caught up in formatting minutiae.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 8, 2020 at 10:23 AM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, May 8, 2020 at 9:58 AM Aizhamal Nurmamat kyzy <aizha...@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I understand the difficulty, and this certainly comes with lessons learned for future similar projects.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> To your questions Robert:
>>>>>>>>>>>>>> (1 and 2) I will commit to reviewing the text in the resulting pages. I will try to use some automation to extract visible text from each page and diff it with the current state of the website. I can do this starting next week. From some quick research, there seem to be tools that help with this analysis (https://stackoverflow.com/questions/3286955/compare-two-websites-and-see-if-they-are-equal)
>>>>>>>>>>>>>
>>>>>>>>>>>>> At first glance it looks like these tools would give diffs that are *larger* than the 47K one we're struggling to review here.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> By remaining in this state, we hold others up from making changes, or we increase the amount of work needed after merging to port over changes that may be missed. If we move forward, new changes can be done on top of the new website.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I agree we don't want to hold others up from making changes. However, the amount of work to port changes over seems small in comparison to everything else that is being discussed here. (It also provides good incentives to reach the bar quickly and has the advantage of falling on the right people.) (3) will still take some time.
>>>>>>>>>>>>>
>>>>>>>>>>>>> If we go this route, we're lowering the bar for doc changes, but not removing it.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> (3) This makes sense. Brian, would you be able to spend some time to look at the automation changes (build files and scripts) to ensure they look fine?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would also like to write a post mortem to extract lessons learned and avoid this situation in the future.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, May 8, 2020 at 9:44 AM Brian Hulette <bhule...@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm -0 on merging as-is. I have the same concerns as Robert and he's voiced them very well so I won't waste time re-airing them.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (2) I spot checked the content, pulled out some common patterns, and it mostly looks good, but there were also some issues (e.g. several pages were replaced with the contents from entirely different pages). I would be more comfortable if, say, a smoke test of comparing the old and new sites, with html tags stripped and ignoring whitespace, yielded what should be empty diffs.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Can you share any details about this analysis?
>>>>>>>>>>>>>
>>>>>>>>>>>>> It was basically paging through the diff, adding things to the sed script, and then looking at more diffs.
>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +1 for verifying the old and new are the same by diffing the output.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (3) It'd be good to have someone give a stamp of approval on the infrastructure changes, at least to validate that we're not going to be taking on extra tech debt with regard to Jenkins stability and developer workflow. I see that Brian has at least looked at this some.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My involvement so far was just recognizing a problem (creating root-owned files on Jenkins workers) and helping to fix it. If there's anyone available who's familiar with the website infrastructure it would be great if they could take a look instead (if not I could probably acquaint myself enough to review).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, May 7, 2020 at 11:57 PM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is a tough situation.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It would have been much better if this transition was structured in such a way that the review was more manageable (e.g. the suggestion of scripts, not mixing in voluminous unnecessary changes like whitespace, and not updating content), and possibly even incrementally (e.g. the new site would have been developed over multiple PRs in a subdomain or subdirectory while being worked on).
>>>>>>>>>>>>>>>> But hindsight is 20/20 and no one, myself included, thought to bring this up when the original migration was proposed, so this is more something to keep in mind for the future. I also appreciate the efforts that have been made to clean things up (e.g. preserving history) and address feedback.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> So, where do we go from here? My first thought is that I really don't want to set a precedent that a PR which "will require a large effort" and is in a state where, if we don't "move forward and merge what we have now", then "work done so far will be lost" means that it's OK to forgo a proper review.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On the other hand, there are some mitigating factors with this being the website and not the code, in that "bugs," though possibly embarrassing, won't break production pipelines or cause data loss, and though the source is technically part of the release, when we find something to fix we can fix the live website much more quickly than going through the whole release process and convincing people to upgrade. (I recognize accepting this argument is, to some degree at least, saying that we don't care about the correctness of docs as much as so-called "real" code, if we go there.)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If we decide to go ahead and merge (and I would not object), there are some things I would like to see.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (1) I would like to understand what we would do afterwards to "review the outcome, and ensure that all the content is there," and why it can't be done before merging instead. (Is it because it'd take time and we don't want to incorporate changes that are made to the website in the meantime? I think that boat has sailed, but maybe we can avoid making it worse...)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (2) I spot checked the content, pulled out some common patterns, and it mostly looks good, but there were also some issues (e.g. several pages were replaced with the contents from entirely different pages). I would be more comfortable if, say, a smoke test of comparing the old and new sites, with html tags stripped and ignoring whitespace, yielded what should be empty diffs.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (3) It'd be good to have someone give a stamp of approval on the infrastructure changes, at least to validate that we're not going to be taking on extra tech debt with regard to Jenkins stability and developer workflow. I see that Brian has at least looked at this some.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, May 7, 2020 at 12:40 PM Aizhamal Nurmamat kyzy <aizha...@apache.org> wrote:
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Thank you Ahmet.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > Robert/Brian, what do you think?
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > The website staging and pre commit tests have passed [1].
>>>>>>>>>>>>>>>> > If nobody has objections, we could merge it soon.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > [1] https://github.com/apache/beam/pull/11554
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > On Thu, May 7, 2020 at 11:38 AM Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> On Thu, May 7, 2020 at 10:50 AM Aizhamal Nurmamat kyzy <aizha...@apache.org> wrote:
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> Thanks for the writeup Ahmet.
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> My bias is to move forward and merge the PR. After this, we'll review the outcome, and ensure that all the content is there. Nam will help us with that.
>>>>>>>>>>>>>>>> >>> The reason that I'd like to move forward and merge what we have now is that if we don't do that, the work done so far will be lost.
>>>>>>>>>>>>>>>> >>> We'll make sure to stage the website in its current state, and use that as a reference/archive to ensure all the content has been moved.
>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>> >>> Is this reasonable to everyone?
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> This is reasonable to me. I agree with your reasons.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> What do others think?
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >>> On Wed, May 6, 2020 at 7:07 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> On Wed, May 6, 2020 at 2:33 PM Aizhamal Nurmamat kyzy <aizha...@apache.org> wrote:
>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>> >>>>>> > 1) Currently, the main blocker for merging is Staging Test Failures.
>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>> >>>>>> That and finishing the review. (Is someone tracking/coordinating this?)
>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>> >>>>> I am coordinating the work on the failed tests, but I would need other committers' help to perform the review. @Ahmet, could you help us prioritize the review for this PR?
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> The problem is there are too many manual changes. Reviewing this change in this form will require a large effort. I do not think I can interrupt other projects to prioritize reviews on this PR. IMO, we have a few options:
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> - PR to be restructured in the format suggested in this thread. A commit for infrastructure changes from Jekyll to Hugo. A second commit for a script that will convert the majority of the content. A third commit for the execution of the script. And a fourth commit for the additional manual content changes. If Nam can get to this form, people on this thread (myself/Robert/Pablo/Brian) can review the changes.
>>>>>>>>>>>>>>>> >>>> - Another option is, we can accept that we already invested in this transition and overall this is a good change, and merge the PR more or less in its current form (with tests fixed and open comments addressed) even though it has issues. And then over time fix the issues we encounter.
>>>>>>>>>>>>>>>> >>>> There was already some amount of review and visual comparisons; we risk losing some recent content changes but I am assuming this will not be much. If Nam can commit to comparing the two sites after a merge and fixing the majority of the delta, this might be a viable option.
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>> Another thing we can do: we can archive/store a read-only copy of the current website at an "archive" URL temporarily instead of completely deleting it. It will give us a baseline for a while to go back to the old content and move any missing data. (And maybe, someone can come up with an innovative way to compare the textual content of both sites.) A note on the stop-the-world approach: I believe we are already failing on that, with merge conflicts showing up on the PR. It will be better for us to complete the transition as soon as possible. Fixing after the initial merge might be a simpler task, especially if we can archive the old site.
>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>> >>>>>> > Michal showed Nam how to handle the 1st test, which was about a missing Apache License.
>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>> >>>>>> > However, the 2nd and 3rd tests looked like some kind of permissions error on the Jenkins worker, not something to be configured by code.
>>>>>>>>>>>>>>>> >>>>>> > For more details based on the Jenkins logs, the 2nd test failed because of website/www/site/themes and the 3rd test failed because of website/www/node_modules; both are auto-generated on build. Can someone help Nam look into this?
>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>> >>>>>> > RAT ("Run RAT PreCommit") - FAILURE
>>>>>>>>>>>>>>>> >>>>>> > Website_Stage_GCS ("Run Website_Stage_GCS PreCommit") - FAILURE
>>>>>>>>>>>>>>>> >>>>>> > Website_Stage_GCS ("Run Website_Stage_GCS PreCommit") - FAILURE
>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>> >>>>>> > 2) Are there any other blockers for merging? @Ahmet/Robert/others please share if there are any other blockers.
>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>> >>>>>> > [1] https://github.com/gohugoio/hugo/pull/4494
>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>> >>>>>> > On Wed, May 6, 2020 at 10:19 AM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>>>>>>>>>> >>>>>> >>
>>>>>>>>>>>>>>>> >>>>>> >> On Mon, May 4, 2020 at 7:07 PM Ahmet Altay <al...@google.com> wrote:
>>>>>>>>>>>>>>>> >>>>>> >> >
>>>>>>>>>>>>>>>> >>>>>> >> >> On Mon, May 4, 2020 at 6:30 PM Robert Bradshaw <rober...@google.com> wrote:
>>>>>>>>>>>>>>>> >>>>>> >> >>>
>>>>>>>>>>>>>>>> >>>>>> >> >>> I took the massive commit and split it up into:
>>>>>>>>>>>>>>>> >>>>>> >> >>>
>>>>>>>>>>>>>>>> >>>>>> >> >>> (1) Infrastructure changes (basically everything outside of website/www/site/content)
>>>>>>>>>>>>>>>> >>>>>> >> >>> (2) Sed script changes, and
>>>>>>>>>>>>>>>> >>>>>> >> >>> (3) Manual changes (everything not in (1) and (2)).
>>>>>>>>>>>>>>>> >>>>>> >> >
>>>>>>>>>>>>>>>> >>>>>> >> > Thank you Robert. This makes it much easier. What is the source of the sed script? I am not sure why some of those lines are there. It would be much easier for us to comment on the script source if it is reviewable somewhere.
>>>>>>>>>>>>>>>> >>>>>> >>
>>>>>>>>>>>>>>>> >>>>>> >> I just gathered up common patterns as I was trying to go through and review the files... Mostly it was an exercise in finding a compact representation for the delta, not trying to be a perfect conversion. (I do think in retrospect, if we do something like this again, it would be preferable to commit a script that does the auto-conversion (maybe even with some patch files for manual changes) both for ease of reviewing and to avoid the stop-the-world situation we're in now. (I'm still worried that some changes will get lost in the shuffle.)
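
For reference, the smoke test discussed in this thread (compare the generated HTML of the old and new sites with tags stripped and whitespace ignored, and flag missing files) could look roughly like the sketch below. This is not Robert's script, which is not reproduced in this thread. The names `visible_text`, `text_diff`, and `compare_sites`, and the idea of passing two local build directories, are illustrative assumptions; only the Python standard library is used.

```python
import difflib
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html):
    """Return the page text with tags removed and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def text_diff(old_html, new_html):
    """Word-level unified diff of the visible text; empty list if equal."""
    old_words = visible_text(old_html).split()
    new_words = visible_text(new_html).split()
    return list(difflib.unified_diff(old_words, new_words, lineterm="", n=0))


def compare_sites(old_root, new_root):
    """Map each problem page (relative path) to "missing" or its diff."""
    old_root, new_root = Path(old_root), Path(new_root)
    report = {}
    for old_file in sorted(old_root.rglob("*.html")):
        rel = old_file.relative_to(old_root)
        new_file = new_root / rel
        if not new_file.is_file():
            report[rel.as_posix()] = "missing"
            continue
        diff = text_diff(
            old_file.read_text(encoding="utf-8", errors="replace"),
            new_file.read_text(encoding="utf-8", errors="replace"),
        )
        if diff:
            report[rel.as_posix()] = diff
    return report
```

Calling `compare_sites` on the old (Jekyll) and new (Hugo) build output directories would return an empty dict when every old page exists in the new build with the same visible text. Because text nodes are joined with spaces, pages whose code highlighters split tokens into different spans will still show up as diffs, which matches the "Ignore: codeblocks" category above; a real check would probably need to special-case those.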