Thank you all for reviewing and validating this pull request. I see that
all tests are passing now, should we merge it?

On Wed, May 13, 2020, 5:41 PM Ahmet Altay <al...@google.com> wrote:

> Thank you! Let's merge it once tests are done.
>
> On Wed, May 13, 2020 at 5:23 PM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> I took a (non-comprehensive) look at these as well, and didn't see any
>> issues, so am happy to sign off on this. Thanks Nam, Brian, Ahmet, and
>> everyone else.
>>
>> On Wed, May 13, 2020 at 7:58 AM Nam Bui <nam....@polidea.com> wrote:
>>
>>> Hi Ahmet,
>>> "Does this mean the internal links (e.g. contribute/team) will
>>> disappear?"
>>> Yes, I'd like to get rid of them. And to make sure it won't appear to
>>> confuse people, I replaced all of the spots using "contribute/team" with
>>> the external one. Currently, we only have 2 "redirect_to" links which are
>>> "contribute/team" & "contribute/project/team", so this act won't have any
>>> affects.
>>> Also, based on your question, I just added a section in the
>>> documentation (CONTRIBUTE.md), which mentions the replaced/removed features
>>> of Jekyll in terms of writing a new blog post or documentation in Hugo.
>>>
>>
> Got it. The main effect will be any one has a bookmark/link to these
> pages, those links will no longer work. It is fine if it is only limited to
> these 2 urls.
>
>
>>
>>>
>>> On Wed, May 13, 2020 at 4:17 AM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> - I reviewed the diff output with Nam's explanations. The change looks
>>>> minimal. Large diffs are primarily coming from index and redirect files.
>>>> codeblocks have differences but the content is seemingly preserved. IIUC,
>>>> the source of truth is snippet files anyway. (It would be good to get one
>>>> more set of eyes on this.)
>>>> - Brian and I reviewed the infrastructure changes. They look reasonable.
>>>>
>>>> I think PR is very close to a mergeable state. Especially if we can get
>>>> an archive copy of the current website, I will be comfortable with the
>>>> merge.
>>>>
>>>> And, thank you Nam for your work so far.
>>>>
>>>> On Tue, May 12, 2020 at 4:13 PM Nam Bui <nam....@polidea.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> A new commit covers Robert's script is pushed [1], and also the script
>>>>> output is attached in this email.
>>>>>
>>>>> Based on the diff output of the script, my strategy is looking at the
>>>>> sections which contain the large/massive removed texts, to make sure that
>>>>> there are no lost content or files. And below are all of the links which
>>>>> have large of the removed content.
>>>>>
>>>>> - Detection:
>>>>> These links lost some of the contents. Fixed!
>>>>> + documentation/runners/jstorm/index.html
>>>>> + documentation/dsls/sql/calcite/lexical-structure/index.html
>>>>> + documentation/dsls/sql/zetasql/data-types/index.html
>>>>> + documentation/dsls/sql/zetasql/query-syntax/index.html
>>>>>
>>>>> - Aliases:
>>>>> These links are redirected links. So in Hugo, these HTML files only
>>>>> include redirected URLs. I also took a look at them to ensure the content
>>>>> was there.
>>>>> + documentation/dsls/sql/calcite/lexical/index.html
>>>>> + old URLs of blog posts
>>>>>
>>>>> - Ignore:
>>>>> Hugo and Jekyll have different structures of code highlighters
>>>>> rendering in HTML. Ahmed & Pablo agree with me that its fair to ignore 
>>>>> them
>>>>> for now.
>>>>> + codeblocks
>>>>>
>>>>> - Missing files:
>>>>> The script returns some of “missing files” status
>>>>> + coming-soon.html (this file was used nowhere in Jekyll, so I didn’t
>>>>> migrate to Hugo)
>>>>> + documentation/dsls/sql/statements/select/index.html (aliases)
>>>>> + blog/2019/04/25/beam-2.12.0.html (fixed!)
>>>>> + blog/2020/05/08/beam-summit-digital-2020.html (new blog post, added!)
>>>>> + v2/index.html (this file was used nowhere in Jekyll, so I didn’t
>>>>> migrate to Hugo)
>>>>> + contribute/team/index.html (mentioned in “redirect_to” below)
>>>>> + contribute/project/team/index.html (mentioned in “redirect_to” below)
>>>>>
>>>>> - “redirect_to”:
>>>>> In Jekyll, there is a feature called “redirect_to”. For instance, you
>>>>> click on an internal link “contribute/team/” to reach the markdown
>>>>> “team.md”, then from the markdown file, it redirects you to the external
>>>>> URL “https://example.com”.
>>>>> However, there is no such feature in Hugo. My solution is to directly
>>>>> replace “contribute/team/” with “https://example.com”.
>>>>>
>>>>
>>>> Does this mean the internal links (e.g. contribute/team) will disappear?
>>>>
>>>>
>>>>>
>>>>> [1] https://github.com/apache/beam/pull/11554
>>>>>
>>>>> On Mon, May 11, 2020 at 7:34 PM Nam Bui <nam....@polidea.com> wrote:
>>>>>
>>>>>> Updates for today:
>>>>>> - Thanks Brian & Ahmet for your reviews. I left my comments for some
>>>>>> of the questions and also adapted new changes to the reviews [1].
>>>>>> - I see that the new blog post was merged yesterday, so I added it to
>>>>>> the PR as well.
>>>>>>
>>>>>> I briefly tried the script from Robert with the input of build files
>>>>>> from old and new websites. It seemed to work well in terms of detecting
>>>>>> missing files (or probably wrong links leading to missing files). I will
>>>>>> push another commit to fix all that up, hope can be tomorrow.
>>>>>>
>>>>>> [1] https://github.com/apache/beam/pull/11554#issuecomment-626792031
>>>>>>
>>>>>> Best regards,
>>>>>> Nam
>>>>>>
>>>>>>
>>>>>> On Mon, May 11, 2020 at 9:01 AM Nam Bui <nam....@polidea.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> @Ahmet: Yeah, it's all clear to me. :)
>>>>>>> @Robert: Thanks for your ideas and also the script. It really helps
>>>>>>> me to serve my works.
>>>>>>>
>>>>>>> Best regard!
>>>>>>>
>>>>>>> On Sat, May 9, 2020 at 2:10 AM Ahmet Altay <al...@google.com> wrote:
>>>>>>>
>>>>>>>> This sounds reasonable to me. Thank you. Nam, does it make sense to
>>>>>>>> you?
>>>>>>>>
>>>>>>>> On Fri, May 8, 2020 at 11:53 AM Robert Bradshaw <
>>>>>>>> rober...@google.com> wrote:
>>>>>>>>
>>>>>>>>> I'd really like to not see this work go to waste, both the
>>>>>>>>> original revision, the further efforts Nam has done in making it more
>>>>>>>>> manageable to review, and the work put into reviewing this so far, so 
>>>>>>>>> we
>>>>>>>>> can get the benefits of being on Hugo. How about this for a
>>>>>>>>> concrete proposal:
>>>>>>>>>
>>>>>>>>> (1) We get "standard" approval from one or more committers for the
>>>>>>>>> infrastructure changes, just as with any other PR. Brian has
>>>>>>>>> already started this, but if others could step up as well that'd be 
>>>>>>>>> great.
>>>>>>>>>
>>>>>>>>> (2) Reviewers (and authors) typically count on (or request)
>>>>>>>>> sufficient automated test coverage to augment the fact that their 
>>>>>>>>> eyeballs
>>>>>>>>> are fallible, which is something that is missing here (and given the 
>>>>>>>>> size
>>>>>>>>> of the change not easily compensated for by a more detailed manual 
>>>>>>>>> review).
>>>>>>>>> How about we use the script above (or similar) as an automated test to
>>>>>>>>> validate the website's contents haven't (materially) changed. I feel 
>>>>>>>>> we've
>>>>>>>>> validated enough that the style looks good via spot checking (which is
>>>>>>>>> something that should work on all pages if it works on one). The diff
>>>>>>>>> between the current site and the newly generated site should be empty 
>>>>>>>>> (it
>>>>>>>>> might already be [1]), or at least we should get a stamp of approval 
>>>>>>>>> on the
>>>>>>>>> plain-text diff (which should be small), before merging.
>>>>>>>>>
>>>>>>>>> (3) To make things easier, everyone holds off on making any
>>>>>>>>> changes to the old site until a fixed future date (say, next 
>>>>>>>>> Wednesday).
>>>>>>>>> Hopefully we can get it merged by then. If not, a condition for 
>>>>>>>>> merging
>>>>>>>>> would be a commitment incorporating new changes after this date.
>>>>>>>>>
>>>>>>>>> Does this sound reasonable?
>>>>>>>>>
>>>>>>>>> - Robert
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> [1] I'd be curious as to how small the diff already is, but my
>>>>>>>>> script relies on local directories with the generated HTML, which I 
>>>>>>>>> don't
>>>>>>>>> have handy at the moment.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, May 8, 2020 at 10:45 AM Robert Bradshaw <
>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> Here's a script that we could run on the old and new sites that
>>>>>>>>>> should quickly catch any major issues but not get caught up in 
>>>>>>>>>> formatting
>>>>>>>>>> minutia.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, May 8, 2020 at 10:23 AM Robert Bradshaw <
>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> On Fri, May 8, 2020 at 9:58 AM Aizhamal Nurmamat kyzy <
>>>>>>>>>>> aizha...@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I understand the difficulty, and this certainly comes with
>>>>>>>>>>>> lessons learned for future similar projects.
>>>>>>>>>>>>
>>>>>>>>>>>> To your questions Robert:
>>>>>>>>>>>> (1 and 2) I will commit to review the text in the resulting
>>>>>>>>>>>> pages. I will try and use some automation to extract visible text 
>>>>>>>>>>>> from each
>>>>>>>>>>>> page and diff it with the current state of the website. I can do 
>>>>>>>>>>>> this
>>>>>>>>>>>> starting next week. From some quick research, there seem to be 
>>>>>>>>>>>> tools that
>>>>>>>>>>>> help with this analysis (
>>>>>>>>>>>> https://stackoverflow.com/questions/3286955/compare-two-websites-and-see-if-they-are-equal
>>>>>>>>>>>> )
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> At first glance it looks like these tools would give diffs that
>>>>>>>>>>> are *larger* than the 47K one we're struggling to review here.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> By remaining in this state, we hold others up from making
>>>>>>>>>>>> changes, or we increase the amount of work needed after merging to 
>>>>>>>>>>>> port
>>>>>>>>>>>> over changes that may be missed. If we move forward, new changes 
>>>>>>>>>>>> can be
>>>>>>>>>>>> done on top of the new website.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I agree we don't want to hold others up from making changes.
>>>>>>>>>>> However, the amount of work to port changes over seems small in 
>>>>>>>>>>> comparison
>>>>>>>>>>> to everything else that is being discussed here. (It also provides 
>>>>>>>>>>> good
>>>>>>>>>>> incentives to reach the bar quickly and has the advantage of 
>>>>>>>>>>> falling on the
>>>>>>>>>>> right people.) (3) will still take some time.
>>>>>>>>>>>
>>>>>>>>>>> If we go this route, we're lowering the bar for doc changes, but
>>>>>>>>>>> not removing it.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> (3) This makes sense. Brian, would you be able to spend some
>>>>>>>>>>>> time to look at the automation changes (build files and scripts) 
>>>>>>>>>>>> to ensure
>>>>>>>>>>>> they look fine?
>>>>>>>>>>>>
>>>>>>>>>>>> I would also like to write a post mortem to extract lessons
>>>>>>>>>>>> learned and avoid this situation in the future.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 8, 2020 at 9:44 AM Brian Hulette <
>>>>>>>>>>>> bhule...@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm -0 on merging as-is. I have the same concerns as Robert
>>>>>>>>>>>>> and he's voiced them very well so I won't waste time re-airing 
>>>>>>>>>>>>> them.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (2) I spot checked the content, pulled out some common
>>>>>>>>>>>>>> patterns, and
>>>>>>>>>>>>>> it mostly looks good, but there were also some issues (e.g.
>>>>>>>>>>>>>> several
>>>>>>>>>>>>>> pages were replaced with the contents from entirely different
>>>>>>>>>>>>>> pages).
>>>>>>>>>>>>>> I would be more comfortable if, say, a smoke test of
>>>>>>>>>>>>>> comparing the old
>>>>>>>>>>>>>> and new sites, with html tags stripped and ignoring
>>>>>>>>>>>>>> whitespace,
>>>>>>>>>>>>>> yielded what should be empty diffs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Can you share any details about this analysis?
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>> It was basically paging through the diff, adding things to the
>>>>>>>>>>> sed script, and then looking at more diffs.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> +1 for verifying the old and new are the same by diffing the
>>>>>>>>>>>>> output.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> (3) It'd be good to have someone give a stamp of approval on
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> infrastructure changes, at least to validate that we're not
>>>>>>>>>>>>>> going to
>>>>>>>>>>>>>> be taking on extra tech debt with regard to jenkins stability
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> developer workflow. I see that Brian has at least looked at
>>>>>>>>>>>>>> this some.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> My involvement so far was just recognizing a problem (creating
>>>>>>>>>>>>> root-owned files on jenkins workers) and helping to fix it. If 
>>>>>>>>>>>>> there's
>>>>>>>>>>>>> anyone available who's familiar with the website infrastructure 
>>>>>>>>>>>>> it would be
>>>>>>>>>>>>> great if they could take a look instead (if not I could probably 
>>>>>>>>>>>>> acquaint
>>>>>>>>>>>>> myself enough to review).
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, May 7, 2020 at 11:57 PM Robert Bradshaw <
>>>>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is a tough situation.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It would have been much better if this transition was
>>>>>>>>>>>>>> structured in
>>>>>>>>>>>>>> such a way that the review was more manageable (e.g. the
>>>>>>>>>>>>>> suggestion of
>>>>>>>>>>>>>> scripts, not mixing in voluminous unnecessary changes like
>>>>>>>>>>>>>> whitespace,
>>>>>>>>>>>>>> and not updating content), and possibly even incrementally
>>>>>>>>>>>>>> (e.g. the
>>>>>>>>>>>>>> new site would have been developed over multiple PRs in a
>>>>>>>>>>>>>> subdomain or
>>>>>>>>>>>>>> subdirectory while being worked on). But hindsight is 20/20
>>>>>>>>>>>>>> and no
>>>>>>>>>>>>>> one, myself included, thought to bring this up when the
>>>>>>>>>>>>>> original
>>>>>>>>>>>>>> migration was proposed, so this is more something to keep in
>>>>>>>>>>>>>> mind for
>>>>>>>>>>>>>> the future. I also appreciate the efforts that have been made
>>>>>>>>>>>>>> to clean
>>>>>>>>>>>>>> things up (e.g. preserving history) and address feedback.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> So, where do we go from here? My first thought is that I
>>>>>>>>>>>>>> really don't
>>>>>>>>>>>>>> want to set a precedent that just because a PR "will require
>>>>>>>>>>>>>> a large
>>>>>>>>>>>>>> effort" and in a state that if we don't "move forward and
>>>>>>>>>>>>>> merge what
>>>>>>>>>>>>>> we have now" then "work done so far will be lost" means that
>>>>>>>>>>>>>> we think
>>>>>>>>>>>>>> it's OK to forgo doing a proper review.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On the other hand, there are some mitigating factors with
>>>>>>>>>>>>>> this being
>>>>>>>>>>>>>> the website and not the code in that "bugs," though possibly
>>>>>>>>>>>>>> embarrassing, won't break production pipelines or data loss,
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> though the source is technically part of the release, when we
>>>>>>>>>>>>>> find
>>>>>>>>>>>>>> something to fix we can fix the live website much more
>>>>>>>>>>>>>> quickly than go
>>>>>>>>>>>>>> through the whole release process and convince people to
>>>>>>>>>>>>>> upgrade. (I
>>>>>>>>>>>>>> recognize accepting this argument is, to some degree at
>>>>>>>>>>>>>> least, saying
>>>>>>>>>>>>>> that we don't care about the correctness of docs as much as
>>>>>>>>>>>>>> so-called
>>>>>>>>>>>>>> "real" code, if we go there.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If we decide to go ahead and merge (and I would not object),
>>>>>>>>>>>>>> there are
>>>>>>>>>>>>>> some things I would like to see.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (1) I would like to understand what we would do afterwards to
>>>>>>>>>>>>>> "review
>>>>>>>>>>>>>> the outcome, and ensure that all the content is there," and
>>>>>>>>>>>>>> why it
>>>>>>>>>>>>>> can't be done before merging instead. (Is it because it'd
>>>>>>>>>>>>>> take time
>>>>>>>>>>>>>> and we don't want to incorporate changes that are made to the
>>>>>>>>>>>>>> website
>>>>>>>>>>>>>> in the meantime? I think that boat has sailed, but maybe we
>>>>>>>>>>>>>> can avoid
>>>>>>>>>>>>>> making it worse...)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (2) I spot checked the content, pulled out some common
>>>>>>>>>>>>>> patterns, and
>>>>>>>>>>>>>> it mostly looks good, but there were also some issues (e.g.
>>>>>>>>>>>>>> several
>>>>>>>>>>>>>> pages were replaced with the contents from entirely different
>>>>>>>>>>>>>> pages).
>>>>>>>>>>>>>> I would be more comfortable if, say, a smoke test of
>>>>>>>>>>>>>> comparing the old
>>>>>>>>>>>>>> and new sites, with html tags stripped and ignoring
>>>>>>>>>>>>>> whitespace,
>>>>>>>>>>>>>> yielded what should be empty diffs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (3) It'd be good to have someone give a stamp of approval on
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> infrastructure changes, at least to validate that we're not
>>>>>>>>>>>>>> going to
>>>>>>>>>>>>>> be taking on extra tech debt with regard to jenkins stability
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>> developer workflow. I see that Brian has at least looked at
>>>>>>>>>>>>>> this some.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, May 7, 2020 at 12:40 PM Aizhamal Nurmamat kyzy
>>>>>>>>>>>>>> <aizha...@apache.org> wrote:
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Thank you Ahmet.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > Robert/Brian, what do you think?
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > The website staging and pre commit tests have passed [1].
>>>>>>>>>>>>>> If nobody has objections, we could merge it soon.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > [1] https://github.com/apache/beam/pull/11554
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > On Thu, May 7, 2020 at 11:38 AM Ahmet Altay <
>>>>>>>>>>>>>> al...@google.com> wrote:
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> On Thu, May 7, 2020 at 10:50 AM Aizhamal Nurmamat kyzy <
>>>>>>>>>>>>>> aizha...@apache.org> wrote:
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>> Thanks for the writeup Ahmet.
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>> My bias is to move forward and merge the PR. After this,
>>>>>>>>>>>>>> we'll review the outcome, and ensure that all the content is 
>>>>>>>>>>>>>> there. Nam
>>>>>>>>>>>>>> will help us with that.
>>>>>>>>>>>>>> >>> The reason that I'd like to move forward and merge what
>>>>>>>>>>>>>> we have now - is that if we don't do that, the work done so far 
>>>>>>>>>>>>>> will be
>>>>>>>>>>>>>> lost.
>>>>>>>>>>>>>> >>> We'll make sure to stage the website in its current
>>>>>>>>>>>>>> state, and use that as reference/archive to ensure all the 
>>>>>>>>>>>>>> content have
>>>>>>>>>>>>>> been moved.
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>> Is this reasonable to everyone?
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> This is reasonable to me. I agree with your reasons.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> What do others think?
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>> >>> On Wed, May 6, 2020 at 7:07 PM Ahmet Altay <
>>>>>>>>>>>>>> al...@google.com> wrote:
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> On Wed, May 6, 2020 at 2:33 PM Aizhamal Nurmamat kyzy <
>>>>>>>>>>>>>> aizha...@apache.org> wrote:
>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>> >>>>>> > 1) Currently, the main blocker for merging is
>>>>>>>>>>>>>> Staging Test Failures.
>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>> >>>>>> That and finishing the review. (Is someone
>>>>>>>>>>>>>> tracking/coordinating this?)
>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>> >>>>> I am coordinating the work on the failed tests, but I
>>>>>>>>>>>>>> would need other committer's help to perform the review. @Ahmet, 
>>>>>>>>>>>>>> could you
>>>>>>>>>>>>>> help us prioritize the review for this PR?
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> The problem is there are too many manual changes.
>>>>>>>>>>>>>> Reviewing this change in this form will require a large effort. 
>>>>>>>>>>>>>> I do not
>>>>>>>>>>>>>> think I can interrupt other projects to prioritize reviews on 
>>>>>>>>>>>>>> this PR. IMO,
>>>>>>>>>>>>>> we have a few options:
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> - PR to be restructured in the format suggested in this
>>>>>>>>>>>>>> thread. A commit for infrastructure changes from Jekyll to hugo. 
>>>>>>>>>>>>>> A second
>>>>>>>>>>>>>> commit for a script that will convert the majority of the 
>>>>>>>>>>>>>> content. A third
>>>>>>>>>>>>>> commit for the execution of the script. And a fourth commit for 
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> additional manual content changes. If Nam can get to this form, 
>>>>>>>>>>>>>> people on
>>>>>>>>>>>>>> this thread myself/Robert/Pablo/Brian can review the changes.
>>>>>>>>>>>>>> >>>> - Another option is, we can accept that we already
>>>>>>>>>>>>>> invested in this transition and overall this is a good change, 
>>>>>>>>>>>>>> and merge
>>>>>>>>>>>>>> the PR more or less in its current form (with tests fixed and 
>>>>>>>>>>>>>> open comments
>>>>>>>>>>>>>> addressed) even though it has issues. And then overtime fix the 
>>>>>>>>>>>>>> issues we
>>>>>>>>>>>>>> encounter. There was already some amount of review and visual 
>>>>>>>>>>>>>> comparisons,
>>>>>>>>>>>>>> we risk losing some recent content changes but I am assuming 
>>>>>>>>>>>>>> this will not
>>>>>>>>>>>>>> be much. If Nam can commit to compare two sites after a merge, 
>>>>>>>>>>>>>> fixing the
>>>>>>>>>>>>>> majority of the delta, this might be a viable option.
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>> Another thing we can do, we can archive/store a
>>>>>>>>>>>>>> read-only copy of the current website in an "archive" url 
>>>>>>>>>>>>>> temporarily
>>>>>>>>>>>>>> instead of completely deleting it. It will give us a baseline 
>>>>>>>>>>>>>> for a while
>>>>>>>>>>>>>> to go back to the old content and move any missing data. (And 
>>>>>>>>>>>>>> maybe,
>>>>>>>>>>>>>> someone can come up with an innovative way to compare the 
>>>>>>>>>>>>>> textual content
>>>>>>>>>>>>>> of both sites.) A note on the stop world approach, I believe we 
>>>>>>>>>>>>>> are already
>>>>>>>>>>>>>> failing on that with merge conflicts showing up on the PR. It 
>>>>>>>>>>>>>> will be
>>>>>>>>>>>>>> better for us to complete the transition as soon as possible. 
>>>>>>>>>>>>>> Fixing after
>>>>>>>>>>>>>> the initial merge might be a simpler task, especially if we can 
>>>>>>>>>>>>>> archive the
>>>>>>>>>>>>>> old site.
>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>> >>>>>> > Michal showed Nam how to handle the 1st test which
>>>>>>>>>>>>>> was about Apache License missing.
>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>> >>>>>> > However, the 2nd and 3rd tests looked like some kind
>>>>>>>>>>>>>> of permissions error on the Jenkins worker, not to be configured 
>>>>>>>>>>>>>> by code.
>>>>>>>>>>>>>> For more details based on Jenkin logs, the 2nd test failed 
>>>>>>>>>>>>>> because of
>>>>>>>>>>>>>> website/www/site/themes and the 3rd test failed because of
>>>>>>>>>>>>>> website/www/node_modules, they are both auto-generated files on 
>>>>>>>>>>>>>> build. Can
>>>>>>>>>>>>>> someone help Nam to look into this?
>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>> >>>>>> > RAT ("Run RAT PreCommit") — FAILURE
>>>>>>>>>>>>>> >>>>>> > Website_Stage_GCS ("Run Website_Stage_GCS
>>>>>>>>>>>>>> PreCommit") — FAILURE
>>>>>>>>>>>>>> >>>>>> > Website_Stage_GCS ("Run Website_Stage_GCS
>>>>>>>>>>>>>> PreCommit") — FAILURE
>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>> >>>>>> > 2) Are there any other blockers for merging?
>>>>>>>>>>>>>> @Ahmet/Robert/others please share if there are any other 
>>>>>>>>>>>>>> blockers.
>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>> >>>>>> > [1] https://github.com/gohugoio/hugo/pull/4494
>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>> >>>>>> > On Wed, May 6, 2020 at 10:19 AM Robert Bradshaw <
>>>>>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>>>>> >>>>>> >>
>>>>>>>>>>>>>> >>>>>> >> On Mon, May 4, 2020 at 7:07 PM Ahmet Altay <
>>>>>>>>>>>>>> al...@google.com> wrote:
>>>>>>>>>>>>>> >>>>>> >> >
>>>>>>>>>>>>>> >>>>>> >> >> On Mon, May 4, 2020 at 6:30 PM Robert Bradshaw <
>>>>>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>>>>> >>>>>> >> >>>
>>>>>>>>>>>>>> >>>>>> >> >>> I took the massive commit and split it up into:
>>>>>>>>>>>>>> >>>>>> >> >>>
>>>>>>>>>>>>>> >>>>>> >> >>> (1) Infrastructure changes (basically
>>>>>>>>>>>>>> everything outside of
>>>>>>>>>>>>>> >>>>>> >> >>> (website/www/site/content)
>>>>>>>>>>>>>> >>>>>> >> >>> (2) Sed script changes, and
>>>>>>>>>>>>>> >>>>>> >> >>> (3) Manual changes (everything not in (1) and
>>>>>>>>>>>>>> (2)).
>>>>>>>>>>>>>> >>>>>> >> >
>>>>>>>>>>>>>> >>>>>> >> >
>>>>>>>>>>>>>> >>>>>> >> > Thank you Robert. This makes it much easier. What
>>>>>>>>>>>>>> is the source of the sed script? I am not sure why some of those 
>>>>>>>>>>>>>> lines are
>>>>>>>>>>>>>> there. It would be much easier for us to comment on the script 
>>>>>>>>>>>>>> source if it
>>>>>>>>>>>>>> is reviewable somewhere.
>>>>>>>>>>>>>> >>>>>> >>
>>>>>>>>>>>>>> >>>>>> >> I just gathered up common patterns as I was trying
>>>>>>>>>>>>>> to go through and
>>>>>>>>>>>>>> >>>>>> >> review the files... Mostly it was an exercise in
>>>>>>>>>>>>>> finding a compact
>>>>>>>>>>>>>> >>>>>> >> representation for the delta, not trying to be a
>>>>>>>>>>>>>> perfect conversion.
>>>>>>>>>>>>>> >>>>>> >> (I do think in retrospect, if we do something like
>>>>>>>>>>>>>> this again, it
>>>>>>>>>>>>>> >>>>>> >> would be preferable to commit a script that does
>>>>>>>>>>>>>> the auto-conversion
>>>>>>>>>>>>>> >>>>>> >> (maybe even with some patch files for manual
>>>>>>>>>>>>>> changes) both for ease of
>>>>>>>>>>>>>> >>>>>> >> reviewing and to avoid the stop-the-world situation
>>>>>>>>>>>>>> we're in now. (I'm
>>>>>>>>>>>>>> >>>>>> >> still worried that some changes will get lost in
>>>>>>>>>>>>>> the shuffle.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>

Reply via email to