Thank you Ahmet, Brian, Robert and everyone else who spent time working on
this. The pull requests are now merged and the website seems to be working
as expected [1].

One minor issue that I have noticed is that the code blocks have a grey
background, which makes them less accessible than before. I created a Jira
issue for this [2], and will follow up to get it fixed. If you notice any
other issues, please file Jira issues and let me know.

Hope you are all safe,
Aizhamal

[1] https://beam.apache.org/
[2] https://issues.apache.org/jira/browse/BEAM-10001

On Thu, May 14, 2020 at 11:25 AM Pablo Estrada <pabl...@google.com> wrote:

> Here's a zipped-up tree from a staged sample of the website:
> https://drive.google.com/file/d/1LKL936tBJ79jpjvlL5vC5uYYwTHsWXiJ/view?usp=sharing
>
> I'd also suggest tagging the commit, so we can find the first commit later
> on for reference. I can push the tag after the PR is merged.
>
> On Thu, May 14, 2020 at 10:43 AM Ahmet Altay <al...@google.com> wrote:
>
>>
>>
>> On Thu, May 14, 2020 at 9:16 AM Aizhamal Nurmamat kyzy <
>> aizha...@apache.org> wrote:
>>
>>> Thank you all for reviewing and validating this pull request. I see that
>>> all tests are passing now. Should we merge it?
>>>
>>
>> +1 to merging now.
>>
>> Before the merge, please share a link to an archive copy of the old
>> website. After the merge, please try out the live website to see if it is
>> working as expected.
>>
>>
>>>
>>> On Wed, May 13, 2020, 5:41 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> Thank you! Let's merge it once tests are done.
>>>>
>>>> On Wed, May 13, 2020 at 5:23 PM Robert Bradshaw <rober...@google.com>
>>>> wrote:
>>>>
>>>>> I took a (non-comprehensive) look at these as well, and didn't see any
>>>>> issues, so am happy to sign off on this. Thanks Nam, Brian, Ahmet, and
>>>>> everyone else.
>>>>>
>>>>> On Wed, May 13, 2020 at 7:58 AM Nam Bui <nam....@polidea.com> wrote:
>>>>>
>>>>>> Hi Ahmet,
>>>>>> "Does this mean the internal links (e.g. contribute/team) will
>>>>>> disappear?"
>>>>>> Yes, I'd like to get rid of them. To make sure they don't show up and
>>>>>> confuse people, I replaced every use of "contribute/team" with the
>>>>>> external link. Currently, we only have 2 "redirect_to" links, which are
>>>>>> "contribute/team" & "contribute/project/team", so this change won't
>>>>>> affect anything else.
>>>>>> Also, based on your question, I just added a section to the
>>>>>> documentation (CONTRIBUTE.md) that lists the Jekyll features that were
>>>>>> replaced or removed, as they relate to writing a new blog post or
>>>>>> documentation in Hugo.
>>>>>>
>>>>>
>>>> Got it. The main effect will be that if anyone has a bookmark/link to these
>>>> pages, those links will no longer work. It is fine if it is only limited to
>>>> these 2 urls.
>>>>
>>>>
>>>>>
>>>>>>
>>>>>> On Wed, May 13, 2020 at 4:17 AM Ahmet Altay <al...@google.com> wrote:
>>>>>>
>>>>>>> - I reviewed the diff output with Nam's explanations. The change
>>>>>>> looks minimal. Large diffs are primarily coming from index and redirect
>>>>>>> files. Code blocks have differences, but the content is seemingly
>>>>>>> preserved. IIUC, the source of truth is the snippet files anyway. (It
>>>>>>> would be good to get one more set of eyes on this.)
>>>>>>> - Brian and I reviewed the infrastructure changes. They look
>>>>>>> reasonable.
>>>>>>>
>>>>>>> I think the PR is very close to a mergeable state. Especially if we can
>>>>>>> get an archive copy of the current website, I will be comfortable with
>>>>>>> the merge.
>>>>>>>
>>>>>>> And, thank you Nam for your work so far.
>>>>>>>
>>>>>>> On Tue, May 12, 2020 at 4:13 PM Nam Bui <nam....@polidea.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> A new commit that covers Robert's script has been pushed [1], and the
>>>>>>>> script output is attached to this email.
>>>>>>>>
>>>>>>>> Based on the diff output of the script, my strategy is to look at
>>>>>>>> the sections that contain large amounts of removed text, to make
>>>>>>>> sure that no content or files are lost. Below are all of the links
>>>>>>>> that have a large amount of removed content.
>>>>>>>>
>>>>>>>> - Detection:
>>>>>>>> These links lost some of the contents. Fixed!
>>>>>>>> + documentation/runners/jstorm/index.html
>>>>>>>> + documentation/dsls/sql/calcite/lexical-structure/index.html
>>>>>>>> + documentation/dsls/sql/zetasql/data-types/index.html
>>>>>>>> + documentation/dsls/sql/zetasql/query-syntax/index.html
>>>>>>>>
>>>>>>>> - Aliases:
>>>>>>>> These links are redirected links. So in Hugo, these HTML files only
>>>>>>>> include redirected URLs. I also took a look at them to ensure the
>>>>>>>> content was there.
>>>>>>>> + documentation/dsls/sql/calcite/lexical/index.html
>>>>>>>> + old URLs of blog posts
>>>>>>>>
>>>>>>>> - Ignore:
>>>>>>>> Hugo and Jekyll render code highlighters with different HTML
>>>>>>>> structures. Ahmet & Pablo agree with me that it's fair to ignore
>>>>>>>> them for now.
>>>>>>>> + codeblocks
>>>>>>>>
>>>>>>>> - Missing files:
>>>>>>>> The script reports a “missing files” status for some files:
>>>>>>>> + coming-soon.html (this file was used nowhere in Jekyll, so I
>>>>>>>> didn’t migrate to Hugo)
>>>>>>>> + documentation/dsls/sql/statements/select/index.html (aliases)
>>>>>>>> + blog/2019/04/25/beam-2.12.0.html (fixed!)
>>>>>>>> + blog/2020/05/08/beam-summit-digital-2020.html (new blog post,
>>>>>>>> added!)
>>>>>>>> + v2/index.html (this file was used nowhere in Jekyll, so I didn’t
>>>>>>>> migrate to Hugo)
>>>>>>>> + contribute/team/index.html (mentioned in “redirect_to” below)
>>>>>>>> + contribute/project/team/index.html (mentioned in “redirect_to”
>>>>>>>> below)
>>>>>>>>
>>>>>>>> - “redirect_to”:
>>>>>>>> In Jekyll, there is a feature called “redirect_to”. For instance,
>>>>>>>> you click on an internal link “contribute/team/” to reach the markdown
>>>>>>>> “team.md”, and then the markdown file redirects you to the external
>>>>>>>> URL “https://example.com”.
>>>>>>>> However, there is no such feature in Hugo. My solution is to
>>>>>>>> directly replace “contribute/team/” with “https://example.com”.
>>>>>>>>
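The replacement described above could be sketched roughly as follows. This is only an illustration in Python, not the change that was made in the PR: the function name `replace_redirect_links` and the `redirects` mapping are hypothetical, and it assumes plain inline Markdown links of the form `[text](target)` (Hugo shortcodes and reference-style links are not handled).

```python
import re


def replace_redirect_links(markdown, redirects):
    """Point inline Markdown links at an external URL instead of an internal path.

    `redirects` maps an internal path without surrounding slashes
    (e.g. "contribute/team") to the external URL it used to redirect to.
    Links whose target is not in the mapping are left unchanged.
    """
    def rewrite(match):
        text, target = match.group(1), match.group(2)
        key = target.strip("/")
        return "[{}]({})".format(text, redirects.get(key, target))

    # Matches inline Markdown links of the form [text](target).
    return re.sub(r"\[([^\]]*)\]\(([^)\s]+)\)", rewrite, markdown)
```

With a mapping such as `{"contribute/team": "https://example.com"}`, a link written as `[team](/contribute/team/)` would come out as `[team](https://example.com)`, while links to other pages stay as they are.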
>>>>>>>
>>>>>>> Does this mean the internal links (e.g. contribute/team) will
>>>>>>> disappear?
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> [1] https://github.com/apache/beam/pull/11554
>>>>>>>>
>>>>>>>> On Mon, May 11, 2020 at 7:34 PM Nam Bui <nam....@polidea.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Updates for today:
>>>>>>>>> - Thanks Brian & Ahmet for your reviews. I left my comments for
>>>>>>>>> some of the questions and also adapted new changes to the reviews [1].
>>>>>>>>> - I see that the new blog post was merged yesterday, so I added it
>>>>>>>>> to the PR as well.
>>>>>>>>>
>>>>>>>>> I briefly tried the script from Robert with the input of build
>>>>>>>>> files from old and new websites. It seemed to work well in terms of
>>>>>>>>> detecting missing files (or probably wrong links leading to missing
>>>>>>>>> files). I will push another commit to fix all that up, hopefully
>>>>>>>>> tomorrow.
>>>>>>>>>
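As a rough illustration of the missing-file check mentioned above, the Python sketch below lists files that exist in the old build output but not in the new one. It is not Robert's script; the names `relative_files` and `missing_files` are made up here, and it assumes both sites have already been generated into local directories.

```python
import os


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Normalize separators so results compare equal across platforms.
            found.add(os.path.relpath(full, root).replace(os.sep, "/"))
    return found


def missing_files(old_root, new_root):
    """Return sorted paths present in the old build but absent from the new one."""
    return sorted(relative_files(old_root) - relative_files(new_root))
```

A path reported here is not necessarily a problem: as the list in the later message shows, some entries turn out to be pages that were intentionally not migrated or that are now served through aliases.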
>>>>>>>>> [1]
>>>>>>>>> https://github.com/apache/beam/pull/11554#issuecomment-626792031
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>> Nam
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, May 11, 2020 at 9:01 AM Nam Bui <nam....@polidea.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> @Ahmet: Yeah, it's all clear to me. :)
>>>>>>>>>> @Robert: Thanks for your ideas and also the script. It really
>>>>>>>>>> helps me with my work.
>>>>>>>>>>
>>>>>>>>>> Best regards!
>>>>>>>>>>
>>>>>>>>>> On Sat, May 9, 2020 at 2:10 AM Ahmet Altay <al...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> This sounds reasonable to me. Thank you. Nam, does it make sense
>>>>>>>>>>> to you?
>>>>>>>>>>>
>>>>>>>>>>> On Fri, May 8, 2020 at 11:53 AM Robert Bradshaw <
>>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I'd really like to not see this work go to waste: the
>>>>>>>>>>>> original revision, the further efforts Nam has made to make it
>>>>>>>>>>>> more manageable to review, and the work put into reviewing this so
>>>>>>>>>>>> far. That way we can get the benefits of being on Hugo. How about
>>>>>>>>>>>> this for a concrete proposal:
>>>>>>>>>>>>
>>>>>>>>>>>> (1) We get "standard" approval from one or more committers for
>>>>>>>>>>>> the infrastructure changes, just as with any other PR. Brian has
>>>>>>>>>>>> already started this, but if others could step up as well that'd
>>>>>>>>>>>> be great.
>>>>>>>>>>>>
>>>>>>>>>>>> (2) Reviewers (and authors) typically count on (or request)
>>>>>>>>>>>> sufficient automated test coverage to augment the fact that their
>>>>>>>>>>>> eyeballs are fallible, which is something that is missing here (and,
>>>>>>>>>>>> given the size of the change, not easily compensated for by a more
>>>>>>>>>>>> detailed manual review). How about we use the script above (or
>>>>>>>>>>>> similar) as an automated test to validate that the website's contents
>>>>>>>>>>>> haven't (materially) changed? I feel we've validated enough that the
>>>>>>>>>>>> style looks good via spot checking (which is something that should
>>>>>>>>>>>> work on all pages if it works on one). The diff between the current
>>>>>>>>>>>> site and the newly generated site should be empty (it might already
>>>>>>>>>>>> be [1]), or at least we should get a stamp of approval on the
>>>>>>>>>>>> plain-text diff (which should be small), before merging.
>>>>>>>>>>>>
>>>>>>>>>>>> (3) To make things easier, everyone holds off on making any
>>>>>>>>>>>> changes to the old site until a fixed future date (say, next
>>>>>>>>>>>> Wednesday). Hopefully we can get it merged by then. If not, a
>>>>>>>>>>>> condition for merging would be a commitment to incorporating new
>>>>>>>>>>>> changes made after this date.
>>>>>>>>>>>>
>>>>>>>>>>>> Does this sound reasonable?
>>>>>>>>>>>>
>>>>>>>>>>>> - Robert
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> [1] I'd be curious as to how small the diff already is, but my
>>>>>>>>>>>> script relies on local directories with the generated HTML, which
>>>>>>>>>>>> I don't have handy at the moment.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, May 8, 2020 at 10:45 AM Robert Bradshaw <
>>>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Here's a script that we could run on the old and new sites
>>>>>>>>>>>>> that should quickly catch any major issues but not get caught up
>>>>>>>>>>>>> in formatting minutiae.
>>>>>>>>>>>>>
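The script attached to this message is not preserved here. As a rough illustration of the kind of check discussed in this thread (comparing old and new pages with HTML tags stripped and whitespace ignored), a minimal Python sketch could look like the following. It is not Robert's script: the names `visible_text` and `text_diff` are made up for illustration, and it uses only the standard library.

```python
import difflib
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def visible_text(html):
    """Return the page text as a list of words, ignoring tags and whitespace."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with a space means markup inside a word splits that word.
    return " ".join(parser.chunks).split()


def text_diff(old_html, new_html):
    """Return unified-diff lines between the words of two pages (empty if equal)."""
    return list(difflib.unified_diff(
        visible_text(old_html), visible_text(new_html),
        fromfile="old", tofile="new", lineterm=""))
```

An empty result from `text_diff` means the two pages carry the same words in the same order. Because text is split wherever a tag appears, differently structured code highlighters can still show up as differences, which matches the decision elsewhere in this thread to ignore code blocks for now.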
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, May 8, 2020 at 10:23 AM Robert Bradshaw <
>>>>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, May 8, 2020 at 9:58 AM Aizhamal Nurmamat kyzy <
>>>>>>>>>>>>>> aizha...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I understand the difficulty, and this certainly comes with
>>>>>>>>>>>>>>> lessons learned for future similar projects.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To your questions Robert:
>>>>>>>>>>>>>>> (1 and 2) I will commit to reviewing the text in the resulting
>>>>>>>>>>>>>>> pages. I will try and use some automation to extract visible
>>>>>>>>>>>>>>> text from each page and diff it with the current state of the
>>>>>>>>>>>>>>> website. I can do this starting next week. From some quick
>>>>>>>>>>>>>>> research, there seem to be tools that help with this analysis (
>>>>>>>>>>>>>>> https://stackoverflow.com/questions/3286955/compare-two-websites-and-see-if-they-are-equal
>>>>>>>>>>>>>>> )
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> At first glance it looks like these tools would give diffs
>>>>>>>>>>>>>> that are *larger* than the 47K one we're struggling to review 
>>>>>>>>>>>>>> here.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> By remaining in this state, we hold others up from making
>>>>>>>>>>>>>>> changes, or we increase the amount of work needed after merging 
>>>>>>>>>>>>>>> to port
>>>>>>>>>>>>>>> over changes that may be missed. If we move forward, new 
>>>>>>>>>>>>>>> changes can be
>>>>>>>>>>>>>>> done on top of the new website.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I agree we don't want to hold others up from making changes.
>>>>>>>>>>>>>> However, the amount of work to port changes over seems small in
>>>>>>>>>>>>>> comparison to everything else that is being discussed here. (It
>>>>>>>>>>>>>> also provides good incentives to reach the bar quickly and has the
>>>>>>>>>>>>>> advantage of falling on the right people.) (3) will still take
>>>>>>>>>>>>>> some time.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If we go this route, we're lowering the bar for doc changes,
>>>>>>>>>>>>>> but not removing it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (3) This makes sense. Brian, would you be able to spend some
>>>>>>>>>>>>>>> time to look at the automation changes (build files and 
>>>>>>>>>>>>>>> scripts) to ensure
>>>>>>>>>>>>>>> they look fine?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I would also like to write a post mortem to extract lessons
>>>>>>>>>>>>>>> learned and avoid this situation in the future.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, May 8, 2020 at 9:44 AM Brian Hulette <
>>>>>>>>>>>>>>> bhule...@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm -0 on merging as-is. I have the same concerns as Robert
>>>>>>>>>>>>>>>> and he's voiced them very well so I won't waste time re-airing 
>>>>>>>>>>>>>>>> them.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> (2) I spot checked the content, pulled out some common
>>>>>>>>>>>>>>>>> patterns, and
>>>>>>>>>>>>>>>>> it mostly looks good, but there were also some issues
>>>>>>>>>>>>>>>>> (e.g. several
>>>>>>>>>>>>>>>>> pages were replaced with the contents from entirely
>>>>>>>>>>>>>>>>> different pages).
>>>>>>>>>>>>>>>>> I would be more comfortable if, say, a smoke test of
>>>>>>>>>>>>>>>>> comparing the old
>>>>>>>>>>>>>>>>> and new sites, with html tags stripped and ignoring
>>>>>>>>>>>>>>>>> whitespace,
>>>>>>>>>>>>>>>>> yielded what should be empty diffs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Can you share any details about this analysis?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It was basically paging through the diff, adding things to
>>>>>>>>>>>>>> the sed script, and then looking at more diffs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> +1 for verifying the old and new are the same by diffing the
>>>>>>>>>>>>>>>> output.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (3) It'd be good to have someone give a stamp of approval
>>>>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>>>>> infrastructure changes, at least to validate that we're
>>>>>>>>>>>>>>>>> not going to
>>>>>>>>>>>>>>>>> be taking on extra tech debt with regard to jenkins
>>>>>>>>>>>>>>>>> stability and
>>>>>>>>>>>>>>>>> developer workflow. I see that Brian has at least looked
>>>>>>>>>>>>>>>>> at this some.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My involvement so far was just recognizing a problem
>>>>>>>>>>>>>>>> (creating root-owned files on jenkins workers) and helping to
>>>>>>>>>>>>>>>> fix it. If there's anyone available who's familiar with the
>>>>>>>>>>>>>>>> website infrastructure it would be great if they could take a
>>>>>>>>>>>>>>>> look instead (if not I could probably acquaint myself enough to
>>>>>>>>>>>>>>>> review).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, May 7, 2020 at 11:57 PM Robert Bradshaw <
>>>>>>>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This is a tough situation.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It would have been much better if this transition was structured
>>>>>>>>>>>>>>>>> in such a way that the review was more manageable (e.g. the
>>>>>>>>>>>>>>>>> suggestion of scripts, not mixing in voluminous unnecessary changes
>>>>>>>>>>>>>>>>> like whitespace, and not updating content), and possibly even
>>>>>>>>>>>>>>>>> incrementally (e.g. the new site would have been developed over
>>>>>>>>>>>>>>>>> multiple PRs in a subdomain or subdirectory while being worked on).
>>>>>>>>>>>>>>>>> But hindsight is 20/20 and no one, myself included, thought to bring
>>>>>>>>>>>>>>>>> this up when the original migration was proposed, so this is more
>>>>>>>>>>>>>>>>> something to keep in mind for the future. I also appreciate the
>>>>>>>>>>>>>>>>> efforts that have been made to clean things up (e.g. preserving
>>>>>>>>>>>>>>>>> history) and address feedback.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> So, where do we go from here? My first thought is that I really
>>>>>>>>>>>>>>>>> don't want to set a precedent that, just because a PR "will
>>>>>>>>>>>>>>>>> require a large effort" and is in a state where if we don't "move
>>>>>>>>>>>>>>>>> forward and merge what we have now" then "work done so far will be
>>>>>>>>>>>>>>>>> lost," we think it's OK to forgo doing a proper review.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On the other hand, there are some mitigating factors with this
>>>>>>>>>>>>>>>>> being the website and not the code, in that "bugs," though possibly
>>>>>>>>>>>>>>>>> embarrassing, won't break production pipelines or cause data loss,
>>>>>>>>>>>>>>>>> and though the source is technically part of the release, when we
>>>>>>>>>>>>>>>>> find something to fix we can fix the live website much more quickly
>>>>>>>>>>>>>>>>> than going through the whole release process and convincing people
>>>>>>>>>>>>>>>>> to upgrade. (I recognize accepting this argument is, to some degree
>>>>>>>>>>>>>>>>> at least, saying that we don't care about the correctness of docs
>>>>>>>>>>>>>>>>> as much as so-called "real" code, if we go there.)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we decide to go ahead and merge (and I would not
>>>>>>>>>>>>>>>>> object), there are
>>>>>>>>>>>>>>>>> some things I would like to see.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (1) I would like to understand what we would do afterwards
>>>>>>>>>>>>>>>>> to "review
>>>>>>>>>>>>>>>>> the outcome, and ensure that all the content is there,"
>>>>>>>>>>>>>>>>> and why it
>>>>>>>>>>>>>>>>> can't be done before merging instead. (Is it because it'd
>>>>>>>>>>>>>>>>> take time
>>>>>>>>>>>>>>>>> and we don't want to incorporate changes that are made to
>>>>>>>>>>>>>>>>> the website
>>>>>>>>>>>>>>>>> in the meantime? I think that boat has sailed, but maybe
>>>>>>>>>>>>>>>>> we can avoid
>>>>>>>>>>>>>>>>> making it worse...)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (2) I spot checked the content, pulled out some common
>>>>>>>>>>>>>>>>> patterns, and
>>>>>>>>>>>>>>>>> it mostly looks good, but there were also some issues
>>>>>>>>>>>>>>>>> (e.g. several
>>>>>>>>>>>>>>>>> pages were replaced with the contents from entirely
>>>>>>>>>>>>>>>>> different pages).
>>>>>>>>>>>>>>>>> I would be more comfortable if, say, a smoke test of
>>>>>>>>>>>>>>>>> comparing the old
>>>>>>>>>>>>>>>>> and new sites, with html tags stripped and ignoring
>>>>>>>>>>>>>>>>> whitespace,
>>>>>>>>>>>>>>>>> yielded what should be empty diffs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> (3) It'd be good to have someone give a stamp of approval
>>>>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>>>>> infrastructure changes, at least to validate that we're
>>>>>>>>>>>>>>>>> not going to
>>>>>>>>>>>>>>>>> be taking on extra tech debt with regard to jenkins
>>>>>>>>>>>>>>>>> stability and
>>>>>>>>>>>>>>>>> developer workflow. I see that Brian has at least looked
>>>>>>>>>>>>>>>>> at this some.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, May 7, 2020 at 12:40 PM Aizhamal Nurmamat kyzy
>>>>>>>>>>>>>>>>> <aizha...@apache.org> wrote:
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Thank you Ahmet.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > Robert/Brian, what do you think?
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > The website staging and pre commit tests have passed
>>>>>>>>>>>>>>>>> [1]. If nobody has objections, we could merge it soon.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > [1] https://github.com/apache/beam/pull/11554
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > On Thu, May 7, 2020 at 11:38 AM Ahmet Altay <
>>>>>>>>>>>>>>>>> al...@google.com> wrote:
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> On Thu, May 7, 2020 at 10:50 AM Aizhamal Nurmamat kyzy <
>>>>>>>>>>>>>>>>> aizha...@apache.org> wrote:
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>> Thanks for the writeup Ahmet.
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>> My bias is to move forward and merge the PR. After
>>>>>>>>>>>>>>>>> >>> this, we'll review the outcome, and ensure that all the
>>>>>>>>>>>>>>>>> >>> content is there. Nam will help us with that.
>>>>>>>>>>>>>>>>> >>> The reason that I'd like to move forward and merge
>>>>>>>>>>>>>>>>> >>> what we have now is that if we don't do that, the work done
>>>>>>>>>>>>>>>>> >>> so far will be lost.
>>>>>>>>>>>>>>>>> >>> We'll make sure to stage the website in its current
>>>>>>>>>>>>>>>>> >>> state, and use that as a reference/archive to ensure all the
>>>>>>>>>>>>>>>>> >>> content has been moved.
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>> Is this reasonable to everyone?
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> This is reasonable to me. I agree with your reasons.
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> What do others think?
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>>
>>>>>>>>>>>>>>>>> >>> On Wed, May 6, 2020 at 7:07 PM Ahmet Altay <
>>>>>>>>>>>>>>>>> al...@google.com> wrote:
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>> On Wed, May 6, 2020 at 2:33 PM Aizhamal Nurmamat kyzy
>>>>>>>>>>>>>>>>> <aizha...@apache.org> wrote:
>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>> >>>>>> > 1) Currently, the main blocker for merging is
>>>>>>>>>>>>>>>>> Staging Test Failures.
>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>> >>>>>> That and finishing the review. (Is someone
>>>>>>>>>>>>>>>>> tracking/coordinating this?)
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>> I am coordinating the work on the failed tests, but
>>>>>>>>>>>>>>>>> >>>>> I would need other committers' help to perform the review.
>>>>>>>>>>>>>>>>> >>>>> @Ahmet, could you help us prioritize the review for this PR?
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>> The problem is there are too many manual changes.
>>>>>>>>>>>>>>>>> >>>> Reviewing this change in this form will require a large
>>>>>>>>>>>>>>>>> >>>> effort. I do not think I can interrupt other projects to
>>>>>>>>>>>>>>>>> >>>> prioritize reviews on this PR. IMO, we have a few options:
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>> - The PR could be restructured in the format suggested in
>>>>>>>>>>>>>>>>> >>>> this thread: a commit for the infrastructure changes from Jekyll
>>>>>>>>>>>>>>>>> >>>> to Hugo, a second commit for a script that will convert the
>>>>>>>>>>>>>>>>> >>>> majority of the content, a third commit for the execution of the
>>>>>>>>>>>>>>>>> >>>> script, and a fourth commit for the additional manual content
>>>>>>>>>>>>>>>>> >>>> changes. If Nam can get to this form, people on this thread
>>>>>>>>>>>>>>>>> >>>> (myself/Robert/Pablo/Brian) can review the changes.
>>>>>>>>>>>>>>>>> >>>> - Another option is that we accept that we have already
>>>>>>>>>>>>>>>>> >>>> invested in this transition and that overall this is a good
>>>>>>>>>>>>>>>>> >>>> change, and merge the PR more or less in its current form (with
>>>>>>>>>>>>>>>>> >>>> tests fixed and open comments addressed) even though it has
>>>>>>>>>>>>>>>>> >>>> issues, and then fix the issues we encounter over time. There
>>>>>>>>>>>>>>>>> >>>> was already some amount of review and visual comparison. We
>>>>>>>>>>>>>>>>> >>>> risk losing some recent content changes, but I am assuming this
>>>>>>>>>>>>>>>>> >>>> will not be much. If Nam can commit to comparing the two sites
>>>>>>>>>>>>>>>>> >>>> after a merge and fixing the majority of the delta, this might
>>>>>>>>>>>>>>>>> >>>> be a viable option.
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>> Another thing we can do: we can archive/store a
>>>>>>>>>>>>>>>>> >>>> read-only copy of the current website at an "archive" url
>>>>>>>>>>>>>>>>> >>>> temporarily instead of completely deleting it. It will give us a
>>>>>>>>>>>>>>>>> >>>> baseline for a while to go back to the old content and move any
>>>>>>>>>>>>>>>>> >>>> missing data. (And maybe someone can come up with an innovative
>>>>>>>>>>>>>>>>> >>>> way to compare the textual content of both sites.) A note on the
>>>>>>>>>>>>>>>>> >>>> stop-the-world approach: I believe we are already failing on
>>>>>>>>>>>>>>>>> >>>> that, with merge conflicts showing up on the PR. It will be
>>>>>>>>>>>>>>>>> >>>> better for us to complete the transition as soon as possible.
>>>>>>>>>>>>>>>>> >>>> Fixing after the initial merge might be a simpler task,
>>>>>>>>>>>>>>>>> >>>> especially if we can archive the old site.
>>>>>>>>>>>>>>>>> >>>>
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>>
>>>>>>>>>>>>>>>>> >>>>>>
>>>>>>>>>>>>>>>>> >>>>>> > Michal showed Nam how to handle the 1st test
>>>>>>>>>>>>>>>>> which was about Apache License missing.
>>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>>> >>>>>> > However, the 2nd and 3rd tests looked like some
>>>>>>>>>>>>>>>>> >>>>>> > kind of permissions error on the Jenkins worker, not something
>>>>>>>>>>>>>>>>> >>>>>> > configurable in code. For more details, based on the Jenkins
>>>>>>>>>>>>>>>>> >>>>>> > logs, the 2nd test failed because of website/www/site/themes
>>>>>>>>>>>>>>>>> >>>>>> > and the 3rd test failed because of website/www/node_modules;
>>>>>>>>>>>>>>>>> >>>>>> > both are auto-generated on build. Can someone help Nam look
>>>>>>>>>>>>>>>>> >>>>>> > into this?
>>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>>> >>>>>> > RAT ("Run RAT PreCommit") — FAILURE
>>>>>>>>>>>>>>>>> >>>>>> > Website_Stage_GCS ("Run Website_Stage_GCS
>>>>>>>>>>>>>>>>> PreCommit") — FAILURE
>>>>>>>>>>>>>>>>> >>>>>> > Website_Stage_GCS ("Run Website_Stage_GCS
>>>>>>>>>>>>>>>>> PreCommit") — FAILURE
>>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>>> >>>>>> > 2) Are there any other blockers for merging?
>>>>>>>>>>>>>>>>> @Ahmet/Robert/others please share if there are any other 
>>>>>>>>>>>>>>>>> blockers.
>>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>>> >>>>>> > [1] https://github.com/gohugoio/hugo/pull/4494
>>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>>> >>>>>> >
>>>>>>>>>>>>>>>>> >>>>>> > On Wed, May 6, 2020 at 10:19 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>>>>>>>> >>>>>> >>
>>>>>>>>>>>>>>>>> >>>>>> >> On Mon, May 4, 2020 at 7:07 PM Ahmet Altay <
>>>>>>>>>>>>>>>>> al...@google.com> wrote:
>>>>>>>>>>>>>>>>> >>>>>> >> >
>>>>>>>>>>>>>>>>> >>>>>> >> >> On Mon, May 4, 2020 at 6:30 PM Robert
>>>>>>>>>>>>>>>>> Bradshaw <rober...@google.com> wrote:
>>>>>>>>>>>>>>>>> >>>>>> >> >>>
>>>>>>>>>>>>>>>>> >>>>>> >> >>> I took the massive commit and split it up into:
>>>>>>>>>>>>>>>>> >>>>>> >> >>>
>>>>>>>>>>>>>>>>> >>>>>> >> >>> (1) Infrastructure changes (basically everything outside of
>>>>>>>>>>>>>>>>> >>>>>> >> >>> website/www/site/content),
>>>>>>>>>>>>>>>>> >>>>>> >> >>> (2) Sed script changes, and
>>>>>>>>>>>>>>>>> >>>>>> >> >>> (3) Manual changes (everything not in (1) and (2)).
>>>>>>>>>>>>>>>>> >>>>>> >> >
>>>>>>>>>>>>>>>>> >>>>>> >> >
>>>>>>>>>>>>>>>>> >>>>>> >> > Thank you Robert. This makes it much easier.
>>>>>>>>>>>>>>>>> >>>>>> >> > What is the source of the sed script? I am not sure why some
>>>>>>>>>>>>>>>>> >>>>>> >> > of those lines are there. It would be much easier for us to
>>>>>>>>>>>>>>>>> >>>>>> >> > comment on the script source if it is reviewable somewhere.
>>>>>>>>>>>>>>>>> >>>>>> >>
>>>>>>>>>>>>>>>>> >>>>>> >> I just gathered up common patterns as I was trying to go
>>>>>>>>>>>>>>>>> >>>>>> >> through and review the files... Mostly it was an exercise in
>>>>>>>>>>>>>>>>> >>>>>> >> finding a compact representation for the delta, not trying to
>>>>>>>>>>>>>>>>> >>>>>> >> be a perfect conversion. (I do think in retrospect, if we do
>>>>>>>>>>>>>>>>> >>>>>> >> something like this again, it would be preferable to commit a
>>>>>>>>>>>>>>>>> >>>>>> >> script that does the auto-conversion (maybe even with some
>>>>>>>>>>>>>>>>> >>>>>> >> patch files for manual changes) both for ease of reviewing and
>>>>>>>>>>>>>>>>> >>>>>> >> to avoid the stop-the-world situation we're in now. I'm still
>>>>>>>>>>>>>>>>> >>>>>> >> worried that some changes will get lost in the shuffle.)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
