Hey guys,

How was your weekend? Thanks for the compliments and also for the
recommendations.

About the commits: as Brian said, we worked on this together on the-asf
Slack. It was a tough one, and we even ran a few experiments. We finally
came up with a solution that preserves all commits and uses `git mv`.
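
To illustrate the idea of a move-only commit discussed in this thread, here is a minimal Python sketch that prints the shell commands for such a commit. It is not necessarily the exact procedure used for the PR: the helper name `plan_move_commit`, the example paths, and the commit message are all hypothetical placeholders.

```python
from pathlib import PurePosixPath
from shlex import quote


def plan_move_commit(moves):
    """Return shell commands for a commit that only renames files."""
    commands = []
    created = set()
    for old, new in moves:
        parent = str(PurePosixPath(new).parent)
        # `git mv` needs the destination directory to exist already.
        if parent not in created:
            commands.append(f"mkdir -p {quote(parent)}")
            created.add(parent)
        commands.append(f"git mv {quote(old)} {quote(new)}")
    # Commit the pure moves on their own; content edits go in later commits.
    commands.append("git commit -m 'Move files only, no content edits'")
    return commands


# Hypothetical example paths, not the real list of moved files.
EXAMPLE_MOVES = [
    ("website/src/docs/page-a.md", "website/www/site/content/en/docs/page-a.md"),
    ("website/src/docs/page-b.md", "website/www/site/content/en/docs/page-b.md"),
]

for command in plan_move_commit(EXAMPLE_MOVES):
    print(command)
```

Git does not record renames explicitly; it detects them by content similarity when comparing commits. Keeping the files unedited in the move commit gives that detection the best chance of working.
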
IMHO, it's really difficult to review all of them at first because there
are tons of files, even though we made a commit [1] that helps you compare
the changes. Therefore, I recommend checking out my work and taking a look
at the Hugo structure; you will quickly map it to the Jekyll one. There are
no changes to file or directory names, only a reorganized structure. I have
written some short notes here, and I hope they help with reviewing.

1. Syntax
- I strongly recommend this article [2]. The author maps Hugo syntax to the
corresponding Jekyll syntax. It should give you a good overview, instead of
skimming the markdown files one by one.

2. Project structure
- The main part of Hugo is in "website/www/site". You may be a little
confused at first by the many directories, so please read this one [3]
first, and then you'll get into it very quickly. The most important thing
here is the flow. In Jekyll, you write a markdown file and then pick the
layout in the frontmatter, for example with "layout: home". In Hugo, we
have separate "content" and "layouts" directories; "layouts" mirrors the
structure of "content", and in the end Hugo knows how to connect them
behind the scenes.
- In Jekyll, the components are in "website/src/_includes"; in Hugo they
are moved to "website/src/layouts/partials".
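
As a rough illustration of how "layouts" mirrors "content", the Python sketch below lists the templates Hugo could pick for a regular page, most specific first. It is heavily simplified: it assumes the language content directory is "content/en", and it ignores most of Hugo's real lookup order (for example the "type" and "layout" frontmatter fields and output formats). The function name `candidate_layouts` is hypothetical.

```python
from pathlib import PurePosixPath


def candidate_layouts(content_path, language="en"):
    """List layout templates for a regular page, most specific first (simplified)."""
    parts = PurePosixPath(content_path).parts
    if len(parts) < 3 or parts[0] != "content" or parts[1] != language:
        raise ValueError(f"expected content/{language}/..., got {content_path!r}")
    candidates = []
    # A page inside content/<language>/<section>/ can use a section template.
    if len(parts) > 3:
        candidates.append(f"layouts/{parts[2]}/single.html")
    # Every regular page can fall back to the default single-page template.
    candidates.append("layouts/_default/single.html")
    return candidates


print(candidate_layouts("content/en/blog/capability-matrix.md"))
```

So, unlike Jekyll's explicit "layout: home", a page in the "blog" section picks up a "blog" layout just from where the file sits.
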

3. Shortcodes
- Just think of "shortcodes" as utility functions that we reuse many times
in markdown files. They are one of Hugo's distinctive features, and they
are located at "website/www/layouts/shortcodes".

A quick Q&A:
@Altay: there are some deleted files, as you can see in [1]. Some of them
behave differently in Hugo. For instance, the data from
"_data/capabilitymatrix.md" is now used directly in the frontmatter of
"website/www/site/content/en/blog/capability-matrix.md". The reason is that
it would take more work in Hugo to retrieve data from files and pass it
into "shortcodes" in markdown files (the other data files are not deleted
because they are used in the "layouts" HTML files).
@Robert: thanks for your review and comments on GitHub. I will go through
all of them today.

Best regards,
Nam


[1]
https://github.com/apache/beam/pull/11554/commits/b267bb360866a723ac2536f408f23de648c7cd4d
[2] https://simpleit.rocks/golang/hugo/migrating-a-jekyll-blog-to-hugo/
[3] https://gohugo.io/getting-started/directory-structure/

On Fri, May 1, 2020 at 6:24 PM Brian Hulette <bhule...@google.com> wrote:

> Regarding move detection: I worked with Nam on this some on the-asf slack.
> We couldn't make squashing into a single large commit work - when I did it,
> `git log` still showed many dropped and added files. Breaking out a single
> commit with the file moves was the best we could manage. I tested a PR that
> used this approach on a single file and the github UI did pick up on it
> [1]. Sadly it seems to give up on the larger PR.
>
> I figured this was good enough though, it's difficult to review all of the
> changes at once, but you can at least review the individual commits without
> being obfuscated by the moves.
>
> [1] https://github.com/apache/beam/pull/11579
>
>
> On Fri, May 1, 2020 at 9:11 AM Robert Bradshaw <rober...@google.com>
> wrote:
>
>> I just took a look, and added a couple of comments, but it mostly looks
>> good. Thanks for creating a commit that preserves changes; that's a big
>> improvement.
>>
>> +1 to Ahmet's suggestion about breaking the huge commit up a bit more. I
>> would suggest one that adds the mechanics (etc.), one that applies a script
>> to auto-convert the content (where we can review the script and check that
>> its application gives the resulting diff), and a final one that takes care
>> of the things that the script wasn't able to handle (or messed up, rather
>> than spending a huge amount of time getting the script perfect).
>>
>> On Fri, May 1, 2020 at 6:44 AM Kenneth Knowles <k...@apache.org> wrote:
>>
>>> I believe taking Brian and Robert's advice to help git detect moves
>>> (even more than you already have) will make this much more manageable. I
>>> just tried it out and squashing commits brings it to "631 files changed,
>>> 10363 insertions(+), 9945 deletions(-)" according to git, so that is more
>>> manageable than +47k - 47k. I'm not saying that a total squash is best.
>>> There may be a better way to factor the changes.
>>>
>>> Kenn
>>>
>>> On Thu, Apr 30, 2020 at 8:09 PM Ahmet Altay <al...@google.com> wrote:
>>>
>>>> Nam,
>>>>
>>>>  - Website looks good and looks the same as the current website.
>>>> (Visually comparing a few pages, not a deep analysis.)
>>>> - contribute.md looks good. (this is new content.)
>>>> - website/Dockerfile and website/README.md changes look good.
>>>> - I do not know what is the new version of some files, for example:
>>>> website/src/_data/authors.yml,  website/src/_data/capability-matrix.yml --
>>>> what replaces them?
>>>>
>>>> There are 887 file changes. It is not easy to review this. I wanted to
>>>> go commit by commit, but that did not help much. How about we try to
>>>> organize this review as reviewable commits.
>>>> - Changes to the mechanics (jekyll to hugo), themes, build files,
>>>> website related readmes etc. This will likely be a smaller change in number
>>>> of files. (This will likely have many completely new, and completely deleted
>>>> files. Only a few files have meaningful diffs.)
>>>> - Changes to the content. This might be a large number of files with
>>>> minimal changes. I do not think we can manually review each file, but at
>>>> least a quick review of minimal changes to each file would be good enough.
>>>>
>>>> What do you think?
>>>>
>>>> Ahmet
>>>>
>>>> On Thu, Apr 30, 2020 at 4:29 PM Hannah Jiang <hannahji...@google.com>
>>>> wrote:
>>>>
>>>>>> Since we want to move forward with the PR, I would like to ask the
>>>>>> community to hold off changes to the current Beam website for a week,
>>>>>> until we are able to review and merge the PR. Is this acceptable to
>>>>>> everyone?
>>>>>
>>>>> Do we have an exact date when we can push changes to the website? I
>>>>> have PRs to update documents so would like to plan ahead.
>>>>>
>>>>> On Thu, Apr 30, 2020 at 1:17 PM Nam Bui <nam....@polidea.com> wrote:
>>>>>
>>>>>> Hey guys,
>>>>>>
>>>>>> I tried my best to handle renamed files in Git. I have no clue why
>>>>>> GitHub doesn't show them, but finally I made this commit [1] (thanks
>>>>>> for your idea @bhulette) so you guys can review the changes with ease
>>>>>> (there is no longer a bunch of deleted markdown files :D). Also, a new
>>>>>> staged version is deployed; you can check it out [2].
>>>>>>
>>>>>> In case you are interested in translation, here is the proof of
>>>>>> concept [3] (the earth icon in the right corner is temporarily used for
>>>>>> switching languages). You can take a look at the translation guide for
>>>>>> this PoC [4].
>>>>>>
>>>>>> [1]
>>>>>> https://github.com/apache/beam/pull/11554/commits/b267bb360866a723ac2536f408f23de648c7cd4d
>>>>>> [2]
>>>>>> http://apache-beam-website-pull-requests.storage.googleapis.com/11554/index.html
>>>>>> [3] https://safe-relation.surge.sh/
>>>>>> [4]
>>>>>> https://github.com/PolideaInternal/beam/blob/website-develop/website/CONTRIBUTE.md#translation-guide
>>>>>>
>>>>>>
>>>>>> On Thu, Apr 30, 2020 at 7:24 PM Brian Hulette <bhule...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Changing the URLs is fine with me as long as the old urls will work
>>>>>>> too.
>>>>>>>
>>>>>>> But do we need to change the filenames for the blog posts to
>>>>>>> accomplish that? It's nice that the blog post markdown files start
>>>>>>> with a date so they naturally sort chronologically. It looks like this
>>>>>>> hugo PR [1] made it possible to extract date metadata and slug
>>>>>>> (i.e. dataflow-python-sdk-is-now-public) separately from the filename.
>>>>>>>
>>>>>>> [1] https://github.com/gohugoio/hugo/pull/4494
>>>>>>>
>>>>>>> On Thu, Apr 30, 2020 at 10:06 AM Ahmet Altay <al...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Apr 30, 2020 at 9:55 AM Thomas Weise <t...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> For changed URLs, will previous URLs be mapped to avoid broken
>>>>>>>>> external links?
>>>>>>>>>
>>>>>>>>
>>>>>>>> I believe the answer is yes from Nam's response "For now, we keep
>>>>>>>> the old URLs working in terms of redirecting them". I very much agree
>>>>>>>> that this is very important and should work for all existing urls.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Apr 30, 2020 at 9:34 AM Aizhamal Nurmamat kyzy <
>>>>>>>>> aizha...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> To give a little more context regarding the URLs, the date should
>>>>>>>>>> still appear on the blog post, but not on the URL.
>>>>>>>>>> For example, we'd have:
>>>>>>>>>>
>>>>>>>>>> https://beam.apache.org/beam/python/sdk/2016/02/25/python-sdk-now-public.html
>>>>>>>>>> become
>>>>>>>>>> https://beam.apache.org/blog/dataflow-python-sdk-is-now-public/.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>> I am not a content marketer. IMO, this is a good change. In the
>>>>>>>> past, a few times, we edited dates on posts (e.g. a release date was
>>>>>>>> entered incorrectly) and we had to either have a mismatch between the
>>>>>>>> date in the url and the date in the blog, or change the url. This
>>>>>>>> change simplifies things by having the date in only one place (in
>>>>>>>> content metadata).
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> The blog posts would have a small header showing the title,
>>>>>>>>>> author and publish date. But the URL would not have it.
>>>>>>>>>> Thoughts?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Apr 30, 2020 at 9:23 AM Nam Bui <nam....@polidea.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> @altay: Hey hey. Yeah, I didn't expect the baseUrl of the staging
>>>>>>>>>>> version to be
>>>>>>>>>>> "http://apache-beam-website-pull-requests.storage.googleapis.com/11554/",
>>>>>>>>>>> which also includes "/11554"; Hugo treats it as a path, so it
>>>>>>>>>>> breaks the paths of "static files" (like images). We made a fix.
>>>>>>>>>>> Now I'm working on "getting git to recognize files as renames" as
>>>>>>>>>>> you suggested.
>>>>>>>>>>>
>>>>>>>>>>> @robert: The dates are nice, but they cause verbose/long/ugly
>>>>>>>>>>> URLs. We discussed this with Aizhamal in the development stage and
>>>>>>>>>>> agreed to get rid of them. For now, we keep the old URLs working in
>>>>>>>>>>> terms of redirecting them. However, from now on, we should change
>>>>>>>>>>> the naming convention for blog posts to have a fancy URL like
>>>>>>>>>>> "beam.apache.org/blog/myblogpost.md". :)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 30, 2020 at 2:57 AM Robert Bradshaw <
>>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 29, 2020 at 5:08 PM Ahmet Altay <al...@google.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Nam, this looks better. At least links are working, and the
>>>>>>>>>>>>> website visually looks similar and generally in good shape. I
>>>>>>>>>>>>> think there are still issues. For example, I do not see any of
>>>>>>>>>>>>> the images (e.g. the beam logo on top left is missing.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 29, 2020 at 3:11 PM Brian Hulette <
>>>>>>>>>>>>> bhule...@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I left a comment on the PR [1]. I think the reason all of the
>>>>>>>>>>>>>> website content is not being tracked as file renames is that
>>>>>>>>>>>>>> there was a series of commits that created files in the new
>>>>>>>>>>>>>> directory, and then one commit that deleted the old directory.
>>>>>>>>>>>>>> If there were a single commit with all of the deleted and new
>>>>>>>>>>>>>> files, git would surely recognize they are effectively renames
>>>>>>>>>>>>>> and mark them as such. Maybe we just need to get all these
>>>>>>>>>>>>>> commits squashed into one?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>> https://github.com/apache/beam/pull/11554#issuecomment-621489844
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Nam, could you try this? If we can get git to recognize these
>>>>>>>>>>>>> as renames, review process would be much easier.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> +1.
>>>>>>>>>>>>
>>>>>>>>>>>> Alternatively, create a commit that just moves the files into a
>>>>>>>>>>>> new location (which git can always detect), then sit the edits on
>>>>>>>>>>>> top of that (which should preserve history better).
>>>>>>>>>>>>
>>>>>>>>>>>> Also, is there a reason the dates were removed from the blog
>>>>>>>>>>>> post filenames? For content like that, the dates are nice.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 29, 2020 at 10:39 AM Nam Bui <nam....@polidea.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi guys,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm Nam, from the team responsible for the Apache Beam website
>>>>>>>>>>>>>>> migration. I am pleased to answer some of the questions here.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> @aizhamal: Thanks for informing the community. :)
>>>>>>>>>>>>>>> @altay, @robertwb: Yes, there is a problem with the staged
>>>>>>>>>>>>>>> version at the moment. We didn't expect some behaviours in the
>>>>>>>>>>>>>>> build process. We fixed it today and have been waiting for
>>>>>>>>>>>>>>> @pablo to re-run it. The purpose of this PR is to completely
>>>>>>>>>>>>>>> migrate the Beam site from Jekyll to Hugo. Therefore, the
>>>>>>>>>>>>>>> bunch of deleted markdown files are from Jekyll, which was
>>>>>>>>>>>>>>> located at `beam/website/src`; Hugo is located at
>>>>>>>>>>>>>>> `beam/website/www` now. In `beam/website/README.md`, I wrote
>>>>>>>>>>>>>>> about running the Hugo website locally, although it is
>>>>>>>>>>>>>>> actually the same as for Jekyll (because it's also set up with
>>>>>>>>>>>>>>> Docker & Gradle). In `beam/website/CONTRIBUTE.md`, I guide
>>>>>>>>>>>>>>> people on how to get started with Hugo on the Beam website.
>>>>>>>>>>>>>>> There is also a link in the "Translation Guide" section which
>>>>>>>>>>>>>>> points to a branch of multilingual provenance, and it will
>>>>>>>>>>>>>>> become the next PR soon.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please let me know if you need more details. Feel free to
>>>>>>>>>>>>>>> ask any questions and I will get back to you with answers. I'm
>>>>>>>>>>>>>>> so sorry if I answer a little bit late due to the timezone. :)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>>> Nam
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Apr 28, 2020 at 8:49 PM Aizhamal Nurmamat kyzy <
>>>>>>>>>>>>>>> aizha...@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Adding +Nam Bui <nam....@polidea.com> and +Karolina Rosół
>>>>>>>>>>>>>>>> <karolina.ro...@polidea.com> to follow up on questions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Apr 28, 2020 at 11:34 AM Ahmet Altay <
>>>>>>>>>>>>>>>> al...@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I am having trouble reviewing the staged version. What is
>>>>>>>>>>>>>>>>> the best way to review this change?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Do we expect any changes to markdown files, beyond some
>>>>>>>>>>>>>>>>> metadata?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Apr 28, 2020 at 10:45 AM Robert Bradshaw <
>>>>>>>>>>>>>>>>> rober...@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks. It'll be great to better support more languages.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I looked at the PR and there seems to be no
>>>>>>>>>>>>>>>>>> provenance/history. E.g. all the content seems to be
>>>>>>>>>>>>>>>>>> entirely new files rather than diffs from the old. (There
>>>>>>>>>>>>>>>>>> also seems to be a huge amount of auto-generated js code as
>>>>>>>>>>>>>>>>>> well.)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I agree. This makes it very hard to review. I also see a
>>>>>>>>>>>>>>>>> bunch of deleted markdown files. Are they not getting
>>>>>>>>>>>>>>>>> migrated?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Apr 28, 2020 at 10:23 AM Aizhamal Nurmamat kyzy <
>>>>>>>>>>>>>>>>>> aizha...@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Hello everybody,
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> We are almost done migrating the Apache Beam website
>>>>>>>>>>>>>>>>>>> from Jekyll to Hugo. You can see the PR in [1], and we'd
>>>>>>>>>>>>>>>>>>> love to hear your feedback/comments on the PR. It includes
>>>>>>>>>>>>>>>>>>> detailed guidelines on contributing to the new Hugo-based
>>>>>>>>>>>>>>>>>>> website and adding translations to pages [2]. For those who
>>>>>>>>>>>>>>>>>>> are curious about adding new languages, we will provide a
>>>>>>>>>>>>>>>>>>> proof of concept in the next couple of days in this thread.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Since we want to move forward with the PR, I would like
>>>>>>>>>>>>>>>>>>> to ask the community to hold off changes to the current
>>>>>>>>>>>>>>>>>>> Beam website for a week, until we are able to review and
>>>>>>>>>>>>>>>>>>> merge the PR. Is this acceptable to everyone?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In case anyone missed my previous email with the
>>>>>>>>>>>>>>>>>>> background for the website migration, you can find more
>>>>>>>>>>>>>>>>>>> context here [3].
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> Aizhamal
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> [1] https://github.com/apache/beam/pull/11554
>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>> https://github.com/apache/beam/blob/256b7042bf504b94f161ca03b388a2ba247918d9/website/CONTRIBUTE.md
>>>>>>>>>>>>>>>>>>> [3]
>>>>>>>>>>>>>>>>>>> https://lists.apache.org/thread.html/r7fa6d710c0a1959cce5108e460d71c306ce5756cf96af818b41cb7ca%40%3Cdev.beam.apache.org%3E
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
