[ 
https://issues.apache.org/jira/browse/HIVE-29204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Rebele updated HIVE-29204:
---------------------------------
    Description: 
Some links to attachments lead to a 404 Not Found page, e.g. 
[attachments/40509928/42696874-txt|https://hive.apache.org/attachments/40509928/42696874-txt]
 in [SQL Standard Based Hive 
Authorization|https://hive.apache.org/docs/latest/language/sql-standard-based-hive-authorization/#hive-013].

Some link texts replace the dot before the file extension with a dash (e.g., in 
content/community/resources/presentations.md). In general, it would be better 
to use the document title, rather than a number, as the file name and link text.
{code:java}
50:* [attachments/27362054/35193149-pptx](/attachments/27362054/35193149.pptx) (Ashutosh Chauhan)
{code}
A few shell commands that might be helpful:
{code:java}
find themes/hive/static/attachments -type f | sed 's#themes/hive/static/##' | 
sort -u > available-attachments.txt
rg "attachments/" | sed 's#attachments/#\nattachments/#g;' |
  grep '^attachments' | sed 's/\([?"<> )]\|\]\).*//' |
  sort -u > needed-attachments.txt
{code}
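The two lists can then be compared to find references whose target file does not exist. The following is a minimal sketch, assuming both files were produced by the commands above; the function name missing_attachments is hypothetical and not part of the Hive site tooling. Because rg also matches link texts, entries such as attachments/27362054/35193149-pptx may be reported even though the actual link target exists.

```shell
# Hypothetical helper: print every line of the "needed" list that has no
# exact counterpart in the "available" list.
missing_attachments() {
  available_list=$1
  needed_list=$2
  # -F: treat patterns as fixed strings, -x: match whole lines only,
  # -v: invert the match, -f: read the patterns from a file.
  # Note: grep exits with status 1 when nothing is printed (nothing missing).
  grep -F -x -v -f "$available_list" "$needed_list"
}

# Usage (assumes the two lists from the commands above exist):
#   missing_attachments available-attachments.txt needed-attachments.txt
```

Using grep with a pattern file avoids any dependency on the sort order of the two lists; comm would work as well, provided both lists are sorted under the same locale.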
{-}There are also some duplicate files{-}: (Update: the duplicates have been 
removed by HIVE-29325)
{code:java}
$ cat available-attachments.txt | sed 's#^#themes/hive/static/#' | xargs md5sum | sort
...
f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/34177489.pdf
f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/34177517.pdf
f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/35193010.pdf
f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/35193011.pdf
...
{code}
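To check whether any duplicates remain, the checksum listing can be reduced to only those files that share a checksum. This is a rough sketch, assuming md5sum-style input ("checksum  path" per line); the function name duplicate_groups is hypothetical.

```shell
# Hypothetical helper: read md5sum output on stdin and print only the
# lines whose checksum occurs more than once, one blank-line-separated
# group per checksum. The order of the groups is unspecified.
duplicate_groups() {
  awk '
    { count[$1]++; group[$1] = group[$1] $0 "\n" }
    END { for (h in count) if (count[h] > 1) printf "%s\n", group[h] }
  '
}

# Usage (assumes available-attachments.txt from the commands above):
#   sed "s#^#themes/hive/static/#" available-attachments.txt | xargs md5sum | duplicate_groups
```

An empty result would suggest that no two attachments have identical content.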

  was:
Some links to attachments lead to a 404 Not found, e.g. 
[attachments/40509928/42696874-txt|https://hive.apache.org/attachments/40509928/42696874-txt]
 in [SQL Standard Based Hive 
Authorization|https://hive.apache.org/docs/latest/language/sql-standard-based-hive-authorization/#hive-013].

Some link texts replace the dot with a dash (e.g., 
content/community/resources/presentations.md). In general, it would be better 
to use the title of the document instead of numbers as file name and link text.
{code:java}
50:* [attachments/27362054/35193149-pptx](/attachments/27362054/35193149.pptx) (Ashutosh Chauhan)
{code}
A few shell commands that might be helpful:
{code:java}
find themes/hive/static/attachments -type f | sed 's#themes/hive/static/##' | 
sort -u > available-attachments.txt
rg "attachments/" | sed 's#attachments/#\nattachments/#g;' |
  grep '^attachments' | sed 's/\([?"<> )]\|\]\).*//' |
  sort -u > needed-attachments.txt
{code}
There are also some duplicate files:
{code:java}
$ cat available-attachments.txt | sed 's#^#themes/hive/static/#' | xargs md5sum | sort
...
f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/34177489.pdf
f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/34177517.pdf
f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/35193010.pdf
f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/35193011.pdf
...
{code}


> Hive-site: cleanup attachments and links to attachments
> -------------------------------------------------------
>
>                 Key: HIVE-29204
>                 URL: https://issues.apache.org/jira/browse/HIVE-29204
>             Project: Hive
>          Issue Type: Task
>            Reporter: Thomas Rebele
>            Priority: Major
>
> Some links to attachments lead to a 404 Not Found page, e.g. 
> [attachments/40509928/42696874-txt|https://hive.apache.org/attachments/40509928/42696874-txt]
>  in [SQL Standard Based Hive 
> Authorization|https://hive.apache.org/docs/latest/language/sql-standard-based-hive-authorization/#hive-013].
> Some link texts replace the dot before the file extension with a dash (e.g., in 
> content/community/resources/presentations.md). In general, it would be better 
> to use the document title, rather than a number, as the file name and link 
> text.
> {code:java}
> 50:* [attachments/27362054/35193149-pptx](/attachments/27362054/35193149.pptx) (Ashutosh Chauhan)
> {code}
> A few shell commands that might be helpful:
> {code:java}
> find themes/hive/static/attachments -type f | sed 's#themes/hive/static/##' | 
> sort -u > available-attachments.txt
> rg "attachments/" | sed 's#attachments/#\nattachments/#g;' |
>   grep '^attachments' | sed 's/\([?"<> )]\|\]\).*//' |
>   sort -u > needed-attachments.txt
> {code}
> {-}There are also some duplicate files{-}: (Update: the duplicates have been 
> removed by HIVE-29325)
> {code:java}
> $ cat available-attachments.txt | sed 's#^#themes/hive/static/#' | xargs md5sum | sort
> ...
> f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/34177489.pdf
> f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/34177517.pdf
> f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/35193010.pdf
> f9f26fe37b0c5276d0b63f98e1188324  themes/hive/static/attachments/27362075/35193011.pdf
> ...
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
