[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-18 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619507#comment-17619507
 ] 

Robert Metzger commented on FLINK-27721:


Great, I'll open a PR to add it to our website.

> Slack: set up archive
> -
>
> Key: FLINK-27721
> URL: https://issues.apache.org/jira/browse/FLINK-27721
> Project: Flink
>  Issue Type: Sub-task
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Major
> Attachments: screenshot-1.png
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-18 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619422#comment-17619422
 ] 

Chesnay Schepler commented on FLINK-27721:
--

oooh, ok that changes things a bit. Then +1 to use linen.dev



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-18 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619353#comment-17619353
 ] 

Robert Metzger commented on FLINK-27721:


Slack is storing all messages, but it is not showing them.
In our Slack instance, you see the following message:

bq. Messages and files older than 90 days are hidden. Upgrade to a paid plan to 
unlock your team’s full message and file history, plus all the premium features 
of the Pro plan.

Slack is hiding messages, not deleting them.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-18 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17619348#comment-17619348
 ] 

Chesnay Schepler commented on FLINK-27721:
--

> I agree that linen is a fairly new service, but since Slack is storing all 
> data for us, the risk is mostly around losing URLs once people start 
> linking into the linen.dev archive.

I thought Slack doesn't since we're on a free plan?



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-11 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615843#comment-17615843
 ] 

Robert Metzger commented on FLINK-27721:


Messages from all public channels in our Slack can actually be exported from 
the admin interface:
 !screenshot-1.png! 
From my understanding, the export contains all messages, even those messages 
not visible through the UI on the free plan.
So Slack stores all data for us.
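
One way to double-check this against an export is sketched below. It assumes the export has been unzipped into a directory with one sub-directory per channel and one JSON file per day, each holding a list of messages with a `ts` field in epoch seconds. The function names and the 90-day window parameter are illustrative and not part of any Slack tooling.

```python
import json
from datetime import datetime, timedelta, timezone
from pathlib import Path


def load_export_messages(export_dir):
    """Yield (channel, message) pairs from an unzipped export (assumed layout)."""
    # Assumption: <export_dir>/<channel>/<YYYY-MM-DD>.json, each a JSON list.
    for day_file in sorted(Path(export_dir).glob("*/*.json")):
        for message in json.loads(day_file.read_text(encoding="utf-8")):
            yield day_file.parent.name, message


def count_hidden(pairs, now, visible_for=timedelta(days=90)):
    """Count exported messages that are older than the UI visibility window."""
    cutoff = (now - visible_for).timestamp()
    return sum(1 for _, message in pairs if float(message["ts"]) < cutoff)
```

If `count_hidden` returns a non-zero number for a fresh export, the export contains messages that the free-plan UI no longer shows.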

I agree that linen is a fairly new service, but since Slack is storing all data 
for us, the risk is mostly around losing URLs once people start linking into 
the linen.dev archive.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-10 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615134#comment-17615134
 ] 

Chesnay Schepler commented on FLINK-27721:
--

I'd still like a clarification about whether it stores all data and whether all 
data can be exported on demand.
I don't want to end up in a situation where everything is lost because they go 
down / mess up.
In particular since this service appears to be relatively new, and is 
(apparently) only backed by a few people.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-10 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615069#comment-17615069
 ] 

Robert Metzger commented on FLINK-27721:


Okay, unless somebody objects by tomorrow (or I receive unexpected input 
from linen), I'll open a PR for updating the website with the link to linen.dev 
and a disclaimer regarding privacy.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-10 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615043#comment-17615043
 ] 

Martijn Visser commented on FLINK-27721:


bq. Based on my understanding, the ASF PP is only covering data usage on ASF 
premises. The Apache Flink Slack is hosted by Slack Inc, so users signing up 
for the Flink Slack have to accept / agree to Slack's PP, not the ASF PP. 

Given that we point from the ASF project website to this Slack instance, that's 
probably a grey zone. For certain third party integrations (like having a 3rd 
party search on an ASF project website) the ASF Privacy actually looked at the 
privacy guarantees from such a provider. 

I think we could indeed resolve that by making it explicit in the documentation 
where they can retrieve the invite link from. 



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-10 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615018#comment-17615018
 ] 

Robert Metzger commented on FLINK-27721:


Thank you all for the positive responses.

bq. Does it have a limit on how many messages / data it can store?

We are currently on the free plan, which offers "Unlimited message retention 
history".
I have exported all public messages from our Slack instance, and uploaded them 
to linen.dev. So the archive shows messages from day one, and it should sync 
automatically from now on.

bq. My biggest concern is related to the privacy statement / how will they earn 
money

I share that concern. They might at some point put up ads or limit the features 
of the free plan. I will ask them about this.

bq. Based on the currently used ASF Privacy Policy, that's not covered

Thanks for bringing this up. Based on my understanding, the ASF PP is only 
covering data usage on ASF premises. The Apache Flink Slack is hosted by Slack 
Inc, so users signing up for the Flink Slack have to accept / agree to Slack's 
PP, not the ASF PP.






[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-10 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17615002#comment-17615002
 ] 

Martijn Visser commented on FLINK-27721:


It looks really nice I must say. My biggest concern is related to the privacy 
statement / how will they earn money. By using their service, data of the Slack 
users will be shared with Linen. Based on the currently used ASF Privacy 
Policy, that's not covered (see 
https://privacy.apache.org/policies/privacy-policy-public.html). We should 
check whether it's OK to use a service such as theirs.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-10 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17614998#comment-17614998
 ] 

Chesnay Schepler commented on FLINK-27721:
--

> What else do you expect from an archive?

From the description on the page I'd conclude they just display what is 
_currently_ stored in the Slack instance in a searchable manner.
I'd expect an archive to store _all_ data, even if it's no longer available in 
slack or the slack instance was removed altogether.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-10 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17614992#comment-17614992
 ] 

Xintong Song commented on FLINK-27721:
--

This is awesome~! Thanks a lot, [~rmetzger].

I have only one question. Does it have a limit on how many messages / data it 
can store?



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-10 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17614970#comment-17614970
 ] 

Robert Metzger commented on FLINK-27721:


Yes, but in my understanding that was always the goal of our Flink Slack 
archive: to avoid repeated questions and make all the wisdom in Slack 
accessible. 
The linen.dev archive also has a search bar on top and allows users to browse 
the Slack w/o signing up for the community. What else do you expect from an 
archive?



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-10 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17614968#comment-17614968
 ] 

Chesnay Schepler commented on FLINK-27721:
--

Is that actually an _archive_ though? It _sounds_ more like a front to support 
searches via google.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-10-10 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17614959#comment-17614959
 ] 

Robert Metzger commented on FLINK-27721:


Hey, I've recently learned about linen.dev, which is a free tool to publicly 
archive Slack channels.
I've set it up for our workspace to try it out. WDYT? 
https://www.linen.dev/s/apache-flink



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-08-30 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17597773#comment-17597773
 ] 

Martijn Visser commented on FLINK-27721:


[~xtsong] Awesome, looking forward to the next steps



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-07-17 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17567803#comment-17567803
 ] 

Xintong Song commented on FLINK-27721:
--

Status updates:

Right now, I have something imperfect but workable. I probably won't have time 
to further improve it in the near future. Given that we are approaching the 10k 
messages limit very soon, I'll try to deploy the current version.

The known limitations are:
 # *Messages are not organized in threads in the frontend, making them hard for 
people to read.* This is the same limitation that 
[airflow|http://apache-airflow.slack-archives.org/] also has. Properties needed 
for grouping messages into threads are already captured in the database. All we 
need is to improve the way the messages are displayed.
 # *It's not realtime.* Slack's new Events API never worked for me. So I went 
for an approach that periodically fetches the messages, with a configurable 
interval (default 1h). Consequently, new messages may take up to 1 hour to 
appear in the archive, which is probably fine because they can be searched in 
Slack anyway.
 # *It's unlikely, but still possible, to lose messages.* With Slack's 
Conversations API, we need to first retrieve parent messages that are sent 
directly to the channel, then for each of them retrieve the threaded messages 
replying to it. That means for an already retrieved thread, we cannot know 
whether there are new replies to it without trying to retrieve it again. 
Moreover, the API has a ~50/min rate limit, so we probably should not 
frequently retrieve replies for all threads. My current approach is to only 
retrieve new messages for threads started within the last 30 days 
(configurable). That means new replies to a thread started more than 30 days 
ago can be lost, which I'd expect to be very rare.
 # *Backup is not automatic.* We can dump the database with one command, 
without interrupting the service. We just need to set up a cron job to trigger 
and handle the dumps (uploading & cleaning).
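
The thread-refresh trade-off in the third item can be sketched roughly as follows. This is not the slackarchive code: the function names, the dictionary of thread start times and the one-request-per-thread cost model are assumptions made for illustration. Only the 30-day window and the ~50/min limit come from the description above.

```python
from datetime import datetime, timedelta, timezone

# Illustrative defaults; both are described as configurable or approximate above.
REFRESH_WINDOW = timedelta(days=30)
RATE_LIMIT_PER_MIN = 50


def threads_to_refresh(thread_starts, now, window=REFRESH_WINDOW):
    """Return ids of threads whose parent message is no older than the window."""
    cutoff = now - window
    return [tid for tid, started in thread_starts.items() if started >= cutoff]


def min_sync_minutes(num_requests, per_minute=RATE_LIMIT_PER_MIN):
    """Lower bound on sync time, assuming one API request per thread."""
    return -(-num_requests // per_minute)  # ceiling division


now = datetime(2022, 7, 17, tzinfo=timezone.utc)
thread_starts = {
    "t-old": now - timedelta(days=45),    # replies to this thread would be missed
    "t-recent": now - timedelta(days=3),
    "t-edge": now - timedelta(days=30),   # exactly on the cutoff, still refreshed
}
selected = threads_to_refresh(thread_starts, now)
```

A wider window loses fewer late replies but costs more requests per sync, so under the rate limit the sync takes longer.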

Some numbers, FYI:
# [Slack Analytics|https://apache-flink.slack.com/admin/stats] shows we now 
have 9.1k total messages. In the last 30 days, only 31% of messages were sent in 
public channels, 67% in DMs and 1% in private channels.
# Slack archive captures public channel messages only. It captures 2.5k total 
messages, taking about 7~8 minutes on my laptop. The bottleneck is Slack's 
API rate limit.
# A full dump of the database, containing all the 2.5k messages, channel & user 
information, completes almost instantly. The dumped file is 3.7 MB.

I'll try to deploy the service next. Based on the numbers, I think a dedicated 
VM might not be necessary. So I'd try with the flink-packages host first. BTW, 
I have already backed up a dump of all public messages so far, so it shouldn't 
be a problem if the service is not deployed by the time the 10k limit is 
reached. 




[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-07-14 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566860#comment-17566860
 ] 

Chesnay Schepler commented on FLINK-27721:
--

What's the state on this effort?



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-06-28 Thread Robert Metzger (Jira)


Robert Metzger commented on FLINK-27721:


We can request VMs from infra like this: 
https://issues.apache.org/jira/browse/INFRA-22111. Thanks a lot for migrating 
slackarchive to the new APIs!!


[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-06-15 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554492#comment-17554492
 ] 

Xintong Song commented on FLINK-27721:
--

I didn't know there are ASF VMs available. How does that work? It would be 
wonderful if it is feasible to run this on ASF infrastructure.

Progress updates:
It turns out the slackarchive project uses some deprecated Slack APIs which are 
only available to legacy applications. I'm working on migrating them to the new 
APIs. This could take a bit more time, as I'm working on this only in my 
leisure time.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-06-15 Thread Chesnay Schepler (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17554458#comment-17554458
 ] 

Chesnay Schepler commented on FLINK-27721:
--

Would it make sense to run this on an ASF VM that other projects can also use?



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-06-02 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17545404#comment-17545404
 ] 

Robert Metzger commented on FLINK-27721:


flink-slack.org sounds good! Would be nice if you could register it!

> How does flink-packages back up databases?

It runs a cron job every day creating a dump of the database (just locally). 
Every now and then I downloaded a dump. The problem is that this server is 
owned by Ververica (in Google Cloud), so I don't have access to it anymore.

> Can we use the server which hosts https://flink-packages.org/ ? 

In principle yes. The server has very little resources, but upgrading it to a 
bigger machine should be simple.





[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-06-01 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17545256#comment-17545256
 ] 

Jark Wu commented on FLINK-27721:
-

Regarding the domain, I think we can try to buy the domain "flink-slack.org", 
which is available for now. 
"codespeed" or other domains do not sound official to me. 



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-06-01 Thread Jark Wu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17545253#comment-17545253
 ] 

Jark Wu commented on FLINK-27721:
-

Can we use the server which hosts https://flink-packages.org/ ? 
How does flink-packages back up databases? 



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-05-25 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542309#comment-17542309
 ] 

Xintong Song commented on FLINK-27721:
--

bq. Which server are we going to use for the archive?
I'd expect the load of the archive service to be light enough to share a machine 
with other services that we already hold, e.g., the codespeed. However, I'm a 
bit concerned that the archive service might have some random, unexplainable 
impacts on the benchmark results. Any suggestions?

bq. Which domain are we using?
Haven't really thought about this yet.

bq. It would be very nice if we would automatically create a weekly database 
dump that is downloadable in a public archive.
At the beginning, before the dumps grow too big, we may simply upload them to 
the wiki or GitHub.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542046#comment-17542046
 ] 

Robert Metzger commented on FLINK-27721:


I agree, let's not block the effort on the archive!

Which server are we going to use for the archive?
Which domain are we using?

It would be very nice if we would automatically create a weekly database dump 
that is downloadable in a public archive. So if something happens to the 
archive server or the entity maintaining it, somebody else can restore the 
archive.
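
A minimal sketch of what such a periodic dump job could look like, assuming a PostgreSQL database. The database name, the file naming scheme and the retention count are hypothetical; `pg_dump --format=custom --file=...` is the standard PostgreSQL client invocation, and uploading the dump to a public location is left out.

```python
from datetime import date, timedelta

# Hypothetical settings, chosen only for illustration.
DB_NAME = "slackarchive"
KEEP_DUMPS = 8


def dump_filename(day, db_name=DB_NAME):
    """Name dumps by ISO date so that sorting by name is sorting by age."""
    return f"{db_name}-{day.isoformat()}.dump"


def pg_dump_command(day, db_name=DB_NAME):
    """Argument list for a custom-format dump, suitable for subprocess.run."""
    return ["pg_dump", "--format=custom", f"--file={dump_filename(day, db_name)}", db_name]


def dumps_to_delete(filenames, keep=KEEP_DUMPS):
    """Return the oldest dump files, leaving the newest `keep` in place."""
    ordered = sorted(filenames)
    return ordered[:-keep] if keep > 0 else ordered
```

A weekly cron entry would run the command returned by `pg_dump_command(date.today())`, publish the resulting file, and then remove whatever `dumps_to_delete` returns.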



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-05-25 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17542028#comment-17542028
 ] 

Xintong Song commented on FLINK-27721:
--

No specific blockers. I'm just proceeding slower than expected, due to my other 
work. I'm currently working on two things:
1) I've managed to get all the dependencies settled manually. Working on 
automating them with scripts and docker.
2) It looks like we can generate the schemas from the model defined in the 
current code base, by adding a bit more code. For that, I also need 1) to make 
the build & deploy convenient.

BTW, I think we don't necessarily need to block publishing of the Slack on this 
ticket. Based on what I've learned so far, I'm quite confident we just need a 
bit more time to get the archive to work. And we do have quite some time before 
reaching the 10k messages limit.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541942#comment-17541942
 ] 

Robert Metzger commented on FLINK-27721:


What is blocking you from proceeding? (= what's the question you had to ask the 
maintainer?) Maybe I can help, or I know somebody?



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-05-25 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541915#comment-17541915
 ] 

Xintong Song commented on FLINK-27721:
--

I don't spot any security issues, but that might be because I don't know much 
about security.

Stability is probably not a big issue. It stores data in postgresql so we can 
easily back it up once in a while. It also seems to support migrating between 
different db backups, which I haven't tried yet.



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-05-25 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541891#comment-17541891
 ] 

Robert Metzger commented on FLINK-27721:


Thanks a lot for working on this!

I'm not surprised that there are some problems getting this up and running, 
since this is only used by one person.
What's your overall impression of the project? Do you think it is secure to run 
this project on the public internet (e.g. what's the risk that there are 
security issues somewhere, or stability issues that make us lose all the data 
accidentally)?



[jira] [Commented] (FLINK-27721) Slack: set up archive

2022-05-25 Thread Xintong Song (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-27721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17541889#comment-17541889
 ] 

Xintong Song commented on FLINK-27721:
--

Some updates on this one:
 * [~rmetzger] helped point me to the 
[ashb/slackarchive|https://github.com/ashb/slackarchive] repo, which is what 
Apache Airflow is using.
 * The instructions on how to use the repo are outdated. I've written to the 
owner for help, no response yet.
 * I've also looked into the code myself, trying to figure out how to get it 
to work. So far, I've got the project built and started, but am struggling 
with manually creating the required table schemas in postgresql. 
 * I've also tried the upstream project 
[dutchcoders/slackarchive|https://github.com/dutchcoders/slackarchive], which 
hasn't been updated for years, and I ran into a bunch of disabled API issues. 
It seems to me ashb/slackarchive is the only project that currently works.
