Dear all,

The chat records of the WeChat group "Apache Linkis (incubating) community development group" are as follows:
————— 2022-11-24 —————
The tree. 16:34  Do you have any documentation or examples of doing ETL with Scala scripts or Spark/Flink SQL?
Mr. Flash 16:36  The DSS docs contain some test cases.
Sargent Ti 16:36  I think it's a Hong Kong company. It also supports an SDK-embedded mode; we wanted to work with them on our earlier data lineage tool, but later gave up because of license and other concerns.
Mr. Flash 16:36  The Flink website has a Flink CDC use case.
The tree. 16:38  I couldn't find the DSS one.
Mr. Flash 16:38  https://github.com/WeBankFinTech/DataSphereStudio-Doc
Mr. Flash 16:38  Take a look at this.
The tree. 16:41  I know that address, but I couldn't find an example of ETL implemented through Scriptis. [split]
Mr. Flash 16:42  You can write 10 statements in a row...
The tree. 16:43  This is how I wrote the Scala program. Can it be used like this?
She said 16:44  @utopianet_广银信用卡_张华金 What about different data types?
bao洋 16:44  A SparkContext is built in.
W 16:44  Use it just like in spark-shell; you don't need to write a main method.
The tree. 16:45  I'll try again. Thank you, everyone.
Mr. Flash 16:45  @She said For different types, you can try engines like Presto, Trino, openLooKeng, etc.
r@FY2 16:46  @The tree. At its simplest, a few SQL statements can complete the ETL.
The tree. 16:47  @r@FY2 [ThumbsUp][ThumbsUp][ThumbsUp]
The tree. 16:48  So this kind can only be a Flink script, right?
r@FY2 16:48  Flink programs definitely have to be executed through fql.
Mr. Flash 16:49  You can also do this with Spark.
r@FY2 16:49  SQL is the simplest ETL; it saves time and effort.
The tree. 16:49  OK, I'll try it. Thank you, everyone.
The tree. 16:50  I think I heard earlier in our community about doing ETL with JSON. Which version is that in? And which is more convenient, that or writing SQL?
Mr. Flash 16:51  The configuration style is suitable for large volumes of data processing jobs.
Mr. Flash 16:51  Writing SQL is suitable for complex processing, such as metric processing.
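[Editor's note] The exchange above (a built-in SparkContext, used like spark-shell, no main method, a few SQL statements for the ETL) can be sketched roughly as follows. This is an illustrative sketch only, not an official Scriptis example: the table and column names are made up, and it assumes Scriptis pre-creates `spark`/`sc` the way the messages describe.

```scala
// Hypothetical Scriptis Spark Scala script (sketch; table/column names made up).
// In Scriptis, `spark` (SparkSession) and `sc` (SparkContext) are said to be
// pre-created, as in spark-shell, so no main method or session setup is needed.

// Extract: read a source table registered in the metastore.
val orders = spark.table("ods.orders")

// Transform: plain SQL over a temp view, as suggested in the chat.
orders.createOrReplaceTempView("orders")
val daily = spark.sql(
  """SELECT order_date, SUM(amount) AS total_amount
    |FROM orders
    |GROUP BY order_date""".stripMargin)

// Load: write the result back as a warehouse table.
daily.write.mode("overwrite").saveAsTable("dw.daily_order_total")
```

The snippet only runs inside an environment that provides the `spark` session (Scriptis or spark-shell); outside one you would build the SparkSession yourself.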
peacewong@WDS 16:52  In version 1.3.2; this PR: https://github.com/apache/incubator-linkis/pull/3715
She said 16:52  Then there's not much difference from the previous Presto + Azkaban setup, right?
The tree. 16:52  So anything that needs computation is better written as SQL, and the JSON configuration is just for data synchronization. Is that right?
The tree. 16:52  Good.
Mr. Flash 16:53  For example, SeaTunnel uses configuration files...
Mr. Flash 16:55  @She said Right, not much difference, except that here you can add resource control, security auditing, workflow integration, and tag routing; that is what Linkis does.

--
Best Regards
------
康悦 ritakang
GitHub: Ritakang0451
E-mail: rita0...@163.com