Re: Issues with Flink Batch and Hadoop dependency

2020-08-29 Thread Dan Hill
I was able to get a basic version working by including a bunch of Hadoop and S3 dependencies in the job jar and hacking in some Hadoop config values. It's probably not optimal, but it looks like I'm unblocked. On Fri, Aug 28, 2020 at 12:11 PM Dan Hill wrote: > I'm assuming I have a simple,

Re: Flink not outputting windows before all data is seen

2020-08-29 Thread David Anderson
Teodor, This happens because of how readTextFile works when it executes in parallel: it divides the input file into a number of splits, which are consumed in parallel. As a result, the watermark isn't able to move forward until much or perhaps all of the file
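A rough illustration of the effect described above, as a toy model rather than Flink code. It assumes a time-ordered file is cut into contiguous splits, each split is read by its own parallel reader at the same pace, each reader's watermark is the largest timestamp it has emitted so far, and the downstream watermark is the minimum over all readers. The names `split_evenly` and `downstream_watermarks` are made up for this sketch.

```python
def split_evenly(timestamps, n_splits):
    """Cut a time-ordered list into contiguous splits of (almost) equal size."""
    size = -(-len(timestamps) // n_splits)  # ceiling division
    return [timestamps[i:i + size] for i in range(0, len(timestamps), size)]


def downstream_watermarks(splits):
    """Read all splits in lockstep and record the downstream watermark per step.

    Toy assumptions (not Flink internals): one reader per split, a reader's
    watermark is the largest timestamp it has emitted, and the downstream
    watermark is the minimum over all readers.
    """
    steps = max(len(s) for s in splits)
    result = []
    for step in range(steps):
        # A reader that has finished its split stays at its last timestamp.
        per_reader = [s[min(step, len(s) - 1)] for s in splits]
        result.append(min(per_reader))
    return result


timestamps = list(range(100))  # event times 0..99, in file order
parallel = downstream_watermarks(split_evenly(timestamps, 4))
serial = downstream_watermarks(split_evenly(timestamps, 1))

# Records consumed (across all readers) when the watermark first reaches
# event time 24, i.e. when a window covering [0, 25) could fire.
fired_parallel = 4 * (next(i for i, w in enumerate(parallel) if w >= 24) + 1)
fired_serial = next(i for i, w in enumerate(serial) if w >= 24) + 1
print(fired_parallel, fired_serial)
```

In this model the first window can only fire after all 100 records have been consumed with four readers, versus 25 records with a single reader.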

Re: FileSystemHaServices and BlobStore

2020-08-29 Thread Alexey Trenikhun
I tested a streaming job with FileSystemHaService using VoidBlobStore (no HA Blob), and the job was able to recover from both a JM restart and a TM restart. Any idea in which use cases HA Blob is needed? Thanks, Alexey From: Alexey Trenikhun Sent: Friday,

Re: flink1.11 time functions

2020-08-29 Thread Leonard Xu
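A quick addition: this may be a misunderstanding caused by how the word "function" was translated. There is no such thing as a deterministic or non-deterministic feature; every "function" in that document should be read as a function in the programming sense, and the note is about whether a function's return value is deterministic or non-deterministic. Best, Leonard > On Aug 29, 2020, at 18:22, Dream-底限 wrote: > > Oh, OK. When I used NOW yesterday, it failed outright with an error telling me this was a bug and asking me to file an issue, so I assumed everything marked this way had a functional problem. > > Benchao Li wrote on Fri, Aug 28, 2020 at 8:01 PM: > >> "Non-deterministic" means that the function's return value is dynamic and may differ on each call. >>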

Re: flink sql computed columns do not support comment

2020-08-29 Thread Leonard Xu
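Hi, sllence This is a bug; it looks like comment parsing was missed when support for computed columns was added. I opened an issue to fix it [1]. Best, Leonard [1] https://issues.apache.org/jira/browse/FLINK-19092 > On Aug 29, 2020, at 13:37, > wrote: > > Flink version: 1.11.1 > > > > The official documentation defines it as follows: > > : > > column_name AS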

Flink not outputting windows before all data is seen

2020-08-29 Thread Teodor Spæren
Hey! Second time posting to a mailing list, let's hope I'm doing this correctly :) My use case is to take data from the mediawiki dumps and stream it into Flink via the `readTextFile` method. The dumps are TSV files with an event per line; each event has a timestamp and a type. I want to

Re: PyFlink cluster runtime issue

2020-08-29 Thread Manas Kale
Ok, thank you! On Sat, 29 Aug, 2020, 4:07 pm Xingbo Huang, wrote: > Hi Manas, > > We can't submit a pyflink job through flink web currently. The only way > currently to submit a pyFlink job is through the command line. > > Best, > Xingbo > > Manas Kale wrote on Sat, Aug 29, 2020 at 12:51 PM: > >> Hi Xingbo,

Re: Flink OnCheckpointRollingPolicy streamingfilesink

2020-08-29 Thread Andrey Zagrebin
Hi Vijay, I would apply the same judgement. It is latency vs throughput vs spent resources vs practical need. The more concurrent checkpoints your system can handle, the better the end-to-end result latency you will observe, and the more frequently you will see computation results. On the other hand

Re: PyFlink cluster runtime issue

2020-08-29 Thread Xingbo Huang
Hi Manas, We can't currently submit a PyFlink job through the Flink web UI. The only way to submit a PyFlink job at the moment is through the command line. Best, Xingbo Manas Kale wrote on Sat, Aug 29, 2020 at 12:51 PM: > Hi Xingbo, > Thanks, that worked. Just to make sure, the only way currently to submit a > pyFlink

Re: flink1.11 time functions

2020-08-29 Thread Dream-底限
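Oh, OK. When I used NOW yesterday, it failed outright with an error telling me this was a bug and asking me to file an issue, so I assumed everything marked this way had a functional problem. Benchao Li wrote on Fri, Aug 28, 2020 at 8:01 PM: > "Non-deterministic" means that the function's return value is dynamic and may differ on each call. > The counterpart is a deterministic function; concat, for example, is deterministic: as long as the input is the same, its return value is always the same. > Whether a function is deterministic affects planning, for example whether expression reduction can be applied and whether expression results can be reused. > > Dream-底限 wrote on Fri, Aug 28, 2020 at 2:50 PM: > > > hi > > >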
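To make Benchao Li's point about planning concrete, here is a toy sketch, not Flink's planner. It assumes a tiny expression tree in which a call to a deterministic function with all-literal arguments can be reduced to a literal ahead of time, while a non-deterministic call has to stay a call that is evaluated at run time. `Call`, `FUNCTIONS`, `DETERMINISTIC`, and `reduce_expression` are names invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class Call:
    """A function call in a toy expression tree; anything else is a literal."""
    name: str
    args: Tuple = ()


# Hypothetical function registry for this sketch.
FUNCTIONS = {
    "concat": lambda *parts: "".join(parts),
    "now": lambda: datetime.now(),
}
DETERMINISTIC = {"concat"}  # same input always gives the same output


def reduce_expression(expr):
    """Fold deterministic calls on literals; leave everything else as a call."""
    if not isinstance(expr, Call):
        return expr  # already a literal
    args = tuple(reduce_expression(a) for a in expr.args)
    all_literal = not any(isinstance(a, Call) for a in args)
    if expr.name in DETERMINISTIC and all_literal:
        return FUNCTIONS[expr.name](*args)  # safe to evaluate once, up front
    return Call(expr.name, args)  # must be evaluated on every invocation


folded = reduce_expression(Call("concat", ("a", Call("concat", ("b", "c")))))
kept = reduce_expression(Call("now"))
mixed = reduce_expression(Call("concat", ("x", Call("now"))))
print(folded)
print(kept)
print(mixed)
```

Here the nested `concat` collapses to a single literal, whereas `now` and any expression that depends on it remain calls.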

??Flink??????????????

2020-08-29 Thread ????????
hi,all: ??demoflink? .

Re: How to set FlinkSQL parallelism

2020-08-29 Thread zilong xiao
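Setting parallelism for SQL operators is something you can implement yourself. We can discuss it privately; I happen to be working on this and it basically works now. JasonLee <17610775...@163.com> wrote on Sun, Aug 23, 2020 at 2:07 PM: > hi > For the checkpoint/savepoint question, take a look at this > https://mp.weixin.qq.com/s/Vl6_GsGeG0dK84p9H2Ld0Q > > > > -- > Sent from: http://apache-flink.147419.n8.nabble.com/ >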