Re: Streaming write to Hive

2019-09-05 Thread Qi Luo
Hi JingsongLee, Fantastic! We'll look into it. Thanks, Qi On Fri, Sep 6, 2019 at 10:52 AM JingsongLee wrote: > Hi luoqi: > > With partition support[1], I want to introduce a FileFormatSink to > cover streaming exactly-once and partition-related logic for flink > file connectors and hive connec

Re: Streaming write to Hive

2019-09-05 Thread JingsongLee
Hi luoqi: With partition support[1], I want to introduce a FileFormatSink to cover streaming exactly-once and partition-related logic for flink file connectors and hive connector. You can take a look. [1] https://docs.google.com/document/d/15R3vZ1R_pAHcvJkRx_CWleXgl08WL3k_ZpnWSdzP7GY/edit?usp=

Re: best practices on getting flink job logs from Hadoop history server?

2019-09-05 Thread Yang Wang
I think the best way to view the logs is the Flink history server. However, it only supports the jobGraph and exceptions. Maybe the Flink history server needs to be enhanced so that we could view logs just as if the cluster were still running. Best, Yang Yu Yang wrote on Friday, September 6, 2019 at 3:06 AM: > Hi Yun Tang & Zhu Z

Flink Kafka Connector

2019-09-05 Thread Vishwas Siravara
Hi guys, I am using the Flink connector for Kafka from 1.9.0. Here is my sbt dependency: "org.apache.flink" %% "flink-connector-kafka" % "1.9.0", When I check the log file I see that the Kafka version is 0.10.2.0. According to the docs, from 1.9.0 onwards the version should be 2.2.0. Why do I

HBase Connectors(sink)

2019-09-05 Thread Ni Yanchun
Hi all, I have found in the Flink source code that Flink can use HBase as a sink, but it does not appear in the official documentation. Does that mean the HBase sink is not ready for production use?

Re: Streaming write to Hive

2019-09-05 Thread Bowen Li
Hi, I'm not sure if there's one yet. Feel free to create one if not. On Wed, Sep 4, 2019 at 11:28 PM Qi Luo wrote: > Hi Bowen, > > Thank you for the information! Streaming write to Hive is a very common > use case for our users. Is there any open issue for this to which we can > try contributin

Re: best practices on getting flink job logs from Hadoop history server?

2019-09-05 Thread Yu Yang
Hi Yun Tang & Zhu Zhu, Thanks for the reply! With your current approach, we will still need to search the job manager log / yarn client log to find information on the job id/vertex id --> yarn container id mapping. I am wondering how we can propagate this kind of information to the Flink execution graph so

How to access a file with Flink application running on Mesos?

2019-09-05 Thread Felipe Gutierrez
Hi, I am running Flink on Mesos without DC/OS. My application has to read files from the file system. However, it has been deployed on Mesos, hence it uses its sandbox, and my application cannot find the file. How should I access the file from the file system with Flink running on Mesos? Here is th

Implementing CheckpointableInputFormat

2019-09-05 Thread Lu Niu
Hi team, I am implementing a custom InputFormat. Should I implement the CheckpointableInputFormat interface? If I don't, does that mean the whole job has to restart when a single task fails? I ask because I found that all InputFormats implement CheckpointableInputFormat, which confuses me. Thank you!

Re: Flink & Mesos don't launch Job and Task managers

2019-09-05 Thread Felipe Gutierrez
My bad. Flink allocates task managers dynamically. On Thu, Sep 5, 2019 at 5:24 PM Felipe Gutierrez < felipe.o.gutier...@gmail.com> wrote: > Hi, > > I am

Re: suggestion of FLINK-10868

2019-09-05 Thread Peter Huang
Hi Anyang, Thanks for raising it up. I think it is reasonable as what you requested is needed for batch. Let's wait for Till to give some more input. Best Regards Peter Huang On Thu, Sep 5, 2019 at 7:02 AM Anyang Hu wrote: > Hi Peter&Till: > > As commented in the issue >

Flink & Mesos don't launch Job and Task managers

2019-09-05 Thread Felipe Gutierrez
Hi, I am trying to run Flink with Mesos but without DC/OS [1]. My Mesos instance is working, and when I start Flink with the mesos-appmaster.sh script, Flink starts, however without any JobManager or TaskManager. What could be the problem? I receive no errors in the log file, only some warnings: 2

Kinesis connector and record aggregation

2019-09-05 Thread Yoandy Rodríguez
Hello everyone, I'm passing this property to my FlinkKinesisProducer (kinProp doesn't hold a value for AggregationEnabled yet) producerConfig.put("AggregationEnabled ", "false"); But records are still aggregated. I'm using Flink 1.6 with Amazon Kinesis Applications. Is there another way to disabl
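One thing worth checking before anything else: the key in the quoted snippet contains a trailing space ("AggregationEnabled "), which would not match the expected "AggregationEnabled" key, so the setting would be silently ignored. A minimal, Flink-free sketch of the intended config (the property name follows the Kinesis Producer Library convention; treat the exact key as an assumption for your connector version):

```java
import java.util.Properties;

public class KinesisConfigCheck {
    // Builds the producer config with record aggregation disabled.
    // Note: the key must be exactly "AggregationEnabled" -- a stray
    // trailing space in the key (as in the snippet above) would make
    // the producer ignore the setting.
    public static Properties producerConfig() {
        Properties config = new Properties();
        config.put("AggregationEnabled", "false");
        return config;
    }

    public static void main(String[] args) {
        System.out.println(producerConfig().getProperty("AggregationEnabled"));
    }
}
```

If the trailing space was only a typo in the email, also make sure the properties are set before the FlinkKinesisProducer is constructed, since the config is passed in at construction time.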

suggestion of FLINK-10868

2019-09-05 Thread Anyang Hu
Hi Peter & Till: As commented in the issue, we have introduced the FLINK-10868 patch (mainly for batch tasks) online. What do you think of the following two suggestions: 1) Parameter control time int

Re: Accessing pojo fields by name in flink

2019-09-05 Thread Steve Robert
Hi Frank, I think we have a similar case. In my case I need to be able to set the POJO from the outside to query a REST API. My strategy is to define my schema using JSON and convert the JSON data format to a Flink Row. So, for example, imagine your data is something like this: public stati
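To make the idea concrete without pulling in Flink dependencies, here is a tiny sketch of the schema-driven conversion described above: an ordered field list plays the role of the schema, and a parsed JSON object (represented here as a plain Map) is flattened into a positional row, which is the shape Flink's Row type uses. All names are illustrative:

```java
import java.util.List;
import java.util.Map;

public class JsonToRowSketch {
    // Flink-free sketch: given an ordered field list (the "schema"),
    // flatten a parsed JSON object into a positional row.
    // Missing fields simply become null slots.
    public static Object[] toRow(List<String> schema, Map<String, Object> json) {
        Object[] row = new Object[schema.size()];
        for (int i = 0; i < schema.size(); i++) {
            row[i] = json.get(schema.get(i));
        }
        return row;
    }
}
```

In a real job the Map would come from a JSON parser and the Object[] would be wrapped in org.apache.flink.types.Row with a matching TypeInformation.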

Re: TABLE API + DataStream outsourcing schema or Pojo?

2019-09-05 Thread Steve Robert
Hi Fabian, thank you for your answer. It is indeed the solution that I am currently testing. I use TypeInformation convert = JsonRowSchemaConverter.convert(JSON_SCHEMA); provided by flink-json, and pass the TypeInformation to the operator stream. It looks like it works :) With this solution m

Re: Exception when trying to change StreamingFileSink S3 bucket

2019-09-05 Thread Kostas Kloudas
Hi Sidhartha, Your explanation is correct. If you stopped the job with a savepoint and then you try to restore from that savepoint, then Flink will try to restore its state which is, of course, included in its old bucket. But new data will go to the new bucket. One solution is either to restart

Re: Wrong result of MATCH_RECOGNIZE clause

2019-09-05 Thread Dongwon Kim
Oops, I think I explained something wrong in the previous email. B means not A. Therefore, after the completed match, there must be no new partial match starting from there. There's nothing wrong with the implementation, but the example in [2] is wrong. Am I right? Best, Dongwon On Thu, Sep 5, 2

Wrong result of MATCH_RECOGNIZE clause

2019-09-05 Thread Dongwon Kim
Hi, I'm using Flink 1.9 and testing MATCH_RECOGNIZE by following [1]. While testing the query in [2] myself, I got a different result from [2]. The query result from [2] is as follows: symbol start_tstamp end_tstamp avgPrice =

RE: Join with slow changing dimensions/ streams

2019-09-05 Thread Hanan Yehudai
Thanks Fabian. Is there any advantage to using broadcast state vs. just a CoMap function on two connected streams? From: Fabian Hueske Sent: Thursday, September 5, 2019 12:59 PM To: Hanan Yehudai Cc: flink-u...@apache.org Subject: Re: Join with slow changing dimensions/ streams Hi, Flink doe

Re: Exception when trying to change StreamingFileSink S3 bucket

2019-09-05 Thread Fabian Hueske
Hi, Kostas (in CC) might be able to help. Best, Fabian Am Mi., 4. Sept. 2019 um 22:59 Uhr schrieb sidhartha saurav < sidsau...@gmail.com>: > Hi, > > Can someone suggest a workaround so that we do not get this issue while > changing the S3 bucket ? > > On Thu, Aug 22, 2019 at 4:24 PM sidhartha s

Re: TABLE API + DataStream outsourcing schema or Pojo?

2019-09-05 Thread Fabian Hueske
Hi Steve, Maybe you could implement a custom TableSource that queries the data from the rest API and converts the JSON directly into a Row data type. This would also avoid going through the DataStream API just for ingesting the data. Best, Fabian Am Mi., 4. Sept. 2019 um 15:57 Uhr schrieb Steve

Re: error in my job

2019-09-05 Thread Fabian Hueske
Hi, Are you getting this error repeatedly or was this a single time? If it's just a single-time error, it's probably caused by a task manager process that died for some reason (as suggested by the error message). You should have a look at the TM logs to see whether you can find something that would exp

Re: Logback on AWS EMR

2019-09-05 Thread Stephen Connolly
Answering our own question. From what we can see, all you need to do is tweak the /usr/lib/flink/conf and /usr/lib/flink/lib directories so that you remove the log4j.properties and have your logback.xml in conf and the required libraries in the lib directory (removing the log4j backend in place o

Re: understanding task manager logs

2019-09-05 Thread Fabian Hueske
Hi Vishwas, This is a log statement from Kafka [1]. Not sure when the AppInfoParser is created (the log message is written by the constructor). For Kafka versions > 1.0, I'd recommend the universal connector [2]. Not sure how well it works if producers and consumers have different versions. Mayb

Re: Window metadata removal

2019-09-05 Thread Fabian Hueske
Hi, A window needs to keep the data as long as it expects new data. This is clearly the case before the end time of the window was reached. If my window ends at 12:30, I want to wait (at least) until 12:30 before I remove any data, right? In case you expect some data to be late, you can configure

Re: Join with slow changing dimensions/ streams

2019-09-05 Thread Fabian Hueske
Hi, Flink does not have good support for mixing bounded and unbounded streams in its DataStream API yet. If the dimension table is static (and small enough), I'd use a RichMapFunction and load the table in the open() method into the heap. In this case, you'd probably need to restart the job (can b
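A stripped-down, Flink-free sketch of the pattern described above: load a small static dimension table once in an open()-style hook, then enrich each record in map(). In a real job this would be a RichMapFunction and open() would read the table from a file or database; the dimension data below is made up:

```java
import java.util.HashMap;
import java.util.Map;

public class DimensionEnricher {
    private Map<String, String> dimTable;

    // Plays the role of RichMapFunction.open(): called once per task
    // before any records are processed, so the table lives on the heap.
    public void open() {
        dimTable = new HashMap<>();
        dimTable.put("DE", "Germany");        // hypothetical dimension rows
        dimTable.put("US", "United States");
    }

    // Plays the role of map(): enrich each record with the dimension value.
    public String map(String countryCode) {
        return dimTable.getOrDefault(countryCode, "unknown");
    }
}
```

As noted in the reply, with a static table loaded this way, a change to the dimension data requires restarting the job so open() runs again.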

switch event time to process time

2019-09-05 Thread venn wu
Dear experts: My Flink job works with event time, but I have a problem: source events are very sparse in the early morning, so the current window can't finish and emit its result as expected. I found the next paragraph on the official website. How can I get the behavior described as “switches to using current processing
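The paragraph referred to above describes a periodic watermark generator that falls back to processing time when the source goes idle. The core decision can be sketched as a pure function (the threshold and names here are illustrative, not a Flink API):

```java
public class IdleAwareWatermarks {
    // Sketch of an idle-source watermark rule: while events keep arriving,
    // the watermark follows the max event timestamp seen so far; once the
    // source has been silent for longer than idleTimeoutMillis, the
    // watermark advances with processing time so windows can still fire.
    public static long watermark(long maxEventTimestamp,
                                 long lastEventWallClockMillis,
                                 long nowMillis,
                                 long idleTimeoutMillis) {
        if (nowMillis - lastEventWallClockMillis > idleTimeoutMillis) {
            // Source is idle: let processing time drive the watermark.
            return Math.max(maxEventTimestamp, nowMillis - idleTimeoutMillis);
        }
        return maxEventTimestamp;
    }
}
```

In a Flink 1.x job this logic would typically live in the getCurrentWatermark() method of an AssignerWithPeriodicWatermarks implementation.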

Accessing pojo fields by name in flink

2019-09-05 Thread Frank Wilson
Hi, So far I’ve been developing my flink pipelines using the datastream API. I have a pipeline that calculates windowed statistics on a given pojo field. Ideally I would like this field to be user configurable via a config file. To do this I would need to extract pojo fields by name. The Table API
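Two options worth noting here: the DataStream API in this Flink version already accepts POJO field names as string expressions in keyBy (e.g. keyBy("temperature")), and for anything beyond that, plain Java reflection can read a public field by a name taken from a config file. A small sketch with an illustrative POJO:

```java
import java.lang.reflect.Field;

public class PojoFieldAccess {
    // Example POJO with public fields, as Flink's POJO rules allow.
    public static class Sensor {
        public double temperature = 21.5;
        public long timestamp = 0L;
    }

    // Reads a public field by its (user-configurable) name via reflection.
    public static Object fieldByName(Object pojo, String name) throws Exception {
        Field f = pojo.getClass().getField(name);
        return f.get(pojo);
    }
}
```

For getter/setter-style POJOs, getDeclaredField plus setAccessible(true) (or the bean Introspector) would be needed instead of getField, which only sees public fields.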

Checkpoint size growing over time

2019-09-05 Thread Daniel Harper
Hi there, We are running a streaming application on Flink 1.5.2 with BEAM 2.7.0. We’ve noticed that the checkpoint size appears to be increasing at a slow, gradual rate (see screenshot) over the course of many months and are not certain as to why this is happening. We take a checkpoint every

Re: No field appear in Time Field name in Kibana

2019-09-05 Thread miki haiat
You need to define a date or time type in your Elasticsearch index mapping. It's not a Flink issue. On Wed, Sep 4, 2019 at 3:02 PM alaa wrote: > I try to run this application but there was a problem when configuring an index > pattern. > There was no field appearing in Time Field name in Kibana when I set in
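For reference, a minimal Elasticsearch mapping sketch (7.x-style syntax; the index and field names are made up) that gives Kibana a date field to select as the Time Field name:

```json
PUT my-flink-index
{
  "mappings": {
    "properties": {
      "event_time": { "type": "date" }
    }
  }
}
```

On 6.x the properties block would be nested under the document type, and the Flink sink must write event_time in a format the date type accepts (epoch millis or ISO-8601).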

[ANNOUNCE] Flink Forward training registration closes on September 30th

2019-09-05 Thread Fabian Hueske
Hi all, The registration for the Flink Forward Europe training sessions closes in four weeks. The training takes place in Berlin at October 7th and is followed by two days of talks by speakers from companies like Airbus, Goldman Sachs, Netflix, Pinterest, and Workday [1]. The following four train

Re: error in LocalStreamEnvironment

2019-09-05 Thread Zhu Zhu
I checked the demo and found that it is using a quite outdated Flink version (0.10.0), and you are trying to run it with Flink 1.7.2. That's why the NoClassDefFound error happens. I'd suggest you try the examples in the current Flink repo. References: https://ci.apache.org/projects/flink/flink-docs-release-1.9

Re: error in LocalStreamEnvironment

2019-09-05 Thread alaa
Thanks for your reply, but unfortunately this solution is not suitable. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/