streaming job reading from kafka stuck while cancelling

2016-03-09 Thread Maciek Próchniak
Hi, from time to time when we cancel streaming jobs (or they are failing for some reason) we encounter: 2016-03-09 10:25:29,799 [Canceler for Source: read objects from topic: (...) ' did not react to cancelling signal, but is stuck in method: java.lang.Object.wait(Native Method)

Re: streaming job reading from kafka stuck while cancelling

2016-03-09 Thread Maciek Próchniak
9, 2016 at 1:59 PM, Maciek Próchniak <m...@touk.pl> wrote: Hi, from time to time when we cancel streaming jobs (or they are failing for some reason) we encounter: 2016-03-09 10:25:29,799 [Canceler for Source: read ob

rebalance of streaming job after taskManager restart

2016-03-08 Thread Maciek Próchniak
Hi, we have a streaming job with parallelism 2 and two task managers. The job is occupying one slot on each task manager. When I stop manager2 the job is restarted and it runs on manager1 - occupying two of its slots. How can I trigger a restart (or other similar process) that will cause the job

Re: rebalance of streaming job after taskManager restart

2016-03-08 Thread Maciek Próchniak
cel bin/flink run -s … The last command is your usual run command but with the additional “-s” parameter to continue from a savepoint. I hope that helps. Cheers, Aljoscha On 08 Mar 2016, at 15:48, Maciek Próchniak <m...@touk.pl> wrote: Hi, we have streaming job with paraleli

Re: Threads waiting on LocalBufferPool

2016-04-22 Thread Maciek Próchniak
, 21 Apr 2016 at 16:41 Maciek Próchniak <m...@touk.pl> wrote: Well... I found some time to look at RocksDB performance. It takes around 0.4ms to look up value state and 0.12ms to update - these are means, 95th percentile was > 1ms for ge

checkpoints not being removed from HDFS

2016-05-12 Thread Maciek Próchniak
Hi, we have a stream job with quite large state (a few GB), we're using FsStateBackend and we're storing checkpoints in HDFS. What we observe is that very often old checkpoints are not discarded properly. In the Hadoop logs I can see: 2016-05-10 12:21:06,559 INFO BlockStateChange: BLOCK*

Re: checkpoints not being removed from HDFS

2016-05-13 Thread Maciek Próchniak
ion and thanks for help thanks, maciek On 12/05/2016 21:28, Maciek Próchniak wrote: thanks, I'll try to reproduce it in some test by myself... maciek On 12/05/2016 18:39, Ufuk Celebi wrote: The issue is here: https://issues.apache.org/jira/browse/FLINK-3902 (My "explanation" before

Re: Threads waiting on LocalBufferPool

2016-04-21 Thread Maciek Próchniak
work with the FileSystemStateBackend, which keeps state in memory (on-heap) and writes checkpoints to files. This would help in checking how much RocksDB is slowing things down. I'm curious about the results. Do you think you will have time to try this? – Ufuk On Wed, Apr 20, 2016 at 3:45 PM, Maciek Próchnia

Re: Threads waiting on LocalBufferPool

2016-04-21 Thread Maciek Próchniak
iek On 21/04/2016 08:41, Maciek Próchniak wrote: Hi Ufuk, thanks for quick reply. Actually I had a little time to try both things. 1) helped only temporarily - it just took a bit longer to saturate the pool. After few minutes, periodically all kafka threads were waiting for bufferPool

Enriching events with data from external http resources

2016-08-15 Thread Maciek Próchniak
Hi, Our data streams do some filtering based on data from external http resources (not maintained by us, they're really fast with redis as storage). So far we did that by just invoking synchronously some http client in map/flatMap operations. It works without errors but it seems somehow

Re: Graphite reporter recover from broken pipe

2017-02-01 Thread Maciek Próchniak
Starting with flink 1.2 it's possible to use UDP transport for graphite - I think it can be good workaround if you can listen on UDP port on your graphite installation thanks, maciek On 01/02/2017 13:22, Philipp Bussche wrote: Hi there, after moving my graphite service to another host my

accessing flink HA cluster with scala shell/zeppelin notebook

2017-01-22 Thread Maciek Próchniak
Hi, I have standalone Flink cluster configured with HA setting (i.e. with zookeeper recovery). How should I access it remotely, e.g. with Zeppelin notebook or scala shell? There are settings for host/port, but with HA setting they are not fixed - if I check which is *current leader* host

window-like use case

2016-09-23 Thread Maciek Próchniak
Hi, in our project we're dealing with a stream of billing events. Each has customerId and charge amount. We want to have a process that will trigger an event (alarm) when the sum of charges for a customer during the last 4 hours exceeds a certain threshold, say - 10. The triggered event should contain data
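
The rule described in this message can be sketched in plain Python, independent of any Flink API: keep each customer's charges from the last 4 hours and report an alarm when their sum exceeds the threshold. The class name `ChargeAlarm`, the `(timestamp, amount)` event shape, the use of seconds for timestamps, and the assumption that events arrive in timestamp order per customer are all illustrative choices, not taken from the thread; only the 4-hour span and the example threshold of 10 come from the message.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 4 * 60 * 60  # "last 4 hours" from the message
THRESHOLD = 10                # example threshold from the message


class ChargeAlarm:
    """Hypothetical helper: per-customer sliding sum of charges."""

    def __init__(self, window=WINDOW_SECONDS, threshold=THRESHOLD):
        self.window = window
        self.threshold = threshold
        self.events = defaultdict(deque)  # customer_id -> deque of (ts, amount)
        self.totals = defaultdict(float)  # customer_id -> running sum in window

    def on_event(self, customer_id, ts, amount):
        """Record a charge; return the contributing charges if the
        windowed sum exceeds the threshold, otherwise None."""
        q = self.events[customer_id]
        q.append((ts, amount))
        self.totals[customer_id] += amount
        # Evict charges that fall outside the window ending at this event.
        while q and q[0][0] <= ts - self.window:
            _, old_amount = q.popleft()
            self.totals[customer_id] -= old_amount
        if self.totals[customer_id] > self.threshold:
            return list(q)
        return None
```

Returning the list of contributing charges is one way to let the triggered alarm carry the data that caused it; in a real streaming job the per-customer deque would have to live in keyed, checkpointed state rather than in a plain dictionary.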

Re: Enriching events with data from external http resources

2016-08-17 Thread Maciek Próchniak
Hi Ufuk, thanks for info - this is good news :) maciek On 16/08/2016 12:16, Ufuk Celebi wrote: On Mon, Aug 15, 2016 at 8:52 PM, Maciek Próchniak <m...@touk.pl> wrote: I know it's not really desired way of using flink and that it would be better to keep data as state inside stream an

Re: window-like use case

2016-11-04 Thread Maciek Próchniak
monitoring.html [2]: https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html -Original Message- From: Maciek Próchniak [mailto:m...@touk.pl] Sent: Friday, 23 September 2016 10:36 To: user@flink.apache.org

ContinuousFileMonitoringFunction - deleting file after processing

2016-10-18 Thread Maciek Próchniak
Hi, we want to monitor hdfs (or local) directory, read csv files that appear and after successful processing - delete them (mainly not to run out of disk space...) I'm not quite sure how to achieve it with current implementation. Previously, when we read binary data (unsplittable files) we

Re: ContinuousFileMonitoringFunction - deleting file after processing

2016-10-18 Thread Maciek Próchniak
kpoint was successfully performed and then we can purge the already processed files. This can be a good solution. Thanks, Kostas On Oct 18, 2016, at 9:40 AM, Maciek Próchniak <m...@touk.pl> wrote: Hi, we want to monitor hdfs (or local) directory, read csv files that appear and after successfu
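
The idea in this reply, purging files only once a checkpoint that covers them has completed, can be sketched as follows. This is plain Python with hypothetical names (`ProcessedFilePurger`, `file_processed`, `snapshot`, `checkpoint_complete`); it is not the ContinuousFileMonitoringFunction implementation, and it assumes checkpoint ids increase over time.

```python
import os


class ProcessedFilePurger:
    """Hypothetical helper: delete files only after a checkpoint
    covering them has completed."""

    def __init__(self):
        self.pending = []        # files processed since the last snapshot
        self.by_checkpoint = {}  # checkpoint_id -> files covered by it

    def file_processed(self, path):
        self.pending.append(path)

    def snapshot(self, checkpoint_id):
        # Everything processed so far is covered by this checkpoint.
        self.by_checkpoint[checkpoint_id] = self.pending
        self.pending = []

    def checkpoint_complete(self, checkpoint_id):
        # Purge files covered by this checkpoint and by any earlier one.
        purged = []
        covered = sorted(c for c in self.by_checkpoint if c <= checkpoint_id)
        for cid in covered:
            for path in self.by_checkpoint.pop(cid):
                if os.path.exists(path):
                    os.remove(path)
                purged.append(path)
        return purged
```

Deleting only on completion means a failure between processing and checkpointing leaves the file in place, so it can be read again after recovery.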

Re: ContinuousFileMonitoringFunction - deleting file after processing

2016-11-26 Thread Maciek Próchniak
rce that a given checkpoint was successfully performed and then we can purge the already processed files. This can be a good solution. Thanks, Kostas On Oct 18, 2016, at 9:40 AM, Maciek Próchniak <m...@touk.pl> wrote: Hi, we want to monitor hdfs (or local) directory, read csv files

Running job in "dry mode"?

2017-06-06 Thread Maciek Próchniak
Hello, I'd like to be able to see if new version of my job is compatible with the old one. I can make a savepoint and run new version from that, but I'd like to be able to do it without actually starting sources and so on - so that e.g. it won't start to read from my kafka topics. Of

Re: Running job in "dry mode"?

2017-06-07 Thread Maciek Próchniak
On 07/06/2017 10:27, Maciek Próchniak wrote: On 07/06/2017 10:07, Tzu-Li (Gordon) Tai wrote: Hi Maciek, Is there any particular reason why you do not wish to start running the Kafka sources on the test run? Otherwise, it would be perfectly fine to start the test job for testing to see

ANN: TouK Nussknacker - designing Flink processes with GUI

2017-09-04 Thread Maciek Próchniak
easier-for-analysts-and-business/ I'll also be talking about Nussknacker next week at Flink Forward - https://berlin.flink-forward.org/kb_sessions/touk-nussknacker-creating-flink-jobs-with-gui/ - hope to see you there :) thanks, maciek próchniak TouK

Re: ANN: TouK Nussknacker - designing Flink processes with GUI

2017-09-04 Thread Maciek Próchniak
this nice tool maciek. Does it handle both batch and Streaming? Is it able to visualize also an existing Flink program? Best, Flavio On Mon, Sep 4, 2017 at 3:03 PM, Maciek Próchniak <m...@touk.pl> wrote: Hello, we would like to announce availabi

Re: Checkpoint was declined (tasks not ready)

2017-10-23 Thread Maciek Próchniak
>(Task.java:702) at java.lang.Thread.run(Thread.java:745) On 23/10/2017 13:54, Maciek Próchniak wrote: we also have similar problem - it happens really often when we invoke async operators (ordered one). But we also observe that job is not starting properly

Re: Checkpoint was declined (tasks not ready)

2017-10-23 Thread Maciek Próchniak
we also have similar problem - it happens really often when we invoke async operators (ordered one). But we also observe that job is not starting properly - we don't process any data when such problems appear we'll keep you posted if we manage to find exact cause... thanks, maciek On

Re: flowable <-> flink integration

2018-01-18 Thread Maciek Próchniak
Hi Martin, I did some Activiti development so your mail caught my attention :) I don't think I understand what you are trying to achieve - where is the process you're simulating, where is the simulation running and where is the place for Flink. Do you want to invoke Flink (batch job I suppose?) from

queryable state API

2018-02-01 Thread Maciek Próchniak
Hello, Currently (1.4), to be able to use queryable state, the client has to know the IP and port of a (working) task manager. This is a bit awkward - as it forces external services to know details of the Flink cluster. It gets even more complex when we define a port range for the queryable state proxy and we're not sure

Conversion of Table (Blink/batch) to DataStream

2020-04-04 Thread Maciek Próchniak
Hello, I'm playing around with Table/SQL API (Flink 1.9/1.10) and I was wondering how I can do the following: 1. read batch data (e.g. from files) 2. sort them using Table/SQL SortOperator 3. perform further operations using "normal" DataStream API (treating my batch as finite stream) - to

Re: Dynamic Flink SQL

2020-04-04 Thread Maciek Próchniak
Hi Krzysiek, the idea is quite interesting - although maintaining some coordination to be able to handle checkpoints would probably be pretty tricky. Did you figure out how to handle proper distribution of tasks between TMs? As far as I understand you have to guarantee that all sources reading

Re: Conversion of Table (Blink/batch) to DataStream

2020-04-05 Thread Maciek Próchniak
]: https://github.com/apache/flink/blob/master/flink-table/flink-sql-client/src/main/java/org/apache/flink/table/client/gateway/local/ExecutionContext.java#L419 On Sat, 4 Apr 2020 at 15:42, Maciek Próchniak <m...@touk.pl> wrote: Hello, I'm playing around with Tabl

Re: Partitioned tables in SQL client configuration.

2020-12-03 Thread Maciek Próchniak
. Cheers, Till On Tue, Dec 1, 2020 at 6:51 PM Maciek Próchniak <m...@touk.pl> wrote: Hello, I try to configure SQL Client to query partitioned ORC data on local filesystem. I have directory structure like that: /tmp/table1/startdate=

Partitioned tables in SQL client configuration.

2020-12-01 Thread Maciek Próchniak
Hello, I try to configure SQL Client to query partitioned ORC data on local filesystem. I have directory structure like that: /tmp/table1/startdate=2020-11-28 /tmp/table1/startdate=2020-11-27 etc. If I run SQL Client session and create table by hand: create table tst (column1 string,
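
The directory layout in this message follows the `key=value` convention for partition directories. As a small illustration of how partition values can be read back from such paths, here is a sketch in plain Python; the function name `partition_values` and the file name `part-0.orc` in the usage example are hypothetical, and this is not how the SQL Client itself discovers partitions.

```python
from pathlib import PurePosixPath


def partition_values(path, table_root):
    """Return {key: value} for each key=value component below table_root."""
    rel = PurePosixPath(path).relative_to(PurePosixPath(table_root))
    values = {}
    for part in rel.parts:
        key, sep, value = part.partition("=")
        if sep:  # skip components without '=' (plain directories, file names)
            values[key] = value
    return values


# Usage with the directories from the message:
print(partition_values("/tmp/table1/startdate=2020-11-28", "/tmp/table1"))
print(partition_values("/tmp/table1/startdate=2020-11-27/part-0.orc", "/tmp/table1"))
```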

JobManager seems to be leaking temporary jar files

2021-01-25 Thread Maciek Próchniak
Hello, in our setup we have: - Flink 1.11.2 - job submission via REST API (first we upload jar, then we submit multiple jobs with it) - additional jars embedded in lib directory of main jar (this is crucial part) When we submit jobs this way, Flink creates new temp jar files via

Re: JobManager seems to be leaking temporary jar files

2021-01-26 Thread Maciek Próchniak
https://github.com/apache/flink/blob/2c4e0ab921ccfaf003073ee50faeae4d4e4f4c93/flink-runtime/src/main/java/org/apache/flink/runtime/dispatcher/Dispatcher.java#L797 On Mon, Jan 25, 2021 at 8:37 PM Maciek Próchniak <

Re: JobManager seems to be leaking temporary jar files

2021-01-27 Thread Maciek Próchniak
probably JarRunHandler#handleRequest, within handle after the job was run. A similar issue also exists in the JarPlanHandler. I've opened https://issues.apache.org/jira/browse/FLINK-21164 to fix this issue. On 1/26/2021 12:21 PM, Maciek Próchniak wrote: Hi Matthias, I think the problem lies

Re: Future of QueryableState

2021-03-10 Thread Maciek Próchniak
f life" in the sense that there is no active development on that component anymore. Best, Konstantin On Tue, Mar 9, 2021 at 7:08 AM Maciek Próchniak <m...@touk.pl> wrote: Hello, We are using QueryableState in some of Nussknacker deployments as a nice a

Future of QueryableState

2021-03-08 Thread Maciek Próchniak
Hello, We are using QueryableState in some of Nussknacker deployments as a nice addition, allowing end users to peek inside job state for a given key (we mostly use custom operators). Judging by mailing list and feature radar proposition by Stephan:

Re: 回复: period batch job lead to OutOfMemoryError: Metaspace problem

2021-04-08 Thread Maciek Próchniak
Hi, Did you put the clickhouse JDBC driver on Flink main classpath (in lib folder) and not in user-jar - as described here: https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/debugging/debugging_classloading.html#unloading-of-dynamically-loaded-classes-in-user-code? When we

Re: Flink does not cleanup some disk memory after submitting jar over rest

2021-04-08 Thread Maciek Próchniak
Hi, don't know if this is the problem you're facing, but some time ago we encountered two issues connected to REST API and increased disk usage after each submission: https://issues.apache.org/jira/browse/FLINK-21164 https://issues.apache.org/jira/browse/FLINK-9844 - they're closed ATM,

Re: 回复: period batch job lead to OutOfMemoryError: Metaspace problem

2021-04-09 Thread Maciek Próchniak
-config, but the problem still exists. Is there a lightweight way to put the clickhouse JDBC driver in the Flink lib/ folder? -- Original message ------ *From:* "Maciek Próchniak" <m...@touk.pl>; *Sent:* 9 April 2021 (Friday) 3:24 AM *To:* "太平洋"<

Flink 1.11.4?

2021-04-12 Thread Maciek Próchniak
Hello, I'd like to ask if there are any plans to release 1.11.4 - I understand it will be last bugfix release for 1.11.x branch, as 1.13.0 is "just round the corner"? There are a few fixes we'd like to use - e.g. https://issues.apache.org/jira/browse/FLINK-9844,

Kafka connector depending on Table API

2021-08-31 Thread Maciek Próchniak
Hello, we are testing 1.14 RC0 and we discovered that we need to include table-api as dependency when using kafka connector - e.g. due to this change:

Re: Slow Tests in Flink 1.15

2022-09-09 Thread Maciek Próchniak
Hi, we also had similar problems in Nussknacker recently (tests on fake sources), my colleague found out it's due to ENABLE_CHECKPOINTS_AFTER_FINISH flag

Re: [DISCUSS] FLIP-265 Deprecate and remove Scala API support

2022-10-05 Thread Maciek Próchniak
Hi Martin, Could you please remind me what the conclusion of the discussion on upgrading Scala to 2.12.15/16 was? https://lists.apache.org/thread/hwksnsqyg7n3djymo7m1s7loymxxbc3t - I couldn't find any follow-up vote. If it's acceptable to break binary compatibility by such an upgrade, then

Is Flink 1.15.3 planned in foreseeable future?

2022-10-13 Thread Maciek Próchniak
Hello, I suppose that committers are heavily concentrated on 1.16, but are there plans to have 1.15.3 out? We've been affected by https://issues.apache.org/jira/browse/FLINK-28488 and it's preventing us from using 1.15.x at this moment. thanks, maciek