Re: Latency Measurement

2017-07-17 Thread Chesnay Schepler
Hello, As for 1), my suspicion is that this is caused by chaining. If the map function is chained to the kafka source then the latency markers are always immediately forwarded, regardless of what your map function is doing. If the map function is indeed chained to the source, could you try

Re: Flink Jobs disappers

2017-07-08 Thread Chesnay Schepler
the task manager ran out of memory etc? I am using slots more than the available core , I hope compute is shared in round robin. Any pointers to tuning and HA setup will be greatly appreciated. Regards, Vijay Raajaa GS On Sat, Jul 8, 2017 at 12:04 PM, Chesnay Schepler <ches...@apache.

Re: Flink Jobs disappers

2017-07-08 Thread Chesnay Schepler
Hello, could you tell us a bit more about your setup? Which Flink version you're using, whether HA is enabled, whether this happens every time, etc. Regards, Chesnay On 06.07.2017 21:43, G.S.Vijay Raajaa wrote: Hi, I am using Flink Task manager and Job Manager as docker containers. Strangely,

Re: Flink shaded table API

2017-07-25 Thread Chesnay Schepler
This sounds similar to https://issues.apache.org/jira/browse/FLINK-6173. On 25.07.2017 13:07, nragon wrote: Let's see if I can sample this :P. First i'm reading from kafka. FlinkKafkaConsumer010 consumer = KafkaSource.consumer(this.zookeeper, this.sourceName, 5);

Re: a lot of connections in state "CLOSE_WAIT"

2017-07-25 Thread Chesnay Schepler
Hello, Could you tell us which browser you are using, including the version? (and maybe try out if the issue persists with a different one) Regards, Chesnay On 25.07.2017 05:20, XiangWei Huang wrote: hi, Sorry for replying so late. I have met this issue again and the list is constantly keep

Re: Custom Kryo serializer

2017-07-24 Thread Chesnay Schepler
Boris Lublinsky FDP Architect boris.lublin...@lightbend.com https://www.lightbend.com/ -- Forwarded message -- From: *Chesnay Schepler* <ches...@apache.org> Date: Wed, Jul 19, 2017 at 1:34

Re: Is that possible for flink to dynamically read and change configuration?

2017-07-24 Thread Chesnay Schepler
l and add the string to ListState. But when I tried to retrieve ListState and flatMap1, I got nothing. Thanks. Desheng Zhang On Jul 24, 2017, at 21:01, Chesnay Schepler <ches...@apache.org> wrote: Hello, That's an error in the documentation, only

Re: Custom Kryo serializer

2017-07-24 Thread Chesnay Schepler
s helps, Chesnay On 24.07.2017 16:31, Boris Lublinsky wrote: Thanks Chesnay, Can you, please, point me to any example? Boris Lublinsky FDP Architect boris.lublin...@lightbend.com https://www.lightbend.com/ On Jul 24, 2017, at 9:27 AM, Chesnay Schepl

Re: a lot of connections in state "CLOSE_WAIT"

2017-07-26 Thread Chesnay Schepler
2017 04:57, XiangWei Huang wrote: Hi, The browser I am using is Google Chrome with version 59.0.3071.115 and the issue persists when I tried Firefox. Regards, XiangWei On Jul 25, 2017, at 17:48, Chesnay Schepler <ches...@apache.org> wrote: Hello, Could you tell us which browser you are using, inclu

Re: Flink monitor rest API question

2017-07-19 Thread Chesnay Schepler
Hello, Looks like you stumbled upon a bug in our REST API and use a client that is stricter than others. I will create a JIRA for this. Regards, Chesnay On 19.07.2017 13:31, Will Du wrote: Hi folks, I am using a java rest client - unirest lib to GET from flink rest API to get a Job

Re: Custom Kryo serializer

2017-07-19 Thread Chesnay Schepler
Hello, I assume you're passing the class of your serializer in a StateDescriptor constructor. If so, you could add a breakpoint in StateDescriptor#initializeSerializerUnlessSet, and check what typeInfo is created and which serializer is created as a result. One thing you could try right

Re: Executing Flink server From IntelliJ

2017-07-19 Thread Chesnay Schepler
Hello, this problem is described in https://issues.apache.org/jira/browse/FLINK-6689. Basically, if you want to use the LocalFlinkMiniCluster you should use a TestStreamEnvironment instead. The RemoteStreamEnvironment only works with a proper Flink cluster. Regards, Chesnay On 14.07.2017

Re: Custom Kryo serializer

2017-07-19 Thread Chesnay Schepler
for this. Am I missing something? Boris Lublinsky FDP Architect boris.lublin...@lightbend.com https://www.lightbend.com/ -- Forwarded message -- From: *Chesnay Schepler* <ches...@apache.org> Date:

Re: Is that possible for flink to dynamically read and change configuration?

2017-07-24 Thread Chesnay Schepler
Hello, That's an error in the documentation, only the ValueStateDescriptor has a defaultValue constructor argument. Regards, Chesnay On 24.07.2017 14:56, ZalaCheung wrote: Hi Martin, Thanks for your advice. That’s really helpful. I am using the push scenario. I am now having some trouble

Re: a lot of connections in state "CLOSE_WAIT"

2017-07-03 Thread Chesnay Schepler
Hello, is this list constantly growing? If you reload the WebUI do they pile up again? My guess would be the watermarks display is overloading the metrics handler. If I remember correctly the WebUI keeps fetching the watermark metrics regardless of what page you're looking at. Things to

Re: Registering custom metrics does not work

2017-07-06 Thread Chesnay Schepler
Hello, Please provide more information as to how it is not working as expected. Does it throw an exception, log a warning, is the metric not registered at all, or does the value not change? On 06.07.2017 08:10, wyphao.2007 wrote: Hi, all I want to know element's latency before write to

Re: Registering custom metrics does not work

2017-07-06 Thread Chesnay Schepler
, "Chesnay Schepler" <ches...@apache.org> wrote: Hello, Please provide more information as to how it is not working as expected. Does it throw an exception, log a warning, is the metric not registered at all, or does the value not change?

Re: Invalid path exception

2017-08-01 Thread Chesnay Schepler
format.setFilePath("file:///c:/proj/test/a.txt.txt"); On Sun, Jul 30, 2017 at 2:10 PM, Chesnay Schepler <ches...@apache.org> wrote: Did the path by chance start with file://C:/... ? If so,
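The two path forms mentioned in this snippet differ in how a URI parser splits them. The following is a minimal sketch using the JDK's `java.net.URI`, which is an assumption made for illustration: Flink's own `Path` class does its own parsing and may behave differently. `FileUriParts` and `describe` are hypothetical names.

```java
import java.net.URI;

// Illustration only: uses java.net.URI from the JDK, not Flink's Path class.
class FileUriParts {
    // Shows which part of the string ends up as authority and which as path.
    static String describe(String uri) {
        URI parsed = URI.create(uri);
        return "authority=" + parsed.getAuthority() + ", path=" + parsed.getPath();
    }
}
```

With two slashes the drive letter lands in the authority component and drops out of the path, while the three-slash form keeps the drive letter inside the path.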

Re: Invalid path exception

2017-08-01 Thread Chesnay Schepler
Let's move the discussions to FLINK-7330. On 01.08.2017 13:15, Chesnay Schepler wrote: One problem I know of is that Windows paths with a scheme are not detected as Windows paths, as documented in FLINK-6889. They generally still work though (maybe by chance). I just verified that calling

Re: a lot of connections in state "CLOSE_WAIT"

2017-08-08 Thread Chesnay Schepler
FLINK-7368 may be the reason for this behavior. On 31.07.2017 03:54, XiangWei Huang wrote: 1. yes and yes. 2. Yes, it was shown correctly. 3. I didn’t modify this setting. On Jul 26, 2017, at 18:06, Chesnay Schepler [via Apache Flink User Mailing List archive.] <[hidden email]> wrote: So thi

Re: JMX stats reporter with all task manager/job manager stats aggregated?

2017-08-07 Thread Chesnay Schepler
Hello, there is no central place where JMX metrics are aggregated. You can configure a port range for the reporter to prevent port conflicts on the same machine. metrics.reporter.jmx.port: 8789-8790 You can find out which port was used by checking the logs. Regards, Chesnay On 05.08.2017

Re: Problem: Please check that all IDs specified via `uid(String)` are unique.

2017-05-03 Thread Chesnay Schepler
class Splitter implements OutputSelector<EventRaw> { @Override public Iterable<String> select(EventRaw value) { List<String> output = new ArrayList<>(); output.add(value.type); return output; } } Any ideas how to get better debug messages? Regards, -Rami On 3 May 2017, at 14:12, Chesnay Sch

Re: Why not add flink-connectors to flink dist?

2017-05-15 Thread Chesnay Schepler
You can either package the connector into the user-jar or place it in the /lib directory of the distribution. On 15.05.2017 11:09, yunfan123 wrote: So how can I use it? Every jar file I submitted should contains the specific connector class? Can I package it to flink-dist ? -- View this

Re: [DISCUSS] Release 1.3.0 RC0 (Non voting, testing release candidate)

2017-05-09 Thread Chesnay Schepler
This RC (and 1.4-SNAPSHOT for that matter) cannot be compiled on Windows due to the rat-plugin not detecting certain test savepoints as binary files. The files in question are the "savepoints" created by the StreamOperatorTestHarness. This is not a new problem and has happened before, but we

Re: Group already contains a Metric with the name 'latency'

2017-05-16 Thread Chesnay Schepler
No; you can name operators like this: stream.map().name("MyUniqueMapFunctionName") On 16.05.2017 14:50, rizhashmi wrote: thanks for your reply *latency* metrics appear to be pushed by AbstractStreamOperator.java latencyGauge = this.metrics.gauge("latency", new LatencyGauge(historySize));

Re: Question about jobmanager.web.upload.dir

2017-05-17 Thread Chesnay Schepler
I don't know why we delete it either, but my guess is that at one point this was a temporary directory that we properly cleaned up, and later allowed it to be configurable. Currently, this directory must be in the local file system. We could change it to also allow non-local paths, which

Re: Group already contains a Metric with the name 'latency'

2017-05-16 Thread Chesnay Schepler
Does your job include multiple operators called "Filter"? On 16.05.2017 13:35, rizhashmi wrote: I am getting bunch of warning in log files. Anyone help me sort out this problem. 2017-04-28 00:20:57,947 WARN org.apache.flink.metrics.MetricGroup - Name collision: Group already contains a

Re: Flink-derrived operator names cause issues in Graphite metrics

2017-06-12 Thread Chesnay Schepler
So there are 2 issues here: 1. The default names for windows are horrible. They are too long, full of special characters, and unstable, as reported in FLINK-6464. 2. The reporter doesn't filter out metrics it can't report. For 2) we can do 2

Re: Custom Gauge Metric is not getting updated on Job Dashboard

2017-06-21 Thread Chesnay Schepler
Can you provide more of your code (you can also send it to me directly)? I'm interested in where the startTime/endTime arguments are defined. On 21.06.2017 10:47, sohimankotia wrote: I ran job and monitored for approx 20 mins . I tried with meter,accumulators,histogram,gauge . Out of those

Re: Custom Gauge Metric is not getting updated on Job Dashboard

2017-06-21 Thread Chesnay Schepler
The gauge value is not updating because you are not actually updating the gauge, but registering a new gauge under the same name. The subsequent registrations are ignored, and should have logged a warning. I suggest making your gauge stateful by adding a field for the opTimeInSec

Re: Custom Gauge Metric is not getting updated on Job Dashboard

2017-06-21 Thread Chesnay Schepler
Exactly, you register the gauge once in open(), and modify the code so that this gauge returns different values. On 21.06.2017 12:04, sohimankotia wrote: Basically Every time I am calling add metric method it is just registering the gauge . I can register this gauge in open method and then in
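The advice in this thread (register the gauge once in open(), then change the value it returns) can be sketched as follows. This is a self-contained illustration, not Flink code: `Gauge` is a local stand-in with the same single-method shape as Flink's `Gauge<T>`, and `SimpleMetricGroup` and `TimingFunction` are hypothetical classes that only mimic the "later registrations under the same name are ignored" behavior described above.

```java
import java.util.HashMap;
import java.util.Map;

// Local stand-in with the same shape as Flink's Gauge<T>: a single getValue() method.
interface Gauge<T> {
    T getValue();
}

// Hypothetical, simplified metric group: the first registration under a name wins,
// later registrations under the same name are ignored with a warning.
class SimpleMetricGroup {
    private final Map<String, Gauge<?>> gauges = new HashMap<>();

    <T> Gauge<T> gauge(String name, Gauge<T> gauge) {
        if (gauges.containsKey(name)) {
            System.err.println("Name collision: group already contains a metric named '" + name + "'");
        } else {
            gauges.put(name, gauge);
        }
        return gauge;
    }

    // Reads the current value of the gauge registered under the given name.
    Object read(String name) {
        return gauges.get(name).getValue();
    }
}

// Hypothetical stand-in for a rich function with an open() hook.
class TimingFunction {
    // State read by the gauge; updated for every processed element.
    private volatile long opTimeInSec;

    // Register the gauge exactly once; it reads the field on every call.
    void open(SimpleMetricGroup metrics) {
        Gauge<Long> gauge = () -> opTimeInSec;
        metrics.gauge("opTimeInSec", gauge);
    }

    // Only the field changes here; nothing is registered again.
    void process(long startTimeSec, long endTimeSec) {
        opTimeInSec = endTimeSec - startTimeSec;
    }
}
```

Because the gauge closes over the field, every read sees the latest value, and no second registration is needed.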

Re: Partitioner is spending around 2 to 4 minutes while pushing data to next operator

2017-06-22 Thread Chesnay Schepler
So let's get the obvious question out of the way: Why are you adding a partitioner when your parallelism is 1? On 22.06.2017 11:58, sohimankotia wrote: I have a execution flow (Streaming Job) with parallelism 1. source -> map -> partitioner -> flatmap -> sink Since adding partitioner will

Re: Quick Question...

2017-06-22 Thread Chesnay Schepler
Hello, in the DataSet API you can do this when specifying your transformations, something along the lines of dataset.map(..).withConfiguration. In the DataStream API you cannot set the Configuration at all. Note that in both APIs you can also just pass the Configuration into the constructor

Re: MapR libraries shading issue

2017-06-26 Thread Chesnay Schepler
This looks more like a certification problem as described here: https://github.com/square/okhttp/issues/2746 I don't think that shading could have anything to do with this. On 26.06.2017 00:09, ani.desh1512 wrote: I am trying to use Flink (1.3.0) with MapR(5.2.1). Accordingly, I built Flink

Re: Flink metrics related problems/questions

2017-05-19 Thread Chesnay Schepler
1. This shouldn't happen. Do you access the counter from different threads? 2. Metrics in general are not persisted across restarts, and there is no way to configure flink to do so at the moment. 3. Counters are sent as gauges since as far as I know StatsD counters are not allowed to be

Re: Flink metrics related problems/questions

2017-05-19 Thread Chesnay Schepler
2. isn't quite accurate actually; metrics on the TaskManager are not persisted across restarts. On 19.05.2017 11:21, Chesnay Schepler wrote: 1. This shouldn't happen. Do you access the counter from different threads? 2. Metrics in general are not persisted across restarts

Re: Flink metrics related problems/questions

2017-05-22 Thread Chesnay Schepler
, Aljoscha Krettek wrote: @Chesnay With timers it will happen that onTimer() is called from a different Thread than the Thread that is calling processElement(). If Metrics updates happen in both, would that be a problem? On 19. May 2017, at 11:57, Chesnay Schepler <ches...@apache.org> wro

Re: Collapsible job plan visualization

2017-05-25 Thread Chesnay Schepler
You should be able to move the separator between the plan view and the bottom panel already. On 25.05.2017 19:45, Flavio Pompermaier wrote: Hi to all, In our experience the Flink plan diagram is a nice feature but it is useless almost all the time and it has an annoying interaction with the

Re: Group already contains a Metric with the name 'latency'

2017-05-16 Thread Chesnay Schepler
Generally this isn't an issue; it only means that for some operators the latency metrics will not be available. The underlying issue is that the metric system has no way to differentiate operators except by their name; if the names are identical you end up with a collision. If you're not
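The collision described in this reply can be illustrated with a small sketch. It assumes, as a simplification, that the operator name is the only part of the metric identifier that distinguishes operators; real identifiers contain more scope components. `OperatorNameCollisions` is a hypothetical helper, not part of Flink.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class OperatorNameCollisions {
    // Returns the operator names whose "latency" metric would collide with an
    // earlier operator of the same name (simplified identifier: "<name>.latency").
    static List<String> collidingNames(List<String> operatorNames) {
        Set<String> seen = new HashSet<>();
        List<String> collisions = new ArrayList<>();
        for (String name : operatorNames) {
            String identifier = name + ".latency";
            if (!seen.add(identifier)) {
                collisions.add(name);
            }
        }
        return collisions;
    }
}
```

Two operators both called "Filter" collide, while giving each operator a unique name leaves the list of collisions empty.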

[DISCUSS] Removal of twitter-inputformat

2017-06-07 Thread Chesnay Schepler
Hello, I'm proposing to remove the Twitter-InputFormat in FLINK-6710, with an open PR you can find here. The PR currently has a +1 from Robert, but Timo raised some concerns saying that it is useful

Re: Cassandra connector POJO - tombstone question

2017-06-01 Thread Chesnay Schepler
il.com>> wrote: Thanks Chesnay, this will work. Best, Tarandeep On Wed, Apr 12, 2017 at 2:42 AM, Chesnay Schepler <ches...@apache.org> wrote: Hello, what I can do is add a hook like we do for the ClusterBuilder

Re: Streaming job monitoring

2017-06-08 Thread Chesnay Schepler
Hello Flavio, I'm not sure what source you are using, but it looks like the ContinouosFileMonitoringSource which works with 2 operators. The first operator (what is displayed as the actual Source) emits input splits (chunks of files that should be read) and passes these to the second operator

Re: Can't get keyed messages from Kafka

2017-06-13 Thread Chesnay Schepler
You have to create your own implementation that deserializes the byte arrays into whatever type you want to use. On 13.06.2017 13:19, AndreaKinn wrote: But KeyedDeserializationSchema has just 2 implementations: TypeInformationKeyValueSerializationSchema JSONKeyValueDeserializationSchema The
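A custom implementation of the kind suggested here mostly consists of turning the key and value byte arrays into a type of your choice. The sketch below shows only that step. `KeyedBytesDeserializer` is a hypothetical, trimmed-down stand-in and not Flink's `KeyedDeserializationSchema`, which has additional methods and parameters; the UTF-8 encoding of key and value is also an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;

// Hypothetical, reduced interface: only the byte-array-to-object step.
interface KeyedBytesDeserializer<T> {
    T deserialize(byte[] messageKey, byte[] message);
}

// Decodes key and value as UTF-8 strings; a missing key or value stays null.
class StringPairDeserializer implements KeyedBytesDeserializer<Map.Entry<String, String>> {
    @Override
    public Map.Entry<String, String> deserialize(byte[] messageKey, byte[] message) {
        String key = messageKey == null ? null : new String(messageKey, StandardCharsets.UTF_8);
        String value = message == null ? null : new String(message, StandardCharsets.UTF_8);
        return new SimpleImmutableEntry<>(key, value);
    }
}
```

The same pattern applies to any target type: replace the string decoding with whatever parsing the payload requires.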

Re: Task and Operator Metrics in Flink 1.3

2017-06-13 Thread Chesnay Schepler
The scopes look OK to me. Let's try to narrow down the problem areas a bit: 1. Did this work with the same setup before 1.3? 2. Are all task/operator metrics available in the metrics tab of the dashboard? 3. Are there any warnings in the TaskManager logs from the MetricRegistry or

Re: Task and Operator Metrics in Flink 1.3

2017-06-13 Thread Chesnay Schepler
ote. This is the application I am running: https://github.com/chrisdail/pravega-samples/blob/master/flink-examples/src/main/scala/io/pravega/examples/flink/iot/TurbineHeatProcessor.scala Also, I am running this in DC/OS 1.9 trying to integrate with DC/OS metrics. Thanks Chris *From: *Chesnay Schepler <

Re: RichMapFunction setup method

2017-06-13 Thread Chesnay Schepler
The existing signature for open() is a remnant of the past. We currently recommend passing all arguments through the constructor and storing them in fields. You can of course also pass a Configuration containing all parameters. On 13.06.2017 15:46, Mikhail Pryakhin wrote: Hi all! A

Re: RichMapFunction setup method

2017-06-13 Thread Chesnay Schepler
be deprecated then so that it doesn’t confuse users? Kind Regards, Mike Pryakhin On 13 Jun 2017, at 16:54, Chesnay Schepler <ches...@apache.org> wrote: The existing signature for open() is a remnant of the past. We currently recommend to pass all argum

Re: Flink Kinesis connector in 1.3.0

2017-06-13 Thread Chesnay Schepler
on if there is a patch/fix/JIRA available since I have to use 1.3.0? *From: *"Foster, Craig" <foscr...@amazon.com> *Date: *Tuesday, June 13, 2017 at 9:27 AM *To: *Chesnay Schepler <ches...@apache.org>, "user@flink.apache.org" <user@flink.apache.org> *Subject:

Re: Flink Kinesis connector in 1.3.0

2017-06-14 Thread Chesnay Schepler
n. *From: *Chesnay Schepler <ches...@apache.org> *Date: *Tuesday, June 13, 2017 at 1:44 PM *To: *"Foster, Craig" <foscr...@amazon.com>, "user@flink.apache.org" <user@flink.apache.org>, Robert Metzger <rmetz...@apache.org> *Subject: *Re: Flink Kinesi

Re: Flink will delete all jars uploaded when restart jobmanager

2017-06-14 Thread Chesnay Schepler
There's currently no way to prevent this. On 14.06.2017 07:03, XiangWei Huang wrote: Hi, When restart flink jobmanager jars which uploaded by user from web ui will be deleted . Is there anyway to avoid this.

Re: Latency and Throughput

2017-06-16 Thread Chesnay Schepler
Hello, You don't have to measure anything yourself, since Flink exposes throughput/latency metrics as described in the System metrics/latency tracking sections of the metrics documentation. You only have to set up a reporter that fetches these metrics (see the reporter section) and calculate

Re: Problem with WebUI

2017-06-11 Thread Chesnay Schepler
This looks like a dependency conflict to me. Try checking whether anything you use depends on netty. On 09.06.2017 17:42, Dawid Wysakowicz wrote: I had a look into yarn logs and I found such exception: 2017-06-09 17:10:20,922 ERROR

Re: RichMapFunction setup method

2017-06-13 Thread Chesnay Schepler
deals with this method. Kind Regards, Mike Pryakhin On 13 Jun 2017, at 17:20, Chesnay Schepler <ches...@apache.org> wrote: I'm not aware of any plans to replace it. For the Batch API it also works properly, so deprecating it would be misleading. On

Re: Flink Kinesis connector in 1.3.0

2017-06-13 Thread Chesnay Schepler
Something went wrong during the release process which prevented the 1.3.0 kinesis artifact from being released. This will be fixed for 1.3.1; in the meantime you can use 1.3.0-SNAPSHOT instead. On 13.06.2017 17:48, Foster, Craig wrote: Hi: I’m trying to build an application that uses the

Re: Can ValueState use generics?

2017-05-08 Thread Chesnay Schepler
If you want to use generics you have to either provide a TypeInformation instead of a class or create a class that extends Tuple2<Integer, ObjectNode> and use it as the class argument. On 07.05.2017 15:14, yunfan123 wrote: My process function is like : private static class MergeFunction
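Why a concrete subclass helps can be shown with plain Java reflection: a raw class literal carries no type arguments, while a class that extends a parameterized type keeps them in its generic superclass. This sketch uses a hypothetical `Pair` class as a stand-in for `Tuple2` and `String` in place of `ObjectNode`; it illustrates the general mechanism and is not Flink's type extraction code.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Stand-in for a generic tuple type such as Tuple2.
class Pair<A, B> {
    A first;
    B second;
}

// Concrete subclass: its generic superclass records the actual type arguments.
class IntStringPair extends Pair<Integer, String> {
}

class GenericInspection {
    // Returns the type arguments of the class's generic superclass, or "" if there are none.
    static String typeArgumentsOf(Class<?> clazz) {
        Type superType = clazz.getGenericSuperclass();
        if (!(superType instanceof ParameterizedType)) {
            return "";
        }
        StringBuilder names = new StringBuilder();
        for (Type argument : ((ParameterizedType) superType).getActualTypeArguments()) {
            if (names.length() > 0) {
                names.append(", ");
            }
            names.append(argument.getTypeName());
        }
        return names.toString();
    }
}
```

Passing `IntStringPair.class` yields both type arguments, whereas `Pair.class` yields nothing, which is the information a framework is missing when it only receives the raw class.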

Re: Collector.collect

2017-05-01 Thread Chesnay Schepler
per filter was slower than the BITMASK,Record to single OutputFormat which demux’s the data to each file internally Are you saying do a custom writer inside a map rather than either of the 2 above approaches? *From:*Chesnay Schepler [mailto:ches...@apache.org] *Sent:* Monday, May 01, 2017 10

Re: Collector.collect

2017-05-01 Thread Chesnay Schepler
Hello, @Billy, what prevented you from duplicating/splitting the record, based on the bitmask, in a map function before the sink? This shouldn't incur any serialization overhead if the sink is chained to the map. The emitted Tuple could also share the GenericRecord; meaning you don't even
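The splitting step suggested here can be sketched without any framework code: for each set bit in the mask, emit one pair that tags the shared record with the index of its target output. `BitmaskSplitter` is a hypothetical helper, and the meaning of the bits (bit i selects output i) is an assumption, since the thread does not spell out the mask layout.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class BitmaskSplitter {
    // For every set bit in the mask, emit one (outputIndex, record) pair.
    // All pairs reference the same record instance; the record is never copied.
    static <R> List<Map.Entry<Integer, R>> split(int bitmask, R record) {
        List<Map.Entry<Integer, R>> out = new ArrayList<>();
        for (int bit = 0; bit < Integer.SIZE; bit++) {
            if ((bitmask & (1 << bit)) != 0) {
                out.add(new SimpleImmutableEntry<>(bit, record));
            }
        }
        return out;
    }
}
```

In a job, logic like this would sit in the function placed before the sink, so that the sink only has to route each pair by its index.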

Re: Problem: Please check that all IDs specified via `uid(String)` are unique.

2017-05-03 Thread Chesnay Schepler
Hello, was a uid set on "userRawDataStream", or any of its parent transformations? On 03.05.2017 12:59, Rami Al-Isawi wrote: Hi, I am trying to set uids. I keep getting this (Flink.1.2): Exception in thread "main" java.lang.IllegalArgumentException: Hash collision on user-specified ID.

Re: Flink Graphire Reporter stops reporting via TCP if network issue

2017-05-05 Thread Chesnay Schepler
Hello, for Graphite, Flink uses the DropWizard metrics reporter. I don't know at the moment whether it supports any kind of reconnecting functionality. I'm not sure whether I understood you correctly; did you try upgrading the DropWizard metrics-core/metrics-graphite dependencies? If that

Re: Collector.collect

2017-05-02 Thread Chesnay Schepler
e: Why doesn’t this work with batch though? We did input = ... input.filter(conditionA).output(formatA) input.filter(conditionB).output(formatB) And it was pretty slow compared with a custom outputformat with an integrated filter. *From:*Chesnay Schepler [mailto:ches...@apache.org] *Sent:* Mon

Re: MapR libraries shading issue

2017-06-28 Thread Chesnay Schepler
I would say that this is a MapR issue. It's a good idea to add it to the docs in case someone else stumbles upon this. Would be great if you could open a JIRA for that. On 27.06.2017 19:35, ani.desh1512 wrote: Again as I mentioned in the MapR thread, So, after some more digging, I found out

Re: Custom Serializers

2017-09-18 Thread Chesnay Schepler
If Parameters are always encapsulated in an Event, and the Event serializer knows how to deal with them, then you only need to implement a serializer etc. for the Event class. On 18.09.2017 13:20, nragon wrote: Sorry for bringing this up, any tips on this? -- Sent from:

Re: Custom Serializers

2017-09-18 Thread Chesnay Schepler
you do need them, but only for the Event class. On 18.09.2017 13:38, nragon wrote: So, no need for typeinfo, comparator or factory? https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/types_serialization.html#defining-type-information-using-a-factory -- Sent from:

Re: Load distribution through the cluster

2017-09-20 Thread Chesnay Schepler
It should only apply to the map operator. On 19.09.2017 17:38, AndreaKinn wrote: If I apply a sharing slot as in the example: DataStream LTzAccStream = env .addSource(new FlinkKafkaConsumer010<>("topic", new CustomDeserializer(), properties))

Re: NoResourceAvailable exception

2017-09-14 Thread Chesnay Schepler
The error message says that the total number of slots is 0. It is thus very likely that no task manager is connected to the jobmanager. How exactly are you starting the cluster? On 14.09.2017 18:03, AndreaKinn wrote: Hi, I'm executing a program on a flink cluster. I tried the same on a local

Re: Get EOF from PrometheusReporter in JM

2017-09-22 Thread Chesnay Schepler
The Prometheus reporter should work with 1.3.2. Does this also occur with the reporter that currently exists in 1.4? (to rule out new bugs from the PR). To investigate this further, please set the logging level to WARN and try again, as all errors in the metric system are logged on that

Re: Custom Serializers

2017-09-19 Thread Chesnay Schepler
Have a look at the TupleTypeInfo class. It has a constructor that accepts an array of TypeInformation, and supports automatically generating a serializer from them. On 18.09.2017 18:28, nragon wrote: One other thing :). Can i set tuple generic type dynamically? Meaning, build a tuple of N

Re: Can't send data to another service in addSink

2017-09-18 Thread Chesnay Schepler
Please read the Basic API concepts guide in the documentation, in particular https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/api_concepts.html#lazy-evaluation. The short answer is that main() is called on the client, while the sink is executed on a taskmanager, i.e. in a

Re: Weird error in submitting a flink job to yarn cluster

2017-10-04 Thread Chesnay Schepler
This isn't related to Flink but I might be able to help you out anyway. Does the ParquetFileWriter set the 'overwrite' flag when calling 'FileSystem#create()'? My suspicion is that you create a file for the first batch, write it out, but do not delete it. For the next batch, the file cannot be

Re: Fw: Question on Flink on Window

2017-10-04 Thread Chesnay Schepler
You're attempting to start Flink from the wrong directory, specifically from within the source directory. If you download Flink's source release from the downloads page you have to build it manually, in which case the 'start-local.bat' to run will be

Re: Classloader error after SSL setup

2017-10-04 Thread Chesnay Schepler
Something that would also help us narrow down the problematic area is to enable SSL for one component at a time and see which one causes the job to fail. On 04.10.2017 14:11, Chesnay Schepler wrote: The configuration looks reasonable. Just to be sure, are the paths accessible by all nodes

Re: Classloader error after SSL setup

2017-10-04 Thread Chesnay Schepler
The configuration looks reasonable. Just to be sure, are the paths accessible by all nodes? As a first step, could you set the logging level to DEBUG (by modifying the 'conf/log4j.properties' file), resubmit the job (after a cluster restart) and check the Job- and TaskManager logs for any

Re: javax.net.ssl.SSLPeerUnverifiedException: peer not authenticated for S3 access

2017-10-04 Thread Chesnay Schepler
I've found a few threads where an outdated jdk version on the server/client may be the cause. Which Flink binary (specifically, for which hadoop version) are you using? On 03.10.2017 20:48, Hao Sun wrote: com.amazonaws.http.AmazonHttpClient - Unable to execute HTTP

Re: Writing an Integration test for flink-metrics

2017-10-13 Thread Chesnay Schepler
er from some other directions like trying to find a way to inject a reporter to get it's state. But don't see a way to do it. So probably the best thing to do is fire up something to collect the metrics from the reporter. On Thu, Oct 12, 2017 at 5:29 AM, Chesnay Schepler <ches...@apache.or

Re: RichMapFunction parameters in the Streaming API

2017-10-11 Thread Chesnay Schepler
The Configuration parameter in open() is a relic of the previous java API where operators were instantiated generically. Nowadays, this is no longer the case as they are serialized instead, which simplifies the passing of parameters as you can simply store them in a field of your UDF. The
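The point made here, that a function instance is serialized and its fields travel with it, can be sketched with plain Java serialization. `SimpleMapFunction`, `PrefixMapper` and `SerializationRoundTrip` are hypothetical names used only for this illustration; the round trip roughly imitates shipping a function to a worker and is not Flink's actual mechanism in detail.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Local stand-in for a serializable user function.
interface SimpleMapFunction<I, O> extends Serializable {
    O map(I value);
}

// Parameters arrive through the constructor and are stored in fields,
// so they are part of the serialized function.
class PrefixMapper implements SimpleMapFunction<String, String> {
    private static final long serialVersionUID = 1L;
    private final String prefix;

    PrefixMapper(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public String map(String value) {
        return prefix + value;
    }
}

class SerializationRoundTrip {
    // Serializes and deserializes the object, returning the reconstructed copy.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copy(T object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Serialization round trip failed", e);
        }
    }
}
```

The copy behaves like the original because the constructor argument was stored in a field, so no separate configuration object is needed to carry it.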

Re: Writing an Integration test for flink-metrics

2017-10-12 Thread Chesnay Schepler
You could also write a custom reporter that opens a socket or similar for communication purposes. You can then either query it for the metrics, or even just trigger the verification in the reporter, and fail with an error if the reporter returns an error. On 12.10.2017 14:02, Piotr Nowojski

Re: Writing an Integration test for flink-metrics

2017-10-12 Thread Chesnay Schepler
Well damn, I should've read the second part of the initial mail. I'm wondering though, could you not unit-test this behavior? On 12.10.2017 14:25, Chesnay Schepler wrote: You could also write a custom reporter that opens a socket or similar for communication purposes. You can then either

Re: metrics for Flink sinks

2017-08-29 Thread Chesnay Schepler
Hello, 1. Because no one found time to fix it. In contrast to the remaining byte/record metrics, input metrics for sources / output metrics for sinks have to be implemented for every single implementation with their respective semantics. In contrast, the output metrics are gathered in the

Re: Update timeWindow size and trigger value at runtime

2017-09-11 Thread Chesnay Schepler
You cannot change the size/trigger count while a job is running. For this to work you will have to take a savepoint, modify the parameters and reload from the savepoint. On 11.09.2017 09:27, victor.reut wrote: Hi, I want to have an opportunity to update timeWindow size and trigger value in

Re: Is State access synchronized?

2017-09-11 Thread Chesnay Schepler
Hello, state is local to each parallel instance of an operator. Coupled with the fact that the "map" method is always called by the same thread (and never concurrently) the ValueState (or any state for that matter) will always return the latest values. On 10.09.2017 14:39, Federico

Re: Delay in Flink timers

2017-09-11 Thread Chesnay Schepler
It is true that onTimer and processElement are never called at the same time. I'm not entirely sure whether there is any prioritization/fairness between these methods (if not, it could be that onTimer is starved); looping in Aljoscha, who hopefully knows more about this. On 10.09.2017 09:31,

Re: Best way to deriving streams from another one

2017-09-11 Thread Chesnay Schepler
Have a look at side outputs in the documentation, they allow you to emit to multiple streams (of different types!) with a ProcessFunction. On 10.09.2017 22:15, AndreaKinn wrote: Hi, I have a data stream resulting from an operation executed on a data stream of data. Essentially I want to obtain

Re: EASY Friday afternoon question: order of chained sink operator execution in a streaming task

2017-09-29 Thread Chesnay Schepler
Yes, I believe that is correct. On 29.09.2017 14:01, Martin Eden wrote: Hi all, Just a quick one. I have a task that looks like this (as printed in the logs): 17-09-29 0703510695 INFO TaskManager.info: Received task Co-Flat Map -> Process -> (Sink: sink1, Sink: sink2, Sink: sink3) (2/2)

Re: how many 'run -c' commands to start?

2017-09-30 Thread Chesnay Schepler
nk: Unnamed(5/5) switched to SCHEDULED ... I thought --detach will put the process in the background, and give me back the cmdline, but maybe I got the meaning behind this option wrong? Thank you! > Original message ---- >From: Chesnay Schepler ches...@apache.org >Subj

Re: how many 'run -c' commands to start?

2017-09-29 Thread Chesnay Schepler
m I guessing wrong? Thanks! Rob > Original message ---- >From: Chesnay Schepler ches...@apache.org >Subject: Re: how many 'run -c' commands to start? >To: user@flink.apache.org >Sent: 28.09.2017 15:05 Hi! Given a Flink cluste

Re: Custom Serializers

2017-09-28 Thread Chesnay Schepler
On 19.09.2017 11:39, nragon wrote: createInstance(Object[] fields) at TupleSerializerBase seems not to be part of TypeSerializer API. Will I be losing any functionality? In what cases do you use this instead of createInstance()? // We use this in the Aggregate and Distinct Operators to create

Re: how many 'run -c' commands to start?

2017-09-28 Thread Chesnay Schepler
Hi! Given a Flink cluster, you would only call `flink run ...` to submit a job once; for simplicity I would submit it on the node where you started the cluster. Flink will automatically distribute the job across the cluster, in smaller independent parts known as Tasks. Regards, Chesnay On

Re: Aggregating metrics from different boxes

2017-08-24 Thread Chesnay Schepler
If you want to change under what identifier metrics are exported please have a look at scope formats in the Flink documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.4/monitoring/metrics.html#scope If I understood you correctly the goal would be to remove the host and

Re: datastream.print() doesn't works

2017-08-31 Thread Chesnay Schepler
If you call createLocalEnvironmentWithWebUI you don't need to start a cluster with the start-local.sh script; you can run the job from the IDE and it will start the web UI. If you submit a job to a cluster that was started outside the IDE you can call getExecutionEnvironment as usual. Not sure why

Re: datastream.print() doesn't works

2017-08-29 Thread Chesnay Schepler
The easiest explanation is that there is nothing to print. Since print statements within the select function don't appear in the logs, I assume that the result of HTM.learn is empty. Please check via the web UI or metrics whether any of these operations actually return records. On 29.08.2017

Re: Atomic savepint and cancel

2017-08-29 Thread Chesnay Schepler
Hello, a savepoint is in general not an atomic operation; it only guarantees that no other checkpoint will be completed between the savepoint and the job cancellation. You can only guarantee that no messages are sent out if you use a sink that supports exactly-once, which as far as I know,

Re: yarn and checkpointing

2017-08-29 Thread Chesnay Schepler
Checkpoints are only used for recovery during the job execution. If the entire cluster is shut down and restarted you will need to take a savepoint and restore from that. On 29.08.2017 16:46, Gwenhael Pasquiers wrote: Hi, Is it possible to use checkpointing to restore the state of an app

Re: Set Savepoints configuration after cluster bootstrap

2017-08-29 Thread Chesnay Schepler
Hello, it is not possible to permanently set the savepoint directory after the cluster has started, but the configured value can be overridden when taking a savepoint as described here.

Re: Default chaining & uid

2017-08-29 Thread Chesnay Schepler
Hello, that depends a bit on the version used. For 1.3 and above it does not affect chaining; the maps will be chained and setting the UIDs will work as if the maps weren't chained. For 1.2, setting the UID on a chained operator is forbidden and will fail with an exception. On 28.08.2017

Re: Flink session on Yarn - ClassNotFoundException

2017-08-29 Thread Chesnay Schepler
Hello, the ClassNotFoundException indicates that you are using a Flink version that wasn't compiled against Hadoop 2.7. Replacing one part of the Hadoop dependency will most likely not cut it (or will fail mysteriously down the line), so I would suggest checking the downloads

Re: datastream.print() doesn't works

2017-08-29 Thread Chesnay Schepler
-importing your code as a new project and see if it helps. Note that you can start a Flink job with a functioning web UI from the IDE by calling `StreamExecutionEnvironment#createLocalEnvironmentWithWebUI()`; it is not necessary to build a jar. On 29.08.2017 22:00, AndreaKinn wrote: Chesnay

Re: Classloader error after SSL setup

2017-10-04 Thread Chesnay Schepler
see whether the certificate is being rejected. On Wed, Oct 4, 2017 at 5:15 AM, Chesnay Schepler <ches...@apache.org> wrote: Something that would also help us narrow down the problematic area is to enable SSL for one component at a time and see

Re: Flink 1.3.2 Netty Exception

2017-10-11 Thread Chesnay Schepler
I can confirm that the issue is reproducible with the given test, from the command line and the IDE. While cutting down the test case, by replacing the output format with a DiscardingOutputFormat and the JDBCInputFormat with a simple collection, I stumbled onto a new Exception after ~200

Re: Decouple Kafka partitions and Flink parallelism for ordered streams

2017-10-11 Thread Chesnay Schepler
; unionBase = unionBase.union(toUnion.toArray(new DataStream[0])); unionBase.print(); } // execute program env.execute("Theory"); On 11.10.2017 16:31, Chesnay Schepler wrote: It is correct that keyBy and partition operations will distribute messages over the network as they distribute the data a
