Re: How to share text file across tasks at run time in Flink.

2016-08-24 Thread Baswaraj Kasture
Thanks to all for your inputs. Yes, I could put all these common configurations/rules in a DB, and workers can pick them up dynamically at run time. In that case, do the DB configuration/connection details need to be hard-coded? Is there any way a worker can pick up the DB name, credentials, etc. at run time?
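One common pattern for this, independent of Flink, is to resolve connection details from the worker's environment at startup rather than hard-coding them. A minimal plain-Java sketch; the variable names and defaults are illustrative assumptions, not anything prescribed by Flink:

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper: resolve DB connection details at run time from the
// process environment, with fallbacks, instead of hard-coding them.
// The key names (DB_URL, DB_USER, DB_PASSWORD) are assumptions.
public class DbConfigResolver {
    public static Properties resolve(Map<String, String> env) {
        Properties props = new Properties();
        props.setProperty("url", env.getOrDefault("DB_URL", "jdbc:postgresql://localhost:5432/rules"));
        props.setProperty("user", env.getOrDefault("DB_USER", "flink"));
        props.setProperty("password", env.getOrDefault("DB_PASSWORD", ""));
        return props;
    }

    public static void main(String[] args) {
        // Each worker reads its own environment at startup.
        Properties props = resolve(System.getenv());
        System.out.println("Connecting to " + props.getProperty("url"));
    }
}
```

The same idea works with a properties file shipped alongside the job, as long as every worker node can read it.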

Regarding Global Configuration in Flink

2016-08-24 Thread Janardhan Reddy
Hi, Is the global configuration the same for all jobs in a Flink cluster? Is it a good idea to write a custom source that polls some external source every x minutes and updates the global config? Will the config change be propagated across all jobs? What happens when the size of the global config grows?
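The polling idea in the question can be sketched in plain Java. This is not Flink's actual GlobalConfiguration mechanism; the class and key names are invented for the sketch, which just shows the atomic-snapshot-swap pattern a periodic refresher would use:

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical sketch: an immutable config snapshot held in an AtomicReference,
// refreshed every x minutes from an external source. Readers always see a
// complete, consistent snapshot, never a half-updated map.
public class ConfigPoller {
    private final AtomicReference<Map<String, String>> current = new AtomicReference<>(Map.of());
    private final Supplier<Map<String, String>> externalSource;

    public ConfigPoller(Supplier<Map<String, String>> externalSource) {
        this.externalSource = externalSource;
    }

    // One refresh cycle: fetch a fresh snapshot and swap it in atomically.
    public void refresh() {
        current.set(externalSource.get());
    }

    public Map<String, String> current() {
        return current.get();
    }

    // Schedule periodic refreshes on a background thread.
    public ScheduledExecutorService start(long minutes) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::refresh, 0, minutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

Whether a change made this way is visible to all jobs depends on where the poller runs; one poller per TaskManager JVM would keep every worker's copy current.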

How to set a custom JAVA_HOME when running Flink on YARN?

2016-08-24 Thread Renkai
Hi all: The YARN cluster at my company defaults to Java 7, and I want to use Java 8 for my Flink application. What can I do to achieve this?
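One possible approach is to point Flink at the Java 8 installation in flink-conf.yaml. Both option names below should be treated as assumptions to verify against the documentation of your Flink version; the paths are placeholders for wherever Java 8 lives on the cluster nodes:

```yaml
# env.java.home is read by Flink's startup scripts.
env.java.home: /usr/lib/jvm/java-8-openjdk
# The yarn.taskmanager.env. prefix (if supported by your version) passes
# environment variables into the YARN containers.
yarn.taskmanager.env.JAVA_HOME: /usr/lib/jvm/java-8-openjdk
```

This only works if Java 8 is actually installed at the same path on every node the YARN containers may run on.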

Re: Firing windows multiple times

2016-08-24 Thread Shannon Carey
What do you think about adding the current watermark to the window function metadata in FLIP-2?

Re: How to get latency info from benchmark

2016-08-24 Thread Robert Metzger
Hi, Version 0.10-SNAPSHOT is pretty old. The snapshot repository of Apache probably doesn't keep old artifacts around forever. Maybe you can migrate the tests to Flink 0.10.0, or maybe even to a higher version. Regards, Robert

Re: How to get latency info from benchmark

2016-08-24 Thread Eric Fukuda
Hi Max, Robert, Thanks for the advice. I'm trying to build the "performance" project, but the build fails with the following error. Is there a solution for this? [ERROR] Failed to execute goal on project streaming-state-demo: Could not resolve dependencies for project com.dataartisans.flink:

Re: Dealing with Multiple sinks in Flink

2016-08-24 Thread vinay patil
Hi, Just an update: the window is not getting triggered when I change the parallelism to more than 1. Can you please explain why this is happening? Regards, Vinay Patil

Re: how to get rid of duplicate rows group by in DataStream

2016-08-24 Thread Yassine Marzougui
Sorry, I mistyped the code: it should be timeWindow(Time.minutes(10)) instead of window(Time.minutes(10)).

Re: how to get rid of duplicate rows group by in DataStream

2016-08-24 Thread Yassine Marzougui
Hi subash, A stream is infinite, hence it has no notion of a "final" count. To get distinct counts you need to define a period (i.e. a window [1]) over which you count elements and emit a result, by adding a window operator before the reduce. For example, the following will emit distinct counts every
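The code example in this message is truncated in the archive. As a separate illustration of the point, the effect of adding a window before the count can be sketched in plain Java with no Flink dependency; the Event class and method names here are inventions for the sketch, and events are assumed to arrive in timestamp order:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java illustration of what a tumbling window adds: counts accumulate
// per key only within each window and are emitted when the window closes,
// instead of growing forever on an infinite stream.
public class WindowedCounts {
    // Minimal stand-in for a timestamped stream element.
    public static class Event {
        final long timestamp;
        final String key;
        public Event(long timestamp, String key) {
            this.timestamp = timestamp;
            this.key = key;
        }
    }

    // Emits one key->count map per tumbling window of windowMillis.
    public static List<Map<String, Long>> countPerWindow(List<Event> events, long windowMillis) {
        List<Map<String, Long>> results = new ArrayList<>();
        Map<String, Long> current = new HashMap<>();
        long windowEnd = -1;
        for (Event e : events) {
            // End timestamp of the window this event falls into.
            long end = (e.timestamp / windowMillis + 1) * windowMillis;
            if (windowEnd == -1) {
                windowEnd = end;
            }
            if (end != windowEnd) {
                // Window closed: emit its counts and start a fresh window.
                results.add(current);
                current = new HashMap<>();
                windowEnd = end;
            }
            current.merge(e.key, 1L, Long::sum);
        }
        if (!current.isEmpty()) {
            results.add(current);
        }
        return results;
    }
}
```

In Flink itself the equivalent is keyBy + timeWindow + a reduce/fold, with the framework managing the per-window state.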

Re: Setting number of TaskManagers

2016-08-24 Thread Foster, Craig
Oh, sorry, I didn't specify that I was using YARN, and I don't necessarily want to set it with the command-line option.

Re: Setting number of TaskManagers

2016-08-24 Thread Greg Hogan
The number of TaskManagers will be equal to the number of entries in the conf/slaves file.
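For reference, conf/slaves is a plain list of hostnames, one per line; bin/start-cluster.sh starts one TaskManager on each listed host. The hostnames below are placeholders:

```
worker-1.example.com
worker-2.example.com
worker-3.example.com
```

With this file, a standalone cluster comes up with three TaskManagers (this mechanism applies to standalone mode, not YARN, where container count is requested at submission time).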

Setting number of TaskManagers

2016-08-24 Thread Foster, Craig
Is there a way to set the number of TaskManagers using a configuration file or environment variable? I'm looking at the docs for it and it says you can set slots but not the number of TMs. Thanks, Craig

Re: how to get rid of duplicate rows group by in DataStream

2016-08-24 Thread subash basnet
Hello Kostas, Sorry for the late reply, but I couldn't understand how to apply split on a DataStream, as below, to get the distinct output stream elements with their counts after applying group-by and reduce. DataStream> gridWithDensity = pointsWithGridCoordinates.map(new

Re: Setting up Zeppelin with Flink

2016-08-24 Thread Trevor Grant
Frank, can you post the zeppelin flink log please? You can probably find it in zeppelin_dir/logs/*flink*.log You've got a few moving pieces here. I've never run zeppelin against Flink in a docker container. But I think the Zeppelin-Flink log is the first place to look. You say you can't get

Re: Dealing with Multiple sinks in Flink

2016-08-24 Thread vinay patil
Hi Max, I tried writing to a local file as well; it gives me the same issue. I have attached the logs and the dummy pipeline code. logs.txt dummy_pipeline.txt

Re: Setting up Zeppelin with Flink

2016-08-24 Thread Trevor Grant
Hey Frank, Saw your post on the Zeppelin list yesterday. I can look at it later this morning, but my gut feeling is that a ghost Zeppelin daemon is running in the background and its local Flink is holding port 6123. This is fairly common and would explain the issue. Idk if you're on linux or

Re: How large a Flink cluster can have?

2016-08-24 Thread Alexis Gendronneau
Hi, If possible, it would be great to know how the JobManager has to be scaled according to the number of nodes. Will one JobManager be able to manage hundreds of nodes? Is there a recommendation in terms of the JM/TM ratio? Thanks

Re: How to get latency info from benchmark

2016-08-24 Thread Robert Metzger
Hi Eric, Max is right, the tool has been used for a different benchmark [1]. The throughput logger that should produce the right output is this one [2]. Very recently, I've opened a pull request for adding metric-measuring support into the engine [3]. Maybe that's helpful for your experiments.

Re: How to get latency info from benchmark

2016-08-24 Thread Maximilian Michels
I believe the AnalyzeTool is for processing logs of a different benchmark. CC Jamie and Robert, who worked on the benchmark.

Re: Setting up Zeppelin with Flink

2016-08-24 Thread Maximilian Michels
Hi! There are some people familiar with the Zeppelin integration. CCing Till and Trevor. Otherwise, you could also send this to the Zeppelin community. Cheers, Max

Re: How to share text file across tasks at run time in Flink.

2016-08-24 Thread Maximilian Michels
Hi! 1. The community is working on adding side inputs to the DataStream API. That will allow you to easily distribute data to all of your workers. 2. In the meantime, you could use `.broadcast()` on a DataSet to broadcast data to all workers. You still have to join that data with another stream

Re: Setting up Zeppelin with Flink

2016-08-24 Thread Frank Dekervel
Hello, for reference: I already found out that "connect to existing process" was my error here: it means connecting to an existing Zeppelin interpreter, not an existing Flink cluster. After fixing my error, I'm now in the same situation as described here:

Re: Dealing with Multiple sinks in Flink

2016-08-24 Thread Maximilian Michels
Hi Vinay, Does this only happen with the S3 file system or also with your local file system? Could you share some example code or log output of your running job? Best, Max

Re: FLINK-4329 fix version

2016-08-24 Thread Maximilian Michels
Added fix versions 1.1.2 and 1.2.0 because a pull request is under way. On Tue, Aug 23, 2016 at 1:17 PM, Ufuk Celebi wrote: > On Tue, Aug 23, 2016 at 12:28 PM, Yassine Marzougui > wrote: >> The fix version of FLINK-4329 in JIRA is set to 1.1.1, but 1.1.1

Re: Checking for existence of output directory/files before running a batch job

2016-08-24 Thread Maximilian Michels
Forgot to mention, this is on the master. For Flink < 1.2.x, you will have to use GlobalConfiguration.get();

Re: Checking for existence of output directory/files before running a batch job

2016-08-24 Thread Maximilian Michels
Hi Niels, The problem is that such method only works reliably if the cluster configuration, e.g. Flink and Hadoop config files, are present on the client machine. Also, the environment variables have to be set correctly. This is usually not the case when working from the IDE. But seems like your

Re: "Failed to retrieve JobManager address" in Flink 1.1.1 with JM HA

2016-08-24 Thread Maximilian Michels
Hi Hironori, That's what I thought. So it won't be an issue for most users who do not comment out the JobManager url from the config. Still, the information printed is not correct. The issue has just been fixed. You will have to wait for the next minor release 1.1.2 or build the 'release-1.1'

Re: sharded state, 2-step operation

2016-08-24 Thread Stephan Ewen
Hi! The "feedback loop" sounds like a solution, yes. Actually, that works well with the CoMap / CoFlatMap - one input to the CoMap would be the original value, the other input the feedback value.

Re: JobManager HA without Distributed FileSystem

2016-08-24 Thread Stephan Ewen
Hi! - Concerning replication to other JobManagers - this could be an extension, but it would need to also support additional replacement JobManagers coming up later, so it would need a replication service in the JobManagers, not just a "send to all" at program startup. - That would work in

Re: "Failed to retrieve JobManager address" in Flink 1.1.1 with JM HA

2016-08-24 Thread Hironori Ogibayashi
Ufuk, Max, Thank you for your answers and for opening the JIRA issue. I will wait for the fix. As Max mentioned, I had first commented out jobmanager.rpc.address and jobmanager.rpc.port. When I tried setting localhost and 6123 respectively, it worked. Regards, Hironori

Re: JobManager HA without Distributed FileSystem

2016-08-24 Thread Konstantin Knauf
Hi Stephan, thanks for the quick response; understood. Is there a reason why the JAR files and JobGraph are not sent to all JobManagers by the client? Likewise, why don't all TaskManagers send checkpoint metadata to all JobManagers? I did not have any other storage in mind [1]. I am basically

Re: flink-shaded-hadoop

2016-08-24 Thread Aljoscha Krettek
Hi, this might be due to a bug in the Flink 1.1.0 Maven dependencies. Can you try updating to Flink 1.1.1? Cheers, Aljoscha