ted
> to [2] ?
>
> [2] https://issues.apache.org/jira/browse/FLINK-3047
>
> Best,
> Ovidiu
>
> On 22 Feb 2016, at 18:13, Till Rohrmann wrote:
>
> Hi Ovidiu,
>
> at the moment Flink's batch fault tolerance restarts the whole job in case
> of a failure. However, par
Hi Shikhar,
you're right that including a connector dependency would have let us spot
the problem earlier. In fact, any project building a fat jar with SBT would
have failed without setting the flink dependencies to provided.
The problem is that the template is a general purpose template. Thus, i
Hi Tim,
depending on how you create the DataSource fileList, Flink will
schedule the downstream operators differently. If you used the
ExecutionEnvironment.fromCollection method, then it will create a DataSource
with a CollectionInputFormat. This kind of DataSource will only be executed
with a deg
Registering a data type is only relevant for the Kryo serializer or if you
want to serialize a subclass of a POJO. Registering has the advantage that
you assign an id to the class which is written instead of the full class
name. The latter is usually much longer than the id.
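For illustration, a minimal sketch of both registration variants (MyCustomType is a made-up example type; Kryo's JavaSerializer merely stands in for a custom serializer):

import com.esotericsoftware.kryo.serializers.JavaSerializer
import org.apache.flink.api.scala._

// An example type that Flink cannot treat as a POJO (no getters/setters),
// so it falls back to Kryo.
class MyCustomType(var field: Int) extends java.io.Serializable

val env = ExecutionEnvironment.getExecutionEnvironment

// Assigns an id to the class; Kryo then writes the id instead of the full class name.
env.registerType(classOf[MyCustomType])

// Alternatively, register a custom Kryo serializer for the type.
env.registerTypeWithKryoSerializer(classOf[MyCustomType], classOf[JavaSerializer])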
Cheers,
Till
On Tue,
Hi Saiph,
I think the configuration value should be parallelism.default: 6. That will
execute jobs which have no parallelism defined with a DOP of 6.
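That is, in conf/flink-conf.yaml:

parallelism.default: 6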
Cheers,
Till
On Wed, Feb 24, 2016 at 1:43 AM, Saiph Kappa wrote:
> Hi,
>
> I am running a flink stream application on a cluster with 6 slave
an Tuple serialization but much
>faster than Kryo)
>- What if I call env.registerTypeWithKryoSerializer()? Why should I
>specify a serializer for Kryo?
>
> Best,
> Flavio
>
>
> On Tue, Feb 23, 2016 at 4:08 PM, Till Rohrmann
> wrote:
>
>> Registering a d
I assume that you included the flink-connector-twitter dependency in your
job jar, right? Alternatively, you might also put the jar in the lib folder
on each of your machines.
Cheers,
Till
On Wed, Feb 24, 2016 at 10:38 AM, ram kumar wrote:
> Hi,
>
>
> getting below error when executing twitte
Hi Pankaj,
are you creating a fat jar when you create your user code jar? This can be
done using maven's shade plugin or the assembly plugin. We provide a maven
archetype to set up a pom file which will make sure that a fat jar is built
[1].
[1]
https://ci.apache.org/projects/flink/flink-docs-mast
Hi Andrea,
no, there isn’t. But you can always start your own ActorSystem in a stateful
operator.
Cheers,
Till
On Wed, Feb 24, 2016 at 11:57 AM, Andrea Sella
wrote:
> Hi,
> Is there a way to access the underlying TaskManager's ActorSystem?
>
> Thank you in advance,
> Andrea
>
CollectionInputFormat,
IteratorInputFormat and the JDBCInputFormat.
I hope this helps.
Cheers,
Till
On Tue, Feb 23, 2016 at 3:44 PM, Tim Conrad
wrote:
> Hi Till (and others).
>
> Thank you very much for your helpful answer.
>
> On 23.02.2016 14:20, Till Rohrmann wrote:
>
> [...] In c
What is the error message you receive?
On Wed, Feb 24, 2016 at 1:49 PM, Pankaj Kumar wrote:
> Hi Till ,
>
> I was able to build a fat jar, but I am not able to execute this jar through
> the Flink command line.
>
> On Wed, Feb 24, 2016 at 4:31 PM, Till Rohrmann
> wrote:
>
>
If I’m not mistaken, then this shouldn’t solve the scheduling peculiarity
of Flink. Flink will still deploy the tasks of the flat map operation to
the machine where the source task is running. Only once this machine has
no more slots left will other machines be used as well.
I think that you don
What is currently the error you observe? It might help to clear
org.apache.flink in the ivy cache once in a while.
Cheers,
Till
On Wed, Feb 24, 2016 at 6:09 PM, Cory Monty
wrote:
> We're still seeing this issue in the latest SNAPSHOT version. Do you have
> any suggestions to resolve the error?
ffect in
> the past.
>
> On Wed, Feb 24, 2016 at 11:34 AM, Till Rohrmann
> wrote:
>
>> What is currently the error you observe? It might help to clear
>> org.apache.flink in the ivy cache once in a while.
>>
>> Cheers,
>> Till
>>
>> On Wed, F
Hi Christoph,
have you tried setting the blocks parameter of the SVM algorithm? That
basically decides how many feature vectors are grouped together in one block.
The lower the value is, the more feature vectors are grouped together and,
thus, the larger the blocks become. Increasing this value migh
Hi Saiph,
you can do it the following way:
input.keyBy(0).timeWindow(Time.seconds(10)).fold(0, new
FoldFunction<Tuple2<String, Integer>, Integer>() {
    @Override
    public Integer fold(Integer integer, Tuple2<String, Integer> o)
        throws Exception {
        // increment the count for every element in the window
        return integer + 1;
    }
});
Cheers,
Till
On Thu, Feb 25, 2016 at 7:58 PM,
Hi Gyula,
could it be that you compiled against a different Scala version than the
one you're using for running the job? This usually happens when you compile
against 2.10 and let it run with version 2.11.
Cheers,
Till
On Fri, Feb 26, 2016 at 10:09 AM, Gyula Fóra wrote:
> Hey,
>
> For one of o
Hi Shuhao,
the configuration you’re providing is only used for the storm compatibility
layer and not Flink itself. When you run your job locally, the
LocalFlinkMiniCluster should be started with as many slots as your maximum
degree of parallelism is in your topology. You can check this in
FlinkLoc
Hi Shikhar,
that is a problem we just found out today. The problem is that the
scala.binary.version was not properly replaced in the parent pom so that it
resolves to 2.10 [1]. Max already opened a PR to fix this problem. With the
next release candidate, this should be fixed.
[1] https://issues.a
PSHOT yourself or are you relying on the
snapshot repository?
>>>>>
>>>>> We had issues in the past that jars in the snapshot repo were
incorrect
>>>>>
>>>>> On Fri, Feb 26, 2016 at 10:45 AM, Gyula Fóra
wrote:
>>>>>>
Hi Max,
the problem is that before starting the TM, we have to find the network
interface which is reachable by the other machines. So what we do is to
connect to the current JobManager. If it should happen, as in your case,
that the JobManager just died and the new JM address has not been written
Hi Jerry,
at the moment it is not yet possible to access previous elements in the
filter function of an individual element. Therefore, you have to check for
the condition “B is 5 days after A” in the final select statement. Giving
this context to the where clause would be indeed a nice addition to
ult back to
> the hostname/interface that is configured on the machine.
>
>
> On Thu, Mar 3, 2016 at 10:43 AM, Till Rohrmann
> wrote:
>
>> Hi Max,
>>
>> the problem is that before starting the TM, we have to find the network
>> interface which is
behavior, introduced in HA.
> Originally, if the connection attempts failed, it always returned the
> InetAddress.getLocalHost()
> interface.
> I think we should change it back to that, because that interface is by far
> the best possible heuristic.
>
> On Thu, Mar 3, 2016 at 11:
> TNG Technology Consulting GmbH, Betastr. 13a, 85774 Unterföhring
> Geschäftsführer: Henrik Klagges, Christoph Stock, Dr. Robert Dahlke
> Sitz: Unterföhring * Amtsgericht München * HRB 135082
>
> Am 03.03.2016 um 12:29 schrieb Till Rohrmann :
>
> No I don't think this behaviour has
o get, for example, the last
> event of a window, lets say a 5 second window?
>
> Rgds,
>
> Vitor Vieira
> @notvitor
>
> 2016-03-03 7:29 GMT-03:00 Till Rohrmann :
>
>> Hi Jerry,
>>
>> at the moment it is not yet possible to access previous elements in the
>
Great to hear Shikhar :-)
Cheers, Till
On Mar 4, 2016 3:51 AM, "shikhar" wrote:
> Thanks Till. I can confirm that things are looking good with RC5.
> sbt-assembly works well with the flink-kafka connector dependency not
> marked
> as "provided".
>
>
>
> --
> View this message in context:
> http:
Hi Arnaud,
with version 1.0 the behaviour for window triggering in case of a finite
stream was slightly changed. If you use event time, then all unfinished
windows are triggered when your stream ends. This can be motivated
by the fact that the end of a stream is equivalent to no elements w
Yes it seems as if you have a netty version conflict. Maybe the
alluxio-core-client.jar pulls in an incompatible netty version. Could you
check whether this is the case? But maybe you also have another
dependency which pulls in a wrong netty version, since the Alluxio
documentation indicates that
ied to downgrade the Alluxio's netty version from 4.0.28.Final to
> 4.0.27.Final to align Flink and Alluxio dependencies. First of all, Flink
> 1.0.0 uses 4.0.27.Final, is it correct? Btw it doesn't work, same error as
> above.
>
> BR,
> Andrea
>
> 2016-03-14 15:30 GMT
Hi Ravinder,
could you tell us what's written in the taskmanager log of the failing
taskmanager? There should be some kind of failure message explaining why
the taskmanager stopped working.
Moreover, given that you have 64 GB of main memory, you could easily give
50GB as heap memory to each taskmanager.
Cheers,
Ti
Great to hear Tianqi :-) I will try it out.
Cheers,
Till
On Tue, Mar 15, 2016 at 12:41 AM, Tianqi Chen
wrote:
> Hi Flink Community:
> I am sending this email to let you know we just release XGBoost4J
> which also runs on Flink. In short, XGBoost is a machine learning package
> that is used
Hi Lydia,
the implementation looks correct. What you could do to speed up the
computation is to exploit existing partitionings in order to avoid
unnecessary network shuffles. Moreover, you could block your matrices to
increase the data granularity at the cost of parallelism.
Cheers,
Till
On Mon,
rdCount program for streaming and batch
>> respectively?
>>
>> – Ufuk
>>
>> On Tue, Mar 15, 2016 at 10:22 AM, Till Rohrmann
>> wrote:
>> > Hi Ravinder,
>> >
>> > could you tell us what's written in the taskmanager log of the failing
>>
askManager
>- Determined BLOB server address to be /10.155.208.156:59504.
> Starting BLOB cache.
> 09:55:38,536 INFO org.apache.flink.runtime.blob.BlobCache
> - Created BLOB cache storage directory
> /tmp/blobStore-8e88302d-3303-4c80-8613-f0be13911fb2
> 09:56:48,371
nt. Do I need to specify the hadoop
> configuration via code or core-site.xml is enough?
>
> Thank you again,
> Andrea
>
> 2016-03-14 17:28 GMT+01:00 Till Rohrmann :
>
>> Hi Andrea,
>>
>> the problem won’t be netty-all but netty, I suspect. Flink is using
>> ve
Hi Andrea,
there is also a PR [1] which will allow you to access the TaskManager logs
via the UI.
[1] https://github.com/apache/flink/pull/1790
Cheers,
Till
On Wed, Mar 9, 2016 at 1:58 PM, Stephan Ewen wrote:
> Hi!
>
> Yes, the dashboard is available in both cases. It is proxied through the
>
Hi Radu,
the mapping which StreamOperator is executed by which StreamTask happens
first in the StreamGraph.addOperator method. However, there is a second
step in the StreamingJobGraphGenerator.createChain where chainable
operators are chained and then executed by a single StreamTask. The
construct
Hi Radu,
the API call slotSharingGroup was introduced with version 1.0. In the
version 0.10 there was something similar called startNewResourceGroup, but
it was somewhat broken. Therefore, I would recommend upgrading to
version 1.0. You can find the description of the new method here [1]. The
Hi Dominique,
have you tried setting the Kafka property props.put("auto.offset.reset",
"smallest");?
Cheers,
Till
On Thu, Mar 17, 2016 at 1:39 PM, Dominique Rondé <
dominique.ro...@codecentric.de> wrote:
> Hi folks,
>
> i have a kafka topic with messages from the last 7 days. Now i have a new
Hi Ahmed,
if you don't set the parallelism in your program then depending on how you
execute your program different parallelisms will be used. If you execute it
in your IDE, then the number of cores will be used as parallelism. If you
submit it to a cluster without specifying the parallelism via t
cribed in the standard.
>>
>> Having this pattern matching CEP functionality in Flink is a killing
>> feature IMHO.
>>
>> Best Regards,
>>
>> Jerry
>>
>>
>> On Thu, Mar 3, 2016 at 8:47 AM, Till Rohrmann
>> wrote:
>>
>
Hi Subash,
you can use ExecutionEnvironment env = ...; env.setParallelism(dop) for
that.
Cheers,
Till
On Mon, Mar 21, 2016 at 3:42 PM, subash basnet wrote:
> Hello all,
>
> Using the flink-webclient we have the options to define no. of parallelism
> and the same no. i.e. taskmanager.numberO
Hi Bart,
there are multiple ways how to specify a window function using the Scala
API. The most scalaesque way would probably be to use an anonymous function:
val env = StreamExecutionEnvironment.getExecutionEnvironment
val input = env.fromElements(1,2,3,4,5,7)
val pair = input.map(x => (x, x))
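A possible continuation (a sketch, not the original code): apply an anonymous function on a keyed time window, e.g. a reduce that sums the second tuple field per key and window:

import org.apache.flink.streaming.api.windowing.time.Time

val sums = pair
  .keyBy(_._1)
  .timeWindow(Time.seconds(5))
  // anonymous reduce function
  .reduce((a, b) => (a._1, a._2 + b._2))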
Hi Simone, can your problem be related to this mail thread [1]?
[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Flink-1-0-0-JobManager-is-not-running-in-Docker-Container-on-AWS-td10711.html
Cheers,
Till
On Tue, Mar 22, 2016 at 1:22 PM, Simone Robutti <
simone.robu...@radicalbi
Hi Lydia,
I tried to reproduce your problem but I couldn't. Can it be that you have
a non-deterministic operation somewhere in your program or do you read the
data from a source with varying data? Maybe you could send us a compilable
and complete program which reproduces your problem.
Cheers,
Til
csvReader.fieldDelimiter(",");
> csvReader.includeFields("ttt");
> return csvReader.types(Integer.class, Integer.class, Double.class);
> }
>
>
> Am 22.03.2016 um 14:47 schrieb Till Rohrmann :
>
> Hi Lydia,
>
> I tried to reproduce your prob
Tue, Mar 22, 2016 at 3:31 PM, Lydia Ickler
wrote:
> Sorry I was not clear:
> I meant the initial DataSet is changing. Not the ds. :)
>
>
>
> Am 22.03.2016 um 15:28 schrieb Till Rohrmann :
>
> From the code extract I cannot tell what could be wrong because the code
> lo
Hi,
have you tried clearing your m2 repository? It would also be helpful to see
your dependencies (pom.xml).
Cheers,
Till
On Tue, Mar 22, 2016 at 10:41 PM, Sharma, Samiksha wrote:
> Hi,
>
> I am converting a storm topology to Flink-storm topology using the
> flink-storm dependency. When I run
Hi Balaji,
the output you see is the correct output since you're computing a
continuous reduce of the incoming data. Since you haven't defined a time
frame for your reduce computation, you would either have to wait for all
eternity to output the final result or you output every time you've
generate
Hi Gna,
there are no utilities yet to do that but you can do it manually. In the
end, a model is simply a Flink DataSet which you can serialize to some
file. Upon reading this DataSet you simply have to give it to your
algorithm to be used as the model. The following code snippet illustrates
this
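A minimal sketch along those lines (assuming FlinkML's SVM, which exposes its weights DataSet via weightsOption; paths are placeholders):

import org.apache.flink.api.scala._
import org.apache.flink.ml.MLUtils
import org.apache.flink.ml.classification.SVM

val env = ExecutionEnvironment.getExecutionEnvironment

// Read training data (placeholder path) and fit the model.
val trainingData = MLUtils.readLibSVM(env, "hdfs:///data/train.libsvm")
val svm = SVM().setIterations(10)
svm.fit(trainingData)

// The model is just a DataSet of weight vectors, so it can be written to a
// file and later read back and handed to the algorithm again.
svm.weightsOption.foreach(_.writeAsText("hdfs:///models/svm-weights"))
env.execute("persist SVM model")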
Hi,
I think Ufuk is completely right. As far as I know, we don't support this
function and nobody's currently working on it. If you like, then you could
take the lead there.
Cheers,
Till
On Mon, Mar 28, 2016 at 10:50 PM, Ufuk Celebi wrote:
> Hey Gna! I think that it's not on the road map at th
Hi,
what do you use the ExecutionContext for? That should actually be something
which you shouldn’t be concerned with since it is only used internally by
the runtime.
Cheers,
Till
On Tue, Mar 29, 2016 at 12:09 PM, Stefano Bortoli
wrote:
> Well, in theory yes. Each task has a thread, but only
In fact, I don't use it. I just had to crawl back the runtime
> implementation to get to the point where parallelism was switching from 32
> to 8.
>
> saluti,
> Stefano
>
> 2016-03-29 12:24 GMT+02:00 Till Rohrmann :
>
>> Hi,
>>
>> for what do you
>
>>
>> case class WeightVector(weights: Vector, intercept: Double) extends
>> Serializable {}
>>
>>
>> However, I will use the approach to write out the weights as text.
>>
>>
>> On Tue, Mar 29, 2016 at 5:01 AM, Till Rohrmann
>> wrote:
>>
>
Hi Stavros,
you might be able to solve your problem using a CoFlatMap operation with
iterations. You would use one of the inputs for the iteration on which you
broadcast the model updates to every operator. On the other input you would
receive the data points which you want to cluster. As output y
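A minimal sketch of that setup (the feedback iteration is omitted and all type and field names are hypothetical):

import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

case class Model(centroids: Seq[Double])  // hypothetical model type
case class Point(x: Double)               // hypothetical data point type

val env = StreamExecutionEnvironment.getExecutionEnvironment
val modelUpdates: DataStream[Model] = env.fromElements(Model(Seq(0.0)))
val points: DataStream[Point] = env.fromElements(Point(1.0), Point(2.0))

// Broadcast the model updates so that every parallel instance sees the
// latest model; the second input carries the points to cluster.
val labeled = modelUpdates.broadcast
  .connect(points)
  .flatMap(new CoFlatMapFunction[Model, Point, (Point, Model)] {
    var model: Option[Model] = None

    override def flatMap1(update: Model, out: Collector[(Point, Model)]): Unit = {
      model = Some(update) // first input: remember the latest model
    }

    override def flatMap2(point: Point, out: Collector[(Point, Model)]): Unit = {
      model.foreach(m => out.collect((point, m))) // second input: apply it
    }
  })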
Hi Lydia,
all downstream operators which depend on the bulk iteration will wait
implicitly until data from the iteration operator is available.
Cheers,
Till
On Thu, Mar 31, 2016 at 9:39 AM, Lydia Ickler
wrote:
> Hi all,
>
> is there a way to tell the program that it should wait until the
> Bul
Actually I think that it’s not correct that the OptionType cannot be used
as a key type. In fact it is similar to a composite type and should be
usable as a key iff its element can be used as a key. Then we only have to
provide an OptionTypeComparator which will compare the elements if they are
se
Hi Tarandeep,
the number of elements in each partition should stay constant. In fact the
elements in each partition should not change.
Cheers,
Till
On Wed, Mar 30, 2016 at 8:14 AM, Tarandeep Singh
wrote:
> Hi,
>
> I am looking at implementation of zipWithIndex in DataSetUtils-
>
> https://gith
nager..right?
> Then, each partition is divided again at the task manager to maximize the
> slot usage..is it correct?
> In every case, there will be a case where at least one partition is
> smaller than the others...am I wrong? Am I confusing some term..?
>
> Best,
> Flavio
>
one sends already by default?
>
> Best regards,
> Lydia
>
>
> Am 31.03.2016 um 12:01 schrieb Till Rohrmann :
>
> Hi Lydia,
>
> all downstream operators which depend on the bulk iteration will wait
> implicitly until data from the iteration operator is available.
>
>
Hi Flink community,
I've written a short blog post [1] about Flink's new CEP library which
basically showcases its functionality using a monitoring example. I would
like to publish the post on the flink.apache.org blog next week, if nobody
objects. Feedback is highly appreciated :-)
[1]
https://d
e put it on Flink Blog
>
> Cheers
> Gen
>
>
> On Fri, Apr 1, 2016 at 9:56 PM, Till Rohrmann
> wrote:
>
>> Hi Flink community,
>>
>> I've written a short blog post [1] about Flink's new CEP library which
>> basically showcases its functionali
Hi Anwar,
yes, once we have published the introductory blog post about the CEP
library, we will also publish a more in-depth description of the approach
we have implemented. To spoil it a little bit: We have mainly followed the
paper “Efficient Pattern Matching over Event Streams” for the
implemen
Yes exactly. This is a feature which we still have to add.
On Tue, Apr 5, 2016 at 1:07 PM, Anwar Rizal wrote:
> Thanks Till.
>
> The only way I can change the behavior would be to post filter the result
> then.
>
> Anwar.
>
> On Tue, Apr 5, 2016 at 11:41 AM, Till Ro
Hi Balaji,
from the stack trace it looks as if you cannot open a connection to Redis.
Have you checked that you can access Redis from all your TaskManager nodes?
Cheers,
Till
On Wed, Apr 6, 2016 at 7:46 AM, Balaji Rajagopalan <
balaji.rajagopa...@olacabs.com> wrote:
> I am trying to use AWS EMR ya
Hi Norman,
which version of Flink are you using? We recently fixed some issues with
the CEP library which looked similar to your error message. The problem
occurred when using the CEP library with processing time. Switching to
event or ingestion time solved the problem.
The fixes to make it also
> redisClient.set(k,v,exTime)
> }
>
>
> def get(k: String): Option[String] = {
> import scala.concurrent.duration._
> val f = redisClient.get[String](k)
> Await.result(f, 1.seconds) //FIXME - really bad need to return future
> here.
> }
>
> }
>
>
> O
only work
> with 1.0.1.
>
> On Mon, Apr 4, 2016 at 3:35 PM, Till Rohrmann
> wrote:
> > Thanks a lot to all for the valuable feedback. I've incorporated your
> > suggestions and will publish the article, once Flink 1.0.1 has been
> released
> > (we need 1
client machine
> GlobalConfiguration params will passed on to the task manager nodes, as
> well, it was not and values from default was getting pickup, which was
> localhost 6379 and there was no redis running in localhost of task manager.
>
> balaji
>
> On Wed, Apr 6, 201
Hi Norman,
this error is exactly what I thought I had fixed. I guess there is still
another case where a premature pruning can happen in the SharedBuffer.
Could you maybe send me the example code with which you can reproduce the
error. The input data would also be very helpful. Then I can debug it
Hi Timur,
what you can try doing is to pass the JVM parameter
-Djava.library.path=<path> via the env.java.opts to the system. You simply
have to add env.java.opts: "-Djava.library.path=<path>" in the
flink-conf.yaml or via -Denv.java.opts="-Djava.library.path=<path>", if I’m
not mistaken.
Cheers
Till
On Thu, Ap
For passing the dynamic property directly when running things on YARN, you
have to use -yDenv.java.opts="..."
On Thu, Apr 7, 2016 at 11:42 AM, Till Rohrmann wrote:
> Hi Timur,
>
> what you can try doing is to pass the JVM parameter
> -Djava.library.path=<path> via the env.j
Hi Norman,
could you provide me an example input data set which produces the error?
E.g. the list of strings you inserted into Kafka/read from Kafka?
Cheers,
Till
On Thu, Apr 7, 2016 at 11:05 AM, norman sp wrote:
> Hi Till,
> thank you. here's the code:
>
> public class CepStorzSimulator {
>
>
> behrouz.derakhshan@
>
> >> wrote:
> >
> >> Is there a reasons the Predictor or Estimator class don't have read and
> >> write methods for saving and retrieving the model? I couldn't find Jira
> >> issues for it. Does it make sense to
Sorry, I had a mistake in my example code. I thought the model would be
stored as a (Option[DataSet[Factors]], Option[DataSet[Factors]]) but
instead it’s stored as Option[(DataSet[Factors], DataSet[Factors])].
So the code should be
val als = ALS()
als.fit(input)
val alsModelOpt = als.factorsOpt
Hi Maxim,
concerning your second part of the question: The managed memory of a
TaskManager is first split among the available slots. Each slot portion of
the managed memory is again split among all operators which require managed
memory when a pipeline is executed. In contrast to that, the heap me
Hi Prez,
1.
the configuration setting taskmanager.numberOfTaskSlots says with how
many task slots a TaskManager will be started. As a rough rule of thumb,
set this value to the number of cores of the machine the TM is running on.
See this link [1] for further information. The conf
ach operator before the the next window
> processing begins?
>
> Thnx!
>
>
> On Fri, Apr 1, 2016 at 10:51 PM, Stavros Kontopoulos <
> st.kontopou...@gmail.com> wrote:
>
>> Ok thnx Till i will give it a shot!
>>
>> On Thu, Mar 31, 2016 at 11:25 AM, Till Roh
Hi Chen,
two subtasks of the same operator can never be executed within the same
slot/pipeline. The `slotSharingGroup` allows you to only control which
subtasks of different operators can be executed alongside each other in the
same slot. It basically allows you to break pipelines into smaller ones.
Therefo
That is correct. You can also provide it as a property to the CLI:
-Denv.java.opts="-Dmy-prop=bla -Dmyprop2=bla2"
Cheers,
Till
On Sun, Apr 17, 2016 at 3:56 PM, Igor Berman wrote:
> for the sake of history(at task manager level):
> in conf/flink-conf.yaml
> env.java.opts: -Dmy-prop=bla -Dmy-prop
tency external service is going to be put in a separate slot. Is it
> possible to indicate to the Flink that subtasks of a particular operation
> can be collocated in a slot, as such subtasks are IO bound and require no
> shared memory?
>
> On Fri, Apr 15, 2016 at 5:31 AM, Till Roh
Hi Michael-Keith,
you can use S3 as the checkpoint directory for the filesystem state
backend. This means that whenever a checkpoint is performed the state data
will be written to this directory.
The same holds true for the zookeeper recovery storage directory. This
directory will contain the sub
Hi Yifei,
if you don't wanna implement your own join operator, then you could also
chain two join operations. I created a small example to demonstrate that:
https://gist.github.com/tillrohrmann/c074b4eedb9deaf9c8ca2a5e124800f3.
However, bear in mind that for this approach you will construct two wi
Hi Sendoh,
you have to edit your log4j.properties file to set log4j.rootLogger=OFF in
order to turn off the logger. Depending on how you run Flink and where you
wanna turn off the logging, you either have to edit the log4j.properties
file in the FLINK_HOME/conf directory or the one in your project whi
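For example, to turn logging off entirely:

# FLINK_HOME/conf/log4j.properties (or the log4j.properties of your project)
log4j.rootLogger=OFF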
Hi Leonard,
the UUID class cannot be treated as a POJO by Flink, because it is lacking
the public getters and setters for mostSigBits and leastSigBits. However,
it should be possible to treat it as a generic type. I think the difference
is that you cannot use key expressions and key indices to def
I assume that the provided FetchStock code is not complete. As the
exception indicates, you somehow store a LocalStreamEnvironment in your
source function. The StreamExecutionEnvironments are not serializable and
cannot be part of the source function’s closure.
Cheers,
Till
On Tue, Apr 19, 2016
Have you made sure that Flink is using logback [1]?
[1]
https://ci.apache.org/projects/flink/flink-docs-master/apis/best_practices.html#using-logback-instead-of-log4j
Cheers,
Till
On Tue, Apr 19, 2016 at 2:01 PM, Balaji Rajagopalan <
balaji.rajagopa...@olacabs.com> wrote:
> The are two files in
Hi Norman,
sorry for the late reply. I finally found time and could, thanks to you,
reproduce the problem. The problem was that the window borders were treated
differently in two parts of the code. Now the left border of a window is
inclusive and the right border (late elements) is exclusive. I've
Hi Stefano,
Hadoop supports this feature since version 2.6.0. You can define a time
interval for the maximum number of applications attempt. This means that
you have to observe this number of application failures in a time interval
before failing the application ultimately. Flink will activate thi
ry them out. I have
> another question.
>
> Since S2 may be days delayed, there may be lots of windows and a large
> amount of data stored in memory waiting for computation. How does Flink
> deal with that?
>
> Thanks,
>
> Yifei
>
> On Tue, Apr 19, 2016 at
You could use CEP for that. First you would create a pattern of two states
which matches everything. In the select function you could then check
whether both elements are different.
However, this would be a little bit of an overkill for this simple use
case. You could for example simply use a flat
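A flatMap sketch along those lines (simplified: the previous element lives in a plain field, which ignores fault tolerance and assumes the stream is not keyed/parallel):

import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

val env = StreamExecutionEnvironment.getExecutionEnvironment
val events: DataStream[String] = env.fromElements("a", "a", "b", "c")

// Emit a pair whenever the current element differs from the previous one.
val changes = events.flatMap(new RichFlatMapFunction[String, (String, String)] {
  private var previous: String = _

  override def flatMap(current: String, out: Collector[(String, String)]): Unit = {
    if (previous != null && previous != current) {
      out.collect((previous, current))
    }
    previous = current
  }
})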
Have you created a RemoteExecutionEnvironment to submit your job from
within the IDE to the running cluster? See here [1] for more information.
[1]
https://ci.apache.org/projects/flink/flink-docs-master/apis/cluster_execution.html
Cheers,
Till
On Wed, Apr 20, 2016 at 3:41 PM, Ritesh Kumar Sing
Hi Elias,
sorry for the late reply. You're right that with the windowed join you
would have to deal with pairs where the timestamp of (x,y) is not
necessarily earlier than the timestamp of z. Moreover, by using sliding
windows you would receive duplicates as you've described. Using tumbling
window
Hi Sowmya,
I'm afraid at the moment it is not possible to store custom state in the
filter or select function. If you need access to the whole sequence of
matching events then you have to put this code in the select clause of your
pattern stream.
Cheers,
Till
On Fri, Apr 22, 2016 at 7:55 AM, Sow
Hi Shannon,
if you need this feature (assigning a range of web server ports) for your use
case, then we would have to add it. If you want to do it, then it would
help us a lot.
I think the documentation is a bit outdated here. The port is either chosen
from the range of ports or an ephemeral port is
But be aware that this method only returns a non-null string if the
binaries have been built with Maven. Otherwise it will return null.
Cheers,
Till
On Fri, Apr 22, 2016 at 12:12 AM, Trevor Grant
wrote:
> dug through the codebase, in case any others want to know:
>
> import org.apache.flink.run
Depending on how the key-value pair is encoded, you could use the
TypeInformationKeyValueSerializationSchema where you provide the
BasicTypeInfo.STRING_TYPE_INFO and
PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO as the key and value
type information. But this only works if your data was ser
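A minimal sketch of that setup (assuming the Kafka 0.8 consumer and the (keyTypeInfo, valueTypeInfo, executionConfig) constructor; addresses and topic are placeholders):

import java.util.Properties

import org.apache.flink.api.common.typeinfo.{BasicTypeInfo, PrimitiveArrayTypeInfo}
import org.apache.flink.api.java.tuple.{Tuple2 => JTuple2}
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer08
import org.apache.flink.streaming.util.serialization.TypeInformationKeyValueSerializationSchema

val env = StreamExecutionEnvironment.getExecutionEnvironment

// String keys, byte[] values, deserialized with Flink's own type serializers.
val schema = new TypeInformationKeyValueSerializationSchema[String, Array[Byte]](
  BasicTypeInfo.STRING_TYPE_INFO,
  PrimitiveArrayTypeInfo.BYTE_PRIMITIVE_ARRAY_TYPE_INFO,
  env.getConfig)

val props = new Properties()
props.setProperty("zookeeper.connect", "localhost:2181") // placeholder
props.setProperty("bootstrap.servers", "localhost:9092") // placeholder
props.setProperty("group.id", "kv-group")                // placeholder

// Each record arrives as a (Java) Tuple2 of key and value.
implicit val kvTypeInfo = schema.getProducedType
val keyValueStream: DataStream[JTuple2[String, Array[Byte]]] =
  env.addSource(new FlinkKafkaConsumer08("my-topic", schema, props))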
Could you share the logs with us, Timur? That would be very helpful.
Cheers,
Till
On Apr 26, 2016 3:24 AM, "Timur Fayruzov" wrote:
> Hello,
>
> Now I'm at the stage where my job seem to completely hang. Source code is
> attached (it won't compile but I think gives a very good idea of what
> happ
Hi Hironori,
I would go with the second approach, because it is not guaranteed that all
events of a given key have been received by the window operator if the data
source says that all events for this key have been read. The events might
still be in flight. Furthermore, it integrates more nicely w