entrypoint for executing job in task manager

2017-12-20 Thread Steven Wu
Here is my understanding of how job submission works in Flink. When
submitting a job to the job manager via the REST API, we provide an entry class.
The job manager then evaluates the job graph and ships the serialized operators
to the task managers. The task managers then open the operators and run the tasks.

My app typically requires an initialization phase to set up my own
running context on the task manager (e.g. calling a static method of some
class). Does Flink provide any entry hook on the task manager when executing a
job (and its tasks)? On the job manager side, the entry class provides such a
hook where I can initialize my static context.

Thanks,
Steven
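
[Editor's note: a common way to get such a hook is the open() method of a rich
function, which runs on the task manager in each parallel subtask before any
records are processed. Below is a minimal sketch; the MyStaticContext class and
the once-per-JVM guard are illustrative, not something from this thread.]

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class ContextInitializingMapper extends RichMapFunction<String, String> {

    // open() runs once per parallel subtask, so the static setup is guarded
    // to execute only once per task-manager JVM.
    private static volatile boolean initialized = false;

    @Override
    public void open(Configuration parameters) throws Exception {
        synchronized (ContextInitializingMapper.class) {
            if (!initialized) {
                MyStaticContext.initialize();   // hypothetical application-specific setup
                initialized = true;
            }
        }
    }

    @Override
    public String map(String value) {
        return value;
    }

    // Stub standing in for whatever static context the application needs.
    static class MyStaticContext {
        static void initialize() { /* set up caches, clients, etc. */ }
    }
}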


Re: Flink upgrade compatibility table

2017-12-20 Thread Colin Williams
Hi Kien,

Thanks for the feedback. I wasn't certain regarding compatibility between
jars. I did bump the Flink library versions and the application did start.
I was just curious whether the previous jar still worked without upgrading.

Regarding the savepoint table: someone should probably add 1.4 information
for consistency.


Thanks,

Colin Williams


On Dec 20, 2017 8:16 PM, "Kien Truong"  wrote:

> Hi Colin,
>
> Did you try to rebuild the application with Flink 1.4? You cannot just
> take a jar built with 1.3 and run it on a 1.4 cluster. Afaik, Flink doesn't
> make any guarantee about binary compatibility between major releases, so
> you always have to recompile your application code when you upgrade the
> cluster.
>
> Also, the compatibility table you mentioned is only applicable to
> savepoints, not application code.
>
> Best regards,
> Kien
>
> Sent from TypeApp 
> On Dec 21, 2017, at 09:06, Colin Williams <colin.williams.seat...@gmail.com> wrote:
>>
>> I recently tried to launch our application 1.3 jars against a 1.4
>> cluster. I got a java.lang.NoClassDefFoundError:
>> org/apache/flink/streaming/api/checkpoint/CheckpointedRestoring when I
>> tried to run our 1.3 flink application against 1.4 .
>>
>> Then I googled around and didn't see a mention of 1.4 in the
>> compatibility table: https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/upgrading.html#compatibility-table
>>
>> Does 1.4 break compatibility? Maybe the 1.4 docs should be updated to
>> reflect that?
>>
>> Thanks,
>>
>> Colin Williams
>>
>


Flink upgrade compatibility table

2017-12-20 Thread Colin Williams
I recently tried to launch our application's 1.3 jars against a 1.4 cluster
and got a java.lang.NoClassDefFoundError:
org/apache/flink/streaming/api/checkpoint/CheckpointedRestoring when I
tried to run our 1.3 Flink application against 1.4.

Then I googled around and didn't see a mention of 1.4 in the compatibility
table:
https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/upgrading.html#compatibility-table

Does 1.4 break compatibility? Maybe the 1.4 docs should be updated to
reflect that?

Thanks,

Colin Williams


Re: Restoring from a SP

2017-12-20 Thread Tzu-Li (Gordon) Tai
Hi Vishal,

AFAIK, intermittent restore failures from savepoints should not be expected.
Do you still have the logs from the failed restore attempts? What exceptions
were the restores failing on?
We would need to take a look at the logs to figure out what may be going on.

Best,
Gordon



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: Apache Flink - Difference between operator and function

2017-12-20 Thread Tzu-Li (Gordon) Tai
Hi Mans,

What's the difference between an operator and a function ? 

An operator in Flink needs to handle processing of watermarks, records, and 
checkpointing of the operator state.
To implement one, you need to extend the AbstractStreamOperator base class.
It is considered a very low-level API that normal users would not use unless 
they have very specific needs.
To add an operator to your pipeline, you would use DataStream::transform(…).

Functions are UDFs such as FlatMapFunction, MapFunction, WindowFunction, 
etc., and are the typical way Flink users define transformations on 
DataStreams / DataSets.
They are added to your pipeline using the specific transformation method for each 
kind of function, e.g. DataStream::flatMap(…) corresponds to the 
FlatMapFunction.
User functions are executed by an underlying operator (specifically, the 
AbstractUdfStreamOperator).
UDFs only expose the abstraction of per-record processing and producing outputs, 
so you don’t have to worry about other complications, for example handling 
watermarks and checkpointing state.
Any registered state in UDFs is managed state and will be checkpointed by the 
underlying operator.
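
[Editor's note: to make the contrast concrete, here is a minimal sketch (class
names are illustrative) of the low-level route — extending AbstractStreamOperator
and wiring it in with DataStream::transform(…) — as opposed to simply passing a
FlatMapFunction to DataStream::flatMap(…).]

import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.operators.AbstractStreamOperator;
import org.apache.flink.streaming.api.operators.OneInputStreamOperator;
import org.apache.flink.streaming.runtime.streamrecord.StreamRecord;

public class UppercaseOperator extends AbstractStreamOperator<String>
        implements OneInputStreamOperator<String, String> {

    @Override
    public void processElement(StreamRecord<String> element) throws Exception {
        // Per-record processing at the operator level; the 'output' field comes from
        // AbstractStreamOperator, which also supplies watermark and checkpoint handling.
        output.collect(element.replace(element.getValue().toUpperCase()));
    }

    // Wiring it into a pipeline via DataStream::transform, in contrast to the UDF
    // route (e.g. input.flatMap(new MyFlatMapFunction())).
    public static DataStream<String> wire(DataStream<String> input) {
        return input.transform("uppercase-operator", BasicTypeInfo.STRING_TYPE_INFO, new UppercaseOperator());
    }
}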

What are the raw state interfaces ? Are they checkpoint related interfaces ?

The raw state interfaces refer to StateInitializationContext and 
StateSnapshotContext, which are only visible when you directly implement an 
operator (e.g. by extending AbstractStreamOperator).
Through those interfaces, you have additional access to raw operator and keyed 
state input / output streams in the initializeState and snapshotState methods, 
which let you read / write state as a stream of raw bytes.

Hope this helps!

Cheers,
Gordon

On 20 December 2017 at 10:06:34 AM, M Singh (mans2si...@yahoo.com) wrote:

Hi:

I am reading the documentation on working with state 
(https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/state.html)
 and it states that :

All datastream functions can use managed state, but the raw state interfaces 
can only be used when implementing operators. Using managed state (rather than 
raw state) is recommended, since with managed state Flink is able to 
automatically redistribute state when the parallelism is changed, and also do 
better memory management.

I wanted to find out:
- What's the difference between an operator and a function?
- What are the raw state interfaces? Are they checkpoint-related interfaces?

Thanks

Mans

Re: A question about Triggers

2017-12-20 Thread Vishal Santoshi
And that further begs the question: how performant is the Timer Service? I
tried to peruse the architecture behind it but could not find a
definite clue. Is it a scheduled service, and if so, how many threads, etc.?
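
[Editor's note: for reference, a minimal sketch of the register/onTimer cycle on
that TimerService from inside a ProcessFunction; types are illustrative, and as
noted in this thread, this version of the API exposes no way to remove a
registered timer.]

import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public class WatermarkTimerSketch extends ProcessFunction<Long, String> {

    @Override
    public void processElement(Long eventTimestamp, Context ctx, Collector<String> out) throws Exception {
        // Register an event-time timer; it fires once the watermark passes this timestamp.
        ctx.timerService().registerEventTimeTimer(eventTimestamp);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        // Called by the operator's internal timer service when the watermark reaches 'timestamp';
        // this is the point where "everything up to the current watermark" can be emitted.
        out.collect("watermark passed " + timestamp);
    }
}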

On Wed, Dec 20, 2017 at 4:25 PM, Vishal Santoshi 
wrote:

> Makes sense. Did a first stab at using ProcessFunction. The TimerService
> exposed by the Context does not have a remove-timer method. Is that primarily
> because a priority queue is the storage and removal from a PriorityQueue is
> expensive? The Trigger Context does expose another version that has removal
> abilities, so I was wondering about this dissonance...
>
> On Tue, Dec 19, 2017 at 4:53 AM, Fabian Hueske  wrote:
>
>> Hi Vishal,
>>
>> it is not guaranteed that add() and onElement() receive the same object,
>> and even if they do it is not guaranteed that a mutation of the object in
>> onElement() has an effect. The object might have been serialized and stored
>> in RocksDB.
>> Hence, elements should not be modified in onElement().
>>
>> Have you considered to implement the operation completely in a
>> ProcessFunction instead of a session window?
>> This might be more code but easier to design and reason about because
>> there is no interaction of window assigner, trigger, and window function.
>>
>>
>> 2017-12-18 20:49 GMT+01:00 Vishal Santoshi :
>>
>>> I guess https://github.com/apache/flink/blob/7f99a0df669dc73c983913c505c7f72dab3c0a4d/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/windowing/WindowOperator.java#L362
>>>
>>> is where we could fashion what is emitted. Again, for us it seems
>>> natural to use the WM to materialize micro-batches with "approximate" order
>>> (and no, I am not a fan of Spark micro-batches :)). Any pointers as to how we
>>> could write an implementation that allows for "up till WM" emission through
>>> a trigger on a Session Window would be very helpful. In essence I believe
>>> that for any "funnel" analysis it is crucial.
>>>
>>> Something like https://github.com/apache/flink/blob/7f99a0df669dc73c983913c505c7f72dab3c0a4d/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/windowing/EvictingWindowOperator.java#L346
>>>
>>> I know I am simplifying this and there has to be more to it...
>>>
>>>
>>>
>>>
>>> On Mon, Dec 18, 2017 at 11:31 AM, Vishal Santoshi <
>>> vishal.santo...@gmail.com> wrote:
>>>
 The Trigger in this case would be some count-based Trigger. Again, the
 motive is to keep the state lean, as we desire to search for patterns,
 sorted on event time, in the incoming sessionized (and thus of
 non-deterministic length) stream

 On Mon, Dec 18, 2017 at 11:26 AM, Vishal Santoshi <
 vishal.santo...@gmail.com> wrote:

> For example, this would have worked perfectly if it did not complain
> about MergeableWindow and state. The Session class in this example encapsulates
> the trim-up-to-watermark behavior (a reduce call after telling it the
> current WM) that we desire
>
> public class SessionProcessWindow extends ProcessWindowFunction<Event, Session, String, TimeWindow> {
>
>     private static final ValueStateDescriptor<Session> sessionState =
>             new ValueStateDescriptor<>("session", Session.class);
>
>     @Override
>     public void process(String key, Context context, Iterable<Event> elements,
>                         Collector<Session> out) throws Exception {
>
>         ValueState<Session> session = context.windowState().getState(sessionState);
>         Session s = session.value() != null ? session.value() : new Session();
>         for (Event e : elements) {
>             s.add(e);
>         }
>         s.lastWaterMarkedEventLite.serverTime = context.currentWatermark();
>         s.reduce();
>         out.collect(s);
>         session.update(s);
>     }
>
>     @Override
>     public void clear(Context context) {
>         ValueState<Session> session = context.windowState().getState(sessionState);
>         session.clear();
>     }
> }
>
>
>
>
> On Mon, Dec 18, 2017 at 11:08 AM, Vishal Santoshi <
> vishal.santo...@gmail.com> wrote:
>
>> Hello Fabian, Thank you for the response.
>>
>>  I think that does not work, as it is the WM of the Window Operator
>> that is desired to make deterministic decisions, rather than the WM of an
>> operator that precedes the Window. This is doable using a
>> ProcessWindowFunction with state, but only in the case of non-mergeable
>> windows.
>>
>>    The best API option I think is a time-based Trigger that fires on every
>> configured time progression of the WM, and a Window implementation that
>> materializes *only data up till that WM* (it might have more data,
>> but that data has an event time greater than the WM). I am not sure we have
>> that built-in option and thus was asking for access to the current WM for

Restoring from a SP

2017-12-20 Thread Vishal Santoshi
Are intermittent failures to restore from a SP, in the case of Flink
offsets, a known issue? I had more than one instance where the offsets were
not restored, but a retry did work (in one case it succeeded on about the 4th
restore attempt). I am on 1.3.2.


Re: A question about Triggers

2017-12-20 Thread Vishal Santoshi
Makes sense. Did a first stab at using ProcessFunction. The TimerService
exposed by the Context does not have a remove-timer method. Is that primarily
because a priority queue is the storage and removal from a PriorityQueue is
expensive? The Trigger Context does expose another version that has removal
abilities, so I was wondering about this dissonance...

On Tue, Dec 19, 2017 at 4:53 AM, Fabian Hueske  wrote:

> Hi Vishal,
>
> it is not guaranteed that add() and onElement() receive the same object,
> and even if they do it is not guaranteed that a mutation of the object in
> onElement() has an effect. The object might have been serialized and stored
> in RocksDB.
> Hence, elements should not be modified in onElement().
>
> Have you considered to implement the operation completely in a
> ProcessFunction instead of a session window?
> This might be more code but easier to design and reason about because
> there is no interaction of window assigner, trigger, and window function.
>
>
> 2017-12-18 20:49 GMT+01:00 Vishal Santoshi :
>
>> I guess https://github.com/apache/flink/blob/7f99a0df669dc73c983913c505c7f72dab3c0a4d/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/windowing/WindowOperator.java#L362
>>
>> is where we could fashion what is emitted. Again, for us it seems
>> natural to use the WM to materialize micro-batches with "approximate" order
>> (and no, I am not a fan of Spark micro-batches :)). Any pointers as to how we
>> could write an implementation that allows for "up till WM" emission through
>> a trigger on a Session Window would be very helpful. In essence I believe
>> that for any "funnel" analysis it is crucial.
>>
>> Something like https://github.com/apache/flink/blob/7f99a0df669dc73c983913c505c7f72dab3c0a4d/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/operators/windowing/EvictingWindowOperator.java#L346
>>
>> I know I am simplifying this and there has to be more to it...
>>
>>
>>
>>
>> On Mon, Dec 18, 2017 at 11:31 AM, Vishal Santoshi <
>> vishal.santo...@gmail.com> wrote:
>>
>>> The Trigger in this case would be some count-based Trigger. Again, the
>>> motive is to keep the state lean, as we desire to search for patterns,
>>> sorted on event time, in the incoming sessionized (and thus of
>>> non-deterministic length) stream
>>>
>>> On Mon, Dec 18, 2017 at 11:26 AM, Vishal Santoshi <
>>> vishal.santo...@gmail.com> wrote:
>>>
 For example, this would have worked perfectly if it did not complain
 about MergeableWindow and state. The Session class in this example encapsulates
 the trim-up-to-watermark behavior (a reduce call after telling it the
 current WM) that we desire

 public class SessionProcessWindow extends ProcessWindowFunction<Event, Session, String, TimeWindow> {

     private static final ValueStateDescriptor<Session> sessionState =
             new ValueStateDescriptor<>("session", Session.class);

     @Override
     public void process(String key, Context context, Iterable<Event> elements,
                         Collector<Session> out) throws Exception {

         ValueState<Session> session = context.windowState().getState(sessionState);
         Session s = session.value() != null ? session.value() : new Session();
         for (Event e : elements) {
             s.add(e);
         }
         s.lastWaterMarkedEventLite.serverTime = context.currentWatermark();
         s.reduce();
         out.collect(s);
         session.update(s);
     }

     @Override
     public void clear(Context context) {
         ValueState<Session> session = context.windowState().getState(sessionState);
         session.clear();
     }
 }




 On Mon, Dec 18, 2017 at 11:08 AM, Vishal Santoshi <
 vishal.santo...@gmail.com> wrote:

> Hello Fabian, Thank you for the response.
>
>  I think that does not work, as it is the WM of the Window Operator
> that is desired to make deterministic decisions, rather than the WM of an
> operator that precedes the Window. This is doable using a ProcessWindowFunction
> with state, but only in the case of non-mergeable windows.
>
>    The best API option I think is a time-based Trigger that fires on every
> configured time progression of the WM, and a Window implementation that
> materializes *only data up till that WM* (it might have more data,
> but that data has an event time greater than the WM). I am not sure we have
> that built-in option, and thus was asking for access to the current WM for
> the window operator, to allow us to handle the "*only data up till that
> WM*" range retrieval using some custom data structure.
>
> Regards.
>
>
>
>
> On Mon, Dec 18, 2017 at 5:14 AM, Fabian Hueske 
> wrote:
>
>> Hi Vishal,
>>
>> the Trigger is not designed to augment records but just to control
>> when a window is evaluated.
>> I woul

Apache Flink - Difference between operator and function

2017-12-20 Thread M Singh
Hi:
I am reading the documentation on working with state 
(https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/state.html)
 and it states that :
All datastream functions can use managed state, but the raw state interfaces 
can only be used when implementing operators. Using managed state (rather than 
raw state) is recommended, since with managed state Flink is able to 
automatically redistribute state when the parallelism is changed, and also do 
better memory management.

I wanted to find out:
   - What's the difference between an operator and a function?
   - What are the raw state interfaces? Are they checkpoint-related interfaces?


Thanks
Mans

Re: Pending parquet file with Bucketing Sink

2017-12-20 Thread xiatao123
Hi Vipul,
  Thanks for the information. Yes, I do have checkpointing enabled, with a 10
millisecond interval.
  I think the issue here is that the stream ended before a checkpoint was
reached. This is test code where the DataStream only has 5 events and then
ends. Once the stream has ended, no checkpoint is triggered, so the
file remains in the "pending" state.
  Is there any way we can force a checkpoint to trigger, or let the sink know
the stream has ended?
Thanks,
Tao
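
[Editor's note: for context, the setup described above boils down to something
like the sketch below; the path and element types are illustrative, not from the
thread. With a finite source like this, the job can finish before the next
checkpoint fires, which is why the last part file is left pending.]

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink;

public class PendingFileSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(10);   // 10 ms checkpoint interval, as described in the thread

        env.fromElements("a", "b", "c", "d", "e")                         // finite 5-element test stream
           .addSink(new BucketingSink<String>("hdfs:///tmp/bucketing"));  // base path is illustrative

        // The stream ends after 5 elements; if no checkpoint completes before that,
        // the in-progress part file is never moved out of the pending state.
        env.execute("pending-file-sketch");
    }
}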



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Metric reporters with non-static ports

2017-12-20 Thread Jared Stehler
The prometheus metric reporter allows for a specification of a port range; is 
there a way I can find out which actual port it found to bind to?

Also, there doesn’t seem to be a way to reserve an extra port for task managers 
in mesos to assign to a metric reporter, is that a roadmap item? I’m able to 
override the port for the app master 
(-Dmetrics.reporter.prom_reporter.port=$PORT1) but this carries over to the 
task managers and can collide with the assigned data port, etc.



--
Jared Stehler
Chief Architect - Intellify Learning
o: 617.701.6330 x703





Re: Can not resolve org.apache.hadoop.fs.Path in 1.4.0

2017-12-20 Thread Timo Walther
Libraries such as CEP or Table API should have the "compile" scope and 
should be in both the fat and the non-fat jar.


The non-fat jar should contain everything that is not in flink-dist or 
your lib directory.


Regards,
Timo


On 12/20/17 at 3:07 PM, shashank agarwal wrote:

Hi,

In that case, it won't find the dependencies, because I have other 
dependencies as well. And what about CEP etc., since that is not part of 
flink-dist?


Best
Shashank




On Wed, Dec 20, 2017 at 3:16 PM, Aljoscha Krettek <aljos...@apache.org> wrote:


Hi,

That jar file looks like it has too much stuff in there that
shouldn't be there. This can explain the errors you are seeing because
of classloading conflicts.

Could you try not building a fat-jar and have only your code in
your jar?

Best,
Aljoscha



On 20. Dec 2017, at 10:15, shashank agarwal <shashank...@gmail.com> wrote:

One more thing: when I submit the job or start the yarn session, it
prints the following logs:

Using the result of 'hadoop classpath' to augment the Hadoop
classpath:

/usr/hdp/2.6.0.3-8/hadoop/conf:/usr/hdp/2.6.0.3-8/hadoop/lib/*:/usr/hdp/2.6.0.3-8/hadoop/.//*:/usr/hdp/2.6.0.

3-8/hadoop-hdfs/./:/usr/hdp/2.6.0.3-8/hadoop-hdfs/lib/*:/usr/hdp/2.6.0.3-8/hadoop-hdfs/.//*:/usr/hdp/2.6.0.3-8/hadoop-yarn/lib/*:/usr/hdp/2.6.0.3-8/hadoop-yarn/.//*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/lib/*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/.//*
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in

[jar:file:/opt/flink/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in

[jar:file:/usr/hdp/2.6.0.3-8/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]


So I think it's adding the Hadoop libs to the classpath too, because it's
able to create the checkpointing directories from the flink-conf file
in HDFS.







On Wed, Dec 20, 2017 at 2:31 PM, shashank agarwal <shashank...@gmail.com> wrote:

Hi,

Please find attached list of jar file contents and flink/lib/
contents. I have removed my class files list from jar list
and I have added flink-hadoop-compatibility_2.11-1.4.0.jar
later in flink/lib/ but no success.

I have tried by removing flink-shaded-hadoop2 from my project
but still no success.


Thanks
Shashank


On Wed, Dec 20, 2017 at 2:14 PM, Aljoscha Krettek <aljos...@apache.org> wrote:

Hi,

Could you please list what exactly is in your submitted
jar file, for example using "jar tf my-jar-file.jar"? And
also what files exactly are in your Flink lib directory.

Best,
Aljoscha



On 19. Dec 2017, at 20:10, shashank agarwal <shashank...@gmail.com> wrote:
wrote:

Hi Timo,

I am using Rocksdbstatebackend with hdfs path. I have
following flink dependencies in my sbt :

"org.slf4j" % "slf4j-log4j12" % "1.7.21",
"org.apache.flink" %% "flink-scala" % flinkVersion %
"provided",
"org.apache.flink" %% "flink-streaming-scala" %
flinkVersion % "provided",
"org.apache.flink" %% "flink-cep-scala" % flinkVersion,
"org.apache.flink" %% "flink-connector-kafka-0.10" %
flinkVersion,
"org.apache.flink" %% "flink-connector-filesystem" %
flinkVersion,
"org.apache.flink" %% "flink-statebackend-rocksdb" %
flinkVersion,
"org.apache.flink" %% "flink-connector-cassandra" % "1.3.2",
"org.apache.flink" % "flink-shaded-hadoop2" % flinkVersion,

when i start flink yarn session it's working fine even
it's creating flink checkpointing directory and copying
libs into hdfs.

But when I submit the application to this yarn session
it prints following logs :


Using the result of 'hadoop classpath' to augment the
Hadoop classpath:

/usr/hdp/2.6.0.3-8/hadoop/conf:/usr/hdp/2.6.0.3-8/hadoop/lib/*:/usr/hdp/2.6.0.3-8/hadoop/.//*:/usr/hdp/2.6.0.

3-8/hadoop-hdfs/./:/usr/hdp/2.6.0.3-8/hadoop-hdfs/lib/*:/usr/hdp/2.6.0.

3-8/hadoop-hdfs/.//*:/usr/hdp/2.6.0.3-8/hadoop-yarn/lib/*:/usr/hdp/2.6.0.3-8/hadoop-yarn/.//*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/lib/*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/.//*


But the application fails continuously with logs which I
have sent earlier.


I have tried to add flink-hadoop-compatibility*.jar as
   

Re: Can not resolve org.apache.hadoop.fs.Path in 1.4.0

2017-12-20 Thread shashank agarwal
Hi,

In that case, it won't find the dependencies, because I have other
dependencies as well. And what about CEP etc., since that is not part of
flink-dist?

Best
Shashank




On Wed, Dec 20, 2017 at 3:16 PM, Aljoscha Krettek 
wrote:

> Hi,
>
> That jar file looks like it has too much stuff in there that shouldn't be
> there. This can explain the errors you are seeing because of classloading
> conflicts.
>
> Could you try not building a fat-jar and have only your code in your jar?
>
> Best,
> Aljoscha
>
>
> On 20. Dec 2017, at 10:15, shashank agarwal  wrote:
>
> One more thing: when I submit the job or start the yarn session, it prints
> the following logs:
>
> Using the result of 'hadoop classpath' to augment the Hadoop classpath:
> /usr/hdp/2.6.0.3-8/hadoop/conf:/usr/hdp/2.6.0.3-8/
> hadoop/lib/*:/usr/hdp/2.6.0.3-8/hadoop/.//*:/usr/hdp/2.6.0.
> 3-8/hadoop-hdfs/./:/usr/hdp/2.6.0.3-8/hadoop-hdfs/lib/*:/
> usr/hdp/2.6.0.3-8/hadoop-hdfs/.//*:/usr/hdp/2.6.0.3-8/
> hadoop-yarn/lib/*:/usr/hdp/2.6.0.3-8/hadoop-yarn/.//*:/usr/
> hdp/2.6.0.3-8/hadoop-mapreduce/lib/*:/usr/hdp/2.6.
> 0.3-8/hadoop-mapreduce/.//*
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/opt/flink/lib/
> slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/hdp/2.6.0.3-8/
> hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/
> StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>
>
> So I think it's adding the Hadoop libs to the classpath too, because it's able to
> create the checkpointing directories from the flink-conf file in HDFS.
>
>
>
>
>
>
>
>
> On Wed, Dec 20, 2017 at 2:31 PM, shashank agarwal 
> wrote:
>
>> Hi,
>>
>> Please find attached list of jar file contents and flink/lib/ contents. I
>> have removed my class files list from jar list and I have
>> added flink-hadoop-compatibility_2.11-1.4.0.jar later in flink/lib/ but
>> no success.
>>
>> I have tried by removing flink-shaded-hadoop2 from my project but still
>> no success.
>>
>>
>> Thanks
>> Shashank
>>
>>
>> On Wed, Dec 20, 2017 at 2:14 PM, Aljoscha Krettek 
>> wrote:
>>
>>> Hi,
>>>
>>> Could you please list what exactly is in your submitted jar file, for
>>> example using "jar tf my-jar-file.jar"? And also what files exactly are in
>>> your Flink lib directory.
>>>
>>> Best,
>>> Aljoscha
>>>
>>>
>>> On 19. Dec 2017, at 20:10, shashank agarwal 
>>> wrote:
>>>
>>> Hi Timo,
>>>
>>> I am using Rocksdbstatebackend with hdfs path. I have following flink
>>> dependencies in my sbt :
>>>
>>> "org.slf4j" % "slf4j-log4j12" % "1.7.21",
>>>   "org.apache.flink" %% "flink-scala" % flinkVersion % "provided",
>>>   "org.apache.flink" %% "flink-streaming-scala" % flinkVersion %
>>> "provided",
>>>   "org.apache.flink" %% "flink-cep-scala" % flinkVersion,
>>>   "org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion,
>>>   "org.apache.flink" %% "flink-connector-filesystem" % flinkVersion,
>>>   "org.apache.flink" %% "flink-statebackend-rocksdb" % flinkVersion,
>>>   "org.apache.flink" %% "flink-connector-cassandra" % "1.3.2",
>>>   "org.apache.flink" % "flink-shaded-hadoop2" % flinkVersion,
>>>
>>> when i start flink yarn session  it's working fine even it's creating
>>> flink checkpointing directory and copying libs into hdfs.
>>>
>>> But when I submit the application to this yarn session it prints
>>> following logs :
>>>
>>> Using the result of 'hadoop classpath' to augment the Hadoop classpath: 
>>> /usr/hdp/2.6.0.3-8/hadoop/conf:/usr/hdp/2.6.0.3-8/hadoop/lib/*:/usr/hdp/2.6.0.3-8/hadoop/.//*:/usr/hdp/2.6.0.
>>>  
>>> 3-8/hadoop-hdfs/./:/usr/hdp/2.6.0.3-8/hadoop-hdfs/lib/*:/usr/hdp/2.6.0.3-8/hadoop-hdfs/.//*:/usr/hdp/2.6.0.3-8/hadoop-yarn/lib/*:/usr/hdp/2.6.0.3-8/hadoop-yarn/.//*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/lib/*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/.//*
>>>
>>> But the application fails continuously with logs which I have sent earlier.
>>>
>>>
>>>
>>> I have tried to add flink-hadoop-compatibility*.jar as suggested by
>>> Jorn but it's not working.
>>>
>>>
>>>
>>> On Tue, Dec 19, 2017 at 5:08 PM, shashank agarwal >> > wrote:
>>>
 yes, it's working fine. now not getting compile time error.

 But when i trying to run this on cluster or yarn, getting following
 runtime error :

 org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not 
 find a file system implementation for scheme 'hdfs'. The scheme is not 
 directly supported by Flink and no Hadoop file system to support this 
 scheme could be loaded.
at 
 org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:405)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
at 
 org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory.

Re: flink jobmanager HA zookeeper leadership election - RECEIVED SIGNAL 15: SIGTERM. Shutting down as requested.

2017-12-20 Thread Till Rohrmann
Hi Colin,

the log looks as if the Flink JobManager receives a SIGTERM signal and
shuts down due to that. This is nothing which should be triggered by
Flink's leader election. Could you check whether this signal might be
created by another process in your environment or if the container
supervisor terminated the process?

Cheers,
Till

On Wed, Dec 20, 2017 at 4:41 AM, Colin Williams <
colin.williams.seat...@gmail.com> wrote:

>
>
> On Tue, Dec 19, 2017 at 7:29 PM, Colin Williams <
> colin.williams.seat...@gmail.com> wrote:
>
>> Hi,
>>
>> I've been trying to update my flink-docker jobmanager configuration for
>> flink 1.4. I think the system is shutting down after a leadership election,
>> but I'm not sure what the issue is. My configuration of the jobmanager
>> follows
>>
>>
>> jobmanager.rpc.address: 10.16.228.150
>> jobmanager.rpc.port: 6123
>> jobmanager.heap.mb: 1024
>> blob.server.port: 6124
>> query.server.port: 6125
>>
>> web.port: 8081
>> web.history: 10
>>
>> parallelism.default: 1
>>
>> state.backend: rocksdb
>> state.backend.rocksdb.checkpointdir: /tmp/flink/rocksdb
>> state.backend.fs.checkpointdir: file:///var/lib/data/checkpoints
>>
>> high-availability: zookeeper
>> high-availability.cluster-id: /dev
>> high-availability.zookeeper.quorum: 10.16.228.190:2181
>> high-availability.zookeeper.path.root: /flink-1.4
>> high-availability.zookeeper.storageDir: file:///var/lib/data/recovery
>> high-availability.jobmanager.port: 50010
>>
>> env.java.opts: -Dlog.file=/opt/flink/log/jobmanager.log
>>
>> I'm also attaching some debugging output which shows the shutdown. Again
>> I'm not entirely sure it's caused by a leadership issue because it's not
>> clear from the debug logs. Can anyone suggest changes I might make to the
>> configuration to fix this? I've tried clearing the zookeeper root path in
>> case it had some old session information, but that didn't seem to help.
>>
>> Best,
>>
>> Colin Williams
>>
>
>


Re: Static Variables

2017-12-20 Thread Stefan Richter
Hi,

I think the cause is very likely a race condition between the tasks checking 
and setting the static value, because tasks run in different threads. You could 
try to use an AtomicReference or synchronization for setting the static 
variable's value.

Best,
Stefan
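
[Editor's note: a minimal sketch of that suggestion, using an AtomicReference so
that only one subtask per task-manager JVM installs the cache; the cache type and
size are illustrative.]

import java.util.concurrent.atomic.AtomicReference;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class CachingFlatMap extends RichFlatMapFunction<String, String> {

    // One cache per JVM, shared by all parallel subtasks running in that task manager.
    private static final AtomicReference<Cache<String, String>> CACHE = new AtomicReference<>();

    @Override
    public void open(Configuration parameters) {
        if (CACHE.get() == null) {
            Cache<String, String> candidate = CacheBuilder.newBuilder()
                    .maximumSize(10_000)
                    .build();
            // Only the first subtask in this JVM wins; later subtasks keep the existing cache.
            CACHE.compareAndSet(null, candidate);
        }
    }

    @Override
    public void flatMap(String value, Collector<String> out) {
        Cache<String, String> cache = CACHE.get();
        cache.put(value, value);   // placeholder use of the shared cache
        out.collect(value);
    }
}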

> On 20.12.2017 at 00:29, Navneeth Krishnan wrote:
> 
> Hi,
> 
> I have a requirement to initialize a few Guava caches per JVM and some static 
> helper classes. I tried a few options but nothing worked. Need some help. 
> Thanks a lot.
> 
> 1. Operator level static variables: 
> 
> public static Cache loadingCache;
> 
> public void open(Configuration parameters) throws Exception {
>   if (loadingCache == null)
>   initializeCache();
> }
> 
> The cache object is null on each operator slot and it gets initialized on 
> every call to open method.
> 
> 2. Initialize in operator class constructor:
> 
> public FlatMapFunction(ParameterTool parameterTool) {
> this. parameterTool = parameterTool;
> initializeCache();
> }
> 
> The cache doesn't seem to be initialized when accessed inside the task 
> manager.
> 
> Thanks.



Re: NullPointerException with Avro Serializer

2017-12-20 Thread Kien Truong
It turns out that our Flink branch is out-of-date. Sorry for all the noise. :)

Regards,
Kien

⁣Sent from TypeApp ​

On Dec 20, 2017, at 16:42, Kien Truong wrote:
>Upon further investigation, we found out that the reason:
>
>* The cluster was started on YARN with the hadoop classpath, which
>includes Avro. Therefore, Avro's SpecificRecord class was loaded using
>sun.misc.Launcher$AppClassLoader
>
>
>* Our LteSession class was submitted with the application jar, and
>loaded with the child-first classloader
>
>* Flink check if LteSession is assignable to SpecificRecord, which
>fails.
>
>* Flink fall back to Reflection-based avro writer, which throws NPE on
>null field.
>
>If we change the classloader to parent-first, everything is ok. Now the
>question is why the default doesn't work for us.
>
>Best regards,
>Kien
>
>⁣Sent from TypeApp ​
>
On Dec 20, 2017, at 14:09, Kien Truong wrote:
>>Hi,
>>
>>After upgrading to Flink 1.4, we encounter this exception
>>
>>Caused by: java.lang.NullPointerException: in
>>com.viettel.big4g.avro.LteSession in long null of long in field tmsi
>of
>>com.viettel.big4g.avro.LteSession
>>at
>>org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:161)
>>at
>>org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62)
>>at
>>org.apache.flink.formats.avro.typeutils.AvroSerializer.serialize(AvroSerializer.java:132)
>>at
>>org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:176)
>>at
>>org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:48)
>>at
>>org.apache.flink.runtime.plugable.SerializationDelegate.write(SerializationDelegate.java:54)
>>at
>>org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.addRecord(SpanningRecordSerializer.java:93)
>>at
>>org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:114)
>>at
>>org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:89)
>>at
>>org.apache.flink.streaming.runtime.io.StreamRecordWriter.emit(StreamRecordWriter.java:84)
>>at
>>org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:102)
>>
>>
>>It seems Flink attempts to use the reflection writer instead of the
>>specific writer for this schema. This is wrong, because our LteSession
>>is an Avro object, and should use the specific writer.
>>
>>Best regards,
>>Kien
>>
>>⁣Sent from TypeApp ​


Re: Can not resolve org.apache.hadoop.fs.Path in 1.4.0

2017-12-20 Thread Aljoscha Krettek
Hi,

That jar file looks like it has too much stuff in there that shouldn't be 
there. This can explain the errors you are seeing because of classloading conflicts.

Could you try not building a fat-jar and have only your code in your jar?

Best,
Aljoscha

> On 20. Dec 2017, at 10:15, shashank agarwal  wrote:
> 
> One more thing: when I submit the job or start the yarn session, it prints 
> the following logs:
> 
> Using the result of 'hadoop classpath' to augment the Hadoop classpath: 
> /usr/hdp/2.6.0.3-8/hadoop/conf:/usr/hdp/2.6.0.3-8/hadoop/lib/*:/usr/hdp/2.6.0.3-8/hadoop/.//*:/usr/hdp/2.6.0.3-8/hadoop-hdfs/./:/usr/hdp/2.6.0.3-8/hadoop-hdfs/lib/*:/usr/hdp/2.6.0.3-8/hadoop-hdfs/.//*:/usr/hdp/2.6.0.3-8/hadoop-yarn/lib/*:/usr/hdp/2.6.0.3-8/hadoop-yarn/.//*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/lib/*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/.//*
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/opt/flink/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/hdp/2.6.0.3-8/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings 
>  for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 
> 
> So I think it's adding the Hadoop libs to the classpath too, because it's able to create 
> the checkpointing directories from the flink-conf file in HDFS.
> 
> 
> 
> 
> 
> 
>
> 
> On Wed, Dec 20, 2017 at 2:31 PM, shashank agarwal  > wrote:
> Hi,
> 
> Please find attached list of jar file contents and flink/lib/ contents. I 
> have removed my class files list from jar list and I have added 
> flink-hadoop-compatibility_2.11-1.4.0.jar later in flink/lib/ but no success. 
> 
> I have tried by removing flink-shaded-hadoop2 from my project but still no 
> success.
> 
> 
> Thanks
> Shashank
> 
> 
> On Wed, Dec 20, 2017 at 2:14 PM, Aljoscha Krettek  > wrote:
> Hi,
> 
> Could you please list what exactly is in your submitted jar file, for example 
> using "jar tf my-jar-file.jar"? And also what files exactly are in your Flink 
> lib directory.
> 
> Best,
> Aljoscha
> 
> 
>> On 19. Dec 2017, at 20:10, shashank agarwal > > wrote:
>> 
>> Hi Timo,
>> 
>> I am using Rocksdbstatebackend with hdfs path. I have following flink 
>> dependencies in my sbt :
>> 
>> "org.slf4j" % "slf4j-log4j12" % "1.7.21",
>>   "org.apache.flink" %% "flink-scala" % flinkVersion % "provided",
>>   "org.apache.flink" %% "flink-streaming-scala" % flinkVersion % "provided",
>>   "org.apache.flink" %% "flink-cep-scala" % flinkVersion,
>>   "org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion,
>>   "org.apache.flink" %% "flink-connector-filesystem" % flinkVersion,
>>   "org.apache.flink" %% "flink-statebackend-rocksdb" % flinkVersion,
>>   "org.apache.flink" %% "flink-connector-cassandra" % "1.3.2",
>>   "org.apache.flink" % "flink-shaded-hadoop2" % flinkVersion,
>> 
>> when i start flink yarn session  it's working fine even it's creating flink 
>> checkpointing directory and copying libs into hdfs.
>> 
>> But when I submit the application to this yarn session it prints following 
>> logs :
>> 
>>> Using the result of 'hadoop classpath' to augment the Hadoop classpath: 
>>> /usr/hdp/2.6.0.3-8/hadoop/conf:/usr/hdp/2.6.0.3-8/hadoop/lib/*:/usr/hdp/2.6.0.3-8/hadoop/.//*:/usr/hdp/2.6.0.
>>>  
>>> 3-8/hadoop-hdfs/./:/usr/hdp/2.6.0.3-8/hadoop-hdfs/lib/*:/usr/hdp/2.6.0.3-8/hadoop-hdfs/.//*:/usr/hdp/2.6.0.3-8/hadoop-yarn/lib/*:/usr/hdp/2.6.0.3-8/hadoop-yarn/.//*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/lib/*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/.//*
>> 
>> But the application fails continuously with logs which I have sent earlier.
>> 
>> 
>> I have tried to add flink-hadoop-compatibility*.jar as suggested by Jorn but 
>> it's not working.
>> 
>> 
>> 
>> On Tue, Dec 19, 2017 at 5:08 PM, shashank agarwal > > wrote:
>> yes, it's working fine. now not getting compile time error.
>> 
>> But when i trying to run this on cluster or yarn, getting following runtime 
>> error :
>> 
>> org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not 
>> find a file system implementation for scheme 'hdfs'. The scheme is not 
>> directly supported by Flink and no Hadoop file system to support this scheme 
>> could be loaded.
>>  at 
>> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:405)
>>  at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
>>  at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
>>  at 
>> org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory.(FsCheckpointStreamFactory.java:99)
>>  at 
>> org.apache.flink.runtime.state.filesystem.FsStateBackend.createStreamFactory(FsStateBackend.java:277)
>>  at

Re: NullPointerException with Avro Serializer

2017-12-20 Thread Kien Truong
Upon further investigation, we found out the reason:

* The cluster was started on YARN with the hadoop classpath, which includes 
Avro. Therefore, Avro's SpecificRecord class was loaded using 
sun.misc.Launcher$AppClassLoader


* Our LteSession class was submitted with the application jar, and loaded with 
the child-first classloader

* Flink checks if LteSession is assignable to SpecificRecord, which fails.

* Flink falls back to the reflection-based Avro writer, which throws an NPE on the null 
field.

If we change the classloader to parent-first, everything is ok. Now the 
question is why the default doesn't work for us.

Best regards,
Kien
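
[Editor's note: the parent-first switch described above maps, as far as I can
tell for Flink 1.4, to a single flink-conf.yaml option; treat this as a hedged
note rather than a confirmed fix for the underlying issue.]

# child-first is the 1.4 default; parent-first restores the pre-1.4 behaviour
classloader.resolve-order: parent-first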

⁣Sent from TypeApp ​

On Dec 20, 2017, at 14:09, Kien Truong wrote:
>Hi,
>
>After upgrading to Flink 1.4, we encounter this exception
>
>Caused by: java.lang.NullPointerException: in
>com.viettel.big4g.avro.LteSession in long null of long in field tmsi of
>com.viettel.big4g.avro.LteSession
>at
>org.apache.avro.reflect.ReflectDatumWriter.write(ReflectDatumWriter.java:161)
>at
>org.apache.avro.generic.GenericDatumWriter.write(GenericDatumWriter.java:62)
>at
>org.apache.flink.formats.avro.typeutils.AvroSerializer.serialize(AvroSerializer.java:132)
>at
>org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:176)
>at
>org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.serialize(StreamElementSerializer.java:48)
>at
>org.apache.flink.runtime.plugable.SerializationDelegate.write(SerializationDelegate.java:54)
>at
>org.apache.flink.runtime.io.network.api.serialization.SpanningRecordSerializer.addRecord(SpanningRecordSerializer.java:93)
>at
>org.apache.flink.runtime.io.network.api.writer.RecordWriter.sendToTarget(RecordWriter.java:114)
>at
>org.apache.flink.runtime.io.network.api.writer.RecordWriter.emit(RecordWriter.java:89)
>at
>org.apache.flink.streaming.runtime.io.StreamRecordWriter.emit(StreamRecordWriter.java:84)
>at
>org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:102)
>
>
>It seems Flink attempts to use the reflection writer instead of the
>specific writer for this schema. This is wrong, because our LteSession
>is an Avro object, and should use the specific writer.
>
>Best regards,
>Kien
>
>⁣Sent from TypeApp ​


Re: Can not resolve org.apache.hadoop.fs.Path in 1.4.0

2017-12-20 Thread shashank agarwal
One more thing: when I submit the job or start the yarn session, it prints
the following logs:

Using the result of 'hadoop classpath' to augment the Hadoop classpath:
/usr/hdp/2.6.0.3-8/hadoop/conf:/usr/hdp/2.6.0.3-8/hadoop/lib/*:/usr/hdp/2.6.0.3-8/hadoop/.//*:/usr/hdp/2.6.0.3-8/hadoop-hdfs/./:/usr/hdp/2.6.0.3-8/hadoop-hdfs/lib/*:/usr/hdp/2.6.0.3-8/hadoop-hdfs/.//*:/usr/hdp/2.6.0.3-8/hadoop-yarn/lib/*:/usr/hdp/2.6.0.3-8/hadoop-yarn/.//*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/lib/*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/.//*
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
[jar:file:/opt/flink/lib/slf4j-log4j12-1.7.7.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in
[jar:file:/usr/hdp/2.6.0.3-8/hadoop/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]


So I think it's adding the Hadoop libs to the classpath too, because it's able to
create the checkpointing directories from the flink-conf file in HDFS.







On Wed, Dec 20, 2017 at 2:31 PM, shashank agarwal 
wrote:

> Hi,
>
> Please find attached list of jar file contents and flink/lib/ contents. I
> have removed my class files list from jar list and I have
> added flink-hadoop-compatibility_2.11-1.4.0.jar later in flink/lib/ but
> no success.
>
> I have tried by removing flink-shaded-hadoop2 from my project but still no
> success.
>
>
> Thanks
> Shashank
>
>
> On Wed, Dec 20, 2017 at 2:14 PM, Aljoscha Krettek 
> wrote:
>
>> Hi,
>>
>> Could you please list what exactly is in your submitted jar file, for
>> example using "jar tf my-jar-file.jar"? And also what files exactly are in
>> your Flink lib directory.
>>
>> Best,
>> Aljoscha
>>
>>
>> On 19. Dec 2017, at 20:10, shashank agarwal 
>> wrote:
>>
>> Hi Timo,
>>
>> I am using Rocksdbstatebackend with hdfs path. I have following flink
>> dependencies in my sbt :
>>
>> "org.slf4j" % "slf4j-log4j12" % "1.7.21",
>>   "org.apache.flink" %% "flink-scala" % flinkVersion % "provided",
>>   "org.apache.flink" %% "flink-streaming-scala" % flinkVersion %
>> "provided",
>>   "org.apache.flink" %% "flink-cep-scala" % flinkVersion,
>>   "org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion,
>>   "org.apache.flink" %% "flink-connector-filesystem" % flinkVersion,
>>   "org.apache.flink" %% "flink-statebackend-rocksdb" % flinkVersion,
>>   "org.apache.flink" %% "flink-connector-cassandra" % "1.3.2",
>>   "org.apache.flink" % "flink-shaded-hadoop2" % flinkVersion,
>>
>> when i start flink yarn session  it's working fine even it's creating
>> flink checkpointing directory and copying libs into hdfs.
>>
>> But when I submit the application to this yarn session it prints
>> following logs :
>>
>> Using the result of 'hadoop classpath' to augment the Hadoop classpath: 
>> /usr/hdp/2.6.0.3-8/hadoop/conf:/usr/hdp/2.6.0.3-8/hadoop/lib/*:/usr/hdp/2.6.0.3-8/hadoop/.//*:/usr/hdp/2.6.0.
>>  
>> 3-8/hadoop-hdfs/./:/usr/hdp/2.6.0.3-8/hadoop-hdfs/lib/*:/usr/hdp/2.6.0.3-8/hadoop-hdfs/.//*:/usr/hdp/2.6.0.3-8/hadoop-yarn/lib/*:/usr/hdp/2.6.0.3-8/hadoop-yarn/.//*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/lib/*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/.//*
>>
>> But the application fails continuously with logs which I have sent earlier.
>>
>>
>>
>> I have tried to add flink-hadoop-compatibility*.jar as suggested by Jorn
>> but it's not working.
>>
>>
>>
>> On Tue, Dec 19, 2017 at 5:08 PM, shashank agarwal 
>>  wrote:
>>
>>> yes, it's working fine. now not getting compile time error.
>>>
>>> But when i trying to run this on cluster or yarn, getting following
>>> runtime error :
>>>
>>> org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not 
>>> find a file system implementation for scheme 'hdfs'. The scheme is not 
>>> directly supported by Flink and no Hadoop file system to support this 
>>> scheme could be loaded.
>>> at 
>>> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:405)
>>> at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
>>> at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
>>> at 
>>> org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory.(FsCheckpointStreamFactory.java:99)
>>> at 
>>> org.apache.flink.runtime.state.filesystem.FsStateBackend.createStreamFactory(FsStateBackend.java:277)
>>> at 
>>> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createStreamFactory(RocksDBStateBackend.java:273)
>>> at 
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.createCheckpointStreamFactory(StreamTask.java:787)
>>> at 
>>> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:247)
>>> at 
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
>>> at 
>>> org.apache.flink.streaming.runtime.tasks.StreamTask.

Re: Add new slave to running cluster?

2017-12-20 Thread Ufuk Celebi
Hey Jinhua,

- The `slaves` file is only relevant for the startup scripts. You can
add as many task managers as you like by starting them manually.
- You can check the logs of the JobManager or its web UI
(jobmanager-host:8081) to see how many TMs have registered.
- The registration timeout looks more like a misconfiguration to me.
Please verify that all task managers have the same configuration. If
the configuration looks good to you, it might be a network issue.

Does this help?

– Ufuk
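
[Editor's note: as a concrete sketch of the first point — on the new host, with
the same Flink distribution and a flink-conf.yaml pointing at the running
JobManager — starting an extra TaskManager is typically just

  ./bin/taskmanager.sh start

since, as noted above, the slaves file is only read by the startup scripts.]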


On Wed, Dec 20, 2017 at 8:22 AM, Jinhua Luo  wrote:
> Hi All,
>
> If I add a new slave (start a taskmanager on a new host) which is not
> included in conf/slaves, I see the logs below continuously printed:
> ...Trying to register at JobManager...(attempt 147,
>  timeout: 3 milliseconds)
>
> Is it normal? And has the new slave been successfully added to the cluster?


Re: Can not resolve org.apache.hadoop.fs.Path in 1.4.0

2017-12-20 Thread Aljoscha Krettek
Hi,

Could you please list what exactly is in your submitted jar file, for example 
using "jar tf my-jar-file.jar"? And also what files exactly are in your Flink 
lib directory.

Best,
Aljoscha

> On 19. Dec 2017, at 20:10, shashank agarwal  wrote:
> 
> Hi Timo,
> 
> I am using Rocksdbstatebackend with hdfs path. I have following flink 
> dependencies in my sbt :
> 
> "org.slf4j" % "slf4j-log4j12" % "1.7.21",
>   "org.apache.flink" %% "flink-scala" % flinkVersion % "provided",
>   "org.apache.flink" %% "flink-streaming-scala" % flinkVersion % "provided",
>   "org.apache.flink" %% "flink-cep-scala" % flinkVersion,
>   "org.apache.flink" %% "flink-connector-kafka-0.10" % flinkVersion,
>   "org.apache.flink" %% "flink-connector-filesystem" % flinkVersion,
>   "org.apache.flink" %% "flink-statebackend-rocksdb" % flinkVersion,
>   "org.apache.flink" %% "flink-connector-cassandra" % "1.3.2",
>   "org.apache.flink" % "flink-shaded-hadoop2" % flinkVersion,
> 
> when i start flink yarn session  it's working fine even it's creating flink 
> checkpointing directory and copying libs into hdfs.
> 
> But when I submit the application to this yarn session it prints following 
> logs :
> 
>> Using the result of 'hadoop classpath' to augment the Hadoop classpath: 
>> /usr/hdp/2.6.0.3-8/hadoop/conf:/usr/hdp/2.6.0.3-8/hadoop/lib/*:/usr/hdp/2.6.0.3-8/hadoop/.//*:/usr/hdp/2.6.0.
>>  
>> 3-8/hadoop-hdfs/./:/usr/hdp/2.6.0.3-8/hadoop-hdfs/lib/*:/usr/hdp/2.6.0.3-8/hadoop-hdfs/.//*:/usr/hdp/2.6.0.3-8/hadoop-yarn/lib/*:/usr/hdp/2.6.0.3-8/hadoop-yarn/.//*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/lib/*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/.//*
> 
> But the application fails continuously with logs which I have sent earlier.
> 
> 
> I have tried to add flink-hadoop-compatibility*.jar as suggested by Jorn but 
> it's not working.
> 
> 
> 
> On Tue, Dec 19, 2017 at 5:08 PM, shashank agarwal  > wrote:
> yes, it's working fine. now not getting compile time error.
> 
> But when i trying to run this on cluster or yarn, getting following runtime 
> error :
> 
> org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: Could not find 
> a file system implementation for scheme 'hdfs'. The scheme is not directly 
> supported by Flink and no Hadoop file system to support this scheme could be 
> loaded.
>   at 
> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:405)
>   at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
>   at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
>   at 
> org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory.(FsCheckpointStreamFactory.java:99)
>   at 
> org.apache.flink.runtime.state.filesystem.FsStateBackend.createStreamFactory(FsStateBackend.java:277)
>   at 
> org.apache.flink.contrib.streaming.state.RocksDBStateBackend.createStreamFactory(RocksDBStateBackend.java:273)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.createCheckpointStreamFactory(StreamTask.java:787)
>   at 
> org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:247)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
>   at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
>   at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
>   at java.lang.Thread.run(Thread.java:745)
> Caused by: org.apache.flink.core.fs.UnsupportedFileSystemSchemeException: 
> Hadoop File System abstraction does not support scheme 'hdfs'. Either no file 
> system implementation exists for that scheme, or the relevant classes are 
> missing from the classpath.
>   at 
> org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:102)
>   at 
> org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:401)
>   ... 12 more
> Caused by: java.io.IOException: No FileSystem for scheme: hdfs
>   at 
> org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2786)
>   at 
> org.apache.flink.runtime.fs.hdfs.HadoopFsFactory.create(HadoopFsFactory.java:99)
>   ... 13 more
> 
> 
> 
> 
> while submitting job it's printing following logs so i think it's including 
> hdoop libs :
> 
> Using the result of 'hadoop classpath' to augment the Hadoop classpath: 
> /usr/hdp/2.6.0.3-8/hadoop/conf:/usr/hdp/2.6.0.3-8/hadoop/lib/*:/usr/hdp/2.6.0.3-8/hadoop/.//*:/usr/hdp/2.6.0.
>  
> 3-8/hadoop-hdfs/./:/usr/hdp/2.6.0.3-8/hadoop-hdfs/lib/*:/usr/hdp/2.6.0.3-8/hadoop-hdfs/.//*:/usr/hdp/2.6.0.3-8/hadoop-yarn/lib/*:/usr/hdp/2.6.0.3-8/hadoop-yarn/.//*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/lib/*:/usr/hdp/2.6.0.3-8/hadoop-mapreduce/.//*
> 
> On Fri, Dec 8, 2017 at 9:24 PM, shashank agarwal  

Re: How to apply patterns from a source onto another datastream?

2017-12-20 Thread Jayant Ameta
Would it be possible to get the same result using windows?

Jayant Ameta

On Tue, Dec 19, 2017 at 3:23 PM, Dawid Wysakowicz <
wysakowicz.da...@gmail.com> wrote:

> It is not possible at this moment. FlinkCEP can handle only one Pattern
> applied statically. There is a JIRA ticket for that:
> https://issues.apache.org/jira/browse/FLINK-7129 .
>
> > On 19 Dec 2017, at 10:10, Jayant Ameta  wrote:
> >
> > I've a datastream of events, and another datastream of patterns. The
> patterns are provided by users at runtime, and they need to come via a
> Kafka topic. I need to apply each of the pattern on the event stream using
> Flink-CEP. Is there a way to get a PatternStream from the DataStream when I
> don't know the pattern beforehand?
> >
> > https://stackoverflow.com/questions/47883408/apache-flink-how-to-apply-patterns-from-a-source-onto-another-datastream
>
>