Job failing during restore in different cluster

2018-09-28 Thread shashank734
Hi, I am trying to move my job from one cluster to another using a savepoint, but it's failing while restoring on the new cluster.

In the error, it is still trying to connect to a URL of the old cluster. I have checked all the properties and configuration. Does Flink save the URLs inside the savepoint?


2018-09-27 05:28:55,112 WARN  org.apache.flink.streaming.api.operators.BackendRestorerProcedure  - Exception while restoring keyed state backend for CoStreamFlatMap_069308bcb6f685b62dae685c4647854e_(9/10) from alternative (1/1), will retry while more alternatives are available.
java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "21.newcluster.co/10.0.3.1"; destination host is: "005.oldcluster":8020;
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:782)
    at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1558)
    at org.apache.hadoop.ipc.Client.call(Client.java:1498)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy13.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:272)
    at sun.reflect.GeneratedMethodAccessor69.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
    at com.sun.proxy.$Proxy14.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1238)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1225)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:309)
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:274)
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:266)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1538)
    at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:331)
    at org.apache.hadoop.hdfs.DistributedFileSystem$4.doCall(DistributedFileSystem.java:327)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:327)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:786)
    at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.open(HadoopFileSystem.java:119)
    at org.apache.flink.runtime.fs.hdfs.HadoopFileSystem.open(HadoopFileSystem.java:36)
    at org.apache.flink.core.fs.SafetyNetWrapperFileSystem.open(SafetyNetWrapperFileSystem.java:80)
    at org.apache.flink.runtime.state.filesystem.FileStateHandle.openInputStream(FileStateHandle.java:68)
    at org.apache.flink.runtime.state.KeyGroupsStateHandle.openInputStream(KeyGroupsStateHandle.java:112)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBFullRestoreOperation.restoreKeyGroupsInStateHandle(RocksDBKeyedStateBackend.java:569)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBFullRestoreOperation.doRestore(RocksDBKeyedStateBackend.java:558)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.restore(RocksDBKeyedStateBackend.java:446)
    at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.restore(RocksDBKeyedStateBackend.java:149)
    at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:151)
    at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:123)
    at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:276)
    at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:132)
    at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:227)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:730)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:295)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.channels.ClosedByInterruptException
    at
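
One likely explanation for the error above: the savepoint's state handles are stored as fully qualified URIs, so a savepoint written under hdfs://005.oldcluster:8020/... keeps pointing at the old namenode even after the files are copied to the new cluster; savepoints were not relocatable in Flink releases of that era. Possible workarounds (my reading, not from the thread) are to keep the old HDFS reachable from the new cluster just for the restore (flink run -s <old savepoint path> ...), or to take a fresh savepoint onto storage both clusters can read, for example by pointing state.savepoints.dir at a shared location, and restore from that.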


Re: ElasticSearch Checkpointing taking too much time

2018-09-12 Thread shashank734
Hi Vino,

I have tried both HDFS and the local filesystem, and other checkpoints complete successfully, so access is not the issue. To enable debug mode I would have to restart the app. I'll check and let you know, thanks.





Re: ElasticSearch Checkpointing taking too much time

2018-09-12 Thread shashank734
Hi Hequn,

Actually there are no error logs, and to turn on debug mode I would have to restart the app. I am using around 25-30 operators; all of the others complete their checkpoints quickly, and only the Elasticsearch sinks take a long time. I am using around 6 Elasticsearch sinks, and each takes around 25-30 minutes. I have set a 50-minute timeout, so when 1-2 of the Elasticsearch sinks take longer than that, the checkpoint or savepoint fails with a timeout error.

I'll check jstack.

Thanks
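
For reference, the checkpoint interval and the 50-minute timeout mentioned above are set on the CheckpointConfig; a minimal sketch with illustrative values (the actual job settings are not shown in the thread):

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.enableCheckpointing(10 * 60 * 1000);                         // trigger a checkpoint every 10 minutes
    env.getCheckpointConfig().setCheckpointTimeout(50 * 60 * 1000);  // fail a checkpoint after 50 minutes
    env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);        // at most one checkpoint in flight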







Re: ElasticSearch Checkpointing taking too much time

2018-09-11 Thread shashank734
Update: 

I am using parallelism 1 on this... could this be the issue?
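
In case it helps: a sink's parallelism can be raised independently of the rest of the job, although whether parallelism 1 is actually the cause is not established in this thread. A sketch with placeholder names:

    stream.addSink(esSink)
          .name("es-sink")
          .setParallelism(4);   // only this sink runs with 4 parallel subtasks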





ElasticSearch Checkpointing taking too much time

2018-09-11 Thread shashank734
I am using Flink 1.5.3 with the Elasticsearch sink. Checkpoints and savepoints are failing even though I have already set a 50-minute timeout. Looking into the details, only the Elasticsearch sink checkpoints take a long time, around 30-35 minutes, yet their state size and buffer size are 0. I don't know why they take so much time when the state size is 0.
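
For context on why a sink with state size 0 can still take this long: with flush-on-checkpoint enabled (the default), the Elasticsearch sink waits during the checkpoint until every pending bulk request has been acknowledged by Elasticsearch, so a slow or back-pressured Elasticsearch cluster shows up as a very long sink checkpoint even though no state is reported. A minimal sketch of the Elasticsearch 5 connector with explicit bulk-flush settings; Event, MyElasticsearchSinkFunction, the host name and all values are placeholders, not taken from the thread:

    // Sketch against flink-connector-elasticsearch5; names and values are placeholders.
    Map<String, String> config = new HashMap<>();
    config.put("cluster.name", "my-es-cluster");
    config.put("bulk.flush.max.actions", "500");     // flush after 500 buffered actions...
    config.put("bulk.flush.interval.ms", "5000");    // ...or at least every 5 seconds

    List<InetSocketAddress> transports = new ArrayList<>();
    transports.add(new InetSocketAddress(InetAddress.getByName("es-node-1"), 9300));

    ElasticsearchSink<Event> esSink =
        new ElasticsearchSink<>(config, transports, new MyElasticsearchSinkFunction());
    // By default the sink flushes and waits for all in-flight bulk requests during
    // the checkpoint; that wait is what appears as a long checkpoint with state size 0.
    // esSink.disableFlushOnCheckpoint();  // faster checkpoints, but weaker delivery guarantees
    stream.addSink(esSink);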





Sink Multiple Stream Elastic search

2018-08-02 Thread shashank734
Hello,

I am using the Elasticsearch 5 connector. Can I reuse the same connection while sinking multiple streams to Elasticsearch? Currently, I think it creates a separate transport connection for each sink, which results in a lot of connections to the cluster, because I am sinking 5-6 streams.
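
One way to reduce the number of connections, assuming the streams share the same element type, is to union them and attach a single sink, so only one set of Elasticsearch clients is created (one per parallel subtask of that sink). A sketch with placeholder names:

    DataStream<Event> all = stream1.union(stream2, stream3);  // all streams must have the same type
    all.addSink(esSink).name("es-sink");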






Re: How to set fix JobId for my application.

2018-05-14 Thread shashank734
Thanks for the suggestion; I am now using Kafka for sharing information between the apps.





Re: How to set fix JobId for my application.

2018-05-03 Thread shashank734
Thanks for the response. Actually, I know that. The main thing is that I have to use the job ID of application A for the queryable state in another application B, and in that case I don't want to redeploy application B whenever I change something in application A.







How to set fix JobId for my application.

2018-05-03 Thread shashank734
How can I set a fixed JobID for my Flink job? The queryable state client requires the Job ID, so whenever I update or redeploy my queryable state job, the JobID changes and I have to change and redeploy the queryable state client app as well.

Is there any way to fix the JobID, or to pass it dynamically to the client app?
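
One workaround, sketched below under the assumption that the client can reach the JobManager's REST endpoint, is to resolve the JobID by job name at client start-up instead of hardcoding it. Host, port and the job name are placeholders, and the exact path differs between Flink versions (/joboverview on older releases, /jobs/overview on newer ones); the response lists each job's jid, name and state:

    import java.io.InputStream;
    import java.net.URL;
    import java.util.Scanner;

    public class FindJobId {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://jobmanager-host:8081/jobs/overview");
            try (InputStream in = url.openStream();
                 Scanner s = new Scanner(in).useDelimiter("\\A")) {
                String json = s.hasNext() ? s.next() : "";
                // Parse the JSON with any JSON library and pick the "jid" whose "name"
                // matches the queryable-state job and whose state is RUNNING.
                System.out.println(json);
            }
        }
    }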





Re: FlinkML

2018-04-18 Thread shashank734
There are no active discussions or a guide on that, but I found this example in the repo:

https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/ml/IncrementalLearningSkeleton.java

which tries to do the same thing, although I haven't checked it yet.





Re: Fw: 1.4.2 - Unable to start YARN containers with more than 1 vCores per Task Manager

2018-04-03 Thread shashank734
Check your YARN configuration: if you are using the DefaultResourceCalculator, it only considers memory while allocating resources, so you have to change it to the DominantResourceCalculator:


<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>




Check this:

https://hortonworks.com/blog/managing-cpu-resources-in-your-hadoop-yarn-clusters/


