Re: Hashjoin implementation

2018-09-11 Thread vino yang
Hi Benjamin,

Do you mean that you expect to see HashPartition.java while writing your
program?
Maybe there is some confusion here.
The only thing you use to write a program is the Flink DataSet API, which
merely describes the job logic.
The class you are looking for lives in the flink-runtime module and is only
used at runtime.
Therefore, you will not encounter it while writing the job program.
If you really need to observe it, you can add some log statements to the
relevant methods of HashPartition.java,
but then you need to recompile and package flink-runtime from source and
replace the jar of the same name in the Flink distribution.
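For illustration, a minimal sketch of that kind of temporary logging, assuming
an SLF4J logger; the method shown is only a stand-in for whichever
HashPartition methods you want to trace, not the class's real code:

// Inside flink-runtime's HashPartition.java -- hypothetical debug logging;
// the method name and body are placeholders, not the real implementation.
private static final Logger LOG = LoggerFactory.getLogger(HashPartition.class);

void insertIntoBuildSide(BT record) throws IOException {
    LOG.debug("Hash join build side: inserting a record into this partition");
    // ... original insertion logic ...
}

Then rebuild flink-runtime (for example with mvn clean package -DskipTests from
the source tree) and swap the resulting jar into your distribution, as
described above.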

Thanks, vino.


Benjamin Burkhardt wrote on Wed, Sep 12, 2018 at 12:31 AM:

> Hi vino,
>
> thanks.
>
> I was running a join operation on two DataSets and writing the result to
> disk, and the results were correct.
> I just was not able to identify the moment when the hash table is built.
> (Is HashPartition.java not used in this case?)
>
> Do you have an idea why I cannot find it?
>
>
> Here is a part of my code:
>
> DataSet<RowCustomers> customers = env.fromElements(new RowCustomers(1,
> "Mayer"));
>
> tEnv.registerDataSet("Customers", customers, "customerID, customerName");
>
> DataSet<RowOrders> orders = env.fromElements(new RowOrders(1, 1, new
> String[]{"Pen", "Paper"}, "22.08.2018"));
>
> tEnv.registerDataSet("Orders", orders, "orderID, customerID, items, date");
>
> DataSet<Tuple2<RowCustomers, RowOrders>> result = customers.join(orders,
> JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST)
>     .where("customerID").equalTo("customerID");
>
> result.writeAsText("/");
>
>
> Thanks a lot.
> Benjamin
>
>
> On 11.09.2018 at 04:24, vino yang wrote:
>
> Hi Benjamin,
>
> The approximate location is this package; the more precise location is
> here. [1]
>
> Specifically, Hash Join is divided into two steps:
>
> 1) build side
> 2) probe side
>
> Thanks, vino.
>
> [1]:
> https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/operators/hash/HashPartition.java
>
> Benjamin Burkhardt wrote on Mon, Sep 10, 2018 at 10:09 PM:
>
>> Hi,
>>
>> can anyone tell me where the default hybrid hash join function for
>> partitioning (shuffle phase) is implemented?
>> Even after digging deeper I was not able to figure out where it is
>> located.
>>
>> Might be somewhere here?
>> —>
>> https://github.com/apache/flink/tree/master/flink-runtime/src/main/java/org/apache/flink/runtime/operators/hash
>>
>> Thanks in advance.
>>
>> Benjamin
>
>
>


Re: Exception when run flink-storm-example

2018-09-11 Thread vino yang
Hi hanjing,

> There may be both Flink jobs and flink-storm jobs in my cluster; I don't
> know the influence of legacy mode.

For Storm-compatible jobs, because of technical limitations, you need to
use a cluster that supports legacy mode.
But for jobs implemented with the Flink APIs, I strongly recommend using the
new mode, because it brings huge changes compared with the old model and you
will get a more timely response if you encounter problems.

Thanks, vino.

jing wrote on Tue, Sep 11, 2018 at 6:02 PM:

> Hi Till,
> legacy mode worked!
> Thanks a lot. And what's the difference between legacy and new mode? Is
> there any document or release note?
> There may be both Flink jobs and flink-storm jobs in my cluster; I don't
> know the influence of legacy mode.
>
> Hanjing
>
> On 9/11/2018 14:43, Till Rohrmann wrote:
>
> Hi Hanjing,
>
> I think the problem is that the Storm compatibility layer only works with
> legacy mode at the moment. Please set `mode: legacy` in your
> flink-conf.yaml. I hope this will resolve the problems.
>
> Cheers,
> Till
>
> On Tue, Sep 11, 2018 at 7:10 AM jing  wrote:
>
>> Hi vino,
>> Thank you very much.
>> I'll try more tests.
>>
>> Hanjing
>>
>> On 9/11/2018 11:51, vino yang wrote:
>>
>> Hi Hanjing,
>>
>> Flink does not currently support TaskManager HA; it only supports
>> JobManager HA.
>> In a standalone environment, once the JobManager triggers a failover,
>> all jobs are cancelled and restarted as well.
>>
>> Thanks, vino.
>>
>> jing wrote on Tue, Sep 11, 2018 at 11:12 AM:
>>
>>> Hi vino,
>>> Thanks a lot.
>>> Besides, I'm also confused about TaskManager HA.
>>> There are 2 TaskManagers in my cluster, and only one job A worked on
>>> TaskManager A. If TaskManager A crashed, what happens to my job?
>>> I tried it: my job failed, and TaskManager B did not take over job A.
>>> Is this right?
>>>
>>> Hanjing
>>>
>>> On 9/11/2018 10:59, vino yang wrote:
>>>
>>> Oh, I thought a regular Flink job could not be submitted. I don't know
>>> why the Storm example cannot be submitted, because I have never used it.
>>>
>>> Maybe Till, Chesnay or Gary can help you. Ping them for you.
>>>
>>> Thanks, vino.
>>>
>>> jing wrote on Tue, Sep 11, 2018 at 10:26 AM:
>>>
 Hi vino,
 My JobManager log is below. I can submit a regular Flink job to this
 JobManager and it works, but the flink-storm example doesn't.
 Thanks.
 Hanjing

 2018-09-11 18:22:48,937 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
 
 2018-09-11 18:22:48,938 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Starting 
 StandaloneSessionClusterEntrypoint (Version: 1.6.0, Rev:ff472b4, 
 Date:07.08.2018 @ 13:31:13 UTC)
 2018-09-11 18:22:48,938 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  OS 
 current user: hadoop3
 2018-09-11 18:22:49,143 WARN  org.apache.hadoop.util.NativeCodeLoader  
  - Unable to load native-hadoop library for your 
 platform... using builtin-java classes where applicable
 2018-09-11 18:22:49,186 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Current 
 Hadoop/Kerberos user: hadoop3
 2018-09-11 18:22:49,186 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM: Java 
 HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.172-b11
 2018-09-11 18:22:49,186 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Maximum 
 heap size: 981 MiBytes
 2018-09-11 18:22:49,186 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  
 JAVA_HOME: /usr/java/jdk1.8.0_172-amd64
 2018-09-11 18:22:49,188 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Hadoop 
 version: 2.7.5
 2018-09-11 18:22:49,188 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM 
 Options:
 2018-09-11 18:22:49,188 

Re: ElasticSearch Checkpointing taking too much time

2018-09-11 Thread vino yang
Hi shashank,

Hequn's suggestion is right. In addition, whatever type of state backend you
use, please make sure that the JM/TM can access the related systems (such as
HDFS). If you still cannot locate the problem, you can set the log level to
DEBUG and share your log output.
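For example, a minimal change to conf/log4j.properties, assuming the default
log4j setup shipped with the Flink distribution:

# Raise the root log level from INFO to DEBUG, then restart and
# reproduce the failing checkpoint.
log4j.rootLogger=DEBUG, file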

Thanks, vino.

Hequn Cheng wrote on Wed, Sep 12, 2018 at 9:22 AM:

> Hi shashank,
>
> The parallelism won't be the problem.
> Did the checkpoint eventually succeed? I think it may be that the data
> processing is blocked so that the checkpoint cannot complete. You
> can check whether there are any error logs in the TaskManager, or jstack the
> TaskManager to see what is wrong with the task.
>
> Best, Hequn
>
> On Tue, Sep 11, 2018 at 10:22 PM shashank734 
> wrote:
>
>> Update:
>>
>> I am using parallelism 1 on this... is this the issue?
>>
>>
>>
>> --
>> Sent from:
>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>>
>


Re: Flink RMQSource Consumer: How I get the RabbitMQ UserId

2018-09-11 Thread vino yang
Hi Marke,

Shouldn't you use code like this:

delivery.getProperties().getUserId();

to get the userId from the Delivery object?

And for the second code example: since you obtain an object of type
TimeSeriesType, shouldn't you declare the stream as a
DataStream<TimeSeriesType>?

Regarding the userId, I only said that this is one way of extracting it. If
the message itself does not carry that value, there is no way to get it. Can
you confirm that the messages in RabbitMQ actually contain a userId?

Thanks, vino.
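To make the approach discussed in this thread concrete, here is a minimal,
hedged sketch of a subclass that extends Flink's RMQSource and copies the AMQP
userId into the emitted record. The class name, the TimeSeriesType methods and
the simplified consume loop are modeled on the snippets in this thread, not on
the real connector internals (imports and field visibility may differ between
Flink versions); keeping a private running flag in the subclass is one
workaround for the private RMQSource field Martin asks about below:

import com.rabbitmq.client.QueueingConsumer;
import org.apache.flink.api.common.serialization.DeserializationSchema;
import org.apache.flink.streaming.connectors.rabbitmq.RMQSource;
import org.apache.flink.streaming.connectors.rabbitmq.common.RMQConnectionConfig;

public class UserIdRMQSource extends RMQSource<TimeSeriesType> {

    private final DeserializationSchema<TimeSeriesType> schema;
    // Own flag, since RMQSource's "running" field is private.
    private volatile boolean running = true;

    public UserIdRMQSource(RMQConnectionConfig config, String queueName,
                           DeserializationSchema<TimeSeriesType> schema) {
        super(config, queueName, schema);
        this.schema = schema;
    }

    @Override
    public void run(SourceContext<TimeSeriesType> ctx) throws Exception {
        while (running) {
            // "consumer" is the RabbitMQ consumer set up by the parent class;
            // its visibility may differ between Flink versions.
            QueueingConsumer.Delivery delivery = consumer.nextDelivery();
            TimeSeriesType result = schema.deserialize(delivery.getBody());
            result.setDeviceId(delivery.getProperties().getUserId());
            synchronized (ctx.getCheckpointLock()) {
                ctx.collect(result);
            }
        }
    }

    @Override
    public void cancel() {
        running = false;
        super.cancel();
    }
}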

Marke Builder wrote on Tue, Sep 11, 2018 at 4:34 PM:

> Hi Vino,
>
> this is what I done, but no user Id available. And the first question was
> about the running parameter in RMQSource#boolean running.
>
> Code example:
> @Override
> public void run(SourceContext<TimeSeriesType> ctx) {
>     ...
>     TimeSeriesType result = (TimeSeriesType)
> schema.deserialize(delivery.getBody());
>     ...
>     final String userId = delivery.getProperties().getUserId();
>     result.setDeviceId(userId);
>     ...
>     ctx.collect(result);
> }
>
>
> And the DataStream looks like this:
> final DataStream<TimeSeriesType> stream = env
>     .addSource(new RabbitmqStreamProcessorV2(
>         connectionConfig,
>         fastDataQueue,
>         new AbstractDeserializationSchema<TimeSeriesType>() {
>             @Override
>             public TimeSeriesType deserialize(byte[] bytes) throws IOException {
>                 TimeSeriesType message = null;
>                 try {
>                     message = xmlParser.parse(new String(bytes, "UTF8"));
>                     logger.info("Data/Message size: "
>                         + String.valueOf(message.getData().size()));
>                 } catch (JAXBException e) {
>                     e.printStackTrace();
>                     logger.log(Level.INFO, e.toString());
>                 }
>                 return message;
>             }
>         }))
>     .flatMap(...
>
>
>
>
>
>
> On Mon, Sep 10, 2018 at 03:52, vino yang <
> yanghua1...@gmail.com> wrote:
>
>> Hi Marke,
>>
>> I haven't actually implemented this, but I think you can
>> replace this line of code:
>>
>> OUT result = schema.deserialize(delivery.getBody()); // RMQSource#run
>>
>> with a call to an abstract method defined in RMQSource, such as
>> normalize/deserialize, whose input parameter is the Delivery
>> and whose return value is the generic type OUT, and implement your custom
>> logic in this method.
>>
>> Thanks, vino.
>>
>> Marke Builder wrote on Mon, Sep 10, 2018 at 12:32 AM:
>>
>>> Hi Vino,
>>>
>>> many thanks for your response, the solution works! But I have one
>>> additional question:
>>> what is the best way to override RMQSource#run without access to the
>>> private RMQSource variable "running"?
>>>
>>> Thanks, Martin.
>>>
>>> On Sat, Sep 8, 2018 at 10:15, vino yang <
>>> yanghua1...@gmail.com> wrote:
>>>
 Hi Marke,

 As you said, you need to extend RMQSource because Flink's RabbitMQ
 connector only extracts the body of the Delivery.
 Therefore, to achieve your purpose, you need to add a property
 to the specific data type of your DataStream
 to represent the userId, then override the RMQSource#run method and
 extract the userId from the properties of the Delivery.
 In addition, you may need to pay attention to your
 implementation of DeserializationSchema.

 Thanks, vino.

 Marke Builder wrote on Sat, Sep 8, 2018 at 3:44 PM:

> Hi,
>
> how can I get the UserId from the Properties into my DataStream?
>
> I can read the userId if I extend the RMQSource Class:
> QueueingConsumer.Delivery delivery = consumer.nextDelivery();
> String userId = delivery.getProperties().getUserId();
>
> But how can I provide this to my DataStream ?
>
> Best regards,
> Martin
>



Re: ElasticSearch Checkpointing taking too much time

2018-09-11 Thread Hequn Cheng
Hi shashank,

The parallelism won't be the problem.
Did the checkpoint eventually succeed? I think it may be that the data
processing is blocked so that the checkpoint cannot complete. You
can check whether there are any error logs in the TaskManager, or jstack the
TaskManager to see what is wrong with the task.

Best, Hequn

On Tue, Sep 11, 2018 at 10:22 PM shashank734  wrote:

> Update:
>
> I am using parallelism 1 on this... is this the issue?
>
>
>
> --
> Sent from:
> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
>


Re: Triggering Savepoints with the Monitoring API

2018-09-11 Thread Austin Cawley-Edwards
Thank you vino!

On Mon, Sep 10, 2018, 11:08 PM vino yang  wrote:

> Hi Austin,
>
> It seems that your scenario fits a classic use case for Flink's
> savepoints: A/B testing (or application upgrades). Yes, Flink supports this
> requirement, but you should understand that the two jobs will be subject
> to certain restrictions. For more information, please refer to the official
> documentation. [1]
>
> Thanks, vino.
>
> [1]:
> https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/upgrading.html#application-state-compatibility
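>
> For illustration, a hedged sketch of that A/B pattern; the paths, the
> FsStateBackend choice, and the job names here are placeholders, not taken
> from your setup:
>
> // Give the new job its own checkpoint directory so it never touches the
> // old job's checkpoints (placeholder path).
> StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
> env.setStateBackend(new FsStateBackend("hdfs:///checkpoints/job-b"));
>
> // Resume from the other job's savepoint when submitting (placeholder paths):
> //   bin/flink run -s hdfs:///savepoints/job-a/savepoint-abc123 job-b.jar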
>
> Austin Cawley-Edwards wrote on Tue, Sep 11, 2018 at 7:27 AM:
>
>> Hi there,
>>
>> I would just like a quick sanity-check: it is possible to start a job
>> with a savepoint from another job, and have the new job save to a new
>> checkpoint directory without overwriting the original checkpoints, correct?
>>
>> Thank you so much!
>> Austin
>>
>>
>>


Re: Problem with querying state on Flink 1.6.

2018-09-11 Thread Joe Olson
Kostas - Till's advice got me past my first problem. I'm still having
issues with the client side. I've got your example code from [1] in a
github project [2].

My problem differs from David Anderson's above in that my call through
QueryableStateClient targets a remote machine, not localhost (my client is
running on a different machine than any of the Flink processes).

Assuming a queryable state client is allowed to run on a different machine,
I haven't been able to get QueryableStateClient or getKvState to react at
all... no errors, even if I put in a bogus IP address, bogus port, etc.

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/queryable_state.html
[2] https://github.com/jolson787/qs
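For comparison, a minimal hedged sketch of the client pattern from that
documentation; the host, port, JobID, state name, key and types are
placeholders, and the blocking get() with a timeout is added so that
connection failures surface instead of sitting in an unconsumed future:

// Flink 1.6-style queryable state client; all concrete values are placeholders.
QueryableStateClient client = new QueryableStateClient("taskmanager-host", 9069);

ValueStateDescriptor<Long> descriptor =
        new ValueStateDescriptor<>("count", BasicTypeInfo.LONG_TYPE_INFO);

CompletableFuture<ValueState<Long>> future = client.getKvState(
        JobID.fromHexString("0123456789abcdef0123456789abcdef"),
        "query-name",                 // the name passed to .asQueryableState(...)
        "some-key",
        BasicTypeInfo.STRING_TYPE_INFO,
        descriptor);

// Block with a timeout so a wrong host or closed port fails loudly here.
ValueState<Long> state = future.get(10, TimeUnit.SECONDS);
System.out.println(state.value());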

On Mon, Sep 10, 2018 at 7:13 AM Kostas Kloudas 
wrote:

> Hi Joe,
>
> Did the problem get resolved at the end?
>
> Thanks,
> Kostas
>
> On Aug 30, 2018, at 9:06 PM, Eron Wright  wrote:
>
> I took a brief look into why the queryable state server would bind to the
> loopback address. Both the qs server and the
> org.apache.flink.runtime.io.network.netty.NettyServer bind the local
> address based on the TM address. That address is based on the
> "taskmanager.hostname" configuration override and, by default, the
> RpcService address.
>
> A possible explanation is that, on Joe's machine, Java's
> `InetAddress.getLocalHost()` resolves to the loopback address.  I believe
> there's some variation in Java's behavior in that regard.
>
> Hope this helps!
>
> On Thu, Aug 30, 2018 at 1:27 AM Till Rohrmann 
> wrote:
>
>> Hi Joe,
>>
>> it looks as if the queryable state server binds to the local loopback
>> address. This looks like a bug to me. Could you maybe share the complete
>> cluster entrypoint and the task manager logs with me?
>>
>> In the meantime you could try to do the following: Change
>> AbstractServerBase.java:227 into `.localAddress(port)`. This should bind to
>> any local address. Now you need to build your own Flink distribution by
>> running `mvn clean package -DskipTests` and then go to either build-target
>> or flink-dist/target/flink-1.7-SNAPSHOT-bin/flink-1.7-SNAPSHOT to find the
>> distribution.
>>
>> Cheers,
>> Till
>>
>> On Thu, Aug 30, 2018 at 12:12 AM Joe Olson  wrote:
>>
>>> I'm having a problem with querying state on Flink 1.6.
>>>
>>> I put a project in Github that is my best representation of the very
>>> simple client example outlined in the 'querying state' section of the 1.6
>>> documentation at
>>> https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/queryable_state.html
>>> . The Github project is at https://github.com/jolson787/qs
>>>
>>> My problem: I know the query server and proxy server have started on my
>>> 1 job manager / 1 task manager Flink 1.6 test rig, because I see the
>>> 'Started Queryable State Server' and 'Started Queryable State Proxy Server'
>>> in the task manager logs. I know the ports are open on the local machine,
>>> because I can telnet to them.
>>>
>>> From a remote machine, I implemented the QueryableStateClient as in the
>>> example, and made a getKvState call. Nothing I do around that
>>> getKvState call seems to register... no response, no errors thrown, no
>>> lines in the log, no returned futures, no timeouts, etc. I know the proxy
>>> server and state server ports are NOT open to the remote machine, yet the
>>> client still doesn't seem to react.
>>>
>>> Can someone take a quick look at my very simple Github project and see
>>> if anything jumps out at them? Beer is on me at Flink Forward if someone
>>> can help me work through this
>>>
>>
>


Re: How does flink read a DataSet?

2018-09-11 Thread Taher Koitawala
Furthermore, how does Flink deal with TaskManagers dying when it is using
the DataSet API? Is checkpointing done on DataSets too, or does the whole
dataset have to be re-read?

Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163

On Tue, Sep 11, 2018 at 11:18 PM, Taher Koitawala  wrote:

> Hi All,
>          Just like Spark, does Flink read a dataset, keep it in memory,
> and keep applying transformations? Or are all records read by Flink via
> async parallel reads? Furthermore, how does Flink deal with
>
> Regards,
> Taher Koitawala
> GS Lab Pune
> +91 8407979163
>


How does flink read a DataSet?

2018-09-11 Thread Taher Koitawala
Hi All,
 Just like Spark, does Flink read a dataset, keep it in memory,
and keep applying transformations? Or are all records read by Flink via
async parallel reads? Furthermore, how does Flink deal with

Regards,
Taher Koitawala
GS Lab Pune
+91 8407979163


Re: Hashjoin implementation

2018-09-11 Thread Benjamin Burkhardt
Hi vino,

thanks. 

I was running a join operation on two DataSets and writing the result to
disk, and the results were correct.
I just was not able to identify the moment when the hash table is built.
(Is HashPartition.java not used in this case?)

Do you have an idea why I cannot find it?


Here is a part of my code:

DataSet<RowCustomers> customers = env.fromElements(new RowCustomers(1,
"Mayer"));
tEnv.registerDataSet("Customers", customers, "customerID, customerName");
DataSet<RowOrders> orders = env.fromElements(new RowOrders(1, 1, new
String[]{"Pen", "Paper"}, "22.08.2018"));
tEnv.registerDataSet("Orders", orders, "orderID, customerID, items, date");
DataSet<Tuple2<RowCustomers, RowOrders>> result = customers.join(orders,
JoinOperatorBase.JoinHint.REPARTITION_HASH_FIRST)
    .where("customerID").equalTo("customerID");
result.writeAsText("/");

Thanks a lot.
Benjamin


> On 11.09.2018 at 04:24, vino yang wrote:
> 
> Hi Benjamin,
> 
> The approximate location is this package; the more precise location is
> here. [1]
> 
> Specifically, Hash Join is divided into two steps:
> 
> 1) build side
> 2) probe side
> 
> Thanks, vino.
> 
> [1]: 
> https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/operators/hash/HashPartition.java
>  
> 
> Benjamin Burkhardt wrote on Mon, Sep 10, 2018 at 10:09 PM:
> Hi,
> 
> can anyone tell me where the default hybrid hash join function for 
> partitioning (shuffle phase) is implemented? 
> Even after digging deeper I was not able to figure out where it is located.
> 
> Might be somewhere here?
> —> 
> https://github.com/apache/flink/tree/master/flink-runtime/src/main/java/org/apache/flink/runtime/operators/hash
>  
> 
> 
> Thanks in advance.
> 
> Benjamin



Re: Deadlock in SafetyNetCloseableRegistry?

2018-09-11 Thread bupt_ljy
Hi all,
Sorry for attaching this again. The Flink version is 1.6 and the deadlock
stack is:


"CloseableReaperThread" #54 daemon prio=5 os_prio=0 tid=0x7f4d6d3af000 
nid=0x32f6 in Object.wait() [0x7f4d3fdfe000]
 java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
- waiting on 0xaefacb70 (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
- locked 0xaefacb70 (a java.lang.ref.ReferenceQueue$Lock)
at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
at 
org.apache.flink.core.fs.SafetyNetCloseableRegistry$CloseableReaperThread.run(SafetyNetCloseableRegistry.java:193)


   This thread is created in the AsyncCheckpointRunnable class and gets stuck, so
the next checkpoint can't acquire the lock in the performCheckpoint method and
times out. How can I avoid this?


   Best, Jiayi Liao


Original Message
Sender: bupt_ljy <bupt_...@163.com>
Recipient: user <u...@flink.apache.org>
Date: Tuesday, Sep 11, 2018 22:22
Subject: Deadlock in SafetyNetCloseableRegistry?


Hi all,
 I started a Flink program and it runs on YARN. At first it didn't acquire
enough resources, so this was thrown:
“org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
Could not allocate all requires slots within timeout of 30 ms. Slots
required: 16, slots allocated: 7”.
 Then the JobManager automatically restarted but fails to trigger checkpoints
anymore because they “expired before completing”. All the TaskManagers are
blocked, and there seems to be a deadlock in SafetyNetCloseableRegistry, which
may be why the whole TaskManager is blocked. Here is the TaskManager's
stack:
 
 Best, Jiayi Liao

Speakers needed for Apache DC Roadshow

2018-09-11 Thread Rich Bowen
We need your help to make the Apache Washington DC Roadshow on Dec 4th a 
success.


What do we need most? Speakers!

We're bringing a unique DC flavor to this event by mixing Open Source 
Software with talks about Apache projects as well as OSS CyberSecurity, 
OSS in Government, and OSS career advice.


Please take a look at: http://www.apachecon.com/usroadshow18/

(Note: You are receiving this message because you are subscribed to one 
or more mailing lists at The Apache Software Foundation.)


Rich, for the ApacheCon Planners

--
rbo...@apache.org
http://apachecon.com
@ApacheCon


Between Checkpoints in Kafka 11

2018-09-11 Thread Harshvardhan Agrawal
Hi,

I was going through the blog post on how the TwoPhaseCommitSink function works
with Kafka 0.11. One of the things I don't understand is: what is the
behavior of the Kafka 0.11 producer between two checkpoints? Say that the
time interval between two checkpoints is set to 15 minutes. Will Flink
buffer all records in memory in that case and start writing to Kafka when
the next checkpoint starts?

Thanks!
-- 
Regards,
Harshvardhan


Deadlock in SafetyNetCloseableRegistry?

2018-09-11 Thread bupt_ljy
Hi all,
 I started a Flink program and it runs on YARN. At first it didn't acquire
enough resources, so this was thrown:
“org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException:
Could not allocate all requires slots within timeout of 30 ms. Slots
required: 16, slots allocated: 7”.
 Then the JobManager automatically restarted but fails to trigger checkpoints
anymore because they “expired before completing”. All the TaskManagers are
blocked, and there seems to be a deadlock in SafetyNetCloseableRegistry, which
may be why the whole TaskManager is blocked. Here is the TaskManager's
stack:
 
 Best, Jiayi Liao



Re: ElasticSearch Checkpointing taking too much time

2018-09-11 Thread shashank734
Update: 

I am using parallelism 1 on this... is this the issue?



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


ElasticSearch Checkpointing taking too much time

2018-09-11 Thread shashank734
I am using Flink 1.5.3 with an Elasticsearch sink. Checkpoints
and savepoints are failing even though I have already set a 50 minute
timeout. Looking into the details, only the Elasticsearch sink checkpoints
take a long time, 30-35 mins, yet their state size and buffer size are 0.
I don't know why it takes so much time when the state size is 0.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: How to get Cluster metrics in FLIP-6 mode

2018-09-11 Thread Tony Wei
Hi Gary,

Thanks for your information.

Best,
Tony Wei

2018-09-11 20:26 GMT+08:00 Gary Yao :

> Hi Tony,
>
> You are right that these metrics are missing. There is already a ticket for
> that [1]. At the moment you can obtain this information from the REST API
> (/overview) [2].
>
> Since FLIP-6, the JM is no longer responsible for these metrics but for
> backwards compatibility we can leave them in the JM scope for now.
>
> Best,
> Gary
>
>
> [1] https://issues.apache.org/jira/browse/FLINK-10135
> [2] https://ci.apache.org/projects/flink/flink-docs-
> release-1.5/monitoring/rest_api.html#available-requests
>
> On Tue, Sep 11, 2018 at 12:19 PM, Tony Wei  wrote:
>
>> Hi,
>>
>> I found that the metrics [1] below disappeared in my JM's Prometheus reporter
>> when I used FLIP-6 to deploy a standalone cluster (flink 1.5.3 release).
>>
>> Cluster-scope metrics (from [1]):
>>
>> Scope: JobManager
>> - numRegisteredTaskManagers: The number of registered taskmanagers. (Gauge)
>> - numRunningJobs: The number of running jobs. (Gauge)
>> - taskSlotsAvailable: The number of available task slots. (Gauge)
>> - taskSlotsTotal: The total number of task slots. (Gauge)
>>
>> I guessed maybe the JM is no longer responsible for these metrics, but I
>> still need them on my dashboard. Does anyone know how to let my metric
>> reporter get these metrics? Or did I miss something?
>> Thank you.
>>
>> Best Regards,
>> Tony Wei
>>
>> [1] https://ci.apache.org/projects/flink/flink-docs-release-
>> 1.6/monitoring/metrics.html#cluster
>>
>>
>


Re: Create a file in parquet format

2018-09-11 Thread Gary Yao
Hi Jose,

You can find an example here:


https://github.com/apache/flink/blob/1a94c2094b8045a717a92e232f9891b23120e0f2/flink-formats/flink-parquet/src/test/java/org/apache/flink/formats/parquet/avro/ParquetStreamingFileSinkITCase.java#L58
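In outline, that test uses the StreamingFileSink's bulk format together with
the Parquet Avro writers; a minimal hedged sketch, where MyRecord and the
output path are placeholders:

// Bulk-encoded Parquet output via the flink-parquet module;
// MyRecord and the path below are placeholders.
StreamingFileSink<MyRecord> sink = StreamingFileSink
        .forBulkFormat(
                new Path("hdfs:///output/parquet"),
                ParquetAvroWriters.forReflectRecord(MyRecord.class))
        .build();

stream.addSink(sink);

Note that bulk formats roll part files on every checkpoint, so checkpointing
must be enabled for the output to be committed.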

Best,
Gary

On Tue, Sep 11, 2018 at 11:59 AM, jose farfan  wrote:

> Hi
>
> I am working on a task whose purpose is to create a sink in Parquet format.
> I am using the "Streaming Flink Sink", but I cannot complete the
> task.
>
> Do you know of any example on GitHub or in a blog that I can use to complete
> the task?
>
> Many Thx
> BR
> Jose
>


Re: Exception when run flink-storm-example

2018-09-11 Thread Till Rohrmann
You can check these release notes
https://flink.apache.org/news/2018/05/25/release-1.5.0.html for more
information.

Cheers,
Till

On Tue, Sep 11, 2018 at 12:02 PM jing  wrote:

> Hi Till,
> legacy mode worked!
> Thanks a lot. And what's the difference between legacy and new mode? Is
> there any document or release note?
> There may be both Flink jobs and flink-storm jobs in my cluster; I don't
> know the influence of legacy mode.
>
> Hanjing
>
> On 9/11/2018 14:43, Till Rohrmann wrote:
>
> Hi Hanjing,
>
> I think the problem is that the Storm compatibility layer only works with
> legacy mode at the moment. Please set `mode: legacy` in your
> flink-conf.yaml. I hope this will resolve the problems.
>
> Cheers,
> Till
>
> On Tue, Sep 11, 2018 at 7:10 AM jing  wrote:
>
>> Hi vino,
>> Thank you very much.
>> I'll try more tests.
>>
>> Hanjing
>>
>> On 9/11/2018 11:51, vino yang wrote:
>>
>> Hi Hanjing,
>>
>> Flink does not currently support TaskManager HA; it only supports
>> JobManager HA.
>> In a standalone environment, once the JobManager triggers a failover,
>> all jobs are cancelled and restarted as well.
>>
>> Thanks, vino.
>>
>> jing wrote on Tue, Sep 11, 2018 at 11:12 AM:
>>
>>> Hi vino,
>>> Thanks a lot.
>>> Besides, I'm also confused about TaskManager HA.
>>> There are 2 TaskManagers in my cluster, and only one job A worked on
>>> TaskManager A. If TaskManager A crashed, what happens to my job?
>>> I tried it: my job failed, and TaskManager B did not take over job A.
>>> Is this right?
>>>
>>> Hanjing
>>>
>>> On 9/11/2018 10:59, vino yang wrote:
>>>
>>> Oh, I thought a regular Flink job could not be submitted. I don't know
>>> why the Storm example cannot be submitted, because I have never used it.
>>>
>>> Maybe Till, Chesnay or Gary can help you. Ping them for you.
>>>
>>> Thanks, vino.
>>>
>>> jing wrote on Tue, Sep 11, 2018 at 10:26 AM:
>>>
 Hi vino,
 My JobManager log is below. I can submit a regular Flink job to this
 JobManager and it works, but the flink-storm example doesn't.
 Thanks.
 Hanjing

 2018-09-11 18:22:48,937 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
 
 2018-09-11 18:22:48,938 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Starting 
 StandaloneSessionClusterEntrypoint (Version: 1.6.0, Rev:ff472b4, 
 Date:07.08.2018 @ 13:31:13 UTC)
 2018-09-11 18:22:48,938 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  OS 
 current user: hadoop3
 2018-09-11 18:22:49,143 WARN  org.apache.hadoop.util.NativeCodeLoader  
  - Unable to load native-hadoop library for your 
 platform... using builtin-java classes where applicable
 2018-09-11 18:22:49,186 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Current 
 Hadoop/Kerberos user: hadoop3
 2018-09-11 18:22:49,186 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM: Java 
 HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.172-b11
 2018-09-11 18:22:49,186 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Maximum 
 heap size: 981 MiBytes
 2018-09-11 18:22:49,186 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  
 JAVA_HOME: /usr/java/jdk1.8.0_172-amd64
 2018-09-11 18:22:49,188 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Hadoop 
 version: 2.7.5
 2018-09-11 18:22:49,188 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM 
 Options:
 2018-09-11 18:22:49,188 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
 -Xms1024m
 2018-09-11 18:22:49,188 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
 -Xmx1024m
 2018-09-11 18:22:49,188 INFO  
 org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 

Re: How to get Cluster metrics in FLIP-6 mode

2018-09-11 Thread Gary Yao
Hi Tony,

You are right that these metrics are missing. There is already a ticket for
that [1]. At the moment you can obtain this information from the REST API
(/overview) [2].

Since FLIP-6, the JM is no longer responsible for these metrics but for
backwards compatibility we can leave them in the JM scope for now.

Best,
Gary


[1] https://issues.apache.org/jira/browse/FLINK-10135
[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/monitoring/rest_api.html#available-requests
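For example, querying the overview endpoint of the JobManager's REST interface
(the hostname is a placeholder; 8081 is the default REST port):

curl http://jobmanager-host:8081/overview

The JSON response contains taskmanagers, slots-total, slots-available and
jobs-running counts, among others.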

On Tue, Sep 11, 2018 at 12:19 PM, Tony Wei  wrote:

> Hi,
>
> I found that the metrics [1] below disappeared in my JM's Prometheus reporter
> when I used FLIP-6 to deploy a standalone cluster (flink 1.5.3 release).
>
> Cluster-scope metrics (from [1]):
>
> Scope: JobManager
> - numRegisteredTaskManagers: The number of registered taskmanagers. (Gauge)
> - numRunningJobs: The number of running jobs. (Gauge)
> - taskSlotsAvailable: The number of available task slots. (Gauge)
> - taskSlotsTotal: The total number of task slots. (Gauge)
>
> I guessed maybe the JM is no longer responsible for these metrics, but I
> still need them on my dashboard. Does anyone know how to let my metric
> reporter get these metrics? Or did I miss something?
> Thank you.
>
> Best Regards,
> Tony Wei
>
> [1] https://ci.apache.org/projects/flink/flink-docs-
> release-1.6/monitoring/metrics.html#cluster
>
>


Re: JAXB Classloading errors when using PMML Library (AWS EMR Flink 1.4.2)

2018-09-11 Thread Gary Yao
Hi,

Do you also have pmml-model-moxy as a dependency in your job? Using mvn
dependency:tree, I do not see that pmml-evaluator has a compile-time
dependency on jaxb-api. The jaxb-api dependency actually comes from
pmml-model-moxy, so the exclusion should be defined on pmml-model-moxy.

You can also try "parent-first" ClassLoader resolution order [1].

Best,
Gary

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.4/monitoring/debugging_classloading.html#inverted-class-loading-and-classloader-resolution-order
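The "parent-first" resolution order mentioned in [1] is a one-line change in
flink-conf.yaml (the default is child-first):

# flink-conf.yaml
classloader.resolve-order: parent-first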


On Tue, Sep 4, 2018 at 3:24 AM, Sameer W  wrote:

> Hi,
>
> I am using PMML dependency as below to execute ML models at prediction
> time within a Flink Map operator
>
> <dependency>
>     <groupId>org.jpmml</groupId>
>     <artifactId>pmml-evaluator</artifactId>
>     <version>1.4.3</version>
>     <exclusions>
>         <exclusion>
>             <groupId>javax.xml.bind</groupId>
>             <artifactId>jaxb-api</artifactId>
>         </exclusion>
>         <exclusion>
>             <groupId>org.glassfish.jaxb</groupId>
>             <artifactId>jaxb-runtime</artifactId>
>         </exclusion>
>         <exclusion>
>             <artifactId>guava</artifactId>
>             <groupId>com.google.guava</groupId>
>         </exclusion>
>     </exclusions>
> </dependency>
> Environment is EMR, OpenJDK 1.8 and Flink 1.4.2. My programs run fine in
> my Eclipse development environment. However, when we deploy on the cluster
> we get classloading exceptions, which are primarily due to the PMML classes
> being loaded via the Flink classloader while the JAXB classes are loaded by
> the boot classloader. Also, the version of the jaxb classes referenced
> within the PMML library seems to be different from the ones loaded by the
> OpenJDK.
>
> For example I keep getting this type of error. I have also listed another
> error after this which is linked to not being able to use reflection and
> unsafe library to set private instances within the PMML class instance
> using JAXB Unmarshaller.  -
> java.lang.LinkageError: loader constraint violation: when resolving
> interface method "javax.xml.bind.Unmarshaller.unmarshal(Ljavax/xml/
> transform/Source;)Ljava/lang/Object;" the class loader (instance of
> org/apache/flink/runtime/execution/librarycache/FlinkUserCodeClassLoaders$ChildFirstClassLoader)
> of the current class, 
> com/comcast/mlarche/featurecreationflows/xreerrors/MyJAXBUtil,
> and the class loader (instance of ) for the method's defining
> class, javax/xml/bind/Unmarshaller, have different Class objects for the
> type javax/xml/transform/Source used in the signature
> at com.comcast.mlarche.featurecreationflows.xreerrors.MyJAXBUtil.
> unmarshal(MyJAXBUtil.java:52)
> at com.comcast.mlarche.featurecreationflows.xreerrors.MyJAXBUtil.
> unmarshalPMML(MyJAXBUtil.java:38)
> at com.comcast.mlarche.featurecreationflows.
> xreerrors.PMMLModelExecution.getMiningModelEvaluator(
> PMMLModelExecution.java:67)
> at com.comcast.mlarche.featurecreationflows.
> xreerrors.PMMLModelExecution.predict(PMMLModelExecution.java:126)
> at com.comcast.mlarche.featurecreationflows.xreerrors.
> XreErrorModelsPredictionServiceService.predict(
> XreErrorModelsPredictionServiceService.java:61)
> at com.comcast.mlarche.featurecreationflows.xreerrors.
> XreErrorModelsPredictionServiceService.predictSystemRefresh(
> XreErrorModelsPredictionServiceService.java:44)
> at com.comcast.mlarche.featurecreationflows.xreerrors.
> XREErrorsModelExecutionMap.flatMap(XREErrorsModelExecutionMap.java:46)
> at com.comcast.mlarche.featurecreationflows.xreerrors.
> XREErrorsModelExecutionMap.flatMap(XREErrorsModelExecutionMap.java:17)
> at org.apache.flink.streaming.api.operators.StreamFlatMap.
> processElement(StreamFlatMap.java:50)
> at org.apache.flink.streaming.runtime.tasks.OperatorChain$
> CopyingChainingOutput.pushToOperator(OperatorChain.java:549)
> at org.apache.flink.streaming.runtime.tasks.OperatorChain$
> CopyingChainingOutput.collect(OperatorChain.java:524)
> at org.apache.flink.streaming.runtime.tasks.OperatorChain$
> CopyingChainingOutput.collect(OperatorChain.java:504)
> at org.apache.flink.streaming.api.operators.AbstractStreamOperator$
> CountingOutput.collect(AbstractStreamOperator.java:830)
> at org.apache.flink.streaming.api.operators.AbstractStreamOperator$
> CountingOutput.collect(AbstractStreamOperator.java:808)
> at org.apache.flink.streaming.api.operators.
> TimestampedCollector.collect(TimestampedCollector.java:51)
> at com.comcast.mlarche.featurecreationflows.xreerrors.
> XreErrorsModelsApplyFunction.apply(XreErrorsModelsApplyFunction.java:65)
> at com.comcast.mlarche.featurecreationflows.xreerrors.
> XreErrorsModelsApplyFunction.apply(XreErrorsModelsApplyFunction.java:20)
> at org.apache.flink.streaming.runtime.operators.windowing.functions.
> InternalIterableWindowFunction.process(InternalIterableWindowFunction
> .java:44)
> at org.apache.flink.streaming.runtime.operators.windowing.functions.
> InternalIterableWindowFunction.process(InternalIterableWindowFunction
> .java:32)
> at org.apache.flink.streaming.runtime.operators.windowing.
> EvictingWindowOperator.emitWindowContents(EvictingWindowOperator.java:357)
> at org.apache.flink.streaming.runtime.operators.windowing.
> 

How to get Cluster metrics in FLIP-6 mode

2018-09-11 Thread Tony Wei
Hi,

I found that the metrics [1] below disappeared in my JM's Prometheus reporter
when I used FLIP-6 to deploy a standalone cluster (flink 1.5.3 release).

Cluster-scope metrics (from [1]):

Scope: JobManager
- numRegisteredTaskManagers: The number of registered taskmanagers. (Gauge)
- numRunningJobs: The number of running jobs. (Gauge)
- taskSlotsAvailable: The number of available task slots. (Gauge)
- taskSlotsTotal: The total number of task slots. (Gauge)

I guessed maybe the JM is no longer responsible for these metrics, but I
still need them on my dashboard. Does anyone know how to let my metric
reporter get these metrics? Or did I miss something?
Thank you.

Best Regards,
Tony Wei

[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/metrics.html#cluster


Re: Exception when run flink-storm-example

2018-09-11 Thread jing
Hi Till,
legacy mode worked!
Thanks a lot. And what's the difference between legacy and new mode? Is there
any document or release note?
There may be both Flink jobs and flink-storm jobs in my cluster; I don't know
the influence of legacy mode.


Hanjing

On 9/11/2018 14:43, Till Rohrmann wrote:
Hi Hanjing,


I think the problem is that the Storm compatibility layer only works with 
legacy mode at the moment. Please set `mode: legacy` in your flink-conf.yaml. I 
hope this will resolve the problems.


Cheers,
Till


On Tue, Sep 11, 2018 at 7:10 AM jing  wrote:

Hi vino,
Thank you very much.
I'll try more tests.


Hanjing

On 9/11/2018 11:51, vino yang wrote:
Hi Hanjing,


Flink does not currently support TaskManager HA; it only supports JobManager
HA.
In a standalone environment, once the JobManager triggers a failover, all
jobs are cancelled and restarted as well.



Thanks, vino.


jing wrote on Tue, Sep 11, 2018 at 11:12 AM:

Hi vino,
Thanks a lot.
Besides, I'm also confused about TaskManager HA.
There are 2 TaskManagers in my cluster, and only one job A worked on
TaskManager A. If TaskManager A crashed, what happens to my job?
I tried it: my job failed, and TaskManager B did not take over job A.
Is this right?


Hanjing

On 9/11/2018 10:59, vino yang wrote:
Oh, I thought a regular Flink job could not be submitted. I don't know why the
Storm example cannot be submitted, because I have never used it.


Maybe Till, Chesnay or Gary can help you. Ping them for you.


Thanks, vino.


jing wrote on Tue, Sep 11, 2018 at 10:26 AM:

Hi vino,
My JobManager log is below. I can submit a regular Flink job to this
JobManager and it works, but the flink-storm example doesn't.
Thanks.
Hanjing
2018-09-11 18:22:48,937 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 

2018-09-11 18:22:48,938 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Starting 
StandaloneSessionClusterEntrypoint (Version: 1.6.0, Rev:ff472b4, 
Date:07.08.2018 @ 13:31:13 UTC)
2018-09-11 18:22:48,938 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  OS current 
user: hadoop3
2018-09-11 18:22:49,143 WARN  org.apache.hadoop.util.NativeCodeLoader   
- Unable to load native-hadoop library for your platform... using 
builtin-java classes where applicable
2018-09-11 18:22:49,186 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Current 
Hadoop/Kerberos user: hadoop3
2018-09-11 18:22:49,186 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM: Java 
HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.172-b11
2018-09-11 18:22:49,186 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Maximum heap 
size: 981 MiBytes
2018-09-11 18:22:49,186 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JAVA_HOME: 
/usr/java/jdk1.8.0_172-amd64
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Hadoop 
version: 2.7.5
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM Options:
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xms1024m
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - -Xmx1024m
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
-Dlog.file=/home/hadoop3/zh/flink-1.6.0/log/flink-hadoop3-standalonesession-0-p-a36-72.log
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
-Dlog4j.configuration=file:/home/hadoop3/zh/flink-1.6.0/conf/log4j.properties
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
-Dlogback.configurationFile=file:/home/hadoop3/zh/flink-1.6.0/conf/logback.xml
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Program 
Arguments:
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - --configDir
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
/home/hadoop3/zh/flink-1.6.0/conf
2018-09-11 18:22:49,188 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
--executionMode
2018-09-11 18:22:49,189 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint - cluster
2018-09-11 18:22:49,189 INFO  
org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Classpath: 

Create a file in parquet format

2018-09-11 Thread jose farfan
Hi

I am working on a task whose purpose is to create a sink in Parquet format.
I am using the "Streaming Flink Sink", but I cannot complete the task.

Do you know of any example on GitHub or in a blog that I can use to complete
the task?

Many Thx
BR
Jose


Re: WriteTimeoutException in Cassandra sink kill the job

2018-09-11 Thread HarshithBolar
Have you configured checkpointing in your job? If enabled, the job should
revert to the last completed checkpoint in case of a failure and process
the failed record again.
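A minimal sketch of enabling it, with an arbitrary example interval:

// Checkpoint the job every 60 seconds so that, after a failure and restart,
// processing resumes from the last completed checkpoint.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(60_000);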





--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/


Re: [EXTERNAL] Re: Server side error: Cannot find required BLOB at /tmp/blobStore

2018-09-11 Thread Raja . Aravapalli

Thanks a lot Lasse.


So the recommendation is to update the property "blob.storage.directory" to
some maintained path other than "/tmp", so that the default cleanup is
avoided?

Also, can you please confirm whether this is an HDFS path or a local Unix path?
Please acknowledge whether I have understood correctly.

Thanks a lot again.


Regards,
Raja.

From: Lasse Nedergaard 
Date: Tuesday, September 11, 2018 at 12:52 PM
To: miki haiat 
Cc: Raja Aravapalli , user 
Subject: [EXTERNAL] Re: Server side error: Cannot find required BLOB at 
/tmp/blobStore

Hi.

From my presentation at Flink Forward, you can validate this:
• We used EMR on Amazon's Linux AMI.
• We didn't change the default blob server location (/tmp).
• By default, a cron job cleans up /tmp.
• Solution: change the blob server location with blob.storage.directory.

On Tue, Sep 11, 2018 at 09:14, miki haiat <miko5...@gmail.com> wrote:
Is /tmp/blobStore the path for checkpoint/savepoint storage?

On Tue, 11 Sep 2018, 10:01 Raja.Aravapalli, 
mailto:raja.aravapa...@target.com>> wrote:
Hi,

My Flink application, which reads from Kafka and writes to HDFS, is failing
repeatedly with the exception below:

Caused by: java.io.IOException: Server side error: Cannot find required BLOB at
/tmp/blobStore-**

Can someone please help me with what could be the root cause of this issue? I
am also not able to trace the logs.

Also, FYI, our Flink cluster (4 nodes) is set up on our Hadoop YARN cluster.

Thanks a lot.


Regards,
Raja.


[Kerberos] JAAS module content not generated? javax.security.auth.callback.UnsupportedCallbackException: Could not login: the client is being asked for a password, but the Kafka client code does not c

2018-09-11 Thread Sebastien Pereira
Hi,

We are using Flink 1.5.3, where the Kafka producer talks to a Kerberized Kafka
(Kerberos only, no SSL).

It fails to connect to Kafka with the root exception:
javax.security.auth.callback.UnsupportedCallbackException: Could not login: the
client is being asked for a password, but the Kafka client code does not
currently support obtaining a password from the user.

We have the following configuration for kerberos in flink-conf.yaml:
# --
security.kerberos.login.use-ticket-cache: false
security.kerberos.login.keytab:  /etc/krb5/flink.keytab
security.kerberos.login.principal: kafka/the.host.n...@example.com
security.kerberos.login.contexts: KafkaClient
# --

We use org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer011 with 
the following properties for kerberos:
# --
security.protocol=SASL_PLAINTEXT
sasl.kerberos.service.name=kafka
# --

From the job/task manager hosts we can log in with the same user that runs the
Flink processes, and successfully get a Kerberos ticket:

# --
kubectl exec -it  -- /bin/bash
$ kinit kafka/hdp-2641.fyre.ibm@example.com -k -t /etc/krb5/flink.keytab 

   
Done!
New ticket is stored in cache file /opt/flink/krb5cc_bai
$ klist

Credentials cache: /opt/flink/krb5cc_bai
Default principal: kafka/the.host.n...@example.com
Number of entries: 1

[1] Service principal: krbtgt/example@example.com
Valid starting: Monday, September 10, 2018 at 4:58:29 PM
Expires: Tuesday, September 11, 2018 at 4:58:29 PM
# --

However,
when we check the content of the JAAS file generated in /tmp, we see no
content apart from the comments:

/tmp$ cat jaas-4651713797960840940.conf
/**

#  Licensed to the Apache Software Foundation (ASF) under one
#  or more contributor license agreements.  See the NOTICE file
#  distributed with this work for additional information
#  regarding copyright ownership.  The ASF licenses this file
#  to you under the Apache License, Version 2.0 (the
#  "License"); you may not use this file except in compliance
#  with the License.  You may obtain a copy of the License at
#
#  http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software
#  distributed under the License is distributed on an "AS IS" BASIS,
#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
#  See the License for the specific language governing permissions and
# limitations under the License.

# We are using this file as an workaround for the Kafka and ZK SASL 
implementation
# since they explicitly look for java.security.auth.login.config property
# Please do not edit/delete this file - See FLINK-3929
**/

/tmp$

- Could you confirm that we should have more in the generated JAAS file?
- We strongly suspect the UnsupportedCallbackException is caused by missing 
content in the generated JAAS file. 

Thanks,

Sebastien Pereira


Re: Server side error: Cannot find required BLOB at /tmp/blobStore

2018-09-11 Thread Lasse Nedergaard
Hi.

From my presentation at Flink Forward, you can validate this:
• We used EMR on Amazon's Linux AMI.
• We didn't change the default blob server location (/tmp).
• By default, a cron job cleans up /tmp.
• Solution: change the blob server location with blob.storage.directory.
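A minimal flink-conf.yaml excerpt for that solution; the path is a placeholder
and must be a local directory that no system cleanup job touches:

# flink-conf.yaml -- move the blob store out of /tmp (placeholder path)
blob.storage.directory: /var/lib/flink/blobs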

On Tue, Sep 11, 2018 at 09:14, miki haiat wrote:

> Is /tmp/blobStore the path for checkpoint/savepoint storage?
>
> On Tue, 11 Sep 2018, 10:01 Raja.Aravapalli, 
> wrote:
>
>> Hi,
>>
>>
>>
>> My Flink application, which reads from Kafka and writes to HDFS, is failing
>> repeatedly with the exception below:
>>
>> Caused by: java.io.IOException: Server side error: Cannot find required
>> BLOB at /tmp/blobStore-**
>>
>> Can someone please help me with what could be the root cause of this
>> issue? I am also not able to trace the logs.
>>
>> Also, FYI, our Flink cluster (4 nodes) is set up on our Hadoop YARN cluster.
>>
>>
>>
>> Thanks a lot.
>>
>>
>>
>>
>>
>> Regards,
>>
>> Raja.
>>
>


Re: [EXTERNAL] Re: Server side error: Cannot find required BLOB at /tmp/blobStore

2018-09-11 Thread Raja . Aravapalli
I have not explicitly passed any path starting with "/tmp" to the application
from my end, either HDFS or local ☹


Regards,
Raja.

From: miki haiat 
Date: Tuesday, September 11, 2018 at 12:44 PM
To: Raja Aravapalli 
Cc: user 
Subject: [EXTERNAL] Re: Server side error: Cannot find required BLOB at 
/tmp/blobStore

Is /tmp/blobStore the path for checkpoint/savepoint storage?

On Tue, 11 Sep 2018, 10:01 Raja.Aravapalli, 
mailto:raja.aravapa...@target.com>> wrote:
Hi,

My Flink application, which reads from Kafka and writes to HDFS, is failing
repeatedly with the exception below:

Caused by: java.io.IOException: Server side error: Cannot find required BLOB at
/tmp/blobStore-**

Can someone please help me with what could be the root cause of this issue? I
am also not able to trace the logs.

Also, FYI, our Flink cluster (4 nodes) is set up on our Hadoop YARN cluster.

Thanks a lot.


Regards,
Raja.


Re: Server side error: Cannot find required BLOB at /tmp/blobStore

2018-09-11 Thread miki haiat
Is /tmp/blobStore the path for checkpoint/savepoint storage?

On Tue, 11 Sep 2018, 10:01 Raja.Aravapalli, 
wrote:

> Hi,
>
>
>
> My Flink application, which reads from Kafka and writes to HDFS, is failing
> repeatedly with the exception below:
>
> Caused by: java.io.IOException: Server side error: Cannot find required
> BLOB at /tmp/blobStore-**
>
> Can someone please help me with what could be the root cause of this
> issue? I am also not able to trace the logs.
>
> Also, FYI, our Flink cluster (4 nodes) is set up on our Hadoop YARN cluster.
>
>
>
> Thanks a lot.
>
>
>
>
>
> Regards,
>
> Raja.
>


Server side error: Cannot find required BLOB at /tmp/blobStore

2018-09-11 Thread Raja . Aravapalli
Hi,

My Flink application, which reads from Kafka and writes to HDFS, is failing
repeatedly with the exception below:

Caused by: java.io.IOException: Server side error: Cannot find required BLOB at
/tmp/blobStore-**

Can someone please help me with what could be the root cause of this issue? I
am also not able to trace the logs.

Also, FYI, our Flink cluster (4 nodes) is set up on our Hadoop YARN cluster.

Thanks a lot.


Regards,
Raja.


Re: Exception when run flink-storm-example

2018-09-11 Thread Till Rohrmann
Hi Hanjing,

I think the problem is that the Storm compatibility layer only works with
legacy mode at the moment. Please set `mode: legacy` in your
flink-conf.yaml. I hope this will resolve the problems.
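
For reference, the corresponding one-line change in flink-conf.yaml:

# flink-conf.yaml -- required for the Storm compatibility layer on Flink 1.5/1.6
mode: legacy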

Cheers,
Till

On Tue, Sep 11, 2018 at 7:10 AM jing  wrote:

> Hi vino,
> Thank you very much.
> I'll try more tests.
>
> Hanjing
>
> On 9/11/2018 11:51, vino yang wrote:
>
> Hi Hanjing,
>
> Flink does not currently support TaskManager HA; it only supports
> JobManager HA.
> In a standalone environment, once the JobManager triggers a failover, all
> jobs are cancelled and restarted as well.
>
> Thanks, vino.
>
> jing wrote on Tue, Sep 11, 2018 at 11:12 AM:
>
>> Hi vino,
>> Thanks a lot.
>> Besides, I'm also confused about TaskManager HA.
>> There are 2 TaskManagers in my cluster, and only one job A worked on
>> TaskManager A. If TaskManager A crashed, what happens to my job?
>> I tried it: my job failed, and TaskManager B did not take over job A.
>> Is this right?
>>
>> Hanjing
>>
>> On 9/11/2018 10:59, vino yang wrote:
>>
>> Oh, I thought a regular Flink job could not be submitted. I don't know why
>> the Storm example cannot be submitted, because I have never used it.
>>
>> Maybe Till, Chesnay or Gary can help you. Ping them for you.
>>
>> Thanks, vino.
>>
>> jing wrote on Tue, Sep 11, 2018 at 10:26 AM:
>>
>>> Hi vino,
>>> My JobManager log is below. I can submit a regular Flink job to this
>>> JobManager and it works, but the flink-storm example doesn't.
>>> Thanks.
>>> Hanjing
>>>
>>> 2018-09-11 18:22:48,937 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
>>> 
>>> 2018-09-11 18:22:48,938 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Starting 
>>> StandaloneSessionClusterEntrypoint (Version: 1.6.0, Rev:ff472b4, 
>>> Date:07.08.2018 @ 13:31:13 UTC)
>>> 2018-09-11 18:22:48,938 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  OS current 
>>> user: hadoop3
>>> 2018-09-11 18:22:49,143 WARN  org.apache.hadoop.util.NativeCodeLoader   
>>> - Unable to load native-hadoop library for your platform... 
>>> using builtin-java classes where applicable
>>> 2018-09-11 18:22:49,186 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Current 
>>> Hadoop/Kerberos user: hadoop3
>>> 2018-09-11 18:22:49,186 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM: Java 
>>> HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.172-b11
>>> 2018-09-11 18:22:49,186 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Maximum 
>>> heap size: 981 MiBytes
>>> 2018-09-11 18:22:49,186 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JAVA_HOME: 
>>> /usr/java/jdk1.8.0_172-amd64
>>> 2018-09-11 18:22:49,188 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Hadoop 
>>> version: 2.7.5
>>> 2018-09-11 18:22:49,188 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  JVM 
>>> Options:
>>> 2018-09-11 18:22:49,188 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
>>> -Xms1024m
>>> 2018-09-11 18:22:49,188 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
>>> -Xmx1024m
>>> 2018-09-11 18:22:49,188 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
>>> -Dlog.file=/home/hadoop3/zh/flink-1.6.0/log/flink-hadoop3-standalonesession-0-p-a36-72.log
>>> 2018-09-11 18:22:49,188 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
>>> -Dlog4j.configuration=file:/home/hadoop3/zh/flink-1.6.0/conf/log4j.properties
>>> 2018-09-11 18:22:49,188 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
>>> -Dlogback.configurationFile=file:/home/hadoop3/zh/flink-1.6.0/conf/logback.xml
>>> 2018-09-11 18:22:49,188 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint -  Program 
>>> Arguments:
>>> 2018-09-11 18:22:49,188 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
>>> --configDir
>>> 2018-09-11 18:22:49,188 INFO  
>>> org.apache.flink.runtime.entrypoint.ClusterEntrypoint - 
>>> /home/hadoop3/zh/flink-1.6.0/conf
>>> 2018-09-11 18:22:49,188 INFO  
>>>