Re: Flink and factories?

2016-10-19 Thread Sebastian Neef
Hi,

Wow, that's indeed a nice solution.

Your version still threw some errors:

> Caused by: org.apache.flink.api.common.InvalidProgramException: Object 
> factorytest.Config$1@5143c662 not serializable
> Caused by: java.io.NotSerializableException: factorytest.factory.PearFactory

I fixed this by adding "implements java.io.Serializable" to the
IDataFactory (and all other interfaces right away) - I hope that won't
backfire in the future.
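
For reference, a minimal sketch of what that change amounts to, using the interface names from this thread (a sketch, not the repository's actual code):

    import java.io.Serializable;

    // Extending Serializable lets Flink ship a factory instance to the workers
    // as part of a serialized user function.
    public interface IDataFactory extends Serializable {
        IData newInstance();
    }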

Anyway, the problem seems solved. Yay and thank you!

Kind regards,
Sebastian



Re: Flink and factories?

2016-10-19 Thread Chesnay Schepler
The functions are serialized when env.execute() is called. The thing is, as
I understand it, that your singleton is simply not part of the serialized
function, so it doesn't actually matter when the function is serialized.
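
As a minimal illustration of that distinction (hypothetical class, not from this thread): only the instance fields of the function object are serialized and shipped; statics and singletons are re-created in every TaskManager JVM.

    import org.apache.flink.api.common.functions.MapFunction;

    public class ShippingExample implements MapFunction<Integer, Integer> {
        private final int shipped = 42;   // instance field: serialized with the function
        private static int notShipped;    // static: re-initialized on each worker JVM

        @Override
        public Integer map(Integer value) {
            return value + shipped + notShipped;
        }
    }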


Storing the factory instance in the function shouldn't be too much work,
actually; the following code might do the trick already (the changes are the
new factory field and the collect(factory.newInstance()) call):


    DataSet<IData> processedData = this.getEnv().fromCollection(inputData)
        .flatMap(new FlatMapFunction<Integer, IData>() {
            private final IDataFactory factory = Config.getInstance().getFactory();

            @Override
            public void flatMap(Integer integer, Collector<IData> collector) throws Exception {
                if (integer % 2 == 0) {
                    collector.collect(factory.newInstance());
                }
            }
        });

Regards,
Chesnay

On 19.10.2016 23:09, Sebastian Neef wrote:

Hi Chesnay,

thank you for looking into this!

Is there any way to tell Flink to (re)sync the changed classes and/or
tell it to distribute the serialized classes at a given point (e.g. on the
first env.execute())?

The thing is that I'm working on a small framework based on Flink, so
passing a configuration object to all functions/classes would be overkill,
I guess.

Thanks again and kind regards,
Sebastian





Re: Flink and factories?

2016-10-19 Thread Sebastian Neef
Hi Chesnay,

thank you for looking into this!

Is there any way to tell Flink to (re)sync the changed classes and/or
tell it to distribute the serialized classes at a given point (e.g. on the
first env.execute())?

The thing is that I'm working on a small framework based on Flink, so
passing a configuration object to all functions/classes would be overkill,
I guess.
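
One pattern that avoids both the singleton lookup on the workers and threading a configuration object through every call site is constructor injection into the function itself. A sketch using the interface names from this thread (EvenNumberMapper is a hypothetical name):

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.util.Collector;

    public class EvenNumberMapper implements FlatMapFunction<Integer, IData> {
        // Captured on the client when the function is constructed,
        // then serialized and shipped with it.
        private final IDataFactory factory;

        public EvenNumberMapper(IDataFactory factory) {
            this.factory = factory;
        }

        @Override
        public void flatMap(Integer value, Collector<IData> out) throws Exception {
            if (value % 2 == 0) {
                out.collect(factory.newInstance());
            }
        }
    }

    // usage: inputData.flatMap(new EvenNumberMapper(Config.getInstance().getFactory()));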

Thanks again and kind regards,
Sebastian


ApacheCon is now less than a month away!

2016-10-19 Thread Rich Bowen
Dear Apache Enthusiast,

ApacheCon Sevilla is now less than a month out, and we need your help
getting the word out. Please tell your colleagues, your friends, and
members of related technical communities, about this event. Rates go up
November 3rd, so register today!

ApacheCon, and Apache Big Data, are the official gatherings of the
Apache Software Foundation, and one of the best places in the world to
meet other members of your project community, gain deeper knowledge
about your favorite Apache projects, and learn about the ASF. Your project
doesn't live in a vacuum - it's part of a larger family of projects that
have a shared set of values, as well as a shared governance model. And
many of our projects have an overlap in developers, in communities, and
in subject matter, making ApacheCon a great place for cross-pollination
of ideas and of communities.

Some highlights of these events will be:

* Many of our board members and project chairs will be present
* The lightning talks are a great place to hear, and give, short
presentations about what you and other members of the community are
working on
* The key signing gets you linked into the web of trust, and better
able to verify our software releases
* Evening receptions and parties where you can meet community
members in a less formal setting
* The State of the Feather, where you can learn what the ASF has
done in the last year, and what's coming next year
* BarCampApache, an informal unconference-style event, is another
venue for discussing your projects at the ASF

We have a great schedule lined up, covering the wide range of ASF
projects, including:

* CI and CD at Scale: Scaling Jenkins with Docker and Apache Mesos -
Carlos Sanchez
* Inner sourcing 101 - Jim Jagielski
* Java Memory Leaks in Modular Environments - Mark Thomas

ApacheCon/Apache Big Data will be held in Sevilla, Spain, at the Melia
Sevilla, November 14th through 18th. You can find out more at
http://apachecon.com/  Other ways to stay up to date with ApacheCon are:

* Follow us on Twitter at @apachecon
* Join us on IRC, at #apachecon on the Freenode IRC network
* Join the apachecon-discuss mailing list by sending email to
apachecon-discuss-subscr...@apache.org
* Or contact me directly at rbo...@apache.org with questions,
comments, or to volunteer to help

See you in Sevilla!

-- 
Rich Bowen: VP, Conferences
rbo...@apache.org
http://apachecon.com/
@apachecon


Re: Issue while restarting from SavePoint

2016-10-19 Thread Anirudh Mallem
Hi Ufuk,
Thank you for looking into the issue. Please find your answers below:

(1) In detached mode the configuration seems to be not picked up
correctly. That should be independent of the savepoints. Can you
confirm this?
—> I tried starting a new job in detached mode and the job started on the 
cluster.


(2) The program was changed in a non-compatible way after the
savepoint. Did you change the program and if yes in which way?
—> No, I did not make any change to the existing job. I tried restarting the 
same job. 


However, I think I have found the problem. I was not specifying the parallelism
when restarting the job from the savepoint. I assumed that this information was
also captured in the savepoint. So the non-detached mode was actually throwing
the right error, but the detached mode was not picking up the config. I guess
the detached mode should also have thrown the same exception, right? Thanks a
lot for helping.
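
For reference, a sketch of the restore command with the parallelism passed explicitly via the CLI's -p flag (the value 4 and the <job-jar>/<job-args> placeholders are assumed examples):

    bin/flink run -s jobmanager://savepoints/1 -p 4 \
        -c com.tfs.rtdp.precompute.Flink.FlinkTest <job-jar> <job-args>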



On 10/19/16, 1:19 AM, "Ufuk Celebi"  wrote:

>Hey Anirudh!
>
>As you say, this looks like two issues:
>
>(1) In detached mode the configuration seems to be not picked up
>correctly. That should be independent of the savepoints. Can you
>confirm this?
>
>(2) The program was changed in a non-compatible way after the
>savepoint. Did you change the program and if yes in which way?
>
>– Ufuk
>
>
>On Wed, Oct 19, 2016 at 12:01 AM, Anirudh Mallem
> wrote:
>> Hi,
>> The issue seems to be connected with trying to restart the job in the
>> detached mode. The stack trace is as follows:
>>
>> -bash-3.2$ bin/flink run -d -s jobmanager://savepoints/1 -c
>> com.tfs.rtdp.precompute.Flink.FlinkTest
>> /tmp/flink-web-upload-2155906f-be54-47f3-b9f7-7f6f0f54f74b/448724f9-f69f-455f-99b9-c57289657e29_uber-flink-test-1.0-SNAPSHOT.jar
>> file:///home/amallem/capitalone.properties
>> Cluster configuration: Standalone cluster with JobManager at
>> /10.64.119.90:33167
>> Using address 10.64.119.90:33167 to connect to JobManager.
>> JobManager web interface address http://10.64.119.90:8081
>> Starting execution of program
>> Submitting Job with JobID: 6c5596772627d3c9366deaa0c47ab0ad. Returning after
>> job submission.
>>
>> 
>>  The program finished with the following exception:
>>
>> org.apache.flink.client.program.ProgramInvocationException: The program
>> execution failed: JobManager did not respond within 6 milliseconds
>> at
>> org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:432)
>> at
>> org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:93)
>> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:378)
>> at
>> org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:75)
>> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:323)
>> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:777)
>> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:253)
>> at
>> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1005)
>> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1048)
>> Caused by: org.apache.flink.runtime.client.JobTimeoutException: JobManager
>> did not respond within 6 milliseconds
>> at
>> org.apache.flink.runtime.client.JobClient.submitJobDetached(JobClient.java:227)
>> at
>> org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:429)
>> ... 8 more
>> Caused by: java.util.concurrent.TimeoutException: Futures timed out after
>> [6 milliseconds]
>> at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
>> at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
>> at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
>> at
>> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
>> at scala.concurrent.Await$.result(package.scala:107)
>> at scala.concurrent.Await.result(package.scala)
>> at
>> org.apache.flink.runtime.client.JobClient.submitJobDetached(JobClient.java:223)
>> ... 9 more
>>
>> When running it without the detached mode, the job manager is responding but
>> the job is failing as it thinks that the structure of the job is modified.
>>
>> -bash-3.2$ bin/flink run -s jobmanager://savepoints/1 -c
>> com.tfs.rtdp.precompute.Flink.FlinkTest
>> /tmp/flink-web-upload-2155906f-be54-47f3-b9f7-7f6f0f54f74b/448724f9-f69f-455f-99b9-c57289657e29_uber-flink-test-1.0-SNAPSHOT.jar
>> file:///home/amallem/capitalone.properties
>> Cluster configuration: Standalone cluster with JobManager at
>> /10.64.119.90:33167
>> Using address 10.64.119.90:33167 to connect to JobManager.
>> JobManager web interface address http://10.64.119.90:8081
>> Starting execution of program
>> Submitting job with JobID: fe7bf69676a5127eecb392e3a0743c6d. Waiting for job
>> completion.
>> Connected to 

NoClassDefFoundError on cluster with httpclient 4.5.2

2016-10-19 Thread Yassine MARZOUGUI
Hi all,

I'm using httpclient with the following dependency:


<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.2</version>
</dependency>


In local mode, the program works correctly, but when executed on the
cluster, I get the following exception:

java.lang.Exception: The user defined 'open(Configuration)' method in class
org.myorg.quickstart.Frequencies$2 caused an exception: Could not
initialize class org.apache.http.conn.ssl.SSLConnectionSocketFactory
at
org.apache.flink.runtime.operators.BatchTask.openUserCode(BatchTask.java:1337)
at
org.apache.flink.runtime.operators.chaining.ChainedFlatMapDriver.openTask(ChainedFlatMapDriver.java:47)
at
org.apache.flink.runtime.operators.BatchTask.openChainedTasks(BatchTask.java:1377)
at
org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:124)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:585)
at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.NoClassDefFoundError: Could not initialize class
org.apache.http.conn.ssl.SSLConnectionSocketFactory
at
org.apache.http.impl.client.HttpClientBuilder.build(HttpClientBuilder.java:966)
at org.myorg.quickstart.Frequencies$2.open(Frequencies.java:82)
at
org.apache.flink.api.common.functions.util.FunctionUtils.openFunction(FunctionUtils.java:38)
at
org.apache.flink.runtime.operators.BatchTask.openUserCode(BatchTask.java:1335)
... 5 more

I'm using Flink 1.1.3. Any idea how to solve the problem? Thank you.
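
A "Could not initialize class" failure like this often comes from a version clash between the httpclient/httpcore in the job jar and an older copy already on the cluster classpath. A hypothetical debugging snippet (e.g. placed in the function's open() method) that narrows this down by logging which jar the class was actually loaded from:

    // Prints the jar that provided SSLConnectionSocketFactory on this node.
    // Note: getCodeSource() can return null for bootstrap-classpath classes.
    java.net.URL source = org.apache.http.conn.ssl.SSLConnectionSocketFactory.class
            .getProtectionDomain().getCodeSource().getLocation();
    System.out.println("httpclient loaded from: " + source);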

Best,
Yassine


Re: native snappy library not available

2016-10-19 Thread Maximilian Michels
The Hadoop config of your Hadoop installation which is loaded in
SequenceFileWriter.open() needs to be configured to have
"io.compression.codecs" set to include "SnappyCodec". This is probably
described in the Hadoop documentation.
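
A sketch of what that setting amounts to when applied programmatically to a Hadoop Configuration (the codec list is an assumed example; core-site.xml is the usual place for the same key):

    import org.apache.hadoop.conf.Configuration;

    Configuration conf = new Configuration();
    conf.set("io.compression.codecs",
            "org.apache.hadoop.io.compress.DefaultCodec,"
          + "org.apache.hadoop.io.compress.GzipCodec,"
          + "org.apache.hadoop.io.compress.SnappyCodec");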

-Max


On Wed, Oct 19, 2016 at 6:09 PM,   wrote:
> Hi Till,
>
>
>
> Thanks for the response.  I’m hoping not to have to change Flink’s lib
> folder.  The path I specified does exist on each node, and -C is supposed
> to add the path to the classpath on all nodes in the cluster.  I might try
> to bundle the snappy jar within my job jar to see if that works.
>
>
>
> From: Till Rohrmann 
> Reply-To: "user@flink.apache.org" 
> Date: Wednesday, October 19, 2016 at 2:46 PM
> To: "user@flink.apache.org" 
> Subject: Re: native snappy library not available
>
>
>
> Hi Robert,
>
>
>
> have you tried putting the snappy java jar in Flink's lib folder? When
> specifying the classpath manually you have to make sure that all distributed
> components are also started with this classpath.
>
>
>
> Cheers,
>
> Till
>
>
>
> On Wed, Oct 19, 2016 at 1:07 PM,  wrote:
>
> I have flink running on a standalone cluster, but with a Hadoop cluster
> available to it.  I’m using a RollingSink and a SequenceFileWriter, and I’ve
> set compression to Snappy:
>
>
>
> .setWriter(new SequenceFileWriter Text>("org.apache.hadoop.io.compress.SnappyCodec",
> SequenceFile.CompressionType.BLOCK))
>
>
>
>
>
> However, when I try to run this job, I get an error indicating that the
> native snappy library is not available:
>
>
>
> Caused by: java.lang.RuntimeException: native snappy library not available:
> this version of libhadoop was built without snappy support.
>
> at
> org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
>
>
>
> I’ve tried adding the snappy java jar to the classpath.  I’ve also tried
> adding the native dir to the classpath.  Neither have helped.
>
> -C file://usr/hdp/current/hadoop-client/lib/snappy-java-1.0.4.1.jar
>
> -C file://usr/hdp/current/hadoop/lib/native/
>
>
>
> Any thoughts on how I can get this to work?
>
>
>
> Thanks!


multiple processing of streams

2016-10-19 Thread robert.lancaster
Is it possible to process the same stream in two different ways?  I can’t find
anything in the documentation definitively stating this is possible, nor
anything stating it isn’t.  My attempt had some unexpected results, which I’ll
explain below:

Essentially, I have a stream of data I’m pulling from Kafka.  I want to build 
aggregate metrics on this data set using both tumbling windows as well as 
session windows.  So, I do something like the following:

DataStream<…> baseStream =
    env.addSource(…)                                     // pulling data from kafka
       .map(…)                                           // parse the raw input
       .assignTimestampsAndWatermarks(…);

DataStream<Tuple2<…>> timeWindowedStream =
    baseStream.keyBy(…)
              .timeWindow(…)                             // tumbling window
              .apply(…);                                 // aggregation over tumbling window

DataStream<Tuple2<…>> sessionWindowedStream =
    baseStream.keyBy(…)
              .window(EventTimeSessionWindows.withGap(…))  // session window
              .apply(…);                                 // aggregation over session window

The issue is that when I view my job in the Flink dashboard, it indicates that 
each type of windowing is only receiving half of the records.  Is what I’m 
trying simply unsupported or is there something I’m missing?
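
For reference, consuming one DataStream with several downstream operators is supported, and every consumer receives every element; records are not split between them. A minimal self-contained sketch (hypothetical data and job name):

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class TwoConsumers {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            DataStream<Integer> base = env.fromElements(1, 2, 3, 4, 5, 6);

            // Both operators read the full stream.
            base.filter(i -> i % 2 == 0).print();   // evens
            base.filter(i -> i % 2 != 0).print();   // odds

            env.execute("two consumers of one stream");
        }
    }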

Thanks!











Flink and factories?

2016-10-19 Thread Sebastian Neef
Hi,

I'm currently working with Flink for my bachelor thesis and I'm running
into some odd issues regarding factories.

I've built a small "proof of concept" and the code can be found here:
https://gitlab.tubit.tu-berlin.de/gehaxelt/flink-factorytest

The idea is that a Config-singleton holds information or objects to use,
e.g. an AppleFactory (default) which implements a specific IDataFactory
interface. This AppleFactory is then used in a flatMap to create Apples
(objects which implement the IData interface):

> System.out.println("Factory before processedData: " +
>     Config.getInstance().getFactory().getClass());
> DataSet<IData> processedData =
>     this.getEnv().fromCollection(inputData).flatMap(new FlatMapFunction<Integer, IData>() {
>         @Override
>         public void flatMap(Integer integer, Collector<IData> collector) throws Exception {
>             if (integer % 2 == 0) {
>                 collector.collect(Config.getInstance().getFactory().newInstance());
>             }
>         }
>     });
> System.out.println("Factory after processedData: " +
>     Config.getInstance().getFactory().getClass());
> try {
>     System.out.println("Class created: " +
>         processedData.collect().get(0).getClass());
>     this.getDataHolder().setDataList(processedData.collect());
> } catch (Exception e) {
>     e.printStackTrace();
> }


This happens in the "Config -> initData()" function. My Flink job looks
like this:

>   public static void main(String[] args) throws Exception {
> 
>   Config c = Config.getInstance(); //Use AppleFactory by default
> 
> // BOOM: Somehow Flink ignores this?
> c.setFactory(new PearFactory());
> 
> c.initData();
> 
> DataSet<IData> data =
>     c.getEnv().fromCollection(c.getDataHolder().getDataList());

As you can see before the "c.initData()" call I set the factory to a
"PearFactory()" which will produce Pear-objects (also implementing the
IData interface).

Running the code will print the following text:

> Class created: class factorytest.Data.Apple

This, however, means that Flink didn't notice (or ignored?) that the
factory had changed and still creates objects of type Apple.

Instead I'd expect the processedData.collect() list to contain
Pear-objects. What is even more confusing is that the two "Factory
before/after processedData" print statements correctly return the
PearFactory class.

What's the best way to fix this? Any tips/tricks/suggestions?

I guess that this issue might be hard to explain in words, so I'd
really appreciate it if someone could have a look at the code and maybe
do an example run:

Job.java:
https://gitlab.tubit.tu-berlin.de/gehaxelt/flink-factorytest/blob/master/src/main/java/factorytest/Job.java

Config.java:
https://gitlab.tubit.tu-berlin.de/gehaxelt/flink-factorytest/blob/master/src/main/java/factorytest/Config.java

Example run:
https://gitlab.tubit.tu-berlin.de/gehaxelt/flink-factorytest/blob/master/EXAMPLE_RUN_OUTPUT.txt

Kind regards,
Sebastian Neef


Re: Read Apache Kylin from Apache Flink

2016-10-19 Thread Maximilian Michels
Thanks for the guide, Alberto!

-Max


On Tue, Oct 18, 2016 at 10:20 PM, Till Rohrmann  wrote:
> Great to see Alberto. Thanks for sharing it with the community :-)
>
> Cheers,
> Till
>
> On Tue, Oct 18, 2016 at 7:40 PM, Alberto Ramón 
> wrote:
>>
>> Hello
>>
>> I made a small contribution / manual about:
>> "How-to Read  Apache Kylin data from  Apache Flink With Scala"
>>
>>
>>
>>
>> For any suggestions, feel free to contact me
>>
>> Thanks, Alberto
>
>


Re: First Program with WordCount - Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/flink/api/common/functions/FlatMapFunction

2016-10-19 Thread Maximilian Michels
This usually happens when you enable the 'build-jar' profile from
within IntelliJ. This profile assumes you have a Flink installation in
the class path which is only true if you submit the job to an existing
Flink cluster.

-Max


On Mon, Oct 17, 2016 at 10:50 AM, Stefan Richter
 wrote:
> Hi,
>
> looks like there is no Flink jar in the classpath with which you run your
> program. You need to make sure that the relevant jars are there or else
> your program cannot find Flink’s classes, leading to a
> ClassNotFoundException.
>
> Best,
> Stefan
>
> Am 16.10.2016 um 19:26 schrieb Kaepke, Marc :
>
> Hi guys,
>
> I followed this guide
> (https://ci.apache.org/projects/flink/flink-docs-release-1.2/quickstart/java_api_quickstart.html),
> but I get an Exception if I run WordCount
>
> /usr/lib/jvm/java-8-oracle/bin/java -Didea.launcher.port=7536
> -Didea.launcher.bin.path=/home/marc/Programs/idea-IC-162.2032.8/bin
> -Dfile.encoding=UTF-8 -classpath
> "/usr/lib/jvm/java-8-oracle/jre/lib/charsets.jar:/usr/lib/jvm/java-8-oracle/jre/lib/deploy.jar:/usr/lib/jvm/java-8-oracle/jre/lib/ext/cldrdata.jar:/usr/lib/jvm/java-8-oracle/jre/lib/ext/dnsns.jar:/usr/lib/jvm/java-8-oracle/jre/lib/ext/jaccess.jar:/usr/lib/jvm/java-8-oracle/jre/lib/ext/jfxrt.jar:/usr/lib/jvm/java-8-oracle/jre/lib/ext/localedata.jar:/usr/lib/jvm/java-8-oracle/jre/lib/ext/nashorn.jar:/usr/lib/jvm/java-8-oracle/jre/lib/ext/sunec.jar:/usr/lib/jvm/java-8-oracle/jre/lib/ext/sunjce_provider.jar:/usr/lib/jvm/java-8-oracle/jre/lib/ext/sunpkcs11.jar:/usr/lib/jvm/java-8-oracle/jre/lib/ext/zipfs.jar:/usr/lib/jvm/java-8-oracle/jre/lib/javaws.jar:/usr/lib/jvm/java-8-oracle/jre/lib/jce.jar:/usr/lib/jvm/java-8-oracle/jre/lib/jfr.jar:/usr/lib/jvm/java-8-oracle/jre/lib/jfxswt.jar:/usr/lib/jvm/java-8-oracle/jre/lib/jsse.jar:/usr/lib/jvm/java-8-oracle/jre/lib/management-agent.jar:/usr/lib/jvm/java-8-oracle/jre/lib/plugin.jar:/usr/lib/jvm/java-8-oracle/jre/lib/resources.jar:/usr/lib/jvm/java-8-oracle/jre/lib/rt.jar:/home/marc/apache
> flink/flink.gelly/target/classes:/home/marc/Programs/idea-IC-162.2032.8/lib/idea_rt.jar"
> com.intellij.rt.execution.application.AppMain
> haw.bachelor.flink.gelly.WordCount
> Exception in thread "main" java.lang.NoClassDefFoundError:
> org/apache/flink/api/common/functions/FlatMapFunction
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:264)
> at com.intellij.rt.execution.application.AppMain.main(AppMain.java:123)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.flink.api.common.functions.FlatMapFunction
> at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
> at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
> ... 3 more
>
>
> my environment:
>
> Ubuntu 14.04 LTS
> Oracle Java 8
> Maven 3.0.5
> IntelliJ Community Edition
>
>
>
> thanks for help
> Marc
>
>


Re: native snappy library not available

2016-10-19 Thread Till Rohrmann
Hi Robert,

have you tried putting the snappy java jar in Flink's lib folder? When
specifying the classpath manually you have to make sure that all
distributed components are also started with this classpath.

Cheers,
Till

On Wed, Oct 19, 2016 at 1:07 PM,  wrote:

> I have flink running on a standalone cluster, but with a Hadoop cluster
> available to it.  I’m using a RollingSink and a SequenceFileWriter, and
> I’ve set compression to Snappy:
>
>
>
> .setWriter(new SequenceFileWriter Text>("org.apache.hadoop.io.compress.SnappyCodec",
> SequenceFile.CompressionType.BLOCK))
>
>
>
>
>
> However, when I try to run this job, I get an error indicating that the
> native snappy library is not available:
>
>
>
> Caused by: java.lang.RuntimeException: native snappy library not
> available: this version of libhadoop was built without snappy support.
>
> at org.apache.hadoop.io.compress.SnappyCodec.
> checkNativeCodeLoaded(SnappyCodec.java:65)
>
>
>
> I’ve tried adding the snappy java jar to the classpath.  I’ve also tried
> adding the native dir to the classpath.  Neither have helped.
>
> -C file://usr/hdp/current/hadoop-client/lib/snappy-java-1.0.4.1.jar
>
> -C file://usr/hdp/current/hadoop/lib/native/
>
>
>
> Any thoughts on how I can get this to work?
>
>
>
> Thanks!
>
>
>


Re: Issue while restarting from SavePoint

2016-10-19 Thread Ufuk Celebi
Hey Anirudh!

As you say, this looks like two issues:

(1) In detached mode the configuration seems to be not picked up
correctly. That should be independent of the savepoints. Can you
confirm this?

(2) The program was changed in a non-compatible way after the
savepoint. Did you change the program and if yes in which way?

– Ufuk


On Wed, Oct 19, 2016 at 12:01 AM, Anirudh Mallem
 wrote:
> Hi,
> The issue seems to be connected with trying to restart the job in the
> detached mode. The stack trace is as follows:
>
> -bash-3.2$ bin/flink run -d -s jobmanager://savepoints/1 -c
> com.tfs.rtdp.precompute.Flink.FlinkTest
> /tmp/flink-web-upload-2155906f-be54-47f3-b9f7-7f6f0f54f74b/448724f9-f69f-455f-99b9-c57289657e29_uber-flink-test-1.0-SNAPSHOT.jar
> file:///home/amallem/capitalone.properties
> Cluster configuration: Standalone cluster with JobManager at
> /10.64.119.90:33167
> Using address 10.64.119.90:33167 to connect to JobManager.
> JobManager web interface address http://10.64.119.90:8081
> Starting execution of program
> Submitting Job with JobID: 6c5596772627d3c9366deaa0c47ab0ad. Returning after
> job submission.
>
> 
>  The program finished with the following exception:
>
> org.apache.flink.client.program.ProgramInvocationException: The program
> execution failed: JobManager did not respond within 6 milliseconds
> at
> org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:432)
> at
> org.apache.flink.client.program.StandaloneClusterClient.submitJob(StandaloneClusterClient.java:93)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:378)
> at
> org.apache.flink.client.program.DetachedEnvironment.finalizeExecute(DetachedEnvironment.java:75)
> at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:323)
> at org.apache.flink.client.CliFrontend.executeProgram(CliFrontend.java:777)
> at org.apache.flink.client.CliFrontend.run(CliFrontend.java:253)
> at
> org.apache.flink.client.CliFrontend.parseParameters(CliFrontend.java:1005)
> at org.apache.flink.client.CliFrontend.main(CliFrontend.java:1048)
> Caused by: org.apache.flink.runtime.client.JobTimeoutException: JobManager
> did not respond within 6 milliseconds
> at
> org.apache.flink.runtime.client.JobClient.submitJobDetached(JobClient.java:227)
> at
> org.apache.flink.client.program.ClusterClient.runDetached(ClusterClient.java:429)
> ... 8 more
> Caused by: java.util.concurrent.TimeoutException: Futures timed out after
> [6 milliseconds]
> at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
> at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
> at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:107)
> at
> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
> at scala.concurrent.Await$.result(package.scala:107)
> at scala.concurrent.Await.result(package.scala)
> at
> org.apache.flink.runtime.client.JobClient.submitJobDetached(JobClient.java:223)
> ... 9 more
>
> When running it without the detached mode, the job manager is responding but
> the job is failing as it thinks that the structure of the job is modified.
>
> -bash-3.2$ bin/flink run -s jobmanager://savepoints/1 -c
> com.tfs.rtdp.precompute.Flink.FlinkTest
> /tmp/flink-web-upload-2155906f-be54-47f3-b9f7-7f6f0f54f74b/448724f9-f69f-455f-99b9-c57289657e29_uber-flink-test-1.0-SNAPSHOT.jar
> file:///home/amallem/capitalone.properties
> Cluster configuration: Standalone cluster with JobManager at
> /10.64.119.90:33167
> Using address 10.64.119.90:33167 to connect to JobManager.
> JobManager web interface address http://10.64.119.90:8081
> Starting execution of program
> Submitting job with JobID: fe7bf69676a5127eecb392e3a0743c6d. Waiting for job
> completion.
> Connected to JobManager at
> Actor[akka.tcp://flink@10.64.119.90:33167/user/jobmanager#-2047134036]
> 10/18/2016 14:56:11 Job execution switched to status FAILING.
> org.apache.flink.runtime.execution.SuppressRestartsException: Unrecoverable
> failure. This suppresses job restarts. Please check the stack trace for the
> root cause.
> at
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply$mcV$sp(JobManager.scala:1305)
> at
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291)
> at
> org.apache.flink.runtime.jobmanager.JobManager$$anonfun$org$apache$flink$runtime$jobmanager$JobManager$$submitJob$1.apply(JobManager.scala:1291)
> at
> scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
> at
> scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
> at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41)
> at
> 

Re: Multiclass classification example

2016-10-19 Thread Theodore Vasiloudis
Hello Kursat,

We don't have a multiclass classifier in FlinkML currently.

Regards,
Theodore

-- 
Sent from a mobile device. May contain autocorrect errors.

On Oct 19, 2016 12:33 AM, "Kürşat Kurt"  wrote:

> Hi;
>
>
> I am trying to learn the FlinkML library.
>
> Where can I find a detailed multiclass classification example?
>