Re: ClassNotFoundException : org.apache.flink.api.common.operators.util.UserCodeObjectWrapper, while trying to run locally

2015-09-07 Thread Stephan Ewen
Effectively, the method with the blob-classloader / parent classloader is exactly intended to do that: Start with the app class path and add some jars in addition, if you need to. On Mon, Sep 7, 2015 at 1:05 PM, Stephan Ewen <se...@apache.org> wrote: > The BlobClassloader use

Re: ClassNotFoundException : org.apache.flink.api.common.operators.util.UserCodeObjectWrapper, while trying to run locally

2015-09-07 Thread Stephan Ewen
The BlobClassloader uses the App classloader as the parent. That should make the classloading go first to the app classpath (TaskManager's classpath) and then to the user-defined code. Is that broken somehow? On Sun, Sep 6, 2015 at 1:40 PM, gkrastev wrote: > Hi,I'm

Re: Configuring UDFs with user defined parameters

2015-09-07 Thread Stephan Ewen
The JobConfig is a system level config. Would be nice to not expose them to the user-level unless necessary. What about using the ExecutionConfig, where you can add shared user-level parameters? On Mon, Sep 7, 2015 at 1:39 PM, Matthias J. Sax wrote: > Thanks for the input. >

Re: Failing Test: KafkaITCase and KafkaProducerITCase

2015-09-07 Thread Stephan Ewen
I have a patch pending that should help with these timeout issues (and null checks)... On Mon, Sep 7, 2015 at 2:41 PM, Matthias J. Sax wrote: > Please lock here: > > https://travis-ci.org/apache/flink/jobs/79086396 > > > Failed tests: > > KafkaITCase>KafkaTestBase.prepare:155

Releasing 0.10.0-milestone1

2015-09-08 Thread Stephan Ewen
Hi all! Some day back we talked about releasing an 0.10.0-milestone1 release. The master has advanced quite a bit (especially due to high-availability code). I cherry picked the important additions to the release-0.10.0-milestone1 branch (fixes and Kafka consumer/producer rework). How about

Re: Releasing 0.10.0-milestone1

2015-09-08 Thread Stephan Ewen
https://issues.apache.org/jira/browse/FLINK-2632 > > > which affects the Web Client's error and results display for jobs. > > > Would be nice to fix it but IMHO it is not critical for the milestone > > > release. > > > > > > On Tue, Sep 8, 201

Re: Streaming KV store abstraction

2015-09-08 Thread Stephan Ewen
@Gyula Can you explain a bit what this KeyValue store would do more then the partitioned key/value state? On Tue, Sep 8, 2015 at 2:49 PM, Gábor Gévay wrote: > Hello, > > As for use cases, in my old job at Ericsson we were building a > streaming system that was processing data

Re: Driver Test Base issue

2015-08-25 Thread Stephan Ewen
Test for a fix is pending... On Tue, Aug 25, 2015 at 1:30 PM, Stephan Ewen se...@apache.org wrote: Looking into this. Seems like a wrong use of assertions in a parallel thread, where errors are not propagated to the main JUNit thread. On Mon, Aug 24, 2015 at 6:53 PM, Sachin Goel sachingoel0

Re: look for help about task hooks of storm-compatibility

2015-09-09 Thread Stephan Ewen
Hi! I think there are four ways to do this: 1) Create a new abstract class "SerializableBaseTaskHook" that extends BaseTaskHook and implements java.io.Serializabe. Then write the object into bytes and put it into the config. 2) Offer in the StormCompatibilityAPI a method "public void addHook(X

Re: Build failure with maven-junction-plugin

2015-09-10 Thread Stephan Ewen
Did you do a "clean" together with the "install"? Then it should work. The problem occurred when you switch between versions where the link is set (0.9+) and versions prior to the link (< 0.9) ... Stephan On Thu, Sep 10, 2015 at 1:37 PM, Matthias J. Sax wrote: > I could

Re: Java 8 JDK Issue

2015-09-16 Thread Stephan Ewen
Do you know between what classes the exception is? What is the original class and what the cast target class? Are they both the same, and not castable because of different copies being loaded from different classloaders, or are they really different types? On Wed, Sep 16, 2015 at 10:36 AM,

Re: Configuring UDFs with user defined parameters

2015-09-15 Thread Stephan Ewen
w could such a configuration be set? > >> > >> StreamExecutionEnvironment only offerst ".getConfig()" > >> > >> -Matthias > >> > >> On 09/07/2015 03:05 PM, Stephan Ewen wrote: > >>> The JobConfig is a system level config. Would be nice

Re: Streaming KV store abstraction

2015-09-15 Thread Stephan Ewen
t; > > > > pop) > > > > > > triplets. The second stream is a pair of cities (city,city) and > you > > > are > > > > > > interested in the difference of the temperature. > > > > > > > > > > > > For e

Re: Java 8 JDK Issue

2015-09-16 Thread Stephan Ewen
t; > > > On 09/16/2015 12:56 PM, Matthias J. Sax wrote: > > Not right now... The ClassCastException occurs deeply inside Storm. I > > guess it is a Storm issue. > > > > I just instantiate a (Storm)LocalCluster and it fails internally. > > > >> LocalClus

Tests - Unit Tests versus Integration Tests

2015-09-17 Thread Stephan Ewen
Hi all! The build time of Flink with all tests is nearing 1h on Travis for the shortest run. It is good that we do excessive testing, there are many mechanisms that need that. I have also seen that a lot of fixes that could be tested in a UnitTest style are actually tested as a full Flink

Re: Releasing 0.10.0-milestone1

2015-09-11 Thread Stephan Ewen
e waiting for a fix of FLINK-2637 (TypeInfo equals + hashcode). > Once this is in, we do the first release candidate, OK? > > 2015-09-11 14:48 GMT+02:00 Stephan Ewen <se...@apache.org>: > > > Fixes come up on a daily base. I think at some point we need to declare > > freez

Re: Releasing 0.10.0-milestone1

2015-09-11 Thread Stephan Ewen
>>>>> the class loader of submitted jobs) is unassigned. > >>>>> Anybody around to pick this one up? > >>>>> > >>>>> Cheers, Fabian > >>>>> > >>>>> 2015-09-09 12:00 GMT+02:00 Till Rohrmann <trohrm..

Re: Add a module for "manual" tests

2015-09-29 Thread Stephan Ewen
even though it’s not regularly executed. > > ​ > > > > On Tue, Sep 29, 2015 at 12:34 PM, Stephan Ewen <se...@apache.org> wrote: > > > > > Hi all! > > > > > > We have by now quite some tests that are not really suitable for > > executio

Re: Release Flink 0.10

2015-09-29 Thread Stephan Ewen
+1 here as well On Tue, Sep 29, 2015 at 12:03 PM, Fabian Hueske wrote: > +1 for moving directly to 0.10. > > 2015-09-29 11:40 GMT+02:00 Maximilian Michels : > > > Hi Kostas, > > > > I think it makes sense to cancel the proposed 0.10-milestone release. > > We

Re: Pulling Streaming out of staging and project restructure

2015-10-01 Thread Stephan Ewen
; > >> together and generally flattening the structure are very reasonable to > > me. > > >> > > >> You have listed flink-streaming-examples under > > flink-streaming-connectors > > >> and left out some less prominent maven modules, but

Pulling Streaming out of staging and project restructure

2015-10-01 Thread Stephan Ewen
Hi all! We are making good headway with reworking the last parts of the Window API. After that, the streaming API should be good to be pulled out of staging. Since we are reorganizing the projects as part of that, I would shift a bit more to bring things a bit more up to date. In this

Re: Pulling Streaming out of staging and project restructure

2015-10-02 Thread Stephan Ewen
modules: > - flink-storm > - flink-storm-examples > > Please let me know if you have any objection about it. > > -Matthias > > On 10/02/2015 10:45 AM, Matthias J. Sax wrote: > > Sure. Will do that. > > > > -Matthias > > > > On 10/02/2015 10

Re: Release Flink 0.10

2015-10-02 Thread Stephan Ewen
I would actually like to remove the old one, but I am okay with keeping it and activating the new one by default On Fri, Oct 2, 2015 at 3:49 PM, Robert Metzger wrote: > The list from Kostas also contained the new JobManager front end. > > Do we want to enable it by default

Re: An update on the DataStream API refactoring WiP

2015-10-02 Thread Stephan Ewen
I added two comments to the pull request that this is based on... On Fri, Oct 2, 2015 at 2:47 PM, Robert Metzger wrote: > I suspect: "- Deletion of "DataSet.forward() and .global()"" is a typo, you > meant DataStream ? > > On Fri, Oct 2, 2015 at 2:44 PM, Kostas Tzoumas

Re: Hash-based aggregation

2015-10-02 Thread Stephan Ewen
I think that roughly, an approach like the compacting hash table is the right one. Go ahead and take a stab at it, if you want, ping us if you run into obstacles. Here are a few thoughts on the hash-aggregator from discussions between Fabian and me: 1) It may be worth to have a specialized

Rethink the "always copy" policy for streaming topologies

2015-10-02 Thread Stephan Ewen
Hi all! Now that we are coming to the next release, I wanted to make sure we finalize the decision on that point, because it would be nice to not break the behavior of system afterwards. Right now, when tasks are chained together, the system copies the elements always between different tasks in

Re: Rethink the "always copy" policy for streaming topologies

2015-10-02 Thread Stephan Ewen
coding guidelines, debugging tools or best > practices. > > > On Fri, Oct 2, 2015 at 5:53 PM, Stephan Ewen <se...@apache.org> wrote: > > > Hi all! > > > > Now that we are coming to the next release, I wanted to make sure we > > finalize the decision on that point, bec

Re: Java type erasure and object reuse

2015-09-18 Thread Stephan Ewen
Good problem... We were thinking for a while to make the input and output type serializers available from the RuntimeContext. That way you could call "T copy = serializer.copy(inValue)". The "copy()" method on the copyable value is actually a good addition nonetheless! On Thu, Sep 17, 2015 at

Re: [jira] [Created] (FLINK-2695) KafkaITCase.testConcurrentProducerConsumerTopology failed on Travis

2015-09-23 Thread Stephan Ewen
The tests use a ZooKeeper mini cluster and multiple Kafka MiniClusters. It appears that these are not very stable in our test setup. Let's see what we can do to improve reliability there. 1) As a first step, I would suggest to reduce the number of concurrent tests to one for this project, as it

Re: Build get stuck at BarrierBufferMassiveRandomTest

2015-09-23 Thread Stephan Ewen
the confusion! I'll wait for your fix then, thanks! > > > > On 21 September 2015 at 22:51, Stephan Ewen <se...@apache.org> wrote: > > > >> I am actually very happy that it is not the > >> "BarrierBufferMassiveRandomTest", that would

Re: Build get stuck at BarrierBufferMassiveRandomTest

2015-09-23 Thread Stephan Ewen
> > > > "GC task thread#1 (ParallelGC)" prio=5 tid=0x7faebb00 nid=0x2103 > > runnable > > > > "GC task thread#2 (ParallelGC)" prio=5 tid=0x7faebb001000 nid=0x2303 > > runnable > > > > "GC task thread#3 (ParallelGC)"

Re: Extending and improving our "How to contribute" page

2015-09-23 Thread Stephan Ewen
Thanks, Fabian for driving this! I agree with your points. Concerning Vasia's comment to not raise the bar too high: That is true, the requirements should be reasonable. We can definitely tag issues as "simple" which means they do not require a design document. That should be more for new

Re: Flink's Checking and uploading JAR files Issue

2015-09-24 Thread Stephan Ewen
I think there is not yet any mechanism, but it would be a good addition, I agree. Between JobManager and TaskManagers, the JARs are cached. The TaskManagers receive hashes of the JARs only, and only load them if they do not already have them. The same mechanism should be used for the Client to

Re: Build get stuck at BarrierBufferMassiveRandomTest

2015-09-23 Thread Stephan Ewen
gt; On 23 Sep 2015, at 16:28, Paris Carbone <par...@kth.se> wrote: > > > > It hangs for me too at the same test when doing "clean verify" > > > >> On 23 Sep 2015, at 16:09, Stephan Ewen <se...@apache.org> wrote: > >> > >> Okay, will lo

Re: [Proposal] Create a separate sub module for benchmark test

2015-09-22 Thread Stephan Ewen
Sounds like a nice idea! Do you want to make this a new maven project as part of the Flink repository, or create a dedicated repository for that? BTW: We are currently not mixing microbenchmarks with test execution. The code for these benchmarks resides in the test scope of the projects (so it

Re: Build get stuck at BarrierBufferMassiveRandomTest

2015-09-21 Thread Stephan Ewen
I am actually very happy that it is not the "BarrierBufferMassiveRandomTest", that would be hell to debug... On Mon, Sep 21, 2015 at 10:51 PM, Stephan Ewen <se...@apache.org> wrote: > Ah, actually it is a different test. I think you got confused by the > sysout log, bec

Re: Build get stuck at BarrierBufferMassiveRandomTest

2015-09-21 Thread Stephan Ewen
tid=0x00007ff9d1805000 nid=0x2503 > runnable > > "GC task thread#4 (ParallelGC)" prio=5 tid=0x7ff9d1805800 nid=0x2703 > runnable > > "GC task thread#5 (ParallelGC)" prio=5 tid=0x7ff9d1806800 nid=0x2903 > runnable > > "GC task thread#

Tests in the streaming API

2015-09-19 Thread Stephan Ewen
Hi! I just saw that all tests in the streaming API are declared as Unit tests, even though the vast majority are integration tests (spawn mini clusters). That leads to problems, because the streaming test mini clusters do not properly clean up after themselves and unit tests reuse JVMs (for

Re: Task Parallelism in a Cluster

2015-12-08 Thread Stephan Ewen
the first element if it allocates a new slot. > >This instance is then appended to the queue again, if it has some > >resources > >(slots) left. > > > >I would assume that you have a shuffle operation involved in your job such > >that it makes sense for the sc

Re: Lack of review on PRs

2015-12-06 Thread Stephan Ewen
Hi Sachin! Thank you for honestly speaking your mind on this issue. I agree that handling ML pull requests is particularly challenging right now for various reasons. There are currently very few active committers that have the background to actually review these pull requests. The whole work

Re: Monitoring backpressure

2015-12-07 Thread Stephan Ewen
I discussed about this quite a bit with other people. It is not totally straightforward. One could try and measure exhaustion of the output buffer pools, but that fluctuates a lot - it would need some work to get a stable metric from that... If you have a profiler that you can attach to the

Re: flink-dist packaging including unshaded classes

2015-12-09 Thread Stephan Ewen
Hi! Did you change anything in the POM files, with respect to Guava, or add another dependency that might transitively pull Guava? Stephan On Tue, Dec 8, 2015 at 9:25 PM, Nick Dimiduk wrote: > Hi there, > > I'm attempting to build locally a flink based on release-0.10.0

Re: Task Parallelism in a Cluster

2015-12-11 Thread Stephan Ewen
r.path.root: /opt/flink-0.10.0 > > I appreciate all the help. > > > Thanks, > Ali > > > On 2015-12-10, 10:16 AM, "Stephan Ewen" <se...@apache.org> wrote: > > >Hi Ali! > > > >Seems like the Google Doc has restricted access, I tells me I have

Re: Apache Tinkerpop & Geode Integration?

2015-12-16 Thread Stephan Ewen
I am not very familiar with Gremlin, but I remember a brainstorming session with Martin Neumann on porting Cypher (the neo4j query language) to Flink. We looked at Cypher queries for filtering and traversing the graph. It looked like it would work well. We remember we could even model recursive

Re: [DISCUSS] Improving State/Timers/Windows

2015-12-14 Thread Stephan Ewen
A lot of this makes sense, but I am not sure about renaming "OperatorState". The other name is nicer, but why make users' life hard just for a name? On Mon, Dec 14, 2015 at 10:46 AM, Maximilian Michels wrote: > Hi Aljoscha, > > Thanks for the informative technical description.

Re: Flink shell in Jupyter

2015-12-17 Thread Stephan Ewen
I think Till has done some advanced Pythin / Flink / Zeppelin integration (to use Python plotting libs) for a talk at some point. @Till: Do you still have the code? Could you share it with Gyula? On Wed, Dec 16, 2015 at 4:22 PM, Gyula Fóra wrote: > Hey Guys, > > Has anyone

Re: [DISCUSS] Time Behavior in Streaming Jobs (Event-time/processing-time)

2015-12-18 Thread Stephan Ewen
I am also in favor of option (2). We could also pass the TimeCharacteristic to for example the "SlidingTimeWindows". Then there is one class, users can explicitly choose the characteristic of choice, and when nothing is specified, the default time characteristic is chosen. On Thu, Dec 17, 2015

Re: Flink and Clojure

2015-12-10 Thread Stephan Ewen
t; >> > > >>> What happens when you follow the packaging examples provided in the > > flink > > >>> quick start archetypes? These have the maven-foo required to package > an > > >>> uberjar suitable for flink submission. Can you try a

Re: Task Parallelism in a Cluster

2015-12-10 Thread Stephan Ewen
ps://drive.google.com/open?id=0B0_jTR8-IvUcMEdjWGFmYXJYS28 > > It looks to me like the distribution is fairly skewed across the nodes, > even though they’re executing the same pipeline. > > Thanks, > Ali > > > On 2015-12-09, 12:36 PM, "Stephan Ewen" <se...@apache

Re: Externalizing the Flink connectors

2015-12-10 Thread Stephan Ewen
I like this a lot. It has multiple advantages: - Obviously more frequent connector updates without being forced to go to a snapshot version - Reduce complexity and build time of the core flink repository We should make sure that for example 0.10.x connectors always work with 0.10.x flink

Re: Diagnosing TaskManager disappearance

2015-12-14 Thread Stephan Ewen
at 3:43 AM, Robert Metzger <rmetz...@apache.org> > > wrote: > > > > > So is the TaskManager JVM still running after the JM detected that the > TM > > > has gone? > > > > > > If not, can you check the kernel log (dmesg) to see whether Linux

Re: Task Parallelism in a Cluster

2015-12-09 Thread Stephan Ewen
readLine()) != > null)) { > ctx.collect(line); > } > } catch(IOException ex) { > LOG.error("error reading from socket", ex); > } > } > >

Re: ClassNotFoundException : org.apache.flink.api.common.operators.util.UserCodeObjectWrapper, while trying to run locally

2015-12-30 Thread Stephan Ewen
I agree, this would be nice to support in Flink. The important parts are on the client side (which may be embedded). The classloaders are used there as part of the de-serialization of messages that contain user-defined types (such as in collect() or in accumulators and in exception reporting).

Re: Release tag for 0.10.1

2016-01-08 Thread Stephan Ewen
Hi Nick! We have not pushed a release tag, but have a frozen release-0.10.1-RC1 branch (https://github.com/apache/flink/tree/release-0.10.1-rc1) A tag would be great, agree! Flink does in its core not depend on Hadoop. The parts that reference Hadoop (HDFS filesystem, YARN, MapReduce

Re: [gelly] Spargel model rework

2016-01-06 Thread Stephan Ewen
might also be a good idea to do some renaming. Currently, we > > > call the Spargel iteration "vertex-centric", which fits better to the > > > Pregel abstraction. I propose we rename the spargel iteration into > > > "scatter-gather" or "signal-co

Re: Add CEP library to Flink

2016-01-08 Thread Stephan Ewen
Looks super cool, Till! Especially the section about the Patterns is great. For the other parts, I was wondering about the overlap with the TableAPI and the SQL efforts. I was thinking that a first version could really focus on the Patterns and make the assumption that they are always applied on

Re: Naive question

2016-01-08 Thread Stephan Ewen
Hi! This looks like a mismatch between the Scala dependency in Flink and Scala in your Eclipse. Make sure you use the same for both. By default, Flink reference Scala 2.10 If your IDE is set up for Scala 2.11, set the Scala version variable in the Flink root pom.xml also to 2.11 Greetings,

Re: Union a data stream with a product of itself

2015-11-25 Thread Stephan Ewen
"stream.union(stream.map(..))" should definitely be possible. Not sure why this is not permitted. "stream.union(stream)" would contain each element twice, so should either give an error or actually union (or duplicate) elements... Stephan On Wed, Nov 25, 2015 at 10:42 AM, Gyula Fóra

Re: [VOTE] Release Apache Flink 0.10.1 (release-0.10.0-rc1)

2015-11-25 Thread Stephan Ewen
lready in > > 0.10.0 and its not super critical bc the JobManager container will be > > killed by YARN after a few minutes. > > > > > > I'll extend the vote until tomorrow Thursday, November 26. > > > > > > On Tue, Nov 24, 2015 at 1:54 PM, Stephan Ewen <se...@apache

Re: [VOTE] Release Apache Flink 0.10.1 (release-0.10.0-rc1)

2015-11-25 Thread Stephan Ewen
:54 PM, Stephan Ewen <se...@apache.org> wrote: > @Gyula: I think it affects users, so should definitely be fixed very soon > (either 0.10.1 or 0.10.2) > > Still checking whether Robert's current version fix solves it now, or > not... > > On Tue, Nov 24, 2015 at 1

Re: The null in Flink

2015-11-26 Thread Stephan Ewen
is NULL, which would change the data layout of Row > object. So any logic that access serialized Row data directly should > updated to sync with new data layout, for example, many methods in > RowComparator. > > > > Reference: > > 1. Nulls: Nothing to worry about: > ht

Re: [DISCUSS] Release Flink 0.10.1 soon

2015-11-18 Thread Stephan Ewen
+1 for a timely 0.10.1 release I would like to add FLINK-2974 - periodic kafka offset committer for case where checkpointing is deactivated On Wed, Nov 18, 2015 at 12:34 PM, Vasiliki Kalavri < vasilikikala...@gmail.com> wrote: > Hey, > > I would also add FLINK-3012 and FLINK-3036 (both pending

Re: Streaming statefull operator with hashmap

2015-11-18 Thread Stephan Ewen
s "InputType" and "MicroModel" on the > >> execution environment. > >>Here you need to do that manually, because they are type erased and > >> Flink cannot auto-register them. > > > > > > Can you point me to an example on how to do this? > &g

Re: Fixing the ExecutionConfig

2015-11-18 Thread Stephan Ewen
ution configs for batch and > streaming? > >> > >> I'm not sure if we need to distinguish between pre-flight and runtime > >> options: From a user's perspective, it doesn't matter. For example the > >> serializer settings are evaluated during pre-flight but they hav

Re: [DISCUSS] Release Flink 0.10.1 soon

2015-11-20 Thread Stephan Ewen
> > > > > (merged since my first email) > > > > > > - FLINK-2989 job cancel button doesn't work on YARN > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Nov 18, 2015 at 2:35 PM, Ufuk C

Re: [DISCUSS] Release Flink 0.10.1 soon

2015-11-20 Thread Stephan Ewen
ushed an update to the Kafka > > > documentation. I can rebase and merge the PR once you give me green > light > > > ;) > > > > > > Till has merged FLINK-3021, so we might be able to have a first RC > today. > > > > > > > > >

Re: ReduceByKeyAndWindow in Flink

2015-11-23 Thread Stephan Ewen
One addition: You can set the system to use "ingestion time", which gives you event time with auto-generated timestamps and watermarks, based on the time that the events are seen in the sources. That way you have the same simplicity as processing time, and you get the window alignment that

Re: how to write dataset in a file?

2015-11-22 Thread Stephan Ewen
You can configure the system to always create a directly (not just on parallelism > 1), see "fs.output.always-create-directory"under https://ci.apache.org/projects/flink/flink-docs-release-0.10/setup/config.html#file-systems The behavior we support right now is pretty much what people coming

Re: [VOTE] Release Apache Flink 0.10.1 (release-0.10.0-rc1)

2015-11-24 Thread Stephan Ewen
Hi Slava! I think the problem with your build is the file handles. It shows in various points: Exception in thread "main" java.lang.InternalError: java.io.FileNotFoundException: /Library/Java/JavaVirtualMachines/jdk1.8.0_60.jdk/Contents/Home/jre/lib/ext/localedata.jar (Too many open files in

Re: Slinding Window Join (without duplicates)

2015-11-24 Thread Stephan Ewen
gt; of the window (ie, there are no "fixed" window boundaries as defined by > a time-slide). > > Not sure how a "session window" can help here... I guess using most > generic window API allows to define slide by one tuple and window size X > seconds. But I don't know

Re: Slinding Window Join (without duplicates)

2015-11-24 Thread Stephan Ewen
I understand Matthias' point. You want to join elements that occur within a time range of each other. In a tumbling window, you have strict boundaries and a pair of elements that arrives such that one element is before the boundary and one after, they will not join. Hence the sliding windows.

Re: withParameters() for Streaming API

2015-11-24 Thread Stephan Ewen
I was also thinking of deprecating that. With that, RichFunctions should change "open(Configuration)" --> "open()". Would be heavily API breaking, so bit hesitant there... On Tue, Nov 24, 2015 at 2:48 PM, Timo Walther wrote: > Thanks for the hint Matthias. > So actually the

Re: Add BigDecimal and BigInteger as types

2015-11-19 Thread Stephan Ewen
fault. > I'm not from the industry but there is a need for that I think. > > > On 18.11.2015 18:21, Stephan Ewen wrote: > >> I agree that they are important. >> >> They are currently generic types and handled by Kryo, which has (AFAIK) >> proper serializers for them.

Re: [DISCUSSION] Type hints versus TypeInfoParser

2015-11-19 Thread Stephan Ewen
t only as an internal feature). > But as I said this just a corner case. > > Timo > > > > On 18.11.2015 18:43, Stephan Ewen wrote: > >> I think the TypeHints case can cover this: >> >> public class MyPojo<T, R> { >> public T fie

Re: Weird test-source issue

2016-01-08 Thread Stephan Ewen
Hmm, strange issue indeed. So, checkpoints are definitely triggered (log message by coordinator to trigger checkpoint) but are not completing? Can you check which is the first checkpoint to complete? Is it Checkpoint 1, or a later one (indicating that checkpoint 1 was somehow subsumed). Can you

Re: [DISCUSS] Git force pushing and deletion of branchs

2016-01-13 Thread Stephan Ewen
+1 Important to protect the master. I think we should also protect Tags. On Wed, Jan 13, 2016 at 11:36 AM, Gyula Fóra wrote: > +1 for protecting the master branch. > > I also don't see any reason why anyone should force push there > > Gyula > > Fabian Hueske

Re: Naive question

2016-01-12 Thread Stephan Ewen
-DskipTests. > > > > Regards > > Ram > > > > -Original Message- > > From: ewenstep...@gmail.com [mailto:ewenstep...@gmail.com] On Behalf Of > Stephan Ewen > > Sent: Tuesday, January 12, 2016 4:10 PM > > To: dev@flink.apache.org > >

Re: Naive question

2016-01-12 Thread Stephan Ewen
ed with this. If I try to change the > version of scala to 2.10 in the IDE then I get lot of compilation issues. > IS there any way to over come this? > > > > Once again thanks a lot and apologies for the naïve question. > > > > Regards > > Ram > > -Ori

Re: New to Apache Flink

2016-01-12 Thread Stephan Ewen
page display in embedded mode, I apologize for the > > confusion in terminology. When I say embedded, I mean running the flink > > jobs through my IDE, with the flink jars in my local classpath. Is there > a > > way to look at the web page in this mode? Without having to dep

Re: [PROPOSAL] Structure the Flink Open Source Development

2016-06-08 Thread Stephan Ewen
them keeping healthy development in podlings. > > > > > > The term "maintainer" kind of being scrutinized in ASF communities, in > > > recent episodes happening in Spark community. > > > > > > - Henry > > > > > > On Wed, Jun

Re: Broadcast data sent increases with # slots per TM

2016-06-09 Thread Stephan Ewen
Till is right. Broadcast joins currently materialize once per slot. Originally, the purely push based runtime was not good enough to handle it differently. By now, we could definitely handle BC Vars differently (only one slot per TM requests). For BC Joins, the hash tables do not coordinate

Re: [PROPOSAL] Structure the Flink Open Source Development

2016-06-09 Thread Stephan Ewen
ve "State Backends" out from > "Runtime". This is also quite complex on it's own. I would of course > volunteer for this and I think Stephan, who is the current proposal for > "Runtime" would also be good. > > On Wed, 8 Jun 2016 at 19:22 Stephan Ewen &

Adding a Histogram Metric

2016-06-10 Thread Stephan Ewen
A recent discussion brought up the point of adding a "histogram" metric type to Flink. This open thread is to gather some more of the requirements for that metric. The most important question is whether users need Flink to offer specific implementations of "Histogram", like for example the "

Re: incremental Checkpointing , Rocksdb HA

2016-06-10 Thread Stephan Ewen
Hi! The incremental checkpointing is still being worked upon. Aljoscha, Till and me have thought through this a lot and have now a pretty good understanding how we want to do it with respect to coordination, savepoints, restore, garbage collecting unneeded checkpoints, etc. We want to put this

Re: Adding a Histogram Metric

2016-06-13 Thread Stephan Ewen
> not an option, we'd also be fine with directly reading and writing from the >> singleton Codahale MetricsRegistry that you start up in each TaskManager. >> >> Thanks, >> Steve >> >> On Fri, Jun 10, 2016 at 7:10 AM, Stephan Ewen <se...@apache.org> wrote:

Re: Side-effects of DataSet::count

2016-05-30 Thread Stephan Ewen
Hi Eron! Yes, the idea is to actually switch all executions to a backtracking scheduling mode. That simultaneously solves both fine grained recovery and lazy execution, where later stages build on prior stages. With all the work around streaming, we have not gotten to this so far, but it is one

Re: Side-effects of DataSet::count

2016-05-31 Thread Stephan Ewen
; this enhancement? Am I correct in understanding that this only applies to > DataSet since streams run indefinitely? > > Thanks, > Greg > > On Mon, May 30, 2016 at 5:49 PM, Stephan Ewen <se...@apache.org> wrote: > > > Hi Eron! > > > > Yes, the idea is to

Re: [ANNOUNCE] Build Issues Solved

2016-05-31 Thread Stephan Ewen
wrote: > >>>>> Thanks for the great work! :-) > >>>>> > >>>>> Regards, > >>>>> Chiwan Park > >>>>> > >>>>>> On May 31, 2016, at 7:47 AM, Flavio Pompermaier < > pomperma...@okkam.it> w

Re: buffering in operators, implementing statistics

2016-05-26 Thread Stephan Ewen
Hi Stavros! I think what Aljoscha wants to say is that the community is a bit hard pressed reviewing new and complex things right now. There are a lot of threads going on already. If you want to work on this, why not make your own GitHub project "Approximate algorithms on Apache Flink" or so?

Re: Junit Issue while testing Kafka Source

2016-05-26 Thread Stephan Ewen
SuccessException. > Any help on this > > Regards, > Vinay Patil > > *+91-800-728-4749* > > On Thu, May 26, 2016 at 3:32 PM, Stephan Ewen <se...@apache.org> wrote: > > > Hi! > > > > On Flink 1.0, there is the "flink-test-utils_2.10" dependency that ha

Re: Junit Issue while testing Kafka Source

2016-05-26 Thread Stephan Ewen
Hi! On Flink 1.0, there is the "flink-test-utils_2.10" dependency that has a some useful things. The "SuccessException" seems a quite common thing - I have seen that in other infinite program tests as well (Google Dataflow / Beam) Another way you can architect tests is to have an element in the

[ANNOUNCE] Build Issues Solved

2016-05-30 Thread Stephan Ewen
Hi all! After a few weeks of terrible build issues, I am happy to announce that the build works again properly, and we actually get meaningful CI results. Here is a story in many acts, from builds deep red to bright green joy. Kudos to Max, who did most of this troubleshooting. This evening, Max

Re: [ANNOUNCE] Build Issues Solved

2016-05-31 Thread Stephan Ewen
/flink/blob/master/flink-libraries/flink-ml/src/test/scala/org/apache/flink/ml/nn/KNNITSuite.scala#L45 > > Regards, > Chiwan Park > > > On May 31, 2016, at 7:09 PM, Stephan Ewen <se...@apache.org> wrote: > > > > Hi Chiwan! > > > > I think the Exe

Re: Collision of task number values for the same task

2016-05-31 Thread Stephan Ewen
It could be that (a) The task failed and was restarted. (b) The program has multiple steps (collect() print()), so that parts of the graph get re-executed. (c) You have two operators with the same name that become tasks with the same name. Do any of those explanations make sense in your

Re: PojoComparator question

2016-05-31 Thread Stephan Ewen
The "compareSerialized" should probably internally always reuse instances, where possible. Since these are never passed into user code or anything, that should be okay to do. On Tue, May 31, 2016 at 11:52 AM, Aljoscha Krettek wrote: > Hi, > I think this is an artifact from

Re: Pulling Streaming out of staging and project restructure

2016-01-11 Thread Stephan Ewen
uot; to just > > "flink-storm" > > > > > would be the cleanest solution. > > > > > > > > > > So in flink-contrib there would be two modules: > > > > > - flink-storm > > > > > - flink-storm-exampl

Dripping the Flink-on-Tez code for Flink 1.0

2016-01-08 Thread Stephan Ewen
Hi all! Currently, Flink has a module to run batch program code on Tez rather than Flink's own distributed execution engine. I would suggest that we drop this code for the next release (1.0) as part of a code consolidation: - There seems little in both the Flink and the Tez community to use

Re: Dripping the Flink-on-Tez code for Flink 1.0

2016-01-10 Thread Stephan Ewen
n Tez never got a lot of user traction. It served well as a > > > >> prototype of "this is possible", but since the core functionality > will > > > be > > > >> subsumed by making Flink on YARN resource elastic, I don't see any > > > reason > &g

Re: Release tag for 0.10.1

2016-01-10 Thread Stephan Ewen
Hadoop versions, I recommend against > publishing Hadoop-specific tarballs on the downloads page; it left me quite > confused, as I'm sure it would other users. > > On Friday, January 8, 2016, Stephan Ewen <se...@apache.org> wrote: > > > Hi Nick! > > > > We

Re: FileNotFoundException thrown by BlobCache when running "mvn test" against flink-runtime 0.10 for Scala 2.11

2016-01-15 Thread Stephan Ewen
Hi! I cannot access the gist. There are some tests that check proper error reporting in that case. Let's make sure there is nothing funky in the sense that this occurs also outside where this is tested for and goes unrecognized by the tests. Can you share more of the log? Thanks, Stephan On

<    1   2   3   4   5   6   7   8   9   10   >