the state size of this program depends on the number of
> unique ids. That might cause problems if the id space grows very fast.
>
> Please let me know if you have questions or if that works ;-)
>
> Cheers, Fabian
>
>
> 2016-09-30 0:32 GMT+02:00 Simone Robutti <simone.robu
Solved. There was probably an error in the way I was testing. I also
simplified the job, and it works now.
2016-09-27 16:01 GMT+02:00 Simone Robutti <simone.robu...@radicalbit.io>:
> Hello,
>
> I'm dealing with an analytical job in streaming and I don't know how to
> w
Hello,
I'm dealing with an analytical job in streaming and I don't know how to
write the last part.
Actually I want to count all the elements in a window that have a given
status, so I keep state as a Map[Status, Long]. This state is updated
starting from tuples containing the oldStatus and the
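For what it's worth, the update step described above can be sketched in plain Scala. The String status type and the (oldStatus, newStatus) tuple shape are assumptions, since the original message is cut off:

```scala
// Sketch, assuming each incoming tuple carries (oldStatus, newStatus):
// decrement the count of the old status and increment the new one.
type Status = String

def updateCounts(counts: Map[Status, Long],
                 change: (Status, Status)): Map[Status, Long] = {
  val (oldStatus, newStatus) = change
  val decremented =
    counts.updated(oldStatus, counts.getOrElse(oldStatus, 0L) - 1)
  decremented.updated(newStatus, decremented.getOrElse(newStatus, 0L) + 1)
}
```

For example, `updateCounts(Map("open" -> 2L), ("open", "closed"))` yields `Map("open" -> 1L, "closed" -> 1L)`. As Fabian notes above, the map grows with the number of distinct statuses, so state size depends on the id/status space.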
Hello,
while running a job on Flink 1.1.2 on a cluster of 3 nodes using the
KafkaProducer010, I encounter this error:
WARN org.apache.flink.runtime.client.JobClientActor -
Discard message LeaderSessionMessage(null,ConnectionTimeout) because the
expected leader session ID
e PMML-Model (which can get quite big).
>
>
>
> The "obvious" optimization would be to initialize and hide the Evaluator
> behind a singleton since it
>
> is thread safe. (Which is what I wanted to avoid in the first place. But
> maybe that is the best solution
>
of that part.
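The singleton approach under discussion can be sketched like this. The `Evaluator` trait, the `loadPmml` helper, and the model path are all placeholders for this sketch, not a real PMML API:

```scala
// Hypothetical evaluator interface standing in for a real PMML library.
trait Evaluator extends Serializable {
  def score(features: Map[String, Double]): Double
}

object EvaluatorHolder {
  // A `lazy val` in a Scala object is initialized once per JVM (i.e. once
  // per TaskManager), so every task reuses the same big model instead of
  // shipping and deserializing it with each operator instance.
  lazy val evaluator: Evaluator = loadPmml("/models/model.pmml")

  // Placeholder loader; a real implementation would parse the PMML file.
  private def loadPmml(path: String): Evaluator =
    new Evaluator { def score(features: Map[String, Double]): Double = 0.0 }
}
```

The trade-off mentioned in the thread still applies: a per-JVM singleton avoids repeated deserialization of a big model, at the cost of hiding the dependency behind global state.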
2016-09-05 15:24 GMT+02:00 Bauss, Julian <julian.ba...@bonprix.net>:
> Hi Simone,
>
>
>
> that sounds promising!
>
> Unfortunately your link leads to a 404 page.
>
>
>
> Best Regards,
>
>
>
> Julian
>
>
>
> *From:* Simone Rob
I'm not sure this is the solution and I don't have the possibility to try
it right now, but you should move the case class "State" definition
outside the abstract class.
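A self-contained illustration of why that helps (a made-up example, not the original poster's code): a case class nested inside a class keeps a hidden reference to its outer instance, so serializing it drags the whole enclosing object along.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Solver is a non-serializable abstract class. A case class defined
// inside it secretly captures `Solver.this`, so Java serialization of a
// State instance tries to serialize the Solver too, and fails.
abstract class Solver {
  case class State(x: Double, iter: Int) // captures Solver.this
  def initial: State = State(0.0, 0)
}

// Defined at the top level: no hidden outer reference, serializes cleanly.
case class FreeState(x: Double, iter: Int)

object SerCheck {
  def serializable(o: AnyRef): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(o)
      true
    } catch { case _: NotSerializableException => false }
}
```

Here `SerCheck.serializable((new Solver {}).initial)` is false while `SerCheck.serializable(FreeState(0.0, 0))` is true, which is essentially the failure Flink reports when it ships the operator to the cluster.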
2016-06-04 17:34 GMT+02:00 Dan Drewes :
>
> Hi,
>
> compiling the code:
>
> def minimize(f: DF,
Hello,
right now Flink's local matrices are rather raw, and for this kind of usage
you should rely on Breeze. If you need to perform operations (slicing, in
this case), Breeze is the better option unless you want to reimplement
everything.
In case you already developed against Flink's matrices,
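For reference, slicing with Breeze looks roughly like this. This is a sketch that assumes the breeze library is on the classpath, so it won't run standalone:

```scala
import breeze.linalg.DenseMatrix

val m = DenseMatrix((1.0, 2.0, 3.0),
                    (4.0, 5.0, 6.0))

val secondColumn = m(::, 1)       // the column DenseVector(2.0, 5.0)
val topLeft = m(0 to 1, 0 to 1)   // the top-left 2x2 block
```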
ven a data set {(1, {1,2}), (2, {2,3})} what's the result of
> your operation? Is the result { ({1,2}, {1,2,3}) } because the 2 is
> contained in both sets?
>
> Cheers,
> Till
>
> On Wed, May 25, 2016 at 10:22 AM, Simone Robutti <
simone.robu...@radicalbit.io> wrote:
Hello,
I'm implementing MinHash for recommendation on Flink. I'm almost done, but
I need an efficient way to merge sets of similar keys together (and later
join these sets of keys with more data).
The actual data structure is of the form DataSet[(Int,Set[Int])] where the
left element of the tuple
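The merge step can be sketched locally in plain Scala (a stand-in for the DataSet job, not Flink code): a tiny union-find groups together all sets that share at least one element, matching the `{1,2,3}` answer to the quoted question above.

```scala
import scala.collection.mutable

// Merge all sets that overlap transitively: {1,2} and {2,3} share the
// element 2, so they collapse into {1,2,3}.
def mergeOverlapping(sets: Seq[Set[Int]]): Set[Set[Int]] = {
  val parent = mutable.Map[Int, Int]()
  def find(x: Int): Int = {
    val p = parent.getOrElseUpdate(x, x)
    if (p == x) x
    else { val root = find(p); parent(x) = root; root } // path compression
  }
  def union(a: Int, b: Int): Unit = parent(find(a)) = find(b)

  // Link every element of a set to that set's first element.
  for (s <- sets if s.nonEmpty; e <- s.tail) union(s.head, e)
  sets.flatten.toSet.groupBy(find).values.toSet
}
```

A distributed version would replace the mutable map with iterative joins, but the grouping logic is the same.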
<vasilikikala...@gmail.com>
> wrote:
>
>> Thanks Simone! I've managed to reproduce the error. I'll try to figure
>> out what's wrong and I'll keep you updated.
>>
>> -Vasia.
>> On May 4, 2016 3:25 PM, "Simone Robutti" <simone.robu...@radicalbit.io>
Model portability and persistence are currently a serious limitation to the
practical use of FlinkML in streaming. If you know what you're doing, you
can write a blunt serializer for your model, write it to a file, and
rebuild the model stream-side from the deserialized information.
I tried it for an SVM
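A blunt version of such a serializer could look like this. Plain Java serialization is used, and `Array[Double]` is an assumption standing in for whatever the model's internal weights actually are:

```scala
import java.io.{FileInputStream, FileOutputStream, ObjectInputStream, ObjectOutputStream}

object ModelIO {
  // Quick-and-dirty persistence for a model's weight vector.
  def save(weights: Array[Double], path: String): Unit = {
    val out = new ObjectOutputStream(new FileOutputStream(path))
    try out.writeObject(weights) finally out.close()
  }

  // Rebuild the weights on the streaming side from the same file.
  def load(path: String): Array[Double] = {
    val in = new ObjectInputStream(new FileInputStream(path))
    try in.readObject().asInstanceOf[Array[Double]] finally in.close()
  }
}
```

Java serialization ties the file format to the class layout, so this is only suitable as a stopgap, which is exactly the caveat raised in the thread.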
To my knowledge, FlinkML does not offer a unified API, and most things can
be used only with Scala DataSets.
2016-05-09 13:31 GMT+02:00 Malte Schwarzer :
> Hi folks,
>
> I tried to get the FlinkML SVM running - but it didn't really work. The
> SVM.fit() method
in a partition. Please note that MapPartition
> operators do not support chaining and come therefore with a certain
> serialization overhead. Whenever possible you should use a MapFunction or
> FlatMapFunction which are a bit more lightweight.
>
> Hope this helps,
> Fabian
>
>
Here is the code:
package org.example
import org.apache.flink.api.scala._
import org.apache.flink.api.table.TableEnvironment
object Job {
def main(args: Array[String]) {
// set up the execution environment
val env = ExecutionEnvironment.getExecutionEnvironment
val tEnv =
d is not straightforward.
> The DataStream API has better support for custom operators.
>
> Can you explain what kind of operator you would like to add?
> Maybe the functionality can be achieved with the existing operators.
>
> Best, Fabian
>
> 2016-05-03 12:54 GMT+02:00 Simone Robutti &
to be touched. You should find the commit IDs in the JIRA issues for
> these extensions.
>
> Cheers, Fabian
>
>
>
>
>
> 2016-04-29 15:32 GMT+02:00 Simone Robutti <simone.robu...@radicalbit.io>:
>
>> Hello,
>>
>> I'm trying to create a custom operat
Hello,
I'm trying to create a custom operator to explore the internals of Flink.
The one I'm working on is rather similar to Union, and I'm trying to mimic
it for now. When I run my job, though, this error arises:
Exception in thread "main" java.lang.IllegalArgumentException: Unknown
Hello everyone,
I'm approaching a rather big and complex integration with an existing piece
of software, and I would like to hear the opinions of more experienced users on
how to tackle a few issues.
This software builds a cloud with its own logic. What I need is to keep
these nodes as instances inside the
Hello,
I would like to know if it's possible to create a Flink Table from an
arbitrary CSV (or any other form of tabular data) without doing type-safe
parsing with explicitly typed classes/POJOs.
To my knowledge this is not possible but I would like to know if I'm
missing something. My
Hello,
last week I hit a problem where my job worked in local mode but could not
be serialized on the cluster. I assume that local mode does not really
serialize all the operators (the problem was with a custom map function),
and I need to enforce this behaviour in local mode or, better, be able
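One way to fail fast in local mode (an assumption about what's needed here, not a built-in Flink feature) is to run the same Java serialization the cluster submission path uses over each user function before executing:

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Pre-flight check: serialize a user function the way cluster submission
// would, so a non-serializable closure blows up locally instead of only
// at deploy time. Throws java.io.NotSerializableException on failure.
def ensureSerializable(f: AnyRef): Unit = {
  val out = new ObjectOutputStream(new ByteArrayOutputStream())
  try out.writeObject(f)
  finally out.close()
}

// A function value capturing only serializable state passes the check.
ensureSerializable((x: Int) => x + 1)
```

Calling this on each custom map function right before `env.execute()` approximates the cluster's serialization behaviour without leaving local mode.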
To my knowledge there is nothing like that. PMML is not supported in any
form and there's no custom saving format yet. If you really need a quick
and dirty solution, it's not that hard to serialize the model into a file.
2016-03-28 17:59 GMT+02:00 Sourigna Phetsarath :
Hello,
we are trying to set up our system to do remote debugging through Intellij.
Flink is running on a yarn long running session. We are launching Flink's
CliFrontend with the following parameters:
> run -m **::48252
/Users//Projects/flink/build-target/examples/batch/WordCount.jar
The error
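A common way to get breakpoints working in such a setup (generic JDWP configuration, an assumption rather than what this thread concluded): open a debug port on the Flink JVMs via `env.java.opts` in flink-conf.yaml, then attach an IntelliJ "Remote" run configuration to it.

```
# flink-conf.yaml on the cluster (applies to the Flink JVMs)
env.java.opts: "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005"
```

In IntelliJ, create a Run Configuration of type "Remote", point it at the container's host and port 5005, and set breakpoints as usual. With a YARN session you still need to find out which node each container landed on.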
Hello,
I'm testing the checkpointing functionality with hdfs as a backend.
From what I can see, it uses different checkpoint files and resumes the
computation from different points, not from the latest available one. This
is unexpected behaviour to me.
I log every second, for every worker, a
all older checkpoints
>
> You can share the complete job manager log file as well if you like.
>
> – Ufuk
>
> On Wed, Mar 16, 2016 at 2:50 PM, Simone Robutti
> <simone.robu...@radicalbit.io> wrote:
> > Hello,
> >
> > I'm testing the checkpointing functi
them? Do you have an idea why they are out of order? Maybe something
> is mixed up in the way we gather the logs and we only think that
> something is wrong because of this.
>
>
> On Wed, Mar 16, 2016 at 6:11 PM, Simone Robutti
> <simone.robu...@radicalbit.io> wrote:
>
The job state will be
> rolled back to an earlier consistent state.
>
> Can you please share the complete job manager logs of your program?
> The most helpful thing will be to have a log for each started job
> manager container. I don't know if that is easily possible.
>
>
d found for returns(java.lang.Class) method.
Should I open an issue?
2016-03-01 21:45 GMT+01:00 Simone Robutti <simone.robu...@radicalbit.io>:
> I tried to simplify it to the bones but I'm actually defining a custom
> MapFunction<java.util.Map<String,Object>,java.util.Map<String,
ljos...@apache.org>:
> Hi,
> what kind of program are you writing? I just wrote a quick example using
> the DataStream API where I’m using Map<String, Tuple2<String, Integer>> as
> the output type of one of my MapFunctions.
>
> Cheers,
> Aljoscha
> > On 01 M
Hello,
the compiler has been raising an error since I added this line to the code
val testData = streamEnv.addSource(new FlinkKafkaConsumer082[String](
  "data-input", new SimpleStringSchema(), kafkaProp))
Here is the error:
Error:scalac: Class
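For context, a minimal end-to-end version of that source might look as follows. This sketch uses the Flink 0.9/0.10-era API and needs the Flink streaming and Kafka connector jars on the classpath; the Kafka properties and topic name are assumptions. Note that `import org.apache.flink.streaming.api.scala._` must be in scope so the implicit TypeInformation for String is found, a frequent cause of compile errors with the Scala API:

```scala
import java.util.Properties

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer082
import org.apache.flink.streaming.util.serialization.SimpleStringSchema

object KafkaJob {
  def main(args: Array[String]): Unit = {
    val streamEnv = StreamExecutionEnvironment.getExecutionEnvironment

    // Assumed connection settings for this sketch.
    val kafkaProp = new Properties()
    kafkaProp.setProperty("zookeeper.connect", "localhost:2181")
    kafkaProp.setProperty("bootstrap.servers", "localhost:9092")
    kafkaProp.setProperty("group.id", "data-input-group")

    val testData = streamEnv.addSource(new FlinkKafkaConsumer082[String](
      "data-input", new SimpleStringSchema(), kafkaProp))

    testData.print()
    streamEnv.execute("kafka source")
  }
}
```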