Hi Neelabh,
You can use the FileSystem connector: for DataStream see [1], for the Table
API see [2].
You also need to add the necessary dependency to your Flink environment. [3]
For the Flink SQL setup, you can refer to the `sql getting started module` [4].
[1] https://nightlies.apache.org/flink/flink-docs-master/docs/c
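For a quick start, here is a minimal DataStream sketch of reading text files
with the FileSystem connector. This assumes Flink 1.15+ with the
flink-connector-files dependency on the classpath; the input path is a
placeholder:

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.connector.file.src.reader.TextLineInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class FileSystemConnectorExample {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Build a bounded file source that reads the input line by line.
        FileSource<String> source = FileSource
                .forRecordStreamFormat(new TextLineInputFormat(),
                        new Path("/path/to/input"))  // placeholder path
                .build();

        DataStream<String> lines = env.fromSource(
                source, WatermarkStrategy.noWatermarks(), "file-source");

        lines.print();
        env.execute("FileSystem connector example");
    }
}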
Hello everyone,
Answering my own question, it turns out that Flink Table Store removes the
normalization node on read from an external log system only if
log.changelog-mode='all' and log.consistency='transactional' [1].
[1] https://github.com/apache/flink-table-store/blob/7e0d55ff3dc9fd48455b17d
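For illustration, a hedged sketch of declaring the two cited options on a
Table Store table: only 'log.changelog-mode' and 'log.consistency' come from
[1]; the table name, schema, and the rest of the setup (Table Store catalog,
log system configuration) are placeholders and omitted here:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class TableStoreLogOptionsSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());

        // Only the two 'log.*' options are from the discussion above;
        // everything else in this DDL is a placeholder.
        tEnv.executeSql(
                "CREATE TABLE uv_events ("
                + "  user_id BIGINT,"
                + "  dt STRING,"
                + "  PRIMARY KEY (dt, user_id) NOT ENFORCED"
                + ") WITH ("
                + "  'log.changelog-mode' = 'all',"
                + "  'log.consistency' = 'transactional'"
                + ")");
    }
}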
Forwarding shimin's mail to Aitozi.
Aitozi,
We are using HyperLogLog to count daily UV, but it only provides an
approximate value. I also tried count distinct in the Flink Table API
without a window, but that requires setting the state retention time.
However, the time resolution of this operator is 1 millisecond, so it
To Aitozi.
Cheers
Minglei
> On 27 Jun 2018, at 5:46 PM, shimin yang wrote:
>
> Aitozi
>
> We are using HyperLogLog to count daily UV, but it only provides an
> approximate value. I also tried count distinct in the Flink Table API
> without a window, but that requires setting the state retention time.
>
> However, the time
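To make the exact-count alternative from the UV thread concrete, a hedged
sketch against the Flink 1.5-era Table API (table and field names are made
up; the retention values are examples only):

import org.apache.flink.api.common.time.Time;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.StreamQueryConfig;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class DailyUvSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tableEnv =
                TableEnvironment.getTableEnvironment(env);

        // Toy input: (date, userId) visit events.
        DataStream<Tuple2<String, Long>> visits = env.fromElements(
                Tuple2.of("2018-06-27", 1L),
                Tuple2.of("2018-06-27", 1L),
                Tuple2.of("2018-06-27", 2L));
        tableEnv.registerDataStream("visits", visits, "dt, user_id");

        // Exact distinct count per day. State for an idle day is kept at
        // least 25 h and at most 36 h, then cleaned up automatically.
        StreamQueryConfig qConfig = tableEnv.queryConfig()
                .withIdleStateRetentionTime(Time.hours(25), Time.hours(36));
        Table dailyUv = tableEnv.sqlQuery(
                "SELECT dt, COUNT(DISTINCT user_id) AS uv FROM visits GROUP BY dt");

        tableEnv.toRetractStream(dailyUv, Row.class, qConfig).print();
        env.execute();
    }
}

Note that the query config has to be passed when converting the table back to
a stream, otherwise the retention setting has no effect.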
Hi Aparup,
the slow-down can have multiple reasons. One reason could be that your
computation in Timeseries-Analytics becomes more complex over time and
therefore slows down, resulting in back pressure at the sources. This
could be, for example, caused by accumulating a large state. Here the
que
I cannot answer that for sure since graph streams are still a research topic.
It depends on the demand and how fast graph stream representations and
operations will become adopted.
If there is high demand on Flink we can definitely start a FLIP at some point
but for now it makes sense to see how
Hi Paris,
Thanks for the reply. Any idea when Gelly-Stream will become part of the
official Flink distribution?
Regards,
Ameet
On Fri, Jun 30, 2017 at 8:20 PM, Paris Carbone wrote:
> Hi Ameet,
>
> Flink’s Gelly currently operates on the DataSet model.
> However, we have an experimental project w
Hi Ameet,
Flink’s Gelly currently operates on the DataSet model.
However, we have an experimental project with Vasia (Gelly-Stream) that does
exactly that.
You can check it out and let us know directly what you think:
https://github.com/vasia/gelly-streaming
Paris
On 30 Jun 2017, at 13:17, Ame
If you use RocksDB, you will not run into OutOfMemory errors.
On Wed, Aug 31, 2016 at 6:34 PM, Fabian Hueske wrote:
> Hi Vinaj,
>
> if you use user-defined state, you have to manually clear it.
> Otherwise, it will stay in the state backend (heap or RocksDB) until the
> job goes down (planned or
Hi Vinaj,
if you use user-defined state, you have to manually clear it.
Otherwise, it will stay in the state backend (heap or RocksDB) until the
job goes down (planned or due to an OOM error).
This is esp. important to keep in mind, when using keyed state.
If you have an unbounded, evolving key s
Hi Stephan,
Just wanted to jump into this discussion regarding state.
So do you mean that if we maintain user-defined state (for non-window
operators) and do not clear it explicitly, the data for that key remains
in RocksDB?
What happens in case of checkpoints? I read in the documen
In streaming, memory is mainly needed for state (key/value state). The
exact representation depends on the chosen StateBackend.
State is explicitly released: For windows, state is cleaned up
automatically (firing / expiry), for user-defined state, keys have to be
explicitly cleared (clear() method
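To make the clear() point concrete, a hedged sketch using the current
keyed-state API (the counter logic and the "end" marker are invented for
illustration):

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Counts events per key and releases the key's state on an "end" marker.
public class PerKeyCounter
        extends RichFlatMapFunction<Tuple2<String, String>, Long> {

    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void flatMap(Tuple2<String, String> in, Collector<Long> out)
            throws Exception {
        Long c = count.value();
        c = (c == null) ? 1L : c + 1;
        if ("end".equals(in.f1)) {
            out.collect(c);
            count.clear(); // explicitly release this key's state
        } else {
            count.update(c);
        }
    }
}

The function has to run on a keyed stream, e.g.
stream.keyBy(t -> t.f0).flatMap(new PerKeyCounter()).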
> Sorry for the late answer, I completely missed this email. (Thanks Robert
> for pointing out).
No problem ;)
> Now that you have everything set up, in flatMap1 (for events) you would
> query the state: state.value() and enrich your data
> in flatMap2 you would update the state: state.update(newState)
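A hedged sketch of that flatMap1/flatMap2 pattern (all names hypothetical):
a RichCoFlatMapFunction on two connected, identically keyed streams that
enriches events from state and updates the state from the second input:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.RichCoFlatMapFunction;
import org.apache.flink.util.Collector;

// Input 1: events (key, payload); input 2: reference updates (key, data).
public class EnrichingCoFlatMap extends
        RichCoFlatMapFunction<Tuple2<String, String>, Tuple2<String, String>, String> {

    private transient ValueState<String> reference;

    @Override
    public void open(Configuration parameters) {
        reference = getRuntimeContext().getState(
                new ValueStateDescriptor<>("reference", String.class));
    }

    @Override
    public void flatMap1(Tuple2<String, String> event, Collector<String> out)
            throws Exception {
        // Query the state and enrich the event.
        String ref = reference.value();
        out.collect(event.f1 + " / " + (ref == null ? "n/a" : ref));
    }

    @Override
    public void flatMap2(Tuple2<String, String> update, Collector<String> out)
            throws Exception {
        // Update the state from the reference stream.
        reference.update(update.f1);
    }
}

Both inputs must be keyed on the same field before connecting, e.g.
events.connect(updates).keyBy(e -> e.f0, u -> u.f0).flatMap(new EnrichingCoFlatMap()).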
Hi!
Sorry for the late answer, I completely missed this email. (Thanks Robert
for pointing out).
You won't be able to use that project as it was dependent on an earlier
snapshot version that still had completely different state semantics.
I don't think it is realistic that I will re-implement this
Hi Gyula,
I'm currently looking into ways to enrich streams with external data. Have
you got any update on the topic in general or on StreamKV?
I've checked out the code but it won't build, mainly because
StateCheckpointer has been removed since [FLINK-2808]. Any hint on a quick
replacement, bef
Hey,
I think we came to the agreement that this PR is not mergeable right now,
so I am closing it. I personally find it inconsistent not to have the full
API mirrored in Scala though, but this is something that we can revisit
when preparing 2.0.
Best,
Marton
On Mon, Mar 14, 2016 at 8:14 PM, S
+1 for considering this as a bug.
However, I do realize that API compatibility is extremely important when
you commit to it, even if present behavior is not ideal.
Silly idea (which is only applicable to Scala APIs, unfortunately): I'm
currently working on a set of API extensions to solve FLINK-1
I agree with Aljoscha on this one, because `DataStreamSink` only contains
setters which are compatible with the Scala API.
On Mon, Mar 14, 2016 at 11:02 AM, Aljoscha Krettek
wrote:
> By the way, I don’t think it’s a bug that addSink() returns the Java
> DataStreamSink. Having a Scala specific ve
By the way, I don’t think it’s a bug that addSink() returns the Java
DataStreamSink. Having a Scala specific version of a DataStreamSink would not
add functionality in this place, just code bloat.
> On 14 Mar 2016, at 10:05, Fabian Hueske wrote:
>
> Yes, we will have more of these issues in the
Yes, we will have more of these issues in the future and each issue will
need a separate discussion.
I don't think that clearly unintended errors (I hope we won't find any
intended errors) are a sufficient reason to break a stable API.
IMO, the question that needs to be answered is how much of
Hi,
I think this is an important question that will surely come up in some
cases in the future.
I see your point Robert, that we have promised API compatibility for 1.x.y
releases, but I am not sure that this should cover things that are clearly
just unintended errors in the API from our side.
I
On 13.03.2016 12:14, Robert Metzger wrote:
I think it's too early to fork off a 2.0 branch. I have absolutely no idea
when a 2.0 release becomes relevant, could easily be a year from now.
At first I was going to agree with Robert, but then... I mean, the issue
with not allowing breaking changes
is
This approach means that we will have API breaking changes lingering around
in random people's remote repos that will be untrackable in no time and
people will not be able to build on each others changes. Whoever will have
the pleasure to eventually merge those together will be on the receiving
end
I think it's too early to fork off a 2.0 branch. I have absolutely no idea
when a 2.0 release becomes relevant, could easily be a year from now.
The API stability guarantees don't forbid adding new methods. Maybe we can
find a good way to resolve the issue without changing the signature of
existing
Ok, if that is what we promised let's stick to that.
Then would you suggest to open a release-2.0 branch and merge it there?
On Sun, Mar 13, 2016 at 11:43 AM, Robert Metzger
wrote:
> Hey,
> JIRA was down for quite a while yesterday. Sadly, I don't think we can
> merge the change because it's API
Hey,
JIRA was down for quite a while yesterday. Sadly, I don't think we can
merge the change because it's API-breaking.
One of the promises of the 1.0 release is that we are not breaking any APIs
in the 1.x.y series of Flink. We can fix those issues with a 2.x release.
On Sun, Mar 13, 2016 at 5:27
The JIRA issue is FLINK-3610.
On Sat, Mar 12, 2016 at 8:39 PM, Márton Balassi
wrote:
>
> I have just come across a shortcoming of the streaming Scala API: it
> completely lacks the Scala implementation of the DataStreamSink and
> instead the Java version is used. [1]
>
> I would regard this as a
Hi,
the iteration operators (head and tail) don't have a StreamOperator, they
are pure tasks.
On Sat, Jan 2, 2016, 21:08 Matthias J. Sax wrote:
> Hi,
>
> I am working on FLINK-1870 and my changes break some unit tests. The
> problem is in streaming.api.IterateTest.
>
> I tracked the problem down
For initializing the Map manually, I meant making "null" the default value
and writing the code like

HashMap map = state.value();
if (map == null) {
    map = new HashMap<>();
}

rather than expecting the state to always clone a new empty map for you
On Thu, Nov 12, 2015 at 11:29 AM, Aljoscha Krettek
w
Hi,
you can do it using the register* methods on StreamExecutionEnvironment. So,
for example:
// set up the execution environment
final StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();
env.registerType(InputType.class);
env.registerType(MicroModel.class);
I
Thanks for the help.
TypeExtractor.getForObject(modelMapInit) did the job. It's possible that
it's an IDE problem that .getClass() did not work. IntelliJ is a bit fiddly
with those things.
> 1) Making null the default value and initializing manually is probably more
> efficient, because otherwise the
It should suffice to do something like
"getRuntimeContext().getKeyValueState("microModelMap", new
HashMap().getClass(), null);"
Two more comments:
1) Making null the default value and initializing manually is probably more
efficient, because otherwise the empty map would have to be cloned each
t
Hey,
Yes, what you wrote should work. You can alternatively use
TypeExtractor.getForObject(modelMapInit) to extract the type information.
I also like to implement my own custom type info for HashMaps and the other
types and use that.
Cheers,
Gyula
Martin Neumann wrote (on Wed, 11 Nov 2015
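To illustrate both suggestions, a hedged sketch: the map's generic
parameters are invented, and TypeHint (available from Flink 1.0 on)
preserves the generics that .getClass() loses to erasure:

import java.util.HashMap;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.typeutils.TypeExtractor;

public class MapTypeInfoExample {
    public static void main(String[] args) {
        // Option 1: extract the type info from a sample object, as suggested.
        HashMap<String, Double> modelMapInit = new HashMap<>();
        TypeInformation<HashMap<String, Double>> byObject =
                TypeExtractor.getForObject(modelMapInit);

        // Option 2 (Flink 1.0+): an anonymous TypeHint keeps the generic
        // parameters that plain .getClass() cannot carry.
        TypeInformation<HashMap<String, Double>> byHint =
                TypeInformation.of(new TypeHint<HashMap<String, Double>>() {});

        System.out.println(byObject + " / " + byHint);
    }
}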
Thanks for the update.
On Wed, Oct 14, 2015 at 10:12 AM, Martin Neumann wrote:
> Hej,
>
> I checked the latest Flink trunk version together with Aljoscha and the
> problems are gone by now. (Just to close this discussion thread now)
>
> cheers Martin
>
> On Wed, Oct 7, 2015 at 1:21 PM, Aljoscha Kr
Hej,
I checked the latest Flink trunk version together with Aljoscha and the
problems are gone by now. (Just to close this discussion thread now)
cheers Martin
On Wed, Oct 7, 2015 at 1:21 PM, Aljoscha Krettek
wrote:
> Hi,
> I ran it using the attached TimeShift.java and I didn't get any key
> cr
Hi,
I ran it using the attached TimeShift.java and I didn't get any key
cross-talk. Could you please try my example, or verify that the problem
still persists on your side?
I replaced the source by a source that just creates random strings.
On Tue, 6 Oct 2015 at 09:56 Martin Neumann wrote:
>
The window is actually part of the workaround we are currently using (should
have commented it out) where we use a window and a MapFunction instead of a
Fold.
Originally I was running the fold without a window, facing the same problems.
The workaround works for now so there is no urgency on that one. I just
Hi,
If you are using a fold you are using none of the new code paths. I will
add support for Fold to the new windowing implementation today, though.
Cheers,
Aljoscha
On Mon, 5 Oct 2015 at 23:49 Márton Balassi wrote:
> Martin, I have looked at your code and you are running a fold in a window,
>
Martin, I have looked at your code and you are running a fold in a window,
that is a very important distinction - the code paths are separate.
Those code paths have been recently touched by Aljoscha if I am not
mistaken.
I have mocked up a simple example and could not reproduce your problem
unfort
Thanks, I am checking it out tomorrow morning.
On Mon, Oct 5, 2015 at 9:59 PM, Martin Neumann wrote:
> Hej,
>
> Sorry it took so long to respond; I needed to check whether I was actually
> allowed to share the code, since it uses internal datasets.
>
> In the appendix of this email you will find the ma
Hej,
Sorry it took so long to respond; I needed to check whether I was actually
allowed to share the code, since it uses internal datasets.
In the appendix of this email you will find the main class of this job
without the supporting classes or the actual dataset. If you want to run it
you need to repla
Hey,
Thanks for reporting the problem, Martin. I have not merged the PR Stephan
is referring to yet. [1] There I am cleaning up some of the internals too.
Just out of curiosity, could you share the code for the failing test please?
[1] https://github.com/apache/flink/pull/1155
On Fri, Oct 2, 201
One of my colleagues found it when we were hunting bugs today. We were
using the latest 0.10 version pulled from Maven this morning.
The program we were testing is new code so I can't tell you if the behavior
has changed or if it was always like this.
On Fri, Oct 2, 2015 at 7:46 PM, Stepha
I think these operations were recently moved to the internal state
interface. Did the behavior change then?
@Marton or Gyula, can you comment? Is it by chance not mapped to the
partitioned state?
On Fri, Oct 2, 2015 at 6:37 PM, Martin Neumann wrote:
> Hej,
>
> In one of my Programs I run a Fol
I think that is actually a cool way to kick off an addition to the system.
Gives you a lot of flexibility in releasing and testing...
It helps, though, to upload Maven artifacts for it!
On Tue, Sep 15, 2015 at 7:18 PM, Gyula Fóra wrote:
> Hey All,
>
> We decided to make this a standalone librar
Hey All,
We decided to make this a standalone library until it is stable enough and
then we can decide whether we want to keep it like that or include in the
project:
https://github.com/gyfora/StreamKV
Cheers,
Gyula
Gianmarco De Francisci Morales wrote (on Wed, 9 Sep 2015, 20:25):
Yes, pretty clear. I guess semantically it's still a co-group, but
implemented slightly differently.
Thanks!
--
Gianmarco
On 9 September 2015 at 15:37, Gyula Fóra wrote:
> Hey Gianmarco,
>
> So the implementation is somewhat different:
>
> The update stream is received by a stateful KVStor
Hey Gianmarco,
So the implementation is somewhat different:
The update stream is received by a stateful KVStoreOperator which stores
the K-V pairs as its partitioned state.
The query for the two cities is assigned an ID, yes, and is split on the
two cities, and each of these is sent to the sa
Just a silly question.
For the example you described, in a dataflow model, you would do something
like this:
Have query ids added to the city pairs (qid, city1, city2),
then split the query stream on the two cities and co-group it with the
updates stream ((city1, qid), (city, temp)), same for ci
That's a very nice application of the Stream API and partitioned state. :D
I think we should run some tests on a cluster based on this to see what
kind of throughput the partitioned state system can handle and also how it
behaves with larger numbers of keys. The KVStore is just an interface and
t
@Stephan:
Technically speaking this is really just a partitioned key-value state and
a fancy operator executing special operations on this state.
From the user's perspective, though, this is something hard to implement. If
you want to share state between two streams for instance this way (getting
u
@Gyula
Can you explain a bit what this KeyValue store would do beyond the
partitioned key/value state?
On Tue, Sep 8, 2015 at 2:49 PM, Gábor Gévay wrote:
> Hello,
>
> As for use cases, in my old job at Ericsson we were building a
> streaming system that was processing data from telephone net
Hello,
As for use cases, in my old job at Ericsson we were building a
streaming system that was processing data from telephone networks, and
it was using key-value stores a LOT. For example, keeping track of
various state info of the users (which cell are they currently
connected to, what bearers
Hey,
The JobGraph that is executed cannot have cycles (due to scheduling
reasons), that's why there is no edge between the head and tail operators.
What we do instead is we add an extra source to the head and a sink to the
tail (with the same parallelism), and the feedback data is passed outside
o
The CollectionsEnvironment is a different execution backend for the Batch
API - based entirely on Java Collections. The use case for this backend is
to embed Flink programs into other programs in an extremely lightweight
fashion, without threading overheads, serialization, anything.
The tests are
As far as I remember, when we were implementing the iteration for streaming
we deliberately "went off" the jobgraph and implemented the backward edge
as an in-memory hook (practically the same in batch but not blocking in the
backward channel).
When we asked the reasoning here on the dev list we go