Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/5096
@aljoscha, I'm having a hard time setting up my git credentials properly on
my work laptop; could you please push this for me?
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/5096
Yes, it fails without it; I will merge this.
---
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/5096
[FLINK-8165] ParameterTool serialization fix
*Thank you very much for contributing to Apache Flink - we are happy that
you want to help us improve Flink. To help the community review your
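For background on the fix above: ParameterTool is essentially a serializable holder of string key/value pairs, and the kind of serialization round-trip the fix is concerned with can be sketched with plain java.io. The `Params` class below is an illustrative stand-in, not Flink's actual ParameterTool:

```java
import java.io.*;
import java.util.HashMap;

public class ParamsDemo {
    // Stand-in for a ParameterTool-like holder. Every field must itself be
    // serializable for shipping the object to task managers to work.
    static class Params implements Serializable {
        final HashMap<String, String> data = new HashMap<>();
    }

    // Serialize and immediately deserialize a Params instance, the same
    // round-trip the runtime performs when distributing job configuration.
    static Params roundTrip(Params p) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(p);
        }
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray()))) {
            return (Params) ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Params p = new Params();
        p.data.put("parallelism", "4");
        // The copy must carry the same entries as the original.
        System.out.println(roundTrip(p).data.get("parallelism"));
    }
}
```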
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/4083#discussion_r123895607
--- Diff:
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/savepoint/SavepointV2.java
---
@@ -168,10 +168,27 @@ public static Savepoint
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2796
I am travelling right now, will try to look into this over the weekend.
This is in no way a blocker for the release I think. :)
---
If your project is set up for it, you can reply to this email
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3901
Thanks @tzulitai, I don't feel very strongly either way; I am just
concerned for other users. I'll leave this decision to you, as I know you are
already flooded with other stuff around the release
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3901
On the other hand, this is potentially causing major data skew or errors for
anyone who is using dynamic topics (they might not even be aware of
it).
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3901
Well, I can understand it, but it will mean that we have to keep running with
a custom build, because there is no way to work around this nicely.
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/1800
@xhook You should look at https://issues.apache.org/jira/browse/FLINK-4266
for the most up-to-date discussion on this topic :)
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3870
Hi,
I am sending this because I will be away for the weekend, so I won't have
time to test further until Monday. I pulled these changes and Stephan's
classloader fix, but I still get some errors
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3856
@aljoscha Yes, that might work, although I prefer this fix compared to
having to mess with the Maven versions on deploy servers.
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3856
This fix seems to work for me as well
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3766
@tzulitai should we try to get this in the release?
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3801
sweet
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3801
@StefanRRichter Do you think we should try to figure out which SST files
have been compacted and exclude them from recovery?
---
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/3766#discussion_r114282783
--- Diff:
flink-connectors/flink-connector-kafka-base/src/main/java/org/apache/flink/streaming/connectors/kafka/FlinkKafkaProducerBase.java
---
@@ -281,6
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/3801#discussion_r114282104
--- Diff:
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java
---
@@ -621,6
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/3801#discussion_r114270508
--- Diff:
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java
---
@@ -621,6
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3801
Hi,
Thanks for the nice effort!
I only skimmed through the changes to get the main idea (I will do a more
thorough review in the next few days), but I have some initial questions :)
1
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3766
I liked the proposed API and I agree that it's probably best to keep the
old behaviour for the deprecated API.
I don't think the Kafka partition info fetching should be a huge problem
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3766
I think this is reasonable, as the current implementation doesn't work for
dynamically created new topics. (We should also deprecate the current one.)
But let's hear what others say :)
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3766
How would this new API map to the current one in terms of backwards
compatibility?
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3766
Hi,
Thanks for the PR!
The first problem I noticed with this approach is that it will not work if
users want to partition dynamically created topics (my use case actually).
We
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/1668#discussion_r107968436
--- Diff:
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamIterationHead.java
---
@@ -119,13 +183,154 @@ protected void
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/1668#discussion_r107968935
--- Diff:
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamIterationHead.java
---
@@ -17,100 +17,164 @@
package
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/1668#discussion_r107969496
--- Diff:
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamIterationHead.java
---
@@ -17,100 +17,164 @@
package
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/1668#discussion_r107967910
--- Diff:
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamIterationHead.java
---
@@ -119,13 +183,154 @@ protected void
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/1668#discussion_r107968567
--- Diff:
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamIterationHead.java
---
@@ -119,13 +183,154 @@ protected void
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/1668#discussion_r107967886
--- Diff:
flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamIterationHead.java
---
@@ -119,13 +183,154 @@ protected void
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3523
Ah, the reason is probably that I didn't change my job jar, and this relies
on changes in the RocksDB backend
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3523
@StefanRRichter It seems to work correctly locally; I am trying to see what
went wrong with my YARN tests, but this shouldn't block you
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3523
It also doesn't seem to work starting from a clean state and then doing a
savepoint redeploy with a changed topology, so maybe I am really screwing
something up
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3523
Hm, it doesn't seem to work on the first try. What I did: I updated the
client with the new jar based on your backport branch, redeployed the job with
a savepoint (to get the new Flink version), took
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3523
There seem to have been some changes in the StreamTask and some tests, so I
couldn't rebase this nicely. Do you have a minute to take a look and maybe push
a branch with the backport, please?
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3523
I'm going to try to cherry-pick this onto 1.2 and run it today
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3523
I could only try the backported version on the topology that caused the
problem initially (that is running 1.2.0)
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3523
The changes look reasonable :)
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3335
I agree with @StephanEwen that people probably manage the directory
permissions directly when configuring the Flink jobs. It would be quite
annoying if the Flink job changed the permissions you set
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3346
So will the same savepoint logic apply to externalized checkpoints? I think
it would be good to have a similar way of restoring from checkpoints and
savepoints from a usability perspective
Github user gyfora closed the pull request at:
https://github.com/apache/flink/pull/3199
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/3199
Thanks for looking into this, Ufuk; I missed the previous discussion.
I guess you can work around this as long as you are running on single YARN
sessions and always generate the flink-conf files
Github user gyfora closed the pull request at:
https://github.com/apache/flink/pull/2129
---
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/3199
[streaming] [checkpoints] Allow job specific external checkpoint dir
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gyfora/flink external
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/1668
I think the PR looks pretty good, and it sounds fair to address
termination in a later PR, as this will still greatly improve the current
guarantees without making the backpressure/termination
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/2990
Backport [FLINK-5071]
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gyfora/flink FLINK-5071
Alternatively you can review and apply
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2839
looks good +1
---
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/2796
[FLINK-4873] Configurable ship path for YARN cluster resources
This PR makes the resource shipping directory configurable for YARN
clusters by adding a new config option: `yarn.resource.ship.path`
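Assuming the option behaves like other Flink YARN settings, a flink-conf.yaml entry might look like this (the HDFS path is purely illustrative):

```
yarn.resource.ship.path: hdfs:///user/flink/ship-files
```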
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2796
We can also easily backport this to 1.1.*
We are actually using this change in production there.
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2509
@tzulitai makes sense! As for the Map<Int, Long>, you are right; the
multiple-topic case slipped my mind :)
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2509
Hi,
I like the proposed changes; do you think it would make sense to add the
possibility to set specific offsets on a per-partition basis?
```
kafka.setStartOffsets(Map<Inte
```
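The truncated snippet above can be modeled with a plain map from (topic, partition) to starting offset. The names below (`setStartOffset`, `startOffsetFor`) are a hypothetical sketch of the idea, not the consumer API that Flink eventually shipped:

```java
import java.util.HashMap;
import java.util.Map;

public class StartOffsetsDemo {
    // Hypothetical holder for per-partition start offsets, keyed by
    // "topic-partition". A real consumer API would accept such a map.
    static final Map<String, Long> startOffsets = new HashMap<>();

    static void setStartOffset(String topic, int partition, long offset) {
        startOffsets.put(topic + "-" + partition, offset);
    }

    // Where to start reading a given partition; 0 when nothing was configured,
    // i.e. read from the beginning.
    static long startOffsetFor(String topic, int partition) {
        return startOffsets.getOrDefault(topic + "-" + partition, 0L);
    }

    public static void main(String[] args) {
        setStartOffset("events", 0, 42L);
        System.out.println(startOffsetFor("events", 0)); // 42
        System.out.println(startOffsetFor("events", 1)); // 0
    }
}
```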
Github user gyfora closed the pull request at:
https://github.com/apache/flink/pull/2413
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2413
Done, we can close this I guess if you are fixing it in your PR
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2413
Should I push this to 1.1 then?
---
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/2413
[FLINK-4471] Use user code classloader to deserialize state descriptor in
Rocks backend
Thanks for contributing to Apache Flink. Before you open your pull request,
please take the following check
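The technique behind this fix, resolving classes against a user-supplied classloader during Java deserialization, can be sketched with plain java.io. This is a minimal sketch of the standard pattern; Flink's actual backend code uses its own serialization utilities, and the class names below are illustrative:

```java
import java.io.*;

public class UserLoaderDemo {
    // An ObjectInputStream that resolves classes against a caller-supplied
    // classloader. This is how objects whose classes live only in a user-code
    // classloader (e.g. a job jar) can be deserialized by the runtime.
    static class UserLoaderInputStream extends ObjectInputStream {
        private final ClassLoader loader;

        UserLoaderInputStream(InputStream in, ClassLoader loader) throws IOException {
            super(in);
            this.loader = loader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            try {
                return Class.forName(desc.getName(), false, loader);
            } catch (ClassNotFoundException e) {
                return super.resolveClass(desc); // fall back for JDK classes
            }
        }
    }

    // Serialize obj, then deserialize it resolving classes via the given loader.
    static Object roundTrip(Object obj, ClassLoader loader) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
        }
        try (ObjectInputStream ois = new UserLoaderInputStream(
                new ByteArrayInputStream(bos.toByteArray()), loader)) {
            return ois.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Object copy = roundTrip("state-descriptor",
                Thread.currentThread().getContextClassLoader());
        System.out.println(copy);
    }
}
```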
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/2399
[FLINK-4441] Make RocksDB backend return null on empty state
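The gist of this fix, that a key with no stored bytes should surface as null rather than a stale or default value, can be sketched with a HashMap standing in for RocksDB (names are illustrative, not Flink's actual backend code):

```java
import java.util.HashMap;
import java.util.Map;

public class EmptyStateDemo {
    // Stand-in for the RocksDB instance: serialized key -> serialized value.
    static final Map<String, byte[]> db = new HashMap<>();

    // Return null when nothing is stored for the key, instead of trying to
    // deserialize a non-existent value.
    static String getValue(String key) {
        byte[] bytes = db.get(key);
        if (bytes == null) {
            return null; // empty state surfaces as null
        }
        return new String(bytes);
    }

    public static void main(String[] args) {
        db.put("a", "1".getBytes());
        System.out.println(getValue("a")); // 1
        System.out.println(getValue("b")); // null
    }
}
```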
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2345
I can see the merits of both checkpointing approaches but Stephan is right
in the sense that allowing semi-async snapshots with dynamic scaling would need
a completely new implementation and would
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2345
Hi,
Isn't this way of checkpointing much slower than the semi-async
version?
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2345
But I wonder what would happen in a scenario with a lot of state:
Semi-async: short local copy time at every snapshot + very fast restore
Fully async: no copy time + very slow restore
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2345
A good thing about the way fully async checkpoints are restored, though, is
that it is very easy to insert some state adaptor code :)
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2345
But you are right, it is probably more important to keep the latency down
for running programs, and for that the fully async approach seems strictly
better
---
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2345
Some of the benefits are lost on restore; especially for very large state
this can be pretty serious.
Maybe this is required for the sharding to some extent, but I don't see
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2129
Thanks Stephan,
I agree that there is a lot to think about/improve when it comes to
scheduling and dynamic scaling. Should we add this to the Key groups design
doc, or is there going
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2168
I think it is often the case that the timeout is caused by some failure.
For instance, the savepoint restore fails due to something, and then you get
this timeout in the command line.
Maybe it's
Github user gyfora commented on the issue:
https://github.com/apache/flink/pull/2143
Looks good, this addresses the issue
---
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/2129
[FLINK-1003] [WIP] Spread out scheduling of tasks
This is a work-in-progress PR with the core functionality implemented but
no tests yet.
As this is a highly critical part of the system I
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/2051#issuecomment-222627001
You are probably right that I am a little biased regarding (1), sorry :)
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/2051#issuecomment-222606282
I will gradually add some questions/comments as I go :)
1. Do we really need a QueryableStateStream exposed in the API? As you said
this is just a pretty basic
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/2051#issuecomment-222600015
Thanks for the explanation. I will start testing/reviewing this later
today.
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/2051#issuecomment-222548679
Awesome feature Ufuk, I am very excited to try this out and give some
feedback :)
So if I understand correctly in order to query the state I can use
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1988#issuecomment-21205
So if I understand correctly, there is currently no way to scale jobs with
non-partitioned state. This also means that window operations (that register
timers
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1993#issuecomment-219704691
This is a Jira issue that I opened a while back and forgot about :) The
issue was not really that the Slot name was not well formatted etc.; it's more
that this type
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1988#issuecomment-219064791
Very cool stuff! I was wondering, did you do any benchmarks on the
performance impact of this change? For instance, it would be good to know how
well RocksDB behaves
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1957#issuecomment-216354689
Is that a problem? Maybe we could do some periodic garbage collection on
the empty column families.
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1957#issuecomment-216257051
Hi,
So just a quick question regarding the namespace dropping in RocksDB. I
thought you said it would be possible to do this by using prefixes in RocksDB.
Are there some
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1957#issuecomment-216259593
The other possibility would be to store them in different column families.
Not sure about the performance there though
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1918#issuecomment-212834630
Ahh, the actual classes were not public either; I'm so stupid to have missed
that, lol :D
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1919#issuecomment-212827419
We confirmed that this works in our production environment
---
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/1918#discussion_r60549764
--- Diff:
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBValueState.java
---
@@ -64,7 +64,7
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1919#issuecomment-212816048
I need about an hour :) but I'm working on it
---
Github user gyfora commented on a diff in the pull request:
https://github.com/apache/flink/pull/1918#discussion_r60544472
--- Diff:
flink-contrib/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBValueState.java
---
@@ -64,7 +64,7
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/1919
[FLINK-3790] [streaming] Use proper hadoop config in rolling sink
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gyfora/flink rolling-conf
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/1918
[FLINK-3798] [streaming] Clean up RocksDB backend field/method access
The RocksDB state backend uses a lot of package-private methods and fields,
which makes it very hard to subclass the different parts
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1831#issuecomment-200907475
@StephanEwen
This PR does not change the behaviour of any existing Flink applications.
It does now allow, though, that users specify the key of only one input
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1831#issuecomment-200886203
I think you can go ahead merging this if no-one has any objections :)
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1832#issuecomment-200354048
This is a very useful feature :)
The code looks good and straightforward :+1:
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1831#issuecomment-200339097
Thanks Aljoscha, this seems to work :+1:
---
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/1800
[FLINK-3620] [streaming] Remove DbStateBackend
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gyfora/flink remove-dbbackend
Alternatively you
Github user gyfora closed the pull request at:
https://github.com/apache/flink/pull/1627
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1759#issuecomment-191821834
I tested it in my application where I was previously having both issues.
Now it works perfectly, thanks!
+1 from me :+1:
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1668#issuecomment-187827823
I can't really add anything to the timeout question :D
As for the snapshotting, I would go with the ListState, as that potentially
provides very efficient
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1668#issuecomment-186142276
Thanks Paris, I like the idea; it's a correct modification of the original
algorithm that makes it much easier to implement on Flink, at the price of
buffering more records
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1623#issuecomment-185260566
We have tested this and it worked correctly. Should we go ahead and merge it?
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1642#issuecomment-184683583
In general the DbStateBackend should not misbehave under the current
assumptions. What Ufuk means is that the DbStateBackend does not make any
assumption about the job
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1638#issuecomment-184196183
Sure
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1638#issuecomment-184195205
It counts the size without HDFS replication; I think that is better.
---
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/1638
[FLINK-3354] Determine correct size for RocksDB snapshots
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/gyfora/flink FLINK-3354
Alternatively
GitHub user gyfora opened a pull request:
https://github.com/apache/flink/pull/1627
[WIP] [streaming] OutOfCoreKvState added + major improvements to
DbStateBackend
This PR has 2 main parts:
1. A new abstract class for encapsulating functionality for OutOfCore lazy
states
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1608#issuecomment-181800676
Looks good :+1:
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1604#issuecomment-181503516
Looks good :) Thanks Stephan!
---
Github user gyfora commented on the pull request:
https://github.com/apache/flink/pull/1562#issuecomment-177611308
Hey,
First of all, great work; I am looking forward to having this in :)
I think it would be good if we added default implementations of the List,
Reducing