[jira] [Created] (FLINK-20043) Add flink-sql-connector-kinesis package

2020-11-07 Thread Alexander Alexandrov (Jira)
Alexander Alexandrov created FLINK-20043: Summary: Add flink-sql-connector-kinesis package Key: FLINK-20043 URL: https://issues.apache.org/jira/browse/FLINK-20043 Project: Flink

[jira] [Created] (FLINK-20042) Add end-to-end tests for Kinesis Table sources and sinks

2020-11-07 Thread Alexander Alexandrov (Jira)
Alexander Alexandrov created FLINK-20042: Summary: Add end-to-end tests for Kinesis Table sources and sinks Key: FLINK-20042 URL: https://issues.apache.org/jira/browse/FLINK-20042 Project

Re: Flink CSV parsing

2017-03-11 Thread Alexander Alexandrov
FYI, I recently revisited state-of-the-art CSV parsing libraries for Emma. I think this blog post might be useful https://github.com/uniVocity/csv-parsers-comparison The uniVocity parsers library seems to be dominating the benchmarks and is feature complete. As far as I can tell at the moment

Re: Allow TypeInfofactory regisration via ExecutionConfig

2016-10-08 Thread Alexander Alexandrov
/org/apache/flink/api/common/typeinfo/TypeInfoFactory.java On Sat, Oct 8, 2016 at 4:00 PM Alexander Alexandrov < alexander.s.alexand...@gmail.com> wrote: I wanted to open this directly as a JIRA to follow-up on FLINK-3042, however my account (aalexandrov) does not seem to have the nec

Allow TypeInfofactory regisration via ExecutionConfig

2016-10-08 Thread Alexander Alexandrov
I wanted to open this directly as a JIRA to follow-up on FLINK-3042, however my account (aalexandrov) does not seem to have the necessary privileges, so I will post this to the dev list instead. The current approach for registration of custom `TypeInformation` implementations which relies

Re: Broadcast data sent increases with # slots per TM

2016-06-08 Thread Alexander Alexandrov
> As far as I know, the reason why the broadcast variables are implemented that way is that the senders would have to know which sub-tasks are deployed to which TMs. As the broadcast variables are realized as additionally attached "broadcast channels", I am assuming that the same behavior will

Re: Collision of task number values for the same task

2016-05-31 Thread Alexander Alexandrov
hat parts of > the graph get re-executed. > > (c) You have two operators with the same name that become tasks with the > same name. > > Do any of those explanations make sense in your setting? > > Stephan > > > On Tue, May 31, 2016 at 12:48 PM, Alexander Alexandro

Collision of task number values for the same task

2016-05-31 Thread Alexander Alexandrov
Hello, I am analyzing the logs from a Flink batch job and am seeing the following two lines: 2016-05-30 15:32:31,701 INFO ...- DataSource (at ${path}) (4/4) (7efe8fcfe9c7c7e6cd4683e1b5c06a3a) switched from SCHEDULED to DEPLOYING 2016-05-30 15:32:31,701 INFO ...- DataSource (at

Re: [DISCUSS] Macro-benchmarking for performance tuning and regression detection

2016-04-08 Thread Alexander Alexandrov
Hi Greg, I just pushed v1.0.0-rc2 for Peel to Sonatype. As Till said, we are using the framework extensively at the TU for benchmarking and comparing different systems (mostly Flink and Spark). We recently used Peel to conduct some experiments for FLINK-2237. If you want to learn more about the

Re: Release notes for 0.10.0

2015-11-15 Thread Alexander Alexandrov
Is it possible to link to important JIRA-s in the list of new features as you did in the 0.8.0 release notes? For example, I was wondering whether I can find more information about the "Off-heap Managed Memory" model. Regards, Alexander 2015-11-14 20:53 GMT+01:00 Ron Crocker

Re: [DISCUSS] Java code style

2015-11-09 Thread Alexander Alexandrov
I wouldn't stop with GitHub - the main benefit for spaces is that the code looks the same on all viewers because it does not depend on a user-specific parameter (the size of the tab). 2015-11-09 14:02 GMT+01:00 Ufuk Celebi : > Minor thing in favour of spaces: Reviewability on
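A small sketch of the point made above, for illustration only: a tab-indented line changes its visual indent with the viewer's tab size, while a space-indented line does not. The sample lines and the helper names `render` and `indent_width` are hypothetical and were not part of the original thread.

```python
# Illustrative sketch: how the viewer's tab size changes the rendered indent.
# The sample lines and helper names are assumptions made for this example.

def render(source: str, tab_size: int) -> str:
    """Render a line the way a viewer with the given tab size would show it."""
    return source.expandtabs(tab_size)

def indent_width(rendered: str) -> int:
    """Count the leading spaces of an already rendered line."""
    return len(rendered) - len(rendered.lstrip(" "))

tab_indented = "\t\treturn value;"          # two tabs of indentation
space_indented = "        return value;"    # eight spaces of indentation

# Indent seen by readers whose viewers use tab sizes 2, 4 and 8.
widths_tabs = {n: indent_width(render(tab_indented, n)) for n in (2, 4, 8)}
widths_spaces = {n: indent_width(render(space_indented, n)) for n in (2, 4, 8)}

print(widths_tabs)
print(widths_spaces)
```

The tab-indented line ends up at a different column for each tab size, whereas the space-indented line is identical everywhere.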

Re: [DISCUSS] Introducing a review process for pull requests

2015-10-17 Thread Alexander Alexandrov
ked (and if needed amended) by a committer without too much unnecessary discussion and excluded from the "shepherding process". 2015-10-17 12:32 GMT+02:00 Alexander Alexandrov < alexander.s.alexand...@gmail.com>: > One suggestion from me: in GitHub you can make clear who th

Re: [DISCUSS] Introducing a review process for pull requests

2015-10-17 Thread Alexander Alexandrov
One suggestion from me: in GitHub you can make clear who the current shepherd is through the "Assignee" field in the PR (which can and IMHO should be different from the user who actually opened the request). Regards, A. 2015-10-16 15:58 GMT+02:00 Fabian Hueske : > Hi folks, >

Re: Flink 0.9 built with Scala 2.11

2015-07-06 Thread Alexander Alexandrov
and on the website under downloads. We should make sure we explain on these pages that there are downloads for various Scala versions. Cheers, Stephan On Fri, Jul 3, 2015 at 2:01 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: Great, I just posted some comments

Re: Flink 0.9 built with Scala 2.11

2015-07-03 Thread Alexander Alexandrov
, at 2:57 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: @Chiwan: let me know if you need hands-on support. I'll be more than happy to help (as my downstream project is using Scala 2.11). 2015-07-01 17:43 GMT+02:00 Chiwan Park chiwanp...@apache.org: Okay, I will apply

[jira] [Created] (FLINK-2311) Set flink-* dependencies in flink-contrib as provided

2015-07-02 Thread Alexander Alexandrov (JIRA)
Alexander Alexandrov created FLINK-2311: --- Summary: Set flink-* dependencies in flink-contrib as provided Key: FLINK-2311 URL: https://issues.apache.org/jira/browse/FLINK-2311 Project: Flink

Re: Add hash based Aggregation

2015-06-17 Thread Alexander Alexandrov
I added a comment with suggestions how to proceed in the JIRA issue. 2015-06-17 22:41 GMT+02:00 rafi_33...@mailbox.tu-berlin.de: Hello dear Developer, Currently aggregation functions are implemented based on sorting. We would like to add hash based aggregation to Flink. We would be thankful

Memory management overhaul

2015-06-02 Thread Alexander Alexandrov
During an offline chat some time ago Stephan Ewen mentioned that there is an ongoing effort for a dynamic memory allocation in some feature branch lying around. Can you point me to that, as I would like to look at the code? Thanks.

Re: MultipleLinearRegression - Strange results

2015-06-01 Thread Alexander Alexandrov
I've seen some work on adaptive learning rates in the past days. Maybe we can think about extending the base algorithm and comparing the use case setting for the IMPRO-3 project. @Felix you can discuss this with the others on Wednesday, Manu will be also there and can give some feedback, I'll

Re: Problems building the current master

2015-05-19 Thread Alexander Alexandrov
GMT+02:00 Alexander Alexandrov alexander.s.alexand...@gmail.com: I had a different issue related to the fact that flink-language-binding-generic was not able to find (a potentially outdated) flink-compiler dependency. I had to wipe out the local flink artifacts from my .m2/repository to make

Re: Problems building the current master

2015-05-19 Thread Alexander Alexandrov
I had a different issue related to the fact that flink-language-binding-generic was not able to find (a potentially outdated) flink-compiler dependency. I had to wipe out the local flink artifacts from my .m2/repository to make this work. 2015-05-19 18:06 GMT+02:00 Robert Metzger

Re: New project website

2015-05-12 Thread Alexander Alexandrov
PS. Is there a particular reason why the APIs are stacked above each other in the picture (ML on top of Gelly on top of the Table API)? I was actually picturing the three next to each other... 2015-05-12 12:08 GMT+02:00 Alexander Alexandrov alexander.s.alexand...@gmail.com: I suggest to change

Re: New project website

2015-05-12 Thread Alexander Alexandrov
I suggest to change the layout of the bottom half in the following way (will solve the alignment issue): - 2 column layout in 1:1 ratio for *Getting Started*, 1st column with the text and the download button, second column with the maven code snippets - 2 column layout in 1:1 ratio for the

Re: Join with a custom predicate

2015-04-27 Thread Alexander Alexandrov
. Johannes -Original Message- From: Alexander Alexandrov [mailto:alexander.s.alexand...@gmail.com] Sent: Sunday, 26 April 2015 23:22 To: dev@flink.apache.org Subject: Re: Join with a custom predicate I thought about your problem over the weekend. Unfortunately the algorithm

Re: Join with a custom predicate

2015-04-26 Thread Alexander Alexandrov
I thought about your problem over the weekend. Unfortunately the algorithm that you describe does not fit regular equi-join semantics, but I think it could be fitted with a more complex dataflow. To achieve that, I would partition the (active) domain of the two datasets on fine-granular intervals
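The message above is cut off, so the exact dataflow it goes on to propose is not known. As a rough illustration of the general idea of partitioning a domain into intervals so that a non-equi predicate can be evaluated through an equi-join on interval ids, here is a minimal sketch. The predicate `|l - r| <= eps`, the function name `band_join`, and the choice of bucket width are all assumptions made for this example; they are not taken from the thread.

```python
# Minimal sketch of an interval-partitioned join (assumed predicate: |l - r| <= eps).
# This is a generic technique, not necessarily the dataflow proposed in the thread.
from collections import defaultdict
from math import floor

def band_join(left, right, eps):
    """Return all pairs (l, r) with |l - r| <= eps, using interval buckets."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    width = eps  # bucket width equal to eps: matches lie in adjacent buckets only

    # "Partition" the right side by interval id.
    buckets = defaultdict(list)
    for r in right:
        buckets[floor(r / width)].append(r)

    result = []
    for l in left:
        b = floor(l / width)
        # Equi-join on the interval id, probing the neighbouring intervals too.
        for neighbour in (b - 1, b, b + 1):
            for r in buckets.get(neighbour, ()):
                if abs(l - r) <= eps:  # apply the custom predicate
                    result.append((l, r))
    return result

pairs = band_join([1, 10, 20], [2, 12, 30], eps=2)
print(pairs)
```

In a distributed setting the probe of neighbouring buckets would typically be expressed by replicating one input to the adjacent interval ids before a regular equi-join; the in-memory dictionary here only stands in for that step.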

Re: broadcast set size

2015-04-09 Thread Alexander Alexandrov
Hi Martin, The answer to your question really depends on the DOP in which you will be running the job and the expected selectivity (the fraction of lines with that certain ID) in case this does not depend on the other side and can be pre-filtered prior to broadcasting. However, since Flink's

Re: help me

2015-04-09 Thread Alexander Alexandrov
Hello, Can you please re-post this on the user list and make sure you have formatted the example code. At the moment it is kind of hard to read. 2015-04-09 15:35 GMT+02:00 hager sallah loveallah1...@yahoo.com.invalid: I want write program flink on any databaseuser input filed and type of

Re: Should collect() and count() be treated as data sinks?

2015-04-02 Thread Alexander Alexandrov
I have a similar issue here: I would like to run a dataflow up to a particular point and materialize (in memory) the intermediate result. Is this possible at the moment? Regards, Alex 2015-04-02 17:33 GMT+02:00 Felix Neutatz neut...@googlemail.com: Hi, I have run the following program:

Re: [DISCUSS] Issues with heterogeneity of the code

2015-03-18 Thread Alexander Alexandrov
. (Of course if we'll use Google Code Style, they already provide such files https://code.google.com/p/google-styleguide/source/browse/trunk/intellij-java-google-style.xml .) On Mon, Mar 16, 2015 at 2:45 PM Alexander Alexandrov

Re: [DISCUSS] Issues with heterogeneity of the code

2015-03-16 Thread Alexander Alexandrov
+1 for not limiting the line length. 2015-03-16 14:39 GMT+01:00 Stephan Ewen se...@apache.org: +1 for not limiting the line length. Everyone should have a good sense to break lines. When in exceptional cases people violate this, it is usually for a good reason. On Mon, Mar 16, 2015 at 2:18

Re: [DISCUSS] Offer Flink with Scala 2.11

2015-03-16 Thread Alexander Alexandrov
projects. On Wed, Mar 11, 2015 at 12:41 AM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: The PR is here: https://github.com/apache/flink/pull/477 Cheers! 2015-03-10 18:07 GMT+01:00 Alexander Alexandrov alexander.s.alexand...@gmail.com: Yes, will do. 2015

Re: [DISCUSS] Deprecate Spargel API for 0.9

2015-03-11 Thread Alexander Alexandrov
+1 2015-03-11 9:41 GMT+01:00 Till Rohrmann trohrm...@apache.org: If Spargel's functionality is a subset of Gelly, I'm also in favor of a deprecation. This will direct new users directly to Gelly and gives old ones time to adapt their code. On Wed, Mar 11, 2015 at 1:56 AM, Henry Saputra

Re: [DISCUSS] Offer Flink with Scala 2.11

2015-03-10 Thread Alexander Alexandrov
for scala 2.11 ? On Tue, Mar 10, 2015 at 2:50 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: We have it almost ready here: https://github.com/stratosphere/flink/commits/scala_2.11_rebased I wanted to open a PR today 2015-03-10 11:28 GMT+01:00 Robert Metzger rmetz

Re: [DISCUSS] Offer Flink with Scala 2.11

2015-03-10 Thread Alexander Alexandrov
The PR is here: https://github.com/apache/flink/pull/477 Cheers! 2015-03-10 18:07 GMT+01:00 Alexander Alexandrov alexander.s.alexand...@gmail.com: Yes, will do. 2015-03-10 16:39 GMT+01:00 Robert Metzger rmetz...@apache.org: Very nice work. The changes are probably somewhat easy to merge

Re: [DISCUSS] Documentation Java/Scala order

2015-03-09 Thread Alexander Alexandrov
+1 for Scala 2015-03-09 15:34 GMT+01:00 Márton Balassi balassi.mar...@gmail.com: Then if no objections in 24 hours I'd open a JIRA issue for this. On Mon, Mar 9, 2015 at 3:23 PM, Till Rohrmann trohrm...@apache.org wrote: +1 for Scala :-) On Sat, Mar 7, 2015 at 1:56 PM, Márton Balassi

Re: [DISCUSS] Offer Flink with Scala 2.11

2015-03-02 Thread Alexander Alexandrov
a maven property work? (Profile may be needed for quasiquotes dependency?) On Mon, Mar 2, 2015 at 4:36 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: Hi there, since I'm relying on Scala 2.11.4 on a project I've been working on, I created a branch which updates

[jira] [Created] (FLINK-1613) Cannost submit to remote ExecutionEnvironment from IDE

2015-02-26 Thread Alexander Alexandrov (JIRA)
Alexander Alexandrov created FLINK-1613: --- Summary: Cannost submit to remote ExecutionEnvironment from IDE Key: FLINK-1613 URL: https://issues.apache.org/jira/browse/FLINK-1613 Project: Flink

Re: k-means example behavior

2015-02-25 Thread Alexander Alexandrov
Apache's commons-math implementation offers various strategies for handling these scenarios: http://commons.apache.org/proper/commons-math/jacoco/org.apache.commons.math3.stat.clustering/KMeansPlusPlusClusterer.java.html (take a look at the EmptyClusterStrategy enum options) 2015-02-24 23:28
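To make the idea of an empty-cluster strategy concrete, here is a minimal sketch of one k-means center update that re-seeds an empty cluster at the point farthest from its nearest center. It is written in the spirit of a "farthest point" policy and is not the commons-math code; the function names `sq_dist`, `assign` and `update_centers` and the sample data are assumptions made for this example.

```python
# Minimal sketch of one k-means update step with a "farthest point" fallback
# for empty clusters. Not the commons-math implementation; names are hypothetical.

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Group points by the index of their nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
        clusters[nearest].append(p)
    return clusters

def update_centers(points, centers):
    """Recompute centers; re-seed any empty cluster at the farthest point."""
    new_centers = []
    for members in assign(points, centers):
        if members:
            dim = len(members[0])
            mean = tuple(sum(m[d] for m in members) / len(members) for d in range(dim))
            new_centers.append(mean)
        else:
            # Empty cluster: pick the point that is worst served by the current centers.
            farthest = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
            new_centers.append(farthest)
    return new_centers

points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
centers = [(0.5, 0.0), (100.0, 0.0)]   # the second center attracts no points
new_centers = update_centers(points, centers)
print(new_centers)
```

Other policies differ only in the `else` branch, for example by raising an error instead of re-seeding.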

Re: [DISCUSS] Gelly iteration abstractions

2015-02-22 Thread Alexander Alexandrov
Hi Vasia, I am trying to look at the problem in more detail. Which version of the MST are you talking about? Right now in the Gelly repository I can only find the SSSP example (parallel Bellman-Ford) from Section 4.2 in [1]. However, it seems that the issues encountered by Andra are related to

Re: [jira] [Created] (FLINK-1594) DataStreams don't support self-join

2015-02-20 Thread Alexander Alexandrov
I guess the intended behavior here is to just throw a nicer error, as you cannot really join two data streams. 2015-02-20 16:41 GMT+01:00 Daniel Bali (JIRA) j...@apache.org: Daniel Bali created FLINK-1594: -- Summary: DataStreams don't support

Re: TypeSerializerInputFormat cannot determine its type automatically

2015-01-29 Thread Alexander Alexandrov
Alexander Alexandrov alexander.s.alexand...@gmail.com: The problem seems to be that the reflection analysis cannot determine the type of the TypeSerializerInputFormat. One possible solution is to add the ResultTypeQueryable interface and force clients to explicitly set the TypeInformation

Re: Fwd: TypeSerializerInputFormat cannot determine its type automatically

2015-01-29 Thread Alexander Alexandrov
,TypeInformation) instead of env.readFile() then you can pass TypeInformation manually without implementing ResultTypeQueryable. Regards, Timo On 29.01.2015 14:54, Alexander Alexandrov wrote: The problem seems to be that the reflection analysis cannot determine the type

Re: ReduceGroup fails on server

2015-01-29 Thread Alexander Alexandrov
Forget what I just said, didn't realize that it's Scala :) 2015-01-29 16:24 GMT+01:00 Alexander Alexandrov alexander.s.alexand...@gmail.com: have you tried declaring your UDF classes (e.g. TotalRankDistribution) as static? 2015-01-29 16:14 GMT+01:00 Arvid Heise arvid.he...@gmail.com: Hi

Re: ReduceGroup fails on server

2015-01-29 Thread Alexander Alexandrov
Have you tried declaring your UDF classes (e.g. TotalRankDistribution) as static? 2015-01-29 16:14 GMT+01:00 Arvid Heise arvid.he...@gmail.com: Hi Flinker, I'm currently desperately trying to get a workflow to run remotely on a server. The workflow works fine in the local execution

Fwd: TypeSerializerInputFormat cannot determine its type automatically

2015-01-29 Thread Alexander Alexandrov
inference, but at the moment I cannot find any other usages of the TypeSerializerInputFormat except from the unit test. -- Forwarded message -- From: Alexander Alexandrov alexander.s.alexand...@gmail.com Date: 2015-01-29 12:04 GMT+01:00 Subject: TypeSerializerInputFormat cannot determine

Re: [jira] [Created] (FLINK-1459) Collect DataSet to client

2015-01-28 Thread Alexander Alexandrov
There is already an ongoing discussion and an issue open about that: http://apache-flink-incubator-mailing-list-archive.1008284.n3.nabble.com/Gather-a-distributed-dataset-td3216.html I am sadly currently time-pressed with other things, but if nobody else handles this, I expect to be able to work

Re: Tagging Flink classes with InterfaceAudience and InterfaceStability

2015-01-27 Thread Alexander Alexandrov
I don't get the difference between Private and LimitedPrivate, but otherwise seems like quite a nice idea. It will be also good if we can agree upon what these tags actually mean and add this meaning to the documentation. 2015-01-27 15:46 GMT+01:00 Robert Metzger rmetz...@apache.org: Hi,

Keeping around temp datasets

2015-01-20 Thread Alexander Alexandrov
Hi there, I have to implement some generic fallback strategy on top of a more abstract DSL in order to keep datasets in a temp space (e.g. Tachyon). My implementation is based on the 0.8 release. At the moment I am undecided between three options: - BinaryInputFormat / BinaryOutputFormat -

Representing Scala base types in the Flink RT

2015-01-20 Thread Alexander Alexandrov
Hi there, I cannot figure out how the Scala base types (e.g. scala.Int, scala.Double, etc.) are mapped to the Flink runtime. It seems that they are not treated the same as their Java counterparts (e.g. java.lang.Integer, java.lang.Double). For example, if I write the following code: val

[jira] [Created] (FLINK-1422) Missing usage example for withParameters

2015-01-20 Thread Alexander Alexandrov (JIRA)
Alexander Alexandrov created FLINK-1422: --- Summary: Missing usage example for withParameters Key: FLINK-1422 URL: https://issues.apache.org/jira/browse/FLINK-1422 Project: Flink Issue

Re: Gather a distributed dataset

2015-01-16 Thread Alexander Alexandrov
Thanks, I will have a look at your comments tomorrow and create a PR which should supersede 210. BTW, is there already a test case where I can see the suggested way to do staged execution with the new ExecutionEnvironment API? I thought about your second remark as well. The following lines

Upgrading to Scala 2.11.x?

2015-01-15 Thread Alexander Alexandrov
Currently, Flink uses Scala 2.10.4 and relies on the macro paradise compiler plugin to get the quasi-quotes functionality. This makes the code incompatible with third-party add-ons that use macros written against a newer version of Scala. Scala 2.11 has been around for almost a year already. It

Re: Gather a distributed dataset

2015-01-15 Thread Alexander Alexandrov
an get working on that. Probably involves adding another set of akka messages from TM - JM - Client. Or something like an extension to the BLOB manager for streams? Greetings, Stephan On Mon, Jan 12, 2015 at 12:25 PM, Alexander Alexandrov alexander.s.alexand...@gmail.com wrote: Thanks, I