Re: flink testing

2017-04-23 Thread Georg Heiler
how> Ted Yu <yuzhih...@gmail.com> wrote on Sun., 23 Apr 2017 at 10:46: > Please give more context by describing what spark-test-base does :-) > > > On Apr 22, 2017, at 10:57 PM, Georg Heiler <georg.kf.hei...@gmail.com> > wrote: > > > > Hi, >

Flink first project

2017-04-23 Thread Georg Heiler
New to Flink, I would like to do a small project to get a better feeling for it. I am thinking of pulling some stats from several REST APIs (e.g. Bitcoin prices from different exchanges) and comparing prices across exchanges in real time. Are there already some REST API sources
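A minimal polling-source sketch for such an experiment, assuming the DataStream API; the endpoint URL and polling interval are placeholders, and a production job would use an async HTTP client with proper error handling:

```scala
import org.apache.flink.streaming.api.functions.source.SourceFunction
import org.apache.flink.streaming.api.scala._

// Hypothetical REST-polling source: emits the raw response body every few seconds.
class RestPollingSource(url: String, intervalMs: Long) extends SourceFunction[String] {
  @volatile private var running = true

  override def run(ctx: SourceFunction.SourceContext[String]): Unit =
    while (running) {
      // a blocking call is fine for a toy example
      val body = scala.io.Source.fromURL(url).mkString
      ctx.collect(body)
      Thread.sleep(intervalMs)
    }

  override def cancel(): Unit = running = false
}

object ExchangeTicker {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.addSource(new RestPollingSource("https://api.example.com/ticker/btc", 5000)).print()
    env.execute("rest-polling-demo")
  }
}
```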

flink testing

2017-04-22 Thread Georg Heiler
Hi, is there something like spark-testing-base for flink as well? Cheers, Georg
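There is no one-to-one spark-testing-base port, but a minimal sketch of plain local-environment testing (assuming scalatest on the test classpath) looks roughly like this; flink-junit and flink-spector, both mentioned later in this digest, build on the same idea:

```scala
import org.apache.flink.api.scala._
import org.scalatest.flatspec.AnyFlatSpec

class WordLengthSpec extends AnyFlatSpec {
  "the pipeline" should "compute word lengths" in {
    // run the transformation on a local environment and assert on the collected result
    val env = ExecutionEnvironment.getExecutionEnvironment
    val result = env.fromElements("flink", "testing").map(_.length).collect()
    assert(result.sorted == Seq(5, 7))
  }
}
```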

multiple users per flink deployment

2017-08-01 Thread Georg Heiler
Hi, Flink currently only seems to support a single Kerberos ticket for deployment. Are there plans to support a different user for each job? regards, Georg

Re: Flink first project

2017-04-24 Thread Georg Heiler
it in flink, > you do not want that your processing fails because the web service is not > available etc. > Via flume which is suitable for this kind of tasks it is more controlled > and reliable. > > On 23. Apr 2017, at 18:02, Georg Heiler <georg.kf.hei...@gmail.com> wrote:

Re: flink testing

2017-04-24 Thread Georg Heiler
ontext. > > Cheers, > > Konstantin > > [1] https://github.com/knaufk/flink-junit > > On 23.04.2017 17:19, Georg Heiler wrote: > > Spark testing base https://github.com/holdenk/spark-testing-base offers > > some Base classes to use when writing tests with Spark which mak

Re: Flink first project

2017-04-24 Thread Georg Heiler
Wouldn't adding Flume -> Kafka -> Flink also introduce additional latency? Georg Heiler <georg.kf.hei...@gmail.com> wrote on Sun., 23 Apr 2017 at 20:23: > So you would suggest Flume over a custom Akka source from Bahir? > > Jörn Franke <jornfra...@gmail.com> sc

Re: Flink first project

2017-04-27 Thread Georg Heiler
ing your first Flink job right away :) > > Cheers, > Gordon > > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/guarantees.html > [2] > https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/connectors/kafka.html > > On 24 April 2017 at

Re: multiple users per flink deployment

2017-08-02 Thread Georg Heiler
ndent users within > the same JVM through dynamic JAAS configuration. > See this mail thread [1] for more detail on that. > > Cheers, > Gordon > > [1] > http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Kafka-0-10-jaas-multiple-clients-td12831.html#a13317 > &

queryable state vs. writing result back to Kafka

2017-08-05 Thread Georg Heiler
What is the advantage of queryable state compared to writing the result back to Kafka? regards, Georg

Re: getting started with flink / scala

2017-11-29 Thread Georg Heiler
You would suggest: https://github.com/ottogroup/flink-spector for unit tests? Georg Heiler <georg.kf.hei...@gmail.com> wrote on Wed., 29 Nov 2017 at 22:33: > Thanks, this sounds like a good idea - can you recommend such a project? > > Jörn Franke <jornfra...@gmail.com>

getting started with flink / scala

2017-11-29 Thread Georg Heiler
Getting started with Flink / scala, I wonder whether the scala base library should be excluded as a best practice: https://github.com/tillrohrmann/flink-project/blob/master/build.sbt#L32 // exclude Scala library from assembly assemblyOption in assembly := (assemblyOption in
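For reference, the exclusion in the linked build.sbt continues roughly as follows (sbt-assembly 0.14.x syntax):

```scala
// exclude the Scala library from the fat jar; the Flink distribution already ships the Scala runtime
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
```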

Re: getting started with flink / scala

2017-11-29 Thread Georg Heiler
virtually all companies will require that > you know these things. > > I am not sure if a hello world project in Flink exists containing all > these but it would be a good learning task to create such a thing. > > On 29. Nov 2017, at 22:03, Georg Heiler <georg.kf.hei...@gmail.com>

flink local & interactive development

2017-11-30 Thread Georg Heiler
Is interactive development possible with Flink, like with Spark in a REPL? When trying to use the console mode of SBT I get the following error: java.lang.Exception: Deserializing the OutputFormat (org.apache.flink.api.java.Utils$CollectHelper@210d5aa7) failed: Could not read the user code wrapper

Flink typesafe configuration

2017-11-29 Thread Georg Heiler
Starting out with Flink from a Scala background, I would like to use Typesafe config via https://github.com/pureconfig/pureconfig; however, the best-practices page https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/best_practices.html recommends setting up:

Re: flink local & interactive development

2017-12-01 Thread Georg Heiler
Hi Georg, > > I have no experience with SBT's console mode, so I cannot comment on that, > but Flink provides a Scala REPL that might be useful [1]. > > Best, Fabian > > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/scala_shell.html > >

Re: using a Yarn cluster for both Spark and Flink

2018-01-17 Thread Georg Heiler
Why not? Isn't a resource manager meant for exactly this? You should, however, clearly define service level agreements, as a Flink streaming job might require a certain maximum latency, as opposed to a Spark batch job. Soheil Pourbafrani wrote on Thu., 18 Jan 2018 at 08:30: > Is it

Re: arbitrary state handling in python api

2020-09-11 Thread Georg Heiler
KeyedProcessFunction could meet your > requirements? We are planning to support it in the Python DataStream API in > 1.12. > > Regards, > Dian > > On 9 Sep 2020, at 2:28 PM, Georg Heiler wrote: > > Hi, > > does the python API expose some kind of mapGroupsWithState operator which >

Re: map JSON to scala case class & off-heap optimization

2020-07-16 Thread Georg Heiler
Many thanks! On Wed., 15 July 2020 at 15:58, Aljoscha Krettek <aljos...@apache.org> wrote: > On 11.07.20 10:31, Georg Heiler wrote: > > 1) similarly to spark the Table API works on some optimized binary > > representation > > 2) this is only available in

Re: Avro from avrohugger still invalid

2020-07-02 Thread Georg Heiler
might know more about the problem you are > > describing. > > > > Cheers, > > Till > > > > On Mon, Jun 29, 2020 at 10:21 PM Georg Heiler > > > wrote: > > > >> Older versions of flink were incompatible with the Scala specific record >

flink take single element from stream

2020-07-09 Thread Georg Heiler
How can I explore a stream in Flink interactively? Spark has the concept of take/head to extract the first n elements of a dataframe / table. Is something similar available in Flink for a stream like: val serializer = new JSONKeyValueDeserializationSchema(false) val stream = senv.addSource(
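A rough equivalent of Spark's take(n), assuming a newer Flink version (1.12+) where DataStream#executeAndCollect is available; on 1.10/1.11, DataStreamUtils.collect serves a similar purpose:

```scala
import org.apache.flink.streaming.api.scala._
import scala.collection.JavaConverters._

val env = StreamExecutionEnvironment.getExecutionEnvironment
val stream: DataStream[Int] = env.fromElements(1, 2, 3, 4, 5)

// executeAndCollect() runs the job and streams results back to the client;
// take(3) gives the first three elements, like Spark's take/head.
val it = stream.executeAndCollect()
val firstThree = it.asScala.take(3).toList
it.close()
println(firstThree)
```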

map JSON to scala case class & off-heap optimization

2020-07-09 Thread Georg Heiler
Hi, I want to map a stream of JSON documents from Kafka to a Scala case class. How can this be accomplished using the JSONKeyValueDeserializationSchema? Is a manual mapping of object nodes required? I have a Spark background; there, such manual mappings are usually discouraged. Instead, they
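A minimal sketch of the manual route, assuming a hypothetical Tweet case class and the Kafka topic from the other snippets in this digest; JSONKeyValueDeserializationSchema yields an ObjectNode with "key" and "value" fields, so the mapping to the case class is written by hand:

```scala
import java.util.Properties
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer
import org.apache.flink.streaming.util.serialization.JSONKeyValueDeserializationSchema
import org.apache.flink.shaded.jackson2.com.fasterxml.jackson.databind.node.ObjectNode

// hypothetical target type; the field names are assumptions
case class Tweet(id: Long, text: String)

val properties = new Properties()
properties.setProperty("bootstrap.servers", "localhost:9092")
properties.setProperty("group.id", "tweets-demo")

val senv = StreamExecutionEnvironment.getExecutionEnvironment
val consumer = new FlinkKafkaConsumer(
  "tweets-raw-json", new JSONKeyValueDeserializationSchema(false), properties)
consumer.setStartFromEarliest()

// manual mapping from the Jackson ObjectNode to the case class
val tweets: DataStream[Tweet] = senv
  .addSource(consumer)
  .map { node: ObjectNode =>
    val v = node.get("value")
    Tweet(v.get("id").asLong(), v.get("text").asText())
  }
tweets.print()
senv.execute("json-to-case-class")
```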

Re: map JSON to scala case class & off-heap optimization

2020-07-09 Thread Georg Heiler
on to object. > > Regards, > Taher Koitawala > > On Thu, Jul 9, 2020, 9:54 PM Georg Heiler > wrote: > >> Hi, >> >> I want to map a stream of JSON documents from Kafka to a scala >> case-class. How can this be accomplished using the >> JSONKeyValueDeseri

Re: Avro from avrohugger still invalid

2020-07-02 Thread Georg Heiler
But would it be possible to somehow use AvroSerializer for now? Best, Georg On Thu., 2 July 2020 at 23:44, Georg Heiler <georg.kf.hei...@gmail.com> wrote: > What is the suggested workaround for now? > > > Thanks! > > Aljoscha Krettek wrote on Thu., 2 July 2020 at

MalformedClassName for scala case class

2020-07-09 Thread Georg Heiler
Hi, why can't I register the stream as a table? I get a MalformedClassName exception. val serializer = new JSONKeyValueDeserializationSchema(false) val stream = senv.addSource( new FlinkKafkaConsumer( "tweets-raw-json", serializer, properties ).setStartFromEarliest()
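For context, a plain registration sketch, leaving the Kafka source aside (Flink 1.11+ bridge package; on 1.10 the StreamTableEnvironment lives in org.apache.flink.table.api.scala instead); the case class and data are placeholders:

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.bridge.scala.StreamTableEnvironment

// hypothetical case class standing in for the tweet schema
case class Tweet(id: Long, text: String)

val senv = StreamExecutionEnvironment.getExecutionEnvironment
val tweets: DataStream[Tweet] = senv.fromElements(Tweet(1L, "hello"), Tweet(2L, "world"))

// register the stream as a view and query it with SQL
val tEnv = StreamTableEnvironment.create(senv)
tEnv.createTemporaryView("tweets", tEnv.fromDataStream(tweets))
tEnv.sqlQuery("SELECT id, text FROM tweets").execute().print()
```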

Re: map JSON to scala case class & off-heap optimization

2020-07-09 Thread Georg Heiler
> then use the Jackson ObjectMapper to convert to scala objects. In flink > there is no API like Spark to automatically get all fields. > > On Thu, Jul 9, 2020, 11:38 PM Georg Heiler > wrote: > >> How can I use it with a scala case class? >> If I understand it correctly f

Re: map JSON to scala case class & off-heap optimization

2020-07-11 Thread Georg Heiler
te JSON decoders for scala case classes. > > > > As it was mentioned earlier, Flink does not come packaged with > > JSON-decoding generators for Scala like spark does. > > > > On Thu, Jul 9, 2020 at 4:45 PM Georg Heiler > > wrote: > > > >> Great. Than

Re: Avro from avrohugger still invalid

2020-07-11 Thread Georg Heiler
Aljoscha Krettek < aljos...@apache.org>: > Hi, > > I don't think there's a workaround, except copying the code and manually > fixing it. Did you check out my comment on the Jira issue and the new > one I created? > > Best, > Aljoscha > > On 03.07.20 07:1

Re: passing additional jvm parameters to the configuration

2020-06-25 Thread Georg Heiler
e as it's also used in the SSL setup [2]. > > [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/cli.html > [2] > https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/security-ssl.html#tips-for-yarn--mesos-deployment > > > On Thu, Jun 25, 2020

Re: passing additional jvm parameters to the configuration

2020-06-24 Thread Georg Heiler
d here [1]. > > If not, could you please be more precise: do you want the parameter to be > passed to the driver, the job manager, or the task managers? > > [1] > https://ci.apache.org/projects/flink/flink-docs-master/ops/cli.html#deployment-targets > > On Wed, Jun 24, 2020

passing additional jvm parameters to the configuration

2020-06-24 Thread Georg Heiler
Hi, how can I pass additional configuration parameters like Spark's extraJavaOptions to a Flink job? https://stackoverflow.com/questions/62562153/apache-flink-and-pureconfig-passing-java-properties-on-job-startup contains the details. But the gist is: flink run --class

[no subject]

2020-06-29 Thread Georg Heiler
Hi, I am trying to use the Confluent schema registry in an interactive Flink Scala shell. My problem is that initializing the serializer from the ConfluentRegistryAvroDeserializationSchema fails: ```scala val serializer =
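For reference, a hedged sketch of how that initialization typically looks; the reader schema and registry URL below are placeholders (requires the flink-avro-confluent-registry dependency):

```scala
import org.apache.avro.Schema
import org.apache.flink.formats.avro.registry.confluent.ConfluentRegistryAvroDeserializationSchema

// placeholder reader schema; in practice this matches the subject registered for the topic
val schema: Schema = new Schema.Parser().parse(
  """{"type":"record","name":"Tweet","fields":[{"name":"id","type":"long"}]}""")

// forGeneric(readerSchema, registryUrl) yields a DeserializationSchema for GenericRecord
val deserializer =
  ConfluentRegistryAvroDeserializationSchema.forGeneric(schema, "http://localhost:8081")
```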

Avro from avrohugger still invalid

2020-06-29 Thread Georg Heiler
Older versions of Flink were incompatible with the Scala-specific record classes generated by avrohugger. https://issues.apache.org/jira/browse/FLINK-12501 Flink 1.10 apparently fixes this. I am currently using 1.10.1 but still experience this problem

Re: passing additional jvm parameters to the configuration

2020-06-25 Thread Georg Heiler
want to make it available on job manager or > task manager but I guess the basic form is good enough for you. > > [1] > https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#jvm-and-logging-options > > On Wed, Jun 24, 2020 at 10:52 PM Georg Heiler > w

Re: DROOLS rule engine with flink

2020-06-23 Thread Georg Heiler
Why not use Flink CEP? https://flink.apache.org/news/2020/03/24/demo-fraud-detection-2.html has a nice interactive example. Best, Georg Jaswin Shah wrote on Tue., 23 June 2020 at 21:03: > Hi I am thinking of using some rule engine like DROOLS with flink to solve > a problem described below:
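To illustrate what a simple "rule" looks like in CEP rather than DROOLS, a sketch using flink-cep-scala; the event type and field names are made up:

```scala
import org.apache.flink.cep.scala.CEP
import org.apache.flink.cep.scala.pattern.Pattern
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

case class LoginEvent(userId: String, success: Boolean)

val env = StreamExecutionEnvironment.getExecutionEnvironment
val logins: DataStream[LoginEvent] = env.fromElements(
  LoginEvent("alice", success = false),
  LoginEvent("alice", success = false))

// rule: two failed logins for the same user within 10 seconds
val pattern = Pattern
  .begin[LoginEvent]("first").where(!_.success)
  .next("second").where(!_.success)
  .within(Time.seconds(10))

CEP.pattern(logins.keyBy(_.userId), pattern)
  .select(matched => s"suspicious logins for ${matched("first").head.userId}")
  .print()
env.execute("cep-rule-demo")
```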

Re: passing additional jvm parameters to the configuration

2020-06-25 Thread Georg Heiler
not sure if this > is intentional. > > > If you could put the env.java.opts in the flink-conf.yaml, it would most > likely work for both YARN and local. With FLINK_CONF_DIR you can set a > different conf dir per job. Alternatively, you could also specify both > FLINK_ENV_JAVA_
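To make that suggestion concrete, a flink-conf.yaml sketch; the config file path is a placeholder:

```yaml
# applied to JobManager and TaskManager JVMs alike; env.java.opts.jobmanager /
# env.java.opts.taskmanager exist if the two need to differ
env.java.opts: "-Dconfig.file=/path/to/application.conf"
```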

Running Flink on kerberized HDP 3.1 (minimal getting started)

2020-06-12 Thread Georg Heiler
Hi, I try to run Flink on a kerberized HDP 3.1 instance and need some help getting started. https://stackoverflow.com/questions/62330689/execute-flink-1-10-on-a-hdp-3-1-cluster-to-access-hive-tables describes how far I have gotten so far. In the end, I want to be able to start task managers on
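For completeness, the Kerberos-related keys from Flink's security configuration that such a setup usually involves; the keytab path and principal are placeholders:

```yaml
# flink-conf.yaml sketch for a kerberized cluster
security.kerberos.login.use-ticket-cache: false
security.kerberos.login.keytab: /path/to/user.keytab
security.kerberos.login.principal: user@EXAMPLE.COM
# contexts decide which components (e.g. Kafka, ZooKeeper) reuse the credentials
security.kerberos.login.contexts: Client,KafkaClient
```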

GenericData cannot be cast to type scala.Product

2020-07-23 Thread Georg Heiler
Hi, as a follow up to https://issues.apache.org/jira/browse/FLINK-18478 I now face a class cast exception. The reproducible example is available at https://gist.github.com/geoHeil/5a5a4ae0ca2a8049617afa91acf40f89 I do not understand (yet) why such a simple example of reading Avro from a Schema

Re: Pyflink/Flink Java parquet streaming file sink for a dynamic schema stream

2021-12-03 Thread Georg Heiler
t; Yes the general JSON schema should follow a debezium JSON schema. The > fields that need to be saved to the parquet file are in the "after" key. > > On Fri, 3 Dec 2021, 07:10 Georg Heiler, wrote: > >> Do the JSONs have the same schema overall? Or is each potentially >

Re: Pyflink/Flink Java parquet streaming file sink for a dynamic schema stream

2021-12-02 Thread Georg Heiler
Do the JSONs have the same schema overall? Or is each potentially structured differently? Best, Georg On Fri., 3 Dec 2021 at 00:12, Kamil ty wrote: > Hello, > > I'm wondering if there is a possibility to create a parquet streaming file > sink in Pyflink (in Table API) or in Java Flink

scala shell not part of 1.14.4 download

2022-03-18 Thread Georg Heiler
Hi, https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/deployment/repls/scala_shell/ mentions: bin/start-scala-shell.sh local a script to start a scala REPL shell. But the download for Flink https://www.apache.org/dyn/closer.lua/flink/flink-1.14.4/flink-1.14.4-bin-scala_2.12.tgz

Re: scala shell not part of 1.14.4 download

2022-03-22 Thread Georg Heiler
> Hi Georg, > > You can check out the discussion thread for the motivation [1]. > > Best regards, > > Martijn Visser > https://twitter.com/MartijnVisser82 > > [1] https://lists.apache.org/thread/pojsrrdckjwow5186nd7hn9y5j9t29ov > > On Sun, 20 Mar 2022 at 08:13, Geo

Re: scala shell not part of 1.14.4 download

2022-03-20 Thread Georg Heiler
Chesnay Schepler <ches...@apache.org> wrote: > The Scala Shell only works with Scala 2.11. You will need to use the Scala > 2.11 Flink distribution. > > On 18/03/2022 12:42, Georg Heiler wrote: > > Hi, > > > https://nightlies.apache.org/flink/flink-docs-release-1.14/

flink SQL client with kafka confluent avro binaries setup

2022-03-23 Thread Georg Heiler
Hi, when trying to set up a demo for the kafka-sql-client reading an Avro topic from Kafka, I run into problems with regard to the additional dependencies. In the spark-shell there is a --packages option which automatically resolves any additional required jars (transitively) using the provided
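As a sketch of the jar-based route (the follow-up reply below mentions that only the -j/-l flags exist); the artifact names and versions are placeholders:

```sh
# download the fat "sql" connector jars once, then point the SQL client at them
./bin/sql-client.sh \
  -j lib/flink-sql-connector-kafka_2.12-1.14.4.jar \
  -j lib/flink-sql-avro-confluent-registry-1.14.4.jar
```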

Re: DBT-flink profile?

2022-03-25 Thread Georg Heiler
Hi, "use" is perhaps not the right word (yet); rather, experiment. But both would be relevant, and in particular also the streaming option. I also just found https://materialize.com/docs/guides/dbt/ outlining how dbt and streaming could potentially be married. Perhaps their integration could serve

SQL Client Kafka (UPSERT?) Sink for confluent-avro

2022-03-24 Thread Georg Heiler
Hi, how can I get Flink's SQL client to nicely sink some data to either the regular kafka or the upsert-kafka connector? I have a table/topic with dummy data: CREATE TABLE metrics_brand_stream ( `event_time` TIMESTAMP(3) METADATA FROM 'timestamp', WATERMARK FOR event_time AS event_time -
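A hedged sketch of an upsert-kafka sink for an aggregated result; the columns, topic, and URLs are placeholders, and the exact spelling of the schema-registry option depends on the Flink version:

```sql
CREATE TABLE metrics_per_brand_sink (
  brand STRING,
  avg_duration DOUBLE,
  PRIMARY KEY (brand) NOT ENFORCED
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'metrics-per-brand',
  'properties.bootstrap.servers' = 'localhost:9092',
  'key.format' = 'avro-confluent',
  'key.avro-confluent.schema-registry.url' = 'http://localhost:8081',
  'value.format' = 'avro-confluent',
  'value.avro-confluent.schema-registry.url' = 'http://localhost:8081'
);
-- newer releases shorten the registry option to 'key.avro-confluent.url' / 'value.avro-confluent.url'
```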

Re: flink SQL client with kafka confluent avro binaries setup

2022-03-24 Thread Georg Heiler
flink-sql-connector-kafka jar. > > For your questions, to my best knowledge, '-j' and '-l' options are the > only options for now. Maybe others in the community can provide more > information. > > Best, > Biao Geng > > > Georg Heiler wrote on Wed., 23 March 2022 at 23:59: >

Flink SQL AVG with mandatory type casting

2022-03-24 Thread Georg Heiler
Hi, I observe strange behavior in Flink SQL: For an input stream: CREATE TABLE input_stream ( duration int, rating int ) WITH ( 'connector' = 'kafka', 'topic' = 't', 'scan.startup.mode' = 'earliest-offset', 'format' = 'avro-confluent',
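A sketch of the casting workaround the subject refers to, assuming the integer columns from the input_stream DDL above:

```sql
-- AVG over an INT column keeps the integer type, so cast before averaging
SELECT
  AVG(CAST(duration AS DOUBLE)) AS avg_duration,
  AVG(CAST(rating   AS DOUBLE)) AS avg_rating
FROM input_stream;
```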

DBT-flink profile?

2022-03-24 Thread Georg Heiler
Hi, is anyone working on a DBT Flink plugin/profile? https://docs.getdbt.com/reference/profiles.yml hosts many other databases - and I think this kind of support would be really beneficial for the SQL part of Flink. Best, Georg

Re: SQL Client Kafka (UPSERT?) Sink for confluent-avro

2022-03-29 Thread Georg Heiler
e are passed > has changed in recent versions, so it'd be good to double check you are > using the documentation for the version you are running on. > > > Best > Ingo > > On 24.03.22 11:57, Georg Heiler wrote: > > Hi, > > > > how can I get Flinks SQL client to

trigger once (batch job with streaming semantics)

2022-05-02 Thread Georg Heiler
Hi, spark https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#triggers offers a variety of triggers. In particular, it also has the "once" mode: *One-time micro-batch* The query will execute *only one* micro-batch to process all the available data and then stop on
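The closest Flink analogue the replies point to is batch execution mode on a bounded source (Flink 1.12+); a minimal sketch, assuming the DataStream API:

```scala
import org.apache.flink.api.common.RuntimeExecutionMode
import org.apache.flink.streaming.api.scala._

val env = StreamExecutionEnvironment.getExecutionEnvironment
// process whatever bounded input is available, then terminate, similar to trigger-once;
// for Kafka, the newer KafkaSource can be bounded via setBounded(OffsetsInitializer.latest())
env.setRuntimeMode(RuntimeExecutionMode.BATCH)
```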

Re: trigger once (batch job with streaming semantics)

2022-05-06 Thread Georg Heiler
s.apache.org/flink/flink-docs-master/docs/dev/datastream/execution_mode/ > [2] > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/common/ > > On Mon, 2 May 2022 at 16:46, Georg Heiler > wrote: > >> Hi, >> >> spark >> https://spark.apache.o

Re: trigger once (batch job with streaming semantics)

2022-05-09 Thread Georg Heiler
out of the box that lets you > start Flink in streaming mode, run everything that's available at that > moment and then stops when there's no data anymore. You would need to > trigger the stop yourself. > > Best regards, > > Martijn > > On Fri, 6 May 2022 at 13:37, Geo