Re: Non-Determinism in Table-API with Kafka and Event Time

2023-02-13 Thread Theodor Wübker
…ter than your latest event and see if they show up. > Hope it helps > -Hector > On Mon, Feb 13, 2023 at 8:47 AM Theodor Wübker wrote: > Hey, so one more thing, the query looks like this: > SELECT window_sta…

Re: Non-Determinism in Table-API with Kafka and Event Time

2023-02-12 Thread Theodor Wübker
…yed at all. When I key it by the attribute “a”, I get the incorrect, but deterministic results. Maybe in the second case, only 1 partition out of the 10 is consumed at once? Best, Theo > On 13. Feb 2023, at 08:15, Theodor Wübker wrote: > Hey Yuxia, > thanks for your response…

Re: Non-Determinism in Table-API with Kafka and Event Time

2023-02-12 Thread Theodor Wübker
…eo (sent again; sorry, I previously responded only to you, not the mailing list, by accident) > On 13. Feb 2023, at 08:14, Theodor Wübker wrote: > Hey Yuxia, > thanks for your response. I figured too that the events arrive in a (somewhat) random order and thus cause no…

Non-Determinism in Table-API with Kafka and Event Time

2023-02-12 Thread Theodor Wübker
Hey everyone, I am experiencing non-determinism in my Table API program at the moment and (as a relatively inexperienced Flink and Kafka user) I can’t really explain to myself why it happens. I have a topic with 10 partitions and a bit of data on it. Now I run a simple SELECT * query on this, t…
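A common cause of this kind of non-determinism is that per-partition watermarks stall when some of the 10 partitions are idle or consumed unevenly, so later events can be treated as late. A minimal sketch of one mitigation, assuming the SQL client and a hypothetical source table named `my_topic` (names and the timeout value are illustrative, not from the thread):

```sql
-- Sketch: let the watermark advance even when some Kafka partitions
-- are temporarily idle. '5 s' is an illustrative value.
SET 'table.exec.source.idle-timeout' = '5 s';

-- The query from the thread would then run on top of this setting:
SELECT * FROM my_topic;
```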

Standalone cluster memory configuration

2023-02-02 Thread Theodor Wübker
Hello everyone, I have a standalone cluster running in a Docker Swarm with a very simple docker-compose configuration [3]. When I run my job there with a parallelism greater than one, I get an out-of-memory error. Nothing out of the ordinary, so I wanted to increase the JVM heap. I did that by…
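For the official Flink Docker images, process and heap memory are usually configured through the FLINK_PROPERTIES environment variable rather than raw JVM flags. A minimal docker-compose sketch, assuming a service named `taskmanager` and an illustrative 4g size (both hypothetical, not from the thread):

```yaml
# Sketch only: service name and memory size are illustrative.
taskmanager:
  image: flink:1.16
  environment:
    - |
      FLINK_PROPERTIES=
      jobmanager.rpc.address: jobmanager
      taskmanager.memory.process.size: 4g
```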

Cluster uploading and running a jar itself

2023-01-16 Thread Theodor Wübker
Hello, I noticed my Flink cluster (version 1.16) is regularly uploading a jar called “check-execute.jar” by itself. Apparently it also tries to run it, at least that’s what I take from this log line of my jobmanager, which appears numerous times: 2023-01-16 07:09:00,719 WARN org.apache.flink.runtime.w…

Re: Windowing query with group by produces update stream

2022-12-20 Thread Theodor Wübker
I actually managed to fix this already :) For those wondering: I grouped by both window start and window end. That did it! > On 19. Dec 2022, at 15:43, Theodor Wübker wrote: > Hey everyone, > I would like to run a windowing SQL query with a group-by clause on a Kafka…
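The fix described above, sketched as a windowing TVF query (table and column names are hypothetical, not from the thread):

```sql
-- Grouping by both window_start and window_end keeps the result an
-- append-only stream instead of an update stream, so it can be
-- written to a plain Kafka sink.
SELECT window_start, window_end, a, COUNT(*) AS cnt
FROM TABLE(
  TUMBLE(TABLE my_topic, DESCRIPTOR(event_time), INTERVAL '1' MINUTE))
GROUP BY window_start, window_end, a;
```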

Windowing query with group by produces update stream

2022-12-19 Thread Theodor Wübker
Hey everyone, I would like to run a windowing SQL query with a group-by clause on a Kafka topic and write the result back to Kafka. Right now, the program always says that I am creating an update stream that can only be written to an upsert-Kafka sink. That seems odd to me, because running my g…

Re: Can't use nested attributes as watermarks in Table

2022-12-17 Thread Theodor Wübker
…On 14. Dec 2022, at 09:23, Theodor Wübker wrote: > Actually, this behaviour is documented <https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/create/#create-table> (see the Watermarks section, where it is stated that the column must be a “top-leve…

Re: Can't use nested attributes as watermarks in Table

2022-12-14 Thread Theodor Wübker
…what is the reason for this behaviour? Also, are there any good workarounds for this? Thanks, -Theo > On 14. Dec 2022, at 08:13, Theodor Wübker wrote: > Hey everyone, > I have encountered a problem with my Table API program. I am trying to use a nested attri…

Can't use nested attributes as watermarks in Table

2022-12-13 Thread Theodor Wübker
Hey everyone, I have encountered a problem with my Table API program. I am trying to use a nested attribute as a watermark. My schema is a row which itself has 3 rows as attributes, and these again have some attributes, in particular the timestamp that I want to use as a watermark.
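A common workaround, sketched under assumed names (`payload`, `meta`, `ts` are placeholders, not from the thread): hoist the nested timestamp into a top-level computed column and declare the watermark on that column.

```sql
-- Sketch: field names and the 5-second bound are illustrative.
CREATE TABLE events (
  payload ROW<meta ROW<ts TIMESTAMP(3)>>,
  event_time AS payload.meta.ts,  -- computed top-level column
  WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
) WITH (
  'connector' = 'kafka'
  -- further connector options omitted
);
```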

Re: Table API: Custom Kafka formats for Confluent Protobuf and JSON

2022-11-29 Thread Theodor Wübker
…r Protobuf schema is more or less fixed. > But for a production workload, you'd need to add a Schema Registry lookup. I guess the implementation for that would be similar to what's in the Avro format. > On Tue, Nov 29, 2022 at 2:26 AM Theodor Wübker wrote:…

Table API: Custom Kafka formats for Confluent Protobuf and JSON

2022-11-29 Thread Theodor Wübker
Hey all, Confluent has Kafka serializers for Protobuf, Avro and JSON that create messages with a magic byte, followed by a 4-byte schema id, followed by the actual payload (refer to the docs…
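For Avro, Flink already ships a table format that understands this wire layout (magic byte + 4-byte schema id); a custom Protobuf or JSON format would presumably register itself analogously. A sketch with a placeholder topic and registry URL (table and option values are illustrative):

```sql
-- 'avro-confluent' strips the magic byte and schema id and looks the
-- schema up in the registry; a Protobuf/JSON equivalent would mirror this.
CREATE TABLE t (
  f STRING
) WITH (
  'connector' = 'kafka',
  'topic' = 'my-topic',
  'format' = 'avro-confluent',
  'avro-confluent.url' = 'http://schema-registry:8081'
  -- further connector options omitted
);
```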

Re: Converting ResolvedSchema to JSON and Protobuf Schemas

2022-11-15 Thread Theodor Wübker
…but it isn't a real Java-y type system; it's just more JSON for which there exist validators. > On Thu, Nov 10, 2022 at 2:12 AM Theodor Wübker wrote: > Great, I will have a closer look at what you sent. Your…

Class loading in PbFormatUtils.getDescriptor

2022-11-14 Thread Theodor Wübker
Hey everyone, there is some use of reflection in PbFormatUtils.getDescriptor: namely, it uses the thread’s context ClassLoader to load a protobuf-generated class. First of all, this appears to be a bad practice in general (refer to the first answer in this Stack Overflow…

Re: Converting ResolvedSchema to JSON and Protobuf Schemas

2022-11-09 Thread Theodor Wübker
…ngs here <https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/src/main/java/org/wikimedia/eventutilities/flink/formats/json/DataTypeSchemaConversions.java>. > On Wed, Nov 9, 2022 at 2:…

Re: Converting ResolvedSchema to JSON and Protobuf Schemas

2022-11-09 Thread Theodor Wübker
…nionated. You can see the mappings here <https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/+/refs/heads/master/eventutilities-flink/src/main/java/org/wikimedia/eventutilities/flink/formats/json/DataTypeSchemaConversions.java>.…

Re: Converting ResolvedSchema to JSON and Protobuf Schemas

2022-11-08 Thread Theodor Wübker
Thanks for your reply, Yaroslav! The way I do it with Avro seems similar to what you pointed out: ResolvedSchema resultSchema = resultTable.getResolvedSchema(); DataType type = resultSchema.toSinkRowDataType(); org.apache.avro.Schema converted = AvroSchemaConverter.convertToSchema(type.getLogicalT…

Converting ResolvedSchema to JSON and Protobuf Schemas

2022-11-08 Thread Theodor Wübker
Hello, I have a streaming use case where I execute a query on a Table. I take the ResolvedSchema of the table and convert it to an Avro schema using the AvroSchemaConverter. Now I want to do the same for JSON and Protobuf. However, it seems there is nothing similar to AvroSchemaConverter - I…