Re: Table program cannot be compiled

2019-05-16 Thread shkob1
Hi Timo, thanks for the link. I'm not sure I understand your suggestion, though: is the goal here reducing the number of parameters passed to the UDF? If that's the case I can maybe have the tag names there, but I still need the expressions to be evaluated before entering the eval. Do you see this in a

Re: Table program cannot be compiled

2019-05-14 Thread shkob1
BTW, looking at past posts on this issue [1], it should have been fixed? I'm using version 1.7.2. Also, the recommendation was to use a custom function, though that's exactly what I'm doing with the conditionalArray function [2]. Thanks! [1]

Re: Table program cannot be compiled

2019-05-14 Thread shkob1
In a subsequent run I get Caused by: org.codehaus.janino.JaninoRuntimeException: Code of method "split$3681$(LDataStreamCalcRule$3682;)V" of class "DataStreamCalcRule$3682" grows beyond 64 KB -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Table program cannot be compiled

2019-05-14 Thread shkob1
Hey, while running a SQL query I get an OutOfMemoryError exception and "Table program cannot be compiled" [2]. In my scenario I'm trying to enrich an event using an array of tags; each tag has a boolean classification (like a WHERE clause), and with a custom function I'm filtering the array to

Re: Reconstruct object through partial select query

2019-05-08 Thread shkob1
Just to be clearer on my goal: I'm trying to enrich the incoming stream with some meaningful tags based on conditions on the event itself. So the input stream could carry an event that looks like: class Car { int year; String modelName; } and I will have a config that defines tags as:
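The tag config described here can be sketched in plain Java, with no Flink involved. The Car class comes from the post; the specific tag names and conditions ("vintage", "tesla") are illustrative assumptions, as is the TagEnricher class itself: each tag name maps to a boolean condition, and the enrichment keeps only the tags whose condition holds for the event.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class TagEnricher {
    // Event shape from the post.
    static class Car {
        final int year;
        final String modelName;
        Car(int year, String modelName) { this.year = year; this.modelName = modelName; }
    }

    // Config: tag name -> boolean condition on the event (illustrative entries).
    static final Map<String, Predicate<Car>> TAGS = new LinkedHashMap<>();
    static {
        TAGS.put("vintage", car -> car.year < 1990);
        TAGS.put("tesla", car -> "Model S".equals(car.modelName));
    }

    // Evaluate every condition against the event and keep the matching tag names.
    static List<String> tagsFor(Car car) {
        return TAGS.entrySet().stream()
                .filter(e -> e.getValue().test(car))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

In a Flink job this evaluation would presumably live inside the custom function, with the enriched tag list emitted alongside the original event.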

Reconstruct object through partial select query

2019-05-08 Thread shkob1
Hey, I'm trying to create a SQL query which, given input from a stream with a generic type T, will create a new stream structured as { origin : T, resultOfSomeSQLCalc : Array[String] }. It seems that just by doing "SELECT *" I can convert the resulting table back to a

Re: Calcite SQL Map to Pojo Map

2019-03-28 Thread shkob1
Apparently the solution is to force map creation using a UDF and to have the UDF return Types.GENERIC(Map.class). That makes them compatible, with both treated as a GenericType. Thanks!

Calcite SQL Map to Pojo Map

2019-03-27 Thread shkob1
I'm trying to convert a SQL query that has a select map[..] into a POJO with a Map field (using tableEnv.toRetractStream). It seems to fail when the field's requestedTypeInfo is GenericTypeInfo with GenericType while the field type itself is MapTypeInfo with Map. Exception in thread "main"

Re: Schema Evolution on Dynamic Schema

2019-03-26 Thread shkob1
Sorry to flood this thread, but to keep documenting my experiments: so far I've been retracting to a Row and then mapping to a dynamic POJO that is created (using ByteBuddy) according to the select fields in the SQL. Considering the error, I'm now trying to remove the usage of Row and use the dynamic type

Re: Schema Evolution on Dynamic Schema

2019-03-26 Thread shkob1
Debugging locally, it seems like the state descriptor of "GroupAggregateState" is creating an additional field serializer (a TupleSerializer of SumAccumulator) within the RowSerializer. I'm guessing this is what's causing the incompatibility? Is there any workaround I can do?

Re: Schema Evolution on Dynamic Schema

2019-03-26 Thread shkob1
Hi Fabian, it seems like it didn't work. Let me specify what I have done: I have a SQL query that looks something like: SELECT a, sum(b), map['sum_c', sum(c), 'sum_d', sum(d)] as my_map FROM ... GROUP BY a. As you said, I'm preventing keys from staying in the state forever by using idle state retention time (+ I'm
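For reference, the idle state retention mentioned above is configured roughly like this in the Flink 1.7-era Table API. This is a hedged config fragment, not a complete program: the variables tableEnv and resultTable are assumed to exist, and the retention times are illustrative.

```java
// Assumes a StreamTableEnvironment named tableEnv and a Table named resultTable.
// State for a key is dropped after it has been idle for between the min and
// max retention times, keeping the non-windowed GROUP BY state bounded.
StreamQueryConfig qConfig = tableEnv.queryConfig();
qConfig.withIdleStateRetentionTime(Time.hours(12), Time.hours(24));
tableEnv.toRetractStream(resultTable, Row.class, qConfig);
```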

Re: Map UDF : The Nothing type cannot have a serializer

2019-03-21 Thread shkob1
Looking further into the RowType, it seems like this field is translated as a CURSOR rather than a map. Not sure why.

Map UDF : The Nothing type cannot have a serializer

2019-03-21 Thread shkob1
Hey, as I'm building a SQL query, I'm trying to conditionally build a map such that there won't be any keys with null values in it. AFAIK from Calcite there's no native way to do it (other than using CASE to build the map in different ways, but then I have a lot of key/value pairs, so that's not
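The core of a map-building custom function could look like the following plain-Java sketch. The class name, the buildMap method, and its varargs key/value signature are assumptions for illustration, not an existing Flink API: the point is simply that entries with null values are skipped, so the resulting map never carries a null value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NullFreeMap {
    // Accept alternating key/value pairs; drop any pair whose value is null.
    static Map<String, Object> buildMap(Object... keyValues) {
        if (keyValues.length % 2 != 0) {
            throw new IllegalArgumentException("expected key/value pairs");
        }
        Map<String, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < keyValues.length; i += 2) {
            Object value = keyValues[i + 1];
            if (value != null) {
                map.put((String) keyValues[i], value);
            }
        }
        return map;
    }
}
```

Wrapped in a scalar UDF, this would let the SQL side pass every candidate key/value pair and leave the null filtering to the function instead of a combinatorial CASE expression.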

Re: Schema Evolution on Dynamic Schema

2019-03-08 Thread shkob1
Thanks Rong, I made a quick test changing the SQL select (adding a select field in the middle), reran the job from a savepoint, and it worked without any errors. I want to make sure I understand at what point the state is stored and how it works. Let's simplify the scenario and

Schema Evolution on Dynamic Schema

2019-03-06 Thread shkob1
Hey, my job is built on SQL that is injected as an input to the job, so let's take an example of SELECT a, max(b) as MaxB, max(c) as MaxC FROM registered_table GROUP BY a (side note: in order for the state not to grow indefinitely, I'm transforming to a retracted stream and filtering based on a

Reducing the number of unique metrics

2019-02-26 Thread shkob1
Hey all, I just read the excellent monitoring blog post https://flink.apache.org/news/2019/02/25/monitoring-best-practices.html. I'm looking at reducing the number of unique metrics, especially on items I can compromise on by consolidating, such as using indices instead of ids. Specifically looking at
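One way the "indices instead of ids" idea could work, as a hedged plain-Java sketch (the class and method names are made up for illustration): assign each raw id a small stable index the first time it is seen and build metric names from the index, so the set of metric names stays bounded even when the ids themselves are high-cardinality.

```java
import java.util.HashMap;
import java.util.Map;

public class MetricNameConsolidator {
    // Remembers the index assigned to each raw id, in first-seen order.
    private final Map<String, Integer> indexById = new HashMap<>();

    // Same id always yields the same name; new ids get the next index.
    String metricName(String prefix, String id) {
        int index = indexById.computeIfAbsent(id, k -> indexById.size());
        return prefix + "." + index;
    }
}
```

The trade-off is that the index-to-id mapping has to be logged or exported somewhere else if you ever need to trace a metric back to the original id.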

Re: Linkage error when using DropwizardMeterWrapper

2019-02-14 Thread shkob1
Hey Jayant, I'm getting the same using Gradle. My metrics reporter and my application are both using the flink-metrics-dropwizard dependency for reporting Meters. How should I solve it?

Re: SQL Query named operator exceeds 80 characters

2018-11-29 Thread shkob1
OK, thanks for the help

SQL Query named operator exceeds 80 characters

2018-11-28 Thread shkob1
It seems like the operator name for a SQL GROUP BY is the query string itself. I get "The operator name groupBy: (myGroupField), select: (myGroupField, someOther... )... exceeded the 80 characters length limit and was truncated". Is there a way to name the SQL query operator?

Re: Kinesis Shards and Parallelism

2018-11-16 Thread shkob1
Actually, I was looking at the task manager level. I did have more slots than shards, so it does make sense that I had an idle task manager while other task managers split the shards between their slots. Thanks!

Kinesis Shards and Parallelism

2018-11-12 Thread shkob1
Looking at the doc https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kinesis.html "When the number of shards is larger than the parallelism of the consumer, then each consumer subtask can subscribe to multiple shards; otherwise if the number of shards is smaller than the
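The quoted rule can be illustrated with a simple round-robin assignment in plain Java. Note this is only a model of the cardinality behaviour the docs describe; the real Kinesis connector uses its own shard-to-subtask mapping, and the class and method names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

public class ShardAssignment {
    // Distribute shard indices over consumer subtasks round-robin:
    // more shards than subtasks -> several shards per subtask;
    // fewer shards than subtasks -> some subtasks receive nothing.
    static List<List<Integer>> assign(int shardCount, int parallelism) {
        List<List<Integer>> bySubtask = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            bySubtask.add(new ArrayList<>());
        }
        for (int shard = 0; shard < shardCount; shard++) {
            bySubtask.get(shard % parallelism).add(shard);
        }
        return bySubtask;
    }
}
```

This is why a parallelism higher than the shard count leaves idle subtasks, as observed in the reply above.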

Re: FLINK Kinesis Connector Error - ProvisionedThroughputExceededException

2018-11-09 Thread shkob1
If it's running in parallel, aren't you just adding readers, which maxes out your provisioned throughput? This probably doesn't belong here but is rather a Kinesis thing, but I suggest increasing your number of shards?

Re: Manually clean SQL keyed state

2018-11-09 Thread shkob1
Thanks Fabian!

Manually clean SQL keyed state

2018-11-08 Thread shkob1
I have a scenario in which I do a non-windowed GROUP BY using SQL, something like "SELECT count(*) as events, shouldTrigger(..) as shouldTrigger FROM source GROUP BY sessionId". I'm then converting to a retracted stream, filtering by "add" messages, then further filtering by the "shouldTrigger" field
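The add/retract filtering step can be modelled in plain Java. This mimics the Tuple2<Boolean, Row> shape of Flink's retracted stream, where the boolean flag marks an add (true) versus a retract (false) message; the Message class and method names here are illustrative, not Flink API.

```java
import java.util.List;
import java.util.stream.Collectors;

public class RetractFilter {
    static class Message {
        final boolean add;           // like Tuple2.f0: true = add, false = retract
        final boolean shouldTrigger; // the flag computed by the query
        Message(boolean add, boolean shouldTrigger) {
            this.add = add;
            this.shouldTrigger = shouldTrigger;
        }
    }

    // Keep only additions whose shouldTrigger flag is set, mirroring the
    // two filter steps described in the post.
    static List<Message> triggered(List<Message> stream) {
        return stream.stream()
                .filter(m -> m.add)
                .filter(m -> m.shouldTrigger)
                .collect(Collectors.toList());
    }
}
```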

Re: Custom Trigger + SQL Pattern

2018-10-26 Thread shkob1
Following up on the actual question: is there a way to register a KeyedStream as table(s) and have a trigger per key?

Re: Dynamically Generated Classes - Cannot load user class

2018-10-24 Thread shkob1
OK, I think I figured it out, though I'm not sure exactly why: it seems that I need to have the stream typed as a GenericType of the super class rather than a POJO of the concrete generated class. It seems like the operator definition otherwise cannot load the POJO class on task creation. So

Re: Dynamically Generated Classes - Cannot load user class

2018-10-23 Thread shkob1
Update on this: if I just do an empty mapping and drop the SQL part, it works just fine. I wonder if there's any class loading that needs to be done when using SQL; not sure how I'd do that.

Re: Dynamically Generated Classes - Cannot load user class

2018-10-23 Thread shkob1
After removing some operators (which I still need, but I wanted to understand where my issues are) I get a slightly different stacktrace (though still the same issue). My current operators are: 1. a SQL select with GROUP BY (returns a retracted stream), 2. filter (take only non-retracted), 3. map (tuple

Dynamically Generated Classes - Cannot load user class

2018-10-22 Thread shkob1
Hey, I'm trying to run a job which uses a dynamically generated class (through Byte Buddy). Think of me having a complex schema as YAML text and generating a class from it. Throughout the job I am using an artificial super class (MySuperClass) of the generated class (as, for example, I need to

Fire and Purge with Idle State

2018-10-12 Thread shkob1
Hey, say I'm aggregating an event stream by sessionId in SQL and I'm emitting the results once the session is "over". I guess I should be using fire and purge, since I don't expect to need the session data once it's over. How should I treat the idle state retention time? Is it needed at all if I'm using purge?

Custom Trigger + SQL Pattern

2018-10-12 Thread shkob1
Hey! I have a use case in which I'm grouping a stream by session id. So far pretty standard; note that I need to do it through SQL and not via the Table API. In my use case I have 2 trigger conditions though: while one is time (session inactivity), the other is based on a specific event marked as