To be clear, I want the outer grouping to have a longer retention time (on
the order of a week or a month, for which we are using 'idle state retention
time') and the inner grouping to have a shorter retention period (1 hour max).
So I am hoping the session window will do the right thing.
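For context, a minimal sketch of how the long retention could be configured
(the 30/37-day bounds below are illustrative, not our actual values):

import org.apache.flink.api.common.time.Time;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.StreamQueryConfig;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class RetentionSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

        // Idle state retention for the long-lived outer grouping: state for
        // a key is cleared once the key has been idle for between the min
        // and max time below.
        StreamQueryConfig qConfig = tEnv.queryConfig();
        qConfig.withIdleStateRetentionTime(Time.days(30), Time.days(37));
    }
}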
Thanks,
Vinod
On
Hi, yuvraj singh
The possible reason may be that you have reached the Linux system limit on
max user processes. You can confirm this with the "ulimit -a" command, and
"ulimit -u 32000" can be used to override the default value.
Please make sure the user you are running the above commands
Thanks a lot, Fabian, for the detailed response. I know all the duplicates
are going to arrive within an hour at most of the actual event, so using a
1-hour session window should be fine for me.
Is the following the right way to do it in Apache Flink 1.4.2?
SELECT
CONCAT_WS(
'-',
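Roughly, the full shape I have in mind is the sketch below (table and
column names are made up, and I have not verified this against 1.4.2):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class SessionDedupSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

        // Assumes a registered table 'events' with columns user_id and
        // event_id and an event-time attribute 'rowtime'. Since all
        // duplicates arrive within 1 hour, a 1-hour session gap collapses
        // them into one row per key.
        Table deduped = tEnv.sqlQuery(
            "SELECT " +
            "  CONCAT_WS('-', user_id, event_id) AS dedup_key, " +
            "  COUNT(*) AS occurrences, " +
            "  SESSION_END(rowtime, INTERVAL '1' HOUR) AS session_end " +
            "FROM events " +
            "GROUP BY SESSION(rowtime, INTERVAL '1' HOUR), user_id, event_id");
    }
}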
Based on where this line of code is, it is hard to get the full stack
trace; the call
LOG.error("Cannot update accumulators for job {}.", getJobID(), e);
does not get us the full stack trace.
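One option might be to stringify the trace explicitly, e.g. with Flink's
ExceptionUtils (a sketch, not what the current code does):

import org.apache.flink.util.ExceptionUtils;

// Force the complete stack trace into the log message itself instead of
// relying on the logger to append the Throwable.
LOG.error("Cannot update accumulators for job {}. Trace: {}",
    getJobID(), ExceptionUtils.stringifyException(e));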
Though it is true that the Accumulator did not have a serialVersionUID. I
would double-check. I
It's hard to tell without more info.
From the method that threw the exception, it looked like it was trying to
deserialize the accumulator. By any chance did you change your
accumulator class but forget to update the serialVersionUID? Just
wondering if it is trying to deserialize to a
The above is Flink 1.8.
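For illustration, a minimal custom accumulator with an explicitly pinned
serialVersionUID (a hypothetical class, just to show the shape):

import org.apache.flink.api.common.accumulators.Accumulator;

public class MaxValueAccumulator implements Accumulator<Long, Long> {
    // Pin this explicitly. If the class evolves while the UID stays the
    // same, fields added later deserialize to null/defaults from old
    // serialized instances, which can surface as NullPointerExceptions.
    private static final long serialVersionUID = 1L;

    private long max = Long.MIN_VALUE;

    @Override
    public void add(Long value) {
        max = Math.max(max, value);
    }

    @Override
    public Long getLocalValue() {
        return max;
    }

    @Override
    public void resetLocal() {
        max = Long.MIN_VALUE;
    }

    @Override
    public void merge(Accumulator<Long, Long> other) {
        max = Math.max(max, other.getLocalValue());
    }

    @Override
    public Accumulator<Long, Long> clone() {
        MaxValueAccumulator copy = new MaxValueAccumulator();
        copy.max = max;
        return copy;
    }
}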
On Tue, Jun 4, 2019 at 12:32 PM Vishal Santoshi
wrote:
> I had a sequence of events that created this issue.
>
> * I started a job and I had state.checkpoints.num-retained: 5
>
> * As expected, I have the 5 latest checkpoints retained in my HDFS backend.
>
> * JM dies (
I see tons of
org.apache.flink.runtime.executiongraph.ExecutionGraph - Cannot
update accumulators for job 7bfe57bb0ed1c5c2f4f40c2fccaab50d.
java.lang.NullPointerException
Hi Flink users,
When the # of Kafka consumers = # of partitions and I use setParallelism(>1),
something like this:
'messageStream.rebalance().map(lambda).setParallelism(3).print()'
How do I tune the # of outstanding uncommitted offsets? Something similar to
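For reference, a self-contained sketch of that pipeline shape (topic,
servers and the map function are placeholders; the consumer class name
varies by connector version, e.g. FlinkKafkaConsumer011 on older ones):

import java.util.Properties;

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class RebalanceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "my-group");                // placeholder

        env.addSource(new FlinkKafkaConsumer<>(
                "my-topic", new SimpleStringSchema(), props))
           .rebalance() // redistribute round-robin across the map subtasks
           .map(new MapFunction<String, String>() {
               @Override
               public String map(String value) { // stand-in for the lambda
                   return value.toLowerCase();
               }
           })
           .setParallelism(3)
           .print();

        env.execute("rebalance sketch");
    }
}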
Hi Jingsong,
Thanks for getting back. I’ll try and hook up the UDTF.
I added a unit test which catches the issue I'm running into (I tested this
against Flink 1.6, which is what we're running, as well as the latest
master). Did you have to do anything in particular to hook up the type
correctly?
I had a sequence of events that created this issue.
* I started a job and I had state.checkpoints.num-retained: 5
* As expected, I have the 5 latest checkpoints retained in my HDFS backend.
* JM dies (K8s limit, etc.) without cleaning the HDFS directory. The K8s
job restores from the latest
And it works now.
My mistake
Boris Lublinsky
FDP Architect
boris.lublin...@lightbend.com
https://www.lightbend.com/
> On Jun 3, 2019, at 10:18 PM, Xintong Song wrote:
>
> If that is the case, then I would suggest you check the following two
> things:
> 1. Is the HA mode configured properly
Yes, interactive programming solves the problem by storing the meta
information on the client, whereas in the past we considered keeping the
information on the JM. But this would then not allow sharing results
between different clusters. Thus, the interactive programming approach is a
bit
I looked into this briefly a while ago out of interest and read about
how Beam handles this. I've never actually implemented it, but the concept
sounds reasonable to me.
What I read from their code is that Beam exports the BigQuery data to
Google Storage. This export shards the data into files with
You can use the DataSet API to parse files from S3.
https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/batch/#data-sources
https://ci.apache.org/projects/flink/flink-docs-stable/ops/deployment/aws.html#s3-simple-storage-service
And then parse it and send it to Kafka.
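For example, a minimal batch read might look like this sketch (bucket and
path are placeholders; an S3 filesystem and credentials must be configured
as described in the AWS documentation linked above):

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;

public class S3ReadSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Reads every line of every file under the given S3 prefix.
        DataSet<String> lines = env.readTextFile("s3://my-bucket/input/");

        // Stand-in for real parsing and the eventual write to Kafka.
        lines.first(10).print();
    }
}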
On Tue,
Hi all,
I am having a problem: I was running a job, then I submitted one more job
on the same cluster, and my old job started failing with
2019-06-04 15:12:11,593 ERROR
org.apache.flink.yarn.YarnResourceManager - Could not start
TaskManager in container.
java.lang.OutOfMemoryError: unable
The JIRA issue is https://issues.apache.org/jira/browse/FLINK-12544 and you
can find the PR link in it.
--
From: Erai, Rahul
Send Time: Tuesday, June 4, 2019 18:19
To: zhijiang; Aljoscha Krettek; Piotr Nowojski;
"Narayanaswamy, Krishna"
Thanks for the reply, @Till Rohrmann. Regarding reusing computed results:
I think the JM keeps all the metadata of intermediate data, and interactive
programming is also trying to reuse computed results. It looks like it may
not be necessary to introduce the session concept as long as we can achieve
Hi Jeff,
the session functionality which you find in Flink's client is the remnant
of an uncompleted feature which was abandoned. The idea was that one could
submit multiple parts of a job to the same cluster, where these parts are
added to the same ExecutionGraph. That way we wanted to allow
Yes, it is the same case as multiple slots in a TM. The source task and
co-group task are still in the same TM in this case. I think you might have
slot sharing enabled, so they are still running in the same slot in one TM.
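If you want to separate them, one option is an explicit slot sharing group
on the downstream operator, e.g. this sketch (the group name is arbitrary):

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SlotSharingSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment();

        env.generateSequence(0, 1000)
           .map(new MapFunction<Long, Long>() {
               @Override
               public Long map(Long value) {
                   return value * 2;
               }
           })
           // Moves this operator (and downstream ones) out of the source's
           // "default" slot sharing group, so it no longer shares a slot
           // with the source.
           .slotSharingGroup("downstream")
           .print();

        env.execute("slot sharing sketch");
    }
}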
BTW, the previous deadlock issue is already fixed on my side, and it is waiting
Hi @Piyush Narang,
I tried again: if the type of advertiser_event.products is derived correctly
(ObjectTypeInfo(RowTypeInfo(fields...))), it will work. See more information
in the Calcite code:
SqlUnnestOperator.inferReturnType
So I think maybe your type is not being passed to the engine correctly.
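For comparison, a rough sketch of constructing such an array-of-rows type
(field names are made up; using ObjectArrayTypeInfo here):

import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.typeutils.ObjectArrayTypeInfo;
import org.apache.flink.api.java.typeutils.RowTypeInfo;

public class ProductsTypeSketch {
    public static void main(String[] args) {
        // Element type of the nested collection: a Row with named fields.
        RowTypeInfo productType = new RowTypeInfo(
            new TypeInformation<?>[] {Types.STRING, Types.LONG},
            new String[] {"name", "price"});

        // Array-of-rows type for the 'products' column, so UNNEST can
        // derive the element row type.
        TypeInformation<?> productsType =
            ObjectArrayTypeInfo.getInfoFor(productType);

        System.out.println(productsType);
    }
}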
Best,