Hi chrisr,
> *Is there a way to retrieve this from the query as a long instead?*
You have to convert the timestamp type to the long type. It seems there is no
built-in UDF to convert a timestamp to a Unix timestamp, but you can write
one and use it in your SQL.
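For example, a minimal sketch of such a UDF (the ToUnixTimestamp class and the
registration name below are made up for illustration, not built-ins):

import java.sql.Timestamp;

import org.apache.flink.table.functions.ScalarFunction;

// Hypothetical UDF (name made up): converts a SQL TIMESTAMP to epoch millis.
public class ToUnixTimestamp extends ScalarFunction {
    public long eval(Timestamp ts) {
        return ts.getTime();
    }
}

After registering it, e.g. tableEnv.registerFunction("toUnixTimestamp", new
ToUnixTimestamp()), you can call it directly in the query, e.g.
SELECT toUnixTimestamp(rowtimeCol) FROM MyTable (column and table names are
placeholders).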
> *Question: how do I express that I
Hi,
It should be possible to deploy a single Flink cluster across
geo-distributed nodes, but Flink currently offers no optimization for such
a specific use case.
AFAIK, the general pattern for dealing with geographically distributed data
sources right now would be to replicate data across
Hi Giriraj,
The fact that the Flink Kafka Consumer doesn't use the group.id property is
expected behavior.
Since the partition-to-subtask assignment of the Flink Kafka Consumer needs
to be deterministic in Flink, the consumer uses static assignment instead
of the more high-level consumer
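To illustrate what static assignment means here (an illustrative sketch only,
not Flink's actual internal code): the target subtask is a pure function of
the partition and the parallelism, so Kafka's group coordination and group.id
never come into play.

// Illustrative only: Flink's real assignment logic differs in detail, but the
// key property is the same: the target subtask is determined solely by the
// partition and the parallelism, so Kafka group rebalancing plays no role.
public final class StaticAssignmentSketch {

    static int subtaskFor(int kafkaPartition, int numParallelSubtasks) {
        return kafkaPartition % numParallelSubtasks;
    }

    public static void main(String[] args) {
        int parallelism = 3;
        for (int partition = 0; partition < 6; partition++) {
            System.out.println("partition " + partition
                + " -> subtask " + subtaskFor(partition, parallelism));
        }
    }
}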
Hi Fabian,
Finally I have time to read the code, and it is brilliant; it does provide an
exactly-once guarantee.
However, I still suggest adding a function that can close a file when a
checkpoint is made. I noticed that there is an enhancement
https://issues.apache.org/jira/browse/FLINK-9138 which can
I am trying to understand how to use streaming SQL, very similar to the
example from the documentation: count the number of page clicks in a certain
period of time for each user.
I'm trying to solve the problem using both the SQL API and the Table API.
My input sample stream looks like this:
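For reference, a minimal sketch along the lines of the documentation example,
assuming the stream is registered as a table Clicks with a userId field and an
event-time attribute clickTime (table and field names are placeholders, not my
actual schema):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class ClickCountSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

        // Assumes a table "Clicks" was registered beforehand with an
        // event-time attribute clickTime.
        Table result = tEnv.sqlQuery(
            "SELECT userId, TUMBLE_END(clickTime, INTERVAL '1' HOUR) AS windowEnd, COUNT(*) AS clicks " +
            "FROM Clicks " +
            "GROUP BY userId, TUMBLE(clickTime, INTERVAL '1' HOUR)");
    }
}

The Table API variant would express the same windowing with a Tumble window.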
Thanks! I did not see the znode and thus did not paste the ls output... anyway,
will get you the full JM log ASAP
On Thu, Jun 28, 2018, 5:35 PM Gary Yao wrote:
> Hi Vishal,
>
> The znode /flink_test/da_15/leader/rest_server_lock should exist as long
> as your
> Flink 1.5 cluster is running. In 1.4
Hi Vishal,
The znode /flink_test/da_15/leader/rest_server_lock should exist as long as
your Flink 1.5 cluster is running. In 1.4 this znode will not be created. Are
you sure that the znode does not exist? Unfortunately you only attached the
output of "ls /flink_test/da_15".
Can you share the
Am I the only one that feels the config should be renamed or the docs on it
expanded? Turning on object reuse doesn't really reuse objects, not in the
sense that an object can be reused for different values / messages /
records. Instead, it stops Flink from making copies of a record, by
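For reference, a small sketch of where the switch lives and what it actually
buys you (usage only, nothing beyond the semantics described above):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ObjectReuseExample {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // "Object reuse" means Flink may hand the same record instance to the
        // next chained operator instead of making a defensive copy, so user
        // functions must not cache or mutate the records they receive.
        env.getConfig().enableObjectReuse();
    }
}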
Fabian, All,
Along this same line, we have a data source where we have a parent key and a
child key. We need to first keyBy parent and then by child. If we want
physical partitioning to happen first by the parent key, with grouping
localized by the child key, is
I'm writing a custom S3 source in order to work around some issues with
back pressure and checkpointing at scale in my bootstrap logic. I moved
around the logic to assign timestamps and watermarks. As part of that I
ended up generating watermarks earlier in the pipeline but having another
operator
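A hedged sketch of assigning timestamps and watermarks early in the pipeline
(the MyEvent type and its timestamp field are placeholders, not taken from the
original job):

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WatermarkSketch {
    // Placeholder event type with an epoch-millisecond timestamp field.
    public static class MyEvent {
        public long timestamp;
    }

    public static DataStream<MyEvent> withTimestamps(DataStream<MyEvent> source) {
        // Emit watermarks that trail the highest seen timestamp by 10 seconds.
        return source.assignTimestampsAndWatermarks(
            new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(10)) {
                @Override
                public long extractTimestamp(MyEvent event) {
                    return event.timestamp;
                }
            });
    }
}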
I am not seeing rest_server_lock. Is it transient (an ephemeral znode) for
the duration of the CLI command?
[zk: localhost:2181(CONNECTED) 2] ls /flink_test/da_15
[jobgraphs, leader, checkpoints, leaderlatch, checkpoint-counter]
The logs say
2018-06-28 14:02:56 INFO
Hi Vijay,
Flink does not provide fine-grained control over placing keys on certain slots
or machines.
When specifying a key, it is up to Flink (i.e., its internal hash function)
where the data is processed. This works well for large key spaces, but can
be difficult if you have only a few keys.
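To illustrate why (a toy sketch only, not Flink's actual internals): the target
subtask is derived from a hash of the key, so with only a few distinct keys
some subtasks can end up with several keys and others with none.

// Toy sketch only: Flink's real routing goes through hashed key groups, but
// the consequence is the same as in this simplified mapping.
public final class KeySkewSketch {

    static int subtaskForKey(Object key, int parallelism) {
        return Math.abs(key.hashCode() % parallelism);
    }

    public static void main(String[] args) {
        int parallelism = 4;
        for (String key : new String[] {"A", "B", "C"}) {
            System.out.println("key " + key + " -> subtask "
                + subtaskForKey(key, parallelism));
        }
    }
}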
So,
Hi Osh,
You can certainly apply multiple reduce functions on a DataSet; however, you
should make sure that the data is only partitioned and sorted once.
Moreover, you would end up with multiple data sets that you need to join
afterwards.
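A hedged sketch of that "several aggregates, then join" shape on the DataSet
API (the field layout and the particular aggregates are placeholders):

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

public class MultiAggregateSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        DataSet<Tuple2<String, Integer>> input = env.fromElements(
            Tuple2.of("a", 1), Tuple2.of("a", 5), Tuple2.of("b", 3));

        // Two independent aggregates over the same keyed data ...
        DataSet<Tuple2<String, Integer>> sums = input.groupBy(0).sum(1);
        DataSet<Tuple2<String, Integer>> maxes = input.groupBy(0).max(1);

        // ... which then have to be joined back together on the key.
        sums.join(maxes).where(0).equalTo(0).print();
    }
}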
I think the easier approach is to wrap your functions in a
In a nutshell the Over operator works as follows:
- When a row arrives it is put into a MapState keyed on its timestamp and a
timer is registered to process it when the watermark passes that timestamp.
- All the heavy computation is done in the onTimer() method. For each
unique timestamp, the Over
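To make the shape of that pattern concrete, here is a hedged sketch (not the
actual Over-operator code) that buffers rows in MapState and drains them from
onTimer() once the watermark has passed; it must run on a keyed stream:

import java.util.ArrayList;
import java.util.List;

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

// Hedged sketch of the buffer-until-watermark pattern, using String events for
// brevity. Keyed state and timers require it to be applied after a keyBy.
public class BufferUntilWatermark extends ProcessFunction<String, String> {

    private transient MapState<Long, List<String>> rowsByTimestamp;

    @Override
    public void open(Configuration parameters) {
        rowsByTimestamp = getRuntimeContext().getMapState(
            new MapStateDescriptor<>("rowsByTs", Types.LONG, Types.LIST(Types.STRING)));
    }

    @Override
    public void processElement(String row, Context ctx, Collector<String> out) throws Exception {
        long ts = ctx.timestamp();                      // event-time timestamp of the row
        List<String> rows = rowsByTimestamp.get(ts);
        if (rows == null) {
            rows = new ArrayList<>();
        }
        rows.add(row);
        rowsByTimestamp.put(ts, rows);
        ctx.timerService().registerEventTimeTimer(ts);  // fires once the watermark passes ts
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        // The per-timestamp heavy lifting would happen here; this sketch just
        // emits the buffered rows and clears them from state.
        for (String row : rowsByTimestamp.get(timestamp)) {
            out.collect(row);
        }
        rowsByTimestamp.remove(timestamp);
    }
}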
It seems that the Flink Kafka consumer doesn't honor group.id while consumers
are added dynamically. Let's say I have some Flink Kafka consumers reading from
a topic and I dynamically add some new Flink Kafka consumers with the same
group.id; Kafka messages are getting duplicated to existing as well as new
Hi aitozi,
1> how will sql translated into a datastream job?
The Table API and SQL leverage Apache Calcite for parsing, validation, and
query optimization. After optimization, the logical plan of the job will be
translated into a datastream job. The logical plan contains many different
logical
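One way to see that translation for yourself is explain(); a small sketch (the
Clicks table and userId field below are placeholders) that prints the abstract
syntax tree, the optimized logical plan, and the physical execution plan:

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;
import org.apache.flink.table.api.java.StreamTableEnvironment;

public class ExplainSketch {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = TableEnvironment.getTableEnvironment(env);

        // Assumes a table "Clicks" with a userId field was registered beforehand.
        Table counts = tEnv.sqlQuery("SELECT userId, COUNT(*) FROM Clicks GROUP BY userId");

        // Prints the plans the query goes through before becoming a DataStream job.
        System.out.println(tEnv.explain(counts));
    }
}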
Hi
I was trying to set up checkpointing on Google Cloud Storage with Flink on
Kubernetes, but was facing issues with the Google Cloud Storage connector
classes not loading, even though in the logs I can see it being included in
the classpath.
Logs showing the classpath: https://pastebin.com/R1P7Eepz
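For context, a hedged sketch of how the job points checkpoints at GCS (the
bucket name is a placeholder); it still relies on the GCS connector and its
Hadoop configuration being visible to Flink's filesystem loading, which is the
part that is failing here:

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class GcsCheckpointSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 60 seconds to a GCS bucket (placeholder name).
        env.enableCheckpointing(60_000);
        // The gs:// scheme is served by the GCS Hadoop connector, which must be
        // on the classpath and wired up via its Hadoop config to resolve.
        env.setStateBackend(new FsStateBackend("gs://my-bucket/flink-checkpoints"));
    }
}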
Hello,
we were playing around with Flink 1.5 - so far so good.
The only thing that we are missing is the web history setup.
In Flink 1.4 and before we were using the *web.history* config to hold 100 jobs.
With Flink 1.5 we can see that history is limited to 1 hour only. Is it
possible to somehow
Chesnay,
Do you have rough idea of the 1.5.1 timeline?
Thanks,
--
Christophe
On Mon, Jun 25, 2018 at 4:22 PM, Chesnay Schepler wrote:
> The watermark issue is known and will be fixed in 1.5.1
>
>
> On 25.06.2018 15:03, Vishal Santoshi wrote:
>
> Thank you
>
> One addition
>
> I do not see
Hi, all
Thanks for your reply.
1. Can I ask how SQL like the below is transformed into a low-level DataStream
job?
2. If I implement a distinct in a DataStream job, and there is no keyBy needed
in advance, and we just calculate the global distinct count, can I just use
the AllWindowedStream or