Hi Nico
I think there were changes in the default port fort the BLOB server. I missed
the fact that the Kubernetes configuration was still exposing 6124 for the
JobManager BLOB server.
Thanks
Bernd
-Ursprüngliche Nachricht-
Von: Nico Kruber [mailto:n...@data-artisans.com]
Gesendet:
It seems that this has to do with session windows tbat are mergeable ? I
tried the RixhWindow function and that seems to suggest that one cannot use
state ? Any ideas folks...
On Dec 1, 2017 10:38 AM, "Vishal Santoshi"
wrote:
> I have a simple Aggregation with one
Hello Fabian,
Thanks for the help.
I am interested in the duration of specific operators, so the fact that
parts of the execution are in pipeline is not a problem for me.
>From my understanding, the automated way to approach this is to run the
Flink job with the web interface active and then make
Hi Timo
I execute streaming job without checkpointing and I don't configure any
state backend, so it may be "MemoryStateBackend".
Actually, my streaming app just reads data from kafka and writes it to
an external DB. Its not so complicated.
Regards,
Yuta
On 2017/12/05 19:55, Timo Walther
Hi,
Flink's operators are designed to work in memory as long as possible and
spill to disk once the memory budget is exceeded.
Moreover, Flink aims to run programs in a pipelined fashion, such that
multiple operators can process data at the same time.
This behavior can make it a bit tricky to
I have been moving some old MR and hive workflows into Flink because I'm
enjoying the api's and the ease of development is wonderful. Things have
largely worked great until I tried to really scale some of the jobs
recently.
I have for example one etl job that reads in about 12B records at a time
Hi Stephan,
I am facing S3 consistency related issue with the exception pasted at the
end:
We were able to solve the s3 sync issue by adding System.currentTime to
inprogressPrefix, inprogressSuffix, s3PendingPrefix and s3PendingSuffix
properties of BucketingSink.
I tried another approach by
Hi Stephan,
I am facing S3 consistency related issue with the exception pasted at the
end:
We were able to solve the s3 sync issue by adding System.currentTime to
inprogressPrefix, inprogressSuffix, s3PendingPrefix and s3PendingSuffix
properties of BucketingSink.
I tried another approach by
Hi Tovi,
you are right, it is difficult to check the correct behavior.
@Chesnay: Do you know if we can get this information? If not through the
Web UI, maybe via REST? Do we have access to the full ExecutionGraph
somewhere?
Otherwise it might make sense to open an issue for this.
Regards,
Thanks, Timo.
Either works for me.
On Tue, Dec 5, 2017 at 4:55 PM, Timo Walther wrote:
> Hi Shailesh,
>
> sharing state across operators is not possible. However, you could emit
> the state (or parts of it) as a stream element to downstream operators by
> having a
Hi all,
I am trying to use the slot group feature, by having 'default' group and
additional 'market' group.
The purpose is to divide the resources equally between two sources and their
following operators.
I've set the slotGroup on the source of the market data.
Can I assume that all following
Hi Shailesh,
sharing state across operators is not possible. However, you could emit
the state (or parts of it) as a stream element to downstream operators
by having a function that emits a type like
"Either".
Another option would be to use side outputs to send state to
It would be the same as with any other form of async checkpointing. No
direct blocking of processing but the network traffic might indirectly
affect it to some extent :)
Jayant Ameta ezt írta (időpont: 2017. dec. 5., K,
12:15):
> If the checkpointing to Ceph happens
If the checkpointing to Ceph happens asynchronously, does it still have any
impact on the stream processing?
Jayant Ameta
On Tue, Dec 5, 2017 at 4:34 PM, Gyula Fóra wrote:
> Hi,
>
> To my understanding Ceph as in http://ceph.com/ceph-storage/ is a block
> based object
Missed one point - I'm using Managed Operator state (and not Keyed state -
as my data streams are not keyed).
On Tue, Dec 5, 2017 at 4:28 PM, Shailesh Jain
wrote:
> Hi,
>
> Is it possible to share state across operators in Flink?
>
> I have CoFlatMap operator which
Hi,
To my understanding Ceph as in http://ceph.com/ceph-storage/ is a block
based object storage system. You can use it mounted to your server and will
behave as a local file system to most extent but will be shared in the
cluster.
The performance might not be as good as with HDFS to our
Hi,
Flink documents suggests that Ceph can be used as a persistent storage for
states.
https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/checkpointing.html
Considering that Ceph is a transactional database, wouldn't it have adverse
effect on Flink's performance?
Hi,
Is it possible to share state across operators in Flink?
I have CoFlatMap operator which maintains a ListState and returns a
DataStream. And downstream there is a KafkaSink operator for the same
DataStream which needs to access the ListState.
Thanks,
Shailesh
I had some profiling tool like jvisualvm in mind. Are you executing
streaming or batch jobs? If streaming, is checkpointing enabled and
which type of statebackend?
@Chesnay do you have experience with slow behavior of the Web UI?
Regards,
Timo
Am 12/5/17 um 10:37 AM schrieb Yuta Morisawa:
Hi,
I haven't done that before either. The query API will change with the next
version (Flink 1.4.0) which is currently being prepared for releasing.
Kostas (in CC) might be able to help you.
Best, Fabian
2017-12-05 9:52 GMT+01:00 m@xi :
> Hi Fabian,
>
> Thanks for your
Hi Tao,
computing a group window requires that the event-time timestamp of the
DataStream is exposed as a time attribute (in your case as an event time
attribute).
If you register DataStream at the TableEnvironment, this has to be done in
two steps:
1) assign timestamps and watermarks to the
Hi Yuta,
as far as I know you cannot assign more cores to a JobManager.
Can you tell us a bit more about your environment? How many jobs does
the JobManager has to manage? How much heap memory is assigned to the
JobManager?
Maybe you can use a profiler and find out which component consumes
To subscribe to this mailing list, please send a mail to
user-subscr...@flink.apache.org .
On 05.12.2017 08:21, Xin Wang wrote:
hello flink
--
Thanks,
Xin
23 matches
Mail list logo