Cardinality and size estimation are fundamental requirements for cost-based
query optimization.
I hope we will work on this at some point but right now it is not on the
roadmap.
In case of very complex plans, it might make sense to write an intermediate
result to persistent storage and start
Are there any plans for this in the future? I could look at the plans, but
without these stats I am a bit lost about what to look for, such as where the
pain points are. I can see some very obvious things, but not much more with these plans.
My question is: is there a guide or document which describes what your plans
No, there is no size or cardinality estimation happening at the moment.
Best, Fabian
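Since the plan itself carries no size information, one rough workaround is to look at the disk directly. The following is only a sketch and not part of Flink: the class name TempDirSpace and its two methods are made up for illustration, and it assumes you point it at whichever directory your TaskManagers spill to (it falls back to the JVM's java.io.tmpdir if no argument is given). It reports how many bytes the files under that directory currently occupy and how much space is still usable on the same file system.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical helper, not a Flink API: inspects a directory on local disk.
public class TempDirSpace {

    /** Sums the sizes of all regular files below dir, skipping unreadable entries. */
    public static long usedBytes(Path dir) throws IOException {
        final long[] total = {0L};
        Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                if (attrs.isRegularFile()) {
                    total[0] += attrs.size();
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFileFailed(Path file, IOException exc) {
                // File vanished or is not readable: ignore it and keep scanning.
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path d, IOException exc) {
                // Do not abort the scan if listing a directory failed part-way.
                return FileVisitResult.CONTINUE;
            }
        });
        return total[0];
    }

    /** Bytes still available to this JVM on the file system that holds dir. */
    public static long usableBytes(Path dir) throws IOException {
        return Files.getFileStore(dir).getUsableSpace();
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the directory of interest is passed as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.printf("%s: used=%d bytes, usable=%d bytes%n",
                dir, usedBytes(dir), usableBytes(dir));
    }
}
```

Running this periodically on each worker while the job executes gives a coarse picture of which phase fills the disk. It cannot attribute the space to an individual task or operator.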
2018-02-19 21:56 GMT+01:00 Darshan Singh:
> Thanks, is there a metric or other way to know how much space each
> task/job is taking? Does the execution plan have these details?
>
> Thanks
>
Thanks, is there a metric or other way to know how much space each
task/job is taking? Does the execution plan have these details?
Thanks
On Mon, Feb 19, 2018 at 10:54 AM, Fabian Hueske wrote:
> Hi,
>
> that's a difficult question without knowing the details of your job.
> A
Hi,
that's a difficult question without knowing the details of your job.
A NoSpaceLeftOnDevice error occurs when a file system is full.
This can happen if:
- A Flink algorithm writes to disk, e.g., an external sort or the hash
table of a hybrid hash join. This can happen for GroupBy, Join,
Thanks, Fabian, for such a detailed explanation.
I am using a DataSet in between, so I guess the CSV is read once. Now to my real
issue: I have 6 task managers, each having 4 cores, and I have 2 slots per
task manager.
Now my CSV file is just 1 GB, and I create a table, transform it to a DataSet, and
then run 15
Hi,
this works as follows.
- Table API and SQL queries are translated into regular DataSet jobs
(assuming you are running in a batch ExecutionEnvironment).
- A query is translated into a sequence of DataSet operators when you 1)
transform the Table into a DataSet or 2) write it to a TableSink.
Thanks for the reply.
I guess I am not looking for an alternative. I am trying to understand what
Flink does in this scenario, and if 10 tasks are going in parallel I am sure
they will be reading the CSV, as there is no other way.
Thanks
On Mon, Feb 19, 2018 at 12:48 AM, Niclas Hedhman