invalidate metadata behaviour

2017-11-28 Thread Antoni Ivanov
Hi, I am wondering: if I run INVALIDATE METADATA for the whole database on node1 and then run a query on node2, would the query on node2 use the cached metadata for the tables, or would it know that the metadata has been invalidated? Second, how safe is it to run it for a database with many (say 30) tables over

Difference between LOAD DATA and refresh

2018-01-08 Thread Antoni Ivanov
Hi, We are wondering if we can reduce the impact of https://issues.apache.org/jira/browse/IMPALA-5058. Currently we use "insert statements using spark" and then run refresh on partition x. Now we are thinking of using the LOAD DATA statement directly. I imagine LOAD DATA doesn't need to communicate

Does Impala support or plan to support Late Materialization

2018-03-20 Thread Antoni Ivanov
I don't mean partition pruning, but the technique described in https://aws.amazon.com/about-aws/whats-new/2017/12/amazon-redshift-introduces-late-materialization-for-faster-query-processing/ It basically fetches the filter columns first and then, after applying the filter, fetches only the data from

Query status "Session Closed"

2019-08-05 Thread Antoni Ivanov
Hi, I am investigating the most common errors we see in our Impala cluster. The most common one has query status = 'Session Closed'. I can see from the code (https://github.com/apache/impala/blob/72c9370856d7436885adbee3e8da7e7d9336df15/be/src/service/impala-server.cc#L1435) that it is set when

Parsing the Query execution plan (profile and summary)

2019-08-06 Thread Antoni Ivanov
Hi, We'd like to parse the query execution plan after queries have completed, for telemetry purposes. We'd like to have better visibility into how queries behave. For example, you can see per-node utilization in the query profile. E.g

RE: How to parse a query plan /summary/profile

2019-08-08 Thread Antoni Ivanov
the user group. -Antoni From: Antoni Ivanov Sent: Wednesday, August 7, 2019 10:13 AM To: user@impala.apache.org Cc: dev@impala ; Jenny Kwan (c) Subject: How to parse a query plan /summary/profile Hi, We'd like to get better visibility into the way our Impala cluster is used. For example there's

RE: Generating a fixed size parquet file when doing Insert select *

2020-03-25 Thread Antoni Ivanov
Hi, The Impala team can correct me, but even if you set PARQUET_FILE_SIZE to 256MB, Impala may and likely will create smaller files (e.g. 128MB or even smaller). As far as I could understand, that’s because when Impala is writing the parquet file, it’s making a guess about the potential file size

Re: Data is being inserted even though an INSERT INTO query fails

2021-11-16 Thread Antoni Ivanov
Hi, Are insert queries supposed to be atomic? Thanks, Antoni From: Antoni Ivanov Reply to: "user@impala.apache.org" Date: Friday, 12 November 2021, 12:52 To: "user@impala.apache.org" Subject: Data is being inserted even though an INSERT INTO query fails Hi, A coll

Data is being inserted even though an INSERT INTO query fails

2021-11-12 Thread Antoni Ivanov
Hi, A colleague of mine opened https://issues.apache.org/jira/browse/IMPALA-11014. It seems there is a bug in Impala which can cause an insert query to populate data even if it fails. That seems pretty serious since it violates the atomicity of a single query operation. Are you aware of this (we tried to

Re: Data is being inserted even though an INSERT INTO query fails

2021-12-09 Thread Antoni Ivanov
am not sure about Kudu). - Csaba On Tue, Nov 16, 2021 at 12:36 PM Antoni Ivanov <aiva...@vmware.com> wrote: Hi, Are insert queries supposed to be atomic? Thanks, Antoni From: Antoni Ivanov <aiva...@vmware.com> Reply to: "user@impala.apache.org<mailto:user@i

Docker container image of Impala

2022-04-16 Thread Antoni Ivanov
Hi, We are actively using Impala and we have lots of tests running against it. We’d like to be able to run those tests against a docker container. This way they can be easily started locally, run in any environment, and are better isolated and reproducible. Are there docker images offered