Re: Impala support Apache Paimon(incubating)

2023-04-06 Thread Csaba Ringhofer
Hi! Looked into the documentation about Hive support: https://paimon.apache.org/docs/master/engines/hive/ When reading a Paimon table as external it uses storage handler org.apache.paimon.hive.PaimonStorageHandler. Impala does not support Hive storage handlers, so that example will not work.

Re: I would like to work on IMPALA-11993

2023-04-03 Thread Csaba Ringhofer
Hi! Added you as a contributor to Impala (welcome!) and assigned IMPALA-11993. Did a quick check and Impala doesn't delete files during truncate in Iceberg tables. This allows the content before truncation to be reached through time travel. I agree that we should specifically mention that

Re: [VOTE] 4.1.2 release candidate 1

2023-03-29 Thread Csaba Ringhofer
+1 (binding) - Verified the checksum, the signature and the tree hash - Built locally on Ubuntu 20.04.6 LTS - Played around a bit via impala-shell - Ran some queries with impyla 0.18.0 Thanks, Csaba On Wed, Mar 29, 2023 at 12:31 PM Daniel Becker wrote: > +1 (binding) > > - Verified the

Re: Apache Jira Account

2022-11-23 Thread Csaba Ringhofer
In the meanwhile I have created a Jira account for Jason. On Wed, Nov 23, 2022 at 6:56 PM Tim Armstrong wrote: > That's also unfortunate that they disabled public signups, what a pain. > > On Wed, 23 Nov 2022 at 09:55, Tim Armstrong > wrote: > > > Jim Apple replied but it looks like you're

Re: CREATE GLOBAL FUNCTION works on all databases

2022-11-14 Thread Csaba Ringhofer
Hi! I also like the idea of a fallback database for functions; it seems like a fairly simple but very useful feature. One thing I would consider is adding this as a query option instead of a flag, but it is probably harder to implement, so I am ok with adding a flag now, and possibly later adding a

Re: SUPPORT user-defined table functions (UDTFs)

2022-08-26 Thread Csaba Ringhofer
Hi Xiaoqing! >The syntax of query is as follows, select udtf_explode(info) as (name, phone) from table; What is the type of info / what is the intention of the "as (name, phone)" part? If info is a struct with members name/phone, then you could do this in Impala with: select item.name, item.phone from

Re: Impala 4.1.1 Branch

2022-08-02 Thread Csaba Ringhofer
4.1.1 has some very important fixes, so I agree with the release if there are actually people out there who use 4.1 >and do code review for resolving such cherrypick conflicts. If these conflicts are tricky, then it may make sense to postpone them till 4.2. The Iceberg related ones do not seem

jenkins.impala.io going down for maintenance

2021-11-11 Thread Csaba Ringhofer
Hi! I am updating Jenkins to a newer version. Currently there is one GVO running ( https://jenkins.impala.io/job/gerrit-verify-dryrun/7621/ ), I am abandoning it now and will restart it when Jenkins is running again. Csaba

Re: Impala 4 Breaking Changes

2021-05-08 Thread Csaba Ringhofer
> > > > > > slow. Anyway, I think it's time to branch out. We've been waiting > > too > > > > > long. > > > > > > Thanks for creating the branch. > > > > > > > > > > > > Regards, > > > > > > Quanlong > >

Re: Impala 4 Breaking Changes

2021-04-23 Thread Csaba Ringhofer
About IMPALA-9690 (AVX support): My preferred solution would be to deprecate support for x64 without AVX2 in 4.0, but not start removing the related logic yet. - We could even add a DCHECK + flag to crash by default if no AVX2 is detected, and a message that points them to Impala mailing

Re: why tuple memory need sorted by slot size

2021-02-08 Thread Csaba Ringhofer
I think that alignment is not the goal here, because the tuples themselves are not aligned, as there is no padding at their end - e.g. if a tuple's size is 17 bytes, the first tuple will start at offset 0, the next at 17 ... a comment about the lack of padding:
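
The point about missing padding can be checked with a small sketch. This is only an illustration of the arithmetic in the message, not Impala code: the helper names `tuple_offsets` and `is_aligned` and the 8-byte alignment are assumptions made for the example.

```python
# Hypothetical helpers, not part of Impala: they only illustrate the
# offset arithmetic for tuples stored back to back without padding.

def tuple_offsets(tuple_size, count):
    """Start offsets of `count` tuples laid out with no padding between them."""
    return [i * tuple_size for i in range(count)]


def is_aligned(offset, alignment):
    """True if `offset` is a multiple of `alignment`."""
    return offset % alignment == 0


# 17-byte tuples, as in the message; 8-byte alignment is an assumed example.
offsets = tuple_offsets(17, 4)
aligned = [is_aligned(o, 8) for o in offsets]

print(offsets)  # [0, 17, 34, 51]
print(aligned)  # [True, False, False, False]
```

Only the first tuple starts on an 8-byte boundary, so ordering slots by size inside a tuple cannot by itself guarantee aligned access.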

Re: Impala 4.0 breaking changes

2020-03-18 Thread Csaba Ringhofer
> What do you think about dateless timestamps? AFAIK that is not supported +1, I think that dateless timestamps are just confusing both in the code and for the users. I created a Jira to drop it: IMPALA-9531. A number of issues with them are listed in this Jira: IMPALA-5942. On Wed, Mar 18, 2020 at

Re: New Committer: Laszlo Gaal

2019-06-20 Thread Csaba Ringhofer
Congratulations! On Thu, Jun 20, 2019 at 9:00 AM Gabor Kaszab wrote: > Congrats Laszlo! > > On Thu, Jun 20, 2019 at 7:15 AM Xiaomeng Zhang > wrote: > > > Congrats Laszlo! > > > > On Wed, Jun 19, 2019 at 6:06 PM Quanlong Huang > > wrote: > > > > > Congratulations! > > > > > > On Thu, Jun 20,

Re: A problem of left outer join when statement contains two aggregation for the same column

2019-04-04 Thread Csaba Ringhofer
Hi! I have checked the queries, and I can verify that Impala incorrectly returns 1 row while the same query with Hive (or common sense..) returns 2 rows. > "but if remove the "t2.amount2" like this:" Indeed, the issue seems to be related to returning the same aggregate twice + the fact that one

Re: [DISCUSS] 3.2.0 release

2019-03-11 Thread Csaba Ringhofer
>Similarly, if there is anything that is not ready but is right around the corner and you insist to include it to the release, let me know. IMPALA-6503 introduced a test error on S3, there is already a fix underway: https://gerrit.cloudera.org/#/c/12714/ Please wait for the fix to be merged and

Re: Next round of the Impala community meeting

2019-03-06 Thread Csaba Ringhofer
>When would be a better time for folks in Hungary (i.e. no public holiday)? 15. is the only public holiday in March. Some of us will attend Dataworks on 19-22. of March, so the week after 15. is also not the best for us. I am interested in the community meeting, but it is not super important for

Re: Timestamps with less than nano precision: rounding vs truncating

2019-02-11 Thread Csaba Ringhofer
ing again, I think I understand the choices better. I > like your idea to keep consistency with Hive and change the kudu writer > timestamp rounding mode in Impala 4.0. > > On Thu, Jan 24, 2019 at 4:35 PM Csaba Ringhofer > wrote: > > > Sorry, my wording was bad (I

Re: Timestamps with less than nano precision: rounding vs truncating

2019-01-24 Thread Csaba Ringhofer
" is different from "rounding towards negative infinity"? As I mentioned > above, I am not aware of a rounding mode entitled "truncation towards > negative infinity". > > On Thu, Jan 24, 2019 at 11:41 AM Csaba Ringhofer > > wrote: > > > Thanks for the

Re: Timestamps with less than nano precision: rounding vs truncating

2019-01-24 Thread Csaba Ringhofer
cle, Netezza, > Vertica, and Postgres all round. Db2 truncates. > > On Wed, Jan 23, 2019 at 12:26 PM Csaba Ringhofer > > wrote: > > > Timestamps are often represented as ticks since some epoch, e.g. > 1970.01.01 > > 00:00:00, so negative timestamps make sense as time

Re: Timestamps with less than nano precision: rounding vs truncating

2019-01-23 Thread Csaba Ringhofer
Timestamps are often represented as ticks since some epoch, e.g. 1970.01.01 00:00:00, so negative timestamps make sense as times before the epoch - I meant rounding vs truncating towards 0 vs rounding towards negative infinity in this sense. Truncating towards negative infinity means that
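
The three options named above differ only for ticks before the epoch. The sketch below illustrates this on nanosecond ticks reduced to microseconds. It is not taken from Impala, Hive or Kudu: the function names are hypothetical, and the tie-breaking rule in `round_nearest` (ties go towards positive infinity) is an assumption, since the thread does not specify one.

```python
# Hypothetical helpers showing three ways to reduce nanosecond ticks since the
# epoch to microsecond ticks. Negative values are times before the epoch.

NANOS_PER_MICRO = 1000


def truncate_toward_zero(nanos):
    """Drop the sub-microsecond part, moving the value towards 0."""
    micros = abs(nanos) // NANOS_PER_MICRO
    return micros if nanos >= 0 else -micros


def floor_toward_negative_infinity(nanos):
    """Drop the sub-microsecond part, always moving to the earlier time."""
    return nanos // NANOS_PER_MICRO  # Python's // floors for negative values


def round_nearest(nanos):
    """Round to the nearest microsecond (assumed: ties go towards +infinity)."""
    return (nanos + NANOS_PER_MICRO // 2) // NANOS_PER_MICRO


for nanos in (1700, -1200, -1700):
    print(nanos,
          truncate_toward_zero(nanos),
          floor_toward_negative_infinity(nanos),
          round_nearest(nanos))
# 1700 1 1 2
# -1200 -1 -2 -1
# -1700 -1 -2 -2
```

For non-negative ticks truncation and flooring agree; before the epoch, truncating towards 0 moves the timestamp later while flooring moves it earlier.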

Timestamps with less than nano precision: rounding vs truncating

2019-01-23 Thread Csaba Ringhofer
Hi folks! I am working on the Parquet writer for new timestamp formats (IMPALA-5051), and I have a dilemma about the way to reduce a timestamp's precision from nanosecond to milli or microsecond. I have to choose between consistency with Hive vs Impala itself: - Impala currently rounds

Re: New Impala PMC member - Zoltán Borók-Nagy

2019-01-04 Thread Csaba Ringhofer
Congratulations Zoli! On Fri, Jan 4, 2019 at 7:42 PM Zoram Thanga wrote: > Congratulations, Zoltan! > > -Zoram > > On Fri, Jan 4, 2019 at 8:04 AM Tim Armstrong > wrote: > > > The Project Management Committee (PMC) for Apache Impala has invited > Zoltán > > Borók-Nagy to become a PMC member and

Re: Wrong Jira id in commit (IMPALA-7147 vs IMPALA-7417)

2018-09-05 Thread Csaba Ringhofer
rrent idea is to change IMPALA-7417 > > to be a duplicate of IMPALA-7147, and create a new Jira with > IMPALA-7417's > > original contents." > > > > On Mon, Sep 3, 2018 at 12:12 PM, Csaba Ringhofer < > csringho...@cloudera.com > > > > > wrote: > >

Wrong Jira id in commit (IMPALA-7147 vs IMPALA-7417)

2018-09-03 Thread Csaba Ringhofer
Hi folks! We have just discovered (thanks Laszlo) that one of my changes was pushed with the wrong Jira id in the commit message (it fixed IMPALA-7147, but I wrote it as IMPALA-7417, which didn't exist at that time). Luckily there is no commit pushed for IMPALA-7417 yet (it is on review, and

Re: New Impala committer - Quanlong Huang

2018-08-17 Thread Csaba Ringhofer
Congrats! On Fri, Aug 17, 2018 at 6:32 PM, Philip Zeyliger wrote: > Congrats! > > On Fri, Aug 17, 2018 at 9:29 AM Tim Armstrong > wrote: > > > The Project Management Committee (PMC) for Apache Impala has invited > > Quanlong Huang to become a committer and we are pleased to announce that > >

Re: Breaking changes after 3.0, versioning, IMPALA-3307

2018-06-12 Thread Csaba Ringhofer
xperience, and if our users end up well informed of the > > > > > breakages, > > > > > > > then I will feel we have done our job, no matter what version > > > number > > > > we > > > > > > > stamp on it. > > > > > > >

Re: UDA debugging, was Re: Broken/Flaky Tests

2018-06-07 Thread Csaba Ringhofer
Hi! I have left some comments in the code (lines starting with /// ) + removed the md5 implementation parts to make the answer shorter. Note that I am not sure about the goal you want to achieve with the UDA - can you explain what countMD5 would be used for? > void md5(const unsigned char

Breaking changes after 3.0, versioning, IMPALA-3307

2018-06-04 Thread Csaba Ringhofer
Hi Folks! We had a discussion with a few people about the versioning of Impala after 3.0. The motivation was that IMPALA-3307 (which replaces the timezone implementation in Impala, and contains some breaking changes) missed 3.0 and we are not sure about the version in which it can be released -