+1, I think that dateless timestamps are just confusing, both in the code
and for the users.
I created a Jira to drop it: IMPALA-9531
A number of issues with them are listed in this jira: IMPALA-5942

On Wed, Mar 18, 2020 at 3:16 PM Gabor Kaszab <gaborkas...@apache.org> wrote:

> What do you think about dateless timestamps? AFAIK that is not supported
> ATM, shouldn't we drop it?
>
> Gabor
>
> On Wed, Mar 18, 2020 at 1:46 AM Shant Hovsepian <sh...@cloudera.com>
> wrote:
>
> > +1 on RUNTIME_FILTER_WAIT_TIME_MS increasing.
> >
> > On Tue, Mar 17, 2020 at 5:43 PM Tim Armstrong <tarmstr...@cloudera.com>
> > wrote:
> > >
> > > > I think we should consider changing a couple more defaults, after
> > > > having an offline conversation with Shant.
> > >
> > > > We could change COMPRESSION_CODEC to LZ4 or ZSTD as the default. I think
> > > > LZ4 is the safest option perf-wise, because it will be faster across the
> > > > board and the decompression is now one of the main CPU bottlenecks for
> > > > Parquet scanning. We might need to double-check that enough of the
> > > > ecosystem supports LZ4, but this seems like it would be a good
> > > > improvement.
> > >
> > > > It *might* be worth enabling compute stats table sampling by default,
> > > > but I think that could be open for discussion.
> > >
> > > > We could also consider bumping RUNTIME_FILTER_WAIT_TIME_MS to a higher
> > > > value, since I think generally higher values have proven to be more
> > > > robust for complex queries (TPC-DS, etc).
> > >
> > > > On Tue, Mar 17, 2020 at 11:56 AM Tim Armstrong <tarmstr...@cloudera.com>
> > > > wrote:
> > >
> > > > > >   - Do we still need the DECIMAL_V2 query option? Seems like this has
> > > > > >   been true for a while. Maybe we can add it to the list of deprecated
> > > > > >   flags?
> > > > > Maybe we could officially deprecate it and phase it out soonish? It really
> > > > > only exists as a workaround for people upgrading from the old behaviour in
> > > > > 2.x. It hasn't been terribly bad maintaining the two code paths, but it
> > > > > would be nice to simplify it.
> > > >
> > > > > >   - Deprecate support for ADLS, since it has effectively been replaced
> > > > > >   by ABFS
> > > > > Makes sense. It probably isn't too much overhead to keep the old code
> > > > > around for a while, is it? Just in case users have a bunch of data still
> > > > > sitting in the old ADLS.
> > > >
> > > > > >   - Deprecate (or even remove) support for HDFS caching? Not sure how
> > > > > >   extensively this is used; removing the code would be nice as it
> > > > > >   simplifies part of the HDFS read path
> > > > > Anecdotally I do see it used, but a lot of times it's to affect scheduling
> > > > > rather than because saving memcpy() makes a real difference (with
> > > > > compressed Parquet, that's rarely the bottleneck). A compromise or
> > > > > in-between step would be to remove the special-casing of the zero-copy code
> > > > > path in the backend, but keep the scheduling behaviour.
> > > >
> > > > > On Tue, Mar 17, 2020 at 11:50 AM Tim Armstrong <tarmstr...@cloudera.com>
> > > > > wrote:
> > > >
> > > >> I think I generally support this. A few specific comments.
> > > >>
> > > >> > Proposal 3: Impala-lzo
> > > >> > Drop support for Impala-lzo/hadoop-lzo
> > > >>
> > > > >> Does this mean dropping the plugin text scanner interface entirely? LZO
> > > > >> is the only implementation of it that I'm aware of (and we rely on it to
> > > > >> test the interface), so it seems reasonable to me to remove something that
> > > > >> has minimal adoption and is not cleanly separated from the scanner
> > > > >> implementation of core Impala.
> > > >>
> > > >> > Proposal 5: Sentry
> > > >> > Drop support for Sentry in favor of Ranger.
> > > >>
> > > > >> I think moving this direction makes a lot of sense given that activity in
> > > > >> the Sentry project has declined a lot (just look at the activity level on
> > > > >> the two projects, it's dramatically different), unless someone in the
> > > > >> community wants to step up and maintain the integration.
> > > >>
> > > >> > Proposal 6: Metadata
> > > > >> > Metadata V2 will become the default. Metadata V1 will be deprecated.
> > > > >> Maybe we should set a goal of removing the support in Impala 4.1 or 4.2?
> > > > >> That would allow us to remove a lot of complex code.
> > > >>
> > > > >> On Mon, Mar 16, 2020 at 10:07 AM Joe McDonnell <joemcdonn...@cloudera.com>
> > > > >> wrote:
> > > >>
> > > > >>> Now that Impala 3.4 is branched and master is Impala 4.0, we need to
> > > > >>> decide what breaking changes will happen in Impala 4.0. I have provided a
> > > > >>> series of proposals below. I welcome feedback on them. Other proposals are
> > > > >>> also welcome.
> > > >>>
> > > >>> Thanks,
> > > >>> Joe
> > > >>>
> > > >>> Proposal 0: Hadoop component versions
> > > >>>
> > > > >>> Switch to CDP versions of components by default. This means that Impala
> > > > >>> will use Hive 3+ (which is already essentially Hive 4 and may be renamed
> > > > >>> to Hive 4).
> > > > >>> Remove support for CDH versions of components.
> > > > >>> This was already discussed in the original thread for Impala 4, so this
> > > > >>> is not new.
> > > >>>
> > > >>> Proposal 1: OS support
> > > >>>
> > > >>> Drop support for Centos 6, Ubuntu 14, and Debian (all versions)
> > > >>> Retain support for Ubuntu 16, Ubuntu 18, Centos 7, and SLES 12
> > > > >>> Centos 7 development will be focused on newer Centos 7 versions such as
> > > > >>> 7.6 and 7.7.
> > > >>> Add support for Centos 8
> > > >>> Move main development from Ubuntu 16 to Ubuntu 18 over time.
> > > >>>
> > > >>> Proposal 2: Python support
> > > >>>
> > > >>> Drop support for Python 2.6
> > > >>> Add support for Python 3 over time.
> > > >>>
> > > >>> Proposal 3: Impala-lzo
> > > >>>
> > > >>> Drop support for Impala-lzo/hadoop-lzo
> > > >>>
> > > >>> Proposal 4: Clients
> > > >>>
> > > > >>> Deprecate beeswax protocol. This means that it can be removed in the next
> > > > >>> major version number, but it would not be removed in Impala 4. Current
> > > > >>> users of beeswax would need to start migrating to HS2.
> > > >>>
> > > >>> Proposal 5: Sentry
> > > >>>
> > > >>> Drop support for Sentry in favor of Ranger.
> > > >>>
> > > >>> Proposal 6: Metadata
> > > >>>
> > > > >>> Metadata V2 will become the default. Metadata V1 will be deprecated.
> > > >>>
> > > >>> Thanks,
> > > >>> Joe
> > > >>>
> > > >>
> >
>
