Re: Can't start minicluster

2017-07-09 Thread Tim Armstrong
Maybe the thrift be/generated-sources are out of sync with the source code?

We had some kind of metastore schema upgrade that caused the other error.
Dimitris' instructions to fix it were:

> To fix this without doing a full data reload, you can use the following
> command:
> ${IMPALA_TOOLCHAIN}/cdh_components/hive-1.1.0-cdh5.13.0-SNAPSHOT/bin/schematool -upgradeSchema -dbType {type}
> where type is one of 'postgres' or 'mysql', depending on your setup.

On Sun, Jul 9, 2017 at 3:52 PM, Jim Apple wrote:

> I am getting the following message in FATAL when I try to start a
> minicluster
>
> Check failed: _TImpalaQueryOptions_VALUES_TO_NAMES.size() ==
> TImpalaQueryOptions::DEFAULT_JOIN_DISTRIBUTION_MODE + 1 (57 vs. 56)
>
> Any ideas what is going on? I was actually trying to buildall.sh
> -format_metastore -format_sentry_policy_db because I was seeing messages
> like the following (in hive.log) when I tried to start the minicluster:
>
> org.postgresql.util.PSQLException: ERROR: column A0.SCHEMA_VERSION_V2 does not exist
>


Re: IMPALA-4326 - split() function

2017-07-09 Thread Greg Rahn
(also commented on IMPALA-4326)

For this functionality, I'd prefer to follow what Postgres does and use its
well-named functions like string_to_array().
This becomes powerful when combined with the unnest() table function, which
is part of the ANSI/ISO SQL:2016 spec (vs. the non-standard Hive
"lateral view explode" syntax).

with t as (
  select
    42 as id,
    '1,2,3,4,5,6'::text as string_array
)
select
  t.id,
  u.l
from t, unnest(string_to_array(t.string_array, ',')) as u(l);

 id | l
----+---
 42 | 1
 42 | 2
 42 | 3
 42 | 4
 42 | 5
 42 | 6
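
For anyone less familiar with the Postgres functions, the row-expansion
semantics of the query above can be sketched roughly in Python. This is only
an illustration of the behavior, not Impala or Postgres code: the helper
names string_to_array and unnest_rows are hypothetical stand-ins, and only
the simple case of a non-NULL string and a non-NULL delimiter is covered.

```python
from typing import Iterable, Iterator, List, Tuple


def string_to_array(text: str, delimiter: str) -> List[str]:
    """Hypothetical stand-in for Postgres string_to_array().

    Covers only the simple case of a non-NULL string and delimiter.
    An empty input yields an empty array rather than [""].
    """
    if text == "":
        return []
    return text.split(delimiter)


def unnest_rows(
    rows: Iterable[Tuple[int, str]], delimiter: str = ","
) -> Iterator[Tuple[int, str]]:
    """Emit one (id, element) row per element of each row's split string.

    This mirrors `from t, unnest(string_to_array(...)) as u(l)`: every
    input row is paired with each element produced from its own string.
    """
    for row_id, packed in rows:
        for element in string_to_array(packed, delimiter):
            yield row_id, element


# Same data as the query: one row with id 42 and a comma-separated string.
rows = [(42, "1,2,3,4,5,6")]
result = list(unnest_rows(rows))
for row_id, element in result:
    print(f"{row_id} | {element}")
```

A row whose string is empty contributes no output rows in this sketch, since
there is nothing to unnest.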


On Mon, Jun 19, 2017 at 7:40 AM, Alexander Behm wrote:

> Yes and no. Extending the UDF framework might be hard, but I think
> implementing a built-in split() is feasible. We already have a built-in
> Expr that returns an array type to implement unnest.
>
> On Mon, Jun 19, 2017 at 6:22 AM, Vincent Tran wrote:
>
> > This request appears to be blocked by the current UDF framework's
> > limitation.
> > As far as I can tell, functions can still only return simple scalar
> > types, right?
> >
>