> When LLAP Execution Mode is set to 'only' you can't have a macro and window
> function in the same select statement.
The "only" part isn't enforced for the simple select query, but is enforced for
the complex one (the PTF one).
> select col_1, col_2 from macro_bug where otrim(col_1) is not
Hi,
> I was looking at HiveServer2 performance going through Knox in KNOX-1524 and
> found that HTTP mode is significantly slower.
The HTTP mode does re-auth for every row before HIVE-20621 was fixed – Knox
should be doing cookie-auth to prevent ActiveDirectory/LDAP from throttling
this.
I
Hi,
> Subject: Re: hive 3.1 mapjoin with complex predicate produce incorrect results
...
> | 0 if(_col0 is null, 44, _col0) (type: int) |
> | 1 _col0 (type: int) |
That rewrite is pretty neat, but I feel like the IF expression nesting is
> ,row_number() over ( PARTITION BY A.dt,A.year, A.month,
> A.bouncer,A.visitor_type,A.device_type order by A.total_page_view_time desc )
> as rank
> from content_pages_agg_by_month A
The row_number() window function is a streaming function, so this should not
consume a significant
Hi,
> It doesn't help if you need concurrent threads writing to a table but we are
> just using the row_number analytic and a max value subquery to generate
> sequences on our star schema warehouse.
Yup, you're right the row_number doesn't help with concurrent writes - it
doesn't even scale
Hi,
> Hopefully someone can tell me if this is a bug, expected behavior, or
> something I'm causing myself :)
I don't think this is expected behaviour, but where the bug is, is what I'm
looking into.
> We have a custom StorageHandler that we're updating from Hive 1.2.1 to Hive
> 3.0.0.
Most
> I'll try the simplest query I can reduce it to with loads of memory and see
> if that gets anywhere. Other pointers are much appreciated.
Looks like something I'm testing right now (to make the memory setting
cost-based).
https://issues.apache.org/jira/browse/HIVE-21399
A less
>I am running an older version of Hive on MR. Does it have it too?
Hard to tell without an explain.
AFAIK, this was fixed in Aug 2013 - how old is your build?
Cheers,
Gopal
Hi,
That looks like the TopN hash optimization didn't kick in, that must be a
settings issue in the install.
| Reduce Output Operator |
| key expressions: _col0 (type: string) |
| sort order: + |
|
> I expect the maps to do some sorting and limiting in parallel. That way the
> reducer load would be small. I don’t think it does that. Can you tell me why?
They do.
Which version are you running, is it Tez and do you have an explain for the
plan?
Cheers,
Gopal
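
The map-side sort-and-limit behaviour discussed above can be illustrated with a minimal sketch. This is not Hive's TopN hash implementation; the function names (map_side_top_n, reduce_top_n) and the sample data are hypothetical, and an ascending sort order is assumed. Each "mapper" keeps only its own first n keys, so the single "reducer" merges a few short sorted lists instead of sorting every row.

```python
import heapq

def map_side_top_n(rows, n):
    """Hypothetical mapper step: keep only the n smallest keys of one split."""
    # heapq.nsmallest returns its result in ascending order.
    return heapq.nsmallest(n, rows)

def reduce_top_n(partials, n):
    """Hypothetical reducer step: merge sorted partial lists, then apply the limit."""
    merged = heapq.merge(*partials)  # lazy merge of already-sorted inputs
    return [row for _, row in zip(range(n), merged)]

# Three input splits, processed independently (in parallel in a real engine).
splits = [["pear", "apple", "fig"], ["kiwi", "banana"], ["cherry", "date", "grape"]]
partials = [map_side_top_n(split, 2) for split in splits]
top = reduce_top_n(partials, 2)
print(top)
```

With a limit of n, each mapper forwards at most n rows, so the reducer input is bounded by n times the number of mappers regardless of table size.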
> Yes both of these are valid ways of filtering data before join in Hive.
This has several implementation specifics attached to it. If you're looking at
Hive 1.1 or before, it might not work the same way as Vineet mentioned.
In older versions Calcite rewrites aren't triggered, which prevented
> I wish the Hive team to keep things more backward-compatible as well. Hive is
> such an enormous system with a wide-spread impact so any
> backward-incompatible change could cause an uproar in the community.
The incompatibilities were not avoidable in a set of situations - a lot of
those
Hi,
>> However, we have built Tez on CDH and it runs just fine.
Down that path you'll also need to deploy a slightly newer version of Hive as
well, because Hive 1.1 is a bit ancient & has known bugs with the tez planner
code.
You effectively end up building the hortonworks/hive-release
Hi,
> java.lang.NoSuchMethodError:
> org.apache.hadoop.fs.FileStatus.compareTo(Lorg/apache/hadoop/fs/FileStatus;)I
> (state=,code=0)
Are you rolling your own Hadoop install?
https://issues.apache.org/jira/browse/HADOOP-14683
Cheers,
Gopal
Hi,
> I have a question about how to get the location for a bunch of partitions.
...
> But in an enterprise environment I'm pretty sure this approach would not be
> the best because the RDS (mysql or derby) is maybe not reachable or
> I don't have the permission to it.
That was the reason Hive