Hive does not know that the values of column `seconds` and partition
`range` are related.
Hive can only use the WHERE clause to remove partitions that do not match
the range criteria. The data inside a partition is not ordered in any
way, so the minimum seconds and maximum seconds could be in
What version of Hive are you running?
It looks like the error you're seeing might be from Hive trying to retrieve the
error message from the logs and might not be related to the actual error.
You might want to check the logs for the Hadoop task that was run as part of this
query, to see if that ha
Prasanth,
I had the correct flag enabled (see the query in the original email). The issue is that
it does not appear to be correctly using partition stats for the
calculation. The table is an ORC table. It appears in the log that stats are
being calculated, but it does not appear to be working when queries are run
a
Actually, since Hive 13 you seem to need a driver and a username and
password. The username and password can be blank or whatever, but
DriverManager.getConnection(url) does not seem to work any more.
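For reference, a minimal sketch of the three-argument form that seems to work against HiveServer2 on Hive 0.13; the host, port and database are hypothetical, and the hive-jdbc jars are assumed to be on the classpath:

import java.sql.Connection;
import java.sql.DriverManager;

public class Hive13Connect {
    public static void main(String[] args) throws Exception {
        // Load the HiveServer2 JDBC driver explicitly rather than relying on
        // driver auto-discovery.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // User and password may be empty strings, but the three-argument
        // overload appears to be required since Hive 13.
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "", "");
        con.close();
    }
}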
On Fri, May 16, 2014 at 5:11 PM, Jay Vyas wrote:
> So i guess your saying "yes : just use the JDBC driv
Hi,
I'm trying to create a function that generates a UUID, want to use it in a
query to insert data into another table.
Here is the function:
package com.udf.example;
import java.util.UUID;
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import o
Bryan,
The flag you are looking for is hive.compute.query.using.stats. By default this
optimization is disabled, so you need to enable it to use it. Also, the
min/max/sum metadata are not looked up from the file but instead from the
metastore. Although file formats like ORC contain stats, they a
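As an illustration, a rough sketch of enabling the flag for a single session over JDBC and running the min/max aggregation from the original query, which the optimizer can then answer from metastore stats; the connection URL is hypothetical:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class StatsQueryExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "", "");
        Statement stmt = con.createStatement();

        // Turn the optimization on for this session only.
        stmt.execute("SET hive.compute.query.using.stats=true");

        // With up-to-date column stats in the metastore, this can be
        // answered without scanning the data files.
        ResultSet rs = stmt.executeQuery(
                "SELECT min(seconds), max(seconds) FROM data");
        while (rs.next()) {
            System.out.println(rs.getLong(1) + " " + rs.getLong(2));
        }
        rs.close();
        stmt.close();
        con.close();
    }
}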
Is there a reason why I can't use
select col1, col2, count(distinct col3) over (PARTITION by col4 order by
col5 ROWS BETWEEN 5 PRECEDING AND FOLLOWING) as col1 from table
?
I am trying to see, for any given window, if there is a lot of variability in
col4, and it just doesn't work with count dis
From the Hive manual, there is only "left semi join", no "semi join", nor "inner
semi join".
From the database world, it is just a traditional name for this kind of join:
"LEFT semi join", as a reminder to the reader that the result set comes out
from the LEFT table ONLY.
Yong
> From: lukas.e..
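To make the left-table-only behaviour above concrete, a small sketch run over JDBC; the tables `orders` and `customers` and their columns are hypothetical, and note that Hive only lets you reference left-table columns in the SELECT list:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class LeftSemiJoinExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "", "");
        Statement stmt = con.createStatement();

        // Returns each row of `orders` that has at least one matching
        // `customers` row; columns from `customers` cannot appear in the
        // SELECT list, which is why the result comes from the LEFT table only.
        ResultSet rs = stmt.executeQuery(
                "SELECT o.order_id, o.amount "
                + "FROM orders o LEFT SEMI JOIN customers c "
                + "ON (o.customer_id = c.customer_id)");
        while (rs.next()) {
            System.out.println(rs.getString(1) + " " + rs.getDouble(2));
        }
        rs.close();
        stmt.close();
        con.close();
    }
}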
try
public class Uuid extends UDF{
On Thu, May 15, 2014 at 2:07 PM, Leena Gupta wrote:
> Hi,
>
> I'm trying to create a function that generates a UUID, want to use it in a
> query to insert data into another table.
>
> Here is the function:
>
> package com.udf.example;
>
> import java.util.UUI
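Putting that suggestion together, a minimal sketch of what the full class might look like (the key point being that it extends UDF and implements evaluate()); the hive-exec jar is assumed to be on the build classpath:

package com.udf.example;

import java.util.UUID;
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

@Description(name = "uuid", value = "_FUNC_() - returns a randomly generated UUID string")
public class Uuid extends UDF {
    // Hive calls evaluate() once per row; returning Text keeps the result
    // usable as a Hive STRING.
    public Text evaluate() {
        return new Text(UUID.randomUUID().toString());
    }
}

After building the jar, the usual pattern would be ADD JAR followed by
CREATE TEMPORARY FUNCTION uuid AS 'com.udf.example.Uuid'; before calling uuid() in
the INSERT ... SELECT query.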
So I guess you're saying "yes: just use the JDBC driver with the
jdbc:hive2://", and that is the equivalent of PigServer (with the caveat that
it can't run a hive script).
The files are all in the same format (.gz), but they are in different
subdirectories!
My problem is: I want to do an import by day from Oracle to HDFS
(in directories:
hdfs_my_parent_directory/import_dir_day1/part_data_import.gz
hdfs_my_parent_directory/import_dir_day2/part_data_import.gz
...
With Hive 0.13 the ORC memory issue is mitigated because of this optimization
https://issues.apache.org/jira/browse/HIVE-6455. This optimization is enabled
by default.
But having 3283 columns is still huge, so I would still recommend reducing the
default compression buffer size (256 KB) to a lowe
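For what it's worth, a rough sketch of lowering that buffer size on the CTAS from later in this thread, via the orc.compress.size table property and issued over JDBC; the 64 KB value and the connection URL are just examples:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class OrcCompressSizeExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        Connection con = DriverManager.getConnection(
                "jdbc:hive2://localhost:10000/default", "", "");
        Statement stmt = con.createStatement();

        // 65536 bytes (64 KB) instead of the default 262144 bytes (256 KB);
        // with thousands of columns the per-column buffers add up quickly.
        stmt.execute(
                "CREATE TABLE orc_table "
                + "STORED AS ORC TBLPROPERTIES ('orc.compress.size'='65536') "
                + "AS SELECT * FROM text_table");
        stmt.close();
        con.close();
    }
}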
When I created the table, I had to reduce orc.compress.size quite a bit
to make my table with many columns work. This was on Hive 0.12 (I thought
it was supposed to be fixed in Hive 0.13, but 3k+ columns is huge). The
default of orc.compress.size is quite a bit larger (I think in the 256 KB
range).
Sorry for the double post. It did not show up for a while, and then I could
not get to the archives page, so I thought I needed to resend.
On Fri, May 16, 2014 at 12:54 AM, Premal Shah wrote:
> I have a table in hive stored as text file with 3283 columns. All columns
> are of string data type.
>
It is not really recommended anymore, as HiveServer2 with a client is
where the community is focusing now. And the JDBC client supports a mode where
HiveServer2 is embedded.
Some older products like the Hive CLI or Beeline did something like what you describe, but
again this mode might not be fully supported. For
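A minimal sketch of that embedded mode, assuming the Hive jars and a hive-site.xml are on the application classpath; leaving the host and port out of the URL starts HiveServer2 inside the calling JVM, which is the closest JDBC equivalent to PigServer:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class EmbeddedHiveExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // No host/port: the driver runs an embedded HiveServer2 in-process.
        Connection con = DriverManager.getConnection("jdbc:hive2://", "", "");

        Statement stmt = con.createStatement();
        stmt.execute("SHOW TABLES");   // one statement per call, not a script
        stmt.close();
        con.close();
    }
}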
I have a table in hive stored as text file with 3283 columns. All columns
are of string data type.
I'm trying to convert that table into an orc file table using this command
*create table orc_table stored as orc as select * from text_table;*
This is the setting under mapred-site.xml
mapred.
Hi hive.
Is there an API akin to PigServer, which allows you to run a hive script
from Java directly, using hive embedded mode, without use of JDBC?
--
Jay Vyas
http://jayunit100.blogspot.com
Thanks a lot Bryan, that did the trick.
From: Bryan Jeffrey [mailto:bryan.jeff...@gmail.com]
Sent: Wednesday, May 14, 2014 8:47 PM
To: user@hive.apache.org
Subject: Re: Metastore service
Dima,
You can simply set the variable in your hive-site.xml:
<property>
  <name>datanucleus.connectionPool.maxPoolSize</name>
  <value>20</value>
</property>
add jar /home/dguser/hive-0.12.0/lib/hive-exec-0.12.0.jar;
Having to run the above ^ command is a strong indication that your setup is
not correct. Hive-exec is the map-reduce job jar; you should not need to
add it as a secondary jar.
On Fri, May 9, 2014 at 9:18 PM, John Zeng wrote:
> Hi, A
All,
We are running Hadoop 2.2.0 and Hive 0.13.0. One typical application is to
load data (as text), and then convert that data to ORC to decrease query
time. When running these processes we are seeing significant memory leaks
(leaking 4 GB in about 5 days).
We're running HiveServer2 with the f
All,
I am executing the following query using Hadoop 2.2.0 and Hive 0.13.0.
/opt/hadoop/latest-hive/bin/beeline -u jdbc:hive2://server:10002/database
-n root --hiveconf hive.compute.query.using.stats=true -e "select
min(seconds), max(seconds), range from data where range > 1400204700 group
by ran