Re: Hive Insert taking a lot of time

2015-11-02 Thread Jörn Franke
What is the create table statement? You may want to insert everything into the orc table (sorted on x and/or y) and then apply the where statement in your queries on the orc table. > On 02 Nov 2015, at 13:36, Kashif Hussain wrote: > > Hi, > I am trying to insert data

Re: Hive Insert taking a lot of time

2015-11-02 Thread Kashif Hussain
Source and destination tables have the same schema. Are you suggesting to sort by any of the partitions? On Mon, Nov 2, 2015 at 6:52 PM, Jörn Franke wrote: > What is the create table statement? You may want to insert everything into > the orc table (sorted on x and/or y)

Hive under cygwin

2015-11-02 Thread Andrés Ivaldi
Hello, I'm trying to run Hive on Windows with Cygwin. I have Hadoop configured and running, but when I try to run hive I get this exception: Exception in thread "main" java.lang.NoClassDefFoundError: org/apachreduce/JobContext at java.lang.Class.forName0(Native Method) at

Re: Disabling local mode optimization

2015-11-02 Thread Jason Dere
Take a look at fetch.task.conversion in https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties From: Daniel Haviv Sent: Monday, November 02, 2015 1:16 AM To: user@hive.apache.org Subject: Re: Disabling local

Re: Hive on Spark NPE at org.apache.hadoop.hive.ql.io.HiveInputFormat

2015-11-02 Thread Jagat Singh
This is the virtual machine from Hortonworks. The query is this select count(*) from sample_07; It should run fine with MR. I am trying to run on Spark. On Tue, Nov 3, 2015 at 4:39 PM, Xuefu Zhang wrote: > That msg could be just noise. On the other hand, there is

Re: Min-Max Index vs Bloom filter

2015-11-02 Thread Jörn Franke
A Bloom filter only works for = predicates, and the min-max index for <, >, and = comparisons; however, the latter only works for numeric values, while the Bloom filter works on nearly all types. Additionally, the Bloom filter is a probabilistic data structure. For both it makes sense that the data is sorted on the column which is most

Min-Max Index vs Bloom filter

2015-11-02 Thread patcharee
Hi, For the ORC format, in which scenario is a bloom filter better than a min-max index? Best, Patcharee

timestamp to date conversion

2015-11-02 Thread murali parimi
Hello All, I am trying to load a table in ORC format with data coming from another Hive table stored in text format. Below is the Hive query I am trying. Both tables are partitioned on data_date. insert overwrite table target_table partition(data_date='2015-09-30') select col1, col2,

Re: Hive on Spark NPE at org.apache.hadoop.hive.ql.io.HiveInputFormat

2015-11-02 Thread Xuefu Zhang
That msg could be just noise. On the other hand, there is NPE, which might be the problem you're having. Have you tried your query with MapReduce? On Sun, Nov 1, 2015 at 5:32 PM, Jagat Singh wrote: > One interesting message here , *No plan file found: * > > 15/11/01

Re: timestamp to date conversion

2015-11-02 Thread Jason Dere
Can you get the full stack trace for this error? If this is HiveCLI you can get this from hive.log, if hiveserver2 you might be able to find this in the hiveserver2 log. What version of Hive? From: murali parimi Sent:

Re: Disabling local mode optimization

2015-11-02 Thread Daniel Haviv
Hi, I'm trying to set hive.exec.mode.local.auto.inputbytes.max & hive.exec.mode.local.auto.tasks.max to 1 or 0 but still local mode is being used instead of M/R. Any ideas? Thank you. Daniel On Thu, Sep 3, 2015 at 8:02 AM, sreebalineni . wrote: > Hi, > > Is not it

Hive Insert taking a lot of time

2015-11-02 Thread Kashif Hussain
Hi, I am trying to insert data into an ORC table from a text table. The query is as follows: insert into table test_orc(x,y) select * from test where x=4 and y=5 ; The MR job is taking a lot of time: all but one reducer completed quickly, but the last reducer is taking a lot of time. The total