Re: select count(*) from table;

2016-03-22 Thread Nitin Pawar
If you have enabled the statistics-based performance optimization, the
count will come from the table statistics.
If the underlying file format keeps in-file statistics (like ORC), it will
come from there.
If it is just a plain vanilla text file format, Hive needs to run a job to
compute the count, so that is the slowest of all.
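A minimal sketch of the statistics path in HiveQL (the table name t is a
placeholder; hive.compute.query.using.stats is only available in newer Hive
releases):

set hive.stats.autogather=true;            -- gather basic stats on inserts
set hive.compute.query.using.stats=true;   -- let count(*) be answered from stats
ANALYZE TABLE t COMPUTE STATISTICS;        -- (re)compute stats for an existing table
SELECT count(*) FROM t;                    -- can now be served from the metastore, no full job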

On Tue, Mar 22, 2016 at 12:44 PM, Amey Barve <ameybarv...@gmail.com> wrote:

> select count(*) from table;
>
> How does hive evaluate count(*) on a table?
>
> Does it return the count by actually querying the table, or does it
> directly return the count by consulting some statistics stored locally?
>
> For Hive's text format it takes a few seconds, while Hive's ORC format
> takes a fraction of a second.
>
> Regards,
> Amey
>



-- 
Nitin Pawar


Re: Possible bug loading data in Hive.

2014-06-10 Thread Nitin Pawar
 - Failed with exception org.apache.hadoop.hive.ql.metadata.HiveException:
 Unable to alter partition.
 org.apache.hadoop.hive.ql.metadata.HiveException:
 org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter partition.
 at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1454)
 at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1158)
 at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:304)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1331)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1117)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:950)
 at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
 at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:630)
 at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:618)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
 at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter partition.
 at org.apache.hadoop.hive.ql.metadata.Hive.alterPartition(Hive.java:429)
 at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1446)
 ... 16 more
 Caused by: MetaException(message:The transaction for alter partition did not commit successfully.)
 at org.apache.hadoop.hive.metastore.ObjectStore.alterPartition(ObjectStore.java:1927)
 at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
 at $Proxy0.alterPartition(Unknown Source)
 at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartition(HiveAlterHandler.java:254)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.rename_partition(HiveMetaStore.java:1816)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.rename_partition(HiveMetaStore.java:1788)
 at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partition(HiveMetaStore.java:1771)
 at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.alter_partition(HiveMetaStoreClient.java:834)
 at org.apache.hadoop.hive.ql.metadata.Hive.alterPartition(Hive.java:425)
 ... 17 more
 2014-06-08 20:16:34,852 ERROR ql.Driver (SessionState.java:printError(403))
 - FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask


 --
 *Fernando Agudo Tarancón*
 /Big Data Software Engineer/

 Telf.: +34 917 680 490
 Fax: +34 913 833 301
 C/ Manuel Tovar, 49-53 - 28034 Madrid - Spain

 _http://www.bidoop.es_




-- 
Nitin Pawar


Re: Possible bug loading data in Hive.

2014-06-10 Thread Nitin Pawar
The error you see is in the Hive metastore, and these issues were usually
related to one of two things:
1) load on the metastore
2) DataNucleus-related problems

For now, if possible, see if restarting the Hive metastore resolves your
issue.




On Tue, Jun 10, 2014 at 3:27 PM, Fernando Agudo fag...@pragsis.com wrote:

 I have problems upgrading to hive-0.13 or 0.12 because this is in
 production. I only have this DataNucleus configuration:

 <property>
   <name>datanucleus.fixedDatastore</name>
   <value>true</value>
 </property>
 <property>
   <name>datanucleus.autoCreateSchema</name>
   <value>false</value>
 </property>

 Is this relevant to the problem?

 Thanks,


 On 10/06/14 10:53, Nitin Pawar wrote:

 Hive 0.9.0 with CDH4.1 --- this is a very old release.

 I would recommend upgrading to hive-0.13, or at least 0.12, and seeing if
 the problem persists.

 The error you are seeing occurs while loading data into a partition: the
 metastore alter/add partition call is failing.

 Can you try upgrading and see if that resolves your issue?
 If not, can you share your DataNucleus-related settings in Hive?


 On Tue, Jun 10, 2014 at 2:16 PM, Fernando Agudo fag...@pragsis.com
 wrote:

  Hello,

 I'm working with Hive 0.9.0 on CDH4.1. I have a process which loads data
 into Hive every minute, creating the partition if necessary.
 I have been monitoring this process for three days and I noticed that
 there is a method (*listStorageDescriptorsWithCD*) whose execution time
 keeps increasing. On its first execution this method took about 15
 milliseconds; three days later it took more than 3 seconds, after which
 Hive throws an exception and starts working again.

 I have checked this method but I haven't found anything suspicious;
 could it be a bug?



 2014-06-05 09:58:20,921 DEBUG metastore.ObjectStore (ObjectStore.java:listStorageDescriptorsWithCD(2036)) - Executing listStorageDescriptorsWithCD
 2014-06-05 09:58:20,928 DEBUG metastore.ObjectStore (ObjectStore.java:listStorageDescriptorsWithCD(2045)) - Done executing query for listStorageDescriptorsWithCD


 2014-06-08 20:15:33,867 DEBUG metastore.ObjectStore (ObjectStore.java:listStorageDescriptorsWithCD(2036)) - Executing listStorageDescriptorsWithCD
 2014-06-08 20:15:36,134 DEBUG metastore.ObjectStore (ObjectStore.java:listStorageDescriptorsWithCD(2045)) - Done executing query for listStorageDescriptorsWithCD


 2014-06-08 20:16:34,600 DEBUG metastore.ObjectStore (ObjectStore.java:removeUnusedColumnDescriptor(1989)) - execute removeUnusedColumnDescriptor
 2014-06-08 20:16:34,600 DEBUG metastore.ObjectStore (ObjectStore.java:listStorageDescriptorsWithCD(2036)) - Executing listStorageDescriptorsWithCD
 2014-06-08 20:16:34,805 ERROR metadata.Hive (Hive.java:getPartition(1453))
 - org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter partition.
 at org.apache.hadoop.hive.ql.metadata.Hive.alterPartition(Hive.java:429)
 at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:1446)
 at org.apache.hadoop.hive.ql.metadata.Hive.loadPartition(Hive.java:1158)
 at org.apache.hadoop.hive.ql.exec.MoveTask.execute(MoveTask.java:304)
 at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
 at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
 at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1331)
 at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1117)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:950)
 at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
 at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:630)
 at org.apache.hadoop.hive.service.ThriftHive$Processor$execute.getResult(ThriftHive.java:618)
 at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
 at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
 at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)
 Caused by: MetaException(message:The transaction for alter partition did not commit successfully.)
 at org.apache.hadoop.hive.metastore.ObjectStore.alterPartition(ObjectStore.java:1927)
 at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.hive.metastore.RetryingRawStore.invoke(RetryingRawStore.java:111)
 at $Proxy0

Re: Scheduling the next Hive Contributors Meeting

2013-11-08 Thread Nitin Pawar
I am not a contributor, but a spectator of what Hive has been doing for the
last couple of years.
I work out of India and would love to just sit back and listen to all the
new upcoming things (if that's allowed) :)


On Sat, Nov 9, 2013 at 1:08 AM, Brock Noland br...@cloudera.com wrote:

 Hi,

 Thanks Carl and Thejas! I would be attending remotely so the webex or
 google hangout would be very much appreciated. Please let me know if there
 is anything I can do to help enable either a webex or hangout!

 The Apache Sentry (incubating)[1] community which depends on Hive would be
 interested in briefly describing the project to the Hive community and
 discuss how we can work together to move both projects forward!  As a side
 note, there have been lively discussions on the integration of other
 incubating projects therefore I'd just like to share that the changes
 Sentry is interested in are very small in scope and unlikely to cause
 disruption to the Hive community.

 Cheers!
 Brock

 [1] http://incubator.apache.org/projects/sentry.html


 On Fri, Nov 8, 2013 at 1:08 PM, Carl Steinbach c...@apache.org wrote:

  We're long overdue for a Hive Contributors Meeting. Thejas has offered to
  host the next meeting at Hortonworks on November 19th from 4-6pm. We will
  have a Google Hangout or Webex setup for people who wish to attend
  remotely. If you want to attend but can't because of a scheduling
  conflict, please let us know. If enough people fall into this category we
  will try to reschedule.
 
  Thanks.
 
  Carl
 




-- 
Nitin Pawar


Re: Skip trash while dropping Hive table

2013-11-04 Thread Nitin Pawar
On the Hive CLI I normally put set fs.trash.interval=0; in .hiverc and use
that.

This setting is HDFS-related, and I would not recommend setting it in
hdfs-site.xml, as it would then apply across all of HDFS, which is not
desirable most of the time.
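A minimal sketch of the session-scoped approach (the table name is a
placeholder):

-- in ~/.hiverc, or typed at the CLI before the drop
-- (applies to this session only, not cluster-wide)
set fs.trash.interval=0;
drop table my_table;   -- the table's files now bypass the HDFS trash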


On Tue, Nov 5, 2013 at 5:28 AM, Chu Tong chut...@altiscale.com wrote:

 Hi all,

 Is there an existing way to drop Hive tables without having the deleted
 files hitting trash? If not, can we add something similar to Hive for this?


 Thanks a lot.




-- 
Nitin Pawar


Re: Single Mapper - HIVE 0.11

2013-10-09 Thread Nitin Pawar
What's the size of the table (in GBs)?

What max and min split sizes have you provided?
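For reference, these are the knobs I mean; a sketch only, and the values
are placeholders:

set hive.auto.convert.join=false;     -- avoid the single-mapper local map join
set mapred.min.split.size=67108864;   -- 64 MB, placeholder value
set mapred.max.split.size=268435456;  -- 256 MB, placeholder value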


On Wed, Oct 9, 2013 at 10:28 PM, Gourav Sengupta gourav.had...@gmail.com wrote:

 Hi,

 I am trying to run a join using two tables stored in ORC file format.

 The first table has 34 million records and the second has around 300,000
 records.

 Setting hive.auto.convert.join=true makes the entire query run via a
 single mapper.
 If I set hive.auto.convert.join=false, then there are two mappers: the
 first one reads the second table, and then the entire large table goes
 through the second mapper.

 Is there something I am doing wrong? There are three nodes in the Hadoop
 cluster currently, and I was expecting at least 6 mappers to be used.

 Thanks and Regards,
 Gourav




-- 
Nitin Pawar


[jira] [Created] (HIVE-5432) self join for a table with serde definition fails with classNotFoundException, single queries work fine

2013-10-03 Thread Nitin Pawar (JIRA)
Nitin Pawar created HIVE-5432:
-

 Summary: self join for a table with serde definition fails with 
classNotFoundException, single queries work fine
 Key: HIVE-5432
 URL: https://issues.apache.org/jira/browse/HIVE-5432
 Project: Hive
  Issue Type: Bug
  Components: CLI
Affects Versions: 0.11.0
 Environment: rhel6.4 
Reporter: Nitin Pawar


Steps to reproduce 

hive> add jar /home/hive/udfs/hive-serdes-1.0-SNAPSHOT.jar;
Added /home/hive/udfs/hive-serdes-1.0-SNAPSHOT.jar to class path
Added resource: /home/hive/udfs/hive-serdes-1.0-SNAPSHOT.jar
hive> create table if not exists test(a string,b string) ROW FORMAT SERDE
'com.cloudera.hive.serde.JSONSerDe';
OK
Time taken: 0.159 seconds
hive> load data local inpath '/tmp/1' overwrite into table test;
Copying data from file:/tmp/1
Copying file: file:/tmp/1
Loading data to table default.test
Table default.test stats: [num_partitions: 0, num_files: 1, num_rows: 0, 
total_size: 51, raw_data_size: 0]
OK
Time taken: 0.659 seconds

hive> select a from test;
Total MapReduce jobs = 1
Launching Job 1 out of 1
...
...

hive> select * from (select b from test where a=test)x join (select b from
test where a=test1)y on (x.b = y.b);
Total MapReduce jobs = 1
setting HADOOP_USER_NAME hive
Execution log at: /tmp/hive/.log
java.lang.ClassNotFoundException: com.cloudera.hive.serde.JSONSerDe
Continuing ...
2013-10-03 05:13:00 Starting to launch local task to process map join;  
maximum memory = 1065484288
org.apache.hadoop.hive.ql.metadata.HiveException: Failed with exception 
nulljava.lang.NullPointerException
at org.apache.hadoop.hive.ql.exec.FetchOperator.getRowInspectorFromTable(FetchOperator.java:230)
at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:595)
at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:406)
at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:290)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:682)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)

at org.apache.hadoop.hive.ql.exec.FetchOperator.getOutputObjectInspector(FetchOperator.java:631)
at org.apache.hadoop.hive.ql.exec.MapredLocalTask.initializeOperators(MapredLocalTask.java:406)
at org.apache.hadoop.hive.ql.exec.MapredLocalTask.executeFromChildJVM(MapredLocalTask.java:290)
at org.apache.hadoop.hive.ql.exec.ExecDriver.main(ExecDriver.java:682)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Execution failed with exit status: 2
Obtaining error information

Task failed!
Task ID:




--
This message was sent by Atlassian JIRA
(v6.1#6144)


Self join issue

2013-10-03 Thread Nitin Pawar
Hi,

I just raised a ticket for a table with a self-join query. The table is
created with the JSON serde provided by Cloudera.

When I run a single query on the table, like select col from table where
col='xyz', it works perfectly fine with a mapreduce job.

But when I try to run a self-join query on the table, it says the serde is
not found during query parsing.

I have described the steps in detail on JIRA HIVE-5432
(https://issues.apache.org/jira/browse/HIVE-5432).

Can somebody tell me what's special about how the query is parsed for a
join compared to a standalone query?

Due to this issue, I have to create temporary tables and make sure I clean
them up myself after the jobs are over; a sketch of that workaround is below.
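For clarity, the workaround looks roughly like this (the filter values are
hypothetical, mirroring the query from the JIRA):

create table tmp_x as select b from test where a='test';
create table tmp_y as select b from test where a='test1';
select * from tmp_x x join tmp_y y on (x.b = y.b);
drop table tmp_x;
drop table tmp_y;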

Thanks,
Nitin Pawar


Re: Error - loading data into tables

2013-10-01 Thread Nitin Pawar
Manickam,

I am really not sure if hive supports Federated namespaces yet.
I have cc'd dev list.

May be any of the core hive developers will be able to tell how to load
data using hive on a federated hdfs.


On Tue, Oct 1, 2013 at 12:59 PM, Manickam P manicka...@outlook.com wrote:

 Hi Pawar,

 I tried that option but it is not working. I have a federated HDFS cluster
 and given below is my core-site.xml.

 I created the HDFS directory inside /home/storage/mount1 and tried to load
 the file; now I'm also getting the same error.

 Can you please tell me what mistake I'm making here? Because I don't have
 any clue.


 <configuration>
   <property>
     <name>fs.default.name</name>
     <value>viewfs:</value>
   </property>
   <property>
     <name>fs.viewfs.mounttable.default.link./home/storage/mount1</name>
     <value>hdfs://10.108.99.68:8020</value>
   </property>
   <property>
     <name>fs.viewfs.mounttable.default.link./home/storage/mount2</name>
     <value>hdfs://10.108.99.69:8020</value>
   </property>
 </configuration>


 Thanks,
 Manickam P

 --
 Date: Mon, 30 Sep 2013 21:53:03 +0530
 Subject: Re: Error - loading data into tables
 From: nitinpawar...@gmail.com
 To: u...@hive.apache.org


 Is this /home/storage/... an HDFS directory?
 I think it's a normal filesystem directory.

 Try running this:
 load data local inpath '/home/storage/mount1/tabled.txt' INTO TABLE TEST;


 On Mon, Sep 30, 2013 at 7:13 PM, Manickam P manicka...@outlook.com wrote:

 Hi,

 I'm getting the below error while loading data into a Hive table:
 return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask

 I used the query LOAD DATA INPATH '/home/storage/mount1/tabled.txt' INTO
 TABLE TEST; to insert into the table.


 Thanks,
 Manickam P




 --
 Nitin Pawar




-- 
Nitin Pawar


Re: Hive Issue

2013-09-02 Thread Nitin Pawar
 at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1148)
 at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
 at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:521)
 ... 44 more
 Caused by: java.net.ConnectException: Connection refused
 at java.net.PlainSocketImpl.socketConnect(Native Method)
 at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
 at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
 at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
 at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
 at java.net.Socket.connect(Socket.java:529)
 at java.net.Socket.connect(Socket.java:478)
 at java.net.Socket.<init>(Socket.java:375)
 at java.net.Socket.<init>(Socket.java:218)
 at com.mysql.jdbc.StandardSocketFactory.connect(StandardSocketFactory.java:257)
 at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:294)
 ... 63 more
 Nested Throwables StackTrace:
 com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

 The last packet sent successfully to the server was 0 milliseconds ago.
 The driver has not received any packets from the server.
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
 at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1116)
 at com.mysql.jdbc.MysqlIO.<init>(MysqlIO.java:344)
 at com.mysql.jdbc.ConnectionImpl.coreConnect(ConnectionImpl.java:2332)
 at com.mysql.jdbc.ConnectionImpl.connectOneTryOnly(ConnectionImpl.java:2369)
 at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2153)
 at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:792)
 at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:47)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at com.mysql.jdbc.Util.handleNewInstance(Util.java:411)
 at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:381)
 at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:305)
 at java.sql.DriverManager.getConnection(DriverManager.java:582)
 at java.sql.DriverManager.getConnection(DriverManager.java:185)
 at org.apache.commons.dbcp.DriverManagerConnectionFactory.createConnection(DriverManagerConnectionFactory.java:75)
 at org.apache.commons.dbcp.PoolableConnectionFactory.makeObject(PoolableConnectionFactory.java:582)
 at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:1148)
 at org.apache.commons.dbcp.PoolingDataSource.getConnection(PoolingDataSource.java:106)
 at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:521)
 at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:290)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
 at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
 at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:593)
 at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:300)
 at org.datanucleus.ObjectManagerFactoryImpl.initialiseStoreManager(ObjectManagerFactoryImpl.java:161)
 at org.datanucleus.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:583)



-- 
Nitin Pawar


Re: Last time request for cwiki update privileges

2013-08-21 Thread Nitin Pawar




-- 
Nitin Pawar


Re: Hive Metastore Server 0.9 Connection Reset and Connection Timeout errors

2013-07-30 Thread Nitin Pawar
The mentioned flow is invoked when you have an unsecured mode of thrift
metastore client-server connection. So one way to avoid this is to use a
secure connection.

public boolean process(final TProtocol in, final TProtocol out) throws TException {
  setIpAddress(in);
  ...
  ...
  ...

@Override
protected void setIpAddress(final TProtocol in) {
  TUGIContainingTransport ugiTrans = (TUGIContainingTransport) in.getTransport();
  Socket socket = ugiTrans.getSocket();
  if (socket != null) {
    setIpAddress(socket);


From the above code snippet, it looks like the null pointer exception is
not handled if getSocket returns null.

Can you check what the ulimit setting is on the server? If it is set to the
default, can you set it to unlimited and restart the hcat server? (This is
just a wild guess.)

Also, the getSocket documentation suggests: if the underlying TTransport is
an instance of TSocket, it returns the Socket object which it contains;
otherwise it returns null.

So someone from the Thrift gurus needs to tell us what is happening; I have
no knowledge at this depth.

Maybe Ashutosh or Thejas will be able to help on this.




From the netstat CLOSE_WAIT, it looks like the Hive metastore server has
not closed the connection (I do not know why yet); maybe the Hive dev guys
can help. Are there too many connections in CLOSE_WAIT state?



On Tue, Jul 30, 2013 at 5:52 AM, agateaaa agate...@gmail.com wrote:

 Looking at the hive metastore server logs see errors like these:

 2013-07-26 06:34:52,853 ERROR server.TThreadPoolServer
 (TThreadPoolServer.java:run(182)) - Error occurred during processing of
 message.
 java.lang.NullPointerException
 at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.setIpAddress(TUGIBasedProcessor.java:183)
 at org.apache.hadoop.hive.metastore.TUGIBasedProcessor.process(TUGIBasedProcessor.java:79)
 at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:176)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:662)

 approx same time as we see timeout or connection reset errors.

 Don't know if this is the cause or a side effect of the connection
 timeout/connection reset errors. Does anybody have any pointers or
 suggestions?

 Thanks


 On Mon, Jul 29, 2013 at 11:29 AM, agateaaa agate...@gmail.com wrote:

  Thanks Nitin!
 
  We have a similar setup (identical hcatalog and hive server versions) on
  another production environment and don't see any errors (it's been
  running ok for a few months).
 
  Unfortunately we wont be able to move to hcat 0.5 and hive 0.11 or hive
  0.10 soon.
 
  I did see that the last time we ran into this problem, doing a netstat -ntp
  | grep :1 showed that the server was holding on to one socket connection
  in CLOSE_WAIT state for a long time (the hive metastore server is running
  on port 1). Don't know if that's relevant here or not.
 
  Can you suggest any hive configuration settings we can tweak, or any
  networking tools/tips we can use to narrow this down?
 
  Thanks
  Agateaaa
 
 
 
 
  On Mon, Jul 29, 2013 at 11:02 AM, Nitin Pawar nitinpawar...@gmail.com
 wrote:
 
  Is there any chance you can do an update on a test environment with
  hcat-0.5 and hive-0.11 (or 0.10) and see if you can reproduce the issue?

  We used to see this error when there was load on the hcat server or some
  network issue connecting to the server (the second one was a rare
  occurrence).
 
 
  On Mon, Jul 29, 2013 at 11:13 PM, agateaaa agate...@gmail.com wrote:
 
  Hi All:
 
  We are running into a frequent problem using HCatalog 0.4.1 (Hive
  Metastore Server 0.9) where we get connection reset or connection timeout
  errors.

  The hive metastore server has been allocated enough (12G) memory.

  This is a critical problem for us and we would appreciate it if anyone
  has any pointers.

  We did add retry logic in our client, which seems to help, but I am just
  wondering how we can narrow down the root cause of this problem. Could
  this be a hiccup in networking which causes the hive server to get into
  an unresponsive state?
 
  Thanks
 
  Agateaaa
 
 
  Example Connection reset error:
  ===
 
  org.apache.thrift.transport.TTransportException: java.net.SocketException: Connection reset
  at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
  at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
  at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
  at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
  at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:69

Re: HCatalog (from Hive 0.11) and Hadoop 2

2013-07-29 Thread Nitin Pawar
There is a build scheduled on Jenkins for Hive trunk, which is failing.
I will give it a try on my local machine for hive-0.11; there is another
build which runs the ptests, but it is disabled due to lots of test case
failures.

https://builds.apache.org/job/Hive-trunk-hadoop2/

I will update you if I can build it.




On Mon, Jul 29, 2013 at 8:07 PM, Rodrigo Trujillo 
rodrigo.truji...@linux.vnet.ibm.com wrote:

 Hi,

 is it possible to build Hive 0.11 and HCatalog with Hadoop 2 (2.0.4-alpha)
 ??

 Regards,

 Rodrigo




-- 
Nitin Pawar


Re: Hive Metastore Server 0.9 Connection Reset and Connection Timeout errors

2013-07-29 Thread Nitin Pawar
 at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2092)
 at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2102)
 at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:888)
 at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:830)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:954)
 at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:7524)
 at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:243)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:431)
 at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
 at org.apache.hadoop.hive.ql.Driver.run(Driver.java:909)
 at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
 at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
 at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:341)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:642)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
 Caused by: java.net.SocketTimeoutException: Read timed out
 at java.net.SocketInputStream.socketRead0(Native Method)
 at java.net.SocketInputStream.read(SocketInputStream.java:129)
 at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
 ... 31 more




-- 
Nitin Pawar


Re: ant maven-build not working in trunk

2013-06-13 Thread Nitin Pawar
I just tried a build with both JDK versions.

Build: ant clean package
jdk7 on branch-0.10 with the patch from HIVE-3384: works
jdk6 on trunk without any changes: works

I created a new Red Hat VM, installed Sun JDK 6u43, and tried it. It works
too.

When I try ant maven-build -Dmvn.publish.repo=local, it does fail with the
make-pom target not existing.
Alan has a JIRA on this: https://issues.apache.org/jira/browse/HIVE-4387

There is a patch available there for branch-0.11. I will try to build with
that.



On Thu, Jun 13, 2013 at 10:14 AM, amareshwari sriramdasu 
amareshw...@gmail.com wrote:

 Nitin,

 Hive does not compile with JDK 7. You have to use JDK 6 for compiling.


 On Wed, Jun 12, 2013 at 9:42 PM, Nitin Pawar nitinpawar...@gmail.com
 wrote:

  I tried the build on trunk.

  I did not hit the make-pom issue, but I hit the JDBC issue with JDK 7.
  I will apply the patch and try again.
 
 
  On Wed, Jun 12, 2013 at 4:48 PM, amareshwari sriramdasu 
  amareshw...@gmail.com wrote:
 
  Hello,
 
  ant maven-build -Dmvn.publish.repo=local fails to build hcatalog with the
  following error:

  /home/amareshwaris/hive/build.xml:121: The following error occurred while executing this line:
  /home/amareshwaris/hive/build.xml:123: The following error occurred while executing this line:
  Target make-pom does not exist in the project hcatalog.
 
  Was curious to know if I'm the only one facing this, or is there any
  other way to publish maven artifacts locally?
 
  Thanks
  Amareshwari
 
 
 
 
  --
  Nitin Pawar
 




-- 
Nitin Pawar


Re: ant maven-build not working in trunk

2013-06-12 Thread Nitin Pawar
I tried the build on trunk.

I did not hit the make-pom issue, but I hit the JDBC issue with JDK 7.
I will apply the patch and try again.


On Wed, Jun 12, 2013 at 4:48 PM, amareshwari sriramdasu 
amareshw...@gmail.com wrote:

 Hello,

 ant maven-build -Dmvn.publish.repo=local fails to build hcatalog with the
 following error:

 /home/amareshwaris/hive/build.xml:121: The following error occurred while executing this line:
 /home/amareshwaris/hive/build.xml:123: The following error occurred while executing this line:
 Target make-pom does not exist in the project hcatalog.

 Was curious to know if I'm the only one facing this, or is there any other
 way to publish maven artifacts locally?

 Thanks
 Amareshwari




-- 
Nitin Pawar


adding a new property for hive history file HIVE-1708

2013-04-17 Thread Nitin Pawar
Hi Guys,

I am trying to work on this JIRA
HIVE-1708https://issues.apache.org/jira/browse/HIVE-1708
.

I have added one property, HIVE_CLI_ENABLE_LOGGING, to enable or disable
the history, and tested it.
I am stuck at one point: what should be the default value for
HIVE_CLI_HISTORY_FILE_PATH?

Currently this is set to:
String historyDirectory = System.getProperty("user.home");
String historyFile = historyDirectory + File.separator + HISTORYFILE;

Any ideas on what the default path should be then?

-- 
Nitin Pawar


Re: plugable partitioning

2013-04-15 Thread Nitin Pawar
Whenever you create a partition in Hive, it needs to be registered with the
metadata store. So the short answer is that partition data is looked up from
the metadata store instead of the actual source data.
Having a lot of partitions does slow Hive down (around 1+). Normally I
have not seen anyone using hourly partitions; you may want to look at adding
daily partitions and bucketing by hour.

But if you are adding data directly into partition directories, then there
is no alternative other than adding the partitions to the metadata store
manually via alter table ... add partition; see the sketch below.

If you are using HCatalog as the metadata store, then it does provide an API
to register your partition, so you can automate your data loading and
registration in a single flow.
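A minimal HiveQL sketch of the manual-registration path (table names and
paths are hypothetical):

-- register a directory written outside Hive as a partition of an external table
ALTER TABLE logs ADD IF NOT EXISTS PARTITION (dt='2013-04-15')
LOCATION '/data/logs/2013/04/15';

-- the alternative layout suggested above: daily partitions, bucketed by hour
CREATE TABLE logs_daily (msg STRING, hr INT)
PARTITIONED BY (dt STRING)
CLUSTERED BY (hr) INTO 24 BUCKETS;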

Others will correct me if I have made any wrong assumptions.


On Mon, Apr 15, 2013 at 8:15 PM, Steve Hoffman ste...@goofy.net wrote:

 Looking for some pointers on where the partitioning is figured out in the
 source when a query is executed.
 I'm investigating an alternative partitioning scheme based on date patterns
 (using external tables).

 The situation is that I have data being written to some HDFS root directory
 with some dated pattern (i.e. /MM/DD).  Today I have to run an alter
 table to insert this partition every day.  It gets worse if you have hourly
 partitions.  This seems like it can be described once (root + date
 partition pattern in the metastore).

 So looking for some pointers on where in the code this is currently
 handled.

 Thanks,
 Steve




-- 
Nitin Pawar


[jira] [Commented] (HIVE-1708) make hive history file configurable

2013-04-12 Thread Nitin Pawar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13630609#comment-13630609
 ] 

Nitin Pawar commented on HIVE-1708:
---

I added a new setting to hive-site.xml, made some changes in the CLI code,
and tested it for making hive history optional.

I wanted to add one more property for the hive history file path, but
currently it is set to .hivehistory inside each individual user's home
directory. If I have to retain this property, how will I keep the default
value in hive-site.xml? As all the users will have different home
directories on different Linux distributions, how do we default the path
then?

Can we change the file path to something like the log location, which
resides inside /tmp? Is that an acceptable change?

 make hive history file configurable
 ---

 Key: HIVE-1708
 URL: https://issues.apache.org/jira/browse/HIVE-1708
 Project: Hive
  Issue Type: Improvement
  Components: Query Processor
Reporter: Namit Jain

 Currently, it is derived from
 System.getProperty("user.home")/.hivehistory;

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HIVE-4231) Build fails with WrappedRuntimeException: Content is not allowed in prolog. when _JAVA_OPTIONS=-Dfile.encoding=UTF-8

2013-04-11 Thread Nitin Pawar (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-4231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628789#comment-13628789
 ] 

Nitin Pawar commented on HIVE-4231:
---

Even I am running into the same issue when trying to build the Hive project.
I have the same environment as Sho, but the OS is RHEL 6.3.

The log says exactly the same; to add to that, the contents of the failed
xml file are:
[root@localhost branch-0.10]# cat /root/apache/hive/branch-0.10/build/builtins/metadata/class-info.xml
<ClassList>
Exception in thread "main" java.lang.UnsupportedClassVersionError: org/apache/hadoop/hive/ql/exec/Description : Unsupported major.minor version 51.0
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
at java.lang.ClassLoader.defineClass(ClassLoader.java:615)
at 
java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
at org.apache.hive.pdk.FunctionExtractor.main(FunctionExtractor.java:27)
[root@localhost branch-0.10]# 


 Build fails with WrappedRuntimeException: Content is not allowed in prolog. 
 when _JAVA_OPTIONS=-Dfile.encoding=UTF-8
 

 Key: HIVE-4231
 URL: https://issues.apache.org/jira/browse/HIVE-4231
 Project: Hive
  Issue Type: Bug
Affects Versions: 0.11.0
Reporter: Sho Shimauchi
Priority: Minor

 Build failed with the following error when I set _JAVA_OPTIONS to
 -Dfile.encoding=UTF-8:
 {code}
 extract-functions:
  [xslt] Processing 
 /Users/sho/src/apache/hive/build/builtins/metadata/class-info.xml to 
 /Users/sho/src/apache/hive/build/builtins/metadata/class-registration.sql
  [xslt] Loading stylesheet 
 /Users/sho/src/apache/hive/pdk/scripts/class-registration.xsl
  [xslt] : Error! Content is not allowed in prolog.
  [xslt] : Error! 
 com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Content is not 
 allowed in prolog.
  [xslt] Failed to process 
 /Users/sho/src/apache/hive/build/builtins/metadata/class-info.xml
 BUILD FAILED
 /Users/sho/src/apache/hive/build.xml:517: The following error occurred while 
 executing this line:
 /Users/sho/src/apache/hive/builtins/build.xml:37: The following error 
 occurred while executing this line:
 /Users/sho/src/apache/hive/pdk/scripts/build-plugin.xml:118: 
 javax.xml.transform.TransformerException: 
 javax.xml.transform.TransformerException: 
 com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Content is not 
 allowed in prolog.
   at 
 com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:735)
   at 
 com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:336)
   at 
 org.apache.tools.ant.taskdefs.optional.TraXLiaison.transform(TraXLiaison.java:194)
   at 
 org.apache.tools.ant.taskdefs.XSLTProcess.process(XSLTProcess.java:852)
   at 
 org.apache.tools.ant.taskdefs.XSLTProcess.execute(XSLTProcess.java:388)
   at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
   at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
   at org.apache.tools.ant.Task.perform(Task.java:348)
   at org.apache.tools.ant.Target.execute(Target.java:390)
   at org.apache.tools.ant.Target.performTasks(Target.java:411)
   at org.apache.tools.ant.Project.executeSortedTargets(Project.java:1399)
   at 
 org.apache.tools.ant.helper.SingleCheckExecutor.executeTargets(SingleCheckExecutor.java:38)
   at org.apache.tools.ant.Project.executeTargets(Project.java:1251)
   at org.apache.tools.ant.taskdefs.Ant.execute(Ant.java:442)
   at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:291)
   at sun.reflect.GeneratedMethodAccessor4.invoke(Unknown Source)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java

Hive compilation issues on branch-0.10 and trunk

2013-04-11 Thread Nitin Pawar
Hello,

I am trying to build Hive on both trunk and branch-0.10.

I have tried both Sun JDK 6 and JDK 7, and with both versions I am running
into different issues:

with JDK6 I run into the issue mentioned in HIVE-4231
with JDK7 I run into the issue mentioned in HIVE-3384

Can somebody please help out with this?

What would be the recommended JDK version going forward for development
activities?

-- 
Nitin Pawar


Re: Hive compilation issues on branch-0.10 and trunk

2013-04-11 Thread Nitin Pawar
Hi Mark,

Yes I applied the patch and got it working with JDK7.

Can we continue using JDK7?

Thanks,
Nitin
On Apr 11, 2013 8:48 PM, Mark Grover grover.markgro...@gmail.com wrote:

 Nitin,
 I have been able to build hive trunk with JDK 1.6. Did you try the
 workaround listed in HIVE-4231?

 Mark

 On Thu, Apr 11, 2013 at 2:42 AM, Nitin Pawar nitinpawar...@gmail.com
 wrote:

  Hello,
 
  I am trying to build Hive on both trunk and branch-0.10.
 
  I have tried both Sun JDK 6 and JDK 7, and with both versions I am
  running into different issues:
 
  with JDK6 I run into the issue mentioned in HIVE-4231
  with JDK7 I run into the issue mentioned in HIVE-3384
 
  Can somebody please help out with this?
 
  What would be the recommended JDK version going forward for development
  activities?
 
  --
  Nitin Pawar
 



[jira] [Created] (HIVE-2980) Show a warning or an error when the data directory is empty or not existing

2012-04-24 Thread Nitin Pawar (JIRA)
Nitin Pawar created HIVE-2980:
-

 Summary: Show a warning or an error when the data directory is 
empty or not existing 
 Key: HIVE-2980
 URL: https://issues.apache.org/jira/browse/HIVE-2980
 Project: Hive
  Issue Type: Improvement
Reporter: Nitin Pawar


It looks like a good idea to show a warning or an error when the data
directory is missing or empty.

This will help cut down debugging time, and it is also good information to
have about deleted data.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HIVE-2814) Can we have a feature to disable creating empty buckets on a larger number of buckets creates?

2012-02-22 Thread Nitin Pawar (Created) (JIRA)
Can we have a feature to disable creating empty buckets on a larger number of 
buckets creates? 
---

 Key: HIVE-2814
 URL: https://issues.apache.org/jira/browse/HIVE-2814
 Project: Hive
  Issue Type: Bug
Reporter: Nitin Pawar
Priority: Minor


When we create buckets on larger datasets, it is not often that all the
partitions have the same number of buckets, so we choose the largest
possible number to capture most of the buckets.

This results in creating a lot of empty buckets, which can be an overhead
for Hadoop as well as for Hive queries.
It also takes a lot of time just to create the empty buckets.

Is there a way where I can say do not create empty buckets? A sketch of the
scenario is below.
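For illustration, a minimal HiveQL sketch of the kind of setup that produces
the empty buckets (table names and bucket count are hypothetical):

SET hive.enforce.bucketing=true;
CREATE TABLE events (id STRING, payload STRING)
PARTITIONED BY (dt STRING)
CLUSTERED BY (id) INTO 256 BUCKETS;
-- an insert into a small partition still creates 256 bucket files,
-- most of them empty
INSERT OVERWRITE TABLE events PARTITION (dt='2012-02-22')
SELECT id, payload FROM staging_events WHERE dt='2012-02-22';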

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira