Re: Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-10-01 Thread Ning Zhang
Amareshwari, this should be a metastore issue. Did you see this kind of issue 
in create table (without select) alone?

Which RDBMS are you using for metastore? MySQL or Oracle? Was your 
database/thrift server heavily loaded at that time?

On Sep 30, 2010, at 10:28 PM, Amareshwari Sri Ramadasu wrote:

Carl, we are using trunk version.


On 10/1/10 10:16 AM, Carl Steinbach c...@cloudera.com wrote:

Hi Amareshwari,

Which version of Hive are you using to run the Hive metastore server?

Thanks.

Carl

On Thu, Sep 30, 2010 at 9:25 PM, Amareshwari Sri Ramadasu 
amar...@yahoo-inc.com wrote:
Hi,

Create table as select queries fail with 
org.apache.thrift.TApplicationException in our clusters for some queries.

Following is the stack trace for the exception:
Error in metadata: org.apache.thrift.TApplicationException: Internal error 
processing create_table
org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.thrift.TApplicationException: Internal error processing
create_table
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:405)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:2465)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:180)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:108)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:895)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:764)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:640)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:140)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:353)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: org.apache.thrift.TApplicationException: Internal error processing 
create_table
at 
org.apache.thrift.TApplicationException.read(TApplicationException.java:107)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table(ThriftHiveMetastore.java:566)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table(ThriftHiveMetastore.java:549)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:281)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:399)

Even when we tried with hive.metastore.connect.retries=10, it still fails, 
though it succeeds occasionally.

The metastore logs have the following exception
Internal error processing create_table
java.lang.RuntimeException: Commit is called, but transaction is not active. 
Either there are mismatching open and
close calls or rollback was called in the same trasaction
at 
org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:250)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:795)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.access$600(HiveMetaStore.java:79)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$13.run(HiveMetaStore.java:816)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$13.run(HiveMetaStore.java:813)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:234)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:813)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table.process(ThriftHiveMetastore.java:1992)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor.process(ThriftHiveMetastore.java:1644)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

Can somebody help us find the root cause of the problem?

Thanks
Amareshwari






Re: Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-10-01 Thread S. Venkatesh
 Amareshwari, this should be a metastore issue. Did you see this kind of
 issue in create table (without select) alone?
Nope.

 Which RDBMS are you using for metastore? MySQL or Oracle?
MySQL

 Was your
 database/thrift server heavily loaded at that time?
Running with just one user.

-- 
Regards,
Venkatesh

“Perfection (in design) is achieved not when there is nothing more to
add, but rather when there is nothing more to take away.”
- Antoine de Saint-Exupéry


Re: Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-10-01 Thread Amareshwari Sri Ramadasu
No. We did not see any issue with create table (without select). Create table 
as select also passes sometimes.
We are using MySQL for our metastore.
Will get back to you on the load at the time of the create. Is there a direct 
way to know the load?
So, what is the solution/workaround if there is a metastore issue?

Thanks
Amareshwari

On 10/1/10 11:38 AM, Ning Zhang nzh...@facebook.com wrote:

Amareshwari, this should be a metastore issue. Did you see this kind of issue 
in create table (without select) alone?

Which RDBMS are you using for metastore? MySQL or Oracle? Was your 
database/thrift server heavily loaded at that time?


Re: Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-10-01 Thread Ning Zhang
If your success rate is increased by raising hive.metastore.connect.retries, 
most likely your Thrift server is too busy to accept connections. Can you check 
your /tmp/userid/hive.log on your local machine and see if there are any 
exceptions related to the metastore?
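
For example, something like this on the command line (the retry value and the 
query here are only placeholders):

  hive -hiveconf hive.metastore.connect.retries=10 \
       -hiveconf hive.root.logger=DEBUG,DRFA \
       -e 'CREATE TABLE t1 AS SELECT ...'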

On Sep 30, 2010, at 11:21 PM, S. Venkatesh wrote:

 Amareshwari, this should be a metastore issue. Did you see this kind of
 issue in create table (without select) alone?
 Nope.
 
 Which RDBMS are you using for metastore? MySQL or Oracle?
 MySQL
 
 Was your
 database/thrift server heavily loaded at that time?
 Running with just one user.
 
 -- 
 Regards,
 Venkatesh
 
 “Perfection (in design) is achieved not when there is nothing more to
 add, but rather when there is nothing more to take away.”
 - Antoine de Saint-Exupéry



Re: Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-10-01 Thread Carl Steinbach
It looks like the root of the problem is that the code in ObjectStore is not
detecting failed transactions, and hence is not rolling them back.

Using JDO with locally managed transactions, you're expected to do something
like this:

Transaction tx = pm.currentTransaction();
try {
  tx.begin();
  // ... do some stuff ...
  tx.commit();
} finally {
  if (tx.isActive()) {
tx.rollback();
  }
}

But the code in ObjectStore instead wraps these methods in openTransaction()
and commitTransaction(). After calling tx.commit(), the method
commitTransaction() should then call tx.isActive() to check whether the
transaction failed and call rollback if appropriate. Currently it doesn't do
this and instead always returns false.
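
Concretely, commitTransaction() would need something along these lines (a rough 
sketch of the idea against the javax.jdo.Transaction API, not the actual 
ObjectStore code, and ignoring the nested openTransaction() bookkeeping it also 
does):

  public boolean commitTransaction() {
    Transaction tx = pm.currentTransaction();
    tx.commit();
    if (tx.isActive()) {
      // Commit did not complete; roll back so the next call starts clean.
      tx.rollback();
      return false;
    }
    return true;
  }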

I'm filing a ticket against this now.

Thanks.

Carl


Re: Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-10-01 Thread S. Venkatesh
On Fri, Oct 1, 2010 at 12:24 PM, Ning Zhang nzh...@facebook.com wrote:
 If your success rate is increased by raising hive.metastore.connect.retries, 
 most likely your Thrift server is too busy to accept connections. Can you 
 check your /tmp/userid/hive.log on your local machine and see if there are 
 any exceptions related to the metastore?
Ning, the load on the metastore is very low; it's just this one user executing
the CTAS. No one else is using this setup but QE.


Re: Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-10-01 Thread Carl Steinbach
Hi Venkatesh,

I filed HIVE-1679 which covers the issue I described earlier as well
as HIVE-1681 which I think is the real root cause of the problem you
are seeing. Please see HIVE-1681 for more information.

Thanks.

Carl

On Fri, Oct 1, 2010 at 12:51 AM, S. Venkatesh venkat...@innerzeal.com wrote:

 We are seeing this exception:

 Internal error processing create_table
 java.lang.RuntimeException: Commit is called, but transaction is not active.
 Either there are mismatching open and close calls or rollback was called in the
 same transaction

 Carl, this suggests that the transaction is not active when commit was
 called. I concur with your observation.

 Venkatesh




Re: Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-10-01 Thread Carl Steinbach
Hi Venkatesh,

I attached an interim patch to HIVE-1681. Please try applying this and let
me know if it fixes your problem.

Thanks.

Carl






Re: Error: java.lang.NullPointerException

2010-09-30 Thread vaibhav negi
Hello,

I am stuck at it. Please help

Regards
Vaibhav Negi


On Wed, Sep 29, 2010 at 5:30 PM, vaibhav negi sssena...@gmail.com wrote:

 Hi,

 I am using hadoop version 0.20.2

 While running the query "select count(*) from table;" on Hive, I am getting this
 error:

 FAILED: Execution Error, return code 2 from
 org.apache.hadoop.hive.ql.exec.MapRedTask

 Error in task tracker logs says :-

 Error: java.lang.NullPointerException
  at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
 at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
 at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)


 How can I debug and correct it?


 Vaibhav Negi



Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-09-30 Thread Amareshwari Sri Ramadasu
Hi,

Create table as select queries fail with 
org.apache.thrift.TApplicationException in our clusters for some queries.

Following is the stack trace for the exception:
Error in metadata: org.apache.thrift.TApplicationException: Internal error 
processing create_table
org.apache.hadoop.hive.ql.metadata.HiveException: 
org.apache.thrift.TApplicationException: Internal error processing
create_table
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:405)
at org.apache.hadoop.hive.ql.exec.DDLTask.createTable(DDLTask.java:2465)
at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:180)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:108)
at 
org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:895)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:764)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:640)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:140)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:353)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: org.apache.thrift.TApplicationException: Internal error processing 
create_table
at 
org.apache.thrift.TApplicationException.read(TApplicationException.java:107)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_create_table(ThriftHiveMetastore.java:566)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.create_table(ThriftHiveMetastore.java:549)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.createTable(HiveMetaStoreClient.java:281)
at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:399)

Even when we tried with hive.metastore.connect.retries=10, it still fails, 
though it succeeds occasionally.

The metastore logs have the following exception
Internal error processing create_table
java.lang.RuntimeException: Commit is called, but transaction is not active. 
Either there are mismatching open and
close calls or rollback was called in the same trasaction
at 
org.apache.hadoop.hive.metastore.ObjectStore.commitTransaction(ObjectStore.java:250)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table_core(HiveMetaStore.java:795)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.access$600(HiveMetaStore.java:79)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$13.run(HiveMetaStore.java:816)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler$13.run(HiveMetaStore.java:813)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.executeWithRetry(HiveMetaStore.java:234)
at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.create_table(HiveMetaStore.java:813)
at
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$create_table.process(ThriftHiveMetastore.java:1992)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor.process(ThriftHiveMetastore.java:1644)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:252)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

Can somebody help us find the root cause of the problem?

Thanks
Amareshwari



Re: Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-09-30 Thread Carl Steinbach
Hi Amareshwari,

Which version of Hive are you using to run the Hive metastore server?

Thanks.

Carl



Re: Create table as select fails with error Error in metadata: org.apache.thrift.TApplicationException

2010-09-30 Thread Amareshwari Sri Ramadasu
Carl, we are using trunk version.







Re: Hive Error: while executing select count(*)

2010-09-29 Thread vaibhav negi
Hi,

Please suggest a remedy for this.


Vaibhav Negi


On Tue, Sep 28, 2010 at 8:33 PM, vaibhav negi sssena...@gmail.com wrote:


 Hi Guru,

 Still same error. This problem is with host name resolution according to
 log.


 Vaibhav Negi



 On Tue, Sep 28, 2010 at 8:16 PM, Guru Prasad 
 guru.pra...@ibibogroup.com wrote:


 Try running
select count(1) from table;

 Thanks & Regards
 ~guru

 On 09/28/2010 08:07 PM, vaibhav negi wrote:

 Hi,

 I am using hadoop version 0.20.2

 While running the query "select count(*) from table;" on Hive, I am getting
 this error:

 FAILED: Execution Error, return code 2 from
 org.apache.hadoop.hive.ql.exec.MapRedTask

 Error in task tracker logs says :-

 Error: java.lang.NullPointerException
  at
 java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
 at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
 at
 org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

 On debugging i found that ReduceTask.java:2683  line is using host which
 is coming null

 ReduceTask.java:2683 ---
  List loc = mapLocations.get(host);

 One user in Hadoop Forum says HADOOP-4744 should be included in hadoop.
 Is this patch required for hadoop 0.20.2 ?
 How to add patch?

 Is this an issue with /etc/hosts file?

 Please help

 Regards
 Vaibhav Negi







Error: java.lang.NullPointerException

2010-09-29 Thread vaibhav negi
Hi,

I am using hadoop version 0.20.2

While running the query "select count(*) from table;" on Hive, I am getting this
error:

FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.MapRedTask

Error in task tracker logs says :-

Error: java.lang.NullPointerException
 at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)


How can I debug and correct it?


Vaibhav Negi


RE: Error starting Hive

2010-09-29 Thread Steven Wong
You just need to run trunk/build/dist/bin/hive, not trunk/bin/hive.
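
That is, assuming the checkout and build from the Getting Started steps quoted 
below:

  $ cd hive
  $ ant package
  $ build/dist/bin/hive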


From: vaibhav negi [mailto:sssena...@gmail.com]
Sent: Tuesday, September 28, 2010 11:08 PM
To: hive-user@hadoop.apache.org
Subject: Re: Error starting Hive

Hi Maha,

Do a find | grep jar, in hive's directory, then cp all jars found to
lib/ off the main hive directory. There are several that are not
automatically copied there as part of the build, and this will fix
things.


Vaibhav Negi

On Wed, Sep 29, 2010 at 11:17 AM, Maha m...@umail.ucsb.edu wrote:
Hi everyone,

   Please, this is an easy question for all of you experienced Hive Squads: why 
am I getting this error:

hive-bin-path$bin/hive
Missing Hive Execution Jar: /cs/sandbox/student/maha/hive/lib/hive-exec-*.jar

I followed all the steps from the Hive Getting Started guide:

  $ svn co http://svn.apache.org/repos/asf/hadoop/hive/trunk hive
  $ cd hive
  $ ant package
  $ cd build/dist
  $ ls


  Any hints?

  Thank you,
 Maha




Re: Error starting Hive

2010-09-29 Thread Maha
Thanks Steven and Vaibhav :)

  Maha




fix for DB_LOCATION_URI NOT NULL migration error?

2010-09-28 Thread Raviv M-G
Does anyone have a fix for the error below? I can see that it is
caused by changes made in HIVE-675, but I can't find a patch or
instructions for migrating the metastore_db that fix the problem.

FAILED: Error in metadata: javax.jdo.JDODataStoreException: Error(s)
were found while auto-creating/validating the datastore for classes.
The errors are printed in the log, and are attached to this exception.
NestedThrowables:
java.sql.SQLSyntaxErrorException: In an ALTER TABLE statement, the
column 'DB_LOCATION_URI' has been specified as NOT NULL and either the
DEFAULT clause was not specified or was specified as DEFAULT NULL.
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask

Thanks,
Raviv


Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-28 Thread Amareshwari Sri Ramadasu
Pradeep, you might be hitting HADOOP-5759 and the job is not getting 
initialized at all. Look in JobTracker logs for the jobid to confirm the same.
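
For example, on the JobTracker machine something like the following should show 
whether the job was ever initialized (the log path is only a guess; it varies 
by install):

  grep job_201009251752_1341 $HADOOP_HOME/logs/hadoop-*-jobtracker-*.log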

On 9/28/10 6:28 AM, Pradeep Kamath prade...@yahoo-inc.com wrote:

Here is some relevant stuff from /tmp/pradeepk/hive.logs - can't make much out 
of it:

2010-09-27 17:40:01,081 INFO  exec.MapRedTask 
(SessionState.java:printInfo(268)) - Starting Job = job_201009251752_1341, 
Tracking URL = 
http://hostname:50030/jobdetails.jsp?jobid=job_201009251752_1341
2010-09-27 17:40:01,081 INFO  exec.MapRedTask 
(SessionState.java:printInfo(268)) - Kill Command = 
/homes/pradeepk/hadoopcluster/hadoop/bin/../bin/hadoop job 
-Dmapred.job.tracker=hostname:50020 -kill job_201009251752_1341
2010-09-27 17:40:01,081 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC 
Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
sending #129
2010-09-27 17:40:01,083 DEBUG ipc.Client (Client.java:receiveResponse(504)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
got value #129
2010-09-27 17:40:01,083 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 2
2010-09-27 17:40:02,086 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC 
Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
sending #130
2010-09-27 17:40:02,090 DEBUG ipc.Client (Client.java:receiveResponse(504)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
got value #130
2010-09-27 17:40:02,091 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 5
2010-09-27 17:40:02,092 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC 
Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
sending #131
2010-09-27 17:40:02,093 DEBUG ipc.Client (Client.java:receiveResponse(504)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
got value #131
2010-09-27 17:40:02,094 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 2
2010-09-27 17:40:02,094 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC 
Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
sending #132
2010-09-27 17:40:02,096 DEBUG ipc.Client (Client.java:receiveResponse(504)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
got value #132
2010-09-27 17:40:02,096 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobProfile 2
2010-09-27 17:40:02,096 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC 
Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
sending #133
2010-09-27 17:40:02,100 DEBUG ipc.Client (Client.java:receiveResponse(504)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
got value #133
2010-09-27 17:40:02,100 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobCounters 4
2010-09-27 17:40:02,101 DEBUG mapred.Counters (Counters.java:init(151)) - 
Creating group org.apache.hadoop.hive.ql.exec.Operator$ProgressCounter with 
nothing
2010-09-27 17:40:02,101 DEBUG mapred.Counters 
(Counters.java:getCounterForName(277)) - Adding CREATED_FILES
2010-09-27 17:40:02,103 INFO  exec.MapRedTask 
(SessionState.java:printInfo(268)) - 2010-09-27 17:40:02,103 Stage-2 map = 
100%,  reduce = 100%
2010-09-27 17:40:02,104 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC 
Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
sending #134
2010-09-27 17:40:02,105 DEBUG ipc.Client (Client.java:receiveResponse(504)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
got value #134
2010-09-27 17:40:02,106 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 2
2010-09-27 17:40:02,106 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC 
Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
sending #135
2010-09-27 17:40:02,108 DEBUG ipc.Client (Client.java:receiveResponse(504)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
got value #135
2010-09-27 17:40:02,108 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobCounters 2
2010-09-27 17:40:02,109 DEBUG mapred.Counters (Counters.java:init(151)) - 
Creating group org.apache.hadoop.hive.ql.exec.Operator$ProgressCounter with 
nothing
2010-09-27 17:40:02,109 DEBUG mapred.Counters 
(Counters.java:getCounterForName(277)) - Adding CREATED_FILES
2010-09-27 17:40:02,109 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC 
Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
sending #136
2010-09-27 17:40:02,111 DEBUG ipc.Client (Client.java:receiveResponse(504)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
got value #136
2010-09-27 17:40:02,111 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 2
2010-09-27 17:40:02,112 ERROR exec.MapRedTask 
(SessionState.java:printError(277)) - Ended Job = job_201009251752_1341 with 
errors

Ning Zhang wrote:
From the error info, it seems the 2nd job has been launched and failed. So I'm 
assuming there are map tasks started? If not, you can find the error message in 
the client log file /tmp/userid/hive.log at the machine running hive after 
setting the hive.root.logger property Steven mentioned.
assuming

Hive Error: while executing select count(*)

2010-09-28 Thread vaibhav negi
Hi,

I am using hadoop version 0.20.2

While running the query "select count(*) from table;" on Hive, I am getting this
error:

FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.MapRedTask

Error in task tracker logs says :-

Error: java.lang.NullPointerException
 at java.util.concurrent.ConcurrentHashMap.get(ConcurrentHashMap.java:768)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.getMapCompletionEvents(ReduceTask.java:2683)
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$GetMapEventsThread.run(ReduceTask.java:2605)

On debugging I found that the line at ReduceTask.java:2683 is using host, which
is coming back null

ReduceTask.java:2683 ---
 List loc = mapLocations.get(host);

One user in the Hadoop forum says HADOOP-4744 should be included in Hadoop. Is
this patch required for Hadoop 0.20.2?
How do I apply the patch?

Is this an issue with /etc/hosts file?
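
For reference, a well-formed /etc/hosts typically maps each node's real IP to
its full and short hostnames; the entry below is entirely made up:

  192.168.1.10   node1.example.com   node1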

Please help

Regards
Vaibhav Negi


Re: Hive Error: while executing select count(*)

2010-09-28 Thread Guru Prasad


Try running
select count(1) from table;

Thanks & Regards
~guru




Re: Hive Error: while executing select count(*)

2010-09-28 Thread vaibhav negi
Hi Guru,

Still same error. This problem is with host name resolution according to
log.


Vaibhav Negi


On Tue, Sep 28, 2010 at 8:16 PM, Guru Prasad guru.pra...@ibibogroup.com wrote:


 Try running
select count(1) from table;

 Thanks & Regards
 ~guru





Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-28 Thread Pradeep Kamath
2010-09-27 17:40:02,109 DEBUG mapred.Counters (Counters.java:init(151)) - 
Creating group org.apache.hadoop.hive.ql.exec.Operator$ProgressCounter with 
nothing
2010-09-27 17:40:02,109 DEBUG mapred.Counters 
(Counters.java:getCounterForName(277)) - Adding CREATED_FILES
2010-09-27 17:40:02,109 DEBUG ipc.Client (Client.java:sendParam(469)) - IPC 
Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
sending #136
2010-09-27 17:40:02,111 DEBUG ipc.Client (Client.java:receiveResponse(504)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from pradeepk 
got value #136
2010-09-27 17:40:02,111 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 2
2010-09-27 17:40:02,112 ERROR exec.MapRedTask 
(SessionState.java:printError(277)) - Ended Job = job_201009251752_1341 with 
errors

Ning Zhang wrote:
From the error info, it seems the 2nd job has been launched and failed. So I'm 
assuming there are map tasks started? If not, you can find the error message 
in the client log file /tmp/userid/hive.log at the machine running hive after 
setting the hive.root.logger property Steven mentioned.

On Sep 27, 2010, at 1:11 PM, Steven Wong wrote:

 2010-09-24 13:29:03,504 Stage-2 map = 100%,  reduce = 100%
 Ended Job = job_201009241059_0282 with errors
 FAILED: Execution Error, return code 2 from
 org.apache.hadoop.hive.ql.exec.MapRedTask

RE: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-28 Thread Pradeep Kamath
Should I open a jira for this? So far it seems like a regression.

Pradeep

From: Pradeep Kamath [mailto:prade...@yahoo-inc.com]
Sent: Tuesday, September 28, 2010 9:32 AM
To: hive-user@hadoop.apache.org
Subject: Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

With hive -hiveconf hive.root.logger=DEBUG,DRFA -e ..., /tmp/username/hive.log 
seems to have pretty detailed log messages, including debug msgs. I don't see 
the initialization failed message and the stack trace mentioned in HADOOP-5759 
- is there any other place I need to check? On the UI I only see the map task 
in pending state and no further information (this is with hadoop-0.20.1). With 
a more recent hadoop I see no tasks launched at all. This used to work a month 
ago - am wondering if any changes in hive caused this.

Thanks,
Pradeep
Amareshwari Sri Ramadasu wrote:
Pradeep, you might be hitting HADOOP-5759 and the job is not getting 
initialized at all. Look in JobTracker logs for the jobid to confirm the same.


Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-28 Thread Ning Zhang
Pradeep, can you open the tracking URL printed in the log and click through to 
the task log? The real error should be printed there. The link may have 
expired, so you may need to rerun the query and click on the new one.

I'm suspecting the error is due to an incompatibility between 
CombineHiveInputFormat and the Hadoop version you are using. Again, the first 
thing is to check the task log through the tracking URL.
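
If CombineHiveInputFormat does turn out to be the problem, one workaround worth 
trying (a suggestion, not a confirmed fix) is to fall back to the plain input 
format for the session:

  set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;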


 Tracking URL = 
http://<hostname>:50030/jobdetails.jsp?jobid=job_201009251752_1341


RE: fix for DB_LOCATION_URI NOT NULL migration error?

2010-09-28 Thread Paul Yang
For migration, did you manually alter the column or are you relying on JDO to 
auto-create the schema?

-Original Message-
From: ravi...@gmail.com [mailto:ravi...@gmail.com] On Behalf Of Raviv M-G
Sent: Monday, September 27, 2010 11:57 PM
To: hive-user@hadoop.apache.org
Subject: fix for DB_LOCATION_URI NOT NULL migration error?

Does anyone have a fix for the below error?  I can see that it is
caused by changes made in HIVE-675, but I can't find a patch or
instructions for migrating that metastore_db that fixes the problem.

FAILED: Error in metadata: javax.jdo.JDODataStoreException: Error(s)
were found while auto-creating/validating the datastore for classes.
The errors are printed in the log, and are attached to this exception.
NestedThrowables:
java.sql.SQLSyntaxErrorException: In an ALTER TABLE statement, the
column 'DB_LOCATION_URI' has been specified as NOT NULL and either the
DEFAULT clause was not specified or was specified as DEFAULT NULL.
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask

Thanks,
Raviv


Re: fix for DB_LOCATION_URI NOT NULL migration error?

2010-09-28 Thread Raviv M-G
I relied on the JDO:

<property>
  <name>datanucleus.autoCreateSchema</name>
  <value>true</value>
</property>


HIVE-675 apparently changed this to allows-null="false".
https://issues.apache.org/jira/secure/attachment/12454730/HIVE-675-backport-v6.2.patch.txt

Should I manually alter the derby table to allow nulls?

Thanks!
-Raviv




On Tue, Sep 28, 2010 at 3:02 PM, Paul Yang py...@facebook.com wrote:
 For migration, did you manually alter the column or are you relying on JDO to 
 auto-create the schema?

 ...



Re: fix for DB_LOCATION_URI NOT NULL migration error?

2010-09-28 Thread Raviv M-G
Thanks, Paul!  Worked like a charm.

For Derby:

ALTER TABLE DBS ALTER COLUMN DESC SET DATA TYPE VARCHAR(4000);
ALTER TABLE DBS ALTER COLUMN DB_LOCATION_URI SET DATA TYPE VARCHAR(4000);
ALTER TABLE DBS ALTER COLUMN DB_LOCATION_URI DEFAULT '';
ALTER TABLE DBS ALTER COLUMN DB_LOCATION_URI NOT NULL;
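
(If you need somewhere to run those statements against an embedded metastore,
Derby's ij tool works; a minimal sketch, with the jar locations assumed and the
Hive CLI shut down first so the embedded DB is not locked:)

cd <directory containing metastore_db>
java -cp /path/to/derby.jar:/path/to/derbytools.jar org.apache.derby.tools.ij
ij> connect 'jdbc:derby:metastore_db';
ij> ALTER TABLE DBS ALTER COLUMN DB_LOCATION_URI SET DATA TYPE VARCHAR(4000);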




On Tue, Sep 28, 2010 at 4:50 PM, Paul Yang py...@facebook.com wrote:
 In HIVE-675, Carl posted the relevant alter table commands. I tried those out 
 on a MySQL DB, and didn't get the error when using the Hive CLI. Can you try 
 something similar to

 ALTER TABLE DBS MODIFY `DESC` VARCHAR(4000);
 ALTER TABLE DBS ADD COLUMN DB_LOCATION_URI VARCHAR(4000) DEFAULT '' NOT NULL;

 on your DB?


 ...




Error starting Hive

2010-09-28 Thread Maha
Hi everyone,

   Please, this is an easy question for all of you experienced Hive squads: why 
am I getting this error:

<hive-bin-path>$ bin/hive
Missing Hive Execution Jar: /cs/sandbox/student/maha/hive/lib/hive-exec-*.jar

I followed all the steps from the Hive Getting Started guide:
$ svn co http://svn.apache.org/repos/asf/hadoop/hive/trunk hive
  $ cd hive
  $ ant package
  $ cd build/dist 
  $ ls

  Any hints?

  Thank you,
 Maha
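
(A hedged guess, not a reply from the thread: bin/hive looks for
lib/hive-exec-*.jar under HIVE_HOME, and the path in the error suggests the CLI
was started from the source checkout rather than from build/dist, so something
like the following should work:)

  cd hive/build/dist        # the directory produced by 'ant package'
  export HIVE_HOME=$PWD     # bin/hive derives the lib/ path from HIVE_HOME
  bin/hive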



Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-27 Thread Pradeep Kamath
Hi,
  Any help in debugging the issue I am seeing below will be greatly 
appreciated. Unless I am doing something wrong, this seems to be a regression 
in trunk.

Thanks,
Pradeep


From: Pradeep Kamath [mailto:prade...@yahoo-inc.com]
Sent: Friday, September 24, 2010 1:41 PM
To: hive-user@hadoop.apache.org
Subject: Insert overwrite error using hive trunk

Hi,
   I am trying to insert overwrite into a partitioned table reading data from a 
non partitioned table and seeing a failure in the second map reduce job - 
wonder if I am doing something wrong - any pointers appreciated (I am using 
latest trunk code against hadoop 20 cluster). Details below[1].

Thanks,
Pradeep

[1]
Details:
bin/hive -e "describe numbers_text;"
col_name        data_type       comment
id      int     None
num     int     None

bin/hive -e "describe numbers_text_part;"
col_name        data_type       comment
id      int     None
num     int     None
# Partition Information
col_name        data_type       comment
part    string  None

bin/hive -e "select * from numbers_text;"
1       10
2       20

bin/hive -e "insert overwrite table numbers_text_part partition(part='p1') 
select id, num from numbers_text;"
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
...
2010-09-24 13:28:55,649 Stage-1 map = 0%,  reduce = 0%
2010-09-24 13:28:58,687 Stage-1 map = 100%,  reduce = 0%
2010-09-24 13:29:01,726 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201009241059_0281
Ended Job = -1897439470, job is filtered out (removed at runtime).
Launching Job 2 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
...
2010-09-24 13:29:03,504 Stage-2 map = 100%,  reduce = 100%
Ended Job = job_201009241059_0282 with errors
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask

tail /tmp/pradeepk/hive.log:
2010-09-24 13:29:01,888 WARN  mapred.JobClient 
(JobClient.java:configureCommandLineOptions(539)) - Use GenericOptionsParser 
for parsing the arguments. Applications should implement Tool for the same.
2010-09-24 13:29:01,903 WARN  fs.FileSystem (FileSystem.java:fixName(153)) - 
wilbur21.labs.corp.sp1.yahoo.com:8020 is a deprecated filesystem name. Use 
hdfs://wilbur21.labs.corp.sp1.yahoo.com:8020/ instead.
2010-09-24 13:29:03,512 ERROR exec.MapRedTask 
(SessionState.java:printError(277)) - Ended Job = job_201009241059_0282 with 
errors
2010-09-24 13:29:03,537 ERROR ql.Driver (SessionState.java:printError(277)) - 
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask


Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-27 Thread Ning Zhang
I'm guessing this is due to the merge task (the 2nd MR job that merges small 
files together). You can try to 'set hive.merge.mapfiles=false;' before the 
query and see if it succeeded.

If it is due to merge job, can you attach the plan and check the mapper/reducer 
task log and see what errors/exceptions are there?
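
(Concretely, a minimal sketch of that suggestion against the query from the
report:)

  bin/hive -e "set hive.merge.mapfiles=false;
               insert overwrite table numbers_text_part partition(part='p1')
               select id, num from numbers_text;"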


On Sep 27, 2010, at 9:10 AM, Pradeep Kamath wrote:

...



Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-27 Thread Ashutosh Chauhan
I suspected the same. But, even after setting this property, second MR
job did get launched and then failed.

Ashutosh
On Mon, Sep 27, 2010 at 09:25, Ning Zhang nzh...@facebook.com wrote:
 I'm guessing this is due to the merge task (the 2nd MR job that merges small
 files together). You can try to 'set hive.merge.mapfiles=false;' before the
 query and see if it succeeded.
 If it is due to merge job, can you attach the plan and check the
 mapper/reducer task log and see what errors/exceptions are there?

 On Sep 27, 2010, at 9:10 AM, Pradeep Kamath wrote:

 ...



Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-27 Thread Ning Zhang
Can you run EXPLAIN on your query after setting the parameter? 


On Sep 27, 2010, at 9:25 AM, Ashutosh Chauhan wrote:

 I suspected the same. But, even after setting this property, second MR
 job did get launched and then failed.
 
 Ashutosh
 On Mon, Sep 27, 2010 at 09:25, Ning Zhang nzh...@facebook.com wrote:
 I'm guessing this is due to the merge task (the 2nd MR job that merges small
 files together). You can try to 'set hive.merge.mapfiles=false;' before the
 query and see if it succeeded.
 If it is due to merge job, can you attach the plan and check the
 mapper/reducer task log and see what errors/exceptions are there?
 
 On Sep 27, 2010, at 9:10 AM, Pradeep Kamath wrote:
 
 ...



Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-27 Thread yongqiang he
There is one ticket for insert overwrite local directory:
https://issues.apache.org/jira/browse/HIVE-1582

On Mon, Sep 27, 2010 at 9:31 AM, Ning Zhang nzh...@facebook.com wrote:
 Can you do explain your query after setting the parameter?


 On Sep 27, 2010, at 9:25 AM, Ashutosh Chauhan wrote:

 I suspected the same. But, even after setting this property, second MR
 job did get launched and then failed.

 Ashutosh
 On Mon, Sep 27, 2010 at 09:25, Ning Zhang nzh...@facebook.com wrote:
 I'm guessing this is due to the merge task (the 2nd MR job that merges small
 files together). You can try to 'set hive.merge.mapfiles=false;' before the
 query and see if it succeeded.
 If it is due to merge job, can you attach the plan and check the
 mapper/reducer task log and see what errors/exceptions are there?

 On Sep 27, 2010, at 9:10 AM, Pradeep Kamath wrote:

 ...





Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-27 Thread Pradeep Kamath

Here is the output of explain:

STAGE DEPENDENCIES:
 Stage-1 is a root stage
 Stage-4 depends on stages: Stage-1 , consists of Stage-3, Stage-2
 Stage-3
 Stage-0 depends on stages: Stage-3, Stage-2
 Stage-2

STAGE PLANS:
 Stage: Stage-1
   Map Reduce
 Alias -> Map Operator Tree:
   numbers_text
 TableScan
   alias: numbers_text
   Select Operator
 expressions:
   expr: id
   type: int
   expr: num
   type: int
 outputColumnNames: _col0, _col1
 File Output Operator
   compressed: false
   GlobalTableId: 1
   table:
   input format: org.apache.hadoop.mapred.TextInputFormat
   output format: 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
   serde: 
org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

   name: numbers_text_part

 Stage: Stage-4
   Conditional Operator

 Stage: Stage-3
   Move Operator
 files:
 hdfs directory: true
 destination: 
hdfs://wilbur21.labs.corp.sp1.yahoo.com/tmp/hive-pradeepk/hive_2010-09-27_10-37-06_724_1678373180997754320/-ext-10000


 Stage: Stage-0
   Move Operator
 tables:
 partition:
   part p1
 replace: true
 table:
 input format: org.apache.hadoop.mapred.TextInputFormat
 output format: 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 name: numbers_text_part

 Stage: Stage-2
   Map Reduce
  Alias -> Map Operator Tree:
   
hdfs://wilbur21.labs.corp.sp1.yahoo.com/tmp/hive-pradeepk/hive_2010-09-27_10-37-06_724_1678373180997754320/-ext-10002

   File Output Operator
 compressed: false
 GlobalTableId: 0
 table:
 input format: org.apache.hadoop.mapred.TextInputFormat
 output format: 
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

 serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
 name: numbers_text_part


yongqiang he wrote:

There is one ticket for insert overwrite local directory:
https://issues.apache.org/jira/browse/HIVE-1582

...

Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-27 Thread Ning Zhang
This clearly indicates the merge still happens due to the conditional task. Can 
you double check that the parameter is set (hive.merge.mapfiles)? 

You can also revert to the old map-reduce merging (rather than using 
CombineHiveInputFormat for map-only merging) by setting 
hive.mergejob.maponly=false. 

I'm also curious why CombineHiveInputFormat failed in your environment. Can you 
also check your task log and see what errors are there (without changing all 
the above parameters)? 

On Sep 27, 2010, at 10:38 AM, Pradeep Kamath wrote:

 ...

RE: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-27 Thread Steven Wong
Try "hive -hiveconf hive.root.logger=DEBUG,DRFA -e ..." to get more context on 
the error.


From: Pradeep Kamath [mailto:prade...@yahoo-inc.com]
Sent: Monday, September 27, 2010 12:34 PM
To: hive-user@hadoop.apache.org
Subject: RE: Regression in trunk? (RE: Insert overwrite error using hive trunk)

Yes, setting hive.merge.mapfiles=false caused the query to succeed. 
Unfortunately, without this setting there are no logs for the tasks of the 
second job, since they never even get launched. The failure comes very quickly 
after the second job is started, before any tasks launch, so I could not find 
any logs with more messages. I am noticing this on trunk with the default 
setup - any settings I can use to get more information that would help?

Thanks,
Pradeep


From: Ning Zhang [mailto:nzh...@facebook.com]
Sent: Monday, September 27, 2010 11:34 AM
To: hive-user@hadoop.apache.org
Subject: Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

This means it failed even with the previous map-reduce merge job. Without 
looking at the task log file, it's very hard to tell what happened.

A quick fix to do is to set hive.merge.mapfiles=false.


On Sep 27, 2010, at 11:22 AM, Pradeep Kamath wrote:


Here are the settings:

bin/hive -e "set;" | grep hive.merge

10/09/27 11:15:36 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in 
the classpath. Usage of hadoop-site.xml is deprecated. Instead use 
core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of 
core-default.xml, mapred-default.xml and hdfs-default.xml respectively

Hive history 
file=/tmp/pradeepk/hive_job_log_pradeepk_201009271115_1683572284.txt

hive.merge.mapfiles=true

hive.merge.mapredfiles=false

hive.merge.size.per.task=256000000

hive.merge.smallfiles.avgsize=16000000

hive.mergejob.maponly=true

(BTW these seem to be the defaults since I am not setting anything specifically 
for merging files)



I tried your suggestion of setting hive.mergejob.maponly to false, but still 
see the same error (no tasks are launched and the job fails - this is the same 
with or without the change below)

[prade...@chargesize:/tmp/hive-svn/trunk/build/dist]bin/hive -e "set 
hive.mergejob.maponly=false; insert overwrite table numbers_text_part 
partition(part='p1') select id, num from numbers_text;"



On the console output I also see:

...

2010-09-27 11:16:57,827 Stage-1 map = 100%,  reduce = 0%

2010-09-27 11:17:00,859 Stage-1 map = 100%,  reduce = 100%

Ended Job = job_201009251752_1335

Ended Job = 1862840305, job is filtered out (removed at runtime).

Launching Job 2 out of 2



Any pointers much appreciated!



Thanks,

Pradeep



-Original Message-
From: Ning Zhang [mailto:nzh...@facebook.com]
Sent: Monday, September 27, 2010 10:53 AM
To: hive-user@hadoop.apache.org
Subject: Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)



...

Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-27 Thread Ning Zhang
From the error info, it seems the 2nd job has been launched and failed, so I'm 
assuming there are map tasks started? If not, you can find the error message 
in the client log file /tmp/<userid>/hive.log on the machine running hive, 
after setting the hive.root.logger property Steven mentioned.
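
(Concretely, a minimal sketch of the suggested loop; the query is the one from 
the report, and the log path assumes the default hive-log4j setup:)

  hive -hiveconf hive.root.logger=DEBUG,DRFA \
       -e "insert overwrite table numbers_text_part partition(part='p1')
           select id, num from numbers_text;"
  tail -100 /tmp/$USER/hive.log    # the DRFA appender writes the client log here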


On Sep 27, 2010, at 1:11 PM, Steven Wong wrote:

 2010-09-24 13:29:03,504 Stage-2 map = 100%,  reduce = 100%

 Ended Job = job_201009241059_0282 with errors

 FAILED: Execution Error, return code 2 from
 org.apache.hadoop.hive.ql.exec.MapRedTask





Re: Regression in trunk? (RE: Insert overwrite error using hive trunk)

2010-09-27 Thread Pradeep Kamath
Here is some relevant stuff from /tmp/pradeepk/hive.log - I can't make 
much out of it:


2010-09-27 17:40:01,081 INFO  exec.MapRedTask 
(SessionState.java:printInfo(268)) - Starting Job = 
job_201009251752_1341, Tracking URL = 
http://hostname:50030/jobdetails.jsp?jobid=job_201009251752_1341
2010-09-27 17:40:01,081 INFO  exec.MapRedTask 
(SessionState.java:printInfo(268)) - Kill Command = 
/homes/pradeepk/hadoopcluster/hadoop/bin/../bin/hadoop job  
-Dmapred.job.tracker=hostname:50020 -kill job_201009251752_1341
2010-09-27 17:40:01,081 DEBUG ipc.Client (Client.java:sendParam(469)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from 
pradeepk sending #129
2010-09-27 17:40:01,083 DEBUG ipc.Client 
(Client.java:receiveResponse(504)) - IPC Client (47) connection to 
hostname/216.252.118.203:50020 from pradeepk got value #129
2010-09-27 17:40:01,083 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 2
2010-09-27 17:40:02,086 DEBUG ipc.Client (Client.java:sendParam(469)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from 
pradeepk sending #130
2010-09-27 17:40:02,090 DEBUG ipc.Client 
(Client.java:receiveResponse(504)) - IPC Client (47) connection to 
hostname/216.252.118.203:50020 from pradeepk got value #130
2010-09-27 17:40:02,091 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 5
2010-09-27 17:40:02,092 DEBUG ipc.Client (Client.java:sendParam(469)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from 
pradeepk sending #131
2010-09-27 17:40:02,093 DEBUG ipc.Client 
(Client.java:receiveResponse(504)) - IPC Client (47) connection to 
hostname/216.252.118.203:50020 from pradeepk got value #131
2010-09-27 17:40:02,094 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 2
2010-09-27 17:40:02,094 DEBUG ipc.Client (Client.java:sendParam(469)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from 
pradeepk sending #132
2010-09-27 17:40:02,096 DEBUG ipc.Client 
(Client.java:receiveResponse(504)) - IPC Client (47) connection to 
hostname/216.252.118.203:50020 from pradeepk got value #132
2010-09-27 17:40:02,096 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobProfile 2
2010-09-27 17:40:02,096 DEBUG ipc.Client (Client.java:sendParam(469)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from 
pradeepk sending #133
2010-09-27 17:40:02,100 DEBUG ipc.Client 
(Client.java:receiveResponse(504)) - IPC Client (47) connection to 
hostname/216.252.118.203:50020 from pradeepk got value #133
2010-09-27 17:40:02,100 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobCounters 4
2010-09-27 17:40:02,101 DEBUG mapred.Counters 
(Counters.java:init(151)) - Creating group 
org.apache.hadoop.hive.ql.exec.Operator$ProgressCounter with nothing
2010-09-27 17:40:02,101 DEBUG mapred.Counters 
(Counters.java:getCounterForName(277)) - Adding CREATED_FILES
2010-09-27 17:40:02,103 INFO  exec.MapRedTask 
(SessionState.java:printInfo(268)) - 2010-09-27 17:40:02,103 Stage-2 map 
= 100%,  reduce = 100%
2010-09-27 17:40:02,104 DEBUG ipc.Client (Client.java:sendParam(469)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from 
pradeepk sending #134
2010-09-27 17:40:02,105 DEBUG ipc.Client 
(Client.java:receiveResponse(504)) - IPC Client (47) connection to 
hostname/216.252.118.203:50020 from pradeepk got value #134
2010-09-27 17:40:02,106 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 2
2010-09-27 17:40:02,106 DEBUG ipc.Client (Client.java:sendParam(469)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from 
pradeepk sending #135
2010-09-27 17:40:02,108 DEBUG ipc.Client 
(Client.java:receiveResponse(504)) - IPC Client (47) connection to 
hostname/216.252.118.203:50020 from pradeepk got value #135
2010-09-27 17:40:02,108 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobCounters 2
2010-09-27 17:40:02,109 DEBUG mapred.Counters 
(Counters.java:init(151)) - Creating group 
org.apache.hadoop.hive.ql.exec.Operator$ProgressCounter with nothing
2010-09-27 17:40:02,109 DEBUG mapred.Counters 
(Counters.java:getCounterForName(277)) - Adding CREATED_FILES
2010-09-27 17:40:02,109 DEBUG ipc.Client (Client.java:sendParam(469)) - 
IPC Client (47) connection to hostname/216.252.118.203:50020 from 
pradeepk sending #136
2010-09-27 17:40:02,111 DEBUG ipc.Client 
(Client.java:receiveResponse(504)) - IPC Client (47) connection to 
hostname/216.252.118.203:50020 from pradeepk got value #136
2010-09-27 17:40:02,111 DEBUG ipc.RPC (RPC.java:invoke(225)) - Call: 
getJobStatus 2
2010-09-27 17:40:02,112 ERROR exec.MapRedTask 
(SessionState.java:printError(277)) - Ended Job = job_201009251752_1341 
with errors


Ning Zhang wrote:
...

Re: Create index error:FAILED: Parse Error: line 1:0 cannot recognize input 'create' in ddl statement

2010-09-24 Thread yongqiang he
If you really want to use the index feature, you can just check out the
hive trunk code (svn co
http://svn.apache.org/repos/asf/hadoop/hive/trunk hive-trunk) and
start from there. Hive is mostly client-side, and is very easy
to deploy fresh (even if you are using Amazon EC2 and S3).
Internally at Facebook, we just cut Hive trunk every few weeks and
deploy it, so I personally encourage you to try Hive trunk.

On Thu, Sep 23, 2010 at 6:26 PM, John Sichi jsi...@facebook.com wrote:
 Since we're only just wrapping up the 0.6 release now, it's probably a bit
 too early to tell.  Note that the index feature is still under development,
 so the support in 0.7 may still not be ready for most end users.
 JVS
 On Sep 23, 2010, at 3:19 PM, Tali K wrote:

 Do you have any idea when we can expect Hive 0.7 to be released?

 Thanks,
 Tali

 
 From: jsi...@facebook.com
 To: hive-u...@hadoop.apache.org
 Subject: Re: Create index error:FAILED: Parse Error: line 1:0 cannot
 recognize input 'create' in ddl statement
 Date: Thu, 23 Sep 2010 21:27:14 +

 I've updated the index design doc to add this statement at the top:
 No index support will be available until Hive 0.7.
 If you really want to try it, you need to be building and running trunk.
 JVS
 On Sep 23, 2010, at 2:15 PM, Tali K wrote:


 ...






Insert overwrite error using hive trunk

2010-09-24 Thread Pradeep Kamath
Hi,
   I am trying to insert overwrite into a partitioned table reading data from a 
non partitioned table and seeing a failure in the second map reduce job - 
wonder if I am doing something wrong - any pointers appreciated (I am using 
latest trunk code against hadoop 20 cluster). Details below[1].

Thanks,
Pradeep

[1]
Details:
bin/hive -e "describe numbers_text;"
col_name        data_type       comment
id      int     None
num     int     None

bin/hive -e "describe numbers_text_part;"
col_name        data_type       comment
id      int     None
num     int     None
# Partition Information
col_name        data_type       comment
part    string  None

bin/hive -e "select * from numbers_text;"
1       10
2       20

bin/hive -e "insert overwrite table numbers_text_part partition(part='p1') 
select id, num from numbers_text;"
Total MapReduce jobs = 2
Launching Job 1 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
...
2010-09-24 13:28:55,649 Stage-1 map = 0%,  reduce = 0%
2010-09-24 13:28:58,687 Stage-1 map = 100%,  reduce = 0%
2010-09-24 13:29:01,726 Stage-1 map = 100%,  reduce = 100%
Ended Job = job_201009241059_0281
Ended Job = -1897439470, job is filtered out (removed at runtime).
Launching Job 2 out of 2
Number of reduce tasks is set to 0 since there's no reduce operator
...
2010-09-24 13:29:03,504 Stage-2 map = 100%,  reduce = 100%
Ended Job = job_201009241059_0282 with errors
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask

tail /tmp/pradeepk/hive.log:
2010-09-24 13:29:01,888 WARN  mapred.JobClient 
(JobClient.java:configureCommandLineOptions(539)) - Use GenericOptionsParser 
for parsing the arguments. Applications should implement Tool for the same.
2010-09-24 13:29:01,903 WARN  fs.FileSystem (FileSystem.java:fixName(153)) - 
wilbur21.labs.corp.sp1.yahoo.com:8020 is a deprecated filesystem name. Use 
hdfs://wilbur21.labs.corp.sp1.yahoo.com:8020/ instead.
2010-09-24 13:29:03,512 ERROR exec.MapRedTask 
(SessionState.java:printError(277)) - Ended Job = job_201009241059_0282 with 
errors
2010-09-24 13:29:03,537 ERROR ql.Driver (SessionState.java:printError(277)) - 
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.MapRedTask


RE: Create index error:FAILED: Parse Error: line 1:0 cannot recognize input 'create' in ddl statement

2010-09-23 Thread Tali K

Hi All,

I've sent this e-mail yesterday; can somebody please help me!!!

I am running hive version 0.5.0 (hive-default.xml: 
<value>lib/hive-hwi-0.5.0.war</value>)

When I tried to create an index, I got the following message:

FAILED: Parse Error: line 1:0 cannot recognize input 'create' in ddl statement

Here is how I've tried to create an index:

 create index qid_index on table bit_score_less_55(query_id) as 
'org.apache.hadoop.hive.index.compact.CompactIndexHandler';

Here is my table 

describe bit_score_less_55 ;

query_id        string
subject_id      string
percent_ident   double
align_len       int
mismatches      int
gap_openings    int
query_start     int
query_end       int
subject_start   int
subject_end     int
e_value         double
bit_score       double
filename        string


Any suggestions?

Thanks for help in advance,

 

Tali  

Re: Create index error:FAILED: Parse Error: line 1:0 cannot recognize input 'create' in ddl statement

2010-09-23 Thread John Sichi
I've updated the index design doc to add this statement at the top:

No index support will be available until Hive 0.7.

If you really want to try it, you need to be building and running trunk.

JVS
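
(For reference, the trunk syntax at the time looked roughly like the sketch 
below; the WITH DEFERRED REBUILD clause and the ql.index package path are taken 
from the trunk test queries and should be treated as assumptions, not as 
something confirmed in this thread:)

  bin/hive -e "create index qid_index on table bit_score_less_55 (query_id)
               as 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
               with deferred rebuild;"
  bin/hive -e "alter index qid_index on bit_score_less_55 rebuild;"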

On Sep 23, 2010, at 2:15 PM, Tali K wrote:



...



Hive multi-user setup: FAILED: Error in metadata: javax.jdo.JDOFatalInternalException

2010-09-23 Thread Tali K

Hi All,
 
We are trying to get Hive 0.5.0 to use PostgreSQL as a metastore, and are 
getting the following error:
 
FAILED: Error in metadata: javax.jdo.JDOFatalInternalException: Error creating 
transactional connection factory
NestedThrowables:
java.lang.reflect.InvocationTargetException
FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.DDLTask
 
Here is the relevant portion of the hive-site.xml configuration file:
 
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:postgresql://localhost:5432/hive?CreateDatabaseIfNotExist=true</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.postgresql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hadoop</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hadoop</value>
</property>
 
We added these jars to no avail:
 
jpox-rdbms-1.2.3.jar
jpox-core-1.2.3.jar
jdo2-core-2.0.jar

We are using the Cloudera distribution of hadoop & hive, and the jdo2-api jar 
was already present.
 
Any ideas? Please help!
 
Thanks in advance,
Tali
  

RE: Create index error:FAILED: Parse Error: line 1:0 cannot recognize input 'create' in ddl statement

2010-09-23 Thread Tali K

Do you have any idea when we can expect Hive 0.7 to be released?
 
Thanks,
Tali
 


From: jsi...@facebook.com
To: hive-user@hadoop.apache.org
Subject: Re: Create index error:FAILED: Parse Error: line 1:0 cannot recognize 
input 'create' in ddl statement
Date: Thu, 23 Sep 2010 21:27:14 +



I've updated the index design doc to add this statement at the top:


No index support will be available until Hive 0.7.


If you really want to try it, you need to be building and running trunk.


JVS


On Sep 23, 2010, at 2:15 PM, Tali K wrote:

...

Re: Hive multi-user setup: FAILED: Error in metadata: javax.jdo.JDOFatalInternalException

2010-09-23 Thread Carl Steinbach
Hi Tali,

Is the Postgres JDBC driver JAR on your CLASSPATH? If not, you need to
download the driver JAR, stick it in a directory, and then set the
environment variable HIVE_AUX_JARS_PATH=<driver directory>. Also, can you
please turn on logging and send us the log messages, e.g.:

% hive -hiveconf hive.root.logger=INFO,console

Thanks.

Carl
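
(A minimal sketch of that suggestion; the jar name and directory below are 
illustrative assumptions, not values from the thread:)

  mkdir -p /opt/hive/auxlib
  cp postgresql-8.4-701.jdbc4.jar /opt/hive/auxlib/   # your driver jar here
  export HIVE_AUX_JARS_PATH=/opt/hive/auxlib
  hive -hiveconf hive.root.logger=INFO,console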

On Thu, Sep 23, 2010 at 3:16 PM, Tali K ncherr...@hotmail.com wrote:

 ...




RE: Hive multi-user setup: FAILED: Error in metadata: javax.jdo.JDOFatalInternalException

2010-09-23 Thread Tali K

Thank you so much; by enabling INFO logging we were able to resolve it. It was 
an invalid driver name.
Thank you so much again!!!
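
(For the record, a hedged sketch of the fix implied here: the PostgreSQL JDBC 
driver class is org.postgresql.Driver, not com.postgresql.jdbc.Driver, so the 
hive-site.xml property would read:)

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.postgresql.Driver</value>
</property>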
 


Date: Thu, 23 Sep 2010 15:24:50 -0700
Subject: Re: Hive multi-user setup: FAILED: Error in metadata: 
javax.jdo.JDOFatalInternalException
From: c...@cloudera.com
To: hive-user@hadoop.apache.org

...

Re: Create index error:FAILED: Parse Error: line 1:0 cannot recognize input 'create' in ddl statement

2010-09-23 Thread John Sichi
Since we're only just wrapping up the 0.6 release now, it's probably a bit too 
early to tell.  Note that the index feature is still under development, so the 
support in 0.7 may still not be ready for most end users.

JVS

On Sep 23, 2010, at 3:19 PM, Tali K wrote:

Do you have any idea when we can expect Hive 0.7 to be released?

Thanks,
Tali


...





Create index error:FAILED: Parse Error: line 1:0 cannot recognize input 'create' in ddl statement

2010-09-22 Thread Tali K


Hi All,

 

I am running hive version 0.5.0 (hive-default.xml: 
<value>lib/hive-hwi-0.5.0.war</value>)

When I tried to create an index, I got the following message:

FAILED: Parse Error: line 1:0 cannot recognize input 'create' in ddl statement


 

Here is how I've tried to create an index:

 create index qid_index on table bit_score_less_55(query_id) as 
'org.apache.hadoop.hive.index.compact.CompactIndexHandler';

Here is my table 

describe bit_score_less_55 ;

query_id        string
subject_id      string
percent_ident   double
align_len       int
mismatches      int
gap_openings    int
query_start     int
query_end       int
subject_start   int
subject_end     int
e_value         double
bit_score       double
filename        string


Any suggestions?

Thanks for help in advance,

 

Tali  

Re: Error while fetching Hive Metadata

2010-09-22 Thread Carl Steinbach
Hi Adarsh,

Hibernate will not work with Hive because Hibernate depends on the ability
to execute row-level insert, update and delete operations. None of these
operations are supported by Hive.

Carl

On Tue, Sep 21, 2010 at 3:18 AM, Bennie Schut bsc...@ebuddy.com wrote:

 Hi,

 Not all JDBC calls are implemented, and this is one of them. I don't
 think anyone has tried to use Hibernate with Hive before, probably because
 it's highly unlikely to work at this time: it will produce SQL that might
 not be understood by Hive. In most cases you want pretty fine-grained
 control over the queries you send to Hive (or any other DWH system) for
 performance reasons, so I don't think it's something people are actively
 working on.

 As an alternative you might want to look at Apache Commons DBCP for
 connection pooling. We used it for a while but stopped using it because of
 some out-of-PermGen issues (which were probably unrelated). We combined it
 with Spring templates to make using it pretty simple in our code.

 Bennie.

 -Original Message-
 From: Adarsh Sharma [mailto:adarsh.sha...@orkash.com]
 Sent: Tuesday, September 21, 2010 11:52 AM
 To: hive-user@hadoop.apache.org
 Subject: Error while fetching Hive Metadata


 Hi all,
 Did anyone encounter the following error while fetching metadata
 from Hive?

 10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: Using
 Hibernate built-in connection pool (not for production use!)
 10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider:
 Hibernate connection pool size: 10
 10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider:
 autocommit mode: false
 10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: using
 driver: org.apache.hadoop.hive.jdbc.HiveDriver at URL:
 jdbc:hive://192.168.0.173:1/default
 10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider:
 connection properties: {user=hadoop, password=}
 Hive history file=/tmp/root/hive_job_log_root_201009211518_1489326085.txt
 10/09/21 15:18:26 INFO exec.HiveHistory: Hive history
 file=/tmp/root/hive_job_log_root_201009211518_1489326085.txt
 10/09/21 15:18:26 WARN cfg.SettingsFactory: Could not obtain connection
 metadata
 java.sql.SQLException: Method not supported
at

 org.apache.hadoop.hive.jdbc.HiveConnection.getAutoCommit(HiveConnection.java:201)
at

 org.hibernate.connection.DriverManagerConnectionProvider.getConnection(DriverManagerConnectionProvider.java:112)
at
 org.hibernate.cfg.SettingsFactory.buildSettings(SettingsFactory.java:72)
at
 org.hibernate.cfg.Configuration.buildSettings(Configuration.java:1823)
at

 org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1143)
at SelectClauseExample.main(SelectClauseExample.java:20)

 I want to use the Hive metadata. Can someone please help me?
 I use Hadoop-0.20.2 and Hive 0.7 trunk.

 Thanks



Error while fetching Hive Metadata

2010-09-21 Thread Adarsh Sharma

Hi all,
Did anyone encounter the following error while fetching metadata
from Hive?


10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: Using 
Hibernate built-in connection pool (not for production use!)
10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: 
Hibernate connection pool size: 10
10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: 
autocommit mode: false
10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: using 
driver: org.apache.hadoop.hive.jdbc.HiveDriver at URL: 
jdbc:hive://192.168.0.173:1/default
10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: 
connection properties: {user=hadoop, password=}

Hive history file=/tmp/root/hive_job_log_root_201009211518_1489326085.txt
10/09/21 15:18:26 INFO exec.HiveHistory: Hive history 
file=/tmp/root/hive_job_log_root_201009211518_1489326085.txt
10/09/21 15:18:26 WARN cfg.SettingsFactory: Could not obtain connection 
metadata

java.sql.SQLException: Method not supported
   at 
org.apache.hadoop.hive.jdbc.HiveConnection.getAutoCommit(HiveConnection.java:201)
   at 
org.hibernate.connection.DriverManagerConnectionProvider.getConnection(DriverManagerConnectionProvider.java:112)
   at 
org.hibernate.cfg.SettingsFactory.buildSettings(SettingsFactory.java:72)
   at 
org.hibernate.cfg.Configuration.buildSettings(Configuration.java:1823)
   at 
org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1143)

   at SelectClauseExample.main(SelectClauseExample.java:20)

I want to use the Hive metadata. Can someone please help me?
I use Hadoop-0.20.2 and Hive 0.7 trunk.

Thanks


RE: Error while fetching Hive Metadata

2010-09-21 Thread Bennie Schut
Hi,

Not all JDBC calls are implemented, and this is one of them. I don't think
anyone has tried to use Hibernate with Hive before, probably because it's highly
unlikely to work at this time: it will produce SQL that might not be
understood by Hive. In most cases you want pretty fine-grained control over
the queries you send to Hive (or any other DWH system) for performance reasons,
so I don't think it's something people are actively working on.

As an alternative you might want to look at Apache Commons DBCP for
connection pooling. We used it for a while but stopped using it because of some
out-of-PermGen issues (which were probably unrelated). We combined it with
Spring templates to make using it pretty simple in our code.

Bennie.
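A minimal sketch of the DBCP approach, assuming the Hive JDBC driver from this thread is on the classpath together with commons-dbcp; the host, port and pool size below are illustrative, not taken from the original message:

package test;

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.commons.dbcp.BasicDataSource;

public class PooledHiveClient {
  public static void main(String[] args) throws Exception {
    // Pool Hive JDBC connections with Commons DBCP instead of Hibernate.
    BasicDataSource ds = new BasicDataSource();
    ds.setDriverClassName("org.apache.hadoop.hive.jdbc.HiveDriver");
    ds.setUrl("jdbc:hive://localhost:10000/default"); // illustrative host/port
    ds.setMaxActive(10); // illustrative pool size

    Connection con = ds.getConnection();
    Statement stmt = con.createStatement();
    ResultSet res = stmt.executeQuery("show tables");
    while (res.next()) {
      System.out.println(res.getString(1));
    }
    con.close(); // returns the connection to the pool
  }
}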

-Original Message-
From: Adarsh Sharma [mailto:adarsh.sha...@orkash.com] 
Sent: Tuesday, September 21, 2010 11:52 AM
To: hive-user@hadoop.apache.org
Subject: Error while fetching Hive Metadata


Hi all,
Did anyone encounter the following error while fetching metadata
from Hive?

10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: Using 
Hibernate built-in connection pool (not for production use!)
10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: 
Hibernate connection pool size: 10
10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: 
autocommit mode: false
10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: using 
driver: org.apache.hadoop.hive.jdbc.HiveDriver at URL: 
jdbc:hive://192.168.0.173:1/default
10/09/21 15:18:26 INFO connection.DriverManagerConnectionProvider: 
connection properties: {user=hadoop, password=}
Hive history file=/tmp/root/hive_job_log_root_201009211518_1489326085.txt
10/09/21 15:18:26 INFO exec.HiveHistory: Hive history 
file=/tmp/root/hive_job_log_root_201009211518_1489326085.txt
10/09/21 15:18:26 WARN cfg.SettingsFactory: Could not obtain connection 
metadata
java.sql.SQLException: Method not supported
at 
org.apache.hadoop.hive.jdbc.HiveConnection.getAutoCommit(HiveConnection.java:201)
at 
org.hibernate.connection.DriverManagerConnectionProvider.getConnection(DriverManagerConnectionProvider.java:112)
at 
org.hibernate.cfg.SettingsFactory.buildSettings(SettingsFactory.java:72)
at 
org.hibernate.cfg.Configuration.buildSettings(Configuration.java:1823)
at 
org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1143)
at SelectClauseExample.main(SelectClauseExample.java:20)

I want to use the Hive metadata. Can someone please help me?
I use Hadoop-0.20.2 and Hive 0.7 trunk.

Thanks


Error while fetching Hive Metadata

2010-09-20 Thread Adarsh Sharma

Dear all,
I have encountered a serious problem while fetching Hive metadata 
through Eclipse.
I am able to connect to Hive through a JDBC program in Eclipse, and I am also
able to fetch data from it.

But when I try to read the Hive metadata I get the following error:

Driver Loaded.
Hive history file=/tmp/root/hive_job_log_root_201009201158_301097049.txt
Sep 20, 2010 11:58:13 AM 
org.apache.hadoop.hive.ql.session.SessionState$LogHelper printInfo
INFO: Hive history 
file=/tmp/root/hive_job_log_root_201009201158_301097049.txt

Got Connection.
Exception in thread "main" java.sql.SQLException: Method not supported
   at 
org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData.getTables(HiveDatabaseMetaData.java:710)

   at test.Jdbchivemetadata.main(Jdbchivemetadata.java:15)
I use
Hadoop = 0.20.2
Hive = 0.5 with MySQL as the metastore

***My work on this error***
I think hive_jdbc.jar doesn't support the features needed to access the metastore.
I looked at HiveDatabaseMetaData.java, and all of its methods throw
'Method not supported' exceptions.


Can anybody please tell me how to use the Hive metadata, if it is possible?
The code is attached to this mail.

Thanks and Regards
Adarsh Sharma



package test;

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;


public class Jdbcclient {
  private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";

  /**
   * @param args
   * @throws SQLException
   */
  public static void main(String[] args) throws SQLException {
    try {
      System.out.println("Hive Execution");
      Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
    } catch (ClassNotFoundException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
      //System.exit(1);
    }
    Connection con = DriverManager.getConnection("jdbc:hive://192.168.0.25:1/default", "", "");

    //Connection con = DriverManager.getConnection("jdbc:derby://ws-test:1527/metastore_db", "", "");
    Statement stmt = con.createStatement();
    System.out.println("con " + con);
    // String tableName = "master_seed";
    String tableName = "student";

    // describe table
    String sql = "describe " + tableName;
    System.out.println("Running: " + sql);
    ResultSet res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1) + "\t" + res.getString(2));
    }

    sql = "select name from " + tableName;
    System.out.println("Running: " + sql);
    res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1));
    }
  }
}
package test;

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;


public class Jdbchivemetadata {

  public static void main(String[] args) throws Exception {
    DatabaseMetaData md = conn.getMetaData();
    ResultSet rs = md.getTables(null, null, "%", null);
    while (rs.next()) {
      System.out.println(rs.getString(3));
    }
  }

  static Connection conn;

  static Statement st;

  static {
    try {
      // Step 1: Load the JDBC driver.
      Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");
      System.out.println("Driver Loaded.");
      // Step 2: Establish the connection to the database.
      String url = "jdbc:hive://192.168.0.25:1/default";

      conn = DriverManager.getConnection(url, "student", "");
      System.out.println("Got Connection.");

      st = conn.createStatement();
    } catch (Exception e) {
      System.err.println("Got an exception!");
      e.printStackTrace();
      System.exit(0);
    }
  }
}
   
  
  




  


RE: Error while fetching Hive Metadata

2010-09-20 Thread Bennie Schut
Metadata calls have been added on trunk (0.7.0).
I haven't used Hive 0.5 myself, but the JDBC part seems self-contained enough
that you could perhaps use the one from trunk to connect to 0.5.
I'm not sure whether that is enough to get Eclipse working, though; I only
tried it with the SQuirreL client and some custom code.

-Original Message-
From: Adarsh Sharma [mailto:adarsh.sha...@orkash.com] 
Sent: Monday, September 20, 2010 8:44 AM
To: hive-user@hadoop.apache.org
Subject: Error while fetching Hive Metadata

Dear all,
I have encountered a serious problem while fetching Hive metadata
through Eclipse.
I am able to connect to Hive through a JDBC program in Eclipse, and I am also
able to fetch data from it.
But when I try to read the Hive metadata I get the following error:

Driver Loaded.
Hive history file=/tmp/root/hive_job_log_root_201009201158_301097049.txt
Sep 20, 2010 11:58:13 AM 
org.apache.hadoop.hive.ql.session.SessionState$LogHelper printInfo
INFO: Hive history 
file=/tmp/root/hive_job_log_root_201009201158_301097049.txt
Got Connection.
Exception in thread "main" java.sql.SQLException: Method not supported
at 
org.apache.hadoop.hive.jdbc.HiveDatabaseMetaData.getTables(HiveDatabaseMetaData.java:710)
at test.Jdbchivemetadata.main(Jdbchivemetadata.java:15)
I use
Hadoop = 0.20.2
Hive = 0.5 with MySQL as the metastore

***My work on this error***
I think hive_jdbc.jar doesn't support the features needed to access the metastore.
I looked at HiveDatabaseMetaData.java, and all of its methods throw
'Method not supported' exceptions.

Can anybody please tell me how to use the Hive metadata, if it is possible?
The code is attached to this mail.

Thanks and Regards
Adarsh Sharma





Virtual Columns error

2010-09-20 Thread lei liu
I use the hive 0.6 version, and when I execute the statement 'select
INPUT_FILE_NAME, BLOCK_OFFSET_INSIDE_FILE from person1', hive 0.6 throws the
error below:
FAILED: Error in semantic analysis: line 1:7 Invalid Table Alias or Column
Reference INPUT_FILE_NAME error.

Doesn't hive 0.6 support virtual columns?


Re: Virtual Columns error

2010-09-20 Thread Thiruvel Thirumoolan
It should be INPUT__FILE__NAME and BLOCK__OFFSET__INSIDE__FILE.
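That is, with the underscores doubled between the words:

select INPUT__FILE__NAME, BLOCK__OFFSET__INSIDE__FILE from person1;

(This assumes a build that actually contains the virtual-columns patch; see the rest of the thread.)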

On Sep 20, 2010, at 3:15 PM, lei liu wrote:

 I use the hive 0.6 version, and when I execute the statement 'select
 INPUT_FILE_NAME, BLOCK_OFFSET_INSIDE_FILE from person1', hive 0.6 throws the error below:
 FAILED: Error in semantic analysis: line 1:7 Invalid Table Alias or Column
 Reference INPUT_FILE_NAME error.

 Doesn't hive 0.6 support virtual columns?
 
 



Re: Virtual Columns error

2010-09-20 Thread Thiruvel Thirumoolan
I don't think HIVE-417 (https://issues.apache.org/jira/browse/HIVE-417), which
added virtual columns, was committed to 0.6.
On Sep 20, 2010, at 3:47 PM, Thiruvel Thirumoolan wrote:

It should be INPUT__FILE__NAME and BLOCK__OFFSET__INSIDE__FILE.

On Sep 20, 2010, at 3:15 PM, lei liu wrote:

I use the hive 0.6 version, and when I execute the statement 'select
INPUT_FILE_NAME, BLOCK_OFFSET_INSIDE_FILE from person1', hive 0.6 throws the error below:
FAILED: Error in semantic analysis: line 1:7 Invalid Table Alias or Column
Reference INPUT_FILE_NAME error.

Doesn't hive 0.6 support virtual columns?






Re: Virtual Columns error

2010-09-20 Thread Edward Capriolo
On Mon, Sep 20, 2010 at 6:29 AM, lei liu liulei...@gmail.com wrote:
 I use INPUT_FILENAME and BLOCKOFFSETINSIDE_FILE, but I get the same error.
 hive> select INPUT_FILENAME, BLOCKOFFSETINSIDE_FILE from person1;
 FAILED: Error in semantic analysis: line 1:7 Invalid Table Alias or Column
 Reference INPUT_FILENAME

 2010/9/20 Thiruvel Thirumoolan thiru...@yahoo-inc.com

 I don't think https://issues.apache.org/jira/browse/HIVE-417, which added
 virtual columns, was committed to 0.6.

 On Sep 20, 2010, at 3:47 PM, Thiruvel Thirumoolan wrote:

 It should be INPUT__FILE__NAME and BLOCK__OFFSET__INSIDE__FILE.

 On Sep 20, 2010, at 3:15 PM, lei liu wrote:

 I use the hive 0.6 version, and when I execute the statement 'select
 INPUT_FILE_NAME, BLOCK_OFFSET_INSIDE_FILE from person1', hive 0.6 throws the
 error below:

 FAILED: Error in semantic analysis: line 1:7 Invalid Table Alias or Column
 Reference INPUT_FILE_NAME error.

 Doesn't hive 0.6 support virtual columns?






You have to be careful here. The wiki represents trunk, not a
particular release.

I started the process of moving the wiki to XDOC documentation for
just this reason, so we can have accurate, concise documentation for a
release. Version/feature confusion is very prevalent among those outside
of Hive.

Currently the culture in place is for JIRA and the wiki. I do not see
that as a burden, because I often spend 10 hours working on a feature,
while writing the xdoc for that feature probably takes about 8 minutes,
and updating the wiki takes about 3.

I imagine the majority of committers (at Facebook) have an
internal wiki, and their users are less confused. The very astute users
dig deep and figure out how to use the less-documented features, but
(here) some people just stop and ask at the first wiki inaccuracy. The
long delay in cranking out Hive 0.6 has not helped the issue. If I had to
hazard a guess right now, I would say 80% of deployments are running a
trunk from somewhere between the 0.5 branch and now.

Edward


Re: Virtual Columns error

2010-09-20 Thread yongqiang he
INPUT__FILE__NAME, BLOCK__OFFSET__INSIDE__FILE
Both virtual column names use TWO underscores.

On Mon, Sep 20, 2010 at 9:26 AM, Edward Capriolo edlinuxg...@gmail.com wrote:
 On Mon, Sep 20, 2010 at 6:29 AM, lei liu liulei...@gmail.com wrote:
 I use INPUT_FILENAME and BLOCKOFFSETINSIDE_FILE, but I get the same error.
 hive> select INPUT_FILENAME, BLOCKOFFSETINSIDE_FILE from person1;
 FAILED: Error in semantic analysis: line 1:7 Invalid Table Alias or Column
 Reference INPUT_FILENAME

 2010/9/20 Thiruvel Thirumoolan thiru...@yahoo-inc.com

 I don't think https://issues.apache.org/jira/browse/HIVE-417, which added
 virtual columns, was committed to 0.6.

 On Sep 20, 2010, at 3:47 PM, Thiruvel Thirumoolan wrote:

 It should be INPUT__FILE__NAME and BLOCK__OFFSET__INSIDE__FILE.

 On Sep 20, 2010, at 3:15 PM, lei liu wrote:

 I use the hive 0.6 version, and when I execute the statement 'select
 INPUT_FILE_NAME, BLOCK_OFFSET_INSIDE_FILE from person1', hive 0.6 throws the
 error below:

 FAILED: Error in semantic analysis: line 1:7 Invalid Table Alias or Column
 Reference INPUT_FILE_NAME error.

 Doesn't hive 0.6 support virtual columns?






 You have to be careful here. The wiki represents trunk, not a
 particular release.

 I started the process of moving the wiki to XDOC documentation for
 just this reason, so we can have accurate, concise documentation for a
 release. Version/feature confusion is very prevalent among those outside
 of Hive.

 Currently the culture in place is for JIRA and the wiki. I do not see
 that as a burden, because I often spend 10 hours working on a feature,
 while writing the xdoc for that feature probably takes about 8 minutes,
 and updating the wiki takes about 3.

 I imagine the majority of committers (at Facebook) have an
 internal wiki, and their users are less confused. The very astute users
 dig deep and figure out how to use the less-documented features, but
 (here) some people just stop and ask at the first wiki inaccuracy. The
 long delay in cranking out Hive 0.6 has not helped the issue. If I had to
 hazard a guess right now, I would say 80% of deployments are running a
 trunk from somewhere between the 0.5 branch and now.

 Edward



Re: Virtual Columns error

2010-09-20 Thread yongqiang he
http://wiki.apache.org/hadoop/Hive/LanguageManual/VirtualColumns

On Mon, Sep 20, 2010 at 10:42 AM, yongqiang he heyongqiang...@gmail.com wrote:
 INPUT__FILE__NAME, BLOCK__OFFSET__INSIDE__FILE
 Both virtual column names use TWO underscores.

 On Mon, Sep 20, 2010 at 9:26 AM, Edward Capriolo edlinuxg...@gmail.com 
 wrote:
 On Mon, Sep 20, 2010 at 6:29 AM, lei liu liulei...@gmail.com wrote:
 I use INPUT_FILENAME and BLOCKOFFSETINSIDE_FILE, but I get the same error.
 hive> select INPUT_FILENAME, BLOCKOFFSETINSIDE_FILE from person1;
 FAILED: Error in semantic analysis: line 1:7 Invalid Table Alias or Column
 Reference INPUT_FILENAME

 2010/9/20 Thiruvel Thirumoolan thiru...@yahoo-inc.com

 I don't think https://issues.apache.org/jira/browse/HIVE-417, which added
 virtual columns, was committed to 0.6.

 On Sep 20, 2010, at 3:47 PM, Thiruvel Thirumoolan wrote:

 It should be INPUT__FILE__NAME and BLOCK__OFFSET__INSIDE__FILE.

 On Sep 20, 2010, at 3:15 PM, lei liu wrote:

 I use the hive 0.6 version, and when I execute the statement 'select
 INPUT_FILE_NAME, BLOCK_OFFSET_INSIDE_FILE from person1', hive 0.6 throws the
 error below:

 FAILED: Error in semantic analysis: line 1:7 Invalid Table Alias or Column
 Reference INPUT_FILE_NAME error.

 Doesn't hive 0.6 support virtual columns?






 You have to be careful here. The wiki represents trunk, not a
 particular release.

 I started the process of moving the wiki to XDOC documentation for
 just this reason, so we can have accurate, concise documentation for a
 release. Version/feature confusion is very prevalent among those outside
 of Hive.

 Currently the culture in place is for JIRA and the wiki. I do not see
 that as a burden, because I often spend 10 hours working on a feature,
 while writing the xdoc for that feature probably takes about 8 minutes,
 and updating the wiki takes about 3.

 I imagine the majority of committers (at Facebook) have an
 internal wiki, and their users are less confused. The very astute users
 dig deep and figure out how to use the less-documented features, but
 (here) some people just stop and ask at the first wiki inaccuracy. The
 long delay in cranking out Hive 0.6 has not helped the issue. If I had to
 hazard a guess right now, I would say 80% of deployments are running a
 trunk from somewhere between the 0.5 branch and now.

 Edward




Error

2010-09-17 Thread Adarsh Sharma

Dear all,
I am trying to connect to Hive through my application, but I am getting the
following error:


12:03:10 ERROR conf.Configuration: Failed to set setXIncludeAware(true) 
for parser 
org.apache.xerces.jaxp.documentbuilderfactoryi...@e6c:java.lang.UnsupportedOperationException: 
This parser does not support specification null version null
java.lang.UnsupportedOperationException: This parser does not support 
specification null version null
   at 
javax.xml.parsers.DocumentBuilderFactory.setXIncludeAware(DocumentBuilderFactory.java:590)
   at 
org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1054)
   at 
org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1030)

   at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:980)
   at 
org.apache.hadoop.conf.Configuration.iterator(Configuration.java:1016)
   at 
org.apache.hadoop.hive.conf.HiveConf.getUnderlyingProps(HiveConf.java:294)

   at org.apache.hadoop.hive.conf.HiveConf.initialize(HiveConf.java:308)
   at org.apache.hadoop.hive.conf.HiveConf.init(HiveConf.java:285)
   at 
org.apache.hadoop.hive.jdbc.HiveConnection.init(HiveConnection.java:45)

   at org.apache.hadoop.hive.jdbc.HiveDriver.connect(HiveDriver.java:69)
   at java.sql.DriverManager.getConnection(DriverManager.java:582)
   at java.sql.DriverManager.getConnection(DriverManager.java:154)
   at 
org.hibernate.connection.DriverManagerConnectionProvider.getConnection(DriverManagerConnectionProvider.java:110)
   at 
org.hibernate.cfg.SettingsFactory.buildSettings(SettingsFactory.java:72)
   at 
org.hibernate.cfg.Configuration.buildSettings(Configuration.java:1823)
   at 
org.hibernate.cfg.Configuration.buildSessionFactory(Configuration.java:1143)

   at SelectClauseExample.main(SelectClauseExample.java:19)
10/09/17

I googled a lot and found the root cause: my xerces-j2-2.7.1.jar does not
support this advanced document parsing. But I am not able to work out
how to change the value of
org.apache.xerces.jaxp.SAXParserImpl, or which newer jar file I
need to run this.


I found a solution but don't know how to set this parameter in Java, in
Eclipse, or through the command line.


-Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl


The following command shows that the Xerces jar is in use.
$ ant -diagnostics
XML Parser information
---
XML Parser : org.apache.xerces.jaxp.SAXParserImpl

XML Parser Location: /usr/share/java/xerces-j2-2.7.1.jar

Thanks for any help.
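For what it's worth, a JVM system property like that is normally set either on the java command line or programmatically before the parser is first used. A sketch (the property value is the one quoted above; classpath and main class are illustrative):

java -Djavax.xml.parsers.DocumentBuilderFactory=com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl \
    -cp . SelectClauseExample

or, early in main(), before any XML parsing happens:

System.setProperty("javax.xml.parsers.DocumentBuilderFactory",
    "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl");

In Eclipse, the same -D flag goes under Run Configurations > Arguments > VM arguments.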


Re: Hive - ERROR XBM0H: Directory /root/metastore_db cannot be created

2010-09-16 Thread vaibhav negi
Hi,

Please help, I am stuck on this. When I run a 'show tables' query I cannot
see any tables.

Vaibhav Negi


On Thu, Sep 2, 2010 at 1:06 PM, vaibhav negi sssena...@gmail.comwrote:

 Hi,

 I am running Hive on Hadoop 0.20.2 with a 2-node cluster.

 I had created a hadoop user and loaded data into Hive. But now, two days
 later, when I start Hive (logged in as the hadoop user) and run a 'show
 tables' query, I get the error below.

 java.sql.SQLException: Failed to create database 'metastore_db', see the
 next exception for details.
 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask

 When I switch to the root user and run the same query, no error occurs, but
 I see no tables.

 Hive history file=/tmp/root/hive_job_log_root_201009021236_1725176336.txt
 hive
  show tables;
 OK
 Time taken: 9.194 seconds
 hive


 What may be the problem? Kindly suggest some remedy.

 Complete error log  is attached.

 Thanks and Regards

 Vaibhav Negi



Re: Hive - ERROR XBM0H: Directory /root/metastore_db cannot be created

2010-09-16 Thread Edward Capriolo
On Thu, Sep 16, 2010 at 8:42 AM, vaibhav negi sssena...@gmail.com wrote:
 Hi,

 Please help, I am stuck on this. When I run a 'show tables' query I cannot
 see any tables.

 Vaibhav Negi


 On Thu, Sep 2, 2010 at 1:06 PM, vaibhav negi sssena...@gmail.com
 wrote:

 Hi,

 I am running Hive on Hadoop 0.20.2 with a 2-node cluster.

 I had created a hadoop user and loaded data into Hive. But now, two days
 later, when I start Hive (logged in as the hadoop user) and run a 'show
 tables' query, I get the error below.

 java.sql.SQLException: Failed to create database 'metastore_db', see the
 next exception for details.
 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask

 When I switch to the root user and run the same query, no error occurs,
 but I see no tables.

 Hive history file=/tmp/root/hive_job_log_root_201009021236_1725176336.txt
 hive
      show tables;
 OK
 Time taken: 9.194 seconds
 hive


 What may be the problem? Kindly suggest some remedy.

 Complete error log  is attached.

 Thanks and Regards

 Vaibhav Negi



You are not having a Hive problem; you are having a file permission
problem. If you are running Hive out of the box, the metastore is an
embedded database whose files are owned by the user who created it. So if
you try to run multiple instances of Hive, or run as another user, it will
fail.

See the wiki for JDBC metastores or for moving Derby into server mode.
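For illustration, a shared JDBC metastore is configured with the same javax.jdo properties that appear earlier in this archive. A sketch for a MySQL-backed metastore in hive-site.xml (host, database name and credentials are made up):

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://metastore-host/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hadoop</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hadoop</value>
</property>

This gives every user the same metastore regardless of which local directory they start hive from.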


Re: ODBC isql error while testing

2010-09-08 Thread Carl Steinbach
Hi Ariel,

Here are a couple more things to try:

1) Force libodbchive.so to be loaded first by setting the LD_PRELOAD
environment variable:

% LD_PRELOAD=/usr/local/unixODBC/lib/libodbchive.so isql -v Hive

It turns out that SQLAllocEnv is defined in both libodbchive.so *and*
libodbc.so. I have noticed that if libodbc.so gets loaded first, its
definition of SQLAllocEnv tends to stick, blocking the later loading of the
same symbol defined in libodbchive.so. Setting LD_PRELOAD to point to the
Hive ODBC library is a convenient workaround for this problem.

2) Set the LD_DEBUG environment variable and run isql:

% LD_DEBUG=files isql -v Hive

Setting this variable causes the dl* functions to dump logging information
about what they are doing. Setting it to files will cause the dl*
functions to print out the names of files as they are loaded, as well as
information about any unresolved symbols.

Hope this helps.

Carl

On Tue, Sep 7, 2010 at 2:05 PM, Ning Zhang nzh...@facebook.com wrote:

 Some things you can check out:
   - Did you compile and deploy your code in different environments (Linux/C
 compiler versions)? If so, there might be incompatibility issues in the
 kernel/C .so files.
   - Can you check 'ldd isql' and see which .so files it loads? In particular,
 libodbc.so should be the one coming with unixODBC. If you installed another
 ODBC driver manager and its LD paths come before unixODBC's, isql may not get
 linked to the correct .so file. In addition, you can 'ldd' libodbchive.so
 etc. to see if there are missing dependent .so files as well.

 On Sep 7, 2010, at 12:44 PM, Ariel Leiva wrote:

 All three .so files can be found in LD_LIBRARY_PATH and i am still getting
 the same error. Thanks for your suggestion.

 On Tue, Sep 7, 2010 at 1:41 AM, Ning Zhang nzh...@facebook.com wrote:

 It looks like isql cannot load libodbchive.so. Please make sure that all
 three .so files libodbchive.so, libhiveclient.so, and libthrift.so can be
 found in LD_LIBRARY_PATH.


 On Sep 4, 2010, at 1:31 PM, Ariel Leiva wrote:

 Hi, i built Hive odbc driver following
 http://wiki.apache.org/hadoop/Hive/HiveODBC and wanted to test it with
 isql, but i get the following error:

 [ISQL]ERROR: Could not SQLAllocEnv

 I set LD_LIBRARY_PATH so libhiveclient.so can be found correctly.
 I also tried,
 dltest libodbchive.so SQLAllocEnv which succeeds with
 dltest libodbchive.so SQLAllocEnv
 SUCCESS: Loaded libodbchive.so
 SUCCESS: Found SQLAllocEnv

 Location of odbc.ini in my system is /usr/local/unixODBC/etc/odbc.ini and
 its content is
 [Hive]
 Driver = /usr/local/unixODBC/lib/libodbchive.so
 Description = Hive Driver v1
 DATABASE = default
 HOST = localhost
 PORT = 1
 FRAMED = 0

 Output of odbcinst -j is
 unixODBC 2.2.14
 DRIVERS: /usr/local/unixODBC/etc/odbcinst.ini
 SYSTEM DATA SOURCES: /usr/local/unixODBC/etc/odbc.ini
 FILE DATA SOURCES..: /usr/local/unixODBC/etc/ODBCDataSources
 USER DATA SOURCES..: /root/.odbc.ini
 SQLULEN Size...: 4
 SQLLEN Size: 4
 SQLSETPOSIROW Size.: 2

 And output of odbcinst -q -s -n Hive is:
 [Hive]
 Driver=/usr/local/unixODBC/lib/libodbchive.so
 Description=Hive Driver v1
 DATABASE=default
 HOST=localhost
 PORT=1
 FRAMED=0

 Can anybody help me with this error?
 Thanks in advance
 Ariel







Re: ODBC isql error while testing

2010-09-07 Thread Ariel Leiva
All three .so files can be found in LD_LIBRARY_PATH and i am still getting
the same error. Thanks for your suggestion.

On Tue, Sep 7, 2010 at 1:41 AM, Ning Zhang nzh...@facebook.com wrote:

 It looks like isql cannot load libodbchive.so. Please make sure that all
 three .so files libodbchive.so, libhiveclient.so, and libthrift.so can be
 found in LD_LIBRARY_PATH.


 On Sep 4, 2010, at 1:31 PM, Ariel Leiva wrote:

 Hi, i built Hive odbc driver following
 http://wiki.apache.org/hadoop/Hive/HiveODBC and wanted to test it with
 isql, but i get the following error:

 [ISQL]ERROR: Could not SQLAllocEnv

 I set LD_LIBRARY_PATH so libhiveclient.so can be found correctly.
 I also tried,
 dltest libodbchive.so SQLAllocEnv which succeeds with
 dltest libodbchive.so SQLAllocEnv
 SUCCESS: Loaded libodbchive.so
 SUCCESS: Found SQLAllocEnv

 Location of odbc.ini in my system is /usr/local/unixODBC/etc/odbc.ini and
 its content is
 [Hive]
 Driver = /usr/local/unixODBC/lib/libodbchive.so
 Description = Hive Driver v1
 DATABASE = default
 HOST = localhost
 PORT = 1
 FRAMED = 0

 Output of odbcinst -j is
 unixODBC 2.2.14
 DRIVERS: /usr/local/unixODBC/etc/odbcinst.ini
 SYSTEM DATA SOURCES: /usr/local/unixODBC/etc/odbc.ini
 FILE DATA SOURCES..: /usr/local/unixODBC/etc/ODBCDataSources
 USER DATA SOURCES..: /root/.odbc.ini
 SQLULEN Size...: 4
 SQLLEN Size: 4
 SQLSETPOSIROW Size.: 2

 And output of odbcinst -q -s -n Hive is:
 [Hive]
 Driver=/usr/local/unixODBC/lib/libodbchive.so
 Description=Hive Driver v1
 DATABASE=default
 HOST=localhost
 PORT=1
 FRAMED=0

 Can anybody help me with this error?
 Thanks in advance
 Ariel





Re: ODBC isql error while testing

2010-09-07 Thread Ning Zhang
Some things you can check out:
  - Did you compile and deploy your code in different environments (Linux/C
compiler versions)? If so, there might be incompatibility issues in the
kernel/C .so files.
  - Can you check 'ldd isql' and see which .so files it loads? In particular,
libodbc.so should be the one coming with unixODBC. If you installed another
ODBC driver manager and its LD paths come before unixODBC's, isql may not get
linked to the correct .so file. In addition, you can 'ldd' libodbchive.so
etc. to see if there are missing dependent .so files as well.
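A concrete illustration of those checks, using paths from earlier in the thread:

% ldd $(which isql)
% ldd /usr/local/unixODBC/lib/libodbchive.so

The first shows which libodbc.so the isql binary actually resolves; the second
flags any 'not found' entries among the driver's own dependencies.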

On Sep 7, 2010, at 12:44 PM, Ariel Leiva wrote:

 All three .so files can be found in LD_LIBRARY_PATH and i am still getting 
 the same error. Thanks for your suggestion.
 
 On Tue, Sep 7, 2010 at 1:41 AM, Ning Zhang nzh...@facebook.com wrote:
 It looks like isql cannot load libodbchive.so. Please make sure that all 
 three .so files libodbchive.so, libhiveclient.so, and libthrift.so can be 
 found in LD_LIBRARY_PATH. 
 
 
 On Sep 4, 2010, at 1:31 PM, Ariel Leiva wrote:
 
 Hi, i built Hive odbc driver following 
 http://wiki.apache.org/hadoop/Hive/HiveODBC and wanted to test it with isql, 
 but i get the following error:
 
 [ISQL]ERROR: Could not SQLAllocEnv
 
 I set LD_LIBRARY_PATH so libhiveclient.so can be found correctly. 
 I also tried,
 dltest libodbchive.so SQLAllocEnv which succeeds with 
 dltest libodbchive.so SQLAllocEnv
 SUCCESS: Loaded libodbchive.so
 SUCCESS: Found SQLAllocEnv
 
 Location of odbc.ini in my system is /usr/local/unixODBC/etc/odbc.ini and 
 its content is  
 [Hive]
 Driver = /usr/local/unixODBC/lib/libodbchive.so
 Description = Hive Driver v1
 DATABASE = default
 HOST = localhost
 PORT = 1
 FRAMED = 0
 
 Output of odbcinst -j is
 unixODBC 2.2.14
 DRIVERS: /usr/local/unixODBC/etc/odbcinst.ini
 SYSTEM DATA SOURCES: /usr/local/unixODBC/etc/odbc.ini
 FILE DATA SOURCES..: /usr/local/unixODBC/etc/ODBCDataSources
 USER DATA SOURCES..: /root/.odbc.ini
 SQLULEN Size...: 4
 SQLLEN Size: 4
 SQLSETPOSIROW Size.: 2
 
 And output of odbcinst -q -s -n Hive is:
 [Hive]
 Driver=/usr/local/unixODBC/lib/libodbchive.so
 Description=Hive Driver v1
 DATABASE=default
 HOST=localhost
 PORT=1
 FRAMED=0
 
 Can anybody help me with this error?
 Thanks in advance
 Ariel
 
 



Re: ODBC isql error while testing

2010-09-06 Thread Ning Zhang
It looks like isql cannot load libodbchive.so. Please make sure that all three 
.so files libodbchive.so, libhiveclient.so, and libthrift.so can be found in 
LD_LIBRARY_PATH. 


On Sep 4, 2010, at 1:31 PM, Ariel Leiva wrote:

 Hi, i built Hive odbc driver following 
 http://wiki.apache.org/hadoop/Hive/HiveODBC and wanted to test it with isql, 
 but i get the following error:
 
 [ISQL]ERROR: Could not SQLAllocEnv
 
 I set LD_LIBRARY_PATH so libhiveclient.so can be found correctly. 
 I also tried,
 dltest libodbchive.so SQLAllocEnv which succeeds with 
 dltest libodbchive.so SQLAllocEnv
 SUCCESS: Loaded libodbchive.so
 SUCCESS: Found SQLAllocEnv
 
 Location of odbc.ini in my system is /usr/local/unixODBC/etc/odbc.ini and its 
 content is  
 [Hive]
 Driver = /usr/local/unixODBC/lib/libodbchive.so
 Description = Hive Driver v1
 DATABASE = default
 HOST = localhost
 PORT = 1
 FRAMED = 0
 
 Output of odbcinst -j is
 unixODBC 2.2.14
 DRIVERS: /usr/local/unixODBC/etc/odbcinst.ini
 SYSTEM DATA SOURCES: /usr/local/unixODBC/etc/odbc.ini
 FILE DATA SOURCES..: /usr/local/unixODBC/etc/ODBCDataSources
 USER DATA SOURCES..: /root/.odbc.ini
 SQLULEN Size...: 4
 SQLLEN Size: 4
 SQLSETPOSIROW Size.: 2
 
 And output of odbcinst -q -s -n Hive is:
 [Hive]
 Driver=/usr/local/unixODBC/lib/libodbchive.so
 Description=Hive Driver v1
 DATABASE=default
 HOST=localhost
 PORT=1
 FRAMED=0
 
 Can anybody help me with this error?
 Thanks in advance
 Ariel



ODBC isql error while testing

2010-09-04 Thread Ariel Leiva
Hi, I built the Hive ODBC driver following
http://wiki.apache.org/hadoop/Hive/HiveODBC and wanted to test it with isql,
but I get the following error:

[ISQL]ERROR: Could not SQLAllocEnv

I set LD_LIBRARY_PATH so libhiveclient.so can be found correctly.
I also tried 'dltest libodbchive.so SQLAllocEnv', which succeeds with:
dltest libodbchive.so SQLAllocEnv
SUCCESS: Loaded libodbchive.so
SUCCESS: Found SQLAllocEnv

Location of odbc.ini in my system is /usr/local/unixODBC/etc/odbc.ini and
its content is
[Hive]
Driver = /usr/local/unixODBC/lib/libodbchive.so
Description = Hive Driver v1
DATABASE = default
HOST = localhost
PORT = 1
FRAMED = 0

Output of odbcinst -j is
unixODBC 2.2.14
DRIVERS: /usr/local/unixODBC/etc/odbcinst.ini
SYSTEM DATA SOURCES: /usr/local/unixODBC/etc/odbc.ini
FILE DATA SOURCES..: /usr/local/unixODBC/etc/ODBCDataSources
USER DATA SOURCES..: /root/.odbc.ini
SQLULEN Size...: 4
SQLLEN Size: 4
SQLSETPOSIROW Size.: 2

And output of odbcinst -q -s -n Hive is:
[Hive]
Driver=/usr/local/unixODBC/lib/libodbchive.so
Description=Hive Driver v1
DATABASE=default
HOST=localhost
PORT=1
FRAMED=0

Can anybody help me with this error?
Thanks in advance
Ariel


Re: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.ExecDriver

2010-09-02 Thread Robert Hennig
Hello,

Thanks Shrijeet for your answer. I found an exception in a task log
which results from a casting error:

Caused by: java.lang.ClassCastException:
org.apache.hadoop.mapred.FileSplit cannot be cast to
com.adconion.hadoop.hive.DataLogSplit
at
com.adconion.hadoop.hive.DataLogInputFormat.getRecordReader(DataLogInputFormat.java:112)
at
org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.<init>(CombineHiveRecordReader.java:61)
... 11 more

The error happens because I expected my custom getSplits() method to be
used, which delivers an array of DataLogSplit objects, and I expected that
my custom getRecordReader() method would receive one of these splits, which
could then be cast to a DataLogSplit.

So this looks like my getSplits() method is not being used. Or does
hadoop transform the splits somehow?

Thanks,

Robert
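One hedged thing to try: the CombineHiveRecordReader frame in the trace suggests Hive's split-combining input format is wrapping the table's splits. Assuming your build exposes the hive.input.format setting (verify this against your version), forcing the plain input format makes Hive delegate getSplits() to the table's own InputFormat:

set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;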

On 01.09.10 18:34, Shrijeet Paliwal wrote:

 Ended Job = job_201008311250_0006 with errors

 Check your hadoop task logs; you will find more detailed information
 there.

 -Shirjeet

 On Wed, Sep 1, 2010 at 6:13 AM, Robert Hennig rhen...@adconion.com
 mailto:rhen...@adconion.com wrote:

 Hello,

 I'm relatively new to hive & hadoop and I have written a custom
 InputFormat to be able to read our logfiles. I think I got
 everything right, but when I try to execute a query on an Amazon EMR
 cluster it fails with some error messages that don't tell me what
 exactly is wrong.

 So this is the query I execute:

 add jar s3://amg.hadoop/hiveLib/hive-json-serde-0.1.jar;
 add jar s3://amg.hadoop/hiveLib/hadoop-jar-with-dependencies.jar;

 DROP TABLE event_log;

 CREATE EXTERNAL TABLE IF NOT EXISTS event_log (
 EVENT_SUBTYPE STRING,
 EVENT_TYPE STRING
 )
 ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde'
 STORED AS
 INPUTFORMAT 'com.adconion.hadoop.hive.DataLogInputFormat'
 OUTPUTFORMAT
 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 LOCATION 's3://amg-events/2010/07/01/01';

 SELECT event_type FROM event_log WHERE event_type = 'pp' LIMIT 10;

 Which results in the following output:

 had...@domu-12-31-39-0f-45-b3:~$ hive -f test.ql
 Hive history
 
 file=/mnt/var/lib/hive/tmp/history/hive_job_log_hadoop_201009011303_427866099.txt
 Testing s3://amg.hadoop/hiveLib/hive-json-serde-0.1.jar
 converting to local s3://amg.hadoop/hiveLib/hive-json-serde-0.1.jar
 Added
 
 /mnt/var/lib/hive/downloaded_resources/s3_amg.hadoop_hiveLib_hive-json-serde-0.1.jar
 to class path
 Testing s3://amg.hadoop/hiveLib/hadoop-jar-with-dependencies.jar
 converting to local
 s3://amg.hadoop/hiveLib/hadoop-jar-with-dependencies.jar
 Added
 
 /mnt/var/lib/hive/downloaded_resources/s3_amg.hadoop_hiveLib_hadoop-jar-with-dependencies.jar
 to class path
 OK
 Time taken: 2.426 seconds
 Found class for org.apache.hadoop.hive.contrib.serde2.JsonSerde
 OK
 Time taken: 0.332 seconds
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks is set to 0 since there's no reduce operator
 Starting Job = job_201008311250_0006, Tracking URL =
 
 http://domU-12-31-39-0F-45-B3.compute-1.internal:9100/jobdetails.jsp?jobid=job_201008311250_0006
 Kill Command = /home/hadoop/.versions/0.20/bin/../bin/hadoop job 
 -Dmapred.job.tracker=domU-12-31-39-0F-45-B3.compute-1.internal:9001 -kill
 job_201008311250_0006
 2010-09-01 13:04:04,376 Stage-1 map = 0%,  reduce = 0%
 2010-09-01 13:04:34,681 Stage-1 map = 100%,  reduce = 100%
 Ended Job = job_201008311250_0006 with errors

 Failed tasks with most(4) failures :
 Task URL:
 
 http://domU-12-31-39-0F-45-B3.compute-1.internal:9100/taskdetails.jsp?jobid=job_201008311250_0006&tipid=task_201008311250_0006_m_13

 FAILED: Execution Error, return code 2 from
 org.apache.hadoop.hive.ql.exec.ExecDriver

 The only errors I can find under /mnt/var/log/apps/hive.log are
 multiple entries like this one:

 2010-09-01 13:03:36,586 DEBUG org.apache.hadoop.conf.Configuration
 (Configuration.java:init(216)) - java.io.IOException: config()
 at
 org.apache.hadoop.conf.Configuration.init(Configuration.java:216)
 at
 org.apache.hadoop.conf.Configuration.init(Configuration.java:203)
 at
 org.apache.hadoop.hive.conf.HiveConf.init(HiveConf.java:316)
 at
 org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:232)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25

Re: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.ExecDriver

2010-09-01 Thread Shrijeet Paliwal

 Ended Job = job_201008311250_0006 with errors

Check your hadoop task logs; you will find more detailed information there.

-Shirjeet

On Wed, Sep 1, 2010 at 6:13 AM, Robert Hennig rhen...@adconion.com wrote:

  Hello,

 I'm relatively new to hive & hadoop and I have written a custom InputFormat
 to be able to read our logfiles. I think I got everything right, but when I
 try to execute a query on an Amazon EMR cluster it fails with some error
 messages that don't tell me what exactly is wrong.

 So this is the query I execute:

 add jar s3://amg.hadoop/hiveLib/hive-json-serde-0.1.jar;
 add jar s3://amg.hadoop/hiveLib/hadoop-jar-with-dependencies.jar;

 DROP TABLE event_log;

 CREATE EXTERNAL TABLE IF NOT EXISTS event_log (
 EVENT_SUBTYPE STRING,
 EVENT_TYPE STRING
 )
 ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.JsonSerde'
 STORED AS
 INPUTFORMAT 'com.adconion.hadoop.hive.DataLogInputFormat'
 OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
 LOCATION 's3://amg-events/2010/07/01/01';

 SELECT event_type FROM event_log WHERE event_type = 'pp' LIMIT 10;

 Which results in the following output:

 had...@domu-12-31-39-0f-45-b3:~$ hive -f test.ql
 Hive history
 file=/mnt/var/lib/hive/tmp/history/hive_job_log_hadoop_201009011303_427866099.txt
 Testing s3://amg.hadoop/hiveLib/hive-json-serde-0.1.jar
 converting to local s3://amg.hadoop/hiveLib/hive-json-serde-0.1.jar
 Added
 /mnt/var/lib/hive/downloaded_resources/s3_amg.hadoop_hiveLib_hive-json-serde-0.1.jar
 to class path
 Testing s3://amg.hadoop/hiveLib/hadoop-jar-with-dependencies.jar
 converting to local
 s3://amg.hadoop/hiveLib/hadoop-jar-with-dependencies.jar
 Added
 /mnt/var/lib/hive/downloaded_resources/s3_amg.hadoop_hiveLib_hadoop-jar-with-dependencies.jar
 to class path
 OK
 Time taken: 2.426 seconds
 Found class for org.apache.hadoop.hive.contrib.serde2.JsonSerde
 OK
 Time taken: 0.332 seconds
 Total MapReduce jobs = 1
 Launching Job 1 out of 1
 Number of reduce tasks is set to 0 since there's no reduce operator
 Starting Job = job_201008311250_0006, Tracking URL =
 http://domU-12-31-39-0F-45-B3.compute-1.internal:9100/jobdetails.jsp?jobid=job_201008311250_0006
 Kill Command = /home/hadoop/.versions/0.20/bin/../bin/hadoop job
 -Dmapred.job.tracker=domU-12-31-39-0F-45-B3.compute-1.internal:9001 -kill
 job_201008311250_0006
 2010-09-01 13:04:04,376 Stage-1 map = 0%,  reduce = 0%
 2010-09-01 13:04:34,681 Stage-1 map = 100%,  reduce = 100%
 Ended Job = job_201008311250_0006 with errors

 Failed tasks with most(4) failures :
 Task URL:
 http://domU-12-31-39-0F-45-B3.compute-1.internal:9100/taskdetails.jsp?jobid=job_201008311250_0006&tipid=task_201008311250_0006_m_13

 FAILED: Execution Error, return code 2 from
 org.apache.hadoop.hive.ql.exec.ExecDriver

 The only errors I can find under /mnt/var/log/apps/hive.log are multiple
 entries like this one:

 2010-09-01 13:03:36,586 DEBUG org.apache.hadoop.conf.Configuration
 (Configuration.java:init(216)) - java.io.IOException: config()
 at
 org.apache.hadoop.conf.Configuration.init(Configuration.java:216)
 at
 org.apache.hadoop.conf.Configuration.init(Configuration.java:203)
 at org.apache.hadoop.hive.conf.HiveConf.init(HiveConf.java:316)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:232)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 at
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
 at java.lang.reflect.Method.invoke(Method.java:597)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

 And those errors:

 2010-09-01 13:03:40,228 ERROR DataNucleus.Plugin
 (Log4JLogger.java:error(115)) - Bundle org.eclipse.jdt.core requires
 org.eclipse.core.resources but it cannot be resolved.
 2010-09-01 13:03:40,228 ERROR DataNucleus.Plugin
 (Log4JLogger.java:error(115)) - Bundle org.eclipse.jdt.core requires
 org.eclipse.core.resources but it cannot be resolved.
 2010-09-01 13:03:40,229 ERROR DataNucleus.Plugin
 (Log4JLogger.java:error(115)) - Bundle org.eclipse.jdt.core requires
 org.eclipse.core.runtime but it cannot be resolved.
 2010-09-01 13:03:40,229 ERROR DataNucleus.Plugin
 (Log4JLogger.java:error(115)) - Bundle org.eclipse.jdt.core requires
 org.eclipse.core.runtime but it cannot be resolved.
 2010-09-01 13:03:40,229 ERROR DataNucleus.Plugin
 (Log4JLogger.java:error(115)) - Bundle org.eclipse.jdt.core requires
 org.eclipse.text but it cannot be resolved.
 2010-09-01 13:03:40,229 ERROR DataNucleus.Plugin
 (Log4JLogger.java:error(115)) - Bundle org.eclipse.jdt.core requires
 org.eclipse.text but it cannot be resolved.

 Does anyone have an idea what went wrong?

 Thanks!

 Robert



Error in Running Count

2010-08-27 Thread Adarsh Sharma

Hello everyone,
I am able to retrieve data from the Hive that comes with the HadoopDB
package (SMS). Data is stored in Postgres and metadata is in MySQL.

But when I fire a count(1) query, it gives the following error.
I tried count(1), count(*), and count(page_id).

hive> select count(1) from fact_table_olap;
Total MapReduce jobs = 1

Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
 set hive.exec.reducers.bytes.per.reducer=number
In order to limit the maximum number of reducers:
 set hive.exec.reducers.max=number
In order to set a constant number of reducers:
 set mapred.reduce.tasks=number
Starting Job = job_201008270046_0019, Tracking URL = 
http://ws-test-lin:50030/jobdetails.jsp?jobid=job_201008270046_0019
Kill Command = /home/hadoop/project/hadoop-0.20.2/bin/../bin/hadoop job  
-Dmapred.job.tracker=192.168.0.25:54311 -kill job_201008270046_0019

2010-08-27 03:10:35,587 map = 0%,  reduce =0%
2010-08-27 03:11:11,760 map = 100%,  reduce =100%
Ended Job = job_201008270046_0019 with errors
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.ExecDriver

hive

Can anybody tell me the root cause?

Thanks in Advance..




Error in Running count..

2010-08-27 Thread Adarsh Sharma

Hello everyone,
I am able to retrieve data from the Hive that comes with the HadoopDB
package (SMS). Data is stored in Postgres and metadata is in MySQL.

But when I fire a count(1) query, it gives the following error.

I tried count(1), count(*), and count(page_id).

hive> select count(1) from fact_table_olap;
  Total MapReduce jobs = 1

Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=number
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=number
In order to set a constant number of reducers:
set mapred.reduce.tasks=number
Starting Job = job_201008270046_0019, Tracking URL = 
http://ws-test-lin:50030/jobdetails.jsp?jobid=job_201008270046_0019
Kill Command = /home/hadoop/project/hadoop-0.20.2/bin/../bin/hadoop job  
-Dmapred.job.tracker=192.168.0.25:54311 -kill job_201008270046_0019

2010-08-27 03:10:35,587 map = 0%,  reduce =0%
2010-08-27 03:11:11,760 map = 100%,  reduce =100%
Ended Job = job_201008270046_0019 with errors
FAILED: Execution Error, return code 2 from 
org.apache.hadoop.hive.ql.exec.ExecDriver

hive

Can anybody tell me the root cause? And is there a command to determine the Hive version?

Thanks in Advance..


Re: Error in Running Count

2010-08-27 Thread shixing
I think you should paste the error log from
http://ws-test-lin:50030/jobdetails.jsp?jobid=job_201008270046_0019


On Fri, Aug 27, 2010 at 3:15 PM, Adarsh Sharma adarsh.sha...@orkash.comwrote:

 Hello everyone,
 I am able to retrieve data from Hive which comes with HAdoopDB
 package(SMS).Data is stored in Postgres and metadata is in Mysql
 But when I fired  a count(1) query , it gives the following error..
 I tried count(1),count(*),count(page_id)hive select count(1) from
 fact_table_olap; Total MapReduce jobs = 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=number
 In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=number
 In order to set a constant number of reducers:
  set mapred.reduce.tasks=number
 Starting Job = job_201008270046_0019, Tracking URL =
 http://ws-test-lin:50030/jobdetails.jsp?jobid=job_201008270046_0019
 Kill Command = /home/hadoop/project/hadoop-0.20.2/bin/../bin/hadoop job
  -Dmapred.job.tracker=192.168.0.25:54311 -kill job_201008270046_0019
 2010-08-27 03:10:35,587 map = 0%,  reduce =0%
 2010-08-27 03:11:11,760 map = 100%,  reduce =100%
 Ended Job = job_201008270046_0019 with errors
 FAILED: Execution Error, return code 2 from
 org.apache.hadoop.hive.ql.exec.ExecDriver
 hive

 Can anybody tell me the root cause?

 Thanks in Advance..





-- 
Best wishes!
My Friend~


Re: Error in Running count..

2010-08-27 Thread Dave Brondsema
If you go to the Tracking URL for that job, it should have some more info

On Fri, Aug 27, 2010 at 3:19 AM, Adarsh Sharma adarsh.sha...@orkash.com wrote:
 Hello everyone,
 I am able to retrieve data from the Hive that comes with the HadoopDB
 package (SMS). Data is stored in Postgres and metadata is in MySQL.
 But when I fire a count(1) query, it gives the following error.

 I tried count(1), count(*), and count(page_id).

 hive> select count(1) from fact_table_olap;       Total MapReduce jobs = 1
 Number of reduce tasks determined at compile time: 1
 In order to change the average load for a reducer (in bytes):
 set hive.exec.reducers.bytes.per.reducer=number
 In order to limit the maximum number of reducers:
 set hive.exec.reducers.max=number
 In order to set a constant number of reducers:
 set mapred.reduce.tasks=number
 Starting Job = job_201008270046_0019, Tracking URL =
 http://ws-test-lin:50030/jobdetails.jsp?jobid=job_201008270046_0019
 Kill Command = /home/hadoop/project/hadoop-0.20.2/bin/../bin/hadoop job
  -Dmapred.job.tracker=192.168.0.25:54311 -kill job_201008270046_0019
 2010-08-27 03:10:35,587 map = 0%,  reduce =0%
 2010-08-27 03:11:11,760 map = 100%,  reduce =100%
 Ended Job = job_201008270046_0019 with errors
 FAILED: Execution Error, return code 2 from
 org.apache.hadoop.hive.ql.exec.ExecDriver
 hive

 Can anybody tell me the root cause? And is there a command to determine the Hive version?

 Thanks in Advance..




-- 
Dave Brondsema
Software Engineer
Geeknet

www.geek.net


Re: error log

2010-08-19 Thread Amareshwari Sri Ramadasu
These are ignorable exceptions in the TaskTracker; they happen when a reduce task
prematurely closes a connection to the Jetty server. See
MAPREDUCE-5 (https://issues.apache.org/jira/browse/MAPREDUCE-5) for details.

-Amareshwari

On 8/19/10 1:18 PM, shangan shan...@corp.kaixin001.com wrote:

Does anybody know why these errors arise? It seems to run well from the Hive
CLI, but when I check the log I find these:

2010-08-19 15:34:09,573 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: 
src: 10.11.2.137:50060, dest: 10.11.2.219:24805, bytes: 10063307, op: 
MAPRED_SHUFFLE, cliID: attempt_201008191217_0012_m_03_0
2010-08-19 15:34:11,987 WARN org.apache.hadoop.mapred.TaskTracker: 
getMapOutput(attempt_201008191217_0012_m_18_0,0) failed :
org.mortbay.jetty.EofException
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:787)
at 
org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:548)
at 
org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:569)
at 
org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:946)
at 
org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:646)
at 
org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:577)
at 
org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2940)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at 
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
at 
org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at 
org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:324)
at 
org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
at 
org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
at 
org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
at 
org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
Caused by: java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:29)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:104)
at sun.nio.ch.IOUtil.write(IOUtil.java:75)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:334)
at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:169)
at 
org.mortbay.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:221)
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:721)
... 24 more
2010-08-19 15:34:11,988 WARN org.mortbay.log: Committed before 410 
getMapOutput(attempt_201008191217_0012_m_18_0,0) failed :
org.mortbay.jetty.EofException
at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:787)
at 
org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:548)
at 
org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:569)
at 
org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:946)
at 
org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:646)
at 
org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:577)
at 
org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2940)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at 
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
at 
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
at 
org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at 
org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at 
org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417

Execute error in java client

2010-08-12 Thread SingoWong
Hi,

I wrote a Java client that queries Hive via JDBC, but one of five tasks got the
error below:
java.sql.SQLException: Query returned non-zero code: 9, cause: FAILED:
Execution Error, return code *1* from
org.apache.hadoop.hive.ql.exec.ExecDriver

Then I copied the HQL script into the Hive client and executed it, and I got the
result from the Hive client. Can anyone tell me what happened there?

BTW, I have configured the metastore in a MySQL database.

Regards,
Singo


[Hive] Error in semantic analysis: partition not found - CDH3 Beta

2010-08-12 Thread Ken.Barclay
Hello,

[I posted the question below to Cloudera's getsatisfaction site but am 
cross-posting here in case hive-users folks have debugging suggestions. I'm 
really stuck on this one.]

I recently upgraded to CDH3 Beta. I had some Hive code working well in an
earlier version of Hadoop 0.20 that created a table, then loaded data into it
using LOAD DATA LOCAL INPATH. In CDH3, I now get a semantic error when I run
the same LOAD command.

The table is created by

CREATE TABLE TOMCAT(identifier STRING, datestamp STRING, time_stamp STRING, seq 
STRING, server STRING, logline STRING) PARTITIONED BY(filedate STRING, app 
STRING, filename STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\011' 
STORED AS TEXTFILE;

and the load command used is:

LOAD DATA LOCAL INPATH '/var/www/petrify/mw.log.trustejb1' INTO TABLE TOMCAT 
PARTITION (filedate='2010-06-25', app='trustdomain', 
filename='mw.log.trustejb1');

The file is simple tab-delimited log data.
If I exclude the partition when I create the table, the data loads fine. But 
when I set up the partitions I get the stack trace below during the load.

I tried copying the data into HDFS and using LOAD DATA INPATH instead, but got 
the same error:

FAILED: Error in semantic analysis: line 1:110 Partition not found 
'mw.log.trustejb1'

where 110 is the character position just after the word PARTITION in the query.
It seems like it doesn't think the table is partitioned, though I can see the 
partition keys listed when I do DESCRIBE EXTENDED on my table. (Output from 
that is below the error.) There were no errors in the logs or at the Thrift 
server console when I created the table.

Strangely, when I run SHOW PARTITIONS TOMCAT, it doesn't list anything.

Any help with this would be most welcome.

Thanks
Ken

10/08/12 15:11:40 INFO service.HiveServer: Running the query: LOAD DATA LOCAL 
INPATH '/var/www/petrify/trustdomain-rewritten/mw.log.trustejb1' INTO TABLE 
TOMCAT PARTITION (filedate='2010-06-25', app='trustdomain', 
filename='mw.log.trustejb1')
10/08/12 15:11:40 INFO parse.ParseDriver: Parsing command: LOAD DATA LOCAL 
INPATH '/var/www/petrify/trustdomain-rewritten/mw.log.trustejb1' INTO TABLE 
TOMCAT PARTITION (filedate='2010-06-25', app='trustdomain', 
filename='mw.log.trustejb1')
10/08/12 15:11:40 INFO parse.ParseDriver: Parse Completed
10/08/12 15:11:40 INFO hive.log: DDL: struct tomcat { string identifier, string 
datestamp, string time_stamp, string seq, string server, string logline}
10/08/12 15:11:40 ERROR metadata.Hive: org.apache.thrift.TApplicationException: 
get_partition failed: unknown result
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition(ThriftHiveMetastore.java:831)
at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition(ThriftHiveMetastore.java:799)
at 
org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartition(HiveMetaStoreClient.java:418)
at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:620)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:397)
at 
org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:178)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
at 
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:120)
at 
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:378)
at 
org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java:366)
at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)

FAILED: Error in semantic analysis: line 1:110 Partition not found 
'mw.log.trustejb1'
10/08/12 15:11:40 ERROR ql.Driver: FAILED: Error in semantic analysis: line 
1:110 Partition not found 'mw.log.trustejb1'
org.apache.hadoop.hive.ql.parse.SemanticException: line 1:110 Partition not 
found 'mw.log.trustejb1'
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer$tableSpec.<init>(BaseSemanticAnalyzer.java:403)
at 
org.apache.hadoop.hive.ql.parse.LoadSemanticAnalyzer.analyzeInternal(LoadSemanticAnalyzer.java:178)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
at 
org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:120)
at 
org.apache.hadoop.hive.service.ThriftHive$Processor$execute.process(ThriftHive.java:378)
at 
org.apache.hadoop.hive.service.ThriftHive$Processor.process(ThriftHive.java

Re: [Hive] Error in semantic analysis: partition not found - CDH3 Beta

2010-08-12 Thread Carl Steinbach
Hi Ken,

I'm going to crosspost what I wrote on GetSatisfaction:

Looks like you ran into this bug:
https://issues.apache.org/jira/browse/HIVE-1428

Pradeep wrote a fix for this issue in 0.7, and we're in the process of
backporting it to 0.6, and ideally we're going to release 0.6.0 in time for
it to be included in the next version of CDH3. In the meantime I think your
best bet is to execute the LOAD DATA command via a CLI session that is not
using Thrift, i.e. one that is not connected to a standalone HiveServer or
HiveMetaStoreServer instance.

Sorry for the inconvenience!

Carl
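
A minimal sketch of the workaround Carl describes, assuming a conf directory
(here called hive_conf_without_thrift, an invented name) whose hive-site.xml
points directly at the metastore database rather than at a Thrift server:

hive --config hive_conf_without_thrift -e "
LOAD DATA LOCAL INPATH '/var/www/petrify/mw.log.trustejb1'
INTO TABLE TOMCAT
PARTITION (filedate='2010-06-25', app='trustdomain', filename='mw.log.trustejb1');"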



error when use ip instead of hostname in hive

2010-07-29 Thread shangan
DDL operations like CREATE TABLE succeed, and other commands that only query
the metastore_db, such as DESCRIBE and SHOW TABLES, work as well; I think that
is because the metastore_db is located on the local host. But a SELECT causes
an error if I configure fs.default.name to an IP string in core-site.xml under
Hadoop, while if I configure that field to a hostname instead of an IP, it
works well. Does anybody know the reason and the solution, or must that field
be configured to a hostname when using Hive?
As far as I know, Hadoop itself works quite well with that field configured to
either an IP or a hostname.

The error log:

2010-07-29 16:36:11,255 ERROR ql.Driver (SessionState.java:printError(248)) -
FAILED: Unknown exception: Error while making MR scratch directory - check filesystem config (null)
java.lang.RuntimeException: Error while making MR scratch directory - check filesystem config (null)
at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:184)
at org.apache.hadoop.hive.ql.Context.getMRTmpFileURI(Context.java:254)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:733)
at 
org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5200)
at 
org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.lang.IllegalArgumentException: Wrong FS:
hdfs://192.168.0.153:9000/tmp/hive-shangan/1018108626, expected: hdfs://vm153:9000
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:99)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:155)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:262)
at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1120)
at org.apache.hadoop.hive.ql.Context.makeMRScratchDir(Context.java:122)
at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:180)
... 15 more
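
The "Wrong FS" line points at the fix: the scratch-directory URI must match
fs.default.name exactly, authority string included. A sketch of the relevant
core-site.xml entry, assuming the NameNode really is reachable as vm153:

<property>
  <name>fs.default.name</name>
  <value>hdfs://vm153:9000</value>
  <!-- use one spelling (hostname or IP) everywhere; mixing vm153 with
       192.168.0.153 makes FileSystem.checkPath reject the path -->
</property>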


2010-07-29



shangan 


ODBC isql error

2010-06-25 Thread Amogh Vasekar
Hi,
I tried testing my odbc build with isql, but I get the following error:
[ISQL]ERROR: Could not SQLAllocEnv

I tried "dltest /usr/local/lib/libodbchive.so SQLAllocEnv", which succeeds, so
I guess the entry point should be found.
Any suggestions anyone?

Amogh


Re: ODBC isql error

2010-06-25 Thread Vinithra Varadharajan
Hi Amogh,

I found that I needed to add the library directory to my LD_LIBRARY_PATH so
that the loader knows where to find libodbchive.so. If this environment
variable is already set correctly, please provide the location and contents of
your odbc.ini file and the output of 'odbcinst -j' and 'odbcinst -q -s -n DSN'.

HTH!

-Vinithra
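
A minimal sketch of that check, assuming the driver sits in /usr/local/lib
and a DSN named Hive (both assumptions):

export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
odbcinst -j    # shows which odbcinst.ini/odbc.ini files unixODBC is reading
isql Hive      # retry once the loader can resolve libodbchive.so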

On Fri, Jun 25, 2010 at 4:32 AM, Amogh Vasekar am...@yahoo-inc.com wrote:

  Hi,
 I tried testing my odbc build with isql, but I get the following error:
 [ISQL]ERROR: Could not SQLAllocEnv

 I tried,
 dltest /usr/local/lib/libodbchive.so SQLAllocEnv which succeeds, so I guess
 the entry point should be found.
 Any suggestions anyone?

 Amogh



Hive server error: Could not get block locations. Aborting

2010-06-25 Thread Omer, Farah
Hi All,

 

Today I was running a big set of reports using Hive (trunk version) and I
ran into the following problem.

The reports start running, but after a while they all start failing, and I
see that the Hive server has shut down.

I see this message on the Hive CLI:

 

10/06/25 09:22:08 WARN dfs.DFSClient: Error Recovery for block null bad
datanode[0]

java.io.IOException: Could not get block locations. Aborting...

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2153)

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1745)

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1899)

Job Submission failed with exception 'java.io.IOException(Could not get
block locations. Aborting...)'

10/06/25 09:22:08 ERROR exec.ExecDriver: Job Submission failed with
exception 'java.io.IOException(Could not get block locations.
Aborting...)'

java.io.IOException: Could not get block locations. Aborting...

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2153)

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1745)

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1899)

 

FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.ExecDriver

10/06/25 09:22:08 ERROR ql.Driver: FAILED: Execution Error, return code
1 from org.apache.hadoop.hive.ql.exec.ExecDriver

Exception closing file
/var/lib/hadoop/cache/hadoop/mapred/system/job_201006111052_4529/job.jar

java.io.IOException: Could not get block locations. Aborting...

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2153)

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1745)

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1899)

Exception closing file
/tmp/hive-training/hive_2010-06-25_09-21-04_517_4337603327678713321/plan
.859178065

java.io.IOException: Could not get block locations. Aborting...

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2153)

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1745)

at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1899)

train...@training-vm:~/hivetrunk/hive/build/dist/bin$

 

There is an error message above of "Could not get block locations.
Aborting...".

This didn't happen with the previous version of Hive that I used. The
complete set of reports used to finish executing one after another, without
any server shutdown messages.

I looked a bit into the Hive mail archive, and I see that one workaround
would be increasing the file descriptor (fd) limit.
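
A sketch of that workaround, assuming the Hadoop daemons run as user hadoop
(the limit value is only an example, and the daemons need a restart after
raising it):

ulimit -n                     # current per-process file descriptor limit
# in /etc/security/limits.conf:
hadoop  soft  nofile  65536
hadoop  hard  nofile  65536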

 

Can someone tell me what exactly might be the reason for this kind of error
message, which setting I can change to work around it, and where I can find
it? Please let me know if there is some other value or file I should send
along for this communication to make more sense.


Thanks very much for your help.

 

Farah Omer

Senior DB Engineer, MicroStrategy, Inc.

 

T: 703 2702230

E: fo...@microstrategy.com

http://www.microstrategy.com

 

 

 



How can I skip error record in hive?

2010-06-23 Thread luocan19826164
How can I skip error records in Hive?
Because when there are error records in a Hive table, the Hadoop job always
fails!
Is there some configuration to avoid this?
Hope for your reply!

Re: How can I skip error record in hive?

2010-06-23 Thread Ted Xu
Hi luocan,

Please try

set hive.groupby.skewindata=true;

2010/6/24 luocan19826...@sohu.com


 Thanks very much!

 It succeeded!

 When I use DISTINCT, for example select count(distinct(commid)) from
 t_dw_comm2pos,

 the number of reducers is only one.

 Is there some method to optimize it?

  - Original Message -

 *From:* Ted Xu
 *Subject:* Re: How can I skip error record in hive?
 *Sent:* June 24, 2010, 9:43:00


 Hi,

 The skipping-bad-records feature is provided by Hadoop; please refer to the
 Hadoop tutorial:
 http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html#Skipping+Bad+Records

 Those settings with Java APIs are also configurable in Hive, like:

 set mapreduce.map.skip.maxrecords=10;


 2010/6/24 luocan19826...@sohu.com

 How can I skip error record in hive.

 because when there is some error record in hive table,the hadoop job always
 fail!

 Is there some configuration to avoid this?

 Hope for your reply!




 --
 Best Regards,
 Ted Xu




-- 
Best Regards,
Ted Xu
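
On the follow-up about COUNT(DISTINCT ...) using a single reducer: one common
rewrite (a sketch, not something tested in this thread) de-duplicates in a
parallel first stage and counts in a second:

SELECT COUNT(*) FROM (SELECT DISTINCT commid FROM t_dw_comm2pos) t;

set hive.groupby.skewindata=true, as above, attacks the same problem by
splitting the aggregation across two MapReduce jobs.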


Re: alter table add partition error

2010-06-18 Thread Edward Capriolo
On Fri, Jun 18, 2010 at 1:49 PM, Ning Zhang nzh...@facebook.com wrote:

 Pradeep,

 I ran the commands you provided and it succeeded with the expected
 behavior.

 One possibility is that there are multiple versions of libthrift.jar in
 your CLASSPATH (hadoop & hive). Can you check the Hadoop & Hive CLASSPATHs
 to make sure no other libthrift.jar is there? What is in
 hive_config_without_thrift?

 Thanks,
 Ning
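
A quick way to run Ning's check, as a sketch that assumes standard tarball
layouts under $HADOOP_HOME and $HIVE_HOME:

find $HADOOP_HOME $HIVE_HOME -name 'libthrift*.jar'
# more than one distinct libthrift version in the output is the red flag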

 On Jun 18, 2010, at 10:19 AM, Pradeep Kamath wrote:

  I think there are two separate issues here – I want to open a jira for
 the first one since I am now able to reproduce it even with text format with
 builtin Serdes. Essentially this is a bug in the thrift code (not sure if it
 is in the client or server) since the same alter table statement works fine
 when the hive client does not use thrift. Here are the details:



 *cat create_dummy.sql*

 CREATE external TABLE if not exists dummy (



   partition_name string

   ,partition_id int

 )

 PARTITIONED BY ( datestamp string, srcid string, action string, testid
 string )

 row format delimited

 stored as textfile

 location '/user/pradeepk/dummy';



 *hive -f create_dummy.sql*

 10/06/18 10:13:36 WARN conf.Configuration: DEPRECATED: hadoop-site.xml
 found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use
 core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of
 core-default.xml, mapred-default.xml and hdfs-default.xml respectively

 Hive history
 file=/tmp/pradeepk/hive_job_log_pradeepk_201006181013_184583537.txt

 OK

 Time taken: 0.627 seconds

 * *

 *hive -e "ALTER TABLE dummy add partition(datestamp = '20100602', srcid =
 '100', action='view', testid='10') location
 '/user/pradeepk/dummy/20100602/100/view/10';"*

 10/06/18 10:14:11 WARN conf.Configuration: DEPRECATED: hadoop-site.xml
 found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use
 core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of
 core-default.xml, mapred-default.xml and hdfs-default.xml respectively

 Hive history
 file=/tmp/pradeepk/hive_job_log_pradeepk_201006181014_700722546.txt

 *FAILED: Error in metadata: org.apache.thrift.TApplicationException:
 get_partition failed: unknown result*

 *FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask*



 *hive --config hive_conf_without_thrift -e "ALTER TABLE dummy add
 partition(datestamp = '20100602', srcid = '100', action='view', testid='10')
 location '/user/pradeepk/dummy/20100602/100/view/10';"*

 10/06/18 10:14:31 WARN conf.Configuration: DEPRECATED: hadoop-site.xml
 found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use
 core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of
 core-default.xml, mapred-default.xml and hdfs-default.xml respectively

 Hive history
 file=/tmp/pradeepk/hive_job_log_pradeepk_201006181014_598649843.txt

 *OK*

 Time taken: 5.849 seconds



 Is there some thrift setting I am missing or is this a bug? – If it is the
 latter, I can open a jira with the above details.



 Thanks,

 Pradeep




  --

 *From:* Pradeep Kamath [mailto:prade...@yahoo-inc.com]
 *Sent:* Thursday, June 17, 2010 1:25 PM
 *To:* hive-user@hadoop.apache.org
 *Subject:* RE: alter table add partition error



 Here are the create table and alter table statements:

 CREATE external TABLE if not exists mytable (



   bc string

   ,src_spaceid string

   ,srcpvid string

   ,dstpvid string

   ,dst_spaceid string

   ,page_params map<string, string>

   ,clickinfo map<string, string>

   ,viewinfo array<map<string, string>>



 )

 PARTITIONED BY ( datestamp string, srcid string, action string, testid
 string )

 row format serde 'com.yahoo.mySerde'

 stored as inputformat 'org.apache.hadoop.mapred.SequenceFileInputFormat'
 outputformat 'org.apache.hadoop.mapred.SequenceFileOutputFormat'

 location '/user/pradeepk/mytable';



 hive --auxpath ult-serde.jar -e "ALTER TABLE mytable add
 partition(datestamp = '20091101', srcid =
 '19174',action='click',testid='NOTESTID') location
 '/user/pradeepk/mytable/20091101/19174/click/NOTESTID';"



 I get the following error:

 Hive history
 file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt

 *FAILED: Error in metadata: org.apache.thrift.TApplicationException:
 get_partition failed: unknown result*

 *FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.DDLTask*

 If I don’t use thrift and use a hive-site.xml to directly talk to the db,
 the alter table seems to succeed:

 hive --auxpath ult-serde.jar *--config hive_conf_without_thrift* -e "ALTER
 TABLE mytable add partition(datestamp = '20091101', srcid =
 '19174',action='click',testid='NOTESTID') location
 '/user/pradeepk/mytable/20091101/19174/click/NOTESTID';"



 However I get errors when I try to run a query:

 [prade...@chargesize:~/dev]*hive --auxpath ult-serde.jar --config
 hive_conf_without_thrift -e select src_spaceid from

hive.aux.jars.path not used - get a SerDe does not exist error

2010-06-18 Thread Karthik
I have my custom SerDe classes (as jar files) under the /home/hadoop/hive/lib
folder, and I have set the hive.aux.jars.path property in my hive-site.xml
file to this location (value: file:///home/hadoop/hive/lib/).

When I create a table (or query an existing table), I get a "SerDe does not
exist" error, although it works fine if I set the value to the full absolute
path of the file name, like this: file:///home/hadoop/hive/lib/myserde.jar.
But again, my HWI complains with the same error even if I set the full path
name of the JAR.

It also works well if I run hive --auxpath /home/hadoop/hive/lib.

I need to provide multiple JAR files under auxpath. I use Hive 0.5+20. Should
I not rely on hive.aux.jars.path set in hive-site.xml and only use the
--auxpath param?

Please help.

Regards,
Karthik.
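
A recollection worth verifying against the 0.5 scripts: --auxpath accepts a
directory, but the hive.aux.jars.path property wants an explicit
comma-separated list of jar files, which would explain the behavior above.
Multiple jars can be passed either way, e.g. with hypothetical jar names:

hive --auxpath /home/hadoop/hive/lib/myserde.jar,/home/hadoop/hive/lib/otherserde.jar

or in hive-site.xml:

<property>
  <name>hive.aux.jars.path</name>
  <value>file:///home/hadoop/hive/lib/myserde.jar,file:///home/hadoop/hive/lib/otherserde.jar</value>
</property>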


hive-hbase integration client error, please help

2010-06-17 Thread Zhou Shuaifeng
Hi All,

I've got a problem programming a hive-hbase client; could someone help me?


The code is very simple: it selects some data from an hbase-based table.
 
 
import java.sql.*;

public class HiveJdbcClient {

  private static String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";

  public static void main(String[] args) throws SQLException {
    try {
      Class.forName(driverName);
    } catch (ClassNotFoundException e) {
      e.printStackTrace();
      System.exit(1);
    }

    Connection con = DriverManager.getConnection(
        "jdbc:hive://2.1.37.110:1/default", "", "");
    Statement stmt = con.createStatement();
    String tableName = "hive_zsf11";

    String sql = "select * from " + tableName + " where id = 1";
    System.out.println("Running: " + sql);
    ResultSet res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(String.valueOf(res.getInt(1)) + "\t" + res.getString(2));
    }
  }
}

error info is below:
Hive history
file=/tmp/z00100568/hive_job_log_z00100568_201006171106_406425331.txt
10/06/17 11:06:20 INFO exec.HiveHistory: Hive history
file=/tmp/z00100568/hive_job_log_z00100568_201006171106_406425331.txt
Running: select * from hive_zsf11 where id = 1
Exception in thread "main" java.sql.SQLException: Query returned non-zero
code: 9, cause: FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.ExecDriver
 at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:197)
 at com.huawei.hive.HiveJdbcClient.main(HiveJdbcClient.java:69)
 
Before doing this, I built the hive-hbase integration code successfully, and
I can run SQL against the hbase-based table from the shell without problems.
So, what more needs to be done? Thanks a lot.
 
Best Regards,
Zhou

-
This e-mail and its attachments contain confidential information from
HUAWEI, which is intended only for the person or entity whose address is
listed above. Any use of the information contained herein in any way
(including, but not limited to, total or partial disclosure, reproduction,
or dissemination) by persons other than the intended
recipient(s) is prohibited. If you receive this e-mail in error, please
notify the sender by phone or email immediately and delete it!
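
One thing worth checking (an assumption, since the client-side error alone
does not say): a standalone Hive server needs the HBase handler jars on its
own classpath, just like the shell session had. A sketch, with jar versions
that will differ per install:

hive --auxpath $HIVE_HOME/lib/hive_hbase-handler.jar,$HIVE_HOME/lib/hbase-0.20.3.jar,$HIVE_HOME/lib/zookeeper-3.2.2.jar --service hiveserver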





Re: Creating partitions causes Error in semantic analysis

2010-06-17 Thread Ning Zhang
The __HIVE_DEFAULT_PARTITION__ is created by default if the partitioning column 
value (newdatestamp etc.) is NULL or empty string. Below is the wiki page that 
describes the syntax and semantics of dynamic partitioning, including some best 
practices. 

http://wiki.apache.org/hadoop/Hive/Tutorial#Dynamic-partition_Insert

Ning

On Jun 16, 2010, at 11:11 PM, Viraj Bhat wrote:

 Hi Yongqiang,
 I am using the trunk code. I figured out what the problem was:
 INSERT OVERWRITE TABLE newtable
 PARTITION (newdatestamp, myregion, myproperty)
 SELECT
 name,
 age,
 datestamp as newdatestamp,
 region as myregion,
 property as myproperty
 from oldtable where datestamp='20100525';
 
 I need to specify the last 3 columns in the order of partitions, which I did 
 not. 
 
 Meanwhile the dynamic partitioning produced a partition which was named 
 __HIVE_DEFAULT_PARTITION__. Is this created by default? 
 
 Thanks again for your help.
 Viraj
 
 -Original Message-
 From: yongqiang he [mailto:heyongqiang...@gmail.com] 
 Sent: Wednesday, June 16, 2010 5:46 PM
 To: hive-user@hadoop.apache.org
 Subject: Re: Creating partitions causes Error in semantic analysis
 
 Hive supports dynamic partitions (I think you need to use trunk code
 for this feature?).
 
 here is an example:
 
 set hive.exec.dynamic.partition.mode=nonstrict;
 set hive.exec.dynamic.partition=true;
 
 create table if not exists nzhang_part1 like srcpart;
 create table if not exists nzhang_part2 like srcpart;
 describe extended nzhang_part1;
 
 from srcpart
 insert overwrite table nzhang_part1 partition (ds, hr) select key,
 value, ds, hr where ds = '2008-04-08'
 insert overwrite table nzhang_part2 partition(ds='2008-12-31', hr)
 select key, value, hr where ds > '2008-04-08';
 
 On Wed, Jun 16, 2010 at 4:07 PM, Viraj Bhat vi...@yahoo-inc.com wrote:
 Hi all,
 
   I have a table known as oldtable which is partitioned by datestamp.
 
 
 
 The schema of the oldtable is:
 
 
 
 name string
 
 age bigint
 
 property string
 
 region string
 
 datestamp string
 
 
 
 
 
 I now need to create a new table which is based of this old table and
 partitioned by (datestamp, region, property)
 
 
 
 The DDL for the new table looks like:
 
 
 
 CREATE EXTERNAL TABLE newtable
 
 (
 
 newname string,
 
 newage bigint
 
 )
 
 
 
 PARTITIONED BY (newdatestamp STRING, myregion STRING, myproperty STRING)
 
 
 
 STORED AS RCFILE
 
 LOCATION '/user/viraj/rcfile';
 
 
 
 
 
 When I try to populate this new table from my old table, I try to use
 partitioning which uses values of old columns.
 
 
 
 INSERT OVERWRITE TABLE newtable
 
 PARTITION (newdatestamp='20100525', region, property)
 
 SELECT
 
   name,
 
   age
 
 from oldtable where datestamp='20100525';
 
 
 
 The above statement causes an error and expects hardcoded values for region
 and property.
 
 
 
 FAILED: Error in semantic analysis: Partition column in the partition
 specification does not exist.
 
 
 
 How do I specify the partition information such that the new tables, takes
 values from property and region from the old table and uses it as
 partitions.
 
 
 
 Is there a better way to achieve the above instead of hard coding values for
 each and every partition?
 
 
 
 ===
 
 Addendum: If the above is possible, how can I define some conditions where I
 need to say: if region is not 'us' or 'asia', put it in another partition
 known as 'misc'?
 
 ===
 
 
 
 
 
 Thanks Viraj
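
For the addendum, a sketch building on the dynamic-partition insert above:
the partition value can be computed inline, so the us/asia/misc bucketing
becomes an expression in the SELECT list (assuming the trunk dynamic-partition
support discussed in this thread):

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE newtable
PARTITION (newdatestamp, myregion, myproperty)
SELECT
name,
age,
datestamp as newdatestamp,
CASE WHEN region <> 'us' AND region <> 'asia' THEN 'misc' ELSE region END as myregion,
property as myproperty
from oldtable where datestamp='20100525';

A NULL or empty region still lands in __HIVE_DEFAULT_PARTITION__, as Ning
notes above.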



RE: alter table add partition error

2010-06-17 Thread Pradeep Kamath
Sorry - that was a cut-paste error - I don't have the action part - so I
am specifying key-value pairs. Since what I am trying to do seems like a
basic operation, I am wondering if it's something to do with my Serde -
unfortunately the error I see gives me no clue of what could be wrong -
any help would be greatly appreciated!

 

Thanks,

Pradeep

 



From: yq he [mailto:hhh.h...@gmail.com] 
Sent: Wednesday, June 16, 2010 5:54 PM
To: hive-user@hadoop.apache.org
Subject: Re: alter table add partition error

 

Hi Pradeep,

 

partition definitions need to be key-value pairs. The partition key `action`
seems to be missing its value.

 

Thanks

Yongqiang

On Wed, Jun 16, 2010 at 5:22 PM, Pradeep Kamath prade...@yahoo-inc.com
wrote:

Hi,

I am trying to create an external table against already existing
data in sequencefile format. However I have written a custom Serde to
interpret the data. I am able to create the table fine but get the
exception shown in the session output below when I try to add partition
- any help would be greatly appreciated.

 

Thanks,

Pradeep

 

== session output ===

 

[prade...@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER
TABLE mytable add partition(datestamp = '20091101', srcid = '10',action)
location '/user/pradeepk/mytable/20091101/10';"

10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml
found in the classpath. Usage of hadoop-site.xml is deprecated. Instead
use core-site.xml, mapred-site.xml and hdfs-site.xml to override
properties of core-default.xml, mapred-default.xml and hdfs-default.xml
respectively

Hive history
file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt

FAILED: Error in metadata: org.apache.thrift.TApplicationException:
get_partition failed: unknown result

FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask

[prade...@chargesize:~/dev/howl]

 

== session output ===

 

/tmp/pradeepk/hive.log has:

2010-06-16 17:09:00,841 ERROR exec.DDLTask
(SessionState.java:printError(269)) - FAILED: Error in metadata:
org.apache.thrift.TApplicationException: get_partition failed: unknown
result

org.apache.hadoop.hive.ql.metadata.HiveException:
org.apache.thrift.TApplicationException: get_partition failed: unknown
result

at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:778)

at org.apache.hadoop.hive.ql.exec.DDLTask.addPartition(DDLTask.java:231)

at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:150)

at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:107)

at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)

at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:631)

at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:504)

at org.apache.hadoop.hive.ql.Driver.run(Driver.java:382)

at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)

at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)

at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:268)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Caused by: org.apache.thrift.TApplicationException: get_partition
failed: unknown result

at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition(ThriftHiveMetastore.java:931)

at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition(ThriftHiveMetastore.java:899)

at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartition(HiveMetaStoreClient.java:500)

at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:756)

... 15 more

 

The thrift server messages are:

10/06/16 17:09:00 INFO metastore.HiveMetaStore: 22: get_table :
db=default tbl=mytable

10/06/16 17:09:00 INFO metastore.HiveMetaStore: 22: Opening raw store
with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore

10/06/16 17:09:00 INFO metastore.ObjectStore: ObjectStore, initialize
called

10/06/16 17:09:00 INFO metastore.ObjectStore: Initialized ObjectStore

10/06/16 17:09:00 INFO metastore.HiveMetaStore: 22: get_partition :
db=default tbl=mytable

 

 

 



Re: hive-hbase integration client error, please help

2010-06-17 Thread John Sichi
 
 1 column=cf1:val, timestamp=1276762938488, value=zsf 
 2 column=cf1:val, timestamp=1276762938488, value=zw 
 3 column=cf1:val, timestamp=1276762938294, value=zzf 
 4 column=cf1:val, timestamp=1276762938294, value=cjl 
 4 row(s) in 0.0160 seconds 
 hbase(main):003:0>
 
 
 
 
 
 
 
 



RE: alter table add partition error

2010-06-17 Thread Pradeep Kamath
Here are the create table and alter table statements:

CREATE external TABLE if not exists mytable (

 

  bc string

  ,src_spaceid string

  ,srcpvid string

  ,dstpvid string

  ,dst_spaceid string

  ,page_params map<string, string>

  ,clickinfo map<string, string>

  ,viewinfo array<map<string, string>>

 

)

PARTITIONED BY ( datestamp string, srcid string, action string, testid
string )

row format serde 'com.yahoo.mySerde'

stored as inputformat 'org.apache.hadoop.mapred.SequenceFileInputFormat'
outputformat 'org.apache.hadoop.mapred.SequenceFileOutputFormat'

location '/user/pradeepk/mytable';

 

hive --auxpath ult-serde.jar -e "ALTER TABLE mytable add
partition(datestamp = '20091101', srcid =
'19174',action='click',testid='NOTESTID') location
'/user/pradeepk/mytable/20091101/19174/click/NOTESTID';"

 

I get the following error:

Hive history
file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt

FAILED: Error in metadata: org.apache.thrift.TApplicationException:
get_partition failed: unknown result

FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask

If I don't use thrift and use a hive-site.xml to directly talk to the
db, the alter table seems to succeed:

hive --auxpath ult-serde.jar --config hive_conf_without_thrift -e "ALTER
TABLE mytable add partition(datestamp = '20091101', srcid =
'19174',action='click',testid='NOTESTID') location
'/user/pradeepk/mytable/20091101/19174/click/NOTESTID';"

 

However I get errors when I try to run a query:

[prade...@chargesize:~/dev]hive --auxpath ult-serde.jar --config
hive_conf_without_thrift -e "select src_spaceid from
ult_search_austria_ult where datestamp='20091101';"

10/06/17 13:22:34 WARN conf.Configuration: DEPRECATED: hadoop-site.xml
found in the classpath. Usage of hadoop-site.xml is deprecated. Instead
use core-site.xml, mapred-site.xml and hdfs-site.xml to override
properties of core-default.xml, mapred-default.xml and hdfs-default.xml
respectively

Hive history
file=/tmp/pradeepk/hive_job_log_pradeepk_201006171322_1913647383.txt

Total MapReduce jobs = 1

Launching Job 1 out of 1

Number of reduce tasks is set to 0 since there's no reduce operator

java.lang.IllegalArgumentException: Can not create a Path from an empty
string

at org.apache.hadoop.fs.Path.checkPathArg(Path.java:82)

at org.apache.hadoop.fs.Path.<init>(Path.java:90)

at org.apache.hadoop.fs.Path.<init>(Path.java:50)

at org.apache.hadoop.mapred.JobClient.copyRemoteFiles(JobClient.java:523)

at org.apache.hadoop.mapred.JobClient.configureCommandLineOptions(JobClient.java:603)

at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:761)

at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)

at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:684)

at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:107)

at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)

at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:631)

at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:504)

 

Any help is much appreciated.

 

Pradeep

 



From: Ashish Thusoo [mailto:athu...@facebook.com] 
Sent: Thursday, June 17, 2010 11:15 AM
To: hive-user@hadoop.apache.org
Subject: RE: alter table add partition error

 

hmm... Can you send the exact command and also the create table command
for this table.

 

Ashish

 



From: Pradeep Kamath [mailto:prade...@yahoo-inc.com] 
Sent: Thursday, June 17, 2010 9:09 AM
To: hive-user@hadoop.apache.org
Subject: RE: alter table add partition error

Sorry - that was a cut-paste error - I don't have the action part - so I
am specifying key-value pairs. Since what I am trying to do seems like a
basic operation, I am wondering if it's something to do with my Serde -
unfortunately the error I see gives me no clue of what could be wrong -
any help would be greatly appreciated!

 

Thanks,

Pradeep

 



From: yq he [mailto:hhh.h...@gmail.com] 
Sent: Wednesday, June 16, 2010 5:54 PM
To: hive-user@hadoop.apache.org
Subject: Re: alter table add partition error

 

Hi Pradeep,

 

partition definitions need to be key-value pairs. The partition key `action`
seems to be missing its value.

 

Thanks

Yongqiang

On Wed, Jun 16, 2010 at 5:22 PM, Pradeep Kamath prade...@yahoo-inc.com
wrote:

Hi,

I am trying to create an external table against already existing
data in sequencefile format. However I have written a custom Serde to
interpret the data. I am able to create the table fine but get the
exception shown in the session output below when I try to add partition
- any help would be greatly appreciated.

 

Thanks,

Pradeep

 

== session output ===

 

[prade...@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e ALTER
TABLE mytable add partition(datestamp = '20091101

alter table add partition error

2010-06-16 Thread Pradeep Kamath
Hi,

I am trying to create an external table against already existing
data in sequencefile format. However I have written a custom Serde to
interpret the data. I am able to create the table fine but get the
exception shown in the session output below when I try to add partition
- any help would be greatly appreciated.

 

Thanks,

Pradeep

 

== session output ===

 

[prade...@chargesize:~/dev/howl]hive --auxpath ult-serde.jar -e "ALTER
TABLE mytable add partition(datestamp = '20091101', srcid = '10',action)
location '/user/pradeepk/mytable/20091101/10';"

10/06/16 17:08:59 WARN conf.Configuration: DEPRECATED: hadoop-site.xml
found in the classpath. Usage of hadoop-site.xml is deprecated. Instead
use core-site.xml, mapred-site.xml and hdfs-site.xml to override
properties of core-default.xml, mapred-default.xml and hdfs-default.xml
respectively

Hive history
file=/tmp/pradeepk/hive_job_log_pradeepk_201006161709_1934304805.txt

FAILED: Error in metadata: org.apache.thrift.TApplicationException:
get_partition failed: unknown result

FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask

[prade...@chargesize:~/dev/howl]

 

== session output ===

 

/tmp/pradeepk/hive.log has:

2010-06-16 17:09:00,841 ERROR exec.DDLTask
(SessionState.java:printError(269)) - FAILED: Error in metadata:
org.apache.thrift.TApplicationException: get_partition failed: unknown
result

org.apache.hadoop.hive.ql.metadata.HiveException:
org.apache.thrift.TApplicationException: get_partition failed: unknown
result

at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:778)

at org.apache.hadoop.hive.ql.exec.DDLTask.addPartition(DDLTask.java:231)

at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:150)

at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:107)

at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:55)

at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:631)

at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:504)

at org.apache.hadoop.hive.ql.Driver.run(Driver.java:382)

at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:138)

at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:197)

at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:268)

at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)

at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)

at java.lang.reflect.Method.invoke(Method.java:597)

at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Caused by: org.apache.thrift.TApplicationException: get_partition
failed: unknown result

at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partition(ThriftHiveMetastore.java:931)

at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partition(ThriftHiveMetastore.java:899)

at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getPartition(HiveMetaStoreClient.java:500)

at org.apache.hadoop.hive.ql.metadata.Hive.getPartition(Hive.java:756)

... 15 more

 

The thrift server messages are:

10/06/16 17:09:00 INFO metastore.HiveMetaStore: 22: get_table :
db=default tbl=mytable

10/06/16 17:09:00 INFO metastore.HiveMetaStore: 22: Opening raw store
with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore

10/06/16 17:09:00 INFO metastore.ObjectStore: ObjectStore, initialize
called

10/06/16 17:09:00 INFO metastore.ObjectStore: Initialized ObjectStore

10/06/16 17:09:00 INFO metastore.HiveMetaStore: 22: get_partition :
db=default tbl=mytable

 

 



Re: Hive Web Interface Error

2010-06-08 Thread Edward Capriolo
On Tue, Jun 8, 2010 at 1:56 PM, Karthik karthik_...@yahoo.com wrote:

 I'm using Hive 4.0 from CDH2, and I get the error below when I click on the
 Create Session link and provide a value for session name and hit the
 submit query button:

 Unexpected  while processing |-S|-h|-e|-f
 log4j:WARN No appenders could be found for logger (org.mortbay.log).
 log4j:WARN Please initialize the log4j system properly.

 This exception is printed on the server side (Jetty) logs and the page
 (browser) hangs for ever trying something.  Any quick solution?

 Regards,
 Karthik.


That is erroneous output which should be removed even though it does not
cause a problem.

Question 1? Are you using a JDBC metastore?

http://wiki.apache.org/hadoop/HiveDerbyServerMode

If you are not, you can only have one Hive session open at once, and the CLI
will probably lock out the web interface.

Any quick solution? Hive 4.0 is an old release. I have not been tracking CDH,
but I bet they offer a Hive 5.0 release. Update to that, take a non-CDH
release, or build your own Hive from trunk.

Edward
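
For reference, the Derby server mode that page describes comes down to two
hive-site.xml properties (host and port are whatever your Derby network
server uses):

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:derby://myhost:1527/metastore_db;create=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>org.apache.derby.jdbc.ClientDriver</value>
</property>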


Re: Hive Web Interface Error

2010-06-08 Thread Vinithra Varadharajan
CDH3 beta 1 does have Hive 5.0:
http://www.cloudera.com/blog/2010/03/cdh3-beta1-now-available/

-Vinithra

On Tue, Jun 8, 2010 at 12:10 PM, Edward Capriolo edlinuxg...@gmail.comwrote:



 On Tue, Jun 8, 2010 at 1:56 PM, Karthik karthik_...@yahoo.com wrote:

 I'm using Hive 4.0 from CDH2 and I get this below error when I click on
 the Create Session link and provide a value for session name and hit the
 submit query button:

 Unexpected  while processing |-S|-h|-e|-f
 log4j:WARN No appenders could be found for logger (org.mortbay.log).
 log4j:WARN Please initialize the log4j system properly.

 This exception is printed on the server side (Jetty) logs and the page
 (browser) hangs for ever trying something.  Any quick solution?

 Regards,
 Karthik.


 That is erroneous output which should be removed even though it does not
 cause a problem.

 Question 1? Are you using a JDBC metastore?

 http://wiki.apache.org/hadoop/HiveDerbyServerMode

 If you are not you can only have one hive session opened at once and the
 CLI will probably lock out the web interface.

 Any quick solution? hive 4.0 is an old release. I have not been tracking
 CDH but I bet they offer a hive 5.0 release. Update to that, take a non CDH
 release. or build your own hive from trunk.

 Edward



Re: Error while using Hive JDBC to execute a create temporary UDF

2010-06-03 Thread Ryan LeCompte
Just closing the loop here. Turns out it was a bug on our side... We had
single quotes in the initial ADD JAR command, which was causing the
subsequent create temporary function call to fail. Removed the quotes and
now we're all set.

Thanks,
Ryan
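
Spelled out as a sketch (a reconstruction, since the original statements were
not posted): the path given to ADD JAR is taken bare, while the class name in
CREATE TEMPORARY FUNCTION is a quoted string literal:

ADD JAR /local/path/udfs.jar;
CREATE TEMPORARY FUNCTION my_func AS 'com.example.hive.udf.MyFunc';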


On Wed, Jun 2, 2010 at 4:44 PM, Ryan LeCompte lecom...@gmail.com wrote:

 Hi Vinithra,

 Yes, we registered the UDF with the ADD JAR command... and also the
 commands work fine in the CLI, it's only through the JDBC interface that it
 doesn't work.

 Could someone try executing one of these commands via JDBC to see if it's a
 legitimate bug in 0.5?

 Thanks!

 Ryan



 On Wed, Jun 2, 2010 at 3:34 PM, Vinithra Varadharajan 
 vinit...@cloudera.com wrote:

 Hi Ryan,

 Did you register the UDF with ADD JAR udf.jar? Note that you currently
 cannot register a jar that is on HDFS (HIVE-1157).

 In case you've already done the above, have you tried the query from the
 Hive CLI - do you get the same error?

 Also, could you attach the detailed logs from /tmp/username/hive?

 -Vinithra


 On Wed, Jun 2, 2010 at 11:37 AM, Ryan LeCompte lecom...@gmail.comwrote:

 Hey guys,

 We have a very simple JDBC client that uses the Hive JDBC driver to
 execute queries. We are trying to use it to execute a simple create
 temporary function ... statement, but Hive is throwing the following error:

 Exception in thread "main" java.sql.SQLException: Query returned non-zero
 code: 9, cause: FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.FunctionTask
 at
 org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:173)
 at
 org.apache.hadoop.hive.jdbc.HiveStatement.execute(HiveStatement.java:115)
 at HiveJdbcClient.main(HiveJdbcClient.java:31)

 A simple ADD FILE ... works just fine, as well as other queries. This
 is using Hive 0.5.

 Thanks,
 Ryan






Error while using Hive JDBC to execute a create temporary UDF

2010-06-02 Thread Ryan LeCompte
Hey guys,

We have a very simple JDBC client that uses the Hive JDBC driver to execute
queries. We are trying to use it to execute a simple create temporary
function ... statement, but Hive is throwing the following error:

Exception in thread "main" java.sql.SQLException: Query returned non-zero
code: 9, cause: FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.FunctionTask
at
org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:173)
at org.apache.hadoop.hive.jdbc.HiveStatement.execute(HiveStatement.java:115)
at HiveJdbcClient.main(HiveJdbcClient.java:31)

A simple ADD FILE ... works just fine, as well as other queries. This is
using Hive 0.5.

Thanks,
Ryan


Re: Percentile UDAF throws error on empty data?

2010-05-28 Thread John Sichi
Mayank, do you want to take a look at this one?

JVS

On May 27, 2010, at 8:29 PM, Dilip Joseph wrote:

 I am getting an exception when using the Percentile UDAF on an empty data set 
 (details below)?  Has anyone seen/solved this before?
 
 Thanks
 Dilip
 
 1. Create the following table and load it with 4 rows: 10, 20, 30, 40
 CREATE TABLE pct_test ( 
 val INT
 );
 
 2. SELECT PERCENTILE(val, 0.5) FROM pct_test;   works fine
 
 3. SELECT PERCENTILE(val, 0.5) FROM pct_test WHERE val > 100; fails with the
 following exception.
 
 java.lang.RuntimeException: Hive Runtime Error while closing operators
   at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:347)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
 
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
   at org.apache.hadoop.mapred.Child.main(Child.java:170)
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: 
 org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute method 
 public boolean 
 org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator.iterate(org.apache.hadoop.io.LongWritable,double)
   on object 
 org.apache.hadoop.hive.ql.udf.udafpercentile$percentilelongevalua...@ded0f0 
 of class org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator 
 with arguments {null, null} of size 2
 
   at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:897)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:539)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
 
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
   at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
 
   at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:324)
   ... 4 more
 Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to 
 execute method public boolean 
 org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator.iterate(org.apache.hadoop.io.LongWritable,double)
   on object 
 org.apache.hadoop.hive.ql.udf.udafpercentile$percentilelongevalua...@ded0f0 
 of class org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator 
 with arguments {null, null} of size 2
 
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:725)
   at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge$GenericUDAFBridgeEvaluator.iterate(GenericUDAFBridge.java:169)
   at 
 org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:139)
 
   at 
 org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:865)
   ... 11 more
 Caused by: java.lang.IllegalArgumentException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
 
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
   at java.lang.reflect.Method.invoke(Method.java:597)
   at 
 org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:701)
 
   ... 14 more
 
 In this simple example, percentile may not be well defined as there are no 
 rows to operate on.   However, the problem is that the exception also occurs 
 in larger data sets where a few of the multiple maps involved may output 0 
 rows.
 
 



Percentile UDAF throws error on empty data?

2010-05-27 Thread Dilip Joseph
I am getting an exception when using the Percentile UDAF on an empty data
set (details below). Has anyone seen/solved this before?

Thanks
Dilip

1. Create the following table and load it with 4 rows: 10, 20, 30, 40
CREATE TABLE pct_test (
val INT
);

2. SELECT PERCENTILE(val, 0.5) FROM pct_test;   works fine

3. SELECT PERCENTILE(val, 0.5) FROM pct_test WHERE val > 100; fails with the
following exception.

java.lang.RuntimeException: Hive Runtime Error while closing operators
at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:347)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute
method public boolean
org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator.iterate(org.apache.hadoop.io.LongWritable,double)
 on object 
org.apache.hadoop.hive.ql.udf.udafpercentile$percentilelongevalua...@ded0f0
of class org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator
with arguments {null, null} of size 2
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:897)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:539)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:324)
... 4 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to
execute method public boolean
org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator.iterate(org.apache.hadoop.io.LongWritable,double)
 on object 
org.apache.hadoop.hive.ql.udf.udafpercentile$percentilelongevalua...@ded0f0
of class org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator
with arguments {null, null} of size 2
at 
org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:725)
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge$GenericUDAFBridgeEvaluator.iterate(GenericUDAFBridge.java:169)
at 
org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:139)
at 
org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:865)
... 11 more
Caused by: java.lang.IllegalArgumentException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:701)
... 14 more


In this simple example, percentile may not be well defined as there are no
rows to operate on.   However, the problem is that the exception also occurs
in larger data sets where a few of the multiple maps involved may output 0
rows.


Error when creating custom UDFs

2010-05-17 Thread Edmar Ferreira
Hi,

I am trying to create a UDF as described in this tutorial:

http://wiki.apache.org/hadoop/Hive/HivePlugins


But when I use this command :

create temporary function my_lower as 'com.example.hive.udf.Lower';

I see this error :


FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.FunctionTask

Thanks,


RE: Error when creating custom UDFs

2010-05-17 Thread Ashish Thusoo
Can you send the contents of /tmp/username/hive.log file? That should show a 
stack dump for this error.

Ashish


From: Edmar Ferreira [mailto:edmaroliveiraferre...@gmail.com]
Sent: Monday, May 17, 2010 11:37 AM
To: hive-user@hadoop.apache.org
Subject: Error when creating custom UDFs

Hi,

I am trying to create a UDF as described in this tutorial:

http://wiki.apache.org/hadoop/Hive/HivePlugins


But when I use this command :


create temporary function my_lower as 'com.example.hive.udf.Lower';

I see this error :


FAILED: Execution Error, return code 1 from 
org.apache.hadoop.hive.ql.exec.FunctionTask

Thanks,


Re: Error when creating custom UDFs

2010-05-17 Thread Dilip Joseph
Did you add the jar containing the class com.example.hive.udf.Lower using
ADD JAR?  This error definitely occurs when the appropriate jar is not on
the path, or when there is a typo in the class name.  There are probably other
causes as well.

Dilip
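
A minimal end-to-end sketch of that checklist, with a hypothetical jar path
and placeholder table/column names:

ADD JAR /home/me/jars/my_udf.jar;
CREATE TEMPORARY FUNCTION my_lower AS 'com.example.hive.udf.Lower';
SELECT my_lower(name) FROM mytable LIMIT 10;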

On Mon, May 17, 2010 at 1:03 PM, Edmar Ferreira 
edmaroliveiraferre...@gmail.com wrote:

 Seems that every time I execute:

 create temporary function my_lower as 'com.example.hive.udf.Lower';

 One line with this error is added to the log file; my last lines are:

 2010-05-17 14:55:28,500 ERROR ql.Driver (SessionState.java:printError(279))
 - FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.FunctionTask
 2010-05-17 15:07:42,128 ERROR ql.Driver (SessionState.java:printError(279))
 - FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.FunctionTask
 2010-05-17 15:17:55,815 ERROR ql.Driver (SessionState.java:printError(279))
 - FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.FunctionTask


 On Mon, May 17, 2010 at 3:48 PM, Ashish Thusoo athu...@facebook.comwrote:

  Can you send the contents of /tmp/username/hive.log file? That should
 show a stack dump for this error.

 Ashish

  --
 *From:* Edmar Ferreira [mailto:edmaroliveiraferre...@gmail.com]
 *Sent:* Monday, May 17, 2010 11:37 AM
 *To:* hive-user@hadoop.apache.org
 *Subject:* Error when creating custom UDFs

 Hi,

  I am trying to create a UDF as described in this tutorial:

 http://wiki.apache.org/hadoop/Hive/HivePlugins


 But when I use this command :

 create temporary function my_lower as 'com.example.hive.udf.Lower';

 I see this error :


 FAILED: Execution Error, return code 1 from
 org.apache.hadoop.hive.ql.exec.FunctionTask

 Thanks,




