Alter location of database in Hive

2014-06-30 Thread Jon Bender
Hey all,

I'm on Hive 0.10.0 on one of my clusters.  We had a namenode hostname
change, so I'm trying to point all of our tables, partitions and databases
to the new locations.

When I run DESCRIBE DATABASE mydb, the location shows up as
hdfs://old_hostname/user/hive/warehouse/mydb.db, and I want to set it
to hdfs://new_hostname/user/hive/warehouse/mydb.db.

Is there a way to do this?  Or do I need to go poking around in the MySQL
metastore to actually carry this out?

Regards,
Jon


Re: Alter location of database in Hive

2014-06-30 Thread Jon Bender
Answered my own question: no, there is not.  The way to do it is to modify
the DB_LOCATION_URI field in metastore.DBS (at least if you're using MySQL).
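
For anyone who hits this later, a sketch of the kind of update involved
(stock MySQL metastore schema assumed; back up the metastore database first,
and note that table and partition locations live in SDS.LOCATION and need
the same treatment):

-- Hedged sketch: point database locations at the new NameNode host.
UPDATE DBS
   SET DB_LOCATION_URI = REPLACE(DB_LOCATION_URI,
                                 'hdfs://old_hostname', 'hdfs://new_hostname');

-- Tables and partitions reference their locations through SDS:
UPDATE SDS
   SET LOCATION = REPLACE(LOCATION,
                          'hdfs://old_hostname', 'hdfs://new_hostname');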


On Mon, Jun 30, 2014 at 5:14 PM, Jon Bender jonathan.ben...@gmail.com
wrote:

 Hey all,

 I'm on Hive 0.10.0 on one of my clusters.  We had a namenode hostname
 change, so I'm trying to point all of our tables, partitions and databases
 to the new locations.

 When I run DESCRIBE DATABASE mydb, the location shows up as
 hdfs://old_hostname/user/hive/warehouse/mydb.db, and I want to set it
 to hdfs://new_hostname/user/hive/warehouse/mydb.db.

 Is there a way to do this?  Or do I need to go poking around in the MySQL
 metastore to actually carry this out?

 Regards,
 Jon



Passing mapreduce configuration parameters to hive udf

2013-08-13 Thread Jon Bender
Hi there,

I'm trying to pass some external properties to a UDF.  In the MapReduce
world I'm used to extending Configured in my classes, but when my UDF
class initializes a new Configuration or HiveConf object, it doesn't
inherit any of those properties.  I can see the properties in the job
configuration XML when the job runs, but my UDF can't pick them up when
it creates a new instance.

Are there any other suggested ways of doing this?  I could probably just
add a conf file to the distributed cache and load the properties on UDF
initialization, but I figured I should be able to get at the configuration
through other means.

Thanks in advance,
Jon
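
For reference, later Hive releases (0.11 onward, if memory serves) pass the
job configuration to UDFs through GenericUDF.configure(MapredContext), which
would avoid the distributed-cache workaround.  A minimal sketch -- the
property name my.custom.prop is made up, and configure() only fires inside
actual MapReduce tasks:

import org.apache.hadoop.hive.ql.exec.MapredContext;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.io.Text;

public class ConfPropUDF extends GenericUDF {
  private String propValue = "unset";

  @Override
  public void configure(MapredContext context) {
    // Hive calls this in the task once the job conf exists; context is
    // null when the UDF is evaluated outside a MapReduce task.
    if (context != null && context.getJobConf() != null) {
      propValue = context.getJobConf().get("my.custom.prop", "unset");
    }
  }

  @Override
  public ObjectInspector initialize(ObjectInspector[] arguments)
      throws UDFArgumentException {
    return PrimitiveObjectInspectorFactory.writableStringObjectInspector;
  }

  @Override
  public Object evaluate(DeferredObject[] arguments) throws HiveException {
    return new Text(propValue);
  }

  @Override
  public String getDisplayString(String[] children) {
    return "conf_prop_udf()";
  }
}

Anything set in the session (set my.custom.prop=foo;) lands in the job
configuration, so the UDF picks the value up when the task starts.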


Re: Single Map task for Hive queries

2011-08-15 Thread Jon Bender
It's actually just an uncompressed UTF-8 text file.

This was essentially the create table clause:
CREATE EXTERNAL TABLE foo
ROW FORMAT DELIMITED
STORED AS TEXTFILE
LOCATION '/data/foo'

Using Hive 0.7.

On Mon, Aug 15, 2011 at 10:37 AM, Loren Siebert lo...@siebert.org wrote:

 Is your external file compressed with GZip or BZip? Those file formats
 aren’t splittable, so they get assigned to one mapper.

 On Aug 15, 2011, at 10:23 AM, Jon Bender wrote:

  Hello,
 
  I have an external table in Hive stored in a single flat text file.  When I
 execute queries against it, all of my jobs are run as a single map task,
 even on very large tables.
 
  What steps do I need to take to ensure that these queries are split up
 and pushed out to multiple TTs?  Do I need to store the Hive tables in a
 different internal file format?  Make some configuration changes?
 
  Thanks!
  Jon




Re: Single Map task for Hive queries

2011-08-15 Thread Jon Bender
Yeah, MapReduce itself is set up to use all of my task trackers--only one
map task gets created on the external table queries.

I tried querying another external table (composed of some 20 files) and it
created 20 map tasks in turn during the query.  I will try the LINES
TERMINATED BY clause next to try to parallelize within a single file.
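
Though thinking about it, plain uncompressed text is already splittable on
newlines, so the row terminator probably isn't the issue; the split count is
governed by the input format and split-size settings.  A sketch of the
relevant knobs from this era (the sizes are made-up examples, and which
settings bite depends on the input format in use):

-- Stop CombineHiveInputFormat from merging everything into one split:
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
-- Cap the split size so one large file yields many splits (128 MB here):
set mapred.max.split.size=134217728;
set mapred.min.split.size=1;
-- A hint (not a guarantee) for the number of map tasks:
set mapred.map.tasks=20;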

On Mon, Aug 15, 2011 at 11:00 AM, Loren Siebert lo...@siebert.org wrote:

 You should not have to do anything special to Hive to make it use all of
 your TT’s. The actual MR job should be governed by your mapred-site.xml
 file.

 When you run sample MR jobs (like the Pi example) and look at the job
 tracker, are you seeing all your TT’s getting used?

 On Aug 15, 2011, at 10:47 AM, Jon Bender wrote:

 It's actually just an uncompressed UTF-8 text file.

 This was essentially the create table clause:
 CREATE EXTERNAL TABLE foo
 ROW FORMAT DELIMITED
 STORED AS TEXTFILE
 LOCATION '/data/foo'

 Using Hive 0.7.

 On Mon, Aug 15, 2011 at 10:37 AM, Loren Siebert lo...@siebert.org wrote:

 Is your external file compressed with GZip or BZip? Those file formats
 aren’t splittable, so they get assigned to one mapper.

 On Aug 15, 2011, at 10:23 AM, Jon Bender wrote:

  Hello,
 
  I have an external table in Hive stored in a single flat text file.  When
 I execute queries against it, all of my jobs are run as a single map task,
 even on very large tables.
 
  What steps do I need to take to ensure that these queries are split up
 and pushed out to multiple TTs?  Do I need to store the Hive tables in a
 different internal file format?  Make some configuration changes?
 
  Thanks!
  Jon






Rename Hive partition

2011-06-02 Thread Jon Bender
Hey all,

Just wondering what the best way is to rename specific Hive table
partitions.  Is there some HiveQL command for this, or will I need to insert
into new partitions to reflect the new naming convention?

Cheers,
Jon
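
For reference, later Hive releases (0.9 onward, I believe) added a dedicated
statement for this; on earlier versions, rewriting the data into a new
partition is the fallback.  A sketch against a hypothetical table foo
partitioned by dt, with made-up column names:

-- Hive 0.9+ (I believe): rename the partition in place.
ALTER TABLE foo PARTITION (dt='2011-06-01')
  RENAME TO PARTITION (dt='20110601');

-- Older releases: rewrite into a partition with the new name, then drop
-- the old one.
INSERT OVERWRITE TABLE foo PARTITION (dt='20110601')
SELECT col1, col2 FROM foo WHERE dt='2011-06-01';
ALTER TABLE foo DROP PARTITION (dt='2011-06-01');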