This is quite difficult to do in Hive on Hadoop. Hive over Hadoop does
not support row-level updates, so basically you are reduced to periodically
merging the raw stream of updates with the main table and generating a new
snapshot of the table. Another possible approach could be to use
Hi,
What does setting the serialization.last.column.takes.rest SERDEPROPERTIES do
for the LazySimpleSerDe?
http://hive.apache.org/docs/r0.6.0/api/org/apache/hadoop/hive/serde2/lazy/LazySimpleSerDe.SerDeParameters.html#isLastColumnTakesRest()
I came across that in considering a blob table
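For what it's worth, my reading of the LazySimpleSerDe docs is that when serialization.last.column.takes.rest is true, the last declared column receives the remainder of the row, field delimiters included, which is exactly the blob-table case. A rough Python illustration of that splitting behaviour (not Hive source; '\001' is Hive's default field delimiter):

```python
# Illustration (not Hive source) of what
# serialization.last.column.takes.rest='true' does in LazySimpleSerDe:
# the last declared column swallows the rest of the line, separators
# and all, instead of being cut at the next field delimiter.

def parse_row(line, num_columns, sep='\001', last_takes_rest=False):
    if last_takes_rest:
        # split at most num_columns - 1 times; remainder -> last column
        return line.split(sep, num_columns - 1)
    return line.split(sep)[:num_columns]

line = 'a\001b\001c\001d'
parse_row(line, 3)                        # ['a', 'b', 'c']
parse_row(line, 3, last_takes_rest=True)  # ['a', 'b', 'c\001d']
```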
Hi,
I'm trying this use case: do a simple select from an existing table
and pass the results through a reduce script to do some analysis. The
table has web logs so the select uses a pseudo user ID as the key and
the rest of the data as values. My expectation is that a single reduce
script should
An update on this.
I've finished doing changes in Oozie Hive-action to work with Hive 0.7.
As mentioned before the problem is that not all needed Hive dependent JARs
are available in public Maven repos.
Early next week the Cloudera Maven repositories should have beta versions of
these JARs
Is there a way to use Hibernate to work with Hive ONLY for select
queries?
Amlan
Thanks for the reply.. (I'm new to Hive).
I can't find the driver class. Do you know which files I should be
looking for?
Regards
Stuart
By the sound of the error, it sounds like you don't have HiveDriver
in your path.
Can you locate the class that supposedly has the HiveDriver class?
$HIVE_HOME/lib/ will contain the jar hive-jdbc-0.6.0.jar
On Thu, Feb 17, 2011 at 12:06 PM, Stuart Scott stuart.sc...@e-mis.com wrote:
Thanks for the reply.. (I’m new to Hive).
I can’t find the driver class. Do you know which files I should be looking
for?
Regards
Stuart
by the sound
Look under
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Create_Table
On Thu, Feb 17, 2011 at 12:00 PM, Mapred Learn mapred.le...@gmail.com wrote:
Hi,
I was wondering if Hive supports the Sequence File format. If yes, could you
point me to some documentation about how to use Seq files in
Thanks Ted !
Just found it few minutes ago.
On Feb 17, 2011, at 1:46 PM, Ted Yu yuzhih...@gmail.com wrote:
Look under http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Create_Table
On Thu, Feb 17, 2011 at 12:00 PM, Mapred Learn mapred.le...@gmail.com wrote:
Hi,
I was wondering if hive
I have a requirement to support data from the SequenceFile KEY (not the
VALUE) being used by a Hive table. How can I do this? From the code, it looks
like only the VALUE part is available to Hive. Please help.
Regards.
From: Mapred Learn mapred.le...@gmail.com
Hi Mark,
You can use JDBC driver provided by Amazon Elastic MapReduce service. When you
use that driver with SQL Squirrel it returns column names. Here are the docs
on how to get that driver:
Hi,
I am trying to perform a union of two tables which have identical
schemas and distinct data. There are two tables, 'oldtable' and 'newtable'.
The old table contains the information of old users and the new table will
contain the information of new users. I am trying to update the new entry
Hello,
When we do a left outer join and the right table does not have a matching
row, it will return NULLs for those values.
Is there any way to turn those NULLs into 0's? Since it is a counting
operation, if the right table does not have the row, it means 0's, not
NULLs.
best regards,
-c.b.
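Since Hive supports COALESCE among its conditional functions, wrapping the right table's column should turn those NULLs into 0's. A quick sketch of the idea using Python's sqlite3, whose LEFT OUTER JOIN semantics match here; the table and column names are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE items (id INTEGER)")
c.execute("CREATE TABLE hits (id INTEGER, cnt INTEGER)")
c.executemany("INSERT INTO items VALUES (?)", [(1,), (2,)])
c.execute("INSERT INTO hits VALUES (1, 5)")  # no row for item 2

# COALESCE replaces the NULL produced by the unmatched left-join row.
rows = c.execute("""
    SELECT items.id, COALESCE(hits.cnt, 0)
    FROM items LEFT OUTER JOIN hits ON items.id = hits.id
    ORDER BY items.id
""").fetchall()
# rows == [(1, 5), (2, 0)]
```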
Hive 0.4.1 does not support UNION, only UNION ALL.
On 2011-2-18 at 3:12 PM, sangeetha s sangee@gmail.com wrote:
Hi,
I am trying to perform a union of two tables which have identical
schemas and distinct data. There are two tables, 'oldtable' and 'newtable'.
The old table contains the
When we try to join two large tables some of the reducers stop with an
OutOfMemory exception.
Error: java.lang.OutOfMemoryError: Java heap space
at
org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.shuffleInMemory(ReduceTask.java:1508)
at
Hello,
I was wondering if anyone managed to unit test Hive scripts and share
his/her experience? My first thought was to prepare sample data, run hive
scripts in order to generate output and then compare the generated output
with the expected output. Sounds fairly simple but it may be a bit
Hey Peter, this looks like it ought to work for me, but the link to the
Hive 0.5 drivers seems broken.
http://buyitnw.appspot.com/aws.amazon.com/developertools/Elastic-MapReduce/0196055244487017
seems to be the link from the site you mention below, but it returns a blank
page?
Hi Mark,
Try this link for Hive .5 JDBC driver:
http://aws.amazon.com/developertools/Elastic-MapReduce/0196055244487017
We are actually in Seattle office.
Best Regards,
Peter-
From: Sunderlin, Mark [mailto:mark.sunder...@teamaol.com]
Sent: Friday, February 18, 2011 6:26 AM
To:
On Fri, Feb 18, 2011 at 6:58 AM, Radek Maciaszek
radek.macias...@gmail.com wrote:
Hello,
I was wondering if anyone managed to unit test Hive scripts and share
his/her experience? My first thought was to prepare sample data, run hive
scripts in order to generate output and then compare the
Hi Radek,
I'm actually in the process of running the map-join unit tests against
EMR as we speak. It's possible but dog slow :)
Thanks,
Kirk
On 2/18/11 11:09 AM, Edward Capriolo wrote:
On Fri, Feb 18, 2011 at 6:58 AM, Radek Maciaszek
radek.macias...@gmail.com wrote:
Hello,
I was wondering
On Fri, Feb 18, 2011 at 3:47 PM, Viral Bajaria viral.baja...@gmail.com wrote:
Hi,
I have a question regarding the existing date functions in Hive
(http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF#Date_Functions)
The unix_timestamp() functions return a bigint while the from_unixtime()
I ran into some problems with this; maybe you can help me out.
I have aux jars, and in them I have a custom writable object.
I put my jars in auxlib. Using hive interactive mode it works perfectly, but
using TOAD for Hive, the jobs fail. Looking in the jobtracker I see that my
custom writable class
Thanks Mafish.
Can you please point me which config need to be set correctly?
Amlan
On Mon, Feb 21, 2011 at 12:45 PM, Mafish Liu maf...@gmail.com wrote:
It seems you did not configure your HDFS properly.
Caused by: java.lang.IllegalArgumentException: Wrong FS:
hdfs://
Please have the host-name and IP address mapping in the /etc/hosts file on
both nodes that are running the hadoop cluster.
One more thing: I hope the secondary namenode is also running alongside the
namenode, but you may have forgotten to mention it.
Thanks,
MIS
On Mon, Feb 21, 2011 at 12:47 PM, Amlan Mandal
Hello,
So I have a table of item views with item_sid, ip_number, session_id.
I know it will not be exact, but I want to get unique views per
item, and I will accept the (ip_number, session_id) tuple as a unique view.
When I want to query just item hits I say: select item_sid, count(*)
from
I think I found a lead,
The following code is taken from the hiveserver.sh
if [ $minor_ver -lt 20 ]; then
exec $HADOOP jar $AUX_JARS_CMD_LINE $JAR $CLASS $HIVE_PORT $@
else
# hadoop 20 or newer - skip the aux_jars option and hiveconf
exec $HADOOP jar $JAR $CLASS $HIVE_PORT $@
On 2011-2-21 at 10:54 PM, Bejoy Ks bejoy...@yahoo.com wrote:
Hi Experts
I'm using Hive for a few projects and I have found it a great tool in
Hadoop for processing end-to-end structured data. Unfortunately I'm facing a few
challenges here, as follows:
Availability of database/schemas in Hive
I'm having
You can group by item_sid (drop session_id and ip_number from group by
clause) and then join with the parent table to get session_id and
ip_number.
-Ajo
On Mon, Feb 21, 2011 at 3:07 AM, Cam Bazz camb...@gmail.com wrote:
Hello,
So I have table of item views with item_sid, ip_number,
On using SQL IN ... what would happen if you created a short table with the
entries in the IN clause and used an inner join?
-Ajo
On Mon, Feb 21, 2011 at 7:57 AM, Bejoy Ks bejoy...@yahoo.com wrote:
Thanks Jov for the quick response
Could you please let me know which is the latest stable
Hello,
I did not understand this:
when I do a:
select item_sid, count(*) from item_raw group by item_sid
i get hits per item.
how do we join this to the master table?
best regards,
-c.b.
On Mon, Feb 21, 2011 at 6:28 PM, Ajo Fod ajo@gmail.com wrote:
You can group by item_sid (drop
Oh, I think I see what you are getting at .. basically you are getting
duplicate item_sids because they represent different views.
... try this:
select item_sid, ip_number,
session_id, count(*) from item_raw group by item_sid, ip_number,
session_id;
On Mon, Feb 21, 2011 at 11:54 AM, Cam Bazz
Does anyone have a way of generating the create table statement for a table
that is in Hive? I see a jira for this
https://issues.apache.org/jira/browse/HIVE-967 and it appears that Ed Capriolo
might have a solution for this. Ed, are you able to share this solution?
My goal is to copy a
On Mon, Feb 21, 2011 at 6:42 PM, Jay Ramadorai
jramado...@tripadvisor.com wrote:
Does anyone have a way of generating the create table statement for a table
that is in Hive? I see a jira for
this https://issues.apache.org/jira/browse/HIVE-967 and it appears that Ed
Capriolo might have a
Ya,What Jeff said is correct.
You should not name different ip's in a common name. Map the Ip's and host
name correctly and try again.
Cheers!
On Mon, Feb 21, 2011 at 7:43 PM, Jeff Bean jwfb...@cloudera.com wrote:
One thing i notice is that /etc/hosts is different on each host:
amlan-laptop is
The query you have given produced multiple item_sid's.
This is rather what I have done:
select u.item_sid, count(*) cc from (select distinct item_sid,
ip_number, session_id from item_raw where date_day='20110202') u group
by u.item_sid
date_day is a partition
and this produced the results i wanted,
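The distinct-subquery trick above can be sanity-checked on a toy dataset; standard SQL gives the same answer, so here is a sqlite3 sketch with invented rows (the partition predicate is dropped since it does not affect the shape of the query):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE item_raw (item_sid, ip_number, session_id)")
c.executemany("INSERT INTO item_raw VALUES (?,?,?)", [
    (1, "10.0.0.1", "s1"),
    (1, "10.0.0.1", "s1"),   # repeat view: same ip + session, not unique
    (1, "10.0.0.2", "s2"),
    (2, "10.0.0.1", "s1"),
])
# Inner DISTINCT collapses repeat views; outer GROUP BY counts per item.
rows = c.execute("""
    SELECT u.item_sid, COUNT(*) AS cc
    FROM (SELECT DISTINCT item_sid, ip_number, session_id
          FROM item_raw) u
    GROUP BY u.item_sid
    ORDER BY u.item_sid
""").fetchall()
# item 1 has two unique (ip, session) pairs, item 2 has one
```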
I would like to implement the moving average as a UDF (instead of a
streaming reducer). Here is what I am thinking. Please let me know if I am
missing something here:
SELECT product, date, mavg(product, price, 10)
FROM (
SELECT *
FROM prices
DISTRIBUTE BY product
SORT BY product, date
)
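If the DISTRIBUTE BY / SORT BY subquery guarantees the UDF sees each product's rows contiguously and in date order, the UDF itself only needs a sliding window. A sketch of that logic in plain Python (not a Hive UDF, just the algorithm it would have to implement):

```python
from collections import deque

def moving_average(rows, window):
    """rows: (product, date, price) tuples sorted by (product, date),
    i.e. what the DISTRIBUTE BY / SORT BY subquery would feed the UDF.
    Yields (product, date, average of the last `window` prices)."""
    buf = deque(maxlen=window)
    current = None
    for product, date, price in rows:
        if product != current:   # new group: reset the window
            buf.clear()
            current = product
        buf.append(price)
        yield product, date, sum(buf) / len(buf)

rows = [("a", 1, 10.0), ("a", 2, 20.0), ("a", 3, 30.0), ("b", 1, 5.0)]
out = list(moving_average(rows, 2))
# [('a', 1, 10.0), ('a', 2, 15.0), ('a', 3, 25.0), ('b', 1, 5.0)]
```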
Thank you, Ed. Works like a charm after I remove the Hive2rdbms references.
I've uploaded the jar to the JIRA for those who want to use it.
On Feb 22, 2011, at 1:13 PM, Edward Capriolo wrote:
On Tue, Feb 22, 2011 at 1:09 PM, Jay Ramadorai
jramado...@tripadvisor.com wrote:
Thank you, Ed.
Thank you, John.
It's not quite clear from the page whether my solution:
1. makes sense
2. works now
3. will work in the future if the issue is resolved/implemented
Could you elaborate?
Also, there is no mention of UDF object sharing (between mappers) in the
current implementation. Is this a
Hello,
Here are the table descriptions. They only have the identifier, hits,
uniques and date_day, which is the partition:
hive> describe selection_daily_hits;
OK
sel_sid int
hits int
date_day string
hive> describe selection_daily_uniques;
OK
sel_sid int
uniques int
date_day
Hello,
Thank you for your quick responses!
It seems my root mysql user wasn't really 'root'.
GRANT ALL PRIVILEGES ... with a new user got it running. However, I don't
understand it, because my root user has the same privileges as the new one...
but whatever.
Malte
***
This e-mail and attachments contain confidential information from HUAWEI,
which is intended only for the person or entity whose address is listed
above. Any use of the information contained herein in any
You can use unix_timestamp(), do the math, and convert the result back to a
timestamp, something like from_unixtime(unix_timestamp(Arrival) + n). Use the
proper units, though.
Will that not work for you?
On Feb 24, 2011, at 7:57 PM, Bejoy Ks wrote:
Hi Experts
Could some one please help me out
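The from_unixtime(unix_timestamp(Arrival) + n) idiom above works in seconds. The same shape outside Hive, as a sketch (Arrival and n come from the thread; UTC is assumed here, whereas Hive formats in the server timezone):

```python
import calendar
import time

def add_seconds(ts_string, n, fmt="%Y-%m-%d %H:%M:%S"):
    """Same shape as from_unixtime(unix_timestamp(ts) + n):
    parse to epoch seconds, add n seconds, format back."""
    epoch = calendar.timegm(time.strptime(ts_string, fmt))
    return time.strftime(fmt, time.gmtime(epoch + n))

add_seconds("2011-02-24 10:00:00", 90)  # -> "2011-02-24 10:01:30"
```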
I am trying to query against a partitioned Hive table where the input
format of different partitions may be different. I'd like to change
the partition file format, and reading the language manual at
http://wiki.apache.org/hadoop/Hive/LanguageManual, it seems to
indicate that I should be able to
Hi,
I have a Hive query that has a statement like this (sum(itemcount) /
count(item)). I want to specify only two digits of precision (i.e. 53.55). The
result is stored inside of a string, not its own column, so I'd need to set the
precision in the statement. Is this possible?
Thanks,
Aurora
Hacky, but maybe something like
select concat( cast(num as int), '.' , cast(abs(num)*100 as int) % 100) from
(select 1.234 as num from src limit 1) a;
?
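That concat/cast expression can be sanity-checked; note it drops a leading zero in the fractional part (1.05 would come out as "1.5") and truncation can lose a cent to floating point, so a rounded, zero-padded variant is safer. A Python sketch of both (negative inputs are not handled):

```python
def two_digits(num):
    """Mirrors the concat(cast(num as int), '.',
    cast(abs(num)*100 as int) % 100) idea, but rounds instead of
    truncating and zero-pads the fraction, avoiding "1.5" for 1.05
    and off-by-a-cent results from float truncation."""
    return "%d.%02d" % (int(num), round(abs(num) * 100) % 100)

two_digits(2.25)   # "2.25"
two_digits(1.05)   # "1.05" (the unpadded original would give "1.5")
two_digits(53.55)  # "53.55"
```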
-Original Message-
From: Aurora Skarra-Gallagher [mailto:aur...@yahoo-inc.com]
Sent: Thursday, February 24, 2011 11:31 AM
To:
Hi! I'm having some trouble running queries from a Java client against a
remote Thrift Hive server. It's all set up, and quicker queries do run through
fine.
But queries which run longer than about 10 minutes disconnect the client
with a TTransportException: Connection reset exception. The query
Hi Mohit,
The fix for HIVE-1535 did not include a testcase. See the discussion in the
ticket for an explanation of why this was the case.
The steps you outlined in your email seem to indicate that HIVE-1535 was not
actually fixed, or that the problem was reintroduced later. Please file a
JIRA
Probing this further reveals that the connection is reset by the server in
exactly 10 minutes every time.
I'm running Hive 0.6. I do not see anything relevant at
http://wiki.apache.org/hadoop/Hive/AdminManual/Configuration but is there
some configuration property which controls this?
-ayush
On
Did you start the hiveserver service before running the client program?
Cheers, Adarsh
Ayush Gupta wrote:
Probing this further reveals that the connection is reset by the
server in exactly 10 minutes every time.
I'm running Hive 0.6. I do not see anything relevant at
What do the logs of the thrift server say? If it does not give any
relevant information, I would enable DEBUG level logging on the console.
Also a point to remember is the single-threaded nature of the Hive thrift
server (at least up to v0.5).
But looking at the logs is what will be the first
Yes, the hiveserver server was started and running before the client program
was run.
-ayush
On Fri, Feb 25, 2011 at 12:14 PM, Adarsh Sharma adarsh.sha...@orkash.com wrote:
Did you start the hiveserver service before running the client program?
Cheers, Adarsh
Ayush Gupta wrote:
Probing
On Fri, Feb 25, 2011 at 12:17 PM, Viral Bajaria viral.baja...@gmail.com wrote:
What do the logs of the thrift server say? If it does not give any
relevant information, I would enable DEBUG level logging on the console.
the hiveserver is pretty quiet, the connection appears to be terminated
Thanks Carl, I'll check that.
But, surely, I can't be the only one running Hive queries which last more
than 10 minutes over a thrift client! The Hive model is intended to
work with large data sets, and long-running queries should be expected. I
wonder why there is no discussion around
Carl,
Do you think this issue was not there before 0.6 ? We run our thrift servers
for hours and have never faced this issue. I don't think I have restarted
any of my thrift servers for days.
My hive wrapper does have logic to handle timeouts, it reconnects whenever
it sees that the thrift
Hi Viral,
Hive 0.5.0 and 0.6.0 use the same version of libthrift, so the problem is
more likely related to some difference in the way 0.5.0 and 0.6.0
configure/initialize Thrift, or to some other issue related to the way the
Thrift connection is managed on the client or server side (though it
25 Feb 2011
We're interested in using the HiveODBC interface, so I've been trying to build
it. I'm using Hadoop 0.20.2+320, from Cloudera, with Hive 0.7.0CDH3B4.
Initially I was trying this with Hive 0.5.0+32, but we intend to upgrade to
CDH3B4 very soon, so I decided to try with the newer
I tried on the latest trunk (through CLI connecting to Hive Server) and there
is no disconnection after 10 mins for a long query.
@Ayush, is this Java client using JDBC connection? If so the client may have
set a timeout for JDBC queries. I'm suspecting the disconnection is from the
Java
Hi,
I've had a quick look at Toad for Cloud the other day, too.
* One complaint I heard (but have not verified) is that it crashed. I don't
have the details. Anyone seen any crashes?
* The other complaint I heard is that just like it allows easy querying, it
allows the person using it easy
I am trying to implement a simple UDF but that doesn't seem to work for some
reason. It looks like Hive is not able to cast the arguments to the right
type.
select price, mavg(0, price, 2) from prices limit 1;
FAILED: Error in semantic analysis: line 1:14 Wrong Arguments 2: No matching
method for
Hi Otis,
If you have any details regarding crashes we’d be most interested in collecting
more information about what led to the crash. Toad for Cloud forums
http://toadforcloud.com/forumindex.jspa?categoryID=735 would be the best place
to post any such information.
The credentials supplied
I am also getting this error. Any suggestions?
hive: 0.6
hadoop: 0.20.2
On Mon, Jun 7, 2010 at 1:03 AM, Shuja Rehman shujamug...@gmail.com wrote:
Hi all
Thanks for the reply.
I have changed the heap size to 1024, then 512, then even 100 in the
specified file. But I am still getting this
In short, I am trying to make the hbase_handler work with hive-0.6 and
hbase-0.90.1.
I am trying to integrate HBase and Hive. There is pretty good
documentation at http://wiki.apache.org/hadoop/Hive/HBaseIntegration,
but it looks like it has become outdated. The hbase_handler was written
Thanks a lot for Roberto Congiu and wil's help.
The problem has been solved with your assistance.
I think I should read the wiki guide more carefully!
Thank you very much!
Best regards!
2011-03-01
Jianhua Wang
Hi,
I have a hive script [given below] which calls a python script using
transform and for large datasets [ 100M rows ], the reducer is not able to
start the python process and the error message is argument list too long.
The detailed error stack is given below.
The python script takes only 1
I am not super familiar with lists inside a column for Hive, but that might
let you define a table that has a schema of page-type, page-name,
items-displayed, and then query for a count of individual items (
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL and
Hello,
Does anyone have any experience calculating the percentile / percentrank for
each row in a table?
I see that there are built in UDAFs to calculate the percentile, but that
would only return a single value for the entire table.
Essentially, I'm trying to recreate the Excel PercentRank
I know this type of call would give you a subset of the table ... also
I think you can use a group by clause to get it for groups of data.
SELECT PERCENTILE(val, 0.5) FROM pct_test WHERE val < 100;
Couldn't you use this call a few times to get the value for each percentile
value?
I think
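For a per-row percent rank (what the built-in percentile UDAF won't give directly), Excel's PERCENTRANK of a value is essentially the count of values strictly below it divided by n - 1. A Python sketch of that definition:

```python
from bisect import bisect_left

def percent_rank(values):
    """Excel-style PERCENTRANK per row: the rank of each value among
    the others, scaled to [0, 1]. bisect_left on the sorted copy counts
    the values strictly below, so ties share the lower rank."""
    ordered = sorted(values)
    n = len(values)
    return [bisect_left(ordered, v) / (n - 1) for v in values]

percent_rank([10, 20, 30, 40, 50])  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Porting this to Hive would need the full column available per row (e.g. a join against per-group rank counts), which is what makes it awkward as a plain UDAF.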
instead of
using 'python2.6 user_id_output.py hbase'
try something like this:
using 'user_id_output.py'
... and a #! line with the location of the python binary.
I think you can include a parameter too in the call like :
using 'user_id_output.py hbase'
Cheers,
Ajo.
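For reference, a TRANSFORM script of the kind discussed just reads tab-separated rows on stdin and writes tab-separated rows on stdout; a minimal self-contained sketch (the field layout and names are hypothetical, and a real script would also carry the #! line mentioned above):

```python
import io

def run_transform(stream, out):
    """Minimal Hive TRANSFORM-style script body: one tab-separated
    input row per line in, one tab-separated output row per line out."""
    for line in stream:
        fields = line.rstrip("\n").split("\t")
        user_id, payload = fields[0], fields[1:]
        # Hypothetical analysis: emit the user id and its field count.
        out.write("\t".join([user_id, str(len(payload))]) + "\n")

# Would normally run against sys.stdin/sys.stdout; demo with buffers:
src = io.StringIO("u1\ta\tb\nu2\tc\n")
dst = io.StringIO()
run_transform(src, dst)
# dst.getvalue() == "u1\t2\nu2\t1\n"
```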
On Tue, Mar 1, 2011 at
Looks like this is the command line it was executing:
2011-03-01 14:46:13,733 INFO org.apache.hadoop.hive.ql.exec.ScriptOperator:
Executing [/usr/bin/python2.6, user_id_output.py, hbase]
From: Irfan Mohammed [mailto:irfan...@gmail.com]
Sent: Tuesday, March 01, 2011 1:39 PM
To:
Yes. That is the command it is executing, but what I do not understand is why
I am getting "argument list too long" when I am running the same sql with
the same python script with a large dataset.
Thanks.
On Tue, Mar 1, 2011 at 2:53 PM, Steven Wong sw...@netflix.com wrote:
Looks like this is the
Thanks, the query works as expected. I guess the query on the wiki is out of
date.
- Original Message
From: Thiruvel Thirumoolan thiru...@yahoo-inc.com
To: user@hive.apache.org user@hive.apache.org
Sent: Tue, March 1, 2011 3:26:13 AM
Subject: Re: Dynamic partition - support for
[crossposted on the Oozie and Hive aliases as there are threads in both]
I've just posted a pull request (patch) for Oozie that adds support for Hive
actions in Oozie workflows.
IMPORTANT:
* The pull requests have an additional commit, GH-0226, that fixes
groupId/artifacts of Hadoop/Pig/Hive to the
Let us say I have log data that I want to place into Hive, and the log
file itself looks something like this:
Event_time, event_type, event_data_blob
And the blob data looks like
Key1=value1;key2=value2;key3=value3 ... keyn=valuen
This looks like maybe I start like this:
Create table
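Keeping the blob as a single string column and parsing it downstream is one option; a sketch of the key=value;... parse (this assumes '=' never appears inside values, and the key names are just the placeholders from the message):

```python
def parse_blob(blob):
    """Turn 'key1=value1;key2=value2;...' into a dict.
    Skips segments with no '='; trims stray whitespace."""
    result = {}
    for pair in blob.split(";"):
        if "=" in pair:
            key, _, value = pair.partition("=")
            result[key.strip()] = value.strip()
    return result

parse_blob("key1=value1;key2=value2;key3=value3")
# {'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}
```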
We've gotten this error a couple of times too - it is very misleading, not
correct at all. IIRC, I determined the root cause is selecting too many
input files (even though those do NOT get passed as arguments to transform
script). For example, this happened once we had a lot of dynamic
Usually this is caused by not having the mysql jdbc driver on the
classpath (it's not default included in hive).
Just put the mysql jdbc driver in the hive folder under lib/
On 03/02/2011 03:15 PM, Ajo Fod wrote:
I've checked the mysql connection with a separate java file with the
same string.
On Wed, Mar 2, 2011 at 9:27 AM, Sunderlin, Mark
mark.sunder...@teamaol.com wrote:
Let us say I have log data that I want to place into Hive, and
the log file itself looks something like this:
Event_time, event_type, event_data_blob
And the blob data looks like
Refer to this: http://dev.bizo.com/2011/02/columns-in-hive.html
HTH
- Youngwoo
2011/3/2 Sunderlin, Mark mark.sunder...@teamaol.com
Let us say I have log data that I want to place into Hive, and
the log file itself looks something like
Hi Bennie,
Thanks for the response !
I had CLASSPATH set to include
/usr/share/java/mysql.jar
... in addition, I just copied the mysql.jar to the lib directory of hive.
I still get the same bug.
Any other ideas?
Thanks,
-Ajo
On Wed, Mar 2, 2011 at 7:01 AM, Bennie Schut bsc...@ebuddy.com
This definitely looks like a CLASSPATH error.
Where did you get the mysql.jar from ? Can you open it up and make sure that
it includes the com.mysql.jdbc.Driver namespace ?
I am guessing the mysql.jar is not the one that you need. you can download a
new one from the mysql website.
To be clear,
I'm wondering if my configuration/stack is wrong, or if I'm trying to do
something that is not supported in Hive.
My goal is to choose a compression scheme for Hadoop/Hive and while
comparing configurations, I'm finding that I can't get BZip2 or Gzip to work
with the RCfile format.
Is that
The good news is that this is a simple XML section, and this looks like an
XML read error.
Try copy-pasting one of the existing property sections and pasting over
just the name and value strings from the message.
Cheers,
Ajo
On Fri, Mar 4, 2011 at 6:40 AM, Anja Gruenheid
Hi Everyone
I'm facing an issue with Hive on a relatively large query which involves
joins on six Hive tables. My query runs without any errors, and all the
map reduce jobs run to completion, but unfortunately it is not showing any
results. I tried debugging the query and to
I fixed the XML problem and wrote everything into hive-site.xml. The
update error still exists though.
Anja
On 03/04/2011 09:47 AM, Ajo Fod wrote:
The good news is that this is a simple XML section .. and this looks
like a XML read error.
Try to copy-paste one of the existing properties
Andreas,
Well, that is not entirely true; Oozie consumes the Yahoo distributions of
Hadoop and Pig (from the Yahoo GH maven repo). [BTW, this brings up the GH-0226
issue again]
Thanks for reviewing the patch.
In the mean time, anybody wanting to use Oozie with Hive action can use CDH
Oozie CDH3b4 which
Can you search your /tmp/username/hive.log for 'Stats' and see if there is
any error message? You can also log on to mysql and see if the database you
specified in the JDBC URI has been created and if there is any table in the
database.
On Mar 5, 2011, at 7:35 AM, Anja Gruenheid wrote:
I also
I tried to use the default settings and with that it works (at least it
doesn't throw an error). What's weird is that it collects the data on
files/files size etc., but it doesn't compute the row count. Do you have
any idea why that could be? The table is based on a textfile and is
handled as
My English is very poor.
I set up my Hive environment using
http://wiki.apache.org/hadoop/Hive/GettingStarted#Apache_Weblog_Data
but I hit an exception when I run the SHOW TABLES; command.
Can somebody help me? Thanks a lot!
hive> SHOW TABLES;
Exception in thread "main" java.lang.NoSuchMethodError:
Hi,
I am a Hive newbie. I just finished setting up Hive on a cluster of two servers
for my organisation. As a test drill, we ran some simple queries. It took the
standard map-reduce algorithm around 4 minutes just to execute this query:
select count(1) from tablename;
The answer returned was
Check the lib path; see whether
commons-lang-2.4.jar is in the lib or not.
_
From: 徐厚道 [mailto:xuhou...@gmail.com]
Sent: Monday, March 07, 2011 11:54 AM
To: user@hive.apache.org
Subject: hello everybody,i am fresher,i meet a problem,please help.
My English is very poor.
I set up my Hive environment
Thank you for the reply!
Yes, it is. And the hadoop lib dir has commons-lang-2.1.jar; do they
conflict?
2011/3/7 Chinna chinna...@huawei.com
Check the lib path,
commons-lang-2.4.jar is in the lib or not.
--
*From:* 徐厚道 [mailto:xuhou...@gmail.com]
*Sent:*
No, it won't be a conflict.
If commons-lang-2.4.jar is there in your Hive installation's lib/,
it will come onto the classpath when Hive starts.
I think you are using Hive version 0.5.0 or above.
If this problem is still there, send the details, like how you are starting
Hive and which
In my experience, Hive is not instantaneous like other DBs, but 4 minutes to
count 2200 rows seems unreasonable.
For comparison, my query of 169k rows on one computer with 4 cores running
at about 1 GHz took 20 seconds.
Cheers,
Ajo.
On Mon, Mar 7, 2011 at 1:19 AM, abhishek pathak
Nevermind, looks like this has already been done using the Thrift APIs!
https://github.com/forward/rbhive
On Mon, Mar 7, 2011 at 1:24 PM, Ryan LeCompte lecom...@gmail.com wrote:
Hey guys,
I'm thinking about writing a native Ruby client that can be used to connect
to a running Hive server
I am Sqooping data from an external source into a bucketed Hive table. Sqoop
seems completely bucket-unaware; it simply used LOAD INPATH, which moves the
single file containing the Sqooped data into the Hive warehouse location.
My question:
- is there any way to get data into an empty
Sorry I have not replied immediately. I have confirmed that commons-lang-2.4.jar
is in installation/lib.
My installation info is:
hive 0.6.0
hadoop 0.20.2 with nutch 1.1
I have viewed the bin/hive script and echoed CLASSPATH and HADOOP_CLASSPATH;
they both contain commons-lang-2.4.jar, but it still throws
I suspected as much. My system is a Core2Duo, 1.86 GHz. I understand that
map-reduce is not instantaneous; I just wanted to confirm that 2200 rows in 4
minutes is indeed not normal behaviour. Could you point me to some places where
I can get some info on how to tune this up?
Regards,
Abhishek
Hi,
I loaded a data set which has 1 million rows into both Hive and HBase
tables. For the HBase table, I created a corresponding Hive table so that
the data in HBase can be queried from Hive QL. Both tables have a key column
and a value column
For the same query (select value, count(*) from
Yes.
JVS
On Mar 7, 2011, at 9:59 PM, Biju Kaimal wrote:
Hi,
I loaded a data set which has 1 million rows into both Hive and HBase tables.
For the HBase table, I created a corresponding Hive table so that the data in
HBase can be queried from Hive QL. Both tables have a key column and a
If you go to the jobtracker's web UI, it provides plenty of details
about each job. Even with all the default settings of a typical
hadoop/hive installation, 4 minutes for 2200 rows is extremely slow.
It feels like there is some kind of problem but it is hard to guess
what that could be. Digging