Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Richa Sharma
Jerome, I would recommend that you try the RANK function with columns from just one table first. Once it is established that RANK is working fine, then add all the joins. I am still on Hive 0.10 so I cannot test it myself. However, I can find a similar issue at the following link - so it's possible you are

Metadata_db error in Hive 0.8

2013-07-17 Thread Maheedhar Reddy
Hi all, I installed Hive 4 months back; I don't have any database installed on my system. Until today it was working fine, but now I'm getting this metadata error... Can anyone please suggest whether I have to create any settings for the Derby database to avoid this error? The error is like

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Vijay
As the error message states ("One or more arguments are expected"), you have to pass a column to the rank function. On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier verdier.jerom...@gmail.com wrote: Hi Richa, I have tried a simple query without joins, etc. SELECT RANK() OVER (PARTITION BY

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Jérôme Verdier
Hi Vijay, could you give me an example? I'm not sure what you mean. Thanks, 2013/7/17 Vijay tec...@gmail.com As the error message states ("One or more arguments are expected"), you have to pass a column to the rank function. On Wed, Jul 17, 2013 at 1:12 AM, Jérôme Verdier

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Richa Sharma
Vijay, Jerome has already passed a column - mag.co_societe - for rank. Syntax: RANK() OVER (PARTITION BY mag.co_societe ORDER BY mag.me_vente_ht). This will generate a rank within each mag.co_societe partition based on the column value me_vente_ht. Jerome, it's possible you are also hitting the same bug as I

Re: Use RANK OVER PARTITION function in Hive 0.11

2013-07-17 Thread Jérôme Verdier
Hi Richa, I have tried one query with what I've understood of Vijay's tips: SELECT code_entite, RANK(mag.me_vente_ht) OVER (PARTITION BY mag.co_societe ORDER BY mag.me_vente_ht) AS rank FROM default.thm_renta_rgrp_produits_n_1 mag; This query is working; it gives me results. You say that
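To summarize the thread: in standard SQL, RANK() takes no arguments, and the ORDER BY inside OVER determines the ranking; the Hive 0.11 windowing implementation apparently rejected the no-argument form, which is why passing the ORDER BY column as an argument made the query parse. A minimal sketch of both forms, using the table from the thread (version behavior as described by the participants, not independently verified):

```sql
-- Standard form (no argument to RANK); works in later Hive releases
-- and most SQL engines: rank rows within each co_societe partition.
SELECT mag.co_societe,
       mag.me_vente_ht,
       RANK() OVER (PARTITION BY mag.co_societe
                    ORDER BY mag.me_vente_ht) AS sales_rank
FROM default.thm_renta_rgrp_produits_n_1 mag;

-- Hive 0.11 workaround from this thread: pass the ORDER BY column
-- as an argument to RANK() to satisfy the parser.
SELECT mag.co_societe,
       RANK(mag.me_vente_ht) OVER (PARTITION BY mag.co_societe
                                   ORDER BY mag.me_vente_ht) AS sales_rank
FROM default.thm_renta_rgrp_produits_n_1 mag;
```

The alias `sales_rank` is used here instead of the thread's `rank` only to avoid shadowing the function name; both queries are otherwise the pattern discussed above.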

RE: New to hive.

2013-07-17 Thread Puneet Khatod
Hi, there are many online tutorials and blogs that provide quick get-set-go sort of information. To start with, you can learn Hadoop. For detailed knowledge you will have to go through the e-books mentioned by Lefty. These books are bulky but will cover every bit of Hadoop. I recently came

which approach is better

2013-07-17 Thread Hamza Asad
Please let me know which approach is better: either I save my data directly to HDFS and run Hive (Shark) queries over it, or I store my data in HBase and then query it. I want to ensure efficient data retrieval, and that the data remains safe and can easily be recovered if Hadoop crashes. -- *Muhammad Hamza

Re: which approach is better

2013-07-17 Thread Nitin Pawar
What's the purpose of the data storage? What read and write throughput do you expect? How will you access the data when reading? What are your SLAs on both reads and writes? There will be more questions others will ask, so be ready for that :) On Wed, Jul 17, 2013 at 11:10 PM, Hamza Asad

Re: which approach is better

2013-07-17 Thread Hamza Asad
I use the data to generate reports on a daily basis and do a couple of analyses; it is insert-once, read-many on a daily basis. But my main purpose is to secure my data and easily recover it even if my Hadoop (datanode) or HDFS crashes. Up until now, I'm using an approach in which data has been retrieved

Re: New to hive.

2013-07-17 Thread Mohammad Tariq
Hello ma'am, apologies first of all for responding so late. I was stuck with some urgent deliverables and was out of touch for a while. java.io.IOException: Cannot run program /Users/bharati/hive-0.11.0/src/testutils/hadoop (in directory /Users/bharati/eclipse/tutorial/src): error=13, Permission denied

Re: New to hive.

2013-07-17 Thread Bharati Adkar
Hi Tariq, no problem. It was the hive.jar.path property that was not being set; I figured it out and fixed it. I got the plan.xml and jobconf.xml, and will now debug Hadoop to get the rest of the info. Thanks, Warm regards, Bharati On Jul 17, 2013, at 12:08 PM, Mohammad Tariq donta...@gmail.com wrote:

Re: New to hive.

2013-07-17 Thread Mohammad Tariq
Great. Good luck with that. Warm Regards, Tariq cloudfront.blogspot.com On Thu, Jul 18, 2013 at 12:43 AM, Bharati Adkar bharati.ad...@mparallelo.com wrote: Hi Tariq, No Problems, It was the hive.jar.path property that was not being set. Figured it out and fixed it. Got the plan.xml and

Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread John Omernik
Hey all - I was wondering if there were any shortcut Java courses out there. As in, I am not looking for a holistic "learn everything about Java" course, but more of a "So you are a big data/Hive geek and you get Python/Perl pretty well, but when you try to understand Java your head explodes" and it

Re: Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread John Meagher
The Data Science course on Coursera has a pretty good overview of MapReduce, Hive, and Pig without going into the Java side of things. https://www.coursera.org/course/datasci. It's not in depth, but it is enough to get started. On Wed, Jul 17, 2013 at 3:52 PM, John Omernik j...@omernik.com

Re: Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread Yasmin Lucero
Ha. I have the same problem. It is hard to find resources aimed at the right level. I have been pretty happy with the book Head First Java by Kathy Sierra and Bert someone-or-other. y Yasmin Lucero Senior Statistician, Gravity.com

Re: Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread xiufeng liu
You could also take a look at the following resources for data science: http://datascienc.es/resources/ http://blog.zipfianacademy.com/ Regards, Xiufeng Liu On Wed, Jul 17, 2013 at 10:09 PM, Yasmin Lucero yasmin.luc...@gmail.com wrote: Ha. I have the same problem. It is hard to find resources

Question regarding external table and csv in NFS

2013-07-17 Thread Mainak Ghosh
Hello, I have just started using Hive and was trying to create an external table with a CSV file placed on NFS. I tried using file:// and local://; both attempts failed with the error: create external table outside_supplier (S_SUPPKEY INT, S_NAME STRING, S_ADDRESS STRING,

Re: Question regarding external table and csv in NFS

2013-07-17 Thread Mainak Ghosh
Hey Saurabh, I tried this command and it still gives the same error. Actually, the folder name is supplier, and supplier.tbl is the CSV that resides inside it. I had it correct in the query, but in the mail it is wrong. So the query that I executed was: create external table outside_supplier
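For context on this thread: Hive's CREATE EXTERNAL TABLE expects LOCATION to name a directory, not a file, and a file:// path on an NFS mount is generally only usable if every node in the cluster sees the same path, which is why copying the data into HDFS is the usual route. A hedged sketch of that pattern, with the column list abbreviated to the columns visible in the thread and all paths illustrative:

```sql
-- First copy the CSV into an HDFS directory (from a shell):
--   hadoop fs -mkdir /user/hive/supplier
--   hadoop fs -put /nfs/path/supplier/supplier.tbl /user/hive/supplier/
CREATE EXTERNAL TABLE outside_supplier (
  S_SUPPKEY INT,
  S_NAME    STRING,
  S_ADDRESS STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/hive/supplier';  -- a directory, not the file itself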

Problem with the windowing function ntile (Exceptions)

2013-07-17 Thread Lars Francke
Hi, I'm running a query like this: CREATE TABLE foo STORED AS ORC AS SELECT id, season, amount, ntile(10) OVER ( PARTITION BY season ORDER BY amount DESC ) FROM bar; On a small enough dataset that works fine but when switching to a larger sample we're seeing exceptions like this:

Hive does not package a custom inputformat in the MR job jar when the custom inputformat class is added as an aux jar.

2013-07-17 Thread Mitesh Peshave
Hello, I am trying to use a custom inputformat for a Hive table. When I add the jar containing the custom inputformat through a client such as Beeline by executing the add jar command, all seems to work fine. In this scenario, Hive seems to pass the inputformat class to the JT and TTs. I believe it

Re: Hive does not package a custom inputformat in the MR job jar when the custom inputformat class is added as an aux jar.

2013-07-17 Thread Andrew Trask
Put them in Hive's lib folder? Sent from my Rotary Phone On Jul 17, 2013, at 11:14 PM, Mitesh Peshave mspesh...@gmail.com wrote: Hello, I am trying to use a custom inputformat for a hive table. When I add the jar containing the custom inputformat through a client, such as the beeline,
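As the original post notes, the per-session workaround that does ship the class with the job is the add jar command; a minimal sketch, with the jar path and class name purely illustrative:

```sql
-- From a Hive CLI / Beeline session; the jar is then distributed
-- with the launched MapReduce job.
ADD JAR /opt/hive/auxlib/custom-inputformat.jar;

-- The custom class (hypothetical name) can then be referenced in DDL:
-- CREATE TABLE t (...)
--   STORED AS INPUTFORMAT 'com.example.MyInputFormat'
--   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat';
```

The difference the thread is probing is that add jar ships the jar per session, whereas aux-jar configuration makes the class visible to the Hive client itself but, per the report, not necessarily to the MR job jar.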

Export to RDBMS directly

2013-07-17 Thread Omkar Joshi
Hi, currently I'm executing the following steps (Hadoop 1.1.2, Hive 0.11 and Sqoop-1.4.3.bin__hadoop-1.0.0): 1. Import data from MySQL to Hive using Sqoop; 2. Execute a query in Hive and store its output in a Hive table; 3. Export the output to MySQL using Sqoop. I was wondering if

Re: Export to RDBMS directly

2013-07-17 Thread Bertrand Dechoux
The short answer is no. You could at the moment write your own input format/output format in order to do so. I don't know all the details for Hive, but that's possible. However, you will likely run a DoS against your database if you are not careful. Hive could embed Sqoop in order to do that smartly

RE: Export to RDBMS directly

2013-07-17 Thread Omkar Joshi
I read of the term 'JDBC Storage Handler' at https://issues.apache.org/jira/browse/HIVE-1555. The issue seems open, but I just want to confirm that it has not been implemented in the latest Hive releases. Regards, Omkar Joshi From: Bertrand Dechoux [mailto:decho...@gmail.com] Sent: Thursday,