Re: Want to improve the performance for execution of Hive Jobs.

2012-05-08 Thread Alexis De La Cruz Toledo
Hi,
To find out whether the job is running on the whole cluster,
look at the Hive logs (by default under $HADOOP_HOME/logs).
Another way is to run the query in Hive
and watch it in the Hadoop web interface:

address: http://server_jobtracker:port_mapreduce
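
As a small illustration of the address format above, here is a sketch that assembles the JobTracker web UI URL. The default port 50030 and the page name jobtracker.jsp are assumptions based on stock Hadoop 1.x settings; check mapred.job.tracker.http.address on your own cluster, since both can differ. The host name is the placeholder from the address line, not a real machine.

```python
def jobtracker_url(host, port=50030, page="jobtracker.jsp"):
    """Build the JobTracker web UI address as http://host:port/page.

    port and page are assumed Hadoop 1.x defaults, not values from this thread.
    """
    return "http://{}:{}/{}".format(host, port, page)


# "server_jobtracker" is the placeholder host used in the message above.
print(jobtracker_url("server_jobtracker"))
```

Opening that page in a browser while the query runs should list the running job and how many map and reduce tasks it has on each node.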

Regards.

2012/5/7 Bhavesh Shah bhavesh25s...@gmail.com

 Hello all,
 I have written Hive JDBC code and packaged it as a JAR. I am running that
 JAR on a 10-node cluster.
 The problem is that even with the 10-node cluster, the performance is the
 same as on a single-node cluster.

 What can I do to improve the performance of Hive jobs? Are there any
 configuration settings to set before submitting Hive jobs to the cluster?
 One more thing I want to know: how can we tell whether the job is
 running on the whole cluster?

 Please let me know if anyone knows about this.

 --
 Regards,
 Bhavesh Shah




-- 
Ing. Alexis de la Cruz Toledo.
*Av. Instituto Politécnico Nacional No. 2508 Col. San Pedro Zacatenco. México,
D.F, 07360 *
*CINVESTAV, DF.*


Why a GroupBYOperator is realized in two MapReduce?

2012-04-12 Thread Alexis De La Cruz Toledo
Hi! I have a question: why is a GroupBy operator executed
as two MapReduce jobs?
1. First, the aggregation functions (sum(), count(), avg(), max(), etc.) are
computed partially.
2. Then, in another MapReduce job, the aggregation is finalized.
Why?

Thanks.
Regards
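
The partial/final split described in the question can be sketched in a few lines. This is only an illustration of two-stage aggregation in general, not Hive's actual GroupByOperator code; the function names and the sample rows are made up. The point it shows is that avg() cannot be merged from per-split averages, so the first stage has to emit an intermediate state (sum and count) that a second stage merges and finalizes. As far as I know, hive.groupby.skewindata=true is one setting that makes Hive spread this over two MapReduce jobs, but treat that as an assumption to verify against your version.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Stage 1: build a partial (sum, count) state per key for one input split."""
    state = defaultdict(lambda: [0, 0])
    for key, value in rows:
        state[key][0] += value
        state[key][1] += 1
    return {key: tuple(pair) for key, pair in state.items()}


def final_aggregate(partials):
    """Stage 2: merge the partial states and finalize avg() per key."""
    merged = defaultdict(lambda: [0, 0])
    for partial in partials:
        for key, (total, count) in partial.items():
            merged[key][0] += total
            merged[key][1] += count
    return {key: total / count for key, (total, count) in merged.items()}


# Two hypothetical input splits of (key, value) rows.
splits = [
    [("a", 1), ("b", 10)],
    [("a", 3), ("b", 20), ("a", 5)],
]
partials = [partial_aggregate(split) for split in splits]
print(final_aggregate(partials))
```

Because stage 1 only needs the rows of its own split, it can run wherever the data happens to be; only the small partial states have to be grouped by key for stage 2.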




Re: GSOC project for Hive

2012-03-22 Thread Alexis De La Cruz Toledo
Hi, my name is Alex and I'm a student.
I would like to participate in GSOC.
Currently, I'm working on collapsing subqueries in the DAG.
Can I add a project in order to participate?
Could someone be my mentor?

Thanks.


On March 21, 2012 at 17:36, Namit Jain nj...@fb.com wrote:

 Hi Nitika,

 I have proposed a few Hive projects on the GSOC.

 Do you want to add more projects?
 What exactly are you looking for: new project ideas,
 or proposals for existing projects?


 Thanks,
 -namit


 On 3/21/12 3:37 PM, nitika gupta nitika71...@gmail.com wrote:

 Hi Hive Devs,
 I am a student planning to apply for GSOC this summer. I am working at
 a startup in the Bay Area alongside my studies, using Hadoop and
 Hive. I am interested in proposing a project for the Hive open
 source community to work on this summer (under GSOC) and would
 appreciate it if anyone has ideas for interesting projects.
 
 A few ideas from the JIRA list for GSOC Apache:
 1) Enhance bucketing and sorting support in Hive (HIVE-2846)
 2) Add support for query rewrite from the metadata (HIVE-2847)
 
 Feel free to propose ideas and suggestions for Hive projects.
 
 Thanks
 
 Nitika






Re: Hive projects for Google Summer of code 2012 ?

2012-03-06 Thread Alexis De La Cruz Toledo
Sorry, Mr. Namit Jain, I can't see the JIRAs.

Thanks

On March 6, 2012 at 19:41, Namit Jain nj...@fb.com wrote:

 I filed a couple of gsoc 2012 jiras.

 Please submit your proposal if you are interested.

 https://issues.apache.org/jira/secure/IssueNavigator.jspa?requestId=12319270




 Thanks,
 -namit


 On 2/7/12 6:32 PM, Alexis De La Cruz Toledo alexis...@gmail.com wrote:

 Hi Namit Jain, I'm Alexis, a master's student at Cinvestav,
 D.F., Mexico.
 I'm interested in collaborating on Hive, and I'd like to do my thesis work
 on something related to Hive.
 I'd also like to participate in Google Summer of Code 2012.
 
 I find the issues you propose interesting, particularly the following:
 * Topic 2, Indexed Joins.
 * [PO] Optimize Joins using Bloom Filters, from this page:
 https://cwiki.apache.org/confluence/display/Hive/Roadmap.
 
 Can you tell me more about them?
 What is the problem to be solved?
 What benefits do we hope to gain?
 
 I ask because I want to define my thesis problem.
 Also, could you be my mentor in Google Summer of Code if I work
 on these topics?
 
 Thanks.
 
 On February 5, 2012 at 19:58, Namit Jain nj...@fb.com wrote:
 
  Hi Alexis/Bharath,
 
  Great to see your interest. If you are looking for ideas, some things that
  will be very useful are:
 
  1. Removing the map-join hint completely.
     Moving all processing to runtime.
     Currently, bucketed map joins and sort merge joins are completely
     driven off hints.
     It would be very helpful to the community, and also clean up the code
     a lot.
 
  2. Indexed Joins.
     Something that would be really useful -
     If the basic infrastructure is ready, it can eventually be used to
     join tables stored outside also (say HBase).
 
  3. Metastore understanding hierarchy.
     For example: if a table is partitioned by ds and hr,
     a valid partition on ds does not exist. This is a very common use case
     in many applications, and the current workaround is to have signal
     tables for ds, unnecessarily complicating the process.
 
 
  If you are interested, I would be happy to provide more details.
 
 
  Thanks,
  -namit
 
 
  On 2/4/12 11:57 AM, Ashutosh Chauhan hashut...@apache.org wrote:
 
  Hi Alexis,
  
  Great to see your interest. Feel free to come up with a concrete proposal
  and submit it to GSoC. It's certainly heartening to see folks interested in
  making contributions to the Hive Project.
  
  Ashutosh






how can I get the AST in hive?

2012-02-20 Thread Alexis De La Cruz Toledo
Hi, I have a question: how can I get the AST in Hive?
And how can I get the QB tree in Hive?

Thank you.
Regards.



Re: Running hive in eclipse

2012-02-15 Thread Alexis De La Cruz Toledo
I have the same problem.
Can anyone help us?

On February 15, 2012 at 13:52, Aaron Sun aaron.su...@gmail.com wrote:

 Hi Team,

 I am trying to run and debug hive in eclipse. I checked out release-0.8.0
 1215012 from the SVN repository and built the project with thrift and fb303
 library installed correctly. The building process returned Build
 Successfully.

 Then I tried to launch the cli by running CliDriver.java as a Java
 Application, and it returned errors as

 
 Exception in thread "main" java.lang.RuntimeException: Failed to load Hive builtin functions
 at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:190)
 at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:81)
 at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:576)
 at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
 Caused by: java.util.zip.ZipException: error in opening zip file
 at java.util.zip.ZipFile.open(Native Method)
 at java.util.zip.ZipFile.<init>(ZipFile.java:131)
 at java.util.jar.JarFile.<init>(JarFile.java:150)
 at java.util.jar.JarFile.<init>(JarFile.java:87)
 at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:90)
 at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:66)
 at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:71)
 at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
 at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:150)
 at java.net.URL.openStream(URL.java:1029)
 at org.apache.hadoop.hive.ql.exec.FunctionRegistry.registerFunctionsFromPluginJar(FunctionRegistry.java:1194)
 at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:187)
 ... 3 more
 
 I looked over the build.xml under ./builtins directory, and noticed that
 the compile and jar targets are both commented, and no jar is generated for
 builtins

 <target name="compile" depends="init, setup">
   <echo message="Project: ${ant.project.name}"/>
   <!-- defer compilation until package phase -->
 </target>

 <target name="jar" depends="init">
   <echo message="Project: ${ant.project.name}"/>
   <!-- defer compilation until package phase -->
 </target>

 I then manually changed the build.xml for compile part as follows and
 rebuilt the project:
  <target name="compile" depends="init, setup">
    <echo message="Project: ${ant.project.name}"/>
    <javac
     encoding="${build.encoding}"
     srcdir="${src.dir}"
     includes="**/*.java"
     destdir="${build.classes}"
     debug="${javac.debug}"
     deprecation="${javac.deprecation}"
     includeantruntime="false">
      <compilerarg line="${javac.args} ${javac.args.warnings}" />
      <classpath refid="classpath"/>
    </javac>
  </target>

 Now 'hive-builtins-0.8.0-SNAPSHOT.jar' is under the ./build/builtins
 directory. However, I am still getting the same error message, "Failed to
 load Hive builtin functions". Could someone kindly let me know what the
 problem is and how I should run the CLI correctly in Eclipse?

 Thanks
 Aaron






Re: Hive projects for Google Summer of code 2012 ?

2012-02-04 Thread Alexis De La Cruz Toledo
Hi Ashutosh, I'm interested in Hive.
I'd like to improve the compilation process:
I have seen that the query plan tree generated
by Hive can be optimized. I'd also like
to participate in Google Summer of Code 2012.
What do you think?

Regards.


On February 4, 2012 at 12:29, Ashutosh Chauhan hashut...@apache.org wrote:

 Hey Bharath,

 Great to see your enthusiasm for Hive! I would be happy to mentor you for
 the project. To start, you can take a look at
 https://cwiki.apache.org/confluence/display/Hive/Roadmap for a list of
 open projects in Hive. The document is a bit dated, so some of those
 projects may not be relevant. But it's a good place to start to see if
 any of these projects excite you.

 Hope it helps,
 Ashutosh

 On Sat, Feb 4, 2012 at 08:47, bharath vissapragada 
 bharathvissapragada1...@gmail.com wrote:

  Hey list, devs,
 
  The Google Summer of Code 2012 announcement [1] has been released, and
  mentoring organizations can submit their proposals to Google for
  open source projects.
 
  Are any of the devs interested in mentoring students on Hive projects (any
  critical JIRAs, etc.)? It would be great if any of the devs (dev list
  cc'ed) could do that on behalf of the ASF.
 
  It would be a great opportunity for many students to contribute patches
  to Hadoop and Hive and make their summer vacation fruitful.
 
  [1] http://google-melange.appspot.com/gsoc/events/google/gsoc2012
 
  Thanks and Regards,
  Bharath .V
  w:http://researchweb.iiit.ac.in/~bharath.v
 






Re: Table not creating in hive

2012-02-02 Thread Alexis De La Cruz Toledo
This is because you need the metastore.
If you have not configured it in an external database,
Hive creates it with Derby in the directory from which
you launched Hive, so remember which directory that was.
There you should find a directory named metastore_db;
start Hive from that same directory.
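
A quick way to check this before launching Hive is to look for the Derby directory in the current working directory. The sketch below assumes the embedded Derby default, where the metastore lives in a directory called metastore_db; if javax.jdo.option.ConnectionURL has been changed, the name or location will differ. The helper name is made up for illustration, and the demo uses a temporary directory rather than a real Hive installation.

```python
import os
import tempfile


def has_embedded_metastore(workdir, dirname="metastore_db"):
    """Return True if workdir contains the (assumed) Derby metastore directory."""
    return os.path.isdir(os.path.join(workdir, dirname))


# Demo on a throwaway directory instead of a real Hive working directory.
with tempfile.TemporaryDirectory() as demo_dir:
    print(has_embedded_metastore(demo_dir))  # nothing has been created yet
    os.mkdir(os.path.join(demo_dir, "metastore_db"))
    print(has_embedded_metastore(demo_dir))  # directory is now present
```

If the check fails in the directory you are about to start Hive from, Hive will create a fresh, empty metastore there, which would explain tables that exist on HDFS but do not show up in SHOW TABLES.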

Regards.

On February 2, 2012 at 05:46, Bhavesh Shah bhavesh25s...@gmail.com wrote:

 Hello all,

 After successfully importing the tables into Hive, I am not able to see the
 table in Hive.
 When I imported the table I saw the directory on HDFS (under
 /user/hive/warehouse/), but when I execute the SHOW TABLES command in Hive,
 the table is not in the list.

 I have searched a lot about this but have not found anything.
 Please suggest a solution.




 --
 Thanks and Regards,
 Bhavesh Shah






Who can I get the MapReduce Jobs codes generated by Hive Compiler?

2012-01-31 Thread Alexis De La Cruz Toledo
Hi, I'd like to know what code is generated by
the Hive SQL compiler, i.e. if I execute one SQL statement,
I'd like to see the code of the MapReduce jobs generated by the Hive SQL
compiler.

How can I get it?

Thanks.


Re: Who can I get the MapReduce Jobs codes generated by Hive Compiler?

2012-01-31 Thread Alexis De La Cruz Toledo
Thank you

On January 31, 2012 at 11:13, bharath vissapragada
bharathvissapragada1...@gmail.com wrote:

 Hey,

 The EXPLAIN command is what you need. Just type EXPLAIN followed by your
 query in the console and it dumps the entire plan to the screen. You can
 analyze it to see the plan generated by Hive.
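
 For scripted use, the same thing can be done without the interactive console. The sketch below only builds the statement and the argument list for the hive command line (hive -e runs a statement passed as a string); it does not run anything. The helper names and the table src in the example query are hypothetical.

```python
def explain_statement(query, extended=False):
    """Prefix a HiveQL query with EXPLAIN (or EXPLAIN EXTENDED)."""
    keyword = "EXPLAIN EXTENDED" if extended else "EXPLAIN"
    # Drop surrounding whitespace and a trailing semicolon before prefixing.
    return "{} {}".format(keyword, query.strip().rstrip(";"))


def hive_cli_args(statement):
    """Argument list for running one statement through the hive CLI."""
    return ["hive", "-e", statement]


# "src" is a made-up table name used only for illustration.
query = "SELECT key, count(1) FROM src GROUP BY key;"
print(hive_cli_args(explain_statement(query)))
```

 The resulting list can be handed to subprocess.run on a machine where the hive command is installed; the output lists the stages of the plan rather than generated Java source, since Hive builds a plan of operators instead of emitting MapReduce code.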

 Thanks


 --
 Regards,
 Bharath .V
 w:http://researchweb.iiit.ac.in/~bharath.v



