Re: Want to improve the performance for execution of Hive Jobs.
Hi,

To check whether the job is running across the whole cluster, look at the logs (by default under $HADOOP_HOME/logs). Another way is to run the query in Hive and follow it in the Hadoop web interface at http://server_jobtracker:port_mapreduce

Regards.

2012/5/7 Bhavesh Shah bhavesh25s...@gmail.com

Hello all,

I have written Hive JDBC code and built a JAR from it. I am running that JAR on a 10-node cluster, but the performance is still the same as on a single node. What can I do to improve the performance of Hive jobs? Is there any configuration setting to apply before submitting Hive jobs to the cluster? I would also like to know how we can tell whether a job is running on the whole cluster. Please let me know if anyone knows about this.

--
Regards,
Bhavesh Shah

--
Ing. Alexis de la Cruz Toledo.
*Av. Instituto Politécnico Nacional No. 2508 Col. San Pedro Zacatenco. México, D.F, 07360*
*CINVESTAV, DF.*
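The two checks suggested in the reply above (reading the files under $HADOOP_HOME/logs and opening the JobTracker web interface) could be scripted roughly as follows. This is only a sketch: the helper names jobtracker_url and list_log_files are made up for illustration, and the port is something you must supply yourself (50030 is the customary JobTracker HTTP port on Hadoop 1.x, but treat that as an assumption about your installation).

```python
import os
from pathlib import Path


def jobtracker_url(host, port):
    """Build the address of the JobTracker web interface (hypothetical helper)."""
    return "http://{}:{}/".format(host, port)


def list_log_files(hadoop_home):
    """Return the sorted file names found in <hadoop_home>/logs.

    An empty list is returned when the logs directory does not exist,
    so the caller can tell "no logs yet" apart from a crash.
    """
    log_dir = Path(hadoop_home) / "logs"
    if not log_dir.is_dir():
        return []
    return sorted(p.name for p in log_dir.iterdir() if p.is_file())


if __name__ == "__main__":
    # HADOOP_HOME is read from the environment; the host name and port
    # below are placeholders, not values taken from the thread.
    home = os.environ.get("HADOOP_HOME", ".")
    print(jobtracker_url("server_jobtracker", 50030))
    for name in list_log_files(home):
        print(name)
```

While a query is running, the web interface lists the map and reduce tasks of the job together with the nodes they were assigned to, which is the quickest way to see whether more than one node is doing work.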
Why is a GroupByOperator implemented as two MapReduce jobs?
Hi!

I have a question: why is a GroupBy operator executed as two MapReduce jobs?

1. First, the aggregation functions (sum(), count(), avg(), max(), etc.) are computed partially.
2. Then, in a second MapReduce job, the aggregation is finalized.

Why?

Thanks. Regards

--
Ing. Alexis de la Cruz Toledo.
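The two numbered steps above can be illustrated with a small simulation. This is a sketch of the general partial/final aggregation idea, not Hive's actual operator code; the function names and the sample data are invented for the example. The point it tries to show is that a function such as avg() cannot be merged from partial averages, so the first stage has to emit mergeable pieces (a sum and a count) and the second stage combines them.

```python
from collections import defaultdict


def partial_aggregate(rows):
    """Stage 1: fold one split of (key, value) rows into (sum, count) per key."""
    acc = defaultdict(lambda: [0, 0])
    for key, value in rows:
        acc[key][0] += value
        acc[key][1] += 1
    return {key: tuple(pair) for key, pair in acc.items()}


def final_aggregate(partials):
    """Stage 2: merge the partial (sum, count) pairs and finish the aggregates."""
    merged = defaultdict(lambda: [0, 0])
    for part in partials:
        for key, (total, count) in part.items():
            merged[key][0] += total
            merged[key][1] += count
    return {
        key: {"sum": total, "count": count, "avg": total / count}
        for key, (total, count) in merged.items()
    }


# Two input splits, as if they had been processed by two different workers.
splits = [
    [("a", 1), ("b", 4)],
    [("a", 3), ("b", 6), ("a", 5)],
]
result = final_aggregate(partial_aggregate(split) for split in splits)
print(result)
```

Because the first stage works on whatever rows it happens to receive, it can be spread evenly over the cluster even when a few keys dominate the data; only the much smaller partial results have to be brought together by key in the second stage.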
Re: GSOC project for Hive
Hi, my name is Alex. I'm a student and I would like to participate in GSoC. I'm currently working on collapsing subqueries in the DAG. Can I add a project in order to participate? Could someone be my mentor?

Thanks.

On 21 March 2012 at 17:36, Namit Jain nj...@fb.com wrote:

Hi Nitika,

I have proposed a few Hive projects on the GSOC. Do you want to add more projects? What exactly are you looking for: new project ideas, or proposals for existing projects?

Thanks,
-namit

On 3/21/12 3:37 PM, nitika gupta nitika71...@gmail.com wrote:

Hi Hive Devs,

I am a student planning to apply for GSOC this summer. I am working at a startup in the Bay Area alongside my studies, working on Hadoop and Hive. I am interested in proposing a project for the Hive open source community to work on this summer (under GSOC) and would appreciate any ideas for interesting projects.

A few ideas from the JIRA list for GSOC Apache:

1) Enhance bucketing and sorting support in Hive (HIVE-2846)
2) Add support for query rewrite from the metadata (HIVE-2847)

Feel free to propose ideas and suggestions for Hive projects.

Thanks
Nitika

--
Ing. Alexis de la Cruz Toledo.
Re: Hive projects for Google Summer of code 2012 ?
Sorry, Mr. Namit Jain, I can't see the JIRAs.

Thanks

On 6 March 2012 at 19:41, Namit Jain nj...@fb.com wrote:

I filed a couple of GSoC 2012 JIRAs. Please submit your proposal if you are interested.

https://issues.apache.org/jira/secure/IssueNavigator.jspa?requestId=12319270

Thanks,
-namit

On 2/7/12 6:32 PM, Alexis De La Cruz Toledo alexis...@gmail.com wrote:

Hi Namit Jain,

I'm Alexis, a master's student at Cinvestav in Mexico, D.F. I'm interested in collaborating with Hive, and I'd like to do my thesis work on something related to Hive. I'd also like to participate in Google Summer of Code 2012.

I find the topics you propose interesting, particularly the following:

* Topic 2, Indexed Joins.
* [PO] Optimize Joins using Bloom Filters, from this page: https://cwiki.apache.org/confluence/display/Hive/Roadmap

Can you tell me more about them? What is the problem to be solved? What benefits do we hope to gain? I ask because I want to define my thesis problem. Also, could you be my mentor in Google Summer of Code if I work on these topics?

Thanks.

On 5 February 2012 at 19:58, Namit Jain nj...@fb.com wrote:

Hi Alexis/Bharath,

Great to see your interest. If you are looking for ideas, some things that would be very useful are:

1. Removing the map-join hint completely and moving all processing to runtime. Currently, bucketed map joins and sort merge joins are completely driven off hints. This would be very helpful to the community, and would also clean up the code a lot.
2. Indexed joins. This would be really useful: once the basic infrastructure is ready, it can eventually be used to join tables stored outside as well (say, HBase).
3. Metastore understanding hierarchy. For example, if a table is partitioned by ds and hr, a valid partition on ds alone does not exist. This is a very common use case in many applications, and the current workaround is to have signal tables for ds, unnecessarily complicating the process.

If you are interested, I would be happy to provide more details.

Thanks,
-namit

On 2/4/12 11:57 AM, Ashutosh Chauhan hashut...@apache.org wrote:

Hi Alexis,

Great to see your interest. Feel free to come up with a concrete proposal and submit it to GSoC. It's certainly heartening to see folks interested in making contributions to the Hive project.

Ashutosh

On Sat, Feb 4, 2012 at 10:48, Alexis De La Cruz Toledo alexis...@gmail.com wrote:

Hi Ashutosh,

I'm interested in Hive and I'd like to improve the compilation process. I have seen that the query plan tree generated by Hive can be optimized, and I'd like to participate in Google Summer of Code 2012. What do you say?

Regards.

On 4 February 2012 at 12:29, Ashutosh Chauhan hashut...@apache.org wrote:

Hey Bharath,

Great to see your enthusiasm for Hive! I would be happy to mentor you for the project. To start, you can take a look at https://cwiki.apache.org/confluence/display/Hive/Roadmap for a list of open projects in Hive. The document is a bit dated, so some of those projects may not be relevant, but it's a good place to start to see if any of these projects excite you.

Hope it helps,
Ashutosh

On Sat, Feb 4, 2012 at 08:47, bharath vissapragada bharathvissapragada1...@gmail.com wrote:

Hey list, devs,

The Google Summer of Code 2012 announcement [1] has been released, and mentoring organizations can submit their proposals to Google for open source projects. Are any of the devs interested in mentoring students on Hive projects (any critical JIRAs, etc.)? It would be great if any of the devs (dev list cc'ed) could do that on behalf of the ASF. It would be a great opportunity for many students to contribute patches to Hadoop and Hive and make their summer vacation fruitful.

[1] http://google-melange.appspot.com/gsoc/events/google/gsoc2012

Thanks and Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v

--
Ing. Alexis de la Cruz Toledo.
How can I get the AST in Hive?
Hi,

I have a question: how can I get the AST in Hive, and how can I get the QB tree in Hive?

Thank you. Regards.

--
Ing. Alexis de la Cruz Toledo.
Re: Running hive in eclipse
I have the same problem. Can anyone help us?

On 15 February 2012 at 13:52, Aaron Sun aaron.su...@gmail.com wrote:

Hi Team,

I am trying to run and debug Hive in Eclipse. I checked out release-0.8.0 1215012 from the SVN repository and built the project with the thrift and fb303 libraries installed correctly. The build reported success. Then I tried to launch the CLI by running CliDriver.java as a Java Application, and it returned the following errors:

Exception in thread "main" java.lang.RuntimeException: Failed to load Hive builtin functions
    at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:190)
    at org.apache.hadoop.hive.cli.CliSessionState.<init>(CliSessionState.java:81)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:576)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:554)
Caused by: java.util.zip.ZipException: error in opening zip file
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:131)
    at java.util.jar.JarFile.<init>(JarFile.java:150)
    at java.util.jar.JarFile.<init>(JarFile.java:87)
    at sun.net.www.protocol.jar.URLJarFile.<init>(URLJarFile.java:90)
    at sun.net.www.protocol.jar.URLJarFile.getJarFile(URLJarFile.java:66)
    at sun.net.www.protocol.jar.JarFileFactory.get(JarFileFactory.java:71)
    at sun.net.www.protocol.jar.JarURLConnection.connect(JarURLConnection.java:122)
    at sun.net.www.protocol.jar.JarURLConnection.getInputStream(JarURLConnection.java:150)
    at java.net.URL.openStream(URL.java:1029)
    at org.apache.hadoop.hive.ql.exec.FunctionRegistry.registerFunctionsFromPluginJar(FunctionRegistry.java:1194)
    at org.apache.hadoop.hive.ql.session.SessionState.<init>(SessionState.java:187)
    ... 3 more

I looked over the build.xml under the ./builtins directory and noticed that the bodies of the compile and jar targets are both commented out, so no jar is generated for builtins:

    <target name="compile" depends="init, setup">
      <echo message="Project: ${ant.project.name}"/>
      <!-- defer compilation until package phase -->
    </target>

    <target name="jar" depends="init">
      <echo message="Project: ${ant.project.name}"/>
      <!-- defer compilation until package phase -->
    </target>

I then manually changed the compile target in build.xml as follows and rebuilt the project:

    <target name="compile" depends="init, setup">
      <echo message="Project: ${ant.project.name}"/>
      <javac
          encoding="${build.encoding}"
          srcdir="${src.dir}"
          includes="**/*.java"
          destdir="${build.classes}"
          debug="${javac.debug}"
          deprecation="${javac.deprecation}"
          includeantruntime="false">
        <compilerarg line="${javac.args} ${javac.args.warnings}" />
        <classpath refid="classpath"/>
      </javac>
    </target>

Now 'hive-buitins-0.8.0-SNAPSHOT.jar' is under the .build/buitins directory. However, I am still getting the same error message, "Failed to load Hive builtin functions". Could someone kindly let me know what the problem is and how I should run the CLI correctly in Eclipse?

Thanks
Aaron

--
Ing. Alexis de la Cruz Toledo.
Re: Hive projects for Google Summer of code 2012 ?
Hi Ashutosh,

I'm interested in Hive and I'd like to improve the compilation process. I have seen that the query plan tree generated by Hive can be optimized, and I'd like to participate in Google Summer of Code 2012. What do you say?

Regards.

On 4 February 2012 at 12:29, Ashutosh Chauhan hashut...@apache.org wrote:

Hey Bharath,

Great to see your enthusiasm for Hive! I would be happy to mentor you for the project. To start, you can take a look at https://cwiki.apache.org/confluence/display/Hive/Roadmap for a list of open projects in Hive. The document is a bit dated, so some of those projects may not be relevant, but it's a good place to start to see if any of these projects excite you.

Hope it helps,
Ashutosh

On Sat, Feb 4, 2012 at 08:47, bharath vissapragada bharathvissapragada1...@gmail.com wrote:

Hey list, devs,

The Google Summer of Code 2012 announcement [1] has been released, and mentoring organizations can submit their proposals to Google for open source projects. Are any of the devs interested in mentoring students on Hive projects (any critical JIRAs, etc.)? It would be great if any of the devs (dev list cc'ed) could do that on behalf of the ASF. It would be a great opportunity for many students to contribute patches to Hadoop and Hive and make their summer vacation fruitful.

[1] http://google-melange.appspot.com/gsoc/events/google/gsoc2012

Thanks and Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v

--
Ing. Alexis de la Cruz Toledo.
Re: Table not creating in hive
This is because Hive needs the metastore. If you have not set it up in an external database, Hive creates an embedded Derby metastore in the directory from which you start Hive, so remember which directory that was. There you should find a directory named metastore_db; start Hive from that same directory.

Regards.

On 2 February 2012 at 05:46, Bhavesh Shah bhavesh25s...@gmail.com wrote:

Hello all,

After successfully importing the tables into Hive, I am not able to see them in Hive. When I imported a table I saw its directory on HDFS (under /user/hive/warehouse/), but when I run SHOW TABLES in Hive the table is not in the list. I have searched a lot about this but have not found anything. Please suggest a solution.

--
Thanks and Regards,
Bhavesh Shah

--
Ing. Alexis de la Cruz Toledo.
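If you no longer remember which directory Hive was started from, something like the following could help locate the embedded Derby metastore. It is only a sketch under the assumption that the default directory name metastore_db is in use; the function name find_metastore_dirs is invented for this example.

```python
from pathlib import Path


def find_metastore_dirs(root, name="metastore_db"):
    """Return every directory called `name` found anywhere below `root`.

    `name` defaults to the directory that an embedded Derby metastore
    is assumed to create; pass another name if your setup differs.
    """
    return sorted(p for p in Path(root).rglob(name) if p.is_dir())


if __name__ == "__main__":
    # Search below the current directory; each hit is a place from which
    # Hive was probably started at some point.
    for path in find_metastore_dirs("."):
        print(path.parent)
```

Finding more than one such directory would explain the symptom in the question: the table was registered in one metastore while SHOW TABLES is reading another.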
How can I get the MapReduce job code generated by the Hive compiler?
Hi,

I'd like to know what code the Hive SQL compiler generates. That is, if I execute an SQL statement, I'd like to see the code of the MapReduce jobs that the compiler generates for it. How can I get it?

Thanks.

--
Ing. Alexis de la Cruz Toledo.
Re: How can I get the MapReduce job code generated by the Hive compiler?
Thank you.

On 31 January 2012 at 11:13, bharath vissapragada bharathvissapragada1...@gmail.com wrote:

Hey,

The EXPLAIN command is what you need. Just type "explain <query>" in the console and it dumps the entire plan to the screen. You can analyze it to see the plan generated by Hive.

Thanks

On Tue, Jan 31, 2012 at 4:57 PM, Alexis De La Cruz Toledo alexis...@gmail.com wrote:

Hi,

I'd like to know what code the Hive SQL compiler generates. That is, if I execute an SQL statement, I'd like to see the code of the MapReduce jobs that the compiler generates for it. How can I get it?

Thanks.

--
Ing. Alexis de la Cruz Toledo.

--
Regards,
Bharath .V
w:http://researchweb.iiit.ac.in/~bharath.v

--
Ing. Alexis de la Cruz Toledo.
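For anyone who wants to script the EXPLAIN suggestion above, here is a minimal sketch. The helper names and the sample query against a table called src are assumptions made for the example; the only Hive features relied on are the EXPLAIN and EXPLAIN EXTENDED statements and the -e option of the hive command line, which runs a quoted statement.

```python
def explain_statement(query, extended=False):
    """Wrap a HiveQL query in EXPLAIN (or EXPLAIN EXTENDED)."""
    body = query.strip().rstrip(";").strip()
    if not body:
        raise ValueError("empty query")
    keyword = "EXPLAIN EXTENDED" if extended else "EXPLAIN"
    return "{} {};".format(keyword, body)


def hive_cli_args(statement):
    """Argument list for running one statement through the hive CLI."""
    return ["hive", "-e", statement]


if __name__ == "__main__":
    # `src` is a placeholder table name, not one mentioned in the thread.
    stmt = explain_statement("SELECT key, count(*) FROM src GROUP BY key;")
    print(" ".join(hive_cli_args(stmt)))
```

Note that the output is a plan (stages, operators, and their dependencies), not generated source code: the plan is what tells you how many MapReduce jobs the query is turned into and what each of them does.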