Re: show table throwing strange error
Can you also check your hive-site.xml? Is it properly formatted, and are the connection strings correct?

Sent from my iPhone

On Jun 19, 2013, at 6:30 PM, Mohammad Tariq donta...@gmail.com wrote:

Hello Anurag,

Thank you for the quick response. The log file is full of such lines, along with a trace that says it is some parsing-related issue. The strange thing is that here I can see '\00', but on the CLI it was just ' '. I am wondering what's wrong with show tables;

    line 1:79 character '\00' not supported here
    line 1:80 character '\00' not supported here
    at org.apache.hadoop.hive.ql.parse.ParseDriver.parse(ParseDriver.java:446)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:416)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:336)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:909)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:258)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:215)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:689)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:557)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Thanks again.
Warm Regards,
Tariq
cloudfront.blogspot.com

On Thu, Jun 20, 2013 at 6:53 AM, Anurag Tangri tangri.anu...@gmail.com wrote:

Did you check your Hive query log under /tmp to see if it says something?

Sent from my iPhone

On Jun 19, 2013, at 5:53 PM, Mohammad Tariq donta...@gmail.com wrote:

Hello list,

I have a Hive (0.9.0) setup on my Ubuntu box running hadoop-1.0.4. Everything was going smoothly till now, but today when I issued show tables I got a strange error on the CLI.
Here is the error:

    hive> show tables;
    FAILED: Parse Error: line 1:0 character '' not supported here
    line 1:1 character '' not supported here
    line 1:2 character '' not supported here
    ...
    line 1:63 character '' not supported here
    line 1:64 character
Re: Create Index Map/Reduce failure
Try:

    export HADOOP_HEAPSIZE=1000    (1 GB)

before running your Hive query, and keep increasing the size until it is large enough. Another option is setting -Xmx in hive-env.sh.

Sent from my iPhone

On Oct 27, 2012, at 12:21 PM, Peter Marron peter.mar...@trilliumsoftware.com wrote:

Hi,

I have a fairly low-end machine running Ubuntu 12.04. I'm running Hadoop in pseudo-distributed mode and storing in HDFS. I have a file which is 137 GB with 36.6 million rows and 466 columns. I am trying to create an index on this table in Hive with these commands. (I seem to have to build the index in two separate commands.)

    LOAD DATA INPATH 'E3/score.csv' OVERWRITE INTO TABLE score;

    CREATE INDEX bigIndex
    ON TABLE score(Ath_Seq_Num)
    AS 'org.apache.hadoop.hive.ql.index.compact.CompactIndexHandler'
    WITH DEFERRED REBUILD;

    ALTER INDEX bigIndex ON score REBUILD;

The resulting Map/Reduce job is failing with an OutOfMemoryError. I attach the end of the only log which seems to contain any useful information about the error. When I googled a bit I found a suggestion that it could be mapred.child.java.opts, so I added this to my mapred-site.xml (and it increased the maximum from 200 MB to 1000 MB):

    <property>
      <name>mapred.child.java.opts</name>
      <value>-Xmx1000m</value>
    </property>

But this didn't seem to help. I also saw some mention that I should decrease io.sort.mb, so I reduced this to 1 MB. However, this didn't seem to help either. Maybe this is the wrong list for this question and I should post to common-u...@hadoop.apache.org? Any help appreciated.

Peter Marron

    2012-10-25 15:55:27,429 INFO org.apache.hadoop.mapred.ReduceTask: In-memory merge complete: 511 files left.
    2012-10-25 15:55:27,432 WARN org.apache.hadoop.fs.FileSystem: localhost is a deprecated filesystem name. Use hdfs://localhost/ instead.
    2012-10-25 15:55:27,449 INFO org.apache.hadoop.mapred.Merger: Merging 511 sorted segments
    2012-10-25 15:55:27,455 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 511 segments left of total size: 173620406 bytes
    2012-10-25 15:55:27,885 INFO org.apache.hadoop.mapred.ReduceTask: Merged 511 segments, 173620406 bytes to disk to satisfy reduce memory limit
    2012-10-25 15:55:27,885 INFO org.apache.hadoop.mapred.ReduceTask: Merging 1 files, 173619390 bytes from disk
    2012-10-25 15:55:27,886 INFO org.apache.hadoop.mapred.ReduceTask: Merging 0 segments, 0 bytes from memory into reduce
    2012-10-25 15:55:27,886 INFO org.apache.hadoop.mapred.Merger: Merging 1 sorted segments
    2012-10-25 15:55:27,888 INFO org.apache.hadoop.mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 173619386 bytes
    2012-10-25 15:55:27,895 INFO ExecReducer: maximum memory = 932118528
    2012-10-25 15:55:27,895 INFO ExecReducer: conf classpath = [file:/data/tmp/mapred/local/taskTracker/pmarron/jobcache/job_201210251304_0001/jars/classes, file:/data/tmp/mapred/local/taskTracker/pmarron/jobcache/job_201210251304_0001/jars/, file:/data/tmp/mapred/local/taskTracker/pmarron/jobcache/job_201210251304_0001/attempt_201210251304_0001_r_93_3/]
    2012-10-25 15:55:27,896 INFO ExecReducer: thread classpath = [file:/data/hadoop-1.0.3/conf/, file:/usr/lib/jvm/java-7-openjdk-amd64/jre/lib/tools.jar, file:/data/hadoop-1.0.3/, file:/data/hadoop-1.0.3/hadoop-core-1.0.3.jar, file:/data/hadoop-1.0.3/lib/asm-3.2.jar, file:/data/hadoop-1.0.3/lib/aspectjrt-1.6.5.jar, file:/data/hadoop-1.0.3/lib/aspectjtools-1.6.5.jar, file:/data/hadoop-1.0.3/lib/commons-beanutils-1.7.0.jar, file:/data/hadoop-1.0.3/lib/commons-beanutils-core-1.8.0.jar, file:/data/hadoop-1.0.3/lib/commons-cli-1.2.jar, file:/data/hadoop-1.0.3/lib/commons-codec-1.4.jar, file:/data/hadoop-1.0.3/lib/commons-collections-3.2.1.jar, file:/data/hadoop-1.0.3/lib/commons-configuration-1.6.jar, file:/data/hadoop-1.0.3/lib/commons-daemon-1.0.1.jar, file:/data/hadoop-1.0.3/lib/commons-digester-1.8.jar, file:/data/hadoop-1.0.3/lib/commons-el-1.0.jar, file:/data/hadoop-1.0.3/lib/commons-httpclient-3.0.1.jar, file:/data/hadoop-1.0.3/lib/commons-io-2.1.jar, file:/data/hadoop-1.0.3/lib/commons-lang-2.4.jar, file:/data/hadoop-1.0.3/lib/commons-logging-1.1.1.jar, file:/data/hadoop-1.0.3/lib/commons-logging-api-1.0.4.jar, file:/data/hadoop-1.0.3/lib/commons-math-2.1.jar, file:/data/hadoop-1.0.3/lib/commons-net-1.4.1.jar, file:/data/hadoop-1.0.3/lib/core-3.1.1.jar, file:/data/hadoop-1.0.3/lib/hadoop-capacity-scheduler-1.0.3.jar, file:/data/hadoop-1.0.3/lib/hadoop-fairscheduler-1.0.3.jar, file:/data/hadoop-1.0.3/lib/hadoop-thriftfs-1.0.3.jar, file:/data/hadoop-1.0.3/lib/hsqldb-1.8.0.10.jar, file:/data/hadoop-1.0.3/lib/jackson-core-asl-1.8.8.jar, file:/data/hadoop-1.0.3/lib/jackson-mapper-asl-1.8.8.jar, file:/data/hadoop-1.0.3/lib/jasper-compiler-5.5.12.jar,
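If the OutOfMemoryError is coming from the reduce task itself (as the ExecReducer lines above suggest) rather than from the client JVM, the per-task heap can also be raised for a single session from the Hive CLI instead of editing mapred-site.xml. A minimal sketch, assuming the same Hadoop 1.x (MR1) property name Peter already uses:

    -- applies only to jobs launched from this Hive session
    set mapred.child.java.opts=-Xmx1000m;
    ALTER INDEX bigIndex ON score REBUILD;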
How to run big queries in an optimized way?
Hi,

We have datasets which are about 10-15 TB in size, and we want to run Hive queries on top of this input data. What are the ways to reduce stress on our cluster when running many such big queries (including joins) in parallel? How do we enable compression for intermediate Hive output? How do we keep the job cache from growing too high? In short: what are the best practices for huge queries on Hive? Any inputs are really appreciated!

Thanks,
JJ

Sent from my iPhone
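For the intermediate-compression question specifically, the knobs of this era look roughly like the following. This is a sketch, not a tuned configuration: it assumes Hadoop 1.x (MR1) property names and a Snappy codec installed on the cluster.

    -- compress data written between the MR stages of a query
    set hive.exec.compress.intermediate=true;
    -- compress map output within each job
    set mapred.compress.map.output=true;
    set mapred.map.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
    -- compress the final output of the query
    set hive.exec.compress.output=true;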
Re: Creating Hive table by pulling data from mainFrames
Sqoop is a nice tool to get data to/from DB2 into Hive, and then you can run Hive queries on top of it. A lot of people are using it for connectivity between traditional DBs and Hadoop.

Sent from my iPhone

On Jul 26, 2012, at 11:32 AM, Siddharth Tiwari siddharth.tiw...@live.com wrote:

Hey Team,

We have huge tables in Mainframe DB2. Can someone tell me if it's possible to pull data from DB2 on the Mainframe into Hive, use MapReduce to sort the data in Hive, and then push it back to the Mainframe table? Please help.

Cheers !!!
Siddharth Tiwari
Have a refreshing day !!!
"Every duty is holy, and devotion to duty is the highest form of worship of God."
Maybe other people will try to limit me but I don't limit myself
Re: Text file with ctrl char as delimiter
Hi Sam,

Could you try '\001' instead of '\u0001'?

Sent from my iPhone

On Jun 20, 2012, at 3:57 PM, Sam William sa...@stumbleupon.com wrote:

Mark, I did not get any errors, but it seemed like the splitting was happening with u, 0, 1 as delimiters. My temporary fix is to define a table with one field and create a view with a split function on top. I don't want this additional overhead if I can make the input format work with the ctrl char as the delimiter.

Mapred Learn, yes, I did have the word 'external' in the create table statement.

Thanks,
Sam

On Jun 20, 2012, at 6:24 AM, Mark Grover wrote:

Sam, if you can please post a row or two of your data along with any errors you are getting, that would be helpful.

Mark

----- Original Message -----
From: Mapred Learn mapred.le...@gmail.com
To: user@hive.apache.org
Cc: user@hive.apache.org
Sent: Tuesday, June 19, 2012 8:34:22 PM
Subject: Re: Text file with ctrl char as delimiter

Did you add the word external in the create table, i.e. create external table (...blah...blah...)?

Sent from my iPhone

On Jun 19, 2012, at 4:15 PM, Sam William sa...@stumbleupon.com wrote:

Hi, I have a data file that is exactly equivalent to a CSV, except that the field delimiter is a control character, specifically '\u0001'. How can I create an external table in Hive for this data? For instance:

    create table ... blah .blah ... row format delimited fields terminated by '\u0001' stored as textfile location '/tmp/myloc';

did not work.

Thanks,
Sam William sa...@stumbleupon.com
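Putting the two suggestions in this thread together, a working statement would look roughly like this. It is only a sketch (the table and column names are placeholders), and it uses the octal escape '\001' (Ctrl-A, Hive's default field delimiter) rather than the Java-style '\u0001':

    create external table ctrl_a_data (    -- hypothetical table/columns
      col1 string,
      col2 string,
      col3 string
    )
    row format delimited fields terminated by '\001'
    stored as textfile
    location '/tmp/myloc';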
Re: Want to improve the performance for execution of Hive Jobs.
Try setting this value to your block size. For a 128 MB block size (the value is in bytes):

    set mapred.min.split.size=134217728;

Sent from my iPhone

On May 7, 2012, at 10:11 PM, Bhavesh Shah bhavesh25s...@gmail.com wrote:

Thanks Nitin for your reply. In short, my task is:
1) Initially I want to import the data from MS SQL Server into HDFS using Sqoop.
2) Through Hive I process the data and generate the result in one table.
3) That result table from Hive is exported back to MS SQL Server.

Actually, the data which I am importing from MS SQL Server is very large (about 500,000 entries in one table, and likewise I have 30 tables). For this I have written a task in Hive which contains only queries (and each query uses a lot of joins). Due to this the performance is very poor on my single local machine (it takes about 3 hours to execute completely). I have observed that when I submitted a single query to the Hive CLI, it took 10-11 jobs to execute completely.

Regarding set mapred.min.split.size and set mapred.max.split.size: should these values be set in a bootstrap action while submitting jobs to Amazon EMR? What value should be set? I don't know.

--
Regards,
Bhavesh Shah

On Tue, May 8, 2012 at 10:31 AM, Nitin Pawar nitinpawar...@gmail.com wrote:

1) Check the jobtracker URL to see how many maps/reducers have been launched.
2) If you have a large dataset and want to execute it fast, set mapred.min.split.size and mapred.max.split.size to an optimal value so that more mappers will be launched and will finish faster.
3) If you are doing joins, there are different ways to go according to the data you have and its size.

It will be helpful if you can let us know your data sizes and query details.

On Tue, May 8, 2012 at 10:07 AM, Bhavesh Shah bhavesh25s...@gmail.com wrote:

Hello all,

I have written Hive JDBC code and created a JAR of it. I am running that JAR on a 10-node cluster, but the performance is the same as on a single node. What can I do to improve the performance of Hive jobs? Is there any configuration setting to set before submitting Hive jobs to the cluster? One more thing I want to know: how can we tell whether a job is running on the whole cluster? Please let me know if anyone knows about it.

--
Regards,
Bhavesh Shah

--
Nitin Pawar
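A sketch of the two split-size settings Nitin mentions, assuming MR1 property names; both values are in bytes and purely illustrative:

    -- more, smaller splits (hence more mappers): cap the split size at 64 MB
    set mapred.max.split.size=67108864;
    -- or fewer, larger mappers: raise the floor to the 128 MB block size
    set mapred.min.split.size=134217728;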
Re: Sequence generated Id in Hive
Hive 0.8.0 has a row_sequence UDF, but it generates unique sequence ids only per mapper, not across the whole job.

Sent from my iPhone

On Jan 27, 2012, at 8:37 AM, Anson Abraham anson.abra...@gmail.com wrote:

Does Hive support automated sequence id generation, or does a UDF have to be created for each object that is created?
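For reference, that UDF lives in the hive-contrib jar and has to be registered before use. A minimal sketch; the jar path and table name are placeholders:

    add jar /path/to/hive-contrib-0.8.0.jar;   -- path is hypothetical
    create temporary function row_sequence as 'org.apache.hadoop.hive.contrib.udf.UDFRowSequence';
    -- ids are only unique within a single mapper, per the caveat above
    select row_sequence() as id, name from some_table;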
Re: Changing Hive Job Log Path
Try changing one of the following properties, each in the file listed before it:

    hive-exec-log4j.properties:  hive.log.dir=/tmp/${user.name}
    hive-log4j.properties:       hive.log.dir=/tmp/${user.name}

On Thu, Jan 26, 2012 at 3:40 PM, Tucker, Matt matt.tuc...@disney.com wrote:

We've started noticing that some of the Hive job logs (hive_job_log_mtucker_201201251355_374625982.txt) can become very big, some upwards of 800 MB. I've tried modifying the Log4J settings to write to a different directory, but the job logs still end up writing to /tmp/`whoami`/. Am I overlooking a configuration setting that will let me put these into another directory that has more available space on its mount point?

Thanks

Matt Tucker
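One assumption worth verifying here: the hive_job_log_* files Matt describes are written by Hive's query logger rather than by log4j, in which case the property below (settable per session or in hive-site.xml) is the one that matters. The directory is a placeholder.

    set hive.querylog.location=/data/hive/querylogs;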
hive case and group-by statement
Hi,

I have the following sample query:

    select A,
           case when B in (1, 2) then 'Type A' else 'Type B' end as B,
           C
    from table_a
    group by A, B, C;

But when I run this query, it gives this error:

    FAILED: Error in semantic analysis: Line 95:0 Invalid table alias or column reference entity

The error comes from the B defined after 'AS' in the CASE statement. How can I make this group-by work?

Thanks,
-JJ
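Hive of this era does not let GROUP BY refer to a select-list alias, so one common fix is to repeat the CASE expression in the GROUP BY clause (and to rename the alias so it no longer shadows the underlying column). A sketch:

    select A,
           case when B in (1, 2) then 'Type A' else 'Type B' end as B_type,
           C
    from table_a
    group by A,
             case when B in (1, 2) then 'Type A' else 'Type B' end,
             C;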
Re: LOAD data into hive table
A Hive external table is your solution.

Sent from my iPhone

On Oct 5, 2011, at 11:29 AM, Navin Gupta navin.gu...@kindsight.net wrote:

Hi,

Loading data into a Hive table currently moves the data if the location is HDFS (non-LOCAL). Is there an option that would allow copying the data instead?

Thanks,
Navin
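A sketch of the external-table approach: point the table definition at the data's existing HDFS directory, and nothing is moved or copied. The schema and path below are placeholders.

    create external table events (     -- hypothetical schema
      id int,
      payload string
    )
    row format delimited fields terminated by '\t'
    location '/user/navin/events';     -- the data stays where it is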
How to write a UDAF ?
Hi,

Could somebody point me to a wiki about how to create a UDAF in Hive? My problem is how to include the Hive code in my build path so I can import exec.UDAF. I downloaded the Hive code from Cloudera's distribution. Any help is appreciated!

-JJ

Sent from my iPhone
Re: does hive support Sequence File format ?
Thanks Tim!

On Tue, Jun 28, 2011 at 10:01 AM, Tim Spence yogi.wan.ken...@gmail.com wrote:

It should be here: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL

For everyone's benefit, the old wiki page you linked to has a link to a page directory on the new wiki here: https://cwiki.apache.org/confluence/pages/listpages-dirview.action?key=Hive

Tim

On Tue, Jun 28, 2011 at 9:54 AM, Mapred Learn mapred.le...@gmail.com wrote:

Hi, looks like the documentation link you guys provided earlier has moved to some other location: http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Create_Table

On Thu, Feb 17, 2011 at 2:50 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

On Thu, Feb 17, 2011 at 5:48 PM, Karthik karthik_...@yahoo.com wrote:

I have a requirement to support data from the SequenceFile KEY (not the VALUE) to be used by a Hive table. How can I do this? From the code, it looks like only the VALUE part is available to Hive. Please help.

Regards.

From: Mapred Learn mapred.le...@gmail.com
To: user@hive.apache.org
Cc: user@hive.apache.org
Sent: Thu, February 17, 2011 1:48:07 PM
Subject: Re: does hive support Sequence File format ?

Thanks Ted! Just found it a few minutes ago.

On Feb 17, 2011, at 1:46 PM, Ted Yu yuzhih...@gmail.com wrote:

Look under http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Create_Table

On Thu, Feb 17, 2011 at 12:00 PM, Mapred Learn mapred.le...@gmail.com wrote:

Hi, I was wondering if Hive supports the Sequence File format. If yes, could you point me to some documentation about how to use seq files in Hive?

Thanks,
-JJ

This has come up two or three times on the ML. It can be done with InputFormats.

Edward
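For completeness, the DDL the manual describes is just a STORED AS clause; Hive reads only the value side of each sequence-file record, which is the limitation Karthik ran into with keys. A minimal sketch with placeholder names:

    create table seq_data (line string)    -- hypothetical table
    row format delimited fields terminated by '\001'
    stored as sequencefile;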
Re: Resend - how to load sequence file with decimal data
Hi Steven,

With LOAD DATA you also give some info about the data. As in Tom White's book:

    create external table external_table (dummy string) location ...;
    load data ...

Now dummy string is a field in this data. Similarly, what I have is a decimal field. How do I specify it in the create command?

On Fri, Jun 24, 2011 at 5:12 PM, Steven Wong sw...@netflix.com wrote:

Not sure if this is what you're asking for: Hive has a LOAD DATA command. There is no decimal data type.

From: Mapred Learn [mailto:mapred.le...@gmail.com]
Sent: Thursday, June 23, 2011 7:25 AM
To: user@hive.apache.org; mapreduce-u...@hadoop.apache.org; cdh-u...@cloudera.org
Subject: Resend - how to load sequence file with decimal data

Hi,

I have a sequence file where the value is text with delimited data, and some fields are decimal fields, e.g. decimal(16,6), with a sample value of 123.456735. How do I upload such a sequence file into Hive, and what should I give in the table definition for decimal values like these?

Thanks in advance!

Sent from my iPhone
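Since Hive at this point had no DECIMAL type (it arrived later, in Hive 0.11), the usual workaround was to declare such fields as DOUBLE, or as STRING when the exact digits must be preserved. A sketch with placeholder names:

    create table amounts (               -- hypothetical table
      id string,
      amt double                         -- stand-in for decimal(16,6); use string to keep exact digits
    )
    row format delimited fields terminated by '\001'
    stored as sequencefile;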
Loading seq file into hive
Hi,

I have seq files with the key as the line number and the value as Ctrl-B delimited text. A sample of the values is:

    45454^B567^Brtrt^B-7.8
    56577^B345^Bdrtd^B-0.9

When I create a table like:

    create table temp_seq (no int, code string, rank string, amt string)
    row format delimited fields terminated by '\002'
    lines terminated by '\n'
    stored as sequencefile;

it creates the table. When I load a file with:

    load data inpath '/tmp/test' into table temp_seq;

even this succeeds. But when I do a select *, I don't see the fields that were loaded as delimited text. Instead, the output is split at some weird boundaries, with several fields of the seq file's text combined together and the rest of the fields at the end coming out as NULL, like this:

    45454567 rtrt-7.8 NULL NULL
    56577345 drtd-0.9 NULL NULL

How can I get this data to correspond to the exact fields in the sequence file's values?

Thanks in advance,
-JJ
How to load a sequence file with decimal data to hive ?
Hi,

I have a sequence file where I have delimited data, and some fields are decimal, e.g. decimal(16,6), with a sample value of 123.456735. How do I upload such a sequence file, and what should I give in the table definition for decimal values like these?

Thanks in advance!

Sent from my iPhone