RE: Hive 0.11.0 | Issue with ORC Tables

2013-09-20 Thread Savant, Keshav
Hi Nitin, thanks for your reply. We were under the impression that the codec would be responsible for the ORC format conversion as well. However, per your reply it seems that a conversion from plain CSV to ORC is required before the Hive upload. We got some leads from the following URLs

Re: Hive 0.11.0 | Issue with ORC Tables

2013-09-20 Thread Nitin Pawar
Keshav, Owen has provided the solution already. That's the easiest of the lot, and it comes from the master who wrote ORC himself :) To put it in simple words, what he has suggested is: create a staging table based on the default text data format, then from the staging table load the data into an ORC
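The staging-table approach Owen suggested can be sketched in HiveQL roughly as follows (table and column names are hypothetical; ORC storage is available as of Hive 0.11):

```sql
-- Staging table in the default text format; the raw CSV loads as-is.
CREATE TABLE staging_logs (
  id  INT,
  msg STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Load the raw CSV file into the staging table (no conversion happens here).
LOAD DATA LOCAL INPATH '/tmp/logs.csv' INTO TABLE staging_logs;

-- Target table stored as ORC.
CREATE TABLE logs_orc (
  id  INT,
  msg STRING
)
STORED AS ORC;

-- The INSERT ... SELECT runs a job that rewrites the text data as ORC files.
INSERT OVERWRITE TABLE logs_orc SELECT * FROM staging_logs;
```

The conversion happens in the final INSERT ... SELECT, which is why loading the CSV directly into an ORC table (or relying on a codec) does not work.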

How to load /t /n file to Hive

2013-09-20 Thread Raj Hadoop
Hi, I have a file which is delimited by a tab. Also, some fields in the file contain a tab (\t) character and a newline (\n) character. Is there any way to load this file using the Hive LOAD command? Or do I have to use a custom MapReduce InputFormat in Java?

Re: How to load /t /n file to Hive

2013-09-20 Thread Nitin Pawar
If your data contains newline characters, it's better to write a custom MapReduce job and convert the data so that each record sits on a single line, removing the unwanted characters from the column separators as well and leaving a single newline character per record. On Sat, Sep 21, 2013 at 12:38 AM, Raj Hadoop hadoop...@yahoo.com wrote:

Re: How to load /t /n file to Hive

2013-09-20 Thread Raj Hadoop
Please note that there is an escape character in the fields where the \t and \n are present. From: Raj Hadoop hadoop...@yahoo.com To: Hive user@hive.apache.org Sent: Friday, September 20, 2013 3:04 PM Subject: How to load /t /n file to Hive Hi, I have a file

Re: How to load /t /n file to Hive

2013-09-20 Thread Raj Hadoop
Hi Nitin, thanks for the reply. I have a huge file on Unix. Per the file definition, it is a tab-separated file of fields, but I am sure that within some fields there are newline characters. How can I find such a record? It is a huge file. Is there some command? Thanks,
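Outside Hive one would typically scan for such records with a shell tool, but the check can also be sketched in HiveQL by loading each physical line as a single string and counting tab-separated fields (table name and the expected field count of 5 are hypothetical):

```sql
-- With no ROW FORMAT clause, Hive's default field delimiter is \001 (Ctrl-A),
-- which we assume does not occur in the data, so each physical line loads
-- as one string.
CREATE TABLE raw_lines (line STRING)
STORED AS TEXTFILE;

LOAD DATA LOCAL INPATH '/tmp/data.tsv' INTO TABLE raw_lines;

-- A record whose field contains an embedded newline is split across physical
-- lines, so those lines carry fewer than the expected 5 tab-separated fields.
SELECT line
FROM raw_lines
WHERE size(split(line, '\t')) != 5;
```

Any line returned is either a broken fragment of a record or has a malformed separator, which narrows down where the embedded newlines are.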

Re: How to load /t /n file to Hive

2013-09-20 Thread Gabriel Eisbruch
Hi, one way we solved that problem is to transform the data while creating/loading it; for example, we applied URL encoding to each field at create time. Thanks, Gabo. 2013/9/20 Raj Hadoop hadoop...@yahoo.com Hi Nitin, Thanks for the reply. I have a huge file in unix.

Loading data into partition taking seven times total of (map+reduce) on highly skewed data

2013-09-20 Thread Stephen Boesch
We have a small (3 GB / 280M rows) table with 435 partitions that is highly skewed: one partition has nearly 200M rows, two others have nearly 40M apiece, and the remaining 432 together hold less than 1% of the total table size. So the skew is something to be addressed. However, even given that,
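One commonly suggested way to structure such a partitioned load (not confirmed as the fix in this thread; table and column names are hypothetical) is a dynamic-partition INSERT that distributes rows by the partition key, so each reducer writes whole partitions rather than many reducers each writing fragments of every partition:

```sql
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Route all rows of a given partition value to the same reducer, so each
-- reducer writes complete files for its partitions.
INSERT OVERWRITE TABLE events PARTITION (dt)
SELECT col1, col2, dt
FROM events_staging
DISTRIBUTE BY dt;
```

The trade-off with skew like this is that DISTRIBUTE BY funnels the ~200M-row partition through a single reducer, exchanging a small-files problem for one hot reducer.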

Re: Loading data into partition taking seven times total of (map+reduce) on highly skewed data

2013-09-20 Thread Stephen Boesch
Another detail: ~400 mappers, 64 reducers. 2013/9/20 Stephen Boesch java...@gmail.com We have a small (3GB /280M rows) table with 435 partitions that is highly skewed: one partition has nearly 200M, two others have nearly 40M apiece, then the remaining 432 have all together less than 1%

Re: How to load /t /n file to Hive

2013-09-20 Thread Raj Hadoop
Hi Gabo, are you suggesting using java.net.URLEncoder? Can you be more specific? I have a lot of fields in the file that are not only URL-related; some are text fields that contain newline characters. Thanks, Raj From: Gabriel Eisbruch

Re: How to load /t /n file to Hive

2013-09-20 Thread Gabriel Eisbruch
Hi Raj, URL encoding is a good way to encode the data and be sure that all special characters are encoded (for example, \n is encoded as %0A). The field does not need to be a URL to encode it (you could use another encoder, but we had very good results with URLEncoder, always the best way
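Gabo's encode-on-write approach implies decoding at read time. In Hive this can be sketched without a custom UDF using the built-in reflect() function to call the static java.net.URLDecoder.decode method (table and column names are hypothetical):

```sql
-- Decode a URL-encoded field back to its original text at query time,
-- restoring embedded tabs (%09) and newlines (%0A).
SELECT reflect('java.net.URLDecoder', 'decode', encoded_msg, 'UTF-8') AS msg
FROM encoded_table;
```

The symmetric call, reflect('java.net.URLEncoder', 'encode', raw_msg, 'UTF-8'), can be used on the write path so the stored text file never contains literal tabs or newlines inside a field.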