can you provide whats your data and what you want it to look like ?

On Fri, Jun 14, 2013 at 12:31 PM, Hamza Asad <hamza.asa...@gmail.com> wrote:

> which UDF? it does not take to_date(event_date) column
>
>
> On Fri, Jun 14, 2013 at 11:54 AM, Nitin Pawar <nitinpawar...@gmail.com>wrote:
>
>> use already existing UDFs to split or transform your values the way you
>> want
>>
>>
>> On Fri, Jun 14, 2013 at 12:09 PM, Hamza Asad <hamza.asa...@gmail.com>wrote:
>>
>>> OIC. I got it. Thanx alot nitin :). One more thing i want to ask related
>>> this issue, if old table contains event_date in format "2012-06-24
>>> 06:04:11.9" then how can i partition it according to date part only? As
>>> partition column does not accepts to_date(event_date) form.
>>>
>>>
>>> On Thu, Jun 13, 2013 at 5:07 PM, Nitin Pawar <nitinpawar...@gmail.com>wrote:
>>>
>>>> If the input column value is NULL or empty string, the row will be put 
>>>> into a special partition, whose name is controlled by the hive parameter 
>>>> hive.exec.default.dynamic.partition.name. The default value is 
>>>> `__HIVE_DEFAULT_PARTITION__`. Basically this partition will contain all
>>>> "bad" rows whose value are not valid partition names.
>>>>
>>>> so basically you do following things
>>>>
>>>> when you create a partitioned table, your partitioned column is normally 
>>>> at the end of the table, so when you are inserting data into this 
>>>> partitioned table, I would recommend using the column names in place 
>>>> select * from
>>>>
>>>> so your insert query should look like
>>>>
>>>> set hive.exec.dynamic.partition=true;
>>>>
>>>>
>>>>
>>>> set hive.exec.dynamic.partition.mode=nonstrict;
>>>>
>>>>
>>>>
>>>>
>>>> insert overwrite table new_table partition(event_date) select col1, col2 
>>>> .... coln, event_date from old_table;
>>>>
>>>>
>>>>
>>>> On Thu, Jun 13, 2013 at 5:24 PM, Hamza Asad <hamza.asa...@gmail.com>wrote:
>>>>
>>>>> when i browse it in browser, all the data is in *
>>>>> event_date=__HIVE_DEFAULT_PARTITION__<http://10.0.0.14:50075/browseDirectory.jsp?dir=%2Fvar%2Flog%2Fpring%2Fhive%2Fwarehouse%2Fnydus.db%2Fnew_rc_partition_cluster_table%2Fevent_date%3D__HIVE_DEFAULT_PARTITION__&namenodeInfoPort=50070>
>>>>> *, rest of the files does not contains data
>>>>>
>>>>>
>>>>> On Thu, Jun 13, 2013 at 4:52 PM, Nitin Pawar 
>>>>> <nitinpawar...@gmail.com>wrote:
>>>>>
>>>>>> what do you mean when you say "it wont split correctly" ?
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 13, 2013 at 5:19 PM, Hamza Asad 
>>>>>> <hamza.asa...@gmail.com>wrote:
>>>>>>
>>>>>>> what if i have data of more then 500 days then how can i create
>>>>>>> partition on date column by specifying each and every date? (i knw that
>>>>>>> does not happens in dynamic partition but on dynamic partition, it wont
>>>>>>> splits correctly).
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jun 13, 2013 at 4:12 PM, Nitin Pawar <
>>>>>>> nitinpawar...@gmail.com> wrote:
>>>>>>>
>>>>>>>> you can partition existing table unless the hdfs data is laid out
>>>>>>>> in partitioned fashion.
>>>>>>>> your best bet is create a new partitioned table
>>>>>>>> enable dynamic paritionining
>>>>>>>> read from old table and write into new table
>>>>>>>>
>>>>>>>> you can verify the new partitions by using command "show partitions"
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jun 13, 2013 at 4:40 PM, Hamza Asad <hamza.asa...@gmail.com
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> now i created partition table like
>>>>>>>>> *CREATE TABLE new_rc_partition_cluster_table(
>>>>>>>>>
>>>>>>>>>   id int,
>>>>>>>>>   event_id int,
>>>>>>>>>   user_id BIGINT,
>>>>>>>>>
>>>>>>>>>   intval_1 int ,
>>>>>>>>>   intval_2 int,
>>>>>>>>>   intval_3 int,
>>>>>>>>>   intval_4 int,
>>>>>>>>>   intval_5 int,
>>>>>>>>>   intval_6 int,
>>>>>>>>>   intval_7 int,
>>>>>>>>>   intval_8 int,
>>>>>>>>>   intval_9 int,
>>>>>>>>>   intval_10 int,
>>>>>>>>>   intval_11 int,
>>>>>>>>>   intval_12 int,
>>>>>>>>>   intval_13 int,
>>>>>>>>>   intval_14 int,
>>>>>>>>>   intval_15 int,
>>>>>>>>>   intval_16 int,
>>>>>>>>>   intval_17 int,
>>>>>>>>>   intval_18 int,
>>>>>>>>>   intval_19 int,
>>>>>>>>>   intval_20 int,
>>>>>>>>>   intval_21 int,
>>>>>>>>>   intval_22 int,
>>>>>>>>>   intval_23 int,
>>>>>>>>>   intval_24 int,
>>>>>>>>>   intval_25 int,
>>>>>>>>>   intval_26 int)
>>>>>>>>>   PARTITIONED BY (event_date string)
>>>>>>>>>
>>>>>>>>> CLUSTERED BY(id) INTO 256 BUCKETS
>>>>>>>>> ROW FORMAT DELIMITED
>>>>>>>>>  FIELDS TERMINATED BY ','
>>>>>>>>> STORED AS RCFile;
>>>>>>>>> *
>>>>>>>>>
>>>>>>>>> rest of the commands are same. But this time INSERT OVERWRITE
>>>>>>>>> query executed and took time but when i queries from that table, it 
>>>>>>>>> results
>>>>>>>>> none as it does not contains data. why is this so? and also please 
>>>>>>>>> tell me
>>>>>>>>> how can i partition my existing table dynamicaly on date so that data
>>>>>>>>> splits equally without mentioning date explicitly?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Jun 12, 2013 at 6:50 PM, Nitin Pawar <
>>>>>>>>> nitinpawar...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> you did not create partitioned table. You just created a bucketed
>>>>>>>>>> table.
>>>>>>>>>>
>>>>>>>>>> refer to partitioned table created
>>>>>>>>>> something like
>>>>>>>>>> partitioned by (event_date string)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Jun 12, 2013 at 7:17 PM, Hamza Asad <
>>>>>>>>>> hamza.asa...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> i have created table after enabling dynamic partition. i
>>>>>>>>>>> partitioned it on date but it is not splitting data datewise. Below 
>>>>>>>>>>> is the
>>>>>>>>>>> query of table created and data insert
>>>>>>>>>>> CREATE TABLE rc_partition_cluster_table(
>>>>>>>>>>>   id int,
>>>>>>>>>>>   event_id int,
>>>>>>>>>>>   user_id BIGINT,
>>>>>>>>>>>   event_date string,
>>>>>>>>>>>   intval_1 int )
>>>>>>>>>>> CLUSTERED BY(id) INTO 256 BUCKETS
>>>>>>>>>>> ROW FORMAT DELIMITED
>>>>>>>>>>>  FIELDS TERMINATED BY ','
>>>>>>>>>>> STORED AS RCFile;
>>>>>>>>>>>
>>>>>>>>>>> set hive.exec.dynamic.partition=true;
>>>>>>>>>>> set hive.exec.dynamic.partition.mode=nonstrict;
>>>>>>>>>>> set hive.exec.max.dynamic.partitions=1000;
>>>>>>>>>>> set hive.exec.max.dynamic.partitions.pernode=1000;
>>>>>>>>>>>
>>>>>>>>>>> INSERT OVERWRITE TABLE rc_partition_cluster_table Partition
>>>>>>>>>>> (event_date)
>>>>>>>>>>> SELECT * FROM events_details;
>>>>>>>>>>>
>>>>>>>>>>> why it is not working fine?
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Nitin Pawar
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> *Muhammad Hamza Asad*
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Nitin Pawar
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> *Muhammad Hamza Asad*
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Nitin Pawar
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> *Muhammad Hamza Asad*
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Nitin Pawar
>>>>
>>>
>>>
>>>
>>> --
>>> *Muhammad Hamza Asad*
>>>
>>
>>
>>
>> --
>> Nitin Pawar
>>
>
>
>
> --
> *Muhammad Hamza Asad*
>



-- 
Nitin Pawar

Reply via email to