thats a wrong query insert into table table_name partition (partition_names) select cols, do_data(event_date) from table
this is how it should look like hive will take care of inserting into respective partitions after you enable dynamic partitions On Fri, Jun 14, 2013 at 1:21 PM, Hamza Asad <hamza.asa...@gmail.com> wrote: > i 'm executing following command but it fail to recognize partition column > *INSERT INTO TABLE rc_partition_table > PARTITION (to_date(event_date)) > SELECT * FROM events_details* > > > On Fri, Jun 14, 2013 at 12:41 PM, Nitin Pawar <nitinpawar...@gmail.com>wrote: > >> just use hive split function for strings and get the value extracted. >> by the way I am not sure why the to_date function is failing >> stringto_date(string timestamp)Returns the date part of a timestamp >> string: to_date("1970-01-01 00:00:00") = "1970-01-01" >> also, I am sorry i might have misread your question >> what do you mean by "partition column does not accepts >> to_date(event_date) form " >> >> >> >> On Fri, Jun 14, 2013 at 1:04 PM, Hamza Asad <hamza.asa...@gmail.com>wrote: >> >>> sample row of my data is >>> *591269735,1,1022,2012-06-24 >>> 11:08:10.9,null,2,null,null,null,null,null,null,null,null,12,null,null,2,null,null,null,null,4,1,null,null,null,null,null,null >>> * >>> >>> and i want to partition it according to date i.e *2012-06-24 *skiping >>> the hh:mm:ss.ff part >>> >>> >>> On Fri, Jun 14, 2013 at 12:27 PM, Nitin Pawar >>> <nitinpawar...@gmail.com>wrote: >>> >>>> can you provide whats your data and what you want it to look like ? >>>> >>>> >>>> On Fri, Jun 14, 2013 at 12:31 PM, Hamza Asad <hamza.asa...@gmail.com>wrote: >>>> >>>>> which UDF? it does not take to_date(event_date) column >>>>> >>>>> >>>>> On Fri, Jun 14, 2013 at 11:54 AM, Nitin Pawar <nitinpawar...@gmail.com >>>>> > wrote: >>>>> >>>>>> use already existing UDFs to split or transform your values the way >>>>>> you want >>>>>> >>>>>> >>>>>> On Fri, Jun 14, 2013 at 12:09 PM, Hamza Asad >>>>>> <hamza.asa...@gmail.com>wrote: >>>>>> >>>>>>> OIC. I got it. Thanx alot nitin :). One more thing i want to ask >>>>>>> related this issue, if old table contains event_date in format >>>>>>> "2012-06-24 >>>>>>> 06:04:11.9" then how can i partition it according to date part only? As >>>>>>> partition column does not accepts to_date(event_date) form. >>>>>>> >>>>>>> >>>>>>> On Thu, Jun 13, 2013 at 5:07 PM, Nitin Pawar < >>>>>>> nitinpawar...@gmail.com> wrote: >>>>>>> >>>>>>>> If the input column value is NULL or empty string, the row will be put >>>>>>>> into a special partition, whose name is controlled by the hive >>>>>>>> parameter hive.exec.default.dynamic.partition.name. The default value >>>>>>>> is `__HIVE_DEFAULT_PARTITION__`. Basically this partition will contain >>>>>>>> all >>>>>>>> "bad" rows whose value are not valid partition names. >>>>>>>> >>>>>>>> so basically you do following things >>>>>>>> >>>>>>>> when you create a partitioned table, your partitioned column is >>>>>>>> normally at the end of the table, so when you are inserting data into >>>>>>>> this partitioned table, I would recommend using the column names in >>>>>>>> place select * from >>>>>>>> >>>>>>>> so your insert query should look like >>>>>>>> >>>>>>>> set hive.exec.dynamic.partition=true; >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> set hive.exec.dynamic.partition.mode=nonstrict; >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> insert overwrite table new_table partition(event_date) select col1, >>>>>>>> col2 .... coln, event_date from old_table; >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> On Thu, Jun 13, 2013 at 5:24 PM, Hamza Asad <hamza.asa...@gmail.com >>>>>>>> > wrote: >>>>>>>> >>>>>>>>> when i browse it in browser, all the data is in * >>>>>>>>> event_date=__HIVE_DEFAULT_PARTITION__<http://10.0.0.14:50075/browseDirectory.jsp?dir=%2Fvar%2Flog%2Fpring%2Fhive%2Fwarehouse%2Fnydus.db%2Fnew_rc_partition_cluster_table%2Fevent_date%3D__HIVE_DEFAULT_PARTITION__&namenodeInfoPort=50070> >>>>>>>>> *, rest of the files does not contains data >>>>>>>>> >>>>>>>>> >>>>>>>>> On Thu, Jun 13, 2013 at 4:52 PM, Nitin Pawar < >>>>>>>>> nitinpawar...@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> what do you mean when you say "it wont split correctly" ? >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Thu, Jun 13, 2013 at 5:19 PM, Hamza Asad < >>>>>>>>>> hamza.asa...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> what if i have data of more then 500 days then how can i create >>>>>>>>>>> partition on date column by specifying each and every date? (i knw >>>>>>>>>>> that >>>>>>>>>>> does not happens in dynamic partition but on dynamic partition, it >>>>>>>>>>> wont >>>>>>>>>>> splits correctly). >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, Jun 13, 2013 at 4:12 PM, Nitin Pawar < >>>>>>>>>>> nitinpawar...@gmail.com> wrote: >>>>>>>>>>> >>>>>>>>>>>> you can partition existing table unless the hdfs data is laid >>>>>>>>>>>> out in partitioned fashion. >>>>>>>>>>>> your best bet is create a new partitioned table >>>>>>>>>>>> enable dynamic paritionining >>>>>>>>>>>> read from old table and write into new table >>>>>>>>>>>> >>>>>>>>>>>> you can verify the new partitions by using command "show >>>>>>>>>>>> partitions" >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Thu, Jun 13, 2013 at 4:40 PM, Hamza Asad < >>>>>>>>>>>> hamza.asa...@gmail.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> now i created partition table like >>>>>>>>>>>>> *CREATE TABLE new_rc_partition_cluster_table( >>>>>>>>>>>>> >>>>>>>>>>>>> id int, >>>>>>>>>>>>> event_id int, >>>>>>>>>>>>> user_id BIGINT, >>>>>>>>>>>>> >>>>>>>>>>>>> intval_1 int , >>>>>>>>>>>>> intval_2 int, >>>>>>>>>>>>> intval_3 int, >>>>>>>>>>>>> intval_4 int, >>>>>>>>>>>>> intval_5 int, >>>>>>>>>>>>> intval_6 int, >>>>>>>>>>>>> intval_7 int, >>>>>>>>>>>>> intval_8 int, >>>>>>>>>>>>> intval_9 int, >>>>>>>>>>>>> intval_10 int, >>>>>>>>>>>>> intval_11 int, >>>>>>>>>>>>> intval_12 int, >>>>>>>>>>>>> intval_13 int, >>>>>>>>>>>>> intval_14 int, >>>>>>>>>>>>> intval_15 int, >>>>>>>>>>>>> intval_16 int, >>>>>>>>>>>>> intval_17 int, >>>>>>>>>>>>> intval_18 int, >>>>>>>>>>>>> intval_19 int, >>>>>>>>>>>>> intval_20 int, >>>>>>>>>>>>> intval_21 int, >>>>>>>>>>>>> intval_22 int, >>>>>>>>>>>>> intval_23 int, >>>>>>>>>>>>> intval_24 int, >>>>>>>>>>>>> intval_25 int, >>>>>>>>>>>>> intval_26 int) >>>>>>>>>>>>> PARTITIONED BY (event_date string) >>>>>>>>>>>>> >>>>>>>>>>>>> CLUSTERED BY(id) INTO 256 BUCKETS >>>>>>>>>>>>> ROW FORMAT DELIMITED >>>>>>>>>>>>> FIELDS TERMINATED BY ',' >>>>>>>>>>>>> STORED AS RCFile; >>>>>>>>>>>>> * >>>>>>>>>>>>> >>>>>>>>>>>>> rest of the commands are same. But this time INSERT OVERWRITE >>>>>>>>>>>>> query executed and took time but when i queries from that table, >>>>>>>>>>>>> it results >>>>>>>>>>>>> none as it does not contains data. why is this so? and also >>>>>>>>>>>>> please tell me >>>>>>>>>>>>> how can i partition my existing table dynamicaly on date so that >>>>>>>>>>>>> data >>>>>>>>>>>>> splits equally without mentioning date explicitly? >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Wed, Jun 12, 2013 at 6:50 PM, Nitin Pawar < >>>>>>>>>>>>> nitinpawar...@gmail.com> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> you did not create partitioned table. You just created a >>>>>>>>>>>>>> bucketed table. >>>>>>>>>>>>>> >>>>>>>>>>>>>> refer to partitioned table created >>>>>>>>>>>>>> something like >>>>>>>>>>>>>> partitioned by (event_date string) >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Jun 12, 2013 at 7:17 PM, Hamza Asad < >>>>>>>>>>>>>> hamza.asa...@gmail.com> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> i have created table after enabling dynamic partition. i >>>>>>>>>>>>>>> partitioned it on date but it is not splitting data datewise. >>>>>>>>>>>>>>> Below is the >>>>>>>>>>>>>>> query of table created and data insert >>>>>>>>>>>>>>> CREATE TABLE rc_partition_cluster_table( >>>>>>>>>>>>>>> id int, >>>>>>>>>>>>>>> event_id int, >>>>>>>>>>>>>>> user_id BIGINT, >>>>>>>>>>>>>>> event_date string, >>>>>>>>>>>>>>> intval_1 int ) >>>>>>>>>>>>>>> CLUSTERED BY(id) INTO 256 BUCKETS >>>>>>>>>>>>>>> ROW FORMAT DELIMITED >>>>>>>>>>>>>>> FIELDS TERMINATED BY ',' >>>>>>>>>>>>>>> STORED AS RCFile; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> set hive.exec.dynamic.partition=true; >>>>>>>>>>>>>>> set hive.exec.dynamic.partition.mode=nonstrict; >>>>>>>>>>>>>>> set hive.exec.max.dynamic.partitions=1000; >>>>>>>>>>>>>>> set hive.exec.max.dynamic.partitions.pernode=1000; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> INSERT OVERWRITE TABLE rc_partition_cluster_table Partition >>>>>>>>>>>>>>> (event_date) >>>>>>>>>>>>>>> SELECT * FROM events_details; >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> why it is not working fine? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -- >>>>>>>>>>>>>>> *Muhammad Hamza Asad* >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> Nitin Pawar >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> *Muhammad Hamza Asad* >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> Nitin Pawar >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> *Muhammad Hamza Asad* >>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> Nitin Pawar >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> *Muhammad Hamza Asad* >>>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> -- >>>>>>>> Nitin Pawar >>>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -- >>>>>>> *Muhammad Hamza Asad* >>>>>>> >>>>>> >>>>>> >>>>>> >>>>>> -- >>>>>> Nitin Pawar >>>>>> >>>>> >>>>> >>>>> >>>>> -- >>>>> *Muhammad Hamza Asad* >>>>> >>>> >>>> >>>> >>>> -- >>>> Nitin Pawar >>>> >>> >>> >>> >>> -- >>> *Muhammad Hamza Asad* >>> >> >> >> >> -- >> Nitin Pawar >> > > > > -- > *Muhammad Hamza Asad* > -- Nitin Pawar