[ https://issues.apache.org/jira/browse/HIVE-645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740312#action_12740312 ]
Edward Capriolo commented on HIVE-645:
--------------------------------------

My jiras got a lot of attention today. I feel like the most popular guy on the internet. :)

There may be some syntax inaccuracy here, but this is my use case.

As input I have a hive table with weblogs. It may be partitioned or not.

hive_table web_logs

    date,time,url,httpstatus
    2009-06-04,05:49:00,/index.jsp, 200
    2009-06-04,05:50:00,/index.jsp, 200
    2009-06-04,05:49:00,/indexsfg.jsp, 404

I wish to produce a report of status codes by day:

    2009-06-04, 200,2
    2009-06-04, 404,1

I create a mysql table for this data:

    create table status_count(
      date varchar(10),
      status int,
      count int,
      primary key (date,status)
    );

    select dboutput(
      'jdbc:mysql://localhost:3306/hivedump',
      'insert into status_count (date,status,count) values (?,?,?)',
      date, httpstatus, count(1)
    )
    from web_logs
    group by date, httpstatus;

In my case mysql primary key constraints deal with failed mappers/reducers. After all, if a mapper runs again it will try to insert the same rows again and the insert will fail on the duplicate key, so nothing is written twice. Not a problem. In most cases the data hive is dumping to mysql is relational and should have a primary key. If the data does not have a primary key then yes, dboutput() will not function correctly with failed mappers/reducers, but its intended use case would be to push a summary to a mysql database, possibly 100 - 100,000 rows.

> A UDF that can export data to JDBC databases.
> ---------------------------------------------
>
>                 Key: HIVE-645
>                 URL: https://issues.apache.org/jira/browse/HIVE-645
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Edward Capriolo
>            Assignee: Edward Capriolo
>            Priority: Minor
>         Attachments: hive-645-2.patch, hive-645.patch
>
>
> A UDF that can export data to JDBC databases.
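
To illustrate the primary key argument in the comment above, here is a rough sketch of what a retried task would look like from the database side. It is not the dboutput() implementation from the patch. It assumes an in-memory SQLite database standing in for MySQL, and push_rows() is a hypothetical helper that plays the part of one mapper/reducer attempt.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL database 'hivedump'.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE status_count ("
    " date TEXT, status INTEGER, count INTEGER,"
    " PRIMARY KEY (date, status))"
)

INSERT = "INSERT INTO status_count (date, status, count) VALUES (?, ?, ?)"


def push_rows(conn, rows):
    """Hypothetical stand-in for one task attempt.

    Inserts each row in its own transaction and returns how many
    rows were actually written. Duplicate keys are skipped.
    """
    inserted = 0
    for row in rows:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(INSERT, row)
            inserted += 1
        except sqlite3.IntegrityError:
            # An earlier attempt already wrote this (date, status) pair;
            # the primary key rejects the second copy.
            pass
    return inserted


# The summary rows from the web_logs example.
summary = [("2009-06-04", 200, 2), ("2009-06-04", 404, 1)]

first = push_rows(conn, summary)  # original attempt
retry = push_rows(conn, summary)  # the same task run again after a failure
total = conn.execute("SELECT COUNT(*) FROM status_count").fetchone()[0]
rows = conn.execute(
    "SELECT date, status, count FROM status_count ORDER BY status"
).fetchall()
```

The second attempt writes nothing, so the table still holds one row per (date, status). Without the primary key the same retry would double every row, which is the failure mode described for data that has no key.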