If it were me, I would find a way to identify the partitions that contain modified data and then reload only those partitions on a regular basis. Instead of updating or deleting individual rows, you'd reload specific partitions as an all-or-nothing action.
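To make that concrete, here is a rough sketch of the idea in Python. It is only an illustration: the table names (`orders`, `orders_staging`), the partition column (`order_date`), the `last_update_time` column, and the sample rows are all made up for the example. It also assumes the staging table holds the complete current contents of each affected partition, since the overwrite replaces the partition wholesale.

```python
from datetime import datetime

# Hypothetical rows pulled from the RDBMS: (order_id, order_date, last_update_time).
# order_date is assumed to be the Hive partition column.
ROWS = [
    (1, "2012-12-20", datetime(2012, 12, 24, 9, 0)),
    (2, "2012-12-21", datetime(2012, 12, 22, 8, 0)),
    (3, "2012-12-20", datetime(2012, 12, 24, 11, 30)),
    (4, "2012-12-23", datetime(2012, 12, 24, 12, 0)),
]


def changed_partitions(rows, since):
    """Return the sorted partition values holding a row modified after `since`."""
    return sorted({order_date for _, order_date, updated in rows if updated > since})


def reload_statement(table, staging, partition_col, value, columns):
    """Build a HiveQL statement that replaces one whole partition from staging."""
    return (
        f"INSERT OVERWRITE TABLE {table} PARTITION ({partition_col}='{value}') "
        f"SELECT {', '.join(columns)} FROM {staging} "
        f"WHERE {partition_col}='{value}'"
    )


# Time of the previous load (assumed to be tracked somewhere by the scheduler).
last_run = datetime(2012, 12, 23, 0, 0)

partitions = changed_partitions(ROWS, last_run)
statements = [
    reload_statement("orders", "orders_staging", "order_date", p,
                     ["order_id", "last_update_time"])
    for p in partitions
]

for stmt in statements:
    print(stmt)
```

Only the partitions touched since the last run get an `INSERT OVERWRITE`; everything else in the Hive table is left alone.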
On Monday, December 24, 2012, Ibrahim Yakti wrote:

> This is already done, but Hive does not support update or deletion of data,
> so when I import the records after a specific "last_update_time", Hive
> will append them, not replace the existing ones.
>
> --
> Ibrahim
>
> On Mon, Dec 24, 2012 at 5:03 PM, Mohammad Tariq <donta...@gmail.com> wrote:
>
> You can use Apache Oozie to schedule your imports.
>
> Alternatively, you can have an additional column in your SQL table, say
> LastUpdatedTime or something. As soon as there is a change in this column
> you can start the import from this point. This way you don't have to import
> everything every time there is a change in your table. You just have to
> move only the most recent data, say only the 'delta' amount of data.
>
> Best Regards,
> Tariq
> +91-9741563634
> https://mtariq.jux.com/
>
> On Mon, Dec 24, 2012 at 7:08 PM, Ibrahim Yakti <iya...@souq.com> wrote:
>
> My question was how to reflect MySQL updates in Hadoop/Hive; this is our
> problem now.
>
> --
> Ibrahim
>
> On Mon, Dec 24, 2012 at 4:35 PM, Mohammad Tariq <donta...@gmail.com> wrote:
>
> Cool. Then go ahead :)
>
> Just in case you need something in real time, you can have a look at
> Impala. (I know nobody likes to get preached, but just in case ;) ).
>
> Best Regards,
> Tariq
>
> On Mon, Dec 24, 2012 at 7:00 PM, Ibrahim Yakti <iya...@souq.com> wrote:
>
> Thanks Mohammad. No, we do not have any plans to replace our RDBMS with
> Hive. Hadoop/Hive will be used as a data warehouse and for batch
> processing; as I said, we want to use Hive for analytical queries.
>
> --
> Ibrahim
>
> On Mon, Dec 24, 2012 at 4:19 PM, Mohammad Tariq <donta...@gmail.com> wrote:
>
> Hello Ibrahim,
>
> A quick question. Are you planning to replace your SQL DB with Hive?
> If that is the case, I would not suggest doing that. Both are meant for
> entirely different purposes. Hive is for batch processing and not for
> real-time systems. So if your requirements involve real-time things, you
> need to think before moving ahead.
>
> Yes, Sqoop is 'the' tool. It is primarily meant for this purpose.
>
> HTH
>
> Best Regards,
> Tariq
>
> On Mon, Dec 24, 2012 at 6:38 PM, Ibrahim Yakti <iya...@souq.com> wrote:
>
> Hi All,
>
> We are new to Hadoop and Hive. We are trying to use Hive to run
> analytical queries, and we are using Sqoop to import data into Hive. In
> our RDBMS the data is updated very frequently, and this needs to be
> reflected in Hive. Hive does not support update/delete, but there are many
> workarounds to do this task.
>
> What's in our mind is importing all the

--
---
Jeremiah Peschka
Founder, Brent Ozar Unlimited
Microsoft SQL Server MVP