Thanks Mohammad, I will be waiting ... meanwhile, it seems I will get into
HBase and give it a try ... unless someone advises something
better/easier.
--
Ibrahim
On Wed, Dec 26, 2012 at 5:52 PM, Mohammad Tariq wrote:
Hello Ibrahim,
Sorry for the late response. Those replies were for Kshiva. I saw his
question (exactly the same as this one) multiple times on the Pig mailing
list as well, so I just thought of giving him some pointers on how to use
the list. I should have specified it properly. Apologies for the confusion.
After more reading, a suggested scenario looks like:
MySQL ---(Extract / Load)---> HDFS ---> Load into HBase --> Read as
external in Hive ---(Transform Data & Join Tables)--> Use hive for Joins &
Queries ---> Update HBase as needed & Reload in Hive.
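For the "Read as external in Hive" step, a minimal sketch using Hive's HBase
storage handler could look like this (table, column family, and column names
are my assumptions, not anything agreed in this thread):

-- Hypothetical orders table stored in HBase, exposed to Hive for queries.
CREATE EXTERNAL TABLE orders_hbase (
  order_id    STRING,
  status      STRING,
  last_update STRING
)
STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,d:status,d:last_update")
TBLPROPERTIES ("hbase.table.name" = "orders");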
What do you think please?
--
Ibrahim
Mohammad, I am not sure if the answers & the link were to me or to Kshiva's
question.
If I have partitioned my data based on status, for example, when I run the
update query it will add the updated data in a new partition (success or
shipped, for example) and it will keep the old data (confirmed or pending,
for example) in the old partition.
Also, have a look at this:
http://www.catb.org/~esr/faqs/smart-questions.html
Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
On Tue, Dec 25, 2012 at 11:26 AM, Mohammad Tariq wrote:
Have a look at Beeswax.
BTW, do you have access to Google at your station? Same question on the Pig
mailing list as well, that too twice.
Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
On Tue, Dec 25, 2012 at 11:20 AM, Kshiva Kps wrote:
Hi,
Are there any Hive editors where we can write 100 to 150 Hive scripts? I
believe it is not easy to do all the scripts in CLI mode.
Something like an IDE for Java or TOAD for SQL. Please advise, many thanks.
Thanks
On Mon, Dec 24, 2012 at 8:21 PM, Dean Wampler <dean.wamp...@thinkbiganalytics.com> wrote:
My problem is in eliminating the duplicates and keeping only the correct
data. Any advice, please?
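One way to express "keep only the correct data" in HiveQL, assuming the
newest row per key wins (orders_raw, orders_clean, order_id, and
last_update_time are hypothetical names):

-- Keep only the newest row per order_id; if two rows share the same
-- timestamp, duplicates survive, so the timestamp must be fine-grained.
INSERT OVERWRITE TABLE orders_clean
SELECT o.*
FROM orders_raw o
JOIN (
  SELECT order_id, MAX(last_update_time) AS max_ts
  FROM orders_raw
  GROUP BY order_id
) latest
ON o.order_id = latest.order_id
AND o.last_update_time = latest.max_ts;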
On Dec 24, 2012 9:13 PM, "Dean Wampler" wrote:
Looks good, but a few suggestions. If you can eliminate duplicates, etc. as
you ingest the data into HDFS, that would eliminate a cleansing step. Note
that if the target directory in HDFS IS the specified location for an
external Hive table/partition, then there will be no separate step to "load
into Hive".
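For example, such an external table might be declared like this (the path
and schema are assumptions):

-- Hive sees any file dropped into this directory immediately; there is
-- no separate "load into Hive" step.
CREATE EXTERNAL TABLE orders (
  order_id         STRING,
  status           STRING,
  last_update_time STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
LOCATION '/data/mysql_export/orders';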
Thanks Dean for the great reply. Setting up the incremental import should be
easy, but if I partitioned my data, how will Hive get me the updated rows
only, considering that a row may have multiple fields that will be updated
over time? And how will I manage the tables that are based on multiple
sources? And do y
This is not as hard as it sounds. The hardest part is setting up the
incremental query against your MySQL database. Then you can write the
results to new files in the HDFS directory for the table and Hive will see
them immediately. Yes, even though Hive doesn't support updates, it doesn't
care how the files get into a table's directory.
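The incremental extract itself is plain SQL against MySQL, something like
this (the column name and the high-water mark value are assumptions):

-- Pull only rows changed since the last successful run; the result is
-- then written as new files under the table's HDFS directory.
SELECT *
FROM orders
WHERE last_update_time > '2012-12-23 00:00:00';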
Bottom line: use Sqoop to import data into HBase/Cassandra for storage and
use Hive to query the data using external tables. Did I miss anything?
--
Ibrahim
On Mon, Dec 24, 2012 at 5:37 PM, Edward Capriolo wrote:
Hive cannot easily handle updates. The most creative way I saw this done
was someone managed to capture all updates and then use union queries which
rewrote the same Hive table with the newest value.
original + union delta + column with latest timestamp = new original
But that is a lot of processing.
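A sketch of that union rewrite in HiveQL, with all names assumed (orders is
the current table, orders_delta holds the captured updates):

-- Combine original and delta, keep the newest version of each row, and
-- overwrite the table with the result. Hive stages the output before
-- replacing the data, so reading and overwriting the same table works.
INSERT OVERWRITE TABLE orders
SELECT u.*
FROM (
  SELECT * FROM orders
  UNION ALL
  SELECT * FROM orders_delta
) u
JOIN (
  SELECT order_id, MAX(last_update_time) AS max_ts
  FROM (
    SELECT * FROM orders
    UNION ALL
    SELECT * FROM orders_delta
  ) a
  GROUP BY order_id
) latest
ON u.order_id = latest.order_id
AND u.last_update_time = latest.max_ts;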
Good points by Edward. I especially love point no. 2.
Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
On Mon, Dec 24, 2012 at 7:58 PM, Edward Capriolo wrote:
Edward, can you explain more please? Are you suggesting that I should use
HBase for such tasks instead of Hive?
--
Ibrahim
On Mon, Dec 24, 2012 at 5:28 PM, Edward Capriolo wrote:
What if you have many columns that need to be updated? A simple example:
confirmation date, payment status(es) + status update time, delivery, etc.
On what basis will you set your partition, and how will the old data be
removed, given that the updated data will be reloaded into another partition
if I partition by status?
You can only do the last_update idea if this is an insert-only dataset.
If your table takes updates you need a different strategy.
1) Full dumps every interval (sketched below).
2) Using a storage handler like HBase or Cassandra that takes update
operations.
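For option 1, one possible layout is to land each full dump in its own
partition and pin queries to the newest snapshot (all names and dates here
are hypothetical):

-- Each interval's full dump becomes a partition; readers query the
-- latest snapshot instead of updating rows in place.
CREATE TABLE orders_snapshot (
  order_id STRING,
  status   STRING
)
PARTITIONED BY (dump_date STRING);

SELECT COUNT(*)
FROM orders_snapshot
WHERE dump_date = '2012-12-24'
AND status = 'shipped';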
On Mon, Dec 24, 2012 at 9:22 AM, Jeremiah Peschka wrote:
I was actually trying to answer your actual questions. What are you
currently doing to tackle this update problem, and what kind of tweak are
you looking for? There is no direct solution to achieve this
out-of-the-box, as you have said.
Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
If it were me, I would find a way to identify the partitions that have
modified data and then re-load a subset of the partitions (only the ones
with changes) on a regular basis. Instead of updating/deleting data, you'll
be re-loading specific partitions as an all-or-nothing action.
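In HiveQL that re-load could look like this (the partition column and
staging table are assumptions):

-- Rebuild only the changed partition; untouched partitions stay as-is.
INSERT OVERWRITE TABLE orders PARTITION (status = 'shipped')
SELECT order_id, customer_id, last_update_time
FROM orders_staging
WHERE status = 'shipped';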
This is already done, but Hive does not support updates or deletion of data,
so when I import the records after a specific "last_update_time", Hive will
append them, not replace them.
--
Ibrahim
On Mon, Dec 24, 2012 at 5:03 PM, Mohammad Tariq wrote:
You can use Apache Oozie to schedule your imports.
Alternatively, you can have an additional column in your SQL table, say
LastUpdatedTime or something. As soon as there is a change in this column
you can start the import from that point. This way you don't have to import
everything every time.
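On the MySQL side that tracking column could be set up like this (a sketch;
orders is a hypothetical table name):

-- MySQL keeps this column current automatically on every UPDATE, giving
-- each import run a reliable high-water mark to filter on.
ALTER TABLE orders
  ADD COLUMN LastUpdatedTime TIMESTAMP NOT NULL
  DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP;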
My question was how to reflect MySQL updates to Hadoop/Hive; this is our
problem now.
--
Ibrahim
On Mon, Dec 24, 2012 at 4:35 PM, Mohammad Tariq wrote:
Cool. Then go ahead :)
Just in case you need something in real time, you can have a look at
Impala. (I know nobody likes to get preached to, but just in case ;) ).
Best Regards,
Tariq
+91-9741563634
https://mtariq.jux.com/
On Mon, Dec 24, 2012 at 7:00 PM, Ibrahim Yakti wrote:
Thanks Mohammad. No, we do not have any plans to replace our RDBMS with
Hive. Hadoop/Hive will be used for data warehousing & batch processing; as
I said, we want to use Hive for analytical queries.
--
Ibrahim
On Mon, Dec 24, 2012 at 4:19 PM, Mohammad Tariq wrote:
Hello Ibrahim,
A quick question: are you planning to replace your SQL DB with Hive? If
that is the case, I would not suggest doing that. Both are meant for
entirely different purposes. Hive is for batch processing and not for
real-time systems, so if your requirements involve real-time things, Hive
is not the right tool.