MapReduce can be used for both structured and unstructured data. Hive is a storage and retrieval mechanism (i.e. a database). The trouble with an RDBMS is that you either have to parse the unstructured data into a structured row/column format or store it as an opaque object. Both options have problems, in performance and in semantics. Hence there is a whole world of NoSQL databases that are not row-column structured. These databases can handle schema-less or unstructured objects and let you manipulate your information more elegantly. I would check out the Wikipedia page on NoSQL databases and focus on key-value, columnar, or document databases.
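To make the "parse it into rows and columns" option concrete, here is a minimal Python sketch of the pattern Gabriel describes in this thread: fixed fields go into ordinary columns and everything else goes into a map, much like a Hive MAP<STRING,STRING> column. The log format, the `EventRow` class, `FIXED_KEYS`, and `parse_event` are all made up for illustration; this is not anyone's actual pipeline.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class EventRow:
    # Fixed, always-present fields: these would be ordinary columns.
    event_date: str
    user: str
    # Dynamic key/value pairs: these would live in a single map column.
    attrs: Dict[str, str] = field(default_factory=dict)


# Hypothetical choice of which keys get their own column.
FIXED_KEYS = {"user"}


def parse_event(line: str) -> EventRow:
    """Split a 'date key=value key=value ...' line into fixed and dynamic parts."""
    date, *pairs = line.split()
    fixed: Dict[str, str] = {}
    attrs: Dict[str, str] = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        if key in FIXED_KEYS:
            fixed[key] = value
        else:
            attrs[key] = value
    return EventRow(event_date=date, user=fixed.get("user", ""), attrs=attrs)


# Two lines with different dynamic keys still fit the same row shape.
rows = [
    parse_event("2014-12-04 user=mohan action=login ip=10.0.0.1"),
    parse_event("2014-12-04 user=gabriel action=search query=hive"),
]
```

The two rows carry different keys in `attrs`, yet both fit the same schema. In Hive, a lookup on such a map column is written along the lines of attrs['query'].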
Date: Thu, 4 Dec 2014 07:06:16 +0530
Subject: Re: Question
From: mohan.25fe...@gmail.com
To: user@hive.apache.org

Thanks Gabriel for the prompt response.

I see online blogs saying that MapReduce is for unstructured data, Pig for semi-structured data, and Hive only for structured data. Can you please justify this?

Thanks in advance

On Thu, Dec 4, 2014 at 6:56 AM, Gabriel Eisbruch <gabrieleisbr...@gmail.com> wrote:

Hi Mohan,
We are using Hive for unstructured (or semi-structured) data by way of map columns. For example, we use standard columns for fixed data and map columns for dynamic data.
Gabriel.

2014-12-03 22:19 GMT-03:00 Mohan Krishna <mohan.25fe...@gmail.com>:

Is Hive only for structured data, or does it handle unstructured data as well?