Re: The advantages of Hive/Hadoop compared to Data Warehouse

2015-12-18 Thread Ashok Kumar
Hi, Thanks for the info. I understand ELT (Extract, Load, Transform) is more appropriate for big data compared to traditional ETL. What are the major advantages of this in the Big Data space? For example, if I started using Sqoop to get data from traditional transactional and Data Warehouse databases an

RE: The advantages of Hive/Hadoop compared to Data Warehouse

2015-12-18 Thread Mich Talebzadeh
Adding to the valuable points raised by Grant and Jörn, one can also add the Total Cost of Ownership (TCO) of the application on Big Data compared to a traditional warehouse like Oracle and Sybase IQ that utilize vertical scaling in an SMP environment on expensive SAN. Although there are a lot of other abs

Re: The advantages of Hive/Hadoop compared to Data Warehouse

2015-12-18 Thread Jörn Franke
I think you should draw more attention to the fact that Hive is just one component in the ecosystem. You can have many more components, such as ELT, integrating unstructured data, machine learning, streaming data, etc. However, analysts are usually not aware of the technologies and IT staff is not much

Re: The advantages of Hive/Hadoop compared to Data Warehouse

2015-12-18 Thread Grant Overby (groverby)
Yes, that is what I meant. In practice, it is often not possible to reach a 100% perfectly denormalized fact table (for instance, if you need type 1 dimension behavior). PS: We're using 'denormalize' a bit loosely here since we are comparing to a star schema and not a 3NF schema. The proper te

RE: hive on spark

2015-12-18 Thread Mich Talebzadeh
Hi, Your statement “I read that this is due to something not being compiled against the correct hadoop version. my main question what is the binary/jar/file that can cause this?” I believe this is the file in $HIVE_HOME/lib called spark-assembly-1.3.1-hadoop2.4.0.jar which you need to b

Re: The advantages of Hive/Hadoop compared to Data Warehouse

2015-12-18 Thread Ashok Kumar
Thank you sir. Can you please describe in a bit more detail your vision of "A fully denormalized columnar store"? Are you referring to getting rid of the star schema altogether in Hive and replacing it with ORC tables? Regards On Friday, 18 December 2015, 21:13, Grant Overby (groverby) wrote: You

Re: The advantages of Hive/Hadoop compared to Data Warehouse

2015-12-18 Thread Grant Overby (groverby)
You forgot horizontal scaling. A fully denormalized columnar store in Hive will outperform a star schema in Oracle in every way imaginable at scale; however, if your data isn't big enough then this is a moot point. If your data fits in a traditional BI warehouse, and especially if it does so

The advantages of Hive/Hadoop compared to Data Warehouse

2015-12-18 Thread Ashok Kumar
Gurus, Some analysts keep asking me about the advantages of having Hive tables when the star schema in the Data Warehouse (DW) does the same. For example, if you have fact and dimension tables in the DW and just import them into Hive via, say, Sqoop, what are we going to gain? I keep telling them storage econo

RE: Importing into a hive database with minimal unavailability or renaming a database

2015-12-18 Thread Mich Talebzadeh
Hi Marcin, If the DDL update to the main table involves a new column, then at the moment Hive does not support adding a column. Yes, the schema in the metastore can change, but the file system will not allow you to add values to the new column. 1. Thus, as discussed in “Adding a new column t

hive on spark

2015-12-18 Thread Ophir Etzion
During spark-submit when running hive on spark I get: Exception in thread "main" java.util.ServiceConfigurationError: org.apache.hadoop.fs.FileSystem: Provider org.apache.hadoop.hdfs.HftpFileSystem could not be instantiated Caused by: java.lang.IllegalAccessError: tried to access method org.apac

Re: Serde for all encoding standards.

2015-12-18 Thread mahender bigdata
Hi Gabriel, Thanks for responding, this helps. One more question: currently I'm able to load only UTF-8 encoded files. I see that from Hive 0.14 it supports all encodings, but I couldn't load UTF-16, UTF-32, or big-endian formats. Are these supported? Are there any SerDes available for these UT

Re: Hive on Spark throw java.lang.NullPointerException

2015-12-18 Thread Xuefu Zhang
Could you create a JIRA with repro case? Thanks, Xuefu On Thu, Dec 17, 2015 at 9:21 PM, Jone Zhang wrote: > *My query is * > set hive.execution.engine=spark; > select > > t3.pcid,channel,version,ip,hour,app_id,app_name,app_apk,app_version,app_type,dwl_tool,dwl_status,err_type,dwl_store,dwl_maxs

Re: Synchronizing Hive metastores across clusters

2015-12-18 Thread Elliot West
Eugene/Sushanth, Thank you for pointing me in the direction of these features. I'll investigate them further to see if I can put them to good use. Cheers - Elliot. On 17 December 2015 at 20:03, Sushanth Sowmyan wrote: > Also, while I have not wiki-ized the documentation for the above, I > have

Importing into a hive database with minimal unavailability or renaming a database

2015-12-18 Thread Marcin Tustin
Hi All, We import our production database into Hive on a schedule using Sqoop. Unfortunately, Sqoop won't update the table schema in Hive when the table schema has changed in the source database. Accordingly, to get updates to the table schema we drop the Hive table first. Unfortunately, this ca

HBase client task rejected from java.util.concurrent.ThreadPoolExecutor

2015-12-18 Thread Sofia
Hello all, I have a Hive client (inside a Spark streaming process, if this is of any use) reading from an external non-native table stored by an HBaseStorageHandler. When I execute one client at a time it works fine, but when I start another similar client I get the following error: org.apache.h

Hive bug? about no such table

2015-12-18 Thread Philip Lee
I think this is a Hive bug related to the metastore. Here is the thing. After I generated scale factor 300, named bigbench300, in addition to bigbench100, which already existed, I ran "hive job with bigbench300". At first it was really fine. Then I ran the Hive job with bigbench100 again. It w

Re: Orc memory issue

2015-12-18 Thread Prasanth Jayachandran
Hi, Can you retry with hive.optimize.sort.dynamic.partition set to true? Thanks Prasanth On Dec 18, 2015, at 3:48 AM, Hemanth Meka wrote: Hi All, We have a source table and a target table. Data in the source table is text and without partitions and target tabl

Orc memory issue

2015-12-18 Thread Hemanth Meka
Hi All, We have a source table and a target table. Data in the source table is text and without partitions, and the target table is an ORC table with 5 partition columns, 6000 records, and 1400 partitions. We are trying to insert overwrite the target table with the data in the source table. This inser